locked
NUMA optimization? RRS feed

  • Question

  • What's new with NUMA in SQL Server?

    Is there any way to get a reading, from SQL Server or perfmon or hardware vendor tools or VM tools, of how much you are crossing NUMA boundaries, and what that is costing?

    The only "NUMA management" trick I know is setting maxdop at or below the number of cores on a physical processor.  I guess the rest is supposed to be automagic.

    Is there anything else one can see or do regarding NUMA?

    Thanks,

    Josh

    Wednesday, September 9, 2015 12:25 AM

Answers

  • What's new with NUMA in SQL Server?

    Nothing that I know. NUMA was basically introduced for scale-ability to overcome limitations of SMP system.The biggest shared memory bus an all requests use to use this same memory bus. With introduction of NUMA system each NUMA nodes would not have there own memory and possibly own I/O channels.

    Is there any way to get a reading, from SQL Server or perfmon or hardware vendor tools or VM tools, of how much you are crossing NUMA boundaries, and what that is costing?

    The readings which should interest you is Local and Foreign memory readings. Local memory is memory of the node which process can easily access while memory allocated to any other nodes which process wants to access is foreign memory. The COST associated with foreign memory access is high so you need to monitor that.

    You need to see counter SQL Server:Buffer Node object and in that you can see

    1. Database pages

    2. Foreign pages.

    For more information you must read Growing and shrinking BPOOL under NUMA

    Jonthan Kehayias has written more about NUMA



    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My Technet Wiki Article

    MVP

    • Marked as answer by JRStern Wednesday, September 9, 2015 2:48 PM
    Wednesday, September 9, 2015 5:06 AM
  • Hi,

    the dynamic management  sys.dm_os_buffer_descriptors give you a map of the various page held in the buffer pool, the column "numa_node" reports the numa node where the page is.

    I've read you are talking about VM Tools so I think you use VMware, beware that VMware has the virtual NUMA concept that is not tied to the number of vCores/vSocket you select...

    Anyway SQL Server has tasks that runs on a specific processor that reside on a specific NUMA node, the policy that make a task run on an available processor is dictated by SQLOS engine, if a particular task need to access a page that is not in its same NUMA node you have external access, "Foreign pages" counter in the Buffer Manager object gives you the number of time this events occur.

    To evaluate how much your performance degradates itself you have to know the architecture of your physical processor and physical RAm

    • Proposed as answer by Andrea Caldarone Wednesday, September 9, 2015 1:50 PM
    • Marked as answer by JRStern Wednesday, September 9, 2015 2:48 PM
    Wednesday, September 9, 2015 1:18 PM
  • >Is there anything else one can see or do regarding NUMA?

    The three most important things to understand about SQL Server and NUMA are:

    1) A session is associated with a single NUMA node, and will run only on the schedulers for that NUMA node.

    2) Database pages will be cached on the NUMA node on which they were first read from disk, and subsequently accessed by sessions possibly running on other NUMA nodes.

    3) The "NUMA Penalty" for most modern servers is not very great.  Each CPU socket is a NUMA node, and for a 2-socket or 4-socket server the connections between NUMA nodes are still very fast.

    Some NUMA-related configurations you might consider:

    You can set processor affinity for a SQL instance to restrict SQL to a subset of the NUMA nodes on the machine.  For instance if you have two instances on a large physical server, you might affinitize each instance to a subset of the NUMA nodes. 

    For a Virtual Machine you can avoid configurations that span NUMA nodes, or report inaccurate NUMA topologies to the Guest OS.  For Hyper-V this means disabling "NUMA Node Spanning" https://technet.microsoft.com/en-us/library/jj614459.aspx#BKMK_VNUMA This setting is misleadingly named, since it doesn't prevent VMs from having VCPUs on multiple NUMA nodes, it just ensures that the NUMA topology presented to the VM is a subset of the physical NUMA topology of the host.

    For a large instance that spans NUMA nodes, you might affinitize connections to selected NUMA nodes.  You might do this when hosting multiple applications on a single large instance, and want to prevent the plan and buffer cache pages for each application from being distributed across all the NUMA nodes.

    You can do this using Resource Governor

    ALTER RESOURCE POOL

    Or for older SQL Versions, using special TDS endpoints that map to specific NUMA nodes.

    Map TCP/IP Ports to NUMA Nodes (SQL Server)

    David


    David http://blogs.msdn.com/b/dbrowne/


    Wednesday, September 9, 2015 2:34 PM

All replies

  • What's new with NUMA in SQL Server?

    Nothing that I know. NUMA was basically introduced for scale-ability to overcome limitations of SMP system.The biggest shared memory bus an all requests use to use this same memory bus. With introduction of NUMA system each NUMA nodes would not have there own memory and possibly own I/O channels.

    Is there any way to get a reading, from SQL Server or perfmon or hardware vendor tools or VM tools, of how much you are crossing NUMA boundaries, and what that is costing?

    The readings which should interest you is Local and Foreign memory readings. Local memory is memory of the node which process can easily access while memory allocated to any other nodes which process wants to access is foreign memory. The COST associated with foreign memory access is high so you need to monitor that.

    You need to see counter SQL Server:Buffer Node object and in that you can see

    1. Database pages

    2. Foreign pages.

    For more information you must read Growing and shrinking BPOOL under NUMA

    Jonthan Kehayias has written more about NUMA



    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My Technet Wiki Article

    MVP

    • Marked as answer by JRStern Wednesday, September 9, 2015 2:48 PM
    Wednesday, September 9, 2015 5:06 AM
  • Hi,

    the dynamic management  sys.dm_os_buffer_descriptors give you a map of the various page held in the buffer pool, the column "numa_node" reports the numa node where the page is.

    I've read you are talking about VM Tools so I think you use VMware, beware that VMware has the virtual NUMA concept that is not tied to the number of vCores/vSocket you select...

    Anyway SQL Server has tasks that runs on a specific processor that reside on a specific NUMA node, the policy that make a task run on an available processor is dictated by SQLOS engine, if a particular task need to access a page that is not in its same NUMA node you have external access, "Foreign pages" counter in the Buffer Manager object gives you the number of time this events occur.

    To evaluate how much your performance degradates itself you have to know the architecture of your physical processor and physical RAm

    • Proposed as answer by Andrea Caldarone Wednesday, September 9, 2015 1:50 PM
    • Marked as answer by JRStern Wednesday, September 9, 2015 2:48 PM
    Wednesday, September 9, 2015 1:18 PM
  • >Is there anything else one can see or do regarding NUMA?

    The three most important things to understand about SQL Server and NUMA are:

    1) A session is associated with a single NUMA node, and will run only on the schedulers for that NUMA node.

    2) Database pages will be cached on the NUMA node on which they were first read from disk, and subsequently accessed by sessions possibly running on other NUMA nodes.

    3) The "NUMA Penalty" for most modern servers is not very great.  Each CPU socket is a NUMA node, and for a 2-socket or 4-socket server the connections between NUMA nodes are still very fast.

    Some NUMA-related configurations you might consider:

    You can set processor affinity for a SQL instance to restrict SQL to a subset of the NUMA nodes on the machine.  For instance if you have two instances on a large physical server, you might affinitize each instance to a subset of the NUMA nodes. 

    For a Virtual Machine you can avoid configurations that span NUMA nodes, or report inaccurate NUMA topologies to the Guest OS.  For Hyper-V this means disabling "NUMA Node Spanning" https://technet.microsoft.com/en-us/library/jj614459.aspx#BKMK_VNUMA This setting is misleadingly named, since it doesn't prevent VMs from having VCPUs on multiple NUMA nodes, it just ensures that the NUMA topology presented to the VM is a subset of the physical NUMA topology of the host.

    For a large instance that spans NUMA nodes, you might affinitize connections to selected NUMA nodes.  You might do this when hosting multiple applications on a single large instance, and want to prevent the plan and buffer cache pages for each application from being distributed across all the NUMA nodes.

    You can do this using Resource Governor

    ALTER RESOURCE POOL

    Or for older SQL Versions, using special TDS endpoints that map to specific NUMA nodes.

    Map TCP/IP Ports to NUMA Nodes (SQL Server)

    David


    David http://blogs.msdn.com/b/dbrowne/


    Wednesday, September 9, 2015 2:34 PM
  • Great answers from everyone, thanks.  I was not aware of the foreign pages, need to play with that!

    But here's a follow-up:

    Are servers actually configured with separate IO controllers so they can be mapped to NUMA nodes?  While that seems a great idea, I have never heard talk of it in the real world, but then I haven't been much involved with that level of server configuration recently.  I also wonder if it can be virtualized - well, that is, if it is pretty much automatically virtualized when main storage is on a SAN anyway.  IO thread just supplies an IO address and NUMA takes care of itself.

    Josh

    ps - also just to clarify: if you configure your BIOS to use interleaved memory, does that completely defeat NUMA?  Or is it still recognized by Windows and SQL Server because multiple processor chips and multiple processor cores may still have cache locality even if main memory does not?

    Wednesday, September 9, 2015 2:55 PM
  • >Are servers actually configured with separate IO . . .

    Usually not.  That's not the point of the separate IO Completion threads per NUMA node.  Their role mainly to write the data from disk into the NUMA node from which the IO was requested.  Otherwise a session on NUMA NODE 0 might request an IO, and the thread that completes the IO is running on NUMA NODE 2, and writes the data there.

    >if you configure your BIOS to use interleaved memory, does that completely defeat NUMA

    Yes.

    >multiple processor chips and multiple processor cores may still have cache locality even if main memory does not

    Windows has no knowledge of the socket-level or core-level caches.  And so they work regardless of NUMA configuration.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Wednesday, September 9, 2015 3:08 PM
  • >Are servers actually configured with separate IO . . .

    Usually not. 

    OK, thanks.

    That's not the point of the separate IO Completion threads per NUMA node.  Their role mainly to write the data from disk into the NUMA node from which the IO was requested.  Otherwise a session on NUMA NODE 0 might request an IO, and the thread that completes the IO is running on NUMA NODE 2, and writes the data there.

    Seems to me a smart controller and driver could handle that.

    >if you configure your BIOS to use interleaved memory, does that completely defeat NUMA

    Yes.

    >multiple processor chips and multiple processor cores may still have cache locality even if main memory does not

    Windows has no knowledge of the socket-level or core-level caches.  And so they work regardless of NUMA configuration.

    That's an interesting factoid.  Using NUMA then locks a thread to a node, and will incidentally improve cache consistency, a feature you can't otherwise get!  Given the huge caches on a lot of server-level processors and their importance to high efficiency I'll bet the improvement in cache efficiency is as significant as the NUMA locality!  Though it would be a project to prove it.

    Josh



    • Edited by JRStern Wednesday, September 9, 2015 4:51 PM
    Wednesday, September 9, 2015 4:49 PM