Blue Screen: BAD_POOL_HEADER

    Question

  • I have a four-node Hyper-V cluster running Windows Server 2008 SP2.  The nodes are HP BL460c blades, all in one chassis and configured identically.  They have been up for about a year.

    Recently we began seeing random blue screens on each of the four nodes.  I have each node configured for minidumps.  There is no active user logged on to the host OS when this occurs.  Here's a WinDbg !analyze -v for the latest crash, which happened just after noon today:

    BAD_POOL_HEADER (19)
    The pool is already corrupt at the time of the current request.
    This may or may not be due to the caller.
    The internal pool links must be walked to figure out a possible cause of
    the problem, and then special pool applied to the suspect tags or the driver
    verifier to a suspect driver.
    Arguments:
    Arg1: 0000000000000021, the data following the pool block being freed is corrupt.  Typically this means the consumer (call stack ) has overrun the block.
    Arg2: fffffa8020109000, The pool pointer being freed.
    Arg3: 0000000000008fc0, The number of bytes allocated for the pool block.
    Arg4: 003900390031002e, The corrupted value found following the pool block.

    Debugging Details:
    ------------------

    GetUlongFromAddress: unable to read from fffff80001fa4250

    BUGCHECK_STR:  0x19_21

    POOL_ADDRESS:  fffffa8020109000

    CUSTOMER_CRASH_COUNT:  1

    DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

    PROCESS_NAME:  WmiPrvSE.exe

    CURRENT_IRQL:  1

    IRP_ADDRESS:  ffffffffffffff88

    LAST_CONTROL_TRANSFER:  from fffff80001f44220 to fffff80001e6b450

    STACK_TEXT: 
    fffffa60`0c459478 fffff800`01f44220 : 00000000`00000019 00000000`00000021 fffffa80`20109000 00000000`00008fc0 : nt!KeBugCheckEx
    fffffa60`0c459480 fffff800`01e80a9e : 00000000`00000000 fffffa60`005ec7f0 fffffa80`00000000 fffffa60`20206f49 : nt!ExDeferredFreePool+0x765
    fffffa60`0c459530 fffff800`01e8cbae : 00000000`00000000 fffffa80`20e63970 fffffa60`0c459690 00000000`00000000 : nt!IopCompleteRequest+0x47e
    fffffa60`0c4595f0 fffff800`01e90613 : fffffa60`0c459710 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiDeliverApc+0x19e
    fffffa60`0c459690 fffff800`01e7494d : 00000000`00008fc0 fffff800`01fa3160 fffffa80`25e48c88 fffffa60`01900000 : nt!KiApcInterrupt+0x103
    fffffa60`0c459820 fffff800`02074fd6 : 00000000`00008fc0 fffff800`01fa3100 fffffa80`25e48f00 fffff800`01f45f07 : nt!IopfCompleteRequest+0x68d
    fffffa60`0c4598d0 fffff800`020cdb1e : fffffa80`25e48f00 00000000`00008fb0 fffffa60`0c459ca0 00000000`00000001 : nt!WmipIoControl+0xd6
    fffffa60`0c459a10 fffff800`020d6586 : fffff880`077a48e0 00000000`00001164 00000000`00000000 00000000`00000000 : nt!IopXxxControlFile+0x5be
    fffffa60`0c459b40 fffff800`01e6aef3 : fffff880`06d79650 fffffa80`20e63970 00000000`00000238 fffff800`020d5314 : nt!NtDeviceIoControlFile+0x56
    fffffa60`0c459bb0 00000000`771c6d8a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
    00000000`06b3e458 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x771c6d8a


    STACK_COMMAND:  kb

    FOLLOWUP_IP:
    nt!KiDeliverApc+19e
    fffff800`01e8cbae 4533c9          xor     r9d,r9d

    SYMBOL_STACK_INDEX:  3

    SYMBOL_NAME:  nt!KiDeliverApc+19e

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: nt

    IMAGE_NAME:  ntkrnlmp.exe

    DEBUG_FLR_IMAGE_TIMESTAMP:  49e0237f

    FAILURE_BUCKET_ID:  X64_0x19_21_nt!KiDeliverApc+19e

    BUCKET_ID:  X64_0x19_21_nt!KiDeliverApc+19e

    Followup: MachineOwner

    ---

    This looks to me like a driver wasn't getting along well with wmiprvse.exe, hosed the memory contents, and crashed hard.  Would anyone here concur with this assessment?  As often as HP has been releasing driver updates for blades of late, it's nearly impossible to tell which driver might have been recently updated and started this problem.  Would a full crash dump (rather than a minidump) allow me to better determine what's going on here?  I've resisted that because these nodes have 32 GB of RAM.
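
    The bugcheck text above points at special pool / Driver Verifier as the next step, so for anyone in the same spot, here's a minimal sketch of what that looks like.  The pool address is Arg2 from the dump, and <suspect>.sys is only a placeholder for whichever HP driver looks recently updated (lm t n lists the loaded modules with their build timestamps):

        kd> !pool fffffa8020109000
        kd> lm t n

    To put a suspect driver under special pool (Verifier flag 0x1), reboot the node, and later switch it back off:

        verifier /flags 0x1 /driver <suspect>.sys
        verifier /query
        verifier /reset

    And to capture the pool headers without writing out all 32 GB, a kernel memory dump (CrashDumpEnabled = 2) sits between a minidump and a complete dump:

        reg add "HKLM\SYSTEM\CurrentControlSet\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 2 /f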

    Thanks in advance.

    -Jason

    Sunday, August 30, 2009 8:34 PM

Answers

  • Hello Jason,

     

    Thank you for the post.

     

    Please understand that to troubleshoot the blue screen issues, we usually need to perform debugging. However, in this forum, we do not provide debugging support. If you would like to perform debugging, please contact Microsoft Customer Support Service (CSS).

     

    To obtain the phone number for a specific technology request, please refer to the website listed below:

    http://support.microsoft.com/default.aspx?scid=fh;EN-US;PHONENUMBERS

     

    If you are outside the US, please refer to http://support.microsoft.com for regional support phone numbers.

     

    I hope your issue can be resolved soon.

     

    Tim Quan - MSFT

     

    Monday, August 31, 2009 2:50 AM

All replies

  • Jason, did you ever find a fix for your issue? We are experiencing exactly the same thing: Hyper-V, 4 nodes on BL460c G6 blades, 2008 R2, all in one chassis. Here's a snippet of my debug output:

    BAD_POOL_HEADER (19)
    The pool is already corrupt at the time of the current request.
    This may or may not be due to the caller.
    The internal pool links must be walked to figure out a possible cause of
    the problem, and then special pool applied to the suspect tags or the driver
    verifier to a suspect driver.
    Arguments:
    Arg1: 0000000000000021, the data following the pool block being freed is corrupt.  Typically this means the consumer (call stack ) has overrun the block.
    Arg2: fffffa801bed6000, The pool pointer being freed.
    Arg3: 0000000000004e70, The number of bytes allocated for the pool block.
    Arg4: 0000006d00750072, The corrupted value found following the pool block.

    Dave

    Tuesday, January 05, 2010 9:01 PM
  • Hello,
    For the sake of completeness, here is another instance of the exact same situation; we see BSODs twice a month on average:

    BAD_POOL_HEADER (19)
    The pool is already corrupt at the time of the current request.
    This may or may not be due to the caller.
    The internal pool links must be walked to figure out a possible cause of
    the problem, and then special pool applied to the suspect tags or the driver
    verifier to a suspect driver.
    Arguments:
    Arg1: 0000000000000021, the data following the pool block being freed is corrupt.  Typically this means the consumer (call stack ) has overrun the block.
    Arg2: fffffa8011ead000, The pool pointer being freed.
    Arg3: 0000000000001f40, The number of bytes allocated for the pool block.
    Arg4: 0032003a00360030, The corrupted value found following the pool block.
    ...
    PROCESS_NAME:  WmiPrvSE.exe
    ...
    MODULE_NAME: nt
    IMAGE_NAME:  ntkrnlmp.exe
    ...
    STACK_TEXT: 
    nt!KeBugCheckEx
    nt!ExFreePool+0xa6a
    nt!IopCompleteRequest+0x5ce
    nt!IopfCompleteRequest+0x75d
    nt!WmipIoControl+0xd6
    nt!IopXxxControlFile+0x607
    nt!NtDeviceIoControlFile+0x56
    nt!KiSystemServiceCopyEnd+0x13

    2008 R2, ProLiant BL460c G6, only one node installed.

    Sander

    Tuesday, January 19, 2010 2:47 PM
  • Same problem here with a Dell M610 Hyper-V R2 cluster. It happens shortly after boot-up on both nodes, and the bugcheck analysis is the same on both. Logging into the system seems to trigger it.

    7: kd> !analyze -v
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    BAD_POOL_HEADER (19)
    The pool is already corrupt at the time of the current request.
    This may or may not be due to the caller.
    The internal pool links must be walked to figure out a possible cause of
    the problem, and then special pool applied to the suspect tags or the driver
    verifier to a suspect driver.
    Arguments:
    Arg1: 0000000000000021, the data following the pool block being freed is corrupt.  Typically this means the consumer (call stack ) has overrun the block.
    Arg2: fffffa804eb90000, The pool pointer being freed.
    Arg3: 0000000000001c20, The number of bytes allocated for the pool block.
    Arg4: 0000000000000000, The corrupted value found following the pool block.

    Debugging Details:
    ------------------

    Page 1782192 not present in the dump file. Type ".hh dbgerr004" for details
    Page 177ca52 not present in the dump file. Type ".hh dbgerr004" for details

    BUGCHECK_STR:  0x19_21

    POOL_ADDRESS:  fffffa804eb90000 Nonpaged pool

    DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

    PROCESS_NAME:  WmiApSrv.exe

    CURRENT_IRQL:  1

    IRP_ADDRESS:  ffffffffffffff88

    LAST_CONTROL_TRANSFER:  from fffff800017ed3cd to fffff800016bbf00

    STACK_TEXT: 
    fffff880`0ed2e658 fffff800`017ed3cd : 00000000`00000019 00000000`00000021 fffffa80`4eb90000 00000000`00001c20 : nt!KeBugCheckEx
    fffff880`0ed2e660 fffff800`016db56e : 00000000`a0000003 fffff6fb`831fdfb8 00000000`20206f49 00000000`00000000 : nt!ExFreePool+0xa6a
    fffff880`0ed2e710 fffff800`016be8cd : 00000000`00000000 fffff8a0`04a9b790 00000000`00000001 fffff880`0ed2e910 : nt!IopCompleteRequest+0x5ce
    fffff880`0ed2e7e0 fffff800`0194d23e : fffffa80`4eb7e210 fffff800`017eeb00 00000000`00000000 00000000`00000000 : nt!IopfCompleteRequest+0x75d
    fffff880`0ed2e8c0 fffff800`019d43a7 : fffffa80`4eb7e210 fffff880`0ed2eca0 fffff880`0ed2eca0 fffffa80`b640bc10 : nt!WmipIoControl+0xd6
    fffff880`0ed2ea10 fffff800`019d4c06 : fffffa80`48ee8f30 00000000`000001d4 00000000`00000000 00000000`00000000 : nt!IopXxxControlFile+0x607
    fffff880`0ed2eb40 fffff800`016bb153 : 00000000`00000453 00000000`00000001 fffff880`0ed2ebc8 0000007f`ffffffff : nt!NtDeviceIoControlFile+0x56
    fffff880`0ed2ebb0 00000000`770cff2a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
    00000000`00e7f3a8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x770cff2a


    STACK_COMMAND:  kb

    FOLLOWUP_IP:
    nt!WmipIoControl+d6
    fffff800`0194d23e 8bc3            mov     eax,ebx

    SYMBOL_STACK_INDEX:  4

    SYMBOL_NAME:  nt!WmipIoControl+d6

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: nt

    IMAGE_NAME:  ntkrnlmp.exe

    DEBUG_FLR_IMAGE_TIMESTAMP:  4a5bc600

    FAILURE_BUCKET_ID:  X64_0x19_21_nt!WmipIoControl+d6

    BUCKET_ID:  X64_0x19_21_nt!WmipIoControl+d6

    Followup: MachineOwner


    Jeff Graves, ORCS Web, Inc.
    Monday, March 15, 2010 3:06 AM
  • I opened a PSS case. They found a hotfix (KB976443) that addresses this issue, but now I'm getting a 0xD5 BSOD in CSVFilter.sys that we're working on.

    http://support.microsoft.com/kb/976443
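
    If you want to check whether that hotfix is already present on a node before scheduling reboots, a quick query along these lines should do it (just a sketch; the KB number is the one from the link above):

        wmic qfe where "HotFixID='KB976443'" get HotFixID,InstalledOn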


    Jeff Graves, ORCS Web, Inc.
    Tuesday, April 06, 2010 3:20 PM
  • Hi Jeff,

    How are the CSVFilter.sys BSODs? Did you manage to find a solution?

    I've just started experiencing the BAD_POOL_HEADER (19) BSOD on one of my Hyper-V nodes - a Dell R610.  I was just wondering if the KB 976443 patch was stable?

    Regards,

    Daniel

    Saturday, April 17, 2010 4:34 PM
  • PSS hasn't been able to figure out what's causing the memory corruption with CSVFilter.sys. Luckily, the CSVFilter.sys BSOD can only be repro'd if I put a host node into maintenance mode and then immediately reboot it. PSS had me turn on special pool and create complete memory dumps (instead of kernel dumps), but thus far they haven't been able to figure out the cause.
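
    For reference, that combination boils down to something like the following (a sketch only; the special pool switch is the same one mentioned earlier in the thread, and targeting CSVFilter.sys is just a guess based on the module in the BSOD).  CrashDumpEnabled = 1 selects a complete dump, which also requires a page file on the boot volume at least the size of physical RAM:

        verifier /flags 0x1 /driver csvfilter.sys
        reg add "HKLM\SYSTEM\CurrentControlSet\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 1 /f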

    That's the only issue I've experienced with this patch, and since it only happens when I'm rebooting a host node (when it's already in maintenance mode) it's an acceptable risk to me (whereas the BSOD without the patch seemed to occur on boot-up which was a bit more dangerous).


    Jeff Graves, ORCS Web, Inc.
    Saturday, April 17, 2010 4:41 PM
  • Thanks Jeff - that's good to know.

     

    We've just had a spectacular failure of a power supply in one of our iSCSI switches.  It managed to take out the UPS and the whole power circuit it was connected to.  For some reason this has sparked off the BAD_POOL_HEADER (19) BSODs after reboot.

     


    Saturday, April 17, 2010 5:20 PM
  • I just had what seems to be the exact same issue you gentlemen are describing.  I have a 4-node Hyper-V 2.0 cluster running on Dell M710 blades.  One of the hosts randomly blue screened over the weekend, and the cluster didn't know how to handle it based on the way I had my preferred hosts set.  We are using EqualLogic iSCSI for storage.

     

    I wanted to attribute this issue to all the problems associated with Nehalems running Hyper-V 2.0 and the Broadcom NIC fiasco on the Dell M610/M710 blades, but luckily I came across this thread.

    I only have the minidump, which looks almost identical to the ones posted above.

    I have another hotfix to apply regarding loss of NIC connectivity within guests, and I will consider the hotfix posted in this thread.  I just wanted to share my situation in the event we can figure out the commonality in hardware or configs.

     

    Dump analysis below:

     

    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    BAD_POOL_HEADER (19)
    The pool is already corrupt at the time of the current request.
    This may or may not be due to the caller.
    The internal pool links must be walked to figure out a possible cause of
    the problem, and then special pool applied to the suspect tags or the driver
    verifier to a suspect driver.
    Arguments:
    Arg1: 0000000000000021, the data following the pool block being freed is corrupt.  Typically this means the consumer (call stack ) has overrun the block.
    Arg2: fffffa802b3a7000, The pool pointer being freed.
    Arg3: 00000000000021f0, The number of bytes allocated for the pool block.
    Arg4: 0000000000000000, The corrupted value found following the pool block.

    Debugging Details:
    ------------------

    Page 40c56 not present in the dump file. Type ".hh dbgerr004" for details
    Page 21a62c not present in the dump file. Type ".hh dbgerr004" for details

    BUGCHECK_STR:  0x19_21

    POOL_ADDRESS:  fffffa802b3a7000 Nonpaged pool

    DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

    PROCESS_NAME:  WmiApSrv.exe

    CURRENT_IRQL:  1

    IRP_ADDRESS:  ffffffffffffff88

    LAST_CONTROL_TRANSFER:  from fffff80001dfa3cd to fffff80001cc8f00

    STACK_TEXT: 
    fffff880`0f91d658 fffff800`01dfa3cd : 00000000`00000019 00000000`00000021 fffffa80`2b3a7000 00000000`000021f0 : nt!KeBugCheckEx
    fffff880`0f91d660 fffff800`01ce856e : 00000000`a0000003 fffff6fb`82201000 00000000`20206f49 fffffa80`00000000 : nt!ExFreePool+0xa6a
    fffff880`0f91d710 fffff800`01ccb8cd : 00000000`00000000 fffff8a0`0117bb40 00000000`00000001 fffff880`0f91d910 : nt!IopCompleteRequest+0x5ce
    fffff880`0f91d7e0 fffff800`01f5a23e : fffffa80`46969d20 fffff800`01dfbb00 00000000`00000000 00000000`00000000 : nt!IopfCompleteRequest+0x75d
    fffff880`0f91d8c0 fffff800`01fe13a7 : fffffa80`46969d20 fffff880`0f91dca0 fffff880`0f91dca0 fffffa80`5950bbd0 : nt!WmipIoControl+0xd6
    fffff880`0f91da10 fffff800`01fe1c06 : fffffa80`25b4ff30 00000000`000001d4 00000000`00000000 00000000`00000000 : nt!IopXxxControlFile+0x607
    fffff880`0f91db40 fffff800`01cc8153 : 00000000`0000048a 00000000`00000001 fffff880`0f91dbc8 0000007f`ffffffff : nt!NtDeviceIoControlFile+0x56
    fffff880`0f91dbb0 00000000`77adff2a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
    00000000`00aef218 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x77adff2a


    STACK_COMMAND:  kb

    FOLLOWUP_IP:
    nt!WmipIoControl+d6
    fffff800`01f5a23e 8bc3            mov     eax,ebx

    SYMBOL_STACK_INDEX:  4

    SYMBOL_NAME:  nt!WmipIoControl+d6

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: nt

    IMAGE_NAME:  ntkrnlmp.exe

    DEBUG_FLR_IMAGE_TIMESTAMP:  4a5bc600

    FAILURE_BUCKET_ID:  X64_0x19_21_nt!WmipIoControl+d6

    BUCKET_ID:  X64_0x19_21_nt!WmipIoControl+d6

    Followup: MachineOwner
    ---------

     

    Tuesday, April 27, 2010 4:30 PM
  • Hi,

    Did anyone find a solution for this problem? I now have the exact same problem with BL460c blades running 2008 R2 in a four-node Hyper-V cluster.

    /Harri

     

    BAD_POOL_HEADER (19)
    Arguments:
    Arg1: 0000000000000021, the data following the pool block being freed is corrupt.  Typically this means the consumer (call stack ) has overrun the block.
    Arg2: fffffa8039879000, The pool pointer being freed.
    Arg3: 0000000000001a40, The number of bytes allocated for the pool block.
    Arg4: 0000000000000000, The corrupted value found following the pool block.

    Debugging Details:
    ------------------

    GetUlongFromAddress: unable to read from fffff80001a6a1b0
    OVERLAPPED_MODULE: Address regions for 'kbdhid' and 'USBD.SYS' overlap
    BUGCHECK_STR:  0x19_21
    POOL_ADDRESS:  fffffa8039879000
    CUSTOMER_CRASH_COUNT:  1
    DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP
    PROCESS_NAME:  WmiApSrv.exe
    CURRENT_IRQL:  1
    IRP_ADDRESS:  ffffffffffffff88

     

    Wednesday, May 26, 2010 4:28 AM
  • This patch solved the BAD_POOL_HEADER issue:

    http://support.microsoft.com/kb/976443

    After installing this patch, I am getting a different BSOD with the CSVFilter.sys driver on shutdown. PSS has found a piece of code that could cause this, but it seems to be a very low-level issue that could be isolated to my environment specifically (this is the first time they've ever encountered it).

    In short, this patch should solve the BAD_POOL_HEADER BSOD and should be safe to use in production.


    Jeff Graves, ORCS Web, Inc.
    Wednesday, May 26, 2010 1:02 PM
  • Just to add my 2c to the conversation; thanks for posting the link to the hotfix.

    I'm having the same issue with one of my Hyper-V nodes; Server 2008 R2, Hyper-V, HP LeftHand iSCSI SAN, HP BL465c G7 Blade. 

    I'm going to install the hotfix and see how it goes.

    Wednesday, December 01, 2010 12:32 AM
  • I'm having the same issue with one of my Hyper-V nodes; Server 2008 R2, Hyper-V, Dell EqualLogic PS5000/6000 iSCSI SAN, Dell R610.
    I'm going to install the hotfix...

    Tuesday, March 15, 2011 9:00 AM
  • Just to toss in a monkey wrench: we are just now upgrading our Windows 7 x64 machines to Office 2007 Small Business.

    We would uninstall all the compatibility packs, viewers, and the old Office.  Then I would install Office 2007, log off the admin account, log in as the user, and open Outlook and Word.

    Upon opening Outlook, the workstation would blue screen and reboot.  After it came back up, the user was able to log in and operate normally.

    I simply added a reboot after the uninstalls and another after the install, and the BSOD went away.

    This is what the minidump had:

    BAD_POOL_HEADER (19)
    The pool is already corrupt at the time of the current request.
    This may or may not be due to the caller.
    The internal pool links must be walked to figure out a possible cause of
    the problem, and then special pool applied to the suspect tags or the driver
    verifier to a suspect driver.
    Arguments:
    Arg1: 0000000000000021, the data following the pool block being freed is corrupt.  Typically this means the consumer (call stack ) has overrun the block.
    Arg2: fffff900c0eec000, The pool pointer being freed.
    Arg3: 0000000000001060, The number of bytes allocated for the pool block.
    Arg4: 0000000000000000, The corrupted value found following the pool block.

    Debugging Details:
    ------------------

    GetUlongFromAddress: unable to read from fffff80002c9a7f0

    BUGCHECK_STR:  0x19_21

    POOL_ADDRESS:  fffff900c0eec000 

    CUSTOMER_CRASH_COUNT:  1

    DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

    LAST_CONTROL_TRANSFER:  from fffff80002db79c2 to fffff80002c80680

    Thursday, June 27, 2013 10:46 PM