Bluescreen on NETIO.SYS after upgrade to Server 2008 R2 SP1 on UAG DA Cluster

  • Question

  • Hi there,

    We upgraded our UAG DA cluster to R2 SP1 last week (the UAG servers are Hyper-V guests) and use the dynamic memory feature. Now the array master bluescreens multiple times a day with a stop 0xD1 in netio.sys (DRIVER_IRQL_NOT_LESS_OR_EQUAL). Is anyone familiar with this problem?

    Regards,

    Alfred


    Regards, Alfred
    Tuesday, March 15, 2011 9:38 AM

Answers

  • Hello,

    bug check code 0xD1: http://msdn.microsoft.com/fr-fr/library/ff560244(v=VS.85).aspx

    That means a kernel-mode driver attempted to access pageable memory at a process IRQL that was too high.

    Please use Microsoft SkyDrive to upload the dump files (C:\Windows\Minidump) and post a link here.

    Start by updating all possible drivers.

    You can also contact Microsoft CSS.

     


    This posting is provided "AS IS" with no warranties or guarantees , and confers no rights.

    Microsoft Student Partner
    Microsoft Certified Professional
    Microsoft Certified Systems Administrator: Security
    Microsoft Certified Systems Engineer: Security
    Microsoft Certified Technology Specialist: Windows Server 2008 Active Directory, Configuration
    Microsoft Certified Technology Specialist: Windows Server 2008 Network Infrastructure, Configuration

    Microsoft Certified Technology Specialist: Windows Server 2008 Applications Infrastructure, Configuration

    • Marked as answer by Erez Benari Monday, May 9, 2011 11:36 PM
    Monday, May 9, 2011 1:41 PM

All replies

  • We have the same issue. Dump details below:

    DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
    An attempt was made to access a pageable (or completely invalid) address at an
    interrupt request level (IRQL) that is too high.  This is usually
    caused by drivers using improper addresses.
    If kernel debugger is available get stack backtrace.
    Arguments:
    Arg1: 00000008000000e1, memory referenced
    Arg2: 0000000000000002, IRQL
    Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
    Arg4: fffff88001001bd0, address which referenced memory


    STACK_COMMAND:  kb

    FOLLOWUP_IP:
    NETIO!WfpNblInfoGet+0
    fffff880`01001bd0 488b81e0000000  mov     rax,qword ptr [rcx+0E0h]
    SYMBOL_STACK_INDEX:  3
    SYMBOL_NAME:  NETIO!WfpNblInfoGet+0
    FOLLOWUP_NAME:  MachineOwner
    MODULE_NAME: NETIO
    IMAGE_NAME:  NETIO.SYS
    DEBUG_FLR_IMAGE_TIMESTAMP:  4ce79381
    FAILURE_BUCKET_ID:  X64_0xD1_NETIO!WfpNblInfoGet+0
    BUCKET_ID:  X64_0xD1_NETIO!WfpNblInfoGet+0

    Tuesday, March 15, 2011 10:44 AM
  • Well here's mine, seems the same issue indeed.

    DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
    An attempt was made to access a pageable (or completely invalid) address at an
    interrupt request level (IRQL) that is too high.  This is usually
    caused by drivers using improper addresses.
    If kernel debugger is available get stack backtrace.
    Arguments:
    Arg1: 0000000b000000e8, memory referenced
    Arg2: 0000000000000002, IRQL
    Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
    Arg4: fffff88000e01bd0, address which referenced memory

    Debugging Details:
    ------------------

    READ_ADDRESS: GetPointerFromAddress: unable to read from fffff800018c70e8
     0000000b000000e8

    CURRENT_IRQL:  2

    FAULTING_IP:
    NETIO!WfpNblInfoGet+0
    fffff880`00e01bd0 488b81e0000000  mov     rax,qword ptr [rcx+0E0h]

    CUSTOMER_CRASH_COUNT:  1

    DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

    BUGCHECK_STR:  0xD1

    PROCESS_NAME:  System

    STACK_COMMAND:  kb

    FOLLOWUP_IP:
    NETIO!WfpNblInfoGet+0
    fffff880`00e01bd0 488b81e0000000  mov     rax,qword ptr [rcx+0E0h]

    SYMBOL_STACK_INDEX:  3
    SYMBOL_NAME:  NETIO!WfpNblInfoGet+0
    FOLLOWUP_NAME:  MachineOwner
    MODULE_NAME: NETIO
    IMAGE_NAME:  NETIO.SYS
    DEBUG_FLR_IMAGE_TIMESTAMP:  4ce79381
    FAILURE_BUCKET_ID:  X64_0xD1_NETIO!WfpNblInfoGet+0
    BUCKET_ID:  X64_0xD1_NETIO!WfpNblInfoGet+0
    Followup: MachineOwner


    Regards, Alfred
    Tuesday, March 15, 2011 10:51 AM
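The two dumps above can be compared with a little arithmetic. The sketch below is only an illustration, not a debugger replacement: it takes the bug check arguments exactly as posted, subtracts the 0E0h displacement of the faulting instruction (mov rax, qword ptr [rcx+0E0h]) from Arg1 to recover the pointer that was in RCX, and checks whether that value lies in the x64 kernel address range. The names DUMPS and describe, and the KERNEL_SPACE_START constant, are introduced here for illustration and do not come from the thread.

```python
# Bug check 0xD1 arguments copied from the two dumps posted above.
DUMPS = {
    "first dump": {"arg1": 0x00000008000000E1, "irql": 2, "access": 0},
    "second dump": {"arg1": 0x0000000B000000E8, "irql": 2, "access": 0},
}

# Displacement taken from the faulting instruction:
#   mov rax, qword ptr [rcx+0E0h]
DISPLACEMENT = 0xE0

# IRQL 2 is DISPATCH_LEVEL; at or above it a page fault cannot be serviced.
DISPATCH_LEVEL = 2

# Assumption for illustration: start of the canonical x64 kernel half.
KERNEL_SPACE_START = 0xFFFF800000000000


def describe(arg1, irql, access):
    """Summarise one set of 0xD1 arguments (hypothetical helper)."""
    rcx = arg1 - DISPLACEMENT  # pointer the faulting function received in RCX
    return {
        "rcx": rcx,
        "operation": "write" if access else "read",
        "irql_blocks_paging": irql >= DISPATCH_LEVEL,
        "looks_like_kernel_pointer": rcx >= KERNEL_SPACE_START,
    }


for name, dump in DUMPS.items():
    info = describe(dump["arg1"], dump["irql"], dump["access"])
    print(
        f"{name}: rcx={info['rcx']:#018x} "
        f"op={info['operation']} "
        f"kernel_pointer={info['looks_like_kernel_pointer']}"
    )
```

In both dumps the recovered RCX value is nowhere near the kernel range, which suggests NETIO!WfpNblInfoGet was handed a bad pointer by its caller rather than touching genuinely paged-out memory.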
  • And we are using dynamic memory... Next time it goes, I will turn it off for the VM and see if there is any change.

    Marcus

    Tuesday, March 15, 2011 11:26 AM
  • Nope, BSOD with static memory configuration for the VM.

    Tuesday, March 15, 2011 11:55 AM
  • Ok, thanks, that saves me some time testing the same. Have the server in drain mode now, see if it keeps running....
    Regards, Alfred
    Tuesday, March 15, 2011 11:58 AM
  • Hi all. I had an issue with netio.sys, but it was due to an antivirus running on the UAG server that conflicted with the TMG packet filter (the antivirus had a network protection feature). To recover from the BSOD I had to start in safe mode and manually uninstall the antivirus.

    Hope it helps


    // Raúl - I love this game
    Tuesday, March 15, 2011 1:36 PM
  • Have ruled out AV on the server. Now nothing on it.

    Although I had a BSOD soon after I turned dynamic memory off, I haven't had one since - EDIT: actually ignore that, it's just gone again.

    It only seems to crash when DirectAccess is in use. It is fine overnight, but starts crashing near 9am each day.

    Marcus

    Tuesday, March 15, 2011 1:40 PM
  • Same here, in drain mode it seems to keep running. As soon as we take the server into production the BSODs start.
    Regards, Alfred
    Tuesday, March 15, 2011 1:58 PM
  • Well, in drain mode it keeps running without any BSODs. I'll open a case with MS support.
    Regards, Alfred
    Thursday, March 17, 2011 8:32 AM
  • Let me know how you get on. If you need another instance to compare against and confirm it's not your setup, let me know.

    Marcus

    Thursday, March 17, 2011 10:05 AM
  • Ok, thanks, I'll keep you posted.
    Regards, Alfred
    Thursday, March 17, 2011 11:28 AM
  • Any progress? May have to open a case...

    Marcus

    Tuesday, March 22, 2011 9:10 AM
  • Well, I opened a case last Friday, still no response... I'll call them tomorrow.


    Regards, Alfred
    Tuesday, March 22, 2011 2:45 PM
  • Well, seems to be quite busy at the UAG support department, we're third on the priority list....

     


    Regards, Alfred
    Thursday, March 24, 2011 12:59 PM
  • Not good support!!! Some days it doesn't go at all. Yesterday was fine all day. Today it has gone twice in 20 minutes.

    Marcus

    Friday, March 25, 2011 10:36 AM
  • Ok, our case is under investigation right now. The engineer suggested a possible workaround to maintain high availability which we are testing right now.

    Here is the suggested workaround:

    Run the following command from an elevated command line:

    netsh tmg set global name=disablendisregistration value=1 persistent

     

    Restart all TMG services (including fweng driver) – “Net Stop fweng”, “Net Start fwsrv”.  There is no need to reboot.

     

    This workaround may reduce network performance on the UAG server, but will hopefully at least let you run your full array as a short term workaround while we analyse the data.

    I'll keep you posted.


    Regards, Alfred
    Tuesday, March 29, 2011 2:23 PM
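The workaround steps in the post above can be sketched as a small script. This is a hedged illustration only: the three commands are quoted from the post, while the names WORKAROUND_STEPS and apply_workaround and the dry-run behaviour are assumptions added here. By default nothing is executed; the steps are only listed in order.

```python
import subprocess

# Commands quoted from the support engineer's workaround, in the order given.
WORKAROUND_STEPS = [
    ["netsh", "tmg", "set", "global",
     "name=disablendisregistration", "value=1", "persistent"],
    ["net", "stop", "fweng"],
    ["net", "start", "fwsrv"],
]


def apply_workaround(dry_run=True):
    """Return the steps as strings; run them only when dry_run is False.

    Hypothetical helper. It must be started from an elevated prompt on the
    UAG/TMG server to have any effect. "net stop" may ask for confirmation
    when dependent services are running; that case is not handled here.
    """
    lines = []
    for step in WORKAROUND_STEPS:
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(step, check=True)
        lines.append(" ".join(step))
    return lines


# Dry run: only print what would be executed.
for line in apply_workaround(dry_run=True):
    print(line)
```

Keeping the dry run as the default is a deliberate choice here, since stopping the firewall engine driver on a production array member interrupts traffic.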
  • Hi,

    I had the same issue. A detailed look into the dump told me that netio.sys was not the cause; fweng is causing the issue, even though netio.sys is the one that is 'seen' on the BSOD.

    After some research I saw that the TMG files weren't up to date; fweng.sys in particular was too old. Manually installing this Update Rollup http://support.microsoft.com/kb/2498770/en-us solved my problem; the UAG is not crashing any more!

    Wednesday, March 30, 2011 8:59 AM
  • Just received an update from MS support. This issue is confirmed as a bug. A fix would require both a Windows fix and a TMG fix, so I guess this will take some time.

    As a temporary workaround:

    "netsh tmg set global name=disablendisregistration value=1 persistent"


    Regards, Alfred
    Tuesday, May 3, 2011 9:51 AM
  • We have implemented this technology at my company.  We have already had two servers BSOD with the NETIO.sys and D1 issue.  Can you give me any more information about the bug fix you are working with MS on?  I would like to engage my MS resources to investigate this as well.

    Thanks,

    Barry

    Tuesday, May 3, 2011 2:40 PM
  • Hi Barry,

    From the latest contact I had with MS, I can tell you that there won't be a fix shortly, and there probably won't be a fix at all apart from the workaround of disabling NDIS registration. This is because of the complexity of the cause of the issue. We are currently testing the suggested workaround.

     

     


    Regards, Alfred
    Wednesday, May 4, 2011 2:52 PM
  • Alfred, is there any other information you can give me so that my MS tech, who is currently here on site, can look up the case? He has looked in the bug fix area and a couple of other areas and is not finding this case. I just need something so he can find it and report our issues as well. I would hate to create a case and start troubleshooting all over again if you already have a working case.

    Thanks,

    Barry

    Thursday, May 5, 2011 3:56 PM
  • Sure, it's case "111031747139416". Currently assigned to the TMG development team.
    Regards, Alfred
    Friday, May 6, 2011 12:58 PM
  • Hello,

    We had exactly the same symptoms after installing 2008 R2 SP1 yesterday (three random BSODs in 6 hours). The BSOD analysis pointed to NETIO.sys (from the full dump) and fweng.sys (from the minidump) as the cause. We didn't have Software Update 1 Rollup 3 for TMG installed. Since installing it, the server has not crashed again. I'm keeping my fingers crossed that the BSOD problem is gone.

    Our setup is:

    - Single Windows 2008 R2 SP1 and all latest Windows updates
    - UAG SP1 with KB2475733 hotfix
    - TMG SP1 with Software Update 1 Rollup 3 for SP1


    -Hude-

    Wednesday, May 11, 2011 11:31 AM
  • "netsh tmg set global name=disablendisregistration value=1 persistent"

     

    How do we reverse this fix?

     

    Thanks

    Tuesday, May 24, 2011 4:06 PM
  • Have you had any other BSODs since implementing Software Update 1 Rollup 3 for TMG?

    Thanks,

    Barry

    Tuesday, May 24, 2011 5:18 PM
  • Just curious, did this fix not work for you?  Why are you wanting to reverse it?
    Tuesday, May 24, 2011 5:18 PM
  • Hi,

    Yes our BSOD's are gone after Update 1 Rollup 3.

    To reverse this disablendisregistration fix run "netsh tmg set global name=disablendisregistration value=0 persistent".

    To check what is the current status of this setting run "netsh tmg show global name=disablendisregistration".

     

    I never even tried this disablendisregistration fix because it didn't sound like a real fix.


    -Hude-

    Tuesday, May 24, 2011 6:23 PM
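The set and show commands from the post above differ only in the value, so they can be generated from one place. A minimal sketch, assuming nothing beyond the two netsh command lines quoted in this thread; the function names are made up for illustration and the code only builds strings, it does not run netsh.

```python
SETTING = "disablendisregistration"


def set_command(disabled):
    """Build the netsh line that applies (True) or reverses (False) the workaround."""
    value = 1 if disabled else 0
    return f"netsh tmg set global name={SETTING} value={value} persistent"


def show_command():
    """Build the netsh line that reports the current state of the setting."""
    return f"netsh tmg show global name={SETTING}"


print(set_command(True))   # apply the workaround
print(set_command(False))  # reverse it
print(show_command())      # check the current state
```

Whichever direction the setting is changed, other posts in this thread indicate that the TMG services have to be restarted (or the server rebooted) before the change takes effect.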
  • Hi,

    This fix allowed me to boot the server so that I could install Update 1 Rollup 3. From what is stated here, the fix degrades performance, so I would like to reverse it to see whether Rollup 3 fixed the original issue without degrading performance.

    Thanks for the reply...

    Wednesday, May 25, 2011 12:39 PM
  • Initially an MS engineer suggested that implementing the workaround reduces network performance. We now have a confirmed statement from the TMG development team that there is no performance impact when disabling NDIS registration on UAG. According to their explanation, our issue is a known bug and will not be resolved by any fix other than disabling NDIS registration.


    Regards, Alfred
    Thursday, May 26, 2011 8:07 AM
  • Alfred,

    Thank you for the information. It is nice to be able to find such quick responses from people who have experienced these issues.  I find it interesting that after installing TMG on another drive in the same server using Windows Server 2008 vs. 2008R2 I have not experienced the same issues.

    I also have unresolved issues with reports that do not appear in the older Server 2008 version.  I was just hoping that these issues would be resolved quickly.

    Thanks again for your feedback.

    Mike

     

     

    Thursday, May 26, 2011 12:56 PM
  • We have a 6 server farm all Windows 2008 R2 servers, all the same H/W,  and this issue only affected 2 of the servers, the other four servers have never experienced this issue.

    Barry

    Thursday, May 26, 2011 12:58 PM
  • What service needs to be restarted to make this take effect? 

    Thanks,

    Barry

    Monday, June 6, 2011 6:17 PM
  • Stop and start all Microsoft Forefront TMG services, apparently. Note that this workaround disables the 6to4 interface, so it is no good if you are running IPv6-only DirectAccess without UAG, which we are (wrong forum section, I know, but I did not find any other references). MS, any news on a permanent fix?
    BobK ;)
    Sunday, June 19, 2011 11:28 AM
  • Server has now been up 24 + hours, no apparent problems...

    History:

    Installed SP1 and 2 further updates.

    After 15 mins Blue screen 0x000000D1, then again after 10 minutes

    Removed latest security update post SP1, blue screen again after 15 minutes

    Removed second security update post sp1, blue screen after 9 hours and again after 1 hour

    Disabled NDIS registration (as above) and rebooted; this disabled the 6to4 interface and thus DirectAccess.

    Re-enabled NDIS registration and re-ran DirectAccess setup, no reboot.

    Server still up...


    BobK ;)
    Monday, June 20, 2011 11:26 AM
  • Hello Everyone,

    this is Balint from MS support. Although we have not finished investigation, I would like to provide some information on this issue.

    In case you are running into the above referenced blue screen, please try updating  to the latest bits first, i.e. as of now apply:

    - Forefront UAG SP1 Rollup 1 KB2475733

    - Forefront TMG 2010 Rollup 4 KB2517957

    If the issue is still not fixed, you can try disabling NDIS integration as mentioned before: "netsh tmg set global name=disablendisregistration value=1 persistent". As stated by others, after this (perhaps only after a reboot) you might see that the 6to4 adapter is disconnected, and you will therefore also get IPsec DoSP errors. This seems to be caused by a boot-time timing issue; the workaround is to schedule an "advanced task" that runs 15 minutes after reboot and restarts the iphelper service. The bottom line, though, is that the first step (updating to the latest bits) should eliminate the issue, and disabling NDIS integration should only be a last resort.

    Tuesday, October 4, 2011 4:19 PM
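The delayed restart of the iphelper service described above could be set up with the Windows Task Scheduler. The sketch below only builds a schtasks command line and does not run it. Everything in it is an assumption rather than something given in the thread: the task name, the use of schtasks with an ONSTART trigger and a /DELAY of 15 minutes, running as SYSTEM, and iphlpsvc as the service name of IP Helper.

```python
def iphelper_restart_task(task_name="RestartIpHelperAfterBoot", delay_minutes=15):
    """Build a schtasks command that restarts IP Helper some minutes after boot.

    Hypothetical helper; the task name and defaults are illustrative only.
    """
    # schtasks expects the /DELAY value in mmmm:ss format.
    delay = f"{delay_minutes:04d}:00"
    # Action run by the task: stop, then start, the IP Helper service.
    action = "cmd /c net stop iphlpsvc & net start iphlpsvc"
    return [
        "schtasks", "/Create",
        "/TN", task_name,
        "/TR", action,
        "/SC", "ONSTART",   # trigger at system startup
        "/DELAY", delay,    # wait before running the action
        "/RU", "SYSTEM",    # run without a logged-on user
        "/F",               # overwrite an existing task of the same name
    ]


print(iphelper_restart_task())
```

Returning the arguments as a list keeps the action string as a single /TR argument if the list is later passed to a process launcher. Whether the 6to4 adapter actually reconnects after the restart should be verified on the server itself.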
  • By applying both the Forefront UAG and TMG updates mentioned by Balint, the situation has been stable for at least a day. Compare this with approximately 100 BSODs in 2 days when the problem occurred.
    Ronny de Jong | inovativ.nl | Blog: donnystyle.wordpress.com | Twitter: twitter.com/ronnydejong
    Tuesday, October 4, 2011 10:17 PM
  • If we implement Forefront UAG SP1 Rollup 1 (KB2475733) and Forefront TMG 2010 Rollup 4 (KB2517957), do we have to reverse the disablendisregistration setting ("netsh tmg set global name=disablendisregistration value=1 persistent")? If so, is the value supposed to be 0?
    Tuesday, December 6, 2011 3:40 PM
  •  

    Hello,

    Have experienced the same problem today with Windows Server 2008 R2 SP1 with
    UAG 2010 SP1 Update 1
    TMG 2010 SP2

    We consistently got a blue screen (BSOD). If we started the machine in Safe Mode and set
    the Microsoft TMG Firewall service to manual start, we were able to bring the server online.
    If we then started the Microsoft TMG Firewall service, the machine blue-screened.

    Thanks to good support from Balint, we tested his workaround, which works.

    But we also discovered a possibly better workaround: try upgrading the NIC drivers.

    We had very old HP NC532i Dual Port 10GbE NIC drivers, version 5.0.13.0 dated 30.07.2009. We upgraded them to version 6.2.9.0 dated 04.02.2011. This resolved our issue.

    We have yet to run this for a couple of days in production before I can guarantee that it is stable, but the BSOD disappeared after the NIC driver upgrade.

    Those of you using Balint's great workaround, check whether you have old NIC drivers.

    That might help :)


    Best Regards Anders Horgen

    • Proposed as answer by Anders Horgen Wednesday, February 8, 2012 8:41 PM
    • Unproposed as answer by Anders Horgen Thursday, February 9, 2012 11:38 AM
    Wednesday, February 8, 2012 8:41 PM
  • Hello,

    ref my post above.
    The workaround of upgrading the NIC drivers to the latest and greatest
    unfortunately worked for only 12 hours before the BSOD returned.

    We are now using the suggested workaround:

    netsh tmg set global name=disablendisregistration value=1 persistent


    Best Regards Anders Horgen

    Thursday, February 9, 2012 11:39 AM
  • I don't believe the above are the latest updates.  According to my records and what I am getting ready to update to,  the following are the latest updates:

    -Forefront UAG SP1 Update 1 KB2585140

    -Forefront TMG 2010 SP2 KB2555840


    • Edited by barmour88 Thursday, February 9, 2012 4:36 PM
    Thursday, February 9, 2012 1:42 PM
  • I just updated to Windows 2k8 R2 SP1 and it looks like I have the same issue.

    UAG version : sp1  without rollup 

    TMG version : sp2 without rollup

    Should I also apply both rollup updates?

    At the moment, I am using the workaround provided by Balint_PSS.

    • Proposed as answer by Michael Joss Thursday, April 12, 2012 3:16 PM
    • Unproposed as answer by Michael Joss Thursday, April 12, 2012 3:16 PM
    Thursday, April 12, 2012 8:53 AM
  • For your information, I applied this fix and I'm not seeing the BSOD anymore.

    http://support.microsoft.com/?id=2664888

    Regards,

    Mike

    • Proposed as answer by Michael Joss Thursday, April 12, 2012 3:16 PM
    Thursday, April 12, 2012 3:16 PM
  • I applied the following and have not had a BSOD on any of our 5 servers

    TMG 2010 SP2

    Forefront UAG SP1 Update 1

    Rollup 1 for Forefront Unified Access Gateway (UAG) 2010 Service Pack 1 Update 1

    Barry

    Thursday, April 12, 2012 4:59 PM
  • Hi there,

    BSOD here as well, most recently this morning. TMG and UAG have the latest patches and updates. I cannot install http://support.microsoft.com/?id=2664888; it says the update is not applicable to your computer.

    Any ideas? I did not try the workaround yet...

    Marcus

    Tuesday, June 5, 2012 7:10 AM
  • Marcus,

    You need to install SP1 and SP1 Update 1 before you can install SP2.

    All,

    Fresh build, Win2k8r2 - VMware

    BSOD.

    1. Safe Mode - apply netsh command as above (reboot)

    2. Regular mode - installed SP1, SP1 Update 1, then SP2.

    3. Then installed the HOTFIX above and rebooted.

    4. Booted fine. Tried to back the netsh setting out as above. (Rebooted into BSOD.)

    5. Booted into Safe Mode (no networking) and put the netsh setting BACK IN. Rebooted.

    6. Working now - and I'm leaving the crazy netsh command in there, screw it.


    if my post is helpful - please click on the green arrow. (please excuse, in advance, any perceived sarcasm/humor - as I often forget it does not translate through text) :)

    Friday, August 10, 2012 1:07 AM
  • Hi all,

    The following article has been released documenting this problem:

    http://support.microsoft.com/kb/2732485

    Thanks

    Balint

    Monday, September 17, 2012 8:47 AM