Sysmon v7.01 uninstallation causing bugcheck 0x3b

    Question

  • Hello Everyone!
     
    We are trying to sort out the following problem, so far without success. We have hundreds of servers where Sysmon must be installed (and on some of them it already is). We noticed that on some occasions, when it was necessary to uninstall Sysmon, the server BSOD’ed. As a test, we installed and uninstalled Sysmon at 20-minute intervals on a group of very lightly loaded servers and were able to cause 7 BSODs on different machines (out of hundreds of install/uninstall cycles overall).
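    For anyone who wants to repeat the test, the cycle can be scripted roughly as below. This is a minimal sketch and not the exact script we used: the `SYSMON` path, the `cycle_commands` and `run_cycles` names, and the injectable `run`/`sleep` parameters are assumptions for illustration. Only the standard Sysmon switches `-accepteula`, `-i` and `-u` are used.

```python
import subprocess
import time
from typing import Callable, List

# Assumption: Sysmon.exe is reachable via PATH; adjust to the real location.
SYSMON = "Sysmon.exe"


def cycle_commands(sysmon: str = SYSMON) -> List[List[str]]:
    """Return the command lines for one install/uninstall cycle."""
    return [
        [sysmon, "-accepteula", "-i"],  # install service and driver with defaults
        [sysmon, "-u"],                 # uninstall service and driver
    ]


def run_cycles(
    count: int,
    interval_s: float,
    run: Callable = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
    sysmon: str = SYSMON,
) -> List[List[str]]:
    """Run `count` cycles, pausing `interval_s` seconds after every command.

    `run` and `sleep` are injectable so the loop can be dry-run without
    touching a real machine.
    """
    executed = []
    for _ in range(count):
        for cmd in cycle_commands(sysmon):
            run(cmd, check=True)  # raises if Sysmon returns a non-zero exit code
            executed.append(cmd)
            sleep(interval_s)
    return executed
```

    On a test server this would be called as `run_cycles(count=50, interval_s=20 * 60)` from an elevated prompt; the count of 50 is arbitrary.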

     

    Hosts - both virtualized and physical

    OS - Windows Server 2012 R2 in most recent tests

     

    Bugcheck details – the bugcheck code and the faulting IP have been the same in all dumps:

     

    SYSTEM_SERVICE_EXCEPTION (3b)

    An exception happened while executing a system service routine.

    Arguments:

    Arg1: 00000000c0000005, Exception code that caused the bugcheck

    Arg2: fffff800404c4048, Address of the instruction which caused the bugcheck

    Arg3: ffffd0002311c620, Address of the context record for the exception that caused the bugcheck

    Arg4: 0000000000000000, zero.

     

    EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

    FAULTING_IP:

    SysmonDrv+8048

    fffff800`404c4048 488b4840        mov     rcx,qword ptr [rax+40h]
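    As a cross-check when comparing dumps, the driver's load address can be derived from the faulting IP and the `SysmonDrv+8048` offset shown above. The helper names below are illustrative only; this is just the arithmetic, not debugger tooling.

```python
def parse_kd_address(text: str) -> int:
    """Parse a WinDbg-style 64-bit address such as fffff800`404c4048."""
    return int(text.replace("`", ""), 16)


def format_kd_address(value: int) -> str:
    """Format a 64-bit value the way WinDbg prints it: high`low dwords."""
    return f"{value >> 32:08x}`{value & 0xFFFFFFFF:08x}"


def module_base(faulting_ip: str, offset: int) -> str:
    """Subtract the module-relative offset from the faulting instruction address."""
    return format_kd_address(parse_kd_address(faulting_ip) - offset)


print(module_base("fffff800`404c4048", 0x8048))  # fffff800`404bc000
```

    The base differs from boot to boot, so the module-relative offset is the value to compare across machines.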

     

    Processes where bugcheck happens: splunkd.exe, noderunner.exe, Microsoft.Exchange.Search.Service.exe (recent tests were made on Exchange Server 2016 servers).

     

    Upon dump analysis, we noticed that the unregister thread runs concurrently with another thread in which SysmonDrv is invoked to filter an I/O event. In one case there were 3 executing threads in total: 2 handling I/O events and 1 unregistering the SysmonDrv filter.
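    To illustrate the kind of race this looks like, here is a user-mode toy model in Python. It is a sketch under assumptions, not driver code and not a description of how Sysmon works internally: `RundownRef` and `ToyFilter` are hypothetical names, and the pattern shown is the generic "rundown protection" idea (in the Windows kernel, `ExAcquireRundownProtection`, `ExReleaseRundownProtection` and `ExWaitForRundownProtectionRelease`), where teardown refuses new users and waits for in-flight ones before freeing shared state.

```python
import threading


class RundownRef:
    """Toy rundown protection: users take a reference before touching shared
    state; teardown blocks new references and waits for existing ones."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._active = 0
        self._rundown = False

    def acquire(self) -> bool:
        with self._cond:
            if self._rundown:
                return False  # teardown has started: caller must back off
            self._active += 1
            return True

    def release(self) -> None:
        with self._cond:
            self._active -= 1
            if self._active == 0:
                self._cond.notify_all()

    def wait_for_rundown(self) -> None:
        with self._cond:
            self._rundown = True
            while self._active:
                self._cond.wait()


class ToyFilter:
    """Stands in for a filter whose callbacks read driver-owned memory."""

    def __init__(self) -> None:
        self.rundown = RundownRef()
        self.state = {"rules": "loaded"}

    def post_callback(self) -> str:
        if not self.rundown.acquire():
            return "skipped"  # unloading: do not dereference state
        try:
            return self.state["rules"]
        finally:
            self.rundown.release()

    def unload(self) -> None:
        self.rundown.wait_for_rundown()  # no callback is inside state now
        self.state = None                # safe to free
```

    Without the `acquire` check, a callback that starts after `unload` has cleared `state` would dereference freed memory, which is the kind of access the dumps show.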

     

    Has anyone seen this behavior? How can we work around this? Thanks a bunch!

     

    Attaching an example as a zipped .txt file (uploads/52858/example.zip), as apparently it doesn't fit within the post limits.

     

     

    Friday, February 23, 2018 10:28 AM

All replies

  • BTW, I'm also seeing another post referring to an older build of the Sysmon tool:

    https://forum.sysinternals.com/sysmon-600-bugcheck_topic32756_post151700.html#151700

    Monday, March 12, 2018 8:01 AM
  • FWIW, tested with Sysmon v8.04 and still see the behaviour on a W2012R2 VMware-based Exchange 2016 machine. The thread uninstalling Sysmon is still active (pending access to pagefile.sys) while another thread still has its I/O routed through the Sysmon file system filter. At that point, zeroed Sysmon memory is referenced, causing the 0x3b bugcheck.

    2: kd> !mex.Running
    Process   PID Thread             Id Pri Base Pri Next CPU CSwitches    User     Kernel State         Time Reason
    ..
    ..
    ngen.exe 3500 ffffe0003e2f9880 324c   6        6        2     61475 12s.594     3s.703 Running          0 WrDispatchInt
    ..

    2: kd> !mex.us -cpu
    1 thread: ffffe0003e2f9880
    nt!KeBugCheckEx
    ...
    SysmonDrv+0x80f8 <- fffff801`fe41a0f8 488b4840        mov     rcx,qword ptr [rax+40h] //rax == 00000...
    SysmonDrv+0x447e
    SysmonDrv+0x49ae
    SysmonDrv+0x2a3c
    fltmgr!FltpPerformPostCallbacks+0x2d7
    fltmgr!FltpPassThroughCompletionWorker+0x7d
    fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x38b
    fltmgr!FltpCreate+0x339
    nt!IopParseDevice+0x6c9
    nt!IopParseFile+0xba
    nt!ObpLookupObjectName+0x7be
    nt!ObOpenObjectByName+0x1e3
    nt!IopCreateFile+0x825
    nt!IoCreateFile+0x8a
    nt!NtCreateNamedPipeFile+0x121
    nt!KiSystemServiceCopyEnd+0x13
    0x7fff98f011ca

    0: kd> !mex.t ffffe0003e473340
    Process                   Thread                       CID       UserTime KernelTime ContextSwitches Wait Reason Time State
    System (ffffe00039044600) ffffe0003e473340 (E|K|W|R|V) 4.3830           0      406ms           18488 WrPageIn       0 Waiting

    Irp List:
        IRP              File Driver
        ffffe0004f83c470      disk

     # Child-SP         Return           Call Site
     0 nt!KiSwapContext+0x76
     1 nt!KiSwapThread+0x14e
     2 nt!KiCommitThreadWait+0x129
     3 nt!KeWaitForSingleObject+0x373
     4 nt!MiWaitForInPageComplete+0x147
     5 nt!MiIssueHardFault+0x330
     6 nt!MmAccessFault+0x6f2
     7 nt!KiPageFault+0x317
     8 fltmgr!TreeUnlinkMultiDoWalk+0x243e
     9 fltmgr!FltpDeleteContextList+0x42
     a fltmgr!FltpCleanupStreamListCtrlForInstanceRemoval+0xca
     b fltmgr!FltpFreeInstance+0x13a
     c fltmgr!FltUnregisterFilter+0xf5
     d SysmonDrv+0x2fbd
     e fltmgr!FltpDoUnloadFilter+0xfa
     f fltmgr!FltpMiniFilterDriverUnload+0xfc
    10 nt! ?? ::NNGAKEGL::`string'+0x5ae60 <- nt!IopLoadUnloadDriver
    11 nt!ExpWorkerThread+0x69f
    12 nt!PspSystemThreadStartup+0x18a
    13 nt!KiStartSystemThread+0x16

    0: kd> !mex.mirp ffffe0004f83c470
    ...
    File Details: ...
        Name          
        ============= ... 
        \pagefile.sys

    ...boils down to a pending storport IRP:

        ->2 \Driver\pvscsi INTERNAL_DEVICE_CONTROL     0 storport!RaDriverScsiIrp  12   e1 Pending Cancel, Success, Error CLASSPNP!TransferPktComplete 

    Tuesday, January 15, 2019 4:59 PM
  • Hello

    Sorry for the delay in responding. This was reported to us at the end of December. It was traced to a race condition between the I/O thread (ffffe0003e2f9880 in your example) and the unload thread (ffffe0003e473340). This has been resolved, and the fix will be available in the forthcoming 8.05 release.

    Regards

    MarkC (MSFT)

    • Proposed as answer by egolov Monday, March 11, 2019 1:53 PM
    Wednesday, January 23, 2019 3:52 PM
  • Hello!

    Thank you for the information! We've tested this with v9.0 and so far no longer see the issue. Thanks for the fix and the continued improvement of the tool!

    Regards,

    Evgeny

    Monday, March 11, 2019 1:53 PM
  • Awesome, thanks for the confirmation, Evgeny.

    MarkC (MSFT)

    Monday, April 1, 2019 9:20 AM