Error 41 Kernel

  • Question

  • Hello there! After searching the Internet for quite a while I have found similar problems, but none identical, and I am frustrated trying to work out what is causing my Error 41, because there seem to be far too many variables.

    There have been no hardware changes on my system and nothing major software-wise. I have updated almost every driver I thought could be the cause.

    All I can offer is the last report from the Event Viewer, and to mention that my PC never restarts; it just crashes by going to a black screen while the sound becomes a constant "beeeeep". This does not happen very often, but it occurs almost only during gaming.

    Below is the report

    - System
      - Provider
        [Name] Microsoft-Windows-Kernel-Power
        [Guid] {331C3B3A-2005-44C2-AC5E-77220C37D6B4}
      EventID 41
      Version 2
      Level 1
      Task 63
      Opcode 0
      Keywords 0x8000000000000002
      - TimeCreated
        [SystemTime] 2014-05-13T18:23:41.272010200Z
      EventRecordID 125360
      Correlation
      - Execution
        [ProcessID] 4
        [ThreadID] 8
      Channel System
      Computer Sheevah-PC
      - Security
        [UserID] S-1-5-18
    - EventData
      BugcheckCode 278
      BugcheckParameter1 0xfffffa800eaf24e0
      BugcheckParameter2 0xfffff88010015e2c
      BugcheckParameter3 0xffffffffc000009a
      BugcheckParameter4 0x4
      SleepInProgress false
      PowerButtonTimestamp 0
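For anyone decoding a report like the one above: Event Viewer prints BugcheckCode in decimal, while bug check references are normally written in hexadecimal. Below is a minimal Python sketch of that conversion. The function name and the tiny lookup table are illustrative assumptions, and the table only covers the two codes named in this thread.

```python
# Illustrative helper (not part of Windows): convert the decimal BugcheckCode
# shown by Event Viewer into the hexadecimal form used in bug check references.

# Names as they are used in this thread; this is deliberately not a full table.
KNOWN_CODES = {
    0x116: "VIDEO_TDR_ERROR",
    0x117: "VIDEO_TDR_TIMEOUT_DETECTED",
}


def describe_bugcheck(decimal_code: int) -> str:
    """Return the code in hex plus a name if it is in the small table above."""
    hex_code = f"0x{decimal_code:X}"
    name = KNOWN_CODES.get(decimal_code, "not in this small table")
    return f"{hex_code} ({name})"


# BugcheckCode 278 is the value from the event report in this post.
print(describe_bugcheck(278))
```

The same formatting works for any other decimal code copied from the Kernel-Power event.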

    If I can provide any more information that could be useful, please ask and I'll reply ASAP.

    Thanks in advance.

    Tuesday, May 13, 2014 7:46 PM

All replies

  • Hi,

    In order to assist you, we will need the .DMP files to analyze what exactly occurred at the time of the crash, etc.

    If you don't know where .DMP files are located, here's how to get to them:

    1. Navigate to the %systemroot%\Minidump folder.

    2. Copy any and all DMP files in the Minidump folder to your Desktop and then zip up these files.

    3. Upload the zip containing the .DMP files to OneDrive or a hosting site of your choice and paste the link in your reply. Preferred sites: OneDrive, MediaFire, Dropbox, etc. Nothing with wait-timers.

    4 (optional): The .DMP files located in the Minidump folder are known as Small Memory Dumps. In %systemroot% there will be what is known as a Kernel-Dump (if your system is set to generate one). It is labeled MEMORY.DMP. Put simply, the difference between Small Memory Dumps and Kernel-Dumps is that a Kernel-Dump contains much more information from the time of the crash, therefore allowing further debugging of your issue. If your upload speed permits it, and you aren't up against any strict bandwidth and/or usage caps, the Kernel-Dump is the best choice. Do note that Kernel-Dumps are much larger due to containing much more info, which is why I mentioned upload speed.
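If you would rather script steps 1 and 2 than copy the files by hand, here is a minimal Python sketch. It assumes Python is installed and that the script has permission to read the Minidump folder (this usually means running it as administrator). The function name, the output file name minidumps.zip and the Desktop location are my own choices for illustration.

```python
import os
import zipfile
from pathlib import Path


def zip_dump_files(dump_dir: Path, zip_path: Path) -> list:
    """Add every .dmp file found in dump_dir to a new zip; return the names added."""
    dumps = sorted(
        p for p in dump_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".dmp"
    )
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dump in dumps:
            # Store only the file name, not the full Windows path.
            archive.write(dump, arcname=dump.name)
    return [dump.name for dump in dumps]


if __name__ == "__main__":
    # %systemroot% normally expands to C:\Windows; the fallback is an assumption.
    system_root = os.environ.get("SystemRoot", r"C:\Windows")
    minidump_dir = Path(system_root) / "Minidump"
    if minidump_dir.is_dir():
        target = Path.home() / "Desktop" / "minidumps.zip"
        print("Zipped:", zip_dump_files(minidump_dir, target))
    else:
        print("No Minidump folder found at", minidump_dir)
```

The resulting zip is what you would upload in step 3.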

    If you are going to use OneDrive but don't know how to upload to it, please visit the following:

    Upload photos and files to OneDrive.

    Please note that any "cleaner" programs such as TuneUp Utilities, CCleaner, etc, by default will delete .DMP files upon use.

    If your computer is not generating .DMP files, please do the following:

    1. Start > type %systemroot% which should show the Windows folder, click on it. Once inside that folder, ensure there is a Minidump folder created. If not, CTRL-SHIFT-N to make a New Folder and name it Minidump.

    2. Windows key + Pause key. This should bring up System. Click Advanced System Settings on the left > Advanced > Performance > Settings > Advanced > Ensure there's a check-mark for 'Automatically manage paging file size for all drives'.

    3. Windows key + Pause key. This should bring up System. Click Advanced System Settings on the left > Advanced > Startup and Recovery > Settings > System Failure > ensure there is a check mark next to 'Write an event to the system log'.

    Ensure Small Memory Dump is selected and ensure the path is %systemroot%\Minidump.

    4. Double check that the Windows Error Reporting Service (WERS) is ENABLED:

    Start > Search > type services.msc > Under the name tab, find Windows Error Reporting Service > If the status of the service is not Started then right click it and select Start. Also ensure that under Startup Type it is set to Automatic rather than Manual. You can do this by right clicking it, selecting properties, and under General selecting startup type to 'Automatic', and then click Apply.

    If you cannot get into normal mode to do any of this, please do this via Safe Mode.

    Regards,

    Patrick

    “Be kind whenever possible. It is always possible.” - Dalai Lama

    Tuesday, May 13, 2014 10:20 PM
  • Hi,

    To determine what may have happened and to identify a potential resolution, it is important to know what the computer was doing at the time just before the event occurred.

    This article gives you exactly what you want.

    http://support.microsoft.com/kb/2028504

     

    Regards


    v-yamliu

    Wednesday, May 14, 2014 9:42 AM
  • Hi, first of all, thank you for the response, and sorry for the late reply; I was at work and answered as soon as I got the chance. Here is a link to the dump file you asked for. I'm not 100% sure, but I think/hope it's the kernel one, so that it is of more help.

    As I am a new member I am not allowed to post links, so I have to ask you to type part of the address manually. I uploaded the file to Dropbox, so all you have to do is type the "http://" part and paste the rest given below. My apologies for the inconvenience.

    /s/9g1ns357ocr6n01/051314-14086-01.dmp

    I hope this helps. Looking forward to your reply.

         Regards

    Wednesday, May 14, 2014 2:42 PM
  • Hello and thanks for your reply!

    Just to mention that I had already tried following that troubleshooting article, but after searching and searching I got frustrated and realised I always ended up reading about something similar, yet not identical, to what I was experiencing, which is why I decided to create a thread.

    Wednesday, May 14, 2014 2:44 PM
  • Thanks a lot for the crash dump!

    The attached DMP file is of the VIDEO_TDR_TIMEOUT_DETECTED (117) bug check.

    This indicates that the display driver failed to respond in a timely fashion.

    0x117 is extremely similar to 0x116, therefore the troubleshooting steps remain the same:

    So, let me now explain what VIDEO_TDR_ERROR means. First off, TDR is an acronym for 'Timeout Detection and Recovery'. Timeout Detection and Recovery was introduced in Vista and carried over to Windows 7. Rather than paraphrasing what Timeout Detection and Recovery does, I'll just directly quote the MSDN article!

    Timeout detection:
    The GPU scheduler, which is part of the DirectX graphics kernel subsystem (Dxgkrnl.sys), detects that the GPU is taking more than the permitted amount of time to execute a particular task. The GPU scheduler then tries to preempt this particular task. The preempt operation has a "wait" timeout, which is the actual TDR timeout. This step is thus the timeout detection phase of the process. The default timeout period in Windows Vista and later operating systems is 2 seconds. If the GPU cannot complete or preempt the current task within the TDR timeout period, the operating system diagnoses that the GPU is frozen.
    To prevent timeout detection from occurring, hardware vendors should ensure that graphics operations (that is, DMA buffer completion) take no more than 2 seconds in end-user scenarios such as productivity and game play.
    Preparation for recovery:
    The operating system's GPU scheduler calls the display miniport driver's DxgkDdiResetFromTimeout function to inform the driver that the operating system detected a timeout. The driver must then reinitialize itself and reset the GPU. In addition, the driver must stop accessing memory and should not access hardware. The operating system and the driver collect hardware and other state information that could be useful for post-mortem diagnosis. 
    Desktop recovery:
    The operating system resets the appropriate state of the graphics stack. The video memory manager, which is also part of Dxgkrnl.sys, purges all allocations from video memory. The display miniport driver resets the GPU hardware state. The graphics stack takes the final actions and restores the desktop to the responsive state. As previously mentioned, some legacy DirectX applications might render just black at the end of this recovery, which requires the end user to restart these applications. Well-written DirectX 9Ex and DirectX 10 and later applications that handle Device Remove technology continue to work correctly. An application must release and then recreate its Direct3D device and all of the device's objects. For more information about how DirectX applications recover, see the Windows SDK.
     Article here.
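To make the quoted sequence easier to follow, here is a toy Python model of it. This is only a sketch of the logic as described in the quote, not how Dxgkrnl.sys is implemented. The function name, its two inputs and the outcome labels are hypothetical. The only value taken from the quote is the 2 second default timeout, and the 0x116 association comes from this post.

```python
from enum import Enum

TDR_TIMEOUT_SECONDS = 2.0  # default TDR timeout stated in the quoted article


class Outcome(Enum):
    RESPONSIVE = "GPU completed or was preempted in time"
    RECOVERED = "timeout detected, driver reset, desktop recovered"
    BUGCHECK = "timeout detected, recovery failed (0x116 per this post)"


def simulate_tdr(seconds_to_yield: float, reset_succeeds: bool) -> Outcome:
    """Toy model: seconds_to_yield is how long the GPU takes to complete or be
    preempted once the scheduler asks; reset_succeeds is whether the driver
    manages to reinitialize itself and reset the GPU."""
    # Timeout detection: within the timeout the GPU is not considered frozen.
    if seconds_to_yield <= TDR_TIMEOUT_SECONDS:
        return Outcome.RESPONSIVE
    # Preparation for recovery + desktop recovery: the driver is told to reset.
    if reset_succeeds:
        return Outcome.RECOVERED
    # Recovery did not work, so the system is brought down with a bug check.
    return Outcome.BUGCHECK


for args in [(0.5, True), (3.0, True), (3.0, False)]:
    print(args, "->", simulate_tdr(*args).name)
```

An overheating or failing card corresponds to the last case: the GPU stops responding and the reset cannot bring it back.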

    With this being said, if Timeout Detection and Recovery fails to recover the display driver, it will then throw the 0x116 bug check. There are many different things that can cause a 0x116, which I will explain below:

    1. Ensure you have the latest video card drivers. If you are already on the latest video card drivers, uninstall and install a version or a few versions behind the latest to ensure it's not a latest driver only issue. If you have already experimented with the latest video card driver and many previous versions, please give the beta driver for your card a try.

    2. Remove and replace Kaspersky with Microsoft Security Essentials for temporary troubleshooting purposes as it may be causing conflicts:

    Kaspersky removal - http://support.kaspersky.com/common/service.aspx?el=1464

    MSE -  http://windows.microsoft.com/en-us/windows/security-essentials-download

    The following hardware issues can cause a TDR event:

    1. Unstable overclock (CPU, GPU, etc). Revert all and any overclocks to stock settings.

    2. Bad sector in memory resulting in corrupt data being communicated between the GPU and the system (video memory otherwise known as vRAM or physical memory otherwise known as RAM).

    GPU testing: Furmark, run for ~15 minutes and watch temperatures to ensure there's no overheating and watch for artifacts.

    RAM testing: Memtest (RUN FOR NO LESS THAN ~8 PASSES) - Refer to the below:

    Memtest:

    Memtest86+:

    Download Memtest86+ here:

    http://www.memtest.org/

    Which should I download?

    You can either download the pre-compiled ISO that you would burn to a CD and then boot from the CD, or you can download the auto-installer for the USB key. What this will do is format your USB drive, make it a bootable device, and then install the necessary files. Both do the same job, it's just up to you which you choose, or which you have available (whether it's CD or USB).

    Do note that some older generation motherboards do not support USB-based booting, therefore your only option is CD (or Floppy if you really wanted to).

    How Memtest works:

    Memtest86 writes a series of test patterns to most memory addresses, reads back the data written, and compares it for errors.

    The default pass does 9 different tests, varying in access patterns and test data. A tenth test, bit fade, is selectable from the menu. It writes all memory with zeroes, then sleeps for 90 minutes before checking to see if bits have changed (perhaps because of refresh problems). This is repeated with all ones for a total time of 3 hours per pass.

    Many chipsets can report RAM speeds and timings via SPD (Serial Presence Detect) or EPP (Enhanced Performance Profiles), and some even support changing the expected memory speed. If the expected memory speed is overclocked, Memtest86 can test that memory performance is error-free with these faster settings.

    Some hardware is able to report the "PAT status" (PAT: enabled or PAT: disabled). This is a reference to Intel Performance acceleration technology; there may be BIOS settings which affect this aspect of memory timing.

    This information, if available to the program, can be displayed via a menu option.
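The write, read back and compare idea described above can be illustrated with a small Python sketch. To be clear about the assumptions: this runs against an ordinary Python buffer and a simulated fault, it does not test physical RAM, and the class, function name and pattern list are made up for the illustration rather than taken from Memtest86+.

```python
class FaultyMemory:
    """Simulated RAM in which bit 0 of one address always reads back as 1."""

    def __init__(self, size: int, bad_address: int):
        self._cells = bytearray(size)
        self._bad_address = bad_address

    def __len__(self) -> int:
        return len(self._cells)

    def __setitem__(self, address: int, value: int) -> None:
        self._cells[address] = value

    def __getitem__(self, address: int) -> int:
        value = self._cells[address]
        if address == self._bad_address:
            value |= 0x01  # the stuck bit
        return value


def pattern_test(memory, patterns=(0x00, 0xFF, 0xAA, 0x55)) -> list:
    """Write each pattern everywhere, read it back, and collect mismatches
    as (address, expected, actual) tuples."""
    errors = []
    for pattern in patterns:
        for address in range(len(memory)):
            memory[address] = pattern
        for address in range(len(memory)):
            actual = memory[address]
            if actual != pattern:
                errors.append((address, pattern, actual))
    return errors


print(pattern_test(bytearray(16)))        # healthy buffer: no mismatches
print(pattern_test(FaultyMemory(16, 3)))  # stuck bit shows up at address 3
```

Note that the stuck bit is only caught by patterns that expect a 0 in that position, which is one reason a real tester uses several patterns and several passes.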

    Any other questions, they can most likely be answered by reading this great guide here:

    http://forum.canardpc.com/threads/28864-FAQ-please-read-before-posting

    3. Corrupt hard drive or Windows install / OS install resulting in corruption to the registry or page file.

    HDD diagnostics: Seatools - Refer to the below:

    http://www.seagate.com/support/downloads/seatools/

    You can run it via Windows or DOS. Do note that the only difference is the environment you're running it in. If you believe you are having device driver related issues in Windows that may cause conflicts or false positives, it may be wise to choose the most minimal testing environment (DOS).

    Run all tests EXCEPT: Fix All, Long Generic, and anything Advanced.

    To reset your page file, follow the instructions below:

    a ) Go to Start...Run...and type in "sysdm.cpl" (without the quotes) and press Enter.

    - Then click on the Advanced tab,
    - Then on the Performance Settings Button,
    - Then on the next Advanced tab,
    - Then on the Virtual Memory Change button.

    b ) In this window, note down the current settings for your pagefile (so you can restore them later on).

    -Then click on the "No paging file" radio button, and

    - then on the "Set" button. Be sure, if you have multiple hard drives, that you ensure that the paging file is set to 0 on all of them.

    -Click OK to exit the dialogs.

    c ) Reboot (this will remove the pagefile from your system)

    d ) Then go back in following the directions in step a ) and re-enter the settings that you wrote down in step b ). Follow the steps all the way through (and including) the reboot.

    e ) Once you've rebooted this second time, go back in and check to make sure that the settings are as they're supposed to be.

    Run System File Checker:

    SFC.EXE /SCANNOW

    Go to Start and type in "cmd.exe" (without the quotes)

    At the top of the search box, right click on cmd.exe and select "Run as administrator"

    In the black window that opens, type "SFC.EXE /SCANNOW" (without the quotes) and press Enter.

    Let the program run and post back what it says when it's done. 

    - Overheating of the CPU, GPU and/or other components can cause 0x116 bugchecks. Monitor your temperatures and ensure the system is cooled adequately.

    - GPU failure- Heat, power issue (PSU issue), faulty vRAM, etc.
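For the temperature monitoring mentioned above, here is a minimal Python sketch that flags hot cards from text readings. It assumes you can get one "index, temperature" line per GPU from some tool. On NVIDIA cards that could be `nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader,nounits`, provided the tool is installed and supports your card. The function names, the sample readings and the 90 °C limit are hypothetical, so check the rated maximum for your own GPU.

```python
def parse_gpu_temperatures(csv_text: str) -> dict:
    """Parse lines of the form 'index, temperature' into {index: degrees_C}."""
    readings = {}
    for line in csv_text.strip().splitlines():
        index, temperature = (field.strip() for field in line.split(","))
        readings[int(index)] = int(temperature)
    return readings


def overheating_gpus(readings: dict, limit_celsius: int = 90) -> list:
    """Return the indexes of GPUs at or above the (hypothetical) limit."""
    return sorted(i for i, temp in readings.items() if temp >= limit_celsius)


# Made-up sample for a two-card SLI setup, not real measurements.
sample = "0, 71\n1, 100"
readings = parse_gpu_temperatures(sample)
print(readings)
print("Too hot:", overheating_gpus(readings))
```

Logging these readings while a stress test runs makes it easier to see which card climbs first.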

    The following software issues can cause a TDR event:

    - Incompatible drivers of any sort

    - Messy / corrupt registry

    - Corrupt Direct X - http://support.microsoft.com/kb/179113

    - Corrupt system files (run System File Checker as advised above)

    - Buggy and/or corrupt 3rd party drivers. If you suspect a 3rd party driver is the issue, enable Driver Verifier:

    Driver Verifier:

    What is Driver Verifier?

    Driver Verifier is included in Windows 8/8.1, 7, Windows Server 2008 R2, Windows Vista, Windows Server 2008, Windows 2000, Windows XP, and Windows Server 2003 to promote stability and reliability; you can use this tool to troubleshoot driver issues. Windows kernel-mode components can cause system corruption or system failures as a result of an improperly written driver, such as an earlier version of a Windows Driver Model (WDM) driver.

    Essentially, if there's a 3rd party driver believed to be at issue, enabling Driver Verifier will help flush out the rogue driver if it detects a violation.

    Before enabling Driver Verifier, it is recommended to create a System Restore Point:

    Vista - START | type rstrui - create a restore point
    Windows 7 - START | type create | select "Create a Restore Point"
    Windows 8 - http://www.eightforums.com/tutorials/4690-restore-point-create-windows-8-a.html

    How to enable Driver Verifier:

    Start > type "verifier" without the quotes > Select the following options -

    1. Select - "Create custom settings (for code developers)"
    2. Select - "Select individual settings from a full list"
    3. Check the following boxes -
    - Special Pool
    - Pool Tracking
    - Force IRQL Checking
    - Deadlock Detection
    - Security Checks (Windows 7 & 8)
    - DDI compliance checking (Windows 8)
    - Miscellaneous Checks
    4. Select  - "Select driver names from a list"
    5. Click on the "Provider" tab. This will sort all of the drivers by the provider.
    6. Check EVERY box that is NOT provided by Microsoft / Microsoft Corporation.
    7. Click on Finish.
    8. Restart.
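The checkboxes in step 3 correspond to bits in a single flags value, which is what the command-line form `verifier /flags <value> /driver <names>` takes. The Python sketch below combines them. Treat the bit values as an assumption to verify against Microsoft's Driver Verifier documentation for your Windows version before relying on them. The driver file names are placeholders, not real drivers.

```python
# Bit values assumed from Microsoft's Driver Verifier documentation;
# verify them for your Windows version before use.
VERIFIER_FLAGS = {
    "Special Pool": 0x00001,
    "Force IRQL Checking": 0x00002,
    "Pool Tracking": 0x00008,
    "Deadlock Detection": 0x00020,
    "Security Checks": 0x00100,
    "Miscellaneous Checks": 0x00800,
    "DDI compliance checking": 0x20000,  # Windows 8 only, per the list above
}

WINDOWS_7_OPTIONS = [
    "Special Pool", "Pool Tracking", "Force IRQL Checking",
    "Deadlock Detection", "Security Checks", "Miscellaneous Checks",
]


def combined_flags(options) -> int:
    """OR together the bit for every selected option."""
    value = 0
    for option in options:
        value |= VERIFIER_FLAGS[option]
    return value


def build_command(options, drivers) -> str:
    """Build the command line as text; this sketch does not run it."""
    return f"verifier /flags 0x{combined_flags(options):X} /driver " + " ".join(drivers)


# example1.sys and example2.sys are placeholder names.
print(build_command(WINDOWS_7_OPTIONS, ["example1.sys", "example2.sys"]))
```

The GUI steps above remain the safer route, since the Provider tab makes it easy to pick out only the non-Microsoft drivers.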

    Important information regarding Driver Verifier:

    - If Driver Verifier finds a violation, the system will BSOD. To expand on this a bit more for the interested, specifically what Driver Verifier actually does is it looks for any driver making illegal function calls. When and/if this happens, system corruption occurs if allowed to continue. When Driver Verifier is enabled, it is monitoring all 3rd party drivers (as we have it set that way) and when it catches a driver attempting to do this, it will quickly flag that driver as being a troublemaker, and bring down the system safely before any corruption can occur.

    - After enabling Driver Verifier and restarting the system, depending on the culprit, if for example the driver is on start-up, you may not be able to get back into normal Windows because Driver Verifier will detect it in violation almost straight away, and as stated above, that will cause / force a BSOD.

    If this happens, do not panic, do the following:

    - Boot into Safe Mode by repeatedly tapping the F8 key during boot-up.

    - Once in Safe Mode - Start > Search > type "cmd" without the quotes.

    - To turn off Driver Verifier, type in cmd "verifier /reset" without the quotes.

    - Restart and boot into normal Windows.

    If your OS became corrupt or you cannot boot into Windows after disabling verifier via Safe Mode:

    - Boot into Safe Mode by repeatedly tapping the F8 key during boot-up.

    - Once in Safe Mode - Start > type "system restore" without the quotes.

    - Choose the restore point you created earlier.

    -- Note that Safe Mode for Windows 8 is a bit different, and you may need to try different methods: 5 Ways to Boot into Safe Mode in Windows 8 & Windows 8.1

    How long should I keep Driver Verifier enabled for?

    I recommend keeping it enabled for at least 24 hours. If you don't BSOD by then, disable Driver Verifier. I will usually say whether or not I'd like for you to keep it enabled any longer.

    My system BSOD'd with Driver Verifier enabled, where can I find the crash dumps?

    They will be located in %systemroot%\Minidump

    Any other questions can most likely be answered by this article:
    http://support.microsoft.com/kb/244617

    Regards,

    Patrick

    Wednesday, May 14, 2014 11:35 PM
  • Unfortunately it seems that the cause of my problems is my GPU...

    I ran Furmark and I got the same crash :(

    It occurred quite fast, as both my GPUs (running 560 Ti SLI) reached ~100 °C about halfway through the test.

    So that should pin it on the GPU and exclude everything else (like the PSU), right?

    Also, I looked and noticed that one of my GPUs has one fan blade a bit bent due to some bad technician's work, which I am too upset to talk about.

    Do you think I should replace the cooler of that GPU, or is the damage already done permanent?

    I will test both GPUs one at a time to determine which one is the problem, but I think it's obvious.

    And I am beginning to think that buying two factory OC'd GPUs was a HUGE mistake...

    After just 2 years I am having these issues.

    • Marked as answer by Cloud_TS Tuesday, May 27, 2014 6:16 AM
    Thursday, May 15, 2014 4:53 PM
  • After some tests I found out that the problematic GPU is NOT the one with the bent fan blade but the other GPU. It seems like one of the two fans attached to that GPU is struggling A LOT (hence the crashes due to overheating). I should try replacing the fan, if it is possible to find one that suits the cooler, or else the whole cooler.

    I believe my problem is or soon will be solved, so I thank you a lot for all the information you provided and the time you spent looking into the .dmp I uploaded. Your help is priceless, thanks again.

    • Marked as answer by Cloud_TS Tuesday, May 27, 2014 6:16 AM
    Thursday, May 15, 2014 5:48 PM
  • Great troubleshooting finding the faulty fan! Please keep me updated.

    Regards,

    Patrick

    Thursday, May 15, 2014 10:19 PM