PCI-SERR

    Question

  • I’m trying to troubleshoot a PCI error on an IBM x3650 M3 7945.  The system is
    brand new, and I’ve tried both Windows Server 2008 R2 64-bit and Windows 7 Pro
    64-bit as operating systems (either one works for my application).

    I’m trying to troubleshoot the error at the hardware level.  The errors
    stack up, and they look like this in the firmware log:

    | Chassis Event Log | 02/24/2012 15:20:30 | PLAT0228 | A software NMI has occurred on system "SN# KQ14RYF" | Major |
    | Chassis Event Log | 02/24/2012 15:20:36 | PLAT0330 | Fault in slot "All PCI Error" on system "SN# KQ14RYF" | Major |
    | Chassis Event Log | 02/24/2012 15:20:36 | PLAT0330 | Fault in slot "One of PCI Error" on system "SN# KQ14RYF" | Major |
    | Chassis Event Log | 02/24/2012 15:21:49 | PLAT0230 | System "SN# KQ14RYF" has recovered from an NMI | Information |
    | Chassis Event Log | 02/24/2012 16:41:24 | PLAT0106 | "Host Power" has been turned off | Information |
    | Chassis Event Log | 02/24/2012 16:44:57 | PLAT0107 | "Host Power" has been turned on | Information |

    And:

    | 51       |02/24/2012 15:2... |System chassis 1 (Critical Interrupt - NMI State): Assertion: Software NMI.                                  |No Extended Data                                           |
    |          |0:30               |                                                                                                             |                                                           |
    | 52       |02/24/2012 15:2... |Group (not a physical entity) 130 (Slot/Connector - All PCI Error): Assertion: PCI Express #0 Fault Statu... |Type: ERROR_CODE Sev: IBM_ERROR_FATAL Class: Computing ... |
    |          |0:36               |s asserted.                                                                                                  |Sub-Class: Chipset Instance: 0x00000000 GUID: 0x69F4DCC... |
    |          |                   |                                                                                                             |9C4A7D51187DA Extended Data: 00 06 29 45 C3 B9 14 00 14... |
    |          |                   |                                                                                                             | 00 00 00 00 00 52 83 BB 7F 00 00 00 00 FF FF 00 00 CC ... |
    |          |                   |                                                                                                             |21 0A 00 04 00 00 00 00 04 00 00 40 00 00 00 00 00 40 0... |
    |          |                   |                                                                                                             |0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00... |
    |          |                   |                                                                                                             | 00 00                                                     |
    | 53       |02/24/2012 15:2... |Group (not a physical entity) 130 (Slot/Connector - One of PCI Error): Assertion: PCI Express #255 Fault ... |Type: ERROR_CODE Sev: IBM_ERROR_FATAL Class: I/O BUS Su... |
    |          |0:36               |Status asserted.                                                                                             |b-Class: PCI Instance: 0xFF000000 GUID: 0x69F4DCC9C4A7D... |
    |          |                   |                                                                                                             |51187DA Extended Data: 00 06 29 45 C3 B9 14 00 80 00 00... |
    |          |                   |                                                                                                             | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF 00 00 ... |
    |          |                   |                                                                                                             |00 05 00 86 80 0C 34 00 00 01 00 00 00 01 00 00 00 46 0... |
    |          |                   |                                                                                                             |0 10 00 12 00 00 00 00 20 00 00 13 00 00 00 26 01 04 00... |
    |          |                   |                                                                                                             | 14 00 00 00 06 00 01 00 15 00 00 00 00 00 00 00 16 00 ... |
    |          |                   |                                                                                                             |00 00 00 40 00 00 17 00 00 00 00 00 00 00 18 00 00 00 0... |
    |          |                   |                                                                                                             |0 00 28 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00... |
    |          |                   |                                                                                                             | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... |
    |          |                   |                                                                                                             |00 00 00 00 00 00 00 19 00 00 00 1A 01 00 00 00 00 00 0... |
    |          |                   |                                                                                                             |0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00          |
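One way to pull something more specific out of the Extended Data is to scan it for byte pairs that look like a PCI vendor ID. The sketch below is only a guess at the layout: IBM's extended-data format is not described anywhere in this thread, so a match may be coincidental. It assumes IDs are stored little-endian, as in PCI configuration space, and searches for `86 80` (0x8086, Intel's PCI vendor ID), treating the next two bytes as a candidate device ID. `find_vendor_ids` and `SAMPLE` are hypothetical names; the sample bytes are copied from one row of entry 53.

```python
from typing import List, Tuple

# Bytes copied verbatim from one wrapped row of entry 53's Extended Data.
SAMPLE = "00 05 00 86 80 0C 34 00 00 01 00 00 00 01 00 00 00 46"


def find_vendor_ids(hex_dump: str, vendor_id: int) -> List[Tuple[int, int]]:
    """Return (offset, candidate_device_id) for each little-endian match.

    Assumption: a vendor ID is immediately followed by a 16-bit device ID,
    as in PCI configuration space. The log format itself is undocumented here.
    """
    data = bytes.fromhex(hex_dump)            # whitespace between bytes is ignored
    needle = vendor_id.to_bytes(2, "little")  # 0x8086 -> b"\x86\x80"
    hits = []
    start = data.find(needle)
    while start != -1:
        if start + 4 <= len(data):            # need two more bytes for the device ID
            device_id = int.from_bytes(data[start + 2:start + 4], "little")
            hits.append((start, device_id))
        start = data.find(needle, start + 1)
    return hits


for offset, device_id in find_vendor_ids(SAMPLE, 0x8086):
    print(f"offset {offset}: vendor 8086, candidate device {device_id:04X}")
```

Any candidate vendor/device pair found this way can be looked up in a PCI ID database and compared with the devices listed in Device Manager, which may help narrow down which slot or bridge reported the fault.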



    I realize that this is probably a PCI card driver or the card's hardware having issues with the computer, but I’m trying to give the programmers better feedback than “hardware incompatibility”.  I’m looking for advice on how to get that for them, as I've never gone this low in the hardware before.
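As a starting point for more specific feedback, the chassis event log rows quoted earlier can be turned into structured records, so the fault sequence and timing can be reported exactly. This is a minimal sketch, assuming each entry is first normalized to a single pipe-delimited line with five fields; `ChassisEvent` and `parse_row` are hypothetical names, not part of any IBM tool.

```python
from datetime import datetime
from typing import List, NamedTuple


class ChassisEvent(NamedTuple):
    source: str
    timestamp: datetime
    code: str
    message: str
    severity: str


def parse_row(row: str) -> ChassisEvent:
    """Split one pipe-delimited log row into its five fields."""
    cells = [" ".join(cell.split()) for cell in row.strip().strip("|").split("|")]
    source, stamp, code, message, severity = cells
    return ChassisEvent(
        source, datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"), code, message, severity
    )


# Rows from the firmware log in the question, one entry per line.
ROWS = [
    '| Chassis Event Log | 02/24/2012 15:20:30 | PLAT0228 | A software NMI has occurred on system "SN# KQ14RYF" | Major |',
    '| Chassis Event Log | 02/24/2012 15:20:36 | PLAT0330 | Fault in slot "All PCI Error" on system "SN# KQ14RYF" | Major |',
    '| Chassis Event Log | 02/24/2012 15:20:36 | PLAT0330 | Fault in slot "One of PCI Error" on system "SN# KQ14RYF" | Major |',
    '| Chassis Event Log | 02/24/2012 15:21:49 | PLAT0230 | System "SN# KQ14RYF" has recovered from an NMI | Information |',
    '| Chassis Event Log | 02/24/2012 16:41:24 | PLAT0106 | "Host Power" has been turned off | Information |',
    '| Chassis Event Log | 02/24/2012 16:44:57 | PLAT0107 | "Host Power" has been turned on | Information |',
]

events: List[ChassisEvent] = [parse_row(r) for r in ROWS]
major = [e for e in events if e.severity == "Major"]

# Time between the NMI being raised and the system reporting recovery.
nmi = next(e for e in events if e.code == "PLAT0228")
recovered = next(e for e in events if e.code == "PLAT0230")
nmi_seconds = (recovered.timestamp - nmi.timestamp).total_seconds()

for e in major:
    print(f"{e.timestamp:%H:%M:%S}  {e.code}  {e.message}")
print(f"NMI to recovery: {nmi_seconds:.0f} s")
```

A summary like this (which codes fired, in what order, and how long the NMI state lasted) is something the programmers and IBM support can line up against driver or application logs from the same time window.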

    I’m also following up with IBM but since these are new servers it’s taking them a bit to get their support system figured out.

     

    Thanks,

    JP


    Monday, February 27, 2012 4:21 PM

Answers

  •  

    Hi,

    Based on your description, the issue is most likely caused by a hardware driver. I suggest performing a clean boot first, then updating the driver to the latest version and testing again.

    You can also try updating the BIOS and chipset drivers from IBM. For hardware or driver issues, it is best to contact IBM directly.

    Thank you for your understanding.

    Regards,

    Leo Huang


    TechNet Community Support

    Wednesday, February 29, 2012 6:20 AM
    Moderator