IsAlive and KeepAlive checks in Win 2008

    Question

  • I want to understand in detail how the IsAlive and KeepAlive health checks work for physical disks in a Windows 2008 cluster. I know there are two levels of checks: file system level and device level. How do they operate in 2008 with Persistent Reservations?

    I am getting the following errors in my cluster log and would like to understand which kind of keep-alive check is failing. I do know that when this happens, the reservations were cleared on the storage because of a problem.

     

    000008ec.00000988::2010/03/17-10:30:33.147 ERR [RES] Physical Disk <Cluster Disk 13>: IsAlive sanity check failed!, pending IO completed with status 1235.

    000008ec.00000c00::2010/03/17-10:30:33.147 ERR [RES] Physical Disk <Cluster Disk 6>: IsAlive sanity check failed!, pending IO completed with status 1235.

    000008ec.00000f98::2010/03/17-10:30:33.147 ERR [RES] Physical Disk <Cluster Disk 8>: IsAlive sanity check failed!, pending IO completed with status 1235.

    000008ec.00000a04::2010/03/17-10:30:33.147 ERR [RES] Physical Disk <Cluster Disk 2>: IsAlive sanity check failed!, pending IO completed with status 1235.

    000008ec.00000530::2010/03/17-10:30:33.147 ERR [RES] Physical Disk <Cluster Disk 10>: IsAlive sanity check failed!, pending IO completed with status 1235.

    Wednesday, March 24, 2010 8:27 PM

Answers

All replies

  • Hi Amit

    err 1235

      ERROR_REQUEST_ABORTED                                         winerror.h
    # The request was aborted.

    You actually answered it yourself: if the reservation on the disk was cleared because of a problem in the storage stack, the IsAlive sanity check is bound to fail.

    AFAIK, IsAlive tries to access a file in the root directory of the disk, while LooksAlive checks the reservation.
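
    To see which disks are affected and what the status code means, the log lines quoted in the question can be summarized with a short script. This is only a sketch: the regular expression is inferred from the quoted lines, and `parse_isalive_failures` and `WIN32_ERRORS` are hypothetical names for illustration, not part of any cluster tooling.

```python
import re

# Win32 status code discussed in this thread (symbolic name from winerror.h).
WIN32_ERRORS = {1235: "ERROR_REQUEST_ABORTED"}

# Pattern inferred from the log lines quoted in the question (an assumption,
# not a documented cluster log format).
LINE_RE = re.compile(
    r"ERR\s+\[RES\] Physical Disk <(?P<disk>[^>]+)>: "
    r"IsAlive sanity check failed!, pending IO completed with status (?P<status>\d+)"
)


def parse_isalive_failures(log_text):
    """Return (disk, status, symbolic name) for each failed IsAlive check."""
    failures = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            status = int(match.group("status"))
            name = WIN32_ERRORS.get(status, "UNKNOWN")
            failures.append((match.group("disk"), status, name))
    return failures


# Two of the lines from the question, used as sample input.
sample = (
    "000008ec.00000988::2010/03/17-10:30:33.147 ERR [RES] Physical Disk "
    "<Cluster Disk 13>: IsAlive sanity check failed!, pending IO completed "
    "with status 1235.\n"
    "000008ec.00000c00::2010/03/17-10:30:33.147 ERR [RES] Physical Disk "
    "<Cluster Disk 6>: IsAlive sanity check failed!, pending IO completed "
    "with status 1235.\n"
)

for disk, status, name in parse_isalive_failures(sample):
    print(f"{disk}: status {status} ({name})")
```

    Every quoted line carries the same timestamp and the same status, which fits a single event on the storage side rather than independent failures on each disk.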


    Gaurav Anand
    Thursday, March 25, 2010 6:50 AM
  • Thanks Gaurav.

    If the reservations are lost, how does the IsAlive sanity check know about it? Does it use a PRI command to find out whether a reservation exists?

    Losing the reservation should not prevent MSCS from accessing the root directory.

     

    Thursday, March 25, 2010 7:19 AM
  • Hi Amit

    If a node loses the reservation, it cannot access the disk. See the link below to read more about how PR works in failover clustering.

    http://itinfras.blogspot.com/2010/03/storage-architecture-changes-for.html


    Gaurav Anand
    Thursday, March 25, 2010 8:55 AM