Event ID 1055 & 1069

    Question

  • Hi,

    Please help me identify the cause of the errors below and a fix.

    Event Type:    Error
    Event Source:    ClusSvc
    Event Category:    File Share Resource
    Event ID:    1055
    Date:        7/15/2011
    Time:        3:24:50 PM
    User:        N/A
    Computer:    USXXXXXXXX
    Description:
    Cluster File Share resource 'XXXXXXXXXXXX  ' has failed a status check. The error code is 64.

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
    Data:
    0000: 40 00 00 00               @...  

    Event Type:    Error
    Event Source:    ClusSvc
    Event Category:    Failover Mgr
    Event ID:    1069
    Date:        7/15/2011
    Time:        3:24:50 PM
    User:        N/A
    Computer:    USXXXXXXXX
    Description:
    Cluster resource 'XXXXXXXXXXXX  ' in Resource Group '3. JDE Xe Production Apps' failed.

    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.


    Event Type:    Information
    Event Source:    ClusSvc
    Event Category:    Failover Mgr
    Event ID:    1205
    Date:        7/15/2011
    Time:        3:33:51 PM
    User:        N/A
    Computer:    USXXXXXXXX
    Description:
    The Cluster Service failed to bring the Resource Group "XXXXXXXXXXXX" completely online or offline.



    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.



    d58:d64.07/08[15:24:43.548](000000) INFO File Share <41. WIM_Interfaces>: Retrying FindFirstFile on error 64 for share \\USxxxxxxx\WIM\*.* !
    d58:d64.07/08[15:25:51.800](000000) INFO File Share <41. WIM_Interfaces>: Retrying FindFirstFile on error 64 for share \\USxxxxxxx\WIM\*.* !
    d58:d64.07/08[15:27:00.051](000000) INFO File Share <41. WIM_Interfaces>: Retrying FindFirstFile on error 64 for share \\USxxxxxxx\WIM\*.* !
    d58:d64.07/08[15:28:08.255](000000) ERR  File Share <41. WIM_Interfaces>: Share has gone offline, Error=64 !
    af4:bbc.07/08[15:28:08.271](000000) WARN [FM] FmpHandleResourceTransition: Resource Name = f352b88b-402f-4f5b-a662-f3d2d95a1b79 [41. WIM_Interfaces] old state=2 new state=4
    af4:bbc.07/15[10:44:53.699](053254) WARN [FM] FmpHandleResourceTransition: Resource Name = 0bfa4d35-9a09-4fa3-bbd3-997c1234d8c7 [PBX-USxxxxx] old state=2 new state=4
    d58:d64.07/15[15:21:25.940](053261) INFO File Share <9. Manu Oracle Bak  >: Retrying FindFirstFile on error 64 for share \\USxxxxxxxx\BAK\*.* !
    d58:d64.07/15[15:22:34.191](053261) INFO File Share <9. Manu Oracle Bak  >: Retrying FindFirstFile on error 64 for share \\USxxxxxxxx\BAK\*.* !
    d58:d64.07/15[15:23:42.442](053261) INFO File Share <9. Manu Oracle Bak  >: Retrying FindFirstFile on error 64 for share \\USxxxxxxxx\BAK\*.* !
    d58:d64.07/15[15:24:50.631](053261) ERR  File Share <9. Manu Oracle Bak  >: Share has gone offline, Error=64 !
    af4:bbc.07/15[15:24:50.631](053261) WARN [FM] FmpHandleResourceTransition: Resource Name = f960efa5-0cc2-4344-ab4c-64efbe6f4845 [9. Manu Oracle Bak  ] old state=2 new state=4
    af4:dc8.07/15[16:32:09.490](053284) INFO [FM] FmpRmOfflineResource: RmOffline() for f0d87cf1-4b87-43cc-a19c-273d7c0a4486 returned error 997
    af4:dc8.07/15[16:38:05.950](053294) INFO [FM] FmpRmOfflineResource: RmOffline() for f0d87cf1-4b87-43cc-a19c-273d7c0a4486 returned error 997
    af4:690.07/15[16:38:15.544](053299) INFO [FM] FmpRmOfflineResource: RmOffline() for 4324d689-1f2a-43cd-ad63-19c3d76e81bd returned error 997
    af4:bc0.07/15[20:01:44.354](053314) WARN [LM] LogpReset:: deleting previous checkpoint file Q:\MSCS\chkCFFB.tmp


    Karuna
    Tuesday, July 19, 2011 6:15 PM

Answers

  • In my experience, this is one of the tougher problems to troubleshoot. The cluster tries to provide high availability by routinely checking the file share to make sure it responds. In your case, the "FindFirstFile" call does not respond in a timely fashion, so the cluster service takes the file share offline and then either brings it back online or fails over, depending on the failure count. Read through the article below carefully, as you may need to implement several tuning changes. Let's also make sure this is not a performance issue: are there any events such as 2018, 2019, 2020, or 2021 in the System event log? Does your customer store PST files or other large files that are not supported on a file share? I have seen that cause this before.

    You cannot access the existing File Share resources on a Windows Server 2003 failover cluster
    http://support.microsoft.com/kb/961657

    Dave Guenthner [MSFT]
    This posting is provided "AS IS" with no warranties, and confers no rights.
    http://blogs.technet.com/b/davguents_blog/

    Tuesday, July 19, 2011 8:54 PM
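
The cluster log excerpt in the question shows the pattern described in the answer above: three "Retrying FindFirstFile" entries roughly 68 seconds apart, followed by "Share has gone offline". As a rough sketch (not part of the original answer), the following Python script groups those entries per resource and reports how long the share was retried before it was failed. The function names, the regular expression, and the `year` parameter are assumptions made for illustration; the log lines carry no year, so 2011 is taken from the event log entries. The error descriptions are the standard Win32 meanings of codes 64 and 997.

```python
import re
from datetime import datetime

# Standard Win32 error codes that appear in this thread.
WIN32_ERRORS = {
    64: "ERROR_NETNAME_DELETED (the specified network name is no longer available)",
    997: "ERROR_IO_PENDING (overlapped I/O operation is in progress)",
}

# Assumed line shape, taken from the excerpt in the question:
# d58:d64.07/15[15:21:25.940](053261) INFO File Share <name>: message
LINE_RE = re.compile(
    r"\.(?P<date>\d\d/\d\d)\[(?P<time>\d\d:\d\d:\d\d\.\d{3})\]\(\d+\)\s+"
    r"(?P<level>INFO|ERR|WARN)\s+File Share <(?P<name>[^>]*)>: (?P<msg>.*)"
)


def parse_share_events(lines, year=2011):
    """Yield (timestamp, level, resource_name, message) for File Share lines."""
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a File Share resource line
        ts = datetime.strptime(
            f"{year}/{m.group('date')} {m.group('time')}", "%Y/%m/%d %H:%M:%S.%f"
        )
        yield ts, m.group("level"), m.group("name").strip(), m.group("msg").strip()


def summarize_offline(lines, year=2011):
    """Return one summary dict per 'Share has gone offline' entry."""
    retries = {}  # resource name -> retry timestamps since the last offline event
    summaries = []
    for ts, _level, name, msg in parse_share_events(lines, year):
        if msg.startswith("Retrying FindFirstFile"):
            retries.setdefault(name, []).append(ts)
        elif msg.startswith("Share has gone offline"):
            seen = retries.pop(name, [])
            code = int(re.search(r"Error=(\d+)", msg).group(1))
            summaries.append({
                "resource": name,
                "offline_at": ts,
                "retries": len(seen),
                "seconds_from_first_retry":
                    (ts - seen[0]).total_seconds() if seen else None,
                "error": WIN32_ERRORS.get(code, str(code)),
            })
    return summaries


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): python share_offline.py < cluster.log
    for item in summarize_offline(sys.stdin):
        print(item)
```

A long and consistent gap between the first retry and the offline entry suggests the share is unresponsive for minutes at a time, rather than dropping a single probe.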

All replies

  • Hi Dave,

    I tuned the Server service, but the issue still recurs.

    The failure happens only on Fridays, between 3:20 PM and 3:30 PM. When I stopped the AV services during that window, the cluster failure did not happen at all.

    Do you have any idea what could cause this?

    Regards,

    Karuna


    Karuna
    Friday, October 28, 2011 3:05 AM
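
One way to check the "only on Fridays around 3:20 PM" observation against the cluster log is to bucket every "Share has gone offline" entry by weekday and time window. The sketch below is an illustration only: the function name, the regular expression, and the 10-minute window are assumptions, the year is assumed to be 2011 because the log lines do not carry one, and the log timestamps are assumed to be in the same time zone as the event log entries (they match in the excerpt posted in the question).

```python
import re
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Assumed shape of an offline entry, taken from the excerpt in the question.
OFFLINE_RE = re.compile(
    r"\.(\d\d/\d\d)\[(\d\d:\d\d:\d\d)\.\d{3}\]\(\d+\)\s+ERR\s+"
    r"File Share <[^>]*>: Share has gone offline"
)


def offline_windows(lines, year=2011, minutes=10):
    """Count offline events per (weekday, start of an N-minute window)."""
    counts = Counter()
    for line in lines:
        m = OFFLINE_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(
            f"{year}/{m.group(1)} {m.group(2)}", "%Y/%m/%d %H:%M:%S"
        )
        window_start = ts.minute - ts.minute % minutes
        counts[(DAYS[ts.weekday()], f"{ts.hour:02d}:{window_start:02d}")] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): python offline_windows.py < cluster.log
    for (day, window), n in offline_windows(sys.stdin).most_common():
        print(day, window, n)
```

If nearly all events fall into a single weekday and window, that points toward a scheduled job, such as the AV scan mentioned above, rather than a general capacity problem.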