Storage drives not showing on nodes and cluster VIP down after storage went offline and came back online. Can you help me? It is urgent.

  • August 6, 2012 3:46
     
     

    Dear all,

    This is very urgent. Can somebody help me resolve this issue?

    I have an active/passive (A/P) cluster on Windows 2003 R2 64-bit with SQL Server 2005 SP3.

    This morning, for some reason, my storage went offline for some time. After it came back, my VIP did not start. I restarted both nodes, but the drives still have not come back and the VIP is still down.

    Please, can somebody suggest how to investigate the issue?

    Thanks in advance


    SNIVAS

All replies

  • August 6, 2012 7:03
     
     

    Open the cluster log in Notepad, go to the bottom of the file, then search backwards for "ERR" (case sensitive). You can find the log on each node at %systemroot%\Cluster\cluster.log. Once you locate the entries where the cluster network fails to start, you should have some indication of the cause.

    The log times are in UTC, so adjust accordingly for the time zone of your servers.
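
    For a large log, the same search can be scripted. The following is a minimal Python sketch, not an official tool: it assumes every entry follows the `process.thread::date-time LEVEL message` layout seen in the excerpt later in this thread, and the function names `parse_line`, `last_entries` and `print_recent` are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed entry layout (taken from the excerpt in this thread):
#   0000082c.00000bd4::2012/08/05-05:08:06.580 WARN [NM] message text
LINE_RE = re.compile(
    r"^\s*[0-9a-fA-F]+\.[0-9a-fA-F]+::"
    r"(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(ERR|WARN|INFO)\s+(.*)$"
)


def parse_line(line):
    """Return (utc_datetime, level, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    stamp = datetime.strptime(m.group(1), "%Y/%m/%d-%H:%M:%S.%f")
    # Cluster log timestamps are UTC, so mark them as such.
    return stamp.replace(tzinfo=timezone.utc), m.group(2), m.group(3)


def last_entries(lines, levels=("ERR",), limit=20):
    """Walk the log from the bottom up and collect the most recent matching entries."""
    found = []
    for line in reversed(lines):
        entry = parse_line(line)
        if entry is not None and entry[1] in levels:
            found.append(entry)
            if len(found) >= limit:
                break
    return found


def print_recent(path, levels=("ERR", "WARN")):
    """Print recent entries with timestamps converted to this machine's local time."""
    with open(path, errors="replace") as fh:
        lines = fh.readlines()
    for utc, level, message in last_entries(lines, levels):
        print(utc.astimezone().strftime("%Y-%m-%d %H:%M:%S"), level, message)
```

    Calling `print_recent(r"C:\WINDOWS\Cluster\cluster.log")` on a node (the path is an example; use wherever %systemroot% points) lists the newest ERR and WARN entries first, already shifted from UTC to the server's local time.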


  • August 7, 2012 3:41
     
     

    Hi all,

    I did not find any ERR entries, but I could see WARN entries:

    0000082c.00000bd4::2012/08/05-05:08:06.580 WARN [NM] Network 16ddb513-2802-448d-91a6-327eacbe8db5 state is indeterminate, Scheduling Failure Isolation poll
    0000082c.00000bd4::2012/08/05-05:08:06.580 WARN [NM] Failed to read IsolationPollTimerValue or value out of range,status 2 using default 60000 ms
    0000082c.00000bd4::2012/08/05-05:08:06.580 INFO [NM] Started failure isolation timer (60000ms) for network 16ddb513-2802-448d-91a6-327eacbe8db5 (STORAGE)
    0000082c.00000bd4::2012/08/05-05:08:06.580 INFO [NM] Connectivity failure scope determination complete for network 16ddb513-2802-448d-91a6-327eacbe8db5.

    Actions I took:

    1. Restarted the nodes one by one.

    2. Shut down the nodes one by one and brought back one node.

    But even then I did not see the SAN drives on the nodes or in Disk Management. I also tried to start the Cluster service, but it does not start and shows a warning saying to start the service with a local account or a network account.

    Can anyone help me understand what went wrong after the storage disconnected and reconnected?

    Thanks in advance


    SNIVAS


    • Edited by SNIVAS August 7, 2012 8:27
  • August 7, 2012 11:02
     

    Until the cluster is up and running, only start one node and diagnose why the cluster service will not start - DON'T START OTHER NODES.

    Make sure that the quorum disk is connected. Is the quorum disk Fiber Channel and SAN connected? Confirm with your storage team that the quorum disk is connected correctly to the first node.

    Restart the first node (only) and immediately check the cluster log to see what is going wrong. Also look for errors (and warnings) in the Windows System Event log. Try to match these with the messages in the cluster log.
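
    Matching the two logs by hand means converting times, since the cluster log is in UTC while Event Viewer shows local time. A rough Python sketch of that matching step follows; it assumes the cluster log timestamp layout shown in the excerpt earlier in this thread, and `cluster_entries` and `near_event` are illustrative names, not part of any Microsoft tool. The event time has to be typed in by hand from Event Viewer.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed timestamp layout, e.g. "...::2012/08/05-05:08:06.580 WARN [NM] ..."
STAMP_RE = re.compile(
    r"::(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+(ERR|WARN|INFO)\s"
)


def cluster_entries(lines):
    """Yield (utc_datetime, line) for every line carrying a cluster log timestamp."""
    for line in lines:
        m = STAMP_RE.search(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%Y/%m/%d-%H:%M:%S.%f")
            yield stamp.replace(tzinfo=timezone.utc), line.rstrip()


def near_event(lines, event_time, window_seconds=30):
    """Return cluster log lines within window_seconds of a timezone-aware event time."""
    window = timedelta(seconds=window_seconds)
    event_utc = event_time.astimezone(timezone.utc)
    return [
        line
        for stamp, line in cluster_entries(lines)
        if abs(stamp - event_utc) <= window
    ]
```

    For example, for a System Event log entry at 10:38:00 on a server set to UTC+5:30 (a made-up offset for illustration), pass `datetime(2012, 8, 5, 10, 38, 0, tzinfo=timezone(timedelta(hours=5, minutes=30)))` as `event_time` to get the cluster log lines written within 30 seconds of it.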

    • Marked as answer by SNIVAS August 22, 2012 9:50
  • August 7, 2012 12:25
     
     

    Thanks, Stephen, for the reply.

    The problem here is that the quorum disk itself is not appearing on any node, so the Cluster service is not starting.

    As for the connection, we are using Fibre Channel.

    If it comes to rebuilding the cluster, can you tell me what I have to do?

    Please provide the steps.

    Thanks in advance


    SNIVAS

  • August 7, 2012 14:52
     
     
    Does the quorum disk happen to show in Disk Management, or is it missing from there as well? You could build a new quorum drive if needed, but you might get the cluster back up and running sooner if you can determine why the disk is not showing as available to the cluster. As Stephen mentioned, bring only one node up, and possibly try re-presenting the quorum drive to that node to see if it shows up then. If that does not work, check for Fibre Channel errors or an issue on your storage system.

    Chris. Please remember to mark correct answers and helpful posts.

  • August 7, 2012 16:58
    Moderator
     
     

    If this is urgent, you should open a ticket with Microsoft PSS or your storage vendor.

    The quorum does not appear because the cluster service isn't starting. The question would be why isn't the cluster service starting. As Stephen Strong suggested, you should look in the cluster.log on the node where the failure is occurring for this answer. 


    Visit my blog about multi-site clustering

  • August 22, 2012 9:51
     
     

    Thanks all,

    We found the LUNs that we had lost on the storage.


    SNIVAS