Operating System:

Windows Server 2003 R2 Enterprise x64 Edition

Problem:

When the Cluster service is started on a cluster node, it fails with the following error:

“Could not start Cluster Service on <Node>.

Error 1067: The process terminated unexpectedly.”

The cluster.log file on the node shows the following errors:

"00000ed0.00000f4c::2011/05/13-13:24:21.623 INFO [CS] fixquorum option chosen: quorum is not arbitrated or brought online
00000ed0.00000f4c::2011/05/13-13:24:21.623 INFO [INIT] ClusterInitialize called to start cluster.
00000ed0.00000f4c::2011/05/13-13:24:21.638 INFO [EP] Initialization...
00000ed0.00000f4c::2011/05/13-13:24:21.638 INFO [DM] Initialization
00000ed0.00000f4c::2011/05/13-13:24:21.638 ERR  [DM] DmInitialize: The hive was loaded- rollback, unload and reload again
00000ed0.00000f4c::2011/05/13-13:24:21.638 INFO [DM] DmpRestartFlusher: Entry
00000ed0.00000f4c::2011/05/13-13:24:21.638 INFO [DM] DmpUnloadHive: unloading the hive
00000ed0.00000f4c::2011/05/13-13:24:21.716 INFO [Qfs] QfsSetFileAttributes C:\WINDOWS\Cluster\CLUSDB.BKP$ 80, status 2
00000ed0.00000f4c::2011/05/13-13:24:21.716 INFO [Qfs] QfsDeleteFile C:\WINDOWS\Cluster\CLUSDB.BKP$, status 2
00000ed0.00000f4c::2011/05/13-13:24:21.716 INFO [DM] Loading cluster database from C:\WINDOWS\Cluster\CLUSDB
00000ed0.00000f4c::2011/05/13-13:24:21.793 INFO [DM] DmpStartFlusher: Entry
00000ed0.00000f4c::2011/05/13-13:24:21.793 INFO [DM] DmpStartFlusher: thread created
00000ed0.00000f4c::2011/05/13-13:24:21.793 ERR  [DM] Failed to open key Resources, status 2
00000ed0.00000f4c::2011/05/13-13:24:21.793 ERR  Cluster service suffered an unexpected fatal error at line 1386 of source module d:\nt\base\cluster\service\dm\dminit.c. The error code was 2."

Cause:

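This behavior occurs when the local cluster database is corrupted, inaccessible, or otherwise unusable. In the log above, the service loads C:\WINDOWS\Cluster\CLUSDB but then fails to open the Resources key with status 2 (Win32 ERROR_FILE_NOT_FOUND), so the loaded database does not contain the data the service expects.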

Solution:

To resolve the issue, try the following options in order.

1. Start the Cluster service (service name ClusSvc) on the problem node with one of the following switches:

net start clussvc /NoQuorumLogging

net start clussvc /ResetQuorumLog

net start clussvc /NoQuorumLog

2. If the issue persists, replace the local cluster database file on the problem node with a copy from the active, working node:

a. Take a full backup of all data and applications on the active node.

b. Bring down the active node.

c. Locate the local database file on the problem node:

File Name: CLUSDB

Location: C:\Windows\Cluster

3. Rename the corrupted CLUSDB file to clusdb.old and restart the problem node.

4. On the active node, copy the database checkpoint file from the \MSCS folder on the quorum disk:

File Name: Chkxxx.tmp

5. Copy this file to C:\Windows\Cluster on the problem node and restart that node.

6. The Cluster service should now start automatically, and both servers should work normally.
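The file operations in steps 3 and 5 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the Cluster service is assumed to be stopped so that CLUSDB is not in use, the checkpoint file is passed in by its full path (its actual Chkxxx.tmp name varies), and the copied file is assumed to take the name CLUSDB on the problem node. The restart between the two steps is not automated here.

```python
import shutil
from pathlib import Path


def set_aside_clusdb(cluster_dir: Path) -> Path:
    """Step 3: rename the corrupted CLUSDB to clusdb.old and return the new path.

    Assumption: the Cluster service is stopped, so the file is not locked.
    """
    clusdb = cluster_dir / "CLUSDB"
    backup = cluster_dir / "clusdb.old"
    if backup.exists():
        # Never overwrite an earlier backup silently.
        raise FileExistsError(f"{backup} already exists")
    clusdb.rename(backup)
    return backup


def install_checkpoint(checkpoint_file: Path, cluster_dir: Path) -> Path:
    """Step 5: copy the checkpoint file into the cluster folder.

    Assumption: the copy is stored under the name CLUSDB.
    """
    if not checkpoint_file.is_file():
        raise FileNotFoundError(checkpoint_file)
    target = cluster_dir / "CLUSDB"
    if target.exists():
        # The old database should have been renamed in step 3.
        raise FileExistsError(f"{target} still exists; rename it first")
    shutil.copy2(checkpoint_file, target)
    return target
```

Both functions refuse to overwrite an existing file, so the corrupted database stays available as clusdb.old if the copied file does not help.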

Note: The Cluster service can fail to start for many reasons. Check cluster.log, which is available on each cluster node, to determine the exact cause.
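To find the relevant entries in a long cluster.log quickly, a small script can list only the ERR lines. The sketch below assumes the line layout seen in the excerpt above (process.thread IDs, timestamp, level, optional [component], message); the names LogEntry, find_errors, and win32_status are hypothetical, and the pattern may need adjusting for other log versions.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Layout assumed from the excerpt in this article, for example:
# 00000ed0.00000f4c::2011/05/13-13:24:21.793 ERR  [DM] Failed to open key Resources, status 2
LINE_RE = re.compile(
    r"^[0-9a-fA-F]{8}\.[0-9a-fA-F]{8}::"
    r"(?P<timestamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?:\[(?P<component>[^\]]+)\]\s*)?"
    r"(?P<message>.*)$"
)

# Matches the two ways the excerpt reports a numeric code.
STATUS_RE = re.compile(r"(?:status|error code was) (\d+)")


class LogEntry(NamedTuple):
    timestamp: str
    component: str
    message: str


def find_errors(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield one LogEntry for every line whose level is ERR."""
    for line in lines:
        match = LINE_RE.match(line.strip().lstrip('"'))
        if match and match.group("level") == "ERR":
            yield LogEntry(
                match.group("timestamp"),
                match.group("component") or "",
                match.group("message"),
            )


def win32_status(entry: LogEntry) -> Optional[int]:
    """Return the numeric status in the message, or None if there is none."""
    found = STATUS_RE.search(entry.message)
    return int(found.group(1)) if found else None
```

Passing an open file object for cluster.log to find_errors yields the ERR entries in order. For the excerpt above, the last two entries both carry code 2, which is the Win32 code ERROR_FILE_NOT_FOUND.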

Reference: http://support.microsoft.com/kb/217157