FIM 2010 R2 - FIMService suddenly stops working & won't start

    Question

  • Time to hit the forums... maybe you guys can help

    I've set up FIM 2010 R2 in the following way:

    • FIM Service + FIM Portal + SSPR Portals on 2 physical servers configured in NLB.
    • FIM Sync + SQL 2008 hosting all DBs on a SQL cluster.

    The services were configured and set up correctly. All identities were imported into the portal, and UAT started on Password Self Service. Everything had been working 100% for the last 3 weeks. This morning I tried registering for password reset using a test user and got "service unavailable" in the portal. I went to check that the FIM Service was running, and sure enough, on NLB node 1 the service was running 100%, but on NLB node 2 the FIM Service was not running. Trying to start the service resulted in the normal Windows error 1053, "the service did not respond in a timely fashion", with these Event Viewer errors:

    1. Event ID 7009

    A timeout was reached (30000 milliseconds) while waiting for the Forefront Identity Manager Service service to connect

    2. Event ID 7000

    The Forefront Identity Manager Service service failed to start due to the following error:

    The service did not respond to the start or control request in a timely fashion.

    I've restarted the server several times. I've also enabled advanced logging in the "Microsoft.ResourceManagement.Service.exe.config" file and tried to start the service, hoping for some sort of error in the log files... But nothing: nothing in the Event Viewer and no log files generated.

    My question: if node 1 starts 100% with the same accounts as node 2, but node 2 doesn't want to start, where do I start looking, and how do I enable some sort of logging so I can at least try to troubleshoot this problem?

     

    NOTE: I had this problem in the past, and the only way to resolve it then was to re-install the FIM Service... This time, I would like to try to figure out why this is happening...

    Any ideas welcome...

     

    Thx

    Q

    Monday, July 23, 2012 11:19 AM

Answers

  • I found the solution in another thread; for me it was related to our portal servers being behind an F5 load balancer. In my case, I didn't make the ServicesPipeTimeout registry change suggested in this thread.

    Essentially, one virtual server was balancing FIM Service port 5726, and another virtual server was balancing FIM Service port 5725. Each virtual server had the same pool members (our two FIM Service servers).

    Everything works great if you hit the same server when you get your OTP (which hits one port) and when you enter your OTP to change your password (which apparently hits the other port). But it fails if you don't get the same server each time. For me, setting up a persistence profile on the F5 to have "stickiness" across services worked.
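
    To make the failure mode concrete, here is a minimal toy simulation of the setup described above. It is only a sketch: the node names, client IPs, round-robin scheduling, and the shared dictionary standing in for a persistence table are all assumptions for illustration, and it does not use or represent any real F5 API. Which port serves which SSPR step is not modeled.

```python
from itertools import cycle

# Hypothetical pool member names (the two FIM Service servers).
POOL = ["fim-node-1", "fim-node-2"]


class VirtualServer:
    """Toy round-robin balancer for one port.

    If a persistence table (a dict shared between virtual servers) is given,
    a client that was already routed once is always sent to the same node.
    """

    def __init__(self, port, persistence=None):
        self.port = port
        self._next_node = cycle(POOL)   # each virtual server has its own counter
        self.persistence = persistence

    def route(self, client_ip):
        if self.persistence is not None and client_ip in self.persistence:
            return self.persistence[client_ip]
        node = next(self._next_node)
        if self.persistence is not None:
            self.persistence[client_ip] = node
        return node


def simulate(persistence):
    """Two clients each make one request per port; return the nodes they hit."""
    vs_first = VirtualServer(5726, persistence)
    vs_second = VirtualServer(5725, persistence)
    a1 = vs_first.route("10.0.0.1")    # client A, first step
    b1 = vs_first.route("10.0.0.2")    # client B, first step
    b2 = vs_second.route("10.0.0.2")   # client B finishes before client A
    a2 = vs_second.route("10.0.0.1")   # client A, second step
    return {"10.0.0.1": (a1, a2), "10.0.0.2": (b1, b2)}


without_persistence = simulate(None)
with_persistence = simulate({})

print("no persistence:    ", without_persistence)
print("shared persistence:", with_persistence)
```

    In this toy model, interleaved clients land on different nodes for their two steps when the virtual servers balance independently, and on the same node once both virtual servers consult one shared table.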

    Check out this thread: http://social.technet.microsoft.com/Forums/pl-PL/ilm2/thread/4f6db34c-2c90-4336-a55a-f174e46f1678

    Matt


    Thursday, October 11, 2012 5:29 PM

All replies

  • Have you added any custom activities to the portal?  I had a similar problem where the FIM Service crashed all the time and it turned out there was an error in a custom activity that caused the problem.  Once I resolved that it worked like a charm.  Good luck and let us know how it goes.
    Monday, July 23, 2012 2:55 PM
  • Hi, unfortunately there are no custom activities yet... only standard SSPR activities and workflows. I managed to fix the problem by reinstalling the FIM Service, like I did before... but I can't keep doing this all the time... at some point this service should go into production... If only I could enable some sort of advanced logging for service start-up...
    Monday, July 23, 2012 6:09 PM
  • If you don't see anything on the FIM side, you might want to check SQL. Perhaps something is off over there... Although one would think you would get some SQL-related events on the FIM box. But it's just a shot.

    http://setspn.blogspot.com

    Monday, July 23, 2012 7:56 PM
  • I ran into the same issue in a customer lab today. The steps below resolved it for me after the server was restarted.  Still would be interested to know what the root cause was as it was working fine before.

    • Start Registry Editor (Regedit.exe).
    • To change the value data for the ServicesPipeTimeout DWORD value to 60000 in the Control key, follow these steps:
      1. Locate and then click the following registry key:
        HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet
      2. Click the Control subkey
      3. Right-click the ServicesPipeTimeout DWORD value, and then click Modify.
      4. Click Decimal.
      5. Type 60000, and then click OK.
    • If the ServicesPipeTimeout value is not available, add the new DWORD value, and then set its value data to 60000 in the Control key. To do so, follow these steps:
      1. Locate and then click the following registry key:
        HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet
      2. Click the Control subkey.
      3. On the Edit menu, point to New, and then click DWORD Value.
      4. Type ServicesPipeTimeout, and then press ENTER.
      5. Right-click the ServicesPipeTimeout DWORD value, and then click Modify.
      6. Click Decimal.
      7. Type a value of 60000, and then click OK.

      The value is 60000 milliseconds, which is equivalent to 60 seconds or one minute.
      Note: This change does not take effect until the computer is restarted.
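
    For anyone who prefers to script the steps above rather than click through Regedit, here is a minimal sketch that only builds the text of a .reg file for the same value. The function name `build_reg_file` is made up for this example; the key path and value name come from the steps above. The text would still need to be saved to a .reg file and imported by an administrator, and the restart is still required.

```python
def build_reg_file(timeout_ms: int = 60000) -> str:
    """Return .reg file text that sets ServicesPipeTimeout (DWORD, milliseconds)."""
    if not 0 <= timeout_ms <= 0xFFFFFFFF:
        raise ValueError("timeout must fit in a 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control]",
        # .reg files store DWORD values as 8 hexadecimal digits.
        f'"ServicesPipeTimeout"=dword:{timeout_ms:08x}',
        "",
    ]
    return "\r\n".join(lines)


reg_text = build_reg_file(60000)
print(reg_text)
```

    Note that 60000 decimal is written as ea60 in hexadecimal, which is why the Regedit steps ask you to click Decimal before typing the value.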
    • Proposed as answer by Gaston Thomasi Friday, September 21, 2012 10:19 AM
    Friday, August 03, 2012 3:42 PM
  • This solution was the silver-bullet fix for a client deployment of FIM 2010 R2, where the FIMService install was hanging while starting services. The key did NOT exist; once it was added and the server rebooted, we were good to go.
    • Proposed as answer by MichaelS_GER Tuesday, March 05, 2013 2:18 PM
    Tuesday, September 11, 2012 9:56 PM