File locks persisting after shutdown/restart (Causes serious roaming profile / redirected folder problems)

    Question

  • In testing Windows 10 (enterprise, 10240) with roaming profiles, we kept running into reliability issues - we'd sometimes get a temporary profile on login, or sometimes get lengthy hangs in the login process prior to any pre-shell login scripts. After a bit of work trying to narrow this down, the problem appears to be a more general issue with file locks that roaming profiles and redirected folders are particularly sensitive to.

    In short, if a user logs off Windows 10, all is well.

    If, on the other hand, they choose shutdown or restart, their connections to servers etc are not closed properly. This includes file locks, which are then left stuck until the server end times out the connection.

    For a user with a roaming profile, this can leave the ntuser.dat and ntuser.ini files locked on the server - this is what leads to getting a temporary profile on next login. Event viewer reports that it can't use the roaming copy of the profile because a file is in use.

    For a user with redirected folders (such as desktop, my documents, etc), the desktop.ini files in each of these locations are left open. This causes the lengthy delays for that user on subsequent logins as these time out in turn.

    As these locks do time out eventually (15 mins in our environment), the appearance is of intermittent problems, but having narrowed down what's going on, there does not appear to be an intermittent element to it. So far, we are able to reproduce this behaviour consistently on Windows 10.

    Where things get really weird is with Windows 10 fast startup enabled (i.e. the default). With this, if a roaming user is logged on, and shuts the machine down, the files etc remain locked while the machine is down as described above. But if the machine is powered back on then the connections close as it gets to the login prompt. We do not get this behaviour with a restart, or with a traditional full shutdown (fast startup disabled).

    This looks as though some part of the OS shutdown process is interfering with the user logoff process, such that the user logoff process is not completing successfully and closing its connections. When fast startup is enabled, a shutdown isn't really a shutdown, and powering the machine back on seemingly allows that interrupted process to continue. For a real shutdown or restart, the process isn't able to continue and the files remain locked until the server end times them out. When simply logging off, there is nothing to interfere with the logoff process, and it all just works. At least, this is what it looks like to us, after monitoring the file locks at the server end, while trying various logout/shutdown/restart operations at a range of clients - alternative explanations for the observed behaviour would be most welcome.

    In our case, the server end for the profiles is a netapp filer, but connections to Windows servers also persist until their natural timeout if the user shuts the PC down or restarts, but are promptly closed if the user logs off. Similarly, we can only reproduce this problem on Windows 10. So this looks to us like a Win10 issue rather than anything amiss at the server end.

    Is there a known issue here? How might we diagnose this further? Is there anything we can tweak to ensure that logoff completes and closes all its files and connections before shutdown happens?

    Wednesday, September 30, 2015 2:19 PM

All replies

  • More info on this.

    We can reproduce the problem with the profile on Server 2012 R2. This also makes it easier to packet trace.

    Screenshot: http://pcwww.liv.ac.uk/~mikejs/win10/trace_shutdown.png

    This shows the result of shutting down while logged on as a roaming user. Packet 3538 is the last one sent as the machine shut down.

    On powering it on (and it resuming the kernel from the fast-startup shutdown that's really a kind of hibernation), we get the close file requests, and the files are finally closed - some 70 seconds after the machine was shut down. For a full shutdown (fast startup disabled) or a restart, they never get sent and remain until timed out.

    Even more interesting is a packet trace of what happens on logoff.

    Screenshot: http://pcwww.liv.ac.uk/~mikejs/win10/trace_logoff.png

    Here, there is a 12 second gap between packets 3606 and 3607. The files are not closed until the PC has been sat at the login dialog for 10 seconds or so.

    It looks like the final closure of the files is delayed by something, but by what, we don't know. If shutting down or restarting, that closure doesn't happen (or is postponed to next power on, if using fast startup).

    With a Win7 client, the close file requests happen as part of the normal logoff or shutdown process, with no strange delays.

    • Edited by Mike Sandells Thursday, October 1, 2015 5:49 PM Screenshot links really do work now.
    Thursday, October 1, 2015 3:36 PM
  • More testing.

    The previous traces were with our normal installation image, which has various software in it (Office etc), as well as antivirus and some other things which cause no problem elsewhere but mean it isn't vanilla.

    I've now reproduced the issue with a vanilla installation from the original 10240 enterprise ISO (64bit). The only change, once installation was complete, was to join it to our domain.

    The same symptoms are reproducible:

    • After logoff, the ntuser.dat and ntuser.ini on the server are not closed until 10 seconds or so after the login screen has appeared.
    • After shutdown with fast-startup enabled, the ntuser.dat and ntuser.ini are not closed until the machine is powered back on, or the locks time out at the server.
    • After restart, or shutdown with fast-startup disabled, the ntuser.dat and ntuser.ini are not closed until they time out at the server.

    Trying to log on while ntuser.dat still has a lock on it results in a temporary profile (not unreasonably).

    This is completely reproducible for us - we're unable to make this work (with a Win10 client). Should we be raising this more formally as a support issue? Is anyone not able to reproduce this?

    Friday, October 2, 2015 2:18 PM
  • Hi,

    we have the same issue.

    After some tests, I have found that this issue has to do with SMB3. If the roaming profile is stored on a Server 2008 R2 or Server 2003, it works fine. My temporary workaround: disable SMB2/SMB3 on the Windows 10 client and enable SMB1 on the Server 2012 R2 and Windows 10.

    I hope Microsoft corrects this issue with an update for SMB3 in the near future.


    Tuesday, October 13, 2015 2:27 PM
  • Interesting. I'll do some testing with this and see if we get the same behaviour.

    It's interesting because we observed the original behaviour with the profile stored on a netapp filer which is speaking smb 2.0 (2.1 support is switched off). We then tested whether we could reproduce the problem (file locks persisting at the server after logout) with the profile on a windows server (2012R2) and found we could, and packet tracing revealed a likely reason why (the client isn't sending the packets to close the files until later or not at all). If switching SMB3 off on the client fixes it, then the SMB3 option is affecting SMB2 behaviour.

    Wednesday, October 14, 2015 11:43 AM
  • Have you tested my solution?
    Tuesday, October 20, 2015 6:58 AM
  • I can confirm that turning smb2/3 off at the client is a workaround, but we had raised this more formally with Microsoft in the meantime. Still waiting for a response on that.

    Unfortunately, it isn't possible to turn smb3 off without turning smb2 off as well, which means xp-era smb1 communication which is worse than we have with our win7 service currently. So it's a bit drastic. If we have to, we might use that for student PCs (our students have roaming profiles; our staff do not), but we'd prefer a proper fix.

    Tuesday, October 20, 2015 10:30 AM
  • Thanks for your answer. I am looking forward to the response from Microsoft.
    Tuesday, October 20, 2015 10:39 AM
  • I'm having the exact same issues, but with Windows 7 vmview machines. It started about 6 months ago, and we have had the same setup for about 3 years. The user profiles are on a netapp cifs share and the redirected folders are on server 2012 r2.

    The issue is random, like you stated. The user comes in and logs in just fine, goes to lunch or to another machine, then tries to log back in and boom: they get a temporary profile, and the only way we can correct it is by deleting the profile and redirected folders. Once we do that they're good to go, as they should be, because we just started them over.

    Ever since this started happening I have been seeing a large amount of CIFS: "An oplock break request to station 10.5.0.181 for filer ??? share user_profiles, file \???.V2\NTUSER.DAT has timed out "

    My View admin seems to think this is related to netapp and network issues but from reading your findings here I'm not so sure.

    Like you've already stated, turning off smb2/smb3 is not a solution, and in my case we've been using smb2 for some time now.

    Does anyone know if win7 was recently updated through a patch to SMB3 which could be causing this for my machines?

    Monday, November 16, 2015 5:41 PM
  • An update on this, following some discussion with MS.

    Firstly, this is not as completely reproducible as I had thought from our initial investigations. It's a race condition, and happens (or is much more likely to happen) on faster PCs, especially PCs with an SSD. On older equipment with a traditional hard disk, it's a much rarer problem.

    What seems to be going on is that the files are not being closed immediately in case they need to be accessed again. i.e. SMB 2 and later are leaving things locked in case further access is needed, and relying on a secondary mechanism to free the lock if another client requests access. If the shutdown process is fast, it pulls the rug from under this before the files are properly closed. MS consider this an optimisation (i.e. this is by-design, not a bug), in that it makes shutdown faster. Switching to SMB1 solves the problem because that closes files immediately rather than after a delay.
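
    The race described above can be put as a toy model: the lock is stranded whenever the machine powers off before the deferred close is sent. This is only an illustration of the reasoning, not a description of the SMB client's internals; the function name and the example durations are made up, apart from the roughly 10 second close delay seen in the earlier packet traces.

```python
def lock_left_behind(close_delay_s: float, shutdown_duration_s: float) -> bool:
    """Toy model: True if shutdown finishes before the deferred close is sent.

    close_delay_s       -- how long the client waits before closing the file
    shutdown_duration_s -- how long the machine stays up after logoff starts
    """
    return shutdown_duration_s < close_delay_s


# Hypothetical numbers: a fast SSD machine that is off after 5 seconds
# loses the race against a 10 second deferred close; a slow machine that
# takes 25 seconds to shut down does not.
fast_pc = lock_left_behind(close_delay_s=10.0, shutdown_duration_s=5.0)
slow_pc = lock_left_behind(close_delay_s=10.0, shutdown_duration_s=25.0)
```

    In this model, adding a delay to shutdown simply raises shutdown_duration_s above the close delay.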

    We're asking for a GPO or similar to delay shutdown to allow roaming profile sync to complete, i.e. an option to disable this optimisation in those cases where it causes problems, along the lines of the GPOs to wait for network on boot or to run logon scripts synchronously. Ideally, this would only delay things if the most recent user was a roaming user or similar. Not sure how far we'll get with that.

    If we can't get a built-in option to delay shutdown such that the profile can be closed properly, we're into other workarounds. One option would be a custom windows service that deliberately blocks shutdown for a short while (10-20 seconds) if there's been a roaming user in the current session. Looking into that at the moment.

    Wednesday, December 9, 2015 11:01 AM
  • Thanks for this info. That's a hard thing. SSDs are state of the art. But in my environment there are only traditional hard disks, and some computers are very old (100 Mbit LAN card, Intel Core2Duo or Pentium Dual E2160). I think the problem is the lock mode: write only on Windows 10, write and read on Windows 7.
    Thursday, December 10, 2015 9:06 AM
  • What seems to be going on is that the files are not being closed immediately in case they need to be accessed again. i.e. SMB 2 and later are leaving things locked in case further access is needed, and relying on a secondary mechanism to free the lock if another client requests access. If the shutdown process is fast, it pulls the rug from under this before the files are properly closed. MS consider this an optimisation (i.e. this is by-design, not a bug), in that it makes shutdown faster.

    I am seeing exactly the same issue here. Unfortunately this is quite critical, as it affects all users with roaming profiles who reboot their machine. To avoid it they would have to log out manually, wait a while at the logon prompt, and then reboot from the login screen. That is hard to communicate to users; they will just reboot from the start menu.

    Unfortunately, everybody rebooting the machine will then end up with a temporary user profile, as the roaming profile files are still locked on the server side. There is also no resolution for the user other than waiting for the timeout before re-attempting to log on. Moreover, the user does not know when the timeout occurs.

    I personally consider this as a bug, not a feature. Sure it could speed up shutdown if Windows does not care about potentially ongoing locks. But specifically for the roaming profile feature this is a show-stopper. It would also be easy for Microsoft to fix this. The process synchronizing the profile should just set a lock to instruct Windows not to shut down before the locks are released. Or more cleanly the shutdown procedure should just make sure all locks are released. This anyway only happens on shutdown where the user clearly intends to shut down the machine and therefore close all the files. I don't see why this would require another policy or setting to be enabled as Windows can clearly know that the user requested a shutdown and also Windows knows about roaming profiles and could (at least in this case) assure the files are unlocked properly.

    Edit: I also believe this is a bug even in the shutdown/startup case with fast startup enabled. Imagine users shutting down one machine just to quickly log on to another one (i.e. roaming to another machine). If files are still locked, the user cannot log on with the roaming profile on any other machine until the (very long) timeout has passed.

    Sunday, February 28, 2016 3:58 PM
  • I personally consider this as a bug, not a feature.

    As do I. Asking for a policy to disable this behaviour was a fall-back position after Microsoft made clear that they considered this behaviour normal and had no intention of fixing it.

    There are three workarounds that I know of:

    • Ensure that the profiles are stored on a server that supports SMB 2.1. This should allow the client (in the reboot scenario you describe) to ask for its lock back, and get it.
    • Turn SMB 2 (2.0 and 2.1) off entirely, either at the server where the profiles are stored, or at the client. Obviously this will have some side effects, mostly in terms of increased network traffic rather than actual faults though.
    • Have a windows service monitor the machine, and delay the shutdown process by 20 seconds or so if a roaming user has been logged on in the last 30 seconds. It also needs to ensure that fast startup is disabled, either entirely, or just while a roaming user is logged on. This inserts enough of a delay into the shutdown process that the roaming sync has time to complete normally and close its files properly. This is necessary because shutting down with fast startup enabled doesn't send the right kind of event (SERVICE_CONTROL_PRESHUTDOWN) or allow a delay.

    We're probably going to use a combination of these - the service, and looking into updating our filers to allow smb 2.1.
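
    The decision the service has to make can be sketched as below. This is only the timing logic, not a working Windows service: a real one would have to act on SERVICE_CONTROL_PRESHUTDOWN as noted above. The function and constant names are invented for illustration; the 20 and 30 second values are the ones mentioned in the list.

```python
from typing import Optional

# Values taken from the workaround description; tune for your environment.
ROAMING_WINDOW_S = 30.0   # "logged on in the last 30 seconds"
SHUTDOWN_DELAY_S = 20.0   # "delay the shutdown process by 20 seconds or so"


def shutdown_delay(now_s: float, last_roaming_activity_s: Optional[float]) -> float:
    """Return how many seconds shutdown should be held back.

    now_s                   -- current time, in seconds on any monotonic clock
    last_roaming_activity_s -- when a roaming user was last logged on,
                               or None if there has been none
    """
    if last_roaming_activity_s is None:
        return 0.0
    if now_s - last_roaming_activity_s <= ROAMING_WINDOW_S:
        return SHUTDOWN_DELAY_S
    return 0.0
```

    For example, shutdown_delay(100.0, 95.0) asks for the full delay, while shutdown_delay(100.0, None) lets shutdown proceed immediately.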

    Tuesday, March 1, 2016 10:35 AM

    • Ensure that the profiles are stored on a server that supports SMB 2.1. This should allow the client (in the reboot scenario you describe) to ask for its lock back, and get it.

    Though I am not fully sure of that. I am running Samba 4.2.8 on my test rig and experienced this issue too, while I can clearly see Win10 clients using the SMB 3.0 protocol. There might be some issues with regaining locks in Samba environments, though. But I can clearly see that on shutdown Samba keeps ntuser.dat open, and further login attempts by the user fail with a temporary profile being created.
    Tuesday, March 1, 2016 12:12 PM
  • We tried with a Server 2012 R2 instance, and although the locks persist, attempts to access the locked file result in the server trying to contact the original owner of the lock, and clearing the lock if the original owner doesn't reply. (This was with running Wireshark at the server to see what's going on.)

    The timeout on this appears to be fairly short (10 seconds, I think), so allows a client to access the ntuser.dat before the roaming profile sync process gives up on it and creates a temporary profile.

    I was under the impression that this was an SMB 2.1 feature - certainly our netapp filers (running 2.0) don't do it. It may be an optional feature of 2.1, and other implementations may vary...

    We were also primarily testing the case where a user shuts down one machine and promptly tries to log on elsewhere, as this is a common scenario for us - students moving between classes. I don't think we did packet traces of what happens when you reboot and login again at the same PC with an SMB 2.1 back end. It makes a difference as to whether the original owner of the lock is contactable by the server. I'd recommend a quick packet trace from the server end to see what it's doing when the client tries to access the locked file.

    Tuesday, March 1, 2016 12:23 PM
  • We have this with a NAS filer and Windows 7. Using openfiles, I've found that ntuser.dat and ntuser.ini stay open during restart and the user gets a temporary profile.

    I've made a batch file with a 30 second sleep (ping -n 30 127.0.0.1 >nul 2>&1) as a shutdown script. This seems to work.

    • Proposed as answer by BerndSehnde Tuesday, March 1, 2016 6:55 PM
    Tuesday, March 1, 2016 2:28 PM
  • If you're seeing it on Win7, it may be a different problem. We were unable to reproduce this on Win7, and packet traces show the requests to close the files being sent without any delays. Have you done any packet tracing?

    Wednesday, March 2, 2016 11:29 AM
  • Is there any other solution to this problem in the meantime?

    I'm observing the exact same behavior with a Samba 4.3.9 AD and Win 10 Pro clients.
    Tuesday, September 6, 2016 8:25 AM
  • I've tried to apply this as shutdown script via GPO on our Windows 10 test client (+ Windows Server 2012 R2 file server) and it doesn't work. No success even with 60 or 120 seconds waiting time.
    Tuesday, May 23, 2017 11:30 AM
  • In the meantime... We have the same problems with our Windows Server 2012 R2 and Windows 10 Enterprise LTSB clients. The roaming profile is lost every time after a client restart with the user logged on. If the user logs out and logs in again without rebooting the computer, the profile is not lost.
    • Edited by s0lution Tuesday, May 23, 2017 11:36 AM
    Tuesday, May 23, 2017 11:33 AM
  • Has disabling Fast Start-up fixed this or did you try it?

    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Power\HiberbootEnabled

    This can be done using Group Policy Preferences. Setting the value to 0 effectively disables Fast Startup; setting it to 1 would instead enable it.
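
    For a one-off test machine, the same value could also be set from a script instead of Group Policy Preferences. A minimal sketch that only builds the reg.exe command line; the function name is made up, and the REG_DWORD type is an assumption about how the value is stored. Running the command needs an elevated prompt.

```python
import subprocess

# Key and value name as given in the post above.
POWER_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Power"
VALUE_NAME = "HiberbootEnabled"


def fast_startup_reg_args(enabled: bool) -> list[str]:
    """Build the argument list for `reg add` (hypothetical helper)."""
    data = "1" if enabled else "0"
    return ["reg", "add", POWER_KEY,
            "/v", VALUE_NAME,
            "/t", "REG_DWORD",   # assumed value type
            "/d", data,
            "/f"]                # overwrite without prompting


# To apply on a Windows client (elevated), one could run:
#   subprocess.run(fast_startup_reg_args(False), check=True)
```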


    lforbes


    • Edited by lforbes Thursday, June 1, 2017 12:18 AM
    Thursday, June 1, 2017 12:17 AM
  • I've tried disabling Fast Startup, but the problem persists. For testing purposes I recently set up one of our newest Server 2012 R2 servers (Hyper-V VM, only role: secondary DC) to host the roaming profiles of our Windows 10 test users. When using this "file server", the folders in the .v6 profiles aren't locked during a client reboot and everything works fine so far. I'm still trying to find out what the critical difference between the two servers is...

    What I've also found out: the lock on the .v6 profile subfolders on the file server starts at the moment the reboot is initiated on the client and persists for about 2:20 minutes every time I've tested it (about 10 times so far). The reboot of the client takes only about 1:30, so if a user logs on before 2:20 is reached, the profile can't be loaded from the server because it's locked. To estimate the locking duration, I repeatedly tried to cd into one of the profile subfolders on the file server while the client was rebooting.
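
    The manual "keep trying to cd" measurement could be automated roughly as follows. This is a sketch under assumptions: it treats a folder as locked whenever listing it raises an error, which may not match every kind of lock, and the function names and default intervals are invented for illustration.

```python
import os
import time
from typing import Callable, Optional


def is_accessible(path: str) -> bool:
    """Treat the folder as accessible if it can be listed."""
    try:
        os.listdir(path)
        return True
    except OSError:
        return False


def time_until_accessible(
    path: str,
    probe: Callable[[str], bool] = is_accessible,
    interval_s: float = 1.0,
    timeout_s: float = 600.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll `path` until it is accessible; return elapsed seconds or None."""
    start = clock()
    while True:
        if probe(path):
            return clock() - start
        if clock() - start >= timeout_s:
            return None  # gave up: still locked after timeout_s
        sleep(interval_s)
```

    Started on the file server at the moment the client reboot begins, time_until_accessible(r"<path to a profile subfolder>") would return the approximate lock duration in seconds.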
    • Edited by s0lution Wednesday, June 14, 2017 8:22 PM
    Wednesday, June 14, 2017 8:01 PM
  • What worked for me was combining the shutdown script with disabling fast boot. I put 10 pings in the shutdown script and it worked, but when the client has to copy a larger amount of files, this process gets aborted. You'll then end up with tmp files in the directory.

    When I increase the number of pings, I can work around this. But you don't really know how high the ping count should be with larger profiles.

    Friday, June 30, 2017 9:38 AM
  • I was not happy with this workaround and continued the search for something less error-prone. I came across a post by someone who uses the GPO setting "Do not forcefully unload the users registry at user logoff". I just tested this setting and it seems to work.

    With the setting not configured and a reboot, no roaming takes place. With the setting enabled and a reboot, the roaming works.

    Note: This setting requires a reboot to be activated.



    • Edited by JPronk Friday, October 13, 2017 11:48 AM
    Friday, October 13, 2017 11:10 AM
  • Like others here, I've been struggling with this one for a while. I find it affects both W10 and W7 machines, but as an earlier poster stated, it seems worse on faster machines, especially if they have an SSD, which more have nowadays, so they are most likely W10. It also appears more likely to occur on a restart than on a sign out followed by a restart.

    I tried the GPO setting above, but it didn't seem to work on the fast W7 machine I'm testing with, as the ntuser.dat file was still left open after multiple reboots.

    My only current reliable workaround is to run Computer Management on another machine (or log into the user's PC as a different user) and connect to our profile file server (it's a netapp filer as well, so you need an admin login to it). Then from Shared Folders > Open Files you can see if the user still has ntuser.dat open. Either close the individual file or, from the Sessions list, close the user's session(s), which will also close any other files left open by them. I also used Computer Management to monitor the open files list whilst testing.

    Also on a netapp filer use command cifs sessions -c "user" to monitor files.

    Wednesday, November 29, 2017 11:39 AM
  • We have the same problem. Anyone who has a Microsoft case open and could paste the case number as reference to this thread?
    Friday, March 23, 2018 1:00 PM
  • We have the same problem. Anyone who has a Microsoft case open and could paste the case number as reference to this thread?
    Did you open a case with Microsoft?

    For us it seems that the GPO "Do not forcefully unload the users registry at user logoff" is working as a workaround on our VDIs, but on fat clients it causes other issues with the roaming profile. So we are still looking for a fix.
    • Edited by PKlapwijk Tuesday, July 3, 2018 7:15 AM
    Tuesday, June 12, 2018 6:17 AM
  • Same Issue with Windows 2012 r2 terminal server and Win 2012 r2 as file server! All servers use SSD!

    Lock time is between 3 and 5 min after user logoff from terminal!


    Tuesday, July 3, 2018 12:16 PM
  • We recently solved our problem with the persisting file/directory locks mentioned here, which frequently corrupted our Windows 10 roaming profiles. In our case, the "Arcserve Backup Agent for Open Files for Windows" on our W2012R2/W2016 file server was responsible for the persisting file locks after a user logged off from a Windows 10 client. After removing this component, no further file/directory locks on the file server have occurred at user logoffs/client reboots.


    Wednesday, July 4, 2018 12:44 PM
  • We recently solved our problem with the persisting file/directory locks mentioned here, which frequently corrupted our Windows 10 roaming profiles. In our case, the "Arcserve Backup Agent for Open Files for Windows" on our W2012R2/W2016 file server was responsible for the persisting file locks after a user logged off from a Windows 10 client. After removing this component, no further file/directory locks on the file server have occurred at user logoffs/client reboots.


    Did just stopping the service not solve the problem? Has anyone else tested this solution?
    Tuesday, July 10, 2018 1:00 PM
  • Problem is in SMB 3.xx, try workaround:

    HKLM\System\CurrentControlSet\Services\LanmanWorkstation\Parameters\DormantDirectoryTimeout

    Set 1-5 sec.

    Default is 600 sec.

    See https://msdn.microsoft.com/en-us/library/windows/hardware/dn567661(v=vs.85).aspx
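
    A minimal sketch of applying this value from a script, again only building the reg.exe command line. The function name is made up, the REG_DWORD type is an assumption, and the check for a positive number of seconds is this sketch's own choice rather than a documented limit.

```python
import subprocess

# Key and value name as given in the post above.
LANMAN_KEY = r"HKLM\System\CurrentControlSet\Services\LanmanWorkstation\Parameters"
VALUE_NAME = "DormantDirectoryTimeout"


def dormant_directory_timeout_reg_args(seconds: int) -> list[str]:
    """Build the argument list for `reg add` (hypothetical helper)."""
    if seconds < 1:
        raise ValueError("seconds must be a positive integer")
    return ["reg", "add", LANMAN_KEY,
            "/v", VALUE_NAME,
            "/t", "REG_DWORD",   # assumed value type
            "/d", str(seconds),
            "/f"]                # overwrite without prompting


# To apply on a Windows client (elevated), one could run:
#   subprocess.run(dormant_directory_timeout_reg_args(5), check=True)
```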

    Thursday, July 26, 2018 10:01 AM