Slow Windows Server 2016 Updates - some info we found - further discussion

  • General discussion

  • Hello everybody,

    With the December 2018 updates we once again (after migrating to our new hardware) reached the point where every cumulative update failed with a timeout.

    I think there is enough information around to say that this is a very big problem on many systems and has nothing to do with the commonly presented "solutions" for updates not being offered, download trouble or similar. It is a problem in its own right.

    To get those updates installed we tried a lot, but nothing helped apart from disabling AV. The cumulative updates still take very long to install, but at least it works (the downtime is still a shame and a big problem).

    One thing I stumbled across that I think is worth mentioning: first check whether tracing is active and going mad. We had an older machine with this problem, but it was the only one. There is a registry key that should be deleted to revert tracing to the default (HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate\Trace).
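
    Before deleting anything it may be worth looking at what the key actually contains. The following is only a rough sketch in Python, not a tested tool: the function names are made up for illustration, it only reads the key named above, and it returns None when it is not run on Windows.

```python
import sys

# Key path quoted in the post above; the helper names below are illustrative.
TRACE_KEY = r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate\Trace"


def split_registry_path(path):
    """Split 'HKLM\\a\\b' into the hive name and the subkey path."""
    hive, _, subkey = path.partition("\\")
    return hive, subkey


def read_trace_values(path=TRACE_KEY):
    """Return {name: data} for the key's values, {} if the key does not exist,
    or None when not running on Windows. Read-only, nothing is deleted."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    hive, subkey = split_registry_path(path)
    if hive != "HKLM":
        raise ValueError("this sketch only handles HKLM paths")
    try:
        # KEY_WOW64_64KEY avoids registry redirection under 32-bit Python.
        key = winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE,
            subkey,
            0,
            winreg.KEY_READ | winreg.KEY_WOW64_64KEY,
        )
    except FileNotFoundError:
        return {}
    values = {}
    with key:
        value_count = winreg.QueryInfoKey(key)[1]
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            values[name] = data
    return values


if __name__ == "__main__":
    print(read_trace_values())
```

    An empty result would suggest that tracing is already at its default on that machine; which values mean "tracing active" is something I have not verified.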

    Apart from that we dug deeper and looked at Task Manager and Resource Monitor for CPU and file activity while the updates are "prepared" (the point where it failed for us every time) and installed (where we only had problems on a few VMs and it worked on the second try). This was a big eye-opener on several things:

    1.) I'm not sure what the official position is, but from the file activity it looks as if MS simply puts all those smaller updates together into one "zipped" package, which is far from truly merging them. I think the only "merge" that happens is replacing older updates with their successor if there is one. That's why these packages have become so big and are still growing. That's also why they worked earlier on our systems but now run into timeouts.

    2.) So why did disabling AV help? Since the update is basically a zip file with an unknown number of smaller updates inside, they need to be unpacked, checked for consistency and so on. That leads to the following:
    A process called TiWorker.exe unpacks them into its own folder under %windir%\SoftwareDistribution. That causes a massive amount of disk reads and writes (on our side a brand-new DataCore iSCSI storage on 10 Gbit/s fibre channel with SSDs, which basically idles most of the time). That was driving our AV solution mad. Other folders are used as well, but I'm not sure at which point in the process and for what purpose (%windir%\servicing and %windir%\system32\catroot).
    Later (I think once all files are unpacked) the reads and writes drop but do not stop, and TiWorker.exe now generates a huge CPU load. On our system it completely maxed out one vCPU for over 30 minutes (brand-new HP DL380 Gen10 with Xeon Gold 5118 and VMware ESXi 6.5, up to date). I can only guess what happens there, but I think it's some kind of CRC checking plus checking the system for compatibility with all those smaller packages.
    For us it helped to exclude the folders and TiWorker.exe from the AV scan. That speeds up the process enough to avoid the timeout but isn't really great, because TiWorker.exe seems to live in a changing path, so we had to exclude it system-wide, which is a potential security problem. The same goes for the folders, which are of course known to malware authors, and it would be obvious to try to use them to get around AV.
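
    To double-check which files such an exclusion would actually cover, a small helper like the one below can be run against paths seen in Resource Monitor. This is just a sketch: the folder list is the one from this post, C:\Windows as %windir% is an assumption, the function names are made up, and it does not configure any AV product.

```python
from pathlib import PureWindowsPath

# Folders named in the post, relative to %windir%.
UPDATE_FOLDERS = ("SoftwareDistribution", "servicing", r"System32\catroot")
# Process name from the post; its full path changes, so only the name is matched.
UPDATE_PROCESS = "tiworker.exe"


def exclusion_folders(windir=r"C:\Windows"):
    """Build the absolute folder paths; the default windir is an assumption."""
    base = PureWindowsPath(windir)
    return [base / folder for folder in UPDATE_FOLDERS]


def is_excluded(path, windir=r"C:\Windows"):
    """True if path is one of the folders or lies below one of them.
    PureWindowsPath compares case-insensitively, as Windows does."""
    candidate = PureWindowsPath(path)
    return any(
        folder == candidate or folder in candidate.parents
        for folder in exclusion_folders(windir)
    )


def is_update_process(path):
    """True if the executable's file name is TiWorker.exe, wherever it lives."""
    return PureWindowsPath(path).name.lower() == UPDATE_PROCESS


if __name__ == "__main__":
    print(is_excluded(r"C:\Windows\SoftwareDistribution\Download\example.cab"))
    print(is_excluded(r"C:\Windows\System32\drivers\etc\hosts"))
```

    Matching TiWorker.exe by name only is exactly the system-wide exclusion described above, with the same security drawback.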

    3.) Apart from our specific solution, which is basically only a workaround, the problem seems to be the sheer number of files packed together inside the cumulative update and whatever Windows Update is doing with them. This needs so many system resources that, if MS does not change the process, the package will in my opinion sooner or later fail with a timeout on every system. At the very least the problem will probably get worse with every month and every new fix.


    A completely different matter is why it sometimes takes so long to restart the VM (the "windows is prepared" phase). Performance counters on our old platform did not show much during that time, and it hasn't failed there since we migrated to the new hardware.
    Maybe one of you has further information about what happens there and which folders are used (to check, for example, whether excluding them from AV helps), because it still takes plenty of time.

    I also hope this can serve as a "kickoff" to collect all known information about the process that has been dug up so far (helpful information is very much scattered across blogs at the moment). Maybe we can find out more and find more workarounds, since I'm pretty sure that MS won't act on this apart from saying "hey, it works on 2019, update." (which is possibly only because it does not include that many fixes yet).

    Some further information about our systems:
    - ESXi 6.5 on HP DL380 Gen10, completely up to date (software and firmware)
    - HP-specific ESXi image
    - DataCore SAN, also on DL380 Gen10 with SSDs (OS virtual disks on SSD only)
    - Bonded (LACP) 2x 10 Gbit/s for iSCSI
    - CrystalDiskMark 6.0.0 x64:
        Sequential Read (Q= 32,T= 1) :  1094.799 MB/s
       Sequential Write (Q= 32,T= 1) :   526.509 MB/s
       Random Read 4KiB (Q=  8,T= 8) :   430.462 MB/s [ 105093.3 IOPS]
      Random Write 4KiB (Q=  8,T= 8) :   168.259 MB/s [  41078.9 IOPS]
       Random Read 4KiB (Q= 32,T= 1) :   417.054 MB/s [ 101819.8 IOPS]
      Random Write 4KiB (Q= 32,T= 1) :   157.304 MB/s [  38404.3 IOPS]
       Random Read 4KiB (Q=  1,T= 1) :    42.067 MB/s [  10270.3 IOPS]
      Random Write 4KiB (Q=  1,T= 1) :    14.260 MB/s [   3481.4 IOPS]
    - OS : Windows Server 2016 Datacenter (Full installation) [10.0 Build 14393] (x64)
    - WSUS
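
    For anyone comparing their own storage numbers: the MB/s and IOPS columns above are two views of the same measurement. The sketch below shows the conversion, assuming 1 MB = 1,000,000 bytes and 4096-byte blocks (which matches the figures listed). The queue-depth-1 row is probably the most relevant one for this problem, since the update processing mostly handles many small files one after another; that last point is my guess, not something I measured.

```python
def iops_from_throughput(mb_per_s, block_bytes=4096):
    """Convert throughput in MB/s (1 MB = 1,000,000 bytes) to I/O operations per second."""
    return mb_per_s * 1_000_000 / block_bytes


def latency_ms(iops):
    """Average time per operation in milliseconds at queue depth 1."""
    return 1000.0 / iops


if __name__ == "__main__":
    # Random Read 4KiB (Q=8,T=8): 430.462 MB/s
    print(round(iops_from_throughput(430.462), 1))  # 105093.3
    # Random Write 4KiB (Q=1,T=1): 14.260 MB/s
    print(round(iops_from_throughput(14.260), 1))   # 3481.4
    # Roughly 0.29 ms per single 4 KiB write at queue depth 1
    print(round(latency_ms(3481.4), 2))             # 0.29
```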

    I know many of you are very frustrated, but when answering please keep in mind that it won't help much to simply blame MS for not getting this sorted out.
    Please at least give your specs, describe your problem and say whether you found something that helped at least a bit.
    • Edited by DDSYSADMIN Monday, January 14, 2019 9:10 AM
    Monday, January 14, 2019 8:49 AM

All replies

  • Well, I dug up 3 possible "solutions" and want to add them here:

    1.) Downloading and manually installing the cumulative update - didn't help us

    2.) Using sconfig to install updates from an admin PowerShell - seems to do things differently - have to check - no improvement as far as I can see

    3.) Installing updated signatures for Windows Defender - since Defender is deactivated this shouldn't help, but I will try anyway

    Additional: while TiWorker.exe is generating that huge CPU load, the system writes like hell to C:\$Mft (master file table) and C:\$LogFile (NTFS volume log).

    %windir%\winsxs also seems to be involved.

    Just noticed that there are similarities between what happens during update installation and during a disk cleanup with update cleanup. It looks pretty much the same in Resource Monitor. I'm trying to remember whether I have ever seen the amount of data to be cleaned up reach more than a couple of MB, as it often does on Server 2008. Maybe it is now done automatically during update installation... just an idea...

    Forget about that last part. I just checked cbs.log and there you can see in black and white that it has to handle a monstrous number of packages. None of them takes longer than a few seconds, but it now sums up to about an hour (last test with sconfig).
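
    If you want to see where the time goes in your own cbs.log, something like the following might help. It is a rough sketch that assumes every relevant line starts with a timestamp of the form "2019-01-14 10:59:01," (lines without one are skipped); the function names and the sample lines in the test are made up for illustration.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed CBS.log line prefix


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:19], TIMESTAMP_FORMAT)
    except ValueError:
        return None


def summarize(lines, top=3):
    """Return (total seconds between first and last stamped line,
    the `top` longest gaps as (seconds, line that preceded the gap))."""
    stamped = []
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is not None:
            stamped.append((stamp, line.rstrip()))
    if len(stamped) < 2:
        return 0.0, []
    total = (stamped[-1][0] - stamped[0][0]).total_seconds()
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    gaps.sort(reverse=True)
    return total, gaps[:top]


if __name__ == "__main__":
    import sys

    # Usage: python cbs_gaps.py C:\Windows\Logs\CBS\CBS.log
    with open(sys.argv[1], errors="replace") as handle:
        total_seconds, longest = summarize(handle, top=10)
    print("total minutes:", round(total_seconds / 60, 1))
    for seconds, text in longest:
        print(seconds, text[:120])
```

    If the picture is the same as on our systems, the list of longest gaps will be unremarkable and the total will still be huge, which is the point: many short steps, not one slow one.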

    -----------------------------------------------------------------------

    This pretty much looks like a design failure on MS's side, and if there is no way to avoid those cumulative updates I don't see a chance to speed things up much.

    But let's see if someone out there has an idea...



    • Edited by DDSYSADMIN Monday, January 14, 2019 1:54 PM
    Monday, January 14, 2019 10:59 AM
  • Tried another thing.

    Currently, with the 01-2019 update, we need around 1 hour in Windows and another hour for the restart.

    Yesterday I tried running a Windows Update cleanup before installing the update, which took nearly 4 hours to complete (the cleanup wizard said it would free only 1 MB).

    The main point: after that, the update took only 1 hour overall and, best of all, the restart downtime was only a couple of minutes (around 10, I think). I will try to verify that on another VM today.


    • Edited by DDSYSADMIN Tuesday, January 15, 2019 8:52 AM
    Tuesday, January 15, 2019 7:48 AM
  • As expected, updating the Defender signatures didn't change anything.

    The other thing, running a WU cleanup first, could not be confirmed on another VM so far (tried two).

    Maybe one of you can try this.

    So far all I can say is: disable anything and everything that may interfere with or slow down file operations, and push single-thread performance as high as possible, and you may gain some minutes.

    Everything else has to be solved by MS, for example by truly merging the dozens of packages inside the cumulative update, or by splitting it into one package for the previous years and another containing only the packages of the current year.

    I really hope someone else can dig up more, because I don't believe any of that will happen.

    Maybe it's possible to at least store those packages uncompressed on the disk, so that there is no repacking and fewer checks.

    The last thing I'm hoping for a bit is doing the updates with wsus-offline instead, but I don't see why this should be faster.
    • Edited by DDSYSADMIN Tuesday, January 15, 2019 12:18 PM
    Tuesday, January 15, 2019 12:17 PM
  • This month with KB4480961 we have seen a couple of failures on a couple of our servers. To me it seems like an issue with TrustedInstaller perhaps timing out while shutting down. We had to download the patch manually, and then it finally went through on some of these. Are there any hotfixes for these issues on Windows 2016?

    Thanks in advance
    Tuesday, January 15, 2019 12:58 PM
  • I would like to add that the update is slow because of the method used and because system resources are capped.

    First and foremost, this month the CPU cap seems to be gone, but there is a memory maximum of around 300 MB and disk I/O is throttled to 300 KB/s.

    Also, during each upgrade all update files are renamed, and the file structure consists of thousands of files. Worst of all, Windows tries to work with all the files at once; this is the key reason why updates are slow.

    Microsoft needs to speed this up or give companies a GPO setting that allows CPU/disk/memory resource management.

    Microsoft also needs to offer a publicly downloadable Windows 2016 image that is updated with the latest CU, and provide download links for each ISO matching the licensing they are offering.

    Friday, May 31, 2019 11:19 AM
  • For what it's worth:

    I had lots of problems with updates taking even days with Packer / the Windows Update module and noticed that it worked well with an older WSUS server. When we do NOT use the "Download express installation files" setting under "Update Files and Languages", the cumulative updates for Windows 2016 run perfectly.

    Perhaps worth testing ?

    Friday, June 14, 2019 8:26 AM
  • This month's patches were slow again - does anyone have any more updates on this?

    Thursday, July 18, 2019 1:25 PM
  • Excuse me for necroposting and for my bad English. Maybe this will save some time for someone who has to update lots of Windows 2016 installations manually. It is based on this solution: https://windowsserver.uservoice.com/forums/295047-general-feedback/suggestions/32121229-stop-the-windows-update-madness-on-ws2016

    1. Before installing the updates (you may do this even days or weeks before), run this command: %SystemRoot%\System32\Dism.exe /online /Cleanup-Image /StartComponentCleanup This prepares the Windows image by removing unneeded versions of updated binary components and may take from a few minutes, if it was run recently, to some hours the first time. Most importantly, it dramatically shortens the most annoying part: the long reboot with the "Getting Windows ready..." message, when you have no control over the server and the services are down. You don't have to run this command before each and every update, but it seems to be useful at least once every few months. It does not interfere too much with the normal server workload and can be started even during production hours, because it usually does not use more than one CPU core. I found only one downside: you can't install any software or Windows components in parallel; such an installation will wait until the cleanup ends and will most likely fail. As far as I understand, this command should exist and run by default as a scheduled system task, but on server systems something went wrong.
    2. When installing the updates, ignore the "download" and "installing" messages - they are misleading. The actual installation already begins in the download phase; you can see this yourself by watching the processes in Task Manager. Unfortunately, update installation usually doesn't use more than one CPU core, and I was unable to find a solution to this, so adding resources to the server usually doesn't speed up the process. Relax and wait; there is nothing you can do about it.
    3. Upgrade to Windows Server 2019 at the first opportunity. This is the only real solution.

    Documentation: https://docs.microsoft.com/en-us/windows-hardware/manufacture/desktop/clean-up-the-winsxs-folder

    Using the /StartComponentCleanup parameter of Dism.exe on a running version of Windows 10 gives you similar results to running the StartComponentCleanup task in Task Scheduler, except previous versions of updated components will be immediately deleted (without a 30 day grace period) and you will not have a 1-hour timeout limitation.

    Disclaimer: this will not help a lot. It definitely won't help if some of the system files are damaged; in that case use the full command set from the first link:

    "%SystemRoot%\System32\sfc.exe" /scannow
    "%SystemRoot%\System32\Dism.exe" /online /Cleanup-Image /AnalyzeComponentStore
    "%SystemRoot%\System32\Dism.exe" /online /Cleanup-Image /CheckHealth
    "%SystemRoot%\System32\Dism.exe" /online /Cleanup-Image /ScanHealth
    "%SystemRoot%\System32\Dism.exe" /online /Cleanup-Image /RestoreHealth /Source:WIM:<your install.wim location and index>
    "%SystemRoot%\System32\Dism.exe" /online /Cleanup-Image /StartComponentCleanup

    Optional: "%SystemRoot%\System32\Dism.exe" /online /Cleanup-Image /StartComponentCleanup /resetbase
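
    For those who have to do this on many servers, the pre-update cleanup can be wrapped in a small script. The following is only a sketch in Python under a few assumptions: the function names are made up, it uses only the Dism.exe switches listed above, C:\Windows is the fallback if %SystemRoot% is not set, and it has to run from an elevated prompt. On anything other than Windows it just prints the commands.

```python
import os
import subprocess
import sys


def dism_path(system_root=None):
    """Full path to Dism.exe; falls back to C:\\Windows (an assumption)."""
    root = system_root or os.environ.get("SystemRoot", r"C:\Windows")
    return root + r"\System32\Dism.exe"


def cleanup_commands(system_root=None, reset_base=False):
    """Analyze the component store, then run the component cleanup.
    reset_base adds the optional /resetbase switch to the cleanup step."""
    dism = dism_path(system_root)
    cleanup = [dism, "/online", "/Cleanup-Image", "/StartComponentCleanup"]
    if reset_base:
        cleanup.append("/resetbase")
    return [
        [dism, "/online", "/Cleanup-Image", "/AnalyzeComponentStore"],
        cleanup,
    ]


def run_cleanup(reset_base=False):
    """Run the commands one after another; stop at the first failure."""
    for command in cleanup_commands(reset_base=reset_base):
        print("running:", " ".join(command))
        subprocess.run(command, check=True)


if __name__ == "__main__":
    if sys.platform == "win32":
        run_cleanup()
    else:
        for command in cleanup_commands(r"C:\Windows"):
            print(" ".join(command))
```

    Keep in mind that the cleanup can run for hours the first time, so the script has no timeout on purpose.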


    From the donkey's nest ... :)

    Sunday, November 17, 2019 6:07 PM