Black Screen on all new Connections, SessionHost has to be rebooted, Error 1534

    Question

  • hi,

    we're suffering from session hosts that produce black screen errors in an RDS 2016 farm.

    already connected users can mostly work, all new connections end up with a black screen.

    to resolve the error the server has to be restarted.

    i can say that

    - this error appears after error 1534 (Profile notification of event Delete for component {709E2729-F883-441e-A877-ED3CEFC975E6} failed, error code is The system cannot find the file specified.) starts appearing in the event viewer.

    - upon checking the registry for this GUID i end up at "ProfileNotifyHandler Class app id {E10F6C3A-F1AE-4adc-AA9D-2FE65525666E} inprocserver32, C:\Windows\System32\gameux.dll".

    - starting explorer.exe via taskmgr does not open an actual explorer window although the process appears in taskmgr

    - taskmgr, eventvwr, cmd can be started without problems

    - affected users appear as active in RDS management

    - no third party security software is installed

    - farm is fully patched

    - HKLM\Software\Microsoft\Windows NT\CurrentVersion\ProfileGuid and ProfileList are ok (no old or .bak entries)
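
    For anyone who wants to see when event 1534 first shows up on a host and compare that with the time of the first black screen, a minimal sketch follows. It assumes the event is written to the Application log (where the User Profile Service normally logs) and that a 7-day window is enough. The function name and its parameters are made up for illustration.

```powershell
# Hypothetical helper: list recent occurrences of a given event ID
# from the Application log (assumed location of event 1534).
function Get-ProfileNotifyEvent {
    param(
        [int]$Id   = 1534,   # event ID reported in this thread
        [int]$Days = 7       # assumed look-back window
    )
    $filter = @{
        LogName   = 'Application'
        Id        = $Id
        StartTime = (Get-Date).AddDays(-$Days)
    }
    # Get-WinEvent raises an error when nothing matches, so suppress it.
    Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
        Select-Object TimeCreated, Id, ProviderName, Message
}

# Usage (run on the session host):
#   Get-ProfileNotifyEvent | Sort-Object TimeCreated | Select-Object -First 5
```

    Sorting by TimeCreated shows the earliest occurrences in the window, which is the point in time worth comparing against user reports.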

    looking for a way to resolve this without rebooting the server, or for a permanent fix

    thank you

    best regards

    21 June 2017 9:31

All Replies

  • Hi,

    Please run SFC /SCANNOW to see whether there are any corrupted system files.

    In addition, please also check terminal and remote desktop services logs to see whether more clues can be found under:

    Event Viewer -> Applications and Services Logs\ Microsoft\ Windows

    Best Regards,

    Amy


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    22 June 2017 12:04
    Moderator
  • hi,

    i can confirm that sfc /scannow does not resolve this error. furthermore there are no direct hints in any event viewer logs.

    by now i know that: users are able to connect but no profile is loaded/rendered, mstsc /admin can start various programs that do not need rendering (cmd, regedit, eventvwr but not explorer.exe), TermService is frozen, ProfSvc is frozen. it appears some key cannot be found/unloaded.

    this black screen error can only be resolved by rebooting the affected session host. this is a reproducible error affecting the whole rds farm.

    cheers

    05 July 2017 7:41
  • hi,

    is there any update on this recurring error?

    i'm seeing the "black screen upon login" bug at least once a week on our rds 2016 farm.

    currently i'm following up on my suspicion that there's a connection to the installation of "firefox esr".

    cheers

    20 July 2017 7:02
  • Also seeing this issue at least once a week on multiple 2016 servers. Rebooting the server is the only way to correct the situation that I've found. I've explored all of the proposed fixes to no avail. This has been ongoing since March. No fix. Thank you, Microsoft, for making my job this much more difficult. 5 years from now when they finally come up with a fix, I'm sure we'll be strictly Linux based. There are too many of us who are way too fed up with MS to want to continue up this path much longer - widespread issues with no viable corrective action - not good for anyone.
    20 July 2017 16:05
  • i followed up on my suspicion that "firefox esr" might be the cause - i can now confirm that it is not.

    we continue seeing the "black screen upon user login until session host is rebooted" error at least once per week on random occasions and hosts.

    31 July 2017 7:31
  • Hi,

    Sorry for any inconvenience caused.

    Yes, I would suggest you contact Microsoft Customer Support and Services, where a more in-depth investigation can be done so that you get a more satisfying explanation and solution to this issue.

    In addition, if the issue is proven to be a system flaw, the consulting fee should be refunded.

    Best Regards,

    Amy



    31 July 2017 7:52
    Moderator
  • Hello,

    are User Profile Disks in use in your configuration? Do you get Event ID 158, Source: Disk in the 'SYSTEM' Eventlog saying "Disk 1 has the same disk identifiers as one or more disks connected to the system. Go to Microsoft's support website (http://support.microsoft.com) and search for KB2983588 to resolve the issue."?

    Best regards

    Martin

    10 August 2017 15:39
  • hi,

    yes i do. to my knowledge it is a normal error that can be ignored though.

    current status on the black screen bug: incident with microsoft is open and in progress.

    i will post an update here as soon as i get any information back.

    cheers

    17 August 2017 7:16
  • We had the issue on the 8th of August, did all the updates and did not have the issue again until today. It struck today at about 13:00 German time. Logon with whatever privileges is not possible, Ctrl-Shift-Esc has no effect. There are SMB shares on the server which are still available and i can also PSEXEC into a cmd box on the server. I can still see sessions with qwinsta, did not try anything further. Suggestions?

    best regards

    Martin

    23 August 2017 22:37
  • Tried some things with PSEXEC / cmd remotely on the server:

    rwinsta - manually reset "orphaned" sessions, did not work, timed out

    Tried to restart RDP service, it restarted but no change.

    Another thing i would like to check for but have not done yet: Make sure Large Send Offload on the NIC is disabled.
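
    For the Large Send Offload check mentioned above, a small sketch using the NetAdapter module that ships with Server 2012 and later. The function name is made up, and 'Ethernet' in the comment is a placeholder adapter name. Whether LSO has anything to do with the black screens is unconfirmed in this thread.

```powershell
# Hypothetical helper: show the Large Send Offload state of every adapter.
function Get-LsoState {
    Get-NetAdapterLso -Name '*' |
        Select-Object Name, IPv4Enabled, IPv6Enabled
}

# Usage (run on the session host):
#   Get-LsoState
#
# To turn LSO off on one adapter ('Ethernet' is a placeholder name):
#   Disable-NetAdapterLso -Name 'Ethernet'
```

    Changing offload settings can briefly interrupt the adapter, so it is probably best tried outside working hours.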

    23 August 2017 22:40
  • no suggestions. i've spent endless hours looking into it, no clue as of yet. yesterday we had 3 (!) hosts affected, the week before 0 ...

    incident with microsoft was opened 3 weeks ago, i'm still waiting for any feedback/solution.

    24 August 2017 6:54
  • On our installation these are 2 physical RDP servers, even directly on the server (HP iLO console) the only thing we get is a black screen. I noticed that when i remotely ran qwinsta inside the PSEXEC cmd on the host while trying to connect with mstsc, i don't get any response inside the cmd window during the connection attempt. After a while the black screen appears and immediately my cmd window responds again.

    qwinsta shows a lot of disconnected sessions that are somehow "orphaned", session IDs go up to around 280. After the reboot session IDs somewhat corresponded again with the number of users.
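
    To put a number on the disconnected sessions instead of eyeballing the qwinsta output, a rough sketch follows. It assumes the English state label "Disc" (localized systems print a different word) and treats session ID 0 (services) as not user-related. The function name is made up, and the regex is a simplification that would misread an all-digit user name.

```powershell
# Hypothetical parser: turn qwinsta text lines into objects with Id and State.
function ConvertFrom-QwinstaLine {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string[]]$Line
    )
    foreach ($l in $Line) {
        # First whitespace-delimited number followed by a word = session ID + state.
        # The header line has no digits and is skipped automatically.
        if ($l -match '\s(?<id>\d+)\s+(?<state>[A-Za-z]+)') {
            [pscustomobject]@{
                Id    = [int]$Matches['id']
                State = $Matches['state']
            }
        }
    }
}

# Usage (run on the session host):
#   $sessions = ConvertFrom-QwinstaLine -Line (qwinsta)
#   @($sessions | Where-Object { $_.State -eq 'Disc' -and $_.Id -ne 0 }).Count
```

    Logging that count every few minutes would show whether the orphaned sessions pile up before the black screens start or only afterwards.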

    I also found when executing "systeminfo" on the server, no logon server is displayed (n/a). Don't know if this is meaningful. Tried nltest, everything seems fine.

    @Ibis Hotline: Do Smart Card readers play any role in your deployment?

    24 August 2017 8:11
  • no smart card readers, no.

    when the error appears you can still logon as admin and start various applications that don't need gui rendering. that's what it comes down to actually - after the error appears new gui windows cannot be rendered, already opened applications are not affected. you can start cmd, regedit, eventvwr and even procmon. but you cannot start explorer.exe or iexplore.exe or firefox and so on.

    because of the gui thing i doubt it's got anything to do with orphaned sessions. you can check "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList" and "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileGuid" though. Also you can check C:\USERS\ for old User Folders, but that's not it either.

    it's a bug, all we can do is wait or open an incident as i have.

    24 August 2017 8:30
  • We are also seeing this behavior on one of our deployments of RDSH in 2016. If anyone is comfortable sharing, what settings are configured in your RDSH environment group policy?

    computer configuration - administrative templates - windows components - remote desktop services - remote desktop session host - remote session environment

    Also, what is configured in RDS Connections group policy?

    computer configuration - administrative templates - windows components - remote desktop services - remote desktop session host - Connections

    Looking to compare, to see if any similarities exist. Once I have access to the affected servers, I will post what is currently configured as well.

    On a side note, restarting the audio service of all things has at times resolved the problem without requiring a reboot, but it is not a consistent fix.

    28 August 2017 21:14
  • generally speaking, you don't need to set there anything because most of these settings are managed within the rds console anyways.

    on the contrary, if some of these settings are set within gpos the rds console stops working properly. for example, in certain combinations you get errors when trying to add a host or change rds collection settings.

    long story short, leave those settings alone unless you have a REALLY important reason. while i have set dozens of gpos/settings, there's nothing important to compare in the rds components gpos.

    our incident is still open without any useful outcome as of yet.

    11 September 2017 13:07
  • no update as of yet, incident is still being worked on and unresolved.

    we continue seeing black screens after logon :/

    21 September 2017 13:29
  • KB4038801 changelog contains the promising line "Once the deadlock occurs, new logons fail until the logon computer is restarted."
    We'll be testing this.
    02 October 2017 12:40
  • We still have this problem.

    KB4038801 is installed.

    12 October 2017 7:28
  • i can confirm, KB4038801 didn't resolve this problem.

    our open MS incident is still being investigated.

    12 October 2017 7:30
  • supposedly there will be a patch in november!

    see here:

    https://social.technet.microsoft.com/Forums/windowsserver/en-US/02c526f2-b3d1-4ade-97aa-f8610c798308/event-id-4005-the-windows-logon-process-has-unexpectedly-terminated-server-2016?forum=winserverTS

    16 October 2017 9:18
  • We had similar problems with a customer's environment. After about 4-5 hours of user load the server would semi-hang. New users would get the black screen but users that were already connected to the RD host could continue working.

    We opened a case with Microsoft tech support. They did lots of tests on the system, but I managed to find a solution myself. It seems that "User Profile Disk" opened what I believe were too many TCP connections and flooded the system after some time under load.

    We switched to "redirected folders" instead, and ever since the server has not crashed once.

    24 October 2017 11:29
  • if the problem is still relevant
    stop the AppReadiness service
    08 November 2017 8:10
  • ok, why? does it resolve the issue? do you have any more background info?

    according to this article (https://docs.microsoft.com/en-us/windows-server/security/windows-services/security-guidelines-for-disabling-system-services-in-windows-server) disabling the service is not recommended.

    09 November 2017 9:08
  • sadly, November 14, 2017—KB4048953 doesn't contain any "winlogon" development :/

    also, our incident with microsoft is still being investigated

    15 November 2017 10:18
  • additionally i tried stopping the appreadiness service on affected machines, the error is happening nonetheless :/
    22 November 2017 10:05
  • finally, a hint: killing the audiodg.exe on the affected machine resolves the issue instantly.
    10 January 2018 8:36
  • Does the silence on this thread mean people have fixed this issue by any chance?

    I've got two Windows Server 2016 RDS hosts with this issue.  Both running under Hyper-V on a physical Windows Server 2016.  Hardware is an HP ProLiant ML350 Gen9.


    I've tried uninstalling Symantec Endpoint Protection (just leaving Windows Defender) which did improve things but it isn't fixed.


    I am using User Profile Disks, I do get 158 events logged about disk signatures.  If I move all users off the affected server (deny new logons, log them off and wait until they log on again) I have found the problem tends to affect the 'good' server before too long.  After a reboot of the virtual RDS hosts everything is fine for some time - it's not that one user logs on and their UPD kills it every time.  UPDs are stored on the physical server and accessed by the RDS hosts from a share.  Only Windows Defender on the physical server at the moment too.  I seem to have one VHDX that was left locked by the affected terminal server.  That is the largest at 3.03GB but I'm not sure if this is just a coincidence.


    On an affected server I just tried restarting the Windows Audio service, the affected server sprang back to life.  Physical server has no sound card.  I didn't check whether audiodg.exe was running unfortunately.  It isn't currently running on either server.


    I do have a number of event 7000 "The Diagnostic System Host services failed to start due to the following error: The services did not respond to the start or control request in a timely fashion" and the same for "Portable Device Enumerator Service".


    Hope this is of help to people.  Any hints, fixes, workarounds etc very welcome!


    Many thanks,

    Gareth

    20 March 2018 18:39
  • Working on Ibis' suggestion about the 'Windows Audio' service and AudioDG.exe hanging, I'm trying this as a scheduled task at 06:30 every morning to see if pre-emptively restarting it prevents the problem.  I'll let you know if there's any results but if successful there won't be any problems for a while.

    Gareth

    REM Terminal Server instability seems to be caused by AudioDG.exe hanging and the Windows Audio
    REM  service. Restarting these seems to bring it back up. So hopefully, pre-emptively restarting
    REM  them at intervals will prevent it.
    
    net stop Audiosrv
    taskkill /F /IM AudioDG.exe
    net start Audiosrv
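
    One way to register the batch file above as the 06:30 daily task is sketched below with the ScheduledTasks cmdlets. The script path and task name are placeholders, and running the task as SYSTEM is an assumption (it needs rights to stop services). Nothing here comes from the original post beyond the 06:30 schedule.

```powershell
# Hypothetical helper: register a daily task that runs the audio restart batch file.
function Register-AudioRestartTask {
    param(
        [string]$ScriptPath = 'C:\Scripts\restart-audio.cmd',  # placeholder path
        [string]$TaskName   = 'Restart Windows Audio',          # placeholder name
        [string]$At         = '06:30'                           # time from the post
    )
    $action  = New-ScheduledTaskAction -Execute 'cmd.exe' -Argument "/c `"$ScriptPath`""
    $trigger = New-ScheduledTaskTrigger -Daily -At $At
    # Assumption: run as SYSTEM with highest privileges so the service can be stopped.
    Register-ScheduledTask -TaskName $TaskName -Action $action -Trigger $trigger `
        -User 'SYSTEM' -RunLevel Highest
}

# Usage (elevated PowerShell on the session host):
#   Register-AudioRestartTask
```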
    

    27 March 2018 8:26
  • I did something similar, but don't want to affect logged on users. I have the PowerShell script below run every hour to kill the AudioDG process when there are no users on the affected server. This works for this specific situation and has prevented issues that were happening on nearly a daily basis. 

    Really wish MS would fix the issue though, as this prevents us from moving forward with 2016 servers. 

    #Ignore Members of AD Group: ADMINS
    $Group = Get-ADGroupMember -Identity ADMINS
    
    #Find processes with Logged in Domain Users, but ignore specific users (admins and others, like service accounts that are always logged in). 
    $UC = (Get-Process -IncludeUserName | Where {($_.UserName -like "domain*") -and ($_.UserName -notlike "*somename*") -and ($_.UserName -notlike "*someothername*") -and ($Group.SamAccountName -notcontains (($_.UserName).ToString().Split("\"))[1])}).Count
    $AC = Get-Process | Where {$_.ProcessName -like "audiodg*"}
    
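    #Kill the AudioDG process if no active users are logged on.
    #(Piped in, because the first positional parameter of Stop-Process is -Id and does not accept a process object.)
    If(!($UC) -and ($AC)) {
        $AC | Stop-Process -Force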
    }

    27 March 2018 16:16
  • Posted this in the other thread about start menu not working also, but this also applied to this issue, so posting here too.

    We have been running a production 2016 RDS with Multiple Session hosts for a year or so now, and came across this and the black screen issue every month consistently.   Our environment consists of 1 x WebAccess, Gateway, Licensing, Broker and 4 x Session Hosts.  1 Collection with UPD.

    Our only mitigation against this happening was to rebuild all the session hosts before they had been in production for one month.  So each of the 4 Session hosts would be replaced with fresh templates, one per week over the month.   Performance progressively got worse as the month went on, until at around 1 month of a host being online it would exhibit the black screens and/or the start menu not working.  Initially when these started happening a reboot would fix it, but eventually (within days) it would black screen and you couldn't log on at all (even at the console).

    After much time spent trying to figure out what was causing the issue, we finally came across a thread here about firewall rules getting created every single time a new user logged onto one of the session hosts.  As this RDS environment is used fairly heavily daily, we would end up with over 7000 new firewall rules getting added to each session host on a daily basis.  Each new firewall rule that gets added when a user logs on did not get cleaned up at logoff.

    We've since started running a Scheduled Task daily on each session host which deletes all these Firewall Rules, and since then, we've not had a single server requiring a reboot, performance has remained the same as a freshly deployed session host, logins don't slow down, no more black screens, no more start menu not working.

    We do regularly patch our servers, so I can't say for certain that this is the fix/workaround for these issues.  But it certainly has worked for us.   Still hoping MS get a fix out for this some time soon.

    I don't take credit for the script and I can't remember who posted it, but my version has a few slight changes in it.  Doesn't need a lot of the write-host and write-progress stuff in it if you run this as a scheduled task, but it does look a little prettier when you manually run it in powershell.

    $profiles = get-wmiobject -class win32_userprofile
    cls
    Write-Host "`n`n`n`n`n`n`n`n"

    Write-Host "Getting Firewall Rules..."
    # deleting rules with no owner would be disastrous
    $Rules1 = Get-NetFirewallRule -All | Where-Object { $profiles.sid -notcontains $_.owner -and $_.owner }
    $Rules1Count = $Rules1.count
    Write-Host "" $Rules1Count "Rules`n"

    Write-Host "Getting Firewall Rules from ConfigurableServiceStore Store..."
    $Rules2 = Get-NetFirewallRule -All -PolicyStore ConfigurableServiceStore | Where-Object { $profiles.sid -notcontains $_.owner -and $_.owner }
    $Rules2Count = $Rules2.count
    Write-Host "" $Rules2Count "Rules`n"

    $Total = $Rules1.count + $Rules2.count
    Write-Host "Deleting" $Total "Firewall Rules:" -ForegroundColor Green

    $Result = measure-command {
        $start = (Get-Date)
        $i = 0.0

        foreach($rule1 in $Rules1) {
            # action
            remove-itemproperty -path "HKLM:\System\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\FirewallRules" -name $rule1.name

            # progress
            $i = $i + 1.0
            $prct = $i / $total * 100.0
            $elapsed = (Get-Date) - $start
            $totaltime = ($elapsed.TotalSeconds) / ($prct / 100.0)
            $remain = $totaltime - $elapsed.TotalSeconds
            $eta = (Get-Date).AddSeconds($remain)

            # display
            $prctnice = [math]::round($prct,2)
            $elapsednice = $([string]::Format("{0:d2}:{1:d2}:{2:d2}", $elapsed.hours, $elapsed.minutes, $elapsed.seconds))
            $speed = $i/$elapsed.totalminutes
            $speednice = [math]::round($speed,2)
            Write-Progress -Activity "Deleting Rules1 ETA $eta elapsed $elapsednice loops/min $speednice" -Status "$prctnice" -PercentComplete $prct -secondsremaining $remain
        }

        foreach($rule2 in $Rules2) {
            # action
            remove-itemproperty -path "HKLM:\System\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\RestrictedServices\Configurable\System" -name $rule2.name

            # progress ($elapsed was misspelled as $elapse here, which froze the timer in this loop)
            $i = $i + 1.0
            $prct = $i / $total * 100.0
            $elapsed = (Get-Date) - $start
            $totaltime = ($elapsed.TotalSeconds) / ($prct / 100.0)
            $remain = $totaltime - $elapsed.TotalSeconds
            $eta = (Get-Date).AddSeconds($remain)

            # display
            $prctnice = [math]::round($prct,2)
            $elapsednice = $([string]::Format("{0:d2}:{1:d2}:{2:d2}", $elapsed.hours, $elapsed.minutes, $elapsed.seconds))
            $speed = $i/$elapsed.totalminutes
            $speednice = [math]::round($speed,2)
            Write-Progress -Activity "Deleting Rules2 ETA $eta elapsed $elapsednice loops/min $speednice" -Status "$prctnice" -PercentComplete $prct -secondsremaining $remain
        }
    }

    $end = get-date
    write-host end $end
    write-host eta $eta
    write-host $result.minutes min $result.seconds sec

    NOTE:  There is a powershell cmdlet Remove-NetFirewallRule, though it is VERY slow, so I'm using the remove-itemproperty command to delete the registry values as it works a lot faster.   And considering it is clearing 7000+ rules daily for me, it would take ages to run otherwise.

    I'm interested to hear if this actually works for other people with these issues or it is something else entirely causing it.

    Cheers





    • Edited by LapuLapu 28 March 2018 0:31 typo
    • Proposed as answer by swi_als 28 March 2018 19:48
    27 March 2018 23:24
  • omg \o/

    lapulapu if this works out i cannot say how incredibly thankful i am. this issue has been killing me for the past year, i'll test your script asap in our environment!

    (very first result: your script found 32k rules, deleting them takes more than an hour on one of our hosts ... this is so exciting! thank you sooo much!)

    28 March 2018 14:33
  • I just tried this script out in a brand new RDS farm (had to reinstall it because of the same issues) and already 700 rules have been deleted on every host. I will keep you updated.

    "funny" side fact: Not even one end user has ever logged on to these hosts until now, only a few admin users that installed software were logged on.

    28 March 2018 14:45
  • Once you've added a session host to the RDS collection, and start logging on as any user, whether remote, console, admin, etc they all start creating these rules.

    In our case, every single logon created 11 rules in each of the firewall policies.  I believe this is related to all the contents of the C:\Windows\SystemApps folder as to how many additional rules get created for your environment.  I haven't bothered looking at removing some of this stuff (who needs xbox/Cortana stuff on an RDS right?).

    So as you can imagine, with a heavily used server, these additional firewall rules grow very quickly.


    • Edited by LapuLapu 28 March 2018 21:33 typo
    28 March 2018 21:18
  • First time I ran this on our environment it had to delete over 192k rules on each server which took ages.  Hence why we run it daily now to keep script runtime to a minimum.
    28 March 2018 21:39
  • Just found the thread I referred to where I found the original firewall cleanup script.

    To credit the source of my script (user JS2010): have a look at this thread, which also has some good reading on these firewall rule issues and a few other good examples of methods for cleaning up the rules.

    new firewall rules created for each user

    28 March 2018 21:45
  • This is looking very promising - thank you so much! On some of our hosts the Firewall Rules in the ConfigurableServiceStore must be so large its causing this error: Get-NetFirewallRule : Not enough storage is available to complete this operation.

    If I try to open the reg key directly at "HKLM:\System\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\RestrictedServices\Configurable\System" regedit times out with "Not responding". Has anyone else who has run the script found a way around this problem? I assume we can't just import this registry key from a purged and working host?
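
    When Get-NetFirewallRule and regedit both choke on the key, it may still be possible to at least read how many values it holds without enumerating them. The sketch below asks the .NET registry API for the value count only. The function name is made up, and whether this stays responsive on a key with hundreds of thousands of values is an assumption, not something reported in this thread.

```powershell
# Hypothetical helper: return the number of values under an HKLM subkey
# without listing the values themselves. Returns $null if the key is missing.
function Get-RegistryValueCount {
    param(
        [Parameter(Mandatory)]
        [string]$SubKey   # path below HKEY_LOCAL_MACHINE, without the hive prefix
    )
    $key = [Microsoft.Win32.Registry]::LocalMachine.OpenSubKey($SubKey)
    if ($null -eq $key) { return $null }
    try     { $key.ValueCount }
    finally { $key.Close() }
}

# Usage (run on the session host):
#   $base = 'SYSTEM\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy'
#   Get-RegistryValueCount "$base\FirewallRules"
#   Get-RegistryValueCount "$base\RestrictedServices\Configurable\System"
```

    Comparing the two counts between an affected host and a freshly cleaned one gives a quick idea of how far the rule set has grown.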

    29 March 2018 9:48
  • we get the timeout at the second key as well

    01 April 2018 17:09
  • In the end I deleted the whole “Configurable” reg key, reset the firewall to default settings, and then forced a group policy update. Rebooted server and everything seems fine, no errors in the event log related to the firewall and black screen down to 1-2 seconds.
    01 April 2018 18:21
  • cool!

    i deleted the second "configurable" block and reimported it from a different machine, seems to work as well, after running the script now shows 0 keys in both blocks without errors.

    tonight i'll start running the script on a couple of additional machines and report back soon/next week.

    01 April 2018 18:41
  • Just wondering, has anyone implemented the suggestions to disable the Per-User services in Win2016 as described here:  https://docs.microsoft.com/en-us/windows/application-management/per-user-services-in-windows

    If so, does this stop the excessive Firewall rules being created?

    02 April 2018 23:45
  • I've run the script across our 3 RDS Session Hosts where this Black Screen issue occurs.  11,000+ dud records were removed.  Unfortunately, within 12 hours they all had the issue occur again.  

    I'm now going to work through the weblink provided in my previous post to see if disabling some of those unnecessary services helps things come good.  

    05 April 2018 4:51
  • There is another cause of Black Screens which I also forgot to mention that I've come across.  Not sure if anyone else has come across this specifically.

    If a user has a completely full UPD when they log into a session host, that session host craps out, and all subsequent users that log onto that server will get the black screen for quite some time before they are presented with the desktop (if at all).

    This doesn't affect any existing users that are already logged on to that SH, only new logins.   A reboot will correct this issue.   But we do have to either resize that culprit user's UPD or clear out some space in it, or they'll break another session host they get onto.

    We disallow new logins to that session host until all existing users logout before we can reboot it too.

    I've also made a powershell script that we also run nightly that does maintenance on all UPD's.  ie:  clears a lot of garbage temp, cache stuff created by some of our apps, defrags, compacts, shrinks, resizes, extends, etc.

    Happy to share that if it'll help anyone.

    Users that do max out their profiles within the day can still cause these black screens, though.

    06 April 2018 3:29
  • We're not using UPD, just regular Roaming Profiles and Folder Redirection (all applied via GPO).  We did have a ticket open with Microsoft and thought we had it solved by adding the SpecialRoamingOverrideAllowed registry key, as per the blog post at https://blogs.technet.microsoft.com/askpfeplat/2017/05/08/mysteriously-disappearing-start-menu-tiles-and-roaming-user-profiles/.  

    Unfortunately the cat came back a few days after closing the ticket so we looked at moving to UPD's, but because many of our users make use of multiple logins with the same account, it isn't supported (at least that's what I can gather).  

    I've reviewed event logs and there isn't any event/error/warning that seems to correlate to when this black screen issue occurs.  We do find that after 5 minutes, the black screen disappears and the desktop loads normally.  During that 5 minutes, we're able to press Ctrl+Alt+End and open task manager.  Various apps can be opened from there, except Explorer.  

    I've found today that if I Stop and Disable the App Readiness service, the problem goes away for new logins.  Simply restarting the service doesn't seem to resolve it.  After reverting the service back to manual start-up (default), the problem stays away, although not sure yet for how long.  
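
    The stop-and-disable step described above could look roughly like this. AppReadiness is the service name used earlier in this thread and Manual is the default start-up type mentioned above. The function names are made up. Note that the Microsoft guidance linked elsewhere in this thread advises against disabling the service, and other posters report that the workaround did not help them, so treat this as an experiment rather than a fix.

```powershell
# Hypothetical helper: stop App Readiness and prevent it from starting again.
function Disable-AppReadiness {
    [CmdletBinding(SupportsShouldProcess)]
    param()
    if ($PSCmdlet.ShouldProcess('AppReadiness', 'Stop and disable service')) {
        Stop-Service -Name AppReadiness -Force
        Set-Service  -Name AppReadiness -StartupType Disabled
    }
}

# Hypothetical helper: put the service back to its default start-up type.
function Restore-AppReadiness {
    Set-Service -Name AppReadiness -StartupType Manual
}

# Usage (elevated PowerShell on the affected host):
#   Disable-AppReadiness -WhatIf    # preview only
#   Disable-AppReadiness
#   Restore-AppReadiness
```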

    We also don't seem to have an AudioDG process running in our environment, so can't comment on that one as an option.  

    A further description of our symptoms is here:  https://social.technet.microsoft.com/Forums/windowsserver/en-US/2838a741-d93b-4f5c-bbc2-e55980224e55/remote-desktop-services-2016-connection-hangs-at-black-screen-for-5-minutes-before-displaying?forum=winserverTS

    We also experience this issue at the VM Console when logging in as a Domain User when the servers are experiencing the issue.  Local admin users get straight in.  

    This one has got me stuffed!

    EDIT:  We are a few months behind on our MS Updates for this environment, so I'm getting them applied early next week to see how things look then. 
    06 April 2018 3:53
  • We are experiencing the same problems with our RDS Servers. I tried the powershell script that you provided. Seems to clear up about 7k rules. 

    Do you mind sharing the other Powershell script where you maintain the UPD's?

    I currently have a copy VM setup to test the possible resolutions.

    Thanks

    09 April 2018 9:17
  • @RobertAngell

    We have exactly the same problem on our XenDesktop based server 2016 environment. We use roaming profiles with Citrix UPM, so we don't use any user profile disks.

    The problem is, we don't see any errors in the eventlog. We see a warning from the user profile service after the black-screen problem occurs -> 

    Profile notification of event Create for component {2c86c843-77ae-4284-9722-27d65366543c} failed, error code is Not implemented.

    I also tried to restart the audio service and kill the audiodg.exe, but that didn't help. The App Readiness service is set to manual but was started. Unfortunately I couldn't restart the service, now it's marked as "stopping" in the services console :-/

    In our case, we also cannot log in through the VM console with domain users. Have to try it with a local admin account.

    10 April 2018 12:33
  • I'd like to share my experience with freezes and black screens at logon with you guys. After searching for weeks we found 250,000 firewall rules, which caused random freezes of the RDS servers. During a freeze all users on the RDS server were looking at a frozen screen for 30-90 seconds.

    Disabling the Windows Firewall reduced the black screen from 10 minutes to a few seconds. Also, after a reboot, the logon was blazing fast again.

    Deleting the duplicated firewall rules fixed the freezes.

    When you're using Windows 2016 RDS and User Profile Disks, you will see that a user logon creates several duplicated firewall rules in inbound, outbound and restricted services configurable. With 100 users your RDS server logons will get slower and slower from 3 months on, until you encounter freezes and black screens after 9 months. The number of firewall rules will grow by thousands a day, so each day your delays will be longer. It's described in another TechNet thread: Winserver 2016 RDS UPD issue

    So I can confirm that the RDS and UPD combination will result in black screens in 3 - 6 months, which can be fixed by cleaning up Windows Firewall rules.







    12 April 2018 20:13
  • after running lapulapus fw script for a while on a rds 2016 with 700+ concurrent users i can confirm that it improves working with the collection but does not resolve the/our blackscreen issue.

    it helps tremendously with logon times and other problems resulting from too big reg blocks. 200k+ useless fw rules take their toll ... thanks again lapulapu for providing this script!

    @lapulapu i'd love to give your second script regarding user profile disks a try!

    i still think that the black screens in our collection result from a browsing issue. remember, killing audiodg.exe instantly solves the issue in many cases. i believe that videos playing in internet explorer (and firefox on rds as well) hang and kill new logins.

    16 April 2018 7:10
  • Also suffering from this random black screen issue on our RDS 2016 servers too.

    Does anyone know why users are creating the firewall rules in the first place, and can it be disabled?

    Will be trying the fixes for Audio, App Readiness and duplicate firewall rules and will report back.

    17 April 2018 7:26
  • @Agent-OrangeCH


    We are experiencing the exact same thing.
    It seems to get worse if we apply the April updates (PVS System so we rolled back).

    So far we don't have a solution and there seems to be no information in the Citrix community so far.

    regards

    Update: Never mind, we found this old Citrix knowledge base article: https://support.citrix.com/article/CTX201804
    So we gave it a try and removed the VMware Tools display driver (we had recently updated VMware Tools). We reinstalled the VDA and it seems to work now.


    • Edited by bamu12 19 April 2018 8:09
    18 April 2018 19:08
  • Another update...

    MS Updates through March are applied, with the issue still occurring.  I find the only way to reliably resolve the issue is to stop and disable the App Readiness service - Stopping it immediately allows current black-screen users to finish their logins and brings up their desktop.  Disabling allows them to continue logging in with no delays.  

    However, I read somewhere that Microsoft's official stance is to NOT disable the App Readiness service.  I'm leaving it enabled for now, pending feedback in this thread, if anyone else is willing to give it a go...

    26 April 2018 5:05
  • i had a really bad experience after doing that, had to replace a couple of hosts ..

    the MS recommendation you're referring to is here: https://docs.microsoft.com/en-us/windows-server/security/windows-services/security-guidelines-for-disabling-system-services-in-windows-server

    26 April 2018 7:39
  • I've permanently disabled the App Readiness service on one of our affected servers to see if this helps.  As noted, Microsoft's stance is not to disable it.  However, the guidance also says that the service should only be on the Datacentre edition of Server 2016, which is clearly not the case, as it is also installed on all of our Server 2016 Standard machines.

    Update: I'm still seeing the issue even with the service disabled, so no dice there, I'm afraid.

    14 May 2018 9:11
  • Having the same problem with XenApp 7.16 on Windows Server 2016.

    I can confirm it happens with users, admins (who have no GPOs applied) and console connections from the hypervisor.

    None of the above seems to help. We use PVS, and it doesn't take many logons until the black screen prevents users from logging on. Users who are already logged on can still work as normal. I don't find any error messages on the server either.

    What is weird is that for several hours it was working fine. I rebooted the server (not in production yet, so I was testing) and instantly this black screen stopped logons. No changes are made between boots, because the same image is tested each time (PVS).

    Have applied the OS Build 14393.2248 - does not help.

    15 May 2018 14:03
  • i can report positive results, we have been 3 (!) days without an incident as of today :D

    after i stocked up on resources (lots of cpu/ram) on our fileserver for the UPDs, we continued with resetting UPDs that exceed our old size limit. We are working with 1GB max, a small percentage of our profiles are still over 4GB+.

    Clearing out the UPDs on the fileserver side also seems to have a positive impact, as does the firewall script.

    ps: jinxed it, minutes after i wrote this entry a host died ... at its core it remains a browser issue (mostly with IE, less FF)

    16 May 2018 8:05
  • I can also report that no news is good news - ie I've not had complaints from the customer on this issue.

    Things I've done - hopefully this may help fd_Jose Daniel.

    1. Update Symantec Endpoint Protection - Originally tried removing it and it did show improvement but wasn't cured.  Later put the latest version back on.

    2. Removed Firefox.  I don't think this helped at all but it seemed to be something a lot of people had in common so I tried removing it.

    3.  Set a script to run every morning that restarts the audio service:

    net stop Audiosrv
    taskkill /F /IM AudioDG.exe
    net start Audiosrv

    More elegant ways of doing this have been posted but there are less than 20 users on this 2-server farm so the frequency seems to be enough.  None of them make use of audio - beyond a 'ding' for mail at most - and the timing of the script is early enough that nobody is using the system.

    4. Clear unnecessary firewall rules.  I did use the script posted by someone (Thanks and apologies for being too lazy to look up who and give proper credit) to clear down the existing rules but it takes too long to run regularly and didn't clear all the rules.  My script is dirty...but quick.  Once I had a clean set of rules, I exported the appropriate registry keys.  Now I can just delete the entire key and re-import the clean ones.  This is done every night after people log off.  Note that if the firewall rules are deliberately changed, the exported reg file would have to be updated.  Like I said, it's not an elegant way of doing it...but it's quick and it seems to work.

    cd C:\Program Files\Batch
    reg delete hklm\SYSTEM\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\RestrictedServices\Configurable\System /f
    reg import ".\Firewall configureable rules - emptied.reg"
    reg delete hklm\SYSTEM\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\FirewallRules /f /va
    reg import ".\Firewall rules - emptied.reg"
    net stop MpsSvc
    net start MpsSvc

    5.  I did also try disabling per-user service creation as per this document but our current version of Windows Server 2016 is only 1607 and this doesn't work until 1709.

    Of these, the biggest impact was definitely the Audio service.

    Edit:  Should also point out that we use User Profile Disks (UPDs) but with the default 20GB limit.  We also use Folder Redirection and specify the location of Outlook's OST file, so none of our users have hit the 20GB limit yet.

    • Edited by Cenedd 16 May 2018 8:43 Addition
    16 May 2018 8:41
  • 1: We don't have Symantec

    2: I can try removing FireFox

    3: The servers boot every morning, with a fresh copy of Windows (PVS)

    4: This "clears" itself with every boot. 

    5: Yup....  1607 here as well.

    In my testing, the number of users does not seem to matter for us. I've been testing with 2 regular users and 1 admin user. I still get the error, and it's pretty random. We are using Citrix Profile Manager for the user profiles, and folder redirection. Our admin users are not using Citrix Profile Manager.

    16 May 2018 11:36
  • It seems that Citrix UPM deletes the AppReadiness key, which is the cause in our environment.

    https://discussions.citrix.com/topic/394538-continued-problems-with-black-screen-at-session-start-with-windows-10/?page=2

    24 May 2018 13:21
  • So basically, RDS 2016 is not ready for deployment in the enterprise is pretty much what I am gathering from this thread... We have the same issues as everyone else however nothing so far has fixed it. Not erasing the firewall rules, not stopping the audio service, nothing.

    06 June 2018 17:47
  • In my opinion certainly the 1607 build does not seem to be worth the headache if you have the option of running 2012R2.  We seem to have got rid of the black screen issues by doing all the above but we still have issues with Windows Explorer locking up and taking out the Start Menu, Outlook and file browser windows.  To be fair, we are running Classic Shell so it is possible that is involved in the problem.

    Would be very interested to know if anyone is running 1709 or 1803 and found that it fixed the problems completely but if the choice is 1709 or 2012R2 and you want a peaceful life, I'd go 2012 myself.

    07 June 2018 8:36
  • the combined actions of resetting fw policies every night and resetting old profile disks got us down to 1 to 3 defective servers per week, as compared to up to 10 defective servers a day (!) until a couple of weeks ago.

    i know of big installations who do not suffer from this error, i know of other big installations who do. apparently a lot of admins in this thread are kept busy with this nightmare.

    with the information i have today i'd rather use 2012r2 plus classic shell or, if one can afford it, citrix, but rds 2016 is obviously not at all enterprise ready.

    07 June 2018 10:13
  • Sorry, haven't checked in on this thread in quite some time.  Someone had requested my script that I use to clean up our UPDs.

    Long post, breaking it up into several posts as i keep getting marked as spam.

    We would have just gone back to Win2012R2 given all the issues we had earlier, but we really wanted a consistent user experience between our Win10 PCs and remote desktops.

    I posted my script back in March to clear the nasty firewall rules, which seemed to help some people.

    We've been successfully running the following production environment:
    1 Web Access, Gateway, Licensing, Connection Broker
    4 Session Hosts
    1 UPD

    We barely have any issues now that we can't trace back to a UPD cause.  In fact, 2 of the servers in the farm missed their last maintenance windows; their uptime is 68 days and they are still running well.

    We have about 1200 UPD profiles, but usually only have about 15-20 active sessions per session host at any one time throughout the day.  The firewall script that runs on each session host deletes about 7000 rules every day (I modified my original script to email me daily with the count of rules deleted)

    Storage is an issue for us, so we've had to limit user profiles to 1 GB. However, we originally left users able to save to the Documents, Desktop, etc. folders of their profiles.  Disabling this would impact existing users considerably, so we stuck with it.

    The only thing that still causes the black screen issue for us is a user's profile filling up completely.  This in turn causes that particular session host to crap out and start black screening for any new connections.  Existing sessions are unaffected.  We then have to block new connections to this session host until all sessions are logged off and it can be rebooted.

    Users constantly filled their profiles, which grew the UPDs to their maximum and consumed more overall disk space, and we could not pinpoint the users whose full profiles were causing the servers to crap out.

    We then put in place a script that does nightly maintenance on all the UPDs: it clears unnecessary files, shrinks all the disks, then emails the team, so we can find users at or near their storage limit and intervene before they cause issues with our servers.

    • Edited by LapuLapu 13 June 2018 6:34 typo
    13 June 2018 4:26
  • The following script does do some wacky stuff, which I'll try to explain a bit.

    On our UPD file server, I installed the Hyper-V role as I needed to use some of the  Hyper-V powershell commands.  I do believe you can get away with just installing the Hyper-V Powershell without the entire Hyper-V role, although I already did it and didn't bother trying to uninstall all but powershell.

    At the top I listed some commands that we also call at logon via GPO.  They create a text file in the root of the user profile called login.txt.  I probably could've stored this info in AD or something, but this was my quick and nasty method.  The reason I use this is that I can't use the modified date of the UPD, because as soon as it gets mounted, the date changes.  There's probably a better way to do this.

    We also have a LayoutModification.xml that we use for our Start Menu layout.  This file needs its modified date updated in order for it to apply to users who log on.  The file can only be modified when no users are logged on, so we run this script at 2am.  Sometimes it doesn't get updated because some nutters work really late.

    The script will also only process/shrink profiles larger than the defined 300 MB.

    There is also another file that sits in the same folder as the script.  I couldn't figure out a way to get the volume ID of the partition with PowerShell, so I call out to a batch file that uses diskpart to return the value to the script (if anyone knows a better way to do this, please share).

    Basically this script will mount each UPD and, if it has been logged into within the last 2 days (based on login.txt), process it (no point doing the same thing to a disk that hasn't been used since the script last ran).  It then finds PST files (users love to waste space with these), mainly for the report.  It also deletes a lot of temp files and apps we no longer want, then defrags, shrinks, optimises, resizes to the actual space used, resizes back to the max allowed and finally extends the volume.
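
    The "was this disk used recently?" decision can be looked at on its own. The following is a minimal sketch in Python rather than the PowerShell of the script itself; the function names are hypothetical, and the timestamp layout is assumed to match the `dd/MM/yyyy HH:mm` format that the script parses from login.txt.

```python
from datetime import datetime, timedelta

LOGIN_FORMAT = "%d/%m/%Y %H:%M"  # assumed to mirror "dd/MM/yyyy HH:mm"

def parse_login_stamp(text):
    """Parse the login.txt content, tolerating padding and doubled spaces."""
    normalized = " ".join(text.split())
    return datetime.strptime(normalized, LOGIN_FORMAT)

def should_process(login_text, now, max_age_days=2):
    """True when the last login is newer than the cutoff."""
    return parse_login_stamp(login_text) > now - timedelta(days=max_age_days)

run_time = datetime(2018, 6, 14, 2, 0)  # a hypothetical 2am maintenance run
print(should_process("13/06/2018  4:26 ", run_time))
print(should_process("10/06/2018 23:59", run_time))
```

    Collapsing whitespace before parsing covers both the single-digit hour written with a leading space and the trailing space that a batch `echo ... > file` can leave behind, so only one format string is needed in this sketch.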

    This will result in all UPDs being the actual size of the contained data, not the max file size of the UPD.  That makes it quick to identify disks that are nearly full at a glance in the folder they're stored in.

    I'm no expert with PowerShell and there are probably much better ways of handling this, but I needed a solution quickly back then and this seems to work for us.  Apologies also, as bits and pieces of these scripts have been butchered from other people's scripts and I couldn't possibly remember where I sourced some of it.


    • Edited by LapuLapu 13 June 2018 23:24 update
    13 June 2018 4:26
  • upd_Cleanup.ps1

    ###########################################
    ##                                       ##
    ##   Cleanup User Profile Disks (VHDX)   ##
    ##                                       ##
    ##   Requires: Hyper-V Role/Powershell   ##
    ##                                       ##
    ##	 Run as a local admin not domain     ##
    ###########################################
    ##
    ## Run these commands in a batch file at login from a GPO.  It writes the last user login date/time 
    ## to a text file in the profile. This login.txt is used so we dont process UPDs not
    ## used in current day and to determine when user last logged in because
    ## the UPD modified date will always be modified when the disk is mounted.
    ##
    ##    set hour=%time:~0,2%
    ##    if "%hour:~0,1%" == " " set hour=0%hour:~1,1%
    ##    set min=%time:~3,2%
    ##    if "%min:~0,1%" == " " set min=0%min:~1,1%
    ##    echo %date:~4,2%/%date:~7,2%/%date:~10,4% %hour%:%min% > %userprofile%\login.txt
    ##
    
    # Path to .xml files for Start Menu layouts (eg: LayoutModification.xml) 
    # All .xml files not in use will have modified date updated so all users start menu is updated.
    $pathXML = "D:\Custom"
    # Path to User Profile Disk share
    $pathUPD = "\\UPDserver\UPD-Share"
    # UNC path to store log files
    $pathLOG = "\\UPDserver\Logs$"
    # Log file prefix
    $logFile = "UPD_Log"
    $EmailBody = ""
    # minUPDsize: don't bother processing files smaller than this
    $minUPDsize = 300Mb
    # maxUPDsize: size to resize all UPD disks to after compacting (usually the same size defined in the Collection)
    $maxUPDsize = 1gb
    
    ### Update Modified Date of .xml files for Start menu (LayoutModification.xml)
    Try {
    	Get-ChildItem -Path $pathXML | ForEach-Object {$CurrentXML = "$_"; $_.LastWriteTime = (Get-Date); $EmailBody = $EmailBody + "$_ <br>"} -ErrorAction Stop -WarningAction Stop
    	$EmailBody = $EmailBody + "Modified date for all files in $pathXML Updated.<br><br>"
    } Catch {
    	Write-Host "$CurrentXML not updated.  File in use!!!"
    	$EmailBody = $EmailBody + "$CurrentXML Not Updated.  File in use...<br><br>"
    }
    
    ### Date format for Log Files (HH = 24-hour clock, so morning and evening runs get distinct file names)
    $date = (Get-Date -Format "yyyy-MM-dd_HH-mm")
    
    ### Get all UPD Disks (greater than 300Mb in size)
    $updFiles = Get-ChildItem $pathUPD -Filter "*.vhdx" | Where {$_.Length -gt $minUPDsize -and $_.name -ne "UVHD-template.vhdx"} | Sort-Object -Descending -Property Length | Select-Object -ExpandProperty fullname
    #Comment out above and use example below to process just one user profile disk manually.  Just update the .vhdx filename in the command
    #$updFiles = Get-ChildItem $pathUPD -Filter "*.vhdx" | Where {$_.name -eq "UVHD-S-1-5-21-2089814041-1972452211-576392789-84026.vhdx"} | Select-Object -ExpandProperty fullname
    
    $count = 0
    $subCount = 0
    $countUPD = $updFiles.Count
    $countSmaller = 0
    $countLarger = 0
    $countNoChange = 0
    $countSkipped = 0
    $countInUse = 0
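    $countUserNotExist = 0
    # Running total of recovered space in Mb; initialised explicitly instead of relying on $null arithmetic
    $totalSizeSaved = 0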
    
    ### Create Log File
    NEW-ITEM -Path $pathLOG\$logFile-$date.csv -ItemType file -force | OUT-NULL
    ADD-CONTENT -Path $pathLOG\$logFile-$date.csv "Status,User,Last Login,Before (Mb),After (Mb),File,PST Files"
    
    Write-Host "`nProcessing $countUPD Total User Profile Disks...`n" -ForegroundColor "Magenta"
    
    foreach ($file in $updFiles)
    {
    	$count ++
    	$subCount ++
    	
    	$PSTcount = 0
    	$PSTfiles = ""
    	$PSTlist = ""
    
    	### Get Username/Owner of the UPD
    	$sid = $file.Substring(($file.IndexOf("UVHD-")+5))
    	$sid = $sid.Substring(0,$sid.IndexOf(".vhdx"))
    	$secIdent = New-Object Security.Principal.SecurityIdentifier $sid
    	Try {
    		$user = ( $secIdent.Translate( [Security.Principal.NTaccount] ) )
    		$sam = $user.Value.Split('\')[1]
    	} Catch {
    		Write-Warning "ERROR : Cannot resolve SID $sid"
    		$user = "User Does Not Exist"
    		$sam = "User Does Not Exist"
    		$countUserNotExist ++
    	}
    
    	Try {
    		Write-Host "**** $file" -ForegroundColor "Yellow"
    		Write-Host "**** $user" -ForegroundColor "Cyan"
    		Write-Host "**** Starting..."
    		$sizeBefore = (Get-Item $file).Length/1Mb
    
    		### Mount the UPD
    		$mount = (Mount-VHD $file -Passthru -ErrorAction Stop -WarningAction Stop).DiskNumber
    		$driveLetter = (Get-Disk $mount | Get-Partition).DriveLetter
    		$defragArgs1 = $driveLetter + ": /X"
    		$defragArgs2 = $driveLetter + ": /K /L"
    		$volumeID=cmd /c $PSScriptRoot\get_VolumeID.bat $driveLetter | Out-String
    		$volumeID=$volumeID.Trim()
    		Write-Host "**** Mounted on $driveLetter with Volume ID $volumeID"
    
    		### Check if last login within last 2 days and ignore if older
    		$lastLoginFile = $driveLetter + ":\login.txt"
    		$yesterday = (Get-Date).AddDays(-2)
    
    		If (Test-Path ($lastLoginFile)) {
    			$lastLoginContent = Get-Content -path $driveLetter":\login.txt"
    			Try {
    				$lastLogin = [DateTime]::ParseExact($lastLoginContent.Trim(), "dd/MM/yyyy HH:mm", $null)
    			} Catch {
    				$lastLogin = [DateTime]::ParseExact($lastLoginContent.Trim(), "dd/MM/yyyy  H:mm", $null)
    			}
    			$lastLoginStr = $lastLogin.ToString("dd/MM/yyyy HH:mm")
    		
    			If($lastLogin -gt $yesterday) {
    				Write-Host "**** Has logged on in last 2 days - Compacting..."
    				
    				### Find PST Files
    				$PSTfiles = Get-ChildItem -Path $driveLetter":" -Filter *.pst -Recurse -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				$PSTcount = $PSTfiles.count
    				If ($PSTcount -gt 0) {
    					$PSTlist = $PSTfiles.FullName
    				} Else {
    					$PSTlist = ""
    				}
    
    				### Delete some Files from the Profile. 
    				Write-Host "**** Deleting Files..."
    				Remove-Item -path $driveLetter":\AppData\Local\Google\Chrome\User Data\Default\Cache" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Google\Chrome\User Data\Default\Media Cache" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Citrix" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\GoToMeeting" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Roaming\Dropbox" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Dropbox" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\Dropbox" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\Desktop\Dropbox*" -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Dropbox" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\Downloads\*" -include *.exe -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Temp\*.*" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Temp\*" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Microsoft\Windows\INetCache\*.*" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Microsoft\Windows\INetCache\*" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Microsoft\Windows\WebCache\*.*" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Microsoft\Windows\WebCache\*" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\T1ADF-Caches" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\T1ADF-SmartDeploy" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Mozilla" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Local\Mozilla Firefox" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\LocalLow\Mozilla" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Roaming\Mozilla" -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\Desktop\Firefox" -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    				Remove-Item -path $driveLetter":\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Firefox" -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    
    				
    				### Defrag the Disk and Consolidate the Free Space
    				Write-Host "**** Defrag VHDX..."
    				Invoke-Expression "Defrag $defragArgs1" | OUT-NULL
    				Invoke-Expression "Defrag $defragArgs2" | OUT-NULL
    		
    				### DISKPART to Shrink the Volume
    				Write-Host "**** Shrink VHDX..."
    				NEW-ITEM -Name Shrink.txt -ItemType file -force | OUT-NULL
    				ADD-CONTENT -Path Shrink.txt "select volume $volumeID"
    				ADD-CONTENT -Path Shrink.txt "shrink"
    				Invoke-Expression "DISKPART /S Shrink.txt" | OUT-NULL
    		
    				### Dismount the UPD
    				Dismount-VHD $file
    		
    				### Optimise the VHDX
    				Write-Host "**** Optimise VHDX..."
    				Mount-VHD $file -ReadOnly
    				Optimize-VHD $file -Mode Full
    				Dismount-VHD $file
    		
    				### Resize the VHDX to the Minimum Size of used space
    				Write-Host "**** Resize VHDX to minimum size possible..."
    				Resize-VHD $file -ToMinimumSize -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    		
    				### Compact the VHDX to get rid of unallocated free space
    				Write-Host "**** Compact VHDX..."
    				NEW-ITEM -Name Shrink.txt -ItemType file -force | OUT-NULL
    				ADD-CONTENT -Path Shrink.txt "select vdisk file=`"$file`""
    				ADD-CONTENT -Path Shrink.txt "compact vdisk"
    				Invoke-Expression "DISKPART /S Shrink.txt" | OUT-NULL
    		
    				### Resize the VHDX back to the original Maximum size defined
    				Write-Host "**** Resize VHDX back to Maximum ($maxUPDsize)..."
    				Resize-VHD $file -SizeBytes $maxUPDsize -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    		
    				### Mount the UPD
    				$mount = (Mount-VHD $file -Passthru -ErrorAction Stop -WarningAction Stop).DiskNumber
    				$driveLetter = (Get-Disk $mount | Get-Partition).DriveLetter
    				$volumeID=cmd /c $PSScriptRoot\get_VolumeID.bat $driveLetter | Out-String
    				$volumeID=$volumeID.Trim()
    				Write-Host "**** Mounted on $driveLetter with Volume ID $volumeID"
    		
    				### DISKPART to Extend the Volume to the max size
    				Write-Host "**** Extend VHDX..."
    				NEW-ITEM -Name Shrink.txt -ItemType file -force | OUT-NULL
    				ADD-CONTENT -Path Shrink.txt "select volume $volumeID"
    				ADD-CONTENT -Path Shrink.txt "extend"
    				Invoke-Expression "DISKPART /S Shrink.txt" | OUT-NULL
    		
    				### Dismount the UPD
    				Dismount-VHD $file
    				
    				Write-Host "**** Finished...`n"
    				$sizeAfter = (Get-Item $file).Length/1Mb
    				If ($sizeBefore -gt $sizeAfter) {
    					ADD-CONTENT -Path $pathLOG\$logFile-$date.csv "Compacted,$sam,$lastLoginStr,$sizeBefore,$sizeAfter,$file,$PSTlist"
    					$sizeDiff = $sizeBefore - $sizeAfter
    					$totalSizeSaved = $totalSizeSaved + $sizeDiff
    					Write-Host "**** Recovered Disk Space       = $sizeDiff Mb" -ForegroundColor "Green"
    					Write-Host "**** Total Recovered Disk Space = $totalSizeSaved Mb`n" -ForegroundColor "Green"
    					$countSmaller ++
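    				} ElseIf ($sizeBefore -lt $sizeAfter) {
    					### Log disks that grew, so the CSV matches the EXTENDED count in the email subject
    					ADD-CONTENT -Path $pathLOG\$logFile-$date.csv "Extended,$sam,$lastLoginStr,$sizeBefore,$sizeAfter,$file,$PSTlist"
    					$sizeDiff = $sizeAfter - $sizeBefore
    					$totalSizeSaved = $totalSizeSaved - $sizeDiff
    					Write-Host "**** Additional Disk Space       = $sizeDiff Mb" -ForegroundColor "Red"
    					$countLarger ++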
    				} Else {
    					ADD-CONTENT -Path $pathLOG\$logFile-$date.csv "No Change,$sam,$lastLoginStr,$sizeBefore,$sizeAfter,$file,$PSTlist"
    					$countNoChange ++
    				}
    				$sizeBefore = ""
    				$sizeAfter = ""
    			} Else {
    				Write-Host "**** Last Login = $lastLoginStr - Skipping..."
    				ADD-CONTENT -Path $pathLOG\$logFile-$date.csv "Skipping,$sam,$lastLoginStr,$sizeBefore,$sizeAfter,$file,$PSTlist"
    				$countSkipped ++
    				Dismount-VHD $file
    			}
    		} Else {
    			Write-Host "**** Login.txt file does not exist - Creating..."
    			$yesterdayStr = $yesterday.ToString("dd/MM/yyyy HH:mm")
    			$yesterdayStr | Out-File $lastLoginFile
    			### Dismount the UPD
    			Dismount-VHD $file
    		}
    	} Catch {
    		### File is in use, write this to the log/output
    		Write-Host "**** File in Use...`n`n`n"
    		Dismount-VHD $file -ErrorAction SilentlyContinue
    		ADD-CONTENT -Path $pathLOG\$logFile-$date.csv "File in Use,$sam,,$sizeBefore,,$file,"
    		$countInUse ++
    	}
    	$sizeBefore = ""
    	$sizeAfter = ""
    	### Write a summary output to screen of current physical disk space recovered every 10 that are processed
    	If ($subCount -ge 10) {
    		Write-Host "**** $count Processed : Summary Recovered Disk Space = $totalSizeSaved Mb`n" -ForegroundColor "Magenta"
    		$subCount = 0
    	}
    }
    ADD-CONTENT -Path $pathLOG\$logFile-$date.csv " "
    ADD-CONTENT -Path $pathLOG\$logFile-$date.csv "Total UPD Processed = $count"
    ADD-CONTENT -Path $pathLOG\$logFile-$date.csv "Total Recovered Disk Space = $totalSizeSaved"
    Remove-Item -path "Shrink.txt" -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue
    
    $EmailHeader = "<html><body>"
    $EmailBody = $EmailBody + "<a href='$pathLOG\$logFile-$date.csv'>$pathLOG\$logFile-$date.csv</a>"
    $EmailFooter = "</body></html>"
    
    $EmailContent = $EmailHeader + $EmailBody + $EmailFooter
    
    $style = "<style>BODY{font-family: Arial; font-size: 8pt;}"
    $style = $style + "TABLE{border: 1px solid black; border-collapse: collapse;}"
    $style = $style + "TH{border: 1px solid black; background: #dddddd; padding: 5px; }"
    $style = $style + "TD{border: 1px solid black; padding: 3px; }"
    $style = $style + "</style>"
    
    $smtpServer = "<SMTP SERVER>"
    $msgFrom = "<FROM EMAIL ADDRESS>"
    $msgTo ="<TO EMAIL ADDRESS>"
    $msgSubject = "RD UPD Maintenance $pathUPD : $countSmaller COMPACTED : $countLarger EXTENDED : $countNoChange NO CHANGE : $countSkipped SKIPPED : $countInUse IN USE : $countUserNotExist NOT EXIST"
    
    Send-MailMessage -From $msgFrom -To $msgTo -SmtpServer $smtpServer -Subject $msgSubject -Body "$style$EmailContent" -BodyAsHtml

    Here's the batch file that goes in the same folder as this script

    get_VolumeID.bat

    @echo off
    for /f "tokens=2,3" %%a in ('echo list volume ^| diskpart') do (
        if %%b==%1 echo %%a
    )
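
    For readers who want to see what the batch file is matching on, here is a minimal sketch in Python of the same lookup: scan the text printed by `list volume` and return the volume number whose letter column matches. The function name and the sample text are hypothetical; the sample only imitates the general column layout and is not captured from a real server.

```python
def volume_number_for_letter(list_volume_output, drive_letter):
    """Return the volume number for a drive letter, or None if absent."""
    wanted = drive_letter.strip().rstrip(":").upper()
    for line in list_volume_output.splitlines():
        tokens = line.split()
        # Data rows look like: Volume <number> <letter> ...
        if len(tokens) >= 3 and tokens[0] == "Volume" and tokens[1].isdigit():
            if tokens[2].upper() == wanted:
                return int(tokens[1])
    return None

# Illustrative text only, imitating the layout of "list volume".
SAMPLE = """
  Volume ###  Ltr  Label        Fs     Type        Size     Status     Info
  ----------  ---  -----------  -----  ----------  -------  ---------  --------
  Volume 0     D                       DVD-ROM         0 B  No Media
  Volume 1     C   System       NTFS   Partition    126 GB  Healthy    Boot
  Volume 2     E   User Disk    NTFS   Partition   1021 MB  Healthy
"""

print(volume_number_for_letter(SAMPLE, "E"))
print(volume_number_for_letter(SAMPLE, "Z"))
```

    Like the batch file, this relies on the letter being the third whitespace-separated token, so a volume without a letter whose label is a single character could be mismatched.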

    Obviously I built these specifically for my environment, so be sure to read through them thoroughly and remove stuff that doesn't apply to you (eg: deleting some files in the profiles).

    And please share if you have better methods of doing some of the stuff I do in the script.

    Hope this helps some people.

    • Edited by LapuLapu 13 June 2018 6:35 typo
    13 June 2018 4:27
  • Further to this,

    Remembering that we have a max profile size limit of 1gb.

    After running my script above, I can check my log file or check the directory where the UPDs are stored.

    It's a lot quicker and easier to see which profiles are nearing the limit of their profile storage in their UPD. Otherwise you would just see all profiles at the same size (the max profile size) and you wouldn't know how full each profile actually is.

    In my case, I have 2 profile disks which are completely full, even after shrinking them.

    Luckily those 2 are disabled users, but if they were to log onto any of our session hosts, they will cause that server to start Black Screening.
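
    Once the disks are compacted, spotting the risky ones is a simple threshold check on file size. The following is a minimal sketch in Python, assuming the file sizes have already been collected into a dictionary; the function name, the 90% threshold and the file names are hypothetical.

```python
GB = 1024 ** 3

def profiles_near_limit(sizes_bytes, limit_bytes, threshold=0.9):
    """Return (name, fill_ratio) pairs at or above the threshold, fullest first."""
    flagged = [
        (name, size / limit_bytes)
        for name, size in sizes_bytes.items()
        if size >= threshold * limit_bytes
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)

# Hypothetical compacted UPD sizes against a 1 GB profile limit.
sizes = {
    "UVHD-A.vhdx": 1 * GB,
    "UVHD-B.vhdx": GB // 2,
    "UVHD-C.vhdx": int(0.95 * GB),
}

for name, ratio in profiles_near_limit(sizes, 1 * GB):
    print(f"{name}: {ratio:.0%} full")
```

    A VHDX file carries some overhead of its own, so the ratio is only an approximation of how full the profile inside really is.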

    We also run another script once a week that moves disabled users' profiles to another folder and emails a summary report of all profile disks for enabled/disabled users.

    Also worth noting is that we do not run Firefox in our RDS environment; however, I did notice it was installed in a couple of our users' profiles recently.  I've since added it to my script to delete it, and I'm probably going to use AppLocker soon.
    • Edited by LapuLapu 13 June 2018 6:42 update
    13 June 2018 5:04
  • "I'm no expert with powershell" has to be the understatement of the year :D

    thank you so much LapuLapu for sharing your work! can you say how long the process takes per vhdx on average?

    as stated before running your fw cleanup script helped tremendously already. additionally i started resetting UPDs manually which also helped quite a bit. i might give this a shot, it will take me some time though.

    what i cannot tell is whether this also resolves the browser/audiodg part of the issue. as far as i can tell our hosts only crap out when a video is crashing. in firefox the issue is resolved by killing the audiodg process, in internet explorer this doesn't help and the server has to be rebooted. i could imagine it is temp files that fill up the disk, thus killing logins.

    we're running a public system so applocker is a must have for us. it is buggy and needs some work but generally does its job, i can recommend running it.

    again, a big thank you for sharing this information!!

    fun fact: we still have an open ms incident with this issue. the support we are getting from this is laughable at best. a couple of weeks ago out of the blue they started telling us about firewall rules and a script that can be run ...

    13 June 2018 7:51
  • As our profile limit is only 1 GB, it's pretty fast: under 10 seconds per vhdx or thereabouts.  As we have around 1,200 profiles, as mentioned in my posts I try to only process profiles that have been used in the past 24 hours to keep the script runtime to a minimum.
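    The "only process profiles used in the past 24 hours" filter could look something like the sketch below. It is an illustration in Python rather than the actual script, and it assumes the last-modified time of the .vhdx file is an acceptable proxy for "the profile was used"; the function name `recently_used_upds` and its parameters are hypothetical.

```python
import time
from pathlib import Path


def recently_used_upds(upd_dir, max_age_hours=24, now=None):
    """Return the .vhdx files in upd_dir modified within the last max_age_hours.

    `now` is a Unix timestamp and defaults to the current time; it is only a
    parameter so that the cutoff can be pinned when testing.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    return sorted(
        p for p in Path(upd_dir).glob("*.vhdx") if p.stat().st_mtime >= cutoff
    )
```

    With roughly 1,200 profiles at about 10 seconds each, skipping the untouched disks is what keeps a nightly run short.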

    I can't say we've had any issues with audiodg as many have, although we do not use Firefox and only have IE and Chrome.  Whilst there are some needs for Firefox, we drew a line in the sand and had to make a choice on browsers, as we didn't want profiles inflated even further by having additional stuff stored in them.

    If I were to have Firefox installed, I'd most likely have additional tasks at logon and logoff to do further cleanup of cache/temp files for all browsers, as these really do fill up UPDs pretty quickly. Or I would have to look at excluding specific folders from being saved in UPDs, but it looks like I'd have to recreate the whole collection.  I may look into this a bit further.

    It was my recent discovery of Firefox being installed in a few of our users' profiles that prompted me to forcibly remove it.  Those few users had about 400 MB of space consumed in their profiles just from Mozilla/Firefox files.  That's nearly half of our 1 GB profile limit from Firefox alone.  And because we only clean the UPDs nightly, we'd most likely have users completely fill their profiles within a day.

    • Edited by LapuLapu 13 June 2018 23:26 update
    13 June 2018 22:45
  • "fun fact: we still have an open ms incident with this issue. the support we are getting from this is laughable at best. a couple of weeks ago out of the blue they started telling us about firewall rules and a script that can be run ..."

    They've really got their finger on the pulse for this one ;)
    13 June 2018 23:29