Corrupt Segoe UI font and .NET framework after OSD with 2012 R2 CU1

    Question

  • Hi all,

    <edit>

    Doug's post below provided the solution. I wound up implementing the registry key fix he described as noted below.

    </edit>

    We recently upgraded from Configuration Manager 2012 SP1 CU3 to Configuration Manager 2012 R2 CU1 and immediately began having problems with OS deployment. Most of the problems were easy to fix (like needing to update our MDT boot images and change our task sequences so that they use the R2 client instead of the SP1 client), but one issue was very difficult to identify and resolve.

    We have a lot of Lenovo W540 and Dell Precision M4700 laptops. We are using the latest official driver packs from Lenovo and Dell for these models, and everything worked fine while we were on SP1 CU3. After upgrading to R2, these two models started failing OS deployment. The symptoms varied, but typically included a corrupt Segoe UI font.

    We would also see these errors while setting up the Configuration Manager client during the task sequence:

    "This application requires one of the following versions of the .NET Framework: v4.0.30319 Do you want to install this .NET Framework version now?"

     

    Since this was only occurring on two specific models, I assumed it was driver related (even though the exact same drivers had worked in SP1). I initially disabled the steps that install drivers for the W540 and M4700 and just added a step to install their network drivers. Without the machine-specific drivers loaded, OSD worked, but I had to manually load drivers after it finished, which was not ideal. I concluded there must be some sort of conflict between something new in the R2 CU1 client and one of the drivers, but narrowing down which driver was causing the problem was too time-consuming, so I gave up on it.

    I tried using a base OS image that was not built using Configuration Manager, so it wouldn't have any version of the CM client in it. This worked fine, even with all the drivers enabled. I also tried using one of my old base OS images that had the SP1 CU3 client in it, and this also worked. So that made me suspect the 2012 R2 CU1 client was to blame for sure.

    I finally tried re-building my base OS images using the 2012 R2 client, but without CU1 (I just removed the PATCH= argument from the "Setup Windows and ConfigMgr" step and re-ran the build and capture task sequence). This worked fine, and is the solution I'm settling on. I can still include the PATCH= argument when I deploy this newly-captured base image.

    I don't know the full root cause, but it appears to be some sort of conflict with the device drivers used by these two specific models and something included in CU1. I suspect one of the drivers tries to upgrade .NET framework while a reboot is in progress, or something like that, but that's as far as I got.

    I hope this helps someone else, or if you have any ideas on what's broken, I'd love to hear about it.

    --Russel Riley



    • Edited by Russel Riley Thursday, November 13, 2014 11:18 PM Resolved
    Monday, July 14, 2014 6:07 PM

Answers

  • Russel and all.

    We were having the same issue: random "need to install .NET" prompts and a corrupted font.

    We narrowed it down to registry corruption. OSDApplyOS.exe was somehow removing some values from the .NET registry keys and sometimes removing keys from the Fonts registry.

    There is a little-known bug in the Microsoft code of OSDApplyOS.exe, which is the process that applies the OS image in the Apply Operating System step. It was explained in a Microsoft internal document. (They would not forward me a copy of this document; it was read to me over the phone.)

    It was explained to me that in WinPE (where all the registry manipulation was happening), OSDApplyOS tries to "reorganize" the registry of the image that gets applied, which is not a bad thing in itself.

    A setting in the WinPE registry tells the process to do this registry "reorganizing" once every 7 days. Hence the issue you were having on Monday, with everything back to normal by Wednesday. BUT, if the process runs out of kernel memory, it simply "truncates" the registry before unloading it. Nice!

    There are two workarounds:

    1. Use a 64-bit WinPE boot image. This was our choice, and it resolved our issues. In 2007/MDT 2010 there were utilities and other reasons to use a 32-bit boot image, but in 2012 I couldn't come up with a good reason to stay with the 32-bit boot image. Note: Make sure you have at least 2 GB of memory. I can reproduce this problem every time if I boot with a 64-bit image on a VM with 1 GB of memory.

    2. Change or add the registry setting "RegistryReorganizationLimitDays" so that it is greater than 100 days in the 32-bit WinPE boot image. I have no clear instructions on how to do this, but if you need to, MS should be able to help.

    Hope this helps.

    Doug

    

    • Proposed as answer by dcryer97 Wednesday, October 29, 2014 7:15 PM
    • Marked as answer by Russel Riley Thursday, November 13, 2014 11:16 PM
    Wednesday, October 29, 2014 7:14 PM

All replies

  • Have you opened a case with Microsoft CSS?

    Jason | http://blog.configmgrftw.com

    Monday, July 14, 2014 6:29 PM
  • Hi Jason,

    I have not opened a case, but I'm considering it. My main reason for not opening one at this point is that, as CSS representatives have told me frequently, they don't do root cause analysis. Now that I have a workaround, I don't think they'll offer any support (based on my experience with the last 3 cases I've opened).

    --Russel

    Monday, July 14, 2014 8:04 PM
    My main point is that they can't and won't fix an issue if they don't know about it, and CSS cases are the primary (some would say only) vector for them to learn about issues.

    Jason | http://blog.configmgrftw.com

    Monday, July 14, 2014 9:05 PM
  • I've opened a case. I'll update this thread when I have additional information to add.

    --Russel

    Monday, July 14, 2014 9:30 PM
  • Hi Jason,

    Thanks for encouraging me to open a case. My case is now closed and resolved. Our Build and Capture base image task sequence included the .NET 4.5 framework. At Microsoft's suggestion, I removed this and re-built the base OS images with the CU1 patch still included. Everything seems to be working fine now.

    So in all, we found any of the following items contributed to the issue:

    1. CU1 patch in base image
    2. Loading drivers for specific hardware models (although we've now seen the issue on a Lenovo ThinkPad T540 and Dell Precision T7500 too, so not as specific as initially thought)
    3. .NET Framework 4.5 in base OS image (along with the Windows Updates that get applied to .NET - it could have been one of those too)
    4. Some degree of randomness (since some machines fail with the corrupt font and others with a .NET framework error).

    Hope that helps someone else in the future.

    Best regards,

    --Russel Riley

    • Marked as answer by Russel Riley Tuesday, July 22, 2014 7:44 PM
    • Unmarked as answer by Russel Riley Thursday, August 07, 2014 3:48 PM
    Tuesday, July 22, 2014 7:44 PM
  • Hi all,

    Since my last post we've had some interesting developments. Last week on Monday, OSD began to fail again. I tried removing all versions of .NET from our base image, and by Tuesday night we were back up and running. However, this Monday, it started failing again. This time I did nothing and by Wednesday afternoon, OSD was working again. I reverted back to using our base image that includes all the stuff I had removed, and it still works.

    So, it seems that something is occurring between Friday evening and Monday morning every week that kills OSD, but that resolves itself by late Tuesday/Wednesday. I'm currently looking at scheduled tasks on our Configuration Manager servers. We do have content validation enabled on our distribution points, and that runs on Saturdays, but it usually finishes within 2-3 hours, not 3-4 days. If I can't find any smoking gun, I'll try running content validation manually and checking if that kills OSD immediately rather than waiting until Monday to find out if it will fail again.

    Any other ideas?

    Thanks,

    -Russel

    Thursday, August 07, 2014 3:59 PM
  • Hi Doug,

    Thank you for your response. I am currently testing the RegistryReorganizationLimitDays registry value. To set the value in WinPE, I made a copy of the base boot image .WIM file and mounted it with DISM. I then used regedit to open and edit <mount folder>\Windows\System32\Config\system. I added the value to this key (which I had to manually create):

    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Configuration Manager

    Value: RegistryReorganizationLimitDays

    Data: 0x1000
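
    The steps above can be sketched as a short Python script that only builds the command lines (it does not run them). This is a rough sketch under several assumptions, not the exact procedure used in this thread: the temporary hive name `HKLM\WinPE_SYSTEM`, the example paths, and the `REG_DWORD` value type are all hypothetical choices, and the sketch targets `ControlSet001` because an offline SYSTEM hive normally has no `CurrentControlSet` link (check the `Select` key of your own hive before relying on that).

```python
from __future__ import annotations

from pathlib import PureWindowsPath

# Hypothetical names chosen for this sketch only.
TEMP_HIVE = r"HKLM\WinPE_SYSTEM"   # temporary load point for the offline hive
VALUE_NAME = "RegistryReorganizationLimitDays"
VALUE_DATA = 0x1000                # 4096, the data quoted in the post


def build_commands(wim_file: str, mount_dir: str, index: int = 1) -> list[list[str]]:
    """Return the DISM/reg command lines in order, without executing anything."""
    hive_file = str(
        PureWindowsPath(mount_dir) / "Windows" / "System32" / "Config" / "system"
    )
    # Assumption: CurrentControlSet does not exist in an offline hive,
    # so the key is created under ControlSet001 instead.
    key = TEMP_HIVE + r"\ControlSet001\Control\Session Manager\Configuration Manager"
    return [
        # Mount a copy of the boot image for offline editing.
        ["dism", "/Mount-Wim", f"/WimFile:{wim_file}", f"/Index:{index}",
         f"/MountDir:{mount_dir}"],
        # Load the boot image's SYSTEM hive under a temporary name.
        ["reg", "load", TEMP_HIVE, hive_file],
        # Create the key if needed and set the value (REG_DWORD is assumed).
        ["reg", "add", key, "/v", VALUE_NAME, "/t", "REG_DWORD",
         "/d", str(VALUE_DATA), "/f"],
        # Unload the hive, then save the change back into the WIM.
        ["reg", "unload", TEMP_HIVE],
        ["dism", "/Unmount-Wim", f"/MountDir:{mount_dir}", "/Commit"],
    ]


if __name__ == "__main__":
    import subprocess

    # Example paths are placeholders; substitute your own copy of the boot image.
    for cmd in build_commands(r"C:\Temp\boot.wim", r"C:\Temp\mount"):
        print(subprocess.list2cmdline(cmd))
```

    Printing the commands first makes it easy to review them before running each one from an elevated prompt on a machine with the deployment tools installed.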

    I imported my modified WIM file as a new boot image, made a few modifications (adding some drivers I needed and enabling the PowerShell components), and configured my OSD task sequence to use the new boot image. So far everything is looking good, but I'm going to keep testing each day until I've gone at least a full week. The real test will come on Monday which used to be the day when our failures would start.

    Thank you,

    <edit>

    One thing I forgot to mention: I also switched our OSD task sequence to an x64 boot image, and have been running this way for a few months now. This worked around the issue, but wasn't ideal for us because our OSD task sequence can deploy either x86 or x64 versions of Windows, but only if it uses an x86 boot image. I wound up splitting the task sequence into two different task sequences, one for x86 and one for x64, but if the registry changes Doug mentioned fix the issue, I'll be re-combining them into a single universal OSD TS.

    </edit>

    --Russel


    • Edited by Russel Riley Wednesday, November 05, 2014 5:34 PM Added info for x64 boot image
    Wednesday, November 05, 2014 5:29 PM
  • I've re-tested the OS deployment task sequence using the new boot image every day now for more than a week and haven't had a single failure. Everything is looking much better now.

    Thank you Doug!

    --Russel

    Thursday, November 13, 2014 11:17 PM
  • That's great news Russel.

    I never got a clear answer on HOW the days are counted and from what starting point. The issue may resurface after 100 days. That worries me a little.
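
    For a sense of scale, here is a small arithmetic sketch. It assumes the value is a plain count of days, as the name suggests; the thread does not confirm how or from what starting point the days are counted. Under that assumption, the data Russel set (0x1000) is well above 100.

```python
DAYS_PER_YEAR = 365.25  # average year length, used only for a rough estimate


def days_to_years(days: int) -> float:
    """Convert a day count to approximate years."""
    return days / DAYS_PER_YEAR


# 0x1000 is the registry data quoted earlier in the thread.
limit_days = int("0x1000", 16)

print(limit_days)                            # 4096
print(round(days_to_years(limit_days), 1))   # 11.2
print(round(days_to_years(100), 2))          # 0.27
```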

    Thanks.

    ..Doug


    • Edited by dcryer97 Friday, November 14, 2014 12:40 AM
    Friday, November 14, 2014 12:39 AM