<Server> has been paused because it has run out of disk space on Cluster Shared Storage?

    Question

  • Recently I had a problem with a Hyper-V failover cluster I just set up.  We have a back-end SAN with 2.61 TB on the LUN.  I have 2 nodes in my cluster and a 10 GB quorum disk.

    So I get this error on 5 of my VMs.

    Log Name:      Microsoft-Windows-Hyper-V-VMMS-Admin
    Source:        Microsoft-Windows-Hyper-V-VMMS
    Date:          9/16/2012 12:02:47 PM
    Event ID:      16060
    Task Category: None
    Level:         Error
    Keywords:      
    User:          SYSTEM
    Computer:      <removed by me>
    Description:
    <hostname> has been paused because it has run out of disk space on 'C:\ClusterStorage\Volume1\vm\<hostname>\'.

    Now, our CSV storage space is 2.61 TB.  See the allocation below:

    VM         Disk Utilized (GB)   Disk Allocated (GB)   Allocation Type (Thin/Thick/PassT)
    vs1-OS     14                   127                   Thin
    vs2-OS     12                   127                   Thin
    vs2-data   535                  1035                  Thin
    vs3-OS     22                   127                   Thin
    vs4-OS     9                    50                    Thin
    vs5-OS     15                   127                   Thin
    vs5-Data   1024                 1024                  Thick

    Totals     1631                 2617

    Cluster Shared Volume available = 2.61 TB

    Ratio = 1:1
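
As a rough check of the arithmetic in the table above, here is a minimal Python sketch. The figures are copied from the table; the names `DISKS`, `totals` and `potential_growth` are made up for this illustration and do not come from any Hyper-V tool.

```python
# Figures copied from the table: (disk, used GB, allocated GB, allocation type).
DISKS = [
    ("vs1-OS", 14, 127, "Thin"),
    ("vs2-OS", 12, 127, "Thin"),
    ("vs2-data", 535, 1035, "Thin"),
    ("vs3-OS", 22, 127, "Thin"),
    ("vs4-OS", 9, 50, "Thin"),
    ("vs5-OS", 15, 127, "Thin"),
    ("vs5-Data", 1024, 1024, "Thick"),
]


def totals(disks):
    """Return (total used GB, total allocated GB) across all disks."""
    used = sum(row[1] for row in disks)
    allocated = sum(row[2] for row in disks)
    return used, allocated


def potential_growth(disks):
    """GB that the thin (dynamic) disks could still claim if they filled up."""
    return sum(alloc - used for _, used, alloc, kind in disks if kind == "Thin")


used_gb, allocated_gb = totals(DISKS)
print(used_gb, allocated_gb, potential_growth(DISKS))
```

The sums match the Totals row, and the difference between allocated and utilized space is what the thin disks could still grow into.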

    Now I did have a 500 GB backup of VS5 from when the cluster previously crashed.  Still, I had 111 GB of free space left on the CSV according to the Failover Cluster Manager. 

    I guess my question is how much overhead is really required here?  I don’t want my VM’s to “pause” because they don’t think they have enough space, even though I do.  I don’t know what to do here because I’ve used VMware and Xen and the allocated space is what you have.  So I’m kind of confused.

    I’ve been reading the forums and one suggestion was that there should be double the capacity that is allocated which seems unnecessary.  Another thing that keeps coming up said to delete the snapshots, but I’m not using any snapshots currently.   ???

    I just saw this when I was about to post my question:

    http://social.technet.microsoft.com/Forums/en-US/winserverhyperv/thread/d9ed7259-104b-4b7f-8e57-2cc5364f5381

    Is this telling me that if I’m using dynamic (I call them thin) disks that I cannot overallocate?



    • Edited by pwjohnston79 Monday, September 17, 2012 3:00 PM
    • Moved by BrianEhMVP Monday, September 17, 2012 4:42 PM (From:Hyper-V)
    Monday, September 17, 2012 2:58 PM

Answers

  • If you are taking volume snapshots (directly, or indirectly by using backup tools like DPM or 3rd-party backup solutions that rely on volsnap), then there is an issue that we came to understand about a week ago, and a fix for it is currently being tested. The symptom is that when a snapshot is taken, NTFS might briefly report the free volume size as 0 before it goes back to normal (we are talking about a window of a couple dozen milliseconds here). If VMMS queries the free volume size and finds that there is none, it will pause VMs. So if VMMS is unlucky and does the query at the moment when NTFS reports 0, you might hit the issue.

    For now please contact Microsoft Support to get status on the fix availability.


    Regards, Vladimir Petter, Microsoft Corporation



    Friday, December 7, 2012 6:11 PM
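
To illustrate why a single free-space sample is fragile when the reported value can drop to 0 for a few milliseconds, here is a minimal Python sketch that re-samples before concluding a volume is full. This is only an illustration of the race described above; it is not how VMMS works and not what the hotfix does. The function name `confirmed_free_bytes`, its parameters, and the simulated readings are all assumptions made for this example.

```python
import shutil
import time


def confirmed_free_bytes(path, attempts=3, delay_s=0.05, probe=None):
    """Return the first non-zero free-space reading, or 0 if every probe reads 0.

    `probe` is a hypothetical hook so the logic can be exercised without a
    real volume; by default it asks the OS via shutil.disk_usage.
    """
    if probe is None:
        probe = lambda p: shutil.disk_usage(p).free
    for attempt in range(attempts):
        free = probe(path)
        if free > 0:
            return free
        # A zero reading may be transient, so wait briefly and sample again.
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return 0


# Simulated readings: two transient zeros followed by the real value.
readings = iter([0, 0, 4096])
print(confirmed_free_bytes("C:\\ClusterStorage\\Volume1", delay_s=0,
                           probe=lambda _p: next(readings)))  # prints 4096
```

A monitor that acted on the first reading alone would have treated this volume as full.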
  • The KB number for the hotfix is KB2791729.

    You will have to log a support call to get the hotfix. It's not showing on the knowledge base yet.

    Monday, December 17, 2012 2:20 PM

All replies

  • As far as overhead is concerned, CSV by itself should not introduce much overhead for the Hyper-V scenario. The following algorithm might change, but to give you some sense:

    - CSV does NOT keep any data on the volume on its own. In other words, there are NO CSV metadata files that require additional storage space.  All disk usage comes from the data in the files that NTFS, VSS and your applications have on the disk.

    - To improve performance, CSV applies certain policies to reduce the number of round trips that a non-coordinating node needs to make to the coordinating node (extra trips mean extra latency on the IO path). When a file grows in size, CSV applies a more aggressive growth policy than NTFS does.

       1. CSV always requests at least 4 KB of allocation size to make sure the file is not resident. If you have lots of small files (below 4 KB), then on CSV they will take more space than on NTFS without CSV, because NTFS keeps such files in its MFT records. CSV disables resident files because they are not compatible with Direct IO.

       2. If a file is smaller than 1 MB, then when the file is extended CSV doubles the required allocation size. This policy does not trigger unless a write extends the file. VM config files are most likely in this category.

       3. If a file is above 1 MB, then CSV rounds the required allocation size up to a 1 MB boundary. This policy does not trigger unless a write extends the file. Differencing VHDs, while they are growing, are in this category.

    Once the file is closed, NTFS and CSV truncate the allocation size to the end of the file, reclaiming unused space.

    Note that base VHDs would not be over-allocated since they are read-only.

    Given the policies above, you can calculate that if the system runs for a long time, then in the worst case the over-allocated space is around <number of files> * 1 MB.

    For instance, if you have 1000 VMs and every VM has 1 diff VHD, then we are talking about 1 GB of over-allocated space. That seems to be way below what you had, so most likely something consumed all that space at that moment for some reason.
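
The three growth policies and the worst-case estimate above can be sketched in a few lines of Python. This is one reading of the description, not the actual CSV implementation: it assumes the 1 MB threshold applies to the required size, that "round up to 1 MB" means the next 1 MB boundary, and that exactly 1 MB is handled by the rounding rule. The function names are invented for this example.

```python
KB = 1024
MB = 1024 * KB


def csv_allocation_size(required_bytes):
    """Allocation size CSV might request when a write extends a file.

    Interpretation of the policies described above (assumed, not official).
    """
    if required_bytes < MB:
        # Policy 2: double the required size; policy 1: never below 4 KB.
        return max(4 * KB, 2 * required_bytes)
    # Policy 3: round up to the next 1 MB boundary (ceiling division).
    return -(-required_bytes // MB) * MB


def worst_case_overallocation(growing_files):
    """Upper-bound estimate from the post: about 1 MB per growing file."""
    return growing_files * MB


print(csv_allocation_size(100))        # small file: bumped to 4 KB
print(csv_allocation_size(10 * KB))    # below 1 MB: doubled
print(csv_allocation_size(MB + 1))     # above 1 MB: rounded up to 2 MB
print(worst_case_overallocation(1000) / (1024 * MB))  # in GB, roughly 1
```

Under these assumptions the slack per file never exceeds 1 MB, which is where the estimate of about 1 GB for 1000 growing differencing disks comes from.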


    Regards, Vladimir Petter, Microsoft Corporation

    Monday, September 17, 2012 5:30 PM
  • Hi,

    In a scenario where you have virtual machines configured on a Cluster Shared Volume (CSV) and the CSV disk runs out of free space, the VMs enter the paused state. However, the virtual machines on the other nodes of the cluster, which do not own the CSV, are not affected.

    The node that owns the Cluster Shared Volume keeps checking the disk space and raises a warning. When the warning is received, the virtual machines go into the paused-critical state. The nodes that don't own the CSV have no way to determine whether the CSV volume is running out of space, and hence allow the virtual machines to keep running on them.

    To fix the issue, free up disk space on these hard drives or move the VHD files to a new location.

    For more information please refer to following MS articles:

    Hyper-V Virtual Machine Snapshots: FAQ
    http://technet.microsoft.com/en-us/library/dd560637(v=WS.10).aspx
    Hyper-V: Avoid using differencing disks on virtual machines that run server workloads in a production environment
    http://technet.microsoft.com/en-us/library/ee941134(v=WS.10).aspx
    Virtual Machines enter the paused state due to low disk free space
    http://support.microsoft.com/kb/2486243


    Lawrence

    TechNet Community Support

    Tuesday, September 18, 2012 6:21 AM
    Moderator
  • There were no snapshots and I had 111 GB free on the CSV when the machines went into a paused-critical state.
    • Edited by pwjohnston79 Wednesday, September 19, 2012 7:30 PM
    Wednesday, September 19, 2012 7:30 PM
  • I've just had the same issue with a Windows Server 2012 host but a much simpler setup.  Non-clustered and direct attached SAS storage in the host ...suddenly last weekend two out of seven VMs are paused critical with a message about the disks running out of space.  There is plenty of free space on the host drive and also within the VHDX files.  I've opened a case with Microsoft - although haven't heard back yet...

    Oh and all of the VHDX files are fixed size (ie fully pre-allocated/expanded).

    Tuesday, November 13, 2012 1:19 AM
  • Me too.

    5.7TB free out of a 10TB volume and just had a VM pause because it didn't think it had enough disk space.

     

    Friday, December 7, 2012 9:17 AM
  • I have exactly the same situation here. Is there already a summary page (TechNet) of hotfixes for Server 2012, like the one available for 2008 R2?
    Sunday, December 9, 2012 9:17 AM
  • Hi!

    I have exactly the same problem! I have two Windows Server 2012 servers, HYPERV1 and HYPERV2, in a cluster. They have an iSCSI disk presented, and the disk is a 10 TB Cluster Shared Volume with 7.5 TB of free space. When I run 3rd-party backup software, strange things happen on the Hyper-V servers: they think the volume is out of disk space and pause the VMs. In the worst case the VMs are stopped abruptly. This has now caused my DCs that were running in the Hyper-V cluster to crash and fail.

    Where do I get this fix? I need it badly...

    Br Patrik

    Monday, December 10, 2012 10:39 AM
  • Hi Vladimir,

    Any chance you can post back to this thread, with a link, when the fix is available?

    I've been having the same issue on a 6-node cluster with 8x 8 TB CSVs.
    Free space on most of the CSVs is more than 75%, but VMs pause randomly during the backup window.

    Wednesday, December 12, 2012 8:31 AM
  • The fix is currently being tested. I do not think a public KB is available yet.

    Unfortunately I am not in a position to promise any ETAs on behalf of Microsoft. I am only a developer on the Clustering team. The only way to get ETAs at this point is to ask Microsoft Support. I am sorry for the inconvenience.


    Regards, Vladimir Petter, Microsoft Corporation



    Thursday, December 13, 2012 4:14 AM
  • Hello,

    We have the same issue here. The error happens randomly, on random VMs, not all of them. The volume where the VHDs are stored has 1.56 TB free (of 6.44 TB).

    The server is running Windows Server 2012 Datacenter.

    Regards,

    Pierre

    Friday, December 14, 2012 1:31 PM
  • Do you think KB2791729 is out there on the web someplace?  Can a WSUS server download it?

    Thursday, December 20, 2012 10:14 PM
  • Officially you need to contact Microsoft Support to get it.

    Unofficially, it will probably get posted to thehotfixshare.net until it is available in WSUS.

    I did not see it on thehotfixshare, so I posted it in a blog article I wrote on this topic: http://blogs.catapultsystems.com/IT/archive/2012/12/07/vms-are-paused-prematurely-because-server-2012-is-cautious-of-dynamically-expanding-disks.aspx

    Thursday, December 20, 2012 11:17 PM
  • Do we have to pay Microsoft for this? I am trying to log a support case to get the official hotfix but Microsoft will only allow a case to be logged if we have a support contract (which apparently we do not even though we have SA).

    Same issue as others: three Hyper-V hosts on Server 2012, all local storage. During Veeam backups, guests go into the 'Paused' state and Event ID 16060 is logged under Hyper-V-VMMS. With such a serious issue, I am surprised that Microsoft has not yet created a public KB for this.

    For now I have created a PowerShell script that runs when Event 16060 is logged and resumes all paused guests. It is not a fix, though, as clients still get disconnected.

    Have others deployed the hotfix, and can confirm it fixes the issue without any side-effects?

    Wednesday, January 2, 2013 4:22 PM
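
The poster's script is not shown in the thread. As a hedged sketch of the same workaround, the Python snippet below builds and runs a PowerShell pipeline using the Hyper-V module cmdlets `Get-VM` and `Resume-VM`. Assumptions: the state filter value `'Paused'` follows the poster's wording and may need adjusting if paused-critical guests report a different state on a given host; the helper names are invented; and the trigger on Event 16060 (for example a scheduled task attached to the event) is left out.

```python
import subprocess

# PowerShell pipeline: find guests whose state is Paused and resume them.
# The 'Paused' filter value is an assumption taken from the post's wording.
RESUME_PAUSED = "Get-VM | Where-Object { $_.State -eq 'Paused' } | Resume-VM"


def build_resume_command(pipeline=RESUME_PAUSED):
    """Return the argument list that runs the pipeline in Windows PowerShell."""
    return ["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", pipeline]


def resume_paused_vms():
    """Run the pipeline on a Hyper-V host (must be run elevated on that host)."""
    return subprocess.run(build_resume_command(), capture_output=True,
                          text=True, check=False)


# Show the command without executing it, so this is safe to run anywhere.
print(" ".join(build_resume_command()))
```

As the poster notes, this only shortens the outage; clients connected to a paused guest still get disconnected.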
  • To the best of my knowledge you do need to pay if the issue ends up being caused by a bug in the Microsoft product. Support personnel should be able to confirm.

    Regards, Vladimir Petter, Microsoft Corporation


    Wednesday, January 2, 2013 7:47 PM
  • Hi all,

    I have the same problem with a Hyper-V server with 2 VM guests (Windows Server 2012).

    One VM was paused after a Veeam backup job (but I have no space problem on my backup partition or on any server partition).

    I called Microsoft support, who told me that hotfix KB2791729 has been replaced by hotfix KB2813630 (also applicable in a non-cluster environment like mine), which is also a public hotfix.

    I have installed hotfix KB2813630 on my server (it required a restart of the Hyper-V server).

    Bye.


    Friday, May 10, 2013 2:09 PM
  • Mr. CriBullo

    Was the problem fixed after applying the KB, or was the cause something else?


    Jonas Campos, Support Analyst, FCP | ITIL http://www.linkedin.com/profile/view?id=129941481&trk=tab_pro Before printing, remember your commitment to the environment.



    Tuesday, January 12, 2016 1:29 PM
  • Mr. BagOP

    Has the problem been solved? This is happening to me and I use WS2012R2. Many are blaming the backup tool (Veeam Backup).

    Jonas Campos, Support Analyst, FCP | ITIL http://www.linkedin.com/profile/view?id=129941481&trk=tab_pro Before printing, remember your commitment to the environment.

    Tuesday, January 12, 2016 1:31 PM