Azure File Sync - Cloud Tiering - Same Volume, multiple tiering policies

  • Question

  • Hi, All,

    Pretty excited to be piloting Azure File Sync. It looks very promising.

    One of the biggest limitations I currently see in the technology is the 5 TiB Azure File Share limit. Because of this, we'll have to split the data set we're using for the pilot into multiple Azure File Shares. Not the biggest of deals, but seeing the limit increased would be a great improvement.

    That said, because we'll have to split the data set, we may end up in a situation where the same server is a member of multiple Sync Groups. In the documentation I have seen that nested Server Endpoint paths are not supported. We can work around this by creating different top-level folders on the same volume.

    The question I have not been able to see answered in the documentation is this:

    If the same volume on the same server hosts multiple Server Endpoints and those server endpoints all have cloud tiering enabled, how does Azure File Sync deal with that?

    As an example:

    1. Server1: D:\Shares\Data1 is a server endpoint in SyncGroup1. A 40% tiering policy is enabled on this endpoint.

    2. Server1: D:\Shares\Data2 is a server endpoint in SyncGroup2. A 70% tiering policy is enabled on this endpoint.

    In this case, cloud tiering on the server endpoint in SyncGroup1 will tier local content to Azure Files until the volume's free space reaches 40%. Once it reaches that target, it will theoretically no longer tier files.

    At the same time, the server endpoint in SyncGroup2 will want to maintain 70% free space on the same volume. Even if it tiers every single file in its sync group, it may still not achieve 70% free space if the endpoint in SyncGroup1 holds enough data to fill 60% of the volume.
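    To make that arithmetic concrete, here is a tiny sketch (illustrative Python, not Azure File Sync code; all numbers are hypothetical and measured in percent of total volume capacity):

```python
# Illustrative sketch (not Azure File Sync code): everything is measured
# in percent of total volume capacity, and the numbers are hypothetical.
def max_achievable_free_pct(other_endpoints_local_pct):
    """Best-case free space if this endpoint tiers every one of its files
    while the other endpoints on the volume keep their data local."""
    return 100 - other_endpoints_local_pct

# SyncGroup1's data fills 60% of the volume, so even a fully tiered
# SyncGroup2 endpoint cannot push free space past 40%:
print(max_achievable_free_pct(60))        # 40
print(max_achievable_free_pct(60) >= 70)  # False
```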

    How does Azure File Sync deal with this scenario? Can multiple tiering-enabled server endpoints coexist on the same volume, or should each volume be limited to a single cloud tiering policy?

    Thanks in advance for any help!

    Tuesday, November 7, 2017 6:31 PM

Answers

  • Hi tFable,

    Thank you for your question... apologies for the inconvenience here - our documentation is still a work in progress :)

    To answer your question: if you place multiple server endpoints on the same volume, the one with the highest volume free space percentage will "win" for the entire volume. In your example, 70% would be the effective volume free space percentage for both server endpoints. For simplicity, I would personally recommend setting the volume free space percentage on both server endpoints to the same number, so you can look at either endpoint and get the correct information, but there's no harm in having them different - the highest one wins.
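    As a sketch of that rule (illustrative Python, not actual Azure File Sync code):

```python
# Sketch of the rule described above (illustrative only): the effective
# volume free space target is the highest policy among the tiering-enabled
# server endpoints sharing the volume.
def effective_free_space_policy(endpoint_policies):
    return max(endpoint_policies)

# Two endpoints on one volume, with 40% and 70% policies:
print(effective_free_space_policy([40, 70]))  # 70 -- applies volume-wide
```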

    To answer your other question, we are working very hard to increase the 5 TiB limit and hope to have information to share soon - stay tuned :)

    Thanks,

    Will Gries, Program Manager, Azure Files

    Tuesday, November 7, 2017 6:51 PM
  • Hi Tony,

    No, the 70% will apply to all the server endpoints on the volume. This means that we'll tier the coolest files, as defined by our heat map, across both server endpoints on the volume.

    On your second question, it's hard to say exactly how long it would take to re-tier, because that depends on the other files on your volume. Items become hotter in our heat map based on accesses and writes, and cooler if they haven't been touched in a while. For data that is mostly never touched, say if you're backing up old documents there, one access of a file could be enough to make that file the hottest on the volume. For data that is extremely active, one access of a file won't change its relative heat much, and it will be re-tiered shortly. I actually don't know off the top of my head how long it would take to be re-tiered once we decide it's colder than the threshold... but I would be happy to pose the question to the Dev team if you're interested.
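    A toy model of that heat-based selection (illustrative only; the real heat map weighs access and write patterns rather than a single last-touched timestamp, and these file names, sizes, and timestamps are made up):

```python
# Toy model of heat-based tiering (illustrative only): tier the least
# recently touched files until the free-space target for the volume is met.
def pick_files_to_tier(files, volume_size, free_target_pct):
    """files: list of (name, size, last_touched) tuples spanning every
    server endpoint on the volume. Returns the names chosen for tiering."""
    used = sum(size for _, size, _ in files)
    target_used = volume_size * (100 - free_target_pct) / 100
    to_tier = []
    # Coolest first: smallest last_touched timestamp.
    for name, size, _ in sorted(files, key=lambda f: f[2]):
        if used <= target_used:
            break
        to_tier.append(name)
        used -= size
    return to_tier

# Hypothetical files: sizes in percent of volume, timestamps ordered
# oldest-to-newest, drawn from two server endpoints on the same volume.
files = [("old.docx", 30, 1), ("cold.pdf", 20, 2), ("hot.xlsx", 25, 3)]
print(pick_files_to_tier(files, volume_size=100, free_target_pct=70))
# -> ['old.docx', 'cold.pdf']
```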

    Hope this helps,

    Will

    • Marked as answer by tFable Tuesday, November 7, 2017 8:21 PM
    Tuesday, November 7, 2017 7:45 PM

All replies

  • Hi, Will,

    Thanks for the speedy response. I read through all of your comments in the Azure File Sync documentation and couldn't find an answer to this one. It's great to be able to get an answer this quickly!

    Two follow-up questions:

    In my example, does it mean that the endpoint with the 70% policy will tier all of its files if the other endpoint's data size is larger than 30% of the entire volume's available space?

    Now let's assume that the disk is at 50% free space. If the entire data set for the endpoint with the 70% policy is tiered and I try to open a 200 MB file, the file will be downloaded and I'll be able to see it. At this point, the endpoint only has visibility over its own "heat map" for tiering (I assume), and it will see that every other file (except the one I just downloaded) has already been tiered.

    How long will the endpoint wait before it re-tiers the 200MB file I just opened, read/edited and then closed?

    Thanks again!
    Tony


    • Edited by tFable Tuesday, November 7, 2017 7:03 PM two, not one, question
    Tuesday, November 7, 2017 7:01 PM
  • Thanks, Will!

    Now that I know the "heat map" is based on the entire data set on the volume, across all endpoints, this makes much more sense. I think that is an important point that I'm sure lots of folks will want to know.

    I think some general guidance on the tiering mechanism would make a lot of enterprise users more comfortable using the technology. My pilot use case is relatively simple, but if this solution is going to be deployed in larger and more complex environments, techs will surely want to know more details about how tiering works before they can deploy it in production. I know I'd like to know more about it before I can propose this solution for the rest of our production file shares.

    Thanks again for the guidance.

    Regards,

    Tony

    Tuesday, November 7, 2017 8:21 PM