Can a partition of a Windows Azure table be stored on only one storage node?

  • Question

  • "3.1.1 Scalability of the table

    The storage system achieves good scalability by distributing the partitions across many storage nodes.

    The system monitors the usage patterns of the partitions, and automatically balances these partitions across all the storage nodes.  This allows the system and your application to scale to meet the traffic needs of your table.  That is, if there is a lot of traffic to some of your partitions, the system will automatically spread them out to many storage nodes, so that the traffic load will be spread across many servers. However, a partition i.e. all entities with same partition key, will be served by a single node.   Even so, the amount of data stored within a partition is not limited by the storage capacity of one storage node."

    I extracted two sentences from the paragraph above, which comes from the document "Windows Azure Table - May 2009.docx" (http://go.microsoft.com/fwlink/?LinkId=153401). They seem contradictory.

    "However, a partition i.e. all entities with same partition key, will be served by a single node."
    This sentence seems to say that a partition can be stored on only one storage node.

    "Even so, the amount of data stored within a partition is not limited by the storage capacity of one storage node"
    But this sentence seems to say that a partition can be stored on more than one storage node.

    The two sentences seem to contradict each other. How can that be explained?



    • Edited by Jacky_shen Monday, August 13, 2012 3:14 PM
    Monday, August 13, 2012 2:43 PM

Answers

    1. Yes. A partition is served by a single server. The data is stored across (potentially) many servers.
    2. Each server can potentially serve multiple partitions. If you have six partitions A, B, C, D, E, and F, it could be that Server 1 would serve A, B, and C, and Server 2 would serve D, E, and F. Those groups of partitions (A-C and D-F) are called "partition ranges," literally a range of partitions. At any time, table storage can decide to split those ranges based on load. If partition F is being accessed a lot, a third server may be brought in. Now Server 1 serves A-C, Server 2 serves D-E, and Server 3 serves F. This is what it means to split a partition range.
    3. Partitions themselves cannot be split.
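The range split described in point 2 can be sketched in a few lines of Python. This is only a toy model, not how table storage is actually implemented: the `RangeMap` class and its method names are made up for illustration, and partition keys are assumed to sort as plain strings.

```python
from bisect import bisect_right


class RangeMap:
    """Toy model (hypothetical): maps contiguous partition-key ranges to servers."""

    def __init__(self, first_server):
        # starts[i] is the lowest partition key served by servers[i].
        # The empty string sorts before every other key.
        self.starts = [""]
        self.servers = [first_server]

    def server_for(self, partition_key):
        # Find the last range whose start is <= partition_key.
        i = bisect_right(self.starts, partition_key) - 1
        return self.servers[i]

    def split(self, at_key, new_server):
        """Split the range containing at_key; keys >= at_key in it move to new_server."""
        if at_key in self.starts:
            raise ValueError("a range already starts at this key")
        i = bisect_right(self.starts, at_key)
        self.starts.insert(i, at_key)
        self.servers.insert(i, new_server)


ranges = RangeMap("Server 1")
ranges.split("D", "Server 2")  # Server 1 serves A-C, Server 2 serves D-F
before = {k: ranges.server_for(k) for k in "ABCDEF"}

ranges.split("F", "Server 3")  # partition F gets hot, so the D-F range is split
after = {k: ranges.server_for(k) for k in "ABCDEF"}
```

Note that `split` only ever cuts between partition keys. A single key always maps to exactly one server, which matches point 3.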

    To answer your question about scale, a single partition will always be served by a single server, but because the data is spread out, it's not the number (or size) of entities that limits scale. It's the number of transactions: how often an entity from that partition is read or written. "Windows Azure Storage Abstractions and Their Scalability Targets" has a lot of detail about the scale limits, but essentially a single partition can handle 500 operations per second. It could be a partition with just one entity or a partition with a billion entities. No matter what, the limit is how many operations you're doing per second, not the size.
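As a rough illustration of that last point, the check below counts operations per partition per second and flags any partition that goes over the 500 operations per second quoted above. The function name and the `(second, partition_key)` input shape are assumptions for the sketch; nothing here calls the storage service.

```python
from collections import Counter

PARTITION_OPS_PER_SEC = 500  # figure quoted in the answer above


def hot_partitions(ops, limit=PARTITION_OPS_PER_SEC):
    """Return partition keys that exceed `limit` operations in any one second.

    `ops` is an iterable of (second, partition_key) pairs, one per operation.
    This input shape is an assumption made for the sketch.
    """
    per_second = Counter(ops)
    return sorted({pk for (_, pk), n in per_second.items() if n > limit})


# Partition A: 600 operations in second 0 (over the limit).
# Partition B: 100 operations in second 0 and 450 in second 1 (under it each second).
ops = [(0, "A")] * 600 + [(0, "B")] * 100 + [(1, "B")] * 450
too_hot = hot_partitions(ops)
```

The number of entities in A or B never enters the calculation. Only the operation rate per partition does.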

    • Marked as answer by Jacky_shen Wednesday, August 15, 2012 5:03 AM
    Wednesday, August 15, 2012 4:33 AM

All replies

  • Normally the data is stored in a DFS (distributed file system), so the data of a single partition can be scattered across the DFS. But all of that data is served by only one server.

    A server can also handle many partitions if the partitions are small. This is called range partitioning.

    For example, say you have three partitions A, B, and C, each containing 10000 records. All three partitions may well be served by a single partition server. If the load on A increases, say to 100000000000, then partition A will be served by a different partition server.

    But no partition can be served by more than one server.




    • Marked as answer by Jacky_shen Tuesday, August 14, 2012 1:27 AM
    • Unmarked as answer by Jacky_shen Tuesday, August 14, 2012 1:19 PM
    • Edited by The New Emperor Friday, May 16, 2014 2:58 PM Removed name
    Monday, August 13, 2012 3:34 PM
  • All data in a partition is served by a single node, but it's stored on many nodes. As an analogy, think of it like a web server sitting in front of a cluster of databases. All the data is being served by the web server, but it's being stored across many servers.
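To make the serving versus storing distinction concrete, here is a small Python sketch. It is an assumption-laden toy: `StorageNode`, `PartitionServer`, and the row-count "capacity" are invented for illustration, and the real system uses a distributed file system with replication rather than anything this simple.

```python
class StorageNode:
    """Hypothetical storage node that can hold at most `capacity` rows."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = {}

    def has_room(self):
        return len(self.rows) < self.capacity


class PartitionServer:
    """Single front door for one partition; it keeps an index, not the data."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.index = {}  # row key -> StorageNode that holds the row

    def write(self, row_key, value):
        node = self.index.get(row_key)
        if node is None:
            # Place new rows on the first node with free space.
            # (Raises StopIteration if every node is full; fine for a sketch.)
            node = next(n for n in self.nodes if n.has_room())
        node.rows[row_key] = value
        self.index[row_key] = node

    def read(self, row_key):
        return self.index[row_key].rows[row_key]


nodes = [StorageNode(capacity=2) for _ in range(3)]
server = PartitionServer(nodes)
for i in range(5):
    server.write(f"row{i}", i * 10)

rows_per_node = [len(n.rows) for n in nodes]
```

One server answers every read and write for the partition, yet the partition holds five rows while no single node can hold more than two.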

    Monday, August 13, 2012 6:38 PM
  • Thank you so much.
    • Edited by Jacky_shen Tuesday, August 14, 2012 1:27 AM
    Tuesday, August 14, 2012 1:27 AM
  • "However, a partition i.e. all entities with same partition key, will be served by a single node.  Even so, the amount of data stored within a partition is not limited by the storage capacity of one storage node"

    I still have some points I would like clarified.

    1. "a partition i.e. all entities with same partition key, will be served by a single node"

    That is to say, if all entities in a partition have the same partition key, that partition has to be served by a single node, but if the entities in a partition have different partition keys, they can be stored on more than one storage node, right?

    For example, suppose partition A has two different partition keys (range partition), partition key 1 and partition key 2. A split operation can then move partition key 1 and partition key 2 to different partition servers: partition key 1 is split off into partition B, and partition key 2 into partition C. So one partition A becomes two partitions, B and C, right?

    • Edited by Jacky_shen Tuesday, August 14, 2012 3:37 PM
    Tuesday, August 14, 2012 1:18 PM
  • "Suppose partition A [has] two different partition keys..." No, that can't happen. If two entities have different partition keys, they're in different partitions.

    "a partition" == "all the entities in a table that have the same partition key"
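In code terms, that definition is just a group-by on the partition key. A minimal sketch, assuming entities are plain dictionaries with `PartitionKey` and `RowKey` fields (the sample data and the function name are made up):

```python
from collections import defaultdict


def partitions_of(entities):
    """Group row keys by partition key: one group per partition."""
    groups = defaultdict(list)
    for entity in entities:
        groups[entity["PartitionKey"]].append(entity["RowKey"])
    return dict(groups)


entities = [
    {"PartitionKey": "1", "RowKey": "a"},
    {"PartitionKey": "2", "RowKey": "b"},
    {"PartitionKey": "1", "RowKey": "c"},
]
partitions = partitions_of(entities)
```

Two entities with different partition keys always land in different groups, so "partition A with two partition keys" cannot exist.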

    Tuesday, August 14, 2012 6:59 PM
  • Hello Steve,

    Thank you so much.

    Maybe I confused a partition (all entities with the same partition key) with a range partition.

    I still have some questions I need your help with.

    1. So a partition (all entities with the same partition key) is served by only a single partition server, but its data is not limited by the storage capacity of one storage node, right?

    2. A range partition can be split, and after the split operation it can be served by more than one partition server, right?

    3. Does a partition (all entities with the same partition key) support the split operation?

    If not, how does such a partition scale out? Suppose the partition has 10000000 million rows.

    As far as I know, a range partition supports the split operation, and through splitting it can be spread across many partition servers, right?

    I am sorry if my questions are childish; I have only been learning Windows Azure for a few days.
    Thank you in advance.

    • Edited by Jacky_shen Wednesday, August 15, 2012 3:33 AM
    Wednesday, August 15, 2012 3:05 AM
  • Hello Steve,

    After your explanation, the partitioning of Windows Azure tables is much clearer to me. Thank you for your thorough help.

    BTW, I really enjoy your Windows Azure videos on Channel 9. Thank you for sharing them with us.

    • Edited by Jacky_shen Wednesday, August 15, 2012 5:03 AM
    Wednesday, August 15, 2012 5:03 AM
  • I'm glad to have helped!
    Wednesday, August 15, 2012 5:04 AM