Azure storage and data provenance

  • Question

  • I want to design a solution that stores configuration parameters and data in Azure storage. I also want to track changes made to this data and its versions. Does Azure provide support for maintaining such provenance information? Which would be the better option: blob files or tables?

     

    Monday, June 6, 2011 5:52 PM

Answers

  • If your configuration data is name-value pairs, Windows Azure Tables is likely the best solution for you. There isn't any built-in provenance tracking in Windows Azure Storage. You could persist the versioning and change tracking info with your configuration data.

    One thing to consider is that each entity in Table Storage can have up to 252 custom properties (255 including the system properties PartitionKey, RowKey, and Timestamp) and can be up to 1 MB in size. If your configuration data is larger than that, you may need to consider BLOBs or partitioning your configuration data to fit Tables.
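
    A minimal sketch of persisting version and change-tracking information alongside the configuration data, under stated assumptions. It uses plain dictionaries rather than any storage client library, and the names make_versioned_entity, fits_in_table_entity, ChangedBy, and ChangedAt are hypothetical. Each version is a separate entity, so older versions stay readable. The size check is only a rough JSON-based estimate, not the way the service measures entity size.

```python
import json

MAX_CUSTOM_PROPERTIES = 252        # per entity, excluding the system properties
MAX_ENTITY_BYTES = 1024 * 1024     # 1 MB per entity
SYSTEM_PROPERTIES = ("PartitionKey", "RowKey", "Timestamp")


def make_versioned_entity(config_name, version, values, changed_by, changed_at):
    """Build one entity (a plain dict) holding one version of one config set."""
    entity = {
        "PartitionKey": config_name,
        # Keys compare as strings, so zero-padding keeps versions in order.
        "RowKey": f"{version:010d}",
        # Hypothetical change-tracking properties stored with the data.
        "ChangedBy": changed_by,
        "ChangedAt": changed_at,
    }
    entity.update(values)
    return entity


def fits_in_table_entity(entity):
    """Rough check against the property-count and size limits (an estimate)."""
    custom = [name for name in entity if name not in SYSTEM_PROPERTIES]
    approx_bytes = len(json.dumps(entity).encode("utf-8"))
    return len(custom) <= MAX_CUSTOM_PROPERTIES and approx_bytes <= MAX_ENTITY_BYTES


v1 = make_versioned_entity("web", 1, {"Timeout": 30}, "alice", "2011-06-07T03:14:00Z")
v2 = make_versioned_entity("web", 2, {"Timeout": 60}, "bob", "2011-06-08T20:28:00Z")
print(v1["RowKey"], v2["RowKey"], fits_in_table_entity(v2))
```

    A configuration set that fails the check is a candidate for a blob or for splitting across several entities.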


    Eric D. Boyd - Director, Chicago + Cloud at Centare 
    Blog | twitter
    • Proposed as answer by Eric D. Boyd Tuesday, June 7, 2011 3:14 AM
    • Marked as answer by Wenchao Zeng Monday, June 13, 2011 2:14 AM
    Tuesday, June 7, 2011 3:14 AM

All replies

  • Thanks for the reply.

    I will incorporate versioning and change tracking info with config data.

    The configuration has more dimensions than simple name-value pairs. It is also nested and hierarchical, which makes it hard to fit into Azure tables, because Azure Table storage does not support queries that span multiple tables. This is where I am most confused about what to use as a workaround.

    Also, I had noticed the 255-property limit. I was wondering whether there is a better way to organize this: instead of Table1 with one column per property, I could keep column1=attribute_name and column2=attribute_value, and then column3..n=other dimensions that make each configuration parameter unique. The issue I ran into here was choosing the partition key and row key. The documentation allows these to be just plain strings, and I can only think of a composite of more than two attributes that would make each row unique. So I am assuming the uniqueness property must again be handled explicitly in code. Is that assumption correct?

    Thanks!


    Mridu Baldevraj Narang
    Tuesday, June 7, 2011 3:57 AM
  • Mridu,

    The combination of PartitionKey and RowKey uniquely identifies an entity. You should determine your PartitionKey strategy carefully. Your goal is to choose a partitioning scheme that keeps data that should live together, together, while balancing that goal with the ability to distribute your data across multiple storage nodes for load distribution.

    The RowKey identifies the row within the specified partition, and the two keys together are unique. It is perfectly acceptable, and simple, to build either key as a composite of multiple attributes.

    If you try to create an entity with an existing PartitionKey and RowKey, you will get a conflict (409 HTTP Status Code) response.
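
    The uniqueness rule can be illustrated with a small in-memory stand-in. This is only a sketch: InMemoryTable and EntityConflict are hypothetical names, nothing here calls the storage service, and the status_code attribute merely mirrors the 409 response described above.

```python
class EntityConflict(Exception):
    """Raised when an entity with the same keys already exists."""
    status_code = 409  # mirrors the HTTP conflict status described above


class InMemoryTable:
    """Toy stand-in for a table: entities are keyed by (PartitionKey, RowKey)."""

    def __init__(self):
        self._entities = {}

    def insert(self, entity):
        key = (entity["PartitionKey"], entity["RowKey"])
        if key in self._entities:
            # Same PartitionKey and RowKey: reject instead of overwriting.
            raise EntityConflict(f"entity {key} already exists")
        self._entities[key] = dict(entity)

    def get(self, partition_key, row_key):
        return self._entities[(partition_key, row_key)]


table = InMemoryTable()
table.insert({"PartitionKey": "web", "RowKey": "Timeout", "Value": "30"})
try:
    table.insert({"PartitionKey": "web", "RowKey": "Timeout", "Value": "60"})
except EntityConflict as exc:
    print("conflict:", exc.status_code)
```

    The point for the design is that the keys alone decide uniqueness, so any attribute that distinguishes two configuration parameters has to be part of one of the two keys.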

    Eric


    Eric D. Boyd - Director, Chicago + Cloud at Centare 
    Blog | twitter
    • Proposed as answer by Eric D. Boyd Tuesday, June 7, 2011 5:44 PM
    Tuesday, June 7, 2011 5:44 PM
  • Thanks Eric. Could you point me to any good links that show how to build composite keys from multiple attributes for either of the keys?

     

    Wednesday, June 8, 2011 6:40 PM
  • Mridu,

    You have a couple of options. You can concatenate the fields and set the PartitionKey and/or RowKey to the concatenated value.

    or

    You could create a property that does that concatenation based on existing properties, and then use the DataServiceKey attribute on your class to use this property.
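
    Both options can be sketched in a few lines, under stated assumptions. The "|" delimiter, the function names composite_key, flatten, and to_entities, and the ConfigParameter class are all hypothetical and not part of any Azure library. The sketch rejects the characters that are not allowed in key values (/, \, # and ?) and also uses the path through a nested configuration as the RowKey, which is one possible way to keep hierarchical data in a single table.

```python
KEY_DELIMITER = "|"              # assumed delimiter; any allowed character works
FORBIDDEN_IN_KEYS = set("/\\#?")  # characters not permitted in key values


def composite_key(*parts):
    """Option 1: concatenate several attributes into a single key string."""
    cleaned = []
    for part in parts:
        text = str(part)
        if KEY_DELIMITER in text or FORBIDDEN_IN_KEYS & set(text):
            raise ValueError(f"unsupported character in key part: {text!r}")
        cleaned.append(text)
    return KEY_DELIMITER.join(cleaned)


def flatten(config, prefix=()):
    """Yield (path, value) pairs for every leaf of a nested dict."""
    for name, value in config.items():
        path = prefix + (name,)
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def to_entities(application, environment, config):
    """One entity per leaf; the nested path becomes the RowKey."""
    partition_key = composite_key(application, environment)
    return [
        {"PartitionKey": partition_key, "RowKey": composite_key(*path), "Value": str(value)}
        for path, value in flatten(config)
    ]


class ConfigParameter:
    """Option 2: a property that derives the key from existing attributes."""

    def __init__(self, section, name, value):
        self.section, self.name, self.value = section, name, value

    @property
    def row_key(self):
        return composite_key(self.section, self.name)


for entity in to_entities("shop", "prod", {"db": {"host": "h1", "pool": {"size": 5}}}):
    print(entity)
print(ConfigParameter("db", "host", "h1").row_key)
```

    Because every parameter of one application and environment shares a PartitionKey in this layout, they can be read back with a single-partition query. Whether that grouping suits the load is a design choice the sketch does not settle.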

    Eric


    Eric D. Boyd - Director, Chicago + Cloud at Centare 
    Blog | twitter
    • Proposed as answer by Eric D. Boyd Wednesday, June 8, 2011 8:28 PM
    Wednesday, June 8, 2011 8:28 PM