Azure Table Storage for logs very slow

  • Question

  • Hello

    I'm using Azure Table Storage for my Worker Role logs, with a 1-minute transfer schedule.  So far I have about 30k records stored.

    Queries against the WADLogsTable are ever so slow.  Using Azure Storage Explorer, with a filter such as

    Timestamp > datetime'2011-11-25T09:30:00Z'
    the retrieval time for a few hundred records is typically about 40 seconds (best time 15 seconds, worst over a minute).  Wireshark traces show that only a few seconds are spent transferring the data from the cloud to my local Azure Storage Explorer, depending on the number of rows returned.  The rest of the time is presumably spent filtering the table to choose the records to return.

    Is this typical?  Is there any way to speed this up?

    With kindest regards, Chris Lowe.

    Friday, November 25, 2011 2:11 PM

All replies

  • Hi,

    Please note that data in a Windows Azure table is currently indexed only on the PartitionKey and RowKey attributes. Querying on any other attribute (as in your case) results in a full table scan and is therefore slower. Luckily, for diagnostics logs the PartitionKey value actually represents a date/time value in ticks. So essentially here is what you need to do:

    1. For the date/time you're interested in, calculate the ticks (make sure the date/time value is converted to UTC first).

    2. Prepend a "0" in front of that.

    3. Use that value for query e.g. PartitionKey ge '0[datetimevalueinticks]'.

    This would be much faster and more efficient than querying on the Timestamp attribute.
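
    The three steps above can be sketched in a few lines of Python. This is only an illustration, not part of any Azure tool: the function name wad_partition_key is made up here, and the sketch assumes "ticks" means .NET ticks (100-nanosecond intervals counted from 0001-01-01T00:00:00).

```python
from datetime import datetime, timezone

# Assumption: "ticks" are .NET ticks, i.e. 100 ns intervals since 0001-01-01T00:00:00.
DOTNET_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000


def wad_partition_key(moment: datetime) -> str:
    """Hypothetical helper: return "0" followed by the UTC tick count of `moment`."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime so it can be converted to UTC")
    # Step 1: convert to UTC and count ticks using integer arithmetic only,
    # which avoids float rounding on an 18-digit number.
    delta = moment.astimezone(timezone.utc) - DOTNET_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    ticks = whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10
    # Step 2: prepend "0".
    return "0" + str(ticks)


# Step 3: use the value in the query filter.
key = wad_partition_key(datetime(2011, 11, 25, 9, 30, tzinfo=timezone.utc))
print(f"PartitionKey ge '{key}'")
```

    For 2011-11-25 16:58 UTC the same function gives 0634578370800000000.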

    Hope this helps.

    Thanks

    Gaurav Mantri

    Cerebrata Software

    http://www.cerebrata.com

     

    Friday, November 25, 2011 4:07 PM
  • Where are you noticing the delay? From personal experience, if there are a lot of records to display, Azure Storage Explorer takes longer to render them than it does to download them.
    Friday, November 25, 2011 5:23 PM
  • Thanks for the question.  The delay is between the time Azure Storage Explorer sends the request and the time the data starts to come back.  This is supported by Wireshark traces.  I know ASE can take a few seconds to render the stuff, too.

    Is it possible to introduce additional indexes on the tables?

    Friday, November 25, 2011 5:32 PM
  • This is very helpful.  Thank you.

    Is it possible to introduce additional indexes on the table storage to improve the retrieval time?

    Friday, November 25, 2011 5:33 PM
  • And may I say that the solution you've outlined works very well.  The time taken is now dominated by data transfer and rendering, which isn't Azure's problem.

    This query: PartitionKey >= '0634578370800000000'  started to return values within 3 seconds.

    Do you know if there's a way of calculating the ticks from a datetime on the query line within Azure Storage Explorer?  E.g. some sort of query like

    PartitionKey >= '0' + AsString(TicksFromDatetime(datetime'2011-11-25T09:30:00Z'))

    Cheers,

    Chris.

    Friday, November 25, 2011 5:52 PM
  • Hi,

    I am not sure if ASE would let you do that. You can probably write a small application for that. [With due apologies for shameless product plug :)], you may want to look at the product we have developed for visualizing Azure Diagnostics data: Azure Diagnostics Manager. You can find more information about it on our website at http://www.cerebrata.com/Products/AzureDiagnosticsManager.
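
    As a rough idea of what such a small application could look like, here is a Python sketch that turns ISO-style UTC timestamps into a filter string that can be pasted into a query box. The helper names ticks_key and partition_filter are invented for this example, and it assumes .NET ticks (100 ns units since 0001-01-01) and timestamps written exactly like 2011-11-25T09:30:00Z.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumption: partition keys are "0" + .NET ticks (100 ns units since 0001-01-01 UTC).
DOTNET_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)


def ticks_key(iso_utc: str) -> str:
    """Hypothetical helper: '2011-11-25T09:30:00Z' -> '0' + tick count."""
    moment = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    delta = moment - DOTNET_EPOCH
    ticks = (delta.days * 86_400 + delta.seconds) * 10_000_000
    return f"0{ticks}"


def partition_filter(start_iso: str, end_iso: Optional[str] = None) -> str:
    """Build a PartitionKey filter, optionally bounded above by end_iso."""
    clause = f"PartitionKey ge '{ticks_key(start_iso)}'"
    if end_iso is not None:
        # Keys are compared as strings; that matches time order because
        # all keys for present-day dates have the same length (19 characters).
        clause += f" and PartitionKey lt '{ticks_key(end_iso)}'"
    return clause


print(partition_filter("2011-11-25T09:30:00Z"))
print(partition_filter("2011-11-25T09:30:00Z", "2011-11-25T10:30:00Z"))
```

    Adding the upper bound keeps the query from scanning every partition newer than the start time when only a short window of logs is needed.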

    Hope this helps.

     

    Thanks

     

    Gaurav

    Saturday, November 26, 2011 2:02 AM