none
How to re-engineer OLAP access to a modern data mart?

    Question

  • I am reengineering a crude data mart that was created in SQL Server 7.  (It consists of tables based directly on mainframe extracts, with no data cleaning or conforming, etc.)  The new system is starting to take shape in SQL Sever 2008 R2, but I am under pressure to make like-for-like data available immediately, which means the complete data modelling and cube building will have to wait. My problem is:  How should I expose the data to the users?

    Currently, people are using Access to link to views on the tables in the old (SQL Server 7) database. This creates a recurring PITA problem with the scheduled database refreshes:  If the extracts are late coming out of the mainframe, people have already connected to the database by the time the ETL tries to run, and it fails because of table locks. I suppose the same situation will arise in the new data mart if I again grant user access to views on the tables -- unless I modify the ETL to allow new versions of the tables to be built while the clients are still pointing to the old ones.

    Should I simply deny direct access to these tables, and instead provide stored procedures for use by Access and Excel clients?  (I haven't tried this before, and I'm worried about provoking a backlash if it doesn't go smoothly).

    
    

    Whatever I do, it may be necessary to create a messaging solution that informs users of which data they are using (i.e. last week versus this week)  while the tables are getting refreshed (or the extracts are late). Any tips?

    Monday, July 29, 2013 12:26 AM

Answers

  • If you truncate or perform other DDL then you will block the readers for the duration of the transaction.  If you truncate without a transaction, the users will be able to query immediately after the truncate completes.

    Building a data mart that is available for queries during load is more difficult, and typically requires using somewhat more expensive load mechanism.  For instance instead of truncating and loading your target table, truncate and load a staging table.  Then either use TSQL MERGE or ALTER TABLE SWITCH to update the target table.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Thursday, August 01, 2013 12:20 PM

All replies

  • Rule #1: don't give users direct access to the data. Ever. :)

    What you might try is to create a snapshot of this database and let users use this snapshot to report on.

    Typical Uses of Database Snapshots


    MCSE SQL Server 2012 - Please mark posts as answered where appropriate.

    Monday, July 29, 2013 8:23 AM
  • Thanks -- I certainly agree with Rule #1, but how can I provide access without worrying about table locks? Eventually, the snapshot has to be refreshed, and if somebody is looking at the table using a Microsoft Access database they will lock it on me.
    Monday, July 29, 2013 9:41 PM
  • Don't use Access? I'm not really an expert in locking issues, but read locks shouldn't mess up your entire database.

    MCSE SQL Server 2012 - Please mark posts as answered where appropriate.

    Tuesday, July 30, 2013 6:31 AM
  • "Don't use Access" should probably be Rule #2, but we're stuck with it until we can give the Business a viable alternative. 
    Wednesday, July 31, 2013 2:14 AM
  • "Don't use Access" should probably be Rule #2, but we're stuck with it until we can give the Business a viable alternative. 

    Can't they report through Excel?

    You can connect to SQL Server with Excel (maybe use views on top of the actual tables) and use PivotTables and PivotCharts.


    MCSE SQL Server 2012 - Please mark posts as answered where appropriate.

    • Marked as answer by AllenN Wednesday, July 31, 2013 10:00 PM
    • Unmarked as answer by AllenN Wednesday, July 31, 2013 10:08 PM
    Wednesday, July 31, 2013 8:41 AM
  • I've been thinking about Excel, actually. I'll try to get them to explain how they are using the data and see if Excel will work for them. (Some only use Access because they think Excel still has a 65k row limit.)
    Wednesday, July 31, 2013 10:00 PM
  • To handle loading data into a database while users are connected and running queries, you can use READ COMMITTED SNAPSHOT isolation.  This will prevent report users from taking shared locks on the data to run reports.  They will need a "schema stability lock" (Sch-S) lock for the duration of each query, but that only prevents running ALTER TABLE, so dropping tables, renaming tables, performing partition switching.  INSERT, UPDATE, DELETE, and bulk insert that respects triggers and constraints can run concurrently.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Wednesday, July 31, 2013 10:05 PM
  • Thanks, David.

    One of the 1st things my ETL does is truncate the target table so that it can be rebuilt from scratch, so the users should not be able to query it at all until the rebuild is complete. I'll keep thinking about how I might use the snapshot isolation, though.

    
    Thursday, August 01, 2013 5:41 AM
  • If you truncate or perform other DDL then you will block the readers for the duration of the transaction.  If you truncate without a transaction, the users will be able to query immediately after the truncate completes.

    Building a data mart that is available for queries during load is more difficult, and typically requires using somewhat more expensive load mechanism.  For instance instead of truncating and loading your target table, truncate and load a staging table.  Then either use TSQL MERGE or ALTER TABLE SWITCH to update the target table.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Thursday, August 01, 2013 12:20 PM
  • Truncate really shouldn't have much impact on locking, as it is nearly instant.
    But indeed, partition swithcing (enterprise feature) is a good way to quickly populate your destination table (also near instant).

    MCSE SQL Server 2012 - Please mark posts as answered where appropriate.

    Thursday, August 01, 2013 12:22 PM
  • I'm also thinking about forcing the Access users to obtain recordsets via stored procedures rather than by linking to views or tables. Then, I could put some smarts in the proc to test if the table is currently being refreshed and point them to a backup if true (preferably with some kind of notification so they know they are looking at last week's data).

    Thursday, August 01, 2013 10:29 PM