none
What would be the best way to perform an ETL that performs a pivot in near real time RRS feed

  • Question

  • Hi,

    I have an Event table  that creates events in near real time with large amount of records :

    event_type

    time

    A

    01/12/2018 15:39

    B

    01/12/2018 10:43

    C

    02/12/2018 16:05

    A

    02/12/2018 16:39

    It needs to be transformed to: Fact_events

    date

    hour

    event_type_A_count

    event_type_B_count-

    event_type_count

    01/12/2018

    15:00

    1

    0

    0

    01/12/2018

    10:00

    0

    1

    0

    02/12/2018

    16:00

    1

    0

    1

    The ETL that performs the transformation runs every hour and collects the count of an hour back

    What would be the most efficient way to load the Fact_events ?

    One query that performs the pivoting ? - If yes what would be the query

    Intermediate steps that loads a staging table, if yes what would be the queries for that ?

    Thanks,

    Dani

    Friday, June 14, 2019 2:34 PM

Answers

  • You can use SSIS pivot transformation

    https://radacad.com/pivot-transformation-ssis-2012-complete-tutorial

    BTW based on the logic you posted above it seems that you may need to insert or update the Fact table, am I right?

    Does the fact table have an index on date and hour columns?

    create table #t (etype char(1) , t datetime)

    insert into #t values ('A','20180112 15:36')
    insert into #t values ('B','20180112 10:40')
    insert into #t values ('C','20180212 16:05')
    insert into #t values ('A','20180212 16:39')


    select convert(varchar(15),t,103),
    convert(varchar(2),t,114)+':00',
    count(case when etype='A' then 1  end ) e_A_count,
    count(case when etype='B' then 1  end ) e_B_count,
    count(case when etype='C' then 1 end ) e_C_count
     from #t
     group by convert(varchar(15),t,103),
    convert(varchar(2),t,114)+':00'



    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    • Marked as answer by danisroyi2 Sunday, June 16, 2019 7:01 AM
    Sunday, June 16, 2019 5:38 AM
    Answerer
  • Thanks,

    If I understand, you are suggesting to perform a pivot query directly from the Event table to the fact table.

    Assuming the Event table is large - Would it be more efficient to load an increment of one hour to a Staging table and as an additional step to perform the pivot for the staging table ?

    • Marked as answer by danisroyi2 Sunday, June 16, 2019 7:01 AM
    Sunday, June 16, 2019 6:54 AM

All replies