none
MDX help! Group by measure value ranges

    Question

  • I have a salesman dimension and sales fact table and some data as below:
    salesman     sales_amount
    -----------------------------
    Mike         $200
    Phil         $5
    John         $280
    Jessica      $550

    I'd like MDX query result as below:

    sales_range     Count
    -------------------------
    <$100           1
    $100-$500       2
    >$500           1

    The sales_range (<$100, $100-$500 and >$500) are fixed and hard-coded and never needed to be changed. And I can do dimension/fact table or cube changes for this if needed.

    Please help! How to write MDX query for this?

    Any input will be highly appreciated! Thanks!

    -- Please read my updated question below in this thread.

    Thursday, May 21, 2009 1:29 PM

Answers

  • Aha, total per customer, not individual record in fact? Then you're right, it requires MDX solution, not the one I recommended.

    Take a look at Richard's MDX. It's in short the idea in this case. The rest it requires is having calculated members Count1 and Count2 in columns. Here's the full MDX:

    WITH
      MEMBER Time.Count1 AS Aggregate(time1 : time2)
      MEMBER Time.Count2 AS Aggregate(time3 : time4)
      MEMBER Measures.[<$100] AS Count(Filter(NonEmpty(Customers.Members,
                                                                                     (Time.CurrentMember, Measures.Sales)
                                                                                    ), Measures.Sales < 100.0))
      MEMBER Measures.[$100-$500] AS Count(Filter(NonEmpty(Customers.Members,
                                                                                           (Time.CurrentMember, Measures.Sales)
                                                                                          ), Measures.Sales >= 100.0 AND Measures.Sales <= 500.0))
      MEMBER Measures.[>$500] AS Count(Filter(NonEmpty(Customers.Members,
                                                                                     (Time.CurrentMember, Measures.Sales)
                                                                                    ), Measures.Sales > 500.0))
    SELECT
      { Time.Count1, Time.Count2 } on 0,
      { Measures.[<$100], Measures.[$100-$500],  Measures.[>$500] } on 1
    FROM Cube

    Again, I used abbreviations, you provide full names.

    Hope this is better.

    Oh, and by the way, if I'm wrong, simply say so. Don't worry, I'll continue helping you if I know how.


    Tomislav Piasevoli
    Business Intelligence Specialist
    www.softpro.hr

    Monday, May 25, 2009 10:24 PM
    Answerer

All replies

  • One way to solve this is to introduce a new field in your fact table/view, SalesRangeID, that you'll fill with values 1, 2 or 3 depending on a sales_amount field value.

    case when sales_amount < 100.0 then 1 when sales_amount > 500.0 then 3 else 2 end

    Additionally, make a view or named query that will have 3 rows of data:

    select 1 AS SalesRangeID, '<$100' as SalesRangeName
      union all
    select 2 AS SalesRangeID, '$100-$500' as SalesRangeName
      union all
    select 3 AS SalesRangeID, '>$500' as SalesRangeName

    Make a dimension from that view/named query that will have only one attribute. Use ID for key, use Name for name. Order it by key, not the name.

    Link that dimension to fact on the same fields (SalesRangeID).

    Create additional measure group and link to same dimensions. Make a new measure on SalesmanID field from your fact table in that new measure group and use Distinct Count aggregation. That measure should give you count you need, count of distinct customers. All you need to do is drag this new dimension and new measure in pivot.



    You can also do something with MDX.

    If you're willing to loosen your requirements, you can make 3 calculated measures: Sales range <$100, Sales range $100-$500 and Sales range >$500. But you won't have the same layout you draw above as your result. It will be like this:

    Sales range <$100 | Sales range $100-$500 | Sales range >$500
               1                                  2                              1

    3 measures and their values (representing the count of distinct customers) below. If it is ok, MDX can be arranged. If not, too bad.

    The problem is that you have measure Count on columns and something on rows and that something is supposed to be a dimension, real one or utility one. In either case I think you'll have to create a new dimension for that.

    There is an option of defining those 3 ranges as 3 calculated members of existing Customer dimension, but I don't know if it would be an user friendly solution. Will your users have problems with that? Too confusing? I believe so. That why I suggest the first approach. It costs some aggregation, but it should be the fastest solution (in returning data, not in implementing it, yet it's worth a try).

    How about it?


    Tomislav Piasevoli
    Business Intelligence Specialist
    www.softpro.hr
    Thursday, May 21, 2009 9:10 PM
    Answerer
  • Anything is possible in MDX.  To get the three ranges out in one query it could be as simple as

    with member measures.[< 100] as

                    count(filter([Client Host].[Client Geography].[City],[Measures].[Total KBytes]<= 1000))

    member measures.[1000 - 10000] as

                    count(filter([Client Host].[Client Geography].[City],[Measures].[Total KBytes]> 1000 and [Measures].[Total KBytes]<= 10000))

    member measures.[> 10000] as

                    count(filter([Client Host].[Client Geography].[City],[Measures].[Total KBytes]> 10000))

    select

                    {measures.[< 100],

                     measures.[1000 - 10000],

                     measures.[> 10000]}

                     on 0

    from easternmining

    for more Analysis Services info see http://RichardLees.blogspot.com


    Richard
    Saturday, May 23, 2009 10:44 AM
  • Thank you so much Tomislav and Richarl!! (I am also grateful for Tomislav's great answer for my another posting - MDX help! Max then Sum on different sets)

    Sorry that I have simplified my question in my first posting. Please let me explain my question more details as follows:

    I have a salesman dimension and sales fact table and some data as below:

    salesman     sales_amount     Sales_date
    ------------------------------------------
    Mike         $200             2006-07-01
    Mike         $250             2006-07-02
    Phil          $5                 2006-08-25
    Phil          $50               2006-08-26
    John         $280             2006-08-26
    John         $300             2006-08-27
    Jessica      $550             2006-09-01
    Jessica      $600             2006-09-02

    Mike         $10               2006-09-09
    Mike         $100             2006-09-10
    Phil          $200             2006-09-22
    Phil          $200             2006-09-23
    John         $180             2006-09-23
    John         $100             2006-09-24
    Jessica      $50              2006-09-24
    Jessica      $70              2006-09-28

    The query parameters contains start_date1, end_date1, start_date2, end_date2.
    If the user inputs

       start_date1 = 2006-06-23,
       end_date1 = 2006-09-08,
       start_date2 = 2006-09-09,
       end_date2 = 2006-10-03

    Then I'd like the MDX query result as below:

    sales_total_range     Count1     Count2
    -----------------------------------------------
    <$100                 1          0
    $100-$500           1          4
    >$500                 2          0

    where, sales_total_range means the total sale amounts for a salesman for the selected time period:

    count1 for 2006-06-23 -> 2006-09-08 for <$100 is 1 :
       Phil ($5 + $50)
    count1 for 2006-06-23 -> 2006-09-08 for $100-$500 is 1 :
       Mike ($200 + $250)
    count1 for 2006-06-23 -> 2006-09-08 for >$500 is 2 : 
       John ($280 + $300)
       Jessica ($550 + 600)

    count2 for 2006-09-09 -> 2006-10-03 for <$100 is 0.
    count2 for 2006-09-09 -> 2006-10-03 for $100-$500 is 4 : 
       Mike ($10 + $100)
       Phil ($200 + $200)
       John ($180 + $100)
       Jessica ($50 + 70)
    count2 for 2006-09-09 -> 2006-10-03 for >$500 is 0.

    The reason why I need the format of the MDX query result as above is because I have to pass the dataset to a ComponentArt SqlChart report with two series (one is start_date1 -> end_date1 and the other is start_date2 and end_date2).

    I have written some script like below:

    member [measures].[< 100] as 
      count(filter(( STRTOMEMBER(@FromTimePKDate, CONSTRAINED) : STRTOMEMBER(@ToTimePKDate, CONSTRAINED) ),
                   SUM([Measures].[Sales_Amount]) < 100 )) 

    but the result doesn't fit the required format.

    I can make any needed changes on cube/dimension/fact table design and the dimension definition doesn't need much user friendly because the user is report designer. However, I cannot add SaleRangeID as suggested because the sale ranges are sum/total values based on the dynamic selected time period.

    Again, thank you so much Tomislav and Richarl for your great comments! I look forward to reading your more comments about my above question! Please Help!

    Monday, May 25, 2009 3:18 PM

  • "I can make any needed changes on cube/dimension/fact table design and the dimension definition doesn't need much user friendly because the user is report designer. However, I cannot add SaleRangeID as suggested because the sale ranges are sum/total values based on the dynamic selected time period."

    No, they're not. Sales ranges are fixed and there are only 3 of them. Values they represent always depend on current context and might be anything from zero to N, where N is the total number of your customers. So, values are dynamic, not ranges themselves.

    Regardless of that, I still think the best way to solve this is in your DW. You see, I like MDX very much, but efficient solutions even more. Solving this in DW is more efficient.

    In general, having a simple MDX is an indicator of good DW design. Having a complex MDX might be a good sign to reconsider your DW design. Or to use different approach, even technology sometimes.

    Your sales ranges are granularity based. Each transaction in your fact can be described by one of the three sales ranges. Therefore, introduction of this new dimension is a natural way to go here. You'll solve your problem in the most elegant way. And the DistinctCount measure set on CustomerID field should take care of complexity since it will give you proper numbers for any intersection you make in your pivot/report.

    The MDX in that case would be this:

    WITH
      MEMBER Measures.Count1 AS Aggregate(time1 : time2, Measures.CountOfCustomers)
      MEMBER Measures.Count2 AS Aggregate(time3 : time4, Measures.CountOfCustomers)
    SELECT
      { Measures.Count1, Measures.Count2 } on 0,
      { SalesRange.SalesRange.SalesRange.MEMBERS } on 1
    FROM Cube

    Because of the two ranges, you have to make 2 calculated measures, but they are simple in nature, as you can see.

    I used abbreviations, you should provide full member names.

    Am I right?


    Tomislav Piasevoli
    Business Intelligence Specialist
    www.softpro.hr
    Monday, May 25, 2009 7:46 PM
    Answerer
  • Thank you so much again, Tomislav for your great comment and quick reply! I really appreciate your help!

    If I understand correctly about your solution, then a sales_range dimension table will be added into the cube and a sales_range_id column will be added into the sales fact table as foreign key, and the MDX query result will be

    sales_range          count
    --------------------------------
    <$100                  2 (sales_id: 3, 6 - distinct salesmen: Phil, John)
    $100-$500            3 (sales_id: 1, 2, 4, 5 - distinct salesmen: Mike, Phil, John)
    >$500                  0

    , if given the data as follows:

    sales_id   salesman        sales_amount
    -------------------------------------
    1             Mike               $200
    2             Mike               $400
    3             Phil                $50
    4             Phil                $480
    5             John               $280
    6             John               $50

    and the fixed sales ranges are <$100, $100-$500 and >$500.

    However, the expected MDX query result has to like below:

    sales_TOTAL_range          count
    -------------------------------------------
    <$100                              0
    $100-$500                        1 (John : $100 <= ($280 [sales_id:5] + $50 [sales_id:6]) <= $500)
    >$500                              2 (Mike: ($200 [sales_id:1] + $400 [sales_id:2]) > $500, Phil: ($50 [sales_id:3] + $480 [sales_id:4]) > $500)

    Again, thank you so much Tomislav for your further detailed comment!! Could you please look at my above question and help me further? Thank you!!
    Monday, May 25, 2009 8:43 PM
  • Aha, total per customer, not individual record in fact? Then you're right, it requires MDX solution, not the one I recommended.

    Take a look at Richard's MDX. It's in short the idea in this case. The rest it requires is having calculated members Count1 and Count2 in columns. Here's the full MDX:

    WITH
      MEMBER Time.Count1 AS Aggregate(time1 : time2)
      MEMBER Time.Count2 AS Aggregate(time3 : time4)
      MEMBER Measures.[<$100] AS Count(Filter(NonEmpty(Customers.Members,
                                                                                     (Time.CurrentMember, Measures.Sales)
                                                                                    ), Measures.Sales < 100.0))
      MEMBER Measures.[$100-$500] AS Count(Filter(NonEmpty(Customers.Members,
                                                                                           (Time.CurrentMember, Measures.Sales)
                                                                                          ), Measures.Sales >= 100.0 AND Measures.Sales <= 500.0))
      MEMBER Measures.[>$500] AS Count(Filter(NonEmpty(Customers.Members,
                                                                                     (Time.CurrentMember, Measures.Sales)
                                                                                    ), Measures.Sales > 500.0))
    SELECT
      { Time.Count1, Time.Count2 } on 0,
      { Measures.[<$100], Measures.[$100-$500],  Measures.[>$500] } on 1
    FROM Cube

    Again, I used abbreviations, you provide full names.

    Hope this is better.

    Oh, and by the way, if I'm wrong, simply say so. Don't worry, I'll continue helping you if I know how.


    Tomislav Piasevoli
    Business Intelligence Specialist
    www.softpro.hr

    Monday, May 25, 2009 10:24 PM
    Answerer
  • Thank you SO much for the solution, Tomislav! :-) I REALLY appreciate your great and continued help, which I can always count on!! :D :D

    Tuesday, May 26, 2009 4:26 PM