none
SQL User Defined Functions for performing statistical calculations

    Question

  • Hi!

     

       I hope you can help.  I just wasn’t sure where to go with this question, so I’m hoping you can at least point me in the right direction.

     

       I’m writing a SQL Server stored procedure that returns information for a facility-wide scorecard-type report.  The row and columns are going to be displayed in a SQL Server Reporting Services report. 

     

       Each row of information contains “Current Month” and “Previous Month” numbers and a variance column.  Some rows may compare percentages, others whole numbers, others ratios, depending on the metric they’re measuring.  For each row/metric the company has specified whether they want to see a t-test or a chi-squared statistical test to determine whether or not there was a statistically significant difference between the current month and the previous month. 

     

       My question is this:  Do you know where I can find a set of already-written user defined functions to perform statistical calculations beyond the basic ones provided in SQL Server 2005?  I’m not using Analysis Services, so what I’m looking for are real SQL User Defined Functions where I can just pass my data to the function and have it return the result within a stored procedure. 

     

       I’m aware that there may be some third-party statistical packages out there we could purchase, but that’s not what I’m looking for.   And I’m not able to do anything like call Excel’s analysis pack functions from within my stored procedure.   I’ve asked.   They won’t let me do that.   I just need to perform the calculation within the stored procedure and return the result.

     

       Any suggestions?  Is there a site where people are posting their SQL Server UDF’s to perform statistical functions?  Or are you perhaps aware of something like a free add-in for SQL that will add statistical functions to those available in SQL?   I just don’t want to have to write my own t-test function or my own chi-squared function if someone has already done it.

     

     Thanks for your help in advance!  Oh, and please let me know if this should have been posted in the TSQL forum instead.  I wasn't entirely sure.

     

    Karen Grube

    Thursday, May 01, 2008 11:06 PM

Answers

All replies

  • T-SQL may not be fast enough for scientific and advanced statistics applications.  It is an interpreted language.

    This package has 45 advanced statistical functions.  You may be able to plug them into T-SQL:

    http://www.sharewareconnection.com/math-mechanixs.htm



    Kalman Toth SQL SERVER 2012 & BI TRAINING
    New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012



    • Edited by Kalman Toth Friday, September 28, 2012 6:24 AM
    Friday, May 02, 2008 2:14 AM
  •  

    Thanks, but I realy couldn't see how these functions could be called from within a stored procedure, and I don't see the t-test function.

     

    Any other suggestions?  I'm only processing a small set of numbers at any one time, so it shouldn't be too slow.

     

    Thanks,

    Karen

    Friday, May 02, 2008 5:04 AM
  • Karen,

    I am not aware of any T-SQL library for statistical functions.

    Perhaps you can start building your own. Naturally, libraries built in C++ can be hooked into SQL Server as CLR-s.

    The following link contains among other statistical calculations, MEDIAN calculation.

    http://www.sqlusa.com/bestpractices2005/median/


    Kalman Toth SQL SERVER 2012 & BI TRAINING
    New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012


    • Edited by Kalman Toth Friday, September 28, 2012 6:22 AM
    Friday, May 02, 2008 10:41 AM
  • Thanks!

     

      Although there are a few limited statistical calulations available on this site, I can't locate the t-test or chi-squared calculations I'm looking for in particular.

     

       Might there be some way I can perform these calculations perhaps by calling on some functionality of Analysis Services through a stored procedure?  My theory is that I really shouldn't have to recreate the wheel here, that someone has already done this.   Gosh, I can do this in Excel!  Why can't I do it in SQL?  I'm asking that seriously, not rhetorically.

     

       Any help would be greatly appreciated.

     

    Thanks!
    Karen

    Monday, May 05, 2008 8:37 PM
  • Karen,

    Have you made any progress towards answering this question? I am seeking methods (best practices) for calculating a variety of statistical functions (including r squared and t score) in native t-sql.

    So far, I haven't found any techniques or solutions, asides from manually programming the formulas themselves in sql stored procedures, calling external libraries like R, S+ or Matlab, or creating CLR stored procedures using C# or VB.net.

    I'm interested in any solution(s) you may have found.

    Boon Hogganbeck
    Wednesday, January 21, 2009 5:00 PM
  • Karen/Boon,

    Have either of you made any progress on finding SQL user defined functions? I found out that SAS has an addon for SQL server 2005. But you have to purchase it...Any pointers?

    Thursday, March 04, 2010 8:55 PM
  • STATS_T_TEST_

    docs.oracle.com/cd/B19306_01/server.102/b14200/functions157.htm 

    • STATS_T_TEST_ONE: A one-sample t-test

    • STATS_T_TEST_PAIRED: A two-sample, paired t-test (also known as a crossed t-test)

    • STATS_T_TEST_INDEP: A t-test of two independent groups with the same variance (pooled variances)

    • STATS_T_TEST_INDEPU: A t-test of two independent groups with unequal variance (unpooled variances)

    Friday, January 10, 2014 2:54 PM