Tuesday, March 29, 2011 1:48 PM
I'm going through the sample excel provided with Data Mining addin for Excel.
I created a new data structure from "Source Data". I added new data model, Decision trees (does not really matter which, neither works) to predict BikeBuyer. I click Classification Matrix against the internal test data, and the results show that it does not predict "Yes" for *ANY* row.
What am I doing wrong?
- Edited by algkep Tuesday, March 29, 2011 3:37 PM *clarified*
Tuesday, March 29, 2011 3:51 PM
A notice: if I create a "neural network" model on "training data" sheet, and run classification it shows that it predicted "Yes" for *ZERO* rows.
Now, if I delete all but first 1000 rows of data on that sheet, and repeat the procedure, it starts predicting... why is this happening?
Thursday, April 14, 2011 5:34 PMI am having the same issue with the decision tree, but I'm using my own data. Decision tree shows as the most accurate model, but there are no results where prediction = 1. Any help with this issue is much appreciated.
Friday, April 15, 2011 3:43 AM
You aren't doing anything wrong - if you look at the decision tree model, for example, you will see that none of the leaf nodes will have a probability of yes greater than 50%. The algorithms predict only the most probable state, and it is turning out that "Yes" is never greater than 50% for any row. A more correct approach would be to use a profit chart to find a probability threshold for considering "Yes" and then creating a query that looks like this (assuming a 30% threshold)
SELECT (PredictProbability([BikeBuyer],'Yes') > 0.3) AS IsBikeBuyer FROM MyModel PREDICTION JOIN ....
This type of query will return TRUE/FALSE based on the probability threshold that makes sense for your data.
Predixion Software, Inc.
Follow on Twitter
- Proposed As Answer by koles Wednesday, May 09, 2012 7:58 AM
Friday, April 15, 2011 6:52 AMbut isn't the sample excel included supposed to show you the models working? not a single of algorithm predicts anything..
Friday, April 15, 2011 9:09 AMssas mining structure implement a well known algorithm developed and tested since the 80's by A I community. If the model do not predict anything , i think it's about data or algorithm parameter . for example : If the tree is huge with many nods it's possible that you have done an over-fitting so the model transcript the data so there is generalization capacity
Thursday, April 21, 2011 2:15 PMyes, but it's a special sample file provided with data mining addins, which is supposed to showcase the algorithm, and yet it does not work. Has anyone ever successfully run the included training/querying example?
Monday, April 30, 2012 5:45 AM
and I'm back with that question - what's the use of sample file, if it doesn't sample you anything?
can anyone chime in on this?
Tuesday, May 01, 2012 9:46 AM
I've played around, and no matter which columns I use for training the models, the highest probability I ever get is something like 25%. Furthermore, when doing classification matrix, it's not possible to specify threshold, so it always, for all the test entries predicts "no".
Is it because I'm doing bad models, or what else could it be..
Wednesday, May 02, 2012 7:35 AM
If you want to predict on bike buyer, try changing the parameters. Else, try predict e.g. home owner. Watch this small video (no audio) for step by step instructions: http://youtu.be/36L0Cat5CEs
Dr. Nico Jacobs, SQL Server BI trainer @ U2U.net
- Edited by SQLWaldorf Wednesday, May 02, 2012 7:43 AM
Wednesday, May 02, 2012 12:25 PMI will try with home owner.
Wednesday, May 02, 2012 1:00 PMwhat kind of lift are you getting for bike buyer, home owner? what's the best I could expect?