Tuesday, November 13, 2012 11:33 PM
From a newbie.. :-)
A utilities retail company has information on the number of times a customer has paid their bill on time, the number of times the customer had to be sent a reminder, the number of times a payment they made bounced, etc.
The company also has historical data on the number of customers who defaulted on the last bill that was issued when they transferred to another electricity supplier.
What is the best SQL Server data mining algorithm for finding a possible correlation between the credit-related information held about a customer and the likelihood that the customer will not pay their last bill?
Wednesday, November 14, 2012 4:02 PM
In your case, you're trying to find a correlation with, and then predict, a binary (two-valued categorical) outcome:
1. Default: yes (or 1)
2. Default: no (or 0)
I would try these algorithms:
1. Microsoft Naive Bayes
- Scores each outcome using joint, marginal, and conditional probabilities, on the simplifying assumption that the input variables are independent of one another given the outcome.
2. Microsoft Logistic Regression
- A binary probability model that uses the logit function, with coefficients estimated by maximum likelihood.
3. Microsoft Decision Trees
- Builds both regression trees and classification trees; for a yes/no target like this one, a classification tree is the relevant kind.
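To give a rough picture of what the Naive Bayes option computes, here is a minimal sketch in plain Python. It is not the SQL Server implementation, and everything in it is an assumption made for illustration: the column names (reminders, bounced), the low/high bucketing, the eight toy rows, and the add-one smoothing constant alpha. The inputs are bucketed because Microsoft Naive Bayes works on discrete (or discretized) columns.

```python
from collections import Counter, defaultdict

# Hypothetical toy data (made up for illustration): bucketed payment
# history, and whether the customer defaulted on the final bill (1) or not (0).
rows = [
    ({"reminders": "low", "bounced": "no"}, 0),
    ({"reminders": "low", "bounced": "no"}, 0),
    ({"reminders": "low", "bounced": "no"}, 0),
    ({"reminders": "high", "bounced": "no"}, 0),
    ({"reminders": "low", "bounced": "yes"}, 0),
    ({"reminders": "high", "bounced": "yes"}, 1),
    ({"reminders": "high", "bounced": "yes"}, 1),
    ({"reminders": "high", "bounced": "no"}, 1),
]


def train(rows, alpha=1.0):
    """Count how often each class and each (feature value, class) pair occurs."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (feature, label) -> value -> count
    values = defaultdict(set)            # feature -> set of observed values
    for features, label in rows:
        for name, value in features.items():
            value_counts[(name, label)][value] += 1
            values[name].add(value)
    return class_counts, value_counts, values, alpha


def predict_proba(model, features):
    """Return P(label | features), treating features as independent given the label."""
    class_counts, value_counts, values, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        p = n / total  # prior P(label)
        for name, value in features.items():
            count = value_counts[(name, label)][value]
            # Smoothed conditional P(value | label); alpha avoids zero probabilities.
            p *= (count + alpha) / (n + alpha * len(values[name]))
        scores[label] = p
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


model = train(rows)
risky = predict_proba(model, {"reminders": "high", "bounced": "yes"})
safe = predict_proba(model, {"reminders": "low", "bounced": "no"})
print("P(default | many reminders, bounced payment):", round(risky[1], 3))
print("P(default | few reminders, no bounced payment):", round(safe[1], 3))
```

The point of the sketch is only that the prediction is a product of simple frequency counts, which is why this algorithm is quick to train and a reasonable first pass for exploring which inputs relate to default.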
If you're a developer, it would be very helpful to understand a little of what these algorithms are doing under the hood.
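As one example of looking under the hood, here is a minimal sketch of a logistic regression fitted by maximizing the log-likelihood with plain gradient ascent. It illustrates the general statistical model only, not how SQL Server trains its own version. The two input columns, the ten toy rows, the learning rate, and the number of passes are all assumptions chosen for illustration.

```python
import math

# Hypothetical toy data (made up for illustration):
# [reminders_sent, bounced_payments] -> defaulted on the final bill (1) or not (0).
X = [[0, 0], [1, 0], [0, 1], [2, 0], [4, 1],
     [2, 1], [3, 2], [5, 2], [6, 1], [5, 0]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def sigmoid(z):
    """Inverse of the logit function: maps a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Estimated probability of default for one customer."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit(X, y, lr=0.1, epochs=2000):
    """Gradient ascent on the average log-likelihood of the data."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = yi - predict(w, b, xi)  # gradient of the log-likelihood w.r.t. the score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj + lr * g / n for wj, g in zip(w, grad_w)]
        b += lr * grad_b / n
    return w, b


w, b = fit(X, y)
print("weights:", w, "intercept:", b)
print("P(default | 6 reminders, 2 bounced):", predict(w, b, [6, 2]))
print("P(default | 0 reminders, 0 bounced):", predict(w, b, [0, 0]))
```

The fitted weights can be read directly: a positive weight on a column means higher values of that column push the predicted probability of default up, which is the kind of correlation the original question is after.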
Please remember to mark the post as answered if it helped resolve the issue.