How do I set Microsoft Clustering to do full text comparisons on strings?
-
Tuesday, November 13, 2012 8:40 AM
Hi guys. I'm trying to find clusters of people and addresses in a table of companies and people.
An address varchar column may contain:
'8 oak avenue'
'Oak avenues'
How do you set the clustering to cluster those together, i.e. partial string comparison?
Does it do it automatically?
Do I need to create a full text index on every column that I want to do a partial text match on?
Thanks,
Richard.
All Replies
-
Wednesday, November 14, 2012 10:30 AM
Instead of full-text search, you can use the querying alternatives in SQL Server. For more information, see the URL below:
http://www.mssqltips.com/sqlservertip/1342/full-text-search-querying-alternatives-in-sql-server/
Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread.
-
Thursday, November 15, 2012 6:39 AM
Any updates?
-
Thursday, November 15, 2012 2:38 PM
Thanks, but I specifically want to know whether an Analysis Services Data Mining model can use partial/fuzzy matching and, if so, how to set it up (possibly through a full-text index).
I'm not interested in normal SQL statements.
-
Thursday, November 15, 2012 3:30 PM
Analysis Services doesn't do any fuzzy matching as far as I know. You need to prepare the data yourself and then present it to the model.
One possible way to cluster in this scenario is to create case and nested table structures. Since the same address can exist in multiple cities, states, and counties, you might want to create the case/nested tables as follows:
e.g.
Case table   <- PersonID
Nested table <- PersonID,
                PartialAddress,
                State,
                City
You can create this structure using the SSIS Term Extraction and Term Lookup transformations, or simply with string functions and ROW_NUMBER() OVER (PARTITION BY ...) in SQL.
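To make the data-preparation idea concrete, here is a minimal sketch in Python of splitting addresses into normalized terms so that each person gets one nested-table row per term. It is only an illustration and not the SSIS Term Extraction component: the sample rows, the function names address_terms and nested_rows, and the crude plural stripping are all assumptions made up for this example.

```python
import re

# Hypothetical sample rows (PersonID, Address); not data from the thread.
people = [
    (1, "8 oak avenue"),
    (2, "Oak avenues"),
    (3, "12 Elm Street"),
]

def address_terms(address):
    """Lower-case the address, keep alphabetic words only, and crudely
    strip a trailing 's' so 'avenues' and 'avenue' become the same term."""
    words = re.findall(r"[a-z]+", address.lower())
    return sorted({w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words})

def nested_rows(rows):
    """Return (PersonID, PartialAddress) pairs, one per distinct term."""
    return [(pid, term) for pid, addr in rows for term in address_terms(addr)]

# Case table: one row per person. Nested table: one row per person and term.
case_table = [pid for pid, _ in people]
nested_table = nested_rows(people)

for row in nested_table:
    print(row)
```

With this preparation, '8 oak avenue' and 'Oak avenues' yield the same set of terms, so a clustering model fed the nested table sees them as sharing attributes even though the raw strings differ. A real pipeline would likely need a better stemmer and a stop-word list.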
hth,
rok
Please remember to mark as answered if the post helped resolve the issue.
"Essentially, all models are wrong, but some are useful" - George E. P. Box
- Marked As Answer by RichardJECooke Saturday, November 17, 2012 10:03 PM

