Dienstag, 12. Juni 2012 10:41
I am trying to get familiar with DQS in SQL Server 2012. I am new with this technology, so I would like to clarify some points.
I am working in a project in which I have a sql server table with the following columns:
Locality, SecondaryLocality, Town, County, StreetName, SecondaryStreetName, HouseName, HouseNumber
I built my Knowledge Base based on this table and I created one Domain per column in the SQL Server Table.
I have another SQL Server Table that I am trying to match against the KB above. This table contains the following columns:
Address1, Address2, Town, County, HouseName, HouseNumber.
The problem is that for some rows Address1 can be StreetName and Address2 can be SecondaryStreetName, but in other cases Address1 and Address2 can be Locality, SecondaryLocality respectively.
In the KB, I defined 3 matching rules:
Matching House Rule:
House Name - Prerequisite
House Number - Exact - Weight 100%
Matching Address Rule:
Town - Prerequisite
Street Name - Exact - Weight 50%
Secondary Street Name - Exact - Weight 50%
Matching Locality Rule:
Town - Prerequisite
Locality- Exact - Weight 50%
Secondary Locality- Exact - Weight 50%
Matching County Rule:
County - Exact - Weight 100%
But in the matching project, I though I could join Address1, Address2 (from the SQL Table I want to match against the KB) with Matching Locality Rule and with Matching Address Rule. But I am not able to select more than one time either the Source Columns (from the SQL Server Table) or Domains (from the KB).
The point of the project is to get a SQL Sever Table or CSV file with the matching rows and another file or table with the non matching rows.
Can anyone advice me on how can I get it?
Thanks and best regards,
- Bearbeitet Joss83 Dienstag, 12. Juni 2012 10:42
Donnerstag, 14. Juni 2012 21:03
Thanks for your question about matching with DQS. I would like to understand your scenario better; you have created domains in a KB based on the columns in Table 1 and you wish to match Table 2 (Address1, Address2, Town, County, HouseName, HouseNumber) using the domains you created based on Table 1. The best practice would be to create domains in a KB based on Table 2 and create a matching policy that will fit the information in Table 2. The DQS matching system does not match your SQL table against the KB but it matches the table (e.g., Table 2) against itself using the values you have in your domains and the matching rules you define in the matching policy. Please bear in mind that if you created relations between values in your domains the accuracy of the matching will improve, for instance you could create synonyms in a 'City' domain such as NYC -> New York, namely if your data is inconsistent and contains both 'NYC' and 'New York' and you have defined these values as synonyms in your 'City' domain, the matching system will find these values as 100% match (as similarity between these values is low).
I hope I have answered your question.
| Data Quality Services