DQS is a new feature in SQL Server 2012 that provides you with a knowledge-driven data cleansing solution. For more information about DQS, see Introducing Data Quality Services. The FAQs were originally created by the following people in the DQS product team:
Note: This article is closely monitored. Any changes that you make will be evaluated and then quickly accepted, refined, or reverted. Because this is a wiki, additions or refinements to these FAQs might have been made by community members. To read the original FAQ document, click here.
Data quality represents the degree to which data is suitable for use in the required business processes. The quality of data can be defined, measured, and managed through data quality metrics such as completeness, conformity, consistency, accuracy, and duplication. Data quality is achieved through people, technology, and processes.
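As a rough illustration of how two of these metrics can be measured, the sketch below computes completeness and duplication over a small list of records. It is not DQS code: the function names, the record layout, and the sample data are all hypothetical, and the definitions used here (share of non-empty values, share of records that repeat an earlier record) are just one reasonable choice.

```python
from collections import Counter


def completeness(records, field):
    """Fraction of records whose `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if str(r.get(field) or "").strip())
    return filled / len(records)


def duplication(records, key_fields):
    """Fraction of records that repeat another record on `key_fields`."""
    if not records:
        return 0.0
    counts = Counter(tuple(r.get(f) for f in key_fields) for r in records)
    extra = sum(n - 1 for n in counts.values())  # copies beyond the first
    return extra / len(records)


# Hypothetical sample data, for illustration only.
customers = [
    {"name": "Ann Lee", "city": "Seattle"},
    {"name": "Ann Lee", "city": "Seattle"},
    {"name": "Bob Ray", "city": ""},
    {"name": "Cy Poe", "city": None},
]

completeness_city = completeness(customers, "city")
duplication_rate = duplication(customers, ["name", "city"])
```

With this sample, two of the four records have a city and one record is an exact repeat, so the two metrics come out to 0.5 and 0.25 respectively.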
DQS is a knowledge-driven solution, focusing on creation and maintenance of a Data Quality Knowledge Base (DQKB) that is reused for performing various data quality operations, such as data cleansing and matching.
The main concept behind DQS is a rapid, easy-to-deploy, and easy-to-use data quality system that can be set up and used practically in minutes.
DQS targets organizations of all sizes that seek to improve the quality of their business data. The product's functionality enables business users, information workers, and IT professionals to improve the quality of their data and manage their data quality processes and tasks.
The main objective of data stewardship is to manage the corporation's data assets in order to improve their reusability, accessibility, and quality. It is the data stewards' responsibility to approve business naming standards, develop consistent data definitions, determine data aliases and derivations, document the business rules of the corporation, monitor the quality of the data, and so forth.
DQS provides customers with capabilities that help improve the quality of their data. Data is usually generated by multiple systems and parties across organizational and geographic boundaries and often contains inaccurate, incomplete or stale data elements. The following scenarios are the data quality problems addressed by DQS in SQL Server 2012.
Delivering higher data quality in a consistent, controlled, managed, integrated, and fast manner leads to better business results. The DQS knowledge base approach enables the organization, through its data experts, to efficiently capture and refine data-quality-related knowledge in a Data Quality Knowledge Base (DQKB).
Through the interactive cleansing capabilities of DQS and its integration with Integration Services and Windows Azure Marketplace, information workers and IT professionals can collaborate and reuse this knowledge for various data quality improvements and enterprise data management processes (cleansing, matching, standardization, enrichment, and so on).
DQS is a part of the SQL Server product, and comprises a Data Quality Server and a dedicated Data Quality Client application. DQS also provides a DQS Cleansing component in Integration Services for an integrated easy-to-use cleansing experience.
DQS is a knowledge-driven solution, focusing on the creation and maintenance of a Data Quality Knowledge Base (DQKB) that can then be reused to perform various data quality operations, such as data cleansing and matching.
The main concept behind DQS is a rapid, easy-to-deploy, easy-to-use data quality product that can be set up with minimal effort. To that end, DQS focuses on creating an open environment for consuming third-party intellectual property (IP) and knowledge, which enables partners and ISVs to build DQKB content and help customers launch their data quality initiatives using DQS with less friction.
DQS is a knowledge-driven solution, and at its heart is the DQKB. A DQKB stores all the knowledge related to a specific type of data source, and is maintained by the organization’s data expert (often referred to as a data steward). For example, one DQKB can handle information on an organization’s customer database, while another can handle its employee database.
The DQKB contains data domains that relate to the data source (for example: name, city, state, zip code, ID). For each data domain, the DQKB stores all identified terms, spelling errors, validation and business rules, and reference data that can be used to perform data quality actions on the data source.
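The following conceptual sketch shows how a single domain's knowledge (known terms, spelling corrections, and a validation rule) could drive a cleansing decision. It is an illustration only, not the DQS implementation or an API: the `Domain` class, its field names, the status labels, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass
class Domain:
    """Hypothetical stand-in for one data domain in a knowledge base."""
    name: str
    valid_terms: Set[str] = field(default_factory=set)
    corrections: Dict[str, str] = field(default_factory=dict)  # error -> term
    rule: Optional[Callable[[str], bool]] = None  # validation rule, if any

    def cleanse(self, raw: str) -> Tuple[str, str]:
        """Return (value, status) for one input value."""
        value = raw.strip()
        if value in self.corrections:          # known spelling error
            return self.corrections[value], "corrected"
        if self.rule is not None and not self.rule(value):
            return value, "invalid"            # fails the validation rule
        if value in self.valid_terms:
            return value, "correct"            # already a known term
        return value, "new"                    # unknown; left for review


# Hypothetical "state" domain, for illustration only.
state = Domain(
    name="state",
    valid_terms={"WA", "OR"},
    corrections={"Wash.": "WA"},
    rule=lambda v: len(v) == 2 and v.isupper(),
)

results = [state.cleanse(v) for v in ["WA", " Wash. ", "wa", "CA"]]
```

In this sketch, values the domain has never seen are returned unchanged with the status `"new"`, which mirrors the idea that a data steward reviews them and feeds the decision back into the knowledge base.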
For detailed information about DQKB and domains, see DQS Knowledge Bases and Domains.
DQS enables a “self-service data quality experience” through a dedicated Data Quality Client application, where any data expert with virtually no database expertise can create, maintain, and run data-quality operations, with minimal setup and preparation time. Data Quality Client and the DQS Cleansing component in Integration Services enable the following capabilities in SQL Server.
A DQKB can be built by acquiring knowledge through data samples and user feedback. The DQKB is enriched through a computer-assisted knowledge discovery process, through user-generated knowledge, or through IP from third-party reference data providers.
DQS in SQL Server 2012 will not provide any public APIs.
DQS will ship as part of SQL Server 2012.
Yes, DQS is a knowledge-driven solution that is based on quality-specific knowledge bases that reside in SQL Server. The DQS knowledge base stores comprehensive quality-related knowledge in the form of data domains. These domains encapsulate the semantic representation of a specific type of data source (for example, name, city, state, zip code, ID number). For each data domain, the DQS knowledge base stores all identified terms, spelling errors, rules, and external reference data that can be used to cleanse the enterprise business data.
Building the DQKB combines advanced automatic algorithms and well-defined and streamlined processes that enable rapid knowledge acquisition, aligned with the specific enterprise data.
DQS includes defined connectivity to Windows Azure Marketplace and other third-party business services and data sets to enhance the DQKB with third-party cloud-based IP. For more information, see Reference Data Services in DQS.
Future versions of DQS might also include end-to-end (e2e) cloud solutions.
DQS is available now as part of the SQL Server 2012 RTM release. You can download SQL Server 2012 RTM from here. For detailed information about installing and configuring DQS, see the DQS Installation Guide and watch the installation video.