Fuzzy Matching - Address Cleansing

  • Question

  • Hi *,

    does anyone know if MS supports some kind of breaking strategy within Fuzzy Lookup/Grouping?

    Besides that, I'd like to perform an address cleansing operation on a CRM database. I don't have a reference table (Street, Zip, LastLine, etc.) for that. Where can I get an appropriate database? Does anyone have experience with this issue?

    Thanks a lot.
    Wednesday, October 11, 2006 5:22 PM

All replies

  • Is there no one who uses SSIS for address validation?!
    Thursday, October 12, 2006 7:39 AM
  • I don't really understand what you mean by a breaking strategy. Do you mean splitting an address out into its component parts, such as street, city and post code? If so, there is nothing in the stock components that does this. You would really want some third-party address verification software. Many of these products have their own API that can be used to integrate them into an SSIS data flow; in fact, we did this for a client recently. The key to effective address cleansing is the reference system, which is more than just a table of data. If this is a one-off or an infrequent requirement, I would just use a bureau-type service rather than anything integrated into SSIS. If it is an ongoing requirement, purchasing a product and integrating it would make sense.

    The Fuzzy Lookup could be used to try to find a match between an existing table of addresses and a source address, but unless you are only looking for existing customers or the like, you would have to purchase an address "file". That is generally what you are paying for in a third-party address product, so I think it would make sense to use the address software's API rather than the Fuzzy Lookup.
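
    To illustrate the idea of matching a source address against an existing table of addresses, here is a minimal Python sketch. It is not the algorithm SSIS Fuzzy Lookup uses; the standard-library difflib.SequenceMatcher stands in as the similarity measure, and the names normalize, best_match, reference_addresses and the 0.8 threshold are assumptions made up for this example.

```python
from difflib import SequenceMatcher


def normalize(address: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not affect the score."""
    return " ".join(address.lower().split())


def best_match(source, reference, threshold=0.8):
    """Return (reference_address, score) for the closest entry, or None if no
    entry reaches the (hypothetical) similarity threshold."""
    src = normalize(source)
    best, best_score = None, 0.0
    for candidate in reference:
        # ratio() is a similarity in [0, 1]; it stands in for a fuzzy-match score.
        score = SequenceMatcher(None, src, normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return (best, best_score) if best_score >= threshold else None


# Hypothetical reference table; a real one would come from a purchased address file.
reference_addresses = [
    "12 Main Street, Springfield",
    "48 Oak Avenue, Riverton",
    "7 Church Lane, Milford",
]

match = best_match("12 Main Stret, Springfeld", reference_addresses)
no_match = best_match("zzzz", reference_addresses)
```

    Here match points at the "12 Main Street, Springfield" entry despite the two typos, while no_match is None. The sketch can only find addresses that already exist in the reference list, which is why the quality of that list matters more than the matching step itself.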

    Thursday, October 12, 2006 9:38 AM
  • My experience is that many third-party data quality tools (e.g. Firstlogic IQ) have some very impressive features, like the breaking groups I mentioned: the source data is broken into several parts and cleansing is performed only within each part, which saves a lot of processing time.
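
    As a rough illustration of breaking groups, the Python sketch below partitions records by a break key and compares records only within each partition. It is not how Firstlogic IQ or SSIS implement this; the function find_duplicates, the field names id, zip and street, the sample records and the 0.85 threshold are all assumptions for the example, and difflib.SequenceMatcher is just a convenient stand-in for a fuzzy comparison.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def find_duplicates(records, break_key, threshold=0.85):
    """Group records by break_key and fuzzy-compare streets only inside each group.

    Returns the list of matching id pairs and the number of comparisons performed.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[break_key]].append(record)

    pairs = []
    comparisons = 0
    for members in groups.values():
        # Only records sharing the same break key are compared with each other.
        for a, b in combinations(members, 2):
            comparisons += 1
            score = SequenceMatcher(None, a["street"].lower(), b["street"].lower()).ratio()
            if score >= threshold:
                pairs.append((a["id"], b["id"]))
    return pairs, comparisons


# Hypothetical sample data for illustration only.
records = [
    {"id": 1, "zip": "10115", "street": "Hauptstrasse 5"},
    {"id": 2, "zip": "10115", "street": "Hauptstrase 5"},
    {"id": 3, "zip": "80331", "street": "Marienplatz 1"},
    {"id": 4, "zip": "80331", "street": "Sendlinger Strasse 10"},
]

pairs, comparisons = find_duplicates(records, "zip")
```

    With four records, comparing every record against every other would take six comparisons; grouping by zip reduces that to two, and records 1 and 2 are still found as a likely duplicate pair. The trade-off is that a typo in the break key itself puts a record in the wrong group, so it is never compared with its true match.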

    I think the fuzzy search capabilities of SSIS are really powerful. Other products, like the Firstlogic IQ I mentioned, don't perform very well in this respect. So I was thinking, why not use SSIS and a comprehensive postal database to take advantage of both worlds? My biggest concerns are a significant slowdown in performance and the lack of parsing features.

    Regards, S.
    Friday, October 13, 2006 11:41 AM
  • OK, let me get this straight. Has nobody ever tried to use fuzzy matching for address verification?! What do you use fuzzy matching for instead?
    Monday, October 30, 2006 9:50 PM
  • Please see the following link. Melissa Data has just released a full SSIS Data Quality suite.


    Saturday, February 21, 2009 4:09 PM