A way to check for "missing link" between multiple tables

Answered A way to check for "missing link" between multiple tables

  • Sunday, March 03, 2013 12:30 AM
     
     

    Hi All,

    I just inner joined 10 tables together and was expecting, say 1000 rows, but instead, I got only 950 rows.  I deleted the joined tables one by one, found the troubled table, and with further investigation, I found that a value that was used for the join on that table does not exist.  (Once I supplied the value and ran again, 1000 rows came up.)

    My question is, instead of removing the joined tables one by one in order to find the troubled table, can we write a SQL script to do that?


    BI Analyst

All Replies

  • Sunday, March 03, 2013 4:04 AM
    Moderator
     
     
    Have you tried LEFT JOINs?
  • Sunday, March 03, 2013 4:27 AM
     
     

    My objective is to check to see if my normalization process was successful.  I created about 10 tables using a flat table.  The flat table contains 1000 rows.  After I normalized the file (and generate 10 tables), I need to check if my process was correctly done.  One of the tables is the major table which contains some figures, keys from other tables, and an unique key.  I use this table to join the other 9 tables, but because the country table has a value that was misspelled when it was generated from the flat file, some rows were not populated.  I did use Left Joins to with "IS NULL" criteria to see which ID did not get populated, but it does not tell me which table was the cause. 


    BI Analyst

  • Sunday, March 03, 2013 10:24 AM
     
     Answered

    When you run into a situation like this, it is a bit of a needle in a haystack to find the error. And typically, each case is a little different from the previous one, why it is difficult to suggest a standard approach. If I understand your situation correctly, it is possible that

    SELECT *
    FROM   maintbl
    LEFT JOIN tbl1 ON ..
    LEFT JOIN tbl2 ON ...

    And see in which columns the NULL values turn up. But I may be misunderstanding your scenario.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    • Proposed As Answer by Satheesh Variath Sunday, March 03, 2013 1:22 PM
    • Marked As Answer by BIAnalyst Sunday, March 03, 2013 9:19 PM
    •  
  • Sunday, March 03, 2013 8:37 PM
     
     

    Hi Erland,

    Thank you for your help.  I do not know if I have explained myself correctly and that if you understood my scenario, but basically, say I joined 10 tables together and that I know it should have 1000 rows returned on the result, but instead, I am only getting 950 rows.  What is the best way to debug this using SQL script?  

    Could you finish your script assuming that I only have 3 tables - maintbl, tbl1, and tbl2?  Thank you in advance.



    BI Analyst

  • Sunday, March 03, 2013 8:42 PM
    Moderator
     
     
    Erland is suggesting to use LEFT JOIN instead of the INNER JOIN and include PK columns from each of the joined tables. This way when you get NULL for one of the PK you'll know exactly what value is missing and in what table. Of course, the above assumes that one table is the main and the rest should be joined with it.

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

  • Sunday, March 03, 2013 9:17 PM
     
     

    Hi Naomi,

    Thank you for your help.  I think I understand what Erland meant.  This suggestion was suggested by JingYoung earlier in this thread.  This will allow me to spot the table and then, by using the same method, I can find which value does not match. 



    BI Analyst

  • Monday, March 04, 2013 8:25 PM
     
     

    >> I just inner joined 10 tables together and was expecting, say 1000 rows, but instead, I got only 950 rows. <<

    In 30+ years of writing SQL, I have never done a ten table join! It sounds like your schema is a non-ormalized mess. 

    >> I deleted the joined tables one by one, found the troubled table, and with further investigation, I found that a value that was used for the join on that table does not exist. (Once I supplied the value and ran again, 1000 rows came up.) <<

    Where was the DRI constraints that would assure data integrity? Where is the DDL we need to give you answers? 

    >> My question is, instead of removing the joined tables one by one in order to find the troubled table, can we write a SQL script to do  that? <<

    This and your other postings tell us that you have no idea how to write SQL. If you were competent, the DDL would have prevented this vague “troubled table” in the first place. We would not even think of a script! 


    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

  • Monday, March 04, 2013 8:48 PM
     
     

    You are correct that I have no idea on how to write SQL, that was why I am looking for help here.  I used query design editor to construct my queries.  It works very well.  

    The reason why I needed to join 10+ tables together is to check to see if the normalization process that I did was correct.  I normalized a flat file into 10+ tables relational database.  There are one-to-many and many-to-many relationship amount the table.  The best way to make sure that if these relationships and tables were created correctly was to make sure that I can get the same records by joining the tables together.  I am happy with the result that all the records show up as in the flat file.

    Please use your 30+ years of experience to help and skip all the useless cynical comments!  These cynical comments are not helping anyone. 


    BI Analyst