Find many-to-many relationship between two columns in one query

Answered Find many-to-many relationship between two columns in one query

  • Friday, March 01, 2013 5:31 PM
     
     

    Hi All,

    How can I run one query to determine if Column1 and Column2 in the following table have a many-to-many relationship?

    Column1

    Column2

    Column3

    b

    x

    s

    b

    x

    t

    c

    x

    s

    c

    y

    s

    c

    y

    t

    Just by eyeballing it, we know there is, but in a real life situation, you cannot eyeball millions of records to make that decision.

    I am using Access 2007.  (SQL script that works with Access 2007 will be great, but script for SQL Server will be fine too.)


    BI Analyst

All Replies

  • Friday, March 01, 2013 5:50 PM
     
      Has Code
    If Exists(Select Column1 From YourTable Group By Column1 Having Count(*) > 1)
    And
    Exists(Select Column2 From YourTable Group By Column2 Having Count(*) > 1)
    Begin
      Print 'Many to Many'
    End
    Else
    Begin
      Print 'Not Many to Many'
    End;

    Tom


  • Friday, March 01, 2013 7:11 PM
     
     

    Thank you, Tom, for your prompt help.  This is the similar code that I used before, but it does not really reflect that relationship because of Column3.

    If you change the third row on Column2 to "y" or just simply delete row 3 to 5, you will not see there is no many-to-many relationship between Column1 and Column2.  But when you run the query, the result will, mistakenly, be many-to-many.

    I think one of the solutions to to generate a query by group-by column1 and column2.  Then run your query against the new query result.  It will be a subquery, but how to do that?



    BI Analyst

  • Friday, March 01, 2013 7:23 PM
     
     Answered Has Code

    Just project it on columns in question before counting

    ;with c1c2 as (select distinct Column1, Column2 from mytable) select case when Exists(Select 1 From c1c2 Group By Column1 Having Count(*) > 1)

    and Exists(Select 1 From c1c2 Group By Column2 Having Count(*) > 1)

    then  'Many to Many' else 'No' end as result




    Serg

    • Marked As Answer by BIAnalyst Friday, March 01, 2013 8:12 PM
    •  
  • Friday, March 01, 2013 7:23 PM
     
     

    What is your definition of the two columns having a many to many relationship?

    Tom

  • Friday, March 01, 2013 7:47 PM
     
      Has Code

    Hi BI Analyst,

    Take a look at this code:

    IF OBJECT_ID('tempdb..#experiment') IS NOT NULL DROP TABLE #experiment;
    CREATE TABLE #experiment (c1 CHAR(1), c2 CHAR(1), c3 CHAR(1))
    
    INSERT INTO #experiment (c1,c2,c3) VALUES ('b','x','s'),('b','x','t'),('c','x','s'),('c','y','s'),('c','y','t')
    
    SELECT c1, c2, c1+c2 AS bind, 
    COUNT(c1) OVER (PARTITION BY c1+c2) c1perbind, 
    COUNT(c2) OVER (PARTITION BY c1+c2) c2perbind, 
    COUNT(c1+c2) OVER (PARTITION BY c1) bindsperc1,
    COUNT(c1+c2) OVER (PARTITION BY c2) bindsperc2
    FROM #experiment 

    What your question asks is if there can be determined a many-to-many relationship. A many-to-many relationship is logical relation comprised of a pair of one-to-many relationships. The code above shows you which relational pairing occur in these one-to many relationships. c1perbind (or c2) >1 indicates a one to many relation from that relational pairing, whereas the bindsperc1 indicates how many relations the individual c1 belongs to.

    Now look at this code:

     SELECT  c1, c2, bind, 
    COUNT(c1) OVER (PARTITION BY c1+c2) c1perbind, 
    COUNT(c2) OVER (PARTITION BY c1+c2) c2perbind, 
    COUNT(bind) OVER (PARTITION BY c1) bindsperc1,
    COUNT(bind) OVER (PARTITION BY c2) bindsperc2
    FROM (SELECT DISTINCT c1, c2, c1+c2 AS bind FROM #experiment ) Q1
    With the subquery reducing the set for the windows to operate on to distinct pairs the bindsperC1 and bindsperC2 will show count greater than one if they belong to a one-to-many relationship. When both are greater than 1 those rows are proof that the relationship between C1 and C2 is a many-to-many.


    If you're happy and you know it vote and mark.

  • Friday, March 01, 2013 8:12 PM
     
     

    Thank you all for your help!

    Serg's code does the trick.  Thanks again, Serg.

    @Tom: What I meant was that there are cells in column1 that relate to multiple cells in column2 and vice versa.  In the table, c in Column1 is related to x and y.  x in Column2 is related to b and c.  I am sure you already know the definition, but my example threw you off because they are in columns and not in tables.  This is a flat table that needs to normalized.  

    If you just take the first two rows for example and run your code against it, you will get 2 counts for b and 2 counts for x and, therefore, the result will be "many-to-many" - this is because of Column3.  But if you take out Column3, you will get only 1 count each and the result will be a 'No'.

    @Thomas: Your code is kind of deep and working very hard.  I will need to look into it later.

    Thanks again! 


    BI Analyst

  • Saturday, March 02, 2013 7:38 PM
     
     

    >> How can I run one query to determine if Column1 and Column2 in the following table have a many-to-many relationship? <<

    Please learn  the difference between rows and records. Please follow minimal Netqiuette and post DDL instead colored pictures.  

    WITH Counts (a_cnt, b_cnt)
    AS
    (SELECT COUNT(*) OVER (PARTITION BY a) AS a_cnt,
            COUNT(*) OVER (PARTITION BY b) AS b_cnt
      FROM Foobar),

    SELECT CASE WHEN MAX (a_cnt) = 1 
                THEN 'A is unique' ELSE 'A is multiple' END
           AS A,
           CASE WHEN MAX (b_cnt) = 1 
                THEN 'B is unique' ELSE 'B is multiple' END
           AS B
      FROM Counts AS C;




    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

  • Saturday, March 02, 2013 8:37 PM
     
     

    Hi CELKO,

    Thank you for your guideline.  

    1. It was not easy to find a good definition of a row and a record.  But I find this on the Web:

    A row is a single line of your project. 
    A record is combination of one or multiple rows identifying a unique object and sharing the same first column.

    Based on that definition, I have 2 records and 5 rows in the table?

    2. I tested your code, but got the following error:

    Msg 156, Level 15, State 1, Line 7
    Incorrect syntax near the keyword 'SELECT'.

     3. Regarding the color thing, I just copied the table from Excel.  I have no intention to post color stuff on the post.


    BI Analyst