Node Traversal in a Tree

Node Traversal in a Tree

The idea of writing this article came from MSDN Transact SQL Forum's thread count total paired members of the tree by sqlquery. Later on the same user also posted another thread which is similar to this one under the same forum as find total money of parent node of tree according to a particular pattern.


Problem Description

The Problem has not been properly stated in the above thread. So for making it readable and understanding purposes I will recreate the statement.
Suppose we have a table which contains information of the nodes of a tree (can be any tree, no specifications). Table contains the member node, parent node and the information of the member being a left or right child. From this tree we have to find the nodes and their children which have two children. So using a personification for the problem statement. It could state that if you are a member then your father and you would come in result as parent and child; only when you have two children (both left and right)

Table:
IF OBJECT_ID('Tempdb..#Tree', N'U') IS NOT NULL DROP TABLE TEMPDB..#Tree
/*Table having member node, parent node and the location of child either left or right*/
CREATE TABLE #Tree(
    Member INT,
    Parent INT,
    LeftN  INT,
    RightN INT
    )
/*Inserting dummy values into #Tree*/
INSERT #Tree(
     Member,
     Parent,
     LeftN,
     RightN
     )
SELECT 1,  0,  0,  0   UNION
SELECT 2,  1,  1,  0   UNION
SELECT 3,  1,  0,  1   UNION
SELECT 4,  2,  1,  0   UNION
SELECT 5,  2,  0,  1   UNION
SELECT 6,  3,  1,  0   UNION
SELECT 7,  3,  0,  1   UNION
SELECT 8,  4,  1,  0   UNION
SELECT 9,  4,  0,  1   UNION
SELECT 10, 5,  1,  0   UNION
SELECT 11, 5,  0,  1   UNION
SELECT 12, 7,  0,  1   UNION
SELECT 13, 7,  1,  0   UNION
SELECT 14, 8,  1,  0   UNION
SELECT 15, 8,  0,  1   UNION
SELECT 16, 14, 1,  0   UNION
SELECT 17, 14, 0,  1   UNION
SELECT 18, 17, 1,  0   UNION
SELECT 19, 17, 0,  1  

So now when we try to visualize the tree we get the following structure.




Now, if you focus on any node, let's say 1 (root node), it has children 2, 3, 4, 5, 7, 8, 14, 17 which have both children. Similiarly if we see 4, it has 8, 14, 17 Thus the requirement comes to see the results as




Solution

The first thought which came to my mind is to use binary tree, so I asked the user if he could tell me the number of nodes or kind of the tree he is working in. As expected, no kind. There was no specification available. It can be any kind of tree and can be as many nodes (finite). So I thought of using dynamic SQL. Creating some variables, passing some parameters and so on. But then I decided to avoid dynamic SQL as much as possible. I read somewhere that it affects performance. Second thought which came to me; to traverse node by node and checking for possible solutions. Now this can be done either by a CTE or a while loop.

Using While

In this approach instead of just giving the result as the count of nodes, I am storing all the nodes with their respective children. So I created a table, which contains the parent node and their immediate sons. The final outcome can be seen in the image attached above.

Code below:
IF OBJECT_ID ('TEMPDB..#Interim', N'U') IS NOT NULL DROP TABLE #Interim
/*Table having member node and immediate son(s)*/
CREATE TABLE #Interim(
    Node INT,
    Son INT
    )
/*Code for finding immediate son(s)*/
INSERT #Interim(
     Node,
     Son
     )
SELECT  Node,  Member  AS  Son
FROM ( SELECT Parent AS Node
      FROM  #Tree
      WHERE  Parent<>0
      GROUP BY  Parent  HAVING  SUM (LeftN)=  SUM (RightN)) Interim
JOIN  #Tree T  ON Interim.Node=T.Parent
/*Inner table "Interim" contain nodes which have both left and right child*/


Summing up so far, we have a table named #Tree which contains the information about Nodes,Children,& Parents. Then we have a table name #Interim which contains the information about the nodes and their immediate son(s).

Now powering through....
DECLARE  @Run INT
/*Running the loop for <(maximum number of node)/2> times, since we don't know the level of depth and it can be any kind of tree*/
SET    @Run=( SELECT MAX(Node) FROM #Interim)/2
WHILE  @Run>0
BEGIN
    INSERT  #Interim
    SELECT  LeftI.Node,RightI.Son
    FROM    #Interim LeftI  JOIN  #Interim RightI  ON  LeftI.Son=RightI.Node
    WHERE    RightI.Son  IN  (  SELECT  Parent  AS  Node  FROM  #Tree
                              WHERE  Parent<>0
                              GROUP BY  Parent  HAVING   SUM(LeftN)= SUM (RightN))
    UNION
    SELECT  Node, Member AS Son
    FROM (  SELECT  Parent  AS  Node  FROM  #Tree
           WHERE  Parent<>0
           GROUP BY  Parent  HAVING   SUM (LeftN)= SUM (RightN))  Interim  JOIN  #Tree T  ON  Interim.Node=T.Parent
    WHERE    Member  IN(SELECT  Parent  AS  Node  FROM  #Tree
                       WHERE  Parent<>0
                       GROUP BY  Parent  HAVING   SUM (LeftN)= SUM (RightN))
    SET @Run=@Run-1
END

The first selection (before UNION), is used to select the nodes at depth=2 in regard to the parent node respectively. Whereas the second select statement is used to find the nodes at depth=1 having both children.
Now our Interim Table have all the values, but as we might have run the loop for more than required number of times. Thus it will have duplicates too. So for selecting the distinct required values.
SELECT DISTINCT  Node,Son  AS  SonsHavingBothChild  FROM  #Interim
WHERE Son IN ( SELECT Parent AS Node
             FROM  #Tree
             WHERE  Parent<>0
             GROUP BY  Parent  HAVING  SUM (LeftN)=  SUM (RightN))

Using Recursive CTE

In this approach instead of storing the results in a table, I am going to fetch the number of nodes which are related to the parent node as required.
So when we ask for Node 1, then the result should be 8.
Code below:
DECLARE  @Parent     INT
DECLARE  @NodeCount  TABLE  (Parent  INT  )
SET  @Parent=1 ;
WITH  CTE  AS (
    SELECT  Member,Parent  FROM  #Tree
    WHERE  Member=@Parent
    UNION  ALL
    SELECT  T.Member, T.Parent  FROM  #Tree T  JOIN  CTE C  ON  T.Parent=C.Member
)
INSERT  @NodeCount
SELECT  C.Parent  FROM  #Tree T  JOIN  CTE C  ON  C.Member=T.Member
WHERE  T.Parent<>0  AND  C.Parent<>@Parent
GROUP BY  C.Parent  HAVING  COUNT (C.Parent)>1
SELECT  COUNT (1)  AS  NodeCount  FROM @NodeCount


I hope this article would have helped those who are trying to find solutions to similar problem. If I have made any mistake please update.


See Also


Sort by: Published Date | Most Recent | Most Useful
Comments
  • The images cant be seen here so I would recommended if you could please follow this link

    <gallery.technet.microsoft.com/Particular-Node-Traversal-9ebcdf91>

  • Carsten Siemens edited Revision 2. Comment: Fixed the images urls

  • Thanks Carsten Siemens.

  • Naomi N, I can see your comment in my mail box but why not here. Beyond me!

    BTW would you please let me know where did you find the problems with the grammar. Whole time I was concentrating on technical, did not see this coming. :-)

  • There are two types of the comments. One type is on the article itself and another type is the revision comments (e.g. when you make a revision, you may tell what you did). The second type of the comments should be visible if you click on the history tab of the article.

  • himansu ji i cheked link which is provide by but can not understand

    if you provide full stored procedure for  counting  total  number of  pairs in 2:1 ratio or 1:2 ( 2 left nodes and 1 right node for a given node or 1 left node and 2 right node for a given node )ratio for first payment  and after that in 1:1 ratio means 1 left and 1 right node for a given node for all other payment.

    if new user join first from left then pair counting start from left ,if new user join first from right then pair counting start from left.

Page 1 of 1 (6 items)