A (directed) tree is one of the most common data types. But how do you use trees in LINQ? This article introduces a generic LINQ-Extension which solves the problem, gives an example and discusses alternatives.


The starting point of this article was a simple question in the LINQ forum which has been unanswered for four years...

How can I select tree nodes independently from their nesting level?

So it might be worth to have a closer look at the task. Let's start with an example. A simple TreeNode class may look like this:

public class TreeNode
    public string Id { get; set; }
    public int Data { get; set; }
    public List<TreeNode> Children { get; private set; }
    public TreeNode(string id, int data)
        : this(id, data, new TreeNode[0])
    { }
    public TreeNode(string node, int data, params TreeNode[] children)
        Id = node;
        Data = data;
        Children = new List<TreeNode>(children);

The test data are:

TreeNode rootNode =
    new TreeNode("node-0", 35,
        new TreeNode("node-1", 17,
            new TreeNode("node-2", 20)),
        new TreeNode("node-3", 19),
        new TreeNode("node-4", 5,
            new TreeNode("node-5", 25),
            new TreeNode("node-6", 40)));

↑ Return to Top


Implement a generic extension class which can traverse over arbitrary trees. The only precondition is that the node's children are stored as an IEnumerable. The selector lambda expression avoids hard-coding of the access path and makes the extension reusable.

public static class LinqTreeExtension
    public static IEnumerable<T> SelectNestedChildren<T>
        (this IEnumerable<T> source, Func<T, IEnumerable<T>> selector)
        foreach (T item in source)
            yield return item;
            foreach (T subItem in SelectNestedChildren(selector(item), selector))
                yield return subItem;

The extension method SelectNestedChildren traverses the tree depth-first, left to right.

↑ Return to Top

Usage Example

Example 1: 
Flattening the tree structure  (without the root node)

var flattenedTreeWithoutRoot
    = rootNode.Children
        .SelectNestedChildren(t => t.Children).ToList();

The resulting list enumerates the nodes with IDs "note-1" to "note-6". Please note:  The root node is not contained in the resulting list. We'll fix this in the next example.

Example 2: 
Flattening the tree structure  (including the root node)

var flattenedTreeIncludingRoot
    = new TreeNode[] {rootNode }
        .SelectNestedChildren(t => t.Children).ToList();

The query is executed on an array, which contains only the root node. The resulting list enumerates the nodes with IDs "note-0" to "note-6" as expected.

Example 3: Performing a LINQ query on all nested children

var result
    = new TreeNode[] { rootNode }
        .SelectNestedChildren(t => t.Children)
        .Where(t => t.Data > 30).ToList();

The resulting list contains two nodes with the IDs "node-0" and "node-6"

↑ Return to Top

Contradictions and alternatives

The proposed LINQ extension has significant pros:
  • The extension is applicable to all tree-structures which use an IEnumerable to hold the children.
  • It fits seamlessly into other LINQ operations.
However, in some cases you may want to consider an alternative:

Cyclic graph instead of a tree

Problem: The extension will loop forever if it runs into a cycle of the graph.
Solution: Modify the extension:
  • Add a HashSet<T> to the extension which keeps track of the already visited notes.
  • Your class T should override GetHashCode() and Equals()
  • The extensions have to skip already visited nodes.

Persisted tree in a database

Problem: If the tree is stored in a database (and not in memory), the extension produces unnecessary overhead.
Solution: All tree nodes reside in a database table which can be queried directly. The effect that they represent a logical tree or even a cyclic graph can be ignored. Using the Entity Framework the query looks like this:

var databaseResult = context.TreeNodes
    .Where(t => t.Data > 30).ToList();

However, at least two small changes should be applied to the sample class TreeNode:
  • The property Children should be changed to a virtual property to enable lazy loading.
  • The Entity Framework requires an additional parameterless constructor.

XML data structures

An XML data structure is a special case of a directed tree.

Problem: LINQ to XML has specialized on XML Documents.
Solution: Use LINQ to XML. It reduces your coding effort. You stick to a standard. You may also use other benefits from LINQ to XML. Example:

var xmlElements = XDocument.Load("myfile.xml").Descendants();
var result = xmlElements.Where(x => ...);

XDocument.Descendents() returns an IEnumerable<XElement> which can be used in a query directly.

Using trees with LINQ is simple. However, there is no "one size fits all" solution. 

↑ Return to Top


↑ Return to Top

See Also

Another important place to find a huge amount of Visual C# related articles is the TechNet Wiki itself. The best entry point is Visual C# Resources on the TechNet Wiki

↑ Return to Top

Other languages

This article is also available in other languages: