A (directed) tree is one of the most common data types. But how do you use trees in LINQ? This article introduces a generic LINQ-Extension which solves the problem, gives an example and discusses alternatives.

# Motivation

The starting point of this article was a simple question in the LINQ forum which has been unanswered for four years...

How can I select tree nodes independently from their nesting level?

So it might be worth to have a closer look at the task. Let's start with an example. A simple TreeNode class may look like this:

`public` `class` ``` TreeNode```
`{`
`    ``public` `string` ``` Id { ````get````; ````set``; }`
`    ``public` `int` ```Data { ````get````; ````set``; }`
`    ``public` ```List<TreeNode> Children { ````get````; ````private` ``` set````; }`

`    ``public` `TreeNode(``string` `id, ``int` `data)`
`        ````: ````this````(id, data, ````new` ``` TreeNode[0])```
`    ``{ }`

`    ``public` `TreeNode(``string` `node, ``int` `data, ``params` `TreeNode[] children)`
`    ``{`
`        ``Id = node;`
`        ``Data = data;`
`        ````Children = ````new` ``` List<TreeNode>(children);```
`    ``}`
`}`

The test data are:

`TreeNode rootNode =`
`    ``new` `TreeNode(``"node-0"``, 35,`
`        ``new` `TreeNode(``"node-1"``, 17,`
`            ``new` `TreeNode(``"node-2"``, 20)),`
`        ``new` `TreeNode(``"node-3"``, 19),`
`        ``new` `TreeNode(``"node-4"``, 5,`
`            ``new` `TreeNode(``"node-5"``, 25),`
`            ``new` `TreeNode(``"node-6"``, 40)));`

# Solution

Implement a generic extension class which can traverse over arbitrary trees. The only precondition is that the node's children are stored as an IEnumerable. The selector lambda expression avoids hard-coding of the access path and makes the extension reusable.

`public` `static` ``` class``` `LinqTreeExtension`
`{`
`    ``public` `static` ``` IEnumerable<T> SelectNestedChildren<T>```
`        ``(``this` `IEnumerable<T> source, Func<T, IEnumerable<T>> selector)`
`    ``{`
`        ``foreach` `(T item ``in` `source)`
`        ``{`
`            ````yield ````return` ``` item;```
`            ``foreach` ```(T subItem ````in` ``` SelectNestedChildren(selector(item), selector))```
`            ``{`
`                ````yield ````return` ``` subItem;```
`            ``}`
`        ``}`
`    ``}`
`}`

The extension method SelectNestedChildren traverses the tree depth-first, left to right.

# Usage Example

Example 1:
Flattening the tree structure  (without the root node)

```var flattenedTreeWithoutRoot ```
`    ````= rootNode.Children```
`        ````.SelectNestedChildren(t => t.Children).ToList();```

The resulting list enumerates the nodes with IDs "note-1" to "note-6". Please note:  The root node is not contained in the resulting list. We'll fix this in the next example.

Example 2:
Flattening the tree structure  (including the root node)

```var flattenedTreeIncludingRoot ```
`    ````= ````new` ``` TreeNode[] {rootNode }```
`        ````.SelectNestedChildren(t => t.Children).ToList();```

The query is executed on an array, which contains only the root node. The resulting list enumerates the nodes with IDs "note-0" to "note-6" as expected.

Example 3: Performing a LINQ query on all nested children

`var result`
`    ````= ````new` ``` TreeNode[] { rootNode }```
`        ````.SelectNestedChildren(t => t.Children)```
`        ````.Where(t => t.Data > 30).ToList();```

The resulting list contains two nodes with the IDs "node-0" and "node-6"

The proposed LINQ extension has significant pros:
• The extension is applicable to all tree-structures which use an IEnumerable to hold the children.
• It fits seamlessly into other LINQ operations.
However, in some cases you may want to consider an alternative:

## Cyclic graph instead of a tree

Problem: The extension will loop forever if it runs into a cycle of the graph.
Solution: Modify the extension:
• Add a HashSet<T> to the extension which keeps track of the already visited notes.
• Your class T should override GetHashCode() and Equals()
• The extensions have to skip already visited nodes.

## Persisted tree in a database

Problem: If the tree is stored in a database (and not in memory), the extension produces unnecessary overhead.
Solution: All tree nodes reside in a database table which can be queried directly. The effect that they represent a logical tree or even a cyclic graph can be ignored. Using the Entity Framework the query looks like this:

```var databaseResult = context.TreeNodes```
`    ````.Where(t => t.Data > 30).ToList();```

However, at least two small changes should be applied to the sample class TreeNode:
• The property Children should be changed to a virtual property to enable lazy loading.
• The Entity Framework requires an additional parameterless constructor.

## XML data structures

An XML data structure is a special case of a directed tree.

Problem: LINQ to XML has specialized on XML Documents.
Solution: Use LINQ to XML. It reduces your coding effort. You stick to a standard. You may also use other benefits from LINQ to XML. Example:

```var xmlElements = XDocument.Load(````"myfile.xml"``).Descendants();`

```var result = xmlElements.Where(x => ...);```

XDocument.Descendents() returns an IEnumerable<XElement> which can be used in a query directly.

Conclusion:
Using trees with LINQ is simple. However, there is no "one size fits all" solution.