Hierarchical Data in SQL Part 2: Recursive Queries with an Adjacency List

In the first post of this series, we saw a simple table representing a real-world hierarchy and the simplest of queries on that data, and finished with the question: “What else might we do with this data?”

Maybe we need to know the chain of responsibility leading from a particular position up to the top of the organisation.
Or, perhaps we would like to report on a subsection of the tree – for example, listing everyone working directly or indirectly below the Financial Director.
And what if we want to see what this organisational structure looks like?

For all of these, we can use recursive queries.

The ability to write recursive queries in Oracle has been available for a long time using their proprietary START WITH … CONNECT BY … syntax. Different syntax for writing recursive queries using Common Table Expressions appeared in the SQL:1999 standard, and was added to T-SQL in Microsoft SQL Server 2005. Oracle introduced support for recursive CTEs in 11g R2, and in PostgreSQL they have been supported since version 8.4.
I plan to discuss the Oracle specific syntax in a later post, but for these first posts I’ll be using CTEs.

A recursive query using a CTE consists of two parts in the CTE, plus a query to select the results from the CTE. In the CTE, the first part of the query is known as the anchor member, and is the starting point – the rows present in the first iteration. The anchor is then combined using UNION ALL to a second query, the recursive member, which references the enclosing CTE.
For the first iteration of the recursive member, the input it receives from referencing the CTE is the output of the anchor query. For each subsequent iteration, the input to the recursive member is the output from the previous iteration of the recursive member.
The result of the anchor query and the results from each iteration of the recursive query are all unioned together to form the output of the CTE.

1. Chain of Responsibility

To query the chain of responsibility leading upward from a chosen employee, the anchor query will be our employee, and the recursive part of the query will work its way up the tree towards the top of the organisation.
The addition of the columns Level and Path allow us to see the path taken by the query, and the depth of recursion that has taken place.

 

2. Subsection of the tree

To report data in the opposite direction, showing the employees who work directly or indirectly under a particular manager, we need only a slight change to the previous query.

 

3. Visualise the tree structure

This is where things can get a little bit more interesting.

By making use of ranking functions and a few Unicode characters, we can create a fairly good representation of the tree structure to help us visualise the relationships in the data.

Normally, if you are developing queries to serve data to a front-end application, you should leave the presentation of the data to the presentation layer. However, if you find yourself working directly with data, having the ability to query and present the data in a meaningful way could turn out to be very useful.

Using the same data from the previous examples and an SQL query run with the output to text in a fixed width font, here is a more graphical visualisation our hierarchical data structure:

I will show the SQL query for this in the next post.

This is part of a series of posts focusing on various aspects of working with hierarchical data in SQL. You can find a list of the other posts in the series here.

Leave a Reply