SQL “Recursion”

The SQL standard includes something called recursion in queries. This is also supported by PostgreSQL, but the documentation is a bit obfuscating.

Recursion is only possible withing a WITH statement that can then be referenced by a final SELECT statement. A WITH statement usually looks something like

WITH
 t AS (
  SELECT * FROM real_table;
 )
SELECT * FROM t

which is comparable to python’s with statement. A recursive WITH statement has the structure

WITH
 RECURSIVE t AS (
  non-recursive query
  UNION [ALL]
  recursive query
 )
SELECT * FROM t

If this were really about recursion I would start by looking for the anchor. For example in the definition of the factorial function f(n) the definition f 0 = 1 is a suitable anchor. A Haskell implementation of f(n) reads

f 0 = 1
f n = n * f (n - 1)

which can be called as f 10 to determine f(10). I am stating this for comparison with our SQL implementation.

But actually, WITH RECURSIVE is not about recursion and I have no idea how the SQL standards committee came up with that name. Instead, this is a method for iterative result build up, where t will play the role of a mutable object. That means we start with the non-recursive query and place it in the final results set. But we also make it available as t to the recursive query. The result of the recursive query is added to the final result set. But again, it is made available to another round of the recursive query. Note that only the result of the recursive query is fed into the recursive query, not the full final result set. The recursion is terminated if the recursive query returns zero rows.

To calculate f(n) we are using f(0) = 1 as starting point. We will first calculate all the pairs (f(n), n) instead of a specific f(n). Note that while t inside the recursive query only contains the output of the previously performed non-recursive query or recursive query, inside the final SELECT statement it references the final results set:

WITH RECURSIVE t AS (
    SELECT 1 AS f, 0 AS n
  UNION ALL
    SELECT f * (n + 1), n + 1 FROM t
)
SELECT f FROM t WHERE n=10 LIMIT 1

This query returns f(10). The LIMIT 1 is required since SQL cannot guess how many results with n=10 will appear in t, which would lead to an infinite loop.

Note that both implementations, Haskell and SQL, will crash for negative n in one way or the other.

A real world example for SQL recursion can be found in Carnivora where I am using it for unwinding a tree structure.

Published by

Sophie

Zweite Vorsitzende des Hemio – Verein für freie Kommunikation. Contact: sophie AT hemio.de.