Forgot password?

Recursion in MS SQL

Dmitriy Fedyashov

Sometimes, a stored procedure or function is required to use the results of a sample several times. In such cases, we often use temporary tables. However, it is worth considering some advantages and disadvantages of temporary tables.

Advantages:

  • Temporary tables are full tables. Therefore, you can create indexes and statistics for them. This can significantly speed up work with them.

Disadvantages:

  • Filling in a temporary table associated with the movement of data. Although it is a simple Insert operation, there is still a load on the disks with large amounts of data;
  • There is a risk of increased query execution time. Temporary tables are created in the tempdb database. And the load on this base is substantial.

Considering the risks of using temporary tables, the use of a generic table expression looks much more attractive.

 

Generic table expression

Common Table Expression (CTE) is an expression with a common table that can be used many times in a query. The CTE does not save data, but creates something like a temporary view. Some may say that a CTE is a subquery that precedes the main query. But this is not entirely true, because the subquery cannot be used several times, however, CTE can.

In which cases is it better to use a generic table expression?

1. To create recursive queries, with which you can get data in a hierarchical form;

2. With multiple references to the data set within the same query;

3. In order to replace views, temporary tables, table variables.

The advantages of CTE include: recursion, high speed query, concise query.

And the disadvantages can be only in the limited use. A CTE can only be used for the query to which it belongs. You cannot use it in other queries. In this case, you will have to use temporary tables or table variables.

Generic table expressions are simple and recursive.

Simple ones do not include references to theirselves, and recursive respectively include.

Recursive CTEs are used to return hierarchical data.

Consider an example of a simple CTE statement:

1
2
3
4
5
6
 WITH CTEQuery (Field1, Field2)
 AS
 (
 SELECT (Field1, Field2) FROM TABLE
 )
 SELECT * FROM CTEQuery

 Here CTEQuery is the name of the CTE;

                Field1, Field2 – field names of the request;

                Table – some table from which data is selected for use in the main query.

In this example, it is possible and not to explicitly specify the selection fields, since we select all the fields from the TestTable table:

1
2
3
4
5
6
WITH CTEQuery
 AS
 (
 SELECT * FROM Table
 )
SELECT * FROM CTEQuery

 With the help of CTE, you can optimize the main query if you take out part of the logic in the CTE. The fact is that CTE allows you to create several expressions (queries) at once. So you can split a complex query into several preliminary “views” using the CTE, and then link them in a common query: 

1
2
3
4
5
6
7
8
9
10
11
12
WITH CTEQuery1 (Field1, Field2) AS
(
 SELECT Field1 AS ID, Field2 FROM Table1
 WHERE Field2 >= 1000
),
CTEQuery2 (Field3, Field4) AS
(
 SELECT Field3 AS ID, Field4 FROM Table2
 WHERE Field4 = 'Москва'
)
 
SELECT * FROM CTEQuery1 INNER JOIN CTEQuery2 ON CTEQuery2.ID = CTEQuery1.ID

 As mentioned above, the main purpose of the CTE is recursion. A typical task for recursion is a tree traversal. So we can build a tree with the help of “with”. The recursive query structure first appeared in SQL Server 2005.

Take a look at the WITH statement: 

1
2
3
4
5
6
7
WITH RecursiveQuery AS
(
 {Anchor}
 UNION ALL
 {Joined TO RecursiveQuery}
)
SELECT * FROM RecursiveQuery

 {Anchor} - anchor, a query that defines the initial element of the tree (hierarchical list). Usually in anchor there is a WHERE clause that defines specific rows of the table.

After UNION ALL, the target table is followed from JOIN to the CTE expression.

{Joined to RecursiveQuery}- SELECT from target table. This is usually the same table used in the anchor. But in this query, it is connected to the CTE expression, forming recursion. The condition of this connection determines the parent-child relationship. It depends on whether you go to the upper levels of the tree or to the lower ones.

Let's look at a recursive query that returns a list of organizational units. Prepare data for this request:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
CREATE TABLE Department
(
ID INT,
ParentID INT,
Name VARCHAR(50)
)
 
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (1, 0, 'Finance Director')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (2, 1, 'Deputy Finance Director')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (3, 1, 'Assistance Finance Director')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (4, 3, 'Executive Bodget Office')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (5, 3, 'Comptroller')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (6, 3, 'Purchasing')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (7, 3, 'Debt Management')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (8, 3, 'Risk Management')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (9, 2, 'Public Relations')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (10, 2, 'Finance Personnel')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (11, 2, 'Finance Accounting')
INSERT INTO Department ( ID, ParentID, Name ) 
VALUES (12, 2, 'Liasion to Boards and Commissions')

 It is already clear that the structure of the divisions in the organization is hierarchical. Our task is to get a list of departments subordinate to the assistant of the financial director. If we talk in the context of a hierarchical tree, then we must find a branch and its leaves.

But first, let's see the whole list of divisions:

ID

ParentID

Name

1

0

Finance Director

2

1

Deputy Finance Director

3

1

Assistance Finance Director

4

3

Executive Bodget Office

5

3

Comptroller

6

3

Purchasing

7

3

Debt Management

8

3

Risk Management

9

2

Public Relations

10

2

Finance Personnel

11

2

Finance Accounting

12

2

Liasion to Boards and Commissions

 At the head there is the financial director, the deputy and the assistant report to him. Each of them has a group of units in its jurisdiction. The ParentID field indicates the "host" identifier. Thus, we have a ready-made master-slave connection.

Let's write a recursive query using WITH.

1
2
3
4
5
6
7
8
9
10
11
12
13
WITH RecursiveQuery (ID, ParentID, Name)
AS
(
 SELECT ID, ParentID, Name
 FROM Department dep
 WHERE dep.ID = 3
 UNION ALL
 SELECT dep.ID, dep.ParentID, dep.Name
 FROM Department dep
 JOIN RecursiveQuery rec ON dep.ParentID = rec.ID
)
SELECT ID, ParentID, Name
FROM RecursiveQuery

 In this example, the names of the fields that are to be selected in the CTE are clearly indicated. However, internal queries have the same fields. So you can simply remove this listing along with the brackets.

Inside the CTE, we have two similar queries. The first one selects the root element of the tree that we are building. The second is all subsequent subordinate elements, due to the connection with the CTE itself. "Recursion" in SQL is not really a recursion, but an iteration. You need to submit a query with a JOIN as a loop, and then everything will be immediately clear. In each iteration, we know the value of the previous sample and get the subordinate elements. In the next step, we get the subordinate elements for the previous sample. That is, each iteration is a transition down a tree, or up, depending on the communication condition.

The result of the above query is:

ID

ParentID

Name

3

1

Assistance Finance Director

4

3

Executive Bodget Office

5

3

Comptroller

6

3

Purchasing

7

3

Debt Management

8

3

Risk Management

 But what would this query look like without using CTE:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
DECLARE @Department TABLE (ID INT, ParentID INT, Name VARCHAR(50), Status INT DEFAULT 0)
-- First, we select the anchor in the table variable - the initial element from which we build the tree.
INSERT @Department
SELECT ID, ParentID, Name, 0
 FROM Department dep
 WHERE dep.ID = 3
 
DECLARE @rowsAdded INT = @@ROWCOUNT
-- We are going through a cycle until new departments are added in the previous step.
WHILE @rowsAdded > 0 
BEGIN
-- Mark entries in a table variable as ready for processing
UPDATE @Department SET Status = 1 WHERE Status = 0
-- Select child records for the previous record
INSERT @Department 
SELECT dep.ID, dep.ParentID, dep.Name, 0
 FROM Department dep
 JOIN @Department rec ON dep.ParentID = rec.ID
AND rec.Status = 1
SET @rowsAdded = @@ROWCOUNT 
 -- Mark entries found in the current step as processed 
 UPDATE @Department SET Status = 2 WHERE Status = 1 
END
SELECT * FROM @Department

 Such a cycle runs much slower than the CTE expression. And besides, it requires the creation of a table variable. And the amount of code doubled. Thus, CTE expressions are the best solution for a recursive tree traversal in MS SQL.

Similar articles:

Retour