What Is a Common Table Expression in SQL
Common Table Expressions (CTE) is a temporary result set that is returned by a single statement to be used further within the same statement. As it is a temporary result, it is not stored anywhere and does not consume disk space. However, you can reference it in the same manner as any table.
Or, we may consider a CTE as a kind of virtual table containing columns with physical records. This virtual table is created as the result of the query execution, used by another query, and deleted after that main query execution.
Since the introduction of SQL Server 2005, CTEs have become a popular method for database specialists who apply them to make complex queries easier to read and maintain: as SQL rules demand naming each CTE, the entire code becomes clearer to the reader.
Basic syntax and usage of CTEs
The CTE syntax is as follows:
WITH cte_name (column_1, column_2, column_3,...) AS (cte_query_definition) SELECT * FROM cte_name;
Such queries consist of two parts, and the first part is the CTE. Common Table Expressions are always introduced by the keyword WITH (that's why a CTE is also called an SQL WITH clause, and you can see this definition in many resources).
Thus, the parameters of the query with a CTE are:
- WITH - the introductory keyword necessary to define using the CTE. Only one WITH keyword is allowed.
- cte_name - the current Common Table Expression name.
- cte_query_definition - the actual CTE you apply.
The second part of the query syntax is a SELECT statement. You should write it right after the CTE without any columns, semicolons, or other punctuation symbols.
Now, let us delve into the practical usage of the CTEs.
Using the simple WITH clause
The simplest query format contains one CTE: the SQL Server first runs the query in the WITH clause, fetches the data, and stores it within a temporary relation. Then the main query uses that temporary relation and delivers the final result.
In the example below, we are going to use the SQL WITH clause to retrieve the list of top products by units sold.
WITH TopSoldProducts AS ( SELECT p.ProductID, p.Name AS ProductName, SUM(sod.OrderQty) AS TotalQuantitySold FROM Production.Product p JOIN Sales.SalesOrderDetail sod ON p.ProductID = sod.ProductID GROUP BY p.ProductID, p.Name ) SELECT ProductID, ProductName, TotalQuantitySold FROM TopSoldProducts;
Querying multiple CTEs in a single query
Some work scenarios require multiple CTEs to bring the result. There is no need to write several separate queries because SQL allows combining multiple CTEs in a single query with one of the SET operators, such as UNION or UNION ALL.
Note the following rules for constructing a query with multiple CTEs:
- Only one WITH keyword is allowed
- CTEs must be separated with commas
- A comma is not needed before the main query
- Only one main query is allowed
Let us apply CTEs to retrieve the information about products along with their associated subcategories and categories:
WITH ProductCategoryCTE AS ( SELECT ProductCategoryID, Name AS CategoryName FROM Production.ProductCategory ), ProductSubcategoryCTE AS ( SELECT ProductSubcategoryID, Name AS SubcategoryName, ProductCategoryID FROM Production.ProductSubcategory ), ProductDetailCTE AS ( SELECT p.ProductID, p.Name AS ProductName, p.ProductNumber, p.Color, p.ListPrice, ps.SubcategoryName, pc.CategoryName FROM Production.Product AS p LEFT JOIN ProductSubcategoryCTE AS ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID LEFT JOIN ProductCategoryCTE AS pc ON ps.ProductCategoryID = pc.ProductCategoryID ) SELECT pd.ProductID, pd.ProductName, pd.ProductNumber, pd.Color, pd.ListPrice, pd.SubcategoryName, pd.CategoryName FROM ProductDetailCTE AS pd ORDER BY pd.ProductID;
Advanced usage of CTEs
In the previous section, we examined the scenarios of referencing simple CTEs, single or multiple in one query. Let us now consider more challenging cases - recursive CTEs and nested CTEs.
Recursive CTE
A recursive CTE suggests referencing itself - the result subset returned by the WITH clause is repeatedly referenced until getting the desired result.
The recursive CTE syntax is as follows:
WITH RECURSIVE cte_name AS ( cte_query_definition (the anchor member) UNION ALL cte_query_definition (the recursive member) ) SELECT * FROM cte_name;
As you see, the basic structure of the recursive CTE syntax is similar to the standard syntax of the non-recursive CTEs. However, there are significant differences that you need to notice when building queries with recursive CTEs:
- WITH RECURSIVE is used to introduce CTE instead of just WITH
- The first query definition is called the anchor member
- The second part of the CTE that will reference itself is called the recursive member
- UNION or UNION ALL connects the anchor member with the recursive member
The most common use case for applying recursive CTEs is querying hierarchical data, such as organizational structures, menus, routes, etc.
The following example showcases a recursive CTE, which provides a hierarchical view of sales territories. This view includes details like the territory IDs, names, regions, and hierarchy levels. This kind of report is beneficial in various professional situations.
WITH RecursiveCTE AS ( SELECT TerritoryID, Name, CountryRegionCode, [Group], 1 AS Level FROM Sales.SalesTerritory WHERE TerritoryID = 1 -- Starting TerritoryID UNION ALL SELECT T.TerritoryID, T.Name, T.CountryRegionCode, T.[Group], CTE.Level + 1 AS Level FROM Sales.SalesTerritory T JOIN RecursiveCTE CTE ON T.TerritoryID = CTE.TerritoryID WHERE CTE.Level < 100 ) SELECT * FROM RecursiveCTE;
Another advanced CTE usage technique is working with nested CTEs.
Nesting CTE
Like subqueries, CTEs can also be nested, or, in plain words, we can use a CTE definition inside another CTE definition.
The syntax of the nested CTE is as follows:
WITH cte_expression_1 as ( cte_query_1 ), cte_expression_2 as ( cte_query_2 -- (in the FROM clause refers to cte_expression_1) ) SELECT FROM cte_epression_2;
In this example, cte_expression_2 references the cte_expression_1. When the CTE is defined, we can use it in the subsequent expressions. The most common use case for nested CTEs is applying aggregate functions inside one another, such as when we need to find the average minimum or maximum, define the group comparisons, etc.
The below example demonstrates using the nested query to get the report about total sales for each product within their respective product categories along with the information about the total sales for each product category.
WITH ProductSalesByCategory AS ( SELECT pc.Name AS Category, psc.Name AS Subcategory, p.ProductID, p.Name AS ProductName, SUM(sod.LineTotal) AS TotalSales FROM Sales.SalesOrderDetail sod INNER JOIN Production.Product p ON sod.ProductID = p.ProductID INNER JOIN Production.ProductSubcategory psc ON p.ProductSubcategoryID = psc.ProductSubcategoryID INNER JOIN Production.ProductCategory pc ON psc.ProductCategoryID = pc.ProductCategoryID GROUP BY pc.Name, psc.Name, p.ProductID, p.Name ), CategoryTotalSales AS ( SELECT Category, SUM(TotalSales) AS CategoryTotal FROM ProductSalesByCategory GROUP BY Category ) SELECT c.Category, c.CategoryTotal, p.ProductName, p.TotalSales FROM ProductSalesByCategory p JOIN CategoryTotalSales c ON p.Category = c.Category ORDER BY c.Category, p.ProductName;
CTEs vs. other SQL features
Discussing SQL Common Table Expressions (CTEs), it's essential to highlight other SQL features that bear resemblance to CTEs in various aspects. Specifically, these include temporary tables and subqueries. Let's examine how these features differ and determine the most fitting application for each.
CTEs vs. Temporary table
Temporary tables are genuine tables that support indexes and constraints. They remain available during the session and can be accessed by multiple users. In contrast, CTEs are transient datasets that exist solely within a specific query and are only accessible during that query's execution.
Their objectives vary as well. CTEs primarily aid SQL developers in enhancing code clarity. On the other hand, temporary tables prove beneficial when handling large datasets or when a particular result set needs multiple references.
CTEs vs. Subqueries
CTEs and SQL subqueries share many similarities, but their differences dictate their optimal use cases.
As established, CTEs are positioned at the beginning of a query and must have designated names. In contrast, subqueries are integrated inline and don't necessitate naming.
Subqueries are one-time-use within a query, while CTEs can be referenced multiple times and support recursion. It's also worth noting that subqueries are compatible with the WHERE clause using IN and EXISTS, unlike CTEs.
When to use CTEs vs. Subqueries
CTEs are optimal for:
- Enhancing query readability. While CTEs don't typically influence query performance, they assist SQL developers in comprehending the code.
- Recursive queries. CTEs are handy for querying hierarchical data since subqueries lack recursive capabilities.
Subqueries are optimal for:
- Employing IN or EXISTS in the WHERE clause. CTEs are not suitable here.
- Modifying data in a table with information from another table. Subqueries efficiently extract the required data portion and integrate it into an UPDATE statement for another table.
In various scenarios, the choice between a subquery or a CTE is at the developer's discretion since both tools can serve multiple operational contexts.
Conclusion
In the realm of SQL Server, Common Table Expressions (CTEs) stand out as a potent tool for database developers and administrators. They provide a more readable format for complex queries, serve as a building block for recursive operations, and make it easier to manage and debug code. This not only enhances the developer experience but also aids in optimizing query performance.
It's essential, however, to understand when and how to use CTEs effectively. Having proper tools, such as dbForge SQL Complete can also help the practitioners significantly by providing a comprehensive set of features for SQL coding, including context-based suggestions, code beautifying, debugging, and more to simplify all code writing tasks and ensure the highest quality of the results.