What is the SQL Server COUNT() function
In SQL Server, the COUNT() function brings the number of table rows according to the definite criteria.
Depending on the scenario, it can count all records, including duplicates, NULLs, non-NULL values, and unique records. It can return 0 if the no rows match the criteria.
The function is mostly used with the SELECT statement. The data type it returns is INT.
The COUNT() function has been supported in SQL Server since 2008, and it is also supported in Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics, Analytics Platform System (PDW), SQL Endpoint in Microsoft Fabric, and Warehouse in Microsoft Fabric.
This function is valuable in the realm of relational databases, as it performs one of the fundamental aggregation operations. It is often used to verify and ensure data integrity, check data quality, and for reporting purposes.
Syntax of the COUNT() function
As mentioned, the SQL COUNT function mostly serves as a part of the SELECT statement. The basic syntax of the query is as follows:
SELECT COUNT(aggregate_expression)
FROM tables
[WHERE condition];
In this syntax:
- aggregate_expression defines for the specific column or expression containing the values to be counted
- tables define the database tables to retrieve values from. You need to specify at least one table in the FROM clause
- WHERE condition is an optional condition that specifies criteria for the records to be selected and counted
The simplest example is counting all records in the specific table column:
SELECT COUNT (Name)
FROM Production.Product;
The result is the number of all non-NULL values in the particular column you specify in the query.
Note
The COUNT() function is often used in the following form:
COUNT(*)
It counts all rows in the table, including duplicate values and NULLs. It does not take any parameters.
Now, let us examine the usage of the COUNT function in SQL Server databases.
SQL COUNT examples and common use cases
The COUNT function is among the most commonly used aggregate functions in SQL and has a wide range of applications. To demonstrate its use cases, we'll use dbForge Studio for SQL Server. It is a powerful integrated development environment (IDE) tailored for SQL Server specialists that provides a comprehensive solution for all database-related tasks.
Count all rows in one or multiple tables
If you need the definite number of all rows of the table that may include duplicates, non-null values, and nulls, the COUNT(*) form is the solution:
SELECT COUNT(*)
FROM tableName;
Assume we need to count all available products in the stock:
SELECT COUNT (*)
FROM Production.Product;
COUNT(*) can be applied to multiple tables too. The below example demonstrates how to count the number of all employees and all departments at once:
SELECT
(SELECT COUNT(*) FROM HumanResources.Employee) AS EmployeeCount,
(SELECT COUNT(*) FROM HumanResources.Department) AS DepartmentCount;
COUNT() with ALL
COUNT(ALL expression) is the specific form used when we need to count the total number of table rows containing non-null values. Here lies the difference between COUNT(*) and COUNT ALL - the asterisk (*) sets the function to count both non-null and null values.
SELECT COUNT(ALL color)
FROM Production.Product;
COUNT() with DISTINCT
When the function is used with the DISTINCT clause, it returns the number of unique non-null values. Therefore, COUNT DISTINCT is helpful whenever you need to eliminate repetitive data in the result set.
SELECT COUNT(DISTINCT columnName)
FROM tableName;
Assume we want to count the number of products and their available colors. In this case, we can use the SQL COUNT() function with the DISTINCT clause as shown below:
SELECT
COUNT(DISTINCT name) AS ProductName,
COUNT(DISTINCT color) AS ProductColor
FROM Production.Product;
COUNT() with condition
As the purpose of this function by default is fetching the number of table rows according to some condition, COUNT() is frequently used with the WHERE clause:
SELECT COUNT(expression)
FROM table
WHERE condition;
Assume we want to know how many products in stock have prices higher than 1000:
SELECT COUNT(*) as "More Expensive Products "
FROM Production.Product
WHERE ListPrice>1000;
COUNT() with GROUP BY
Grouping allows us to get a better readable result set, and the GROUP BY clause is also common in the queries with the COUNT() function. The standard syntax of the query is as follows:
SELECT expression1, expression2, ... expression_n,
COUNT(aggregate_expression)
FROM tables
[WHERE conditions]
GROUP BY expression1, expression2, ... expression_n;
For instance, we want to find out how many orders were placed by each customer. For that, we can apply the COUNT function to fetch the list of CustomerID values along with the number of orders each customer has placed, sorted by the number of orders.
SELECT CustomerID, COUNT(*) AS NumberOfOrders
FROM Sales.SalesOrderHeader
GROUP BY CustomerID
ORDER BY NumberOfOrders DESC;
COUNT(*) with other aggregate functions
The COUNT function is often combined with other aggregate functions in SQL Server to bring more detailed and precise results for the user's needs. Assume we want to retrieve the list of product categories along with the number of products in each category and define the average price of the products within each category.
SELECT
pc.Name AS CategoryName,
COUNT(*) AS NumberOfProducts,
AVG(p.ListPrice) AS AveragePrice
FROM Production.Product p
INNER JOIN Production.ProductSubcategory psc ON p.ProductSubcategoryID = psc.ProductSubcategoryID
INNER JOIN Production.ProductCategory pc ON psc.ProductCategoryID = pc.ProductCategoryID
GROUP BY pc.Name;
The COUNT function, due to its simplicity and value, is one of the most common SQL functions used in practice. However, to apply it effectively in environments, you should master some best practices. Let us review the most efficient approaches.
Best practices for optimal performance
The best methods for optimal query performance mostly target making SQL queries faster and lighter. In that aspect, you can use the following recommendations that apply to most use cases.
Apply COUNT(*), COUNT(column_name), or COUNT DISTINCT correctly
If you only require the number of all rows, use COUNT(*).
If you need more specific numbers, like the number of non-NULL values in a specific column, use COUNT(column_name).
The COUNT(DISTINCT column_name) option can be more resource-intensive. Make sure that the column you refer to is indexed and consider applying other strategies to retrieve the same values.
Use indexes
Make sure that the column you refer to count rows is indexed, especially if your query includes the WHERE condition. Using indexes helps speed up the query performance and filtering.
Avoid COUNT() in JOINs
If you're joining tables and applying the COUNT function, make sure you really need data from all tables at once. In some cases, referring to a single table can be more efficient.
Avoid COUNT() in subqueries
The COUNT function is correlated subqueries that may cause performance issues, as that type of subquery is quite resource-consuming. If your scenario allows it, redesign the query to apply Common Table Expressions (CTE) or temporary tables to derive the counted values from them instead of subqueries.
Use partitioning for large tables
Limiting the scope is always one of the most effective methods of query optimization. If the target table is very large, but you often need to query it to count rows according to certain criteria, use the table partitioning on those criteria.
Update statistics regularly
Make sure to have the stats always updated for tables and indexes to rely on up-to-date information for making optimal decisions.
Check execution plans
Analyze the execution plans of the COUNT queries and review them periodically. If you use the Query Profiler feature of dbForge Studio for SQL Server, note that it offers the possibility to analyze queries, troubleshoot the issues, detect, and resolve diverse performance troubles.
These best practices, when applied, ensure that your use of the COUNT function in SQL Server is efficient and optimized for your specific needs.