How to Compare Data in Tables of SQL Server Databases

Comparing tables in databases is an essential, routine task for data analysts and developers focused on data validation and synchronization. It's crucial for maintaining data consistency and integrity across the system. Consequently, database specialists need efficient methods to compare two tables and identify discrepancies between them quickly. These results can be achieved using SQL queries or more conveniently through modern graphical user interface (GUI) tools. This article will examine both approaches.

Using LEFT JOIN

A common SQL method to detect differences between two tables is the LEFT JOIN. This operation retrieves all records from the left table and the corresponding records from the right table. If there is no match, the query returns a null value, indicating what the secondary table lacks.

The basic LEFT JOIN syntax is:

SELECT *
FROM table1
LEFT JOIN table2
ON table1.column_name = table2.column_name

In this article, we will use the AdventureWorks2022 database to illustrate our cases. We have created the NewDepartment table and need to compare it with the Department table to detect the discrepancies between these tables.

The query to define differences between the tables using LEFT JOIN is:

SELECT *
FROM HumanResources.Department d
LEFT JOIN HumanResources.NewDepartment td
ON d.DepartmentID = nd.DepartmentID

Let us execute this query in SQL Server Management Studio (SSMS) and view the results. The output shows which records are missing in the NewDepartment table.

Set-based operations

Set-based operations (UNION, INTERSECT, and EXCEPT) combine or differentiate the results of multiple SELECT statements. Unlike JOINs that focus on columns, these operators manipulate rows:

UNION lists records that differ between two tables. UNION ALL is used to include duplicate results in the output
INTERSECT finds common rows between two tables
EXCEPT identifies rows in the first table not present in the second

Many database specialists prefer set-based operators for defining differences between the two tables. Let us review how they work.

Using UNION ALL to compare tables

In our test cases, we focus on identifying all records that may differ between the two tables, including any duplicate records. Therefore, we apply UNION ALL, and the query syntax is:

SELECT column1, column2.... columnN
FROM
(SELECT table1.column1, table1.column2
FROM table1
UNION ALL
SELECT table2.column1, table2.column2
FROM table2) table1
GROUP BY column1
HAVING COUNT(*) = 1

Let us compare the Department and NewDepartment tables in AdventureWorks2022 using UNION ALL.

SELECT DepartmentID, Name, GroupName
FROM (
SELECT DepartmentID, Name, GroupName
FROM HumanResources.Department d
UNION ALL
SELECT DepartmentID, Name, GroupName
FROM HumanResources.NewDepartment nd
) combined
GROUP BY DepartmentID, Name, GroupName
HAVING COUNT(*) = 1

The output contains records that are unique for each table.

Using INTERSECT to compare tables

The query syntax is:

SELECT column1, column2, columnN 
FROM table1
INTERSECT
SELECT column1, column2, columnN 
FROM table2

Let us use INTERSECT to define the common records between our two tables:

SELECT * 
FROM HumanResources.Department d
INTERSECT
SELECT * 
FROM HumanResources.NewDepartment nd

The output shows the records that exist in both tables.

Using EXCEPT to compare tables

The EXCEPT operator in SQL Server is an alternative to LEFT JOIN preferred by many database specialists for its simplicity, especially when handling non-keyed tables.

The query syntax is:

SELECT column1, column2, columnN FROM table1
EXCEPT
SELECT column1, column2, columnN FROM table2

Let us apply EXCEPT to find differences between the test tables:

SELECT * FROM HumanResources.Department d
EXCEPT
SELECT * FROM HumanResources.NewDepartment nd

The output presents those records that are present in the Department table, but are missing in the NewDepartment table.

Column comparison options

In many cases, discrepancies between tables are limited to specific columns, making the previously described methods insufficient for detecting these differences, particularly when working with large tables containing vast amounts of data. Therefore, it is essential to have a programmatic option to identify such differences efficiently.

One of the options is using the query with the combination of EXCEPT and UNION ALL. The query syntax is:

SELECT column1, column2, columnN, 'Table1' AS Source
FROM table1
EXCEPT
SELECT column1, column2, columnN, 'Table1' AS Source
FROM table2

UNION ALL

SELECT column1, column2, columnN, 'Table2' AS Source
FROM table2
EXCEPT
SELECT column1, column2, columnN, 'Table2' AS Source
FROM table1

The following query compares records in the Department and NewDepartment tables and adds a new column to indicate the origin of each row.

SELECT DepartmentID, Name, GroupName, 'Department' AS SourceTable 
FROM HumanResources.Department
EXCEPT
SELECT DepartmentID, Name, GroupName, 'Department' AS SourceTable
FROM HumanResources.NewDepartment
UNION ALL
SELECT DepartmentID, Name, GroupName, 'NewDepartment' AS SourceTable
FROM HumanResources.NewDepartment
EXCEPT
SELECT DepartmentID, Name, GroupName, 'NewDepartment' AS SourceTable
FROM HumanResources.Department

This output includes rows that exist in one table but not in the other, along with the SourceTable column indicating their origin.

Detailed comparison results with the source column

Performance considerations

When comparing tables to identify matches or differences, both JOIN operations and set-based operators like UNION, INTERSECT, and EXCEPT can be used. Each approach has its own advantages and disadvantages.

JOIN operations are flexible in handling various comparisons and are efficient when working with indexes, providing precise control over comparison specifics. They are also capable of integrating multiple tables in a single query. However, they are resource-intensive when dealing with large datasets, and the queries can become complex, especially with multiple JOINs. Additionally, finding non-matching rows might be less intuitive.

Set-based operators like UNION, INTERSECT, and EXCEPT are simple to use for comparing tables and provide better readability, maintainability, and debugging. They are ideal for full table comparisons. On the downside, these operators offer less flexibility in specifying conditions and comparison criteria. They require matching column counts and compatible data types and are best suited for finding exact matches or differences but are not suitable for more complex scenarios.

Choosing the right approach

Use JOINs for detailed comparisons based on specific conditions, or when complex operations involving multiple tables are needed
Use set-based operators for straightforward queries that quickly find distinct, common, or differing rows

Performance tips

Indexing: Ensure both JOIN and set-based queries are supported by proper indexes
Query optimization: Analyze and tune queries using query execution plans to identify bottlenecks or inefficient operations
Data management: Regularly update statistics and potentially reorganize or rebuild indexes
Testing and profiling: Analyze queries to determine which method is more efficient under specific conditions

Understanding and choosing the right tool or method for table comparisons can greatly enhance your database management efficiency.

A more intuitive method for comparing table data in databases involves using dedicated tools with graphical user interfaces (GUI tools). These tools simplify the configuration of tasks through a visual approach, and they display results in a user-friendly manner. For managing large tables and databases, when comparison is integral to database development and management, an automated, GUI-based approach is particularly advantageous.

Using dbForge Data Compare

dbForge Data Compare for SQL Server is a specialized tool designed for comparing large datasets, compatible with live databases, script folders, and native SQL Server backups. It is available both as a standalone application and as an add-in for SSMS.

This tool enables users to analyze differences in SQL data, generate detailed comparison reports, and synchronize data across SQL Server databases. Convenient wizards ensure quick and accurate configuration and execution of tasks.

dbForge Data Compare finds data differences in Source and Target SQL Server database tables. The Source database contains the data that will be synchronized with the Target, ensuring that Target matches Source.

When comparison is completed, you are able to view detailed information on data differences in Data Comparison window. You can filter the results, analyze them and further perform synchronization to make the Source and Target tables identical.

Stages of data comparison

Creating connections to SQL Server servers where the needed databases are located.
Selecting Source and Target and setting up additional parameters before comparing table data.
Mapping required databases in Source and Target manually if automatic mapping cannot satisfy your specific needs.
Analyzing comparison results and selecting data to be updated, inserted, or deleted.
Obtaining synchronization script for further usage or synchronizing data in databases.

Getting data differences between two SQL Server databases

Now let's compare data in two SQL Server databases using the dbForge Data Compare wizard.

Click New Data Comparison on the Start Page to open the New Data Comparison wizard. Select the Source and Target types: you can choose a live database, a backup, or a script folder.

Set the required Source and Target: select the database type, specify the connection to SQL Server, and choose the databases to participate in this task. After that, click Next.

On the Options page, adjust the required comparison options and click Next.

Select objects that you want to compare on the Mapping wizard page. If necessary, specify key columns and the list of columns for comparison. Click Compare to start the comparison task.

Analyzing data differences between two SQL Server databases

When the data comparison task is complete, the tool produces a clear view of differences found in the databases.

View the data comparison results in the tool

You can view the detailed results of the data comparison and apply the robust options to analyze the output.

View the available options for data comparison analysis

Further, you can synchronize the data immediately from within the Data Compare window or automate this task. Read on to find out how to synchronize data in SQL Server databases.