Welcome to today’s post.
In today’s post I will explain what a pivot query is in SQL Server T-SQL and how it can be used to solve some data queries that would otherwise be more challenging to extract in application development.
In a previous post I showed how to use the OVER and PARTITION BY clauses in a SELECT SQL query to perform aggregated operations over a partitioned group of results, which are ordered by a window function (ROW(), RANK()).
Instead of partitioning and aggregating data, we will look as how to pivot columns of data in a query.
What is the SQL PIVOT Operator?
A PIVOT operator is used to transform a result that is table-valued into a column-valued result set.
What this means is that a table result like the following:
Column1 Column2 … ColumnN
Value11 Value12 … Value1N
… … … …
ValueM1 ValueM2 … ValueMN
The above Is then transformed into a column valued result like the following:
Column1 Pivot-Column-1 Pivot-Column-2 … Pivot-Column-N
Column-Value1 Value1 Value2 ValueN
… … … … …
Column-ValueM1 Value1 Value2 … ValueN
Where
Value[i] in the columns is the aggregation of data grouped by the column value Column1-Value[i].
To get from the table-valued result to the column-valued result I will show an example of a table-valued result from a SELECT … GROUP BY query.
We start with some sample data:
SELECT [ID],
[ProductName]
[SaleAmount]
[SaleDate]
FROM [ProductSales]
The results show as:
ID ProductName SaleAmount SaleDate
1 Apples 100 2021-01-01
2 Oranges 200 2021-01-15
3 Bananas 150 2021-02-01
4 Apples 200 2021-02-15
5 Pears 100 2021-02-15
6 Apples 150 2021-03-01
7 Bananas 100 2021-03-15
8 Apples 250 2021-04-01
9 Oranges 50 2021-04-15
10 Apples 300 2021-05-01
11 Pears 100 2021-05-15
12 Bananas 150 2021-05-15
13 Apples 150 2021-06-01
14 Pears 150 2021-06-15
In the next section, I will show how to turn a row of columnar data into a cross tab report.
Rotating Row Data into a Cross Tab Report
Suppose that I want to create a report that rotates the above result set into a cross tab column-valued report where the aggregations are the column values of the result, and the groupings of the of the aggregations are by the column names.
Essentially, the column names in the pivot table will be buckets that we allocate each record amount to during the grouping of data.
The more familiar table valued method to aggregate the above data is to use a familiar GROUP BY select query as shown below where we can aggregate sale amounts by the product name:
SELECT [ProductName], Sales=SUM([SaleAmount])
FROM [ProductSales]
GROUP BY [ProductName]
Or aggregate sales amount by each month of the year a product has a sale:
SELECT [ProductName], Month=MONTH([SaleDate]), Sales=SUM([SaleAmount])
FROM [ProductSales]
GROUP BY [ProductName], MONTH([SaleDate])
The transform we have in mind is the following is a 6-month columnar report:
ProductName Jan Feb Mar Apr May Jun
[Product 1] v11 v12 v13 v14 v15 v16
[Product 2] v21 v22 v23 v24 v25 v26
…
[Product N] vN1 vN2 vN3 vN4 vN5 vN6
Where:
[Product1] … [ProductN] are the products that will be aggregated, and
and
v[i][j] are the totals allocated to the pivot columns.
In the next section, I will show how we can use pivots with a number of useful summary and quantity reports.
Example Pivot Reports
In this section, I will cover how to generate some useful reports based on the Pivot operator.
Sales Summary Totals by Month
This then leads us to a sales summary monthly report that aggregates sales by the product, then partitions these aggregated totals into each month. The SQL query we use is shown below:
SELECT [ProductName], Jan=[1], Feb=[2], Mar=[3], Apr=[4], May=[5], Jun=[6]
FROM
(
SELECT [ProductName],
Month=MONTH([SaleDate]),
Sales=SUM([SaleAmount])
FROM [ProductSales]
GROUP BY [ProductName], MONTH([SaleDate])
) As DataSource
PIVOT (SUM([Sales]) FOR [Month] IN ([1], [2], [3], [4], [5], [6])) pvt
The result is a column-valued report as follows:
ProductName Jan Feb Mar Apr May Jun
Apples 100 200 150 250 300 150
Bananas NULL 150 100 NULL 150 NULL
Oranges 200 NULL NULL 50 NULL NULL
Pears NULL 100 NULL NULL 100 150
You can see that where there are no records that correspond to a sale for that given month, the aggregated total for that month is given a NULL total.
The above report is for sales totals that are for a product within each month.
What is we want sales totals that are cumulative? A cumulative total is when we accumulate the aggregated totals from the first month to the current month in the report column.
So, if I sold 100 apples in January, 200 in February, and 300 in March, then the cumulative total for February is 300 and for March it is 600. To visualize this, we can first write a basic table-valued GROUP BY query that shows the cumulative totals for each month in each row. This can be done with the following GROUP BY query:
SELECT t.[ProductName],
Month=MONTH(t.[SaleDate]),
Sales=
(
SELECT SUM([SaleAmount])
FROM [ProductSales]
WHERE [ProductName] = t.[ProductName] AND
MONTH([SaleDate]) <= Month(t.[SaleDate])
)
FROM [ProductSales] t
GROUP BY [ProductName], MONTH(t.[SaleDate])
The resulting table-valued result is as follows:
ProductName Month Sales
Apples 1 100
Apples 2 300
Apples 3 450
Apples 4 700
Apples 5 1000
Apples 6 1150
Bananas 2 150
Bananas 3 250
Bananas 5 400
Oranges 1 200
Oranges 4 250
Pears 2 100
Pears 5 200
Pears 6 350
The above doesn’t look too elegant, however it shows the month 1..6 and it’s accumulative total.
Sales Totals 6 Month Cumulative Summary
Now let’s improve on the above report with a column-valued cross tab report using the SQL PIVOT operator:
SELECT [ProductName],
Jan=[1], Feb=[2], Mar=[3], Apr=[4], May=[5], Jun=[6]
FROM
(
SELECT t.[ProductName],
Month=MONTH(t.[SaleDate]),
Sales=
(
SELECT SUM([SaleAmount])
FROM [ProductSales]
WHERE [ProductName] = t.[ProductName] AND
MONTH([SaleDate]) <= Month(t.[SaleDate])
)
FROM [ProductSales] t
GROUP BY [ProductName], MONTH(t.[SaleDate])
)
AS DataSource
PIVOT (SUM([Sales]) FOR [Month] IN ([1], [2], [3], [4], [5], [6])) pvt
Notice that we have taken the GROUP BY query and used it as a sub-query for the PIVOT query:
SELECT [ProductName],
Jan=[1], Feb=[2], Mar=[3], Apr=[4], May=[5], Jun=[6]
FROM
(
[GROUP BY sub-query]
)
AS DataSource
PIVOT (SUM([Sales]) FOR [Month] IN ([1], [2], [3], [4], [5], [6])) pvt
The resulting cumulative totals report is shown:
ProductName Jan Feb Mar Apr May Jun
Apples 100 300 450 700 1000 1150
Bananas NULL 150 250 NULL 400 NULL
Oranges 200 NULL NULL 250 NULL NULL
Pears NULL 100 NULL NULL 200 350
Turning the above into a 12-month cumulative sales summary report is then quite straightforward.
We have some sample data that covers a 12-month duration:
12 Months Sales Data …
ID ProductName SaleAmount SaleDate
1 Apples 100 2021-01-01
2 Oranges 200 2021-01-15
3 Bananas 150 2021-02-01
4 Apples 200 2021-02-15
5 Pears 100 2021-02-15
6 Apples 150 2021-03-01
7 Bananas 100 2021-03-15
8 Apples 250 2021-04-01
9 Oranges 50 2021-04-15
10 Apples 300 2021-05-01
11 Pears 100 2021-05-15
12 Bananas 150 2021-05-15
13 Apples 150 2021-06-01
14 Pears 150 2021-06-15
15 Apples 50 2021-07-01
16 Oranges 100 2021-07-01
17 Pears 50 2021-07-15
18 Apples 100 2021-08-01
19 Bananas 200 2021-08-15
20 Oranges 350 2021-09-01
21 Pears 200 2021-09-15
22 Bananas 50 2021-09-15
23 Apples 400 2021-10-01
24 Oranges 150 2021-10-01
25 Pears 250 2021-11-01
26 Bananas 100 2021-11-01
27 Apples 250 2021-11-15
28 Apples 150 2021-12-01
29 Pears 100 2021-12-01
30 Oranges 450 2021-12-15
Sales Totals 12 Month Cumulative Summary
We will work some magic with the PIVOT operator to produce a cumulative summary report:
SELECT [ProductName],
Jan=[1],
Feb=[2],
Mar=[3],
Apr=[4],
May=[5],
Jun=[6],
Jul=[7],
Aug=[8],
Sep=[9],
Oct=[10],
Nov=[11],
Dec=[12]
FROM
(
SELECT t.[ProductName], Month=MONTH(t.[SaleDate]),
Sales=
(
SELECT SUM([SaleAmount])
FROM [ProductSales]
WHERE [ProductName] = t.[ProductName] AND
MONTH([SaleDate]) <= Month(t.[SaleDate])
)
FROM [ProductSales] t
GROUP BY [ProductName], MONTH(t.[SaleDate])
)
AS DataSource
PIVOT (SUM([Sales]) FOR [Month]
IN ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12])) pvt
What we just did was to add an additional column Sales, to the report, summed up by the product, then pivoted the sales aggregation by the month of the product sale.
The resulting 12-month cumulative totals report is shown:
ProductName Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Apples 100 300 450 700 1000 1150 1200 1300 NULL 1700 1950 2100
Bananas NULL 150 250 NULL 400 NULL NULL 600 650 NULL 750 NULL
Oranges 200 NULL NULL 250 NULL NULL 350 NULL 700 850 NULL 1300
Pears NULL 100 NULL NULL 200 350 400 NULL 600 NULL 850 950
An alternative report can be a count (quantity) of the total sales instead of the sales amount totals.
Sales Count by Month
The sales count by month is generated with the following GROUP BY query:
SELECT Sales_Count=COUNT([SaleAmount]),
Month=MONTH([SaleDate])
FROM ProductSales t
GROUP BY MONTH([SaleDate])
The table-valued result is then:
Sales_Count Month
2 1
3 2
2 3
2 4
3 5
2 6
3 7
2 8
3 9
2 10
3 11
3 12
Cumulative Sales Count by Month
We can now write the above query as a cumulative sales count for each month:
SELECT Sales_Count=
(
SELECT COUNT(s.[SaleAmount])
FROM [ProductSales] s
HERE MONTH(s.[SaleDate]) <= MONTH(t.[SaleDate])
),
Month=MONTH([SaleDate])
FROM ProductSales t
GROUP BY MONTH([SaleDate])
The cumulative sales count by month is generated with the following GROUP BY query:
Sales_Count Month
2 1
5 2
7 3
9 4
12 5
14 6
17 7
19 8
22 9
24 10
27 11
30 12
Quarterly Sales Quantity Report by Product with Grouping
What if we wanted a report that aggregated the sales counts by each quarter?
We can start with a basic GROUP BY query.
SELECT [ProductName],
(
SELECT COUNT(s.[SaleAmount])
FROM [ProductSales] s
WHERE s.ProductName = t.ProductName AND
MONTH(s.[SaleDate]) >= 1 AND MONTH(s.[SaleDate]) <= 3
) AS Q1,
(
SELECT COUNT(s.[SaleAmount])
FROM [ProductSales] s
WHERE s.ProductName = t.ProductName AND
MONTH(s.[SaleDate]) IN (4, 5, 6)
) AS Q2,
(
SELECT COUNT(s.[SaleAmount])
FROM [ProductSales] s
WHERE s.ProductName = t.ProductName AND
MONTH(s.[SaleDate]) IN (7, 8, 9)
) AS Q3,
(
SELECT COUNT(s.[SaleAmount])
FROM [ProductSales] s
WHERE s.ProductName = t.ProductName AND
MONTH(s.[SaleDate]) IN (10, 11, 12)
) AS Q4
FROM ProductSales t
WHERE YEAR(t.[SaleDate]) = 2021
GROUP BY [ProductName]
The table-valued result is then:
ProductName Q1 Q2 Q3 Q4
Apples 3 3 2 3
Bananas 2 1 2 1
Oranges 1 1 2 2
Pears 1 2 2 2
Quarterly Sales Amount Report by Product with Grouping
We can also use the above to generate a report that aggregated the sales amounts by the quarter.
SELECT [ProductName],
(
SELECT SUM(s.[SaleAmount])
FROM [ProductSales] s
WHERE s.ProductName = t.ProductName AND MONTH(s.[SaleDate]) >= 1 AND
MONTH(s.[SaleDate]) <= 3
) AS Q1,
(
SELECT SUM(s.[SaleAmount])
FROM [ProductSales] s
WHERE s.ProductName = t.ProductName AND MONTH(s.[SaleDate]) IN (4, 5, 6)
) AS Q2,
(
SELECT SUM(s.[SaleAmount])
FROM [ProductSales] s
WHERE s.ProductName = t.ProductName AND MONTH(s.[SaleDate]) IN (7, 8, 9)
) AS Q3,
(
SELECT SUM(s.[SaleAmount])
FROM [ProductSales] s
WHERE s.ProductName = t.ProductName AND MONTH(s.[SaleDate]) IN (10, 11, 12)
) AS Q4
FROM ProductSales t
WHERE YEAR(t.[SaleDate]) = 2021
GROUP BY [ProductName]
The table-valued result is then:
ProductName Q1 Q2 Q3 Q4
Apples 450 700 150 800
Bananas 250 150 250 100
Oranges 200 50 450 600
Pears 100 250 250 350
Quarterly Sales Amount Report by Product with Pivot
The next step is to write the above quarterly sales amount report using the PIVOT operator.
SELECT [ProductName],
Q1=[1], Q2=[2], Q3=[3], Q3=[4]
FROM
(
SELECT [ProductName], Quarter=1,
Sales=SUM([SaleAmount])
FROM [ProductSales]
GROUP BY [ProductName], MONTH([SaleDate])
HAVING (MONTH([SaleDate]) = 1) OR
(MONTH([SaleDate]) = 2) OR
(MONTH([SaleDate]) = 3)
UNION
SELECT [ProductName], Quarter=2, Sales=SUM([SaleAmount])
FROM [ProductSales]
GROUP BY [ProductName], MONTH([SaleDate])
HAVING (MONTH([SaleDate]) = 4) OR
(MONTH([SaleDate]) = 5) OR
(MONTH([SaleDate]) = 6)
UNION
SELECT [ProductName], Quarter=3, Sales=SUM([SaleAmount])
FROM [ProductSales]
GROUP BY [ProductName], MONTH([SaleDate])
HAVING (MONTH([SaleDate]) = 7) OR
(MONTH([SaleDate]) = 8) OR
(MONTH([SaleDate]) = 9)
UNION
SELECT [ProductName], Quarter=4, Sales=SUM([SaleAmount])
FROM [ProductSales]
GROUP BY [ProductName], MONTH([SaleDate])
HAVING (MONTH([SaleDate]) = 10) OR
(MONTH([SaleDate]) = 11) OR
(MONTH([SaleDate]) = 12)
)
As DataSource
PIVOT
(SUM([Sales]) FOR [Quarter] IN ([1], [2], [3], [4])) pvt
The column-valued result is then:
ProductName Q1 Q2 Q3 Q3
Apples 450 700 150 800
Bananas 250 150 250 100
Oranges 200 50 450 600
Pears 100 250 250 350
As you can see, we have managed to implement using a combination of the familiar GROUP BY and PIVOT operators, some elegant query results that can be utilised within many applications that need to group aggregations of data into columnar buckets. Without the advantage in being able to make use of the PIVOT operator, we would have to resort to utilising some additional scripting within a stored procedure or within our application business logic to generate the pivot columns.
The above queries could be used in useful scenarios like generating reports with the totals of different types of users such as using a particular service or site utilization in particular areas on a given month of the year.
For more details refer to the PIVOT operator on the Microsoft site.
That is all for today’s post.
I hope you have found this post useful and informative.
Andrew Halil is a blogger, author and software developer with expertise of many areas in the information technology industry including full-stack web and native cloud based development, test driven development and Devops.