Saturday, October 17, 2009

Derived Tables

Sometimes querying data is not that simple and there may be the need to create temporary tables or views to predefine how the data should look prior to its final output. Unfortunately there are problems with both of these approaches if you are trying to query data on the fly.

Temporary table approach
With the temporary tables approach you need to have multiple steps in your process, first to create the temporary table, then to populate the temporary table, then to select data from the temporary table and lastly cleanup of the temporary table.

View approach
With the view approach you need to predefine how this data will look, create the view and then use the view in your query. Granted if this is something that you would be doing over and over again this might make sense to just create a view, but let's look at a totally different approach.

SQL Server provides allows to create derived tables on the fly and then use these derived tables within your query. This is similar to creating a temporary table and then using the temporary table in your query.

Let's take a look at an example where we query Northwind database to find out how many customers fall into various categories based on sales. The categories that we have predefined are as follows:

Total Sales between 0 and 5,000 = Micro
Total Sales between 5,001 and 10,000 = Small
Total Sales between 10,001 and 15,000 = Medium
Total Sales between 15,001 and 20,000 = Large
Total Sales > 20,000 = Very Large


There are several ways to get this data but we will use derived tables approach.

The first step is to find out the total sales by each customer, which can be done with the following statement.

SELECT o.CustomerID,
SUM(UnitPrice * Quantity) AS TotalSales
FROM [Order Details] AS od
INNER JOIN Orders AS o
ON od.OrderID = o.OrderID
GROUP BY o.CustomerID

This is a partial list of the output:

CustomerID TotalSales
-------------------------------
ALFKI 4596.2000
ANATR 1402.9500
ANTON 7515.3500
WOLZA 3531.9500

Now classify the TotalSales value into the OrderGroups that was specified above:

SELECT o.CustomerID,
SUM(UnitPrice * Quantity) AS TotalSales,
CASE
WHEN SUM(UnitPrice * Quantity)
BETWEEN 0 AND 5000 THEN 'Micro'
WHEN SUM(UnitPrice * Quantity)
BETWEEN 5001 AND 10000 THEN 'Small'
WHEN SUM(UnitPrice * Quantity)
BETWEEN 10001 AND 15000 THEN 'Medium'
WHEN SUM(UnitPrice * Quantity)
BETWEEN 15001 AND 20000 THEN 'Large'
WHEN SUM(UnitPrice * Quantity)
> 20000 THEN 'Very Large'
END AS OrderGroup
FROM [Order Details] AS od
INNER JOIN Orders AS o
ON od.OrderID = o.OrderID
GROUP BY o.CustomerID

This is a partial list of the output:

CustomerID TotalSales OrderGroup

-------------------------------------------
ALFKI 4596.2000 Micro
ANATR 1402.9500 Micro
ANTON 7515.3500 Small
WOLZA 3531.9500 Micro

There can be many customers who fit into each of these groups and this is where the derived table comes into play. What we are doing here is using the same query from the step above, but using it as derived table OG. Then we are selecting data from this derived table for our final output just like we would with any other query. All of the columns that are created in the derived table are now available in our final query.


SELECT OG.OrderGroup,
COUNT(OG.OrderGroup) AS OrderGroupCount
FROM (SELECT o.CustomerID,
SUM(UnitPrice * Quantity) AS TotalSales,
CASE
WHEN SUM(UnitPrice * Quantity)
BETWEEN 0 AND 5000 THEN 'Micro'
WHEN SUM(UnitPrice * Quantity)
BETWEEN 5001 AND 10000 THEN 'Small'
WHEN SUM(UnitPrice * Quantity)
BETWEEN 10001 AND 15000 THEN 'Medium'
WHEN SUM(UnitPrice * Quantity)
BETWEEN 15001 AND 20000 THEN 'Large'
WHEN SUM(UnitPrice * Quantity)
> 20000 THEN 'Very Large'
END AS OrderGroup
FROM [Order Details] AS od
INNER JOIN Orders AS o
ON od.OrderID = o.OrderID
GROUP BY o.CustomerID) AS OG
GROUP BY OG.OrderGroup

This is the complete list of the output from the above query.

OrderGroup OrderGroupCount
-----------------------------------
Large 10
Medium 11
Micro 33
Small 15
Very Large 20

No comments: