Search This Blog

Thursday, 15 May 2008

TSQL - GROUP BY and ALL


We sure have used Group By in our SQL queries to group data and get result sets with aggregates from SQL Server, But before I explain lets create a temporary table with order details table using the Northwind database in SQL 2000

I will just join orders and [order detail] table in Northwind to get the data i need into a temporary table as shown below

SELECT 
  O.OrderID, 
  ProductID, 
  UnitPrice, 
  Quantity, 
  (UnitPrice*Quantity) AS Amount,
  CustomerID
INTO 
  #tempOrders
FROM 
  Orders O 
INNER JOIN
  [order details] D 
ON
  O.[orderid] = D.[orderid]
ORDER BY
  ProductID

So now i have a table called #tempOrders with the order details i need.

Now suppose I'd like to see the customers that were sold Product #1 along with the total amount that they spent. I will usea query with a GROUP BY clause as below with a WHERE condition to filter records

SELECT 
   CustomerID,
   SUM(Amount) AS TotalAmount
 FROM
   #tempOrders
 WHERE 
   ProductID = 1
 GROUP BY 
   CustomerID


Now, let's say that I'd like to see all customers that have been sold any products, but we still just want to see the "TotalAmount" for ProductID #1. For customers that have never ordered ProductID #1, it should output a "TotalAmount" value of 0. One way to do this is with a CASE expression as shown below

SELECT 
  CustomerID,
  SUM(CASE WHEN ProductID = 1 THEN Amount ELSE 0 END) AS TotalAmount
FROM
  #tempOrders
GROUP BY 
  CustomerID

Now this would return customers who haven't purchased Product #1 with a total of 0. In situations like these the SUM(CASE...) expression can be replaced with a GROUP BY ALL.

SELECT 
    CustomerID,
    ISNULL(SUM(Amount), 0) AS TotalAmount
  FROM
    #tempOrders
  WHERE 
    ProductID = 1
  GROUP BY ALL CustomerID

Values that are excluded from the aggregation according to the WHERE clause have NULL values returned, the ISNULL function makes sure all customers who haven't ordered Product #1 have a total of 0 instead of NULL. The ALL option basically says "ignore the WHERE clause when doing the GROUPING, but still apply it for any aggregate functions". So, in this case, the WHERE clause is not considered when generating the population of CustomerID values, but it is applied when calculating the SUM. This is very much like our first solution, where we removed the WHERE clause completely, and used a SUM(CASE...) expression to conditionally calculate the aggregate.

GROUP BY ALL is kind of obscure and neat to know, but not really useful in most situations since there are usually easier or better ways to get this result. This won't work if we want all Customers to be displayed, since a customer must have at least one order to show up in the result.Another limitation is we can not use GROUP BY ALL if we want to return a grand total for all orders

No comments: