Access Subquery Techniques

Even some experienced Access developers shy away from writing SQL directly because they only use the Query Designer. That’s a shame, because, unless you’re willing to write SQL, you can’t use subqueries, which are a powerful tool for solving some especially thorny data retrieval problems. In this article, Mike Gunderloy introduces subqueries and shows how you can use them in Access.

So what’s a subquery? That’s easy: A subquery is an SQL SELECT statement that’s nested inside of another SQL statement. You can use subqueries as part of a SELECT, SELECT INTO, INSERT INTO, DELETE, or UPDATE statement; in this article, I’ll only cover subqueries in SELECT statements. But the big question is, “Why would you want to use a subquery?” I’ll start out by looking at a practical example.

Using a subquery to calculate intervals

Suppose you’d like to know how frequently your customers place orders (this example, like all of the others in this article, will use the Access 2002 version of the Northwind sample database for its data). You can create a query to provide this information by following these steps:

  1. Create a new query based on the Orders table.
  2. Choose to show the CustomerID and OrderDate columns.
  3. In the Field row of the third column of the query, type this expression:
PreviousOrderDate: (SELECT MAX(OrderDate)

FROM Orders AS Orders1

WHERE Orders1.OrderDate < Orders.OrderDate

AND Orders1.CustomerID = Orders.CustomerID)
  1. In the Field row of the fourth column of the query, type this expression:
OrderInterval: [OrderDate]-[PreviousOrderDate]
  1. Set the CustomerID field to sort ascending, and the OrderDate field to sort descending. Now run the query. Figure 1 shows the results. For each order you can see the date of the previous order as well as the number of days between the two orders. If you look at this query in SQL view, here’s what you’ll find:
SELECT Orders.CustomerID, Orders.OrderDate,

    (SELECT MAX(OrderDate)

     FROM Orders AS Orders1

     WHERE Orders1.OrderDate < Orders.OrderDate

     AND Orders1.CustomerID = Orders.CustomerID)

               AS PreviousOrderDate,

[OrderDate]-[PreviousOrderDate] AS OrderInterval

FROM Orders

ORDER BY Orders.CustomerID, Orders.OrderDate DESC;

The embedded SELECT statement in parentheses is a subquery. To be more precise, this particular example is a correlated subquery: one that uses a field from the main table as a part of the WHERE clause in the subquery.

 
Figure 1

Subquery syntax

A subquery can appear in the field list (as in the preceding example) or in a WHERE or HAVING clause, where it provides a set of one or more values to evaluate. In a field list, the subquery must return a single value, which is normally assured by using an aggregation such as MAX or SUM. In the WHERE or HAVING clause there are three basic forms for a subquery:

comparison [ANY|ALL|SOME] (subquery)

expression [NOT] IN (subquery)

[NOT] EXISTS (subquery)

The first form, using the ANY, SOME, or ALL keywords, allows you to filter a query based on the results of another query. For example, consider this query:

SELECT TOP 5 UnitPrice

FROM Products

ORDER BY UnitPrice DESC

That gives a result set with the five highest prices in the Products table:

Unit Price

$263.50

$123.79

$97.00

$81.00

$62.50

Now, I’ll use that as a subquery. First, here it is with the ANY keyword:

SELECT UnitPrice FROM PRODUCTS

WHERE UnitPrice < ANY

(SELECT TOP 5 UnitPrice

FROM Products

ORDER BY UnitPrice DESC)

ORDER BY UnitPrice DESC

The result set for that query starts off:

Unit Price

$123.79

$97.00

$81.00

$62.50

$55.00

$53.00 (more rows omitted)

Note that the $123.79 unit price is included, because it’s less than any one of the rows returned by the subquery. The result might be more clear if you think of this using the equivalent SOME keyword (this returns exactly the same results as the previous example):

SELECT UnitPrice FROM PRODUCTS

WHERE UnitPrice < SOME

(SELECT TOP 5 UnitPrice

FROM Products

ORDER BY UnitPrice DESC)

ORDER BY UnitPrice DESC

The alternative is to use ALL for the subquery comparison, which returns different results. The query is:

SELECT UnitPrice FROM PRODUCTS

WHERE UnitPrice < ALL

(SELECT TOP 5 UnitPrice

FROM Products

ORDER BY UnitPrice DESC)

ORDER BY UnitPrice DESC

And the results start off:

Unit Price

$55.00

$53.00

$49.30

$46.00

$45.60

$43.90 (more rows omitted)

Now, the $123.79 unit price is removed from the results (along with several others); the query returns only rows in the main table that are less than all of the rows returned by the subquery.

The second form of subquery syntax, using In or Not In, allows you to use a set of values (rather than a single value) in a WHERE clause. Suppose, for example, that you want to see the names of all employees who’ve sold anything to a particular customer. You can accomplish that with this subquery in the WHERE clause:

SELECT FirstName, LastName

FROM Employees

WHERE EmployeeID IN

   (SELECT EmployeeID

    FROM Orders

    WHERE CustomerID = 'ALFKI')

To see the employees who haven’t sold anything to this customer, just replace In with Not In. In case you didn’t know, In and Not In work perfectly well without subqueries as well. For example, this is a valid query:

SELECT FirstName, LastName

FROM Employees

WHERE EmployeeID IN (3,5,6)

Finally, the EXISTS and NOT EXISTS keywords let you make decisions based on whether there are any records at all in a subquery. For example, to find all products that have been ordered by a customer, you could use this query:

SELECT ProductName

FROM Products

WHERE EXISTS

  (SELECT Orders.OrderID

   FROM Orders INNER JOIN [Order Details]

   ON Orders.OrderID = [Order Details].OrderID

   WHERE Products.ProductID =

               [Order Details].ProductID

   AND CustomerID = 'ALFKI')

More subquery solutions

Let’s look at some other querying problems that are easily solved with the use of subqueries. One of these is the problem of getting a “top per group” result set. For example, suppose you’d like to see the most recent three order dates for each customer in the database. You can accomplish this task with this query:

SELECT CompanyName, OrderDate

FROM Customers INNER JOIN Orders

ON Customers.CustomerID = Orders.CustomerID

WHERE OrderDate IN

  (SELECT TOP 3 OrderDate

   FROM Orders

   WHERE Orders.CustomerID = Customers.CustomerID

   ORDER BY OrderDate DESC)

ORDER BY CompanyName ASC, OrderDate DESC

Note that the WHERE clause in the subquery joins a field from the subquery with a field from the main query. This is what makes the subquery return different results for each customer in the main query.

Another use for subqueries is to add rankings to a totals query. For example, you might want to know the sales rank of each product. This is most easily done with two queries. The first is a totals query that collects the total sales for each product:

SELECT ProductName,

  SUM(Quantity) AS TotalSales

FROM [Order Details] INNER JOIN Products

ON [Order Details].ProductID =

               Products.ProductID

GROUP BY ProductName

ORDER BY SUM(Quantity) DESC

After that query has been saved as qryProductTotals, you can use this query with a subquery to generate the sales rankings:

SELECT ProductName, TotalSales,

  (SELECT COUNT(*)

   FROM qryProductTotals AS QPT

   WHERE qryProductTotals.TotalSales

           <=  QPT.TotalSales) AS Rank

FROM qryProductTotals

ORDER BY TotalSales DESC

Figure 2 shows the results of running the second query. Note how duplicate sales figures are handled by assigning a tie rank. The query works by looking at a second copy of the source query and counting the number of rows that have a total equal to or greater than that of the current row. If you run this query, you’ll discover that it’s extremely slow, because it needs to run the nested totals query once for every row in the result set.

 
Figure 2

Finally, subqueries are very useful for answering “above average” questions. For example, which products cost more than the average product? Here’s a query with a subquery that gives the answer:

SELECT ProductName, UnitPrice

FROM Products

WHERE UnitPrice >

(SELECT AVG(UnitPrice)

FROM Products)

ORDER BY UnitPrice DESC

The path to SQL enlightenment

Subqueries are perhaps the simplest queries in Access that absolutely require you to write some SQL. Even if you use the QBE grid to construct your overall query (as I did for the first example in this article), you can’t avoid writing an SQL statement for the subquery, either in a field definition or in a WHERE or HAVING clause. The requirement to write SQL makes many beginning Access developers shy away from using subqueries. That’s a pity, because some problems (for example, the ranking query or the top per group query) are most easily solved by subqueries.

Rather than avoiding subqueries, I urge you to embrace them. Learning enough SQL to write subqueries will help you gain confidence in writing SQL statements, and ultimately you’ll find that you can use this knowledge to write other types of queries directly in SQL. Access is practically unique as a product in letting you switch easily from a graphical view of a query to an SQL view. By learning how to make this switch on your own, you’ll develop SQL skills that will serve you well in other less flexible products, as well as in writing VBA code that uses SQL statements.

 Your download file is called  Gunderloy_SubQuery.accdb 

 

Other Pages On This Site You Might Like To Read

About Mike Gunderloy

Web application developer specializing in Ruby on Rails applications. Especially interested in working as part of a distributed team. Solid experience with telecommuting and remote work using a variety of tools over the past 15 years. Also a Senior Writer for the defunct Web Worker Daily, with dozens of books and hundreds of articles under his belt.
This entry was posted in Other Topics. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *


*


This site uses Akismet to reduce spam. Learn how your comment data is processed.