You aren’t limited to joining tables only when two fields are equal. In the third of an ongoing series of articles on SQL, Peter shares some techniques to exploit more powerful options when linking tables in queries.
LIKE most developers, you probably join tables in your queries by dragging a field from one table to a field in another table. While tremendously easy to do, this is just the tip of the proverbial join iceberg. There are at least two other ways of joining tables, and both let you do some neat things.
When you join two tables by dragging and dropping, you get a SQL query that looks like this:
SELECT * FROM Table1 INNER JOIN Table2 ON Table1.Field1 = Table2.Field2
This is known as an “equi-join” because it requires the two fields that are joined to be equal in order for rows from the tables to be included in the result. If you bypass Access’ Query Design window (by switching to SQL on the View menu, for instance) you can create a join on two fields that aren’t equal, which is known as a non-equi-join.
In last month’s Working SQL column SQL Without Joins, I mentioned working on a problem with Joe Fallon at Nestle to create a schedule of games for a round-robin hockey tournament. At the time, we created a table of possible games by generating every combination of every team in the tournament. This list included some impossible games (teamA playing teamA) and some redundant games (teamA plays teamB along with teamB plays teamA). Joe pointed out that if we sorted the teams by name, every team only needs to play the teams higher than itself in the list. In a four-team league, these are the games that must be played:
TeamA vs. TeamB TeamA vs. TeamC TeamA vs. TeamD TeamB vs. TeamC TeamB vs. TeamD TeamC vs. TeamD
While there are probably many ways to generate this list, a single SQL query that uses a non-equi-join does it the best:
SELECT tblTeams.Team, tblTeams_1.Team FROM tblTeams INNER JOIN tblTeams As tblTeams_1 ON tblTeams.Team < tblTeams_1.Team
The non-equi-join (tblTeams.Team < tblTeams_1.Team) in this query generates every combination of a team and all the teams sorted after it in the list, which is exactly what we needed.
Because SQL is set oriented and doesn’t care about order, creating an ordered list, such as a hierarchy, can be difficult. Let’s say you have a table of employees that lists the department they’re in and a level number that shows where in the company hierarchy they are. With that table and this query you can find every employee’s immediate boss:
SELECT DISTINCTROW tblEmploy.EName, tblBoss.EName FROM tblEmploy INNER JOIN tblEmploy AS tblBoss ON tblEmploy.EDept = tblBoss.EDept AND tblEmploy.ELevel < tblBoss.ELevel WHERE [tblBoss].[ELevel]-[tblEmploy].[ELevel]=1;
This query uses a non-equi-join (ELevel < ELevel) to generate all the combinations of a boss and all the employees lower in the company food chain. It also uses an equi-join (EDept = EDept) to make sure that only the combinations where the employees are in the same department appear. Finally, the Where clause winnows out most of the generated rows by limiting the result to the rows where the boss is only one level above the employee.
This leads to one of the trickier problems in SQL: finding a record that comes just before another record (as in “What sales order did we send out just before order number 19720?”). Here’s a query that uses a non-equi-join to generate all the combinations for each order and all the orders that came before it:
SELECT tblOrders.ONumber FROM tblOrders INNER JOIN tblOrders AS tblOrders_1 ON tblOrders.ODate < tblOrders_1.Odate WHERE tblOrders_1.ONumber = ‘19720’ ORDER BY tblOrders.ODate - tblOrders_1.ODate
The Where clause limits the list to the rows involving the order we’re interested in. Sorting by the difference between the dates of the two records causes the order closest in time to order ‘19720’ to rise to the top of the list.
There’s another variation on the join that helps me answer questions such as “Which of our dealers hasn’t bought anything from us this month?” These questions boil down to asking, “Which records are in one table and not in another?” With a standard join, you can only answer this question by using a subquery. A simpler and generally faster way is to use the strategy Access follows with the Find Unmatched query wizard—the outer join.
The join you create by dragging and dropping fields is an inner join. While very useful, an inner join is designed to be limiting: only the records that satisfy the join appear in the result. An outer join is more inclusive. With an outer join, all the records from one of the tables appear in the query. If there’s no matching record in the other table, then the fields from that table are set to Null in the result. This feature lets you find the records that don’t have a match in the joined table by checking to see if some field from it is Null:
SELECT Table1.Field1, Table2.Field2 FROM Table1 LEFT JOIN Table2 ON Table1.Field1 = Table2.Field2 WHERE Table2.Field2 Is Null;
Unlike the inner join, the outer join comes in two flavors. Being a southpaw myself, I favor the left outer join (usually written simply as left join). In the left join, all the records in the table to the left of the word join appear in the query (in this case, Table1). In a right join, all the records in the table to the right of the word join appear in the query (Table2).
Unlike non-equi-joins, you can create outer joins in the Access Query Design window. Once you’ve joined the two tables together, double-click on the line that connects the two tables. The dialog box shown in Figure 1 appears and asks which of the following you want:
- Leave the query as an inner join (Option 1).
- Have all of Table1’s records in the result (Option 2—a left join).
- Have all of Table2’s records in the result (Option 3—a right join).
Double-clicking on the line requires a certain amount of fine motor control, especially if your monitor is set to a high resolution. I’ve often been driven to move the tables around until the line between them is sloped—it seems to be easier to click on the line then. If you double-click and the query properties window comes up, just close it and start again. Be advised! When you have tables that are joined on more than one field, if one of the joins is an outer join, they all must be outer joins.
Access tends to hide the Join clause where you can’t see it, which is too bad. As I’ve tried to suggest here, it’s a hidden treasure if you take the time to exploit it.