With the help of a structured naming convention, you can harness the power of Access objects to develop data mining applications quickly and efficiently.
I work for a department called Analytical Services. Our department acts as a liaison between Information Technology and Sales. IT does a superb job of collecting and storing massive amounts of data, and our salespeople excel at using information to help our clients grow their businesses. The challenge in Analytical Services is to turn the raw data stored in the company’s data warehouse into useable information for our sales team. My job is to help build automated applications that will mine the data warehouse and produce needed reports “at the speed of business.”
For simple requests, I can create a new Access database, attach a few data warehouse tables, create some queries, and have the answer in no time. Unfortunately, straightforward requests are rarely the case. More common are requests for sets of reports that require many steps of data aggregation. In these instances, I must build dozens of queries to generate the data requested. As the queries pile up, my database grows increasingly more complex, making it difficult for me (and others in my group) to decipher. What’s more, converting this database into an application to automatically produce the reports is no simple task, due to its chaotic order.
After working like this for some time (and incurring much pain and suffering), I determined that my lack of organization stemmed from Access’s insistence on ordering tables and queries alphabetically. While I did my best to name my queries and tables logically (for instance, “Sum Dollars to Market,” “Dollar Sales by Market”), these names could never be long enough to describe them precisely. I inevitably ended up with similar-sounding names for very different database objects. Another problem I had with this haphazard approach was that when I was all done with the database, I didn’t know which query to run first. I also would have a hard time knowing which query created which table unless I did a lot of hunting.
To address these concerns, I came up with a naming convention. While most naming conventions improve the maintainability of applications by making them self-documenting, my convention also improved my ability to quickly set up automated, query-driven applications. Our entire department uses this convention, which allows us to easily share applications. This simple-to-learn technique has proven to be very powerful, solving several development issues.
To explain how the system works, I’ll take you through the development of a small query-based application. A copy of the completed database is available in the accompanying Download file.
Building a system
The best way to show off this convention is to take a sample application. I needed to produce a report that showed the average price for a requested product category by brand, by region, and by week. I started by sketching out what the report would look like, so I’d know exactly what data I was going to need (see Figure 1).
After reviewing the report specifications with the end-users, I then needed to determine which databases, tables, and fields contained the data we needed, as well as any calculations required to arrive at the output. For this report, the only tables I needed were Product Information, Geography Definition, and Sales. Now I was ready to turn to Access.
I began by creating a standalone database for this application. The next step was to attach the necessary tables. After the tables were attached, I renamed them and added a prefix to the table names so that they were grouped by database (see Figure 2).
In this example, all three tables came from our Red Brick database, so I used the prefix “rb” in front of the table names. Some applications I build might contain tables from multiple databases, such as Red Brick, Oracle, and SQL Server. Adding a prefix (o for Oracle, ss for SQL Server, and so forth) organizes the tables effectively. This prefix becomes essential as the number of Access objects increases. It’s important to note that you should rename attached tables as soon as you attach them, before you start building queries against them. Unless you’re using a tool such as SpeedFerret or Peter Bryant’s DBRat from the February 1998 issue of Smart Access (see “Searching Access”), changing tables already in use is a pain.
Naming convention rules
The first query in my application combined the product, sales, and geography information into one table (see Figure 3).
With the query created, it was time to name the query, as well as the table it would create. The following three rules control the naming of queries and any tables made by them.
All queries are named with the prefix #x, where # is a single digit and x is a lower-case alpha character. The prefixes go in the sequence 1a, 1b, 1c, and so forth. When we run out of letters at 1z, the initial digit is incremented to start the series 2a, 2b, 2c. The prefix is followed by two spaces to help set the prefix apart, and then any name that concisely describes the query. Typical query names are:
Only digits 1-9 are used, because using “10a” will cause that query to be ordered by Access after query “1a”, which isn’t the correct order. Using this convention, there’s room for 936 queries — more than enough for the applications I’ve written.
Queries should be numbered in the order that they’re created. If you need to insert a query into the middle of a routine, a second digit is added to the prefix. For example:
|0a||Inserted query prior to query 1a|
|1a1||Inserted query here|
|1a2||Second inserted query|
This naming scheme will later be used to control the order in which the queries are to be executed. While it would be nice to number the queries in the order that they’re to be executed, this isn’t practical. It’s virtually impossible to know at the start of development all the queries you’ll need to build, not to mention exactly what order they’ll require in the finished application. Numbering the queries in the order in which they’re created is a good trade-off because you’ll typically create the queries in an order that’s close to the final execution order. As you’ll see, VBA is used to assign the correct order to automatically process the queries later on. Ordering queries as they’re built also tends to keep query procedures grouped together in “blocks.”
When I was first developing this system, I toyed with the idea of using the query creation date to sort the queries. (This date can be seen when the query window view mode is set to “Details.”) However, this sort order doesn’t give me control over the query order. If I need to move queries around, I’m out of luck because I can’t change the creation date.
These prefixes also make it easy to refer to specific queries. For example, it’s much quicker to write down “1a” than a long query name. A simple two- or three-character string is all that’s needed to reference every unique query in a database.
All tables created through Access queries should be named with an M + the name of the query that made them. For example:
Doing this makes it very simple to trace a table back to its parent query. This technique is vital in troubleshooting applications. By using this convention, you can easily pick any point in the application and backtrack through it to retrace its steps. For example, if there’s a problem with table “M4c Percent Gains”, I immediately know to look at query “4c Percent Gains” to see how the table was created. If that query looks fine, I can then check the table query 4c is based on (usually “M4b”), and so on.
Use make-table queries
This brings me to an important lesson I learned early on — use make-table queries. If you have a multi-step calculation or aggregation, you have two options: You can create multiple select queries and nest them one inside the other, or you can make tables from each query individually and base subsequent queries on those tables.
Nesting queries has some major drawbacks. For one thing, they’re difficult to debug. If a query fails, you don’t know if it failed on that query, or on one of its nested queries. The other problem is that using nested queries can cause a PC to quickly run out of memory, since it must hold the results from each of the nested queries to process them all at once.
You can eliminate these two problems by creating intermediate work tables. You’ll also get the added benefit of having the data available for other processes. If you need to reuse any piece of the data in later calculations, you can simply extract it from one of the intermediate tables.
Returning to my example, I named the first query in the application “1a Total Sales by Region”. This query creates a table that will be named “M1a Total Sales by Region”.
The next query used the results from query “1a” to calculate the average price by week. I named this query “1b Avg Price by Week” and the table it created “M1b Avg Price by Week” (see Figure 4). The table made by query “1b” gave me the data needed for the report.
Automating the system
To automate this simple two-query system, I used a VBA function that executes each query in turn. This function gives me control over the order of the query execution (regardless of the order in which they were written). It also gives me a lot of flexibility — I can easily add looping (that is, execute certain queries a set number of times), add decision-making, import data, and so forth.
Function RunQueries() 'Open debug window to view progress DebugWindow 'Call function to execute queries ActionQuery "1a Total Sales by Region" ActionQuery "1b Avg Price by Week" End Function
To execute the queries within the RunQueries() function, I call a VBA function called ActionQuery():
Function ActionQuery(strQueryName As String) Dim qdf As QueryDef Dim db As DATABASE Dim strTableName As String 'First test for table to be made and delete the 'table if found strTableName = "M" & strQueryName If TestForTBL(strTableName) Then Set db = CurrentDb DoCmd.DeleteObject acTable, strTableName End If 'Run action query Debug.Print strQueryName Set db = CurrentDb Set qdf = db.QueryDefs(strQueryName) qdf.Execute dbFailOnError End Function
Before executing the query, this function tests to see whether a table already exists that needs to be deleted. Remember that the table created by a make-table query is always named the same as the query making it, along with a preceding “M.” The TestforTBL() function makes use of a simple query (“00 Get Tablenames”), which gets all the tablenames out of the MSysObjects table that aren’t system tables. If the table exists, it’s deleted before running the make-table query. This is important due to a bug in Access (see the sidebar, “Make-Table Bug”).
Function TestForTBL(strTableName As String) Dim db As DATABASE Dim rs As Recordset Dim strName As String Dim intTables As Integer Dim intI As Integer Set db = CurrentDb() 'Use query to look up tablenames in MSysObjects Set rs = db.OpenRecordset("00 Get Tablenames", _ dbOpenDynaset) rs.MoveLast intTables = rs.RecordCount 'Loop through all tables, checking for match rs.MoveFirst For intI = 1 To intTables strName = rs![Name] 'Check tablename against system tablename If strTableName = strName Then TestForTBL = "True" Exit Function End If rs.MoveNext Next intI 'If match not found, return 'False' value TestForTBL = "False" End Function
The ActionQuery() function works for any action queries, including appends, updates, deletes, and pass-through queries. For those queries, the test for a table in the form of “M + queryname” will fail, and the query will run normally.
There’s also a line of code that prints the queryname to the debug window. This is helpful for testing applications. RunQueries() calls a function that opens the debug window. As the function runs, it will print each query it executes to the debug window. In this way, you can see exactly what Access is doing, and when it’s doing it.
Function DebugWindow() 'Open the debug window SendKeys "^g", True End Function
Keeping large databases organized
The example shown here is a very small application. The real power of this technique becomes apparent when building large databases with hundreds of queries performing various levels of aggregations and calculations. No matter how big it gets, I can still find my way around any database built using this convention, whether the system was written by me or someone else on my team. The end result is our ability to deliver crucial data mining applications in a hurry!
Sidebar: Make-Table Bug
When executing make-table queries from macros or VBA in Access, it’s important that you first check to see whether the table to be created already exists, and if it does, delete it. This is because there’s an Access bug that will allow a make-table query to create a table with the same name as a table that already exists. You can’t delete the “duplicate” table, because Access creates this table without an owner and won’t give you privileges to even view the table. Running the make-table query multiple times will produce multiple copies of the same table, but only the first table can be deleted. The only way to get rid of these duplicate objects is to create a new database and import all of the other objects except for the problem tables. This isn’t my idea of a fun time.
I spent some time trying to pinpoint the cause of this odd behavior so that I might be able to reproduce the bug at will. While I couldn’t narrow it down to any one variable, I did notice some interesting behavior that seems to be related to the problem. When I repaired a database, for example, the problem wouldn’t occur. However, my applications often consist of dozens of queries that are run from VBA or a macro. The process of running these queries somehow “corrupts” the database and allows the problem to occur. Without checking for the table and deleting it with code, I’d have to repair the database each time I ran the application to prevent the problem. Also, this problem didn’t affect all make-table queries in my applications — only some of them. Finally, I only observe this problem when my database is large — at least 150 objects (including tables, queries, forms, macros, and modules).
When I first observed this behavior in Access 95, I contacted Microsoft. While they didn’t offer an explanation for the problem, the service engineer recommended that I use code to test for the table to be created and delete it before running any make-table query. I’ve provided the Test1g.MDB database in the accompanying Download file to give you a chance to play with the problem.