Peter Vogel describes how a combination of SQL and Access generated a solution for a scheduling application. In the process, this article describes how a set-based approach to the problem will let you find the most effective SQL solution. On the way, Peter provides two different solutions to the problem.
One of my clients is a manufacturing company that has an assembly line. The company is a “make to order” organization, so no two items being built are exactly alike. Each unit is customized based on the options selected by the customers. However, while each unit being built is different (and there are several thousand different combinations of options available to my client’s customers), the time it takes to add each option is known. What my client wanted was a prediction of when each item would reach each of the work stations on the assembly line, including the most important station: the end. Given a list of options selected for a product and the time to assemble each product, I had to generate the date when each assembly step would complete.
As I studied the problem, I realized that I’d already solved it. This application would use a set of techniques that I’d developed many years ago for a hospital maintenance system. The hospital job had been one of my first high-pressure systems. If you think that it’s important to keep your server up and running, it’s nothing compared to the importance of making sure that every piece of equipment in a trauma room is in working order. In order to ensure that nothing will fail when you show up at the emergency room door, the hospital had to repair its equipment before it failed. This process, called preventative maintenance, consisted of taking the equipment out of service, performing some standard activities, and putting the equipment back in service.
What the hospital had needed was a system that would generate all of the preventative maintenance activities due to be done in any period of time. What made the problem interesting was that every piece of equipment had a different schedule. The schedule, or plan, for a piece of equipment could be as simple as replacing all hydraulic fluids every three months. Other pieces of equipment might have a complicated plan consisting of a maintenance check every three months, a tune-up every six months, and a complete strip-down and rebuild each year. Sometimes the plan explicitly skipped activities: The quarterly tune-up in my previous example wouldn’t have to be done during the annual strip-down and rebuild, for instance.
And, of course, not everything would go according to plan. If a piece of equipment actually broke down, any regular maintenance would be done as part of the repair. As a result, the subsequent tune-up or maintenance check could be skipped. In these situations, the whole schedule would have to be adjusted so that the next activity would be scheduled based on the repair, not the last preventative maintenance activity.
Creating the tables
As with any database solution, I started by designing the tables for the application. To begin with, I needed a table that would hold the information for the equipment that was to be serviced. An abbreviated version of that information is shown in Table 1 (the full version of the table included information on the vendor who sold the equipment, among other pieces of information).
Table 1. The equipment table.
|EquipId||Unique ID assigned to each piece of equipment|
|PlanName||Name of the schedule that the equipment is to follow|
The next table that I needed had to hold the information for the equipment plans. I needed to record what activity was to be performed and how many weeks were to elapse between each activity. You can see the layout in Table 2, which includes the name of the activity and the number of weeks until the next activity. In Table 3, you can see the data for the complicated plan that I described previously. It shows that the plan begins with a maintenance check, followed 12 weeks later by another maintenance check. Twelve weeks after that, a tune-up and another maintenance check are required. The final quarterly maintenance check follows 12 weeks later, with the annual strip-down and rebuild happening 12 weeks after that. Twelve weeks after the rebuild, the cycle begins again with the first maintenance check. In Figure 1, you can see the Access form that I used to maintain the plan table.
Table 2. The plan table.
|PlanName||Name of a schedule that equipment can follow|
|Step||The position in the plan for this step|
|PlanActivity||Activity to be performed at this step|
|Weeks||Number of weeks until the next activity|
Table 3. A typical plan.
Only one table was left to be defined. As activities were performed, a record of the activity had to be made. In many ways, this was the most important part of the whole process. The records of the preventative maintenance work were an important input to the hospital’s accreditation review. A hospital that wasn’t accredited would have its government funding cut or could even be closed. As I said, this was a high-pressure system. Table 4 shows the design for the activity table, which lists all of the activities for all equipment in the system.
Table 4. The activity table.
|EquipId||ID for the equipment that the activity was performed on|
|Status||Activity Status: 0 = scheduled, 1=completed|
|PlanName||Name of the plan that the activity was performed on|
|Step||Number of the step in the plan for the activity|
|ActDate||Date the activity was performed|
The SQL code
Given the tables that I’d laid out, I could start developing the code that would give me the schedule that the client wanted. The first query that I came up with generated the next activity date. By passing the flag “ww” to the VBA DateAdd function, the function adds the number in the Weeks field table to the date in ActDate to give the date of the next activity (Figure 2 shows the joins between the two tables in the Query design window):
SELECT tblEquip.EquipId, DateAdd("ww",Weeks,ActDate) FROM tblPlan INNER JOIN (tblEquip INNER JOIN tblAct ON tblEquip.EquipId = tblAct.EquipId) ON tblPlan.PlanName = tblAct.PlanName;
Unfortunately, what this query did was join every equipment record to every step in their plan and then to every activity ever performed with that piece of equipment. The record that I wanted was the next date from the last activity performed using the next step in the equipment’s plan. While those records were in the mass of records produced from my query, I needed to whittle down the result to just the records that I wanted.
I decided that the first problem I’d tackle was finding the last activity performed on the piece of equipment. Finding the date of the last activity for each piece of equipment is relatively easy, thanks to the SQL Max function, which returns the largest value in the selected records:
Select EquipId, Max(ActDate) From tblAct Group By EquipId
I could now take my original query and add this query to it in order to restrict my results. The following rewrite of my main query only uses the date generated from the last activity performed on any piece of equipment:
SELECT E.EquipId, DateAdd("ww",Weeks,ActDate) FROM tblPlan INNER JOIN (tblEquip As E INNER JOIN tblAct ON E.EquipId = tblAct.EquipId) ON tblPlan.PlanName = tblAct.PlanName Where ActDate = (Select Max(ActDate) From tblAct Where tblAct.EquipId = E.EquipId)
This version of my query is a correlated subquery. I first assigned the alias “E” to tblEquipment in the From clause of my main query. In my subquery, I used that alias to reference the EquipId field in the subquery, causing the two queries to be correlated. This use of the alias causes the subquery to be run each time a record is generated in the main query. The alias also causes the EquipId in the subquery’s Where clause to be drawn from the current record in the main query. The net result is that the subquery finds the maximum date for the current equipment record in the main query. The Where clause in the main query then restricts the activity records to the activity record for that date.
While this helped, I still needed to find the task record for the next task to be performed. This could also be done using subqueries by: 1) finding the last activity; 2) finding the step performed in that activity; and 3) adding 1 to get the next step:
Select Step + 1 From tblAct Where tblAct.PlanName = P.PlanName And ActDate = (Select Max(ActDate) From tblAct Where tblAct.PlanName = P.PlanName)
I could now put all of this together to generate the date of next activity to be performed for every piece of equipment:
SELECT E.EquipId, DateAdd("ww",Weeks,ActDate) FROM tblPlan AS P INNER JOIN (tblEquip AS E INNER JOIN tblAct ON E.EquipId = tblAct.EquipId) ON P.PlanName = tblAct.PlanName WHERE tblAct.ActDate = (Select Max(ActDate) From tblAct Where tblAct.EquipId = E.EquipId) And P.Step = (Select Max(Step) + 1 From tblAct Where tblAct.PlanName = P.PlanName And tblAct.EquipId = E.EquipId And ActDate = (Select Max(ActDate) From tblAct Where tblAct.EquipId = E.EquipId))
I could have continued to work with this solution, but it had some significant problems. For instance, when the last task in the plan was performed, there would be no “next task.” In this situation, I’d have to fix the query so that it would use the data from the first task in the plan. Another problem occurred when a new piece of equipment was added to the system. It would have no activity records and so would generate no “next activity.” A problem also occurred if an activity needed to have its next task rescheduled.
I could handle the problem of new equipment by generating a dummy “first activity” when new equipment was added. A similar solution would handle rescheduling activities by generating a new “last activity.” When an activity record was created for the last step in a plan, I could reset its Step field to zero, so that I was positioned to recycle through the plan, beginning at step 1.
In general, though, all of these solutions involved generating bad data. The activity record added when a new piece of equipment was added didn’t really record a maintenance activity. Resetting the Step field would mean that many activity records wouldn’t correctly report what activity was performed.
I’ve always believed that solutions that generate this kind of “dummy data” create more problems than they solve. To begin with, for instance, any report on equipment activity would have to be written to skip the dummy record created when a new piece of equipment was added.
I was also concerned about performance. Normally, I don’t worry about performance until the system is running in test mode. Only at that point can I tell where the system is actually too slow. Still, the deeply nested subqueries in my scheduling query, with each subquery re-executed for every record in the main query, would exact a toll as more records were added to the database. Since one of the purposes of the system was to generate new activity records, I could expect the scheduling process to take longer and longer as time went on. I knew that there had to be a better way.
This is often the hardest part of the development process. You’ve done so much work, committed to so much code, and now you’re thinking about throwing it all away. My experience has been that this is often the best thing to do and that, in the end, you’ll finish the project faster than if you try to patch up your existing work. When you get to this point in the project, you’re smarter and more knowledgeable about the problem than you were when you started. It would be foolish not to take advantage of your experience. I deleted all of my queries (but I kept the tables).
A better solution
Re-thinking the problem, I decided to stop thinking procedurally. My whole approach to the problem really reflected typical programming thinking. Fundamentally, I’d been using subqueries like VBA subroutines.
When working with SQL, it’s often better to think in terms of sets. What I was really trying to do was find the last activity record for every piece of equipment. In general, finding the “last” or “first” of anything with SQL is an inappropriate use of SQL. Since SQL has no inherent order, the concepts of “first,” “last,” “second,” or “third” don’t make a lot of sense. The closest that you can get to “first” or “last” in SQL is the built-in Last function, which returns the last record physically added to the database–not necessarily the record that I wanted.
However, there’s a way around the problem. Thinking in terms of sets, what I wanted was the activity record that had no records greater than it. This record would be the last activity record.
My query to find the last activity record began by joining every record in the activity table to every other record in the activity table. Rather than join where two records were equal, however, I joined the records where the ActDate was greater than or equal to the ActDate in the copy of the table:
SELECT A.EquipId, A.ActDate FROM tblAct AS A INNER JOIN tblAct AS B ON A.ActDate <= B.ActDate And A.EquipId = B.EquipId GROUP BY A.EquipId, A.ActDate HAVING Count(*) = 1;
The result would be that every activity record for a piece of equipment would be joined to every activity record for the same piece of equipment, but only where the record was for a later activity. The last activity performed could be joined only to itself, since no other activity record would be greater than it. I could now use the Group By clause to organize the records into subsets of records for the same piece of equipment and activity date. A count of the records for each entry in table “A” would let me find the activity record. The last activity would be in the subset with only one record. The Having clause let me select that single record set:
Select A.EquipId, A.ActDate, Max(A.Step), Max(A.PlanName) From tblAct As A Inner Join tblAct As B On A.ActDate <= B.ActDate And A.EquipId = B.EquipId Group By A.EquipId, A.ActDate Having Count(*) = 1
In a SQL statement’s Select, when using the Group By clause, all fields must either be the fields that you’re grouping on or be used in some summarizing function. I used the Max function to pull out the Step and PlanName fields that I wanted.
Of course, this join of every activity record to every other activity record could also be a real performance hog. I was counting on the query optimizer to find a way to get to the answer quickly.
Having retrieved the last Step and ActDate for every piece of equipment, I now appended them to a temporary holding table:
INSERT INTO tblTemp (EquipId, ActDate, Status, PlanName, Step) SELECT A.EquipId, A.ActDate, 0, Max(A.PlanName), Max(A.Step) FROM tblAct AS A INNER JOIN tblAct AS B ON A.ActDate <= B.ActDate And A.EquipId = B.EquipId GROUP BY A.EquipId, A.ActDate HAVING (((Count(*))=1));
The tblTemp that I generated could now be joined to the Plan table to generate the next activity for every piece of equipment:
SELECT *, DateAdd("ww",Weeks,ActDate) FROM tblTemp AS T INNER JOIN tblPlan AS P ON T.PlanName = P.PlanName Where P.Step = T.Step + 1
Converting this query into an Append query would add the next activity for every piece of equipment to the activity table.
Using the temporary table let me solve the problem with reaching the end of the plan. Before inserting the activity records, I ran a query that updated the temporary table. This query checked to see whether the step in the temporary table was greater than the last step in the plan. If it was, I set the step to 1 to restart the plan. Since this update took place in my temporary table, my real data was left alone:
Update tblTemp As P Set Step = 0 Where Step = (Select Max(Step) From tblPlan Where tblPlan.PlanName = P.PlanName)
Merging with Access
Now that I had my base queries, I was ready to build the application. While my SQL statement could generate the next activity, I needed the ability to generate not one, but several activities. The client might, for instance, want to add all of the work to be done in a one-month period. For pieces of equipment on one-week cycles, this meant that I had to run the query multiple times to add all of the weekly activities in a month.
I enhanced the query to generate all of the activities up to a date specified by the user. My VBA code would execute my query and then check to see whether records had been added to the activity table. If records were added, I ran the scheduling routine again. When no more activity records were added, I knew that all of the activities prior to the cutoff date had been added. The final version of the query looked like this:
PARAMETERS [End Date] DateTime; INSERT INTO tblAct (EquipId, PlanName, Step, Status, ActDate) SELECT T.EquipId, T.PlanName, P.Step, 0, DateAdd("ww",Weeks, ActDate) FROM tblTemp AS T INNER JOIN tblPlan AS P ON T.PlanName = P.PlanName WHERE P.Step = T.Step + 1 AND DateAdd("ww",Weeks, ActDate) < [End Date];
The VBA was short and to the point. I put all of my SQL statements in queries that I could call from my code (and got the benefits of having my queries pre-compiled). The code performed the following steps:
- Count the number of records in the activity table.
- Delete the records in the temporary table.
- Regenerate the records into the temporary table.
- Add records to update the activity table.
- Count the number of records in the activity table.
If records were added, I repeated the process. In Figure 3, you can see the form that allowed users to generate a new schedule. Here’s the code that appears behind the “Schedule” button:
Dim dbs As Database Dim rec As Recordset Dim que As QueryDef Dim lngActCount As Long Set dbs = CurrentDb() Set que = dbs.QueryDefs("qryScheduleWithEndDate") que.Parameters("End Date") = Me.txtEndDate Set rec = dbs.OpenRecordset _ ("Select Count(*) From tblAct") Do While rec(0) > lngActCount lngActCount = rec(0) dbs.QueryDefs("qryTempDelete").Execute dbs.QueryDefs("qryTempAppend").Execute dbs.QueryDefs("qryTempUpdate").Execute que.Execute Set rec = dbs.OpenRecordset _ ("Select Count(*) From tblAct") Loop rec.Close Set rec = Nothing dbs.Close Set dbs = Nothing
The system saw a number of enhancements from the simple process that I’ve described here. For instance, the code I’ve shown you still won’t handle overriding the schedule. I won’t walk you through the details, but that feature was handled by eliminating the temporary table. Instead, I moved two fields from the temporary table to the equipment table to hold the next step to be performed and the date of last activity. If some unexpected activity eliminated the need to perform the next scheduled activity, rescheduling was easy. The users just brought up the equipment record, set the step in the plan that they wanted to perform next, and entered the date of the emergency activity. Using the information in the equipment table, the system would schedule the new step to happen the appropriate number of weeks following the emergency. New equipment was handled the same way, by filling in the equipment table activity date with the date that equipment went into service and by specifying the first step in the plan to execute (which defaulted to 1).
I also modified the insert part of the query so that the new activities were added to a holding table instead of appending directly to the activity table. This let users review the new records and modify dates or activities before committing to the generated records. Committing the schedule just consisted of copying the records in the holding table to the activity table. I also separated the insert of the activity records from the update of the equipment record so that I could update the date of the next activity without adding activity records. This feature allowed the user to specify that activity records were only to be added after a specified date. Initially, my routine would update only the “next activity date” in the equipment record until it reached the start date for the scheduling period. Then, for records after the start date, I updated the equipment table fields and also inserted records into the activity table. Finally, by adding Where clauses to the queries, I provided the ability to limit scheduling to equipment in a particular location or type.
All of these enhancements required only minor changes to my SQL statements. From the beginning, I could have written this system with recordset processing code that read the tables on record-by-record basis. I suspect that would have required considerably more than the dozen-and-a-half lines of code that I used. With that much code, I don’t know if I could have implemented all of those enhancements at a reasonable cost to my client. I’ve included a sample database of the version of the system in the accompanying Download file so that you can experiment with your own enhancements.
Building for the future
I’ve used the techniques that I developed for the hospital on numerous occasions. My client with the assembly line was just the most recent implementation. I’ve also learned to recognize the things that this routine won’t do. It doesn’t handle assigning multiple resources to the same task (that is, putting two people on a job to get it done in half the time). Nor does it handle critical path networks, where some jobs can’t be done until several other jobs are completed. Still, the ability to generate a set of dates given a set of planned activities has turned out to be the solution to a number of problems.
The hospital project was also the first time that I started to “think with sets” and exploit the power of SQL. Combined with Access’s ability to generate a user interface and provide the code to control the execution of SQL statements, I’ve been able to deliver more functionality faster than I could with any other tool.