It’s an unusual Jet database that will convert smoothly to SQL Server. Rick Dobson shows how to mix your own code with the Upsizing Wizard. But in many cases it’s not just Access data that you need to convert, so he shows how Access can let you move Outlook data into SQL Server.
The main reason that I needed to upgrade to SQL Server was that my database exceeded the Jet size limit for a single database file (2GB–the maximum data file size for a SQL Server 2000 database is 32 terabytes). A secondary objective was to keep all my data in a single file. Even after the migration of the database to SQL Server, because all of my data would be in a single database, I could maintain some of the simplicity and familiarity of an Access solution for Jet by using an Access Data Project.
Because the sample database would take about forever to download, there’s no download database for this article. However, all code samples referenced directly or indirectly in the article are included in two code listings that are available for download.
How did I end up with a 2+GB database? The primary purpose for this database was to store e-mail messages with their attachments, pulled from a Microsoft Outlook message store. Migrating the messages from Outlook to a database lets me use queries to distinguish between spam and non-spam messages. I then feed the results of that analysis back to a VBA program, SPAM Blocker for Microsoft Outlook, running behind Outlook, which uses the information to sort incoming messages into good or spam message folders. The database grew from essentially nothing to 2GB in about two years for two reasons: First, the volume of spam was growing rapidly over this period (only a small fraction of the databases was non-spam messages). Second, I expanded the number of e-mail addresses contributing messages to the database.
Using the Access Upsizing Wizard
The migration process for moving the values in a Jet database to a SQL Server database can be divided into three main segments:
- Create a SQL Server database, which acts as a container for other objects, such as tables.
- Create tables in the new SQL Server database based on the design of tables in the original Jet database.
- Copy the data from the Jet database to SQL Server.
The wizard has uneven success at transforming Jet databases to SQL Server databases, depending on the objects in the Jet database. For instance, queries joining three or more tables don’t convert well from Jet to SQL Server. In general, larger databases don’t convert smoothly.
As a result, I wasn’t surprised when the Access Upsizing Wizard failed to convert my e-mail message database when I tried to consolidate all three migration steps into one task. There were three reasons:
- My database was at the 2GB size limit for Jet.
- Two fields (Body and HTMLBody) can have very long string values.
- Column values are populated from over the Internet without any cleaning. Several columns contained any characters that a sender can stuff into an e-mail message.
I did find that the Access Upsizing Wizard was successful at completing the first two steps of the migration process. Fortunately, you can set the wizard to create a database based on the tables in your Jet database without having the wizard populate the tables with values. Some very straightforward ADO code that you run after using the Access Upsizing Wizard can populate the tables.
You open the Access Upsizing Wizard by choosing Tools | Database Utilities | Upsizing Wizard. The fourth screen in the wizard has the most important setting for my problem: the checkbox with the label “Only create the table structure; don’t upsize any data” (see Figure 1). With this checkbox selected, the wizard creates the tables selected earlier in the wizard without attempting to copy values into those tables. Eliminating the attempt to copy values also dramatically speeds up the wizard’s operation.
Populating tables with ADO
After running the wizard, I now needed to get my data from my Jet database to my SQL database. Basically, the strategy I followed was to create two recordsets–one for a table in a Jet database and a second for the corresponding table in the SQL Server database. While there may be more sophisticated solutions, this process lets me add any code required to massage or clean my data. As you’ll see, this method also lets me copy non-Jet data.
In my code, I first opened an ADO connection to the Mail.mdb database file and created an ADO recordset on the table that I wanted to copy. I then opened a connection to the SQL Server database and created a recordset there based on my new, empty table (the connection to the SQL Server database has to specify the adLockOptimistic option so that I could do updates through the connection). Finally, I looped through the recordset from my Jet table and copied the fields to my empty table:
strProvider = _ "Provider=Microsoft.Jet.OLEDB.4.0;" strPath = _ "Data Source = " & _ "\\Cab233a\SpamBlocker\Mail.mdb;" cnn1.ConnectionString = _ strProvider & strPath cnn1.Open rst1.Open "JunkMailProperties", cnn1 cnn2.ConnectionString = _ "Provider=SQLOLEDB; " & _ "Data Source = CAB233a;" & _ "Initial Catalog=MailSQL;" & _ "Integrated Security = SSPI" cnn2.Open rst2.Open "JunkMailProperties", cnn2, _ adOpenKeyset, adLockOptimistic Do Until rst1.EOF rst2.AddNew rst2(1) = rst1(1) rst2(2) = rst1(2) …rest of fields rst2.Update rst1.MoveNext Loop
The complete solution requires three additional procedures to copy values for the other spam table and the two good e-mail message tables. Because these tables can have many rows, each procedure can take a long time to complete. It’s a good idea to call all four procedures from a fifth procedure.
Adding Outlook messages
However, I also needed the ability to copy Outlook messages to my new SQL Server database, something that the Upsizing Wizard won’t handle directly. But I can copy Outlook data to SQL Server, similarly to the way that I copied Jet data.
The main items used for the process include a variable pointing at an Outlook folder from which I was to copy messages and two recordsets–one for a table of properties (for example, the message’s subject and body) and another for a message’s attachment file and display names. The code loops through mail items in the target folder and copies values to one or both recordsets depending on the contents of a given message (if there are no attachments for a message, the code doesn’t copy to the recordset for attachments).
The code discussed in this section uses the MAPI object model to navigate through Outlook items, such as the mail items in a folder. You can recover mail items from any folder for your database. My SPAMBlocker tool stores unwanted mail in a folder named Junk E-mail. Therefore, the code sample in this section iterates through mail items in the Junk E-mail folder. This folder resides at the same level as the default Inbox folder. The code changes slightly if you need to process messages from a folder nested below the default Inbox level.
Because the code accesses mail items, Outlook will display the Outlook security prompt before processing each item. You can bypass this with the help of the Express ClickYes utility from www.express-soft.com/mailmate/clickyes.html (you can turn the Express ClickYes utility on and off from your code as described in my article in the August 2003 issue).
This code from the CopyJunkMailFolderToMailDB procedure illustrates code for pointing the olFldTarget variable at the Junk E-mail folder. Several preceding Dim statements declare variables to appropriate Outlook and MAPI types. For example, the olApp variable is declared as an Application type in the Outlook object model. After finding my e-mail folder, I create a connection to my SQL Server database and I’m ready to start copying data:
Set olApp = Outlook.Application Set olNS = olApp.GetNamespace("MAPI") Set olFldSource = _ olNS.GetDefaultFolder(olFolderInbox) Set olFlds = olNS.Folders.Item(1).Folders Set olFldTarget = olFlds.Item("Junk E-mail") str1 = "Provider=SQLOLEDB;" & _ "Data Source = Cab233a;" & _ "Initial Catalog = MailSQL;" & _ "user id = myid;Password=mypass" cnn1.ConnectionString = str1 cnn1.Open
This segment from the CopyJunkMailFolderToMailDB procedure opens two recordsets that point at the JunkMailProperties and JunkMailAttachments tables, respectively. Then, the code loops through the mail items in the Junk E-mail folder, adding a row to the recordset for the JunkMailProperties table. If the count of attachments for a message is greater than 0, the code adds a new row for each attachment to the recordset for the JunkMailAttachments table. The code copies the value from the first field in the current row for the first recordset to the first field in the current row for the second recordset. In this way, the code assures that the MailID column in the JunkMailProperties and JunkMailAttachments tables are synchronized:
rst1.Open "JunkMailProperties", cnn1, _ adOpenKeyset, adLockOptimistic rst2.Open "JunkMailAttachments", cnn1, _ adOpenKeyset, adLockOptimistic For I = 1 To olFldTarget.Items.Count Set olMail = olFldTarget.Items(I) rst1.AddNew rst1(1) = olMail.SenderName rst1(2) = Left(olMail.To, 255) rst1(3) = olMail.ReceivedTime rst1(4) = olMail.Subject rst1(5) = olMail.Body rst1(6) = olMail.HTMLBody rst1.Update If olMail.Attachments.Count > 0 Then Dim int1 As Integer For int1 = 1 To olMail.Attachments.Count rst2.AddNew rst2(0).Value = rst1(0).Value rst2(1) = int1 rst2(2) = _ olMail.Attachments(int1).FileName rst2(3) = _ olMail.Attachments(int1).DisplayName rst2.Update Next End If Next I
Things are different after you move to SQL Server. SQL Server is slower to show column values in a datasheet view of the JunkMailProperties than Jet was, for instance. Returning long strings in SQL Server seemed to adversely impact performance more than returning the same values for the Jet database.
Unfortunately, I also needed to redesign many of my queries for finding spam indicators because SQL Server doesn’t support many commonly used Jet database features. For example, the Instr function can be used in Jet queries to find the starting position of one string value within another. SQL Server doesn’t support this function, but you can achieve comparable results with the T-SQL PATINDEX function (the syntax is not identical).
My original Access solution for the Jet database referenced custom VBA functions in some queries, something SQL Server doesn’t support. My workaround in this case was to write a VBA procedure that connects to the messages databases, loops through these messages, and calls my custom VBA functions passing values returned from each message. The main VBA function can run from any module in an .adp or .mdb file.
The demands of converting a very large database are different from those for a small or medium-sized database, and this article gives you the tricks for succeeding with a large database.
Other Related Pages