In the good old days, creating Word documents from Access required extensive knowledge of the Word object model plus some significant programming knowledge. With Word 2003, that’s all changed. Peter Vogel illustrates how you don’t even need Access 2003 to take advantage of Word’s XML functionality.
In Office 2003, virtually all of the Office applications acquire XML functionality. By and large, as Access developers, you won’t care (though Office developers will find the XML extensions to the Excel and Word object models very useful). However, Access developers often find themselves calling on the power of Word to create documents from their Access applications. Up until now, to create Word documents you’d have to load Word into memory (an expensive proposition) and then manipulate the Word object model (not an intuitive task). Word 2003 makes the whole process much simpler—and you can do it with any version of Access.
With Office 2003, all of the Office applications except Outlook can save and load their documents in an XML format. Excel has had this feature since Excel 2002 (part of Office XP), though with significant limitations (you couldn’t save objects or charts embedded in your spreadsheet, for instance). Word 2003, on the other hand, supports all the functionality of Word when a document is saved as XML.
So what does this have to do with you? It means that you can create a Word 2003 document just by concatenating together a bunch of string variables and data. You could use an XML tool for creating the document (for instance, a DOM or SAX parser), but that may be more technology than you need. Also, one of the benefits of using Word’s XML dialect (called WordML) is that you don’t need to use Word to create a Word document—so why throw away part of that benefit by loading some XML tool?
The simplest Word document
Creating a WordML document is easy. This is the simplest possible document that you can create in WordML and have it successfully loaded and displayed by Word 2003:
<?xml version='1.0'?>
<w:wordDocument
  xmlns:w='http://schemas.microsoft.com/office/word/2003/2/wordml'>
  <w:body>
    <w:p>
      <w:r>
        <w:t>Hello, World.</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:wordDocument>
The document opens with the XML declaration. It’s followed by the wordDocument tag that encloses the rest of the WordML tags. The body tag holds the displayable text in the document. Within the body tag, my example has p (for paragraph) tags, r (for run) tags, and t (for text) tags. Nestled in my t tags is the actual text that I’m displaying: “Hello, World.” The result, displayed in Word 2003, can be seen in Figure 1.
Figure 1
The most critical part of this document is the part of the wordDocument tag beginning with xmlns. The xmlns pseudo-attribute defines a namespace (“http://schemas.microsoft.com/office/word/2003/2/wordml”) and associates it with the prefix w. A namespace is just an arbitrary string of characters, though most namespaces tend to resemble web URLs in format. When Word processes the document, it only responds to tags that are tied back to that namespace. As a result, it’s important that you enter the namespace exactly. In fact, since I’m working with a beta version of Word 2003, it’s entirely possible that the arbitrary string of characters that makes up the WordML namespace may change by the time you read this. It’s not hard to find out what the right namespace is, though. Just create a document in Word 2003, save it as XML, and open the document in either Notepad or Internet Explorer. The right namespace will be defined with the w prefix in the first couple of lines of the file.
At the risk of being too obvious, here’s the VBA code to create my sample document:
' Requires a reference to the Microsoft Scripting Runtime
' (Tools | References in the VBA editor)
Dim strWordDocument As String
strWordDocument = "<?xml version='1.0'?>" & _
    "<w:wordDocument xmlns:w='" & _
    "http://schemas.microsoft.com/" & _
    "office/word/2003/2/wordml'>" & _
    "<w:body><w:p><w:r><w:t>Hello, World.</w:t>" & _
    "</w:r></w:p></w:body></w:wordDocument>"

Dim fl As Scripting.FileSystemObject
Dim txt As Scripting.TextStream
Set fl = New Scripting.FileSystemObject
' ForWriting replaces any existing file; ForAppending would
' tack a second document onto the end of the old one
Set txt = fl.OpenTextFile("c:\MyWord.XML", _
    ForWriting, True)
txt.Write strWordDocument
txt.Close
Set txt = Nothing
Set fl = Nothing
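Once you start concatenating real data into the document instead of a fixed string, there's one catch: characters such as & and < have special meaning in XML and will keep the file from loading unless they're escaped. Here's a minimal sketch of how that might be handled. The function names XmlEscape and MinimalWordML are my own inventions for illustration, the namespace is the one used in this article (check it against a file saved by your copy of Word), and the Replace function assumes Access 2000 or later.

```vba
' Replace the characters that have special meaning in XML.
' The ampersand is handled first so that the entities produced
' by the later replacements aren't escaped a second time.
Public Function XmlEscape(ByVal strText As String) As String
    strText = Replace(strText, "&", "&amp;")
    strText = Replace(strText, "<", "&lt;")
    strText = Replace(strText, ">", "&gt;")
    XmlEscape = strText
End Function

' Wrap a single piece of text in the minimal WordML document.
' The namespace string is copied from the article's sample.
Public Function MinimalWordML(ByVal strText As String) As String
    MinimalWordML = "<?xml version='1.0'?>" & _
        "<w:wordDocument xmlns:w='" & _
        "http://schemas.microsoft.com/" & _
        "office/word/2003/2/wordml'>" & _
        "<w:body><w:p><w:r><w:t>" & XmlEscape(strText) & _
        "</w:t></w:r></w:p></w:body></w:wordDocument>"
End Function
```

With these in place, a call like MinimalWordML("Smith & Sons") produces a document whose text is stored as Smith &amp;amp; Sons, which Word displays as the original company name.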
Integrating with the desktop
In my sample, I’ve created a document with the extension .XML. If the user double-clicks on the file, it will be opened in the default XML processor, Internet Explorer. I could just as easily have created the document with the .DOC extension (“MyWord.DOC”). If a user double-clicked on the document, it would be opened in Word 2003 just like a legacy Word document. You can also save the document with the .XML extension, but flag the document to be opened with Microsoft Word 2003. All that’s necessary is to add an XML processing instruction to the top of your document to indicate that Word is the preferred tool for processing the document:
<?xml version='1.0'?>
<?mso-application progid='Word.Document'?>
<w:wordDocument ...
If you save the document with the extension .XML, the icon on the desktop is a combination of the Word icon and the XML icon. However, when the user double-clicks on it, the file will open in Word just as if it had the .DOC extension.
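In VBA, adding the processing instruction is just one more string in the concatenation. The following is a sketch rather than a finished routine: WordMLWithProgId and SaveTextFile are hypothetical names, the namespace is the one used in this article, and the body text passed in is assumed to be well-formed WordML already. I've also used VBA's built-in file I/O here as an alternative to the FileSystemObject, which avoids the reference to the Scripting Runtime.

```vba
' Build a complete WordML document that asks Windows to open it in Word.
' strBody is assumed to hold well-formed WordML for the inside of w:body.
Public Function WordMLWithProgId(ByVal strBody As String) As String
    WordMLWithProgId = "<?xml version='1.0'?>" & _
        "<?mso-application progid='Word.Document'?>" & _
        "<w:wordDocument xmlns:w='" & _
        "http://schemas.microsoft.com/" & _
        "office/word/2003/2/wordml'>" & _
        "<w:body>" & strBody & "</w:body></w:wordDocument>"
End Function

' Write a string to a file with VBA's built-in file I/O.
' Output mode replaces any existing file of the same name.
Public Sub SaveTextFile(ByVal strPath As String, _
        ByVal strText As String)
    Dim intFile As Integer
    intFile = FreeFile
    Open strPath For Output As #intFile
    ' The trailing semicolon suppresses the line break
    ' that Print would otherwise add
    Print #intFile, strText;
    Close #intFile
End Sub
```

A call such as SaveTextFile "c:\MyWord.XML", WordMLWithProgId("<w:p><w:r><w:t>Hello, World.</w:t></w:r></w:p>") would then produce a file that keeps its .XML extension but opens in Word.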
Formatting
Anything that you can do in Word, you can do in WordML. This includes inserting fields, adding hyperlinks, creating tables, defining styles, and more. Rather than drag you through all of those details, I’ll give you a look at your ability to format text in a Word document. For my example, I’ll assume that you want to send a letter like this:
Dear Mr. James:
We have been very patient about your outstanding account, due since April 3, 1882. We wish to advise you that if we haven’t received full payment of the $200.00 within the next 30 days, we will seek legal action.
This letter consists of two paragraphs containing both bolded and italicized text. Creating multiple paragraphs is just a matter of enclosing your text in separate p tags. In this example, I’ve used an empty p tag (<w:p/>) to create the blank line between my two paragraphs:
<w:body>
  <w:p>
    <w:r>
      <w:t>Dear Mr. James:</w:t>
    </w:r>
  </w:p>
  <w:p/>
  <w:p>
    <w:r>
      <w:t> We have been... . Legal action. </w:t>
    </w:r>
  </w:p>
</w:body>
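Generating this paragraph structure from VBA is a simple loop. Here's one way it might look; ParagraphsToWordML is a hypothetical name, and the sketch assumes that the strings passed in contain no XML special characters (&, <, or >) or have already been escaped.

```vba
' Turn an array of strings into a series of WordML paragraphs,
' with an empty paragraph (a blank line) between each pair.
Public Function ParagraphsToWordML(varParagraphs As Variant) As String
    Dim lngIndex As Long
    Dim strResult As String

    For lngIndex = LBound(varParagraphs) To UBound(varParagraphs)
        ' Add the blank line before every paragraph except the first
        If lngIndex > LBound(varParagraphs) Then
            strResult = strResult & "<w:p/>"
        End If
        strResult = strResult & "<w:p><w:r><w:t>" & _
            varParagraphs(lngIndex) & "</w:t></w:r></w:p>"
    Next lngIndex

    ParagraphsToWordML = strResult
End Function
```

Calling ParagraphsToWordML(Array("Dear Mr. James:", "We have been very patient.")) returns two p tags separated by an empty one, ready to be placed inside the body tag.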
To format parts of your text differently from other parts, you need to break your text up into separate r tags. An r tag represents a run of text: a set of characters that are all formatted in the same way. You can have multiple runs of text in the same paragraph. A run can also hold more than one t tag. In this example, I’ve used the br tag to insert a soft break between two t tags in the same run:
<w:r>
  <w:t>I think that I shall never see</w:t>
  <w:br/>
  <w:t>A poem as lovely as a tree</w:t>
</w:r>
The result of using the br tag is that the two lines of the poem will appear on separate lines without starting a new paragraph.
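If your data already contains line breaks (a multi-line address field, for instance), you can convert them to br tags with a single call to Replace. This is a sketch under a few assumptions: TextToRunWithBreaks is a hypothetical name, the text uses vbCrLf as its line separator and has no XML special characters left unescaped, and the br tag sits between t tags inside a single run.

```vba
' Convert the line breaks in a string into w:br tags, closing
' and reopening the w:t tag around each break so that all of
' the text stays in one run.
Public Function TextToRunWithBreaks(ByVal strText As String) As String
    TextToRunWithBreaks = "<w:r><w:t>" & _
        Replace(strText, vbCrLf, "</w:t><w:br/><w:t>") & _
        "</w:t></w:r>"
End Function
```

Passing in the two lines of the poem joined by vbCrLf gives back one run containing the first line, a br tag, and the second line.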
To bold my text, I first need to break it out into separate runs:
<w:p>
  <w:r>
    <w:t> We have been very patient about your outstanding account, due since </w:t>
  </w:r>
  <w:r>
    <w:t> April 3, 1882. </w:t>
  </w:r>
</w:p>
For the run that I want to display in boldface, I need to set the bold property for the run. The properties for a run are held in an rPr tag (run properties) within the r tag. To set text to bold, you use the b tag, which has an attribute called val. You must set the val attribute to “on”:
<w:r>
  <w:rPr>
    <w:b w:val='on'/>
  </w:rPr>
  <w:t> $200.00 </w:t>
</w:r>
Italics are handled in the same way, using the i tag instead of the b tag. With either tag, it’s essential that you use the w prefix with the val attribute and that the word “on” is lowercase.
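Because every formatted run follows the same pattern, it's worth wrapping the pattern in a function. Here's a minimal sketch; MakeRun and its parameters are hypothetical names of my own, and the text is assumed to contain no unescaped XML special characters.

```vba
' Build a single run, adding an rPr tag only when the
' caller asks for bold and/or italic formatting.
Public Function MakeRun(ByVal strText As String, _
        Optional ByVal blnBold As Boolean = False, _
        Optional ByVal blnItalic As Boolean = False) As String
    Dim strProps As String

    ' Note the w prefix on val and the lowercase "on"
    If blnBold Then strProps = strProps & "<w:b w:val='on'/>"
    If blnItalic Then strProps = strProps & "<w:i w:val='on'/>"
    If Len(strProps) > 0 Then
        strProps = "<w:rPr>" & strProps & "</w:rPr>"
    End If

    MakeRun = "<w:r>" & strProps & _
        "<w:t>" & strText & "</w:t></w:r>"
End Function
```

A sentence with one bold amount then becomes a matter of concatenating runs inside a p tag: "<w:p>" & MakeRun("full payment of the ") & MakeRun("$200.00", True) & MakeRun(" within the next 30 days") & "</w:p>".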
Conclusion
As you can see, if you can declare a string variable, you can create a Word 2003 document. You don’t need to load Word, and you don’t need to learn the Word object model. There are supposed to be more than 3,000 tags in WordML, but you’ll only need a couple of dozen to handle all the documents that an Access developer would need to create. The Office documentation includes a description of every tag and what it does.
The sample database in this month’s issue generates a form letter from the customer list in the Northwind database. In a future issue we’ll look at ExcelML and show you how to generate Excel spreadsheets from Access without using Excel.