API Kits for Working with ODF

In the previous sections, I worked directly with the ODF specification and the validator, using trial and error to generate valid ODF files. In this section, you’ll move up the abstraction ladder and look at libraries (API kits or wrapper libraries) that work with ODF. Such libraries can be a huge help if they are implemented well and reflect a conscientious effort by their authors to wrestle with some of the issues I discussed in the previous section.

You can find a good list of tools that support ODF here:

http://en.wikipedia.org/wiki/OpenDocument_software

You can find another good list here:

http://opendocumentfellowship.com/development/tools

In the next two subsections, I’ll cover two API kits: Odfpy (for Python) and OpenDocumentPHP (for PHP).

Odfpy

I’ll first use Odfpy to generate a minimalist document and then to re-create the full-blown ODF text document from earlier in the chapter. To use it, follow the documentation here:

http://opendocumentfellowship.com/files/api-for-odfpy.odt

You can access the code via Subversion:

svn export http://opendocumentfellowship.com/repos/odfpy/trunk odfpy

To generate a “Hello World” document, use this:

            from odf.opendocument import OpenDocumentText
            from odf.text import P
            
            textdoc = OpenDocumentText()
            p = P(text="Hello World!")
            textdoc.text.addElement(p)
            textdoc.save("helloworld_odfpy.odt")
         

This code will generate helloworld_odfpy.odt with the following file structure:

            File Name                                             Modified             Size
            mimetype                                       2007-12-03 15:06:20           39
            styles.xml                                     2007-12-03 15:06:20          403
            content.xml                                    2007-12-03 15:06:20          472
            meta.xml                                       2007-12-03 15:06:20          426
            META-INF/manifest.xml                          2007-12-03 15:06:20          691
         

But the generated instance doesn’t validate (according to the ODF Validator), even though OO.o 2.2 has no problem reading the file. For many practical purposes, this may be OK, though it’d be nice to know that a document coming out of Odfpy is valid since that’s the stated design goal of Odfpy.
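
Before reaching for the validator, it can help to look at how the package itself is laid out. The following is a minimal sketch that uses only Python’s standard zipfile module; the function name describe_odf_package is my own and is not part of Odfpy. It lists the members of an ODF file and checks two packaging conventions described in the ODF specification: that mimetype is the first entry in the archive and that it is stored without compression. Passing these checks says nothing about whether content.xml validates against the schema.

```python
import zipfile


def describe_odf_package(path):
    """Return (members, mimetype, mimetype_ok) for an ODF package.

    members is a list of (name, size) tuples in archive order.
    mimetype is the text of the 'mimetype' member, or None if it is absent.
    mimetype_ok is True when 'mimetype' is the first entry and is stored
    without compression.
    """
    with zipfile.ZipFile(path) as package:
        infos = package.infolist()
        members = [(info.filename, info.file_size) for info in infos]

        mimetype = None
        if "mimetype" in package.namelist():
            mimetype = package.read("mimetype").decode("ascii")

        first = infos[0] if infos else None
        mimetype_ok = (
            first is not None
            and first.filename == "mimetype"
            and first.compress_type == zipfile.ZIP_STORED
        )
    return members, mimetype, mimetype_ok
```

Calling describe_odf_package("helloworld_odfpy.odt") should report the same members as the listing above, which makes it a quick first check before running the full validator.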

Re-creating the Example ODF Text Document

Let’s now use Odfpy to generate a more substantial document. The following code demonstrates how you can use Odfpy to re-create the full-blown ODF text document from earlier in the chapter. The code is a rather literal translation of the markup to the corresponding object model of Odfpy and should give you a feel for how to use Odfpy.

               # odfpy_gen_example.py
               
               """
               
                 Description:  This program uses odfpy to generate a simple ODF text document
                 odfpy:  http://opendocumentfellowship.com/projects/odfpy
                 documentation for odfpy:  http://opendocumentfellowship.com/files/api-for-odfpy.odt
               
               """
               
               from odf.opendocument import OpenDocumentText
               from odf.style import (Style, TextProperties, ParagraphProperties,
                                      ListLevelProperties, FontFace)
               from odf.text import (P, H, A, S, List, ListItem, ListStyle,
                                     ListLevelStyleBullet, ListLevelStyleNumber, Span)
               from odf.text import Note, NoteBody, NoteCitation
               from odf.office import FontFaceDecls
               from odf.table import Table, TableColumn, TableRow, TableCell
               from odf.draw import Frame, Image
               
               # fname is the path for the output file
               fname = '[PATH-FOR-OUTPUT-FILE]'
               #fname='D:\Document\PersonalInfoRemixBook\examples\ch17\odfpy_gen_example.odt'
               
               # instantiate an ODF text document (odt)
               textdoc = OpenDocumentText()
               
               # styles
               """
               <style:style style:name="Standard" style:family="paragraph" style:class="text"/>
               <style:style style:name="Text_20_body" style:display-name="Text body"
                style:family="paragraph"
                style:parent-style-name="Standard" style:class="text">
                <style:paragraph-properties fo:margin-top="0in" fo:margin-bottom="0.0835in"/>
               </style:style>
               """
               
               s = textdoc.styles
               
               StandardStyle = Style(name="Standard", family="paragraph")
               StandardStyle.addAttribute('class','text')
               s.addElement(StandardStyle)
               
               TextBodyStyle = Style(name="Text_20_body",family="paragraph", 
               parentstylename='Standard', displayname="Text body")
               TextBodyStyle.addAttribute('class','text')
               TextBodyStyle.addElement(ParagraphProperties(margintop="0in", 
               marginbottom="0.0835in"))
               s.addElement(TextBodyStyle)
               
               # font declarations
               """
                <office:font-face-decls>
                   <style:font-face style:name="Arial" svg:font-family="Arial"
                     style:font-family-generic="swiss"
                     style:font-pitch="variable"/>
                 </office:font-face-decls>
               """
               
               textdoc.fontfacedecls.addElement((FontFace(name="Arial",fontfamily="Arial", 
               fontfamilygeneric="swiss",fontpitch="variable")))
               
               # Automatic Style
               
               # P1
               """
               <style:style style:name="P1" style:family="paragraph"
                     style:parent-style-name="Standard"
                     style:list-style-name="L1"/>
               """
               P1style = Style(name="P1", family="paragraph", parentstylename="Standard", 
               liststylename="L1")
               textdoc.automaticstyles.addElement(P1style)
               
               # L1
               """
               <text:list-style style:name="L1">
                 <text:list-level-style-bullet text:level="1"
                   text:style-name="Numbering_20_Symbols"
                   style:num-suffix="." text:bullet-char="•">
                   <style:list-level-properties text:space-before="0.25in"
                     text:min-label-width="0.25in"/>
                   <style:text-properties style:font-name="StarSymbol"/>
                 </text:list-level-style-bullet>
               </text:list-style>
               """
               L1style=ListStyle(name="L1")
               # u'\u2022' is the bullet character (http://www.unicode.org/charts/PDF/U2000.pdf)
               bullet1 = ListLevelStyleBullet(level="1", stylename="Numbering_20_Symbols", 
               numsuffix=".", bulletchar=u'\u2022')
               L1prop1 = ListLevelProperties(spacebefore="0.25in", minlabelwidth="0.25in")
               bullet1.addElement(L1prop1)
               L1style.addElement(bullet1)
               textdoc.automaticstyles.addElement(L1style)
               
               # P6
               """
                 <style:style style:name="P6" style:family="paragraph"
                     style:parent-style-name="Standard"
                     style:list-style-name="L5"/>
               """
               
               P6style = Style(name="P6", family="paragraph", parentstylename="Standard", 
               liststylename="L5")
               textdoc.automaticstyles.addElement(P6style)
               
               # L5
               """
               <text:list-style style:name="L5">
                 <text:list-level-style-number text:level="1"
                   text:style-name="Numbering_20_Symbols"
                   style:num-suffix="." style:num-format="1">
                   <style:list-level-properties text:space-before="0.25in"
                     text:min-label-width="0.25in"/>
                 </text:list-level-style-number>
               </text:list-style>
               """
               
               L5style=ListStyle(name="L5")
               numstyle1 = ListLevelStyleNumber(level="1", stylename="Numbering_20_Symbols", 
               numsuffix=".", numformat='1')
               L5prop1 = ListLevelProperties(spacebefore="0.25in", minlabelwidth="0.25in")
               numstyle1.addElement(L5prop1)
               L5style.addElement(numstyle1)
               textdoc.automaticstyles.addElement(L5style)
               
               # T1
               """
                  <style:style style:name="T1" style:family="text">
                     <style:text-properties fo:font-style="italic" style:font-style-asian="italic"
                       style:font-style-complex="italic"/>
                   </style:style>
               """
               T1style = Style(name="T1", family="text")
               T1style.addElement(TextProperties(fontstyle="italic",fontstyleasian="italic",
               fontstylecomplex="italic"))
               textdoc.automaticstyles.addElement(T1style)
               
               # T2
               """
                <style:style style:name="T2" style:family="text">
                     <style:text-properties fo:font-weight="bold" style:font-weight-asian="bold"
                       style:font-weight-complex="bold"/>
                   </style:style>
               """
               T2style = Style(name="T2", family="text")
               T2style.addElement(TextProperties(fontweight="bold",fontweightasian="bold",
               fontweightcomplex="bold"))
               textdoc.automaticstyles.addElement(T2style)
               
               # T5
               """
                  <style:style style:name="T5" style:family="text">
                     <style:text-properties fo:color="#ff0000" style:font-name="Arial"/>
                   </style:style>
               """
               T5style = Style(name="T5", family="text")
               T5style.addElement(TextProperties(color="#ff0000",fontname="Arial"))
               textdoc.automaticstyles.addElement(T5style)
               
               # now construct what goes into <office:text>
               
               h=H(outlinelevel=1, text='Purpose (Heading 1)')
               textdoc.text.addElement(h)
               p = P(text="The following sections illustrate various possibilities in ODF Text", 
               stylename='Text_20_body')
               textdoc.text.addElement(p)
               
               textdoc.text.addElement(H(outlinelevel=2,
                   text='A simple series of paragraphs (Heading 2)'))
               textdoc.text.addElement(P(text="This section contains a series of paragraphs.", 
               stylename='Text_20_body'))
               textdoc.text.addElement(P(text="This is a second paragraph.", 
               stylename='Text_20_body'))
               textdoc.text.addElement(P(text="And a third paragraph.", stylename='Text_20_body'))
               
               textdoc.text.addElement(H(outlinelevel=2,text='A section with lists (Heading 2)'))
               textdoc.text.addElement(P(text="Elements to illustrate:"))
               
               # add the first list (unordered list)
               textList = List(stylename="L1")
               item = ListItem()
               item.addElement(P(text='hyperlinks', stylename="P1"))
               textList.addElement(item)
               
               item = ListItem()
               item.addElement(P(text='italics and bold text', stylename="P1"))
               textList.addElement(item)
               
               item = ListItem()
               item.addElement(P(text='lists (ordered and unordered)', stylename="P1"))
               textList.addElement(item)
               
               textdoc.text.addElement(textList)
               
               # add the second (ordered) list
               
               textdoc.text.addElement(P(text="How to figure out ODF"))
               
               textList = List(stylename="L5")
               #item = ListItem(startvalue=P(text='item 1'))
               item = ListItem()
               # P6 is the paragraph style tied to list style L5
               item.addElement(P(text='work out the content.xml tags', stylename="P6"))
               textList.addElement(item)
               
               item = ListItem()
               item.addElement(P(text='work styles into the mix', stylename="P6"))
               textList.addElement(item)
               
               item = ListItem()
               item.addElement(P(text='figure out how to apply what we learned to '
                                      'spreadsheets and presentations', stylename="P6"))
               textList.addElement(item)
               
               textdoc.text.addElement(textList)
               
               # A paragraph with bold, italics, font change, and hyperlinks
               """
                     <text:p>The <text:span text:style-name="T1">URL</text:span> for <text:span
                         text:style-name="T5">Flickr</text:span> is <text:a xlink:type="simple"
                         xlink:href="http://www.flickr.com/"
                         >http://www.flickr.com</text:a>. <text:s/>The <text:span
                         text:style-name="T2"
                         >API page</text:span> is <text:a xlink:type="simple"
                         xlink:href="http://www.flickr.com/services/api/"
                       >http://www.flickr.com/services/api/</text:a></text:p>
               """
               p = P(text='The ')
               # italicized URL
               s = Span(text='URL', stylename='T1')
               p.addElement(s)
               p.addText(' for ')
               # Flickr in red and Arial font
               p.addElement(Span(text='Flickr',stylename='T5'))
               p.addText(' is ')
               # link
               link = A(type="simple",href="http://www.flickr.com", text="http://www.flickr.com")
               p.addElement(link)
               p.addText('.  The ')
               # API page in bold
               s = Span(text='API page', stylename='T2')
               p.addElement(s)
               p.addText(' is ')
               link = A(type="simple",href="http://www.flickr.com/services/api", 
               text="http://www.flickr.com/services/api")
               p.addElement(link)
               
               textdoc.text.addElement(p)
               
               # add the table
               """
               <table:table-column table:number-columns-repeated="3"/>
               """
               
               textdoc.text.addElement(H(outlinelevel=1,text='A Table (Heading 1)'))
               
               table = Table(name="Table 1")
               
               table.addElement(TableColumn(numbercolumnsrepeated="3"))
               
               # first row
               tr = TableRow()
               table.addElement(tr)
               tc = TableCell(valuetype="string")
               tc.addElement(P(text='Website'))
               tr.addElement(tc)
               tc = TableCell(valuetype="string")
               tc.addElement(P(text='Description'))
               tr.addElement(tc)
               tc = TableCell(valuetype="string")
               tc.addElement(P(text='URL'))
               tr.addElement(tc)
               
               # second row
               tr = TableRow()
               table.addElement(tr)
               tc = TableCell(valuetype="string")
               tc.addElement(P(text='Flickr'))
               tr.addElement(tc)
               tc = TableCell(valuetype="string")
               tc.addElement(P(text='A social photo sharing site'))
               tr.addElement(tc)
               tc = TableCell(valuetype="string")
               
               link = A(type="simple",href="http://www.flickr.com", text="http://www.flickr.com")
               p = P()
               p.addElement(link)
               tc.addElement(p)
               
               tr.addElement(tc)
               
               # third row
               tr = TableRow()
               table.addElement(tr)
               tc = TableCell(valuetype="string")
               tc.addElement(P(text='Google Maps'))
               tr.addElement(tc)
               tc = TableCell(valuetype="string")
               tc.addElement(P(text='An online map'))
               tr.addElement(tc)
               tc = TableCell(valuetype="string")
               
               link = A(type="simple",href="http://maps.google.com", text="http://maps.google.com")
               p = P()
               p.addElement(link)
               tc.addElement(p)
               tr.addElement(tc)
               
               textdoc.text.addElement(table)
               
               # paragraph with footnote
               
               """
                  <text:h text:outline-level="1">Footnotes (Heading 1)</text:h>
                     <text:p>This sentence has an accompanying footnote.<text:note text:id="ftn0"
                         text:note-class="footnote">
                         <text:note-citation>1</text:note-citation>
                         <text:note-body>
                           <text:p text:style-name="Footnote">You are reading a footnote.</text:p>
                         </text:note-body>
                       </text:note>
                       <text:s text:c="2"/>Where does the text after a footnote go?</text:p>
               """
               
               textdoc.text.addElement(H(outlinelevel=1,text='Footnotes (Heading 1)'))
               p = P()
               textdoc.text.addElement(p)
               p.addText("This sentence has an accompanying footnote.")
               note = Note(id="ftn0", noteclass="footnote")
               p.addElement(note)
               note.addElement(NoteCitation(text='1'))
               notebody = NoteBody()
               note.addElement(notebody)
               notebody.addElement(P(stylename="Footnote", text="You are reading a footnote."))
               p.addElement(S(c=2))
               p.addText("Where does the text after a footnote go?")
               
               # Insert the photo
               
               """
                    <text:h text:outline-level="1">An Image</text:h>
                     <text:p>
                       <draw:frame draw:name="graphics1" text:anchor-type="paragraph"
                         svg:width="5in"
                         svg:height="6.6665in" draw:z-index="0">
                         <draw:image xlink:href="Pictures/campanile_fog.jpg" xlink:type="simple"
                           xlink:show="embed"
                           xlink:actuate="onLoad"/>
                       </draw:frame>
                     </text:p>
               """
               
               textdoc.text.addElement(H(outlinelevel=1,text='An Image'))
               p = P()
               textdoc.text.addElement(p)
               # add the image
               # img_path is the local path of the image to include
               img_path = '[PATH-FOR-IMAGE]'
               #img_path = 'D:\Document\PersonalInfoRemixBook\examples\ch17\campanile_fog.jpg'
               href = textdoc.addPicture(img_path)
               f = Frame(name="graphics1", anchortype="paragraph", width="5in", height="6.6665in", 
               zindex="0")
               p.addElement(f)
               img = Image(href=href, type="simple", show="embed", actuate="onLoad")
               f.addElement(img)
               
               # save the document
               textdoc.save(fname)
            

You can examine the output from this code:

http://examples.mashupguide.net/ch17/odfpy_gen_example.odt
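
If you would rather check the generated file programmatically than open it in OO.o, you can read content.xml back out of the package. The sketch below uses only the Python standard library; count_text_elements is a hypothetical helper of my own, not an Odfpy call. It counts the headings (text:h) and paragraphs (text:p) in the document, which gives a quick sanity check that the script wrote what you expected.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI bound to the text: prefix in ODF 1.x documents.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def count_text_elements(odt_path):
    """Count text:h and text:p elements in content.xml of an .odt file."""
    with zipfile.ZipFile(odt_path) as package:
        root = ET.fromstring(package.read("content.xml"))

    # iter() walks the whole tree, so paragraphs nested inside list items,
    # table cells, and footnotes are counted as well.
    headings = sum(1 for _ in root.iter("{%s}h" % TEXT_NS))
    paragraphs = sum(1 for _ in root.iter("{%s}p" % TEXT_NS))
    return {"headings": headings, "paragraphs": paragraphs}
```

Because the count includes nested paragraphs, expect the paragraph total for odfpy_gen_example.odt to be larger than the number of top-level body paragraphs in the script.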

OpenDocumentPHP

OpenDocumentPHP (http://opendocumentphp.org/) is a PHP API kit for ODF in its early stages of development.

In this section, I’ll show how to use OpenDocumentPHP version 0.5.2, which you can get from here:

http://downloads.sourceforge.net/opendocumentphp/OpenDocumentPHP-0.5.2.zip

Alternatively, you can install OpenDocumentPHP using PEAR:

http://opendocumentphp.org/index.php/home/11-new-pear-server-for-opendocumentphp

Some autogenerated documentation of the API is available here:

http://opendocumentphp.org/static/apidoc/svn/

Unzip the file in your PHP library area. To see a reasonably complicated example of what you can do, consult the samples in OpenDocumentPHP/samples.

Here is a simple “Hello World” script to demonstrate how to get started with the library:

            <?php
            require_once 'OpenDocumentPHP/OpenDocumentText.php';
            $text = new OpenDocumentText(
                'D:\Document\PersonalInfoRemixBook\examples\ch17\helloworld_opendocumentphp.odt');
            $textbody = $text->getBody();
            $paragraph = $textbody->nextParagraph();
            $paragraph->append('Hello World!');
            $text->close();
            ?>
         
Note

 You need ZipArchive to be enabled in PHP to run OpenDocumentPHP. On Linux systems, use the --enable-zip option at compile time. On Windows systems, enable php_zip.dll in php.ini.

The following is a more elaborate example that uses OpenDocumentPHP to generate a couple of headings and several paragraphs. The paragraphs use the Text body style.

            <?php
            
            require_once 'OpenDocumentPHP/OpenDocumentText.php';
            
            $fullpath = 'D:\Document\PersonalInfoRemixBook\examples\ch17\odp_gen_example.odt';
            
            /*
             * If file exists, remove it first.
             */
            if (file_exists($fullpath)) {
                unlink($fullpath);
            }
            
            $text = new OpenDocumentText($fullpath);
            
            # set some styles
            
            /**
            
            <style:style style:name="Standard" style:family="paragraph" style:class="text"/>
            <style:style style:name="Text_20_body" style:display-name="Text body"
             style:family="paragraph"
             style:parent-style-name="Standard" style:class="text">
             <style:paragraph-properties fo:margin-top="0in" fo:margin-bottom="0.0835in"/>
            </style:style>
            
            **/
            
            $Standard_Style = $text->getStyles()->getStyles()->getStyle();
            $Standard_Style->setStyleName('Standard');
            $Standard_Style->setFamily('paragraph');
            $Standard_Style->setClass('text');
            
            $textBody_Style = $text->getStyles()->getStyles()->getStyle();
            $textBody_Style->setStyleName('Text_20_body');
            $textBody_Style->setDisplayName('Text body');
            $textBody_Style->setFamily('paragraph');
            $textBody_Style->setClass('text');
            
            $pp = $textBody_Style->getParagraphProperties();
            $pp->setMarginTop('0in');
            $pp->setMarginBottom('0.0835in');
            
            # write the headers and paragraphs
            
            $textbody = $text->getBody()->getTextFragment();
            
            $heading = $textbody->nextHeading();
            $heading->setHeadingLevel(1);
            $heading->append('Purpose (Heading 1)');
            
            $paragraph = $textbody->nextParagraph();
            $paragraph->setStyleName('Text_20_body');
            $paragraph->append(
                'The following sections illustrate various possibilities in ODF Text');
            
            $heading = $textbody->nextHeading();
            $heading->setHeadingLevel(2);
            $heading->append('A simple series of paragraphs (Heading 2)');
            
            $paragraph = $textbody->nextParagraph();
            $paragraph->setStyleName('Text_20_body');
            $paragraph->append('This section contains a series of paragraphs.');
            $paragraph = $textbody->nextParagraph();
            $paragraph->setStyleName('Text_20_body');
            $paragraph->append('This is a second paragraph.');
            $paragraph = $textbody->nextParagraph();
            $paragraph->setStyleName('Text_20_body');
            $paragraph->append('And a third paragraph.');
            
            $text->close();
            
            ?>
         

You can examine the output from this script here:

http://examples.mashupguide.net/ch17/odp_gen_example.odt

Leveraging OO.o to Generate ODF

If you are willing and able to have OpenOffice.org installed on your computer, it is possible to use OO.o itself as a big library of sorts to parse and generate your ODF documents and to convert ODF to and from other formats. Libraries/tools that use this approach include the following:

On Windows systems, you can access OpenOffice.org via a COM interface. For instance, the following Python code, which uses the win32all library, will generate a new .odt document by scripting OO.o:

            import win32com.client
            
            objServiceManager = win32com.client.Dispatch("com.sun.star.ServiceManager")
            objServiceManager._FlagAsMethod("CreateInstance")
            objDesktop = objServiceManager.CreateInstance("com.sun.star.frame.Desktop")
            objDesktop._FlagAsMethod("loadComponentFromURL")
            
            args = []
            objDocument = objDesktop.loadComponentFromURL("private:factory/swriter", "_blank",
            0, args)
            objDocument._FlagAsMethod("GetText")
            objText = objDocument.GetText()
            objText._FlagAsMethod("createTextCursor","insertString")
            objCursor = objText.createTextCursor()
            objText.insertString(objCursor,
                "The first line in the newly created text document.\n", 0)