OpenDocument Essentials

By J. David Eisenberg


Preface

OASIS OpenDocument Essentials introduces you to the XML that serves as an internal format for office applications. OpenDocument is the native format for OpenOffice.org, an open source, cross-platform office suite, and KOffice, an office suite for KDE (the K desktop environment).

Who Should Read This Book?

You should read this book if you want to extract data from OpenDocument files, convert your data to OpenDocument format, or simply find out how the format works.

If you need to know absolutely everything about the OpenDocument format, you should download the Open Document Format for Office Applications (OpenDocument) specification, which is available in PDF form or in OpenDocument format from the OASIS TC website. That document was a major source of reference for this book.

Who Should Not Read This Book?

If you simply want to use OpenOffice.org or KOffice to create documents, you need only download the software from http://www.openoffice.org or http://www.koffice.org and start using it. There’s no need for you to know what’s going on behind the scenes unless you wish to satisfy your lively intellectual curiosity.

About the Examples

The examples in this book are written using a variety of tools and languages. I prefer to use open-source tools which work cross-platform, so most of the programming examples will be in Perl or Java. I use the Xalan XSLT processor, which you may find at http://xml.apache.org. All the examples in this book have been tested with OpenOffice.org version 1.9.100, Perl 5.8.0, and Xalan-J 2.6.0 on a Linux system using the SuSE 9.2 distribution. This is not to slight any other applications that use OpenDocument (such as KOffice) nor any other operating systems (MacOS X or Windows); it’s just that I used the tools at hand.

Organization of This Book

Chapter 1, The Open Document Format

This chapter tells you how a document in OpenDocument format is stored and what its major components are.

Chapter 2, The meta.xml, styles.xml, settings.xml, and content.xml Files

This chapter explains the XML elements that describe meta-information (information about the document), style information, and various settings associated with a document in OpenDocument format. It also describes the general structure of the file that contains a document’s content.

Chapter 3, Text Document Basics

This chapter tells you how text documents handle character, paragraph, and section formatting. It also describes bulleted and numbered lists, and outline numbering.

Chapter 4, Text Documents—Advanced

This chapter covers frames, images, fields, footnotes, tracking changes, and tables in text documents.

Chapter 5, Spreadsheets

Spreadsheets have a great deal in common with tables; this chapter points out the similarities and differences. It also covers topics such as formulas and content validation.

Chapter 6, Drawings

This chapter explains the OpenDocument elements for basic shapes such as lines, rectangles, circles, etc.; stroke and fill properties; 3-D elements and text animation.

Chapter 7, Presentation

Text and drawings are at the heart of a presentation; this chapter covers the elements used to add backgrounds, transitions, and sound.

Chapter 8, Charts

The OpenDocument format has elements that allow you to represent charts based on data in your spreadsheets. This chapter describes the elements for chart titles, legends, axes and tickmarks.

Chapter 9, Filters in OpenOffice.org

You don’t have to create a stand-alone application to transform XML files to OpenDocument format. In this chapter, you’ll find out how to make an import filter that integrates your transformations into the OpenOffice.org application.

Appendix A, The XML You Need for OpenDocument

XML, the Extensible Markup Language, is the “native language” of OpenOffice.org. If you haven’t used XML before, you should read this appendix to familiarize yourself with this remarkably powerful and flexible format for structuring data and documents.

Appendix B, The XSLT You Need for OpenDocument

XSLT is an XML markup language that describes how to transform an input XML document to an output document, which may be either plain text or XML. XSLT makes it easy to have a single document serve many purposes. This appendix is a brief introduction to this powerful language.

Appendix C, Utilities for Processing OpenDocument Files

This appendix contains utility programs that we created while writing this book. They made it easier for us to manipulate OpenDocument files, and we hope they do the same for you.

Conventions Used in This Book

Constant width is used for code examples and fragments.

Constant width bold is used to highlight a section of code being discussed in the text.

Constant width italic is used for replaceable elements in code examples.

Names of XML elements will be set in constant width enclosed in angle brackets, as in the <office:document> element. Attribute names and values will be in constant width, as in the fo:font-size attribute with a value of 0.5cm.

This book uses callouts to denote “points of interest” in code listings. A callout is shown as a white number in a black circle; the corresponding number after the listing gives an explanation. Here’s an example:

Roses are red,
   Violets are blue. ❶
Some poems rhyme;
   This one doesn’t. ❷

❶ Violets are actually violet. Saying that they are blue is an example of poetic license.

❷ This poem uses the literary device known as a surprise ending.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly & Associates, Inc.
101 Morris Street
Sebastopol, CA 95472
1-800-998-9938 (in the United States or Canada)
1-707-829-0515 (international/local)
1-707-829-0104 (fax)

The author has a web page for this book, where he lists errata, examples, or any additional information. You can access this page at:

For more information about O’Reilly & Associates books, conferences, software, Resource Centers, and the O’Reilly Network, see the web site at:

Acknowledgments

Thanks to Simon St. Laurent, the original editor of this book, who thought it would be a good idea and encouraged me to write it. Thanks also to Erwin Tenhumberg, who suggested that I update the book from the original OpenOffice.org version to the current description of OpenDocument. Thanks also to Adam Moore, who converted the original HTML files to OpenOffice.org format, and to Jean Hollis Weber, who assisted with final layout and proofreading. Edd Dumbill wrote the document which I modified slightly to create Appendix A. Of course, any errors in that appendix have been added by my modifications. Michael Chase provided a platform-independent version of the pack and unpack programs described in the section called “Unpacking and Packing OpenDocument files”.

Since this is a work in progress, I also want to thank all the people who are taking the time to read and review it and send their comments. Special thanks to Valden Longhurst, who found a multitude of typographical and grammatical oddities.


Copyright (c) 2005 O’Reilly & Associates, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
