Alas, nothing is as easy as it seems, and when it comes to ePUB, nothing even seems easy. Even though the HTML file that you created in the last lesson is a valid Web page, the Open eBook specification requires a header for the file that tells devices how to interpret it, and it requires four additional files to package the book for reading. In this lesson we'll look at those small adjustments.
The XHTML header
This header information should be the same for all the content files in your document. It defines the document as an XHTML document and describes the character set and language that will be used. The code is very simple and should just be copied and pasted in place of the <HTML> tag in your content file.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
Anytime you create a new content file for your book, paste this code at the top, then continue with the same HTML as previously discussed. That's the easy part. Now on to the four fiendish files.
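To make the placement concrete, here is a minimal sketch of a complete content file with the header in position. The title, heading, and paragraph text are placeholders of my own, not part of the lesson's sample book.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
  <!-- XHTML expects a title element inside head -->
  <title>BookName</title>
</head>
<body>
  <!-- Placeholder content; your own HTML continues here -->
  <h1>Chapter One</h1>
  <p>The text of the chapter goes here.</p>
</body>
</html>
```

Because the file is now XHTML, every tag must be closed and tag names are written in lowercase.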
File 1: mimetype
When an eReader encounters a book with the .ePUB extension, the first thing it checks is whether it is actually an eBook. Many things could be packaged into an eBook format, but they won't read without a mimetype file. This is one short line of text in a text document with no extension. Simply open your text editor and in a new file type:
application/epub+zip
Nothing else goes in this file, not even a blank line after the text. Save the file as "mimetype". If your editor added an extension such as .txt, close the file and edit the file name in your file explorer to remove it. You will get a warning message that changing the file extension might make the file unreadable, but that is okay. Just delete the four characters of the extension (including the period) and save the changes.
File 2: container.xml
This is also a very short text document, with the .XML file extension. Only one part of this file will change for each eBook you create: the name of the third fiendish file. Copy and paste the following into a text document.
<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
<rootfiles>
<rootfile media-type="application/oebps-package+xml"
full-path="oebps/BookName.opf" />
</rootfiles>
</container>
The only part of this file that ever changes is "BookName", which is the name of the .OPF file that defines your book. The container file tells the reader where the packaged eBook files are located. Save this text file and change the extension to .XML. Copy and paste the file into every eBook you create and just change BookName to match the current project.
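As an illustration, here is how the container file might look for a book whose package file is named AesopsFables.opf. That name is only an assumption for this example; the comments mark the one value you would edit.

```xml
<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <!-- full-path is the only value that changes from book to book.
         It gives the folder and file name of the package file. -->
    <rootfile media-type="application/oebps-package+xml"
              full-path="oebps/AesopsFables.opf" />
  </rootfiles>
</container>
```

The folder and file name in full-path should be spelled exactly as they appear in your package, including upper and lower case.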
File 3: The .OPF package file
The .OPF file is the one that defines what your book is, where all the pieces of it are located, and any information about the book (metadata) that you would like people to know. Metadata could include the ISBN, price, category, and a host of other information. For our purposes, we are going to create an .OPF with the absolute minimum information that must be included in order for the book to be considered a valid .ePUB file. This is a text file that contains XML elements that are defined by the Open eBook specification. The elements included here are the minimum set required. Once again, the term BookName is a placeholder used to define the specific files in your eBook.
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:title>BookName</dc:title>
<dc:creator opf:role="aut">Author</dc:creator>
<dc:language>en</dc:language>
<dc:identifier id="bookid">BookName001</dc:identifier>
</metadata>
<manifest>
<item id="content" href="content.html" media-type="application/xhtml+xml"/>
<item id="toc" href="BookName.ncx" media-type="application/x-dtbncx+xml"/>
</manifest>
<spine toc="toc">
<itemref idref="content"/>
</spine>
</package>
There is a minimum of three sections in the package file: the metadata, the manifest, and the spine.
- In the metadata you must include at least a title, creator, language, and unique identifier. Title is pretty obvious. Creator usually starts with the role of author. Other roles may also be defined, but we will not deal with those until a much later lesson. For our purposes, we are doing these lessons in English, therefore the language code is "en". We will look at other languages in the future. Finally, every book needs a unique identifier. It appears in the metadata as the <dc:identifier> element, and the unique-identifier attribute in the <package> opening tag must name the id of that element. The identifier itself is supposed to be a combination of letters and numbers that uniquely identifies this eBook from every other eBook that could ever be created. In some instances, the ISBN may be used. In other cases, commercial software will generate a random code for the book. For now, we will use the BookName and three digits. You can change the numbers for each version of the file you create.
- In the manifest, you will list every file that is to be included in reading your eBook. At minimum, the manifest will include the content file(s) for your eBook and the fourth fiendish file, which will be discussed next. If your eBook contains multiple content files, graphics, fonts, or any other content, it will all need to be listed in this section.
- The spine lists the content files of your eBook in the order in which they are to be read. If you create a file for each chapter in the book, for example, each of those files will be listed in the spine.
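To show how the manifest and spine grow with the book, here is a sketch of those two sections for a book with two chapter files, a stylesheet, and a cover image. All of the file names and ids are hypothetical; substitute the files in your own project.

```xml
<!-- Hypothetical file names; these two sections replace the ones in the minimal package file -->
<manifest>
  <!-- Every file in the package is listed once, each with its own id -->
  <item id="chapter1" href="chapter1.html" media-type="application/xhtml+xml"/>
  <item id="chapter2" href="chapter2.html" media-type="application/xhtml+xml"/>
  <item id="style" href="style.css" media-type="text/css"/>
  <item id="cover-image" href="cover.jpg" media-type="image/jpeg"/>
  <item id="toc" href="BookName.ncx" media-type="application/x-dtbncx+xml"/>
</manifest>
<spine toc="toc">
  <!-- Only content files appear here, in reading order; each idref matches an item id above -->
  <itemref idref="chapter1"/>
  <itemref idref="chapter2"/>
</spine>
```

The stylesheet and image are in the manifest but not in the spine, because they are used by the chapters rather than read on their own.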
The .OPF is both the most complicated file in the eBook package and in many ways the most important. Save the text file and change the extension to .OPF.
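If you want to go beyond the minimum, the metadata section can carry additional Dublin Core elements. The following is only a sketch with placeholder values of my own; none of the optional entries is needed for a valid book.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
  <!-- The four required entries, as in the minimal file -->
  <dc:title>BookName</dc:title>
  <dc:creator opf:role="aut">Author</dc:creator>
  <dc:language>en</dc:language>
  <dc:identifier id="bookid">BookName001</dc:identifier>
  <!-- Optional entries; every value below is a placeholder -->
  <dc:publisher>Publisher Name</dc:publisher>
  <!-- Dates are written as YYYY-MM-DD -->
  <dc:date>2011-01-01</dc:date>
  <dc:subject>Fables</dc:subject>
  <dc:description>A one-sentence summary of the book.</dc:description>
</metadata>
```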
File 4: The .NCX Table of Contents
The .NCX file provides the reading system with navigation points in your eBook and is required for a conforming eBook. It has a few header items and then a listed table of contents with the names of the files (or locations within files) and the display name for each. It is a text file with the extension changed to .NCX.
<?xml version="1.0" encoding="UTF-8"?>
<ncx version="2005-1" xml:lang="en" xmlns="http://www.daisy.org/z3986/2005/ncx/">
<head>
<meta content="BookName001" name="dtb:uid"/>
</head>
<docTitle><text>Table of Contents</text></docTitle>
<navMap>
<navPoint id="navpoint-1" playOrder="1">
<navLabel>
<text>BookName</text>
</navLabel>
<content src="content.html"/>
</navPoint>
</navMap>
</ncx>
This information provides navigation points for sidebar navigation in various eBook readers and for assistive technologies. It is a required file for your conforming ePUB eBook. Save the file as a text file and then change the extension to .NCX.
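A book with more than one entry simply repeats the navPoint element. The sketch below assumes a single content file in which the second fable begins at an element carrying id="fable2"; the fable titles, the ids, and that anchor are my own placeholders.

```xml
<navMap>
  <!-- One navPoint per entry; playOrder counts up in reading order -->
  <navPoint id="fable1" playOrder="1">
    <navLabel>
      <text>The Fox and the Grapes</text>
    </navLabel>
    <!-- Points to the start of a file -->
    <content src="content.html"/>
  </navPoint>
  <navPoint id="fable2" playOrder="2">
    <navLabel>
      <text>The Tortoise and the Hare</text>
    </navLabel>
    <!-- Points to a location within the file: the element with id="fable2" -->
    <content src="content.html#fable2"/>
  </navPoint>
</navMap>
```

Each navPoint id must be unique within the file and should begin with a letter.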
Those are the four fiendish files that are required in every properly formed ePUB eBook. If you are not sure you've followed everything, I've created a small ZIP file of Aesop's Fables that includes all the pieces you see in this post. You can download it at NWE Signatures eBook Samples, where I'll continue to post samples from these exercises.
In the next exercise, we'll work on properly organizing and packaging the eBook.