Friday, September 16, 2011

Creating an ePub eBook from a Word Processing Document

Print-on-demand service Lulu announced a new eBook EPUB Converter and eBook Publishing Tools, so I thought I would try them out. Previously I have used Lulu to distribute an ePub version of a book on the Apple iTunes store, but the process of converting my content to the ePub format was not easy.

Lulu also offers an EPUB Creator Guide, which is full of tips that would be of use to those producing a paper book as well as eBooks. I skimmed through the guide and launched into a new book project. The first few steps are the same as for a paperback: you enter the title and author for the book and Lulu issues you an ISBN (or you supply your own). You then upload the word processing file with the book content in it.

At this point you have the choice of supplying a PDF, in which case your eBook is only distributed by Lulu, or a file which can be converted to ePub (so it can be distributed by iBooks and others). I had the book both as a set of web pages, one per chapter (for use in a Learning Management System), and as one word processing master document (used to create the PDF for the printed book). Conventional wisdom would say the web original would be better for ePub, as that is a web-based format. But I wanted to try Lulu's converter, which works with word processing files.

One problem is that Lulu does not accept master documents (with separate files for chapters), so I had to save the book as one file. At this point I assumed I would have to use a Microsoft Word DOC file, but later discovered I could use the LibreOffice ODT format. So I tried both formats.

After I uploaded the file, Lulu took less than a minute to convert it to ePub. It reported some changes it made:

In order to create your EPUB, we had to make a few changes to your document:

  • Your file "ict_sustainability.doc" uses optional hyphens, which is currently not supported. We have converted the file anyway, but it may contain formatting errors, Please review the result.
  • We removed multiple blank lines found in succession to prevent unwanted page breaks.
  • We adjusted all image formatting to be set inline. To create an EPUB, images must be centered inline in the document.
  • We have stripped all headers and/or footers in your document. Headers and footers are not supported in the EPUB format.

The book has only one image and the DOC file is only 642 kbytes (for 132 pages). After conversion it was 251 kbytes of EPUB, which was worrying, as I thought something must have been lost in the conversion.

On opening the ebook with Calibre, I found all my chapters there in the table of contents. The title page was not correctly formatted, my image was missing, and the table of contents generated by the word processor was present, although it is superfluous in an eBook.
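
An EPUB file is a ZIP archive, so one quick way to see whether an image survived conversion is to list what is inside it. The following is a minimal sketch in Python, not part of Lulu's tools; the function name and the list of image suffixes are my own assumptions.

```python
import zipfile

# Assumed set of image file suffixes to look for (illustrative, not exhaustive).
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg")


def summarize_epub(path):
    """Return (all member names, image member names) for an EPUB file.

    `path` may be a filename or a file-like object, as accepted by zipfile.
    """
    with zipfile.ZipFile(path) as epub:
        names = epub.namelist()
    images = [n for n in names if n.lower().endswith(IMAGE_SUFFIXES)]
    return names, images
```

Calling summarize_epub on the converted file and finding an empty image list would suggest the picture was dropped during conversion rather than merely hidden by the reader software.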

The metadata for the book (title, author, publisher, description, ISBN and the like) was correctly formatted in the epub document. This is an impressive feature, in that the author does not need to do any work to have this information inserted in the ebook. The author simply enters the information where prompted by Lulu's forms and it is inserted.
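
For those who want to check the metadata without an ebook editor, it can be read directly from the package file inside the EPUB. This is a rough sketch only: it assumes the usual layout, where META-INF/container.xml points to an OPF package file holding Dublin Core elements (the author appears as "creator" and the ISBN as "identifier"). The function name is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container and package files.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

FIELDS = ("title", "creator", "publisher", "description", "identifier", "language")


def read_epub_metadata(path):
    """Return a dict mapping each Dublin Core field to a list of its values."""
    with zipfile.ZipFile(path) as epub:
        # container.xml names the package (OPF) file.
        container = ET.fromstring(epub.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", NS)
        package = ET.fromstring(epub.read(rootfile.get("full-path")))
    metadata = package.find("opf:metadata", NS)
    return {
        field: [el.text for el in metadata.findall(f"dc:{field}", NS)]
        for field in FIELDS
    }
```

An empty list for a field means the element is absent from the package file.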

One surprise was that the Sigil e-book editor reported dozens of errors in the epub code generated by Lulu. One error was that the Language element is missing (Lulu is only supporting books in English at present, but even so the books should indicate they are in English). All the other errors are "attribute 'target' is not declared for element 'a'". While these do not seem to be serious errors, it is worrying that Lulu generates them, as ebook distributors and ebook readers are much less tolerant of errors than web servers and browsers.
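
The two kinds of error Sigil reported can also be looked for with a short script. This is a sketch under my own assumptions, not a substitute for a real validator: it assumes the package file ends in .opf, that the chapters are well-formed XHTML, and it silently skips any file the XML parser rejects.

```python
import zipfile
import xml.etree.ElementTree as ET

DC_LANGUAGE = "{http://purl.org/dc/elements/1.1/}language"
XHTML_A = "{http://www.w3.org/1999/xhtml}a"


def check_epub(path):
    """Return (has_language, {content file: number of <a> elements with target})."""
    has_language = False
    target_counts = {}
    with zipfile.ZipFile(path) as epub:
        for name in epub.namelist():
            lower = name.lower()
            if lower.endswith(".opf"):
                # The package file should carry a non-empty dc:language element.
                package = ET.fromstring(epub.read(name))
                has_language = any(
                    (el.text or "").strip() for el in package.iter(DC_LANGUAGE)
                )
            elif lower.endswith((".xhtml", ".html", ".htm")):
                try:
                    doc = ET.fromstring(epub.read(name))
                except ET.ParseError:
                    continue  # not well-formed XML; skipped in this sketch
                count = sum(1 for a in doc.iter(XHTML_A) if "target" in a.attrib)
                if count:
                    target_counts[name] = count
    return has_language, target_counts
```

A result of False for the language, together with a non-empty dictionary of counts, would match the two problems described above.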

While the layout of the print book was reproduced, the word and line spacing were not correct. Some paragraphs had lines of text touching, others had too much white space. Also the text is displayed fully justified, which does not work well with the poor layout algorithms of ebook readers.

However, given the minimal effort required, this was a reasonable first attempt. I then realized I could use the native ODT file format of LibreOffice (and Open Office). That produced an ODT file of only 92 kbytes and an ePub of 202 kbytes. At that point I assumed there must be a mistake, but the content appeared to be all there (apart from the image).

The formatting of the eBook from ODT looked much better than from DOC (not surprisingly, as ODT is the native format of the word processor I am using). The title page, chapter and section headings were in the correct font and color. The paragraph text was not overlapping. The fully justified text still did not look quite right, as the Lulu conversion process (as it warned) had removed the optional hyphens.

Obviously, reading the tips in the Lulu guide would help remove the extraneous print book formatting to get the eBook to look right. One issue is that, as the book had started out as a set of web pages with one chapter per page, the major headings within each chapter were marked as "level 2" (H2 in HTML). The ebook treats each of these as a new book chapter, resulting in the book being overly fragmented. It appears I will need to reformat the book with the H2 replaced with H3 (H3 with H4 and so on).
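
If the change were made in the original web pages rather than in the word processor, the heading levels could be shifted with a short script. This is only a sketch, assuming simple HTML where heading tags are written as h1 to h6; the function name and the from_level parameter are my own. It makes a single pass, so a heading is never demoted twice, and h6 is left alone because there is no deeper level.

```python
import re

# Matches the start of an opening or closing h1..h5 tag, e.g. "<h2" or "</h3".
_HEADING_TAG = re.compile(r"(</?h)([1-5])\b", re.IGNORECASE)


def demote_headings(html, from_level=2):
    """Push every heading at from_level or deeper down by one level."""

    def shift(match):
        level = int(match.group(2))
        if level < from_level:
            return match.group(0)  # shallower headings are left untouched
        return f"{match.group(1)}{level + 1}"

    return _HEADING_TAG.sub(shift, html)
```

With the default setting, H1 headings are kept as they are, H2 becomes H3, H3 becomes H4 and so on.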

It would be useful if Lulu added some more features to the converter and the guide, so that the author could create one source document for the print and eBook editions. This is just about possible with the print and PDF versions.

Obviously it is possible to use the same chapter content in two differently formatted master documents. But I did this previously for the book "Green Technology Strategies" and the process gets very complicated very quickly. It would be good to have an option where it was possible, for example, to include the table of contents for the printed edition, but have the epub conversion process omit it. The conversion process could also ignore requests for fully justified text.

While I will need to fix up my book contents for the ebook, I thought I might as well step through the rest of the Lulu publishing process to see if there are any changes from the print version. The next step was to create a "marketing image". This is much the same as the front cover for a paperback book (an ebook has no equivalent to a back cover). One difference is the aspect ratio, with the ebook image being much squarer than a typical paperback (apparently it is for a 3x4 computer screen). The description entered for the book (keywords and the like) is the same as for a paperback. There is an extra step for Digital Rights Management (DRM) where, for 25 cents a copy, Adobe Digital Editions stops the book being copied (I don't use this).

Lulu suggested a price of $8.99 (most ebooks are less than $10), with my revenue being $7.20 for sales on Lulu and $5.66 elsewhere. One surprise with the book price was that the discount option has been removed. With a paperback book it is common to have a retail price and a discount, but Lulu argues that people buying ebooks see the discount as an indication the book is overpriced. It occurs to me that this might be because people realize ebooks cost little to manufacture and distribute. In any case Lulu points out that a low price of $0.99 to $2.99 is likely to attract more sales.

There appears to be a curious anomaly in the Lulu pricing model. At prices above $3.31, my share of revenue on a book sale via Lulu is higher than when the book is sold elsewhere. This makes sense, as for sales elsewhere both Lulu and the book retailer have to receive a share of the revenue. But at $3.31 the revenue is the same from both, and at prices below $3.31 the revenue via Lulu is lower than for sales elsewhere.
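
Lulu's actual formula is not shown, but the crossover can be illustrated with a simple model fitted to the figures quoted above. The sketch below rests entirely on my own assumptions: that revenue from sales elsewhere is a fixed proportion of the price, and that revenue via Lulu is a proportion of the price less a fixed per-copy fee. It is one model consistent with the three numbers ($7.20 and $5.66 at $8.99, and a crossover at $3.31), not necessarily the one Lulu uses.

```python
def fit_revenue_model(price, lulu_revenue, elsewhere_revenue, break_even):
    """Fit the assumed model to one price point plus the break-even price.

    Assumptions (not Lulu's published formula):
      elsewhere(p) = elsewhere_share * p
      lulu(p)      = lulu_share * p - lulu_fixed_fee
    """
    elsewhere_share = elsewhere_revenue / price
    revenue_at_break_even = elsewhere_share * break_even
    lulu_share = (lulu_revenue - revenue_at_break_even) / (price - break_even)
    lulu_fixed_fee = lulu_share * price - lulu_revenue
    return elsewhere_share, lulu_share, lulu_fixed_fee


def revenues(price, model):
    """Return (revenue via Lulu, revenue elsewhere) at the given price."""
    elsewhere_share, lulu_share, lulu_fixed_fee = model
    return lulu_share * price - lulu_fixed_fee, elsewhere_share * price


if __name__ == "__main__":
    model = fit_revenue_model(8.99, 7.20, 5.66, 3.31)
    for price in (8.99, 3.31, 2.99):
        via_lulu, elsewhere = revenues(price, model)
        print(f"${price:.2f}: via Lulu ${via_lulu:.2f}, elsewhere ${elsewhere:.2f}")
```

Under these assumptions a fixed per-copy fee on Lulu sales is what produces the crossover: it matters little at $8.99 but takes a growing share of the price as the price falls.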

My inclination is to accept Lulu's suggestion and set the price at $2.99.

1 comment:

Atlantis Word Processor Team said...

You could easily convert any TXT, ODT or MS Word document (RTF, DOC, DOCX) to EPUB with Atlantis Word Processor:

It supports cover images, font embedding, and multilevel TOCs. Its EPUBs always pass the EPUB validation test.