Moving print publications to EPUB – Part 1

Part 1 – Content Order

Samples provided in this write-up use Adobe InDesign CS5 (7.0.2) and Digital Editions Export 3.0.0

Over the past 8 to 10 years, publishers worldwide have moved to Adobe InDesign and InCopy workflows for their book production, using InDesign for layout and InCopy for editorial work. With the current trend towards ereading, it seems more than logical that publishers are looking at converting existing InDesign documents into EPUB format.
Although earlier versions of InDesign supported Digital Editions Export (EPUB), until the most recent release (CS5) we had to change the way our documents were laid out to best prepare them for EPUB output. This was primarily because the content order of the EPUB was always based on the document’s Page Layout. InDesign CS5 lets you base the content order on either Page Layout or XML Structure.

Page Layout

When the EPUB content order is based on the Page Layout structure of the InDesign document the following occurs:

  • Text threads (or stories, as they are called in InDesign) are added as an uninterrupted content flow. Any non-threaded content that decorates the document pages, such as images with captions or break-out text, is added to the content order after the story content. In other words, these objects are not placed according to their contextual position on the page.
  • InDesign first assumes a left-to-right object order when determining the content order, and only then looks at the top-to-bottom position of objects. This means that an object placed at the bottom of the page, for example an image, will appear first in the content structure if it sits further to the left than a text frame above it. In the illustration below, the image marked “3” is positioned further left than the caption “4” above it. As a result, the image appears before the caption in the content order. (Note: in this example the article marked “1” is continued from the text thread on the previous page, which is why it takes precedence over the other objects.)

page sample with content order of items on page marked.

  • Objects that are part of a group are handled as if they were individual objects. For instance, if items “3” and “4” were grouped together, this would have no impact at all on the resulting EPUB content order. The image “3” would still appear before the caption “4”.
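The ordering rules above can be modelled in a few lines of Python. This is only a simplified sketch of the behaviour described in this article, not InDesign's actual algorithm; the PageItem class, the coordinates and the labels are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PageItem:
    label: str
    left: float             # x position of the frame's left edge (hypothetical units)
    top: float              # y position of the frame's top edge
    threaded: bool = False  # True if the frame is part of a text thread (story)


def page_layout_order(items):
    """Return item labels in a simplified Page Layout content order."""
    # Threaded story content comes first; loose objects follow it.
    stories = [item for item in items if item.threaded]
    loose = [item for item in items if not item.threaded]

    # Left-to-right is compared first, top-to-bottom only breaks ties.
    def position(item):
        return (item.left, item.top)

    ordered = sorted(stories, key=position) + sorted(loose, key=position)
    return [item.label for item in ordered]


items = [
    PageItem("caption 4", left=120, top=300),
    PageItem("image 3", left=100, top=400),
    PageItem("story 1", left=300, top=50, threaded=True),
]

print(page_layout_order(items))
# ['story 1', 'image 3', 'caption 4']
```

The image sits lower on the page than the caption, but because its left edge is further left it still comes out first. Grouping plays no part in the sort key, which matches the last bullet.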

In a nutshell, using an existing InDesign page layout as the source of the content order would require significant rework to generate the correct contextual content order in the published EPUB. For instance, to place images in context, these objects need to become anchored or inline objects.

XML Structure (InDesign CS5)

InDesign CS5 adds the ability to order the content based on the document’s XML structure. Even when your current InDesign documents don’t contain an XML structure, it is relatively simple to mark up a document’s content with XML.

Building a consistent XML content structure for an InDesign document does require some preparation. Especially where a large number of documents is concerned, it is worth spending some time and effort on defining the XML tags and the style-mapping rules (which InDesign styles are to be mapped to which XML tags).

As this article focuses on content order, I’m going to assume that document text is consistently styled and that running headers and footers are built using Master Page items.

When the original XML content order is created for an InDesign document by initially adding untagged items, there are a few things to keep in mind:

  • The XML structure processes objects based on the layer they reside in. The bottommost layers are handled first.
  • Positional placement — the left-to-right, top-to-bottom as seen in Page Layout order — is ignored.
  • Objects originating from the same layer are added to the XML structure based on their stacking order — the objects added last to a document page will appear lower in the content structure.
  • As with the page layout structure, text threads are added as an uninterrupted content flow, and non-threaded content that decorates the document pages, such as images with captions or break-out text, is added to the XML structure as separate items.
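As a rough illustration of these rules, the sketch below sorts objects by layer and then by stacking order, and ignores their position on the page. It is a simplified model of the behaviour described in the list, not InDesign's own logic; the Item type, its fields and the sample values are hypothetical.

```python
from typing import NamedTuple


class Item(NamedTuple):
    label: str
    layer: int   # 0 = bottommost layer
    stack: int   # 0 = first object added to that layer
    left: float  # page position, kept only to show that it is ignored
    top: float


def structure_order(items):
    """Return item labels in a simplified initial XML structure order."""
    # Bottommost layer first, then objects in the order they were added.
    ordered = sorted(items, key=lambda item: (item.layer, item.stack))
    return [item.label for item in ordered]


items = [
    Item("sidebar", layer=1, stack=0, left=10, top=10),
    Item("image", layer=0, stack=1, left=400, top=50),
    Item("caption", layer=0, stack=0, left=400, top=300),
]

print(structure_order(items))
# ['caption', 'image', 'sidebar']
```

Here the caption was drawn before the image, so it comes first even though it sits below the image on the page. This is the kind of result that then needs correcting in the structure.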

This means that the structure that is created isn’t fully predictable. However, the power of the XML structure is that the content order can be corrected without having to physically change the layout of the InDesign document.

  • Once the XML Structure is visible, objects can be selected and moved into their appropriate contextual position.
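In XML terms, moving an object in the structure simply changes the order of sibling elements. The sketch below shows the idea with Python's standard ElementTree module. The tag names (Root, Story, Image, Caption), the id attributes and the move_after helper are made up for illustration; in InDesign the move is done by dragging in the Structure pane.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure: the image and caption ended up after all stories.
SOURCE = (
    "<Root>"
    "<Story id='intro'/>"
    "<Story id='chapter'/>"
    "<Image href='photo.jpg'/>"
    "<Caption>Photo caption</Caption>"
    "</Root>"
)


def move_after(parent, element, anchor):
    """Move `element` so it becomes the sibling directly after `anchor`."""
    parent.remove(element)
    parent.insert(list(parent).index(anchor) + 1, element)


root = ET.fromstring(SOURCE)
intro = root.find("Story[@id='intro']")
image = root.find("Image")
caption = root.find("Caption")

# Place the image right after the intro story, and the caption after the image.
move_after(root, image, intro)
move_after(root, caption, image)

order = [element.get("id") or element.tag for element in root]
print(order)
# ['intro', 'Image', 'Caption', 'chapter']
```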

xml structure, illustration shows moving object to new order in structure pane.

During EPUB export, the updated XML structure can now be used to define the content order in the EPUB.

resulting content order in EPUB when viewed in Adobe Digital Editions.

See also Part 2 – Controlling content breaks, in this article series on Moving print publications to EPUB.





This Post Has 20 Comments

  1. APIS-MEDIA says:

    Obtaining ROI from the net is everyone’s job.

  2. hafez says:

    Dear Cari,
    I solved it. Many thanks for great help.

  3. hafez says:

    Dear Cary
    I use the “XML Structure” option for creating an EPUB file from InDesign CS6. Just one thing: I have some text boxes in my file and would like to export them to the EPUB too. Can you please help me get them into the EPUB?
    Thanks

  4. Joey says:

    HI Cari, I’m working on converting my first book to epub. I have split it into chapters since it has many photos. The problem is when I export the chapter it still shows text for the next chapter. How do I get rid of the extra text?

  5. krish says:

    Hi Cari,
    I have just tried EPUB for the first time in InDesign CS5.5. The problem I am facing is that when I put a graphic or colored box beneath a text frame and export to EPUB, the text and graphic are not in the order I designed in InDesign. How do I fix it? Please help.

    • Cari Jansen says:

      @krish

      The easiest way to control where graphics appear in eBooks is to turn them into anchored objects, so that they become inline objects in the eBook.

      Hope this helps. Cari

  6. Cari Jansen says:

    Which version of InDesign are you using Rod? With InDesign CS5.5 you can break-up your EPUB based on the paragraph style you have applied to your chapter titles for instance. Have you tried that? Alternatively turn each chapter into an individual InDesign document and combine them using InDesign’s Book feature, and export to EPUB from the Book panel menu. — Hope this helps, Cari

  7. Rod says:

    I’m not an expert in InDesign. I’m trying to format my InDesign book to an epub file and I can’t figure out how to set it up so the chapters show up individually on ibooks. Right now, it is just one continuous story with no breaks. How do I set up the breaks so my epub file will show separate chapters?

  8. Cari Jansen says:

    hi Chloe,

    Unfortunately no, CS5.5 doesn’t include any widow/orphan information in the CSS file it generates. You’d have to add that manually afterwards, but be aware that not all eReaders support it. It also might not work if each line in the poem sits in a separate paragraph.

    What you could do is try and trick things…

    Try wrapping those bits of text you want to keep together in a separate <div>, add a class, and in the css include: page-break-inside:avoid;

    E.g.:
    div.keeptogether {
    page-break-inside:avoid;
    }

    and in html:

    <div class="keeptogether">
    ... content here ...
    </div>

  9. Chloe says:

    Will the EPUB output from CS 5.5 honor the Indesign paragraph style “keep options” to keep all lines in the paragraph together? I’m trying to design something that is very much like a long poem and I don’t want page breaks in the middle of stanzas. How can I achieve this?

  10. Cari Jansen says:

    @Jon
    You’ll have to do a fair bit of CSS and HTML work to turn the EPUB output from CS5 into a fixed layout. InDesign generates ‘reflowable’ content, not ‘paginated’ content.
    Also, you can’t just ‘rezip’ an expanded EPUB file on the Mac; that won’t work.
    Try using something like PDF/XML Inspector or XML Author to open the EPUB without ‘breaking’ it. Or else, if you can, use Terminal and rezip the file from the command line.
    You might want to try creating an EPUB without adding the com.apple.ibooks.display-options.xml file first, to see what InDesign actually generates.
    Also, please note that you cannot combine reflowable HTML and fixed layout in a single EPUB; that’s just not possible. It has to be one or the other.
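
The command-line rezip mentioned above can also be scripted. The sketch below is one way to do it with Python's standard zipfile module. It is not part of the original reply, and the function name zip_epub and the folder and file names in the usage comment are made up for illustration. It relies on one rule of the EPUB container format: the mimetype file has to be the first entry in the archive and stored without compression, which a generic zip tool does not guarantee.

```python
import os
import zipfile


def zip_epub(source_dir, epub_path):
    """Zip an expanded EPUB folder so that `mimetype` is first and uncompressed."""
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype entry goes in first, stored without compression.
        archive.write(
            os.path.join(source_dir, "mimetype"),
            "mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        # Everything else follows, compressed as usual.
        for folder, _subfolders, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                relative = os.path.relpath(full_path, source_dir)
                if relative == "mimetype":
                    continue
                archive.write(
                    full_path,
                    relative.replace(os.sep, "/"),
                    compress_type=zipfile.ZIP_DEFLATED,
                )


# Example call (hypothetical paths):
# zip_epub("my-book-expanded", "my-book.epub")
```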

  11. Jon Mason says:

    Great article.

    I am having a problem though!

    I am trying to create a ‘fixed layout’ ePub file from InDesign CS5 for a photography book I’m creating.

    I have tried unzipping the ePub file and adding the “com.apple.ibooks.display-options.xml” file within the “META-INF” directory, then re-zipping and changing the file extension back to epub.

    This will not then transfer to my iPad.

    Any advice?

    HELP!

  12. Cari Jansen says:

    @Eric

    You’d need to expand the ‘Story’ tag (see image above) in the InDesign Structure pane and then move the image and caption within this tag. Locate the text after which you want to include them, then insert them there. Should work. :)

    If not, hop onto http://forums.adobe.com/community/indesign/indesign_general and post a question and upload some screenshots of your Structure panel to start out with.

  13. Eric M says:

    I am trying to do exactly what you demonstrate here with exporting to epub based on XML structure and I am running into a problem I think you may be able to help clear up.
    I have been exporting a chapter from one of our titles using a variety of different methods. After learning the perils of “Base on Layout” in regards to where the content ends up, I began to investigate XML tags and realized it may be the best option for a lot of our books. However, I am losing elements entirely. Here is the rundown:
    The document (prepared by the typesetters, not myself) contains the whole chapter’s text as one Story. It also contains 1 photo and its caption as a second Story. The photo and caption appear on the second page in the layout.
    I created tags for the whole Story, the Photo, the Story for the caption, a tag for the pre-photo text and a tag for the remaining text. Nested within the main Story, both sub-story tags appear, and when I drag the Photo tag and the Caption tag between the two tags within Story, the final .epub displays neither the caption text nor the image. However, they do still appear in the .epub if I move their tags to before or after the Story (so not nested).
    I even went as far as to map every style in the document to tags, and reordered all those tags, which had the pseudo-desired result of rearranging all the heads and paragraphs.
    Any information regarding this would be greatly appreciated. If you need more information, I will gladly provide it.
    Thanks!

  14. Cari Jansen says:

    @John I’m not a Microsoft Word expert, so that workflow might be something that’s better asked on a Word blog somewhere I’d say. Sorry!

  15. John duVal says:

    Can conversion of word docs to XML docs to epub be explained more in laymen’s terms?

  16. Cari Jansen says:

    Great article David :-) Absolutely valid points.

    When we are working with existing print documents that need to also be converted to EPUB, what works best does depend on the original document. The biggest problem overall with content order is that only an editor/author knows the true content order, so it’s a little difficult to automate at this point.

    If it is quick and easy to re-create the inline content flow in a new document with all content… that’s a good start, it also gets structured content out of the document, a working EPUB and the EPUB content could be reused in a newly to be created XML-in EPUB-out workflow.

    In other cases, the XML-structure is a good tool to be able to retain the original layout of the document and still get an EPUB out of it, without having to recreate the document specifically for EPUB. This workflow can also be the start of an XML-in EPUB-out workflow.

    Overall the win-win would be to get the content-order right from the start, by using an XML-in and EPUB-out workflow in publications that are published to different media. That way we’re really starting to look at single source publishing.

    Cari

  17. This is great, Cari, but here’s some follow up on why the structure pane is a good (or bad) idea: http://indesignsecrets.com/structure-pane-versus-page-order-for-epub-export-from-indesign.php
