Visual to Textual Explainer

Why are we doing this? 

Visual publications are an important style of communication, commonly used in education, trade book and magazine publishing. They deliver engaging reading experiences through a combination of rich illustration, photography, typography and page layout. The value of graphic design is not to be dismissed as simply “making things look pretty”; it creates a more inviting and rewarding reading experience, which can be vital for:
  • Conveying complex ideas and detailed information
  • Increasing understanding and information retention
  • Combatting learning or reading difficulties
  • Breaking down language barriers
  • Reaching reluctant readers
  • Communicating to younger readers

By their very nature, digital visual publications can be inaccessible to those unable to see them. The problems fall largely into three categories.

  1. The loss of the purely visual cues that indicate structure, hierarchy and reading order, for readers with sight impairment, low vision or blindness.
  1. An unsuitable screen size for viewing the publication, which makes text illegible or forces an unsatisfactory pan-and-zoom method of reading when the page is enlarged.
  1. A reading system's lack of capability to present the visual aspects of the publication.

Visual design is an art form. Prohibiting or limiting the design choices of digital visual publications would force them to remain print-only publications. Our recommended approach is instead to allow information, instructions and rules to be added within a single visual publication, which can then be used to generate an alternative textual presentation of the same content when that is more suitable, useful or accessible to read.

Best-practice techniques that already exist in the EPUB3 specification and supporting documentation make it possible to add the recommended extra functionality without disrupting existing reading systems and publications, while also preparing a file for reading systems that adopt this new way of displaying content.

To be clear, this recommendation is NOT to create separate files, pages or renditions, but instead it is to use a single source of content and to display that content in a more suitable way for some readers and reading situations.

By preparing a Primarily Visual Publication as Visual-To-Textual we also increase the accessibility for readers of the visual presentation.


Goals

  • Create a recommendation for the preparation of visual to textual documents
  • Recommend additional metadata for declaration that publications are prepared in this way
  • Create a recommendation using only the techniques that already exist in the EPUB3 specification
  • Create a specification that meets the needs and requirements of publishers, user agents, and users
  • Handle existing EPUB3 web code standards for the presentation of text, images, video, audio and media overlays

Out of Scope

  • DRM (as outlined by our charter)
  • Animation and interactions which already exist in fixed-layout EPUB3 but have no direct method for use in reflowable EPUB3.

Key Use Cases

The Fixed Layout Accessibility Task Force defines a Visual to Textual ebook as:
  • A format in which a primarily visual publication can be sufficiently and effectively read by solely textual means, and therefore also by aural means using text-to-speech (TTS).
  • A format which can be read from beginning to end without user input (moving forwards/backwards through the reading order without manual input)
  • A format where the reading position is retained for the next reading session
  • A format where the user can access the table of contents at any time
  • A format where the user can always find their position
  • A format that can be streamed, downloaded, and read offline

How It Works

The obvious main difference between visual ebooks and textual ebooks is in their presentation. 
With a Visual-To-Textual publication the Primarily Visual Publication is set up as a fixed-layout EPUB, which can also be presented as a reflowable EPUB. The presentation mode is a binary choice between:
  • visual - whereby all content is presented as a regular fixed-layout EPUB; or
  • textual - whereby all content is presented as a regular reflowable EPUB: content and structure are preserved in the same correct reading order with image descriptions attached, and visual styling and positioning are disregarded.
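
The binary choice above can be sketched in a few lines of Python. This is only an illustration, not part of the recommendation: it assumes a reading system detects a fixed-layout publication from the existing EPUB3 `rendition:layout` metadata property, and the `reader_wants_text` flag, the `Presentation` enum, the function names and the trimmed sample package document are all hypothetical.

```python
import xml.etree.ElementTree as ET
from enum import Enum

OPF_NS = "http://www.idpf.org/2007/opf"

# Trimmed, hypothetical package document used only for this sketch.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="rendition:layout">pre-paginated</meta>
  </metadata>
</package>"""


class Presentation(Enum):
    VISUAL = "visual"    # presented as a regular fixed-layout EPUB
    TEXTUAL = "textual"  # presented as a regular reflowable EPUB


def is_fixed_layout(opf_xml: str) -> bool:
    """Return True if the package declares rendition:layout as pre-paginated."""
    root = ET.fromstring(opf_xml)
    for meta in root.iter(f"{{{OPF_NS}}}meta"):
        if meta.get("property") == "rendition:layout":
            return (meta.text or "").strip() == "pre-paginated"
    return False


def choose_presentation(opf_xml: str, reader_wants_text: bool) -> Presentation:
    """Pick one of the two presentation modes for a publication.

    Assumption: reader_wants_text stands in for whatever signal a reading
    system uses (a user setting, a small screen, assistive technology).
    """
    if is_fixed_layout(opf_xml) and not reader_wants_text:
        return Presentation.VISUAL
    # Assumption: anything that is not fixed-layout is already reflowable,
    # so it is treated as textual here.
    return Presentation.TEXTUAL


if __name__ == "__main__":
    print(choose_presentation(SAMPLE_OPF, reader_wants_text=False).value)
    print(choose_presentation(SAMPLE_OPF, reader_wants_text=True).value)
```

The same source file is used for both results; only the presentation changes, in line with the single-source approach described earlier.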

The Textual Presentation 

Elements of the Primarily Visual Publication are either preserved or disregarded in the following ways:

Maintaining the Spine order for content order

The presentation of content pages is ordered in the same way in both styles of presentation.
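
As a minimal sketch of what this means in practice, the following Python reads the spine of an EPUB3 package document and returns the content documents in reading order. The function name and the sample package document are hypothetical; the point is that a reading system could derive one ordered list from the spine and use it for both the visual and the textual presentation.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Trimmed, hypothetical package document. The manifest is deliberately
# listed out of order to show that the spine alone decides content order.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="p2" href="page-002.xhtml" media-type="application/xhtml+xml"/>
    <item id="cover" href="cover.xhtml" media-type="application/xhtml+xml"/>
    <item id="p1" href="page-001.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="cover"/>
    <itemref idref="p1"/>
    <itemref idref="p2"/>
  </spine>
</package>"""


def spine_hrefs(opf_xml: str) -> list[str]:
    """Return the manifest hrefs in the order given by the spine."""
    root = ET.fromstring(opf_xml)
    # Map each manifest item id to its href.
    manifest = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    # Resolve each spine itemref, preserving spine order.
    return [
        manifest[ref.get("idref")]
        for ref in root.findall("opf:spine/opf:itemref", OPF_NS)
    ]


if __name__ == "__main__":
    # The same ordered list would serve either presentation mode.
    for href in spine_hrefs(SAMPLE_OPF):
        print(href)
```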