How To Hack Ebooks In The Public Domain

Published on Nov 19, 2015


Hi, my name is Sally Wilson and I am a librarian at Ryerson University in Toronto.

This presentation is about how you can take a public domain or Creative Commons ebook and work with it to create something new.

Creating an ebook from scratch can be a great learning experience; however, it can also be a daunting proposition. By hacking an existing ebook, you can get much of the same experience in a more manageable package. Variations on this theme are applicable across a wide range of library settings:

In a HigherEd setting, a text can be enhanced and customized for a course

In schools at any level, students can remix a public domain work as a class project

In public libraries, augmenting a text of local interest could be part of a community project

Untitled Slide

So in a nutshell... we will look at where you can find public domain and Creative Commons licensed ebooks that you can use, discuss a couple of ways of editing them, and explore some ideas of what is possible. I hope this will inspire you to create a project of your own.

Before we start I'd like to comment on a few of the words in the title of the presentation, namely hacking, ebooks and public domain.


The first concept is hacking. When I say hacking I don't mean hacking in an evil sense of criminal activity that may land you in jail...
Photo by Bob Jagendorf

Untitled Slide

I mean hacking in the sense of remixing, reusing, taking something that already exists and building on it.
Photo by Marcus Q

Untitled Slide

Hacking an ebook dovetails nicely with the recent trend towards creating and making as a way of learning, which has become popular in schools and libraries. It would fit well as a project undertaken in a makerspace or fab lab.
Photo by jabella


The second concept is ebooks.

By ebooks I really mean epubs, a particular type of file that has been created for use with dedicated ereaders (Kobo, Nook, Sony) or on a tablet (iPad, Android) running ereader software.

Amazon uses a proprietary format for their Kindles and Fires. They don't work with epubs directly, but an ePub that you have created or edited can be converted into a format to run on an Amazon device.
Photo by libraryman

Untitled Slide

One of the main characteristics of an epub is its ability to flow to fill the container it is being used with. This means that the book can be read quite easily across a wide variety of devices, from smart phones to mini ereaders to tablets and iPads.

We will see later that this can present some challenges when editing.

Photo by dno1967b


The third concept is Public Domain.

Most works are automatically protected by copyright when they are created; however, after a specific period of time (this varies from country to country) they enter the public domain and can be used as building blocks for the creation of new knowledge.

Photo by inju


The first thing we need to do for this project is find a work that is in the public domain.
Photo by hugovk

Untitled Slide

We need to be a little careful in doing this. Just because books are in the public domain and are "free" in that you can access them on the Internet doesn't necessarily mean you can do what you want with them.

For example - Google has scanned lots of public domain books from their partner libraries, but they put limitations on what you can do with them.

Also, as I mentioned, because copyright laws differ from country to country, works will enter the public domain at different times depending upon where you live.

So we need to find a reliable source.
Photo by inggmartinez


A major source of public domain ebooks is Project Gutenberg. In most cases if you download from Gutenberg.org you should be fine, as the US copyright laws are more restrictive than ours.

There are also Canadian and Australian Gutenberg sites which contain additional works for use in those countries.


To remix Gutenberg books, you will need to remove all references to Gutenberg in the books (usually license information at the beginning and end of the book) so that you are working with the straight text of the book and not something that has any Gutenberg branding.
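
For a plain-text download, this clean-up could be scripted along the following lines. This is only a sketch: it assumes the file marks the body of the book with lines beginning "*** START OF" and "*** END OF" that mention Project Gutenberg, and the wording of those lines varies between files, so check the result by hand. The function name is my own invention.

```python
import re

# Assumption: the body of the book sits between two marker lines like
# "*** START OF THE PROJECT GUTENBERG EBOOK ... ***" and
# "*** END OF THE PROJECT GUTENBERG EBOOK ... ***". Wording varies by file.
START = re.compile(r"^\*\*\* ?START OF .*PROJECT GUTENBERG.*$", re.MULTILINE)
END = re.compile(r"^\*\*\* ?END OF .*PROJECT GUTENBERG.*$", re.MULTILINE)


def strip_gutenberg_boilerplate(text):
    """Return only the text between the start and end marker lines.

    If a marker is missing, that side of the text is left untouched.
    """
    start = START.search(text)
    end = END.search(text)
    begin = start.end() if start else 0
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()
```

In an epub the same license text lives inside the HTML files, so there you would still remove it by hand or in your editor.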
Photo by NickDawson

Untitled Slide

You may be thinking "What can I do with public domain ebooks?" The answer is pretty much anything.

You can republish them as they are, add your own material, recast them in different media, remix them with other works, translate them, etc.

The possibilities are only limited by your imagination.

Your next question may be "Can we really do this?"
Photo by P1r

Untitled Slide

There is ample precedent for using public domain works.

For example - Disney has made a lot of money from the retelling of Grimm's and other fairy tales such as Cinderella, Beauty and the Beast, Aladdin, etc. (It is ironic that they lobbied extensively to extend the length of copyright protection in the United States to prevent others from doing the same thing.)

Photo by Randy Pertiet

Untitled Slide

Another example is Tom Stoppard's Rosencrantz and Guildenstern are Dead which builds upon two characters in Hamlet. Shakespeare himself was a master of remixing as he took existing stories and reworked them into his plays.
Photo by Paul Lowry

Untitled Slide

And we also have recent mashups of the zombie phenomenon and Jane Austen. So you can see from these few examples that there is considerable re-use of existing materials both now and in pre-digital times.
Photo by snacktime2007

Untitled Slide

You are not just limited to public domain works (which tend to be older texts) for this project. There is another whole class of resources that you can remix: those made available under Creative Commons licenses.

Normally when a work is created it is automatically protected by copyright, but Creative Commons allows the creator to indicate how he or she would like the work to be used.

CC has 6 flavours of licenses with varying degrees of flexibility - some of which allow you to do whatever you want with the work as long as you attribute the original work properly.

Photo by Enokson


For example, Cory Doctorow, the science fiction writer and journalist, makes his works available under a couple of CC licenses:

Attribution Non-commercial No derivatives (BY-NC-ND)
This is a restrictive license that basically allows you to make copies as long as you are not making money. You cannot, however, remix these works.

Attribution Non-commercial ShareAlike (BY-NC-SA)
This license lets you do what you want with the work as long as you attribute it, don't make any money off it and license it in the same way so that others can build on what you have created.
Photo by Erwin Boogert

Untitled Slide

Also, for those of you working in an academic setting, there are more and more Open Educational Resources (OERs) appearing that are licensed under various CC licenses that you might want to use.

These include various teaching resources and complete textbooks covering a wide range of academic levels.

Some examples of these can be found at:

OER Commons www.oercommons.org
Saylor Foundation www.saylor.org
Openstax College openstaxcollege.org

And here in Canada, BC has just launched an open textbook project available at:

Photo by opensourceway


Once you have found a public domain book or something with an appropriate Creative Commons license that you want to work with, download it to your computer.

The file you download will most likely be quite small as it will be mostly text and it will have a file extension of .epub.
Photo by I'm George


Before we look at the contents of an ePub, now is a good time to say a few words about the ePub standard. There is a standard for epubs which was developed by the IDPF (International Digital Publishing Forum). Up until October 2011, the ePub standard was epub2. The majority of epubs that you come across, particularly public domain ones, will adhere to this standard.

But in October 2011 the epub3 standard was finalized. It allows for more interactivity, embedding of video and audio, better rendering of mathematical equations, media overlays, and support for scalable vector graphics. We are in a bit of a catch-22 situation regarding the adoption of epub3: publishers are reluctant to use the format if the ereader systems don't support it, and the ereader vendors are reluctant to upgrade their software if there aren't many epub3 books in the marketplace.

Epub3 will become the default format, but this may take some time.

Opening your ePub

Back to opening up your ePub.

An ePub is a set of files zipped together for use on an ereader or device running ereader software. It can be easily unzipped into its component parts. On Windows, all you need to do is change the extension from .epub to .zip and use the Windows unzip program.
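
If you would rather script this step, here is a minimal sketch using Python's standard zipfile module, which works the same way on any operating system. The function name and the file names in the usage comment are placeholders of my own.

```python
import zipfile
from pathlib import Path


def unzip_epub(epub_path, out_dir):
    """Extract every file in the ePub into out_dir and return the file names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # An ePub is an ordinary zip archive, so no renaming is needed here.
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(out)
        return archive.namelist()


# Usage (hypothetical file names):
# unzip_epub("frankenstein.epub", "frankenstein_unzipped")
```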

Photo by herefordcat


Mac users will need to download some free software called ePub Zip/Unzip, available at code.google.com. The latest version is 2.0.1 (as of July 2013).

Photo by exacq

Untitled Slide

The unzipped ePub consists of several folders and files and a packing list that indicates how the whole thing is put together.

What you see here is pretty much standard.

From the right - we have a mimetype file which explicitly states that this is an ePub.

And two folders:

The META-INF folder has a container file in it which points to the file in the contents folder that defines the contents of the book. This folder can also contain files to do with Digital Rights Management (DRM), but this is not something that will show up in a public domain book.

If you do find DRM files here, you may want to reconsider the text you are hacking.

The name of the other folder can vary. It is often CONTENTS, OEBPS (Open eBook Publication Structure), or in the case of Gutenberg the name of the folder may be a number.

This folder contains multiple files that are the content and the metadata of the book.

It may include

one or more style sheets,
a folder of images used inside the book and for the cover,
the text of the book often broken down into chapters,
a contents file (a packing list),
metadata (author, title, etc.) and
a table of contents file. The TOC file is not the TOC that will appear in the book, but a hierarchical TOC used by the ereader software.
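
To make the role of the container file concrete, here is a small sketch that reads a container.xml and reports which file it points to. The sample XML is a typical example I have written for illustration, with OEBPS/content.opf as an assumed location; your book may use different names.

```python
import xml.etree.ElementTree as ET

# Namespace used by container.xml files in the META-INF folder.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_package_file(container_xml):
    """Return the path of the packing list (.opf) named in container.xml."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")


# Illustrative sample only; the folder and file names are assumptions.
SAMPLE = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

print(find_package_file(SAMPLE))  # prints OEBPS/content.opf
```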

This is what you will see if you unzip the file manually. Depending upon how familiar you are with XML and HTML and how willing you are to learn some of the intricacies of epubs you may be quite happy to do all your edits directly in these files.

If you are glazing over at this point you will be happy to hear that there is an easier way to hack an ePub.

AVAILABLE FROM code.google.com

Sigil is open source software that you can use to create and edit ebooks.

You can open your ePub in Sigil, make your edits and save the ePub.

Sigil will look after the situations where adding or changing content involves editing multiple files.

Sigil is available from code.google.com

Untitled Slide

This is a screenshot of what Sigil looks like. It is similar to a word processing or HTML editor.

It has three panels:
a file listing on the left, the table of contents on the right, and an editing pane in the middle with both code and WYSIWYG views.

Various menu bars for editing are across the top.

Okay, so now that we have our ePub opened up either manually or in Sigil, what can we do?


One thing you might want to add to your ePub is some images (photos, maps, original artwork).

Adding an image to your ePub requires several steps.

If you are manually editing your epub you will need to place your image in the image folder, add the name of the image to the content file and then edit the text file to link to the image.

With Sigil you navigate to the spot where you want to insert the image and then browse to it on your computer. Sigil will look after adding the image to the epub's image directory and adding a reference to it in the contents file.
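
For those editing by hand, the two file changes could look something like the sketch below: one function adds the image to the manifest section of the contents file, and the other produces the tag to paste into the text file. The ids, file names and function names are placeholders of my own, and the sketch assumes an epub2-style contents file.

```python
import xml.etree.ElementTree as ET
from html import escape

OPF_NS = "http://www.idpf.org/2007/opf"
# Keep the contents file's own namespace unprefixed when it is written out.
ET.register_namespace("", OPF_NS)


def add_image_to_manifest(opf_xml, item_id, href, media_type="image/jpeg"):
    """Return the contents file with one more <item> in its manifest."""
    root = ET.fromstring(opf_xml)
    manifest = root.find(f"{{{OPF_NS}}}manifest")
    ET.SubElement(
        manifest,
        f"{{{OPF_NS}}}item",
        {"id": item_id, "href": href, "media-type": media_type},
    )
    return ET.tostring(root, encoding="unicode")


def image_tag(href, alt_text):
    """Return the HTML to paste into the chapter where the image belongs."""
    return f'<img src="{escape(href, quote=True)}" alt="{escape(alt_text, quote=True)}"/>'
```

Copying the image file itself into the image folder is still a separate step.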
Photo by JoãoMoura

Untitled Slide

Here is a very basic example of adding an image of the author, Mary Shelley, to Frankenstein.

Untitled Slide

Another type of image you might want to add is a map to help your readers develop a sense of place. The letters at the beginning of Frankenstein are written from St. Petersburg in the 1700s. This map is from the same time period and was found on Wikimedia Commons.

One thing to note with images is that they may not show up exactly as you envisioned due to the fluid nature of epubs. Some ereaders seem to have minds of their own when it comes to displaying images, so despite your attempts to format how an image will display, the ereader does its own thing. Examples of this include resizing of images, and splitting images and captions over two pages.


In addition to finding images that you can use on the web, you might also want to include original artwork. If you are working on a community project, you could solicit contributions of art from the participants. The same goes for students: they could be asked to illustrate sections of a text.
Photo by Nita J Y

Untitled Slide

This is an example of adding some original artwork to Catharine Parr Traill's Backwoods of Canada.

It looks as if the original version had images, but they have been stripped out of this text from Gutenberg - a good opportunity for creating your own illustrations.


Another thing you can do is work with your epub's cascading style sheets (CSS). Your epub will come with one or more style sheets that can be edited to change how the content is formatted and displayed.

Photo by Spinstah


You can do things such as changing the font face, size and colour, but remember that some ereaders will override this with their own fonts.

You may find that you need to change both colour and size to ensure that the text looks different in all ereaders.

I have edited the style sheets to make the font of a particular paragraph larger and blue.
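
As a rough illustration of that kind of edit, the sketch below appends a rule to the text of a style sheet. The selector p.highlight and the size 1.3em are values I have made up for the example; the paragraph you want to change would need a matching class in the HTML.

```python
def add_highlight_rule(css_text, selector="p.highlight", size="1.3em", colour="blue"):
    """Return the style sheet text with a larger, coloured font rule appended.

    The default selector and size are illustrative assumptions.
    """
    rule = f"{selector} {{\n  font-size: {size};\n  color: {colour};\n}}\n"
    return css_text.rstrip("\n") + "\n\n" + rule


print(add_highlight_rule("body { margin: 5%; }\n"))
```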

Following are a few screenshots of how that text appears in several different ereaders.

iBooks on an iPad

On iBooks, the font is larger and blue, which is what was specified.

Adobe Digital Editions on a laptop

On Adobe Digital Editions on a laptop, the font is also larger and blue.

Overdrive on an Android Device

On an Android device running Overdrive, the font is just larger. And, of course, on e-ink readers such as the Kobo, there is no colour.


Another thing you might want to do is flag sections of text. With etexts it is not possible to say "go to page 29" as everyone has different pagination, but it is possible to take advantage of the built-in search functionality of ereaders.
Photo by Arria Belli

Untitled Slide

In this example a few characters, ***25, have been added to the text. Students can quickly jump to that section by using the ereader's search function.
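
If you have many sections to flag, the markers could be added with a few lines of code rather than typed in one by one. This sketch assumes each section starts with a simple HTML paragraph tag; the function name and sample text are my own.

```python
def add_search_marker(paragraph_html, number, prefix="***"):
    """Insert a marker such as ***25 right after the paragraph's opening tag."""
    marker = f"{prefix}{number} "
    # Assumption: the string starts with an opening tag like <p> or <p class="...">.
    open_end = paragraph_html.index(">") + 1
    return paragraph_html[:open_end] + marker + paragraph_html[open_end:]


print(add_search_marker("<p>You will rejoice to hear...</p>", 25))
# prints <p>***25 You will rejoice to hear...</p>
```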


You can also include questions, comments and activities in the text.
Photo by hdurnin

Untitled Slide

In the example here we have a couple of suggested activities at the end of the first chapter.

Results of these activities could later be incorporated into a new edition of the ePub.


Most of the public domain ebooks at Gutenberg lack cover images but this is something that you might want to add.

First you will need to create or find an image that you want to use for your cover. There are a couple of things that you will need to keep in mind when creating a cover:

The dimensions should be similar to a print book, with the height about 1.5 times the width.

It should look good as a thumbnail, as this is what readers will see on their virtual bookshelves.

It should look good in black and white.

It should include text for the title and author.

Photo by Great Beyond

Untitled Slide

Here is an example of a basic cover that shows up reasonably well in small and black and white versions.

How do you do this?

Manual instructions

1. Create or find an image that you like.
2. Save it to the images directory of your ePub
3. Create an HTML file in the main content directory that has the html coding to render the image. You can name it what you want with the extension .html
4. The packing list file, content.opf, needs to have some coding added so that the ereader system knows about the image and that it is to be used as the cover.
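
Steps 3 and 4 could be sketched as follows. The cover page markup is a bare-bones example of my own, and the meta element with name="cover" is a widely used epub2 convention rather than something I can promise every ereader honours. The image and the new HTML page also need their own manifest entries in content.opf, which are not shown here.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
# Keep the packing list's own namespace unprefixed when it is written out.
ET.register_namespace("", OPF_NS)

# Minimal cover page; the layout is an illustrative assumption.
COVER_PAGE = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Cover</title></head>
<body>
  <div><img src="{href}" alt="Cover"/></div>
</body>
</html>
"""


def cover_page(image_href):
    """Return the contents of an HTML file that displays the cover image."""
    return COVER_PAGE.format(href=image_href)


def mark_cover(opf_xml, image_id):
    """Return content.opf with a <meta name="cover"> pointing at image_id.

    image_id must match the id of the image's manifest entry.
    """
    root = ET.fromstring(opf_xml)
    metadata = root.find(f"{{{OPF_NS}}}metadata")
    ET.SubElement(metadata, f"{{{OPF_NS}}}meta", {"name": "cover", "content": image_id})
    return ET.tostring(root, encoding="unicode")
```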

With Sigil:

1. Click on Tools,
2. Click on Add cover,
3. Choose other files and select the image from your computer.

Your work is done as Sigil looks after adding a cover HTML file and adding the appropriate coding to the content.opf file.


So far I have been focused primarily on hacking an ebook as an educational experience, but we shouldn't overlook the great marketing and promotional opportunity this is for your library. At the very least you could put something on the title page indicating that it is your library's project, or you could do something a little fancier with your logo, a list of contributors, etc.
Photo by Xiaozhuli


With epub2 it is not possible to embed audio or video, but you can link out to it. When doing this you will need to keep in mind that the reader will need Internet access to take advantage of the link.

Untitled Slide

LibriVox (based in Montreal) provides free to use recordings of public domain books. Most of these are available through the Internet Archive.

You can link to recordings of entire books or just of sections.

Photo by kayray

Untitled Slide

A link to a LibriVox recording of the four letters at the beginning of Frankenstein has been added here.

To do this all you need to do is add the text and link to the appropriate HTML file.
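
The line to add is just an ordinary HTML link. As a small sketch, the helper below builds one; the function name is my own, and the address in the example is a placeholder, not the real location of any recording.

```python
from html import escape


def audio_link(url, label):
    """Return a paragraph containing a link out to a recording."""
    return f'<p><a href="{escape(url, quote=True)}">{escape(label)}</a></p>'


# The URL below is a placeholder for illustration only.
print(audio_link("https://example.org/recording", "Listen to Letters 1-4"))
```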

To use the link, Internet access is required.


At the end of your book you could create a custom bibliography of resources related to the text.
Photo by sukisuki

Untitled Slide

This is a very brief custom bibliography for Heart of Darkness with a couple of images and notes about additional readings and resources.

Untitled Slide

This is a similar bibliography for Frankenstein with a reference to a book in the library and a reminder to students about links to articles in the Learning Management System, Blackboard.

Untitled Slide

These are just a few examples of things that you might want to do with your edited ePub. Once you have finished your editing you will need to zip your files back up again if you have manually edited them or save your file if you have been using Sigil.
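
One detail to watch when zipping manually: the ePub container format expects the mimetype file to be the first entry in the archive and to be stored without compression, which a general-purpose zip tool will not necessarily do. Here is a minimal sketch of re-zipping with Python's zipfile module; the function name is my own.

```python
import zipfile
from pathlib import Path


def zip_epub(source_dir, epub_path):
    """Zip an unpacked ePub folder back into a single .epub file.

    Write epub_path somewhere outside source_dir so the archive does not
    try to include itself.
    """
    source = Path(source_dir)
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype file goes in first and uncompressed.
        archive.write(source / "mimetype", "mimetype", compress_type=zipfile.ZIP_STORED)
        # Everything else follows, compressed as usual.
        for path in sorted(source.rglob("*")):
            relative = path.relative_to(source).as_posix()
            if path.is_file() and relative != "mimetype":
                archive.write(path, relative, compress_type=zipfile.ZIP_DEFLATED)
```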


Before making your epub available to your readers, it is a good idea to validate your handiwork to ensure that it is a valid epub and will be readable on all readers/software.

If you have manually edited your ePub you can use the IDPF's ePub Validator.


Sigil has its own validation built in. It will warn you of HTML errors as you go, but it is a good idea to run the validator as well. It is available under Tools on the menu bar. Just click on Validate Epub with FlightCrew.

If you do have errors you will need to fix them before you make your ePub available for downloading as it may not display properly in all ereader systems.
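
Before running a full validator, a quick check of the basics can catch the most common packaging mistakes. The sketch below is my own and is no substitute for proper validation: it only looks at the mimetype file and the container file.

```python
import zipfile


def quick_epub_check(epub_path):
    """Return a list of problems found; an empty list means the basics look right."""
    problems = []
    with zipfile.ZipFile(epub_path) as archive:
        names = archive.namelist()
        if not names or names[0] != "mimetype":
            problems.append("mimetype is not the first file in the archive")
        # Lenient comparison: surrounding whitespace is ignored here,
        # although a strict validator may still complain about it.
        elif archive.read("mimetype").strip() != b"application/epub+zip":
            problems.append("mimetype does not contain application/epub+zip")
        if "META-INF/container.xml" not in names:
            problems.append("META-INF/container.xml is missing")
    return problems
```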
Photo by williamhartz

Untitled Slide

I hope this presentation has given you some ideas about how you can augment a public domain book or a Creative Commons licensed resource as a means of enriching the learning experience and/or providing opportunities for collaboration on a community project.
Photo by mis1u1

swilson@ryerson.ca July 2013

Thank you for listening. If you have any comments, questions or suggestions, please contact me via email or Twitter.


Creative Commons licensed BY-NC Sally Wilson 2013, where not otherwise licensed.
Photo by robert.barney