The three-letter identifiers used in the data denote the speaker. Identifiers are determined by the speaker’s username on Librivox. The identifier various indicates a reading performed by multiple speakers.

Please note: the links given in this section point to the original unprocessed data sources on Librivox and Project Gutenberg. To get the processed and aligned parallel audiobook corpus, see download.

Emma  
sbe audio
scr audio
ekl audio
mfo audio
mth audio
various audio
- text
Huck  
acr audio
msm audio
jgr audio
pch audio
various audio
- text
Sherlock  
rgo audio
msm audio
dcl audio
various audio
- text
Treasure  
apr audio
msm audio
ksh audio
various audio
- text
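
For readers who want to work with this listing programmatically, the book-to-speaker assignments above can be written down as a small mapping. The following is only a minimal sketch: the names READINGS, books_by_speaker, SPEAKER_INDEX and MULTI_BOOK are hypothetical and not part of the corpus distribution, and the sketch assumes that the same identifier under different books refers to the same Librivox user.

```python
from collections import defaultdict

# Speaker identifiers per book, copied from the listing above.
# "various" marks a multi-speaker reading, not a single person.
READINGS = {
    "Emma": ["sbe", "scr", "ekl", "mfo", "mth", "various"],
    "Huck": ["acr", "msm", "jgr", "pch", "various"],
    "Sherlock": ["rgo", "msm", "dcl", "various"],
    "Treasure": ["apr", "msm", "ksh", "various"],
}


def books_by_speaker(readings):
    """Invert book -> speakers into speaker -> books, skipping 'various'."""
    index = defaultdict(list)
    for book, speakers in readings.items():
        for speaker in speakers:
            if speaker != "various":
                index[speaker].append(book)
    return dict(index)


SPEAKER_INDEX = books_by_speaker(READINGS)

# Identifiers that appear under more than one book (assumed to be one person).
MULTI_BOOK = {s: b for s, b in SPEAKER_INDEX.items() if len(b) > 1}
```

An inverted index like this makes it easy to see which identifiers recur across books, which may matter when splitting the data by speaker.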