Distributed Proofreaders

Distributed Proofreaders (commonly abbreviated as DP or PGDP) is a web-based project that supports the development of e-texts for Project Gutenberg by allowing many people to work together in proofreading drafts of e-texts for errors. As of December 2025, the site had digitized 50,000 titles.

Distributed Proofreaders was founded by Charles Franks in 2000 as an independent site to assist Project Gutenberg. Distributed Proofreaders became an official Project Gutenberg site in 2002.

On 8 November 2002, Distributed Proofreaders was slashdotted, and more than 4,000 new members joined in one day. This influx of proofreaders and software developers helped to increase both the quantity and the quality of e-text production.

In 2006, the Distributed Proofreaders Foundation was formed to provide Distributed Proofreaders with its own legal entity and not-for-profit status, separate from Project Gutenberg. The founding trustees were Charles Franks, Juliet Sutherland, and Gregory B. Newby.

By 2009, DP-contributed e-texts comprised more than half of the works in Project Gutenberg. In July 2015, the 30,000th e-text produced by Distributed Proofreaders was posted to Project Gutenberg.

DP servers are located in the United States, and therefore works must be cleared by Project Gutenberg as being in the public domain according to United States copyright law before they can be proofread and eventually published.

Public domain works, typically books with expired copyright, are scanned by volunteers or sourced from digitization projects, and the page images are run through optical character recognition (OCR) software. Because OCR software is far from perfect, the resulting text contains errors. To correct them, pages are made available to volunteers over the Internet, with the original page image and the recognized text shown side by side. Each image-and-text pair is presented to multiple volunteers, who enter corrections, and their combined work minimizes the remaining errors. This spreads the time-consuming task of error correction across many people, in a manner akin to distributed computing.

A post-processor combines the pages and prepares the text for uploading to Project Gutenberg.
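The workflow described above can be illustrated with a minimal Python sketch. It is not the software Distributed Proofreaders runs. The `Page` class, the `proofread` and `post_process` functions, and the two toy "volunteers" are all hypothetical names introduced for illustration, and the sketch assumes that volunteers review each page one after another.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical model: a "volunteer" is a function that takes the current
# text of a page and returns a corrected version.
Volunteer = Callable[[str], str]


@dataclass
class Page:
    number: int                 # position of the page in the book
    text: str                   # starts out as raw OCR output
    history: List[str] = field(default_factory=list)  # earlier versions


def proofread(page: Page, volunteers: List[Volunteer]) -> Page:
    """Pass one page through several volunteers in turn (an assumption)."""
    for correct in volunteers:
        page.history.append(page.text)   # keep the version being replaced
        page.text = correct(page.text)
    return page


def post_process(pages: List[Page]) -> str:
    """Combine proofread pages, in page order, into a single text."""
    ordered = sorted(pages, key=lambda p: p.number)
    return "\n\n".join(p.text.strip() for p in ordered)


# Two toy volunteers, each fixing one typical OCR confusion.
def fix_digit_one(text: str) -> str:
    return text.replace("t1me", "time")


def fix_rn(text: str) -> str:
    return text.replace("rnorning", "morning")


pages = [
    Page(2, "It was a fine rnorning."),
    Page(1, "Once upon a t1me."),
]
volunteers = [fix_digit_one, fix_rn]

book = post_process([proofread(p, volunteers) for p in pages])
print(book)
```

Because every page is handled independently in `proofread`, the pages of a book can be worked on by different people at the same time. Only the final `post_process` step needs all of them together.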
