Some authors were so prolific that transcribing their work for publication lags behind our ability to publish it. Some texts do not yield easily to optical character recognition (OCR) programs. To decipher these difficult pieces of text, one can turn to human epigraphy or to an automated epigraphic system called “reCaptcha”, which sends the unreadable words out to the internet as part of the Captcha system (a Turing discriminator) that lets websites determine whether the requester of a piece of data is a human or a computer program.
This discriminator shows two distorted graphic representations of text strings to tell whether the reader is a computer or a human. A human can recognize a distorted word where a computer program cannot. The first word is already known, but the second is a word the OCR program failed to recognize. If the reader types the first word correctly, the reader's transcription of the second, unknown word has a high probability of being correct. http://www.google.com/recaptcha/learnmore
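The two-word check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual reCaptcha implementation: the function names (`submit_answer`, `consensus`), the case-insensitive comparison, and the threshold of three matching answers are all hypothetical choices made for the example.

```python
from collections import Counter


def normalize(word: str) -> str:
    """Compare answers without regard to case or stray spaces (an assumption)."""
    return word.strip().lower()


def submit_answer(control_word: str, typed_control: str,
                  typed_unknown: str, votes: Counter) -> bool:
    """Check one reader's answer to a two-word challenge.

    The reader passes only if the known (control) word is typed correctly.
    On a pass, the reader's reading of the unknown word is stored as one vote.
    """
    if normalize(typed_control) != normalize(control_word):
        return False  # failed the known word, so the second answer is ignored
    votes[normalize(typed_unknown)] += 1
    return True


def consensus(votes: Counter, min_votes: int = 3):
    """Return the most common reading once enough readers agree, else None.

    min_votes is a hypothetical threshold chosen for this sketch.
    """
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


# Four readers see the same pair: a known word and a word the OCR could not read.
votes = Counter()
submit_answer("morning", "morning", "upon", votes)
submit_answer("morning", "Morning ", "upon", votes)
submit_answer("morning", "mourning", "open", votes)  # fails the known word
submit_answer("morning", "morning", "upon", votes)
print(consensus(votes))  # prints "upon"
```

Collecting several independent answers before accepting a reading is one plausible way to turn "a high probability of being correct" into a usable transcription; a single reader's answer is treated only as a vote.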
Scholars Recruit Public for Project
By PATRICIA COHEN, Published: December 27, 2010
Since University College London began transcribing the papers of the Enlightenment philosopher Jeremy Bentham more than 50 years ago, it has published 27 volumes of his writings — less than half of the 70 or so ultimately expected.
His position included arguments in favour of individual and economic freedom, usury, the separation of church and state, freedom of expression, equal rights for women, the right to divorce, and the decriminalising of homosexual acts. He argued for the abolition of slavery and the death penalty and for the abolition of physical punishment, including that of children. Although strongly in favour of the extension of individual legal rights, he opposed the idea of natural law and natural rights, calling them “nonsense upon stilts.” Wiki…
The painstaking job of transcribing often hard-to-decipher handwritten documents from history’s lead players, not to mention a lack of money, has meant that most originals are seen by just a handful of scholars and kept out of the public’s reach altogether. After more than five decades, only slightly more than half of James Madison’s papers have been transcribed and published, while work on Thomas Jefferson’s papers, begun in 1943, probably won’t be finished until around 2025.
Now the scholars behind the Bentham Project think they may have come up with a better way: crowd-sourcing.
Starting this fall, the editors have leveraged, if not the wisdom of the crowd, then at least its fingers, inviting anyone — yes, that means you — to help transcribe some of the 40,000 unpublished manuscripts from University College’s collection that have been scanned and put online. In the roughly four months since this Wikipedia-style experiment began, 350 registered users have produced 435 transcripts.