Digital Lit

Project Gutenberg

First Published: Internet World
Date Published: 1994
Copyright © 1994 by Kevin Savetz


What if books were free? What if anyone could get a copy of whatever books he or she wanted, without so much as a walk to the library? Would libraries become extinct? Would bookstores crumble?

Today on the Internet, thousands of books are indeed available for free. Classics from Jules Verne, Edgar Rice Burroughs, William Shakespeare and Edgar Allan Poe grace the virtual shelves of the Internet's libraries. Government research, reference books and foreign language literature are available right down the virtual hallway. Library cards are free (well, as cheap as your access to the 'net) and you can keep that book for as long as you like. It's digital.

We're talking about electronic books - also known as "etexts." From Shakespeare's The Taming of the Shrew (written circa 1600) to Bruce Sterling's The Hacker Crackdown (written circa 1992), etexts encompass all genres, many languages and every era of human writing.

You can read an etext on your Newton or your portable computer. Cozy up with one on a long airplane ride, or on the beach, or in the bathtub. (Well, maybe not.) Seriously, even though electronic texts (along with your computer) might not be as portable as a paperback book, and even though you can't find the latest Danielle Steel novel on disk, etexts are wonderful resources for reading or research.

Why would anyone want to read a book on a computer rather than... well, from a book? According to Michael Hart, the man behind Project Gutenberg, one of the best-known etext projects, "The biggest advantage of electronic books is they cost about one one-hundredth of what a book costs. If you buy a copy of Alice in Wonderland right now, even in paperback, it will cost at least five bucks. On a 1.44-megabyte floppy, the same book uses a nickel's worth of space."

Because electronic texts are stored on a computer, it is much simpler to search a book - or an entire library - for specific information. Research using etexts is faster and easier than using traditional media. "My favorite example," Hart says, "is a paper I wrote on death and marriage in Hamlet, Macbeth, Romeo & Juliet and Othello. This is the kind of paper you couldn't even write without a computer. There are too many citations for even a Shakespeare scholar - you would be up to the ceiling in index cards."
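To make the search idea concrete, here is a minimal Python sketch of hunting through an etext for citations. It is only an illustration, not how Hart did his research: the function name find_citations is made up for this example, and the four lines of Hamlet stand in for a complete etext file you would load from disk.

```python
import re


def find_citations(text, keywords):
    """Return (line_number, keyword, line) for each line that mentions a keyword."""
    # Whole-word, case-insensitive patterns, so "death" does not match "deathless".
    patterns = {
        keyword: re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        for keyword in keywords
    }
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for keyword, pattern in patterns.items():
            if pattern.search(line):
                hits.append((number, keyword, line.strip()))
    return hits


# A few lines from Hamlet, standing in for a full etext file.
sample = """To be, or not to be, that is the question.
The undiscovered country, from whose bourn no traveller returns.
To sleep, perchance to dream.
For in that sleep of death what dreams may come."""

for number, keyword, line in find_citations(sample, ["death", "marriage"]):
    print(f"line {number} [{keyword}]: {line}")
```

Run the same function over four whole plays and you get the stack of index cards in a few seconds, each one already labeled with its line number.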

"I am hoping - presuming, almost - that as books are added to the public domain electronic library, these kinds of papers will be written. Your everyday research paper is 90% research and 10% writing. With etexts, it will be 10% research and 90% writing the paper," he says. Research using etexts doesn't require trips to the library, and according to Hart, the time saved shuffling pages leaves researchers more time for original thought.

"The term 'exhaustive research' will disappear, because it won't exhaust you anymore," he says. "Painstaking research will be as far in the past as painstaking hand copying in the monasteries."

Although the idea of reading books on a computer screen may seem farfetched now, Hart believes that as the cost of storing information and accessing networks continues to fall, electronic texts will become a standard means of reading and retrieving information. "I'm not saying books are going to completely die out in the next ten years, but this is an idea whose time has come," Hart says.

One problem slowing the release of texts lies in copyright law. Unless given specific permission from an author, you can't type in any old book or magazine article and re-publish it on the Internet (although folks sometimes have done that with my work - stop that!). The type of information you're likely to find in etexts is free information such as government documents (U.S. government documents are not copyrightable), works with expired copyrights, works placed in the public domain by their creators, and works that authors have made available in electronic form on an experimental basis. (The last group is the rarest, but it does include Bruce Sterling's excellent book "The Hacker Crackdown" and Eric Raymond's "New Hacker's Dictionary". Both books are also available in paperback, by the way.) According to the Internet on a Disk newsletter (discussed below), "When authors put their work in the public domain or retain electronic rights and make their work freely available in electronic form, the public gains access to their work for the indefinite future, and the authors win new readers."

Still, most authors aren't willing to "give" their work away as an etext, so most etexts are available because their copyrights have expired, meaning (under current copyright law) the author must have been dead for 50 years. So, while you can find many of the classics online, don't expect too many of today's New York Times bestsellers to be given away on the Internet archives.

Finding Etexts

OK, so where can you find etexts? First we'll look at the instant gratification methods for quelling an etext fixation, then we'll find out more about etext projects and their individual archives.

If you check only two places for electronic texts, make it these. On Usenet, read alt.etext, where you'll find information and listings of the latest etexts and ejournals (electronic magazines). Next, FTP or gopher to etext.archive.umich.edu, the electronic text archive at the University of Michigan. This is my favorite etext site, with goodies from all of the organized etext projects.

                    Internet Gopher Information Client v1.11
                 Root gopher server: etext.archive.umich.edu

                                   comedies

 -->  1.  allswellthatendswell.
      2.  asyoulikeit.
      3.  comedyoferrors.
      4.  cymbeline.
      5.  loveslabourslost.
      6.  measureforemeasure.
      7.  merchantofvenice.
      8.  merrywivesofwindsor.
      9.  midsummersnightsdream.
      10. muchadoaboutnothing.
      11. periclesprinceoftyre.
      12. tamingoftheshrew.
      13. tempest.
      14. troilusandcressida.
      15. twelfthnight.
      16. twogentlemenofverona.
      17. winterstale.

The Projects

Of course, someone needs to take the time and trouble to make etexts available. That usually means typing or scanning in documents, proofreading and finally distributing them on the Internet and online services. Several projects, operating cooperatively and in parallel, exist to do just that. Each of the following organizations has its own goals and ideas, and there is plenty of work to be shared.

THE FOURTH WORLD DOCUMENTATION PROJECT: The purpose of The Fourth World Documentation Project is to gather documents written by or about Fourth World Nations, process them into electronic text and distribute them to tribal governments, researchers, and individuals with an interest in the Fourth World. This project is an ongoing venture undertaken by The Center For World Indigenous Studies. Questions about The Fourth World Documentation Project may be sent to John Burrows at jburrows@halcyon.com.

PROJECT GUTENBERG: Project Gutenberg, led by Michael Hart, is perhaps the leader in etext distribution. To date, Project Gutenberg has released upwards of 100 etexts. Hart says the number of volunteers who type or scan books triples every year. Hart has met only about 5% of them - the rest are network users near and far. In each of the past three years, the project has released twice as many books as in the year before. "This doubling curve will get us to our goal of 10,000 books by the year 2001," Hart says.
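Hart's doubling arithmetic is easy to check. The Python sketch below is a back-of-the-envelope calculation under two assumptions of mine, not figures from the project: that the catalog stands at a round 100 titles in 1994, and that the total doubles once a year from then on. The function name year_goal_reached is made up for this example.

```python
def year_goal_reached(start_year, start_total, goal):
    """Double the running total once per year until it meets or passes the goal."""
    year, total = start_year, start_total
    while total < goal:
        year += 1
        total *= 2
    return year, total


# Assumed starting point: about 100 etexts in 1994, doubling every year.
year, total = year_goal_reached(1994, 100, 10_000)
print(f"{total} books in {year}")
```

Seven doublings take 100 books to 12,800, so under those assumptions the 10,000 mark falls in 2001, which matches the goal Hart quotes.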

Project Gutenberg's home base on the Internet is available via FTP at mrcnext.cso.uiuc.edu:/etext. Contents are arranged by year of release. The newest releases are in /etext/etext94, last year's are in the /etext/etext93 directory and so on.
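If you would rather script the trip than type FTP commands by hand, something like the following Python sketch would do it. Treat it as an illustration only: the host name and directory layout are the ones given above and may not stay reachable, the helper names release_dir and list_releases are made up for this example, and the two-digit directory pattern is inferred from the etext93 and etext94 examples.

```python
from ftplib import FTP

# Host and directory layout as described in the article; availability is not guaranteed.
HOST = "mrcnext.cso.uiuc.edu"


def release_dir(year):
    """Build the directory for one year's releases, e.g. 1994 -> /etext/etext94."""
    return f"/etext/etext{year % 100:02d}"


def list_releases(year, host=HOST):
    """Log in anonymously and return the file names in that year's directory."""
    with FTP(host) as ftp:
        ftp.login()  # no arguments means an anonymous login
        ftp.cwd(release_dir(year))
        return ftp.nlst()


print(release_dir(1994))
print(release_dir(1993))
```

Calling list_releases(1994) needs a live connection to the server, so the sketch only prints the directory names it would visit.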

You may also receive Project Gutenberg etexts via e-mail. To retrieve a list of available files, along with instructions on how to retrieve them, send a message:

    To: almanac@oes.orst.edu
    Body: send gutenberg catalog
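For the curious, the catalog request can also be put together in a few lines of Python. This sketch only builds the message; actually sending it would need a mail server of your own, which is left out here. The function name catalog_request and the sender address are placeholders invented for this example, while the recipient and the one-line body come from the instructions above.

```python
from email.message import EmailMessage


def catalog_request(sender):
    """Build the one-line request for the Project Gutenberg catalog."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "almanac@oes.orst.edu"
    msg.set_content("send gutenberg catalog")
    return msg


# reader@example.com is a placeholder; use your own address.
request = catalog_request("reader@example.com")
print(request)
```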

ONLINE BOOK INITIATIVE: The Online Book Initiative has been formed to make available freely distributable collections of information. It shares huge collections of books, conference proceedings, reference material, catalogues and more. The purpose of the Online Book Initiative is to create a publicly accessible repository for this information, a "net-worker's library." The OBI's goals are broader than releasing only electronic books. They're also active in getting journals, catalogues, conference proceedings, magazines, manuals, maps, images and technical documentation online.

To peruse the OBI archives, gopher to world.std.com and choose "OBI The Online Book Initiative". For more information, send e-mail to obi@world.std.com.

OXFORD TEXT ARCHIVE: The Oxford Text Archive is a facility provided by Oxford University. The Archive contains electronic versions of literary works by many major authors in English, Greek, Latin and a dozen other languages. It contains standard reference works as well as collections of unpublished materials prepared by field workers in linguistics. The total size of the Archive exceeds a gigabyte and there are over 1,300 titles in its catalog. Unlike many other etext projects, not all of the Oxford etexts are free and many are not freely distributable.

You can get the catalog via e-mail:

    To: LISTSERV@BROWNVM.BITNET
    Body: GET HUMANIST FILELIST

or by anonymous FTP from Internet site ota.ox.ac.uk (129.67.1.165) in the ota directory.

THE WIRETAP BOOK COLLECTION: The Wiretap collection is a huge archive of electronic texts, including fiction, religious texts and government information.

You can access Wiretap by gophering to wiretap.spies.com and choosing "Wiretap Online Library".

                     Root gopher server: wiretap.spies.com

      1.  About the Internet Wiretap/
 -->  2.  Electronic Books at Wiretap/
      3.  GAO Transition Reports/
      4.  Government Docs (US & World)/
      5.  North American Free Trade Agreement/
      6.  Usenet alt.etext Archives/
      7.  Usenet ba.internet Archives/
      8.  Various ETEXT Resources on the Internet/
      9.  Video Game Archive/
      10. Waffle BBS Software/
      11. Wiretap Online Library/
      12. Worldwide Gopher and WAIS Servers/

PLEASE COPY THIS DISK: Finally, there is Please Copy This Disk, the verbose name for a project that does not create etexts. Instead, PCTD redistributes the Internet's etexts on floppy disks. Their texts come from Project Gutenberg, the Oxford Archive, Wiretap and other sources. They've done a great job of cataloging the Internet's etexts and making them available in one place. Disks from PCTD cost $10 each for duplication and handling. Ten bucks for "free" information may seem steep, but the service is aimed at educators and others who may not have time or the right equipment to cruise the Internet trolling for etexts. As you might guess from the name, you may freely redistribute PCTD disks.

PCTD distributes a monthly newsletter listing what new etexts are available. (This is a great service even if you don't plan to buy their disks: once you know what new texts are available, you may just want to find them online yourself.)

For more information, or to receive the newsletter, send e-mail to samizdat@world.std.com.

