Creating a Free-to-Read International Digital Library

 

Erika Linke, Associate University Librarian, Carnegie Mellon

Denise Troll, Associate University Librarian, Carnegie Mellon

 

This paper articulates the vision of a free-to-read, universally accessible, million-book digital resource and describes the concrete steps being taken to move from vision to reality.  With fifteen partners in China, India, and the United States, the million-book project illustrates the collaborative effort and essential elements involved in creating a large-scale, multi-national digital library.

 

To date, funding for equipment and training has been received from the U.S. government to establish digitization centers in China and India.  An additional equipment grant proposal has been submitted to set up more centers and enable the digitization of different media.  On-site training will begin in January 2002 to ensure that the centers follow the standards essential to providing universal access and facilitating future migration of the resources.  China and India are supporting the labor costs of digitization.  Carnegie Mellon University is contributing the training materials and personnel, as well as the software for metadata capture, retrieval, and delivery.

 

All project partners will contribute content to ensure that the digital library is broad, diverse, and multilingual.  Each partner country has begun to identify materials to be added to the collection.  A collection development meeting of partner libraries in the United States, funded by a planning grant, resulted in the design of a pilot project to send 25,000 books to China and India and to monitor the effort of selecting and packing the materials and tracking them to and from their destinations.  The pilot will enable projections of labor costs, shipping costs, and turn-around times, which will help streamline the process for the million books.  Out-of-copyright material will be the focus of initial digitization efforts, but an additional planning grant will convene selected publishers in February 2002 to explore incentives for including copyrighted material.  This planning grant will also convene representatives from selected non-profit organizations to discuss alternative models for creating a trusted repository for the collection.  In the meantime, discussion of content selection criteria will continue among the partners, and additional grant proposals will be written to fund freight costs and the cost of seeking copyright permission.

 

A trusted repository is necessary to ensure not only that the collection is sustained over time, but also that it continues to grow.  Sustainability includes financial considerations.  While the vision of the million-book project is to provide universal and perpetual access to a free-to-read resource, efforts to partner with publishers and non-profit organizations acknowledge that the free-to-read resource will have minimal functionality.  Companies like OCLC are encouraged to provide value-added services for which they could charge a fee.

 

Carnegie Mellon University Libraries and School of Computer Science are writing grant proposals and coordinating the project.  OCLC has committed to maintaining a registry of the books sent for scanning, books already digitized, and books waiting to be sent.  The registry will prevent duplication of effort and enable tracking of materials and project progress.  The million-book digital library will be a major step toward the democratization of knowledge and will provide a testbed for future research in many disciplines, including automatic summarization, extraction of structural metadata, and machine translation among languages.