Projects like Project Gutenberg (est. 1971), the Million Book Project (est. circa 2001), Google Books (est. 2004), and the Open Content Alliance (est. 2005) scan books on a large scale. One of the main challenges is the sheer volume of books that must be scanned. In 2010, the total number of works that have appeared as books in human history was estimated at around 130 million. All of these would need to be scanned and then made searchable online for the public to use as a
universal library. Large organizations currently rely on three main approaches: outsourcing, scanning in-house with commercial book scanners, and scanning in-house with robotic scanning solutions. When scanning is outsourced, books are often shipped to low-cost providers in India or China. Alternatively, for reasons of convenience, safety and improved technology, many organizations choose to scan in-house, using either overhead scanners, which are time-consuming, or digital camera-based scanning machines, which are substantially faster and are used by the Internet Archive as well as Google. Once a page is scanned, the
data is either entered manually or extracted via optical character recognition (OCR), which is another major cost of book scanning projects. Due to copyright issues, most scanned books are those that are out of copyright; however, Google Books is known to scan books still protected by copyright unless the publisher specifically prohibits this.
=== Collaborative projects ===
There are many collaborative digitization projects throughout the United States. Two of the earliest projects were the Collaborative Digitization Project in Colorado and NC ECHO – North Carolina Exploring Cultural Heritage Online, based at the State Library of North Carolina. These projects establish and publish best practices for digitization and work with regional partners to digitize cultural heritage materials. Additional criteria for best practices have more recently been established in the UK, Australia and the European Union.
Wisconsin Heritage Online is a collaborative digitization project modeled after the Colorado Collaborative Digitization Project. Wisconsin uses a wiki to build and distribute collaborative documentation. Georgia's collaborative digitization program, the Digital Library of Georgia, presents a seamless virtual library on the state's history and life, including more than a hundred digital collections from 60 institutions and 100 government agencies. The Digital Library of Georgia is a GALILEO initiative based at the University of Georgia Libraries.
In the twentieth century, the
Hill Museum and Manuscript Library photographed books in Ethiopia that were subsequently destroyed amidst political violence in 1975. The library has since worked to photograph manuscripts in Middle Eastern countries. In South Asia, the Nanakshahi trust is digitizing manuscripts in the Gurmukhī script. In Australia, there have been many collaborative projects between the National Library of Australia and universities to improve the repository infrastructure in which digitized information would be stored. These projects include the ARROW (Australian Research Repositories Online to the World) project and the APSR (Australian Partnership for Sustainable Repositories) project.

== Methods ==