The Internet Archive is a nonprofit organization that preserves digital content, including more than 900 billion webpages as well as old television broadcasts, books, and music. It operates the Wayback Machine, which allows users to access historical versions of websites. It is significant because it serves as a public record of the internet and other digital media, ensuring that historical information remains accessible even if original sources are deleted or altered.
The Internet Archive is facing lawsuits from publishers and the music industry over copyright issues. In the Hachette case, publishers argued that the Archive's digitization and lending of books violated copyright law. Similarly, Universal Music Group is suing over the Archive's Great 78 Project, which digitizes old 78 RPM records. These cases threaten the Archive's ability to preserve and share digital content.
The Great 78 Project is an initiative by the Internet Archive to digitize and preserve early 78 RPM records, which are fragile and often unplayable on modern equipment. It is controversial because Universal Music Group claims that the project violates copyright law, while the Archive argues that it serves researchers and preserves cultural history.
If the Internet Archive shuts down, we risk losing access to a vast repository of digital history, including websites, books, music, and television. This could lead to a fragmented and less reliable record of our digital past, with increased reliance on unregulated or illegal archives that may lack the Archive's standards of curation and preservation.
The Internet Archive preserves digital content by crawling over 1 billion URLs daily, storing the captured pages on hard drives, and indexing them for access through the Wayback Machine. It also digitizes physical media like books and records, making them available for research and public use. The Archive collaborates with over 1,300 libraries worldwide to ensure broad preservation efforts.
The Hachette v. Internet Archive case involves the Archive's practice of digitizing physical books and lending them digitally under a 'controlled digital lending' model. Publishers argued this violated copyright law, and the court ruled against the Archive, finding that creating and distributing digital copies without publisher authorization infringes copyright.
The Wayback Machine is a tool provided by the Internet Archive that allows users to access historical versions of websites. It is significant because it preserves the internet's history, enabling users to see how websites have evolved over time and recover content that has been deleted or altered. It serves as a critical resource for accountability and historical research.
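For readers who want to look up a historical version of a page themselves, the following is a minimal sketch of how a Wayback Machine address can be assembled. It assumes the publicly visible URL pattern `https://web.archive.org/web/<timestamp>/<original URL>` and the public availability endpoint at `https://archive.org/wayback/available`; neither is described in the text above, and the function names `snapshot_url` and `availability_query` are hypothetical helpers invented for this illustration. The code only builds the addresses and makes no network requests.

```python
from datetime import datetime
from urllib.parse import urlencode

# Assumed public URL patterns; not taken from the text above.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def snapshot_url(page_url: str, when: datetime) -> str:
    """Build the address of the capture of page_url nearest to `when`.

    Hypothetical helper: formats the date as a 14-digit
    YYYYMMDDhhmmss timestamp and places it before the original URL.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


def availability_query(page_url: str, when: datetime) -> str:
    """Build a query asking whether a capture exists near `when`.

    Hypothetical helper: only constructs the query string; sending
    the request and reading the JSON reply is left to the caller.
    """
    params = {"url": page_url, "timestamp": when.strftime("%Y%m%d")}
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    date = datetime(2006, 1, 1)
    print(snapshot_url("http://example.com/", date))
    print(availability_query("example.com", date))
```

If no capture exists at the exact timestamp requested, the Wayback Machine generally serves the nearest one it holds, so the date works as a target and does not need to match a capture exactly.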
The Internet Archive is funded through a combination of library payments for digitization services, major donors and foundations, and contributions from end users. It operates on an annual budget of approximately $20-25 million, relying on public support to maintain its free services.
Brewster Kahle emphasizes that libraries play a crucial role in preserving and providing access to digital content. He argues that libraries, as nonprofit entities, ensure that historical and cultural materials remain available to the public, even as digital formats evolve. Libraries also serve as a counterbalance to corporate control over information.
The publishers' case against the Internet Archive is based on the argument that digitizing and lending books without authorization violates copyright law. The court ruled that the Archive's actions constituted making additional copies of works, which is not protected under the 'first sale' doctrine that applies to physical books.
More than 900 billion webpages are preserved on the Wayback Machine, a history of humanity online. Now, copyright lawsuits could wipe it out.