How the Wayback Machine is fighting linkrot

2024/9/5
Decoder with Nilay Patel

People
Mark Graham
Nilay Patel
Editor-in-chief of The Verge, known for sharp commentary on and analysis of big tech companies and political figures.
Topics
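Nilay Patel: This episode discusses a major problem facing the internet: large amounts of web content are going offline, eroding our digital heritage. Many sites, including news sites and social media platforms, risk having their content disappear. That loss affects public access to information and threatens historical research and cultural continuity.

Mark Graham: The Wayback Machine is a service of the Internet Archive that fights link rot by archiving web content. It works like a time machine, giving access to snapshots of websites as they existed in the past. The Internet Archive collects and preserves large volumes of data every day, which requires enormous storage capacity and sustained funding. Running the Wayback Machine involves web crawling, archiving, and indexing. The Internet Archive faces many challenges, including the hyper-personalization of the web, the fragmentation of the internet, and the shift from the web toward apps, all of which make preservation more complex. Scraping of web data by AI companies has also become controversial, and some sites have begun restricting crawlers, which creates new obstacles for the Archive's work. While preserving content, the Internet Archive also has to respect intellectual property and personal privacy: it removes certain content at the request of rights holders and reviews content that could cause real-world harm. Its funding comes from paid services, individual donations, and institutional donations. This preservation work matters to society because it protects digital heritage, supports public access to information, and provides valuable material for historical research. The Internet Archive must keep adapting to new technologies and social conditions to carry out its mission.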

Chapters
The episode introduces the problem of digital decay and the role of the Internet Archive in preserving online content.
  • 38% of webpages that existed in 2013 are no longer accessible.
  • The Internet Archive was founded in 1996 and launched the Wayback Machine in 2001.
  • The Wayback Machine allows users to view snapshots of websites at given moments in time.

Shownotes

The web has a problem: huge chunks of it keep going offline. The web isn’t static; parts of it sometimes just… vanish.

But it’s not all grim. The Internet Archive has a massive mission to identify and back up our online world into a vast digital library. In 2001, it launched the Wayback Machine, an interface that lets anyone call up snapshots of sites and look at how they used to be and what they used to say at a given moment in time. Mark Graham, director of the Wayback Machine, joins Decoder this week to explain both why and how the organization tries to keep the web from disappearing.

**Links:**

  • When Online Content Disappears | Pew Research

  • Game Informer is shutting down | The Verge

  • When Media Outlets Shutter, Why Are the Websites Wiped, Too? | Slate

  • MTV News lives on in the Internet Archive | The Verge

  • The video game industry is mourning the loss of Game Informer | The Verge

  • Guest host Hank Green makes Nilay Patel explain why websites have a future | Decoder

  • How The Onion is saving itself from the digital media death spiral | Decoder

  • The Internet Archive is defending its digital library in court today | The Verge

  • The Internet Archive has lost its first fight to scan and lend ebooks | The Verge

  • The Internet Archive just lost its appeal over ebook lending | The Verge

**Credits:**

Decoder is a production of The Verge and part of the Vox Media Podcast Network.

Our producers are Kate Cox and Nick Statt. Our editor is Callie Wright. Our supervising producer is Liam James.

The Decoder music is by Breakmaster Cylinder.

Learn more about your ad choices. Visit podcastchoices.com/adchoices