- Internet Archive aims to create an accessible repository of all human knowledge
- Comprehensive record of the media encourages wider perspectives
- Public can participate in preservation efforts by recommending pages and uploading data
What if you couldn’t remember what you did yesterday? Or what happened the last time you touched a hot stove? Or who won the Civil War?
Memories remind us of what happened in the past and teach us to make better choices in the future. But what happens when that past becomes invisible, deluged by the ever-changing present?
“We’ve become such a media-soaked environment; information is coming from every which way,” says Brewster Kahle, founder and digital librarian of the Internet Archive (IA). “People are having a hard time understanding and distilling it all — they tend to get lost in the present.”
The average web page lasts only 100 days before it is changed or deleted. But the IA is preserving those pages, along with books, music, video, and software, and making them permanently accessible to anybody in the world who wants to use them.
Library of Alexandria 2.0
Kahle views the IA’s mission as akin to building the Library of Alexandria 2.0. The famed library flourished in ancient Egypt for several centuries before being devastated by fire.
“Libraries are usually destroyed by governments because the new guys don’t want the old stuff around,” says Kahle. “If there had been a backup copy of the original Library of Alexandria, we’d still have the plays of Euripides or the other works of Aristotle. But we don’t.”
The IA’s best-known service, the Wayback Machine, has collected 279 billion historical web pages and is growing at a rate of half a billion pages per week. One thousand librarians are helping to build the archive’s collections, which are currently estimated at over 30 petabytes of data and are used by hundreds of thousands of people a day. The IA maintains copies on servers around the world, in case of another fire like the one that wiped out the Library of Alexandria.
Scientists are anxious to collect and preserve government data. Vast amounts of digital information are at risk of vanishing when a presidential term ends and administrations change. For example, 83% of government PDFs disappeared between 2008 and 2012.
Working with the Library of Congress, the IA archives all .gov and .mil websites, selected FTP sites, and official social media accounts in the End of Term Archive.
“We’re also working with climate scientists at various universities who are concerned about what the change in administration might mean for their research,” says Kahle. Climate Mirror is a distributed effort with several universities to mirror and back up U.S. federal climate data.
“People put reusable datasets into the IA as a kind of non-profit cloud service,” says Kahle.
In one effort, the IA archives 61 television channels from 25 countries, 24 hours a day, seven days a week. The archive is text-searchable through its transcripts.
Since the U.S. presidential election in November 2016, the IA has created a special collection focused on the television appearances of Donald Trump.
“When Trump was elected, we pulled out his campaign appearances and interviews and made it into a special collection that can be searched to uncover what candidate Trump or president-elect Trump has said he’s going to do. It’s been fantastically useful,” says Kahle.
The collection launched on January 7, 2017. Its most popular video, of one of the presidential debates, has been seen or downloaded one million times.
“Television is a very pervasive and persuasive medium,” says Kahle. “Yet it just flows over us. Unlike text, it’s not easy to quote, compare, and contrast.”
Kahle asserts that collecting information and making it available isn’t a partisan activity.
“We’re trying to make it so people can refer to what has happened; so that if someone makes a claim about x or y, you can compare it against what has been published before,” says Kahle.
We, the people
“We need more participation and better tools to make sense of it all, to help people understand what’s coming at them from many different angles, and not just recline back and watch one bubble,” says Kahle. “Right now it’s hard to get a broader view.”
You can contribute to the preservation of our digital history by suggesting websites to be archived, or by uploading your own data to be preserved or indexed by the IA.
“I think this is the opportunity of our generation,” says Kahle. “The last generation put a man on the moon. I think we can achieve universal access to all knowledge.”