Preserving Websites? Yes, it Exists!

Author: Victoria Swindle

This fall I interned for the Senator John Heinz History Center here in Pittsburgh. Given the current situation, I was working remotely on a project relating to the Coronavirus/COVID-19 Pandemic under the guidance of Archivists Sierra Green and Carly Lough through the History Center’s Detre Library and Archives. I was able to spend hours reviewing trial captures (crawls) of websites from multiple areas of society that have been affected by the pandemic.

These crawls captured websites from local business and industry, community organizations, religious communities, municipal government, people with disabilities, schools, recreation, healthcare, charitable giving, the arts, and cultural institutions. A number of the business webpages captured belonged to small businesses that had to close their doors due to complications of the current situation. I have shared a screenshot from a crawl of a small business called The Pittsburgh Yarn Company, which had to close its store. In the screenshot, there is an empty white box above the text, and two images are missing above the business name. Despite the missing images, the main reason for archiving this particular website is what the text says. The company states that “The COVID-19 pandemic was really the nail in the coffin for us and there is no way we can continue to maintain the shop.” Thankfully, in this case, the History Center archivists were able to “patch” this crawl so that the missed images were preserved as well.

The capture of this webpage is now a part of the History Center’s web archive, which will be open to researchers in the future. This is just one of many stories that have been preserved through web crawls. Throughout my work, I have also reviewed Facebook posts, Instagram posts, and even videos. There were a number of local businesses and places of worship that posted weekly videos for their audiences to watch. So, every week, there were several webpages that I would review to make sure the weekly content was displaying correctly. Now, a digital timeline records every week in which a particular webpage was captured.

At times, an initial crawl would not go according to plan. In one instance, images of a business’s staff would not load when I replayed the archived webpage using the Wayback Machine. It was important for these images to show up in the archived webpage because they helped tell the complete story of that particular business. When this happened, I would flag it as an issue, and either Sierra or Carly would tweak the crawling parameters or try a different approach to produce a more complete result.

Each of our lives has been affected by the current pandemic. These archived webpages are now part of the growing documentation of how Western Pennsylvanians have responded to it. So much of the information about this pandemic is rapidly published, updated, and changed only on the Internet. The History Center’s ability to archive webpages is crucial to its efforts to document all sections of society in Western Pennsylvania and the Pittsburgh area during this pandemic.

Victoria Swindle, Museum Studies Intern at the Senator John Heinz History Center’s Detre Library & Archives, Fall 2020

Constellations Group