
Special Interest Group Meeting Notes: Digital Preservation SIG 10/22/2025

Notes from the most recent meetings of special interest groups at Southeastern

October 22, 2025

The Digital Preservation SIG met on October 22, 2025, via Zoom.
Topic: Web Archiving

Web Archiving Presentation by Kim Gianfrancesco

Presentation Links and Discussion Notes

Links shared during Kim's Presentation:

Discussion Notes:

What is Vassar’s scope for archiving for the college? What do you preserve?

  • The whole website is not being archived. 
  • There are certain offices of the college that are crawled regularly.
  • This includes sunsetting sites and sites with ongoing projects that they want preserved.

Do people ever ask if they can see their archived sites?

  • Vassar uses Archive-It, so everything is available to view on the Wayback Machine.
  • They can also provide access to “good crawls” through the digital library.

What is a good crawl vs. a bad crawl?

  • Sites with Flash animations can’t be archived. Flash is gone now!
  • Usually if a crawl is not good, it didn’t capture what it was supposed to. Sometimes this can just happen if a site was crawled when it wasn’t working.  

Some concerns about using the Internet Archive: 

  • The Internet Archive was hacked recently. This can make it scary to put all of your eggs into one basket. 
  • The Internet Archive has also been sued a lot recently (Hachette v. Internet Archive, the Great 78 Project lawsuit). There is concern about sustainability when they are spending money to fight lawsuits.

Community Webs, a program of Archive-It and the Internet Archive, advances the capacity for public libraries and other cultural heritage organizations to document the digital heritage of their communities through web archiving, digital preservation, and community archiving: https://communitywebs.archive-it.org/

A lot of local news platforms are behind paywalls. What if you wanted to do some community web archiving around local news? Can you make arrangements to get behind a paywall?

  • With Archive-It you can theoretically crawl after logging in.
  • There are some challenges in getting it to work properly.
  • Overall if the crawler can’t access the site, it is not going to be crawled. 
  • Crawling materials behind a paywall is also a copyright issue. You would need permission from the organization before doing so. 
  • A note from Zack: with sites using Cloudflare, you have to make sure settings are configured to allow the bots to crawl the site.
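
As a rough illustration of the point that a crawler can only capture what it is allowed to reach, the sketch below checks a site's robots.txt rules for a given crawler. It is only a minimal example under stated assumptions: the user-agent token `archive.org_bot`, the sample rules, and the `news.example.org` URLs are placeholders for illustration, not details from the meeting. Robots.txt is also only one layer; Cloudflare bot settings and paywall logins are configured separately and are not covered here.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the user-agent token your crawler announces.
# Confirm the real value in your crawling service's documentation.
CRAWLER_AGENT = "archive.org_bot"

# Hypothetical robots.txt content, used so the example runs offline.
SAMPLE_ROBOTS = """\
User-agent: archive.org_bot
Disallow: /subscribers/

User-agent: *
Disallow: /
"""


def crawler_allowed(robots_txt: str, url: str, agent: str = CRAWLER_AGENT) -> bool:
    """Return True if the robots.txt rules permit `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/local/story.html", "/subscribers/today.html"):
        url = "https://news.example.org" + path
        print(url, crawler_allowed(SAMPLE_ROBOTS, url))
```

To check a live site instead, the same parser can load the file directly with `set_url()` and `read()`. Either way, a passing result only means robots.txt does not block the crawler; it does not settle the copyright and permission questions noted above.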

METRO’s repository software can support web archives. Jen recently discovered that an older grant-funded METRO site, Culture in Transit, is available through their DCMNY repository.

Web archives hosted on Archive-It can be cataloged in library catalogs and repositories and accessed through a link.

Archive-It has a feature that will automatically redirect to a sunsetting site.

Maybe at a future SIG, Kim, Jen and Palash could share their recent NDSA presentation. 

The iPRES international conference is being held soon in New Zealand. They have something called iPRES radio, where they will post updates every day. The radio is timed for different regions around the world: https://www.dpconline.org/events/eventdetail/528/-/ipres-radio-2025

Next meeting date TBD. If anyone has a topic they'd like us to focus on, let Jen at Southeastern know!

 

Southeastern NY Library Resources Council
21 South Elting Corners Road | Highland, NY 12528
Phone: (845) 883-9065
www.senylrc.org