One organization is just starting to work with Catalog-It. They have decided to pursue a dual strategy of publishing on NY Heritage and through Catalog-It. About Catalog-it:
Q: When working on digital preservation, do you need to retroactively change all your file names? How often are you checking your digital masters to make sure they’re still good?
Organizations have invested in creating a lot of digital files (scan everything!) and are now faced with storing and managing those files. It’s good to digitize when you have a plan for access + when the original materials are at risk. Practice good selection when you’re considering what to digitize.
One institution is using Flickr to “store” graduation photos. Is that a wise decision? Another organization is also using Flickr to showcase recent library history/events. It’s a good option for items that aren’t a good fit for New York Heritage.
One library has a server they are using. They are starting to go through things to determine what is relevant. This became an issue when server space started to run short. It’s important to think about how scanning can be scaled up in a manageable way.
Some organizations are using Google Drive for storage. Google doesn’t provide information about the number of files in - or storage size of - directories, which is challenging from a Digital Preservation standpoint. There are add-ons that can do this, but there is concern about using them (potential for malware, etc.).
One library is experimenting with Bagger/Bagit to package/store files.
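For anyone curious what a "bag" looks like on disk, here is a minimal sketch in Python that copies a folder into a BagIt-style layout: a data/ folder, a bagit.txt declaration, and a manifest-sha256.txt listing a checksum for each file. It is only an illustration of the layout, not how Bagger works internally, and it skips optional pieces such as bag-info.txt. The function name make_simple_bag and the folder names are made up for this example. For real work, an established tool like Bagger is the safer choice.

```python
import hashlib
import shutil
from pathlib import Path


def make_simple_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write the two core BagIt text files.

    Hypothetical helper for illustration; bag_dir/data must not already exist.
    """
    source_dir, bag_dir = Path(source_dir), Path(bag_dir)
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)  # also creates bag_dir if needed

    # Bag declaration: tells readers this folder is a bag and how text is encoded.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "checksum  path" line per file under data/.
    # Whole files are read into memory here, which is fine for a small demo only.
    lines = []
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            checksum = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{checksum}  {path.relative_to(bag_dir).as_posix()}")
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(lines) + "\n", encoding="utf-8"
    )
    return bag_dir
```

Because the manifest travels with the files, whoever receives or stores the bag can recompute the checksums later and confirm nothing changed in transit or storage.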
One organization has a collection of digital newspaper articles. Right now they are on Google Drive and on a server. They are looking for ways to make them more available to the public. They currently use them as the source for blog posts.
One organization has a lot of different files originating from different places. Some are emailed images that were shared by the public and not necessarily scans from the collection. Does anyone have a protocol in terms of naming things?
Fixity checking
Fixity checking is something you do to make sure your files haven’t been changed, corrupted, moved, or altered in any way.
You run your files through a program and it calculates a unique number for each file (a checksum, often referred to as a digital fingerprint). Then you periodically check the files again. If a number changes or a file no longer turns up, you know the file has been altered, corrupted, or deleted.
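The idea can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a replacement for a tool like the one mentioned in these notes. The choice of SHA-256 and the function names (sha256_of, build_manifest, check_fixity) are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of one file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def check_fixity(root, manifest):
    """Compare the files under root against a previously recorded manifest."""
    current = build_manifest(root)
    return {
        "changed": sorted(
            k for k in manifest if k in current and current[k] != manifest[k]
        ),
        "missing": sorted(k for k in manifest if k not in current),
        "new": sorted(k for k in current if k not in manifest),
    }
```

In practice you would run build_manifest once, save the result somewhere outside the folder being checked (for example as a JSON file), and then call check_fixity on a schedule. Anything listed under "changed" or "missing" needs attention.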
One library does fixity checking every month. Everything gets checked over the span of a year. They use QuickHash GUI - https://www.quickhash-gui.org/. It is free, open-source software. They are willing to share procedures and provide a demo.
There is no consensus on how often fixity checking should be done. Practice varies widely from one organization to the next.
If something goes wrong: this is why it’s important to have multiple copies. If a fixity check shows that a file has been changed, you can replace it with a copy.
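As a rough sketch of that repair step in Python: before overwriting the damaged file, confirm that the backup copy still matches the checksum you recorded, so you never replace one bad file with another. The function name restore_from_copy, its parameters, and the use of SHA-256 are assumptions for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of one file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_from_copy(relative_path, expected_checksum, primary_root, backup_root):
    """Replace a damaged file with its backup copy, if the backup is still good.

    Returns True when the restored file matches the recorded checksum,
    False when the backup is missing or fails its own check.
    """
    source = Path(backup_root) / relative_path
    target = Path(primary_root) / relative_path
    if not source.is_file() or sha256_of(source) != expected_checksum:
        return False  # backup is unusable; try another copy instead
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # copy2 also keeps timestamps where possible
    return sha256_of(target) == expected_checksum
```

If this returns False, the backup itself has a problem, which is the case for keeping more than two copies.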
Some of the more expensive digital preservation systems incorporate fixity checking and repair.
Using stand-alone fixity software is not out of reach for smaller institutions.
Southeastern uses Amazon Glacier, which duplicates data and does automatic checking and repairing. Cloud services like Amazon offer this as part of the service. However, these services don’t always provide an audit or reports. Amazon might be working on it because so many cultural heritage institutions are using Glacier to support digital preservation.
Beyond reliability, you need to think about who has access to the data, what the back-ups are, and who is accountable for it. If you’re uploading data to something, you have to think about how it can be recovered.
Financial side: you need to assign a value to your data. How much would it cost to recreate the work that has been done? This can be a way to advocate for the software, hardware, and services you need to protect your investment.
DPOE-N offers emergency hardware grants. One library was able to get a grant for a hard drive. https://www.dpoe.network/emergency-hardware-support
They also offer professional development grants. https://www.dpoe.network/professional-development-support/
They are a great resource for learning about digital preservation.
Helpful class on digital preservation: https://libraryjuiceacademy.com/shop/course/183-introduction-digital-preservation/
Digital Preservation Handbook is a great resource (includes video tutorials throughout): https://www.dpconline.org/handbook
A good place to start: create a high-level inventory of what you have. This is especially helpful if you have materials in different locations. Record formats, numbers, etc. This helps you get a sense of how much you have and what you need to do to preserve it. All planning decisions flow from this document.
Be cautious of low-cost hard drives. If they seem too good to be true, they probably are. https://www.grc.com/validrive/the-report.htm
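For files sitting on a server or local drive, a first pass at such an inventory can be generated rather than counted by hand. The Python sketch below walks one or more folders and totals the number of files and bytes per file format in each location. The function name inventory and the output fields are assumptions for this example, and it only sees storage that is mounted as a regular folder.

```python
from collections import defaultdict
from pathlib import Path


def inventory(locations):
    """Summarize file counts and total bytes per location and file format."""
    rows = []
    for location in locations:
        totals = defaultdict(lambda: [0, 0])  # extension -> [file count, bytes]
        for path in Path(location).rglob("*"):
            if path.is_file():
                # Group .TIF and .tif together; label files with no extension.
                ext = path.suffix.lower() or "(none)"
                totals[ext][0] += 1
                totals[ext][1] += path.stat().st_size
        for ext, (count, size) in sorted(totals.items()):
            rows.append(
                {"location": str(location), "format": ext, "files": count, "bytes": size}
            )
    return rows
```

The resulting rows can be pasted into a spreadsheet, giving a quick picture of how much material exists in each place and in which formats.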
Do you store like (physical) materials together or keep items together with their collections?
Would it be good to continue discussions and create a special interest group on this topic? People are interested! Southeastern will create a SIG on Digital Preservation. Meetings will be announced via our newsletter. People who attended the previous SIG will also be notified of the next meeting.
Sign up for Southeastern’s newsletter here: https://airtable.com/appF5045dT9RSe7HP/shrF1StKqcdSVugMT