Maintenance: Preservation, Development and Migrations
Introduction
We have reached the final stage of the first cycle of our digital archive’s life. We have planned out and created our digital archive, preserved and secured our invaluable archival content, and made it accessible to the community.
However, we cannot just sit back now and do nothing. The wheels of a digital archive need a lot of continuous oiling to preserve and make our content accessible. This is why, in digital archiving, in particular, the notion and approach of active maintenance is essential; it is an integral part of long-term preservation.
Here is how the United Nations views the matter:
Digital Preservation is the active management and maintenance of digital objects so they can be accessed and used by future users.
The goal of digital preservation is the accurate rendering of authenticated content over time to ensure its authenticity, accessibility, and usability.
Active maintenance is not only needed to keep our digital archive running but also the key to long-term preservation, authenticity, and access to digital content. Suppose the format of our files becomes obsolete, our storage media fails, or our backup software is outdated and flawed. In that case, our invaluable digital content may become compromised, damaged, or lost altogether, along with all the work we have put into building and developing the digital archive.
Given that active maintenance is not an afterthought but rather the very core of digital archiving, it requires a systematic approach and the regular performance of a set of actions that include monitoring and migration. As with access and security, it is a good practice to create a Maintenance Plan centered around these two main sets of maintenance actions: a plan that lists, describes, and schedules the monitoring and migration activities. The specific elements of the Maintenance Plan, such as periods for regular checkups or concrete procedures, will be dictated by the circumstances of a given archive. We can, however, describe the key elements and actions that need to be included.
Active maintenance: Monitoring
To properly maintain our digital archive, we must monitor its functions and elements and make the necessary adjustments. This primarily includes monitoring, checkup, and preservation actions on data, software, and storage media. In addition, we need to regularly observe, revise, and update our data security and access plans and their implementation.
Data Monitoring and Preservation Actions
Monitoring and preservation actions we need to conduct on our data in the maintenance phase are, in essence, a continuation of the work we performed as part of the preparation of material for ingest—from plain backup of data to checkups of file format, validity, fixity, and quality assurance.
At this stage, we need to plan and schedule regular, periodic performance of these preservation actions, check for any irregularities, and then follow up to amend them. We also need to plan for these functions as a necessary step in any major archival data-related activity, such as data migration or software replacement.
Antivirus checkups. As always, ensuring that data is virus-free is an essential precondition for any further actions on data. In addition to antivirus measures in place for the entire Digital Archiving System, we also need to be mindful and perform antivirus checks on our digital content whenever it has been exposed to a networked environment or other virus-related threats.
Backup. For the maintenance phase, it is important that our backup copies are also monitored and replaced when appropriate, mirroring any actions on our archival master files. Hence, backup files should be subjected to the same type of scheduled checkups as our master files. Alternatively, we could make new backup copies from master files following their regular checkups. Additionally, if there are any changes to the archival master files, their backup files will also need to be replaced. It is a good practice to extend backup considerations in the maintenance phase whenever possible, including by developing a so-called “Disaster Recovery Plan.”
This refers to creating a plan on how our data will be recovered or replaced in case of any major natural or human-caused failure, damage, theft, or malicious attack on our digital archival content or system. The plan will be based on our existing backup arrangements, which define the number of backup copies, their geographic location, and the type of storage media used, as described earlier. The disaster recovery plan should provide instructions on which backup copies should be used, in which disaster-type circumstances, and by which technological means to replace and recover our data.
Format and obsolescence. In the pre-ingest phase, we made sure that all our files were in formats that were operational and could be opened and properly displayed by currently available software. Similarly, in the maintenance phase, we need to schedule regular audits of our file formats to ensure that they are not in danger of becoming obsolete. If we find that a format we are using is becoming obsolete or that support for it will be discontinued, we need to act, which usually means migrating files to a newer or more suitable format. For both file format monitoring and migration, we can use specialized software tools, some of which we already mentioned.
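A scheduled format audit can be partly automated. The following is a minimal Python sketch, not a definitive tool: it walks a folder, counts files by extension, and flags those on a watchlist. The watchlist `AT_RISK_EXTENSIONS` and the function name `audit_formats` are hypothetical, and the listed extensions are placeholders for illustration only. In practice, format identification should rely on dedicated tools rather than on file extensions alone.

```python
from collections import Counter
from pathlib import Path

# Hypothetical watchlist: extensions the archive has flagged as at risk.
# The entries are illustrative assumptions, not an authoritative list.
AT_RISK_EXTENSIONS = {".wpd", ".rm", ".swf"}


def audit_formats(root, at_risk=AT_RISK_EXTENSIONS):
    """Count files per extension under root and list those in at-risk formats."""
    counts = Counter()
    flagged = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower()  # treat ".PDF" and ".pdf" as the same format
        counts[ext] += 1
        if ext in at_risk:
            flagged.append(path)
    return counts, flagged
```

The returned counts give an overview of the formats held, while the flagged list is a starting point for deciding which files may need migration.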
Fixity, Validity, and Quality Assurance. Even preserved digital files can change over time, which can then affect their format and/or quality. Hence, similar to monitoring file formats, we also need to plan and schedule regular checkups of our files’ fixity, validity, and quality. Equally, we should plan to include these preservation actions as part of any major data-related actions, such as data migration.
What differs is how we will approach irregularities we might detect or any changes we find to our data. In case of a detected change in a file’s format, quality, or fixity, we can follow a three-step rule of thumb: repair, restore, and document.
This means that in the first instance, we can try to repair the file using dedicated software tools for the given file format. If repair is not an option, we should restore the file from one of our backups. In case we do not have a backup or it is not usable, we can decide to preserve the changed original file. Regardless of what we decide, in the end, we need to document our action and detail the decision that was made and why, which is to be preserved as part of the metadata along with the file.
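The restore and document steps of this rule of thumb can be sketched in Python as follows. This is only an illustration under stated assumptions: fixity is checked against a SHA-256 checksum recorded earlier (for example, at ingest), there is a single backup copy, and the repair step is left out because it depends on format-specific tools. The names `sha256_of` and `check_and_restore` and the outcome labels are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_and_restore(master, backup, expected, log):
    """Verify one master file; restore it from backup if its fixity changed.

    The repair step is omitted because it depends on format-specific tools.
    Every outcome is appended to log so it can be kept with the file's metadata.
    """
    master, backup = Path(master), Path(backup)
    if master.exists() and sha256_of(master) == expected:
        outcome = "ok"
    elif backup.exists() and sha256_of(backup) == expected:
        master.write_bytes(backup.read_bytes())  # restore from the intact copy
        outcome = "restored_from_backup"
    else:
        outcome = "no_usable_backup"  # a human decision is needed; document it
    log.append({"file": master.name, "outcome": outcome})
    return outcome
```

Whatever the outcome, the log entry records what was done, so that the decision can be preserved with the file's metadata.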
Monitoring Software
The software we apply in our Digital Archiving System—be it open source or commercial, an all-in-one solution, or a combination of specialized tools—also needs to be regularly monitored so that it continues to meet our requirements and avoids becoming obsolete.
An archive’s requirements, as mentioned earlier, are not “set in stone.” As it is expected that they will change over time, we need our software to support those changes—which is why it is so important for the software we use to have strong support. We can then rely on this support—in the form of a community of software users and developers or a commercial service—to provide upgrades or additions for any new or revised requirements.
Monitoring and improving our software will ensure that it continues to meet our requirements, even when those requirements change. However, suppose our monitoring shows that a specific software can no longer be adapted or is losing its support community. In that case, we need to find a new appropriate software solution and migrate to it.
RESOURCE Alert!: Software Tool Registers. |
---|
Although digital archiving would not be possible without software tools, and their quick pace of proliferation has been very beneficial, the sheer number and scope of possible and offered solutions can create difficulties in finding and selecting the right option. A number of digital archiving software registries have been created, which provide lists and descriptions of different tools. A good starting point is the COPTR registry, which has the advantage of drawing information from a variety of sources and thus provides a good overview. |
Therefore, an important element of monitoring our software as part of maintenance is following the new developments and services provided through upgrades and novel solutions and having access to a community of software users and developers. This is especially the case for civil society-run human rights archives, as many of them struggle with the expertise, resources, and capacity needed for the development and maintenance of the technological element of a digital archive. There are inspiring examples that show how such synergistic partnerships can be built, and new out-of-the-box solutions can be applied to shared technology-related problems.
Monitoring Storage Media
Monitoring our storage media is necessary to detect any errors or damage in a timely manner and to keep the media from becoming obsolete or outdated.
Over time, storage media can become unstable and unreliable and cause data corruption or loss. A rule of thumb for a safe preservation practice is for storage media to be given only a short lifetime, sometimes estimated at only three to five years. This means that we will need to find and obtain new storage media and migrate our data after this period. This migration is somewhat less demanding than the file format migration but still requires all data preservation actions to be performed as part of the process.
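To keep track of this lifetime, a simple inventory check can flag media that are due for replacement. The Python sketch below is a hedged illustration: the four-year threshold `MAX_AGE_YEARS` is an assumed policy chosen within the three-to-five-year range, and the inventory structure (a label mapped to the date the media entered service) and the function name are hypothetical.

```python
from datetime import date

# Assumed policy: replace media after 4 years, within the three-to-five-year
# range mentioned above. Adjust to the archive's own Maintenance Plan.
MAX_AGE_YEARS = 4


def media_due_for_replacement(inventory, today, max_age_years=MAX_AGE_YEARS):
    """Return labels of media whose time in service has reached max_age_years."""
    due = []
    for label, in_service_since in inventory.items():
        age_years = (today - in_service_since).days / 365.25
        if age_years >= max_age_years:
            due.append(label)
    return due


# Example inventory with hypothetical labels and dates.
inventory = {"disk-A": date(2019, 1, 1), "disk-B": date(2023, 6, 1)}
print(media_due_for_replacement(inventory, today=date(2024, 1, 1)))
```

Running such a check on a schedule gives advance notice, so the migration to new media can be planned rather than rushed.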
It is a good practice to expect and plan for human- and technology-caused failures to happen to our storage media over time, even in the best of circumstances, regardless of how good the technology is. This is why developing a clear Disaster Recovery Plan benefits a digital archive’s maintenance.
The best strategy, however, remains to create a strong, resilient backup system with multiple independent copies stored in different locations and using different technologies (whenever possible). Coupled with the regular performance of all data preservation actions, risks will be minimized and spread, thus ensuring we never have to rely on a single piece of technology to preserve our invaluable data.
Monitoring Access and Data Security
Implementation of our Access and Data Security plans also needs to be regularly audited to ensure they remain functional and meet the requirements. The plans and related practices should be revised when requirements change or shortcomings are identified.
In terms of access, monitoring includes following the statistics of our data use and users. Such data should be provided by the access software solution we use and can help us better tailor, organize, and deliver our access services.
Regarding security, we need to carefully monitor the planned and implemented arrangements to ensure timely identification of any weak spots or shortcomings that might threaten our data. Data on access to and use of the archival material can also be useful for reviewing and revising our Data Security Plan.
We should also always be on the lookout—when possible, through a community of users or other CSOs—to advance our access services and security arrangements by applying novel technologies that can sometimes substantially improve both the user experience and the safety of our data.
Active maintenance: Migration
Migration is one of the concepts that are uniquely important for digital archiving. In a sense, it represents the very essence of it: the constant change, adaptation, and solution-finding required to keep our digital content alive, preserve it, and make it accessible in the future.
Fittingly, this comes at the very end of a digital archive’s life cycle, signaling the end of one of its iterations and the beginning of another.
Migration of Data, Software, and Storage Media
In the previous section, we introduced different ways migration is essential in maintaining a digital archive, as it allows us to preserve our data regardless of its format, the software we use, or the storage media.
The decision to migrate data, software, or storage media should be planned and scheduled to the greatest extent possible, and it should not be made hastily. New, groundbreaking software or hardware solutions might be tempting as a great way to improve our archive’s services. Still, we should be wary of untested solutions and bear in mind that no migration is a simple process: it requires time and resources and inevitably brings its share of risks. Data can be changed, damaged, or lost even in a simple transfer, and more so in a complex format or software migration. However, if we hesitate for too long, the software might become obsolete, making migration much more difficult.
Therefore, migration needs to be carefully planned and performed promptly and systematically, following some basic rules and best practices:
- Whether we are migrating data, software, or storage media, we should always include a set of preservation actions—including checkups of fixity, validity, and quality assurance—as a mandatory step before and after the actual transfer of files.
- For file format migration, it is important always to retain the original file along with the new migrated file, as the migrated file might have lost some of the properties of the original file, which is not always detectable by the software. In such a case, we will have to decide what data we consider acceptable to lose (if any).
- To reduce the risk of changes to files going unnoticed by our software during file format migration, it would be a good idea to plan for a quality control test as part of the migration process. This would include manually opening and checking a reasonably sized sample of migrated files based on a set of acceptance criteria we develop, for example, covering each file's formatting, look and feel, and functionality.
- For software and storage media migration, it is also a good practice to retain the original files for an appropriate period of time following the migration—anything from a few days up to a year or longer, depending on the type of migration. This is because, frequently, it is only post hoc that we discover process shortcomings or data changes that occurred during the migration. We can repeat the process and avoid data change if we retain the originals.
- We should always include backup copies in any migration plans and workflows and make sure that after the migration is completed, new backup copies are created from the migrated master files.
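Several of the rules above can be combined into a small workflow. The following Python sketch illustrates a storage media migration only, under stated assumptions: files are copied unchanged to the new location, so their checksums can be compared before and after the transfer, and the originals are left in place. For file format migration the checksum necessarily changes, which is why a random sample is drawn for manual quality control instead. The function names `file_digest`, `migrate_storage`, and `qc_sample` are hypothetical.

```python
import hashlib
import random
import shutil
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file (read fully; fine for a sketch)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def migrate_storage(source_dir, target_dir):
    """Copy every file to new storage and report files whose checksum differs.

    Originals are left untouched so the migration can be repeated if needed.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    mismatches = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        before = file_digest(src)  # fixity check before the transfer
        dst = target_dir / src.relative_to(source_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        if file_digest(dst) != before:  # fixity check after the transfer
            mismatches.append(src.relative_to(source_dir))
    return mismatches


def qc_sample(files, sample_size, seed=None):
    """Pick a random sample of migrated files for manual quality control."""
    files = list(files)
    rng = random.Random(seed)
    return rng.sample(files, min(sample_size, len(files)))
```

An empty list from `migrate_storage` means every copied file matched its original. After a successful migration, new backup copies would still need to be created from the migrated master files.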