Maintenance: Preservation, Development and Migrations

Revision as of 20:55, 17 December 2023 by DEV

Intro

We have reached the final stage of the first cycle of our digital archive’s life. We have planned out and created our digital archive, preserved and secured our invaluable archival content, and made it accessible to the community.

However, we cannot just sit back now and do nothing. The wheels of a digital archive need a lot of continuous oiling to keep preserving our content and making it accessible. This is why in digital archiving in particular, the notion and approach of active maintenance is essential; it is an integral part of long-term preservation.

Here is how the United Nations views the matter:

Digital Preservation is the active management and maintenance of digital objects so they can be accessed and used by future users.

The goal of digital preservation is the accurate rendering of authenticated content over time to ensure its authenticity, accessibility and usability.

Active maintenance is not only needed to keep our digital archive running; it is also the key to long-term preservation, authenticity, and access to digital content. If the format of our files becomes obsolete, our storage media fails, or our backup software is outdated and flawed, our invaluable digital content may become compromised, damaged, or lost altogether, along with all the work we have put into building and developing the digital archive.

Given that active maintenance is not an afterthought but the very core of digital archiving, it requires a systematic approach and the regular performance of a set of actions that include monitoring and migration. As with access and security, it is good practice to create a Maintenance Plan centered on these two main sets of maintenance actions, which lists, describes, and schedules the monitoring and migration activities. The specific elements of the Maintenance Plan, such as time periods for regular checkups or concrete procedures, will be dictated by the circumstances of a given archive. We can, however, describe the key elements and actions that need to be included.

Active maintenance: Monitoring

To properly maintain our digital archive, we need to monitor its functions and elements and make the necessary adjustments. This primarily includes monitoring, checkup, and preservation actions on data, software, and storage media. In addition, we need to regularly observe, revise, and update our data security and access plans and their implementation.


Data Monitoring and Preservation Actions

Monitoring and preservation actions we need to conduct on our data in the maintenance phase are in essence a continuation of the work we performed as part of the preparation of material for ingest—from plain backup of data to checkups of file format, validity, fixity, and quality assurance.

At this stage, we need to plan and schedule regular, periodic performance of these preservation actions, check for any irregularities, and then follow up to amend them. We also need to plan for these functions as a necessary step in any major archival data-related activity, such as data migration or software replacement.

Antivirus checkups. As always, ensuring that data is virus-free is an essential precondition for any further actions on data. In addition to antivirus measures in place for the entire Digital Archiving System, we also need to be mindful and perform antivirus checks on our digital content whenever it has been exposed to a networked environment or other virus-related threats.

Backup. For the maintenance phase, it is important that our backup copies are also monitored and replaced when appropriate—mirroring any actions on our archival master files. Hence, backup files should be subjected to the same type of scheduled checkups as are our master files. Alternatively, we could make new backup copies from master files following their regular checkups. Additionally, if there are any changes to the archival master files, their backup files will also need to be replaced.

It is good practice to extend our backup considerations in the maintenance phase whenever possible, including by developing a so-called “Disaster Recovery Plan.” This is a plan for how our data will be recovered or replaced in case of any major natural or human-caused failure, damage, theft, or malicious attack on our digital archival content or system. The plan will be based on our existing backup arrangements, which define the number of backup copies, their geographic location, and the type of storage media used, as described earlier. The disaster recovery plan should provide instructions on which of our backup copies should be used, in which type of disaster, and by which technological means we will replace and recover our data.

Format and obsolescence. In the pre-ingest phase, we made sure that all our files are in formats that are operational and can be opened and properly displayed by currently available software. Similarly, in the maintenance phase, we need to schedule regular audits of our file formats to ensure that they are not in danger of becoming obsolete. If we find that a format we are using is becoming obsolete, or that support for it will be discontinued, we need to act, which usually means migrating files to a newer or more suitable format. For both file format monitoring and migration we can use specialized software tools, some of which we already mentioned.
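A scheduled format audit can start from a simple inventory. The following is a minimal sketch, assuming that file extensions are an acceptable first approximation of format (dedicated identification tools inspect file contents and are more reliable) and that the archive maintains its own watch list of formats it considers at risk. The function names and the ".xyz" extension are hypothetical and purely illustrative.

```python
from collections import Counter
from pathlib import Path


def format_inventory(root: Path) -> Counter:
    """Count files under root by lower-cased file extension."""
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())


def flag_watched_formats(inventory: Counter, watch_list: set[str]) -> dict[str, int]:
    """Return only the extensions that appear on the archive's own watch list.

    The watch list is an assumption of this sketch: the archive decides which
    formats it considers at risk and keeps the list up to date.
    """
    return {ext: count for ext, count in inventory.items() if ext in watch_list}


if __name__ == "__main__":
    inventory = format_inventory(Path("."))
    # ".xyz" is a made-up extension standing in for a format judged at risk.
    print(flag_watched_formats(inventory, {".xyz"}))
```

Running such an inventory at each scheduled audit and comparing the results over time shows which formats dominate the collection and where a format migration would have the largest effect.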

Fixity, Validity and Quality Assurance. Even preserved digital files can change over time, which can then affect their format and/or quality. Hence, similar to monitoring file formats, we need to also plan and schedule regular checkups of our files’ fixity, validity, and quality. Equally, we should plan to include these preservation actions as part of any major data-related actions, such as data migration.
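As an illustration of a regular fixity checkup, here is a minimal sketch. It assumes that checksums were recorded at ingest as a simple mapping from relative file path to SHA-256 digest; this manifest layout and the function names are assumptions made for the example, not the format of any particular tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Compare files under root with the digests recorded in the manifest.

    Returns a mapping of relative path to "ok", "changed" or "missing".
    """
    report = {}
    for relative_path, recorded_digest in manifest.items():
        target = root / relative_path
        if not target.is_file():
            report[relative_path] = "missing"
        elif sha256_of(target) == recorded_digest:
            report[relative_path] = "ok"
        else:
            report[relative_path] = "changed"
    return report
```

Any entry reported as "changed" or "missing" is an irregularity that needs follow-up. A fixity check of this kind only shows that the bytes are unchanged; validity and quality still need their own checks.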

What differs is how we will approach irregularities we might detect or any changes we find to our data. In case of a detected change in a file’s format, quality, or fixity, we can follow a three-step rule of thumb: repair, restore, and document.

This means that in the first instance we can try to repair the file using dedicated software tools for the given file format. If repair is not an option, we should restore the file from one of our backups. In case we do not have a backup or it is not usable, we can decide to preserve the changed original file. Regardless of what we decide, in the end, we need to document our action and detail the decision that was made and why, to be preserved as part of metadata along with the file.
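The repair, restore, and document rule of thumb can be sketched as a small decision helper. This is only an illustration under stated assumptions: whether a repair is possible and whether a backup is usable are judged elsewhere (by format-specific tools and by checking the backup copies), and the record fields shown here are hypothetical rather than taken from a metadata standard.

```python
from dataclasses import asdict, dataclass
from datetime import date
from typing import Optional


@dataclass
class PreservationEvent:
    """Hypothetical record to be kept as metadata alongside the file."""
    file_id: str
    action: str       # "repaired", "restored" or "kept_changed_original"
    reason: str
    recorded_on: str  # ISO date, e.g. "2024-01-05"


def decide_action(repair_possible: bool, backup_usable: bool) -> str:
    """Apply the rule of thumb: try repair first, then restore from backup."""
    if repair_possible:
        return "repaired"
    if backup_usable:
        return "restored"
    # No repair and no usable backup: preserve the changed original.
    return "kept_changed_original"


def document_event(file_id: str, repair_possible: bool, backup_usable: bool,
                   reason: str, today: Optional[date] = None) -> dict:
    """Decide on an action and return a record documenting it."""
    event = PreservationEvent(
        file_id=file_id,
        action=decide_action(repair_possible, backup_usable),
        reason=reason,
        recorded_on=(today or date.today()).isoformat(),
    )
    return asdict(event)
```

Whatever the outcome, a record is always produced, which reflects the point that documentation is required regardless of the decision taken.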

Monitoring Software

The software we apply in our Digital Archiving System—be it open source or commercial, an all-in-one solution or a combination of specialized tools—also needs to be regularly monitored so that it continues to meet our requirements and avoids becoming obsolete.

An archive’s requirements, as mentioned earlier, are not “set in stone.” As they are expected to change over time, we need our software to support those changes, which is why it is so important for the software we use to have strong support. We can then rely on this support, in the form of a community of software users and developers or a commercial service, to provide upgrades or additions for any new or revised requirements.

Monitoring and improving our software will ensure that it continues to meet our requirements, even when those requirements change. However, if our monitoring shows that a specific piece of software can no longer be adapted, or that it is losing its support community, we need to find an appropriate new software solution and migrate to it.


Resource alert! Software Tool Registers


Although digital archiving would not be possible without software tools, and their quick pace of proliferation has been very beneficial, the sheer number and scope of possible and offered solutions can create difficulties finding and selecting the right option. A number of digital archiving software registries have been created, which provide lists and descriptions of different tools. A good starting point is the COPTR registry, which has the advantage of drawing information from a variety of sources and thus provides a good overview.



Therefore, an important element of monitoring our software as part of maintenance is to follow the new developments and services provided through upgrades and novel solutions, and to have access to a community of software users and developers. This is especially the case for civil society-run human rights archives, as many of them struggle with expertise, resources, and capacity needed for development and maintenance of the technological element of a digital archive. There are inspiring examples that show how such synergistic partnerships can be built, and new out-of-the-box solutions can be applied to shared technology-related problems.

Monitoring Storage Media

Monitoring our storage media is necessary to detect any errors or damage in a timely manner, as well as to prevent the media from becoming obsolete or outdated.

Over time, storage media can become unstable and unreliable, and cause data corruption or loss. A rule of thumb for a safe preservation practice is for storage media to be given only a short lifetime, sometimes estimated at only three to five years. This means that after this period we will need to find and obtain new storage media and migrate our data to it. This migration is somewhat less demanding than the file format migration but still requires all data preservation actions to be performed as part of the process.
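To make the lifetime rule of thumb actionable, the Maintenance Plan can include a simple register of storage media and the date each was put into service. The sketch below assumes such a register exists as a mapping from a media label to an in-service date; the labels, the function name, and the default threshold of three years (the lower end of the three-to-five-year estimate) are illustrative assumptions.

```python
from datetime import date


def media_due_for_replacement(media: dict[str, date], today: date,
                              max_age_years: float = 3.0) -> list[str]:
    """Return labels of media whose age has reached the chosen lifetime.

    Age is approximated in years as elapsed days divided by 365.25.
    """
    due = []
    for label, in_service_since in media.items():
        age_years = (today - in_service_since).days / 365.25
        if age_years >= max_age_years:
            due.append(label)
    return sorted(due)


if __name__ == "__main__":
    register = {"disk-A": date(2020, 1, 1), "disk-B": date(2023, 6, 1)}
    print(media_due_for_replacement(register, date(2024, 1, 1)))
```

The threshold is a planning parameter, not a guarantee: media can fail earlier, which is why the regular checkups described above remain necessary throughout the media's lifetime.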

It is good practice to expect and plan for failures, both human- and technology-caused, to happen to our storage media over time, even in the best of circumstances and regardless of how good the technology is. This is why developing a clear Disaster Recovery Plan is so beneficial for a digital archive’s maintenance.

The best strategy, however, remains to create a strong, resilient backup system with multiple independent copies, stored in different locations and using different technologies (whenever possible). Coupled with regular performance of all data preservation actions, this minimizes and spreads risk, ensuring we never have to rely on a single piece of technology to preserve our invaluable data.

Monitoring Access and Data Security

Implementation of our Access and Data Security plans also needs to be regularly audited to ensure they remain functional and meet the requirements. When requirements change or shortcomings are identified, the plans and related practices should be revised.

In terms of access, monitoring includes following the statistics of our data use and users. Such data should be provided by the access software solution we use, and can help us better tailor, organize, and deliver our access services.

With regard to security, we must rigorously monitor the planned and implemented arrangements to ensure timely identification of any weak spots or shortcomings that might put our data under threat. Data on access and the use of the archival material can also be useful for reviewing and revising our Data Security Plan.

We should also always be on the lookout—when possible, through a community of users or other CSOs—to advance our access services and security arrangements by applying novel technologies that can sometimes substantially improve both the user experience and the safety of our data.

Active maintenance: Migration

Migration is one of the concepts that are uniquely important for digital archiving. In a sense, it represents the very essence of it: the constant change, adaptation, and solution finding required to keep our digital content alive, preserve it, and make it accessible in the future.

Ominously, this comes at the very end of a digital archive’s life cycle, signaling the end of one of its iterations and the beginning of another.

Migration of Data, Software, and Storage Media

In the previous section, we introduced different ways in which migration is an essential activity in the maintenance of a digital archive, as it allows us to preserve our data regardless of its format, the software we use, or the storage media.

Any decision to migrate data, software, or storage media should be planned and scheduled to the greatest extent possible. It should also not be made lightly. New, groundbreaking software or hardware solutions might be tempting as a great way to improve our archive’s services, but we should be wary of untested solutions and bear in mind that no migration is a simple process: it requires time and resources, and it inevitably brings its share of risks. Even in a simple transfer, and more so in a complex format or software migration, data can be changed, damaged, or lost. However, if we hesitate for too long, the software might become obsolete and make the task of migration much more difficult.

Therefore, migration needs to be performed in a timely manner and in a systematic and carefully planned way, following certain basic rules and best practices.

●       Whether we are migrating data, software, or storage media, we should always include a set of preservation actions—including checkups of fixity, validity, and quality assurance—as a mandatory step both before and after the actual transfer of files.

●       For file format migration, it is important to always retain the original file along with the new migrated file, as the migrated file might have lost some of the properties of the original file, which is not always detectable by the software. In such a case, we will have to decide what data we consider acceptable to lose (if any).

●       To reduce the risk of changes to files during file format migration going unnoticed by our software, as part of a migration process, it would be a good idea to plan for a quality control test. This would include manually opening and checking a reasonably sized sample of migrated files based on a set of acceptance criteria we develop—for example, in terms of its formatting, look and feel, and functionality.

●       For software and storage media migration, it is also a good practice to retain the original files for an appropriate period of time following the migration—anything from a few days up to a year or longer, depending on the type of migration. This is because, frequently, it is only post hoc that we discover process shortcomings or data changes that occurred during the migration. We can repeat the process and avoid data change if we retain the originals.

●       We should always include backup copies in any migration plans and workflows and make sure that after the migration is completed, new backup copies are created from the migrated master files.
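Several of the rules above can be sketched for the simplest case, a transfer of files to new storage media. The example below is a minimal illustration under stated assumptions: it copies files without changing their format, so a checksum taken before the transfer must equal the checksum taken after it (in a file format migration the digests differ by design, and validity and quality checks take over). It leaves the originals in place and draws a sample of files for manual quality control. All function names are hypothetical.

```python
import hashlib
import random
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file; reads the whole file at once, fine for a sketch."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def migrate_with_verification(source: Path, destination: Path) -> list[str]:
    """Copy every file from source to destination, keeping the originals.

    Returns the relative paths whose digest after the transfer differs from
    the digest taken before it. The destination is assumed to lie outside
    the source directory.
    """
    mismatches = []
    for original in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = original.relative_to(source)
        before = file_digest(original)
        target = destination / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(original, target)
        if file_digest(target) != before:
            mismatches.append(relative.as_posix())
    return mismatches


def pick_qc_sample(paths: list[str], sample_size: int, seed: int = 0) -> list[str]:
    """Pick a reproducible random sample of files for manual checking."""
    rng = random.Random(seed)
    return sorted(rng.sample(paths, min(sample_size, len(paths))))
```

An empty list of mismatches is a necessary but not a sufficient sign of success. The manual check of the sample against our acceptance criteria, the retention period for the originals, and the creation of new backup copies from the migrated master files remain separate steps in the workflow.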


A new collection of material has been gathered for archiving and long-term preservation, raising a set of new requirements. We will need to migrate our content to a new Digital Archiving System. In other words, it is time to start the process anew.