Digital Archiving Lifecycle
The trouble with digital archiving is that it is not really archiving, or at least not only archiving.
Rather, digital archiving is a never-ending process of transformation of the digital content one is trying to save from oblivion, and of the system in which it is preserved.
In that sense, digital archiving is a lot like the famous line from the song “Hotel California”: “You can check out any time you like, but you can never leave.”
Digital archiving is not a process that ends at a certain point. Any content we enter into a digital archive, and any solution we apply to its storage, preservation, security, or access, is bound to change eventually: content will be transformed, data migrated, and technologies replaced.
Consider an analogy. Compare digital archiving to archiving a physical object, say a 5,000-year-old clay tablet. To preserve the tablet, we can leave it in its storage space and need only ensure that the optimal storage conditions do not change. The opposite is true of digital objects: to preserve them, we have to change the objects and their environment continuously. We have to migrate data and transform the archival system to keep formats, storage media, software, and other technologies from becoming obsolete.
Digital archiving therefore does not have an end point. Rather, it is a cyclical process in which stages follow one after the other continuously, without a final destination. Creation of a digital archive can hence be seen as only the beginning of the process—the cycle’s first iteration—which will then be repeated for as long as we wish to preserve the archive’s digital content.
To reflect this fundamentally important characteristic of digital archiving, its circular and continual character, the manual applies the “Digital Archiving Life Cycle Model.”
The Digital Archiving Life Cycle Model also highlights several other key characteristics of digital archives. It draws attention to the need to take action and actively manage a digital archive throughout its life cycle. It shows the wide scope of responsibilities involved in the digital archiving process. Finally, it makes clear that decisions and actions in each phase affect what can be done, and how, in each following stage and in any new iteration of the process.
Awareness of these dynamic relationships between the phases of digital archiving is needed to make informed decisions in each phase, so that those decisions do not limit the options for actions and solutions in subsequent stages.
It should be said that there is no single universal model to describe the digital archiving process. The models applied vary depending on an archive’s content, purpose, and users, as well as the archiving organization’s policies and practices. The Life Cycle Model of digital archiving used in this manual was tailored to the specific needs and challenges of CSOs. It reflects some elements of the OAIS Reference Model and, in part, the structure of the DCC Curation Life Cycle Model. The OAIS Model is the most widely used model for digital archiving, while the DCC Life Cycle Model includes many of the considerations that also affect CSOs engaged in digital archiving.
This manual applies a simplified Life Cycle Model that focuses on the key aspects of the process for CSOs. Presented visually, the model shows the main stages of digital archiving following one another around a circle, like the numbers on a clock, with the end point marking the beginning of a new cycle: a new, slightly different iteration of the process. See Figure 1.
Figure 1. Digital Archiving Life Cycle Model
12:00 AM
Once a strong need for creation of a digital archive has been identified and a firm organizational decision has been made to develop it, the process begins with the Planning and Organizing stage.
As a first step, we need to develop a General Plan, which will define the Guiding Principles of the archive as well as address key organizational, technological, and resource-related issues that will be encountered throughout the digital archive’s life cycle. The Guiding Principles are based on the organization’s answers to a set of core questions, such as: What needs to be preserved? Why? Who will use it? And how?
The General Plan needs to be complemented by an Identification Inventory, together with the selection, organization, and description of the material we want to preserve. This is because every further decision or action in the process relies on information about the format, amount, scope, size, topic, and other characteristics of the material collected for preservation, as well as on the ability to identify, manage, and locate groups of items or individual items.
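As an illustration only, a first pass at an Identification Inventory for material that is already digital could be sketched as below. This is a minimal sketch, not a prescribed tool: the function names `build_inventory` and `summarize_by_format` and the recorded fields are assumptions made for this example, and a real inventory would also capture topic, provenance, and other descriptive information.

```python
from collections import Counter
from pathlib import Path


def build_inventory(root: Path) -> list:
    """List every file under root with its relative path, format, and size."""
    items = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            items.append({
                "path": str(path.relative_to(root)),
                # The file extension is only a rough hint of the format.
                "format": path.suffix.lower() or "(none)",
                "size_bytes": path.stat().st_size,
            })
    return items


def summarize_by_format(items: list) -> Counter:
    """Count how many inventoried items there are of each format."""
    return Counter(item["format"] for item in items)
```

A summary like this gives an early view of the formats and volumes the future Digital Archiving System will have to accommodate.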
To round off this stage, we will need to plan, design, and select our future Digital Archiving System—a digital repository and content management system that will host our archival content. A Digital Archiving System consists of hardware and software elements that will need to be carefully selected, given that their characteristics will affect other important aspects of our digital archive.
The Planning and Organizing stage is the foundation for the creation of any archive, including a digital one. It shapes all the other stages and defines the decisions and actions to be taken in them. Different elements of the Planning and Organizing stage will need to be revisited, consulted, and reviewed at various points later in the process. Finally, at the close of a digital archive’s life cycle, the process will return to this initial phase, this time to plan and organize for a digital archive’s development and transformation through the next iteration of its life cycle.
12:15 AM
The second stage includes a group of Digitization, Description, Preparation, and Preservation actions that lead to the process of inputting our digital material into a Digital Archiving System. These actions are separate but go hand in hand, as they are interrelated and need to be well-coordinated. Digitization of any physical material needs to be done in sync with the decisions regarding how these objects will be described (i.e., which information, or metadata, about them needs to be captured in the digitization process)—much like born-digital material, whose metadata needs to be selected as well.
This is followed by a number of actions aimed at proper preservation of the archive’s content by maintaining its integrity and credibility (i.e., ensuring that the objects are not compromised and that any changes made to them are recorded).
The material, both digitized and born-digital, is then ingested into the Digital Archiving System and onto the storage media. In this process, the content and its descriptions—its metadata—are captured and stored in the Digital Archiving System. Additional checkups are then performed and backup copies are created and stored separately.
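The ingest step described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function name `ingest`, the JSON sidecar file, and the two-folder layout (`storage` and `backup`) are hypothetical choices for this example, and an actual Digital Archiving System will have its own ingest procedure.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest(source: Path, storage: Path, backup: Path, metadata: dict) -> dict:
    """Copy one item into storage and backup, each with a metadata record."""
    checksum = sha256_of(source)
    record = dict(metadata, filename=source.name, sha256=checksum)
    for target_dir in (storage, backup):
        target_dir.mkdir(parents=True, exist_ok=True)
        copied = Path(shutil.copy2(source, target_dir / source.name))
        # Additional check: each copy must match the checksum taken before ingest.
        if sha256_of(copied) != checksum:
            raise IOError(f"Checksum mismatch after copying to {target_dir}")
        # Hypothetical convention: metadata is kept next to the item as JSON.
        sidecar = target_dir / (source.name + ".json")
        sidecar.write_text(json.dumps(record, indent=2))
    return record
```

Recording the checksum at ingest is what later makes it possible to show that an object has not been altered.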
12:30 AM
Providing Access and Data Security is the main task in the third stage of digital archiving. These two functions are separate but interrelated, and they need to be kept in balance: the aim is to provide the widest possible access while maintaining data safety and protection. This covers both the protection of private, sensitive, or copyrighted data and measures that keep data and storage systems safe from physical harm and cyber-threats.
Providing wider access, for example by making a digital archive accessible through open databases or online platforms, will pose an additional set of data security issues compared with providing access to a closed group of users. Similarly, different items in the digital archive may require varying levels of protection and controlled access. Therefore, appropriate levels of access need to be defined for different groups of users in relation to different parts of the archive.
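One simple way to picture such access levels is a lookup that pairs each part of the archive with the minimum level a user needs. The sketch below is purely illustrative: the level names and collection names are invented for this example, and each organization will define its own groups and rules.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical user groups, ordered from least to most privileged."""
    PUBLIC = 0
    RESEARCHER = 1
    STAFF = 2


# Hypothetical mapping: minimum level required for each part of the archive.
REQUIRED_LEVEL = {
    "public_reports": AccessLevel.PUBLIC,
    "interview_transcripts": AccessLevel.RESEARCHER,
    "witness_identities": AccessLevel.STAFF,
}


def can_access(user_level: AccessLevel, collection: str) -> bool:
    """Return True if a user at user_level may view the given collection."""
    # Collections without a rule default to the most restrictive level.
    required = REQUIRED_LEVEL.get(collection, AccessLevel.STAFF)
    return user_level >= required
```

Defaulting unknown collections to the most restrictive level is a deliberate choice here: new material stays protected until someone decides how open it should be.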
12:45 AM
Maintenance Through Preservation and Migrations is the action that dominates the fourth stage of digital archiving. Once the digital archive has been designed, set up, and populated—and its data preserved, secured, and made accessible—all these functions need to be maintained and monitored. The content and the system need to be managed and eventually migrated and transformed.
Regular maintenance checks need to be performed on the data (to ensure its continued integrity and credibility, as well as format usability), the system (to provide for continued security and open access), and the hardware and software technologies (to ensure that they function properly and to act in good time when they need to be migrated or transformed before they become obsolete).
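A routine check on the data could look something like the sketch below. It assumes, for illustration only, that a checksum was recorded for every file when it entered the archive (the `manifest` argument) and that the organization keeps its own watch list of formats it has decided to migrate; `FORMATS_TO_MIGRATE` and its entries are hypothetical.

```python
import hashlib
from pathlib import Path

# Hypothetical watch list: extensions the organization has chosen to migrate.
FORMATS_TO_MIGRATE = {".doc", ".wpd", ".rm"}


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def maintenance_report(root: Path, manifest: dict) -> dict:
    """Return one status per file: missing, changed, migrate, or ok."""
    report = {}
    for name, recorded_checksum in manifest.items():
        path = root / name
        if not path.is_file():
            report[name] = "missing"
        elif sha256_of(path) != recorded_checksum:
            # The content no longer matches what was recorded at ingest.
            report[name] = "changed"
        elif path.suffix.lower() in FORMATS_TO_MIGRATE:
            report[name] = "migrate"
        else:
            report[name] = "ok"
    return report
```

Files reported as missing or changed would be restored from a backup copy, while those marked for migration feed into planning for the next iteration of the cycle.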
01:00 AM
At this point, another iteration of the digital archiving process begins.