Archivo General de Indias, Spain
One best-practice example of implementing a large-scale migration strategy is the Archivo
General de Indias in Spain. Originally conceived as a pilot project for large-scale digitisation to
test the feasibility of digitising the holdings of all Spanish archives, the Archivo's digital system was set up
in 1992 on proprietary technology, without consideration of long-distance, integrated access.
The advent of the Internet brought a complete technological change, and forced the
Archivo to migrate 45,398 bundles of digitised documents to a new platform. Although the
computerisation of the Archivo General de Indias is well documented (see:
<>), the documentation on
migrating the data to a new system is, unfortunately, not publicly available. Yet it is exactly this
kind of information that managers of memory institutions need most in order to make
informed decisions about archiving and preservation strategies.
Instability of content
Once analogue materials are digitised, formerly finite cultural objects stored on, or in the form of, various
"media" such as books, records, paintings, sculptures and manuscripts suddenly become very
flexible and non-finite. This means that they can easily be altered, manipulated, copied,
stored and accessed, which may result in a multiplicity of versions of a particular document.
As much as the flexibility of digital information is to the advantage of creators and users, it
greatly complicates matters for archives, libraries and museums, which are mainly concerned
with collecting and preserving cultural objects of long-term value that are final or in some
sense definite. The ease with which versions of digital documents can be produced that differ in
functionality and interactivity, in their relationships to other
documents, or in their 'look and feel' creates a severe problem for cultural heritage
institutions concerned with establishing and guaranteeing the integrity of digital cultural
information objects. Similarly, given the flood of unidentified resources on the
Web, the ability of users to distinguish an "authoritative" digital representation from other
resources is of great importance, especially to the scholarly community.
The ability of cultural heritage institutions to maintain the integrity, authenticity and
validity of digital cultural material is one of the greatest assets and selling points that
distinguish memory institutions from other content providers on the Web, but it is also one of the
greatest challenges that cultural heritage institutions concerned with long-term preservation face
today.
"Whatever preservation method is applied, ..., the central goal must be to preserve
information integrity; that is, to define and preserve those features of an information object
that distinguish it as a whole and singular work. In the digital environment, the features that
determine information integrity (...) include the following: content, fixity, reference,
provenance, and context." (Task Force on Archiving of Digital Information, 1996: p. 12)
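Of the features listed in the quotation, fixity is the one most readily checked by machine. As a minimal sketch, assuming an institution records a cryptographic digest when an object is ingested, the following Python compares later copies against that recorded value. The function names `fixity_digest` and `is_unaltered`, the choice of SHA-256, and the sample byte strings are illustrative assumptions, not taken from the Task Force report or from any particular archive's practice.

```python
import hashlib


def fixity_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a bit stream as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_digest: str) -> bool:
    """Compare a copy's current digest with the digest recorded at ingest."""
    return fixity_digest(data) == recorded_digest


# Hypothetical sample data: a master copy and two later copies.
master = b"Archivo General de Indias, bundle 1"
recorded = fixity_digest(master)  # digest stored alongside the object at ingest

exact_copy = bytes(master)                                # bit-identical copy
altered_copy = master.replace(b"bundle 1", b"bundle 2")   # silently edited copy

print(is_unaltered(exact_copy, recorded))    # True
print(is_unaltered(altered_copy, recorded))  # False
```

A check of this kind only establishes that the bit stream is unchanged. It says nothing about whether the object still renders with the same layout or 'look and feel', nor about its reference, provenance or context, which have to be documented by other means.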
Content integrity
Content integrity refers to the requirement to preserve as much of the original content
as possible. Yet content exists at various levels of abstraction, which raises the question of the
level at which it should be preserved. Is it the bit stream level, where a digital object consists only of
0s and 1s? Is it the format and structure level, where the layout, design, resolution and
accuracy of colour presentation also matter? Or does one try to capture content at the
highest level of abstraction, where content is defined in terms of knowledge and