Chinese-European workshop on Digital
Preservation
ABSTRACTS:
Andreas Aschenbrenner
--------------------------------------------------------------
Significant Properties and File Format Characteristics
--------------------------------------------------------------
The presentation will introduce the notion of "significant
properties" - the features of digital objects that need to be
preserved. Defining significant properties at the inception of a
digital preservation programme will guide an organisation in
choosing the preservation method suited to its specific
requirements. The preservation of an object's significant
properties is constrained by the characteristics inherent in its
file format. The presentation will therefore also discuss file
format characteristics and their role in a successful long-term
preservation effort.
------------------------------
File Format Registries
------------------------------
Cooperation has always been essential in the digital preservation
community, both for knowledge exchange and for collaboration in
research activities. As initiatives increasingly address
implementation, cooperation also gains practical significance:
initiatives are embarking on collaboratively building services that
are required by various preservation systems. This presentation
addresses file format registries. The preservation community
jointly calls for a registry that identifies and documents file
formats, in order to come to terms with the myriad of different
file formats. Activities towards building a file format registry
are already emerging, and some preservation initiatives already
rely on such a future service in their current approaches.
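As a rough illustration of what "identifying" a file format can mean
in practice, the following minimal sketch matches the leading bytes of
a file against a small table of well-known signatures. The table
layout and the names FORMAT_SIGNATURES and identify_format are
assumptions made for this example only; they do not describe any
existing or planned registry service.

```python
from typing import Optional

# Hypothetical in-memory "registry": format name -> leading byte signatures.
# The byte values are the published magic numbers of these formats; the
# structure of the table is an assumption made for illustration.
FORMAT_SIGNATURES = {
    "TIFF": [b"II*\x00", b"MM\x00*"],   # little-endian and big-endian TIFF
    "PNG": [b"\x89PNG\r\n\x1a\n"],
    "JPEG": [b"\xff\xd8\xff"],
    "PDF": [b"%PDF-"],
}


def identify_format(data: bytes) -> Optional[str]:
    """Return the first registered format whose signature starts the data."""
    for name, signatures in FORMAT_SIGNATURES.items():
        if any(data.startswith(signature) for signature in signatures):
            return name
    return None  # unknown to this registry
```

A real registry would also need to document each format (versions,
specifications, known risks), which a signature table alone does not
cover.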
Reinhard Altenhoner
------------------------------
Persistent Identifiers
------------------------------
Electronic publishing is generally associated with certain
attributes: "fast, cost-saving, worldwide accessible". Seen from
the author's or the user's perspective, however, are those
attributes sufficient to ensure permanent access to the electronic
publication? Everyday experience with the Internet suggests that
they are not: URLs offer no mechanism that allows Internet-based
publications to be identified unequivocally and traced at any time.
One solution is provided by Persistent Identifiers such as Uniform
Resource Names (URN), whose implementation Die Deutsche Bibliothek
is consolidating within the scope of the EPICUR project.
A purely local implementation cannot guarantee the persistence of
such an addressing scheme on its own. For example, if an
institution ceases to maintain and provide a digital collection,
its references, even when expressed in a persistent addressing
scheme, would be just as volatile as URLs. To ensure the long-term
usability of a permanent addressing scheme such as URN, it is
necessary to create an infrastructure with institutional backing.
The talk focuses on a general description of Persistent Identifier
activities worldwide and on production implementations. Based on
the EPICUR project at Die Deutsche Bibliothek, an outlook on
follow-up activities will be given.
René van Horik
------------------------------
Preservation of image formats
------------------------------
Issues covered in the presentation:
- Definition and description of image formats.
- Types of image formats.
- Overview of theories (and the assumptions on which the theories
  are based) on digital preservation relevant for preservation of
  image formats.
  o Metadata
  o File format standards
  o Role of XML
  o Registries
  o Etc.
- Overview of available digital preservation solutions relevant for
  the preservation of digital images.
  o Format registry
  o Format identification
  o Digital archiving
  o Distributed storage
  o Emulation
  o Etc.
- Importance of evolutionary approach. Digital data formats are
  relatively young. Only the future can judge which assumptions
  were right.
- 'Building blocks' for the long-term preservation of digital
  raster images
  o Based on the following assumptions:
    - Graphics file format standards are durable
    - Digital data encoded in the XML data format is durable data
    - Metadata on digital objects is essential in order to
      understand and process digital images in the future
- Building block 1: Graphics file format standards.
  o Discussion of graphic file formats used in the period
    1994 - 2004
  o TIFF format seems the best format for long-term access.
  o Discussion of TIFF file format
- Building block 2: XML data format for durable encoding of the
  bitstream of digital images.
  o How to express the bitstream of a digital image in XML?
    - Expression of content model in XML
    - Binary to XML conversion
    - XML to binary conversion (in the future)
  o Methods available to express a digital image in XML
    - Bitstream Syntax Description Language (BSDL)
    - Universal Virtual Computer (UVC)
    - Formal Language for Audio-Visual Object Representation
      (Flavor/XFlavor)
  o Comparison of three methods
- Building block 3: Preservation metadata element sets for digital
  images.
  o Methods to create and store preservation metadata on digital
    images (e.g. "automatic metadata exposure" project of RLG).
  o Some important metadata element sets for digital images:
    - NISO Z39.87 (technical metadata for digital still images)
    - EXIF (for digital images created by digital cameras)
    - SepiaDES (for digital surrogates of historical photographs)
    - Etc.
- Conclusions:
  o (Baseline) TIFF seems best format for long-term storage of
    digital master images
  o Expression of bitstream in XML: more research required
  o Preservation metadata: application profiles and registries help
    to 'discriminate exactly what we know vaguely'.
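To make the "binary to XML" and "XML to binary" steps of building
block 2 more concrete, here is a deliberately naive sketch. It is not
BSDL, UVC or Flavor/XFlavor; it simply wraps the raw bytes in a
base64-encoded XML element so that the round trip can be checked. The
element and attribute names (image, bitstream, width, height) are
hypothetical and chosen only for this example.

```python
import base64
import xml.etree.ElementTree as ET


def bitstream_to_xml(data: bytes, width: int, height: int) -> str:
    """Wrap raw image bytes in a small XML document (binary to XML)."""
    root = ET.Element("image", {"width": str(width), "height": str(height)})
    payload = ET.SubElement(root, "bitstream", {"encoding": "base64"})
    payload.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def xml_to_bitstream(document: str) -> bytes:
    """Recover the original bytes from the XML document (XML to binary)."""
    root = ET.fromstring(document)
    payload = root.find("bitstream")
    if payload is None:
        raise ValueError("no <bitstream> element found")
    # An empty bitstream is parsed back with text=None, so default to "".
    return base64.b64decode(payload.text or "")
```

A wrapper of this kind keeps the bitstream opaque. The methods listed
above go further and describe the internal structure of the
bitstream, which is where the open research questions lie.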
----------------------------------------------------------------------------------------
Case study: Preservation of image documents at the
Netherlands Institute for Scientific Information
Services (NIWI-KNAW)
----------------------------------------------------------------------------------------
Issues covered in the presentation:
- Task and mission of NIWI-KNAW
  o Archiving of scientific data (Netherlands Historical Data
    Archive for archiving data sets created by scholars in the
    Humanities)
  o Research & Development of ICT applications in Humanities
    (e.g. Historical discipline)
  o Digital data collection creation projects (historical censuses,
    GIS, visual material)
- Creation and archiving of image documents at NIWI-KNAW
  o Project oriented
  o Scope on digitisation of historical sources
  o Relation analogue original - digital surrogate
  o Benchmarking digitisation chain
  o Examples (mainly digitisation of historical photographic
    collections)
- How to guarantee long-term access to images?
  o Risk management (assessment of risks that threaten long-term
    access to digital images and the impact of the risks)
  o In some situations microfilm is best archival medium!
    - 'Film based imaging'
    - Preservation microfilming vs. preservation imaging
  o Access = preservation.
    - OAI-PMH
    - LOCKSS
  o OAIS reference model
    - Function as "checklist"
    - Used in practice to implement a data archive (of image
      documents)
--------------------------------------------------
Preservation of Scientific Data in the Humanities
--------------------------------------------------
Issues covered in the presentation:
- Scientific data archives in the Humanities and Social Sciences:
  o Social science data archives. (Survey based. Archiving methods
    developed in the 1970s).
  o Electronic text archives. (Importance of TEI for describing
    content, context and structure of electronic texts).
  o Historical data archives (both structured and unstructured
    data. Archiving routines based on social science data
    archives).
- Source oriented computing vs. problem oriented computing and its
  impact on data archiving methods.
- Importance of relation dataset - historical source
  o Public record offices. (Relatively recent interest in digital
    archiving, definition issues, legal context, adoption of
    principle of provenance in digital environment).
- International situation:
  o Social Science and Historical data archives in Europe
    (Organisation and situation in a number of countries)
  o International collaboration
    - IFDO (International Federation of Data Organizations)
    - CESSDA (Council of European Social Science Data Archives)
  o User oriented organisations
    - AHC: Association for History and Computing.
    - ACH/ALLC: Association for Computers and the Humanities /
      Association for Literary and Linguistic Computing.
- Important standards:
  o OAIS (helps to establish common vocabulary)
  o DDI (Data Documentation Initiative)
  o XML based standards, e.g. METS
- Changing research practice and influence on data archiving
  o Scholars working together in networked environment
    ("Collaboratories" / "Sharium")
  o Data archives must be active at the beginning of the data life
    cycle
  o Central vs. distributed models for storage
  o Open Access
Andreas Rauber
------------------------------------------------------------
DELOS: The EU FP6 Network of Excellence on Digital
Libraries, with a specific focus on its
Preservation Cluster activities
------------------------------------------------------------
Digital Libraries (DL) have been made possible through the
integration and use of a number of IC technologies, the
availability of digital content on a global scale and strong demand
from users who are now online. They are destined to become an
essential part of the information infrastructure in the 21st
century. The DELOS network intends to conduct a joint programme of
activities aimed at integrating and coordinating the ongoing
research activities of the major European teams working in
DL-related areas, with the goal of developing the next generation
of DL technologies. This talk will provide an overview of the seven
Research Clusters within the DELOS Network, with a specific focus
on the activities of the DELOS Preservation Cluster.
----------------------------------------------------------------------
Using Utility Analysis to Evaluate and Compare
Preservation Strategies
----------------------------------------------------------------------
Long-term preservation solutions become critical as an increasing
amount of information is digitized or created directly in digital
form and thus exists only electronically. While different
approaches, such as Emulation, Migration, or Computer Museums, are
being proposed as solutions to this challenge, none of them excels
in all circumstances. Selecting the most appropriate strategy and
tools therefore becomes a non-trivial task. In this talk we present
an adapted version of Utility Analysis, which can be applied to
select the optimal preservation solution for each individual
situation. This analysis method, which is usually applied in
infrastructure projects such as highways, airports, or city
district development, is used here to combine the wide range of
requirements that must be considered in order to select a suitable
preservation strategy. Additionally, we present a framework for
identifying and defining the criteria influencing the choice of a
particular preservation solution, such as a specific migration
tool. The evaluation metric is explained theoretically and
demonstrated via case studies performed for different application
domains.
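The core idea of combining many requirements into one comparable
value can be sketched as a weighted sum. This is a generic
illustration of utility analysis, not the adapted version presented
in the talk, whose criteria and aggregation rules are not given in
this abstract. The function names and the example criteria used
below are assumptions made for illustration.

```python
def utility(scores: dict, weights: dict) -> float:
    """Weighted average of per-criterion scores on a common scale."""
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * scores[name] for name in weights)
    return weighted / total_weight


def rank_strategies(alternatives: dict, weights: dict) -> list:
    """Return (strategy, utility) pairs, best alternative first."""
    ranked = [(name, utility(scores, weights))
              for name, scores in alternatives.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

With purely hypothetical weights such as {"authenticity": 0.5,
"cost": 0.5} and scores on a scale from 0 to 5, an alternative scoring
4 and 2 receives a utility of 3.0 and ranks above one scoring 1 and 3,
which receives 2.0. In practice the difficult part is agreeing on the
criteria, their weights and a measurement scale for each, which is
what the framework mentioned above addresses.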
Thomas
------------------------------------------------------
Preservation of Scientific Data (in Natural Sciences)
------------------------------------------------------
Preservation of "Primary Data" is of very high relevance in
science. While letter publications are made redundant by review
publications over the years, and while review articles are
remembered by many readers, primary data often cannot be
reconstructed. Primary data, such as weather data, accelerator
data, or space observation data, form the backbone of scientific
research and publication activities. They are essential for
reconstructing experiments, recalculating the final results in
scientific publications and checking their correctness. The
existence of primary data makes the difference between fiction and
science. Primary data are often open for re-use in other research
activities, e.g. measurement of the radius of the proton at CERN.
This talk will offer an overview of the relevance of primary data
in natural science, of their preservation requirements, and of how
they are preserved today.
--------------------------------------------------------------------
Preservation Planning, Institutional Strategies
and Policies
--------------------------------------------------------------------
Should we preserve everything or only a selection of the available
information? This question will be examined from different points
of view in this talk. Institutions all around the world are
developing strategies and policies for preservation. Many
developments are carried out redundantly in parallel, and some are
still open. This talk will give a brief overview of a selection of
these activities and of the solutions they have developed or the
current state of discussion. This overview is part of the project
"nestor"
http://www.langzeitarchivierung.de/index.php?newlang=eng supported
by the German Ministry for Education and Research.
Hilde van Wijngaarden
------------------------------------------------------------------------------
Different approaches to digital preservation
(Migration, Emulation, UVC, etc)
------------------------------------------------------------------------------
Digital preservation consists of three subjects: safe
storage, preservation metadata and permanent access.
First we have to make sure digital objects are stored
on secure storage media and are maintained by proper
procedures for safety, back-up and refreshment. In order
to be able to retrieve the stored objects, we have to register
information on the objects in preservation metadata
and work on the technical possibilities to render the
stored objects, now and in the future. To work on
permanent access solutions, a number of questions have
to be answered about what it is we want to view and use in the
future. Different strategies can be deployed, each with its own
advantages and disadvantages. In this presentation these strategies
will be explained and linked to their intended use and their
possibilities, including examples. Apart from existing strategies,
new procedures and especially tools will have to be tested and
developed to keep our digital archives accessible. Research and
development on permanent access requires continuous effort and
international co-operation.
-------------------------------------------------------------------------------
Case study: Preservation Strategies of the National
Library of The Netherlands
-------------------------------------------------------------------------------
As a deposit library, the National Library of the
Netherlands (KB) was already faced with having to store
digital publications more than ten years ago.
As the number of digital publications was growing, the
KB decided to make digital preservation one of its main concerns.
This resulted, among other things, in an operational
digital archive (the e-Depot) and projects to develop
preservation functionality. In this presentation two
projects will be explained in more detail: the
Preservation Manager, a tool for the monitoring of technical
metadata, and the Universal Virtual Computer for JPEG. This
UVC is a new approach for the rendering of digital
objects, without depending on current platforms or
formats. Together with IBM, we developed a first
working UVC, which will be demonstrated.
--------------------------------------------------------------------------------------------
Case study: Preservation of scientific e-journals at
the National Library of The Netherlands
--------------------------------------------------------------------------------------------
The digital archiving system of the KB, the e-Depot,
stores e-journals of major international publishers
automatically and for the long term. This amounts to a
total of over 2 million articles stored today, after only
the first year of the system's operation.
Two major publishers, Elsevier and Kluwer, deposit their
world production of e-journals at the e-Depot. And
since they publish mainly in the field of Science,
Technology and Medicine, the e-Depot now holds about
20% of everything that is recently published in this field,
world-wide. This presentation will explain what led to this
result, how the e-Depot works, what we have agreed
with the publishers and what we plan for the future.
We call the e-Depot a 'safe place', working towards
international co-operation (safe place strategy) and
towards certification as a so-called trusted depository.
Michael Day
-------------------------
Metadata for preservation
-------------------------
In recent years a range of metadata specifications and frameworks
have been developed to support digital preservation activities.
These range from formats that are intended to be specific to
certain types of resources to generic frameworks based on the
information model defined by the Reference Model for an Open
Archival Information System (OAIS). The specifications that exist
have been developed from the perspective of a variety of different
professional domains and world-views. The presentation will attempt
to define preservation metadata, introduce some of the most
important schemas and standards being developed, and outline some
of the problems that result from the differing perspectives that
inform their development.
-------------------------------------------------
The OAIS Reference Model: current implementations
-------------------------------------------------
The OAIS Reference Model (ISO 14721:2003) is an important part of
the current digital preservation landscape. Initially
developed by the Consultative Committee for Space Data
Systems, the OAIS establishes a common framework of
terms and concepts, identifies the basic functions of
an archival system, and provides an information model for managing
digital objects and information packages. The OAIS
information model has proved to be extremely
influential on the development of preservation
metadata schemas. While the OAIS Reference Model does not specify
any implementation, it has informed the implementation of
some preservation systems, including the National
Library of the Netherlands deposit system. This
presentation will provide an introduction to the OAIS
Reference Model and highlight some recent implementations that have
been informed by it.