Published February 13, 2018 | Version 1.1
Project deliverable Open

SMURF (Semantically Marked Up Record Format) Profile

  • 1. National Archives of Estonia
  • 2. DLM Forum
  • 3. University of Brighton
  • 4. National Archives of Hungary
  • 5. National Archives of Slovenia
  • 6. Danish National Archive

Description

The purpose of this report is to describe the SMURF (Semantically Marked Up Record Format) profile, which covers ERMS (electronic records management system) and SFSB (simple file-system based) records as described below. When extracting information from a producer’s system, one has a choice of two generic options:

1. Extracting data in a relational database structure

Extract data from a relational database into a long-term preservation format (SIARD) that preserves the properties of the relational database, so that the data can be imported into a relational database management system (RDBMS) at access time. Access can then happen via database queries or via a search field.

The main access use cases are:

a. The producer wishes to retrieve their data for business purposes and/or re-use.

b. The consumer wishes to consult the data for purposes of research.

c. The archivist wishes to retrieve the data for professional treatment, for example to check it and, if necessary, perform preservation actions.

More information about this option can be found in the SIARD 2.0 Profile Specification.
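The two access modes mentioned for this option (database queries and a search field) can be illustrated with a minimal sketch. It assumes the archived data has already been re-imported into an RDBMS; SQLite stands in for that system here, and the `permits` table, its columns and its rows are hypothetical examples, not part of SIARD or of any E-ARK specification.

```python
import sqlite3

# Hypothetical table standing in for data re-imported from an archived database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE permits (id INTEGER PRIMARY KEY, applicant TEXT, issued TEXT)"
)
conn.executemany(
    "INSERT INTO permits (applicant, issued) VALUES (?, ?)",
    [
        ("Tamm Ehitus", "2009-04-02"),
        ("Kovacs Kft", "2011-11-17"),
        ("Novak d.o.o.", "2011-03-08"),
    ],
)


def query_by_year(year):
    """Access via a structured database query on a known column."""
    rows = conn.execute(
        "SELECT applicant FROM permits WHERE issued LIKE ? ORDER BY applicant",
        (f"{year}-%",),
    )
    return [row[0] for row in rows]


def search_field(text):
    """Access via a single search field that matches any text column."""
    pattern = f"%{text}%"
    rows = conn.execute(
        "SELECT id FROM permits WHERE applicant LIKE ? OR issued LIKE ? ORDER BY id",
        (pattern, pattern),
    )
    return [row[0] for row in rows]
```

A producer or archivist who knows the schema would typically use something like `query_by_year(2011)`, while a researcher without schema knowledge is better served by `search_field("tamm")`.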

2. Extracting data and metadata as records

Extract the records and normalise them to a standard E-ARK XML format. This means that the records are semantically marked up using metadata. Because they are technically valid and comply with this specification, they are directly available for validation, data management, indexing and searching. Their structured semantic metadata description is explicit rather than hidden inside an RDBMS. The representation of descriptive metadata inside the archive can be in the E-ARK SMURF AIP format and/or another native archive format. The main advantages over the RDBMS representation are that:

o Records from different sources can be merged.

o Search and access are possible across all records from all sources.

o Records can be managed and accessed uniformly.

o The original database / records system software does not need to be licensed and preserved.
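The advantages listed above can be sketched in a few lines: once records from different producers share one explicit XML markup, they can be merged into a single collection and searched uniformly. The element and attribute names used below (`records`, `record`, `title`, `created`, `source`) are hypothetical placeholders chosen for illustration; they are not taken from the E-ARK SMURF schema, which should be consulted for the real structure.

```python
import xml.etree.ElementTree as ET

# Two hypothetical exports, one from an ERMS and one from a file-system source.
SOURCE_A = """<records source="erms-a">
  <record id="a-1"><title>Building permit 2011/17</title><created>2011-11-17</created></record>
  <record id="a-2"><title>Road maintenance contract</title><created>2012-01-05</created></record>
</records>"""

SOURCE_B = """<records source="fileshare-b">
  <record id="b-1"><title>Permit correspondence</title><created>2011-12-01</created></record>
</records>"""


def merge(*documents):
    """Parse each XML document and collect all records into one flat list."""
    merged = []
    for doc in documents:
        root = ET.fromstring(doc)
        for rec in root.findall("record"):
            merged.append(
                {
                    "source": root.get("source"),
                    "id": rec.get("id"),
                    "title": rec.findtext("title", default=""),
                    "created": rec.findtext("created", default=""),
                }
            )
    return merged


def search(records, term):
    """Case-insensitive title search across records from all sources."""
    term = term.lower()
    return [r["id"] for r in records if term in r["title"].lower()]


records = merge(SOURCE_A, SOURCE_B)
```

Calling `search(records, "permit")` returns matches from both sources in a single pass, and no part of the sketch depends on the software that originally produced the records.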

Files

E-ARK D3.3.pdf (967.6 kB)
md5:be825f47788645fe79746403c35bce33