The Importance of Metadata in Content Management

Understanding the importance of Metadata is vital for anyone seeking to design an effective Content Management system. To support more effective Content Management, a variety of models and standards have been established. In this post, we will outline some of the common Metadata models designed to improve resource discovery for users, with a focus on the Dublin Core Metadata Initiative (DCMI).

What is Metadata?

“Structured or semi-structured information that enables the creation, registration, classification, access, preservation and disposition of records through time and within and across domains”

 

Metadata is the data about data that makes everything work smoothly. It refers to structured data describing anything that can be named. Examples include the publishing date of a web page, the author of a book, or the details of an image or a song. Metadata tags are used to aid resource discovery, improve resource organization, and make data and resources easier to exchange.

Metadata provides contextual and descriptive details about resources that are not necessarily self-describing. Without a usable Metadata model, users of a given system will not be able to search for or access information efficiently. Indeed, poor Content Management as a result of poor Metadata practices is little better than shoving your data into a filing cabinet. A strong content Metadata model will lead to enhanced Content Management.

What is Content Management?

Content Management refers to the organization of content, such as text, graphics, and multimedia, along with an effective tagging system (the Metadata). It can cover the management of things like a website’s digital assets (e.g. WCMS, DAM), enterprise documentation (e.g. ECM), and graphics and multimedia (e.g. DAM), or it can be a component-based system (e.g. CCMS).

A good Content Management system will store the data it structures as efficiently as possible in a single repository. It will also allow content to be edited, reused by different publications on the system, and repurposed as and when required. But what is the relationship between Metadata and Content Management?

The Importance of Metadata in Content Management

For Metadata to be effective for any Content Management system, it needs to provide at least the following four functions:

1. Searching

Users need to be able to search for key data. As such, content must be categorized and described well. This will allow users to search for files and content in a range of ways, be it by author, date of publication, title, or specific keywords.

2. Distribution

Different applications may require values linked to content in order to regulate where and when that content is distributed or shared. Distribution Metadata also records the journey content has made, for example as a log of the actions taken against an object.

3. Accessibility

Metadata models need to consider the security of managed objects. Through filtering at the distribution stage based on matching Metadata values, appropriate content can be accessed when required. Targeted content can then be securely delivered based on business rules across domains.

4. Retention

Metadata is usually used by Records Management applications to implement an organization's retention rules. When Metadata is insufficient, this can cause issues with regard to retention rules, how records are preserved, and in what format.
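The four functions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Record class, its field names (audience, retention_days, history), and the sample values are hypothetical and not part of any particular Content Management product.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Record:
    """A managed object plus the metadata describing it (hypothetical fields)."""
    title: str
    author: str
    published: date
    keywords: set[str]
    audience: str              # who may see it, e.g. "internal" or "public"
    retention_days: int        # how long the record must be kept
    history: list[str] = field(default_factory=list)  # distribution log


def search(records, **criteria):
    """Searching: return records whose metadata equals every given criterion."""
    return [r for r in records
            if all(getattr(r, name) == value for name, value in criteria.items())]


def distribute(record, channel):
    """Distribution: log each action taken against the object."""
    record.history.append(f"sent to {channel}")


def visible_to(records, audience):
    """Accessibility: filter on a matching metadata value before delivery."""
    return [r for r in records if r.audience == audience]


def expired(records, today):
    """Retention: find records whose retention period has elapsed."""
    return [r for r in records
            if r.published + timedelta(days=r.retention_days) < today]


# Invented sample data for illustration only.
records = [
    Record("Style Guide", "A. Writer", date(2020, 1, 15), {"style"}, "internal", 365),
    Record("Press Kit", "B. Editor", date(2024, 6, 1), {"press"}, "public", 3650),
]
```

Calling search(records, author="A. Writer") returns only the style guide, and expired(records, date(2025, 1, 1)) flags it for disposal. None of these functions could work if the underlying Metadata were missing or inconsistent.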

Why are Metadata Standards Important?

Metadata standards are vital when it comes to promoting technical interoperability. They facilitate information exchange across systems and organizations. Without Metadata standards, it is difficult to aggregate, manipulate, and harness data across diverse systems. Following appropriate Metadata standards also improves system security and record protection by helping to prevent inappropriate access and malicious intrusions.
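To make the interoperability point concrete, here is a minimal sketch of a crosswalk that renames the fields of two systems to shared Dublin Core element names (title, creator, date). The system names "cms" and "archive", their field names, and the sample records are all assumptions made up for this example.

```python
# Hypothetical field names used by two systems, mapped to Dublin Core elements.
CROSSWALKS = {
    "cms":     {"headline": "title", "byline": "creator", "published_on": "date"},
    "archive": {"doc_title": "title", "author": "creator", "created": "date"},
}


def to_dublin_core(record, system):
    """Rename a record's fields to shared Dublin Core element names.

    Fields without a mapping are dropped, so only shared elements remain.
    """
    mapping = CROSSWALKS[system]
    return {mapping[name]: value for name, value in record.items() if name in mapping}


# Invented sample records describing the same document in two systems.
cms_record = {"headline": "Q3 Report", "byline": "J. Doe", "published_on": "2023-10-01"}
archive_record = {"doc_title": "Q3 Report", "author": "J. Doe",
                  "created": "2023-10-01", "shelf": "B-12"}

merged = [to_dublin_core(cms_record, "cms"), to_dublin_core(archive_record, "archive")]
```

Once both records use the same element names, they can be aggregated and compared directly, which is hard to do when every system keeps its own private vocabulary.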

In addition, organizations working with certain external parties may find that they require certain Metadata standards to be met. For example, the US Department of Justice often requires organizations it deals with to adopt ISO/IEC 11179 Metadata standards. The table below outlines some of the key standards available or required by a variety of industries:

Industry or Discipline        Available/Required Standards
Archiving & Social Science    Data Documentation Initiative (DDI)
                              Encoded Archival Description (EAD)
                              Text Encoding Initiative (TEI)
The Arts                      Categories for the Description of Works of Art (CDWA)
                              Visual Resources Association (VRA Core)
Biology & Ecology             Darwin Core (DwC)
                              Ecological Metadata Language (EML)
Cataloging                    Data Catalog Vocabulary (DCAT)
Education                     Learning Object Metadata (IEEE LOM)
Enterprise Data               Common Warehouse Metamodel (CWM)
Governmental Organizations    E-Government Metadata Standard (e-GMS)
                              Global Information Locator Service (GILS)
                              ISO/IEC 11179
Imaging                       NISO MIX Z39.87
Music                         Music Encoding Initiative (MEI)
Networking                    Dublin Core Metadata Initiative (DCMI)
                              Digital Object Identifier (DOI)
Publishing                    Online Information Exchange (ONIX)
Records Management            ISO 23081

Corporate Content Management: Understanding the Dublin Core Metadata Initiative

The Dublin Core, maintained by the DCMI, is one of the best-known Metadata standards and is well established in corporate Content Management systems. The model outlines simple guidelines and best practices for Metadata design and implementation, and it uses a minimal set of vocabulary terms to describe data. To meet the standards of the Dublin Core model, Metadata should:

1. Meet functional specifications:

Metadata must be designed to meet search and security requirements from applications connecting to the repository. Specific tasks for different applications also need to be achievable using the underlying Metadata.

2. Be represented by a domain model:

A domain model describes both the data itself and its relationship or behavior with other data. For example, a file would have a content type, a responsible party, and classification possibilities.

3. Define its own terminology:

The properties used to describe a given model must adhere to the set vocabulary of the Dublin Core. However, the model should also declare the Metadata terms that define the policies and best practices suitable for the application. For example, a file may be described by its title, topic, creation date, classification type, and classification status.
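As a rough sketch of the third point, the snippet below describes a file using Dublin Core elements where they exist and one application-defined term where they do not. It uses Python's standard xml.etree.ElementTree module and the Dublin Core element namespace. The "app" namespace URL, the classificationStatus term, the describe function, and the sample values are hypothetical, and the mapping of "topic" to dc:subject is one reasonable choice rather than a required one.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set namespace
APP = "http://example.com/terms/"          # hypothetical application namespace

ET.register_namespace("dc", DC)
ET.register_namespace("app", APP)


def describe(title, subject, created, status):
    """Build a record from Dublin Core terms plus one application-defined term."""
    root = ET.Element("record")
    ET.SubElement(root, f"{{{DC}}}title").text = title
    ET.SubElement(root, f"{{{DC}}}subject").text = subject
    ET.SubElement(root, f"{{{DC}}}date").text = created
    ET.SubElement(root, f"{{{DC}}}format").text = "application/pdf"
    # Classification status is not a Dublin Core element, so it is declared locally.
    ET.SubElement(root, f"{{{APP}}}classificationStatus").text = status
    return root


# Invented sample values for illustration only.
record = describe("Travel Policy", "Expenses", "2024-03-01", "Draft")
xml_text = ET.tostring(record, encoding="unicode")
```

Keeping the shared terms in the dc namespace and the local ones in a separate namespace makes it clear to other systems which fields follow the standard and which are specific to this application.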

Example of a Metadata Model for a Managed PDF Repository

When designing Metadata models, keep the number of fields for a specific type of content to a minimum so that all important fields are always kept up to date. Where possible, avoid free-form fields and use predefined lists instead, as they help to reduce input errors and loss of control. Nevertheless, cross-organizational as well as department-specific fields need to be allowed for.

An example of a Metadata model built on the Dublin Core methods is outlined below:

Functional Specifications

  • Users should be able to search for PDFs.
  • Searches should be possible using standard attributes of PDF files as well as status classifications.
  • Relevant Metadata values should be provided as files are archived so that the appropriate files can be found.

Domain Model

Managed Object: PDFs

  • Content-Type – PDF
  • Responsible Party – Author
  • Classifications:
    • Type
    • Status

Metadata Terminology

  • Content-Type – PDF (Auto-selected)
  • Creation Date – (Date Field)
  • Title – (free-form field)
  • Topic – (free-form field)
  • Classification Type – (predefined list)
    • Value 1 – Internal Document
    • Value 2 – External Document
    • Value 3 – Legal Documentation
  • Classification Status – (predefined list)
    • Value 1 – Draft
    • Value 2 – Pre-Approval
    • Value 3 – Approved
    • Value 4 – Archived
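One possible way to express this model in code is sketched below, using Python dataclasses and enums. The class and function names (ManagedPdf, find) and the sample record are assumptions for illustration. The author field comes from the "Responsible Party" entry in the domain model. The predefined lists become enums, so a value outside the list is rejected at input time.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class ClassificationType(Enum):
    INTERNAL = "Internal Document"
    EXTERNAL = "External Document"
    LEGAL = "Legal Documentation"


class ClassificationStatus(Enum):
    DRAFT = "Draft"
    PRE_APPROVAL = "Pre-Approval"
    APPROVED = "Approved"
    ARCHIVED = "Archived"


@dataclass
class ManagedPdf:
    title: str                                   # free-form field
    topic: str                                   # free-form field
    author: str                                  # responsible party
    creation_date: date                          # date field
    classification_type: ClassificationType      # predefined list
    classification_status: ClassificationStatus  # predefined list
    content_type: str = "PDF"                    # auto-selected

    def __post_init__(self):
        # Coerce plain strings to enum members; unknown values raise ValueError.
        self.classification_type = ClassificationType(self.classification_type)
        self.classification_status = ClassificationStatus(self.classification_status)


def find(pdfs, status):
    """Search the repository by classification status."""
    return [p for p in pdfs if p.classification_status is ClassificationStatus(status)]


# Invented sample record for illustration only.
repo = [
    ManagedPdf("NDA Template", "Contracts", "Legal Team", date(2024, 5, 2),
               "Legal Documentation", "Approved"),
]
approved = find(repo, "Approved")
```

Here find(repo, "Approved") returns the NDA template, while creating a ManagedPdf with a classification such as "Secret" raises a ValueError. That is the kind of input control predefined lists are meant to provide.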

 

Summing Up

Understanding Metadata is vitally important when designing effective Content Management systems. Abiding by established models and standards and mapping out the requirements of domains makes keeping information organized easier. It also ensures that the use of data is maximized without wasting resources.

This content is brought to you by Maxim Panych.
