PDS4 Provenance Discipline Data Dictionary User’s Guide
Last edited: 2025-11-25

Introduction

  1. Purpose of this User’s Guide

    • This User’s Guide provides an overview of the Provenance Discipline Data Dictionary. The guide details how to include the dictionary in a PDS4 label, describes the organization of the dictionary’s classes and attributes, provides definitions for these classes and attributes, and lists example excerpts from labels that use them.

  2. Audience

    • This User’s Guide is intended for data providers who plan to archive Provenance data with PDS, and for the PDS Nodes working with those data providers.

Overview of the Provenance Discipline Data Dictionary

The Provenance Discipline Data Dictionary contains classes and attributes specific to the Provenance discipline.
Steward: Steve Hughes, Engineering Node, john.s.hughes@jpl.nasa.gov

Document Outline

  1. How to Extend the Provenance Discipline Data Dictionary

  2. How to Include the Provenance Discipline Data Dictionary in a PDS4 Label

  3. Organization of Classes and Attributes

    1. Class Organization

    2. Attributes by Class

  4. Definitions

    1. Classes (in alphabetical order)

    2. Attributes (in alphabetical order)

  5. Examples

  6. Edit History

How to Extend the Provenance Discipline Data Dictionary

The PDS4 Provenance Local Data Dictionary (LDD) is derived from the PROV Data Model (PROV-DM), a W3C Recommendation dated 30 April 2013. It was intentionally designed as a general-purpose provenance framework capable of supporting a broad range of PDS use cases; for example, the calibration provenance of a Voyager image was among the earliest scenarios considered.

As common or recurring use cases are identified, additional specialized classes are introduced to address them. The PDS4 Logical Identifier (LID) supersession use case is the first such example. The corresponding class, SupersededLID, has now been added and is available in the schema.

How to Include the Provenance Discipline Data Dictionary in a PDS4 Label

The dictionary consists of a set of files with names in the form PDS4_PROV_xxxx_yyyy.ext, where

  • xxxx = the PDS4 Information Model version, e.g. 1O00

  • yyyy = the Provenance Discipline Data Dictionary version, e.g. 1210

and the file extensions are

  • .csv = A comma-separated value table of dictionary attributes

  • .JSON = The dictionary contents in JSON format

  • .sch = The dictionary “rules” as an XML Schematron file

  • .txt = The report generated when the dictionary was built

  • .xml = The PDS4 label that describes this set of files

  • .xsd = The dictionary contents as an XML schema file

Only the XML schema (.xsd) and Schematron (.sch) files are needed for validating a PDS4 label.

The latest version of this dictionary may be found on the PDS web site at https://pds.nasa.gov/datastandards/dictionaries/index.shtml#prov.

The following is an example showing the use of this dictionary in a PDS4 label.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="https://pds.nasa.gov/pds4/pds/v1/PDS4_PDS_1O00.sch" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<?xml-model href="https://pds.nasa.gov/pds4/prov/v1/PDS4_PROV_1O00_1210.sch" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<Product_Observational xmlns="http://pds.nasa.gov/pds4/pds/v1"
    xmlns:prov="http://pds.nasa.gov/pds4/prov/v1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://pds.nasa.gov/pds4/pds/v1 https://pds.nasa.gov/pds4/pds/v1/PDS4_PDS_1O00.xsd
                        http://pds.nasa.gov/pds4/prov/v1 https://pds.nasa.gov/pds4/prov/v1/PDS4_PROV_1O00_1210.xsd">

The following is a schematic example showing the location of every Provenance Discipline Data Dictionary class and attribute in a PDS4 label. Note that not all classes and attributes may be mutually compatible, and the example does not include any recursion, even if recursion is allowed.

<Observation_Area>
  ...
  <Discipline_Area>
    <prov:Provenance>
      <prov:title/>
      <prov:local_id/>
      <prov:description/>
      <prov:Provenance_Entities>
        <prov:Entity>
          <prov:title/>
          <prov:local_id/>
          <prov:description/>
          <prov:Attributes>
            <prov:attribute/>
            <prov:value/>
          </prov:Attributes>
        </prov:Entity>
      </prov:Provenance_Entities>
      <prov:Provenance_Activities>
        <prov:Activity>
          <prov:title/>
          <prov:local_id/>
          <prov:description/>
          <prov:Attributes>
          </prov:Attributes>
        </prov:Activity>
      </prov:Provenance_Activities>
      <prov:Provenance_Agents>
        <prov:Agent>
          <prov:title/>
          <prov:local_id/>
          <prov:description/>
          <prov:Person>
            <prov:given_name/>
            <prov:family_name/>
            <prov:orcid/>
          </prov:Person>
          <prov:Organization>
            <prov:name/>
            <prov:rorid/>
          </prov:Organization>
          <prov:Software>
            <prov:name/>
          </prov:Software>
          <prov:External_Reference>
            <prov:Affiliation>
              <prov:name/>
              <prov:rorid/>
            </prov:Affiliation>
          </prov:External_Reference>
          <prov:Attributes>
          </prov:Attributes>
        </prov:Agent>
      </prov:Provenance_Agents>
      <prov:Provenance_Relationships>
        <prov:Relationship>
          <prov:title/>
          <prov:local_id/>
          <prov:description/>
          <prov:time/>
          <prov:Relation_Type>
            <prov:Used>
              <prov:activity1/>
              <prov:entity1/>
            </prov:Used>
            <prov:WasAssociatedWith>
              <prov:activity1/>
              <prov:agent1/>
            </prov:WasAssociatedWith>
            <prov:WasAttributedTo>
              <prov:entity1/>
              <prov:agent1/>
            </prov:WasAttributedTo>
            <prov:WasDerivedFromUsedUsed>
              <prov:generatedEntity/>
              <prov:usedEntity/>
              <prov:activity1/>
            </prov:WasDerivedFromUsedUsed>
            <prov:WasGeneratedBy>
              <prov:entity1/>
              <prov:activity1/>
            </prov:WasGeneratedBy>
            <prov:WasInformedBy>
              <prov:activity2/>
              <prov:activity1/>
            </prov:WasInformedBy>
            <prov:ActedOnBehalfOf>
              <prov:agent2/>
              <prov:agent1/>
              <prov:activity1/>
            </prov:ActedOnBehalfOf>
            <prov:WasStartedBy>
              <prov:activity2/>
              <prov:entity1/>
              <prov:activity1/>
            </prov:WasStartedBy>
          </prov:Relation_Type>
          <prov:Attributes>
          </prov:Attributes>
        </prov:Relationship>
      </prov:Provenance_Relationships>
    </prov:Provenance>
    <prov:SupersededLID>
      <prov:title/>
      <prov:local_id/>
      <prov:description/>
      <prov:Entity>
        <prov:title/>
        <prov:local_id/>
        <prov:description/>
        <prov:Attributes>
          <prov:attribute/>
          <prov:value/>
        </prov:Attributes>
      </prov:Entity>
    </prov:SupersededLID>
  </Discipline_Area>
  ...
</Observation_Area>

The namespace for the Provenance Discipline Data Dictionary is http://pds.nasa.gov/pds4/prov/v1, abbreviated “prov:”.

Organization of Classes and Attributes

Class Organization

Below is a structured list showing the organization of classes, ordered by appearance in the PDS4 label. Each class name is linked to its complete definition in the Definitions section.

Attributes by Class

The attributes immediately under each class (if any) are listed below. Both classes and attributes are ordered by appearance in the PDS4 label; however, each class is listed only once, even if that class can appear in more than one place in a PDS4 label. Each class and attribute name is linked to its complete definition in the Definitions section.

Provenance (attribute list)

Provenance_Entities (attribute list)

Entity (attribute list)

Attributes (attribute list)

Provenance_Activities (attribute list)

Activity (attribute list)

Provenance_Agents (attribute list)

Agent (attribute list)

Person (attribute list)

Organization (attribute list)

Software (attribute list)

External_Reference (attribute list)

Affiliation (attribute list)

Provenance_Relationships (attribute list)

Relationship (attribute list)

Relation_Type (attribute list)

Used (attribute list)

WasAssociatedWith (attribute list)

WasAttributedTo (attribute list)

WasDerivedFromUsedUsed (attribute list)

WasGeneratedBy (attribute list)

WasInformedBy (attribute list)

ActedOnBehalfOf (attribute list)

WasStartedBy (attribute list)

SupersededLID (attribute list)

Definitions

Classes (in alphabetical order)

ActedOnBehalfOf

Delegation is the assignment of authority and responsibility to an agent (by itself or by another agent) to carry out a specific activity as a delegate or representative, while the agent it acts on behalf of retains some responsibility for the outcome of the delegated work.

Activity

An activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities.

Affiliation

The Affiliation class describes

Agent

An agent is something that bears some form of responsibility for an activity taking place, for the existence of an entity, or for another agent’s activity.
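
As a minimal sketch of how an agent might appear in a label, the fragment below describes a person. It follows the element order of the schematic example earlier in this guide. All values are hypothetical: the names and local_id are invented, and the orcid value is the sample iD that ORCID publishes for its fictitious test researcher. Whether orcid expects a bare iD or a full URL, and which other child classes of Agent are required, is determined by the schema and is not asserted here. The prov prefix is assumed to be bound to the dictionary namespace in the label's root element.

```xml
<prov:Agent>
  <prov:title>Calibration engineer</prov:title>
  <prov:local_id>agent_engineer</prov:local_id>
  <prov:description>Person responsible for running the calibration.</prov:description>
  <prov:Person>
    <!-- Hypothetical name; given_name and family_name are each required once. -->
    <prov:given_name>Jane</prov:given_name>
    <prov:family_name>Doe</prov:family_name>
    <!-- Placeholder value; orcid is optional (0 to 1 occurrences). -->
    <prov:orcid>0000-0002-1825-0097</prov:orcid>
  </prov:Person>
</prov:Agent>
```

An organization or a piece of software would use prov:Organization or prov:Software in place of prov:Person.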

Attributes

Attributes: an OPTIONAL set (attrs) of attribute-value pairs representing additional information about this attribution.

Entity

An entity is a physical, digital, conceptual, or other kind of thing with some fixed aspects; entities may be real or imaginary.

External_Reference

The External_Reference class describes

Organization

The Organization class describes

Person

The Person class describes

Provenance

This Provenance class is an implementation of the W3C Provenance Model pattern ….

Provenance_Activities

The Provenance Activities class contains the Activity definitions.

Provenance_Agents

The Provenance Agents class contains the Agent definitions.

Provenance_Entities

The Provenance Entities class contains the Entity definitions.

Provenance_Relationships

The Provenance Relationships class contains the Relationship definitions.

Relation_Type

The Relation Type class contains the set of relation types; only one relation type is allowed for each relationship.
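
To illustrate the one-relation-type-per-relationship rule, the sketch below shows a single Relationship whose Relation_Type holds only a Used relation. The identifiers (rel_used_1, act_calibrate, ent_raw_image) and the text values are hypothetical. It is also an assumption that activity1 and entity1 carry the local_id values of an Activity and an Entity defined elsewhere in the same Provenance block.

```xml
<prov:Relationship>
  <prov:title>Calibration used raw image</prov:title>
  <prov:local_id>rel_used_1</prov:local_id>
  <prov:description>The calibration activity used the raw image as input.</prov:description>
  <prov:Relation_Type>
    <!-- Exactly one relation type appears inside Relation_Type. -->
    <prov:Used>
      <prov:activity1>act_calibrate</prov:activity1>
      <prov:entity1>ent_raw_image</prov:entity1>
    </prov:Used>
  </prov:Relation_Type>
</prov:Relationship>
```

A second relation, such as WasGeneratedBy, would be recorded in its own Relationship element rather than added to this Relation_Type.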

Relationship

The Relationship class defines a relationship.

Software

The Software class describes …

SupersededLID

The SupersededLID class relates two LIDs, one of which supersedes the other.

Used

Usage is the beginning of utilizing an entity by an activity. Before usage, the activity had not begun to utilize this entity and could not have been affected by the entity. (Note: This definition is formulated for a given usage; it is permitted for an activity to have used the same entity multiple times.)

WasAssociatedWith

An activity association is an assignment of responsibility to an agent for an activity, indicating that the agent had a role in the activity. It further allows for a plan to be specified, which is the plan intended by the agent to achieve some goals in the context of this activity.

WasAttributedTo

Attribution is the ascribing of an entity to an agent.

WasDerivedFromUsedUsed

A derivation is a transformation of an entity into another, an update of an entity resulting in a new one, or the construction of a new entity based on a pre-existing entity.
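
As a hedged sketch of a derivation, the fragment below states that a calibrated image was derived from a raw image by way of a calibration activity. The element order follows the schematic example earlier in this guide. All identifiers and text values are hypothetical, and they are assumed to match the local_id values of entities and an activity defined elsewhere in the same Provenance block.

```xml
<prov:Relationship>
  <prov:title>Calibrated image derived from raw image</prov:title>
  <prov:local_id>rel_derived_1</prov:local_id>
  <prov:description>The calibrated image is a transformation of the raw image.</prov:description>
  <prov:Relation_Type>
    <prov:WasDerivedFromUsedUsed>
      <!-- generatedEntity is the new entity (e2); usedEntity is its source (e1). -->
      <prov:generatedEntity>ent_calibrated_image</prov:generatedEntity>
      <prov:usedEntity>ent_raw_image</prov:usedEntity>
      <!-- activity1 is optional here (0 to 1 occurrences). -->
      <prov:activity1>act_calibrate</prov:activity1>
    </prov:WasDerivedFromUsedUsed>
  </prov:Relation_Type>
</prov:Relationship>
```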

WasGeneratedBy

Generation is the completion of production of a new entity by an activity. This entity did not exist before generation and becomes available for usage after this generation.

WasInformedBy

Communication is the exchange of some unspecified entity by two activities, one activity using some entity generated by the other.

WasStartedBy

Start is when an activity is deemed to have been started by an entity, known as the trigger. The activity did not exist before its start. A start may refer to a trigger entity that set off the activity, or to an activity, known as the starter, that generated the trigger.

Attributes (in alphabetical order)

activity1

activity1: an identifier (a) for an activity. Whether activity1 is required or optional depends on the enclosing class, as listed in the occurrence limits below.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • in ActedOnBehalfOf:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in Used:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in WasDerivedFromUsedUsed:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

  • in WasStartedBy:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

  • in WasAssociatedWith:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in WasGeneratedBy:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

  • in WasInformedBy:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

activity2

activity2: an identifier (a) for an activity. Per the occurrence limits below, activity2 is required in each class that uses it.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 1

  • Maximum occurrences: 1

agent1

agent1: the identifier (ag) of the agent to whom the entity is ascribed and who therefore bears some responsibility for its existence. The attribute agent1 is an OPTIONAL identifier for an agent.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • in WasAttributedTo:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in WasAssociatedWith:

    • Minimum occurrences: 0

    • Maximum occurrences: unbounded

  • in ActedOnBehalfOf:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

agent2

agent2: the identifier (ag) of the agent to whom the entity is ascribed and who therefore bears some responsibility for its existence. The attribute agent2 is an OPTIONAL identifier for an agent.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 0

  • Maximum occurrences: 1

attribute

attribute: the attribute component of an OPTIONAL set (attrs) of attribute-value pairs representing additional information about this attribution.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 1

  • Maximum occurrences: 1

description

The attribute description provides

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • in Provenance:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in SupersededLID:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

  • in Entity:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

  • in Activity:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

  • in Agent:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

  • in Relationship:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

entity1

entity1: an entity identifier (e) – The attribute entity1 is an OPTIONAL identifier for an entity.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • in WasStartedBy:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

  • in WasAttributedTo:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in Used:

    • Minimum occurrences: 0

    • Maximum occurrences: unbounded

  • in WasGeneratedBy:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

family_name

The attribute family_name.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 1

  • Maximum occurrences: 1

generatedEntity

generatedEntity: the identifier (e2) of the entity generated by the derivation

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 1

  • Maximum occurrences: 1

given_name

The attribute given_name.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 1

  • Maximum occurrences: 1

local_id

The attribute local_id provides

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • in Relationship:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in Provenance:

    • Minimum occurrences: 0

    • Maximum occurrences: 1

  • in SupersededLID:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in Entity:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in Activity:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

  • in Agent:

    • Minimum occurrences: 1

    • Maximum occurrences: 1

name

The attribute name.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 1

  • Maximum occurrences: 1

orcid

The attribute orcid.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 0

  • Maximum occurrences: 1

rorid

The attribute rorid.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 0

  • Maximum occurrences: 1

time

time: an OPTIONAL “usage time” (t), the time at which the entity started to be used

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 0

  • Maximum occurrences: 1

title

The attribute title provides

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 1

  • Maximum occurrences: 1

usedEntity

usedEntity: the identifier (e1) of the entity used by the derivation

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 1

  • Maximum occurrences: 1

value

value: the value component of an OPTIONAL set (attrs) of attribute-value pairs representing additional information about this attribution.

  • PDS4 data type: ASCII_Short_String_Collapsed

  • Valid values: N/A

  • Minimum Length: 1

  • Maximum Length: 255

  • Nillable: No

  • Minimum occurrences: 1

  • Maximum occurrences: 1

Examples

Example PDS4 label snippet from urn:nasa:pds:insight-ifg-mars:data-ifg-calibrated:ifg-cal-sol0212-20190702t010324-20190703t014257-gpt2hz::2.0:

<Discipline_Area>
  <prov:SupersededLID>
    <prov:title>urn:nasa:pds:insight-ifg-mars:data-ifg-calibrated:ifg-cal-sol0212-20190702t010324-20190703t014257-gpt2hz</prov:title>
    <prov:local_id>Superseded LIDs</prov:local_id>
    <prov:description>Product LID change due to change in product start/stop times (due to reprocessing with a new SCLK kernel).</prov:description>
    <prov:Entity>
      <prov:title>urn:nasa:pds:insight-ifg-mars:data-ifg-calibrated:ifg-cal-sol0212-20190702t010324-20190703t014257-gpt2hz</prov:title>
      <prov:local_id>urn:nasa:pds:insight-ifg-mars:data-ifg-calibrated:ifg-cal-sol0212-20190702t010324-20190703t014257-gpt2hz</prov:local_id>
      <prov:description>New LID supersedes old LID.</prov:description>
      <prov:Attributes>
        <prov:attribute>Supersedes</prov:attribute>
        <prov:value>urn:nasa:pds:insight-ifg-mars:data-ifg-calibrated:ifg-cal-sol0212-20190702t011217-20190703t015130-gpt2hz</prov:value>
      </prov:Attributes>
      <prov:Attributes>
        <prov:attribute>Reason</prov:attribute>
        <prov:value>Duplication</prov:value>
      </prov:Attributes>
    </prov:Entity>
  </prov:SupersededLID>
</Discipline_Area>
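
The following sketch is not taken from an archived product. It illustrates how a prov:Provenance block might tie together one entity, one activity, one software agent, and two relationships, loosely following the image-calibration use case mentioned earlier in this guide. Every identifier, title, and name is hypothetical. The fragment assumes that relationship attributes such as entity1 and activity1 refer to local_id values defined in the same block, and that the optional description elements may be omitted where the occurrence limits allow. Whether every container class shown is required is determined by the schema.

```xml
<Discipline_Area>
  <prov:Provenance>
    <prov:title>Calibration provenance (hypothetical)</prov:title>
    <prov:local_id>prov_calibration</prov:local_id>
    <prov:description>Records which activity and software produced a calibrated image.</prov:description>
    <prov:Provenance_Entities>
      <prov:Entity>
        <prov:title>Calibrated image</prov:title>
        <prov:local_id>ent_calibrated_image</prov:local_id>
      </prov:Entity>
    </prov:Provenance_Entities>
    <prov:Provenance_Activities>
      <prov:Activity>
        <prov:title>Radiometric calibration</prov:title>
        <prov:local_id>act_calibrate</prov:local_id>
      </prov:Activity>
    </prov:Provenance_Activities>
    <prov:Provenance_Agents>
      <prov:Agent>
        <prov:title>Calibration pipeline</prov:title>
        <prov:local_id>agent_pipeline</prov:local_id>
        <prov:Software>
          <prov:name>example-calibration-pipeline</prov:name>
        </prov:Software>
      </prov:Agent>
    </prov:Provenance_Agents>
    <prov:Provenance_Relationships>
      <!-- The entity was generated by the activity. -->
      <prov:Relationship>
        <prov:title>Calibrated image generated by calibration</prov:title>
        <prov:local_id>rel_generated_1</prov:local_id>
        <prov:description>The calibration activity produced the calibrated image.</prov:description>
        <prov:Relation_Type>
          <prov:WasGeneratedBy>
            <prov:entity1>ent_calibrated_image</prov:entity1>
            <prov:activity1>act_calibrate</prov:activity1>
          </prov:WasGeneratedBy>
        </prov:Relation_Type>
      </prov:Relationship>
      <!-- The activity was carried out by the software agent. -->
      <prov:Relationship>
        <prov:title>Calibration associated with pipeline</prov:title>
        <prov:local_id>rel_associated_1</prov:local_id>
        <prov:description>The calibration pipeline software carried out the calibration.</prov:description>
        <prov:Relation_Type>
          <prov:WasAssociatedWith>
            <prov:activity1>act_calibrate</prov:activity1>
            <prov:agent1>agent_pipeline</prov:agent1>
          </prov:WasAssociatedWith>
        </prov:Relation_Type>
      </prov:Relationship>
    </prov:Provenance_Relationships>
  </prov:Provenance>
</Discipline_Area>
```

Keeping each relation in its own Relationship element follows the Relation_Type definition, which allows only one relation type per relationship.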

Edit History

See also: PROV change log.
2025-11-25 Steve Hughes