Identifiers and Relationships

Identifiers are values applied to records. They may allow tracking records (as in the case of collector numbers), reference other resources (e.g., GenBank numbers), or form relationships among specimens (such as hosts of parasites).

Background

Arctos is built on the premise that each catalog record will gather all the information known about the object or record, especially resolvable, unique links to external and online resources. The Extended Specimen Network is one way to describe the Arctos implementation of Linked Open Data (LOD) principles, that is, machine-readable interlinkages across the internet. Identifiers and relationships between records are the building blocks of LOD.

Arctos originally created identifiers from a paired identifier type and value. This worked reasonably well for short periods of time and small numbers of well-defined identifiers. However, over time and with more users, cryptic references to people (the “ABC” in ‘ABC 123’) became lost or ambiguous. Less-precise identifiers also create the possibility that several unrelated series are mixed under one type.

Agent-Based Identifiers

Arctos Agents are entities that perform or represent an action or activity, which may refer to a person, an organization, an institutional catalog, etc. Many “traditional” identifiers clearly reference Agents, just in ambiguous and nonpersistent ways. Arctos has made this connection explicit; all identifier values may now be “issued by” an Agent whether a person, organization or other discoverable entity.

Benefits of the Agent-Based model

Drawbacks of the Agent-Based model

Identifier Types in Arctos

Arctos supports three main categories of identifier, plus a legacy category (Type D). In every case it is strongly recommended to have an “issued by” Agent (even if unknown; yes, unknown is an option!):

Type A: Identifiers that have URLs where the “issued by” agent can be used to create explicit agent links (i.e., conforms to Linked Open Data). Supports the Extended Specimen Network.

Arctos fields:

| IssuedBy | Value | Type |
| --- | --- | --- |
| MVZ Bird Collection | https://arctos.database.museum/guid/MVZ:Bird:69400 | Arctos record GUID |
| NCBI Nucleotide - GenBank | http://www.ncbi.nlm.nih.gov/nuccore/EU011370 | identifier |

Type B: Largely used by specific collections for internal purposes. Arctos may use these as shortcuts to auto-link to exactly one Agent. Not usable in the Extended Specimen Network.

Arctos fields:

| IssuedBy | Value | Type |
| --- | --- | --- |
| NK | 39385 | NK |
| AF | 51930 | AF |

Type C: Identifiers that are not resolvable (no URL-based assignments) and cannot be auto-assigned. These may be used to describe original data (e.g., collector number, field number, preparator number) where the “issued by” agent (a person, organization, or shared catalog) needs to be assigned manually. Not usable in the Extended Specimen Network.

Arctos fields:

| IssuedBy | Value | Type |
| --- | --- | --- |
| James L. Patton | 1811 | collector number |
| Carla Cicero | 1062 | preparator number |
| Lindsay Wildlife Hospital | 2004-335 | identifier |
| Bell Museum Bird Collection | X7314 | preparator number |

Type D: Legacy identifiers that are transitioning to one of the other categories; as is, these identifiers do not provide useful information and are likely making data difficult to find and manage (at best causing confusion). None of these should be used with new data and ideally will migrate to a more explicit solution.
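
As a rough illustration of how the first three categories differ, the sketch below sorts an identifier into Type A, B, or C from its value and type. The function name, the `AUTO_LINKED_TYPES` set, and the rules themselves are assumptions made for illustration, not Arctos code. Type D is left out because a legacy identifier cannot be recognized from its structure alone.

```python
from urllib.parse import urlparse

# Hypothetical: identifier types that could be auto-linked to exactly one Agent.
AUTO_LINKED_TYPES = {"NK", "AF"}


def classify_identifier(value: str, id_type: str) -> str:
    """Return 'A', 'B', or 'C' for an identifier (illustrative rules only)."""
    parsed = urlparse(value.strip())
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "A"  # the value is a URL, so it can be resolved
    if id_type in AUTO_LINKED_TYPES:
        return "B"  # collection-specific type that maps to one Agent
    return "C"  # the issuing Agent has to be assigned manually


examples = [
    ("https://arctos.database.museum/guid/MVZ:Bird:69400", "Arctos record GUID"),
    ("39385", "NK"),
    ("1811", "collector number"),
]
categories = [classify_identifier(value, id_type) for value, id_type in examples]
```

Checking the value for a URL first reflects the idea that a resolvable identifier is the most useful form, whatever its type.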


Other Identifier Type

Coll_Obj_Other_ID_Num.Other_ID_Type VARCHAR2(75) not null

This field describes the kind of identifier using a controlled vocabulary. Note that many types are arbitrary; the issuing Agent is usually a much better pointer to the data.

ID Issued By

ID Issued By is the Agent issuing the identifier. “Issuing” may involve any process of creating the identifier, such as a collector writing something in a notebook or on a tag, or a subdivision of NCBI creating a URL representing a genetic sequence. Note that this allows very fine-scaled “typing” of identifiers; a subdivision of a department within an institution is easily achievable and wholly unambiguous, for example. Much of this functionality has traditionally - and very roughly - been embedded into type, or as part of the identifier itself (such as a collector’s initials prefixing their collector number).

ID Assigned By

ID Assigned By is the Agent assigning the identifier to a catalog record. Note that some identifiers may be assigned by bot agents, and these should receive extra scrutiny. This information is generally extracted from the user’s environment rather than being asserted.

assigned_date

Date on which identifier was assigned.

ID References

ID References is a controlled vocabulary defining the item to which the other ID was originally applied. “Self” is the value used when an ID was applied to the current item; all other values create a (sometimes-resolvable) relationship to another item. Note that the “other half” of an ID-created relationship does not necessarily resolve to a cataloged item (though it should), and is not limited to other records in Arctos (relationships can be formed to any online resource).

A special type (“Arctos record GUID”) is available for linking records within Arctos. This type ensures that the identifier and its “issued by” Agent are properly formed.

Various tools are available for detecting and creating reciprocal relationships, or a bot may be enabled to fully automate this process.
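
Detecting a missing reciprocal relationship can be sketched as follows. This is a minimal illustration, not the Arctos tooling: the `INVERSE` pairs, the tuple layout, and the `EX:` record names are all hypothetical, and the real ID References vocabulary may use different terms.

```python
# Hypothetical inverse pairs; the real ID References vocabulary may differ.
INVERSE = {"host of": "parasite of", "parasite of": "host of"}


def missing_reciprocals(relationships):
    """Return the inverse relationships that are absent from the input.

    Each relationship is a (source, references, target) tuple.
    """
    existing = set(relationships)
    missing = []
    for source, references, target in relationships:
        inverse = INVERSE.get(references)
        if inverse is None:
            continue  # "self" and unlisted terms have no inverse in this sketch
        candidate = (target, inverse, source)
        if candidate not in existing and candidate not in missing:
            missing.append(candidate)
    return missing


# Made-up records: the first pair is already reciprocal, the third is one-sided.
relationships = [
    ("EX:Para:1", "parasite of", "EX:Mamm:2"),
    ("EX:Mamm:2", "host of", "EX:Para:1"),
    ("EX:Para:3", "parasite of", "EX:Mamm:4"),
]
to_create = missing_reciprocals(relationships)
```

Here `to_create` holds the single relationship that would need to be added to make the third entry reciprocal.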

remarks

A remarks field is available for any clarifying information.
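
Taken together, the fields described in this section could be modeled roughly as below. The class and attribute names are assumptions chosen for readability and do not reflect the actual Arctos schema; the defaults follow the text, where “identifier” is usually a good type, “unknown” is an acceptable issuing Agent, and “self” marks an ID applied to the current item (the exact casing of that value is also an assumption).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class OtherIdentifier:
    """One identifier attached to a catalog record (illustrative names only)."""

    value: str
    id_type: str = "identifier"
    issued_by: str = "unknown"  # Agent that created the identifier
    assigned_by: Optional[str] = None  # Agent that attached it to the record
    assigned_date: Optional[date] = None
    references: str = "self"
    remarks: Optional[str] = None

    def creates_relationship(self) -> bool:
        """True when the identifier points at something other than this record."""
        return self.references != "self"


collector_number = OtherIdentifier(
    value="1811",
    id_type="collector number",
    issued_by="James L. Patton",
)
```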

General Guidelines

Be as specific and complete as possible in choosing both an Other ID Type and assigning an Other ID Number. Everything that follows is an elaboration of this simple concept.

Other ID numbers are in a zero-or-one-or-many:1 relationship with Cataloged Items. There is no limit to the number of Other IDs that may be assigned to a catalog item, and there is no implication that IDs must be unique, particularly identifying, or even useful. Capture every identifier associated with a specimen – someone at some time considered the identifier useful, and may wish to locate the specimen using it.

Loaned specimens occasionally return with de-facto other IDs (in the form of attached barcodes, GenBank numbers, “personal numbers,” etc.). Record all these as Other IDs.

Choosing Type

See documentation for type definitions and guidelines. “Identifier” is usually a good choice.

Arctos References

To reference another record in Arctos, use Arctos record GUID. This type requires metadata, but will generally be able to automate the addition of such from partial information (such as the triplet).
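
For example, a triplet can be expanded into a full record URL along these lines. The helper name is hypothetical, and the URL pattern is inferred from the MVZ example earlier on this page rather than from a documented API.

```python
# Base path taken from the example GUID shown earlier on this page.
ARCTOS_GUID_BASE = "https://arctos.database.museum/guid/"


def guid_url_from_triplet(institution: str, collection: str, catalog_number: str) -> str:
    """Join an institution:collection:number triplet into a record URL."""
    parts = [institution.strip(), collection.strip(), catalog_number.strip()]
    if not all(parts):
        raise ValueError("institution, collection, and catalog number are all required")
    return ARCTOS_GUID_BASE + ":".join(parts)


url = guid_url_from_triplet("MVZ", "Bird", "69400")
# url == "https://arctos.database.museum/guid/MVZ:Bird:69400"
```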

Personal Catalogs

To reference a person’s personal catalog, collector number or preparator number should be used.

Lot-at-location

field number is appropriate for cataloging lots collected at a single location (e.g., lots of fish). (This is not a recommendation for such traditional practices!)

Organism

Organism ID is appropriate for referencing a parent Organism. These are often (but not necessarily) cataloged in the Arctos:Entity collection, and many tools for assembling Organisms exist. Only one Organism ID is allowed.
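
The one-Organism-ID rule could be checked before data entry with something like the sketch below. The function name, the tuple layout, and the `EX:` values are hypothetical; this is not how Arctos itself enforces the rule.

```python
def organism_id_problems(identifiers):
    """Check one record's identifiers, given as (id_type, value) tuples.

    Returns a list of problem messages; an empty list means the rule holds.
    """
    organism_ids = [value for id_type, value in identifiers if id_type == "Organism ID"]
    if len(organism_ids) > 1:
        return [f"expected at most one Organism ID, found {len(organism_ids)}"]
    return []


# Made-up identifiers for a single catalog record.
record_ids = [
    ("Organism ID", "EX:Entity:1"),
    ("collector number", "1811"),
    ("Organism ID", "EX:Entity:2"),
]
problems = organism_id_problems(record_ids)
```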

Separated Data

UUID (expected, but not required, to be of the form described here) is useful for linking data together in the absence of better identifiers.

All Other Identifiers

All other identifiers are properly entered as type identifier. This type should always have metadata, and some data require precise handling.

How To

Instructions for doing specific tasks related to identifiers in Arctos.
