Understanding the Arctos Locality Model

The core Arctos locality model consists of four primary tables. This guide and the following illustration describe their primary functions and how they interact.

Arctos Locality Stack Diagram

Specimen_Event

Specimen-Event is the link from specimens to localities. Specimen-Event is not shared; a unique instance exists for and establishes every specimen<—>locality link, so a specimen with multiple encounters or unaccepted coordinates will have multiple specimen-events.

Collecting_Event

Collecting Event adds verbatim data plus dates. Collecting events are shared; one collecting_event may be parent to any number of specimen_events. In the case of co-collected specimens (e.g., hosts and parasites), maintaining one collecting event for multiple specimen-events is doubly critical.

Locality

Locality adds formality and vertical spatial data to collecting events. Localities are shared; one locality may be parent to any number of collecting_events.

Geography

Geography adds formalized descriptive data to locality. Geography is shared; one geography may be parent to any number of localities.
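The four tables described above can be sketched as a chain of parent keys. This is a minimal illustration, not the Arctos schema: apart from collecting_event_id, every field name (specimen_id, spec_locality, higher_geog, and so on) and every sample value is an assumption made for the example.

```python
from dataclasses import dataclass

# Hypothetical, simplified records; the real Arctos tables carry many more columns.

@dataclass(frozen=True)
class Geography:
    geog_auth_rec_id: int
    higher_geog: str              # formalized descriptive data

@dataclass(frozen=True)
class Locality:
    locality_id: int
    geog_auth_rec_id: int         # parent geography (shared)
    spec_locality: str

@dataclass(frozen=True)
class CollectingEvent:
    collecting_event_id: int
    locality_id: int              # parent locality (shared)
    verbatim_locality: str
    verbatim_date: str

@dataclass(frozen=True)
class SpecimenEvent:
    specimen_event_id: int
    specimen_id: int              # exactly one specimen per specimen-event
    collecting_event_id: int      # parent collecting event (shared)

def resolve_stack(event, collecting_events, localities, geographies):
    """Follow the parent keys from a specimen-event up to geography."""
    ce = collecting_events[event.collecting_event_id]
    loc = localities[ce.locality_id]
    geo = geographies[loc.geog_auth_rec_id]
    return ce, loc, geo

# Made-up sample data: a host and its parasite share one collecting event.
geographies = {1: Geography(1, "North America, United States, Alaska")}
localities = {10: Locality(10, 1, "Fairbanks")}
collecting_events = {100: CollectingEvent(100, 10, "Fairbanks, AK", "1 Jan 2000")}
host_event = SpecimenEvent(1000, specimen_id=1, collecting_event_id=100)
parasite_event = SpecimenEvent(1001, specimen_id=2, collecting_event_id=100)

host_stack = resolve_stack(host_event, collecting_events, localities, geographies)
parasite_stack = resolve_stack(parasite_event, collecting_events, localities, geographies)
```

Because both specimen-events point at the same collecting event, both specimens resolve to the same collecting event, locality, and geography.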

Not Included

In addition to the primary tables listed above, geology_attributes adds any number of hierarchical geology determinations to localities, table GEOG_SEARCH_TERM adds discovery data (such as old or local placenames, or placenames in local character sets) to geography, and several service-populated fields in Locality add automated georeference and reverse-georeference data, which aid in discoverability and provide editing suggestions.

The Locality Stack

“A specimen’s locality” can be viewed as everything from one record in collecting_event, locality (potentially including geology), and geog_auth_rec, while specimen_event is the glue which attaches “the locality” to specimens. Note that all Arctos keys are bit-wise, and very minor differences in the data (error distance or units, punctuation in remarks, etc.) can force into existence new and (slightly) different data objects. (Arctos provides “fuzzy” merger tools.) In addition to inconsequential differences, localities (which are simultaneously descriptive and/or spatial) often differ from similar localities by specific locality, coordinate point, coordinate error distance, elevation, depth, or the choice of higher geography. That is, there is not necessarily one correct interpretation of “Fairbanks, Alaska.” Collecting events often differ by the format of verbatim date or locality. Finally, the choice of higher geography is often somewhat arbitrary. All of these factors must be considered when attempting to ensure that a specimen shares a locality stack with other specimens. Collecting_event_id and collecting_event_name serve as proxies to the locality stack and may be used in the bulkloader or data entry screens to select existing “places.”
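To illustrate why trivially different data produce distinct objects, the sketch below compares two locality-like records exactly and then after a simple normalization. It is not the Arctos merger tooling; the helper names (exact_key, fuzzy_key), the field names, and the normalization rules are all assumptions chosen for the example.

```python
import re

def exact_key(record: dict) -> tuple:
    """Exact comparison key: every character of every value matters."""
    return tuple(sorted(record.items()))

def fuzzy_key(record: dict) -> tuple:
    """Looser key (hypothetical rules): ignore case, punctuation, extra spaces."""
    def norm(value: str) -> str:
        value = re.sub(r"[^\w\s]", "", value.lower())   # drop punctuation
        return re.sub(r"\s+", " ", value).strip()        # collapse whitespace
    return tuple(sorted((k, norm(v)) for k, v in record.items()))

# Two made-up records that differ only by case and punctuation in remarks.
a = {"spec_locality": "Fairbanks, Alaska", "remarks": "near campus."}
b = {"spec_locality": "Fairbanks, Alaska", "remarks": "Near campus"}

same_exactly = exact_key(a) == exact_key(b)   # False: treated as two objects
same_fuzzily = fuzzy_key(a) == fuzzy_key(b)   # True: candidates for merging
```

Under exact matching the two records are separate objects; only the looser comparison flags them as likely duplicates, which is the kind of decision a curator still has to confirm.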

Hierarchy

The core data model is hierarchical; an example as a hierarchical “tree” is given below.

All of the above is from one geography: North America, United States, Alaska.

There are two localities given; we will focus only on Locality 1, which contains two collecting events differing only by date. Collecting Event 1-1 contains two specimen-events and, like all other specimen-events, each contains exactly one specimen. Collecting Event 1-2 also contains two specimen-events, but in this case they contain the same specimen, perhaps a tagged individual which moved and was re-encountered on the same day.

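Note that each specimen_event links to exactly one specimen, so the specimen<–>specimen_event relationship is always 1:1 from the specimen-event side (although, as described above, one specimen may have several specimen-events); all other relationships in this model are 1:∞. One geography may contain two (or zero or two million) localities, one locality may contain two (or zero or two million) events, and one event may contain two (or zero or two million) specimen-events.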
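The Locality 1 example can be written out as plain data to check these cardinalities. This is only a sketch: the specimen labels A, B, and C are invented for illustration, and Locality 2 is omitted because its contents are not described here.

```python
from collections import Counter

# (specimen_event_id, specimen, collecting_event); specimen labels are made up.
specimen_events = [
    (1, "specimen A", "Collecting Event 1-1"),
    (2, "specimen B", "Collecting Event 1-1"),
    (3, "specimen C", "Collecting Event 1-2"),
    (4, "specimen C", "Collecting Event 1-2"),  # same individual, re-encountered
]

# Child -> parent links for the shared levels of the stack.
collecting_events = {
    "Collecting Event 1-1": "Locality 1",
    "Collecting Event 1-2": "Locality 1",
}
localities = {"Locality 1": "North America, United States, Alaska"}

# One collecting event may parent many specimen-events.
events_per_collecting_event = Counter(ce for _, _, ce in specimen_events)

# One locality may parent many collecting events.
collecting_events_per_locality = Counter(collecting_events.values())

# Distinct specimens reachable through each collecting event.
specimens_per_collecting_event = {
    ce: {specimen for _, specimen, parent in specimen_events if parent == ce}
    for ce in collecting_events
}

# Each specimen-event row names exactly one specimen, but one specimen
# may appear in several specimen-events.
events_per_specimen = Counter(specimen for _, specimen, _ in specimen_events)
```

Both collecting events carry two specimen-events, yet Collecting Event 1-2 reaches only one distinct specimen, which appears in two specimen-events.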

(Note also that the possibility of 1:0 relationships is in the context of specimens; “unused” data objects may exist in support of other nodes, such as Media.)