New specimen records may be created from a single flat (non-relational) file, a text file in which all (or most) data for a single cataloged item are in a single row. This file can be created with any convenient client-side application. The file is then loaded into a similarly structured table on the server, and a server-side application (the bulkloader) parses the columns from each row into the relational structure of the database. The process provides an independent layer of data checking before new information is incorporated into the database proper. Original data that are received in electronic format may require minimal manipulation; you can sometimes merely add the necessary columns to build a file in the bulk-loading format.
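As a rough illustration of the flat-file layout described above, the following sketch writes one row per cataloged item using Python's `csv` module. The column list is a small illustrative subset of field names from this document, and the `GUID_Prefix` value is a made-up placeholder. This is not a valid template; a real one should come from the Bulkloader Builder.

```python
import csv
import io

# Illustrative subset of columns only (an assumption for this sketch).
# A real template comes from the Bulkloader Builder.
COLUMNS = [
    "Collection_Object_Id",
    "GUID_Prefix",
    "Taxon_Name",
    "Began_Date",
    "Ended_Date",
    "Spec_Locality",
]


def write_flat_file(records, handle):
    """Write one row per cataloged item, one column per field."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for record in records:
        # Fields missing from a record are written as empty strings.
        writer.writerow({col: record.get(col, "") for col in COLUMNS})


records = [
    {
        "Collection_Object_Id": "1",
        "GUID_Prefix": "EXAMPLE:Herb",  # placeholder, not a real collection
        "Taxon_Name": "Carex bigelowii subsp. lugens",
        "Began_Date": "2006-12-31",
        "Ended_Date": "2006-12-31",
    }
]

buffer = io.StringIO()
write_flat_file(records, buffer)
flat_text = buffer.getvalue()
print(flat_text)
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file handle instead would produce a file on disk.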

Bulkloader templates should be created from the Bulkloader Builder in Arctos. Templates built by any other means, including from this documentation, may be out of date, and out-of-date data will be rejected.

There is no standard method for moving data into table Bulkloader. You may import data from any file format, type the data into the table, write your own data entry screen, or use any other method you choose. We appreciate documentation, even for specialized datasets – contact us if you wish to contribute.

You may mix accessions, collections, or anything else in a single load.

The specimen Bulkloader alone will not handle every situation that may arise while entering data. (The full suite of available tools should.) Use flags to mark incomplete records for further editing, tie to other bulkloaders with UUIDs, or talk to your friendly local Arctos development team BEFORE you make a mess.

Error messages should include more than enough information to allow you to locate and correct the problem. If that isn’t the case, contact us with the error message and a description of the action that caused the error message.

Arctos is case-sensitive. JOHN DOE is not the same value as John Doe. Leading and trailing spaces and other non-printing characters matter.
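A pre-load check along these lines may help catch values that look right but will not match. This is a minimal sketch, not part of Arctos; the function name `find_value_problems` is hypothetical, and treating every Unicode "C" category character as non-printing is an assumption of the sketch.

```python
import unicodedata


def find_value_problems(value):
    """Return a list of reasons a text value might fail an exact match."""
    problems = []
    # str.strip() removes leading and trailing whitespace, including tabs
    # and non-breaking spaces, so any difference means stray whitespace.
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    # Unicode categories starting with "C" cover control and format
    # characters (for example, a zero-width space).
    if any(unicodedata.category(ch).startswith("C") for ch in value):
        problems.append("non-printing character")
    return problems


print(find_value_problems("John Doe"))
print(find_value_problems(" John Doe"))
print(find_value_problems("John\u200bDoe"))
print("JOHN DOE" == "John Doe")
```

Case differences are not flagged here, because either form could be the correct one; the comparison on the last line only shows that the two strings are not equal.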

The web-based applications may not work well for very large loads. Contact us if you’re having problems.

Agent Names

Agent Names must match a unique namestring, not necessarily the preferred name. If you are loading “John Smith” and there are three John Smiths in Arctos, you might create a new name “John Smith (my project)” and use that namestring in your data. Once loaded, the records will display preferred name, and agent name “John Smith (my project)” may be removed.
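The uniqueness rule can be sketched as a simple lookup. The example below assumes you have a list of `(agent_id, agent_name)` pairs from some export of agent names; that input shape, the sample data, and the function name `classify_agent_names` are all hypothetical.

```python
from collections import defaultdict


def classify_agent_names(names_to_load, known_agent_names):
    """Label each name as 'ok', 'missing', or 'ambiguous'.

    known_agent_names is an iterable of (agent_id, agent_name) pairs.
    A name is usable only if it belongs to exactly one agent.
    """
    agents_by_name = defaultdict(set)
    for agent_id, name in known_agent_names:
        agents_by_name[name].add(agent_id)

    result = {}
    for name in names_to_load:
        count = len(agents_by_name.get(name, ()))
        if count == 1:
            result[name] = "ok"
        elif count == 0:
            result[name] = "missing"
        else:
            result[name] = "ambiguous"
    return result


# Made-up data: three different agents share the name "John Smith".
known = [
    (1, "John Smith"),
    (2, "John Smith"),
    (3, "John Smith"),
    (1, "John Smith (my project)"),
    (4, "Jane Doe"),
]
report = classify_agent_names(
    ["John Smith", "John Smith (my project)", "jane doe"], known
)
print(report)
```

The comparison is exact, so "jane doe" is reported as missing even though "Jane Doe" exists.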


Special note primarily for botanists: The bulkloader requires taxonomy.scientific_name, not taxonomy.display_name. That is, “Carex bigelowii subsp. lugens” rather than “Carex bigelowii Torr. subsp. lugens (Holm) T.V. Egorova”.

Any of the following are acceptable taxon name values (current 23 Aug 2011, see code table for most current formulas):

Be sure anything coming from other applications (especially Microsoft products) has not changed field length, precision, or other attributes. Watch dates and non-integer numbers (such as decimal latitude) most closely.
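A quick format check can reveal values that another application has silently rewritten. The sketch below is one possible approach, not an Arctos tool. It accepts only full `YYYY-MM-DD` dates and plain decimal numbers; other ISO 8601 forms are outside its scope, and both function names are hypothetical.

```python
import re
from datetime import date

_FULL_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")
_PLAIN_DECIMAL = re.compile(r"-?[0-9]+(\.[0-9]+)?")


def is_full_iso_date(text):
    """True if text is a real calendar date written as YYYY-MM-DD."""
    match = _FULL_DATE.fullmatch(text)
    if match is None:
        return False
    try:
        date(int(match[1]), int(match[2]), int(match[3]))
    except ValueError:
        # Well-formed but impossible, for example 2006-02-30.
        return False
    return True


def is_plain_decimal(text):
    """True if text is digits with an optional sign and decimal point."""
    return _PLAIN_DECIMAL.fullmatch(text) is not None


print(is_full_iso_date("2006-12-31"))    # True
print(is_full_iso_date("12/31/2006"))    # False
print(is_plain_decimal("64.8378"))       # True
print(is_plain_decimal("6.48378E+01"))   # False
```

Scientific notation and comma decimal separators are rejected because they are typical signs that a spreadsheet has reformatted a column.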


The following table describes select individual fields in the Bulkloader. Check the Bulkloader Builder for the latest table structure. Do not attempt to use this as a template. Let us know if it’s out of date, incomplete, cryptic, or otherwise useless.

| Field Name | Data Type/Vocabulary | Description/Example |
|---|---|---|
| Collection_Object_Id | any unique number | Temporary record identifier; does NOT carry over to any internal primary keys. |
| Cat_Num | set by collection | Existing catalog number, or leave blank to assign sequential numbers on upload. |
| Began_Date | ISO8601 date | Earliest date the specimen could have been collected. |
| Ended_Date | ISO8601 date | Latest date the specimen could have been collected. |
| Verbatim_Date | text; any string | Examples: ‘winter 2002’; ‘1 Nov 2002’; ‘Nov 2002’. |
| VERIFICATIONSTATUS | text; ctverificationstatus | |
| SPECIMEN_EVENT_TYPE | text; ctspecimen_event_type | Type of specimen-event relationship. |
| Event_Assigned_By_Agent | text; agent name | Agent asserting specimen-to-event relationship; often coordinate determiner. |
| Event_Assigned_Date | date | Date on which the specimen-event relationship is made. |
| Coll_Event_Remarks | text; any string | Remarks about Collecting Event. |
| Higher_Geog | text; pre-existing | Higher Geography exactly as it appears in table Geog_Auth_Rec. New values must be added to the database prior to bulk-loading. |
| Maximum_Elevation | integer > minimum_elevation | Maximum elevation from which the specimen could have come. Used in conjunction with Minimum_Elevation and Orig_Elev_Units. |
| Minimum_Elevation | integer < maximum_elevation | Minimum elevation from which the specimen could have come. Used in conjunction with Maximum_Elevation and Orig_Elev_Units. |
| Orig_Elev_Units | text; ctorig_elev_units | Used in conjunction with Maximum_Elevation and Minimum_Elevation. (Code-table controlled.) |
| Spec_Locality | text; any string | Specific locality from which a specimen originates. |
| Locality_Remarks | text; any string | Remarks associated with Locality. |
| | | Begin coordinate fields. All coordinate data are optional unless Orig_Lat_Long_Units is specified, and leaving Orig_Lat_Long_Units NULL will cause all other coordinate data to be ignored. |
| Orig_Lat_Long_Units | text; ctlat_long_units | Lat/Long units as given by the determining agent and before any transformations. |
| Datum | text; ctdatum | Map datum used to determine Lat/Long. Required if coordinates are given. |
| GEOREFERENCE_SOURCE | text; any string | A code indicating the reference from which a Lat/Long was determined. |
| GEOREFERENCE_PROTOCOL | text; ctgeoreference_protocol | |
| Max_Error_Distance | number | The maximum possible error in distance between the recorded Lat_Long and the actual Lat_Long of the specific locality. Required if Max_Error_Units provided. |
| Max_Error_Units | text; ctlat_long_error_units | The units in which the Max_Error_Distance is recorded. Required if Max_Error_Distance provided. |
| | | Geographic coordinates may be entered in decimal degrees [1], degrees-minutes-seconds [2], or degrees with decimal minutes [3]. |
| Dec_Lat [1] | number | Decimal latitude. |
| Dec_Long [1] | number | Decimal longitude. |
| LatDeg [2, 3] | positive number | Degrees Latitude (Integer, 90 or less). |
| LatMin [2] | positive number | Minutes Latitude (Integer, less than 60). |
| LatSec [2] | positive number | Seconds Latitude (Decimal fraction, less than 60). |
| LatDir [2, 3] | text; N or S | Latitude Direction: “N” or “S” (North or South). |
| LongDeg [2, 3] | positive number | Degrees Longitude (Integer, 180 or less). |
| LongMin [2] | positive number | Minutes Longitude (Integer, less than 60). |
| LongSec [2] | positive number | Seconds Longitude (Decimal fraction, less than 60). |
| LongDir [2, 3] | text; W or E | Longitude Direction: “E” or “W” (East or West). |
| Dec_Lat_Min [3] | positive number | Decimal Latitude Minutes (Used with LatDeg, decimal fraction, less than 60). |
| Dec_Long_Min [3] | positive number | Decimal Longitude Minutes (Used with LongDeg, decimal fraction, less than 60). |
| | | End coordinate fields. |
| Verbatim_Locality | text; any string | The locality, entered as closely as possible to the original text provided by the collector. (Not necessarily the same as specific locality.) |
| Collecting_Source | text; ctcollecting_source | Source from which the specimen was received. Example: “wild caught” |
| Habitat | text; any string | A description of habitat at the time of the collecting event. |
| Associated_Species | text; any string | A description of other species occurring at the collecting event. Use relationships to other specimens when possible. |
| Coll_Object_Remarks | text; any string | Remarks about the cataloged item. |
| Id_Made_By_Agent | text; agent name | Determiner, or agent who identified the specimen. |
| Identification_Remarks | text; any string | Remarks associated with this identification. |
| Made_Date | ISO8601 date | Date that the taxonomic determination (or identification) was made. |
| Nature_of_Id | text; ctnature_of_id | How identification was determined. (Code-table controlled.) |
| Taxon_Name | text; taxon name | Scientific Name assigned by identifying agent. |
| Other_Id_Num_x | text; any string | Other identifying numbers (i.e., original field number). |
| Other_Id_Num_Type_x | text; ctcoll_other_id_type | Used in conjunction with Other_Id_Num. (Code-table controlled.) |
| Other_Id_References_x | text; ctid_references | Establish relationships to other specimens. (Code-table controlled.) |
| Collector_Agent_x | text; agent name | Collector or preparator name as it appears in Arctos. At least one collector_agent is required. |
| Collector_Role_x | text; ctcollector_role | Collector Role. |
| Part_Name_x | text; ctspecimen_part_name | At least one part is required. |
| Part_lot_count_x | number | A part_lot_count is required for all non-null parts. |
| Part_Condition_x | text; any string | A description of the latest documented condition. |
| Part_disposition_x | text; ctcoll_obj_disp | A Part_disposition is required for all non-null parts. Example: “in collection” |
| Part_Barcode_x | text; any barcode | Barcode on the part as it will be read by a barcode scanner. |
| Part_Container_Label_x | text; any string | Label on the container (e.g., Nunc tube). The human-readable printing on the container. NULL results in no changes to the part container; ignored if Part_Barcode_x is null. |
| Part_Remark_x | text; any string | Remark about the part. |
| Part_preservation_x | text; ctpart_preservation | This is a shortcut to creating a part attribute of type preservation. Attribute date will default to current_date and determiner will default to enteredAgent. |
| Accn | text; accn number | Accession Number assigned upon acceptance of specimens. Format is accn number without collection information, but see cross-collection considerations. |
| EnteredBy | text; agent name | Agent entering the data into this table. Must match agent_name of type login. NULLable if entered_by_agent_id provided. |
| ENTERED_AGENT_ID | number; key | EnteredBy’s agent_id. Increased performance over EnteredBy. |
| GUID_Prefix | text; controlled | Unique-within-Arctos identifier of the collection under which the specimen will be cataloged. Replaces Institution_Acronym + Collection_Cde. |
| collection_id | number; key | Primary key of table Collection. Alternative to GUID_prefix. |
| Status | text; any string | This is where errors are stored after Bulkloader processing. |
| Flags | text; ctflags | Flag indicating the specimen needs further work. |
| Attribute | text; ctattribute_type | Attribute name. (Code-table controlled.) |
| Attribute_value | text; various | Value of the attribute. Leaving this NULL will cause the bulkloader to ignore the attribute entry regardless of other values. |
| Attribute_units | text; L,W, etc. | Units on attribute_value, where appropriate. |
| attribute_remarks | text; any string | Remarks about the attribute. |
| attribute_date | ISO8601 date | Date the attribute was determined. |
| attribute_det_meth | text; any string | How the attribute was determined. |
| attribute_determiner | text; agent name | Agent who determined the attribute. |
| locality_id | number; key | A primary key from table locality may be used in place of locality information. A value here will over-ride anything entered into higher_geog, spec_locality, coordinates, etc. |
| locality_name | string; Exact:locality.locality_name | A persistent locality identifier which may be used in place of locality information. A value here will over-ride anything entered into higher_geog, spec_locality, coordinates, geology, etc. |
| collecting_event_id | number; key | A primary key from table collecting_event may be used in place of collecting_event information. A value here will over-ride anything entered into higher_geog, spec_locality, coordinates, dates, method, etc. |
| cataloged_item_type | text; ctcataloged_item_type | Designates the type of material held, passed to biodiversity data aggregators as BasisOfRecord. A value here will over-ride anything entered into Default Cataloged Item Type in Manage Collection. |

All date fields should be formatted as ISO 8601, e.g., 2006-12-31.
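Several of the fields above are required only when related fields are filled in. The sketch below checks a few of those rules for one row held as a Python dict. It is an illustration, not the bulkloader's own validation. Writing the `_x` suffix as `_1`, `_2`, and so on, treating `None` and the empty string as NULL, and the sample values are all assumptions.

```python
COORDINATE_FIELDS = [
    "Dec_Lat", "Dec_Long", "LatDeg", "LatMin", "LatSec", "LatDir",
    "LongDeg", "LongMin", "LongSec", "LongDir",
    "Dec_Lat_Min", "Dec_Long_Min",
]


def is_blank(row, field):
    """Treat a missing key, None, or empty string as NULL (an assumption)."""
    value = row.get(field)
    return value is None or value == ""


def check_conditional_fields(row, part_count=1):
    """Return a list of conditional-requirement problems for one row."""
    problems = []
    has_units = not is_blank(row, "Orig_Lat_Long_Units")
    has_coords = any(not is_blank(row, f) for f in COORDINATE_FIELDS)
    if has_coords and not has_units:
        problems.append("Orig_Lat_Long_Units is NULL; coordinates will be ignored")
    if has_coords and has_units and is_blank(row, "Datum"):
        problems.append("Datum is required when coordinates are given")
    # The error distance and its units must be given together.
    if is_blank(row, "Max_Error_Distance") != is_blank(row, "Max_Error_Units"):
        problems.append("Max_Error_Distance and Max_Error_Units must be given together")

    part_numbers = range(1, part_count + 1)
    if all(is_blank(row, f"Part_Name_{n}") for n in part_numbers):
        problems.append("at least one part is required")
    for n in part_numbers:
        if is_blank(row, f"Part_Name_{n}"):
            continue
        for required in ("Part_lot_count", "Part_disposition"):
            if is_blank(row, f"{required}_{n}"):
                problems.append(f"{required}_{n} is required for Part_Name_{n}")
    return problems


# Sample values are placeholders, not checked against any code table.
good_row = {
    "Orig_Lat_Long_Units": "decimal degrees",
    "Dec_Lat": "64.8", "Dec_Long": "-147.7",
    "Datum": "example datum",
    "Part_Name_1": "skull", "Part_lot_count_1": "1",
    "Part_disposition_1": "in collection",
}
bad_row = {"Dec_Lat": "64.8", "Max_Error_Distance": "100", "Part_Name_1": "skull"}

print(check_conditional_fields(good_row))
for problem in check_conditional_fields(bad_row):
    print(problem)
```

The check only reports whether required companions are present; it does not verify that values exist in the relevant code tables.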

Primary Key Warning

Some values may be replaced by or require primary keys: locality_id, entered_by_agent_id, collecting_event_id, etc. These are internal database identifiers that exist only for convenience, and may be updated, transferred to another data object, or removed for seemingly arbitrary reasons and without warning. They’ll probably work over short time-periods, but we offer no guarantees.


Once a record is marked to load by setting status to “autoload_core” (loads data from table bulkloader) or “autoload_extras” (also marks UUID-linked records in “component loaders” to autoload), a script periodically attempts to parse the record into the normalized core Arctos structure. This has one of two outcomes:

* the record is created and marked for cache refresh, or
* an error is returned in the status column.

Records that load successfully must be refreshed in the cache before they appear in the user interfaces. Records are refreshed in the order they enter the queue. This process often takes less than one minute, but when many thousands of records are queued it can take up to several days. Reports/Services > View Statistics > FLAT status provides a summary of the state of the cache, and may be useful in estimating processing time.

Note that between successful loading and the cache refresh there is a period during which records are not visible in any user interface.

Additionally, status DELETE (case-sensitive) can be used to mark records for deletion. This process generally takes about 30 minutes.
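When reviewing a batch, it can help to group rows by what their status value means. The sketch below is a hypothetical helper, not an Arctos function. It assumes the status values have already been read into a list of strings, and it treats anything other than the three values named above as "other", which would include error messages.

```python
from collections import Counter

# Status values that mark a record to load, as described above.
QUEUED_VALUES = {"autoload_core", "autoload_extras"}


def classify_status(status):
    """Map a raw status value to a coarse category (exact, case-sensitive)."""
    if status in QUEUED_VALUES:
        return "queued"
    if status == "DELETE":
        return "delete"
    return "other"


def summarize_statuses(statuses):
    """Count how many rows fall into each category."""
    return Counter(classify_status(s) for s in statuses)


# Made-up status values for illustration.
sample = ["autoload_core", "autoload_extras", "DELETE", "delete", "some error text"]
summary = summarize_statuses(sample)
print(summary)
```

Because the comparison is exact, the lowercase "delete" is counted as "other" rather than as a deletion request.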