abis-mapping v9.0.1 survey_site_data v3.0.0
SYSTEMATIC SURVEY SITE DATA TEMPLATE INSTRUCTIONS
Intended Usage
This Systematic Survey Site Data template should be used to record data about a Site area where species occurrences have been sampled during a systematic survey.
This Systematic Survey Site template must be used in combination with the
Systematic Survey Occurrence Data template and the Systematic Survey Metadata template,
and in some cases the Systematic Survey Site Visit template.
Templates have been provided to facilitate integration of data into the Biodiversity Data Repository (BDR) database. Not all types of data have been catered for in the available templates at this stage - if you are unable to find a suitable template, please contact bdr-support@dcceew.gov.au to make us aware of your data needs.
Data Validation Requirements:
For data validation, you will need your data file to:
- be the correct file format,
- have fields that match the template downloaded (do not remove, or change the order of fields),
- have extant values for mandatory fields (see Table 1), and
- comply with all data value constraints; for example, geographic coordinates must be consistent with a geodeticDatum that is one of the 5 available options.
Additional fields may be added after the templated fields (noting that the data type is not assumed and values will be encoded as strings).
FILE FORMAT
- The systematic survey site data template is a UTF-8 encoded CSV file (not a Microsoft
Excel spreadsheet). Be sure to save this file with your data as a .csv (UTF-8) as follows,
otherwise it will not pass the CSV validation step upon upload.
[MS Excel: Save As > More options > Tools > Web options > Save this document as > Unicode (UTF-8)]
- Do not include empty rows.
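If the data is prepared with a script rather than MS Excel, the two points above can be handled when the file is written. The following is a minimal sketch only: the function name `write_utf8_csv` and the sample rows are hypothetical, and it is not part of the BDR tooling.

```python
import csv
import io


def write_utf8_csv(rows, stream):
    """Write rows as CSV text, skipping rows whose cells are all blank."""
    writer = csv.writer(stream, lineterminator="\n")
    for row in rows:
        if any(str(cell).strip() for cell in row):
            writer.writerow(row)


# Hypothetical sample data: a header row, two sites and one empty row.
rows = [
    ["siteID", "siteIDSource", "siteName"],
    ["P1", "TERN", "Plot 1"],
    ["", "", ""],  # empty row, dropped by write_utf8_csv
    ["P2", "TERN", "Plot 2"],
]

buffer = io.StringIO()
write_utf8_csv(rows, buffer)

# The bytes that would be stored on disk when the text is saved as UTF-8.
encoded = buffer.getvalue().encode("utf-8")
```

To write straight to disk, the same function can be given a file opened with `open(path, "w", encoding="utf-8", newline="")`.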
FILE NAME
When making a manual submission to the Biodiversity Data Repository,
the file name must include the version number
of this biodiversity data template (v3.0.0).
The following format is an example of a valid file name:
data_description-v3.0.0-additional_description.csv
where:
- data_description: A short description of the data (e.g. survey_sites, test_data).
- v3.0.0: The version number of this template.
- additional_description: (Optional) Additional description of the data, if needed (e.g. test_data).
- .csv: Ensure the file name ends with .csv.
For example, survey_sites-v3.0.0-test_data.csv or test_data-v3.0.0.csv
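A file name can be checked against this format before submission. The sketch below is an assumption about how such a check might look: the pattern `FILE_NAME_PATTERN` and the function `is_valid_file_name` are hypothetical, and the pattern follows the example format literally, so it may be stricter than the check the BDR actually applies.

```python
import re

TEMPLATE_VERSION = "v3.0.0"

# Hypothetical pattern: description, version, optional extra description, .csv
FILE_NAME_PATTERN = re.compile(
    r"(?P<description>[^-]+)-"
    + re.escape(TEMPLATE_VERSION)
    + r"(?:-(?P<additional>[^-]+))?\.csv"
)


def is_valid_file_name(name):
    """Return True when the whole name follows the example format."""
    return FILE_NAME_PATTERN.fullmatch(name) is not None
```

With this pattern, both examples given above are accepted, while a name without the version number (such as survey_sites.csv) is rejected.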
FILE SIZE
MS Excel imposes a limit of 1,048,576 rows on a spreadsheet, limiting a CSV file to the header row followed by 1,048,575 data rows. Furthermore, MS Excel has a 32,767 character limit on individual cells in a spreadsheet. These limits may be overcome by using or editing CSV files with other software.
Larger datasets may be more readily ingested using the API interface. Please contact bdr-support@dcceew.gov.au to make us aware of your data needs.
TEMPLATE FIELDS
The template contains the field names in the top row. Table 1 will assist you in transferring your data to the template indicating:
- Field name in the template (and an external link to the Darwin Core standard for that field where relevant);
- Description of the field;
- Required i.e. whether the field is mandatory, conditionally mandatory, or optional;
- Format (datatype) required for the data values for example text (string), number (integer, float), or date;
- Example of an entry or entries for that field; and
- Vocabulary links within this document (for example pick list values) where relevant. The fields in Table 1 that have suggested values are listed in Table 2 in alphabetical order of the field name.
ADDITIONAL FIELDS
Data that does not match the existing template fields may be added as additional columns in
the CSV files after the templated fields.
For example, fieldNotes, continent, country, countryCode, stateProvince, georeferencedDate,
landformPattern, landformElement, aspect, slope.
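As an illustration, additional columns can be appended programmatically so that the templated fields keep their order and every value is written as a string. This is a sketch under assumptions: the helper names `build_header` and `build_row` are hypothetical, and rejecting an additional field that reuses a templated name is a choice made here, not a documented rule.

```python
# Templated field names, in the order given in Table 1.
TEMPLATE_FIELDS = [
    "siteID", "siteIDSource", "existingBDRSiteIRI", "siteType", "siteName",
    "siteDescription", "habitat", "relatedSiteID", "relatedSiteIDSource",
    "relatedSiteIRI", "relationshipToRelatedSite", "locality",
    "decimalLatitude", "decimalLongitude", "footprintWKT", "geodeticDatum",
    "coordinateUncertaintyInMeters", "dataGeneralizations",
]


def build_header(extra_fields):
    """Return the templated fields followed by the additional fields."""
    clashes = [name for name in extra_fields if name in TEMPLATE_FIELDS]
    if clashes:
        # Assumption: an additional field should not reuse a templated name.
        raise ValueError(f"already a templated field: {clashes}")
    return TEMPLATE_FIELDS + list(extra_fields)


def build_row(record, header):
    """Return one CSV row; missing values become empty strings."""
    return ["" if record.get(name) is None else str(record[name]) for name in header]


header = build_header(["fieldNotes", "slope"])
row = build_row({"siteID": "P1", "siteIDSource": "TERN", "slope": 12}, header)
```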
Table 1: Systematic Survey Site data template fields with descriptions, conditions, datatype format, and examples.
| Field # | Name | Description | Mandatory / Optional | Datatype Format | Examples |
|---|---|---|---|---|---|
| 1 | siteID | An identifier for the site. Within the dataset, should be unique per siteIDSource | Mandatory if existingBDRSiteIRI is not provided. Mandatory if siteIDSource is provided. | String | P1 |
| 2 | siteIDSource | The organisation that assigned the siteID to this Site | Mandatory if existingBDRSiteIRI is not provided. Mandatory if siteID is provided. | String | TERN |
| 3 | existingBDRSiteIRI | Verbatim IRI of an existing Site in the BDR that new information is being added to. The IRI will typically start with https://linked.data.gov.au/dataset/bdr/sites/. This field should ONLY be used in a delta update workflow where properties of an existing BDR Site need updating. | Mandatory if siteID and siteIDSource are not provided. | String | https://linked.data.gov.au/dataset/bdr/sites/TERN/P1 |
| 4 | siteType | The type of site that relates to its sampling type and/or dimensions. | Optional | String | Plot (Vocabulary link) |
| 5 | siteName | A name for the site that may be more descriptive than the siteID. | Optional | String | Plot 1 |
| 6 | siteDescription | The site (plot) description covers important aspects of the site (generally of the land surface). Some overlap in collected information does occur due to the modular nature of the survey processes. The description provides significant background information to gain an appreciation of the plot history, topography, position in the landscape and for understanding the likely relationship between the soils, vegetation and fauna. | Optional | String | Fine woody debris. |
| 7 | habitat | A collection of habitat types representing the dominant vegetation structural formation class adopted by the National Vegetation Information System (NVIS). | Optional | List | Chenopod Shrubland \| Closed Fernland (Vocabulary link) |
| 8 | relatedSiteID | Identifier of a related site to the specified site e.g. parent site, same site with different identifier. This site must be included in this dataset. | Mandatory if relatedSiteIDSource is provided. | String | P1 |
| 9 | relatedSiteIDSource | The organisation that assigned the relatedSiteID to the related site. | Mandatory if relatedSiteID is provided. | String | TERN |
| 10 | relatedSiteIRI | Verbatim IRI of a related site to the specified site e.g. parent site, same site with different identifier. | Optional | String | https://linked.data.gov.au/dataset/bdr/sites/TERN/P1 |
| 11 | relationshipToRelatedSite | Relationship between the site and the related site. This field can be used to record Site identifiers for the same site from different custodians through the use of URIs. | Optional | String | PART OF (Vocabulary link) |
| 12 | locality | The specific description of the place. | Optional | String | Cowaramup Bay Road |
| 13 | decimalLatitude | The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic origin of a Site. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 0, inclusive, for the Southern Hemisphere. | Optional | Number | -34.036 |
| 14 | decimalLongitude | The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic origin of a Site. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between 0 and 180, inclusive, for the BDR use case. | Optional | Number | 146.363 |
| 15 | footprintWKT | A Well-Known Text (WKT) representation of the shape (footprint, geometry) that defines the Site. A Site may have both a point-radius representation and a footprint representation, and they may differ from each other. | Optional | WKT | LINESTRING (146.363 -34.036, 146.363 -34.037) (WKT notes) |
| 16 | geodeticDatum | The geodetic datum, or spatial reference system (SRS) upon which the geographic coordinates given for the Site are based. | Optional | String | WGS84 (Vocabulary link) |
| 17 | coordinateUncertaintyInMeters | The horizontal distance (in metres) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Site. Leave the value empty if the uncertainty is unknown, cannot be estimated, or is not applicable (because there are no coordinates). Zero is not a valid value for this term. | Optional | Integer | 50 |
| 18 | dataGeneralizations | Actions taken to make the shared data less specific or complete than in its original form. | Optional | String | Coordinates given in decimalLatitude, decimalLongitude, easting and northing have been rounded to 0.1 DEG. The observer name has been changed to a unique User ID. |
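The numeric constraints described in Table 1 can be checked before upload. The sketch below is illustrative only: the function `check_site_coordinates` is hypothetical, it reads each row as a dictionary of strings, and it applies the ranges exactly as they are worded in the table. Whether the BDR validator enforces exactly these bounds is not stated here.

```python
def _to_number(text, caster):
    """Return caster(text), or None when text is not a valid number."""
    try:
        return caster(text)
    except ValueError:
        return None


def check_site_coordinates(row):
    """Return a list of problems for one row, given as a dict of strings."""
    problems = []
    lat_text = (row.get("decimalLatitude") or "").strip()
    lon_text = (row.get("decimalLongitude") or "").strip()
    unc_text = (row.get("coordinateUncertaintyInMeters") or "").strip()

    # All three fields are optional, so blank values are skipped.
    if lat_text:
        lat = _to_number(lat_text, float)
        if lat is None or not -90 <= lat <= 0:
            problems.append("decimalLatitude must be a number between -90 and 0")
    if lon_text:
        lon = _to_number(lon_text, float)
        if lon is None or not 0 <= lon <= 180:
            problems.append("decimalLongitude must be a number between 0 and 180")
    if unc_text:
        uncertainty = _to_number(unc_text, int)
        # Zero is not a valid value for this term.
        if uncertainty is None or uncertainty <= 0:
            problems.append("coordinateUncertaintyInMeters must be an integer above 0")
    return problems
```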
CHANGELOG
Changes from Systematic Survey Site Data Template v2.0.0
CHANGED FIELDS
- Add field existingBDRSiteIRI. Type is URI, can be blank. Rows with values must be unique within a template.
- Add field relatedSiteIDSource. Type is string, can be blank.
- Add field relatedSiteIRI. Type is IRI, can be blank.
CHANGED VALIDATION
- siteID is no longer required and unique on its own; instead:
  - siteID and siteIDSource are conditionally mandatory. Must be provided together, or neither provided.
  - siteID and siteIDSource are unique together, i.e. each row with these fields must have a unique combination.
- Either siteID and siteIDSource, or existingBDRSiteIRI, or both, must be provided in each row.
- Fields relatedSiteID and relatedSiteIDSource are conditionally mandatory together: both must be provided, or neither must be provided.
- When provided, fields relatedSiteID and relatedSiteIDSource must match a siteID and siteIDSource in the template.
- When either relatedSiteIRI, or relatedSiteID and relatedSiteIDSource, are provided, relationshipToRelatedSite must be provided.
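The validation rules listed above can be expressed as a pre-submission check. The following is a sketch under assumptions, not the BDR validator itself: the function `validate_site_rows` is hypothetical, rows are dictionaries keyed by field name, and a blank or missing value counts as "not provided".

```python
from collections import Counter


def _val(row, field):
    """Return the stripped value of a field, or '' when missing."""
    return (row.get(field) or "").strip()


def validate_site_rows(rows):
    """Return (row number, message) pairs for rows that break a rule."""
    errors = []
    pairs = [(_val(r, "siteID"), _val(r, "siteIDSource")) for r in rows]
    pair_counts = Counter(p for p in pairs if all(p))
    iri_counts = Counter(
        _val(r, "existingBDRSiteIRI") for r in rows if _val(r, "existingBDRSiteIRI")
    )

    for number, (row, (site_id, source)) in enumerate(zip(rows, pairs), start=1):
        iri = _val(row, "existingBDRSiteIRI")
        if bool(site_id) != bool(source):
            errors.append((number, "siteID and siteIDSource must be provided together"))
        elif not site_id and not iri:
            errors.append((number, "provide siteID and siteIDSource, or existingBDRSiteIRI"))
        if site_id and source and pair_counts[(site_id, source)] > 1:
            errors.append((number, "duplicate siteID and siteIDSource combination"))
        if iri and iri_counts[iri] > 1:
            errors.append((number, "duplicate existingBDRSiteIRI"))

        rel_id = _val(row, "relatedSiteID")
        rel_source = _val(row, "relatedSiteIDSource")
        rel_iri = _val(row, "relatedSiteIRI")
        if bool(rel_id) != bool(rel_source):
            errors.append((number, "relatedSiteID and relatedSiteIDSource must be provided together"))
        elif rel_id and (rel_id, rel_source) not in pair_counts:
            errors.append((number, "related site is not a siteID and siteIDSource in this template"))
        if (rel_iri or (rel_id and rel_source)) and not _val(row, "relationshipToRelatedSite"):
            errors.append((number, "relationshipToRelatedSite must be provided"))
    return errors
```

An empty list means that no rule in this sketch was broken; it does not guarantee that the file will pass the full validation on upload.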
APPENDICES
APPENDIX-I: Vocabulary List
With the exception of geodeticDatum and relationshipToRelatedSite, data validation
does not require field values to come from the vocabularies listed for them.
These vocabularies are provided only to assist in developing consistent language
within the database. New terms that go beyond the current list may be added to
describe your data more appropriately.
Table 2: Suggested values for controlled vocabulary fields in the template. Each term has a preferred label with a definition to aid understanding
of its meaning. For some terms, alternative
labels with similar semantics are provided.
Note: The values for geodeticDatum and relationshipToRelatedSite must come from one of the Preferred labels or Alternate Labels in this
table.
APPENDIX-II: Well Known Text (WKT)
General information on how WKT coordinate reference data is formatted is available here. The length of a WKT string or of its components is not prescribed; however, MS Excel has a 32,767 (32K) character limit on individual cells in a spreadsheet.
It is possible to edit CSV files outside of Excel in order to include more than 32K characters.
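As a small illustration, a footprintWKT value can be built from coordinate pairs and compared with the Excel cell limit. This is a sketch only: the helper names are hypothetical, and the longitude-then-latitude order is taken from the LINESTRING example in Table 1.

```python
EXCEL_CELL_CHAR_LIMIT = 32_767


def linestring_wkt(points):
    """Build a LINESTRING from (longitude, latitude) pairs."""
    if len(points) < 2:
        raise ValueError("a LINESTRING needs at least two points")
    coords = ", ".join(f"{lon} {lat}" for lon, lat in points)
    return f"LINESTRING ({coords})"


def exceeds_excel_cell_limit(wkt):
    """Return True when the string would not fit in one Excel cell."""
    return len(wkt) > EXCEL_CELL_CHAR_LIMIT


footprint = linestring_wkt([(146.363, -34.036), (146.363, -34.037)])
```

A footprint for which `exceeds_excel_cell_limit` returns True can still be stored in the CSV file, provided the file is edited with software other than Excel.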
APPENDIX-III: Timestamp
The following date and date-time formats are acceptable within the timestamp:
| TYPE | FORMAT |
|---|---|
| xsd:dateTimeStamp with timezone | yyyy-mm-ddThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00) OR yyyy-mm-ddThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00) OR yyyy-mm-ddThh:mmTZD (eg 1997-07-16T19:20+01:00) |
| xsd:dateTime | yyyy-mm-ddThh:mm:ss.s (eg 1997-07-16T19:20:30.45) OR yyyy-mm-ddThh:mm:ss (eg 1997-07-16T19:20:30) OR yyyy-mm-ddThh:mm (eg 1997-07-16T19:20) |
| xsd:date | dd/mm/yyyy OR d/m/yyyy OR yyyy-mm-dd OR yyyy-m-d |
| xsd:gYearMonth | mm/yyyy OR m/yyyy OR yyyy-mm |
| xsd:gYear | yyyy |
Where:
yyyy: four-digit year
mm: two-digit month (01=January, etc.)
dd: two-digit day of month (01 through 31)
hh: two digits of hour (00 through 23) (am/pm NOT allowed)
mm: two digits of minute (00 through 59)
ss: two digits of second (00 through 59)
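The formats above can be tried in turn with Python's `datetime.strptime`, as in the sketch below. It is an approximation under assumptions: the function `classify_timestamp` and the format list are hypothetical, the labels follow the XSD type names, and the check may accept or reject edge cases differently from the BDR validator.

```python
from datetime import datetime

# Each entry pairs a type label with a strptime format, most specific first.
ACCEPTED_FORMATS = [
    ("xsd:dateTimeStamp", "%Y-%m-%dT%H:%M:%S.%f%z"),
    ("xsd:dateTimeStamp", "%Y-%m-%dT%H:%M:%S%z"),
    ("xsd:dateTimeStamp", "%Y-%m-%dT%H:%M%z"),
    ("xsd:dateTime", "%Y-%m-%dT%H:%M:%S.%f"),
    ("xsd:dateTime", "%Y-%m-%dT%H:%M:%S"),
    ("xsd:dateTime", "%Y-%m-%dT%H:%M"),
    ("xsd:date", "%d/%m/%Y"),  # also accepts d/m/yyyy
    ("xsd:date", "%Y-%m-%d"),  # also accepts yyyy-m-d
    ("xsd:gYearMonth", "%m/%Y"),
    ("xsd:gYearMonth", "%Y-%m"),
    ("xsd:gYear", "%Y"),
]


def classify_timestamp(value):
    """Return (type label, parsed datetime) for the first matching format."""
    for label, fmt in ACCEPTED_FORMATS:
        try:
            return label, datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {value!r}")


label, parsed = classify_timestamp("1997-07-16T19:20:30.45+01:00")
```

For partial values such as a year on its own, `strptime` fills the missing parts with defaults (1 January, midnight), so the label is the more reliable part of the result.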
APPENDIX-IV: UTF-8
UTF-8 encoding is considered a best practice for handling character encoding, especially in
the context of web development, data exchange, and modern software systems. UTF-8
(Unicode Transformation Format, 8-bit) is a variable-width character encoding capable of
encoding all possible characters (code points) in Unicode.
Here are some reasons why UTF-8 is recommended:
- Universal Character Support: UTF-8 can represent almost all characters from all writing systems in use today. This includes characters from various languages, mathematical symbols, and other special characters.
- Backward Compatibility: UTF-8 is backward compatible with ASCII (American Standard Code for Information Interchange). The first 128 characters in UTF-8 are identical to ASCII, making it easy to work with systems that use ASCII.
- Efficiency: UTF-8 is space-efficient for Latin-script characters (common in English and many other languages). It uses one byte for ASCII characters and up to four bytes for other characters. This variable-length encoding minimises storage and bandwidth requirements.
- Web Standards: UTF-8 is the dominant character encoding for web content. It is widely supported by browsers, servers, and web-related technologies.
- Globalisation: As software applications become more globalised, supporting a wide range of languages and scripts becomes crucial. UTF-8 is well-suited for internationalisation and multilingual support.
- Compatibility with Modern Systems: UTF-8 is the default encoding for many programming languages, databases, and operating systems. Choosing UTF-8 helps ensure compatibility across different platforms and technologies.
When working with text data, UTF-8 encoding is recommended to avoid issues related to character representation and ensure that a diverse set of characters and languages is supported.
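The variable-width behaviour and the ASCII compatibility described above can be seen directly in Python. The sample characters below are chosen for illustration only.

```python
# One sample character for each UTF-8 encoded length.
samples = {
    "LATIN CAPITAL LETTER A": "A",
    "LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "EURO SIGN": "\u20ac",
    "MUSICAL SYMBOL G CLEF": "\U0001D11E",
}

# Number of bytes each character occupies once encoded as UTF-8.
byte_lengths = {name: len(char.encode("utf-8")) for name, char in samples.items()}

# Text made only of ASCII characters has identical bytes in both encodings.
ascii_compatible = "Plot 1".encode("ascii") == "Plot 1".encode("utf-8")
```

Here the four characters occupy 1, 2, 3 and 4 bytes respectively, and `ascii_compatible` is True.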
For assistance, please contact: bdr-support@dcceew.gov.au