abis-mapping v9.0.1 survey_metadata v3.0.0
SYSTEMATIC SURVEY METADATA TEMPLATE INSTRUCTIONS
Intended Usage
This Systematic Survey Metadata template should be used to record metadata relating to a Systematic Survey dataset.
The Systematic Survey Metadata template must be used in combination with the Systematic Survey Occurrence template and, in some cases, the Systematic Survey Site template with or without the Systematic Survey Site Visit template.
Templates have been provided to facilitate integration of your data into the Biodiversity Data Repository database. Not all types of data have been catered for in the available templates at this stage; therefore, if you are unable to find a suitable template, please contact bdr-support@dcceew.gov.au to make us aware of your data needs.
Data Validation Requirements:
For data validation, you will need your data file to:
- be the correct file format,
- have fields that match the template downloaded (do not remove, or change the order of fields),
- have extant values for mandatory fields (see Table 1), and
- comply with all data value constraints; for example, the geographic coordinates must be consistent with a geodeticDatum that is one of the five available options.
Additional fields may be added after the templated fields (noting that the data type is not assumed and values will be encoded as strings).
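As a rough illustration of the requirements above, the following Python sketch checks that the templated fields appear first and in order, and that the fields marked Mandatory in Table 1 are filled in. The function name check_rows is hypothetical and this is not the validator used by the Biodiversity Data Repository; it only mirrors the rules as described here.

```python
import csv
import io

# Field names in template order, taken from Table 1.
TEMPLATE_FIELDS = [
    "surveyID", "surveyName", "surveyPurpose", "surveyType", "surveyStart",
    "surveyEnd", "targetTaxonomicScope", "targetHabitatScope",
    "spatialCoverageWKT", "geodeticDatum", "surveyOrgs",
    "surveyMethodCitation", "surveyMethodDescription", "surveyMethodURL",
    "keywords",
]
# Fields marked "Mandatory" (unconditionally) in Table 1.
MANDATORY_FIELDS = ["surveyID", "surveyName", "surveyStart"]


def check_rows(csv_text):
    """Return a list of human-readable problems found in csv_text (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    # Additional fields are allowed, but only after the templated ones.
    if header[:len(TEMPLATE_FIELDS)] != TEMPLATE_FIELDS:
        return ["templated fields are missing, renamed or reordered"]
    problems = []
    for row_number, row in enumerate(reader, start=2):
        record = dict(zip(header, row))
        for field in MANDATORY_FIELDS:
            if not record.get(field, "").strip():
                problems.append(f"row {row_number}: {field} is empty")
    return problems
```

An empty row is reported as missing every mandatory field, which is consistent with the instruction not to include empty rows.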
FILE FORMAT
- The systematic survey metadata template is a UTF-8 encoded csv (not a Microsoft Excel spreadsheet). Be sure to save this file with your data as a .csv (UTF-8) as follows, otherwise it will not pass the in-browser csv validation step upon upload.
[MS Excel: Save As > More options > Tools > Web options > Save this document as > Unicode (UTF-8)]
- Do not include empty rows.
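If the data are prepared outside MS Excel, a UTF-8 encoded csv without empty rows can be produced with a few lines of Python. This is a minimal sketch; the function name to_utf8_csv_bytes and the output file name in the usage comment are illustrative assumptions.

```python
import csv
import io


def to_utf8_csv_bytes(header, rows):
    """Serialise header and rows as UTF-8 CSV bytes, skipping blank rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Keep a row only if at least one cell holds non-whitespace content.
        if any(str(cell).strip() for cell in row):
            writer.writerow(row)
    return buffer.getvalue().encode("utf-8")


# Example usage (hypothetical file name):
# from pathlib import Path
# Path("survey_meta-v3.0.0.csv").write_bytes(to_utf8_csv_bytes(header, rows))
```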
FILE NAME
When making a manual submission to the Biodiversity Data Repository,
the file name must include the version number
of this biodiversity data template (v3.0.0).
The following format is an example of a valid file name:
data_description-v3.0.0-additional_description.csv
where:
- data_description: A short description of the data (e.g. survey_meta, test_data).
- v3.0.0: The version number of this template.
- additional_description: (Optional) Additional description of the data, if needed (e.g. test_data).
- .csv: Ensure the file name ends with .csv.
For example, survey_meta-v3.0.0-test_data.csv or test_data-v3.0.0.csv
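The naming convention can be checked before upload with a regular expression. The sketch below assumes that the descriptions contain only letters, digits and underscores; the instructions do not state which characters are permitted, so treat the pattern as illustrative. The name is_valid_file_name is hypothetical.

```python
import re

TEMPLATE_VERSION = "v3.0.0"

# Assumption: descriptions consist of letters, digits and underscores only.
FILE_NAME_PATTERN = re.compile(
    r"[A-Za-z0-9_]+-" + re.escape(TEMPLATE_VERSION) + r"(?:-[A-Za-z0-9_]+)?\.csv"
)


def is_valid_file_name(name):
    """True if name follows data_description-v3.0.0[-additional_description].csv."""
    return FILE_NAME_PATTERN.fullmatch(name) is not None
```

Both example names given above are accepted by this pattern, while a name without the version number, such as test_data.csv, is rejected.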
FILE SIZE
MS Excel imposes a limit of 1,048,576 rows on a spreadsheet, limiting a CSV file to the header row followed by 1,048,575 occurrences. Furthermore, MS Excel has a 32,767 character limit on individual cells in a spreadsheet. These limits may be overcome by using or editing CSV files with other software.
Larger datasets may be more readily ingested using the API interface. Please contact bdr-support@dcceew.gov.au to make us aware of your data needs.
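To find out whether a CSV file would run into the MS Excel limits mentioned above, the row count and cell lengths can be inspected directly. This is a minimal sketch; the function name excel_limit_warnings is hypothetical, and the limits are the figures quoted in this section.

```python
import csv
import io

EXCEL_MAX_ROWS = 1_048_576      # rows per spreadsheet, including the header
EXCEL_MAX_CELL_CHARS = 32_767   # characters per cell


def excel_limit_warnings(csv_text, max_rows=EXCEL_MAX_ROWS,
                         max_cell_chars=EXCEL_MAX_CELL_CHARS):
    """List the ways in which csv_text exceeds the given spreadsheet limits."""
    warnings = []
    row_count = 0
    for row_count, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        for cell in row:
            if len(cell) > max_cell_chars:
                warnings.append(
                    f"row {row_count}: cell exceeds {max_cell_chars} characters"
                )
    if row_count > max_rows:
        warnings.append(f"{row_count} rows exceeds the {max_rows}-row limit")
    return warnings
```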
TEMPLATE FIELDS
The template file contains the field names in the top row that form part of the core Survey data model. Table 1 will assist you in transferring your data to the template with the following information:
- Field name in the template (and an external link to the Darwin Core standard for that field where available);
- Description of the field;
- Required i.e. whether the field is mandatory, conditionally mandatory, or optional;
- Datatype format required for the data values, for example text (string), number (integer, float), or date;
- Example/s of an entry for that field; and
- Vocabulary links within this document (for example pick list values) where relevant. Fields in Table 1 that have suggested values are listed in Table 2 in alphabetical order of field name.
ADDITIONAL FIELDS
Data that do not match the existing template fields may be added as additional columns in
the CSV files after the templated fields.
For example, sampleSizeUnit, sampleSizeValue.
Table 1: Systematic Survey Metadata template fields with descriptions, conditions, datatype format, and examples.
| Field # | Name | Description | Mandatory / Optional | Datatype Format | Examples |
|---|---|---|---|---|---|
| 1 | surveyID | A unique identifier for the survey. Important if there is more than one survey in the project or the dataset. | Mandatory | String | COL1 |
| 2 | surveyName | Brief title for the survey. | Mandatory | String | Disentangling the effects of farmland use, habitat edges, and vegetation structure on ground beetle morphological traits - Summer |
| 3 | surveyPurpose | A description of the survey objective | Optional | String | Summer sampling for peak insect diversity. |
| 4 | surveyType | Description of type of survey conducted | Optional | String | Wet pitfall trapping (Vocabulary link) |
| 5 | surveyStart | The date data collection commenced. | Mandatory | Timestamp | 21/09/2020 |
| 6 | surveyEnd | The date data collection was completed. | Optional | Timestamp | 23/09/2020 |
| 7 | targetTaxonomicScope | The range of biological taxa covered by the survey. Multiple terms are allowed, separated by a vertical bar aka pipe (\|). | Optional | List | Coleoptera \| Formicidae (Vocabulary link) |
| 8 | targetHabitatScope | The habitats targeted for sampling during the survey. Multiple terms are allowed, separated by a vertical bar aka pipe (\|). | Optional | List | Woodland (Vocabulary link) |
| 9 | spatialCoverageWKT | Well Known Text (WKT) expression of the geographic coordinates that describe the survey's spatial extent. Ensure the coordinates are arranged in 'longitude latitude' order and do not include the CRS in the WKT expression (it comes from the geodeticDatum field). | Mandatory if geodeticDatum is provided. | WKT | POLYGON ((146.363 -33.826, 148.499 -33.826, 148.499 -34.411, 146.363 -33.826)) (WKT notes) |
| 10 | geodeticDatum | The geodetic datum upon which the geographic coordinates in the Spatial coverage (WKT) are based. | Mandatory if spatialCoverageWKT is provided. | String | GDA2020 (Vocabulary link) |
| 11 | surveyOrgs | Name of organisations or individuals for whom the survey is being conducted. Multiple terms are allowed, separated by a vertical bar aka pipe (\|). | Optional | List | NSW Department of Planning, Industry and Environment \| CSIRO |
| 12 | surveyMethodCitation | A citation or reference to the survey methods used. | Optional | List | Ng, K., Barton, P.S., Blanchard, W. et al. Disentangling the effects of farmland use, habitat edges, and vegetation structure on ground beetle morphological traits. Oecologia 188, 645–657 (2018). https://doi.org/10.1007/s00442-018-4180-9 |
| 13 | surveyMethodDescription | Free text description of the survey method used. | Optional | String | Our experimental design consisted of four 400 m transects running from inside each woodland patch out into four adjoining farmland uses (crop, rested, woody debris application, revegetation plantings). To quantify potential edge effects on beetle species traits, we sampled beetles at five locations along each transect: 200 and 20 m inside woodlands, 200 and 20 m inside farmlands, and at the woodland–farmland edge (0 m). Each sampling location comprised a pair of wet invertebrate pitfall traps separated by a drift fence (60 cm long x 10 cm high) to help direct arthropods into traps. We opened a total of 220 pairs of traps for 14 days during spring (Oct–Nov 2014), and repeated sampling during summer (January–February 2015). Beetle samples from each pitfall trap pair, and across the two time periods, were pooled to provide one sample per sampling location. |
| 14 | surveyMethodURL | A DOI or link to the reference about the survey method, if available. | Optional | List | https://biocollect.ala.org.au/document/download/2022-01/202201%20CBR%20Flora%20and%20Vegetation%20report_draftv1.pdf \| https://doi.org/10.1002/9781118945568.ch11 |
| 15 | keywords | Terms, phrases or descriptors that highlight the key attributes of the study. Multiple terms are allowed, separated by a vertical bar aka pipe (\|). | Optional | List | ground beetle \| habitat \| morphology \| traits \| farmland \| woodland \| remnant vegetation \| split-plot study |
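Two conventions in Table 1 can be sketched in Python: List fields hold terms separated by a pipe character, and spatialCoverageWKT and geodeticDatum are each mandatory when the other is provided. The helper names split_list and spatial_pair_is_consistent are hypothetical, and trimming whitespace around each term is an assumption rather than documented behaviour.

```python
def split_list(value):
    """Split a pipe-separated List field into its terms, dropping blanks."""
    return [term.strip() for term in value.split("|") if term.strip()]


def spatial_pair_is_consistent(record):
    """True when spatialCoverageWKT and geodeticDatum are both given or both empty."""
    has_wkt = bool(record.get("spatialCoverageWKT", "").strip())
    has_datum = bool(record.get("geodeticDatum", "").strip())
    return has_wkt == has_datum


print(split_list("Coleoptera | Formicidae"))  # ['Coleoptera', 'Formicidae']
```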
CHANGELOG
Changes from Systematic Survey Metadata Template v2.0.0
- This template now accepts multiple rows of data, to represent multiple Surveys in a Dataset.
CHANGED FIELDS
- Because multiple rows are now allowed, surveyID is now a mandatory field, and each row must have a unique value within the template, in order to identify each row.
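The uniqueness requirement can be checked before submission. The sketch below assumes each row has been read into a dictionary keyed by field name (for example with csv.DictReader); the function name duplicate_survey_ids is hypothetical.

```python
from collections import Counter


def duplicate_survey_ids(rows):
    """Return the surveyID values that appear on more than one row, sorted."""
    counts = Counter(row.get("surveyID", "").strip() for row in rows)
    # Empty identifiers are ignored here; they fail the mandatory check instead.
    return sorted(sid for sid, n in counts.items() if sid and n > 1)


rows = [{"surveyID": "COL1"}, {"surveyID": "COL2"}, {"surveyID": "COL1"}]
print(duplicate_survey_ids(rows))  # ['COL1']
```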
APPENDICES
APPENDIX-I: Vocabulary List
With the exception of geodeticDatum, data validation
does not require adherence to the vocabularies for the various vocabularied fields. These vocabularies are provided as a
means of assistance in developing consistent language within the database. New terms can be added
where the current list does not adequately describe your data.
Table 2 provides some suggested values from existing sources such as: Biodiversity Information Standards (TDWG), EPSG.io Coordinate systems worldwide, the Global Biodiversity Information Facility, and Open Nomenclature in the biodiversity era.
Table 2: Suggested values for the controlled vocabulary fields in the template. Each term has
a preferred label with a definition to aid understanding of its meaning. For some terms, alternative
labels with similar semantics are provided.
Note: the value for geodeticDatum
must come from one of five options in this table.
APPENDIX-II: Well Known Text (WKT)
General information on how WKT coordinate reference data is formatted is available here. The length of a WKT string or of its components is not prescribed; however, MS Excel has a 32,767 (32K) character limit on individual cells in a spreadsheet.
It is possible to edit CSV files outside of Excel in order to include more than 32K characters.
Note: Ensure the coordinates are arranged in longitude latitude order and do not include the CRS in the WKT expression (it comes from the geodeticDatum field).
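A rough sanity check for both points in the note can be written with regular expressions. The sketch below is not a WKT parser: it assumes two-dimensional coordinates, treats a leading "SRID=" or "<...>" as an embedded CRS (two common prefix styles, assumed here), and can only detect swapped coordinates when a value falls outside the valid latitude range of -90 to 90, which is the case for Australian longitudes. The function name wkt_problems is hypothetical.

```python
import re

NUMBER = r"[-+]?\d+(?:\.\d+)?"
# Two numbers separated by whitespace, read as "longitude latitude".
COORD_PAIR = re.compile(r"(" + NUMBER + r")\s+(" + NUMBER + r")")


def wkt_problems(wkt):
    """Return a list of likely problems in a WKT string (heuristic only)."""
    problems = []
    text = wkt.strip()
    if text.upper().startswith("SRID=") or text.startswith("<"):
        problems.append("CRS prefix found; supply the datum via geodeticDatum instead")
    for lon_text, lat_text in COORD_PAIR.findall(text):
        lon, lat = float(lon_text), float(lat_text)
        if not -180 <= lon <= 180 or not -90 <= lat <= 90:
            problems.append(
                f"'{lon_text} {lat_text}' is not a valid 'longitude latitude' pair"
            )
    return problems
```

Applied to the example polygon in Table 1, the function reports no problems; a point written as POINT (-33.826 146.363) is flagged because 146.363 is not a valid latitude.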
APPENDIX-III: Timestamp
The following date and date-time formats are acceptable within the timestamp:
| TYPE | FORMAT |
|---|---|
| xsd:dateTimeStamp with timezone | yyyy-mm-ddThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00) OR yyyy-mm-ddThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00) OR yyyy-mm-ddThh:mmTZD (eg 1997-07-16T19:20+01:00) |
| xsd:dateTime | yyyy-mm-ddThh:mm:ss.s (eg 1997-07-16T19:20:30.45) OR yyyy-mm-ddThh:mm:ss (eg 1997-07-16T19:20:30) OR yyyy-mm-ddThh:mm (eg 1997-07-16T19:20) |
| xsd:date | dd/mm/yyyy OR d/m/yyyy OR yyyy-mm-dd OR yyyy-m-d |
| xsd:gYearMonth | mm/yyyy OR m/yyyy OR yyyy-mm |
| xsd:gYear | yyyy |
Where:
yyyy: four-digit year
mm: two-digit month (01=January, etc.)
dd: two-digit day of month (01 through 31)
hh: two digits of hour (00 through 23) (am/pm NOT allowed)
mm: two digits of minute (00 through 59)
ss: two digits of second (00 through 59)
s: one or more digits representing a decimal fraction of a second
TZD: time zone designator (Z or +hh:mm or -hh:mm)
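The formats in the table above can be expressed as regular expressions to see which type a value would fall under. This sketch checks only the shape of the value, not whether the date exists in the calendar (31/02/2020 would pass), and the function name classify_timestamp is hypothetical.

```python
import re

_DATE = r"\d{4}-\d{2}-\d{2}"
_TIME = r"T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?"   # hh:mm, hh:mm:ss or hh:mm:ss.s
_TZD = r"(?:Z|[+-]\d{2}:\d{2})"                # Z or +hh:mm or -hh:mm

# Ordered from most to least specific; the first full match wins.
TIMESTAMP_PATTERNS = [
    ("xsd:dateTimeStamp", re.compile(_DATE + _TIME + _TZD)),
    ("xsd:dateTime", re.compile(_DATE + _TIME)),
    ("xsd:date", re.compile(r"\d{1,2}/\d{1,2}/\d{4}|\d{4}-\d{1,2}-\d{1,2}")),
    ("xsd:gYearMonth", re.compile(r"\d{1,2}/\d{4}|\d{4}-\d{2}")),
    ("xsd:gYear", re.compile(r"\d{4}")),
]


def classify_timestamp(value):
    """Return the matching type name from the table, or None if no format fits."""
    text = value.strip()
    for name, pattern in TIMESTAMP_PATTERNS:
        if pattern.fullmatch(text):
            return name
    return None
```

For instance, the surveyStart example 21/09/2020 from Table 1 is classified as a date, while a value such as "21 Sep 2020" matches none of the listed formats.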
APPENDIX-IV: UTF-8
UTF-8 encoding is considered a best practice for handling character encoding, especially in
the context of web development, data exchange, and modern software systems. UTF-8
(Unicode Transformation Format, 8-bit) is a variable-width character encoding capable of
encoding all possible characters (code points) in Unicode.
Here are some reasons why UTF-8 is recommended:
- Universal Character Support: UTF-8 can represent almost all characters from all writing systems in use today. This includes characters from various languages, mathematical symbols, and other special characters.
- Backward Compatibility: UTF-8 is backward compatible with ASCII (American Standard Code for Information Interchange). The first 128 characters in UTF-8 are identical to ASCII, making it easy to work with systems that use ASCII.
- Efficiency: UTF-8 is space-efficient for Latin-script characters (common in English and many other languages). It uses one byte for ASCII characters and up to four bytes for other characters. This variable-length encoding minimises storage and bandwidth requirements.
- Web Standards: UTF-8 is the dominant character encoding for web content. It is widely supported by browsers, servers, and web-related technologies.
- Globalisation: As software applications become more globalised, supporting a wide range of languages and scripts becomes crucial. UTF-8 is well-suited for internationalisation and multilingual support.
- Compatibility with Modern Systems: UTF-8 is the default encoding for many programming languages, databases, and operating systems. Choosing UTF-8 helps ensure compatibility across different platforms and technologies.
When working with text data, UTF-8 encoding is recommended to avoid issues related to character representation and ensure that a diverse set of characters and languages is supported.
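The variable-width behaviour and the ASCII compatibility described above can be observed directly in Python. This is a small illustration only; the sample characters are arbitrary choices and the function name utf8_length is hypothetical.

```python
def utf8_length(character):
    """Number of bytes UTF-8 uses to encode a single character."""
    return len(character.encode("utf-8"))


# A, e with acute accent, euro sign, lady beetle emoji
samples = ["A", "\u00e9", "\u20ac", "\U0001F41E"]
for character in samples:
    print(f"U+{ord(character):04X} -> {utf8_length(character)} byte(s)")

# ASCII text produces identical bytes under both encodings.
assert "surveyID".encode("utf-8") == "surveyID".encode("ascii")
```

The four samples occupy one, two, three and four bytes respectively.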
For assistance, please contact: bdr-support@dcceew.gov.au