abis-mapping v9.0.1 incidental_occurrence_delete v1.0.0

INCIDENTAL OCCURRENCE DELETE TEMPLATE INSTRUCTIONS

Intended Usage

This Incidental Occurrence Delete template should be used to delete occurrence data.

Data Validation Requirements:

For data validation, you will need your data file to:

be in the correct file format,
have fields that match the downloaded template (do not remove fields or change their order), and
have values present in all mandatory fields (see Table 1).

FILE FORMAT

The incidental occurrence delete template is a UTF-8 encoded CSV file (not a Microsoft Excel spreadsheet). Save the file containing your data as .csv (UTF-8) as follows; otherwise it will not pass the CSV validation step on upload.
[MS Excel: Save As > More options > Tools > Web options > Save this document as > Unicode (UTF-8)]
Do not include empty rows.
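
If the file is produced by a script rather than MS Excel, the same requirements (UTF-8 encoding, no empty rows) can be met when writing it. The following is a minimal sketch using Python's standard csv module. The function name write_delete_csv is hypothetical and is not part of any BDR tooling; the field names are those listed in Table 1.

```python
import csv

# Header fields as listed in Table 1 of this template.
FIELDS = ["providerRecordID", "providerRecordIDSource"]


def write_delete_csv(path, rows):
    """Write rows to a UTF-8 encoded CSV, skipping rows that are entirely empty.

    Hypothetical helper for illustration only.
    """
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(FIELDS)
        for row in rows:
            # Keep a row only if at least one cell has non-whitespace content.
            if any(str(cell).strip() for cell in row):
                writer.writerow(row)
```

Opening the file with an explicit encoding="utf-8" avoids depending on the platform default encoding, which may differ between operating systems.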

FILE NAME

When making a manual submission to the Biodiversity Data Repository, the file name must include the version number of this biodiversity data template (v1.0.0). The following format is an example of a valid file name:

data_description-v1.0.0-additional_description.csv

where:

data_description: A short description of the data (e.g. incidental_occ, test_data).
v1.0.0: The version number of this template.
additional_description: (Optional) Additional description of the data, if needed (e.g. test_data).
.csv: Ensure the file name ends with .csv.

For example, incidental_occ_delete-v1.0.0-test_data.csv or test_data-v1.0.0.csv
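
The naming format above can be checked before submission with a regular expression. This is a sketch only: the requirement stated here is that the file name includes the version number, and the stricter pattern below (description parts limited to letters, digits, and underscores) is an assumption made for illustration, not a rule enforced by the repository. The function name is hypothetical.

```python
import re

TEMPLATE_VERSION = "v1.0.0"

# Assumption: description parts contain only letters, digits, and underscores.
# The optional group covers the "-additional_description" part.
FILE_NAME_PATTERN = re.compile(
    rf"[A-Za-z0-9_]+-{re.escape(TEMPLATE_VERSION)}(-[A-Za-z0-9_]+)?\.csv"
)


def looks_like_valid_file_name(name: str) -> bool:
    """Return True if name follows data_description-v1.0.0[-additional_description].csv."""
    return FILE_NAME_PATTERN.fullmatch(name) is not None
```

Under these assumptions, both example names given above are accepted, while a name without the version number (such as test_data.csv) is rejected.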

FILE SIZE

MS Excel imposes a limit of 1,048,576 rows on a spreadsheet, limiting a CSV file to the header row followed by 1,048,575 occurrences. Furthermore, MS Excel has a 32,767 character limit on individual cells in a spreadsheet. These limits may be overcome by using or editing CSV files with other software.
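
To find out whether a CSV file would still open completely in MS Excel, the two limits above can be checked directly. The sketch below uses Python's standard csv module; the function name is hypothetical, and the constants are the limits quoted in the paragraph above.

```python
import csv

EXCEL_MAX_ROWS = 1_048_576        # rows per sheet, including the header row
EXCEL_MAX_CELL_CHARS = 32_767     # characters per cell


def exceeds_excel_limits(lines):
    """Return True if the CSV content is too large for MS Excel.

    lines: any iterable of CSV text lines, for example a file opened with
    open(path, encoding="utf-8", newline="").
    """
    row_count = 0
    for row in csv.reader(lines):
        row_count += 1
        if row_count > EXCEL_MAX_ROWS:
            return True
        if any(len(cell) > EXCEL_MAX_CELL_CHARS for cell in row):
            return True
    return False
```

A file that exceeds these limits is not necessarily invalid; it only means the file should be edited with software other than MS Excel.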

Larger datasets may be more readily ingested using the API. Please contact bdr-support@dcceew.gov.au to make us aware of your data needs.

TEMPLATE FIELDS

The template contains the field names in the top row. Table 1 will assist you in transferring your data to the template. For each field, it indicates:

Field name in the template;
Description of the field;
Required, i.e. whether the field is mandatory, conditionally mandatory, or optional;
Format (datatype) required for the data values, for example text (string), number (integer, float), or date;
Example of an entry or entries for that field.

Table 1: Incidental occurrence delete template fields with descriptions, conditions, datatype format, and examples.

Field #	Name	Description	Mandatory / Optional	Datatype Format	Examples
1	providerRecordID	Unique (within provider) identifier for the record.	Mandatory	String	8022FSJMJ079c5cf
2	providerRecordIDSource	Person or Organisation that generated the providerRecordID.	Mandatory	String	Western Australian Biodiversity Information Office
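
A file can be pre-checked locally against Table 1 before upload. The following sketch covers only the requirements stated in these instructions (matching header, no empty rows, values present in both mandatory fields). It is not the repository's own validation, which may apply further checks, and the function name check_delete_csv is hypothetical.

```python
import csv
import io

# Field names and order as given in Table 1.
EXPECTED_FIELDS = ["providerRecordID", "providerRecordIDSource"]


def check_delete_csv(text):
    """Return a list of problems found in CSV text; an empty list means none were found."""
    problems = []
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header != EXPECTED_FIELDS:
        problems.append(f"header {header} does not match {EXPECTED_FIELDS}")
        return problems
    # Data rows are numbered from 2 because the header is row 1.
    for row_no, row in enumerate(reader, start=2):
        if not row:
            problems.append(f"row {row_no}: empty row")
        elif len(row) != len(EXPECTED_FIELDS):
            problems.append(f"row {row_no}: expected {len(EXPECTED_FIELDS)} values, found {len(row)}")
        elif any(not cell.strip() for cell in row):
            problems.append(f"row {row_no}: missing value in a mandatory field")
    return problems
```

For example, a file containing the header and the single example record from Table 1 would produce an empty list, while a record with a blank providerRecordIDSource would be reported.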

APPENDICES

APPENDIX-I: UTF-8

UTF-8 encoding is considered a best practice for handling character encoding, especially in the context of web development, data exchange, and modern software systems. UTF-8 (Unicode Transformation Format, 8-bit) is a variable-width character encoding capable of encoding all possible characters (code points) in Unicode.
Here are some reasons why UTF-8 is recommended:

Universal Character Support: UTF-8 can represent every character defined in Unicode, which covers virtually all writing systems in use today. This includes characters from various languages, mathematical symbols, and other special characters.
Backward Compatibility: UTF-8 is backward compatible with ASCII (American Standard Code for Information Interchange). The first 128 characters in UTF-8 are identical to ASCII, making it easy to work with systems that use ASCII.
Efficiency: UTF-8 is space-efficient for Latin-script characters (common in English and many other languages). It uses one byte for ASCII characters and up to four bytes for other characters. This variable-length encoding minimises storage and bandwidth requirements.
Web Standards: UTF-8 is the dominant character encoding for web content. It is widely supported by browsers, servers, and web-related technologies.
Globalisation: As software applications become more globalised, supporting a wide range of languages and scripts becomes crucial. UTF-8 is well-suited for internationalisation and multilingual support.
Compatibility with Modern Systems: UTF-8 is the default encoding for many programming languages, databases, and operating systems. Choosing UTF-8 helps ensure compatibility across different platforms and technologies.
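
The variable-width behaviour and ASCII compatibility described above can be observed directly. The short Python sketch below encodes four sample characters (chosen purely for illustration) and reports how many bytes each one occupies in UTF-8.

```python
# Sample characters: Latin capital A, e with acute accent, euro sign, kangaroo emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F998"]

# Number of bytes each character occupies once encoded as UTF-8.
byte_lengths = [len(ch.encode("utf-8")) for ch in samples]

for ch, length in zip(samples, byte_lengths):
    print(f"U+{ord(ch):04X} uses {length} byte(s) in UTF-8")

# ASCII text produces identical bytes under both encodings.
ascii_compatible = "providerRecordID".encode("utf-8") == "providerRecordID".encode("ascii")
```

The ASCII character takes one byte and the others take two, three, and four bytes respectively, which is why text that is mostly ASCII stays compact in UTF-8.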

When working with text data, it's generally a good idea to use UTF-8 encoding to avoid issues related to character representation and ensure that your software can handle a diverse set of characters and languages.

For assistance, please contact: bdr-support@dcceew.gov.au