Element | Character Set |
---|---|
Reporting Format Statement | Use the standard US-ASCII character encoding set without extensions or use UTF-8. |
Reporting Format Description | Use the standard US-ASCII character encoding set without extensions (no characters beyond the 127 characters) or use UTF-8 (which includes the ASCII character set). Most English-language dataset submissions will require only characters included in the standard ASCII character set; however, UTF-8 is useful since it can support non-English characters. A character encoding scheme (e.g. ASCII, UTF-8) is used to map and translate bytes between computers and into human-readable characters. Using either of these character encodings will increase machine readability and interoperability. |
Required or Recommended | recommended |
Element | Delimiter |
---|---|
Reporting Format Statement | Use a comma as the delimiter while use of other commas must be protected. |
Reporting Format Description | Save tabular data in comma separated values (CSV) format. The delimiter between columns is the comma character ",". For commas not meant to be a delimiter (e.g. used within a cell), use a vertical bar "|" or a semicolon ";" instead of a comma or protect the comma with matching double quotation marks around the entire value. |
Required or Recommended | strongly recommended |
Element | Data Matrix |
---|---|
Reporting Format Statement | Data portion of the file organized as a matrix of rows and columns with no empty lines and equal number of columns. |
Reporting Format Description | The contents of the data portion of the file must be organized in a logical and readable matrix format. There can be no empty rows and there must be the same number of columns across all of its rows. Dataset creators may want to consider ending CSV files with a newline character \n which indicates to any CSV reader that it has read the end of the CSV file. |
Required or Recommended | strongly recommended |
Element | Column or Row Name Orientation |
---|---|
Reporting Format Statement | Provide a header Column or Row that describes the type of information found in that Column or Row. |
Reporting Format Description | The Data Matrix portion of each file should contain a header Column or Row following the description under the "Column or Row Names" section of this reporting format. The Column or Row names will identify the type of information found in that Column or Row. Tabular data can be organized: 1) Horizontally meaning that data are organized in rows and there's a new name describing the data at the start of every row 2) Vertically meaning that data are organized in columns and there is a new column name describing the data at the top of every column. You can describe this orientation of the data within your data matrix under Data_Orientation in the File-level Metadata templates. |
Required or Recommended | strongly recommended |
Element | File Name |
---|---|
Reporting Format Statement | Use unique, descriptive file names. |
Reporting Format Description | Provide unique file names that are as descriptive as possible about the file contents. Use only letters (e.g. CamelCase), numbers, hyphens, and underscores. Do not include spaces. Do not start with an underscore or hyphen. |
Required or Recommended | strongly recommended |
Element | Column or Row Names |
---|---|
Reporting Format Statement | Use unique, descriptive Column or Row Names. |
Reporting Format Description | Provide unique Column or Row names that convey basic information about the contents. Use only letters, numbers, hyphens, and underscores. Do not include spaces and we recommend not using CamelCase. Do not start with an underscore or hyphen and we recommend not starting with a number. Descriptions of the information found within each Column or Row should be reported and defined in the CSV Data Dictionary (CSV_dd.csv). |
Required or Recommended | strongly recommended |
Element | Units |
---|---|
Reporting Format Statement | Provide variable units of measurement. |
Reporting Format Description | Provide the units of measurement for the variable in one of the following ways: 1) immediately below the Column Name as a next row or immediately adjacent to the Row Name as next column; and/or 2) only in the CSV Data Dictionary (CSV_dd.csv) Insert "N/A" when units aren't applicable. Additional information on units should be reported and defined in the CSV Data Dictionary (CSV_dd.csv). Data should be represented with units of measurement approved by the International System of Units (SI), derived units (e.g., degree Celsius). Non-SI units are accepted for use and should be defined and referenced in the CSV Data Dictionary (CSV_dd.csv). |
Required or Recommended | required |
Element | Consistent Values |
---|---|
Reporting Format Statement | Be consistent within Column or Row. |
Reporting Format Description | All data within the Column or Row must use the same units of measurement. Do not mix text and numeric data within the same Column or Row. |
Required or Recommended | strongly recommended |
Element | Missing Value Codes |
---|---|
Reporting Format Statement | Use consistent missing value codes. |
Reporting Format Description | All cells in the data matrix must have a value. Cells with missing data are represented with missing value code. For Columns or Rows containing numeric data, use "-9999" as the missing value code (or modify to match significant figures given the data). For Columns or Rows containing character data, use "N/A" as the missing value code. Missing values must be represented by values that can never be construed as actual data and must be consistent across variables. Report all Missing Value Codes in the File-level Metadata whether following the reporting format guidance or using different Missing Value Codes. |
Required or Recommended | strongly recommended |
Element | Temporal Data |
---|---|
Reporting Format Statement | Provide temporal data in UTC format or Local Standard Time with offset. |
Reporting Format Description | A Column or Row containing temporal data can be date only following the ISO 8601 standard (YYYY-MM-DD) and completed to known precision (e.g. YYYY-MM, YYYY). Time is not required, but all times must be preceded by a date if reported in the same columns or row. If date and time are split between two columns, call one "date" and the other "time". Times must be reported in Coordinated Universal Time (UTC) (YYYY-MM-DD hh:mm:ss) (use of "Z" and "T" characters are unnecessary) or Local Standard Time reporting offset or time zone in the File-level metadata. Do not report time using Daylight Savings Time. Complete times to known precision (e.g. YYYY-MM-DD hh). For timestamped data reported as intervals, specify the interval in the column/row name or in CSV Data Dictionary (CSV_dd). Temporal data using different standards can be provided as a separate variable (i.e., in an adjacent field) but only in addition to UTC format or Local Standard Time. In cases where the entire file consists of temporal data collected at a single date and time, the date and time must be reported in the File-level Metadata. |
Required or Recommended | recommended |
Element | Temporal Data Range |
---|---|
Reporting Format Statement | Timestamped data presented as ranges. |
Reporting Format Description | Present range timestamped data as paired Columns or Rows for start and stop times ("dateTime_start" and "dateTime_end" or "time_start" and "time_end"). The Column/Row Name for timestamped data given as a range should specify if the measurement is the start, stop, or midpoint value, or be explained in the CSV Data Dictionary (CSV_dd). |
Required or Recommended | recommended |
Element | Spatial Data |
---|---|
Reporting Format Statement | Provide spatial data in WGS84 decimal format. |
Reporting Format Description | All geographic coordinates must be provided in WGS84 decimal format. Latitude and longitude must be provided as separate variables (i.e., in an adjacent field). For geolocated records, each row in the data matrix must contain coordinates. Spatial data using different standards can be provided as a separate variable (i.e., in an adjacent field) but only in addition to WGS84 decimal format. In cases where the data file does not include geographic coordinates for each row/column in the data matrix and the entire file consists of measurements collected at a single location, the geographic coordinates must be reported in the File-level Metadata either as a single point location or bounding box. |
Required or Recommended | recommended |