Best Practices for Metadata
Metadata should be recorded both at the project level and at the data level. At the project level, this information is often referred to as documentation.
Good data documentation includes information on:
- the context of data collection: project history, aims, objectives and hypotheses
- data collection methods: data collection protocol, sampling design, instruments, hardware and software used, data scale and resolution, temporal coverage and geographic coverage
- dataset structure: the organization of data files, cases, and relationships between files
- data sources used
- data validation, checking, proofing, cleaning and other quality assurance procedures carried out
- modifications made to data over time since their original creation and identification of different versions of datasets
- information on data confidentiality, access and use conditions, where applicable
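One way to keep project-level documentation consistent is to store it as a structured record alongside the data. The sketch below is only an illustration: the section names, the `missing_sections` helper and all example values are invented for this example and do not follow any particular metadata standard.

```python
import json

# Hypothetical project-level documentation record. Section names and values
# are made up for illustration and mirror the checklist above.
project_doc = {
    "context": {"aims": "Example: monitor stream temperature over time"},
    "collection_methods": {"instruments": ["example temperature logger"]},
    "dataset_structure": {"files": ["sites.csv", "readings.csv"]},
    "data_sources": ["field measurements"],
    "quality_assurance": ["range checks on temperature values"],
    "version_history": [{"version": "1.0", "change": "initial release"}],
    "access_conditions": "Example: open after embargo",
}

# Sections that every record is expected to fill in (an assumption of this sketch).
REQUIRED_SECTIONS = [
    "context",
    "collection_methods",
    "dataset_structure",
    "data_sources",
    "quality_assurance",
    "version_history",
    "access_conditions",
]


def missing_sections(doc):
    """Return the required sections that are absent or empty in doc."""
    return [section for section in REQUIRED_SECTIONS if not doc.get(section)]


if __name__ == "__main__":
    # Write the record as JSON so it can be stored next to the data files.
    print(json.dumps(project_doc, indent=2))
    print("Missing sections:", missing_sections(project_doc))
```

A check like this can be run before a dataset is deposited, so that gaps in the documentation are noticed early.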
At the data level, datasets should also be documented with:
- names, labels and descriptions for variables, records and their values
- explanation of codes and classification schemes used
- codes of, and reasons for, missing values
- derived data created after collection, with the code, algorithm or script used to create them
- weighting and grossing variables created
- data listing with descriptions for cases, individuals or items studied
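Data-level documentation of this kind is often kept as a data dictionary (codebook). The following is a minimal sketch, assuming a small temperature dataset. The variable names, the missing-value codes `-999` and `-998`, and the helper functions are all hypothetical and chosen only to illustrate the items listed above.

```python
# Hypothetical data dictionary: one entry per variable, recording its label,
# description, missing-value codes with reasons, and how derived variables
# were created. All names and codes here are illustrative.
data_dictionary = [
    {
        "name": "site_id",
        "label": "Sampling site",
        "description": "Identifier of the site where the reading was taken",
        "missing_codes": {},
        "derived_from": None,
    },
    {
        "name": "temp_c",
        "label": "Water temperature (C)",
        "description": "Water temperature in degrees Celsius",
        "missing_codes": {"-999": "sensor failure", "-998": "site not visited"},
        "derived_from": None,
    },
    {
        "name": "temp_f",
        "label": "Water temperature (F)",
        "description": "Water temperature in degrees Fahrenheit",
        "missing_codes": {},
        "derived_from": "temp_c * 9 / 5 + 32",
    },
]


def decode_missing(variable, raw_value):
    """Return (value, reason); value is None if raw_value is a documented missing code."""
    entry = next(v for v in data_dictionary if v["name"] == variable)
    reason = entry["missing_codes"].get(raw_value)
    if reason is not None:
        return None, reason
    return raw_value, None


def derive_temp_f(temp_c):
    """Derived variable, matching the formula recorded in the dictionary."""
    return temp_c * 9 / 5 + 32


if __name__ == "__main__":
    print(decode_missing("temp_c", "-999"))
    print(decode_missing("temp_c", "12.5"))
    print(derive_temp_f(100.0))
```

Recording the derivation formula next to the variable, and keeping it in step with the code that computes it, lets later users reproduce derived values from the original data.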
More information on best practices, especially for the ecological and environmental sciences, can be found on the DataONE website.