"There is no general and definitive list of topics that should be covered in a DMP for a research project", and researchers are often left to their own devices as to how to fill out a DMP. The data covered by a DMP might include (but are not limited to) data that are:
• Experimental
• Observational
• Raw or derived
• Physical collections
• Models
• Simulations
• Curriculum materials
• Software
• Images

Questions to address about the data include:
• How will the data be acquired? When and where will they be acquired?
• After collection, how will the data be processed? Include information about:
  • Software used
  • Algorithms
  • Scientific workflows
• Describe the file formats that will be used, justify those formats, and describe the naming conventions used.
• Identify the quality assurance and quality control measures that will be taken during sample collection, analysis, and processing.
• If existing data are used, what are their origins? How will the data collected be combined with existing data? What is the relationship between the data collected and existing data?
• How will the data be managed in the short term? Consider the following:
  • Version control for files
  • Backing up data and data products
  • Security and protection of data and data products
  • Who will be responsible for management
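The naming-convention and backup items above can be made concrete with a small sketch. The convention shown here (`<project>_<site>_<YYYYMMDD>_v<NN>.<ext>`) and all function names are hypothetical, chosen only for illustration; a real project would substitute whatever convention its DMP documents. The checksum comparison is one common way to confirm that a backup copy is identical to the original, not the only one.

```python
import hashlib
import re
from datetime import date

# Hypothetical convention: <project>_<site>_<YYYYMMDD>_v<NN>.<ext>
NAME_PATTERN = re.compile(r"^[a-z0-9]+_[a-z0-9]+_\d{8}_v\d{2}\.[a-z0-9]+$")


def build_name(project: str, site: str, collected: date, version: int, ext: str) -> str:
    """Assemble a file name that follows the hypothetical convention above."""
    return (
        f"{project.lower()}_{site.lower()}_"
        f"{collected:%Y%m%d}_v{version:02d}.{ext.lower()}"
    )


def follows_convention(name: str) -> bool:
    """Return True if the file name matches the documented pattern."""
    return NAME_PATTERN.match(name) is not None


def sha256_of(data: bytes) -> str:
    """Compute a SHA-256 checksum of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def backup_matches(original: bytes, backup: bytes) -> bool:
    """Compare checksums to confirm a backup is byte-for-byte identical."""
    return sha256_of(original) == sha256_of(backup)
```

For example, `build_name("Soil", "PlotA", date(2024, 3, 5), 2, "CSV")` yields `soil_plota_20240305_v02.csv`, and a name such as `final data (2).csv` is rejected by `follows_convention`.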
=== Metadata content and format ===
Metadata are the contextual details, including any information important for using data. This may include descriptions of temporal and spatial details, instruments, parameters, units, files, etc. Metadata are commonly referred to as "data about data". Issues to be considered include:
• How detailed do the metadata need to be in order to make the data meaningful?
• How will the metadata be created and/or captured? Examples include lab notebooks, hand-held GPS units, and auto-saved files on instruments.
• What format will be used for the metadata? What metadata standards are commonly used in the respective scientific discipline? The chosen format should be justified.
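As a rough illustration of the kinds of details listed above (temporal and spatial coverage, instruments, parameters, units), the following sketch captures a metadata record and serializes it to JSON. The field names are assumptions made for this example and do not follow any particular metadata standard; a discipline-specific standard would define its own required elements.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetMetadata:
    """Illustrative record; field names are hypothetical, not a standard."""
    title: str
    creator: str
    collected_start: str  # ISO 8601 date, e.g. "2024-03-05"
    collected_end: str    # ISO 8601 date
    location: str         # free-text or coordinate description
    instrument: str
    parameters: dict = field(default_factory=dict)  # parameter name -> unit


def missing_fields(record: DatasetMetadata) -> list:
    """List the fields that were left empty, in declaration order."""
    return [name for name, value in asdict(record).items() if not value]


def to_json(record: DatasetMetadata) -> str:
    """Serialize the record so it can be stored alongside the data files."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Checking `missing_fields` before depositing data is one simple way to catch incomplete documentation early; it does not replace validation against the schema of the chosen archive.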
=== Policies for access, sharing, and re-use ===
• Describe any obligations that exist for sharing data collected. These may include obligations from funding agencies, institutions, other professional organizations, and legal requirements.
• Include information about how data will be shared, including when the data will be accessible, how long the data will be available, how access can be gained, and any rights that the data collector reserves for using data.
• Address any ethical or privacy issues with data sharing.
• Address intellectual property and copyright issues. Who owns the copyright? What are the institutional, publisher, and/or funding agency policies associated with intellectual property? Are there embargoes for political, commercial, or patent reasons?
• Describe the intended future uses/users for the data.
• Indicate how the data should be cited by others. How will the issue of persistent citation be addressed? For example, if the data will be deposited in a public archive, will the dataset have a persistent identifier (e.g., ARK, DOI, Handle, PURL, URN) assigned to it?
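To illustrate how a persistent identifier supports citation, the sketch below builds a citation string around a DOI. The citation layout, the function names, and the sample DOI `10.1234/example` are assumptions for illustration only; an archive or publisher will usually prescribe its own citation format. The pattern check is deliberately loose and only confirms the general `10.<registrant>/<suffix>` shape of a DOI.

```python
import re

# doi.org is the public DOI resolver; appending a DOI yields a stable link.
DOI_RESOLVER = "https://doi.org/"

# Loose shape check only: "10.", a numeric registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(value: str) -> bool:
    """Return True if the string has the general shape of a DOI."""
    return DOI_PATTERN.match(value) is not None


def format_citation(creator: str, year: int, title: str, archive: str, doi: str) -> str:
    """Build a citation in a hypothetical 'Creator (Year). Title. Archive. URL' layout."""
    if not looks_like_doi(doi):
        raise ValueError(f"not a DOI-shaped identifier: {doi!r}")
    return f"{creator} ({year}). {title}. {archive}. {DOI_RESOLVER}{doi}"
```

Because the identifier rather than a repository URL appears in the citation, the reference can remain resolvable even if the archive reorganizes its website.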
=== Long-term storage and data management ===
• Researchers should identify an appropriate archive for the long-term preservation of their data. By identifying the archive early in the project, the data can be formatted, transformed, and documented appropriately to meet the requirements of the archive. Researchers should consult colleagues and professional societies in their discipline to determine the most appropriate archive, and include a backup archive in their data management plan in case their first choice ceases to exist.
• Early in the project, the primary researcher should identify what data will be preserved in an archive. Usually, preserving the data in their rawest form is desirable, although data derivatives and products can also be preserved.
• An individual should be identified as the primary contact person for archived data. That person should ensure that contact information is kept up to date in case there are requests for data or for information about the data.
=== Budget ===
Data management and preservation costs may be considerable, depending on the nature of the project. By anticipating costs ahead of time, researchers can help ensure that the data will be properly managed and archived. Potential expenses that should be considered are:
• Human resources and staff time for data preparation, management, documentation, and preservation
• Hardware and/or software needed for data management, backing up, security, documentation, and preservation
• Costs associated with submitting the data to an archive
The data management plan should state how these costs will be paid.

== NSF Data Management Plan ==