Section: 2.1 Understanding the data lifecycle | MOOC Basics of managing and sharing research data

Section outline

  1. Definition
  
  There are many definitions of research data. The OECD (Organisation for Economic Co-operation and Development) definition is the most commonly used:
  
  "Research data" are defined as factual records (numerical scores, textual records, images and sound) used as primary sources for scientific research, and that are commonly accepted in the scientific community as necessary to validate research findings.
  Source: OECD, OECD Principles and Guidelines for Access to Research Data from Public Funding, Paris, 2007.
  
  As an example, the University of Leeds describes research data as:
  
  "Any information that has been collected, observed, generated or created to validate original research findings. Although usually digital, research data also includes non-digital formats such as laboratory notebooks and diaries."
  Source: University of Leeds Library.
  
  In the context of open science, these definitions can be broadened: data produced by researchers serve not only to validate their own findings but can also allow other researchers to conduct new research projects.
  
  2. Diversity of research data
  Depending on the project, the research data may be:
  
  produced or collected: these are the data created, developed or generated during research activities (observations, measurements, etc.)
  
  pre-existing: these are data that already exist (corpora, archives, etc.) and are used for the project. They may initially have been collected in a context other than research, but they are used as research data within the framework of the project.
  
  These data can be qualitative (interview data, observation data, open-ended questionnaires, etc.) or quantitative (measurement tables, scored evaluation questionnaires, thermometer readings, etc.). Depending on the context in which they were created (capture or production) and on how they are exploited, analysed and processed, research data may be of different kinds, stored on various media and of any type.
  
  There are several descriptive classifications. One of them describes data by:
  
  the source of the data
  
  the form of the data
  
  What is the source of the data?
  
  Observation data
  
  Observation data are captured in real time. They are captured by observing a behaviour or activity and are therefore most often unique and impossible to reproduce. This is the case with sensor data, neuroimaging, astronomical photography or survey data.
  
  Experimental data
  
  Experimental data are obtained from laboratory equipment. They are often reproducible, but reproducing them can be costly. Data produced by chromatographs and DNA chips fall into this category.
  
  Computational or simulation data
  
  Computational or simulation data are generated by computational or simulation models. They come with more extensive metadata. They are often reproducible, provided that the model is properly documented. For simulation data, the model used is often as important as the data generated by the simulation, and sometimes even more so. Examples include meteorological models, seismic simulation models and economic models.
  
  Derived or compiled data
  
  Derived or compiled data result from the processing or combination of raw data. They are often reproducible, but reproducing them is expensive. This is the case for data obtained by text mining, 3D models or compiled databases.
  
  Reference data
  
  Reference data are collections or accumulations of small datasets that have been peer reviewed, annotated and made available.
  
  What form does this data take?
  Textual data: field or laboratory notes, survey responses...
  Numerical data: tables, measurements...
  Audiovisual data: images, sounds, videos...
  Computer code
  Discipline-specific data: for example, FITS in astronomy or CIF in crystallography...
  Specific data produced by some instruments
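
As a rough illustration, the two classifications above (source and form) could be recorded as structured fields when describing a dataset. The sketch below is only one possible encoding: the class names, field names and the example dataset are hypothetical and do not come from the course or from any metadata standard.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Where the data come from (categories taken from the section above)."""
    OBSERVATION = "observation"
    EXPERIMENTAL = "experimental"
    SIMULATION = "computational or simulation"
    DERIVED = "derived or compiled"
    REFERENCE = "reference"


class Form(Enum):
    """What form the data take (categories taken from the list above)."""
    TEXTUAL = "textual"
    NUMERICAL = "numerical"
    AUDIOVISUAL = "audiovisual"
    CODE = "computer code"
    DISCIPLINE_SPECIFIC = "discipline-specific"
    INSTRUMENT_SPECIFIC = "instrument-specific"


@dataclass(frozen=True)
class DatasetDescription:
    """Hypothetical minimal record: a name plus the two classifications."""
    name: str
    source: Source
    form: Form

    def label(self) -> str:
        # Human-readable summary of how the dataset is classified.
        return f"{self.name}: {self.source.value} data, {self.form.value} form"


# Example: survey responses are observation data in textual form.
survey = DatasetDescription("survey_responses", Source.OBSERVATION, Form.TEXTUAL)
print(survey.label())
```

Using enumerations rather than free text keeps the vocabulary closed, so two datasets classified the same way are always labelled identically.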
  
  3. Why manage and share your data
  
  Quantity: good data management is necessary because of the growing volume of data (big data), and especially to avoid data loss.
  Quality: sharing data requires good data management practices, which improve the quality of research work.
  Validation of research results: sharing data contributes to validating research results. More and more publishers ask researchers to make available all the underlying data mentioned in the submitted article.
  Integrity: making data available provides better protection against scientific fraud.
  Valorisation: data sharing allows researchers to enhance the value of their data and increase its visibility (citation).
  Funding: data sharing (based on the principle of "as open as possible, as closed as necessary") may be a condition for project funding.
  Reproducibility and reuse: the cost of creating, collecting and processing data can be very high. Reusing existing data rather than recreating them reduces the time and cost of research.
  Interdisciplinarity: databases allow better searching, extraction, cross-referencing and visualization of data, particularly across different disciplines.
  Exhumation of "fossilized" data: publications provide access to about 10% of the data. The remaining 90% stays on computer hard drives and is not used. These are called "fossilized data". Proper management and sharing of these data would prevent the loss of unique data.
  Patrimonial value: some research data can have scientific heritage value. Good management and sharing of these data is particularly important.
  
  4. Research Data Life Cycle
  
  The data life cycle is the set of steps involved in the management, preservation and dissemination of the research data associated with research activities. This cycle guides researchers through the research data management process so that they and their stakeholders can make the most of the research data generated.
  Source: DoRANum
  
  It can be divided into six different phases: Planning, Collecting, Analysing, Publishing, Preserving, and Reusing.
  
  Source: adapted from Research data lifecycle, UK Data Service
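
The six phases listed above can be sketched as an ordered enumeration. This is a minimal illustration, not part of the UK Data Service model: the `Phase` and `next_phase` names are hypothetical, and the wrap-around from Reusing back to Planning is an assumption based on the word "cycle" (reused data feed the planning of a new project).

```python
from enum import Enum


class Phase(Enum):
    """The six phases named in the text, in the order given there."""
    PLANNING = "Planning"
    COLLECTING = "Collecting"
    ANALYSING = "Analysing"
    PUBLISHING = "Publishing"
    PRESERVING = "Preserving"
    REUSING = "Reusing"


def next_phase(phase: Phase) -> Phase:
    """Return the phase that follows `phase`, wrapping around at the end.

    The wrap-around (Reusing -> Planning) is an assumption made for this sketch.
    """
    phases = list(Phase)  # Enum iteration follows definition order
    return phases[(phases.index(phase) + 1) % len(phases)]


print(next_phase(Phase.PLANNING).value)  # Collecting
print(next_phase(Phase.REUSING).value)   # Planning
```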
  
  External Resources
  
  Research data lifecycle, UK Data Service
  Passport For Open Science, CoSo
