DATA MANAGEMENT PLAN (DMP) GUIDELINES
1. Data Management Plan
The Data Management Plan is the working document that explains how data will be collected, stored, secured, and distributed to encourage reuse of data produced by our researchers. It also gives funders confidence that the data will be handled with care and in a transparent way, since we are dealing with an Open Access platform. In addition, it ensures that VUT complies with all legal, ethical, and contractual obligations and promotes best practice in managing research data.
2. Executive Summary
Define the data collection tools that will be implemented in the project. Data collection tools are the devices or instruments used to collect data, such as paper questionnaires, computer-assisted interviewing systems, case studies, checklists, interviews, observation, and surveys.
Define the data collection methods that will be used in the project, e.g. qualitative or quantitative research methods, including the community or population from which the data will be collected. Data is collected to be further subjected to hypothesis testing, which seeks to explain a phenomenon.
Clearly state the purpose of the data collection and its relation to the objectives of the project. Identify the types and formats of data that will be generated, e.g. observational data, experimental data, derived or compiled data. Identify the expected size of the data to be collected and to whom the data might be useful.
3. Data Regulations & Policies
The researchers need to comply with the following data protection policies:
3.1. POPIA: The purpose of the Protection of Personal Information Act (POPIA) is to protect people from harm by protecting their personal information: to stop their money being stolen, to stop their identity being stolen, and generally to protect their privacy, which is a fundamental human right. The Act is effective from 1 July 2020.
3.2. VUT RDM POLICY: The purpose of this policy is to provide guidelines for research data generation, storage, dissemination and reuse. The policy seeks to provide consistent research practices related to data management principles by ensuring that all research data produced at the university are managed and curated effectively and efficiently.
3.3. NRF 2015 MANDATE: This statement mandates institutions to publish data from research projects funded by the NRF in an Open Access repository to encourage reuse of data.
4. Definitions
Principal Investigator or Researcher: A researcher means a person who has entered into an employment relationship with the Vaal University of Technology in an academic position, full time or part time, whether by full appointment or joint appointment, including adjunct, honorary affiliate and assistantship positions. Principal Investigators (PIs) and other researchers are generally regarded as stewards and custodians of research data. However, if PIs choose to delegate responsibility within their research groups, the PIs remain accountable to the University for the stewardship of research data.
Research Data: For the purpose of this policy, research data are defined as tangible and intangible factual records (numerical, textual records, images and sounds), regardless of the form or the media on which they may be recorded, that are used as primary sources for research and that are commonly accepted in the research community as necessary to validate research findings. Research data include but are not limited to computer software, materials, specimens, chemical entities, laboratory notebooks, notes of any type, survey or routine questionnaires, photographs, films, audio recordings, digital images, biological samples, algorithms, reagents, charts, graphs, statistics and conclusions.
Research Data Set: This is defined as a systematic, partial representation of the subject being investigated. Excluded from this definition are laboratory notebooks, preliminary analyses, drafts of scientific papers, plans for future research, peer review reports and personal communications.
Research Administrative and Financial Records: These are records, documents, materials and information that relate to the administrative, financial and human resources management of research. These include but are not limited to financial information, administrative information, cost or pricing of materials, travel expenses and any other information which may be required for reporting by the research funding agencies.
Data Management Plan: This refers to the administrative process by which data is acquired, validated, stored, protected and processed throughout its lifecycle. It includes accessibility by other users.
Metadata: The structured information that describes, explains, locates or otherwise makes it easier to retrieve, use, or manage a data resource or information resource.
Open Access: The immediate, online, free availability of research outputs, which can be accessed by anyone and are free from most copyright and licensing restrictions.
Open Data: Data that can be freely used, reused and distributed to anyone subject to attribution of authorship of such data.
Embargoed Data: This refers to data to which access is restricted for legal, privacy, confidentiality and/or commercial reasons.
Public Funded Research: Public funded research refers to all research supported financially by public or taxpayer funding. It can be provided through an agency or it can be undertaken in government institutions or laboratories. In the context of this policy, the government refers to the South African government. International agencies can also provide funding to South African researchers from their own publicly funded agencies.
Misconduct: When it is necessary to secure such data during research misconduct proceedings, the University, through the office of the Deputy Vice-Chancellor, Research, Innovation, Commercialisation and Internationalisation (DVC RICI), may take custody of such research data.
5. DMP Critical Points
5.1. Making data findable, including provisions for metadata
Is the data produced and/or used in the project discoverable with metadata, identifiable and locatable by means of a standard identification mechanism (e.g. persistent and unique identifiers such as Digital Object Identifiers)?
• What naming conventions do you follow?
• Will search keywords be provided that optimize possibilities for re-use?
• Do you provide clear version numbers?
• What metadata will be created? In case metadata standards do not exist in your discipline, please outline what type of metadata will be created and how.
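As one illustration of the questions above, the following Python sketch builds a versioned file name and a minimal metadata record. The naming pattern (project, description, date, version), the function names and the metadata fields are assumptions chosen for illustration; they are not prescribed by VUT or by any repository, and a discipline-specific metadata standard should take precedence where one exists.

```python
import json
import re
from datetime import date


def build_filename(project: str, description: str, collected: date,
                   version: int, extension: str) -> str:
    """Return a name like 'watersurvey_household-interviews_2024-03-15_v02.csv'.

    Hypothetical convention: lowercase, hyphens inside a part, underscores
    between parts, ISO 8601 date, zero-padded version number.
    """
    def clean(text: str) -> str:
        # Replace every run of non-alphanumeric characters with one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (f"{clean(project)}_{clean(description)}_"
            f"{collected.isoformat()}_v{version:02d}.{extension}")


def build_metadata(filename: str, title: str, creator: str,
                   keywords: list[str], version: int) -> dict:
    """Return a minimal descriptive record (illustrative fields only)."""
    return {
        "title": title,
        "creator": creator,
        "keywords": sorted(keywords),
        "version": version,
        "file": filename,
    }


# Placeholder values, invented for this example.
name = build_filename("WaterSurvey", "Household interviews",
                      date(2024, 3, 15), 2, "csv")
record = build_metadata(name, "Household water use interviews",
                        "Example Researcher", ["water", "household"], 2)
print(name)
print(json.dumps(record, indent=2))
```

Keeping the date in ISO 8601 form and the version zero-padded means files sort correctly by name, and storing the same version number in the metadata record lets a reader confirm which file a description belongs to.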
5.2. Making data openly accessible
Which data produced and/or used in the project will be made openly available as the default? If certain datasets cannot be shared (or need to be shared under restrictions), explain why, clearly separating legal and contractual reasons from voluntary restrictions.
Note that in multi-beneficiary projects it is also possible for specific beneficiaries to keep their data closed if relevant provisions are made in the consortium agreement and are in line with the reasons for opting out.
• How will the data be made accessible (e.g. by deposition in a repository such as Figshare)?
• What methods or software tools are needed to access the data?
• Is documentation about the software needed to access the data included?
• Is it possible to include the relevant software (e.g. in open source code)?
• Where will the data and associated metadata, documentation and code be deposited? Preference should be given to certified repositories which support open access where possible.
• Have you explored appropriate arrangements with the identified repository? If there are restrictions on use, how will access be provided?
• How will the identity of the person accessing the data be ascertained?
5.3. Making data interoperable
• Are the data produced in the project interoperable, that is, do they allow data exchange and re-use between researchers, institutions, organisations and countries?
• What data and metadata vocabularies, standards or methodologies will you follow to make your data interoperable?
• Will you be using standard vocabularies for all data types present in your data set, to allow inter-disciplinary interoperability?
• If it is unavoidable that you use uncommon ontologies or vocabularies, or generate project-specific ones, will you provide mappings to more commonly used ontologies?
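A mapping to a more common vocabulary can be as simple as a lookup table shipped with the dataset. The Python sketch below renames project-specific field names to terms from the Dublin Core metadata vocabulary. The project field names and the sample row are invented for illustration, and the choice of Dublin Core is an assumption; use whichever vocabulary is common in your discipline.

```python
# Hypothetical project-specific field names mapped to Dublin Core terms.
FIELD_MAPPING = {
    "researcher_name": "dcterms:creator",
    "capture_date": "dcterms:created",
    "site": "dcterms:spatial",
}


def to_common_vocabulary(row: dict, mapping: dict) -> tuple[dict, list]:
    """Rename mapped keys; report keys that have no mapping yet."""
    converted = {}
    unmapped = []
    for key, value in row.items():
        if key in mapping:
            converted[mapping[key]] = value
        else:
            converted[key] = value  # keep the original name unchanged
            unmapped.append(key)
    return converted, unmapped


sample_row = {  # invented example values
    "researcher_name": "Example Researcher",
    "capture_date": "2024-03-15",
    "site": "Site A",
    "ph_reading": 7.2,
}
converted, unmapped = to_common_vocabulary(sample_row, FIELD_MAPPING)
print(converted)
print("Fields still needing a mapping:", unmapped)
```

Reporting the unmapped fields, rather than silently dropping them, gives the project a running list of terms that still need either a mapping or a documented definition.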
5.4. Increase data re-use (through clarifying licences)
Clarify how the data will be licensed to permit the widest re-use possible.
When will the data be made available for re-use? If an embargo is sought to give time to publish or seek patents, specify why and how long this will apply, bearing in mind that research data should be made available as soon as possible.
Are the data produced and/or used in the project useable by third parties, in particular after the end of the project? If the re-use of some data is restricted, explain why.
How long is it intended that the data remains re-usable?
Describe data quality assurance processes.
5.5. Allocation of resources
Who will be responsible for data management in your project?
Are the resources for long-term preservation discussed (e.g. cost of data storage and potential value)?
Who decides, and how, what data will be kept and for how long?
5.6. Data security
What provisions are in place for data security (including data recovery as well as secure storage and transfer of sensitive data)?
Is the data safely stored in certified repositories for long term preservation and curation?
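One common way to support the recovery and transfer provisions above is to record a checksum for each file when it is deposited and to compare it again after every transfer or restore. The following Python sketch uses SHA-256 from the standard library. The function names and the temporary demo file are assumptions for illustration. Note that a checksum only detects change; it does not by itself provide confidentiality or access control.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: str, recorded_digest: str) -> bool:
    """Compare a file against the digest recorded at deposit time."""
    return sha256_of_file(path) == recorded_digest


# Demonstration with a throwaway file (stand-in for a real dataset).
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"id,value\n1,42\n")
    demo_path = tmp.name

recorded = sha256_of_file(demo_path)
intact_before = is_unchanged(demo_path, recorded)

with open(demo_path, "ab") as handle:  # simulate corruption or tampering
    handle.write(b"2,99\n")
intact_after = is_unchanged(demo_path, recorded)

os.remove(demo_path)
print(intact_before, intact_after)  # prints: True False
```

Reading the file in chunks keeps memory use constant, so the same function works for large datasets. The recorded digest should be stored separately from the data file, for example in the metadata record, so that both are not lost or altered together.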
5.7. Ethical aspects
Are there any ethical or legal issues that can have an impact on data sharing? These can also be discussed in the context of the ethics review.
Is informed consent for data sharing and long-term preservation included in questionnaires dealing with personal data?