Secondary Data Analysis: Ethical Issues and Challenges

This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License which allows users to read, copy, distribute and make derivative works for non-commercial purposes from the material, as long as the author of the original work is cited properly.

Background

Research does not always involve collecting data from participants. A huge amount of data is collected through routine management information systems and through other surveys or research activities. Existing data can be analyzed to generate new hypotheses or answer critical research questions, which saves considerable time, money and other resources. Data from large sample surveys may also be of higher quality and more representative of the population. Detailed exploration of existing research data avoids repetition of research and wastage of resources, and also ensures that sensitive topics or hard-to-reach populations are not over-researched (1). However, certain ethical issues pertaining to secondary data analysis should be addressed before such data are handled.

Secondary data analysis

Secondary analysis refers to the use of existing research data to answer a question different from that of the original work (2). Secondary data can come from large-scale surveys or from data collected as part of personal research. Although there is general agreement about sharing the results of large-scale surveys, little agreement exists about sharing the latter. While the fundamental ethical issues related to secondary use of research data remain the same, they have become more pressing with the advent of new technologies. Data sharing, compiling and storage have become much faster and easier. At the same time, there are fresh concerns about data confidentiality and security.

Issues in Secondary data analysis

Concerns about secondary use of data mostly revolve around potential harm to individual subjects and the issue of returning for consent. Secondary data vary in the amount of identifying information they contain. If the data contain no identifying information, or are appropriately coded so that the researcher does not have access to the codes, a full review by the ethical board is not required; the board only needs to confirm that the data are actually anonymous. However, if the data contain identifying information on participants, or information that could be linked to identify participants, the board will make a complete review of the proposal. The researcher will then have to explain why identifying information is unavoidable for answering the research question and must also indicate how participants’ privacy and the confidentiality of the data will be protected. If these concerns are satisfactorily addressed, the researcher can then request a waiver of consent.
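The review pathway described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Dataset` fields, the function name `review_level` and the returned labels are hypothetical, and an actual ethics board applies its own criteria.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical flags describing the secondary data set.
    has_identifying_information: bool   # e.g. names or addresses
    has_linkable_information: bool      # could be combined to identify people
    researcher_can_access_codes: bool   # coded data whose key the researcher holds


def review_level(ds: Dataset) -> str:
    """Return the review pathway suggested by the paragraph above.

    'anonymity_check': the board only confirms the data are anonymous.
    'full_review': the board reviews the complete proposal.
    """
    identifiable = (
        ds.has_identifying_information
        or ds.has_linkable_information
        or ds.researcher_can_access_codes
    )
    return "full_review" if identifiable else "anonymity_check"


# Coded data whose key is held only by the original team
coded_no_key = Dataset(False, False, False)
# Data that still contain linkable details
linkable = Dataset(False, True, False)

print(review_level(coded_no_key))
print(review_level(linkable))
```

Under this sketch, a full review is the default whenever any route to identifying participants remains; justifying the identifiers and requesting a waiver of consent would follow that review.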

If the data are freely available on the Internet, in books or in another public forum, permission for further use and analysis is implied. However, the ownership of the original data must be acknowledged. If the research is part of another research project and the data are available only to the original research team, explicit written permission for the use of the data must be obtained from that team and included in the application for ethical clearance.

However, there are certain other issues pertaining to data procured for secondary analysis. The data obtained should be adequate and relevant but not excessive. In secondary data analysis, the original data were not collected to answer the present research question. The data should therefore be evaluated against criteria such as the methodology of data collection, accuracy, period of data collection, purpose for which they were collected and content. The data should be kept for no longer than is necessary for the purpose for which they were obtained and must be kept safe from unauthorized access, accidental loss or destruction. Hard copies should be kept in locked cabinets, whereas soft copies should be kept as encrypted files on computers. It is the responsibility of the researcher conducting the secondary analysis to ensure that the further analysis of the data is appropriate. In some cases the original consent form provides for secondary analysis of the data, on the condition that the secondary study is approved by the ethics review committee. According to the British Sociological Association’s Statement of Ethical Practice (2004), researchers must inform participants about the use of data and obtain consent for the future use of the material as well. However, it also says that consent is not a once-and-for-all event, but is subject to renegotiation over time (3). It appears that there are no guidelines about the specific conditions that require further consent.

Issues in Secondary analysis of Qualitative data

In qualitative research, the culture of data archiving is absent (4). There is also a concern that data archiving exposes subjects’ personal views. However, the best practice is to plan anonymisation at the time of initial transcription. Use of pseudonyms or replacements can protect subjects’ identities. A log of all replacements, aggregations or removals should be made and stored separately from the anonymised data files. Nevertheless, because of the circumstances under which qualitative data are produced, their reinterpretation at a later date can be challenging and raises further ethical concerns.
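The anonymisation practice described above, replacing names with pseudonyms and keeping a log of the replacements, can be sketched as follows. This is a minimal sketch, not a complete anonymisation tool: the function name `pseudonymise`, the `Participant_N` labelling scheme and the sample transcript are assumptions made for illustration, and real transcripts also need manual review for indirect identifiers.

```python
import re


def pseudonymise(transcript, names):
    """Replace each listed name with a pseudonym and log the replacement.

    Returns the anonymised text and a log mapping original name -> pseudonym.
    The log should be stored separately from the anonymised file.
    """
    log = {}
    text = transcript
    for i, name in enumerate(names, start=1):
        pseudonym = f"Participant_{i}"  # hypothetical labelling scheme
        # Match the name as a whole word only, so "Ann" does not alter "Anna".
        pattern = re.compile(rf"\b{re.escape(name)}\b")
        text, count = pattern.subn(pseudonym, text)
        if count:
            log[name] = pseudonym
    return text, log


# Invented sample text for illustration only
sample = "Asha said that Ravi visited the clinic. Asha agreed."
anonymised, replacement_log = pseudonymise(sample, ["Asha", "Ravi"])
print(anonymised)
print(replacement_log)
```

In practice the returned log would be written to a separate, access-restricted location rather than kept alongside the anonymised transcript.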

Conclusion

There is a need to formulate specific guidelines regarding re-use of data, data protection and anonymisation, and issues of consent in secondary data analysis.

Acknowledgements

The authors declare that there is no conflict of interest.

References

Articles from Iranian Journal of Public Health are provided here courtesy of Tehran University of Medical Sciences