Disclosure risk estimation in household sample surveys

Authors

Bruno Freitas Cortez, Maysa Sacramento de Magalhães

DOI:

https://doi.org/10.5380/atoz.v12i0.89682

Keywords:

Disclosure risk, Statistical Disclosure Control, Microdata, Household sample surveys.

Abstract

Introduction: This article estimates the disclosure risk, that is, the probability of discovering the identity of a respondent unit in a disseminated database, for the public use file of the Continuous National Household Sample Survey. Method: The risk was estimated with a probabilistic model, specifically the Benedetti and Franconi model, colloquially called the “Italian approach”. Results: Although most records have a very low disclosure risk, some have a high risk and require greater attention before dissemination. This holds even for more aggregated geographical dissemination areas. Conclusions: The disclosure risk estimates indicate that Statistical Disclosure Control techniques are fundamental tools for helping information producers guarantee the confidentiality of information.

Author Biographies

Bruno Freitas Cortez, Escola Nacional de Ciências Estatísticas (ENCE)

PhD candidate in Population, Territory and Public Statistics

Maysa Sacramento de Magalhães, Escola Nacional de Ciências Estatísticas (ENCE)

Senior Researcher at the Escola Nacional de Ciências Estatísticas

References

Benedetti, R., & Franconi, L. (1998). Statistical and technological solutions for controlled data dissemination. In Pre-proceedings of New Techniques and Technologies for Statistics, pp. 225–232.

Benedetti, R., Capobianchi, A., & Franconi, L. (1998). Individual risk of disclosure using sampling design information. Contributi Istat 1412003. https://www.researchgate.net/publication/243784265_Individual_risk_of_disclosure_using_sampling_design_information

Benschop, T., Machingauta, C., & Welch, M. (2019). Statistical disclosure control: a practice guide. https://sdcpractice.readthedocs.io/en/latest/

Bethlehem, J., Keller, W., & Pannekoek, J. (1990). Disclosure control of microdata. Journal of the American Statistical Association, 85(409), 38-45. https://doi.org/10.2307/2289523

Capobianchi, A., Polettini, S., & Lucarelli, M. (2001). Strategy for the implementation of individual risk methodology into μ-ARGUS. Technical report, Report for the CASC project. 1.2-D1.

Castanha, R. C. G. (2021). A ciência de dados e a cientista de dados. AtoZ: novas práticas em informação e conhecimento. 10(2), 1-4. http://dx.doi.org/10.5380/atoz.v10i2.79882

Duncan, G. T., Elliot, M., & Salazar-González, J. J. (2011). Statistical Confidentiality: Principles and Practice. https://doi.org/10.1007/978-1-4419-7802-8

Elamir, E. A. H., & Skinner, C. (2006). Record level measures of disclosure risk for survey microdata. Journal of Official Statistics. 22(3), 525-539. https://www.researchgate.net/publication/247361113_Record_Level_Measures_of_Disclosure_Risk_for_Survey_Microdata

Elliot, M. J., & Domingo-Ferrer, J. (2018). The future of statistical disclosure control. Paper published as part of The National Statistician's Quality Review. December 2018. https://www.researchgate.net/publication/329884395_The_future_of_statistical_disclosure_control

Franconi, L., & Polettini, S. (2004). Individual risk estimation in μ-Argus: a review. In J. Domingo-Ferrer (Ed.), Privacy in statistical databases. Lecture Notes in Computer Science. 262–272. https://link.springer.com/chapter/10.1007/978-3-540-25955-8_20

Government Statistical Service. (2014). GSS/GSR Disclosure control guidance for microdata produced from social surveys. https://analysisfunction.civilservice.gov.uk/wp-content/uploads/2018/03/Guidance-for-microdata-produced-from-social-surveys-4.pdf

Hundepool, A., Domingo-Ferrer, J., Franconi, L., Giessing, S., Nordholt, E. S., Spicer, K., & de Wolf, P.-P. (2012). Statistical disclosure control. Wiley.

Instituto Brasileiro de Geografia e Estatística. (2018). Confidencialidade no IBGE: procedimentos adotados na preservação do sigilo das informações individuais nas divulgações de resultados das operações estatísticas. IBGE.

Instituto Brasileiro de Geografia e Estatística. (2021). Pesquisa Nacional por Amostra de Domicílios Contínua. Notas técnicas Versão 1.8. IBGE.

Instituto Brasileiro de Geografia e Estatística. (2023). Estratégia geral de tecnologia da informação e comunicação do IBGE. EGTI 2023-2024. IBGE.

Machado, J. H., & Famá, R. (2011). Ativos intangíveis e governança corporativa no mercado de capitais brasileiro. Revista Contemporânea de Contabilidade, 8(16), 89-110. https://periodicos.ufsc.br/index.php/contabilidade/article/view/2175-8069.2011v8n16p89/20046

Polettini, S. (2003). Some remarks on the individual risk methodology. In Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality. https://unece.org/fileadmin/DAM/stats/documents/ece/ces/2003/04/confidentiality/wp.18.e.pdf

Polettini, S., & Stander, J. (2004). A Bayesian hierarchical model approach to risk estimation in statistical disclosure limitation. In J. Domingo-Ferrer & V. Torra (Eds.), Privacy in Statistical Databases, pp. 247-261. https://doi.org/10.1007/978-3-540-25955-8_19

Rinott, Y. (2003). On models for statistical disclosure risk estimation. In Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality. https://api.semanticscholar.org/CorpusID:16509807

Rocha, D. F. (2019). Concorrência em mercados digitais e desafios ao controle de atos de concentração. Revista de Defesa da Concorrência. 7(2). https://revista.cade.gov.br/index.php/revistadedefesadaconcorrencia/article/view/413/236

Santos, Y. T., & Kowata, E. T. (2018). A importância do Big Data nas organizações. V Congresso de Ensino, Pesquisa e Extensão da UEG. https://www.anais.ueg.br/index.php/cepe/article/view/13307

Skinner, C. J., & Holmes, D. J. (1998). Estimating the re-identification risk per record in microdata. Journal of Official Statistics. 14(4), 361-372. https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/estimating-the-re-identification-risk-per-record-in-microdata.pdf

Taylor, L., Zhou, X. H., & Rise, P. (2018). A tutorial in assessing disclosure risk in microdata. Statistics in Medicine, 37(25), 3693-3706.

Templ, M. (2017). Statistical disclosure control for microdata: methods and applications in R. Springer.

United Nations. (2015). United Nations Fundamental Principles of Official Statistics: implementation guidelines. https://unstats.un.org/unsd/dnss/gp/Implementation_Guidelines_FINAL_without_edit.pdf

Waal, T. de, & Willenborg, L. C. R. J. (1996). A view on statistical disclosure control for microdata. Survey Methodology. 22(1), 95-103. https://www150.statcan.gc.ca/n1/en/pub/12-001-x/1996001/article/14381-eng.pdf?st=KKaA-W9y

Willenborg, L., & Waal, T. de. (2001). Elements of statistical disclosure control. Springer.

Published

2023-12-29

How to Cite

Cortez, B. F., & de Magalhães, M. S. (2023). Disclosure risk estimation in household sample surveys. AtoZ: novas práticas em informação e conhecimento, 12, 1–11. https://doi.org/10.5380/atoz.v12i0.89682

Section

Papers