Extending the Fellegi-Sunter record linkage model for mixed-type data with application to the French national health data system
Suggested Citation
DOI: 10.1016/j.csda.2022.107656
References listed on IDEAS
- Sayers, Adrian & Ben-Shlomo, Yoav & Blom, Ashley W. & Steele, Fiona, 2015. "Probabilistic record linkage," LSE Research Online Documents on Economics 64894, London School of Economics and Political Science, LSE Library.
- P. Lahiri & Michael D. Larsen, 2005. "Regression Analysis With Linked Data," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 222-230, March.
- Rebecca C. Steorts & Rob Hall & Stephen E. Fienberg, 2016. "A Bayesian Approach to Graphical Record Linkage and Deduplication," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1660-1672, October.
- Hofert, Marius & Mächler, Martin, 2016. "Parallel and Other Simulations in R Made Easy: An End-to-End Study," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 69(i04).
- Enamorado, Ted & Fifield, Benjamin & Imai, Kosuke, 2019. "Using a Probabilistic Model to Assist Merging of Large-Scale Administrative Records," American Political Science Review, Cambridge University Press, vol. 113(2), pages 353-371, May.
- J. B. Copas & F. J. Hilton, 1990. "Record Linkage: Statistical Models for Matching Computer Records," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 153(3), pages 287-312, May.
- Kim, Gunky & Chambers, Raymond, 2012. "Regression analysis under incomplete linkage," Computational Statistics & Data Analysis, Elsevier, vol. 56(9), pages 2756-2770.
- Mauricio Sadinle, 2017. "Bayesian Estimation of Bipartite Matchings for Record Linkage," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 600-612, April.
- Abdullah-Al Mamun & Robert Aseltine & Sanguthevar Rajasekaran, 2016. "Efficient Record Linkage Algorithms Using Complete Linkage Clustering," PLOS ONE, Public Library of Science, vol. 11(4), pages 1-21, April.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Li‐Chun Zhang & Tiziana Tuoto, 2021. "Linkage‐data linear regression," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(2), pages 522-547, April.
- Bera Sabyasachi & Chatterjee Snigdhansu, 2020. "High dimensional, robust, unsupervised record linkage," Statistics in Transition New Series, Statistics Poland, vol. 21(4), pages 123-143, August.
- Sabyasachi Bera & Snigdhansu Chatterjee, 2020. "High dimensional, robust, unsupervised record linkage," Statistics in Transition New Series, Polish Statistical Association, vol. 21(4), pages 123-143, August.
- Ben Powell & Paul A. Smith, 2020. "Computing expectations and marginal likelihoods for permutations," Computational Statistics, Springer, vol. 35(2), pages 871-891, June.
- Han Ying, 2020. "Discussion of “Small area estimation: its evolution in five decades”, by Malay Ghosh," Statistics in Transition New Series, Statistics Poland, vol. 21(4), pages 30-34, August.
- Ying Han, 2020. "Discussion of "Small area estimation: its evolution in five decades", by Malay Ghosh," Statistics in Transition New Series, Polish Statistical Association, vol. 21(4), pages 30-34, August.
- Daniel H. Weinberg & John M. Abowd & Robert F. Belli & Noel Cressie & David C. Folch & Scott H. Holan & Margaret C. Levenstein & Kristen M. Olson & Jerome P. Reiter & Matthew D. Shapiro & Jolene Smyth, 2017. "Effects of a Government-Academic Partnership: Has the NSF-Census Bureau Research Network Helped Improve the U.S. Statistical System?," Working Papers 17-59r, Center for Economic Studies, U.S. Census Bureau.
- John M. Abowd & Joelle Abramowitz & Margaret C. Levenstein & Kristin McCue & Dhiren Patki & Trivellore Raghunathan & Ann M. Rodgers & Matthew D. Shapiro & Nada Wasi & Dawn Zinsser, 2021. "Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning," Working Papers 21-35, Center for Economic Studies, U.S. Census Bureau.
- John M. Abowd & Joelle Hillary Abramowitz & Margaret Catherine Levenstein & Kristin McCue & Dhiren Patki & Trivellore Raghunathan & Ann Michelle Rodgers & Matthew D. Shapiro & Nada Wasi & Dawn Zinsser, 2021. "Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning," Working Papers 22-11, Federal Reserve Bank of Boston.
- Duncan Smith, 2020. "Re‐identification in the Absence of Common Variables for Matching," International Statistical Review, International Statistical Institute, vol. 88(2), pages 354-379, August.
- Tatiana Komarova & Denis Nekipelov & Evgeny Yakovlev, 2018. "Identification, data combination, and the risk of disclosure," Quantitative Economics, Econometric Society, vol. 9(1), pages 395-440, March.
- Tatiana V. Komarova & Denis Nekipelov & Evgeny Yakovlev, 2011. "Identification, data combination and the risk of disclosure," CeMMAP working papers CWP38/11, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Komarova, Tatiana & Nekipelov, Denis & Yakovlev, Evgeny, 2018. "Identification, data combination and the risk of disclosure," LSE Research Online Documents on Economics 79384, London School of Economics and Political Science, LSE Library.
- Angelo Moretti & Natalie Shlomo, 2023. "Improving Probabilistic Record Linkage Using Statistical Prediction Models," International Statistical Review, International Statistical Institute, vol. 91(3), pages 368-394, December.
- Betancourt, Brenda & Sosa, Juan & Rodríguez, Abel, 2022. "A prior for record linkage based on allelic partitions," Computational Statistics & Data Analysis, Elsevier, vol. 172(C).
- N. Salvati & E. Fabrizi & M. G. Ranalli & R. L. Chambers, 2021. "Small area estimation with linked data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(1), pages 78-107, February.
- Ray Chambers & Andrea Diniz da Silva, 2020. "Improved secondary analysis of linked data: a framework and an illustration," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(1), pages 37-59, January.
- Al-Kandari Noriah M. & Lahiri Partha, 2016. "Prediction of a Function of Misclassified Binary Data," Statistics in Transition New Series, Statistics Poland, vol. 17(3), pages 429-447, September.
- Lee, Gyumin & Lee, Sungjun & Lee, Changyong, 2023. "Inventor–licensee matchmaking for university technology licensing: A fastText approach," Technovation, Elsevier, vol. 125(C).
- Durrant, Gabriele B. & D'Arrigo, Julia & Steele, Fiona, 2011. "Using field process data to predict best times of contact conditioning on household and interviewer influences," LSE Research Online Documents on Economics 52201, London School of Economics and Political Science, LSE Library.
- Kim, Gunky & Chambers, Raymond, 2012. "Regression analysis under incomplete linkage," Computational Statistics & Data Analysis, Elsevier, vol. 56(9), pages 2756-2770.
- Eric Chyn & Kareem Haggag, 2023. "Moved to Vote: The Long-Run Effects of Neighborhoods on Political Participation," The Review of Economics and Statistics, MIT Press, vol. 105(6), pages 1596-1605, November.
- Eric Chyn & Kareem Haggag, 2019. "Moved to Vote: The Long-Run Effects of Neighborhoods on Political Participation," NBER Working Papers 26515, National Bureau of Economic Research, Inc.
- Eric Chyn & Kareem Haggag, 2019. "Moved to Vote: The Long-Run Effects of Neighborhoods on Political Participation," Working Papers 2019-079, Human Capital and Economic Opportunity Working Group.
- Debabrata Dey, 2003. "Record Matching in Data Warehouses: A Decision Model for Data Consolidation," Operations Research, INFORMS, vol. 51(2), pages 240-254, April.
More about this item
Keywords
Expectation Conditional Maximization (ECM) algorithm; Hurdle gamma distribution; Low prevalence variables; Mixture model; Probabilistic record linkage
Handle: RePEc:eee:csdana:v:179:y:2023:i:c:s0167947322002365