
A new multiple outliers identification method in linear regression

Authors

  • Vilijandas Bagdonavičius (Vilnius University)
  • Linas Petkevičius (Vilnius University)

Abstract

A new method for identifying multiple outliers in linear regression models is developed. It is relatively simple and easy to use. The method is based on a result giving the asymptotic properties of extreme studentized residuals. This result is proved under rather general conditions on the estimation procedure and the covariate distribution. An extensive simulation study shows that the proposed method outperforms various existing methods in terms of masking and swamping values. The advantage of the method is particularly visible for large datasets and/or large numbers of outliers. The analysis of several well-known real data examples confirms that in most cases the new method identifies outliers better than other commonly used methods.
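To make the notion of studentized residuals concrete, the following is a minimal sketch that computes internally studentized residuals from an ordinary least squares fit and picks out the most extreme one. It is not the authors' procedure: the asymptotic result and the decision thresholds from the paper are not reproduced here, the function name `studentized_residuals` is hypothetical, and the data are simulated purely for illustration.

```python
import numpy as np

def studentized_residuals(X, y):
    """Internally studentized residuals of an OLS fit of y on X.

    Hypothetical helper for illustration; an intercept column is added.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])  # design matrix with intercept
    n, p = A.shape

    # Least squares coefficients and raw residuals.
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta

    # Leverages: diagonal of the hat matrix, obtained from a reduced QR.
    Q, _ = np.linalg.qr(A)
    h = np.sum(Q**2, axis=1)

    # Residual variance estimate with n - p degrees of freedom.
    s2 = resid @ resid / (n - p)
    return resid / np.sqrt(s2 * (1.0 - h))

# Simulated data (an assumption for this sketch): a clean line plus one planted outlier.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)
y[10] += 8.0  # gross outlier at index 10

r = studentized_residuals(x, y)
worst = int(np.argmax(np.abs(r)))  # index of the most extreme studentized residual
```

With a single gross outlier the largest absolute studentized residual points at the contaminated observation. With several outliers, a least squares fit can hide some of them (masking) or make clean points look suspicious (swamping), which is the setting the abstract addresses.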

Suggested Citation

  • Vilijandas Bagdonavičius & Linas Petkevičius, 2020. "A new multiple outliers identification method in linear regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(3), pages 275-296, April.
  • Handle: RePEc:spr:metrik:v:83:y:2020:i:3:d:10.1007_s00184-019-00731-8
    DOI: 10.1007/s00184-019-00731-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00184-019-00731-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s00184-019-00731-8?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Y. She & K. Chen, 2017. "Robust reduced-rank regression," Biometrika, Biometrika Trust, vol. 104(3), pages 633-647.
    2. Marco Riani & Aldo Corbellini & Anthony C. Atkinson, 2018. "The Use of Prior Information in Very Robust Regression for Fraud Detection," International Statistical Review, International Statistical Institute, vol. 86(2), pages 205-218, August.
    3. Billor, Nedret & Hadi, Ali S. & Velleman, Paul F., 2000. "BACON: blocked adaptive computationally efficient outlier nominators," Computational Statistics & Data Analysis, Elsevier, vol. 34(3), pages 279-298, September.
    4. Zani, Sergio & Riani, Marco & Corbellini, Aldo, 1998. "Robust bivariate boxplots and multiple outlier detection," Computational Statistics & Data Analysis, Elsevier, vol. 28(3), pages 257-270, September.
    5. Hadi, Ali S., 1992. "A new measure of overall potential influence in linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 14(1), pages 1-27, June.
    6. Roy E. Welsch & Edwin Kuh, 1977. "Linear Regression Diagnostics," NBER Working Papers 0173, National Bureau of Economic Research, Inc.
    7. Chun Gun Park & Inyoung Kim, 2018. "Outlier detection using difference-based variance estimators in multiple regression," Communications in Statistics - Theory and Methods, Taylor & Francis Journals, vol. 47(24), pages 5986-6001, December.
    8. A. H. M. Rahmatullah Imon, 2005. "Identifying multiple influential observations in linear regression," Journal of Applied Statistics, Taylor & Francis Journals, vol. 32(9), pages 929-946.
    9. Todorov, Valentin & Filzmoser, Peter, 2009. "An Object-Oriented Framework for Robust Multivariate Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 32(i03).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. A.A.M. Nurunnabi & M. Nasser & A.H.M.R. Imon, 2016. "Identification and classification of multiple outliers, high leverage points and influential observations in linear regression," Journal of Applied Statistics, Taylor & Francis Journals, vol. 43(3), pages 509-525, March.
    2. Junlong Zhao & Chao Liu & Lu Niu & Chenlei Leng, 2019. "Multiple influential point detection in high dimensional regression spaces," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 385-408, April.
    3. M. Habshah & M. R. Norazan & A.H.M. Rahmatullah Imon, 2009. "The performance of diagnostic-robust generalized potentials for the identification of multiple high leverage points in linear regression," Journal of Applied Statistics, Taylor & Francis Journals, vol. 36(5), pages 507-520.
    4. M. Hubert & P. Rousseeuw & K. Vakili, 2014. "Shape bias of robust covariance estimators: an empirical study," Statistical Papers, Springer, vol. 55(1), pages 15-28, February.
    5. Andrea Bergesio & María Eugenia Szretter Noste & Víctor J. Yohai, 2021. "A robust proposal of estimation for the sufficient dimension reduction problem," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(3), pages 758-783, September.
    6. Kondylis, Athanassios & Hadi, Ali S., 2006. "Derived components regression using the BACON algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 51(2), pages 556-569, November.
    7. A.A.M. Nurunnabi & Ali S. Hadi & A.H.M.R. Imon, 2014. "Procedures for the identification of multiple influential observations in linear regression," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(6), pages 1315-1331, June.
    8. Luigi Grossi & Fabrizio Laurini, 2020. "Robust asset allocation with conditional value at risk using the forward search," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 36(3), pages 335-352, May.
    9. Valentin Todorov & Matthias Templ & Peter Filzmoser, 2011. "Detection of multivariate outliers in business survey data with incomplete information," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 5(1), pages 37-56, April.
    10. Catherine Fuss & Angelos Theodorakopoulos, 2018. "Compositional Changes in Aggregate Productivity in an Era of Globalisation and Financial Crisis," Working Papers of VIVES - Research Centre for Regional Economics 627696, KU Leuven, Faculty of Economics and Business (FEB), VIVES - Research Centre for Regional Economics.
    11. Peter J. Luke & Mark E. Schaffer, 1999. "Wage Determination in Russia: An Econometric Investigation," CERT Discussion Papers 9908, Centre for Economic Reform and Transformation, Heriot Watt University.
    12. L. Pitsoulis & G. Zioutas, 2010. "A fast algorithm for robust regression with penalised trimmed squares," Computational Statistics, Springer, vol. 25(4), pages 663-689, December.
    13. Cristian BARRA & Roberto ZOTTI, 2019. "Bank Performance, Financial Stability And Market Concentration: Evidence From Cooperative And Non‐Cooperative Banks," Annals of Public and Cooperative Economics, Wiley Blackwell, vol. 90(1), pages 103-139, March.
    14. Hong Choon Ong & Ekele Alih, 2015. "A Control Chart Based on Cluster-Regression Adjustment for Retrospective Monitoring of Individual Characteristics," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-30, April.
    15. Stefani, Gianluca & Gadanakis, Yiorgos & Lombardi, Ginevra Virginia & Tiberti, Marco, 2017. "The impact of financial leverage on farms capacity to react in market shocks," 2017 International Congress, August 28-September 1, 2017, Parma, Italy 261156, European Association of Agricultural Economists.
    16. Batalla-Bejerano, Joan & Costa-Campi, Maria Teresa & Trujillo-Baute, Elisa, 2016. "Collateral effects of liberalisation: Metering, losses, load profiles and cost settlement in Spain’s electricity system," Energy Policy, Elsevier, vol. 94(C), pages 421-431.
    17. repec:hal:spmain:info:hdl:2441/o45fqtltm960r11iq437ski90 is not listed on IDEAS
    18. Steffen Liebscher & Thomas Kirschstein, 2015. "Efficiency of the pMST and RDELA location and scatter estimators," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 99(1), pages 63-82, January.
    19. Jiří Schwarz & Martin Pospíšil, 2018. "Bankruptcy, Investment, and Financial Constraints: Evidence from the Czech Republic," Eastern European Economics, Taylor & Francis Journals, vol. 56(2), pages 99-121, March.
    20. A. H. M. Rahmatullah Imon, 2005. "Identifying multiple influential observations in linear regression," Journal of Applied Statistics, Taylor & Francis Journals, vol. 32(9), pages 929-946.
    21. AMENDOLA, Adalgiso & BARRA, Cristian & BOCCIA, Marinella & PAPACCIO, Anna, 2018. "Market Structure and Financial Stability: Theory and Evidence," CELPE Discussion Papers 156, CELPE - CEnter for Labor and Political Economics, University of Salerno, Italy.
