
Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor

Authors
  • Ryuta Tamura

    (Tokyo University of Agriculture and Technology
    October Sky Co., Ltd.)

  • Ken Kobayashi

    (Fujitsu Laboratories Ltd.)

  • Yuichi Takano

    (Senshu University
    University of Tsukuba)

  • Ryuhei Miyashiro

    (Tokyo University of Agriculture and Technology)

  • Kazuhide Nakata

    (Tokyo Institute of Technology)

  • Tomomi Matsui

    (Tokyo Institute of Technology)

Abstract

Multicollinearity exists when some explanatory variables of a multiple linear regression model are highly correlated. High correlation among explanatory variables reduces the reliability of the analysis. To eliminate multicollinearity from a linear regression model, we consider how to select a subset of significant variables by means of the variance inflation factor (VIF), which is the most common indicator used in detecting multicollinearity. In particular, we adopt the mixed integer optimization (MIO) approach to subset selection. The MIO approach was proposed in the 1970s, and recently it has received renewed attention due to advances in algorithms and hardware. However, none of the existing studies have developed a computationally tractable MIO formulation for eliminating multicollinearity on the basis of VIF. In this paper, we propose mixed integer quadratic optimization (MIQO) formulations for selecting the best subset of explanatory variables subject to the upper bounds on the VIFs of selected variables. Our two MIQO formulations are based on the two equivalent definitions of VIF. Computational results illustrate the effectiveness of our MIQO formulations by comparison with conventional local search algorithms and MIO-based cutting plane algorithms.
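The abstract refers to two equivalent definitions of VIF without stating them. The sketch below assumes they are the two standard textbook forms: VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing variable j on the other variables, and VIF_j as the j-th diagonal entry of the inverse correlation matrix. It also shows a brute-force search for the best subset under an upper bound on the VIFs. This enumeration is only an illustration for small problems and is not the paper's MIQO formulation; the function names and the parameters k and vif_max are hypothetical.

```python
from itertools import combinations

import numpy as np


def vif_by_regression(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the rest."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Design matrix: intercept plus every column except j.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


def vif_by_inverse_correlation(X):
    """VIF_j = j-th diagonal entry of the inverse correlation matrix."""
    R = np.atleast_2d(np.corrcoef(np.asarray(X, dtype=float), rowvar=False))
    return np.diag(np.linalg.inv(R)).copy()


def best_subset_with_vif_bound(X, y, k, vif_max):
    """Exhaustive search (illustration only, NOT the paper's MIQO approach).

    Returns (rss, subset) minimizing the residual sum of squares over all
    subsets of at most k columns whose VIFs are all <= vif_max.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    best_rss, best_subset = np.inf, None
    for size in range(1, k + 1):
        for subset in combinations(range(p), size):
            cols = list(subset)
            try:
                vifs = vif_by_inverse_correlation(X[:, cols])
            except np.linalg.LinAlgError:
                continue  # singular correlation matrix: treat as infeasible
            if np.any(vifs > vif_max):
                continue  # violates the upper bound on VIF
            A = np.column_stack([np.ones(n), X[:, cols]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            if rss < best_rss:
                best_rss, best_subset = rss, subset
    return best_rss, best_subset
```

The enumeration visits every subset of up to k variables, so its cost grows combinatorially with the number of candidates; that growth is the reason an optimization formulation is of interest for larger instances.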

Suggested Citation

  • Ryuta Tamura & Ken Kobayashi & Yuichi Takano & Ryuhei Miyashiro & Kazuhide Nakata & Tomomi Matsui, 2019. "Mixed integer quadratic optimization formulations for eliminating multicollinearity based on variance inflation factor," Journal of Global Optimization, Springer, vol. 73(2), pages 431-446, February.
  • Handle: RePEc:spr:jglopt:v:73:y:2019:i:2:d:10.1007_s10898-018-0713-3
    DOI: 10.1007/s10898-018-0713-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10898-018-0713-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Miyashiro, Ryuhei & Takano, Yuichi, 2015. "Mixed integer second-order cone programming formulations for variable selection in linear regression," European Journal of Operational Research, Elsevier, vol. 247(3), pages 721-731.
    2. Dimitris Bertsimas & Angela King, 2016. "OR Forum—An Algorithmic Approach to Linear Regression," Operations Research, INFORMS, vol. 64(1), pages 2-16, February.
    3. Toshiki Sato & Yuichi Takano & Ryuhei Miyashiro & Akiko Yoshise, 2016. "Feature subset selection for logistic regression via mixed integer optimization," Computational Optimization and Applications, Springer, vol. 64(3), pages 865-880, July.
    4. Ian T. Jolliffe, 1982. "A Note on the Use of Principal Components in Regression," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 31(3), pages 300-303, November.

    Citations


    Cited by:

    1. Chengjie Yang & Ruren Li & Zongyao Sha, 2020. "Exploring the Dynamics of Urban Greenness Space and Their Driving Factors Using Geographically Weighted Regression: A Case Study in Wuhan Metropolis, China," Land, MDPI, vol. 9(12), pages 1-21, December.
    2. Yuichi Takano & Ryuhei Miyashiro, 2020. "Best subset selection via cross-validation criterion," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(2), pages 475-488, July.
    3. Iuliia Iliashenko & Fragkoulis Papagiannis & Patrizia Gazzola & Nataliia Cherkas & Daniele Grechi, 2023. "Entrepreneurial Behaviour and Organisational Propensity to Innovate in a Public-Sector Context," Journal of Entrepreneurship and Innovation in Emerging Economies, Entrepreneurship Development Institute of India, vol. 32(1), pages 111-156, March.
    4. Ken Kobayashi & Yuichi Takano & Kazuhide Nakata, 2021. "Bilevel cutting-plane algorithm for cardinality-constrained mean-CVaR portfolio optimization," Journal of Global Optimization, Springer, vol. 81(2), pages 493-528, October.
    5. Pankaj Tiwari, 2023. "Influence of Millennials’ eco-literacy and biospheric values on green purchases: the mediating effect of attitude," Public Organization Review, Springer, vol. 23(3), pages 1195-1212, September.
    6. Tomokaze Shiratori & Ken Kobayashi & Yuichi Takano, 2020. "Prediction of hierarchical time series using structured regularization and its application to artificial neural networks," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-23, November.
    7. Wu, Hong, 2023. "Evaluating the role of renewable energy investment resources and green finance on the economic performance: Evidence from OECD economies," Resources Policy, Elsevier, vol. 80(C).
    8. Jireh Yi-Le Chan & Steven Mun Hong Leow & Khean Thye Bea & Wai Khuen Cheng & Seuk Wai Phoong & Zeng-Wei Hong & Yen-Lin Chen, 2022. "Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review," Mathematics, MDPI, vol. 10(8), pages 1-17, April.
    9. Gambella, Claudio & Ghaddar, Bissan & Naoum-Sawaya, Joe, 2021. "Optimization problems for machine learning: A survey," European Journal of Operational Research, Elsevier, vol. 290(3), pages 807-828.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Leonardo Di Gangi & M. Lapucci & F. Schoen & A. Sortino, 2019. "An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series," Computational Optimization and Applications, Springer, vol. 74(3), pages 919-948, December.
    2. Young Woong Park & Diego Klabjan, 2020. "Subset selection for multiple linear regression via optimization," Journal of Global Optimization, Springer, vol. 77(3), pages 543-574, July.
    3. Matteo Lapucci & Tommaso Levato & Marco Sciandrone, 2021. "Convergent Inexact Penalty Decomposition Methods for Cardinality-Constrained Problems," Journal of Optimization Theory and Applications, Springer, vol. 188(2), pages 473-496, February.
    4. Serrano, Breno & Minner, Stefan & Schiffer, Maximilian & Vidal, Thibaut, 2024. "Bilevel optimization for feature selection in the data-driven newsvendor problem," European Journal of Operational Research, Elsevier, vol. 315(2), pages 703-714.
    5. Dennis Shen & Peng Ding & Jasjeet Sekhon & Bin Yu, 2022. "Same Root Different Leaves: Time Series and Cross-Sectional Methods in Panel Data," Papers 2207.14481, arXiv.org, revised Oct 2022.
    6. Carlos Moreno-Miranda & Hipatia Palacios & Daniele Rama, 2019. "Small-holders perception of sustainability and chain coordination: evidence from Arriba PDO Cocoa in Western Ecuador," Bio-based and Applied Economics Journal, Italian Association of Agricultural and Applied Economics (AIEAA), vol. 8(3), December.
    7. Fernandez-Haddad, Zaira & Quiroga, Sonia, 2011. "Adaptation Of Mediterranean Crops To Water Pressure In The Ebro Basin: A Water Efficiency Index," 2011 International Congress, August 30-September 2, 2011, Zurich, Switzerland 114358, European Association of Agricultural Economists.
    8. Kawano, Shuichi & Fujisawa, Hironori & Takada, Toyoyuki & Shiroishi, Toshihiko, 2015. "Sparse principal component regression with adaptive loading," Computational Statistics & Data Analysis, Elsevier, vol. 89(C), pages 192-203.
    9. Heni Masruroh & Soemarno Soemarno & Syahrul Kurniawan & Amin Setyo Leksono, 2023. "A Spatial Model of Landslides with A Micro-Topography and Vegetation Approach for Sustainable Land Management in the Volcanic Area," Sustainability, MDPI, vol. 15(4), pages 1-26, February.
    10. Tao Xu & He Meng & Jie Zhu & Wei Wei & He Zhao & Han Yang & Zijin Li & Yuhan Wu, 2021. "Optimal Capacity Allocation of Energy Storage in Distribution Networks Considering Active/Reactive Coordination," Energies, MDPI, vol. 14(6), pages 1-24, March.
    11. Minjung Kyung & Ju-Hyun Park & Ji Yeh Choi, 2022. "Bayesian Mixture Model of Extended Redundancy Analysis," Psychometrika, Springer;The Psychometric Society, vol. 87(3), pages 946-966, September.
    12. Hugh L. Christensen, 2015. "Algorithmic arbitrage of open-end funds using variational Bayes," International Journal of Financial Engineering (IJFE), World Scientific Publishing Co. Pte. Ltd., vol. 2(04), pages 1-38, December.
    13. Jiaju Miao & Pawel Polak, 2023. "Online Ensemble of Models for Optimal Predictive Performance with Applications to Sector Rotation Strategy," Papers 2304.09947, arXiv.org.
    14. Mirza Pasic & Halima Hadziahmetovic & Ismira Ahmovic & Mugdim Pasic, 2023. "Principal Component Regression Modeling and Analysis of PM 10 and Meteorological Parameters in Sarajevo with and without Temperature Inversion," Sustainability, MDPI, vol. 15(14), pages 1-22, July.
    15. Cai, Yuezhou & Hanley, Aoife, 2012. "Building BRICS: 2-Stage DEA analysis of R&D efficiency," Kiel Working Papers 1788, Kiel Institute for the World Economy (IfW Kiel).
    16. Travaglini, Guido, 2010. "Supervised Principal Components and Factor Instrumental Variables. An Application to Violent Crime Trends in the US, 1982-2005," MPRA Paper 22077, University Library of Munich, Germany.
    17. Amir Ahmadi-Javid & Pooya Hoseinpour, 2022. "Convexification of Queueing Formulas by Mixed-Integer Second-Order Cone Programming: An Application to a Discrete Location Problem with Congestion," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2621-2633, September.
    18. Gambella, Claudio & Ghaddar, Bissan & Naoum-Sawaya, Joe, 2021. "Optimization problems for machine learning: A survey," European Journal of Operational Research, Elsevier, vol. 290(3), pages 807-828.
    19. Kimia Keshanian & Daniel Zantedeschi & Kaushik Dutta, 2022. "Features Selection as a Nash-Bargaining Solution: Applications in Online Advertising and Information Systems," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2485-2501, September.
    20. Youssef M. Aboutaleb & Moshe Ben-Akiva & Patrick Jaillet, 2020. "Learning Structure in Nested Logit Models," Papers 2008.08048, arXiv.org.
