TNN: A transfer learning classifier based on weighted nearest neighbors

Author

Listed:
  • Sheng, Haiyang
  • Yu, Guan

Abstract

Weighted nearest neighbors (WNN) classifiers are popular non-parametric classifiers. Despite the significant progress in WNN, most existing WNN classifiers are designed for traditional supervised learning problems where both training samples and test samples are assumed to be independent and identically distributed. However, in many real applications, it can be difficult or expensive to obtain training samples from the distribution of interest. Therefore, data collected from some related distributions are often used as supplementary training data for the classification task under the distribution of interest. It is essential to develop effective classification methods that can incorporate both training samples from the distribution of interest (if they exist) and supplementary training samples from a different but related distribution. To address this challenge, we propose a novel Transfer learning weighted Nearest Neighbors (TNN) classifier. As a WNN classifier, TNN determines the weights on the class labels of training samples for different test samples adaptively by minimizing an upper bound on the conditional expectation of the estimation error of the regression function. It puts decreasing weights on the class labels of successively more distant neighbors. To accommodate the difference between training samples from the distribution of interest and supplementary training samples, TNN adds a non-negative offset to the distance between each supplementary training sample and the test sample, and thus constrains the excessive influence of the supplementary training samples on the prediction. Our theoretical studies show that, under certain conditions, TNN is consistent and minimax optimal (up to a logarithmic factor) in the covariate shift setting.
In the posterior drift or the more general setting where both covariate shift and posterior drift exist, the excess risk of TNN depends on the maximum posterior discrepancy between the distribution of the supplementary training samples and the distribution of interest. Both our simulation studies and an application to the land use/land cover mapping problem in geography demonstrate that TNN outperforms other existing methods. It can serve as an effective tool for transfer learning.
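
The offset idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: TNN chooses its weights adaptively by minimizing an upper bound on the estimation error, whereas the sketch below substitutes simple rank-based weights that decrease linearly with distance rank. The function name `offset_wnn_predict`, the parameters `offset` and `k`, the Euclidean distance, and the binary 0/1 label encoding are all assumptions made for illustration.

```python
import numpy as np


def offset_wnn_predict(x, X_tgt, y_tgt, X_src, y_src, offset=0.5, k=5):
    """Illustrative offset-adjusted weighted nearest neighbors (binary labels 0/1).

    X_tgt/y_tgt: samples from the distribution of interest (may be empty).
    X_src/y_src: supplementary samples from a related distribution.
    offset: hypothetical non-negative penalty added to supplementary distances.
    """
    x = np.asarray(x, dtype=float).ravel()
    X_tgt = np.asarray(X_tgt, dtype=float).reshape(-1, x.size)
    X_src = np.asarray(X_src, dtype=float).reshape(-1, x.size)

    # Euclidean distances; supplementary samples are pushed farther away.
    d_tgt = np.linalg.norm(X_tgt - x, axis=1)
    d_src = np.linalg.norm(X_src - x, axis=1) + max(float(offset), 0.0)

    dist = np.concatenate([d_tgt, d_src])
    labels = np.concatenate([
        np.asarray(y_tgt, dtype=float).ravel(),
        np.asarray(y_src, dtype=float).ravel(),
    ])
    if dist.size == 0:
        raise ValueError("at least one training sample is required")

    # Keep the k nearest samples under the offset-adjusted distance.
    k = min(int(k), dist.size)
    order = np.argsort(dist, kind="stable")[:k]

    # Stand-in weights (assumption): linearly decreasing in rank, summing to 1.
    w = np.arange(k, 0, -1, dtype=float)
    w /= w.sum()

    score = float(w @ labels[order])  # weighted estimate of P(Y = 1 | x)
    return int(score >= 0.5), score


# One target sample at distance 1.0 (label 1), two closer source samples (label 0).
x = [0.0]
X_tgt, y_tgt = [[1.0]], [1]
X_src, y_src = [[0.2], [0.3]], [0, 0]
no_offset = offset_wnn_predict(x, X_tgt, y_tgt, X_src, y_src, offset=0.0, k=1)
with_offset = offset_wnn_predict(x, X_tgt, y_tgt, X_src, y_src, offset=1.0, k=1)
```

In this toy example with `k=1`, the nearest neighbor without an offset is a supplementary sample, so the prediction is class 0. With `offset=1.0`, the target sample becomes the nearest neighbor and the prediction switches to class 1, which shows how the offset limits the influence of supplementary data.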

Suggested Citation

  • Sheng, Haiyang & Yu, Guan, 2023. "TNN: A transfer learning classifier based on weighted nearest neighbors," Journal of Multivariate Analysis, Elsevier, vol. 193(C).
  • Handle: RePEc:eee:jmvana:v:193:y:2023:i:c:s0047259x22001178
    DOI: 10.1016/j.jmva.2022.105126

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X22001178
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Masashi Sugiyama & Taiji Suzuki & Shinichi Nakajima & Hisashi Kashima & Paul Bünau & Motoaki Kawanabe, 2008. "Direct importance estimation for covariate shift adaptation," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 60(4), pages 699-746, December.
    2. Heckman, James, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dylan Brewer & Alyssa Carlson, 2024. "Addressing sample selection bias for machine learning methods," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 39(3), pages 383-400, April.
    2. Darima Fotheringham & Michael A. Wiles, 2023. "The effect of implementing chatbot customer service on stock returns: an event study analysis," Journal of the Academy of Marketing Science, Springer, vol. 51(4), pages 802-822, July.
    3. Song, Wei-Ling & Uzmanoglu, Cihan, 2016. "TARP announcement, bank health, and borrowers’ credit risk," Journal of Financial Stability, Elsevier, vol. 22(C), pages 22-32.
    4. Raymundo M. Campos-Vázquez, 2013. "Efectos de los ingresos no reportados en el nivel y tendencia de la pobreza laboral en México," Ensayos Revista de Economia, Universidad Autonoma de Nuevo Leon, Facultad de Economia, vol. 0(2), pages 23-54, November.
    5. Jonathan Gruber & Aaron Yelowitz, 1999. "Public Health Insurance and Private Savings," Journal of Political Economy, University of Chicago Press, vol. 107(6), pages 1249-1274, December.
    6. Campbell, Randall C. & Nagel, Gregory L., 2016. "Private information and limitations of Heckman's estimator in banking and corporate finance research," Journal of Empirical Finance, Elsevier, vol. 37(C), pages 186-195.
    7. Leye Li & Louise Yi Lu & Dongyue Wang, 2022. "External labour market competitions and stock price crash risk: evidence from exposures to competitor CEOs’ award‐winning events," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 62(S1), pages 1421-1460, April.
    8. Calcagno, R. & Renneboog, L.D.R., 2004. "Capital Structure and Managerial Compensation : The Effects of Renumeration Seniority," Discussion Paper 2004-120, Tilburg University, Center for Economic Research.
    9. Son K. Lam & Thomas E. DeCarlo & Ashish Sharma, 2019. "Salesperson ambidexterity in customer engagement: do customer base characteristics matter?," Journal of the Academy of Marketing Science, Springer, vol. 47(4), pages 659-680, July.
    10. McCausland, David & Pouliakas, Konstantinos & Theodossiou, Ioannis, 2005. "Some are Punished and Some are Rewarded: A Study of the Impact of Performance Pay on Job Satisfaction," MPRA Paper 14243, University Library of Munich, Germany.
    11. Gary F. Peters & Andrea M. Romi & Juan Manuel Sanchez, 2019. "The Influence of Corporate Sustainability Officers on Performance," Journal of Business Ethics, Springer, vol. 159(4), pages 1065-1087, November.
    12. Fossen, Frank M. & König, Johannes, 2015. "Public health insurance and entry into self-employment," VfS Annual Conference 2015 (Muenster): Economic Development - Theory and Policy 112934, Verein für Socialpolitik / German Economic Association.
    13. Li, Chunyu & Lou, Chenxin & Luo, Dan & Xing, Kai, 2021. "Chinese corporate distress prediction using LASSO: The role of earnings management," International Review of Financial Analysis, Elsevier, vol. 76(C).
    14. Fernando Rios-Avila & Gustavo Canavire-Bacarreza, 2018. "Standard-error correction in two-stage optimization models: A quasi–maximum likelihood estimation approach," Stata Journal, StataCorp LP, vol. 18(1), pages 206-222, March.
    15. Brian H. Boyer & Taylor D. Nadauld & Keith P. Vorkink & Michael S. Weisbach, 2023. "Discount‐Rate Risk in Private Equity: Evidence from Secondary Market Transactions," Journal of Finance, American Finance Association, vol. 78(2), pages 835-885, April.
    16. Kadreva, Olga, 2016. "The influence of quantity and age of children on working women’ salaries," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 41, pages 62-77.
    17. Stolowy, Hervé & Jeanjean, Thomas & Erkens, Michael, 2011. "The economic consequences of increasing the international visibility of financial reports," HEC Research Papers Series 957, HEC Paris.
    18. Bruneel, Johan & Clarysse, Bart & Bobelyn, Annelies & Wright, Mike, 2020. "Liquidity events and VC-backed academic spin-offs: The role of search alliances," Research Policy, Elsevier, vol. 49(10).
    19. Leon Zolotoy & Don O’Sullivan & Keke Song, 2021. "The Role of Ethical Standards in the Relationship Between Religious Social Norms and M&A Announcement Returns," Journal of Business Ethics, Springer, vol. 170(4), pages 721-742, May.
    20. Hans A. Holter & Dirk Krueger & Serhiy Stepanchuk, 2019. "How do tax progressivity and household heterogeneity affect Laffer curves?," Quantitative Economics, Econometric Society, vol. 10(4), pages 1317-1356, November.
