
Predictive modeling for odor character of a chemical using machine learning combined with natural language processing

Author

Listed:
  • Yuji Nozaki
  • Takamichi Nakamoto

Abstract

Recent machine learning studies have reported strong performance on some visual and auditory recognition tasks, but little has been reported in the field of olfaction. In this paper, we report computational methods to predict the odor impression of a chemical from its physicochemical properties. Our predictive model applies nonlinear dimensionality reduction to mass spectra data and clusters the descriptors by natural language processing. Sensory evaluation is widely used to measure human impressions of smell or taste by using verbal descriptors, such as “spicy” and “sweet”. However, because it requires significant amounts of time and human resources, a large-scale sensory evaluation test is difficult to perform. Our model successfully predicts a group of descriptors for a target chemical through a series of computer simulations. Although the training text data used in the language modeling is not specialized for olfaction, the experimental results show that our method is useful for analyzing sensory datasets. This is the first report to combine machine olfaction with natural language processing for odor character prediction.
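
As a rough illustration of what "clustering of descriptors by natural language processing" can look like, the sketch below groups verbal descriptors by the similarity of their word vectors. It is not the authors' implementation: the three-dimensional vectors are made-up toy values, the function names are hypothetical, and average-linkage agglomerative clustering with cosine distance is used here only as a simple stand-in for whatever language model and clustering method the paper actually uses.

```python
import math

# Hypothetical 3-D "word vectors". Real embeddings would come from a language
# model trained on a text corpus; these numbers are invented for illustration.
TOY_VECTORS = {
    "sweet":   [0.9, 0.1, 0.0],
    "fruity":  [0.8, 0.2, 0.1],
    "spicy":   [0.1, 0.9, 0.1],
    "peppery": [0.2, 0.8, 0.0],
    "woody":   [0.0, 0.1, 0.9],
    "earthy":  [0.1, 0.0, 0.8],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def cluster_descriptors(vectors, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[word] for word in sorted(vectors)]

    def linkage(a, b):
        # Mean pairwise distance between the members of clusters a and b.
        total = sum(cosine_distance(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


groups = cluster_descriptors(TOY_VECTORS, 3)
for group in groups:
    print(group)
```

With these toy vectors, descriptors that point in similar directions end up in the same group, so a model could predict one group of descriptors per chemical rather than each word separately.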

Suggested Citation

  • Yuji Nozaki & Takamichi Nakamoto, 2018. "Predictive modeling for odor character of a chemical using machine learning combined with natural language processing," PLOS ONE, Public Library of Science, vol. 13(6), pages 1-13, June.
  • Handle: RePEc:plo:pone00:0198475
    DOI: 10.1371/journal.pone.0198475

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0198475
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0198475&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Gabor J. Szekely & Maria L. Rizzo, 2005. "Hierarchical Clustering via Joint Between-Within Distances: Extending Ward's Minimum Variance Method," Journal of Classification, Springer;The Classification Society, vol. 22(2), pages 151-183, September.
    2. Yuji Nozaki & Takamichi Nakamoto, 2016. "Odor Impression Prediction from Mass Spectra," PLOS ONE, Public Library of Science, vol. 11(6), pages 1-15, June.

    Citations



    Cited by:

    1. Tanoy Debnath & Takamichi Nakamoto, 2020. "Predicting human odor perception represented by continuous values from mass spectra of essential oils resembling chemical mixtures," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-13, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jia Zhu & Xingcheng Wu & Xueqin Lin & Changqin Huang & Gabriel Pui Cheong Fung & Yong Tang, 2018. "A novel multiple layers name disambiguation framework for digital libraries using dynamic clustering," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 781-794, March.
    2. Linde, Jona & Sonnemans, Joep & Tuinstra, Jan, 2014. "Strategies and evolution in the minority game: A multi-round strategy experiment," Games and Economic Behavior, Elsevier, vol. 86(C), pages 77-95.
    3. Zdeňka Náglová & Tereza Horáková, 2017. "Position of the Bakery Enterprises in the Czech Republic According to Detailed Specification of the Businesses," Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, Mendel University Press, vol. 65(5), pages 1719-1727.
    4. Renato Amorim, 2015. "Feature Relevance in Ward’s Hierarchical Clustering Using the L p Norm," Journal of Classification, Springer;The Classification Society, vol. 32(1), pages 46-62, April.
    5. Quessy, Jean-François, 2021. "A Szekely–Rizzo inequality for testing general copula homogeneity hypotheses," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    6. Carmen C. Rodríguez-Martínez & Mitzi Cubilla-Montilla & Purificación Vicente-Galindo & Purificación Galindo-Villardón, 2023. "X-STATIS: A Multivariate Approach to Characterize the Evolution of E-Participation, from a Global Perspective," Mathematics, MDPI, vol. 11(6), pages 1-15, March.
    7. Fionn Murtagh & Pierre Legendre, 2014. "Ward’s Hierarchical Agglomerative Clustering Method: Which Algorithms Implement Ward’s Criterion?," Journal of Classification, Springer;The Classification Society, vol. 31(3), pages 274-295, October.
    8. Zdeněk Hlávka & Marie Hušková & Simos G. Meintanis, 2020. "Change-point methods for multivariate time-series: paired vectorial observations," Statistical Papers, Springer, vol. 61(4), pages 1351-1383, August.
    9. Brault, Vincent & Ouadah, Sarah & Sansonnet, Laure & Lévy-Leduc, Céline, 2018. "Nonparametric multiple change-point estimation for analyzing large Hi-C data matrices," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 143-165.
    10. Changhyeon Song & Kwangsoo Shin, 2019. "Business Model Design for Latecomers in Biopharmaceutical Industry: The Case of Korean Firms," Sustainability, MDPI, vol. 11(18), pages 1-15, September.
    11. Moon, Seongmin & Hicks, Christian & Simpson, Andrew, 2012. "The development of a hierarchical forecasting method for predicting spare parts demand in the South Korean Navy—A case study," International Journal of Production Economics, Elsevier, vol. 140(2), pages 794-802.
    12. Athanasios Constantopoulos & John Yfantopoulos & Panos Xenos & Athanassios Vozikis, 2019. "Cluster shifts based on healthcare factors: The case of Greece in an OECD background 2009-2014," Advances in Management and Applied Economics, SCIENPRESS Ltd, vol. 9(6), pages 1-4.
    13. Mantas Svazas & Valentinas Navickas & Yuriy Bilan & Joanna Nakonieczny & Jana Spankova, 2021. "Biomass Clusterization from a Regional Perspective: The Case of Lithuania," Energies, MDPI, vol. 14(21), pages 1-15, October.
    14. Rizzo, Maria L. & Haman, John T., 2016. "Expected distances and goodness-of-fit for the asymmetric Laplace distribution," Statistics & Probability Letters, Elsevier, vol. 117(C), pages 158-164.
    15. Jiang, Qing & Hušková, Marie & Meintanis, Simos G. & Zhu, Lixing, 2019. "Asymptotics, finite-sample comparisons and applications for two-sample tests with functional data," Journal of Multivariate Analysis, Elsevier, vol. 170(C), pages 202-220.
    16. Manavi, Seyed Alireza & Jafari, Gholamreza & Rouhani, Shahin & Ausloos, Marcel, 2020. "Demythifying the belief in cryptocurrencies decentralized aspects. A study of cryptocurrencies time cross-correlations with common currencies, commodities and financial indices," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 556(C).
    17. Simos G. Meintanis & Joseph Ngatchou-Wandji & James Allison, 2018. "Testing for serial independence in vector autoregressive models," Statistical Papers, Springer, vol. 59(4), pages 1379-1410, December.
    18. Nathanaël Randriamihamison & Nathalie Vialaneix & Pierre Neuvial, 2021. "Applicability and Interpretability of Ward’s Hierarchical Agglomerative Clustering With or Without Contiguity Constraints," Journal of Classification, Springer;The Classification Society, vol. 38(2), pages 363-389, July.
    19. Tanoy Debnath & Takamichi Nakamoto, 2020. "Predicting human odor perception represented by continuous values from mass spectra of essential oils resembling chemical mixtures," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-13, June.
    20. Lee, Sangyeol & Meintanis, Simos G. & Pretorius, Charl, 2022. "Monitoring procedures for strict stationarity based on the multivariate characteristic function," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
