
Semantic similarity measurement using historical Google search patterns

Authors

  • Jorge Martinez-Gil (University of Malaga)
  • José F. Aldana-Montes (University of Malaga)

Abstract

Computing the semantic similarity between terms (or short text expressions) that have the same meaning but are not lexicographically similar is an important challenge in the information integration field. Techniques for measuring textual semantic similarity often fail on words that are not covered by synonym dictionaries. In this paper, we address this problem by determining the semantic similarity of terms from the knowledge inherent in the search history logs of the Google search engine. To do this, we have designed and evaluated four algorithmic methods for measuring the semantic similarity between terms using their associated historical search patterns: a) frequent co-occurrence of terms in search patterns, b) computation of the relationship between search patterns, c) outlier coincidence in search patterns, and d) forecasting comparisons. We show experimentally that some of these methods correlate well with human judgment on general-purpose benchmark datasets and significantly outperform existing methods on datasets containing terms that do not usually appear in dictionaries.
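
As a rough illustration of methods b) and c), the sketch below compares two search-volume time series. The abstract does not give the paper's formulas, so everything here is an assumption: Pearson correlation stands in for the "relationship between search patterns", a Jaccard overlap of spike positions stands in for "outlier coincidence", and the function names, the z-score threshold, and the sample data are all hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat series carries no pattern to compare
    return cov / sqrt(var_x * var_y)


def outlier_positions(xs, z=1.5):
    """Indices whose value lies more than z population std devs above the mean."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        return set()
    return {i for i, x in enumerate(xs) if (x - mu) / sigma > z}


def outlier_coincidence(xs, ys, z=1.5):
    """Jaccard overlap of the spike positions of two series (assumed measure)."""
    a, b = outlier_positions(xs, z), outlier_positions(ys, z)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up weekly search volumes, for illustration only.
term_a = [10, 12, 11, 50, 13, 12, 11, 48, 12, 10]
term_b = [20, 22, 21, 95, 24, 23, 20, 90, 22, 21]
term_c = [30, 5, 40, 8, 35, 6, 38, 7, 33, 9]

print("corr(a, b):", pearson(term_a, term_b))
print("outlier overlap(a, b):", outlier_coincidence(term_a, term_b))
print("corr(a, c):", pearson(term_a, term_c))
```

In this toy data, term_a and term_b spike in the same weeks, so both scores are high, while term_c follows an unrelated rhythm. Real search-volume series would first need to be normalised and aligned to the same time window before such a comparison is meaningful.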

Suggested Citation

  • Jorge Martinez-Gil & José F. Aldana-Montes, 2013. "Semantic similarity measurement using historical Google search patterns," Information Systems Frontiers, Springer, vol. 15(3), pages 399-410, July.
  • Handle: RePEc:spr:infosf:v:15:y:2013:i:3:d:10.1007_s10796-012-9404-7
    DOI: 10.1007/s10796-012-9404-7

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10796-012-9404-7
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Zhang, Guoqiang & Eddy Patuwo, B. & Y. Hu, Michael, 1998. "Forecasting with artificial neural networks: The state of the art," International Journal of Forecasting, Elsevier, vol. 14(1), pages 35-62, March.
    2. Angelos Hliaoutakis & Giannis Varelas & Epimenidis Voutsakis & Euripides G.M. Petrakis & Evangelos Milios, 2006. "Information Retrieval by Semantic Similarity," International Journal on Semantic Web and Information Systems (IJSWIS), IGI Global, vol. 2(3), pages 55-73, July.
    3. Jiexun Li & G. Alan Wang & Hsinchun Chen, 2011. "Identity matching using personal and social identity features," Information Systems Frontiers, Springer, vol. 13(1), pages 101-113, March.
    4. Leo Egghe & Loet Leydesdorff, 2009. "The relation between Pearson's correlation coefficient r and Salton's cosine measure," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 60(5), pages 1027-1036, May.
    5. Silke Retzer & Pak Yoong & Val Hooper, 2012. "Inter-organisational knowledge transfer in social networks: A definition of intermediate ties," Information Systems Frontiers, Springer, vol. 14(2), pages 343-361, April.

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Jorge Martinez-Gil & Alejandra Lorena Paoletti & Mario Pichler, 0. "A Novel Approach for Learning How to Automatically Match Job Offers and Candidate Profiles," Information Systems Frontiers, Springer, vol. 0, pages 1-10.
    2. Jorge Martinez-Gil & Alejandra Lorena Paoletti & Mario Pichler, 2020. "A Novel Approach for Learning How to Automatically Match Job Offers and Candidate Profiles," Information Systems Frontiers, Springer, vol. 22(6), pages 1265-1274, December.
    3. Lin-Chih Chen, 2021. "Interactive Topic Search System Based on Topic Cluster Technology," Information Systems Frontiers, Springer, vol. 23(5), pages 1227-1243, September.
    4. Malu Castellanos & Florian Daniel & Irene Garrigós & Jose-Norberto Mazón, 2013. "Business Intelligence and the Web," Information Systems Frontiers, Springer, vol. 15(3), pages 307-309, July.
    5. Rolando Quintero & Miguel Torres-Ruiz & Magdalena Saldaña-Pérez & Carlos Guzmán Sánchez-Mejorada & Felix Mata-Rivera, 2023. "A Conceptual Graph-Based Method to Compute Information Content," Mathematics, MDPI, vol. 11(18), pages 1-22, September.
    6. Lin-Chih Chen, 0. "Interactive Topic Search System Based on Topic Cluster Technology," Information Systems Frontiers, Springer, vol. 0, pages 1-17.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chulhwan Chris Bang, 2015. "Information systems frontiers: Keyword analysis and classification," Information Systems Frontiers, Springer, vol. 17(1), pages 217-237, February.
    2. Ghiassi, M. & Saidane, H. & Zimbra, D.K., 2005. "A dynamic artificial neural network model for forecasting time series events," International Journal of Forecasting, Elsevier, vol. 21(2), pages 341-362.
    3. Barrow, Devon K., 2016. "Forecasting intraday call arrivals using the seasonal moving average method," Journal of Business Research, Elsevier, vol. 69(12), pages 6088-6096.
    4. Jani, D.B. & Mishra, Manish & Sahoo, P.K., 2017. "Application of artificial neural network for predicting performance of solid desiccant cooling systems – A review," Renewable and Sustainable Energy Reviews, Elsevier, vol. 80(C), pages 352-366.
    5. Nataša Glišović & Miloš Milenković & Nebojša Bojović & Libor Švadlenka & Zoran Avramović, 2016. "A hybrid model for forecasting the volume of passenger flows on Serbian railways," Operational Research, Springer, vol. 16(2), pages 271-285, July.
    6. Christian Fieberg & Daniel Metko & Thorsten Poddig & Thomas Loy, 2023. "Machine learning techniques for cross-sectional equity returns’ prediction," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 45(1), pages 289-323, March.
    7. Szafranek, Karol, 2019. "Bagged neural networks for forecasting Polish (low) inflation," International Journal of Forecasting, Elsevier, vol. 35(3), pages 1042-1059.
    8. Sangseop Lim & Chang-hee Lee & Won-Ju Lee & Junghwan Choi & Dongho Jung & Younghun Jeon, 2022. "Valuation of the Extension Option in Time Charter Contracts in the LNG Market," Energies, MDPI, vol. 15(18), pages 1-14, September.
    9. Bontempi, Gianluca & Ben Taieb, Souhaib, 2011. "Conditionally dependent strategies for multiple-step-ahead prediction in local learning," International Journal of Forecasting, Elsevier, vol. 27(3), pages 689-699, July.
    10. Huber, Jakob & Stuckenschmidt, Heiner, 2020. "Daily retail demand forecasting using machine learning with emphasis on calendric special days," International Journal of Forecasting, Elsevier, vol. 36(4), pages 1420-1438.
    11. Carlo Fezzi & Luca Mosetti, 2018. "Size matters: Estimation sample length and electricity price forecasting accuracy," DEM Working Papers 2018/10, Department of Economics and Management.
    12. Van Belle, Jente & Guns, Tias & Verbeke, Wouter, 2021. "Using shared sell-through data to forecast wholesaler demand in multi-echelon supply chains," European Journal of Operational Research, Elsevier, vol. 288(2), pages 466-479.
    13. Georg Groh & Christoph Fuchs, 2011. "Multi-modal social networks for modeling scientific fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 89(2), pages 569-590, November.
    14. Roman Matkovskyy & Taoufik Bouraoui, 2019. "Application of Neural Networks to Short Time Series Composite Indexes: Evidence from the Nonlinear Autoregressive with Exogenous Inputs (NARX) Model," Journal of Quantitative Economics, Springer;The Indian Econometric Society (TIES), vol. 17(2), pages 433-446, June.
    15. Ye, Yuan & Lu, Yonggang & Robinson, Powell & Narayanan, Arunachalam, 2022. "An empirical Bayes approach to incorporating demand intermittency and irregularity into inventory control," European Journal of Operational Research, Elsevier, vol. 303(1), pages 255-272.
    16. CIOBANU Dumitru & BAR Mary Violeta, 2013. "On The Prediction Of Exchange Rate Dollar/Euro With An Svm Model," Revista Economica, Lucian Blaga University of Sibiu, Faculty of Economic Sciences, vol. 65(2), pages 91-109.
    17. Chenghao Zhong & Wengao Lou & Yongzeng Lai, 2023. "A Projection Pursuit Dynamic Cluster Model for Tourism Safety Early Warning and Its Implications for Sustainable Tourism," Mathematics, MDPI, vol. 11(24), pages 1-17, December.
    18. Th I Götz & G Lahmer & V Strnad & Ch Bert & B Hensel & A M Tomé & E W Lang, 2017. "A tool to automatically analyze electromagnetic tracking data from high dose rate brachytherapy of breast cancer patients," PLOS ONE, Public Library of Science, vol. 12(9), pages 1-31, September.
    19. Nastac, Iulian & Dobrescu, Emilian & Pelinescu, Elena, 2007. "Neuro-Adaptive Model for Financial Forecasting," Journal for Economic Forecasting, Institute for Economic Forecasting, vol. 4(3), pages 19-41, September.
    20. Joo, Rocío & Bertrand, Sophie & Chaigneau, Alexis & Ñiquen, Miguel, 2011. "Optimization of an artificial neural network for identifying fishing set positions from VMS data: An example from the Peruvian anchovy purse seine fishery," Ecological Modelling, Elsevier, vol. 222(4), pages 1048-1059.
