Improving statistical keyword detection in short texts: Entropic and clustering approaches
DOI: 10.1016/j.physa.2012.11.052
References listed on IDEAS
- Zhou, Hongding & Slater, Gary W., 2003. "A metric to search for relevant words," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 329(1), pages 309-327.
- Mehri, Ali & Darooneh, Amir H., 2011. "The role of entropy in word ranking," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 390(18), pages 3157-3163.
- David J. Hand & Heikki Mannila & Padhraic Smyth, 2001. "Principles of Data Mining," MIT Press Books, The MIT Press, edition 1, volume 1, number 026208290x, April.
- Marcelo A. Montemurro & Damián H. Zanette, 2010. "Towards The Quantification Of The Semantic Information Encoded In Written Language," Advances in Complex Systems (ACS), World Scientific Publishing Co. Pte. Ltd., vol. 13(02), pages 135-153.
- J. P. Herrera & P. A. Pury, 2008. "Statistical keyword detection in literary corpora," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 63(1), pages 135-146, May.
Citations
Cited by:
- de Arruda, Henrique F. & Marinho, Vanessa Q. & Lima, Thales S. & Amancio, Diego R. & Costa, Luciano da F., 2018. "An image analysis approach to text analytics based on complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 510(C), pages 110-120.
- Silva, Filipi N. & Amancio, Diego R. & Bardosova, Maria & Costa, Luciano da F. & Oliveira, Osvaldo N., 2016. "Using network science and text analytics to produce surveys in a scientific topic," Journal of Informetrics, Elsevier, vol. 10(2), pages 487-502.
- Bian, Tian & Hu, Jiantao & Deng, Yong, 2017. "Identifying influential nodes in complex networks based on AHP," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 479(C), pages 422-436.
- Mehri, Ali & Agahi, Hamzeh & Mehri-Dehnavi, Hossein, 2019. "A novel word ranking method based on distorted entropy," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 521(C), pages 484-492.
- Jamaati, Maryam & Mehri, Ali, 2018. "Text mining by Tsallis entropy," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 490(C), pages 1368-1376.
- Diego R Amancio, 2015. "Probing the Topological Properties of Complex Networks Modeling Short Written Texts," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-17, February.
- Ma, Tinghuai & Li, Jing & Liang, Xinnian & Tian, Yuan & Al-Dhelaan, Abdullah & Al-Dhelaan, Mohammed, 2019. "A time-series based aggregation scheme for topic detection in Weibo short texts," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 536(C).
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Mehri, Ali & Agahi, Hamzeh & Mehri-Dehnavi, Hossein, 2019. "A novel word ranking method based on distorted entropy," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 521(C), pages 484-492.
- Jamaati, Maryam & Mehri, Ali, 2018. "Text mining by Tsallis entropy," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 490(C), pages 1368-1376.
- Marcelo A Montemurro & Damián H Zanette, 2013. "Keywords and Co-Occurrence Patterns in the Voynich Manuscript: An Information-Theoretic Analysis," PLOS ONE, Public Library of Science, vol. 8(6), pages 1-9, June.
- Le, Hong Hanh & Viviani, Jean-Laurent, 2018. "Predicting bank failure: An improvement by implementing a machine-learning approach to classical financial ratios," Research in International Business and Finance, Elsevier, vol. 44(C), pages 16-25.
- Hong Hanh Le & Jean-Laurent Viviani, 2018. "Predicting bank failure: An improvement by implementing machine learning approach on classical financial ratios," Post-Print halshs-01615106, HAL.
- Li, Hui & Sun, Jie, 2009. "Hybridizing principles of the Electre method with case-based reasoning for data mining: Electre-CBR-I and Electre-CBR-II," European Journal of Operational Research, Elsevier, vol. 197(1), pages 214-224, August.
- Min-feng Lee & Guey-shya Chen & Shao-pin Lin & Wei-jie Wang, 2022. "A Data Mining Study on House Price in Central Regions of Taiwan Using Education Categorical Data, Environmental Indicators, and House Features Data," Sustainability, MDPI, vol. 14(11), pages 1-15, May.
- Caruso, Germán & Scartascini, Carlos & Tommasi, Mariano, 2015. "Are we all playing the same game? The economic effects of constitutions depend on the degree of institutionalization," European Journal of Political Economy, Elsevier, vol. 38(C), pages 212-228.
- German Caruso & Carlos Scartascini & Mariano Tommasi, 2013. "Are We All Playing the Same Game? The Economic Effects of Constitutions Depend on the Degree of Institutionalization," Research Department Publications IDB-WP-237, Inter-American Development Bank, Research Department.
- Tommasi, Mariano & Scartascini, Carlos & Caruso, Germán, 2013. "Are We All Playing the Same Game?: The Economic Effects of Constitutions Depend on the Degree of Institutionalization," IDB Publications (Working Papers) 4612, Inter-American Development Bank.
- M. Almiñana & L. Escudero & A. Pérez-Martín & A. Rabasa & L. Santamaría, 2014. "A classification rule reduction algorithm based on significance domains," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 22(1), pages 397-418, April.
- Silvia Figini & Ron Kenett & Silvia Salini, 2010. "Integrating Operational and Financial Risk Assessments," UNIMI - Research Papers in Economics, Business, and Statistics unimi-1099, Università degli Studi di Milano.
- Silvia FIGINI & Ron S. KENETT & Silvia SALINI, 2010. "Integrating operational and financial risk assessments," Departmental Working Papers 2010-02, Department of Economics, Management and Quantitative Methods at Università degli Studi di Milano.
- Onur Doğan & Hakan Aşan & Ejder Ayç, 2015. "Use Of Data Mining Techniques In Advance Decision Making Processes In A Local Firm," European Journal of Business and Economics, Central Bohemia University, vol. 10(2), pages 6821:10-682, January.
- Diego R Amancio, 2015. "Probing the Topological Properties of Complex Networks Modeling Short Written Texts," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-17, February.
- Patricia E. N. Lutu & Andries P. Engelbrecht, 2013. "Base Model Combination Algorithm for Resolving Tied Predictions for K -Nearest Neighbor OVA Ensemble Models," INFORMS Journal on Computing, INFORMS, vol. 25(3), pages 517-526, August.
- Adrien Jamain & David Hand, 2008. "Mining Supervised Classification Performance Studies: A Meta-Analytic Investigation," Journal of Classification, Springer;The Classification Society, vol. 25(1), pages 87-112, June.
- Usó-Doménech, J.L. & Nescolarde-Selva, J.A. & Lloret-Climent, M. & Gash, H., 2016. "Semantics of language for ecosystems modelling: A model case," Ecological Modelling, Elsevier, vol. 328(C), pages 85-94.
- Mehri, Ali & Jamaati, Maryam, 2021. "Statistical metrics for languages classification: A case study of the Bible translations," Chaos, Solitons & Fractals, Elsevier, vol. 144(C).
- Adrian Otoiu & Emilia Titan, 2014. "An Alternative Method of Component Aggregation for Computing Multidimensional Well-Being Indicators," International Journal of Economic Sciences, Prague University of Economics and Business, vol. 2014(4), pages 38-52.
- Adrian Otoiu & Emilia Titan, 2014. "An Alternative Method of Component Aggregation for Computing Multidimensional Well-Being Indicators," Proceedings of International Academic Conferences 0802491, International Institute of Social and Economic Sciences.
- Wang, Wenjun & Liu, Dong & Liu, Xiao & Pan, Lin, 2013. "Fuzzy overlapping community detection based on local random walk and multidimensional scaling," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(24), pages 6578-6586.
- Yi-Chen Chung & Hsien-Ming Chou & Chih-Neng Hung & Chihli Hung, 2021. "Using Textual and Economic Features to Predict the RMB Exchange Rate," Advances in Management and Applied Economics, SCIENPRESS Ltd, vol. 11(6), pages 1-8.
- Chen-Yang Cheng, 2014. "Indoor localization algorithm using clustering on signal and coordination pattern," Annals of Operations Research, Springer, vol. 216(1), pages 83-99, May.
- Christmann, Andreas & Steinwart, Ingo & Hubert, Mia, 2006. "Robust Learning from Bites for Data Mining," Technical Reports 2006,03, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
More about this item
Keywords
Keyword detection; Linguistic; Statistical analysis; Entropy
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:phsmap:v:392:y:2013:i:6:p:1481-1492. See general information about how to correct material in RePEc.