An intelligent model based on integrated inverse document frequency and multinomial Naive Bayes for current affairs news categorisation

Author

Listed:

Sachin Kumar
(University of University)
Aditya Sharma
(University of University)
B Kartheek Reddy
(University of University)
Shreyas Sachan
(University of University)
Vaibhav Jain
(University of University)
Jagvinder Singh
(Delhi Technological University)

Abstract

Digital technologies and their products and services have empowered the masses to generate information at a faster pace. Information-sharing platforms built on these technologies, such as news websites and social media platforms including Facebook, Twitter, Instagram and WhatsApp, have flooded the information space because information is easy to generate and can be disseminated to the masses instantly. Information classification has long been an important task, especially in newspapers and media organisations. In other areas too, text classification plays an important role in sorting vital information into predefined categories. In journalism, editors and resource persons were assigned the task of recognising and classifying news stories so that they could be placed in predefined categories such as economy and business news, political news, social news, editorial, education and career, and sports. Nowadays the classification and segregation of textual information has become challenging because of the flow of diverse and vast information. The pace at which information is updated and accessed, together with competition among media houses, has made it more challenging still. Hence automated and intelligent tools that can classify information and text accurately and efficiently are needed to reduce human effort and time and to increase productivity. This paper presents an efficient and robust intelligent machine learning model based on Multinomial Naive Bayes (MNB) to classify current affairs news stories. The proposed Inverse Document Frequency (IDF) integrated MNB model achieves a classification accuracy of 87.22 per cent. The experimental results are also compared with other machine learning models such as Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbours (KNN) and Random Forest (RF). The results demonstrate that the presented model is better in terms of accuracy and may be deployed in real-world information classification and the media domain to improve the productivity and efficiency of the current affairs news classification process.
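
The abstract does not give implementation details, so the following is only a minimal sketch of what an IDF-weighted Multinomial Naive Bayes classifier could look like, written in plain Python. The class name `IdfMultinomialNB`, the whitespace tokenizer, the smoothed IDF formula, the smoothing constant `alpha` and the toy headlines are all assumptions made for illustration; they are not taken from the paper and do not reproduce its dataset or its reported accuracy.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (assumed; real pipelines do more)."""
    return text.lower().split()


class IdfMultinomialNB:
    """Hypothetical Multinomial Naive Bayes over IDF-weighted term counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing constant

    def fit(self, docs, labels):
        tokenized = [tokenize(d) for d in docs]
        n_docs = len(tokenized)
        # Document frequency: number of documents that contain each term.
        df = Counter(t for toks in tokenized for t in set(toks))
        # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1, always positive.
        self.idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
        vocab = sorted(self.idf)
        # Sum IDF-weighted term counts separately for each class.
        weights = defaultdict(lambda: defaultdict(float))
        for toks, y in zip(tokenized, labels):
            for t, c in Counter(toks).items():
                weights[y][t] += c * self.idf[t]
        class_docs = Counter(labels)
        self.classes = sorted(class_docs)
        self.log_prior = {y: math.log(class_docs[y] / n_docs) for y in self.classes}
        # Smoothed log-likelihood of each term given each class.
        self.log_lik = {}
        for y in self.classes:
            total = sum(weights[y].values()) + self.alpha * len(vocab)
            self.log_lik[y] = {
                t: math.log((weights[y].get(t, 0.0) + self.alpha) / total)
                for t in vocab
            }
        return self

    def predict(self, doc):
        # Terms never seen during training are ignored.
        counts = Counter(t for t in tokenize(doc) if t in self.idf)
        scores = {
            y: self.log_prior[y]
            + sum(c * self.idf[t] * self.log_lik[y][t] for t, c in counts.items())
            for y in self.classes
        }
        return max(scores, key=scores.get)


# Toy headlines invented for this example only.
train_docs = [
    "parliament passes budget bill after debate",
    "minister announces election date",
    "team wins cricket match in final over",
    "striker scores twice in football final",
    "stock market rallies as inflation cools",
    "central bank raises interest rates",
]
train_labels = ["politics", "politics", "sports", "sports", "business", "business"]

model = IdfMultinomialNB().fit(train_docs, train_labels)
prediction = model.predict("bank cuts interest rates")
print(prediction)
```

Weighting counts by IDF before estimating the class-conditional term probabilities reduces the influence of words that occur in many documents, which is the usual motivation for combining IDF with MNB. A comparison against LR, SVM, KNN and RF, as the abstract describes, would need a labelled news corpus and a held-out test split, neither of which is shown here.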

Suggested Citation

Sachin Kumar & Aditya Sharma & B Kartheek Reddy & Shreyas Sachan & Vaibhav Jain & Jagvinder Singh, 2022. "An intelligent model based on integrated inverse document frequency and multinomial Naive Bayes for current affairs news categorisation," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(3), pages 1341-1355, June.

Handle: RePEc:spr:ijsaem:v:13:y:2022:i:3:d:10.1007_s13198-021-01471-7
DOI: 10.1007/s13198-021-01471-7

References listed on IDEAS

Salminen, Joni & Yoganathan, Vignesh & Corporan, Juan & Jansen, Bernard J. & Jung, Soon-Gyo, 2019. "Machine learning approach to auto-tagging online content for content marketing efficiency: A comparative analysis between methods and content type," Journal of Business Research, Elsevier, vol. 101(C), pages 203-217.
Tarek Kanan & Edward A. Fox, 2016. "Automated arabic text classification with P-Stemmer, machine learning, and a tailored news article taxonomy," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 67(11), pages 2667-2683, November.
Sachin Kumar & Jagvinder Singh & Ompal Singh, 2020. "Ensemble-based extreme learning machine model for occupancy detection with ambient attributes," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 11(2), pages 173-183, July.
Joachims, Thorsten, 1998. "Making large-scale SVM learning practical," Technical Reports 1998,28, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.

Citations

Cited by:

Sanjiban Sekhar Roy & Ali Ismail Awad & Lamesgen Adugnaw Amare & Mabrie Tesfaye Erkihun & Mohd Anas, 2022. "Multimodel Phishing URL Detection Using LSTM, Bidirectional LSTM, and GRU Models," Future Internet, MDPI, vol. 14(11), pages 1-15, November.
Sachin Kumar & Zairu Nisha & Jagvinder Singh & Anuj Kumar Sharma, 2022. "Sensor network driven novel hybrid model based on feature selection and SVR to predict indoor temperature for energy consumption optimisation in smart buildings," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(6), pages 3048-3061, December.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Sachin Kumar & Shivam Panwar & Jagvinder Singh & Anuj Kumar Sharma & Zairu Nisha, 2022. "iCACD: an intelligent deep learning model to categorise current affairs news article for efficient journalistic process," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(5), pages 2572-2582, October.
Luca Zanni, 2006. "An Improved Gradient Projection-based Decomposition Technique for Support Vector Machines," Computational Management Science, Springer, vol. 3(2), pages 131-145, April.
Peng Han & Xinyue Yang & Yifei Zhao & Xiangmin Guan & Shengjie Wang, 2022. "Quantitative Ground Risk Assessment for Urban Logistical Unmanned Aerial Vehicle (UAV) Based on Bayesian Network," Sustainability, MDPI, vol. 14(9), pages 1-13, May.
Sachin Kumar & Zairu Nisha & Jagvinder Singh & Anuj Kumar Sharma, 2022. "Sensor network driven novel hybrid model based on feature selection and SVR to predict indoor temperature for energy consumption optimisation in smart buildings," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(6), pages 3048-3061, December.
Andrej Čopar & Blaž Zupan & Marinka Zitnik, 2019. "Fast optimization of non-negative matrix tri-factorization," PLOS ONE, Public Library of Science, vol. 14(6), pages 1-15, June.
Andrea Manno & Laura Palagi & Simone Sagratella, 2018. "Parallel decomposition methods for linearly constrained problems subject to simple bound with application to the SVMs training," Computational Optimization and Applications, Springer, vol. 71(1), pages 115-145, September.
Mustak, Mekhail & Salminen, Joni & Plé, Loïc & Wirtz, Jochen, 2021. "Artificial intelligence in marketing: Topic modeling, scientometric analysis, and research agenda," Journal of Business Research, Elsevier, vol. 124(C), pages 389-404.
Tianrui Yin & Wei Chen & Bo Liu & Changzhen Li & Luyao Du, 2023. "Light “You Only Look Once”: An Improved Lightweight Vehicle-Detection Model for Intelligent Vehicles under Dark Conditions," Mathematics, MDPI, vol. 12(1), pages 1-19, December.
Ma, Ji, 2020. "Automated coding using machine-learning and remapping the U.S. nonprofit sector: A guide and benchmark," OSF Preprints pt3q9_v1, Center for Open Science.
Prabowo, Rudy & Thelwall, Mike, 2009. "Sentiment analysis: A combined approach," Journal of Informetrics, Elsevier, vol. 3(2), pages 143-157.
Giampaolo Liuzzi & Laura Palagi & Mauro Piacentini, 2010. "On the convergence of a Jacobi-type algorithm for Singly Linearly-Constrained Problems Subject to simple Bounds," DIS Technical Reports 2010-01, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
Yu Bian & Hao Chen & Zujian Liu & Ling Chen & Ya Guo & Yongpeng Yang, 2024. "Geological Disaster Susceptibility Evaluation Using Machine Learning: A Case Study of the Atal Tunnel in Tibetan Plateau," Sustainability, MDPI, vol. 16(11), pages 1-23, May.
Li, Tao & Liu, Xiangyu & Li, Guannan & Wang, Xing & Ma, Jiangqiaoyu & Xu, Chengliang & Mao, Qianjun, 2024. "A systematic review and comprehensive analysis of building occupancy prediction," Renewable and Sustainable Energy Reviews, Elsevier, vol. 193(C).
Farah Mohammad & Saad Al Ahmadi, 2023. "Alzheimer’s Disease Prediction Using Deep Feature Extraction and Optimization," Mathematics, MDPI, vol. 11(17), pages 1-17, August.
Wang, Yongqiang & Huang, Donghua & Sun, Kexin & Shen, Hongzheng & Xing, Xuguang & Liu, Xiao & Ma, Xiaoyi, 2023. "Multiobjective optimization of regional irrigation and nitrogen schedules by using the CERES-Maize model with crop parameters determined from the remotely sensed leaf area index," Agricultural Water Management, Elsevier, vol. 286(C).
Santiago Carbo-Valverde & Pedro Cuadros-Solas & Francisco Rodríguez-Fernández, 2020. "A machine learning approach to the digitalization of bank customers: Evidence from random and causal forests," PLOS ONE, Public Library of Science, vol. 15(10), pages 1-39, October.
Ngai, Eric W.T. & Wu, Yuanyuan, 2022. "Machine learning in marketing: A literature review, conceptual framework, and research agenda," Journal of Business Research, Elsevier, vol. 145(C), pages 35-48.
Joni Salminen & Mekhail Mustak & Muhammad Sufyan & Bernard J. Jansen, 2023. "How can algorithms help in segmenting users and customers? A systematic review and research agenda for algorithmic customer segmentation," Journal of Marketing Analytics, Palgrave Macmillan, vol. 11(4), pages 677-692, December.
Anabel Guzmán Ordóñez & Francisco Javier Arroyo Cañada & Emmanuel Lasso & Javier A. Sánchez-Torres & Manuela Escobar-Sierra, 2024. "Analytical model to measure the effectiveness of content marketing on Twitter: the case of governorates in Colombia," Journal of Marketing Analytics, Palgrave Macmillan, vol. 12(4), pages 962-978, December.
Ayat Zaki Ahmed & Manuel Rodríguez Díaz, 2022. "A Methodology for Machine-Learning Content Analysis to Define the Key Labels in the Titles of Online Customer Reviews with the Rating Evaluation," Sustainability, MDPI, vol. 14(15), pages 1-31, July.

More about this item

Keywords

News Articles; Classification; Intelligent Methods; Machine Learning; Support Vector Machine; Multinomial Naive Bayes; Inverse Document Frequency (IDF)
