Printed from https://ideas.repec.org/a/gam/jsusta/v8y2016i3p239-d65021.html

Emerging Pattern-Based Clustering of Web Users Utilizing a Simple Page-Linked Graph

Authors

  • Xiuming Yu

    (Database/Bioinformatics Laboratory, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, Chungbuk 28644, Korea)

  • Meijing Li

    (Database/Bioinformatics Laboratory, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, Chungbuk 28644, Korea)

  • Kyung Ah Kim

    (Department of Biomedical Engineering, College of Medicine, Chungbuk National University, Cheongju, Chungbuk 28644, Korea)

  • Jimoon Chung

    (Namseoul University, Computer Science, Seoul 331-707, Korea)

  • Keun Ho Ryu

    (Database/Bioinformatics Laboratory, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, Chungbuk 28644, Korea)

Abstract

Web usage mining is a popular research area in data mining. With the extensive use of the Internet, it is essential to learn which web pages users favor and to cluster web users in order to understand the structural patterns of their usage behavior. In this paper, we propose an efficient approach that determines favorite web pages by generating large web pages and the emerging patterns of the generated simple page-linked graphs. We identify the favorite web pages of each user by eliminating noise due to overall popular pages, and we cluster web users according to the generated emerging patterns. Afterwards, we label the clusters by using Term Frequency-Inverse Document Frequency (TF-IDF). In the experiments, we evaluate the parameters used in our proposed approach, discuss their effect on generating emerging patterns, and analyze the results of clustering web users. The experimental results show that the exact patterns generated in the emerging-pattern step eliminate the need to consider noise pages and, consequently, that this step can improve the efficiency of subsequent mining tasks. Our proposed approach is capable of clustering web users from web log data.
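
The cluster-labeling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the textbook TF-IDF weighting (term frequency times the log of inverse document frequency), treats each cluster as one "document" whose "terms" are visited page names, and uses a hypothetical function name `tfidf_labels` and made-up toy data.

```python
import math
from collections import Counter


def tfidf_labels(clusters, top_k=1):
    """Label each cluster with its top_k highest-scoring pages.

    clusters: dict mapping a cluster id to a list of visited page
    names (repeats allowed). Hypothetical helper for illustration.
    """
    n_clusters = len(clusters)

    # Document frequency: in how many clusters does each page occur?
    doc_freq = Counter()
    for pages in clusters.values():
        doc_freq.update(set(pages))

    labels = {}
    for cluster_id, pages in clusters.items():
        counts = Counter(pages)
        total = len(pages)
        # Standard TF-IDF: (count / total) * log(N / df).
        scores = {
            page: (count / total) * math.log(n_clusters / doc_freq[page])
            for page, count in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda p: (-scores[p], p))
        labels[cluster_id] = ranked[:top_k]
    return labels


# Toy data (assumed, not taken from the paper).
clusters = {
    "c1": ["home", "home", "sports", "sports", "sports", "news"],
    "c2": ["home", "home", "movies", "movies", "news"],
    "c3": ["home", "shopping", "shopping", "shopping"],
}
labels = tfidf_labels(clusters)
print(labels)
```

Under this weighting, a page that appears in every cluster (here `home`) receives a score of zero, so it never becomes a label; the paper's own noise-elimination step is based on emerging patterns and may behave differently.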

Suggested Citation

  • Xiuming Yu & Meijing Li & Kyung Ah Kim & Jimoon Chung & Keun Ho Ryu, 2016. "Emerging Pattern-Based Clustering of Web Users Utilizing a Simple Page-Linked Graph," Sustainability, MDPI, vol. 8(3), pages 1-18, March.
  • Handle: RePEc:gam:jsusta:v:8:y:2016:i:3:p:239-:d:65021

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/8/3/239/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/8/3/239/
    Download Restriction: no


    Cited by:

    1. Alessandro Massaro & Daniele Giannone & Vitangelo Birardi & Angelo Maurizio Galiano, 2021. "An Innovative Approach for the Evaluation of the Web Page Impact Combining User Experience and Neural Network Score," Future Internet, MDPI, vol. 13(6), pages 1-21, May.
    2. Ziyun Deng & Tingqin He, 2018. "A Method for Filtering Pages by Similarity Degree based on Dynamic Programming," Future Internet, MDPI, vol. 10(12), pages 1-12, December.
    3. Dongwook Kim & Sungbum Kim, 2017. "The Role of Mobile Technology in Tourism: Patents, Articles, News, and Mobile Tour App Reviews," Sustainability, MDPI, vol. 9(11), pages 1-45, November.
    4. Xiaoli Wang & Yun Liu & Yanbing Ju, 2018. "Sustainable Public Procurement Policies on Promoting Scientific and Technological Innovation in China: Comparisons with the U.S., the UK, Japan, Germany, France, and South Korea," Sustainability, MDPI, vol. 10(7), pages 1-27, June.
    5. Isaac Machorro-Cano & Ingrid Aylin Ríos-Méndez & José Antonio Palet-Guzmán & Nidia Rodríguez-Mazahua & Lisbeth Rodríguez-Mazahua & Giner Alor-Hernández & José Oscar Olmedo-Aguirre, 2023. "Medical Opinions Analysis about the Decrease of Autopsies Using Emerging Pattern Mining," Data, MDPI, vol. 9(1), pages 1-14, December.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.