IDEAS home Printed from https://ideas.repec.org/a/gam/jsusta/v6y2014i4p1896-1912d34877.html

Using Web Crawler Technology for Geo-Events Analysis: A Case Study of the Huangyan Island Incident

Author

Listed:
  • Hao Hu

    (School of Geography, Beijing Normal University, 100875 Beijing, China)

  • Yuejing Ge

    (School of Geography, Beijing Normal University, 100875 Beijing, China)

  • Dongyang Hou

    (School of Environment Science and Spatial Informatics, China University of Mining and Technology, Xuzhou 221116, China)

Abstract

Social networking and network socialization bring abundant text information and social relationships into our daily lives. Making full use of these data in the big data era is of great significance for better understanding the changing world and the information-based society. Although politics has been integrally involved in hyperlinked world issues since the 1990s, the text analysis and data visualization of geo-events have been bottlenecked by traditional manual analysis. Although the automatic assembly of different geospatial web and distributed geospatial information systems utilizing service chaining has been explored and built recently, data mining and information collection are still not comprehensive enough because of the sensitivity, complexity, relativity, timeliness, and unexpected nature of political events. Based on the Heritrix framework and the analysis of web-based text, the word frequency, sentiment tendency, and dissemination path of the Huangyan Island incident were studied using web crawler technology and text analysis. The results indicate that the tag cloud, frequency map, attitude pie chart, individual mention ratios, and dissemination flow graph, built from the crawled information and data processing, not only highlight the characteristics of the geo-event itself but also point to many interesting phenomena and deep-seated problems behind it, such as related topics, theme vocabularies, subject contents, hot countries, event bodies, opinion leaders, high-frequency vocabularies, information sources, semantic structure, propagation paths, the distribution of different attitudes, and regional differences in net citizens’ responses to the Huangyan Island incident. Furthermore, text analysis of network information with the help of a focused web crawler can express the time-space relationships of the crawled information and the information characteristics of the semantic network of the geo-event. Therefore, it is a useful tool for collecting information to understand the formation and diffusion of web-based public opinion in political events.

Suggested Citation

  • Hao Hu & Yuejing Ge & Dongyang Hou, 2014. "Using Web Crawler Technology for Geo-Events Analysis: A Case Study of the Huangyan Island Incident," Sustainability, MDPI, vol. 6(4), pages 1-17, April.
  • Handle: RePEc:gam:jsusta:v:6:y:2014:i:4:p:1896-1912:d:34877

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/6/4/1896/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/6/4/1896/
    Download Restriction: no

    Citations


    Cited by:

    1. Yuting Sun & Shu-Nung Yao, 2022. "Sustainability Trade-Offs in Media Coverage of Poverty Alleviation: A Content-Based Spatiotemporal Analysis in China’s Provinces," Sustainability, MDPI, vol. 14(16), pages 1-26, August.
    2. Wu, Zezhou & He, Qiufeng & Li, Jiarun & Bi, Guoqiang & Antwi-Afari, Maxwell Fordjour, 2023. "Public attitudes and sentiments towards new energy vehicles in China: A text mining approach," Renewable and Sustainable Energy Reviews, Elsevier, vol. 178(C).
    3. Xingchen Lv & Jun Meng & Qiufeng Wu, 2022. "Dynamic Influence of Network Public Opinions on Price Fluctuation of Small Agricultural Products Based on NLP-TVP-VAR Model—Taking Garlic as an Example," Sustainability, MDPI, vol. 14(14), pages 1-21, July.
    4. Yingxia Xue & Honglei Liu, 2023. "Exploration of the Dynamic Evolution of Online Public Opinion towards Waste Classification in Shanghai," IJERPH, MDPI, vol. 20(2), pages 1-15, January.
    5. Sijing Liu & Jiuping Xu & Xiaoyuan Shi & Guoqi Li & Dinglong Liu, 2018. "Sustainable Distribution Organization Based on the Supply–Demand Coordination in Large Chinese Cities," Sustainability, MDPI, vol. 10(9), pages 1-25, August.
    6. Feng Zhang & Jingwei Zhou & Renyi Liu & Zhenhong Du & Xinyue Ye, 2016. "A New Design of High-Performance Large-Scale GIS Computing at a Finer Spatial Granularity: A Case Study of Spatial Join with Spark for Sustainability," Sustainability, MDPI, vol. 8(9), pages 1-19, September.
    7. Dongyang Hou & Hao Wu & Jun Chen & Ran Li, 2014. "A Focused Crawler for Borderlands Situation Information with Geographical Properties of Place Names," Sustainability, MDPI, vol. 6(10), pages 1-24, September.
    8. Ossi Ylijoki & Jari Porras, 2016. "Conceptualizing Big Data: Analysis of Case Studies," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 23(4), pages 295-310, October.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.