
Fuzzy rule based classification systems for big data with MapReduce: granularity analysis

Author

Listed:
  • Alberto Fernández

    (University of Granada)

  • Sara Río

    (University of Granada)

  • Abdullah Bawakid

    (King Abdulaziz University (KAU))

  • Francisco Herrera

    (University of Granada; King Abdulaziz University (KAU))

Abstract

Given the vast amount of information available today and the advantages of processing it, big data and data science have acquired great importance in current research. Big data applications are mainly about scalability, which can be achieved via the MapReduce programming model. MapReduce is designed to divide the data into several chunks or groups that are processed in parallel, and whose results are “assembled” to provide a single solution. Among the different classification paradigms adapted to this new framework, fuzzy rule based classification systems have shown interesting results with a MapReduce approach for big data. It is well known that the performance of these systems depends strongly on the selection of a good granularity level for the Data Base. However, in the context of MapReduce this parameter is even harder to determine, as it can also be related to the number of Maps chosen for the processing stage. In this paper, we analyze the interrelation between the number of labels of the fuzzy variables and the scarcity of the data caused by the data sampling in MapReduce. Specifically, we consider that as the partitioning of the initial instance set grows, the level of granularity necessary to achieve good performance also becomes higher. The experimental results, carried out for several Big Data problems using the Chi-FRBCS-BigData algorithms, support our claims.
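
The two parameters discussed in the abstract, the number of fuzzy labels per variable (granularity) and the number of Maps, can be illustrated with a small sketch. This is not the authors' Chi-FRBCS-BigData implementation: it is a simplified, single-process Python illustration that assumes features scaled to [0, 1], evenly spaced triangular fuzzy sets, Chi-style rule generation (one rule per example), and a reduce step that simply sums class support for identical antecedents. All function names (`triangular_memberships`, `map_chunk`, `reduce_rules`, `build_rule_base`) are hypothetical.

```python
from collections import defaultdict
from functools import reduce


def triangular_memberships(x, n_labels):
    """Membership of x (assumed in [0, 1]) in n_labels evenly spaced
    triangular fuzzy sets. Requires n_labels >= 2."""
    step = 1.0 / (n_labels - 1)
    return [max(0.0, 1.0 - abs(x - i * step) / step) for i in range(n_labels)]


def map_chunk(chunk, n_labels):
    """Map phase (sketch): build a partial rule base from one data chunk.

    Each example produces one rule whose antecedent is the best-matching
    label index per attribute. The matching degree (product of memberships)
    is accumulated as support for the example's class.
    """
    rules = defaultdict(lambda: defaultdict(float))
    for features, label in chunk:
        antecedent = []
        degree = 1.0
        for x in features:
            mu = triangular_memberships(x, n_labels)
            best = max(range(n_labels), key=mu.__getitem__)
            antecedent.append(best)
            degree *= mu[best]
        rules[tuple(antecedent)][label] += degree
    return rules


def reduce_rules(a, b):
    """Reduce phase (sketch): merge two partial rule bases by summing the
    class support of rules that share the same antecedent. This merge
    policy is an assumption made for illustration."""
    merged = defaultdict(lambda: defaultdict(float))
    for rule_base in (a, b):
        for antecedent, classes in rule_base.items():
            for cls, weight in classes.items():
                merged[antecedent][cls] += weight
    return merged


def build_rule_base(data, n_maps, n_labels):
    """Split data into n_maps chunks, run the map phase on each chunk
    (sequentially here; in parallel on a real cluster), then reduce.
    Returns a dict mapping each antecedent to its best-supported class."""
    chunks = [data[i::n_maps] for i in range(n_maps)]
    partial = [map_chunk(chunk, n_labels) for chunk in chunks]
    merged = reduce(reduce_rules, partial)
    return {ant: max(classes, key=classes.get) for ant, classes in merged.items()}


# Toy data: two well-separated classes on two attributes.
data = [
    ((0.10, 0.20), "a"),
    ((0.90, 0.80), "b"),
    ((0.15, 0.10), "a"),
    ((0.85, 0.95), "b"),
]

coarse = build_rule_base(data, n_maps=2, n_labels=3)
fine = build_rule_base(data, n_maps=2, n_labels=5)
print(len(coarse), len(fine))
```

On this toy set the finer partition produces more distinct rules from the same four examples, which is the kind of interaction between granularity and the amount of data per Map that the paper studies; the toy data says nothing about the paper's actual results.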

Suggested Citation

  • Alberto Fernández & Sara Río & Abdullah Bawakid & Francisco Herrera, 2017. "Fuzzy rule based classification systems for big data with MapReduce: granularity analysis," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(4), pages 711-730, December.
  • Handle: RePEc:spr:advdac:v:11:y:2017:i:4:d:10.1007_s11634-016-0260-z
    DOI: 10.1007/s11634-016-0260-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11634-016-0260-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lin Zhu & Xiantao Liu & Sha He & Jun Shi & Ming Pang, 2015. "Keywords co-occurrence mapping knowledge domain research base on the theory of Big Data in oil and gas industry," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(1), pages 249-260, October.
    2. Zhang, Yi & Huang, Ying & Porter, Alan L. & Zhang, Guangquan & Lu, Jie, 2019. "Discovering and forecasting interactions in big data research: A learning-enhanced bibliometric study," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 795-807.
    3. Stefano Bianchini & Moritz Müller & Pierre Pelletier, 2022. "Artificial intelligence in science: An emerging general method of invention," Post-Print hal-03958025, HAL.
    4. Jun Feng & Zhenting Li & Shizhen Zhang & Chun Bao & Jingxian Fang & Yun Yin & Bolei Chen & Lei Pan & Bing Wang & Yu Zheng, 2023. "A Microimage-Processing-Based Technique for Detecting Qualitative and Quantitative Characteristics of Plant Cells," Agriculture, MDPI, vol. 13(9), pages 1-16, September.
    5. Tang, Ming & Liao, Huchang, 2021. "From conventional group decision making to large-scale group decision making: What are the challenges and how to meet them in big data era? A state-of-the-art survey," Omega, Elsevier, vol. 100(C).
    6. Janssen, Marijn & van der Voort, Haiko & Wahyudi, Agung, 2017. "Factors influencing big data decision-making quality," Journal of Business Research, Elsevier, vol. 70(C), pages 338-345.
    7. Haitham Nobanee & Mehroz Nida Dilshad & Mona Al Dhanhani & Maitha Al Neyadi & Sultan Al Qubaisi & Saeed Al Shamsi, 2021. "Big Data Applications the Banking Sector: A Bibliometric Analysis Approach," SAGE Open, vol. 11(4), pages 21582440211, December.
    8. Reza Farrahi Moghaddam & Fereydoun Farrahi Moghaddam & Mohamed Cheriet, 2014. "A Multi-Entity Input Output (MEIO) Approach to Sustainability - Water-Energy-GHG (WEG) Footprint Statements in Use Cases from Auto and Telco Industries," Papers 1404.6227, arXiv.org, revised Apr 2014.
    9. Daphne R. Raban & Avishag Gordon, 2020. "The evolution of data science and big data research: A bibliometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(3), pages 1563-1581, March.
    10. Yoshiyuki Ogata & Kazuto Mannen & Yasuto Kotani & Naohiro Kimura & Nozomu Sakurai & Daisuke Shibata & Hideyuki Suzuki, 2018. "ConfeitoGUI: A toolkit for size-sensitive community detection from a correlation network," PLOS ONE, Public Library of Science, vol. 13(10), pages 1-18, October.
    11. S. Vijayakumar Bharathi, 2017. "Prioritizing and Ranking the Big Data Information Security Risk Spectrum," Global Journal of Flexible Systems Management, Springer;Global Institute of Flexible Systems Management, vol. 18(3), pages 183-201, September.
    12. Jonathan E Butner & Ascher K Munion & Brian R W Baucom & Alexander Wong, 2019. "Ghost hunting in the nonlinear dynamic machine," PLOS ONE, Public Library of Science, vol. 14(12), pages 1-21, December.
    13. Subhroshekhar Ghosh & Soumendu Sundar Mukherjee, 2022. "Learning with latent group sparsity via heat flow dynamics on networks," Papers 2201.08326, arXiv.org.
    14. Yan, Li & Cao, Huiying & Gao, Chao & Wang, Zhen & Li, Xuelong, 2023. "Mining of book-loan behavior based on coupling relationship analysis," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 613(C).
    15. J. Lars Kirkby & Dang H. Nguyen & Duy Nguyen & Nhu N. Nguyen, 2022. "Inversion-free subsampling Newton’s method for large sample logistic regression," Statistical Papers, Springer, vol. 63(3), pages 943-963, June.
    16. Zbysław Dobrowolski, 2021. "Internet of Things and Other E-Solutions in Supply Chain Management May Generate Threats in the Energy Sector—The Quest for Preventive Measures," Energies, MDPI, vol. 14(17), pages 1-11, August.
    17. Dawen Xia & Xiaonan Lu & Huaqing Li & Wendong Wang & Yantao Li & Zili Zhang, 2018. "A MapReduce-Based Parallel Frequent Pattern Growth Algorithm for Spatiotemporal Association Analysis of Mobile Trajectory Big Data," Complexity, Hindawi, vol. 2018, pages 1-16, January.
    18. Matteo Fontana & Massimo Tavoni & Simone Vantini, 2019. "Functional Data Analysis of high-frequency load curves reveals drivers of residential electricity consumption," PLOS ONE, Public Library of Science, vol. 14(6), pages 1-16, June.
    19. Lu Jiang & Xinyu Kang & Shan Huang & Bo Yang, 2022. "A refinement strategy for identification of scientific software from bioinformatics publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(6), pages 3293-3316, June.
    20. Gamermann, Daniel & Antunes, Felipe Leite, 2018. "Statistical analysis of Brazilian electoral campaigns via Benford’s law," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 496(C), pages 171-188.
