Finding Groups in Large Data Sets

Author

Listed:
  • Adrian Müller

    (Center for Energy Policy and Economics CEPE, Department of Management, Technology and Economics, ETH Zurich, Switzerland)

Abstract

This paper gives an overview of methods for finding groups in large data sets, such as household expenditure survey data. The methods fall into three categories: cluster analysis, dimension reduction, and basic exploratory methods. The emphasis is on a critical analysis of the methods and their potential drawbacks, especially the inputs that the researcher has to provide. Such inputs may impose structure that is not present in the data, thus defeating the purpose of revealing intrinsic patterns. In general, the more elaborate methods, such as cluster analysis, are delicate to apply, especially in the social sciences. Often it may be best to limit oneself to more transparent approaches, such as comparisons of basic statistics.
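The concern about researcher-supplied inputs can be illustrated with a minimal sketch. It is not taken from the paper: the choice of k-means as the clustering method, the synthetic data, and all names (`kmeans`, `noise`, `groups_found`) are assumptions made for illustration. The data are uniform noise with no group structure, yet the algorithm partitions them into as many groups as the researcher asks for.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(
                range(k),
                key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels

# Synthetic data with no group structure: uniform noise on the unit square.
rng = random.Random(42)
noise = [(rng.random(), rng.random()) for _ in range(300)]

# Number of distinct groups reported for each researcher-supplied k.
groups_found = {k: len(set(kmeans(noise, k))) for k in (2, 3, 5)}
print(groups_found)
```

The number of groups reported follows the input `k` rather than anything in the data, which is the kind of imposed structure the abstract warns about. Comparing basic statistics across the reported groups would be one transparent way to check whether they differ meaningfully.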

Suggested Citation

  • Adrian Müller, 2002. "Finding Groups in Large Data Sets," CEPE Working paper series 02-18, CEPE Center for Energy Policy and Economics, ETH Zurich.
  • Handle: RePEc:cee:wpcepe:02-18
    Download full text from publisher

    File URL: http://www.cepe.ethz.ch/publications/workingPapers/CEPE_WP18.pdf
    Download Restriction: no