Data Transformation (Pre-processing)

In: Credit Scoring, Response Modeling, and Insurance Rating

Author

Listed:
  • Steven Finlay

Abstract

Model construction techniques vary in how sensitive they are to the way data is presented to them. Data transformation provides an alternative representation of the data that, it is hoped, will lead to a better (more predictive) model than would result from using the data in its original form. Data transformation typically achieves the following outcomes:

  • Linearization. Transformations are applied so that the relationships between the predictor variables and the dependent variable are (approximately) linear. Linear relationships are important for methods such as linear regression and logistic regression; if the relationships in the data are highly non-linear, these methods will produce poor models. Linearization is less important for non-linear techniques such as CART and neural networks.
  • Standardization. If one predictor variable takes values in the range 10,000 to 1,000,000 and another takes values in the range 0.01 to 1, then the parameter coefficients (the model weights) will differ greatly in magnitude, even if the two variables contribute equally to the model. This is not an issue for all model construction techniques, but as a rule it is good practice to transform interval variables so that they all lie on the same scale.
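
As a rough illustration of the two outcomes described in the abstract, the Python sketch below standardizes two predictors whose ranges match the ones quoted above and applies a log transform to the wider-ranging one. It is not taken from the chapter: the variable names `income` and `utilization`, their values, and the choice of a z-score and a natural logarithm are all assumptions made for the example, and other transformations may suit a given data set better.

```python
import math
import statistics


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mean) / sd for v in values]


def log_transform(values):
    """Natural log transform; assumes every value is strictly positive."""
    return [math.log(v) for v in values]


# Hypothetical predictors on very different scales (illustrative values only).
income = [10_000, 50_000, 250_000, 1_000_000]
utilization = [0.01, 0.25, 0.60, 1.00]

# Standardization: both variables end up on the same scale.
income_z = standardize(income)
utilization_z = standardize(utilization)

# One possible linearizing transform for a heavily skewed, positive variable.
income_log = log_transform(income)
```

After standardization both predictors have mean 0 and standard deviation 1, so fitted coefficients become comparable in magnitude. Whether a log transform actually linearizes the relationship with the dependent variable has to be checked against the data.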

Suggested Citation

  • Steven Finlay, 2012. "Data Transformation (Pre-processing)," Palgrave Macmillan Books, in: Credit Scoring, Response Modeling, and Insurance Rating, edition 0, chapter 6, pages 144-164, Palgrave Macmillan.
  • Handle: RePEc:pal:palchp:978-1-137-03169-3_6
    DOI: 10.1057/9781137031693_6