
Dynamic recursive tree-based partitioning for malignant melanoma identification in skin lesion dermoscopic images

Author

Listed:
  • Massimo Aria

    (University of Naples Federico II)

  • Antonio D’Ambrosio

    (University of Naples Federico II)

  • Carmela Iorio

    (University of Naples Federico II)

  • Roberta Siciliano

    (University of Naples Federico II)

  • Valentina Cozza

    (Parthenope University of Naples)

Abstract

This paper defines multivalued data, also called multiple-valued variables. Such data are typical when data production carries some intrinsic uncertainty, for example because of imprecise measuring instruments (as in image recognition) or human judgment. So far, contributions in the symbolic data analysis literature have provided data preprocessing criteria that allow the use of standard methods such as factorial analysis, clustering, discriminant analysis, and tree-based methods. As an alternative, this paper introduces a methodology for supervised classification, the so-called Dynamic CLASSification TREE (D-CLASS TREE), which deals simultaneously with standard and multivalued data. To this end, an innovative partitioning criterion is defined together with a tree-growing algorithm. The main result is a dynamic tree structure characterized by the simultaneous presence of binary and ternary partitions. A real-world case study shows the advantages of the proposed methodology and the main issues in interpreting the final results. A comparative study with other approaches that handle the same types of data is also presented. The comparison highlights that, although the results are quite similar in terms of error rates, the proposed D-CLASS TREE returns a more interpretable tree-based structure.

Suggested Citation

  • Massimo Aria & Antonio D’Ambrosio & Carmela Iorio & Roberta Siciliano & Valentina Cozza, 2020. "Dynamic recursive tree-based partitioning for malignant melanoma identification in skin lesion dermoscopic images," Statistical Papers, Springer, vol. 61(4), pages 1645-1661, August.
  • Handle: RePEc:spr:stpapr:v:61:y:2020:i:4:d:10.1007_s00362-018-0997-x
    DOI: 10.1007/s00362-018-0997-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-018-0997-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Roberta Siciliano & Antonio D’Ambrosio & Massimo Aria & Sonia Amodio, 2017. "Analysis of Web Visit Histories, Part II: Predicting Navigation by Nested STUMP Regression Trees," Journal of Classification, Springer;The Classification Society, vol. 34(3), pages 473-493, October.
    2. Xiaohui Liu & Shihua Luo & Yijun Zuo, 2020. "Some results on the computing of Tukey’s halfspace median," Statistical Papers, Springer, vol. 61(1), pages 303-316, February.
    3. Ivan Miguel Pires & Faisal Hussain & Nuno M. Garcia & Eftim Zdravevski, 2020. "Improving Human Activity Monitoring by Imputation of Missing Sensory Data: Experimental Study," Future Internet, MDPI, vol. 12(9), pages 1-18, September.
    4. Drago, Carlo, 2015. "Exploring the Community Structure of Complex Networks," MPRA Paper 81024, University Library of Munich, Germany.
    5. Philip Hans Franses & Max Welz, 2022. "Evaluating heterogeneous forecasts for vintages of macroeconomic variables," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 41(4), pages 829-839, July.
    6. Vencalek, Ondrej & Pokotylo, Oleksii, 2018. "Depth-weighted Bayes classification," Computational Statistics & Data Analysis, Elsevier, vol. 123(C), pages 1-12.
    7. Dyckerhoff, Rainer & Mozharovskyi, Pavlo, 2016. "Exact computation of the halfspace depth," Computational Statistics & Data Analysis, Elsevier, vol. 98(C), pages 19-30.
    8. Sun, Yuying & Han, Ai & Hong, Yongmiao & Wang, Shouyang, 2018. "Threshold autoregressive models for interval-valued time series data," Journal of Econometrics, Elsevier, vol. 206(2), pages 414-446.
    9. Lima Neto, Eufrásio de A. & de Carvalho, Francisco de A.T., 2010. "Constrained linear regression models for symbolic interval-valued variables," Computational Statistics & Data Analysis, Elsevier, vol. 54(2), pages 333-347, February.
    10. Abbas Parchami & Przemyslaw Grzegorzewski & Maciej Romaniuk, 2024. "Statistical simulations with LR random fuzzy numbers," Statistical Papers, Springer, vol. 65(6), pages 3583-3600, August.
    11. Paolo Giordani, 2015. "Lasso-constrained regression analysis for interval-valued data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 9(1), pages 5-19, March.
    12. Fei Liu & L. Billard, 2022. "Partition of Interval-Valued Observations Using Regression," Journal of Classification, Springer;The Classification Society, vol. 39(1), pages 55-77, March.
    13. Sun, Yuying & Zhang, Xinyu & Wan, Alan T.K. & Wang, Shouyang, 2022. "Model averaging for interval-valued data," European Journal of Operational Research, Elsevier, vol. 301(2), pages 772-784.
    14. António Silva & Paula Brito, 2006. "Linear discriminant analysis for interval data," Computational Statistics, Springer, vol. 21(2), pages 289-308, June.
    15. Maia, André Luis Santiago & de Carvalho, Francisco de A.T., 2011. "Holt’s exponential smoothing and neural network models for forecasting interval-valued time series," International Journal of Forecasting, Elsevier, vol. 27(3), pages 740-759.
    16. Carlo Drago, 2021. "The Analysis and the Measurement of Poverty: An Interval-Based Composite Indicator Approach," Economies, MDPI, vol. 9(4), pages 1-17, October.
    17. A. Silva & Paula Brito, 2015. "Discriminant Analysis of Interval Data: An Assessment of Parametric and Distance-Based Approaches," Journal of Classification, Springer;The Classification Society, vol. 32(3), pages 516-541, October.
    18. Lopez-Diaz, Miguel & Ralescu, Dan A., 2006. "Tools for fuzzy random variables: Embeddings and measurabilities," Computational Statistics & Data Analysis, Elsevier, vol. 51(1), pages 109-114, November.
    19. Lin, Wei & González-Rivera, Gloria, 2016. "Interval-valued time series models: Estimation based on order statistics exploring the Agriculture Marketing Service data," Computational Statistics & Data Analysis, Elsevier, vol. 100(C), pages 694-711.
    20. Guo, Junpeng & Li, Wenhua & Li, Chenhua & Gao, Sa, 2012. "Standardization of interval symbolic data based on the empirical descriptive statistics," Computational Statistics & Data Analysis, Elsevier, vol. 56(3), pages 602-610.
