
Shape-preserving wavelet-based multivariate density estimation

Author

Listed:
  • Aya-Moreno, Carlos
  • Geenens, Gery
  • Penev, Spiridon

Abstract

Wavelet estimators for a probability density f enjoy many good properties; however, they are not ‘shape-preserving’ in the sense that the final estimate may not be non-negative or integrate to unity. A solution to negativity issues may be to estimate first the square root of f and then square this estimate. This paper proposes and investigates such an estimation scheme, generalizing to higher dimensions some previous constructions which are valid only in one dimension. The estimation is mainly based on nearest-neighbor balls. The theoretical properties of the proposed estimator are obtained, and it is shown to reach the optimal rate of convergence uniformly over large classes of densities under mild conditions. Simulations show that the new estimator performs as well in terms of Mean Integrated Squared Error as the classical wavelet estimator and better than it in terms of Mean Squared Hellinger Distance between the estimator and the truth, while automatically producing estimates which are bona fide densities.
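
The square-root idea described in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the authors' estimator, only a toy version under stated assumptions: the data lie in [0, 1), only Haar scaling functions at a single resolution level are used, and 1/sqrt(f) at each observation is replaced by a crude k-nearest-neighbor plug-in. The function name `sqrt_haar_density` and the parameters `level` and `k` are hypothetical.

```python
import math
import random


def sqrt_haar_density(sample, level=3, k=5):
    """Toy estimate of a density on [0, 1): estimate sqrt(f), then square.

    Returns the heights of a piecewise-constant density on 2**level bins.
    Illustrative assumption, not the published estimator.
    """
    n = len(sample)
    n_bins = 2 ** level
    coeffs = [0.0] * n_bins
    for xi in sample:
        # Distance to the k-th nearest neighbor (index 0 is the point itself).
        r_k = sorted(abs(xi - xj) for xj in sample)[k]
        # Crude plug-in: f(xi) ~ k / (n * 2 * r_k), so 1/sqrt(f) ~ sqrt(2 n r_k / k).
        inv_sqrt_f = math.sqrt(2.0 * n * r_k / k)
        # Haar scaling function equals 2^(level/2) on the bin containing xi.
        m = min(int(xi * n_bins), n_bins - 1)
        coeffs[m] += 2 ** (level / 2) * inv_sqrt_f / n
    # Rescale so the squared coefficients sum to one; by orthonormality of
    # the basis, the squared estimate then integrates to one.
    norm = math.sqrt(sum(c * c for c in coeffs))
    coeffs = [c / norm for c in coeffs]
    # Squaring the estimate of sqrt(f) gives a non-negative height per bin.
    return [c * c * n_bins for c in coeffs]


random.seed(0)
data = [random.betavariate(2, 5) for _ in range(300)]
heights = sqrt_haar_density(data)
# Both checks hold by construction: non-negativity and unit integral.
print(min(heights) >= 0.0, math.isclose(sum(heights) / len(heights), 1.0))
```

Squaring guarantees non-negativity and the coefficient rescaling guarantees unit mass, which is the sense in which such an estimate is a bona fide density regardless of the sample.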

Suggested Citation

  • Aya-Moreno, Carlos & Geenens, Gery & Penev, Spiridon, 2018. "Shape-preserving wavelet-based multivariate density estimation," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 30-47.
  • Handle: RePEc:eee:jmvana:v:168:y:2018:i:c:p:30-47
    DOI: 10.1016/j.jmva.2018.07.002

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X17305171
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Lin, Yi & Jeon, Yongho, 2006. "Random Forests and Adaptive Nearest Neighbors," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 578-590, June.
    2. Ebner, Bruno & Henze, Norbert & Yukich, Joseph E., 2018. "Multivariate goodness-of-fit on flat and curved spaces via nearest neighbor distances," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 231-242.
    3. Mondal, Pronoy K. & Biswas, Munmun & Ghosh, Anil K., 2015. "On high dimensional two-sample tests based on nearest neighbors," Journal of Multivariate Analysis, Elsevier, vol. 141(C), pages 168-178.
    4. Kristi Kuljus & Bo Ranneby, 2015. "Generalized Maximum Spacing Estimation for Multivariate Observations," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 42(4), pages 1092-1108, December.
    5. Mack, Y. P. & Rosenblatt, M., 1979. "Multivariate k-nearest neighbor density estimates," Journal of Multivariate Analysis, Elsevier, vol. 9(1), pages 1-15, March.
    6. Hall, Peter, 1983. "On near neighbour estimates of a multivariate density," Journal of Multivariate Analysis, Elsevier, vol. 13(1), pages 24-39, March.
    7. Kerkyacharian, Gérard & Picard, Dominique, 1993. "Density estimation by kernel and wavelets methods: Optimality of Besov spaces," Statistics & Probability Letters, Elsevier, vol. 18(4), pages 327-336, November.
    8. Pinheiro, Aluisio & Vidakovic, Brani, 1997. "Estimating the square root of a density via compactly supported wavelets," Computational Statistics & Data Analysis, Elsevier, vol. 25(4), pages 399-415, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Solveig Flaig & Gero Junike, 2021. "Scenario generation for market risk models using generative neural networks," Papers 2109.10072, arXiv.org, revised Aug 2023.
    2. Marina Vannucci & Brani Vidakovic, 1997. "Preventing the Dirac disaster: Wavelet based density estimation," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 6(2), pages 145-159, August.
    3. Solveig Flaig & Gero Junike, 2023. "Validation of machine learning based scenario generators," Papers 2301.12719, arXiv.org, revised Nov 2023.
    4. Morettin, Pedro A. & Toloi, Clelia M.C. & Chiann, Chang & de Miranda, José C.S., 2011. "Wavelet Estimation of Copulas for Time Series," Journal of Time Series Econometrics, De Gruyter, vol. 3(3), pages 1-31, October.
    5. Kato, Takeshi, 1999. "Density estimation by truncated wavelet expansion," Statistics & Probability Letters, Elsevier, vol. 43(2), pages 159-168, June.
    6. Chang, Fang & Qiu, Weiliang & Zamar, Ruben H. & Lazarus, Ross & Wang, Xiaogang, 2010. "clues: An R Package for Nonparametric Clustering Based on Local Shrinking," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i04).
    7. Kerkyacharian, Gérard & Picard, Dominique, 1997. "Limit of the quadratic risk in density estimation using linear methods," Statistics & Probability Letters, Elsevier, vol. 31(4), pages 299-312, February.
    8. Rivoirard, Vincent, 2004. "Maxisets for linear procedures," Statistics & Probability Letters, Elsevier, vol. 67(3), pages 267-275, April.
    9. Jerinsh Jeyapaulraj & Dhruv Desai & Peter Chu & Dhagash Mehta & Stefano Pasquali & Philip Sommer, 2022. "Supervised similarity learning for corporate bonds using Random Forest proximities," Papers 2207.04368, arXiv.org, revised Oct 2022.
    10. Gery Geenens, 2014. "Probit Transformation for Kernel Density Estimation on the Unit Interval," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(505), pages 346-358, March.
    11. David M. Ritzwoller & Vasilis Syrgkanis, 2024. "Simultaneous Inference for Local Structural Parameters with Random Forests," Papers 2405.07860, arXiv.org, revised Sep 2024.
    12. Qiu, Tao & Zhang, Qintong & Fang, Yuanyuan & Xu, Wangli, 2024. "Testing homogeneity in high dimensional data through random projections," Journal of Multivariate Analysis, Elsevier, vol. 200(C).
    13. Mendez, Guillermo & Lohr, Sharon, 2011. "Estimating residual variance in random forest regression," Computational Statistics & Data Analysis, Elsevier, vol. 55(11), pages 2937-2950, November.
    14. Arthur Pewsey & Eduardo García-Portugués, 2021. "Recent advances in directional statistics," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 1-58, March.
    15. Shin-ichi Tsukada, 2019. "High dimensional two-sample test based on the inter-point distance," Computational Statistics, Springer, vol. 34(2), pages 599-615, June.
    16. Li, Yiliang & Bai, Xiwen & Wang, Qi & Ma, Zhongjun, 2022. "A big data approach to cargo type prediction and its implications for oil trade estimation," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 165(C).
    17. Yi Fu & Shuai Cao & Tao Pang, 2020. "A Sustainable Quantitative Stock Selection Strategy Based on Dynamic Factor Adjustment," Sustainability, MDPI, vol. 12(10), pages 1-12, May.
    18. José María Sarabia & Faustino Prieto & Vanesa Jordá & Stefan Sperlich, 2020. "A Note on Combining Machine Learning with Statistical Modeling for Financial Data Analysis," Risks, MDPI, vol. 8(2), pages 1-14, April.
    19. Biau, Gérard & Devroye, Luc, 2010. "On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification," Journal of Multivariate Analysis, Elsevier, vol. 101(10), pages 2499-2518, November.
    20. Penrose, Mathew D., 2000. "Central limit theorems for k-nearest neighbour distances," Stochastic Processes and their Applications, Elsevier, vol. 85(2), pages 295-320, February.
