
Permutation test for incomplete paired data with application to cDNA microarray data

Author

Listed:
  • Yu, Donghyeon
  • Lim, Johan
  • Liang, Feng
  • Kim, Kyunga
  • Kim, Byung Soo
  • Jang, Woncheol

Abstract

A paired data set is common in microarray experiments, where the data are often incompletely observed for some pairs due to various technical reasons. In microarray paired data sets, it is of main interest to detect differentially expressed genes, which are usually identified by testing the equality of means of expressions within a pair. While much attention has been paid to testing mean equality with incomplete paired data in previous literature, the existing methods commonly assume the normality of data or rely on the large sample theory. In this paper, we propose a new test based on permutations, which is free from the normality assumption and large sample theory. We consider permutation statistics with linear mixtures of paired and unpaired samples as test statistics, and propose a procedure to find the optimal mixture that minimizes the conditional variances of the test statistics, given the observations. Simulations are conducted for numerical power comparisons between the proposed permutation tests and other existing methods. We apply the proposed method to find differentially expressed genes for a colorectal cancer study.
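The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function name, its arguments, and the specific weight formula are assumptions made for illustration. The sketch combines a paired statistic (mean within-pair difference, permuted by random sign flips) with an unpaired statistic (difference of group means, permuted by shuffling group labels), and picks the mixture weight `w` that minimizes the conditional permutation variance of the combined statistic, assuming the two parts are permuted independently. It also assumes at least one complete pair, at least one unpaired observation per condition, and non-constant data.

```python
import numpy as np

def permutation_test_incomplete_pairs(pairs_x, pairs_y, only_x, only_y,
                                      n_perm=5000, seed=0):
    """Two-sided permutation test of mean equality (hypothetical helper).

    pairs_x, pairs_y: complete pairs; only_x, only_y: observations whose
    partner is missing. Returns (observed statistic, weight w, p-value).
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(pairs_x, float) - np.asarray(pairs_y, float)
    ox = np.asarray(only_x, float)
    oy = np.asarray(only_y, float)
    n, n1, n2 = d.size, ox.size, oy.size
    pooled = np.concatenate([ox, oy])
    N = n1 + n2

    # Conditional variances given the data:
    # sign flips give Var(mean(s * d)) = sum(d^2) / n^2;
    # label shuffles give Var(mean1 - mean2) = N * S^2 / (n1 * n2).
    v_paired = np.sum(d ** 2) / n ** 2
    v_unpaired = N * np.var(pooled, ddof=1) / (n1 * n2)

    # Assumed weighting: with independent permutations of the two parts,
    # w^2 * v_paired + (1 - w)^2 * v_unpaired is minimized at this w.
    w = v_unpaired / (v_paired + v_unpaired)

    def stat(diffs, z):
        return w * diffs.mean() + (1 - w) * (z[:n1].mean() - z[n1:].mean())

    observed = stat(d, pooled)
    perm = np.empty(n_perm)
    for b in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=n)   # swap members within pairs
        perm[b] = stat(signs * d, rng.permutation(pooled))  # relabel unpaired

    # Monte Carlo p-value with the usual +1 correction
    p_value = (1 + np.sum(np.abs(perm) >= abs(observed))) / (n_perm + 1)
    return observed, w, p_value
```

In a microarray setting, a function like this would be called once per gene, with the resulting p-values then adjusted for multiple testing. Because `w` depends only on quantities that are invariant under the permutations used, it can be computed once per gene rather than once per permutation.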

Suggested Citation

  • Yu, Donghyeon & Lim, Johan & Liang, Feng & Kim, Kyunga & Kim, Byung Soo & Jang, Woncheol, 2012. "Permutation test for incomplete paired data with application to cDNA microarray data," Computational Statistics & Data Analysis, Elsevier, vol. 56(3), pages 510-521.
  • Handle: RePEc:eee:csdana:v:56:y:2012:i:3:p:510-521
    DOI: 10.1016/j.csda.2011.08.012

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947311003112
    Download Restriction: Full text for ScienceDirect subscribers only.



    Citations



    Cited by:

    1. Daniel Gaigall, 2020. "Testing marginal homogeneity of a continuous bivariate distribution with possibly incomplete paired data," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(4), pages 437-465, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Papastathopoulos, Ioannis & Tawn, Jonathan A., 2013. "A generalised Student’s t-distribution," Statistics & Probability Letters, Elsevier, vol. 83(1), pages 70-77.
    2. Kim, Hyoung-Moon & Genton, Marc G., 2011. "Characteristic functions of scale mixtures of multivariate skew-normal distributions," Journal of Multivariate Analysis, Elsevier, vol. 102(7), pages 1105-1117, August.
    3. Noor Fadhilah Ahmad Radi & Roslinazairimah Zakaria & Julia Piantadosi & John Boland & Wan Zawiah Wan Zin & Muhammad Az-zuhri Azman, 2017. "Generating Synthetic Rainfall Total Using Multivariate Skew-t and Checkerboard Copula of Maximum Entropy," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 31(5), pages 1729-1744, March.
    4. Lim Johan & Kim Jayoun & Kim Sang-cheol & Yu Donghyeon & Kim Kyunga & Kim Byung Soo, 2012. "Detection of Differentially Expressed Gene Sets in a Partially Paired Microarray Data Set," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-30, February.
    5. Yang, Yu-Chen & Lin, Tsung-I & Castro, Luis M. & Wang, Wan-Lun, 2020. "Extending finite mixtures of t linear mixed-effects models with concomitant covariates," Computational Statistics & Data Analysis, Elsevier, vol. 148(C).
    6. Sreenivasa Rao Jammalamadaka & Emanuele Taufer & Gyorgy H. Terdik, 2021. "On Multivariate Skewness and Kurtosis," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(2), pages 607-644, August.
    7. Kim, Hyoung-Moon & Maadooliat, Mehdi & Arellano-Valle, Reinaldo B. & Genton, Marc G., 2016. "Skewed factor models using selection mechanisms," Journal of Multivariate Analysis, Elsevier, vol. 145(C), pages 162-177.
    8. Mehdi Amiri & Yaser Mehrali & Narayanaswamy Balakrishnan & Ahad Jamalizadeh, 2022. "Efficient recursive computational algorithms for multivariate t and multivariate unified skew-t distributions with applications to inference," Computational Statistics, Springer, vol. 37(1), pages 125-158, March.
    9. Tsung-I Lin & Pal Wu & Geoffrey McLachlan & Sharon Lee, 2015. "A robust factor analysis model using the restricted skew-t distribution," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 24(3), pages 510-531, September.
    10. Galarza, Christian E. & Matos, Larissa A. & Castro, Luis M. & Lachos, Victor H., 2022. "Moments of the doubly truncated selection elliptical distributions with emphasis on the unified multivariate skew-t distribution," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    11. Wraith, Darren & Forbes, Florence, 2015. "Location and scale mixtures of Gaussians with flexible tail behaviour: Properties, inference and application to multivariate clustering," Computational Statistics & Data Analysis, Elsevier, vol. 90(C), pages 61-73.
    12. Chen Tong & Peter Reinhard Hansen & Ilya Archakov, 2024. "Cluster GARCH," Papers 2406.06860, arXiv.org.
    13. Takahashi, Makoto & Watanabe, Toshiaki & Omori, Yasuhiro, 2016. "Volatility and quantile forecasts by realized stochastic volatility models with generalized hyperbolic distribution," International Journal of Forecasting, Elsevier, vol. 32(2), pages 437-457.
    14. Jorge E. Galán & María Rodríguez Moreno, 2020. "At-risk measures and financial stability," Financial Stability Review, Banco de España, issue Autumn.
    15. Chen, Qihao & Huang, Zhuo & Liang, Fang, 2023. "Measuring systemic risk with high-frequency data: A realized GARCH approach," Finance Research Letters, Elsevier, vol. 54(C).
    16. Yeap, Claudia & Kwok, Simon S. & Choy, S. T. Boris, 2016. "A Flexible Generalised Hyperbolic Option Pricing Model and its Special Cases," Working Papers 2016-14, University of Sydney, School of Economics.
    17. Wan-Lun Wang, 2019. "Mixture of multivariate t nonlinear mixed models for multiple longitudinal data with heterogeneity and missing values," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(1), pages 196-222, March.
    18. Hu, Shuowen & Poskitt, D.S. & Zhang, Xibin, 2012. "Bayesian adaptive bandwidth kernel density estimation of irregular multivariate distributions," Computational Statistics & Data Analysis, Elsevier, vol. 56(3), pages 732-740.
    19. Isaac E. Cortés & Osvaldo Venegas & Héctor W. Gómez, 2022. "A Symmetric/Asymmetric Bimodal Extension Based on the Logistic Distribution: Properties, Simulation and Applications," Mathematics, MDPI, vol. 10(12), pages 1-17, June.
    20. Toshihiro Abe & Arthur Pewsey, 2011. "Sine-skewed circular distributions," Statistical Papers, Springer, vol. 52(3), pages 683-707, August.
