
Semiparametric Inference in a Genetic Mixture Model

Authors

  • Pengfei Li
  • Yukun Liu
  • Jing Qin

Abstract

In genetic backcross studies, data are often collected from complex mixtures of distributions with known mixing proportions. Previous approaches to the inference of these genetic mixture models involve parameterizing the component distributions. However, model misspecification of any form is expected to have detrimental effects. We propose a semiparametric likelihood method for genetic mixture models: the empirical likelihood under the exponential tilting model assumption, in which the log ratio of the probability (density) functions from the components is linear in the observations. An application to mice cancer genetics involves random numbers of offspring within a litter. In other words, the cluster size is a random variable. We wish to test the null hypothesis that there is no difference between the two components in the mixture model, but unfortunately we find that the Fisher information is degenerate. As a consequence, the conventional two-term expansion in the likelihood ratio statistic does not work. By using a higher-order expansion, we are able to establish a nonstandard convergence rate $N^{-1/4}$ for the odds ratio parameter estimator $\hat{\beta}$. Moreover, the limiting distribution of the empirical likelihood ratio statistic is derived. The underlying distribution function of each component can also be estimated semiparametrically. Analogously to the full parametric approach, we develop an expectation and maximization algorithm for finding the semiparametric maximum likelihood estimator. Simulation results and a real cancer application indicate that the proposed semiparametric method works much better than parametric methods. Supplementary materials for this article are available online.
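
For concreteness, the exponential tilting assumption described in the abstract can be written out as below. This is only a sketch in assumed notation: $f_0$ and $f_1$ for the two component densities, $\lambda$ for a known mixing proportion, and $\alpha$ for the intercept are illustrative symbols not taken from the abstract, and the paper's own notation may differ.

```latex
% Exponential tilt: the log density ratio is linear in the observation x
\[
\log\frac{f_1(x)}{f_0(x)} = \alpha + \beta x
\quad\Longleftrightarrow\quad
f_1(x) = \exp(\alpha + \beta x)\, f_0(x).
\]
% A mixture with known mixing proportion \lambda then has density
\[
h_\lambda(x) = (1-\lambda)\, f_0(x) + \lambda\, f_1(x)
             = \bigl\{\, 1 - \lambda + \lambda \exp(\alpha + \beta x) \,\bigr\}\, f_0(x).
\]
% Null hypothesis of no difference between the two components
\[
H_0:\ \beta = 0 .
\]
```

Under this sketch, $\beta = 0$ forces $\alpha = 0$ as well, because $f_1$ must integrate to one, so the null hypothesis corresponds to $f_1 = f_0$.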

Suggested Citation

  • Pengfei Li & Yukun Liu & Jing Qin, 2017. "Semiparametric Inference in a Genetic Mixture Model," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(519), pages 1250-1260, July.
  • Handle: RePEc:taf:jnlasa:v:112:y:2017:i:519:p:1250-1260
    DOI: 10.1080/01621459.2016.1208614

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/01621459.2016.1208614
    Download Restriction: Access to full text is restricted to subscribers.


    References listed on IDEAS

    1. F. Zou, 2002. "On empirical likelihood for a semiparametric mixture model," Biometrika, Biometrika Trust, vol. 89(1), pages 61-75, March.
    2. Tao Liu & Joseph W. Hogan & Lisa Wang & Shangxuan Zhang & Rami Kantor, 2013. "Optimal Allocation of Gold Standard Testing Under Constrained Availability: Application to Assessment of HIV Treatment Failure," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(504), pages 1173-1188, December.
    3. Miguel de Carvalho & Anthony C. Davison, 2014. "Spectral Density Ratio Models for Multivariate Extremes," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(506), pages 764-776, June.
    4. Z. Tan, 2009. "A note on profile likelihood for exponential tilt mixture models," Biometrika, Biometrika Trust, vol. 96(1), pages 229-236.

    Citations

    Cited by:

    1. Moming Li & Guoqing Diao & Jing Qin, 2020. "On symmetric semiparametric two‐sample problem," Biometrics, The International Biometric Society, vol. 76(4), pages 1216-1228, December.
    2. Wei Zhang & Aiyi Liu & Qizhai Li & Paul S. Albert, 2020. "Nonparametric estimation of distributions and diagnostic accuracy based on group‐tested results with differential misclassification," Biometrics, The International Biometric Society, vol. 76(4), pages 1147-1156, December.
    3. Yufan Wang & Xingzhong Xu, 2023. "A Posterior p-Value for Homogeneity Testing of the Three-Sample Problem," Mathematics, MDPI, vol. 11(18), pages 1-25, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wang, Chunlin & Marriott, Paul & Li, Pengfei, 2017. "Testing homogeneity for multiple nonnegative distributions with excess zero observations," Computational Statistics & Data Analysis, Elsevier, vol. 114(C), pages 146-157.
    2. Chuan Hong & Yang Ning & Shuang Wang & Hao Wu & Raymond J. Carroll & Yong Chen, 2017. "PLEMT: A Novel Pseudolikelihood-Based EM Test for Homogeneity in Generalized Exponential Tilt Mixture Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(520), pages 1393-1404, October.
    3. Yufan Wang & Xingzhong Xu, 2023. "A Posterior p-Value for Homogeneity Testing of the Three-Sample Problem," Mathematics, MDPI, vol. 11(18), pages 1-25, September.
    4. Jing Cheng & Jing Qin & Biao Zhang, 2009. "Semiparametric estimation and inference for distributional and general treatment effects," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(4), pages 881-904, September.
    5. Yang Ning & Yong Chen, 2015. "A Class of Pseudolikelihood Ratio Tests for Homogeneity in Exponential Tilt Mixture Models," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 42(2), pages 504-517, June.
    6. Giovanni Compiani & Yuichi Kitamura, 2016. "Using mixtures in econometric models: a brief review and some new results," Econometrics Journal, Royal Economic Society, vol. 19(3), pages 95-127, October.
    7. Moming Li & Guoqing Diao & Jing Qin, 2020. "On symmetric semiparametric two‐sample problem," Biometrics, The International Biometric Society, vol. 76(4), pages 1216-1228, December.
    8. Yufan Wang & Xingzhong Xu, 2023. "Homogeneity Test for Multiple Semicontinuous Data with the Density Ratio Model," Mathematics, MDPI, vol. 11(17), pages 1-28, September.
    9. Jing Qin & Denis H. Y. Leung, 2004. "A Semi-parametric Two-component “Compound” Mixture Model and Its Application to Estimating Malaria Attributable Fractions," Working Papers 17-2004, Singapore Management University, School of Economics.
    10. Mao, Shanjun & Fan, Xiaodan & Hu, Jie, 2021. "Correlation for tree-shaped datasets and its Bayesian estimation," Computational Statistics & Data Analysis, Elsevier, vol. 164(C).
    11. Hanson, Timothy E. & de Carvalho, Miguel & Chen, Yuhui, 2017. "Bernstein polynomial angular densities of multivariate extreme value distributions," Statistics & Probability Letters, Elsevier, vol. 128(C), pages 60-66.
    12. Guanghua Han & Ming Dong, 2017. "Sustainable Regulation of Information Sharing with Electronic Data Interchange by a Trust-Embedded Contract," Sustainability, MDPI, vol. 9(6), pages 1-22, June.
    13. Andrés Farall & Ricardo Maronna & Tomás Tetzlaff, 2011. "A mixture model for the detection of Neosporosis without a gold standard," Journal of Applied Statistics, Taylor & Francis Journals, vol. 38(5), pages 913-926, February.
    14. Liugen Xue, 2010. "Empirical Likelihood Local Polynomial Regression Analysis of Clustered Data," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 37(4), pages 644-663, December.
    15. Daniela Castro Camilo & Miguel de Carvalho & Jennifer Wadsworth, 2017. "Time-Varying Extreme Value Dependence with Application to Leading European Stock Markets," Papers 1709.01198, arXiv.org.
    16. Mhalla, Linda & Chavez-Demoulin, Valérie & Naveau, Philippe, 2017. "Non-linear models for extremal dependence," Journal of Multivariate Analysis, Elsevier, vol. 159(C), pages 49-66.
    17. Wang, Dong & Chen, Song Xi, 2009. "Combining quantitative trait loci analyses and microarray data: An empirical likelihood approach," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1661-1673, March.
    18. Raphaël Huser & Marc G. Genton, 2016. "Non-Stationary Dependence Structures for Spatial Extremes," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 21(3), pages 470-491, September.
    19. Jing Qin & Kung-Yee Liang, 2011. "Hypothesis Testing in a Mixture Case–Control Model," Biometrics, The International Biometric Society, vol. 67(1), pages 182-193, March.
    20. Zhang, Archer Gong & Chen, Jiahua, 2022. "Density ratio model with data-adaptive basis function," Journal of Multivariate Analysis, Elsevier, vol. 191(C).
