
Using sufficient direction factor model to analyze latent activities associated with breast cancer survival

Authors

  • Seungchul Baek
  • Yen‐Yi Ho
  • Yanyuan Ma

Abstract

High‐dimensional gene expression data often exhibit intricate correlation patterns as a result of coordinated genetic regulation. In practice, however, these coordinated underlying activities are difficult to measure directly. Analysis of breast cancer survival data with gene expressions motivates us to use a two‐stage latent factor approach to estimate these unobserved coordinated biological processes. Compared to existing approaches, our proposed procedure has several unique characteristics in the first stage. First, it incorporates prior biological knowledge about gene‐pathway membership into the analysis and explicitly models the effects of genetic pathways on the latent factors. Second, to characterize the molecular heterogeneity of breast cancer, it provides estimates specific to each cancer subtype. Finally, it incorporates a sparsity condition, because genetic networks are often sparse. In the second stage, we investigate the relationship between latent factor activity levels and censored survival time using a general dimension reduction model in the survival analysis context. Combining the factor model and the sufficient direction model provides an efficient way of analyzing high‐dimensional data and reveals some interesting relations in the breast cancer gene expression data.
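
To make the idea of a first-stage latent factor extraction concrete, the sketch below simulates an expression matrix driven by a few latent factors and recovers them with plain principal components, choosing the number of factors by an eigenvalue ratio. This is a generic stand-in and not the authors' estimator: it ignores pathway membership, subtype-specific estimates, sparsity, and the second-stage survival model. All names (`estimate_factors`, `eigenvalue_ratio_k`) and the simulated dimensions are hypothetical choices made for illustration.

```python
import numpy as np


def eigenvalue_ratio_k(X, k_max=10):
    """Pick the number of factors as the index maximizing the ratio of
    consecutive eigenvalues of the sample covariance (illustrative rule)."""
    Xc = X - X.mean(axis=0)                      # center each gene (column)
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    eig = s ** 2 / X.shape[0]                    # eigenvalues of Xc'Xc / n
    ratios = eig[:k_max] / eig[1:k_max + 1]      # eig_j / eig_{j+1}
    return int(np.argmax(ratios)) + 1


def estimate_factors(X, k):
    """PCA estimate of k latent factors from an n-by-p matrix X.

    Returns factor scores (n-by-k, scaled so F'F/n = I) and loadings (p-by-k).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    factors = np.sqrt(n) * U[:, :k]
    loadings = Vt[:k].T * s[:k] / np.sqrt(n)
    return factors, loadings


# Simulated data (assumption): n samples, p genes, k_true latent activities.
rng = np.random.default_rng(0)
n, p, k_true = 200, 500, 3
F = rng.normal(size=(n, k_true))                 # latent factor levels
B = rng.normal(size=(p, k_true))                 # gene loadings
X = F @ B.T + rng.normal(size=(n, p))            # observed expression matrix

k_hat = eigenvalue_ratio_k(X)
F_hat, B_hat = estimate_factors(X, k_hat)
print("estimated number of factors:", k_hat)
```

In a two-stage analysis of the kind the abstract describes, the estimated factor levels (`F_hat` here) would then serve as low-dimensional covariates for a model of censored survival time; that second stage is outside the scope of this sketch.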

Suggested Citation

  • Seungchul Baek & Yen‐Yi Ho & Yanyuan Ma, 2020. "Using sufficient direction factor model to analyze latent activities associated with breast cancer survival," Biometrics, The International Biometric Society, vol. 76(4), pages 1340-1350, December.
  • Handle: RePEc:bla:biomet:v:76:y:2020:i:4:p:1340-1350
    DOI: 10.1111/biom.13208

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13208
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fan, Jianqing & Xue, Lingzhou & Yao, Jiawei, 2017. "Sufficient forecasting using factor models," Journal of Econometrics, Elsevier, vol. 201(2), pages 292-306.
    2. Yuefeng Han & Rong Chen & Cun-Hui Zhang, 2020. "Rank Determination in Tensor Factor Model," Papers 2011.07131, arXiv.org, revised May 2022.
    3. Zhu, Xuehu & Guo, Xu & Wang, Tao & Zhu, Lixing, 2020. "Dimensionality determination: A thresholding double ridge ratio approach," Computational Statistics & Data Analysis, Elsevier, vol. 146(C).
    4. Wu, Jianhong, 2019. "Detecting irrelevant variables in possible proxies for the latent factors in macroeconomics and finance," Economics Letters, Elsevier, vol. 176(C), pages 60-63.
    5. Zhaoxing Gao & Ruey S. Tsay, 2021. "Divide-and-Conquer: A Distributed Hierarchical Factor Approach to Modeling Large-Scale Time Series Data," Papers 2103.14626, arXiv.org.
    6. Shuquan Yang & Nengxiang Ling & Yulin Gong, 2022. "Robust estimation of the number of factors for the pair-elliptical factor models," Computational Statistics, Springer, vol. 37(3), pages 1495-1522, July.
    7. Fan, Jianqing & Jiang, Bai & Sun, Qiang, 2022. "Bayesian factor-adjusted sparse regression," Journal of Econometrics, Elsevier, vol. 230(1), pages 3-19.
    8. Yuefeng Han & Rong Chen & Dan Yang & Cun-Hui Zhang, 2020. "Tensor Factor Model Estimation by Iterative Projection," Papers 2006.02611, arXiv.org, revised Jul 2024.
    9. Minseog Oh & Donggyu Kim, 2024. "Property of Inverse Covariance Matrix-based Financial Adjacency Matrix for Detecting Local Groups," Papers 2412.05664, arXiv.org.
    10. Abberger, Klaus & Graff, Michael & Siliverstovs, Boriss & Sturm, Jan-Egbert, 2018. "Using rule-based updating procedures to improve the performance of composite indicators," Economic Modelling, Elsevier, vol. 68(C), pages 127-144.
    11. Matteo Barigozzi & Marc Hallin, 2023. "Dynamic Factor Models: a Genealogy," Papers 2310.17278, arXiv.org, revised Jan 2024.
    12. Zhou, Ruichao & Wu, Jianhong, 2023. "Determining the number of change-points in high-dimensional factor models by cross-validation with matrix completion," Economics Letters, Elsevier, vol. 232(C).
    13. Bo Zhang & Jiti Gao & Guangming Pan & Yanrong Yang, 2023. "Eigen-Analysis for High-Dimensional Time Series Clustering," Monash Econometrics and Business Statistics Working Papers 22/23, Monash University, Department of Econometrics and Business Statistics.
    14. Bodnar, Taras & Reiß, Markus, 2016. "Exact and asymptotic tests on a factor model in low and large dimensions with applications," Journal of Multivariate Analysis, Elsevier, vol. 150(C), pages 125-151.
    15. Barigozzi, Matteo & Trapani, Lorenzo, 2020. "Sequential testing for structural stability in approximate factor models," Stochastic Processes and their Applications, Elsevier, vol. 130(8), pages 5149-5187.
    16. Jiti Gao & Guangming Pan & Yanrong Yang & Bo Zhang, 2019. "Estimation of Cross-Sectional Dependence in Large Panels," Papers 1904.06843, arXiv.org.
    17. He, Yong & Kong, Xinbing & Trapani, Lorenzo & Yu, Long, 2023. "One-way or two-way factor model for matrix sequences?," Journal of Econometrics, Elsevier, vol. 235(2), pages 1981-2004.
    18. Jianqing Fan & Yuan Liao & Martina Mincheva, 2013. "Large covariance estimation by thresholding principal orthogonal complements," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(4), pages 603-680, September.
    19. Fan, Jianqing & Ke, Yuan & Liao, Yuan, 2021. "Augmented factor models with applications to validating market risk factors and forecasting bond risk premia," Journal of Econometrics, Elsevier, vol. 222(1), pages 269-294.
    20. Christian Brownlees & Guðmundur Stefán Guðmundsson & Yaping Wang, 2024. "Performance of Empirical Risk Minimization For Principal Component Regression," Papers 2409.03606, arXiv.org, revised Sep 2024.
