IDEAS home Printed from https://ideas.repec.org/a/spr/stabio/v9y2017i1d10.1007_s12561-016-9182-8.html
   My bibliography  Save this article

Unit-Free and Robust Detection of Differential Expression from RNA-Seq Data

Author

Listed:
  • Hui Jiang

    (University of Michigan)

  • Tianyu Zhan

    (University of Michigan)

Abstract

Ultra high-throughput sequencing of transcriptomes (RNA-Seq) is a widely used method for quantifying gene expression levels due to its low cost, high accuracy, and wide dynamic range for detection. However, the nature of RNA-Seq makes it nearly impossible to provide absolute measurements of transcript abundances. Several units or data summarization methods for transcript quantification have been proposed in the past to account for differences in transcript lengths and sequencing depths across different genes and different samples. Nevertheless, further between-sample normalization is still needed for reliable detection of differentially expressed genes. In this paper, we propose a unified statistical model for joint detection of differential gene expression and between-sample normalization. Our method is independent of the unit in which gene expression levels are summarized. We also introduce an efficient algorithm for model fitting. Due to the L0-penalized likelihood used in our model, it is able to reliably normalize the data and detect differential gene expression in some cases when more than 50% of the genes are differentially expressed in an asymmetric manner. We compare our method with existing methods using simulated and real data sets.

Suggested Citation

  • Hui Jiang & Tianyu Zhan, 2017. "Unit-Free and Robust Detection of Differential Expression from RNA-Seq Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(1), pages 178-199, June.
  • Handle: RePEc:spr:stabio:v:9:y:2017:i:1:d:10.1007_s12561-016-9182-8
    DOI: 10.1007/s12561-016-9182-8
    as

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s12561-016-9182-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s12561-016-9182-8?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    as
    1. Joseph K. Pickrell & John C. Marioni & Athma A. Pai & Jacob F. Degner & Barbara E. Engelhardt & Everlyne Nkadori & Jean-Baptiste Veyrieras & Matthew Stephens & Yoav Gilad & Jonathan K. Pritchard, 2010. "Understanding mechanisms underlying human gene expression variation with RNA sequencing," Nature, Nature, vol. 464(7289), pages 768-772, April.
    2. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    3. She, Yiyuan & Owen, Art B., 2011. "Outlier Detection Using Nonconvex Penalized Regression," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 626-639.
    4. Smyth Gordon K, 2004. "Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 3(1), pages 1-28, February.
    5. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. She, Yiyuan, 2012. "An iterative algorithm for fitting nonconvex penalized generalized linear models with grouped predictors," Computational Statistics & Data Analysis, Elsevier, vol. 56(10), pages 2976-2990.
    2. Luca Insolia & Ana Kenney & Martina Calovi & Francesca Chiaromonte, 2021. "Robust Variable Selection with Optimality Guarantees for High-Dimensional Logistic Regression," Stats, MDPI, vol. 4(3), pages 1-17, August.
    3. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    4. Zhu Wang, 2022. "MM for penalized estimation," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 54-75, March.
    5. Naimoli, Antonio, 2022. "Modelling the persistence of Covid-19 positivity rate in Italy," Socio-Economic Planning Sciences, Elsevier, vol. 82(PA).
    6. Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
    7. Zichen Zhang & Ye Eun Bae & Jonathan R. Bradley & Lang Wu & Chong Wu, 2022. "SUMMIT: An integrative approach for better transcriptomic data imputation improves causal gene identification," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    8. Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
    9. Capanu, Marinela & Giurcanu, Mihai & Begg, Colin B. & Gönen, Mithat, 2023. "Subsampling based variable selection for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 184(C).
    10. Bernardi, Mauro & Costola, Michele, 2019. "High-dimensional sparse financial networks through a regularised regression model," SAFE Working Paper Series 244, Leibniz Institute for Financial Research SAFE.
    11. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    12. Zeyu Bian & Erica E. M. Moodie & Susan M. Shortreed & Sahir Bhatnagar, 2023. "Variable selection in regression‐based estimation of dynamic treatment regimes," Biometrics, The International Biometric Society, vol. 79(2), pages 988-999, June.
    13. Li, Peili & Jiao, Yuling & Lu, Xiliang & Kang, Lican, 2022. "A data-driven line search rule for support recovery in high-dimensional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    14. Wang, Tao & Xu, Pei-Rong & Zhu, Li-Xing, 2012. "Non-convex penalized estimation in high-dimensional models with single-index structure," Journal of Multivariate Analysis, Elsevier, vol. 109(C), pages 221-235.
    15. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "High-Dimensional Methods and Inference on Structural and Treatment Effects," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 29-50, Spring.
    16. Jin, Shaobo & Moustaki, Irini & Yang-Wallentin, Fan, 2018. "Approximated penalized maximum likelihood for exploratory factor analysis: an orthogonal case," LSE Research Online Documents on Economics 88118, London School of Economics and Political Science, LSE Library.
    17. Zanhua Yin, 2020. "Variable selection for sparse logistic regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 821-836, October.
    18. Bonaccolto, Giovanni & Caporin, Massimiliano & Maillet, Bertrand B., 2022. "Dynamic large financial networks via conditional expected shortfalls," European Journal of Operational Research, Elsevier, vol. 298(1), pages 322-336.
    19. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    20. Centofanti, Fabio & Fontana, Matteo & Lepore, Antonio & Vantini, Simone, 2022. "Smooth LASSO estimator for the Function-on-Function linear regression model," Computational Statistics & Data Analysis, Elsevier, vol. 176(C).

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:spr:stabio:v:9:y:2017:i:1:d:10.1007_s12561-016-9182-8. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Sonal Shukla or Springer Nature Abstracting and Indexing (email available below). General contact details of provider: http://www.springer.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.