IDEAS home Printed from https://ideas.repec.org/a/bla/jorssb/v84y2022i2p630-653.html

SIMPLE: Statistical inference on membership profiles in large networks

Author

Listed:
  • Jianqing Fan
  • Yingying Fan
  • Xiao Han
  • Jinchi Lv

Abstract

Network data are prevalent in many contemporary big data applications, in which a common interest is to unveil important latent links between different pairs of nodes. Yet the fundamental question of how to precisely quantify the statistical uncertainty associated with the identification of latent links remains largely unexplored. In this paper, we propose the method of statistical inference on membership profiles in large networks (SIMPLE) in the setting of the degree‐corrected mixed membership model, where the null hypothesis assumes that a given pair of nodes share the same profile of community memberships. In the simpler case of no degree heterogeneity, the model reduces to the mixed membership model, for which an alternative, more robust test is also proposed. Both tests are Hotelling‐type statistics based on the rows of empirical eigenvectors or their ratios, whose asymptotic covariance matrices are very challenging to derive and estimate. Nevertheless, their analytical expressions are unveiled and the unknown covariance matrices are consistently estimated. Under some mild regularity conditions, we establish the exact limiting distributions of the two forms of SIMPLE test statistics under the null hypothesis and the contiguous alternative hypothesis. They are the chi‐square and the noncentral chi‐square distributions, respectively, with degrees of freedom depending on whether the degrees are corrected or not. We also address the important issue of estimating the unknown number of communities and establish the asymptotic properties of the associated test statistics. The advantages and practical utility of our new procedures in terms of both size and power are demonstrated through several simulation examples and real network applications.
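The abstract's reliance on eigenvector rows (or their ratios) can be motivated at the population level. The sketch below is not the SIMPLE statistic and estimates no covariance matrix; it only checks, on a noise-free mean matrix, that two nodes with the same membership profile have identical eigenvector rows when there is no degree heterogeneity, and identical row ratios once degrees are corrected. The parametrization H = Θ Π P Πᵀ Θ, the function name `top_eigvecs`, and every numerical value (n, K, P, the profiles, the degree parameters) are assumptions chosen for illustration.

```python
import numpy as np


def top_eigvecs(theta, Pi, P):
    """Return the K eigenvectors of H = Theta Pi P Pi^T Theta with largest |eigenvalue|."""
    K = P.shape[0]
    scaled = theta[:, None] * Pi      # Theta @ Pi, with Theta = diag(theta)
    H = scaled @ P @ scaled.T         # noise-free mean matrix, rank K
    vals, vecs = np.linalg.eigh(H)    # H is symmetric
    order = np.argsort(-np.abs(vals))[:K]
    return vecs[:, order]


rng = np.random.default_rng(0)
n, K = 200, 3                                   # hypothetical sizes
Pi = rng.dirichlet(np.ones(K), size=n)          # each row is a membership profile
Pi[0] = Pi[1] = [0.2, 0.3, 0.5]                 # nodes 0 and 1 share a profile
Pi[2] = [0.8, 0.1, 0.1]                         # node 2 has a different profile
P = 0.1 * np.ones((K, K)) + 0.4 * np.eye(K)     # hypothetical connectivity matrix

# Case 1: no degree heterogeneity (all degree parameters equal to 1).
V = top_eigvecs(np.ones(n), Pi, P)
same_gap = np.linalg.norm(V[0] - V[1])          # same profile: rows coincide
diff_gap = np.linalg.norm(V[0] - V[2])          # different profiles: rows differ

# Case 2: degree heterogeneity; nodes 0 and 1 share a profile but not a degree.
theta = rng.uniform(0.5, 1.0, size=n)
theta[0], theta[1] = 0.5, 1.0
W = top_eigvecs(theta, Pi, P)
dc_raw_gap = np.linalg.norm(W[0] - W[1])        # raw rows no longer coincide
ratios = W[:, 1:] / W[:, [0]]                   # divide by the leading eigenvector entry
dc_ratio_gap = np.linalg.norm(ratios[0] - ratios[1])  # ratios cancel the degree factor

print("no degree heterogeneity: same-profile gap", same_gap, "different-profile gap", diff_gap)
print("degree-corrected: raw gap", dc_raw_gap, "ratio gap", dc_ratio_gap)
```

With an observed adjacency matrix the eigenvectors are noisy, so these gaps are no longer exactly zero under the null; quantifying that noise is what the asymptotic covariance matrices described in the abstract are for.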

Suggested Citation

  • Jianqing Fan & Yingying Fan & Xiao Han & Jinchi Lv, 2022. "SIMPLE: Statistical inference on membership profiles in large networks," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 630-653, April.
  • Handle: RePEc:bla:jorssb:v:84:y:2022:i:2:p:630-653
    DOI: 10.1111/rssb.12505

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssb.12505
    Download Restriction: no

    References listed on IDEAS

    1. Peter J. Bickel & Purnamrita Sarkar, 2016. "Hypothesis testing for automated community detection in networks," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(1), pages 253-273, January.
    2. Jianqing Fan & Yuan Liao & Martina Mincheva, 2013. "Large covariance estimation by thresholding principal orthogonal complements," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(4), pages 603-680, September.
    3. Kehui Chen & Jing Lei, 2018. "Network Cross-Validation for Determining the Number of Communities in Network Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(521), pages 241-251, January.
    4. Fan, Jianqing & Fan, Yingying & Lv, Jinchi, 2008. "High dimensional covariance matrix estimation using a factor model," Journal of Econometrics, Elsevier, vol. 147(1), pages 186-197, November.

    Cited by:

    1. Zheng Tracy Ke & Jingming Wang, 2024. "Entry-Wise Eigenvector Analysis and Improved Rates for Topic Modeling on Short Documents," Mathematics, MDPI, vol. 12(11), pages 1-41, May.
    2. Jin, Jiashun & Ke, Zheng Tracy & Luo, Shengming, 2024. "Mixed membership estimation for social networks," Journal of Econometrics, Elsevier, vol. 239(2).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fan, Jianqing & Jiang, Bai & Sun, Qiang, 2022. "Bayesian factor-adjusted sparse regression," Journal of Econometrics, Elsevier, vol. 230(1), pages 3-19.
    2. Fan, Jianqing & Liao, Yuan & Shi, Xiaofeng, 2015. "Risks of large portfolios," Journal of Econometrics, Elsevier, vol. 186(2), pages 367-387.
    3. Seyoung Park & Eun Ryung Lee & Sungchul Lee & Geonwoo Kim, 2019. "Dantzig Type Optimization Method with Applications to Portfolio Selection," Sustainability, MDPI, vol. 11(11), pages 1-32, June.
    4. Christian M. Hafner & Oliver Linton & Haihan Tang, 2016. "Estimation of a multiplicative covariance structure in the large dimensional case," CeMMAP working papers 52/16, Institute for Fiscal Studies.
    5. Hafner, Christian M. & Linton, Oliver B. & Tang, Haihan, 2020. "Estimation of a multiplicative correlation structure in the large dimensional case," Journal of Econometrics, Elsevier, vol. 217(2), pages 431-470.
    6. HAFNER, Christian & LINTON, Oliver B. & TANG, Haihan, 2016. "Estimation of a Multiplicative Covariance Structure in the Large Dimensional Case," LIDAM Discussion Papers CORE 2016044, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    7. Clifford Lam & Phoenix Feng & Charlie Hu, 2017. "Nonlinear shrinkage estimation of large integrated covariance matrices," Biometrika, Biometrika Trust, vol. 104(2), pages 481-488.
    8. Jin-Chuan Duan & Weimin Miao, 2016. "Default Correlations and Large-Portfolio Credit Analysis," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 536-546, October.
    9. Tae-Hwy Lee & Ekaterina Seregina, 2024. "Optimal Portfolio Using Factor Graphical Lasso," Journal of Financial Econometrics, Oxford University Press, vol. 22(3), pages 670-695.
    10. Bai, Jushan & Liao, Yuan, 2016. "Efficient estimation of approximate factor models via penalized maximum likelihood," Journal of Econometrics, Elsevier, vol. 191(1), pages 1-18.
    11. Jingying Yang, 2024. "Element Aggregation for Estimation of High-Dimensional Covariance Matrices," Mathematics, MDPI, vol. 12(7), pages 1-16, March.
    12. Joongyeub Yeo & George Papanicolaou, 2016. "Random matrix approach to estimation of high-dimensional factor models," Papers 1611.05571, arXiv.org, revised Nov 2017.
    13. Gianluca De Nard & Olivier Ledoit & Michael Wolf, 2018. "Factor models for portfolio selection in large dimensions: the good, the better and the ugly," ECON - Working Papers 290, Department of Economics - University of Zurich, revised Dec 2018.
    14. Choi, Sung Hoon & Kim, Donggyu, 2023. "Large volatility matrix analysis using global and national factor models," Journal of Econometrics, Elsevier, vol. 235(2), pages 1917-1933.
    15. Chen, Jia & Li, Degui & Linton, Oliver, 2019. "A new semiparametric estimation approach for large dynamic covariance matrices with multiple conditioning variables," Journal of Econometrics, Elsevier, vol. 212(1), pages 155-176.
    16. Bodnar, Taras & Mazur, Stepan & Ngailo, Edward & Parolya, Nestor, 2017. "Discriminant analysis in small and large dimensions," Working Papers 2017:6, Örebro University, School of Business.
    17. Lam, Clifford, 2020. "High-dimensional covariance matrix estimation," LSE Research Online Documents on Economics 101667, London School of Economics and Political Science, LSE Library.
    18. Mårten Gulliksson & Stepan Mazur, 2020. "An Iterative Approach to Ill-Conditioned Optimal Portfolio Selection," Computational Economics, Springer;Society for Computational Economics, vol. 56(4), pages 773-794, December.
    19. Yoshimasa Uematsu & Takashi Yamagata, 2019. "Estimation of Weak Factor Models," DSSR Discussion Papers 96, Graduate School of Economics and Management, Tohoku University.
    20. Gautam Sabnis & Debdeep Pati & Anirban Bhattacharya, 2019. "Compressed Covariance Estimation with Automated Dimension Learning," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 81(2), pages 466-481, December.
