Authors:
- Haiyan Yu
(Center for Data and Decision Sciences, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
These authors contributed equally to this work.)
- Bing Han
(School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
These authors contributed equally to this work.)
- Nicholas Rios
(Department of Statistics, George Mason University, Fairfax, VA 22031, USA
These authors contributed equally to this work.)
- Jianbin Chen
(School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, China
These authors contributed equally to this work.)
Abstract
Observational data with massive sample sizes are often distributed across many local machines. From an experimental design perspective, investigators often wish to identify the effect of new treatments (including ML algorithms) on many blocks of experimental data. Under time or budget constraints, assigning all treatments to each block is not always feasible. This produces incomplete responses relative to a randomized complete block design (RCBD). These incomplete responses are missing by design. However, whether they can be estimated with missing-data imputation methods is not well understood, which makes it challenging to identify treatment effects correctly. To this end, this paper provides a method for imputing and analyzing responses with missing data. The proposed method consists of three steps: Reconstruction, Imputation, and ‘Complete’-data Analysis (RICA). The incomplete responses are imputed with the expectation-maximization (EM) algorithm. The RCBD model is then fitted to the resulting dataset. The identifiability result suggests that the missingness may be nonignorable within each block, but that the data of an incomplete design as a whole are missing by design when the design is balanced. Theoretical results on relative efficiency also indicate when the missing responses should be imputed for incomplete designs, with balanced variance playing a role. Applications to real-world data verify the efficacy of this method.
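The imputation and analysis steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard additive RCBD model y[i, j] = mu + tau[i] + beta[j] + noise with normal errors, under which an EM-style iteration amounts to refitting the model and refilling the missing cells with their fitted values. The function name `em_impute_rcbd` and the toy design are hypothetical, and the Reconstruction step is not covered.

```python
import numpy as np


def em_impute_rcbd(y, max_iter=500, tol=1e-10):
    """Fill NaN cells of a treatments-by-blocks table (illustrative sketch).

    Assumes the additive model y[i, j] = mu + tau[i] + beta[j] + noise.
    Returns the completed table and the fitted mu, tau, beta.
    """
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    # Start by filling every missing cell with the mean of the observed cells.
    filled = np.where(missing, np.nanmean(y), y)
    for _ in range(max_iter):
        # M-step: least-squares fit of the additive model to the completed table.
        mu = filled.mean()
        tau = filled.mean(axis=1) - mu   # treatment (row) effects
        beta = filled.mean(axis=0) - mu  # block (column) effects
        fitted = mu + tau[:, None] + beta[None, :]
        # E-step: replace missing cells by their fitted expectations.
        updated = np.where(missing, fitted, filled)
        converged = np.max(np.abs(updated - filled)) < tol
        filled = updated
        if converged:
            break
    return filled, mu, tau, beta


# Toy balanced incomplete design: 3 treatments, 3 blocks, one treatment
# left out of each block (assumed data, noise-free for clarity).
true_tau = np.array([-1.0, 0.0, 1.0])
true_beta = np.array([-2.0, 0.0, 2.0])
truth = 10.0 + true_tau[:, None] + true_beta[None, :]
observed = truth.copy()
observed[np.diag_indices(3)] = np.nan

completed, mu_hat, tau_hat, beta_hat = em_impute_rcbd(observed)
```

With noise-free additive data the imputed cells should match the withheld values up to the convergence tolerance. With noisy data, the completed table yields treatment-effect estimates, but residual degrees of freedom would need to be reduced by the number of imputed cells before any variance-based inference.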
Suggested Citation
Haiyan Yu & Bing Han & Nicholas Rios & Jianbin Chen, 2024.
"Missing Data Imputation in Balanced Construction for Incomplete Block Designs,"
Mathematics, MDPI, vol. 12(21), pages 1-22, October.
Handle:
RePEc:gam:jmathe:v:12:y:2024:i:21:p:3419-:d:1511819