CMStatistics 2023
B0743
Title: Concentration of measure bounds for matrix-variate data with missing values
Authors: Shuheng Zhou - University of California, Riverside (United States) [presenting]
Abstract: The following data perturbation model is considered, where the covariates incur multiplicative errors. For two random matrices $U$ and $X$, we denote by $U \circ X$ the Hadamard (or Schur) product, defined as $(U \circ X)_{ij} = U_{ij} X_{ij}$. A subgaussian matrix-variate model is studied in which the data are observed through a random mask $U$: $\mathcal{X} = U \circ X$, where $X = B^{1/2} Z A^{1/2}$, $Z$ is a random matrix with independent subgaussian entries, and $U$ is a mask matrix with either zero or positive entries, where $E[U_{ij}] \in [0,1]$ and all entries are mutually independent. Under the assumption of independence between $X$ and $U$, componentwise unbiased estimators are introduced for the covariance matrices $A$ and $B$, and concentration of measure bounds are proved in the sense of guaranteeing that the restricted eigenvalue (RE) conditions hold for the unbiased estimator of $B$ when columns of the data matrix are sampled at different rates. Multiple regression methods are further developed for estimating the inverse of $B$, and their statistical rates of convergence are shown. The results provide insight into sparse recovery of relationships among entities (samples, locations, items) when features (variables, time points, user ratings) are present in the observed data matrix $\mathcal{X}$ with heterogeneous rates. The proof techniques can be extended to other scenarios. Simulation evidence is provided, illuminating the theoretical predictions.
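As a rough illustration of the observation model $\mathcal{X} = U \circ X$ with $X = B^{1/2} Z A^{1/2}$, the sketch below simulates masked data and rescales the Gram matrix $\mathcal{X}\mathcal{X}^T$ entry by entry. It is not taken from the talk and may differ from the author's estimator. It rests on several simplifying assumptions: $Z$ is Gaussian (one subgaussian choice), the mask entries are Bernoulli with known column sampling rates `p[j]`, and $A$ has unit diagonal. Under those assumptions $E[(\mathcal{X}\mathcal{X}^T)_{ik}] = B_{ik} \sum_j p_j^2$ for $i \neq k$ and $B_{ii} \sum_j p_j$ for $i = k$, so dividing by these weights gives a componentwise unbiased estimate of $B$. The function names `sqrtm_psd`, `simulate_masked`, and `estimate_B` are hypothetical.

```python
import numpy as np

def sqrtm_psd(M):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def simulate_masked(B_half, A_half, p, rng):
    """Draw one observed matrix U * X with X = B^{1/2} Z A^{1/2}.

    Assumptions (not from the abstract): Z is standard Gaussian and
    U[i, j] ~ Bernoulli(p[j]), independent of X and of each other.
    """
    n, m = B_half.shape[0], A_half.shape[0]
    Z = rng.standard_normal((n, m))
    X = B_half @ Z @ A_half
    U = (rng.random((n, m)) < p).astype(float)  # p broadcasts over columns
    return U * X  # Hadamard product

def estimate_B(X_obs, p):
    """Entrywise-rescaled Gram matrix; unbiased for B when diag(A) = 1."""
    n = X_obs.shape[0]
    gram = X_obs @ X_obs.T
    weights = np.full((n, n), np.sum(p ** 2))  # off-diagonal: sum_j p_j^2
    np.fill_diagonal(weights, np.sum(p))       # diagonal: sum_j p_j
    return gram / weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 4, 50
    i, j = np.arange(n), np.arange(m)
    B = 0.5 ** np.abs(i[:, None] - i[None, :])  # illustrative row covariance
    A = 0.3 ** np.abs(j[:, None] - j[None, :])  # unit-diagonal column covariance
    p = np.linspace(0.5, 0.9, m)                # heterogeneous column rates
    X_obs = simulate_masked(sqrtm_psd(B), sqrtm_psd(A), p, rng)
    print(np.round(estimate_B(X_obs, p), 2))
```

A single draw gives a noisy estimate; averaging `estimate_B` over many independent draws should approach $B$, which is one way to check the unbiasedness claim numerically. Note that the diagonal and off-diagonal entries need different weights because $E[U_{ij}^2] = p_j$ while $E[U_{ij} U_{kj}] = p_j^2$ for $i \neq k$.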