Estimating Spatial Autocorrelation With Sampled Network Data
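To address the inaccurate estimation of spatial autocorrelation in large-scale networks (e.g., social networks) when the whole network cannot be observed, two new methods are proposed: the approximate maximum likelihood estimator and the paired maximum likelihood estimator. The latter is computationally more efficient and has good theoretical properties.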
Spatial autocorrelation is an important parameter in network data analysis, and it is commonly estimated by maximum likelihood. A rigorous implementation of maximum likelihood, however, requires the whole network to be observed. This is practically infeasible when the network is huge (e.g., Facebook, Twitter, Weibo, WeChat). In that case, one has to rely on sampled network data to make inferences about spatial autocorrelation. Doing so overlooks the network relationships (i.e., edges) that involve unsampled nodes, which distorts the network structure and leads to underestimated spatial autocorrelation. To solve this problem, we propose a novel solution. By temporarily assuming that the spatial autocorrelation is small, we are able to approximate the likelihood function by its first-order Taylor expansion. This leads to the approximate maximum likelihood estimator (AMLE), which in turn inspires the development of the paired maximum likelihood estimator (PMLE). Compared with AMLE, PMLE is computationally superior and is thus particularly useful for large-scale network data analysis. Under appropriate regularity conditions (without assuming a small spatial autocorrelation), we show theoretically that PMLE is consistent and asymptotically normal. Numerical studies based on both simulated and real datasets are presented for illustration.
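The sampling problem described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the AMLE or PMLE of this paper: it assumes a spatial autoregressive model of the form y = ρWy + ε with a row-normalized adjacency matrix W, a sparse random directed graph, simple random sampling of nodes, and a plain grid-search maximum likelihood fit. The names `row_normalize` and `sar_mle`, the graph generator, and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np

def row_normalize(A):
    """Divide each row by its out-degree; nodes without neighbours keep a zero row."""
    deg = A.sum(axis=1, keepdims=True)
    return A / np.maximum(deg, 1.0)

def sar_mle(W, y, grid=np.linspace(-0.9, 0.9, 181)):
    """Grid-search MLE of rho in y = rho * W y + eps, with eps ~ N(0, sigma^2 I)."""
    n = len(y)
    eig = np.linalg.eigvals(W)  # log|I - rho W| = sum_i log(1 - rho * lambda_i)
    Wy = W @ y
    best_rho, best_ll = grid[0], -np.inf
    for rho in grid:
        logdet = np.sum(np.log(1.0 - rho * eig)).real
        resid = y - rho * Wy
        # Concentrated log-likelihood with sigma^2 profiled out (constants dropped).
        ll = logdet - 0.5 * n * np.log(resid @ resid / n)
        if ll > best_ll:
            best_rho, best_ll = rho, ll
    return best_rho

rng = np.random.default_rng(0)
n, rho_true = 600, 0.5                      # assumed network size and true rho

# Assumed network: sparse random directed graph, about 4 out-neighbours per node.
A = (rng.random((n, n)) < 4.0 / n).astype(float)
np.fill_diagonal(A, 0.0)
W = row_normalize(A)

# Generate responses from the full network: y = (I - rho W)^{-1} eps.
y = np.linalg.solve(np.eye(n) - rho_true * W, rng.standard_normal(n))

# Observe only a simple random sample of nodes and the edges among them.
keep = rng.choice(n, size=n // 4, replace=False)
A_sub = A[np.ix_(keep, keep)]
edge_share = A_sub.sum() / A.sum()          # share of edges that survive sampling

rho_full = sar_mle(W, y)                    # uses the whole network
rho_sub = sar_mle(row_normalize(A_sub), y[keep])  # ignores unsampled neighbours

print(f"edges retained: {edge_share:.3f}")
print(f"rho (full network): {rho_full:.2f}, rho (sampled subnetwork): {rho_sub:.2f}")
```

Sampling a quarter of the nodes keeps only a small fraction of the edges, because an edge survives only if both of its endpoints are sampled. Comparing `rho_sub` with `rho_full` across several seeds gives a rough sense of how much the naive subnetwork estimate is distorted in this toy setting.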