Sparse grid imputation (SGI) is a challenging problem because its goal is to infer the values of a whole grid from a limited number of cells with known values. Traditionally, this problem is solved with regression methods such as k-nearest neighbors (KNN) and Kriging. In the real world, however, extra information is often available that can improve the inference. In the SGI problem considered here, besides a limited, fixed set of grid cells with precise target-domain values, context information and imprecise observations are available over the whole grid. We propose a distribution estimation theory and implement it with a CycleGAN-based neural network model trained with the context information and imprecise observations.
In this implementation, the generator simultaneously runs the target embedding autoencoder and transforms the observation distributions into the target distribution, while the discriminator helps to shape the transformation. We consider a real-world problem, fine-grained PM2.5 inference, under realistic settings: only a few (less than 5%) of the grid cells have precise PM2.5 data, while every grid cell has weather context information and imprecise observations from satellites and microsensors. The task is to infer reasonable values for all empty grid cells. Because there is no ground truth for the empty cells, the out-of-sample MSE (mean squared error) and JSD (Jensen–Shannon divergence) are used as measurements in the empirical study. The results show that the proposed generator verifies the proposed theory and that the proposed method outperforms the best traditional regression methods.
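
As a rough illustration of the two measurements named above, the following sketch computes an out-of-sample MSE over held-out labelled cells and a histogram-based JSD between two sets of values. It is not the evaluation code of this work. The function names, the boolean hold-out mask, the number of histogram bins, and the choice of base-2 logarithm are all assumptions made for the example.

```python
import numpy as np

def out_of_sample_mse(pred, truth, held_out_mask):
    """Mean squared error restricted to held-out labelled cells.

    Assumption: `held_out_mask` is a boolean grid marking labelled cells
    that were hidden from the model during training.
    """
    diff = pred[held_out_mask] - truth[held_out_mask]
    return float(np.mean(diff ** 2))

def jensen_shannon_divergence(p, q):
    """JSD between two discrete distributions, base 2, so it lies in [0, 1]."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalise counts to probabilities
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        nz = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[nz] * np.log2(a[nz] / b[nz]))

    return float(0.5 * kl(p, m) + 0.5 * kl(q, m))

def histogram_jsd(inferred, reference, bins=20):
    """JSD between histograms of inferred and reference values.

    Assumption: both samples are binned on a shared range; the bin count
    is an arbitrary illustrative choice.
    """
    lo = min(inferred.min(), reference.min())
    hi = max(inferred.max(), reference.max())
    p, _ = np.histogram(inferred, bins=bins, range=(lo, hi))
    q, _ = np.histogram(reference, bins=bins, range=(lo, hi))
    return jensen_shannon_divergence(p, q)
```

With this convention, a lower MSE indicates closer point estimates on the held-out cells, and a JSD near 0 indicates that the inferred values are distributed similarly to the reference values.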