Abstract:
Multidimensional time series data are widely used, but missing values and outliers can make them unreliable. This paper proposes a multidimensional time series data processing and reselection mechanism (MTSM). The method imputes missing values with a Transformer-based model and combines the 3σ rule with box plots to detect outliers and correct them hierarchically. The imputed and corrected data are then re-screened using multi-scale fuzzy entropy, boundary mixture resampling, or Gaussian mixture clustering sampling, chosen according to data type. In a comparative analysis on COVID-19 data from the World Health Organization, MTSM outperforms GRU, RNN, and LATC across different missing and outlier rates, demonstrating strong accuracy and robustness.
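
As an illustration of the two outlier detectors named above, the following is a minimal sketch of the 3σ rule and the box-plot (interquartile range) rule applied to a single series. The function names, the whisker factor `k = 1.5`, and the choice to intersect the two masks are assumptions made for this example; the abstract does not specify how MTSM combines the detectors or performs hierarchical correction.

```python
import numpy as np

def three_sigma_outliers(x):
    """Flag points more than 3 standard deviations from the mean (NaNs ignored)."""
    x = np.asarray(x, dtype=float)
    mu = np.nanmean(x)
    sigma = np.nanstd(x)
    return np.abs(x - mu) > 3 * sigma

def boxplot_outliers(x, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is the usual box-plot default."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

# Hypothetical series: 20 values near 10 plus one obvious spike.
series = [10, 11, 9, 10, 12, 10, 11, 9, 10, 10,
          11, 10, 9, 10, 11, 10, 12, 10, 9, 10, 100]

sigma_mask = three_sigma_outliers(series)
box_mask = boxplot_outliers(series)

# Assumption for this sketch: treat a point as an outlier only if both rules agree.
combined_mask = sigma_mask & box_mask
outlier_indices = np.flatnonzero(combined_mask).tolist()
```

Here both rules flag only the final spike. The 3σ rule is sensitive to the spike inflating the standard deviation, while the box-plot rule relies on quartiles and is less affected, which is one reason the two are often used together.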