Abstract:
Multidimensional time series data are widely used, but missing values and outliers can make them unreliable. This paper proposes a multidimensional time series data processing and re-screening mechanism (MTSM). The method imputes missing values with a Transformer-based model, corrects outliers hierarchically by combining the 3σ rule with box plot detection, and then re-screens the imputed and corrected data by applying multi-scale fuzzy entropy, boundary mixture resampling, and Gaussian mixture clustering sampling according to data type. In a comparative analysis on COVID-19 data from the World Health Organization, MTSM outperformed GRU, RNN, and LATC across different missing-value and outlier rates, and demonstrated strong accuracy and robustness.
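
As an illustration of the outlier detection step named above, the following minimal sketch combines the 3σ rule with the box plot (interquartile range) rule on a single series. It is not the MTSM implementation: the function names, the 1.5 × IQR whisker, and the way the two rules are combined into levels (0, 1, or 2 rules triggered) are assumptions made for illustration only.

```python
import numpy as np

def three_sigma_mask(x, k=3.0):
    """Flag points lying more than k standard deviations from the mean."""
    x = np.asarray(x, dtype=float)
    mu, sigma = np.nanmean(x), np.nanstd(x)
    if sigma == 0:
        # A constant series has no 3-sigma outliers.
        return np.zeros(x.shape, dtype=bool)
    return np.abs(x - mu) > k * sigma

def boxplot_mask(x, whisker=1.5):
    """Flag points outside [Q1 - whisker*IQR, Q3 + whisker*IQR]."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - whisker * iqr) | (x > q3 + whisker * iqr)

def outlier_levels(x):
    """Hypothetical combination: count how many of the two rules flag each point.

    0 = flagged by neither rule, 1 = flagged by one rule, 2 = flagged by both.
    """
    return three_sigma_mask(x).astype(int) + boxplot_mask(x).astype(int)
```

Calling `outlier_levels` on one dimension of a series returns an integer array of the same length. A correction stage could then treat level 2 points more aggressively than level 1 points, though how MTSM actually grades and corrects outliers is specified in the body of the paper rather than here.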