A method for identifying anomalous values of groundwater levels at candidate sites for the geological disposal of high-level radioactive waste
Keywords: time series anomaly detection / STL decomposition / minimum covariance determinant method / high-level radioactive waste / geological disposal
Abstract: Dynamic groundwater monitoring provides critical baseline data for the safety assessment of candidate sites for the geological disposal of high-level radioactive waste. However, actual monitoring data have been found to contain many anomalous values, which seriously interfere with accurate interpretation of groundwater dynamics. An efficient method for accurately identifying these anomalous values is therefore urgently needed. This study built a combined model for detecting anomalous groundwater levels from seasonal-trend decomposition based on locally weighted regression (STL) and the minimum covariance determinant (MCD) method, so that the MCD method performs anomaly detection on the more independent residual term. The results indicate that the combined model is more sensitive to anomalous data and detects them more accurately than the MCD method alone. The study further found that the threshold of the combined model should be close to the actual proportion of anomalous values to obtain the best detection performance. In addition, the applicability of the combined model was validated using groundwater level data from boreholes BSQ01, BSQ25, BS35, and BS26 in the Xinchang section. The validation shows that the combined model can accurately identify anomalous values hidden among a large volume of normal groundwater level data and is also applicable to detecting different types of anomalous events.
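The decompose-then-detect idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several things are assumptions made for illustration: a crude moving-median trend plus per-phase median seasonal term stands in for STL so that the example needs only the standard library; the series is synthetic; the `support` fraction of 0.75 is arbitrary; and the "threshold" is read as the expected share of anomalous points. In practice, a library implementation such as statsmodels' `STL` and scikit-learn's `MinCovDet` would be the more natural choice.

```python
import math
import random
from statistics import median


def residuals(y, period):
    """Crude stand-in for STL: moving-median trend, then per-phase median seasonal."""
    n, half = len(y), period // 2
    trend = [median(y[max(0, i - half):i + half + 1]) for i in range(n)]
    detrended = [v - t for v, t in zip(y, trend)]
    seasonal = [median(detrended[p::period]) for p in range(period)]
    return [d - seasonal[i % period] for i, d in enumerate(detrended)]


def univariate_mcd(x, h):
    """Exact 1-D MCD: mean and variance of the h sorted neighbours with least variance."""
    s = sorted(x)
    best_mean, best_var = None, math.inf
    for start in range(len(s) - h + 1):
        window = s[start:start + h]
        mean = sum(window) / h
        var = sum((v - mean) ** 2 for v in window) / h
        if var < best_var:
            best_mean, best_var = mean, var
    return best_mean, best_var


def detect(y, period, contamination, support=0.75):
    """Flag the `contamination` share of residuals farthest from the MCD centre."""
    r = residuals(y, period)
    center, var = univariate_mcd(r, int(support * len(r)))
    scale = math.sqrt(var) or 1e-12  # guard against a zero-variance subset
    dist = [abs(v - center) / scale for v in r]
    k = max(1, round(contamination * len(r)))
    return sorted(sorted(range(len(r)), key=dist.__getitem__, reverse=True)[:k])


# Synthetic water-level series (hypothetical, in metres):
# linear trend + 24-step cycle + small noise + three injected spikes.
random.seed(0)
levels = [100 + 0.01 * i + 0.5 * math.sin(2 * math.pi * i / 24) + random.gauss(0, 0.02)
          for i in range(240)]
spikes = {50: 1.5, 120: -1.5, 200: 2.0}
for idx, jump in spikes.items():
    levels[idx] += jump

# Threshold set to the true anomaly proportion, as the abstract recommends.
flagged = detect(levels, period=24, contamination=len(spikes) / len(levels))
print(flagged)
```

Because points are flagged by rank of robust distance, no consistency correction of the MCD scale is applied here. Setting `contamination` well above the true proportion would force normal residuals into the flagged set, and setting it well below would miss real anomalies, which is consistent with the abstract's remark that the threshold should be close to the actual proportion of anomalous values.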