Application of incremental learning in landslide susceptibility assessment: A case study of Tianshui, Gansu Province
Abstract: To improve the generalization ability of machine learning models in landslide susceptibility assessment, this paper takes Tianshui City, Gansu Province, as a case study and adopts an incremental learning model based on LightGBM. The AutoGluon automated machine learning framework is used for hyperparameter optimization and model stacking, and the SHAP explainability framework is used for feature selection and data anomaly analysis. Together, these methods yield an incremental learning model suited to landslide susceptibility assessment. Validation with landslide hazard data collected from different areas of Tianshui shows that the model can effectively identify and predict landslide-prone areas, adapt itself to new datasets, and improve model performance.
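Table 1. Landslide sample information
| No. | DEM | NDBI | NDVI | NDWI | Profile curvature | Land use | Soil | Topographic wetness | Terrain roughness | Terrain relief | Landslide (0/1) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1882.51 | −0.44 | 0.81 | −0.71 | −0.31 | 2 | 8 | 5.14 | 1.14 | 50.57 | 0 |
| 2 | 1837.37 | −0.44 | 0.81 | −0.72 | −0.16 | 2 | 8 | 4.76 | 1.17 | 63.51 | 0 |
| 3 | 1786.96 | −0.45 | 0.82 | −0.72 | 0.18 | 2 | 8 | 4.80 | 1.21 | 77.71 | 0 |
| 4 | 1826.43 | −0.44 | 0.80 | −0.71 | −0.05 | 2 | 8 | 4.73 | 1.19 | 75.81 | 0 |
| 5 | 1820.29 | −0.44 | 0.80 | −0.71 | −0.17 | 2 | 8 | 4.90 | 1.19 | 70.70 | 0 |
| 6 | 1767.20 | −0.45 | 0.82 | −0.73 | 0.17 | 2 | 8 | 4.59 | 1.16 | 68.66 | 0 |
| 1027 | 1627.71 | −0.23 | 0.60 | −0.56 | −0.68 | 1 | 1 | 5.39 | 1.08 | 54.24 | 1 |
| 1028 | 1545.51 | −0.34 | 0.73 | −0.66 | 0.10 | 1 | 1 | 5.26 | 1.08 | 49.08 | 1 |
| 1029 | 1549.01 | −0.29 | 0.63 | −0.59 | 0.16 | 1 | 1 | 6.11 | 1.10 | 53.96 | 1 |
| 1030 | 1585.05 | −0.18 | 0.46 | −0.46 | 0.01 | 1 | 1 | 4.83 | 1.11 | 55.04 | 1 |
| 1031 | 1540.65 | −0.25 | 0.56 | −0.54 | −0.08 | 1 | 1 | 5.46 | 1.07 | 44.02 | 1 |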
Table 2. Optimal parameters
| Parameter | learning_rate | n_estimators | max_depth | max_features | num_leaves | feature_fraction | min_data_in_leaf | num_boost_round |
|---|---|---|---|---|---|---|---|---|
| Optimal value | 0.1615 | 448 | 8 | 0.3049 | 77 | 0.4447 | 6 | 92 |
Table 3. Recall rate and F1 value before and after training
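| Metric | Recall | Precision | F1 score |
|---|---|---|---|
| Before incremental training | 0.846 | 0.807 | 0.826 |
| After incremental training | 0.842 | 0.820 | 0.831 |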
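The F1 values in Table 3 can be cross-checked against the reported recall and precision. The short sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; this section does not state the formula explicitly, and the function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in Table 3: (recall, precision, reported F1).
table3 = {
    "before incremental training": (0.846, 0.807, 0.826),
    "after incremental training": (0.842, 0.820, 0.831),
}

for stage, (recall, precision, reported_f1) in table3.items():
    computed = f1_score(precision, recall)
    print(f"{stage}: computed F1 = {computed:.3f}, reported F1 = {reported_f1:.3f}")
```

Under this definition the computed values agree with the reported ones to three decimal places. The table also shows that recall decreases slightly after incremental training (0.846 to 0.842) while precision increases (0.807 to 0.820), giving a net F1 gain of 0.005.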