Lithology identification of tight sandstone reservoirs based on SSMO-SSA-LGBM algorithm
Keywords:
- SSMO-SSA-LGBM algorithm
- imbalanced data
- lithology identification
- tight sandstone reservoir
- Huachi, Gansu
Abstract: Objective Existing log-based lithology identification methods handle imbalanced lithology classes poorly and lack sensitivity when applied to tight sandstone reservoirs.
Methods This study proposes the SSMO-SSA-LGBM model. First, the SVM-SMOTE oversampling algorithm (abbreviated SSMO) balances the training set by generating synthetic samples for the lithology classes with few samples. The synthetic samples are merged with the original training set, and the resulting set is used to train the LightGBM (LGBM) model. Because LGBM exposes many hyperparameters, the Sparrow Search Algorithm (SSA) is used to search for the best hyperparameter combination. The model is trained on logging data from the Yan 10 tight sandstone reservoir in the S Block of the Huachi Oilfield, Gansu, and compared with KNN, AdaBoost, random forest, and other models.
Results After SSMO balancing, the LGBM model identifies minority lithology classes more reliably. The SSA global search finds the optimal LGBM hyperparameters in relatively few iterations. The SSMO-SSA-LGBM model gives the best predictive performance among the compared models, and its lithology predictions on the validation wells agree closely with core data.
Conclusions The SSMO algorithm effectively mitigates the adverse effect of lithology class imbalance on prediction results. The SSA global search reaches the optimal LGBM hyperparameter combination in relatively few iterations, which brings model performance to its best level. The model performs well in the S Block of the Huachi Oilfield.
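The SSA hyperparameter search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified sparrow search written with the Python standard library only, in which the matrix term of the scrounger update is replaced by a random sign divided by the dimension. The function name `sparrow_search`, its parameters, the search bounds, and the objective `toy_error` are all assumptions made for illustration; `toy_error` stands in for the validation error of an LGBM model trained with a given `learning_rate` and `num_leaves`.

```python
import math
import random


def sparrow_search(objective, bounds, n_sparrows=20, n_iter=50,
                   producer_frac=0.2, scout_frac=0.1, safety=0.8, seed=0):
    """Minimise objective(x) inside box bounds with a simplified SSA."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_sparrows)]
    fit = [objective(x) for x in pop]
    n_prod = max(1, int(producer_frac * n_sparrows))
    n_scout = max(1, int(scout_frac * n_sparrows))
    best_f = min(fit)
    best_x = pop[fit.index(best_f)][:]
    history = [best_f]

    for _ in range(n_iter):
        # Rank sparrows so that the lowest error comes first.
        order = sorted(range(n_sparrows), key=lambda j: fit[j])
        pop = [pop[j] for j in order]
        lead, worst = pop[0], pop[-1]
        alarm = rng.random()
        moved = []
        for rank, x in enumerate(pop, start=1):
            if rank <= n_prod:
                if alarm < safety:      # no predator: producers keep foraging
                    a = rng.random() + 1e-12
                    y = [v * math.exp(-rank / (a * n_iter)) for v in x]
                else:                   # alarm raised: random move to safety
                    q = rng.gauss(0.0, 1.0)
                    y = [v + q for v in x]
            elif rank > n_sparrows / 2:  # worst scroungers fly elsewhere
                q = rng.gauss(0.0, 1.0)
                y = [q * math.exp((w - v) / rank ** 2) for v, w in zip(x, worst)]
            else:                        # other scroungers follow the best producer
                y = [p + abs(v - p) * rng.choice((-1.0, 1.0)) / dim
                     for v, p in zip(x, lead)]
            moved.append(clip(y))
        pop = moved
        fit = [objective(x) for x in pop]
        worst_f = max(fit)
        worst = pop[fit.index(worst_f)]
        # A random subset of sparrows acts as scouts that sense danger.
        for j in rng.sample(range(n_sparrows), n_scout):
            if fit[j] > best_f:          # move towards the best known position
                beta = rng.gauss(0.0, 1.0)
                y = [b + beta * abs(v - b) for v, b in zip(pop[j], best_x)]
            else:                        # already as good as the best: step away
                k = rng.uniform(-1.0, 1.0)
                gap = abs(fit[j] - worst_f) + 1e-12
                y = [v + k * abs(v - w) / gap for v, w in zip(pop[j], worst)]
            pop[j] = clip(y)
            fit[j] = objective(pop[j])
        if min(fit) < best_f:
            best_f = min(fit)
            best_x = pop[fit.index(best_f)][:]
        history.append(best_f)
    return best_x, best_f, history


def toy_error(x):
    """Hypothetical stand-in for a validation error over (learning_rate, num_leaves)."""
    learning_rate, num_leaves = x
    return (math.log10(learning_rate) + 1.8) ** 2 + ((num_leaves - 31) / 50) ** 2


bounds = [(0.005, 0.3), (8.0, 128.0)]  # illustrative search ranges
best_x, best_f, history = sparrow_search(toy_error, bounds)
```

In a real workflow the objective would train and validate the classifier for each candidate, and integer hyperparameters such as `num_leaves` would be rounded before use. The `history` list records the best error after each iteration, which is how the number of iterations needed for convergence could be inspected.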
Table 1. Ranges of typical logging responses for each lithology in the study area

| Lithology | SP (mV) | GR (API) | AC (μs/m) | DEN (g/cm³) | CNL (%) | RT (Ω·m) |
| --- | --- | --- | --- | --- | --- | --- |
| Sandy conglomerate | 25.6~38.0 | 36.8~61.3 | 218.6~235.7 | 2.54~2.74 | 12.34~18.43 | 62.95~87.64 |
| Coarse sandstone | 17.1~44.8 | 40.7~67.6 | 223.0~254.5 | 2.45~2.69 | 11.12~22.57 | 59.14~116.70 |
| Medium sandstone | 13.7~39.8 | 25.8~69.1 | 238.8~251.2 | 2.48~2.68 | 11.24~26.67 | 58.94~164.74 |
| Fine sandstone | 18.8~49.71 | 51.2~89.8 | 220.3~251.7 | 2.42~2.68 | 10.44~24.91 | 61.15~99.58 |
| Argillaceous siltstone | 42.1~78.8 | 72.3~139.9 | 235.0~278.4 | 2.50~2.67 | 11.98~33.36 | 17.31~63.70 |
| Mudstone | 94.3~51.5 | 68.6~170.9 | 240.1~293.0 | 2.19~2.70 | 14.22~37.29 | 10.12~39.86 |
| Carbonaceous mudstone | 62.4~73.5 | 114.6~128.7 | 252.4~264.4 | 2.56~2.65 | 22.98~26.96 | 81.71~91.24 |
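Table 2. Division of the core-calibrated samples into training and test sets

| Lithology | Class label | Core samples | Training set | Test set |
| --- | --- | --- | --- | --- |
| Sandy conglomerate | 0 | 22 | 15 | 7 |
| Coarse sandstone | 1 | 51 | 38 | 13 |
| Medium sandstone | 2 | 244 | 198 | 46 |
| Fine sandstone | 3 | 248 | 195 | 53 |
| Argillaceous siltstone | 4 | 177 | 148 | 29 |
| Mudstone | 5 | 252 | 203 | 49 |
| Carbonaceous mudstone | 6 | 14 | 8 | 6 |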
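The imbalance in the training-set counts of Table 2 can be made concrete with a short sketch. It assumes, without confirmation from the source, that every class is oversampled up to the size of the largest class. The `smote_like` function is a hypothetical helper that shows only the interpolation step shared by SMOTE-type methods. It omits the step that distinguishes SVM-SMOTE, in which a support vector machine is fitted first and new samples are generated near the minority-class support vectors. The feature values are made up for illustration.

```python
import random

# Training-set counts per class label, taken from Table 2.
train_counts = {0: 15, 1: 38, 2: 198, 3: 195, 4: 148, 5: 203, 6: 8}

# Assumption: each class is raised to the majority-class size.
target = max(train_counts.values())
n_synthetic = {label: target - n for label, n in train_counts.items()}


def smote_like(samples, n_new, k=3, seed=0):
    """Create n_new points by interpolating between a sample and a near neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other samples by squared Euclidean distance to the base point.
        others = sorted((s for s in samples if s is not base),
                        key=lambda s: sum((a - b) ** 2 for a, b in zip(s, base)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical (GR, DEN) pairs for a minority class; not data from the paper.
# Real features would normally be standardised before computing distances.
minority = [[116.2, 2.57], [121.5, 2.60], [125.0, 2.63], [127.9, 2.64]]
synthetic = smote_like(minority, n_new=5)
```

Under the stated assumption, class 6 (8 training samples) would need 195 synthetic samples and class 0 (15 training samples) would need 188, while class 5 needs none.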
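Table 3. Hyperparameters of each model after SSA optimization

| Model | Hyperparameter combination after SSA optimization |
| --- | --- |
| KNN | metric=euclidean; n_neighbor=5; weights=uniform |
| Random forest | n_estimators=530; max_depth=10; min_sam_spl=5; min_sam_leaf=1 |
| AdaBoost | n_estimators=509; learning_rate=0.013; algorithm=SAMME; base_estimator=CART |
| LGBM | boosting_type=GBDT; num_leaves=31; n_estimators=510; learning_rate=0.016 |

Note: metric is the distance metric; n_neighbor is the number of neighbors; weights is the weight function; n_estimators is the number of estimators (iterations); max_depth is the maximum depth of a decision tree; min_sam_spl is the minimum number of samples required to split an internal node; min_sam_leaf is the minimum number of samples at a leaf node; base_estimator is the base classifier, which defaults to a CART decision tree in AdaBoost; algorithm defaults to SAMME in AdaBoost; learning_rate is the learning rate; boosting_type defaults to GBDT in LGBM; num_leaves is the number of leaves per tree.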
Table 4. Overall prediction results of the compared models on the test set

| Model | Precision (%) | Recall (%) | F1 score (%) | Computation time (s) |
| --- | --- | --- | --- | --- |
| SSMO-SSA-KNN | 83.77 | 81.52 | 82.16 | 657.423 |
| SSMO-SSA-Random forest | 89.21 | 88.95 | 88.47 | 722.337 |
| SSMO-SSA-AdaBoost | 91.14 | 90.73 | 90.25 | 719.774 |
| SSMO-SSA-LGBM | 95.54 | 94.67 | 95.13 | 716.425 |
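The precision, recall, and F1 columns of Table 4 are multi-class summary scores. The source does not state which averaging convention was used, so the sketch below assumes macro averaging (the unweighted mean of the per-class scores) and applies it to a made-up three-class confusion matrix. The function name `macro_scores` and the matrix values are illustrative only.

```python
def macro_scores(confusion):
    """Return macro-averaged (precision, recall, F1) in percent.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(n))  # column sum
        actual = sum(confusion[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return tuple(100 * sum(v) / n for v in (precisions, recalls, f1s))


# Hypothetical confusion matrix for three classes; not data from the paper.
toy_confusion = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 9],
]
precision, recall, f1 = macro_scores(toy_confusion)
```

Under this convention the macro F1 is the mean of the per-class F1 values, not the harmonic mean of the macro precision and macro recall.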