
Chinese Journal of Chemical Engineering, 2024, Vol. 71, Issue (7): 183-192. DOI: 10.1016/j.cjche.2024.03.023



A deep reinforcement learning approach to gasoline blending real-time optimization under uncertainty

Zhiwei Zhu, Minglei Yang, Wangli He, Renchu He, Yunmeng Zhao, Feng Qian   

  1. Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
  • Received: 2023-10-17  Revised: 2024-03-13  Online: 2024-07-28  Published: 2024-08-30
  • Contact: Yunmeng Zhao, E-mail: yunmeng.zhao@ecust.edu.cn; Feng Qian, E-mail: fqian@ecust.edu.cn
  • Supported by:
    This work was supported by the National Key Research & Development Program – Intergovernmental International Science and Technology Innovation Cooperation Project (2021YFE0112800), the National Natural Science Foundation of China (Key Program: 62136003), the National Natural Science Foundation of China (62073142), the Fundamental Research Funds for the Central Universities (222202417006), and Shanghai AI Lab.


Abstract: Real-time optimization techniques are widely used in the gasoline inline blending process to achieve objectives such as minimizing production cost. However, the effectiveness of real-time optimization in gasoline blending depends on accurate blending models and is challenged by stochastic disturbances. We therefore propose a real-time optimization algorithm based on the soft actor-critic (SAC) deep reinforcement learning strategy that optimizes gasoline blending without relying on a single blending model and is robust to disturbances. Our approach constructs the environment from nonlinear blending models and feedstocks subject to disturbances. The algorithm incorporates Lagrange multipliers and path constraints into the reward design to handle sparse product constraints. Carefully abstracted states facilitate convergence, and normalizing the action vector in each optimization period allows the agent to generalize, to some extent, across different target production scenarios. With these well-designed components, the SAC-based algorithm outperforms real-time optimization methods based on either nonlinear or linear programming. It even performs comparably to the time-horizon-based real-time optimization method, which requires knowledge of uncertainty models, confirming its ability to handle uncertainty without accurate models. Our simulations point to a promising way to free real-time optimization of the gasoline blending process from uncertainty models that are difficult to obtain in practice.
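
Note: To make the reward design sketched in the abstract concrete, the following minimal Python illustration (our own construction, not the authors' implementation; the variable names, the simplified linear blend rule, and all numbers are assumptions) shows how an unconstrained action vector can be normalized into blend fractions and how Lagrange-multiplier penalties on quality specifications can be folded into a scalar reward:

import numpy as np

def blend_action_to_fractions(action):
    # Softmax maps an unconstrained policy output to non-negative
    # fractions summing to 1, so every action is a feasible recipe.
    z = np.exp(action - np.max(action))  # shift for numerical stability
    return z / z.sum()

def reward(fractions, costs, qualities, spec_min, multipliers):
    # Negative blend cost, minus Lagrangian penalties for any
    # violated minimum-quality specifications.
    cost = float(fractions @ costs)
    blended = qualities.T @ fractions   # simplified linear blend rule
    violation = np.maximum(spec_min - blended, 0.0)
    return -cost - float(multipliers @ violation)

# Toy scenario with three feedstocks and two quality indices.
rng = np.random.default_rng(0)
action = rng.normal(size=3)                       # raw policy output
fractions = blend_action_to_fractions(action)
costs = np.array([5.0, 6.5, 8.0])                 # assumed feedstock prices
qualities = np.array([[88.0, 40.0],               # e.g. octane number and a
                      [92.0, 45.0],               # second quality index
                      [95.0, 50.0]])
spec_min = np.array([91.0, 42.0])                 # assumed minimum blended specs
multipliers = np.array([10.0, 10.0])              # assumed Lagrange multipliers
print(fractions, reward(fractions, costs, qualities, spec_min, multipliers))

In the paper's setting the blending models are nonlinear and the multipliers and path constraints are part of the SAC reward design for sparse product constraints; the linear rule above only keeps the sketch short.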

Key words: Deep reinforcement learning, Gasoline blending, Real-time optimization, Petroleum, Computer simulation, Neural networks