
Chinese Journal of Chemical Engineering, 2024, Vol. 71, Issue 7: 183-192. DOI: 10.1016/j.cjche.2024.03.023


A deep reinforcement learning approach to gasoline blending real-time optimization under uncertainty

Zhiwei Zhu, Minglei Yang, Wangli He, Renchu He, Yunmeng Zhao, Feng Qian   

  1. Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China
  • Received: 2023-10-17  Revised: 2024-03-13  Online: 2024-08-30  Published: 2024-07-28
  • Contact: Yunmeng Zhao, E-mail: yunmeng.zhao@ecust.edu.cn; Feng Qian, E-mail: fqian@ecust.edu.cn
  • Supported by:
    This work was supported by the National Key Research & Development Program – Intergovernmental International Science and Technology Innovation Cooperation Project (2021YFE0112800), the National Natural Science Foundation of China (Key Program: 62136003), the National Natural Science Foundation of China (62073142), the Fundamental Research Funds for the Central Universities (222202417006), and the Shanghai AI Lab.

Abstract: Real-time optimization techniques are widely used in the gasoline inline blending process to achieve objectives such as minimizing production cost. However, their effectiveness relies on accurate blending models and is challenged by stochastic disturbances. We therefore propose a real-time optimization algorithm based on the soft actor-critic (SAC) deep reinforcement learning strategy, which optimizes gasoline blending without relying on a single blending model and remains robust against disturbances. Our approach constructs the training environment from nonlinear blending models and feedstocks subject to disturbances. The reward design incorporates a Lagrange multiplier and path constraints to manage the sparse product-quality constraints. Carefully abstracted states facilitate convergence, and normalizing the action vector in each optimization period allows the agent to generalize, to some extent, across different target production scenarios. With these components, the SAC-based algorithm outperforms real-time optimization methods based on either nonlinear or linear programming, and it performs comparably to a time-horizon-based real-time optimization method that requires knowledge of uncertainty models, confirming its ability to handle uncertainty without accurate models. Our simulations illustrate a promising way to free real-time optimization of the gasoline blending process from uncertainty models that are difficult to obtain in practice.
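
To make the reward design concrete, the following minimal Python sketch illustrates Lagrangian reward shaping for sparse product-quality constraints and the normalized action vector described in the abstract. All function names, the softmax normalization, and the dual-ascent multiplier update are illustrative assumptions for exposition, not the authors' implementation.

    import numpy as np

    def blend_reward(cost, lower, upper, quality_pred, lam, path_weight=1.0):
        # Violation: how far each predicted product property falls outside its spec band.
        violation = (np.maximum(lower - quality_pred, 0.0)
                     + np.maximum(quality_pred - upper, 0.0))
        # Lagrangian penalty turns the sparse product spec into a dense reward signal.
        reward = -cost - path_weight * float(lam @ violation)
        return reward, violation

    def update_multipliers(lam, violation, lr=1e-2):
        # Dual ascent (an assumed update rule): grow multipliers where constraints
        # are violated; keep them non-negative.
        return np.maximum(lam + lr * violation, 0.0)

    def normalize_action(raw_action):
        # Softmax maps the agent's unbounded outputs to blend fractions summing to one.
        exp = np.exp(raw_action - raw_action.max())
        return exp / exp.sum()

A hypothetical training-loop step would compute fractions = normalize_action(agent_output), predict the blend's qualities from the component models, score the step with blend_reward, and periodically call update_multipliers so that persistent violations are penalized more heavily in later episodes.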

Key words: Deep reinforcement learning, Gasoline blending, Real-time optimization, Petroleum, Computer simulation, Neural networks
