Indexed by SCI and EI | Journal of the Chemical Industry and Engineering Society of China

Chinese Journal of Chemical Engineering, 2025, Vol. 77, Issue (1): 273-292. DOI: 10.1016/j.cjche.2024.10.014

Machine learning-assisted retrosynthesis planning: Current status and future prospects

Yixin Wei1,2, Leyu Shan1, Tong Qiu1,2, Diannan Lu1, Zheng Liu1   

  1. Department of Chemical Engineering, Tsinghua University, Beijing 100084, China;
  2. Beijing Key Laboratory of Industrial Big Data System and Application, Beijing 100084, China
  • Received: 2024-09-04 Revised: 2024-10-30 Accepted: 2024-10-31 Online: 2025-01-28 Published: 2024-11-30
  • Contact: Tong Qiu, E-mail: qiutong@mail.tsinghua.edu.cn
  • Supported by:
    This work is supported by the National Key Research and Development Program of China (2022ZD0117501).

Abstract: Machine learning-assisted retrosynthesis planning uses machine learning (ML) algorithms to find synthetic pathways for target compounds. In recent years, with the development of artificial intelligence (AI), especially ML, interest in ML-assisted retrosynthesis planning has grown rapidly, bringing new opportunities to the field. This review aims to provide a comprehensive understanding of ML-assisted retrosynthesis planning. We first discuss the formal definition and objective of retrosynthesis planning and organize a modular framework with four modules: data preparation, data preprocessing, pathway generation and evaluation, and pathway verification. We then review, in sequence, the current status of the first three modules (all except pathway verification), including ideas, methods, and latest progress. Next, we discuss large language models in retrosynthesis planning. Finally, we summarize the challenges facing current ML-assisted retrosynthesis planning research and offer a perspective on future research directions.

Key words: Retrosynthesis planning, Machine learning, Artificial intelligence, Synthetic pathway, Chemoinformatics
