Indexed in SCI and EI | Journal of the Chemical Industry and Engineering Society of China

Chinese Journal of Chemical Engineering, 2025, Vol. 77, Issue (1): 273-292. DOI: 10.1016/j.cjche.2024.10.014

Machine learning-assisted retrosynthesis planning: Current status and future prospects

Yixin Wei1,2, Leyu Shan1, Tong Qiu1,2, Diannan Lu1, Zheng Liu1   

  1. Department of Chemical Engineering, Tsinghua University, Beijing 100084, China;
  2. Beijing Key Laboratory of Industrial Big Data System and Application, Beijing 100084, China
  • Received: 2024-09-04 Revised: 2024-10-30 Accepted: 2024-10-31 Online: 2024-11-30 Published: 2025-01-28
  • Contact: Tong Qiu, E-mail: qiutong@mail.tsinghua.edu.cn
  • Supported by:
    This work is supported by the National Key Research and Development Program of China (2022ZD0117501).

Abstract: Machine learning-assisted retrosynthesis planning uses machine learning (ML) algorithms to find synthetic pathways for target compounds. In recent years, with the development of artificial intelligence (AI), especially ML, interest in ML-assisted retrosynthesis planning has grown rapidly, bringing new opportunities to the field. In this review, we aim to provide a comprehensive understanding of ML-assisted retrosynthesis planning. We first discuss the formal definition and objective of retrosynthesis planning and organize the workflow into a modular framework of four modules: data preparation, data preprocessing, pathway generation and evaluation, and pathway verification. We then review, in order, the current status of the first three modules (all except pathway verification), including ideas, methods, and latest progress. Following that, we specifically discuss large language models in retrosynthesis planning. Finally, we summarize the challenges facing current ML-assisted retrosynthesis planning research and offer a perspective on future research directions.

Key words: Retrosynthesis planning, Machine learning, Artificial intelligence, Synthetic pathway, Chemoinformatics
