Chinese Journal of Chemical Engineering ›› 2025, Vol. 77 ›› Issue (1): 273-292. DOI: 10.1016/j.cjche.2024.10.014
Yixin Wei1,2, Leyu Shan1, Tong Qiu1,2, Diannan Lu1, Zheng Liu1
Received: 2024-09-04
Revised: 2024-10-30
Accepted: 2024-10-31
Online: 2025-01-28
Published: 2024-11-30
Contact: Tong Qiu, E-mail: qiutong@mail.tsinghua.edu.cn
Abstract: Machine learning-assisted retrosynthesis planning uses machine learning (ML) algorithms to find synthetic pathways for target compounds. In recent years, advances in artificial intelligence (AI), and in ML in particular, have rapidly increased researchers' interest in ML-assisted retrosynthesis planning and opened new opportunities for the field. This review aims to provide a comprehensive understanding of ML-assisted retrosynthesis planning. We first discuss the formal definition and objective of retrosynthesis planning and organize a modular framework comprising four modules: data preparation, data preprocessing, pathway generation and evaluation, and pathway verification. We then review, in order, the current status of the first three modules (all except pathway verification), covering their ideas, methods, and latest progress. Next, we specifically discuss large language models in retrosynthesis planning. Finally, we summarize the challenges facing current ML-assisted retrosynthesis planning research and offer a perspective on future research directions and development.
Yixin Wei, Leyu Shan, Tong Qiu, Diannan Lu, Zheng Liu. Machine learning-assisted retrosynthesis planning: Current status and future prospects[J]. Chinese Journal of Chemical Engineering, 2025, 77(1): 273-292.