Table 1 Summary of datasets for SMILES to IUPAC name translation with character and word splits
From: STOUT V2.0: SMILES to IUPAC name conversion using transformer models
| Dataset size | Character split: input token count | Character split: maximum input length | Character split: output token count | Character split: maximum output length | Character split: average training time (per epoch) | Word split: input token count | Word split: maximum input length | Word split: output token count | Word split: maximum output length | Word split: average training time (per epoch) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 million | 125 | 150 | 64 | 150 | 50 s | 125 | 150 | 855 | 150 | 57 s |
| 10 million | 126 | 200 | 64 | 300 | 6 min 10 s | 126 | 200 | 1199 | 300 | 7 min 8 s |
| 50 million | 132 | 400 | 66 | 400 | 55 min 43 s | 132 | 400 | 1501 | 400 | 1 h 14 min 53 s |