Main

Structure determination is a fundamental aspect of chemical research1. Single-crystal X-ray diffraction (SCXRD) analysis provides the most accessible and accurate method for structure determination2,3,4. As the name implies, SCXRD requires crystalline samples5. In many instances, this is not problematic. Unfortunately, the applicability of SCXRD is limited in the case of non-crystalline samples, including those that exist as liquids or that tend to form oils or produce polycrystalline phases during single-crystal growth6,7,8. This limitation can be overcome by imposing order on non-crystalline molecular species. In a previous seminal work, the authors reported the so-called crystalline sponge (CS) method for the X-ray diffraction (XRD) structure determination of small molecules with a poor propensity to form single crystals6,7,9. This strategy relies on the accommodation of an analyte within the pores of metal-linked crystalline interstitial voids using multiple weak interactions and is a powerful tool for determining the crystal structures of numerous non-crystalline compounds6,7,9. In 2016, a metal–organic framework (MOF)-based alignment strategy that allowed XRD structural analyses of molecules with ligating groups, such as carboxylates10,11, was reported. Powerful as they are, the current CS- and MOF-based structure determination approaches are subject to limitations that affect their applicability. For instance, they have not proved routinely effective in solving the structures of molecules bearing long alkyl chains6,7,8,9,10,11. Molecules with these motifs are often found in natural products and drugs. They are generally difficult to crystallize and can produce disordered structures even when included into CSs or MOFs6,7,8. An approach that might overcome this limitation is to order long alkyl-chain-containing molecules in solid frameworks by specific host−guest interactions. 
To our knowledge, receptor-bearing materials suitable for this purpose have not yet been reported. Pillar[5]arenes are a well-studied class of supramolecular receptors12,13, which are notable for their ability to form complexes with alkyl-chain-containing guests14,15. We thus considered that their integration into MOFs would provide rigid materials endowed with docking sites that, in turn, might allow for the structure ordering and hence SCXRD analysis of target molecules containing long alkyl chains. This study was undertaken as a test of this hypothesis. In this work, we show that pillar[5]arene-incorporated MOFs can be used to capture alkyl-chain-containing molecules in a way that permits their structural analysis by SCXRD (Fig. 1). Accordingly, EtP5-MOF-2 was found to act as a particularly effective ‘supramolecular dock’ for this target because of its crystallinity, structural robustness and strong affinity for long alkyl-chain-containing guests. The power of this supramolecular docking method is illustrated through the SCXRD-based structure determination of 48 alkyl chain-containing molecules, including a US Food and Drug Administration (FDA)-approved drug, Dojolvi16,17. Moreover, in contrast to the complicated and time-consuming solvent exchange procedures in previous CS- and MOF-based methods, our supramolecular docking approach facilitates the rapid incorporation of guest molecules in frameworks within 10 min through the fast host−guest molecular recognition process. Leveraging these merits, the crystal structure of an unstable alkene sulfide compound was successfully determined18. Notably, to confirm the efficacy of our method in determining the structures of unknown compounds, blind experiments were also performed. 
The structures of fourteen compounds were unambiguously determined by our supramolecular docking strategy using SCXRD analysis, assisted by nuclear magnetic resonance (NMR) spectral and mass spectrometric data, without previous knowledge of their structural details. Furthermore, in certain instances, the SCXRD-based structures of products taken directly from crude reaction mixtures could be obtained. Thus, in both conception and scope, the present approach merges the merits of rational design, efficient sample preparation, broad applicability and successful structure determination for challenging molecules.

Fig. 1: Conceptual summary of the supramolecular docking strategy.

a, Schematic of the supramolecular dock EtP5-MOF-2 and its preparation from the pillar[5]arene-based strut EtP5BPPy and H4L1 motifs. b, Cartoon representation of the supramolecular docking process that converts guest-free EtP5-MOF-2 to guest-included EtP5-MOF-2-G. DMF, N,N-dimethylformamide.

Broad applicability experiments

Initially, EtP5-MOF-2 was prepared and selected after screening a series of custom-made pillar[5]arene-incorporated MOFs using the pillar–layer strategy that involves linking perethylated pillar[5]arene (EtP5)-incorporated EtP5BPPy, H4L1 and zinc nodes by coordination bonds19 (Supplementary Figs. 1–59 and Supplementary Tables 1–6). With EtP5-MOF-2 in hand, we studied it as a potential supramolecular dock for the SCXRD-based structure determination of various target molecules. For these studies, guest-included crystals of EtP5-MOF-2 were prepared and subjected to SCXRD analyses.

Figure 2 shows the chemical structures of 63 representative target molecules chosen for the study. Compounds 1–25 and 40–43 were selected as commercial hard-to-crystallize target molecules that bear alkyl chains. This set also provides a diversity of functionalities, including primary alcohol, aldehyde, ester, ketone, haloalkane and glycols. In terms of specifics, compounds 9, 10, 19, 21, 22 and 24 are natural products, whereas 11 and 25 are FDA-approved drugs. Using EtP5-MOF-2 as a supramolecular dock, the crystal structures of 1–25 and 40–43 could all be obtained as confirmed by Fourier maps of the observed structure factor (Fobs) (Supplementary Figs. 66–206 and 281–302 and Supplementary Tables 8–33 and 48–51). As shown in Fig. 3, targets 1–25 and 40–43 can dock into the pillar[5]arene subunits in EtP5-MOF-2 and are stabilized by specific host−guest interactions. The details of the docking vary with the specific compound in question (Fig. 3). A brief discussion of selected examples thus follows. cis-11-Hexadecenal 19 is a volatile pheromone component in Helicoverpa assulta20,21 that has not previously yielded to an SCXRD analysis. Using EtP5-MOF-2 as a supramolecular dock, a structure was obtained wherein the aldehyde head of 19 threads into the cavity of the pillar[5]arene units, where it is stabilized by multiple [C–H···O] and [C–H···π] interactions (Supplementary Figs. 167 and 168). Methyl palmitoleate 22 is a volatile pheromone produced by the tsetse fly Glossina morsitans that is infamous as a vector for human and animal disease in Africa22. Although the chemical composition of 22 was determined based on gas chromatography-mass spectrometry (GC-MS), its volatility has precluded an SCXRD structural analysis. Using the present supramolecular docking strategy, the SCXRD structure of 22 was readily obtained. 
In this instance, the long chain of 22 threads into the cavity of the pillar[5]arene units, where it is held in place by apparent synergetic [C–H···O] and [C–H···π] interactions (Supplementary Figs. 184 and 185). Another notable example of a hard-to-crystallize system is 25. This molecule, known as Dojolvi, is an FDA-approved drug for the treatment of long-chain fatty acid oxidation disorders (LC-FAOD)16,17. LC-FAOD is a group of rare autosomal recessive genetic diseases that are life-threatening for certain patients16,17. To our knowledge, the structure of 25 has yet to be resolved at atomic resolution. This probably reflects its oily nature and the flexibility of its three alkyl chains. The resulting structure showed that one alkyl chain of 25 docks into the cavity of the pillar[5]arene subunits, stabilized by [C–H···O] and [C–H···π] interactions (Supplementary Figs. 202 and 203).

Fig. 2: Chemical structures of target molecules 1–63 determined by the supramolecular docking strategy.

Target molecules 1–25 and 40–63 represent alkyl-chain-containing molecules and 26–39 represent molecules devoid of alkyl chains. r.t., room temperature.

Fig. 3: Single-crystal structures of target molecules 1–25, 40–56 and 58–62 in EtP5-MOF-2.

Refined crystal structures of the target molecules obtained from SCXRD data shown with 50% probability thermal ellipsoids. Only the pillar[5]arene units of EtP5-MOF-2 (presented in ball and stick form in khaki) are shown for clarity. In the case of positional disorder, only one conformation of target molecules is shown for clarity. Colour code for target molecules: C atoms are dark grey, H atoms are light grey, O atoms are red, N atoms are blue, F atoms are green, Br atoms are brown, S atoms are luminous yellow and Si atom is light blue.

We next sought to test whether the receptor-bearing framework EtP5-MOF-2 would permit the determination of crystal structures of molecules devoid of alkyl chains. For this portion of the study, targets 26–39 were subjected to SCXRD analyses after being included in EtP5-MOF-2. In all cases, the structures were readily resolved (Supplementary Figs. 207–280 and Supplementary Tables 34–47).

On the basis of our studies of the above targets, we conclude that the present supramolecular docking strategy can bring order to highly flexible guest molecules, as well as to small molecules without alkyl chains, with assistance from the EtP5 cavity and the overall framework. This strategy also permits the structure analysis of non-crystalline species in near-trace (that is, ≥5 µl) amounts.

Blind experiments

To further confirm the generality of our method in practical situations, we collected samples of compounds 44–63, which are challenging to crystallize and in demand for single-crystal growth, from 16 research groups. The amount required for each sample was controlled between 5 and 20 μl or mg. We successfully obtained all their crystal structures as shown in Fig. 3, Supplementary Figs. 303–454 and Supplementary Tables 52–74. Among these cases, compound 49 is an important precursor for the preparation of biodegradable lipid-modified poly(guanidine thioctic acids)23,24. The crystal structure of compound 49, which has the highest molecular weight (678.08 Da) among the molecules whose structures were determined to date using CS- and MOF-based methods6,10, is reported here. This compound features two ultralong alkyl chains, one of which docks in the cavity of EtP5 units in EtP5-MOF-2 stabilized by multiple [C–H···π] interactions (Supplementary Figs. 335–337). A notable feature is that, owing to the elimination of the solvent exchange step and the rapid complexation of target molecules, our supramolecular docking method reduced the time from sample preparation to SCXRD measurement to a few minutes. This speed allowed the method to be applied to the structure determination of unstable compounds. It has been reported that alkenyl sulfides are unstable and prone to oxidation in air within hours18. Therefore, it is a challenge to obtain the crystal structure of compounds containing alkenyl sulfides. Here compound 62 was used as an example to show the possibility of structure determination for alkenyl sulfides. Several single crystals of EtP5-MOF-2 were immersed in 62 (50 μl) for 3 min and then subjected to SCXRD measurements. The SCXRD results indicated that the structure of 62 was successfully solved, with 62 complexed in the EtP5 cavity (Supplementary Figs. 433–436). We proposed that the EtP5 units might protect 62 and hinder its oxidation process through encapsulation. 
As-synthesized 62 was observed to gradually deteriorate in air, undergoing complete oxidation to 63 within 24 h, as evidenced by 1H NMR spectroscopy (Supplementary Fig. 443). Immersing EtP5-MOF-2 crystals in the oxidized sample and subjecting them to SCXRD analysis allowed the crystal structure of 63 to be determined (Supplementary Figs. 449–451). By contrast, for the crystals of EtP5-MOF-2 immersed in the as-synthesized 62 before oxidation, the crystal structure of 62 remained determinable without oxidation, even after the sample had been exposed to air for 24 h or 30 days (Supplementary Figs. 440–447). From these findings, we conclude that the EtP5 unit probably protects 62 by complexing it within the cavity, thereby hindering the oxidation process. Analysis of the crystal structure of EtP5-MOF-2-62 shows that the alkenyl sulfide unit is stabilized in the EtP5 units through multiple [C–H···S] and [C–H···π] interactions, which might protect 62 from oxidation (Supplementary Fig. 434).

To further demonstrate the efficacy of our method in determining the structures of unknown compounds, a series of blind experiments were conducted. Fourteen guest-included blind samples (50–63) were subjected to SCXRD measurements and analysed without any previous structural information for the molecules. All the initial crystal structures of the target molecules were determined directly from Q peaks to identify the most probable structures (Fig. 4). These initial structures could be easily corrected by integrating NMR spectral and mass spectrometric data to give the final structures. These results confirmed the reliability of our method in determining the crystal structures of unknown compounds.

Fig. 4: Blind experiments of target molecules 50–63 in EtP5-MOF-2.

a–n, Initial structures and final structures of the target molecules obtained from SCXRD data are shown with 50% probability thermal ellipsoids. The initial structures were determined from Q peaks directly during SCXRD analyses without previous knowledge of any structural information. The final structures were revised from the initial structures considering NMR spectral and mass spectrometric data. Note that the initial structure of 52 is determined as a position-disordered molecule overlapped on the benzene ring that is close to the final structure. Colour code for target molecules: C atoms are dark grey, H atoms are light grey, O atoms are red, N atoms are blue, F atoms are green, Br atoms are brown and S atoms are luminous yellow.

Absolute configuration determination

EtP5BPPy, the key recognition component in EtP5-MOF-2, possesses planar chirality. This chirality is reflected in the fact that EtP5-MOF-2 consists of two interpenetrated frameworks with the two EtP5BPPy enantiomers (denoted as pS-EtP5BPPy and pR-EtP5BPPy) integrated into the networks giving crystals with the overall achiral \(P\overline{1}\) space group (Supplementary Figs. 455 and 456). The two enantiomers in EtP5-MOF-2 led us to explore whether the present supramolecular docking strategy might be used to determine the structures of chiral compounds.
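The link between the inversion centre of \(P\overline{1}\) and the presence of both enantiomers can be illustrated numerically. The following is a minimal sketch, not part of the reported analysis: the three vectors are a hypothetical right-handed reference frame standing in for a chiral fragment, and the function names are chosen here for illustration. It shows that the inversion operation (x, y, z) → (−x, −y, −z) reverses the sign of the scalar triple product, that is, it converts one hand into the other.

```python
def signed_volume(a, b, c):
    """Scalar triple product a . (b x c); its sign encodes handedness."""
    return (
        a[0] * (b[1] * c[2] - b[2] * c[1])
        - a[1] * (b[0] * c[2] - b[2] * c[0])
        + a[2] * (b[0] * c[1] - b[1] * c[0])
    )


def invert(point):
    """Apply the inversion operation (x, y, z) -> (-x, -y, -z)."""
    return tuple(-x for x in point)


# HYPOTHETICAL right-handed frame used as a stand-in for a chiral fragment;
# these are not coordinates from the reported structures.
frame = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
inverted_frame = tuple(invert(v) for v in frame)

original_hand = signed_volume(*frame)
inverted_hand = signed_volume(*inverted_frame)

print("original:", original_hand, "after inversion:", inverted_hand)
```

Because every object in a \(P\overline{1}\) crystal has an inversion-related partner, a pS unit in one network implies a pR unit in the other, which is consistent with the overall achiral space group described above.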

To test the above possibility, target molecules (±)-1, (±)-10, (±)-11, (±)-26, (±)-58, (±)-59, (±)-60 and (±)-61 with stereogenic centres were included in EtP5-MOF-2 (Supplementary Figs. 457–475). In all eight cases, the molecules were found distributed within EtP5-MOF-2 as racemic mixtures. For instance, in the case of (±)-1, the S enantiomer is accommodated in the pS-EtP5BPPy subunits present in EtP5-MOF-2, whereas the R enantiomer is bound in the corresponding pR-EtP5BPPy units as confirmed by Fobs maps (Supplementary Fig. 458). The hydroxy group on the C2 point chiral centre of (±)-1 is stabilized by [C–H···O] and [C–H···π] interactions (Supplementary Fig. 457). Based on literature precedent, these specific host–guest interactions might be the driving forces for the observed enantioselective recognition25. We also found similar phenomena in the cases of (±)-10, (±)-11, (±)-26, (±)-58, (±)-59, (±)-60 and (±)-61, which supported the hypothesis that at least one chiral configuration of a guest is inclined to occur within the corresponding chiral network of EtP5-MOF-2 as crystallographically confirmed by Fobs maps (Supplementary Figs. 462–475). These findings are taken as evidence that despite the racemic nature of the target molecules, the absolute configuration of one docked guest might be deduced from the chirality of the pS-EtP5BPPy and pR-EtP5BPPy units present in EtP5-MOF-2 without much dependence on analyses (for example, use of a Flack parameter)26. This behaviour might be attributed to the predetermined chirality of the pS/pR-EtP5BPPy subunits in EtP5-MOF-2, which probably collectively provide chiral docking sites for guest molecules10,11,27,28. On the basis of the above studies, we propose that the incorporation of chiral components into supramolecular docking frameworks might facilitate absolute structure determination.

Structure determination of mixtures

In principle, both the pillar[5]arene receptor units and the MOF pores of EtP5-MOF-2 could act as binding sites that can include guests. This led us to explore whether the present supramolecular docking strategy could be used to analyse mixtures of guests. Using EtP5-MOF-2 as the supramolecular docking framework, we sought to test whether it would be possible to obtain SCXRD structural information for a product obtained directly from a reaction mixture without pre-purification. We took 18 as a target and prepared it from 14 by reaction with p-hydroxybenzaldehyde (Supplementary Fig. 135). Product 18 obtained in this way was accompanied by unreacted 14. After treating a single crystal of EtP5-MOF-2 with a small quantity of the crude reaction mixture (5 µl) and allowing it to stand for 10 min, the crystal was subjected to an SCXRD measurement. The resulting structural analysis revealed that both 14 and 18 are encapsulated in EtP5-MOF-2 to produce what is formally EtP5-MOF-2-14·18. In the structure, 14 is seen to dock in the pillar[5]arene units stabilized by [C–H···Br] and [C–H···π] interactions, whereas 18 is distributed throughout the pores of EtP5-MOF-2 (Supplementary Figs. 136–141). Because 14 and 18 differ structurally, they can be distinguished in EtP5-MOF-2 from the Fobs maps. This result provides support for the suggestion that the present supramolecular docking strategy may be used to solve the crystal structures of alkyl chain-bearing products obtained directly from reaction mixtures without the need for pre-purification.

Outlook

In summary, we report a supramolecular docking strategy designed to address the challenges associated with the structure determination of alkyl-chain-containing molecules. Based on the results obtained, we suggest that the present approach may have a role to play in the structural analysis of a wide range of products obtained under typical laboratory conditions (liquid, sticky, solid or unstable samples and mixtures). Notably, the rational design of a MOF carrier is always a challenging task in framework-based structure determination. Our supramolecular docking method provides a conceptual guiding principle for framework-based structural analysis using a designable and directed supramolecular host−guest recognition strategy, markedly advancing the field of structure determination. The current approach still faces potential limitations, particularly in the structure determination of high molecular weight and structurally complex molecules. Future research will focus on incorporating various supramolecular docking hosts into MOFs to facilitate the structure determination of a diverse range of compounds.

Methods

Preparation of EtP5-MOF-2 crystals

A DMF suspension (4 ml) of EtP5BPPy (11.1 mg, 10.0 μmol), H4L1 (8.10 mg, 10.0 μmol) and Zn(NO3)2·6H2O (5.97 mg, 20.0 μmol) was prepared in a small vial. Then, acetic acid (160 µl) was added. The mixture was subjected to sonication for 2 min and then passed through a syringe filter to give a transparent solution, which was sealed and heated at a constant rate of 1 °C min−1 to 90 °C, maintained at this temperature for 48 h, and cooled to room temperature at a constant cooling rate of 0.2 °C min−1. This procedure yielded transparent flaxen-coloured single crystals of EtP5-MOF-2 suitable for SCXRD analysis.
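As a quick consistency check on the quantities quoted above, the following sketch computes the molar masses implied by each mass/amount pair, the mole ratio of the three components and the total duration of the heating programme. It is an illustration only: the function and variable names are chosen here, and the room temperature of 25 °C is an assumption, because the text does not give a value.

```python
# Quantities quoted in the preparation: (mass in mg, amount in umol).
reagents = {
    "EtP5BPPy": (11.1, 10.0),
    "H4L1": (8.10, 10.0),
    "Zn(NO3)2.6H2O": (5.97, 20.0),
}


def implied_molar_mass(mass_mg, amount_umol):
    """Molar mass in g/mol implied by a mass (mg) and an amount (umol)."""
    return mass_mg / amount_umol * 1000.0


def oven_program_hours(t_room_c, t_hold_c=90.0, heat_rate=1.0,
                       hold_h=48.0, cool_rate=0.2):
    """Total time for ramp + hold + cool; rates are in degC per minute."""
    delta = t_hold_c - t_room_c
    ramp_h = delta / heat_rate / 60.0
    cool_h = delta / cool_rate / 60.0
    return ramp_h + hold_h + cool_h


molar_masses = {name: implied_molar_mass(m, n) for name, (m, n) in reagents.items()}

# Mole ratio of each component relative to the pillar[5]arene strut.
strut_amount = reagents["EtP5BPPy"][1]
ratios = {name: n / strut_amount for name, (_, n) in reagents.items()}

# ASSUMPTION: room temperature taken as 25 degC (not stated in the text).
total_h = oven_program_hours(t_room_c=25.0)

for name in reagents:
    print(f"{name}: {molar_masses[name]:.1f} g/mol, ratio {ratios[name]:.1f}")
print(f"total oven time: {total_h:.1f} h")
```

Under the 25 °C assumption, the ramp takes 65 min and the cooling 325 min, so the full programme runs for about 54.5 h, of which 48 h is the hold at 90 °C.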

Preparation of guest-included EtP5-MOF-2 crystals

The mother liquor associated with as-synthesized single crystals of EtP5-MOF-2 was decanted off. The resulting crystals were washed with DMF (3 × 10 ml) to remove unreacted reagents. Taking molecule 25 as an example, the single crystals of EtP5-MOF-2 (ranging from 50 µm to 100 µm) obtained in this way were then picked up physically, blotted with tissues, and immersed in a sample of 25 (5 µl) in a 2 ml vial for about 10 min at room temperature. The single crystals were subsequently subjected to SCXRD measurements. In this work, single crystals of EtP5-MOF-2 with dimensions ranging from 50 µm to 100 µm in width were generally used to avoid crystal twinning after guest inclusion or weak diffraction.

SCXRD analyses

SCXRD data were collected on Bruker D8 VENTURE diffractometers with different X-ray sources and detectors (Turbo X-ray Source MoKα radiation (λ = 0.71073 Å) with a PHOTON II CMOS detector, Excillum MetalJet GaKα radiation (λ = 1.34139 Å) with a PHOTON II CMOS detector and INCOATEC IμS DIAMOND CuKα radiation (λ = 1.54184 Å) with a PHOTON III CMOS detector) by Shiyanjia Lab (www.shiyanjia.com). Data were collected at room temperature (300 K) or lower temperature (100 K, 193 K, 221 K, or 238 K). The low temperature was controlled by using a KRYOFLEX II low temperature attachment or an Oxford Cryosystems Cryostream 800 cryostat.
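The choice of radiation affects how far out in scattering angle data must be collected to reach a given resolution. The sketch below applies Bragg's law, λ = 2d sin θ, to the three wavelengths listed above. The target d-spacing of 0.84 Å is an illustrative assumption and is not a value taken from this work; the function name is likewise chosen here for illustration.

```python
import math

# Wavelengths (in angstroms) of the three sources listed in the text.
WAVELENGTHS_A = {"MoKa": 0.71073, "GaKa": 1.34139, "CuKa": 1.54184}


def two_theta_deg(wavelength_a, d_spacing_a):
    """Scattering angle 2theta (degrees) from Bragg's law with n = 1.

    Returns None when the d-spacing cannot be reached at this wavelength,
    that is, when lambda / (2 d) exceeds 1.
    """
    sin_theta = wavelength_a / (2.0 * d_spacing_a)
    if sin_theta > 1.0:
        return None
    return 2.0 * math.degrees(math.asin(sin_theta))


# ASSUMPTION: 0.84 A is an illustrative target resolution, not from the text.
TARGET_D_A = 0.84

angles = {name: two_theta_deg(lam, TARGET_D_A) for name, lam in WAVELENGTHS_A.items()}

for name, angle in angles.items():
    print(f"{name}: 2theta = {angle:.1f} deg for d = {TARGET_D_A} A")
```

Shorter wavelengths reach the same d-spacing at a smaller angle, which is why the required 2θ increases from the Mo source to the Ga source to the Cu source.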

Single crystals were mounted on MicroMesh (MiTeGen) using paratone oil. All data collection was performed in a shutterless mode; the unit cell was determined by a Bruker APEX software suite (APEX4 (ref. 29) or APEX5 (ref. 30) during the course of data collection). The data sets were reduced and a multi-scan spherical absorption correction was implemented by Bruker SAINT v.8.40B (ref. 31) and SADABS-2016/2 (ref. 32) or TWINABS-2012/1 for some twinning structures33. The target structures were solved by the intrinsic phasing method using SHELXT 2018/2 (ref. 34) and refined with full-matrix least squares on \(F^{2}\) using SHELXL 2019/3 (ref. 35) (using OLEX2 1.5 (ref. 36) as the graphical interface). Unless otherwise mentioned, all non-H atoms in the target structure were refined anisotropically and all H atoms were assigned isotropic displacement coefficients U(H) = 1.2U or 1.5U, and their coordinates were allowed to ride on their respective atoms.

The refinement procedure of guest molecules could be divided into the following steps: First, the structure of EtP5-MOF-2 was refined anisotropically and all H atoms were placed into geometrically calculated positions. In this step, constraints such as AFIX 66 and/or restraints such as DFIX, SADI, DANG, SIMU, ISOR and RIGU could be used to make the whole MOF structure reasonable. After this step, assignments of the electron density peaks were made. Some constraints and/or restraints such as AFIX 66, DFIX, SADI, DANG, SAME, SIMU, ISOR and RIGU could be used if necessary in the refinement of the target molecules. Once the whole molecule was localized and fixed, an anisotropic refinement was carried out. H atoms expected to be present in the target molecule were placed into geometrically calculated positions. At last, residual electron densities in the voids due to highly disordered solvents and/or potential guest molecules were treated with the solvent-mask routine of OLEX2 or SQUEEZE of PLATON. The Fobs electron density maps shown in this work are drawn using OLEX2.

Blind experiments of unknown compounds

At first, 14 guest-included blind samples (50–63) were prepared by Y.W. These samples were then subjected to SCXRD measurements and analysed by the crystallographer (L.X.), who was unaware of any previous structural information regarding the target molecules. In the first round of blind experiments, all the initial crystal structures of the target molecules were determined directly from Q peaks to identify the most probable structures. In these cases, the structures of three target molecules (51, 56 and 63) could be fully resolved from the diffraction data alone. Molecule 52 exhibited a disorder across the symmetry elements in the crystal structure, whereas the remaining initial structures closely resembled the actual structures of the analysed compounds. These initial structures could be easily corrected by integrating NMR spectral and mass spectrometric data. It is worth mentioning that some preliminary crystal structures of the blind samples (55, 61 and 62) could be automatically solved by the crystallographic program SHELXT (ref. 34) without further analyses.