Introduction

Web-spinning spiders are able to produce various types of silk, each tailored to serve a specific purpose. This versatile silk production allows them to efficiently trap prey, construct protective egg cases, evade predators, and secure themselves to surfaces (Vollrath and Knight 2001). The combination of high tensile strength and extensibility results in an extraordinary toughness that inspires research and innovation (Gosline et al. 1999; Qin et al. 2024). In industry, its durability and elasticity open up possibilities for creating high-performance materials for textiles, composites (Kucharczyk et al. 2019), and biodegradable plastics (Humenik et al. 2011). Spider silk proteins are also biocompatible, supporting cell growth and integration. Because of these properties, spider silks are regarded as a promising material for biomedical applications (Guessous et al. 2024) such as wound healing (Altman et al. 2003; Öksüz et al. 2021), degradable biosensors for biomonitoring of analytes in the body (Xu et al. 2019), tissue engineering (Salehi et al. 2020), such as artificial blood vessels (Dastagir et al. 2020), nerve regeneration (Kornfeld et al. 2021; Millesi et al. 2021), and scaffolds creation (Gellynck et al. 2008). They do not provoke immune responses, making them safe for use in humans without causing adverse reactions (Gellynck et al. 2008).

Among the different kinds of silk, dragline silk stands out for its combination of strength and elasticity. This silk's mechanical properties enable it to absorb and dissipate substantial energy, making it exceptionally resilient (Gosline et al. 1999; Malay et al. 2017). The mechanical properties of dragline silk arise from the structural organisation of proteins called spidroins. The primary components of dragline silk are major ampullate spidroins (MaSp) 1 and 2. These large proteins, which range in size from 200 to 350 kDa, feature a distinctive arrangement: a central repetitive region flanked by non-repetitive and evolutionarily conserved N-terminal and C-terminal domains (NTD and CTD). The sequence characteristics of dragline repetitive regions differ among spider species (Ramezaniaghdam et al. 2022). Some spidroins have short, simple repeat units, while others consist of longer, more complex repeats (Hayashi et al. 1999; Gatesy et al. 2001; Römer and Scheibel 2008). Both the CTD and NTD play crucial roles in initiating fibre formation. The NTD, in particular, can enhance spidroin solubility and regulate fibre formation through pH-dependent dimerization (Askarieh et al. 2010; Rissing et al. 2010; Hagn et al. 2011), while the CTD can trigger the transition of the repetitive region into β-sheet conformation (De Oliveira et al. 2024).

Spiders are notoriously difficult to farm for their silk due to their predatory and cannibalistic nature, which makes large-scale production impractical. To overcome this, researchers have turned to transgenic technologies to develop biomimetic silks (Bini et al. 2006). This approach has involved various host organisms. Each of these systems offers insights and incremental progress toward the goal of producing spider silk at a scale and quality suitable for industrial and medical applications, but each also highlights the ongoing challenges (Whittall et al. 2021; Ramezaniaghdam et al. 2022). Spider silks in their native forms contain intrinsically disordered regions and repetitive sequences. When such native sequences are recombinantly expressed in bacteria, they are prone to premature aggregation in inclusion bodies (Rinas et al. 2017). Bacterial systems also encounter the challenge of producing spidroins with multiple molecular masses (Xia et al. 2010; Bowen et al. 2018). Baby hamster kidney (BHK) cells and bovine mammary epithelial alveolar cells (MAC-T) have also been used as hosts for silk protein production (Lazaris et al. 2002). In another approach, spider silk proteins were successfully produced in goat milk (Copeland et al. 2015). While mammalian cells and transgenic animals offer a more complex and potentially more suitable environment for protein folding and post-translational modifications, the yield of spider silk proteins in these systems has been low, limiting their practicality for large-scale production (Lazaris et al. 2002; Copeland et al. 2015). Insects have been explored as another potential production system due to the relatively close evolutionary distance between spiders and insects (Huemmerich et al. 2004; Anton et al. 2017). Although they could produce spider silk filaments, the process is time-consuming, which poses a significant barrier to efficient production. Some challenges are identified with using the yeast Pichia pastoris (new name Komagataella phaffii) as a production system for spider silk protein, such as poor expression and proteolysis (Werten et al. 2019). Plant systems such as tobacco (Menassa et al. 2004; Hauptmann et al. 2013; Weichert et al. 2014) and Arabidopsis (Barr et al. 2004; Yang et al. 2005) have shown successful production of spider silk proteins, although the yields remain low. In Nicotiana tabacum, recombinant mini-spidroins (MaSp1 and MaSp2) containing native N- and C-terminal domains were expressed, albeit with relatively low production levels (Peng et al. 2016). Additionally, fibres generated from these recombinant proteins exhibited lower toughness values compared to natural spider silk (Peng et al. 2017).

Physcomitrella is an established host system for the safe and efficient production of complex recombinant proteins (Decker and Reski 2020). Physcomitrella is exquisitely amenable to precise genome engineering (Reski et al. 2018; Wiedemann et al. 2018) and able to produce difficult-to-express proteins such as human erythropoietin (Parsons et al. 2012) and human factor H (FH) with 150 kDa (Michelfelder et al. 2017). FH has 20 repetitive protein domains, each of which consists of around 60 amino acids. It is a single-chain molecule linked by 40 intramolecular disulfide bridges (Büttner-Mainik et al. 2011). In Physcomitrella, the secretion of the produced proteins to the surrounding medium is possible by using secretory signals (Decker and Reski 2007), facilitating the subsequent recombinant protein purification. The objective of this work was to evaluate the potential of the model moss Physcomitrella (Physcomitrium patens; Lueth and Reski 2023) as a production platform for recombinant spider dragline silk proteins with N- and C-terminal domains. In this study, short MaSp1 repeats were selected to assess the feasibility of producing MaSp1 from Latrodectus hesperus (LhMaSp1) in Physcomitrella. We began by evaluating spider silk protein production transiently and then progressed to stable production in shake flasks, followed by cultivation on bioreactors. We utilized mass spectrometry to confirm the sequence of the produced proteins. We focused on assessing the feasibility of LhMaSp1 purification rather than on the development processes.

Results and discussion

Transient expression of NTD-LhMaSp1-12Rep-CTD-citrine

The major ampullate silk from L. hesperus, commonly known as the western black widow spider, is known for its strength and great extensibility (Lawrence et al. 2004), making the recombinant production of the Major ampullate silk protein 1 (LhMasp1) highly interesting. As protocols for the fast evaluation of recombinant protein production in Physcomitrella protoplasts are well established (Baur et al. 2005; Schaaf et al. 2005), we assessed the expression of a highly repetitive NTD-LhMaSp1-12Rep-CTD protein in this transient expression system.

The coding sequences (CDSs) of 12 repetitive poly-alanine blocks of the L. hesperus silk protein core regions, the full sequence of the N-terminal domain and the C-terminal domain (Ayoub et al. 2007) were optimised (Supplementary Fig S1) for the codon usage in Physcomitrella based on the Kazusa codon database (https://www.kazusa.or.jp/codon/). Splice sites were modified according to Top et al. (2021) in order to prevent potential hetero-splicing events. The possible actions of Physcomitrella microRNAs on the RNA sequence were investigated using psRNATarget (https://www.zhaolab.org/psRNATarget/). The respective microRNAs binding sites were modified in the DNA construct without a change in the amino acid sequence. This construct was synthesized in the pUC-GW-Amp plasmid (Genewiz, Leipzig, Germany). The CDS of NTD-LhMaSp1-12Rep-CTD was cloned into an expression vector containing the Physcomitrella actin5 (PpActin5) promoter (Weise et al. 2006; Mueller et al. 2014; Niederau et al. 2024), the CDS of citrine, and the nos terminator sequence via XhoI and KpnI restriction sites. Gibson assembly (Gibson et al. 2009) was used to include the Physcomitrella aspartic protease signal peptide APsp (Schaaf et al. 2004). The assembled vector (Supplemental Fig. S2 A) was verified by sequencing.

Protoplasts were monitored 2, 4, 7, and 9 days after transfection via fluorescence microscopy. The fluorescence microscopy images reveal the successful production of NTD-LhMaSp1-12Rep-CTD-citrine in Physcomitrella protoplasts (Fig. 1). The citrine signal was observed from day 2, and the signal was still present until day 9. In this construct, we employed the aspartic protease signal peptide (APsp), previously demonstrated to effectively direct GFP to the secretory pathway in Physcomitrella (Schaaf et al. 2004). The presence of ramified network-like fluorescence signals, alongside fluorescence accumulation within the nuclear envelope, serves as evidence for protein targeting to the endoplasmic reticulum (ER) (Schaaf et al. 2004). However, structures reminiscent of an ER network were not observed in cells producing NTD-LhMaSp1-12Rep-CTD-citrine. In contrast, we observed the accumulation of structures as protein aggregates. The presence of these protein aggregates may be explained by the micelle theory proposed for silk assembly (Jin and Kaplan 2003). The hydropathy plots for L. hesperus MaSp1 are roughly sinusoidal in form, with rapidly alternating hydrophilic–hydrophobic units (Ayoub et al. 2007; Parent et al. 2018). The C- and N-terminal domains harbor the most hydrophilic segments, while the extensive repetitive region in the middle is typically hydrophobic (Askarieh et al. 2010; Hagn et al. 2010). Terminal hydrophilic blocks at both the amino and carboxy termini define the outer edges of the micelles (Jin and Kaplan 2003). Scanning electron microscopy and atomic force microscopy imaging of the silk gland dope from Nephilia clavata spiders have found the existence of micrometre-sized granule particles (Lin et al. 2017). Parent et al. (2018) also proposed a hierarchical micelle theory, in which micellar protein assemblies are the essential starting structures critical for the formation of natural silk fibres.

Fig. 1
figure 1

Confocal microscopy images showing the production of NTD-LhMaSp1-12Rep-CTD-citrine fusion protein in Physcomitrella protoplasts. Schematic representation of the fusion protein is at the top. The images were recorded 2 (D2), 4 (D4), 7 (D7), and 9 days (D9) after protoplast transfection and show the successful production of NTD-LhMaSp1-12Rep-CTD-citrine in Physcomitrella (left panel). Protein aggregates are observed from NTD-LhMaSp1-12Rep-CTD fused to citrine (white arrows). The right panel shows the citrine expression control on days 4 (D4) and 9 (D9) after protoplast transfection. All images are 3D-rendered Z-stack images. NTD: N-terminal domain. CTD: C-terminal domain. Bars = 5 µm

The secretion of a protein into the extracellular space can facilitate downstream processing (Schaaf et al. 2004) in future experiments. To confirm the presence of NTD-LhMaSp1-12Rep-CTD-citrine in the secretory pathway, a construct encoding mCerulean with an ER retention signal (KDEL) was used to provide an ER marker for co-localisation studies. Physcomitrella protoplasts were co-transfected with both plasmids to investigate the potential co-localisation of the respective fluorescent proteins. The 3D rendered Z-stack and single-slice images of protoplasts transiently transformed to produce mCerulean-KDEL and NTD-LhMaSp1-12Rep-CTD-citrine showed the ramified network and the localisation to the nuclear envelope (Fig. 2, Supplementary Fig. S3). The mCerulean marker also showed structures as protein aggregates, albeit they appeared larger and more spherical than those observed with LhMaSp1 (Fig. 2, Supplementary Fig. S3). This phenomenon might align with previous findings regarding protein body (PB) formation. PB formation occurs when secretory recombinant proteins are fused to tags such as Zera, elastin-like polypeptides, and hydrophobins. PB formation is not limited to its fusion partners and can be observed with other recombinant proteins accumulated above the threshold level of 0.2% total soluble protein. PBs can be formed regular or irregular in size (Conley et al. 2009; Gutiérrez et al. 2013; Saberianfar et al. 2015, 2016). Protein bodies form in and remain part of the ER (Saberianfar et al. 2016). In addition, protein overexpression may cause artefacts in fluorescence microscopy, including ectopic subcellular localisations, incorrect formation of protein complexes and others (Ratz et al. 2015). The signals emitted by citrine and mCerulean exhibit almost no overlap, albeit demonstrating considerable coverage across similar areas (Fig. 2). A high density of mCerulean-KDEL and NTD-LhMaSp1-12Rep-CTD-citrine signals was observed in an area surrounding the nucleus, which appears to correspond to the rough ER. One possible explanation for the lack of signal overlap could be the formation of PBs at distinct locations within the ER rather than a uniform diffuse signal. Moreover, mCerulean-KDEL is retained in the ER, but NTD-LhMaSp1-12Rep-CTD-citrine can be distributed along the secretory pathway and therefore found in the ER as well as in the Golgi apparatus.

Fig. 2
figure 2

Confocal microscopy images showing the localisation of NTD-LhMaSp1-12Rep-CTD fused to citrine in the secretory pathway of Physcomitrella protoplasts. Schematic representation of the fusion proteins is on top. The mCerulean-KDEL construct was used as a positive control for ER localisation. A Chlorophyll autofluorescence. B, C mCerulean and citrine channels, respectively. The arrows in Z-stack images point to the fluorescence signal of the ramified network and arrows in single slices point to the fluorescence signal of the nuclear envelope. A dense concentration of mCerulean-KDEL and NTD-LhMaSp1-12Rep-CTD-citrine signals is present in a region encircling the nucleus, suggesting its association with the rough ER. Contour drawings are provided to highlight this area. D Citrine and mCerulean signals do not overlap but demonstrate considerable coverage across similar areas (contour drawings). E Merged channels. The upper images display 3D-rendered Z-stack images, while the lower ones depict individual slices of single cells. Grey-scale channels are provided for better visibility. The images were recorded 5 days after protoplast transfection. Bars = 5 µm

It has been reported that spider silk proteins undergo post-translational modifications (PTMs), including glycosylation (Tillinghast et al. 1992; Sponner et al. 2007; Choresh et al. 2009; Stellwagen and Renberg 2019). However, specific details regarding the type of glycosylation in these proteins remain limited. To date, research on the mechanical characteristics of silk fibres has been conducted with no consideration of the presence of PTMs in the spidroin sequences, although it might influence the mechanical properties of spider silk (Dos Santos-Pinto et al. 2018). Physcomitrella is capable of performing post-translational modifications such as N- and O-glycosylation (Koprivova et al. 2003; Decker et al. 2014; Bohlender et al. 2022; Stenitzer et al. 2022; Rempfer et al. 2024). The capability of Physcomitrella in transferring proteins to the secretory pathways via tailored signal peptides (Decker et al. 2014; Hoernstein et al. 2024) might be helpful in future to produce glycosylated spider silk proteins.

Stable production of recombinant NTD-LhMaSp1-8Rep-CTD-8Htag

For the stable production of L. hesperus MaSp1 protein in Physcomitrella, the CDSs of eight repetitive poly-alanine blocks of the L. hesperus silk protein core region (8Rep), the full sequence of the N-terminal domain (NTD), and the C-terminal domain (CTD) were in silico optimised as described before (Supplementary Fig. S4) for expression in Physcomitrella. The complete construct includes the PpActin5, NTD, 8Rep CTD, 8xHis-tag, a nos terminator, and a nptII (neomycin phosphotransferase) antibiotic-resistance expression cassette was synthesized by Genewiz in the pUC-GW-amp plasmid. To have APsp, the construct APsp-NTD-LhMaSp1-12Rep-CTD-8Htag was digested with NheI and EcoRV. The digested product was ligated to the NTD-LhMaSp1-8Rep-CTD-8Htag digested with the same enzymes. Assembled vectors (Supplementary Fig. S5 A) were verified by sequencing. The schematic representation of the expression cassette for the stable production of NTD-LhMaSp1-8Rep-CTD-8Htag is depicted in Fig. 3A.

Fig. 3
figure 3

Western blot analysis showing successful production of recombinant NTD-LhMaSp1-8Rep-CTD-8Htag protein in stable Physcomitrella lines. A Schematic representation of the expression cassette NTD-LhMaSp1-8Rep-CTD-8Htag. The plasmid contains the CDS of aspartic protease signal peptide (APsp), eight codon-optimized repetitive poly-alanine blocks of the L. hesperus MaSp1 protein core region, the full sequence of N- and C-terminal domains (NTD and CTD), 8 × Histidine-tag (8H), under the control of the Physcomitrella actin5 promoter and the NOS terminator. B Immunodetection of NTD-LhMaSp1-8Rep-CTD-8Htag was performed via Western blot using an anti-His-tag antibody (18,184, Thermo Fischer Scientific). Some of the best-producing lines are presented here. A signal slightly above 55 kDa was detected (white arrows). Almost all transgenic lines show high molecular mass signals higher than 180 kDa (black arrows). C Western blot of the two best NTD-LhMaSp1-8Rep-CTD-8Htag producing lines L4 and L33 with the elution fraction from the His SpinTrap column shows single bands at ~ 56 kDa (white arrows) and high molecular mass signals (black arrows). Unspecific signals also present in the WT are marked with yellow arrows. WT: wild type, L: line. PK1: positive control; transgenic moss line producing the 58 kDa His-tagged protein MFHR1. Gradient 4–15% SDS gels were used. Western blot under reducing conditions. MW: PageRuler Prestained Protein Ladder (Thermo Fisher Scientific)

After transformation with the NTD-LhMaSp1-8Rep-CTD-8Htag construct, the regeneration of 150 transgenic plants was achieved. Fifty transgenic plants were selected for screening with Western blot using an anti-His-tag antibody (18,184, Thermo Fischer Scientific), and 23 plants with a positive signal were identified (Supplementary Fig. S6). The calculated molecular mass of NTD-LhMaSp1-8Rep-CTD-8Htag without the signal peptide is 52 kDa. The Western blot analysis of some of the best-producing lines is presented in Fig. 3. Signals slightly above 55 kDa were observed (Fig. 3B). In addition, almost all transgenic lines showed high molecular mass signals (smear) above 180 kDa (Fig. 3B).

The absence of signals with smaller molecular mass than the expected one in the Western blot suggests that codon optimisation effectively prevented potential heterosplicing events and that NTD-LhMaSp1-8Rep-CTD-8Htag is not degraded in our production conditions. The absence of smaller products indicates the suitability of our expression system, which does not lead to degradation of recombinant MaSp1, typical of some other expression systems such as Escherichia coli, Nicotiana benthamiana, and P. pastoris (Fahnestock and Irwin 1997; Menassa et al. 2004; Xia et al. 2010; Bowen et al. 2018).

To assess the purification potential of NTD-LhMaSp1-8Rep-CTD-8Htag using the His-tag, we used His SpinTrap columns. From 23 positive plants in the first Western blot-based screening, we selected the two lines with the strongest signals: L4, a line which produces mainly a product of slightly above 55 kDa corresponding to the expected molecular mass, and L33, notable for its high molecular mass signals (Fig. 3). Both lines grow similar to the WT on solid medium. For further analysis, the two lines were grown as suspension cultures. Protein extracts from both lines were successfully enriched for NTD-LhMaSp1-8Rep-CTD-8Htag using His SpinTrap columns (Fig. 3C, Supplementary Fig. S7), which had no impact on the respective recombinant MaSP1 migration behaviours in SDS-PAGE as described above. L4 is still notable for the single band above 55 kDa, and L33 for the high molecular mass signals. The high molecular mass signals in L4, which were not clearly visible in Fig. 3B, became evident (Fig. 3C). The band above 55 kDa was also evident in L33 (Fig. 3C). The signals of protein monomers present in L33 is in a similar range as that observed in L4 (Fig. 3C). In assessing line efficiency, we employed the PK1 line (Fig. 3C), a transgenic Physcomitrella line, which produces a synthetic complement regulator, MFHR1 (Top et al. 2019). MFHR1 also has the 8His-tag. The production yield of PK1 was 0.1 mg MFHR1/g moss fresh weight under non-optimised conditions (Top et al. 2019) and 0.82 mg/g fresh weight under optimised conditions (Ruiz-Molina et al. 2022). When considering both monomer and high molecular mass signals in L33, the signal strength of NTD-LhMaSp1-8Rep-CTD-8Htag is much higher than the PK1 signal strength (Fig. 3C), so we estimate a production of spider silk protein in Physcomitrella bioreactors above 0.82 mg/g fresh weight. Product yield is one of the most important factors for a commercially viable production system, as it impacts cost-efficiency, scalability, and overall feasibility. Achieving high yields requires selecting an appropriate host system to ensure both the quantity and quality of the recombinant protein.

To further confirm the successful production of NTD-LhMaSp1-8Rep-CTD-8Htag, we performed subsequent mass spectrometry (MS) analysis of the protein band in L4, and identified our target protein with a sequence coverage of 62% (Fig. 4). The APsp was undetectable, indicating its correct cleavage. Detection of some amino acids at the beginning of the NTD and the last amino acids of the CTD confirm the production of the intact protein.

Fig. 4
figure 4

Mass spectrometric analysis of NTD-LhMaSp1-8Rep-CTD-8Htag. A Total protein of protonema material of L4 was recovered after acetone precipitation. The protein band at ~ 56 kDa which is highlighted in the rectangle on the Coomassie-stained SDS-PAGE and corresponds to the band on the Western blot marked with a rectangle, was utilized for mass spectrometry. Anti-His-tag antibody (18,184, Thermo Fischer Scientific) was used for detection. B NTD-LhMaSp1-8Rep-CTD-8Htag peptides identified by mass spectrometry are shown in bold. Blue: aspartic protease signal peptide (APsp). Orange: N-terminal domain. Black: repetitive poly-alanine and glycine blocks. Green: C-terminal domain. Potential sites for hydroxylation of proline are shown in rectangles. Identified hydroxyproline is underlined. Peptides for MS analysis were generated with chymotrypsin. Western blot under reducing conditions. MW: PageRuler Prestained Protein Ladder (Thermo Fisher Scientific)

One possible explanation for the high molecular mass signals is plant-typical O-glycosylation, which occurs on Physcomitrella proteins (Bohlender et al. 2022; Rempfer et al. 2024). To investigate this possibility, the JIM 16 antibody targeting arabinogalactans (Knox et al. 1991) was used. Arabinogalactans are a type of O-glycans, characterized by tree-like and multiple-branched saccharide structures. Due to their size, they can substantially elevate the molecular mass of proteins. Within the N-terminal domain of NTD-LhMaSp1-8Rep-CTD-H8 tag, five potential sites for proline hydroxylation were identified (Fig. 4B). Hydroxylated prolines serve as the primary anchor for O-glycosylation, such as arabinogalactans or extensin-like arabinosylation in plants (Gomord et al. 2010; Bohlender et al. 2022; Rempfer et al. 2024). Here, using MS analysis we did not find evidence for prolyl-hydroxylation or arabinosylation of three of these potential Hyp sites (ITPSK, AAPSG, AAPAA) whereas no peptide coverage for the other two sites (QAPAQ, AAPAT) was obtained (Fig. 4B). In contrast, hydroxylation of a C-terminal proline (TNPAS) residue was identified by MS, although at low confidence (Fig. 4B, and Supplemental Fig. S8). Moreover, no signals indicating the presence of arabinogalactans were detected in the investigated NTD-LhMaSp1-8Rep-CTD-H8 tag protein-containing samples (Supplementary Fig. S9), suggesting that O-glycosylation is absent despite present prolyl hydroxylation. This suggests that the presence of products with high to very high molecular mass is not due to the attachment of arabinogalactan sugar structures to the protein product but might be NTD-LhMaSp1-8Rep-CTD-8Htag multimers. It was reported before that higher-order protein fibres can assemble via multimerization of medium-molecular mass proteins (Li et al. 2024).

Previous studies suggest that spider silk proteins can assemble under acidic pH conditions (Landreh et al. 2010; Hagn et al. 2011; Eisoldt et al. 2012). To test this possibility, we explored the influence of acidic and basic environments on the multimerization of NTD-LhMaSp1-8Rep-CTD-H8 tag enriched via His-SpinTrap columns. Therefore, His-SpinTrap purified samples were adjusted to pH ranges of 2–3, 3–4, 5–6, 7–8, and 8–9, respectively. Subsequently, the samples underwent overnight incubation at 4 °C and were analysed via Western blot and immunodetection. We found no noticeable differences in the very high molecular mass signals on the Western blot after incubation in acidic condition (pH 5–6) compared to basic conditions (Supplementary Fig. S10). Under extremely acidic pH conditions (pH 2–3, pH 3–4), the monomers were nearly absent. This might be because highly acidic pH leads to a decrease in monomer solubility, potentially resulting in protein precipitation in the sample. Moreover, degradation proteases are more active at acidic pH (Stael et al. 2019). A previous study reported the stability of a soluble 22 kDa Araneus ventricosus MaSp (monomers) at pH values from 2 to 12 for at least 1 h (Lee et al. 2012). Interestingly, extremely acidic pH did not lead to the precipitation of multimers, suggesting that these soluble multimers remain stable in extremely acidic environments. This stability implies that these multimers may have a robust structure that is resistant to pH changes. Our observations also reveal that within the basic pH ranges of 7–8 and 8–9, monomers and multimers are present without changes (Supplementary Fig. S10). It seems that variations in pH do not influence the conversion of monomers to multimers and vice versa under our conditions.

To assess the influence of other conditions on the multimerization of our protein, we tested varying pH levels within the growth medium (Supplementary Fig. S11), and different extraction buffers with different pH values (10, 7.4, 4.2) (Supplementary Fig. S12). None of these conditions prevented the formation of very high molecular mass soluble multimers. Furthermore, adding 75 mM DTT to the extraction buffer did not reduce the high molecular mass signals either (Supplementary Fig. S13). It was reported that some components in plant extracts, such as peroxidases, can induce cross-linking of coat proteins of the Nudaurelia capensis omega virus (Castells-Graells and Lomonossoff 2021). They suggested cross-linking occurred during extraction and purification. However, their experiment of using extraction buffers with different pH, or adding 1 mM DTT did not make any difference to the result (Castells-Graells and Lomonossoff 2021).

The underlying cause of these notable soluble multimers remains unspecified within this study. We assume that the formation of these multimers can occur within the cell. The pH of the Golgi apparatus might have an impact on the formation of these multimers. The Golgi apparatus exhibits an acidic pH (6.8 ± 0.2 to 6.3 ± 0.3 from cis to trans Golgi in Arabidopsis) (Shen et al. 2013), the necessary pH for spider silk protein polymerisation, contrasting with the more basic pH of the ER (Shen et al. 2013). The soluble multimers observed in the NTD-LhMaSp1-8Rep-CTD-8Htag producing line could potentially stem from mature protein polymerisation due to the inherent property of proteins to polymerise in the Golgi apparatus, while the protein monomers may represent those still residing within the ER. This polymerisation can be beneficial for protein rearrangements necessary for the efficient in-vitro maturation of silk protein. However, if the soluble multimers stem from immature aggregates, either within the cell or during the extraction process, this could be detrimental to the protein rearrangements, thereby interfering with the maturation process in vitro.

The pH is not the only factor essential for spider silk protein polymerisation. Recently, a total of 180 metabolite components have been found in the spider silk of Trichonephila clavata (Hu et al. 2023). Notably, the presence of two major metabolites, choline and DL-malic acid, within spider silk components highlights their pivotal roles and essential constituents in protein polymerisation and silk formation (Hu et al. 2023). Both components are also present in plant cells (George et al. 1934; Stafford 1956; Hanson et al. 1985; Andresen et al. 1988). As Physcomitrella is rich in secondary metabolites (Erxleben et al. 2012; Munoz et al. 2024), it may provide components for the polymerisation of spider silk proteins.

Purification of moss-produced NTD-LhMaSp1-8Rep-CTD-8Htag

For further analysis, we purified the recombinant protein. We isolated His-tagged NTD-LhMaSp1-8Rep-CTD-8Htag from 9-day-old plant material cultivated under 2% CO2 and purified it using Ni–NTA chromatography. The target protein started eluting in fraction 19 at 140 mM imidazole and was found in higher concentration in fraction 21 at 500 mM imidazole. Validation through Coomassie staining and Western blotting confirmed the presence of the expected monomers between 55 and 70 kDa (Supplementary Fig. S14 A, B). Western blot also shows multimers (Supplementary Fig. S14 B). The lower molecular bands are potential cross-reaction signals from WT. Further optimisation is required to improve the purification process and eliminate unspecific bands. Further experimentation and method development are required to establish a robust and validated ELISA protocol for the quantification of the protein of interest in subsequent studies.

Moss producing NTD-LhMaSp1-8Rep-CTD-8Htag has WT phenotype and similar biomass accumulation

We investigated the impact of NTD-LhMaSp1-8Rep-CTD-8Htag production on moss growth and phenotype. It was reported that the expression of MaSp2 is deleterious to E. coli through a negative effect on cell growth (Connor et al. 2023). Therefore, we investigated the potential impact of NTD-LhMaSp1-8Rep-CTD-8Htag production on the moss phenotype, especially biomass accumulation. Bioreactor cultures of Physcomitrella WT and NTD-LhMaSp1-8Rep-CTD-8Htag producing line L33 were started in parallel at a density of 50 mg DW/L (Fig. 5A). The microscopic analysis of protonema development observed in both lines at day 5 exhibited no differences. Likewise, we did not observe any phenotypic deviations between WT moss and NTD-LhMaSp1-8Rep-CTD-8Htag producing line (Fig. 5B). Biomass gain was monitored over a time period of 17 days by regular dry weight measurements. Biomass density in both lines reached almost 3 g DW/L (Fig. 5C) at day 17. The growth index (GI = biomassfinal − biomassinitial)/biomassinitial) for WT and L33 are 58 and 55.2, respectively. The results show that the biomass density and morphology of the production line were not significantly altered by LhMaSp1 protein production. The LhMaSp1 production (both monomers and soluble multimers) was confirmed in the moss line L33 growing in the photobioreactor (Fig. 5D).

Fig. 5
figure 5

Cultivation of Physcomitrella WT and NTD-LhMaSp1-8Rep-CTD-8Htag producing line (L33) in photobioreactors. A Five-litre photobioreactor set-up of WT and NTD-LhMaSp1-8Rep-CTD-8Htag-producing line L33 was illuminated with LED lamps. After 2 days, the lights increased from 160 to 350 µmol/m2s. B Morphology of protonema tissue from WT and L33 at day 5 of cultivation. Bars = 150 µm. C Biomass accumulation of WT and L33 in the photobioreactor over a period of 17 days. Data represent mean ± standard deviation (SD) from three biological replicates. WT (red points) and L33 (black points) showed exponential growth until day 8, after which the growth rate began to slow. Solid lines represent fitted curves for WT (red) and L33 (black). Statistical analysis showed no significant difference between the two lines (t-test, p > 0.05). Biomass density in both lines reached almost 3 g DW/L on day 17. D 100 mg biomass of both lines at day 8 was harvested for anti-His Western blot analysis. The result shows the production of NTD-LhMaSp1-8Rep-CTD-8Htag in both forms of monomers and soluble multimers. Anti-His-tag antibody (18,184, Thermo Fisher Scientific) was used. Western blot under reducing conditions. MW: PageRuler Prestained Protein Ladder (Thermo Fisher Scientific)

Conclusion

In this study, we established a robust and consistent expression platform for NTD-LhMaSp1-8Rep-CTD-8Htag production in Physcomitrella through genetic engineering and codon optimisation. We successfully achieved the stable production of recombinant NTD-LhMaSp1-8Rep-CTD-8Htag in Physcomitrella photobioreactors. The production of native-sized spidroins in moss could be achieved using methods such as recursive directional assembly (Bowen et al. 2018). Additionally, by utilizing spinning approaches, we can establish optimal conditions for assembling moss-derived spidroins and analysing the mechanical properties of the resulting fibres. The estimated yield of Physcomitrella-produced LhMaSp1 is higher than 0.82 mg/1 g fresh weight, demonstrating the reliability and scalability of the platform. These results underscore the potential of Physcomitrella as an efficient host for the sustainable production of spider silk proteins, offering a promising avenue for the development of novel biomaterials and applications.

Materials and methods

Plant material

Physcomitrella wild type (new species name: P. patens (Hedw.) Mitt.) was cultivated axenically in mineral KnopME medium 1.83 mM KH2PO4, 3.35 mM KCl, 1.8 mM MgSO4, 4.2 mM Ca(NO3)2, 0.6 mM FeSO4, including microelements (50 μM H3BO3, 50 μM MnSO4 × H2O, 15 μM ZnSO4 × 7 H2O, 2.5 μM KJ, 0.5 μM Na2MoO4 × 2 H2O, 0.05 μM CuSO4 × 5 H2O, 0.05 μM CoCl2 × 6 H2O) according to Reski and Abel (1985), Egener et al. (2002) and Decker et al. (2015). LhMaSp1-producing lines were obtained by transformation of WT moss lines. Cultivation of Physcomitrella, protoplast isolation, polyethylene glycol-mediated transfection, regeneration and selection were performed according to Hohe et al. (2002), Hohe and Reski (2002) and Decker et al. (2015). Sterility of the culture media was monitored according to the protocol from Heck et al. (2021).

Design of the vectors NTD-LhMaSp1-12Rep-CTD-citrine, and NTD-LhMaSp1-8Rep-CTD-8Htag for transient and stable production, respectively

The CDSs of 12 repetitive poly-alanine blocks of L. hesperus silk protein core region (GenBank accession number EF595246), the full sequence of the N-terminal domain (GenBank accession number ABY67421.1) and the C-terminal domain (GenBank accession number AWK58707.1) (Ayoub et al. 2007) were optimised (Supplementary Fig. S1) for the expression in Physcomitrella. This construct was synthesized in the pUC-GW-Amp plasmid by Genewiz (Leipzig, Germany). The CDS of NTD-LhMaSp1-12Rep-CTD was cloned into the expression vector containing PpActin5, the CDS of citrine, and the Nos terminator sequence via XhoI and KpnI restriction sites. Gibson assembly (Gibson et al. 2009) was used to include the Physcomitrella aspartic protease 1 signal peptide (APsp; Schaaf et al. 2004). For this, the CDS of APsp was amplified from plasmid Act5_APsp_MFHR1 (Ruiz-Molina et al. 2022), together with the overhangs NTD and PpAct5 with the primers Fwd_overlapAct5_APsp and Rev_overlapNTD_APsp. The CDS of NTD was amplified from plasmid NTD-LhMaSp1-12Rep-CTD together with the overhangs APsp and 12-Rep region with the primers Fwd_overlapAPsp_NTD and Rev_overlap12Rep_NTD (Supplementary Table S1). 100 ng of each PCR product was added to the Gibson reaction including T5-Exonuclease, Phusion polymerase and Taq ligase (NEB, Frankfurt, Germany). The assembled vector was verified by sequencing. The plasmid (Supplementary Fig. S2 A) is available via the International Moss Stock Center IMSC (https://www.moss-stock-center.org) under the accession number P2231 for the expression constructs APsp-NTD-LhMaSp1-12Rep-CTD-citrine. The protein sequence of NTD-LhMaSp1-12Rep-CTD is presented in Supplementary Fig. S2 B.

For the vector of stable production, the intact construct including the CDSs of 8 repetitive poly-alanine blocks of the L. hesperus silk protein core region, the full sequence of the N-terminal domain, and the C-terminal domain, Physcomitrella actin5 (P) promoter, 8His, a nos terminator, and a nptII (Neomycin Phosphotransferase II) antibiotic-resistance expression cassette was synthesized in the PUC-GW-amp plasmid by Genewiz (Leipzig, Germany). The APsp sequence was obtained from the plasmid APsp-NTD-LhMaSp1-12Rep-CTD via digestion with NheI and EcoRV. The digested product was ligated to the NTD-LhMaSp1-8Rep-CTD-8Htag digested with the same enzymes. Assembled vectors were verified by sequencing. The plasmid (Supplementary Fig. S5 A) is available via the International Moss Stock Center IMSC (https://www.moss-stock-center.org) under the accession number P2230 for the expression constructs APsp-NTD-LhMaSp1-8Rep-CTD-8Htag. The protein sequence of NTD-LhMaSp1-8Rep-CTD-8Htag is presented in Supplementary Fig. S5B. Utilizing 12Rep and 8Rep provides us with greater flexibility for the future to generate a variety of proteins in terms of size through methods like recursive directional assembly (Bowen et al. 2018).

Confocal microscopy

To monitor the expression and localisation of NTD-LhMaSp1-12Rep-CTD-citrine proteins, confocal imaging was performed on transiently transfected protoplasts using a Zeiss LSM880 laser scanning confocal microscope (ZEISS, Jena, Germany). For all imaging experiments, an LD LCI Plan-Apochromat 40 ×/1.2 Imm AutoCorr DIC M27 water objective (ZEISS) was used with a zoom factor of 4. Citrine, chlorophyll, and mCerulean were excited with laser beams at 514 nm (Argon), 561 nm (DPSS), and 458 nm (Argon), respectively. The detection ranges were specified as 517–552 nm for citrine, 599–700 nm for chlorophyll, and 472–508 nm for mCerulean, respectively. The images were recorded in Zeiss ZEN black software and acquired as Z-stacks. For visual representation and analysis, the ZEN blue edition software was used. To produce three-dimensional reconstruction images, the maximum intensity projection option was used. To improve resolution, deconvolution was performed on all images.

Plant screening and protein detection

For the production of stable lines, plants surviving the neomycin selection as described in Decker et al. (2015) were transferred to a liquid medium. After propagation of moss lines every two weeks for two months, they were subjected to screening for protein production. For this purpose, suspension cultures of 7-day-old protonema were vacuum-filtrated and 50–80 mg FW material transferred to 2-ml tubes with one tungsten carbide (Qiagen, Hilden, Germany) and one glass bead (Roth, Karlsruhe, Germany), 3 mm diameter. Plant material was homogenised with a tissue lyser (MM 400, Retsch, Haan, Germany) for 1 min at 30 Hz. Extraction buffer (50 mM Tris–HCl pH 7.2, 2% Triton X-100, 1% plant protease inhibitor cocktail P9599 from Sigma Aldrich) was added to the homogenized material and subjected to sonication for 15 min using an ultrasound bath (Sonorex RK52, Bandelin, Berlin, Germany), followed by centrifugation. Proteins were precipitated from the supernatant by adding 6 × acetone allowing overnight incubation. Then pellets were dissolved in 50 mM Tris–HCl pH 7.2, 2% SDS at 95 °C for 15 min. Before running in the SDS PAGE, 75 mM DTT and 4 × Laemmli sample buffer (Bio-Rad, Feldkirchen, Germany) were added. 50 µg of total protein (NanoDrop 1000 spectrophotometers, Thermo Fischer Scientific) was loaded on the SDS gel. Protein production was then analysed using Western blotting. For this, a 4–15% gradient gel (Mini-Protein TGX Precast Gels Bio-Rad) was run at 120 V for 1:30 h and blotted to polyvinylidene fluoride (PVDF) membrane (Hybond P 0.45, Amersham, Cytiva, Marlborough, USA) in a Trans-Blot SD Semi-Dry Electrophoretic Cell (Bio-Rad) for 1.15 h with 1.5 mA/cm2. The membrane was blocked for 1 h with 4% ECL blocking agent (Cytiva) followed by three times washing (1 × 15 min and 2 × 5 min) with Tris-buffered saline (TBST, 0.1% Tween 20). Then the membrane was probed with monoclonal anti-His antibody (18,184, Thermo Fischer Scientific) as primary antibody overnight in 1:2000 dilutions with TBST buffer with 2% blocking agent. After three times washing, the secondary antibody, a sheep anti-mouse antibody coupled to horseradish peroxidase (NA931 V, Invitrogen, Thermo Fisher Scientific, Massachusetts, USA), was diluted to 1:25,000 in TBST buffer with 2% blocking agent and incubated for 1 h on the membrane. After 4 times washing (2 × 10 min and 2 × 5 min) the blot was incubated with detection reagents (SuperSignal West Pico PLUS Chemiluminescent Substrate, Thermo Fisher Scientific) and exposed to autography film (Hyperfilm ECL, Cytiva) up to 20 min.

Rapid screening via His SpinTrap

The His SpinTrap column (Cytiva) was utilized for small-scale screening and purification of NTD-LhMaSp1-8Rep-CTD-8Htag proteins via immobilized metal ion affinity chromatography (IMAC). Protonema from suspension culture, cultivated for 9 days in a 500 mL Erlenmeyer flask under 2% CO2 in a shaker incubator, was harvested and 200 mg of biomass was homogenized using a tissue lyser. The material was then resuspended with 1.2 mL binding buffer (75 mM Na2HPO4x .H2O (disodium phosphate), 0.5 M NaCl, 20 mM imidazole, 0.05% Tween 20, 10% glycerol, 1% plant protease inhibitor cocktail (P9599, Sigma-Aldrich) pH 7 followed by 15 min sonication bath. After centrifugation, the supernatant was applied to the column previously equilibrated with binding buffer. The column was washed with 3 × binding buffer followed by eluting the protein with 1 × elution buffer (100 mM Na2HPO4 × H2O, 0.5 M NaCl, 500 mM imidazole, 10% glycerol, pH 7.4). The eluted pool was collected in a new tube.

Protein purification

NTD-LhMaSp1-8Rep-CTD-8Htag was extracted from vacuum-filtered plant material. For this, 4 ml binding buffer was added per gram FW and the suspension was disrupted with an ULTRA-TURRAX (Ika, Staufen, Germany) at 10,000 rpm for 10 min in an ice bath followed by sonication with an ultrasonic tip (Q500, QSONICA, Newtown, USA), amplitude 55%, volume < 50 ml (time on: 10 s, time off: 40 s) for 20 min in total (on and off) on ice. Two consecutive centrifugation steps (4500 × g for 3 min and 20,000×g for 10 min at 4 °C) were carried out and the supernatant was filtered through 1 µm, and subsequently 0.22 µm PES filters (Carl Roth, Karlsruhe, Germany).

For chromatographic purification, the filtrate was loaded onto a 1 mL HisTrap FF column, using the ÄKTA system (Cytiva) at 1 mL/min. The column was washed with 30 column volumes (CV) of binding buffer. The protein was eluted using a stepwise gradient (3% elution buffer for 9 CV, 15% 6 CV, 25% 3 CV, 100% 5 CV) (100%: 100 mM Na2HPO4 x H2O, 0.5 M NaCl, 500 mM imidazole, 10% glycerol, pH 7.4), and collected in 1.5 mL fractions (fractions 9–23). The protein of interest was screened by Western blot and Coomassie staining.

Mass spectrometry measurement and data analysis

Sample preparation for MS analysis was done as described in Hoernstein et al. (2016). In brief, gel slices at the expected molecular mass were excised using a scalpel, chopped to pieces and destained with 30% acetonitrile (ACN), 70% 100 mM ammonium bicarbonate. Destained samples were shrunken in 100% ACN and dried in a speedvac. 0.2 µg chymotrypsin (Promega, Madison, USA) were applied per sample and proteolytic digest was performed for 16 h at 25 °C. Peptides were purified using C18-STAGE-Tips as described in Hoernstein et al. (2018) and eluted from the tip in 80% ACN, 0.1% formic acid (FA).

Measurements were performed on an Orbitrap Elite instrument (Thermo Scientific) coupled to an UltiMate 3000 RSLCnano HPLC system (Thermo Fisher Scientific) as described in Njenga et al. (2024). In brief, peptides were loaded on a nanoEase™ M/Z Symmetry C18 precolumn (20 mm × 180 µm ID; Waters) at a flowrate of 10 µl and separated on a µPAC column array (50 cm length; PharmaFluidics) at a flow rate of 0.3 µl/min at 40 °C using a binary solvent gradient from 7 to 65% solvent B over 30 min with 0.1% (v/v) formic acid (FA) as solvent A and a mix of 0.1% (v/v) FA, 50% (v/v) methanol (MeOH), and 30% (v/v) ACN as solvent B. Subsequently solvent B was increased to 80% over 5 min and kept for 3 min. The HPLC was coupled online to the Nanospray Flex ion source with the PST-HV as an interface using an NFU liquid transfer system (MS Wil) and a fused silica emitter (EM-20–360; MicrOmics Technologies LLC).

MS1 scans were acquired at a mass range of m/z 370–1700 and a resolution of 120,000 (at m/z 400) using a target value (AGC) of 1 × 106 ions, and a maximum injection time (IT) of 200 ms. MS2 scans were acquired from the top 15 ions (charge ≥ 2) and fragmentation was performed via collision-induced dissociation (CID) in the linear trap at a normalized collision energy of 35%, an activation time of 10 ms with a q-value of 0.25, a resolution of 35,000, an AGC target of 50,000 ions, a maximum injection time of 150 ms, and a dynamic exclusion time of 45 s.

Raw data from MS measurements were processed with Mascot Distiller (V2.8.3.0, Matrix Science) and database searches were performed with Mascot Daemon (V2.6.0, Matrix Science) against a reverse concatenated database containing all Physcomitrella V3.3 protein models (Lang et al. 2018). In parallel, a custom database containing the sequences of known contaminants (e.g. keratin) was included. A precursor mass tolerance of ± 10 ppm and a fragment mass tolerance of ± 0.6 Da were used. Carbamidomethylation (C, + 57.021464 Da), deamidation (N, + 0.984016 Da), pyro-Glu formation (N-term Q, − 17.026549 Da) and oxidation (M; P, + 15.994915 Da) were specified as variable modifications. Search results were loaded in Scaffold5 software (V5.0.1, Proteome Software) and protein hits were accepted at a false discovery rate (FDR) of 1% at the protein level and 0.5% at the peptide level while having at least two identified peptides.

Bioreactor culture and determination of biomass

The cultures of WT and NTD-LhMaSp1-8Rep-CTD-8Htag producing line were scaled up to 5 L stirred tank photobioreactors (Getinge, Sweden) with Knop ME medium. Aeration 0.3 vvm (2% CO2), agitation with pitched 3 blade impeller at 500 rpm was used. After 2 days, the light was increased from 160 to 350 µmol/m2 s. The pH level was kept constant at pH 5.8 by titrating with either 0.5 M NaOH or 0.5 M HCl, and the temperature was maintained at 22 °C throughout the duration of the experiment. For dry weight (DW) measurement, 10–50 mL of tissue suspension was filtered and dried for 2 h at 105 °C. Dry weight measurements were conducted for 17 days for both lines.

Statistical analysis

The graph and analyses were performed with OriginLab software for windows (OriginLab, Northampton, Massachusetts, USA). The normality of the data was assessed using the Shapiro–Wilk test using RStudio, which indicated that the data followed a normal distribution (p > 0.05). Levene’s test for homogeneity of variance showed no significant difference between group variances (p > 0.05). Statistical significance was evaluated by t-test (p > 0.05).