Fig. 6: Combinatorial mutagenesis of SRC.
a, 3D structure of SRC (PDB: 2SRC) indicating the 15 combinatorially mutated residues in library 4 (orange) and ATP (blue). b, Scatter plots showing the reproducibility of fitness estimates from triplicate AbundancePCA experiments. Pearson’s r is indicated in red. Rep., biological replicate. c, Histogram showing the number of observed aa variants at increasing Hamming distances from the wild type; the x axis is shared with panel d. d, Violin plot showing distributions of abundance growth rates inferred from deep sequencing data versus number of aa substitutions. The percentage of folded protein variants (predicted fraction of folded molecules > 0.5) is shown at each Hamming distance from the wild type. e, Nonlinear relationship (global epistasis) between observed abundance fitness and changes in free energy of folding. The thermodynamic model fit is shown in red. f, Performance of the energy model that includes all first-order and second-order genetic interaction (energetic coupling) terms. g, Relationship between folding coupling energy strength and minimal inter-residue side-chain heavy-atom distance. The mean is shown and error bars indicate 95% confidence intervals from a Monte Carlo simulation approach (n = 10 experiments). Points are coloured by binned inter-residue distances (see legend in panel h). Spearman’s ρ is shown for all couplings (top value), as well as for those involving pairs of residues separated by more than five residues in the primary sequence (bottom value). Core residues are indicated as triangles. h, Relationship between folding coupling energy strength and linear sequence (backbone) distance in number of residues.
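The nonlinear relationship in panel e and the first-order and second-order terms in panel f can be illustrated with a minimal sketch. It assumes a two-state folding model in which additive folding energies plus pairwise couplings are mapped to a fraction of folded molecules through a Boltzmann function. The mutation names, energy values and the temperature of 303 K are hypothetical placeholders, not values from the study, and the functions `fraction_folded` and `variant_dg` are illustrative names rather than the authors' code.

```python
import math
from itertools import combinations

R_KCAL = 0.001987  # gas constant in kcal/(mol*K)


def fraction_folded(dg_fold, temp_k=303.0):
    """Two-state model: fraction of folded molecules for a folding
    free energy dg_fold in kcal/mol (negative values favour folding).
    The default temperature is an assumption."""
    return 1.0 / (1.0 + math.exp(dg_fold / (R_KCAL * temp_k)))


def variant_dg(mutations, dg_wt, first_order, second_order):
    """Wild-type energy plus first-order (single-mutant) terms plus
    second-order (pairwise coupling) terms for the given mutations."""
    dg = dg_wt + sum(first_order[m] for m in mutations)
    # Pairs are looked up in sorted order; a missing pair means no coupling.
    for a, b in combinations(sorted(mutations), 2):
        dg += second_order.get((a, b), 0.0)
    return dg


# Hypothetical energies in kcal/mol, for illustration only.
dg_wt = -1.5
first_order = {"A10V": 0.8, "L25F": 1.1}
second_order = {("A10V", "L25F"): -0.6}

double_dg = variant_dg(["A10V", "L25F"], dg_wt, first_order, second_order)
additive_dg = variant_dg(["A10V", "L25F"], dg_wt, first_order, {})

# A variant counts as folded when its predicted fraction folded exceeds 0.5.
is_folded = fraction_folded(double_dg) > 0.5
```

In this toy case the coupling term stabilises the double mutant relative to the purely additive expectation, which moves it back across the 0.5 threshold used in panel d. Because the Boltzmann function saturates at both extremes, equal energy changes produce unequal changes in the folded fraction, which is the source of the global epistasis shown in panel e.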