The flow of information from protein sequence through structure to physiological function has puzzled physicists, chemists, and biologists for over half a century. First postulated by Christian Anfinsen in the 1970s, the ‘thermodynamic hypothesis’ states that the amino acid sequence of a protein should be sufficient to determine its structure. Yet around the same time, Cyrus Levinthal famously noted that, theoretically, it would take a typical protein longer than the age of the universe to sample all possible conformations in order to reach its correct fold. This paradox posed one of the biggest challenges in biology. Only recently have advances in computational methods, together with the drastic increase in published experimental structures and sequences, narrowed the ‘sequence-structure gap’ that Anfinsen and Levinthal opened. Culminating in AlphaFold 2, sequence-based structure prediction has reached astonishing accuracy, representing a huge leap forward in biology and in drug discovery.
Is it that simple? While it is true that structure models represent a great starting point for many research endeavors, the same scientific advances that brought us AlphaFold revealed gaps in the sequence-structure-function paradigm. It is now common knowledge that proteins are not static, as models obtained from predictions or crystallographic structures might suggest, but highly dynamic. These dynamics range from bond vibrations to large conformational changes, form networks, and often modulate protein function – a phenomenon collectively described as allostery. But what is the relationship between sequence and the structural dynamics that govern function? If the sequence-structure-function paradigm holds true, these dynamics should be imprinted in the sequence and should evolve alongside the often highly conserved active site of a protein. More specifically, to maintain biological robustness, a network of energetically coupled residues should translate into a joint evolutionary constraint on the participating residues: they co-evolve or co-mutate.
Besides positional conservation, information about co-evolution is contained in multiple sequence alignments (MSAs) of homologous sequences, as these reflect the ‘evolutionary history’ of a protein family. By applying statistical models to MSAs, the interdependency between the variability at pairs of sequence positions, the ‘co-evolutionary coupling’, can be obtained and interpreted as direct or indirect physical connectivity between residues. A plethora of methods that build on this principle have been developed and have succeeded in deciphering residue-residue couplings for structure prediction and for the identification of functional domains involved in ligand binding or allosteric regulation. However, experimental validation of dynamic co-evolving networks that modulate protein function is challenging, in particular when these dynamics take place in the absence of global structural changes.
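The idea of reading co-variation out of an MSA can be illustrated with a deliberately simple statistic. The sketch below computes the mutual information between two alignment columns. This is a stand-in chosen for brevity: unlike the direct coupling methods discussed in this article, mutual information does not separate direct from indirect couplings. The function name and the four-sequence toy alignment are invented for illustration.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences (strings).
    """
    n = len(msa)
    # Single-column and pairwise residue counts across all sequences.
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    counts_ij = Counter((seq[i], seq[j]) for seq in msa)

    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 1 always change together,
# column 2 is fully conserved, column 3 varies independently of column 0.
toy_msa = ["AKLE", "AKLD", "GRLE", "GRLD"]

mi_covarying = column_mutual_information(toy_msa, 0, 1)
mi_independent = column_mutual_information(toy_msa, 0, 3)
mi_conserved = column_mutual_information(toy_msa, 0, 2)
```

In this toy case the co-varying pair carries the highest score, while a fully conserved column carries no co-variation signal at all, which is why conservation and co-evolution are complementary readouts of an alignment.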
In a recent study, Torgeson et al. combined co-evolutionary analysis and nuclear magnetic resonance (NMR) spectroscopy to identify previously undescribed dynamic networks of the protein tyrosine phosphatase (PTP) PTP1B. The structure, dynamics, and function of PTP1B, as well as those of its homologs, have been rigorously characterized, making it an ideal system to study the impact of co-evolution on functional dynamics.
Torgeson et al. applied pseudolikelihood maximization direct coupling analysis (plmDCA) to an MSA of PTP1B homologs to derive co-evolutionary couplings and then used a spectral clustering approach to split the structure into strongly coupled co-evolving domains, referred to as evolutionary domains (EDs). Supporting the idea that co-evolutionary analysis can identify functionally critical residue groups, four of the obtained EDs had previously been verified experimentally in PTP1B. However, further clustering revealed additional, as yet uncharacterized subdomains.
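To make the clustering step more concrete, the following is a minimal sketch of spectral bipartitioning applied to a residue-residue coupling matrix. It is not the procedure used in the study, which is more elaborate; it only shows the core idea of splitting a graph by the sign of the second-smallest eigenvector (the Fiedler vector) of its Laplacian. The function name and the 6 x 6 toy coupling matrix are assumptions made for illustration.

```python
import numpy as np


def spectral_bipartition(coupling):
    """Split residues into two groups using the Fiedler vector.

    `coupling` is a symmetric, non-negative matrix of coupling strengths.
    Returns an array of 0/1 labels; which group gets which label is arbitrary.
    """
    weights = np.array(coupling, dtype=float)  # copy, input is left untouched
    np.fill_diagonal(weights, 0.0)             # ignore self-couplings
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian)
    fiedler = eigenvectors[:, 1]
    return (fiedler > 0).astype(int)


# Hypothetical toy matrix: residues 0-2 and 3-5 are strongly coupled
# within each group (0.9) and only weakly coupled between groups (0.05).
toy_coupling = np.full((6, 6), 0.05)
toy_coupling[:3, :3] = 0.9
toy_coupling[3:, 3:] = 0.9

labels = spectral_bipartition(toy_coupling)
```

Applying the same split recursively to each resulting group is one simple way to obtain a hierarchy of domains and subdomains, again as an illustration rather than a description of the published method.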
Located more than 16 Å away from the active site, one of these subdomains contains an extended hydrophobic pocket that, unlike the already characterized EDs, appeared to be contiguous in space rather than in sequence. Selectively mutating central positions in the domain, either independently or as triple mutants, reduced thermal stability in all cases but increased the catalytic turnover rate (kcat) of the enzyme by more than 2-fold. Strikingly, structural analysis by 2D-[1H,15N] and 2D-[1H,13C] transverse relaxation optimized spectroscopy (TROSY) NMR and X-ray crystallography revealed no large conformational changes due to the mutations, and previously described allosteric pathways of PTP1B remained unchanged.
In the absence of global structural change, Torgeson et al. reasoned that an increase in kcat could be driven by side-chain dynamics in the µs – ms time range, a relationship that has previously been shown to govern the catalytic cycle of PTP1B. To test this hypothesis, they conducted constant-time 13C Carr-Purcell-Meiboom-Gill (ct-CPMG) relaxation dispersion experiments, which allow side-chain dynamics to be measured and fitted to a model that describes the conformational exchange between two populations, A and B, with the exchange rate kex. By measuring relaxation dispersion in the absence and presence of a substrate-mimicking inhibitor, they showed that under conditions of catalysis (i.e., when the inhibitor is bound) the overall fast dynamics of the free mutated PTP1B are quenched and that residues cluster into three groups based on their kex values. Although these three groups were also identified for the wildtype, the exchange rates of the groups differed. Remarkably, one group contained many residues of the distal co-evolutionary subdomain described above and showed a ~2-fold increase in kex from wildtype to mutant PTP1B, an increase that mirrors the 2-fold increase in kcat observed in enzymatic assays. The correlation between catalysis and side-chain dynamics in this group of residues was further supported by the resemblance between kcat and the fitted unidirectional exchange rate kAB. Most importantly, this also reflects the reciprocity of this regulatory pathway: changes in the active site, e.g., binding of a substrate, changed the dynamics in the subdomain, while perturbations in the subdomain due to mutations influenced catalytic activity.
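For readers less familiar with two-state exchange, the relationship between kex and the unidirectional rate kAB can be written down directly. In the standard two-site model, kex = kAB + kBA and the populations satisfy pA + pB = 1, so that kAB = pB · kex and kBA = pA · kex. The sketch below encodes these relations; the function name and the numerical values are purely illustrative and are not taken from the study.

```python
def unidirectional_rates(k_ex, p_b):
    """Forward and reverse rates for a two-state exchange A <-> B.

    k_ex : total exchange rate, k_AB + k_BA (per second)
    p_b  : fractional population of state B (between 0 and 1)
    """
    if not 0.0 <= p_b <= 1.0:
        raise ValueError("p_b must lie between 0 and 1")
    k_ab = p_b * k_ex          # A -> B
    k_ba = (1.0 - p_b) * k_ex  # B -> A
    return k_ab, k_ba


# Hypothetical numbers: doubling k_ex at a fixed minor-state population
# doubles the forward rate k_AB.
k_ab_slow, k_ba_slow = unidirectional_rates(k_ex=1000.0, p_b=0.05)
k_ab_fast, k_ba_fast = unidirectional_rates(k_ex=2000.0, p_b=0.05)
```

Under these assumptions, a 2-fold change in kex at constant populations translates into a 2-fold change in kAB, which is the kind of proportionality that makes a comparison between kAB and kcat meaningful.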
But what is the purpose of this regulatory subdomain? While highly conserved residues of the hydrophobic subdomain support the N-terminal portion of an α-helix that directly connects to the catalytic loop, less conserved residues flank the C-terminal portion of that α-helix. Accordingly, sequence variations in this less conserved part could enable fine-tuning of the kinetic properties of PTPs without perturbing the specificity of the active site or the allosteric regulation.
By mechanistically demonstrating the relationship between a co-evolving non-catalytic subdomain and its impact on enzymatic catalysis, the study by Torgeson et al. illustrates how functional dynamics can leave a footprint in protein sequences throughout evolution. The combination of co-evolutionary analysis with NMR-based analysis of side-chain dynamics proved critical in dissecting the regulatory network, which is otherwise invisible in static X-ray structures. Although this is one of only a few examples in which the energetic connectivity that underlies residue co-evolution has been studied experimentally in detail, the study demonstrates that sequence analysis can be instrumental in mechanistic studies of protein dynamics. Finally, and most importantly, by showing that functional dynamics are indeed encoded in sequence, the study supports the addition of dynamics as a missing part of the sequence-structure-function paradigm.
1. Anfinsen, C. B. Principles that Govern the Folding of Protein Chains. Science 181, 223–230 (1973).
2. Levinthal, C. Are there pathways for protein folding? J Chim Phys 65, 44–45 (1968).
3. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
4. Wodak, S. J. et al. Allostery in Its Many Disguises: From Theory to Applications. Structure 27, 566–578 (2019).
5. Göbel, U., Sander, C., Schneider, R. & Valencia, A. Correlated mutations and residue contacts in proteins. Proteins Struct Funct Bioinform 18, 309–317 (1994).
6. Juan, D. de, Pazos, F. & Valencia, A. Emerging methods in protein co-evolution. Nat Rev Genet 14, 249–261 (2013).
7. Torgeson, K. R. et al. Conserved conformational dynamics determine enzyme activity. Sci Adv 8, eabo5546 (2022).
8. Ekeberg, M., Lövkvist, C., Lan, Y., Weigt, M. & Aurell, E. Improved contact prediction in proteins: Using pseudolikelihoods to infer Potts models. Phys Rev E 87, 012707 (2013).
9. Granata, D., Ponzoni, L., Micheletti, C. & Carnevale, V. Patterns of coevolving amino acids unveil structural and dynamical domains. Proc National Acad Sci 114, E10612–E10621 (2017).
10. Torgeson, K. R., Clarkson, M. W., Kumar, G. S., Page, R. & Peti, W. Cooperative dynamics across distinct structural elements regulate PTP1B activity. J Biol Chem 295, 13829–13837 (2020).