Investigating the influence of environment on the evolution of Hsp90 using comprehensive fitness maps

Gene-environment interactions have long been theorized to influence molecular evolution. However, the environmental dependence of most mutations remains unknown. Using deep mutational scanning, we engineered budding yeast with all 44,604 single codon changes encoding 14,160 amino acid variants in Hsp90 and quantified growth effects under standard laboratory conditions and under five stress conditions (elevated temperature, nitrogen starvation, elevated salinity, high ethanol concentration, and oxidative stress caused by diamide). To our knowledge, these are the largest comprehensive fitness maps of point mutant growth effects determined to date. The growth effects of many variants differed among the conditions, indicating that environmental conditions can have a large impact on the evolution of Hsp90. Multiple variants provided growth advantages relative to wildtype Hsp90 under individual conditions; however, these variants tended to exhibit growth defects in other environments. The diverse Hsp90 sequences observed in extant eukaryotes preferentially contain amino acid variants that supported robust growth under all tested conditions. Thus, rather than favoring substitutions in individual conditions, the long-term selective pressure on Hsp90 may have been that of fluctuating environments, leading to robustness under a variety of conditions.


INTRODUCTION
The role of environment has been contemplated in theories of evolution for over a hundred years [1][2][3], yet molecular level analyses of how environment impacts the evolution of gene sequences remain experimentally under-explored. Depending on environmental conditions, mutations can be categorized into three classes: deleterious mutations that are purged from populations by purifying selection, nearly-neutral mutations that are governed by stochastic processes, and beneficial mutations that provide a selective advantage [4]. It has long been clear that environmental conditions can alter the fitness effects of mutations [5]. However, examining how environmental conditions impact any of the three classes of mutations is challenging. Measurable properties of nearly-neutral and deleterious mutations in natural populations are impacted by both demographics and selection [4], which are difficult to disentangle. In addition, many traits are complex, making it challenging to identify all of the contributing genetic variations [6]. For these reasons, we do not have a detailed understanding of how environmental conditions impact the evolution of most gene sequences.
Analyses of adaptive mutations in both lab experiments and natural populations indicate that they are rare compared to nearly-neutral and deleterious mutations [4,7,8]. Approaches have been developed to identify adaptive mutations in natural populations [9] and lab experiments [10,11]. Most laboratory studies of adaptation rely on stochastic sampling of mutations, which identifies only a subset of all possible adaptive mutations. With stochastic approaches, many potential adaptive mutations can go undetected. Nevertheless, studies sparked by correlating a genotype and a specific environmental condition have revealed a handful of striking evolutionary mechanisms. In one of the earliest studies of molecular adaptation, the correlation between the prevalence of the sickle cell gene and regions where malaria was endemic led to the realization that this gene provided protection from malaria infection [12]. Similar approaches have been utilized to understand other mechanisms of adaptation including how distinct land environments select for coat color of mice mediated by the Agouti gene [13]. Making the link between genotype, phenotype, and adaptation usually involves a tremendous amount of field work, making it difficult to pursue.
On the other end of the spectrum, the environmental dependence of deleterious mutations has been challenging to analyze because of the large number of mutations in this class. The environmental dependence of gene knockouts was investigated in budding yeast [14]. This study revealed that while 19% of genes were essential in rich media, the majority of genes exhibited growth effects under altered conditions. While this result emphasizes that environmental conditions can dramatically influence selection pressure on genes, it does not provide insights into the impact of environmental conditions on gene sequences.
Modern mutational scanning approaches [15] provide novel opportunities to examine fitness effects of the same mutations under different laboratory conditions [16,17]. The EMPIRIC (Exceedingly Meticulous and Parallel Investigation of Randomized Individual Codons) approach that we developed is particularly well suited to address questions regarding the environmental impact of mutational effects for three reasons: it quantifies growth rates that are a direct measure of experimental fitness, all point mutations are engineered providing comprehensive maps of growth effects, and all the variants can be tracked in the same flask while experiencing identical growth conditions [18]. We have previously used the EMPIRIC approach to investigate how protein fitness maps of ubiquitin vary in different environmental conditions [19]. The analysis of ubiquitin fitness maps revealed that stress environments can exacerbate the fitness defects of mutations. However, the small size of ubiquitin and the near absence of natural variation in ubiquitin sequences (only three amino acid differences between yeast and human) hindered investigation of the properties underlying historically observed substitutions.
Mutational scanning approaches have emerged as a robust method to analyze relationships between gene sequence and function, including aspects of environment-dependent selection pressure. Multiple studies have investigated resistance mutations that enhance growth in drug or antibody environments [20][21][22][23][24]. Most of these studies have focused on interpreting adaptation in the light of protein structure. Of note, Dandage, Chakraborty and colleagues explored how environmental perturbations to protein folding influenced tolerance of mutations in the 178 amino acid gentamicin-resistant gene in bacteria [25]. However, the question of how environmental variation shapes the selection pressure on gene sequences has not been well studied.
Here, we report comprehensive experimental fitness maps of Heat Shock Protein 90 (Hsp90) under multiple stress conditions and compare our experimental results with the historical record of hundreds of Hsp90 substitutions accrued during its billion years of evolution in eukaryotes. Hsp90 encodes a 709 amino acid protein and to our knowledge it is the largest gene for which a comprehensive protein fitness map has been determined. Hsp90 is an essential and highly abundant molecular chaperone which is induced by a wide variety of environmental stresses [26,27]. Hsp90 assists cells in responding to these stressful conditions by facilitating the folding and activation of client proteins through a series of ATP-dependent conformational changes mediated by co-chaperones [28]. These clients are primarily signal transduction proteins, highly enriched in kinases and transcription factors [29] and thus, Hsp90 activity is linked to virtually every process in the cell and influences all facets of cellular physiology. Because Hsp90 helps cells to cope with stress, we hypothesized that the functional pressures acting on Hsp90 may vary in different stress conditions. The conditions in natural environments often fluctuate, and all organisms contain stress response systems that aid in the acclimation to new conditions. The conditions experienced by different populations and species can vary tremendously depending on the niches that they inhabit, providing the potential for distinct selective pressures on Hsp90. Previous studies of a nine amino acid loop in Hsp90 identified multiple amino acid changes that increased the growth rate of yeast in elevated salinity [30], demonstrating the potential for beneficial mutations in Hsp90. However, the sequence of Hsp90 is strongly conserved in eukaryotes (57% amino acid identity from yeast to human), indicating consistent strong purifying selection.
To investigate the potential influence of the environment on Hsp90 evolution, we quantified fitness maps in six different conditions. The different conditions impose distinct molecular constraints on Hsp90 sequence. While proximity to ATP is the dominant functional constraint in standard conditions, the influence of client and co-chaperone interactions on growth rate dramatically increases under stress conditions. Increased selection pressure from heat and diamide stresses led to a greater number of beneficial variants compared to standard conditions. The observed beneficial variants were enriched at functional hotspots in Hsp90. However, the natural variants of Hsp90 tend to be robust to all environments tested, indicating selection for robustness to diverse stress conditions in the natural evolution of Hsp90.

RESULTS
We developed a powerful experimental system to analyze the growth rate supported by all possible Hsp90 point mutations under varied growth conditions. Bulk competitions of yeast with a deep sequencing readout enabled the simultaneous quantification of 98% of possible amino acid changes ( Figure 1A). The single point mutant library was engineered by incorporating a single degenerate codon (NNN) into an otherwise wildtype Hsp90 sequence as previously described [31]. To provide a sensitive readout of changes in Hsp90 function, we used a plasmid system that reduced Hsp90 protein levels to near-critical levels [32]. We employed a barcoding approach to efficiently track all variants in a single competition flask so that all variants experience identical conditions. As described in the Methods, the barcode strategy enabled us to track mutations across a large gene using a short sequencing readout. The barcode strategy also reduced the impact of misreads as they result in unused barcodes that were discarded from analyses.
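The library size quoted in the abstract can be checked with a short enumeration of the NNN codon. This is only a sketch: it uses the standard genetic code, and the figure of 708 mutated positions is an assumption inferred from the reported totals (44,604 = 708 × 63 and 14,160 = 708 × 20), not a number stated in the text. The names `nnn_variants` and `N_POSITIONS` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnn_variants(wt_codon):
    """Count codon changes and distinct non-wildtype outcomes at one NNN position.

    Outcomes include the stop codon class, so a sense codon yields
    19 amino acid changes plus stop.
    """
    wt_aa = CODON_TABLE[wt_codon]
    other_codons = [codon for codon in CODON_TABLE if codon != wt_codon]
    outcomes = {CODON_TABLE[codon] for codon in other_codons} - {wt_aa}
    return len(other_codons), len(outcomes)


# Assumption: 708 mutated positions, inferred from the reported totals.
N_POSITIONS = 708
codon_changes, aa_outcomes = nnn_variants("GCT")  # any wildtype codon gives the same counts
total_codon_changes = N_POSITIONS * codon_changes
total_aa_variants = N_POSITIONS * aa_outcomes
```

Under these assumptions the two totals match the 44,604 codon changes and 14,160 amino acid variants reported above, with each position contributing 63 alternative codons and 20 non-wildtype outcomes.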
We transformed the plasmid library of comprehensive Hsp90 point mutations into a conditional yeast strain where we could turn selection of the library on or off. We used a yeast Hsp90shutoff strain in which expression of the only genomic copy of Hsp90 is strictly regulated with a galactose-inducible promoter [32]. Yeast containing the mutant libraries were amplified under conditions that select for the plasmid, but not for the function of Hsp90 variants. We switched the yeast to dextrose media to shut off the expression of wildtype Hsp90 and then split the culture into six different environmental conditions. We extracted samples from each condition at multiple time points and used Illumina sequencing to estimate the frequency of each Hsp90 variant over time. We assessed the selection coefficient of each Hsp90 variant from the change in frequency relative to wildtype Hsp90 using a previously developed Bayesian MCMC method [33,34].
To analyze reproducibility of the growth competition, we performed a technical replicate under standard conditions. We used a batch of the same transformed cells that we had frozen and stored such that the repeat bulk competition experiments and sequencing were performed independently. Selection coefficients between replicates were strongly correlated (R² = 0.90), and indicated that we could clearly distinguish wildtype-like mutants from highly deleterious stop-like mutants (Figure 1B). The selection coefficients in this study also correlated strongly (R² = 0.87) with estimates of the Hsp90 N-domain in a previous study [35] (Figure S1A), indicating that biological replicates also show high reproducibility. Of note, variants with strongly deleterious effects exhibited the greatest variation between replicates, consistent with the noise inherent in estimating the frequency of rapidly depleting variants.
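The idea of reading a selection coefficient from the change in variant frequency relative to wildtype can be illustrated with a much simpler estimator than the Bayesian MCMC method cited above. The sketch below is a stand-in, not the authors' procedure: it fits an ordinary least-squares slope to the log ratio of mutant to wildtype read counts over time. The function name, the time unit (wildtype generations), and the example counts are all assumptions made for illustration.

```python
import math


def selection_coefficient(times, mutant_counts, wildtype_counts):
    """Least-squares slope of ln(mutant / wildtype) against time.

    A simplified stand-in for the Bayesian MCMC estimate used in the study.
    Assumes all counts are positive; zero counts would need a pseudocount.
    """
    log_ratios = [
        math.log(m / w) for m, w in zip(mutant_counts, wildtype_counts)
    ]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(log_ratios) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, log_ratios))
    return sxy / sxx


# Hypothetical example: a variant whose ratio to wildtype decays at 0.1 per generation.
example_times = [0, 2, 4, 6]
example_wildtype = [1000.0] * 4
example_mutant = [1000.0 * math.exp(-0.1 * t) for t in example_times]
example_s = selection_coefficient(example_times, example_mutant, example_wildtype)
```

A negative slope indicates a variant that is depleted relative to wildtype. Unlike the MCMC approach, this point estimate carries no uncertainty, which matters most for rapidly depleting variants with few reads.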
The large number of signaling pathways that depend on Hsp90 [29] and its strong sequence conservation suggest that Hsp90 may be sensitive to mutation. However, most variants of Hsp90 were experimentally tolerated in standard conditions (Figure 1C). All possible mutations were compatible with function at 425 positions. Only 18 positions had low mutational tolerance to the extent that 15 or more substitutions caused null-like growth defects (R32, E33, N37, D40, D79, G81, G94, I96, A97, S99, G118, G121, G123, Y125, F156, W300, and R380). All of these positions except for W300 are in contact with ATP or mediate ATP-dependent conformational changes in the N-domain of Hsp90. In fact, the average selection coefficient at different positions (a measure of mutational sensitivity) in standard growth conditions correlates (R² = 0.49) with distance from ATP (Figure S1B). While W300 does not contact ATP, it transmits information from client binding to long range conformational changes of Hsp90 that are driven by ATP hydrolysis [36]. Our results indicate that ATP binding and the conformational changes driven by ATP hydrolysis impose dominant physical constraints throughout the entirety of Hsp90 under standard laboratory conditions. At first sight, the observation that most mutations were compatible with robust growth in standard conditions is at odds with the fact that the Hsp90 sequence is strongly conserved across large evolutionary distances (Figure 1C). One potential reason for this discrepancy could be that the strength of purifying selection in large natural populations over long evolutionary time-scales is more stringent than can be measured in the laboratory. In other words, experimentally unmeasurable fitness defects could be subject to purifying selection in nature. In addition, the range of environmental conditions that yeast experience in natural settings may not be reflected by standard laboratory growth conditions.
To investigate the impact of environmental conditions on mutational effects in Hsp90, we measured the growth rate of Hsp90 variants under five additional stress conditions.

Impact of stress conditions on mutational sensitivity of Hsp90
We measured the fitness of Hsp90 variants in conditions of nitrogen depletion (ND) (0.0125% ammonium sulfate), hyper-osmotic shock (0.8 M NaCl), ethanol stress (7.5% ethanol), the sulfhydryl-oxidizing agent diamide (0.85 mM), and temperature shock (37°C). All of these stresses are known to elicit a common shared environmental stress response characterized by altered expression of ~900 genes as well as having specific responses unique to each stress [27]. Genes encoding heat shock proteins, including Hsp90, are transiently upregulated in all these stresses except elevated salinity [27,37].
One way to characterize stress conditions is to measure the extent to which they slow down growth. For our experiments, each of the environmental stresses was selected to partially decrease the growth rate. Consistent with this design, all stresses reduced the growth rate of the parental strain within a two-fold range, with depletion of nitrogen levels causing the smallest reduction in growth rate and diamide causing the greatest reduction (Figure 2A). To investigate how critical Hsp90 is for growth in each condition, we measured growth rates of yeast with either normal or more than 10-fold reduced [32] levels of Hsp90 protein (Figure 2A). Under standard conditions, the normal level of Hsp90 protein can be dramatically reduced without major impacts on growth rate, consistent with previous findings [32,38].
We anticipated that Hsp90 would be required at increased levels for robust experimental growth in diamide, nitrogen starvation, ethanol, and high temperature [27] based on the concept that cells increase expression level of genes in conditions where those gene products are needed at higher concentration. Consistent with this concept, reduced Hsp90 levels cause a marked decrease in growth rate at 37°C. However, Hsp90 protein levels had smaller impacts on growth rates under the other stress conditions, indicating that reliance on overall Hsp90 function does not increase dramatically in these conditions.
We quantified the growth rates of all Hsp90 single-mutant variants in each of the stress conditions as selection coefficients where 0 represents wildtype and -1 represents null alleles ( Figure S2A-F). We could clearly differentiate between the selection coefficients of wildtype synonyms and stop codons in all conditions ( Figure 2B), and we normalized to these classes of mutations to facilitate comparisons between each condition. Of note, the observed selection coefficients of wildtype synonyms varied more in conditions of high temperature and diamide stress compared to standard ( Figure S2G). We also note greater variation in the selection coefficients of barcodes for the same codon in the diamide and high temperature conditions ( Figure S2H). We conclude that diamide and elevated temperature provided greater noise in our selection coefficient measurements. To take into account differences in signal to noise for each condition, we either averaged over large numbers of mutations or categorized selection coefficients as wildtype like, strongly deleterious, intermediate, or beneficial based on the distribution of wildtype synonyms and stop codons in each condition (see Materials and Methods and Figure S2I).
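The normalization and binning described above can be sketched as follows. The rescaling to 0 for wildtype and -1 for null alleles follows the text; the category boundaries do not. The cutoff of two standard deviations around the wildtype-synonym and stop-codon distributions is an assumption chosen for illustration (the actual criteria are in the Materials and Methods), and the function names and example values are hypothetical.

```python
from statistics import mean, stdev


def normalize(raw, wt_synonyms, stops):
    """Rescale a raw coefficient so wildtype synonyms average 0 and stops average -1."""
    wt_mean = mean(wt_synonyms)
    stop_mean = mean(stops)
    return (raw - wt_mean) / (wt_mean - stop_mean)


def categorize(s, wt_norm, stop_norm, n_sd=2.0):
    """Bin a normalized coefficient using the two reference distributions.

    n_sd is an assumed cutoff, not the published one. If the reference
    distributions overlap, the wildtype-like check takes precedence.
    """
    wt_low = mean(wt_norm) - n_sd * stdev(wt_norm)
    wt_high = mean(wt_norm) + n_sd * stdev(wt_norm)
    stop_high = mean(stop_norm) + n_sd * stdev(stop_norm)
    if s > wt_high:
        return "beneficial"
    if s >= wt_low:
        return "wildtype-like"
    if s <= stop_high:
        return "strongly deleterious"
    return "intermediate"


# Hypothetical raw coefficients for one condition.
wt_raw = [-0.02, 0.0, 0.02]
stop_raw = [-0.52, -0.5, -0.48]
wt_norm = [normalize(x, wt_raw, stop_raw) for x in wt_raw]
stop_norm = [normalize(x, wt_raw, stop_raw) for x in stop_raw]
example_label = categorize(normalize(-0.25, wt_raw, stop_raw), wt_norm, stop_norm)
```

Because the cutoffs are derived from each condition's own reference distributions, noisier conditions such as diamide and elevated temperature get wider wildtype-like and null-like bins, which is the motivation given above for categorizing within each condition.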
We compared selection coefficients of each Hsp90 variant in each stress condition to standard conditions (Figure 2B&C). The stresses of 37°C and diamide tend to exaggerate the growth defects of many mutants compared to standard conditions, whereas high salt and ethanol tend to rescue growth defects (Figure 2B&C and S2J). According to the theory of metabolic flux [39,40], gene products that are rate limiting for growth will be subject to the strongest selection. Accordingly, the relationship between Hsp90 function and growth rate should largely determine the strength of selection acting on Hsp90 sequence. Conditions where Hsp90 function is more directly linked to growth rate would be more sensitive to Hsp90 mutations than conditions where Hsp90 function can be reduced without changing growth rates [32,41]. The average selection coefficients are more deleterious in diamide and temperature stress compared to standard conditions. These findings are consistent with heat and diamide stresses causing an increase in unfolded Hsp90 clients that is rate limiting for growth. In contrast, the average selection coefficients are less deleterious in ethanol and salt stress than in standard conditions, consistent with a decrease in the demand for Hsp90 function in these conditions. Of note, the biophysical impacts of these sets of conditions are distinct: increased temperature and diamide both generally act to increase protein unfolding, while salt and ethanol have more complex impacts that include inducing aggregation.

Structural analyses of environmental responsive positions
Altering environmental conditions had a pervasive influence on mutational effects along the sequence of Hsp90 ( Figure 3A & S3A). We structurally mapped the average selection coefficient of each position in each condition relative to standard conditions as a measure of the sensitivity to mutation of each position under each environmental stress ( Figure 3A). Many positions had mutational profiles that were responsive to a range of environments. Environmentally responsive positions with large changes in average selection coefficient in at least three conditions are highlighted on the Hsp90 structure in green in Figure 3B. Unlike the critical positions that cluster around the ATP binding site ( Figure 1C), the environmentally responsive positions are located throughout all domains of Hsp90. Similar to critical residues, environmentally responsive positions are more conserved in nature compared to other positions in Hsp90 ( Figure 3C), suggesting that the suite of experimental stress conditions tested captured aspects of natural selection pressures on Hsp90 sequence.
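The position classes used here can be written down as a small rule set, following the definitions given in the Figure 3 legend: critical positions are null-like in all environments, environmentally responsive positions shift from standard in three or more conditions by more than one standard deviation of wildtype synonyms, and tolerant positions shift in none. The sketch below makes two assumptions that the text does not specify: a single standard deviation value is used for all conditions, and "null-like" is approximated by an upper cutoff on the stop-codon distribution. The function name, the "other" label, and all numbers are hypothetical.

```python
def classify_position(mean_s, wt_sd, stop_cutoff, standard="standard"):
    """Classify one position from its mean selection coefficient per condition.

    mean_s      -- dict mapping condition name to mean selection coefficient
    wt_sd       -- standard deviation of wildtype synonyms (assumed shared)
    stop_cutoff -- assumed upper bound of the stop-codon distribution
    """
    if all(s <= stop_cutoff for s in mean_s.values()):
        return "critical"
    shifted = sum(
        1
        for condition, s in mean_s.items()
        if condition != standard and abs(s - mean_s[standard]) > wt_sd
    )
    if shifted >= 3:
        return "environmentally responsive"
    if shifted == 0:
        return "tolerant"
    return "other"  # shifted in one or two conditions only


# Hypothetical position: shifted at 37C, in diamide, and in NaCl.
example_position = {
    "standard": -0.05, "37C": -0.40, "diamide": -0.35,
    "NaCl": 0.10, "ethanol": -0.02, "nitrogen": -0.06,
}
example_class = classify_position(example_position, wt_sd=0.05, stop_cutoff=-0.9)
```

Note that the rule counts shifts in either direction, so a position that is more sensitive under heat and less sensitive under salt still counts as responsive.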
Hsp90 positions with environmentally responsive selection coefficients were enriched in binding contacts with clients, co-chaperones and intramolecular Hsp90 contacts involved in transient conformational changes (Figure 3D and S3B). About 65% of the environmentally responsive residues have been identified either structurally or genetically as interacting with binding partners [42][43][44][45][46][47][48][49][50][51][52][53][54][55], compared to about 15% of positions that were not responsive to stress conditions. While ATP binding and hydrolysis are the main structural determinants that constrain fitness in standard growth conditions, client and co-chaperone interactions have a larger impact on experimental fitness under stress conditions. Although the mean selection coefficients of mutations at the known client and co-chaperone binding sites are responsive to changes in environment, the direction of the shift of growth rate compared to standard conditions depends on the specific binding partner and environment (Figure 3E). This suggests that different environments place unique functional demands on Hsp90 that may be mediated by the relative affinities of different binding partners. Consistent with these observations, we hypothesize that Hsp90 client priority is determined by relative binding affinity and that Hsp90 mutations can reprioritize clients, which in turn impacts many signaling pathways.

Constraint of mutational sensitivity at high temperature
We find that different environmental conditions impose unique constraints on Hsp90, with elevated temperature placing the greatest purifying selection pressure on Hsp90. Of the 2504 variants of Hsp90 that are deleterious when grown at 37°C, 884 of them (~35%) are deleterious only in this condition ( Figure 3F). We defined mutants that confer temperature sensitive (ts) growth phenotypes on cells as variants with selection coefficients within the distribution of wildtype synonyms in standard conditions and that of stop codons at 37°C. Based on this definition, 675 Hsp90 amino acid changes (roughly 5% of possible changes) were found to be temperature sensitive ( Figure 4A). We sought to understand the physical underpinnings of this large set of Hsp90 ts mutations.
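The temperature-sensitive definition above combines two range checks, which can be sketched as a simple filter. The ranges standing in for "the distribution of wildtype synonyms in standard conditions" and "that of stop codons at 37°C" are hypothetical placeholders, as are the variant names and coefficients; the text does not say how the distribution bounds were set.

```python
def is_temperature_sensitive(s_standard, s_37c, wt_range_standard, stop_range_37c):
    """True if a variant is wildtype-like in standard conditions and null-like at 37C.

    Each range is an assumed (low, high) pair bounding the corresponding
    reference distribution.
    """
    wt_low, wt_high = wt_range_standard
    stop_low, stop_high = stop_range_37c
    return wt_low <= s_standard <= wt_high and stop_low <= s_37c <= stop_high


# Hypothetical variants: (selection coefficient in standard, selection coefficient at 37C).
example_variants = {
    "variant_a": (0.01, -0.98),   # wildtype-like, then null-like: ts
    "variant_b": (-0.02, -0.10),  # mild defect at 37C only: not ts
    "variant_c": (-0.60, -1.02),  # already deleterious in standard: not ts
}
ts_variants = sorted(
    name
    for name, (s_std, s_hot) in example_variants.items()
    if is_temperature_sensitive(s_std, s_hot, (-0.08, 0.08), (-1.10, -0.90))
)
```

The definition is deliberately strict: variants with only a partial defect at 37°C, or with any measurable defect in standard conditions, are excluded from the ts class.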
We examined Hsp90 ts mutations for structural and physical patterns. We found that ts mutations tended to concentrate at hotspots ( Figure 4B). These hotspots were spread across all three domains of Hsp90 ( Figure 4C). The largest cluster of hotspots occurred in the C domain of Hsp90. The C domain forms a constitutive homodimer that is critical for function [56]. Of note, homo-oligomerization domains may have a larger ts potential because all subunits contribute to folding and dimerization essentially multiplying the impacts of mutations [57]. To explore the physical underpinnings of ts mutations we examined if they were buried in the structure or surface exposed. Mutations at buried residues tend to have a larger impact on protein folding energy compared to surface residues [58]. Consistent with the idea that many ts mutations may disrupt protein folding at elevated temperature, substitutions that confer a ts phenotype are enriched in buried residues ( Figure 4D). Also consistent with this idea, ts mutations tend to have negative Blosum scores ( Figure 4E), a hallmark of disruptive amino acid changes.
Because growth at elevated temperatures requires higher levels of Hsp90 protein [59], some ts mutations may be due to a reduced function that is enough for growth at standard temperature, but is insufficient at 37°C [54]. We reasoned that we could distinguish these mutants by examining how growth rate depended on the expression levels of Hsp90. We expect that destabilizing mutants that cause Hsp90 to unfold at elevated temperature would not support efficient growth at 37°C independent of expression levels. In contrast, we expect mutants that reduce Hsp90 function to exhibit an expression-dependent growth defect at 37°C. We tested a panel of ts mutations identified in the bulk competitions at high and low expression levels (Figure 4F). The dependence of growth rate at 37°C on expression level varied for different Hsp90 ts variants. The I64D, I66E and L499R Hsp90 mutants have no activity at 37°C irrespective of expression levels. These disruptive substitutions at buried positions likely destabilize the structure of Hsp90. In contrast, increasing the Hsp90 expression levels at least partially rescued the growth defect for five ts variants (L50D, K102A, D180L, K398L, K594I), indicating that these variants do not provide enough Hsp90 function for robust growth at elevated temperature. All five of these expression-dependent ts variants were located at surface positions, indicating that the location of ts mutations can delineate different mechanistic classes.

Hsp90 potential for adaptation to environmental stress
Numerous Hsp90 variants provided a growth benefit compared to the wildtype sequence in stress conditions. The largest number of beneficial variants in Hsp90 occurred in high temperature and diamide conditions ( Figure 5A). Multiple lines of evidence indicate that these mutants are truly beneficial variants and not simply measurement noise. First, the beneficial amino acids generally exhibited consistent selection coefficients among synonymous variants ( Figure S5A). Second, adaptive mutants in diamide and high temperature cluster at certain positions in a significant manner (see below). Finally, we confirmed the increased growth rate at elevated temperature of a panel of variants analyzed in isolation ( Figure S5B). Beneficial mutations in elevated temperature and diamide often clustered at specific positions in Hsp90 ( Figure 5B), indicating that the wildtype amino acids at these positions are far from optimum for growth in these conditions. In contrast, the apparent beneficial mutations in other conditions did not tend to cluster at specific positions ( Figure S5C).
To obtain a more general picture of the potential for adaptation derived from the full fitness distributions, we used Fisher's Geometric model (FGM) [60]. According to FGM, populations evolve in an n-dimensional phenotypic space, through random single step mutations, and any such mutation that brings the population closer to the optimum is considered beneficial. An intuitive hypothesis derived from FGM is that the potential for adaptation in a given environment (that is the availability of beneficial mutations) depends on the distance to the optimum. In order to estimate the distance to the optimum d, we adopted the approach by Martin and Lenormand and fitted a displaced gamma distribution to the neutral and beneficial mutations for each environment [61]. We observed that the yeast populations were furthest from the optimum in elevated temperature and diamide (d=0.072 and 0.05, respectively), followed by nitrogen deprivation (d=0.023), high salinity and ethanol (d=0.021) and standard (d=0.014). This suggests that elevated temperature and diamide have the largest potential for adaptation and is consistent with the observation of the largest proportions of beneficial mutations in these environments. Interestingly, previous results from a 9-amino-acid region in
In diamide and elevated temperature, the clustered beneficial positions were almost entirely located in the ATP-binding domain and the middle domain (Figure 5C), both of which make extensive contacts with clients and co-chaperones [42][43][44][45][46]55]. Beneficial mutations in elevated temperature and diamide conditions were preferentially located on the surface of Hsp90 (Figure 5D) at positions accessible to binding partners. Analyses of available Hsp90 complexes indicate that beneficial positions were disproportionately located at known interfaces with cochaperones and clients (Figure 5E).
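To make the displaced gamma fit used for estimating d more concrete, one common way to write such a distribution is given below. This is a sketch and not necessarily the exact parameterization of [61]: the shape k and scale θ are symbols introduced here for illustration, and d is treated as the displacement, that is, the largest beneficial effect available.

```latex
s = d - x, \qquad x \sim \mathrm{Gamma}(k, \theta),
\qquad
f(s) = \frac{(d - s)^{k-1}\, e^{-(d - s)/\theta}}{\Gamma(k)\,\theta^{k}},
\quad s < d .
```

Under this form the fraction of beneficial mutations is P(s > 0) = P(x < d), which grows with d for fixed k and θ. That is in line with the larger proportions of beneficial variants observed in the conditions with the largest fitted d.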
Clustered beneficial mutations are consistent with disruptive mechanisms because different amino acid changes can lead to disruptions, whereas a gain of function is usually mediated by specific amino acid changes. Amino acids that are beneficial in diamide and elevated temperature tend to exhibit deleterious effects in standard conditions ( Figure 5F), consistent with a cost of adaptation. We conjecture that the clustered beneficial mutations are at positions that mediate the binding affinity of subsets of clients and co-chaperones and that disruptive mutations at these positions can lead to re-prioritization of multiple clients. The priority or efficiency of Hsp90 for sets of clients can in turn impact most aspects of physiology because Hsp90 clients include hundreds of kinases that influence virtually every aspect of cell biology.
In the first ten amino acids of Hsp90, we noted a large variation in the selection coefficients of synonymous mutations at elevated temperature (Figure S5D). These synonymous mutations were only strongly beneficial at high temperature where Hsp90 protein levels are limiting for growth. Analysis of individual clones confirmed that synonymous mutations at the beginning of Hsp90 that were beneficial at high temperature were expressed at higher levels in our plasmid system (Figure S5E, S5F). These results are consistent with a large body of research showing that mRNA structure near the beginning of coding regions often impacts translation efficiency [63][64][65], and that adaptations can be mediated by changes in expression levels [66]. Outside of the first ten amino acids, we did not observe large variation in selection coefficients of synonymous mutations.

Natural selection favors Hsp90 variants that are robust to environment
We next examined how experimental protein fitness maps compared with the diversity of Hsp90 sequences in current eukaryotes. We analyzed Hsp90 diversity in a set of 267 sequences from organisms that broadly span across eukaryotes. We identified 1750 amino acid differences in total that were located at 499 positions in Hsp90. We examined the experimental growth effects of the subset of amino acids that were observed in nature. While the overall distribution of selection coefficients in all conditions was bimodal with peaks around neutral (s=0) and null (s=-1), the natural amino acids were unimodal with a peak centered near neutral ( Figure 6A). The vast majority of natural amino acids had wildtype-like fitness in all conditions studied here ( Figure 6B&C). Whereas naturally occurring amino acids in Hsp90 were rarely deleterious in any experimental condition, they were similarly likely to provide a growth benefit compared to all possible amino acids (5%). This observation indicates that condition-dependent fitness benefits are not a major determinant of natural variation in Hsp90 sequences. Instead, our results indicate that natural selection has favored Hsp90 substitutions that are robust to multiple stressful conditions ( Figure 6D).
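The comparison with natural diversity rests on two per-position quantities: the set of amino acids seen in nature that differ from the reference sequence, and the amino acid entropy used as a measure of evolutionary variation in the Figure 3 legend. A minimal sketch of both follows, assuming a gap-free reference, an alignment given as equal-length strings with "-" for gaps, and entropy in bits; the log base and gap handling are assumptions, and the toy alignment is invented for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of the amino acids in one alignment column, ignoring gaps."""
    residues = [aa for aa in column if aa != "-"]
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def natural_differences(alignment, reference):
    """Map each position (0-based) to the natural amino acids that differ from the reference."""
    differences = {}
    for position, ref_aa in enumerate(reference):
        column = [sequence[position] for sequence in alignment]
        variants = {aa for aa in column if aa not in ("-", ref_aa)}
        if variants:
            differences[position] = variants
    return differences


# Toy alignment (hypothetical): three columns, four sequences.
toy_reference = "MAS"
toy_alignment = ["MAS", "MGS", "MAT", "MGT"]
toy_differences = natural_differences(toy_alignment, toy_reference)
toy_entropies = [
    column_entropy([seq[i] for seq in toy_alignment])
    for i in range(len(toy_reference))
]
```

Summing the sizes of the per-position sets gives the total number of natural amino acid differences, and the number of keys gives the number of variable positions, which are the two counts reported above for the 267-sequence set.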
Epistasis may provide a compelling explanation for the naturally occurring amino acids that we observed with deleterious selection coefficients. Analyses of Hsp90 mutations in the context of likely ancestral states have demonstrated a few instances of historical substitutions with fitness effects that depend strongly on the Hsp90 sequence background [67]. Indeed, many of the natural amino acids previously identified with strong epistasis (E7A, V23F, T13N) are in the small set of natural amino acids with deleterious effects in at least one condition. Further analyses of natural variants under diverse environmental conditions will likely provide insights into historical epistasis and will be the focus of future research.

DISCUSSION
In this study, we analyzed the protein-wide distribution of fitness effects of Hsp90 across standard conditions and five stress conditions. Fitness effects under standard conditions indicated that critical functional residues were largely explained by proximity to ATP. In contrast, positions spread throughout the structure of Hsp90 exhibited selection coefficients that were responsive to different conditions. While the main functional constraints were apparent in standard conditions, stress conditions revealed many additional functional constraints as well as beneficial mutations.
We observed distinct structural trends for mutations that provide environment-dependent costs and benefits. Many mutations in Hsp90 caused growth defects at elevated temperature, where Hsp90 function is limiting for growth. These temperature-sensitive mutations tended to be buried and in the homodimerization domain, consistent with an increased requirement for folding stability at elevated temperatures. In contrast, beneficial mutations tended to be on the surface of Hsp90 and at contact sites with binding partners, suggesting that change-of-function mutations may be predominantly governed by alterations to binding interactions.
Importantly, our results demonstrate that while mutations to Hsp90 can provide a growth advantage in specific environmental conditions, naturally occurring amino acids in Hsp90 tend to support robust growth over multiple stress conditions. The finding of beneficial mutations in Hsp90 in specific conditions suggests that similar long-term stresses in nature can lead to positive selection on Hsp90. Consistent with previous work [26], we found that experimentally beneficial mutations tended to have a fitness cost in alternate conditions (Figure 5F). This indicates that natural environments which fluctuate among different stresses would reduce or eliminate positive selection on Hsp90. Therefore, our results suggest that natural selection on Hsp90 sequence has predominantly been governed by strong purifying selection integrated over multiple stressful conditions. Taken together, these results support the hypothesis that natural populations might experience a so-called "micro-evolutionary fitness seascape" [68], in which rapidly fluctuating environments result in a distribution of quasi-neutral substitutions over evolutionary time scales.
[Figure caption; beginning not recovered] Critical residues have mean selection coefficients that are null-like (within the distribution of stop codons) in all environments. Environmentally responsive positions had mean selection coefficients that differed from standard in three or more environmental conditions by an amount greater than one standard deviation of wildtype synonyms. Tolerant residues were not shifted more than this cutoff in any environment. C. For different classes of positions, evolutionary variation was calculated as amino acid entropy at each position in Hsp90 sequences from diverse eukaryotes. D. Fraction of different classes of mutations located at contact sites with binding partners. E. The average selection coefficient in each environment relative to standard at all the Hsp90 positions at each stated interface. F. Venn diagram of deleterious mutations in different environmental conditions [69]. The total number of deleterious mutants in each condition is stated in parentheses.

Generating mutant libraries
A library of Hsp90 genes was saturated with single point mutations using oligos containing NNN codons as previously described [31]. The resulting library was pooled into 12 separate 60 amino acid long sub-libraries along the Hsp90 sequence (amino acids 1-60, 61-120 etc.) and combined via Gibson Assembly (NEB) with a linearized p414ADHΔter Hsp90 destination vector. To simplify sequencing steps during bulk competition, each variant of the library was tagged with a unique barcode. For each 60 amino acid sub-library, a pool of DNA constructs containing a randomized 18-bp barcode sequence (N18) was cloned 200 nt downstream from the Hsp90 stop codon via restriction digestion, ligation, and transformation into chemically competent E. coli with the goal of each mutant being represented by 10-20 unique barcodes.
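As a consistency check on library size, the totals quoted in the abstract (44,604 single codon changes and 14,160 amino acid variants) match 708 mutated positions, with 63 non-wildtype codons and 20 non-wildtype outcomes (19 amino acids plus stop) per position. The sketch below reproduces that arithmetic and the 60-residue sub-library windows. The position count of 708 is inferred from the totals rather than stated in the text, and the function names are illustrative.

```python
from itertools import product

BASES = "ACGT"
# An NNN oligo can encode any of the 64 codons at the targeted position.
ALL_CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def library_size(n_positions):
    """Return (codon variants, amino acid variants) for a saturated library.

    Assumes each position has one wildtype codon (63 alternatives) and one
    wildtype amino acid (19 alternative amino acids plus a stop codon).
    """
    codon_variants = n_positions * (len(ALL_CODONS) - 1)
    protein_variants = n_positions * 20
    return codon_variants, protein_variants


def sublibrary_index(position, window=60):
    """Zero-based index of the sub-library holding a 1-based residue position."""
    return (position - 1) // window


# 708 positions is an inferred value, not one given in the Methods.
print(library_size(708))
print(sublibrary_index(60), sublibrary_index(61))
```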

Barcode association of library variants
We added barcodes and associated them with Hsp90 variants essentially as previously described [67]. To associate barcodes with Hsp90 variants, we performed paired-end sequencing of each 60 amino acid sub-library using a primer that reads the N18 barcode in one read and a primer unique to each sub-library that anneals upstream of the region containing mutations. To facilitate efficient Illumina sequencing, we generated PCR products that were less than 1 kb in length. We created shorter PCR products by generating plasmids with regions removed between the randomized regions and the barcode. To remove regions from the plasmids, we performed restriction digest with two unique enzymes, followed by blunt ending with T4 DNA polymerase (NEB) and plasmid ligation at a low concentration (3 ng/μL) to favor circularization over bimolecular ligations. The resulting DNA was relinearized by restriction digest and amplified with 11 cycles of PCR to generate products for Illumina sequencing. The resulting PCR products were sequenced using an Illumina MiSeq instrument with asymmetric reads of 50 bases for Read1 (barcode) and 250 bases for Read2 (Hsp90 sequence). After filtering low-quality reads (Phred scores <10), the data were organized by barcode sequence. For each barcode that was read more than three times, we generated a consensus of the Hsp90 sequence that we compared to wildtype to call mutations.
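The barcode association step can be sketched as follows. The text does not state how the consensus was built, so a per-position majority vote is assumed here; the function names and the input layout (pairs of barcode and Hsp90 read, already quality filtered) are likewise illustrative.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(read_pairs, min_reads=4):
    """Group Hsp90 reads by barcode and build a consensus for each barcode.

    read_pairs: iterable of (barcode, hsp90_read) strings.
    min_reads=4 mirrors "read more than three times".
    Assumption: consensus is a per-position majority vote.
    """
    grouped = defaultdict(list)
    for barcode, seq in read_pairs:
        grouped[barcode].append(seq)

    consensus = {}
    for barcode, seqs in grouped.items():
        if len(seqs) < min_reads:
            continue  # too few observations to trust this barcode
        length = min(len(s) for s in seqs)
        consensus[barcode] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


def call_mutations(consensus_seq, wildtype_seq):
    """List (0-based index, wildtype base, observed base) for each mismatch."""
    return [
        (i, w, c)
        for i, (w, c) in enumerate(zip(wildtype_seq, consensus_seq))
        if w != c
    ]
```

Barcodes seen three times or fewer are dropped rather than assigned, which trades library coverage for a lower rate of miscalled variants.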

Bulk Growth Competitions
Equimolar quantities of each sub-library were mixed to form a pool of DNA containing the entire Hsp90 library with each codon variant present at similar concentration. The plasmid library was transformed using the lithium acetate procedure into the DBY288 Hsp90 shutoff strain essentially as previously described [32]. Sufficient transformation reactions were performed to attain ~5 million independent yeast transformants, representing a 5-fold sampling for the average barcode and 100-fold sampling for the average codon variant. Following 12 hours of recovery in SRGal (synthetic 1% raffinose and 1% galactose) media, transformed cells were washed five times in SRGal-W (SRGal lacking tryptophan) media to remove extracellular DNA, and grown in SRGal-W media at 30°C for 48 h with repeated dilution to maintain the cells in log phase of growth. This yeast library was supplemented with 20% glycerol, aliquoted, and slowly frozen by placing microfuge tubes containing cells in a room-temperature plastic tube holder in a -80°C freezer.
For each competition experiment, an aliquot of the frozen yeast library cells was thawed at 37°C. Viability of the cells was assessed before and after freezing and was determined to be greater than 90% with this slow freeze, quick thaw procedure. Thawed cells were amplified in SRGal-W for 24 hours, and then shifted to shutoff conditions by centrifugation, washing, and resuspension in 300 mL of synthetic dextrose lacking tryptophan (SD-W) for 12 hours at 30°C. At this point, cells were split and transferred to different conditions including SD-W (standard: 2% dextrose, 0.5% ammonium sulphate, 30°C), ND (nitrogen depletion: 2% dextrose, 0.0125% ammonium sulphate, 30°C), NaCl (2% dextrose, 0.5% ammonium sulphate, 0.8 M NaCl, 30°C), EtOH (2% dextrose, 0.5% ammonium sulphate, 7.5% ethanol), diamide (2% dextrose, 0.5% ammonium sulphate, 0.85 mM diamide, 30°C), or 37°C (2% dextrose, 0.5% ammonium sulphate, 37°C). We collected samples of ~10^8 cells at eight time points over a period of 36 hours and stored them at -80°C. Cultures were maintained in log phase by regular dilution with fresh media, maintaining a population size of 10^9 or greater throughout the bulk competition. Bulk competitions in the standard condition were conducted in technical duplicate from the frozen yeast library.

DNA Preparation and Sequencing
We isolated plasmid DNA from each bulk competition time point as described [32]. Purified plasmid was linearized with AscI. Barcodes were amplified by 19 cycles of PCR using Phusion polymerase (NEB) and primers that add Illumina adapter sequences and an 8 bp identifier sequence used to distinguish libraries and time points. The identifier sequence was located at positions 91-98 relative to the Illumina primer and the barcode was located at positions 1-18. PCR products were purified two times over silica columns (Zymo Research) and quantified using the KAPA SYBR FAST qPCR Master Mix (Kapa Biosystems) on a Bio-Rad CFX machine. Samples were pooled and sequenced on an Illumina NextSeq instrument in single-end 100 bp mode.

Analysis of Bulk Competition Sequencing Data
Illumina sequence reads were filtered for Phred scores >20 and strict matching of the sequence to the expected template and identifier sequence. Reads that passed these filters were parsed based on the identifier sequence. For each condition/time-point identifier, each unique N18 read was counted. The unique N18 count file was then used to identify the frequency of each mutant using the variant-barcode association table. To generate a cumulative count for each codon and amino acid variant in the library, the counts of each associated barcode were summed.
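A minimal sketch of this counting step is given below, using the read layout described under DNA Preparation and Sequencing (barcode at positions 1-18, identifier at positions 91-98). Several details are assumptions rather than the pipeline actually used: the Phred filter is applied to the minimum base quality of each read, the strict template match is omitted, and all function and variable names are hypothetical.

```python
from collections import Counter, defaultdict

BARCODE = slice(0, 18)       # read positions 1-18 (1-based, inclusive)
IDENTIFIER = slice(90, 98)   # read positions 91-98 (1-based, inclusive)


def count_barcodes(reads, known_identifiers, min_phred=20):
    """Count each N18 barcode per condition/time-point identifier.

    reads: iterable of (sequence, phred_scores) with one score per base.
    Assumption: a read passes only if every base has Phred > min_phred.
    """
    counts = defaultdict(Counter)
    for seq, quals in reads:
        if len(seq) < 98 or min(quals) <= min_phred:
            continue
        identifier = seq[IDENTIFIER]
        if identifier not in known_identifiers:
            continue  # identifier must match an expected sample exactly
        counts[identifier][seq[BARCODE]] += 1
    return counts


def sum_by_variant(barcode_counts, barcode_to_variant):
    """Sum barcode counts into one cumulative count per variant.

    barcode_to_variant is the variant-barcode association table; barcodes
    absent from the table are ignored.
    """
    totals = Counter()
    for barcode, n in barcode_counts.items():
        variant = barcode_to_variant.get(barcode)
        if variant is not None:
            totals[variant] += n
    return totals
```

Calling `sum_by_variant` once with a codon-level table and once with an amino-acid-level table would give the two cumulative count sets described above.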

Determination of Selection Coefficient
The frequency of each variant in the library relative to wildtype synonyms was determined at each time point and the slope of the log of this ratio versus generation time was used to calculate the raw selection coefficient. This procedure sets the selection coefficient for the average wildtype synonym to neutral (s=0). Selection coefficients (s) were scaled so that the average stop codon in each environmental condition represented a null allele (s=-1). For the second replicate in standard conditions, we noted a small fitness defect (s≈-0.05) for wildtype synonyms at positions 679-709 relative to other positions. We do not understand the source of this behavior, and chose to normalize to wildtype synonyms from 1-678 for this condition and to exclude positions 679-709 from analyses that include the second replicate of standard conditions. We did not observe this behavior in any other condition.
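The calculation can be sketched as a least-squares slope followed by a rescaling. The natural logarithm is assumed (the text does not state the base), zero counts are not handled, and the function names are illustrative.

```python
import math


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def raw_selection_coefficient(variant_counts, wt_synonym_counts, generations):
    """Slope of ln(variant / wildtype synonyms) versus generations.

    Using the ratio to wildtype synonyms sets the average synonym to s = 0.
    Assumption: natural log; counts must be non-zero at every time point.
    """
    log_ratios = [
        math.log(v / w) for v, w in zip(variant_counts, wt_synonym_counts)
    ]
    return slope(generations, log_ratios)


def scale_to_null(raw, stop_variants):
    """Rescale raw coefficients so the average stop codon has s = -1."""
    mean_stop = sum(raw[v] for v in stop_variants) / len(stop_variants)
    return {variant: -s / mean_stop for variant, s in raw.items()}
```

Because the rescaling is a division by the mean raw coefficient of stop codons, it is applied separately within each environmental condition, leaving neutral variants at s = 0.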

Yeast growth analysis
Individual Hsp90 variants were generated and analyzed essentially as previously described [32]. Variants were generated by site-directed mutagenesis and transformed into DBY288 cells. Selected transformed colonies were grown in liquid SRGal-W media to mid-log phase at 30°C, washed three times and grown in shutoff media (SD-W) at either 30°C or 37°C. After sufficient time to stall the growth of control cells lacking a rescue copy of Hsp90 (~16 hours), cell density was monitored based on absorbance at 600 nm over time and fit to an exponential growth curve to quantify growth rate.
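One simple way to obtain a growth rate from such a time course is a linear fit to the log of absorbance, sketched below. This log-linear fit is an assumption; the text does not say whether a log-linear or a nonlinear exponential fit was used, and the function names are illustrative.

```python
import math


def growth_rate(times_h, od600):
    """Estimate the exponential growth rate (per hour) from OD600 readings.

    Fits ln(OD600) = ln(OD0) + rate * t by ordinary least squares.
    Assumption: all readings are positive and within log phase.
    """
    logs = [math.log(od) for od in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in times_h)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_h, logs))
    return stl / stt


def doubling_time(rate):
    """Doubling time in hours for a given exponential growth rate."""
    return math.log(2) / rate
```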

Natural variation in Hsp90 sequence
We analyzed sequence variation in a previously described alignment of Hsp90 protein sequences from 261 eukaryotic species that broadly span a billion years of evolutionary distance [67].
[Figure panel titles: Step 1: Association of ORF variants with randomized barcodes; Step 3: Deep sequencing to measure change in frequency of variants]