Toll‐like receptor variation in the bottlenecked population of the endangered Seychelles warbler

In small populations, drift results in a loss of genetic variation, which reduces adaptive evolutionary potential. Furthermore, the probability of consanguineous mating increases which may result in inbreeding depression. Under certain circumstances, balancing selection can counteract drift and maintain variation at key loci. Identifying such loci is important from a conservation perspective and may provide insight into how different evolutionary forces interact in small populations. Toll‐like receptor (TLR) genes play a pivotal role in vertebrate innate immune defence by recognizing invading pathogens. We characterize TLR variation in the Seychelles warbler (SW) Acrocephalus sechellensis, an endangered passerine that recently suffered a population bottleneck. Five of seven TLR loci were polymorphic, with one locus (TLR15) containing four functional variants and showing an excess of heterozygotes. Haplotype‐level tests failed to detect selection at these loci, but site‐specific tests detected signatures of positive selection within TLR3 and TLR15. After characterizing variation (excluding TLR15) in 5–6 other Acrocephalus species, we found that TLR variation was positively correlated with population size across species and followed the pattern observed at neutral microsatellite loci. The depauperate TLR variation observed suggests that even at important immunity‐related loci, balancing selection may only attenuate the overriding effects of drift. However, in the SW, TLR15 appears to be an outlier and warrants further investigation. The low levels of TLR variation may be disadvantageous for the long‐term viability of the SW and conservation measures that maximize the retention of the variation should be considered.


Introduction
Analysis of genetic variation within and among populations can provide important insight into the evolutionary and demographic history of a species (Garrigan & Hedrick, 2003;Piertney & Webster, 2010;Sutton et al., 2011). Levels of variation also provide an indication of a population's adaptive potential and viability (Frankham et al., 1999). In demographically stable populations, genetic variation will reach a mutation-selection-drift balance given sufficient evolutionary time (Kimura & Ohta, 1969). However, balancing selection is said to have occurred when genetic variation at a locus is maintained at a higher level than expected based on the amount of drift affecting the population (Takahata, 1990;Takahata & Nei, 1990;Takahata, Satta & Klein, 1992). Identifying when and where balancing selection occurs can provide insight into the function and importance of specific loci and help us to understand the evolutionary pressures affecting a population (Oleksiak, Churchill & Crawford, 2002;Mitchell-Olds & Schmitt, 2006). Understanding the potential for critical genetic variation to be maintained within small, isolated populations where drift is strong (Lacy, 1987;Franklin & Frankham, 1998;van Oosterhout et al., 2006) is important from a conservation perspective (Young, Boyle & Brown, 1996;Tompkins, 2007;Willi et al., 2007;Grueber, Wallis & Jamieson, 2013). Studies that determine the level of functional variation at key loci in populations can elucidate where within and among individual polymorphism is important (such as those involved in the immune response). It can also identify where lack of polymorphism may underlie potential vulnerability to future stresses, such as hindering adaptation to novel pathogen infections (Frankel, 1974;Hedrick, 2001). The results from these studies can then feed into policy-making decisions when assessing the long-term viability of fragmented and bottlenecked populations. For example, where evidence of unusually low levels of variation at immune genes is found, genetic augmentation from other populations may be considered (e.g. Franklin et al., 2009).
Pathogen-mediated selection (PMS) has been proposed to be a major driver of balancing selection given the strong co-evolutionary relationship between pathogens and their hosts (Jeffery & Bangham, 2000;Bernatchez & Landry, 2003). This idea is well-supported by various studies that have identified elevated levels of variation at specific immune genes within a range of taxa (Hoelzel et al., 1993;Luikart et al., 1998;Frankham et al., 1999;Hansson & Richardson, 2005). Three main non-mutually exclusive mechanisms of PMS (i.e. heterozygote advantage, rare allele advantage and fluctuating selection) (Doherty & Zinkernagel, 1975;Hill et al., 1991;Slade & McCallum, 1992 respectively) have been put forward to explain how genetic variation may be maintained at these immune genes (for reviews, see Potts & Slev, 1995;Hedrick, 2002;. Other forces such as sexual selection (Fisher, 1915;Andersson, 1994) and selection against any mutational load associated with highly polymorphic genes (van Oosterhout, 2009) can act on top of this, and have the potential to interact and exacerbate the overall effect on genetic variation (see Brouwer et al., 2010;Netea, Wijmenga & O'Neill, 2012;Ejsmond et al., 2014).
Many studies which use a candidate gene approach to investigate balancing selection have focused on major histocompatibility complex (MHC) genes (for reviews, see Piertney & Oliver, 2006;, both due to their central role in the acquired immune response and because of the exceptional levels of polymorphism observed at these loci (Hedrick, 1994;Meyer & Thomson, 2001). However, population genetic inference of selection is difficult in this multigene family because of the complications caused by various phenomena including frequent gene duplication (Jeffery & Bangham, 2000;Hess & Edwards, 2002), gene conversion (Ohta, 1995;Spurgin et al., 2011), epistasis, strong linkage and high mutational load (van Oosterhout, 2009). In contrast, studies on variation at innate immune system genes within wild populations are relatively scarce (Acevedo-Whitehouse & Cunningham, 2006), yet these genes are thought to have a simple genomic architecture and evolution. This reduces the confounding effects of the other factors outlined above. Furthermore, innate immune genes play a pivotal role as the first line of defence in vertebrate immunity and there is evidence that they can be under balancing selection (Schlenke & Begun, 2003;Ferrer-admetlla et al., 2008;Mukherjee et al., 2009).
Toll-like receptors (TLRs) are membrane-bound sensors of the innate immune system that recognize distinctive molecular features of invading microbes (for review, see Jin & Lee, 2008). They bind to pathogen-associated molecular patterns (PAMPs), thus triggering an intracellular signal cascade to activate an appropriate immune response (Takeda & Akira, 2005). TLRs are divided into six families based on the types of PAMPs they bind to: lipoproteins, diacylated lipoproteins, double-stranded RNA, lipopolysaccharides, flagellin and cyclic compounds like nucleic acids or other DNA motifs (Roach et al., 2005). In vertebrates, TLRs link the innate and adaptive immune system, working with both modes of immune defence (Schnare et al., 2001;Roach et al., 2005). Recent studies show that polymorphisms at TLR loci can have a direct effect on resistance/susceptibility to pathogen infection across a range of vertebrate groups (see Creagh & O'Neill, 2006;Vinkler et al., 2009;Franklin et al., 2011). Consequently, PMS is thought to maintain variation at these genes and positive selection at TLR genes has been shown in fish (Palti, 2011), mammals (Nakajima et al., 2008;Areal, Abrantes & Esteves, 2011;Tschirren et al., 2013) and birds (Downing et al., 2010;Alcaide & Edwards, 2011;Grueber, Wallis & Jamieson, 2014;Grueber et al., 2015).
Wild birds have been the focus of many evolutionary and ecological studies (for reviews, see Zelano & Edwards, 2002;Kaiser, 2007Kaiser, , 2010Fuller et al., 2012). The samples and data from such studies now provide excellent systems in which to investigate the causes and consequences of innate immune gene variation under natural conditions. A study of variation at avian TLR genes across outbred passerines found evidence that balancing selection was responsible for maintaining variation at these loci (Alcaide & Edwards, 2011). Another study on a bottlenecked population of a single species showed that TLR variation was elevated compared to overall genetic diversity . However, other studies have emphasized a more dominant role of drift (Grueber et al., 2014;Gonzalez-Quevedo et al., 2015).
Here, we characterize variation at seven TLR genes in the bottlenecked population of the Seychelles warbler (SW), Acrocephalus sechellensis and use traditional population genetic statistical tests to search for signatures of selection within whole sequences. Furthermore, we compare this variation to patterns of neutral variation within the SW and a close congeneric species that is widespread across the Palearctic, the great reed warbler (GRW) A. arundinaceus. We then characterize variation at the TLR loci in 2-8 individuals in each of 5-6 other Acrocephalus warbler species and test for signatures of selection at the haplotype level and at the amino acid site level for each species. We use these data to investigate whether TLR variation exists despite the severe bottleneck that the SW population endured when reduced to c. 26 individuals in the last century (Collar & Stuart, 1985). We assess whether there is any evidence that selection has influenced TLR variation in the SW or across the Acrocephalus genus, and include a comparison of TLR variation in relation to population size across all of the Acrocephalus populations characterized. We then discuss how variation at these critical loci may be important for the longterm viability of the SW by maximizing its adaptive potential and long-term persistence.

Study species and sampling
The SW is a small (c. 12-15 g) insectivorous passerine bird endemic to the Seychelles islands (Safford & Hawkins, 2013). Due to anthropogenic effects, by the 1960s, the SW was reduced to just one population of c. 26 individuals remaining on the island of Cousin (Collar & Stuart, 1985). As a result, its effective population size was reduced from c. 6900 in the early 1800s to <50 in the contemporary population . However, with effective conservation management, the population recovered to its carrying capacity of c. 320 adults on Cousin by 1982 (Komdeur, 1992) and has since remained relatively stable (Brouwer et al., 2009;Wright et al., 2014). It has been progressively downlisted in its IUCN red list status from critically endangered to near-threatened (IUCN, 2015) with an estimated global population of c. 3000 mature individuals spread across five islands, including four newly established populations . The SW has since proved to be an excellent study species for evolutionary, ecological and conservation questions (Komdeur, 1992;Richardson, Burke & Komdeur, 2003;van de Crommenacker et al., 2011;Barrett et al., 2013). Since 1997, >96% of the Cousin population have been caught and each bird rung with a unique combination of colour rings and a metal British Trust for Ornithology ring (Richardson, Burke & Komdeur, 2002). Birds are aged at first catch according to their eye colour and behaviour: adult birds are >10 months old with distinctive reddish-brown eyes compared to the light brown eyes of a sub-adult aged 5-10 months. Birds <5 months old have grey eyes. Blood samples (c. 25 lL) are taken via brachial venipuncture, placed in absolute ethanol in a 2 mL screw-top Eppendorf tube and stored at 4°C. All of these protocols were performed in accordance with relevant guidelines and regulations and approved by the Seychelles Bureau for Standards, the Seychelles Department of the Environment, the non-Governmental organization managing Cousin island (Nature Seychelles) and the University of East Anglia's Ethics committee.

TLR variation in the SW
Samples were from putatively unrelated adult birds (>1-yearold) chosen at random from the contemporary 2000-2008 population. Genomic DNA was extracted using a salt-extraction method (Richardson et al., 2001). The TLR loci were selected based on their successful amplification in other passerine speciesprincipally, the house finch Carpodacus mexicanus, and New Zealand robin Petroica australis rakiurausing locus-specific primers (Alcaide & Edwards, 2011; (Supporting Information  Table S1). These primers target parts of the specific exons that have leucine-rich repeats, which are associated with pathogen binding (Table 1). The seven TLR genes that amplified successfully in the SW (TLR1LA, TLR1LB, TLR3, TLR4, TLR5, TLR15 and TLR21) were screened in 22-33 individuals. The number of samples needed to identify the majority of variation at each locus was calculated by rarefaction curves using HPRare v1.0 (Kalinowski, 2005). These curves are based on the number of alleles discovered with increasing sample size until the curve reaches an asymptote, and so the SW sample sizes used imply the point at which this curve did plateau.
For each locus, PCRs were carried out in 10 lL volume with genomic DNA at a concentration of c. 10 ng lL À1 . Taq PCR Master Mix was used (Qiagen Ltd, Hilden, Germany) which includes: Taq DNA polymerase, QIAGEN PCR buffer, MgCl 2 and ultrapure dNTPs at optimized concentrations. PCRs were carried out using the following conditions: 40 s at 94°C, 40 s at the locus-specific annealing temperature (Supporting Information Table S1), 80 s at 72°C, all repeated for 34 cycles. All PCRs started with an incubation step of 3 min at 94°C and finished with an incubation step of 10 min at 72°C. All PCR products were electrophoresed on a 2% agarose gel containing ethidium bromide and visualized to determine successful amplification of the expected size fragment. Successful samples were submitted to MWG Operon (Eurofins, Ebersberg, Germany) for Sanger sequencing. All unique sequences were confirmed by repeated sequencing across multiple individuals or, where identified in only one individual, multiple independent PCRs from that individual.
All sequences were aligned against target sequences of the given loci/exon from other passerine species, available in the National Centre for Biotechnology Information (NCBI) nucleotide database, using BioEdit (Hall, 1999) via ClustalW codon alignment. Each chromatogram was examined by eye to identify single-nucleotide polymorphisms (SNPs). Sequences with multiple SNPs had their haplotypes inferred using Bayesian PHASE algorithms (Stephens & Donnelly, 2003) in the program DnaSP (Librado & Rozas, 2009), but given the low levels of polymorphism observed, this could also be done manually as an error-check for DnaSP phasing, which does have an error rate of c. 5% (Marchini et al., 2006). No stop codons or frameshift mutations were detected in these sequences when translated.

Haplotype-level tests
Amino acid sequences were translated using Mega v5.1 (Tamura et al., 2007). In the SW, all haplotype frequencies observed at each locus were tested for linkage disequilibrium using pairwise log likelihood ratio statistics, and were tested for deviation from Hardy-Weinberg proportions using the Markov chain method available in Genepop v.2 (Raymond & Rousset, 1995). F IS values are presented using Robertson & Hill's (1984) estimates, which have lower variance under the null hypothesis compared to the alternative Weir & Cockerham's (1984) estimate. DnaSP was used to calculate basic measures of genetic variation for each locus for each species: number of sequences (N), overall number of segregating sites (S), number of unique haplotypes (H), haplotype diversity (H d ), nucleotide diversity (p) and ratio of synonymous (dS) to non-synonymous (dN) substitutions.
Neutrality tests were carried out on the SW sequences using DnaSP, including Tajima's D (Tajima, 1989), Fu and Li's F (Fu & Li, 1993) and the D statistic (Fu, 1996). Tajima's D is based on the difference between the number of segregating sites and the average number of nucleotide differences between haplotypes. Fu and Li's D statistic is based on the difference between mutations appearing only once among sequences and the total number of mutations, whereas the F-statistic is based on the average number of nucleotide differences between haplotype pairs (k). These tests of selection are averaged over all sites in the sequence, thus they will be confounded if selection differs across sites. Moreover, haplotype-level selection tests can lack power because they are also influenced by population history and so are not able to detect relatively weak signatures of selection .

Neutral variation in the SW
A total of 10 supposedly unbiased microsatellite markers have been screened previously to determine relative levels of neutral variation in the SW A. sechellensis (SW), the GRW A. arundinaceus and the Basra reed warbler A. griseldis (Hansson & Richardson, 2005), three closely related congeneric species. These markers were originally chosen to be free of any ascertainment bias, that is, not selectively chosen based on their polymorphism characteristics in any of the given species, unlike the larger set of microsatellites commonly used in the SW (Richardson et al., 2000). Here, we used these microsatellite data to assess the general pattern of loss of variation at specific TLR loci compared to that observed at for neutral variation in the SW. We also compared differences in the levels of neutral and TLR diversity between the SW and the widespread congeneric GRW and used a generalized linear model (GLM) analysis to assess if significant difference in the relative levels of these types of variation existed between the species (by testing for an interaction between the type of marker/variation being measured, and the species/demography being sampled).

TLR variation in other Acrocephalus species (other warblers)
To help identify signatures of selection within TLR loci and assess variation at the genus level (Acrocephalus), the same TLR loci as above were screened in 2-8 individuals from each of the GRW A. arundinaceus, Eurasian reed warbler A. scirpaceus, Australian reed warbler A. australis, sedge warbler A. schoenobaenus, Cape Verde warbler A. brevipennis and Henderson's Island warbler A. taiti; hereafter, collectively referred to as 'other warblers' (OW). The sequencing protocols outlined above for the SW were used but with different optimized annealing temperatures for each OW species (Supporting Information Table S1). However, some loci proved problematic to optimize even when different combinations of primers were trialled , both for the SW and the other Acrocephalus species investigated. As a result, TLR15 was dropped from the OW component of this study. DnaSP was used to calculate basic measures of genetic variation for each locus for each species in the same way as outlined for the SW. Note that when referring to the OW species (in shorthand for 'other warblers'), we did not lump sequences of the different species together in the statistical analyses.

Haplotype-level tests
For all Acrocephalus species, the occurrence of gene conversion was estimated for each locus in DnaSP, which incorporates an algorithm (Betr an et al., 1997) to detect gene conversion tracts from multiple differentiated populations (referred to as subpopulations). Recombination rates were also estimated using the recombination parameter R = 4Nr (for autosomal loci of diploid organisms) (Hudson, 1987) where N is the population size and r is the recombination rate per sequence (per gene). The estimator is based on the variance of the average number of nucleotide differences between pairs of sequences, S2k (Hudson, 1987, equation 1). The minimum number of recombination events is estimated based on these calculations (Hudson & Kaplan, 1985).

Site-specific tests
Z-tests of selection were carried out in Mega v5.1 (Tamura et al., 2007) to identify selection based on dN/dS across species (Kryazhimskiy & Plotkin, 2008); first using just the OW species and then also including the SW. This is a codon-based test that can account for selective waves with different direction or intensity on specific sites (see Burgarella et al., 2012). Additionally, we assessed evidence of selection at codons within each TLR locus across the Acrocephalus genus using phylogenetically controlled selection tests. The HyPhy package available on DataMonkey (Delport et al., 2010) was used to run different models (for review, see Kosakovsky Pond & Frost, 2005) to identify individual sites under selection based on dN/ dS ratios at each codon across: (1) SW (2) OW and (3) SW + OW. However, on the basis of detecting potential recombination events as outlined above, Genetic Algorithm Recombination Detection (GARD) analysis was also carried out for the sequences of all loci for the SW, OW and SW + OW. This analysis is a precursor to the selection detection tests so that in the event of detecting recombination, any recombination effects are accounted for in the models (Pond et al., 2006). Two selection detection models were run: (1) mixed effects model of evolution (MEME), a mixed effects model of evolution with a significance level threshold of 0.1 and used to detect episodic positive selection (Murrell et al., 2012) and (2) a fast, unconstrained bayesian approximation (FUBAR) model using a Markov chain Monte Carlo routine which has a Bayes factor/ posterior probability set at 0.9 and detects sites under putative selection (Murrell et al., 2013). Only sites identified under both models were taken into consideration.

Investigating mainland and island endemic Acrocephalus species
The Acrocephalus species examined have a range of demographic and evolutionary histories (for overview, see Schulze-Hagen & Leisler, 2011). The GRW, Eurasian reed warbler, Australian reed warbler and sedge warbler are all migrant species from large outbred populations classified as 'under least concern.' Estimated European populations are 950 000, 3.1 and 2.3 million for the GRW, Eurasian reed warbler and sedge warbler respectively (after Hagemeijer & Blair, 1997;BirdLife International 2015). The Australian Reed warbler is widespread across Australia, New Guinea and South-West Asia, and we have used a population census size estimate of 1.5 million based on existing literature (del Hoyo, Elliott & Christie, 2006;BirdLife International 2015). These four species can be categorized as 'mainland.' The Cape Verde warbler and Henderson's Island warbler are two other island species with restricted, but at the present time stable populations estimated at 1000-1500 (Schulze-Hagen & Leisler, 2011) and c. 7000 individuals (Brooke & Hartley, 1995;Birdlife International 2015) respectively. These two species and the SW are categorized as 'island' species.
We tested for differences in TLR variation between mainland and island species using Welch's t-tests of unequal variances in R (R Core Team, 2015) and we investigated the relationship between consensus population size and TLR variation using a regression analysis in Sigmaplot (Systat Software Inc., London, UK). This was run for (1) all TLR variation observed and (2) only TLR variation resulting in a change at the amino acid level (hereafter termed functional variation). TLR variation is measured as 'haplotype diversity', which is the uniqueness of a haplotype in a given subset/population of individuals including a measure of the relative haplotype frequency (x i ) in the sample of individuals and can account for differences in sample size (N) (Nei, 1987).
In order to assess sequence evolution at each TLR locus within and across the different species, we constructed maximum-likelihood trees for each locus with 1000 bootstrap replications in Mega v4.0 (Tamura et al., 2007). The trees were based on nucleotide variation (given the sequences were all <1 kb) under the general time reversible substitution model (Nei & Kumar 2000). Sequences of non-Acrocephalus avian species, obtained from the NCBI database, were used to root the tree: Carpodacus mexicanus (house finch), Petroica australis rakiura (Stewart Island robin), Taeniopygia guttata (zebra finch), Picoides pubescens (downy woodpecker), Philesturnus carunculatus (saddleback), Accipiter cooperii (Cooper's hawk), Falco naumanni (lesser kestrel), Anas platyrhynchos (mallard) and Gallus gallus domesticus (domestic chicken) (Supporting Information Table S2). Table 1 characterizes the variation observed at the seven loci amplified in the SW. TLR4 and TLR21 were monomorphic in the 30 individuals screened for these loci. TLR1LA, TLR1LB, TLR3, TLR5 and TLR15 were polymorphic (Fig. 1,  Supporting Information Fig. S1). TLR15 was the only locus where all the variation observed (at three segregating sites) was non-synonymous, resulting in four different amino acid haplotypes and multiple changes to the characteristics of the encoded protein (Supporting Information  Table S4). In the SW, three loci deviated from Hardy-Weinberg proportions: TLR1LB and TLR3 had a deficiency of heterozygotes (TLR1LB: F IS = 0.372, P = 0.002; TLR3: F IS = 0.186, P = 0.031) and TLR15 had a heterozygote excess (F IS = À0.061, P = 0.017) (Supporting Information Fig. S1). None of the pairwise combinations of loci tested positive for linkage disequilibrium. At the haplotype level, none of the tests could reject neutral evolution when performed on the limited numbers of alleles found at each of the five polymorphic TLR loci within the SW (all tests P > 0.1) (Supporting Information Table S6).

Neutral variation in the SW
The pattern of relative levels of variation observed in the SW (an island bottlenecked species) compared to the GRW (a mainland migratory species) was similar for both microsatellite loci (neutral variation) and TLR loci (excluding TLR15) (Fig. 2). In both cases, the GRW had higher levels of variation compared to the SW. The GLM analysis found that being an island or mainland population, and its interaction with the type of variation (neutral or TLR) was not significant (Supporting Information Table S5). In striking contrast, at TLR15 levels of haplotype diversity were high (considerably higher than the other TLRS in the SW or in the OWs) highlighting the difference between TLR15 and the other TLR loci in the SW.

Site-specific selection tests
The Z-tests based on dN/dS across haplotypes failed to detect selection in both the SW and in the OWs (Supporting Information Table S7). The phylogenetic tests for selection at individual sites were performed at three levels: within the SW only, across the other warblers excluding the SW (OW) and finally across the entire dataset (SW + OW). Using the SW TLR sequences, GARD analysis failed to detect any evidence of recombination using a site-by-site analysis. This meant we could proceed to use the selection detection models as normal. TLR1LB and TLR5 each had a single site identified as being under putative purifying selection according to the FUBAR model. This model also identified a single site at both the TLR3 and TLR15 loci to be under putative positive (balancing) selection (Table 3), of which the site at TLR15 in the SW was also identified as being under positive selection in the MEME model (Supporting Information Fig. S2).
When the OW TLR sequences were examined (excluding the SW) with the FUBAR model, the same site at TLR1LB was identified as being under purifying selection along with an additional site, while three other sites were identified to be under putative positive selection at TLR1LB (Table 3). One of these sites was confirmed using the episodic positive selection MEME model. As for TLR5, the one site under purifying selection identified in the SW was not identified in the OW, but reappeared when considering all Acrocephalus species, so must be SW-specific. When considering OW, the one site found to be under position selection at TLR3 in the SW was also found, as well as two sites under purifying selection.
The same analyses could not be carried out for TLR1LA since less than three unique haplotype sequences were detected in the SW. However, in the OW and OW including the SW, one site was identified to be under positive selection and four sites under purifying selection at this locus across the genus. For TLR4 and TLR21 (two loci which were monomorphic in the SW), no sites were identified to be under positive selection across the Acrocephalus genus. However, several sites were identified to be under purifying selection (TLR4 n = 4, TLR21 n = 2) ( Table 3). For all loci tested, all sites detected by the MEME model to be under episodic positive selection were also detected by the putative selection FUBAR model.

Investigating mainland and island endemic Acrocephalus species
There was significantly more variation present at TLR loci (excluding TLR15) in the mainland migratory species -A. australis, A. arundinaceus, A. schoenobaenus and A. scirpaceusthan observed in the island endemic species including A. brevipennis, A. sechellensis and A. taiti (Fig. 3). This was the case for the number of segregating sites S (t = À2.75, d.f. = 6, P = 0.032) and number of unique haplotypes H (t = À2.99, d.f. = 6, P = 0.023). Post hoc Tukey tests show that levels of variation averaged across all TLR loci (measured as S and H) in the SW did not differ significantly to those observed in the other island endemics, although there was a tendency to have lower variation than in A. brevipennis (Tukey HSD: mean difference in S = À0.268, P = 0.076; mean difference in H = À0.308, P = 0.074) but not in A. taiti (Tukey HSD: mean difference in S = À0.052, P = 0.891; mean difference in H = À0.148, P = 0.496).
Across the Acrocephalus species sampled, census population size significantly predicted mean TLR haplotype diversity for all nucleotide variation (t = 4.96, d.f. = 6, P = 0.04) (Fig. 4a). This pattern became non-significant when only considering amino acid variation, that is, dN substitutions only (t = 1.55, d.f. = 6, P = 0.18) (Fig. 4b) most likely due to a lack of power. Overall, the mainland species had more variation across the TLR gene family in comparison with the island endemic species with the mean number of alleles observed per locus for mainland Acrocephalus species 3-4, but only 2-3 for the island Acrocephalus species. Neighbour-joining trees showed distinct segregation to the level of genus for all TLR loci analysed, supported by high bootstrap values between 76 and 100% (with the exception of TLR21 at 49%). All Acrocephalus species screened were clearly out-grouped not only from all non-passerine species, including those from Falconidae and the Galliformes but also from other Passeriformes families like Fringillidae and Turdidae. Within the Acrocephalus genus variation within each TLR locus did not separate out by particular species. None of the loci showed evidence of gene conversion as no conversion events were identified for any of the pairwise combinations of alleles tested in any of the species for any of the loci. However, at least one recombination event appears to have occurred in each of the four TLR genes in the evolution of these Acrocephalus warblers (TLR1LB, TLR3, TLR4 and TLR15) (minimum number of recombination events identified between specific sites = 2, 1, 2, 2 respectively).

Discussion
We characterized variation at seven TLR immune genes in the bottlenecked population of the SW. Two of these loci were monomorphic while five polymorphic loci had 2-5 alleles each. Across the TLR loci levels of variation were, on average, as low as levels of variation observed at neutral loci (Hansson & Richardson, 2005). The same proportion of TLR loci (5/7; 71%) and microsatellite loci (7/10; 70%) were polymorphic, and the average levels of variation were similar at 2.5 alleles observed at microsatellite loci to 2.9 alleles observed at the TLR loci in the SW. Whole-sequence selection tests failed to reject the null hypothesis of neutral evolution, indicating that if selection was occurring, the effects were modest and the sample size in this study was not sufficient to detect it. Furthermore, the relative paucity of TLR variation in the SW compared to that observed in the non-bottlenecked congeneric GRW population was in line with the different levels of neutral variation observed across the two species. Although we recognize that the mutational characteristics (including the mutation rate) differ dramatically between TLRs and microsatellite markers, we made this comparison to illustrate how genetic drift appears to be the overriding evolutionary force governing TLR variation in the SW (and other island warbler species). The only TLR locus to stand out from this overall pattern was TLR15 (see below). Next, we compared levels of variation among different TLR loci in 5-6 other congeneric warbler species (OW) with varying demographic histories. This comparison excluded TLR15 which was, unfortunately, problematic to amplify in the OW species. We found that TLR variation in the SW (and other island populations) was reduced compared to that in the large populations of the mainland species (Fig. 3). This suggests that genetic drift is the main force shaping TLR variation in the small, previously bottlenecked, isolated island populations. This result concurs with those above comparing the SW and GRW, which shows that variation at most TLR loci has been reduced in line with neutral variation in the SW species.
Census population size did predict overall levels of nucleotide variation at TLR loci among different populations (Fig. 4a), however, this association was weaker (not significant) when testing amino acid (functional) variation alone (Fig. 4b). This difference is explained by the fact that the island species have much lower levels of overall nucleotide variation than mainland species, but not so much lower levels of amino acid variants (Supporting Information Fig. S2). This may be because a greater proportion of nonsynonymous variation, compared to synonymous variation, is retained in the bottlenecked populations, perhaps as a result of balancing selection mitigating the effect of drift on these functional variants. Alternatively, it may merely be a lack of power associated with assessing just a subset of all the variants. Overall, the main effect is clear; there is substantially less TLR variation within the small bottlenecked island populations.
Despite the lack of evidence of any strong signature of selection, we did find some interesting patterns within  (1) number of segregating sites (S) and (2) number of unique haplotypes (H). S and H are an average measure taken across all individuals included in the island and mainland groups. Standard error bars are presented OR mean toll-like receptor (TLR) haplotype diversity compared directly between bottlenecked island species and mainland migratory species from the Acrocephalus genus. specific loci. For example, in the SW, TLR15 appeared to have a bias towards retaining potentially functional (amino acid) variants, with all four sequence variants detected encoding different amino acid sequences. TLR15 was a clear outlier to the other TLR loci in terms of levels of variation in the SW (Fig. 1), and was the only locus to show a significantly deviation from Hardy-Weinberg proportions as a result of an excess of heterozygotes. This may be a result of heterozygote advantage (Supporting Information Fig. S1), a mechanism of balancing selection that has been found to act on immunogenetic variation in various species (Hedrick, 2002;Worley et al., 2010;Niskanen et al., 2013). Note, however, that rare allele advantage (i.e. negative frequency dependent selection) can also result in an excess of heterozygotes and relatively homogenous allele frequency distributions. TLR15 also retained more variation in the bottlenecked SW compared to that observed at neutral markers or the other TLR loci in the other island Acrocephalus species (where variation was consistently low). Again, this pattern may be indicative of balancing selection overriding the effects of drift at TLR15 in the SW. The low levels of variation at TLR loci in the SW and other Acrocephalus may explain why the haplotype-level neutrality tests failed to detect any strong signatures of positive selection in, or across, these species. Haplotype-based neutrality tests have been much criticized for their limitations and lack of power for detecting selection (Vasemagi & Primmer, 2005;Leffler et al., 2012;Li et al., 2012), which, given the limited sequence data available from the genetically depauperate population of the SW (and other bottlenecked species A. brevipennis and A. taiti), may explain our results. Furthermore, haplotype-level tests based on the allele frequency spectrum make strong inferences about the populations' demography, such as constant population size (Nielsen, 2005). The SW population, which has been expanding rapidly since it was reduced to c. 26 individuals in the 1960s , does not comply with these assumptions.
Selection was identified at individual sites within the exons of the TLR loci examined. In the SW, both TLR3 and TLR15 had individual sites identified as being under positive selection, and these sites were confirmed to be under selection across the Acrocephalus genus. An additional site at the TLR15 locus was also identified in the OW species but not in the SW. TLR15 was also the locus under the relatively greatest amount of selection overall with sites for both positive and purifying selection. This pattern of different sites within the exon showing signatures of different types of selection is probably because some of the sites are directly involved in PAMP binding while others may be important in determining the overall shape and configuration on the molecule and thus conserved (Bell et al., 2003;Werling et al., 2009;Kawai & Akira, 2010). Therefore, codons within the same exon have the potential to mask opposing selection forces from being detected (Good et al., 2013). While such codon-based tests across species provide considerable power, there are caveats. For example, the signatures they detect will be of past selection caused by pressures that may no longer be acting (Yang & Bielawski, 2000). The tests cannot resolve whether the variation observed is currently under selection in the contemporary population. Many sites that were shown to be under negative (purifying) selection across the other Acrocephalus species were also found to be under negative selection in the SW. Theory predicts that while the intensity of (positive) selection on the innate immune genes may fluctuate in space and time, depending on the selective pressures exerted by pathogens, purifying selection is a constant evolutionary force that preserves the functionality of these genes (Kimura & Ohta, 1969;Ohta, 2002;Mukherjee et al., 2009). This may explain the different results we obtained for positively and negatively selected sites in the SW, in that signals of negative selection were clearer because they came out stronger than the potentially weaker signals of balancing selection. One may question why selection has not maintained more variation at the immunologically important TLR loci in the SW (or other island species). Despite considerable screening efforts, no gastrointestinal parasites or virus infections and only one blood parasitea single strain of avian-malaria (GRW1)have been detected in the SW population (Hutchings, 2009). This contrasts markedly with the diversity of pathogens found in most mainland avian populations, but is normal for remote (bottlenecked) island populations (Steadman, Greiner & Wood, 1990;Coltman et al., 1999;V€ ogeli et al., 2011). Since pathogen-mediated balancing selection is thought to be the force maintaining variation at immune genes (Turner et al., 2012;Westerdahl et al., 2012;Grueber et al., 2014), the paucity of pathogens could help explain why drift appears to be the predominant force shaping TLR variation in our SW population (V€ ogeli et al., 2011). On the other hand, a restricted pathogen fauna in the SW may have actively contributed to the loss of immunogenetic variation. For example, the lack of variation observed at TLR4 in this study is notable as this locus has been shown to be involved in the recognition of Protozoan's such as haemosporidian (malaria-like) parasites (Franklin et al., 2011;Basu et al., 2012), and the only pathogen detected in the SW was a Haemoproteus (Hutchings, 2009). It is possible that the single TLR4 allele remaining in the SW population might have offered the best protection (or tolerance) against GRW1. In the absence of multiple strains exerting selection pressures favouring different alleles, selection may have driven this allele to fixation at TLR4. So, while the most parsimonious explanation for a lack of variation may be genetic drift, we highlight the possibility that PMS could reach a new equilibrium in small isolated populations in the form of the complete fixation of a single allele (Robertson, 1962). This effect of selection could have important implications for conservation genetics of post-bottlenecked populations with limited pathogens because immunogenetic variation could be lost faster than expected based on the drift alone. In support of this idea, several other studies have found that immunogenetic variation eroded faster than neutral variation in island/fragmented populations (see Bollmer et al., 2011;Eimes et al., 2011;Sutton et al., 2011).
While TLR polymorphism may be low in the SW, some variation has been maintained in five of the seven loci despite its demographic history. All TLRs recognize specific PAMPs and their different levels of haplotype variation could reflect the biodiversity of the pathogens they protect against in a particular population or species. TLR1LA and TLR1LB recognize lipoproteins in the cell walls of bacteria, fungi and protozoans (Brownlie & Allan, 2011). In the SW, TLR1LB was the only TLR gene that showed convincing evidence for purifying selection, although TLR5 did have one site identified. TLR1LA only had two synonymous alleles, while TLR5which has been shown to recognize the flagella of bacteria in other species (Andersen-Nissen et al., 2007)had three apparently functionally different alleles. TLR3 and TLR15 had the highest levels of variation and these were the only loci at which any sites were identified as being positively selected in the SW. TLR3 is involved in sensing viral RNA (Uematsu & Akira, 2008), whereas TLR15a gene unique to birdsappears to be important in the recognition of intracellular parasites, including haemosporidian parasites (Boyd, Philbin & Smith, 2007). Further investigations into how individual variation at TLR15 influences resistance or resilience to haemosporidian infection in the SW may be worthwhile, particularly given results of recent studies into malaria infection and MHC genes (similar to TLRs in function and in molecular structure) in passerines (see Bensch et al., 2007;Asghar, Hasselquist & Bensch, 2011;Lachish et al., 2011;Asghar et al., 2015;Marzal et al., 2015) including the closely related A. arundinaceus (Bensch et al., 2000;Westerdahl et al., 2012).
There has been much debate on the relative roles of genetic drift and selection in shaping functional variation in natural populations and importantly, whether balancing selection can maintain important functional variation even in the face of strong drift (e.g. Alcaide, 2010;Sutton et al., 2011;Strand et al., 2012). There is, however, considerable evidence that drift outweighs selection even at immunologically important loci where balancing selection would be expected to be most effective (Miller & Lambert, 2004;Willi et al., 2007;Kuo, Moran & Ochman, 2009;Gonzalez-Quevedo et al., 2015). Our data on TLR variation in the SW and other Acrocephalus species concurs with this general view of the overriding effect of drift. However, potentially functional variation does still exist within the SW at some of the TLR loci (i.e. TLR15). Thus, it is possible that selection may have played a role in maintaining this variation, though more in-depth studies are now required to investigate this possibility. Furthermore, approaches will need to consider how to delineate the relative effects of drift and selection at these candidate loci during the bottleneck. Studies undertaken during bottleneck events, and which identify the cause of selection, will therefore be required before we can fully understand these dynamics.
In summary, the bottleneck suffered by the SW population appears to have reduced the levels of variation at TLR genes in this species. However, some potentially functional variation still remains (most noticeably at the TLR15 locus) possibly as a result of balancing selection. The limited amount of variation detected does, however, undermine our ability to test the significance of such variation. Sequence-based tests of selection have low statistical power and restrictive assumptions. Only studies assessing the impact of genetic variation on individual fitness within the contemporary population and/or simulation studies of drift and selection during bottlenecks will be able to robustly assess the relative contributions of the various evolutionary forces. Such studies would enable scientists to predict the long-term viability of fragmented or bottlenecked populations like the SW. This study illustrates the genetic consequences of a bottleneck event, and it also suggests that in spite of high levels of drift, not all variation is necessarily lost. This study focuses on one population, which sourced four subsequent populations on isolated islands. This means that we can now look at TLR variation in each population and how it is changing/ maintained over time, given that each population was sourced from a population where all individuals are exposed to GRW1 (Fairfield et al. 2016 in press). It can be argued that it is exposure to pathogen-selective pressures like these which are partly responsible for bottlenecked populations being able to retain some of their adaptive evolutionary potential through genetic augmentation. Our findings support this and must be implemented into the island management and long-term viability of this species.

Supporting information
Additional Supporting Information may be found in the online version of this article at the publisher's web-site: Figure S1. Observed and expected haplotype frequency charts for each polymorphic toll-like receptor (TLR) locus amplified in the Seychelles warbler. Figure S2. Nucleotide and corresponding amino acid alignments for all polymorphic toll-like receptor (TLR) loci in the Seychelles warbler. Single-nucleotide polymorphisms are highlighted in black and the corresponding translation is shown highlighted in yellow below. Any sites inferred to be under positive selection using PAML-based methods are highlighted with an asterix of which a black asterix represents the FUBAR model and red for the mixed effects model of evolution (MEME) model. Figure S3. Maximum-likelihood trees for each toll-like receptor (TLR) locus to show the relationship between alleles at each locus across different avian lineages. Bootstrapping is applied to each relationship with 1000 repetitions and the tree is drawn to scale, with branch lengths measured in number of substitutions per site. Trees include all sequences obtained for the Seychelles warbler (SW) and six other Acrocephalus species [other warblers (OW)] and reference sequences of other passerines and non-passerine species to root the trees. Table S1. Primers and PCR annealing temperatures used to amplify toll-like receptor (TLR) loci in seven Acrocephalus species. Table S2. Accession details for all external sequences used in Supporting Information (Fig. S3) to construct maximumlikelihood trees for each toll-like receptor (TLR) locus, extracted from the National Centre for Biotechnology Information (NCBI's) database using the basic local alignment search tool (BLAST). Table S3. Comparison of the polymorphisms observed of the exons encoding toll-like receptor (TLR) genes among individuals from the bottlenecked Cousin island population of Seychelles warblers (A. sechellensis). Comparisons include amino acid changes from non-synonymous (dN) substitutions and the resulting change in protein characteristic as one of three categories: charged, polar and hydrophobic. Table S4. Variation characterized within a contemporary population of Seychelles warbler at 10 neutral microsatellite markers (as published in Hansson & Richardson, 2005) and at seven toll-like receptor (TLR) loci. Abbreviations: k (number of alleles), N (sample size), H e (expected heterozygosity) and H o (observed heterozygosity). Table S5. Haplotype-level tests for selection based on the allele frequency spectrum for each toll-like receptor (TLR) locus for the Seychelles warbler. Significant P-values are in brackets. Table S6. Generalized linear model (GLM) analysis with haplotype diversity as the response variable and mainland/island species status crossed with microsatellite (neutral)/tolllike receptor (functional) loci as fixed factors. The interaction term shows the rate of decline in allelic variation in islands versus the mainland differs between the two types of genes.