Abstract High frequencies of some rare inherited recessive disorders can be found in the Saguenay region of Quebec, Canada. Four disorders have a carrier frequency of about 0.04 (in the range 0.035-0.05): pseudo-vitamin D-dependent rickets, hereditary tyrosinemia type 1, Charlevoix-Saguenay spastic ataxia, and sensorimotor polyneuropathy with or without agenesis of the corpus callosum. Molecular data suggest that only 1 mutation has been introduced into the population since its founding in the 17th century. The carrier frequencies are much higher than one would expect under a theoretical model that includes variance in family size and population growth (Thompson and Neel 1978). I present a methodology called allele dropping to test the hypothesis that only 1 founder introduced a given mutation. This study is based on 891 ascending genealogies and enables one to measure the extent of allele frequency changes resulting from the demographic history of the population. Two scenarios are tested: neutral and lethal alleles. Lethality has a minor effect because the alleles never reach a frequency high enough for selection to be strong. Twenty-five founders have a probability greater than 1% that a lethal mutation they introduced into the population will reach a carrier frequency between 0.035 and 0.05 in the contemporary population. Moreover, 2 founders have a probability greater than 20% that a lethal allele they introduced into the population will reach this target frequency. Therefore the simplest hypothesis, that 1 founder introduced 1 disorder into the population, is consistent with the observed frequencies.

KEY WORDS: INHERITED DISORDERS, SAGUENAY REGION, DRIFT, FOUNDER EFFECT

The object of this study is to test whether the demographic parameters of a population in the northeastern part of Quebec, an expanding population created through a founder effect, are sufficient for a lethal allele introduced by only 1 founder to reach a high frequency in the contemporary population. Some of these lethal alleles have attained carrier frequencies greater than 0.035, starting from a founder gene pool of 5000 individuals (initial carrier frequency of 0.0002). This tremendous increase in allele frequency in only 12 generations is much higher than any documented cases and far from any theoretical prediction.

The question of elevated frequencies of rare variants in human populations has been addressed by several researchers. Drift and population growth are the main explanations. For example, the high frequency of some rare inherited disorders in the Finnish population is explained by the population's strong growth from a few founders to 5,000,000 inhabitants in approximately 100 generations (de La Chapelle 1993). This growth led to what is called the Finnish disease heritage, and some disorders, such as aspartylglucosaminuria, reach a carrier frequency of 0.03; other recessive disorders have a lower carrier frequency, about 0.01-0.02 (gyrate atrophy of the choroid and retina, diastrophic dysplasia) (Norio et al. 1973). These frequencies are expected because of the growth rate of the population (Hastbacka et al. 1992; Kaplan et al. 1995), and they can be easily simulated. Thompson and Neel (1978, 1996) used a different approach to study the frequency of rare variants in Amerindian populations that takes into account the variability in family size. The distribution of family size is taken as a zero-modified geometric distribution. Their model fits the distribution of rare variants in the different tribes.

The population under study here is the Saguenay population of northeastern Quebec, Canada. This population has been under genetic study because of the high frequency of some rare inherited disorders (see Table 1). For 4 recessive disorders specific to the Saguenay population the carrier frequency is estimated in the present population to be 0.035-0.05 (1/28-1/20).

From previous studies it is clear that these disorders have been introduced into the Quebec population since the 17th-century French emigration to Nouvelle-France (Heyer and Tremblay 1995; Heyer et al. 1997). Molecular data and genealogical analysis suggest that unique mutations were introduced in the 17th century. This population also has well-documented demographic records. From the known demographic parameters and by using Thompson and Neel's (1978, 1996) model, I calculate that the prior probability that a variant introduced 12 generations ago into the population reaches a present-day carrier frequency between 0.035 (1/28) and 0.05 (1/20) is only $2.35 \times 10^{-31}$ (m = 1.41, c = 0.3, g = 12 using Thompson and Neel's notation). I focus on the prior probability and not on the probability conditional on the survival of an allele; I want to assess, for a unique variant introduced 12 generations ago in the population, the probability that the allele will reach the high known frequencies, taking into account the extinction probability of this variant.

Because for Saguenay genealogical information is available from contemporary individuals to the founding of the population in the 17th century, I use a different approach. In the absence of migration, selection, and mutation, any allele in a population will change in frequency because of (1) any differential reproduction and (2) the random process of gamete sampling resulting from the Mendelian transmission of genes. Provided that vital statistics are available, an effective means of investigating the demographic process by which nuclear alleles change in frequency in a population is to track the genetic contribution of each founder to the descending gene pool (Edwards 1992; Jacquard 1977; Roberts 1968; Roberts and Bear 1980; Cazes 1986). All the demographic events involved in the transmission of genes from 1 generation to another are represented by the genealogical paths that link a founder to some contemporary individuals.

Using the same genealogical information, one can simulate the Mendelian transmission that took place in every genealogical step from any founder to the contemporary individuals (Edwards 1968; MacCluer et al. 1986). The results of these simulations give the amount of stochastic changes in gene frequency.

Using this approach based on genealogies, I measure the extent to which the demography of the Saguenay population allows the increase of allele frequency of a variant introduced by only 1 founder. Thus my approach is different from Kaplan et al.'s (1995) or Thompson and Neel's (1978, 1996) approach because I do not simulate the demography of the population but use the real one.

To better understand the effect of population growth following the founding of a population, I address the following questions: Is such a change in allele frequency something expected in this population? Is there a difference in the survival probability and allele frequency changes for a lethal allele versus a neutral one?

Materials and Methods

Historical Background. The French colonization of the province of Quebec began in the early 17th century and continued until English control began in 1760. About 4500 founders settled before 1700, and an estimated 3500 founders have descendants today (Charbonneau et al. 1987) among the 5,000,000 Francophone inhabitants of Quebec.

The colonists settled along the St. Lawrence River, taking advantage of the rich soil of the valley, which became densely occupied by the 19th century. In the mid-19th century overpopulation stimulated migration from the older established areas toward the United States and to more remote regions of Quebec, such as Saguenay (Pouyez et al. 1983).

The Saguenay region is located on the north shore of the St. Lawrence River, about 200 km northeast of Quebec City. European settlement in this region began in the mid-1800s, originating mostly from the relatively small border region called Charlevoix. This population was created through a combination of migration and rapid intrinsic growth. As early as 1870, the growth of the population was mostly intrinsic. The Saguenay population now approaches 300,000 inhabitants. Here, all ancestors of the contemporary Saguenay-Lac-St-Jean population from the 17th century to today will be referred to as the Saguenay-Lac-St-Jean population; thus the definition used here is not strictly geographic. As Table 1 shows, some recessive inherited disorders reach a high carrier frequency in this population. Pseudo-vitamin D-dependent rickets, hereditary tyrosinemia type 1, Charlevoix-Saguenay spastic ataxia, and sensorimotor polyneuropathy with or without agenesis of the corpus callosum have a carrier frequency of about 0.04 (1/25), whereas other diseases have carrier frequencies of about 0.025 (1/40).

Database. I used a sample of data from the Interuniversity Institute for Population Research database (Bouchard and De Braekeleer 1991). The information contained in this database can be used to reconstruct ascending genealogies. I chose a subset of 891 ascending genealogies of probable and possible Alzheimer cases that have been documented by project IMAGE and Algene Biotechnologies. All these individuals were born around 1930, and more than half of those presumptive cases have had their diagnoses corroborated postmortem or through a rigorous clinical algorithm. I have previously shown that the genetic contribution of 17th-century founders to Saguenay contemporary individuals does not vary appreciably whether these individuals are carriers of an inherited disorder or not (Heyer and Tremblay 1995). I did the same calculations for these 891 individuals, yielding the same results. Therefore the fact that these 891 individuals have been recruited as having Alzheimer's should not bias the results, and they are taken as representative of the contemporary Saguenay population. These 891 individuals trace back to 2631 founders who settled in Nouvelle-France before 1700. These founders and their genetic contribution to the contemporary population have been described by Heyer et al. (1997).

Genetic Contribution. To make this study self-contained, I briefly describe the term genetic contribution. The genetic contribution of a founder to a given group is

$$GC = \sum_{i=1}^{p} \sum_{j=1}^{c} \left(\frac{1}{2}\right)^{g_{ij}} \qquad (1)$$

where p is the number of individuals in a given group genealogically related to the founder; c is the number of genealogical paths between the founder and individual i, including redundant pathways; and $g_{ij}$ is the number of generations separating the founder from individual i along path j. The genetic contribution value indicates for a given group the expected number of copies of a particular allele carried by a founder. The genetic contribution is a summary of all demographic events (marriage, fertility, mortality, and migration) that occurred among the descendants of a founder.
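The definition above can be illustrated with a minimal Python sketch. The toy pedigree, the individual labels, and the function name `genetic_contribution` are hypothetical and are not taken from the study; the recursion simply halves the contribution at each parent-to-child step, which sums $(1/2)^{g}$ over every genealogical path, redundant paths included.

```python
from functools import lru_cache

# Hypothetical toy pedigree (not from the study): child -> (father, mother).
# Individuals without an entry are founders.
PEDIGREE = {
    "C1": ("F1", "F2"),
    "C2": ("F1", "F2"),
    "C3": ("F3", "F4"),
    "G1": ("C1", "C3"),
    "G2": ("C2", "C3"),
    "P1": ("G1", "G2"),  # consanguineous union: several paths to each founder
}


def genetic_contribution(founder, group, pedigree):
    """Sum of (1/2)**g over every genealogical path from founder to the group."""

    @lru_cache(maxsize=None)
    def share(individual):
        if individual == founder:
            return 1.0
        parents = pedigree.get(individual)
        if parents is None:  # a different founder: no path to follow
            return 0.0
        father, mother = parents
        # Each parent-to-child step halves the contribution.
        return 0.5 * (share(father) + share(mother))

    return sum(share(individual) for individual in group)


sample = ["G1", "G2", "P1"]
gc_f1 = genetic_contribution("F1", sample, PEDIGREE)
print(gc_f1)
```

In this toy case the contributions of the four founders sum to the size of the group, which is a convenient sanity check when the pedigree is complete.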

Allele Dropping. I use a simple method that simulates the Mendelian transmission of alleles along genealogical paths (Edwards 1968; MacCluer et al. 1986; Thomas 1990; Heyer 1991; O'Brien et al. 1994). For each individual in the genealogies, I choose at random 1 of the 2 alleles carried by his father and 1 of the 2 alleles carried by his mother. This Mendelian segregation is done starting from the founders and going down along the genealogical paths. I attribute 2 unique alleles to each founder.

The result of 1 simulation is all the alleles carried by the 891 individuals. This gives for each of the 2 alleles carried by 1 founder the number of individuals who are simultaneously carriers of this allele. Using genealogical paths as a support for these simulations, I incorporate all the demographic parameters that influence the allele frequency changes. It is important to understand that I do not simulate a population but the allelic transmission in the real population.
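A minimal sketch of the neutral allele-dropping procedure is given here under stated assumptions: the toy pedigree, the ordering list, and the names `drop_alleles` and `carrier_distribution` are hypothetical, and the 891 real genealogies are replaced by a three-person sample. Each founder receives 2 unique alleles, and every descendant draws 1 allele at random from each parent.

```python
import random
from collections import Counter

# Hypothetical toy pedigree (not from the study): child -> (father, mother).
PEDIGREE = {
    "C1": ("F1", "F2"), "C2": ("F1", "F2"), "C3": ("F3", "F4"),
    "G1": ("C1", "C3"), "G2": ("C2", "C3"), "P1": ("G1", "G2"),
}
# Parents must be listed before their children.
ORDER = ["F1", "F2", "F3", "F4", "C1", "C2", "C3", "G1", "G2", "P1"]


def drop_alleles(pedigree, order, rng):
    """One pass of Mendelian segregation from the founders downward."""
    genotypes = {}
    for ind in order:
        parents = pedigree.get(ind)
        if parents is None:
            # Founder: two unique alleles, labelled (founder, 0) and (founder, 1).
            genotypes[ind] = ((ind, 0), (ind, 1))
        else:
            father, mother = parents
            genotypes[ind] = (rng.choice(genotypes[father]),
                              rng.choice(genotypes[mother]))
    return genotypes


def carrier_distribution(pedigree, order, sample, allele, n_sims, seed=0):
    """Count, over n_sims passes, how many sampled individuals carry the allele."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_sims):
        genotypes = drop_alleles(pedigree, order, rng)
        carriers = sum(allele in genotypes[ind] for ind in sample)
        counts[carriers] += 1
    return counts


N_SIMS = 20000
counts = carrier_distribution(PEDIGREE, ORDER, ["G1", "G2", "P1"], ("F1", 0), N_SIMS)
mean_carriers = sum(x * n for x, n in counts.items()) / N_SIMS
print(sorted(counts.items()), mean_carriers)
```

The tally `counts[x] / N_SIMS` plays the role of P(X = x) for one founder allele. Because this sketch counts carriers rather than allele copies, the mean number of carriers can fall slightly below the genetic contribution when homozygotes occur.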

This simple method enables me to measure stochastic events in the process of allele frequency changes in the population. The mean is the genetic contribution, and the distribution accounts for the stochastic process.

Simulations were done under 2 scenarios: (1) for a neutral allele and (2) for a recessive lethal disorder where, by definition, individuals included in an ascending genealogy cannot be homozygous for a lethal recessive mutation. I modified the method so that when an individual is found to be homozygous for 1 of the founder's alleles, his 2 alleles are resampled from his parents until he becomes heterozygous.
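The lethal scenario can be sketched as a small change to the segregation step. This is an illustrative reading of the rule described above, not the original program: the toy pedigree, the tracked allele `LETHAL`, and the function name `drop_alleles_lethal` are hypothetical, and only the single tracked founder allele is treated as lethal.

```python
import random

# Hypothetical toy pedigree (not from the study): child -> (father, mother).
PEDIGREE = {
    "C1": ("F1", "F2"), "C2": ("F1", "F2"), "C3": ("F3", "F4"),
    "G1": ("C1", "C3"), "G2": ("C2", "C3"), "P1": ("G1", "G2"),
}
ORDER = ["F1", "F2", "F3", "F4", "C1", "C2", "C3", "G1", "G2", "P1"]
LETHAL = ("F1", 0)  # assumed: the one founder allele treated as recessive lethal


def drop_alleles_lethal(pedigree, order, lethal, rng):
    """Segregation pass in which no individual may be homozygous for `lethal`."""
    genotypes = {}
    for ind in order:
        parents = pedigree.get(ind)
        if parents is None:
            genotypes[ind] = ((ind, 0), (ind, 1))  # founder: two unique alleles
            continue
        father, mother = parents
        while True:
            paternal = rng.choice(genotypes[father])
            maternal = rng.choice(genotypes[mother])
            # Resample both alleles if the draw gives a lethal homozygote.
            # Parents are never homozygous themselves, so this terminates.
            if not (paternal == lethal and maternal == lethal):
                break
        genotypes[ind] = (paternal, maternal)
    return genotypes


rng = random.Random(1)
runs = [drop_alleles_lethal(PEDIGREE, ORDER, LETHAL, rng) for _ in range(5000)]
homozygotes = sum(g == (LETHAL, LETHAL) for run in runs for g in run.values())
print(homozygotes)
```

Resampling, rather than discarding the whole simulation, keeps every individual of the ascending genealogy alive, which matches the constraint that people who appear in a genealogy cannot have been homozygous for a lethal allele.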

From Sample to the Population. The results of several simulations define the probability for each founder's allele to reach a given carrier frequency in the sample. Let $P(Y \in\, ]a, b])$ be the probability for a given founder that 1 of his alleles reaches a frequency between a and b in the population, and let $P(X = x)$ be the probability for this founder that x individuals in the sample carry his allele:

$$P(Y \in\, ]a, b]) = \sum_{x=0}^{N} P(Y \in\, ]a, b] \mid X = x)\, P(X = x) \qquad (2)$$

where $P(Y \in\, ]a, b] \mid X = x)$ is computed from a normal distribution with mean $x/N$ and standard deviation $x(1 - x)/N$, N being the sample size. $P(X = x)$ is given by the simulation results.
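The sum in Eq. (2) can be sketched in a few lines of Python. Several points here are assumptions rather than the author's stated procedure: the standard deviation is taken to be the usual binomial standard error $\sqrt{\hat{p}(1 - \hat{p})/N}$ with $\hat{p} = x/N$, the case x = 0 is treated as a point mass at 0, and the simulation tally `counts` is invented for illustration.

```python
import math


def normal_cdf(z, mean, sd):
    """Cumulative distribution function of a normal(mean, sd) variable."""
    return 0.5 * (1.0 + math.erf((z - mean) / (sd * math.sqrt(2.0))))


def prob_in_range(counts, n_sample, a, b):
    """Eq. (2): sum over x of P(Y in ]a, b] | X = x) * P(X = x)."""
    n_sims = sum(counts.values())
    total = 0.0
    for x, n in counts.items():
        p_hat = x / n_sample
        # Assumed standard error of a proportion (binomial approximation).
        sd = math.sqrt(p_hat * (1.0 - p_hat) / n_sample)
        if sd == 0.0:
            conditional = 1.0 if a < p_hat <= b else 0.0  # degenerate case
        else:
            conditional = normal_cdf(b, p_hat, sd) - normal_cdf(a, p_hat, sd)
        total += conditional * (n / n_sims)  # n / n_sims estimates P(X = x)
    return total


# Hypothetical tally: x carriers in the sample -> number of simulations.
counts = {0: 40000, 10: 5000, 36: 5000}
p_target = prob_in_range(counts, n_sample=891, a=0.035, b=0.05)
print(p_target)
```

With 891 sampled individuals, 36 carriers corresponds to a sample carrier frequency of about 0.04, so most of the probability in this invented tally comes from the x = 36 term.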

Thus for each founder I calculate the probability that 1 of his 2 alleles will reach a given target frequency in the population.

Results

Figure 1 shows the result of 50,000 simulations for 3 different founders. The abscissa represents the value of x [Eq. (2)], and the value on the y axis divided by 50,000 is P(X = x). Using this value as P(X = x) in Eq. (2), I can calculate the probability for 1 founder's allele that it reaches a given frequency range in the population.

Table 2 gives the results of 50,000 simulations for the 15 founders with the highest probability that an allele they carry reaches a carrier frequency between 0.035 and 0.050 in the population. The range 0.035-0.05 is the carrier frequency for the 4 most important specific recessive disorders in the population. Among the 2631 founders only 247 see 1 of their alleles reach a carrier frequency greater than 0.02 in the sample.

Neutrality versus Lethality. The effect of lethality is rather small, except for high carrier frequencies (greater than 0.05; Table 2). For some founders the lethality reduces the probability that 1 of the founder's alleles reaches a frequency greater than 0.05 but simultaneously increases the probability that it will reach 0.035-0.050. This trade-off leads to a higher probability for a lethal allele to reach 0.035-0.050 compared with a neutral one. The differences between these 2 probabilities in Table 2 reflect the potential consanguinity attributable to 1 founder.

Figure 2 shows the distribution of the 247 most important founders according to the probability that an allele they carry reaches the target frequency in the population for a recessive lethal allele. Only 4 founders have a probability higher than 10% of reaching the target frequency in their contemporary progeny. Twenty-one founders have a probability between 1% and 10%, and 15 have a probability between 1‰ and 1%. All the remaining founders have a probability of less than 1‰ that an allele they carry will reach the target carrier frequency in the present-day Saguenay population.

Discussion

In the absence of selection and mutation the changes in allele frequency come from 2 different processes: (1) changes resulting from differential demographic behavior (these are measured by the diversity in the genetic contribution) and (2) changes in allele frequency resulting from stochastic events related to Mendelian segregation (these are measured by the simulations). Using the allele-dropping method I have been able to calculate for an allele carried by a given founder the probability that the allele could have reached the frequency interval of 0.035-0.05 whether the allele is neutral or lethal. The results show that the fate of an allele is almost identical whether it is neutral or a recessive lethal because these alleles almost never reach a frequency high enough for selection to act. Twenty-five founders have a probability of more than 1% that the allele will reach the target frequency. Moreover, for 2 founders there is a probability of more than 20% that a lethal allele they introduced into the population will reach a carrier frequency between 0.035 and 0.05. I can therefore conclude that the simplest hypothesis of 1 founder introducing 1 disorder is consistent. High frequencies of some recessive disorders are expected from the historical demography of the population.

The results from simulations on the genealogies are consistent with the molecular data available on this population. The hypothesis that only 1 founder introduced a rare variant into the population is a consistent one. I do not rule out the hypothesis that more than 1 founder introduced the same variant. Most of the founders came from different regions of France (Heyer et al. 1997). Because inherited disorders specific to Saguenay are not found in the present-day French population, they were probably not widely dispersed in France. But some of the founders came from the same small area in France and could have introduced the same variant. On average, 1.4 alleles reach the frequency range 0.035-0.05; because I chose 4 disorders at this frequency in the population, a rough estimate of the number of recessive lethal alleles that each individual carries is about 3. From studies on children from incestuous matings, it has been estimated that the number of recessive lethal alleles per individual is between 4 and 5, well in line with Muller's estimate (Vogel and Motulsky 1986).

For 2 founders the probability that a lethal allele they introduced reaches a frequency greater than 0.05 is also high (17%). No lethal recessive disorders have been detected at such a high frequency in the Saguenay population. This can be explained by 2 factors. The first factor is chance: If each of these 2 founders carries only 1 recessive lethal disorder, there is a probability of 0.69 that neither allele reaches a frequency higher than 0.05. Even if we assume that each of these 2 founders carries 4 recessive lethal mutations, the probability is 0.22 that none of these 8 mutations reaches a frequency higher than 0.05. The second factor is underestimation of the carrier frequency in the population. Because the carrier frequency is estimated from incidence of the disease among newborns, the data do not include loss of fetuses resulting from miscarriages. Therefore some disorders could have a higher carrier frequency than 0.05.
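The chance argument above can be checked with a few lines of arithmetic. The sketch assumes that the fates of the alleles are independent and that each has the quoted 17% probability of exceeding 0.05; the variable names are hypothetical. The last line shows one possible reading of the "about 3" estimate (4 observed disorders divided by 1.4 expected alleles), which is an assumption about how that figure was obtained.

```python
p_above = 0.17  # probability that one lethal allele exceeds a frequency of 0.05

# Probability that none of k independent alleles exceeds 0.05.
p_none_of_2 = (1 - p_above) ** 2   # 1 lethal allele in each of 2 founders
p_none_of_8 = (1 - p_above) ** 8   # 4 lethal alleles in each of 2 founders

# Assumed reading of the rough estimate of lethal alleles per individual.
lethals_per_individual = 4 / 1.4

print(round(p_none_of_2, 2), round(p_none_of_8, 3), round(lethals_per_individual, 1))
```

The first value reproduces the 0.69 in the text. The second comes out close to the quoted 0.22; the small difference presumably reflects rounding of the 17% figure.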

In the absence of mutation, selection, and migration the process of allele frequency change is drift. In fact, this drift can be decomposed into 2 factors: (1) differential reproduction and (2) sampling process from 1 generation to the next as a result of Mendelian segregation. Differential reproduction is strong in the Saguenay population. Starting from an initial carrier frequency between 0.0005 and 0.005, the expected carrier frequency of some alleles in the contemporary population is 0.03 (genetic contribution of founder 355; Table 2). But when a stochastic factor is taken into account, there is wide dispersion around this expected value (see Figure 1). Therefore genetic contribution alone is not sufficient to study the fate of an allele in a population, and any study that aims to trace back the demographic history of a population using molecular data should be based on a large number of loci. The method of allele dropping could be applied easily to other populations to measure the probability of finding recessive lethal disorders at any given frequency in these populations when samples of genealogies are available.

The results lead to a more general question: Why is there such a discrepancy between the expected value from Thompson and Neel's (1978, 1996) model and the frequencies of the inherited disorders in the Saguenay population? From Thompson and Neel's model the probability that a given allele will reach a frequency of 0.035-0.05 is about $10^{-31}$. From my results this value is $10^{27}$-fold higher: Among 5262 founder alleles, on average 1.4 reach a carrier frequency in the range 0.035-0.05. Therefore the probability that any given allele will reach this carrier frequency is $1.4/5262 = 2.7 \times 10^{-4}$. Because more and more models that imply knowledge of the demography of the population are used for linkage studies (Kaplan et al. 1995; Thompson and Neel 1997; Slatkin 1994), this question should be addressed before any attempt is made to use these powerful methods on the Saguenay population, and any analysis should be done with caution on any population. A previous study of linkage disequilibrium for pseudo-vitamin D-deficiency rickets demonstrates this problem (Labuda et al. 1996; Austerlitz and Heyer 1999): Either with Kaplan's branching process or with Luria and Delbruck's correction, the best estimate for the recombination rate between the pseudo-vitamin D-deficiency rickets mutation and a linked haplotype is 0.05. From GENETHON this recombination rate is known to be 0.07. The discrepancy between the allele frequencies and the recombination estimate implies the same underlying demographic process.
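The comparison between the simulation-based probability and the model-based probability quoted in the Discussion can be reproduced directly from the figures in the text. The variable names below are hypothetical; the numbers (2631 founders, 1.4 alleles on average, and the model probability of 2.35 × 10^-31) are those given in the paper.

```python
n_founders = 2631
n_founder_alleles = 2 * n_founders          # 2 unique alleles per founder

# Average number of founder alleles reaching the 0.035-0.05 carrier range.
mean_alleles_in_range = 1.4
p_simulated = mean_alleles_in_range / n_founder_alleles

# Prior probability from Thompson and Neel's model, as quoted in the text.
p_model = 2.35e-31
fold_difference = p_simulated / p_model

print(n_founder_alleles, f"{p_simulated:.2e}", f"{fold_difference:.2e}")
```

The ratio comes out at roughly 10^27, which is the fold difference stated in the text.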

Acknowledgments We thank Denis Gauvreau, President of the Societe Algene Biotechnologies Inc., who provided authorization to use his genealogical data; and Gerard Bouchard, Director of Interuniversity Institute for Population Research (IREP) for giving access to the genealogical database. We also thank Frederic Austerlitz, Damien Labuda, Marc Tremblay, and Andre Langaney for their comments on a former draft of this paper and 1 anonymous reviewer and J.D. Terwilliger for their helpful comments.

Received 22 December 1997; revision received 15 June 1998.

Literature Cited

Austerlitz, F., and E. Heyer. 1999. Impact of demographic distribution and population growth rate on haplotypic diversity linked to a disease gene and their consequences for the estimation of recombination rate: Example of a French Canadian population. Genet. Epidemiol. (in press).

Bouchard, G., and M. De Braekeleer, eds. 1991. Histoire d'un genome: Population et genetique dans l'est du Quebec. Sillery, Canada: Presses de l'Universite du Quebec.

Cazes, M.H. 1986. Genetic origins of the Dogon population in the Arrondissement of Boni (Mali). Am. J. Hum. Genet. 39:96-111.

Charbonneau, H., A. Guillemette, J. Legare et al. 1987. Naissance d'une population: les Francais etablis au Canada au XVIIe siecle. Montreal, Canada, and Paris, France: Editions de l'Institut National d'Etudes Demographiques, Presses de l'Universite de Montreal, and Presses Universitaires de France.

de la Chapelle, A. 1993. Disease genes mapping in isolated human populations: The example of Finland. J. Med. Genet. 30:857-865.

Edwards, A.W.F. 1968. Simulation studies of genealogies. Heredity 23:628.

Edwards, A.W.F. 1992. The structure of the polar Eskimo genealogy. Hum. Hered. 42:242-252.

Hastbacka, J., A. de la Chapelle, I. Kaitila et al. 1992. Linkage disequilibrium mapping in isolated founder populations: Diastrophic dysplasia in Finland. Natur. Genet. 2:204-211.

Heyer, E. 1991. Etude démogénétique d'une population humaine: Cas de la maladie de Rendu-Osler. These de Science, Lyon, France, p. 194.

Heyer, E., and M. Tremblay. 1995. Variability of the genetic contribution of Quebec population founders associated to some deleterious genes. Am. J. Hum. Genet. 56:970-978.

Heyer, E., M. Tremblay, and B. Desjardins. 1997. The seventeenth century European origins of hereditary diseases in the Saguenay population (Quebec, Canada). Hum. Biol. 69(2):209-225.

Jacquard, A. 1977. Concepts en genetique des populations. Paris, France: Masson.

Kaplan, N.L., W.G. Hill, and B.S. Weir. 1995. Likelihood methods for locating disease genes in nonequilibrium populations. Am. J. Hum. Genet. 56:18-32.

Labuda, M., D. Labuda, M. Korab-Laskowska et al. 1996. Linkage disequilibrium analysis in young populations: Pseudo-vitamin D deficiency rickets and the founder effect in French Canadians. Am. J. Hum. Genet. 59:633-643.

MacCluer, J.W., J.L. VandeBerg, B. Read et al. 1986. Pedigree analysis by computer simulation. Zoo Biol. 5:147-160.

Norio, R., H.R. Nevanlinna, and J. Perheentupa. 1973. Hereditary diseases in Finland: Rare flora in rare soil. Ann. Clin. Res. 5:109-141.

O'Brien, E., R.A. Kerber, L.B. Jorde et al. 1994. Founder effect: Assessment of variation in genetic contributions among founders. Hum. Biol. 66:185-204.

Pouyez, C., Y. Lavoie, G. Bouchard et al. 1983. Les Saguenayens. Sillery, Canada: Presses de l'Universite du Quebec.

Roberts, D.F. 1968. Genetic effects of population size reduction. Nature 220:1084-1088.

Roberts, D.F., and J.C. Bear. 1980. Measures of genetic change in an evolving population. Hum. Biol. 52:773-786.

Slatkin, M. 1994. Linkage disequilibrium in growing and stable populations. Genetics 137:331-336.

Thomas, A. 1990. Comparison of an exact and a simulation method for calculating gene extinction probabilities in pedigrees. Zoo Biol. 9:259-274.

Thompson, E.A., and J.V. Neel. 1978. Probability of founder effect in a tribal population. Proc. Natl. Acad. Sci. USA 75:1442-1445.

Thompson, E.A., and J.V. Neel. 1996. Private polymorphisms: How many? How old? How useful for genetic taxonomies? Molec. Phylogenet. Evol. 5:220-231.

Thompson, E.A., and J.V. Neel. 1997. Allelic disequilibrium and allele frequency distribution as a function of social and demography history. Am. J. Hum. Genet. 60:197-204.

Vogel, F., and A. Motulsky. 1986. Human Genetics. Berlin, Germany: Springer-Verlag.

E. HEYER1

1 Laboratoire d'Anthropologie Biologique, CNRS UMR 152, Musee de l'Homme, 17 place du Trocadéro, 75016 Paris, France. E-mail: eheyer@mnhn.fr.

Copyright Wayne State University Press Feb 1999
