BIOL2007 - INBREEDING AND NEUTRAL EVOLUTION
SO FAR, we have dealt chiefly with deterministic evolution, via natural selection.
TODAY, we explore the effects of finite population size and inbreeding on genetic variation, and show that these can lead to random evolutionary change (or "drift"). Mutation is, of course, a sort of random genetic change, but genetic drift can work much faster.
First we must study the theory of inbreeding, which can be "regular", for instance the sib-sib mating practised by the Pharaohs of Ancient Egypt, or can arise as a simple effect of random mating in small populations. We first study regular systems of inbreeding, then go on to how small population sizes can cause both genetic drift and inbreeding.
MEASURING INBREEDING
If an individual mates with a relative (or with itself, as in some plants or snails), the offspring may be homozygous for a copy of an allele which is identical by descent from one of the ancestors:
... in the diagram, a male is homozygous for two copies of an allele, both inherited from a single copy in an ancestor. This is partly because his mum was also his dad's niece (a type of inbreeding that is common in many human societies).
The INBREEDING COEFFICIENT, F, is used to gauge the strength of inbreeding. F = probability that two alleles in an individual are identical by descent (IBD). F stands for fixation index, because of the increase in homozygosity, or fixation, that results from inbreeding.
Note: two alleles that are identical by descent must be identical in state. However, a homozygote for an identifiable allele can often be produced without inbreeding in its recent ancestry. Thus identity in state does not necessarily imply identity by descent.
Is inbreeding always bad?
Inbreeding is not generally recommended because of the existence of deleterious recessive alleles in most populations. Although these should be rare per gene (usually much less than $10^{-3}$, see mutation-selection balance), there will be many deleterious alleles per genome. According to some estimates, you and I each carry about 1 strongly deleterious hidden mutation. When homozygous, these mutations reduce fitness; inbreeding will therefore lead to inbreeding depression as the homozygous mutations become expressed.
However, inbreeding isn't all bad, and many organisms habitually inbreed. Animals such as fig wasps and certain parasites regularly mate with their siblings, and selfing is common in many of the most aggressive weeds of agriculture. The advantage is presumably ecological, since a single female can then colonize an empty resource or host. There may also be a genetic advantage in preventing recombination between adaptive loci. One assumes deleterious recessives in habitually inbreeding species have mostly been purged by selection.
In human societies where some families have a lot of wealth, or where a bridal dowry is paid, inbreeding is common. Examples are found among European royal families and on the Indian subcontinent. Perhaps here the idea is to prevent the "recombination" of wealth with other families!
In any case, mild inbreeding, such as mating between first cousins or between uncle and niece, isn't so dangerous. Charles Darwin married his first cousin, Emma Wedgwood, and had an astonishing 10 children. Some were sickly or died young, but this was common in the days before penicillin.
REGULAR SYSTEMS OF INBREEDING
We can measure F easily in regular systems of inbreeding, using Sewall Wright's method of "path analysis":
1) Find each path that alleles may take to become IBD.
2) Find the number of path segments (x) between gametes (eggs or sperm) through a single ancestor in common in each path.
3) Calculate the probability of IBD for each path. The probability that an allele is IBD between two gametes connected through an individual is 1/2. Thus, the probability of IBD for each path is $(1/2)^x$.
4) Add up the probabilities of each path to get the total probability of IBD.
Calculations like these are used in genetic counselling, and in animal breeding and in zoos to avoid inbreeding depression. Some examples:
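The four steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function name and the pedigrees are hypothetical, x is counted here as the number of individuals linking the two gametes along a path (including the common ancestor, excluding the inbred individual), and the common ancestors are assumed not to be inbred themselves.

```python
from fractions import Fraction


def inbreeding_coefficient(path_lengths):
    """Steps 3 and 4: add up (1/2)**x over every path to a common ancestor.

    path_lengths holds one x per path (steps 1 and 2 are done by hand
    from the pedigree).
    """
    return sum(Fraction(1, 2) ** x for x in path_lengths)


# Hypothetical pedigrees; each list entry is the x for one path.
# Assumption: no inbreeding other than that shown in the pedigree.
pedigrees = {
    "selfing": [1],            # one path through the single parent
    "full sibs": [3, 3],       # one path via each shared parent
    "uncle-niece": [4, 4],     # one path via each shared (grand)parent
    "first cousins": [5, 5],   # one path via each shared grandparent
}

for name, paths in pedigrees.items():
    print(name, inbreeding_coefficient(paths))
```

Under these assumptions the sketch gives F = 1/2 for selfing, 1/4 for full sibs, 1/8 for an uncle-niece mating and 1/16 for first cousins.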

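EFFECT OF INBREEDING ON POPULATIONS
Consider two alleles, A and a, with frequencies p and q, with inbreeding (IBD) at rate F:
Frequency of homozygotes:
$$\mathrm{freq}(AA) = (1-F)\,p^2\ [\text{outbred}] + F\,p\ [\text{inbred}]$$
$$= p^2 + Fp(1-p)$$
$$= p^2 + Fpq$$
Similarly, the frequency of the other homozygotes is $\mathrm{freq}(aa) = q^2 + Fpq$.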
All genotype frequencies must add to 1, so the extra 2Fpq AA and aa homozygotes must have come from the heterozygotes (which cannot be IBD, since they aren't even identical in state), and so overall, the frequencies are:

genotype     AA           Aa           aa
frequency    p^2 + Fpq    2pq(1-F)     q^2 + Fpq     Sum = 1
So, inbreeding leads to a reduction in heterozygosity within the population. The heterozygosity (Het, i.e. the proportion that are heterozygotes under inbreeding) is reduced by a fraction F compared with the outbred (Hardy-Weinberg) expectation $Het_{HW} = 2pq$:
$$Het = Het_{HW}\,(1 - F)$$
Therefore, as well as measuring a probability (of IBD), F also measures reduction of heterozygosity, or heterozygote deficit compared to Hardy-Weinberg. The heterozygote deficit = the level of inbreeding (in the absence of selection, assortative mating, migration, etc.).
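As a quick numerical check of the genotype frequencies above, here is a minimal Python sketch. The function name and the values p = 0.3, F = 0.25 are hypothetical, chosen only for illustration.

```python
def genotype_frequencies(p, F):
    """Genotype frequencies for alleles A (frequency p) and a (frequency q = 1 - p)
    when a fraction F of individuals carry alleles that are identical by descent."""
    q = 1.0 - p
    return {
        "AA": p * p + F * p * q,
        "Aa": 2.0 * p * q * (1.0 - F),
        "aa": q * q + F * p * q,
    }


# Hypothetical example values (not from the notes).
freqs = genotype_frequencies(p=0.3, F=0.25)
het_hw = 2 * 0.3 * 0.7  # Hardy-Weinberg heterozygosity, 2pq

print(freqs)
print("sum of frequencies:", sum(freqs.values()))
print("heterozygote deficit:", 1.0 - freqs["Aa"] / het_hw)
```

The three frequencies should sum to 1, and the heterozygote deficit recovered from the last line should equal the F that was put in.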
GENETIC DRIFT
Deterministic vs. stochastic evolution
The Hardy-Weinberg law is the basis of all population genetics theory, but it assumes that in the absence of selection or other evolutionary forces, absolutely no gene frequency change occurs during reproduction. This would be true in an infinitely large population; under these conditions, selection would be completely predictable and deterministic.
However, this is only approximately true in real populations of finite size. Assume a diploid population of constant size N. Each of the 2N alleles is copied into gametes, which unite to form the next generation. Even if the alleles are equal in fitness (neutral), some will not reproduce, while others will manage to transmit several copies to the next generation.
Below is an example of drift. Imagine a rare species kept in a zoo with a population of only six diploid individuals. There are a total of 12 alleles (numbered 1-12 in generation 0). All alleles are assumed equally fit, so that evolution is neutral. The alleles may also be genetically distinguishable, or "different in state" (represented by colours).
If the wild source population were large, all the alleles in generation 0 would have come from different ancestors; none would be identical by descent (IBD). However, by chance some alleles are lost in each generation. After a moderate number of generations, every allele will ultimately become a copy of just one of the original alleles, or IBD. In the diagram, all the alleles happen to become IBD to allele 1 by the 7th generation. Another way of saying this is that, looking backwards in time, the coalescence time of the alleles in the final population is 7 generations ago.
Alleles that are IBD must also be identical in state (barring mutation). Because the population has become fixed for allele 1, it has also become fixed for the allelic state to which allele 1 belongs ("yellow"). Usually, there are fewer allelic states than alleles, so that fixation of state (gen. 5, above) can happen earlier than identity by descent (gen. 7). Random evolution in frequency of allelic states is called genetic drift.
This kind of evolution is not predictable; it is random or stochastic. Stochastic evolution occurs in any finite population, whether or not selection is operating: no evolution is completely deterministic. Even in large populations, evolution is only approximately deterministic.
Drift is slower in larger populations. Why? If I tossed a coin twice, and got 2 heads, you would not be surprised. If I tossed 20 times, and got 20 heads, you would be very surprised. If I scored 200 heads in as many tosses, you would rightly suspect me of cheating. Similarly, if we have two alleles in a population (equivalent to heads and tails), we get a larger variance of allele frequency if we have a small population. This is equivalent to getting a more variable fraction of heads when tossing a coin a small number of times.
Predictable unpredictability (remember, science = accurate prediction!)
We can't predict exactly what is going to happen in genetic drift, but the distribution of results is known, and useful. We can quantify the following:
1) The mean gene frequency. The probabilities for two alleles in a single generation are given by the binomial distribution, with binomial probability p and number of trials n. The mean, or expected frequency in the future, is simply the binomial probability p (similarly, the average fraction of heads is 0.5; the same as the probability of a single head on each throw).
2) The variance of gene frequency after one generation. With q = 1 - p, the binomial variance of the frequency is:
$$\mathrm{Var}(p) = \frac{pq}{n}$$
The standard deviation (SD) of allele frequency is a good measure of the speed of genetic drift (remember, the mean stays the same). The SD is the square root of the variance; here, if N is the population size of a diploid population, then the total number of alleles (n in the binomial formula) is 2N, so the standard deviation of allele frequency after one generation is:
$$\mathrm{SD}(p) = \sqrt{\frac{pq}{2N}}$$
So suppose we are interested in the rate of drift of the yellow allele, which has initial frequency 0.583 in the diagram above. In a population with 2N = 12 alleles, the SD of allele frequency in a single generation will be 0.142; this contrasts with 0.049 for 2N = 100, and 0.016 for 2N = 1000. The 95% confidence limits of the gene frequency after a single generation can be calculated approximately, given that the binomial has an approximately normal distribution, as +/- 2 SDs from the mean.
Knowing the variance for a single generation, we can predict the long-term consequences of drift, including the probability distribution for allele frequency after a given number of generations. (The maths is, unfortunately, beyond this course!)
3) The probability that a particular allele will eventually be fixed. We know that one of the alleles will eventually take over; the probability that it will be any particular allele is simply the fraction that the allele has in the population initially, i.e. its starting frequency p (which is $1/(2N)$ for a single one of the 2N original copies).
4) Eventually, any population will become fixed for one of the original alleles, and we can also predict approximately how long this will take. Looking backwards, this is the coalescence time of a given population. The coalescence time is given by (rate of fixation)$^{-1}$ (see below) and will therefore be about 2N generations.
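The quantities in this list can be explored with a small simulation. The following Python sketch is illustrative only: the function names, the random seed and the number of replicates are assumptions, and each generation is modelled as 2N independent binomial draws, as in the simple theory above. It first computes the one-generation SD for the yellow allele (frequency 7/12, about 0.583), then estimates how often an allele starting at 7 copies out of 12 ends up fixed.

```python
import math
import random


def sd_allele_frequency(p, n_alleles):
    """Binomial SD of allele frequency after one generation: sqrt(pq / 2N)."""
    return math.sqrt(p * (1.0 - p) / n_alleles)


def next_generation(count, n_alleles, rng):
    """Draw 2N alleles at random (with replacement) from the current generation."""
    p = count / n_alleles
    return sum(1 for _ in range(n_alleles) if rng.random() < p)


def fixation_fraction(start_count, n_alleles, replicates, rng):
    """Fraction of replicate populations in which the focal allele is fixed."""
    fixed = 0
    for _ in range(replicates):
        count = start_count
        while 0 < count < n_alleles:  # drift until the allele is lost or fixed
            count = next_generation(count, n_alleles, rng)
        if count == n_alleles:
            fixed += 1
    return fixed / replicates


for n_alleles in (12, 100, 1000):
    print(n_alleles, round(sd_allele_frequency(0.583, n_alleles), 3))

rng = random.Random(1)  # arbitrary seed, for repeatability only
print("fraction fixed:", fixation_fraction(7, 12, 2000, rng))
```

The first loop should reproduce the SDs quoted above (0.142, 0.049 and 0.016), and the simulated fraction fixed should come out close to the starting frequency, 7/12.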

Genetic drift is important in nature. Here is a recent example from an Asian bramble (Rubus alceifolius) which is an introduced weed on some Pacific islands. Genetic variation was studied by means of a DNA fingerprint technique called "Amplified Fragment Length Polymorphisms", AFLP for short. Each vertical "lane" on the gel represents DNA from a single individual; each AFLP band is thought to represent an independent DNA fragment, and polymorphisms are revealed by presence or absence of bands. In its native range (Vietnam, right), this species is highly polymorphic, while in an introduced population (the island of Réunion, left), no polymorphisms are observed. This suggests that the founder population was very small, and that all variation has been lost. (See Amsellem L et al. 2000. Mol. Ecol. 9: 443-455, reproduced by permission.)
GENETIC DRIFT AS A CAUSE OF INBREEDING
As we have seen, inbreeding results from drift because alleles become identical by descent (IBD). We can therefore measure drift in terms of our inbreeding coefficient, F:
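In a population of size N, the probability that two alleles picked during random mating in generation t are IBD due to copying from generation t-1 is
$$\frac{1}{2N}$$
BUT the 2N alleles in the previous generation may be IBD themselves from inbreeding in previous generations. The fraction of alleles in generation t that are IBD because of inbreeding before generation t-1 is:
$$\left(1 - \frac{1}{2N}\right) F_{t-1}$$
Summing the inbreeding from previous generations together with inbreeding leading to the current generation at time t, we have:
$$F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1}$$
By definition, the heterozygosity after a single generation of inbreeding is $Het = Het_{HW}(1 - F)$, so that $F = 1 - Het/Het_{HW}$. (See above under EFFECT OF INBREEDING ON POPULATIONS.) Substituting into the above equation relating $F_t$ to $F_{t-1}$:
$$1 - \frac{Het_t}{Het_{HW}} = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)\left(1 - \frac{Het_{t-1}}{Het_{HW}}\right)$$
rearranging, and cancelling the $Het_{HW}$ ($Het_{HW} = 2pq$ remains the same between generations, because the expected gene frequency p remains the same, but the actual Het will change):
$$Het_t = \left(1 - \frac{1}{2N}\right) Het_{t-1}$$
therefore, after t generations of drift:
$$Het_t = Het_0 \left(1 - \frac{1}{2N}\right)^t$$
Thus, heterozygosity declines by a fraction of approximately $1/(2N)$ every generation; that is, it is multiplied by $\left(1 - \frac{1}{2N}\right)$ each generation.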
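The recursion for F can be checked numerically. The Python sketch below is illustrative only: the function names are hypothetical, and N = 6 with 7 generations is borrowed from the zoo example earlier in these notes. It iterates the per-generation rule and compares the result with the closed form $1 - (1 - 1/(2N))^t$ that follows from the heterozygosity equation when the starting population has F = 0.

```python
def inbreeding_by_recursion(N, generations, F0=0.0):
    """Apply F_t = 1/(2N) + (1 - 1/(2N)) * F_(t-1) one generation at a time."""
    F = F0
    for _ in range(generations):
        F = 1.0 / (2 * N) + (1.0 - 1.0 / (2 * N)) * F
    return F


def inbreeding_closed_form(N, generations):
    """Expected F after t generations of drift, starting from F = 0."""
    return 1.0 - (1.0 - 1.0 / (2 * N)) ** generations


# Zoo example: N = 6 diploid individuals, followed for 7 generations.
N, t = 6, 7
print("recursion:  ", inbreeding_by_recursion(N, t))
print("closed form:", inbreeding_closed_form(N, t))
print("Het_t / Het_0:", 1.0 - inbreeding_closed_form(N, t))
```

The two values of F should agree to within floating-point error, and the last line gives the expected fraction of the original heterozygosity that remains.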

(a) This is true only on average, because a single allele may have zero, one, two or more copies in the next generation. The factor $1/(2N)$ is an expected value, not an exact one.
(b) F can also measure inbreeding as a result of subdivision into two or more finite populations. Remember that when we assumed Hardy-Weinberg, we also assumed a lack of migration (i.e. mixing of populations).
When we sample from a number of sub-populations with different gene frequencies which do not mate randomly with each other, the heterozygote deficit gives us a measure of identity by descent produced by the population subdivision.
This between-population inbreeding is usually written $F_{ST}$, meaning inbreeding (F) due to subdivision into Subpopulations relative to the Total population.
For example, assume many populations of finite size N start from the same gene frequency and drift apart for t generations. Within each randomly mating population there is no heterozygote deficit, of course, but each population is accumulating identity by descent at a rate of $1/(2N)$ per generation (on average). Between populations, this results in an increasing heterozygote deficit, or deviation from Hardy-Weinberg. This heterozygote deficit is measured by $F_{ST}$. If all populations are of size N, the $F_{ST}$ should be equal to the level of identity by descent or inbreeding, F, produced on average by drift within each population relative to the initial source population. Neat, eh?!
You can try some simulations of drift yourself; go to natural selection and drift simulations. You can use some of these (DRIFT.EXE, and PDRIFT.EXE) to get an estimate of the level of inbreeding and heterozygote deficit (F or $F_{ST}$) accumulated during genetic drift of up to 100 populations.
$F_{ST}$ is widely used to study gene frequency variation over a geographic range as a measure of population subdivision. This topic, which we can't cover here (shame!), is often referred to as population structure.
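The claim that the heterozygote deficit between drifting populations matches the F expected from drift can also be sketched in Python. This is a rough illustration, not a substitute for the simulation programs mentioned above: the function names, seed and parameter values are assumptions, each population drifts independently with binomial sampling of 2N alleles, and the deficit is computed as 1 minus the ratio of mean within-population heterozygosity to the heterozygosity expected from the pooled gene frequency.

```python
import random


def drift_one_generation(p, n_alleles, rng):
    """Sample 2N alleles binomially; return the new allele frequency."""
    copies = sum(1 for _ in range(n_alleles) if rng.random() < p)
    return copies / n_alleles


def simulate_fst(N, generations, n_pops, p0, rng):
    """Heterozygote deficit among n_pops populations of size N that all
    start at frequency p0 and drift independently."""
    n_alleles = 2 * N
    freqs = [p0] * n_pops
    for _ in range(generations):
        freqs = [drift_one_generation(p, n_alleles, rng) for p in freqs]
    p_bar = sum(freqs) / n_pops
    het_total = 2 * p_bar * (1 - p_bar)                           # pooled expectation
    het_within = sum(2 * p * (1 - p) for p in freqs) / n_pops     # mean within populations
    return 1 - het_within / het_total


rng = random.Random(42)  # arbitrary seed, for repeatability only
observed = simulate_fst(N=10, generations=10, n_pops=2000, p0=0.5, rng=rng)
expected = 1 - (1 - 1 / (2 * 10)) ** 10
print("simulated deficit:", round(observed, 3))
print("expected F:       ", round(expected, 3))
```

With many populations the simulated deficit should fall close to the expected F; with only a handful of populations it will scatter widely around it, which is itself a reminder that drift is predictable only on average.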
EFFECTIVE POPULATION SIZE
Even with no deterministic bias, or natural selection, alleles usually do not have identical probability of being passed on, as required in these simple models. Population geneticists get around this by calculating an idealized, or effective population size that produces approximately the same rate of genetic drift in their simple models as does the actual population with all its complexity. The effective population size may be rather different from the actual population size. Two examples:
1) Separate sexes. The simple theory above assumes that a single individual may have two alleles IBD for a single allele in the previous generation. In fact, they can only do this if there is selfing. In dioecious organisms like us, this is not (yet!) possible. Separate sexes therefore enforce some outbreeding, and slow the buildup of identity by descent: the effective size is marginally larger than the actual population size.
2) Unequal sex ratio. In species which maintain harems, like the elephant seal (see later in SEX AND SEXUAL SELECTION), a single male may commandeer almost all the matings by fighting off other males. Similarly, in modern cow herds almost all females are fertilized artificially; a single bull provides enough sperm for thousands of offspring. Although there are millions of cows in Britain, calves are mostly progeny of very few bulls. The effective population size may therefore be in the hundreds rather than millions, because genes in the population are funnelled through these few bulls in every generation.
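To put rough numbers on the unequal sex ratio example, one can use the standard textbook approximation $N_e = 4 N_m N_f / (N_m + N_f)$ for a population with $N_m$ breeding males and $N_f$ breeding females. This formula is not given in these notes and is brought in here as an assumption, and the herd sizes in the Python sketch below are invented purely for illustration.

```python
def effective_size_unequal_sexes(n_males, n_females):
    """Standard approximation for effective population size with unequal
    numbers of breeding males and females (assumed; not from these notes)."""
    return 4.0 * n_males * n_females / (n_males + n_females)


# Hypothetical herd: 50 breeding bulls and 1,000,000 cows.
print(effective_size_unequal_sexes(50, 1_000_000))

# With equal numbers of each sex, the formula returns the census size.
print(effective_size_unequal_sexes(500, 500))
```

With these made-up numbers the effective size is about 200, i.e. in the hundreds even though the census size is in the millions, because it is dominated by the rarer sex.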
FINALE
During this lecture, we measured inbreeding using the inbreeding coefficient, F. We applied this method to regular systems of inbreeding, and then tried something a bit trickier: to use F to measure inbreeding due to genetic drift in finite populations.
The Hardy-Weinberg law is very useful, and simple models of natural selection work well most of the time. However, these models have the ever-so-slight drawback that they depend on an assumption of infinite population sizes. Before today, we modeled evolution in terms of infinitely divisible gene frequencies. In fact this simply doesn't work: some of the most interesting evolution happens when we mix random genetic drift, due to finite population sizes, with deterministic forces such as selection. Drift may, or may not, be important in evolution, but it always happens, because populations are always finite.
For now, it is worth knowing that the equation $Het_t = Het_0 \left(1 - \frac{1}{2N}\right)^t$ characterizes perhaps the most important genetic problem in conservation. The equation will be important in any species with low overall N; for instance in many endangered large mammals, such as Asiatic lions in the Gir forest in India, Florida panthers, and Sumatran rhinos.
Well! That's probably enough for today!
FURTHER READING
FUTUYMA, DJ 1998. Evolutionary Biology. Chapter 11 (pp. 297-314).
Population Structure lecture notes (optional!).