Journal
|
BMC Proceedings
|
Title
|
Association Analysis of Whole Genome Sequencing Data Accounting for Longitudinal and Family Designs
|
Year
|
2014
|
Research Method
|
This study focuses on the first replicate of the simulated systolic and
diastolic blood pressure (SBP and DBP) data and on the genetic data from
sequencing and imputation on chromosome 3 only, because of limited
computational resources. The analyses were performed without knowledge of the
underlying simulation model. We obtained 849 individuals from 20 pedigrees,
142 of whom are unrelated. All individuals have SBP and DBP measurements at
three time points with no missing data, as well as age, gender, smoking
status, and antihypertensive medication status. Note that we preadjust the
SBP and DBP measurements for antihypertensive medication status (i.e.,
increasing SBP by 10 mm Hg and DBP by 5 mm Hg if the subject is taking
medication). We define common variants as those with minor allele frequencies
(MAFs) of 5% or greater and obtain 403,098 SNPs on chromosome 3. We jointly
analyze rare variants by mRNA transcripts, which are the functional products
of genes. We exclude transcripts whose total rare allele frequency (i.e., the
sum of MAFs over all included variants) is less than 0.01 and end up with a
total of 813 transcripts represented by accession numbers.
Given a common single-nucleotide polymorphism (SNP) or a transcript, we
consider three choices for the phenotype:
(a) the baseline,
(b) the time-averaged, and
(c) the repeated measurements.
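For the study subjects, we consider using:
(a) the entire pedigree-based sample, and
(b) the unrelated subjects only.
Crossing the three phenotype choices with the two sample choices gives the 6
tests compared in the results.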
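The medication preadjustment and the three phenotype choices can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names, the example readings, and the assumption that medication status is recorded at every visit are all hypothetical. Only the 10 mm Hg and 5 mm Hg offsets come from the text.

```python
from statistics import mean

# Offsets quoted in the text: +10 mm Hg (SBP) and +5 mm Hg (DBP) when on medication.
SBP_OFFSET = 10.0
DBP_OFFSET = 5.0


def preadjust(value, on_medication, offset):
    """Add the medication offset to a single blood pressure reading."""
    return value + offset if on_medication else value


def build_phenotypes(readings, medication, offset):
    """Return the three phenotype choices for one subject.

    readings   -- blood pressure at each visit, in visit order
    medication -- matching booleans, True if treated at that visit
                  (per-visit status is an assumption of this sketch)
    """
    adjusted = [preadjust(v, m, offset) for v, m in zip(readings, medication)]
    return {
        "baseline": adjusted[0],          # (a) first visit only
        "time_averaged": mean(adjusted),  # (b) mean over all visits
        "repeated": adjusted,             # (c) each visit kept as its own observation
    }


# Hypothetical subject: three SBP readings, treated at visits 2 and 3.
sbp = build_phenotypes([150.0, 138.0, 134.0], [False, True, True], SBP_OFFSET)
print(sbp)
```

Choices (a) and (b) give one value per subject, while (c) keeps three correlated values per subject, so it needs a model that accounts for within-subject correlation.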
|
Subject
|
Figure 1 displays the quantile-quantile (QQ) plots of p-values for testing
the association between common SNPs and SBP. All 6 tests maintained proper
type I error, as their genomic control parameters λ are all close to 1. This
suggests that the data are well described by our models and that population
stratification is appropriately adjusted for by the principal components
(PCs). Clearly, using the full pedigree-based sample is substantially more
powerful than using the unrelated subjects only. In addition, using the
averaged SBP yielded smaller p-values for the top SNPs than using the
baseline or repeated measurements, and using the repeated measurements is
slightly more powerful than using the baseline. This pattern can also be seen
in Figure 2. The top five SNPs based on the method using the averaged SBP and
all subjects are listed in Table 1, whose last column provides the refined
p-values from model (4). Note that the use of model (4) does not alter the
aforementioned ordering by power, although it tends to improve slightly on
model (5). Using the Bonferroni correction, the genome-wide significance
threshold is 1.3 × 10⁻⁷, at which the top five SNPs can be declared
genome-wide significant by any method. Note that we focused only on
chromosome 3, so what we are assessing is in fact chromosome-wide
significance.
For testing the association between rare variants and SBP, the 6 tests also
have controlled type I error (see Figure 3 for QQ plots of p-values). Again,
compared with the unrelated subset, the relatives added considerable
information on the associations of the top three transcripts. With all
individuals, the three types of SBP phenotype yielded comparable power, and
the three consensus top transcripts are described in Table 2. Using the
Bonferroni correction, the genome-wide significance threshold is 6.2 × 10⁻⁵,
at which the three top transcripts can be declared genome-wide significant by
any method. All of the identified common and rare variants map to the gene
MAP4, which spans positions 47,892,180 to 48,130,769 on chromosome 3.
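The two summary quantities used above, the genomic control parameter λ and the Bonferroni threshold, can be computed as in the following sketch. It assumes that each p-value comes from a two-sided test with a 1-degree-of-freedom chi-square statistic and that the family-wise level is 0.05. The summary states neither, and the function names are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_control_lambda(p_values):
    """Median observed 1-df chi-square statistic divided by its null median."""
    # A two-sided p-value p maps back to the statistic z**2 with z = Phi^{-1}(p / 2).
    stats = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(stats) / CHI2_1DF_MEDIAN


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests


# Evenly spread p-values mimic a well-calibrated test, so lambda is close to 1.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
print(genomic_control_lambda(null_like))

print(bonferroni_threshold(403_098))  # common SNPs on chromosome 3
print(bonferroni_threshold(813))      # transcripts
```

Under these assumptions the transcript threshold is about 6.15 × 10⁻⁵, which rounds to the reported 6.2 × 10⁻⁵. The SNP threshold comes out near 1.24 × 10⁻⁷, slightly below the reported 1.3 × 10⁻⁷. The summary does not explain the small gap.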
The results of testing the genetic association with DBP show patterns similar
to those for SBP (data not shown). In particular, using the averaged DBP
yielded better power than using the repeated measurements. Tables 1 and 2
provide the top common SNPs and transcripts, respectively.
|
Conclusion
|
Most genome-wide association studies (GWAS) have focused on the
population-based design, which maximizes the power per genotyped subject. Our
results demonstrated that including family members can also significantly
boost the power. Most GWAS have also ignored the longitudinal nature of the
phenotype data, which are available from many prospective cohorts. The use of
the longitudinal data can provide a more accurate measurement of the
phenotype and thus serves as a powerful tool in genetic association studies.
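A short worked equation illustrates why averaging can give a more accurate phenotype. It is a simplified illustration, not the model used in the paper. It assumes that the m repeated measurements of a subject share a common variance σ² and a common pairwise correlation ρ, and these symbols are introduced here only for the example.

```latex
\operatorname{Var}(\bar{y})
  = \frac{1}{m^{2}} \left[ m\,\sigma^{2} + m(m-1)\,\rho\,\sigma^{2} \right]
  = \sigma^{2}\,\frac{1 + (m-1)\rho}{m}
  \;\le\; \sigma^{2}
```

With m = 3 time points and, say, ρ = 0.5, the variance of the averaged phenotype is two thirds of the single-measurement variance. Equality holds only when ρ = 1, that is, when the repeated measurements carry no additional information.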
|
Suggestion
|
The association analysis, which is modeled through the fixed-effect
parameters, can be readily extended to linkage analysis by including another
set of random-effect parameters whose covariances depend on the proportion of
alleles shared identical by descent at the marker locus between a relative
pair.
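One common way to write such an extension is the variance-components form below. It is a generic sketch and not a formula from the paper. All symbols are assumptions made for the illustration: φ_ij is the kinship coefficient of relatives i and j, π_ij is their proportion of alleles shared identical by descent at the marker locus, δ_ij equals 1 if i = j and 0 otherwise, and the σ² terms are the polygenic, linkage, and residual variance components.

```latex
\operatorname{E}(y_i) = \mathbf{x}_i^{\top}\boldsymbol{\beta} + g_i\,\gamma,
\qquad
\operatorname{Cov}(y_i, y_j)
  = 2\,\phi_{ij}\,\sigma_a^{2} + \pi_{ij}\,\sigma_q^{2} + \delta_{ij}\,\sigma_e^{2}
```

In this form the association signal sits in the fixed effect γ of the genotype g_i, while linkage is tested through the extra variance component, for example by testing σ_q² = 0.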
|
Sunday, 29 March 2015
Journal Summary