Parida Rachman: 2015

Sunday, 29 March 2015

Journal Summary

Journal	BMC Proceedings
Title	Association Analysis of Whole Genome Sequencing Data Accounting for Longitudinal and Family Designs
Year	2014
Research Method	This study focuses on the first replicate of the simulated systolic and diastolic blood pressure (SBP and DBP) phenotypes and on the genetic data from sequencing and imputation on chromosome 3 only, because of limited computational resources. The analyses were performed without knowledge of the underlying simulation model. We obtained 849 individuals from 20 pedigrees, of whom 142 are unrelated. All individuals have SBP and DBP measurements at three time points with no missing data, as well as age, gender, smoking status, and antihypertensive medication status. Note that we preadjust the SBP and DBP measurements for antihypertensive medication status (i.e., increasing SBP by 10 mm Hg and DBP by 5 mm Hg if the subject is taking medication). We define common variants as those with minor allele frequencies (MAFs) of 5% or greater and obtain 403,098 SNPs on chromosome 3. We jointly analyze rare variants by mRNA transcripts, which are the functional products of genes. We exclude transcripts whose total rare allele frequency (i.e., the sum of MAFs over all included variants) is less than 0.01 and end up with a total of 813 transcripts represented by accession numbers. Given a common single-nucleotide polymorphism (SNP) or a transcript, we consider three forms of the phenotype: (a) the baseline, (b) the time-averaged, and (c) the repeated measurements. For study subjects, we consider: (a) the entire pedigree-based sample, and (b) the unrelated subjects only. Crossing the three phenotype forms with the two samples gives 6 tests.
Subject	Figure 1 displays the quantile-quantile (QQ) plots of p-values for testing the association between common SNPs and SBP. All 6 tests had proper type I error control, as their genomic control parameters (λ) are close to 1. This suggests that the data are well described by our models and that population stratification is appropriately adjusted for by the principal components (PCs). Clearly, using all pedigree-based samples is substantially more powerful than using the unrelated subjects only. In addition, using the averaged SBP yielded smaller p-values for the top SNPs than using the baseline or repeated measurements, and using the repeated measurements is slightly more powerful than using the baseline. This pattern can also be seen in Figure 2. The top five SNPs based on the method using the averaged SBP and all subjects are listed in Table 1, whose last column provides the refined p-values from model (4). Note that the use of model (4) does not alter the aforementioned order based on power, although it tends to slightly improve on the use of model (5). Using the Bonferroni correction, the genome-wide significance threshold is 1.3 × 10^-7, at which the top five SNPs can be declared genome-wide significant by any method. Note that we focused only on chromosome 3, so what we are assessing is in fact chromosome-wide significance. For testing the association between rare variants and SBP, the 6 tests also have controlled type I error (see Figure 3 for QQ plots of p-values). Again, compared with the unrelated subset, the relatives added considerable information on the associations of the top three transcripts. All three types of SBP generated comparable power with all individuals, and the three consensus top transcripts are described in Table 2. Using the Bonferroni correction, the genome-wide significance threshold is 6.2 × 10^-5, at which the three top transcripts can be declared genome-wide significant by any method.
All of the identified common and rare variants map to the gene MAP4, which spans positions 47,892,180 to 48,130,769 on chromosome 3. The results of testing the genetic association with DBP show patterns similar to those for SBP (data not shown). In particular, using the averaged DBP yielded better power than using the repeated measurements. Tables 1 and 2 provide the top common SNPs and transcripts, respectively.
Conclusion	Most genome-wide association studies (GWAS) have focused on the population-based design, which maximizes the power per genotyped subject. Our results demonstrate that including family members can also significantly boost power. GWAS have also typically ignored the longitudinal nature of the phenotype data, which are available from many prospective cohorts. Using longitudinal data can provide a more accurate measurement of the phenotype and thus serves as a powerful tool in genetic association studies.
Suggestion	The association analysis, which is modeled through the fixed-effect parameters, can be readily extended to linkage analysis by including another set of random-effect parameters whose covariances depend on the proportion of alleles shared identical by descent at the marker locus between a relative pair.
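
The phenotype preparation described under the method can be illustrated with a small Python sketch. It applies the medication offsets quoted in the summary (10 mm Hg for SBP, 5 mm Hg for DBP) and then derives the baseline, time-averaged, and repeated-measurement forms of the phenotype. The function names, the tuple layout, and the example readings are hypothetical and chosen only for illustration; this is not the authors' code.

```python
from statistics import mean

# Offsets taken from the summary: add 10 mm Hg to SBP and 5 mm Hg to DBP
# when the subject is on antihypertensive medication.
SBP_OFFSET = 10.0
DBP_OFFSET = 5.0


def adjust_for_medication(value, on_medication, offset):
    """Return the reading with the medication offset added when applicable."""
    return value + offset if on_medication else value


def phenotype_summaries(visits):
    """Build the three phenotype forms for one subject.

    visits: list of (sbp, dbp, on_medication) tuples in time order
    (hypothetical layout; the summary does not describe the data format).
    """
    sbp = [adjust_for_medication(s, m, SBP_OFFSET) for s, _, m in visits]
    dbp = [adjust_for_medication(d, m, DBP_OFFSET) for _, d, m in visits]
    return {
        "baseline": (sbp[0], dbp[0]),        # first time point only
        "averaged": (mean(sbp), mean(dbp)),  # mean over all time points
        "repeated": list(zip(sbp, dbp)),     # every time point kept
    }


# Made-up readings for a single subject at three time points.
example_visits = [(130.0, 85.0, False), (128.0, 82.0, True), (126.0, 80.0, True)]
summary = phenotype_summaries(example_visits)
print(summary)
```

Each of the three entries would then serve as the response in one of the association tests, either in the full pedigree sample or in the unrelated subset.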

Association Analysis of Whole Genome Sequencing Data Accounting for Longitudinal and Family Designs
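
The two diagnostics quoted in the summary, the Bonferroni threshold and the genomic control parameter λ, can be sketched in a few lines of Python. The family-wise level of 0.05 is an assumption (the summary does not state it), and the helper names are hypothetical. λ is computed here in one common way: each p-value is converted to a 1-degree-of-freedom chi-square statistic, and the median statistic is divided by the median of that distribution. The summary does not say how the authors computed it.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1_MEDIAN = 0.4549364


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold; alpha = 0.05 is an assumed level."""
    return alpha / n_tests


def genomic_control_lambda(p_values):
    """Median chi-square statistic over its expected median under the null.

    Each two-sided p-value (strictly between 0 and 1) is mapped to a
    1-df chi-square statistic via the squared normal quantile.
    """
    std_normal = NormalDist()
    chi2_stats = [std_normal.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1_MEDIAN


# Test counts taken from the summary: 403,098 common SNPs and 813 transcripts.
snp_threshold = bonferroni_threshold(403_098)
transcript_threshold = bonferroni_threshold(813)

# Evenly spaced p-values mimic a well-calibrated test, so lambda should be near 1.
n = 1001
null_p_values = [(i - 0.5) / n for i in range(1, n + 1)]
null_lambda = genomic_control_lambda(null_p_values)

print(snp_threshold, transcript_threshold, null_lambda)
```

Under the assumed level, both thresholds come out on the order of those quoted in the summary. A λ well above 1 would point to inflated test statistics, for example from unadjusted population stratification.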
