Principal Investigator  
Principal Investigator's Name: Andrew Sharp
Institution: Icahn School of Medicine at Mount Sinai
Department: Genetics and Genomic Sciences
Country:
Proposed Analysis: Variation in tandem repeats (TRs), particularly large expansions of triplet repeats (e.g., polyCAG), is known to cause a number of late-onset neurological diseases. Because TRs are repetitive and degenerate, variation in them is typically ignored by standard genome analysis pipelines. Furthermore, pathogenic repeat expansions typically span hundreds to thousands of bases, much longer than an Illumina sequencing read, which makes them difficult to detect in short-read data. Recently, however, a number of specialized algorithms have been developed that enable expansions of short TRs (motif sizes 1-12 bp) to be detected in short-read genome sequencing data. In addition, our lab has developed approaches that allow the copy number of repeats with larger motifs (motif sizes from 12 bp to 200 kb) to be estimated from read depth.

We propose to apply these specialized algorithmic approaches to study variation in TRs of all motif sizes in whole-genome sequencing (WGS) data, and to investigate their possible role in Alzheimer’s disease (AD). We hypothesize that variation in TR regions contributes to the genetic factors underlying AD. Specifically, we will test two hypotheses:
1. That rare pathogenic expansions of TRs (in either the “full” or “pre-mutation” range) occur at increased frequency in AD patients compared to controls.
2. That length variation in TRs of all sizes, including very large TRs (some of which contain entire genes), represents a class of common genetic variation that may alter an individual’s susceptibility to AD.

In Aim 1, we will search for expansions of microsatellite repeats using tools such as ExpansionHunter, exSTRa and STRetch, which analyze WGS BAM files for indirect signatures of expanded repeats. We will look for loci that show an excess of rare outlier genotypes in cases vs. controls. Where pedigree data are available from familial cases, we will assess whether these long repeats segregate with disease. If loci showing rare expansions specifically in cases are identified, we will, where possible, request aliquots of DNA from the specific individuals to perform long-read sequencing and validate the presence of potentially pathogenic repeat expansions.

In Aim 2, we will use read depth approaches, such as CNVnator, to estimate the copy number of large TRs and multicopy regions of the genome. We will compare the estimated copy numbers of these repetitive regions in cases vs. controls to identify TR loci where copy number (either gain or loss) is significantly associated with AD. Association tests will use copy number estimates corrected for technical and biological covariates, such as principal components of WGS read depth data, ethnicity inferred from SNV data, and gender, and will incorporate a multiple testing correction for genome-wide analysis.
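The case vs. control comparison in Aim 1 can be illustrated with a minimal sketch. The proposal does not name a statistical test, so the one-sided Fisher's exact test and the Bonferroni correction used here are assumptions chosen only for illustration. The function names, locus labels and carrier counts are likewise hypothetical and are not ADNI data.

```python
from math import comb


def fisher_exact_greater(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test.

    Returns the probability of seeing at least `case_carriers` outlier
    genotypes among cases if carriers were spread at random across the
    combined cohort (hypergeometric upper tail).
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, case_total)
    upper = min(carriers, case_total)
    tail = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return min(tail / denominator, 1.0)


def bonferroni(p_values):
    """Multiply each p-value by the number of loci tested, capped at 1."""
    n_tests = len(p_values)
    return {locus: min(1.0, p * n_tests) for locus, p in p_values.items()}


# Hypothetical counts per locus:
# (case carriers, total cases, control carriers, total controls)
outlier_counts = {
    "locus_A": (6, 400, 0, 400),
    "locus_B": (2, 400, 3, 400),
}

raw_p = {
    locus: fisher_exact_greater(*counts)
    for locus, counts in outlier_counts.items()
}
adjusted_p = bonferroni(raw_p)

for locus in outlier_counts:
    print(f"{locus}: raw p = {raw_p[locus]:.4g}, adjusted p = {adjusted_p[locus]:.4g}")
```

An exact test suits this setting because outlier genotypes are rare by definition, so the per-locus carrier counts are too small for large-sample approximations. A genome-wide analysis would test far more loci than shown here, and the correction would scale accordingly.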
Additional Investigators  
Investigator's Name: Paras Garg
Investigator's Name: Bharati Jadhav
Investigator's Name: Mariya Shadrina
Investigator's Name: Alejandro Martin-Trujillo
Investigator's Name: William Lee
Proposed Analysis: We will perform analysis of repeat expansions in the WGS data.