Principal Investigator  
Principal Investigator's Name: Jungsoo Gim
Institution: Chosun University
Department: Biomedical Science
Country:
Proposed Analysis: The aim of this proposal is to identify the ethnicity differences measured by genomic prediction built with Caucasian-centric GWA SNVs, and to propose an approach that improves the performance of genomic prediction models (built with a new method) in trans-ethnic groups, East Asians in particular. To address this issue, we have developed a new Bayesian machine learning approach that transfers genetic risk model knowledge from the non-Hispanic White (NHW) dataset, i.e. the largest sample among all populations, to other ethnic groups in order to improve the accuracy of their models. This means we need to use genotype datasets from all ancestry groups together. Instead of performing a trans-ethnic meta- (or mega-) analysis, we adopt the approach previously proposed by Gim et al. Briefly, the dataset for each ethnic group was divided for cross-validation (CV) analysis. The training data of each ethnic group were analyzed to obtain the p-value and BLUP of each SNV. Using these summary statistics, we built the prediction model, and a nested CV (described in detail in the next paragraph) was applied for model selection. In the final step, the best model for each ethnic group was tested on the test dataset split off in the first step. In the proposed work, we will analyze the data in a similar fashion, learning from ethnicity-specific variants and building prediction models from them with the newly developed method. For the cross-validation analysis, we will use a nested cross-validation scheme to test the accuracy of our approach while incorporating model selection into the training process.
To be more specific, following the suggestion from the ADGC reviews, we will sample a fixed percentage of each ADGC dataset for training to reduce the effects of between-dataset heterogeneity in the ADGC. For example, with an 8:2 training-to-test ratio, 80% of ADC1, 80% of ADC2, 80% of GSK, etc. form the training set, and the remaining 20% of ADC1, 20% of ADC2, 20% of GSK, etc. form the testing set. To each training dataset we apply 5-fold cross-validation. Within each cross-validation run, we will further conduct model selection by 5-fold cross-validation of that run's training data, choosing the best model and parameter combination for the final training model before it is evaluated on the testing dataset. The primary purpose of the proposal is to investigate the genetic heterogeneity of LOAD due to ethnicity differences and to propose a method that adjusts for it. If necessary, especially with the NHW dataset, we will compare our approach with existing polygenic risk prediction methods.
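The sampling and validation scheme described above can be sketched roughly as follows. This is only an illustration of a per-dataset 80/20 split followed by nested 5-fold cross-validation, not the proposal's Bayesian method. The function names (`split_by_cohort`, `k_folds`, `nested_cv`), the `fit_and_score` callback, and the `cohort_of` mapping from sample ID to dataset label are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_cohort(cohort_of, train_frac=0.8, seed=0):
    """Split sample IDs into train/test, taking the same fraction from each cohort."""
    rng = random.Random(seed)
    by_cohort = defaultdict(list)
    for sample_id, cohort in cohort_of.items():
        by_cohort[cohort].append(sample_id)
    train, test = [], []
    for cohort in sorted(by_cohort):
        ids = sorted(by_cohort[cohort])
        rng.shuffle(ids)
        cut = round(train_frac * len(ids))  # e.g. 80% of ADC1, 80% of ADC2, ...
        train.extend(ids[:cut])
        test.extend(ids[cut:])
    return train, test


def k_folds(ids, k=5, seed=0):
    """Yield (fit_ids, held_out_ids) pairs for k-fold cross-validation."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        fit = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield fit, folds[i]


def nested_cv(train_ids, candidates, fit_and_score, k=5, seed=0):
    """Outer k-fold estimates accuracy; inner k-fold selects the candidate model.

    fit_and_score(candidate, fit_ids, eval_ids) is a hypothetical callback that
    trains a model on fit_ids and returns its score on eval_ids (higher is better).
    """
    results = []
    for outer_fit, outer_val in k_folds(train_ids, k, seed):

        def inner_mean(candidate):
            scores = [
                fit_and_score(candidate, fit, val)
                for fit, val in k_folds(outer_fit, k, seed + 1)
            ]
            return sum(scores) / len(scores)

        # Model selection uses only the outer-fold training portion.
        best = max(candidates, key=inner_mean)
        results.append((best, fit_and_score(best, outer_fit, outer_val)))
    return results
```

The test set returned by `split_by_cohort` is never passed to `nested_cv`, so in this sketch it stays untouched until the final evaluation step.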
Additional Investigators