Research Project:
Understanding the Genetic Roots of Obesity Across Populations

Description

We request access to the dbGaP datasets related to the Obesity-Diabetes Familial Risk, Viva La Familia Study in order to develop and benchmark a rare variant association pipeline for whole-genome sequencing (WGS) and whole-exome sequencing (WES) data. Our goal is a robust, scalable computational workflow that accurately identifies rare variants associated with complex diseases in both familial and non-familial contexts. The pipeline will integrate variant quality control, population stratification adjustment, family-based statistical models that account for relatedness among individuals, and methods suitable for unrelated cohorts. The Viva La Familia dataset is well suited to method development because of its family-based structure and its focus on obesity and diabetes, two complex diseases of major public health importance.

The data will be used strictly for method and software development. We will generate only processed summary-level results (e.g., p-values, effect sizes, genomic coordinates of variants, gene-level burden statistics), and these will adhere to dbGaP guidelines to prevent participant re-identification. No attempt will be made to identify individual participants.

All raw data will remain stored securely on institutional servers behind firewalls, with access restricted to the PI and authorized personnel under controlled conditions. Data transfer and storage will comply with institutional and NIH security requirements. Upon expiration of the data access period, all raw data and backups will be securely destroyed.

We will acknowledge the contributing investigators and dbGaP in all software releases, presentations, and publications that result from the use of these data. Accession numbers and dataset version information will be included in any published work, in accordance with dbGaP policy.

We are developing a computational tool to study rare genetic changes that may play a role in disease. To test and validate the tool, we need data that include both related and unrelated individuals. The Viva La Familia Study dataset is particularly useful because it focuses on Hispanic families at risk for obesity and diabetes, which lets us evaluate our methods in a real-world setting that includes related individuals. Our work does not focus on the health of specific individuals. Instead, we aim to improve the methods researchers use to analyze genetic data. By testing and refining our pipeline, we hope to provide a more accurate way to detect rare genetic risk factors for complex diseases such as obesity and diabetes. Ultimately, this may help future studies more effectively identify the genetic factors that contribute to these conditions.
