Data repository for ICML 2022 CompBio workshop paper.

cx0/icml-human-genetics

Bayesian tensor factorization for predicting clinical outcomes using integrated human genetics evidence

This repo contains detailed information on the data and methodology used in our work presented at the ICML 2022 Computational Biology workshop.

Workshop website | Extended abstract | Poster

Schematic representation of the 3-rank tensor, with sparse data available on genetics evidence and clinical outcomes.
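
As a rough illustration of what a sparse 3-way tensor of this kind might look like in code, the sketch below stores only the observed binary labels in a dictionary keyed by index triples. The axis layout (gene, disease, evidence channel), the placeholder names, and the helper functions are all assumptions made for illustration; this README does not specify the tensor's axes or storage format.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical axis labels: the README does not name the tensor's axes,
# so gene x disease x evidence channel is an assumption for illustration.
GENES = ["GENE_A", "GENE_B", "GENE_C"]
DISEASES = ["DISEASE_X", "DISEASE_Y"]
CHANNELS = ["rare_disease", "gene_burden", "common_disease", "clinical_outcome"]

Coord = Tuple[int, int, int]
Observation = Tuple[str, str, str, int]


def build_sparse_tensor(observations: Iterable[Observation]) -> Dict[Coord, int]:
    """Keep only observed binary labels, keyed by (gene, disease, channel) index."""
    tensor: Dict[Coord, int] = {}
    for gene, disease, channel, label in observations:
        key = (GENES.index(gene), DISEASES.index(disease), CHANNELS.index(channel))
        tensor[key] = label
    return tensor


def density(tensor: Dict[Coord, int]) -> float:
    """Fraction of tensor cells that carry an observed label."""
    total_cells = len(GENES) * len(DISEASES) * len(CHANNELS)
    return len(tensor) / total_cells


# Made-up observations; every cell not listed here is treated as missing.
observations = [
    ("GENE_A", "DISEASE_X", "rare_disease", 1),
    ("GENE_A", "DISEASE_X", "clinical_outcome", 0),
    ("GENE_B", "DISEASE_Y", "gene_burden", 1),
]
tensor = build_sparse_tensor(observations)
```

Missing cells are simply absent from the dictionary, which keeps "no evidence available" distinct from an observed negative label of 0.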

Data and Methods

We integrate three lines of human genetics evidence: rare disease, gene burden, and common disease. We also use NLP-based classification of clinical outcomes to label clinical "failures". All data used in this analysis were retrieved from the Open Targets Platform release v22.06 (Data downloads), the latest release at the time of the analysis.

Rare disease (Mendelian phenotypes)

| Data source | Positive label | Negative label |
| --- | --- | --- |
| ClinGen | Definitive or Strong | Other classification |
| Genomics England PanelApp | Green | Amber |
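
The rare disease labeling rules above can be sketched as two small functions. The function names and the lowercase string forms of the classifications are assumptions; the actual field names and values in the Open Targets data may differ. Ratings that the table does not cover (for example, a PanelApp rating other than Green or Amber) are left unlabeled here, which is also an assumption.

```python
from typing import Optional

# Classifications treated as positive for ClinGen, per the table above.
CLINGEN_POSITIVE = {"definitive", "strong"}


def label_clingen(classification: str) -> int:
    """Return 1 for Definitive or Strong, 0 for any other classification."""
    return 1 if classification.strip().lower() in CLINGEN_POSITIVE else 0


def label_panelapp(rating: str) -> Optional[int]:
    """Return 1 for Green, 0 for Amber, None for ratings the table does not cover."""
    normalized = rating.strip().lower()
    if normalized == "green":
        return 1
    if normalized == "amber":
        return 0
    # Assumption: any other rating is left unlabeled rather than forced to 0.
    return None
```

For example, `label_clingen("Limited")` yields a negative label, while `label_panelapp("Green")` yields a positive one.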

Gene burden (Rare variant association)

| Data source | Positive label | Negative label |
| --- | --- | --- |
| REGENERON | $P \le 2.18 \times 10^{-11}$ | Else |
| AstraZeneca PheWAS Portal | $P \le 2 \times 10^{-9}$ | Else |
| Genebass | $P \le 6.7 \times 10^{-7}$ | Else |
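
A minimal sketch of how the source-specific significance thresholds above could be applied to a gene burden p-value. The dictionary keys and the function name are hypothetical and do not come from this repository; only the threshold values are taken from the table.

```python
# P-value thresholds from the table above; the source keys are hypothetical.
BURDEN_P_THRESHOLDS = {
    "regeneron": 2.18e-11,
    "astrazeneca_phewas": 2e-9,
    "genebass": 6.7e-7,
}


def label_gene_burden(source: str, p_value: float) -> int:
    """Return 1 if p_value is at or below the source's threshold, else 0."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError(f"p-value out of range: {p_value}")
    threshold = BURDEN_P_THRESHOLDS[source.lower()]
    return 1 if p_value <= threshold else 0
```

Because each source uses its own threshold, the same p-value can be positive for one source and negative for another: `1e-8` passes the Genebass threshold but not the AstraZeneca PheWAS Portal threshold.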

Common disease (GWAS)

| Data source | Positive label | Negative label |
| --- | --- | --- |
| Locus-to-gene "L2G" score | Score $\ge 0.5$ | Score $< 0.5$ |

Citation

If you find this work useful, please cite it as follows:

@misc{https://doi.org/10.48550/arxiv.2207.12538,
  doi = {10.48550/ARXIV.2207.12538},
  url = {https://arxiv.org/abs/2207.12538},
  author = {Soylemez, Onuralp},
  keywords = {Machine Learning (cs.LG), Genomics (q-bio.GN), Applications (stat.AP), FOS: Computer and information sciences, FOS: Biological sciences},
  title = {Bayesian tensor factorization for predicting clinical outcomes using integrated human genetics evidence},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
