Amended end coordinate of insertion calls and overlapping gene IDs
Re-uploaded the raw data TSV. Added homozygote counts. Improved the accuracy of the cohort allele frequency (cohort_af) for observed genotypes. Removed frequencies for European (eur) and Admixed American (amr) ancestries based on somalier probabilities.
We identified some issues with calculations in the tsv file and have removed the file until fixed. We will update with the corrected file soon.
GA4K SV Finder is a tool for searching structural variants (SVs) and their associated genes in 497 probands from the Genomic Answers for Kids (GA4K) cohort (2023), whose HiFi long-read genomes were processed with PBSV (v2.6.2). You can query by gene, by SV coordinates, by variant frequency, or by a mix of these through a query file, and explore the SVs found in the GA4K rare disease cohort.
Note: SV coordinates are based on the human reference genome hg38. Cohort alleles with frequencies that would identify a single person have been excluded.
- Python 3.7 or higher
- Tkinter (usually comes with Python)
- Pandas
On Debian or Ubuntu:
sudo apt update
sudo apt install python3
On Fedora:
sudo dnf install python3
On macOS (with Homebrew):
brew install python
After installing Python, you can install Pandas using pip:
pip install pandas
In the command line, navigate to the directory where you downloaded the raw data TSV and ga4ksvf-app.py (cd path/to/your/download).
Execute the following command:
python3 ga4ksvf-app.py
GA4K SV Finder supports three modes of operation.
In the command line, navigate to the directory where you downloaded the raw data TSV and ga4ksvf-cmd.py (cd path/to/your/download).
To search for a specific gene, use:
python3 ga4ksvf-cmd.py GENENAME
To search by genomic coordinates:
python3 ga4ksvf-cmd.py chr3:179121491-179374301
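A coordinate query has the form chrom:start-end. The sketch below shows one way such a string could be parsed and checked for overlap with an SV interval. `parse_region` and `overlaps` are hypothetical helpers written for illustration, not functions from ga4ksvf-cmd.py, and the closed-interval convention is an assumption; the tool may treat coordinates differently.

```python
import re

# Hypothetical pattern for queries such as "chr3:179121491-179374301".
REGION_RE = re.compile(r"^(chr[0-9A-Za-z_]+):(\d+)-(\d+)$")


def parse_region(query):
    """Return (chrom, start, end) for a region query, or raise ValueError."""
    match = REGION_RE.match(query.strip())
    if match is None:
        raise ValueError(f"not a region query: {query!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is greater than end: {query!r}")
    return chrom, start, end


def overlaps(region, sv):
    """True if two (chrom, start, end) intervals share at least one base.

    Assumes closed intervals; the tool's own convention may differ.
    """
    return region[0] == sv[0] and region[1] <= sv[2] and sv[1] <= region[2]


region = parse_region("chr3:179121491-179374301")
print(region)
# The SV interval below is made up purely to exercise the overlap check.
print(overlaps(region, ("chr3", 179300000, 179400000)))
```

An SV only needs to share part of its span with the queried region to count as overlapping under this definition, which is why the check compares both endpoints rather than requiring containment.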
To run multiple queries from a file:
python3 ga4ksvf-cmd.py [options]
- -f FILE, --file FILE: Process queries from a specified file.
- -e, --export: Export accumulated query results to a CSV file. Prompts for a file name, defaulting to result.csv.
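For readers who want to wrap or script the command-line tool, the options above could be modeled with `argparse` roughly as follows. This is only a sketch that mirrors the documented flags; it is not taken from ga4ksvf-cmd.py, and the file name `queries.txt` and the positional `query` argument name are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="ga4ksvf-cmd.py")
    # Optional single query: a gene name or a chr:start-end region.
    parser.add_argument("query", nargs="?", help="gene name or chr:start-end region")
    parser.add_argument("-f", "--file", metavar="FILE", help="process queries from FILE")
    parser.add_argument("-e", "--export", action="store_true",
                        help="export accumulated results to a CSV file")
    return parser


# "queries.txt" is a hypothetical file name used only for this example.
args = build_parser().parse_args(["-f", "queries.txt", "-e"])
print(args.file, args.export, args.query)
```

Leaving `query` optional means an invocation with no arguments parses cleanly, which is the case where the script is described as entering interactive mode.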
If the script is run without any arguments, it enters interactive mode. In this mode, you can input queries directly into the terminal. Type 'exit' to quit the interactive mode.
python3 ga4ksvf-cmd.py
chr3:179121491-179374301
chr16:53841057-53841060
chr9:121275764-121307053
MFN1
FTO
GSN
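The example queries above mix genomic regions and gene symbols, one per line. As a rough illustration of how such a list could be separated before searching, the sketch below classifies each line by shape. `classify_queries` and its regular expression are assumptions made for this example and do not describe how ga4ksvf-cmd.py handles its input.

```python
import re

# Hypothetical pattern: anything shaped like chrN:start-end is a region.
REGION_RE = re.compile(r"^chr[0-9A-Za-z_]+:\d+-\d+$")


def classify_queries(lines):
    """Split query lines into region strings and gene symbols, skipping blanks."""
    regions, genes = [], []
    for line in lines:
        query = line.strip()
        if not query:
            continue
        if REGION_RE.match(query):
            regions.append(query)
        else:
            genes.append(query)
    return regions, genes


example = """chr3:179121491-179374301
chr16:53841057-53841060
chr9:121275764-121307053
MFN1
FTO
GSN
"""
regions, genes = classify_queries(example.splitlines())
print(regions)
print(genes)
```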