stefanDeveloper/heiDGAF

heiDGAF - Domain Generation Algorithms Finder

ML-based DNS analyzer to detect Domain Generation Algorithms (DGAs), tunneling, and data exfiltration by malicious actors.

Supported Versions: Python 3

Getting Started

python -m venv .venv
source .venv/bin/activate  # on Windows: .venv\Scripts\activate
pip install .

heidgaf -h

Run your analysis:

heidgaf inspect -r data/...

Train your own model:

heidgaf train -m xg -d all

Data

Currently, we support the data format scheme provided by the DNS-Collector:

  • {{ .timestamp }}
  • {{ .return_code }}
  • {{ .client_ip }}
  • {{ .server_ip }}
  • {{ .query }}
  • {{ .type }}
  • {{ .answer }}
  • {{ .size }}b
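As a rough illustration of how a log line in this scheme could be consumed, the sketch below parses one record into a typed structure. It assumes the eight fields are written in the order listed above, separated by whitespace, and that no field contains spaces; the actual separator depends on how DNS-Collector is configured. The names `DnsRecord` and `parse_line` are hypothetical and not part of heiDGAF.

```python
from dataclasses import dataclass

# Field order as listed in the README; whitespace separation is an assumption.
FIELDS = (
    "timestamp", "return_code", "client_ip", "server_ip",
    "query", "type", "answer", "size",
)


@dataclass
class DnsRecord:
    """Hypothetical container for one DNS-Collector text line."""
    timestamp: str
    return_code: str
    client_ip: str
    server_ip: str
    query: str
    type: str
    answer: str
    size: int  # payload size in bytes, parsed from e.g. "120b"


def parse_line(line: str) -> DnsRecord:
    """Split a line into the eight expected fields and strip the 'b' size suffix."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    *head, size = parts
    if not size.endswith("b"):
        raise ValueError(f"size field must end with 'b': {size!r}")
    return DnsRecord(*head, size=int(size[:-1]))


if __name__ == "__main__":
    # Made-up sample line using documentation-reserved addresses.
    sample = "2024-01-01T00:00:00Z NOERROR 192.0.2.10 192.0.2.53 example.com A 198.51.100.7 120b"
    print(parse_line(sample))
```

Rejecting lines with an unexpected field count keeps malformed input from silently shifting values into the wrong columns.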

For training our models, we rely on the following data sets:

However, we compute all features separately and rely only on the domain and class from these data sets. Currently, we are only interested in binary classification, so the class is either benign or malicious.
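To make the idea of deriving features from the domain alone more concrete, here is a minimal sketch that turns a (domain, class) pair into a feature dictionary and a binary label. The specific features (length, label count, digit ratio, character entropy) are illustrative assumptions commonly seen in DGA detection work, not necessarily the feature set heiDGAF computes. The names `domain_features` and `to_sample` are hypothetical.

```python
import math
from collections import Counter

# Binary class encoding; the numeric values are an assumption for illustration.
LABELS = {"benign": 0, "malicious": 1}


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def domain_features(domain: str) -> dict:
    """Compute a few illustrative lexical features from a domain name."""
    name = domain.lower().rstrip(".")
    labels = name.split(".")
    chars = name.replace(".", "")
    digit_ratio = sum(ch.isdigit() for ch in chars) / len(chars) if chars else 0.0
    return {
        "length": len(name),
        "label_count": len(labels),
        "digit_ratio": digit_ratio,
        "entropy": shannon_entropy(chars),
    }


def to_sample(domain: str, cls: str) -> tuple:
    """Pair the computed features with the binary class label."""
    return domain_features(domain), LABELS[cls]


if __name__ == "__main__":
    print(to_sample("example.com", "benign"))
    print(to_sample("x7f3k9q2zt1.net", "malicious"))
```

Because only the domain string and class are needed as input, any data set that provides those two columns could be fed through the same function.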

Exploratory Data Analysis (EDA)

In the folder ./example, we conducted an Exploratory Data Analysis (EDA) to verify the features of interest for our application.

Literature

Based on the following work, we implement heiDGAF to find malicious behaviour such as tunneling or data exfiltration in DNS requests.

  • EXPOSURE: Finding Malicious Domains Using Passive DNS Analysis

    A passive DNS pipeline for finding malicious domains using the J48 decision tree algorithm.

  • Real-Time Detection System for Data Exfiltration over DNS Tunneling Using Machine Learning

    Proposes a hybrid DNS tunneling detection system using Tabu-PIO for feature selection.

  • Classifying Malicious Domains using DNS Traffic Analysis

  • DeepDGA: Adversarially-Tuned Domain Generation and Detection

    DeepDGA detects (and generates) domains on a per-domain basis, which provides a simple and flexible means to detect known DGA families. It uses GANs to bypass detectors and shows the effectiveness of such solutions.

  • Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection

  • SHAP Interpretations of Tree and Neural Network DNS Classifiers for Analyzing DGA Family Characteristics

  • FANCI: Feature-based Automated NXDomain Classification and Intelligence

Similar Projects