Big data homework solutions
-
Updated
Jun 21, 2017 - Python
Big data homework solutions
An implementation of the minhash algorithm in golang
Tool to analyze news from a dataset.
Aurora karton for calculating minhash from input dataset.
University work. Approximate aligner for long DNA sequences. Estimates Jaccard similarity from k-mers via minimizers and MinHash, then uses it as a sequence identity proxy.
Fast Jaccard similarity search for abstract sets (documents, products, users, etc.) using MinHashing and Locality Sensitve Hashing
Software to identify known plasmid sequence from metagenomic assembly using Minhash
Aurora karton for similiarity matching.
Probability Methods for Informatics Engineering | UA 2018/2019
similarity of the texts (Jaccard Similarity, Minhash, LSH)
Finding Similar Items: Textually Similar Documents
(WIP) HTTP server that deploy distributes Vokter (https://github.com/vokter/vokter) through a REST API.
Rust jieba
Python package for fast MinHash calculation and operations
Add a description, image, and links to the minhash topic page so that developers can more easily learn about it.
To associate your repository with the minhash topic, visit your repo's landing page and select "manage topics."