text-mining

Data and scripts for training the open-source PDF questionnaire extraction component for the Harmony Kaggle competition, using natural language processing (NLP)

nlp competition open-source pdf data-science natural-language-processing information-retrieval text-mining text-classification kaggle information-extraction psychology research-project pdf-files psychology-experiments sentence-embeddings pdf-document-processor psychology-questionnaire

Updated May 27, 2024
Python

george-gca / ai_papers_cleaner

Extract text from paper PDFs and abstracts, and remove uninformative words.

python nlp pdf text-mining nltk text-processing

Updated May 27, 2024
Python

notesjor / CorpusExplorer.Terminal.Console

Allows other programs and programming languages to access the analyses and data of CorpusExplorer v2.0

nlp api linguistic text-mining corpus-linguistics corpusexplorer

Updated May 27, 2024
C#

antoniooliveira03 / Projects

Projects I worked on during my Bachelor's degree

machine-learning text-mining deep-learning sentiment-analysis

Updated May 27, 2024
Jupyter Notebook

jakeberggren / TDDE16-Text-Mining-Project

Project in the course TDDE16 - Text Mining at Linköping University

text-mining text-classification statistical-machine-learning vector-database openai-api langchain

Updated May 27, 2024
TeX

Lambda-3 / DiscourseSimplification

Extension of the SentenceSimplification project

natural-language-processing text-mining text-classification simplification discourse-analysis discourse-parsing

Updated May 27, 2024
Java

frances is an advanced cloud-based text mining digital platform that leverages information extraction, knowledge graphs, natural language processing (NLP), deep learning, and parallel processing techniques. It has been specifically designed to unlock the full potential of historical digital textual collections.

natural-language-processing text-mining apache-spark information-extraction knowledge-graph parallel-processing digitised-historical-collections cloud-based-platform

Updated May 26, 2024
Jupyter Notebook

caufieldjh / awesome-bioie

🧫 A curated list of resources relevant to doing Biomedical Information Extraction (including BioNLP)

nlp natural-language-processing text-mining awesome bioinformatics information-extraction awesome-list biomedical medical-informatics biomedical-data biomedical-language

Updated May 26, 2024

gengoai / gengoai

Mono Repository for GengoAI projects

java machine-learning natural-language-processing text-mining text-classification text-analysis

Updated May 26, 2024
Java

fitria-dwi / Hoax-Detection

This project aims to build a model that predicts whether an article is a hoax or not. It also aims to determine the percentage of hoax and non-hoax articles.

text-mining neural-network machine-learning-algorithms logistic-regression unsupervised-learning support-vector-machines decision-tree-classifier random-forest-classifier gaussian-naive-bayes k-nearest-neighbor-classifier hoax-detection

Updated May 26, 2024
Jupyter Notebook

JesusSalinas / master_upb

Text Analysis

text-mining sraping

Updated May 26, 2024
Python

Saeidhoseinipour / ELBMcoclust

We unified several latent block models by proposing a flexible ELBM, which is extended to SELBM to address the sparsity problem by revealing a diagonal structure in sparse datasets. This yields more homogeneous co-clusters and therefore produces useful, ready-to-use, and easy-to-interpret results.

text-mining word-cloud exponential text-summarization sparse-matrix co-clustering latent-block-model coclust