tokyo, a REST API that, given any type of document 📄, identifies its MIME type 🧐, suggests an extension 🦔, and also extracts text 💪.
Updated Jun 13, 2020 - Clojure
Heuristic based boilerplate removal tool
Python code to extract words and in turn extract letters using pytesseract
Extract all the texts of any project with HTML files and generate a KV (Key-Value) file, key = reference key, value = extracted text.
Arachnio client library for Java 11+
Retrieve data from two different websites, loading them into the PostgreSQL database using Python, and combine them to get and present new information
Quick Tesseract OCR implementation, linked to a Stack Overflow question
Extract text content from an HTML page, process it, and extract unique words from the processed text. This notebook utilizes various text processing techniques including cleaning, normalization, tokenization, lemmatization or stemming, and stop words removal.
API to get text from multiple types of files
A simple web application built with React that lets users upload images containing text, select the language to recognize, and extract the text from the image. As quick as a finger snap - SnapText.
Automation manager
Site that allows the extraction of text contained in images uploaded by the user. (Live demo: https://github.com/giovanninicolettawork/image-to-text)
Text Extraction using Cosine Similarity
Web Application to extract text from image
This web application utilizes OCR technology to recognize text in uploaded images and provides spelling correction and word performance improvement. Users can easily upload images containing text and receive accurate and enhanced text results.
OCR built in C# for recognizing and extracting text from PDFs, images, Word documents, spreadsheets, TXT files, and clipboard screenshots.
[Thesis] Video Text Extraction
Engine for automating the process of scraping PDFs to local storage and converting those PDFs into text by performing OCR.
A simple book scanner project using PyQt5 (looks up and displays book information through barcode recognition and text extraction)
Custom GitHub Action to parse an issue body
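One of the entries above describes extracting text from an HTML page, cleaning and tokenizing it, removing stop words, and collecting the unique words. A minimal sketch of that kind of pipeline follows, using only the Python standard library. It is not taken from any of the listed repositories: the names `TextExtractor`, `extract_text`, `unique_words`, and the tiny `STOP_WORDS` set are hypothetical, and lemmatization or stemming is left out because it would need a third-party library.

```python
import re
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def extract_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def unique_words(text):
    """Lowercase, tokenize on ASCII letters, drop stop words, deduplicate."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sorted({t for t in tokens if t not in STOP_WORDS})


if __name__ == "__main__":
    sample = (
        "<html><head><style>p{color:red}</style></head>"
        "<body><h1>The Title</h1>"
        "<p>Text is extracted, and the text is cleaned.</p></body></html>"
    )
    print(unique_words(extract_text(sample)))
```

The tokenizer here only keeps ASCII letters, which is a simplification; text in other scripts would need a different pattern or a dedicated tokenizer.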