Sudachi in Rust 🦀 and new generation of SudachiPy
Updated May 29, 2024 - Rust
Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper
A Python library for interacting with TI-(e)z80 (82/83/84 series) calculator files
💫 Industrial-strength Natural Language Processing (NLP) in Python
Contracts for Spiko's tokenized securities.
(Python package) Tokenizer based on the BPE algorithm for LLMs (supports regex patterns and special tokens)
[Paper][Preprint 2024] MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion
Public code samples and resources for the Thales CipherTrust Application Protection products of the CipherTrust Data Security Platform
Optimized Craigslist's classification system by creating an algorithm combining LSTM and Random Forest for Text and Image Classification respectively
Tools and resources for the computational processing of Nheengatu (Modern Tupi)
The course "Natural Language Processing Applications" in the Artificial Intelligence program at the National Polytechnic Institute (IPN).
XFT's tokenized luxury watch marketplace.
XFT's onchain insurance marketplace
TI-BASIC token information XMLs for inclusion in other projects
Taiwanese Hokkien Transliterator and Tokeniser
XFT's smart wallet docs
Build and tokenize your own smart contract factory using Fundi, OpenZeppelin, and Chainlink contracts with the Foundry framework on Ethereum/Base Sepolia
🍶 llm-distillery ⇢ use LLMs to run map-reduce summarization tasks on large documents until a target token size is met.