This repository is used to collect papers and code in the field of AI.
Updated May 31, 2024
Slides from my NLP course on the transformer architecture
This study investigates the effectiveness of three Transformers (BERT, RoBERTa, XLNet) in handling data sparsity and cold-start problems in recommender systems. We present a Transformer-based hybrid recommender system that predicts missing ratings and extracts semantic embeddings from user reviews to mitigate these issues.
Seq2SeqSharp is a tensor-based, fast and flexible deep neural network framework written in .NET (C#). It offers many notable features, such as automatic differentiation, multiple network types (Transformer, LSTM, BiLSTM, and so on), multi-GPU support, cross-platform operation (Windows, Linux, x86, x64, ARM), multimodal models for text and images, and more.
Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"
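LongRoPE builds on rotary position embeddings (RoPE). As background for this entry, here is a minimal NumPy sketch of plain RoPE; the function name and shapes are illustrative, and LongRoPE's learned per-dimension rescaling is not reproduced here.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, d), d even.

    Each pair of features (x1[j], x2[j]) at position p is rotated by an
    angle p * base**(-j/half), which encodes position multiplicatively.
    """
    seq_len, d = x.shape
    half = d // 2
    freqs = base ** (-np.arange(half) / half)       # (half,) per-pair frequencies
    angles = np.outer(np.arange(seq_len), freqs)    # (seq_len, half) rotation angles
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))
y = rope(x)
```

Because RoPE only rotates feature pairs, it preserves vector norms and leaves position 0 unchanged; context-extension methods like LongRoPE work by rescaling the position indices fed into these rotations.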
Simple character-level Transformer
A novel implementation fusing ViT with Mamba into a fast, agile, and high-performance multimodal model. Powered by Zeta, the simplest AI framework ever.
Simple PyTorch implementation of the paper "Attention Is All You Need" (https://arxiv.org/abs/1706.03762)
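The core operation of "Attention Is All You Need" is scaled dot-product attention, softmax(QKᵀ/√d_k)V. A minimal NumPy sketch (function and variable names are illustrative, not taken from any repository above):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Compute softmax(QK^T / sqrt(d_k)) V over the last two axes."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)   # (batch, seq_q, seq_k)
    scores -= scores.max(axis=-1, keepdims=True)     # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # rows sum to 1
    return weights @ v                               # weighted mix of values

rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4, 8))   # (batch, seq, d_k)
k = rng.normal(size=(2, 4, 8))
v = rng.normal(size=(2, 4, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (2, 4, 8)
```

Since the attention weights form a convex combination over the key positions, the output of each query is a weighted average of the value vectors.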
Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch
A comprehensive paper list on Vision Transformers/attention, including papers, code, and related websites
A desktop application to assist in learning languages. Uses a deep learning model to generate translations.
Inference Llama 2 in one file of pure 🔥
Transformer implementation from scratch for next-character prediction
Facial attribute recognition using the Transformer architecture; 91% on CelebA
Official implementation of "Particle Transformer for Jet Tagging".
[VLDB 2024] ADF & TransApp: A Transformer-Based Framework for Appliance Detection Using Smart Meter Consumption Series
Official PyTorch implementation of the Vectorized Conditional Neural Field.
Data and code for the machine learning exam assignment of MA Digital Text Analysis (2023).
Code for CRATE (Coding RAte reduction TransformEr).
Extractive Nepali Question Answering System | Browser Extension & Web Application