This repository is used to collect papers and code in the field of AI.
Updated May 31, 2024
Slides from my NLP course on the transformer architecture
This study investigates the effectiveness of three Transformers (BERT, RoBERTa, XLNet) in handling data sparsity and cold-start problems in recommender systems. We present a Transformer-based hybrid recommender system that predicts missing ratings and extracts semantic embeddings from user reviews to mitigate these issues.
Seq2SeqSharp is a tensor-based, fast and flexible deep neural network framework written in .NET (C#). It offers many notable features, such as automatic differentiation, multiple network types (Transformer, LSTM, BiLSTM, and so on), multi-GPU support, cross-platform operation (Windows, Linux, x86, x64, ARM), multimodal models for text and images, and more.
Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"
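LongRoPE builds on rotary position embeddings (RoPE). As background for this entry, here is a minimal NumPy sketch of plain RoPE; the function name and shapes are illustrative, and LongRoPE's learned per-dimension rescaling is not reproduced here.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, d), d even.

    Each pair of features (x1[j], x2[j]) at position p is rotated by an
    angle p * base**(-j/half), which encodes position multiplicatively.
    """
    seq_len, d = x.shape
    half = d // 2
    freqs = base ** (-np.arange(half) / half)       # (half,) per-pair frequencies
    angles = np.outer(np.arange(seq_len), freqs)    # (seq_len, half) rotation angles
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))
y = rope(x)
```

Because RoPE only rotates feature pairs, it preserves vector norms and leaves position 0 unchanged; context-extension methods like LongRoPE work by rescaling the position indices fed into these rotations.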
Simple character-level Transformer
A novel implementation fusing ViT with Mamba into a fast, agile, and high-performance multimodal model. Powered by Zeta, the simplest AI framework ever.
Simple PyTorch implementation of the paper "Attention Is All You Need" (https://arxiv.org/abs/1706.03762)
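The core operation of "Attention Is All You Need" is scaled dot-product attention, softmax(QKᵀ/√d_k)V. A minimal NumPy sketch (function and variable names are illustrative, not taken from any repository above):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Compute softmax(QK^T / sqrt(d_k)) V over the last two axes."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)   # (batch, seq_q, seq_k)
    scores -= scores.max(axis=-1, keepdims=True)     # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # rows sum to 1
    return weights @ v                               # weighted mix of values

rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4, 8))   # (batch, seq, d_k)
k = rng.normal(size=(2, 4, 8))
v = rng.normal(size=(2, 4, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (2, 4, 8)
```

Since the attention weights form a convex combination over the key positions, the output of each query is a weighted average of the value vectors.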
Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch
A comprehensive paper list on Vision Transformers/attention, including papers, code, and related websites
A desktop application to assist in learning languages. Uses a deep learning model to generate translations.
Inference Llama 2 in one file of pure 🔥
Transformer implementation from scratch for next-character prediction
Facial attribute recognition using the Transformer architecture; 91% on CelebA
Official implementation of "Particle Transformer for Jet Tagging".
[VLDB 2024] ADF & TransApp: A Transformer-Based Framework for Appliance Detection Using Smart Meter Consumption Series
Official PyTorch implementation of the Vectorized Conditional Neural Field.
Data and code for the machine learning exam assignment of MA Digital Text Analysis (2023).
Code for CRATE (Coding RAte reduction TransformEr).
Extractive Nepali Question Answering System | Browser Extension & Web Application