Reweight GPT - a simple neural network using transformer architecture for next character prediction

hunar4321/reweight-gpt

Reweight GPT

An alternative to the self-attention mechanism in the Transformer architecture. It uses learnable lateral connections to reweight the inputs directly instead of computing attention weights from queries and keys (as illustrated below). To learn more about the method, watch this video (from 41:26): https://youtu.be/l-CjXFmcVzY
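A minimal PyTorch sketch of the idea, assuming the simplest reading of "direct re-weighting": the position-to-position mixing matrix is a learnable parameter (with a causal mask), rather than being computed from query/key projections. Class and variable names here are illustrative, not taken from the repo's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReweightBlock(nn.Module):
    """Replaces QK^T self-attention with a directly learnable
    (block_size x block_size) matrix of lateral connections."""

    def __init__(self, block_size, n_embd):
        super().__init__()
        # Learnable lateral connections between positions
        # (one weight per pair of positions, masked to be causal).
        self.lateral = nn.Parameter(torch.zeros(block_size, block_size))
        self.value = nn.Linear(n_embd, n_embd, bias=False)
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        # Mask out future positions, then normalize the mixing weights.
        w = self.lateral[:T, :T].masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        w = F.softmax(w, dim=-1)
        # Reweight the (value-projected) inputs directly -- no Q or K.
        return w @ self.value(x)

x = torch.randn(2, 8, 16)           # (batch, time, channels)
block = ReweightBlock(block_size=8, n_embd=16)
out = block(x)
print(out.shape)                    # torch.Size([2, 8, 16])
```

Unlike standard attention, the mixing weights here do not depend on the input content, only on position, which is what makes the re-weighting "direct".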

Files:

  1. The tutorial folder - a step-by-step tutorial from the basics to GPT.
  2. reweight-gpt.py - a multi-block GPT implementation using direct re-weighting of the attention matrix.
  3. reweight-gpt-nonlinear.py - a nonlinear version of the direct re-weighting method. For easy comparison between the two methods, I adapted this script directly from Andrej Karpathy's GPT implementation.

Illustration:
