Pull requests: karpathy/llm.c

Testability
#523 opened Jun 2, 2024 by ngc92 Draft
Add master weights to resume state
#522 opened Jun 2, 2024 by gordicaleksa
Fix mem leak
#519 opened Jun 2, 2024 by gordicaleksa
Generalize MFU logic to common non-A100 GPUs
#518 opened Jun 2, 2024 by gordicaleksa
add edu fineweb support, with 10B and 100B version
#517 opened Jun 2, 2024 by eliebak
Add clarification on Box-Muller
#516 opened Jun 2, 2024 by gordicaleksa
Use local params for num blocks in adamw_kernel3
#515 opened Jun 2, 2024 by gordicaleksa
fix compilation with older nvcc
#514 opened Jun 2, 2024 by ngc92
Added packed layernorm_forward
#513 opened Jun 2, 2024 by ChrisDryden
Remove redundant CPU computation in encoder bwd
#512 opened Jun 1, 2024 by gordicaleksa
adding wsd schedule with (1-sqrt) decay
#508 opened Jun 1, 2024 by eliebak
Add DockerFile
#501 opened May 30, 2024 by banyan-god
Realtime training visualization using wandb
#489 opened May 29, 2024 by chinthysl
MFU for other GPUs
#486 opened May 28, 2024 by ngc92
Trigger CI template
#483 opened May 28, 2024 by rosslwheeler
train_gpt2.c: Add gpt2_write_to_checkpoint method
#467 opened May 26, 2024 by faxe1008
.gitignore: ignore more for windows devs
#466 opened May 26, 2024 by nietras