Issues: ggerganov/llama.cpp
Bug: Server ends up in an infinite loop if the number of requests in the batch is greater than the number of parallel slots with a system prompt
Labels: bug-unconfirmed, high severity (malfunctioning that hinders an important workflow)
#7834 · opened Jun 8, 2024 by kdhingra307
Research: I'm writing a paper on our medical fine-tuned llava-v1.6
Labels: research 🔬
#7831 · opened Jun 8, 2024 by rohithbojja
Bug: iGPU offloading memory access fault by GPU node-1 (appeared once only)
Labels: AMD GPU (issues specific to AMD GPUs), bug-unconfirmed, low severity (e.g. cosmetic issues, non-critical UI glitches)
#7829 · opened Jun 8, 2024 by eliranwong
No successful releases from CI in the last 2 days
Labels: bug-unconfirmed, low severity
#7828 · opened Jun 8, 2024 by Spacellary
Bug: CUDA-enabled Docker container fails to launch
Labels: bug-unconfirmed, critical severity (e.g. crashes, corruption, data loss)
#7822 · opened Jun 7, 2024 by mblunt
Bug: Running a large model through the server with the Vulkan backend always generates gibberish after the first call
Labels: bug-unconfirmed, medium severity (malfunctioning features, but still usable)
#7819 · opened Jun 7, 2024 by richardanaya
Bug: QWEN2 MoE imatrix contains NaNs after generating it
Labels: bug-unconfirmed, medium severity
#7816 · opened Jun 7, 2024 by legraphista
I am running two socket servers, and the CPU usage is at 50%
Labels: bug-unconfirmed, high severity
#7812 · opened Jun 7, 2024 by superLiben
Bug: QWEN2 quantization GGML_ASSERT
Labels: bug-unconfirmed, high severity
#7805 · opened Jun 6, 2024 by bartowski1182
Bug: token generation seems to slow down for higher slots
Labels: bug-unconfirmed, low severity
#7802 · opened Jun 6, 2024 by desperadoduck
Bug: JSON Schema-to-GBNF additionalProperties bugs (and other minor quirks)
Labels: bug-unconfirmed, low severity
#7789 · opened Jun 6, 2024 by HanClinto
Feature Request: Loading PEFT LoRA adapters at runtime without prior merging
Labels: enhancement (new feature or request)
#7788 · opened Jun 6, 2024 by niranjanakella
Refactor: investigate cleaner exception handling for server/server.cpp
Labels: help wanted (extra attention is needed), refactoring
#7787 · opened Jun 6, 2024 by mofosyne
Bug: Error while running a model file (.gguf) in LM Studio
Labels: 3rd party (issue related to a third-party project), bug-unconfirmed, low severity
#7779 · opened Jun 5, 2024 by thehsansaeed
Feature Request: GLM-4 9B Support
Labels: enhancement
#7778 · opened Jun 5, 2024 by arch-btw
ggml : add WebGPU backend
Labels: help wanted, research 🔬
#7773 · opened Jun 5, 2024 by ggerganov
ggml : add DirectML backend
Labels: help wanted, research 🔬
#7772 · opened Jun 5, 2024 by ggerganov
Feature Request: Add support for xLSTM
Labels: enhancement
#7764 · opened Jun 5, 2024 by uwu-420
Feature Request: Add a vocabulary type for token-free models that work on raw bytes
Labels: enhancement
#7763 · opened Jun 5, 2024 by uwu-420
Encountering errors while using the Android NDK with Vulkan
Labels: bug-unconfirmed, medium severity
#7760 · opened Jun 5, 2024 by QIANXUNZDL123
Feature Request: Multi-session chat support
Labels: enhancement
#7758 · opened Jun 5, 2024 by BigCatGit