Add support for IQ quantizations #4322
Looks good. This will be great to have, thank you.
Been running my Ollama build with this patch for the past few days without any issues. It's great to use some of the newer IQ quants! Keen to see this merged.
@BruceMacD Any idea when we can get this merged?
Any updates?
@BruceMacD would you please be able to update (rebase) your branch from main?
Co-Authored-By: ManniX-ITA <20623405+mann1x@users.noreply.github.com>
c7731dc to 643abf7
@@ -126,10 +150,26 @@ func (t fileType) String() string {
		return "IQ2_XS"
	case fileTypeQ2_K_S:
		return "Q2_K_S"
	case fileTypeQ3_K_XS:
		return "Q3_K_XS"
	case fileTypeIQ3_XS:
The test might be a flake. I suggest retrying and if it fails again, maybe ping @dhiltgen
@sammcj sorry for the delay!
Nice work, this is awesome!
Awesome, great job! Just wanted to comment that IQ3_M seems to be absent from the list (referring to "filetype.go"), which means IQ3_M would not work? Otherwise all the other quants seem to be present.
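To illustrate the concern raised above, here is a minimal, self-contained sketch of why a quantization type that is missing from a name-mapping switch would not round-trip. Everything in it is an assumption made for illustration: the constant values, the "unknown" fallback, and the parseFileType helper are hypothetical and are not taken from the actual filetype.go, so whether IQ3_M really fails in Ollama depends on the real code.

```go
package main

import "fmt"

// fileType is a hypothetical stand-in for the enum in filetype.go.
// The numeric values are illustrative and do not match real GGUF IDs.
type fileType uint32

const (
	fileTypeIQ3_XS fileType = iota
	fileTypeIQ3_S
	fileTypeIQ3_M // declared, but deliberately left out of the switches below
	fileTypeUnknown
)

// String maps a fileType to its quantization name.
// Any constant without a case falls through to "unknown".
func (t fileType) String() string {
	switch t {
	case fileTypeIQ3_XS:
		return "IQ3_XS"
	case fileTypeIQ3_S:
		return "IQ3_S"
	default:
		return "unknown"
	}
}

// parseFileType is a hypothetical inverse lookup from name to fileType.
// Names without a case are rejected with an error.
func parseFileType(s string) (fileType, error) {
	switch s {
	case "IQ3_XS":
		return fileTypeIQ3_XS, nil
	case "IQ3_S":
		return fileTypeIQ3_S, nil
	default:
		return fileTypeUnknown, fmt.Errorf("unsupported file type: %s", s)
	}
}

func main() {
	fmt.Println(fileTypeIQ3_S) // prints "IQ3_S"
	fmt.Println(fileTypeIQ3_M) // prints "unknown"

	_, err := parseFileType("IQ3_M")
	fmt.Println(err) // prints "unsupported file type: IQ3_M"
}
```

In this sketch, supporting IQ3_M would mean adding one case to each switch, which is the same shape as the additions shown in the diff hunk for the other IQ types.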
I'm curious about which version of ollama will support this feature.
@wwjCMP there hasn't been a final release in two weeks, but v0.1.39, which is currently marked as pre-release, has it - https://github.com/ollama/ollama/releases
thanks
This change allows importing IQ type gguf quantization with ollama create.
This change carries the commit from #3657 while moving its changes around to the refactored project structure.
Tested with:
IQ1_S
IQ1_M
IQ2_M
IQ3_XXS
IQ3_XS
IQ3_S
IQ4_NL
IQ4_XS
resolves #3622