
Add support for IQ quantizations #4322

Merged: 1 commit merged into main on May 23, 2024
Conversation

@BruceMacD (Contributor) commented on May 10, 2024

This change allows importing IQ-type GGUF quantizations with ollama create.

This change carries the commit from #3657 while moving its changes around to the refactored project structure.

❯ ./ollama create nous-hermes-2-mistral:IQ_4XS -f /Users/bruce/models/nous-hermes-2-mistral/Modelfile
transferring model data 
using existing layer sha256:737258efad6ba5cf7232de66715a26cadba67b0e4bdace5cf03cf49d1e4864a0 
creating new layer sha256:d7285065edcb87b4852f1144dd090812df1b00ade49f74e234066ea9407a14bc 
creating new layer sha256:d8ba2f9a17b3bbdeb5690efaa409b3fcb0b56296a777c7a69c78aa33bbddf182 
creating new layer sha256:b2c4ee0a7317771fcbe7413c369d72ea911c63e6f52b2b0d6298a5a14c8e4983 
writing manifest 
success 

❯ ./ollama run nous-hermes-2-mistral:IQ_4XS
>>> write some python

def print_fruits(fruits):
    for fruit in fruits:
        print(fruit)

Tested with:
IQ1_S
IQ1_M
IQ2_M
IQ3_XXS
IQ3_XS
IQ3_S
IQ4_NL
IQ4_XS

resolves #3622

@sammcj (Contributor) left a comment:

Looks good, This will be great to have, thank you.

@sammcj (Contributor) commented on May 15, 2024

Been running my Ollama build with this patch for the past few days without any issues, it's great to use some of the newer IQ quants!

Keen to see this merged.

@mann1x (Contributor) commented on May 16, 2024

@BruceMacD Any idea when we can get this merged?

@oldmanjk commented:

Any updates?

@sammcj (Contributor) commented on May 22, 2024

@BruceMacD would you please be able to update (rebase) your branch from main?

Co-Authored-By: ManniX-ITA <20623405+mann1x@users.noreply.github.com>
@@ -126,10 +150,26 @@ func (t fileType) String() string {
return "IQ2_XS"
case fileTypeQ2_K_S:
return "Q2_K_S"
case fileTypeQ3_K_XS:
return "Q3_K_XS"
case fileTypeIQ3_XS:
return "IQ3_XS"
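The diff above extends the `String` method on the `fileType` enum so the new IQ quantization types render with their GGUF names. A minimal self-contained sketch of that pattern — note the numeric enum values and the subset of constants shown here are illustrative assumptions, not the exact definitions in ollama's filetype code:

```go
package main

import "fmt"

// fileType mirrors the GGUF/llama.cpp file-type enum. The concrete
// numeric values below are assumptions for this sketch; the real
// project assigns them to match the GGUF header's file-type field.
type fileType uint32

const (
	fileTypeIQ2_XS fileType = iota
	fileTypeQ2_K_S
	fileTypeIQ3_XS
	fileTypeIQ4_NL
	fileTypeIQ4_XS
)

// String maps each quantization type to its display name, the same
// switch-per-case pattern the diff above adds new IQ cases to.
func (t fileType) String() string {
	switch t {
	case fileTypeIQ2_XS:
		return "IQ2_XS"
	case fileTypeQ2_K_S:
		return "Q2_K_S"
	case fileTypeIQ3_XS:
		return "IQ3_XS"
	case fileTypeIQ4_NL:
		return "IQ4_NL"
	case fileTypeIQ4_XS:
		return "IQ4_XS"
	default:
		return "unknown"
	}
}

func main() {
	// fmt.Println uses the fmt.Stringer interface automatically,
	// so the quantization name prints directly.
	fmt.Println(fileTypeIQ4_XS) // prints "IQ4_XS"
}
```

Because `fileType` implements `fmt.Stringer`, any log line or error message that formats the type gets the human-readable quantization name for free, which is why each newly supported IQ type needs its own case here.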

@BruceMacD BruceMacD changed the title Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS. IQ4_NL Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS, IQ4_NL May 23, 2024
@mxyng (Contributor) left a comment:

The test might be a flake. I suggest retrying and if it fails again, maybe ping @dhiltgen

@BruceMacD changed the title from "Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS, IQ4_NL" to "Add support for IQ quantizations" on May 23, 2024
@BruceMacD merged commit d6f692a into main on May 23, 2024
15 checks passed
@BruceMacD deleted the brucemacd/iq-quants branch on May 23, 2024
@BruceMacD (Contributor, Author) commented:
@sammcj sorry for the delay!

@sammcj (Contributor) commented on May 23, 2024

Nice work, this is awesome!

@Xanton19 commented on May 25, 2024

Awesome, great job! Just noting that IQ3_M seems to be absent from the list in filetype.go, which would mean IQ3_M does not work? All the other quants appear to be present.

@wwjCMP commented on May 26, 2024

I'm curious which version of ollama will support this feature.

@sammcj (Contributor) commented on May 26, 2024

@wwjCMP there hasn't been a final release in two weeks, but v0.1.39 which is currently marked as pre-release has it - https://github.com/ollama/ollama/releases

@wwjCMP commented on May 26, 2024

> @wwjCMP there hasn't been a final release in two weeks, but v0.1.39 which is currently marked as pre-release has it - https://github.com/ollama/ollama/releases

thanks

Successfully merging this pull request may close these issues.

Ollama fails to create models when using IQ quantized GGUFs - Error: invalid file magic
7 participants