bug: Different embedding token usage in Langfuse than in OpenAI #1871
Comments
Do you run on Langfuse Cloud? If so, could you provide a trace_id where this issue occurred? Checking the logs for this request end-to-end would help debug the problem and identify its source.
Do your embedding generations include inputs/outputs? By default, Langfuse takes the token numbers reported by LlamaIndex and does not attempt to tokenize them at the API level, since storing all embedded documents in Langfuse as well is usually not necessary.
Hi @marcklingen, no, we do not run it on Langfuse Cloud; we have it self-hosted in our environment, so you will not be able to check it. Embedding generations include inputs only; no output tokens from OpenAI were recorded.
Langfuse does both, but for LlamaIndex we try to get the token counts via LlamaIndex and simply ingest them into Langfuse. If you have logs, you could check whether this event included token counts to pinpoint the problem. If no token counts are provided and a known model is used (e.g. the OpenAI models you use), then Langfuse tokenizes within the ingestion API.
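For reference, a client-side approximation of that fallback might look like the sketch below. This is not Langfuse's actual server code; it only assumes the standard tiktoken encoding for text-embedding-ada-002:

```python
# Illustrative approximation of server-side token counting, not Langfuse's code.
import tiktoken

# text-embedding-ada-002 uses the cl100k_base encoding.
encoding = tiktoken.encoding_for_model("text-embedding-ada-002")

def count_embedding_tokens(texts: list[str]) -> int:
    """Sum the token counts over all embedded inputs."""
    return sum(len(encoding.encode(text)) for text in texts)

print(count_embedding_tokens(["first chunk of a document", "second chunk"]))
```

Counts from the two systems can only match if exactly the same inputs are tokenized with the same encoding, so any events that never reach Langfuse would show up as a gap.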
Describe the bug
I wanted to use LlamaParse to parse a set of documents (PDF/Doc/Docx) and index them so that I could ask custom questions about those documents. I created a dedicated (fresh) OpenAI API key because I wanted to monitor token usage and compare Langfuse with OpenAI. After running a number of tests in which I simply parse documents with LlamaParse and perform the indexing step with LlamaIndex, I encountered a mismatch between the embedding model's token usage as reported by Langfuse and by OpenAI.
Model used: text-embedding-ada-002(-v2)

| Test | OpenAI tokens | Langfuse tokens | Difference |
| --- | --- | --- | --- |
| Initial run | 33198 | 32750 | 448 |
| Same set of documents | 33249 | 32800 | 449 |
| Other documents | 40779 | 40328 | 451 |
| Other documents | 40646 | 40327 | 319 |
OpenAI always counted more tokens than Langfuse. As you can see, running tests on the same set of documents revealed a pattern: we were consistently missing around 450 tokens.
Have you ever experienced a similar issue? Or is this expected behaviour?
To reproduce
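A minimal sketch of the setup described above (the file path and result type are placeholders; the callback-handler import assumes the Langfuse v2 Python SDK):

```python
from llama_parse import LlamaParse
from llama_index.core import VectorStoreIndex, Settings
from llama_index.core.callbacks import CallbackManager
from langfuse.llama_index import LlamaIndexCallbackHandler

# Route all LlamaIndex events (including embedding calls) to Langfuse.
langfuse_handler = LlamaIndexCallbackHandler()  # reads LANGFUSE_* env vars
Settings.callback_manager = CallbackManager([langfuse_handler])

# Parse the documents with LlamaParse, then index them with LlamaIndex;
# the indexing step triggers the text-embedding-ada-002 calls being measured.
documents = LlamaParse(result_type="text").load_data("./docs/example.pdf")  # placeholder path
index = VectorStoreIndex.from_documents(documents)

langfuse_handler.flush()  # make sure all buffered events reach the Langfuse server
```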
Additional information
I tried both Langfuse approaches: using decorators (@observe()) and using the low-level SDK. Both gave me the same results, where the embedding token count did not match OpenAI's. The reason I raise this issue is that I would like to use Langfuse (monitoring, token counting, price calculation, etc.) as a reliable source, and I want to be sure that it calculates token-usage costs correctly, so that I can estimate the cost of each trace I execute.
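With the low-level SDK, one way to rule out Langfuse's server-side tokenization is to pass the usage that OpenAI itself reports on the embeddings response, so both systems count the same numbers. A sketch assuming the v2 Python SDK; the trace and generation names are illustrative:

```python
from langfuse import Langfuse
from openai import OpenAI

langfuse = Langfuse()        # reads LANGFUSE_* environment variables
openai_client = OpenAI()     # reads OPENAI_API_KEY

chunks = ["first chunk", "second chunk"]  # illustrative inputs

response = openai_client.embeddings.create(
    model="text-embedding-ada-002",
    input=chunks,
)

trace = langfuse.trace(name="embedding-audit")  # illustrative trace name
trace.generation(
    name="embed-chunks",
    model="text-embedding-ada-002",
    input=chunks,
    # Report exactly what OpenAI billed, so Langfuse does not re-tokenize.
    usage={
        "input": response.usage.prompt_tokens,
        "total": response.usage.total_tokens,
        "unit": "TOKENS",
    },
)
langfuse.flush()
```

If the Langfuse totals still diverge from the OpenAI dashboard with explicit usage like this, the gap must come from requests that were never ingested rather than from tokenization.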
I verified that LlamaParse is not responsible for the embedding token usage (I ran a simple test using only the LlamaParse phase and monitored OpenAI token usage).