Releases: BerriAI/litellm
v1.40.0
What's Changed
- fix: fix streaming with httpx client by @krrishdholakia in #3944
- feat(scheduler.py): add request prioritization scheduler by @krrishdholakia in #3954 (see the sketch below)
- [FEAT] Perf improvements - litellm.completion / litellm.acompletion - Cache OpenAI client by @ishaan-jaff in #3956
- fix(http_handler.py): support verify_ssl=False when using httpx client by @krrishdholakia in #3959 (see the sketch below)
- Litellm docker compose start by @krrishdholakia in #3961
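Two minimal sketches of the changes above, illustrative only. First, request prioritization from #3954 against a local proxy; the `priority` field on the request body is an assumption (lower value = scheduled sooner), so check the scheduler docs for the exact wire format:
import requests

# Hypothetical local proxy URL and master key, for illustration only.
resp = requests.post(
    "http://localhost:4000/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "urgent request"}],
        "priority": 0,  # assumed field read by the new scheduler
    },
)
print(resp.json())
Second, the verify_ssl fix from #3959, assuming the module-level `litellm.ssl_verify` toggle is what the PR wires through to the httpx client:
import litellm

# Assumption: litellm.ssl_verify is forwarded to the underlying httpx
# client (per #3959) - useful behind proxies with self-signed certs.
litellm.ssl_verify = False

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)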
Full Changelog: v1.39.6...v1.40.0
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.0
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 133.63252197830545 | 6.467733658247951 | 0.0 | 1936 | 0 | 94.77090299998281 | 801.180971000008 |
Aggregated | Passed ✅ | 120.0 | 133.63252197830545 | 6.467733658247951 | 0.0 | 1936 | 0 | 94.77090299998281 | 801.180971000008 |
v1.39.6
We're launching team member invites (no SSO required) on v1.39.6 🔥 Invite team members to view LLM usage and spend per service: https://docs.litellm.ai/docs/proxy/ui
👍 [Fix] Cache Vertex AI clients - Major Perf improvement for VertexAI models
✨ Feat - Send new users invite emails on creation (using 'send_invite_email' on /user/new)
💻 UI - allow users to sign in with email/password
🔓 [UI] Admin UI Invite Links for non SSO
✨ PR - [FEAT] Perf improvements - litellm.completion / litellm.acompletion - Cache OpenAI client
What's Changed
- Fix warnings from pydantic by @lj-wego in #3670
- Update pydantic version in CI requirements.txt by @lj-wego in #3938
- Allow admin to give invite links to others by @krrishdholakia in #3875
- Update model config definition to use v2 style by @lj-wego in #3943
- Add OIDC + unit test for bedrock httpx by @Manouchehri in #3688
- (fix) Update Mistral model list and prices by @alexpeattie in #3945
- feat - `send_invite_email` on /user/new by @ishaan-jaff in #3942 (see the sketch below)
- [UI] Admin UI Invite Links for non SSO users by @ishaan-jaff in #3950
- [Feat] Admin UI - invite users to view spend by @ishaan-jaff in #3952
- UI - allow users to sign in with email/password by @ishaan-jaff in #3953
- feat(proxy_server.py): add assistants api endpoints to proxy server by @krrishdholakia in #3936
- [Fix] Cache Vertex AI clients - Perf improvement by @ishaan-jaff in #3935
- fix(bedrock): convert botocore credentials when role is assumed by @pharindoko in #3939
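A minimal sketch of the new `send_invite_email` flag on /user/new, assuming a locally running proxy and a hypothetical master key (only the `send_invite_email` field is taken from #3942; the rest follows the proxy's /user/new schema):
import requests

PROXY_URL = "http://localhost:4000"   # local proxy, for illustration
MASTER_KEY = "sk-1234"                # hypothetical master key

# Create a user and ask the proxy to email them an invite link.
resp = requests.post(
    f"{PROXY_URL}/user/new",
    headers={"Authorization": f"Bearer {MASTER_KEY}"},
    json={
        "user_email": "new-teammate@example.com",
        "send_invite_email": True,  # new in #3942
    },
)
resp.raise_for_status()
print(resp.json())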
New Contributors
- @lj-wego made their first contribution in #3670
- @alexpeattie made their first contribution in #3945
- @pharindoko made their first contribution in #3939
Full Changelog: v1.39.5...v1.39.6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.39.6
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 78 | 90.37559010674164 | 6.5521693586672445 | 0.0 | 1958 | 0 | 65.34477100001368 | 961.3953589999937 |
Aggregated | Passed ✅ | 78 | 90.37559010674164 | 6.5521693586672445 | 0.0 | 1958 | 0 | 65.34477100001368 | 961.3953589999937 |
v1.39.5-stable
Full Changelog: v1.39.5...v1.39.5-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.39.5-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 82 | 98.4437988109365 | 6.414126443845541 | 0.0 | 1920 | 0 | 65.89902199999642 | 1363.2986580000193 |
Aggregated | Passed ✅ | 82 | 98.4437988109365 | 6.414126443845541 | 0.0 | 1920 | 0 | 65.89902199999642 | 1363.2986580000193 |
v1.39.5
What's Changed
- fix(router.py): cooldown on 404 errors by @krrishdholakia in #3926
- [Feat] LiteLLM Proxy - use enums for user roles by @ishaan-jaff in #3927
- UI - View user role on admin ui by @ishaan-jaff in #3930
- [UI] edit user role admin UI by @ishaan-jaff in #3929
- fix: add missing seed parameter to ollama input #3923 by @devdev999 in #3924
- feat(main.py): support openai tts endpoint by @krrishdholakia in #3928 (see the sketch below)
- [Feat] UI - cleanup editing users by @ishaan-jaff in #3931
- [Feat- admin UI] Show number of rate limit errors by deployment per day by @ishaan-jaff in #3932
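A minimal sketch of the new OpenAI TTS support from #3928, assuming the SDK exposes a `speech()` helper mirroring OpenAI's audio API (treat the exact name and return type as assumptions):
import litellm

# Assumption: litellm.speech() mirrors OpenAI's audio.speech.create()
# and returns a binary response object (per the TTS support in #3928).
audio = litellm.speech(
    model="openai/tts-1",
    voice="alloy",
    input="Hello from the LiteLLM release notes!",
)

# Assumption: the response supports stream_to_file(), as in the OpenAI SDK.
audio.stream_to_file("hello.mp3")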
New Contributors
- @devdev999 made their first contribution in #3924
Full Changelog: v1.39.4...v1.39.5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.39.5
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 168.39339172958924 | 6.4252258901831345 | 0.0 | 1923 | 0 | 109.15407800001731 | 1833.3729599999913 |
Aggregated | Passed ✅ | 130.0 | 168.39339172958924 | 6.4252258901831345 | 0.0 | 1923 | 0 | 109.15407800001731 | 1833.3729599999913 |
v1.39.4
What's Changed
- fix - UI submit chat on enter by @ishaan-jaff in #3916
- Revert "Revert "fix: Log errors in Traceloop Integration (reverts previous revert)"" by @nirga in #3909
Full Changelog: v1.39.3...v1.39.4
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.39.4
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 120.0 | 135.98662418243552 | 6.404889633803229 | 0.0 | 1913 | 0 | 97.80563699996492 | 1663.1231360000243 |
Aggregated | Passed ✅ | 120.0 | 135.98662418243552 | 6.404889633803229 | 0.0 | 1913 | 0 | 97.80563699996492 | 1663.1231360000243 |
v1.39.3
What's Changed
- fix: Log errors in Traceloop Integration (reverts previous revert) by @nirga in #3846
- Added support for Triton chat completion using trtlllm generate endpo… by @giritatavarty-8451 in #3895
- Revert "Added support for Triton chat completion using trtlllm generate endpo…" by @ishaan-jaff in #3900
- [Feat] Implement Logout Admin UI by @ishaan-jaff in #3901
- Revert "fix: Log errors in Traceloop Integration (reverts previous revert)" by @krrishdholakia in #3908
- feat(proxy_server.py): emit webhook event whenever customer spend is tracked by @krrishdholakia in #3906 (see the receiver sketch below)
- fix(openai.py): only allow 'user' as optional param if openai model by @krrishdholakia in #3902
- [Feat] UI update analytics tab to show human friendly usage vals by @ishaan-jaff in #3894
- ui - fix latency analytics on `completion_tokens` by @ishaan-jaff in #3897
- [Admin UI] Edit `Internal Users` by @ishaan-jaff in #3904
- fix(proxy_server.py): fix end user object check when master key used by @krrishdholakia in #3910
- [UI] Fix bug on Model analytics by @ishaan-jaff in #3913
- feat - langfuse use `key_alias` as generation name on litellm proxy by @ishaan-jaff in #3911
- fix pricing / price tracking for vertex_ai/claude-3-opus@20240229 by @ishaan-jaff in #3915
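Since #3906 makes the proxy emit a webhook event whenever customer spend is tracked, here is a minimal sketch of a receiver; the payload shape and the way the proxy is pointed at the receiver URL are assumptions, so consult the proxy alerting docs for the actual event schema:
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class SpendWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and log the JSON spend event the proxy POSTs to us.
        # Payload shape is an assumption; see the proxy alerting docs.
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")
        print("customer spend event:", event)
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    # Point the proxy's webhook alerting at http://localhost:8080/
    HTTPServer(("0.0.0.0", 8080), SpendWebhookHandler).serve_forever()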
New Contributors
- @giritatavarty-8451 made their first contribution in #3895
Full Changelog: v1.39.2...v1.39.3
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.39.3
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 110.0 | 133.96143579083153 | 6.347194412767075 | 0.0 | 1898 | 0 | 91.88108999995848 | 1459.6432470000025 |
Aggregated | Passed ✅ | 110.0 | 133.96143579083153 | 6.347194412767075 | 0.0 | 1898 | 0 | 91.88108999995848 | 1459.6432470000025 |
v1.39.2
What's Changed
- Update ollama.py for image handling by @rick-github in #2888
- fix(anthropic.py): fix parallel streaming on anthropic.py by @krrishdholakia in #3883
- feat(proxy_server.py): Time to first token Request-level breakdown by @krrishdholakia in #3886
- [BETA-Feature] Add OpenAI `v1/batches` Support on LiteLLM SDK by @ishaan-jaff in #3882
- feat - router add abatch_completion - N Models, M Messages by @ishaan-jaff in #3889
- [Feat] LiteLLM Proxy - Add `POST /v1/files` and `GET /v1/files` by @ishaan-jaff in #3888 (see the sketch below)
- [Feat] LiteLLM Proxy - Add support for `POST /v1/batches`, `GET /v1/batches` by @ishaan-jaff in #3885
- feat(router.py): support fastest response batch completion call by @krrishdholakia in #3887
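Because the new `/v1/files` and `/v1/batches` routes are OpenAI-compatible, the official OpenAI client can point at the proxy. A minimal sketch, assuming a local proxy, a hypothetical key, and a `batch_requests.jsonl` file of OpenAI-format batch requests:
from openai import OpenAI

# Point the official OpenAI client at the LiteLLM proxy (hypothetical key).
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# POST /v1/files - upload the .jsonl of requests to run as a batch.
batch_file = client.files.create(
    file=open("batch_requests.jsonl", "rb"),
    purpose="batch",
)

# POST /v1/batches - start a batch over the uploaded file.
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# GET /v1/batches - poll the batch status.
print(client.batches.retrieve(batch.id).status)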
Full Changelog: v1.38.12...v1.39.2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.39.2
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 72 | 83.46968387564114 | 6.529958043991633 | 0.0 | 1954 | 0 | 61.38368400002037 | 678.4462749999989 |
Aggregated | Passed ✅ | 72 | 83.46968387564114 | 6.529958043991633 | 0.0 | 1954 | 0 | 61.38368400002037 | 678.4462749999989 |
v1.38.12
What's Changed
- feat(proxy_server.py): CRUD endpoints for controlling 'invite link' flow by @krrishdholakia in #3873
- [Feat] Add, Test Email Alerts on Admin UI by @ishaan-jaff in #3874
Full Changelog: v1.38.11...v1.38.12
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.38.12
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 76 | 91.16258395147193 | 6.473952425752436 | 0.0 | 1937 | 0 | 62.406538999994154 | 1772.6057410000067 |
Aggregated | Passed ✅ | 76 | 91.16258395147193 | 6.473952425752436 | 0.0 | 1937 | 0 | 62.406538999994154 | 1772.6057410000067 |
v1.38.11
💵 LiteLLM v1.38.11 - Proxy 100+ LLMs AND set budgets for your customers: https://docs.litellm.ai/docs/proxy/users#set-rate-limits
✨ NEW /customer/update and /customer/delete endpoints: https://docs.litellm.ai/docs/proxy/users#set-rate-limits
📝 [Feat] Email alerting is now Free Tier: https://docs.litellm.ai/docs/proxy/email
🚀 [Feat] Show supported OpenAI params on LiteLLM UI model hub
✨ [Feat] Show Created at, Created by on Models Page
What's Changed
- Clarifai-LiteLLM update docs by @mogith-pn in #3856
- [Feat] Show supported OpenAI params on model hub by @ishaan-jaff in #3859
- fix(parallel_request_limiter.py): fix user+team tpm/rpm limit check by @krrishdholakia in #3857
- fix - Admin UI show activity by model_group by @ishaan-jaff in #3865
- [Feat] Show Created at, Created by on `Models` Page by @ishaan-jaff in #3868
- Improve validate-fallbacks method by @SujanShilakar in #3847
- fix(model_dashboard.tsx): accurately show the input/output cost per token when custom pricing is set by @krrishdholakia in #3871
- Admin UI - Public model hub by @krrishdholakia in #3869
- [Feat] Rename `/end/user/new` -> `/customer/new` (maintain backwards compatibility) by @ishaan-jaff in #3870 (see the sketch below)
- [Feat] Make Email alerting Free Tier, but customizing emails enterprise by @ishaan-jaff in #3872
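A minimal sketch of the renamed customer endpoint with a budget attached; the `user_id` and `max_budget` field names are assumptions, so see https://docs.litellm.ai/docs/proxy/users for the actual schema:
import requests

PROXY_URL = "http://localhost:4000"   # local proxy, for illustration
MASTER_KEY = "sk-1234"                # hypothetical master key

# /customer/new replaces /end/user/new (the old route still works, per #3870).
resp = requests.post(
    f"{PROXY_URL}/customer/new",
    headers={"Authorization": f"Bearer {MASTER_KEY}"},
    json={
        "user_id": "customer-123",   # assumed identifier field
        "max_budget": 10.0,          # assumed budget field, in USD
    },
)
resp.raise_for_status()
print(resp.json())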
New Contributors
- @SujanShilakar made their first contribution in #3847
Full Changelog: v1.38.10...v1.38.11
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.38.11
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 94 | 113.13091035154665 | 6.485092627447978 | 0.0 | 1940 | 0 | 80.4994959999874 | 735.4111310000064 |
Aggregated | Passed ✅ | 94 | 113.13091035154665 | 6.485092627447978 | 0.0 | 1940 | 0 | 80.4994959999874 | 735.4111310000064 |
v1.38.10
What's Changed
- [Feat] Model Hub by @ishaan-jaff in #3849
Full Changelog: v1.38.8...v1.38.10
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.38.10
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 130.0 | 152.41971991092666 | 6.452763997233594 | 0.0 | 1931 | 0 | 108.63601500000186 | 1150.9651800000142 |
Aggregated | Passed ✅ | 130.0 | 152.41971991092666 | 6.452763997233594 | 0.0 | 1931 | 0 | 108.63601500000186 | 1150.9651800000142 |