Message from Server: TensorRT-LLM not supported on Server yet. #164

Open
Rodenhhh opened this issue Mar 4, 2024 · 5 comments
Rodenhhh commented Mar 4, 2024

I ran the docker image ghcr.io/collabora/whisperbot-base:latest and started the server, but when I sent a request from the client I got the following.

Client log:

[INFO]: * recording
setting
[INFO]: Waiting for server ready ...
[INFO]: Opened connection
Message from Server: TensorRT-LLM not supported on Server yet. Reverting to available backend: 'faster_whisper'
[INFO]: Websocket connection closed: 1000: 

Server log:
[03/04/2024-12:11:15] TensorRT-LLM not supported: [TensorRT-LLM][ERROR] CUDA runtime error in cub::DeviceSegmentedRadixSort::SortPairsDescending(nullptr, cubTempStorageSize, logProbs, (T*) nullptr, idVals, (int*) nullptr, vocabSize * batchSize, batchSize, beginOffsetBuf, offsetBuf + 1, 0, sizeof(T) * 8, stream): no kernel image is available for execution on the device (/root/TensorRT-LLM/cpp/tensorrt_llm/kernels/samplingTopPKernels.cu:322)
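
For reference, a minimal sketch of the setup (the exposed port and the client invocation here are assumptions, not the exact commands used):

```bash
# Start the server container; --gpus all exposes the A100 inside the container.
# 9090 is assumed to be the websocket port the server listens on.
docker run -it --gpus all -p 9090:9090 ghcr.io/collabora/whisperbot-base:latest

# Hypothetical client invocation: open a websocket connection and stream audio.
python3 client.py --host <server-ip> --port 9090
```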

makaveli10 (Collaborator) commented

@Rodenhhh Hey, which GPU are you using?

Rodenhhh (Author) commented Mar 4, 2024

@makaveli10 A100-40GB

makaveli10 (Collaborator) commented

@Rodenhhh Yeah, the docker image is built to work only on a 4090. Unfortunately we missed that, and it isn't mentioned anywhere; sorry for the trouble.
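
For context on the error itself: "no kernel image is available for execution on the device" means the TensorRT-LLM kernels in the image were compiled for a different GPU architecture. The 4090 is compute capability 8.9 (sm_89), while the A100 is 8.0 (sm_80). You can check what your GPU reports with nvidia-smi (the compute_cap query field needs a reasonably recent driver):

```bash
# Print the GPU name and its CUDA compute capability.
nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader
# e.g. on an A100: NVIDIA A100-SXM4-40GB, 8.0
```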

As for a solution, stay tuned: we will push a docker-compose setup to make the TensorRT-LLM build straightforward.
Thanks

makaveli10 (Collaborator) commented

@Rodenhhh You can test whether the docker compose setup builds and works as expected. Just make sure to pass the right CUDA_ARCH to docker compose build so that tensorrt-llm builds successfully.
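
A minimal sketch of what that could look like, assuming the compose file exposes CUDA_ARCH as a build argument (the exact argument name and value format depend on the compose file, so treat these as placeholders):

```bash
# Build with the architecture matching your GPU: sm_80 for an A100,
# sm_89 for a 4090. "80-real" is an assumed value format.
docker compose build --build-arg CUDA_ARCH=80-real

# Then start the server service.
docker compose up
```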

makaveli10 (Collaborator) commented

@Rodenhhh Did you get a chance to try out the docker compose solution?
