
FastAPI middleware for comparing different ML model serving approaches

About this project

This server, written in FastAPI, acts as middleware between the client and selected ML model serving tools (TensorFlow Serving, TorchServe, and NVIDIA Triton).

It allows the client to run inference through a unified and very simple JSON API.

This project was created to support a paper benchmarking the performance of the selected serving tools. Model outputs are therefore gathered but neither parsed further nor returned to the client (in other words, the client just sends JPEGs and receives no results, only 200 HTTP responses).
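To make the data flow concrete, here is a minimal sketch of what such a forwarding endpoint could look like. The route and the "image" form field follow the public API described below; the backend URL mapping, model names, and the use of httpx are illustrative assumptions, not this repository's actual implementation.

# A minimal, hypothetical sketch of the middleware's forwarding logic.
# Backend URLs and model names are assumptions for illustration only.
import httpx
from fastapi import FastAPI, File, HTTPException, UploadFile

app = FastAPI()

# Hypothetical mapping from serving type to a backend inference endpoint.
BACKENDS = {
    "torchserve": "http://localhost:8080/predictions/model",
    "tfserving": "http://localhost:8501/v1/models/model:predict",
    "triton_pytorch": "http://localhost:8001/v2/models/model_pt/infer",
    "triton_tensorflow": "http://localhost:8001/v2/models/model_tf/infer",
}

@app.post("/infer/{serving_type}")
async def infer(serving_type: str, image: UploadFile = File(...)):
    if serving_type not in BACKENDS:
        raise HTTPException(status_code=404, detail="Unknown serving type")
    payload = await image.read()
    async with httpx.AsyncClient() as client:
        # A real implementation would convert the JPEG into the input
        # format each serving tool expects; here we forward the raw bytes.
        backend_response = await client.post(BACKENDS[serving_type], content=payload)
    backend_response.raise_for_status()
    # Model outputs are gathered but discarded; the client only sees a 200.
    return {"status": "ok"}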

The resulting JSON API is deliberately simple. All you need is a curl call like this:

curl -vS http://localhost:8000/infer/${SERVING_TYPE} \
        -F "image=@path/to/local/image.JPG"

Where SERVING_TYPE can be one of:

  • torchserve
  • tfserving
  • triton_pytorch
  • triton_tensorflow
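
For scripting, the same request can be made from Python. This example uses the third-party requests library and a placeholder image path; only the endpoint and the "image" form field come from the API above.

import requests

serving_type = "torchserve"  # or any of the values listed above
with open("path/to/local/image.JPG", "rb") as f:
    # POST the image as multipart form data, mirroring the curl call.
    response = requests.post(
        f"http://localhost:8000/infer/{serving_type}",
        files={"image": f},
    )

# The middleware returns no model outputs, only an HTTP 200 on success.
print(response.status_code)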

For developers

  • If you want to develop this project, the instructions are in docs/DEVELOPMENT
  • If you want to run a performance test, the instructions are in docs/AWS_SETUP

About our team

See: https://biano-ai.github.io/about-biano-ai/

Contact us

See: https://biano-ai.github.io/about-biano-ai/

License

The MIT License (MIT)
Copyright (c) 2021 Biano AI <ai-research@biano.com>