Docker build #122
Comments
I got an image built. It's not clean enough for a pull request, but I'll share what I've got anyway. Maybe someone else can pick this up and contribute it (assuming the maintainers want it). I'm just creating a Dockerfile:
# FIXME: Makes a huge image.
# TODO: Optimize with a multi-stage build, perhaps also using venv.
# Pin to 3.10-bookworm to get Python 3.10
# because https://github.com/MahmoudAshraf97/whisper-diarization/issues/90
FROM python:3.10-bookworm
ARG WD_USER=joe
ARG WD_UID=1000
ARG WD_GROUP=joe
ARG WD_GID=1000
# We rarely see a full upgrade in a Dockerfile. Why?
# && apt-get --assume-yes dist-upgrade \
RUN apt-get update \
    && apt-get --assume-yes --no-install-recommends install \
        cython3 \
        ffmpeg \
        unzip \
        wget \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /usr/src/app
COPY . .
RUN addgroup --gid $WD_GID $WD_GROUP \
    && adduser --uid $WD_UID --gid $WD_GID --shell /bin/bash --no-create-home $WD_USER \
    && chown -R $WD_USER:$WD_GROUP /usr/src/app
USER $WD_USER:$WD_GROUP
RUN mkdir venv \
    && python -m venv venv \
    && . venv/bin/activate \
    && pip install Cython \
    && pip install --no-cache-dir --requirement requirements.txt
Build with
As user
BASE=$HOME/whisper-diarization
mkdir -p $BASE/data
mkdir -p $BASE/HOME_CACHE
mkdir -p $BASE/HOME_CONFIG
APP=/usr/src/app
mv /tmp/recording.mp3 $BASE/data/
docker run --rm -it \
    -v $BASE/data:/data \
    -v $BASE/HOME_CONFIG:$APP/.config \
    -v $BASE/HOME_CACHE:$APP/.cache \
    --user joe:joe \
    whisper-diarization \
    bash
Now you're in the container at a non-root shell prompt, presumably. Run:
export HOME=/usr/src/app
source venv/bin/activate
python diarize_parallel.py -a /data/recording.mp3
exit
Now, inspect and manually clean up
Don't forget the
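The interactive steps above could probably be collapsed into one non-interactive script. What follows is only a sketch under stated assumptions: the `docker build --tag` invocation is my guess (the exact build command is not shown in this thread), the image name `whisper-diarization` is taken from the `docker run` line above, and `DRY_RUN`, `run`, `build_image` and `diarize` are hypothetical names invented for illustration. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: scripted version of the manual build/run steps from this comment.
# Prints each command; set DRY_RUN=0 to actually execute them.
set -e

BASE="${BASE:-$HOME/whisper-diarization}"  # host directory, as in the steps above
APP=/usr/src/app                           # WORKDIR from the Dockerfile
IMAGE=whisper-diarization                  # image name used by `docker run` above
DRY_RUN="${DRY_RUN:-1}"                    # hypothetical switch, not part of the original

run() {
    # Echo the command line, then execute it only when DRY_RUN=0.
    printf '%s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

build_image() {
    # Assumption: the Dockerfile sits in the current directory (the repo root).
    run docker build --tag "$IMAGE" .
}

diarize() {
    # $1: name of an audio file that already exists in $BASE/data on the host.
    run docker run --rm \
        -v "$BASE/data:/data" \
        -v "$BASE/HOME_CONFIG:$APP/.config" \
        -v "$BASE/HOME_CACHE:$APP/.cache" \
        --user joe:joe \
        --env "HOME=$APP" \
        "$IMAGE" \
        bash -c '. venv/bin/activate && exec python diarize_parallel.py -a "$0"' "/data/$1"
}

run mkdir -p "$BASE/data" "$BASE/HOME_CACHE" "$BASE/HOME_CONFIG"
build_image
diarize recording.mp3
```

Passing `--env HOME=...` and activating the venv inside `bash -c` mirrors the `export HOME` and `source venv/bin/activate` steps from the interactive session, so the container should behave the same way, though I have not tested this against the actual image.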
Just released "transcription stream" on GitHub today, which includes a Docker image that runs diarize.py. It takes me about 15 minutes to build, but it works great and is fast and automated. Would love to get your thoughts: https://github.com/transcriptionstream/transcriptionstream
It took me 30 minutes to build and the image is 7.5 GB, but it works. Thanks for sharing :)
Just thought it would be handy to have a Docker image for this tool. I've been unable to get it working so far but I'll keep trying. If anyone else has it running in Docker, please share.