Add arguments `time_off` and `duration` to transcriber #1533

me-kell · 2024-03-06T08:30:11Z

Currently the transcriber processes the whole input file, from beginning to end.

It would be very useful to be able to pass a start time offset and/or a duration to the transcriber.

Here is a proposal for how to do it:

Add (ffmpeg's) arguments time_off and duration in python/vosk/transcriber/cli.py after line 46.

parser.add_argument("--time_off", "-ss", default=None, type=int, help="start time offset")
parser.add_argument("--duration", "-d", default=None, type=int, help="duration")
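To illustrate how these two options would parse, here is a minimal standalone sketch. The `build_parser` helper is hypothetical and stands in for the real parser in cli.py, which defines other options not shown here. Plain integers are read as seconds, which matches how ffmpeg interprets a bare number for `-ss` and `-t`.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the parser in cli.py; only the two
    # proposed options are defined here.
    parser = argparse.ArgumentParser(description="sketch of the proposed options")
    parser.add_argument("--time_off", "-ss", default=None, type=int,
                        help="start time offset in seconds")
    parser.add_argument("--duration", "-d", default=None, type=int,
                        help="duration in seconds")
    return parser


if __name__ == "__main__":
    # Both options are optional; when omitted they stay None,
    # which means "process the whole file" as before.
    args = build_parser().parse_args(["-ss", "30", "--duration", "10"])
    print(args.time_off, args.duration)
```

Keeping `None` as the default means existing invocations without the new options would behave exactly as they do today.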

Pass the arguments time_off and duration to ffmpeg in function resample_ffmpeg in python/vosk/transcriber/transcriber.py (line 115):

        cmd = shlex.split("ffmpeg -nostdin -loglevel quiet "
                "-i '{}' -ar {} -ac 1 {} {} -f s16le -".format(
                    str(infile),
                    SAMPLE_RATE,
                    f"-ss {self.args.time_off}" if self.args.time_off is not None else "",  # add this
                    f"-t {self.args.duration}" if self.args.duration is not None else "",   # and this
                    ))

The function resample_ffmpeg_async could be adapted similarly.
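One possible way to share the logic between the sync and async variants is to build the command as an argument list in a single helper. The sketch below is only an illustration: `build_ffmpeg_cmd` is a hypothetical name, and the `SAMPLE_RATE` value of 16000 is an assumption for this example, not taken from transcriber.py. Building a list instead of a formatted string also avoids quoting problems with file names that contain spaces or quotes.

```python
import shlex

SAMPLE_RATE = 16000  # assumed value for this sketch only


def build_ffmpeg_cmd(infile, sample_rate=SAMPLE_RATE, time_off=None, duration=None):
    """Return the ffmpeg argument list, adding -ss / -t only when requested."""
    cmd = ["ffmpeg", "-nostdin", "-loglevel", "quiet",
           "-i", str(infile),
           "-ar", str(sample_rate), "-ac", "1"]
    if time_off is not None:
        # Placed after -i, so it acts as an output option: ffmpeg decodes
        # the input and discards audio before the offset.
        cmd += ["-ss", str(time_off)]
    if duration is not None:
        cmd += ["-t", str(duration)]
    # Raw signed 16-bit little-endian PCM written to stdout.
    cmd += ["-f", "s16le", "-"]
    return cmd


if __name__ == "__main__":
    # Show the command that would be run for a 10-second slice starting at 0:30.
    print(shlex.join(build_ffmpeg_cmd("interview.wav", time_off=30, duration=10)))
```

The returned list could be passed to `subprocess.Popen(cmd, stdout=subprocess.PIPE)` in the synchronous path, or unpacked into `asyncio.create_subprocess_exec(*cmd, stdout=asyncio.subprocess.PIPE)` in the asynchronous one.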


nshmyrev · 2024-03-06T16:23:42Z

Hi, thank you for the proposal! It looks nice, but what is the use case, please? I can't imagine why a user would need to start from a certain offset instead of just processing the whole file.

me-kell · 2024-03-06T20:48:34Z

Some use cases:

You have a recording of an interview and a list of the start times of every question and answer. You may want to assign the transcribed parts to their respective time points (question and answer).
You have a music radio program in which the presenter comments after every two or three songs. You may want to transcribe only the presenter, not the songs.
And last but not least: you have an audio file in which different speakers use different languages. You may want to transcribe each part of the audio with the corresponding language and model.
