yt-dlp doesn't use aria2c for subtitles and download is slower than videos #9919

Open
9 of 10 tasks
Ottaviocr opened this issue May 14, 2024 · 12 comments
Labels: enhancement (New feature or request), triage (Untriaged issue)

Comments

@Ottaviocr

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

Provide a description that is worded well enough to be understood

I have noticed that yt-dlp, despite being configured to use aria2c for downloads, uses the native downloader for subtitles on many sites, and in that case the download speeds are atrociously slow.

In the log below, it took 06:20 to download 27.84KiB of subtitles, but only 50 seconds to download the 534.48MiB video file and 3 seconds to download the 20.00MiB audio file.
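
As a rough check on those figures, the short Python sketch below converts the sizes and times into throughput and per-fragment averages. The fragment counts (162 for subtitles, 163 each for video and audio) are taken from the hlsnative lines in the verbose output; the elapsed times are the rounded values yt-dlp printed, so the results are approximate. The `summarize` helper is a hypothetical name used only for this illustration.

```python
# Figures copied from the [download] and [hlsnative] lines of the verbose output.
KIB = 1024
MIB = 1024 * KIB

downloads = {
    # name: (size in bytes, elapsed seconds as printed, fragment count)
    "subtitles": (27.84 * KIB, 6 * 60 + 20, 162),
    "video": (534.48 * MIB, 50, 163),
    "audio": (20.00 * MIB, 3, 163),
}


def summarize(size_bytes, seconds, fragments):
    """Return (bytes per second, seconds per fragment, bytes per fragment)."""
    return size_bytes / seconds, seconds / fragments, size_bytes / fragments


for name, stats in downloads.items():
    rate, frag_seconds, frag_bytes = summarize(*stats)
    print(f"{name:9s} {rate:14.1f} B/s  "
          f"{frag_seconds:6.3f} s/fragment  {frag_bytes:12.1f} B/fragment")
```

On these numbers the subtitle stream averages well under 200 bytes per fragment yet spends more than two seconds on each one, which suggests the time goes on per-request cost rather than on transferring data.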

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

[debug] Command-line config: ['-R', 'infinite', '--fragment-retries', 'infinite', '--restrict-filenames', '--progress', '--downloader=aria2c', '--abort-on-error', '-P', 'temp:/dev/shm', '-vU', '--write-subs', '-o', '%(title)s.%(ext)s', '-S', 'res:900', 'https://www.rts.ch/play/tv/12h45/video/12h45?urn=urn:rts:video:14903403']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version master@2024.05.13.232520 from yt-dlp/yt-dlp-master-builds [351dc0bc3] (zip)
[debug] Python 3.11.2 (CPython x86_64 64bit) - Linux-6.1.0-18-amd64-x86_64-with-glibc2.36 (OpenSSL 3.0.11 19 Sep 2023, glibc 2.36)
[debug] exe versions: ffmpeg 5.1.4-0 (setts), ffprobe 5.1.4-0
[debug] Optional libraries: brotli-1.0.9, certifi-2022.09.24, requests-2.28.1, sqlite3-3.40.1, urllib3-1.26.12
[debug] Proxy map: {}
[debug] Request Handlers: urllib
[debug] Loaded 1803 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp-master-builds/releases/latest
Latest version: master@2024.05.13.232520 from yt-dlp/yt-dlp-master-builds
yt-dlp is up to date (master@2024.05.13.232520 from yt-dlp/yt-dlp-master-builds)
[SRGSSRPlay] Extracting URL: https://www.rts.ch/play/tv/12h45/video/12h45?urn=urn:rts:video:14903403
[SRGSSR] Extracting URL: srgssr:rts:video:14903403
[SRGSSR] 14903403: Downloading JSON metadata
[SRGSSR] 14903403: Downloading m3u8 information
[info] 14903403: Downloading subtitles: fr
[debug] Sort order given by user: res:900
[debug] Formats sorted by: hasvid, ie_pref, res:900(900.0), lang, quality, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, size, br, asr, proto, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] 14903403: Downloading 1 format(s): HLS-H264-HD-3629+HLS-H264-HD-audio0-Français
[info] Writing video subtitles to: /dev/shm/12h45.fr.vtt
[debug] Invoking hlsnative downloader on "https://rts-vod-amd.akamaized.net/ww/14903403/615a0d44-ed43-38a7-a1d5-e29c808eeb06/index-f6.m3u8"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 162
[download] Destination: /dev/shm/12h45.fr.vtt
[download] 100% of   27.84KiB in 00:06:20 at 74.94B/s
[debug] Invoking hlsnative downloader on "https://rts-vod-amd.akamaized.net/ww/14903403/615a0d44-ed43-38a7-a1d5-e29c808eeb06/index-f4-v1.m3u8"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Fragment downloads will be delegated to aria2c
[hlsnative] Total fragments: 163
[download] Destination: /dev/shm/12h45.fHLS-H264-HD-3629.mp4
[debug] aria2c command line: aria2c -c --no-conf --console-log-level=warn --summary-interval=0 --download-result=hide --http-accept-gzip=true --file-allocation=none -x16 -j16 -s16 --allow-overwrite=true --allow-piece-length-change=true --header 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36' --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' --header 'Accept-Language: en-us,en;q=0.5' --header 'Sec-Fetch-Mode: navigate' --check-certificate=true --remote-time=true --show-console-readout=true --dir /dev/shm/ --auto-file-renaming=false --uri-selector=inorder -i /dev/shm/12h45.fHLS-H264-HD-3629.mp4.part.frag.urls
[download] 100% of  534.48MiB in 00:00:50 at 10.63MiB/s
[debug] Invoking hlsnative downloader on "https://rts-vod-amd.akamaized.net/ww/14903403/615a0d44-ed43-38a7-a1d5-e29c808eeb06/index-f1-a1.m3u8"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Fragment downloads will be delegated to aria2c
[hlsnative] Total fragments: 163
[download] Destination: /dev/shm/12h45.fHLS-H264-HD-audio0-Français.mp4
[debug] aria2c command line: aria2c -c --no-conf --console-log-level=warn --summary-interval=0 --download-result=hide --http-accept-gzip=true --file-allocation=none -x16 -j16 -s16 --allow-overwrite=true --allow-piece-length-change=true --header 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36' --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' --header 'Accept-Language: en-us,en;q=0.5' --header 'Sec-Fetch-Mode: navigate' --check-certificate=true --remote-time=true --show-console-readout=true --dir /dev/shm/ --auto-file-renaming=false --uri-selector=inorder -i '/dev/shm/12h45.fHLS-H264-HD-audio0-Français.mp4.part.frag.urls'
[download] 100% of   20.00MiB in 00:00:03 at 5.79MiB/s
[debug] ffmpeg command line: ffprobe -show_streams 'file:/dev/shm/12h45.fHLS-H264-HD-audio0-Français.mp4'
[Merger] Merging formats into "/dev/shm/12h45.mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i file:/dev/shm/12h45.fHLS-H264-HD-3629.mp4 -i 'file:/dev/shm/12h45.fHLS-H264-HD-audio0-Français.mp4' -c copy -map 0:v:0 -map 1:a:0 -bsf:a:0 aac_adtstoasc -movflags +faststart file:/dev/shm/12h45.temp.mp4
Deleting original file /dev/shm/12h45.fHLS-H264-HD-3629.mp4 (pass -k to keep)
Deleting original file /dev/shm/12h45.fHLS-H264-HD-audio0-Français.mp4 (pass -k to keep)
[MoveFiles] Moving file "/dev/shm/12h45.fr.vtt" to "12h45.fr.vtt"
[MoveFiles] Moving file "/dev/shm/12h45.mp4" to "/home/oc/storage/Videos/DE/TODO/12h45.mp4"
@Ottaviocr added the bug (Bug that is not site-specific) and triage (Untriaged issue) labels on May 14, 2024
@kclauhk
Contributor

kclauhk commented May 15, 2024

For a faster download, you may try --concurrent-fragments 20.
You don't need an external downloader for speed.

@Ottaviocr
Author

Ottaviocr commented May 15, 2024

For a faster download, you may try --concurrent-fragments 20.

Thanks, this reduced the time to:

[download] 100% of 27.84KiB in 00:00:15 at 1.79KiB/s

And it does de facto solve the problem. I still don't understand why aria2c is not involved in the download of subtitles.

@Ottaviocr
Author

try --concurrent-fragments 20; you don't need an external downloader for speed

On second thought, are you suggesting not using aria2c at all?

@kclauhk
Contributor

kclauhk commented May 15, 2024

try --concurrent-fragments 20; you don't need an external downloader for speed

On second thought, are you suggesting not using aria2c at all?

It's up to you, but an external downloader is not needed for speeding up this subtitle download.
You can increase the number, say to 50, to shorten the time further, as long as no errors occur.
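
For anyone driving yt-dlp through its Python API rather than the command line, the suggestion above might be sketched as follows. This is a minimal sketch, not a recommended configuration: it assumes the `concurrent_fragment_downloads`, `external_downloader`, `writesubtitles` and `verbose` keys of the `YoutubeDL` params dict behave as in current yt-dlp releases, so check them against the version you have installed. `build_options` and `download` are hypothetical helper names.

```python
def build_options(concurrent_fragments=20, use_aria2c=True):
    """Build a YoutubeDL params dict for subtitle + media download.

    concurrent_fragments: fragments fetched in parallel by the native
    downloader (assumed to correspond to the concurrent-fragments option).
    use_aria2c: whether to also request aria2c as the external downloader.
    """
    opts = {
        "writesubtitles": True,  # assumed equivalent of --write-subs
        "concurrent_fragment_downloads": concurrent_fragments,
        "verbose": True,
    }
    if use_aria2c:
        # Assumed equivalent of --downloader=aria2c for all protocols.
        opts["external_downloader"] = {"default": "aria2c"}
    return opts


def download(url, **kwargs):
    """Download `url` with the options above; needs yt-dlp installed."""
    import yt_dlp  # imported lazily so build_options works without yt-dlp

    with yt_dlp.YoutubeDL(build_options(**kwargs)) as ydl:
        return ydl.download([url])
```

Calling `download(url, concurrent_fragments=50, use_aria2c=False)` would correspond to raising the number to 50 and dropping the external downloader entirely.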

@dirkf
Contributor

dirkf commented May 15, 2024

[download] 100% of 27.84KiB in 00:06:20 at 74.94B/s

75B/s ??
There's more to this than whether aria2c is being used.

With a UK domestic ADSL2 connection (not the limiting factor) and no external downloader, I get 10x that speed for the subtitle download (41s) and well over 1MiB/s for the mp4 download.

@Ottaviocr
Author

Ottaviocr commented May 15, 2024 via email

@dirkf
Contributor

dirkf commented May 15, 2024

minutes

@Ottaviocr
Author

Ottaviocr commented May 15, 2024 via email

@Ottaviocr
Author

minutes

Gotcha.

@dirkf
Contributor

dirkf commented May 16, 2024

Fragments are the key.

If the site offers a single URL for a resource, it can be downloaded as fast as the site allows.

If the site breaks up video/audio/subtitles into chunks (fragments) to facilitate navigation through the media (e.g. skipping to 30s before the end), the resource has to be fetched chunk by chunk and reassembled. So instead of a single (set up connection) + (transfer) + (close connection), we have one such sequence per fragment. As video has a much higher bit-rate than audio or subtitles, the overhead of the repeated (set up connection) + (close connection) is much lower relative to the data transferred. Fetching fragments concurrently is another way of reducing the overhead.
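
The reasoning above can be put into a toy model. This is only a sketch under stated assumptions, not a description of how hlsnative actually schedules requests: it assumes every fragment pays a fixed per-request cost, that fragments are fetched in waves of `concurrency` requests, and that the payload shares a fixed bandwidth. The 2.3 s overhead is an assumed value chosen to match the roughly 380 s / 162 fragments seen in the original log, and the 10 MiB/s bandwidth is likewise assumed; `estimated_seconds` is a hypothetical name.

```python
import math

KIB = 1024
MIB = 1024 * KIB


def estimated_seconds(total_bytes, fragments, overhead_s, bandwidth_bps,
                      concurrency=1):
    """Toy estimate: one fixed overhead per wave of requests plus transfer time."""
    waves = math.ceil(fragments / concurrency)
    return waves * overhead_s + total_bytes / bandwidth_bps


OVERHEAD_S = 2.3        # assumed per-request cost (connection set-up, latency)
BANDWIDTH = 10 * MIB    # assumed usable bandwidth in bytes per second
SUBS_BYTES = 27.84 * KIB

single_url = estimated_seconds(SUBS_BYTES, 1, OVERHEAD_S, BANDWIDTH)
serial = estimated_seconds(SUBS_BYTES, 162, OVERHEAD_S, BANDWIDTH)
parallel = estimated_seconds(SUBS_BYTES, 162, OVERHEAD_S, BANDWIDTH,
                             concurrency=20)

print(f"single URL:            {single_url:7.1f} s")
print(f"162 fragments, serial: {serial:7.1f} s")
print(f"162 fragments, N=20:   {parallel:7.1f} s")
```

Under these assumptions the transfer term is negligible for subtitles, so the estimate is almost entirely per-request overhead, and concurrency cuts it by roughly the number of parallel requests. That is in the same ballpark as the drop from 06:20 to 00:15 reported earlier in the thread, though the model is far too crude to predict exact times.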

@Ottaviocr
Author

Ottaviocr commented May 16, 2024 via email

@dirkf
Contributor

dirkf commented May 16, 2024

Unrelated (a different level of chunking).

@bashonly added the enhancement (New feature or request) label and removed the bug (Bug that is not site-specific) label on May 31, 2024

4 participants