Fix threads implementation #685

Open: 1 commit to be merged into base: dev

@jrdi commented Apr 12, 2021

I invested some time fixing the threads implementation and also adding other improvements such as:

  • Retries with backoff
  • Normalize output for single tickers
  • Remove the shared dfs, which are prone to race conditions

I'd appreciate it if you can provide some feedback.
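
To illustrate the "retries with backoff" item, here is a minimal sketch. It is not the code from this PR: `retry_with_backoff` and all of its parameters are hypothetical names chosen for the example, and the delay schedule (exponential, doubling each time) is an assumption.

```python
import time


def retry_with_backoff(func, retries=3, base_delay=0.5, factor=2.0, sleep=time.sleep):
    """Call func(); on failure, wait and try again with growing delays.

    Hypothetical helper for illustration only, not part of yfinance.
    `sleep` is injectable so the waiting can be replaced in tests.
    """
    delay = base_delay
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error to the caller
            sleep(delay)
            delay *= factor  # exponential backoff: 0.5s, 1s, 2s, ...
```

Re-raising after the last attempt matters here: a worker that swallows the error silently is one way a threaded download can end up waiting on a result that never arrives.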

@arnaudgelas
Contributor

@jrdi Although I agree with your new implementation, it would be great if you could provide (at least) 1 example which was failing without your fix and now works with it.

@ValueRaider
Collaborator

@fredrik-corneliusson I believe you are a heavy user of download() - is this PR worth resolving & merging in?

@fredrik-corneliusson
Contributor

Yes, I think the threads implementation is in need of an improvement: any exception raised in a thread results in the whole download hanging.
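
One common way to avoid that kind of hang is to run the per-ticker work in a bounded `concurrent.futures.ThreadPoolExecutor` and collect exceptions per task instead of letting them die inside the thread. The sketch below is only an illustration of that pattern, not the implementation in this PR; `fetch_all`, `fetch_one`, and `max_workers=8` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(tickers, fetch_one, max_workers=8):
    """Run fetch_one(ticker) for every ticker in a bounded thread pool.

    Returns (results, errors): two dicts keyed by ticker. An exception in
    one task is recorded in `errors` and does not block the others.
    Hypothetical helper for illustration only.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its ticker so failures can be attributed.
        futures = {pool.submit(fetch_one, t): t for t in tickers}
        for future in as_completed(futures):
            ticker = futures[future]
            try:
                results[ticker] = future.result()
            except Exception as exc:
                errors[ticker] = exc
    return results, errors
```

Because each dict is written only from the calling thread, there is no shared mutable state between workers, and the caller always gets control back once every task has either finished or failed.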

@starboi-63

starboi-63 commented Aug 21, 2023

> @jrdi Although I agree with your new implementation, it would be great if you could provide (at least) 1 example which was failing without your fix and now works with it.

I'm not sure if this issue is local to the machine I was working with (M1 MacBook Pro, macOS 13.5 Ventura, Python 3.11.4), but for large downloads (7000+ tickers over 2 years), I get RuntimeError: can't start new thread. This doesn't stop the download immediately, but it consistently hangs midway through. The specific list of tickers I used to cause the issue is from NASDAQ's API.

I fixed the issue by implementing a yf_download_batches() function that splits large download() calls into smaller ones and combines the data using pandas.concat(). My function only saves adjusted close data, but I think the changes made by @jrdi would also fix the issue I was having by handling the RuntimeError.
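
The `yf_download_batches()` function itself is not shown in this thread, so the following is only a guess at the general shape of such a workaround: split the ticker list into fixed-size chunks, download each chunk separately, and combine the pieces. The names `chunked` and `download_in_batches`, the injected `download` and `combine` callables, and the default `batch_size=500` are all assumptions made for the example.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def download_in_batches(tickers, download, combine, batch_size=500):
    """Download tickers in smaller batches and merge the partial results.

    `download(batch)` fetches one batch; `combine(parts)` merges the list
    of per-batch results. Both are supplied by the caller (hypothetical
    interface, for illustration only).
    """
    parts = [download(batch) for batch in chunked(list(tickers), batch_size)]
    return combine(parts)
```

With yfinance and pandas, `download` could be a small wrapper around `yfinance.download` and `combine` something like `lambda parts: pandas.concat(parts, axis=1)`. Keeping each call small limits how many threads a single `download()` call tries to start at once.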

@ValueRaider
Collaborator

ValueRaider commented Aug 22, 2023

Anyone can submit a pull request, just saying.
