[proposal] Have a dedicated Jupyter emergency fund for gke.mybinder.org #125

Open
3 tasks
choldgraf opened this issue Mar 7, 2022 · 8 comments

Comments

@choldgraf
Contributor

choldgraf commented Mar 7, 2022

Context

gke.mybinder.org is the largest member of the Binder federation. We have historically run it via credit donations from Google, and have scrambled to find credits from other stakeholders when these credits run out. Credits tend to come in 1-year batches with hard end-dates.

The cyclical nature of credits creates a "credit crunch": old credits may run out before we have found new credits for the infrastructure. This leads to stressful situations where it's unclear how we'll pay for gke.mybinder.org. Most recently this happened in the issues below:

In these moments, individual stakeholders in the Binder project have stepped up to backstop gke.mybinder.org while we look for more credits, but this is a risky solution that depends on individual actors stepping up, and is a potential source of inequity amongst the Binder team members.

Instead, we should define a process that:

  1. Reduces the risk associated with running out of credits on mybinder.org
  2. Defines clear responsibilities for who must ensure that more credits are available to run the service

Proposal

As a first step, I propose that we set aside a dedicated account to backstop gke.mybinder.org. This account could be linked to the gke.mybinder.org Billing Account, so that whenever credits run out, we would begin drawing from this account as a last resort. Our target would be to have at least 6 months of funding in the account at all times, to give us plenty of leeway if we need to find another round of credits or fundraise for it.

Note that most of the time, this funding would not be used - we still aim to power mybinder.org via credit allocations. This is just "gap funding" for when credits happen to run out.

How much cost are we talking about?

Historically, gke.mybinder.org has cost around $7,000 per month (so 6 months of funding would be roughly $42,000). However, we have recently undertaken several cost-saving measures, and believe that this is down to around $4,000 a month. So let's say $24,000 is a low estimate for 6 months of usage. Ideally, we'd shoot for $50,000 in reserves if the funds were available, to give ourselves some breathing room.
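
For anyone who wants to re-run these numbers with different assumptions, here is a minimal sketch of the arithmetic. The dollar figures are the ones quoted above; the function and constant names are just illustrative, and it assumes a constant monthly cost, which real cloud bills won't have.

```python
# Figures quoted in this proposal (USD). Names are illustrative only.
HISTORICAL_MONTHLY_COST = 7_000
CURRENT_MONTHLY_COST = 4_000   # estimate after the recent cost-saving measures
TARGET_MONTHS = 6
IDEAL_RESERVE = 50_000


def reserve_needed(monthly_cost: int, months: int = TARGET_MONTHS) -> int:
    """Funds required to cover `months` of spending at a flat `monthly_cost`."""
    return monthly_cost * months


def runway_months(reserve: int, monthly_cost: int) -> float:
    """How many months `reserve` lasts at a flat `monthly_cost`."""
    return reserve / monthly_cost


print(reserve_needed(HISTORICAL_MONTHLY_COST))   # 42000
print(reserve_needed(CURRENT_MONTHLY_COST))      # 24000
print(runway_months(IDEAL_RESERVE, CURRENT_MONTHLY_COST))               # 12.5
print(round(runway_months(IDEAL_RESERVE, HISTORICAL_MONTHLY_COST), 1))  # 7.1
```

In other words, a $50,000 reserve would cover roughly 7 months at the historical cost and about a year at the current estimate.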

Steps to implement this

I believe that we'd need to take the following steps:

  • Decide whether Jupyter wants to commit to this responsibility
  • Agree on a backstop amount, and find the source of this funding (potentially fundraising if necessary)
  • Define the charter for this funding, and the process around when we begin drawing down from this fund + how we will replenish it if we must draw down the funds.
@choldgraf
Contributor Author

cc @fperez @ellisonbg and @afshin who I think are the ones that recommended I open this issue / proposal. Please let me know if there's a different place that you'd like me to raise this issue.

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/governance-office-hours-meeting-minutes/1480/179

@Carreau
Member

Carreau commented Mar 28, 2022

Sorry for the delay in responding,

I am in general +1 on the idea, but I do have some questions.

  • 50k is also a lot, as it could cover an FTE for 6 months. I'd like to see a comparison of what those 50k could be used for. If we had 50k in an account I would be torn between having it spent on mybinder and on someone to resurrect nbviewer.org, which still has a huge amount of traffic.

  • I'd really like to see a mitigation plan to dramatically reduce mybinder spending if we ever run short on funds. Like a push-button solution that, I don't know, halves the CPU allocated to each image and implements a 30s wait period to start an image, with a "Binder is out of funds" banner.
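
To make the second point concrete, here is a rough sketch of what such a switch could compute. Everything in it is hypothetical: `LaunchSettings` and `low_funds_mode` are made-up names, not BinderHub configuration options, and the "normal" values are placeholders rather than what mybinder.org actually runs with.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LaunchSettings:
    """Hypothetical per-launch settings; not an actual BinderHub config object."""
    cpu_limit: float           # CPU cores allocated to each image
    launch_delay_seconds: int  # artificial wait before an image starts
    banner: str                # message shown to users, empty for none


def low_funds_mode(normal: LaunchSettings) -> LaunchSettings:
    """Return the degraded settings: half the CPU, a 30s wait, and a banner."""
    return replace(
        normal,
        cpu_limit=normal.cpu_limit / 2,
        launch_delay_seconds=30,
        banner="Binder is out of funds",
    )


# Placeholder values for illustration only.
normal = LaunchSettings(cpu_limit=1.0, launch_delay_seconds=0, banner="")
degraded = low_funds_mode(normal)
print(degraded)
```

The point is only that the degraded state is derived from the normal one in a single step, so switching it on or off is one action rather than a series of manual edits.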

@minrk
Member

minrk commented Mar 28, 2022

We've been discussing some of these things over on JupyterHub threads. One of the biggest issues with the current setup is that we have "n=1" things that are hard to turn off (federation-redirect, main DNS, central analytics, events archive) plus the GKE BinderHub deployment which is the vast majority of the cost, in the same place. One thing we can do is move the "this must run" stuff to a separate deployment, so that it's easier to just "turn GKE off", which rapidly takes most costs to zero (ideally, we'd have a federation that can handle it, but even when we don't all federation members can now be 'full' with come-back-later messages). That way, we could be paying for the inexpensive always-on stuff (we can work on cost estimates) with steady donation income, and rely on grants/other support for the GKE federation member cluster which may need to shut down once in a while.

@choldgraf
Contributor Author

choldgraf commented Mar 28, 2022

I'd like to see comparison on what those 50k could be used for.

In this case, these funds would only be used to pay for cloud costs with Binder if we ran out of other funding sources. This would be untouchable funding that cannot be used for other things. So if we are using these funds, it should be a question of "do we want mybinder.org to keep running or not".

In a separate conversation, I think we should find ways to raise revenue that supports ongoing operations and development of these services, but I am trying to scope this issue specifically for emergency purposes (like the one we are in now).

I'd really like to see a mitigation plan in case we are short on funds to dramatically reduce mybinder spending if the case arise

I would also love to see a plan like this. But I do not have the bandwidth nor skills right now to work on it. Do you see it as a blocker for devoting central Jupyter funds to support mybinder.org?

@choldgraf changed the title from "[proposal] Have a dedicated Jupyter rainy day fund for gke.mybinder.org" to "[proposal] Have a dedicated Jupyter emergency fund for gke.mybinder.org" on Mar 29, 2022
@choldgraf
Contributor Author

I've tried to clarify this proposal by renaming it to "Emergency fund" instead of "Rainy day fund". I think the original title may have misled people into thinking it was for a "nice to have" kind of purpose. "Emergency fund" makes it clearer that this is just for emergencies.

@Carreau
Member

Carreau commented Mar 30, 2022

So if we are using these funds, it should be a question of "do we want mybinder.org to keep running or not".

Sorry, I was maybe unclear. The question was more "what could those funds be used for if they were not blocked for binder". Literally assuming we have those 50k in the bank, is blocking them for binder preventing us from paying someone to do devops on nbviewer 8h per week for two years?

As you point out later, we are all out of bandwidth, and if the choice is between paying cloud costs and paying you to go raise some money, I would most likely prefer the second over the first.

I don't have any opposition to an emergency/rainy day fund (I understood the two the same way). I want to better discuss the criticality of paying cloud costs vs. people.

@damianavila
Member

That way, we could be paying for the inexpensive always-on stuff (we can work on cost estimates) with steady donation income, and rely on grants/other support for the GKE federation member cluster which may need to shut down once in a while.

This is an interesting model given the current money constraints, and it probably fits better with @Carreau's thoughts on the paying-cloud-costs-vs-people discussion (because you have more degrees of freedom to decide whether to support the GKE cluster or allocate that money to development instead).

5 participants