Consider alternative frameworks for data structures? #187

Open
cboettig opened this issue Feb 23, 2024 · 3 comments

Comments

@cboettig
Member

For lots of good reasons, data-8 has, as I understand it, always relied on Berkeley's own datascience package, which offers a far more intuitive/pythonic syntax than pandas. While I agree with all the pedagogical justification there relative to pandas, as you all probably know there are now much more performant and pythonic alternatives displacing pandas' dominance, specifically polars and ibis.

I think these provide a syntax that is closer to datascience than pandas is, and that is better aligned with and informed by database theory (and indeed can be translated directly to SQL). I know this wouldn't be a small overhaul, but I think it could be a substantial improvement.

Maybe it would make more sense to migrate data100 from pandas to polars first?
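
To make the database-theory point above concrete, here is a minimal sketch of a filter, group, aggregate pipeline written as SQL, using only Python's built-in sqlite3 module. The table, column names, and values are invented for illustration and are not taken from any course material. In the expression-style libraries mentioned above, the same steps would be written as a chain of method calls (for example `filter`, `group_by`, and `agg` in polars, or `filter`, `group_by`, and `aggregate` in ibis), each of which corresponds to one clause of the query.

```python
import sqlite3

# Hypothetical toy data; names and values are made up for this sketch.
rows = [
    ("Adelie", "Torgersen", 3750.0),
    ("Adelie", "Biscoe", 3800.0),
    ("Gentoo", "Biscoe", 5000.0),
    ("Gentoo", "Biscoe", 5400.0),
    ("Chinstrap", "Dream", 3500.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE penguins (species TEXT, island TEXT, body_mass REAL)")
conn.executemany("INSERT INTO penguins VALUES (?, ?, ?)", rows)

# Each clause is one relational step:
#   WHERE    -> filter rows
#   GROUP BY -> group rows by a key
#   AVG(...) -> aggregate within each group
#   ORDER BY -> sort the result
query = """
    SELECT species, AVG(body_mass) AS mean_mass
    FROM penguins
    WHERE body_mass > 3600
    GROUP BY species
    ORDER BY species
"""
result = conn.execute(query).fetchall()
conn.close()

print(result)
```

Because every step is a named relational operation rather than index manipulation, a pipeline written this way can in principle be read, taught, and translated clause by clause.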

@davidwagner
Member

That does sound intriguing and promising! I don't know polars and ibis well enough to judge whether this would be an improvement, nor do I have the resources to take on a shift this major, but it does sound like the sort of change that might be a big improvement. It would be great to run on existing standard libraries rather than relying on the datascience package, if the existing libraries are easy enough to learn and meet the pedagogical goals.

@cboettig
Member Author

@davidwagner very cool! @jegonzal and @fperez were discussing this a bit in the context of data-100 too and may have more insight. From what I understand, it sounds like Wes created ibis to address these issues they had in pandas in the first place [1].

@fperez

fperez commented Apr 17, 2024

Yes! I haven't had time to dig into the details of polars vs ibis, and I'm not even sure if they occupy quite the same space. But polars is definitely rapidly rising as a viable alternative to pandas, and I think we'd gain a ton from exploring this.

I also think that a combination of one/two GSIs + AI-assisted translation could make the porting of at least the base material a reasonable lift, with the faculty/textbook authors having to only do a final review of the resulting product.

It's not trivial, but it could be done in parallel over a semester if DSUS assigns one or two GSIs to the job.


3 participants