XLSX no longer processing due to xlrd #232

Open
bushong1 opened this issue Aug 31, 2021 · 6 comments

Comments

@bushong1

So it looks like the dependency messytables uses xlrd for Excel file processing. The latest xlrd no longer supports XLSX files due to, as I understand it, security concerns. messytables appears to be a dead project, with no activity in the last 2 years. This Stack Overflow post says that xlrd should be swapped out for openpyxl, but with messytables being unmaintained, that seems unlikely to happen. Is any effort being made to support XLSX files?
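To illustrate the swap described above, here is a minimal sketch of reading rows by file extension: xlrd for legacy .xls files and openpyxl for .xlsx. The function names `choose_excel_backend` and `read_rows` are hypothetical and are not part of messytables or Datapusher; the sketch assumes both xlrd and openpyxl are installed and only reads the first sheet.

```python
from pathlib import Path


def choose_excel_backend(filename):
    """Return the name of the library expected to read this file (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".xls":
        return "xlrd"  # xlrd 2.x still reads the legacy .xls format
    if suffix in {".xlsx", ".xlsm"}:
        return "openpyxl"  # xlrd 2.x dropped these formats
    raise ValueError(f"Unsupported spreadsheet extension: {suffix!r}")


def read_rows(filename):
    """Yield the rows of the first sheet as lists of cell values."""
    backend = choose_excel_backend(filename)
    if backend == "openpyxl":
        # Imported lazily so the dispatch helper works without the library.
        from openpyxl import load_workbook

        workbook = load_workbook(filename, read_only=True, data_only=True)
        try:
            sheet = workbook.worksheets[0]
            for row in sheet.iter_rows(values_only=True):
                yield list(row)
        finally:
            workbook.close()
    else:
        import xlrd

        book = xlrd.open_workbook(filename)
        sheet = book.sheet_by_index(0)
        for index in range(sheet.nrows):
            yield sheet.row_values(index)
```

Because `read_rows` is a generator, an unsupported extension raises `ValueError` when iteration starts, not when the function is called.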

@anuveyatsu

@bushong1 there might be some work from our side to fix this, but it has not been confirmed yet. Also, I'd consider replacing Datapusher with Aircan, but you'd need to create a new DAG for XLSX loading.

@fishbone1

When I try to upload an XLSX file, the state remains "pending" forever, which is odd.

@categulario
Contributor

It seems to me that the option is to replace the messytables dependency with its successor, frictionless.

@EricSoroos

We're also seeing that some .ods files aren't processed well by messytables: they cause OOM errors after consuming more than 4 GB of memory. Among other reasons, it extracts the zip file into memory and potentially duplicates cells in rows many times to fill a large empty spreadsheet.
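As a rough illustration of guarding against the in-memory extraction part of this, the sketch below reads the declared uncompressed size of `content.xml` (the main document part of an .ods archive) before anything is extracted. The helper names and the 200 MB default limit are assumptions made for this example, not existing messytables or Datapusher settings. The size comes from the zip metadata, so it is only a heuristic, and it does not address the repeated-cell expansion.

```python
import zipfile


def uncompressed_size(path_or_file, member="content.xml"):
    """Return the declared uncompressed size in bytes of one archive member.

    Only the zip directory is read; the member itself is not extracted.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        return archive.getinfo(member).file_size


def is_safe_to_load(path_or_file, limit_bytes=200 * 1024 * 1024):
    """Hypothetical pre-check: refuse files whose content.xml exceeds the limit."""
    return uncompressed_size(path_or_file) <= limit_bytes
```

A caller could run `is_safe_to_load(path)` before handing the file to the parser and mark the job as failed instead of letting the worker run out of memory.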

@pkernevez

Any news on this issue?
Has someone found a solution (like Aircan)?

@categulario
Contributor

categulario commented Jan 19, 2023 via email
