sqlp selects wrong data when multiple tables have the same named column #1820

Open
urkle opened this issue May 15, 2024 · 6 comments


urkle commented May 15, 2024

Describe the bug
qsv sqlp selects the wrong "table" when multiple CSV files have the same-named column.

To Reproduce
Steps to reproduce the behavior:

  1. Create CSV file "one.csv" with contents:

     id,data
     1,open

  2. Create CSV file "two.csv" with contents:

     id,data
     1,closed

  3. Run `qsv sqlp one.csv two.csv 'SELECT _t_1.id, _t_2.data FROM _t_1 JOIN _t_2 ON _t_1.id = _t_2.id'`.
     The wrong output is produced:

     id,data
     1,open

Expected behavior
The output

id,data
1,closed
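
As a cross-check of the expected result, the same join can be run against an independent SQL engine. The sketch below uses SQLite through Python's standard library. It is not qsv or Polars, and it only shows what standard SQL semantics give for the two files from the reproduction steps. The helper names `load_table` and `run_reference_join` are hypothetical, and every column is loaded as TEXT for simplicity, which is an assumption of this sketch and not how sqlp infers types.

```python
import csv
import io
import sqlite3

# Contents of one.csv and two.csv from the reproduction steps.
ONE_CSV = "id,data\n1,open\n"
TWO_CSV = "id,data\n1,closed\n"

# The query from the report, unchanged.
QUERY = (
    "SELECT _t_1.id, _t_2.data "
    "FROM _t_1 JOIN _t_2 ON _t_1.id = _t_2.id"
)


def load_table(conn, name, csv_text):
    """Create table `name` from CSV text (hypothetical helper; all columns TEXT)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    conn.execute(f'CREATE TABLE "{name}" ({cols})')
    marks = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{name}" VALUES ({marks})', body)


def run_reference_join():
    """Run the reported query in SQLite and return the result rows."""
    conn = sqlite3.connect(":memory:")
    # Table names mirror the _t_1 / _t_2 aliases used in the sqlp query.
    load_table(conn, "_t_1", ONE_CSV)
    load_table(conn, "_t_2", TWO_CSV)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run_reference_join())
```

Here the `data` column comes from the second table, so the single row carries `closed`, matching the expected output above.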

Desktop (please complete the following information):

  • OS: macOS 12.7.4
  • qsv Version: qsv 0.127.0-standard-apply;fetch;foreach;geocode;Luau 0.622;to;polars-0.39.2;self_update-16-16;51.20 GiB-1.43 GiB-22.83 GiB-64.00 GiB (x86_64-apple-darwin compiled with Rust 1.77.2) prebuilt
Author

urkle commented May 15, 2024

This looks to be an issue with polars.

I installed polars-cli 0.7.0 and ran this query.

select a.id, a.data as d1, b.data as d2 from read_csv('one.csv') as a join read_csv('two.csv') as b ON a.id = b.id;

which yields

┌─────┬──────┬──────┐
│ id  ┆ d1   ┆ d2   │
│ --- ┆ ---  ┆ ---  │
│ i64 ┆ str  ┆ str  │
╞═════╪══════╪══════╡
│ 1   ┆ open ┆ open │
└─────┴──────┴──────┘

Owner

jqnatividad commented May 15, 2024

Thanks @urkle for the report and checking the query on polars-cli as well.

As you surmised, sqlp is really a convenience wrapper to tap the power of Polars for CSVs from the command-line.

Given polars' development tempo, if you report the issue, I wouldn't be surprised if the fix makes it into the next Rust Polars release.

Also, it's been about a month since polars 0.39.2 was released, and there are tons of improvements I'm looking forward to...

pola-rs/polars@rs-0.39.2...main

Author

urkle commented May 15, 2024

@jqnatividad I am working on reporting w/ polars right now.

Owner

jqnatividad commented May 16, 2024

pola-rs/polars#16255 was closed as the problem was originally reported in pola-rs/polars#15929, which is still awaiting resolution.

@jqnatividad
Owner

Just upgraded polars to 0.40.0 and unfortunately, it's still not returning the expected result.

@alexander-beedie

FYI: the fix will be landing on our side in the (very) near future ;)
pola-rs/polars#16507
