Properties that default to pandas have unclear error messages #7233
Labels
Enable plugin
Fixes needed to enable external plugins
P2
Minor bugs or low-priority feature requests
Comments
noloerino added the P2 (Minor bugs or low-priority feature requests) and Enable plugin (Fixes needed to enable external plugins) labels on Apr 30, 2024
sfc-gh-joshi added a commit to snowflakedb/snowpark-python that referenced this issue on May 1, 2024
#1454)

Please answer these questions before submitting your pull requests. Thanks!

1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

   Fixes SNOW-1347394

2. Fill out the following pre-review checklist:

   - [ ] I am adding a new automated test(s) to verify correctness of my new code
   - [ ] I am adding new logging messages
   - [ ] I am adding a new telemetry message
   - [ ] I am adding new credentials
   - [ ] I am adding a new dependency

3. Please describe how your code solves the related issue.

   This PR removes our vendored copy of the `BaseQueryCompiler` class, inheriting the class from upstream Modin instead. Similarly, it removes all the operator registration classes defined in `snowflake.snowpark.modin.core.dataframe.algebra.default2pandas`, with one exception. Upstream Modin does not properly render the names of `property` objects (modin-project/modin#7233), so we should override `DataFrameDefault.register` to fix this until this issue is fixed upstream.

   This PR incidentally removes `Series.dt.week` + `Series.dt.weekofyear`, which were already removed in pandas 2.0.

---------

Co-authored-by: Naren Krishna <naren.krishna@snowflake.com>
noloerino added a commit to noloerino/modin that referenced this issue on May 14, 2024
…ror messages Signed-off-by: Jonathan Shi <jhshi07@gmail.com>
noloerino added a commit to noloerino/modin that referenced this issue on May 14, 2024
…ror messages Signed-off-by: Jonathan Shi <jhshi07@gmail.com>
YarShev pushed a commit that referenced this issue on May 15, 2024
When a DataFrame/Series property defaults to pandas by registering an operator in `modin.core.dataframe.algebra.default2pandas`, the raised warning message does not properly render the name of the property. For example, commenting out the `PandasQueryCompiler` assignment of `dt_date` and instead using the `BaseQueryCompiler` implementation (which uses `DateTimeDefault`) produces a warning in which the property name is not rendered properly.

This occurs because the method is registered with `DateTimeDefault.register(pandas.Series.dt.date)`, and since `dt.date` is a property, it has no `__name__` field. To resolve this, the following line of `DefaultMethod.register` should be changed to access `func.fget.__name__` if `func` is a Python `property` object:

modin/modin/core/dataframe/algebra/default2pandas/default.py, line 97 in 9fa326f
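
The proposed change can be illustrated with a minimal, standalone sketch. It does not reproduce Modin's actual `DefaultMethod.register` code: `FakeDatetimeAccessor` and `resolve_name` are hypothetical names invented for this example, and the sketch only shows the idea of reading the name from the property's getter (`fget`) when the registered object is a `property`.

```python
class FakeDatetimeAccessor:
    """Hypothetical stand-in for an accessor class that exposes a property."""

    @property
    def date(self):
        # Placeholder value; only the attribute's name matters here.
        return "2024-04-30"

    def normalize(self):
        # An ordinary method, which always carries its own __name__.
        return self


def resolve_name(func):
    """Return a printable name for a plain function or a property object.

    Assumption: this mirrors the fix suggested in the issue, namely using
    func.fget.__name__ when func is a property.
    """
    if isinstance(func, property):
        # The getter is a regular function, so it has a __name__.
        return func.fget.__name__
    return func.__name__


# Accessing the attribute on the class (not an instance) returns the
# property object itself, which is what gets passed to a register call.
property_name = resolve_name(FakeDatetimeAccessor.date)
method_name = resolve_name(FakeDatetimeAccessor.normalize)
```

Checking `isinstance(func, property)` rather than catching `AttributeError` keeps the behavior the same regardless of whether the running Python version exposes a `__name__` on property objects.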
As far as I can tell, this does not affect any of the first-class Modin backends. Properties like `DataFrame.attrs` use `BasePandasDataset._default_to_pandas`, which explicitly requires a name to be passed in. The `dt`/`str` accessor properties use `StrDefault` and `DateTimeDefault` in `BaseQueryCompiler`, but `PandasQueryCompiler` uses the `Map` operator to avoid this default. However, downstream libraries that implement a custom subclass of `BaseQueryCompiler` are affected.