Discrepancy in seaborn.objects.Dodge groupby order #3593
Comments
Duplicate of #3015 I believe?
Yeah, the problem presents a little bit differently but I think it is the same underlying issue.
Agreed, it seems the root cause is the same; sorry I didn't notice the issue you mentioned. By the way, if nobody else is already working on it and you think it could be a good first issue, I would like to take a deeper look.
Actually sorry, I want to revise what I said here a bit. First, I didn't look at the linked issue closely enough: this did sound like a duplicate, but the more relevant issue is #3556, which is a bit different and more fundamental. But on the other hand I'm not totally convinced that there's a well-defined "correct behavior" here, since you actually are passing different datasets, and the default ordering rule is to use categories in the order that they are encountered in the data. Unless I am missing something, I think that the consistent seaborn behavior would be to assign different default orderings. In any case, if not obvious, the existing way to force a specific ordering would be
It's true that I'm passing a different dataset, but the reason is that I want to produce a boxplot using seaborn objects and (up to now) I haven't found a function or argument to plot only the outliers from the original dataset. I also agree that the current behavior is consistent: a new dataset is passed, so the groupby operation should be repeated and the categories should be placed in the order of appearance. But may I ask if there is already a plan to provide dataset operations inside seaborn objects, or a BoxPlot object? The use cases that I can imagine at the moment are the following ones:
By the way, thanks for the reference to
Yes I've thought about adding an
Hi, I would like to report a strange behavior in the `Move` object `Dodge`.
Seaborn version: 0.13.0
Matplotlib version: 3.8.2
Everything started because I wanted to play around with the `objects` namespace.
As the dataset I use the penguins dataset. I drop both the NaN values and all the values that I do not consider outliers, i.e. everything between the 0.05 and 0.95 quantiles (for each combination of species and sex).
Here is the code to replicate the dataset:
The print in the nested for loop is just there to give an idea of how many points to expect.
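The original snippet is not preserved in this copy of the issue. As a rough, dependency-free sketch of the filtering described above, the following keeps only the rows outside the 0.05 to 0.95 quantile band of their (species, sex) group. Everything here is hypothetical and made up for illustration: the sample rows, the `quantile` helper, and the `keep_outliers` function. It is not the code from the report, which worked on the real penguins dataset.

```python
from collections import defaultdict


def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    pos = (len(sorted_vals) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def keep_outliers(rows, low=0.05, high=0.95):
    """Drop rows with missing values, then keep rows outside the quantile band of their group."""
    clean = [
        r for r in rows
        if None not in (r["species"], r["sex"], r["body_mass_g"])
    ]
    # Collect the body masses of every (species, sex) combination.
    groups = defaultdict(list)
    for r in clean:
        groups[(r["species"], r["sex"])].append(r["body_mass_g"])
    # Compute the lower and upper quantile bounds per group.
    bounds = {}
    for key, vals in groups.items():
        vals.sort()
        bounds[key] = (quantile(vals, low), quantile(vals, high))
    # Keep rows strictly outside the band, preserving the original row order.
    result = []
    for r in clean:
        lower, upper = bounds[(r["species"], r["sex"])]
        if r["body_mass_g"] < lower or r["body_mass_g"] > upper:
            result.append(r)
    return result


# Made-up rows (not real penguin measurements); None stands in for NaN.
rows = [
    {"species": "Adelie", "sex": "Male", "body_mass_g": 3500},
    {"species": "Adelie", "sex": "Female", "body_mass_g": 2900},
    {"species": "Adelie", "sex": "Male", "body_mass_g": 3000},
    {"species": "Adelie", "sex": "Female", "body_mass_g": 3300},
    {"species": "Adelie", "sex": None, "body_mass_g": 3200},
    {"species": "Adelie", "sex": "Male", "body_mass_g": 3600},
    {"species": "Adelie", "sex": "Female", "body_mass_g": 3400},
    {"species": "Adelie", "sex": "Male", "body_mass_g": 3700},
    {"species": "Adelie", "sex": "Female", "body_mass_g": 3450},
    {"species": "Adelie", "sex": "Male", "body_mass_g": 4500},
    {"species": "Adelie", "sex": "Female", "body_mass_g": 4000},
]

outliers = keep_outliers(rows)
```

Note that in this toy data the first full row is "Male" while the first surviving outlier row is "Female", which is the same situation the rest of the report runs into.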
Then I try to plot these outliers with species on the x axis and body_mass_g on the y axis as follows:
As you can see, the only difference is that in the first plot I pass the outliers dataset in the add layer, while in the second plot I pass the outliers dataset directly at the plot level.
I expected the two resulting plots to be identical, but I get two different results, and I think this is caused by the order of the groupby operation in the `Dodge` move.
Attached are the two plots, in the same order as the code:
I found a way to make the two plots identical: in the code for the first plot, also include the `groupby` argument as follows:
The cause of the problem is that the first element of the outliers is a "female" penguin, while in the original dataset it's a "male" penguin.
I can see that, without specifying anything, the groupby operation is executed on the new dataset, possibly producing a different order.
But I don't see why, when I specify the groupby variable at that level, the groupby order is computed on the `penguins` dataset.
As a solution I would suggest the following:
This way, with the first solution the order of the components would be the same across all the layers, even when different sub-datasets are provided at each level.
Otherwise, the user has the second option of redefining the groupby operation per level.
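The ordering effect discussed in this thread can be illustrated without seaborn. The sketch below only models the "order of appearance" rule mentioned in the comments; it is not seaborn's internal code, and `order_of_appearance` and the sample lists are hypothetical names and data chosen for illustration.

```python
def order_of_appearance(values):
    """Unique values in first-seen order (dicts preserve insertion order)."""
    return list(dict.fromkeys(values))


# Hypothetical "sex" columns: the full data starts with Male,
# while the filtered outliers happen to start with Female.
full_sex = ["Male", "Female", "Female", "Male", "Female"]
outlier_sex = ["Female", "Male", "Female"]

full_order = order_of_appearance(full_sex)
outlier_order = order_of_appearance(outlier_sex)

# Position each category would get if slots were assigned in that order.
full_slots = {category: i for i, category in enumerate(full_order)}
outlier_slots = {category: i for i, category in enumerate(outlier_order)}

# The same category lands in a different slot depending on which data
# the ordering was derived from.
swapped = full_slots["Female"] != outlier_slots["Female"]
```

Under this rule a layer whose ordering is derived from the subset places "Female" first, while one whose ordering is derived from the full data places "Male" first. Pinning the order explicitly, for example with a scale such as `so.Nominal(order=[...])` in seaborn objects, is one way to avoid depending on row order.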