Software versions
Python version : 3.8.17 (default, Aug 10 2023, 12:50:17)
IPython version : 8.12.3
Tornado version : 6.4
Bokeh version : 3.1.1
Expected behavior
The boxplot example in the documentation (examples/topics/stats/boxplot.py) should compute the whiskers by:
Observed behavior
The whiskers are computed in the example by just calculating:
This leads to the whiskers not providing any additional information at all.
Example code
import pandas as pd

from bokeh.models import ColumnDataSource, Whisker
from bokeh.plotting import figure, show
from bokeh.sampledata.autompg2 import autompg2
from bokeh.transform import factor_cmap

df = autompg2[["class", "hwy"]].rename(columns={"class": "kind"})

kinds = df.kind.unique()

# compute quantiles
qs = df.groupby("kind").hwy.quantile([0.25, 0.5, 0.75])
qs = qs.unstack().reset_index()
qs.columns = ["kind", "q1", "q2", "q3"]
df = pd.merge(df, qs, on="kind", how="left")

# compute IQR outlier bounds
iqr = df.q3 - df.q1
df["upper"] = df.q3 + 1.5*iqr
df["lower"] = df.q1 - 1.5*iqr

source = ColumnDataSource(df)

p = figure(x_range=kinds, tools="", toolbar_location=None,
           title="Highway MPG distribution by vehicle class",
           background_fill_color="#eaefef", y_axis_label="MPG")

# outlier range
whisker = Whisker(base="kind", upper="upper", lower="lower", source=source)
whisker.upper_head.size = whisker.lower_head.size = 20
p.add_layout(whisker)

# quantile boxes
cmap = factor_cmap("kind", "TolRainbow7", kinds)
p.vbar("kind", 0.7, "q2", "q3", source=source, color=cmap, line_color="black")
p.vbar("kind", 0.7, "q1", "q2", source=source, color=cmap, line_color="black")

# outliers
outliers = df[~df.hwy.between(df.lower, df.upper)]
p.scatter("kind", "hwy", source=outliers, size=6, color="black", alpha=0.3)

p.xgrid.grid_line_color = None
p.axis.major_label_text_font_size="14px"
p.axis.axis_label_text_font_size="12px"

show(p)
I could provide a PR in the next few days if desired.
@its-DomeE do you mean the whisker difference like
I'm using backported (adapted) code from matplotlib.cbook.boxplot_stats() in my lib. The function itself uses the [McGill1978] approach.
(seaborn.boxplot() uses this code too, since seaborn is a high-level frontend for matplotlib.)
[McGill1978] McGill, R., Tukey, J.W., and Larsen, W.A. (1978) "Variations of Boxplots", The American Statistician, 32:12-16.
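For readers following along, here is a minimal, dependency-free sketch of the whisker convention under discussion: each whisker ends at the most extreme observation that still lies within 1.5 IQR of the box, rather than at the computed bound itself. To my understanding this is also what matplotlib.cbook.boxplot_stats() does with its default whis=1.5, but treat that as an assumption. The helper name tukey_whiskers and the sample values are hypothetical and not taken from the Bokeh example or from autompg2.

```python
from statistics import quantiles


def tukey_whiskers(values, whis=1.5):
    """Return (lower_whisker, q1, median, q3, upper_whisker) for one group.

    Assumed convention: whiskers end at the most extreme observations that
    still lie within whis * IQR of the box, not at the fences themselves.
    """
    data = sorted(values)
    # method="inclusive" interpolates linearly between order statistics,
    # the same scheme pandas' Series.quantile uses by default.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    # Clamp each whisker to real data; q1/q3 act as fallbacks should no
    # observation qualify.
    lower = min((v for v in data if v >= lower_fence), default=q1)
    upper = max((v for v in data if v <= upper_fence), default=q3)
    return lower, q1, q2, q3, upper


hwy = [12, 20, 21, 22, 23, 24, 25, 26, 44]  # made-up sample, not autompg2
lower, q1, q2, q3, upper = tukey_whiskers(hwy)
# Anything beyond the whisker ends is drawn as an outlier point.
outliers = [v for v in hwy if v < lower or v > upper]
```

With this sample the bounds q1 - 1.5*iqr and q3 + 1.5*iqr are 15 and 31, while the whiskers stop at the data points 20 and 26, so the whisker length reflects the spread of the observations instead of being a fixed multiple of the box height.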