Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Control number of progress bars #243

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

applio
Copy link

@applio applio commented Jun 8, 2023

As described in issue #242, this PR proposes adding a way to control the maximum number of progress bars to display that maintains backwards-compatibility with existing code.

Control over the number of progress bars is made through the existing input parameter to pandarallel.initialize():

pandarallel.pandarallel.initialize(nb_workers=42, progress_bar=5)

Expanded testing is provided by both expanding the number of params to the pytest.fixture, progress_bar() as well as one new specific unit test.

@codecov
Copy link

codecov bot commented Jun 8, 2023

Codecov Report

Merging #243 (664f3ec) into master (46fc0e5) will decrease coverage by 0.96%.
The diff coverage is 64.00%.

@@            Coverage Diff             @@
##           master     #243      +/-   ##
==========================================
- Coverage   91.37%   90.42%   -0.96%     
==========================================
  Files          12       12              
  Lines         580      595      +15     
==========================================
+ Hits          530      538       +8     
- Misses         50       57       +7     
Impacted Files Coverage Δ
pandarallel/progress_bars.py 72.79% <55.00%> (-2.62%) ⬇️
pandarallel/core.py 91.05% <100.00%> (+0.04%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@applio
Copy link
Author

applio commented Jun 8, 2023

If I am interpreting the output from the codecov bot correctly, all of the decrease in the measured percentage of code coverage can be attributed to the patch's changes to code that only executes in a Jupyter notebook setting, specifically changes to ProgressBarsNotebookLab.__init__().

If desired, I could introduce mocks in an additional unittest to permit this to kind-of run just to get the code coverage percentage higher though I did not see any existing unit tests to do anything similar? I'm not sure how much practical value it would have beyond bumping the percentages?

@till-m till-m requested review from nalepae and till-m June 8, 2023 19:58
Copy link
Collaborator

@till-m till-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @applio,

thanks for the contribution. Some small remarks, but this looks good to me (code-wise + when running in a terminal), unfortunately I think my ipywidgets are broken locally and I cannot check whether they work properly in a notebook. @nalepae could you check maybe?

I'm not entirely sure what the usecase of multiple progress bars is, when one progress bar represents multiple workers. What is the advantage here compared to using just one progress bar?

The only thing that this is lacking (from my side) is documentation. This does not need to happen in this PR as far as I am concerned, but I would appreciate if you could expand the section here.

If desired, I could introduce mocks in an additional unittest to permit this to kind-of run just to get the code coverage percentage higher though I did not see any existing unit tests to do anything similar? I'm not sure how much practical value it would have beyond bumping the percentages?

I don't think this would be good, it's basically just window dressing if I understand correctly. Unfortunately I think right now there is no way around doing this manually, but then again, with primarily visual features I think one should check them manually anyways.

@@ -205,6 +205,7 @@ def parallelize_with_memory_file_system(
nb_requested_workers: int,
data_type: Type[DataType],
progress_bars_type: ProgressBarsType,
max_progress_bars=-1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typehint

@@ -355,6 +356,7 @@ def parallelize_with_pipe(
nb_requested_workers: int,
data_type: Type[DataType],
progress_bars_type: ProgressBarsType,
max_progress_bars=-1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typehint

@@ -55,9 +55,18 @@ def is_notebook_lab() -> bool:


class ProgressBarsConsole(ProgressBars):
def __init__(self, maxs: List[int], show: bool) -> None:
def __init__(self, maxs: List[int], show: bool, count=-1) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typehint

@@ -119,7 +132,7 @@ def update(self, values: List[int]) -> None:


class ProgressBarsNotebookLab(ProgressBars):
def __init__(self, maxs: List[int], show: bool) -> None:
def __init__(self, maxs: List[int], show: bool, count=-1) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typehint

@@ -239,7 +240,7 @@ def closure(

show_progress_bars = progress_bars_type != ProgressBarsType.No

progress_bars = get_progress_bars(progresses_length, show_progress_bars)
progress_bars = get_progress_bars(progresses_length, show_progress_bars, max_progress_bars)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check line length -- I think this repo (generally) follows the PEP8 rule of max 79 character lines. @nalepae do we have an auto-formatter that we use?

@nalepae
Copy link
Owner

nalepae commented Jun 29, 2023

Thank you for your well-written PR!

However, I do not get exactly the point of having a number of progress bars that is neither 1 nor the number of workers.

I do understand that having like 128 progress bars is overwhelming.
I do understand, in such a case, we simply would like to have only 1, global, progress bar.

However, I don't really see the added value to have something in between.

Can you please explain the rationale a little bit more?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants