FIX: Restore generate_gantt_chart functionality #3290

shnizzedy · 2021-01-08T20:45:03Z

Summary

Fixes #2982. Maybe fixes #3527.

All tests pass locally. 8/13 jobs pass on Travis. The Travis failures seem unrelated to the changes in this PR.

List of changes proposed in this PR (pull-request)

Updates

nipype/nipype/utils/draw_gantt_chart.py

Line 391 in fa65caf

def generate_gantt_chart(

to handle changes to profiler

filter log file to include only logged nodes with timing information
convert datetime strings to datetime objects before doing datetime math
for nodes with timing information but no name, use id or an empty string for name instead of crashing
warn if a node is suspected of being included twice instead of raising an exception

skip nodes with non-timestamp start and finish values like "N/A" and "Unknown"

nipype/nipype/utils/draw_gantt_chart.py

Lines 144 to 147 in a08ee57

    
    try:
        all_res += float(event[resource])
    except ValueError:
        next

nipype/nipype/utils/draw_gantt_chart.py

Lines 151 to 154 in a08ee57

    
    try:
        all_res -= float(event[resource])
    except ValueError:
        next
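The log-handling changes listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `draw_gantt_chart.py`: the helper names `parse_timestamp` and `load_timed_events` are hypothetical, and the assumption that each log line is a JSON object with optional `name`, `id`, `start`, and `finish` keys is taken from the description in this PR.

```python
import json
from datetime import datetime


def parse_timestamp(value):
    """Return a datetime for an ISO-like string, or None if it is not a timestamp."""
    try:
        return datetime.fromisoformat(value)
    except (TypeError, ValueError):
        # Covers missing values (None) and placeholders such as "N/A" or "Unknown".
        return None


def load_timed_events(lines):
    """Keep only log entries that carry a parseable start or finish time."""
    events = []
    for line in lines:
        try:
            node = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a node record at all
        if not isinstance(node, dict):
            continue
        # Convert timing strings to datetime objects before any datetime math.
        for key in ("start", "finish"):
            stamp = parse_timestamp(node.get(key))
            if stamp is not None:
                node[key] = stamp
        if not any(isinstance(node.get(key), datetime) for key in ("start", "finish")):
            continue  # no usable timing information
        # Fall back on "id", then on an empty string, when "name" is absent.
        node["name"] = node.get("name") or node.get("id") or ""
        events.append(node)
    return events
```

Filtering before conversion-dependent math keeps a single malformed entry from aborting the whole chart.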

Adds a test

nipype/nipype/pipeline/plugins/tests/test_callback.py

Lines 66 to 98 in fa65caf

    
    @pytest.mark.parametrize("plugin", ["Linear", "MultiProc", "LegacyMultiProc"])
    def test_callback_gantt(tmpdir, plugin):
        import logging
        import logging.handlers

        from os import path

        from nipype.utils.profiler import log_nodes_cb
        from nipype.utils.draw_gantt_chart import generate_gantt_chart

        log_filename = path.join(tmpdir, "callback.log")
        logger = logging.getLogger("callback")
        logger.setLevel(logging.DEBUG)
        handler = logging.FileHandler(log_filename)
        logger.addHandler(handler)

        # create workflow
        wf = pe.Workflow(name="test", base_dir=tmpdir.strpath)
        f_node = pe.Node(
            niu.Function(function=func, input_names=[], output_names=[]), name="f_node"
        )
        wf.add_nodes([f_node])
        wf.config["execution"] = {"crashdump_dir": wf.base_dir, "poll_sleep_duration": 2}

        plugin_args = {"status_callback": log_nodes_cb}
        if plugin != "Linear":
            plugin_args["n_procs"] = 8
        wf.run(plugin=plugin, plugin_args=plugin_args)

        generate_gantt_chart(
            path.join(tmpdir, "callback.log"), 1 if plugin == "Linear" else 8
        )
        assert path.exists(path.join(tmpdir, "callback.log.html"))

to make sure the Gantt chart HTML page generates without error

Adds myself as a contributor to .zenodo.json

Acknowledgment

(Mandatory) I acknowledge that this contribution will be available under the Apache 2 license.

shnizzedy · 2021-01-08T20:56:20Z

As noted

[T]here is an issue with the number of threads being estimated by the callback, or the gantt chart creation script is pulling in the wrong numbers. Some of the nodes are reporting using 210 threads!

Originally posted by @ccraddock in FCP-INDI/C-PAC#1404 (comment)

I thought maybe runtime_threads was counting something different than I expected.

I see the profiler uses cpu_percent for runtime_threads, which returns a percentage of one CPU, so I think something like math.ceil(cpu_percent / 100) would be an estimate of the number of threads, but there's some disconnected code that looks like it collects the actual number of threads used (as opposed to a percentage of 1 CPU).

Originally posted by @shnizzedy in FCP-INDI/C-PAC#1404 (comment)

I think estimating the number of threads (by dividing cpu_percent by 100 and rounding up) is good enough for what I'm trying to do.

Originally posted by @shnizzedy in FCP-INDI/C-PAC#1404 (comment)

I think the issues of

what runtime_threads is logging and
whether the number of threads used by a node is recorded

are related to this PR and issue, but beyond the scope of these changes. C-PAC has its own callback function in which I'm dividing and rounding, so I made no changes regarding runtime_threads in Nipype.

codecov · 2021-01-08T21:09:58Z

Codecov Report

Merging #3290 (933fad3) into master (47fe00b) will increase coverage by 3.87%.
The diff coverage is 68.42%.

@@            Coverage Diff             @@
##           master    #3290      +/-   ##
==========================================
+ Coverage   64.70%   68.57%   +3.87%     
==========================================
  Files         302      302              
  Lines       39869    48743    +8874     
  Branches     5288     7226    +1938     
==========================================
+ Hits        25796    33425    +7629     
- Misses      12984    14091    +1107     
- Partials     1089     1227     +138

Flag	Coverage Δ
unittests	`65.01% <64.70%> (+0.31%)`	⬆️

Impacted Files	Coverage Δ
nipype/utils/draw_gantt_chart.py	`93.33% <68.42%> (+83.26%)`	⬆️
nipype/utils/logger.py	`81.60% <0.00%> (-3.01%)`	⬇️
nipype/utils/onetime.py	`81.81% <0.00%> (-2.80%)`	⬇️
nipype/interfaces/niftyseg/label_fusion.py	`55.42% <0.00%> (-1.72%)`	⬇️
nipype/interfaces/diffusion_toolkit/dti.py	`61.64% <0.00%> (-1.57%)`	⬇️
nipype/pipeline/plugins/base.py	`57.89% <0.00%> (-0.19%)`	⬇️
nipype/algorithms/icc.py	`57.53% <0.00%> (ø)`
nipype/utils/docparse.py	`52.21% <0.00%> (ø)`
nipype/interfaces/fsl/utils.py	`63.76% <0.00%> (ø)`
nipype/interfaces/afni/__init__.py	`100.00% <0.00%> (ø)`
... and 119 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 47fe00b...933fad3. Read the comment docs.

effigies

Looks reasonable, though I don't have any experience with this bit of the code. Inclined to merge tomorrow unless someone complains.

shnizzedy · 2021-04-01T14:06:36Z

My only hesitance is the potentially misleading runtime_threads ― maybe that should be fixed before restoring this functionality?

mgxd

Looks good, just some minor nits.

My only hesitance is the potentially misleading runtime_threads ― maybe that should be fixed before restoring this functionality?

I agree 👍

mgxd · 2021-04-01T14:33:06Z

nipype/utils/draw_gantt_chart.py

+                try:
+                    all_res += float(event[resource])
+                except ValueError:
+                    next


Suggested change

next

pass

whoops, good catch! I actually meant

nipype/nipype/utils/draw_gantt_chart.py

Line 147 in 25dd1fc

continue

25dd1fc
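For context on the exchange above: a bare `next` in Python only evaluates the name of the builtin function and discards it, so it does nothing, much like `pass`, whereas `continue` is the statement that skips to the next loop iteration. A minimal sketch of the difference, using a hypothetical helper `sum_resource` (not part of nipype) with an extra counter so that the skip is observable:

```python
def sum_resource(events, resource):
    """Sum a numeric field, skipping placeholder values such as "N/A"."""
    total = 0.0
    counted = 0
    for event in events:
        try:
            total += float(event[resource])
        except ValueError:
            # `continue` jumps to the next event; a bare `next` or `pass`
            # here would fall through to the counter below instead.
            continue
        counted += 1  # only reached when the conversion succeeded
    return total, counted
```

When nothing follows the `try` block inside the loop body, all three spellings behave the same, so the choice mostly affects how clearly the intent reads.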

mgxd · 2021-04-01T14:40:12Z

nipype/utils/draw_gantt_chart.py

+                try:
+                    all_res -= float(event[resource])
+                except ValueError:
+                    next


Suggested change

next

pass

whoops, good catch! I actually meant

nipype/nipype/utils/draw_gantt_chart.py

Line 154 in 25dd1fc

continue

25dd1fc

effigies · 2021-04-30T21:44:44Z

My only hesitance is the potentially misleading runtime_threads ― maybe that should be fixed before restoring this functionality?

I agree

Was this fixed? What needs doing?

shnizzedy · 2021-05-03T13:38:11Z

Was this fixed? What needs doing?

I haven't fixed it (yet at least). The issue is that the chart uses runtime_threads from the callback log as a count of threads observed being used at runtime, but the value actually stored there is cpu_percent,

nipype/nipype/utils/profiler.py

Line 143 in e9217c2

"runtime_threads": getattr(node.result.runtime, "cpu_percent", "N/A"),

a float representing the current process CPU utilization as a percentage

This leads to reported thread counts in the hundreds when they're expected to be in the single digits.

So I think the "threads" part of these charts should be changed before the chart functionality is restored, either by

updating the log to include an integer count of threads and using this value in the chart,
changing the column from threads to CPU percentage, or
something else?

effigies · 2021-05-06T00:34:15Z

Yeah, seems like we want something like:

if status_dict['runtime_threads'] != "N/A":
    status_dict['runtime_threads'] //= 100

shnizzedy · 2021-05-06T13:33:31Z

An existing unit test does

nipype/nipype/interfaces/base/tests/test_resource_monitor.py

Lines 76 to 78 in 6c06030

    
    assert (
        int(result.runtime.cpu_percent / 100 + 0.2) == n_procs
    ), "wrong number of threads estimated"

which is similar to what we're doing for now in C-PAC:

if runtime_threads != 'N/A':
    runtime_threads = math.ceil(runtime_threads/100)

My concern is that, as I read

Note: the returned value can be > 100.0 in case of a process running multiple threads on different CPU cores.
Note: the returned value is explicitly not split evenly between all available CPUs (differently from psutil.cpu_percent()). This means that a busy loop process running on a system with 2 logical CPUs will be reported as having 100% CPU utilization instead of 50%. This was done in order to be consistent with top UNIX utility and also to make it easier to identify processes hogging CPU resources independently from the number of CPUs. It must be noted that taskmgr.exe on Windows does not behave like this (it would report 50% usage instead). To emulate Windows taskmgr.exe behavior you can do: p.cpu_percent() / psutil.cpu_count().

― psutil documentation: Process.cpu_percent

this number can be a misleading estimate. For example, if a process is using 25% of each of 4 CPUs, I believe this would report 100%, which would reduce to 1 or 2 threads depending on whether and how we round up. I'd be happy to learn that either I'm misunderstanding the number or that the number is good enough.
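The three conversions mentioned in this thread (floor division, rounding up, and the `+ 0.2` offset used in the existing unit test) can be compared side by side. This is only an illustrative sketch: `estimate_threads` and its `method` argument are hypothetical names, not nipype or C-PAC API.

```python
import math


def estimate_threads(cpu_percent, method="ceil"):
    """Turn a psutil-style CPU percentage into a rough thread count."""
    if cpu_percent == "N/A":
        return "N/A"  # nothing was measured, so there is nothing to estimate
    if method == "floor":
        return int(cpu_percent // 100)  # status_dict[...] //= 100
    if method == "ceil":
        return math.ceil(cpu_percent / 100)  # C-PAC's current approach
    if method == "offset":
        return int(cpu_percent / 100 + 0.2)  # as in test_resource_monitor.py
    raise ValueError(f"unknown method: {method}")


for percent in (100.0, 101.0, 190.0):
    print(percent, [estimate_threads(percent, m) for m in ("floor", "ceil", "offset")])
```

If a process spread across 4 cores at 25% each really is reported as roughly 100, every one of these conversions yields 1 or 2, so none of them recovers the 4 threads involved; they differ only in how they treat values between whole multiples of 100.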

shnizzedy added 7 commits January 6, 2021 09:54

🐛 Convert timing values to datetimes from strings

f30c7e7

* exclude nodes without timing information from Gantt chart
* fall back on "id" or empty string if no "name" in node

🥅 Reduce double logging from exception to warning

00e80b0

✅ Add test for draw_gantt_chart

e0be087

🚨 Automatic linting by pre-commit

e049019

👥 Add Jon Clucas to Zenodo JSON

f982076

✅ Use tmpdir for Gantt test

fa65caf

🚸 Don't restrict nan timestamps to predetermined options

a08ee57

shnizzedy mentioned this pull request Jan 8, 2021

generate_gantt_chart fails on logfile #2982

Open

shnizzedy mentioned this pull request Jan 29, 2021

⚡️ Update memory and threading estimates FCP-INDI/C-PAC#1428

Merged

8 tasks

shnizzedy mentioned this pull request Mar 4, 2021

🔇 Comment out runtime_threads ⩼ threads FCP-INDI/C-PAC#1457

Merged

8 tasks

effigies approved these changes Mar 31, 2021

mgxd reviewed Apr 1, 2021

shnizzedy and others added 3 commits April 1, 2021 16:03

💬 Simplify warning

e4762f9

Co-authored-by: Mathias Goncalves <goncalves.mathias@gmail.com>

🚨 Remove unnecessary import

3a5d604

Co-authored-by: Mathias Goncalves <goncalves.mathias@gmail.com>

🎨 next ≠ continue

25dd1fc

Ref nipy#3290 (comment), nipy#3290 (comment) Co-authored-by: Mathias Goncalves <goncalves.mathias@gmail.com>

effigies reviewed Apr 5, 2021

shnizzedy and others added 4 commits April 5, 2021 09:37

💚 Skip test that requires pandas if pandas not installed

22851bf

Co-authored-by: Chris Markiewicz <effigies@gmail.com>

TEST: Add pandas import check

073f9fd

STY: black

9621af5

STY/TEST: black and skipif syntax

933fad3

shnizzedy mentioned this pull request Jan 31, 2023

BUG: Reading serialized event requires conversion of dates #3528

Merged
