-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stdlib: Add multiprocessing simulation class #942
Conversation
Change-Id: I98b03c52eb769731a04d3d2aab8a79eb8ac91d15
Change-Id: Ia05bd6422b52f9d319448ccefe1639613c33120f
I am not to sure about what name the class should be. I have tentatively gone with MultiSim. I will also push a test that runs the example script. |
Change-Id: Ie223805cb560d34f498092f359fabb300b5bb30d
Change-Id: Ic08824fc2eca66dc3d8f30e90b740c96a2dad040
Change-Id: I5c902c924025b82c37138437df0cf43f202ee0d8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the big gap in your knowledge if you haven't properly dived into the Multiprocessing
module beyond the Process
class. I'd suggest reaching through a tutorial on some of the more "advanced" features. multiprocessing.pool
is very powerful and solves a lot of the issues you tried to fix here (see my in-line comments).
processes = [] | ||
|
||
for sim_callable in self.sim_callables: | ||
if len(processes) > num_cpus: | ||
for process in processes: | ||
if not process.is_alive(): | ||
processes.remove(process) | ||
os.kill(process.pid, signal.SIGTERM) | ||
sleep(1) | ||
process_name = ( | ||
f"{sim_callable.func.__name__}_{sim_callable.args[0].get_id()}" | ||
) | ||
process = Process( | ||
target=run_simulator, | ||
args=(sim_callable,), | ||
name=process_name, | ||
) | ||
print(f"Starting process {process_name}") | ||
|
||
process.start() | ||
processes.append(process) | ||
|
||
while processes: | ||
for process in processes: | ||
if not process.is_alive(): | ||
processes.remove(process) | ||
sleep(1) | ||
|
||
print("All simulations have finished") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You really should've read up on multiprocess pools
before this. This functionality you provide already exists to handle the scheduling of new processes. The following should be roughly what you need to do:
# Specify the number of processes the pool is allowed to use.
# Note: If `multiprocessing.Pool()` then all available processes are used.
pool = multiprocessing.Pool(processes=num_proc)
# Create a process for each element in `self._sim_callables` passed to function `_run_simulator`. I.e.:
# Process 1 : _run_simulator(self.sim_callables[0])
# Process 2 : _run_simulator(self.sim_callables[1])
# ... and so on.
pool.map(_run_simulator, self.sim_callables)
# The `pool.map` function will only return when all the processes have finished executing.
def _run_simulator(sim_callable: Callable[[], Simulator]) -> None:
sim_callable().run()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I had a good reason not to use multiprocessing.Pool
. It does work, but it's going to be limiting our features.
The reason is that I am looking forward to a new feature that wouldn't be possible with Pool
. I want to be able to track the processes ourselves. Specifically, I want to send a special signal periodically and ask the running simulation to tell me what it's current status is. This is related to the better support for exit events.
I can provide more detail later, but I have a plan!
all_simulations_completed_verifier = verifier.MatchRegex( | ||
re.compile(r"All simulations have finished") | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two comments on this test and the wider change:
- I'm happy with "exit code == 0" for this.
- I don't think the
MultiSim
object should be printing this message. I'd assume if therun
(orrun_all
) function returns without complaint then all the simulations have finished.
if config.bin_path: | ||
resource_path = config.bin_path | ||
else: | ||
resource_path = joinpath(absdirpath(__file__), "..", "resources") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've copy and pasted this from other test scripts but the resource_path
is never used anywhere here.
"multisim", | ||
"multi-sim-example.py", | ||
), | ||
config_args=[], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought: Why don't you have the single argument of this config to be the number of threads? It doesn't make much sense for it to be hard-coded in the script.
I created a little toy example for an approach I think is better. Despite being a simple bare-bones proof-of-concept I hope it's clear how it'd all fit together if built upon: BobbyRBruce@599e76f. You can run the example with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking good. A few comments below.
args=(sim_callable,), | ||
name=process_name, | ||
) | ||
print(f"Starting process {process_name}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't have any print
statements in this code. There's a good argument to need logging but for that, we should use something better. E.g., info()
if we can or logging
in python. That said, we need to tread lightly here with changes. My only suggestion is to remove the print statement or change it to an info
processes = [] | ||
|
||
for sim_callable in self.sim_callables: | ||
if len(processes) > num_cpus: | ||
for process in processes: | ||
if not process.is_alive(): | ||
processes.remove(process) | ||
os.kill(process.pid, signal.SIGTERM) | ||
sleep(1) | ||
process_name = ( | ||
f"{sim_callable.func.__name__}_{sim_callable.args[0].get_id()}" | ||
) | ||
process = Process( | ||
target=run_simulator, | ||
args=(sim_callable,), | ||
name=process_name, | ||
) | ||
print(f"Starting process {process_name}") | ||
|
||
process.start() | ||
processes.append(process) | ||
|
||
while processes: | ||
for process in processes: | ||
if not process.is_alive(): | ||
processes.remove(process) | ||
sleep(1) | ||
|
||
print("All simulations have finished") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I had a good reason not to use multiprocessing.Pool
. It does work, but it's going to be limiting our features.
The reason is that I am looking forward to a new feature that wouldn't be possible with Pool
. I want to be able to track the processes ourselves. Specifically, I want to send a special signal periodically and ask the running simulation to tell me what it's current status is. This is related to the better support for exit events.
I can provide more detail later, but I have a plan!
from gem5.isas import ISA | ||
from gem5.prebuilt.riscvmatched.riscvmatched_board import RISCVMatchedBoard | ||
from gem5.simulate.simulator import Simulator | ||
from gem5.utils.requires import requires | ||
|
||
|
||
def run_riscvmathed_workload(resource) -> Simulator: | ||
requires(isa_required=ISA.RISCV) | ||
|
||
# instantiate the riscv matched board with default parameters | ||
board = RISCVMatchedBoard() | ||
|
||
board.set_workload(resource) | ||
|
||
# run the simulation with the RISCV Matched board | ||
simulator = Simulator(board=board, full_system=False) | ||
return simulator | ||
|
||
|
||
def run_riscvmatched_worklaod_diff_clocks(resource) -> Simulator: | ||
requires(isa_required=ISA.RISCV) | ||
|
||
# instantiate the riscv matched board with default parameters | ||
board = RISCVMatchedBoard( | ||
clk_freq="2.2GHz", | ||
) | ||
|
||
board.set_workload(resource) | ||
|
||
# run the simulation with the RISCV Matched board | ||
simulator = Simulator(board=board, full_system=False) | ||
return simulator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this works +1.
to_return.append(lambda: Simulator(board))
may work as well if you don't want to use partial
. I can't decide which I like more.
workload_1 = obtain_resource("riscv-gapbs-tc-run") | ||
workload_2 = obtain_resource("riscv-npb-is-size-s-run") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the example suite?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot currently as suites dont work with multiprocessing. After the refactor obtain_resource PR is merged, we can change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Bobby is right here. Can you test if you can put this in a single file?
I think the only thing that needs to be in another file is what's in multi_sim.py
Change-Id: I727fd8518b2adfc99e66f208c465a9b587bb6032
This is replaced by the work done in #1167 |
This PR is my first stab at adding a mutli processing module to stdlib.
Change-Id: I98b03c52eb769731a04d3d2aab8a79eb8ac91d15