stdlib: Add multiprocessing simulation class #942

Harshil2107 · 2024-03-18T03:10:49Z

This PR is my first stab at adding a multiprocessing module to stdlib.

Some extra files are included for testing; they will be removed when I undraft the PR.

Change-Id: I98b03c52eb769731a04d3d2aab8a79eb8ac91d15

Change-Id: Ia05bd6422b52f9d319448ccefe1639613c33120f

Harshil2107 · 2024-03-19T17:32:55Z

I am not too sure what the class should be named. I have tentatively gone with MultiSim. I will also push a test that runs the example script.

Change-Id: Ie223805cb560d34f498092f359fabb300b5bb30d

Change-Id: Ic08824fc2eca66dc3d8f30e90b740c96a2dad040

Change-Id: I5c902c924025b82c37138437df0cf43f202ee0d8

BobbyRBruce

I think the big gap in your knowledge is that you haven't properly dived into the multiprocessing module beyond the Process class. I'd suggest reading through a tutorial on some of the more "advanced" features. multiprocessing.pool is very powerful and solves a lot of the issues you tried to fix here (see my in-line comments).

tests/gem5/multiprocessing_tests/README.md

src/python/gem5/simulate/multi_sim.py

BobbyRBruce · 2024-03-22T13:33:20Z

src/python/gem5/simulate/multi_sim.py

+        processes = []
+
+        for sim_callable in self.sim_callables:
+            if len(processes) > num_cpus:
+                for process in processes:
+                    if not process.is_alive():
+                        processes.remove(process)
+                    os.kill(process.pid, signal.SIGTERM)
+                sleep(1)
+            process_name = (
+                f"{sim_callable.func.__name__}_{sim_callable.args[0].get_id()}"
+            )
+            process = Process(
+                target=run_simulator,
+                args=(sim_callable,),
+                name=process_name,
+            )
+            print(f"Starting process {process_name}")
+
+            process.start()
+            processes.append(process)
+
+        while processes:
+            for process in processes:
+                if not process.is_alive():
+                    processes.remove(process)
+            sleep(1)
+
+        print("All simulations have finished")
+


You really should've read up on multiprocessing pools before this. The functionality you provide here, scheduling new processes as slots free up, already exists. The following should be roughly what you need to do:

    # Specify the number of processes the pool is allowed to use.
    # Note: If `multiprocessing.Pool()` then all available processes are used.
    pool = multiprocessing.Pool(processes=num_proc)

    # Create a process for each element in `self._sim_callables` passed to function `_run_simulator`. I.e.:
    # Process 1 : _run_simulator(self.sim_callables[0])
    # Process 2 : _run_simulator(self.sim_callables[1])
    # ... and so on.
    pool.map(_run_simulator, self.sim_callables)
    # The `pool.map` function will only return when all the processes have finished executing.

    def _run_simulator(sim_callable: Callable[[], Simulator]) -> None:
        sim_callable().run()

src/python/gem5/simulate/multi_sim.py

BobbyRBruce · 2024-03-22T13:38:44Z

tests/gem5/multiprocessing_tests/test-multiprocesing.py

+all_simulations_completed_verifier = verifier.MatchRegex(
+    re.compile(r"All simulations have finished")
+)


Two comments on this test and the wider change:

1. I'm happy with "exit code == 0" for this.

2. I don't think the MultiSim object should be printing this message. I'd assume if the run (or run_all) function returns without complaint then all the simulations have finished.

BobbyRBruce · 2024-03-22T13:44:40Z

tests/gem5/multiprocessing_tests/test-multiprocesing.py

+if config.bin_path:
+    resource_path = config.bin_path
+else:
+    resource_path = joinpath(absdirpath(__file__), "..", "resources")


You've copied and pasted this from other test scripts, but resource_path is never used anywhere here.

BobbyRBruce · 2024-03-22T13:45:22Z

tests/gem5/multiprocessing_tests/test-multiprocesing.py

+        "multisim",
+        "multi-sim-example.py",
+    ),
+    config_args=[],


Thought: Why not make the number of threads the single argument to this config? It doesn't make much sense for it to be hard-coded in the script.
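One way the config script could accept that argument is sketched below with `argparse`. The argument name `num_processes` and the fallback to the host's CPU count are assumptions for illustration, not something the PR defines.

```python
import argparse
import os
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Run several simulations in parallel (illustrative)."
    )
    parser.add_argument(
        "num_processes",  # hypothetical argument name
        type=int,
        nargs="?",
        default=os.cpu_count() or 1,
        help="Maximum number of simulations to run at the same time.",
    )
    args = parser.parse_args(argv)
    if args.num_processes < 1:
        parser.error("num_processes must be at least 1")
    return args
```

Under this assumption the test could pass the value through `config_args`, for example `config_args=["2"]`, instead of leaving the list empty.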

BobbyRBruce · 2024-03-24T04:05:16Z

I created a little toy example for an approach I think is better. Despite being a simple bare-bones proof-of-concept I hope it's clear how it'd all fit together if built upon: BobbyRBruce@599e76f.

You can run the example with ./build/ALL/gem5.opt multisim/config.py.

powerjg

It's looking good. A few comments below.

src/python/gem5/simulate/multi_sim.py

powerjg · 2024-03-29T00:16:52Z

src/python/gem5/simulate/multi_sim.py

+                args=(sim_callable,),
+                name=process_name,
+            )
+            print(f"Starting process {process_name}")


We shouldn't have any print statements in this code. There's a good argument that we need logging, but for that we should use something better than print, e.g., info() if we can, or Python's logging module. That said, we need to tread lightly with changes here. My only suggestion is to remove the print statement or change it to an info.
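As a rough illustration of the second option, the print could be swapped for Python's standard `logging` module. The logger name `multisim` and the helper `announce_start` are hypothetical; whether gem5's own `info()` can be used from this code is left open, as the comment above says.

```python
import logging

# Hypothetical logger name; a real module would more likely use __name__.
logger = logging.getLogger("multisim")


def announce_start(process_name: str) -> None:
    # Emitted only when the INFO level is enabled, so library users
    # can silence it without editing this code.
    logger.info("Starting process %s", process_name)
```

A script that wants to see the message would call `logging.basicConfig(level=logging.INFO)` once at startup; without that, the message is dropped at the default WARNING level.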

src/python/gem5/simulate/multi_sim.py

powerjg · 2024-03-29T00:23:04Z

src/python/gem5/simulate/multi_sim.py

+        processes = []
+
+        for sim_callable in self.sim_callables:
+            if len(processes) > num_cpus:
+                for process in processes:
+                    if not process.is_alive():
+                        processes.remove(process)
+                    os.kill(process.pid, signal.SIGTERM)
+                sleep(1)
+            process_name = (
+                f"{sim_callable.func.__name__}_{sim_callable.args[0].get_id()}"
+            )
+            process = Process(
+                target=run_simulator,
+                args=(sim_callable,),
+                name=process_name,
+            )
+            print(f"Starting process {process_name}")
+
+            process.start()
+            processes.append(process)
+
+        while processes:
+            for process in processes:
+                if not process.is_alive():
+                    processes.remove(process)
+            sleep(1)
+
+        print("All simulations have finished")
+


Yeah, I had a good reason not to use multiprocessing.Pool. It does work, but it would limit the features we can build.

The reason is that I am looking forward to a new feature that wouldn't be possible with Pool. I want to be able to track the processes ourselves. Specifically, I want to periodically send a special signal and ask the running simulation to report its current status. This is related to the better support for exit events.

I can provide more detail later, but I have a plan!
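If the processes are tracked by hand rather than through a pool, the scheduling loop might look roughly like the sketch below. This is an illustration under assumptions, not the PR's code: `reap` and `run_all` are hypothetical names, and the status query is only marked by a comment because its mechanism is not described here. Unlike the quoted hunk, the sketch builds new lists instead of removing items from a list while iterating over it, and it never sends SIGTERM to the children.

```python
import multiprocessing
import time
from typing import Callable, List, Sequence, Tuple


def reap(running: list) -> Tuple[list, list]:
    """Split process-like objects into (still alive, finished)."""
    alive, done = [], []
    for proc in running:
        # Query each process exactly once so it lands in one list only.
        (alive if proc.is_alive() else done).append(proc)
    return alive, done


def run_all(
    jobs: Sequence[Callable[[], None]],
    max_procs: int,
    poll_interval: float = 0.1,
) -> List[int]:
    """Run every job in its own process, at most `max_procs` at a time.

    Returns the exit codes in completion order, not submission order.
    """
    if max_procs < 1:
        raise ValueError("max_procs must be at least 1")
    pending = list(jobs)
    running: list = []
    exit_codes: List[int] = []
    while pending or running:
        running, done = reap(running)
        for proc in done:
            proc.join()
            exit_codes.append(proc.exitcode)
        # Fill any free slots with pending jobs.
        while pending and len(running) < max_procs:
            proc = multiprocessing.Process(target=pending.pop(0))
            proc.start()
            running.append(proc)
        if running:
            # A periodic status query to each live process could go here.
            time.sleep(poll_interval)
    return exit_codes
```

Because the loop owns the `Process` objects, it can inspect or signal any of them between polls, which is the kind of control the comment above asks for.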

src/python/gem5/simulate/multi_sim.py

configs/example/gem5_library/multisim/example_callable.py

powerjg · 2024-03-29T00:34:58Z

configs/example/gem5_library/multisim/example_callable.py

+from gem5.isas import ISA
+from gem5.prebuilt.riscvmatched.riscvmatched_board import RISCVMatchedBoard
+from gem5.simulate.simulator import Simulator
+from gem5.utils.requires import requires
+
+
+def run_riscvmathed_workload(resource) -> Simulator:
+    requires(isa_required=ISA.RISCV)
+
+    # instantiate the riscv matched board with default parameters
+    board = RISCVMatchedBoard()
+
+    board.set_workload(resource)
+
+    # run the simulation with the RISCV Matched board
+    simulator = Simulator(board=board, full_system=False)
+    return simulator
+
+
+def run_riscvmatched_worklaod_diff_clocks(resource) -> Simulator:
+    requires(isa_required=ISA.RISCV)
+
+    # instantiate the riscv matched board with default parameters
+    board = RISCVMatchedBoard(
+        clk_freq="2.2GHz",
+    )
+
+    board.set_workload(resource)
+
+    # run the simulation with the RISCV Matched board
+    simulator = Simulator(board=board, full_system=False)
+    return simulator


If this works +1.

to_return.append(lambda: Simulator(board)) may work as well if you don't want to use partial. I can't decide which I like more.
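One practical difference between the two is pickling. `Pool.map` pickles every item it sends to a worker, and a `Process` started with the spawn start method pickles its target, so a lambda may fail where a `partial` succeeds. The sketch below checks this directly; `SimpleNamespace` is only a stand-in for `Simulator`, and `is_picklable` is a hypothetical helper.

```python
import pickle
from functools import partial
from types import SimpleNamespace


def is_picklable(obj: object) -> bool:
    """Return True if `obj` survives pickle.dumps."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# SimpleNamespace stands in for Simulator; the board is just a string here.
with_partial = partial(SimpleNamespace, board="riscv-matched")
with_lambda = lambda: SimpleNamespace(board="riscv-matched")

# Both build the same kind of object when called in-process.
assert with_partial().board == with_lambda().board
```

With the fork start method a lambda target can still work, because the child inherits it without pickling, so which form is safe depends on how the processes are launched.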

powerjg · 2024-03-29T00:36:52Z

configs/example/gem5_library/multisim/multi-sim-example.py

+workload_1 = obtain_resource("riscv-gapbs-tc-run")
+workload_2 = obtain_resource("riscv-npb-is-size-s-run")


Can you use the example suite?

Harshil2107 · Apr 15, 2024

I can't at the moment, as suites don't work with multiprocessing. Once the obtain_resource refactor PR is merged, we can change this.

powerjg · 2024-03-29T00:38:06Z

configs/example/gem5_library/multisim/example_callable.py

I think Bobby is right here. Can you test if you can put this in a single file?

I think the only thing that needs to be in another file is what's in multi_sim.py

Change-Id: I727fd8518b2adfc99e66f208c465a9b587bb6032

BobbyRBruce · 2024-05-25T21:34:22Z

This is replaced by the work done in #1167

stdlib: Add multiprocessing simulation class

05c46c3

Change-Id: I98b03c52eb769731a04d3d2aab8a79eb8ac91d15

ivanaamit added the stdlib label (The gem5 standard library. Code typically found under "src/python/gem5") Mar 18, 2024

stdlib: Add multisim support and example

7632aeb

Change-Id: Ia05bd6422b52f9d319448ccefe1639613c33120f

Harshil2107 marked this pull request as ready for review March 19, 2024 17:31

Harshil2107 added 3 commits March 19, 2024 11:04

misc: revert riscvmatched-hello

3b4520c

Change-Id: Ie223805cb560d34f498092f359fabb300b5bb30d

tests: Added test for multiprocessing

5da266f

Change-Id: Ic08824fc2eca66dc3d8f30e90b740c96a2dad040

stdlib: Added function to allow crossproduct of config and resources

dab82c9

Change-Id: I5c902c924025b82c37138437df0cf43f202ee0d8

BobbyRBruce requested changes Mar 22, 2024


BobbyRBruce reviewed Mar 22, 2024


powerjg reviewed Mar 29, 2024


powerjg added this to the v24.0 milestone Apr 11, 2024

stdlib: Update logic for processes and examples

9077982

Change-Id: I727fd8518b2adfc99e66f208c465a9b587bb6032

BobbyRBruce closed this May 25, 2024
