Describe the bug
When a partition is specified in the use_slurm_host function and that partition contains more than one board, the set_partition function (line 72 of lava/utils/slurm.py) raises a ValueError saying the partition is not found or is down if the first board(s) returned by sinfo have status down.
To reproduce current behavior
After applying my own fix for the bug in #753, calling loihi.use_slurm_host with a partition specified raises the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[3], line 3
1 from lava.utils import loihi
----> 3 loihi.use_slurm_host(partition='oheogluch', loihi_gen=loihi.ChipGeneration.N3B3)
4 use_loihi2 = loihi.is_installed()
6 # if use_loihi2:
File ~/lava_env/lib/python3.8/site-packages/lava/utils/loihi.py:57, in use_slurm_host(partition, board, loihi_gen)
54 os.environ["LOIHI_GEN"] = loihi_gen.value
56 slurm.set_board(board, partition)
---> 57 slurm.set_partition(partition)
59 global host
60 host = "SLURM"
File ~/lava_env/lib/python3.8/site-packages/lava/utils/slurm.py:89, in set_partition(partition)
87 print(partition_info)
88 if partition_info is None or "down" in partition_info.state:
---> 89 raise ValueError(
90 f"Attempting to use SLURM for Loihi but partition {partition} "
91 f"is not found or is down. Run sinfo to check available "
92 f"partitions.")
94 os.environ["PARTITION"] = partition
ValueError: Attempting to use SLURM for Loihi but partition oheogluch is not found or is down. Run sinfo to check available partitions.
Expected behavior
The expected behaviour is to update the os.environ['PARTITION'] variable to reflect the selected partition.
Environment (please complete the following information):
Device: Intel cloud
OS: Linux
Lava version 0.8.0
Additional Context
I temporarily fixed this by changing line 88 of lava/utils/slurm.py to ignore the "down" partition state.
When I run sinfo, this seems to occur when the first listed board for the partition has status "down" even though other boards have status idle.
Possibly symmetric problem in setting boards?
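A possible direction for a proper fix, instead of ignoring the "down" state entirely, is to treat the partition as usable when at least one of its boards is not down. The sketch below is only an illustration of that idea and is not the code in lava/utils/slurm.py: `SinfoRow`, `usable_rows` and `check_partition` are hypothetical names, and the rows are assumed to have already been parsed from sinfo output into partition, state and node list fields.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SinfoRow:
    """Hypothetical stand-in for one parsed row of sinfo output."""
    partition: str
    state: str
    nodelist: str


def usable_rows(rows: Iterable[SinfoRow], partition: str) -> List[SinfoRow]:
    """Return every row of `partition` whose state does not contain 'down'."""
    return [row for row in rows
            if row.partition == partition and "down" not in row.state.lower()]


def check_partition(rows: Iterable[SinfoRow], partition: str) -> str:
    """Accept the partition if any of its rows is usable, else raise."""
    if not usable_rows(rows, partition):
        raise ValueError(
            f"Attempting to use SLURM for Loihi but partition {partition} "
            f"is not found or is down. Run sinfo to check available "
            f"partitions.")
    return partition


# Example: the first board is down, the second one is idle.
# The node names are made up for illustration.
rows = [
    SinfoRow(partition="oheogluch", state="down", nodelist="board-a"),
    SinfoRow(partition="oheogluch", state="idle", nodelist="board-b"),
]
selected = check_partition(rows, "oheogluch")
```

With this check, the partition is rejected only when it is missing or when all of its boards are down. The same "any usable row" test could be applied when selecting a board, if the symmetric problem exists there.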