SpinnmanIOException: IO Error: Failed to communicate with the machine #127

Open
sauravtii opened this issue Aug 15, 2022 · 7 comments

@sauravtii

sauravtii commented Aug 15, 2022

I am trying out this example (https://github.com/NeuromorphicProcessorProject/snn_toolbox/blob/master/examples/mnist_keras_spiNNaker.py). Running the main function raises the error below. I am using the PyPI version of SNN toolbox, and my config file is included below as well.

Can anyone please help me with this?

The full output after calling the main function:

SpinnmanIOException                       Traceback (most recent call last)
Input In [5], in <cell line: 67>()
     62     config.write(configfile)
     64 # RUN SNN TOOLBOX #
     65 ###################
---> 67 main(config_filepath)

File ~/.local/lib/python3.8/site-packages/snntoolbox/bin/run.py:31, in main(filepath)
     29 if filepath is not None:
     30     config = update_setup(filepath)
---> 31     run_pipeline(config)
     32     return
     34 parser = argparse.ArgumentParser(
     35     description='Run SNN toolbox to convert an analog neural network into '
     36                 'a spiking neural network, and optionally simulate it.')

File ~/.local/lib/python3.8/site-packages/snntoolbox/bin/utils.py:145, in run_pipeline(config, queue)
    142     return snn.run(**test_set)
    144 # Simulate network
--> 145 results = run(spiking_model, **testset)
    147 # Clean up
    148 spiking_model.end_sim()

File ~/.local/lib/python3.8/site-packages/snntoolbox/bin/utils.py:220, in run_parameter_sweep.<locals>.decorator.<locals>.wrapper(snn, **testset)
    216     if len(param_values) > 1:
    217         print("\nCurrent value of parameter to sweep: " +
    218               "{} = {:.2f}\n".format(param_name, p))
--> 220     results.append(run_single(snn, **testset))
    222 # Plot and return results of parameter sweep.
    223 try:

File ~/.local/lib/python3.8/site-packages/snntoolbox/bin/utils.py:142, in run_pipeline.<locals>.run(snn, **test_set)
    140 @run_parameter_sweep(config, queue)
    141 def run(snn, **test_set):
--> 142     return snn.run(**test_set)

File ~/.local/lib/python3.8/site-packages/snntoolbox/simulation/utils.py:606, in AbstractSNN.run(self, x_test, y_test, dataflow, **kwargs)
    603 # Main step: Run the network on a batch of samples for the duration
    604 # of the simulation.
    605 print("\nStarting new simulation...\n")
--> 606 output_b_l_t = self.simulate(**data_batch_kwargs)
    608 # Halt if model is to be serialised only.
    609 if self.config.getboolean('tools', 'serialise_only'):

File ~/.local/lib/python3.8/site-packages/snntoolbox/simulation/target_simulators/spiNNaker_target_sim.py:381, in SNN.simulate(self, **kwargs)
    379 if self.config.getboolean('tools', 'serialise_only'):
    380     sys.exit('finished after serialisation')
--> 381 self.sim.run(self._duration)
    382 print("\nCollecting results...")
    383 output_b_l_t = self.get_recorded_vars(self.layers)

File ~/.local/lib/python3.8/site-packages/spynnaker8/__init__.py:703, in run(simtime, callbacks)
    701 if not globals_variables.has_simulator():
    702     raise ConfigurationException(FAILED_STATE_MSG)
--> 703 return __pynn["run"](simtime, callbacks=callbacks)

File ~/.local/lib/python3.8/site-packages/pyNN/common/control.py:111, in build_run.<locals>.run(simtime, callbacks)
     96 def run(simtime, callbacks=None):
     97     """
     98     Advance the simulation by `simtime` ms.
     99 
   (...)
    109     the initial conditions (time ``t = 0``), use the ``reset()`` function.
    110     """
--> 111     return run_until(simulator.state.t + simtime, callbacks)

File ~/.local/lib/python3.8/site-packages/pyNN/common/control.py:93, in build_run.<locals>.run_until(time_point, callbacks)
     90         callback_events.extend((callback(simulator.state.t), callback)
     91                 for callback in active_callbacks)
     92 else:
---> 93     simulator.state.run_until(time_point)
     94 return simulator.state.t

File ~/.local/lib/python3.8/site-packages/spynnaker8/spinnaker.py:119, in SpiNNaker.run_until(self, tstop)
    114 """ Run the simulation until the given simulation time.
    115 
    116 :param tstop: when to run until in milliseconds
    117 """
    118 # Build data
--> 119 self._run_wait(tstop - self.t)

File ~/.local/lib/python3.8/site-packages/spynnaker8/spinnaker.py:150, in SpiNNaker._run_wait(self, duration_ms, sync_time)
    143 def _run_wait(self, duration_ms, sync_time=0.0):
    144     """ Run the simulation for a length of simulation time.
    145 
    146     :param duration_ms: The run duration, in milliseconds
    147     :type duration_ms: int or float
    148     """
--> 150     super(SpiNNaker, self).run(duration_ms, sync_time)

File ~/.local/lib/python3.8/site-packages/spynnaker/pyNN/abstract_spinnaker_common.py:380, in AbstractSpiNNakerCommon.run(self, run_time, sync_time)
    371 if (self.config.getboolean("Reports", "reports_enabled") and
    372         self.config.getboolean(
    373             "Reports", "write_redundant_packet_count_report") and
    374         not self._use_virtual_board and run_time is not None and
    375         not self._has_ran and self._config.getboolean(
    376             "Reports", "writeProvenanceData")):
    377     self.extend_extra_post_run_algorithms(
    378         ["RedundantPacketCountReport"])
--> 380 super().run(run_time, sync_time)
    381 for projection in self._projections:
    382     projection._clear_cache()

File ~/.local/lib/python3.8/site-packages/spinn_front_end_common/interface/abstract_spinnaker_base.py:780, in AbstractSpinnakerBase.run(self, run_time, sync_time)
    778 @overrides(SimulatorInterface.run)
    779 def run(self, run_time, sync_time=0):
--> 780     self._run(run_time, sync_time)

File ~/.local/lib/python3.8/site-packages/spinn_front_end_common/interface/abstract_spinnaker_base.py:932, in AbstractSpinnakerBase._run(self, run_time, sync_time)
    929         self._max_run_time_steps = None
    931     if self._machine is None:
--> 932         self._get_machine(total_run_time, n_machine_time_steps)
    933     self._do_mapping(run_time, total_run_time)
    935 # Check if anything has per-timestep SDRAM usage

File ~/.local/lib/python3.8/site-packages/spinn_front_end_common/interface/abstract_spinnaker_base.py:1211, in AbstractSpinnakerBase._get_machine(self, total_run_time, n_machine_time_steps)
   1208 # If we are using a directly connected machine, add the details to get
   1209 # the machine and transceiver
   1210 if self._hostname is not None:
-> 1211     self._machine_by_hostname(n_machine_time_steps, total_run_time)
   1213 elif self._use_virtual_board:
   1214     self._machine_by_virtual(n_machine_time_steps, total_run_time)

File ~/.local/lib/python3.8/site-packages/spinn_front_end_common/interface/abstract_spinnaker_base.py:1267, in AbstractSpinnakerBase._machine_by_hostname(self, n_machine_time_steps, total_run_time)
   1264 outputs.append("MemoryMachine")
   1265 outputs.append("MemoryTransceiver")
-> 1267 executor = self._run_algorithms(
   1268     inputs, algorithms, outputs, [], [], "machine_generation")
   1269 self._machine = executor.get_item("MemoryMachine")
   1270 self._txrx = executor.get_item("MemoryTransceiver")

File ~/.local/lib/python3.8/site-packages/spinn_front_end_common/interface/abstract_spinnaker_base.py:1195, in AbstractSpinnakerBase._run_algorithms(self, inputs, algorithms, outputs, tokens, required_tokens, provenance_name, optional_algorithms)
   1192 except Exception as e3:
   1193     logger.warning("problem when shutting down {}".format(e3),
   1194                    exc_info=True)
-> 1195 raise e

File ~/.local/lib/python3.8/site-packages/spinn_front_end_common/interface/abstract_spinnaker_base.py:1175, in AbstractSpinnakerBase._run_algorithms(self, inputs, algorithms, outputs, tokens, required_tokens, provenance_name, optional_algorithms)
   1165 executor = PACMANAlgorithmExecutor(
   1166     algorithms=algorithms, optional_algorithms=optional,
   1167     inputs=inputs, tokens=tokens,
   (...)
   1171     provenance_name=provenance_name,
   1172     provenance_path=self._pacman_executor_provenance_path)
   1174 try:
-> 1175     executor.execute_mapping()
   1176     self._pacman_provenance.extract_provenance(executor)
   1177     return executor

File ~/.local/lib/python3.8/site-packages/pacman/executor/pacman_algorithm_executor.py:666, in PACMANAlgorithmExecutor.execute_mapping(self)
    664 if self._do_direct_injection:
    665     with injection_context(self._internal_type_mapping):
--> 666         self.__execute_mapping()
    667 else:
    668     self.__execute_mapping()

File ~/.local/lib/python3.8/site-packages/pacman/executor/pacman_algorithm_executor.py:682, in PACMANAlgorithmExecutor.__execute_mapping(self)
    679     timer.start_timing()
    681 # Execute the algorithm
--> 682 results = algorithm.call(self._internal_type_mapping)
    684 if self._provenance_path:
    685     self._report_full_provenance(algorithm, results)

File ~/.local/lib/python3.8/site-packages/pacman/executor/algorithm_classes/abstract_python_algorithm.py:77, in AbstractPythonAlgorithm.call(self, inputs)
     74 method_inputs = self._get_inputs(inputs)
     76 # Run the algorithm and get the results
---> 77 results = self.call_python(method_inputs)
     79 if results is not None and not isinstance(results, tuple):
     80     results = (results,)

File ~/.local/lib/python3.8/site-packages/pacman/executor/algorithm_classes/python_class_algorithm.py:97, in PythonClassAlgorithm.call_python(self, inputs)
     93     method = self._python_method
     94 logger.error("Error when calling {}.{}.{} with inputs {}",
     95              self._python_module, self._python_class, method,
     96              inputs.keys())
---> 97 raise e

File ~/.local/lib/python3.8/site-packages/pacman/executor/algorithm_classes/python_class_algorithm.py:89, in PythonClassAlgorithm.call_python(self, inputs)
     87     method = getattr(instance, self._python_method)
     88 try:
---> 89     return method(**inputs)
     90 except Exception as e:
     91     method = "__call__"

File ~/.local/lib/python3.8/site-packages/spinn_front_end_common/interface/interface_functions/machine_generator.py:143, in MachineGenerator.__call__(self, hostname, bmp_details, downed_chips, downed_cores, downed_links, board_version, auto_detect_bmp, scamp_connection_data, boot_port_num, reset_machine_on_start_up, report_waiting_logs, max_sdram_size, repair_machine, ignore_bad_ethernets, default_report_directory)
    139 if board_version is None:
    140     raise ConfigurationException(
    141         "Please set a machine version number in the "
    142         "corresponding configuration (cfg) file")
--> 143 txrx.ensure_board_is_ready()
    144 txrx.discover_scamp_connections()
    145 return txrx.get_machine_details(), txrx

File ~/.local/lib/python3.8/site-packages/spinnman/transceiver.py:981, in Transceiver.ensure_board_is_ready(self, number_of_boards, width, height, n_retries, extra_boot_values)
    979 # verify that the version is the expected one for this transceiver
    980 if version_info is None:
--> 981     raise SpinnmanIOException(
    982         "Failed to communicate with the machine")
    983 if (version_info.name != _SCAMP_NAME or
    984         not self.is_scamp_version_compabible(
    985             version_info.version_number)):
    986     raise SpinnmanIOException(
    987         "The machine is currently booted with {}"
    988         " {} which is incompatible with this transceiver, "
    989         "required version is {} {}".format(
    990             version_info.name, version_info.version_number,
    991             _SCAMP_NAME, _SCAMP_VERSION))

SpinnmanIOException: IO Error: Failed to communicate with the machine

.spynnaker.cfg file

[Machine]
#-------
# Information about the target SpiNNaker board or machine:
# machineName: The name or IP address or the target board

# One and only one of the three machineName, spalloc_server or virtual_board = True must be set

# machine name is typically a URL and then version is required
machineName = 192.168.240.1
version = 5

# spalloc_server is typically a URL and then port and user are required
spalloc_server = None
spalloc_port = 22244
spalloc_user = None

# If using virtual_board both width and height must be set
virtual_board = False
# Allowed values pairs are (2,2)  (8,8)   (n*12,m*12)  and (n*12+4, m*12+4)
width = None
height = None

# Time scale factor allows the slowing down of the simulation
time_scale_factor = None

[Reports]
# options are DEFAULT or a file path
# In all cases oldest folders are automatically deleted to max_reports_kept=
default_report_file_path = DEFAULT

# options are DEFAULT, or a file path
# In all cases oldest folders are automatically deleted to max_reports_kept=
default_application_data_file_path = DEFAULT

[Mode]
# mode = Production or Debug
# In Debug mode all report boolean config values are automatically overwritten to True
mode = Production



# Additional config options can be found in:
# /home/sauravpawar/.local/lib/python3.8/site-packages/spinn_front_end_common/interface/spinnaker.cfg
# /home/sauravpawar/.local/lib/python3.8/site-packages/spynnaker/pyNN/spynnaker.cfg

# Copy any additional settings you want to change here including section headings
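
The comments in this file say that exactly one of machineName, spalloc_server or virtual_board = True must be set. A minimal sketch of checking that rule with the standard-library configparser is shown here. The helper name `machine_sources` is hypothetical, and the sketch assumes (as the file's own defaults suggest) that the literal value None means "not set"; it does not reproduce how sPyNNaker itself reads the file.

```python
import configparser


def machine_sources(cfg_text):
    """Return which of the three machine sources are set in [Machine].

    Assumption: an empty value or the literal string "None" means unset.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    machine = parser["Machine"]

    def is_set(key):
        # Option names are case-insensitive in configparser by default.
        value = machine.get(key, "None").strip()
        return value.lower() not in ("", "none")

    sources = []
    if is_set("machineName"):
        sources.append("machineName")
    if is_set("spalloc_server"):
        sources.append("spalloc_server")
    if machine.getboolean("virtual_board", fallback=False):
        sources.append("virtual_board")
    return sources
```

Applied to the values above, this reports only machineName, so the file satisfies the "one and only one" rule; the remaining question is whether the address in machineName can actually be reached from the machine running the script.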

Config file creation:

# SNN TOOLBOX CONFIGURATION #
#############################

# Create a config file with experimental setup for SNN Toolbox.
configparser = import_configparser()
config = configparser.ConfigParser()

config['paths'] = {
    'path_wd': path_wd,             # Path to model.
    'dataset_path': path_wd,        # Path to dataset.
    'filename_ann': model_name      # Name of input model.
}

config['tools'] = {
    'evaluate_ann': True,           # Test ANN on dataset before conversion.
    # Normalize weights for full dynamic range.
    'normalize': True,
    'scale_weights_exp': True
}

config['simulation'] = {
    # Chooses execution backend of SNN toolbox.
    'simulator': 'spiNNaker',
    'duration': 50,                 # Number of time steps to run each sample.
    'num_to_test': 5,               # How many test samples to run.
    'batch_size': 1,                # Batch size for simulation.
    # SpiNNaker seems to require 0.1 for comparable results.
    'dt': 0.1
}

config['input'] = {
    'poisson_input': True,           # Images are encoded as spike trains.
    'input_rate': 1000
}

config['cell'] = {
    'tau_syn_E': 0.01,
    'tau_syn_I': 0.01
}

config['output'] = {
    'plot_vars': {                  # Various plots (slows down simulation).
        'spiketrains',              # Leave section empty to turn off plots.
        'spikerates',
        'activations',
        'correlation',
        'v_mem',
        'error_t'}
}

@rbodo
Contributor

rbodo commented Aug 15, 2022

I can't offer support for SpiNNaker-related questions at the moment; perhaps @ej159 has some insights?

(btw, it's recommended to clone the repo rather than using the pypi install.)

@sauravtii
Author

Thanks @rbodo. @ej159, can you please suggest a solution?

@ej159
Contributor

ej159 commented Aug 15, 2022

This is a problem connecting to SpiNNaker. Do you have a local SpiNNaker board or a connection to a large machine? If you don't, this won't work. Try running the code through the HBP JupyterLab interface. You will have to git clone the SNNtoolbox there and see how you get on.

Unfortunately I don't have an enormous amount of bandwidth at the moment to properly troubleshoot the problems you are coming up with, but in a couple of weeks' time I will be able to.
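
One way to narrow down the connection question above is to check, from the machine that runs the script, whether the configured machineName address can be reached at all. The following is only a rough sketch under stated assumptions: the helper names `is_private_address`, `build_ping_command` and `is_reachable` are hypothetical, the ping flags assume Linux, and it assumes the board answers ICMP ping. A successful ping would not prove that the board boots correctly; a failed one would suggest the problem lies in the network path rather than in SNN toolbox.

```python
import ipaddress
import subprocess


def is_private_address(host):
    """True if `host` is a literal IP address in a private range."""
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        return False  # not a literal IP address (e.g. a hostname)


def build_ping_command(host, count=1, timeout_s=2):
    """Build a Linux-style ping command; flags differ on other systems."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def is_reachable(host, run=subprocess.run):
    """True if one ping to `host` succeeds. `run` is injectable for testing."""
    try:
        result = run(
            build_ping_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False  # e.g. the ping executable is not available
    return result.returncode == 0
```

For the address in the posted config, `is_private_address("192.168.240.1")` is true, which means it can only be reached from a host on the same local network as the board. Calling `is_reachable("192.168.240.1")` from the machine running the notebook would then show whether that is the case.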

@sauravtii
Author

Hi, I did try running it on the HBP JupyterLab interface and got the same error that I raised this issue for (error).

@sauravtii
Author

sauravtii commented Sep 21, 2022

Just wanted to follow up @ej159

@sauravtii
Author

Just wanted to follow up

@sauravtii
Author

sauravtii commented Nov 6, 2022

Can you please help solve this? It is urgent for me, as I am working on a project with a deadline.
