Discrepancy in Result Files for GPU-based Neuron Serialization #701

Open
d-kamath opened this issue Sep 9, 2024 · 0 comments
d-kamath commented Sep 9, 2024

CPU-based Neuron serialization (#569) works as expected. We verified this by comparing two simulation approaches:

  1. Running the entire configuration file in a single simulation (e.g., 10 epochs).
  2. Running the same configuration in two stages:
    Stage 1: Simulating and serializing the first half of the configuration (e.g., 5 epochs).
    Stage 2: Deserializing the saved state from Stage 1 and completing the simulation with the second half (e.g., 5 epochs).

The final result files from the full simulation and from the two-stage run (Stage 1 followed by Stage 2) are identical, confirming the correctness of the serialization process.
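
The comparison above can be sketched in a few lines. This is a toy illustration only: `ToySimulation`, `run_full`, and `run_two_stage` are hypothetical names, not part of the project, and Python's `pickle` stands in for the real serializer. The point is that when all state, including the noise generator, is serialized, the two-stage run reproduces the single run exactly.

```python
import pickle
import random


class ToySimulation:
    """Hypothetical stand-in for a simulator; not the project's API."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # noise source, part of the saved state
        self.potential = 0.0            # toy "neuron" state
        self.history = []               # plays the role of the result file

    def run(self, epochs):
        for _ in range(epochs):
            # Leaky update driven by Gaussian noise.
            self.potential = 0.9 * self.potential + self.rng.gauss(0.0, 1.0)
            self.history.append(self.potential)


def run_full(seed, epochs):
    """Approach 1: run the whole configuration in one simulation."""
    sim = ToySimulation(seed)
    sim.run(epochs)
    return sim.history


def run_two_stage(seed, epochs):
    """Approach 2: run half, serialize, deserialize, run the rest."""
    half = epochs // 2
    stage1 = ToySimulation(seed)
    stage1.run(half)
    blob = pickle.dumps(stage1)   # Stage 1: serialize the full state
    stage2 = pickle.loads(blob)   # Stage 2: deserialize and continue
    stage2.run(epochs - half)
    return stage2.history


if __name__ == "__main__":
    same = run_full(42, 10) == run_two_stage(42, 10)
    print("result files identical:", same)
```

Comparing the results for exact equality, rather than within a tolerance, mirrors the check described above: a correct serialization round trip should not change any value.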

For GPU-based simulations, serialization runs without errors, but the result file from a full simulation differs from the one produced by the two-stage run. The discrepancy arises because the GPU's random noise array is not serialized. To resolve this, we need to implement a method that manually copies the GPU state, including the random noise array, into temporary storage during serialization and restores it during deserialization.
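
The failure mode and the proposed fix can be illustrated with a small host-only model. Everything here is an assumption made for illustration: `ToyGpuSimulation` is a hypothetical class, a `random.Random` instance stands in for the GPU-resident noise state, and the "missing state" case is modeled as the noise generator being re-initialized from its seed after deserialization. In the real code base the copy would presumably be a device-to-host transfer and back, which this sketch does not attempt to show.

```python
import random

SEED = 1234  # arbitrary seed for the toy example


class ToyGpuSimulation:
    """Hypothetical model: host state is serialized, 'device' noise is separate."""

    def __init__(self):
        self.potential = 0.0                     # host-side state
        self.results = []                        # stands in for the result file
        self.device_noise = random.Random(SEED)  # stands in for GPU noise state

    def run(self, epochs):
        for _ in range(epochs):
            noise = self.device_noise.gauss(0.0, 1.0)
            self.potential = 0.9 * self.potential + noise
            self.results.append(self.potential)

    def serialize(self, include_device_state):
        snapshot = {"potential": self.potential, "results": list(self.results)}
        if include_device_state:
            # Copy the "device" state into temporary host storage.
            snapshot["noise_state"] = self.device_noise.getstate()
        return snapshot

    @classmethod
    def deserialize(cls, snapshot):
        sim = cls()  # the "device" noise starts over from the seed here
        sim.potential = snapshot["potential"]
        sim.results = list(snapshot["results"])
        if "noise_state" in snapshot:
            # Restore the saved state back onto the "device".
            sim.device_noise.setstate(snapshot["noise_state"])
        return sim


def full(epochs=10):
    sim = ToyGpuSimulation()
    sim.run(epochs)
    return sim.results


def two_stage(include_device_state, epochs=10):
    half = epochs // 2
    stage1 = ToyGpuSimulation()
    stage1.run(half)
    snapshot = stage1.serialize(include_device_state)
    stage2 = ToyGpuSimulation.deserialize(snapshot)
    stage2.run(epochs - half)
    return stage2.results


if __name__ == "__main__":
    print("without noise state:", two_stage(False) == full())
    print("with noise state:   ", two_stage(True) == full())
```

In this model the first half of the results always matches, and only the epochs after the restore diverge when the noise state is left out, which is consistent with serialization appearing functional while the final result files differ.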

@d-kamath d-kamath self-assigned this Sep 9, 2024