Memory error on ecephys uploads #1411

wbwakeman · 2020-03-10T16:37:16Z

LIMS is now importing the complete Extracellular Electrophysiology ('ecephys'/'neuropixels') raw data during the ECEPHYS_SESSION_UPLOAD. These jobs consume all available memory and then fail. The stack trace from log file /allen/programs/braintv/production/visualbehavior/prod0/specimen_963194077/ecephys_session_1010772001/202003061905_ECEPHYS_SESSION_UPLOAD_QUEUE_1010772001_1012903422.log is below.

This is most likely caused by the hashing mechanism in

AllenSDK/allensdk/brain_observatory/ecephys/copy_utility/__main__.py

Line 15 in ba72143

def hash_file(path, hasher_cls):

To avoid this problem, we can implement Nile's suggestion to switch hash_file to read chunks of the file and call hasher.update in a loop.

might look something like:

while True:
    chunk = file_obj.read(chunk_size)
    if not chunk:
        break
    hasher.update(chunk)
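
For reference, here is a minimal sketch of how the whole function might look with that loop in place. It keeps the `hash_file(path, hasher_cls)` signature from the file referenced above and assumes `hasher_cls` is a hashlib-style constructor such as `hashlib.sha256`. The `chunk_size` parameter, its 1 MiB default, and the choice to return `hasher.digest()` are assumptions made for illustration, not taken from the existing code.

```python
import hashlib


def hash_file(path, hasher_cls, chunk_size=1024 * 1024):
    """Hash a file incrementally so memory use stays near chunk_size.

    chunk_size and the digest() return value are illustrative assumptions.
    """
    hasher = hasher_cls()
    with open(path, "rb") as file_obj:
        while True:
            chunk = file_obj.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            hasher.update(chunk)
    return hasher.digest()
```

Because hashlib hashers accept data incrementally, feeding the file in chunks should produce the same digest as hashing the full contents in a single `update` call, so the source/destination comparison would not need to change.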

2020-03-06 19:13:07,743 - 29600 - INFO - copied from //allen/programs/braintv/production/incoming/neuralcoding/1010772001_492651_20200227_platformD1.json to /allen/programs/braintv/production/visualbehavior/prod0/specimen_963194077/ecephys_session_1010772001/1012903422/1010772001_492651_20200227_platformD1.json
2020-03-06 19:13:07,976 - 29600 - INFO - copied from //allen/programs/braintv/production/incoming/neuralcoding/1010772001_492651_20200227_probeABC/settings_2.xml to /allen/programs/braintv/production/visualbehavior/prod0/specimen_963194077/ecephys_session_1010772001/1010772001_492651_20200227_probeABC/settings_2.xml
2020-03-06 20:52:44,075 - 29600 - INFO - copied from //allen/programs/braintv/production/incoming/neuralcoding/1010772001_492651_20200227_probeABC/recording_slot2_5.npx2 to /allen/programs/braintv/production/visualbehavior/prod0/specimen_963194077/ecephys_session_1010772001/1010772001_492651_20200227_probeABC/recording_slot2_5.npx2
Traceback (most recent call last):
  File "/allen/aibs/technology/conda/production/allensdk_py36/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/allen/aibs/technology/conda/production/allensdk_py36/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/allen/aibs/technology/conda/production/allensdk_py36/lib/python3.6/site-packages/allensdk/brain_observatory/ecephys/copy_utility/__main__.py", line 140, in <module>
    output = main(**parser.args)
  File "/allen/aibs/technology/conda/production/allensdk_py36/lib/python3.6/site-packages/allensdk/brain_observatory/ecephys/copy_utility/__main__.py", line 122, in main
    hashes = compare(file_entry['source'], file_entry['destination'], hasher_cls, raise_if_comparison_fails)
  File "/allen/aibs/technology/conda/production/allensdk_py36/lib/python3.6/site-packages/allensdk/brain_observatory/ecephys/copy_utility/__main__.py", line 70, in compare
    return compare_files(source, dest, hasher_cls, raise_if_comparison_fails)
  File "/allen/aibs/technology/conda/production/allensdk_py36/lib/python3.6/site-packages/allensdk/brain_observatory/ecephys/copy_utility/__main__.py", line 76, in compare_files
    source_hash = hash_file(source, hasher_cls)
  File "/allen/aibs/technology/conda/production/allensdk_py36/lib/python3.6/site-packages/allensdk/brain_observatory/ecephys/copy_utility/__main__.py", line 18, in hash_file
    hasher.update(file_obj.read())
MemoryError



wbwakeman added the braintv (relates to Institute BrainTV program), bug, and neuropixels labels Mar 10, 2020

kschelonka self-assigned this Mar 10, 2020

kschelonka added a commit that referenced this issue Mar 11, 2020

Merge pull request #1413 from AllenInstitute/GH-1411/bugfix/checksum-…

8701e34

…large-data Do checksums for data files in chunks

djkapner closed this as completed Mar 13, 2020

wbwakeman added this to the Pika 2020-03-11 milestone Mar 13, 2020
