Allen hot fix #1116

Merged
merged 23 commits into from
Jul 8, 2024

Conversation

iamzoltan
Contributor

No description provided.

@iamzoltan
Contributor Author

@steevelaquitaine @yavorska-iryna - it's the same issue as before.

projects/neurons/load_Allen_Visual_Behavior_from_SDK.ipynb failed quality control.
An error occurred while executing the following cell:
------------------
data_storage_directory = "/temp"  # Note: this path must exist on your local drive
cache = VisualBehaviorOphysProjectCache.from_s3_cache(cache_dir=data_storage_directory)
------------------


---------------------------------------------------------------------------
PermissionError                           Traceback (most recent call last)
Cell In[3], line 2
      1 data_storage_directory = "/temp"  # Note: this path must exist on your local drive
----> 2 cache = VisualBehaviorOphysProjectCache.from_s3_cache(cache_dir=data_storage_directory)

File /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/allensdk/brain_observatory/behavior/behavior_project_cache/project_cache_base.py:77, in ProjectCacheBase.from_s3_cache(cls, cache_dir, bucket_name_override)
     50 @classmethod
     51 def from_s3_cache(
     52         cls,
     53         cache_dir: Union[str, Path],
     54         bucket_name_override: Optional[str] = None
     55 ) -> "ProjectCacheBase":
     56     """instantiates this object with a connection to an s3 bucket and/or
     57     a local cache related to that bucket.
     58 
   (...)
     74 
     75     """
---> 77     fetch_api = cls.cloud_api_class().from_s3_cache(
     78         cache_dir,
     79         bucket_name=(
     80             bucket_name_override if bucket_name_override is not None
     81             else cls.BUCKET_NAME),
     82         project_name=cls.PROJECT_NAME,
     83         ui_class_name=cls.__name__)
     85     return cls(fetch_api=fetch_api)

File /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/allensdk/brain_observatory/behavior/behavior_project_cache/project_apis/data_io/project_cloud_api_base.py:108, in ProjectCloudApiBase.from_s3_cache(cls, cache_dir, bucket_name, project_name, ui_class_name)
     78 @classmethod
     79 def from_s3_cache(cls, cache_dir: Union[str, Path],
     80                   bucket_name: str,
     81                   project_name: str,
     82                   ui_class_name: str) -> "ProjectCloudApiBase":
     83     """instantiates this object with a connection to an s3 bucket and/or
     84     a local cache related to that bucket.
     85 
   (...)
    106 
    107     """
--> 108     cache = S3CloudCache(cache_dir,
    109                        bucket_name,
    110                        project_name,
    111                        ui_class_name=ui_class_name)
    112     return cls(cache)

File /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/allensdk/api/cloud_cache/cloud_cache.py:1066, in S3CloudCache.__init__(self, cache_dir, bucket_name, project_name, ui_class_name)
   1063     self._manifest = None
   1064     self._bucket_name = bucket_name
-> 1066     super().__init__(cache_dir=cache_dir, project_name=project_name,
   1067                      ui_class_name=ui_class_name)

File /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/allensdk/api/cloud_cache/cloud_cache.py:391, in CloudCacheBase.__init__(self, cache_dir, project_name, ui_class_name)
    390 def __init__(self, cache_dir, project_name, ui_class_name=None):
--> 391     super().__init__(cache_dir=cache_dir, project_name=project_name,
    392                      ui_class_name=ui_class_name)
    394     # what latest_manifest was the last time an OutdatedManifestWarning
    395     # was emitted
    396     self._manifest_last_warned_on = None

File /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/allensdk/api/cloud_cache/cloud_cache.py:63, in BasicLocalCache.__init__(self, cache_dir, project_name, ui_class_name)
     57 def __init__(
     58     self,
     59     cache_dir: Union[str, Path],
     60     project_name: str,
     61     ui_class_name: Optional[str] = None
     62 ):
---> 63     os.makedirs(cache_dir, exist_ok=True)
     65     # the class users are actually interacting with
     66     # (for warning message purposes)
     67     if ui_class_name is None:

File /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/os.py:225, in makedirs(name, mode, exist_ok)
    223         return
    224 try:
--> 225     mkdir(name, mode)
    226 except OSError:
    227     # Cannot rely on checking for EEXIST, since the operating system
    228     # could give priority to other errors like EACCES or EROFS
    229     if not exist_ok or not path.isdir(name):

PermissionError: [Errno 13] Permission denied: '/temp'

@iamzoltan
Contributor Author

iamzoltan commented Jul 8, 2024

This can probably be fixed by replacing /temp with a directory relative to the current working directory.

@steevelaquitaine
Contributor

If I understand correctly, this should be fixable by calling os.mkdir("/temp") before the following lines:

data_storage_directory = "/temp" # Note: this path must exist on your local drive
cache = VisualBehaviorOphysProjectCache.from_s3_cache(cache_dir=data_storage_directory)

Can you check @yavorska-iryna?
Thanks a lot in advance.

@yavorska-iryna
Contributor

We have not seen this issue when we ran the notebook. Steeve's fix may work. I need to test it.

@yavorska-iryna
Contributor

@iamzoltan I don't run into the same permission error when I test the notebook in JupyterLab or Colab. I changed the name of the folder and the code still ran, so I don't think os.mkdir would fix it. Since I can't replicate this error, it's hard for me to fix it.

@iamzoltan
Contributor Author

Yes, I understand. We are talking about the processing environment on GH. Can you change the location to ./temp, and let's see if that fixes the issue?
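
One way to compare environments such as a local machine and the GH runner is to check whether the nearest existing ancestor of the cache path is writable, since that is where directory creation has to start. The following is a minimal sketch only: the helper names `nearest_existing_ancestor` and `can_probably_create` are hypothetical and not part of the AllenSDK, and `os.access` reports what the current user is nominally allowed to do, so treat the result as a hint rather than a guarantee.

```python
import os
import tempfile
from pathlib import Path


def nearest_existing_ancestor(path):
    """Walk upward from `path` until a location that already exists is found."""
    current = Path(path).resolve()
    # The filesystem root always exists, so this loop terminates.
    while not os.path.exists(current):
        current = current.parent
    return current


def can_probably_create(path):
    """Return True if the current user appears able to create `path`."""
    anchor = nearest_existing_ancestor(path)
    # Creating a subdirectory needs write and search permission on the parent.
    return anchor.is_dir() and os.access(anchor, os.W_OK | os.X_OK)


# Demonstration inside a scratch directory that is known to be writable.
with tempfile.TemporaryDirectory() as scratch:
    scratch_path = Path(scratch).resolve()
    target = scratch_path / "a" / "b"
    anchor_matches = nearest_existing_ancestor(target) == scratch_path
    creatable = can_probably_create(target)
```

Calling `can_probably_create("/temp")` and `can_probably_create("./temp")` in each environment would show whether the two paths behave differently there; the answer depends on the user the process runs as.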

@yavorska-iryna
Contributor

yavorska-iryna commented Jul 8, 2024

Let me look into it.

@yavorska-iryna
Contributor

yavorska-iryna commented Jul 8, 2024

@iamzoltan I changed the path to "./temp" and merged it to my forked branch. Let me know if that works.

@matchings
Contributor

I just tested the notebook on the allen-hot-fix branch with the following change to the cache loading and it worked as expected:

data_storage_directory = "./temp"
cache = VisualBehaviorOphysProjectCache.from_s3_cache(cache_dir=data_storage_directory)

A folder called 'temp' was created in the same directory as the notebook and the cache loaded properly.
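
The behaviour described above follows from how a relative path is resolved against the current working directory and from the `os.makedirs(cache_dir, exist_ok=True)` call visible in the traceback. A rough sketch of the same idea, kept independent of the AllenSDK; `prepare_cache_dir` and its `base` parameter are hypothetical names used only for illustration:

```python
import os
import tempfile
from pathlib import Path


def prepare_cache_dir(name="temp", base=None):
    """Create `name` under `base` (default: current working directory) if needed."""
    base_dir = Path(base) if base is not None else Path.cwd()
    cache_dir = base_dir / name
    # exist_ok=True makes the call safe to repeat when the folder already exists.
    os.makedirs(cache_dir, exist_ok=True)
    return cache_dir


# Demonstration in a scratch directory so nothing is left behind.
with tempfile.TemporaryDirectory() as scratch:
    created = prepare_cache_dir("temp", base=scratch)
    created_is_dir = created.is_dir()
    created_name = created.name
    # A second call returns the same path without raising.
    same_again = prepare_cache_dir("temp", base=scratch) == created

# In the notebook, the returned path could then be passed on, for example:
# cache = VisualBehaviorOphysProjectCache.from_s3_cache(cache_dir=prepare_cache_dir())
```

Because the directory is created next to wherever the process is started, it stays inside a location the process can normally write to, unlike a folder directly under the filesystem root.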

changing data storage directory
@iamzoltan
Contributor Author

iamzoltan commented Jul 8, 2024

Looks like the process is failing after adding in the fMRI fixes

@iamzoltan
Contributor Author

Actually, I can't get the Allen books to run locally. I get this error:

     41 # query on valid_roi if exclude_invalid_rois == True
     42 if exclude_invalid_rois:
---> 43     cell_specimen_table = ophys_experiment.cell_specimen_table.query('valid_roi').reset_index()  # noqa E501
     44 else:
     45     cell_specimen_table = ophys_experiment.cell_specimen_table.reset_index()  # noqa E501

Cell In[16], line 20, in <lambda>(self, expr, **kwargs)
     18 pd.set_option('display.max_columns', 500)
     19 # this line may be needed if you run into Error in pandas query function
---> 20 pd.DataFrame.query = lambda self, expr, **kwargs: self.query(expr, engine='python', **kwargs) 

Cell In[16], line 20, in <lambda>(self, expr, **kwargs)
     18 pd.set_option('display.max_columns', 500)
     19 # this line may be needed if you run into Error in pandas query function
---> 20 pd.DataFrame.query = lambda self, expr, **kwargs: self.query(expr, engine='python', **kwargs) 

TypeError: __main__.<lambda>() got multiple values for keyword argument 'engine'

@yavorska-iryna
Contributor

@iamzoltan this line can be removed: pd.DataFrame.query = lambda self, expr, **kwargs: self.query(expr, engine='python', **kwargs). It was added because in some instances pandas used the numexpr engine and the query function didn't work. The same problem can also be fixed by passing engine='python' when calling DataFrame.query.
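
For context on the TypeError: inside the lambda, `self.query` looks up the patched method again, so the lambda calls itself and the second call receives `engine='python'` both explicitly and through `**kwargs`. If a patch of this kind were still wanted, one possible pattern is to keep a reference to the original method and use `setdefault`. The sketch below is an assumption-laden illustration, not the notebook's code: `force_python_engine` is a hypothetical helper, and `Table` is a stand-in class used so the example runs without pandas.

```python
import functools


def force_python_engine(cls):
    """Patch cls.query so that calls default to engine='python'."""
    original = cls.query
    # Guard against re-running the notebook cell: never wrap the wrapper.
    if getattr(original, "_forces_python_engine", False):
        return cls

    @functools.wraps(original)
    def query(self, expr, **kwargs):
        # setdefault respects an explicit engine=... from the caller and
        # never passes the keyword twice.
        kwargs.setdefault("engine", "python")
        # Call the saved original, not self.query, which would re-enter the patch.
        return original(self, expr, **kwargs)

    query._forces_python_engine = True
    cls.query = query
    return cls


class Table:
    """Stand-in for a DataFrame-like class; not part of pandas."""

    def query(self, expr, engine="numexpr"):
        return (expr, engine)


force_python_engine(Table)
force_python_engine(Table)  # second call is a no-op
result_default = Table().query("valid_roi")
result_explicit = Table().query("valid_roi", engine="numexpr")
```

With pandas installed, `force_python_engine(pd.DataFrame)` would apply the same pattern, though removing the patch and passing `engine='python'` at the call site, as suggested above, is the simpler option.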

@iamzoltan
Contributor Author

iamzoltan commented Jul 8, 2024

I am sorting it out. Now we are running out of space, which is a good error. I will clear some space and try again.

@iamzoltan iamzoltan merged commit f2272cd into main Jul 8, 2024
6 participants