Ticket/PSB-167: Add quality to metadata table #2712
Some small changes requested
"behavior_session_id. No sessions found for "
f"id={behavior_session_id}"
)
if len(row.shape) != 1:
This isn't a very clear check. You are effectively checking whether the lookup returns a DataFrame (when the behavior_session_id is repeated) or a Series (when it is not repeated).
Could you check the type to see whether it's a DataFrame or a Series?
Alternatively, check how many times behavior_session_id appears in the dataframe index:
if (self._behavior_session_table.index == behavior_session_id).sum() != 1:
Switching to isinstance(obj, pd.Series)
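To illustrate the point of this thread, here is a minimal sketch of how a `.loc` lookup changes its return type depending on how often a label occurs in the index. The table, column name, and ids are made up for the example and are not taken from the PR.

```python
import pandas as pd

# Hypothetical table: id 10 appears once, id 20 appears twice.
table = pd.DataFrame(
    {"session_type": ["a", "b", "c"]},
    index=pd.Index([10, 20, 20], name="behavior_session_id"),
)

unique_row = table.loc[10]     # single match: a 1-D Series
repeated_rows = table.loc[20]  # multiple matches: a 2-D DataFrame

# The original check, len(row.shape) != 1, relies on this shape difference.
# The two alternatives suggested in the review state the intent directly:
is_series = isinstance(unique_row, pd.Series)
is_frame = isinstance(repeated_rows, pd.DataFrame)
n_matches = int((table.index == 20).sum())
```

Counting matches in the index has the extra benefit of also covering the zero-match case without relying on a `KeyError`.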
raise RuntimeError(
"The behavior_ecephys_session_table should "
"have 1 and only 1 entry for a given "
"ecephys_session_id. Not entries found for requested "
typo
probes_meta = self._probe_table[
(self._probe_table["ecephys_session_id"] == ecephys_session_id)
& (self._probe_table["has_lfp_data"])
]
if session_meta.shape[0] != 1:
if len(session_meta.shape) != 1:
Same comment as above.
f"behavior_session_id=={behavior_session_id}"
)
if row.shape[0] != 1:
try:
The logic that checks whether the id is in the index exactly once is repeated. Could you write a function to avoid duplicating code?
@@ -146,9 +128,55 @@ def f():
str(session_data_path), probe_meta=probe_meta
)

@staticmethod
def _return_one_session_metadata(input_table: pd.DataFrame,
This is not specific to behavior neuropixels. Can you please make this more generic, such as get_one_row, and place it in dataframe_utils?
f" there are {row.shape[0]} entries."
)

row = self._return_one_session_metadata(
Please always pass arguments by keyword; it is much clearer.
allensdk/core/dataframe_utils.py
Outdated
@@ -158,3 +158,48 @@ def enforce_df_int_typing(input_df: pd.DataFrame,
input_df[col] = \
input_df[col].fillna(INT_NULL).astype(int)
return input_df


def return_one_session_metadata(input_table: pd.DataFrame,
This should really be more generic than it is: this file contains generic dataframe functions, and this function is an outlier.
I don't really agree. While this could be used more generically, the error messages are meant to convey more descriptive information to a user trying to pull a session using get_whatever_session, rather than a generic KeyError or similar.
You would just have to rename the function to something more generic, rename session_id to something like index_value, and replace the word "session" in the messages. It would still output a descriptive message like the one you have, just without the word "session". You already use the word "entries", which is generic.
Since it is not generic as written, though, it should instead be placed in a package specific to the session metadata tables, such as allensdk.behavior_project_cache.tables.util. Anyway, I approved it, so it's fine if it's not changed.
done
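For reference, a minimal sketch of what the generic helper discussed in this thread could look like. The name `get_one_row` and the `index_value` parameter follow the reviewer's suggestions; the `table_name` parameter, the exact message wording, and the demo table are assumptions for illustration, not the code that was merged. Arguments are keyword-only, in line with the "pass by keyword" request above.

```python
import pandas as pd


def get_one_row(*, table: pd.DataFrame, index_value: int,
                table_name: str) -> pd.Series:
    """Return the single row of `table` whose index equals `index_value`.

    Hypothetical helper: raises RuntimeError with a descriptive message
    when the value appears zero times or more than once in the index.
    """
    n_matches = int((table.index == index_value).sum())
    if n_matches != 1:
        raise RuntimeError(
            f"{table_name} should have 1 and only 1 entry for "
            f"index value {index_value}; found {n_matches} entries."
        )
    # Exactly one match, so .loc returns a Series rather than a DataFrame.
    return table.loc[index_value]


# Made-up table: id 1 is unique, id 2 is duplicated, id 99 is absent.
demo_table = pd.DataFrame({"value": ["a", "b", "c"]}, index=[1, 2, 2])
row = get_one_row(table=demo_table, index_value=1, table_name="demo_table")
```

Counting matches up front handles the missing and duplicated cases in one branch, so callers get the same kind of error for both.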
Enforce int typing for VBN columns.
Update unittests.
Create functions to retrieve one and only one row from DataFrame.
Add unittests for zero and multiple sessions in table.
Add .flake8 config file.
Add quality column to VBN units table.
Add enforcement of integer typing to VBN table.
Change session lookup from a query of the pandas table to a .loc call. This was due to Int64 types being incompatible with the query. .loc will likely be a quicker lookup and more pandas-like.
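As a rough illustration of the last point, the sketch below looks up a row by label with `.loc` on a table whose id column uses the nullable `Int64` dtype. The table contents and ids are invented; the original query call is not shown in this PR description, so it is not reproduced here.

```python
import pandas as pd

# Hypothetical units table indexed by a nullable-integer session id.
units = pd.DataFrame(
    {
        "ecephys_session_id": pd.array([1001, 1002, 1003], dtype="Int64"),
        "quality": ["good", "noise", "good"],
    }
).set_index("ecephys_session_id")

# Label-based lookup: one matching id gives back a single row as a Series.
row = units.loc[1002]
quality = row["quality"]

# A missing id raises KeyError, which callers can turn into a clearer error.
try:
    units.loc[9999]
    missing_raises = False
except KeyError:
    missing_raises = True
```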