Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ticket/PSB-167: Add quality to metadata table #2712

Merged
merged 2 commits into from
Aug 21, 2023

Conversation

morriscb
Copy link
Contributor

@morriscb morriscb commented Aug 4, 2023

Add quality column to VBN units table.
Add enforcement of integer typing to VBN table.
Change session lookup from a query of the pandas table to a .loc call. This was due to Int64 types being incompatible with the query. loc will likely be a quicker lookup and more pandas like.

@morriscb morriscb force-pushed the ticket/PSB-167/addQualityToMetadataTable branch from b1abd36 to eefdf51 Compare August 9, 2023 01:44
@morriscb morriscb changed the title Ticket/psb 167/add quality to metadata table Ticket/PSB-167: Add quality to metadata table Aug 9, 2023
@morriscb morriscb force-pushed the ticket/PSB-167/addQualityToMetadataTable branch from eefdf51 to 1a827a4 Compare August 11, 2023 17:02
@morriscb morriscb marked this pull request as ready for review August 14, 2023 17:21
@morriscb morriscb requested review from aamster and mikejhuang August 14, 2023 17:21
Copy link
Contributor

@aamster aamster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small changes requested

"behavior_session_id. No sessions found for "
f"id={behavior_session_id}"
)
if len(row.shape) != 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a very clear check. You are basically checking whether it returns a dataframe (in case of the behavior_session_id repeated) or series in case of behavior_session_id not repeated.

Could you check the type to see if it's a dataframe or series?

Alternatively, check how many times behavior_session_id appears in the dataframe

if (self._behavior_session_table.index == behavior_session_id).sum() != 1:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switching to isintance(obj, pd.Series)

raise RuntimeError(
"The behavior_ecephys_session_table should "
"have 1 and only 1 entry for a given "
"ecephys_session_id. Not entries found for requested "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

probes_meta = self._probe_table[
(self._probe_table["ecephys_session_id"] == ecephys_session_id)
& (self._probe_table["has_lfp_data"])
]
if session_meta.shape[0] != 1:
if len(session_meta.shape) != 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

f"behavior_session_id=={behavior_session_id}"
)
if row.shape[0] != 1:
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic to check if the id is in the index exactly 1 time is repeated. Could you write a function to avoid duplicating code?

@morriscb morriscb force-pushed the ticket/PSB-167/addQualityToMetadataTable branch from d43974b to badd5ea Compare August 15, 2023 20:33
@morriscb morriscb requested a review from aamster August 16, 2023 17:12
@@ -146,9 +128,55 @@ def f():
str(session_data_path), probe_meta=probe_meta
)

@staticmethod
def _return_one_session_metadata(input_table: pd.DataFrame,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not specific to behavior neuropixels. Can you please make this more generic such as get_one_row and place in dataframe_utils ?

f" there are {row.shape[0]} entries."
)

row = self._return_one_session_metadata(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please always pass by keyword. much clearer

@morriscb morriscb requested a review from aamster August 16, 2023 21:01
@@ -158,3 +158,48 @@ def enforce_df_int_typing(input_df: pd.DataFrame,
input_df[col] = \
input_df[col].fillna(INT_NULL).astype(int)
return input_df


def return_one_session_metadata(input_table: pd.DataFrame,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should really be more generic than it is since this file contains generic dataframe functions and this function is an outlier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really agree. While this could be more generically used, the error messages are meant to convey more descriptive information to a user trying to pull a session using get_whatever_session rather than a generic KeyError or similar.

Copy link
Contributor

@aamster aamster Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would just have to rename the function to something more generic and rename session_id to something like index_value and rename the word session. It would still output a descriptive message like you have just without the word "session". You already use the word "entries" which is generic.

Though because it's not generic it should be instead placed in a package specific to the session metadata tables such as allensdk.behavior_project_cache.tables.util. Anyways I approved it, so it's fine if it's not changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Enforce int typing for VBN columns.
Update unittests
Create functions to retrieve one and only one row from DataFrame
Add unittests for zero and multiple sessions in table.
@morriscb morriscb force-pushed the ticket/PSB-167/addQualityToMetadataTable branch from e93ad46 to 2c9dc0c Compare August 21, 2023 17:38
@morriscb morriscb merged commit a3aa0cf into rc/2.16.0 Aug 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants