Ticket/2450/supplemental/columns #2451

Merged
danielsf merged 5 commits into vbn_2022_dev from ticket/2450/supplemental/columns
Jun 1, 2022

danielsf
Contributor

As part of the 2022 VBN release, we are adding some hand annotations not currently represented in the LIMS database to the ecephys_sessions table. Rather than update the schema of the LIMS database (a change that would have implications for all previously collected ecephys data), we are adding functionality to the VBN metadata_writer that allows us to add columns to the ecephys_sessions table by hand. This PR adds that functionality. The new supplemental_columns entry in the metadata_writer schema should look something like this:

  "supplemental_columns": [
    {
      "abnormal_activity": false,
      "ecephys_session_id": 1051155866
    },
    {
      "abnormal_activity": false,
      "ecephys_session_id": 1044385384
    },
    {
      "abnormal_activity": false,
      "ecephys_session_id": 1044594870
    },
    {
      "abnormal_activity": false,
      "abnormal_histology": [
        "Hippocampus"
      ],
      "ecephys_session_id": 1056495334
    }]
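
As a rough illustration of how a list in this shape maps onto table columns, here is a minimal sketch, assuming pandas. The two entries are copied from the example above; nothing here is the metadata_writer's actual code. Each dict becomes one row, and a key that is missing from a dict (abnormal_histology in the first entry) shows up as NaN for that session.

```python
import pandas as pd

# Two entries in the shape shown above (values copied from the example).
supplemental_columns = [
    {"abnormal_activity": False, "ecephys_session_id": 1051155866},
    {
        "abnormal_activity": False,
        "abnormal_histology": ["Hippocampus"],
        "ecephys_session_id": 1056495334,
    },
]

# One row per dict; keys absent from a dict become NaN in that row.
supplemental_df = pd.DataFrame(data=supplemental_columns)

# Every key other than the session ID is a candidate new column.
columns_to_patch = [
    col for col in supplemental_df.columns if col != "ecephys_session_id"
]
```

Because the entries do not all need the same keys, sparse annotations such as abnormal_histology only have to be listed for the sessions they apply to.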

supplemental_df = pd.DataFrame(
    data=self.args['supplemental_columns'])

columns_to_patch = []
Contributor

Can't you just use pd.merge here instead of patch_df_from_other?

Contributor Author

Even if we did just use pd.merge, I'd want to wrap it in a function that we could test to make sure that the columns we are adding get added the way we expect. patch_df_from_other is already tested. I'd rather keep this as it is.
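
For readers weighing the two options, here is a minimal sketch of what "pd.merge wrapped in a testable function" could look like. It is not patch_df_from_other, whose implementation is not shown in this thread; the name add_supplemental_columns, its checks, and the toy sessions table (including the session_type column) are assumptions made for illustration.

```python
import pandas as pd


def add_supplemental_columns(sessions_df, supplemental_df,
                             index_column="ecephys_session_id"):
    """
    Hypothetical helper (NOT patch_df_from_other): left-join
    supplemental_df onto sessions_df using index_column as the key.
    """
    # Refuse to silently clobber columns that already exist.
    overlap = (set(sessions_df.columns)
               & set(supplemental_df.columns)) - {index_column}
    if overlap:
        raise ValueError(f"columns already present: {sorted(overlap)}")

    # A duplicated key would duplicate session rows in the join.
    if supplemental_df[index_column].duplicated().any():
        raise ValueError(f"duplicate {index_column} in supplemental data")

    # how="left" keeps every session; unannotated sessions get NaN.
    return pd.merge(sessions_df, supplemental_df,
                    on=index_column, how="left")


# Toy sessions table; session_type is an invented placeholder column.
sessions_df = pd.DataFrame({
    "ecephys_session_id": [1051155866, 1044385384, 1044594870],
    "session_type": ["a", "b", "c"],
})
supplemental_df = pd.DataFrame([
    {"ecephys_session_id": 1051155866, "abnormal_activity": False},
    {"ecephys_session_id": 1044385384, "abnormal_activity": False},
])

patched_df = add_supplemental_columns(sessions_df, supplemental_df)
```

Either way, the behavior worth pinning down in a test is the same: row count unchanged, new columns present, and sessions without annotations left empty.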

@@ -47,6 +47,18 @@ class VBN2022MetadataWriterInputSchema(argschema.ArgSchema):
"{ecephys_nwb_dir}/{ecephys_nwb_prefix}_{ecephys_session_id}.nwb")
)

supplemental_columns = argschema.fields.List(
Contributor

I think this argument would be better named as supplemental_data. supplemental_columns makes it seem like it is a list of column names.

Contributor Author

done

@@ -47,6 +47,18 @@ class VBN2022MetadataWriterInputSchema(argschema.ArgSchema):
"{ecephys_nwb_dir}/{ecephys_nwb_prefix}_{ecephys_session_id}.nwb")
)

supplemental_columns = argschema.fields.List(
Contributor

Should this rather be an input file? Passing a long list of dicts through the command line would be cumbersome.

Contributor Author

I don't want to proliferate the number of input files we have to keep track of. These input.jsons are large, but they have the virtue of carrying everything we need in one package.

I also don't see users specifying this field (or, for that matter, probes_to_skip) on the command line.

I'd rather leave this as it is.
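
To make the "one package" point concrete, here is a small sketch using only the standard library: the supplemental entries sit in the same JSON file as every other argument and are read back with a single load. The file name, the temporary directory, and the empty probes_to_skip value are assumptions for illustration; the real shape of probes_to_skip is not shown in this thread.

```python
import json
import tempfile
from pathlib import Path

# A pared-down stand-in for an input.json (real ones carry many more keys).
config = {
    "supplemental_columns": [
        {"abnormal_activity": False, "ecephys_session_id": 1051155866},
    ],
    # Shape of probes_to_skip is not shown in this thread; left empty here.
    "probes_to_skip": [],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    input_path = Path(tmp_dir) / "input.json"
    input_path.write_text(json.dumps(config, indent=2))

    # Everything the run needs comes back from one file.
    loaded = json.loads(input_path.read_text())

session_ids = [
    row["ecephys_session_id"] for row in loaded["supplemental_columns"]
]
```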

@danielsf danielsf merged commit 03bc311 into vbn_2022_dev Jun 1, 2022
@danielsf danielsf deleted the ticket/2450/supplemental/columns branch June 8, 2022 16:45

2 participants