Ticket/2454/rename/vbn/fields #2462

danielsf · 2022-06-03T23:36:38Z

This PR renames some fields in the NWB and metadata tables to bring the two datastreams into alignment.

It also adds the "strip-substructure" functionality to the "structure_acronym" column in the metadata tables.

aamster

Looks good, with a few minor comments

aamster · 2022-06-07T14:20:29Z

allensdk/brain_observatory/ecephys/utils.py

+    any other float will provoke an error.
+    """
+
+    if isinstance(acronym, numbers.Number):


It looks like the outer if can be safely removed since you are just checking if it is nan here

Actually, numpy doesn't like running np.isnan on non-numbers

>>> import numpy as np
>>> np.isnan('abcd')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
>>>
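
To illustrate the point above, here is a minimal sketch of guarding np.isnan with the isinstance check so that strings never reach it. The helper name is_nan_acronym is hypothetical and not part of this PR; it only demonstrates why the outer check is kept.

```python
import numbers

import numpy as np


def is_nan_acronym(acronym):
    """Return True only when acronym is a numeric NaN (hypothetical helper)."""
    # Check the type first: np.isnan raises TypeError on strings and None.
    if isinstance(acronym, numbers.Number):
        return bool(np.isnan(acronym))
    return False


print(is_nan_acronym(np.nan))
print(is_nan_acronym('abcd'))
```

With the guard in place, non-numeric inputs simply fall through instead of raising.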

aamster · 2022-06-07T14:24:43Z

allensdk/brain_observatory/ecephys/utils.py

+        new_acronym = set()
+
+        for el in acronym:
+            if isinstance(el, str):


It looks like this part can recursively call strip_substructure_acronym and add the results to a list if you feel comfortable writing that

Good catch; done.
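
For readers following along, a rough sketch of what the recursive form could look like. This is not the code merged in this PR: it assumes the substructure is whatever follows the first '-' in the acronym (e.g. 'DG-mo' -> 'DG'), that NaN and None map to None, and that a flat list input comes back as a sorted list of unique values. All of those are assumptions made for illustration.

```python
import numbers

import numpy as np


def strip_substructure_acronym(acronym):
    """Illustrative sketch only; not the implementation from this PR.

    Assumes the substructure follows the first '-' ('DG-mo' -> 'DG').
    """
    if acronym is None:
        return None
    if isinstance(acronym, numbers.Number):
        # Assumption: NaN maps to None; any other number is an error.
        if np.isnan(acronym):
            return None
        raise RuntimeError(f"cannot strip substructure from {acronym!r}")
    if isinstance(acronym, str):
        return acronym.split('-')[0]
    # Anything else is treated as a flat iterable: recurse on each
    # element, drop None results, then de-duplicate and sort.
    stripped = {strip_substructure_acronym(el) for el in acronym}
    stripped.discard(None)
    return sorted(stripped)
```

Recursing on each element means the string, None, and NaN handling lives in one place instead of being repeated inside the loop.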

aamster · 2022-06-07T14:28:15Z

allensdk/brain_observatory/vbn_2022/metadata_writer/dataframe_manipulations.py

@@ -526,3 +529,38 @@ def _get_session_duration_from_behavior_session_ids(
        index=pd.Int64Index([x['behavior_session_id'] for x in durations],
                            name='behavior_session_id'))
    return durations
+
+
+def strip_substructure_acronym_df(


I'm not sure why this function is needed. Can you not just do

df[col_name] = df[col_name].apply(strip_substructure_acronym)

I can; I just don't know that much pandas. I've made the change.

Note that apply is not an in-place operation, i.e.

units_table['structure_acronym'].apply(strip_substructure_acronym)

won't have any effect. It needs to be assigned.

this is why I hate pandas ;)

Just running apply results in a series....

>>> import pandas as pd
>>> data = [{'a': 1, 'b': 2}, {'a': 2, 'b': 3}]
>>> df = pd.DataFrame(data=data)
>>> def fn(x):
...     return x**2
...
>>>
>>> new_df = df['a'].apply(fn)
>>> new_df
0    1
1    4
Name: a, dtype: int64

I'm just going to go back to my "implement a function to munge the structure acronym column." It was tested.

I really do not like pandas

You need to assign the output to a column, not to the dataframe, i.e.

units_table['structure_acronym'] = units_table['structure_acronym'].apply(strip_substructure_acronym)
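
A small self-contained sketch of the difference discussed in this thread. The function strip_first_token is a hypothetical stand-in for strip_substructure_acronym (it assumes the substructure follows the first '-'), and the table contents are made up for illustration.

```python
import pandas as pd


def strip_first_token(acronym):
    # Hypothetical stand-in for strip_substructure_acronym.
    return acronym.split('-')[0]


units_table = pd.DataFrame({'structure_acronym': ['DG-mo', 'CA1', 'VISp']})

# Calling apply on its own returns a new Series; the frame is untouched.
result = units_table['structure_acronym'].apply(strip_first_token)
unchanged = units_table['structure_acronym'].tolist()

# Assigning the Series back to the column is what updates the frame.
units_table['structure_acronym'] = result
updated = units_table['structure_acronym'].tolist()
```

Here unchanged still holds the original acronyms, while updated holds the stripped ones.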

danielsf added 7 commits June 2, 2022 22:36

rename manual_structure_acronyms -> structure_acronyms

cef3866

rename description -> name

ca8f261

drop waveform_ prefix from all fields

a49eb9a

except waveform_duration and waveform_halfwidth

add methods to sanitize structure acronyms

3c35132

strip substructure from channels, probes, and units metadata tables

f707aac

fix pep8 error

abf49d8

structure acronym sanitization can handle acronym=None

5fd483a

danielsf force-pushed the ticket/2454/rename/vbn/fields branch from 4e76aae to e1adc38 Compare June 6, 2022 20:31

use same util to strip substructure acronym everywhere

ad14c9f

danielsf force-pushed the ticket/2454/rename/vbn/fields branch from e1adc38 to ad14c9f Compare June 6, 2022 20:33

strip_substructure_acronym can handle np.NaN

bbb2e7e

(necessary to maintain backwards compatibility with VCN)

danielsf force-pushed the ticket/2454/rename/vbn/fields branch from 1cc4625 to bbb2e7e Compare June 6, 2022 22:54

aamster approved these changes Jun 7, 2022

danielsf force-pushed the ticket/2454/rename/vbn/fields branch 2 times, most recently from 603c881 to 7c5a68f Compare June 7, 2022 17:28

strip_substructure_acronym recursively calls self

1a7f9ca

danielsf force-pushed the ticket/2454/rename/vbn/fields branch from 7c5a68f to 1a7f9ca Compare June 7, 2022 20:23

danielsf merged commit 92cd3a0 into vbn_2022_dev Jun 7, 2022

danielsf mentioned this pull request Jun 7, 2022

Rename fields in VBN data/metadata #2454

Closed

3 tasks

danielsf deleted the ticket/2454/rename/vbn/fields branch June 8, 2022 16:45
