-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ticket/2454/rename/vbn/fields #2462
Conversation
except waveform_duration and waveform_halfwidth
4e76aae
to
e1adc38
Compare
e1adc38
to
ad14c9f
Compare
(necessary to maintain backwards combatibility with VCN)
1cc4625
to
bbb2e7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, with a few minor comments
any other float will provoke an error. | ||
""" | ||
|
||
if isinstance(acronym, numbers.Number): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the outer if can be safely removed since you are just checking if it is nan here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, numpy doesn't like running np.isnan
on non-numbers
>>> import numpy as np
>>> np.isnan('abcd')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
>>>
new_acronym = set() | ||
|
||
for el in acronym: | ||
if isinstance(el, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this part can recursively call strip_substructure_acronym
and add the results to a list if you feel comfortable writing that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch; done.
@@ -526,3 +529,38 @@ def _get_session_duration_from_behavior_session_ids( | |||
index=pd.Int64Index([x['behavior_session_id'] for x in durations], | |||
name='behavior_session_id')) | |||
return durations | |||
|
|||
|
|||
def strip_substructure_acronym_df( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why this function is needed. Can you not just do
df[col_name] = df[col_name].apply(strip_substructure_acronym)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can; I just don't know that much pandas. I've made the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that apply
is not an inplace operation. ie
units_table['structure_acronym'].apply(strip_substructure_acronym)
won't have any effect. It needs to be assigned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is why I hate pandas ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just running apply results in a series....
>>> import pandas as pd
>>> data = [{'a': 1, 'b': 2}, {'a': 2, 'b': 3}]
>>> df = pd.DataFrame(data=data)
>>> def fn(x):
... return x**2
...
>>>
>>> new_df = df['a'].apply(fn)
>>> new_df
0 1
1 4
Name: a, dtype: int64
I'm just going to go back to my "implement a function to munge the structure acronym column." It was tested.
I really do not like pandas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to assign the output to a column, not to the dataframe, i.e
units_table['structure_acronym'] = units_table['structure_acronym'].apply(strip_substructure_acronym)
603c881
to
7c5a68f
Compare
7c5a68f
to
1a7f9ca
Compare
This PR renames some fields in the NWB and metadata tables to bring the two datastreams into alignment.
It also adds the "strip-substructure" functionality to the "structure_acronym" column in the metadata tables.