Error when not discretizing MIMIC-III time-series data - TypeError: bad operand type for unary ~: 'float' #11

Open
MattHodgman opened this issue Oct 26, 2022 · 2 comments

MattHodgman commented Oct 26, 2022

I am running FIDDLE on data extracted from MIMIC-III using the pipeline outlined in FIDDLE-experiments. I have my population of ICU stays and am running FIDDLE with these parameters:

--T=240.0
--dt=1.0
--theta_1=0.003
--theta_2=0.003
--theta_freq=1
--stats_functions 'mean'

and other default ones found in run_make_all.sh.

I get the following error:

Traceback (most recent call last):  
  File "/home/hodgman/miniconda3/envs/FIDDLE-env/lib/python3.7/runpy.py", line 193, in _run_module_as_main  
    "__main__", mod_spec)  
  File "/home/hodgman/miniconda3/envs/FIDDLE-env/lib/python3.7/runpy.py", line 85, in _run_code  
    exec(code, run_globals)  
  File "/home/hodgman/FIDDLE-experiments/FIDDLE/FIDDLE/run.py", line 141, in <module>  
    main()  
  File "/home/hodgman/FIDDLE-experiments/FIDDLE/FIDDLE/run.py", line 138, in main  
    X, X_feature_names, X_feature_aliases = FIDDLE_steps.process_time_dependent(df_time_series, args)  
  File "/home/hodgman/FIDDLE-experiments/FIDDLE/FIDDLE/steps.py", line 244, in process_time_dependent  
    X_all, X_all_feature_names, X_discretization_bins = map_time_series_features(df_time_series, dtypes_time_series, args)  
  File "/home/hodgman/FIDDLE-experiments/FIDDLE/FIDDLE/steps.py", line 604, in map_time_series_features  
    df.loc[~numeric_mask, col] = np.nan  
  File "/home/hodgman/miniconda3/envs/FIDDLE-env/lib/python3.7/site-packages/pandas/core/generic.py", line 1532, in __invert__  
    new_data = self._mgr.apply(operator.invert)  
  File "/home/hodgman/miniconda3/envs/FIDDLE-env/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 325, in apply  
    applied = b.apply(f, **kwargs)  
  File "/home/hodgman/miniconda3/envs/FIDDLE-env/lib/python3.7/site-packages/pandas/core/internals/blocks.py", line 381, in apply  
    result = func(self.values, **kwargs)  
TypeError: bad operand type for unary ~: 'float'

Do you know what could be causing this error? I was able to determine that it first occurs in column 225958, where numeric_mask contains at least one NaN value. That should mean column 225958 contains None values; however, in my input_data.p file there are no None or NaN variable_values for variable_name == '225958'.
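
For reference, here is a minimal sketch of the check described above, run on a small synthetic long-format table instead of the real input_data.p. The column names ID and t are assumptions about the input layout, the variable "HR" is hypothetical, and all values are made up. The sketch also shows one general way NaN can appear downstream even when the long table has none: reshaping to a wide table leaves empty cells for (ID, t) pairs where a variable was not recorded. Whether this is what happens inside FIDDLE before line 604 is an assumption, not something confirmed in this thread.

```python
import pandas as pd

# Synthetic long-format table (made-up values, assumed column layout).
df = pd.DataFrame({
    "ID": [1, 1, 2],
    "t": [0.0, 1.0, 0.0],
    "variable_name": ["225958", "HR", "225958"],
    "variable_value": ["3", 80, "Regular"],
})

# Check from the issue: count missing values for one variable in the long table.
values = df.loc[df["variable_name"] == "225958", "variable_value"]
n_missing_long = int(values.isna().sum())

# Reshape to one row per (ID, t) and one column per variable.
wide = (
    df.set_index(["ID", "t", "variable_name"])["variable_value"]
      .unstack("variable_name")
)

# The wide column has a gap at (ID=1, t=1.0), where only "HR" was recorded.
n_missing_wide = int(wide["225958"].isna().sum())

print(n_missing_long, n_missing_wide)
```

If the wide column does contain gaps like this, any mask derived from it may need to account for them explicitly.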

@shengpu-tang (Member)

Hello, the numeric_mask is generated from the is_numeric function in helpers.py:
https://github.com/MLD3/FIDDLE/blob/master/FIDDLE/helpers.py#L191
on this line:
https://github.com/MLD3/FIDDLE/blob/master/FIDDLE/steps.py#L601

I agree with your logic, so it is indeed surprising that numeric_mask contains NaN when input_data.p contains no None/NaN values. Perhaps you could build a small example with and without NaNs and apply the is_numeric function to that column?
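
A rough sketch of such a test follows. Note that is_numeric_standin is a hypothetical stand-in written for this example, not the is_numeric from helpers.py, so the real function may classify values differently. The point is only to compare the dtype of the resulting mask for a column with and without missing values.

```python
import numpy as np
import pandas as pd

def is_numeric_standin(v):
    """Hypothetical stand-in (NOT FIDDLE's is_numeric): True if float(v) succeeds."""
    try:
        float(v)
        return True
    except (TypeError, ValueError):
        return False

# One column without missing values, one with NaN and None mixed in.
clean = pd.Series(["1.5", "2", "Regular"], dtype=object)
with_nan = pd.Series(["1.5", np.nan, None, "Regular"], dtype=object)

mask_clean = clean.apply(is_numeric_standin)
mask_nan = with_nan.apply(is_numeric_standin)

# Inspect the dtypes and whether either mask itself contains NaN.
print(mask_clean.dtype, mask_nan.dtype)
print(mask_clean.isna().any(), mask_nan.isna().any())
```

In this sketch the function always returns a plain bool, so the mask stays boolean in both cases. With a function like this one, a NaN in the mask would have to be introduced somewhere other than the element-wise apply.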

MattHodgman commented Oct 27, 2022

is_numeric works when I extract the 225958 feature column from input_data.p to col_data and run

numeric_mask = col_data.apply(is_numeric)

The resulting numeric_mask contains only True and False values. When I switch one of these booleans to np.nan or a float, I can reproduce the error. I'm going to see if I can extract the ts_mixed dataframe from https://github.com/MLD3/FIDDLE/blob/master/FIDDLE/steps.py#L594 and look at feature 225958.
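
A minimal sketch of that reproduction, on a made-up three-element mask: once a mask holds a NaN it is stored with object dtype, and `~` is then applied to each element, which fails on the float. The sketch also shows one way to locate the offending entries and one possible defensive coercion. Treating NaN as "not numeric" is an assumption made for illustration, not FIDDLE's documented behaviour.

```python
import numpy as np
import pandas as pd

# A mask in which one boolean has been replaced by NaN (object dtype).
mask = pd.Series([np.nan, False, True], dtype=object)

# Inverting it fails because ~ is applied to the float NaN.
try:
    ~mask
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Locate the entries that are neither True nor False.
non_bool = mask[~mask.isin([True, False])]
print(non_bool.index.tolist())

# Possible workaround (assumption: NaN counts as "not numeric"):
# compare against True so the result is a proper boolean Series.
safe_mask = mask.eq(True)
inverted = ~safe_mask
print(inverted.tolist())
```

Running the same `isin` check on the mask built from ts_mixed should show which rows of feature 225958 carry the non-boolean values.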
