Data analysis part 2 #173

Draft
wants to merge 5 commits into
base: master

anusha-ramdarshan
Collaborator

No description provided.


## Looking at the data

The data will be split into different csv files, split by different data types.According to your setup, there will be up to 5 files:
Collaborator

You're missing a space after the fullstop.

- homie_enum
- homie_color: contains rgb values for the smart lights
- homie_float: contains all metrics stored as floats (temperature)
- homie_integer: contains all metrics stored as integers (humidity %, battery level %)
Collaborator

string is also possible.

Here, we want to focus on the csvs containing floats and integers, as they contain the temperature/humdity data. Useful columns:
Collaborator

CSV files


- time: since epoch (unix epoch 1970). pandas handles this for us.
device_id
Collaborator

I guess this should also be a list entry.

- node_type: =="Mijia sensor" to select only the temperature/humidity sensor data
- node_name: nickname for the sensor (e.g., "living room")
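The columns listed above could be used roughly as follows. This is a minimal sketch, not the notebook's actual code: the inline sample data, the `value` column name, the `load_sensor_readings` helper, and the assumption that `time` is in seconds since the epoch are all hypothetical and should be checked against a real export.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the homie_float csv. The "value" column
# name and the epoch unit (seconds) are assumptions, not taken from a real export.
SAMPLE_CSV = """time,device_id,node_type,node_name,value
1620000000,a1,Mijia sensor,living room,21.5
1620000015,a1,Mijia sensor,living room,21.6
1620000020,b2,Smart light,hallway,0.8
"""


def load_sensor_readings(source, time_unit="s"):
    """Load a csv, convert epoch times, and keep only Mijia sensor rows."""
    df = pd.read_csv(source)
    # Convert the epoch column into pandas timestamps.
    df["time"] = pd.to_datetime(df["time"], unit=time_unit)
    # Select only the temperature/humidity sensor data.
    return df[df["node_type"] == "Mijia sensor"].reset_index(drop=True)


readings = load_sensor_readings(io.StringIO(SAMPLE_CSV))
print(readings[["time", "node_name", "value"]])
```

Passing a file path instead of the `io.StringIO` object should work the same way, since `pd.read_csv` accepts both.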

There are between 4 and 10 data points per sensor per minute, depending on how often a sensor gets polled (~ 10K data points in a 24h period for a given sensor)
Collaborator

Depending on the min_update_period_seconds in mijia-homie.toml, really.
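As a rough sanity check on the figures quoted above, the per-minute range can be converted into a per-day range. This is only arithmetic on the numbers in the text; the `points_per_day` helper is hypothetical, and the sketch does not model how `min_update_period_seconds` actually affects the rate.

```python
MINUTES_PER_DAY = 24 * 60


def points_per_day(points_per_minute):
    """Data points collected in 24h at a constant per-minute rate."""
    return points_per_minute * MINUTES_PER_DAY


# Range implied by "between 4 and 10 data points per sensor per minute".
low, high = points_per_day(4), points_per_day(10)
print(low, high)  # 5760 14400

# Average rate implied by "~10K data points in a 24h period".
implied_rate = 10_000 / MINUTES_PER_DAY
print(round(implied_rate, 1))  # 6.9
```

So ~10K points per day sits inside the 4 to 10 per minute range, at roughly 7 points per minute.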

@@ -7,7 +7,9 @@
 "outputs": [],
 "source": [
 "import pandas as pd \n",
-"import plotly.express as px\n"
+"import plotly.express as px\n",
+"from sklearn.preprocessing import StandardScaler\n",
Owner

hrm. I'm getting an error here. Trying to debug now.

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-2-186f7a1512d6> in <module>
      1 import pandas as pd
      2 import plotly.express as px
----> 3 from sklearn.preprocessing import StandardScaler
      4 from sklearn.decomposition import PCA

ModuleNotFoundError: No module named 'sklearn'
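One way to see which of the notebook's top-level imports are missing from the active environment, without triggering the traceback, is to ask `importlib` whether each module can be found. This is a minimal sketch using only the standard library; the `missing_modules` helper is hypothetical, and it only handles top-level module names (not dotted ones like `sklearn.preprocessing`).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The result depends on the environment the kernel is running in.
print(missing_modules(["pandas", "plotly", "sklearn"]))
```

Running this inside the notebook kernel checks the kernel's environment, which may differ from the shell's virtualenv.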

@@ -10,6 +10,7 @@ ipykernel = "^5.5.3"
 pandas = "^1.2.4"
 plotly = "^4.14.3"
 nbstripout = "^0.3.9"
+sklearn = "^0.0"
Owner

https://pypi.org/project/sklearn/ says to use scikit-learn instead.

vscode also decided that it wanted to install notebook when I tried things out on a fresh virtualenv, but I can make a patch for that as a separate PR.
