
Analyzing IoT Data in Python Chapter 3

The document discusses analyzing IoT data from multiple sources in Python. It demonstrates how to combine temperature and sunlight datasets, rename columns, resample the data to hourly intervals, fill in missing values, and analyze correlations. Detection of outliers using standard deviations is shown. Seasonal decomposition is performed to identify trends and seasonal patterns in the temperature data. Autocorrelation and combined plots are used to further analyze the time series components.


Combining data sources for further analysis
Analyzing IoT Data in Python

Matthias Voppichler
IT Developer
Combining data sources
print(temp.head())

                     value
timestamp
2018-10-03 08:00:00   16.3
2018-10-03 09:00:00   17.7
2018-10-03 10:00:00   20.2
2018-10-03 11:00:00   20.9
2018-10-03 12:00:00   21.8

print(sun.head())

                      value
timestamp
2018-10-03 08:00:00  1798.7
2018-10-03 08:30:00  1799.9
2018-10-03 09:00:00  1798.1
2018-10-03 09:30:00  1797.7
2018-10-03 10:00:00  1798.0



Naming columns
temp.columns = ["temperature"]
sun.columns = ["sunshine"]

print(temp.head(2))
print(sun.head(2))

temperature
timestamp
2018-10-03 08:00:00 16.3
2018-10-03 09:00:00 17.7
sunshine
timestamp
2018-10-03 08:00:00 1798.7
2018-10-03 08:30:00 1799.9



Concat
environ = pd.concat([temp, sun], axis=1)

print(environ.head())

temperature sunshine
timestamp
2018-10-03 08:00:00 16.3 1798.7
2018-10-03 08:30:00 NaN 1799.9
2018-10-03 09:00:00 17.7 1798.1
2018-10-03 09:30:00 NaN 1797.7
2018-10-03 10:00:00 20.2 1798.0
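
The NaN rows appear because `pd.concat` with `axis=1` aligns both frames on their index, and the hourly temperature data has no reading at the half-hour timestamps. A minimal sketch of that behavior, using a few values from the slides (the shortened frames are an assumption made for illustration):

```python
import pandas as pd

# Hourly temperature readings (first two values from the slide).
temp = pd.DataFrame(
    {"temperature": [16.3, 17.7]},
    index=pd.to_datetime(["2018-10-03 08:00", "2018-10-03 09:00"]),
)

# Half-hourly sunshine readings (first three values from the slide).
sun = pd.DataFrame(
    {"sunshine": [1798.7, 1799.9, 1798.1]},
    index=pd.to_datetime(
        ["2018-10-03 08:00", "2018-10-03 08:30", "2018-10-03 09:00"]
    ),
)

# axis=1 places the frames side by side and aligns rows on the index.
# Timestamps present in only one frame get NaN in the other column.
environ = pd.concat([temp, sun], axis=1)
print(environ)
```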



Resample
agg_dict = {"temperature": "max", "sunshine": "sum"}

env1h = environ.resample("1h").agg(agg_dict)
print(env1h.head())

temperature sunshine
timestamp
2018-10-03 08:00:00 16.3 3598.6
2018-10-03 09:00:00 17.7 3595.8
2018-10-03 10:00:00 20.2 3596.2
2018-10-03 11:00:00 20.9 3594.1
2018-10-03 12:00:00 21.8 3599.9
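
The aggregation dictionary applies a different function to each column: the hourly temperature is the maximum reading in that hour, and the hourly sunshine is the sum of the two half-hour readings. A small self-contained sketch, assuming a four-row excerpt of `environ`:

```python
import pandas as pd

# Four half-hourly rows, as produced by the concat step.
index = pd.date_range("2018-10-03 08:00", periods=4, freq="30min")
environ = pd.DataFrame(
    {
        "temperature": [16.3, None, 17.7, None],
        "sunshine": [1798.7, 1799.9, 1798.1, 1797.7],
    },
    index=index,
)

# One aggregation per column: NaN values are skipped by both max and sum.
agg_dict = {"temperature": "max", "sunshine": "sum"}
env1h = environ.resample("1h").agg(agg_dict)
print(env1h)
```

The first hourly sunshine value is 1798.7 + 1799.9 = 3598.6, which matches the slide output.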



Fillna
# Forward fill; fillna(method="ffill") is deprecated in recent pandas releases
env30min = environ.ffill()
print(env30min.head())

temperature sunshine
timestamp
2018-10-03 08:00:00 16.3 1798.7
2018-10-03 08:30:00 16.3 1799.9
2018-10-03 09:00:00 17.7 1798.1
2018-10-03 09:30:00 17.7 1797.7
2018-10-03 10:00:00 20.2 1798.0
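
Forward filling keeps the 30-minute resolution and carries the last known temperature into each gap, instead of aggregating down to one row per hour. A short sketch on the same four-row excerpt (an assumption for illustration), using `DataFrame.ffill()`:

```python
import pandas as pd

index = pd.date_range("2018-10-03 08:00", periods=4, freq="30min")
environ = pd.DataFrame(
    {
        "temperature": [16.3, None, 17.7, None],
        "sunshine": [1798.7, 1799.9, 1798.1, 1797.7],
    },
    index=index,
)

# Each missing value is replaced by the most recent earlier observation.
env30min = environ.ffill()
print(env30min)
```

A leading NaN would stay NaN, because there is no earlier value to carry forward.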



Let's practice!
Correlation

Matthias Voppichler
IT Developer
df.corr()
print(data.corr())

             temperature  humidity  sunshine  light_veh  heavy_veh
temperature     1.000000 -0.734430  0.611041   0.401997   0.408936
humidity       -0.734430  1.000000 -0.637761  -0.313952  -0.318198
sunshine        0.611041 -0.637761  1.000000   0.408854   0.409363
light_veh       0.401997 -0.313952  0.408854   1.000000   0.998473
heavy_veh       0.408936 -0.318198  0.409363   0.998473   1.000000
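
By default, `DataFrame.corr()` computes the Pearson correlation coefficient for every pair of columns. The sketch below computes one coefficient by hand and compares it with the pandas result. The humidity values are made up for illustration and are not taken from the course data.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    {
        "temperature": [16.3, 17.7, 20.2, 20.9, 21.8],
        "humidity": [80.0, 74.0, 65.0, 61.0, 58.0],  # hypothetical values
    }
)

# Pearson r: covariance of the centered columns divided by the
# product of their standard deviations.
x = data["temperature"] - data["temperature"].mean()
y = data["humidity"] - data["humidity"].mean()
r_manual = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())

# The same coefficient read from the correlation matrix.
r_pandas = data.corr().loc["temperature", "humidity"]

print(r_manual, r_pandas)
```

Here humidity falls as temperature rises, so the coefficient is negative, in line with the sign of the temperature/humidity entry in the matrix above.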



heatmap
sns.heatmap(data.corr(), annot=True)



Pairplot
sns.pairplot(data)



Summary
heatmap
Negative correlation

Positive correlation

Correlation close to 1



Let's practice!
Outliers

Matthias Voppichler
IT Developer
Outliers
Reasons why outliers appear in datasets:

Measurement error

Manipulation

Extreme events



Outliers
temp_mean = data["temperature"].mean()
temp_std = data["temperature"].std()

data["mean"] = temp_mean
data["upper_limit"] = temp_mean + (temp_std * 3)
data["lower_limit"] = temp_mean - (temp_std * 3)

print(data.iloc[0]["upper_limit"])
print(data.iloc[0]["mean"])
print(data.iloc[0]["lower_limit"])

29.513933116002725
14.5345
-0.44493311600272456
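
Once the limits are known, the outliers themselves can be selected with a boolean mask. A minimal sketch, assuming a hypothetical series of 29 plausible readings and one faulty spike (the values are invented for illustration):

```python
import pandas as pd

# Hypothetical readings: 29 values between 14 and 16, plus one spike of 60.
values = [14.0, 15.0, 16.0] * 9 + [15.0, 15.0] + [60.0]
data = pd.DataFrame({"temperature": values})

temp_mean = data["temperature"].mean()
temp_std = data["temperature"].std()

# Three standard deviations above and below the mean.
upper_limit = temp_mean + (temp_std * 3)
lower_limit = temp_mean - (temp_std * 3)

# Rows outside the band are treated as outliers.
is_outlier = (data["temperature"] > upper_limit) | (
    data["temperature"] < lower_limit
)
outliers = data[is_outlier]
print(outliers)
```

An extreme value also inflates the mean and standard deviation used to compute the limits, so in very short series a single spike may not cross the three-sigma band.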



Outlier plot
data.plot()



Autocorrelation
from statsmodels.graphics import tsaplots

tsaplots.plot_acf(data['temperature'], lags=50)
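
Autocorrelation measures how strongly a series correlates with a lagged copy of itself. For hourly temperature with a daily cycle, the correlation is typically high at a lag of 24 hours and low or negative at 12 hours. The sketch below illustrates this on a synthetic sine wave (an assumption, not the course data) using `Series.autocorr`. `plot_acf` uses a slightly different estimator, so its values may not match exactly.

```python
import numpy as np
import pandas as pd

# Two weeks of synthetic hourly temperature with a perfect daily cycle.
hours = np.arange(24 * 14)
temperature = pd.Series(15 + 5 * np.sin(2 * np.pi * hours / 24))

# Correlation between the series and itself shifted by the given lag.
acf_24 = temperature.autocorr(lag=24)  # same time of day, one day later
acf_12 = temperature.autocorr(lag=12)  # opposite phase of the daily cycle

print(acf_24, acf_12)
```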


Let's practice!
Seasonality and Trends

Matthias Voppichler
IT Developer
Time series components
Trend

Seasonal

Residual / Noise

series[t] = trend[t] + seasonal[t] + residual[t]

20.2 = 14.9 + 4.39 + 0.91
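
The additive model can be illustrated with a naive decomposition: a centered 24-hour rolling mean as the trend, the average detrended value per hour of day as the seasonal component, and whatever remains as the residual. This is a rough sketch on synthetic data. The series, the window size, and the averaging scheme are assumptions, and a library routine such as `seasonal_decompose` differs in its details.

```python
import numpy as np
import pandas as pd

# One week of synthetic hourly data: slow upward trend + daily cycle + noise.
rng = np.random.default_rng(0)
index = pd.date_range("2018-10-01", periods=24 * 7, freq="h")
t = np.arange(len(index))
series = pd.Series(
    10 + 0.01 * t + 5 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.3, len(t)),
    index=index,
)

# Trend: centered 24-hour rolling mean (NaN at both edges of the series).
trend = series.rolling(window=24, center=True).mean()

# Seasonal: mean of the detrended series for each hour of the day.
detrended = series - trend
seasonal = detrended.groupby(series.index.hour).transform("mean")

# Residual: what is left after removing trend and seasonal parts.
residual = series - trend - seasonal

# series[t] = trend[t] + seasonal[t] + residual[t] wherever the trend exists.
reconstructed = trend + seasonal + residual
```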



Seasonal decompose
import statsmodels.api as sm
# Run seasonal decompose
decomp = sm.tsa.seasonal_decompose(data["temperature"])
print(decomp.seasonal.head())

decomp.plot()

timestamp
2018-10-01 00:00:00 -3.670394
2018-10-01 01:00:00 -3.987451
2018-10-01 02:00:00 -4.372217
2018-10-01 03:00:00 -4.534066
2018-10-01 04:00:00 -4.802165
Freq: H, Name: temperature, dtype: float64


Combined plot
import matplotlib.pyplot as plt

decomp = sm.tsa.seasonal_decompose(data)
# Plot the time series
plt.plot(data["temperature"], label="temperature")

# Plot trend and seasonality
plt.plot(decomp.trend["temperature"], label="trend")
plt.plot(decomp.seasonal["temperature"], label="seasonal")

# Show the labels passed to plt.plot
plt.legend()
plt.show()


Let's practice!