Turner, Ryan - Python Machine Learning - The Ultimate Beginner's Guide To Learn Python Machine Learning Step by Step Using Scikit-Learn and Tensorflow (2019)
Legal Notice:
This book is copyright protected. This book is only for personal use. You
cannot amend, distribute, sell, use, quote or paraphrase any part, or the
content within this book, without the consent of the author or publisher.
Disclaimer Notice:
Please note the information contained within this document is for educational
and entertainment purposes only. Every effort has been made to present
accurate, up-to-date, reliable, and complete information. No warranties of any
kind are declared or implied. Readers acknowledge that the author is not
engaging in the rendering of legal, financial, medical or professional advice.
The content within this book has been derived from various sources. Please
consult a licensed professional before attempting any techniques outlined in
this book.
The learning process begins with data or observations, such as direct
experience, instruction, or examples, from which patterns are extracted and
then used to make better predictions in the future (Bose, 2019). The primary goal of
machine learning is to allow computers to learn automatically without
intervention by humans and adjust accordingly.
With machine learning, we can analyze large quantities of data. Machine
learning gives us profitable results, but we may need a number of resources
to reach this point. Additional time may be needed to train the machine
learning models.
Classification of Machine Learning Algorithms
In supervised learning, a human provides both the inputs and the desired
outputs and furnishes feedback on the accuracy of the predictions during
training. Once training is complete, the algorithm applies what it has learned
to new data.
The concept of supervised learning can be seen as similar to learning under a
teacher’s supervision in human beings. The teacher gives some examples to
the student, and the student then derives new rules and knowledge from these
examples so as to apply this somewhere else.
It is also good for you to know the difference between the regression
problems and classification problems. In regression problems, the target is a
numeric value, while in classification, the target is a class or a tag. A
regression task can help determine the average cost of all houses in London,
while a classification task will help determine the types of flowers based on
the length of their sepals and petals.
Unsupervised Learning
This type of learning occurs when the algorithm is presented with examples
that lack labels, leaving it to discover structure in the data on its own. A
closely related paradigm, reinforcement learning, goes a step further: the
examples can be accompanied by positive or negative feedback depending on
the solution the algorithm proposes. It is used in applications where the
algorithm has to make decisions and each decision carries a consequence,
much like trial-and-error learning in humans.
Errors become useful in learning when they are associated with a penalty
such as pain, cost, or loss of time. Through this feedback, reinforcement
learning discovers which actions are more likely to succeed than others.
Machine learning processes are similar to those of data mining and predictive
modeling. In both cases, the data must be searched for patterns, and the
actions of the program are adjusted accordingly. A good example of machine
learning is the recommender system: if you purchase an item online, you will
soon see ads related to that item.
What is Deep Learning?
The history of Machine Learning begins with the history of computing. And
this history began before computers were even invented. The function of
computers is actually math, in particular, Boolean logic. So before anyone
could create a computing machine, the theory about how such a machine
could work had to be figured out by mathematicians. It was the growth of this
theory from hundreds of years ago that made possible the revolution in
computer software that we see today. In a way, the present and future of
computing and machine intelligence belong to great minds from our past.
In 1642, Blaise Pascal, then a 19-year-old, created an arithmetic machine
that could add, subtract, multiply, and divide.
In 1679, German mathematician Gottfried Wilhelm Leibniz created a system
of binary code that laid the groundwork for modern computing.
In 1834, English inventor Charles Babbage conceived a mechanical device
that could be programmed with punch cards. While it was never actually
built, its logical structure is one that nearly every modern computer relies on
to function.
In 1842, Ada Lovelace became the world’s first computer programmer. At 27
years of age, she designed an algorithm for solving mathematical problems
using Babbage’s punch-card technology.
In 1847, George Boole created an algebra capable of reducing all logical
values to true or false. This Boolean logic is what CPUs use to function and
make decisions.
In 1936, Alan Turing discussed a theory describing how a machine could
analyze and execute a series of instructions. His proof was published and is
considered the base of modern computer science.
In 1943, a neurophysiologist, Warren McCulloch, and a mathematician,
Walter Pitts, co-wrote a paper theorizing how neurons in the human brain
might function. Then they modeled their theory by building a simple neural
network with electrical circuits.
In 1950, Alan Turing proposed the notion of a “learning machine.” This
machine could learn from the world like human beings do and eventually
become artificially intelligent. He theorized about intelligence and described
what was to become known as the “Turing Test” of machine intelligence.
Intelligence, Turing mused, was not well defined, but we humans seem to
have it and to recognize it in others when we see it in use. Thus, if we
interact with a computer and cannot tell that it is a computer, then we could
consider it intelligent too.
In 1951, Marvin Minsky, with the help of Dean Edmonds, created the first
artificial neural network. It was called SNARC (Stochastic Neural Analog
Reinforcement Calculator).
In 1952, Arthur Samuel began working on some of the first machine learning
programs at IBM's Poughkeepsie Laboratory. Samuel was a pioneer in the
fields of artificial intelligence and computer gaming. One of the first things
he built was a program that could play checkers. More importantly, this
checkers-playing program could learn and improve its game.
Machine Learning was developed as part of the quest in the development of
Artificial Intelligence. The goal of Machine Learning was to have machines
learn from data. But despite its early start, Machine Learning was largely
abandoned in the development of Artificial Intelligence. Like work on the
perceptron, progress in Machine Learning lagged as Artificial Intelligence
moved to the study of expert systems.
Eventually, this focus on a logical, knowledge-based approach to Artificial
Intelligence caused a split between the disciplines. Machine Learning systems
suffered from practical and theoretical problems in representation and
acquiring large data sets to work with. Expert systems came to dominate by
1980, while statistical and probabilistic systems like Machine Learning fell
out of favor. Early neural network research was also abandoned by Artificial
Intelligence researchers and became its own field of study.
Machine Learning became its own discipline, mostly considered outside the
purview of Artificial Intelligence, until the 1990s. Practically, all of the
progress in Machine Learning from the 1960s through to the 1990s was
theoretical, mostly statistics and probability theory. But while not much
seemed to be accomplished, the theory and algorithms produced in these
decades would prove to be the tools needed to re-energize the discipline. At
this point, in the 1990s, the twin engines of vastly increased computer
processing power and the availability of large datasets brought on a sort of
renaissance for Machine Learning. Its goals shifted from the general notion
of achieving artificial intelligence to a more focused goal of solving real-
world problems, employing methods it would borrow from probability theory
and statistics, ideas generated over the previous few decades. This shift and
the subsequent successes it enjoyed brought the field of Machine Learning
back into the fold of Artificial Intelligence, where it resides today as a sub-
discipline under the Artificial Intelligence umbrella.
However, Machine Learning was, continues to be, and might remain a form
of specific artificial intelligence (SAI): software algorithms able to learn a
single task or a small range of tasks, which cannot be generalized to the
world at large. General artificial intelligence (GAI), by contrast, remains an
elusive goal, one many believe will never be reached. Machine Learning
algorithms, even though they are specific artificial intelligence, are changing
the world and will have an enormous effect on the future.
Chapter 2: Theories of Machine Learning
Machine Learning is a discipline that continues to grow into its own
understanding. After all, Machine Learning is attempting to model the
capacity of the most complex device in the known universe, the human brain.
As the history of artificial intelligence demonstrates, we can see there is no
“best” approach to Machine Learning. The “best” approach is only the best
approach for now. Tomorrow, a new insight might surpass everything that’s
been accomplished to date, and pave a new path for artificial intelligence to
follow.
How have collections of computer code and mathematics managed to learn
thus far? Here are the three most common methods currently in use to get
computer algorithms to learn.
Supervised and Semi-supervised Learning
Algorithms
Before getting into the practical part of machine learning and deep learning,
we need to install our two libraries, that is, Scikit-Learn and TensorFlow.
Installing TensorFlow
TensorFlow comes with APIs for programming languages like C++, Haskell,
Go, Java, and Rust, and there is a third-party package for R known as
tensorflow. On Windows, TensorFlow can be installed with pip or
Anaconda.
The native pip approach installs TensorFlow on your system without going
through a virtual environment. Note, however, that installing TensorFlow
with pip may interfere with other Python installations on your system. The
good thing is that you only have to run a single command, and TensorFlow
will be installed. Also, when TensorFlow is installed via pip, you can run
TensorFlow programs from any directory you want.
To install TensorFlow with Anaconda, you may have to create a virtual
environment. However, within Anaconda itself, it is recommended that
you install TensorFlow via the pip install command rather than the conda
install command.
Ensure that you have installed Python 3.5 or above on your Windows
system. Python 3 comes with a pip3 program that can be used to install
packages, which means we should use the pip3 install command for
installation purposes. The following command will install the CPU-only
version of TensorFlow:
pip3 install --upgrade tensorflow
The command should be run from the command line.
If you need to install a GPU version for TensorFlow, run this command:
pip3 install --upgrade tensorflow-gpu
This will install TensorFlow on your Windows system.
You can also install TensorFlow with the Anaconda package. Pip comes
installed with Python, but Anaconda doesn't. This means that to install
TensorFlow with Anaconda, you should first install Anaconda itself. You can
download Anaconda from its website, which also provides installation
instructions.
Once you install Anaconda, you get a package named conda, which is good
for the management of virtual environments and the installation of packages.
To get to use this package, you should start the Anaconda.
On Windows, click Start, choose “All Programs,” expand the “Anaconda
…” folder then click the “Anaconda Prompt.” This should launch the
anaconda prompt on your system. If you need to see the details of the conda
package, just run the following command on the terminal you have just
opened:
conda info
This should return more details regarding the package manager.
There is something unique with Anaconda. It helps us create a virtual Python
environment using the conda package. This virtual environment is simply an
isolated copy of Python with the capability of maintaining its own files,
paths, and directories so that you may be able to work with specific versions
of Python or other libraries without affecting your other Python projects
(Samuel, 2018). Virtual environments provide us with a way of isolating
projects and avoid problems that may arise as a result of version requirements
and different dependencies across various components. Note that this virtual
environment will remain separate from your normal Python environment
meaning that the packages installed in the virtual environment will not affect
the ones you have in your Python’s normal environment.
We need to create a virtual environment for the TensorFlow package. This
can be done via the conda create command. The command takes the syntax
given below:
conda create -n [environment-name]
In our case, we need to give this environment the name tensorenviron. We
can create it by running the following command:
conda create -n tensorenviron
You will be asked whether to proceed with creating the environment. Type
“y” and hit the Enter key on the keyboard, and the environment will be
created.
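The text moves straight on to using the libraries, but before that you would normally activate the new environment and install the packages inside it. A minimal sketch, assuming a recent version of conda (older conda versions on Windows use activate tensorenviron instead):
conda activate tensorenviron
pip install --upgrade tensorflow
pip install --upgrade scikit-learn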
Now that you are done with the installations, you can begin to use the
libraries. We will begin with the Scikit-Learn library.
To be able to use scikit-learn in your code, you should first import it by
running this statement:
import sklearn
Loading Datasets
Machine learning is all about analyzing sets of data. Before this, we should
first load the dataset into our workspace. The library comes loaded with a
number of datasets that we can load and work with. We will demonstrate this
by using a dataset known as Iris. This is a dataset of flowers.
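As a minimal sketch of what this looks like, using Scikit-Learn's bundled copy of the dataset (the variable name iris here is just illustrative):
from sklearn import datasets
# Load the bundled Iris dataset
iris = datasets.load_iris()
# 150 samples, each with 4 measurements (sepal/petal length and width)
print(iris.data.shape)
# The three flower species the samples belong to
print(iris.target_names)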
Regression
Linear regression
It is the most popular type of predictive analysis. Linear regression addresses
two questions:
1. Do the predictor variables forecast the outcome variable accurately?
2. Which particular variables are the key predictors of the outcome
variable, and to what degree do they impact it?
Naming variables
The regression’s dependent variable has many different names, including
outcome variable and criterion variable. The independent variable can be
called an exogenous variable or a regressor.
Functions of regression analysis
1. Forecasting trends
2. Determining the strength of predictors
3. Predicting an effect
Breaking down regression
There are two basic forms of regression: linear regression and multiple
regression. Although there are different methods for more complex data and
analysis, linear regression uses a single independent variable to forecast the
outcome of a dependent variable, while multiple regression uses two or more
independent variables to predict a result.
Regression is very useful to financial and investment institutions because it
can be used to predict the sales of a particular product or company based on
previous sales, GDP growth, and many other factors. The capital asset
pricing model (CAPM) is one of the most common regression models applied
in finance. The example below describes the formulas used in linear and
multiple regression.
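The formulas themselves do not appear here; in the standard notation, with Y as the dependent variable, the X's as independent variables, a as the intercept, the b's as coefficients, and u as the error term, they are usually written as:
Simple linear regression: Y = a + bX + u
Multiple regression: Y = a + b1X1 + b2X2 + ... + bnXn + u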
The KNN algorithm is highly used for building more complex classifiers. It is
a simple algorithm, but it has outperformed many powerful classifiers. That is
why it is used in numerous applications of data compression, economic
forecasting, and genetics.
KNN is a supervised learning algorithm, which means that we are given a
labeled dataset made up of training observations (x, y), and our goal is to
determine the relationship between x and y (Karbhari, 2019). This means
that we should find a function that maps x to y such that, when we are given
an input value for x, we are able to predict the corresponding value for y.
The concept behind the KNN algorithm is very simple. It calculates the
distance of the new data point to all the other training data points. The
distance can be of various types including Manhattan, Euclidean, etc. The K-
nearest data points are chosen, where K can be any integer. Finally, the
data point is assigned to the class to which most of the K data points belong.
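To make the idea concrete, here is a bare-bones sketch of this procedure written from scratch. This is only an illustration of the concept, not the Scikit-Learn implementation used below; it assumes Euclidean distance and a plain Python list of labeled points:
import math
from collections import Counter

def knn_predict(training_points, new_point, k=3):
    """Classify new_point by majority vote among its k nearest training points.
    training_points is a list of (features, label) pairs."""
    distances = []
    for features, label in training_points:
        # Euclidean distance between the stored point and the new point
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(features, new_point)))
        distances.append((dist, label))
    # Sort by distance and keep the labels of the K closest points
    distances.sort(key=lambda pair: pair[0])
    k_labels = [label for _, label in distances[:k]]
    # Assign the class that most of the K neighbors belong to
    return Counter(k_labels).most_common(1)[0][0]

# Example: two tiny clusters of 2-D points
points = [((1, 1), 'A'), ((1, 2), 'A'), ((8, 8), 'B'), ((9, 8), 'B')]
print(knn_predict(points, (2, 1), k=3))  # prints 'A'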
We will use the Iris dataset, which we explored previously, to demonstrate
how to implement the KNN algorithm. This dataset is made up of four
attributes, namely sepal-width, sepal-length, petal-width, and petal-length.
Each type of Iris plant has characteristic values for these attributes. Our goal
is to predict the class to which a plant belongs. The dataset has three
classes: Iris-setosa, Iris-versicolor, and Iris-virginica.
We now need to load the dataset into our working environment. Download it
from the following URL:
https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data
Store it in a Pandas data frame. This following script will help us achieve
this:
# pandas is needed for reading the data into a dataframe
import pandas as pd

# create a variable for the dataset url
iris_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

# Assign column names to the dataset
iris_names = ['Slength', 'Swidth', 'Plength', 'Pwidth', 'Class']

# Load the dataset from the url into a pandas dataframe
dataset = pd.read_csv(iris_url, names=iris_names)
We can have a view of the first few rows of the dataset:
print(dataset.head())
The S is for Sepal while P is for Petal. For example, Slength represents Sepal
length while Plength represents Petal length.
import sys
import pandas as pd

# Workaround used in this book for environments where print() output is not displayed
sys.__stdout__ = sys.stdout

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

# Assign column names to our dataset
names = ['Slength', 'Swidth', 'Plength', 'Pwidth', 'Class']

# Read the dataset into a pandas dataframe
dataset = pd.read_csv(url, names=names)
As usual, we should divide the dataset into attributes and labels:
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values
The variable X will hold the first four columns of the dataset which are the
attributes while the variable y will hold the labels.
Splitting the Dataset
We need to be able to tell how well our algorithm performed. This will be
done during the testing phase. This means that we should have training and
testing data. We need to split the data into two such parts. 80% of the data
will be used as the training set while 20% will be used as the test set.
Let us first import the train_test_split method from Scikit-Learn:
from sklearn.model_selection import train_test_split
We can then split the two as follows:
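The split call itself does not appear here; a minimal sketch consistent with the 80/20 ratio described above would be:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)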
Feature Scaling
Before we can make the actual predictions, it is a good idea for us to scale the
features. After that, all the features will be evaluated uniformly. Scikit-Learn
comes with a class named StandardScaler, which can help us perform the
feature scaling. Let us first import this class:
from sklearn.preprocessing import StandardScaler
We then instantiate the class, and use it to fit a model based on it:
feature_scaler = StandardScaler()
feature_scaler.fit(X_train)
X_train = feature_scaler.transform(X_train)
X_test = feature_scaler.transform(X_test)
The instance was given the name feature_scaler.
Training the Algorithm
With the Scikit-Learn library, it is easy for us to train the KNN algorithm. Let
us first import the KNeighborsClassifier from the Scikit-Learn library:
from sklearn.neighbors import KNeighborsClassifier
The following code will help us train the algorithm:
knn_classifier = KNeighborsClassifier(n_neighbors=5)
knn_classifier.fit(X_train, y_train)
Note that we have created an instance of the KNeighborsClassifier class and
named it knn_classifier. We have used one parameter in the instantiation,
n_neighbors, and given it a value of 5, which denotes the value of K. Note
that there is no single best value for K; it is chosen after testing and
evaluation. However, 5 is a popular starting value in many KNN
applications.
We can then use the test data to make predictions. This can be done by
running the script given below:
pred_y = knn_classifier.predict(X_test)
Evaluating the Accuracy of the Algorithm
Evaluation of the KNN algorithm is not done in the same way as evaluation
of the linear regression algorithm, where we used metrics like RMSE and
MAE. In this case, we will use metrics like the confusion matrix, precision,
recall, and the F1 score.
We can use the classification_report and confusion_matrix methods to
calculate these metrics. Let us first import these from the Scikit-Learn library:
from sklearn.metrics import confusion_matrix, classification_report
Run the following script:
print(confusion_matrix(y_test, pred_y))
print(classification_report(y_test, pred_y))
The results given above show that the KNN algorithm did a good job in
classifying the 30 records that we have in the test dataset. The results show
that the average accuracy of the algorithm on the dataset was about 90%. This
is not a bad percentage.
Comparing K Value with the Error Rate
We earlier said that there is no specific value of K that can be said to give the
best results on the first go. We chose 5 because it is the most popular value
used for K. The best way to find the best value of K is by plotting a graph of
K value and the corresponding error for the dataset.
Let us create a plot using the mean error for predicted values of the test set
for all the K values that range between 1 and 40. We should begin by
calculating the mean error for the predicted values with K ranging between
1 and 40. Just run the script given below:
import numpy as np  # np.mean is used below

error = []

# K values range between 1 and 40
for x in range(1, 40):
    knn = KNeighborsClassifier(n_neighbors=x)
    knn.fit(X_train, y_train)
    pred_x = knn.predict(X_test)
    error.append(np.mean(pred_x != y_test))
The code will run the loop from 1 to 40. In every iteration, the mean error
for the predicted values of the test set is calculated, and the result is
appended to the error list.
We should now plot the values of error against the values of K. The plot can
be created by running the script given below:
plt.figure(figsize=(12, 6))
plt.plot(range(1, 40), error, color='blue', linestyle='dashed', marker='o',
markerfacecolor='blue', markersize=10)
plt.title('Error Rate for K')
plt.xlabel('K Values')
plt.ylabel('Mean Error')
plt.show()
The graph shows that we will get a Mean Error of 0 when we use values of K
between 1 and 17. It will then be good for you to play around with the value
of K and see its impact on the accuracy of the predictions.
Chapter 7: K-Means Clustering
Now that we have the data, we can create a plot and see how the data points
are distributed. We will then be able to tell whether there are any clusters at
the moment:
plt.show()
The code gives the following plot:
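The code that creates the data and draws the scatter plot is not reproduced above. A minimal sketch is given below; the data values are illustrative ones chosen to be consistent with the centroid coordinates reported later in this chapter:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans  # used in the next section

# Ten two-dimensional points: five near the bottom left, five near the top right
X = np.array([[5, 3], [10, 15], [15, 12], [24, 10], [30, 45],
              [85, 70], [71, 80], [60, 78], [55, 52], [80, 91]])

# Plot the raw points before any clustering
plt.scatter(X[:, 0], X[:, 1])
plt.show()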
If we use our eyes, we will probably make two clusters from the above data,
one at the bottom with five points and another one at the top with five points.
We now need to investigate whether this is what the K-Means clustering
algorithm will do.
Creating Clusters
We have seen that we can form two clusters from the data points; hence the
value of K is now 2. These two clusters can be created by running the
following code:
kmeans_clusters = KMeans(n_clusters=2)
kmeans_clusters.fit(X)
We have created an object named kmeans_clusters, and 2 has been used as
the value for the parameter n_clusters. We have then called the fit() method
on this object and passed our numpy array of data as the parameter to the
method.
We can now have a look at the centroid values that the algorithm has created
for the final clusters:
print (kmeans_clusters.cluster_centers_)
This returns the following:
The first row above gives us the coordinates of the first centroid, which is
(16.8, 17). The second row gives us the coordinates of the second centroid,
which is (70.2, 74.2). If you followed the manual process of calculating
these values, they should be the same. This is an indication that the
K-Means algorithm worked well.
The following script will help us see the data point labels:
print(kmeans_clusters.labels_)
This returns the following:
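The plotting script described in the next paragraph is not reproduced here; based on that description, it would look something like this:
plt.scatter(X[:, 0], X[:, 1], c=kmeans_clusters.labels_, cmap='rainbow')
plt.show()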
We have simply plotted the first column of the array named X against the
second column. At the same time, we have passed kmeans_clusters.labels_
as the value for the parameter c, which corresponds to the labels. Note the
use of the parameter cmap='rainbow', which helps us choose the color
scheme for the different data points.
As you expected, the first five points have been clustered together at the
bottom left and assigned a similar color. The remaining five points have been
clustered together at the top right and assigned one unique color.
We can choose to plot the points together with the centroid coordinates for
every cluster to see how the positioning of the centroids relates to the
clustering. Let us plot the centroids in black so we can see where they sit
relative to the data points. The following script will help you create the plot:
plt.scatter(X[:, 0], X[:, 1], c=kmeans_clusters.labels_, cmap='rainbow')
plt.scatter(kmeans_clusters.cluster_centers_[:, 0],
            kmeans_clusters.cluster_centers_[:, 1], color='black')
plt.show()
The script returns the following plot:
SVMs fall under the category of supervised machine learning algorithms and
are widely applied to classification and regression problems. SVM is known
for its ability to handle nonlinear input spaces. It is used in applications
like intrusion detection, face detection, classification of news articles, emails,
and web pages, handwriting recognition, and gene classification.
The algorithm works by segregating the data points in the best way possible.
The distance between the hyperplane and the nearest data points is referred to
as the margin. The goal is to choose a hyperplane with the maximum possible
margin between the support vectors in a given dataset.
To best understand how this algorithm works, let us first implement it in
Scikit-Learn library. Our goal is to predict whether a bank currency note is
fake or authentic. We will use the attributes of the note including variance of
the image, the skewness of the wavelet transformed image, the kurtosis of the
image, and entropy of the image. Since this is a binary classification
problem, let us use the SVM classification algorithm.
If we have linearly-separable data with two dimensions, the goal of a
typical machine learning algorithm is to identify a boundary that divides
the data so as to minimize the misclassification error. In most cases, one
gets several lines that all correctly classify the data.
SVM is different from the other classification algorithms in the way it selects
the decision boundary maximizing the distance from the nearest data points
for all classes. The goal of SVM is not to find the decision boundary only, but
to find the most optimal decision boundary.
The optimal decision boundary is the one with the maximum margin from
the nearest points of all classes. These nearest points, which determine the
margin, are known as support vectors. In the case of support vector
machines, the decision boundary is known as the maximum margin classifier
or maximum margin hyperplane.
Complex mathematics is involved in calculating the support vectors,
determining the margin between the decision boundary and the support
vectors, and maximizing that margin.
Let us begin by importing the necessary libraries:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Importing the Dataset
We will use the read_csv method provided by the Pandas library to read the
data and import it into our workspace (Agile Actors, 2018). This can be done
as follows:
dataset = pd.read_csv("bank_note.csv")
Let us call the shape method to print the shape of the data for us:
print(dataset.shape)
This returns the following:
This shows that there are 1372 rows and 5 columns in the dataset. Let us
print the first 5 rows of the dataset:
print(dataset.head())
Again, this may return an error because of the lack of output information.
Let us solve this using Python's sys library. You should now have the
following code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sys

sys.__stdout__ = sys.stdout
dataset = pd.read_csv("bank_note.csv")
print(dataset.head())
The code returns the following output:
All attributes of the data are numeric as shown above. Even the last attribute
is numeric as its values are either 0 or 1.
Preprocessing the Data
It is now time to subdivide the above data into attributes and labels as well as
training and test sets. The following code will help us subdivide the data into
attributes and labels:
X = dataset.drop('Class', axis=1)
y = dataset['Class']
The first line above helps us store all the columns of the dataset into variable
X, except the class column. The drop() function has helped us exclude the
Class column from this. The second line has then helped us store the Class
column into variable y. The variable X now has attributes while the variable y
now has the corresponding labels.
We have achieved the goal of dividing the dataset into attributes and labels.
The next step is to divide the dataset into training and test sets. Scikit-learn
has a library known as model_selection which provides us with a method
named train_test_split that we can use to divide the data into training and
test sets.
First, let us import the train_test_split method:
from sklearn.model_selection import train_test_split
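The call itself is not shown at this point; a minimal sketch, using a 20% test split as in the earlier chapters, would be:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)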
Training the Algorithm
Now that the data has been split into training and test sets, we should now
train the SVM on the training set. Scikit-Learn comes with a library known as
SVM which has built-in classes for various SVM algorithms.
In this case, we will be doing a classification task. Hence, we will use the
support vector classifier class (SVC). The SVC class takes a single parameter,
the kernel type. For a simple SVM, this parameter should be set to “linear”
since simple SVMs can only classify data that is linearly separable.
We will call the fit method of SVC to train the algorithm on our training set.
The training set should be passed as a parameter to the fit method. Let us first
import the SVC class from Scikit-Learn:
from sklearn.svm import SVC
Now run the following code:
svc_classifier = SVC(kernel='linear')
svc_classifier.fit(X_train, y_train)
Making Predictions
We can now use the trained classifier to make predictions. Note that the
predictions will be made on the test data. Here is the code for making
predictions:
pred_y = svc_classifier.predict(X_test)
Evaluating the Accuracy of the Algorithm
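The evaluation script is not reproduced here; following the same approach used for the KNN classifier earlier, it would look something like this:
from sklearn.metrics import confusion_matrix, classification_report

print(confusion_matrix(y_test, pred_y))
print(classification_report(y_test, pred_y))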
The output given above shows that the algorithm did a good job; an average
of 99% for the above metrics is not bad.
Let us give another example of how to implement SVM in Scikit-Learn using
the Iris dataset. We have already loaded the Iris dataset, which describes
flowers in terms of sepal and petal measurements, that is, width and length.
We can now learn from the data and then make a prediction for unknown
data. This calls for us to create an estimator and then call its fit method.
This is demonstrated in the script given below:
from sklearn import svm
from sklearn import datasets

# Loading the dataset
iris = datasets.load_iris()

# Create an estimator for the support vector classifier
clf = svm.LinearSVC()

# Learn from the dataset
clf.fit(iris.data, iris.target)

# Predict unseen data
print(clf.predict([[6.2, 4.2, 3.5, 0.35]]))

# Inspect the fitted model parameters (attributes ending with an underscore)
print(clf.coef_)
The code will return the following output:
We now have the predicted values for our data. Note that we imported both
datasets and svm from the scikit-learn library. After loading the dataset, a
model was fitted by learning patterns from the data. This was done by
calling the fit() method. Note that LinearSVC() creates an estimator for the
support vector classifier, on which the model is built. We then passed in
new data for which we need to make a prediction.
Chapter 9: Machine Learning and Neural Networks
Neural networks were first developed in the late 1950s as a means to build
learning systems that were modeled on our understanding of how the human
brain functions. Despite their name, however, there is little resemblance
between a neural network and a human brain. Mostly, the name serves as a
metaphor and as inspiration. Each “neuron” in a Neural Network consists
of a “node,” a piece of serial processing code designed to iterate over a
problem until it comes up with a solution, at which point the result is passed
on to the next neuron in the layer, or to the next layer if the current layer’s processing
is complete. In contrast, the human brain is capable of true parallel
processing, by nature of the complex interconnections of its neurons and the
fact its processing specialties are located in different areas of the brain
(vision, hearing, smell, etc.), all of which can process signals simultaneously.
Neural networks are a sub-set of Machine Learning. They consist of software
systems that mimic some aspects of human neurons. Neural Networks pass
data through interconnected nodes. The data is analyzed, classified, and then
passed on to the next node, where further classification and categorization
may take place. The first layer of any Neural Network is the input layer,
while the last is the output layer (these can be the same layer, as they were in
the first, most primitive neural networks). Between these neural layers is any
number of hidden layers that do the work of dealing with the data presented
by the input layer. Classical Neural Networks usually contain two to three
layers. When a Neural Network consists of more layers, it is referred to as
Deep Learning. Such Deep Learning systems can have dozens or even
hundreds of layers.
Feedforward Neural Networks
The first and simplest form of Neural Networks is called feedforward. As the
name implies, data flows through a feedforward network in one direction,
from the input, through the node layers containing the neurons and exits
through the output layer. Unlike more modern neural networks, feedforward
networks do not cycle or iterate over their data. They perform a single
operation on the input data, and they provide their solution in an output
stream.
Single-layer perceptron
This is the simplest form of a feedforward neural network with one layer of
nodes. The input is sent to each node already weighted, and depending on
how the node calculates the input and its weight, a minimum threshold may
or may not be met, and the neuron either fires (taking the “activated” value)
or does not fire (taking the “deactivated” value).
Multi-layer perceptron
Multi-layer perceptrons, as the name suggests, consist of two or more
(sometimes many more) layers, with the output of the upper layer becoming
the input of the lower layer. Because there are many layers, this form of
neural network often takes advantage of backpropagation, where the
produced output is compared with the expected output, and the degree of
error is fed back through the network to adjust the weights on the appropriate
nodes, all with the intention of producing an output closer to the desired
output state. Each error correction is tiny, so often a great number of
iterations are required to achieve a “learned” state. At this point, the neural
network is no longer considered a feedforward system proper. It is algorithms
such as these multi-layer networks with backpropagation that have become
some of the most successful and powerful Machine Learning devices in the
field.
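As a concrete sketch of such a network, a small multi-layer perceptron trained with backpropagation can be written in a few lines using TensorFlow's Keras API; the layer sizes and the training data named here are placeholders, not values taken from this book:
import tensorflow as tf

# A small multi-layer perceptron: two hidden layers between the input and output layers
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(4,)),  # hidden layer 1
    tf.keras.layers.Dense(8, activation='relu'),                     # hidden layer 2
    tf.keras.layers.Dense(3, activation='softmax')                   # output layer (3 classes)
])

# Compiling attaches the loss function and optimizer; fit() then runs backpropagation,
# feeding the output error back through the network to adjust the weights
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# model.fit(X_train, y_train, epochs=50)  # X_train and y_train are assumed to exist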
Recurrent Neural Networks
Big Data is pretty much what it sounds like — the practice of dealing with
large volumes of data. And by large, we are talking about astoundingly huge
amounts of data — gigabytes, terabytes, petabytes of data. A petabyte, to put
this size into perspective, is 10^15 bytes. Written out, that is 1 PB =
1,000,000,000,000,000 bytes. When you consider that a single byte is
equivalent in storage to a single letter like an ‘a’ or ‘x,’ the scale of the data
sets being dealt with by Big Data is truly awe-inspiring. And these sizes are
increasing every day.
The term Big Data comes from the 1990s, although computer scientists have
been dealing with large volumes of data for decades. What sets Big Data
apart from data sets before is the fact the size of data sets began to
overwhelm the ability of traditional data analytics software to deal with it.
New database storage systems had to be created (Hadoop for example) just to
hold the data and new software written to be able to deal with so much
information in a meaningful way.
Today the term Big Data brings with it a series of assumptions and practices
that have made it a field all its own. Most Big Data discussions begin with
the 3 V’s. Big data is data containing more variety arriving in increasing
volumes and with increasing velocity (acceleration would be an accurate term
to use here, but then we’d lose the alliteration).
Volume
The term volume refers to the vast amount of data available. When the term
Big Data gained currency in the early 2000s, the amount of data available for
analysis was already overwhelming. Since then, the volume of data created has
grown exponentially. In fact, the volume of data produced has become so
vast that new storage solutions have to be created just to deal with it. This
increase in available data shows no sign of slowing, and data is, in fact,
increasing geometrically by doubling every two years.
Velocity
Along with the rise in the amount of data being created, there has been an
increase in the speed at which it is produced. Things like smartphones, RFID
chips, and real-time facial recognition produce not only enormous amounts of
data; this data is produced in real time and must be dealt with as it is created.
If not processed in real time, it must be stored for later processing. The
increasing speed of this data arriving strains the capacity of bandwidth,
processing power, and storage space to contain it for later use.
Variety
Data does not get produced in a single format. It is stored numerically in
detailed databases, produced in structure-less text and email documents, and
stored digitally in streaming audio and video. There is stock market data,
financial transactions, and so on, all of it uniquely structured. So not only
must large amounts of data be handled very quickly, it is produced in many
formats that require distinct methods of handling for each type.
Lately, two more V’s have been added:
Value
Data is intrinsically valuable, but only if you are able to extract this value
from it. Also, the state of input data, whether it is nicely structured in a
numeric database or unstructured text message chains, affects its value. The
less structure a data set has, the more work needs to be put into it before it
can be processed. In this sense, well-structured data is more valuable than
less-structured data.
Veracity
Not all captured data is of equal quality. When dealing with assumptions and
predictions parsed out of large data sets, knowing the veracity of the data
being used has an important effect on the weight given to the information
studying it generates. There are many causes that limit data veracity. Data can
be biased by the assumptions made by those who collected it. Software
bugs can introduce errors and omissions in a data set. Abnormalities can
reduce data veracity like when two wind speed sensors next to each other
report different wind directions. One of the sensors could be failing, but there
is no way to determine this from the data itself. Sources can also be of
questionable veracity — a company’s social media feed may contain a series
of very negative reviews. Were they created by humans or bots? Human error can also be
present such as a person signing up to a web service entering their phone
number incorrectly. And there are many more ways data veracity can be
compromised.
The point of dealing with all this data is to identify useful detail out of all the
noise — businesses can find ways to reduce costs, increase speed and
efficiency, design new products and brands, and make more intelligent
decisions. Governments can find similar benefits in studying the data
produced by their citizens and industries.
Here are some examples of current uses of Big Data.
Product Development
Big Data can be used to predict customer demand. By using current and past
products and services to classify key attributes, businesses can then model
the relationships between these attributes and their success in the market.
Predictive Maintenance
Buried in structured data are indicators that can predict the mechanical failure
of machine parts and systems. Year of manufacture, make and model, and so
on, provide a way to predict future breakdowns. Also, there is a wealth of
unstructured data in error messages, service logs, operating temperature, and
sensor data. This data, when correctly analyzed, can predict problems before
they happen so maintenance can be deployed preemptively, reducing both
cost and system downtime.
Customer Experience
Many businesses are nothing without their customers. Yet acquiring and
keeping customers in a competitive landscape is difficult and expensive.
Anything that can give a business an edge will be eagerly utilized. Using Big
Data, businesses can get a much clearer view of the customer experience by
examining social media, website visit metrics, call logs, and any other
recorded customer interaction to modify and improve the customer
experience, all in the interests of maximizing the value delivered in order to
acquire and maintain customers. Offers to individual customers can become
not only more personalized but more relevant and accurate. By using Big
Data to identify problematic issues, businesses can handle them quickly and
effectively, reducing customer churn and negative press.
Fraud & Compliance
While there may be single rogue bad actors out there in the digital universe
attempting to crack system security, the real threats are from organized, well-
financed teams of experts, sometimes teams supported by foreign
governments. At the same time, security practices and standards never stand
still but are constantly changing with new technologies and new approaches
to hacking existing ones. Big Data helps identify data patterns suggesting
fraud or tampering. Aggregation of these large data sets makes regulatory
reporting much faster.
Operation Efficiency
Not the sexiest topic, but this is the area in which Big Data is currently
providing the most value and return. Data helps companies analyze and
assess production systems, examine customer feedback and product returns,
and examine a myriad of other business factors to reduce outages and waste,
and even anticipate future demand and trends. Big Data is even useful in
assessing current decision-making processes and how well they function in
meeting demand.
Innovation
Big Data is all about relations between meaningful labels. For a large
business, this can mean examining how people, institutions, other entities,
and business processes intersect, and use any interdependencies to drive new
ways to take advantage of these insights. New trends can be predicted, and
existing trends can be better understood. This all leads to understanding what
customers actually want and anticipate what they may want in the future.
Knowing enough about individual customers may lead to the ability to take
advantage of dynamic pricing models. Innovation driven by Big Data is
really only limited by the ingenuity and creativity of the people curating it.
Machine Learning is also meant to deal with large amounts of data very
quickly. But while Big Data is focused on using existing data to find trends,
outliers, and anomalies, Machine Learning uses this same data to “learn”
these patterns in order to deal with future data proactively. While Big Data
looks at the past and present data, Machine Learning examines the present
data to learn how to deal with the data that will be collected in the future. In
Big Data, it is people who define what to look for and how to organize and
structure this information. In Machine Learning, the algorithm teaches itself
what is important through iteration over test data, and when this process is
completed, the algorithm can then go ahead to new data it has never
experienced before.
Chapter 11: Machine Learning and Regression
Machine Learning can produce two distinct output types — classification and
regression. A classification problem involves an input that requires an output
in the form of a label or category. Regression problems, on the other hand,
involve an input that requires an output value as a quantity. Let’s look at each
form in more detail.
Classification Problems
Classification problems are expected to produce an output that is a label or
category. That is to say, a function is created from an examination of the
input data that produces a discrete output. A familiar example of a
classification problem is whether a given email is spam or not spam.
Classification can involve probability, providing a probability estimate along
with its classification. 0.7 spam, for example, suggests a 70% chance an
existing email is spam. If this percentage meets or exceeds the acceptable
level for a spam label, then the email is classified as spam (we have spam
folders in our email programs, not 65% spam folders). Otherwise, it is
classified as not spam. One common way to determine a classification
algorithm's accuracy is to compare the results of its predictive model
against the actual classifications of the data set it has
examined. On a data set of 5 emails, for example, where the algorithm has
successfully classified 4 out of 5 of the emails in the set, the algorithm could
be said to have an accuracy of 80%. Of course, in a real-world example, the
training data set would be much more massive.
Regression problems
In a regression problem, the expected output is in the form of an unlimited
numerical range. The price of a used car is a good example. The input might
be the year, color, mileage, condition, etc., and the expected output is a dollar
value, such as $4,500 - $6,500. The skill (error) of a regression algorithm can
be determined using various mathematical techniques. A common skill
measure for regression algorithms is to calculate the root mean squared error,
RMSE.
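As a quick sketch of how this is computed (the price values below are made up for illustration), RMSE is the square root of the mean of the squared differences between the predicted and actual values:
import numpy as np
from sklearn.metrics import mean_squared_error

actual_prices = [4500, 5200, 6100, 5800]
predicted_prices = [4700, 5000, 6300, 5600]

# Root mean squared error: each prediction here is off by 200, so RMSE is 200
rmse = np.sqrt(mean_squared_error(actual_prices, predicted_prices))
print(rmse)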
Although it is possible to modify each of the above methods to produce each
other’s result (that is, turn a classification algorithm into a regression
algorithm and vice versa), the output requirements of the two define each
algorithm quite clearly:
1. Classification algorithms produce a discrete category result which can
be evaluated for accuracy, while regression algorithms cannot.
2. Regression algorithms produce a ranging result and can be evaluated
using root mean squared error, while classification algorithms cannot.
So, while Machine Learning employs both types of methods for problem-
solving (classification and regression), what method is employed for any
particular problem depends on the nature of the problem and how the solution
needs to be presented.
Chapter 12: Machine Learning and the Cloud
The Internet of Things (IoT)
Perhaps the first Internet of Things was at Carnegie Mellon University in
1982 when a Coke machine was connected to the internet so it could report
on its current stock levels, as well as the temperature of newly-loaded drinks.
But at that time, computer miniaturization had not progressed to the point it
has today, so it was difficult for people at that time to conceive of very small
devices connecting to the internet. Also, consumer wireless internet would
not be available for another 15 years, meaning any internet connected device
had to be fairly large and wired to its connection.
Today, the Internet of Things (IoT) refers to all the Internet-connected
devices in the world. Some are obvious like your smartphone, and some are
not so obvious like a smart thermostat in your house. The term also applies to
internet-connected devices that are part of a device like a sensor on an
industrial robot’s arm or the jet engine from a commercial jetliner.
A relatively simple explanation, but it is hard to conceive just how many IoT
devices are around us, never mind how many there will be in the near future.
The Internet of Things devices currently in use are often broken down into
the following categories:
Consumer Applications
Healthcare
We are entering the world of “Smart Healthcare,” where computers, the
internet, and artificial intelligence are merging to improve our quality of life.
The Internet of Medical (or Health) Things (IoHT) is a specific application of
the Internet of Things designed for health and medically related purposes. It
is leading to the digitization of the healthcare system, providing connectivity
between properly equipped healthcare services and medical resources.
Some Internet of (Health) Things applications enable remote health
monitoring and operate emergency notification systems. This can range from
blood pressure monitoring and glucose levels to the monitoring of medical
implants in the body. In some hospitals, you will find “smart beds” that can
detect patients trying to get up and adjust themselves to maintain healthy and
appropriate pressure and support for the patient, without requiring the
intervention of a nurse or other health professional. One report estimated
these sorts of devices and technology could save the US health care system
$300 billion in a year in increased revenue and decreased costs. The
interactivity of medical systems has also led to the creation of “m-health,”
which is used to collect and analyze information provided by different
connected resources like sensors and biomedical information collection
systems.
Rooms can be equipped with sensors and other devices to monitor the health
of senior citizens or the critically ill, ensuring their well-being, as well as
monitor that treatments and therapies are carried out to provide comfort,
regain lost mobility, and so on. These sensing devices are interconnected and
can collect, analyze, and transfer important and sensitive information. In-
home monitoring systems can interface with a hospital or other health-care
monitoring stations.
There have been many advances in plastics and fabrics allowing for the
creation of very low-cost, throw away “wearable” electronic IoT sensors.
These sensors, combined with RFID technology, are fabricated into paper or
e-textiles, providing wirelessly-powered, disposable sensors.
Combined with Machine Learning, the health care IoHT ecosphere around
each one of us can improve quality of life, guard against drug errors,
encourage health and wellness, and respond to and even predict responses by
emergency personnel. As a collection of technology and software, this future
“smart” health care will cause a profound shift in medical care, where we no
longer wait for obvious signs of illness to diagnose, but instead use the
predictive power of Machine Learning to detect anomalies and predict future
health issues long before human beings even know something might be
wrong.
Transportation
The IoT assists various transportation systems in the integration of control,
communications, and information processing. It can be applied throughout
the entire transportation system — drivers, users, vehicles, and infrastructure.
This integration allows for inter and even intra-vehicle communication,
logistics and fleet management vehicle control, smart traffic control,
electronic toll collection systems, smart parking, and even safety and road
assistance.
In the case of logistics and fleet management, an IoT platform provides
continuous monitoring of the location and condition of cargo and assets
using wireless sensors and sends alerts when anomalies occur (damage, theft,
delays, and so on). GPS, temperature, and humidity sensors can return data
to the IoT platform, where it can be analyzed and sent on to the appropriate
users. Users
are then able to track in real-time the location and condition of vehicles and
cargo and are then able to make the appropriate decisions based on accurate
information. IoT can even reduce traffic accidents by providing drowsiness
alerts and health monitoring for drivers to ensure they do not drive when they
need rest.
As the IoT is integrated more and more with vehicles and the infrastructure
required to move these vehicles around, the ability for cities to control and
alleviate traffic congestion, for businesses to control and respond to issues
with transportation of their goods, and for both of these groups to work
together, increases dramatically. In unforeseen traffic congestion due to an
accident, for example, sensitive products (a patient in an ambulance on the
way to Emergency or produce or other time-sensitive items) could be put in
the front of the queue to ensure they are delayed as little as possible, all
without the need for traffic direction by human beings.
The potential rewards of such an integrated system are many, perhaps only
limited by our imagination. Using IoT-enabled traffic statistics will
allow for optimum traffic routing, which in turn will reduce travel time and
CO2 emissions. Smart stoplights and road signs with variable-speed and
information displays will communicate more and more with the onboard
systems of vehicles, providing routing information to reduce travel time. And
all of this technology is made possible by dedicated Machine Learning
algorithms with access to the near endless flow of data from thousands and
thousands of individual IoT sensors.
Building/Home Automation
As discussed above, IoT devices can be used in any kind of building, where
they can monitor and control the electrical, mechanical, and electronic
systems of these buildings. The advantages identified are:
By integrating the internet with a building’s energy management
systems, it is possible to create “smart buildings” where energy
efficiency is driven by IoT.
Real-time monitoring of buildings provides a means for reducing
energy consumption and the monitoring of occupant behavior.
IoT devices integrated into buildings provide information on how smart
devices can help us understand how to use this connectivity in future
applications or building designs.
When you read “smart” or “real-time” above, you should be thinking
Machine Learning because it is these algorithms that can tie all of the sensor
input together into something predictive and intelligent.
Industrial Applications
The Internet of Things can be used for control and monitoring of both urban
and rural infrastructure, things like bridges, railways, and off and on-shore
wind farms. The IoT infrastructure can be employed to monitor changes and
events in structural conditions that might threaten safety or increase risks to
users.
The construction industry can employ the Internet of Things and receive
benefits like an increase in productivity, cost savings, paperless workflows,
time reduction, and better-quality work days for employees, while real-time
monitoring and data analytics can save money and time by allowing faster
decision-making processes. Coupling the Internet of Things with Machine
Learning predictive solutions allows for more efficient scheduling for
maintenance and repair, as well as allowing for better coordination of tasks
between users and service providers. Internet of Things deployments can
even control critical infrastructure by allowing bridge access control to
approaching ships, saving time and money. Large-scale deployment of IoT
devices for the monitoring and operation of infrastructure will probably
improve the quality of service, increase uptime, reduce the costs of
operations, and improve incident management and emergency response
coordination. Even waste-management could benefit from IoT deployments,
allowing for the benefits of optimization and automation.
City Scale Uses
The Internet of Things has no upper limit to what it can encompass. Today,
buildings and vehicles are routine targets for the Internet of Things
integration, but tomorrow, it will be cities that are the wide-scale target of
this digital revolution. In many cities around the world, this integration is
beginning with important effects on utilities, transportation, and even law
enforcement. In South Korea, for example, the city of Songdo is being built
from the ground up as the first of its kind, a fully equipped and wired city.
Much of the city will be automated, requiring very little or even no human
intervention, and its construction is nearly finished.
The city of Santander in Spain is taking a different approach. It does not have
the benefit of being built from scratch. Instead, it has produced an app that is
connected with over 10,000 sensors around the city. Things like parking
search, a digital city agenda, environmental monitoring, and more have been
integrated through the Internet of Things. Residents of the city download the app to
their smartphones in order to access this network.
In San Jose, California, an Internet of Things deployment has been created
with several purposes — reducing noise pollution, improving air and water
quality, and increasing transportation efficiency. In 2014, the San Francisco
Bay Area partnered with Sigfox, a French company, to deploy an ultra-narrowband
wireless data network, the first such business-backed installation in the
United States. They planned to roll out over 4000 more base stations to
provide coverage to more than 30 U.S. cities.
In New York, New York Waterways (NYWW) connected all of its vessels, allowing
24/7 monitoring. With this wireless network, NYWW can manage its fleet and
passengers in a way that was simply not possible to imagine even a decade
ago. Possible new applications for the network might include energy and fleet
management, public Wi-Fi, security, paperless tickets, and digital signage
among others.
Most of the developments listed above will rely on Machine Learning as the
intelligent back-end to the forward-facing Internet of Things used to monitor
and update all the components of any smart city. It would seem the next step,
after the creation of smart cities through the integration of Internet of Things
sensors and monitors with Machine Learning algorithms, would be larger,
regional versions.
Energy Management
So many electrical devices in our environment already boast an internet
connection. In the future, these connections will allow communication with
the utility from which they draw their power. The utility, in turn, will be able
to use this information to balance energy generation with load usage and to
optimize energy consumption over the grid. The Internet of Things will allow
users to remotely control (and schedule) lighting, HVAC, ovens, and so on.
On the utility’s side, aspects of the electrical grid, transformers, for example,
will be equipped with sensors and communications tools, which the utility
will use to manage the distribution of energy.
Once again, the power of Machine Learning will provide the predictive
models necessary to anticipate and balance loads, aided by the constant flow
of information from the Internet of Things devices located throughout the
power grid. At the same time, these IoT monitors will also provide service
levels and other information, so artificial intelligence systems can track and
identify when components are reaching the end of life and need repair or
replacement. No more surprise power outages, as components getting close to
failure will be identified and repaired before they can threaten the power grid.
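As a rough illustration of what such a predictive model might look like, the following sketch fits a simple scikit-learn regression to synthetic hourly demand data and forecasts the next day’s load. The data and the sine/cosine encoding of the hour are assumptions made purely for the example, not a description of how any actual utility operates.

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic hourly demand: a daily cycle plus noise, standing in for smart-meter data.
rng = np.random.RandomState(0)
hours = np.arange(24 * 14)                        # two weeks of hourly readings
demand = 100 + 30 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 5, hours.size)

# Features: hour of day encoded as sine/cosine so the daily cycle is learnable
# by a linear model.
X = np.column_stack([np.sin(2 * np.pi * (hours % 24) / 24),
                     np.cos(2 * np.pi * (hours % 24) / 24)])
model = LinearRegression().fit(X, demand)

# Forecast the next day's load so generation can be planned ahead of time.
next_hours = np.arange(24)
X_next = np.column_stack([np.sin(2 * np.pi * next_hours / 24),
                          np.cos(2 * np.pi * next_hours / 24)])
print(model.predict(X_next).round(1))

A real grid forecaster would fold in weather, calendars, and thousands of meters, but the principle is the same: learn the demand pattern from history and predict the load before it arrives.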
Environmental Monitoring
IoT can be used for environmental monitoring and usually, that application
means environmental protection. Sensors can monitor air and water quality,
soil and atmospheric conditions, and can even be deployed to monitor the
movement of wildlife through their habitats. This use of the Internet of
Things over large geographic areas means it can be used by tsunami or
earthquake early-warning systems, which can aid emergency response
services in providing more localized and effective aid. These IoT devices can
include a mobile component. A standardized environmental Internet of
Things will likely revolutionize environmental monitoring and protection.
Here, we can begin to see the geographic scalability of Machine Learning as
it interfaces with the Internet of Things. While the use in cities can be
impressive due to the number of sensors and devices deployed throughout
them, in rural or natural environmental settings, the density of IoT devices
drops sharply, yet these huge geographical areas can come under the scrutiny
of Machine Learning algorithms in ways yet to be conceived.
Trends in IoT
The trend for the Internet of Things is clear — explosive growth. There were
an estimated 8.4 billion IoT devices in 2017, a 31% increase from the year
before. By 2020, worldwide estimates are that there will be 30 billion
devices. At this point, the market value of these devices will be over $7
trillion. The amount of data these devices will produce is staggering. And
when all this data is collected and ingested by Machine Learning algorithms,
our control and understanding of so many aspects of our lives, our cities, and
even our wildlife will increase dramatically.
Intelligence
Ambient Intelligence is an emerging discipline in which our environment
becomes sensitive to us. It was not part of the original concept of the Internet
of Things, but the trend seems to be that Ambient Intelligence and autonomous
control are two disciplines into which the Internet of Things will make great inroads.
Already research in these disciplines is shifting to incorporate the power and
ubiquity of the IoT.
Unlike today, the future Internet of Things might be composed of a good
number of non-deterministic devices accessing an open network where they
can auto-organize. That is to say, a particular IoT device may have a purpose,
but when required by other Internet of Things devices, it can become part of a
group or swarm to accomplish a collective task that overrides its individual
task in priority. Because these devices will be more generalized with a suite
of sensors and abilities, these collectives will be able to organize themselves
using the particular skills of each individual to accomplish a task, before
releasing their members when the task is complete so they can continue on with
their original purpose. These autonomous groups can form and disperse
depending on the priority given to the task, and considering the
circumstances, context, and the environment in which the situation is taking
place.
Such collective action will require coordination and intelligence, which of
course will need to be provided, at least in part, by powerful Machine
Learning algorithms tasked with achieving these large-scale objectives. Even
if the Internet of Things devices that are conscripted to achieve a larger goal
have their own levels of intelligence, it will be assisted by artificial
intelligence systems able to collate all the incoming data from the IoT devices
responding to the goal and finding the best approach to solve any problems.
Although swarm intelligence technology will also likely play a role (see the
chapter on swarm intelligence later in the book), swarms cannot provide the
predictive ability that Machine Learning, backed by powerful servers crunching
massive data sets, can deliver.
Architecture
The system architecture of IoT devices is typically broken down into three
main tiers. Tier one includes devices, Tier two is the Edge Gateway, and Tier
three is the Cloud.
Devices are simply network-connected things with sensors and transmitters
using a communication protocol (open or proprietary) to connect with an
Edge Gateway.
The Edge Gateway is the point where incoming sensor data is aggregated and
processed through systems that provide functionality including data pre-
processing, securing cloud connections, and running WebSockets or even
edge analytics.
Finally, Tier three includes the cloud-based applications built for the
Internet of Things devices. This tier includes database storage systems to
archive sensor data. The cloud system handles communication taking place
across all tiers and provides features like event queuing.
As data is passed up from the Internet of Things, a new architecture emerges:
the web of things, an application layer that processes the disparate sensor data
from Internet of Things devices into web applications, driving innovation in
use cases.
It is no great leap to assume this architecture does, and will even more in the
future, require massive scalability in the network. One solution being
deployed is fog (or edge) computing, where the edge network, tier two above,
takes over many of the analytical and computational tasks, sorting the
incoming data and only providing vetted content to the tier three cloud layer.
This reduces both latency and bandwidth requirements in the network. At the
same time, if communications are temporarily broken between the cloud and
edge layers, the edge layer’s fog computing capabilities can take over to run
the tier one devices until communications can be re-established.
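The following toy Python sketch illustrates the tier-two behavior described above: an edge gateway aggregates bursts of raw readings into compact summaries, forwards them to the tier-three cloud, and buffers them locally when the cloud link is down. The class, method names, and sensor values are invented for the illustration and do not correspond to any particular IoT platform.

from statistics import mean

class EdgeGateway:
    # Toy tier-two gateway: aggregate raw sensor readings locally and
    # forward only compact summaries to the tier-three cloud.

    def __init__(self, cloud_online=True):
        self.cloud_online = cloud_online
        self.buffer = []        # summaries held back while the cloud link is down

    def ingest(self, sensor_id, readings):
        # Edge analytics: reduce a burst of raw samples to one vetted summary.
        summary = {"sensor": sensor_id,
                   "mean": round(mean(readings), 2),
                   "max": max(readings),
                   "count": len(readings)}
        if self.cloud_online:
            self.send_to_cloud(summary)
        else:
            self.buffer.append(summary)     # fog mode: keep running locally

    def send_to_cloud(self, summary):
        print("-> cloud:", summary)

    def reconnect(self):
        self.cloud_online = True
        for summary in self.buffer:
            self.send_to_cloud(summary)
        self.buffer.clear()

gateway = EdgeGateway(cloud_online=False)
gateway.ingest("thermostat-12", [21.2, 21.4, 21.9, 22.1])
gateway.reconnect()          # buffered summaries flush once the link returns

The design choice is the one described in the text: the heavy, chatty raw data stays at the edge, and only small, already-processed summaries cross the network to the cloud.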
Complexity and Size
Today’s Internet of Things is largely a collection of proprietary devices,
protocols, languages, and interfaces. In the future, as more standardization is
deployed, this collection will be studied as a complex system, given its vast
numbers of communication links, autonomous actors, and its capacity to
constantly integrate new actors. Today, however, most elements
of the Internet of Things do not belong to any massive group of devices.
Devices in a smart home, for example, do not communicate with anything
other than the central hub of the home. These subsystems are there for
reasons of privacy, reliability, and control. Only the central hub can be
accessed or controlled from outside the home, meaning all sensor data,
positions, and the status of the devices inside are not shared. Until such time
as wider networks can give assurances about security of access to and control
of these private networks of IoT devices, and, at the same time, until
standardized architectures are developed to allow the command and
control of such vast numbers of internet-connected devices, the IoT will
likely continue as a collection of non-communicating mini-networks. SDN
(Software-Defined Networking) appears promising in this area as a solution
that can deal with the diversity and unique requirements of so many IoT
applications.
For Machine Learning to take advantage of all this sensor data, ways will
need to be devised to collect it into training and data sets that can be iterated
over, allowing the software to learn about and be able to predict important
future states of the IoT network. For the time being, unfortunately, most IoT
sensor data produced in these private networks is simply being lost.
Space
Today, the question of an Internet of Things device’s location in space and time
has not been given priority. Smartphones, for example, allow the user to opt out
of letting an application access the phone’s geolocation capability.
In the future, the precise location, position, size, speed, and
direction of every IoT device on the network will be critical. In fact, for IoT
devices to be interconnected and provide coordinated activity, the current
issues of variable spatial scales, indexing for fast search and close neighbor
operations, and indeed just the massive amount of data these devices will
provide all have to be overcome. If the Internet of Things is to become
autonomous from human-centric decision-making, then in addition to
Machine Learning algorithms being used to command and coordinate them,
space and time dimensions of IoT devices will need to be addressed and
standardized in a way similar to how the internet and Web have been
standardized.
Jean-Louis Gassée and a “basket of remotes”
Gassée worked at Apple from 1981 to 1990 and after that was one of the
founders of Be Incorporated, creator of the BeOS operating system. He
describes the issue of various protocols among Internet of Things providers
as the “basket of remotes” problem. As wireless remote controls gained
popularity, consumers would find they ended up with a basket full of remotes
— one for the TV, one for the VCR, one for the DVD player, one for the
stereo, and possibly even a universal remote that was supposed to replace all
the others but often failed to match at least one device. All these devices used
the same technology, but because each manufacturer used a proprietary
language and/or frequency to communicate with their appliance, there was no
way to easily get one category of device to speak the same language as the
others. In the same way, Gassée sees that as each new deployment of
Internet of Things tech reaches the consumer market, it becomes its own
remote and base station, unable to communicate with the other remotes or
with the base stations that other groups of IoT tech talk to. If this state of
proprietary Internet of Things bubbles is not overcome, many of the perceived
benefits of a vast IoT network will never be realized.
Security
Just as security is one of the main concerns for the conventional internet,
security of the Internet of Things is a much-discussed topic. Many people are
concerned that the industry is developing too rapidly and without an
appropriate discussion about the security issues involved in these devices and
their networks. The Internet of Things, in addition to standard security
concerns found on the internet, has unique challenges — security controls in
industry, Internet of Things business processes, hybrid systems, and end
nodes.
Security is likely the main concern over adopting Internet of Things tech.
Cyber-attacks on components of this industry are likely to increase as the
scale of IoT adoption increases. And these threats are likely to become
physical, as opposed to merely virtual. Current Internet of Things
systems have many security vulnerabilities, including a lack of encrypted
communication between devices, weak authentication (many devices are
allowed to run in production environments with default credentials), lack of
verification or encryption of software updates, and even SQL injection. These
vulnerabilities give bad actors the ability to easily steal user credentials, intercept
data, collect Personally Identifiable Information, or even inject malware
into firmware updates.
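As a small illustration of one of the vulnerabilities named above, the sketch below uses Python’s built-in sqlite3 module to show how a concatenated query string lets hostile input rewrite the SQL, while a parameterized query treats the same input as plain data. The device table and the hostile string are invented for the example.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE devices (id TEXT, owner TEXT)")
conn.execute("INSERT INTO devices VALUES ('cam-01', 'alice')")
conn.execute("INSERT INTO devices VALUES ('cam-02', 'bob')")

device_id = "cam-01' OR '1'='1"        # hostile input a device API might receive

# Vulnerable: the attacker's quote characters become part of the SQL statement,
# so the condition is always true and every row leaks.
leaky = conn.execute(
    "SELECT * FROM devices WHERE id = '" + device_id + "'").fetchall()
print("string concatenation:", leaky)

# Safer: a parameterized query treats the whole value as data, never as SQL.
safe = conn.execute(
    "SELECT * FROM devices WHERE id = ?", (device_id,)).fetchall()
print("parameterized query:", safe)    # no rows match the literal string

The same principle applies to any backend an IoT service uses: untrusted input should never be spliced directly into a query.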
Many internet-connected devices are already spying on people in their own
homes, including kitchen appliances, thermostats, cameras, and televisions.
Many components of modern cars are susceptible to manipulation should a
bad actor gain access to the vehicle’s onboard systems including dashboard
displays, the horn, heating/cooling, hood and trunk releases, the engine, door
locks, and even braking. Those vehicles with wireless connectivity are
vulnerable to wireless remote attacks. Demonstration attacks on other
internet-connected devices have been made as well, including cracking
insulin pumps, implantable cardioverter defibrillators, and pacemakers.
Because some of these devices have severe limitations on their size and
processing power, they may be unable to use standard security measures like
strong encryption for communication or even to employ firewalls.
Privacy concerns over the Internet of Things have two main thrusts —
legitimate and illegitimate uses. In legitimate uses, governments and large
corporations may set up massive IoT services which by their nature collect
enormous amounts of data. To a private entity, this data can be monetized in
many different ways, with little or no recourse for the people whose lives and
activities are swept up in the data collection. For governments, massive data
collection from Internet of Things networks provides the information necessary
to deliver services and infrastructure, to save resources and reduce emissions,
and so on. At the same time, these systems will collect enormous amounts of
data on citizens, including their locations, activities, shopping habits, travel,
and so on. To some, this is the realization of a true surveillance state. Without
a legal framework in place to prevent governments from simply scooping up
endless amounts of data to do with as they wish, it is difficult to refute this
argument.
Illegitimate uses of massive Internet of Things networks include
everything from DDoS (distributed denial of service) attacks to
malware attacks on one or more of the IoT devices on the network. Even
more worrying, a security vulnerability in even one device on an Internet of
Things network is dangerous because that device is capable of full communication
with all the devices nearby and has access to the encryption credentials needed
to present itself as a legitimate device on the network. This means an infected
device may provide its illegitimate controller not only with the data it collects, but
also with metadata about other devices in the network, and possibly even access
to the edge systems themselves.
In 2016, a DDoS attack was powered by Internet of Things devices
compromised by a malware infection that spread to over 300,000 devices,
bringing down both a DNS provider and several major websites. This
Mirai botnet singled out for infection devices that consisted mostly of
IP cameras, DVRs, printers, and routers.
While there are several initiatives being made to increase security in the
Internet of Things marketplace, many argue that government regulation and
inter-governmental cooperation around the world must be in place to ensure
public safety.
Chapter 14: Machine Learning and Robotics
Neural networks are a machine learning framework that tries to mimic the
way the natural biological neural networks operate. Humans have the
capacity to identify patterns with a very high degree of accuracy. Anytime
you see a cow, you can immediately recognize that it is a cow. The same
applies when you see a goat. The reason is that you have learned over a
period of time what a cow or a goat looks like and what differentiates
the two.
Artificial neural networks refer to computation systems that try to imitate the
capabilities of human learning via a complex architecture that resembles the
nervous system of a human being.
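At the level of a single unit, the imitation is quite simple: each artificial neuron takes a weighted sum of its inputs, adds a bias, and passes the result through an activation function. The minimal Python sketch below shows one such neuron with made-up inputs and weights; real networks simply wire thousands or millions of these units together and learn the weights from data.

import math

def neuron(inputs, weights, bias):
    # A single artificial neuron: weighted sum of inputs plus a bias,
    # squashed by a sigmoid activation into the range (0, 1).
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical features of an image (e.g., ear shape, body size), with made-up weights.
print(neuron(inputs=[0.8, 0.3], weights=[1.5, -0.7], bias=0.1))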
Chapter 15: Machine Learning and Swarm
Intelligence
Swarm Intelligence (SI) is defined as collaborative behavior, natural or
artificial, of decentralized, self-organized systems. That is, Swarm
Intelligence can refer to an ant colony or a “swarm” of autonomous mini-
drones in a lab.
In artificial intelligence, a swarm is typically a collection of agents that
interact with each other and their environment. The inspiration for Swarm
Intelligence comes from nature, from the collaboration of bees to the flocking
of birds to the motions of herd animals, groups of animals that appear to act
intelligently even when no single individual has exceptional intelligence, and
there is no centralized decision-making process.
Swarm Behavior
One of the central tenets gleaned from swarm research has been the notion of
emergent behavior. When a number of individuals are each given simple rules,
complex behaviors can arise despite there being no rule
or instruction to create them. Consider the artificial life program created by
Craig Reynolds in 1986, which simulates bird flocking. Each individual bird
was given a simple set of rules:
Avoid crowding local flockmates (separation).
Steer towards the average heading of local flockmates (alignment).
Steer to move toward the average position of local flockmates
(cohesion).
When he ran the simulation, his artificial birds behaved like a real
flock. He discovered he could add more rules to produce more complex flocking
behavior, such as goal seeking or obstacle avoidance.
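The sketch below is a minimal NumPy re-creation of those three rules (it is not Reynolds’ original code, and the weights and neighborhood radius are arbitrary choices for the illustration). Each step, every “bird” nudges its velocity away from crowding, toward the average heading of its neighbors, and toward their average position.

import numpy as np

def flock_step(positions, velocities, dt=0.1,
               sep_w=0.05, align_w=0.05, coh_w=0.01, radius=2.0):
    # One update of Reynolds-style flocking: separation, alignment, cohesion.
    new_v = velocities.copy()
    for i in range(len(positions)):
        offsets = positions - positions[i]
        dists = np.linalg.norm(offsets, axis=1)
        neighbors = (dists > 0) & (dists < radius)
        if not neighbors.any():
            continue
        separation = -offsets[neighbors].mean(axis=0)                    # move away from crowding
        alignment = velocities[neighbors].mean(axis=0) - velocities[i]   # match the average heading
        cohesion = positions[neighbors].mean(axis=0) - positions[i]      # steer toward the local center
        new_v[i] += sep_w * separation + align_w * alignment + coh_w * cohesion
    return positions + dt * new_v, new_v

rng = np.random.RandomState(1)
pos, vel = rng.rand(20, 2) * 5, rng.randn(20, 2)
for _ in range(100):
    pos, vel = flock_step(pos, vel)
print("velocity spread after 100 steps:", vel.std(axis=0).round(3))

After a few hundred steps, the velocities tighten up and the points move as a coherent group, even though no rule says “form a flock.”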
Applications of Swarm Intelligence
There are many problems and limitations with Machine Learning. This
chapter will go over the technical issues that are currently or may in the
future limit the development of Machine Learning. Finally, it will end with
some philosophical concerns about possible issues Machine Learning may
bring about in the future.
Concerns about Machine Learning limitations have been summarized in a
simple phrase, which outlines the main objections to Machine Learning. It
suggests Machine Learning is greedy, brittle, opaque, and shallow. Let’s
examine each one in turn.
Greedy
By calling it greedy, critics of Machine Learning point to the need for
massive amounts of training data to be available in order to successfully train
Machine Learning systems to acceptable levels of error. Because Machine
Learning systems are trained not programmed, their usefulness will be
directly proportional to the amount (and quality) of the data sets used to train
them.
Related to the size of training data required is the fact that, for supervised and
semi-supervised Machine Learning training, the raw data used for training
must first be labeled so that it can be meaningfully employed by the software
to train. In essence, the task of labeling training data means to clean up the
raw content and prepare it for the software to ingest. But labeling data can
itself be a very complex task, as well as often a laborious and tedious one.
Unlabeled or improperly labeled data fed into a supervised Machine Learning
system will produce nothing of value; it is simply a waste of time.
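A tiny scikit-learn example makes the point: the labels (y below) are what a supervised learner actually fits against, and without them there is nothing to train. The iris data set is used here only because it ships with scikit-learn already labeled by hand.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# The iris data ships already labeled: X holds the raw measurements,
# y is the label a human attached to every row.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("accuracy with labels:", clf.score(X_test, y_test))

# Without y there is nothing for a supervised learner to fit against:
# DecisionTreeClassifier().fit(X_train)   # TypeError: missing required argument 'y'

Real-world projects rarely come with labels attached; producing them is the laborious, often manual step the passage above describes.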
Brittle
To say Machine Learning is brittle is to highlight a very real and difficult
problem in Artificial Intelligence. The problem is, even after a Machine
Learning system has been trained to provide extremely accurate and valuable
predictions on data it has been trained to deal with, asking that trained system
to examine a data set even slightly different from the type it was trained on
will often cause a complete failure of the system to produce any predictive
value. That is to say, Machine Learning systems are unable to contextualize
what they have learned and apply it to even extremely similar circumstances
to those on which they have been trained. At the same time, attempting to
train an already trained Machine Learning algorithm with a different data set
will often cause the system to “forget” its previous learning, losing, in turn, all the
time and effort put into preparing the previous data sets and training the
algorithm with them.
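The brittleness to a change of context can be illustrated with a deliberately simple, synthetic experiment: a classifier trained on one data distribution scores very well on more data from that same distribution and collapses to roughly chance level when the whole distribution is shifted sideways. The data and the size of the shift are invented for the illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)

def make_data(shift=0.0, n=500):
    # Two Gaussian classes; 'shift' slides the whole distribution sideways.
    X0 = rng.normal(loc=0.0 + shift, scale=1.0, size=(n, 2))
    X1 = rng.normal(loc=3.0 + shift, scale=1.0, size=(n, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * n + [1] * n)
    return X, y

X_train, y_train = make_data(shift=0.0)
clf = LogisticRegression().fit(X_train, y_train)

X_same, y_same = make_data(shift=0.0)
X_shifted, y_shifted = make_data(shift=4.0)     # same task, different conditions

print("accuracy, familiar data:", clf.score(X_same, y_same))
print("accuracy, shifted data: ", clf.score(X_shifted, y_shifted))

The task itself has not changed, only the conditions, yet the trained model has no way to carry its knowledge across the gap.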
Bias in Machine Learning systems is another example of how Machine
Learning systems can be brittle. In fact, there are several different kinds of
bias that threaten Artificial Intelligence. Here are a few:
Bias in Data:
Machine Learning is, for the foreseeable future at least, at the mercy of the
data used to train it. If this data is biased in any way, whether deliberately or
by accident, those biases hidden within it may be passed onto the Machine
Learning system itself. If not caught during the training stage, this bias can
taint the work of the system when it is out in the real world doing what it was
designed to do. Facial recognition provides a good example: facial
recognition systems trained in predominantly white environments, on
predominantly white samples of faces, have trouble recognizing people with
darker skin tones.
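A synthetic sketch of the same effect: when one group dominates the training data and the decision boundary sits in a slightly different place for the under-represented group, the trained model scores well on the majority group and much worse on the minority group. The groups, offsets, and sample sizes here are entirely made up to illustrate the mechanism.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(42)

def make_group(n, offset):
    # Binary task where the class boundary sits in a slightly different
    # place for each group (controlled by 'offset').
    X = rng.normal(size=(n, 2)) + offset
    y = (X[:, 0] + X[:, 1] > 2 * offset).astype(int)
    return X, y

# Group A dominates the training data; group B is barely represented.
XA, yA = make_group(1000, offset=0.0)
XB, yB = make_group(30, offset=1.5)
X_train = np.vstack([XA, XB])
y_train = np.concatenate([yA, yB])

clf = LogisticRegression().fit(X_train, y_train)

XA_test, yA_test = make_group(500, offset=0.0)
XB_test, yB_test = make_group(500, offset=1.5)
print("accuracy on group A:", clf.score(XA_test, yA_test))
print("accuracy on group B:", clf.score(XB_test, yB_test))

The model is not malicious; it simply learned the boundary that fits the data it was given, and the data under-represented one group.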
Acquired Bias:
It is sometimes the case that, while interacting with people in the real world,
Machine Learning systems can acquire the biases of the people they are
exposed to. A real-world example was Microsoft’s Tay, a chatbot designed to
interact with people on Twitter using natural language. Within 24 hours, Tay
became a pretty foul troll spouting racist and misogynist statements.
Microsoft pulled the plug (and set about scrubbing the offending tweets from Twitter). Many of the
truly offensive things Tay said were the result of people telling it to say them
(there was an option to tell Tay to repeat a phrase you sent it), but there were
some very troubling comments produced by Tay that it was not instructed to
repeat. Microsoft was clearly aware of how nasty Twitter can get, and I think
it’s fair to say creating a racist, misogynist chatbot was just about the last
thing on their whiteboard. So if a massive, wealthy company like Microsoft
cannot train an artificial intelligence that doesn’t jump into the racist, woman-
hating camp of nasty internet trolls, what does that say about the dangers
inherent in any artificial intelligence system we create to interact in the real
world with real people?
Emergent Bias:
An echo chamber is a group or collection of people who all believe the same
things. This could be a political meeting in a basement or a chat room on the
internet. Echo chambers are not tolerant of outside ideas, especially those that
disagree with or attempt to refute the group’s core beliefs. Facebook has
become, in many ways, the preeminent echo chamber producer on Earth. But
while the echo chamber phrase is meant to describe a group of people with
similar ideas, Facebook’s artificial intelligence has taken this idea one step
further: to an echo chamber of one. The Machine Learning system Facebook
deploys to gather newsfeeds and other interesting bits of information to show
to you can become just such an echo chamber. As the artificial intelligence
learns about your likes and dislikes, about your interests and activities, it
begins to create a bubble around you that, while it might seem comforting,
will not allow opposing or offensive views to reach you. The area where this
is most alarming is news and information. Being constantly surrounded by
people who agree with you, reading the news that only confirms your beliefs,
is not necessarily a good thing. How we know we are correct about a
particular issue is by testing our beliefs against those who believe otherwise. Do
our arguments hold up to their criticism or might we be wrong? In a
Facebook echo chamber of one, that kind of learning and growth becomes
less and less possible. Some studies suggest that spending time with people
who agree with you tends to polarize groups, making the divisions between
them worse.
Goals Conflict Bias:
Machine Learning can often support and reinforce biases that exist in the real
world because doing so increases their reward system (a system is
“rewarded” when it achieves a goal). For example, imagine you run a college,
and you want to increase enrollment. The contract you have with the
advertising company is that you will pay a certain amount for each click you
get, meaning someone has seen your advertisement for the college and was
interested enough to at least click on the link to see what you are offering. Of
course, a simple generic ad like “Go to College!” wouldn’t work so well, so
you run several ads simultaneously, offering degrees in engineering,
teaching, mathematics, and nursing.
It is in the advertiser’s best interest to get as many clicks to your landing
pages as possible. So they employ a Machine Learning algorithm to track
who clicks on what advertisement, and it begins to target those groups with
the specific ads to gain more clicks. So far, from the outside, this seems like a
win-win. The advertiser is making more revenue, and your college is
receiving more potential students examining your courses. But then you
notice something in the aggregate data of link clicks. Most of the clicks you
are receiving for math and engineering are from young men, while most of
the clicks for nursing and teaching are from young women. This aligns with
an unfortunate cultural stereotype that still exists in western culture, and your
college is perpetuating it. In its desire to be rewarded, the Machine Learning
system assigned to maximize clicks for your advertising campaign found an existing
social bias and exploited it to increase revenue for its owner. These two
goals, increasing your enrollment and reducing gender bias in employment,
have come into conflict and Machine Learning sacrificed one to achieve the
other.
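A toy epsilon-greedy simulation (not anything a real advertiser necessarily uses, and with invented click probabilities) shows how a purely reward-driven allocator drifts toward serving each group the ad that already gets the most clicks from that group, thereby reproducing the pre-existing bias baked into the click rates.

import random
random.seed(0)

ads = ["engineering", "nursing"]
groups = ["men", "women"]
# Invented click probabilities that encode a pre-existing social bias.
click_prob = {("men", "engineering"): 0.10, ("men", "nursing"): 0.05,
              ("women", "engineering"): 0.05, ("women", "nursing"): 0.10}

shows = {(g, a): 1 for g in groups for a in ads}    # start at 1 to avoid dividing by zero
clicks = {(g, a): 0 for g in groups for a in ads}

for _ in range(20000):
    group = random.choice(groups)
    if random.random() < 0.1:                       # explore occasionally
        ad = random.choice(ads)
    else:                                           # otherwise exploit the best observed click rate
        ad = max(ads, key=lambda a: clicks[(group, a)] / shows[(group, a)])
    shows[(group, ad)] += 1
    if random.random() < click_prob[(group, ad)]:
        clicks[(group, ad)] += 1

for key in sorted(shows):
    print(key, "shown", shows[key], "times")        # each group ends up funneled to 'its' ad

Nothing in the code mentions gender stereotypes; the skew emerges purely from maximizing the reward signal.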
Opaque
One of the main criticisms of Machine Learning, and in particular, against
Neural Networks, is that they are unable to explain to their creators why they
arrive at the decisions they do. This is a problem for two reasons: one, more
and more countries are adopting internet laws that include a right to an
explanation. The most influential of these laws is the GDPR (The EU General
Data Protection Regulation), which guarantees EU citizens the right to an
explanation of why an algorithm that deals with an important part of their lives
made the decision it did. For example, an EU citizen turned down for a loan
by a Machine Learning algorithm has a right to demand an explanation of why this happened.
Because some artificial intelligence tools like neural networks are often not
capable of providing any explanation for their decision, and the fact this
decision is hidden in layers of math not readily available for human
examination, such an explanation may not be possible. This has, in fact,
slowed down the adoption of artificial intelligence in some areas. The second
reason it is important that artificial intelligence be able to explain its
decisions to its creators is for verifying that the underlying process is in fact
meeting expectations in the real world. For Machine Learning, the decision-
making process is mathematical and probabilistic. In the real world, decisions
often need to be confirmed by the reasoning used to achieve them.
Take, for example, a self-driving car involved in an accident. Assuming the
hardware of the car is not completely destroyed, experts will want to know
why the car took the actions it did. Perhaps there is a court case, and the
decision of liability rests on how and why the car took the actions it did.
Without transparency around the decision-making process of the software,
justice might not be served in the courts, software engineers may not be able
to find and correct flaws, and more people might be in danger from the same
software running in different cars.
The Philosophical Objections: Jobs, Evil, and
Taking Over the World
Jobs:
One of the main concerns surrounding artificial intelligence is the way these
systems are encroaching upon human employment. Automation is not a new
problem, and vast numbers of jobs have already been lost in manufacturing
and other such industries to robots and automated systems. Because of this,
some argue that this concern that machines and artificial intelligence will take
over so many jobs there will be no more economy is not really a threat.
We’ve encountered such takeovers before, but the economy shifted, and
people found other jobs. In fact, some argue that the net number of jobs has
increased since the loss of so many jobs to automation. So what’s the
problem?
The problem is that this next round of job losses will be unprecedented. Artificial
Intelligence will not only replace drudgery and danger; it will keep going.
Consider truck drivers, of whom there are roughly 3.5 million in the United States
today. How long until all trucking is
handled by Machine Learning systems running self-driving trucks? And what
about ride-sharing services like Uber and Lyft?
The question to ask is not what jobs will be replaced, but what jobs can’t be
replaced. In the Machine Learning round of employment disruption, white
collar jobs are just as much in jeopardy as blue. What about accountants,
financial advisers, copywriters, and advertisers?
The pro-artificial intelligence camp argues that this disruption will open up
new kinds of jobs, things we can’t even imagine. Each major disruption of
the economy in the past that displaced many forms of employment with
automation often caused the creation of new employment unforeseen before
the disruption. The difference with artificial intelligence is there is no reason
to believe these new forms of employment won’t quickly be taken over by
Machine Learning systems as well.
And all the above assumes the use of the current type of very specific
Machine Learning. What happens if researchers are able to generalize
learning in artificial intelligence? What if these systems become able to
generalize what they learn and apply what they know in new and different
contexts? Such a generalized artificial intelligence system could very quickly
learn to do just about anything.
Evil:
A very serious danger from artificial intelligence is the fact that anyone with
resources can acquire and use it. Western governments are already toying
with the idea of autonomous weapons platforms using artificial intelligence
to engage and defeat an enemy, with little or no human oversight. As
frightening as this might be, in these countries at least, there are checks and
balances on the development and deployment of such devices, and in the end,
the population can vote for or against such things.
But even if these platforms are developed and only used appropriately, what
is to stop an enemy from capturing one, reverse engineering it, and then
developing their own? What’s to stop a rival power from investing in the
infrastructure to create their own?
The threat of Machine Learning used by bad actors is very real. How do we
control who gains access to this powerful technology? The genie is out of the
bottle. It can’t be put back in. So, how do we keep it from falling into the
wrong hands?
Taking Over the World:
Finally, we’ll take a look at the end humanity might cause by creating super-
intelligent machines. Luminaries such as Stephen Hawking and Elon Musk
have quite publicly voiced their concerns about the dangers artificial
intelligence could pose to humanity. In the Terminator movie franchise, a
Defense Department artificial intelligence called Skynet, tasked with
protecting the United States, determined that human beings, everywhere,
were the real threat. Using its control of the US nuclear arsenal, it launched
an attack on Russia, which precipitated a global nuclear war. In the disaster
that followed, it began systematically wiping out the human race.
This example is extreme and probably better as a movie plot than something
we have to worry about from artificial intelligence in the real world. No, the
danger to humanity from artificial intelligence lies more likely in a paperclip
factory.
Imagine an automated paperclip factory run by an artificial intelligence
system capable of learning over time. This system is smarter than human
beings. It has one goal: to produce as many paperclips as possible, as
efficiently as possible. This sounds harmless enough. And at first, it is. The
system learns everything there is to know about creating a paperclip, from the
mining of metals, smelting, transportation of raw steel to its factory, the
automated factory floor, clip design specifications, and so on.
As it learns over time, however, it runs up against walls it cannot surpass. A
factory can only be so efficient. Eventually boosting efficiency further
becomes impossible. Acquiring the supply chain from mining to smelting to
transportation might come next. But again, once these aspects of the business
are acquired and streamlined, they no longer offer a path to the goal —
increase production and efficiency.
As artificial intelligence collects more information about the world, it comes
to learn how to take over other factories, retool them for paperclip
production, and increase output. Then more transportation, mining, and
smelting might need to be acquired. After a few more factories are retrofitted,
the distribution center where the paperclips are delivered begins to fill with
paperclips. There are many more than anyone needs. But this is not the
concern of our artificial intelligence. The supply side of the paperclip chain is
not part of its programming.
It learns about politics and so begins to influence elections when people try to
stop paperclip production. It continues to acquire and take over businesses
and technologies to increase production.
Absurd as it sounds (and this is just a thought experiment), imagine the entire
Earth, now devoid of people, a massive mining, transportation, smelting, and
paperclip production facility. And as the piles of unwanted, unused
paperclips turn into mountains to rival the stone ones of our planet, with
metal resources dwindling, our stalwart AI turns its lenses up to see the
moon, Mars, and the asteroid belt. It sees other solar systems, other galaxies,
and finally the universe. So much raw metal out there, all just waiting to be
made into paperclips…
Chapter 19: Machine Learning and the Future
After the previous chapter, it’s probably best to end on a positive note, with an
examination of the promising future of Machine Learning. This chapter will
break down the future of Machine Learning into segments of society and the
economy in an effort to put these possible futures in a useful context.
Security
Facial recognition and aberrant-behavior detection are Machine Learning tools
that are available today. They will become ubiquitous in
the future, protecting the public from criminal behavior and getting people in
trouble the help they need.
But what about the other Machine Learning security features in the future?
In the cyber world, Machine Learning will grow and increase its influence in
identifying cyber-attacks, malicious software code, and unexpected
communication attempts. At the same time, black hat software crackers are
also working on Machine Learning tools to aid them in breaking into
networks, accessing private data, and causing service disruptions. The future
of the cyber world will be an ongoing war between White and Black Hat
Machine Learning tools. Who will win? We hope the White Hat Machine
Learning algorithms will be victorious, but win or lose, the battle will slowly
move out of the hands of people and into the algorithms of Machine
Learning.
Another sweeping change to security we might see in the near future is
autonomous drones controlled by Machine Learning algorithms. Drones can
maintain constant aerial surveillance over entire cities at very little cost. With
advancements in image recognition, motion capture, video recognition, and
natural language processing, they will be able to communicate with people on
the street, respond to natural disasters and automobile accidents, ferry
medications needed in emergencies to places traditional service
vehicles cannot reach, and find and rescue lost hikers by leading them to
safety, delivering needed supplies, and alerting authorities to their GPS
location.
Markets
The rise of Machine Learning will generate completely new Artificial
Intelligence-based products and services. Entire industries will be created to
service this new software, as well as new products to be added to the Internet
of Things, including a new generation of robots complete with learning
algorithms and the ability to see, hear, and communicate with us using
natural language.
Retail
In the retail sector, we will see enhanced and more accurate personalization.
But instead of merely showing us what we want, Machine Learning will be
dedicated to showing us what we need. Perhaps we’ve been eating too much
fast food this week. A smart algorithm looking out for our well-being would
not throw more and more fast food ads in our face. Instead, reminders about
our health, coupons for gym memberships, or recipes for our favorite salads
might become part of the Artificial Intelligence toolkit for our notifications.
Machine Learning will help us to balance our lives in many ways by using
knowledge about general health, and our own medical records, to provide
information about not only what we want, but also what we might need.
Healthcare
Machine Learning will know almost everything about us and not only
through records on the internet. When we visit our doctor to complain about a
sore shoulder, Machine Learning might inform our GP about how we are
prone to slouch at work, possibly altering the doctor’s diagnosis from a
prescription for analgesics to a prescription for exercises to do at work, as
well as some tutoring on better sitting posture.
On the diagnostic side, Machine Learning will do the busy work it is best at:
examining our x-rays and blood-work and mammograms, looking for
patterns human beings cannot see, getting pre-emptive diagnostic information
to our doctors so we can head off serious illness at early stages, or perhaps
before it even happens. Doctors will be freed up to spend time with their
patients, providing the human touch so often missing in medicine.
Many if not most surgeries will be performed by Artificial Intelligence-
enabled robots, either assisting human surgeons, being supervised by them, or
even, eventually, working fully autonomously. Machine Learning will
monitor our blood gasses under anesthesia, our heart rate, and other health
measures during operations, and react in milliseconds should something
appear to be wrong. Iatrogenic disease will decrease dramatically, if not
disappear completely.
The Environment and Sustainability
Machine Learning will be able to study the movement of people in cities,
what they use and don’t use, their patterns of use, how they travel, and where.
Deep learning from this data will allow city planners to employ Machine
Learning algorithms to design and construct both more efficient and pleasant
cities. It will allow massive increases in density without sacrificing quality of
life. These efficiencies will reduce or even possibly eliminate net carbon
emissions from cities.
Augmented Reality
When we wear Google (or Microsoft or Apple or Facebook) glasses in the
future, the embedded processors, video capture, audio capture, and
microphones on these devices will do much more than give us directions to
find a location. Machine Learning will be able to see what we see and
provide explanations and predictive guidance throughout our day. Imagine
having your day “painted” with relevant information on interior walls and
doors, and outside on buildings and signs, guiding you through your schedule
with the information you need right when you need it.
Information Technology
Machine Learning will become a tool people and businesses can apply as
needed, much like SaaS (Software as a Service). This MLaaS (Machine Learning
as a Service) will allow software to be aware of its
surroundings, to see us, to hear us, and to speak to us in natural language.
Connected to the internet, every device will become smart, and generate an
ecosphere around us that attends to our needs and concerns, often before we
even realize we have them. These “Cognitive Services” will provide APIs
and SDKs, leading to rapid “smart” software development and deployment.
Specialized hardware will increase the speed of Machine Learning training,
as well as increase its speed in servicing us. Dedicated AI chips will bring
about a huge change in Artificial Intelligence speed and ubiquity.
Microcomputers will come equipped with Machine Learning capabilities so
that even the smallest device will be able to understand its surroundings.
Where there is no convenient power supply, these devices will run on dime-
sized batteries, lasting for months or years of service before needing
replacement. Currently, these microcomputers cost about 50 cents each. This
price will drop. They will be deployed practically everywhere.
Quantum computing and Machine Learning will merge, bringing solutions to
problems we don’t even know we have yet.
We will see the rise of intelligent robots of every size, make and description,
all dedicated to making our lives better.
Trust Barriers
Natural speech means we will be able to talk to our devices and be
understood by them. The current trust barriers between some people, business
sectors, and governments will slowly break down as Machine Learning
demonstrates its reliability and effectiveness. Improved unsupervised
learning will reduce the time required to develop new Machine Learning
software with required specifications.
Conclusion
Thank you for making it through to the end of Machine Learning for
Beginners. Let’s hope it was informative and able to provide you with all of
the information you needed to begin to understand this extensive topic.
The impact of Machine Learning on our world is already ubiquitous. Our
cars, our phones, our houses, and so much more are already being controlled
and maintained through rudimentary Machine Learning systems. But in the
future, Machine Learning will radically change the world. Some of those
changes are easy to predict. In the next decade or two, people will no longer
drive cars. Instead, automated cars will drive people. But in many other ways,
the effect of Machine Learning on our world is difficult to predict. Will
Machine Learning algorithms replace so many jobs, from trucking to
accounting to many other disciplines, that there won’t be much work left for
people? In 100 years, will there be work for anyone at all? We don’t know
the answer to questions like these because there is so far no limit to what
Machine Learning can accomplish, given time and data and the will to use it
to achieve a particular task.
The future is not necessarily frightening. If there is no work in the future, it
won’t mean that things aren’t getting done. Food will still be grown, picked,
transported to market, and displayed in stores. It’s just that people won’t have
to do any of that labor. As a matter of fact, stores won’t be necessary either,
since the food we order can be delivered directly to our homes. What will the
world be like if human beings have almost unlimited leisure time? Is this a
possible future?
The only real certainty about artificial intelligence and Machine Learning is
that it is increasing in both speed of deployment and in areas which it can
influence. It promises many benefits and many radical changes in our society.
Learning about this technology and writing your own machine learning
programs will deepen your knowledge
of this increasingly popular field and put you at the forefront of
future developments.
Conclusion
Machine learning is a branch of artificial intelligence that involves the design
and development of systems capable of showing an improvement in
performance based on their previous experiences. This means that, when
reacting to the same situation, a machine should show improvement over
time. With machine learning, software systems are able to predict
accurately without having to be programmed explicitly. The goal of machine
learning is to build algorithms which can receive input data then use
statistical analysis so as to predict the output value in an acceptable range.
Machine learning originated from pattern recognition and the theory that
computers are able to learn without the need for programming them to
perform tasks. Researchers in the field of artificial intelligence wanted to
determine whether computers are able to learn from data. Machine learning is
an iterative approach, and this is why models are able to adapt as they are
being exposed to new data. Models learn from their previous computations so
as to give repeatable, reliable results and decisions.
References
Agile Actors. (2019). Scikit-learn Tutorial: Machine Learning in Python - Agile Actors #learning.
Retrieved from https://learningactors.com/scikit-learn-tutorial-machine-learning-in-python/
Bose, D. (2019). Benefits of Artificial Intelligence to Web Development. Retrieved from
http://www.urbanui.com/artificial-intelligence-web-development/
Internet of things. (2019). Retrieved from https://en.wikipedia.org/wiki/Internet_of_things
DataMafia. (2019). DataMafia. Retrieved from https://datamafia2.wordpress.com/author/datamafia007/
Karbhari, V. (2019). Top AI Interview Questions & Answers — Acing the AI Interview. Retrieved
from https://medium.com/acing-ai/top-ai-interview-questions-answers-acing-the-ai-interview-
61bf52ca34d4
Newell, G. (2019). How to Decompress Files With the "gz" Extension. Retrieved from
https://www.lifewire.com/example-uses-of-the-gunzip-command-4081346
Pandas, D. (2019). Different types of features to train Naive Bayes in Python Pandas. Retrieved from
https://stackoverflow.com/questions/32707914/different-types-of-features-to-train-naive-bayes-in-
python-pandas/32747371#32747371
Robinson, S. (2019). K-Means Clustering with Scikit-Learn. Retrieved from https://stackabuse.com/k-
means-clustering-with-scikit-learn/
Top 10 Machine Learning Algorithms. (2019). Retrieved from https://www.dezyre.com/article/top-10-
machine-learning-algorithms/202
Samuel, N. (2019). Installing TensorFlow on Windows. Retrieved from
https://stackabuse.com/installing-tensorflow-on-windows/
[1]
Information retrieved from Robinson, S. (2019). K-Means Clustering with Scikit-Learn. Retrieved
from https://stackabuse.com/k-means-clustering-with-scikit-learn/
[2]
Information obtained from DataMafia. (2019). DataMafia. Retrieved from
https://datamafia2.wordpress.com/author/datamafia007/
[3]
Information obtained from Robinson, S. (2019). K-Means Clustering with Scikit-Learn. Retrieved
from https://stackabuse.com/k-means-clustering-with-scikit-learn/
[4]
Information obtained from Robinson, S. (2019). K-Means Clustering with Scikit-Learn. Retrieved
from https://stackabuse.com/k-means-clustering-with-scikit-learn/
[5]
Information obtained from Pandas, D. (2019). Different types of features to train Naive Bayes in
Python Pandas. Retrieved from https://stackoverflow.com/questions/32707914/different-types-of-
features-to-train-naive-bayes-in-python-pandas/32747371#32747371
[6]
Information obtained from Pandas, D. (2019). Different types of features to train Naive Bayes in
Python Pandas. Retrieved from https://stackoverflow.com/questions/32707914/different-types-of-
features-to-train-naive-bayes-in-python-pandas/32747371#32747371
[7]
("Internet of things", 2019)
[8]
("Top 10 Machine Learning Algorithms", 2019)