International Conference on Innovative Computing and Communication (ICICC 2020)
Machine Learning Algorithms and their Real-life Applications: A Survey
Meenakshi
Research Scholar, U.I.E.T ([Link]), Rohtak, 124001, India
Abstract:
Machine learning (ML) is the study of statistical and non-statistical methods that improve outcomes and performance by automating the acquisition of knowledge from experience. Over the last several decades, ML has progressed from the efforts of a small group of computer enthusiasts, who explored whether computers could learn to play games and perform basic statistical and mathematical computation, to an independent discipline of research that provides the foundation for the mathematical analysis and computational principles of learning techniques. It has produced numerous algorithms that are now routinely applied to pattern recognition, text analysis, interpretation and many other commercial purposes, and it has become a distinct research interest in data mining for uncovering hidden regularities or anomalies in social data that grows every minute. Continual effort, growth and progressive refinement of ML have enabled the successful production of thousands of AI expert systems that are now in common use in business and industry.
This work focuses on elucidating the idea and evolution of ML, a few prevalent ML algorithms, and their real-life applications.
Keywords: Machine learning, pattern recognition, artificial intelligence
1. Introduction
Data can be handled far more efficiently with the help of machine learning. It is applied in cases where the exact information or pattern in the data cannot be interpreted directly [1]. The need for ML increases day by day along with the rise of abundant and ever-growing datasets. Major industries, from medicine to the military, rely heavily on ML to extract the information they require. Learning from existing data is the foremost purpose of ML, and machines' self-learning has attracted many detailed studies [2] [3]. Many programmers and mathematicians have addressed this problem with several approaches, a few of which are surveyed in the sections below.
Section 2 covers the main types of machine learning, Section 3 outlines real-life applications, and Section 4 concludes the paper.
2. Types of Learning
2.1 Supervised Learning
Algorithms that require external assistance are called supervised machine learning algorithms. In these algorithms, the input dataset is split into a training dataset and a test dataset, and the output variable that is to be predicted or classified is present in the training dataset. The algorithms learn patterns from the training dataset and then apply them to the test dataset for prediction or classification [4].
Fig. 1: Supervised learning algorithm's process flow
The process flow of a supervised learning algorithm is shown in Fig. 1.
There are numerous supervised machine learning algorithms; three of the most well-known are briefed below (a minimal code sketch follows the list):
1. "Decision Tree": A decision tree sorts instances by the values of their attributes and groups them accordingly; it is mainly used for classification. A decision tree consists of nodes and branches: nodes represent the attributes of a group to be classified, and branches represent the values that a node can take [4].
2. "Naïve Bayes": Chiefly used for clustering and classification [6], and heavily used by the text classification industry, its architecture rests on conditional probability. The algorithm builds trees based on the probability of occurrence, which are known as Bayesian networks.
3. "Support Vector Machine": SVM is another top-tier ML technique that works on the principle of margin calculation. It is used for classification by drawing margins so that the distance between the margin and the classes is as large as possible; maximizing this distance minimizes the possible classification error.
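As an illustration of the supervised workflow described above, the following sketch splits a labeled dataset into training and test parts and fits a decision tree, a naïve Bayes classifier and an SVM. The use of scikit-learn, the Iris dataset and the 70/30 split are assumptions made for illustration only, not details from this paper.

# Minimal sketch of the supervised workflow: split the data, train each
# classifier on the training portion, then evaluate on the held-out test set.
# scikit-learn, the Iris data and the 70/30 split are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
    "SVM (margin-based)": SVC(kernel="linear"),
}
for name, model in models.items():
    model.fit(X_train, y_train)        # learn patterns from the training set
    acc = model.score(X_test, y_test)  # apply them to the test set
    print(f"{name}: test accuracy = {acc:.2f}")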
2.2 Unsupervised Learning
Algorithms that learn features from the data itself are called unsupervised learning algorithms. In these algorithms, whenever new data is introduced, its class is recognized from the previously learned features. They are mainly used for clustering and feature reduction, and the two main types, also known as clustering and dimensionality-reduction methods, are the following (a short sketch of both follows the list):

1. "K-Means Clustering": This unsupervised learning algorithm is known as K-Means because it generates 'K' discrete clusters. It creates the clusters automatically when initiated, placing items with similar characteristics in the same cluster; the centre of a cluster is simply the mean of the points assigned to it [9].
2. "Principal Component Analysis": In PCA, computations are made easier and much faster by reducing the dimensionality of the data. A simple 2D example illustrates the technique: plotted on a graph, 2D data occupies two axes, but after PCA is applied the data can be represented in one dimension, leading to easier and faster computation.
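The sketch below illustrates both methods: K-Means grouping points into K clusters around their means, and PCA projecting 2D data onto a single principal axis, as in the example above. The use of scikit-learn, the synthetic data and K = 3 are assumptions for illustration, not taken from the paper.

# Illustrative sketch (assumed scikit-learn and synthetic data, not from the paper).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic 2D data: three loose groups of points.
X = np.vstack([rng.normal(loc=c, scale=0.4, size=(50, 2))
               for c in ([0, 0], [3, 3], [0, 4])])

# K-Means: items with similar characteristics end up in the same cluster;
# each cluster centre is the mean of the points assigned to it.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("Cluster centres:\n", kmeans.cluster_centers_)

# PCA: project the 2D points onto one principal axis (2D -> 1D),
# keeping the direction of greatest variance for faster computation.
pca = PCA(n_components=1)
X_1d = pca.fit_transform(X)
print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Shape before:", X.shape, "after:", X_1d.shape)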
2.3. Semi-Supervised Learning

This method combines the power of (i) supervised and (ii) unsupervised learning techniques. It is best suited to data mining in areas where unlabeled data already exists in abundance and obtaining labeled data is a complicated procedure [15]. Some of the main types of semi-supervised learning algorithms are explained below (a small self-training sketch follows the list):
1. "Generative Models": This is considered one of the oldest semi-supervised learning techniques. It assumes a structure of the form p(a,b) = p(b)p(a|b), where p(a|b) is a mixture distribution, e.g. a Gaussian mixture model. The mixture components are identified within the available unlabeled data, and one labeled example per component is sufficient to confirm the mixture distribution.
2. "Self-Training": In this technique a classifier first trains itself on a portion of labeled data and is then fed the unlabeled data; because the classifier trains itself, it is rightly named "self-training". The predicted labels, together with the previously unlabeled points, are added to the training set.
3. "Transductive SVM": Abbreviated TSVM, this is an extension of SVM in which both the labeled and the unlabeled data are considered. Finding an exact solution for a TSVM is an NP-hard problem. The technique maximizes the margin over both the labeled and the unlabeled data, and in doing so assigns labels to the unlabeled data.
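To make the self-training idea concrete, the sketch below trains a naïve Bayes classifier on a small labeled subset, predicts labels for the unlabeled points, and moves only the high-confidence predictions into the training set before retraining. The use of scikit-learn, the Iris data, the 0.9 confidence threshold and the loop structure are all illustrative assumptions rather than the paper's own procedure.

# Self-training sketch (assumptions: scikit-learn, Iris data, 0.9 threshold).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
labeled = rng.choice(len(y), size=15, replace=False)      # small labeled portion
unlabeled = np.setdiff1d(np.arange(len(y)), labeled)      # the rest is unlabeled

X_lab, y_lab = X[labeled], y[labeled]
X_unlab = X[unlabeled]

for _ in range(5):                         # a few self-training rounds
    if len(X_unlab) == 0:
        break
    clf = GaussianNB().fit(X_lab, y_lab)   # train on the current labeled set
    proba = clf.predict_proba(X_unlab)
    pred = clf.classes_[proba.argmax(axis=1)]   # predicted labels for unlabeled points
    confident = proba.max(axis=1) >= 0.9        # keep only high-confidence predictions
    if not confident.any():
        break
    # Add the confidently pseudo-labeled points to the training set and retrain.
    X_lab = np.vstack([X_lab, X_unlab[confident]])
    y_lab = np.concatenate([y_lab, pred[confident]])
    X_unlab = X_unlab[~confident]

print("Final training-set size:", len(y_lab))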
2.4. Reinforcement Learning

This learning technique rests on the ideas of "trial-and-error search" and "delayed reward" [17]. The learner is given a situation and takes actions that may affect both the current situation and future actions; the best decisions are those actions that lead to the most positive outcome. The learner is not told in advance which activities to carry out; it discovers them only once it is placed in a situation, as illustrated by the sketch below.
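A minimal illustration of trial-and-error learning with delayed reward is tabular Q-learning on a tiny chain world: the agent only receives a reward at the right-most state, so earlier actions are reinforced with a delay. The five-state chain, the learning rate, the discount factor and the random exploration policy are all assumptions chosen for the sketch, not details from the paper.

# Tabular Q-learning sketch (illustrative assumptions: 5-state chain, reward only at the end).
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = move left, 1 = move right
alpha, gamma = 0.5, 0.9               # learning rate and discount factor
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0                                              # start at the left end of the chain
    while s != n_states - 1:                           # episode ends at the right-most state
        a = rng.integers(n_actions)                    # explore by trial and error (random actions)
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0     # the reward is delayed until the goal
        # Off-policy Q-learning update: propagate the delayed reward back to earlier states.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print("Learned Q-values:\n", Q.round(2))
print("Greedy policy (0 = left, 1 = right):", Q.argmax(axis=1))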
2.5. Multitask Learning

This technique speeds up learning because the learners share their respective knowledge with one another and can therefore learn simultaneously rather than individually; its basic objective is to help the other learners reach a better solution. When a multitask learning algorithm is applied to a task, it remembers the procedure and steps followed in solving that problem and reaching a conclusion, and it then reuses them when solving other, similar tasks or problems. For this reason it is also termed an inductive transfer mechanism. A small sketch of one common realisation follows.
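One concrete realisation of sharing knowledge across related tasks is joint estimation with a shared sparsity pattern, as in scikit-learn's MultiTaskLasso, which fits several regression tasks at once and forces them to agree on which input features matter. The synthetic data and the regularisation strength below are assumptions for illustration, and this is only one of many multitask formulations, not the paper's specific method.

# Multitask sketch: two related regression tasks fitted jointly (assumed scikit-learn,
# synthetic data). The tasks share which input features are considered relevant.
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_coef = np.zeros((2, 10))
true_coef[:, :3] = rng.normal(size=(2, 3))         # both tasks depend on the same 3 features
Y = X @ true_coef.T + 0.1 * rng.normal(size=(100, 2))

model = MultiTaskLasso(alpha=0.05).fit(X, Y)        # learn both tasks simultaneously
print("Shared non-zero features:", np.flatnonzero(model.coef_.any(axis=0)))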
2.6. Ensemble Learning

This technique has been in mainstream use since the 1990s. Several distinct learners are grouped together to form one combined learner that can do the job better than any individual learner; the individual learners can be decision trees, naïve Bayes classifiers, neural networks and so on. It has been shown that a collection of learners generally performs better than the individual learners [20]. The two most popular ensemble learning methods are explained below [21], followed by a short sketch of both:

1. "Boosting": This ensemble method is applied to reduce bias and variance. It generates a collection of weak learners and combines them into one strong learner. The strong learner is better than the weak learners because it is a classifier that is strongly correlated with the true classification, unlike a weak learner, which is only slightly correlated with the true classification [22].
2. "Bagging": Bagging, also called bootstrap aggregating, reduces variance and also helps to handle overfitting. It is used wherever the accuracy and stability of an ML algorithm need to be improved, and it is applicable to both classification and regression [23].
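The sketch below contrasts the two methods on the same data using off-the-shelf scikit-learn implementations: bagging aggregates learners grown on bootstrap resamples, while AdaBoost builds a strong learner from a sequence of re-weighted weak learners. The library, the dataset and the estimator counts are assumptions for illustration.

# Ensemble sketch (assumed scikit-learn and Iris data): bagging vs. boosting.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Bagging: many learners fitted on bootstrap resamples, predictions aggregated by voting.
bagging = BaggingClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Boosting: weak learners added sequentially, each focusing on the previous mistakes.
boosting = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

print("Bagging test accuracy :", round(bagging.score(X_test, y_test), 2))
print("Boosting test accuracy:", round(boosting.score(X_test, y_test), 2))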
3. Applications

It is worth mentioning that until about 1985 machine learning had produced no revolution and had no viable commercial applications. With the exploration and advancement of ML, its importance has risen along with its vital real-life applications, some of which are briefed below.
3.1 Speech Recognition
Machine learning approaches are heavily used by all the speech recognition systems available on the market, whether on mobile phones or computer systems. Most of these systems implement learning in two separate phases: (i) pre-shipping, speaker-independent training and (ii) post-shipping, speaker-dependent training.
3.2 Computer Vision

ML is employed for precision by most computer vision systems, such as facial-recognition software and apps and systems that classify microscopic images of cells. As one example, the U.S. Post Office employs a handwriting analyzer, a computer vision system trained to read individual handwriting, which automatically sorts letters and mail with handwritten addresses at a notable accuracy level of about 85%.
3.3 Bio-Surveillance
Government bodies, especially those related to the health sector, employ machine learning algorithms to track outbreaks of disease. A good example is the RODS project in western Pennsylvania, which collects the admission reports submitted to emergency wards in all of the region's clinics. The ML system is competent enough to detect aberrant symptoms among admitted patients using their profiles, and it also monitors the patterns and geographic distribution of diseases among patients. Ongoing research is attempting to collect more detailed data, such as the history of over-the-counter medicine purchases. These automated learning methods handle the intricacy of complex and dynamic datasets with great efficiency.
3.4 Robot or Automation Control

ML techniques are heavily used in the mechanical and automotive industries, from bottling plants to the driverless cars of Google, which use ML to guide the car from compiled terrain and road data, and in aviation, where control strategies are learned for stable flight and aerobatics of aircraft and helicopters.

3.5 Empirical Science Experiments

ML is widely used for research in most data-intensive scientific disciplines; for example, it is used to identify aberrant cosmic objects in astrophysics, and it is applied in genetics, neuroscience and psychological studies. Additional pivotal roles that ML plays in daily life include predictive analysis (weather forecasting, stock market prediction, market surveys, etc.), fraud detection, topic identification, and spam alerting and filtering.

4. Conclusion

This work reviews several ML algorithms. About fifty to sixty years ago, ML was considered the material of science fiction only, but in the modern era it has become an essential part of our daily lives, serving us at multiple levels, from searching pictures to driving cars and buses. We must thank the many geniuses, mathematicians, computer scientists, experts and industrialists, who powered the vision of learning machines. ML sustains continual progress and advancement of its algorithms through its regular learning and growing nature. Various types of algorithms have been explained above, with examples, to assist a quick glance over the vast ocean of ML. Machine learning can be used for generic purposes and, at the same time, can handle very specific task-oriented challenges. With the regular and consistent evolution of ML algorithms, it is evident that ML, equipped with advanced instruments such as artificial intelligence, will continue to transform business and lifestyle and will certainly feature predominantly in the times to come.

References

[1] W. Richert, L. P. Coelho, "Building Machine Learning Systems with Python", Packt Publishing Ltd., ISBN 978-1-78216-140-0
[2] M. Welling, "A First Encounter with Machine Learning"
[3] M. Bowles, "Machine Learning in Python: Essential Techniques for Predictive Analytics", John Wiley & Sons Inc., ISBN 978-1-118-96174-2
[4] S. B. Kotsiantis, "Supervised Machine Learning: A Review of Classification Techniques", Informatica 31 (2007) 249-268
[5] L. Rokach, O. Maimon, "Top-Down Induction of Decision Trees Classifiers – A Survey", IEEE Transactions on Systems
[6] D. Lowd, P. Domingos, "Naïve Bayes Models for Probability Estimation"
[7] [Link]
[8] D. Meyer, "Support Vector Machines – The Interface to libsvm in package e1071", August 2015
[9] S. S. Shwartz, Y. Singer, N. Srebro, "Pegasos: Primal Estimated sub-Gradient Solver for SVM", Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, 2007
[10] [Link]
[11] P. Harrington, "Machine Learning in Action", Manning Publications Co., Shelter Island, New York, 2012
[12] [Link]
[13] K. Alsabati, S. Ranaka, V. Singh, "An efficient k-means clustering algorithm", Electrical Engineering and Computer Science, 1997
[14] M. Andrecut, "Parallel GPU Implementation of Iterative PCA Algorithms", Institute of Biocomplexity and Informatics, University of Calgary, Canada, 2008
[15] X. Zhu, A. B. Goldberg, "Introduction to Semi-Supervised Learning", Synthesis Lectures on Artificial Intelligence and Machine Learning, 2009, Vol. 3, No. 1, Pages 1-130
[16] X. Zhu, "Semi-Supervised Learning Literature Survey", Computer Sciences, University of Wisconsin-Madison, No. 1530, 2005
[17] R. S. Sutton, "Introduction: The Challenge of Reinforcement Learning", Machine Learning, 8, Pages 225-227, Kluwer Academic Publishers, Boston, 1992
[18] L. P. Kaelbling, M. L. Littman, A. W. Moore, "Reinforcement Learning: A Survey", Journal of Artificial Intelligence Research, 4, Pages 237-285, 1996
[19] R. Caruana, "Multitask Learning", Machine Learning, 28, Pages 41-75, Kluwer Academic Publishers, 1997
[20] D. Opitz, R. Maclin, "Popular Ensemble Methods: An Empirical Study", Journal of Artificial Intelligence Research, 11, Pages 169-198, 1999
[21] Z. H. Zhou, "Ensemble Learning", National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
[22] [Link]
[23] [Link]