Quantum Computing: A Tool in Big Data Analytics
AKSHAT GAURAV1, KWOK TAI CHUI2, FRANCESCO COLACE3
1 Ronin Institute, Montclair, USA. Email: akshat.gaurav@ieee.org
2 Hong Kong Metropolitan University (HKMU), Hong Kong. Email: jktchui@hkmu.edu.hk
3 University of Salerno, Italy. Email: fcolace@unisa.it
ABSTRACT Massive volumes of data, referred to as "big data," hold enormous potential, and big data has therefore been a prominent research topic for the past two decades. Public and private sector organisations use big data analytics to improve the services they provide. Extracting relevant information from such vast amounts of data requires careful data management and analysis; otherwise, finding answers in it becomes like searching for a needle in a haystack. Quantum computing comes to the rescue with its many promises for information processing systems, notably in the area of Big Data Analytics. Its power is built on the principles of quantum physics, and because these phenomena have no classical counterpart, conventional computers cannot deliver the same results. In this article, we review the existing literature on Big Data Analytics using Quantum Computing. As a comparatively new field, quantum computing presents a number of open issues; we also highlight the problems, potential, future directions, and methodologies of quantum computing in Big Data analytics.
Variety, Validity, Viability, Volatility and Vulnerability can all be represented by the 10 Vs. The following is an explanation of each 'V' in health care big data.
– Value: in big data analytics, value indicates the significance of the data analysed. Processing medical data with an inaccurate processing approach diminishes the value of the processed data.
– Volume: this represents the amount of data created. Health care information is complicated, and at times it includes a significant amount of noise. Patient records, biometric sensor readings, and x-ray pictures are among the most common types of healthcare data. Global healthcare data was estimated to be 500 petabytes in 2012 and was predicted to grow to 2500 petabytes by the end of 2020; 1.2 to 2.4 exabytes of healthcare big data are created each year.
– Velocity: this is the pace at which healthcare data is processed, i.e., the rate at which data is generated, stored, and then transported. Healthcare data is created at many sites, including laboratories and doctors' offices, and real-time analysis of large healthcare datasets requires quick data interchange between these sites.
– Veracity: a measure of the data's dependability. Accurate data improves the chance that a patient's life will be saved, so data should be gathered from many sources and processed using error-free technologies in order to boost its authenticity.
– Variety: structured, semi-structured, and unstructured data are all types of healthcare data obtained from a variety of sources. Variety thus depicts the wide range of variations that exist in healthcare information, and it poses a significant problem for healthcare data processing.
– Validity: the accuracy of the medical data. Validity is critical for healthcare data, since if the data is not legitimate the processed results are useless. A variety of methods and instruments exist for verifying the accuracy of healthcare data.
– Viability: locating the essential information in a big ocean of data becomes much more difficult because the amount of healthcare information is so enormous. In order to utilise healthcare data effectively, we must first pick out the relevant data.
– Volatility: an enormous quantity of healthcare data is created every second because of the digital revolution, and this new data is added to the previously stored data. The next step is to determine the data's life expectancy: healthcare data loses relevance with time, making the appropriate lifespan of the data a research topic in itself.
– An important goal in healthcare is to create an efficient…
A complete measurement of an n-qubit state |ψ⟩ = Σ_x a_x|x⟩ yields an outcome x with probability |a_x|^2, destroying the state in the process. As a result, even though the amount of information required to describe the quantum state of n qubits grows exponentially as n increases, a measurement can only retrieve n bits of information. Finding a means to exploit quantum computers' exponential state space despite these and other limitations is the primary issue of quantum algorithm creation.
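To make this limitation concrete, the following minimal Python sketch (our own illustration, using NumPy; the state and random seed are arbitrary) simulates a complete measurement of a 3-qubit register: 2^3 = 8 complex amplitudes a_x are needed to describe the state, yet each measurement returns only 3 classical bits.

```python
import numpy as np

n = 3                      # number of qubits
dim = 2 ** n               # 2^n complex amplitudes describe the state

rng = np.random.default_rng(0)

# A random normalised state |psi> = sum_x a_x |x>
amps = rng.normal(size=dim) + 1j * rng.normal(size=dim)
amps /= np.linalg.norm(amps)

# Born rule: outcome x is observed with probability |a_x|^2,
# and the superposition is destroyed by the measurement.
probs = np.abs(amps) ** 2
outcome = int(rng.choice(dim, p=probs))

print(f"measured |{outcome:03b}> : {n} bits read from {dim} amplitudes")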
III. BIG DATA ANALYTICS CHALLENGES
The term "big data" refers to the massive volumes of data that are being created at a fast pace. Optimising customer services is more important than maximising consumption of the data obtained from diverse sources, and big data from biomedical research and healthcare also falls into this category. Big data presents a tremendous difficulty because of the sheer amount of information it contains. For the scientific community, data must be kept in a manner that is readily accessible and usable for efficient analysis. The implementation of high-end computing tools, protocols, and hardware in the clinical environment is another key difficulty in the context of healthcare data. Accomplishing this aim requires experts from several fields, including biology and information technology, as well as statisticians and mathematicians. Pre-installed software solutions provided by analytic-tool makers may make sensor data accessible on a storage cloud, and AI specialists have built data mining and machine learning features into these instruments that can turn data into knowledge. Their implementation would improve the efficiency of healthcare big data collection and analysis, as well as the display of such data. The primary goal is to annotate, integrate, and display this complicated material in a way that facilitates comprehension. Research in biomedicine is hampered by a lack of clear information about the available data. Finally, computer graphics designers have devised visualisation tools that can effectively convey this new information. Big data analysis must also contend with the problem of data heterogeneity: big data in healthcare is difficult to make sense of because of its enormous quantity and highly varied nature. High-power computer clusters accessible through grid computing infrastructures are the most prevalent platforms for executing the software frameworks that support big data analysis. Because of its virtualized storage and dependable services, cloud computing has become a popular choice for businesses; high scalability, high reliability, and complete self-sufficiency are just a few of the benefits that come with such systems. Such platforms may operate as a receiver of data from omnipresent sensors, as a computer to analyse and interpret the data, and as a web-based visualisation tool for the user. Mobile edge computing cloudlets and fog computing may be used in the IoT to process massive data closer to the data source. Applying ML and AI methods to large-scale data analysis on computer clusters requires advanced algorithms, which could be written in a programming language suited to coping with large amounts of data (a sketch follows at the end of this section). As a result, handling the massive amounts of data generated by biomedical research requires both biology and IT expertise; bioinformaticians often have exactly this dual skill set.
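As one illustration of such cluster-based analysis, the hedged sketch below uses PySpark (our choice; the article names no specific framework, and the dataset path and column names are hypothetical) to aggregate biometric sensor readings across a cluster.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("healthcare-big-data").getOrCreate()

# Hypothetical table of biometric sensor readings; Spark splits the file
# into partitions and processes them in parallel across the cluster.
readings = spark.read.parquet("hdfs:///data/sensor_readings.parquet")

# Per-patient aggregation runs as a distributed job, not on one machine.
summary = (
    readings.groupBy("patient_id")
            .agg(F.avg("heart_rate").alias("avg_heart_rate"),
                 F.count("*").alias("n_readings"))
)
summary.show()
```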
IV. BIG DATA AND QUANTUM COMPUTING
An efficient linear and non-linear binary classifier may be implemented on a quantum computer using a support vector machine, with exponential speedups in the size of the vectors and the number of training instances [18]. At the heart of the approach is performing principal component analysis and matrix inversion efficiently, relying on a non-sparse matrix simulation technique. Weinstein discusses the strongly visual approach of dynamic quantum clustering (DQC), which works with large and high-dimensional data: because it uses differences in data density (in feature space) and uncovers hidden subsets, it can handle large, high-dimensional datasets, and a DQC analysis produces a video demonstrating how and why data points are identified as cluster members, with correlations among all the variables assessed. Rebentrost et al. [18] developed a quantum-computer-implemented optimal binary classifier with logarithmic complexity in vector size and number of training examples. The Generalized Eigenvalue Proximal SVM (GEPSVM) was introduced by Marghny et al. [19] to address the SVM complexity problem; real-world data is affected by errors or noise, and dealing with such data can be a difficult challenge, which their approach, named DSA-GEPSVM, addresses. Using quantum computing, Anguita et al. [20] explored how to overcome the challenge of effective SVM training, particularly in the case of digital implementation; experiments in synthetic and real-world scenarios support the theoretical understanding of the behavioural characteristics of standard and improved SVMs, and their study examines the similarities and contrasts between quantum-based optimization and quadratic programming.
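For orientation, the sketch below trains the classical kernel SVM that these quantum proposals aim to accelerate. It is a scikit-learn baseline on synthetic data (our own choice of library, dataset, and parameters, not taken from [18]).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for a large feature matrix.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Kernel SVM training scales polynomially with the number of samples and
# features; this cost is the bottleneck the quantum SVM of [18] targets.
clf = SVC(kernel="rbf").fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```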
A. Future Research Challenges in Big Data and Quantum Computing
Companies are always working to create new methods for managing and analysing massive amounts of data in order to better incorporate that data into their operations. In spite of this, the wide variety of products available has made it difficult to share data. These are a few of the issues we'll touch on briefly in this section.
Data storage is one of the main issues. Many firms are satisfied with storing their own data on their own premises: control over security, access, and uptime are just a few of the benefits. However, scaling and maintaining an on-premises server network can be costly and time-consuming. Cloud-based storage employing IT infrastructure appears to be a more cost-effective and reliable alternative for most healthcare firms, and organizations should only work with cloud service providers that are cognizant of the need for security. For these reasons and more, cloud storage is becoming increasingly popular, and a hybrid approach to data storage may be the most adaptable and viable option for providers with different data access and storage demands.
Cleaning is another issue: after collection, data must be cleaned or scrubbed to guarantee its precision, correctness, consistency, relevance, and purity. Automated logic rules may be used to achieve high levels of correctness and integrity in this cleaning procedure, and machine learning methods with more complex and accurate tools may be used to decrease its cost and duration and to prevent bad data from derailing big data initiatives. Managing large amounts of data is very challenging, particularly when imperfect data is involved, and storing, processing, and sharing data necessitates the creation of a uniform format.
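A minimal sketch of such rule-based cleaning, assuming pandas and a hypothetical table of patient records (the column names and thresholds are illustrative only):

```python
import pandas as pd

# Hypothetical patient records with typical quality problems.
records = pd.DataFrame({
    "patient_id": [1, 2, 2, 3, 4],
    "heart_rate": [72.0, 88.0, 88.0, None, 700.0],  # duplicate, missing, implausible
})

# Automated logic rules applied in sequence.
clean = (
    records.drop_duplicates()              # consistency
           .dropna(subset=["heart_rate"])  # completeness
)
clean = clean[clean["heart_rate"].between(25, 250)]  # physiological plausibility
print(clean)
```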
Protection breaches, hacks, phishing assaults, and even ransomware attacks have made data security a top responsibility for every firm. After a number of vulnerabilities were discovered, a set of technological protections was built for protected stored data. These regulations, known as the HIPAA Security Rules, guide organizations on storage, transmission, authentication procedures, and controls over access, integrity, and auditing. Up-to-date antivirus software, firewalls, encryption of sensitive data, and multi-factor authentication can save a great deal of time and money in the long term.
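As a sketch of encrypting a sensitive record before storage (assuming the Python cryptography package, our choice; key management is deliberately simplified and the record is hypothetical):

```python
from cryptography.fernet import Fernet

# In practice the key would come from a key-management service, not be
# generated inline next to the data it protects.
key = Fernet.generate_key()
cipher = Fernet(key)

record = b'{"patient_id": 42, "diagnosis": "confidential"}'
token = cipher.encrypt(record)  # ciphertext safe to write to disk or cloud

assert cipher.decrypt(token) == record
```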
Having comprehensive, accurate, and up-to-date metadata on all of the stored data is essential to a successful data governance strategy. The metadata would include information such as the date of creation, the purpose of the data, and who was accountable for it. Later scientific research and precise benchmarking might benefit from analysts being able to reproduce prior queries; as a result, data is more usable and "data dumpsters" full of useless data are less likely to be created.
With the use of metadata, businesses would be able to query their data and come up with answers. However, query tools may not be able to access a complete repository of data if datasets are not properly interoperable. Moreover, a full picture of a patient's health may not emerge if distinct dataset components are not adequately integrated, linked, and readily available.
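A minimal sketch of such a metadata record, covering exactly the fields named above (the class and field names are illustrative, not a standard schema):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DatasetMetadata:
    """One governance record per stored dataset."""
    name: str
    created: date  # date of creation
    purpose: str   # why the data was collected
    steward: str   # who is accountable for the data

meta = DatasetMetadata(
    name="icu_vitals_2022",
    created=date(2022, 3, 1),
    purpose="Real-time monitoring of ICU patients",
    steward="data-governance@example.org",
)
print(meta)
```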
Visualizing data with charts, heatmaps, and histograms to show contrasts, together with precise labelling, can make it much simpler for humans to absorb the information and apply it correctly. There are a variety of other data visualisations as well, such as bar charts, pie charts, and scatterplots.
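A minimal plotting sketch (assuming matplotlib, our choice, and synthetic readings) showing the kind of labelled histogram described above:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
heart_rate = rng.normal(75, 12, size=1000)  # synthetic readings

# A labelled histogram lets a reader grasp the distribution at a glance.
plt.hist(heart_rate, bins=30)
plt.xlabel("heart rate (bpm)")
plt.ylabel("number of readings")
plt.title("Distribution of recorded heart rates")
plt.show()
```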
V. CONCLUSION
Scientific revolutions are set to enter a new phase because of Big Data's role in innovation, competitiveness, and productivity. It is a good thing that we will be able to observe this technology leapfrogging in the near future. In this article, we provide a quick introduction to the fundamentals and challenges of Big Data and quantum computing. These technologies are still in the early stages of development, but we are certain that we will see a number of major breakthroughs in the near future. Although Big Data analytics is still in the early stages of development, the current Big Data approaches and tools are unable to tackle all of the genuine Big Data challenges. Consequently, governments and businesses must invest in these technologies in order to reap the benefits of Big Data. More study in these sub-fields is needed to handle the problem of Big Data with the help of quantum computing.

REFERENCES
[1] B. B. Gupta, A. Gaurav, and D. Peraković, "A big data and deep learning based approach for DDoS detection in cloud computing environment," in 2021 IEEE 10th Global Conference on Consumer Electronics (GCCE). IEEE, 2021, pp. 287–290.
[2] C. L. Stergiou et al., "Secure machine learning scenario from big data in cloud computing via internet of things network," in Handbook of Computer Networks and Cyber Security. Springer, 2020, pp. 525–554.
[3] B. B. Gupta, S. Yamaguchi, and D. P. Agrawal, "Advances in security and privacy of multimedia big data in mobile and cloud computing," Multimedia Tools and Applications, vol. 77, no. 7, pp. 9203–9208, 2018.
[4] L. Gyongyosi and S. Imre, "A survey on quantum computing technology," Computer Science Review, vol. 31, pp. 51–71, 2019.
[5] C. P. Chen and C.-Y. Zhang, "Data-intensive applications, challenges, techniques and technologies: A survey on big data," Information Sciences, vol. 275, pp. 314–347, 2014.
[6] K. Yadav et al., "2021 hot topics in machine learning research."
[7] F. J. G. Peñalvo, T. Maan, S. K. Singh, S. Kumar, V. Arya, K. T. Chui, and G. P. Singh, "Sustainable stock market prediction framework using machine learning models," International Journal of Software Science and Computational Intelligence (IJSSCI), vol. 14, no. 1, pp. 1–15, 2022.
[8] K. T. Chui et al., "Enhancing electrocardiogram classification with multiple datasets and distant transfer learning," Bioengineering, vol. 9, no. 11, p. 683, 2022.
[9] D. Singh, "Captcha improvement: Security from DDoS attack," 2021.
[10] A. Gaurav, V. Arya, and D. Santaniello, "Analysis of machine learning based DDoS attack detection techniques in software defined network," Cyber Security Insights Magazine (CSIM), vol. 1, no. 1, pp. 1–6, 2022.
[11] B. Joshi et al., "A comparative study of privacy-preserving homomorphic encryption techniques in cloud computing," International Journal of Cloud Applications and Computing (IJCAC), vol. 12, no. 1, pp. 1–11, 2022.
[12] R. K. S. Rajput, D. Goyal, A. Pant, G. Sharma, V. Arya, and M. K. Rafsanjani, "Cloud data centre energy utilization estimation: Simulation and modelling with idr," International Journal of Cloud Applications and Computing (IJCAC), vol. 12, no. 1, pp. 1–16, 2022.
[13] F. J. G. Peñalvo, A. Sharma, A. Chhabra, S. K. Singh, S. Kumar, V. Arya, and A. Gaurav, "Mobile cloud computing and sustainable development: Opportunities, challenges, and future directions," International Journal of Cloud Applications and Computing (IJCAC), vol. 12, no. 1, pp. 1–20, 2022.
[14] K. Pathoee et al., "A cloud-based predictive model for the detection of breast cancer," International Journal of Cloud Applications and Computing (IJCAC), vol. 12, no. 1, pp. 1–12, 2022.
[15] B. B. Gupta, Modern Principles, Practices, and Algorithms for Cloud Security. IGI Global, 2019.
[16] P. S. Emani, J. Warrell, A. Anticevic, S. Bekiranov, M. Gandal, M. J. McConnell, G. Sapiro, A. Aspuru-Guzik, J. T. Baker, M. Bastiani et al., "Quantum computing at the frontiers of biological sciences," Nature Methods, vol. 18, no. 7, pp. 701–709, 2021.
[17] "The Lens - free & open patent and scholarly search," https://www.lens.org/, accessed: 2023-01-01.
[18] P. Rebentrost, M. Mohseni, and S. Lloyd, "Quantum support vector machine for big data classification," Physical Review Letters, vol. 113, no. 13, p. 130503, 2014.
[19] M. Marghny, R. M. A. ElAziz, and A. I. Taloba, "Differential search algorithm-based parametric optimization of fuzzy generalized eigenvalue proximal support vector machine," arXiv preprint arXiv:1501.00728, 2015.
[20] D. Anguita, S. Ridella, F. Rivieccio, and R. Zunino, "Quantum optimization for training support vector machines," Neural Networks, vol. 16, no. 5-6, pp. 763–770, 2003.
[21] T. A. Shaikh and R. Ali, "Quantum computing in big data analytics: A survey," in 2016 IEEE International Conference on Computer and Information Technology (CIT). IEEE, 2016, pp. 112–115.
[22] R. Dridi and H. Alghassi, "Homology computation of large point clouds using quantum annealing," arXiv preprint arXiv:1512.09328, 2015.