Fraud Detection in Banking Transactions Using Machine Learning
All content following this page was uploaded by Chetan J. Shelke on 13 March 2024.
Abstract - Vulnerabilities in banking systems expose customers and banks to fraudulent acts that cause severe damage in terms of lost money and reputation. Financial fraud in banks is estimated to cause significant financial losses every year. Early detection helps to mitigate fraud by making it possible to develop a counter-strategy and recover from such losses. This paper proposes a machine learning-based approach to fraud detection. The artificial intelligence (AI) based model will speed up check verification to counteract counterfeits and lower the damage. We analyzed several intelligent algorithms trained on a public dataset to find the correlation of certain factors with fraudulence. The dataset used for this research is resampled to reduce its high class imbalance and is then analyzed using the proposed algorithm for better accuracy.

Keywords – Credit card fraud, fraudulent transactions, KNN classifier, Random Forest, XGBoost, Blockchain, Artificial intelligence

I. INTRODUCTION

The banks of the future will be very different in their functionalities from the banks of today. These changes are due to changes in infrastructure, services, people, and skill sets. This transformation is driven by the implementation of financial technologies in banking. Most banks are capable of adopting innovative technologies to deliver financial services, and this changes the role of banking. New technologies such as blockchain [18], AI, big data, digital payment processing, peer-to-peer lending, crowdfunding, and robo-advisors play a vital role in delivering banking services.

What is the need for these technological revolutions in banking? As technology evolves, the banking industry is at the forefront of adopting it to deliver better customer service. Many times, however, financial crises have adversely affected these new ventures in the banking industry, and as a result innovation was a very distant priority. At the same time, many new technologies have proved to be game changers for transforming the conventional banking system into customer-friendly banks. Still, a gap was created between what the bank was offering to its customers and the experience and convenience they expected. Figure (1) represents the different banking activities supported by FinTech companies to improve customer experience by implementing AI technology [22]. This gap has been a research topic for many researchers. The traditional banking system is also wary of this technological growth, given the expectations and requirements of touch points with customers and the need for trust and confidence in these technologies. To augment this and provide better technological support, hundreds of new FinTech companies offer products and services to the banks: P2P lending provides consumers with alternatives to the loans that were already available in the banks, and robo-advisory platforms offer customers a set of user-friendly solutions. These services are highly visible and cost-effective. They are very convenient for consumers, with a GUI front end, and leave the back-end processing, such as post-dated settlement, consolidation, and regular reporting, as in conventional banks. This changes the future banking model: the traditional banking operation stays at the back end, becoming a commoditized utility provider, while the technological front end controls the customer experience. This technological innovation in banking is also connected to several other positive developments in the related industrial segment.

Fig.1. AI Technology to improve customer experience in Banking Activities

AI-powered chatbots that mimic human conversation, together with messaging apps, are replacing the activities of back-end services in call centers. Biometric data and iris scanning are used as an alternative to the passwords and tokens used for transactions. Other technologies enabled by FinTech, such as IoT, wearable technologies, and ben-in-banking, are common in day-to-day banking operations, most of them offering gamification services to end users. To serve their customers, banks today need to change their mode of service. By adopting advanced technologies and embedding them in their operations as part of their culture and innovation across the organization, they may succeed in the evolution of the banking industry. This has consequences on many fronts. The City bank analysis indicates that 30% of bank jobs will disappear over the next 10 years. Some research estimates that this job loss will be more than 50%. This has consequences for many financial institutions across the globe. It is not just a job loss; it is also connected with several economic sectors around banking, such as accounting firms, law firms, hotels, and service businesses. The new jobs created in the FinTech industry are very few in number, and they have entirely different profiles requiring very different skill sets. Those skills are essential for today’s banking systems. Among all the challenges described above, one of the most common challenges faced by most banks today is fraudulent transactions and malicious user behavior. Fraudulent transactions are a rising challenge for the banking industry, causing financial loss and damage to reputation. A survey report indicates that only about 56% of the fraud in banking is reported [15]. It clearly indicates that a better fraud detection
978-1-6654-9260-7/23/$31.00 2023
c IEEE 221
Authorized licensed use limited to: Alliance College of Engineering and Design Bangalore. Downloaded on March 13,2024 at [Link] UTC from IEEE Xplore. Restrictions apply.
model is of paramount importance for most banks, whether small or large.

This research work aims to design an intelligent system with a machine learning model that is predictive and adaptive, in order to detect fraudulent customer activity in banking transactions. The paper is structured as follows. Section II reviews the literature and the related work completed by other researchers, and Section III describes the technological impact on banking and the digital revolution in India. Section IV describes the role of AI in risk management and governance, and Section V presents the fraud analysis using machine learning algorithms, followed by a conclusion.

II. LITERATURE REVIEW

Statistical methods can be used for fraud detection. Here the statistical distribution of the dataset is analysed for anomalous behavior of fraudsters by using Linear Discriminant Analysis and Logistic Regression [1]. The authors used a variety of data mining techniques for real-time fraud detection using historical data [1]. The research work in [2] describes methods to detect fraud using the KNN algorithm and an outlier detection mechanism. The model helps in detecting the malicious behavior of fraudsters. The authors in [3] used an ensemble technique including the Random Forest model to analyze normal transactions and compared its fraud detection performance with that of neural networks. The work in [4] presented a fraud detection method for credit card transactions and analyzed the data using a Whale-algorithm-optimized backpropagation network. The authors in [4][6] analyzed already classified results for detecting credit card fraud using an imbalanced dataset. K-means clustering is used for sampling groups of fraudulent transactions. The authors also used genetic algorithms for grouping fraudulent transactions. Multiple machine learning algorithms such as KNN, Logistic Regression, and Naïve Bayes were used for analyzing the available dataset. An enhanced study demonstrated that KNN outperforms the other two methods [2][6]. The performance was assessed by precision, recall, Matthews correlation coefficient, balanced classification rate, and specificity. A unique fraud detection technique is proposed in [7] using Big Data technology, with a new method known as Scalable Real-time Fraud Finder (SCARFF) that uses different data analysis tools such as Spark, Cassandra, and Kafka. It makes real-time data analysis possible with a large amount of transaction data. The advantages of the proposed system are higher accuracy, fault tolerance, and scalability. In [8] the researchers presented a feature engineering method to reduce the number of false positives that normally arise in anomaly detection algorithms [16].

III. TECHNOLOGICAL IMPACT ON BANKING

Figure (2) shows a global banking technology radar; these disruptive technologies will shape the future of banking. The different cutting-edge technologies are pivotal for today’s banking system. Augmented reality: to enhance customer experience. Blockchain: to enable multiple parties to access the same data simultaneously using a distributed ledger [18]. Robotic process automation: to mimic human action and judgment but at a higher speed, scale, and quality. Quantum computing: to work out complex data operations. Artificial intelligence: to make better decisions using historical data. API platforms: designed to work through APIs when a front-end experience is connected to a back-end execution. Prescriptive security: to analyze the early visibility of threats and cyber-attacks. Hybrid cloud: designed to allow the bank to offer innovative new offerings to its customers. Instant payment: to provide ubiquitous online transactions. The AI in financial services initiative sets out to explore the multiplicative impacts of emerging technologies.

Fig.2. Global banking technology radar

A. India Digital Revolution

With the momentum of the digital revolution in the world, India has also gained a defining momentum. It was estimated that the digital sector will double in size in 2025. Complete digitalization is expected to foster widespread economic growth and enhance employability through incremental value addition across various industrial sectors, such as manufacturing, healthcare, education, and logistics. In the last decade, there has been a significant change in the field of digitalization in India. This technological disruption has been enabled by massive growth in the IT sector and by the demography of the end users. The country has the world’s second-largest digital ecosystem, with 700 million internet users, a figure estimated to reach 829 million by the end of 2021, and an equal number of mobile users.

Source: Ministry of Electronics and IT
Fig.3. Internet and Mobile users in India

This rapid shift is due to the catalytic role played by the Government in moving its services to digital platforms. Some of the key achievements are India becoming the second-fastest digital adopter among the top 17 digital economies, e-governance and digital identities, e-commerce growth and penetration of mobile internet access, growth of mobile and internet access, and adoption of digital media by citizens. With a faster pace of changes, there is exponential growth in the
222 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE)
country’s digital economy, which almost doubled by 2020 as compared to 2017 and is expected to become a USD 5 trillion in 2024 and a leap of USD 1 trillion by 2025.

Fig.4. Indian Digital Economy

As a result of this, there is a significant cost saving, estimated at about USD 447 billion in 2023, from the implementation of AI applications in front-end, back-end, and middle-end operations, at 45%, 7%, and 50% respectively. Front-end operations include the customer interface, personalized services, user authentication and validation, and wealth management. Back-end operations cover back-end processes, business and strategy insights, and regulatory compliance.

Middle-layer operations perform crucial tasks such as the detection of fraudulent customer behavior, risk identification and mitigation, AML, loan approval, and KYC verification. Figure (6) indicates the four major areas that benefitted from the use of AI (UBS Evidence Lab).
IV. ROLE OF AI IN RISK MANAGEMENT AND GOVERNANCE

There are other potential benefits and opportunities provided by AI implementation. Consequently, there are challenges that need to be properly managed. Analysis shows that the main risks faced by retail banks are the quality of the data used for analysis and the confidentiality of the data taken from the data store for analysis. No AI model can achieve good accuracy unless the data considered for analysis is appropriate and reliable. To protect the privacy of customer data, a high level of confidentiality is to be maintained during the data analysis. Validation of the model used is another important requirement, both to achieve better performance and a high degree of interoperability and to gain support from management and regulators. Wider adoption of AI applications in banking operations leads to new challenges in the operational, legal, reputational, and strategic areas. The level of these risks is different for different types of banking services. Some retail banks believe that the implementation of AI may substitute human operators and may add legal risks, but that its impact on reputation may be minimal.

$$P(kx) = f(k)\,P(x)$$

If $\int P(x)\,dx = 1$, then $\int P(kx)\,dx = \frac{1}{k}$.

Normalization gives $f(k) = \frac{1}{k}$.

Differentiating with respect to $k$ and setting $k = 1$, we get $x P'(x) = -P(x)$, with the solution $P(x) = \frac{1}{x}$.

This does not represent a proper probability distribution. The distribution of the first digit is shown as a percentage in figure (10). The frequency of the first digits follows the logarithmic relation

$$F_a = \log_{10}\left(\frac{a + 1}{a}\right)$$

where $F_a$ is the frequency of the digit ‘a’ in the first place of a number. Table (1) gives the observed and computed frequencies.
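The logarithmic first-digit relation used in this part of the paper is commonly known as Benford's law. As a rough illustration of how observed and computed frequencies might be compared, the following minimal sketch computes the expected frequency for each leading digit and the observed share in a list of amounts. The function names and the sample amounts are hypothetical and are not taken from the paper's dataset.

```python
import math
from collections import Counter


def benford_expected(digit: int) -> float:
    """Expected first-digit frequency F_a = log10((a + 1) / a)."""
    return math.log10((digit + 1) / digit)


def first_digit(value: float) -> int:
    """Return the leading non-zero digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '4.763...e+03'.
    text = f"{abs(value):.15e}"
    return int(text[0])


def observed_frequencies(amounts):
    """Observed share of each leading digit 1..9 among non-zero amounts."""
    digits = [first_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Hypothetical transaction amounts, used only to exercise the functions.
sample_amounts = [12.5, 130.0, 1999.99, 27.0, 310.4, 45.0, 5.75, 1020.0, 18.3, 250.0]

observed = observed_frequencies(sample_amounts)
expected = {d: benford_expected(d) for d in range(1, 10)}

for d in range(1, 10):
    print(f"digit {d}: observed {observed[d]:.3f}, expected {expected[d]:.3f}")
```

A large gap between the observed and expected columns on real transaction data could be treated as a signal worth investigating, although a sample as small as the one above is far too short to draw any conclusion.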
dataset, one can perform oversampling or undersampling. We perform an exploratory data analysis on the dataset to capture its features. In data cleansing, the available categorical features are transformed into numerical values. An oversampling technique called SMOTE (Synthetic Minority Over-sampling Technique) is used here. It creates new data points from the minority class using neighbouring instances, so that generated samples are not biased and the base accuracy

The KNN algorithm identifies similar things that exist in close proximity [2]; that is, similar data points lie close to each other. The similarity can be calculated with a distance function, such as the Euclidean distance for continuous variables:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
boosting algorithm used in the research, which combines weak learners into a strong learner. Over multiple iterations, a weak learner is fitted, the class labels are predicted, and the loss is calculated. This process repeats until a certain threshold is reached; because each iteration follows the gradient of the loss, it is known as a gradient descent optimization problem, or gradient boosting. XGBoost is one such gradient boosting technique, and it also reduces overfitting. Thanks to parallel processing, it is much faster than conventional gradient boosting.

Fig 13. Fraud based on gender

CONCLUSION

This research proposes the use of machine learning algorithms to detect fraud in banking applications. A publicly available dataset from UCI is analyzed. The dataset is highly imbalanced and biased toward the majority class. This problem is tackled with the synthetic minority over-sampling technique (SMOTE). Implementation issues encountered with the KNN and Random Forest algorithms are handled by using XGBoost as the boosting method. The performance achieved using the model was 97.74%. In the analysis of the dataset, we found that people in the age group of 19-25 years are more likely to be fraudulent than other customer demographics.

REFERENCES

[1] R. Rambola, P. Varshney and P. Vishwakarma, “Data Mining Techniques for Fraud Detection in Banking Sector,” 2018 4th International Conference on Computing Communication and Automation (ICCCA), Greater Noida, India, 2018, pp. 1-5, doi: 10.1109/CCAA.2018.8777535.
[2] N. Malini and M. Pushpa, “Analysis on credit card fraud identification techniques based on KNN and outlier detection,” 2017 Third International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), Chennai, 2017, pp. 255-258, doi: 10.1109/AEEICB.2017.7972424.
[3] Ishan Sohony, Rameshwar Pratap, and Ullas Nambiar. 2018. Ensemble learning for credit card fraud detection. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data (CoDS-COMAD ’18). Association for Computing Machinery, New York, NY, USA, 289–294. DOI:[Link]
[4] C. Wang, Y. Wang, Z. Ye, L. Yan, W. Cai, and S. Pan, “Credit Card Fraud Detection Based on Whale Algorithm Optimized BP Neural Network,” 2018 13th International Conference on Computer Science Education (ICCSE), Colombo, 2018, pp. 1-4, doi: 10.1109/ICCSE.2018.8468855.
[5] I. Benchaji, S. Douzi and B. ElOuahidi, “Using Genetic Algorithm to Improve Classification of Imbalanced Datasets for Credit Card Fraud Detection,” 2018 2nd Cyber Security in Networking Conference (CSNet), Paris, 2018, pp. 1-5, doi: 10.1109/CSNET.2018.8602972.
[6] John O. Awoyemi, Adebayo Olusola Adetunmbi, and Samuel Adebayo Oluwadare. Credit card fraud detection using machine learning techniques: A comparative analysis. 2017 International Conference on Computing Networking and Informatics (ICCNI), pages 1–9, 2017.
[7] Fabrizio Carcillo, Andrea Dal Pozzolo, Yann-Aël Le Borgne, Olivier Caelen, Yannis Mazzer, and Gianluca Bontempi. Scarff: a scalable framework for streaming credit card fraud detection with spark. Information Fusion, 41:182–194, 2018.
[8] Galina Baader and Helmut Krcmar. Reducing false positives in fraud detection: Combining the red flag approach with process mining. International Journal of Accounting Information Systems, 2018.
[9] Ravisankar P, Ravi V, Raghava Rao G, and Bose, Detection of financial statement fraud and feature selection using data mining techniques, Elsevier, Decision Support Systems Volume 50, Issue 2, p491-500 (2011).
[10] K. Seeja, and M. Zareapoor, “FraudMiner: A Novel Credit Card Fraud Detection Model Based on Frequent Itemset Mining,” The Scientific World Journal, 2014, pp. 1-10.
[11] C. Tyagi, P. Parwekar, P. Singh, and K. Natla, “Analysis of Credit Card Fraud Detection Techniques,” Solid State Technology, vol. 63, no. 6, 2020, pp. 18057-18069.
[12] C. Chee, J. Jaafar, I. Aziz, M. Hassan, and W. Yeoh, “Algorithms for frequent itemset mining: a literature review,” Artificial Intelligence Review, vol. 52, 2019, pp. 2603–2621.
[13] S. Kiran, J. Guru, R. Kumar, N. Kumar, D. Katariya, and M. Sharma, “Credit card fraud detection using Naïve Bayes model based and KNN classifier,” International Journal of Advance Research, Ideas and Innovations in Technology, vol. 4, 2018, pp. 44-47.
[14] Pumsirirat, A.; Yan, L. Credit Card Fraud Detection Using Deep Learning based on Auto-Encoder and Restricted Boltzmann Machine. Available online: [Link] Credit_Card_Fraud_Detection_Using_Deep_ [Link] (accessed on 23 February 2021).
[15] PwC’s Global Economic Crime and Fraud Survey 2020. Available online: [Link] (accessed on 30 November 2020).
[16] Pourhabibi, T.; Ong, K.L.; Kam, B.H.; Boo, Y.L. Fraud detection: A systematic literature review of graph-based anomaly detection approaches. Decis. Support Syst. 2020, 133, 113303.
[17] Lucas, Y.; Jurgovsky, J. Credit card fraud detection using machine learning: A survey. arXiv 2020, arXiv:2010.06479.
[18] Podgorelec, B.; Turkanović, M.; Karakatič, S. A Machine Learning-Based Method for Automated Blockchain Transaction Signing Including Personalized Anomaly Detection. Sensors 2020, 20, 147.
[19] Synthetic Financial Datasets for Fraud Detection. Available online: [Link] (accessed on 30 November 2020).
[20] Ma, T.; Qian, S.; Cao, J.; Xue, G.; Yu, J.; Zhu, Y.; Li, M. An Unsupervised Incremental Virtual Learning Method for Financial Fraud Detection. In Proceedings of the 2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA), Abu Dhabi, United Arab Emirates, 3–7 November 2019; pp. 1–6.
[21] Puh, M.; Brkić, L. Detecting Credit Card Fraud Using Selected Machine Learning Algorithms. In Proceedings of the 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 20–24 May 2019.
[22] Ryman-Tubb, N.F.; Krause, P.J.; Garn, W. How Artificial Intelligence and machine learning research impacts payment card fraud detection: A survey and industry benchmark. Eng. Appl. Artif. Intell. 2018, 76, 130–157.
[23] Xuan, S.; Liu, G.; Li, Z.; Zheng, L.; Wang, S.; Jiang, C. Random Forest for Credit Card Fraud Detection. In Proceedings of the 2018 IEEE 15th International Conference on Networking, Sensing and Control (ICNSC), Zhuhai, China, 27–29 March 2018.
[24] Huang, D.; Mu, D.; Yang, L.; Cai, X. CoDetect: Financial Fraud Detection with Anomaly Feature Detection. IEEE Access 2018, 6, 19161–19174.
[25] Amarasinghe, T.; Aponso, A.; Krishnarajah, N. Critical Analysis of Machine Learning Based Approaches for Fraud Detection in Financial Transactions. In Proceedings of the 2018 International Conference on Machine Learning Technologies (ICMLT’18), Nanchang, China, 21–23 June 2018; pp. 12–17.