Found: 280 Android apps that use OCR to steal cryptocurrency credentials

Optical Character Recognition converts passwords shown in images to machine-readable text.

Dan Goodin – Sep 6, 2024 4:23 pm | 49

Credit: Getty Images

A high level of sophistication

The most notable thing about the newly discovered malware campaign is that the threat actors behind it are employing optical character recognition software in an attempt to extract cryptocurrency wallet credentials that are shown in images stored on infected devices. Many wallets allow users to protect their wallets with a series of random words. The mnemonic credentials are easier for most people to remember than the jumble of characters that appear in the private key. Words are also easier for humans to recognize in images.

SangRyol Ryu, a researcher at security firm McAfee, made the discovery after obtaining unauthorized access to the servers that received the data stolen by the malicious apps. That access was the result of weak security configurations made when the servers were deployed. With that, Ryu was able to read pages available to server administrators.

One page, displayed in the image below, was of particular interest. It showed a list of words near the top and a corresponding image, taken from an infected phone, below. The words represented visually in the image corresponded to the same words.

An admin page showing OCR details.
Credit: McAfee

“Upon examining the page, it became clear that a primary goal of the attackers was to obtain the mnemonic recovery phrases for cryptocurrency wallets,” Ryu wrote. “This suggests a major emphasis on gaining entry to and possibly depleting the crypto assets of victims.”

Optical character recognition is the process of converting images of typed, handwritten, or printed text into machine-encoded text. OCR has existed for years and has grown increasingly common to transform characters captured in images into characters that can be read and manipulated by software.

Ryu continued:

This threat utilizes Python and Javascript on the server-side to process the stolen data. Specifically, images are converted to text using optical character recognition (OCR) techniques, which are then organized and managed through an administrative panel. This process suggests a high level of sophistication in handling and utilizing the stolen information.

Python code for converting text shown in images to machine-readable text.
Credit: McAfee

Python code for converting text shown in images to machine-readable text. Credit: McAfee

People who are concerned they may have installed one of the malicious apps should check the McAfee post for a list of associated websites and cryptographic hashes.

The malware has received multiple updates over time. Whereas it once used HTTP to communicate with control servers, it now connects through WebSockets, a mechanism that’s harder for security software to parse. WebSockets have the added benefit of being a more versatile channel.

A timeline of apps' evolution. Credit: McAfee

Developers have also updated the apps to better obfuscate their malicious functionality. Obfuscation methods include encoding the strings inside the code so they’re not easily read by humans, the addition of irrelevant code, and the renaming of functions and variables, all of which confuse analysts and make detection harder. While the malware is mostly restricted to South Korea, it has recently begun to spread within the UK.

“This development is significant as it shows that the threat actors are expanding their focus both demographically and geographically,” Ryu wrote. “The move into the UK points to a deliberate attempt by the attackers to broaden their operations, likely aiming at new user groups with localized versions of the malware.”