pyWhat is a Python-based identification tool that works out “what” a piece of text or file content represents, with a focus on security and OSINT workflows. Given input such as hex strings, URLs, email addresses, IP addresses, credit card numbers, cryptocurrency wallet addresses, or an entire .pcap capture file, it scans for structured patterns and reports each match it recognizes. The tool is recursive: it can walk files and directories to extract these entities, which helps when analyzing malware samples, network captures, or code repositories at scale. Filters based on “tags” and distributions narrow results to specific categories such as bug bounties, cryptocurrencies, or AWS-related artifacts. For automation and integration, pyWhat provides a CLI with options for rarity filtering, sorting, and JSON export, as well as an API that can be imported into other Python programs.
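To make the idea of pattern matching with tag and rarity filters concrete, here is a minimal sketch using only the Python standard library. It is not pyWhat's code or API: the `Pattern` class, the `identify` function, the three regular expressions, and the rarity and tag values are all hypothetical and chosen purely for illustration.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Pattern:
    """One recognizable entity type (hypothetical structure, not pyWhat's)."""
    name: str
    regex: str
    rarity: float      # illustrative score between 0 and 1
    tags: frozenset    # illustrative category labels


# Illustrative patterns only; a real tool would carry far more, and stricter, rules.
PATTERNS = [
    Pattern("Email Address",
            r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
            0.5, frozenset({"Identifiers"})),
    Pattern("IPv4 Address",
            r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b",
            0.7, frozenset({"Networking"})),
    Pattern("URL",
            r"https?://[^\s\"'<>]+",
            0.3, frozenset({"Networking", "Identifiers"})),
]


def identify(text: str,
             min_rarity: float = 0.0,
             max_rarity: float = 1.0,
             include_tags: Optional[Iterable[str]] = None) -> list:
    """Return every match in `text`, filtered by rarity range and tags."""
    wanted = set(include_tags) if include_tags else None
    hits = []
    for pattern in PATTERNS:
        # Skip patterns outside the requested rarity range.
        if not (min_rarity <= pattern.rarity <= max_rarity):
            continue
        # Skip patterns that share no tag with the requested set.
        if wanted is not None and not (pattern.tags & wanted):
            continue
        for match in re.finditer(pattern.regex, text):
            hits.append({"name": pattern.name,
                         "match": match.group(0),
                         "rarity": pattern.rarity})
    return hits


sample = "Contact admin@example.com from 192.168.1.10 or visit https://example.org/login"
# Sort the findings so the highest-rarity patterns come first.
for hit in sorted(identify(sample), key=lambda h: h["rarity"], reverse=True):
    print(hit)
```

Passing `include_tags={"Networking"}` or `min_rarity=0.6` to `identify` narrows the results in the way the tag and rarity filters are described above.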
Features
- Identifies a wide range of structured data including URLs, emails, IPs, credit cards, crypto wallets, and more
- Works on raw text, individual files, directories, and network capture files like .pcap
- Offers tag-based filters and rarity controls to focus on specific types of findings
- Supports JSON output, sorting, and other CLI options for easy integration into scripts and pipelines
- Provides a Python API for programmatic use in security tools and automation workflows
- Installable via pip, Homebrew, or MacPorts, and comes with extensive examples for malware analysis and bug bounty use cases
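The recursive scanning and JSON export listed above can be sketched with the standard library alone. This is an illustration of the general technique under stated assumptions, not pyWhat's implementation: `scan_path`, `to_json`, `demo`, and the single email pattern are hypothetical names and choices.

```python
import json
import os
import re
import tempfile

# Single illustrative pattern; a real identifier would check many entity types.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scan_path(path: str) -> dict:
    """Scan one file, or every file under a directory, for email addresses."""
    if os.path.isfile(path):
        files = [path]
    else:
        files = [os.path.join(root, name)
                 for root, _dirs, names in os.walk(path)
                 for name in names]
    results = {}
    for file_path in sorted(files):
        # Undecodable bytes are ignored so binary files do not raise errors.
        with open(file_path, "r", encoding="utf-8", errors="ignore") as handle:
            found = EMAIL_RE.findall(handle.read())
        if found:
            results[file_path] = found
    return results


def to_json(results: dict) -> str:
    """Serialize findings so other scripts or pipelines can consume them."""
    return json.dumps(results, indent=2, sort_keys=True)


def demo() -> str:
    """Build a small throwaway directory tree and return its JSON report."""
    with tempfile.TemporaryDirectory() as root:
        nested = os.path.join(root, "samples", "mail")
        os.makedirs(nested)
        with open(os.path.join(root, "notes.txt"), "w", encoding="utf-8") as handle:
            handle.write("no entities here\n")
        with open(os.path.join(nested, "dump.txt"), "w", encoding="utf-8") as handle:
            handle.write("from: alice@example.com\nto: bob@example.org\n")
        results = scan_path(root)
        # Report paths relative to the scanned root so the output is stable.
        return to_json({os.path.relpath(p, root): hits
                        for p, hits in results.items()})


print(demo())
```

Keying the report by file path keeps each finding traceable to its source, which matters when the scan covers a whole repository or a directory of samples.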