#parser #analyzer #forms-control #vb6

vb6parse

vb6parse is a library for parsing and analyzing VB6 code, from projects, to controls, to modules, and forms

9 releases (2 stable)

new 1.0.1 Feb 27, 2026
1.0.0 Feb 17, 2026
0.5.1 Apr 26, 2025
0.4.1 Sep 30, 2024
0.1.0 Sep 11, 2024

#353 in Parser implementations


Used in aspen

MIT license

9.5MB
119K SLoC

VB6Parse

A complete, high-performance parser library for Visual Basic 6 code and project files.

Crates.io Documentation License: MIT

Project Documentation & Resources
Interactive Playground
VB6 Library Reference
Code Coverage Report
Performance Benchmarks

Overview

VB6Parse is designed as a foundational library for tools that analyze, convert, or process Visual Basic 6 code. While capable of supporting real-time syntax highlighting and language servers, its primary focus is on offline analysis, legacy code utilities, and migration tools.

Key Features:

  • Fast, efficient parsing with minimal allocations
  • Full support for VB6 project files, modules, classes, forms, and resources
  • Concrete Syntax Tree (CST) with complete source fidelity
  • 160+ built-in VB6 library functions and 42 statements
  • Comprehensive error handling with detailed failure information
  • Zero-copy tokenization and streaming parsing

Quick Start

Add VB6Parse to your Cargo.toml:

[dependencies]
vb6parse = "1.0.1"

📖 Complete Getting Started Tutorial - Step-by-step guide with live examples

API Surface

Top-Level Imports

For common use cases, import everything with:

use vb6parse::*;

This brings in:

  • I/O Layer: SourceFile, SourceStream
  • Lexer: tokenize(), Token, TokenStream
  • File Parsers: ProjectFile, ClassFile, ModuleFile, FormFile, FormResourceFile
  • Syntax Parsers: parse(), ConcreteSyntaxTree, SyntaxKind, SerializableTree
  • Error Handling: ErrorDetails, ParseResult, all error kind enums

Layer Modules (Advanced Usage)

For advanced use cases, access specific layers:

use vb6parse::io::{SourceFile, SourceStream, Comparator};
use vb6parse::lexer::{tokenize, Token, TokenStream};
use vb6parse::parsers::{parse, ConcreteSyntaxTree};
use vb6parse::language::controls::{Control, ControlKind};
use vb6parse::errors::{ProjectErrorKind, FormErrorKind};

Parsing Architecture

Bytes/String/File → SourceFile → SourceStream → TokenStream → CST → Object Layer
                    (Windows-1252) (Characters)   (Tokens)    (Tree) (Structured)

Layers:

  1. I/O Layer (io): Character decoding and stream access
  2. Lexer Layer (lexer): Tokenization with keyword lookup
  3. Syntax Layer (syntax): VB6 language constructs and library functions
  4. Parsers Layer (parsers): CST construction from tokens
  5. Files Layer (files): High-level file format parsers
  6. Language Layer (language): VB6 types, colors, controls
  7. Errors Layer (errors): Comprehensive error types

Source Code Organization

src/
├── io/                          # I/O Layer - Character streams and decoding
│   ├── mod.rs                   # SourceFile, SourceStream
│   ├── comparator.rs            # Case-sensitive/insensitive comparison
│   └── decode.rs                # Windows-1252 decoding
│
├── lexer/                       # Lexer Layer - Tokenization
│   ├── mod.rs                   # tokenize() function, keyword lookup
│   └── token_stream.rs          # TokenStream implementation
│
├── syntax/                      # Syntax Layer - VB6 Language constructs
│   ├── library/                 # VB6 built-in library unit tests and documentation
│   │   ├── functions/           # 160+ VB6 functions (14 categories)
│   │   │   ├── array/           # Array, Filter, Join, Split, etc.
│   │   │   ├── conversion/      # CBool, CInt, CLng, Str, Val, etc.
│   │   │   ├── datetime/        # Date, Now, Time, Year, Month, etc.
│   │   │   ├── file_system/     # Dir, EOF, FileLen, LOF, etc.
│   │   │   ├── financial/       # FV, IPmt, IRR, NPV, PV, Rate, etc.
│   │   │   ├── interaction/     # MsgBox, InputBox, Shell, etc.
│   │   │   ├── math/            # Abs, Cos, Sin, Tan, Log, Sqr, etc.
│   │   │   ├── miscellaneous/   # Environ, RGB, QBColor, etc.
│   │   │   ├── string/          # Left, Right, Mid, Len, Trim, etc.
│   │   │   └── ...
│   │   └── statements/          # VB6 statement unit tests and documentation (7 categories)
│   │       ├── file_operations/ # Open, Close, Get, Put, etc.
│   │       ├── filesystem/      # FileCopy, Kill, MkDir, RmDir, etc.
│   │       ├── runtime_control/ # DoEvents, Stop, End, etc.
│   │       ├── runtime_state/   # Date, Time assignment, etc.
│   │       ├── string_manipulation/ # Mid statement, etc.
│   │       ├── system_interaction/  # Beep, etc.
│   │       └── ...
│   ├── statements/              # Statement parsing logic
│   │   ├── control_flow/        # If, Select Case, For, While parsers
│   │   ├── declarations/        # Dim, ReDim, Const, Enum parsers
│   │   └── objects/             # Set, With, RaiseEvent parsers
│   └── expressions/             # Expression parsing utilities
│
├── parsers/                     # Parsers Layer - CST construction
│   ├── cst/                     # Concrete Syntax Tree implementation
│   │   ├── mod.rs               # parse(), ConcreteSyntaxTree, CstNode
│   │   └── rowan_wrapper.rs     # Red-green tree wrapper
│   ├── parseresults.rs          # ParseResult<T, E> type
│   └── syntaxkind.rs            # SyntaxKind enum (all token types)
│
├── files/                       # Files Layer - VB6 file format parsers
│   ├── common/                  # Shared parsing utilities
│   │   ├── properties.rs        # Property bag, PropertyGroup
│   │   ├── attributes.rs        # Attribute statement parsing
│   │   └── references.rs        # Object reference parsing
│   ├── project/                 # VBP - Project files
│   │   ├── mod.rs               # ProjectFile struct and parser
│   │   ├── properties.rs        # Project properties
│   │   ├── references.rs        # Reference types
│   │   └── compilesettings.rs   # Compilation settings
│   ├── class/                   # CLS - Class modules
│   ├── module/                  # BAS - Code modules
│   ├── form/                    # FRM - Forms
│   └── resource/                # FRX - Form resources
│
├── language/                    # Language Layer - VB6 types and definitions
│   ├── color.rs                 # VB6 color constants and Color type
│   ├── controls/                # VB6 control definitions (50+ controls)
│   │   ├── mod.rs               # Control, ControlKind enums
│   │   ├── form.rs              # FormProperties
│   │   ├── textbox.rs           # TextBoxProperties
│   │   ├── label.rs             # LabelProperties
│   │   └── ...                  # 50+ control types
│   └── tokens.rs                # Token enum definition
│
├── errors/                      # Errors Layer - Error types
│   ├── mod.rs                   # ErrorDetails, error printing
│   ├── decode.rs                # SourceFileErrorKind
│   ├── tokenize.rs              # CodeErrorKind
│   ├── project.rs               # ProjectErrorKind
│   ├── class.rs                 # ClassErrorKind
│   ├── module.rs                # ModuleErrorKind
│   ├── form.rs                  # FormErrorKind
│   ├── property.rs              # PropertyError
│   └── resource.rs              # ResourceErrorKind
│
└── lib.rs                       # Public API surface

Common Tasks

For basic usage examples, see the Getting Started Guide.

The following examples show advanced use cases and patterns:

1. Extract All Form Controls

use vb6parse::language::Control;
use vb6parse::*;

fn extract_controls(form_path: &str) -> Vec<String> {
    let source = SourceFile::from_file(form_path).unwrap();
    let result = FormFile::parse(&source);
    let (form, _) = result.unpack();

    let mut control_names = Vec::new();

    if let Some(formfile) = form {
        fn visit_control(control: &Control, names: &mut Vec<String>) {
            names.push(control.name().to_string());

            // Recursively visit children
            if let Some(children) = control.kind().children() {
                for child in children {
                    visit_control(child, names);
                }
            }
        }

        for control in formfile.form.children().unwrap() {
            visit_control(control, &mut control_names);
        }
    }

    control_names
}

2. Analyze Code Without Full Parsing

use vb6parse::*;

fn count_identifiers(code: &str, function_name: &str) -> usize {
    let mut source_stream = SourceStream::new("temp.bas", code);
    let result = tokenize(&mut source_stream);
    let (tokens, _) = result.unpack();

    tokens
        .map(|ts| {
            ts.filter(|(text, token)| {
                *token == language::Token::Identifier && text.eq_ignore_ascii_case(function_name)
            })
            .count()
        })
        .unwrap_or(0)
}

Advanced Topics

Error Handling

VB6Parse uses a custom ParseResult<T, E> type that separates successful results from recoverable errors. See the Error Handling guide for detailed examples.

Quick Reference:

let result = ProjectFile::parse(&source);

// Unpack into result and failures
let (project_opt, failures) = result.unpack();

// Check for failures
if result.has_failures() {
    for failure in result.failures() {
        eprintln!("Error at line {}: {:?}", failure.error_offset, failure.kind);
    }
}

// Convert to Result<T, Vec<ErrorDetails>>
let std_result = result.ok_or_errors();

See also:

Working with the CST

The Concrete Syntax Tree preserves all source information including whitespace and comments:

use vb6parse::*;

let tree = parse(token_stream);

// Navigate the tree
let root = tree.to_root_node();
for child in root.children() {
    println!("Node: {:?}", child.kind());
    println!("Text: {}", child.text());
}

// Serialize for debugging
let serializable = tree.to_serializable();
println!("{:#?}", serializable);

See also:

Character Encoding

VB6 uses Windows-1252 encoding. Always use decode_with_replacement() for file content:

use vb6parse::*;

// From bytes (e.g., file read)
let bytes = std::fs::read("file.bas")?;
let source = SourceFile::decode_with_replacement("file.bas", &bytes).unwrap();

// From UTF-8 string (testing/programmatic)
let source = SourceFile::from_string("test.bas", "Dim x As Integer");

See also:

VB6 Library Functions

VB6Parse includes full definitions for 160+ VB6 library functions organized into 14 categories:

// Access function metadata
use vb6parse::syntax::library::functions::string::left;
use vb6parse::syntax::library::functions::math::sin;
use vb6parse::syntax::library::functions::conversion::cint;

// Each module includes:
// - Full VB6 documentation
// - Function signatures
// - Parameter descriptions
// - Usage examples
// - Related functions

Categories:

  • Array manipulation (Array, Filter, Join, Split, UBound, LBound)
  • Conversion (CBool, CDate, CInt, CLng, CStr, Val, Str)
  • Date/Time (Date, Time, Now, Year, Month, Day, Hour, DateAdd, DateDiff)
  • File System (Dir, EOF, FileLen, FreeFile, LOF, Seek)
  • Financial (FV, IPmt, IRR, NPV, PV, Rate)
  • Formatting (Format, FormatCurrency, FormatDateTime, FormatNumber, FormatPercent)
  • Interaction (MsgBox, InputBox, Shell, CreateObject, GetObject)
  • Inspection (IsArray, IsDate, IsEmpty, IsNull, IsNumeric, TypeName, VarType)
  • Math (Abs, Atn, Cos, Exp, Log, Rnd, Sgn, Sin, Sqr, Tan)
  • String (Left, Right, Mid, Len, InStr, Replace, Trim, UCase, LCase)
  • And more...

See also: src/syntax/library/functions/

Form Resources (FRX Files)

Form resource files contain binary data for controls (images, icons, property blobs):

use vb6parse::*;

// Option 1: load bytes and hand to FormResourceFile to handle.
let bytes = std::fs::read("Form1.frx")?;
let result = FormResourceFile::parse("Form1.frx", bytes);

// Option 2: Load directly from file.
let result = FormResourceFile::from_file("Form1.frx")?;

let (resource, _failures) = result.unpack();
if let Some(resource) = resource {
    for (offset, data) in resource.iter_entries() {
        println!(
            "Resource at offset {}: {} bytes",
            offset,
            data.as_bytes().unwrap().len()
        );
    }
}

See also:

Testing

VB6Parse has comprehensive test coverage.

📊 View Test Coverage Report

Running Tests

# Clone test data (required for integration tests)
git submodule update --init --recursive

# Run all tests
cargo test

# Run only library tests
cargo test --lib

# Run only integration tests
cargo test --test '*'

# Run documentation tests
cargo test --doc

Snapshot Testing

Integration tests use insta for snapshot testing:

# Review snapshot changes
cargo insta review

# Accept all snapshots
cargo insta accept

Test data location: tests/data/ (git submodules of real VB6 projects)

See also:

Benchmarking

VB6Parse includes criterion benchmarks for performance testing:

# Run all benchmarks
cargo bench

# Run specific benchmark
cargo bench bulk_parser_load

# Generate HTML reports
# Results saved to target/criterion/

Benchmarks:

  • bulk_parser_load - Parsing multiple large VB6 projects
  • Token stream generation
  • CST construction

See also:

Code Coverage

VB6Parse uses cargo-llvm-cov to track test coverage and ensure comprehensive testing across all modules.

Installation

# Install cargo-llvm-cov
cargo install cargo-llvm-cov

Running Coverage

# Generate coverage report (terminal output)
cargo llvm-cov

# Generate HTML report
cargo llvm-cov --html
# Open target/llvm-cov/html/index.html in your browser

# Generate coverage with open HTML report
cargo llvm-cov --open

# Generate detailed coverage for specific packages
cargo llvm-cov --package vb6parse

# Include tests in coverage
cargo llvm-cov --all-targets

# Generate LCOV format (for CI/CD integration)
cargo llvm-cov --lcov --output-path lcov.info

Coverage Reports

Coverage reports are saved to:

  • HTML reports: target/llvm-cov/html/
  • Terminal summary: Displays percentage coverage after running cargo llvm-cov
  • LCOV files: lcov.info (when using --lcov flag)

Current Coverage:

  • Library tests: 5,467 tests covering VB6 library functions
  • Integration tests: 31 tests with real-world VB6 projects
  • Documentation tests: 83 tests ensuring examples work
  • Coverage focus: Parsers, tokenization, error handling, and file format support

Contributing to VB6Parse

Contributions are welcome! Please see the CONTRIBUTING.md file for more information.

Development Setup

# Clone repository
git clone https://github.com/scriptandcompile/vb6parse
cd vb6parse

# Get test data
git submodule update --init --recursive

# Run tests
cargo test

# Run benchmarks
cargo bench

# Check for issues
cargo clippy

# Format code
cargo fmt

Code Organization Guidelines

  1. Layer Separation: Keep clear boundaries between layers
  2. Windows-1252 Handling: Always use SourceFile::decode_with_replacement()
  3. Error Recovery: Parsers should recover from errors when possible
  4. CST Fidelity: Preserve all source text including whitespace and comments
  5. Documentation: Include doc tests for public APIs

Adding New Features

VB6 Library Functions:

  • Add to appropriate category in src/syntax/library/functions/
  • Include full VB6 documentation
  • Add comprehensive tests
  • Update category mod.rs

Control Types:

  • Add to src/language/controls/
  • Define properties struct
  • Add to ControlKind enum
  • Include property validation

Error Types:

  • Add to appropriate error module in src/errors/
  • Ensure Display implementation
  • Add context information

Performance Considerations

  • Use zero-copy where possible (string slices, not String)
  • Avoid unnecessary allocations (use iterators)
  • Leverage rowan's red-green tree for CST memory efficiency
  • Use phf crate for compile-time lookup tables

See also:

Supported File Types

Extension Description Status
.vbp Project files ✅ Complete
.cls Class modules ✅ Complete
.bas Code modules ✅ Complete
.frm Forms ⚠️ Partial (font, some icons, etc)
.frx Form resources ⚠️ Partial (binary blobs loaded, not all mapped to properties)
.ctl User controls ✅ Parsed as forms
.dob User documents ✅ Parsed as forms
.vbw IDE window state ❌ Not yet implemented
.dsx Data environments ❌ Not yet implemented
.dsr Data env. resources ❌ Not yet implemented
.ttx Crystal reports ❌ Not yet implemented

Project Status

  • Core Parsing: Fully implemented for VBP, CLS, BAS files
  • Tokenization: Complete with keyword lookup
  • CST Construction: Full syntax tree with source fidelity
  • Error Handling: Comprehensive error types and recovery
  • VB6 Library: 160+ functions, 42 statements documented
  • ⚠️ FRX Resources: Binary loading complete, property mapping partial
  • ⚠️ FRM Properties: Majority of FRM properties load properly, (icon, background, font mapping partial)
  • AST: Not yet implemented (CST available)
  • Testing: 5,500+ tests across unit, integration, and doc tests
  • Benchmarking: Criterion-based performance testing
  • Fuzz Testing: Coverage-guided fuzzing with cargo-fuzz
  • Documentation: Comprehensive API docs and examples

Fuzz Testing

VB6Parse includes comprehensive fuzz testing using cargo-fuzz and libFuzzer to discover edge cases, crashes, and undefined behavior.

Available Fuzz Targets:

  • sourcefile_decode - Tests Windows-1252 decoding with arbitrary bytes
  • sourcestream - Tests low-level character stream operations
  • tokenize - Tests tokenization with malformed VB6 code
  • cst_parse - Tests Concrete Syntax Tree parsing with invalid syntax

Quick Start:

# Install cargo-fuzz (requires nightly)
cargo install cargo-fuzz

# Run a fuzzer for 60 seconds
cargo +nightly fuzz run sourcefile_decode -- -max_total_time=60

# List all fuzz targets
cargo +nightly fuzz list

Learn More: See fuzz/README.md for detailed usage.

Examples

All examples are located in the examples/ directory:

Example Description
audiostation_parse.rs Parse a complete real-world VB6 project
cst_navigation.rs Navigate and query the Concrete Syntax Tree
cst_parse.rs Parse tokens directly to CST
debug_cst.rs Display CST debug representation
debug_resource.rs Inspect FRX resource files
parse_class.rs Parse class files from bytes
parse_control_only.rs Parse individual form controls
parse_form.rs Parse VB6 forms
parse_module.rs Parse code modules
parse_project.rs Parse project files
sourcestream.rs Work with character streams
tokenstream.rs Tokenize VB6 code

Documentation Examples (used in Getting Started guide):

Example Description
docs/hello_world.rs First parse example - simple VB6 module
docs/project_parsing.rs Parse VB6 project files (.vbp)
docs/error_handling.rs Handle parse errors gracefully
docs/tokenization.rs Tokenize VB6 code
docs/cst_navigation.rs Navigate the Concrete Syntax Tree
docs/form_parsing.rs Parse VB6 forms with controls

Run any example with:

cargo run --example parse_project

Resources

Limitations

  1. Encoding: Primarily designed for "predominantly English" source code with Windows-1252 encoding detection limitations
  2. AST: Abstract Syntax Tree is not yet implemented (Concrete Syntax Tree is available)
  3. FRX Mapping: Binary resources are loaded but not all are mapped to control properties
  4. Real-time Use: While capable, not optimized for real-time highlighting or LSP (focus is on offline analysis)

License

MIT License - See LICENSE file for details.


Built with ❤️ by ScriptAndCompile

Dependencies

~16MB
~361K SLoC