A super basic web crawler prototype in Haskell.

## Features

- Fetches web pages via HTTP
- Extracts links from HTML
- Tracks visited URLs to avoid duplicates
- Depth-limited crawling
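The visited-set tracking and depth limit above can be sketched as a pure worklist traversal. This is a minimal illustration over a hypothetical in-memory link graph (`sampleWeb` is made up for the example), not the project's actual `Main.hs`, which fetches real pages:

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Hypothetical in-memory "web": URL -> outgoing links (assumption for the demo).
type Web = Map.Map String [String]

-- Breadth-first crawl that skips already-visited URLs and stops past maxDepth.
crawl :: Web -> Int -> String -> Set.Set String
crawl web maxDepth start = go Set.empty [(start, 0)]
  where
    go visited [] = visited
    go visited ((url, depth) : rest)
      | url `Set.member` visited || depth > maxDepth = go visited rest
      | otherwise =
          let links = Map.findWithDefault [] url web
              next  = [(link, depth + 1) | link <- links]
          in go (Set.insert url visited) (rest ++ next)

sampleWeb :: Web
sampleWeb = Map.fromList
  [ ("https://example.com",   ["https://example.com/a", "https://example.com/b"])
  , ("https://example.com/a", ["https://example.com/c"])
  ]

main :: IO ()
main = print (Set.toList (crawl sampleWeb 1 "https://example.com"))
-- With depth limit 1, /c (reachable only at depth 2) is never visited.
```

The real crawler replaces the `Map` lookup with an HTTP fetch plus link extraction, but the duplicate-avoidance and depth-cutoff logic is the same.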
## Building and running

With Cabal:

```
cabal build
cabal run
```

Or with Stack:

```
stack build
stack run
```

When you run the program, it prompts for a starting URL. Enter any valid HTTP/HTTPS URL and it will crawl up to depth 2 (configurable in `Main.hs`).
Example:

```
Enter starting URL:
https://example.com
```
## Dependencies

- `http-conduit`: HTTP client
- `tagsoup`: HTML parsing
- `containers`: Set data structure for tracking visited URLs
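To show how `tagsoup` fits in, here is a minimal sketch of link extraction: parse an HTML fragment into tags and keep the `href` of each `<a>` element. The function name `extractLinks` is illustrative, not necessarily what the project's source calls it:

```haskell
import Text.HTML.TagSoup

-- Pull every non-empty href out of the <a> tags in an HTML fragment.
-- fromAttrib returns "" when the attribute is absent, so we filter those out.
extractLinks :: String -> [String]
extractLinks html =
  [ fromAttrib "href" tag
  | tag@(TagOpen "a" _) <- parseTags html
  , not (null (fromAttrib "href" tag))
  ]

main :: IO ()
main = print (extractLinks "<p><a href=\"https://example.com\">home</a><a>no href</a></p>")
-- prints ["https://example.com"]
```

A real crawler would additionally resolve relative URLs against the page's base URL before adding them to the work queue.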