Haskell Web Crawler

A minimal web crawler prototype written in Haskell.

Features

  • Fetches web pages via HTTP
  • Extracts links from HTML
  • Tracks visited URLs to avoid duplicates
  • Depth-limited crawling
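The visited-set and depth-limit features above can be sketched in pure Haskell. This is an illustrative model, not the project's actual code: the hypothetical `Web` map stands in for real HTTP fetches, and `crawl` shows how `Data.Set` prevents revisiting URLs while the depth counter bounds the traversal.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Hypothetical in-memory "web": each URL maps to the links on its page.
-- A real crawler would fetch and parse pages instead of consulting a Map.
type Web = Map.Map String [String]

exampleWeb :: Web
exampleWeb = Map.fromList
  [ ("a", ["b", "c"])
  , ("b", ["a", "c"])
  , ("c", ["d"])
  ]

-- Depth-limited crawl: visit each URL at most once, stop below depth 0.
crawl :: Web -> Int -> String -> [String]
crawl web maxDepth start = go Set.empty [(start, maxDepth)]
  where
    go _ [] = []
    go seen ((url, depth) : rest)
      | depth < 0 || url `Set.member` seen = go seen rest
      | otherwise =
          let links = Map.findWithDefault [] url web
              next  = [(l, depth - 1) | l <- links]
          in url : go (Set.insert url seen) (rest ++ next)

main :: IO ()
main = print (crawl exampleWeb 2 "a")  -- ["a","b","c","d"]
```

Note that "b" links back to "a", but the visited set ensures "a" is emitted only once.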

Build & Run

cabal build
cabal run

Or with Stack:

stack build
stack run
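For either build tool, the executable's dependency list in the project's .cabal file would look roughly like the fragment below (the executable name and version bounds here are illustrative, not copied from the repository):

```cabal
executable heb-crawler
  main-is:          Main.hs
  build-depends:    base
                  , http-conduit
                  , tagsoup
                  , containers
  default-language: Haskell2010
```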

Usage

When you run the program, it prompts for a starting URL. Enter any valid HTTP or HTTPS URL and the crawler will follow links up to depth 2 (the default, configurable in Main.hs).

Example:

Enter starting URL:
https://example.com

Dependencies

  • http-conduit: HTTP client
  • tagsoup: HTML parsing
  • containers: Set data structure
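As a sketch of how tagsoup fits in, the snippet below extracts `href` attributes from anchor tags in an HTML string. This is one common way to pull links with `parseTags`; the actual extraction logic in this repository may differ.

```haskell
import Text.HTML.TagSoup  -- from the tagsoup package

-- Collect every href attribute found on an <a> tag.
extractLinks :: String -> [String]
extractLinks html =
  [ href
  | TagOpen "a" attrs <- parseTags html
  , ("href", href) <- attrs
  ]

main :: IO ()
main = print (extractLinks "<a href=\"https://example.com\">link</a><a>no href</a>")
```

A real crawler would feed the page body fetched via http-conduit into `extractLinks`, then resolve any relative URLs against the page's base URL before enqueueing them.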
