hmchuong/DownloadCELEB500k

DownloadCELEB500k

Download the Celeb500k dataset with Scrapy.

Requirements

  • Python >= 3.5

Install dependencies

  • Scrapy
  • Pillow
pip install scrapy Pillow
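After installing, a quick check that both packages are importable can save a failed crawl later. The following is a minimal sketch, not part of this repository; `missing_modules` is a hypothetical helper name. Note that Pillow is imported as `PIL`.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Pillow installs under the import name "PIL", not "Pillow".
    missing = missing_modules(["scrapy", "PIL"])
    if missing:
        print("Missing modules: {}".format(", ".join(missing)))
    else:
        print("All dependencies found.")
```

The sketch uses `str.format` rather than f-strings so that it also runs on Python 3.5.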

Download URL files

Download the URL files into the data folder, following the instructions inside that folder.

Start downloading

Run the following command

sh crawl.sh 5

where 5 is the number of retries. A value between 5 and 10 is recommended to get all images.
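The contents of crawl.sh are not shown here, so the sketch below does not describe what the script does. It only illustrates the general idea of repeating a download pass a fixed number of times, on the assumption that each pass re-attempts items that failed earlier. `run_repeatedly` and the command-line layout are hypothetical.

```python
import subprocess
import sys


def run_repeatedly(command, passes):
    """Run `command` `passes` times in sequence and return each exit code."""
    exit_codes = []
    for attempt in range(1, passes + 1):
        print("Pass {}/{}".format(attempt, passes))
        # subprocess.call waits for the command to finish and returns its exit code.
        exit_codes.append(subprocess.call(command))
    return exit_codes


if __name__ == "__main__":
    # Hypothetical usage: python repeat.py <passes> <command> [args...]
    if len(sys.argv) < 3:
        sys.exit("usage: repeat.py <passes> <command> [args...]")
    codes = run_repeatedly(sys.argv[2:], int(sys.argv[1]))
    print("Exit codes: {}".format(codes))
```

Running every pass regardless of earlier exit codes is a deliberate choice in this sketch: a pass that exits with an error may still have downloaded some images.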

Results

Run the following command to get the number of downloaded folders

ls -1 data/images/<url part> | wc -l

Run the following command to get the number of downloaded images

wc -l data/images/<url part>.jl
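On systems without `ls` and `wc`, the same two counts can be taken from Python. This is a minimal sketch, assuming the `.jl` file holds one record per line; `count_entries` and `count_lines` are hypothetical helper names, and the path passed on the command line stands in for `data/images/<url part>`.

```python
import os
import sys


def count_entries(directory):
    """Count non-hidden entries in `directory`, like `ls -1 directory | wc -l`."""
    return sum(1 for name in os.listdir(directory) if not name.startswith("."))


def count_lines(path):
    """Count newline characters in a file, like `wc -l path`."""
    total = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            total += chunk.count(b"\n")
    return total


if __name__ == "__main__":
    # Hypothetical usage: python count.py "data/images/<url part>"
    if len(sys.argv) != 2:
        sys.exit("usage: count.py <images folder for one url part>")
    base = sys.argv[1].rstrip("/")
    print("Folders: {}".format(count_entries(base)))
    print("Images:  {}".format(count_lines(base + ".jl")))
```

Like `wc -l`, `count_lines` counts newline characters, so a final line without a trailing newline is not counted.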
