can not download tensorflow datasets #131

Open
Tianyu00 opened this issue May 19, 2020 · 11 comments

Comments

@Tianyu00

In ch13/ch13_part1.ipynb, line [52], I cannot download the celeb_a dataset with

celeba_bldr.download_and_prepare()

error message:

NonMatchingChecksumError: Artifact https://drive.google.com/uc?export=download&id=0B7EVK8r0v71pZjFTYXZWM3FlRnM, downloaded to /Users/tz/tensorflow_datasets/downloads/ucexport_download_id_0B7EVK8r0v71pZjFTYXZWM3FlDDaXUAQO8EGH_a7VqGNLRtW52mva1LzDrb-V723OQN8.tmp.db23c1347e5240e68f5f92216dc1872b/uc, has wrong checksum. This might indicate:

@elfelround
Contributor

Hi, I already commented on this in the TensorFlow GitHub issues; have a look (searching for the link).

This is not an error by the authors of the book, though they could mention it in the book or in a comment.

@elfelround
Contributor

tensorflow/datasets#965

@elfelround
Contributor

As of now there is no solution. The dataset is 2-3 GB, and Google Cloud isn't happy to share that much bandwidth with people constantly. I would recommend uploading the dataset to an external site/CDN.

@elfelround
Contributor

I advise uploading this dataset to kaggle.com

@elfelround
Contributor

Be aware that I've tried everything (VPNs, changing the code to bypass the checks, ...) and never got it to work, so there is no practical solution other than the one I'm suggesting.

@Tianyu00
Author

Hi elfelround,
Thank you for your message! I am not sure whether you are an author of or a contributor to this book. I am aware that this is not the authors' error. I've searched for this problem and found some discussion online, but I don't really understand it. I think it would be nice to discuss the problem here and direct people to the underlying TensorFlow issue, like the link you attached, because other readers of the book may hit the same problem and start searching here.

My actual question now is whether there is a way to download and use the celeb_a dataset, whether from TensorFlow or elsewhere. Luckily there is no example using this dataset; if there were, nobody could follow it. It would be great if the authors or others who know more about this problem could give us some suggestions. Thanks!

@jinensetpal

Hello - as stated in the error message, the dataset did download, but its SHA checksum did not match the one expected by the TensorFlow source. Make sure you are using the latest version of TensorFlow; this worked for me.

If that is not an option, here [1.3GiB] is a direct download link for the CelebA dataset, which you can unpack and use as intended.
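
To see what a checksum mismatch means in practice, here is a minimal sketch that computes the digest of a downloaded file so it can be compared by hand against a published value. It assumes a SHA-256 digest; the function name `sha256_of_file` and the file path in the usage comment are hypothetical and not part of tensorflow-datasets.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path` (hypothetical helper).

    The file is read in chunks so a multi-gigabyte archive does not have
    to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assumed path, adjust to wherever the archive was saved):
#   print(sha256_of_file("img_align_celeba.zip"))
```

If the printed digest differs from the expected one, the file on disk is not the file the library expected, for example because the server returned an error page instead of the archive.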

@rasbt
Owner

rasbt commented Jun 8, 2020

Thanks a lot for helping out here, @elfelround , I really appreciate it!

Regarding the dataset, I would also recommend saving it somewhere on the machine you'll be using, because there may be hiccups with their servers sometimes. The original CelebA website also provides an alternative Baidu link that may be useful in such cases: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html

@rickiepark

Hi, if you have a problem downloading the celeb_a dataset with tfds, try the steps below.

  1. Update tfds: pip install --upgrade tensorflow-datasets
  2. Download the four dataset files manually from https://git.io/JL5GM, and save them in ~/tensorflow_datasets/downloads/manual
  3. Run celeba_bldr.download_and_prepare() again. :)

If you use Colab, upload the files to your Drive, mount it, and copy them as below:

from google.colab import drive
drive.mount('/drive')
!mkdir -p ~/tensorflow_datasets/downloads/manual
!cp /drive/MyDrive/datasets/celeba/img_align_celeba.zip ~/tensorflow_datasets/downloads/manual
!cp /drive/MyDrive/datasets/celeba/list_attr_celeba.txt ~/tensorflow_datasets/downloads/manual
!cp /drive/MyDrive/datasets/celeba/list_eval_partition.txt ~/tensorflow_datasets/downloads/manual
!cp /drive/MyDrive/datasets/celeba/list_landmarks_align_celeba.txt ~/tensorflow_datasets/downloads/manual
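
Before retrying the download step, it can help to confirm that all four files actually landed in the manual directory. The following is a minimal sketch using only the standard library; the file names are the ones from the copy commands above, and the helper `missing_manual_files` is hypothetical, not a tfds function.

```python
from pathlib import Path

# The four files named in the copy commands above.
EXPECTED_FILES = [
    "img_align_celeba.zip",
    "list_attr_celeba.txt",
    "list_eval_partition.txt",
    "list_landmarks_align_celeba.txt",
]


def missing_manual_files(manual_dir):
    """Return the expected file names that are absent from `manual_dir`."""
    manual_dir = Path(manual_dir).expanduser()
    return [name for name in EXPECTED_FILES
            if not (manual_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_manual_files("~/tensorflow_datasets/downloads/manual")
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All four files are in place.")
```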

Thanks

@liqinglin54951

liqinglin54951 commented Jan 28, 2021

KeyError: <ExtractMethod.NO_EXTRACT: 1> is raised because celeba_bldr.download_and_prepare() needs the TFRecord files. So once you have the celeb_a dataset_info.json file and the txt files (such as list_landmarks_align_celeba.txt and list_attr_celeba.txt), you also need the TFRecord files: https://drive.google.com/drive/folders/1MKQ9sRwr5OOFk3OBzLz91SsgF3MBqvtP?usp=sharing
Folder structure: (screenshot not preserved)

@manfredkremer

I also have a problem with the command "celeba_bldr.download_and_prepare()". I constantly get the error message "HTTP code 429", so I am stuck at this point. Is there any solution to this problem?
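
HTTP status 429 means "Too Many Requests", i.e. the server is rate-limiting the download. One thing to try is waiting and retrying with increasing delays. Below is a minimal sketch of exponential backoff around an arbitrary callable; `retry_with_backoff` is a hypothetical helper, and it catches every `Exception` only because the exact exception type raised for a 429 is not stated in this thread. It may not help if the host's quota stays exhausted, in which case the manual download described above is the fallback.

```python
import time


def retry_with_backoff(action, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `action()` until it succeeds or `max_attempts` is reached.

    The wait doubles after each failure: base_delay, 2*base_delay, ...
    `sleep` is injectable so the helper can be exercised without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return action()
        except Exception:  # assumption: the failure type is unknown here
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            sleep(base_delay * 2 ** attempt)


# Usage (celeba_bldr is the builder object from the notebook):
#   retry_with_backoff(celeba_bldr.download_and_prepare, base_delay=60.0)
```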

7 participants