
File Error: integer division or modulo by zero #191

Closed
tiffmkell opened this issue Apr 4, 2020 · 1 comment

@tiffmkell

The gpt-2-simple package was running perfectly with my code on a previous AI Platform VM instance on Google Cloud Platform (TensorFlow 1.14 environment, 1 NVIDIA Tesla T4, Compute Engine default service account).

However, I created a new instance this morning with TensorFlow 1.15 instead of 1.14, and it now throws an error when it tries to train on a corpus saved in a .txt file. The .txt file is fully uploaded to the correct folder, but the model cannot find it, which is why it reports 0 tokens to train on.

This could be a GCP issue, but I still wanted to ask whether there are any dependencies in the package that need to be updated.
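For reference, a minimal pre-flight check along these lines (a sketch, assuming the corpus is the same text_scraped.txt path later passed to gpt2.finetune) would fail fast before training starts instead of crashing inside sample_batch():

import os

file_name = 'text_scraped.txt'  # same path passed to gpt2.finetune as `dataset`

# Fail fast if the corpus is missing or empty; otherwise gpt-2-simple reports
# "dataset has 0 tokens" and later dies with ZeroDivisionError in sample_batch().
assert os.path.isfile(file_name), f"{file_name} not found in {os.getcwd()}"
assert os.path.getsize(file_name) > 0, f"{file_name} is empty"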

Here is the error:

0it [00:00, ?it/s]
Loading dataset...
dataset has 0 tokens
Training...


ZeroDivisionError Traceback (most recent call last)
in
520 # Steps is max number of training steps
521 model = gpt2.finetune(sess, 'text_scraped.txt', model_name = model_name, steps = 1000,
--> 522 run_name = 'dog_beds')
523 model
524

/opt/conda/lib/python3.7/site-packages/gpt_2_simple/gpt_2.py in finetune(sess, dataset, steps, model_name, model_dir, combine, batch_size, learning_rate, accumulate_gradients, restore_from, run_name, checkpoint_dir, sample_every, sample_length, sample_num, multi_gpu, save_every, print_every, max_checkpoints, use_memory_saving_gradients, only_train_transformer_layers, optimizer, overwrite)
340 (_, v_loss, v_summary) = sess.run(
341 (opt_apply, loss, summary_loss),
--> 342 feed_dict={context: sample_batch()})
343
344 summary_log.add_summary(v_summary, counter)

/opt/conda/lib/python3.7/site-packages/gpt_2_simple/gpt_2.py in sample_batch()
307
308 def sample_batch():
--> 309 return [data_sampler.sample(1024) for _ in range(batch_size)]
310
311 if overwrite and restore_from == 'latest':

/opt/conda/lib/python3.7/site-packages/gpt_2_simple/gpt_2.py in <listcomp>(.0)
307
308 def sample_batch():
--> 309 return [data_sampler.sample(1024) for _ in range(batch_size)]
310
311 if overwrite and restore_from == 'latest':

/opt/conda/lib/python3.7/site-packages/gpt_2_simple/src/load_dataset.py in sample(self, length)
81 def sample(self, length):
82 assert length < self.total_size // len(
---> 83 self.chunks
84 ), "Dataset files are too small to sample {} tokens at a time".format(
85 length)

ZeroDivisionError: integer division or modulo by zero
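For context, the assertion message never gets a chance to fire: when the loader finds no usable chunks, len(self.chunks) is 0, so the integer division in the assert condition itself raises ZeroDivisionError. A rough paraphrase of the failing logic in load_dataset.py (not the exact source, just the shape of it):

# Paraphrased from gpt_2_simple/src/load_dataset.py
class Sampler:
    def __init__(self, chunks):
        self.chunks = chunks                                   # [] when the .txt yields 0 tokens
        self.total_size = sum(len(chunk) for chunk in chunks)  # 0 in that case

    def sample(self, length):
        # With zero chunks, `self.total_size // len(self.chunks)` divides by zero
        # before the "Dataset files are too small" message can be raised.
        assert length < self.total_size // len(self.chunks), \
            "Dataset files are too small to sample {} tokens at a time".format(length)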

@tiffmkell (Author)

Randomly restarted the kernel and it began working after about 5 tries.
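If the root cause is the .txt file only becoming visible to the VM after a delay (e.g. a bucket mount still syncing), a small wait-and-retry before starting training would avoid the manual kernel restarts. A sketch, assuming the same text_scraped.txt path:

import os
import time

file_name = 'text_scraped.txt'

# Poll until the corpus is present and non-empty instead of restarting the
# kernel by hand; give up after ~5 minutes.
for _ in range(30):
    if os.path.isfile(file_name) and os.path.getsize(file_name) > 0:
        break
    time.sleep(10)
else:
    raise FileNotFoundError(f"{file_name} never became available")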
