
Checking for overtraining? #30

Open

david-sg opened this issue May 7, 2019 · 8 comments

Comments

@david-sg

david-sg commented May 7, 2019

Some of the text being generated seems too good... I'm wondering if I might have overtrained the model.
I can quickly check my own dataset manually.

Any suggestions for checking the original dataset?

@woctezuma
Contributor

woctezuma commented May 7, 2019

Sometimes the output looks too good, but if you Ctrl+F some parts of it, you should see that it is not copied from the dataset. If it has been copied, for instance because you trained your model for too many iterations relative to the size of the fine-tuning dataset, I suggest that you simply increase the temperature.

    temperature=1 : Float value controlling randomness in the Boltzmann
     distribution. Lower temperature results in less random completions. As the
     temperature approaches zero, the model will become deterministic and
     repetitive. Higher temperature results in more random completions.
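
For context, here is a minimal sketch (not this repository's code) of what the temperature does: the logits are divided by the temperature before the softmax, so a low temperature concentrates probability on the top token and a high temperature flattens the distribution toward more random sampling.

    # Hedged sketch of temperature-scaled sampling; NumPy only, no GPT-2 code.
    import numpy as np

    def sample_with_temperature(logits, temperature=1.0, rng=None):
        """Sample a token index after dividing the logits by `temperature`."""
        rng = rng or np.random.default_rng()
        if temperature <= 0:
            # As T -> 0 the distribution collapses onto the argmax, which is why
            # very low temperatures give deterministic, repetitive completions.
            return int(np.argmax(logits))
        scaled = np.asarray(logits, dtype=np.float64) / temperature
        scaled -= scaled.max()                         # numerical stability
        probs = np.exp(scaled) / np.exp(scaled).sum()  # Boltzmann / softmax
        return int(rng.choice(len(probs), p=probs))

    logits = [2.0, 1.0, 0.1]
    print(sample_with_temperature(logits, temperature=0.2))  # almost always 0
    print(sample_with_temperature(logits, temperature=1.5))  # more varied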

@david-sg
Author

david-sg commented May 8, 2019

I downloaded the training corpus for 345M (approx. 765 MB), output-dataset_v1_medium-345M-k40.train.jsonl, and searched it, but I'm not finding some of the vocabulary that shows up in my generated text...
Is this the correct corpus that the model was trained on?
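
(If Ctrl+F over a ~765 MB file is impractical, a small script can do the search. The sketch below assumes each line of the .jsonl file is a JSON object with a "text" field; adjust the key if your copy of the file is laid out differently.)

    # Hedged sketch: scan a .jsonl corpus for a phrase from the generated text.
    # Assumes each line is a JSON object with a "text" field.
    import json

    def find_phrase(jsonl_path, phrase):
        """Yield (line_number, snippet) for every sample containing `phrase`."""
        with open(jsonl_path, encoding="utf-8") as f:
            for i, line in enumerate(f, 1):
                text = json.loads(line).get("text", "")
                pos = text.find(phrase)
                if pos != -1:
                    yield i, text[max(0, pos - 40): pos + len(phrase) + 40]

    for hit in find_phrase("output-dataset_v1_medium-345M-k40.train.jsonl",
                           "some suspiciously good sentence"):
        print(*hit)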

@woctezuma
Contributor

woctezuma commented May 8, 2019

Sorry, I thought you were talking about your own dataset, the one you wanted to fine-tune the model on.

As mentioned in this blog post, the full training dataset was not released. You only get 250K samples in that repository, rather than the roughly 8M documents the model was trained on.

@robclouth

@david-sg I'm getting really impressive results too... and searching the fine-tuning dataset doesn't bring up anything similar. It's amazing how it blends the new stuff with the original model.

@ghost

ghost commented Jun 17, 2019

Is there an a priori way to know how many steps you should train for when fine-tuning on your dataset? I want to know if there are any good heuristics out there. Thank you.

@david-sg
Author

I found it overtraining at a loss below 0.10. Interestingly, it was pulling sentences from the main corpus, not from the fine-tuning corpus. Increasing the temperature did not seem to help.

@ghost

ghost commented Jun 20, 2019

The avg. loss?

@david-sg
Author

Yes.
