
speak about a certain topic #100

Open
iacoposk8 opened this issue Aug 8, 2019 · 1 comment

Comments

@iacoposk8 commented Aug 8, 2019

I am training a model for my language. In your opinion, how low should the loss be to get a good model without overfitting?

Second question: how can I generate texts that speak about a certain topic?

Thank you

@cat6kg commented Aug 17, 2019

second question: how can I generate texts that speak about a certain topic?

I'd like to hear an explanation 'for dummies' about this too... (somebody please help).
I've read all the issues, but it's still not clear to me.

If I understand correctly, it should work this way:

We have pre-trained models (ready for use):

  • 117M model (~500 MB download), small enough for modest hardware.
  • 345M model (~1.5 GB download), more capable, best with a decent GPU (something like a GeForce GTX 1080).
  • (???) There are two more, larger models; however, OpenAI considered them too capable to release safely and, at the time, did not make them publicly available.
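If I've got it right, with the gpt-2-simple package (the same one the gpt2.generate(sess, prefix=...) calls below come from), getting one of the released models looks something like this sketch:

```python
import gpt_2_simple as gpt2

# Download the 117M model (~500 MB) into ./models/117M
gpt2.download_gpt2(model_name="117M")

# The larger released model works the same way:
# gpt2.download_gpt2(model_name="345M")
```

The model is fetched once and cached on disk, so later runs can skip this step.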

Ok, now we have such options:

1. You can just load a model (say 117M) and generate some text. The result will be text on a random theme, because a wide range of text was used for training. The 345M model is more accurate and more powerful than 117M.

2. Another option is to re-train (finetune) the original model on what you need. Say you need text about cars; then you re-train (finetune) the model on car-related text. (This is time-consuming, and of course you need to have text data for training.)
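As far as I understand, the finetuning step in gpt-2-simple would look roughly like this. (The file name cars.txt is just a made-up example; you supply your own plain-text corpus, and the step count is a guess.)

```python
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="cars.txt",   # your topic-specific training text (hypothetical file)
              model_name="117M",    # base model to re-train
              steps=1000)           # more steps = longer training

# The finetuned checkpoint is saved under checkpoint/run1 and
# generation is now biased toward car-related text:
gpt2.generate(sess)
```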

3. And finally you can just specify a 'prefix' like this:
gpt2.generate(sess, prefix='apple')
So the model will write about apples. However, they could be the wrong apples: it could produce something like "Apple computers are very..." and so on. So we need to specify which apples we want:
gpt2.generate(sess, prefix='delicious apple')
and the model will write about fruit apples.
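For completeness, the generate call also takes sampling knobs that affect how on-topic the output stays. A sketch (the parameter values are just examples, not recommendations):

```python
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess)  # load a previously finetuned checkpoint from checkpoint/run1

gpt2.generate(sess,
              prefix="delicious apple",  # steer the opening words
              length=100,                # number of tokens to generate
              temperature=0.7,           # lower = more focused, less random output
              nsamples=3)                # print three samples
```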

What I really don't understand is: why do we need to finetune the model if we can just use a prefix?
