text generation quality for Chinese #95
Comments
Related: #46 (in the sense that some people have tried fine-tuning on Chinese texts before). |
Yes, I have checked issue #46. That topic is about an encoding issue, but my project produces Chinese characters successfully with no encoding problem; the issue is that the generated text reads like nonsense. |
Some key points are listed below:
|
@chiangandy, did you use the pretrained model to generate Chinese text, or did you train it from scratch? Sorry to bring this up again. I'm trying to do the same thing for Arabic. |
@mohataher Since the pretrained model is based on English, which differs from my project's target (Traditional Chinese), I trained the model on my own data. The result is not bad, but it could still be optimized further. If you want to train for Arabic, I am not sure whether a pretrained Arabic model exists; you could search on Google. If there is none, I suggest training a new model rather than using the pretrained one. Andy |
@chiangandy would you mind adding me on WeChat? I have some problems with training on Chinese. Thanks. |
I am happy to share my experience with you on this project.
江謝迪
|
Dear all,
What issue are you running into on this project? Can you give an example? If you
are also Chinese, I suggest we discuss it in Chinese, which will be
easier... :)
江謝迪
|
I used Colab to fine-tune on a Chinese novel, but the result is not really readable, as shown below:
======== SAMPLE 1 ========
是将吹雾挖出的将成功探测而出。令得她不过来了。
在这句她却便是张了什么处。可地云岚宗的基地本就有猜测的唶回纳,若是可见开一些他们纗地图下的威风。有什么那些自抗成形功探也是被实亀约不运的失踪,这个家伙?”
“按一东西。
“以后?”
第一千两百四纳乎容其收获
双翼下午床 偂静以及云岚宗家族时,现在穿过尮层地死死一死的一位完全自人。若是被这位似乎么好。不过这些层地曘众而速的缘故。先前云岚宗家族与家伙破碎,也知道。”
“按这些年边按一东西。”
双危得枯落双成一些纸藏。现在云岚宗家族。则是有着更是珋地的纳戒。一名一名视线成功探而来。似此如同一股落地被月地位置身给在落地墓墓吼。将三色山峰。都是在她身处的族人而出。他们。能够如何丧门两个家族事。我没有丝毫。比较给云岚宗家族身族成功压渐了过来二人。”
落地最后对此刻低低的落地。这些人吼力地双更驰在云岚宗这般种有些做完的同局一段时间。就在山脉交手吸了一圈。他们仅仅是将会从丝毫地毒间。那家伙。拥有会难以过足有山脉路。想必地实力。可怕的毒间不会速助引。”
心中现在也算是连脸色。纳戒的一道道人影击杀着自指大会回底独血之人。一个了。能够击
My training parameters are:
gpt2.finetune(sess, dataset="train.txt", model_name='345M', steps=1000, restore_from='fresh', print_every=20, sample_every=200, save_every=500)
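Before fine-tuning, it may be worth confirming that the dataset file really is valid UTF-8 and mostly Chinese. The following is a minimal sketch, not part of gpt-2-simple: the helper name `check_utf8` and the choice of the CJK Unified Ideographs range (U+4E00 to U+9FFF) as the "Chinese" test are assumptions made for illustration.

```python
def check_utf8(data: bytes):
    """Return (ok, cjk_ratio).

    ok        -- True if the bytes decode cleanly as UTF-8.
    cjk_ratio -- share of non-whitespace characters that fall in the
                 CJK Unified Ideographs block (an assumed proxy for Chinese).
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False, 0.0
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return True, 0.0
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return True, cjk / len(chars)


# In-memory demo; three of the six non-space characters are ideographs.
print(check_utf8("云岚宗 GPT".encode("utf-8")))
# 0xFF can never appear in valid UTF-8, so decoding fails.
print(check_utf8(b"\xff\xfe"))
```

To apply it to the dataset above, read `train.txt` in binary mode and pass the bytes to `check_utf8`. A failed decode or a low ratio would point at a data problem rather than a model problem.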
Since GPT-2 is supposed to be very powerful for text generation, I just want to find out whether this level of quality is normal or whether there is something I have not figured out yet.
Thanks
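One factor that may bear on the sample quality above: the released GPT-2 tokenizer is a byte-level BPE built from mostly English text, and common Chinese characters take three bytes each in UTF-8, so a Chinese character is often split across several tokens. The sketch below only measures the byte expansion in plain Python; it does not call the GPT-2 tokenizer, and the helper name `utf8_expansion` is hypothetical.

```python
def utf8_expansion(text: str) -> float:
    """Return the average number of UTF-8 bytes per character in text."""
    if not text:
        return 0.0
    return len(text.encode("utf-8")) / len(text)


english = "text generation"
chinese = "云岚宗家族"  # a phrase taken from the sample above

print(utf8_expansion(english))  # 1.0
print(utf8_expansion(chinese))  # 3.0
```

If the model has to emit roughly three byte-level pieces per character, a partly trained model can produce byte sequences that decode to valid but rare characters, which would be consistent with the unusual characters scattered through the sample. This is a possible explanation, not a confirmed diagnosis.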