zsychina

🐭

Little mouse me, I'm done for again

Siyuan Zhu zsychina

Kitty

10 followers · 57 following

Sun Yat-sen University
Guangzhou, China

zsychina/README.md

Hi there 👋

🔭 Education Experience:

Undergraduate: DLUT School of Automation@20FALL

Master: SYSU School of Computer Science@24FALL

🌱 Research Focus:

I'm interested in reinforcement learning, agents, and using RL to strengthen the decision-making ability of LLM agents.

Looking forward to making friends and collaborating with you!

Pinned

PrefTransPPO Public

Uses a Preference Transformer to learn a reward function from a dataset, then trains an agent with PPO

Python
ppo-vanilla Public

Minimal PPO implementation

Python
ppo-continuous Public

PPO (continuous)

Python
sysu-select-course-script Public

Course selection script for graduate students at Sun Yat-sen University

Python
GA-PID-Optimize Public

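Tuning PID parameters with a genetic algorithm; Dalian University of Technology, 2023, "Modern Intelligent Optimization Algorithms" x "Computer Control Technology Course Design"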

Python 1
ppo-transformer Public

GPT-2-style transformer for sequential decision making in Gym environments

Python

zsychina (Siyuan Zhu) · GitHub
