eval

This branch is 6 commits behind hkust-nlp/simpleRL-reason:main.

Name	Last commit message	Last commit date
data	upload eval code	Jan 25, 2025
latex2sympy	upload eval code	Jan 25, 2025
sh	upload eval code	Jan 25, 2025
LICENSE	upload eval code	Jan 25, 2025
README.md	upload eval code	Jan 25, 2025
data_loader.py	upload eval code	Jan 25, 2025
evaluate.py	upload eval code	Jan 25, 2025
examples.py	upload eval code	Jan 25, 2025
grader.py	upload eval code	Jan 25, 2025
math_eval.py	upload eval code	Jan 25, 2025
math_utils.py	upload eval code	Jan 25, 2025
model_utils.py	upload eval code	Jan 25, 2025
parser.py	upload eval code	Jan 25, 2025
process.py	upload eval code	Jan 25, 2025
python_executor.py	upload eval code	Jan 25, 2025
requirements.txt	upload eval code	Jan 25, 2025
rm_maj_eval.py	upload eval code	Jan 25, 2025
trajectory.py	upload eval code	Jan 25, 2025
utils.py	upload eval code	Jan 25, 2025

README.md

Requirements

Install the required packages with the following commands:

cd latex2sympy
pip install -e .
cd ..
pip install -r requirements.txt 
pip install vllm==0.5.1 --no-build-isolation
pip install transformers==4.42.3
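
Because the last two commands pin exact versions, it can be useful to confirm what actually ended up installed. The following is a minimal sketch, not part of this repository: `installed_version` and `report_pin` are hypothetical helper names, and the sketch assumes `pip` and `awk` are available on the PATH.

```shell
# Hypothetical helpers (not shipped with this repo) for checking version pins.

installed_version() {
    # Print the installed version of package $1, or nothing if it is missing.
    pip show "$1" 2>/dev/null | awk '/^Version:/ {print $2}'
}

report_pin() {
    # $1 = package, $2 = pinned version, $3 = installed version (may be empty)
    if [ "$3" = "$2" ]; then
        echo "ok: $1==$2"
    else
        echo "mismatch: $1 pinned to $2, found ${3:-none}"
        return 1
    fi
}

# Example calls, using the pins from the install commands above:
#   report_pin vllm 0.5.1 "$(installed_version vllm)"
#   report_pin transformers 4.42.3 "$(installed_version transformers)"
```

Keeping the comparison separate from the `pip show` lookup means the comparison logic can be exercised without any packages installed.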

Evaluation

You can evaluate the Qwen2.5-Math-Instruct and Qwen2-Math-Instruct series models with the following commands:

# Qwen2.5-Math-Instruct Series
PROMPT_TYPE="qwen25-math-cot"
# Qwen2.5-Math-1.5B-Instruct
export CUDA_VISIBLE_DEVICES="0"
MODEL_NAME_OR_PATH="Qwen/Qwen2.5-Math-1.5B-Instruct"
bash sh/eval.sh $PROMPT_TYPE $MODEL_NAME_OR_PATH

# Qwen2.5-Math-7B-Instruct
export CUDA_VISIBLE_DEVICES="0"
MODEL_NAME_OR_PATH="Qwen/Qwen2.5-Math-7B-Instruct"
bash sh/eval.sh $PROMPT_TYPE $MODEL_NAME_OR_PATH

# Qwen2.5-Math-72B-Instruct
export CUDA_VISIBLE_DEVICES="0,1,2,3"
MODEL_NAME_OR_PATH="Qwen/Qwen2.5-Math-72B-Instruct"
bash sh/eval.sh $PROMPT_TYPE $MODEL_NAME_OR_PATH


# Qwen2-Math-Instruct Series
PROMPT_TYPE="qwen-boxed"
# Qwen2-Math-1.5B-Instruct
export CUDA_VISIBLE_DEVICES="0"
MODEL_NAME_OR_PATH="Qwen/Qwen2-Math-1.5B-Instruct"
bash sh/eval.sh $PROMPT_TYPE $MODEL_NAME_OR_PATH

# Qwen2-Math-7B-Instruct
export CUDA_VISIBLE_DEVICES="0"
MODEL_NAME_OR_PATH="Qwen/Qwen2-Math-7B-Instruct"
bash sh/eval.sh $PROMPT_TYPE $MODEL_NAME_OR_PATH

# Qwen2-Math-72B-Instruct
export CUDA_VISIBLE_DEVICES="0,1,2,3"
MODEL_NAME_OR_PATH="Qwen/Qwen2-Math-72B-Instruct"
bash sh/eval.sh $PROMPT_TYPE $MODEL_NAME_OR_PATH
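
The six invocations above differ only in prompt type, model series, model size, and GPU list, so they can be generated in a loop. The sketch below is a dry run that prints each command instead of executing it. `gpus_for` and `eval_cmd` are hypothetical helpers, not part of this repository; the only things taken from the commands above are `sh/eval.sh` with its two positional arguments, the prompt types, the model names, and the GPU lists. Defaulting any size other than 72B to GPU `0` is an assumption.

```shell
# Hypothetical helpers (not shipped with this repo); prints commands only.

gpus_for() {
    # Map a model size to the GPU list used in the commands above.
    case "$1" in
        72B) echo "0,1,2,3" ;;
        *)   echo "0" ;;   # assumption: every other size fits on one GPU
    esac
}

eval_cmd() {
    # $1 = prompt type, $2 = series (e.g. Qwen2.5-Math), $3 = size (e.g. 7B)
    printf 'CUDA_VISIBLE_DEVICES=%s bash sh/eval.sh %s Qwen/%s-%s-Instruct\n' \
        "$(gpus_for "$3")" "$1" "$2" "$3"
}

for size in 1.5B 7B 72B; do
    eval_cmd "qwen25-math-cot" "Qwen2.5-Math" "$size"
    eval_cmd "qwen-boxed" "Qwen2-Math" "$size"
done
```

Each printed line is itself a valid shell command, so the output can be reviewed first and then piped to `sh` to run the evaluations.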

Acknowledgement

The codebase is adapted from math-evaluation-harness.
