Add support for SD3 testing along with a refactor of the suite. #266

Status: Open · wants to merge 14 commits into base: main · Changes from all commits
28 changes: 16 additions & 12 deletions .github/workflows/test_iree.yml
@@ -101,15 +101,17 @@ jobs:
           - name: cpu_llvm_task
             runs-on: nodai-amdgpu-w7900-x86-64
             models-config-file: config_pytorch_models_cpu_llvm_task.json
-            sdxl-prompt-encoder-config-file: config_sdxl_prompt_encoder_cpu_llvm_task.json
-            sdxl-unet-config-file: config_sdxl_scheduled_unet_cpu_llvm_task.json
-            sdxl-vae-decode-config-file: config_sdxl_vae_decode_cpu_llvm_task.json
+            sdxl-prompt-encoder-config-file: config_sd_prompt_encoder_cpu_llvm_task.json
+            sdxl-unet-config-file: config_sd_scheduled_unet_cpu_llvm_task.json
+            sdxl-vae-decode-config-file: config_sd_vae_decode_cpu_llvm_task.json
+            backend: cpu
           - name: gpu_mi250_rocm
             runs-on: nodai-amdgpu-mi250-x86-64
             models-config-file: config_gpu_rocm_models.json
-            sdxl-prompt-encoder-config-file: config_sdxl_prompt_encoder_gpu_rocm.json
-            sdxl-unet-config-file: config_sdxl_scheduled_unet_gpu_rocm.json
-            sdxl-vae-decode-config-file: config_sdxl_vae_decode_gpu_rocm.json
+            sdxl-prompt-encoder-config-file: config_sd_prompt_encoder_gpu_rocm.json
+            sdxl-unet-config-file: config_sd_scheduled_unet_gpu_rocm.json
+            sdxl-vae-decode-config-file: config_sd_vae_decode_gpu_rocm.json
+            backend: rocm
     env:
       VENV_DIR: ${{ github.workspace }}/.venv
       IREE_TEST_FILES: ~/iree_tests_cache
@@ -163,12 +165,12 @@ jobs:
             --durations=0 \
             --config-files=${MODELS_CONFIG_FILE_PATH}

-      - name: "Running real weights SDXL prompt encoder tests"
+      - name: "Running real weights SDXL + SD3 prompt encoder tests"
         id: prompt_encoder
         if: ${{ !cancelled() }}
         run: |
           source ${VENV_DIR}/bin/activate
-          pytest iree_tests/pytorch/models/sdxl-prompt-encoder-tank \
+          pytest iree_tests/pytorch/models/sd-clip \
             -rpfE \
             -k real_weights \
             --no-skip-tests-missing-files \
@@ -178,12 +180,12 @@
             --durations=0 \
             --config-files=${SDXL_PROMPT_ENCODER_CONFIG_FILE_PATH}

-      - name: "Running real weights SDXL scheduled unet tests"
+      - name: "Running real weights SDXL + SD3 scheduled unet/mmdit tests"
         id: unet
         if: ${{ !cancelled() }}
         run: |
           source ${VENV_DIR}/bin/activate
-          pytest iree_tests/pytorch/models/sdxl-scheduled-unet-3-tank \
+          pytest iree_tests/pytorch/models/sd-unet \
             -rpfE \
             -k real_weights \
             --no-skip-tests-missing-files \
@@ -192,13 +194,15 @@
             --timeout=1200 \
             --durations=0 \
             --config-files=${SDXL_UNET_CONFIG_FILE_PATH}
+        env:
+          IREE_TEST_BACKEND: ${{ matrix.backend }}

-      - name: "Running real weights SDXL vae decode tests"
+      - name: "Running real weights SDXL + SD3 vae decode tests"
         id: vae
         if: ${{ !cancelled() }}
         run: |
           source ${VENV_DIR}/bin/activate
-          pytest iree_tests/pytorch/models/sdxl-vae-decode-tank \
+          pytest iree_tests/pytorch/models/sd-vae \
             -rpfE \
             -k real_weights \
             --no-skip-tests-missing-files \
6 changes: 3 additions & 3 deletions iree_tests/benchmarks/benchmark_sdxl_rocm.py
@@ -12,9 +12,9 @@

 benchmark_dir = os.path.dirname(os.path.realpath(__file__))
 iree_root = os.path.dirname(benchmark_dir)
-prompt_encoder_dir = f"{iree_root}/pytorch/models/sdxl-prompt-encoder-tank"
-scheduled_unet_dir = f"{iree_root}/pytorch/models/sdxl-scheduled-unet-3-tank"
-vae_decode_dir = f"{iree_root}/pytorch/models/sdxl-vae-decode-tank"
+prompt_encoder_dir = f"{iree_root}/pytorch/models/sd-clip/sdxl-prompt-encoder-tank"
+scheduled_unet_dir = f"{iree_root}/pytorch/models/sd-unet/sdxl-scheduled-unet-3-tank"
+vae_decode_dir = f"{iree_root}/pytorch/models/sd-vae/sdxl-vae-decode-tank"

def run_iree_command(args: Sequence[str] = ()):
command = "Exec:", " ".join(args)
5 changes: 4 additions & 1 deletion iree_tests/configs/config_gpu_rocm_models.json
@@ -12,7 +12,10 @@
"skip_compile_tests": [
"sdxl-scheduled-unet-3-tank",
"sdxl-vae-decode-tank",
"sdxl-prompt-encoder-tank"
"sdxl-prompt-encoder-tank",
"sd3-mmdit",
"sd3-vae-decode",
"sd3-prompt-encoder"
Comment on lines 12 to +18 — Member:

If this is going to exclude all these special models they should be moved to a different subdirectory.

The config_*_models configs were intended to test a set of models that were imported into the test suite in a uniform way, and each of these models is special in some way.
],
"skip_run_tests": [],
"expected_compile_failures": [
5 changes: 4 additions & 1 deletion iree_tests/configs/config_pytorch_models_cpu_llvm_task.json
@@ -10,7 +10,10 @@
"skip_compile_tests": [
"sdxl-scheduled-unet-3-tank",
"sdxl-vae-decode-tank",
"sdxl-prompt-encoder-tank"
"sdxl-prompt-encoder-tank",
"sd3-mmdit",
"sd3-vae-decode",
"sd3-prompt-encoder"
],
"skip_run_tests": [],
"expected_compile_failures": [
20 changes: 20 additions & 0 deletions iree_tests/configs/config_sd_prompt_encoder_cpu_llvm_task.json
@@ -0,0 +1,20 @@
{
"config_name": "cpu_llvm_task",
"iree_compile_flags" : [
"--iree-hal-target-backends=llvm-cpu",
"--iree-llvmcpu-target-cpu-features=host",
"--iree-llvmcpu-fail-on-out-of-bounds-stack-allocation=false",
"--iree-llvmcpu-distribution-size=32",
"--iree-opt-const-eval=false",
"--iree-llvmcpu-enable-ukernels=all",
"--iree-global-opt-enable-quantized-matmul-reassociation"
],
"iree_run_module_flags": [
"--device=local-task",
"--parameters=model=real_weights.irpa"
],
"skip_compile_tests": [],
"skip_run_tests": [],
"expected_compile_failures": [],
"expected_run_failures": []
}
@@ -18,17 +18,12 @@
 ],
 "iree_run_module_flags": [
   "--device=hip",
-  "--parameters=model=real_weights.irpa",
-  "--input=1x64xi64=@inference_input.0.bin",
-  "--input=1x64xi64=@inference_input.1.bin",
-  "--input=1x64xi64=@inference_input.2.bin",
-  "--input=1x64xi64=@inference_input.3.bin",
-  "--expected_output=2x64x2048xf16=@inference_output.0.bin",
-  "--expected_output=2x1280xf16=@inference_output.1.bin",
-  "--expected_f16_threshold=1.0f"
+  "--parameters=model=real_weights.irpa"
 ],
 "skip_compile_tests": [],
 "skip_run_tests": [],
-"expected_compile_failures": [],
+"expected_compile_failures": [
+  "sd3-prompt-encoder"
+],
 "expected_run_failures": []
 }
23 changes: 23 additions & 0 deletions iree_tests/configs/config_sd_scheduled_unet_cpu_llvm_task.json
@@ -0,0 +1,23 @@
{
"config_name": "cpu_llvm_task",
"iree_compile_flags" : [
"--iree-hal-target-backends=llvm-cpu",
"--iree-llvmcpu-target-cpu-features=host",
"--iree-llvmcpu-fail-on-out-of-bounds-stack-allocation=false",
"--iree-llvmcpu-distribution-size=32",
"--iree-opt-const-eval=false",
"--iree-llvmcpu-enable-ukernels=all",
"--iree-global-opt-enable-quantized-matmul-reassociation"
],
"iree_run_module_flags": [
"--device=local-task",
"--parameters=model=real_weights.irpa"
],
"skip_compile_tests": [],
"skip_run_tests": [],
"expected_compile_failures": [
"sdxl-scheduled-unet-3-tank",
"sd3-mmdit"
],
"expected_run_failures": []
}
@@ -21,17 +21,12 @@
 ],
 "iree_run_module_flags": [
   "--device=hip",
-  "--parameters=model=real_weights.irpa",
-  "--module=sdxl_scheduled_unet_pipeline_fp16_rocm.vmfb",
-  "--input=1x4x128x128xf16=@inference_input.0.bin",
-  "--input=2x64x2048xf16=@inference_input.1.bin",
-  "--input=2x1280xf16=@inference_input.2.bin",
-  "--input=1xf16=@inference_input.3.bin",
-  "--expected_output=1x4x128x128xf16=@inference_output.0.bin",
-  "--expected_f16_threshold=0.7f"
+  "--parameters=model=real_weights.irpa"
 ],
 "skip_compile_tests": [],
 "skip_run_tests": [],
-"expected_compile_failures": [],
+"expected_compile_failures": [
+  "sd3-mmdit"
+],
 "expected_run_failures": []
 }
@@ -2,13 +2,17 @@
"config_name": "cpu_llvm_task",
"iree_compile_flags" : [
"--iree-hal-target-backends=llvm-cpu",
"--iree-llvmcpu-target-cpu-features=host"
"--iree-llvmcpu-target-cpu-features=host",
"--iree-llvmcpu-fail-on-out-of-bounds-stack-allocation=false",
"--iree-llvmcpu-distribution-size=32",
"--iree-opt-const-eval=false",
"--iree-llvmcpu-enable-ukernels=all",
"--iree-global-opt-enable-quantized-matmul-reassociation"
],
"iree_run_module_flags": [
"--device=local-task",
"--parameters=model=real_weights.irpa",
"--input=1x4x128x128xf16=@inference_input.0.bin",
"--expected_output=1x3x1024x1024xf16=@inference_output.0.bin",
"--expected_f32_threshold=0.01f",
"--expected_f16_threshold=0.02f"
],
"skip_compile_tests": [],
@@ -16,8 +16,7 @@
"iree_run_module_flags": [
"--device=hip",
"--parameters=model=real_weights.irpa",
"--input=1x4x128x128xf16=@inference_input.0.bin",
"--expected_output=1x3x1024x1024xf16=@inference_output.0.bin",
"--expected_f32_threshold=0.6f",
"--expected_f16_threshold=0.4f"
],
"skip_compile_tests": [],
22 changes: 0 additions & 22 deletions iree_tests/configs/config_sdxl_prompt_encoder_cpu_llvm_task.json

This file was deleted.

24 changes: 0 additions & 24 deletions iree_tests/configs/config_sdxl_scheduled_unet_cpu_llvm_task.json

This file was deleted.

18 changes: 16 additions & 2 deletions iree_tests/conftest.py
@@ -319,7 +319,12 @@ def __init__(self, spec, **kwargs):

         self.run_args = ["iree-run-module", f"--module={vmfb_name}"]
         self.run_args.extend(self.spec.iree_run_module_flags)
-        self.run_args.append(f"--flagfile={self.spec.data_flagfile_name}")
+
+        # expand data flag file, so beter for logging and can use environment variables
+        flag_file_path = f"{self.test_cwd}/{self.spec.data_flagfile_name}"
+        file = open(flag_file_path)
+        for line in file:
+            self.run_args.append(line.rstrip())
Comment on lines +323 to +327 — Member:

Comment style: start with an uppercase character, end with a period. Also fix typo and adjust wording.

Suggested change:

-        # expand data flag file, so beter for logging and can use environment variables
+        # Expand data flag files to make logs explicit.
+        # Tools accept `--flagfile=/path/to/flagfile` but logs are easier
+        # to read with explicit `--flag1=value1 --flag2=value2` flags.
         flag_file_path = f"{self.test_cwd}/{self.spec.data_flagfile_name}"
         file = open(flag_file_path)
         for line in file:
             self.run_args.append(line.rstrip())

Note that flagfiles are a requirement for some commands and environments. For example, certain terminals have character length limits around 512 or so characters for commands, and putting flags in files works around that.
     def runtest(self):
         # TODO(scotttodd): log files needed by the test (remote files / git LFS)
@@ -385,6 +390,8 @@ def test_compile(self):
         compile_env["IREE_TEST_PATH_EXTENSION"] = os.getenv(
             "IREE_TEST_PATH_EXTENSION", default=str(self.test_cwd)
         )
+
+        # expand environment variable for logging
         path_extension = compile_env["IREE_TEST_PATH_EXTENSION"]
         cmd = subprocess.list2cmdline(self.compile_args)
         cmd = cmd.replace("${IREE_TEST_PATH_EXTENSION}", f"{path_extension}")
@@ -401,8 +408,15 @@

     def test_run(self):
         run_env = os.environ.copy()
-        cmd = subprocess.list2cmdline(self.run_args)
+        run_env["IREE_TEST_BACKEND"] = os.getenv(
+            "IREE_TEST_BACKEND", default="none"
+        )
+
+        # expand environment variable for logging
+        backend = run_env["IREE_TEST_BACKEND"]
+        cmd = subprocess.list2cmdline(self.run_args)
+        cmd = cmd.replace("${IREE_TEST_BACKEND}", f"{backend}")
Comment on lines +411 to +418 — Member:

This is too sketchy IMO. Flagfiles should be runnable as-is and this is adding an extra indirection that will be too difficult to reproduce outside of a CI environment. Any tests needing this behavior should be using a mechanism other than this conftest.py.

Member:
Let me elaborate a bit...

This iree_tests subproject is home to:

  • model sources
  • test inputs
  • test outputs
  • utilities for converting into the test suite format
  • utilities for helping run test suites
  • simple examples of test suite instantiations

For large test suites following a standardized style (ONNX unit tests, ONNX models, StableHLO models, JAX programs, etc.), iree_tests/conftest.py and the config files there give us a lightweight way to define test cases that can be piped through iree-compile -> iree-run-module using a shared set of flags for each test configuration (compile options + run options).

For SDXL, SD3, llama, and other models that we're giving special attention, we should be testing both the out-of-the-box import -> compile -> run path that fits that mold and a curated path like https://github.com/iree-org/iree/blob/main/experimental/regression_suite/tests/pregenerated/test_llama2.py or https://github.com/nod-ai/sdxl-scripts.

  • We can keep a separation between the model sources here (including any scripts needed to get from frameworks, safetensors files, requirements.txt files, etc. down to .mlir and .irpa files) and downstream test instantiations - we'll just end up with more (I'm assuming Python) code downstream to set up exactly what we want tested.

At the point where a model needs a carve-out in a config.json file, an environment variable, a nonstandard file (spec file), or changes to conftest.py, it is too complex/different and should be given its own separate test.

  • We saw some confusion today on Discord when one of these special tests was newly passing here. It took several of us some time to find which config file needed updating, since the test was running as part of "sdxl_scheduled_unet" and not "pytorch_models".

We can share test code between the standardized path and custom model tests where it makes sense to do so. In particular, the "compile a program" and "run a program" parts could be fixtures (as they are in https://github.com/iree-org/iree/blob/main/experimental/regression_suite/ireers/fixtures.py). We should have a common way for those stages to run, with the same logging format and the same error messages when tests are unexpectedly passing, newly failing, etc.
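For illustration, a minimal sketch of what such shared compile/run fixtures could look like (the fixture names, flags, and test body are assumptions loosely modeled on the ireers/fixtures.py idea, not the actual API):

# Hypothetical sketch: the "compile a program" and "run a program" stages as
# pytest fixtures. Names and flags here are illustrative assumptions.
import subprocess

import pytest


@pytest.fixture
def iree_compile(tmp_path):
    """Returns a helper that compiles a source program into a .vmfb module."""

    def _compile(source_path, flags):
        vmfb_path = tmp_path / "module.vmfb"
        cmd = ["iree-compile", str(source_path), *flags, "-o", str(vmfb_path)]
        subprocess.run(cmd, check=True)
        return vmfb_path

    return _compile


@pytest.fixture
def iree_run_module():
    """Returns a helper that runs a compiled module with the given flags."""

    def _run(vmfb_path, flags):
        cmd = ["iree-run-module", f"--module={vmfb_path}", *flags]
        return subprocess.run(cmd, check=True, capture_output=True, text=True)

    return _run


def test_sd3_mmdit_cpu(iree_compile, iree_run_module):
    # Flags mirror the CPU config files in this PR.
    vmfb = iree_compile(
        "pytorch/models/sd-unet/sd3-mmdit/model.mlirbc",
        ["--iree-hal-target-backends=llvm-cpu"],
    )
    iree_run_module(
        vmfb,
        ["--device=local-task", "--parameters=model=real_weights.irpa"],
    )

Keeping both stages as fixtures would let the standardized conftest.py path and the curated model tests share one logging format and one set of pass/fail messages.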

Contributor Author:
Yeah, this makes sense. I will take a look at an alternate path for the custom models.

Contributor Author:
We can still use the same downloading utilities, though, to fetch all the sources, right (download_remote_files.py)? We'll just have a different test_cases.json for the script to parse in the custom path?

Member:
We can keep the current downloading, or we could follow the fetch_source_fixture style in https://github.com/iree-org/iree/blob/6f178692f11d356a284eba915f2cf1d067d92a69/experimental/regression_suite/tests/pregenerated/test_llama2.py#L21-L24. If we're explicitly setting up test cases like in that file, then we could just fully script it all there, with no need for a separate .json file.

IDK, and I can't take a deep context switch right now to think through the design.

I'd start with that test_llama2.py and adapt it to the separate-repo model (test sources, inputs, and outputs in one repo, test configurations in another).
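For reference, a rough sketch of that fetch-style approach (the helper name and cache layout are assumptions, not the actual fetch_source_fixture API):

# Hypothetical sketch: download a remote test artifact into a local cache
# once, in the spirit of the fetch_source_fixture style linked above.
import urllib.request
from pathlib import Path

CACHE_DIR = Path.home() / "iree_tests_cache"  # Matches IREE_TEST_FILES above.


def fetch_remote_file(url: str) -> Path:
    """Downloads url into CACHE_DIR if missing and returns the local path."""
    local_path = CACHE_DIR / url.rsplit("/", maxsplit=1)[-1]
    if not local_path.exists():
        CACHE_DIR.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(local_path))
    return local_path


# Example with one of the SD3 prompt encoder files from test_cases.json below:
weights = fetch_remote_file(
    "https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/real_weights.irpa"
)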

Contributor Author:
Yeah, no worries. I'd rather stick to a unified downloading process (making use of the same tools) to a cache in the test suite, and unified compile/run through a fixture as you suggested, so we are using the same tools across the whole repo where we can. I will proceed with that for now.


         # TODO(scotttodd): expand flagfile(s)
         logging.getLogger().info(
             f"Launching run command:\n"  #
Git LFS file not shown
@@ -0,0 +1,9 @@
--input=1x77x2xi64=@inference_input.0.bin
--input=1x77x2xi64=@inference_input.1.bin
--input=1x77x2xi64=@inference_input.2.bin
--input=1x77x2xi64=@inference_input.3.bin
--input=1x77x2xi64=@inference_input.4.bin
--input=1x77x2xi64=@inference_input.5.bin
--expected_output=2x154x4096xf32=@inference_output.0.bin
--expected_output=2x2048xf32=@inference_output.1.bin
--expected_f32_threshold=0.15f
@@ -0,0 +1,7 @@
--input="1x77x2xi64"
--input="1x77x2xi64"
--input="1x77x2xi64"
--input="1x77x2xi64"
--input="1x77x2xi64"
--input="1x77x2xi64"
--parameters=splats.irpa
Git LFS file not shown
@@ -0,0 +1,25 @@
{
"file_format": "test_cases_v0",
"test_cases": [
{
"name": "splats",
"runtime_flagfile": "splat_data_flags.txt",
"remote_files": []
},
{
"name": "real_weights",
"runtime_flagfile": "real_weights_data_flags.txt",
"remote_files": [
"https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/inference_input.0.bin",
"https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/inference_input.1.bin",
"https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/inference_input.2.bin",
"https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/inference_input.3.bin",
"https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/inference_input.4.bin",
"https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/inference_input.5.bin",
"https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/inference_output.0.bin",
"https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/inference_output.1.bin",
"https://sharkpublic.blob.core.windows.net/sharkpublic/sai/sd3-prompt-encoder/real_weights.irpa"
]
}
]
}
@@ -0,0 +1,7 @@
--input=1x64xi64=@inference_input.0.bin
--input=1x64xi64=@inference_input.1.bin
--input=1x64xi64=@inference_input.2.bin
--input=1x64xi64=@inference_input.3.bin
--expected_output=2x64x2048xf16=@inference_output.0.bin
--expected_output=2x1280xf16=@inference_output.1.bin
--expected_f16_threshold=1.0f
3 changes: 3 additions & 0 deletions iree_tests/pytorch/models/sd-unet/sd3-mmdit/model.mlirbc
Git LFS file not shown
@@ -0,0 +1,7 @@
--parameters=model=real_weights.irpa
--input=2x16x128x128xf16=@inference_input.0.bin
--input=2x154x4096xf16=@inference_input.1.bin
--input=2x2048xf16=@inference_input.2.bin
--input=1xf16=@inference_input.3.bin
--expected_output=2x16x128x128xf32=@inference_output.0.bin
--expected_f16_threshold=1.0f
@@ -0,0 +1,5 @@
--input="2x16x128x128xf16"
--input="2x154x4096xf16"
--input="2x2048xf16"
--input="1xf16"
--parameters=splats.irpa
3 changes: 3 additions & 0 deletions iree_tests/pytorch/models/sd-unet/sd3-mmdit/splats.irpa
Git LFS file not shown