Vulkan aq lib integration #171
Open
zlatinski wants to merge 4 commits into main from vulkan-aq-lib-integration
faefab5 to 39dec48
aa07153 to 5a3a8c8
5a3a8c8 to 391619b
Fix video profile IDC selection for 10-bit and higher bit-depth content.
This commit fixes automatic video profile selection based on bit depth and chroma subsampling format for H.264, H.265/HEVC, and AV1 encoders.
Problem:
When encoding 10-bit or 12-bit content, the encoder was requesting incorrect video profiles (e.g., HEVC Main instead of Main10 for 10-bit content), causing vkGetPhysicalDeviceVideoCapabilitiesKHR() to fail with VK_ERROR_VIDEO_PROFILE_FORMAT_NOT_SUPPORTED_KHR.
Solution:
1. Updated GetDefaultVideoProfileIdc() for all codecs to select correct profiles:
H.264:
- HIGH_444_PREDICTIVE for 10-bit or 4:4:4 chroma
- HIGH for 8-bit with 4:2:0/4:2:2
H.265/HEVC:
- FORMAT_RANGE_EXTENSIONS for 12-bit or 4:4:4 chroma
- MAIN_10 for 10-bit with 4:2:0/4:2:2
- MAIN for 8-bit with 4:2:0/4:2:2
AV1:
- PROFESSIONAL for 12-bit or 4:2:2 chroma
- HIGH for 4:4:4 chroma
- MAIN for 8-bit or 10-bit with 4:2:0
2. Modified EncoderConfig::InitVideoProfile() to explicitly set videoProfileIdc from GetDefaultVideoProfileIdc() when default (-1), ensuring correct profile is set before building the VkVideoCoreProfile.
3. Added VkVideoCoreProfile::DumpProfile() function that prints all profile parameters (codec, chroma subsampling, bit depth, profile IDC) when a capability query fails, aiding debugging.
4. Fixed VkVideoEncoder::InitEncoder() to check InitDeviceCapabilities() return value - previously it was ignored, causing crashes on unsupported profiles.
5. Replaced assert() statements with proper error messages including __FILE__ and __LINE__ for easier debugging.
6. Removed duplicate return statements in error handling paths.
Files modified:
- VkVideoCoreProfile.h: Added DumpProfile() method
- VulkanVideoCapabilities.h: Call DumpProfile() on failure, improved error msgs
- VkEncoderConfig.cpp: Set videoProfileIdc explicitly in InitVideoProfile()
- VkEncoderConfigH264.h: Profile selection for H.264
- VkEncoderConfigH265.h: Profile selection for HEVC
- VkEncoderConfigAV1.h: Profile selection for AV1
- VkEncoderConfig*.cpp: Improved error messages with file:line info
- VkVideoEncoder.cpp: Check InitDeviceCapabilities() return, remove asserts
- vulkan_video_encoder.cpp: Improved error messages, remove asserts
Signed-off-by: Tony Zlatinski <[email protected]>
Signed-off-by: Tony Zlatinski <[email protected]>
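The profile selection rules listed in this commit message can be sketched roughly as below. This is an illustration only, not the code from the PR: the enums and the Select*Profile functions are hypothetical stand-ins for the StdVideo* profile IDC values and the per-codec GetDefaultVideoProfileIdc() implementations. Treating every bit depth above 8 as "10-bit" and every bit depth above 10 as "12-bit" is also an assumption, since the message only names 8, 10 and 12.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; the real encoder uses the Vulkan Video std header enums.
enum class Chroma { k420, k422, k444 };
enum class H264Profile { High, High444Predictive };
enum class H265Profile { Main, Main10, FormatRangeExtensions };
enum class Av1Profile { Main, High, Professional };

// H.264: HIGH_444_PREDICTIVE for 10-bit or 4:4:4, HIGH for 8-bit 4:2:0/4:2:2.
inline H264Profile SelectH264Profile(uint32_t bitDepth, Chroma chroma) {
    if (bitDepth > 8 || chroma == Chroma::k444) {
        return H264Profile::High444Predictive;
    }
    return H264Profile::High;
}

// H.265: FORMAT_RANGE_EXTENSIONS for 12-bit or 4:4:4, MAIN_10 for 10-bit,
// MAIN for 8-bit (the last two only with 4:2:0/4:2:2).
inline H265Profile SelectH265Profile(uint32_t bitDepth, Chroma chroma) {
    if (bitDepth > 10 || chroma == Chroma::k444) {
        return H265Profile::FormatRangeExtensions;
    }
    if (bitDepth > 8) {
        return H265Profile::Main10;
    }
    return H265Profile::Main;
}

// AV1: PROFESSIONAL for 12-bit or 4:2:2, HIGH for 4:4:4,
// MAIN for 8-bit or 10-bit 4:2:0.
inline Av1Profile SelectAv1Profile(uint32_t bitDepth, Chroma chroma) {
    if (bitDepth > 10 || chroma == Chroma::k422) {
        return Av1Profile::Professional;
    }
    if (chroma == Chroma::k444) {
        return Av1Profile::High;
    }
    return Av1Profile::Main;
}
```

With these rules, 10-bit 4:2:0 HEVC content maps to Main10 rather than Main, which is the case the commit message calls out as previously failing the capability query.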
Major Changes:
1. CMake Integration:
- Add NV_AQ_GPU_LIB build option for optional aq-vulkan library integration
- Automatically detect and configure all AQ include directories (aq_common, aq-vulkan)
- Link AQ library to encoder shared/static libs and test executables
- Set NV_AQ_GPU_LIB_SUPPORTED define when AQ is available
2. AQ Processing Integration:
- Add spatial and temporal AQ configuration parameters to VkEncoderConfig
- New CLI args: --spatialAQStrength, --temporalAQStrength, --aqDumpDir
- Integrate AqProcessor into encoder pipeline with proper semaphore synchronization
- Enable QP delta map mode automatically when AQ is enabled
- Configure AQ based on codec type (H.264/H.265/AV1) and chroma format
AQ strength normalization range is [-1.0, 1.0]:
- 0.0 = default/neutral midpoint (driver default behavior)
- 1.0 = maximum AQ strength
- -1.0 = minimum AQ strength
- < -1.0 (e.g., -2.0) = AQ disabled
- > 1.0 (e.g., 2.0) = AQ disabled
This provides a more intuitive interface where 0 means 'default' rather
than 'disabled', matching common expectations for normalized parameters.
aq-vulkan: add AQ config parameters to enable AQ
--spatialAQStrength <float> : Spatial AQ strength in range [-1.0, 1.0]
    -2.0 = disabled (default)
    If in range [-1.0, 1.0], spatial AQ is enabled
    In combined mode, ratio determines mix (larger value = more influence)
--temporalAQStrength <float> : Temporal AQ strength in range [-1.0, 1.0]
    -2.0 = disabled (default)
    If in range [-1.0, 1.0], temporal AQ is enabled
    In combined mode, ratio determines mix (larger value = more influence)
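The range rule for the two strength arguments could be expressed along these lines. This is a minimal sketch, assuming only what the help text states: a value inside [-1.0, 1.0] enables that AQ type and anything outside (including the -2.0 default) disables it. AqMode, IsAqStrengthEnabled and SelectAqMode are hypothetical names, not identifiers from the PR, and how the two strengths are mixed in combined mode is not modeled here.

```cpp
#include <cassert>

// Hypothetical names; only the numeric ranges come from the CLI help text.
constexpr float kAqStrengthDisabled = -2.0f;  // documented default value

enum class AqMode { Disabled, SpatialOnly, TemporalOnly, Combined };

// A strength enables its AQ type only when it lies inside [-1.0, 1.0].
// NaN fails both comparisons and is therefore treated as disabled.
inline bool IsAqStrengthEnabled(float strength) {
    return strength >= -1.0f && strength <= 1.0f;
}

// Derive the overall AQ mode from the two independent strength settings.
inline AqMode SelectAqMode(float spatialStrength, float temporalStrength) {
    const bool spatial = IsAqStrengthEnabled(spatialStrength);
    const bool temporal = IsAqStrengthEnabled(temporalStrength);
    if (spatial && temporal) return AqMode::Combined;
    if (spatial) return AqMode::SpatialOnly;
    if (temporal) return AqMode::TemporalOnly;
    return AqMode::Disabled;
}
```

Under this reading, passing only --spatialAQStrength 0.0 selects spatial AQ at the default strength, while leaving both arguments at -2.0 keeps AQ off.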
3. GOP Structure Refactoring:
- Move GOP position calculation to EncodeFrameCommon (common for all codecs)
- Eliminate duplicate GetPositionInGOP calls in codec-specific implementations
- Simplify IDR detection by checking gopPosition.pictureType directly
- Share GOP state management across H.264/H.265/AV1 encoders
4. Frame Processing Pipeline:
- Update EncodeFrameCommon to call EncodeFrame first, then AQ processing
- Add AQ slot allocation with proper flags (SPATIAL/TEMPORAL/SUBSAMPLING)
- Configure reference requirements based on frame type (P-frame: prev ref, B-frame: both)
- Wait on AQ semaphore in SubmitVideoCodingCmds instead of input semaphore
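As an illustration of the per-frame-type reference requirements mentioned above, a sketch might look like the following. The PictureType enum, the flag names and RequiredAqReferences are hypothetical and do not come from the AQ library. Reading "both" as previous plus next reference, and requiring no references for intra frames, are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical picture types and reference flags, for illustration only.
enum class PictureType { Idr, I, P, B };

enum AqRefFlags : uint32_t {
    kAqRefNone = 0u,
    kAqRefPrev = 1u << 0,  // previous (past) reference
    kAqRefNext = 1u << 1   // next (future) reference
};

// P-frames need the previous reference, B-frames need both.
// Intra frames are assumed to need none.
inline uint32_t RequiredAqReferences(PictureType type) {
    switch (type) {
        case PictureType::P:
            return kAqRefPrev;
        case PictureType::B:
            return kAqRefPrev | kAqRefNext;
        default:
            return kAqRefNone;
    }
}
```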
5. Cleanup & Fixes:
- Fix filterPoolNode nullptr in VulkanVideoFrameBuffer deinitialization
- Add aqProcessorSlot to VkVideoEncodeFrameInfo lifecycle management
- Remove redundant isIdr local variable (use gopPosition.pictureType)
- Add build script for remote compilation with AQ enabled
6. Add a dedicated image pool semaphore
7. Fix QP-map create flags for AQ
8. Switch dumpFilenameOrdering to INPUT_ORDER
9. Prevent m_outputImageAspects from being overwritten by Y-subsampling
When generating shader code for the Y-subsampled image (binding 9),
ShaderGenerateImagePlaneDescriptors was called with m_outputImageAspects
by reference. For R8/R16 formats, this function sets imageAspects to
VK_IMAGE_ASPECT_PLANE_0_BIT only, which wiped out the chroma plane bits.
This caused hasOutputChroma to evaluate to false, so chroma data was
never written to the output image, resulting in green-only output.
Fix: Use a separate local variable for the subsampledImage aspects
instead of modifying m_outputImageAspects.
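The aliasing problem described in item 9 can be reproduced in a few lines. The sketch below is a simplified model, not the actual filter code: GeneratePlaneDescriptors is a hypothetical stand-in for ShaderGenerateImagePlaneDescriptors, and the Filter struct only mimics the m_outputImageAspects member. The plane constants use the same numeric values as VK_IMAGE_ASPECT_PLANE_0_BIT, VK_IMAGE_ASPECT_PLANE_1_BIT and VK_IMAGE_ASPECT_PLANE_2_BIT.

```cpp
#include <cassert>
#include <cstdint>

// Same values as VK_IMAGE_ASPECT_PLANE_{0,1,2}_BIT in the Vulkan headers.
constexpr uint32_t kPlane0 = 0x10u;
constexpr uint32_t kPlane1 = 0x20u;
constexpr uint32_t kPlane2 = 0x40u;

// Hypothetical stand-in: for a single-plane (R8/R16) image the function
// overwrites the aspect mask passed by reference with plane 0 only.
inline void GeneratePlaneDescriptors(bool singlePlaneFormat, uint32_t& imageAspects) {
    if (singlePlaneFormat) {
        imageAspects = kPlane0;
    }
}

struct Filter {
    uint32_t outputImageAspects = kPlane0 | kPlane1 | kPlane2;

    // Bug: the member is passed by reference and loses its chroma bits.
    void GenerateSubsampledBuggy() {
        GeneratePlaneDescriptors(true, outputImageAspects);
    }

    // Fix: a local copy absorbs the modification, the member stays intact.
    void GenerateSubsampledFixed() {
        uint32_t subsampledAspects = outputImageAspects;
        GeneratePlaneDescriptors(true, subsampledAspects);
    }

    bool HasOutputChroma() const {
        return (outputImageAspects & (kPlane1 | kPlane2)) != 0;
    }
};
```

After GenerateSubsampledBuggy() the chroma check returns false, which corresponds to the green-only output described above. After GenerateSubsampledFixed() it still returns true.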
Implementation Details:
- AQ processor created only when spatialAQStrength > 0 or temporalAQStrength > 0
- GOP parameters passed to AQ: frame count, consecutive B-frames, IDR period
- Semaphore chain: Input -> AQ -> Encode (using VK_PIPELINE_STAGE_2_TRANSFER_BIT_KHR)
- Debug support: CPU upload resources, buffer dumping, raw file output
Tested with: H.264, H.265, AV1 encoding with various GOP structures
AQ: Determine AQ modes from the input AQ flags
AQ: Set the downsampled shift parameters
common: UpdateImageDescriptorSets() build warning fix
Signed-off-by: Tony Zlatinski <[email protected]>
Provides build and execution instructions and scripts for AQ.
Signed-off-by: Tony Zlatinski <[email protected]>
391619b to 157c8b1