@zlatinski
Contributor

No description provided.

@zlatinski zlatinski force-pushed the vulkan-aq-lib-integration branch 3 times, most recently from faefab5 to 39dec48 Compare December 4, 2025 17:09
@zlatinski zlatinski changed the title from "Vulkan aq lib integration" to "WIP Vulkan aq lib integration" Dec 4, 2025
@zlatinski zlatinski force-pushed the vulkan-aq-lib-integration branch 3 times, most recently from aa07153 to 5a3a8c8 Compare January 1, 2026 01:03
@zlatinski zlatinski force-pushed the vulkan-aq-lib-integration branch from 5a3a8c8 to 391619b Compare January 7, 2026 00:35
Fix video profile IDC selection for 10-bit and higher bit-depth content.

This commit fixes automatic video profile selection based on bit depth and
chroma subsampling format for H.264, H.265/HEVC, and AV1 encoders.

Problem:
When encoding 10-bit or 12-bit content, the encoder was requesting incorrect
video profiles (e.g., HEVC Main instead of Main10 for 10-bit content), causing
vkGetPhysicalDeviceVideoCapabilitiesKHR() to fail with
VK_ERROR_VIDEO_PROFILE_FORMAT_NOT_SUPPORTED_KHR.

Solution:
1. Updated GetDefaultVideoProfileIdc() for all codecs to select correct profiles:

   H.264:
   - HIGH_444_PREDICTIVE for 10-bit or 4:4:4 chroma
   - HIGH for 8-bit with 4:2:0/4:2:2

   H.265/HEVC:
   - FORMAT_RANGE_EXTENSIONS for 12-bit or 4:4:4 chroma
   - MAIN_10 for 10-bit with 4:2:0/4:2:2
   - MAIN for 8-bit with 4:2:0/4:2:2

   AV1:
   - PROFESSIONAL for 12-bit or 4:2:2 chroma
   - HIGH for 4:4:4 chroma
   - MAIN for 8-bit or 10-bit with 4:2:0

2. Modified EncoderConfig::InitVideoProfile() to explicitly set videoProfileIdc
   from GetDefaultVideoProfileIdc() when default (-1), ensuring correct profile
   is set before building the VkVideoCoreProfile.

3. Added VkVideoCoreProfile::DumpProfile() function that prints all profile
   parameters (codec, chroma subsampling, bit depth, profile IDC) when a
   capability query fails, aiding debugging.

4. Fixed VkVideoEncoder::InitEncoder() to check InitDeviceCapabilities()
   return value - previously it was ignored, causing crashes on unsupported
   profiles.

5. Replaced assert() statements with proper error messages including
   __FILE__ and __LINE__ for easier debugging.

6. Removed duplicate return statements in error handling paths.
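
The profile selection rules from item 1 can be sketched as below. This is only an illustration under stated assumptions: the enums and the Select*Profile() helpers are hypothetical stand-ins, not the actual GetDefaultVideoProfileIdc() implementations, which work on the StdVideo*ProfileIdc values from the Vulkan Video std headers. Treating "10-bit" as "more than 8 bits" and "12-bit" as "more than 10 bits" is also an assumption.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the chroma format and per-codec profile enums.
enum class ChromaFormat { k420, k422, k444 };
enum class H264Profile { High, High444Predictive };
enum class H265Profile { Main, Main10, FormatRangeExtensions };
enum class Av1Profile { Main, High, Professional };

// H.264: HIGH_444_PREDICTIVE for 10-bit or 4:4:4, otherwise HIGH.
constexpr H264Profile SelectH264Profile(ChromaFormat chroma, uint32_t bitDepth) {
    return (bitDepth > 8 || chroma == ChromaFormat::k444)
               ? H264Profile::High444Predictive
               : H264Profile::High;
}

// H.265: range extensions for 12-bit or 4:4:4, MAIN_10 for 10-bit, else MAIN.
constexpr H265Profile SelectH265Profile(ChromaFormat chroma, uint32_t bitDepth) {
    if (bitDepth > 10 || chroma == ChromaFormat::k444) {
        return H265Profile::FormatRangeExtensions;
    }
    return (bitDepth > 8) ? H265Profile::Main10 : H265Profile::Main;
}

// AV1: PROFESSIONAL for 12-bit or 4:2:2, HIGH for 4:4:4, else MAIN.
// The 12-bit check comes first, so 12-bit 4:4:4 maps to PROFESSIONAL.
constexpr Av1Profile SelectAv1Profile(ChromaFormat chroma, uint32_t bitDepth) {
    if (bitDepth > 10 || chroma == ChromaFormat::k422) {
        return Av1Profile::Professional;
    }
    return (chroma == ChromaFormat::k444) ? Av1Profile::High : Av1Profile::Main;
}

// The failing case from the problem statement: 10-bit 4:2:0 HEVC must be Main10.
static_assert(SelectH265Profile(ChromaFormat::k420, 10) == H265Profile::Main10,
              "10-bit 4:2:0 HEVC selects Main10");
```

Keeping each rule in one small pure function makes the bit-depth and chroma combinations easy to check in isolation before the profile is handed to the capability query.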

Files modified:
- VkVideoCoreProfile.h: Added DumpProfile() method
- VulkanVideoCapabilities.h: Call DumpProfile() on failure, improved error msgs
- VkEncoderConfig.cpp: Set videoProfileIdc explicitly in InitVideoProfile()
- VkEncoderConfigH264.h: Profile selection for H.264
- VkEncoderConfigH265.h: Profile selection for HEVC
- VkEncoderConfigAV1.h: Profile selection for AV1
- VkEncoderConfig*.cpp: Improved error messages with file:line info
- VkVideoEncoder.cpp: Check InitDeviceCapabilities() return, remove asserts
- vulkan_video_encoder.cpp: Improved error messages, remove asserts

Signed-off-by: Tony Zlatinski <[email protected]>

Major Changes:
1. CMake Integration:
   - Add NV_AQ_GPU_LIB build option for optional aq-vulkan library integration
   - Automatically detect and configure all AQ include directories (aq_common, aq-vulkan)
   - Link AQ library to encoder shared/static libs and test executables
   - Set NV_AQ_GPU_LIB_SUPPORTED define when AQ is available

2. AQ Processing Integration:
   - Add spatial and temporal AQ configuration parameters to VkEncoderConfig
   - New CLI args: --spatialAQStrength, --temporalAQStrength, --aqDumpDir
   - Integrate AqProcessor into encoder pipeline with proper semaphore synchronization
   - Enable QP delta map mode automatically when AQ is enabled
   - Configure AQ based on codec type (H.264/H.265/AV1) and chroma format

AQ strength normalization range is [-1.0, 1.0]:
  -  0.0 = default/neutral midpoint (driver default behavior)
  -  1.0 = maximum AQ strength
  - -1.0 = minimum AQ strength
  - < -1.0 (e.g., -2.0) = AQ disabled
  - >  1.0 (e.g., 2.0) = AQ disabled

This provides a more intuitive interface where 0 means 'default' rather
than 'disabled', matching common expectations for normalized parameters.
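
A minimal sketch of the range check described above, assuming the enable decision is a plain closed-interval test. IsAqEnabled(), AqModes and SelectAqModes() are hypothetical names used for illustration only; they are not taken from the encoder sources.

```cpp
// Bounds of the normalized AQ strength range.
constexpr float kAqStrengthMin = -1.0f;
constexpr float kAqStrengthMax = 1.0f;

// AQ is enabled only when the strength lies inside [-1.0, 1.0].
// Values outside the range (e.g. -2.0 or 2.0) disable it; a NaN fails
// both comparisons and is therefore treated as disabled as well.
constexpr bool IsAqEnabled(float strength) {
    return strength >= kAqStrengthMin && strength <= kAqStrengthMax;
}

// Hypothetical summary of which AQ modes are active for a pair of strengths.
struct AqModes {
    bool spatial;
    bool temporal;
    constexpr bool Any() const { return spatial || temporal; }
};

constexpr AqModes SelectAqModes(float spatialStrength, float temporalStrength) {
    return {IsAqEnabled(spatialStrength), IsAqEnabled(temporalStrength)};
}

// 0.0 is the neutral midpoint and still counts as enabled.
static_assert(IsAqEnabled(0.0f), "0.0 means default strength, not disabled");
static_assert(!IsAqEnabled(-2.0f), "-2.0 is outside the range and disables AQ");
```

With this shape, a strength of 0.0 on one parameter and -2.0 on the other enables exactly one AQ mode.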

aq-vulkan: add AQ config parameters to enable AQ

--spatialAQStrength <float>  : Spatial AQ strength in range [-1.0, 1.0]
                               -2.0 = disabled (default)
                               If in range [-1.0, 1.0], spatial AQ is enabled
                               In combined mode, ratio determines mix (larger value = more influence)
--temporalAQStrength <float> : Temporal AQ strength in range [-1.0, 1.0]
                               -2.0 = disabled (default)
                               If in range [-1.0, 1.0], temporal AQ is enabled
                               In combined mode, ratio determines mix (larger value = more influence)

3. GOP Structure Refactoring:
   - Move GOP position calculation to EncodeFrameCommon (common for all codecs)
   - Eliminate duplicate GetPositionInGOP calls in codec-specific implementations
   - Simplify IDR detection by checking gopPosition.pictureType directly
   - Share GOP state management across H.264/H.265/AV1 encoders

4. Frame Processing Pipeline:
   - Update EncodeFrameCommon to call EncodeFrame first, then AQ processing
   - Add AQ slot allocation with proper flags (SPATIAL/TEMPORAL/SUBSAMPLING)
   - Configure reference requirements based on frame type (P-frame: prev ref, B-frame: both)
   - Wait on AQ semaphore in SubmitVideoCodingCmds instead of input semaphore
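
The reference requirements per frame type can be sketched as follows. All names here (PictureType, AqReferenceNeeds, GetAqReferenceNeeds()) are hypothetical. Reading "both" as "previous and next reference", and giving intra frames no references at all, are assumptions not spelled out in the commit message.

```cpp
// Hypothetical picture types; the encoder derives the real value from
// gopPosition.pictureType.
enum class PictureType { Idr, I, P, B };

// Which reference frames the AQ stage is assumed to need for a frame.
struct AqReferenceNeeds {
    bool previous;  // reference that precedes the frame
    bool next;      // reference that follows the frame
};

constexpr AqReferenceNeeds GetAqReferenceNeeds(PictureType type) {
    switch (type) {
        case PictureType::P:
            return {true, false};   // P-frame: previous reference only
        case PictureType::B:
            return {true, true};    // B-frame: both references
        default:
            return {false, false};  // IDR / I: assumed to need none
    }
}

static_assert(GetAqReferenceNeeds(PictureType::B).next,
              "B-frames also need the following reference");
```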

5. Cleanup & Fixes:
   - Fix filterPoolNode nullptr in VulkanVideoFrameBuffer deinitialization
   - Add aqProcessorSlot to VkVideoEncodeFrameInfo lifecycle management
   - Remove redundant isIdr local variable (use gopPosition.pictureType)
   - Add build script for remote compilation with AQ enabled

6. Add a dedicated image pool semaphore

7. Fix QP-map create flags for AQ

8. Switch dumpFilenameOrdering to INPUT_ORDER

9. Prevent m_outputImageAspects from being overwritten by Y-subsampling

When generating shader code for the Y-subsampled image (binding 9),
ShaderGenerateImagePlaneDescriptors was called with m_outputImageAspects
by reference. For R8/R16 formats, this function sets imageAspects to
VK_IMAGE_ASPECT_PLANE_0_BIT only, which wiped out the chroma plane bits.

This caused hasOutputChroma to evaluate to false, so chroma data was
never written to the output image, resulting in green-only output.

Fix: Use a separate local variable for the subsampledImage aspects
instead of modifying m_outputImageAspects.
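
The aliasing problem and its fix can be reproduced in isolation with the sketch below. GeneratePlaneDescriptors() and the Filter struct are hypothetical stand-ins for ShaderGenerateImagePlaneDescriptors and the filter class. The aspect constants mirror the values of VK_IMAGE_ASPECT_PLANE_0/1/2_BIT so that the example builds without the Vulkan headers.

```cpp
#include <cstdint>

using AspectFlags = uint32_t;
constexpr AspectFlags kPlane0 = 0x10;  // same value as VK_IMAGE_ASPECT_PLANE_0_BIT
constexpr AspectFlags kPlane1 = 0x20;  // same value as VK_IMAGE_ASPECT_PLANE_1_BIT
constexpr AspectFlags kPlane2 = 0x40;  // same value as VK_IMAGE_ASPECT_PLANE_2_BIT

// Stand-in for the descriptor generator: for single-plane (R8/R16-like)
// formats it narrows the aspects to plane 0 through the reference.
inline void GeneratePlaneDescriptors(bool singlePlaneFormat, AspectFlags& imageAspects) {
    if (singlePlaneFormat) {
        imageAspects = kPlane0;
    }
}

struct Filter {
    AspectFlags m_outputImageAspects = kPlane0 | kPlane1 | kPlane2;

    // Buggy variant: the member is passed by reference, so the chroma
    // plane bits are lost as a side effect of handling the subsampled image.
    void GenerateSubsampledBuggy() {
        GeneratePlaneDescriptors(true, m_outputImageAspects);
    }

    // Fixed variant: a local copy absorbs the modification.
    void GenerateSubsampledFixed() {
        AspectFlags subsampledImageAspects = m_outputImageAspects;
        GeneratePlaneDescriptors(true, subsampledImageAspects);
    }

    bool HasOutputChroma() const {
        return (m_outputImageAspects & (kPlane1 | kPlane2)) != 0;
    }
};
```

After GenerateSubsampledBuggy(), HasOutputChroma() returns false, while after GenerateSubsampledFixed() it still returns true.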

Implementation Details:
- AQ processor created only when spatial or temporal AQ is enabled, i.e. when spatialAQStrength or temporalAQStrength lies within [-1.0, 1.0]
- GOP parameters passed to AQ: frame count, consecutive B-frames, IDR period
- Semaphore chain: Input -> AQ -> Encode (using VK_PIPELINE_STAGE_2_TRANSFER_BIT_KHR)
- Debug support: CPU upload resources, buffer dumping, raw file output

Tested with: H.264, H.265, AV1 encoding with various GOP structures

AQ: Determine AQ modes from the input AQ flags
AQ: Set the downsampled shift parameters
common: UpdateImageDescriptorSets() build warning fix

Signed-off-by: Tony Zlatinski <[email protected]>

Provides build and execution instructions and scripts for AQ.

Signed-off-by: Tony Zlatinski <[email protected]>
@zlatinski zlatinski force-pushed the vulkan-aq-lib-integration branch from 391619b to 157c8b1 Compare January 7, 2026 19:42

2 participants