Undefined behavior in flatnessDetThresh calculation for DP DSC (nvkms-evo3.c / nvkms-evo4.c) #1029

@triple-groove

NVIDIA Open GPU Kernel Modules Version

580.119.02

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • I confirm that this does not happen with the proprietary driver package.

Operating System and Version

Fedora Linux 43 (KDE Plasma Desktop Edition)

Kernel Release

Linux pink5090.lan 6.18.9-200.fc43.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Feb 6 21:43:09 UTC 2026 x86_64 GNU/Linux

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Hardware: GPU

NVIDIA GeForce RTX 5090 (UUID: GPU-24e6551a-cb0f-04f9-a141-1fad0b8886e0)

Describe the bug

In src/nvidia-modeset/src/nvkms-evo3.c (and the equivalent in nvkms-evo4.c), the flatness detection threshold is calculated as:

// XXX: I'm pretty sure that this is wrong.
// BitsPerPixelx16 is something like (24 * 16) = 384, and 2 << (384 - 8) is
// an insanely large number.
flatnessDetThresh = (2 << (pDscInfo->dp.bitsPerPixelX16 - 8)); /* ??? */

The code uses bitsPerPixelX16 (BPP × 16, e.g. 128 for 8.0 bpp) where it should use bpc (bits per component, e.g. 8). For 8.0 bpp this evaluates 2 << 120, which is undefined behavior in C: the left operand is an int, and the shift count is far larger than the width of that type. Whatever value results is then programmed into the DSC encoder hardware register.
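To make the undefined-behavior claim concrete, here is a minimal sketch that checks whether 2 << shift is defined for an int left operand under the C rules for signed left shifts. The helper name two_shl_is_defined is hypothetical and is not part of the driver; it also assumes int has no padding bits.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper for illustration only (not driver code).
 * Returns true when `2 << shift` is defined for an `int` left operand.
 * 2 << s equals 2^(s+1), which must fit in INT_MAX = 2^(width-1) - 1,
 * so the largest defined shift count is width - 3.
 */
static bool two_shl_is_defined(int shift)
{
    const int width = (int)(sizeof(int) * CHAR_BIT);

    /* The C standard guarantees at least 16 bits for int. */
    assert(width >= 16);

    return shift >= 0 && shift <= width - 3;
}
```

With a 32-bit int, a shift count of 0 (bpc=8) passes this check, while 120 (bitsPerPixelX16=128 minus 8) does not.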

NVIDIA's own engineers flagged this with XXX and ??? comments but it was never fixed.

Per the VESA DSC 1.1 specification, the correct formula is:

flatness_det_thresh = 2 << (bpc - 8)

For bpc=8: 2 << 0 = 2.
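As a worked example of the formula, the sketch below evaluates it for a few bpc values. The function name dsc_flatness_det_thresh is hypothetical; the fallback to 2 for bpc below 8 mirrors the guard in the patch proposed in this report, and the upper-bound assertion is an extra safety check added here as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only (not driver code).
 * Computes flatness_det_thresh = 2 << (bpc - 8).
 */
static uint32_t dsc_flatness_det_thresh(uint32_t bpc)
{
    /* Same fallback as the proposed patch: below 8 bpc, use 2. */
    if (bpc < 8) {
        return 2;
    }

    /* Keep the shift count well inside the 32-bit operand width. */
    assert(bpc - 8 < 31);

    return 2u << (bpc - 8);
}
```

Under these assumptions, bpc=8 gives 2, bpc=10 gives 8, and bpc=12 gives 32.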

Proposed Fix

Extract bits_per_component from PPS byte 3, bits [7:4], and use it instead of bitsPerPixelX16. PPS byte 3 is stored in bits [31:24] of pDscInfo->dp.pps[0], so the field occupies bits [31:28] of that dword. The DP path does not receive a pixelDepth parameter (unlike the HDMI path), so bpc is read from the PPS, which the PPS generator has already computed correctly.
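The extraction can be sketched in two steps, first isolating PPS byte 3 and then its upper nibble. The helper name pps_dw0_bits_per_component is hypothetical, and the byte layout (byte 3 in bits [31:24] of the first dword) is taken from the patch comment in this report rather than verified against other driver code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only (not driver code).
 * Assumes PPS byte 3 sits in bits [31:24] of PPS dword 0.
 */
static uint32_t pps_dw0_bits_per_component(uint32_t pps_dw0)
{
    uint32_t pps_byte3 = (pps_dw0 >> 24) & 0xFFu; /* PPS byte 3 */
    uint32_t bpc = (pps_byte3 >> 4) & 0xFu;       /* bits [7:4] of byte 3 */

    /* Equivalent to the one-step shift used in the proposed patch. */
    assert(bpc == ((pps_dw0 >> 28) & 0xFu));

    return bpc;
}
```

For the pps[0] value 0x89000011 reported in the Verification section, byte 3 is 0x89 and the extracted bpc is 8.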

nvkms-evo4.c (EvoSetDpDscParamsC9)

     nvAssert(pDscInfo->type == NV_DSC_INFO_EVO_TYPE_DP);

-    // XXX: I'm pretty sure that this is wrong.
-    // BitsPerPixelx16 is something like (24 * 16) = 384, and 2 << (384 - 8) is
-    // an insanely large number.
-    flatnessDetThresh = (2 << (pDscInfo->dp.bitsPerPixelX16 - 8)); /* ??? */
+    // DSC 1.2a spec: flatness_det_thresh = 2 << (bits_per_component - 8).
+    // Extract bits_per_component from PPS byte 3 bits [7:4].
+    // PPS DW[0] packs bytes [3][2][1][0] little-endian (byte 0 in the low bits): [31:24][23:16][15:8][7:0].
+    {
+        NvU32 bpc = (pDscInfo->dp.pps[0] >> 28) & 0xF;
+        flatnessDetThresh = (bpc >= 8) ? (2 << (bpc - 8)) : 2;
+    }

nvkms-evo3.c (EvoSetDpDscParams)

     nvAssert(pDscInfo->type == NV_DSC_INFO_EVO_TYPE_DP);

-    // XXX: I'm pretty sure that this is wrong.
-    // BitsPerPixelx16 is something like (24 * 16) = 384, and 2 << (384 - 8) is
-    // an insanely large number.
-    flatnessDetThresh = (2 << (pDscInfo->dp.bitsPerPixelX16 - 8)); /* ??? */
+    // DSC 1.2a spec: flatness_det_thresh = 2 << (bits_per_component - 8).
+    // Extract bits_per_component from PPS byte 3 bits [7:4].
+    // PPS DW[0] packs bytes [3][2][1][0] little-endian (byte 0 in the low bits): [31:24][23:16][15:8][7:0].
+    {
+        NvU32 bpc = (pDscInfo->dp.pps[0] >> 28) & 0xF;
+        flatnessDetThresh = (bpc >= 8) ? (2 << (bpc - 8)) : 2;
+    }

Verification

Tested on RTX 5090 (Blackwell, nvEvoCA HAL) with a DSC sink at 8bpc / 8.0bpp. After the fix, dmesg confirms the correct value:

nvidia-modeset: GPU:0: DSC EVO4 DP: head=2 bpc=8 bppX16=129 flatnessDetThresh=2 pps[0]=0x89000011
  • bpc=8 — correctly extracted from PPS byte 3
  • flatnessDetThresh=2 — correct per spec: 2 << (8 - 8) = 2
  • Before the fix: flatnessDetThresh=0 (the result of the undefined 2 << 121, using bppX16=129 as logged above, truncated to the 10-bit register field)

This fix applies to both nvkms-evo3.c (line ~7822) and nvkms-evo4.c (equivalent location).

Impact

Affects all DP DSC displays. The flatness detection threshold controls how the DSC encoder detects flat (uniform) image regions for optimized encoding. An incorrect value may cause suboptimal bit allocation, potentially contributing to compression artifacts.

Files Affected

  • src/nvidia-modeset/src/nvkms-evo3.c
  • src/nvidia-modeset/src/nvkms-evo4.c

Environment

  • GPU: NVIDIA RTX 5090 (Blackwell)
  • Driver: open-gpu-kernel-modules 580.119.02
  • OS: Fedora 43, Linux 6.18.9
  • Sink: Bigscreen Beyond VR headset (DSC 1.1, 8bpc, 8.0bpp, RGB 4:4:4)

To Reproduce

1. Connect any DisplayPort display that uses DSC with 8 bits per component (bpc=8).
2. The DSC encoder is programmed with an incorrect flatnessDetThresh value due to undefined behavior in C.

No specific user action triggers this — it occurs on every DP DSC modeset.

Bug Incidence

Always

nvidia-bug-report.log.gz

Summary of Skipped Sections:

Skipped Component | Details
ibstat output | ibstat not found
acpidump output | acpidump not found
mst output | mst not found
nvlsm-bug-report.sh output | nvlsm-bug-report.sh not found

Summary of Errors:

Error Component | Details | Resolution

More Info

No response
