Vectorize TensorPrimitives.BitIncrement/BitDecrement for Half using ushort operations #123754

Copilot · 2026-01-29T14:52:34Z

Description

Per feedback on PR #123610, adds vectorized Half support to TensorPrimitives.BitIncrement and TensorPrimitives.BitDecrement by processing Halfs as ushorts for the whole computation, avoiding the overhead of widening to float and narrowing back.

Changes

TensorPrimitives.BitIncrement.cs:

- Added a check for the Half type at method entry using TryUnaryBitwiseInvokeHalfAsInt16<T, HalfBitIncrementOperator>
- Created a HalfBitIncrementOperator struct implementing IUnaryOperator<short, short> with:
  - Direct ushort bit manipulation (no float conversion)
  - SIMD implementations for Vector128/256/512
  - Proper handling of special cases: NaN (both positive and negative preserved), +Infinity (preserved), -Infinity → MinValue, -0 → Epsilon
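
For illustration, the special-case rules listed for BitIncrement can be sketched on the raw 16-bit pattern of a single value. This is a scalar sketch under the assumption of the standard IEEE 754 binary16 layout, not the PR's HalfBitIncrementOperator; the class name `HalfBitIncrementSketch` and its method names are hypothetical.

```cs
using System;

// Hypothetical scalar sketch; not the operator added by this PR.
public static class HalfBitIncrementSketch
{
    private const ushort SignMask = 0x8000;             // sign bit
    private const ushort PositiveInfinityBits = 0x7C00; // exponent all ones, mantissa zero
    private const ushort NegativeZeroBits = 0x8000;
    private const ushort EpsilonBits = 0x0001;          // smallest positive subnormal

    public static ushort BitIncrementBits(ushort bits)
    {
        // NaN (either sign): magnitude bits are greater than those of infinity. Preserve as-is.
        if ((bits & 0x7FFF) > PositiveInfinityBits) return bits;

        // +Infinity is preserved.
        if (bits == PositiveInfinityBits) return bits;

        // -0 steps to the smallest positive value.
        if (bits == NegativeZeroBits) return EpsilonBits;

        // Negative values move toward zero (so -Infinity becomes MinValue);
        // non-negative values move away from zero (so MaxValue becomes +Infinity).
        return (bits & SignMask) != 0 ? (ushort)(bits - 1) : (ushort)(bits + 1);
    }

    // Convenience wrapper that reinterprets the bits without any float conversion.
    public static Half BitIncrement(Half value) =>
        BitConverter.UInt16BitsToHalf(BitIncrementBits(BitConverter.HalfToUInt16Bits(value)));
}
```

Under these rules, `BitIncrementBits(0xFC00)` (-Infinity) yields `0xFBFF` (MinValue), and NaN inputs come back unchanged.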

TensorPrimitives.BitDecrement.cs:

- Added a check for the Half type at method entry using TryUnaryBitwiseInvokeHalfAsInt16<T, HalfBitDecrementOperator>
- Created a HalfBitDecrementOperator struct implementing IUnaryOperator<short, short> with:
  - Direct ushort bit manipulation (no float conversion)
  - SIMD implementations for Vector128/256/512
  - Proper handling of special cases: NaN (both positive and negative preserved), -Infinity (preserved), +Infinity → MaxValue, +0 → -Epsilon
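
The mirrored rules for BitDecrement can be sketched the same way. Again this is a scalar illustration assuming the IEEE 754 binary16 layout, not the PR's HalfBitDecrementOperator; `HalfBitDecrementSketch` and its method names are hypothetical.

```cs
using System;

// Hypothetical scalar sketch; not the operator added by this PR.
public static class HalfBitDecrementSketch
{
    private const ushort SignMask = 0x8000;             // sign bit
    private const ushort PositiveInfinityBits = 0x7C00; // exponent all ones, mantissa zero
    private const ushort NegativeInfinityBits = 0xFC00;
    private const ushort PositiveZeroBits = 0x0000;
    private const ushort NegativeEpsilonBits = 0x8001;  // smallest-magnitude negative subnormal

    public static ushort BitDecrementBits(ushort bits)
    {
        // NaN (either sign) is preserved.
        if ((bits & 0x7FFF) > PositiveInfinityBits) return bits;

        // -Infinity is preserved.
        if (bits == NegativeInfinityBits) return bits;

        // +0 steps to the smallest-magnitude negative value.
        if (bits == PositiveZeroBits) return NegativeEpsilonBits;

        // Negative values move away from zero (so MinValue becomes -Infinity);
        // positive values move toward zero (so +Infinity becomes MaxValue).
        return (bits & SignMask) != 0 ? (ushort)(bits + 1) : (ushort)(bits - 1);
    }

    // Convenience wrapper that reinterprets the bits without any float conversion.
    public static Half BitDecrement(Half value) =>
        BitConverter.UInt16BitsToHalf(BitDecrementBits(BitConverter.HalfToUInt16Bits(value)));
}
```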

Key Optimization

This approach uses TryUnaryBitwiseInvokeHalfAsInt16, which operates directly on the short/ushort representation of Half values without any conversion to or from float. This avoids the overhead of:

- Widening Half to float
- Performing the operation on float
- Narrowing float back to Half
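
As a rough illustration of processing Half data as 16-bit integers end to end, the sketch below reinterprets the spans with `MemoryMarshal.Cast` and applies the increment rules per element. It is not how TryUnaryBitwiseInvokeHalfAsInt16 is implemented (that helper's internals are not shown in this PR description); `HalfSpanSketch` is a hypothetical name and the loop is deliberately scalar.

```cs
using System;
using System.Runtime.InteropServices;

// Hypothetical sketch; the real helper is internal to TensorPrimitives.
public static class HalfSpanSketch
{
    public static void BitIncrement(ReadOnlySpan<Half> source, Span<Half> destination)
    {
        if (destination.Length < source.Length)
        {
            throw new ArgumentException("Destination is too short.", nameof(destination));
        }

        // Reinterpret the Half elements as their raw 16-bit patterns; no data is copied or converted.
        ReadOnlySpan<ushort> src = MemoryMarshal.Cast<Half, ushort>(source);
        Span<ushort> dst = MemoryMarshal.Cast<Half, ushort>(destination);

        for (int i = 0; i < src.Length; i++)
        {
            ushort bits = src[i];

            if ((bits & 0x7FFF) > 0x7C00 || bits == 0x7C00)
            {
                dst[i] = bits;                 // NaN and +Infinity are preserved
            }
            else if (bits == 0x8000)
            {
                dst[i] = 0x0001;               // -0 becomes Epsilon
            }
            else
            {
                // Negative values step toward zero, non-negative values away from it.
                dst[i] = (bits & 0x8000) != 0 ? (ushort)(bits - 1) : (ushort)(bits + 1);
            }
        }
    }
}
```

No element is ever widened to float here; each value is read, adjusted, and written as a ushort.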

Performance Benchmark

To trigger EgorBot benchmark comparison:

@EgorBot -x64 -arm

```cs
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Numerics.Tensors;

BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

public class Bench
{
    private Half[] _source = default!;
    private Half[] _destination = default!;

    [Params(1000)]
    public int Count { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        _source = Enumerable.Range(0, Count)
            .Select(i => (Half)(i % 2 == 0 ? i * 0.1f : -i * 0.1f))
            .ToArray();
        _destination = new Half[Count];
    }

    [Benchmark]
    public void BitIncrement()
    {
        TensorPrimitives.BitIncrement<Half>(_source, _destination);
    }

    [Benchmark]
    public void BitDecrement()
    {
        TensorPrimitives.BitDecrement<Half>(_source, _destination);
    }
}
```

Original prompt

See the comment at #123610 (comment). Implement such an optimization for Half in BitIncrement and BitDecrement, where the Halfs are processed as ushorts for the whole computation rather than trying to convert to floats. Then use the performance benchmarking skill to evaluate the benefits of the change.


dotnet-policy-service · 2026-01-29T14:53:47Z

Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
See info in area-owners.md if you want to be subscribed.


stephentoub · 2026-01-29T15:27:07Z

@copilot, no, this wasn't about Half.BitIncrement/Decrement. It was about vectorizing TensorPrimitives.BitIncrement/BitDecrement when the T is Half. Please start over.


Copilot · 2026-01-29T15:48:24Z

> @copilot, no, this wasn't about Half.BitIncrement/Decrement. It was about vectorizing TensorPrimitives.BitIncrement/BitDecrement when the T is Half. Please start over.

You're right, I misunderstood. I've corrected this in commit 2ac7925. The changes now add vectorized Half support to TensorPrimitives.BitIncrement/BitDecrement:

- Added a check for the Half type at the method entry point using TryUnaryBitwiseInvokeHalfAsInt16
- Created HalfBitIncrementOperator and HalfBitDecrementOperator, which implement IUnaryOperator<short, short> with direct ushort bit manipulation (no float conversion)
- Included SIMD implementations for Vector128/256/512 with proper special-case handling

The Half.cs file has been reverted to its original state.


stephentoub · 2026-01-29T17:00:05Z

@EgorBot -x64 -arm

```cs
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Numerics.Tensors;

BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

public class Bench
{
    private Half[] _source = default!;
    private Half[] _destination = default!;

    [Params(1000)]
    public int Count { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        _source = Enumerable.Range(0, Count)
            .Select(i => (Half)(i % 2 == 0 ? i * 0.1f : -i * 0.1f))
            .ToArray();
        _destination = new Half[Count];
    }

    [Benchmark]
    public void BitIncrement()
    {
        TensorPrimitives.BitIncrement<Half>(_source, _destination);
    }

    [Benchmark]
    public void BitDecrement()
    {
        TensorPrimitives.BitDecrement<Half>(_source, _destination);
    }
}
```

Copilot

Pull request overview

This PR optimizes TensorPrimitives.BitIncrement and TensorPrimitives.BitDecrement for Half types by implementing vectorized operations that work directly on the ushort bit representation, avoiding costly float conversions.

Changes:

- Added vectorized Half support for BitIncrement and BitDecrement using direct ushort bit manipulation
- Implemented SIMD operations for Vector128, Vector256, and Vector512 that properly handle all IEEE 754 special cases
- Used TryUnaryBitwiseInvokeHalfAsInt16 to process Half values as short/ushort throughout the computation
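
To make the SIMD idea concrete, here is one way the special cases could be handled branch-free on a `Vector128<ushort>` using comparison masks and `ConditionalSelect`. This is a sketch under stated assumptions, not the code in this PR: `HalfVectorSketch` is a hypothetical name, it works on `ushort` lanes rather than the `short` lanes that IUnaryOperator<short, short> implies, and it covers only the 128-bit width.

```cs
using System;
using System.Runtime.Intrinsics;

// Hypothetical sketch of a branch-free BitIncrement over eight Half bit patterns.
public static class HalfVectorSketch
{
    public static Vector128<ushort> BitIncrement(Vector128<ushort> bits)
    {
        Vector128<ushort> one = Vector128.Create((ushort)1);
        Vector128<ushort> signMask = Vector128.Create((ushort)0x8000);
        Vector128<ushort> positiveInfinity = Vector128.Create((ushort)0x7C00);

        // Lanes whose magnitude exceeds infinity are NaN (either sign).
        Vector128<ushort> magnitude = bits & Vector128.Create((ushort)0x7FFF);
        Vector128<ushort> isNaN = Vector128.GreaterThan(magnitude, positiveInfinity);

        Vector128<ushort> isPositiveInfinity = Vector128.Equals(bits, positiveInfinity);
        Vector128<ushort> isNegativeZero = Vector128.Equals(bits, signMask);
        Vector128<ushort> isNegative = Vector128.Equals(bits & signMask, signMask);

        // General case: negative lanes step toward zero, the rest step away from zero.
        Vector128<ushort> result = Vector128.ConditionalSelect(isNegative, bits - one, bits + one);

        // -0 becomes Epsilon (bit pattern 0x0001).
        result = Vector128.ConditionalSelect(isNegativeZero, one, result);

        // NaN and +Infinity lanes keep their original bits.
        return Vector128.ConditionalSelect(isNaN | isPositiveInfinity, bits, result);
    }
}
```

The -Infinity → MinValue case needs no mask of its own, because subtracting one from `0xFC00` already yields `0xFBFF`.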

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| TensorPrimitives.BitIncrement.cs | Added Half-specific vectorized operator that manipulates ushort bits directly, with proper handling of NaN, infinities, and zero edge cases |
| TensorPrimitives.BitDecrement.cs | Added Half-specific vectorized operator that manipulates ushort bits directly, with proper handling of NaN, infinities, and zero edge cases |

Initial plan

f19c6a3

Copilot AI assigned Copilot and stephentoub Jan 29, 2026

stephentoub added the area-System.Numerics.Tensors label Jan 29, 2026

Copilot started work on behalf of stephentoub January 29, 2026 14:53

Optimize Half.BitIncrement and BitDecrement to use ushort operations

00bc621

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Optimize Half processing in BitIncrement and BitDecrement~~ Optimize Half.BitIncrement and BitDecrement to use ushort operations Jan 29, 2026

Copilot AI requested a review from stephentoub January 29, 2026 15:21

Copilot finished work on behalf of stephentoub January 29, 2026 15:21

Copilot started work on behalf of stephentoub January 29, 2026 15:27

Add vectorized Half support to TensorPrimitives.BitIncrement/BitDecre…

2ac7925

…ment using ushort operations Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>

Fix NaN handling in HalfBitIncrement/DecrementOperator

a2d9b8e

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>

Copilot AI changed the title ~~Optimize Half.BitIncrement and BitDecrement to use ushort operations~~ Vectorize TensorPrimitives.BitIncrement/BitDecrement for Half using ushort operations Jan 29, 2026

Copilot finished work on behalf of stephentoub January 29, 2026 15:55

EgorBot mentioned this pull request Jan 29, 2026

Benchmarks for #123754 (stephentoub) EgorBot/runtime-utils#604

Open

stephentoub marked this pull request as ready for review January 29, 2026 18:57

Copilot AI review requested due to automatic review settings January 29, 2026 18:57

Copilot started reviewing on behalf of stephentoub January 29, 2026 18:58

Copilot AI reviewed Jan 29, 2026
