Vectorize TensorPrimitives.BitIncrement/BitDecrement for Half using ushort operations #123754
Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
@copilot, no, this wasn't about Half.BitIncrement/Decrement. It was about vectorizing TensorPrimitives.BitIncrement/BitDecrement when the T is Half. Please start over.
…ment using ushort operations Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
You're right, I misunderstood. I've corrected this in commit 2ac7925. The changes now add vectorized Half support to TensorPrimitives.BitIncrement/BitDecrement by:
The Half.cs file has been reverted to its original state.
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
@EgorBot -x64 -arm

    using System;
    using System.Linq;
    using System.Numerics.Tensors;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

    public class Bench
    {
        private Half[] _source = default!;
        private Half[] _destination = default!;

        [Params(1000)]
        public int Count { get; set; }

        [GlobalSetup]
        public void Setup()
        {
            // Alternate positive and negative inputs so both sign branches are exercised.
            _source = Enumerable.Range(0, Count)
                .Select(i => (Half)(i % 2 == 0 ? i * 0.1f : -i * 0.1f))
                .ToArray();
            _destination = new Half[Count];
        }

        [Benchmark]
        public void BitIncrement()
        {
            TensorPrimitives.BitIncrement<Half>(_source, _destination);
        }

        [Benchmark]
        public void BitDecrement()
        {
            TensorPrimitives.BitDecrement<Half>(_source, _destination);
        }
    }
Pull request overview
This PR optimizes TensorPrimitives.BitIncrement and TensorPrimitives.BitDecrement for Half types by implementing vectorized operations that work directly on the ushort bit representation, avoiding costly float conversions.
Changes:
- Added vectorized Half support for BitIncrement and BitDecrement using direct ushort bit manipulation
- Implemented SIMD operations for Vector128, Vector256, and Vector512 that properly handle all IEEE 754 special cases
- Uses TryUnaryBitwiseInvokeHalfAsInt16 to process Half values as short/ushort throughout the computation
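To make the special cases concrete, here is a minimal scalar sketch of what "bit increment" and "bit decrement" mean on the raw 16-bit pattern of a Half. It is not the code from this PR: the class name `HalfBitsSketch` and its constants are hypothetical, and the rules are assumed to follow the documented behavior of `Half.BitIncrement`/`Half.BitDecrement` (NaN is returned unchanged, an infinity only moves toward the finite range, and both zeros step to the smallest subnormal of the appropriate sign).

```csharp
using System;

// Hypothetical helper for illustration only; not part of System.Numerics.Tensors.
public static class HalfBitsSketch
{
    private const ushort SignMask = 0x8000;         // also the bit pattern of -0
    private const ushort ExponentMask = 0x7C00;     // also the bit pattern of +Infinity
    private const ushort NegativeInfinity = 0xFC00;
    private const ushort MaxValue = 0x7BFF;         // largest finite Half
    private const ushort MinValue = 0xFBFF;         // most negative finite Half
    private const ushort Epsilon = 0x0001;          // smallest positive subnormal
    private const ushort NegativeEpsilon = 0x8001;  // smallest negative subnormal

    public static ushort BitIncrement(ushort bits)
    {
        // Exponent all ones: NaN or infinity. Only -Infinity moves (to MinValue).
        if ((bits & ExponentMask) == ExponentMask)
            return bits == NegativeInfinity ? MinValue : bits;

        // -0 steps to the smallest positive value, as +0 does via the + 1 path.
        if (bits == SignMask)
            return Epsilon;

        // Negative values move toward zero (magnitude shrinks); positive values move away.
        return (bits & SignMask) != 0 ? (ushort)(bits - 1) : (ushort)(bits + 1);
    }

    public static ushort BitDecrement(ushort bits)
    {
        // Exponent all ones: NaN or infinity. Only +Infinity moves (to MaxValue).
        if ((bits & ExponentMask) == ExponentMask)
            return bits == ExponentMask ? MaxValue : bits;

        // +0 steps to the smallest negative value, as -0 does via the + 1 path.
        if (bits == 0)
            return NegativeEpsilon;

        // Negative values move away from zero (magnitude grows); positive values move toward it.
        return (bits & SignMask) != 0 ? (ushort)(bits + 1) : (ushort)(bits - 1);
    }
}
```

One way to check a sketch like this is to loop over all 65,536 bit patterns and compare against `BitConverter.HalfToUInt16Bits(Half.BitIncrement(BitConverter.UInt16BitsToHalf(bits)))`, and likewise for `BitDecrement`.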
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| TensorPrimitives.BitIncrement.cs | Added Half-specific vectorized operator that manipulates ushort bits directly, with proper handling of NaN, infinities, and zero edge cases |
| TensorPrimitives.BitDecrement.cs | Added Half-specific vectorized operator that manipulates ushort bits directly, with proper handling of NaN, infinities, and zero edge cases |
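As a rough illustration of how such an operator can stay branch-free, the sketch below applies the BitIncrement rules to eight Half bit patterns at once using `Vector128<ushort>`. It is an assumption-laden sketch rather than the operator added in this PR: the class name `HalfVectorSketch` is hypothetical, it uses only the public `System.Runtime.Intrinsics` API (.NET 7 or later), and the PR's actual operator plugs into the internal `IUnaryOperator<short, short>` interface, which is not shown here.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical helper for illustration only; not the operator from the PR.
public static class HalfVectorSketch
{
    public static Vector128<ushort> BitIncrement(Vector128<ushort> bits)
    {
        Vector128<ushort> signMask = Vector128.Create((ushort)0x8000);     // also -0
        Vector128<ushort> exponentMask = Vector128.Create((ushort)0x7C00); // also +Infinity
        Vector128<ushort> negativeInfinity = Vector128.Create((ushort)0xFC00);
        Vector128<ushort> one = Vector128.Create((ushort)1);               // also Epsilon

        // Lanes with the sign bit set step toward zero (bits - 1); the rest step up (bits + 1).
        Vector128<ushort> isNegative = Vector128.Equals(bits & signMask, signMask);
        Vector128<ushort> result = Vector128.ConditionalSelect(isNegative, bits - one, bits + one);

        // -0 must become Epsilon instead of wrapping to 0x7FFF.
        Vector128<ushort> isNegativeZero = Vector128.Equals(bits, signMask);
        result = Vector128.ConditionalSelect(isNegativeZero, one, result);

        // NaN and +Infinity pass through unchanged. -Infinity is excluded here because
        // the (bits - 1) path already yields MinValue (0xFBFF) for it.
        Vector128<ushort> isNonFinite = Vector128.Equals(bits & exponentMask, exponentMask);
        Vector128<ushort> isNegativeInfinity = Vector128.Equals(bits, negativeInfinity);
        Vector128<ushort> keepInput = isNonFinite & ~isNegativeInfinity;
        return Vector128.ConditionalSelect(keepInput, bits, result);
    }
}
```

A caller could load the lanes with `Vector128.Create(e0, ..., e7)` and read them back with `GetElement(i)`. BitDecrement follows the same shape with the add and subtract swapped, +0 mapped to 0x8001, and +Infinity (rather than -Infinity) excluded from the pass-through mask.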
Description
Per feedback on PR #123610, adds vectorized Half support to TensorPrimitives.BitIncrement and TensorPrimitives.BitDecrement by processing Halfs as ushorts for the whole computation, avoiding the overhead of widening to float and narrowing back.
Changes
TensorPrimitives.BitIncrement.cs:
- TryUnaryBitwiseInvokeHalfAsInt16<T, HalfBitIncrementOperator>
- HalfBitIncrementOperator struct implementing IUnaryOperator<short, short>
TensorPrimitives.BitDecrement.cs:
- TryUnaryBitwiseInvokeHalfAsInt16<T, HalfBitDecrementOperator>
- HalfBitDecrementOperator struct implementing IUnaryOperator<short, short>
Key Optimization
This approach uses TryUnaryBitwiseInvokeHalfAsInt16, which operates directly on the short/ushort representation of Half values without any conversion to or from float. This avoids the overhead of widening to float and narrowing back.
Performance Benchmark
To trigger EgorBot benchmark comparison: