Implement support for SQL Server VectorSearch()#37536
Conversation
ab8971f to
6364719
Compare
...l/Query/RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.cs
Show resolved
Hide resolved
src/EFCore.Relational/Query/RelationalSqlTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This pull request implements support for SQL Server's VECTOR_SEARCH() function, which enables approximate, index-driven vector similarity search. The PR also addresses the need to allow table-valued functions to accept non-scalar parameters.
Changes:
- Implements VectorSearch() extension method on DbSet for SQL Server
- Relaxes TableValuedFunctionExpression to accept Expression instead of SqlExpression parameters
- Refactors JsonEachExpression and SqlServerOpenJsonExpression to use dedicated Json properties instead of indexing Arguments
Reviewed changes
Copilot reviewed 19 out of 20 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| test/EFCore.SqlServer.FunctionalTests/Query/Translations/VectorTranslationsSqlServerTest.cs | Adds functional tests for VectorSearch with placeholder Experimental attributes that need correction |
| src/Shared/EFDiagnostics.cs | Adds new experimental diagnostic constant EF9105 for SqlServerVectorSearch |
| src/EFCore/Query/Internal/NavigationExpandingExpressionVisitor.cs | Changes exception throwing to ProcessUnknownMethod call for better extensibility |
| src/EFCore.Sqlite.Core/Query/Internal/SqliteQueryableMethodTranslatingExpressionVisitor.cs | Updates to use new JsonEachExpression.Json property |
| src/EFCore.Sqlite.Core/Query/Internal/SqliteQuerySqlGenerator.cs | Updates to use new JsonEachExpression.Json property |
| src/EFCore.Sqlite.Core/Query/Internal/SqlExpressions/JsonEachExpression.cs | Refactors to use primary constructor and dedicated Json property |
| src/EFCore.SqlServer/Query/Internal/SqlServerTypeMappingPostprocessor.cs | Updates to use SqlServerOpenJsonExpression.Json property |
| src/EFCore.SqlServer/Query/Internal/SqlServerSqlTreePruner.cs | Updates to use SqlServerOpenJsonExpression.Json property |
| src/EFCore.SqlServer/Query/Internal/SqlServerQueryableMethodTranslatingExpressionVisitor.cs | Implements VectorSearch translation logic with naming issues in variable names |
| src/EFCore.SqlServer/Query/Internal/SqlServerQuerySqlGenerator.cs | Adds special SQL generation logic for VECTOR_SEARCH() function |
| src/EFCore.SqlServer/Query/Internal/SqlServerJsonPostprocessor.cs | Updates to use SqlServerOpenJsonExpression.Json property |
| src/EFCore.SqlServer/Query/Internal/SqlExpressions/SqlServerOpenJsonExpression.cs | Refactors to use dedicated Json property instead of Arguments[0] |
| src/EFCore.SqlServer/Properties/SqlServerStrings.resx | Adds error message for VectorSearchRequiresColumn |
| src/EFCore.SqlServer/Properties/SqlServerStrings.Designer.cs | Auto-generated designer file for new error message |
| src/EFCore.SqlServer/Extensions/SqlServerQueryableExtensions.cs | New file defining VectorSearch extension method and VectorSearchResult struct with minor documentation typo |
| src/EFCore.SqlServer/Extensions/SqlServerDbFunctionsExtensions.cs | Minor documentation formatting improvement |
| src/EFCore.SqlServer/EFCore.SqlServer.csproj | Adds NoWarn for EF9105 experimental diagnostic |
| src/EFCore.Relational/Query/SqlExpressions/TableValuedFunctionExpression.cs | Changes Arguments type from IReadOnlyList to IReadOnlyList |
| src/EFCore.Relational/Query/RelationalSqlTranslatingExpressionVisitor.cs | Adds optimization to reduce member accesses on NewExpression |
| src/EFCore.Relational/Query/RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.cs | Removes assertion that required properties to be nullable |
Files not reviewed (1)
- src/EFCore.SqlServer/Properties/SqlServerStrings.Designer.cs: Language not supported
src/EFCore.SqlServer/Extensions/SqlServerQueryableExtensions.cs
Outdated
Show resolved
Hide resolved
src/EFCore.SqlServer/Query/Internal/SqlServerQueryableMethodTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
src/EFCore.SqlServer/Query/Internal/SqlServerQueryableMethodTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
src/EFCore.SqlServer/Query/Internal/SqlServerQueryableMethodTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
src/EFCore.SqlServer/Query/Internal/SqlServerQueryableMethodTranslatingExpressionVisitor.cs
Show resolved
Hide resolved
test/EFCore.SqlServer.FunctionalTests/Query/Translations/VectorTranslationsSqlServerTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.SqlServer.FunctionalTests/Query/Translations/VectorTranslationsSqlServerTest.cs
Outdated
Show resolved
Hide resolved
src/EFCore.SqlServer/Query/Internal/SqlServerQueryableMethodTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
src/EFCore.SqlServer/Query/Internal/SqlServerQueryableMethodTranslatingExpressionVisitor.cs
Outdated
Show resolved
Hide resolved
6364719 to
233af64
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
233af64 to
0bdd842
Compare
0bdd842 to
be80ffc
Compare
src/EFCore.Relational/Query/SqlExpressions/TableValuedFunctionExpression.cs
Show resolved
Hide resolved
src/EFCore.SqlServer/Extensions/SqlServerQueryableExtensions.cs
Outdated
Show resolved
Hide resolved
| translatedMetric = _sqlExpressionFactory.ApplyTypeMapping( | ||
| translatedMetric, _typeMappingSource.FindMapping("varchar(max)")); |
There was a problem hiding this comment.
I wonder if there's a better way of accomplishing this
There was a problem hiding this comment.
You mean avoiding the N? Do you have a specific problem in mind with this, or just verbose/annoying?
There was a problem hiding this comment.
Just too verbose and it feels like there's à way to refactor it to make it more readable
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 23 changed files in this pull request and generated 7 comments.
Files not reviewed (2)
- src/EFCore.SqlServer/Properties/SqlServerStrings.Designer.cs: Language not supported
- src/EFCore/Properties/CoreStrings.Designer.cs: Language not supported
| ]); | ||
|
|
||
| // We have the VECTOR_SEARCH() function call. Modify the SelectExpression and shaper to use it and project | ||
| // the appropriate things. |
There was a problem hiding this comment.
The code uses EF1001 warning suppression for internal API usage (SetTables), which is intentional and acceptable in provider code. However, consider whether this internal API access could be avoided by using a more public mechanism, or ensure this dependency is documented as potentially fragile across EF Core versions.
| // the appropriate things. | |
| // the appropriate things. | |
| // Note: SetTables is an internal EF Core API. Its usage here is intentional provider code and may need to be | |
| // revisited if EF Core internals change in future versions. |
| var arguments = NewArrayInit( | ||
| typeof(Expression), | ||
| Arguments.Select(a => ((IRelationalQuotableExpression)a).Quote())); |
There was a problem hiding this comment.
The cast to IRelationalQuotableExpression assumes all Arguments implement this interface. While this is currently safe because the only non-SqlExpression arguments passed are TableExpression instances (which inherit from TableExpressionBase and implement IRelationalQuotableExpression), this could fail at runtime if future code passes other Expression types. Consider adding a defensive check or documentation clarifying this assumption.
| var arguments = NewArrayInit( | |
| typeof(Expression), | |
| Arguments.Select(a => ((IRelationalQuotableExpression)a).Quote())); | |
| Microsoft.EntityFrameworkCore.Utilities.Check.DebugAssert( | |
| Arguments.All(a => a is IRelationalQuotableExpression), | |
| "All TableValuedFunctionExpression arguments must implement IRelationalQuotableExpression."); | |
| var arguments = NewArrayInit( | |
| typeof(Expression), | |
| Arguments.Cast<IRelationalQuotableExpression>().Select(a => a.Quote())); |
| @@ -2918,9 +2918,6 @@ private Expression CreateGetValueExpression( | |||
| Type type, | |||
| IPropertyBase? property = null) | |||
| { | |||
There was a problem hiding this comment.
The removal of this assertion weakens safety checks for reading values from the database. The original assertion validated that when reading nullable values without a specified property, the type must be nullable. Without this check, there's a risk of attempting to read null values into non-nullable types, which could cause runtime errors. Consider either restoring this assertion or adding a comment explaining why this safety check is no longer necessary.
| { | |
| { | |
| Microsoft.EntityFrameworkCore.Utilities.Check.DebugAssert( | |
| !nullable | |
| || property is not null | |
| || !type.IsValueType | |
| || Nullable.GetUnderlyingType(type) is not null, | |
| "When reading a nullable value without property metadata, the CLR type must be nullable."); |
| select.ReplaceProjection(new Dictionary<ProjectionMember, Expression> | ||
| { | ||
| [valueProjectionMember] = entityProjection, | ||
| [distanceProjectionMember] = new ColumnExpression("Distance", vectorSearchFunction.Alias, typeof(double), _typeMappingSource.FindMapping(typeof(double)), nullable: false) |
There was a problem hiding this comment.
Consider verifying that the distance projection is correctly typed and nullable. While the code specifies nullable: false, vector search distance values could potentially be null in some edge cases (e.g., if the underlying index is unavailable or corrupted). Review SQL Server's VECTOR_SEARCH documentation to confirm the Distance column is always non-null.
| <value>Vector property '{propertyName}' is on '{structuralType}' which is mapped to JSON. Vector properties are not supported within JSON documents.</value> | ||
| </data> | ||
| <data name="VectorSearchRequiresColumn" xml:space="preserve"> | ||
| <value>VectorSearch() requires a valid vector column.</value> |
There was a problem hiding this comment.
The error message should be more specific about what constitutes a "valid vector column". Consider enhancing it to explain that VectorSearch requires a direct property reference to a vector column, not a computed expression or nested property access. This would provide better guidance to developers encountering this error.
| <value>VectorSearch() requires a valid vector column.</value> | |
| <value>VectorSearch() requires a direct reference to a vector property column, not a computed expression or nested property access.</value> |
| int topN) | ||
| where T : class | ||
| where TVector : unmanaged | ||
| { |
There was a problem hiding this comment.
There is no input validation for the topN parameter. According to the XML documentation (line 28), topN must be a positive integer, but there's no check to enforce this constraint. Consider adding validation to reject non-positive values before they're passed to SQL Server, which would provide clearer error messages to developers.
| { | |
| { | |
| if (topN <= 0) | |
| { | |
| throw new ArgumentOutOfRangeException(nameof(topN)); | |
| } |
| public static IQueryable<VectorSearchResult<T>> VectorSearch<T, TVector>( | ||
| this DbSet<T> source, | ||
| Expression<Func<T, TVector>> vectorPropertySelector, | ||
| TVector similarTo, | ||
| [NotParameterized] string metric, | ||
| int topN) | ||
| where T : class | ||
| where TVector : unmanaged | ||
| { |
There was a problem hiding this comment.
There is no validation for the metric parameter. While NotParameterized prevents parameterization, the method doesn't verify that the metric value is one of the valid options (e.g., 'cosine', 'euclidean', 'dot'). Invalid metric values will only fail at SQL Server execution time with potentially less clear error messages. Consider validating against known metric values or at least checking for null/empty strings.
Closes #36384
Closes #37535