@roji
Member

@roji roji commented Jan 22, 2026

  • Adds IsAutoGenerated to key property definition and attribute.
  • When auto-generation is on, saving a record where the key property is the default (empty GUID, zero integer) triggers the auto-generation. It's still possible to set the key property to a non-default value, overriding the auto-generation.
  • Auto-generation can happen in the database (mainly on the relational databases), or on the client side (Guid.NewGuid()).
  • IsAutoGenerated is null by default, meaning that the provider decides whether to turn on autogeneration by default, based on the property type. This means that GUID properties are always auto-generated by default (if the key property is left unassigned), and int/long properties are auto-generated on relational providers. This aligns with the behavior in Entity Framework.
  • All providers support GUID generation, by doing it on the client before sending the record (Guid.NewGuid()). Some providers have special behavior: for example, PostgreSQL generates UUIDv7 and SQL Server generates sequential IDs, both in the database (much better for indexing). Universal GUID auto-generation makes things much simpler for upper layers, specifically Microsoft.Extensions.DataIngestion.
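
As a rough illustration of the rules in the list above, here is a minimal, self-contained sketch. It does not use the MEVD types: `KeyAutoGeneration`, `Resolve`, `EnsureKey` and the `isRelationalProvider` parameter are hypothetical names made up for this example, and the real providers may structure this differently.

```csharp
using System;

// Hypothetical helper, not part of the MEVD API: it only mirrors the rules described above.
public static class KeyAutoGeneration
{
    // Resolves the nullable IsAutoGenerated setting.
    // An explicit value wins; null means "let the provider decide by key type".
    public static bool Resolve(bool? configured, Type keyType, bool isRelationalProvider)
    {
        if (configured.HasValue)
        {
            return configured.Value;
        }

        // GUID keys are auto-generated by default on every provider.
        if (keyType == typeof(Guid))
        {
            return true;
        }

        // int/long keys are auto-generated by default on relational providers only.
        return isRelationalProvider && (keyType == typeof(int) || keyType == typeof(long));
    }

    // Client-side generation: only a default (empty) key triggers it,
    // so a caller-assigned key overrides auto-generation.
    public static Guid EnsureKey(Guid key, bool isAutoGenerated)
        => isAutoGenerated && key == Guid.Empty ? Guid.NewGuid() : key;
}

public static class KeyAutoGenerationDemo
{
    public static void Main()
    {
        var explicitKey = Guid.NewGuid();

        // An unassigned key gets a fresh value; an assigned key is kept.
        Console.WriteLine(KeyAutoGeneration.EnsureKey(Guid.Empty, isAutoGenerated: true) != Guid.Empty); // True
        Console.WriteLine(KeyAutoGeneration.EnsureKey(explicitKey, isAutoGenerated: true) == explicitKey); // True
    }
}
```

The sketch covers only the client-side Guid path; database-side generation (identity columns, UUIDv7, sequential IDs) is left to the provider.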

Closes #11485

@moonbox3 moonbox3 added the .NET Issue or Pull requests regarding .NET code label Jan 22, 2026
@github-actions github-actions bot changed the title from "[MEVD] Implement key auto-generation" to ".Net: [MEVD] Implement key auto-generation" Jan 22, 2026
@roji roji requested a review from Copilot January 22, 2026 17:06
Contributor

Copilot AI left a comment

Pull request overview

Implements provider-agnostic key auto-generation support across the MEVD vector-store abstraction and all first-party providers, and adds conformance and unit tests to validate the behavior.

Changes:

  • Introduces IsAutoGenerated metadata on key definitions (VectorStoreKeyProperty, VectorStoreKeyAttribute, KeyPropertyModel) and centralizes validation/auto-generation policy in CollectionModelBuilder.
  • Implements provider-specific key auto-generation behavior (client-side or server-side) for Sqlite, SQL Server, PostgreSQL, Cosmos (NoSQL/Mongo), MongoDB, Redis, Qdrant, Pinecone, Weaviate, Azure AI Search, and InMemory providers.
  • Extends conformance and provider unit tests to exercise key auto-generation and non-auto-generation scenarios (including identity/UUID generation and round-tripping generated keys back into records).

Reviewed changes

Copilot reviewed 49 out of 49 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| dotnet/test/VectorData/VectorData.UnitTests/CollectionModelBuilderTests.cs | Updates the custom test CollectionModelBuilder to override the new ValidateKeyProperty hook instead of the removed IsKeyPropertyTypeValid. |
| dotnet/test/VectorData/VectorData.ConformanceTests/TypeTests/KeyTypeTests.cs | Refactors generic key-type conformance tests to support an explicit supportsAutoGeneration flag, new overloads for struct/class keys, and exercises both non-auto and auto-generation behaviors including dynamic collections. |
| dotnet/test/VectorData/SqliteVec.UnitTests/SqliteCommandBuilderTests.cs | Adjusts insert-command tests for new BuildInsertCommand signature and adds separate coverage for inserts with and without auto-generated integer keys (including RETURNING behavior and parameter binding). |
| dotnet/test/VectorData/SqliteVec.ConformanceTests/TypeTests/SqliteKeyTypeTests.cs | Marks int/long keys as supporting auto-generation in Sqlite key conformance tests. |
| dotnet/test/VectorData/SqlServer.ConformanceTests/TypeTests/SqlServerKeyTypeTests.cs | Marks int/long keys as supporting auto-generation in SQL Server key conformance tests. |
| dotnet/test/VectorData/SqlServer.ConformanceTests/SqlServerCommandBuilderTests.cs | Updates create-table expectations to allow IDENTITY on key columns and replaces MergeIntoSingle/MergeIntoMany tests with an Upsert test that validates batched per-record MERGE/OUTPUT SQL and parameters. |
| dotnet/test/VectorData/Redis.ConformanceTests/TypeTests/RedisJsonKeyTypeTests.cs | Updates Redis JSON key conformance fixture to accept the new CreateCollection/CreateDynamicCollection signatures and propagate withAutoGeneration into the record definition. |
| dotnet/test/VectorData/Redis.ConformanceTests/TypeTests/RedisHashSetKeyTypeTests.cs | Same as above for Redis hash-set-backed collections. |
| dotnet/test/VectorData/Qdrant.ConformanceTests/TypeTests/QdrantKeyTypeTests.cs | Adjusts Qdrant key tests to use the new struct overload (with default-key sentinel) while not enabling auto-generation for ulong. |
| dotnet/test/VectorData/PgVector.UnitTests/PostgresSqlBuilderTests.cs | Adapts create-table tests to pass PostgreSQL version and updates upsert tests to the new NpgsqlBatch-based, generic BuildUpsertCommand<TKey> API. |
| dotnet/test/VectorData/PgVector.ConformanceTests/TypeTests/PostgresKeyTypeTests.cs | Marks int/long keys as supporting auto-generation in PostgreSQL key conformance tests. |
| dotnet/test/VectorData/PgVector.ConformanceTests/Support/PostgresTestStore.cs | Switches the test container image to pgvector/pgvector:pg18 to enable features like uuidv7() used for UUID key auto-generation. |
| dotnet/test/VectorData/MongoDB.ConformanceTests/TypeTests/MongoKeyTypeTests.cs | Marks ObjectId keys as supporting auto-generation and adjusts int/long tests to use the new struct-based overload without auto-generation. |
| dotnet/test/VectorData/InMemory.ConformanceTests/TypeTests/InMemoryKeyTypeTests.cs | Overrides the base Test method to thread supportsAutoGeneration into collection creation and teardown for the InMemory provider. |
| dotnet/test/VectorData/CosmosMongoDB.ConformanceTests/TypeTests/CosmosMongoKeyTypeTests.cs | Same pattern as Mongo tests for Cosmos DB Mongo-compatible provider (auto-generated ObjectId, struct overload for int/long). |
| dotnet/src/VectorData/Weaviate/WeaviateModelBuilder.cs | Replaces the old key-type validation method with ValidateKeyProperty, enforcing Guid-only keys with a more specific error message. |
| dotnet/src/VectorData/Weaviate/WeaviateCollection.cs | Adds client-side Guid key generation for auto-generated keys in Weaviate before mapping to storage JSON. |
| dotnet/src/VectorData/VectorData.Abstractions/RecordDefinition/VectorStoreKeyProperty.cs | Adds nullable IsAutoGenerated flag to key property definitions used in VectorStoreCollectionDefinition. |
| dotnet/src/VectorData/VectorData.Abstractions/RecordAttributes/VectorStoreKeyAttribute.cs | Adds nullable IsAutoGenerated flag to the key attribute for attribute-based mapping. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/KeyPropertyModel.cs | Adds non-nullable IsAutoGenerated to the runtime key model and updates ToString to reflect auto-generated keys. |
| dotnet/src/VectorData/VectorData.Abstractions/ProviderServices/CollectionModelBuilder.cs | Centralizes key auto-generation configuration via SupportsKeyAutoGeneration, propagates IsAutoGenerated from attributes/definitions, and replaces IsKeyPropertyTypeValid with ValidateKeyProperty while preserving data/vector validation logic. |
| dotnet/src/VectorData/SqliteVec/SqliteModelBuilder.cs | Overrides SupportsKeyAutoGeneration to allow auto-generation for Guid/int/long keys and implements ValidateKeyProperty for Sqlite key types. |
| dotnet/src/VectorData/SqliteVec/SqliteCommandBuilder.cs | Changes insert-command building to be aware of key auto-generation, omitting key columns when using rowid-based identity, generating GUIDs client-side where needed, and conditionally appending RETURNING only for DB-generated keys. |
| dotnet/src/VectorData/SqliteVec/SqliteCollection.cs | Refactors single/multi upsert into DoUpsertAsync, wires in auto-generation for Sqlite keys (including reading back integer identity values) and keeps embedding-generation behavior intact for both data and vector tables. |
| dotnet/src/VectorData/SqlServer/SqlServerModelBuilder.cs | Overrides SupportsKeyAutoGeneration for Guid/int/long and updates key-type validation to use ValidateKeyProperty. |
| dotnet/src/VectorData/SqlServer/SqlServerCommandBuilder.cs | Enhances table creation to add IDENTITY/DEFAULT NEWSEQUENTIALID for auto-generated keys, replaces MergeIntoSingle/MergeIntoMany with a unified Upsert<TKey> that handles identity-insert and per-record MERGE/OUTPUT semantics, and adds a helper to skip key columns when auto-generating. |
| dotnet/src/VectorData/SqlServer/SqlServerCollection.cs | Adapts collection upsert logic to use the new Upsert<TKey> builder, and injects generated SQL Server keys back into records for both single-record and batched upserts when auto-generation is enabled. |
| dotnet/src/VectorData/Redis/RedisModelBuilder.cs | Switches Redis model key validation to the new ValidateKeyProperty hook for string/Guid keys. |
| dotnet/src/VectorData/Redis/RedisJsonModelBuilder.cs | Same as above for Redis JSON provider. |
| dotnet/src/VectorData/Redis/RedisJsonDynamicModelBuilder.cs | Same as above for dynamic Redis JSON collections. |
| dotnet/src/VectorData/Redis/RedisJsonCollection.cs | Adds client-side Guid generation for auto-generated keys in JSON-backed Redis collections (single and batch upserts) before mapping to JSON and writing to Redis. |
| dotnet/src/VectorData/Redis/RedisHashSetCollection.cs | Adds client-side Guid generation for auto-generated keys in hash-set-backed Redis collections before mapping to storage. |
| dotnet/src/VectorData/Qdrant/QdrantModelBuilder.cs | Moves Qdrant key-type validation to ValidateKeyProperty, enforcing Guid/ulong-only keys. |
| dotnet/src/VectorData/Qdrant/QdrantCollection.cs | Adds client-side Guid generation for auto-generated Guid keys when mapping records to Qdrant PointStructs. |
| dotnet/src/VectorData/Pinecone/PineconeModelBuilder.cs | Switches Pinecone key-type validation to ValidateKeyProperty for string/Guid keys. |
| dotnet/src/VectorData/Pinecone/PineconeCollection.cs | Adds client-side Guid key generation for auto-generated Guid keys before mapping to Pinecone vectors (single and batch upserts). |
| dotnet/src/VectorData/PgVector/PostgresSqlBuilder.cs | Extends table creation to be version-aware and configure identity/defaults for auto-generated int/long/UUID keys, and reworks upsert building into a generic BuildUpsertCommand<TKey> that emits a batch of single-row INSERT/ON CONFLICT or INSERT…RETURNING statements to support key auto-generation. |
| dotnet/src/VectorData/PgVector/PostgresModelBuilder.cs | Overrides SupportsKeyAutoGeneration for Guid/int/long and migrates key-type validation to ValidateKeyProperty. |
| dotnet/src/VectorData/PgVector/PostgresCollection.cs | Unifies single/batch upsert paths, uses the new batched upsert builder, and injects auto-generated PostgreSQL keys (identity or UUID default) back into records using NpgsqlBatch result sets. |
| dotnet/src/VectorData/MongoDB/MongoCollection.cs | Adds client-side key generation logic for auto-generated Guid and ObjectId keys prior to mapping and issuing ReplaceOne upserts. |
| dotnet/src/VectorData/InMemory/InMemoryModelBuilder.cs | Restricts auto-generation support to Guid keys in the InMemory provider via ValidateKeyProperty, while still allowing all .NET types as keys otherwise. |
| dotnet/src/VectorData/InMemory/InMemoryCollection.cs | Adds in-memory key auto-generation for Guid keys and ensures generated keys are written back into records before storage. |
| dotnet/src/VectorData/CosmosNoSql/CosmosNoSqlModelBuilder.cs | Moves Cosmos NoSQL key-type validation to ValidateKeyProperty for string/Guid keys. |
| dotnet/src/VectorData/CosmosNoSql/CosmosNoSqlCollection.cs | Adds client-side Guid generation for auto-generated keys before mapping to JSON and upserting into Cosmos DB NoSQL. |
| dotnet/src/VectorData/CosmosMongoDB/CosmosMongoCollection.cs | Adds client-side key generation for auto-generated Guid and ObjectId keys before mapping and upserting into Cosmos DB Mongo-compatible collections. |
| dotnet/src/VectorData/AzureAISearch/AzureAISearchModelBuilder.cs | Switches Azure AI Search model builder to ValidateKeyProperty for string/Guid keys and removes the old IsKeyPropertyTypeValid helper. |
| dotnet/src/VectorData/AzureAISearch/AzureAISearchDynamicModelBuilder.cs | Same as above for dynamic Azure AI Search collections. |
| dotnet/src/VectorData/AzureAISearch/AzureAISearchCollection.cs | Adds client-side Guid auto-generation for keys when configured, prior to mapping and indexing documents. |
| dotnet/src/InternalUtilities/connectors/Memory/MongoDB/MongoModelBuilder.cs | Extends the internal Mongo model builder with SupportsKeyAutoGeneration for Guid/ObjectId keys and updates key validation to the new ValidateKeyProperty pattern. |

@roji roji marked this pull request as ready for review January 22, 2026 22:19
@roji roji requested a review from a team as a code owner January 22, 2026 22:19
return command;
}

internal static SqlCommand MergeIntoSingle(
Member Author

@roji roji Jan 23, 2026

Note this significant, SQL Server-specific change. Previously, for upserting multiple records we used a single MERGE; with that strategy it's quite difficult/convoluted to get generated values back in a deterministic order (and we need one in order to inject the generated keys back into the correct .NET instances).

Instead, we now send a batch of MERGE statements, one per record. This probably has some performance cost, but it is unlikely to be significant in the grand scheme of things.
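
To make the ordering argument concrete, here is a minimal sketch of the per-record approach. It is not the actual SqlServerCommandBuilder code: the `[docs]` table, its `[Id]`/`[Text]` columns, the `@id_N`/`@text_N` parameter names, and the `MergeBatchSketch` and `Doc` types are all hypothetical, and it only covers the auto-generated path, where the key column is omitted from the INSERT.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical record type with an integer key; 0 means "unassigned".
public sealed class Doc
{
    public int Id { get; set; }
    public string Text { get; set; } = "";
}

public static class MergeBatchSketch
{
    // Emits one MERGE per record. Statement i yields result set i,
    // so the key read from result set i belongs to records[i].
    public static string BuildBatch(int recordCount)
    {
        if (recordCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(recordCount));
        }

        var sql = new StringBuilder();
        for (var i = 0; i < recordCount; i++)
        {
            sql.AppendLine("MERGE INTO [docs] AS t");
            sql.AppendLine($"USING (VALUES (@id_{i}, @text_{i})) AS s ([Id], [Text])");
            sql.AppendLine("ON t.[Id] = s.[Id]");
            sql.AppendLine("WHEN MATCHED THEN UPDATE SET t.[Text] = s.[Text]");
            // The key column is left out so that the database generates it.
            sql.AppendLine("WHEN NOT MATCHED THEN INSERT ([Text]) VALUES (s.[Text])");
            sql.AppendLine("OUTPUT inserted.[Id];");
        }

        return sql.ToString();
    }

    // Writes the keys back by position, which is only safe because
    // the result sets arrive in statement order.
    public static void InjectKeys(IReadOnlyList<Doc> records, IReadOnlyList<int> keysInStatementOrder)
    {
        if (records.Count != keysInStatementOrder.Count)
        {
            throw new ArgumentException("Expected exactly one key per record.");
        }

        for (var i = 0; i < records.Count; i++)
        {
            records[i].Id = keysInStatementOrder[i];
        }
    }
}
```

With a single multi-row MERGE there is one result set whose row order is not guaranteed to match the input order, so this kind of positional write-back would not be reliable.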

Development

Successfully merging this pull request may close these issues.

.Net: [MEVD] Implement store key generation where supported

3 participants