Description
IndexWriter will happily allow applications to index documents containing KnnByteVectorField (and presumably KnnFloatVectorField) instances containing "invalid" values.
These invalid vectors do not trigger an Exception from either IndexWriter.addDocument() or IndexWriter.commit(). They only cause problems down the road, during index merges or when running CheckIndex.
A trivial test case can be found in lucene.invalid-vector-indexing-with-out-failure.test.patch, which uses COSINE similarity and indexes new byte[] {0,0,0,...,0}. I'm not sure whether similar problems will happen with other vector+similarity combinations, or with non-normalized vectors when using DOT_PRODUCT.
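One plausible reason an all-zero vector is a problem under COSINE: cosine similarity divides by the product of the two vector norms, so a zero-magnitude vector yields 0/0. The following plain-Java sketch uses the textbook formula, not Lucene's implementation, and the class and method names are made up for illustration. It shows the resulting NaN and how it defeats ordering comparisons.

```java
final class ZeroVectorCosine {

  /** Textbook cosine similarity over signed bytes (illustrative only, not Lucene code). */
  static float cosine(byte[] a, byte[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
    }
    int dot = 0;
    int normA = 0;
    int normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    // For an all-zero vector this is 0 / sqrt(0.0), i.e. 0.0 / 0.0, which is NaN.
    return (float) (dot / Math.sqrt((double) normA * (double) normB));
  }

  public static void main(String[] args) {
    byte[] zero = new byte[4]; // {0, 0, 0, 0}
    byte[] other = {1, 2, 3, 4};

    float score = cosine(zero, other);
    System.out.println(score);         // prints NaN
    System.out.println(score < 1.0f);  // prints false
    System.out.println(score >= 1.0f); // prints false
  }
}
```

Because every ordered comparison against NaN is false, any code that relies on scores being sorted can no longer establish an order, which is consistent with the "Comparing NaN to [1.0]" message in the first failure.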
AFAIK this test should fail on any system regardless of seed.
The nature of the failure can be changed by modifying tests.asserts, which influences whether:
The problem triggers an assertion in the KnnVectorsWriter.merge call stack.
> java.lang.AssertionError: Nodes are added in the incorrect order! Comparing NaN to [1.0]
> at __randomizedtesting.SeedInfo.seed([75D48A8DF27FDF07:8BF93348CADA4842]:0)
> at org.apache.lucene.util.hnsw.NeighborArray.addInOrder(NeighborArray.java:80)
> at org.apache.lucene.util.hnsw.HnswGraphBuilder.popToScratch(HnswGraphBuilder.java:461)
> at org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNodeInternal(HnswGraphBuilder.java:286)
> at org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNode(HnswGraphBuilder.java:325)
> at org.apache.lucene.util.hnsw.MergingHnswGraphBuilder.updateGraph(MergingHnswGraphBuilder.java:153)
> at org.apache.lucene.util.hnsw.MergingHnswGraphBuilder.build(MergingHnswGraphBuilder.java:128)
> at org.apache.lucene.util.hnsw.IncrementalHnswGraphMerger.merge(IncrementalHnswGraphMerger.java:214)
> at org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsWriter.mergeOneField(Lucene99HnswVectorsWriter.java:444)
> at org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsWriter.mergeOneField(PerFieldKnnVectorsFormat.java:128)
> at org.apache.lucene.codecs.KnnVectorsWriter.merge(KnnVectorsWriter.java:105)
> at org.apache.lucene.index.SegmentMerger.mergeVectorValues(SegmentMerger.java:272)
> at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:315)
> at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:159)
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5276)
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4739)
> at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6538)
> at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:38)
> at org.apache.lucene.index.IndexWriter.executeMerge(IndexWriter.java:2333)
> at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2328)
> at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:2163)
> at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:2111)
> at org.apache.lucene.util.hnsw.TestZeroVectorHnswGraphIndexing.testIndexingAndMerging(TestZeroVectorHnswGraphIndexing.java:53)
...
Reproduce with: gradlew :lucene:core:test --tests "org.apache.lucene.util.hnsw.TestZeroVectorHnswGraphIndexing.testIndexingAndMerging" -Ptests.asserts=true -Ptests.file.encoding=ISO-8859-1 -Ptests.gui=false "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1" -Ptests.jvms=5 -Ptests.seed=75D48A8DF27FDF07 -Ptests.vectorsize=512
OR ... The problem sneaks through all indexing and merging, and only causes an issue when MockDirectoryWrapper.close() invokes CheckIndex.
> org.apache.lucene.index.CheckIndex$CheckIndexException: Field "bytes" failed to search k nearest neighbors
> at __randomizedtesting.SeedInfo.seed([9F1658AC2C860676:613BE16914239133]:0)
> at app//org.apache.lucene.index.CheckIndex.checkByteVectorValues(CheckIndex.java:3162)
> at app//org.apache.lucene.index.CheckIndex.testVectors(CheckIndex.java:2855)
> at app//org.apache.lucene.index.CheckIndex.testSegment(CheckIndex.java:1123)
> at app//org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:823)
> at app//org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:593)
> at app//org.apache.lucene.tests.util.TestUtil.checkIndex(TestUtil.java:333)
> at app//org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:917)
> at app//org.apache.lucene.util.hnsw.TestZeroVectorHnswGraphIndexing.testIndexingAndMerging(TestZeroVectorHnswGraphIndexing.java:56)
...
Reproduce with: gradlew :lucene:core:test --tests "org.apache.lucene.util.hnsw.TestZeroVectorHnswGraphIndexing.testIndexingAndMerging" -Ptests.asserts=false -Ptests.file.encoding=US-ASCII -Ptests.gui=false "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1" -Ptests.jvms=5 -Ptests.seed=9F1658AC2C860676 -Ptests.vectorsize=128
Version and environment details
This problem affects main and branch_10x, going back at least as far as 10.3.2, where it was discovered through a randomized Solr test that could inadvertently generate an "all zero" vector (SOLR-17736).
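An application that cannot rule out such vectors could check them itself before calling IndexWriter.addDocument(). The sketch below is a hypothetical application-side guard for the COSINE case described above. VectorGuards and its methods are invented names, not part of Lucene, and the float variant is speculative since only the byte case was reproduced here.

```java
final class VectorGuards {

  /** Returns the vector unchanged, or throws if every component is zero. */
  static byte[] requireNonZero(byte[] vector) {
    for (byte b : vector) {
      if (b != 0) {
        return vector;
      }
    }
    throw new IllegalArgumentException(
        "all-zero vector has no direction; cosine similarity is undefined");
  }

  /** Float counterpart: also rejects NaN and infinite components. */
  static float[] requireFiniteNonZero(float[] vector) {
    boolean sawNonZero = false;
    for (float f : vector) {
      if (!Float.isFinite(f)) {
        throw new IllegalArgumentException("non-finite component: " + f);
      }
      if (f != 0f) { // -0.0f compares equal to 0f, so it counts as zero
        sawNonZero = true;
      }
    }
    if (!sawNonZero) {
      throw new IllegalArgumentException(
          "all-zero vector has no direction; cosine similarity is undefined");
    }
    return vector;
  }
}
```

With a guard like this, the vector would be wrapped at the point the field is built, for example new KnnByteVectorField("bytes", VectorGuards.requireNonZero(v), VectorSimilarityFunction.COSINE), so the failure surfaces at indexing time rather than during a later merge or CheckIndex run.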