MDEV-35283: support nullable indexed vector columns#5212
Conversation
There was a problem hiding this comment.
Code Review
This pull request implements support for nullable indexed vector columns (MDEV-35283) by removing the NOT NULL constraint check during table creation and updating the index insert, update, and delete hooks to handle null values. The review feedback recommends removing an unused table parameter from the helper function vec_field_is_null and restoring a safety assertion in sql/vector_mhnsw.cc that validates the vector data length and alignment.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| static bool vec_field_is_null(const TABLE *table, const uchar *rec, | ||
| const KEY *keyinfo) |
| if (int err= mhnsw_insert(this, key_info + s->keys)) | ||
| { | ||
| KEY *keyinfo= key_info + s->keys; | ||
| if (vec_field_is_null(this, record[0], keyinfo)) |
| bool old_is_null= vec_field_is_null(this, record[1], keyinfo); | ||
| bool new_is_null= vec_field_is_null(this, record[0], keyinfo); |
| if (int err= mhnsw_invalidate(this, buf, key_info + s->keys)) | ||
| { | ||
| KEY *keyinfo= key_info + s->keys; | ||
| if (vec_field_is_null(this, buf, keyinfo)) |
| DBUG_ASSERT(res); | ||
| DBUG_ASSERT(table->file->ref_length <= graph->field[FIELD_TREF]->field_length); |
There was a problem hiding this comment.
The assertion DBUG_ASSERT(res->length() > 0 && res->length() % 4 == 0); was a valuable safety check to ensure that the vector data is not empty and is properly aligned (since each float dimension is 4 bytes). Since res is guaranteed to be non-NULL by the preceding DBUG_ASSERT(res); (as NULL vectors are skipped before calling mhnsw_insert), this assertion is safe to keep and should not be removed.
DBUG_ASSERT(res);
DBUG_ASSERT(res->length() > 0 && res->length() % 4 == 0);
DBUG_ASSERT(table->file->ref_length <= graph->field[FIELD_TREF]->field_length);There was a problem hiding this comment.
+1. No need to remove the assert.
8eb4d79 to
ec611a0
Compare
gkodinov
left a comment
There was a problem hiding this comment.
Thank you for your contribution! This is a preliminary review.
The regression tests are failing because of your added tests.
Please fix. There are also some optional stylistic suggestions below.
| --echo # MDEV-35283 - support nullable indexed vector columns | ||
| --echo # | ||
|
|
||
| if ($MTR_COMBINATION_INNODB) { |
There was a problem hiding this comment.
you will also need to update vector,*.diff: right now this test is failing when run with Aria and MyISAM.
However, I do not see anything InnoDB specific here in this fix. Why limit testing to InnoDB only to start with?
Maybe just remove the if() altogether?
| static bool vec_field_is_null(const TABLE *table, const uchar *rec, | ||
| const KEY *keyinfo) |
| if (int err= mhnsw_insert(this, key_info + s->keys)) | ||
| { | ||
| KEY *keyinfo= key_info + s->keys; | ||
| if (vec_field_is_null(this, record[0], keyinfo)) |
| bool old_is_null= vec_field_is_null(this, record[1], keyinfo); | ||
| bool new_is_null= vec_field_is_null(this, record[0], keyinfo); |
| DBUG_ASSERT(res); | ||
| DBUG_ASSERT(table->file->ref_length <= graph->field[FIELD_TREF]->field_length); |
There was a problem hiding this comment.
+1. No need to remove the assert.
| return 0; | ||
| } | ||
|
|
||
| static bool vec_field_is_null(const TABLE *table, const uchar *rec, |
| const KEY *keyinfo) | ||
| { | ||
| Field *field= keyinfo->key_part->field; | ||
| return field->is_null_in_record(rec); |
There was a problem hiding this comment.
I'd put an assert that keyinfo->keypart is a vector column type.
Allow creating VECTOR INDEX on nullable columns. NULL values are simply not indexed - rows with NULL vectors won't be found via the vector index, but the CREATE TABLE/ALTER TABLE is permitted. Changes: - Remove NOT_NULL_FLAG check for vector index creation in mysql_prepare_create_table_finalize() - Add inline vec_field_is_null() helper with vector key type assert - Add NULL checks in hlindexes_on_insert/update/delete to skip vector index operations when the vector field is NULL - Handle UPDATE transitions: NULL->non-NULL (insert only), non-NULL->NULL (invalidate only), NULL->NULL (skip) - Clean up DBUG_ASSERT comment in mhnsw_insert() that referenced the now-removed ER_INDEX_CANNOT_HAVE_NULL constraint
ec611a0 to
12f09b1
Compare
gkodinov
left a comment
There was a problem hiding this comment.
LGTM. One additional optimization idea below. Please stand by for the final review.
| KEY *keyinfo= key_info + s->keys; | ||
| if (vec_field_is_null(record[0], keyinfo)) | ||
| return 0; | ||
| if (int err= mhnsw_insert(this, keyinfo)) |
There was a problem hiding this comment.
Final optional suggestion: it looks like you could just fold the null-ness check (and return success) into mhnsw_insert() and mhnsw_invalidate(). You will need to react on NULLs anyway everyplace these are called. This will further localize the check and ensure there's no duplicated code :
if (is_null) skip
mhnsw_xxx()
Allow creating VECTOR INDEX on nullable columns. NULL values are simply not indexed - rows with NULL vectors won't be found via the vector index, but the CREATE TABLE/ALTER TABLE is permitted.
Changes: