Skip to content

MDEV-35283: support nullable indexed vector columns#5212

Open
annbaek wants to merge 1 commit into
MariaDB:mainfrom
annbaek:MDEV-35283-nullable-vector-index
Open

MDEV-35283: support nullable indexed vector columns#5212
annbaek wants to merge 1 commit into
MariaDB:mainfrom
annbaek:MDEV-35283-nullable-vector-index

Conversation

@annbaek

@annbaek annbaek commented Jun 10, 2026

Copy link
Copy Markdown

Allow creating VECTOR INDEX on nullable columns. NULL values are simply not indexed - rows with NULL vectors won't be found via the vector index, but the CREATE TABLE/ALTER TABLE is permitted.

Changes:

  • Remove NOT_NULL_FLAG check for vector index creation in mysql_prepare_create_table_finalize()
  • Add NULL checks in hlindexes_on_insert/update/delete to skip vector index operations when the vector field is NULL
  • Handle UPDATE transitions: NULL->non-NULL (insert only), non-NULL->NULL (invalidate only), NULL->NULL (skip)
  • Remove obsolete DBUG_ASSERTs in mhnsw_insert() that assumed non-NULL was enforced by ER_INDEX_CANNOT_HAVE_NULL

@CLAassistant

CLAassistant commented Jun 10, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements support for nullable indexed vector columns (MDEV-35283) by removing the NOT NULL constraint check during table creation and updating the index insert, update, and delete hooks to handle null values. The review feedback recommends removing an unused table parameter from the helper function vec_field_is_null and restoring a safety assertion in sql/vector_mhnsw.cc that validates the vector data length and alignment.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread sql/sql_base.cc Outdated
Comment on lines +10208 to +10209
static bool vec_field_is_null(const TABLE *table, const uchar *rec,
const KEY *keyinfo)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The table parameter is unused in vec_field_is_null. It should be removed to simplify the function signature and avoid unused parameter warnings.

static bool vec_field_is_null(const uchar *rec,
                              const KEY *keyinfo)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second the AI here.

Comment thread sql/sql_base.cc Outdated
if (int err= mhnsw_insert(this, key_info + s->keys))
{
KEY *keyinfo= key_info + s->keys;
if (vec_field_is_null(this, record[0], keyinfo))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Update the call to vec_field_is_null to match the simplified signature without the unused table parameter.

    if (vec_field_is_null(record[0], keyinfo))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

Comment thread sql/sql_base.cc Outdated
Comment on lines +10236 to +10237
bool old_is_null= vec_field_is_null(this, record[1], keyinfo);
bool new_is_null= vec_field_is_null(this, record[0], keyinfo);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Update the calls to vec_field_is_null to match the simplified signature without the unused table parameter.

    bool old_is_null= vec_field_is_null(record[1], keyinfo);
    bool new_is_null= vec_field_is_null(record[0], keyinfo);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

Comment thread sql/sql_base.cc Outdated
if (int err= mhnsw_invalidate(this, buf, key_info + s->keys))
{
KEY *keyinfo= key_info + s->keys;
if (vec_field_is_null(this, buf, keyinfo))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Update the call to vec_field_is_null to match the simplified signature without the unused table parameter.

    if (vec_field_is_null(buf, keyinfo))

Comment thread sql/vector_mhnsw.cc Outdated
Comment on lines 1422 to 1423
DBUG_ASSERT(res);
DBUG_ASSERT(table->file->ref_length <= graph->field[FIELD_TREF]->field_length);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The assertion DBUG_ASSERT(res->length() > 0 && res->length() % 4 == 0); was a valuable safety check to ensure that the vector data is not empty and is properly aligned (since each float dimension is 4 bytes). Since res is guaranteed to be non-NULL by the preceding DBUG_ASSERT(res); (as NULL vectors are skipped before calling mhnsw_insert), this assertion is safe to keep and should not be removed.

  DBUG_ASSERT(res);
  DBUG_ASSERT(res->length() > 0 && res->length() % 4 == 0);
  DBUG_ASSERT(table->file->ref_length <= graph->field[FIELD_TREF]->field_length);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. No need to remove the assert.

@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Jun 10, 2026
@gkodinov gkodinov self-assigned this Jun 10, 2026
@annbaek annbaek force-pushed the MDEV-35283-nullable-vector-index branch from 8eb4d79 to ec611a0 Compare June 10, 2026 07:47

@gkodinov gkodinov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution! This is a preliminary review.

The regression tests are failing because of your added tests.
Please fix. There are also some optional stylistic suggestions below.

Comment thread mysql-test/main/vector.test Outdated
--echo # MDEV-35283 - support nullable indexed vector columns
--echo #

if ($MTR_COMBINATION_INNODB) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you will also need to update vector,*.diff: right now this test is failing when run with Aria and MyISAM.

However, I do not see anything InnoDB specific here in this fix. Why limit testing to InnoDB only to start with?

Maybe just remove the if() altogether?

Comment thread sql/sql_base.cc Outdated
Comment on lines +10208 to +10209
static bool vec_field_is_null(const TABLE *table, const uchar *rec,
const KEY *keyinfo)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second the AI here.

Comment thread sql/sql_base.cc Outdated
if (int err= mhnsw_insert(this, key_info + s->keys))
{
KEY *keyinfo= key_info + s->keys;
if (vec_field_is_null(this, record[0], keyinfo))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

Comment thread sql/sql_base.cc Outdated
Comment on lines +10236 to +10237
bool old_is_null= vec_field_is_null(this, record[1], keyinfo);
bool new_is_null= vec_field_is_null(this, record[0], keyinfo);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

Comment thread sql/vector_mhnsw.cc Outdated
Comment on lines 1422 to 1423
DBUG_ASSERT(res);
DBUG_ASSERT(table->file->ref_length <= graph->field[FIELD_TREF]->field_length);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. No need to remove the assert.

Comment thread sql/sql_base.cc Outdated
return 0;
}

static bool vec_field_is_null(const TABLE *table, const uchar *rec,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd even do inline

Comment thread sql/sql_base.cc Outdated
const KEY *keyinfo)
{
Field *field= keyinfo->key_part->field;
return field->is_null_in_record(rec);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put an assert that keyinfo->keypart is a vector column type.

Allow creating VECTOR INDEX on nullable columns. NULL values are
simply not indexed - rows with NULL vectors won't be found via
the vector index, but the CREATE TABLE/ALTER TABLE is permitted.

Changes:
- Remove NOT_NULL_FLAG check for vector index creation in
  mysql_prepare_create_table_finalize()
- Add inline vec_field_is_null() helper with vector key type assert
- Add NULL checks in hlindexes_on_insert/update/delete to skip
  vector index operations when the vector field is NULL
- Handle UPDATE transitions: NULL->non-NULL (insert only),
  non-NULL->NULL (invalidate only), NULL->NULL (skip)
- Clean up DBUG_ASSERT comment in mhnsw_insert() that referenced
  the now-removed ER_INDEX_CANNOT_HAVE_NULL constraint
@annbaek annbaek force-pushed the MDEV-35283-nullable-vector-index branch from ec611a0 to 12f09b1 Compare June 10, 2026 08:12

@gkodinov gkodinov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One additional optimization idea below. Please stand by for the final review.

Comment thread sql/sql_base.cc
KEY *keyinfo= key_info + s->keys;
if (vec_field_is_null(record[0], keyinfo))
return 0;
if (int err= mhnsw_insert(this, keyinfo))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final optional suggestion: it looks like you could just fold the null-ness check (and return success) into mhnsw_insert() and mhnsw_invalidate(). You will need to react on NULLs anyway everyplace these are called. This will further localize the check and ensure there's no duplicated code :

if (is_null) skip

mhnsw_xxx()

@gkodinov gkodinov assigned vuvova and unassigned gkodinov Jun 10, 2026
@gkodinov gkodinov requested a review from vuvova June 10, 2026 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

4 participants