Skip to content

Library is not thread-safe: shared connection, racy singleton, and non-atomic auto-increment #9

@jkalias

Description

@jkalias

Summary

sqlite-reflection cannot be used safely from more than one thread. There is no synchronization anywhere in the library (grep -r "mutex\|lock\|atomic" over include/ and src/ returns nothing outside the vendored date.h), yet the entire public API funnels through a single process-wide singleton that wraps one shared sqlite3* connection. The documented usage pattern — Database::Initialize(...) once, then Database::Instance() everywhere — actively encourages sharing that connection across threads, which is exactly the scenario that breaks.

For a local-persistence library this matters a lot in practice: GUI apps move DB work to a background thread to keep the UI responsive, servers handle requests concurrently, and batch jobs fan out across a thread pool. Today all of those are undefined behavior.

Below are three concrete, independent defects, each tied to the current code. (A fourth, an id read-modify-write race, was present in earlier releases and has since been fixed — see the note at the end.)


1. The singleton itself is racy (instance_ is an unguarded raw pointer)

src/database.cc:31 declares Database* Database::instance_ = nullptr; and the lifecycle functions read/write it with no lock (Initialize at src/database.cc:37, Finalize at :46, Instance at :67):

void Database::Initialize(const std::string& path) {
    if (instance_ != nullptr) {                       // check ...
        throw std::invalid_argument("Database has already been initialized");
    }
    const auto effective_path = !path.empty() ? path : ":memory:";
    instance_ = new Database(effective_path.data());  // ... then act (not atomic)
}

const Database& Database::Instance() {
    return *instance_;                                // no null check, no fence
}

Two threads calling Initialize concurrently can both pass the nullptr check and both new Database(...), leaking a connection. Worse, a thread calling Finalize() (which does delete instance_ at src/database.cc:49) while another is mid-Fetch/Save via a reference obtained from Instance() triggers a use-after-free on both the Database object and the underlying sqlite3*.

// Thread A                          // Thread B
const auto& db = Database::Instance();
                                     Database::Finalize();   // deletes the object
db.FetchAll();                       // use-after-free

2. One shared sqlite3* is used concurrently, and every write opens a transaction on it

All operations share db_ (include/database.h:238). Every mutating call routes through ExecutionQuery::Execute() (src/queries.cc:95), which unconditionally brackets the work in BEGIN/COMMIT on that shared handle (BEGIN at src/queries.cc:98, COMMIT at :123):

void ExecutionQuery::Execute() const {
    ...
    if (sqlite3_exec(db_, "BEGIN TRANSACTION;", nullptr, nullptr, nullptr)) {
        throw std::domain_error("Fatal error in transaction start");
    }
    ...
    if (sqlite3_exec(db_, "COMMIT;", nullptr, nullptr, nullptr)) {
        throw std::domain_error("Fatal error in transaction commit");
    }
}

SQLite's BEGIN is not nestable on a single connection. Two threads that both reach Execute() interleave like this:

// Thread A                                  // Thread B
db.Save(person_a);                           db.Save(person_b);
//  -> BEGIN TRANSACTION;  (ok)
                                             //  -> BEGIN TRANSACTION;
                                             //     "cannot start a transaction
                                             //      within a transaction"
                                             //  -> throws "Fatal error in
                                             //     transaction start"

Because the transaction is process-global on the shared connection, Thread B's BEGIN fails even though it is writing an unrelated row, and the failure is reported as a fatal, non-recoverable error. Even setting that aside, the library calls sqlite3_open() (src/database.cc:55) without requesting serialized threading mode and then hands the same handle to every thread — concurrent sqlite3_step/sqlite3_finalize on that one connection is not safe under the default configuration.


3. Error paths close the shared connection out from under other threads

Two read paths respond to a prepare failure by calling sqlite3_close(db_) on the shared handle:

// src/queries.cc:326 (FetchRecordsQuery::GetResults)
if (sqlite3_prepare_v2(db_, sql.data(), -1, &stmt_, nullptr)) {
    sqlite3_close(db_);                 // closes the connection every other thread holds
    throw std::runtime_error((sql + ": could not get results").data());
}
// src/queries.cc:296 (FetchMaxIdQuery::GetMaxId)
if (sqlite3_prepare_v2(db_, sql.data(), -1, &stmt_, nullptr)) {
    sqlite3_close(db_);
    throw std::runtime_error("Could not retrieve max id for table " + record_.name);
}

db_ is the singleton's only connection. If any thread hits this path, every other thread's in-flight statements now reference a closed sqlite3*, and instance_->db_ becomes a dangling pointer (it is never reset to nullptr). A single transient query error on one thread corrupts the whole process.


Reproduction sketch

Database::Initialize("");  // in-memory; same behavior on a file path

std::vector<std::thread> workers;
for (int t = 0; t < 8; ++t) {
    workers.emplace_back([t] {
        const auto& db = Database::Instance();
        for (int i = 0; i < 100; ++i) {
            db.Save(Person{L"John", L"Doe", 42, true, /*id*/ t * 100 + i});
        }
    });
}
for (auto& w : workers) w.join();
// Observed: "Fatal error in transaction start" from the nested BEGIN on the shared
// connection, and intermittent crashes, depending on timing.

Expected behavior

The library should make concurrent use either safe or explicitly, loudly unsupported. Concretely, some combination of:

  • Document the threading contract up front (the README currently says nothing about threads).
  • Serialize access to the shared connection (e.g. a std::mutex guarding Execute()/GetResults() and the singleton lifecycle), or adopt a connection-per-thread / connection-pool model and open SQLite in serialized mode (SQLITE_OPEN_FULLMUTEX / sqlite3_config(SQLITE_CONFIG_SERIALIZED)).
  • Stop calling sqlite3_close(db_) from per-query error paths; surface the error without destroying the shared connection.

Note: id read-modify-write race (fixed in #8)

Earlier releases had a fourth defect here: SaveAutoIncrement computed the next id in user space as GetMaxId() + 1, a check-then-act that let two threads (or processes) read the same MAX(id) and insert colliding ids. PR #8 (39a2454) resolved thisSaveAutoIncrement (include/database.h:119, batch overload at :140) now lets SQLite assign the id natively via INTEGER PRIMARY KEY AUTOINCREMENT + last_insert_rowid(), so id generation is atomic inside SQLite. The now-unused GetMaxId / FetchMaxIdQuery that backed the old scheme is leftover dead code, tracked separately in #10.

Environment

  • Affects main as of 39a2454.
  • Independent of compiler/OS — the defects are in the library's own synchronization (or lack thereof), not in SQLite.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions