Add Gaussian Mixture Models (GMM)#2248
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (4)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR adds Gaussian Mixture Model support across the C++ core, C and Python bindings, tests, and Fern documentation. ChangesGMM Clustering Feature
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 markdownlint-cli2 (0.22.1)fern/pages/python_api/python-api-cluster-gmm.mdmarkdownlint-cli2 v0.22.1 (markdownlint v0.40.0) Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@c/tests/cluster/gmm_c.cu`:
- Around line 45-64: The cuvsResources_t resource created by
cuvsResourcesCreate(&res) uses a different CUDA stream by default than the
stream used for the asynchronous raft::copy operations, creating a race
condition where cuvsGMMFit executes before the data copy completes. After
calling cuvsResourcesCreate(&res), immediately call cuvsStreamSet(res, stream)
to synchronize the resource's stream with the stream variable that is used by
raft::copy and the subsequent GPU operations. Apply this fix immediately
following the cuvsResourcesCreate call, and also apply the same fix pattern to
the other test sections mentioned in the comment (lines 98-111, 167-187,
216-229).
In `@cpp/src/cluster/gmm_impl.cuh`:
- Around line 1115-1157: The FULL-covariance second pass has inconsistent
responsibilities across its computational steps. The issue is that after
m_finalize mutates weights and means at line 1141, the subsequent estep_tile
call at line 1145 recomputes responsibilities with these updated parameters, but
then m_cov_full_pass uses these new responsibilities which differ from those
used in the first pass and in m_accumulate, breaking the mathematical
consistency of the EM algorithm. Fix this by saving the responsibilities from
the first pass before m_finalize is called, and then reusing those saved
responsibilities in the m_cov_full_pass loop instead of recomputing them with
estep_tile.
In `@python/cuvs/cuvs/cluster/gmm/gmm.pyx`:
- Around line 90-94: The `__cinit__` method in the GMM class does not check the
return value from `cuvsGMMParamsCreate`, which can leave `self.params`
uninitialized if the call fails, leading to undefined behavior. Add error
checking after the `cuvsGMMParamsCreate` call to verify it succeeded and raise
an exception if it fails. Additionally, the `__dealloc__` method currently calls
`check_cuvs()` on `cuvsGMMParamsDestroy`, which can raise exceptions in a
destructor (problematic behavior). Instead, check the return value from
`cuvsGMMParamsDestroy(self.params)` without raising exceptions - only log or
handle the error silently to avoid suppressed exceptions or ResourceWarnings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ca5bb572-978f-46af-87e8-00b27bd7a8a1
📒 Files selected for processing (32)
c/CMakeLists.txtc/include/cuvs/cluster/gmm.hc/include/cuvs/core/all.hc/src/cluster/gmm.cppc/tests/CMakeLists.txtc/tests/cluster/gmm_c.cucpp/CMakeLists.txtcpp/include/cuvs/cluster/gmm.hppcpp/src/cluster/gmm.cuhcpp/src/cluster/gmm_double.cucpp/src/cluster/gmm_float.cucpp/src/cluster/gmm_impl.cuhcpp/src/cluster/gmm_kernels.cuhcpp/tests/CMakeLists.txtcpp/tests/cluster/gmm.cufern/docs.ymlfern/pages/c_api/c-api-cluster-gmm.mdfern/pages/c_api/index.mdfern/pages/cluster/gmm.mdfern/pages/cluster/index.mdfern/pages/cpp_api/cpp-api-cluster-gmm.mdfern/pages/cpp_api/index.mdfern/pages/python_api/index.mdfern/pages/python_api/python-api-cluster-gmm.mdpython/cuvs/cuvs/cluster/CMakeLists.txtpython/cuvs/cuvs/cluster/__init__.pypython/cuvs/cuvs/cluster/gmm/CMakeLists.txtpython/cuvs/cuvs/cluster/gmm/__init__.pxdpython/cuvs/cuvs/cluster/gmm/__init__.pypython/cuvs/cuvs/cluster/gmm/gmm.pxdpython/cuvs/cuvs/cluster/gmm/gmm.pyxpython/cuvs/cuvs/tests/test_gmm.py
| def __cinit__(self): | ||
| cuvsGMMParamsCreate(&self.params) | ||
|
|
||
| def __dealloc__(self): | ||
| check_cuvs(cuvsGMMParamsDestroy(self.params)) |
There was a problem hiding this comment.
CRITICAL: Unchecked error in __cinit__ and unsafe exception in __dealloc__
Two related issues in the lifecycle methods:
-
Line 91:
cuvsGMMParamsCreatereturn value is not checked. If allocation fails,self.paramsremains uninitialized (undefined behavior), and subsequent operations will silently corrupt memory or crash. -
Line 94:
check_cuvs()can raise exceptions in__dealloc__, which is problematic:- If
__cinit__failed silently,self.paramsmay be invalid/garbage - Raising exceptions in destructors can cause suppressed errors or ResourceWarnings
- If
Proposed fix
def __cinit__(self):
- cuvsGMMParamsCreate(&self.params)
+ check_cuvs(cuvsGMMParamsCreate(&self.params))
def __dealloc__(self):
- check_cuvs(cuvsGMMParamsDestroy(self.params))
+ if self.params is not NULL:
+ cuvsGMMParamsDestroy(self.params)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@python/cuvs/cuvs/cluster/gmm/gmm.pyx` around lines 90 - 94, The `__cinit__`
method in the GMM class does not check the return value from
`cuvsGMMParamsCreate`, which can leave `self.params` uninitialized if the call
fails, leading to undefined behavior. Add error checking after the
`cuvsGMMParamsCreate` call to verify it succeeded and raise an exception if it
fails. Additionally, the `__dealloc__` method currently calls `check_cuvs()` on
`cuvsGMMParamsDestroy`, which can raise exceptions in a destructor (problematic
behavior). Instead, check the return value from
`cuvsGMMParamsDestroy(self.params)` without raising exceptions - only log or
handle the error silently to avoid suppressed exceptions or ResourceWarnings.
|
/ok to test d4175d9 |
|
Caution Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted. Error details |
This adds GMM to cuvs. This a pure fp32/fp64 cuda-core version. It already implements some of the memory advance from Flash-GMM.