The deployed ADAMTS-7 cleavage app (scripts/app/pyodide_template.html in the Amin project) has a
well-liked view: per-residue CPP-SHAP impact (red = raises the prediction, blue = lowers it) painted
onto the AlphaFold cartoon, with fade-context, zoom-to-site, and a pLDDT colouring mode. Today this
lives only in that HTML file — the 3D is rendered by 3Dmol.js in the browser, and the
per-residue impact is recomputed by a hand-written perpos loop inside an embedded render_merged
Python function.
This should become a reusable AAanalysis function. It is not HTML-specific — the browser file is the only browser-bound part. Both halves of the logic already exist in AAanalysis:
- Feature → residue mapping already exists.
get_positions_()(aaanalysis/feature_engineering/_backend/cpp/utils_feature.py:221) turns eachdf_featfeature into the 1-based positions it covers, andPlotPartPositions.get_df_pos(..., value_type="sum")(aaanalysis/feature_engineering/_backend/cpp/_utils_cpp_plot_positions.py:201) already aggregatesfeat_impactper position. The app's manualperposloop is a re-implementation of this. - Structure handling already exists in the
[pro]extra.StructurePreprocessor(aaanalysis/data_handling_pro/_struct_preproc.py) parses PDB/CIF via biopython, andfetch_alphafold()downloads AF models + PAE by UniProt accession. The backend (aaanalysis/data_handling_pro/_backend/struct_preproc/encode_pdb.py) already exposesload_structure(),_ca_coords_and_residues()(best-chain pick + Cα coords), and_plddt_per_residue()(Cα B-factor → pLDDT).
The only genuinely new dependency is a Python 3D viewer. py3Dmol is the Python binding to the
same 3Dmol.js the app uses — interactive in Jupyter, exports a self-contained HTML — so the output
mirrors the deployed app one-to-one. A matplotlib mplot3d fallback gives a static Cα scatter for
headless/docs/CI use.
Outcome: a new pro plotting class CPPStructurePlot whose feature_structure(df_feat, pdb=...)
returns an interactive (py3Dmol) or static (matplotlib) view of per-residue feature impact mapped onto
the protein, with fade/zoom focus, a pLDDT colouring mode, and optional AlphaFold auto-fetch.
This handoff covers two stages, intended as two separate issues:
- Issue 1 —
CPPStructurePlot(static render class). Thedf_feat+ structure → coloured 3D view. A pure render function: data in, coloured structure out. Self-contained, ships first. (Sections Public API through Verification below.) - Issue 2 — interactive live-prediction (
.interactive(...)). An ipywidgets wrapper that calls a user-supplied predictor on every change (pick a site, mutate a residue, drag a threshold) and repaints the Issue-1 view. Depends on Issue 1. (Section Issue 2 below.)
Cost: everything here is free to run — py3Dmol/3Dmol.js (BSD, runs locally in-browser), the AlphaFold-DB fetch (free public endpoint), and the CPP/TreeModel prediction (local CPU). No LLM or hosted-inference component anywhere.
- Placement / API: new pro plotting class
CPPStructurePlot, mirroringCPPPlot/AAMutPlot. Core stays dependency-free. - Renderer:
py3Dmolprimary (interactive + standalone HTML); matplotlibmplot3dstatic fallback. - v1 scope: (1) colour by SHAP impact, (2) fade-context + zoom-to-site, (3) pLDDT colouring mode, (4) AlphaFold auto-fetch by accession.
import aaanalysis as aa
sp = aa.CPPStructurePlot(jmd_n_len=10, jmd_c_len=10, verbose=True)
view = sp.feature_structure(
df_feat=df_feat, # standard CPP df_feat (needs col_imp present)
pdb="Q9NQ76.pdb", # OR uniprot="Q9NQ76" to auto-fetch from AlphaFold-DB
col_imp="feat_impact_site", # signed per-position impact column (shap_plot semantics)
tmd_len=10, # TMD length used when the features were generated
start=312, # absolute residue number of the FIRST jmd_n residue in the structure
mode="impact", # "impact" (red/blue) | "plddt" (AF palette)
focus="fade", # "whole" | "fade" (ghost the rest) | "zoom" (camera to window)
size_by_impact=True, # stick radius proportional to |impact|
chain=None, # None = auto best-matching chain (reuses encode_pdb logic)
backend=None, # None = py3Dmol if available else mpl; "py3dmol" | "mpl" to force
)
view.show() # interactive in Jupyter (py3Dmol) / inline PNG (matplotlib)
view.write_html("out.html") # self-contained interactive page (py3Dmol backend)
view.savefig("out.png") # static render (matplotlib backend; py3Dmol can export PNG too)feature_structure() returns a thin StructureView wrapper exposing show(), write_html(path),
savefig(path), and _repr_html_, so the return type is uniform across both backends.
New module aaanalysis/feature_engineering_pro/ (sibling to feature_engineering/, where CPPPlot
lives), gated by the pro extra following the existing ShapModel pattern.
aaanalysis/feature_engineering_pro/
__init__.py # exports CPPStructurePlot
_cpp_structure_plot.py # CPPStructurePlot class (public, numpydoc, input checks)
_backend/
cpp_structure/
map_impact_to_residues.py # df_feat -> {abs_residue: summed impact}
render_py3dmol.py # py3Dmol cartoon colouring + focus modes + HTML export
render_mpl.py # matplotlib mplot3d Ca-scatter fallback
colors.py # shap impact ramp + AF pLDDT palette (reuse package constants)
-
map_impact_to_residues— wrapget_positions_(features, start, tmd_len, jmd_n_len, jmd_c_len)to get each feature's positions, then aggregatedf_feat[col_imp]per position by sum (the samevalue_type="sum"rulePlotPartPositions.get_df_posuses forfeat_impact).startshifts window positions to absolute structure residue numbers. Return{resi: impact}plusmax_absfor normalisation. This replaces the app's hand-writtenperposloop. -
Structure parsing — reuse
aaanalysis/data_handling_pro/_backend/struct_preproc/encode_pdb.py:load_structure()(PDB/CIF),_ca_coords_and_residues()(best chain + Cα xyz, honours thechainarg),_plddt_per_residue()(Cα B-factor → pLDDT). No new parsing code. -
AlphaFold auto-fetch — when
uniprot=is given instead ofpdb=, call the existingStructurePreprocessor.fetch_alphafold()into a temp dir, then load the model. Reuses pro code.
- Impact ramp: white→
COLOR_SHAP_POS(#FF0D57) / white→COLOR_SHAP_NEG(#1E88E5) fromaaanalysis/utils.py:454, with the app'ssign·sqrt(|t|)perceptual transform (shapColor,pyodide_template.html:318). - pLDDT palette: AlphaFold blue→green→yellow→orange (
#0053D6/#65CBF3/#FFDB13/#FF7D45), matching the app'splddtColor()(pyodide_template.html:208).
- py3Dmol (
render_py3dmol.py):addModel(pdb,"pdb"); per-residuesetStyle({resi},{cartoon:{color}}); optionalstickradius proportional to |impact| whensize_by_impact;focus="fade"ghosts non-window residues (opacity:0.45),focus="zoom"callszoomTo({resi: window}). The window = TMD ± jmd lengths derived fromstart/tmd_len/jmd_*_len.write_html()uses py3Dmol's HTML export. MirrorsapplyBaseStyle/show3Din the app (pyodide_template.html:329/:459). - matplotlib (
render_mpl.py):mplot3dscatter of Cα coords coloured by the same ramp; faded residues at low alpha; returns aFigureforsavefig. Used automatically when py3Dmol is unavailable or whenbackend="mpl".
aaanalysis/__init__.py: add atry: from .feature_engineering_pro import CPPStructurePlotblock withmissing_feature_stub("CPPStructurePlot", e, mode="pro")on ImportError; append to__all__on success._EXTRA_MODULES["pro"]inaaanalysis/__init__.py:83: add"py3Dmol"(its import name) so a missing viewer yields the friendly install hint rather than a raw traceback.pyproject.toml[project.optional-dependencies] pro: addpy3Dmol>=2.0. (biopython/requests already present; matplotlib is already a core dep, so the fallback adds no new dependency.)
- New:
aaanalysis/feature_engineering_pro/_cpp_structure_plot.py(+ backend files above) - Reuse:
aaanalysis/feature_engineering/_backend/cpp/utils_feature.py:221(get_positions_) - Reuse:
aaanalysis/data_handling_pro/_backend/struct_preproc/encode_pdb.py(load_structure,_ca_coords_and_residues,_plddt_per_residue) - Reuse:
aaanalysis/data_handling_pro/_struct_preproc.py(StructurePreprocessor.fetch_alphafold) - Reuse:
aaanalysis/utils.py:454(COLOR_SHAP_POS/COLOR_SHAP_NEG) - Edit:
aaanalysis/__init__.py(export +_EXTRA_MODULES),pyproject.toml(pro extra) - Parity reference (Amin project):
scripts/app/pyodide_template.html—render_mergedperpos loop (~line 787),shapColor(318),plddtColor(208),applyBaseStyle(329),show3D(459).
- Unit — mapping: build a tiny
df_feat(2–3 known features), callmap_impact_to_residues, and assert per-residue sums equal a hand-computedperposfor a chosenstart/tmd_len. Cross-check one case against the app'srender_mergedoutput for the same window. - Integration — real data: use the deployed ADAMTS-7
df_feat+ an AlphaFold PDB; runfeature_structure(..., mode="impact", focus="fade"),write_html("/tmp/v.html"), confirm the HTML opens with a coloured cartoon. Repeat withmode="plddt". - Auto-fetch:
feature_structure(df_feat, uniprot="<acc>")downloads the AF model and renders (network-gated; mock/skip in CI). - Fallback: force
backend="mpl"(or simulate py3Dmol absent); assert aFigureis returned andsavefigwrites a PNG. - Gating: in an env without
py3Dmol,aa.CPPStructurePlotresolves to the stub and raises the friendlypip install 'aaanalysis[pro]'hint on use.
Live re-prediction (→ Issue 2); topology colouring mode; linked hover/tooltips across panels; multi-fragment AlphaFold renumbering; per-residue secondary-structure/burial readouts. App conveniences that can be layered later.
Issue 1 is a one-shot render: df_feat in, coloured structure out. Issue 2 closes the loop so the
prediction itself is live and adjustable in a notebook — pick a site, mutate a residue, or drag a
threshold, and the model re-runs and the cartoon repaints. This is the notebook-native equivalent of
the deployed pyodide_template.html app, but driven by the real Python model on a live kernel
(exact predictions, no JS/Pyodide port needed).
The key fork is where it runs (worth stating in the issue so expectations are set):
- Live Jupyter kernel — full live re-prediction works: a Python callback runs the model and repaints the py3Dmol view. This is the target of Issue 2.
- Exported standalone
.html— has no Python kernel, so it cannot call back into the model. Shareable live prediction requires the model to run in the browser (Pyodide or a JS port) — which is exactly what the existing app already does. Out of scope here; Issue 2 targets the live-kernel path only.
import aaanalysis as aa
sp = aa.CPPStructurePlot(jmd_n_len=10, jmd_c_len=10)
# `predictor` is a user-supplied callable: (sequence, p1) -> df_feat
# (e.g. wraps CPP.run(...) + ShapModel/TreeModel to produce a df_feat with col_imp at the site)
ui = sp.interactive(
predictor=my_predictor, # callable returning a df_feat for the current selection
sequence="MK...", # full protein sequence
pdb="Q9NQ76.pdb", # or uniprot="Q9NQ76"
tmd_len=10,
col_imp="feat_impact_site",
mode="impact", focus="fade",
)
ui # renders the widget panel + linked 3D view in the notebook output cellinteractive() returns an ipywidgets container (an HBox/VBox) that displays inline. It owns the
control widgets and re-renders by reusing the same Issue-1 mapping + render backend.
[site slider / sequence box / impact-threshold slider]
--on_change--> Python callback (debounced)
predictor(sequence, p1) -> df_feat (user's model, runs on the kernel)
map_impact_to_residues(df_feat, start=p1_anchor, tmd_len, jmd_*_len)
render backend (py3Dmol) -> repaint the view in-place
- Anchor: the selected P1 (or TMD start) sets
start, the same anchor Issue 1 uses to map window-relativedf_featpositions to absolute structure residues. Selection drivesstart. - Debounce: coalesce rapid slider moves (e.g.
Output+ a short timer) so the model isn't re-run on every intermediate value — mirrors the app's "pause while exploring" behaviour. - In-place repaint: update the py3Dmol view inside a single
ipywidgets.Outputso the camera/zoom state is preserved across re-predictions where possible.
Add one public method to CPPStructurePlot plus a small backend file. No change to the Issue-1 render
path — Issue 2 only drives it.
aaanalysis/feature_engineering_pro/
_cpp_structure_plot.py # + .interactive(predictor, sequence, ...) method
_backend/
cpp_structure/
interactive_widgets.py # ipywidgets panel, debounced callback, Output repaint
interactive_widgets.pybuilds the widget panel, wires the debounced callback topredictor → map_impact_to_residues → render_py3dmol, and manages theOutputcell.predictoris a plain callable supplied by the user — the class does not hard-code CPP/TreeModel, so any model that returns adf_feat(withcol_imp) for a selection works. The docstring should show one worked example wiringCPP+ShapModel/TreeModelinto such a callable.
- Add
ipywidgetsto theproextra inpyproject.toml, and"ipywidgets"to_EXTRA_MODULES["pro"]inaaanalysis/__init__.py. The import is lazy inside.interactive()so Issue-1 static use never requires ipywidgets — only the interactive path does.
- Headless smoke: instantiate
.interactive(...)with a stubpredictor(returns a fixeddf_feat); assert it builds a widget container without a display backend (no exception, correct children). - Callback wiring: simulate a site-slider change; assert the stub predictor is called with the
expected
(sequence, p1)and thatmap_impact_to_residuesis invoked with the matchingstart. - Manual notebook check: in JupyterLab, drive the real CPP/TreeModel predictor; confirm changing the site repaints the cartoon with updated colours and the camera is preserved.
- Gating: without
ipywidgets,.interactive()raises the friendlypip install 'aaanalysis[pro]'hint while.feature_structure()(Issue 1) still works.
Shareable standalone-HTML live prediction (needs Pyodide / JS port — already solved in the deployed app); mutation scanning UIs; multi-site comparison panels.