Skip to content

Anchored Dual-pass HLLD for Hypoelasticity (+ HLLC and interface-consistent HLL)#1414

Draft
ChrisZYJ wants to merge 82 commits into
MFlowCode:masterfrom
ChrisZYJ:hypo_hlld
Draft

Anchored Dual-pass HLLD for Hypoelasticity (+ HLLC and interface-consistent HLL)#1414
ChrisZYJ wants to merge 82 commits into
MFlowCode:masterfrom
ChrisZYJ:hypo_hlld

Conversation

@ChrisZYJ

@ChrisZYJ ChrisZYJ commented May 9, 2026

Copy link
Copy Markdown
Contributor

Description

Adds:

  1. Hypoelasticity: Anchored Dual-pass HLLD
  2. Hypoelasticity: HLLC
  3. Hypoelasticity: HLL option (interface-consistent)
  4. HLL option (alpha div U) so non-conservative treatment aligns with HLLC

Key Design Choices

Separate HLLD Riemann Solvers

At a glance it might be tempting to combine HLLD MHD with dual-pass hypoelasticity HLLD, but keeping them separate makes the code cleaner and much easier to maintain because:

  1. Unlike HLL or HLLC, HLLD is a class of HLLD-type solvers, with formulas and states dependent on the eigenstructure of the governing equations, so the inner states' equations are completely different for MHD vs Hypoelasticity.
  2. HLLD hypoelasticity has a newly developed dual-pass anchored form, making it different from any convenional HLLD Riemann solver. The anchored forms are necessary for the non-conservative hypoelasticity terms, which MHD does not have.
  3. MHD and Hypoelasticity deal with completely different physical regimes with different governing equations, and any changes or new physical models added in the future will not apply to both modules at once.

Riemann Source Terms

For the non-conservative terms, unlike the usual governing equations that only need div U i.e. du/dx, dv/dy, dw/dz (alpha div U, K div U, etc.), Hypoelasticity has cross terms like du/dy, so we must also pass those Riemann-consistent traces from Riemann solver to the rhs. (The old Hypoelasticity code with the HLL Riemann solver uses finite difference for non-conservative rhs, which provides enough stability given that HLL smears the interface immediately, so there wasn't a need to pass the du/dy traces before this PR. But that does not work for HLLC/HLLD for Hypoelasticity.)

Also grouped/named the condition branches (with lots of comments within the code):

Branch Face quantity read RHS formula per $\alpha_k$ K*div(u) velocity source
adv_src_alpha_iface flux_src_n(dir)%vf(j_adv) = per-fluid $\Psi_{\alpha_k}$ $u_\text{cell} \cdot \Delta\Psi_\alpha / \Delta x$ nc_iface_vel_n(dir)%vf(1)
adv_src_vel_iface flux_src_n(dir)%vf(adv\%beg) = shared $\Psi_u$ $\alpha_k \cdot \Delta\Psi_u / \Delta x$ Same flux_src_n slot (already $\Psi_u$)
adv_src_none Skipped (HLLD handles internally)

The derivations, meanings, and usage of the Riemann source variables are not straightforward. I've added some hopefully very helpful notes in misc/dev_notes for future developers (or AI agents; directing them to my notes should help them make fewer mistakes with the source terms) in terms of the understanding and derivations for the HLL/HLLC non-conservative fluxes, and their variable mapping for Riemann solvers and RHS.

Backwards Compatibiilty

  • All default behaviors preserved exactly (newly added features as options)
    • The only exception is the removal of an incorrect ad-hoc fluids-limit guard that affects only Hypoelasticity HLL
  • All existing usage of Riemann and rhs source terms are preserved. No refactor is done to keep the scope of this PR limited (any refactoring would touch most of the existing HLLC functionalities)

Type of change

  • New feature

Testing

  • All tests passed locally on CPU and Nvidia GPU, and on Frontier.

  • Smooth Eigenmode Convergence

image
  • Weak Solution Comparison (Rodriguez & Johnsen (2019) §5.3(b))
image
  • Weak Scaling on Frontier
image

Checklist

  • I added or updated tests for new behavior
  • I updated documentation if user-facing behavior changed
  • GPU results match CPU results
  • Tested on NVIDIA GPU or AMD GPU

AI code reviews

Reviews are not triggered automatically. To request a review, comment on the PR:

  • @coderabbitai review — incremental review (new changes only)
  • @coderabbitai full review — full review from scratch
  • /review — Qodo review
  • /improve — Qodo code suggestions
  • @claude full review — Claude full review (also triggers on PR open/reopen/ready)
  • Add label claude-full-review — Claude full review via label

@sbryngelson

Copy link
Copy Markdown
Member

Heads-up on what changed in master since this branch was cut, since the conflicts here are structural rather than mechanical:

  1. Unify ICPP STL onto the shared IB model path #1546 unified the flux arrays to x-only: the per-direction flux_rs{x,y,z}_vf arrays are gone; Riemann code now writes flux_rsx_vf(${SF('')}$, ...) with permuted indices. The HLLD hypoelastic anchor block here still emits flux_rs${XYZ}$_vf inside its Fypp direction loop, so the y/z iterations reference arrays that no longer exist — that block needs reworking against the new architecture before a merge with master can compile.
  2. Named values for enumerated case parameters and AST-based analytic IC codegen #1550 introduced named parameter values + generated namelists: riemann_solver comparisons in master are now riemann_solver == riemann_solver_hll etc., and the namelist statement in m_start_up comes from a generated include — new parameters get registered in toolchain/mfc/params/definitions.py and appear automatically.

We attempted a maintainer-side conflict resolution and got everything mechanical done, but the HLLD flux block (and a couple of smaller physics choices, e.g. the elastic flux adjustment and the hll_u_interface advection-source semantics) need your judgment. Happy to share the partial resolution as a reference branch if useful — just say the word.

ChrisZYJ added 5 commits June 10, 2026 15:35
Resolves 12 conflicted files. Hypo code migrated to the reshape-removed physical-layout convention (MFlowCode#1433): single rsx buffers + SF()-pattern face indexing in s_hll/s_hllc hypo blocks, s_hypo_hlld kernel, and hatR/nc_iface finalizers. Params re-registered via NAMELIST_VARS (MFlowCode#1456). HLLC xi-factor fix (MFlowCode#1403) preserved in non-hypo branches; hypo branches kept numerically verbatim. Validation: bitcheck + hypo suite + full suite follow as separate verification.
…beta (MFlowCode#1408)

Upstream's WENO5 sum-of-squares smoothness indicators (cancellation-free on uniform grids, MFlowCode#1408) change WENO5 answers; upstream regenerated 18 of its own goldens in that PR. These two PR-added tests (2D axisym HLL u-interface, weno_order=5, tol 1e-11) are the only ones in the suite sensitive enough to exceed tolerance (rel up to 7.4e-3 after 50 steps). Attribution verified by control: the same config at weno_order=1 differs between pre-merge and merged trees by at most 1 ulp (6e-16 rel), so the u-interface/axisym code path itself is exact; the shift is entirely the WENO5 algorithm change. All 47 hypoelasticity tests and the remaining 614 suite tests pass against existing goldens.
The elastic branches of s_write_probe_files passed q_cons_vf(1) (partial density) instead of q_cons_vf(eqn_idx%E) (total energy) to s_compute_pressure in all three dimensional branches, yielding a constant nonphysical probe pressure for hypoelastic runs. Diagnostic-only: solver and golden outputs unaffected. Regression introduced by the eqn_idx refactor.
@ChrisZYJ

Copy link
Copy Markdown
Contributor Author

Thanks for the update. I'm working on the merge together with a few other changes - will commit soon

@sbryngelson

Copy link
Copy Markdown
Member

Following up on my earlier note about #1546 (x-only flux arrays) — the Riemann refactor I mentioned has now landed (#1550#1556), and since you said you're mid-merge, here is a map of where your touched regions now live on master so you can relocate rather than re-resolve:

src/simulation/m_riemann_solvers.fpp is now a ~116-line dispatcher. The solver bodies were split out, and your hunks map as follows:

  • s_hll_riemann_solver (your interface-consistent HLL changes) → src/simulation/m_riemann_solver_hll.fpp
  • s_hllc_riemann_solver (your 12 hunks incl. low-Mach/ADC work) → src/simulation/m_riemann_solver_hllc.fpp
  • s_hlld_riemann_solver and your new s_hypo_hlld_riemann_solver (anchored dual-pass) → src/simulation/m_riemann_solver_hlld.fpp
  • s_populate_riemann_states_variables_buffers, s_initialize_riemann_solver, s_finalize_riemann_solver, and the viscous source-flux routines → src/simulation/m_riemann_state.fpp; your new s_finalize_nc_iface_vel belongs there next to s_finalize_riemann_solver
  • s_riemann_solver" (dispatch), s_initialize/finalize_riemann_solvers_module→ stay inm_riemann_solvers.fpp(wire theriemann_hypo_ADC`/dual-pass dispatch there)

New parameters must go through the registry now. Your hand-edited declarations and m_mpi_proxy broadcasts for riemann_hypo_ADC, ADC_kappa, hll_u_interface, hypo_hll_interface_rhs will conflict: declarations, namelist bindings, and MPI broadcasts are auto-generated from toolchain/mfc/params/definitions.py (_r() + _nv(); physics constraints in case_validator.py). Default assignments stay manual in s_assign_default_values_to_user_inputs; GPU_DECLARE lines for registry scalars live in m_global_parameters_common.fpp (declare directives must be in the declaring module). See docs/documentation/contributing.md. Derived internal flags (hypo_nc_*, adv_src_*, use_nc_iface_vel) can stay as manual declarations in m_global_parameters.fpp.

Other mappings: your m_hypoelastic split (*_finite_diff_per_sweep/*_iface/*_axisym_geom_iface) still targets m_hypoelastic.fpp (unchanged home, but diff against current master — the energy-coupling code there was recently consolidated); m_checker.fpp, m_rhs.fpp, m_time_steppers.fpp, m_variables_conversion.fpp keep their homes. Your SPLIT_LONG_DIRECTIVE macro patch still applies — acc_macros.fpp/omp_macros.fpp retain the $:acc_directive/$:omp_directive emission points.

Two cautions that still stand: (1) the x-only flux architecture concern from my earlier comment — everything writes flux_rsx_vf with permuted indices now, so the per-direction flux_rs${XYZ}$_vf writes in the hypo anchor blocks need rethinking, and that is a design decision I don't want to guess for you; (2) please regenerate your 38 golden files against current master rather than carrying the old ones over — an unexplained golden diff after the refactor would mask exactly the kind of silent regression we're trying to avoid.

Happy to help with the mechanical relocation once you push your merge — ping me here.

@sbryngelson

Copy link
Copy Markdown
Member

No movement/response from author.

@ChrisZYJ

Copy link
Copy Markdown
Contributor Author

@sbryngelson almost done with the merge

@ChrisZYJ

Copy link
Copy Markdown
Contributor Author

@sbryngelson Hi Spencer, I’ve completed the merge and the PR is ready for review. Would you mind reopening it when you have a chance? Thanks!

@sbryngelson sbryngelson reopened this Jun 23, 2026
@github-actions

Copy link
Copy Markdown

Claude Code Review

Head SHA: d55a3f3

Files changed:

  • 25
  • src/simulation/m_riemann_solver_hllc.fpp
  • src/simulation/m_riemann_solver_hypo_hlld.fpp
  • src/simulation/m_riemann_solver_hll.fpp
  • src/simulation/m_riemann_solver_lf.fpp
  • src/simulation/m_riemann_solvers.fpp
  • src/simulation/m_riemann_state.fpp
  • src/simulation/m_rhs.fpp
  • src/simulation/m_hypoelastic.fpp
  • src/simulation/m_checker.fpp
  • src/common/m_variables_conversion.fpp

Findings

1. CORRECTNESS — Index transposition in HLLC axisymmetric geometric source (y-sweep)

File: src/simulation/m_riemann_solver_hllc.fpp
Hunk context: Inside #:if (NORM_DIR == 2) block (diff lines ~3657–3729)

flux_gsrc_rsx_vf(${SF('')}$, eqn_idx%cont%end + dir_idx(1)) = flux_rsx_vf(j, k, l, &
                                         & eqn_idx%cont%end + dir_idx(1)) - p_face + tau_qq_face

The LHS correctly uses ${SF('')}$ which for NORM_DIR == 2 expands to (k, j, l) — placing the normal-direction loop variable k first, matching the rotated-storage convention of rsx arrays. The RHS reads from flux_rsx_vf(j, k, l, ...) with literal indices, so it reads (transverse, normal, other) instead of (normal, transverse, other). This is a silent index-transposition bug: the wrong element is read from flux_rsx_vf for every y-direction interface, producing incorrect geometric source momentum fluxes for all 2D/axisymmetric hypoelastic HLLC runs. The RHS should use flux_rsx_vf(${SF('')}$, ...).

For comparison, the analogous assignment in the HLLD geometric source path (diff line ~4621) correctly reads flux_rsx_vf via the SF macro.


2. CORRECTNESS — G1_eff/G2_eff hardcoded to fluid_pp(1:2) with no num_fluids guard for HLL/HLLC

File: src/simulation/m_rhs.fpp
Hunk context: s_compute_advection_source_term (diff lines ~2217–2236)

G1_eff = 0._wp; G2_eff = 0._wp
if (hypoelasticity) then
    G1_eff = fluid_pp(1)%G
    G2_eff = fluid_pp(2)%G
end if
! ...
blkmod1(...) = (gammas(1)*pres + pi_infs(1))/gammas(1) + (4._wp/3._wp)*G1_eff
blkmod2(...) = (gammas(2)*pres + pi_infs(2))/gammas(2) + (4._wp/3._wp)*G2_eff

This code runs for all hypoelastic Riemann solvers (HLL, HLLC, HLLD). The checker at diff line ~1092 enforces num_fluids == 2 only for HLLD (riemann_solver == 4). There is no matching restriction for HLL or HLLC + hypoelasticity. For num_fluids > 2 + HLLC + hypoelasticity + alt_soundspeed, fluids 3+ are silently excluded from the bulk-modulus correction, producing wrong sound speeds in the advection source. A @:PROHIBIT covering HLLC/HLL cases (or a generalized loop over all fluids) is needed.

@ChrisZYJ

Copy link
Copy Markdown
Contributor Author

The merge took a while because I first merged #1546, then did a performance refactor of the HLLD path, and had to re-merge onto #1550#1556. The refactored code is a lot cleaner now though. Thanks for the improvements!

The relocation map was helpful, and I followed it where I could. One thing I wanted to discuss is keeping m_riemann_solver_hypo_hlld.fpp separate from m_riemann_solver_hlld.fpp:

  • The infrastructure they share is already factored out into s_populate_riemann_states_variables_buffers, s_initialize/finalize_riemann_solver, and s_compute_speed_of_sound.
  • MHD and hypoelasticity are mutually exclusive, and the two solvers are mathematically different apart from the 5-wave structure. They share only like 40 lines of preamble (#:for scaffold, primitive unpack, mixture-property sum) that are common to all solvers, not just HLLD. Almost everything else is different, including different conservative variable vectors, magnetosonic vs elastic wave speeds, conservative vs non-conservative anchored dual-pass, etc.
  • That being said, I'm happy to fold both subroutines into m_riemann_solver_hlld.fpp if you'd prefer all the HLLD-type subroutines in one file.

On a side note, I think the two developer's notes in misc/ are pretty useful. Feel free to take a look, and let me know if you'd like them moved somewhere else.

@ChrisZYJ

Copy link
Copy Markdown
Contributor Author

For Claude Code Review: 1 is false positive. 2 I've added a PROHIBIT to be extra safe

@sbryngelson sbryngelson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High-level review of structure and organization, not a line-by-line bug pass. The change is smaller than the 110-file count suggests: most of those files are golden tests and examples. The real surface is about 6 files in src/simulation plus the global params and the dev notes.

What works, keep it:

  • The new HLLD solver gets its own module, which matches the existing one-module-per-solver layout. Your reasons for not fusing it with MHD HLLD are correct. Leave them separate.
  • The dev_notes derivations are good. This math is hard to recover later. Keep writing these.
  • The adv_src_* and hypo_nc_dual_pass flags are derived once in m_global_parameters, not raw user inputs. That is the right place for them.

Main point: hypoelasticity now enters the Riemann solvers in three different shapes.

  • HLLC: inline "if (hypoelasticity)" branches inside the existing ~1970-line s_hllc_riemann_solver.
  • HLL: similar inline branches.
  • HLLD: a separate module plus a separate hypo_nc_dual_pass selection path.
    Each choice on its own is fine. Together, anyone asking "how is hypoelasticity handled at interfaces" has to read three different shapes of code. I would not refactor HLLC in this PR (you scoped that out, correctly), but add a short comment at each site, and in dev_notes, that names all three and says why HLLD needs its own path.

Two cleanups that are cheap and in scope:

  1. Collapse the _hatR duplication in the dual pass. The second pass duplicates whole field sets (nc_iface_vel_n and nc_iface_vel_hatR_n) and finalize routines (the regular ones and their _hatR twins). The pass identity, left-anchored vs right-anchored, should be an argument, not a copied routine. This roughly halves the dual-pass code and stops the two copies from drifting apart.
  2. Replace the three adv_src_* booleans with one enum. They are mutually exclusive by construction, and the code already asks maintainers to keep them that way by hand. An enum makes the "two true at once" state impossible and drops the manual rule.

Two notes for later, mostly out of scope:
3. m_rhs.fpp is already ~2200 lines. The dual pass deepens it: allocate hatR fields, solve, finalize hat_L, assemble RHS, finalize hat_R, assemble RHS again. That orchestration would be better in its own routine. Not required here, just flagging it so it does not harden.
4. The 900 to 2000 line solver subroutines are a pre-existing pattern, not something you introduced. Your new solver at 934 lines is actually better than HLLC because you split out the finalize helpers. Splitting the left/right state setup and the per-wave blocks into helpers would set a good example, but it is not a blocker.

Net: the hard parts, the module boundary and the documentation, are right. Before this leaves draft I would do (1) parameterize the dual pass instead of copying hatR, and (2) make adv_src* an enum. The rest is comments or future debt. Inline notes below.

integer, intent(in) :: norm_dir
type(int_bounds_info), intent(in) :: ix, iy, iz

if (hypo_nc_dual_pass) then

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a second solver-selection path. The other solvers are picked by riemann_solver == NUM in the loop below; the dual-pass HLLD is picked by this separate boolean and returns early. Since hypo_nc_dual_pass is derived from riemann_solver and the elasticity flags, the two paths stay consistent, so this is fine to ship. But it is the point where hypoelasticity splits into three different shapes across the Riemann solvers: inline branches in HLLC and HLL, and this separate module for HLLD. Worth a short comment here, and in dev_notes, that names all three and says why HLLD needs its own path.

! adv_src_alpha_iface, adv_src_vel_iface, and adv_src_none true whenever this logic is modified.
!
! HLL Method 1 (alpha-interface): flux_src(adv_idx%beg:adv_idx%end) carries interface alpha_k per fluid.
adv_src_alpha_iface = (riemann_solver == 1 .and. .not. hll_u_interface)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three booleans are mutually exclusive by construction, and the comment just above asks maintainers to keep them that way by hand. A single enum (for example adv_src_mode with values alpha_iface, vel_iface, none) would make the two-true-at-once state impossible and drop the manual rule. Same readability win at every if (adv_src_*) site downstream.

Comment thread src/simulation/m_rhs.fpp

!> hat_R-pass interface velocities for fused dual-pass HLLD: the axisymmetric geometric source consumes both passes' face
!! velocities after the direction loop, so the hat_R values need their own field set (hat_L uses nc_iface_vel_n).
type(vector_field), allocatable, dimension(:) :: nc_iface_vel_hatR_n

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This _hatR field set is a near copy of nc_iface_vel_n, and the dual pass also adds _hatR copies of the finalize routines in the hypo HLLD module. The left-anchored vs right-anchored distinction reads like it should be a parameter, not a duplicated field set plus routine. Passing the pass identity in would remove the copy and stop the two from drifting. Separate, larger point: the dual-pass orchestration (allocate hatR, solve, finalize hat_L, assemble, finalize hat_R, assemble again) sits inside an already ~2200-line m_rhs. Good candidate to move into its own routine later.

contains

!> HLLD Riemann solver resolves all 5 waves for the hypoelastic equations: 1 entropy wave, 2 shear stress waves, 2 fast waves.
subroutine s_hypo_hlld_riemann_solver(qL_prim_rsx_vf, dqL_prim_dx_vf, dqL_prim_dy_vf, dqL_prim_dz_vf, qL_prim_vf, &

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This subroutine is about 934 lines. That is actually better than the existing HLLC solver at ~1970, and you already split out the finalize helpers below, which is the right instinct. Not a blocker, and the giant-subroutine pattern is pre-existing here. If you want to set a better example, the left/right state setup and the per-wave-region blocks are natural helper boundaries.

!! output arrays. Mirrors s_finalize_nc_iface_vel for the nc_iface_vel_hatR_rs* buffers.
!! @param nc_iface_vel_vf Output: physical hat_R velocity components at interfaces
!! @param norm_dir Sweep direction (1=x, 2=y, 3=z)
subroutine s_finalize_nc_iface_vel_hatR(nc_iface_vel_vf, norm_dir)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a copy of s_finalize_nc_iface_vel above with a _hatR suffix, and s_finalize_riemann_solver_hatR mirrors the regular finalize the same way. Two copies that have to stay in sync by hand. Consider one routine that takes the pass (left vs right) or the target field set as an argument, instead of the _hatR twin. Same point as the field duplication noted in m_rhs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

2 participants