Background
Follow-up to #2192 (foundation) and PR #2193 (pytest + Zuul infrastructure). Part of Tier 8 (#2199). This issue covers the workflow-oriented CLI command modules under osism/commands/: apply.py (501 LOC), check.py (778 LOC), validate.py (217 LOC), wait.py (159 LOC), compose.py (45 LOC), sync.py (406 LOC), get.py (302 LOC), log.py (237 LOC) and console.py (296 LOC) — together ~2,940 LOC. They are cliff Command classes that schedule Celery tasks (apply, validate, sync), inspect Celery state (wait, get tasks), shell out via SSH/clush/docker (compose, log, console) or compare filesystem metadata (check).
Scope
Add tests/unit/commands/test_apply.py, test_check.py, test_compose.py, test_sync.py, test_log.py and test_console.py; extend the existing test_validate.py, test_wait.py and test_get.py.
Already covered (do not duplicate):
- test_validate.py: Run._handle_task returns 1 on TimeoutError while waiting.
- test_wait.py: exit-code contract of the --live path (timeout with one/two tasks, task-rc passthrough, success → 0).
- test_get.py: Hosts.take_action (inventory load failure → 1, empty inventory → success) and Hostvars.take_action (inventory query failure → 1, missing variable → success).
This is a large group: prioritize the pure-logic / high-value functions listed below; interactive loops (log.Opensearch, console container_prompt) are lowest priority. Like the Tier 3 issues, this issue may be split further during implementation (suggested cut: apply+check / sync+get / compose+log+console + the validate/wait extensions).
Test targets
Run._prepare_task() — apply.py:269
Patch osism.tasks.ansible.run, osism.tasks.ceph.run, osism.tasks.kolla.run, osism.tasks.kubernetes.run (the .si attribute is what gets called; imports happen inside the method body, so patch the canonical task module paths). MAP_ROLE2ENVIRONMENT / MAP_ROLE2RUNTIME are lazy module attributes of osism.data.playbooks — see mocking hints.
- role="ceph" → environment forced to "ceph", ceph.run.si("ceph", "ceph", arguments, auto_release_time=task_timeout)
- role="ceph-osds", environment "ceph" → ceph- prefix stripped: ceph.run.si("ceph", "osds", ...)
- sub="zone-a" with ceph/kubernetes/kolla environments → environment becomes "ceph.zone-a" etc.
- environment "kubernetes" → kubernetes.run.si(...)
- role="loadbalancer-ng" → returns chain of kolla.run.si(env, "loadbalancer-ng", ...) piped into a group over enums.LOADBALANCER_PLAYBOOKS
- environment "kolla", role="kolla-keystone" → kolla- prefix stripped; kolla.run.si called with ["-e kolla_action=deploy"] + arguments
- role="mariadb-ng" / "rabbitmq-ng" → argument is -e kolla_action_ng=<action> instead
- kolla role listed in MAP_ROLE2RUNTIME["osism-ansible"] (and not "common") → routed to ansible.run.si with the original arguments (no kolla_action)
- environment None and role not in MAP_ROLE2ENVIRONMENT → bare-except fallback to "custom", info log "Trying to run play ...", ansible.run.si("custom", ...)
- overwrite="other" in the default branch → environment replaced by "other"
Run._handle_collection() / Run.handle_collection() — apply.py:135 / apply.py:218
Patch the task modules as above plus osism.tasks.ansible.noop; patch celery.chain / celery.group or assert on the returned signature structure. Drive via small hand-built Role trees (osism.data.enums.Role) instead of the real MAP_ROLE2ROLE.
- item that is not a Role → TypeError raised and error logged
- flat list of Roles without dependencies → one prepared task per role, wrapped in a group
- Role with nested dependencies → chain(parent_task, child_group) built recursively
- dry_run=True → ansible.noop.si() used instead of _prepare_task
- show_tree=True → no tasks created, returns None, tree logged ("A [0] ..." / indentation grows with counter)
- handle_collection: apply_async() called on the result when not show_tree; not called for show_tree; distinct log messages for dry-run / show-tree / normal mode (patch osism.data.enums.MAP_ROLE2ROLE with dict-patch for the collection lookup)
Run.handle_role() / Run.handle_loadbalancer_task() — apply.py:371 / apply.py:106
Patch osism.tasks.handle_task (imported inside the method) and stub _prepare_task to return a MagicMock whose apply_async() yields either a plain result or a celery.result.GroupResult instance.
- plain task → handle_task(task, wait, format, timeout) rc passed through
- GroupResult → handle_loadbalancer_task path taken
- handle_loadbalancer_task with wait=True: rc comes from handle_task(t.parent, ...), t.get() called
- wait=False: t.parent.get() additionally called (garbage-collector workaround), children logged for format="log"
Run.take_action() — apply.py:417
Patch osism.commands.apply.utils.check_task_lock_and_exit, osism.commands.apply.utils.check_ansible_facts, and the handle_role / handle_collection methods on the command instance.
- check_task_lock_and_exit always invoked first
- no role → table of MAP_ROLE2ENVIRONMENT printed (capsys), returns 0
- ansible-facts freshness check: performed when a role is given and >300 s since utils._last_ansible_facts_check; skipped for roles "gather-facts"/"facts", for --show-tree, and when the last check is recent (reset the _last_ansible_facts_check attribute on osism.utils between tests)
- role="a//b" → handle_role called once per segment
- --retry 2 with handle_role always returning 1 → called 3 times, rc 1; success on 2nd attempt → 2 calls, rc 0
- collection branch: handle_collection returns None, so rc != 0 is truthy and the loop breaks with take_action returning None; pin this current behavior in a test (candidate for a follow-up fix)
get_file_info() / collect_file_info() — check.py:31 / check.py:59
Pure filesystem helpers — use tmp_path, no mocking needed for the happy paths.
- small regular file → dict with inode, mtime, size, mode, uid, gid, is_link=False and an md5 hash
- file ≥ 1 MiB → hash is None
- unreadable file (patch builtins.open to raise IOError) → hash is None, rest populated
- nonexistent path → {"error": ...}
- collect_file_info: directory tree with .git/venv/__pycache__ subdirs → skipped; symlinks skipped; both files and directories included with relative paths
- max_files=2 on a larger tree → scan stops, warning logged
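The tree these cases need can be built once and reused. The following is a minimal fixture sketch, not code from the repository: build_sample_tree and the file names are hypothetical, and it only creates the inputs (small file, 1 MiB file, excluded directories, symlink). The calls into get_file_info / collect_file_info are left out because their exact signatures are not stated here.

```python
import os
import tempfile
from pathlib import Path

ONE_MIB = 1024 * 1024


def build_sample_tree(root: Path) -> None:
    """Create the inputs for the file-info cases (hypothetical helper)."""
    # Small regular file: expected to get a content hash.
    (root / "small.txt").write_text("hello\n")
    # File of exactly 1 MiB: expected to be reported with hash None.
    (root / "big.bin").write_bytes(b"\0" * ONE_MIB)
    # Ordinary subdirectory that should show up with a relative path.
    (root / "sub").mkdir()
    (root / "sub" / "nested.txt").write_text("nested\n")
    # Directories the scan is expected to skip.
    for skipped in (".git", "venv", "__pycache__"):
        (root / skipped).mkdir()
        (root / skipped / "ignored.txt").write_text("ignored\n")
    # Symlink that the scan is expected to skip.
    os.symlink(root / "small.txt", root / "link.txt")


# Stand-alone demonstration; in a pytest test, pass tmp_path instead.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_sample_tree(root)
    entries = sorted(p.name for p in root.iterdir())
    big_size = (root / "big.bin").stat().st_size
    link_is_symlink = (root / "link.txt").is_symlink()
```

In a real test the helper would receive pytest's tmp_path, and the assertions would be made on whatever the scan returns for that directory.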
parse_stat_output() — check.py:88
Pure parser, no mocks.
- well-formed FILE:/INODE:/SIZE:/MTIME:/HASH: blocks → typed dict (int, int, float)
- HASH:NONE → None
- ERROR: lines captured under "error"
- empty lines and key/value lines before any FILE: line ignored
- multiple files in one output parsed independently
Mount._compare_file_info() — check.py:338
Pure comparison, no mocks.
- differing inodes → entry in inode_mismatches with local_inode/fresh_inode
- differing hashes with check_content=True → content_mismatches; same input with check_content=False → empty
- file only in fresh info → missing_in_local; only in local info → missing_in_fresh
- entries containing "error" skipped entirely
- falsy inode (None/0) on either side → no mismatch recorded
- results sorted by file path
Mount._get_container_id() / Mount._get_mount_source() — check.py:180 / check.py:222
Patch builtins.open with mock_open payloads and os.uname.
- cgroup line containing docker with a 64-char id → first 12 chars returned; 12-char id returned as-is
- cgroup unreadable + 12-char hostname → hostname returned
- cgroup + hostname fail, /proc/self/mountinfo containing /docker/containers/<id>/ → id truncated to 12
- nothing matches → None
- _get_mount_source: mountinfo line whose 5th field equals the mount path → source after the - separator returned (must start with /); no separator / non-absolute source / no matching line / IOError → None
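Several of these cases need one /proc file to be readable while another one fails. One way to get that from a single builtins.open patch is a per-path dispatcher, sketched below. make_fake_open, the payload and the cgroup line layout are illustrative assumptions, not taken from check.py.

```python
from unittest.mock import mock_open, patch


def make_fake_open(payloads):
    """Return an open() replacement that serves canned text per path.

    Paths missing from `payloads` raise IOError, which models an
    unreadable /proc file (hypothetical helper for the tests).
    """

    def fake_open(path, *args, **kwargs):
        if path not in payloads:
            raise IOError(f"not readable in this test: {path}")
        # mock_open builds a file-like handle with read/readline/iteration.
        return mock_open(read_data=payloads[path])(path, *args, **kwargs)

    return fake_open


CONTAINER_ID = "0123456789abcdef" * 4  # 64 hex characters
PAYLOADS = {"/proc/self/cgroup": f"12:memory:/docker/{CONTAINER_ID}\n"}

with patch("builtins.open", side_effect=make_fake_open(PAYLOADS)):
    # The cgroup file is served from the canned payload ...
    with open("/proc/self/cgroup") as handle:
        cgroup_line = handle.readline().strip()
    # ... while any other path behaves like an unreadable file.
    try:
        open("/proc/self/mountinfo")
        mountinfo_readable = True
    except IOError:
        mountinfo_readable = False
```

Inside the with block the method under test would be called instead of open() directly; adding or removing keys in PAYLOADS selects which fallback stage is reached.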
Mount.take_action() — check.py:391
Focus on the early-exit guards and final rc; patch osism.commands.check.DOCKER_AVAILABLE, osism.commands.check.docker, os.path.exists, and the instance helpers (_get_container_id, _get_volume_mount_info, _get_mount_source, _run_fresh_container) plus collect_file_info.
- path does not exist → 1 (format="script" prints FAILED: ...)
- DOCKER_AVAILABLE=False → 1
- Docker socket missing → 1
- docker.from_env raises → 1
- mount source not determinable (no --host-path, no Docker mount info, no mountinfo) → 1
- bind mount → source taken from mount info; volume mount with --volume-name → override used
- _run_fresh_container raises → 1
- consistent comparison → 0 (script prints PASSED); inode mismatches → 1 with INODE_MISMATCHES:<n> in script format
Inode.take_action() — check.py:662
- explicit file list under tmp_path → rows with type/inode/size; symlinks and missing files skipped
- no files given → random sampling from environments/* and inventory/* (patch random.sample for determinism); up to 2 entries per subdirectory plus direct files
- all three formats (table via capsys, log, script); returns 0
validate.Run.take_action() — validate.py:74 (gap)
Patch osism.tasks.ansible.run, osism.tasks.ceph.run, osism.tasks.kolla.run (.delay), osism.commands.validate.utils.check_task_lock_and_exit, and stub _handle_task.
- kolla validator (e.g. keystone-config) → kolla.run.delay("kolla", "keystone", arguments) with -e kolla_action=config_validate appended to arguments
- --environment custom honored for ceph/kolla runtimes (no default applied)
- ceph validator (ceph-config) → ceph.run.delay("ceph", "validate", ...) (playbook rewritten via VALIDATE_PLAYBOOKS)
- osism-ansible validator without explicit playbook key (e.g. ntp) → ansible.run.delay("generic", "validate-ntp", ...)
- check_task_lock_and_exit invoked
validate.Run._handle_task() — validate.py:55 (gap; timeout case exists)
- wait=True, fetch_task_output returns rc → rc passed through
- wait=False, format="log" → info log, returns 0
- wait=False, format="script" → task id printed, returns 0
validate.Scs.take_action() — validate.py:159
Patch osism.tasks.openstack.setup_cloud_environment, osism.tasks.openstack.cleanup_cloud_environment (imported inside the method) and osism.commands.validate.subprocess.run.
- setup_cloud_environment returns success=False → returns 1, no subprocess call, no cleanup
- happy path → command contains -s <cloud>, -a os_cloud=<cloud>, -V <version>, ends with scs-compatible-iaas.yaml; OS_CLIENT_CONFIG_FILE=/tmp/clouds.yaml in env; returncode passed through
- --verbose/--debug/--tests/--output/--sections each append the corresponding flag
- subprocess.run raises FileNotFoundError → 1; generic exception → 1
- cleanup_cloud_environment(temp_files, original_cwd) called in all post-setup paths (finally)
wait.Run.get_all_task_ids() / take_action() — wait.py:50 / wait.py:62 (gap; --live exit codes exist)
Patch celery.Celery, celery.result.AsyncResult, osism.commands.wait.time.sleep, and osism.utils._init_redis. Make AsyncResult return objects whose state changes between iterations (via side_effect) so re-queue loops terminate.
- get_all_task_ids: merges ids from i.scheduled() and i.active(), returns them sorted
- no task ids on CLI → ids pulled from get_all_task_ids, refresh mode enabled
- PENDING + query_task finds nothing → "unavailable" logged / <id> = UNAVAILABLE printed, task not re-queued
- PENDING + task known to a worker → re-queued, then SUCCESS on the next pass terminates
- SUCCESS with --output → result.get() printed
- STARTED without --live → re-queued, finishes when state flips to SUCCESS
- --refresh 1 → after the queue drains, get_all_task_ids consulted one more time
- format="script" prints <id> = <STATE> lines instead of log output
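The state progression for the re-queue cases can be scripted with a side_effect list on the patched AsyncResult class, so each lookup yields the next state and the loop ends. The sketch below shows only the mock side; drain is a hypothetical stand-in for the polling loop and does not reproduce the logic in wait.py.

```python
from unittest.mock import MagicMock


def result_factory(*states):
    """Stand-in for the patched AsyncResult class.

    Each call returns the next result object, so the observed state
    advances by one step per lookup.
    """
    results = [MagicMock(state=state) for state in states]
    return MagicMock(name="AsyncResult", side_effect=results)


def drain(task_id, async_result, sleep):
    """Hypothetical polling loop, used only to exercise the mock."""
    seen = []
    while True:
        state = async_result(task_id).state
        seen.append(state)
        if state == "SUCCESS":
            return seen
        sleep(1)  # patched in tests, so no real waiting happens


factory = result_factory("PENDING", "STARTED", "SUCCESS")
fake_sleep = MagicMock(name="sleep")
seen_states = drain("task-1", factory, fake_sleep)
```

If the code under test looks up the result more often than the list provides, the mock raises StopIteration, which makes a non-terminating loop fail fast instead of hanging the test run.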
compose.Run.take_action() — compose.py:25
Patch osism.commands.compose.subprocess.call and osism.commands.compose.ensure_known_hosts_file.
- builds ssh ... <OPERATOR_USER>@<host> 'docker compose --project-directory=/opt/<environment> <arguments>' with UserKnownHostsFile=<KNOWN_HOSTS_PATH>; arguments are joined without separators (current behavior, pin it)
- ensure_known_hosts_file returns False → warning logged, SSH still attempted
sync.Facts / sync.CephKeys / sync.Sonic — sync.py:21 / sync.py:46 / sync.py:90
Patch osism.commands.sync.utils.check_task_lock_and_exit, osism.tasks.ansible.run / osism.tasks.conductor.sync_sonic (.delay) and osism.tasks.handle_task.
- Facts: ansible.run.delay("generic", "gather-facts", [], auto_release_time=3600), rc from handle_task
- CephKeys: manager/copy-ceph-keys playbook; --no-wait → handle_task(t, False)
- Sonic: conductor.sync_sonic.delay(device, show_diff); device-specific vs. generic log message; --no-diff → show_diff=False
sync.Versions._get_kolla_version_from_release() — sync.py:248
Patch requests.get (imported inside the method).
- response with docker_images: {kolla: "0.20250928.0"} → version returned, URL is <repo>/<release>/base.yml
- HTTP error (raise_for_status raises RequestException) → RuntimeError
- invalid YAML body → RuntimeError
- YAML without docker_images.kolla → RuntimeError "Kolla version not found"
sync.Versions._sync_kolla_versions() / take_action() — sync.py:311 / sync.py:288
Stub _extract_sbom_with_skopeo and _get_kolla_version_from_release; use tmp_path as --configuration-path.
- --release 9.4.0 → image = <sbom-image-base>:<version-from-release>; _get_kolla_version_from_release raising → rc 1
- version tag with a date part (0.20251128.0, also v-prefixed) → release SBOM image base; plain OpenStack version (2025.1) → registry.osism.cloud/kolla/sbom:2025.1
- explicit --sbom-image → used verbatim, no derivation
- non-dry-run with missing configuration path → rc 1
- _extract_sbom_with_skopeo raising RuntimeError / YAMLError → rc 1
- openstack_version from the SBOM overrides the CLI value in the rendered template
- --dry-run → rendered versions.yml printed, nothing written, rc 0
- happy path → file written to <config>/environments/kolla/versions.yml (directory auto-created), rc 0
sync.Versions._extract_sbom_with_skopeo() — sync.py:169
Patch osism.commands.sync.subprocess.run to a no-op and pre-build a fake OCI layout (the tmpdir comes from tempfile.TemporaryDirectory — patch osism.commands.sync.tempfile.TemporaryDirectory to return a tmp_path-backed context): index.json → manifest blob → tar layer containing images.yml.
- happy path → parsed images.yml dict returned
- skopeo exits non-zero (CalledProcessError) → RuntimeError "skopeo copy failed"
- skopeo binary missing (FileNotFoundError) → RuntimeError "skopeo not found"
- layer that is not a tarfile → skipped, next layer used
- no layer contains images.yml → RuntimeError "images.yml not found"
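The fake OCI layout for these cases can be generated with the standard library. The sketch below follows the general OCI image-layout convention (index.json pointing at a manifest blob under blobs/sha256/, which lists layer blobs). Which fields _extract_sbom_with_skopeo actually reads is an assumption, build_fake_oci_layout and blob_path are hypothetical helper names, and the images.yml content is a placeholder.

```python
import hashlib
import io
import json
import tarfile
import tempfile
from pathlib import Path


def blob_path(root: Path, digest: str) -> Path:
    """Map a 'sha256:<hex>' digest to its file in the layout."""
    algorithm, hex_digest = digest.split(":", 1)
    return root / "blobs" / algorithm / hex_digest


def write_blob(root: Path, data: bytes, media_type: str) -> dict:
    """Store data as a content-addressed blob and return its descriptor."""
    digest = "sha256:" + hashlib.sha256(data).hexdigest()
    path = blob_path(root, digest)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return {"mediaType": media_type, "digest": digest, "size": len(data)}


def build_fake_oci_layout(root: Path, images_yml: str) -> None:
    """Write index.json, a manifest blob and two layers below root."""
    # First layer is deliberately not a tar archive (skip case).
    junk = write_blob(root, b"not a tar archive", "application/octet-stream")
    # Second layer is a tar archive that contains images.yml.
    buffer = io.BytesIO()
    payload = images_yml.encode()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name="images.yml")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    layer = write_blob(
        root, buffer.getvalue(), "application/vnd.oci.image.layer.v1.tar"
    )
    manifest = {"schemaVersion": 2, "layers": [junk, layer]}
    manifest_desc = write_blob(
        root,
        json.dumps(manifest).encode(),
        "application/vnd.oci.image.manifest.v1+json",
    )
    index = {"schemaVersion": 2, "manifests": [manifest_desc]}
    (root / "index.json").write_text(json.dumps(index))


# Stand-alone check that the layout can be walked index → manifest → layer.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_fake_oci_layout(root, "example_key: example_value\n")
    index = json.loads((root / "index.json").read_text())
    manifest_file = blob_path(root, index["manifests"][0]["digest"])
    manifest = json.loads(manifest_file.read_text())
    layer_files = [blob_path(root, item["digest"]) for item in manifest["layers"]]
    layer_is_tar = [tarfile.is_tarfile(path) for path in layer_files]
    with tarfile.open(layer_files[1]) as tar:
        extracted = tar.extractfile("images.yml").read().decode()
```

Dropping the tar layer from the manifest gives the "images.yml not found" input, and keeping the non-tar layer first covers the skip case with the same builder.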
get.VersionsManager.take_action() — get.py:21
Patch docker.from_env (imported inside the method).
- three containers with org.opencontainers.image.version labels → table rows; ceph-ansible adds de.osism.release.ceph, kolla-ansible adds de.osism.release.openstack, osism-ansible has empty release
- client.containers.get raising docker.errors.NotFound for one name → that row skipped, others still printed
get.Tasks.take_action() — get.py:58
Patch celery.Celery so app.control.inspect() returns a mock with active() / scheduled() dicts.
- active and scheduled tasks rendered with worker, id, name, status ACTIVE/SCHEDULED, start time (datetime.fromtimestamp) and args
- empty inspect results → empty table, no exception
get.Facts.take_action() / get.States.take_action() — get.py:189 / get.py:277
Patch the lazy redis client (mocker.patch("osism.utils._init_redis", return_value=MagicMock()) or patch osism.commands.get.utils.redis).
- Facts: no cache entry → error "No facts found in cache"; specific fact present → single row; fact missing → error logged; full listing truncates the four ansible_ssh_host_key_*_public facts to 40 chars + ...
- States: facts with ansible_local.osism → one row per role with state/timestamp, bootstrap skipped; missing ansible_local/osism key or no cache entry → nothing printed, no exception
get.Hostvars / get.Hosts happy paths — get.py:127 / get.py:238 (gap)
- Hostvars with variable present → grid table row printed; without variable argument → one row per variable
- Hosts with hosts in inventory → psql table of hostnames (capsys)
log.Ansible / log.Container — log.py:31 / log.py:54
Patch osism.commands.log.subprocess.call and osism.commands.log.ensure_known_hosts_file.
- Ansible: parameters joined and appended to /usr/local/bin/ara
- Container: command contains docker logs <parameters> <container> and <OPERATOR_USER>@<host>; ensure_known_hosts_file failure → warning, call still made
log.File.take_action() — log.py:105
Patch osism.commands.log.get_hosts_from_group, osism.commands.log.resolve_host_with_fallback, osism.commands.log.subprocess.call, osism.commands.log.ensure_known_hosts_file.
- path traversal (../../etc/passwd) → error "must stay within /var/log", rc 1, no subprocess call; kolla/nova/nova-compute.log → /var/log/kolla/nova/nova-compute.log accepted
- tail command: -n <lines> always, -f only with --follow, path shell-quoted via shlex.quote
- group with multiple hosts → clush invoked with -w host1,host2; clush rc != 0 → rc passed through with error log
- group with exactly one host → host substituted, ssh path used
- non-group host → resolve_host_with_fallback result used in user@host; ssh rc != 0 → passthrough; success → 0
log.Opensearch.take_action() — log.py:203 (low priority)
Patch osism.commands.log.PromptSession (session.prompt side_effect=["<query>", "exit"]) and osism.commands.log.requests.post.
- exit immediately breaks the loop
- response with hits → Payload printed per hit; --verbose prints timestamp | Hostname | [programname |] Payload, falling back to @timestamp when timestamp is absent
- response without hits → raw JSON printed
console module helpers — console.py:18 / console.py:37 / console.py:65 / console.py:97 / console.py:128
- resolve_hostname_to_ip (console.py:18): patch socket.gethostbyname; success → IP; socket.gaierror → None
- get_primary_ipv4_from_netbox (console.py:37): patch osism.commands.console.utils.nb; nb falsy → None; device with primary_ip4.address = "10.0.0.1/24" → "10.0.0.1"; device None / no primary_ip4 → None; query raising → None with warning
- resolve_host_with_fallback (console.py:65): DNS hit → IP; DNS miss + Netbox hit → Netbox IP; both miss → original hostname returned with warning
- get_hosts_from_group (console.py:97): patch osism.commands.console.subprocess.check_output, get_inventory_path, get_hosts_from_inventory; valid inventory → sorted host list; any exception (e.g. CalledProcessError) → []
- select_host_from_list (console.py:128): patch osism.commands.console.prompt; valid number → that host; q/quit/exit → None; non-numeric then valid input → retries; out-of-range then valid → retries
console.Run.take_action() — console.py:172
Patch osism.commands.console.subprocess.call, ensure_known_hosts_file, get_hosts_from_group, resolve_host_with_fallback, select_host_from_list, prompt.
- host syntax routing: "ctl001/" → container_prompt loop; "ctl001/rabbitmq" → container; ".ctl001" → /run-ansible-console.sh ctl001; ":ctl" → clush with -g ctl and -l <OPERATOR_USER>
- ssh type: group resolving to one host → that host used; multiple hosts → select_host_from_list; selection cancelled (None) → returns without SSH call
- ssh call uses resolve_host_with_fallback result and UserKnownHostsFile=<KNOWN_HOSTS_PATH>
- container type: docker exec -it <container> bash with both parts shlex.quoted, RequestTTY=force in options, host part resolved
- container_prompt: command then exit → one SSH call with docker <quoted command>
Mocking hints
- Instantiate cliff commands as in the existing tests: cmd = module.Class(MagicMock(), MagicMock()), then parsed_args = cmd.get_parser("test").parse_args([...]); this exercises the real argparse defaults (tests/unit/commands/test_wait.py shows the pattern).
- Celery task modules (osism.tasks.ansible, ceph, kolla, kubernetes, conductor) and osism.tasks.handle_task are imported inside method bodies; patch them at their canonical module path (e.g. mocker.patch("osism.tasks.ansible.run")), not on the command module. Assert on .si(...) / .delay(...) call args; no broker is needed since apply_async/delay are mocked.
- osism.data.playbooks.MAP_ROLE2ENVIRONMENT / MAP_ROLE2RUNTIME are lazy module attributes loaded from /interface/playbooks via module-level __getattr__; set them with monkeypatch.setattr(playbooks, "MAP_ROLE2ENVIRONMENT", {...}, raising=False) and call osism.data.playbooks._reset_caches() in teardown (an autouse fixture keeps this tidy).
- The lazy redis attribute on osism.utils: patch osism.utils._init_redis to return a MagicMock (as test_wait.py already does) before anything touches utils.redis.
- apply.take_action stores utils._last_ansible_facts_check as an attribute on the osism.utils module; delete/reset it between tests (monkeypatch.delattr(osism.utils, "_last_ansible_facts_check", raising=False)).
- wait.take_action loops until states converge: give the patched AsyncResult a side_effect list whose states end in SUCCESS, and always patch osism.commands.wait.time.sleep.
- check.py guards the docker import with try/except; patch osism.commands.check.DOCKER_AVAILABLE and osism.commands.check.docker directly. The pure helpers (get_file_info, collect_file_info, parse_stat_output, _compare_file_info) need only tmp_path.
- sync.Versions: requests and jinja2 are imported inside methods; patch requests.get. For the template-render assertions just inspect the written file/printed output instead of mocking jinja2.
- log.py imports get_hosts_from_group and resolve_host_with_fallback from osism.commands.console at module level; patch them as osism.commands.log.<name>.
- Interactive prompts: patch osism.commands.console.prompt / osism.commands.log.PromptSession with side_effect sequences ending in "exit"/"q" so loops terminate.
- Use capsys for the tabulate/print-based assertions (apply role table, get tables, check script/table formats).
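Why the canonical module path is the right patch target for function-local imports can be shown without osism or Celery. In the sketch below, fake_tasks is a hypothetical stand-in for a task module and schedule is a hypothetical stand-in for a command method; neither exists in the code base.

```python
import sys
import types
from unittest.mock import MagicMock, patch

# Hypothetical stand-in for a task module such as osism.tasks.ansible.
fake_tasks = types.ModuleType("fake_tasks")
fake_tasks.run = MagicMock(name="real_run")
sys.modules["fake_tasks"] = fake_tasks


def schedule(role):
    """Hypothetical command method with a function-local import."""
    # The name is looked up on the module at call time, so a patch on
    # "fake_tasks.run" is what this function sees.
    from fake_tasks import run

    return run.si("generic", role, [])


with patch("fake_tasks.run") as run_mock:
    run_mock.si.return_value = "signature"
    result = schedule("ntp")
    # No broker is involved: only the recorded call arguments are checked.
    run_mock.si.assert_called_once_with("generic", "ntp", [])

# After the patch exits, the original object is restored and untouched.
real_run_was_called = fake_tasks.run.si.called
```

With a module-level import the command module would hold its own reference and the patch would have to target that name instead, which is the situation described for log.py above.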
Definition of Done
Dependencies
Background
Follow-up to #2192 (foundation) and PR #2193 (pytest + Zuul infrastructure). Part of Tier 8 (#2199). This issue covers the workflow-oriented CLI command modules under
osism/commands/:apply.py(501 LOC),check.py(778 LOC),validate.py(217 LOC),wait.py(159 LOC),compose.py(45 LOC),sync.py(406 LOC),get.py(302 LOC),log.py(237 LOC) andconsole.py(296 LOC) — together ~2,940 LOC. They are cliffCommandclasses that schedule Celery tasks (apply,validate,sync), inspect Celery state (wait,get tasks), shell out via SSH/clush/docker (compose,log,console) or compare filesystem metadata (check).Scope
Add
tests/unit/commands/test_apply.py,test_check.py,test_compose.py,test_sync.py,test_log.pyandtest_console.py; extend the existingtest_validate.py,test_wait.pyandtest_get.py.Already covered (do not duplicate):
test_validate.py:Run._handle_taskreturns 1 onTimeoutErrorwhile waiting.test_wait.py: exit-code contract of the--livepath (timeout with one/two tasks, task-rc passthrough, success → 0).test_get.py:Hosts.take_action(inventory load failure → 1, empty inventory → success) andHostvars.take_action(inventory query failure → 1, missing variable → success).This is a large group: prioritize the pure-logic / high-value functions listed below; interactive loops (
log.Opensearch,consolecontainer_prompt) are lowest priority. Like the Tier 3 issues, this issue may be split further during implementation (suggested cut:apply+check/sync+get/compose+log+console+ thevalidate/waitextensions).Test targets
Run._prepare_task()—apply.py:269Patch
osism.tasks.ansible.run,osism.tasks.ceph.run,osism.tasks.kolla.run,osism.tasks.kubernetes.run(the.siattribute is what gets called; imports happen inside the method body, so patch the canonical task module paths).MAP_ROLE2ENVIRONMENT/MAP_ROLE2RUNTIMEare lazy module attributes ofosism.data.playbooks— see mocking hints.role="ceph"→ environment forced to"ceph",ceph.run.si("ceph", "ceph", arguments, auto_release_time=task_timeout)role="ceph-osds", environment"ceph"→ceph-prefix stripped:ceph.run.si("ceph", "osds", ...)sub="zone-a"with ceph/kubernetes/kolla environments → environment becomes"ceph.zone-a"etc."kubernetes"→kubernetes.run.si(...)role="loadbalancer-ng"→ returns chain ofkolla.run.si(env, "loadbalancer-ng", ...)piped into agroupoverenums.LOADBALANCER_PLAYBOOKS"kolla",role="kolla-keystone"→kolla-prefix stripped;kolla.run.sicalled with["-e kolla_action=deploy"] + argumentsrole="mariadb-ng"/"rabbitmq-ng"→ argument is-e kolla_action_ng=<action>insteadMAP_ROLE2RUNTIME["osism-ansible"](and not"common") → routed toansible.run.siwith the original arguments (nokolla_action)Noneand role not inMAP_ROLE2ENVIRONMENT→ bare-except fallback to"custom", info log "Trying to run play ...",ansible.run.si("custom", ...)overwrite="other"in the default branch → environment replaced by"other"Run._handle_collection()/Run.handle_collection()—apply.py:135/apply.py:218Patch the task modules as above plus
osism.tasks.ansible.noop; patchcelery.chain/celery.groupor assert on the returned signature structure. Drive via small hand-builtRoletrees (osism.data.enums.Role) instead of the realMAP_ROLE2ROLE.Role→TypeErrorraised and error loggedRoles without dependencies → one prepared task per role, wrapped in agroupRolewith nested dependencies →chain(parent_task, child_group)built recursivelydry_run=True→ansible.noop.si()used instead of_prepare_taskshow_tree=True→ no tasks created, returnsNone, tree logged ("A [0] ..."/ indentation grows with counter)handle_collection:apply_async()called on the result when notshow_tree; not called forshow_tree; distinct log messages for dry-run / show-tree / normal mode (patchosism.data.enums.MAP_ROLE2ROLEwithdict-patch for the collection lookup)Run.handle_role()/Run.handle_loadbalancer_task()—apply.py:371/apply.py:106Patch
osism.tasks.handle_task(imported inside the method) and stub_prepare_taskto return aMagicMockwhoseapply_async()yields either a plain result or acelery.result.GroupResultinstance.handle_task(task, wait, format, timeout)rc passed throughGroupResult→handle_loadbalancer_taskpath takenhandle_loadbalancer_taskwithwait=True: rc comes fromhandle_task(t.parent, ...),t.get()calledwait=False:t.parent.get()additionally called (garbage-collector workaround), children logged forformat="log"Run.take_action()—apply.py:417Patch
osism.commands.apply.utils.check_task_lock_and_exit,osism.commands.apply.utils.check_ansible_facts, and thehandle_role/handle_collectionmethods on the command instance.check_task_lock_and_exitalways invoked firstMAP_ROLE2ENVIRONMENTprinted (capsys), returns 0utils._last_ansible_facts_check; skipped for roles"gather-facts"/"facts", for--show-tree, and when the last check is recent (reset the_last_ansible_facts_checkattribute onosism.utilsbetween tests)role="a//b"→handle_rolecalled once per segment--retry 2withhandle_rolealways returning 1 → called 3 times, rc 1; success on 2nd attempt → 2 calls, rc 0handle_collectionreturnsNone, sorc != 0is truthy and the loop breaks withtake_actionreturningNone— pin this current behavior in a test (candidate for a follow-up fix)get_file_info()/collect_file_info()—check.py:31/check.py:59Pure filesystem helpers — use
tmp_path, no mocking needed for the happy paths.inode,mtime,size,mode,uid,gid,is_link=Falseand an md5hashhash is Nonebuiltins.opento raiseIOError) →hash is None, rest populated{"error": ...}collect_file_info: directory tree with.git/venv/__pycache__subdirs → skipped; symlinks skipped; both files and directories included with relative pathsmax_files=2on a larger tree → scan stops, warning loggedparse_stat_output()—check.py:88Pure parser, no mocks.
FILE:/INODE:/SIZE:/MTIME:/HASH:blocks → typed dict (int,int,float)HASH:NONE→NoneERROR:lines captured under"error"FILE:line ignoredMount._compare_file_info()—check.py:338Pure comparison, no mocks.
inode_mismatcheswithlocal_inode/fresh_inodecheck_content=True→content_mismatches; same input withcheck_content=False→ emptymissing_in_local; only in local info →missing_in_fresh"error"skipped entirelyNone/0) on either side → no mismatch recordedMount._get_container_id()/Mount._get_mount_source()—check.py:180/check.py:222Patch
builtins.openwithmock_openpayloads andos.uname.dockerwith a 64-char id → first 12 chars returned; 12-char id returned as-is/proc/self/mountinfocontaining/docker/containers/<id>/→ id truncated to 12None_get_mount_source: mountinfo line whose 5th field equals the mount path → source after the-separator returned (must start with/); no separator / non-absolute source / no matching line /IOError→NoneMount.take_action()—check.py:391Focus on the early-exit guards and final rc; patch
osism.commands.check.DOCKER_AVAILABLE,osism.commands.check.docker,os.path.exists, and the instance helpers (_get_container_id,_get_volume_mount_info,_get_mount_source,_run_fresh_container) pluscollect_file_info.format="script"printsFAILED: ...)DOCKER_AVAILABLE=False→ 1docker.from_envraises → 1--host-path, no Docker mount info, no mountinfo) → 1--volume-name→ override used_run_fresh_containerraises → 1scriptprintsPASSED); inode mismatches → 1 withINODE_MISMATCHES:<n>in script formatInode.take_action()—check.py:662tmp_path→ rows with type/inode/size; symlinks and missing files skippedenvironments/*andinventory/*(patchrandom.samplefor determinism); up to 2 entries per subdirectory plus direct filestablevia capsys,log,script); returns 0validate.Run.take_action()—validate.py:74(gap)Patch
osism.tasks.ansible.run,osism.tasks.ceph.run,osism.tasks.kolla.run(.delay),osism.commands.validate.utils.check_task_lock_and_exit, and stub_handle_task.keystone-config) →kolla.run.delay("kolla", "keystone", arguments)with-e kolla_action=config_validateappended to arguments--environment customhonored for ceph/kolla runtimes (no default applied)ceph-config) →ceph.run.delay("ceph", "validate", ...)(playbook rewritten viaVALIDATE_PLAYBOOKS)playbookkey (e.g.ntp) →ansible.run.delay("generic", "validate-ntp", ...)check_task_lock_and_exitinvokedvalidate.Run._handle_task()—validate.py:55(gap; timeout case exists)wait=True,fetch_task_outputreturns rc → rc passed throughwait=False,format="log"→ info log, returns 0wait=False,format="script"→ task id printed, returns 0validate.Scs.take_action()—validate.py:159Patch
osism.tasks.openstack.setup_cloud_environment,osism.tasks.openstack.cleanup_cloud_environment(imported inside the method) andosism.commands.validate.subprocess.run.setup_cloud_environmentreturnssuccess=False→ returns 1, no subprocess call, no cleanup-s <cloud>,-a os_cloud=<cloud>,-V <version>, ends withscs-compatible-iaas.yaml;OS_CLIENT_CONFIG_FILE=/tmp/clouds.yamlin env; returncode passed through--verbose/--debug/--tests/--output/--sectionseach append the corresponding flagsubprocess.runraisesFileNotFoundError→ 1; generic exception → 1cleanup_cloud_environment(temp_files, original_cwd)called in all post-setup paths (finally)wait.Run.get_all_task_ids()/take_action()—wait.py:50/wait.py:62(gap;--liveexit codes exist)Patch
celery.Celery,celery.result.AsyncResult,osism.commands.wait.time.sleep, andosism.utils._init_redis. MakeAsyncResultreturn objects whosestatechanges between iterations (viaside_effect) so re-queue loops terminate.get_all_task_ids: merges ids fromi.scheduled()andi.active(), returns them sortedget_all_task_ids, refresh mode enabledPENDING+query_taskfinds nothing → "unavailable" logged /<id> = UNAVAILABLEprinted, task not re-queuedPENDING+ task known to a worker → re-queued, thenSUCCESSon the next pass terminatesSUCCESSwith--output→result.get()printedSTARTEDwithout--live→ re-queued, finishes when state flips toSUCCESS--refresh 1→ after the queue drains,get_all_task_idsconsulted one more timeformat="script"prints<id> = <STATE>lines instead of log outputcompose.Run.take_action()—compose.py:25Patch
osism.commands.compose.subprocess.callandosism.commands.compose.ensure_known_hosts_file.ssh ... <OPERATOR_USER>@<host> 'docker compose --project-directory=/opt/<environment> <arguments>'withUserKnownHostsFile=<KNOWN_HOSTS_PATH>; arguments are joined without separators (current behavior — pin it)ensure_known_hosts_filereturnsFalse→ warning logged, SSH still attemptedsync.Facts/sync.CephKeys/sync.Sonic—sync.py:21/sync.py:46/sync.py:90Patch
osism.commands.sync.utils.check_task_lock_and_exit,osism.tasks.ansible.run/osism.tasks.conductor.sync_sonic(.delay) andosism.tasks.handle_task.Facts:ansible.run.delay("generic", "gather-facts", [], auto_release_time=3600), rc fromhandle_taskCephKeys:manager/copy-ceph-keysplaybook;--no-wait→handle_task(t, False)Sonic:conductor.sync_sonic.delay(device, show_diff); device-specific vs. generic log message;--no-diff→show_diff=Falsesync.Versions._get_kolla_version_from_release()—sync.py:248Patch
requests.get (imported inside the method).
docker_images: {kolla: "0.20250928.0"} → version returned, URL is <repo>/<release>/base.yml
HTTP error (raise_for_status raises RequestException) → RuntimeError
… → RuntimeError
Missing docker_images.kolla → RuntimeError "Kolla version not found"
sync.Versions._sync_kolla_versions() / take_action() — sync.py:311 / sync.py:288
Stub
_extract_sbom_with_skopeo and _get_kolla_version_from_release; use tmp_path as --configuration-path.
--release 9.4.0 → image=<sbom-image-base>:<version-from-release>; _get_kolla_version_from_release raising → rc 1
Version in release format (0.20251128.0, also v-prefixed) → release SBOM image base; plain OpenStack version (2025.1) → registry.osism.cloud/kolla/sbom:2025.1
--sbom-image → used verbatim, no derivation
_extract_sbom_with_skopeo raising RuntimeError / YAMLError → rc 1
openstack_version from the SBOM overrides the CLI value in the rendered template
--dry-run → rendered versions.yml printed, nothing written, rc 0
Otherwise → written to <config>/environments/kolla/versions.yml (directory auto-created), rc 0
sync.Versions._extract_sbom_with_skopeo() — sync.py:169
Patch
osism.commands.sync.subprocess.run to a no-op and pre-build a fake OCI layout (the tmpdir comes from tempfile.TemporaryDirectory; patch osism.commands.sync.tempfile.TemporaryDirectory to return a tmp_path-backed context): index.json → manifest blob → tar layer containing images.yml.
Happy path → images.yml dict returned
skopeo exits non-zero (CalledProcessError) → RuntimeError "skopeo copy failed"
skopeo binary missing (FileNotFoundError) → RuntimeError "skopeo not found"
Missing images.yml → RuntimeError "images.yml not found"
get.VersionsManager.take_action() — get.py:21
Patch
docker.from_env (imported inside the method).
org.opencontainers.image.version labels → table rows; ceph-ansible adds de.osism.release.ceph, kolla-ansible adds de.osism.release.openstack, osism-ansible has empty release
client.containers.get raising docker.errors.NotFound for one name → that row skipped, others still printed
get.Tasks.take_action() — get.py:58
Patch
celery.Celery so app.control.inspect() returns a mock with active() / scheduled() dicts.
Table rows show ACTIVE / SCHEDULED, start time (datetime.fromtimestamp) and args
get.Facts.take_action() / get.States.take_action() — get.py:189 / get.py:277
Patch the lazy redis client (
mocker.patch("osism.utils._init_redis", return_value=MagicMock()) or patch osism.commands.get.utils.redis).
Facts: no cache entry → error "No facts found in cache"; specific fact present → single row; fact missing → error logged; full listing truncates the four ansible_ssh_host_key_*_public facts to 40 chars + ...
States: facts with ansible_local.osism → one row per role with state/timestamp, bootstrap skipped; missing ansible_local / osism key or no cache entry → nothing printed, no exception
get.Hostvars / get.Hosts happy paths — get.py:127 / get.py:238 (gap)
Hostvars with variable present → grid table row printed; without variable argument → one row per variable
Hosts with hosts in inventory → psql table of hostnames (capsys)
log.Ansible / log.Container — log.py:31 / log.py:54
Patch
osism.commands.log.subprocess.call and osism.commands.log.ensure_known_hosts_file.
Ansible: parameters joined and appended to /usr/local/bin/ara
Container: command contains docker logs <parameters> <container> and <OPERATOR_USER>@<host>; ensure_known_hosts_file failure → warning, call still made
log.File.take_action() — log.py:105
Patch
osism.commands.log.get_hosts_from_group, osism.commands.log.resolve_host_with_fallback, osism.commands.log.subprocess.call, osism.commands.log.ensure_known_hosts_file.
Path traversal (../../etc/passwd) → error "must stay within /var/log", rc 1, no subprocess call; kolla/nova/nova-compute.log → /var/log/kolla/nova/nova-compute.log accepted
-n <lines> always, -f only with --follow, path shell-quoted via shlex.quote
Host group → clush with -w host1,host2; clush rc != 0 → rc passed through with error log
Single host → resolve_host_with_fallback result used in user@host; ssh rc != 0 → passthrough; success → 0
log.Opensearch.take_action() — log.py:203 (low priority)
Patch
osism.commands.log.PromptSession (session.prompt side_effect=["<query>", "exit"]) and osism.commands.log.requests.post.
exit immediately breaks the loop
Response with hits → Payload printed per hit; --verbose prints timestamp | Hostname | [programname |] Payload, falling back to @timestamp when timestamp is absent
Response without hits → raw JSON printed
console module helpers — console.py:18 / console.py:37 / console.py:65 / console.py:97 / console.py:128
resolve_hostname_to_ip (console.py:18): patch socket.gethostbyname; success → IP; socket.gaierror → None
get_primary_ipv4_from_netbox (console.py:37): patch osism.commands.console.utils.nb; nb falsy → None; device with primary_ip4.address = "10.0.0.1/24" → "10.0.0.1"; device None / no primary_ip4 → None; query raising → None with warning
resolve_host_with_fallback (console.py:65): DNS hit → IP; DNS miss + Netbox hit → Netbox IP; both miss → original hostname returned with warning
get_hosts_from_group (console.py:97): patch osism.commands.console.subprocess.check_output, get_inventory_path, get_hosts_from_inventory; valid inventory → sorted host list; any exception (e.g. CalledProcessError) → []
select_host_from_list (console.py:128): patch osism.commands.console.prompt; valid number → that host; q / quit / exit → None; non-numeric then valid input → retries; out-of-range then valid → retries
console.Run.take_action() — console.py:172
Patch
osism.commands.console.subprocess.call, ensure_known_hosts_file, get_hosts_from_group, resolve_host_with_fallback, select_host_from_list, prompt.
"ctl001/" → container_prompt loop; "ctl001/rabbitmq" → container; ".ctl001" → /run-ansible-console.sh ctl001; ":ctl" → clush with -g ctl and -l <OPERATOR_USER>
Host selection via select_host_from_list; selection cancelled (None) → returns without SSH call
SSH call uses resolve_host_with_fallback result and UserKnownHostsFile=<KNOWN_HOSTS_PATH>
Container → docker exec -it <container> bash with both parts shlex.quoted, RequestTTY=force in options, host part resolved
container_prompt: command followed by exit → one SSH call with docker <quoted command>
Mocking hints
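Several targets above (the wait re-queue loop, the interactive prompts) depend on a side_effect sequence that makes a polling loop terminate. A minimal sketch of that pattern follows. It assumes nothing about the real wait.take_action: wait_for_task, make_result and run_example are hypothetical stand-ins written only for this illustration, and only unittest.mock from the standard library is used.

```python
import time
from unittest.mock import MagicMock, patch


def wait_for_task(task_id, async_result_cls, poll_interval=1.0):
    """Hypothetical stand-in for a polling loop; not the osism implementation."""
    polls = 0
    while True:
        # A fresh result object is created on every pass, as a re-queue loop would do.
        result = async_result_cls(task_id)
        polls += 1
        if result.state == "SUCCESS":
            return polls
        time.sleep(poll_interval)


def make_result(state):
    """Build a fake result object exposing only the .state attribute."""
    result = MagicMock()
    result.state = state
    return result


def run_example():
    # Each call to the fake class returns the next object from the list,
    # so the state "changes" between iterations and ends in SUCCESS.
    fake_async_result = MagicMock(
        side_effect=[
            make_result("PENDING"),
            make_result("STARTED"),
            make_result("SUCCESS"),
        ]
    )
    # Patch sleep so the loop does not actually wait between polls.
    with patch("time.sleep") as fake_sleep:
        polls = wait_for_task("abc", fake_async_result)
    return polls, fake_sleep.call_count
```

In the real tests the same kind of list would be handed to the patched celery.result.AsyncResult, with osism.commands.wait.time.sleep patched instead of time.sleep. If the list runs out before a terminal state is reached, the mock raises StopIteration, which surfaces a non-terminating loop as a test failure rather than a hang.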
Instantiate commands as cmd = module.Class(MagicMock(), MagicMock()), then parsed_args = cmd.get_parser("test").parse_args([...]); this exercises the real argparse defaults (tests/unit/commands/test_wait.py shows the pattern).
Task modules (osism.tasks.ansible, ceph, kolla, kubernetes, conductor) and osism.tasks.handle_task are imported inside method bodies; patch them at their canonical module path (e.g. mocker.patch("osism.tasks.ansible.run")), not on the command module. Assert on .si(...) / .delay(...) call args; no broker is needed since apply_async / delay are mocked.
osism.data.playbooks.MAP_ROLE2ENVIRONMENT / MAP_ROLE2RUNTIME are lazy module attributes loaded from /interface/playbooks via module-level __getattr__; set them with monkeypatch.setattr(playbooks, "MAP_ROLE2ENVIRONMENT", {...}, raising=False) and call osism.data.playbooks._reset_caches() in teardown (an autouse fixture keeps this tidy).
osism.utils: patch osism.utils._init_redis to return a MagicMock (as test_wait.py already does) before anything touches utils.redis.
apply.take_action stores utils._last_ansible_facts_check as an attribute on the osism.utils module; delete/reset it between tests (monkeypatch.delattr(osism.utils, "_last_ansible_facts_check", raising=False)).
wait.take_action loops until states converge: give the patched AsyncResult a side_effect list whose states end in SUCCESS, and always patch osism.commands.wait.time.sleep.
check.py guards the docker import with try/except; patch osism.commands.check.DOCKER_AVAILABLE and osism.commands.check.docker directly.
The pure helpers (get_file_info, collect_file_info, parse_stat_output, _compare_file_info) need only tmp_path.
sync.Versions: requests and jinja2 are imported inside methods; patch requests.get, and for the template-render assertions just inspect the written file/printed output instead of mocking jinja2.
log.py imports get_hosts_from_group and resolve_host_with_fallback from osism.commands.console at module level; patch them as osism.commands.log.<name>.
Patch osism.commands.console.prompt / osism.commands.log.PromptSession with side_effect sequences ending in "exit" / "q" so loops terminate.
Use capsys for printed output (apply role table, get tables, check script/table formats).
Definition of Done
tests/unit/commands/test_apply.py, test_check.py, test_compose.py, test_sync.py, test_log.py, test_console.py created
tests/unit/commands/test_validate.py, test_wait.py, test_get.py extended (existing tests kept unchanged)
pytest --cov shows ≥ 80 % for each module in scope (≥ 95 % for the pure helpers in check.py and the module-level functions in console.py)
pipenv run pytest tests/unit/commands/ passes locally
flake8, mypy, python-black remain green
python-osism-unit-tests passes
Dependencies