polling: reset bad hosts on manual poll #7055

Draft
oliver-sanders wants to merge 1 commit into cylc:8.6.x from oliver-sanders:7029

@oliver-sanders
Member

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • Changelog entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

Comment on lines +997 to +1008
if continue_if_no_good_hosts:
    # no hosts available for this platform
    # -> reset the bad hosts and try again
    self.task_events_mgr.reset_bad_hosts()
    host = get_host_from_platform(
        platform, bad_hosts=self.bad_hosts
    )
else:
    ctx.err = f'No available hosts for {platform["name"]}'
    LOG.debug(ctx)
    callback_255(ctx, itasks)
continue
Member

Both branches here continue to the next iteration of the for loop. host is not used in the first branch. Did you mean to remove the else condition on the line below?
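
To illustrate the control flow in question, here is a minimal, self-contained sketch. `pick_host`, `poll_buggy`, and `poll_fixed` are hypothetical stand-ins, not the Cylc functions, and the bad hosts are modelled as a plain set. The only point is that a host assigned just before an unconditional `continue` is never used.

```python
def pick_host(hosts, bad_hosts):
    """Return the first host not marked bad, or None (hypothetical helper)."""
    return next((h for h in hosts if h not in bad_hosts), None)


def poll_buggy(platforms, bad_hosts, reset_if_no_good_hosts):
    """Mirror the reviewed shape: both branches fall into `continue`."""
    polled = []
    for name, hosts in platforms.items():
        host = pick_host(hosts, bad_hosts)
        if host is None:
            if reset_if_no_good_hosts:
                bad_hosts.clear()
                host = pick_host(hosts, bad_hosts)  # assigned here...
            else:
                pass  # the reviewed code reports the error here
            continue  # ...but the platform is skipped before it is used
        polled.append((name, host))
    return polled


def poll_fixed(platforms, bad_hosts, reset_if_no_good_hosts):
    """Only skip the platform when no host could be recovered."""
    polled = []
    for name, hosts in platforms.items():
        host = pick_host(hosts, bad_hosts)
        if host is None and reset_if_no_good_hosts:
            bad_hosts.clear()
            host = pick_host(hosts, bad_hosts)
        if host is None:
            continue
        polled.append((name, host))
    return polled
```

In this toy model, `poll_buggy` never polls a platform whose hosts were all bad, even after the reset, whereas `poll_fixed` polls it on the recovered host.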

Member

Mildly concerned that, if Ronnie is right (I think he is), the test isn't failing...

Member Author

Ronnie is very much correct.

The test was passing because of the host = 'localhost' assignment above, which provided an unwelcome default pathway through the code.

My bad, that was some AI-level BS.
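
A rough sketch of how a default value can mask this kind of failure is shown here. All names are hypothetical and the bad hosts are modelled as a plain set; this is not the Cylc code, just the general pattern of a pre-assigned default surviving a failed selection.

```python
def pick_host(hosts, bad_hosts):
    """Return the first host not marked bad, or None (hypothetical helper)."""
    return next((h for h in hosts if h not in bad_hosts), None)


def host_with_default(hosts, bad_hosts):
    """A default assigned up front survives a failed selection."""
    host = 'localhost'  # default set before any selection happens
    selected = pick_host(hosts, bad_hosts)
    if selected is not None:
        host = selected
    return host


def host_without_default(hosts, bad_hosts):
    """Without the default, a failed selection is visible to the caller."""
    return pick_host(hosts, bad_hosts)


# With every host marked bad, a weak check ("did we get some host?")
# passes in the first case and fails in the second.
masked = host_with_default(['abc'], {'abc'})
exposed = host_without_default(['abc'], {'abc'})
```

A test that only checks that some poll command was issued would pass against the first function, which is why asserting on the specific host matters.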

@oliver-sanders oliver-sanders modified the milestones: 8.6.x, 8.6.1, 8.6.2 Nov 24, 2025
@MetRonnie
Member

Converting to draft pending resolution of #7055 (comment)

@MetRonnie MetRonnie marked this pull request as draft December 16, 2025 15:26
@MetRonnie MetRonnie linked an issue Dec 17, 2025 that may be closed by this pull request
@oliver-sanders oliver-sanders modified the milestones: 8.6.2, 8.6.x Dec 18, 2025
* Closes cylc#7029
* This reduces the pressure on cylc#7001 (if a platform runs out of hosts,
  polling is not attempted until the bad hosts are next reset) by
  making it easier for operators to reset bad hosts and recover their
  workflow in the event of platform outages.
cmd.append("--")
cmd.append(get_remote_workflow_run_job_dir(self.workflow))
job_log_dirs = []
host = 'localhost'
Member Author

Have removed the localhost default.

platform, bad_hosts=self.bad_hosts
)

if not host and continue_if_no_good_hosts:
Member Author

Have flattened the if/try/except/if/else pathway.
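
For readers following along, one way such a flattening could look is sketched here. This is an assumption-laden illustration, not the code in this PR: `NoHostsError`, `get_host`, `get_host_or_none`, and `select_for_poll` are defined locally as stand-ins, and `bad_hosts.clear()` stands in for whatever resets the bad hosts.

```python
class NoHostsError(Exception):
    """Stand-in error: every host on the platform is marked bad."""


def get_host(hosts, bad_hosts):
    """Stand-in selector that raises when nothing is usable."""
    for host in hosts:
        if host not in bad_hosts:
            return host
    raise NoHostsError(hosts)


def get_host_or_none(hosts, bad_hosts):
    """Turn the exception into None so callers need no try/except."""
    try:
        return get_host(hosts, bad_hosts)
    except NoHostsError:
        return None


def select_for_poll(hosts, bad_hosts, continue_if_no_good_hosts):
    """Single, flat decision: retry once after a reset if allowed."""
    host = get_host_or_none(hosts, bad_hosts)
    if not host and continue_if_no_good_hosts:
        bad_hosts.clear()  # stand-in for resetting the bad hosts
        host = get_host_or_none(hosts, bad_hosts)
    return host  # None means the caller should report "no available hosts"
```

Folding the exception into a `None` return keeps the retry condition on one line, at the cost of the caller having to check for `None` explicitly.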

Comment on lines +992 to +994
host = get_host_from_platform(
    platform, bad_hosts=self.bad_hosts
)
Member Author

Have pushed localhost submissions through this interface to flatten the code.

This is a change in code path, though localhost can't get into the bad hosts, so it has no functional effect. Happy to change if desired.

Comment on lines +324 to +325
assert poll_ctx.cmd_key == TaskJobManager.JOBS_POLL
assert poll_ctx.host == 'abc'
Member Author

Test hardened.
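
As a hedged illustration of what asserting on the host buys, here is a toy test in the same spirit. `manual_poll`, the `JOBS_POLL` value, and the captured context object are all hypothetical stand-ins rather than the Cylc test fixtures; the sketch assumes a platform with at least one configured host.

```python
from types import SimpleNamespace

JOBS_POLL = 'jobs-poll'  # placeholder value, not necessarily Cylc's


def manual_poll(hosts, bad_hosts, run_command):
    """Toy poll: reset bad hosts if none are usable, then queue a command."""
    usable = [h for h in hosts if h not in bad_hosts]
    if not usable:
        bad_hosts.clear()  # stand-in for resetting the bad hosts
        usable = list(hosts)
    run_command(SimpleNamespace(cmd_key=JOBS_POLL, host=usable[0]))


def test_manual_poll_resets_bad_hosts():
    captured = []
    bad_hosts = {'abc'}
    manual_poll(['abc'], bad_hosts, captured.append)
    (poll_ctx,) = captured  # exactly one command was queued
    assert poll_ctx.cmd_key == JOBS_POLL
    # Asserting on the host rules out a fallback such as 'localhost'.
    assert poll_ctx.host == 'abc'
    assert bad_hosts == set()


test_manual_poll_resets_bad_hosts()
```

Checking only `cmd_key` would also pass if the command had been routed to a default host, which is the gap the host assertion closes.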

Labels

could be better: Not exactly a bug, but not ideal.

Development

Successfully merging this pull request may close these issues.

polling: reset bad hosts on *manual* poll

3 participants