Feature "async" - "slave" controller getting synchronized with a hardware #478

Closed
Nibanovic wants to merge 11 commits into ros-controls:master from
b-robotized-forks:feat/async-slave-controller

@Nibanovic Nibanovic commented Jan 20, 2026

This PR extends the basic functionality of the async slave hardware interface introduced in #473, and has a parallel ros2_control PR: ros-controls/ros2_control#2971

The basic functionality ensures that an async hardware interface with a blocking read() schedules the control cycle: the hardware interface waits for a heartbeat signal from the robot (usually a recv on a UDP socket) before continuing to write().

This PR implements a solution for the issue described in the original PR discussion:

For example, if we're running update() for a JTC that is targeting the async_slave franka and we have another robot running in sync with the controller manager, the update() of the controller is on a different clock than the read/write.

What are the knock-on effects of this and how do we address it? How do we establish a test case where the problems the controller causes are clear? This is what I'm working on currently.

What first comes to mind is to run the controller as an async controller. Then we configure the controller to target a specific hardware interface. Then, from the hardware interface, at the end of read() we signal the async controller to execute its update(). This would be the simplest approach to implement.

Setup

Main idea:

So, we have an async slave hardware interface and one or more async controllers. We expect:

  • regular controllers/hardware interfaces to run as usual
  • the async slave hardware interface to block in read()
  • async controllers to register as waiting for the async slave hardware read() to finish
  • after read() completes, the async controllers to execute update() and signal completion
  • the hardware to track all update_finished signals and, once all have arrived, to proceed with write()
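
The handshake in the list above can be sketched roughly as follows. This is a minimal illustration using only the C++ standard library, not the code in this PR: `CycleMonitor`, `run_cycles` and all method names are hypothetical, and a plain counter increment stands in for the controller update().

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical monitor shared by one hardware thread and N controller threads.
class CycleMonitor
{
public:
  explicit CycleMonitor(std::size_t controllers) : registered_(controllers) {}

  // Hardware side: call right after the blocking read() returns.
  void signal_read_finished()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ++cycle_;
      pending_updates_ = registered_;
    }
    cv_.notify_all();
  }

  // Controller side: block until a cycle other than last_seen has started.
  unsigned long wait_for_read(unsigned long last_seen)
  {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [&] { return cycle_ != last_seen; });
    return cycle_;
  }

  // Controller side: call after update() returns.
  void signal_update_finished()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      --pending_updates_;
    }
    cv_.notify_all();
  }

  // Hardware side: block until every controller has finished its update().
  void wait_for_updates()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [&] { return pending_updates_ == 0; });
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::size_t registered_;
  std::size_t pending_updates_ = 0;
  unsigned long cycle_ = 0;
};

// Runs `cycles` read/update/write rounds and returns the total number of updates.
int run_cycles(std::size_t controllers, int cycles)
{
  CycleMonitor monitor(controllers);
  std::vector<int> updates(controllers, 0);
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < controllers; ++i) {
    threads.emplace_back([&, i] {
      unsigned long seen = 0;
      for (int c = 0; c < cycles; ++c) {
        seen = monitor.wait_for_read(seen);
        ++updates[i];  // stands in for the controller update()
        monitor.signal_update_finished();
      }
    });
  }
  for (int c = 0; c < cycles; ++c) {
    // A real interface would block in read() here.
    monitor.signal_read_finished();
    monitor.wait_for_updates();
    // write() would follow here.
  }
  for (auto & t : threads) { t.join(); }
  int total = 0;
  for (int u : updates) { total += u; }
  return total;
}
```

Because the hardware thread does not start a new cycle until every controller has reported, each controller runs exactly once per read() in this sketch.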

Fail states:

  • we've hardcoded the hardware to use a 0.9 * read_write_period timeout and emit a warning if the updates don't complete in time
  • controller on_error, on_shutdown, on_cleanup and on_deactivate de-register the controller so the hardware does not hang on dead controllers
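
A sketch of the two fail-state rules above, assuming a per-cycle set of pending controller names. `update_timeout` and `UpdateBarrier` are hypothetical names for illustration only; the real implementation lives in the async function handler and may look quite different.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <set>
#include <string>

// 90 % of the read/write period, mirroring the hardcoded budget described above.
std::chrono::nanoseconds update_timeout(double rate_hz)
{
  const std::chrono::duration<double> period(1.0 / rate_hz);
  return std::chrono::duration_cast<std::chrono::nanoseconds>(0.9 * period);
}

// Hypothetical barrier the hardware waits on before write().
class UpdateBarrier
{
public:
  void register_controller(const std::string & name)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    registered_.insert(name);
  }

  // Would be called from on_deactivate / on_cleanup / on_error / on_shutdown.
  void deregister_controller(const std::string & name)
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      registered_.erase(name);
      pending_.erase(name);  // stop waiting for a controller that is gone
    }
    cv_.notify_all();
  }

  // Hardware side: start of a cycle, every registered controller is pending.
  void begin_cycle()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    pending_ = registered_;
  }

  // Controller side: called after update() returns.
  void signal_update_finished(const std::string & name)
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending_.erase(name);
    }
    cv_.notify_all();
  }

  // Returns true if all updates finished in time, false on timeout.
  bool wait_for_updates(std::chrono::nanoseconds timeout)
  {
    std::unique_lock<std::mutex> lock(mutex_);
    return cv_.wait_for(lock, timeout, [&] { return pending_.empty(); });
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::set<std::string> registered_;
  std::set<std::string> pending_;
};
```

On a `false` return the hardware would log the warning and carry on with write(); at 250 Hz the budget works out to roughly 3600 us of the 4000 us cycle.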

Validation

In this setup, we have:

  1. TestSystemComponent + joint_state_broadcaster running in regular sync mode at 500 Hz
  2. kassow robot with blocking read() at 250 Hz
    a. joint_state_broadcaster_async, slave to the kassow hardware
    b. joint_state_broadcaster_async_2, also slave to kassow

Additionally, I've added a 500us sleep after the controller update(), so we can more clearly see the effect a "long" update would have on this setup.

We have 2 new introspection members:

  • sync_latency_us: time between the moment the last "I'm finished" signal was sent and the moment the waiting component actually woke up in response. For example:
    • Two controllers wait for the read_finished signal. When each of them wakes up, it records the time from when the signal was emitted to when the controller thread actually woke up to process it.
    • Two controllers emit update_finished. This measures the time from the last update_finished signal, which causes the hardware interface to wake up for write(), to the moment the hardware interface thread actually woke up to write().
  • sync_triggers: number of update() triggers of the async slave controllers, as issued by the hardware interface.
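
The two members above could be maintained along these lines. This is a sketch under the assumption that the notifier stamps a steady-clock time just before signalling and the woken thread stamps another right after its wait returns; `SyncStats` and `on_wakeup` are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

using Clock = std::chrono::steady_clock;

// Hypothetical bookkeeping for the two introspection values.
struct SyncStats
{
  std::int64_t sync_latency_us = 0;  // delay between the last signal and the wakeup
  std::uint64_t sync_triggers = 0;   // wakeups triggered through the sync signal

  // signal_time: stamped by the notifier just before it notifies.
  // wakeup_time: stamped by the woken thread right after its wait returns.
  void on_wakeup(Clock::time_point signal_time, Clock::time_point wakeup_time)
  {
    sync_latency_us =
      std::chrono::duration_cast<std::chrono::microseconds>(wakeup_time - signal_time).count();
    ++sync_triggers;
  }
};
```

A monotonic clock is used here so that wall-clock adjustments cannot produce negative latencies.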

controllers.yaml

b_controlled_box_cm:
  ros__parameters:
    update_rate: 500  # Hz

    joint_state_broadcaster:
      type: joint_state_broadcaster/JointStateBroadcaster
    joint_state_broadcaster_async:
      type: joint_state_broadcaster/JointStateBroadcaster
    joint_state_broadcaster_async_2:
      type: joint_state_broadcaster/JointStateBroadcaster

joint_state_broadcaster:
  ros__parameters:
    update_rate: 500
    type: joint_state_broadcaster/JointStateBroadcaster

joint_state_broadcaster_async:
  ros__parameters:
    update_rate: 500  
    is_async: true
    async_parameters:
      scheduling_policy: slave         # new option
      slave_to_hardware: kassow    # new parameter
      thread_priority: 30
    type: joint_state_broadcaster/JointStateBroadcaster

joint_state_broadcaster_async_2:
  ros__parameters:
    update_rate: 500
    is_async: true
    async_parameters:
      scheduling_policy: slave
      slave_to_hardware: kassow
      thread_priority: 30
    type: joint_state_broadcaster/JointStateBroadcaster

Sync triggers

We see that the regular joint_state_broadcaster at 500 Hz has about 40k triggers, while the async ones, slave to hardware running at 250 Hz, have about half as many sync_triggers (about 20k), meaning they are woken up correctly (even though they are configured for 500 Hz in controllers.yaml!).

- name: joint_state_broadcaster.total_triggers
  value: 41219.0
- name: joint_state_broadcaster.failed_triggers
  value: 0.0

- name: joint_state_broadcaster_async.sync_triggers
  value: 20275.0
- name: joint_state_broadcaster_async.total_triggers
  value: 40546.0
- name: joint_state_broadcaster_async.failed_triggers
  value: 0.0

- name: joint_state_broadcaster_async_2.sync_triggers
  value: 20275.0
- name: joint_state_broadcaster_async_2.total_triggers
  value: 37414.0
- name: joint_state_broadcaster_async_2.failed_triggers
  value: 0.0
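
As a quick sanity check on the counters above, the ratio of sync_triggers to the synchronous controller's total_triggers can be compared with the 250/500 rate ratio. The helpers and the tolerance below are assumptions for illustration; the counters were presumably not sampled at exactly the same instant, so only an approximate match is expected.

```cpp
#include <cassert>
#include <cmath>

// Ratio of triggers seen by a slaved controller to those of a synchronous one.
double trigger_ratio(double slave_triggers, double sync_mode_triggers)
{
  return slave_triggers / sync_mode_triggers;
}

// True if the observed ratio matches the ratio of the two rates within `tolerance`.
bool matches_rate_ratio(
  double slave_triggers, double sync_mode_triggers, double slave_hz, double sync_mode_hz,
  double tolerance)
{
  const double expected = slave_hz / sync_mode_hz;
  return std::abs(trigger_ratio(slave_triggers, sync_mode_triggers) - expected) <= tolerance;
}
```

With the values reported above, 20275 / 41219 is about 0.49, close to the expected 0.5.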

Plots

In this graph we can see actual measurements for:

  • sync_latency for kassow and two async slave JSBs
  • kassow read execution_time: (purple), showing the portion of the 4000us loop spent waiting (as measured from the controller manager)
  • periodicity of the JSBs (orange/green), which wait for most of the 4000us cycle
  • update rates of all three async components matching the 250 Hz of the kassow hardware
image

Deactivating one of the async joint_state_broadcasters

  • the controller correctly de-registers from the hardware, so kassow does not wait for its update() to complete
  • more time is spent waiting, meaning processing time is shorter
image

Deactivating both async joint_state_broadcasters

  • deactivating both controllers returns the hardware to sleeping for most of the 4000us cycle: it neither waits for any controller to finish update() nor pays the latency of the synchronization signals for thread wakeups.
image

Conclusion

I believe this PR showcases the basic functionality of using the Monitor pattern to "pin" async controllers to async hardware that is slave to an external clock of the robot, as propagated by its blocking read().

The read() signals completion, and write() waits until all pinned controllers signal their update() is complete.

Open questions

  • measurements of read_execution_time for async slave components have to be reviewed: currently we measure the "total blocking time" of the read/update (i.e. the time the thread spends sleeping in read/update) from the perspective of the controller manager. From this we can deduce the time spent processing, but the measurement is not immediately obvious.
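
One way to read the current measurement, assuming the processing time is simply whatever remains of the cycle after the blocking time. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Cycle period for a given rate, e.g. 250 Hz -> 4000 us.
std::chrono::microseconds cycle_period(int rate_hz)
{
  return std::chrono::microseconds(1000000 / rate_hz);
}

// Assumption: processing time = cycle period - measured blocking time.
std::chrono::microseconds processing_time(
  std::chrono::microseconds period, std::chrono::microseconds blocked)
{
  return period - blocked;
}
```

For instance, a purely illustrative blocking time of 3400 us in the 4000 us cycle would leave 600 us of processing.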


mergify bot commented Jan 20, 2026

@Nibanovic, all pull requests must be targeted towards the master development branch.
Once merged into master, it is possible to backport to jazzy, but it must be in master
to have these changes reflected into new distributions.

Contributor

github-actions bot commented Mar 6, 2026

This PR is stale because it has been open for 45 days with no activity. Please tag a maintainer for help on completing this PR, or close it if you think it has become obsolete.

@github-actions github-actions bot added the stale label Mar 6, 2026
@christophfroehlich
Member

@Nibanovic can you give us an update on the status of this PR, is it ready for review?

@github-actions github-actions bot removed the stale label Mar 30, 2026
destogl and others added 2 commits March 31, 2026 10:15
* now this hardware's update loop is synchronized by the blocking read() of a robot it is targeting
@Nibanovic
Author

It is not ready for review, as it relies on #473 getting merged, along with any changes that might be required there.

@christophfroehlich
Member

Ok, I was not sure if this replaces the other PR or is on top of it.

merge with detached_callback
@destogl destogl marked this pull request as ready for review April 2, 2026 09:28
@destogl destogl changed the title from "Draft: Feat/async slave controller" to "Feat/async slave controller" Apr 2, 2026
@github-actions github-actions bot requested review from aprotyas and bijoua29 April 2, 2026 09:59
@codecov-commenter

Codecov Report

❌ Patch coverage is 8.88889% with 41 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.91%. Comparing base (04dc193) to head (abbd88c).
⚠️ Report is 2 commits behind head on jazzy.

Files with missing lines | Patch % | Lines
.../include/realtime_tools/async_function_handler.hpp | 8.88% | 36 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            jazzy     #478      +/-   ##
==========================================
- Coverage   84.86%   82.91%   -1.96%     
==========================================
  Files          19       19              
  Lines        1520     1557      +37     
  Branches      142      153      +11     
==========================================
+ Hits         1290     1291       +1     
- Misses        136      168      +32     
- Partials       94       98       +4     
Flag | Coverage Δ
unittests | 82.91% <8.88%> (-1.96%) ⬇️

Files with missing lines | Coverage Δ
.../include/realtime_tools/async_function_handler.hpp | 63.00% <8.88%> (-11.52%) ⬇️

... and 2 files with indirect coverage changes

@destogl destogl changed the title from "Feat/async slave controller" to "Feature "async" - "slave" controller getting synchronized with a hardware" Apr 2, 2026
@Nibanovic Nibanovic force-pushed the feat/async-slave-controller branch from abbd88c to 71eaaf0 April 3, 2026 08:24

mergify bot commented Apr 3, 2026

This pull request is in conflict. Could you fix it @Nibanovic?

@destogl destogl changed the base branch from jazzy to master April 3, 2026 08:34
* add docs for functions
* const correctness
* add std::optional<> to catch timeouts when waiting for signal, to standardize usage
* pre-commit
@Nibanovic
Author

Closed with explanation here: ros-controls/ros2_control#2971 (comment)

@Nibanovic Nibanovic closed this Apr 7, 2026
5 participants