feat(modbus): implement COM port sharing and robust connection management (#211) #2200

Open
lookmydog wants to merge 7 commits into frangoteam:master from lookmydog:fix/issue-211

Conversation

@lookmydog commented Feb 13, 2026

Description

This PR refactors the Modbus communication layer to support multiple slave devices sharing a single COM port. It introduces a centralized management system to ensure stability and efficiency in industrial environments.

Key Mechanisms Implemented:

  1. ConnectionManager: A centralized controller to manage port states and lifecycle.
  2. Reference Counting: Tracks the number of active devices per port to safely manage opening/closing.
  3. Queue System: Serializes concurrent requests to prevent data collisions on the physical layer.
  4. Skip Mechanism: Automatically skips unreachable or faulty devices to maintain overall system performance.
  5. Timeout Handling: Improved detection and recovery for unresponsive communication sessions.
  6. Concurrency Control: Robust logic to handle simultaneous connection attempts across multiple slaves.
  7. State-Sync Mechanism: Keeps the desired state (intended value) separate from the actual state (device feedback) and continuously reconciles the difference between them, so that communication with the underlying devices stays consistent (referencing the approach from @alext-extracellular).
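
Mechanisms 2 and 3 above can be illustrated together. The following is a minimal sketch, not the PR's actual ConnectionManager: the class name `SharedPort` and its methods `acquire`, `release`, and `enqueue` are hypothetical, and `openFn`/`closeFn` stand in for whatever opens and closes the real port. It shows one port handle shared by several devices through a reference count, with requests serialized on a promise chain.

```javascript
// Hypothetical sketch: none of these names come from the PR's code.
class SharedPort {
  constructor(openFn, closeFn) {
    this.openFn = openFn;   // async () => handle (assumed)
    this.closeFn = closeFn; // async (handle) => void (assumed)
    this.refCount = 0;      // number of devices currently using the port
    this.handle = null;
    this.opening = null;    // in-flight or completed open() promise
    this.tail = Promise.resolve(); // end of the request queue
  }

  // Each device calls acquire() once; only the first call opens the port.
  async acquire() {
    this.refCount += 1;
    if (!this.opening) {
      this.opening = this.openFn().then((handle) => {
        this.handle = handle;
        return handle;
      });
    }
    try {
      return await this.opening;
    } catch (err) {
      this.refCount -= 1;
      this.opening = null; // allow a later retry
      throw err;
    }
  }

  // The port is closed only when the last device releases it.
  async release() {
    if (this.refCount === 0) return;
    this.refCount -= 1;
    if (this.refCount === 0 && this.handle !== null) {
      const handle = this.handle;
      this.handle = null;
      this.opening = null;
      await this.tail; // let queued requests finish before closing
      await this.closeFn(handle);
      // A production version would also guard against re-opening
      // while this close is still pending.
    }
  }

  // Requests run strictly one after another, in arrival order.
  enqueue(task) {
    const run = this.tail.then(() => task(this.handle));
    this.tail = run.catch(() => {}); // a failed request must not block the queue
    return run;
  }
}
```

With this shape, two slave devices on the same COM port would each call `acquire()` at connect time and wrap every read or write in `enqueue(...)`, so only one frame is on the bus at a time.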

Impact

  • Allows more flexible hardware configurations (RS-485 daisy-chaining).
  • Reduces resource overhead by reusing port instances.
  • Increases communication reliability under high load.

Related Issue

Fixes #211

✅ Checklist

  • Implement core mechanisms (1 to 6)
  • Fix connection drop issue during normal operations (e.g., adding tags)
  • Implement state-sync mechanism
  • Add connect-skipped status and UI visual indicator
  • Update all i18n translation files for the new status
  • Replace console.log with FUXA built-in logger
  • Perform testing on physical devices or simulators
    • Test 01 (RTU Shared Port): Two slave devices sharing a single COM port; verify normal polling operations without data collisions or conflicts.
    • Test 02 (TCP ReuseSerial): Verify connection and queueing stability under TCP shared socket mode (e.g., connecting via an RS-485 to Ethernet Gateway).
    • Test 03 (Standard TCP): Verify connection and data operations (read/write) under standard Modbus TCP mode (independent IP/Socket) to ensure legacy logic remains unaffected.
    • Test 04 (State-Sync): Test adding tags or frequently writing values (SetValue) during runtime; verify that desired and actual states synchronize correctly without causing connection drops.
      • Bug fix: the desired value is set as bytes via setvalue, but the actual value returned is a numeric type, so _isequal never evaluates to true.
    • Test 05 (Skip Mechanism & Auto Recovery): Intentionally disconnect one slave device; confirm it enters the connect-skipped state without blocking other healthy devices on the same shared port. Reconnect the device and verify it automatically recovers after the skip duration ends.
      • Bug fix: device connections sometimes flapped during auto-recovery after power-cycle tests.
    • Test 06 (Mixed Modes Comprehensive): Stress test running RTU Shared, TCP ReuseSerial, and Standard TCP simultaneously; confirm the global ConnectionManager handles concurrent requests without cross-interference between modes.
    • Test 07 (UI Integration): Comprehensive frontend testing; operate the Web UI to observe device status indicators (including the new visual color for the skipped state) and verify that the FUXA system logger correctly prints the corresponding Device Names and state transitions.
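
The type mismatch noted under Test 04 (desired value stored as bytes, actual value returned as a number) can be illustrated with a small sketch. This is not the PR's `_isequal`; the helpers `toNumber` and `valuesMatch` are hypothetical, and the sketch assumes a single unsigned 16-bit register in big-endian byte order, which is how Modbus transmits registers.

```javascript
// Hypothetical helpers, assuming one unsigned 16-bit big-endian register.
function toNumber(value) {
  if (Buffer.isBuffer(value)) {
    if (value.length !== 2) {
      throw new RangeError('expected a 2-byte register');
    }
    return value.readUInt16BE(0);
  }
  return Number(value);
}

// Compare desired and actual only after bringing both to the same type.
function valuesMatch(desired, actual) {
  return toNumber(desired) === toNumber(actual);
}

const desired = Buffer.from([0x01, 0x2c]); // bytes as written
const actual = 300;                        // number as read back

console.log(desired === actual);           // a Buffer is never strictly equal to a number
console.log(valuesMatch(desired, actual)); // compares the decoded value instead
```

Other data types (32-bit values, floats, swapped word order) would need their own decoding step before the comparison.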

…tion logic (frangoteam#211)

Refactored the communication layer to allow a single COM port to be shared across multiple slave devices. Key enhancements: ConnectionManager, Reference Counting, Skip Mechanism, Queue System, Timeout Management, and Concurrency Control. Fixes frangoteam#211
@rvbatista
Contributor

Wow, great job. TCP could also need a queue in case serial devices are using a shared gateway.

@lookmydog
Author

> Wow, great job. TCP could also need a queue in case serial devices are using a shared gateway.

Regarding the TCP 'reuseserial' mode, I've implemented a handling mechanism identical to RTU, as the device-side hardware is RTU-based. This has been included in the current Pull Request. However, I must apologize for refactoring the mutex implementation you used into a queue-based mechanism.

@rvbatista
Contributor

No need to apologize; I just copied the strategy from the SocketReuse of the ModbusTcp.

@unocelli
Member

@lookmydog Thanks a lot for this work, the overall direction is great and the refactor is a strong step forward for Modbus connection robustness and shared COM/TCP handling.

I did a brief review with Codex support and found a couple of small improvements before merge.

I also want to be transparent: I could not test COM sharing deeply, because I don’t currently use RTU hardware in my setup.
Thanks again for the effort.

@alext-extracellular
Contributor

I've tested this on my Modbus RTU hardware and it is working quite well; it has prevented the crashing I was experiencing when using a slider to send lots of commands very quickly. However, I still see lots of CRC errors and data length errors, which I also get on the latest FUXA release.

'/dev/serial/by-id/usb-FTDI_FT232R_USB_UART_BG01Q2N7-if00-port0_57600_Even_Slave1' retry after skip period (state: fault, count reset to 0)
Error: Data length error, expected 9 got 4
    at ModbusRTU._onReceive (/usr/src/app/FUXA/server/node_modules/modbus-serial/index.js:450:14)
    at SerialPort.emit (node:events:517:28)
    at addChunk (node:internal/streams/readable:368:12)
    at readableAddChunk (node:internal/streams/readable:341:9)
    at Readable.push (node:internal/streams/readable:278:10)
    at /usr/src/app/FUXA/server/node_modules/@serialport/stream/dist/index.js:208:18
2026-03-02T18:42:53.404Z [ERR] 	'ebyte' _readMemory error at 0-100000! Data length error, expected 9 got 4
2026-03-02T18:42:53.420Z [ERR] 	'ebyte' _readMemory error at 0-100000! CRC error
'/dev/serial/by-id/usb-FTDI_FT232R_USB_UART_BG01Q2N7-if00-port0_57600_Even_Slave1' recovered from error state.
Error: CRC error
    at ModbusRTU._onReceive (/usr/src/app/FUXA/server/node_modules/modbus-serial/index.js:460:14)
    at SerialPort.emit (node:events:517:28)
    at addChunk (node:internal/streams/readable:368:12)
    at readableAddChunk (node:internal/streams/readable:341:9)
    at Readable.push (node:internal/streams/readable:278:10)
    at /usr/src/app/FUXA/server/node_modules/@serialport/stream/dist/index.js:208:18
2026-03-02T18:43:04.409Z [ERR] 	'ebyte' _readMemory error at 0-100000! CRC error
2026-03-02T18:43:04.425Z [ERR] 	'ebyte' _readMemory error at 0-100000! CRC error
'/dev/serial/by-id/usb-FTDI_FT232R_USB_UART_BG01Q2N7-if00-port0_57600_Even_Slave1' recovered from error state.

I don't know if this is just my device or hardware setup, but I don't see these errors when I use QModbus for example.

@lookmydog
Author

lookmydog commented Mar 3, 2026

@alext-extracellular

Thanks for the feedback! It is interesting that QModbus works fine; this suggests the hardware itself can handle the communication, but the timing logic in FUXA might need further tuning.

These errors could be caused by serial packet fragmentation (sticky packets) or improper frame segmentation. To help us analyze and reproduce the issue, could you please provide the following information?

  1. Your FUXA Modbus configuration (a screenshot of the settings page).
  2. Details of the polled data for each device (e.g., the number of Coils, Discrete Inputs, Holding Registers, Input Registers, etc.). The more detailed, the better!
  3. Your system architecture (e.g., the number of devices, Slave IDs, and the physical wiring/connection method).

This will help us determine if the issue is related to Node.js event loop latency or the way the serial buffer is being parsed. Thanks!
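
To make the fragmentation hypothesis concrete: a serial driver may deliver one RTU response in several chunks, and a parser that validates each chunk on its own would report errors like "expected 9 got 4". The sketch below is an assumption-laden illustration, not the modbus-serial implementation. The class `FrameAssembler` is hypothetical; `crc16` is the standard Modbus CRC-16 (initial value 0xFFFF, reflected polynomial 0xA001, transmitted low byte first).

```javascript
// Standard Modbus CRC-16 over a sequence of bytes.
function crc16(bytes) {
  let crc = 0xffff;
  for (const byte of bytes) {
    crc ^= byte;
    for (let i = 0; i < 8; i++) {
      crc = (crc & 1) ? (crc >> 1) ^ 0xa001 : crc >> 1;
    }
  }
  return crc;
}

// Hypothetical helper: collects chunks until the expected frame length
// is reached, and only then checks the CRC.
class FrameAssembler {
  constructor(expectedLength) {
    this.expectedLength = expectedLength;
    this.buffer = Buffer.alloc(0);
  }

  // Returns the complete frame, or null while more bytes are needed.
  push(chunk) {
    this.buffer = Buffer.concat([this.buffer, chunk]);
    if (this.buffer.length < this.expectedLength) return null;
    const frame = this.buffer.subarray(0, this.expectedLength);
    this.buffer = Buffer.alloc(0);
    const received = frame.readUInt16LE(frame.length - 2); // CRC is sent low byte first
    if (crc16(frame.subarray(0, -2)) !== received) {
      throw new Error('CRC error');
    }
    return frame;
  }
}
```

For a 9-byte response that arrives as 4 bytes followed by 5 bytes, the first `push` returns `null` and the second returns the validated frame, instead of failing on the first partial chunk.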

@alext-extracellular
Contributor

@lookmydog
OK, so it seems this only happens when the connection option is set to serial port, as in the image below:
image

When I change to RTUBufferedPort, it works fine, with no errors reported in the logs. Perhaps that was my mistake and I should have been using it anyway?

Working:
image

I can't actually try multiple modbus masters on the same network because although I have 2, I only have 1 power supply atm :(

@alext-extracellular
Contributor

alext-extracellular commented Mar 3, 2026

One thing I wanted to get an opinion on is two different approaches to improving the Modbus device in FUXA. You have implemented a queue, which gives the minimum time to see a change in real life. However, when I spam lots of commands, the delay that builds up in the queue can be several seconds long.

Separately, I also implemented a different approach, which stores a "desired state" and an "actual state". Any set-tag command changes the desired state, and this is synchronised to the device on the polling interval. Any tag read comes from the actual state, which is the last value read from the real device. This means there is a consistent delay between clicking a button in the UI and the result happening, which is roughly the polling time. However, it also means no lag can build up between the commands sent and the result. AFAIK this is generally how it is implemented in PLCs.
However, I didn't look at multi-device support because I can't test that anymore.
(Note: I did use AI to help me implement this because I am not familiar with JS.)
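
The desired/actual idea can be sketched as follows. This is only an illustration under assumptions, not the code from either implementation: the class `TagStateSync` and the injected `readFn`/`writeFn` callbacks are hypothetical. The point it shows is that repeated writes to the same tag overwrite one pending value instead of queuing, so at most one write per tag goes out per polling cycle.

```javascript
// Hypothetical sketch of desired-state / actual-state reconciliation.
class TagStateSync {
  constructor(readFn, writeFn) {
    this.readFn = readFn;     // async (address) => value (assumed)
    this.writeFn = writeFn;   // async (address, value) => void (assumed)
    this.desired = new Map(); // pending intended values, latest wins
    this.actual = new Map();  // last values read from the device
  }

  // UI writes only touch the desired state; nothing is sent yet.
  setTag(address, value) {
    this.desired.set(address, value);
  }

  // UI reads always come from the last device feedback.
  getTag(address) {
    return this.actual.get(address);
  }

  // Called once per polling interval.
  async poll(addresses) {
    // 1. Push every pending value that differs from the device.
    for (const [address, value] of [...this.desired]) {
      if (this.actual.get(address) !== value) {
        await this.writeFn(address, value);
      }
    }
    // 2. Refresh the actual state from the device.
    for (const address of addresses) {
      this.actual.set(address, await this.readFn(address));
    }
    // 3. Drop desired entries that the device has confirmed.
    for (const [address, value] of [...this.desired]) {
      if (this.actual.get(address) === value) {
        this.desired.delete(address);
      }
    }
  }
}
```

If a slider calls `setTag` thirty times between two polls, only the last value is written, which is why the delay stays bounded by roughly one polling interval.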

What are your opinions? Particularly @unocelli, because I guess this is a philosophical question about how you want your app to work.

@lookmydog would you be interested in combining our approaches perhaps?

@lookmydog
Author

@alext-extracellular

Regarding RTUBufferedPort:

You are absolutely right. Based on my experience with the modbus-serial library, RTUBufferedPort consistently provides better stability and data integrity for serial communications compared to the standard SerialPort. It has proven to be a much more robust choice in practical applications.

Regarding State-based Control ("desired state" and "actual state"):

To be honest, I am not entirely familiar with the implementation details of a "State-based" architecture. However, if this model can effectively resolve the communication congestion and latency issues we are facing in FUXA's high-frequency environments, I am more than willing to give it a try and evaluate its performance.

Next Steps:

I am very interested in exploring this further. Could you share more of your specific ideas or perhaps some pseudo-code? I would be happy to collaborate with you and @unocelli to see how we can combine our approaches and improve the Modbus communication module together.

@jingshui127

image image image

I tested it, and still only one slave is working properly! Could it be that it wasn't configured correctly? Can you take a screenshot of the configuration?

@lookmydog
Author

@jingshui127

Did you download the source code from the main branch? The fixes haven't been merged into main yet, so the issue still persists there. If that's not the case, please provide more details so I can look into it. Thanks!

@jingshui127

v1.3.0-2727

> @jingshui127
>
> Did you download the source code from the main branch? The fixes haven't been merged into main yet, so the issue still persists there. If that's not the case, please provide more details so I can look into it. Thanks!

@alext-extracellular
Contributor

> v1.3.0-2727
>
> > @jingshui127
> > Did you download the source code from the main branch? The fixes haven't been merged into main yet, so the issue still persists there. If that's not the case, please provide more details so I can look into it. Thanks!

You will need to clone and run this branch: https://github.com/lookmydog/FUXA/tree/fix/issue-211, or at least copy the Modbus implementation from it into your local FUXA.

@alext-extracellular
Contributor

alext-extracellular commented Mar 10, 2026

I have now been able to test with multiple Modbus devices on the same bus. Unfortunately it is not working well: there are still errors and the devices are not responding. These are the settings:
image
image

There were 3 tags in total, so timing should not be a problem.

'/dev/serial/by-id/usb-1a86_USB_Single_Serial_586D015045-if00_38400_Even' disconnect
'/dev/serial/by-id/usb-1a86_USB_Single_Serial_586D015045-if00_38400_Even' Shared Connection created
2026-03-10T19:02:55.532Z [INF] 	'g3 rtu' restored 0/0 values
'/dev/serial/by-id/usb-1a86_USB_Single_Serial_586D015045-if00_38400_Even_Slave130' retry after skip period (state: fault, count reset to 0)
Error: Timeout SlaveID 130
    at Timeout._onTimeout (/usr/src/app/FUXA/server/runtime/devices/modbus/index.js:1592:20)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)
2026-03-10T19:02:57.825Z [ERR] 	'mfc rtu' _readMemory error at 41-400000! Timeout SlaveID 130
'/dev/serial/by-id/usb-1a86_USB_Single_Serial_586D015045-if00_38400_Even_Slave130' recovered from error state.
Error: Timeout SlaveID 130
    at Timeout._onTimeout (/usr/src/app/FUXA/server/runtime/devices/modbus/index.js:1592:20)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)
2026-03-10T19:03:00.329Z [ERR] 	'mfc rtu' _readMemory error at 41-400000! Timeout SlaveID 130
'/dev/serial/by-id/usb-1a86_USB_Single_Serial_586D015045-if00_38400_Even' disconnect
'/dev/serial/by-id/usb-1a86_USB_Single_Serial_586D015045-if00_38400_Even' Shared Connection created
2026-03-10T19:03:03.819Z [INF] 	'mfc rtu' restored 0/0 values
Error: Timeout SlaveID 130
    at Timeout._onTimeout (/usr/src/app/FUXA/server/runtime/devices/modbus/index.js:1592:20)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)
2026-03-10T19:03:06.345Z [ERR] 	'mfc rtu' _readMemory error at 41-400000! Timeout SlaveID 130
'/dev/serial/by-id/usb-1a86_USB_Single_Serial_586D015045-if00_38400_Even_Slave130' escalated: timeout -> fault. Skip 10s
Error: Timeout SlaveID 130
    at Timeout._onTimeout (/usr/src/app/FUXA/server/runtime/devices/modbus/index.js:1592:20)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)
2026-03-10T19:03:08.822Z [ERR] 	'mfc rtu' _readMemory error at 41-400000! Timeout SlaveID 130
Error: RTU SlaveID 130 skipped (state: fault, retry: 10s)
    at ConnectionManager.executeRtu (/usr/src/app/FUXA/server/runtime/devices/modbus/index.js:1572:5)
    at _execute (/usr/src/app/FUXA/server/runtime/devices/modbus/index.js:1054:29)
    at /usr/src/app/FUXA/server/runtime/devices/modbus/index.js:821:5
    at new Promise (<anonymous>)
    at _readMemory (/usr/src/app/FUXA/server/runtime/devices/modbus/index.js:741:10)
    at MODBUSclient._polling (/usr/src/app/FUXA/server/runtime/devices/modbus/index.js:281:7)
    at MODBUSclient.polling (/usr/src/app/FUXA/server/runtime/devices/modbus/index.js:267:15)
    at Device.polling (/usr/src/app/FUXA/server/runtime/devices/device.js:215:14)
    at Timeout._onTimeout (/usr/src/app/FUXA/server/runtime/devices/device.js:231:26)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)
2026-03-10T19:03:08.824Z [ERR] 	'mfc rtu' _readMemory error at 41-400000! RTU SlaveID 130 skipped (state: fault, retry: 10s)

@alext-extracellular
Contributor

If I connect mbusd to my serial device and then connect both TCP devices in FUXA to it, they also produce errors, even with "Reuse".

@lookmydog
Author

> I have been able to test now with multiple modbus devices on the same bus, unfortunately it is not working well, still errors and not responding. these are the settings:

I'll investigate the root cause. (Since my environment doesn't use such fast polling times and baudrates, I might have missed these specific cases.)

> if I connect mbusd to my serial device and then connect both tcp devices in fuxa to it, even with "Reuse" they are also erroring.

For TCP connections, the queuing mechanism relies on the ReuseSerial option. Could you please try with that enabled?

@lookmydog
Author

💡 Modbus RTU Testing Report & Observations
I have tested the configuration on my physical Modbus RTU setup. Here are the details and initial results from my environment:

🛠 Environment & Configuration
Communication: Baudrate 38400, 8-Even-1

Architecture: PC -> USB-to-RS232 -> RS232/RS485 Converter -> 4x Modbus RTU AI Modules

Device Setup:

| Slave ID | Polling Time | Data (Tags) |
|---|---|---|
| 3 | 200 ms | 8 Temperature points (16 words) |
| 4 | 350 ms | 8 Temperature points (16 words) |
| 5 | 500 ms | 8 Temperature points (16 words) |
| 6 | 1000 ms | 8 Temperature points (16 words) |

Operation: Polling Read only (no write operations).
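
As a rough cross-check of these polling intervals, the time one read transaction occupies the wire can be estimated from the serial settings. The sketch below is a back-of-the-envelope calculation, not a measurement: the function name `rtuTransactionMs` is hypothetical, and it assumes each device's 16 words are read in a single function-code-03 request. It counts 11 bits per character (start, 8 data, parity, stop, matching 8-Even-1), an 8-byte request, a response of 5 + 2N bytes, and the inter-frame silence from the Modbus serial line specification (3.5 character times, or a fixed 1.75 ms above 19200 baud).

```javascript
// Estimated wire time in milliseconds for one read-registers transaction.
// Device turnaround (processing) time is NOT included.
function rtuTransactionMs(baud, registerCount) {
  const bitsPerChar = 11;                       // 1 start + 8 data + 1 parity + 1 stop
  const charMs = (bitsPerChar / baud) * 1000;
  const requestBytes = 8;                       // id, function, address(2), quantity(2), CRC(2)
  const responseBytes = 5 + 2 * registerCount;  // id, function, byte count, data, CRC(2)
  const silenceMs = baud > 19200 ? 1.75 : 3.5 * charMs;
  return (requestBytes + responseBytes) * charMs + 2 * silenceMs;
}

// 38400 baud, 16 registers: about 16.4 ms of wire time per transaction.
console.log(rtuTransactionMs(38400, 16).toFixed(2));
```

The device's own response delay comes on top of this figure and is device-specific, so it is often the larger part of the budget when choosing a polling interval.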

📊 Preliminary Testing Results
I have been running several test sessions, each lasting over 4 hours, using the current version of this Pull Request. The communication behavior observed so far is largely consistent; while I still encounter occasional Timeout errors, the frequency is low enough to allow for continuous data collection during these sessions.

📝 Notes on Device Response
Based on these initial tests, it seems the performance is quite sensitive to how each device handles request timing:

On Polling Time: For devices that might take a bit longer to respond, slightly extending the polling interval seems to provide better consistency.

Regarding Reliability: This also helps avoid the automatic "Skip" logic, ensuring that each device remains active and updated without being temporarily sidelined by the driver due to timeout.
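
For readers unfamiliar with the "Skip" logic mentioned here, the general pattern is: count consecutive failures, and once a threshold is reached, stop polling that device for a fixed period so it does not stall the others on the shared port. The sketch below is a generic illustration, not the PR's code. The class `SkipTracker` and the threshold of 3 failures are assumptions; the 10-second skip period is taken from the "Skip 10s" message in the logs earlier in this thread.

```javascript
// Hypothetical sketch of a per-device skip tracker.
class SkipTracker {
  constructor({ maxFailures = 3, skipMs = 10000, now = Date.now } = {}) {
    this.maxFailures = maxFailures; // assumed threshold
    this.skipMs = skipMs;
    this.now = now;                 // injectable clock, useful for tests
    this.failures = 0;
    this.skipUntil = 0;
  }

  // True while the device is inside its skip period.
  shouldSkip() {
    return this.now() < this.skipUntil;
  }

  // Any successful response clears the failure history.
  recordSuccess() {
    this.failures = 0;
    this.skipUntil = 0;
  }

  // Consecutive failures escalate to a skip period.
  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.maxFailures) {
      this.skipUntil = this.now() + this.skipMs;
      this.failures = 0; // start counting afresh after the skip period
    }
  }
}
```

A polling loop would call `shouldSkip()` before queuing a request for a device and then `recordSuccess()` or `recordFailure()` depending on the outcome. Extending the polling interval, as suggested above, reduces the number of timeouts and therefore how often the threshold is reached.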

🔍 Physical Environment Considerations
I also wonder if any external factors might be influencing the stability on your end? In some cases, we've seen communication issues tied to the physical layer:

Cable Length: Long RS485 runs can lead to signal attenuation or reflection.

Signal Interference: Electromagnetic interference (EMI) from nearby high-voltage lines or motors.

Wiring Quality: It might be worth checking the termination resistors or shielding if the timeouts persist.

Just sharing these observations as I continue to evaluate the current PR. If you have more data or logs you'd like me to look at, I'm happy to discuss further!

@alext-extracellular
Contributor

@lookmydog
OK, I could not test on my RTU setup because it is currently in use for production.
I set up a small testbed using only Modbus TCP, with modbux as the server and FUXA as the client, polling at 350 ms.

example:
image

I have 4 tags on both id=1 and id=2. I tried the different Socket Reuse options: nothing, Reuse, and ReuseSerial.

Socket Reuse: nothing

No issues; both devices are polled with no errors appearing in the logs (probably because the server allows multiple connections).

Socket Reuse: Reuse

After switching both devices to this setting, I sometimes need to turn one off for a few seconds and then enable it again to make the errors go away, but it generally works fine.

Socket Reuse: ReuseSerial

After switching to this setting, I start to see a lot of warnings:

'host.docker.internal:5521' TCP disconnect (Ref=0)
'host.docker.internal:5521' Shared Connection created
2026-03-15T17:44:53.620Z [WAR] 	'modbux2' TCP connection closed, attempting to reconnect...
'host.docker.internal:5521' TCP disconnect (Ref=0)
'host.docker.internal:5521' Shared Connection created
2026-03-15T17:44:53.758Z [WAR] 	'modbux1' TCP connection closed, attempting to reconnect...
'host.docker.internal:5521' TCP disconnect (Ref=0)
'host.docker.internal:5521' Shared Connection created
'host.docker.internal:5521' TCP disconnect (Ref=0)
'host.docker.internal:5521' Shared Connection created
2026-03-15T17:44:58.819Z [INF] 	'modbux2' restored 0/0 values
2026-03-15T17:44:59.010Z [WAR] 	'modbux1' TCP connection closed, attempting to reconnect...
'host.docker.internal:5521' TCP disconnect (Ref=0)
'host.docker.internal:5521' Shared Connection created
2026-03-15T17:44:59.169Z [WAR] 	'modbux2' TCP connection closed, attempting to reconnect...

The workaround is to disable one device for a few seconds and then enable it again; after that it seems stable.

I will test again with the RTU system when I can. Overall it seems OK, with potentially some buggy state on the initial connection.

@lookmydog
Author

Hi everyone,

This commit addresses connection stability issues and introduces new mechanisms to improve the Modbus driver's reliability and user experience.

🐛 Bug Fixes

  • Connection Stability: Fixed a critical issue where normal operations (e.g., adding tags) would unexpectedly trigger device disconnections.

✨ New Features

  • State-Sync Mechanism: Implemented a state-sync feature to ensure better synchronization and handling of the connection state. (This implementation references the approach from alext-extracellular).
  • UI 'Skip' Status Indicator: Added a visual indicator in the frontend UI for devices that are temporarily skipped. This helps operators quickly identify the device status without checking the backend logs.

Contributor

@alext-extracellular alext-extracellular left a comment

Is there a reason for commenting out all of the console outputs? I think they are quite useful to have for logging.

@lookmydog
Author

I commented those out because I'm aiming to transition to FUXA's built-in logging system, as I think console.log might be a bit messy for the PR. They are still there if someone needs them for local debugging.

Also, thanks for the inspiration with the progress checklist! It’s a very clear way to visualize the development stages, and I'll be using that moving forward.

user added 2 commits March 26, 2026 10:25
Sometimes device connection flapping during auto-recovery after power cycle tests.

Development

Successfully merging this pull request may close these issues.

Modbus RTU multiple devices per serial port.

5 participants