Chapter 17 – Async, Bulk Data, Custom FCs & Protocol Conversion
(Module 5 · Development & Implementation – Bringing Modbus to Life)
Learning objectives
Upon finishing this chapter you will be able to …
- Implement asynchronous Modbus on both TCP and RTU, keeping 100+ slaves responsive on a single-core edge computer.
- Stream large data blocks (kilobytes/second) through standard function codes without overrunning 125-register limits.
- Create, document and test user-defined function codes (FC 65-72) when public FCs are insufficient.
- Embed Modbus in a protocol-conversion micro-service (MQTT ↔ Modbus, OPC UA ↔ Modbus) with type mapping, buffering and rate-limiting.
- Benchmark and optimise latency, jitter and CPU load using event traces and profiling tools.
17.1 Why “advanced” matters
Edge analytics, brown-field IIoT and high-speed motion routinely push classic Modbus beyond its original 8-bit PLC scope. The patterns in this chapter reduce latency, raise throughput and bridge to modern stacks without abandoning the installed base.
17.2 Asynchronous Modbus RTU over RS-485
17.2.1 Problem statement
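Traditional RTU masters issue one request and then wait for the full response, which occupies the bus for 10–40 ms per transaction at 9 600 Bd. If you must poll 50 slaves at a 100 ms cadence, a synchronous loop cannot keep up: 50 transactions at even 10 ms each already take 500 ms.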
17.2.2 Concept: inter-char timer multiplexing
- Send request to Slave A.
- While Slave A is computing, immediately queue request for Slave B.
- Identify response by Slave Address byte, match against outstanding table.
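Constraint: responses must never overlap on the wire, and every frame must still be preceded by the 3.5-character silent interval. The Modbus serial-line specification allows only one outstanding transaction at a time, so this pipelining is a non-standard extension that works only with slaves known to tolerate it.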
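The matching step above can be sketched in Python before porting it to firmware. This is a minimal illustration, not the chapter's Listing 17-1: the `OutstandingTable` class and its method names are hypothetical, and the sketch assumes at most one request in flight per slave address.

```python
from dataclasses import dataclass


def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def add_crc(body: bytes) -> bytes:
    """Append the CRC low byte first, as RTU framing requires."""
    return body + crc16_modbus(body).to_bytes(2, "little")


@dataclass
class Outstanding:
    addr: int
    fc: int


class OutstandingTable:
    """Hypothetical table of requests that still await a response."""

    def __init__(self):
        self._pending = {}  # slave address -> Outstanding

    def register(self, addr: int, fc: int) -> None:
        if addr in self._pending:
            raise ValueError(f"slave {addr} already has a request in flight")
        self._pending[addr] = Outstanding(addr, fc)

    def match(self, frame: bytes):
        """Return the matching record, or None for corrupt or unexpected frames."""
        # A frame with a correct trailing CRC yields a residue of zero.
        if len(frame) < 4 or crc16_modbus(frame) != 0:
            return None
        addr = frame[0]
        fc = frame[1] & 0x7F  # bit 7 set marks an exception response
        rec = self._pending.get(addr)
        if rec is None or rec.fc != fc:
            return None
        del self._pending[addr]
        return rec
```

Keying the table by slave address keeps the lookup constant-time; a frame that fails the CRC check or matches no pending request is simply dropped and left to the request timeout.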
17.2.3 Reference implementation — STM32 FreeRTOS ISR
- Maintain a ring buffer of outstanding (addr, fc, crc) records.
- UART RX ISR detects a new frame (gap timer) → dispatch to handler X.
- Scheduler unblocks next TX once UART TX-complete ISR fires.
(Listing 17-1: C pseudo, 140 LOC)
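The gap timer in the ISR needs a concrete timeout. The sketch below computes it in Python for quick checking; it assumes 11-bit characters (start, 8 data, parity, stop) and follows the serial-line specification's recommendation of a fixed 1.75 ms gap above 19 200 Bd. The function names and the `slave_delay_s` parameter are illustrative.

```python
def char_time_s(baud: int, bits_per_char: int = 11) -> float:
    """Duration of one character on the wire, in seconds."""
    return bits_per_char / baud


def frame_gap_s(baud: int) -> float:
    """Inter-frame silent interval (t3.5), in seconds."""
    if baud > 19200:
        return 1.75e-3  # fixed value recommended for fast links
    return 3.5 * char_time_s(baud)


def transaction_time_s(baud: int, request_bytes: int, response_bytes: int,
                       slave_delay_s: float = 0.0) -> float:
    """Wire time for one request/response pair, including both silent gaps."""
    on_wire = (request_bytes + response_bytes) * char_time_s(baud)
    return on_wire + 2 * frame_gap_s(baud) + slave_delay_s
```

For example, `transaction_time_s(9600, 8, 25)` models an 8-byte read request and a 25-byte response (ten registers) and lands in the 10–40 ms-plus range quoted in 17.2.1 once slave processing time is added.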
17.2.4 Throughput numbers
Baud | Sync loop (32 slaves) | Async pipeline (32×2 outstanding) |
---|---|---|
9 600 | 540 ms | 210 ms |
38 400 | 140 ms | 55 ms |
115 200 | 46 ms | 18 ms |
(Fig-17-1: bar graph)
17.3 Asynchronous Modbus TCP with asyncio
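"""
listing_17_2_async_pipeline.py
Up to four outstanding TIDs per slave (fixed window).
"""
import asyncio
from pymodbus.client import AsyncModbusTcpClient

WINDOW = 4  # maximum in-flight requests per slave connection

def process(data):
    """Placeholder: replace with application-specific handling."""
    print(data)

async def poll_block(cli, sem, unit, start, count):
    # The semaphore caps how many requests wait for a response at once.
    async with sem:
        # The addressing keyword varies by pymodbus release (unit=, slave=
        # or device_id=); check the one your installed version expects.
        rsp = await cli.read_holding_registers(start, count=count, slave=unit)
    return rsp.registers if not rsp.isError() else rsp

async def poll_slave(ip, unit):
    # nodelay=True requests TCP_NODELAY; confirm your pymodbus release accepts it.
    cli = AsyncModbusTcpClient(ip, nodelay=True)
    await cli.connect()
    sem = asyncio.Semaphore(WINDOW)  # one window per slave, not one shared globally
    while True:
        # Four 125-register reads cover holding registers 0..499.
        tasks = [poll_block(cli, sem, unit, s, 125) for s in range(0, 500, 125)]
        data = await asyncio.gather(*tasks, return_exceptions=True)
        process(data)
        await asyncio.sleep(0.1)  # 100 ms poll cadence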
The semaphore window keeps at most four requests unanswered at any time, so traffic stays within the switch buffer and the slave's request queue.
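The windowing behaviour can be checked without a Modbus device. The sketch below replaces the network round trip with `asyncio.sleep` and records the peak number of simulated requests in flight; `run_windowed` and `demo` are hypothetical helper names.

```python
import asyncio


async def run_windowed(n_requests: int, window: int, delay_s: float = 0.01) -> int:
    """Issue n_requests simulated polls and return the peak number in flight."""
    sem = asyncio.Semaphore(window)
    in_flight = 0
    peak = 0

    async def fake_request(i: int) -> int:
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(delay_s)  # stands in for the network round trip
            in_flight -= 1
        return i

    results = await asyncio.gather(*(fake_request(i) for i in range(n_requests)))
    # gather returns results in submission order, not completion order.
    assert results == list(range(n_requests))
    return peak


def demo() -> int:
    return asyncio.run(run_windowed(n_requests=16, window=4))
```

Calling `demo()` queues sixteen requests, yet the reported peak never exceeds the window of four.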
17.4 Bulk-data streaming patterns
17.4.1 Sliding-window burst
- Break file (e.g. 4 kB firmware fragment) into 60-register packets.
- Use FC 16 Write Multiple Registers with incrementing start address.
- The slave echoes the start address and register count; on a CRC error, timeout or exception response, resend only the failed packet.
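(Fig-17-2: timing chart at 115 kBd. The quoted 35 kB/s effective rate exceeds the roughly 11.5 kB/s raw capacity of a 115 200 Bd line at 10 bits per character and needs re-measurement.)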
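The chunking and selective-resend steps can be sketched as follows. For simplicity the sketch sends one packet at a time (stop-and-wait) rather than keeping several in flight, and the `send` callback is a hypothetical transport hook that returns `True` when a valid echo arrives.

```python
import struct

REGS_PER_PACKET = 60  # packet size used in this section


def fc16_pdu(start: int, registers: list[int]) -> bytes:
    """FC 16 request PDU: function code, start address, quantity, byte count, values."""
    qty = len(registers)
    if not 1 <= qty <= 123:
        raise ValueError("FC 16 carries 1..123 registers per request")
    return struct.pack(f">BHHB{qty}H", 0x10, start, qty, 2 * qty, *registers)


def split_payload(payload: bytes, base_addr: int):
    """Yield (start_address, registers) pairs of at most REGS_PER_PACKET registers."""
    if len(payload) % 2:
        payload += b"\x00"  # pad to a whole 16-bit register
    regs = list(struct.unpack(f">{len(payload) // 2}H", payload))
    for off in range(0, len(regs), REGS_PER_PACKET):
        yield base_addr + off, regs[off:off + REGS_PER_PACKET]


def send_all(payload: bytes, base_addr: int, send, max_retries: int = 3) -> int:
    """Send every packet, retrying only the one that failed; return transmissions."""
    transmissions = 0
    for start, regs in split_payload(payload, base_addr):
        pdu = fc16_pdu(start, regs)
        for _ in range(max_retries):
            transmissions += 1
            if send(pdu):
                break
        else:
            raise IOError(f"packet at address {start} failed {max_retries} times")
    return transmissions
```

A 4 kB fragment is 2 048 registers, so it splits into 34 full packets plus one of 8 registers.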
17.4.2 Circular buffer + pointer registers
- Two holding registers expose Write_PTR and Read_PTR.
- Stream producer writes data into 256-reg circular buffer, updates Write_PTR.
- Consumer polls Write_PTR; if it differs from Read_PTR, the consumer reads the new block and advances Read_PTR.
- Achieves near-continuous 100 kB/s on 100 Mbit TCP with a small, fixed memory footprint.
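The pointer handshake above can be simulated in plain Python. The `RingRegisters` class is a hypothetical stand-in for the slave's register map; it keeps one slot empty so that equal pointers always mean "no new data", which is an assumption of this sketch rather than something the text prescribes.

```python
BUF_SIZE = 256  # registers in the circular buffer


class RingRegisters:
    """Simulated register map: a 256-register ring plus two pointer registers."""

    def __init__(self):
        self.buf = [0] * BUF_SIZE
        self.write_ptr = 0  # advanced by the producer
        self.read_ptr = 0   # advanced by the consumer

    def free(self) -> int:
        # One slot stays empty so write_ptr == read_ptr always means "empty".
        return (self.read_ptr - self.write_ptr - 1) % BUF_SIZE

    def produce(self, values) -> int:
        """Write as many values as fit and return the number accepted."""
        n = min(len(values), self.free())
        for v in values[:n]:
            self.buf[self.write_ptr] = v & 0xFFFF
            self.write_ptr = (self.write_ptr + 1) % BUF_SIZE
        return n

    def consume(self, max_regs: int = 125) -> list:
        """Return unread registers, at most one FC 03 read's worth (125)."""
        available = (self.write_ptr - self.read_ptr) % BUF_SIZE
        n = min(available, max_regs)
        # On a real bus, a block that wraps past the buffer end needs two reads.
        out = [self.buf[(self.read_ptr + i) % BUF_SIZE] for i in range(n)]
        self.read_ptr = (self.read_ptr + n) % BUF_SIZE
        return out
```

The producer sees back-pressure as a short count from `produce`, so it can retry the remainder on the next cycle instead of overwriting unread data.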
17.5 User-defined function codes (65–72 / 0x41-0x48)
Step | Detail |
---|---|
1. Reserve FC | Pick within 65-72; avoid clashes with vendor specs. |
2. Publish spec | Byte layout, max length, expected exception mapping. |
3. Implement client stub | Extends library. Example for pymodbus: client.execute(CustomRequest(...)). |
4. Implement server handler | Add case in switch; validate payload; reply. |
5. Conformance test | Fuzz illegal length, bad CRC, out-of-range opID. |
Example FC 65: “Transfer CSV line”
Request PDU: 0x41 | CSV_len | CSV bytes[]
Response PDU: 0x41 | Status (0x00 OK, 0x01 Full)
(Listing 17-3: Python custom PDU classes.)
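Independently of any library, the FC 65 byte layout can be exercised with plain `bytes`. The sketch below is not Listing 17-3: it assumes CSV_len is a single byte and that lines are ASCII, neither of which the layout above states, and all function names are illustrative. The 251-byte cap follows from the 253-byte Modbus PDU limit minus the function code and length byte.

```python
FC_CSV = 0x41          # user-defined function code 65
MAX_CSV_BYTES = 251    # 253-byte PDU limit minus function code and length byte
STATUS_OK, STATUS_FULL = 0x00, 0x01


def encode_request(line: str) -> bytes:
    """Build the request PDU: 0x41 | CSV_len | CSV bytes."""
    data = line.encode("ascii")  # assumption: CSV lines are ASCII only
    if len(data) > MAX_CSV_BYTES:
        raise ValueError("CSV line does not fit in one PDU")
    return bytes([FC_CSV, len(data)]) + data


def decode_request(pdu: bytes) -> str:
    """Validate a request PDU on the server side and return the CSV line."""
    if len(pdu) < 2 or pdu[0] != FC_CSV:
        raise ValueError("not an FC 65 request")
    if pdu[1] != len(pdu) - 2:
        raise ValueError("CSV_len does not match payload size")
    return pdu[2:].decode("ascii")


def encode_response(ok: bool) -> bytes:
    """Build the response PDU: 0x41 | status."""
    return bytes([FC_CSV, STATUS_OK if ok else STATUS_FULL])


def exception_response(code: int) -> bytes:
    """Standard exception PDU: function code with bit 7 set, then the code."""
    return bytes([FC_CSV | 0x80, code])
```

The length check in `decode_request` is the kind of validation the conformance step should fuzz: truncated payloads and oversized lines must both be rejected.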
17.6 Protocol-conversion micro-service
17.6.1 Stack
Layer | Technology |
---|---|
Southbound | pymodbus async TCP & RTU dialects |
Middleware | asyncio Queue (back-pressure) |
Northbound | paho-mqtt (MQTT) or asyncua (OPC UA) |
17.6.2 Rate-limiting
- Max N in-flight polls; convert output to MQTT topic /site/slave17/hr40001 with retain.
- Use token-bucket: 10 tokens per second per slave; prevents MQTT brokers overrunning slow RTU bus.
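A token bucket for the 10-tokens-per-second rule can be sketched in a few lines. The `TokenBucket` class is illustrative; the injectable `clock` parameter is an assumption added so the behaviour can be tested without real delays.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def try_acquire(self, n: float = 1.0) -> bool:
        """Take n tokens if available; return False when the caller must wait."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= n:
            self._tokens -= n
            return True
        return False
```

In the bridge, one bucket per slave (for instance a dict keyed by slave address) would gate each northbound write request before it reaches the RTU queue; requests that fail `try_acquire` can be delayed or dropped according to policy.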
17.6.3 Type mapping table
Modbus | MQTT JSON | Convert |
---|---|---|
Coil | bool | int→bool |
Holding uint16 | number | v / 10.0 scaling |
Float32 | number | decode swap pattern |
(Fig-17-3: architecture diagram; docker compose stack.)
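The three conversions in the table can be sketched with the standard library. The function names and the JSON payload shape (`name`/`value` keys) are assumptions for illustration; the word-swap flag covers devices that transmit the low word of a Float32 first.

```python
import json
import struct


def coil_to_bool(value: int) -> bool:
    """Coil: int -> bool."""
    return bool(value)


def scaled(value: int, divisor: float = 10.0) -> float:
    """Holding uint16 carrying a fixed-point value: v / 10.0."""
    return value / divisor


def regs_to_float32(first: int, second: int, word_swap: bool = False) -> float:
    """Combine two 16-bit registers into an IEEE 754 float32.

    With word_swap=False the first register is the high word; with
    word_swap=True the registers arrive low word first.
    """
    if word_swap:
        first, second = second, first
    return struct.unpack(">f", struct.pack(">HH", first, second))[0]


def to_mqtt_json(name: str, value) -> str:
    """Hypothetical payload shape for the northbound MQTT message."""
    return json.dumps({"name": name, "value": value})
```

For example, registers 0x41C8 and 0x0000 decode to 25.0 in high-word-first order, and the same value arrives as 0x0000, 0x41C8 from a word-swapped device.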
17.7 Edge analytics integration
- Use InfluxDB Telegraf plugins.inputs.modbus inside container; push to InfluxDB/Chronograf.
- Pipeline output to Grafana dashboards; threshold alerts drive webhooks.
17.8 Benchmarking & profiling
Tool | Scope | Example usage |
---|---|---|
perf (Linux) | CPU hotspots | perf record -e cycles:u ./master |
Pyinstrument | Python async | python -m pyinstrument listing_17_2_async_pipeline.py |
Saleae Logic | Bus gaps | Measure TX-Enable jitter < 1 µs |
bpftrace | Kernel TCP retrans | bpftrace -e 'kprobe:tcp_retransmit_skb { @[comm]=count(); }' |
(Fig-17-4: flamegraph before/after NODELAY + window tuning.)
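Alongside these tools, a small helper can summarise round-trip samples collected by the poller. The sketch below is illustrative: it treats jitter as the population standard deviation of latency, which is one common definition rather than the only one, and `latency_report` is a hypothetical name.

```python
import statistics


def latency_report(samples_ms):
    """Summarise latency samples (milliseconds) as mean, p99, jitter and max."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank 99th percentile, computed with integer arithmetic.
    p99_index = (99 * n + 99) // 100 - 1
    return {
        "mean_ms": statistics.fmean(ordered),
        "p99_ms": ordered[p99_index],
        "jitter_ms": statistics.pstdev(ordered),  # assumption: jitter = std dev
        "max_ms": ordered[-1],
    }
```

Comparing the report before and after a tuning change gives a numeric counterpart to the flame graph.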
17.9 Security considerations for advanced patterns
- Custom FC can bypass firewalls → update ACLs.
- Bulk write bursts may trigger IDS “flood” → whitelist IP.
- Protocol converters: validate payload to avoid MQTT injection.
- Async pipeline: ensure per-slave mutex on writes; simultaneous two-master writes corrupt state.
17.10 Best-practice checklist
✔︎ | Rule |
---|---|
☐ | Keep ≤ 4 in-flight TIDs per TCP socket to avoid congesting switch queues. |
☐ | For RTU async, never exceed a request rate of (baud / frame_len) × 0.7, with frame_len in bits on the wire. |
☐ | Document every user-defined FC; publish Wireshark dissector. |
☐ | Use token-bucket when bridging to MQTT/OPC UA to respect RTU bandwidth. |
☐ | Profile with real capture (Saleae) before claiming “deterministic”. |
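The RTU request-rate rule in the checklist can be turned into a quick calculation. The sketch assumes that frame_len means the total bits on the wire for one request plus its response, at 11 bits per character; the checklist itself does not define the term, so treat that reading as an assumption.

```python
def frame_bits(request_bytes: int, response_bytes: int, bits_per_char: int = 11) -> int:
    """Bits on the wire for one request/response pair."""
    return (request_bytes + response_bytes) * bits_per_char


def max_request_rate(baud: int, frame_len_bits: int, utilisation: float = 0.7) -> float:
    """Upper bound on requests per second: (baud / frame_len) x 0.7."""
    return baud / frame_len_bits * utilisation
```

For an 8-byte read request and a 25-byte response at 115 200 Bd, `max_request_rate(115200, frame_bits(8, 25))` gives roughly 222 requests per second; inter-frame gaps and slave processing time reduce the practical figure further.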
Chapter recap
- Async pipelines unlock major throughput gains, but require careful response matching and buffer limits.
- Bulk-data transfers are feasible via standard FC 16 or circular buffers; add an application-layer CRC when a transfer spans more than 252 bytes.
- User-defined FCs should be last resort and must come with rigorous documentation and test cases.
- A lean micro-service can safely translate Modbus to MQTT/OPC UA with rate-limiting and type mapping.
- Profiling tools—perf, Pyinstrument, Saleae—turn “it feels fast” into data-driven optimisation.
Assets to create
ID | Visual / file |
---|---|
Fig-17-1 | Async vs sync latency graph |
Fig-17-2 | Sliding-window bulk-transfer timing |
Fig-17-3 | Protocol-converter architecture |
Fig-17-4 | Flame-graph before/after tuning |
Listing 17-1..3 | Async RTU C ISR, Python pipeline, custom FC classes |
Docker-compose | RTU-to-MQTT bridge demo |
Next: Chapter 18 – Integrating Modbus with PLCs—we’ll configure real Siemens, Rockwell, and Beckhoff controllers as Modbus masters/slaves, map tags to registers, and write Ladder + Structured Text examples.