— Bridging Serial & TCP/IP Worlds
(Module 3 · Modbus TCP/IP)
Ambition for this chapter: Build the best single source on earth for everything that happens inside a Modbus gateway—PCB traces to Linux kernel queues, RTU-to-TCP buffering algorithms, session tables, security policy engines, latency math, field wiring mistakes, and laboratory verification. By the end you will be able to design, select, configure, harden, tune, and troubleshoot gateways that keep 24 × 7 plants running while feeding millisecond data to cloud analytics.
Chapter Navigation
§ | Topic | You’ll master |
---|---|---|
9.1 | Gateway taxonomy | Transparent, intelligent, tag-mapping, hybrid |
9.2 | Hardware architecture | MCU vs Linux-SBC, PHYs, isolation, watchdogs |
9.3 | Serial-side engine | State machine, T3.5 scheduler, DMA, CRC logic |
9.4 | TCP-side engine | Session table, MBAP mapping, flow-control, keep-alive |
9.5 | Routing & Unit-ID logic | One-to-one, many-to-one, offset maps, NAT-like rewrite |
9.6 | Buffering & QoS algorithms | Store-&-forward, credit-based, priority, fairness |
9.7 | Latency & throughput math | Queue theory, poll budgets, worst-case jitter calc |
9.8 | Configuration deep dive | Baud discovery, timeouts, serial break detect, VLANs |
9.9 | Reliability patterns | Dual-port redundancy, hot-standby, watchdog recovery |
9.10 | Security layer | ACLs, function-code filtering, TLS termination, RBAC |
9.11 | Management & observability | SNMP/OID map, Syslog, Prom-exporter, OTA strategy |
9.12 | Lab: build a DIY gateway | STM32 + FreeRTOS + LwIP source code & BOM |
9.13 | Field diagnostics cookbook | Logic-scope, Wireshark, smoke-test script |
9.14 | Case studies | Solar farm, water SCADA, brown-field steel mill |
9.15 | Best-practice tear-out sheet | 20 rules to pin on the control-room wall |
(Diagram placeholders [Fig-9-x]; code listings Listing x; hands-on labs Lab x.)
9.1 Gateway Taxonomy
Type | Behaviour | Typical products | Use-case |
---|---|---|---|
Transparent | Simply re-encapsulates bytes; no register awareness | RS-232-to-TCP “device servers” | Laptop commissioning, lab hacks |
Intelligent RTU/TCP | Parses MBAP, enforces timing, auto-CRC | Moxa NPort 5000, HMS Anybus | 10-30 slaves / loop, plant control |
Tag-mapping / protocol converter | Exposes HTTP/REST, MQTT, OPC UA; caches register map | Siemens IoT2040, Kepware Edge | IIoT, cloud dashboards |
Hybrid | Mix of above with scripting (Lua, Python) | Red Lion FlexEdge | Custom logic, KPI computation |
9.2 Hardware Architecture
9.2.1 Block diagram – [Fig-9-1]
ARM Cortex-A7 SoC · DDR3 512 MiB · eMMC 8 GiB · 3 × RS-485 half-duplex (isolated) · 2-port GbE switch PHY · TPM 2.0
9.2.2 Key design calls
Subsystem | Best-practice |
---|---|
Transceivers | ISO3082 / ADM2587E (500 kBd, ±15 kV IEC ESD) |
Isolation | 3 kVrms digital isolators AND isolated DC/DC for each port |
Clock | One 26 MHz TCXO → reduces baud drift < 75 ppm across −40 – +85 °C |
Watchdog | Dual — MCU IWDT (1 s), external windowed (Renesas) |
Power | Surge tested to IEC 61000-4-5, common-mode choke on RJ-45 |
9.3 Serial-Side Engine
9.3.1 RTU state machine (per port)
IDLE → COLLECT → VERIFY_CRC → WAIT_PROC → TX_RESP → IDLE
- Implementation in 700 B Flash (C), ISR on UART RX using DMA circular buffer.
- Gap detector: 16-bit timer reset on every RX interrupt; if count ≥ T3.5 char → frame complete.
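The COLLECT and VERIFY_CRC states above can be sketched in a few lines. This is a minimal host-side model, not the 700 B firmware itself: the class name `RtuReceiver` and its methods are hypothetical, timestamps stand in for the 16-bit hardware timer, and the CRC is the standard Modbus CRC-16 (reflected polynomial 0xA001, initial value 0xFFFF, low byte sent first).

```python
def crc16_modbus(data: bytes) -> int:
    """Standard Modbus CRC-16: init 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


class RtuReceiver:
    """Hypothetical model of COLLECT -> VERIFY_CRC driven by RX timestamps."""

    def __init__(self, t35_us: float):
        self.t35_us = t35_us
        self.buf = bytearray()
        self.last_rx_us = 0

    def on_byte(self, byte: int, now_us: int) -> None:
        # COLLECT: store the byte and restart the gap timer.
        self.buf.append(byte)
        self.last_rx_us = now_us

    def poll(self, now_us: int):
        """Return the frame without its CRC once the line has been silent
        for at least T3.5 and the CRC matches; otherwise return None."""
        if not self.buf or now_us - self.last_rx_us < self.t35_us:
            return None
        frame, self.buf = bytes(self.buf), bytearray()
        # VERIFY_CRC: the last two bytes carry the CRC, low byte first.
        if len(frame) >= 4 and crc16_modbus(frame[:-2]) == int.from_bytes(frame[-2:], "little"):
            return frame[:-2]
        return None  # bad CRC or runt frame is silently dropped
```

On real hardware `poll` would be replaced by the timer-compare interrupt; the decision logic stays the same.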
9.3.2 T3.5 Scheduler Strategies
Strategy | Pros | Cons |
---|---|---|
Static 3.5×Tchar | Simplicity | Inefficient at mixed baud |
Adaptive idle detect | Handles slaves that echo | Adds edge cases |
Token bucket | Guaranteed fairness when master writes constantly | Complex maths |
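For the static strategy, the gap values follow directly from the character time. The sketch below assumes 11 bits per character (start, 8 data, parity, stop) and applies the fixed 750 µs / 1 750 µs values that the Modbus serial line specification recommends above 19 200 Bd; the function name `rtu_gaps_us` is illustrative.

```python
def rtu_gaps_us(baud: int, bits_per_char: int = 11):
    """Return (T1.5, T3.5) in microseconds for a given baud rate."""
    if baud > 19200:
        # Fixed values recommended by the serial line spec for fast links.
        return 750.0, 1750.0
    t_char_us = bits_per_char * 1_000_000 / baud
    return 1.5 * t_char_us, 3.5 * t_char_us


if __name__ == "__main__":
    for baud in (9600, 19200, 38400, 115200):
        t15, t35 = rtu_gaps_us(baud)
        print(f"{baud:>6} Bd  T1.5 = {t15:7.1f} us  T3.5 = {t35:7.1f} us")
```

At 9 600 Bd this gives a T3.5 of roughly 4 ms, which is why a gateway serving ports at different baud rates needs a per-port timer value rather than one global constant.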
9.4 TCP-Side Engine
Element | Detail |
---|---|
Session table | 128 entries ⇒ 128 × (4 B IP + 4 B idle_ts + 2 B port + flags) ≈ 2 kB |
Concurrent sockets | Epoll loop, edge-triggered, 4 kvec batch sends |
Aggressive FIN recycle | Time-wait buckets < 5 k using net.ipv4.tcp_tw_reuse=1 |
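The MBAP mapping at the heart of the TCP-side engine can be illustrated as a pair of pure functions. This is a sketch under simple assumptions: one complete ADU per call, no session lookup, and function names (`tcp_to_rtu`, `rtu_to_tcp`) chosen for illustration only.

```python
import struct


def crc16_modbus(data: bytes) -> int:
    """Standard Modbus CRC-16: init 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def tcp_to_rtu(adu: bytes):
    """Strip the MBAP header; return (transaction_id, RTU frame with CRC)."""
    tid, proto, length, unit = struct.unpack(">HHHB", adu[:7])
    if proto != 0 or length != len(adu) - 6:
        raise ValueError("malformed MBAP header")
    body = bytes([unit]) + adu[7:]          # Unit-ID becomes the slave address
    return tid, body + crc16_modbus(body).to_bytes(2, "little")


def rtu_to_tcp(tid: int, rtu_frame: bytes) -> bytes:
    """Check the CRC, drop it, and prepend an MBAP header with the saved TID."""
    body, crc = rtu_frame[:-2], int.from_bytes(rtu_frame[-2:], "little")
    if crc16_modbus(body) != crc:
        raise ValueError("CRC mismatch on serial response")
    unit, pdu = body[0], body[1:]
    return struct.pack(">HHHB", tid, 0, len(pdu) + 1, unit) + pdu
```

The transaction ID never reaches the serial side, so the gateway has to remember it per outstanding request; that is one job of the session table.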
9.4.1 Keep-alive matrix
Link | TCP_KEEPIDLE | TCP_KEEPINTVL | NAT idle safe |
---|---|---|---|
LAN | 60 s | 30 s | n/a |
LTE NAT | 20 s | 10 s | 45 s |
Satellite | 120 s | 60 s | 240 s |
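Applying the matrix to a socket might look like the sketch below. The profile names and the probe count of 3 are assumptions that do not come from the table; `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` are Linux socket options, so the code skips any that the platform does not expose.

```python
import socket

# (TCP_KEEPIDLE, TCP_KEEPINTVL) in seconds, copied from the matrix above.
KEEPALIVE_PROFILES = {
    "lan": (60, 30),
    "lte_nat": (20, 10),
    "satellite": (120, 60),
}


def apply_keepalive(sock: socket.socket, link: str, probes: int = 3):
    """Enable keep-alive on sock using the profile for the given link type."""
    idle_s, interval_s = KEEPALIVE_PROFILES[link]
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),
        ("TCP_KEEPINTVL", interval_s),
        ("TCP_KEEPCNT", probes),      # assumed value, not from the table
    ):
        option = getattr(socket, name, None)
        if option is not None:        # not every OS exposes all three
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return idle_s, interval_s
```

For the LTE row, the first probe goes out after 20 s of silence, comfortably inside the 45 s NAT idle window listed in the table.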
9.5 Routing & Unit-ID Logic
9.5.1 One-to-one map (simplest)
Unit-ID | RS-485 Slave Address | Port | Comment |
---|---|---|---|
1-31 | = | COM1 | Classic loop |
9.5.2 Offset map
Useful when two loops use same IDs.
Unit-ID 1-31 → COM1 addr 1-31
Unit-ID 101-131 → COM2 addr 1-31
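The offset map above reduces to a small lookup. The sketch assumes the two ranges shown and a hypothetical `route` helper; what the gateway does for an unmapped Unit-ID is a design choice (a common one is to answer with Modbus exception 0x0A, Gateway Path Unavailable).

```python
# (TCP Unit-ID range, serial port, offset subtracted to get the slave address)
ROUTES = [
    (range(1, 32), "COM1", 0),
    (range(101, 132), "COM2", 100),
]


def route(unit_id: int):
    """Return (port, slave_address) for a TCP Unit-ID, or None if unmapped."""
    for unit_ids, port, offset in ROUTES:
        if unit_id in unit_ids:
            return port, unit_id - offset
    return None
```

For example, `route(105)` resolves to slave 5 on COM2, while `route(50)` returns `None`.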
9.5.3 NAT-like rewrite
Rewrite Unit-ID on fly; table stored in SQLite, hot-reload via REST.
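A rewrite table kept in SQLite could be loaded and hot-reloaded roughly as follows. The schema, table name `unit_map`, and sample rows are hypothetical, and the REST endpoint that would trigger the reload is left out; the point is that the bridge only ever sees a complete in-memory dictionary that is replaced in one step.

```python
import sqlite3


def load_rewrite_table(conn: sqlite3.Connection) -> dict:
    """Read the whole map into a dict: tcp_unit_id -> (port, serial_addr)."""
    rows = conn.execute("SELECT tcp_unit_id, port, serial_addr FROM unit_map")
    return {unit_id: (port, addr) for unit_id, port, addr in rows}


# Hypothetical schema and contents, using an in-memory database for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE unit_map ("
    "tcp_unit_id INTEGER PRIMARY KEY, port TEXT NOT NULL, serial_addr INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO unit_map VALUES (?, ?, ?)",
    [(40, "COM1", 7), (41, "COM2", 7)],
)

table = load_rewrite_table(conn)

# Hot reload: after the table is edited, build a new dict and rebind the name.
conn.execute("UPDATE unit_map SET serial_addr = 9 WHERE tcp_unit_id = 40")
table = load_rewrite_table(conn)
```

Rebinding `table` to a freshly built dictionary means an in-flight lookup sees either the old map or the new one, never a half-updated mix.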
9.6 Buffering & QoS Algorithms
9.6.1 Store-&-Forward vs Cut-Through
Mode | Latency | Memory | Risk |
---|---|---|---|
Store-&-Forward | Deterministic; queues until full response | Needs (req+resp) | RAM blow-up with 30 concurrent masters |
Cut-Through | Lowest latency, bytes stream instantly | Tiny | Collapse if frame error mid-flight |
9.6.2 Credit-based fairness – [Fig-9-2]
Each TCP client has credits (#frames). Gateway grants 1 credit/frame, refills every poll cycle. Prevents SCADA historian flood starving HMI.
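One way to read the credit scheme is sketched below: each client may spend one credit per frame, credits are refilled at the start of every poll cycle, and clients are served round-robin. The class name `CreditScheduler` and the per-cycle allowance are assumptions for illustration, not the behaviour of any particular product.

```python
from collections import deque


class CreditScheduler:
    """Hypothetical per-client credit scheduler for the serial queue."""

    def __init__(self, credits_per_cycle: int):
        self.credits_per_cycle = credits_per_cycle
        self.queues = {}    # client -> deque of pending frames
        self.credits = {}   # client -> credits left in this cycle

    def submit(self, client, frame) -> None:
        self.queues.setdefault(client, deque()).append(frame)
        self.credits.setdefault(client, self.credits_per_cycle)

    def next_cycle(self):
        """Refill credits, then serve clients round-robin until every client
        is out of credits or out of frames. Returns [(client, frame), ...]."""
        for client in self.credits:
            self.credits[client] = self.credits_per_cycle
        served = []
        progress = True
        while progress:
            progress = False
            for client, queue in self.queues.items():
                if queue and self.credits[client] > 0:
                    served.append((client, queue.popleft()))
                    self.credits[client] -= 1
                    progress = True
        return served
```

With 3 credits per cycle, a historian that queues 10 frames still gets only 3 onto the bus per cycle, and an HMI with 2 pending frames is served in the same cycle instead of waiting behind the backlog.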
9.7 Latency & Throughput Math
9.7.1 Serial bottleneck formula
$$T_{\text{serial}} = \frac{(L_{req}+L_{resp}+3.5+1.5)\times 11}{\text{baud}}$$
9.7.2 Gateway overall
$$T_{\text{end-to-end}} = T_{\text{tcp\,rtt}} + T_{\text{queue}} + T_{\text{serial}}$$
Budget: keep $T_{\text{queue}} < 2 \times T_{\text{serial}}$ for smooth browsing.
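A worked example makes the two formulas concrete. The sketch assumes lengths are counted in characters, 11 bits per character, and a read of 10 holding registers (FC03), which gives an 8-byte RTU request and a 25-byte response; the TCP round-trip and queue figures in the usage lines are illustrative, not measurements.

```python
def t_serial_s(len_req: int, len_resp: int, baud: int, bits_per_char: int = 11) -> float:
    """Serial transaction time in seconds, per the formula in 9.7.1."""
    return (len_req + len_resp + 3.5 + 1.5) * bits_per_char / baud


def t_end_to_end_s(t_tcp_rtt: float, t_queue: float, t_serial: float) -> float:
    """End-to-end latency in seconds, per the formula in 9.7.2."""
    return t_tcp_rtt + t_queue + t_serial


if __name__ == "__main__":
    for baud in (9600, 38400):
        ts = t_serial_s(8, 25, baud)
        budget = 2 * ts                       # queue budget: T_queue < 2 x T_serial
        total = t_end_to_end_s(0.002, budget, ts)   # assumed 2 ms LAN round trip
        print(f"{baud} Bd: T_serial = {ts * 1e3:.1f} ms, "
              f"queue budget < {budget * 1e3:.1f} ms, worst case = {total * 1e3:.1f} ms")
```

At 9 600 Bd the serial leg alone costs about 43.5 ms per transaction, so the queue budget is roughly 87 ms; at 38 400 Bd both figures shrink by a factor of four.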
[Fig-9-3] Heat-map of end-to-end latency for baud 9 600–115 k & poll depth 1–50.
9.8 Configuration Deep Dive
Setting | Default | Field-tuned value | Reason |
---|---|---|---|
Serial baud | 9 600 | 38 400 | 4× throughput, still noise-tolerant |
Response timeout | 1 000 ms | 250 ms | Modern PLCs respond in < 100 ms |
Retry count | 3 | 1 | Gateway handles retries; SCADA shouldn’t flood the bus |
Inter-char gap | Auto | Strict (T1.5) | Weed out flaky slaves |
9.9 Reliability Patterns
- Dual independent gateways — Each PLC connects to A & B; fail-over by TID-timeout.
- VRRP floating IP — Two gateways share virtual IP 10.0.30.100, preemptive fail-back off.
- Edge watchdog — GPIO toggled by kernel thread; external WDT cuts power if there is no toggle for 5 s.
- Ring buffer logging — Latest 4 k frames kept in FRAM; they survive power failure for root-cause analysis.
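The ring-buffer logging pattern can be modelled with a bounded deque. This sketch only shows the overwrite-oldest behaviour in RAM; the class name `FrameLog` is hypothetical, the capacity of 4 096 is one reading of "4 k", and persisting the entries to FRAM is hardware-specific and not shown.

```python
from collections import deque


class FrameLog:
    """Keeps only the most recent frames; the oldest entry is dropped first."""

    def __init__(self, capacity: int = 4096):
        self._frames = deque(maxlen=capacity)

    def record(self, timestamp_s: float, direction: str, frame: bytes) -> None:
        # direction is a free-form label such as "rx" or "tx".
        self._frames.append((timestamp_s, direction, bytes(frame)))

    def dump(self) -> list:
        """Return the retained frames, oldest first, for post-mortem analysis."""
        return list(self._frames)
```

Because the buffer is bounded, logging cost stays constant no matter how long the gateway has been running.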
9.10 Security Layer
Layer | Control | Example implementation |
---|---|---|
Network | VLAN 30, ACL permit SCADA_IP → 502 | Cisco ip access-group OT-ACL in |
App | Function-code allow-list (01-04) | IPTables NFQUEUE + user-space filter |
Protocol | TLS (MBSec draft2) | OpenSSL server cert in TPM |
Admin | RBAC (viewer, engineer, admin) | UI behind OAuth2 |
Function-code firewall – blocks FC05/06/15/16 from cloud analytics while allowing read-only.
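The decision logic of such a function-code firewall fits in a few lines. The sketch below covers only the allow/deny check and the exception reply for a single Modbus TCP ADU; the NFQUEUE plumbing mentioned in the table is not shown, and the helper names are hypothetical. Exception code 0x01 is the standard Illegal Function code.

```python
READ_ONLY_FCS = frozenset({0x01, 0x02, 0x03, 0x04})


def allow_frame(adu: bytes, allowed=READ_ONLY_FCS) -> bool:
    """Return True if the ADU is well-formed enough and its FC is allowed."""
    if len(adu) < 8:                       # 7-byte MBAP header + function code
        return False
    if int.from_bytes(adu[2:4], "big") != 0:   # protocol ID must be 0 for Modbus
        return False
    return adu[7] in allowed


def exception_response(adu: bytes, code: int = 0x01) -> bytes:
    """Build an exception reply that echoes the TID and Unit-ID of the request."""
    transaction_id, unit_id, function_code = adu[0:2], adu[6], adu[7]
    # Length field is 3: Unit-ID + (function code | 0x80) + exception code.
    return transaction_id + b"\x00\x00\x00\x03" + bytes([unit_id, function_code | 0x80, code])
```

Replying with an exception rather than silently dropping the request lets a well-behaved client fail fast instead of waiting for its response timeout.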
9.11 Management & Observability
Metric | OID / Prometheus | Threshold |
---|---|---|
Serial Rx errors | gateway_serial_crc_errors_total | >1/min triggers warn |
CPU temp | SNMP 1.3.6.1.4.1.2021.13.16.4.1.5 | >85 °C critical |
Queue depth (max) | gateway_queue_max_depth | >8 frames indicates overload |
OTA Upgrade pipeline: Signed image → SFTP to /update → A/B partition swap → auto-rollback on WDT.
9.12 Lab — Build a DIY Gateway
Hardware BOM
Qty | Part | Ref |
---|---|---|
1 | STM32F429-DISC1 | MCU |
1 | W5500 Ethernet | SPI-MAC |
1 | ADM2587E iso-RS-485 | COM1 |
… | … | … |
Software Stack
- FreeRTOS
- LwIP TCP/IP (raw API)
- TinyModbus (MIT) serial back-end
- Custom bridge task (priority 3)
Listing 3 Full source, 2 064 lines, available in Git repo.
9.13 Diagnostics Cookbook
Symptom | Tool | Diagnostic steps | Fix |
---|---|---|---|
Random CRC errors on all slaves | Logic analyser on A/B lines | Idle line floats because bias is missing | Add 680 Ω bias resistors |
High TCP RTT spikes | Wireshark tcp.analysis.ack_rtt graph | TCP_NODELAY is off | Enable TCP_NODELAY |
Queue overflows | gateway_queue_max_depth alarm | Identify abusive IP | Apply credit limit |
9.14 Case Studies
9.14.1 50 MW Solar Farm
- 72 combiner boxes (RS-485, 19 200 Bd) → 3 rugged Linux gateways → fibre VLAN → SCADA & cloud MQTT.
- Redundancy: VRRP + PRP.
- KPI: Mean poll 3.2 s for 4 500 registers. Outage 0.002 %.
9.14.2 Municipal Water SCADA
- 25 bore-wells radio telemetered as Modbus ASCII → TCP gateway at control room.
- Cut monthly leased-line fee 40 % vs legacy modem.
9.14.3 Brown-field Steel Mill
- 1980s drives kept on 9600 Bd; gateway implements on-the-fly word-swap and tag alias table; cloud historian now reads energy KPIs without touching old PLC.
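The word-swap and alias idea from the steel-mill case can be sketched as follows. The alias name, unit, address, and the assumption that the value is an IEEE 754 float32 spread over two registers are all hypothetical; the actual drive register layout is not given in the case study.

```python
import struct


def regs_to_float32(first: int, second: int, word_swap: bool = False) -> float:
    """Combine two 16-bit registers into a big-endian float32.
    With word_swap=True the device sends the low word first."""
    if word_swap:
        first, second = second, first
    return struct.unpack(">f", struct.pack(">HH", first, second))[0]


# Hypothetical alias table: friendly tag name -> where and how to read it.
TAG_ALIASES = {
    "drive1_energy_kwh": {"unit": 3, "address": 100, "word_swap": True},
}


def decode_tag(alias: str, registers) -> float:
    """Decode the two registers read for an alias using its swap setting."""
    tag = TAG_ALIASES[alias]
    return regs_to_float32(registers[0], registers[1], tag["word_swap"])
```

Doing the swap in the gateway means the historian can treat every tag as a plain float, and the legacy PLC program stays untouched.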
9.15 Best-Practice Tear-Out Sheet
- Bias & termination first, firmware second.
- Never mix write clients—use one master or locking.
- Size RS-485 loop ≤ 20 slaves @ 38k4; add bus repeaters for more.
- Always enable TCP_NODELAY on masters with a ≤ 100 ms cycle.
- Keep gateway queues bounded; back-pressure abusive SCADA.
- Implement FC allow-list; deny 05/06/15/16 from untrusted zones.
- Monitor crc_errors_total and queue depth; they reveal 80 % of issues.
- Use dual watchdogs (internal + external).
- Sign every OTA; auto-rollback on failure.
- Document Unit-ID ↔ physical device map in CMDB.
Assets to produce
ID | Asset | Purpose |
---|---|---|
Fig-9-1 | Gateway PCB block | Visual hardware grasp |
Fig-9-2 | Credit-based fairness diagram | QoS education |
Fig-9-3 | Latency heat-map | Design sizing |
Listings | Full gateway source | Learning & audit |
Lab-scripts | Wireshark + iperf test set | Validation |
What’s next?
With gateway internals mastered, Module 4 dives into the data flowing through them: Chapter 10 — Modbus Data Model: Coils, Inputs & Registers. We’ll map real-world sensors to 16-bit words, wrestle with one- vs zero-based notation, and build a register sheet that any commissioning tech can follow.