One Control Engineer’s Tedious Fight With Intermittent Modbus RTU Communications Failures And When to Just Give Up

Summary: A Control Engineer with a Modbus RTU network consisting of a PLC master and two slave meters reported inexplicable failures. After hours, days, or even weeks of correct operation, the meters mysteriously failed, requiring a manual power cycle. This guide explores common causes of Modbus network failures and how to resolve them.

Common Causes of Modbus RTU Network Failures

Modbus RTU networks fail because of a range of hardware and software issues.

Hardware Issues in Modbus RTU Networks

  1. Wrong wiring – Using the wrong type of wire. It’s hard to believe, but I once had a customer who wired his Modbus RTU network with speaker wire.
  2. Bad terminations – A common failure is a loose termination somewhere on the network. Over time, especially in a high-vibration environment, terminations can loosen. Often, the connection is not loose enough to fail outright, just loose enough to connect intermittently. One bad termination can ruin your whole day, as it’s nearly impossible to find in a big network using RS485 communications.
  3. Missing terminating resistors – You need a 120 Ohm terminating resistor at each end of the network, not at every device. The terminating resistors prevent signal reflections and ensure data integrity. Unfortunately, short networks (under 50 m) often work without terminating resistors, so some users get into the habit of leaving them off every network, including the long ones that need them.
  4. Mislocated terminating resistors – Using terminating resistors in the middle of the network is not a difficult mistake to make. Some devices come with embedded terminating resistors that you must disconnect when that node is not at the end of the network. It’s easy to overlook that. Other times, devices are added to a network without moving the terminating resistor.
  5. Wrong terminating resistor – Using a terminating resistor that isn’t 120 Ohms is a common mistake. As a shortcut, some users grab whatever resistor is in the toolbox instead of installing the correct one.
  6. Excessive cable length – The maximum cable length of an RS485 network is about 4,000 ft, but that length decreases significantly if your Modbus network is operating at higher data rates.
  7. Routing Modbus cable near power lines – EMI from nearby power lines or motors can negatively affect Modbus RTU networks. Modbus networks can experience intermittent failures when a nearby motor turns on.

Software Issues in Modbus RTU Networks

  1. Incorrect baud rate – All devices on the network must use the same baud rate as the Modbus RTU Master. Any device with the wrong baud rate won’t respond to Modbus RTU Request messages.
  2. Incorrect Modbus network IDs – An RTU Master device must have the correct list of Modbus RTU Slave IDs to successfully message slave devices.
  3. Duplicate Modbus network IDs – An RTU Master will likely get a timeout if two Modbus RTU slaves are set to the same Modbus network ID.
  4. Incorrect network timing – Within a message, the gap between consecutive bytes must not exceed 1.5 character times. That applies to the Modbus master and to every RTU slave device. Between messages, the line must be silent for at least 3.5 character times, and Modbus RTU Master devices, like the one in this case, must wait out that gap before sending the next request. If the PLC Modbus RTU master uses sloppy network timing, it could cause unexpected issues with Modbus meters on the problem network.
  5. Register and coil addressing – Modbus RTU Masters must know which addresses can be polled in each Modbus RTU slave device. When gaps exist in consecutive addresses, the results are unpredictable. For example, one Modbus device with registers 40002 and 40004 may respond to a request for three registers starting at 40002 with a zero for the undefined register, while another may fail the request because 40003 isn’t defined.
  6. Insufficient response timeout – Modbus RTU master devices may need a longer response timeout for devices with older processors. If the response timeout setting in the PLC is right on the edge of when a meter can respond, there will be intermittent polling failures. Those failures should not lead to device failures in a properly designed Modbus RTU device. Unfortunately, some PLC Modbus masters have a fixed response timeout that the user cannot change.
  7. Firmware errors – Modbus slave devices can experience memory leaks and other issues, leading to intermittent or complete device failure.
  8. Power quality – Power spikes can cause indeterminate failures on factory floor devices.
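
The register-gap problem in item 5 can often be avoided on the master side by never issuing a read that spans an undefined address. The following is a minimal Python sketch of that idea, not the configuration method of any particular PLC: it groups the registers a slave actually defines into contiguous (start, count) reads. The function name `contiguous_blocks` is hypothetical, and the default limit of 125 registers per request assumes function code 03 (Read Holding Registers).

```python
from typing import Iterable, List, Tuple


def contiguous_blocks(addresses: Iterable[int], max_count: int = 125) -> List[Tuple[int, int]]:
    """Group register addresses into (start, count) reads that never span a gap.

    max_count defaults to 125, the most registers a single
    Read Holding Registers (function 03) request may ask for.
    """
    blocks: List[Tuple[int, int]] = []
    for addr in sorted(set(addresses)):
        if blocks:
            start, count = blocks[-1]
            # Extend the current block only if this address is the next one
            # in sequence and the block still has room.
            if addr == start + count and count < max_count:
                blocks[-1] = (start, count + 1)
                continue
        blocks.append((addr, 1))
    return blocks


# The device from the example above defines only 40002 and 40004,
# so the master issues two single-register reads instead of one read of three.
print(contiguous_blocks([40002, 40004]))

# With 40003 defined as well, a single three-register read is safe.
print(contiguous_blocks([40002, 40003, 40004]))
```

The tradeoff is more requests on the wire in exchange for predictable behavior across devices that handle undefined registers differently.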

My Action Plan for Resolving the Intermittent Modbus RTU Network Failure

The action plan suggested to the user for this network failure included:

  1. Evaluate the wiring – Ensure the network uses Belden 9841 or similar shielded, twisted-pair (STP) cable and that the cable length is within specification for the selected network baud rate.
  2. Tighten all terminations – Examine every wire termination and either tighten it or replace it.
  3. Check the terminating resistors – Verify that there is exactly one 120 Ohm terminating resistor at each end of the network, none in the middle, and that each one really measures 120 Ohms.
  4. Evaluate wire routing – Look for possible interference sources from motors or power cables.
  5. Lower the baud rate – Test a lower baud rate. Higher baud rates can exceed the ability of older Modbus devices to respond correctly and on time.
  6. Verify network timing – Use an oscilloscope to verify that all message bytes are within 1.5-character times and that there is a 3.5-character gap between messages.
  7. Increase the response timeout at the RTU Master – Test a longer response timeout at the Modbus RTU network master (if that’s possible in this PLC). The response timeout can be too short for one of the Modbus RTU network nodes. If the problem persists, keep increasing the response timeout until it is resolved.
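
For the oscilloscope check in step 6, it helps to know the target numbers before probing the line. The sketch below computes them in Python under two assumptions: each character is 11 bits on the wire (1 start, 8 data, 1 parity and 1 stop, or 2 stop bits with no parity), and, for baud rates above 19200, the fixed values of 750 µs and 1.750 ms recommended by the Modbus serial line specification are used instead of scaled character times. The function name `rtu_timing_us` is illustrative only.

```python
def rtu_timing_us(baud: int, bits_per_char: int = 11) -> dict:
    """Return Modbus RTU timing targets in microseconds for a given baud rate.

    char : time to send one character
    t1_5 : maximum allowed gap between bytes inside a message
    t3_5 : minimum silent gap between messages
    """
    char_us = bits_per_char / baud * 1e6
    if baud > 19200:
        # Fixed values recommended by the serial line spec for fast links.
        t1_5, t3_5 = 750.0, 1750.0
    else:
        t1_5, t3_5 = 1.5 * char_us, 3.5 * char_us
    return {"char": char_us, "t1_5": t1_5, "t3_5": t3_5}


for baud in (9600, 19200, 38400):
    t = rtu_timing_us(baud)
    print(f"{baud:>6} baud: char {t['char']:.0f} us, "
          f"max inter-byte gap {t['t1_5']:.0f} us, "
          f"min inter-message gap {t['t3_5']:.0f} us")
```

At 9600 baud this works out to roughly 1.72 ms for the maximum gap between bytes and 4.01 ms for the minimum gap between messages. A byte gap longer than the first number, or a message gap shorter than the second, points to sloppy timing in the master or a slave.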

The user reported that all terminations appeared tight, that the wire type and gauge were correct, that the terminating resistors were correct, and that changing the network timeout neither increased nor decreased the frequency of the problem. Since no other devices were affected, the user ruled out power quality as a possibility.

These tests ruled out many of the usual sources of Modbus RTU failures and none revealed the problem. Frankly, it wasn’t expected that they would, since most of these issues would not be resolved by a power cycle, though I have seen sporadically failing wire terminations and timing issues that caused all sorts of intermittent havoc.

The user evaluated proceeding with procuring test equipment, adding instrumentation and other Modbus devices to conduct a more exhaustive analysis, but time and cost were already excessive.

Failing that, the final two options were to replace either or both meters or implement a workaround.

How To Work Around a Modbus RTU Intermittent Failure Like This

If replacing devices is not an option, the Control Engineer could mask the problem by powering the device (the Modbus meter in this case) through a Normally Closed relay contact and having a PLC energize the relay to automatically power cycle the meter when it detects a communication failure.
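
The decision logic behind that workaround can be sketched as a small state machine. The Python below is only an illustration of the idea, not the user's PLC program: the class name `PowerCycleWatchdog`, its fields, and the default counts are all hypothetical, and a real implementation would live in ladder logic or structured text with thresholds tuned to the meter. It is called once per poll and returns True while the relay coil should be energized, which opens the Normally Closed contact and removes power from the meter.

```python
from dataclasses import dataclass


@dataclass
class PowerCycleWatchdog:
    fail_threshold: int = 5   # consecutive failed polls before cycling power (assumed)
    off_polls: int = 3        # polls to hold the meter unpowered (assumed)
    boot_polls: int = 10      # polls to ignore while the meter reboots (assumed)
    failures: int = 0
    off_remaining: int = 0
    boot_remaining: int = 0
    cycles: int = 0           # how many power cycles have been triggered

    def update(self, poll_ok: bool) -> bool:
        """Call once per poll; returns True while the relay should be energized."""
        if self.off_remaining == 0 and self.boot_remaining == 0:
            # Normal monitoring: count consecutive failures, reset on success.
            self.failures = 0 if poll_ok else self.failures + 1
            if self.failures < self.fail_threshold:
                return False
            # Threshold reached: start a power cycle.
            self.failures = 0
            self.cycles += 1
            self.off_remaining = self.off_polls
        if self.off_remaining > 0:
            # Meter is being held off; relay stays energized.
            self.off_remaining -= 1
            if self.off_remaining == 0:
                self.boot_remaining = self.boot_polls
            return True
        # Meter is rebooting; ignore poll results until it has had time to start.
        self.boot_remaining -= 1
        return False


wd = PowerCycleWatchdog(fail_threshold=3, off_polls=2, boot_polls=2)
results = [False, False, False, False, False, False, True]
print([wd.update(ok) for ok in results], "cycles:", wd.cycles)
```

Requiring several consecutive failures keeps a single missed poll from cutting power, and ignoring polls during reboot keeps the watchdog from retriggering while the meter is still starting up. Logging the cycle count also gives a record of how often the failure actually occurs.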

Yes, this is addressing the symptom rather than the cause, but how much labor is this problem worth? A non-critical, ancillary network is often not worth the excessive labor required to solve intermittent problems.

Knowing When to Give Up: How the Intermittent Failure was Resolved

The user chose not to resolve the failure. It was determined that the likely failure source was meter firmware, and upgrading it would have been excessively costly because the units are obsolete and no longer supported. After weighing the options of replacing the meters or conducting more exhaustive tests at more expense, the user chose the relay workaround to mask the problem.

Conclusion: Winning the “Tedious Fight” Through Pragmatism

Engineering decisions always involve tradeoffs. Endless investigation is warranted only when safety is compromised. With other types of issues, there is always a point at which cost dictates that you either live with the problem or find a quick workaround. In this case, it was the workaround.

Have questions or need more information?
Visit the RTA Learning Center or contact an RTA Enginerd application engineer at 262-436-9299 or solutions@rtautomation.com.