You may have EtherNet/IP performance problems and not even know it. In this article, I will describe the history behind some of the physical layer technology, explain why you may have hidden performance problems, and cover what you can do about them.
In the early days of Ethernet, there was coax cable, and we used 10Base2 (known as ThinNet) and 10Base5 (known as ThickNet) cabling. Those cables had a single signal conductor, and all devices transmitted and received on it. With a single shared conductor, two devices couldn’t use the media concurrently, so CSMA/CD was born. CSMA/CD (Carrier Sense Multiple Access with Collision Detection) is a system in which devices listen to the wire while they transmit, and if they detect a collision – bits on the wire they didn’t send – they back off and try again later. Another way that designers attacked this problem was to use Half-Duplex systems in which all the devices coordinated: only one device (usually a master) transmitted at a time, and all the other devices (slaves) stayed in receive mode when not transmitting. But that kind of Half-Duplex architecture wasn’t well suited for an IT environment with lots of computers sending and receiving.
The first non-shared medium was 10BaseT, introduced in the late 1980s. There were now two pairs of wires: one pair for transmit and one pair for receive. But designers wanted a way to preserve the Half-Duplex operation of 10Base2 and 10Base5, so hardware was designed that mimicked the operation of coax. When transmitting on the transmit pair, if the hardware detected a message coming in on the receive pair, it acted as if there was a collision and stopped transmitting. Just like with CSMA/CD operation on coax, it waited a random interval of time and tried again.
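The "random interval" in CSMA/CD is not arbitrary. As a rough sketch, the function below follows the truncated binary exponential backoff rule from the IEEE 802.3 standard (the article itself does not spell out these numbers): after the n-th collision, a device waits a random whole number of slot times between 0 and 2^min(n, 10) - 1, and gives up on the frame after 16 attempts. The function names are illustrative only.

```python
import random

SLOT_TIME_BITS = 512   # one slot time is 512 bit times at 10 and 100 Mb/s
MAX_ATTEMPTS = 16      # the frame is abandoned after 16 failed attempts
BACKOFF_LIMIT = 10     # the random range stops growing after 10 collisions


def backoff_slots(collision_count, rng=random):
    """Return how many slot times to wait after the given collision number."""
    if not 1 <= collision_count < MAX_ATTEMPTS:
        raise ValueError("the frame is abandoned after 16 attempts")
    k = min(collision_count, BACKOFF_LIMIT)
    return rng.randrange(2 ** k)   # uniform integer in [0, 2**k - 1]


def backoff_microseconds(collision_count, bit_rate_bps=10_000_000, rng=random):
    """Convert the backoff into microseconds for a given link speed."""
    slots = backoff_slots(collision_count, rng)
    return slots * SLOT_TIME_BITS * 1_000_000 / bit_rate_bps
```

At 10 Mb/s one slot time is 51.2 microseconds, so the first retry waits either 0 or 51.2 microseconds, and the possible range doubles with each further collision until it is capped.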
With today’s Ethernet, we don’t need this kind of Half-Duplex mechanism, but the capability is still built into every Ethernet device. It’s interesting that technologies like Modbus, which were Half-Duplex when conceived (Modbus RTU), are now, after adapting to Ethernet, still master-slave but running on a Full-Duplex link. Modbus TCP retains its Half-Duplex origins but uses them on a link where it can get much more throughput.
The hidden problem that you may have with any of your manufacturing Ethernet networks – and not know it – is that when you rely on the Auto-Negotiation feature, your switch and your device may sometimes power up and use different duplex settings. Your switch may configure itself for Full-Duplex (FD) while your device may configure itself for Half-Duplex (HD).
If that happens, your switch, running Full-Duplex, is perfectly happy and everything appears to be working properly[1]. It transmits without a problem and receives without a problem. The issue is on the device side, which is using Half-Duplex. It will occasionally (or often) detect “soft” collisions. A soft collision occurs when an HD device is transmitting on its transmit pair and starts to receive a message on its receive pair. When that happens, the device aborts the transmission and tries again later.
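To make the soft collision concrete, here is a minimal sketch that counts how many of an HD device's transmissions would be aborted. It assumes frames are given as hypothetical (start, end) time tuples in any consistent unit, and it treats a transmission as aborted only when an incoming frame starts while the transmission is already under way, as described above.

```python
def count_soft_collisions(tx_frames, rx_frames):
    """Count half-duplex transmissions interrupted by the start of a receive.

    tx_frames: (start, end) tuples for frames the HD device sends.
    rx_frames: (start, end) tuples for frames the FD partner sends,
               which it does whenever it likes.
    """
    aborted = 0
    for tx_start, tx_end in tx_frames:
        # Soft collision: a receive begins while this transmit is in progress.
        if any(tx_start <= rx_start < tx_end for rx_start, _ in rx_frames):
            aborted += 1
    return aborted


# Example: the first and third transmissions are interrupted.
tx = [(0, 10), (20, 30), (40, 50)]
rx = [(5, 8), (35, 38), (45, 60)]
print(count_soft_collisions(tx, rx))  # prints 2
```

The Full-Duplex side never runs this check, which is why it keeps sending and the Half-Duplex side keeps backing off.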
You can have just a few of these collisions, or you may get a lot, but the switch–device communication is still operational, and messages do get through. They just get delayed, and you lose bandwidth. You generally don’t know this unless you monitor the EtherNet/IP response time closely. Sometimes, the situation will correct itself on the next power cycle and both devices will use Full-Duplex. This is a problem that can be very sporadic and hard to duplicate.
How does this happen? Usually, it’s a failure in auto-negotiation. When power is applied or a link comes up, Ethernet devices negotiate their duplex and speed settings. That negotiation mostly works properly, but it’s been known to fail into a state where one device thinks it’s negotiated Full-Duplex and the other thinks it’s negotiated Half-Duplex. You’ve now got a problem that isn’t readily apparent because everything still works. It just isn’t as efficient as it could be.
This becomes a much more significant problem if you are using linear segments where you hang a long series of devices off one port on a switch. Somewhere down the line, two of the devices have a mismatch, and it now affects the bandwidth of the entire segment. With a lot of devices on the line, the problem might become noticeable.
So, how can you detect this? There are two mechanisms. The first is switch diagnostics. If the switch is on the Half-Duplex side of the mismatch, it will be detecting a lot of collision errors and should be reporting those errors in its statistics. If it’s on the Full-Duplex side of the mismatch, it will be detecting fragmented packets. This is where investing in a better brand of switches makes a lot of sense. Some switches have better diagnostics that you can use to identify these sorts of problems. Others can even trap these errors and signal them in various ways. The second is to use devices with SNMP (Simple Network Management Protocol). With SNMP[2] you can interrogate the devices and check the link status and statistics of all the devices on the network. You can sometimes have a device raise an SNMP error when it detects a lot of collisions.
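The counter checks described above can be sketched in a few lines. This is only an illustration: the field names, the threshold of 10, and the returned strings are all hypothetical, and the counters are assumed to have been read already, whether from the switch's diagnostics page or over SNMP (for example, the EtherLike-MIB late collision and FCS error counters, or the RMON fragment counter, where a device supports them).

```python
from dataclasses import dataclass


@dataclass
class PortSample:
    """One reading of a port's counters (all field names are hypothetical)."""
    duplex: str            # "half" or "full", as the port itself reports it
    late_collisions: int   # e.g. dot3StatsLateCollisions
    fcs_errors: int        # e.g. dot3StatsFCSErrors
    fragments: int         # e.g. etherStatsFragments


def mismatch_hint(prev, curr, threshold=10):
    """Compare two samples of one port and flag a possible duplex mismatch."""
    if curr.duplex == "half":
        # The half-duplex side sees collisions it should never see on a healthy link.
        growth = curr.late_collisions - prev.late_collisions
        if growth >= threshold:
            return "suspect: this port is the half-duplex side"
    elif curr.duplex == "full":
        # The full-duplex side sees frames its partner aborted partway through.
        growth = (curr.fcs_errors - prev.fcs_errors) + (curr.fragments - prev.fragments)
        if growth >= threshold:
            return "suspect: this port is the full-duplex side"
    return "no sign of mismatch"
```

Comparing two samples rather than looking at absolute values matters because these counters only ever increase; a port that had a problem last month and was since fixed will still show a large total.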
Another option, if you’re using EtherNet/IP linear segments, is to use Device Level Ring (DLR) capable devices. With DLR, your device supports a DLR object in the EtherNet/IP object model. That object has link statistics that you can interrogate to identify if you are having auto-negotiation issues.
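As a hedged sketch of what interrogating a device might look like: many EtherNet/IP devices report duplex and negotiation state in the Interface Flags attribute of the Ethernet Link object. The decoder below assumes the bit layout as it is commonly documented for that attribute (bit 0 link status, bit 1 duplex, bits 2 to 4 negotiation status). Treat both the layout and the choice of object as assumptions to verify against your device's documentation. Reading the attribute over the network is not shown.

```python
# Negotiation status values as commonly documented for bits 2-4 (assumption).
NEGOTIATION_STATUS = {
    0: "auto-negotiation in progress",
    1: "auto-negotiation and speed detection failed, defaults in use",
    2: "auto-negotiation failed, speed detected, duplex defaulted",
    3: "speed and duplex negotiated successfully",
    4: "auto-negotiation not attempted, speed and duplex forced",
}


def decode_interface_flags(flags):
    """Decode the low bits of an Interface Flags value into readable fields."""
    return {
        "link_up": bool(flags & 0x01),
        "duplex": "full" if flags & 0x02 else "half",
        "negotiation": NEGOTIATION_STATUS.get((flags >> 2) & 0x07, "reserved"),
    }
```

A value such as 0b01001 decodes to link up, Half-Duplex, with negotiation status 2. That combination, on a link whose switch port reports Full-Duplex, is the kind of result worth investigating.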
Another possibility is to standardize on a switch manufacturer that allows you to force the duplex to Full-Duplex. If you have the same capability in your EtherNet/IP devices, you should use it and force them to Full-Duplex as well. Force both ends of a link, because forcing only one end while the other still auto-negotiates is itself a common cause of a mismatch. If you leave the duplex setting to chance, you may have a subpar EtherNet/IP network and not even know it. Of course, if your facility is using hundreds of switches, configuring every switch is not manageable. In that case, it’s best to rely on the auto-negotiation feature to set the duplex setting.
This article is the third in my series of articles on what every control engineer needs to know about IT. In the next article in this series, you will learn about EtherNet/IP classes and how to best use them in an industrial network.
[1] If your switch is capable of detecting message fragments that never form complete messages, the message fragment count will increase on these errors.
[2] SNMP isn’t much help on linear segments where a number of devices are cascaded off of one switch port.