AUTOMOTIVE ELECTRICAL & AIR CONDITIONING NEWS
This is the third in a series of articles by Jason Turner and Clinton Smith from REDARC Technologies. They are writing about the latest in CAN and developments in this rapidly changing field.
In the last article we discussed some common types of CAN, or more correctly, common ‘CAN physical layer standards’. In this, the third article, we will consider the data side of CAN.
First up we have arbitration; that is, what does a CAN bus do when two or more devices attempt to transmit at the same time? In most types of networks, if multiple devices transmit a message at the same time then what is called a ‘collision’ occurs and the two messages are jumbled together. The devices then notice the message is wrong and so they discard it and wait for a randomly generated time before trying to transmit again. Since this time is random, one of the devices will begin retransmission first and so the other will have to wait, with no regard given to the importance of the messages. This is called ‘destructive’ arbitration because the original message is destroyed when the collision occurs.
CAN has a better way of arbitrating known as non-destructive bit-wise arbitration as shown in the diagram below. This is how it works. Assume there are three nodes attempting to transmit all at the same time. If all of them transmit bits of the same polarity, as they do in the example, then there is no problem.
But then ECU2 attempts to transmit a recessive bit, while the other two transmit a dominant bit. Since dominant bits win out as we discussed earlier, we can see that what appears on the bus is a dominant bit. ECU2 notices that what it transmitted does not match what it is reading back, telling it that another node must be transmitting as well, so it switches to listening only and stops transmitting.
Next ECU3 attempts to transmit a recessive bit while ECU1 transmits a dominant bit. Again, the dominant bit wins and ECU3 stops transmitting.
Now ECU1 is the only device transmitting on the bus so it continues as it was, completely oblivious that any other nodes were attempting to transmit, and no collision occurred so no message is lost. When transmission is complete, ECU2 and ECU3 will attempt to transmit their messages again.
Since dominant bits win, this means that messages with more dominant bits at the start will win arbitration more often, allowing for priority order to be established.
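The three-node example above can be sketched in a few lines of Python. This is only an illustration, not how a CAN controller is implemented: it assumes the usual convention that a dominant bit is 0 and a recessive bit is 1, and the function name `arbitrate` and the bit patterns are made up for the example.

```python
DOMINANT, RECESSIVE = 0, 1  # assumed convention: dominant = 0, recessive = 1


def arbitrate(frames):
    """Simulate bit-wise arbitration.

    frames maps a node name to the list of bits it wants to send.
    Returns (names of nodes still transmitting, bits seen on the bus).
    """
    contenders = set(frames)
    bus = []
    length = min(len(bits) for bits in frames.values())
    for i in range(length):
        # The bus is dominant if any transmitting node drives a dominant bit.
        level = min(frames[name][i] for name in contenders)
        bus.append(level)
        # A node that sent recessive but reads back dominant stops transmitting.
        contenders = {name for name in contenders if frames[name][i] == level}
    return sorted(contenders), bus


# Hypothetical bit patterns: ECU2 loses at the third bit, ECU3 at the fourth.
frames = {
    "ECU1": [0, 1, 0, 0, 0],
    "ECU2": [0, 1, 1, 0, 1],
    "ECU3": [0, 1, 0, 1, 0],
}
winners, bus = arbitrate(frames)
print(winners, bus)
```

Note that the bits on the bus are exactly the bits ECU1 sent, which is why its message survives untouched.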
In order for this arbitration method to work, the messages must be in sync. However, since CAN is asynchronous (meaning there is no separate clock signal to keep the nodes in sync) each node must synchronise its clock at the start of a message and then on every falling edge resynchronise to ensure that it counts the number of bits correctly.
The problem is, if there is a long stream of bits of the same type, then eventually the nodes will go out of sync. For example, if the bit rate is 250kbps (bit time of 4μs), and there is 76μs of recessive bits measured between two dominant bits, then how many recessive bits is this? Nominally it is 19 (76μs divided by 4μs), but one node may count 20 and another only 19 because the latter one's clock is slightly slower than the former's.
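To put rough numbers on this, here is a small Python sketch. It is a simplification: it assumes a node just divides the gap by its own idea of the bit time and rounds to the nearest bit, and the 3% clock error is a made-up figure chosen to show the effect, not a value from any standard.

```python
NOMINAL_BIT_TIME_US = 4.0  # 250 kbps
GAP_US = 76.0              # time between the two dominant bits


def count_bits(duration_us, bit_time_us):
    """Estimate how many bit times fit in a gap, rounded to the nearest bit."""
    return round(duration_us / bit_time_us)


def count_with_drift(drift):
    """Bits counted by a node whose clock runs fast by the given fraction.

    A fast clock means the node's own bit time is shorter than nominal.
    """
    return count_bits(GAP_US, NOMINAL_BIT_TIME_US / (1 + drift))


print(count_with_drift(0.0))   # node with an accurate clock
print(count_with_drift(0.03))  # hypothetical node running 3% fast
```

With no drift the node counts 19 bits, while the hypothetical fast node counts 20, so the two nodes disagree about the contents of the message.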
To prevent this, CAN uses a method called bit stuffing as shown in the diagram below. The way this method works is that after five consecutive bits of the same polarity, a bit of the opposite polarity is inserted. This means that resynchronisation can occur at least every six bits, which means the time is not long enough for the nodes to go out of sync.
To demonstrate, after five recessive bits, the transmitting node adds in a dominant bit.
Then, when a device receives the message, it sees that the sixth bit is a bit stuffing bit (because it occurs after five consecutive bits of the opposite polarity) so it discards this bit to get back to the original message.
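Both halves of this process can be sketched in Python. This is a minimal illustration rather than controller firmware; the function names `stuff` and `destuff` are our own, and bits are represented simply as 0 and 1.

```python
def stuff(bits):
    """Insert a bit of opposite polarity after five equal consecutive bits."""
    out = []
    run_bit, run_len = None, 0
    for b in bits:
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            stuffed = 1 - b
            out.append(stuffed)
            # The stuff bit itself starts a new run.
            run_bit, run_len = stuffed, 1
    return out


def destuff(bits):
    """Remove stuff bits to recover the original message."""
    out = []
    run_bit, run_len = None, 0
    expect_stuff = False
    for b in bits:
        if expect_stuff:
            if b == run_bit:
                raise ValueError("six equal bits in a row: stuffing rule broken")
            run_bit, run_len = b, 1  # discard the stuff bit, start a new run
            expect_stuff = False
            continue
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            expect_stuff = True
    return out


message = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
on_the_wire = stuff(message)
print(on_the_wire)
print(destuff(on_the_wire) == message)
```

Notice that the stuff bit counts towards the next run of five, so a stuff bit can itself trigger another stuff bit a few bits later.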
Next we consider the data-link layer proper. That is, how do we turn ones and zeros into meaningful data? To do this, messages are carried in ‘packets’, or rather ‘frames’. CAN defines four different types of frames.
First off we have the data frame. For most purposes this is the only type of frame that people really care about, as this is the only type of frame that contains data that is meaningful to the end user.
Next up is the remote frame. This essentially contains no data but is used to request another node to transmit a desired data frame. Then we have the error frame. This is transmitted automatically by any node that sees an error to let the transmitting node know that it needs to retransmit that message.
Lastly we have the overload frame. This is used to delay transmission; when one node is too busy processing a message and wants more time to complete its processing, it can send out up to two of these frames, which prevents new messages being sent out. In reality this frame is rarely, if ever, seen anymore, as CAN controllers these days are well and truly fast enough to fully process messages before receiving the next one. However, 20 years ago when CAN was created, controllers were not as fast as they are today, so this type of frame needed to be included in the standard. I am told that the only controller to use this frame was Intel’s 82526, which was created in 1987 and is now obsolete, so it is unlikely that you will ever see this type of frame.
The CAN data frame, as shown in the diagram above, is made up of seven fields:
1. The first is the start of frame (SOF) bit which, as its name implies, is a single bit used to signify the start of the frame.
2. Next is the arbitration field which contains a message identifier.
3. Then the control field which has information about the length of the message that appears in the next field.
4. The data field. This field contains the message itself.
5. The CRC field, which carries the CRC code that forms part of the CAN error-detection mechanism.
6. The acknowledge field, used to ensure other devices are actually receiving the messages transmitted.
7. Lastly, the end of frame field, which signifies the end of the frame.
The SOF bit is simply one single dominant bit that is transmitted to indicate the start of a frame. All the nodes on the bus use this bit to synchronise their internal clocks. Simply put, this bit can be transmitted whenever the bus is not already in use by another node.
The next field is an interesting one. As you can see, the arbitration field can be either 12 bits or 32 bits long. This is because of the difference between the formats in CAN standard 2.0A (Standard CAN) and 2.0B (Extended CAN), where the former describes an 11-bit message identifier, and the latter a 29-bit identifier. CAN has been designed so that both formats can be used on the same bus without causing issues.
We’ll look at the standard format first. The standard format uses an 11-bit message ID and a single bit called the Remote Transmission Request (RTR) bit, which is dominant in a data frame. With CAN, instead of nodes having identifiers and anonymous messages being sent to particular addresses (as in other networks), it is the other way around: every message has a unique identifier and is sent to every node. Each node then checks whether that message ID is one of the ones it cares about and, if so, processes it; otherwise it simply ignores it.
The CAN standard doesn’t define what each message is; it simply defines that each message must have a unique ID. So why is this field called the arbitration field? The reason for this was partially answered earlier when I discussed how arbitration occurs. Since arbitration is bit-wise, with dominant bits winning out over recessive bits, and all messages have unique IDs, the arbitration will be decided during the transmission of this field (with one or two exceptions). It also means that messages that have a lower ID will win arbitration more often, so messages can be prioritised such that the most important messages have the lowest IDs.
If we look at the extended format we can see that the first part is set up to be almost identical to the standard format, except that the RTR bit is replaced by an SRR (Substitute Remote Request) bit.
This bit must be recessive and essentially does nothing other than ensure that standard and extended messages are compatible on the same bus. What it does mean though is that a standard message that has the same ID as the first part of an extended message will always win arbitration.
The next bit is the IDE (Identifier Extension) bit. This bit must be recessive to indicate that this is an extended format message. You’ll see in a minute how the corresponding bit in a standard message is always dominant.
After this bit we have the second part of the identifier, a further 18 bits that when combined with the first 11 bits gives the entire 29-bit message ID.
Lastly, we have the RTR bit for the extended format, transmitted as a dominant bit, just as it was in the standard data frame.
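As a rough illustration, the 32-bit extended arbitration field described above can be assembled like this in Python. The sketch assumes that a dominant bit is 0 and a recessive bit is 1, and that the first 11 bits sent are the most significant bits of the 29-bit ID; the function name and the example ID are made up for the illustration.

```python
DOMINANT, RECESSIVE = 0, 1  # assumed convention: dominant = 0, recessive = 1


def extended_arbitration_field(can_id, remote=False):
    """Return the 32 bits: 11-bit ID part, SRR, IDE, 18-bit ID part, RTR."""
    if not 0 <= can_id < (1 << 29):
        raise ValueError("extended IDs are 29 bits long")
    first_part = can_id >> 18        # upper 11 bits of the ID
    second_part = can_id & 0x3FFFF   # lower 18 bits of the ID
    bits = [(first_part >> i) & 1 for i in range(10, -1, -1)]
    bits.append(RECESSIVE)           # SRR, always recessive
    bits.append(RECESSIVE)           # IDE, recessive marks an extended frame
    bits += [(second_part >> i) & 1 for i in range(17, -1, -1)]
    bits.append(RECESSIVE if remote else DOMINANT)  # RTR
    return bits


field = extended_arbitration_field(0x1ABCDEF0)  # arbitrary example ID
print(len(field), field)
```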
The control field comes next, consisting of two reserved bits both set as dominant, and a four-bit data length code. This DLC is simply a number that indicates how many bytes long the message is, from zero to eight bytes.
You’ll notice that the first reserved bit is also labelled IDE. By comparing the two data-frame types we can see that this is because this bit in a standard data frame lines up with where the IDE bit appears in an extended data frame. Because this bit in the control field is dominant and the IDE bit in an extended data frame is recessive, a node can differentiate between the two types of data frames.
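To make the bit positions concrete, here is a small Python sketch of the start of a standard frame, from the SOF bit through to the DLC, before any bit stuffing is applied. It again assumes dominant is 0 and recessive is 1; the function names and the example ID 0x123 are hypothetical.

```python
DOMINANT, RECESSIVE = 0, 1  # assumed convention: dominant = 0, recessive = 1


def standard_header(can_id, dlc, remote=False):
    """Return SOF, 11-bit ID, RTR, IDE, reserved bit and 4-bit DLC (19 bits)."""
    if not 0 <= can_id < (1 << 11):
        raise ValueError("standard IDs are 11 bits long")
    if not 0 <= dlc <= 8:
        raise ValueError("DLC must be between 0 and 8")
    bits = [DOMINANT]                                    # SOF
    bits += [(can_id >> i) & 1 for i in range(10, -1, -1)]
    bits.append(RECESSIVE if remote else DOMINANT)       # RTR
    bits += [DOMINANT, DOMINANT]                         # IDE and second reserved bit
    bits += [(dlc >> i) & 1 for i in range(3, -1, -1)]   # DLC
    return bits


def is_extended(frame_bits):
    """The bit after SOF (1), ID (11) and RTR/SRR (1) is the IDE bit."""
    return frame_bits[13] == RECESSIVE


header = standard_header(0x123, 8)
print(header, is_extended(header))
```

The `is_extended` check works for either format because the IDE bit sits in the same position in both.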
Everything up until this point has been preparing the nodes for the actual data to arrive. The data field is the field that contains all the data for the message. Once again, the CAN standard does not define what this data means, how it is formatted, etc. All the standard defines is that this field contains zero to eight bytes of data.
The number of bytes in a message is defined on a per-message basis, and is specified by the DLC. This means that on any particular bus you may have some messages that only contain one byte of data, and others that contain seven or eight bytes of data.
Once the data has been sent it needs to be checked to ensure that there are no errors; this is where the CRC field comes in, as shown in the diagram below. As each node receives a data frame, it performs a CRC calculation as it goes, so that by the time it reaches the CRC field it has calculated a CRC value based on what it has received.
Before sending, the transmitter has already calculated a CRC on the data and it transmits that value in the CRC field. All the receiving nodes compare the CRC value against what they have calculated, and any node that does not get a match immediately transmits an error frame to stop the transmitting node from completing the transmission, and alert all other nodes to discard that frame. The original transmitting node will then attempt to retransmit the message at the next available opportunity.
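For the curious, the calculation can be sketched in Python as a shift register that is updated one bit at a time. The details here are not covered in this article: the 15-bit generator polynomial 0x4599 comes from the Bosch CAN 2.0 specification, where the CRC is calculated over the unstuffed bits from the start of frame up to the end of the data field. The function names are our own.

```python
CAN_CRC15_POLY = 0x4599  # generator polynomial from the CAN 2.0 specification


def crc15(bits):
    """Run the bits through a 15-bit CRC shift register, starting from zero."""
    crc = 0
    for bit in bits:
        crc_next = bit ^ ((crc >> 14) & 1)
        crc = (crc << 1) & 0x7FFF
        if crc_next:
            crc ^= CAN_CRC15_POLY
    return crc


def crc_bits(bits):
    """The 15 CRC bits as transmitted, most significant bit first."""
    crc = crc15(bits)
    return [(crc >> i) & 1 for i in range(14, -1, -1)]


def receiver_accepts(received_bits, received_crc_bits):
    """A receiver compares its own calculation with the transmitted CRC."""
    return crc_bits(received_bits) == list(received_crc_bits)


frame_bits = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sent_crc = crc_bits(frame_bits)

corrupted = list(frame_bits)
corrupted[5] ^= 1  # flip one bit "in transit"

print(receiver_accepts(frame_bits, sent_crc))
print(receiver_accepts(corrupted, sent_crc))
```

The first check passes and the second fails, which is the point at which a real node would raise an error.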
The acknowledge field is used to determine whether other nodes are actually receiving the message being sent. To do this, the transmitting node transmits a recessive bit as the first bit of this field, known as the ACK slot. All other nodes that successfully receive the message transmit a dominant bit at the same time to acknowledge they have correctly received the message.
This way, when the transmitting node reads back this bit and sees it is dominant, it knows the message has been received by at least one other node; however, if it sees a recessive bit then it knows that no nodes have received the message and it will need to send it again. The second bit in the ACK field is also transmitted as recessive and simply exists as a spacer that allows for the transmission delay of other nodes sending the ACK bit. If this bit did not exist and the transmission delay caused one of the ACKs sent by a receiving node to creep into the next bit time, then the transmitting node would see a dominant bit at the start of the EOF (end of frame) field and would throw an error, failing transmission.
The very last field in a data frame is the aforementioned EOF field, which is simply seven consecutive recessive bits. This field serves two purposes: firstly it allows receiving nodes time to finish processing the message, and secondly, if an error is detected in the CRC, then the error frame can be transmitted within the data frame.
If this field were not here and an error was detected at the very end of the CRC field, then the error frame would appear after the transmitting node had finished, and that node would think transmission was successful and not retransmit the message. The error frame would simply be confused with an overload frame.
The second type of frame is the remote frame. Essentially, the remote frame is exactly the same as a data frame with one distinct difference: no data. So what is its purpose? Well, the point of a remote frame is to request data. By including this frame in the standard, it allows for messages to be transmitted on a request basis, so that they are only transmitted when another node requests the data.
This is how it is done:
Besides the missing data field, the other difference appears in the arbitration field. In a remote frame, the RTR bit is transmitted as recessive so that receiving nodes can distinguish between a data frame and a remote frame.
A by-product of this is that because a data frame is identical to a remote frame up until this point, and the data frame transmits a dominant bit here, a data frame with the same ID as a remote frame transmitted at the same time will win arbitration. This is desired behaviour as the purpose of the remote frame is to request that very data frame anyway.
The last type of frame we will look at is one I have already mentioned a few times: the error frame. An error frame does not follow the other arbitration rules of a CAN bus; rather, it is transmitted immediately by any node that detects an error, to intentionally disrupt whatever is being transmitted at the time.
There are five different types of errors that can trigger an error frame. These are: bit errors, stuff errors, CRC errors, form errors and acknowledgement errors.
The format of an error frame is quite simple: it is just six consecutive dominant bits followed by eight consecutive recessive bits. Since the first six bits are dominant, they will always win out on the bus no matter what else is being transmitted. Additionally, the error frame intentionally breaks the bit stuffing rule, to make it obvious that this is an error frame and not simply data on the bus.
By transmitting the error frame over the top of a message that is being transmitted, it tells the transmitting node that the message needs to be retransmitted, and it tells the receiving nodes to discard the message.
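The effect of an error frame on the bus can be sketched in Python as follows. This is only an illustration, assuming dominant is 0 and recessive is 1; the helper names and the traffic pattern are hypothetical.

```python
DOMINANT, RECESSIVE = 0, 1  # assumed convention: dominant = 0, recessive = 1

ERROR_FRAME = [DOMINANT] * 6 + [RECESSIVE] * 8


def overlay(traffic, other, start):
    """Combine two transmissions on the bus: a dominant bit always wins."""
    bus = list(traffic)
    for i, bit in enumerate(other):
        bus[start + i] = min(bus[start + i], bit)
    return bus


def find_stuff_violation(bits):
    """Return the index of the sixth equal bit in a row, or None."""
    run_bit, run_len = None, 0
    for i, b in enumerate(bits):
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 6:
            return i
    return None


# Properly stuffed traffic never contains six equal bits in a row...
traffic = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
print(find_stuff_violation(traffic))

# ...until another node sends six dominant bits over the top of it.
bus = overlay(traffic, ERROR_FRAME[:6], start=2)
print(bus, find_stuff_violation(bus))
```

Every node watching the bus sees the broken stuffing rule and knows the current message must be thrown away.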
This is where things get complicated.
There are three types of error states that a node can reside in. These are: error active, error passive and bus off.
There are two types of error counters: the receive error counter (REC) and the transmit error counter (TEC).
These, along with a detailed list of fault confinement rules, determine which state a node should be in.
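As a rough guide, the relationship between the counters and the node's state can be sketched in Python. The state names (error active, error passive and bus off) and the thresholds of 127 and 255 are taken from the Bosch CAN 2.0 specification rather than from this article, and the rules for how the counters go up and down are left out entirely.

```python
def error_state(tec, rec):
    """Return the error state implied by the two counters.

    Thresholds follow the CAN 2.0 fault confinement rules; the counter
    update rules themselves are not modelled here.
    """
    if tec > 255:
        return "bus off"          # node must stay off the bus
    if tec > 127 or rec > 127:
        return "error passive"    # node may still communicate, with limits
    return "error active"         # normal operation


for tec, rec in [(0, 0), (128, 0), (50, 130), (256, 0)]:
    print(tec, rec, error_state(tec, rec))
```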
That now completes the material defined by the CAN standard. In our next article we will cover some really interesting material, the CAN application layer and higher level protocols.
If you have any questions or would like further information please do not hesitate to email Jason Turner at REDARC - firstname.lastname@example.org - or call (08) 8322 4848.
To view the full article and images, please click the link below.