Fixing TinyOS rfxlink layer ACK handling

In my last post I wrote about a strange situation where a packet with a length of 248 bytes was received by the application layer.
Today I propose a possible fix, or rather a workaround, for that situation.

 Cause

This issue happens because error detection in the rfxlink layers is not very effective. The problem I was experiencing was caused by an incorrect ACK packet. A correct ACK packet must contain the fcf and dsn fields. The frame type bits in the fcf field must mark the frame as an ACK (0x02 in the first fcf byte), and the dsn field must contain the same sequence number as the original packet.
ACKs are handled in SoftwareAckLayerC.nc in the following way:

tasklet_async event message_t* SubReceive.receive(message_t* msg)
{
    RADIO_ASSERT( state == STATE_ACK_WAIT || state == STATE_READY );

    if( call Config.isAckPacket(msg) )
    {
        if( state == STATE_ACK_WAIT && call Config.verifyAckPacket(txMsg, msg) )
        {
            call RadioAlarm.cancel();
            call AckReceivedFlag.set(txMsg);

            state = STATE_READY;
            signal RadioSend.sendDone(SUCCESS);
        }

        return msg;
    }

    if( state == STATE_READY && call Config.requiresAckReply(msg) )
    {
        call Config.createAckPacket(msg, &ackMsg);

        // TODO: what to do if we are busy and cannot send an ack
        if( call SubSend.send(&ackMsg) == SUCCESS )
            state = STATE_ACK_SEND;
        else
            RADIO_ASSERT(FALSE);
    }

    return signal RadioReceive.receive(msg);
}

Assume that somebody is sending the following packets and we are receiving them with the RFA1 radio.

// correct ACK
length | fcf  | fcf  | dsn  | crc  | crc
  0x05 | 0x02 | 0x00 | 0x0A | 0xXX | 0xXX

// intentionally incorrect ACK
length | fcf  | fcf  | dsn  | crc  | crc
  0x05 | 0x00 | 0x00 | 0x0A | 0xXX | 0xXX

Notice what happens when call Config.isAckPacket(msg) is not true: the packet is signaled to the next layer. Let's see what actually happens in Config.isAckPacket(msg). In RFA1RadioP.nc we find:

async command bool SoftwareAckConfig.isAckPacket(message_t* msg)
{
    return call Ieee154PacketLayer.isAckFrame(msg);
}

Which leads us to:

async command bool Ieee154PacketLayer.isAckFrame(message_t* msg)
{
    return (getHeader(msg)->fcf & IEEE154_ACK_FRAME_MASK) == IEEE154_ACK_FRAME_VALUE;
}
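
As a standalone illustration of this mask-and-compare, here is a minimal C sketch. The constant values are assumptions based on IEEE 802.15.4, where the frame type occupies the three lowest bits of the FCF and an ACK has type 2. The names ACK_FRAME_MASK, ACK_FRAME_VALUE and is_ack_frame are hypothetical stand-ins, not the TinyOS definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: frame type is the low 3 bits of the FCF, ACK is type 2. */
#define ACK_FRAME_MASK  0x0007u
#define ACK_FRAME_VALUE 0x0002u

/* Hypothetical stand-in for Ieee154PacketLayer.isAckFrame(). */
bool is_ack_frame(uint16_t fcf)
{
    return (fcf & ACK_FRAME_MASK) == ACK_FRAME_VALUE;
}
```

Under these assumptions, the FCF 0x0002 of the correct ACK above is classified as an ACK, while the FCF 0x0000 of the faulty one is not and therefore falls through to the data path.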

So a packet is treated as an ACK only if the frame type bits in its fcf field carry the ACK value. Every packet that looks just like an ACK except for those bits is treated as a data packet by SoftwareAckLayerC and is signaled to the upper layers. There are no other checks in this layer. Unfortunately, things are no better in the other layers, and ultimately the same faulty ACK packet reaches ActiveMessageLayerP:

event message_t* SubReceive.receive(message_t* msg)
{
    am_id_t id = call AMPacket.type(msg);
    void* payload = getPayload(msg);
    uint8_t len = call Packet.payloadLength(msg);

    msg = call AMPacket.isForMe(msg) 
        ? signal Receive.receive[id](msg, payload, len)
        : signal Snoop.receive[id](msg, payload, len);

    return msg;
}

async command uint8_t RadioPacket.payloadLength(message_t* msg)
{
    return call SubPacket.payloadLength(msg) - sizeof(activemessage_header_t);
}

Things get out of control here because our faulty ACK packet, which is only 5 bytes long, has reached this point and this layer treats it as a normal data packet. Look at how the payload length is calculated: SubPacket.payloadLength(msg) - sizeof(activemessage_header_t). The header is longer than the ACK packet, so the uint8_t result wraps around and a payload length of 248 is reported to the upper layer.
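
The wraparound can be reproduced outside TinyOS. Below is a minimal C sketch of the unchecked subtraction next to a guarded variant. Both function names are hypothetical, and the lengths used in the example are illustrative values chosen so that the deficit is 8 bytes; they are not the real rfxlink header sizes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the unchecked subtraction: the int result is truncated to uint8_t. */
uint8_t payload_length_unchecked(uint8_t sub_len, uint8_t header_len)
{
    return (uint8_t)(sub_len - header_len);
}

/* Hypothetical guarded variant: reports failure instead of wrapping. */
bool payload_length_checked(uint8_t sub_len, uint8_t header_len, uint8_t *out)
{
    if (sub_len < header_len)
        return false;          /* frame too short to hold the header */
    *out = (uint8_t)(sub_len - header_len);
    return true;
}
```

For instance, payload_length_unchecked(3, 11) yields 248, because -8 converted to uint8_t is 256 - 8. The guarded variant returns false for the same input, which is the kind of check that would let a layer drop the frame instead of passing it on.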

I do not yet know why I received faulty ACK packets in the first place. Clearly, however, the rfxlink layers need stricter checks to filter such packets out.

 Workaround

Currently there is no good way to implement better checks, because no layer uses the right interfaces to perform them. For example, if new interfaces that can check fcf or other fields were introduced to SoftwareAckLayerC, the wiring would have to be changed for every platform that uses rfxlink. I will let the TinyOS maintainers decide how to add proper checks to rfxlink, and propose a simple workaround just for the RFA1 radio driver.
First, a function that returns the ieee154_simple_header_t is added to RFA1DriverLayerP.nc:

ieee154_simple_header_t* getieeeHeader(message_t* msg)
{
    return ((void*)msg) + 1; // +1 is because of length byte
}

This is admittedly too implementation specific, but as I explained above, there is currently no better place for it in the rfxlink layers.
The original code in downloadMessage() that decides whether to signal the upper layer was the following:

// memory is fast, no point optimizing header check
memcpy(data,(void*)&TRXFBST,length);

if( signal RadioReceive.header(rxMsg) )
{
    call PacketLinkQuality.set(rxMsg, (uint8_t)*(&TRXFBST+TST_RX_LENGTH));
    sendSignal = TRUE;
}

Modification to filter out incorrect ACK packets:

// memory is fast, no point optimizing header check
memcpy(data,(void*)&TRXFBST,length);

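// A 5-byte frame (fcf + dsn + crc) without the ACK bit set in fcf
// is a malformed ACK: drop it instead of signaling it upwards.
if( getHeader(rxMsg)->length == 5 && !(getieeeHeader(rxMsg)->fcf & 0x02) )
{
    sendSignal = FALSE;
}
else if( signal RadioReceive.header(rxMsg) )
{
    call PacketLinkQuality.set(rxMsg, (uint8_t)*(&TRXFBST+TST_RX_LENGTH));
    sendSignal = TRUE;
}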
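
To try the filter condition outside the driver, here is a minimal C sketch of the same predicate applied to a raw frame buffer. It assumes the layout shown in the packet examples earlier (length byte first, then the low byte of fcf); the function name should_drop_frame and both constants are hypothetical and not part of TinyOS.

```c
#include <stdbool.h>
#include <stdint.h>

#define ACK_FRAME_LENGTH 5u     /* fcf (2) + dsn (1) + crc (2) */
#define ACK_TYPE_BIT     0x02u  /* bit tested by the workaround */

/* frame[0] is the length byte, frame[1] the low byte of fcf. */
bool should_drop_frame(const uint8_t *frame)
{
    uint8_t length  = frame[0];
    uint8_t fcf_low = frame[1];

    return length == ACK_FRAME_LENGTH && !(fcf_low & ACK_TYPE_BIT);
}
```

With this predicate the correct ACK (0x05, 0x02, ...) is kept, the faulty one (0x05, 0x00, ...) is dropped, and longer frames are left to the existing RadioReceive.header() check. It only catches this one malformed shape, so it is a narrow workaround rather than general validation.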

I hope this post starts some discussion on how to properly add this kind of error checking to the rfxlink layers.

 