Friday, July 19, 2019

Converging observations in IoT

The problem is this. You have a sensor on a farm that transmits its observations. The observations can be received by one or more towers. Each tower forwards the observations it receives to a server that then stores them. When two towers receive the same transmission, you now have a situation where a single observation has been recorded twice. What to do?

You might choose to de-duplicate at the server. You know that there can only be one observation sent per hour, so if there are two within the hour that have the same values then pick just one. Easy.
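As a rough sketch of that server-side rule: the Observation type, its field names and the choice to bucket by clock hour are all assumptions made for illustration, not a description of our actual server.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    # Hypothetical shape of a stored observation.
    sensor_id: str
    recorded_at: datetime
    value: float


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its clock hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def dedupe_hourly(observations):
    """Keep the first observation per (sensor, clock hour, value)."""
    seen = set()
    unique = []
    for obs in observations:
        key = (obs.sensor_id, hour_bucket(obs.recorded_at), obs.value)
        if key not in seen:
            seen.add(key)
            unique.append(obs)
    return unique
```

Bucketing by clock hour is the simplest reading of "within the hour". It would miss a pair of duplicates that straddle an hour boundary, so a sliding window may suit better if that matters.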

However, what if you then have multiple servers, perhaps for redundancy? What then if one tower connects to one server, and another tower connects to another server? It could even be that a third-party server sends us the same observations under a roaming-style arrangement. Same observations, two towers, two servers.

This is our reality.

You might employ clustering for your servers and attempt to eliminate the duplicate state. Conflict-free Replicated Data Types (CRDTs) are great for this. This is a reasonable solution and, having used CRDTs lovingly ourselves, we could just stop there.

We’ve chosen another strategy though, similar in spirit to CRDTs. However, we think it may be a bit simpler and it also allows for the other server to not be under our control as per the roaming scenario.

CRDTs have this wonderful property of always knowing how to merge, i.e. there is never any conflict. Our view is that we permit conflict and eliminate it at the point where the data is consumed. We allow observations to be recorded indiscriminately, as we don’t know what other systems will also be recording them.

The “point of being consumed” is most often when we render the data at the presentation layer. We provide the presentation layer with multiple sources of observations and let it de-duplicate. This is powerful as the presentation layer understands the context in which the data is consumed. It is easy for it to reason that an observation with the same value within the same hour is a duplicate and so can be dropped. A presentation layer is a state machine.
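To make that concrete, here is a minimal sketch of a consumer merging several sources and dropping duplicates as it goes. The Reading type, the merge_sources function and the one-hour default are hypothetical names and values chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Reading(NamedTuple):
    # Hypothetical shape of an observation as a server hands it over.
    sensor_id: str
    received_at: datetime
    value: float


def merge_sources(*sources, window=timedelta(hours=1)):
    """Merge readings from any number of sources, dropping repeats.

    A reading is treated as a duplicate when the same sensor reported
    the same value less than `window` after the last reading we kept.
    """
    last_kept = {}  # (sensor_id, value) -> received_at of last kept reading
    merged = []
    combined = (r for source in sources for r in source)
    for reading in sorted(combined, key=lambda r: r.received_at):
        key = (reading.sensor_id, reading.value)
        previous = last_kept.get(key)
        if previous is not None and reading.received_at - previous < window:
            continue  # same sensor, same value, too soon: a duplicate
        last_kept[key] = reading.received_at
        merged.append(reading)
    return merged
```

Calling merge_sources(server_a, server_b) works the same whether the second server is ours or a third party's, since nothing here needs the servers to coordinate. One caveat of matching on value: a sensor that genuinely reports an unchanged value slightly less than an hour later would be dropped too, so the window probably wants to sit a little under the reporting interval.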

Other consumers of observations, e.g. a service that actuates given sensor inputs, are also state machines and are also in a great position to de-duplicate. Their time window for duplicate detection could also be different from that of the presentation layer, given that actuation may occur over a shorter or longer period of time.
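One way to picture consumers with differing windows is a small stateful filter that each consumer owns and configures for itself. The Deduplicator class and the window values in the usage lines are assumptions for illustration only.

```python
from datetime import datetime, timedelta


class Deduplicator:
    """Per-consumer state: when each (sensor, value) pair was last accepted."""

    def __init__(self, window: timedelta) -> None:
        self.window = window
        self._last_accepted = {}  # (sensor_id, value) -> datetime

    def accept(self, sensor_id: str, value: float, at: datetime) -> bool:
        """Return True if the observation is new to this consumer."""
        key = (sensor_id, value)
        previous = self._last_accepted.get(key)
        if previous is not None and abs(at - previous) < self.window:
            return False  # repeat inside this consumer's window
        self._last_accepted[key] = at
        return True


# Each consumer picks the window that suits its own context.
display = Deduplicator(window=timedelta(hours=1))
valve = Deduplicator(window=timedelta(minutes=5))
```

Fed the same stream, display and valve can reach different conclusions about the same observation: a repeat twenty minutes after the original is a duplicate to the display but a fresh input to the valve.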

Oh, and if you’re at all worried about the number of duplicates being sent to the presentation layer then: a) generally don’t worry (measure the effect); or b) de-duplicate further upstream.

And that’s it really.