Any plans for end-to-end encryption?


#1

With the current design, the HAT API endpoints exchange cleartext data with HAT-enabled users and applications. As a result, the HAT server process needs full access to, and visibility of, all data in the HAT.

That is not an issue for the “hard hat” scenario where the HAT owner hosts this process and controls access to it on her own hardware. However, a useful deployment scenario might involve running the HAT server process on cloud infrastructure. In that case, a significant amount of trust has to be placed in the infrastructure provider (whose ability to protect the privacy of the data he is entrusted with may be constrained by law enforcement requirements).

A possible solution to this problem could be end-to-end encryption, whereby only encrypted data is exchanged via the API, and the API endpoint server process is not given the capability to inspect the cleartext. Only the users and applications that are the actual producers and recipients of a piece of data would be able to do that (locally on their end, after receiving the encrypted data from the API). The current data debit contract process would need to be augmented by peer-to-peer key exchange between the involved parties (without the central HAT process being part of that, except maybe as a messaging bus).

Is there something like this on the roadmap?


#2

Hi @thiloplanz,

No, there is nothing like this on the roadmap.

This is because:

  1. One of the core features of the HAT is personalised data contextualisation, and for this to work the HAT engine needs to be able to operate on non-encrypted data to perform the contextualisation.
  2. The HAT is not only a Personal Data Store, but rather a member of a network where data is exchanged according to the data owner’s preferences. For the data to be stored encrypted at the core and only decrypted at the receiving end, every receiving end would have to be able to use the same crypto key, throwing away any possible benefit of storing the data encrypted. Furthermore, the HAT would not be able to send only the data the receiver has requested (and the owner has authorised); it would have to rely on every application to decrypt everything and then use only the data it has agreed to use.
  3. The above two points can be addressed with homomorphic encryption, which is sadly not yet technically viable beyond the most trivial arithmetic operations.

That said, we may well have missed something that would solve these issues, so suggestions are very welcome!

Best,
Andrius


#4

Regarding data contextualization, yes, this part of the HAT engine would need to be moved to the client side (the HAT management application that the HAT owner uses to control the HAT). The same goes for adding additional decryption keys when selecting recipients (which actually seems to fit naturally with the existing process of creating data debit keys, where the HAT owner already needs to be present to authorize data consumers).

So I think that while the central server would indeed become a “dumb storage node”, all the HAT logic and features can still be kept. You can think of it as splitting the current HAT server into two components, the storage part (which needs to be online all the time and have proper backup procedures, but does not need to be trustworthy anymore, so you could use something like iCloud or Dropbox) and the trusted controller application part (which could be a mobile phone app).
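To make the split a bit more concrete, here is a minimal sketch of how the two components could interact. It assumes each record is encrypted once under its own random content key, and that this content key is then encrypted separately for every authorized recipient, so granting access never requires re-encrypting the record itself. All names (`StorageNode`, `Controller`, `grant`, `read_as`) are hypothetical and not part of the HAT API, recipient keys are assumed to have been agreed out of band, and `toy_encrypt` is a deliberately simple stand-in built from the standard library; a real implementation would use a vetted authenticated cipher and public-key wrapping.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` bytes from key and nonce using SHA-256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """TOY cipher (illustration only): returns nonce || ciphertext || HMAC tag."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(p ^ k for p, k in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Check the tag, then undo toy_encrypt. Raises ValueError on a wrong key."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ k for c, k in zip(body, stream))


class StorageNode:
    """Untrusted 'dumb storage' (hypothetical): only ever holds opaque bytes."""

    def __init__(self) -> None:
        self.records = {}       # record_id -> encrypted record
        self.wrapped_keys = {}  # (record_id, recipient) -> encrypted content key


class Controller:
    """Trusted app on the owner's device (hypothetical); holds the owner's key."""

    def __init__(self, storage: StorageNode) -> None:
        self.storage = storage
        self.owner_key = secrets.token_bytes(32)

    def store(self, record_id: str, plaintext: bytes) -> None:
        # Encrypt the record once under a fresh content key.
        content_key = secrets.token_bytes(32)
        self.storage.records[record_id] = toy_encrypt(content_key, plaintext)
        # Keep a copy of the content key that only the owner can open.
        self.storage.wrapped_keys[(record_id, "owner")] = toy_encrypt(
            self.owner_key, content_key
        )

    def grant(self, record_id: str, recipient: str, recipient_key: bytes) -> None:
        # Recover the content key and wrap it for the new recipient.
        # The stored record itself is not touched.
        wrapped = self.storage.wrapped_keys[(record_id, "owner")]
        content_key = toy_decrypt(self.owner_key, wrapped)
        self.storage.wrapped_keys[(record_id, recipient)] = toy_encrypt(
            recipient_key, content_key
        )


def read_as(storage: StorageNode, record_id: str, recipient: str,
            recipient_key: bytes) -> bytes:
    """What a data consumer would do locally after fetching the opaque blobs."""
    content_key = toy_decrypt(recipient_key, storage.wrapped_keys[(record_id, recipient)])
    return toy_decrypt(content_key, storage.records[record_id])


if __name__ == "__main__":
    storage = StorageNode()
    controller = Controller(storage)
    controller.store("location-1", b"lat=52.2, lon=0.12")
    app_key = secrets.token_bytes(32)  # assumed: agreed with the app out of band
    controller.grant("location-1", "fitness-app", app_key)
    print(read_as(storage, "location-1", "fitness-app", app_key))
```

In this sketch the owner only needs to be online when storing data or granting access, which matches the point at which a data debit is authorized today; the storage node can serve reads on its own because it only hands out bytes it cannot interpret.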

Again, the motivation here is to make cloud deployment a reasonable choice for the privacy-conscious (which is a challenging proposition in the current model).


#5

Sorry for the delay in replying to this.

Overall, there is no reason why something like this could not be implemented on the current HAT. At the end of the day, such an approach would merely use any online system as plain storage that accepts encrypted data.

More generally, such an approach would still need the software that handles unencrypted data to be running somewhere and, more importantly, to be reachable at all times, both to receive data from various sources and to take part in the outbound data flow by re-encrypting the data for each data debit client. (To my knowledge, an approach where a piece of encrypted data can be decrypted with multiple different keys does not exist, so the data needs to be encrypted separately for the key of each data debit client.) For a mobile device, that would be a lot of traffic.

Yet another alternative would be to design an approach where data gets routed directly to data debits at the time of arrival; however, that prevents all but the most trivial contextualisation and does not allow for historical data access.

Note also that the HAT can in fact run on the client side (e.g. on a Raspberry Pi at the owner's home), with Dropbox or another cloud storage solution used for data backup by keeping the HAT database storage, encrypted, in a directory there.

Your idea seems interesting, however, and I’m sure the community would love to see a detailed architecture and a description of how that could work in real life!