SmartFlow targets a domain in which confidentiality is critical. A patient's medical data must never be released to unauthorised principals, and this guarantee becomes harder to uphold as the scale of the system increases. Security in SmartFlow cannot be left to the developer of each application: programming mistakes, unexpected interactions between components and inappropriate algorithms have all damaged the security of computer systems in the past. Instead, the middleware itself must offer strong security guarantees that every application built on top of it can rely on. A small, easy-to-use framework that enforces security in the middleware layer can prevent leaks of sensitive application data even when application developers fail to protect that data correctly.
Information flow control is a well-known paradigm for building secure systems. It involves assigning labels to data and principals in a system and restricting the way in which labelled data can be perceived and propagated by those principals. A label is a piece of information associated with a data item that controls how the system may handle it. Principals are entities that own data, such as patients, doctors, pharmacies and hospitals. A privileged monitor enforces the information flow restrictions: principals and data can only interact if a function comparing their labels is satisfied. A common example is a read-down, write-up restriction on labels: a principal is not permitted to write information to a lower security level, or to read from a higher one. Simple information flow control models provide mandatory access control. The access control rules are applied without exception, and thus the model is provably secure.
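As an illustration of the read-down, write-up rule described above, the following minimal sketch models a monitor that compares labels. The three integer levels, the `Principal` and `DataItem` classes and the function names are assumptions made for this example only; they are not part of SmartFlow or DEFCon.

```python
from dataclasses import dataclass

# Hypothetical security levels for illustration; a higher number is more sensitive.
PUBLIC, CONFIDENTIAL, SECRET = 0, 1, 2


@dataclass(frozen=True)
class Principal:
    name: str
    level: int


@dataclass(frozen=True)
class DataItem:
    name: str
    level: int


def may_read(principal: Principal, item: DataItem) -> bool:
    """Read down only: the reader's level must be at least the item's level."""
    return principal.level >= item.level


def may_write(principal: Principal, item: DataItem) -> bool:
    """Write up only: the target's level must be at least the writer's level."""
    return principal.level <= item.level


# Example principals and data items (hypothetical).
nurse = Principal("nurse", CONFIDENTIAL)
leaflet = DataItem("public leaflet", PUBLIC)
record = DataItem("patient record", SECRET)
```

Under these assumptions the nurse may read the leaflet but not the record, and may write to the record but not to the leaflet, so information can only move toward higher levels.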
However, in real-world applications it is seldom practical for information to flow only toward more classified levels. Instead, extra privileges must be added to permit declassification. Decentralised information flow control is a variant that addresses this shortcoming by empowering principals in the system to create and manage privileges over their own labels. Each patient in a hospital can use unique labels to control how their own data are propagated in the system. Because a practically unlimited number of unique labels is allowed, breaching one patient's security does not automatically imply that the leak affects every other patient in the system. Large-scale data leaks become less probable as the amount of trusted code is minimised: the majority of operations are considered untrusted and are closely monitored for leaks.
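The decentralised aspect can be sketched as a registry in which each principal creates its own tags and holds, or delegates, the privilege to declassify them. This is only a sketch under stated assumptions: `TagRegistry`, its methods and the principal names are hypothetical, and a real monitor would track more kinds of privilege than the single declassification privilege shown here.

```python
import secrets


class TagRegistry:
    """Hypothetical monitor tracking who may declassify which tag."""

    def __init__(self) -> None:
        self._declassifiers = {}  # tag -> set of principal names

    def create_tag(self, owner: str) -> int:
        """Create a fresh tag; the creator starts with the privilege over it."""
        tag = secrets.randbits(64)
        while tag in self._declassifiers:  # retry on the (unlikely) collision
            tag = secrets.randbits(64)
        self._declassifiers[tag] = {owner}
        return tag

    def delegate(self, tag: int, granter: str, grantee: str) -> None:
        """Only a holder of the privilege may pass it on."""
        if granter not in self._declassifiers.get(tag, set()):
            raise PermissionError(f"{granter} holds no privilege over this tag")
        self._declassifiers[tag].add(grantee)

    def declassify(self, label: frozenset, tag: int, principal: str) -> frozenset:
        """Remove a tag from a label, but only with the matching privilege."""
        if principal not in self._declassifiers.get(tag, set()):
            raise PermissionError(f"{principal} may not declassify this tag")
        return label - {tag}


registry = TagRegistry()
alice_tag = registry.create_tag("alice")   # one unique tag per patient
bob_tag = registry.create_tag("bob")
registry.delegate(alice_tag, "alice", "dr_jones")
```

In this sketch `dr_jones` can remove `alice_tag` from a label but cannot touch `bob_tag`, which mirrors the point that a breach affecting one patient's tag does not extend to other patients.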
DEFCon, or Decentralized Event Flow Control, is our model for building secure event-based applications. The figure to the right shows a DEFCon engine hosting three processing units. Every node participating in a DEFCon infrastructure provides a DEFCon engine. The engine is the container for event processing within a DEFCon application, and includes a trusted kernel. Application code is run by event processing units that are hosted by the engine. The engine provides for the following, described in the next three paragraphs: unit isolation, event dispatching and unit life-cycle management.
The engine permits units to communicate while upholding security properties. To avoid information leakage, units have a restricted runtime environment, preventing them from executing arbitrary code.
The engine provides units with a publish/subscribe API so that they can send and receive events. A pub/sub matcher passes events between units. Publish/subscribe dispatching enables one-way communication free of covert channels. It also facilitates the deployment of new processing units and their integration with the rest of the dataflow path.
The engine initiates and terminates the execution of the threads of the units within a DEFCon application. This lets the engine apply restrictions to the runtime classes that units can access.
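The dispatching role of the engine can be illustrated with a toy publish/subscribe matcher. The `Engine` class, the topic-string matching and the callback interface below are assumptions made for this sketch and do not reflect the actual DEFCon API; the sketch only shows the one-way property, in that the publisher receives no return value and no exception from its subscribers, and each subscriber gets its own copy of the event.

```python
import copy
from collections import defaultdict
from typing import Callable, Dict, List


class Engine:
    """Toy matcher: units interact only through publish and subscribe."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[dict], None]) -> None:
        """Register a unit's callback for a topic."""
        self._subscriptions[topic].append(callback)

    def publish(self, topic: str, event: dict) -> None:
        """Deliver a private copy to each subscriber; report nothing back."""
        for callback in list(self._subscriptions.get(topic, [])):
            try:
                callback(copy.deepcopy(event))
            except Exception:
                # A failing subscriber must not become a signal to the publisher.
                pass


engine = Engine()
unit2_inbox: List[dict] = []
unit3_inbox: List[dict] = []
engine.subscribe("lab-results", unit2_inbox.append)
engine.subscribe("lab-results", unit3_inbox.append)
engine.publish("lab-results", {"blood_type": "A+"})
```

Because units never hold references to one another, a new unit can be added by subscribing it to a topic, without changing the units that are already deployed.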
The figure above shows a number of events in transit: the pub/sub event matcher dispatches events between units. As explained below, Unit 1 publishes an event that Unit 2 can see in its entirety, whereas Unit 3 is aware of, and able to see, only part of it. Units operating within an engine may be under the control of different administrative domains.
DEFC is our model of decentralised event flow control, as used in the DEFCon middleware. The DEFC model extends existing information flow control models, in particular the label model of Flume, by introducing a decentralised mechanism for controlling the flow of events in an event processing middleware. As in previous work, information flow in DEFCon is monitored and enforced through the use of security labels. Labels are the smallest structure on which information flow checking is based, and they protect the confidentiality and integrity of information flows. A label is a pair (S, I) consisting of a confidentiality component, S, and an integrity component, I. Each of S and I is a set of tags. Tags are opaque values that only have meaning to the engine on which they are created; we implement them as unique, random 64-bit numbers. Each tag represents an individual, indivisible concern about either the privacy (if placed in S) or the integrity (if placed in I) of data. Tags in confidentiality components are sticky: once a tag has been inserted into a label component, data protected by that label cannot flow to units that do not have that tag, unless a privilege over the tag is exercised. In contrast, tags in integrity components are fragile: they are destroyed when information carrying such a tag is mixed with information that does not carry it, again unless a privilege is exercised.
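The label rules above can be made concrete with a short sketch. It assumes the subset-based flow check that is conventional in Flume-style models (every confidentiality tag of the source must be held by the destination, and every integrity tag demanded by the destination must be carried by the source); the names `Label`, `new_tag`, `can_flow` and `combine` are hypothetical, and privileges are left out for brevity.

```python
import secrets
from dataclasses import dataclass


def new_tag() -> int:
    """Tags are opaque; here each one is a random 64-bit number."""
    return secrets.randbits(64)


@dataclass(frozen=True)
class Label:
    S: frozenset = frozenset()  # confidentiality tags
    I: frozenset = frozenset()  # integrity tags


def can_flow(src: Label, dst: Label) -> bool:
    """Assumed flow rule: no secrecy tag is lost, no integrity tag is invented."""
    return src.S <= dst.S and dst.I <= src.I


def combine(a: Label, b: Label) -> Label:
    """Mixing data: secrecy tags accumulate (sticky), integrity tags
    survive only if both inputs carry them (fragile)."""
    return Label(S=a.S | b.S, I=a.I & b.I)


patient = new_tag()
lab = new_tag()
result = Label(S=frozenset({patient}), I=frozenset({lab}))
cleared_unit = Label(S=frozenset({patient}))
public_unit = Label()
mixed = combine(result, Label())  # mix with unlabelled data
```

Here `result` may flow to `cleared_unit` but not to `public_unit`, and `mixed` keeps the `patient` tag in S while losing the `lab` tag from I.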
A key aspect of our approach is the use of information flow control at the granularity of events.
In our model, an event consists of a number of event parts. The figure on the side shows an event with three event parts. Here, the empty integrity component on the patient ID means that the quality of that data is not vouched for. The test grade and the confidence of the HIV status are vouched for, to varying levels. The empty confidentiality component for the blood type indicates that it is public information, whereas the blood lab has secured the patient ID and has indicated that the HIV status is even more sensitive. Thus, a single event can have multiple parts with different confidentiality and integrity significance. Access to event data is controlled by the engine's kernel. The engine API allows units to request retrieval and modification of event parts, and to create new events. Dispatching a single event with secured parts streamlines the design of event-based systems and directly supports the principle of least privilege. An alternative would be to send multiple events containing different sets of information, but this leads to a larger number of messages being sent, without a clear relationship between them.
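A multi-part event of this kind might be modelled as follows. The part names, values and tag names are invented for illustration and are not taken from the figure; readable strings stand in for opaque tags, and only the confidentiality check on reads is shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Part:
    name: str
    data: object
    S: frozenset = frozenset()  # confidentiality tags protecting this part
    I: frozenset = frozenset()  # integrity tags vouching for this part


@dataclass(frozen=True)
class Event:
    parts: List[Part]


def visible_parts(event: Event, unit_S: frozenset) -> dict:
    """Kernel-side filter: a unit sees a part only if it holds every
    confidentiality tag on that part."""
    return {p.name: p.data for p in event.parts if p.S <= unit_S}


# Hypothetical event with three parts at different sensitivity levels.
event = Event(parts=[
    Part("blood_type", "A+", I=frozenset({"lab-grade"})),
    Part("patient_id", 1234, S=frozenset({"blood-lab"})),
    Part("hiv_status", "negative",
         S=frozenset({"blood-lab", "hiv"}), I=frozenset({"lab-grade"})),
])

public_view = visible_parts(event, frozenset())
lab_view = visible_parts(event, frozenset({"blood-lab"}))
full_view = visible_parts(event, frozenset({"blood-lab", "hiv"}))
```

A unit holding no tags sees only the blood type, a unit holding the hypothetical `blood-lab` tag also sees the patient ID, and only a unit holding both tags sees all three parts of the same event.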
The list of SmartFlow-related publications can be found in the corresponding section of this website.
This work was supported by grants EP/F042469 and EP/F044216 ("SmartFlow: Extendable Event-Based Middleware") from the UK Engineering and Physical Sciences Research Council (EPSRC).