Security is hard because it has to be right all the time? Yeah, like everything else

Systems Approach

One refrain you often hear is that security must be built in from the ground floor, and that retrofitting security to an existing system is the source of design complications, or worse, outright flawed designs.

While it is the case that the early internet was largely silent on the question of security, I suspect “retrofitting” is often used pejoratively. Certainly there have been convoluted and short-sighted attempts to improve security, but the internet has also evolved to include a sound architecture for securing end-to-end communication. Focusing on stopgap mechanisms is never a good recipe for understanding the underlying principles, no matter what aspect of a system one is talking about.

In a similar vein, it is worth remembering that the early internet came up significantly short on other requirements. On scalability, for example, it was originally assumed that all host-to-address bindings could be managed using a centralized hosts.txt file that every system admin had to download once a week, and EGP assumed a simple, loop-free “catenet” model of inter-network routes.

These, and similar limitations, were corrected over time, for example, with DNS and BGP, respectively. And today it is straightforward to explain the system design techniques — eg, aggregation and hierarchy — that were then applied.
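
As a rough illustration of aggregation, the sketch below merges adjacent address prefixes into a single covering prefix, which is the idea that lets hierarchical routing keep tables small. It is a minimal sketch, not how BGP is implemented: the `aggregate` helper is a hypothetical name, and the prefixes come from documentation and private ranges chosen only for the example. It relies on `ipaddress.collapse_addresses` from the Python standard library.

```python
import ipaddress


def aggregate(prefixes: list[str]) -> list[str]:
    """Collapse adjacent or overlapping prefixes into the fewest covering prefixes.

    Hypothetical helper for illustration; real routers also weigh policy,
    path attributes, and who is allowed to announce what.
    """
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Two adjacent /25 blocks can be advertised as one /24.
print(aggregate(["192.0.2.0/25", "192.0.2.128/25"]))

# Prefixes that are not adjacent cannot be merged and stay separate.
print(aggregate(["10.0.0.0/24", "10.0.2.0/24"]))
```

The point of the example is the design technique rather than the library call: a parent in the hierarchy can stand in for all of its children, so the rest of the system needs to know about one entry instead of many.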

The question to ask is: What are the analogs for security?

This example highlights a second straw man that I will set up and then knock down: That security is uniquely hard because you have to get it right over and over again, at every layer of the system.

Of course the same is true of every other system requirement.

There’s no such thing as getting scalability or availability right in just one place, and then you’re done. You have to make sure your system scales and survives failure at every layer and in every component. It takes only one bottleneck or single point of failure to defeat the system.

The security community uses the term defense-in-depth (DiD) to capture this idea. DiD says (in part) that you need to build multiple, possibly overlapping defenses, but this is essentially what someone building a reliable system has to do as well. (DiD has other implications, which I’ll return to in a moment.)

This suggests the next possibility, which is that security is harder because we’ve set it up as an absolute requirement under all conditions, whereas we sometimes cut ourselves some slack on scalability and availability. For example, we may allow for an upper bound in the workload we expect to serve (eg, 2x the last flash crowd event) or the unlikely failure scenarios that we can safely ignore (eg, a transatlantic cable cut).

In contrast, we assume an adversary always finds the weakest link and exploits it, so there must be no weak links. But cost/risk calculations are exactly the same in all three cases: For security, you decide what parts of the system to trust, what threats you understand, what resources your adversaries can bring to bear, and what resources you are able to spend defending against those threats. My takeaway is that for all systems topics, but especially security, the starting point has to be a clear articulation of requirements and assumptions.

This brings me back to the idea of DiD, which is broader than just saying all layers or components of the system must be secured. It also implies that any single defense might be penetrated, but it will be hard to penetrate all of them.
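
One way to make that intuition concrete is a back-of-the-envelope calculation. The sketch below assumes each defensive layer is penetrated independently with a known probability, which is a strong assumption that rarely holds against an adversary who adapts, so treat it as an illustration of the reasoning rather than a risk model. The function name and the example probabilities are hypothetical.

```python
import math


def prob_all_penetrated(per_layer: list[float]) -> float:
    """Probability that every layer is penetrated, assuming independence.

    Hypothetical helper: real layers often share weaknesses (same vendor,
    same credentials), which makes the true figure higher than this product.
    """
    for p in per_layer:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return float(math.prod(per_layer))


# Three layers, each assumed to fail one time in ten.
layers = [0.1, 0.1, 0.1]
print(prob_all_penetrated(layers))   # roughly 0.001 under the independence assumption
print(max(layers))                   # versus 0.1 if any single layer were the only defense
```

The same arithmetic applies to redundant components in a highly available system, which is part of why the two problems look so alike on paper. Where they differ is in whether the independence assumption is believable.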

Saltzer and Kaashoek make this point succinctly when they talk about security being a negative goal, the point being that it is extremely difficult to prove something cannot happen. Building highly available systems has a similar negative goal, but somehow security feels qualitatively different. Perhaps that is because we know our adversaries are actively plotting against us, whereas our hardware fails passively (except, of course, when it doesn’t, pointing to the fuzzy line between security and availability).

Another seemingly unique aspect of security is the centrality of cryptographic algorithms. My initial (and by no means exhaustive) survey suggests that many books and courses explain security through the lens of cryptography. This is understandable, because without these algorithms we could not build the secure systems we have today.

But cryptography is a means, not an end. It is a necessary building block; you still need to construct end-to-end systems around those building blocks, which depend on many other components (and assumed technologies) as well. Get the overall architecture wrong, and even the most powerful cryptographic algorithms provide no value. From the systems perspective, the key is to abstract the algorithm in such a way that you can then design a system that builds upon it.
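
A minimal sketch of what abstracting the algorithm can look like in code follows. The `Authenticator` interface, the `HmacSha256` class, and the `accept_request` function are hypothetical names invented for this example; only `hmac` and `hashlib` are real standard-library modules. The system-level function depends on the interface, not on which algorithm sits behind it.

```python
import hashlib
import hmac
from typing import Protocol


class Authenticator(Protocol):
    """What the rest of the system needs from the algorithm (hypothetical interface)."""

    def tag(self, message: bytes) -> bytes: ...
    def verify(self, message: bytes, tag: bytes) -> bool: ...


class HmacSha256:
    """One possible building block behind the interface."""

    def __init__(self, key: bytes) -> None:
        self._key = key

    def tag(self, message: bytes) -> bytes:
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def verify(self, message: bytes, tag: bytes) -> bool:
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(self.tag(message), tag)


def accept_request(auth: Authenticator, body: bytes, tag: bytes) -> bool:
    """System-level decision that relies only on the abstraction."""
    return auth.verify(body, tag)


# Demo only: a hardcoded key is exactly the kind of architectural mistake
# that no algorithm can compensate for.
auth = HmacSha256(b"demo-key-not-for-production")
good_tag = auth.tag(b"transfer 10")
print(accept_request(auth, b"transfer 10", good_tag))
print(accept_request(auth, b"transfer 1000", good_tag))
```

Everything the sketch leaves out (how the key is generated, distributed, rotated, and revoked, and who is trusted to hold it) is the architecture around the building block, and that is where the interface alone offers no protection.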

This is a familiar theme. In our work to bring a systems perspective to 5G, I found that the lion’s share of attention in standard treatments of 5G is placed on the coding algorithm and underlying information theory (eg, OFDMA), with the rationale for the architecture of the communication system built around that algorithm often lacking.

Other complex algorithms show up in large systems (eg, Paxos for consistency, weighted fair queuing for packet scheduling, and so on), but those algorithms only work when the overall system has been factored into the right set of interdependent components. Get the factoring wrong, and you’ve unnecessarily coupled policy and mechanism, baked in unnecessary assumptions, or in some way limited how your system can evolve over time.

That’s not to say today’s security systems are poorly designed, but in describing those systems, emphasis should also be placed on the design that lets a system take advantage of the algorithms.

Exploring these possible reasons why security might be unique has served to identify four criteria for how we ought to talk about security:

  1. Understand the rationale for individual mechanisms, and not just their current implementation choices.
  2. Recognize that systems evolve, and sometimes in the middle of that evolution it’s difficult to see the forest (the architecture) for the trees (today’s mechanisms).
  3. Be as thorough and detailed as possible about requirements and assumptions a system makes, along with the risks that follow.
  4. Decompose the system into its elemental components and explain how they all work together in an end-to-end way.

These last two points seem to be the key: Being explicit about assumptions is essential for coping with a negative goal, and once you’ve done that, separating concerns and requirements, unbundling features, and teasing apart related concepts are the cornerstones of the systems approach. This seems especially relevant to security, where I am still searching for the clarity that should be possible. ®