Today, public cloud growth rates are sky high. Gartner says the global market for public cloud services will be worth $302 billion by 2021. Will this growth continue, or will enterprise cloud adoption be derailed because of asymmetric data security and privacy risk?

Cloud providers have spent billions to secure their infrastructures against both external and insider threats. Yet respondents to a 2018 cloud computing survey by IDG say their top three challenges or barriers to implementing cloud are concerns about vendor lock-in (47%), security concerns (34%), and concerns about where data is stored (34%). With all of this investment in security, why are data security and privacy still such a concern?

Asymmetric Risk

Enterprises know that all it takes for an attacker to breach a virtual server is a single misconfiguration, a missing critical patch, or an unknown vulnerability. If that server is hosting a database, the attacker can steal everything, even with encryption at rest or transparent data encryption, because the running server holds the keys and decrypts data for any process with sufficient access. It is even easier for an insider threat such as a disgruntled or ideologically motivated system administrator. The data layer is so valuable that the risk of a data breach is asymmetric. The attack surface is so large that defending against this risk by trying to seal all of the holes is practically impossible. One small breach due to one small mistake leads to a massive gain for an attacker and a potentially massive loss for the customers involved. The potential reputational and regulatory (e.g., GDPR) implications for the provider and their customers are significant. In fact, a 1Q19 Gartner survey found privacy regulation returning as the top emerging risk worrying organizations. The problem is that finding a way in is simply not that resource intensive, while the potential gain for the attacker and loss for the enterprise could be massive.

Cloud providers have a significant problem. While it is fantastic that their infrastructure is extremely well protected, most security failures are out of their control. Security is increasingly seen as a shared responsibility between them and their enterprise customers. Both can spend enormous sums to reduce the likelihood of a breach, but they can't change the asymmetric nature of the potential gain/loss without fundamentally changing the place where attackers can realize a massive gain: the data layer.

Eliminating Asymmetric Data Layer Risk

So is it possible to change the data layer so that the asymmetric risk is effectively eliminated? The answer is yes, because of recent breakthroughs in high performance, massively scalable, and highly functional searchable encryption. This technology allows a database server to manage strongly encrypted records without ever having the encryption keys needed to decrypt them. An attacker who breaches a database with this property can no longer simply snapshot memory and disk and run off with a massive gain.
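
To make the idea of a server that never holds keys concrete, here is a minimal sketch of a much simpler, well-known technique: exact-match lookup over keyed-hash tokens (sometimes called a blind index). It is not Craxel's method, and every name in it (search_token, KeylessServer, the example email addresses) is hypothetical. The client derives an opaque token from each searchable value with a secret key; the server only ever sees tokens and opaque encrypted blobs.

```python
import hashlib
import hmac
import secrets
from typing import Dict, List


def search_token(index_key: bytes, value: str) -> str:
    """Client side: derive an opaque token for an exact-match field value."""
    return hmac.new(index_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class KeylessServer:
    """Hypothetical server: stores tokens and opaque blobs, never any key."""

    def __init__(self) -> None:
        self._rows: Dict[str, List[bytes]] = {}

    def put(self, token: str, encrypted_record: bytes) -> None:
        self._rows.setdefault(token, []).append(encrypted_record)

    def find(self, token: str) -> List[bytes]:
        return list(self._rows.get(token, []))


# The index key lives with the client and is never sent to the server.
index_key = secrets.token_bytes(32)
server = KeylessServer()

# Stand-in for a record that was already encrypted on the client (assumption).
blob = secrets.token_bytes(48)
server.put(search_token(index_key, "bob@example.com"), blob)

matches = server.find(search_token(index_key, "bob@example.com"))
misses = server.find(search_token(index_key, "alice@example.com"))
```

A snapshot of this server's memory and disk yields only tokens and ciphertext. The sketch is deliberately naive, though: identical values always produce identical tokens, so the server can still observe which records match and how often each token is queried. Those leaks are what protections against query access pattern and statistical attack are meant to address.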

Recipe for Data Layer Transformation

  1. Servers only contain strongly encrypted records and never have encryption keys
  2. Massive scale through highly parallel, distributed architecture
  3. Highly functional (ACID transactions and spatial, graph, and range queries)
  4. Pervasive compartmentalization of information
  5. Strong protections against query access pattern and statistical attack

Compartmentalizing Risk

When the data layer has these properties, attackers and insider threats are left with two much more difficult attack options, each with minimal potential for gain. The first is to steal the encryption keys from elsewhere; the second, taken up in the next section, is to infer information from the encrypted records without the keys. An encrypted database that allows every record to potentially have a different encryption key (pervasive compartmentalization) eliminates the asymmetric gain for an attacker who steals a single encryption key and breaches the server. For example, Bob's medical information could have a different encryption key than his identity information. The theft of one encryption key wouldn't allow an attacker to access any of Bob's other information or another person's information. A distributed database reduces the potential gain/loss even further, as an attacker has to steal many keys and breach many servers in multiple locations. Compartmentalization of strongly encrypted records therefore both raises the difficulty for an attacker or insider threat and limits the amount of potential gain.
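
The Bob example can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any product: the cipher is a toy built from HMAC-SHA256 so the example runs on the Python standard library alone (a real system would use a vetted authenticated cipher such as AES-GCM), and the names toy_seal, toy_open, and readable_with are hypothetical.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = b"enc" + nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then append a tag so a wrong key is detected on open."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, b"mac" + nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_open(key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, b"mac" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered record")
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


# One independent random key per compartment, held by clients only.
plaintexts = {
    ("bob", "medical"): b"blood type O+",
    ("bob", "identity"): b"passport 000000",
    ("alice", "medical"): b"blood type A-",
}
keys = {compartment: secrets.token_bytes(32) for compartment in plaintexts}

# What a breached server would hold: ciphertext only.
server_store = {c: toy_seal(keys[c], plaintexts[c]) for c in plaintexts}


def readable_with(stolen_key: bytes) -> list:
    """Compartments an attacker can open with one stolen key."""
    opened = []
    for compartment, blob in server_store.items():
        try:
            toy_open(stolen_key, blob)
        except ValueError:
            continue
        opened.append(compartment)
    return opened


exposed = readable_with(keys[("bob", "medical")])
```

With the key for Bob's medical record and a full copy of the server, the attacker opens exactly one of the three compartments. Because the keys are independent random values rather than derived from one master secret, there is no single key whose theft unlocks the rest.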


Defending Against Inference Risk

Clearly, a strongly encrypted, compartmentalized database distributed over multiple servers is far more secure than a traditional database. The information an attacker can gain is extremely limited. The question becomes: what could an attacker without the encryption keys try to learn about the encrypted records, and how hard would it be to do so?

Alan Turing, the father of modern computing, and the code breakers at Bletchley Park used very clever techniques to correlate information they knew (or planted) with Nazi encrypted communications. Their ultimate goal was to break Nazi encryption methods. Had the Nazis been more careful in how they used their encryption methods, those methods probably would not have been broken. Today, no known practical attack on strong encryption like AES-256 is meaningfully faster than trying all the possible keys, which is intractable. Learning one or many plaintext/ciphertext pairs doesn't help break the encryption. Randomized strong encryption is also not subject to statistical attack. The reason is that the same record encrypted twice will have two different encryptions, so an attacker has no way to know they are the same. They are said to be indistinguishable. If properly encrypted records themselves can't be breached or statistically attacked, how much risk remains? The answer depends on the value of the information, what can be learned about it from the searchable encryption method, how much an attacker already knows about the information, and how difficult it would be for an attacker to perform such an attack.
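
Two of the claims above, that randomized encryption gives the same record two different encryptions and that trying every 256-bit key is intractable, can be checked with a short sketch. The cipher here is a toy stand-in built from SHAKE-256 so the example needs only the Python standard library; it is not AES and not a recommendation. The guessing rate of 10^18 keys per second is an arbitrary, generous assumption.

```python
import hashlib
import secrets

NONCE_LEN = 16


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy randomized cipher (NOT AES): fresh nonce, SHAKE-256 keystream, XOR."""
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


key = secrets.token_bytes(32)
record = b"name=Bob;blood_type=O+"

# The same record under the same key, encrypted twice.
first = toy_encrypt(key, record)
second = toy_encrypt(key, record)
ciphertexts_differ = first != second

# Brute force: on average an attacker must try half of the 2**256 keys.
ASSUMED_GUESSES_PER_SECOND = 10**18  # hypothetical attacker capability
SECONDS_PER_YEAR = 365.25 * 24 * 3600
expected_guesses = 2**256 // 2
years_to_search = expected_guesses / ASSUMED_GUESSES_PER_SECOND / SECONDS_PER_YEAR
```

Both ciphertexts decrypt to the same record, yet they differ because each encryption draws a fresh random nonce, so comparing stored ciphertexts does not reveal equal records. Under the assumed guessing rate, the expected search time is on the order of 10^51 years.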

High Performance Searchable Encryption

Craxel's high performance searchable encryption is based on a breakthrough in probabilistic spatial search. It turns out that probabilistic spatial search provides much more powerful ways than previously available to minimize what a sophisticated attacker could learn about the searchable fields in an encrypted database. It is also extremely fast, scalable, and highly functional. Our strong protections against inference and query access pattern attacks make the potential gain for an attacker very small and the effort very large.

Is the Juice Worth the Squeeze?

It might not be acceptable to have any risk, no matter how low, when indexing extraordinarily high value information. Why? The gain involved for an attacker may be so large that it is worth expending all possible resources to try to infer something about a record. This concept is fundamental to understanding asymmetric risk. Is the juice worth the squeeze? With the power of our searchable encryption technology, the answer for an attacker will almost always be no. Contrast this with traditional databases, where just one mistake gives the attacker or insider threat everything. Therefore, traditional databases will eventually become obsolete except in very specific use cases where a customer needs some functionality that can't be performed efficiently in an encrypted database. With our breakthroughs combined with advances in areas such as secure computation, these will be few and far between.

Conclusion

The database market is going to be transformed by this technology. The business case for eliminating asymmetric data security and privacy risk is too compelling, and the technology is simply too fast, scalable, and capable, for enterprises to keep living with the status quo.