Cloud security fundamentals Part 3: Integrity

This post discusses how to establish integrity in a cloud environment through secure and provable changes. It is part 3 of a multi-part series on cloud security fundamentals.

5/31/2026, 4 min read

Posts 1 and 2 in this series covered identity and isolation, which enable an organization to operate in the cloud exclusively. If both are done correctly, what else remains? For one, the cloud environment will go through changes, and guaranteeing that the objectives of identity and isolation continue to hold through those changes becomes relevant. I classify this guarantee as integrity, and it is tied to operations. Integrity builds confidence in what exists in the cloud environment at any given point in time. In its absence, an organization cannot reasonably determine what exists and whether it should exist. In this post, we will explore how to establish provable integrity.

Establishing integrity

The goal of establishing integrity is to ensure that changes applied to the cloud environment follow a documented and established process in which changes can be tracked and reasoned about. In technical terms, this can be implemented through a mature DevOps pipeline with checks and balances. A mature deployment mechanism ensures that every change ties back to an artifact and a change origin. When such a mechanism exists, it allows us to reason that the system preserves integrity through changes. So, when a change is made, the record of the change and its manifestation are tied together, and that tie is provable.
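As a minimal sketch of the binding described above, assume the pipeline stores a digest of the reviewed artifact alongside the change identifier and source revision. The names `ChangeRecord`, `record_change` and `matches_record`, and the sample identifiers, are hypothetical and only illustrate the idea; a real pipeline would typically rely on signed artifacts and its own record store.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRecord:
    change_id: str        # hypothetical ticket or pull request identifier
    commit: str           # hypothetical source revision the artifact was built from
    artifact_sha256: str  # digest of the artifact that was reviewed and approved


def digest(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(artifact).hexdigest()


def record_change(change_id: str, commit: str, artifact: bytes) -> ChangeRecord:
    """Create the record that ties a change to the exact artifact reviewed."""
    return ChangeRecord(change_id, commit, digest(artifact))


def matches_record(record: ChangeRecord, deployed: bytes) -> bool:
    """True only if what was deployed is byte-for-byte what was recorded."""
    return digest(deployed) == record.artifact_sha256


approved = b"resource bucket { public = false }"
record = record_change("CHG-1", "abc123", approved)

print(matches_record(record, approved))                               # True
print(matches_record(record, b"resource bucket { public = true }"))  # False
```

The point of the sketch is that the record and the manifestation can be compared mechanically, so a mismatch is evidence that something other than the approved artifact was applied.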

The objective behind provable binding is to restrict the privileges of the system that implements changes (e.g. a pipeline) while ensuring that the system acts only in the intended manner. To establish provable binding, the system must meet certain security criteria. The following are some of the key principles to keep in mind when designing such systems:

1. The system's access to privileged credentials is secure and restricted, e.g. a peer review process triggers or enables the system to obtain the credentials over secure protocols.
2. There is no single universal privileged credential. Instead, there is a set of privileged credentials, each providing access to specific cloud account(s) and/or environment(s). The objective is to avoid one credential that reaches multiple cloud accounts or environments, and to create multiple risk domains that limit the impact if a credential is compromised.
3. The system should be the only entity that can gain access to the privileged credentials. Any way to circumvent this jeopardizes accountability and integrity.
4. Continuous monitoring and alerting are in place to detect deviations.
5. Mechanisms are established to detect failures of the controls that guarantee 1 to 4.
These principles ensure that the system is the only entity that can enact a change, that its changes are scoped through restricted access and, importantly, that the controls governing its behavior are operating. If these principles are achieved and evidenced, what else needs to be considered?
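The first few principles can be sketched in code. The example below is only an illustration under stated assumptions: `VAULT`, `PIPELINE_ID`, `AUDIT_LOG` and `request_credential` are hypothetical names, and an in-memory dictionary stands in for a real secrets manager. It shows one credential per environment, release only to the pipeline identity, a peer approval requirement, and a log of every decision that monitoring could consume.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    name: str
    scope: str  # the single environment this credential can touch


# Hypothetical vault: one credential per risk domain, no universal credential.
VAULT = {
    "dev": Credential("deploy-dev", "dev"),
    "prod": Credential("deploy-prod", "prod"),
}

PIPELINE_ID = "deploy-pipeline"  # assumed to be the only identity allowed to request credentials
AUDIT_LOG: list[str] = []        # every decision is recorded so deviations can be alerted on


def request_credential(requester: str, environment: str,
                       approvers: set[str], author: str) -> Credential:
    """Release a scoped credential only to the pipeline, and only after peer review."""
    if requester != PIPELINE_ID:
        AUDIT_LOG.append(f"DENIED {requester} -> {environment}: not the pipeline")
        raise PermissionError("only the pipeline may obtain privileged credentials")
    if not (approvers - {author}):
        AUDIT_LOG.append(f"DENIED {requester} -> {environment}: no independent approval")
        raise PermissionError("change needs approval from someone other than its author")
    if environment not in VAULT:
        raise KeyError(f"no credential defined for environment {environment!r}")
    AUDIT_LOG.append(f"GRANTED {requester} -> {environment}")
    return VAULT[environment]


cred = request_credential(PIPELINE_ID, "dev", approvers={"bob"}, author="alice")
print(cred.scope)  # dev
```

A credential obtained for "dev" carries no access to "prod", which is what limits the impact of a compromise to a single risk domain.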

Beyond mature change management

If all changes go through a mature deployment mechanism with the right checks and balances, there is not much left to address. In reality, however, not all changes can or will use the standard deployment mechanism; manual changes are one example. Some changes will therefore be made outside a mature deployment mechanism. So, what are the options to manage such use cases?

To start, a manual change can be verified manually, e.g. by a second person observing over the shoulder as the change is made. However, this can result in a conflict of interest. One may argue that the same conflict exists for peer review in the development lifecycle; however, code as an artefact makes such conflicts less likely to arise. Still, some form of review is better than none. The challenge with manual changes is the lack of structure and code. A manual change may include ad-hoc scripts and GUI changes that make documentation, review and verification challenging. In addition, such changes can be one-offs, e.g. during troubleshooting or production outages. For GUI changes, screen recording is an option; however, using a recording to validate a complex change across multiple steps can be difficult. Manual changes are applied outside the standard development lifecycle, which causes drift and creates reconciliation challenges. As a result, the review process becomes more complicated and integrity becomes hard to achieve.
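To make the drift mentioned above concrete, here is a minimal sketch that compares the state declared in code with the state observed in the environment. It assumes both can be expressed as dictionaries keyed by a resource identifier; `detect_drift` and the sample resources are hypothetical and not tied to any particular cloud provider or tool.

```python
def detect_drift(declared: dict[str, dict], observed: dict[str, dict]) -> dict[str, list[str]]:
    """Compare the state declared in code with the state observed in the environment."""
    declared_ids, observed_ids = set(declared), set(observed)
    return {
        "missing": sorted(declared_ids - observed_ids),    # declared but not found
        "unmanaged": sorted(observed_ids - declared_ids),  # found but never declared
        "modified": sorted(
            rid for rid in declared_ids & observed_ids if declared[rid] != observed[rid]
        ),
    }


declared = {
    "bucket-logs": {"public": False},
    "vm-web": {"size": "small"},
}
observed = {
    "bucket-logs": {"public": True},  # changed by hand during troubleshooting
    "vm-debug": {"size": "large"},    # created by hand, never declared
}

print(detect_drift(declared, observed))
# {'missing': ['vm-web'], 'unmanaged': ['vm-debug'], 'modified': ['bucket-logs']}
```

Each entry in the result is a reconciliation task: either the code is updated to reflect the manual change, or the environment is brought back in line with the code.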

So, what are the alternatives in the absence of everything as code?

In order to increase confidence in manual changes, there are practical steps to consider:

Start with pre-change pseudo code, even if the exact or complete steps are not known. Think of pseudo code as the change expressed in simple but detailed steps. Even if the pre-change pseudo code is not complete or accurate, estimating the anticipated change produces important documentation and allows future changes to mature and standardize over iterations.
Post-change summarization in the form of pseudo code can improve the use and effectiveness of manual reviews. Pseudo code can be especially useful for complex changes that are actioned through multiple scripts and GUI screens.
Post-change summarization in a structured format is comparatively easy for a reviewer to follow, thereby reducing review effort and improving review quality.
Analysis of the anticipated change versus the actual change can improve change awareness and provide insights on how to move towards a standard development lifecycle.
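The comparison of anticipated and actual change can be sketched with Python's standard `difflib` module, assuming both the pre-change plan and the post-change summary are written as ordered lists of pseudo code steps. The function name `compare_change` and the sample steps are hypothetical.

```python
import difflib


def compare_change(planned: list[str], actual: list[str]) -> list[str]:
    """Return only the steps that differ between the plan and the post-change summary."""
    diff = difflib.ndiff(planned, actual)
    # "- " marks a planned step that was not performed, "+ " an unplanned step that was.
    return [line for line in diff if line.startswith(("- ", "+ "))]


planned = [
    "open firewall console",
    "add rule allow 443 from load balancer",
    "verify health check",
]
actual = [
    "open firewall console",
    "add rule allow 443 from load balancer",
    "restart web service",
    "verify health check",
]

print(compare_change(planned, actual))
# ['+ restart web service']
```

A reviewer then only needs to focus on the steps that deviated from the plan, and recurring deviations point to steps that are candidates for standardization.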

While the above sounds good in theory, the challenging part is the summarization itself, which requires a developer or engineer to go through the documentation process manually. The verification can also be time consuming. I believe there is an opportunity to leverage AI/ML to assist with post-change summarization and with verification, thereby reducing human time and effort. As an example, an AI system acting as an independent observer can document the entire manual change, and a reviewer can then query that record for post-change certification. Additionally, investment in ML can enable contextual learning from manual changes, which increases change awareness and identifies opportunities to move to a standard development lifecycle. The benefits also include efficiency and standardization of manual changes. Lastly, AI can scale with the increasing number and complexity of changes.

Key takeaways

Change is what enables progress. Integrity in changes is essential to establish confidence in the process. At the same time, the ability to accurately describe the state of the environment is fundamental to proving integrity. Mature change management is possible through a standard development lifecycle that also accommodates unavoidable manual changes. A mature change management process is also foundational for security assurance and for a strong incident management process (whether security or operations).

NO AI TRAINING: Without in any way limiting the author’s exclusive rights under copyright, any use of this publication to “train” generative artificial intelligence (AI) technologies is expressly prohibited without author's explicit consent. The author reserves all rights to license uses of this work for generative AI training and development of machine learning language models.

Author: Anurag Jain