Every holiday, and on the occasional long summer weekend, my teams were called in to run an incident response (IR) and forensics investigation. Whether it was in response to a ransomware attack or an external APT exfiltrating source code, our client would always say the same thing:
“Oh, so they finally exploited that one…”
In almost every IR case, the vulnerability or misconfiguration exploited to gain initial access was known to the security team. Worse, in many cases, there was even an open ticket to remediate the risk, alongside many others.
Visibility is not security
Over the past five years, security teams have directed their budget and focus toward identifying security risks, misconfigurations and vulnerabilities in their cloud environment. Today, CSPM and ASPM platforms have become a commodity. However, while the visibility side of things is mature, organizations face an endless backlog of risks without the proper technology and processes to efficiently remediate them.
Once a cloud security risk is identified, as a security leader you have a responsibility to handle it. However, knowing about a security risk and opening a ticket is just the beginning of the process. Most medium- to large-sized organizations spend between 30 and 60 days on average to resolve a single cloud security risk. Further, the constant influx of issues means these organizations are dealing with 50 to 160 new critical risks every day, far more than any security team could possibly handle.
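To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a steady flow of new critical risks and that every risk takes the quoted time to resolve, then applies Little's law (average open items = arrival rate × average time to resolve). The function name is made up for illustration, and real backlogs will be messier than this.

```python
def steady_state_backlog(new_risks_per_day: float, days_to_resolve: float) -> float:
    """Little's law: average open items = arrival rate * average time in the system.

    Assumption (illustrative only): arrivals are steady and every risk
    is eventually resolved after `days_to_resolve` days.
    """
    return new_risks_per_day * days_to_resolve


# Low and high ends of the ranges quoted above.
low = steady_state_backlog(new_risks_per_day=50, days_to_resolve=30)
high = steady_state_backlog(new_risks_per_day=160, days_to_resolve=60)

print(f"Open critical risks at any given time: {low:,.0f} to {high:,.0f}")
```

Under those assumptions, an organization would be carrying somewhere between 1,500 and 9,600 open critical risks at any given moment.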
We have a remediation problem
While today’s security stack is rich in tools that identify and contextualize risks, it lacks technology to actually handle those findings in a timely manner. This has unfortunately resulted in endless security tickets and an impossible risk backlog. Further, it has forced security teams to spend the majority of their time organizing and prioritizing risk findings. They are left picking which risks are the most urgent, because it is simply impossible to tackle everything.
Today, remediating cloud security risks is an extremely manual activity, requiring back and forth between the owners (security), the fixers (DevOps) and sometimes, when it comes to application-related risks, engineering as well. The problem is that those other teams are not responsible for the organization’s security. They are not being measured on security, but instead on productivity and scalability. This leads to a serious problem when it comes to fixing security risks, because security is so reliant on them.
The process itself adds even more complexity. First, security needs to validate the issue and understand whether it’s a single occurrence or a global problem. Then, they need to open a ticket for DevOps to handle. DevOps will then need to find time to determine the best way to actually handle the security risk. However, DevOps are not security experts, which leads to many back-and-forth interactions and iterations between teams to define the best path to remediation. Now, the biggest question: WHERE should the fix be applied? Terraform? CloudFormation? Or should we just remediate this specific risk manually (not best practice, but often what happens)? Understanding where the fix should be applied requires DevOps to do tedious code review, forensics and root-cause analysis, after they’ve already spent so much time dealing with the “security team’s problems”. This is the current state of remediation: the price is high for both security and engineering.
What if remediation is not an option?
Even worse, there are far too many scenarios where remediation is simply not an option. For example:
- A patch is not yet available
- The current infrastructure cannot support an upgrade
- The CI/CD is taking too long to execute
- The DevOps owner is on a two-week vacation
- The fix is too risky from a business continuity perspective
- And more!
Then what?
Time is against us, and we need a plan.
New problems require new solutions
Following the wide adoption of cloud security risk visibility tools, security teams now require the ability to efficiently remediate cloud risks. The tedious and manual way of doing things today is just not good enough. It’s time we evolve and leverage innovative technology to revolutionize the risk remediation process.
At ZEST, we believe a new approach is required, which we’re calling risk resolution. This approach includes three pillars:
Remediation
Remediation should always be code-first (i.e., applied in IaC) and address the root cause of the problem. This allows both security and DevOps to quickly remediate issues at scale and as part of a safe deployment process. It is also the best way to prevent future and recurring risks.
Mitigation
Resolution paths should offer multiple options. In cases where full remediation isn’t possible, mitigation using existing controls and cloud-native services should be an option. Mitigation is like taking painkillers while you wait for your surgery appointment. It’s critical to living your life pain-free: it’s quick and efficient, and it buys you time to make the bigger decisions in the background.
Prevention
Proactively preventing risks is the dream of all security engineers. Prevention strategies should be provided each time a risk is resolved so that risks do not keep resurfacing after remediation, which is often the case for about 80% of risks. ZEST delivers resolution paths that also include preventative measures to stop the same or similar risks from surfacing in the future, decreasing overall risk exposure across an organization’s cloud environment.
My recommendation to our readers
If you’re a security engineer, ask yourself the following questions:
- How many new security tickets are we creating each week?
- How many of those security tickets are we actually closing (with confidence)?
- How much of our time is spent on triaging cloud vulnerabilities and misconfigurations?
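If you want to put numbers on those three questions, a minimal Python sketch might look like the one below. The `WeeklyTicketStats` structure, its field names and the sample figures are all hypothetical, chosen only to illustrate the arithmetic. In practice you would pull these values from your own ticketing system and time tracking.

```python
from dataclasses import dataclass


@dataclass
class WeeklyTicketStats:
    """Hypothetical weekly snapshot; every field is an assumption for illustration."""
    opened: int          # new security tickets created this week
    closed: int          # tickets closed with confidence this week
    triage_hours: float  # team hours spent triaging findings
    total_hours: float   # total team hours available


def closure_rate(stats: WeeklyTicketStats) -> float:
    """Fraction of this week's ticket volume that was actually closed."""
    return stats.closed / stats.opened if stats.opened else 0.0


def net_backlog_change(stats: WeeklyTicketStats) -> int:
    """Positive means the backlog grew this week."""
    return stats.opened - stats.closed


def triage_share(stats: WeeklyTicketStats) -> float:
    """Fraction of team time spent on triage rather than fixing."""
    return stats.triage_hours / stats.total_hours if stats.total_hours else 0.0


# Made-up sample week, not real data.
week = WeeklyTicketStats(opened=120, closed=45, triage_hours=60, total_hours=160)

print(f"Closure rate:       {closure_rate(week):.1%}")
print(f"Net backlog change: {net_backlog_change(week):+d} tickets")
print(f"Time on triage:     {triage_share(week):.1%}")
```

A closure rate well below 100% combined with a positive net backlog change week after week is the pattern described throughout this post: the backlog only ever grows.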
If you are not happy with the answers and are eager to find a better way to take down cloud risks, feel free to contact me directly and I’d be happy to show you a demo of ZEST.
Let’s clean your risk backlog, and keep it clean!