When walking into a fine china shop, you can look, but Do Not Touch! The same rule applies in a customer Proof of Concept (POC): you can’t influence the infrastructure or applications, you can’t review the website, and you can’t encourage an application to disclose its version or variables to expose its vulnerabilities. It’s the mother of all challenges, and one I live with every day working at JASK. Welcome to Threat Hunting with Big Data Science, where the rules are clear – DON’T TOUCH.
In the world of AI-driven cyber security, it takes time for the technology to learn the network and listen for threat signals that rise above the noise to a level worthy of human interaction. Just as a large city such as San Francisco, CA would not rank its safety by the number of tickets issued, AI-driven cyber security cannot treat the number of alerts generated as a success metric. The number of events generated is not a measure of health that any efficiently running SOC should accept. While the AI takes the time it needs to learn the network, what is left for the SOC personnel and the SE to do?
Thankfully, that’s where Big Data comes in: the gold we are panning for sits within it, and the coal that keeps the fire burning is continually produced. On a Hadoop- and Spark-backed platform, the questions come as fast and fluid as the answers. It’s threat hunting with your hands tied, Big Data Science meets Signals Intelligence with network data. The underpinnings of Spark and Hadoop build a base for both an AI-driven platform and a big data hunting ground. The data is exposed through Zeppelin notebooks, making it the perfect playground for threat hunting, and this is the moment my job gets interesting. The blinders are taken off and we press ‘Play’ on the notebooks.
“Everyone is compromised,” right? That is what has been preached for more than a decade and what we are still told today. With this mindset, you would expect every POC to turn up something bad, compelling the customer to purchase. Unfortunately for my bank account, the reality is that while everyone is compromised (and it’s relatively easy to locate a compromise), the real question is how large an impact it will have on the organization. It seems the “Everyone is compromised” statement addresses trackers and adware 99.99% (four nines) of the time. The monetization of employees via adware isn’t something a CISO prioritizes as a high risk to the business. You have to dig deeper to make the payday, and that’s when the real hunting begins. I would modify the phrase “Everyone is compromised” to “Everyone is critically compromised at some point in time.” The job of threat hunting isn’t just to detect a threat, but to analyze and predict the threats the company classifies as high risk.
Threat hunting is about letting the network tell us where to look. When looking at network data, we look for DNS authoritative answers for non-authoritative domains and the top DNS queries for non-internal assets. We verify that strong TLS ciphers are being used throughout the enterprise, drill down with a focus on web server response codes, request headers, response headers, and suspicious user-agents on internal assets, and analyze the network data for how a business’s customers interact with its websites and applications, both internal and external. Do we see fast flux domains? Do we see rapid queries? Do we see suspicious executables (those hidden within zip files) or file transfer methods? Do we see an excess of SMB, RDP, or authentication protocol traffic? The questions we are able to ask Big Data are limitless, and “Big Data Lips Don’t Lie”.
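To make one of those questions concrete, here is a minimal sketch of the "Do we see fast flux domains?" check, written in plain Python rather than as a Spark or Zeppelin query. It is not the platform's actual detection logic. The `DnsAnswer` record, its field names, and both thresholds are assumptions made up for illustration. The heuristic is simply that a domain whose short-TTL answers span many distinct IP addresses is worth a closer look.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set


class DnsAnswer(NamedTuple):
    """One observed DNS answer. Field names are hypothetical."""
    query: str      # domain name that was queried
    answer_ip: str  # IP address returned in the answer
    ttl: int        # TTL of the answer record, in seconds


def fast_flux_candidates(
    records: Iterable[DnsAnswer],
    min_distinct_ips: int = 10,  # assumed threshold, tune per network
    max_ttl: int = 300,          # assumed threshold, in seconds
) -> Dict[str, int]:
    """Map each suspect domain to its count of distinct short-TTL answer IPs."""
    ips_by_domain: Dict[str, Set[str]] = defaultdict(set)
    for rec in records:
        # Only short-lived answers count toward the fast flux heuristic.
        if rec.ttl <= max_ttl:
            # Normalize case and any trailing dot so variants group together.
            domain = rec.query.lower().rstrip(".")
            ips_by_domain[domain].add(rec.answer_ip)
    return {
        domain: len(ips)
        for domain, ips in ips_by_domain.items()
        if len(ips) >= min_distinct_ips
    }
```

A domain that trips this check is a lead, not a verdict: content delivery networks and load-balanced services also rotate through many addresses with short TTLs, so the output is a list of places to look next.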
These Big Data queries perform the predictive function of viewing how a customer’s users, internal and external, good and bad, interact with the business. We don’t have the ability to touch an internal asset or application and influence the results; however, every business has customers. Whether employees or external users, these people are the ones hands-on performing the pentest. You may hire a “professional” pentest once or twice a year, but the reality is we can never predict with 100% certainty how customers will interact with the applications. Where is the company accidentally exposing itself, and how do you determine how at risk your company is?
Part II: The Results (Coming Soon)