The Dangerous Rise of Ransomware

Ransomware is a relatively new type of cybersecurity threat. An attacker takes and encrypts your valuable data, then charges you to decrypt it. The idea dates back roughly two decades to a theoretical concept called “cryptovirology”. Although the idea is not new, it has only become a real threat in recent years. The economics of ransomware differ from those of the threats that came before it, and these new economics give cyber-defenders hope of combating it successfully.

First, there is money in trafficking ransomware. The criminal usually demands payment in bitcoin to decrypt the data. Bitcoin fits this need perfectly; it is hard to trace and easy to launder. In US dollars, the amounts demanded started in the low hundreds but are steadily climbing; some estimate that $1 billion USD will be paid in 2016. Compare this to spam botnets, where criminals make pennies per bot and the income from spam email click-throughs has plunged to almost nothing. If there’s money to be made, criminals will focus on the most effective method with the highest payout. Today that happens to be ransomware.

Second, the business of ransomware is scalable. When a new tool becomes available on the hacker market, criminal organizations mount campaigns, just like sales and marketing departments around the world advertising their products. Much like a successful commercial, each campaign continues as long as it makes money. But if a threat is widespread and therefore scalable, then defenses against it become scalable, too. There are enough artifacts to study the campaigns effectively and build defenses based on the behavior of the campaign, not the specific signatures used. This behavioral defense is more sustainable and can limit the life of ransomware campaigns.

Third, ransomware, surprisingly, relies on open source. Ransomware has started to appear in Github repositories, where it is modified by other hackers to create new variants. While this may sound scary, compare it to another threat of past years: zero-day exploits, secretly developed and possessed by only a few actors around the world. If hackers have access to open source, then security product developers have access to it as well. For those who are active members of the open source community, this puts the cyber-defender on a more even footing.

Ransomware represents a new combination of economic factors in a cybersecurity threat. The revenue stream is more direct: from the victim to the criminal, with no middlemen. It operates at a larger scale, and it does not rely as much on limited-supply inputs. This attracts a lot of attention and innovation from the malware community, but it also gives security products a chance for strong innovation.

As Chief Data Scientist at JASK, I study the network behavior and tools of ransomware to better defend companies against a dominant threat in cybersecurity today.

Why We Picked Tensorflow for Cybersecurity


When I started in security analytics several years ago, the choice of tool and platform was typically dictated for you, usually by earlier investments the company had already made. These days, scientists have the opposite problem: a dizzying array of tools in a variety of licensing modes. The frustrations of limited toolsets have been replaced by the anxiety of choice. As wonderful as unlimited options may seem, in reality we must limit our options in order to be successful. Ideally, an organization can converge on a single choice: not perfect, but one that maximizes benefit while reducing the burden of maintenance.

At JASK, we have chosen a toolset that we think does that: Google Tensorflow. At a high level, these were the reasons:

  • Data science needs a toolset that can take advantage of CPUs, GPUs, or a mix of both.
  • A product for model building must recognize that the best language for modeling is not the best language for algorithms.
  • The experiences of local development and cluster development should be the same.

We need more cowbell.

It seems intuitive to use as much processing power as a piece of hardware offers; unfortunately, we rarely have this option. Most notebooks and workstations have a combined GPU/CPU on board (not always NVidia), and high-performance GPUs are a special option on most servers. On the other hand, while a GPU is fantastic at certain problems (matrix multiplication, for example), no class on GPU programming would tell you to do everything on a GPU. If you did hear this in a class, I recommend supplementing it with Heterogeneous Parallel Programming. Tensorflow meets this requirement: I can develop on a laptop with no GPUs, then run the same code on a cloud instance with an array of GPUs installed.
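As a hedged sketch of what that looks like in practice (written against the current TensorFlow 2 API, which differs from what existed when this was posted; the device strings and shapes are illustrative assumptions, not code from our product):

```python
import tensorflow as tf

# Pick a GPU if one exists; otherwise fall back to CPU. With soft
# placement (TensorFlow's default), the same file runs unchanged on a
# GPU-less laptop and on a server with an array of GPUs.
gpus = tf.config.list_physical_devices("GPU")
device = "/GPU:0" if gpus else "/CPU:0"

with tf.device(device):
    x = tf.random.uniform((128, 64))   # a batch of feature vectors
    w = tf.random.uniform((64, 8))     # toy model weights
    logits = tf.matmul(x, w)           # runs on whichever device was chosen

print(device, logits.shape)
```

Nothing about the model changes between the two environments; only the device discovered at runtime does.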

A statistician and a mathematician walk into a bar …

Back at university, Computational Finance and Applied Mathematics shared some faculty and even attended the same graduation ceremony. Yet all their coursework was in R and ours was in Matlab, which I think is the most concise illustration of model versus algorithm building in terms of software tools. Here’s another one: some believe in having a minimal knowledge of each algorithm’s inner workings and a wide view of all the possibilities and available tools, while others believe in understanding fewer algorithms, but deeply enough to program them yourself. I now have a theory for the likely reason behind this: your position on that spectrum is a function of how much hate and fear you have for C and C++ programming. To unite these examples: the quants and the applied mathematicians both knew Python, and to take advantage of decades of numerical optimization you have to do it in C (or, let’s face it, Fortran). ML solutions must be built on something that can bridge these two worlds. Tensorflow’s Python code for the model, compiled down into C, builds that bridge.
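That bridge is easy to see even in plain NumPy, where the model is expressed in Python but the numerics dispatch to compiled C/Fortran routines (a generic illustration of the idea, not a description of Tensorflow internals):

```python
import numpy as np

# The "model" written in readable, pure Python: a triple-loop matmul.
def py_matmul(a, b):
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

rng = np.random.default_rng(0)
a = rng.random((40, 30))
b = rng.random((30, 20))

slow = np.array(py_matmul(a.tolist(), b.tolist()))  # pure Python loops
fast = a @ b                                        # dispatches to compiled BLAS

# Same answer; the compiled path is orders of magnitude faster at scale.
assert np.allclose(slow, fast)
```

The Python side stays concise for the modeler; the decades of numerical optimization live below the `@`.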

Anyone know a pop culture reference about parallel programming? 

As much as I would like every data scientist in the world to have their own Hadoop cluster, we know that’s not going to happen. Also, in line with Moore’s law, today’s laptop surpasses the mainframe I helped my father load punch cards into when I was little. Doing your development on clusters is expensive, and debugging and testing become problematic as well. I have found that I am more willing to give up some application performance than to give up easy debugging and testing. With some education, data scientists can be persuaded to do their development with “small data” and to treat cluster parallelism and performance as a separate step. The ability to develop, test, and run on a local machine and then treat parallelization as a configuration step is a very nice thing about Tensorflow.
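Independent of any particular framework, that workflow looks something like this (a hypothetical toy step of my own invention, with the parallelism added as a mechanical afterthought):

```python
from concurrent.futures import ThreadPoolExecutor

def model_step(chunk):
    # Stand-in for a real training or scoring step: score each
    # record by its mean value.
    return [sum(record) / len(record) for record in chunk]

data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

# 1. Develop, debug, and test serially on "small data"...
serial = model_step(data)

# 2. ...then parallelize as a separate configuration step.
chunks = [data[:2], data[2:]]
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = [s for part in pool.map(model_step, chunks) for s in part]

assert serial == parallel  # the logic did not change, only its execution
```

The point is that `model_step` was debugged entirely on the laptop; scaling it out touched configuration, not logic.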

Does Tensorflow have everything we need? While baked-in visualization and a large user community are very beneficial, I would trade them in a heartbeat for a tool that ran GPUs from different vendors. And while Tensorflow was our choice, there are other good options to evaluate for yourself. Your mileage may vary; when deciding what’s best for you, I recommend also looking at Theano, DSSTNE, and sklearn to see if they are a better fit.

But as a team, you have to start somewhere, and my experience has shown that “somewhere” should be close to what production will look like, and capable enough that you are not greatly limited or forced to keep 50 different software packages for 50 problems.



Telling the Security Story

Data analytics and machine learning can be very empowering for security, but don’t lose sight of your true goal when using them.

IT auditors, security investigators, and threat analysts share a common need: they have to “tell the story” of a risk, incident, or threat to make change happen. The story must have impact to motivate action, but how many security practitioners feel they do that consistently? Is it the tools, the training, or both? There is a shared responsibility in telling the story.

Telling the security story is no different from telling any other story. People must be able to follow the order of events in the narrative. Since this is not fantasy, the storyteller has to remain credible in the plot, characters, and detail at each step. And beyond the teller’s own credibility, the audience has to take away some insight of their own; if not, the writing is stereo instructions, not a story.

Data analytics and machine learning give people powerful tools to help tell these stories. They can make the story more concise, provide unexpected plot points to include, and definitely increase the amount of insight the audience takes away. However, it is far better to use them to enhance the story than to rely on them alone to motivate the audience.

There are other things that help you tell the security story well. First, build the story out of small, objective observations that can be tied together through the narrative. Next, treat it like a conversation, not a one-way sales pitch, allowing time to solicit questions and input. Don’t get lost in background detail at the beginning; make sure each point of detail can be correlated to another pivotal point in the story. Lastly, don’t skip to the end: take the audience along for the journey, or they won’t be there when they are needed.

How do machine learning and data analytics make the difference? They allow the focus to move from hundreds of lines of data to just a few. Machine learning’s output gives an objective voice to the data. Above all, having a central theme, with occasional pivots into deeper detail (raw logs, for example), is a powerful way to tell the story successfully.
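As a minimal sketch of that reduction (a simple z-score stands in for a learned model here, and the event counts are invented for illustration):

```python
from statistics import mean, stdev

# Hypothetical per-host event counts: hundreds of lines in real life,
# a dozen here. The "model" surfaces only the few worth narrating.
counts = [4, 5, 3, 6, 4, 5, 97, 4, 6, 5, 90, 3]

mu, sigma = mean(counts), stdev(counts)
flagged = [(host, c) for host, c in enumerate(counts)
           if (c - mu) / sigma > 1.5]

print(flagged)  # the story starts from these few points, not all the rows
```

Two hosts out of twelve survive the cut; the analyst pivots into the raw logs for those two, not for everything.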

At JASK, we have created a product that uses the power of machine learning and data analytics to tell the best story in the cybersecurity world. As the Chief Data Scientist and Director of Products, my experience and knowledge get translated into tools for security practitioners worldwide.

Hadoop: The New Core of the SOC

Security teams are increasingly frustrated with legacy solutions that are not designed to address the data volumes they face today. Threat hunting and incident investigations are hindered by searches that take too long to run or simply time out. If the searches do finally run, we quickly discover critical data is missing because it was deliberately excluded due to the cost of indexing it or of long-term storage, ultimately yielding an incomplete picture.

To begin solving the problem, Hadoop becomes the logical first choice. It’s free, it’s scalable, and it’s fast. Once Hadoop is in place, an interesting shift starts to occur inside the SOC: suddenly there is new demand to get more data in, to extend access to more users, and, of course, to keep the data safe and secure. These are all good problems to have; you just need the right approach to ensure that Hadoop never becomes another isolated data silo with limited data.

At JASK, we believe Hadoop is the new core of the SOC for the same fundamental reasons. We leverage Cloudera’s Hadoop distribution to enable security teams to capture, store, process, and drive value from security data at massive scale, all while providing everything your organization needs to keep data safe and secure and still meet compliance requirements.


The Rise of the Security Data Scientist

In the future of cybersecurity, there is a new role that will be critical to the security of an organization: the Security Data Scientist. The security data scientist will bring new skills to the job, but that doesn’t mean they will be brought in from outside the InfoSec domain.

The new age of security has arrived. Populating the security operations center (SOC) with skilled firefighters is no longer enough. An organization now needs smoke detection: experts sitting in the same room, looking for the beginnings of a security incident. In an ideal world, these people could even spot the areas where newspapers and gasoline are stored together, heading off the danger entirely. This can be accomplished, but only with a new mindset and new tools.

I have spent the past years trying to produce such a change: creating the tools for the job, training and developing the people for the job, evangelizing the need, even doing the job itself. I have learned many things from pursuing the data scientist role. I came to the same understanding everyone agrees on: big data platforms and machine learning have a large role to play in the new age. But what I have seen in the wild surprised me, and it just might surprise you too:

It is easier to teach data science to a security person than it is to teach security to a data scientist.

While this has been my experience so far, the reason behind it took a while to grasp. It comes down to the expectations of the job and the type of people hired against those expectations. For the most part, today’s data scientist is expected to have a broad knowledge of the available tools, without too much depth in the computational details of each. This makes sense, as they are expected to spend a short time on each problem and then move on to the next. Security personnel, by contrast, are tasked with comprehending a complex, dynamic network of machines and their human counterparts; in the science world, they have more in common with biologists who spend years studying one species. The passion and interest in security data may lie more with the security person trying to do better at their job, but the data scientist can certainly contribute to the new SOC.

In general, data scientists need broad knowledge of which tool to use next and fewer details of the domain, while security personnel need only the most relevant tools, and they need to be able to use them well. The best security products will find a way to accommodate both needs.

At JASK, as an experienced scientist and tool maker, my goal is to create products that are wholly relevant to protecting your network and elevating the security skill of your SOC, while providing a place for data scientists to contribute their expertise and models.

Owning the game in the security operations center (SOC)


The cat-and-mouse game we play in the SOC has changed. Just a few short years ago, it was impressive if we were managing a million security events a day. Fast forward to today, and we are dealing with billions. As a result, investigations are taking longer than ever, false positives are at an all-time high, and, most importantly, real attacks are taking place while we exhaust ourselves trying to prioritize and understand precisely where to focus our efforts. We don’t need to change the game; we need to own it.

Fortunately, new strategies with a foundation in big data, machine learning, and artificial intelligence (AI) are changing the game for us. Leveraging big data to deal with the sheer volume of security data is not only the most economical choice; it also paves the way for streaming analytics that accelerate incident investigations. It enables threat hunting across massive amounts of data and dramatically improves the ability to perform real-time detection.

AI is making the already highly capable humans in the SOC even more capable. Deep learning can improve the overall ability to detect threats, allowing the humans to focus their efforts on the real attacks and dramatically lowering the amount of time wasted on false positives. It is time to own this game of cat and mouse. What’s your next move?