Four “Red Flag” SOC Phrases

Security is a hot-button issue in businesses today to a degree we haven’t seen before. For C-suite executives and other business leaders who begin paying closer attention to what’s going on in the SOC, it can be like entering a foreign country, full of customs and phrases that are as difficult to interpret as a whole new language.

CISOs and senior security staff end up playing an important role as translator. However, it can be challenging to distill complex SOC dynamics into bite-sized issues that the company can recognize and overcome.

This barrier has contributed to problems such as security blind spots, job dissatisfaction among analysts and the ever-growing skills gap – major, top-of-mind challenges for the security industry. That’s why I recently penned an article for Dark Reading called “SOC in Translation: 4 Common Phrases & Why They Raise Flags.” It begins:

Having worked in many different security environments, I've picked up on more than a few phrases that you hear only in the security operations center (SOC). These catchphrases frequently need translation — especially as CISOs and the entire C-suite look to get more involved with their organizations' security practices.

Below are a few to listen for, along with what they mean for the business.

If you’re running into communication barriers within your SOC or between security teams and business leaders, I recommend giving it a quick read. Being able to interpret these four phrases can save your business a lot of headaches and point you in the right direction when it comes to selecting new security tools and positioning your team for success.

The full Dark Reading article can be found at:


Domain Hijacking Impersonation Campaigns

A number of domain “forgeries,” or deceptive look-alike domains, have been observed recently. These attack campaigns cleverly abuse Internationalized Domain Names (IDNs) which, once rendered by a standard browser, take on the appearance of a corporate or organization name, allowing the targeted organization’s domains to be impersonated or hijacked. This attack has been researched and defined in past campaigns as an IDN homograph attack.

The interesting part of this attack is that it allows bad actors to hijack the targeted organization’s domain without actually hijacking it. As seen in past campaigns, in order to truly hijack a domain, malicious users must compromise the targeted entity’s domain guardian, which is usually a name registrar, an administrator or a web marketing department within the organization. Malicious users would proceed with different attack vectors in order to obtain credentials that allow the transferring or redirection of such domains. One of the popular attack vectors against an organization’s internet domain was DNS hijacking, in which malicious actors find technical ways of tampering with or subverting a company’s DNS in order to redirect it to another hosted site, subsequently targeting redirected victims with different attack vectors (drive-by downloads, phishing, impersonation, etc.).

Malicious actors have cleverly devised a way to use Internationalized Domain Names that, when rendered by standard browsers, look exactly like the targeted organization’s. Next, malicious actors register the look-alike domain and obtain SSL/TLS certificates for it. Once the domain is rendered in a browser, it is very difficult, almost impossible, to notice the difference. Previous work from researcher Xudong Zheng of Symantec and recent research by IronGeek and Brian Krebs give good examples of how effective IDNs can be when impersonating a targeted entity.
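To see how the trick works, here is a minimal Python sketch (the spoofed domain is a hypothetical example, not one from an observed campaign) that converts a look-alike IDN into the Punycode/ASCII form that actually gets registered:

```python
# Minimal sketch: a hypothetical look-alike domain whose first character
# is the Cyrillic "а" (U+0430) rather than the Latin "a". Rendered in a
# browser, the two names can be visually indistinguishable.
spoofed = "аpple.com"   # Cyrillic first letter
genuine = "apple.com"   # all Latin letters

print(spoofed == genuine)       # False: different Unicode code points
print(spoofed.encode("idna"))   # b'xn--pple-43d.com' (the registered form)
print(genuine.encode("idna"))   # b'apple.com'
```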

The figure below shows a simple translation tool.

The above example shows the domain name of a known cryptocurrency exchange that was recently targeted, according to TheNextWeb. Malicious actors used an IDN, cloned the site, purchased SSL/TLS certificates and proceeded to present the clone to trick victims.

Figure: Cloned Punycode/IDN site.

Figure: Rendered IDN with the secure icon in the browser.

As seen in both images above, this type of attack is very difficult to detect, even for an attentive observer.


How can we defend against these types of attacks? 

Although these types of attacks are very difficult for standard users to detect, they don’t represent direct compromises of the actual internet domains. Still, there are measures that can be taken to protect against them.

  • Protect your domain registrar accounts so they cannot be compromised and your domain redirected (multi-factor authentication, complex passwords, private registrations).
  • Select reputable domain registrars that will provide support and legal weight in case of domain misappropriation or disputes.
  • Monitor for impersonation and for registration of rogue/non-standard character domains that may be used against your organization. IDN checker websites can provide information on suspicious IDN registrations that match your internet domain when rendered; a simple mixed-script check is also sketched below this list.
  • Use tools such as domain locking to prevent transfers. DNSSEC (secure DNS verification of the actual domain and name servers) can also help users detect impersonating sites and deter malicious actors.
  • Properly document your domain ownership. It is not far-fetched that malicious actors could, at some point, attempt to claim ownership based on previous registration or other geopolitical factors.
  • Utilize web filters and blacklists to help prevent some of these attacks.
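As a starting point for the monitoring item above, here is a minimal sketch (not a production detector) that flags domains mixing Unicode scripts, a common trait of homograph look-alikes:

```python
import unicodedata

def scripts_used(label: str) -> set[str]:
    """Rough per-character script tag for each letter in a domain label."""
    tags = set()
    for ch in label:
        if ch.isalpha():
            # A character's Unicode name begins with its script,
            # e.g. "LATIN SMALL LETTER A" or "CYRILLIC SMALL LETTER A".
            tags.add(unicodedata.name(ch).split()[0])
    return tags

def looks_mixed_script(domain: str) -> bool:
    return any(len(scripts_used(label)) > 1 for label in domain.split("."))

print(looks_mixed_script("аpple.com"))   # True: Cyrillic "а" + Latin letters
print(looks_mixed_script("apple.com"))   # False
```

Mixed scripts alone do not prove malice (some legitimate IDNs mix scripts), so a check like this is best used to surface candidates for human review.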


For users:

  • Do not install mobile applications from outside authorized application stores. This attack is even more difficult to detect on mobile devices.
  • Install Punycode alert add-ons from your browser’s authorized extension store.

Figure: Punycode alert Chrome add-on.


To read a more technical and in-depth summary, access Rod's Threat Advisory on this topic here.



JASK is modernizing security operations to reduce organizational risk and improve human efficiency. Through technology consolidation, enhanced AI and machine learning, the JASK Autonomous Security Operations Center (ASOC) platform automates the correlation and analysis of threat alerts, helping SOC analysts focus on high-priority threats, streamline investigations and deliver faster response times.

Cryptocoin Mining Attack Vectors Reshaping the Threatscape

The rise in value of cryptocurrencies is driving malicious actors to implement payloads that use the CPU/GPU of compromised hosts to mine cryptocurrency. The process of mining is defined as “the use of computational power to process transactions for a cryptocurrency blockchain in order to receive a reward of cryptocurrency for the effort. The computational power will come in the form of CPU processing or GPU processing. Miners are rewarded for successful ‘shares,’ or completed computations, by receiving a payment with fees that are collected along the way by the p2p network.”*

By implementing cryptocurrency mining payloads, malicious actors can now increase the value of their victims by using their computing power. It is common in the cybercrime underground to seek profit from compromised hosts. These compromised hosts, often called “zombies” or “bots,” are usually part of botnets: networks of private computers infected with malicious software and controlled as a group without the owners’ knowledge. These botnets are built for the purpose of executing malicious activity (DDoS, spam, identity theft, carding, information theft, etc.). These activities feed the underground crime ecosystem as malicious actors profit from the resources obtained from these botnets.

With the addition of cryptocurrency mining payloads, compromised hosts now offer an additional benefit. The number of cryptomining attacks and payloads is growing and reshaping the current threatscape, with some of the main attack vectors including:


  • Cryptojacking: Code hosted in web applications that hijacks CPU processing power to mine cryptocurrency. The Coinhive JavaScript miner is an example of this, used in thousands of websites across the internet. This is one of the most popular attack vectors, as websites can receive thousands of views from oblivious users whose computers’ CPUs are then used for mining. These attacks can use cleverly disguised web page elements to hide mining code, with reports of mining code hidden in a page’s favicon (the icon associated with the web address that is displayed in the browser). A rudimentary check for such miner code is sketched after the figure below.

Fig 1.1 Favicon-embedded mining code
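One rudimentary way to check a page for the best-known in-browser miners of this era is to scan its HTML for miner script references. A minimal sketch follows; the indicator list is hypothetical and deliberately incomplete, and obfuscated or favicon-hidden loaders would evade it:

```python
import re
import urllib.request

# Hypothetical, incomplete indicator list of in-browser miner artifacts.
MINER_PATTERNS = [r"coinhive\.min\.js", r"coin-hive", r"cryptonight"]

def page_has_miner(url: str) -> bool:
    """Fetch a page and look for known miner script references."""
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
    return any(re.search(p, html, re.IGNORECASE) for p in MINER_PATTERNS)
```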


  • Malware Crypto Mining: There are several reports of malware variants now incorporating cryptocurrency mining payloads, such as JS.Coinminer. Malware campaigns are always active and seek to compromise as many victims as possible, now with the added benefit of harvesting CPU processing power.


  • Malicious Mobile Applications: There have been reported cases of malicious actors attempting to mine cryptocurrency via mobile devices. They do this by publishing malicious applications in application stores that, once installed, use the device’s processing power. However small each device’s contribution may be, it is important to consider that mining “pools” take advantage of as many devices as possible, using distributed processing/mining to expedite coin production.

Fig 1.2 Malwarebytes mobile cryptomining site


  • Adware Crypto Mining: Adware crypto mining involves embedding crypto mining code in ads, pop-ups and other types of web advertising. In some cases the advertisements themselves may be legitimate, but the embedded code uses the viewer’s computing power.


  • Crypto Mining Post-Exploitation Payloads: Once malicious actors compromise hosts with whatever exploits are available, they deploy post-exploitation payloads that mine cryptocurrency. This is especially the case for malicious actors targeting major CMS applications such as WordPress, in order to harvest massive amounts of processing power from very large distributions of servers across the web. It is important to note that one of the most-mined cryptocurrencies is Monero. It can be mined using CPUs (more abundant and common than GPUs) and offers a higher level of anonymity than many other cryptocurrencies.


These new benefits are reshaping the threatscape. For example, DDoS campaigns seem to be shifting as malicious actors weigh using compromised hosts for attacks versus for mining. Every time an attack campaign is uncovered - be it malware, ransomware or DDoS - what follows is a process in which the attack sources, usually infected hosts, get cleaned, taken down or blacklisted.

Before cryptocurrency mining, producing revenue from compromised hosts required malicious actors either to extract valuable information (identity, banking, credentials) or to use those hosts for not-so-subtle activities such as spam or DDoS. These two activities are very noisy and usually lead to blacklisting and takedowns. Now, with cryptocurrency mining payloads, these hosts can produce more revenue and stay undiscovered for a longer period of time.

This may be shifting attack campaigns: DDoS campaigns become more focused on specific targets and less widespread as malicious actors concentrate on mining and on keeping their hands on compromised hosts. It is a constant dynamic of the underground economy, where malicious campaigns are driven by return on investment.

Building Lightweight Streaming Time Series Models


With modern technology, almost all personal devices participate in a highly connected web and leave a footprint of our digital behaviors. The power of analytic modeling can help us identify adverse situations in streams of networked device data relating to fraud, system faults or even human error. The main problem we focus on in this blog is the processing of network logs for detecting anomalies. For example, when analyzing HTTP login logs, an unusual spike in the number of logins could be indicative of a possible cyber attack. One way to analyze large pipelines of this type of connected data is using time series models.

From a research and development perspective, time series anomaly detection at scale is a complex task. First, from a statistics perspective, the definition of an anomaly is vague and tends to vary across problem domains. Furthermore, many use cases in the cybersecurity industry hold all models, including time series models, to the standard of functioning in real time by being lightweight and efficient. From a data perspective, the individual series are oftentimes sparsely valued, unevenly sampled and too large to process on a single CPU.

In this blog post, we outline the work we have done building distributed streaming time series models for processing enterprise network traffic. Monitoring for changes in these network patterns over time can help solve tactical cybersecurity use cases and can also highlight unusual or suspicious connections. Ultimately, time series models can help establish a baseline for the network environment and surface things difficult for the human eye to mine out of the plethora of available information, such as a large amount of data transferred from a single host.

We try to overcome all the aforementioned complications and propose a statistically intelligent hybrid streaming model for detecting anomalies in cybersecurity data. It scales to Fortune 500-level requirements and supports a flexible design abstraction for creating new instances of models that map to individual use cases using a simple template approach.

1. Anomaly detection on time series

We describe a prototype use case to walk through the end-to-end design choices. In production, our model was deployed to predict anomalies on real-time streams of customer traffic. Prediction on streaming data not only makes tuning and training an interesting challenge but also requires the model to be lightweight and time efficient. Application-layer data like HTTP logs are sparse, which makes it difficult for models to learn and identify patterns. The algorithmic solution we devised to tackle these challenges is shown in Flowchart 1. In the following subsections, we present a breakdown of our modeling life cycle: data collection, pre-processing, identifying our model, identifying model parameters, training and testing.


1.1 Pre-processing

The data our model was trained on was collected in the form of HTTP counts per source IP from 5 separate customer cloud instances. This data originated as PCAP data (examples of raw PCAPs can be found here for a variety of security use cases) and was then transformed with Bro [7] for easy parsing of some key protocol types.

Once we had the Bro-extracted form of the PCAPs, we built basic transformations using standard ETL concepts, focused on converting logs to a generically typed feature vector per time point. We added a small sprinkle of abstraction at this layer in order to support processing heterogeneous use cases in the same code base, so that we can build a time series out of either strings or numeric values.

Pruning the granular logs through this ETL let the model learn on only the essential data needed for isolating time series patterns per host/user. A deviation from this “normal” pattern then helps it detect a possible anomaly.
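As a rough illustration of this step (the field names follow Bro's http.log conventions, and we assume the log has been exported as a plain tab-separated file with a header row rather than Bro's native commented format), the per-host counting looks something like:

```python
import csv
from collections import Counter

def http_counts(log_path: str, bucket_seconds: int = 60) -> Counter:
    """Aggregate HTTP requests into (source IP, time bucket) counts."""
    counts = Counter()
    with open(log_path) as f:
        for row in csv.DictReader(f, delimiter="\t"):
            bucket = int(float(row["ts"]) // bucket_seconds)
            counts[(row["id.orig_h"], bucket)] += 1
    return counts

# Each (host, bucket) count becomes one point in that host's time series.
```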


1.2 Identifying our model

Having preprocessed the logs, we proceed with the iterative process of mathematical model design and parameter selection. This involves an experiment-and-evaluate approach: essentially a bake-off between algorithmic concepts, deciding which candidate algorithms have the best outcomes on the test datasets we isolated.

We experimented with Symbolic Aggregate Approximation [3], Cluster-Based Local Outlier Factor [4], a hybrid sliding window, and the Autoregressive Integrated Moving Average (ARIMA). However, one of our production constraints was to have a model that, along with being time optimal, would also detect anomalies effectively in real time. These two requirements proved a major drawback for most of the algorithms we evaluated: either the model was too slow in time complexity, or it failed to detect efficiently on streamed data. After an intensive comparison study, we concluded that the hybrid sliding window was the best fit for our requirements and was sufficiently accurate at detecting anomalies. Some nice libraries we would like to have used, had we been able to relax some of the real-time constraints, are Twitter’s time series library [5] and the R library called forecast [6].

The sliding-window logic helps us traverse our real-time streaming data while listening for new logs to be added to the time series being traversed. Scoring the latest log uses both the history of the host and the population in the anomaly detection step. To carry out this test for an anomaly, we incorporate three major conditions: (1) check the historical Median Absolute Deviation (MAD) against the present MAD value, (2) compare against the median of the already-seen time series for all time series in the population, and (3) apply a threshold check for handling any sparsity in the window.

As seen in the literature [1, 2], absolute deviation around the median outshines standard deviation around the mean when it comes to robust outlier detection. Thus, we built the heavy weighting of the Median Absolute Deviation (MAD) into our algorithm as a key statistic. Before settling on MAD, we also did a comparative study between MAD-based and percentile-based outlier detection with a fixed window length. An example of this comparative study for a window length of 263 is shown in Figure 1. It can be seen that MAD performs better than percentile-based outlier detection.

Figure 1. Percentile-based versus MAD-based outlier detection comparative study.


The pseudo-code for the hybrid sliding window is shown below as Algorithm 1.

Algorithm 1. Streaming time series anomaly detection with global statistics
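Since the original pseudo-code image is not reproduced here, the following minimal Python sketch captures the three checks described above. The structure is an approximation of the idea, not the production code, and the window length, MAD threshold and sparsity threshold are the parameters discussed in the next subsection (default values here are placeholders):

```python
import statistics

def mad(xs):
    """Median Absolute Deviation of a sequence."""
    med = statistics.median(xs)
    return statistics.median(abs(x - med) for x in xs)

def is_anomaly(window, latest, global_median,
               mad_threshold=3.0, sparsity_threshold=5):
    """Hybrid sliding-window check over the three conditions above.

    window: recent history of one host's series (the sliding window)
    latest: newest observation to score
    global_median: median over all series in the population
    """
    # (3) Sparsity guard: too few non-zero points, don't score this window.
    if sum(1 for x in window if x) < sparsity_threshold:
        return False
    # (1) Deviation of the newest point vs. the window's historical MAD.
    med = statistics.median(window)
    local_hit = abs(latest - med) > mad_threshold * max(mad(window), 1e-9)
    # (2) Population check: the point must also stand out globally.
    global_hit = latest > global_median
    return local_hit and global_hit
```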

1.3 Computing our model’s parameters

As seen in subsection 1.2 and Algorithm 1, our hybrid model’s performance greatly depends on three parameters: (1) the length of the sliding window, (2) the threshold used to normalize MAD, and (3) the threshold for dealing with a sparse window.

The three values mentioned above are computed offline, and the model is then initialized with these precomputed values. Intuitively, we have an idea of an approximate range for these values, but in order to come up with concrete values, we rigorously carry out automated tests. The goal of this phase is to score multiple independent datasets while varying these parameters over a range, and to fix on the values that give the best results in terms of anomaly detection. These best results are backed by a comparative study in the form of raw data and plots produced by the unit tests. One example of such a comparative study is shown in Figure 2, where we plot the number of anomalies fired versus different window-length values. It can clearly be seen that the length of the window has a significant impact on the number of anomalies detected.

After computing these parameters offline, we then initialized our model with these parameters to be able to detect anomalies on real-time streaming data.

Figure 2. Varying window length versus number of anomalies.

We summarize the entire model’s life-cycle through Flow Chart 1.

Flowchart 1. Life cycle of hybrid model for anomaly detection on streaming data.


2. Conclusion

Automated detection of anomalous behavior in cybersecurity data can help reduce overall alert volume for a large number of practical use cases. For detecting outliers in streaming time series, the Median Absolute Deviation turns out to be useful as a local baseline statistic describing the history of an individual time series, in combination with a population’s global median and rare quantile values. We built a threshold-based model that dynamically adjusts to changing local behaviors in an individual time series along with global behaviors in the population. For a population of 25,000 individual time series over one month of data, we raised 50 anomalies in this fashion with the initial run of hyperparameter values learned in our model iteration phase. There is more work to be done in making the parameter-learning phase fully online and streaming, and we will talk about open source efforts around this topic in a future blog post.




[1] Christophe Leys, Christophe Ley, Olivier Klein, Philippe Bernard, Laurent Licata. Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, Volume 49, Issue 4, 2013, Pages 764-766, ISSN 0022-1031.

[2] Leo H. Chiang, Randy J. Pell, Mary Beth Seasholtz. Exploring process data with the use of robust outlier detection algorithms. Journal of Process Control, Volume 13, Issue 5, 2003, Pages 437-449, ISSN 0959-1524.

[3] E. Keogh, J. Lin and A. Fu. “HOT SAX: efficiently finding the most unusual time series subsequence.” Fifth IEEE International Conference on Data Mining (ICDM'05), 2005, 8 pp., doi: 10.1109/ICDM.2005.79.

[4] He, Z., Xu, X., & Deng, S. (2003). Discovering cluster-based local outliers. Pattern Recognition Letters, 24(9), 1641-1650.

[5] Twitter’s AnomalyDetection time series library (R package).

[6] Hyndman, R. J. forecast: Forecasting functions for time series and linear models (R package).



[7] Bro Intrusion Detection System.







Connecting the Dots

As a technology marketer, I have spent 20 years working with technical teams to identify ways to articulate how and why solutions work the way they do. While I have worked in many complex industries, I have found cybersecurity, AI and ML to be an interesting challenge.

This spurred the memory of a case study I once read about Meg Whitman at eBay¹. Whitman, a non-technical leader, joined the start-up in its infancy and used analogies to gain a “better understanding of the company's technical underpinnings.” For example, she likened technical capacity to a shoe factory and the project scoping process to train seats. These comparisons made the technology easier to understand and enabled her to lead the company to success through innovation and creativity while adding her valuable outsider’s perspective.

We are fortunate at JASK to have a world-class team of data scientists and ML engineers who have spent time explaining to me how and why the JASK Autonomous Security Operations Center (ASOC) platform works. The more I learn, the more I begin to understand the problem, and the impact that the right solution will have on SOC teams.

For those of you who remember relying on paper maps without GPS or internet connectivity, you might recall the feeling of driving without the visibility to be confident you were going the right way. I remember driving in what I thought was the right direction, but the longer I drove with uncertainty, the more anxious I would become, making minutes seem like hours.  

I can imagine as a security analyst, navigating through hundreds, maybe thousands, of alerts to gather enough context to determine if one alert, or a combination of alerts, is a higher priority than a completely different set. If I were tasked with protecting a company from the modern cyber threat, without knowing for certain that I was heading in the right direction, my level of anxiety would be far worse.

We all understand alert fatigue. With our phones always with us, many of us deal with constant alerts. It’s difficult to know what to pay attention to or how to prioritize responses without context. My phone may ring 20 times, but if I don’t know who is calling, I am unlikely to answer. Or, if my mom calls during the work day, I might wait until the end of the day to call her back. Except once, when I was in a meeting, my mom called, followed immediately by my dad. Those two calls, back to back, formed an unusual pattern that I immediately responded to, because I had visibility into who was calling and in what order.

For my final analogy, have you ever looked up at the night sky and pointed out various constellations? I haven’t. It’s difficult to look at billions of lights and discern patterns that form shapes. Unless something, or someone, shows me how to connect the dots, I don’t even know where to start. While constellations were used for navigational purposes centuries ago, I bet the majority of us would rather use an app like Sky Guide to show us exactly where the patterns are and how they connect to form constellations.

When I think of JASK and how we help SOC analysts more quickly and easily identify important threats and prioritize the most critical, I think of these analogies and how modern technology makes things more efficient. GPS tells us exactly how to get where we are going, caller ID reduces alert fatigue by giving us visibility into who is calling, and apps like Sky Guide connect the dots for us to show us exactly how patterns form constellations. We are saved from the frustrations we may have dealt with years ago because technology provides us guidance where we need it and context when we can’t find it on our own.

My hope is that, as the cybersecurity industry continues to harness the power of AI and ML, we remain focused on the problem we are really trying to solve, addressing the failures that have created unmanageable processes for SOC teams. This means reducing the noise, not creating more of it.  We won’t accomplish this with yet another tool that does a better job of detecting threats, but we will do it with solutions that improve SOC processes by delivering context and visibility to security analysts.   


¹ Harvard Business School Publishing, Meg Whitman at eBay, 2000


Love The Vendor That Loves You Back

The sales machine is a complex beast, and many may misinterpret whom a good sales team is ultimately meant to serve. When salespeople want you as a customer, it’s their goal to bring you into the fold and partner with your business. In the very competitive career world of sales, these guys and gals find more satisfaction than you may think in helping people and watching every business they touch become more efficient (this is what drives the money and bragging rights they often use as a measure of who is the best). It’s also easy to argue that salespeople are the busiest people you will ever meet - look at all of the emails, phone calls and surprise visits they make to your office. If there’s anything I can attest to throughout my career as both a customer of a sales team and an engineer supporting one, it would be the sheer fortitude, determination and dogged pursuit that a good salesperson has.

Allow me to show a different perspective on sales interactions and help you realize all there is to gain by learning to manage your vendors - specifically your salesperson. In the cyber security startup ecosystem, you will experience the most driven sales and engineering teams in existence as they fight for your business and their companies’ success. This dedication is begging to be leveraged by prospects and customers who oftentimes fail to recognize the opportunity, and perhaps choose to belittle the interaction.

An analyst has a number of tedious tasks and at least as many vendors claiming to assist with them. Sticking to the most relevant task, I could count 10-20 security startups that claim a “reduction in alert fatigue.” This begs the question: can they actually make a difference? Say you decide to engage with a vendor of this nature. Maybe you think they are selling dreams or, worse yet, snake oil. Then you decide to take a combative or defensive approach, either ignoring them entirely or - at the other extreme - setting a course to disprove them.

Let’s pause here a moment and consider that perhaps you are going about this all wrong.

What if you were to change the approach from radio silence or discrediting them bit by bit, and instead turn the conversation into a job interview? Rather than setting the conversation up for failure in binary fashion with finite outcomes, why don’t you try more open-ended questions?

Here is a list of five questions I recommend you ask your sales contact to drive a more productive outcome:

  • How many alerts can your product review for me in a day?
  • What types of attacks can your system detect?
  • How long will it take to tune your product to achieve the advertised results?
  • How strongly do you stand behind your claims?
  • Are you willing to put your team on your tool to tune and review alerts until your product lives up to its claims?

This last question may be the most important. It is a call to action for the vendor to put their money where their mouth is. If you’ve partnered with a solid vendor, they should be willing to dedicate resources beyond their technology to ensure the value they claim is realized and your problems are solved. They must be willing to invest in the relationship.

If you continue to move forward after asking these questions, you’ve not only expanded your capabilities in terms of technology, but you’ve added virtual headcount to your team at no additional cost. Believe it or not, there is also benefit to the vendor beyond the sale price. A willingness to engage at this deeper level is a tremendous learning resource for any company. Particularly when it comes to hot new technologies like AI, improvement can’t happen in a vacuum, and there is no silver bullet that can solve all of today’s and tomorrow’s problems. It’s so important to invest in a team with an interest in your success, and whose success you’re invested in, in return. Solving the immediate problem of today tends to have the side effect of shining the spotlight on the new problem of tomorrow, meaning you need a vendor committed to the long game. Your sales guy may not be your quarterback in this endeavor, but they could at least be an all-star receiver.

The moral of the story: don’t be so quick to dismiss that cold call or to rip apart a new technology. If you take an objective approach and interview for the good sales guy who is okay with being held accountable for problem solving and results, you may find yourself in the company of people who are enjoyable to work with, willing to invest in you and your business, and able to give you a lot more than the face value of your new technology purchase.

Visit and get a feel for what I’m talking about today.

Introducing CHIRON: A Case for Home Network Monitoring and Defense

CHIRON is an innovative solution developed by JASK’s Director of Security Research, Rod Soto, and Director of Data Science, Joseph Zadeh. While JASK fully supports our team’s innovation, CHIRON is not a product of JASK, nor is it represented or sold by JASK.


Nowadays, all our homes have become microenvironments of complex networking, composed of almost every single home appliance with added processing and networking capabilities. Examples of these home appliances include toasters, refrigerators, thermostats, cameras, TVs, wearables, door locks, light bulbs, vacuum cleaners, routers, printers, as well as personal computing products such as laptops, desktops, phones, tablets, etc.

Most of these devices, once connected, interact not only with the user but also with the internet. One reason they constantly interact with the internet is that these devices are basically propped-up sensors with enough processing power to interact with each other and send information to the cloud, where very large distributed computing infrastructure ingests it, processes it and responds to requests from these devices. This type of architecture requires a lot of computing power and expensive infrastructure, making it affordable only for very large enterprises.

At home, however, these interactions require only a very simple networking infrastructure: an internet connection, a router and a WiFi access point. The interactions are transparent to the end user as multiple network connections and data (some of it containing very personal information) flow from home to the cloud. Home users do not have any insight into these exchanges. They have no idea what is transferred to and from their home networks except for what they immediately see on their screens. This blind spot is very dangerous, as home networking faces many challenges, including:

  • Malicious file downloads: Many drive-by malicious sites push malicious files onto unsuspecting victims, as do phishing emails that lead victims into executing malicious code via the browser or fake/malicious applications.
  • Privacy risks: Many devices can lead to loss of privacy. A simple example is how malicious actors have been able to spy on victims via webcams.
  • Data theft: Malicious actors have been known to target home-based Network Attached Storage to exfiltrate personal data such as photos, financial data and sensitive private information.
  • Piracy: Is torrent/peer-to-peer file-sharing software running inside the home? Is the home network running a node for a piracy service?
  • Freeloading: Are there people using the home network without the owner’s knowledge, e.g. using the WiFi for personal use or downloading movies?
  • Targeting: Is the home being targeted by malicious organizations or even state-sponsored actors?
  • Malicious services: Are malicious or criminally linked services, such as dark web Tor services or spam email servers, running on the home network?

The above items are legitimate use cases for home network monitoring and defense. Today, home defense is usually limited to antivirus software. But considering that many devices on the home network cannot run antivirus at all, and that users can rely only on common sense against many current internet threats, the home network is pretty much defenseless.


Enter CHIRON: a home-based analytics, machine learning threat detection tool

CHIRON is a home analytics framework based on the ELK stack, combined with the machine learning threat detection framework AKTAION. CHIRON parses and displays data from p0f, Nmap and Bro IDS. It is designed for home use and gives great visibility into home internet devices (IoT, computers, cell phones, tablets, etc.).

It provides a picture of who and what your home devices are communicating and interacting with. The graph below shows examples of how IoT devices such as Google Chromecasts and Amazon Fire Sticks, Echo Dots and Echos can be seen by CHIRON.

The following is a CHIRON dashboard that shows identified operating systems, most active services/ports, and the most active local and external IP addresses.

These dashboards are simple and easy to read; however, they reveal a great deal about what is happening in the home network. This allows users to find unusual services, operating systems and communications that may indicate something suspicious is occurring in the home network.
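As a rough illustration of the kind of rollup behind these dashboards (assuming a Bro/Zeek conn.log exported as a headered tab-separated file, rather than Bro's native commented format), the most active services and device-to-external-IP pairs can be computed like this:

```python
import csv
from collections import Counter

services, peers = Counter(), Counter()
with open("conn.log") as f:
    for row in csv.DictReader(f, delimiter="\t"):
        services[row["service"]] += 1                       # e.g. dns, http, ssl
        peers[(row["id.orig_h"], row["id.resp_h"])] += 1    # device -> remote IP

print(services.most_common(5))   # most active services/ports
print(peers.most_common(5))      # chattiest (device, external IP) pairs
```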

The line between home networks and internet is blurring, as all these internet-enabled devices are constantly communicating back and forth. CHIRON seeks to provide basic answers for home network monitoring such as:

  • Do you live in a highly dense building? Is anybody poaching your internet service?
  • Where do all those devices connect to?
  • Where are all my users connecting to?
  • Is there any suspicious NORTH-SOUTH traffic? Are there suspicious IPs connecting to your webcam or door locking system?
  • Dynamic asset discovery (know what devices in your home are actually live and communicating).

CHIRON will perform the following basic tasks:

  • Performs basic discovery and analytics of home network assets (IoT devices, workstations, laptops, servers, routers)
  • Fingerprints users, services, and protocols
  • Applies analytics to users and devices (average session length, traffic, visited sites)
  • Identifies odd application/traffic/services


AKTAION - Machine Learning Threat Detection framework

Besides providing simple and easy-to-understand analytics, CHIRON also works with AKTAION, a machine learning framework for threat detection and active defense. AKTAION is scheduled to run every 4 hours and comes with its own benign training dataset.

If either phishing or ransomware delivery is discovered, micro-behavior indicators will be shown, as in the following picture.

Future CHIRON iterations will incorporate other home-related protocols and tools such as Bluetooth, Zigbee, Kismet and popular open source IDSs.

The CHIRON framework was conceived to be open source. The objective is to bring in collaboration from the security community to develop a home-based monitoring, analytics and detection framework that is easy to use and transparent for end users. With collaboration and feedback from the community, this framework can eventually become a free, easy-to-use and easy-to-deploy tool for those who lack technical knowledge yet are exposed to the dangers of the internet.

Give CHIRON a try: download the virtual machine here. You can also reach out to the creators via Twitter: @rodsoto and @josephzadeh.


Keeping the “Science” in “Data Science”: Calibrating Algorithms for Threat Detection

As attack payloads and methods have become more easily adaptable and customizable to individual campaigns and targets (e.g. polymorphic malware, customized payloads, credential theft, etc.), threat detection systems have migrated from using static, predefined rules (e.g. snort, regex) to more data-driven detectors (UBA, honey net salting, etc.). Although often more effective than older methods at detecting complex and subtle attacks, these newer techniques can be accompanied by a greater level of uncertainty. Fortunately, there are statistical methods we can use to improve performance and to clarify results for both product developers and users.

For many years, threat detection was based on signatures, which can be thought of as assertions about data based on previously seen attacks or on rule violations. We often think of hash matching or regex rules as the main types of signature detection, but in fact, many types of behavioral detection are also signature detection. For example, detecting outbound traffic volumes over some predefined threshold, or detecting any use of particular protocols on an internal network, are commonly flagged unusual behaviors, but they are also assertions about the data. Signatures in general can be circumvented by a knowledgeable attacker, particularly an insider, and behavior-based signatures are often brittle or require manual tuning by IT staff. (For example, just how much external volume is unusual?)

Adaptive Detection

The advantage of data science-driven detection is the ability to adapt detection for each installation environment, and in some cases, each system. However, developers and customers need to become comfortable with the inherently statistical nature of many of these methods. Unlike assertions, model-based deductions are not binary (threat or non-threat) but rather yield a probability (between 0 and 1) that a particular event or datum is a threat. Depending on the model, there may also be a confidence in the probability estimate. These details are often hidden by using thresholds. For instance, if the threat probability is > 0.95 then it’s a threat, otherwise (<= 0.95) non-threat.

Of course, signature-based detections produce false alerts, but given the use of thresholding and the desire not to miss true alerts, false alerts can be more common with probabilistic detection. One common method to modulate the true-alert/false-alert mix is to maintain a set of whitelist or blacklist items related to the detection method. For instance, a blacklist of known malicious domains can be maintained to help improve the true alert set, and a set of known safe domains can be maintained to help reduce the false alert set. Unfortunately, these lists are necessarily incomplete and rarely maintained over time. Also, as with other assertion-based methods, they are inherently reactive and do not accommodate unknown domains in advance.

Model Metrics

In order to discuss more data-driven methods to improve performance, let’s consider our metrics. A standard one is the ROC curve, which plots the true positive rate (TPR) against the false positive rate (FPR) across detection thresholds; the area under that curve (AUC) summarizes performance in a single number, with 0.5 being no better than random.


Figure 1. AUC examples. Black is no better than random; blue is better than red.

Other metrics like F1 and precision-recall can be used, and at JASK we normally calculate both AUC and F1. But when we can calculate AUC, it is the most useful for model comparison and for selecting an operating point, or a trade-off between TPR and FPR.
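As a quick sketch of what this looks like in practice (the labels and scores below are made-up placeholders; real ones come from a held-out set of threat and non-threat events):

```python
from sklearn.metrics import roc_auc_score, f1_score

y_true = [0, 0, 0, 1, 1, 0, 1, 0]                   # 1 = threat
scores = [0.1, 0.6, 0.2, 0.8, 0.4, 0.3, 0.9, 0.2]   # model probabilities

print("AUC:", roc_auc_score(y_true, scores))        # threshold-free comparison

y_pred = [int(s > 0.5) for s in scores]             # pick an operating point
print("F1:", f1_score(y_true, y_pred))              # depends on the threshold
```

Note that AUC needs no threshold, which is exactly why it is convenient for comparing models before committing to an operating point.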

Model Training and Calibration

When we develop a detection model, we will normally try to gather data from as many environments as possible for training. Then, as we move a model into production (not only for a new model, but each time we do an installation), we want to make sure the model is working as well as it can. However, instead of using an assertion-based tuning method, we want to continue to use a model-driven method. Depending on the model being used, we have several options.

The most straightforward method for model calibration is to adjust the training set based on local data characteristics and re-train the model. For batch-trained classification, this is a reliable approach. Models that are inherently streaming and maintain a latent or feature-space “state,” such as topic and community models, can be operated in a dynamic mode such that over time, as new data is ingested, the model drifts adaptively toward more relevant solutions. Active learning models facilitate direct feedback either from users (e.g. this was a good detection, that was not a good detection) or from an internal quality metric. Such “guided learning” or “reinforcement learning” models often learn more adaptive and robust representations of data but require considerably more effort to develop.

Example Calibration: Domain Generation Algorithm Detector

One detection model improvement that does not fit well into the above categories, but instead requires a model extension, is n-gram subtypes for DGA (Domain Generation Algorithm) detection. Attackers often use DGAs to create domains for hosting malicious command and control servers, beaconing and other communication infrastructure use cases. We developed an artificial neural network (ANN) using Long Short-Term Memory (LSTM) for detecting algorithmically generated domain names. Figure 2 shows a few typical examples from non-threat and threat cases; a minimal sketch of such a model follows the figure.

Figure 2. Examples of threat and non-threat domains based on a Domain Generation Algorithm (DGA)
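For readers curious about the shape of such a model, here is a minimal Keras sketch of a character-level LSTM domain classifier. This illustrates the general approach, not JASK's production network, and the vocabulary, sizes and encoding are placeholder choices:

```python
from tensorflow.keras import layers, models

MAX_LEN, VOCAB = 63, 128   # max DNS label length; ASCII character vocabulary

def encode(domain: str) -> list[int]:
    """Map a domain to a fixed-length sequence of character codes."""
    codes = [ord(c) % VOCAB for c in domain.lower()[:MAX_LEN]]
    return codes + [0] * (MAX_LEN - len(codes))   # zero-pad to MAX_LEN

model = models.Sequential([
    layers.Embedding(VOCAB, 32),             # learn character embeddings
    layers.LSTM(64),                         # model the character sequence
    layers.Dense(1, activation="sigmoid"),   # output: P(domain is DGA)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
```

Training then only requires encoded domains and 0/1 labels from known-benign and known-DGA corpora.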

We found during some customer installations that a customer’s external traffic might involve a particular set of domains that, for some reason, were flagged as DGA by our baseline detector. As mentioned above, this sort of behavior is relatively common for probabilistic models. Instead of gathering a list of the domains we happened to observe and whitelisting them in an adjunct process external to the model, we examined these domains (see Figure 3; a sketch of the extension follows the figure) and noted that many of them were characterized by dictionary-word n-grams. The most straightforward way to address this was to extend the model to check for dictionary words in the n-grams of candidate domains. This made the model robust to potential false positives from all domains in this category, not just the domains in the observed set.

Figure 3. False Positive DGA examples, with some egregious examples highlighted. Many in this category are composed of dictionary word n-grams.
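A minimal sketch of such a dictionary n-gram guard might look like the following (the word list and example domain are hypothetical; a real deployment would use a full dictionary and feed the coverage score into the model rather than thresholding it directly):

```python
# Hypothetical, tiny word list for illustration only.
WORDS = {"spark", "people", "driven", "lights", "in", "the", "box"}

def dictionary_coverage(label: str) -> float:
    """Fraction of characters covered by greedy dictionary-word matches."""
    by_length = sorted(WORDS, key=len, reverse=True)  # prefer longest words
    covered, i = 0, 0
    while i < len(label):
        match = next((w for w in by_length if label.startswith(w, i)), None)
        if match:
            covered += len(match)
            i += len(match)
        else:
            i += 1
    return covered / max(len(label), 1)

# A name made mostly of real words is probably not machine-generated:
print(dictionary_coverage("sparkpeople"))   # 1.0 -> suppress the DGA verdict
```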

The following figure shows an example of the DGA detection algorithm running in production on the JASK system.

Figure 4. Screen Capture from DGA detection in JASK Trident.

Ongoing Monitoring and Metrics

In order to maintain ongoing situational awareness of detection performance across all JASK installations, we have instrumented the product to feed various logs and metrics into the usual system performance dashboards. See Figure 5 below for an example of one such internal dashboard. This provides data scientists, engineering and DevOps with views over time, per installation, aggregated in time (hours, days) and aggregated by category and by customer segment (financial, data center, etc.), as attacks often cluster along these axes. This allows us to get ahead of model trends during operation, not just during initial calibration.

Figure 5. Internal performance monitoring dashboard, showing detection rates by category over time.


We encourage data science teams to look for principled, model-driven ways to measure and control the performance of analytics, rather than ad hoc external methods such as lists or separate processing steps. Retraining, dynamic / streaming models, and guided / reinforcement learning are all viable options. A little effort up front can lead to improved and more robust performance over the lifecycle of an engagement.

Keeping Security Analysts in the Deep End

My 15-year-old son followed in my footsteps and became a lifeguard last summer (proud dad moment). He works at a large water park and had nine water saves in just his first summer. Since I have two water saves after 4+ years as a guard, it’s safe to assume a larger pool means more people and more risk.

This lifeguard paradox has parallels to today’s security analyst: as the number of alerts and the amount of data available to them grows, they must take on more responsibility. More logs and alerts to triage, each taking time to analyze, means a bigger need for more lifeguards. However, as we in the security industry know, there are far more jobs than applicants and recent graduates can fill.

There is quite a symbolic cohesion between a lifeguard and a security analyst. A lifeguard’s duty is to keep watch over the pool and ensure everyone is safe, jumping in as needed to rescue. Security analysts may not be saving lives, but they are monitoring the alert landscape and jumping into alerts as needed to rescue. Even a security analyst who stands guard (or sits) watching the entire environment quickly gets overwhelmed, because there is no standard alert-to-analyst ratio, unlike pools’ lifeguard-to-child ratios.

Many studies estimate a security analyst can triage, escalate or close about 8-12 security alerts per hour. This means that, on average, an analyst dedicates only 5-7 minutes to any one alert. Adding more log events and alerts then pushes teams to expand and orchestrate their activities. I’ve seen teams filter out events, saying we are only going to watch “X” level or “Y” type of event, essentially trying to make the analyst more efficient by looking only for the “assumed bad.” In this process, the context and story analysts see become fragmented. Worse yet, the assumed “good” or “unimportant” that gets filtered out is typically where the malicious activity is hiding.

What analysts need is an easy button: an asset-centric view of all data with autonomous correlation, enabling them to spend more time looking through and understanding the relationships of various data types down to the protocol level - i.e. what we call the “deep end.” The deep end, for the security analyst, is all the data output from security tools and enterprise systems. This data comes in different forms and from many different point solutions. An analyst’s ability to sift through this data and determine what story it is telling is the true purpose of collecting it all. Simply put, in order to do their job effectively, an analyst must utilize the deep end: review all the data in current systems built around a 5-7 minute allowance per alert, and triage. Unfortunately, we all know this is not only impossible, it’s broken!

Why is this so? As an analyst, if I get an alert and search any SIEM or log management solution for that IP, I will inevitably find a tremendous amount of information. I may find heaps of proxy, endpoint, DHCP, AD and countless other logs, but their haphazard organization in today's SIEM makes it very challenging to understand how all of this information ties together in a meaningful way. For that reason, it’s hard to stay in the “deep end”: it’s hard to find the relevance necessary to complete the story of the alert and triage with an accurate measure of its “risk” to my organization.

I call this phenomenon “analyst blinders.” With those blinders on, many investigations end with: “I don’t have enough information to make a decision.” This happens because when an analyst has too much unorganized information and too little time, their triage capabilities are limited, and they will often unintentionally miss or ignore the relevant data they need.

Security teams are building tools to make searching faster, but how does one build a system that copes with the amount of data involved in order to find what is really relevant? I propose inserting a “pool skimmer.” In pools, skimmers continuously cycle through all the water in the enclosure and catch the dirt and debris. If pool skimmers operated like today’s security methodologies, it would be the equivalent of putting a bypass in the system for up to 80% of the water flow. As you can imagine, quite a bit of filth would remain. In the security industry, we need that skimmer - one that lets us see everything with no filter but still lets us find what is relevant - and for that, we need artificial intelligence.

Humans simply cannot scale. AI has been tasked with changing how we look at this data problem. Applying AI removes scaling issues and saves teams from having to write manual correlation rules and malicious logic to understand abnormal behavior. AI can make sense of the patterns and behavior of each asset in your enterprise. Some individual security and system logs can be meaningless; however, when combined with other “meaningless” logs, the activity becomes more significant. The AI examines all the data - high-fidelity and seemingly meaningless alike - ultimately identifying the relationships and plotting them over time to quickly establish relevance. AI applied in this manner becomes your lifeguard, watching the pool with an awareness of the history of each swimmer.

Using AI keeps the analyst in the deep end of the data pool. The analyst is able to review all of the relevant data used by the AI to pull apart a possible compromise. The JASK timeline view is a great example of showing the story of anomalous activity. Each flag represents a data point that the AI associated with the asset. Analysts are then presented with a view of all relevant information, organized in an easily understood format. The analyst can drill down on any of these and get the specific details to support the context of the investigation.

Think of each data point above as an action that can lead to drowning, and it becomes very clear that multiple pieces of the puzzle are needed to make an informed decision. Individually, each event can be interesting but does not tell a full story. It is only when all the data comes together that a real risk can be identified. Splashing, screaming and a head underwater are all indicators of drowning. Alone, however, they are all typical actions of kids in a pool. If a lifeguard is able to understand all these actions, in the order they happened, as a story, it is much easier to realize that someone is in danger before it is too late. Of course, this example oversimplifies an analyst's job. It is not humanly possible to match all these events and times and create a story without AI.

This is exactly why I chose to come to JASK: to help the security analyst as a lifeguard. The JASK platform allows the analyst to watch more data without being overwhelmed by more alerts. Let the AI be the frontline analyst and keep your analysts in the deep end. Remove the blinders and provide the freedom to investigate and understand.

Learn more about how JASK provides the right context to your security alerts by signing up for our webinar here.


Meltdown - The mirror in the CPU

A new series of vulnerabilities has been disclosed (CVE-2017-5753/5715/5754) affecting the most popular computer processors and leaving millions of devices exposed to exploitation. These vulnerabilities allow users/applications with low-level privileges to view data in memory. Data stored in memory may include passwords, pictures, texts and other types of information. At first these vulnerabilities were thought to affect only Intel, but other reports indicate that AMD and ARM processors are affected as well.

Meltdown and Spectre are the two trending names associated with these vulnerabilities. Meltdown removes the boundary between applications and the operating system; exploiting it allows an attacker to access memory data across any running application or process. Spectre forces or tricks programs/applications into leaking memory by causing errors, after which that memory data can be accessed.

Fig 1.  Meltdown POC

Fig 2. Spectre POC (modified)

Proof-of-concept exploitation code suggests a side-channel type of attack that may require some preliminary steps before full exploitation (e.g. tricking a user into browsing a page with exploit code, or accessing a server, transferring the code and then executing it). However, this does not minimize the risk of these vulnerabilities. For example, a malicious actor could simply create an account with a popular cloud-based provider, execute the exploit on his/her servers, and see other tenants’ information via memory/application leakage.

It is also very possible that these vulnerabilities will soon be chained with other exploits, enabling them to be executed in a manner that allows more streamlined memory access.

The biggest implication of these vulnerabilities is the number of devices that may be affected. Considering that Intel, AMD and ARM account for probably the majority of modern processors, the task of applying mitigations seems very difficult. Some of these devices may not be patchable (think embedded processors in cable modems, routers and many other IoT devices). Others may be patched, but the current mitigations, as of the writing of this blog, are workarounds rather than fixes, and these workarounds come with a price: reduced performance and increased latency. The performance reduction may be significant enough to discourage some vendors from patching these devices.
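As a quick aside, recent Linux kernels expose mitigation status through sysfs (whether these files exist depends on kernel version and patch level); a minimal check might look like:

```python
from pathlib import Path

base = Path("/sys/devices/system/cpu/vulnerabilities")
for vuln in ("meltdown", "spectre_v1", "spectre_v2"):
    status_file = base / vuln
    # Older/unpatched kernels simply do not create these files.
    status = status_file.read_text().strip() if status_file.exists() else "not reported"
    print(f"{vuln}: {status}")   # e.g. "Mitigation: PTI" once patched
```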

Suggested mitigations consist of applying patches at the operating system level, as the deployed hardware at this point is flawed and unmodifiable. Below is a list of detailed technical resources and mitigation information.


US-CERT

Official Intel

Official AMD

Official ARM

Official vulnerability page with technical POC and research information

Google Chrome

Official Microsoft

Google Project Zero