The concept of Security Zoning, also known as Segmentation, is one of the most important architectural foundations in modern network security design. Security Zoning was first introduced back in the mid-1990s when firewalls started to hit the market. In those days, firewalls were usually deployed at the Internet perimeter and the deployment principles were fairly simple (Outside, Inside and DMZ).

 

Over the last 20 years, the pervasiveness of security zoning has increased significantly, moving from its original use at the perimeter to common use inside the organisation, such as within data centres and cloud infrastructure, or to control access to high-value assets. Unfortunately, many zoned architectures are deployed to meet compliance requirements rather than to be a maximally effective security control.

 

The intention of this post is to present a new way of thinking about security zoning design in an era of Big Data and Data Science. Security is a field with many large and fascinating data sets just waiting to be analysed.

 

Over the last decade we have seen huge growth in network size, speed, connectedness and application mix. Application architectures have grown and become more mission critical at the same time. In response, the complexity of network security architectures, i.e. firewalls and their associated rule sets, has increased exponentially. Today, many deployments have become unmanageable: either the operational costs have blown out, or organisations have simply given up trying to engineer an effective implementation. I still see many organisations that try to manage their firewall rule sets in a spreadsheet. In most cases, this approach (IMHO) just does not work effectively any more.

 

If we had to boil the problem down, we are dealing with a 'management of complexity' issue. This is a problem which is ripe for the application of Big Data tools, Data Science and Machine Learning principles.

 

Big Data tools are able to ingest massive data sets and process them to uncover common characteristics. Let's look at just two key potential data sources which could be leveraged to improve the design approach:

  • Endpoint information - A fingerprint of the endpoint to determine its open port and application profile and hence its potential role.
  • Network flow data - Conversations both within and external to the organisation. In other words, who talks to who, how much, and with which applications.

 

To obtain endpoint information, Nmap is a popular, though often hard to interpret, port-scanning tool. Nmap can scan large IP address ranges and gather data on the targets, for example open ports, the services running on those ports, service versions, etc. Feature extraction is a key part of an unsupervised machine learning process; each of these attributes can be considered a 'feature', with each endpoint having a value for each feature. For example, an endpoint with port 80 open, acting as a web server and running Apache.
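
As a rough illustration (my own sketch, not part of any particular toolset), the snippet below assumes a scan has been saved with 'nmap -sV -oX scan.xml' and parses the XML into per-endpoint feature dictionaries. The file name is a placeholder.

    import xml.etree.ElementTree as ET

    # Parse an Nmap XML report (nmap -sV -oX scan.xml <range>) into simple
    # per-endpoint feature dictionaries for later clustering.
    def extract_features(xml_path):
        features = {}
        root = ET.parse(xml_path).getroot()
        for host in root.findall("host"):
            address = host.find("address")
            if address is None:
                continue
            host_features = {}
            for port in host.findall("./ports/port"):
                state = port.find("state")
                if state is None or state.get("state") != "open":
                    continue
                service = port.find("service")
                name = service.get("name", "unknown") if service is not None else "unknown"
                product = service.get("product", "") if service is not None else ""
                key = port.get("protocol") + "/" + port.get("portid")    # e.g. "tcp/80"
                host_features[key] = (name + "/" + product).rstrip("/")  # e.g. "http/Apache httpd"
            features[address.get("addr")] = host_features
        return features

    print(extract_features("scan.xml"))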

 

Machine Learning techniques can be used to process the large data sets which would be produced by an enterprise-wide scan. Groups of endpoints with common, or closely matching, feature values can be 'clustered' using one of a number of machine learning algorithms. In this case, clusters are distinct groups of samples (IP addresses) which share similar characteristics. Different algorithms, with different configurations, group these samples in different ways, with K-Means being one of the most commonly used algorithms.

 

Entry into the domain does not require a deep mathematical understanding (although it helps). Python-based machine learning toolkits like Scikit-Learn provide an easy entry point.
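
For example, here is a minimal Scikit-Learn sketch of the clustering step, using made-up endpoints and simple open-port presence indicators as features; a real deployment would need far richer feature engineering and a considered choice of cluster count.

    from sklearn.cluster import KMeans
    from sklearn.feature_extraction import DictVectorizer

    # Illustrative endpoints described by open-port indicators (1 = port open).
    features = {
        "10.0.0.10": {"tcp/80": 1, "tcp/443": 1},
        "10.0.0.11": {"tcp/80": 1, "tcp/443": 1},
        "10.0.0.20": {"tcp/1433": 1},
        "10.0.0.21": {"tcp/1433": 1, "tcp/3389": 1},
    }

    ips = list(features)
    vectoriser = DictVectorizer(sparse=False)
    X = vectoriser.fit_transform([features[ip] for ip in ips])

    # Cluster the endpoints; the number of clusters is a tuning decision.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    for ip, label in zip(ips, kmeans.labels_):
        print(ip, "-> cluster", label)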

 

Flow information can be output by many vendors' networking equipment, as well as by probes, taps and host-based agents. There are a number of tools which can ingest network flow information and place it in a NoSQL data store such as MongoDB, or a columnar storage format such as Parquet.
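
As a simple sketch, assuming flow records have already been exported to a headerless CSV (for example by a collector), Pandas can load them and persist them as Parquet. The file and column names are hypothetical, and the Parquet step requires the pyarrow or fastparquet package.

    import pandas as pd

    # Load flow records from a headerless CSV export (column names are assumed).
    flows = pd.read_csv(
        "flows.csv",
        names=["ts", "src", "dst", "dst_port", "proto", "bytes"],
    )

    # Persist in a columnar format for later analysis at scale.
    flows.to_parquet("flows.parquet")

    # Quick sanity check: top talkers by bytes sent.
    print(flows.groupby("src")["bytes"].sum().sort_values(ascending=False).head(10))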

 

With flow information providing detailed information on conversations, graph databases like Neo4j can be used to construct a relational map, that is, a map of the relationships which exist between different endpoints on the network. Graph databases enable this capability in much the same way that social media networks like LinkedIn and Facebook map relationships between people.
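
The sketch below uses NetworkX as a lightweight, in-memory stand-in for a graph database such as Neo4j; the same conversation graph could be loaded into Neo4j for larger data sets. The flow records are hard-coded and purely illustrative.

    import networkx as nx

    # Hypothetical flow records: (src, dst, dst_port, bytes). In practice these
    # would come from the flow store populated in the previous step.
    flow_records = [
        ("10.0.0.11", "10.0.0.10", 443, 120_000),
        ("10.0.0.11", "10.0.0.10", 443, 80_000),
        ("10.0.0.10", "10.0.0.20", 1433, 450_000),
    ]

    # Build a directed conversation graph, accumulating bytes per edge.
    G = nx.DiGraph()
    for src, dst, dst_port, nbytes in flow_records:
        if G.has_edge(src, dst):
            G[src][dst]["bytes"] += nbytes
        else:
            G.add_edge(src, dst, bytes=nbytes)

    # The heaviest conversations on the network.
    for src, dst, attrs in sorted(G.edges(data=True),
                                  key=lambda e: e[2]["bytes"], reverse=True):
        print(src, "->", dst, attrs["bytes"], "bytes")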

 

Today, a variety of visualisation tools are available to present this information in a human-friendly format.

 

The real power will emerge when the two sources are combined. Understanding the function of each endpoint, combined with information about its relationships with other endpoints, will be a very powerful capability in the design process.
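
A rough sketch of that combination: overlay the cluster labels from the scan data onto the conversation data and count the conversations that cross cluster boundaries, since those crossings are natural candidates for zone boundaries and firewall policy. The labels and conversations below are hard-coded purely so the sketch stands alone.

    from collections import Counter

    # Illustrative only: cluster labels from the scan step and (src, dst)
    # conversation pairs from the flow step.
    cluster_of = {"10.0.0.10": 0, "10.0.0.11": 0, "10.0.0.20": 1, "10.0.0.21": 1}
    conversations = [("10.0.0.10", "10.0.0.20"), ("10.0.0.11", "10.0.0.10"),
                     ("10.0.0.21", "10.0.0.20"), ("10.0.0.11", "10.0.0.20")]

    cross_cluster = Counter()
    for src, dst in conversations:
        a = cluster_of.get(src, "unknown")
        b = cluster_of.get(dst, "unknown")
        if a != b:
            cross_cluster[(a, b)] += 1

    # Conversations that cross cluster boundaries are candidate zone boundaries
    # (and hence candidate firewall rules) in the zoning design.
    for (a, b), count in cross_cluster.most_common():
        print("cluster", a, "-> cluster", b, ":", count, "conversations")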

 

I'm not suggesting this is the only answer, as many other potential data sources exist. Additionally, I'll admit I have probably oversimplified the situation. However, my point is that by utilising just these two data sources, coupled with some now commonly available Data Science tools, a new and far more effective security zoning design approach can be created. My key goal is to hopefully spawn some new thinking, discussion and projects in this direction.

 

The present security state of many networks is a pretty sad situation. We are regularly seeing breach discovery times in excess of 200 days, with the discoveries often made by external parties. Those figures are based on the breaches that we know about; I would suggest they are just the tip of the iceberg.

This is a dreadful situation which simply says that many organisations either DO NOT HAVE sufficient 'visibility' into their internal infrastructure, or are not able to effectively process, correlate or analyse the data which does exist.

There are many people in the industry openly stating that the attackers have the advantage. I would not try to argue against this point, but there is a lot that can be done. If we view security technologies from a Force Multiplier perspective, there are some technologies which provide only a marginal benefit (compliance activities perhaps... IMHO), while others provide a very significant advantage to the defender.

I believe that Security Analytics has the potential to have a profound effect on the security business and provide defenders with a very significant advantage. Effective analytics providing detection capability should enable a reduction in those statistics from hundreds of days to hours or minutes.

In the last few years we have seen an explosion in Big Data technology with many Open Source tools now being freely available. The scene is young and changing rapidly. But there are many opportunities for people in Security roles to gain exposure to these technologies. While some investment is required, it is possible to enter this domain at low cost.

At present, Security Analytics tools are in their infancy. There are a lot of security companies using the buzzwords Data Science, Machine Learning (ML) and Artificial Intelligence (AI), with very little to no detail on how they are being used or what capabilities are achieved. In reality, most are just performing correlation and basic statistics. With that said, those activities are in themselves very worthwhile; coupled with some good visualisations, there is a lot of value in doing just those two things.

To lift the hood on some of the terms used in the Security Analytics domain:

  • Statistics – collecting, summarising and drawing conclusions from numerical data.
  • Data Mining – discovering and explaining patterns in large data sets.
  • Anomaly detection – detecting what falls outside of normal behaviour (see the sketch after this list).
  • Machine Learning – learning from and making predictions on data through the use of models.
  • Supervised Machine Learning – The initial input data (or training data) has a known label (or result) which can be learned. The model then learns from the training data until a defined level of error is achieved.
  • Unsupervised Machine Learning – The input data is not labelled and the model is prepared by deducing structures present in the input data.
  • Artificial Intelligence -  automatic ways of reasoning and reaching a conclusion by computers.
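
To make the anomaly detection item above concrete, here is a small, self-contained sketch using Scikit-Learn's IsolationForest on synthetic per-host byte counts; the data, contamination setting and scenario are purely illustrative.

    import numpy as np
    from sklearn.ensemble import IsolationForest

    # Synthetic daily outbound byte counts: 200 typical hosts plus two outliers.
    rng = np.random.default_rng(0)
    normal = rng.normal(loc=5e6, scale=1e6, size=(200, 1))
    outliers = np.array([[5e8], [9e8]])
    X = np.vstack([normal, outliers])

    # Fit an isolation forest and flag the anomalous rows (-1 = anomaly).
    model = IsolationForest(contamination=0.01, random_state=0).fit(X)
    flags = model.predict(X)
    print("anomalous rows:", np.where(flags == -1)[0])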

Mathematical skills in Probability and Statistics, including Bayesian Models, as well as Linear Algebra are heavily used in these domains.

Today there are an increasing number of security data and telemetry sources available for analysis. These include security logs from hosts, servers and network security devices such as firewalls, IDS/IPS alerts, flow information, packet captures, threat and intelligence feeds, etc. As network speeds and complexity have increased, so has the volume of the data. While there is a vast amount of security data available, identifying threats or intrusions within this data can still be a huge challenge.

From my recent research into this space, I can conclude that Security Analytics is a hard and complex problem, with some of the necessary algorithms being close to rocket science. To build any sort of Security Analytics toolset, it is essential that detailed security domain knowledge be coupled with knowledge of Big Data and Data Science technologies. There are currently very few people who possess both skill sets, so forming small teams will be essential. While this is a big and somewhat complex field, that fact should not put people off starting. Like any new technology, there will be a learning curve.

Suggestions going forward - I always like to provide some actionable recommendations out of any discussion.

Before you can analyse the data, you need to have the data and easy access to it.

 

Establishing a Security Data Lake.

To address the storage of security data, some organisations are now creating a centralised repository known as a Security Data Lake. This should not be seen as an exercise in replacing SIEM technology, but as an augmentation to those systems. On this topic, I would refer people to an excellent free O'Reilly publication by Raffael Marty, located at:

http://www.oreilly.com/data/free/security-data-lake.csp

Data Lakes are often built on Hadoop clusters or other NoSQL databases, many of which are now freely available. Establishing a Security Data Lake is a good starting point.
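
As a very small illustration of the idea, the sketch below writes security events into a date-partitioned Parquet layout on local disk; a Hadoop/HDFS cluster or object store would follow the same structure at scale. All names, fields and paths are hypothetical.

    from datetime import date
    from pathlib import Path
    import pandas as pd

    # A handful of illustrative firewall events.
    events = pd.DataFrame([
        {"ts": "2016-04-06T10:15:00", "source": "firewall", "action": "deny",
         "src": "203.0.113.5", "dst": "10.0.0.10", "dst_port": 3389},
    ])

    # Write into a date-partitioned layout: security-data-lake/firewall/date=YYYY-MM-DD/
    day = date.today().isoformat()
    out_dir = Path("security-data-lake") / "firewall" / ("date=" + day)
    out_dir.mkdir(parents=True, exist_ok=True)
    events.to_parquet(out_dir / "events.parquet")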

 

Look to closely monitor your ten to twenty most critical servers.

There needs to be a starting point, and monitoring a set of key servers is an excellent and practical one. There are many statistics that can be monitored – root/admin logons, user usage statistics, password resets, user source addresses, port usage statistics, packet size distribution, and many others. Start by visualising this data and use it as an operational tool. Security Analytics will mature over time; getting started now provides operational experience that will only grow.
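
As one small example of the kind of statistic mentioned above, the sketch below counts root logons per hour from a Linux auth log; the log path and line format are assumptions and will vary by platform and configuration.

    import re
    from collections import Counter

    # Count root logons per hour from a syslog-style auth log.
    pattern = re.compile(r"^(\w{3}\s+\d+\s\d{2}):\d{2}:\d{2}.*session opened for user root")

    hourly = Counter()
    with open("/var/log/auth.log") as log:
        for line in log:
            match = pattern.match(line)
            if match:
                hourly[match.group(1)] += 1   # key looks like "Apr  6 14"

    for hour, count in sorted(hourly.items()):
        print(hour, count)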

Apache Metron ( http://metron.incubator.apache.org ) and PNDA ( http://pnda.io ) are two Open Source projects which could potentially be a starting point for your organisation. Both are worth a serious look.

 

Last week, the United States and Canada issued a joint advisory on the threat posed by crypto-based ransomware. The advisory followed a string of high-profile incidents which had affected a number of hospitals in the US and in other countries.

The CERT advisory can be viewed at: https://www.us-cert.gov/ncas/alerts/TA16-091A

The pervasiveness of this threat demonstrates just how many organisations are completely vulnerable to this type of attack, often with severe business impact.

While it is clear that the malware problem is massive, it has been well over a decade since we have seen any form of large-scale destructive malware. Back in 2004, I spent some time in New York City performing a consultancy for a then-large financial institution in the wake of a destructive worm infection. On a Friday evening, an Internet-based worm (which I won't name here) penetrated their internal network, spreading widely and randomly erasing hard disk sectors throughout the organisation. While it was contained, the damage was significant. Fortunately, they had the weekend to recover from backups and restore operations. Had the event occurred at another time, the business impact may have been in the billions of dollars!

Around that time, and following high-profile events like SQL Slammer and Blaster, many people, including myself, were greatly concerned about the possibility of a large-scale destructive worm outbreak and the resulting potential economic impact. Fortunately, the high-profile Internet worm trend died off, simply because there was no money to be made and significant personal risk existed for the authors of such malware. Ransomware is just another form of malware... but with a significant financial return! The fact that so many organisations are openly vulnerable to ransomware again concerns me greatly.

The CERT Advisory recommends a range of fairly fundamental preventative security measures, such as adequate backups, system patching, etc. While those measures are strongly recommended, I would also highlight the importance of a robust network security architecture. Having previously worked with many customers who had been affected by those events, some severely, some far less so, it became very clear that those who had robust network security architectures, and mature operational procedures, were far less impacted.

In light of the current trend and growth of ransomware, I would additionally highlight the importance of Network Security. This includes the use of Zoned Security Architectures, quality Firewalls, IPS (with auto updates), Network AV and zero-day malware detection systems. While there is no silver bullet, these approaches can significantly reduce your organisation's risk profile.

I can’t see this problem going away any time soon. I predict it will get worse before it gets better.

 

To everyone who attended my presentation on Friday 11 March 2016 at the Novotel in Brisbane, thank you for the opportunity.

The presentation material can be downloaded from Here.

For those of you interested in the topics of Cybersecurity and Network Security Architecture, I have just posted two White Papers under Knowledge Base.

Background - My primary focus these days is keeping corporate and government networks protected within a constantly changing Threat Landscape. While there is a lot of very good information available on many aspects of Information Security, I could not find much on Network Security Architecture and Design, and definitely very little which is up to date.

To help address this gap, I have written two White Papers on this subject area:

  • The first covers the fundamentals of Network Security Architecture. 
  • The second then moves on to discuss the changes in recent years, and what this means for Security Architectures in 2015 and going forward. In particular, I discuss Architectural Foundations and then look at 'operationalising' security, the need for a new mind set, the role of Analytics and considerations for deploying 'Cyber Kill Chains'. I have attempted to capture the big issues and provide a number of technological and policy recommendations.

Please view them under 'Knowledge Base' or via these links to the PDF versions:

White Paper 1 - The Fundamentals of Network Security Design - Download Here.

White Paper 2 - New Considerations for Network Security Design - 2015 - Download Here.

There are about 80 pages of information, with more in the works. I hope it is useful, and I welcome feedback.