Originally Published: 13 October 2018
In the last week we have seen a spectacular report out of Bloomberg in relation to malicious hardware implants within Supermicro server motherboards. The implications of this report are potentially huge. However, the technical details disclosed are minimal and a large number of unanswered questions exist.
Personally, I subscribe to the adage of “where there’s smoke, there must be at least some amount of fire”. Subsequent reports have claimed that it was not just Supermicro motherboards affected, but that the problem could be far more widespread affecting other vendors as well. With all that said, I acknowledge that this whole situation has not been substantiated and there is a chance it could be inaccurate, grossly exaggerated, or completely false. However, for the purpose of this post let’s put that debate aside and assume that the reports are correct.
The first point I would like to make is that hardware inspection is a highly specialised field and there are currently very few vendor organisations either experienced or equipped to perform this work. This means that for the vast majority of organisations, hardware inspection is not going to be a viable option.
So, I want to discuss options for network monitoring and visibility. But first, the problem.
From the information available to date, it has been suggested that the malicious modifications were made to the management controller of server motherboards. Cisco calls this the IMC (Integrated Management Controller), Dell calls it the DRAC (Dell Remote Access Controller) and HPE calls it iLO (Integrated Lights-Out). All of these devices are essentially a small computer that controls the computer. These devices have a long history of being notoriously insecure. Furthermore, these management controllers have access to just about every aspect of the server’s hardware, giving them more control over the hardware than the operating system itself.
Compromised hardware only takes the attacker so far. At some point the malicious hardware will need to communicate over the network to a Command and Control (C&C) server. Depending on the nature of the implant, malicious communication attempts could originate from the compromised management controller interface, from the server’s operating system, from hosted virtual machines, or from any combination of these.
In my mind this situation makes a further compelling case for the deployment of network monitoring and analytics. The key issue now is the ability to detect malicious traffic and respond quickly in the event of such an occurrence, whether that attack stems from a hardware implant or from unrelated but still malicious activity.
What are my recommendations and the options?
A critical point is that server management controller interfaces should not be routable to the internet. I recommend that they are segmented and only allowed to communicate with the minimum number of workstations needed to support the operation of those devices. Ask yourself: how many people really need access to the management controller? If you can break the kill chain at this point, it is highly likely an attacker won’t be able to gain control and further progress an attack. If server management ports must connect to something on the big bad internet, then enable it very selectively.
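As a rough illustration of the segmentation rule above, the following Python sketch decides whether a connection that touches the management network should be allowed. The subnet, the workstation addresses and the function name are all made up for the example; in practice this policy lives in firewall rules or ACLs rather than in a script.

```python
import ipaddress

# Hypothetical addressing plan, for illustration only.
MGMT_NET = ipaddress.ip_network("10.20.30.0/24")   # management controllers
ADMIN_HOSTS = {                                     # the few permitted workstations
    ipaddress.ip_address("10.20.40.11"),
    ipaddress.ip_address("10.20.40.12"),
}


def is_permitted(src: str, dst: str) -> bool:
    """Apply the segmentation rule to a single connection.

    Traffic to or from the management network is allowed only when the
    other end is one of the listed admin workstations. Traffic that does
    not involve the management network is out of scope and passes.
    """
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    if s in MGMT_NET:
        return d in ADMIN_HOSTS
    if d in MGMT_NET:
        return s in ADMIN_HOSTS
    return True
```

With this policy, a management controller trying to reach any internet address is denied by default, which is exactly the behaviour you want to see blocked and logged.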
At a network level, I would recommend the deployment of a Sinkhole on the management network. A sinkhole is a part of the network that attracts all traffic which has no other legitimate destination. Sinkholes are infrequently deployed but incredibly useful for attracting all sorts of traffic which could either be the result of misconfiguration or, of key interest here, malicious activity. Once traffic is routed into a sinkhole there are many tools which can be used for analysis. I realise that’s a bit light on technical detail, but I will aim to publish a subsequent Blog post on Sinkholes in the next week or so.
Let’s now talk about available network-based monitoring options.
When we start talking about monitoring options, the first call out is firewall logs. Assuming the management network is segmented, then whatever firewall is in place will be capable of connection logging. There are many examples where detailed evidence of an attack has been collected in the firewall logs. If you aren’t collecting and archiving your firewall logs, then this is recommendation one. And if you think I’m ‘stating the bleeding obvious’ – you would not believe the number of organisations that fall into the category of ‘people who should know better’ and don’t do this. If this is your organisation and it gets compromised, any forensic investigation will be both exponentially harder and exponentially more expensive! Ignore this advice at your peril.
Netflow – I have been a huge fan of Netflow as a security tool for many years. Netflow is the networking equivalent of a telephony Call Detail Record (CDR). At an IP level it records who spoke to whom, how much, and for how long. Like firewall logs, flow records can be exported, collected, analysed and archived. A key point is that for security applications, you must use full-flow Netflow to capture all conversations at a point-in-the-network, as opposed to sampled Netflow.
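To make the CDR analogy concrete, here is a minimal Python sketch that rolls a list of flow records up into "who spoke to whom, how much, and for how long". The `Flow` record and its field names are simplifications invented for this example, not the real Netflow export format, and the records are assumed to have already been decoded by a collector.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Simplified, hypothetical flow record (not the Netflow wire format)."""
    src: str
    dst: str
    octets: int     # bytes transferred in this flow
    start: float    # flow start time, in seconds
    end: float      # flow end time, in seconds


def summarise(flows):
    """Total bytes, duration and flow count per (src, dst) conversation."""
    totals = defaultdict(lambda: {"octets": 0, "seconds": 0.0, "flows": 0})
    for f in flows:
        entry = totals[(f.src, f.dst)]
        entry["octets"] += f.octets
        entry["seconds"] += f.end - f.start
        entry["flows"] += 1
    return dict(totals)
```

A summary like this is only trustworthy if every flow was recorded. With sampled Netflow, a short, low-volume beacon can fall between the samples and never appear in the totals at all.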
Analysis tools – Many tools, both commercial and open source, are available for log and Netflow analysis. I won’t call out any commercial options or discuss SIEMs, but the Elastic Stack (formerly the ELK Stack) is a very robust and widely deployed open source option. If you have nothing, the Elastic Stack is a good place to start.
Full Packet Capture – This approach captures all traffic that passes through some point-in-the-network. I’m not going to elaborate on it too much as it’s a costly approach and generally reserved for serious organisations. However, I will mention one approach I have seen some organisations deploy. It is the use of a full packet capture card that collects in the order of a day’s to a week’s data in a circular buffer. In the event of an incident being detected, the available full packet capture can be copied and stored for investigation.
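The circular buffer idea can be sketched in a few lines of Python. This is a toy, in-memory model only: the class name, the byte budget and the `snapshot` method are assumptions made for illustration, and a real capture card does this in hardware against dedicated storage.

```python
from collections import deque


class CaptureRing:
    """Keep the most recent packets within a fixed byte budget."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.size = 0
        self.packets = deque()   # (timestamp, raw bytes), oldest first

    def add(self, timestamp: float, data: bytes) -> None:
        """Store a packet, discarding the oldest ones once over budget."""
        self.packets.append((timestamp, data))
        self.size += len(data)
        # Always keep at least the newest packet, even if it is oversized.
        while self.size > self.max_bytes and len(self.packets) > 1:
            _, old = self.packets.popleft()
            self.size -= len(old)

    def snapshot(self, since: float):
        """Copy out everything captured at or after `since` for investigation."""
        return [(ts, data) for ts, data in self.packets if ts >= since]
```

The trade-off is the one described above: the buffer only reaches back as far as its capacity allows, so the snapshot has to be taken before the traffic of interest is overwritten.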
What are we looking for?
If we use a Cyber Kill Chain as a foundation, then we wish to look for any evidence of those attack stages within an attack lifecycle. This can range from beacons to a C&C server, to the download of additional malware or, most importantly, achievement of the ultimate objective: exfiltration of data (at which point you’re probably pretty screwed). And we must also assume that any potential attack traffic is going to be encrypted. That is another more in-depth topic, but let’s just say it makes monitoring the contents of a traffic stream very difficult.
Geo-Location – This is a widely available feature. It can provide a very quick indication of the country in which a connection’s remote endpoint terminates. So, if you were to see a connection from inside your network to a suspicious country, that’s something that requires investigation.
Threat Intelligence – The sheer number of active threats in general, including malicious destinations on the Internet, is well beyond the ability of the vast majority of organisations to track. This is where the use of Threat Intelligence comes in. If anything inside your organisation (and that includes Cloud Infrastructure) speaks to a known malicious internet endpoint, then you want to know about it. Threat Intelligence comes in a variety of forms, including both Open-Source and commercial feeds. The key objective is to correlate the information received from a reputable feed (or feeds) with the traffic ingressing and egressing your network. The goal of Threat Intelligence usage is to quickly identify a malicious event within what will typically be a mountain of network traffic.
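The correlation step can be sketched as follows in Python. The feed format here (one IP address or CIDR block per line, with optional `#` comments) is an assumption for the example, as are the function names; real feeds vary widely and often carry domains, hashes and confidence scores as well.

```python
import ipaddress


def load_indicators(lines):
    """Parse feed lines into networks, skipping comments and blank lines."""
    indicators = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            indicators.append(ipaddress.ip_network(entry, strict=False))
    return indicators


def match_flows(flows, indicators):
    """Return (src, dst, indicator) for flows with a known-bad endpoint.

    Both ends are checked, so ingress and egress traffic are covered.
    Each flow is reported at most once, against the first indicator hit.
    """
    hits = []
    for src, dst in flows:
        for endpoint in (src, dst):
            addr = ipaddress.ip_address(endpoint)
            matched = next((net for net in indicators if addr in net), None)
            if matched is not None:
                hits.append((src, dst, str(matched)))
                break
    return hits
```

The linear scan is fine for a small feed. For a large one, a prefix tree or a dedicated lookup structure would be the more realistic choice.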
Threat Intelligence works on the assumption that someone has seen an attack previously. So if this is a unique or first-time attack, it probably won’t help, but there will be a lot of cases that have been seen before, making it a valuable tool.
As the primary focus of this post is compromised server hardware, monitoring the communication habits of management controller ports should be a key focus. If these devices start trying to talk to unexplained destinations (including trying to resolve unexplained destinations), then prompt investigation is required.
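One simple way to spot "unexplained" behaviour is to compare what a management controller does today against what it has been seen doing before. The Python sketch below assumes events have already been extracted from flow records and DNS logs as `(source, kind, target)` tuples, where `kind` is either `"conn"` or `"dns"`; that event shape and the function names are invented for this example.

```python
def _normalise(event):
    """Lower-case the target so DNS names compare case-insensitively."""
    src, kind, target = event
    return (src, kind, target.lower())


def build_baseline(events):
    """Record every (source, kind, target) seen during a known-good period."""
    return {_normalise(e) for e in events}


def unexplained(baseline, events):
    """Return events, in order, that were never seen in the baseline."""
    return [e for e in events if _normalise(e) not in baseline]
```

The obvious caveat is that the baseline must come from a period you trust. If the controller was already compromised when the baseline was taken, its malicious traffic would be learned as normal.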