Event Analysis Training -- Working with "BlackLists"
Many SIM, NIDS and NBAD solutions have some sort of "blacklist" functionality which highlights when systems on your network interact with IP addresses that have been identified as being associated with scanning, virus propagation, SPAM originators and so on. These solutions typically take one or more lists of potentially "bad" IP addresses and then scour your netflow, IDS, firewall or other types of logs to see if there are any correlations. This blog entry will focus on how these types of events can be analyzed.
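The core of this kind of correlation can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular SIM or the Log Correlation Engine implements it. The blacklist format (one IP address or CIDR block per line, with "#" comments), the session dictionaries and the function names are all assumptions made for the example.

```python
import ipaddress


def load_blacklist(lines):
    """Parse one IP address or CIDR block per line (assumed format).

    Blank lines and anything after a '#' are ignored.
    """
    networks = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # A bare address becomes a single-host network (/32 or /128).
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_blacklisted(ip, networks):
    """Return True if the address falls inside any blacklisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def correlate(sessions, networks):
    """Yield every session whose source or destination is blacklisted.

    Each session is a dict with hypothetical 'src' and 'dst' keys.
    """
    for session in sessions:
        if is_blacklisted(session["src"], networks) or is_blacklisted(
            session["dst"], networks
        ):
            yield session
```

In practice the sessions would come from parsed netflow, firewall or IDS logs, and a large blacklist would probably call for a faster lookup structure than a linear scan.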
Why Do Blacklist Correlation?
On any given day, you may be able to count all of the individual network sessions in and out of your network in the millions. Auditing these can be very difficult. If you know ahead of time where users are supposed to be going, you can use a firewall to filter this access. If you don't, you can try to use a network IDS/IPS to look into these sessions and find evidence of foul play and policy abuse.
However, with a blacklist, someone else has done the work for you. All you need to do is sit back and wait for one of these "bad guys" on the blacklist to connect to your network, or worse, wait for one of your systems to connect to one of these IP addresses.
Tenable runs our technology at a variety of locations and we often have access to our systems during customer evaluations. For a given network, if we aggregate network IDS/IPS events and there are perhaps 10,000 to 1,000,000 events per day, a heavy day of blacklist correlation could mean 10 to 30 hits. The effort required to analyze a blacklist interaction is minimal and the results are often very useful. And if you have a solution like the Log Correlation Engine, you can do this type of correlation on any log source you have, including netflow, sniffed network traffic, firewall accept events and so on.
Where do Blacklists come from?
There are many sources of lists of "bad" IP addresses. At Tenable, we've enabled the Log Correlation Engine to work with published blacklists from the Internet Storm Center, Arbor Networks, the Emerging Threats project and a few others. These sites use a variety of different techniques to aggregate information about potential:
- large scale scanners
- botnet command and control nodes
- malware payload distributors
- SPAM sending sources
- phishing sites
Some of these blacklists rely on input from sensors from all over the world. For example, if someone sends their logs to the Internet Storm Center, the sources of their activity are correlated with other submissions and the top sources of abuse are published. Sometimes, when a specific worm or trojan is analyzed, the IP addresses it communicates with are published and these are added to these lists as well.
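To make the idea of correlating submissions concrete, here is a small Python sketch that ranks source IP addresses by how many distinct submitters reported them. It is purely illustrative and is not how the Internet Storm Center or any other list provider actually builds its lists. The (submitter, source IP) input shape, the function name and the thresholds are assumptions.

```python
from collections import defaultdict


def top_sources(submissions, min_reporters=2, limit=10):
    """Rank source IPs by the number of distinct submitters reporting them.

    submissions: iterable of (submitter_id, source_ip) pairs (assumed shape).
    min_reporters: hypothetical threshold; sources seen by fewer
    submitters are left off the list.
    """
    reporters = defaultdict(set)
    for submitter, source_ip in submissions:
        # A set ignores repeat reports from the same submitter.
        reporters[source_ip].add(submitter)

    ranked = sorted(
        (
            (ip, len(who))
            for ip, who in reporters.items()
            if len(who) >= min_reporters
        ),
        # Most widely reported first, ties broken by address string.
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:limit]
```

Counting distinct submitters rather than raw events keeps one noisy sensor from pushing an address onto the list by itself.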
Accuracy
Because a blacklist entry is limited to an IP address, there are often false positives when more than one function runs at that node. For example, a botnet that is receiving commands over IRC may result in an IRC server IP address being placed onto a blacklist. Anyone that used IRC and connected to this server would then be seen communicating with a blacklisted IP address, even though the botnet is likely limited to a few specific IRC channels.
Similarly, a web server may be identified as distributing malware or as a known phishing site, but if the IP address actually hosts many virtual web servers, then traffic going to those other web sites could be flagged as suspicious simply because it all travels to the same IP address.
Whether an address is on or off a list also changes over time. A hostile host could be scanning a site for several days or weeks before there is enough activity for it to be listed on someone's blacklist. For example, in the screen shot below, the last 5 days of all normalized events from a smaller university are shown:
Reading from left to right, there has been some sort of Snort based detection of malware (the SnortET-Malware_Activity events), but within the last day, there was a variety of logged network traffic with one hit of a connection to a known scanning source IP address as tracked by Arbor Networks. This would be the Blacklist_Arbornetworks_Top_Attack_Source event, as well as the Outbound_Blacklist_Communication event.
Compromise Detection
When looking for a compromise, any type of outbound connectivity from your network (the "good guys") to a blacklisted IP address should be investigated. A wide variety of trojans, malware and other types of backdoors will reach out to one or more hosts to receive instructions.
This outbound connection to a blacklisted IP address should be treated much more seriously than being scanned by a known blacklisted IP address. Imagine you are in charge of security for a large network and a blacklisted IP address scans you. Your SIM or NBAD may see thousands of connections and lots of scanning events, but they will all be "inbound" to your network. However, if your systems are compromised, you may see an outbound connection. Consider the example screen shot below from a Log Correlation Engine monitoring netflow:
In this image (which was over a 48 hour period), there were a variety of blacklist events in the past (reading left to right), but the last few events also included 4 Outbound_Blacklist_Communication events. All of the previous events included connections "from" the bad guys to our network. They may be hostile and should be investigated, but at least the good guys didn't connect back. However, these last four events were "outbound" in nature. For some reason, hosts on the monitored network reached out to these "blacklisted" sites.
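The inbound versus outbound distinction can be sketched in Python as follows. This is a minimal example under stated assumptions: the internal address ranges are the RFC 1918 private blocks (substitute your own), the blacklist is a plain set of address strings, and the function names are hypothetical. It is not the logic the Log Correlation Engine uses to raise Outbound_Blacklist_Communication events.

```python
import ipaddress

# Assumption: the monitored network uses the RFC 1918 private ranges.
INTERNAL_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_internal(ip):
    """Return True if the address belongs to the monitored network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in INTERNAL_NETS)


def classify_direction(src, dst, blacklist):
    """Classify a session relative to a blacklist (a set of IP strings).

    Returns 'outbound' when an internal host initiated a connection to a
    blacklisted address, 'inbound' when a blacklisted address connected
    to an internal host, and None when neither applies.
    """
    if is_internal(src) and dst in blacklist:
        return "outbound"
    if src in blacklist and is_internal(dst):
        return "inbound"
    return None
```

Sorting hits this way lets the handful of outbound events be reviewed first, ahead of the much larger volume of inbound scanning.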
Since there are only 4 events, analyzing them can be accomplished in short order. When this occurs, you should attempt to see if this connection is a false positive or if the system is indeed compromised. If you are collecting logs from other sources (NIDS, firewall, anti-virus, etc.) you may be able to correlate the outbound blacklist connection with other types of activity. When trying to discriminate between a harmless false positive alert and a compromise, keep the following in mind:
- If you have collected web proxy logs, these should contain information about the visited domain name and web URIs involved.
- If the system in question is not a desktop or user workstation, yet it is sending mail, browsing the web and doing chat, this is a good indication that you have a compromised server. One of the techniques our Security Center customers use is to create asset lists of various servers and then use these as a report or filter to analyze blacklist connections quickly.
- Even if the system is a desktop, keep in mind that power users don't surf the net 24x7. Analyzing how much traffic is normally sent can show "off hours" activity from a host.
- Keep in mind the protocol you are dealing with. If you run FTP servers on your network and configure them such that the server connects to the client to transfer data, you may get a valid outbound connection on some high port. Although highly interesting (why is a blacklisted IP address talking with your FTP server?), it's likely not a compromise.
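Two of the checks above, the server asset list and the "off hours" test, are easy to express in code. The Python sketch below is a rough illustration only. The asset list contents, the business-hours window, the event field names and the ISO 8601 timestamp format are all assumptions, and a real deployment would pull the asset list from wherever it is maintained.

```python
from datetime import datetime

# Hypothetical asset list of known servers on the monitored network.
SERVER_ASSETS = {"10.0.1.10", "10.0.1.11"}

# Assumed working hours: 08:00 to 17:59 local time, Monday to Friday.
BUSINESS_HOURS = range(8, 18)


def triage(event):
    """Return the reasons an outbound blacklist hit deserves a closer look.

    event: dict with hypothetical 'src' (IP string) and 'time'
    (ISO 8601 string) keys. An empty list means neither check fired.
    """
    reasons = []
    if event["src"] in SERVER_ASSETS:
        reasons.append("source is a server")

    when = datetime.fromisoformat(event["time"])
    # weekday() is 0 for Monday, so 5 and 6 are the weekend.
    if when.hour not in BUSINESS_HOURS or when.weekday() >= 5:
        reasons.append("off-hours activity")
    return reasons
```

An empty result does not clear the host. It only means these two heuristics found nothing unusual, so the proxy log and protocol checks still apply.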
If you don't have access to any other corroborating logs, you can still do some interesting analysis with the original netflow or network session data. In this redacted screen shot below, we list each network session from hosts on a network that have made a connection to a known blacklist source:
Host 208.87.149.250 was listed by Arbor Networks as an attack source. At one of our monitoring sites, we saw several hosts connect to this IP address. Reading through the logs (which came from the Log Correlation Engine's network monitor agent), the first session transferred 1363 bytes, but less than a day later, a different host (perhaps the same host with a new DHCP lease, which LCE could have figured out if it was collecting DHCP logs) attempts two connections, both of which time out. This could indicate that the blacklisted host at the target IP has been taken offline and is no longer available. If this is the case, then that first host that was able to transfer 1363 bytes could have been infected or have received some sort of hostile communication.
Lastly, if no additional log sources are available, bringing in vulnerability, asset and configuration data can help. If the source system looks like a desktop and is not running the latest web browser patch, it may be vulnerable to a variety of client-side web exploits. If the system is a server, and it has made an outbound connection, perhaps it has been compromised. Servers generally don't "surf" the network, so if one has made such a connection, either this is "normal" behavior for it or it just started. If it just started, then trying to figure out when it happened and maybe why it occurred could be interesting.
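The reasoning in this example (data moved first, timeouts later) can be captured in a short Python sketch. The session fields ('time', 'src', 'bytes', 'timed_out') and the function name are assumptions about how parsed netflow or session records might look, not an actual Log Correlation Engine format.

```python
def assess_sessions(sessions):
    """Summarize sessions from internal hosts to one blacklisted address.

    sessions: list of dicts with hypothetical keys 'time' (sortable, for
    example an ISO 8601 string), 'src', 'bytes' and 'timed_out'.

    Returns a dict with:
      went_dark         True if any connection attempt timed out
      transferred_first hosts that moved data before the first timeout
    """
    ordered = sorted(sessions, key=lambda s: s["time"])
    went_dark = False
    transferred_first = []
    for session in ordered:
        if session["timed_out"]:
            went_dark = True
        elif (
            session["bytes"] > 0
            and not went_dark
            and session["src"] not in transferred_first
        ):
            transferred_first.append(session["src"])
    return {"went_dark": went_dark, "transferred_first": transferred_first}
```

When went_dark is true, the hosts in transferred_first are the ones that exchanged data while the blacklisted address was still reachable, which makes them the first candidates for a closer look.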
Scan And Generic Activity Detection
Besides looking for compromised systems, tracking blacklist connections can be useful on many levels:
Follow On Traffic - Some SIMs and NBAD solutions can correlate blacklist connection events with various types of NIDS events. For example, the Log Correlation Engine can look for hosts that have been attacked and then interact with a blacklisted IP address, which produces logs such as:
Compromised_System_Blacklist_Connection - source host 24.3.2.1 has connected to host 77.3.43.112 on port 80 with a blacklist event of Blacklist-BleedingSnort_Botnet_Source. Within the last five minutes, the source host was targeted with 24 critical IDS attacks. This could indicate that the system has been compromised and is participating in a botnet. Time of activity was 10/03/2007 09:11:45
SPAM - If you receive a correlation with connections from a known "spammer", you should check to make sure that each of the target mail servers is indeed an authorized mail server AND has anti-SPAM filtering in place. On the other hand, if your computers are making connections to a known SPAM source, this could indicate that you have one or more systems on your network that have been compromised and are being used to send SPAM. This could also mean that one of your users has happened to send mail to a server that was blacklisted for sending SPAM.
Large Network Scans - Although your SIM, NBAD or NIDS may detect port scans and scanning activity, correlating these events with blacklist source IP addresses can easily identify probes and potential worm infection events. If you can recognize these types of scans, including the target ports, you may be able to further analyze this and compare what the worm was probing for with known vulnerabilities on your network.
Phishing - Some blacklist sites track IP addresses that have been associated with phishing scams. If you see many users visit one of these IP addresses, it could indicate a scam that has targeted your network.
Botnet Command and Control - Some blacklist sites (such as Emerging Threats) identify IP addresses that are used to manage various types of botnets.
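The "Follow On Traffic" correlation described above (a host is attacked, then connects out to a blacklisted address) can be approximated with a simple time-window join. The Python sketch below is an assumption-laden illustration and not the Log Correlation Engine's actual rule. The event dictionaries, the 'severity' field and the five minute window (taken from the sample log message) are chosen for the example.

```python
from datetime import datetime, timedelta

# Window borrowed from the sample log message above.
WINDOW = timedelta(minutes=5)


def follow_on_hits(ids_events, blacklist_events, window=WINDOW):
    """Find outbound blacklist connections preceded by critical IDS attacks.

    ids_events: dicts with hypothetical keys 'time' (datetime), 'dst'
    and 'severity'.
    blacklist_events: dicts with 'time' (datetime), 'src' and 'dst',
    where 'src' is the internal host that connected out.

    Returns (internal_host, blacklisted_ip, attack_count) tuples for
    connections with at least one critical attack against the same
    host inside the window before the connection.
    """
    results = []
    for conn in blacklist_events:
        count = sum(
            1
            for event in ids_events
            if event["dst"] == conn["src"]
            and event["severity"] == "critical"
            # Only attacks at or before the connection, within the window.
            and timedelta(0) <= conn["time"] - event["time"] <= window
        )
        if count:
            results.append((conn["src"], conn["dst"], count))
    return results
```

The nested scan is quadratic, which is fine for the tens of blacklist hits per day mentioned earlier but would need indexing by host for larger volumes.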
For More Information
This blog entry is the second in a series of "Event Analysis Training" entries. The previous blog entry and several other event analysis links are listed below: