Thursday, 24 November 2016

Building a Whitelist of Network Domains


There are a number of situations where a whitelist is useful to security professionals, such as:
  • You are alerting on a list of domains on your network, and don't want to set off thousands of alerts when someone accidentally adds "windowsupdate.com" to the list
  • You are reviewing sandbox reports, and don't want common non-malicious domains coming back in your reports (see the sketch below)
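In both cases a suffix-aware match works better than an exact one, so that "download.windowsupdate.com" is covered by a whitelist entry for "windowsupdate.com". A minimal sketch in Python (the whitelist entries here are just examples):

def is_whitelisted(domain, whitelist):
    # Check the domain itself, then each parent suffix, so that
    # "download.windowsupdate.com" matches a "windowsupdate.com" entry.
    parts = domain.lower().rstrip(".").split(".")
    for i in range(len(parts) - 1):  # stop before the bare TLD
        if ".".join(parts[i:]) in whitelist:
            return True
    return False

whitelist = {"windowsupdate.com", "google.com"}  # example entries
print(is_whitelisted("download.windowsupdate.com", whitelist))  # True
print(is_whitelisted("windowsupdate.com.evil.net", whitelist))  # False

Note that checking suffixes (rather than substrings) means look-alike domains such as "windowsupdate.com.evil.net" don't slip through.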
I've recently extended the whitelist ThreatCrowd uses when sites are marked as malicious, after feedback that users had mistakenly flagged a number of legitimate domains.

This coincided with Alexa announcing they would stop publishing a commonly used whitelist: the top 1 million sites. Thankfully Alexa have changed their minds about discontinuing the dataset, for now at least, and there are other similar sources too.



Sources like this aren't well suited to matching against network data though: sites that are programmatically accessed (e.g. download.windowsupdate.com) often won't be listed in datasets designed to record human traffic. A better choice may be to use the top N domains on your network. However, that does require access to the logs of a large network.
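Deriving that top N list is straightforward once you have the logs. A sketch, assuming a log file with one requested domain per line (the file name and format are hypothetical):

from collections import Counter

N = 1000  # however many top domains you want to keep

with open("dns_queries.txt") as f:  # hypothetical log: one domain per line
    counts = Counter(line.strip().lower() for line in f if line.strip())

for domain, hits in counts.most_common(N):
    print(f"{hits}\t{domain}")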

For this use case I've used logs from networks that are publicly available; there are plenty of people who (perhaps inadvertently) publish this kind of data online. In this case I've used data from freedom of information requests for the top sites requested on a number of UK government networks.

Two things to note are:
  • This data is biased towards the UK
  • I'd suggest only using domains seen on more than one network, as in the sketch that follows. For example, one of the domains seen on only one network below is likely Chinese APT (yes, they're aware).
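A minimal sketch of that filter, assuming each network's FOI response has been saved as a text file of one domain per line (the directory layout is hypothetical):

from collections import Counter
import glob

counts = Counter()
for path in glob.glob("networks/*.txt"):  # one file per network
    with open(path) as f:
        domains = {line.strip().lower() for line in f if line.strip()}
    counts.update(domains)  # count each domain once per network

# Keep only domains seen on more than one network.
whitelist = sorted(d for d, n in counts.items() if n > 1)
print("\n".join(whitelist))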

You can find the list below, for all your whitelisting needs:

Monday, 28 March 2016

Clustering the Threat Landscape

Much of threat intelligence involves grouping information together to identify common traits in attackers.
To that end, I wrote a quick Python script to identify common indicators shared between reports on AlienVault's OTX platform. You can see the output of this script in the image below, with some of the more interesting clusters annotated:



This isn't a perfect method: there are some odd links that I wouldn't expect to see. But it also highlights some very interesting overlaps between disparate clusters of attacks, pointing to possible links between groups.
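The script itself is simple. A minimal sketch of the idea, assuming the OTX pulses have already been exported to a JSON file (pulses.json is a hypothetical name, and each pulse is assumed to carry a "name" and a list of "indicators"):

import json
from itertools import combinations

with open("pulses.json") as f:  # hypothetical export of OTX pulses
    pulses = json.load(f)

# Map each report name to its set of indicator values.
indicators = {
    p["name"]: {i["indicator"] for i in p.get("indicators", [])}
    for p in pulses
}

# Emit an edge for every pair of reports sharing at least one indicator,
# trimmed to the first shared indicator.
for (a, ia), (b, ib) in combinations(indicators.items(), 2):
    shared = ia & ib
    if shared:
        print(f"{a} <-> {b} via {sorted(shared)[0]}")

Edges like these can then be loaded into a graph tool such as Maltego for browsing.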

You can download and browse through the Maltego file [here], and some of the clusters are displayed below.
Update: You can download the source file [here] to see which indicators the reports overlap on. It's trimmed to the first indicator for each overlap.

BlackEnergy

Carbanak, with a report on more commodity malware, connected via the domain trader562[.]com


Lots of overlaps with Chinese APT


RocketKitten


Sony Attacks


Sunday, 28 February 2016

Crowdsourced feeds from ThreatCrowd

Voting

Voting was added to ThreatCrowd recently, and I've been pleased to see a number of users regularly contributing votes.


These votes provide a useful source of malicious indicators, and so I've now put them into feeds in three files:

 https://www.threatcrowd.org/feeds/domains.txt
 https://www.threatcrowd.org/feeds/ips.txt
 https://www.threatcrowd.org/feeds/hashes.txt
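If you want to pull these into a script, a minimal sketch using Python's standard library (assuming one indicator per line in each file):

from urllib.request import urlopen

FEEDS = {
    "domains": "https://www.threatcrowd.org/feeds/domains.txt",
    "ips": "https://www.threatcrowd.org/feeds/ips.txt",
    "hashes": "https://www.threatcrowd.org/feeds/hashes.txt",
}

indicators = {}
for name, url in FEEDS.items():
    with urlopen(url) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    indicators[name] = {line.strip() for line in text.splitlines() if line.strip()}

print({name: len(vals) for name, vals in indicators.items()})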

These feeds are not a substitute for the scale of auto-extracted command and control domains, or for the quality of some commercially provided feeds. But crowd-sourcing does go some way towards the quick sharing of threat intelligence within the community.

Updates
These files are updated once per hour, on the hour.

API
You can submit votes via the interface, or a simple API:

This will place a vote for "good.com" being non-malicious:
 https://www.threatcrowd.org/vote.php?vote=1&value=good.com

This will place a vote for "bad.com" being malicious:
 https://www.threatcrowd.org/vote.php?vote=0&value=bad.com
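Scripted, each vote is just a GET request. A minimal sketch with Python's standard library, wrapping the two URLs above:

from urllib.parse import urlencode
from urllib.request import urlopen

def submit_vote(value, malicious):
    # vote=0 flags the indicator as malicious, vote=1 as non-malicious.
    params = urlencode({"vote": 0 if malicious else 1, "value": value})
    with urlopen("https://www.threatcrowd.org/vote.php?" + params) as resp:
        return resp.read().decode("utf-8", errors="replace")

# e.g. vote that bad.com is malicious:
# print(submit_vote("bad.com", malicious=True))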

License
This data is available for free, and commercial use is allowed. It's licensed under the Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0
I make no guarantees as to the quality of the data.