Malware Database (custom malware signatures)

This past week I attended the information security conference Converge in Detroit. Based on what I learned there, as well as projects I had already started, I will be building a malware signature dataset tooled specifically for web servers. Currently, during an investigation I may need two or even three scans plus manual searching to locate all the malicious content in a website. That is not great, to say the least.

I have been working with YARA rules for a little while now, especially to help identify malicious injections related to the EITest botnet. YARA is a really neat tool that lets you scan files for particular patterns or strings. I find it particularly useful for detecting web shells that use heavy obfuscation to hide their intent and can easily change their signature. Check out this great step-by-step on the different features YARA rules offer.
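To make the pattern idea concrete, here is a rough sketch in plain Python. This is not YARA itself and not taken from my ruleset; it only imitates the shape of a rule, with a set of named byte patterns and an "any of them" condition. The pattern names and the regexes are hypothetical examples of the kind of constructs obfuscated PHP web shells tend to carry.

```python
import re

# Hypothetical patterns, loosely playing the role of a rule's "strings" section.
PATTERNS = {
    "eval_base64": re.compile(rb"eval\s*\(\s*base64_decode\s*\(", re.I),
    "eval_gzinflate": re.compile(rb"eval\s*\(\s*gzinflate\s*\(", re.I),
    "long_base64_blob": re.compile(rb"[A-Za-z0-9+/]{200,}={0,2}"),
}


def matched_patterns(data: bytes):
    """Return the names of every pattern found in data.

    The "condition" here is simply: flag the file if the list is non-empty.
    """
    return [name for name, rx in PATTERNS.items() if rx.search(data)]


if __name__ == "__main__":
    sample = b"<?php eval(base64_decode('aGk=')); ?>"
    print(matched_patterns(sample))
```

Because the match is on structure rather than on the exact bytes of one file, a shell that re-encodes its payload still trips the same pattern, which is the property that makes this style of rule useful against obfuscated code.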

While at Converge I learned about sigtool, a companion to ClamAV that lets you create more standard malware signatures, matching on file hashes or on the hex of a file's contents. I believe this can be especially useful in another underserved detection area: phishing kits.

I first learned about another use for one of my favorite external scanning tools, urlscan.io, from a threat analyst who used it to track phishing kits. Sadly, I cannot find that original tweet, so I’ll make do with this one.

See, like any developer, malware authors reuse assets and code. So simply by identifying and hashing the pieces of a phishing kit, we can pretty easily scan for them, both externally using tools like urlscan and locally with a ClamAV hash database, which you can create with sigtool.

sigtool --sha256 alibabame.zip >> /home/example/lw.hdb
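For a sense of what ends up in that database, here is a minimal Python sketch that hashes every file in an unpacked kit and builds one line per file in the hash:size:name layout that, as far as I know, sigtool --sha256 emits. It is not a replacement for sigtool. The function names, the demo files, and the label appended to each signature name are all hypothetical, picked only to resemble the detection names in my scan results.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List


def hash_signature(path: Path, label: str) -> str:
    """Return one 'sha256:size:name' line for a single file."""
    data = path.read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    # Signature name = file name plus a label (illustrative naming scheme).
    return f"{digest}:{len(data)}:{path.name}-{label}"


def kit_signatures(kit_dir: Path, label: str) -> List[str]:
    """Hash every regular file under an unpacked kit directory."""
    return [
        hash_signature(p, label)
        for p in sorted(kit_dir.rglob("*"))
        if p.is_file()
    ]


if __name__ == "__main__":
    # Demo on a throwaway directory standing in for an unpacked kit.
    with tempfile.TemporaryDirectory() as tmp:
        kit = Path(tmp)
        (kit / "login.php").write_bytes(b"<?php /* sample */ ?>")
        (kit / "logo.png").write_bytes(b"\x89PNG sample bytes")
        for line in kit_signatures(kit, "example-phishing-0001"):
            print(line)
```

Hashing the individual assets, and not only the zip, is what lets a scan catch a kit after it has been unpacked, renamed, or partly cleaned up.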

I ran some tests of the new ruleset in some investigations today, and things are looking pretty good from a detection angle.

clamscan -ir -l /test/scan.txt -d /test/lw-yara/lw.hdb -d /test/lw-yara/lw-rules_index.yar /home/*/public_html/

/home/user1/public_html/File/outlook/css/apple-touch-icon-72x72.png: apple-touch-icon-72x72.png-docusign-phishing-0001.UNOFFICIAL FOUND
/home/user1/public_html/Users/SEEDYYSOFT.zip: apple-touch-icon-72x72.png-docusign-phishing-0001.UNOFFICIAL FOUND
/home/user2/public_html/wop/zip/docc/wp/Google_docs_files/universal_language_settings-21.png: universal_language_settings-21.png-google-phishing-001.UNOFFICIAL FOUND
/home/user3/public_html/paypal/assets/includes/whitelist.dat: whitelist.dat-wells-phishing0002.UNOFFICIAL FOUND
/home/user4/public_html/imagedb/2712.php: YARA.FOPOobfuscator.UNOFFICIAL FOUND
/home/user5/public_html/docusign/docusign17/images/live_hotmail.png: live_hotmail.png-docusign-phishing-0001.UNOFFICIAL FOUND
/home/user5/public_html/docusign.zip: cJZKeOuBrn4kERxqtaUH3T8E0i7KZn-EPnyo3HZu7kw.woff-google-phishing-001.UNOFFICIAL FOUND
/home/user6/public_html/comcast/blacklist_lookup.php: blacklist_lookup.php-wells-phishing0002.UNOFFICIAL FOUND
/home/user7/public_html/user7new/modules/block/block-admin-display-form.tpl.php: YARA.generic_php_injection_0.UNOFFICIAL FOUND
/home/user7/public_html/user7new/sites/all/libraries/colorbox/chlkujyz.php: YARA.generic_php_injection_1.UNOFFICIAL FOUND

This is a subset of the hits returned by the scans, but it is a promising sign that this technique will give deeper results at the start of an investigation. If you’d like to test the ruleset, download it and run clamscan with the -d flag to specify each custom database: lw.hdb is the hash dataset, and lw-rules_index.yar is the broader YARA ruleset.
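When a scan covers many accounts, it helps to summarize the hits. Below is a small sketch, assuming the "path: signature FOUND" line format shown in the results above and accounts living under /home/<account>/. The function names are my own for illustration, and the sample lines are copied from those results.

```python
import re
from collections import defaultdict

# "<path>: <signature name> FOUND" is the hit line format in the scan output.
HIT_RE = re.compile(r"^(?P<path>.+): (?P<sig>\S+) FOUND$")


def parse_hits(report_text):
    """Return (path, signature) pairs for every FOUND line in the report."""
    hits = []
    for line in report_text.splitlines():
        m = HIT_RE.match(line.strip())
        if m:
            hits.append((m.group("path"), m.group("sig")))
    return hits


def hits_by_account(hits):
    """Group signature names by the account in /home/<account>/..."""
    grouped = defaultdict(list)
    for path, sig in hits:
        parts = path.split("/")
        is_home = len(parts) > 2 and parts[1] == "home"
        grouped[parts[2] if is_home else "unknown"].append(sig)
    return dict(grouped)


if __name__ == "__main__":
    sample = (
        "/home/user4/public_html/imagedb/2712.php: "
        "YARA.FOPOobfuscator.UNOFFICIAL FOUND\n"
        "/home/user6/public_html/comcast/blacklist_lookup.php: "
        "blacklist_lookup.php-wells-phishing0002.UNOFFICIAL FOUND\n"
    )
    for account, sigs in hits_by_account(parse_hits(sample)).items():
        print(account, sigs)
```

Grouping by account makes it quick to see which sites need attention first and whether the same kit shows up across several of them.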

I plan to keep developing this dataset to target code used in web server attacks.

https://github.com/Hestat/lw-yara

Have questions or comments on the technique or signatures? Find me on Twitter @laskow26. Good hunting.