Submitted URL: https://www.bsk-consulting.de/2015/02/16/write-simple-sound-yara-rules/
Effective URL: https://www.nextron-systems.com/2015/02/16/write-simple-sound-yara-rules/
Submission: On November 28 via manual from US — Scanned from DE
HOW TO WRITE SIMPLE BUT SOUND YARA RULES

Feb 16, 2015 | LOKI, THOR, Tool, Tutorial, YARA

During the last 2 years I wrote approximately 2000 Yara rules based on samples found during our incident response investigations. A lot of security professionals noticed that Yara provides an easy and effective way to write custom rules based on strings or byte sequences found in their samples and allows them, as end users, to create their own detection tools.
However, it makes me sad to see that researchers publish mainly two types of rules: 1. rules that generate many false positives and 2. rules that match only the specific sample and are not much better than a hash value. I therefore decided to write an article on how to build optimal Yara rules, which can be used to scan single samples uploaded to a sandbox and whole file systems with a minimal chance of false positives. These rules are based on contained strings and are easy to comprehend. You do not need to understand the reverse engineering of executables, and I decided to avoid the new Yara modules like “pe”, which I still consider “testing” features that may lead to memory leaks or other errors when used in practice.

AUTOMATIC RULE GENERATION

At first I believed that automatically generated rules could never be as good as manually created ones. During my work for our IOC scanners THOR and LOKI I had to create hundreds of Yara rules manually, and it became clear that the manual approach has an obvious disadvantage. What I used to do was to extract UNICODE and ASCII strings from my samples with the following commands:

    strings -el sample.exe
    strings -a sample.exe

I prefer the UNICODE strings as they are often overlooked and less frequently changed within a certain malware/tool family. Make sure that you use UNICODE strings with the “wide” keyword and ASCII strings with the “ascii” keyword in your rules, and use “fullword” if there is a word boundary before and after the string.

The problem with this method is that you cannot decide whether a string returned by the commands is unique to this malware or often used in goodware samples as well. Look at the extracted strings in the following example:

    NTLMSSP
    %d.%d.%d.%d
    %s\IPC$
    \\%s
    NT LM 0.12
    %s%s%s
    %s.exe %s
    %s\Admin$\%s.exe
    RtlUpcaseUnicodeStringToOemString
    LoadLibrary( NTDLL.DLL ) Error:%d

Could you be sure that the string “NT LM 0.12” is a unique one, which is not used by legitimate software?
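The two `strings` commands above, and the question of whether a candidate string is unique, can be approximated in a few lines of Python. This is only a minimal sketch: the function names, the sample bytes and the `KNOWN_GOODWARE` set are hypothetical and made up for illustration, and the regular expressions only roughly mimic what `strings -a` and `strings -el` report (printable ASCII runs of at least four characters).

```python
import re

MIN_LEN = 4  # minimum run length, matching the usual default of `strings`


def extract_ascii(data, min_len=MIN_LEN):
    """Return printable ASCII runs, roughly what `strings -a` reports."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def extract_wide(data, min_len=MIN_LEN):
    """Return UTF-16LE runs of printable ASCII, roughly `strings -el`."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


def drop_goodware(candidates, goodware):
    """Keep only candidates that are absent from the goodware string set."""
    return [s for s in candidates if s not in goodware]


# Hypothetical goodware set; in practice it would be built by running the
# extractors over a large collection of benign files.
KNOWN_GOODWARE = {"NT LM 0.12"}

# Made-up sample bytes: one ASCII string, one wide string, one more ASCII string.
sample = (
    b"\x4d\x5a\x90\x00"
    + b"NT LM 0.12\x00\x01\x02"
    + "SVCH0ST.EXE".encode("utf-16-le")
    + b"\x00\x00\xff"
    + b"%s\\Admin$\\%s.exe\x00"
)

ascii_strings = extract_ascii(sample)
wide_strings = extract_wide(sample)
candidates = drop_goodware(ascii_strings + wide_strings, KNOWN_GOODWARE)
```

Strings found by `extract_wide` would go into a rule with the “wide” keyword and those found by `extract_ascii` with the “ascii” keyword; anything that survives `drop_goodware` is only a candidate and still needs a manual look.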
To answer this question I developed “yarGen”, a Yara rule generator that ships with a huge string database of common and benign software. To generate the database I used the Windows system folder files of Windows 2003, Windows 7 and Windows 2008 R2 server, typical software like Microsoft Office, 7zip, Firefox, Chrome, Cygwin and the program folders of various Antivirus solutions. yarGen allows you to generate your own database or add folders with more goodware to the existing database.

yarGen extracts all ASCII and UNICODE strings from a sample and removes all strings that also appear in the goodware string database. Then it evaluates and scores every string by using fuzzy regular expressions and the “Gibberish Detector” that allows yarGen to detect and prefer real language over character chains without meaning. The top 20 of the strings will be integrated in the resulting rule.

Let’s look at two examples from my work: a sample of the Enfal Trojan and an SMB Worm sample. yarGen generates the following rule for the Enfal Trojan sample:

    rule Enfal_Generic {
        meta:
            description = "Auto-generated rule - from 3 different files"
            author = "YarGen Rule Generator"
            reference = "not set"
            date = "2015/02/15"
            super_rule = 1
            hash0 = "6d484daba3927fc0744b1bbd7981a56ebef95790"
            hash1 = "d4071272cc1bf944e3867db299b3f5dce126f82b"
            hash2 = "6c7c8b804cc76e2c208c6e3b6453cb134d01fa41"
        strings:
            $s0 = "urlmon" fullword
            $s1 = "Registered trademarks and service marks are the property of their respec" wide
            $s2 = "Micorsoft Corportation" fullword wide
            $s3 = "IM Monnitor Service" fullword wide
            $s4 = "imemonsvc.dll" fullword wide
            $s5 = "iphlpsvc.tmp" fullword
            $s6 = "XpsUnregisterServer" fullword
            $s7 = "XpsRegisterServer" fullword
            $s8 = "{53A4988C-F91F-4054-9076-220AC5EC03F3}" fullword
            $s9 = "tEHt;HuD" fullword
            $s10 = "6.0.4.1624" fullword wide
            $s11 = "#*8;->)" fullword
            $s12 = "%/>#?#*8" fullword
            $s13 = "\\%04x%04x\\" fullword
            $s14 = "3,8,18" fullword
            $s15 = "3,4,15" fullword
            $s16 = "3,7,12" fullword
            $s17 = "3,4,13" fullword
            $s18 = "3,8,12" fullword
            $s19 = "3,8,15" fullword
            $s20 = "3,6,12" fullword
        condition:
            all of them
    }

The resulting string set contains many useful strings but also random ASCII characters ($s9, $s11, $s12) that do match on the given sample but are less likely to produce the same result on other samples of the family.

yarGen generates the following rule for the SMB Worm sample:

    rule sig_smb {
        meta:
            description = "Auto-generated rule - file smb.exe"
            author = "YarGen Rule Generator"
            reference = "not set"
            date = "2015/02/15"
            hash = "db6cae5734e433b195d8fc3252cbe58469e42bf3"
        strings:
            $s0 = "LoadLibrary( NTDLL.DLL ) Error:%d" fullword ascii
            $s1 = "SetServiceStatus failed, error code = %d" fullword ascii
            $s2 = "%s\\Admin$\\%s.exe" fullword ascii
            $s3 = "%s.exe %s" fullword ascii
            $s4 = "iloveyou" fullword ascii
            $s5 = "Microsoft@ Windows@ Operating System" fullword wide
            $s6 = "\\svchost.exe" fullword ascii
            $s7 = "secret" fullword ascii
            $s8 = "SVCH0ST.EXE" fullword wide
            $s9 = "msvcrt.bat" fullword ascii
            $s10 = "Hello123" fullword ascii
            $s11 = "princess" fullword ascii
            $s12 = "Password123" fullword ascii
            $s13 = "Password1" fullword ascii
            $s14 = "config.dat" fullword ascii
            $s15 = "sunshine" fullword ascii
            $s16 = "password <=14" fullword ascii
            $s17 = "del /a %1" fullword ascii
            $s18 = "del /a %0" fullword ascii
            $s19 = "result.dat" fullword ascii
            $s20 = "training" fullword ascii
        condition:
            all of them
    }

The resulting rules are good enough to use as they are, but they are far from an optimal solution. Still, it is good that so many strings have been found that do not appear in the analyzed goodware samples.

If you don’t want to use or download yarGen, you could also use the online tool Yara Rule Generator provided by Joe Security, which was inspired by/based on yarGen.

It is not necessary to use a generator if your eye is trained and experienced.
In this case, just read the next section and select the strings to match the requirements of what I call sufficiently generic Yara rules.

SUFFICIENTLY GENERIC YARA RULES

As I said in the introduction, rules that generate false positives are pretty annoying. However, the real tragedy is that most of the rules are far too specific to match on more than one sample and are therefore about as useful as a file hash.

What I tend to do with the rules is to check all the strings and put them into at least 2 different categories:

* Very specific strings = hard indicators for a malicious sample
* Rare strings = likely that they do not appear in goodware samples, but possible
* Strings that look common = (Optional) e.g. yarGen output strings that do not seem to be specific but didn’t appear in the goodware string database

Check out the modified rules in order to understand this splitting. Ignore the definition named $mz for now (I’ll explain it later) and look at the string definitions below it. The definitions starting with $s contain the very specific strings, which I regard as so special that they would not appear in legitimate software. Note the typos in both strings: “Micorsoft Corportation” instead of “Microsoft Corporation” and “Monnitor” instead of “Monitor”. The strings starting with $x seem to be special (I tend to google the strings), but I cannot say if they also appear in legitimate software. The definitions starting with $z seem to be ordinary but have not been part of the goodware string database, so they have to be special in some way.
    rule Enfal_Malware_Backdoor {
        meta:
            description = "Generic Rule to detect the Enfal Malware"
            author = "Florian Roth"
            date = "2015/02/10"
            super_rule = 1
            hash0 = "6d484daba3927fc0744b1bbd7981a56ebef95790"
            hash1 = "d4071272cc1bf944e3867db299b3f5dce126f82b"
            hash2 = "6c7c8b804cc76e2c208c6e3b6453cb134d01fa41"
        strings:
            $mz = { 4d 5a }

            $s1 = "Micorsoft Corportation" fullword wide
            $s2 = "IM Monnitor Service" fullword wide

            $x1 = "imemonsvc.dll" fullword wide
            $x2 = "iphlpsvc.tmp" fullword
            $x3 = "{53A4988C-F91F-4054-9076-220AC5EC03F3}" fullword

            $z1 = "urlmon" fullword
            $z2 = "Registered trademarks and service marks are the property of their" wide
            $z3 = "XpsUnregisterServer" fullword
            $z4 = "XpsRegisterServer" fullword
        condition:
            ( $mz at 0 ) and
            ( ( 1 of ($s*) ) or ( 2 of ($x*) and all of ($z*) ) ) and
            filesize < 40000
    }

Now check the condition statement and notice that I combine the string matches with the magic header of an executable, defined by $mz, and a file size limit to exclude typical false positives like Antivirus signature files, browser cache or dictionary files. Set an ample file size value to avoid false negatives (e.g. samples between 100K and 200K => set file size < 300K).

You can see that I decided that a single occurrence of one of the very specific strings would trigger that rule:

    ( 1 of ($s*) )

Then I combine a bunch of less unique strings with most or all of the ordinary looking strings:

    ( 2 of ($x*) and all of ($z*) )

Let’s look at the second example (see below). $s1 is a very special string with string formatting placeholders “%s” in combination with an Admin$ share. $s2 seems to be the typical “svchost.exe” but contains the number “0” instead of an “O”, which is very uncommon and a clear indicator for something malicious. All the definitions starting with $a are special, but I cannot say for sure if they won’t appear in legitimate software. The strings defined by $x seem ordinary but were produced by yarGen, which means that they did not appear in the goodware string database.
This special example contains a list of typical passwords, which is defined by $z1..$z8.

    rule SMB_Worm_Tool_Generic {
        meta:
            description = "Generic SMB Worm/Malware Signature"
            author = "Florian Roth"
            reference = "http://goo.gl/N3zx1m"
            date = "2015/02/08"
            hash = "db6cae5734e433b195d8fc3252cbe58469e42bf3"
        strings:
            $mz = { 4d 5a }

            $s1 = "%s\\Admin$\\%s.exe" fullword ascii
            $s2 = "SVCH0ST.EXE" fullword wide

            $a1 = "LoadLibrary( NTDLL.DLL ) Error:%d" fullword ascii
            $a2 = "\\svchost.exe" fullword ascii
            $a3 = "msvcrt.bat" fullword ascii
            $a4 = "Microsoft@ Windows@ Operating System" fullword wide

            $x1 = "%s.exe %s" fullword ascii
            $x2 = "password <=14" fullword ascii
            $x3 = "del /a %1" fullword ascii
            $x4 = "del /a %0" fullword ascii
            $x5 = "SetServiceStatus failed, error code = %d" fullword ascii

            $z1 = "secret" fullword ascii
            $z2 = "Hello123" fullword ascii
            $z3 = "princess" fullword ascii
            $z4 = "Password123" fullword ascii
            $z5 = "Password1" fullword ascii
            $z6 = "sunshine" fullword ascii
            $z7 = "training" fullword ascii
            $z8 = "iloveyou" fullword ascii
        condition:
            $mz at 0 and
            (
                ( 1 of ($s*) and 1 of ($x*) ) or
                ( all of ($a*) and 2 of ($x*) ) or
                ( 5 of ($z*) and 2 of ($x*) )
            ) and
            filesize < 200000
    }

You see that I combined the string definitions in a similar way as before. The three alternatives are wrapped in one pair of parentheses because “and” binds more tightly than “or” in Yara conditions; without that grouping, the magic header and file size checks would not apply to every alternative. This method in combination with the magic header and the file size should be a good starting point for the final stage: testing.

TESTING

Testing the rules is very important. It seems that most authors decide that the rules are good enough if they match on the given samples. You should definitely do the following checks:

1. Scan the malware samples
2. Scan a big goodware archive

To carry out the tests, download the Yara scanner and run it from the command line. The goodware directory should include system files from various Windows versions, typical software and possible false positive sources (e.g. typical CMS software if you wrote Yara rules that match on malicious web shells).

Figure: Yara Rule Testing on Samples and Goodware

If the rule matched on the malicious samples and did not generate a match on the goodware archive, your rule is good enough to test in practice.

UPDATE

Make sure to check Part 2 of “How to Write Simple and Sound YARA Rules”.
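The two checks from the TESTING section can be wrapped in a small helper. The following is a sketch under stated assumptions, not the workflow used for the rules above: it assumes the `yara` command-line scanner is on the PATH, that it is called as `yara -r RULES_FILE DIRECTORY`, and that it prints one `RULE_NAME FILE_PATH` line per match. The function names and the file paths in the usage note are hypothetical.

```python
import subprocess


def run_yara(rules_file, target_dir):
    """Scan target_dir recursively with the yara CLI and return its stdout."""
    result = subprocess.run(
        ["yara", "-r", rules_file, target_dir],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def parse_matches(output):
    """Map each matched file path to the set of rule names that hit it."""
    matches = {}
    for line in output.splitlines():
        # Rule names contain no spaces, so the first space separates
        # the rule name from the (possibly space-containing) path.
        rule, sep, path = line.strip().partition(" ")
        if sep and path:
            matches.setdefault(path, set()).add(rule)
    return matches


def evaluate(sample_paths, malware_output, goodware_output):
    """Check 1: every sample is matched. Check 2: no goodware file is matched."""
    hits = parse_matches(malware_output)
    missed = sorted(p for p in sample_paths if p not in hits)
    false_positives = sorted(parse_matches(goodware_output))
    return {
        "missed": missed,
        "false_positives": false_positives,
        "ok": not missed and not false_positives,
    }
```

A run could look like `evaluate(sample_paths, run_yara("rules.yar", "samples"), run_yara("rules.yar", "goodware"))`, where `sample_paths` lists the sample files in the same form the scanner prints them. Any entry in `missed` or `false_positives` means the rule needs another round of tuning before it is tried in practice.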