.NET Framework MSIL: What is Obfuscation?

Jul 8, 2011 10:33

M Wright

Q: What is Obfuscation?

A: Obfuscation protects your code from reverse engineering by making it so confusing that it cannot easily be decompiled into human-readable code. A well-written .NET obfuscator does this automatically by modifying assemblies after they have been compiled. It alters the code so that it still executes in the same way, but any attempt to decompile the assemblies produces only meaningless code that confuses human readers.

Basic .NET obfuscators simply rename the identifiers within the code to randomly generated names, i.e. all method and class names are renamed to meaningless words. They may use hashing techniques or arithmetically offset the characters to unreadable or unprintable characters. These techniques make the code very hard to understand and navigate, but given time, and somewhat more effort than a non-obfuscated assembly would require, the assemblies can still be reverse-engineered.
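To make the renaming idea concrete, here is a minimal sketch in C#. It works on plain strings, whereas a real obfuscator rewrites the names stored in the compiled assembly. The class `RenameSketch` and its methods are hypothetical names invented for this illustration, and the sketch hands out sequential short names rather than random ones so that the result is easy to follow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: not taken from any real obfuscator.
public static class RenameSketch
{
    // Produces "a", "b", ..., "z", "aa", "ab", ... for index 0, 1, 2, ...
    public static string ShortName(int index)
    {
        var chars = new Stack<char>();
        do
        {
            chars.Push((char)('a' + index % 26));
            index = index / 26 - 1;
        } while (index >= 0);
        // Stack<T>.ToArray returns the most recently pushed item first,
        // which puts the most significant letter at the front.
        return new string(chars.ToArray());
    }

    // Gives every distinct identifier the next unused short name.
    public static Dictionary<string, string> BuildRenameMap(IEnumerable<string> identifiers)
    {
        var map = new Dictionary<string, string>();
        foreach (string id in identifiers)
        {
            if (!map.ContainsKey(id))
                map[id] = ShortName(map.Count);
        }
        return map;
    }

    // Arithmetic offset: shifts every character of a name by a fixed amount.
    public static string OffsetName(string name, int offset)
    {
        return new string(name.Select(c => (char)(c + offset)).ToArray());
    }
}
```

With this sketch, `BuildRenameMap(new[] { "IncreaseSalaries", "NotifyEmployee" })` maps the first name to `a` and the second to `b`. `OffsetName` is trivially reversible by applying the negative offset, which is one reason renaming alone slows an attacker down rather than stopping them.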

Advanced .NET obfuscators provide even more protection. They not only rename the symbol identifiers but also change the underlying MSIL code within the assemblies, making the code almost impossible for decompilation software to decompile. While it will always be possible to analyse the MSIL code manually and reverse-engineer an assembly, code that defeats automated decompilation software is so laborious for a human to reverse-engineer that it is almost certainly not worth the effort it would take to do so.
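The article does not say which MSIL changes are meant. One commonly described example is control-flow obfuscation, where the logic is kept but its structure is scrambled. The sketch below shows the idea at source level in C#, on the assumption that this is the kind of change intended. A real obfuscator would apply it to the MSIL itself, and `ControlFlowSketch` and its methods are hypothetical names used only here.

```csharp
// Illustrative only: both methods return the same result for every n.
public static class ControlFlowSketch
{
    // Straightforward version: sums the integers 1..n.
    public static int SumPlain(int n)
    {
        int total = 0;
        for (int i = 1; i <= n; i++)
            total += i;
        return total;
    }

    // Same behaviour, restructured as a state machine. The loop is no
    // longer visible as a loop; the order of the case labels is arbitrary.
    public static int SumFlattened(int n)
    {
        int total = 0, i = 0, state = 2;
        while (true)
        {
            switch (state)
            {
                case 2: i = 1; state = 0; break;                 // initialise
                case 0: state = (i <= n) ? 3 : 1; break;         // loop test
                case 3: total += i; i++; state = 0; break;       // loop body
                case 1: return total;                            // exit
            }
        }
    }
}
```

Both methods compute the same sum, but the second gives a reader, or a decompiler trying to rebuild structured code, far less to recognise.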

Basic obfuscation (i.e. symbol renaming) can be further enhanced by overload induction, which takes symbol renaming a step further by reusing symbol names wherever possible. If two methods have different parameter lists, they can be given the same identifier even if they have completely different functionality. This adds further confusion, since the majority of methods within the assemblies end up with the same symbol names.
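The name assignment behind overload induction can be sketched as follows. This is a simplified illustration, not the algorithm of any particular product. It assumes the methods of a single type are given as pairs of original name and parameter signature, that original names are unique, and that no signature occurs more than 26 times. `OverloadInductionSketch` and `AssignNames` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: reuses the shortest name for every distinct signature.
public static class OverloadInductionSketch
{
    public static Dictionary<string, string> AssignNames(
        IEnumerable<(string Name, string Signature)> methods)
    {
        // How many names have already been handed out per signature.
        var usedPerSignature = new Dictionary<string, int>();
        var renamed = new Dictionary<string, string>();

        foreach (var (name, signature) in methods)
        {
            usedPerSignature.TryGetValue(signature, out int used);
            // Two methods only need different names if their signatures
            // clash, so every new signature starts again at "a".
            renamed[name] = ((char)('a' + used)).ToString();
            usedPerSignature[signature] = used + 1;
        }
        return renamed;
    }
}
```

Given `IncreaseSalaries(EmployeeInfoCollection)`, `NotifyEmployee(EmployeeInfo)` and a made-up third method `ResetSalaries(EmployeeInfoCollection)`, the first two both become `a`, and only the third needs the name `b`, because its signature clashes with the first.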

A side effect of the symbol renaming used by .NET obfuscators is that stack traces produced in error messages are no longer human-readable. Advanced .NET obfuscators provide the ability to parse these obfuscated stack traces and return a human-readable version. In general this functionality is available only to the person or company who obfuscated the code in the first place, and is controlled either by password-encrypted symbol names or by symbol name lookup files.
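As a rough illustration of the lookup-file approach, the sketch below translates the frames of an obfuscated stack trace back to their original names. It assumes the lookup data has already been loaded into a dictionary keyed by the full obfuscated frame text, and that frames start with the English `at ` prefix. The format of real lookup files varies by tool, and `StackTraceSketch`, `Deobfuscate` and the `Payroll` type in the usage note are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: real tools use their own map formats and matching rules.
public static class StackTraceSketch
{
    public static string Deobfuscate(
        string stackTrace, IReadOnlyDictionary<string, string> lookup)
    {
        var lines = stackTrace.Split('\n').Select(line =>
        {
            string trimmed = line.Trim();
            // Leave anything that is not a stack frame untouched.
            if (!trimmed.StartsWith("at ", StringComparison.Ordinal))
                return line.TrimEnd('\r');

            string frame = trimmed.Substring(3);
            // Replace the frame only if the lookup data knows it.
            return lookup.TryGetValue(frame, out var original)
                ? "   at " + original
                : line.TrimEnd('\r');
        });
        return string.Join("\n", lines);
    }
}
```

For example, with a lookup entry mapping `c.a(a b)` to `Payroll.IncreaseSalaries(EmployeeInfoCollection employees)`, the frame `   at c.a(a b)` is rewritten to `   at Payroll.IncreaseSalaries(EmployeeInfoCollection employees)`, while frames without an entry pass through unchanged. Keying on the whole frame, parameters included, matters here because overload induction leaves many methods sharing the same short name.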


Obfuscation Example:
The following C# example demonstrates symbol renaming in conjunction with overload induction:

Source Code Before Obfuscation:
        private void IncreaseSalaries(EmployeeInfoCollection employees) {
           while (employees.HasMore()) {
              employee = employees.GetNext(true);
              employee.IncreaseSalary();
              NotifyEmployee(employee);
           }
        }


Reverse-Engineered Source Code After Obfuscation:
        private void a(a b) {
           while (b.a()) {
              a = b.a(true);
              a.a();
              a(a);
           }
        }


The renaming in the above example not only makes the code incredibly difficult to understand; the shorter symbol names also take up less space in the assembly's metadata, resulting in smaller assemblies.