There is a lot of debate about the use of obfuscation in software development. Some people argue that it is useful for hiding how a program works, while others claim that it has no real benefits. In this article, I will try to answer the question: is code obfuscation useful?
There are a few reasons why code obfuscation may be useful. First, it makes your software harder to read and understand, which makes it more difficult for someone to copy your work or to figure out how your system works and look for ways to exploit it. Second, it can hide symbols and strings that would otherwise point an attacker straight to the interesting parts of the code. Keep in mind that obfuscation does not fix security vulnerabilities such as buffer overflows or cross-site scripting (XSS); at best it makes them slower to find.
There are also some disadvantages. First, some people argue that obfuscation has no real benefit: it does nothing for performance, and a determined attacker can still work through it. Second, it adds complexity to your build, and the added difficulty applies to you as well as to attackers. Finally, it can make debugging harder, because stack traces and decompiled output from the shipped binary no longer match the names in your source code.
Certain languages like Java and .NET can be easily decompiled into readable source code. Code obfuscation is a process that makes your application binaries harder to read with a decompiler. It’s an important tool for protecting your business’s intellectual property.
Why Obfuscate Code?
Compiled languages like C++ get converted directly to machine code. The only way to reverse engineer how they work is with a disassembler, which is an arduous and complicated process. It’s not impossible, but trying to infer high-level application logic from a stream of assembly language is hard.
On the other hand, languages like C# and Java aren’t compiled for any particular operating system. Rather, they’re compiled to an intermediate language, like .NET’s MSIL or Java bytecode. The intermediate language is similar to assembly, but it keeps enough structure and naming information that it can be easily converted back into something very close to the original source code. This means that if you have a public DLL or executable that your business is distributing, anyone with a copy of it can open it up in a .NET decompiler like dotPeek, and directly read (and copy) your source code.
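As a loose analogy for why this works, here’s a small Python sketch. Python is used only because it’s easy to run; it is not MSIL or Java bytecode, and the details of what each platform keeps differ. The `monthly_payment` function is made up for illustration. The point is just that the compiled code object still carries the identifiers from the source, which is the kind of information a decompiler leans on.

```python
# Hypothetical example function; any function would do.
def monthly_payment(principal, annual_rate, months):
    monthly_rate = annual_rate / 12
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -months)


# The compiled code object attached to the function.
code = monthly_payment.__code__

# The function name survives compilation...
print(code.co_name)

# ...and so do the parameter and local variable names.
print(code.co_varnames)
```

Anything that survives compilation like this is available to whoever has the binary, which is exactly what renaming-based obfuscation goes after.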
Code obfuscation can’t prevent this process—any .NET DLL can be plugged into a decompiler. What obfuscation does do is use a number of tricks to make the source code annoying as hell to read and debug.
The simplest form of this is entity renaming. It’s common practice to name variables, methods, classes, and parameters according to what they do. But you don’t have to, and technically there’s nothing stopping you from naming them with a series of lowercase L’s and uppercase I’s, or random combinations of Chinese Unicode characters. To the computer, there’s no issue, but to a human the result is completely illegible.
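To make that concrete, here’s a minimal sketch in Python, picked only for readability. Real obfuscators for .NET or Java rename symbols in the compiled output rather than in your source, and the function names and discount logic below are invented for illustration.

```python
# Readable version: the names describe the intent.
def calculate_discount(price, customer_years):
    loyalty_rate = min(customer_years * 0.02, 0.20)
    return price * (1 - loyalty_rate)


# The same logic after a hypothetical renaming pass: every identifier is
# replaced with a look-alike mix of lowercase "l" and uppercase "I".
def lIlI(llII, IlIl):
    lIIl = min(IlIl * 0.02, 0.20)
    return llII * (1 - lIIl)


# Both functions perform the same operations, so they agree on every input.
print(calculate_discount(100.0, 5) == lIlI(100.0, 5))
```

Nothing about the behavior changes; only the hints a human reader would rely on are gone.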
A basic obfuscator will handle this process automatically, taking the output from the build and converting it to something that’s a lot harder to read. Since only the names change, there’s no performance hit compared to non-obfuscated code.
More advanced obfuscators can go further, and actually change the structure of your source code. This includes replacing control structures with more complicated but semantically identical syntax. They can also insert dummy code that doesn’t do anything except confuse the decompiler. The effect of this is that it makes your source look like spaghetti code, making it more annoying to read.
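As a rough sketch of what such a transformation looks like, here’s a simple loop next to a hand-written “flattened” version, where the loop becomes a state machine with a bit of dead code mixed in. This is Python for illustration only. The function names are hypothetical, and an actual obfuscator would apply this kind of rewrite to IL or bytecode, usually far more aggressively.

```python
# Original: a plain loop that is easy to follow.
def sum_of_squares(limit):
    total = 0
    for n in range(limit):
        total += n * n
    return total


# Flattened: the same loop expressed as a dispatcher over numbered states.
def sum_of_squares_flattened(limit):
    state, total, n, junk = 0, 0, 0, 7
    while True:
        if state == 0:
            # Loop condition: continue to the body or jump to the exit.
            state = 1 if n < limit else 3
        elif state == 1:
            # Loop body.
            total += n * n
            # Dummy work: junk is updated but never read for the result.
            junk = (junk * 31 + n) % 97
            state = 2
        elif state == 2:
            # Loop increment, then back to the condition.
            n += 1
            state = 0
        else:
            # Exit state.
            return total


print(sum_of_squares(10) == sum_of_squares_flattened(10))
```

The two functions return the same values, but the second one hides the fact that it’s just a `for` loop, and the extra state transitions and dummy arithmetic are where the runtime cost of this kind of obfuscation comes from.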
Another common focus is hiding strings from decompilers. In managed executables, you can search for strings like error messages to locate sections of code. String obfuscation replaces strings with encoded data that is decoded at runtime, so a plain text search in a decompiler no longer finds them. This usually comes with a performance penalty, since every protected string has to be decoded before it can be used.
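Here’s a minimal sketch of the idea in Python, assuming a single-byte XOR followed by Base64. The key, the helper names, and the error message are all hypothetical, and this scheme is nowhere near real encryption; commercial obfuscators use their own, more involved encodings.

```python
import base64

KEY = 0x5A  # hypothetical single-byte key, for illustration only


def encode_string(plain: str) -> str:
    """Build-time step: XOR every byte with KEY, then Base64-encode."""
    data = bytes(b ^ KEY for b in plain.encode("utf-8"))
    return base64.b64encode(data).decode("ascii")


def decode_string(blob: str) -> str:
    """Runtime step: reverse the Base64 and XOR to recover the text."""
    data = base64.b64decode(blob)
    return bytes(b ^ KEY for b in data).decode("utf-8")


# In a real build only the encoded blob would ship. The plaintext appears
# here solely to keep the example self-contained.
HIDDEN_ERROR = encode_string("License check failed")

print(HIDDEN_ERROR)                 # what a text search would see
print(decode_string(HIDDEN_ERROR))  # what the program uses at runtime
```

Searching the shipped code for the original message turns up nothing, but anyone who finds `decode_string` can undo it, which is why this slows attackers down rather than stopping them.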
There are plenty of options for obfuscators, though it will depend on which language you are obfuscating. For .NET, there’s Obfuscar. For Java, there’s ProGuard. For JavaScript, there’s javascript-obfuscator.
Other Options: Convert to a Compiled Language
Converting one programming language to another isn’t an entirely crazy idea. Unity uses IL2CPP, a converter that transforms compiled .NET IL into C++ source, which is then compiled to native machine code. It’s a lot more performant, but it also helps secure games against easy cracking, which is crucial for an environment plagued by piracy and cheaters.
Microsoft explored the same approach with CoreRT, an experimental .NET Core runtime using ahead-of-time compilation. That work has since shipped as Native AOT, available starting with .NET 7.
Should You Obfuscate?
If you’re deploying code in untrusted environments where you want to protect your source code, you should almost always use at least a basic obfuscator to rename functions, methods, and properties to make decompiling take a bit more effort.
If you really need to keep people from decompiling your app, you can use a more intrusive obfuscator, but you should consider whether the problem would be better solved by switching to a language that compiles to native code and doesn’t have this issue, such as C++ or Rust.