Software Protection Using Obfuscation
This definition of obfuscation, in particular the "Virtual Black Box" property, is too strong: obfuscators will still be of practical use even if they do not provide perfect black boxes. The definition also gives no indication of the quality of a proposed obfuscation technique (the measure only says whether the transformation completely obfuscates or not). Since the publication of Barak et al. [2], the focus of obfuscation research has shifted to designing obfuscations that are difficult, but not necessarily impossible, for an attacker to undo.

III. OBFUSCATING TRANSFORMATIONS
In this part we explain some obfuscation techniques and extend some of them.

A. Opaque predicates
One of the most valuable obfuscation techniques is the use of opaque predicates. An opaque predicate $P$ is a predicate whose value is known at obfuscation time. $P^T$ denotes a predicate which is always True (similarly, $P^F$ denotes one which is always False), and $P^?$ denotes a predicate which sometimes evaluates to True and sometimes to False.
Here are some example predicates which are always True (supposing that x and y are integers):
    $x^2 \geq 0$
    $x^2 (x+1)^2 \equiv 0 \pmod{4}$
    $x^2 \neq 7y^2 - 1$
Opaque predicates can be used to transform a program block B as follows:
o if (P^T) { B; }
  This hides the fact that B will always be executed.
o if (P^F) { B'; } else { B; }
  Since the predicate is always false, we can make the block B' a copy of B which may contain errors.
o if (P^?) { B; } else { B'; }
  In this case, we can have two copies of B, each with the same functionality.
When creating opaque predicates, we must ensure that they are stealthy, so that it is not obvious to an attacker that a predicate is in fact bogus. So we must choose predicates that match the "style" of the program and, if possible, use expressions that are already used within the program.

1) Extensions of Opaque Predicates
As mentioned in the previous section, one of the limitations of opaque predicates is that often the value of an opaque predicate is true for all possible inputs.
This led to the development of dynamically opaque predicates. These are a set of predicates which all evaluate to the same result in any given run, but which may evaluate to different values in different runs. For example, if we had a block of three statements S1; S2; S3 and two linked predicates p and q (which evaluate to the same value in the same run), then we can obfuscate the block as follows:
    if (p) S1; else {S1; S2; }
    if (q) {S2; S3; } else S3;
In either case the statements S1, S2, S3 are executed exactly once and in order. This transformation is valid under certain conditions, such as ensuring that the statements S1 and S2 do not change the value of q. We can easily extend this to obfuscate larger blocks with a bigger set of dynamically opaque predicates, but the conditions for ensuring that the transformation is valid become more complex.

B. Variable transformations
In this section, we show how to transform an integer variable i within a method. To do this, we define two functions f and g:
    $f :: X \to Y$
    $g :: Y \to X$
where $X \subseteq \mathbb{Z}$ represents the set of values that i takes.
We require that g is a left inverse of f (and so f needs to be injective). To replace the variable i with a new variable, j say, of type Y, we need to perform two kinds of replacement, depending on whether we have an assignment to i or a use of i. An assignment to i is a statement of the form i = V, and a use of i is an occurrence of i which is not an assignment. The two replacements are:
o Any assignment to i of the form i = V is replaced by j = f(V).
o Any use of i is replaced by g(j).
These replacements can be used to obfuscate a while loop.

C. Loops
In this section, we show some possible obfuscations of this simple while loop:
    i = 1;
    while (i < 100)
    {
        ...
        i++;
    }
We can obfuscate the loop counter i. One possible way is to use a variable transformation. We define functions f and g to be:
    $f(i) = 2i + 3$
    $g(i) = (i - 3) \operatorname{div} 2$
and we can verify that $g \cdot f = \mathrm{id}$.
Using the rules for variable transformation, we obtain:
    j = 5;
    while ((j − 3)/2 < 100)
    {
        ...
        j = (2 * (((j − 3)/2) + 1)) + 3;
    }
With some simplifications, the loop becomes:
    j = 5;
    while (j < 203)
    {
        ...
        j = j + 2;
    }
Another method we could use is to introduce a new variable, k say, into the loop and put an opaque predicate (depending on k) into the guard. The variable k performs no function in the
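The loop-counter transformation from the Loops section can be checked with a short Python sketch. The functions f and g follow the definitions given in the text; the helper names original_loop and transformed_loop, and the idea of recording the visited counter values in a list, are illustrative assumptions rather than part of the paper. The sketch confirms that g is a left inverse of f on a sample range and that the simplified loop visits the same logical counter values as the original one.

```python
def f(i: int) -> int:
    """Encode the original counter value: f(i) = 2i + 3."""
    return 2 * i + 3


def g(j: int) -> int:
    """Decode the transformed value: g(j) = (j - 3) div 2."""
    return (j - 3) // 2


def original_loop() -> list[int]:
    """Run the unobfuscated loop and record each value of i seen in the body."""
    visited = []
    i = 1
    while i < 100:
        visited.append(i)  # stands in for the loop body "..."
        i += 1
    return visited


def transformed_loop() -> list[int]:
    """Run the simplified obfuscated loop; uses of i are replaced by g(j)."""
    visited = []
    j = 5  # j = f(1)
    while j < 203:
        visited.append(g(j))  # stands in for the loop body "..."
        j = j + 2
    return visited


# g undoes f on a sample range of integers (g . f = id).
left_inverse_holds = all(g(f(i)) == i for i in range(-1000, 1000))

# Both loops see the same sequence of logical counter values.
same_behaviour = original_loop() == transformed_loop()

print(left_inverse_holds, same_behaviour)
```

Note that g is only a left inverse: g(f(i)) = i for every integer i, but f(g(j)) = j holds only for odd j, which is why the transformed counter must start from a value in the image of f.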
5. Initialize a set of equations $E_1 = \{C_0 = IV_0, \ldots, C_{n-1} = IV_{n-1}\}$ which expresses the current state of the memory cells as a function of their initial values.
6. Initialize a set of equations $E_2 = \{\}$ which expresses how a piece $P_i$ can be recovered in cleartext from the initial values $IV_0, \ldots, IV_{n-1}$.
7. Initialize a table $next = \langle P_0 = ?, \ldots, P_{n-1} = ? \rangle$ which maps each subprogram $P_i$ to the cell it should jump to in order to execute $P_{i+1}$.
8. make_obscure()
The other function that is important for this algorithm is make_obscure(). For this method we have:
For $p \in [0 \ldots n-1]$ do
1. Select a cell $C_c$ to hold piece $P_p$ in cleartext.
2. Consult $E_1$ to find the current contents $V$ of $C_c$. Update $E_2 := E_2[P_p = V]$. Using Gaussian elimination, try to invert $E_2$ (i.e. find values for all the $IV_i$'s). If there is no solution, select another cell for $P_p$.
3. $next := next[P_{p-1} = C_c]$. For even (odd) values of p, simulate a mutation where every cell $C_i$ in upper (lower) space is XORed with its partner cell $C_{PF(i)}$ in lower (upper) space.
4. $E_1 := E_1[C_{PF(i)} = C_{PF(i)} \oplus C_i]$.

D. Ideas for dynamic obfuscation
Dynamic obfuscation is a very promising method in obfuscation theory for the protection of program code. Our opinion is that, at this stage, dynamic obfuscation does not yet have a good algorithm for code protection. Perhaps the best approach is to combine two areas of protection: obfuscation and encryption.
We think that a mixture of obfuscation and encryption can give really good results for code protection. The reasons for this are:
o If we use AES to encrypt already obfuscated code, we add a second layer of protection to the program.
o AES has not been broken to date. In other words, there are no known methods that break all rounds of the Rijndael algorithm.
The main idea is not to obfuscate all of the code that is executed, because that would significantly slow down the whole application. Code that needs to be transformed by the transformer should also be encrypted, and the transformer should hold the encryption key so that it can decrypt the obfuscated code at the same time. This may slow the code down a little, but it adds a second layer of protection.
It should be mentioned that, in any case, dynamic obfuscation offers a higher level of protection than static obfuscation, because it dynamically changes the program every time it is executed.

V. OBFUSCATING MALWARE
Today we face more and more sophisticated methods for hiding malicious software so that it can penetrate the user's operating system and programs. Antivirus programs are constantly trying to detect obfuscated viruses that use dynamic obfuscation. We think that the hackers are a step ahead. The first reason is a basic principle: one needs a reason before one can act. In our case, companies that produce antivirus programs first need to detect a virus, and only then can they build software or tools for removing it from the user's PC.
Dynamic obfuscation is a powerful tool in the hands of a hacker. The second reason is that every time the virus code changes dynamically, it has to be detected again by the antivirus program.
Because this paper concentrates on methods for obfuscation, we will present some basic methods for implementing dynamic obfuscation in malware.
In the conclusion we will discuss methods for deobfuscating and removing malware. However, the only solution for detecting malware is deobfuscation, followed by dedicated removal software (antivirus) for that particular virus, worm, etc.

A. Obfuscation methods for malware software
1) Trojan.Clampi – Virtualization
Trojan.Clampi used a commercial tool to obfuscate its code. Essentially, this tool converts the existing instructions into an intermediate language. It also embeds a custom interpreter, known as a virtual machine, to interpret this custom language. Reverse engineering this code requires an understanding of how the virtual machine processes the custom code. While not impossible, this can be a very time-consuming task. To understand Clampi, one cannot simply rearrange blocks back into a readable order, but must decipher what is essentially an entirely new pseudo-language for each new sample.
2) Entry point obfuscation
Modifying an executable's start address, or the code at the original start address, constitutes extremely suspicious behavior for anti-virus heuristics. A virus can try to get control elsewhere instead; this is called entry point obfuscation, or EPO.
3) Packers
Packers are a throwback to the days when the Internet was still a research toy and computer storage space was at a premium. System RAM and disks were much smaller in the 1980s and early 1990s. To keep the size of binary executables to an absolute minimum, so-called packing tools that encrypted and compressed files were popularized. While packers still have legitimate uses today (bundling executables with component files and commercial software protection), this technique was adopted and extended by malware authors to add polymorphism, armoring, metamorphism, EPO, and a host of other techniques aimed at evading AV scanners.
Packers offer powerful benefits to malware authors. When creating a new strain of an existing malware, if the malware author modifies most of the code but leaves parts of it intact (or picks and chooses pieces from other existing malware), the resultant executable will share patterns with its relatives. This means that if any signature exists for any piece of the antecedent, an AV scanner can match this pattern. However, packing the file with a packer means that just a tiny change in the source (for example, changing a register name) will result in a radically different binary executable. This effect is akin to how a single-letter change in a lengthy document results in a completely different cryptographic hash. There are literally thousands of discrete packing tools that are used to compress, encrypt, and armor malware.
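The hash analogy used in the discussion of packers can be illustrated with a small Python sketch. It relies only on the standard library's hashlib module; the sample document text and the helper names sha256_hex and differing_bits are illustrative assumptions and do not come from the paper. The sketch hashes two documents that differ in a single letter and counts how many of the 256 output bits of SHA-256 differ between the two digests.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# A lengthy sample document and a copy with exactly one letter changed
# ("quick" -> "quack" in the first sentence only).
document = "The quick brown fox jumps over the lazy dog. " * 50
changed = document.replace("quick", "quack", 1)

digest_a = sha256_hex(document)
digest_b = sha256_hex(changed)

print(digest_a)
print(digest_b)
print(differing_bits(digest_a, digest_b), "of 256 bits differ")
```

For a well-designed hash function, roughly half of the output bits are expected to change, so the two digests share no pattern a scanner could match, which is the effect the text attributes to repacking a slightly modified file.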
VI. CONCLUSION
Obfuscation as a field of research still has a long way to go before really stable methods for code protection are found. At the moment, however, it is the most sustainable method compared to watermarking, tamper-proofing, and other techniques for software and intellectual property protection.
Static obfuscation, as the predecessor of dynamic obfuscation, provided an excellent and stable foundation for dynamic obfuscation. As a method, we think that static obfuscation still has to be developed further, but it does not have a bright future. The key method for