Week 3 - Lecture

Malware Forensics
x86 Assembly I (Review)
Instruction Set Architecture (ISA)
● Is an abstract model of a computer, which defines the supported instructions,
data types, registers, etc.
● ISA specifies the behavior of machine code running on implementations of
that ISA
● Examples: Intel x86-64, ARM variants, MIPS, etc
○ https://www.mips.com/products/architectures/mips32-2/
Micro-architecture
● Micro-architecture is the way a given instruction set architecture (ISA) is
implemented in hardware (i.e., microchips), that is, the design of the processor itself.

MIPS32, 5-stage pipeline, Harvard architecture


https://github.com/grantae/mips32r1_core/tree/master/mips32r1
Fetch-Execute Cycle
All CPUs follow this process:
● Fetch instruction from RAM
● Decode instruction
● Execute instruction
○ Result is stored in local registers or sent back to RAM (external memory e.g., DDR)
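The cycle above can be sketched as a toy interpreter. This is only an illustration: the tuple "encoding", the register set, and the three supported operations are assumptions made for the example, not real x86 machine code.

```python
# Toy fetch-decode-execute loop. The tuple "encoding" and the three
# operations are assumptions for illustration, not real x86 machine code.

def run(program):
    regs = {"eax": 0, "ebx": 0}
    pc = 0  # program counter: index of the next instruction to fetch
    while pc < len(program):
        instr = program[pc]              # fetch
        op, dst, src = instr             # decode
        pc += 1
        value = regs[src] if src in regs else src
        if op == "mov":                  # execute
            regs[dst] = value
        elif op == "add":
            regs[dst] = (regs[dst] + value) & 0xFFFFFFFF
        elif op == "sub":
            regs[dst] = (regs[dst] - value) & 0xFFFFFFFF
        else:
            raise ValueError(f"unknown operation: {op}")
    return regs

program = [
    ("mov", "eax", 5),
    ("mov", "ebx", "eax"),
    ("add", "ebx", 10),
    ("sub", "eax", 5),
]
print(run(program))   # {'eax': 0, 'ebx': 15}
```

Results are masked to 32 bits to mimic the width of a register such as eax.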
Definitions
● Instruction: An atomic action that the CPU can perform, e.g., add
two numbers, move some data around, etc.
○ By combining instructions we can do more complicated things.
● Register: Variable-like storage inside the CPU; a small amount of local
memory that is very fast
● Memory/RAM: Where the data and instructions are stored.
Instruction types
● Data handling and memory operations
● Arithmetic and logic operations: Add, subtract, multiply, AND, XOR
● Control flow operations: Jump, branching (go to another location in the
program and execute instructions there)
● Co-processor instructions
Instruction anatomy x86
● mnemonic op1, op2, op3
● mov eax, 4 ;
○ Mnemonic : mov
■ is the operation (i.e., the action); here, move
■ Often called ‘instruction’
○ Op1: eax
■ is the destination
○ Op2: 4
■ is the source
● Result is:
○ eax = 4
Instruction anatomy x86

● An assembler takes assembly instructions and converts them to binary machine code (which is then linked into an executable).


● Conversely, a disassembler converts the binary to assembly language.
Register Access in x86

● 32 bit: mov eax,ecx - Affects whole register eax ← ecx


● 16 bit: mov ax,cx - Only affects low 16 bits, rest unchanged ax ← cx
○ Only the lower 16 bits of eax will be modified (i.e., the AL and AH bytes). The upper 16 bits of eax will remain
unchanged.
● 8 bit: mov al,cl - Only affects low 8 bits, rest unchanged al ← cl
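The effect of the three access widths can be modelled with bit masks. The helper names below are hypothetical; only the masking reflects the behaviour described above.

```python
# Model of how writes to eax, ax, and al affect the 32-bit register.
# The helper names are hypothetical; only the masking mirrors the slide.

def write_eax(eax, value):
    return value & 0xFFFFFFFF                      # whole register replaced

def write_ax(eax, value):
    return (eax & 0xFFFF0000) | (value & 0xFFFF)   # upper 16 bits kept

def write_al(eax, value):
    return (eax & 0xFFFFFF00) | (value & 0xFF)     # upper 24 bits kept

eax = 0x11223344
ecx = 0xAABBCCDD
print(hex(write_eax(eax, ecx)))   # mov eax, ecx -> 0xaabbccdd
print(hex(write_ax(eax, ecx)))    # mov ax, cx   -> 0x1122ccdd
print(hex(write_al(eax, ecx)))    # mov al, cl   -> 0x112233dd
```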
Putting Instructions Together (Intel x86) = Programs

● mov EAX, 5 ; puts 5 into EAX


● mov EBX, EAX ; puts 5 into ebx
● add EBX, 10 ; adds 10 to ebx
● sub EAX, 5 ; subtracts 5 from eax
● xor EAX, EAX ; XORs eax with itself, which sets eax to 0

Text after a semicolon (;) is a comment.
Program Structure in Assembly* (Review)

*NASM is used here to build x86 programs


https://www.nasm.us/
Structure of a program
● DATA segment (.data)
○ Variables, strings, etc

● CODE segment (.text)


○ Assembly instructions
○ Often called the ‘text’ segment confusingly

● ENTRY POINT (Can be found in PE/ELF header)


○ An address inside the code section where the program starts. On Linux, the _start symbol –
analogous to main() in C
Anatomy of an assembly (asm) program (NASM)
Writing Assembly: Defining data (NASM)
Making Function Calls (NASM)

Function return values are typically stored in the EAX register for integer and pointer values. Therefore, to use a returned
value you must read it from EAX (see the last lab question).
Example Windows Library call – Displaying a Message Box (Def. below)
Example Windows Library call – Displaying a Message Box (Code and Running below)
Example C Library call (MSVCRT.dll) – printf() (Code and Running below)
Assembly II
x86 Instructions (Reference)

Specific instructions are discussed in later slides


Mov - Move
mov <reg>,<reg>

mov <reg>,<mem>

mov <mem>,<reg>

mov <reg>,<const>

mov <mem>,<const>

e.g., mov eax, ebx — copy the value in ebx into eax
Add / Sub - Add / Subtract
add <reg>,<reg> sub <reg>,<reg>

add <reg>,<mem> sub <reg>,<mem>

add <mem>,<reg> sub <mem>,<reg>

add <reg>,<con> sub <reg>,<con>

add <mem>,<con> sub <mem>,<con>

add eax, 10 — EAX ← EAX + 10 sub eax, 10 — EAX ← EAX - 10


Inc /dec - Increment / Decrement
The inc instruction increments the contents of its operand by one. The dec
instruction decrements the contents of its operand by one.

inc <reg>

inc <mem>

dec <reg>

dec <mem>

dec eax — subtract one from the contents of EAX.


imul - Integer Multiplication

imul <reg32>,<reg32>

imul <reg32>,<mem>

imul <reg32>,<reg32>,<con>

imul <reg32>,<mem>,<con>

imul eax, [var] — multiply the contents of EAX by the 32-bit contents of the memory
location var. Store the result in EAX.

imul esi, edi, 25 ; ESI ← EDI * 25
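A rough model of the 32-bit forms of imul, assuming the result is simply the low 32 bits of the signed product (the helper names are made up for this sketch):

```python
# Sketch of 32-bit imul: signed multiply, result truncated to 32 bits.
# Function names are hypothetical.

def to_signed32(x):
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def imul32(a, b):
    return to_signed32(to_signed32(a) * to_signed32(b))

edi = 4
esi = imul32(edi, 25)          # imul esi, edi, 25
print(esi)                     # 100
print(imul32(-3, 7))           # -21
print(imul32(0x40000000, 4))   # 0 (the product 2**32 does not fit in 32 bits)
```

The last call shows why overflow matters: the upper bits of the product are lost.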


and, or, xor - Bitwise logical and, or and exclusive or

These instructions perform the specified logical operation (logical bitwise and, or, and
exclusive or, respectively) on their operands, placing the result in the first operand location.

and <reg>,<reg> or <reg>,<reg> xor <reg>,<reg>

and <reg>,<mem> or <reg>,<mem> xor <reg>,<mem>

and <mem>,<reg> or <mem>,<reg> xor <mem>,<reg>

and <reg>,<con> or <reg>,<con> xor <reg>,<con>

and <mem>,<con> or <mem>,<con> xor <mem>,<con>

and eax, 0fH — clear all but the last 4 bits of EAX.
xor edx, edx — set the contents of EDX to zero (clear all).
not — Bitwise Logical Not

Logically negates the operand contents (that is, flips all bit values in the operand).

not <reg>

not <mem>

not BYTE PTR [var] — negate all bits in the byte at the memory location var.
neg - Negate

Performs the two's complement negation of the operand contents.

neg <reg>

neg <mem>

Example

neg eax ; EAX ← -EAX
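The difference between not and neg is easy to confuse, so here is a small model on 32-bit values. The function names are illustrative only.

```python
# not flips every bit; neg is two's complement negation (not, then add 1).
MASK32 = 0xFFFFFFFF

def not32(x):
    return ~x & MASK32

def neg32(x):
    return (-x) & MASK32

def to_signed32(x):
    return x - (1 << 32) if x & 0x80000000 else x

eax = 5
print(hex(not32(eax)))            # 0xfffffffa
print(hex(neg32(eax)))            # 0xfffffffb
print(to_signed32(neg32(eax)))    # -5
```

Note that neg32(x) always equals not32(x) + 1 (modulo 2^32).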


shl, shr - Shift Left, Shift Right

These instructions shift the bits in their first operand's contents left and right, padding the resulting empty bit positions with zeros. The
operand can be shifted by up to 31 places. The number of bits to shift is specified by the second operand, which can be either an 8-bit
constant or the register CL. In either case, shift counts greater than 31 are performed modulo 32.

shl <reg>,<con8>

shl <mem>,<con8>

shl <reg>,<cl>

shl <mem>,<cl>

shr <reg>,<con8>

shr <mem>,<con8>

shr <reg>,<cl>

shr <mem>,<cl>

shl eax, 1 — Multiply the value of EAX by 2 (if the most significant bit is 0)

shr ebx, cl ; Store in EBX the floor of the result of dividing the value of EBX by 2^n, where n is the value in CL.
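A minimal model of the two shifts on 32-bit operands, including the modulo 32 treatment of the count. The function names are assumptions for this sketch.

```python
# Logical shifts on a 32-bit value; the count is reduced modulo 32.
MASK32 = 0xFFFFFFFF

def shl32(x, count):
    count &= 31                      # only the low 5 bits of the count are used
    return (x << count) & MASK32     # bits shifted past bit 31 are lost

def shr32(x, count):
    count &= 31
    return (x & MASK32) >> count     # zeros are shifted in from the left

print(shl32(3, 1))                   # 6   (multiplied by 2)
print(hex(shl32(0x80000001, 1)))     # 0x2 (top bit lost, so not a clean doubling)
print(shr32(20, 3))                  # 2   (floor of 20 / 8)
print(shr32(8, 33))                  # 4   (33 mod 32 = 1)
```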
jmp - Jump

Transfers program control flow to the instruction at the memory location indicated by the operand.

jmp <label>

jmp loop - Jump to the instruction labeled loop.


jcondition — Conditional Jump

je <loc> - Jump if Equal ; ZF = 1
jne <loc> - Jump if Not Equal ; ZF = 0
jg <loc> - Jump if Greater (signed comparison)
jge <loc> - Jump if Greater or Equal (signed comparison)
ja <loc> - Jump if Above (unsigned comparison)
jae <loc> - Jump if Above or Equal (unsigned comparison)
jl <loc> - Jump if Less (signed comparison)
jle <loc> - Jump if Less or Equal (signed comparison)
jb <loc> - Jump if Below (unsigned comparison)
jbe <loc> - Jump if Below or Equal (unsigned comparison)
jz <loc> - Jump if Zero ; ZF = 1
jnz <loc> - Jump if Not Zero ; ZF = 0
cmp - Compare

Compares the values of the two specified operands, setting the condition codes in the machine
status word (EFLAGS) appropriately. This instruction performs the same subtraction as the sub
instruction, but the result is discarded, so neither operand is changed.
cmp <reg>,<reg>
cmp <reg>,<mem>
cmp <mem>,<reg>
cmp <reg>,<con>
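To show how cmp feeds the conditional jumps, here is a sketch that computes the flags for a 32-bit subtraction and evaluates two jump conditions. It also uses the overflow flag (OF), which the signed jumps depend on. The function names are hypothetical.

```python
# Flags produced by cmp a, b (computed as a - b, result discarded).
MASK32 = 0xFFFFFFFF
SIGN = 0x80000000

def cmp_flags(a, b):
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "ZF": result == 0,
        "CF": a < b,                                  # unsigned borrow
        "SF": bool(result & SIGN),
        "OF": bool((a ^ b) & (a ^ result) & SIGN),    # signed overflow
    }

def ja_taken(f):   # unsigned "above"
    return not f["CF"] and not f["ZF"]

def jg_taken(f):   # signed "greater"
    return not f["ZF"] and f["SF"] == f["OF"]

f = cmp_flags(0xFFFFFFFF, 1)          # 4294967295 unsigned, -1 signed
print(ja_taken(f), jg_taken(f))       # True False
print(cmp_flags(7, 7)["ZF"])          # True
```

The same bit pattern gives opposite answers for ja and jg, which is why the signed or unsigned choice of jump matters when reading disassembly.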
Status Register, Memory addressing, Conditional statements and Looping in Assembly
General Purpose Register Sizes
Values Range
Assembly Sizes
1 Bit = 0/1

Byte = 8 bits (AH,AL,BH,BL,CH,CL,DH,DL)

Word = 2 bytes (size of AX, BX, CX, ...)

Dword = 4 bytes (size of EAX, EBX, ECX, ...)

Qword = 8 bytes
x86 CPU Registers and Status Reg.

+ EFLAGS

EFLAGS is the “flags register”, a 32-bit status register that records the outcome (status) of
operations
Flags Register: EFLAGS
● EFLAGS register is a status register. 32 bits in size, and each bit is a flag.

● During execution, each flag is either set (1) or cleared (0) to control CPU operations or
indicate the results of a CPU operation.

The following flags are most important to malware analysis:


● ZF The zero flag is set when the result of an operation is equal to zero; otherwise, it is cleared.

● CF The carry flag is set when an unsigned operation produces a carry out of, or a borrow into, the
most significant bit (i.e., the result does not fit in the destination operand); otherwise, it is cleared.

● SF The sign flag is set when the result of an operation is negative and cleared when the result is
positive or zero. It is a copy of the most significant bit of the result.

● TF The trap flag is used for debugging. The x86 processor will execute only one instruction at a
time if this flag is set.
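Debuggers usually show EFLAGS as a single number. The sketch below decodes the four flags listed above from such a value, using their bit positions in the register (CF is bit 0, ZF bit 6, SF bit 7, TF bit 8). The sample value 0x246 is only an example input.

```python
# Decode selected flags from a raw EFLAGS value.
FLAG_BITS = {"CF": 0, "ZF": 6, "SF": 7, "TF": 8}

def decode_eflags(value):
    return {name: bool((value >> bit) & 1) for name, bit in FLAG_BITS.items()}

print(decode_eflags(0x246))
# {'CF': False, 'ZF': True, 'SF': False, 'TF': False}
```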
Flags Register: EFLAGS
Memory addressing
RAM: … 21 1F 00 00 …

Memory address = 0x402000

• Anything inside [ ] in assembly means the contents of the memory address specified, i.e., go to RAM and get the
value at that memory address (like indexing an array)
• Note that characters are stored as ASCII codes, i.e., numbers

• mov ebx, 0x402000 ; mov hex value into ebx


• mov eax, [ebx] ; get value at memory address in ebx, and store in eax
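The two instructions above can be modelled as follows. The sketch assumes the four bytes 21 1F 00 00 sit at address 0x402000 and relies on x86 being little-endian (the lowest-addressed byte is the least significant); the `ram` buffer and `read_dword` helper are hypothetical.

```python
import struct

# Hypothetical RAM contents: four bytes placed at address 0x402000.
BASE = 0x402000
ram = bytes([0x21, 0x1F, 0x00, 0x00])

def read_dword(address):
    """Read a 32-bit little-endian value, as mov eax, [address] would."""
    return struct.unpack_from("<I", ram, address - BASE)[0]

ebx = 0x402000            # mov ebx, 0x402000
eax = read_dword(ebx)     # mov eax, [ebx]
print(hex(eax))           # 0x1f21
```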
x86 Move Instruction: MOV <destination> , <source>
MOV <destination> , <source>

Can also do the following:

Hex constant

Variable - if a string it will be the address of the first character only!


Example Modify Letters (loops covered in other slides, just follow through example)

Steps annotated on the code:
● Set eax to the address of the first character
● Check if at the end of the string, i.e., 0
● Jump to loopstop if ZF=true (result of CMP)
● Increment the letter by one
● Increment the address in eax to the next character
● Jump to the loop label
● Print out the modified string
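The annotated steps can be followed in a small model. The original assembly listing is not reproduced in the text, so the instruction names in the comments are a plausible guess rather than the code from the slide; only the label loopstop comes from the annotations.

```python
# Model of the "modify letters" loop over a NUL-terminated string.
# Comments give a guessed assembly equivalent for each step.

def increment_letters(buffer):
    i = 0                                   # eax = address of first character
    while buffer[i] != 0:                   # cmp byte [eax], 0 / je loopstop
        buffer[i] = (buffer[i] + 1) & 0xFF  # increment the letter by one
        i += 1                              # inc eax: next character
    return buffer                           # loopstop: reached the terminator

msg = bytearray(b"HAL\x00")
increment_letters(msg)
print(msg[:-1].decode("ascii"))             # IBM
```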
Example Length of String (loops covered in other slides, just follow through example)

jz and je are identical jump instructions, test if ZF=1
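Counting the length of a string follows the same pattern: walk forward until the terminating 0 byte is found. A minimal sketch, with a hypothetical `string_length` helper and made-up memory contents:

```python
# Count bytes until the NUL terminator, like a hand-written strlen loop.

def string_length(memory, start):
    length = 0
    while memory[start + length] != 0:   # compare with 0, stop when ZF=1
        length += 1
    return length

print(string_length(b"malware\x00junk", 0))   # 7
```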


Branching (Selection)
Branching via jmp
• jmp : go to another location/label in the code now.
– Unconditional
Code:
jmp exit
------------
-------------
exit:
<instructions for exit>
Conditional branching via jcond
Code:
cmp eax, 18
jbe too_young ; eax <= 18 ? Yes: jump to too_young. No: continue below
<do age restricted stuff> ; Can vote.

too_young:
<print msg and exit> ; Cannot vote.

Common conditional jumps:
jbe Jump if below/equal
jae Jump if above/equal
je Jump if equal (ZF=1)
jz Jump if zero (ZF=1)
jne Jump if not equal (ZF=0)
jnz Jump if not zero (ZF=0)
Lots more
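The age check can be modelled to show one detail that is easy to miss: jbe compares the operands as unsigned numbers, whereas jle would compare them as signed. All names below are illustrative.

```python
# cmp eax, 18 followed by jbe (unsigned) versus jle (signed).
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def jbe_taken(a, b):
    return (a & MASK32) <= (b & MASK32)

def jle_taken(a, b):
    return to_signed32(a) <= to_signed32(b)

def check_age(eax):
    return "Cannot vote." if jbe_taken(eax, 18) else "Can vote."

print(check_age(17))    # Cannot vote.
print(check_age(30))    # Can vote.
print(jbe_taken(0xFFFFFFFF, 18), jle_taken(0xFFFFFFFF, 18))   # False True
```

With eax = 0xFFFFFFFF, the unsigned jump treats the value as very large, while a signed jump would treat it as -1.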
Example Modifying Letters
Loops
Loops via decrementing
● Define a loop counter (e.g., ebx) and initialise it
● Define loop label
● Add loop’s functionality
● Decrement inside of the loop
● Compare to 0 and jump if not zero (not equal)
● Loop until it is zero

jnz and jne are identical jump instructions; both test if ZF=0
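A model of the decrementing loop, with a guessed assembly equivalent in the comments (the label name loop_start is an assumption). The body runs before the counter is tested, so this behaves like a do-while loop.

```python
# Count-down loop: the body runs, the counter is decremented, and the loop
# repeats while the counter is not zero (jnz).

def countdown_loop(count):
    ebx = count                          # mov ebx, count
    iterations = 0
    while True:                          # loop_start:
        iterations += 1                  #   <loop's functionality>
        ebx = (ebx - 1) & 0xFFFFFFFF     #   dec ebx (sets ZF when ebx becomes 0)
        if ebx == 0:                     #   jnz loop_start not taken when ZF=1
            break
    return iterations

print(countdown_loop(5))   # 5
```

Because the test comes after the decrement, starting with a counter of 0 would wrap around to 0xFFFFFFFF and run 2^32 times, so the counter should be initialised to at least 1.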
Loops via incrementing
● Define a loop counter (e.g., ebx) and initialise it
● Define loop label
● Add loop’s functionality
● Increment inside of the loop
● Compare to the loop limit and jump if not equal
● Loop until the counter reaches the limit

jnz and jne are identical jump instructions; both test if ZF=0
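For the incrementing variant, the counter is compared against an upper limit with cmp rather than against zero. A sketch with assumed names (loop_start, limit):

```python
# Count-up loop: increment the counter, compare it with the limit, and
# repeat while they are not equal (jne).

def count_up_loop(limit):
    ebx = 0                    # mov ebx, 0
    iterations = 0
    while True:                # loop_start:
        iterations += 1        #   <loop's functionality>
        ebx += 1               #   inc ebx
        if ebx == limit:       #   cmp ebx, limit / jne loop_start
            break
    return iterations

print(count_up_loop(3))   # 3
```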
Functions (CALL & RET)
IDA - Interactive Disassembler
Opening an exe or dll in IDA will disassemble the file
Correspondence to the original should be clear, besides code the assembler adds (i.e., stack)
Click the function name to access its code; the graph mode shows the program flow
Can rename the function to make reverse engineering easier
Can switch to the .text view instead of graph view; right click in the code to return
Can inspect the strings also
Can inspect the imports also
Can find cross references in code and jump to them (double click the string in the strings tab first)
Jumps to the selected reference
Works also as a Debugger

• Disassembles the program for us


• Runs the program
• Can stop at any point
• Can step through instruction by instruction
• Can inspect any part of program’s memory
• Can modify the program in memory so it does something different

• Sophisticated anti-debugging techniques can cause it to do something we don’t realise
• Will cover in later lectures
Set breakpoint at printf
Run to that point (printf has not run yet)
Continue and printf is run
