Security Risks of Modifying Executable Files
1. Executable File Structure
We examined the structure of executable files in an earlier topic.
Just as document files follow a defined format, executable files have a structured format of their own. Of their components, the machine code in the .text section is what gets loaded into memory and executed.
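To make the idea of a "defined format" concrete, here is a minimal sketch that reads the entry point address from a file header. It assumes the executable is a 64-bit little-endian ELF file, which the Linux-style objdump output in this article suggests but does not state. The function name `read_elf64_entry` and the hand-built header are illustrative only.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def read_elf64_entry(data: bytes) -> int:
    """Return the entry point address stored in an ELF64 header."""
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")
    # e_entry is an 8-byte field at offset 24 in the ELF64 header.
    (entry,) = struct.unpack_from("<Q", data, 24)
    return entry


# Hand-built 32-byte header fragment (assumption: not taken from a real file).
fake_header = (
    ELF_MAGIC
    + bytes([2, 1, 1, 0])               # 64-bit, little-endian, version 1, ABI 0
    + bytes(8)                          # padding up to 16 bytes of e_ident
    + struct.pack("<HHI", 2, 0x3E, 1)   # e_type, e_machine (x86-64), e_version
    + struct.pack("<Q", 0x401000)       # e_entry
)
print(hex(read_elf64_entry(fake_header)))
```

On a real file, the same function could be called on `open("add", "rb").read()`.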
2. Modifying an Executable File
- Because the CPU executes whatever machine code is stored in the .text section, modifying those bytes directly changes the program's behavior.
- In other words, the execution result can be altered without touching the program's source code.
2.1. bin to hex
Use the xxd command to convert an executable file into a text-based hex file. The listing below first disassembles and runs the original program, then creates the dump.
$ objdump -d -M intel add
Disassembly of section .text:
0000000000401000 <_start>:
401000: b0 02 mov al,0x2 ; Load the value 2 into the AL register
401002: 04 03 add al,0x3 ; Add the immediate value 3 to AL (AL = 2 + 3)
$ ./add
result: 5
$ xxd add > add.hex
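As a rough illustration of what one row of the hex file looks like, the sketch below formats 16 bytes in xxd's default layout: an 8-digit offset, bytes grouped in pairs, and an ASCII column. The function name `hexdump_line` is hypothetical, and the zero bytes after the two instructions are placeholders, not the real contents of `add`.

```python
def hexdump_line(chunk: bytes, offset: int) -> str:
    """Format up to 16 bytes the way one row of default xxd output looks."""
    chunk = chunk[:16]
    # Group the bytes two at a time, e.g. b0 02 -> "b002".
    pairs = [chunk[i:i + 2].hex() for i in range(0, len(chunk), 2)]
    hex_part = " ".join(pairs)
    # Non-printable bytes are shown as "." in the ASCII column.
    text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    # A full row has 8 groups of 4 hex digits plus 7 spaces = 39 characters.
    return f"{offset:08x}: {hex_part:<39}  {text}"


# The four instruction bytes from the disassembly, padded with placeholder zeros.
row = bytes([0xB0, 0x02, 0x04, 0x03]) + bytes(12)
print(hexdump_line(row, 0x1000))
# 00001000: b002 0403 0000 0000 0000 0000 0000 0000  ................
```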
2.2. edit hex
Open the generated "add.hex" file in a text editor, change the value "03" at offset 0x00001003 to "01", and save it.
- The offset is the position of the byte to be modified, counted from the start of the executable file. With xxd's default of 16 bytes per line, offset 0x00001003 is the fourth byte on the line that begins with 00001000:.
- The value 03 at that position is the immediate operand of the `add al,0x3` instruction. Changing it to `01` makes the program compute 2 + 1 instead of 2 + 3.
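The same edit can be sketched programmatically. The example below assumes that the code segment at virtual address 0x401000 is stored at file offset 0x1000. That mapping matches the offset used in this article but is not stated in it. The helper names `file_offset` and `patch_byte` and the in-memory stand-in image are illustrative only.

```python
def file_offset(vaddr: int, segment_vaddr: int, segment_offset: int) -> int:
    """Translate a virtual address into a position within the file."""
    return vaddr - segment_vaddr + segment_offset


def patch_byte(image: bytes, offset: int, expected: int, new: int) -> bytes:
    """Return a copy of image with one byte replaced, after a sanity check."""
    if image[offset] != expected:
        raise ValueError(
            f"byte at {offset:#x} is {image[offset]:#04x}, expected {expected:#04x}"
        )
    patched = bytearray(image)
    patched[offset] = new
    return bytes(patched)


# The operand of "add al,0x3" is the second byte of the instruction at 0x401002.
offset = file_offset(0x401003, segment_vaddr=0x401000, segment_offset=0x1000)

# Stand-in image: 0x1000 placeholder bytes followed by the two instructions.
image = bytes(0x1000) + bytes([0xB0, 0x02, 0x04, 0x03])
patched = patch_byte(image, offset, expected=0x03, new=0x01)
print(hex(offset), patched[0x1000:0x1004].hex())
```

Checking the expected old value first guards against patching the wrong offset, which would silently corrupt a different instruction.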
2.3. hex to bin
Use the xxd command with the -r (reverse) option to convert the text-based hex file back into a binary file.
$ xxd -r add.hex > add
$ chmod +x add
$ objdump -d -M intel add
Disassembly of section .text:
0000000000401000 <_start>:
401000: b0 02 mov al,0x2 ; Load the value 2 into the AL register
401002: 04 01 add al,0x1 ; Add the immediate value 1 to AL (AL = 2 + 1)
$ ./add
result: 3
∴ The output has changed from "result: 5" to "result: 3", even though the source code was never touched.
3. Preventing Executable File Modification
- As shown above, every part of an executable file, including both code and data sections, can be modified.
- This has the advantage that program behavior can be changed without access to the source code.
- However, it also creates a security risk: anyone who can write to the file can alter the executable arbitrarily.
- So how can these risks be prevented?
- To mitigate them, modern operating systems employ several protection techniques and security mechanisms.
3.1. Hash-Based Integrity Verification
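- Hash-based integrity verification applies a hash function to the contents of an executable file to compute a fixed-length value.
- If the hash computed at verification time (for example, just before the program is run) differs from the known-good value, the file has been modified. The known-good value must itself be stored where an attacker cannot change it.
- Common hash algorithms include SHA-256, SHA-1, and MD5. MD5 and SHA-1 are no longer collision-resistant, so SHA-256 or stronger is preferable where deliberate tampering is a concern.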
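A minimal sketch of this check using Python's standard `hashlib` module follows. The byte strings stand in for the contents of the original and patched `add` binary, and the function names are illustrative.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, expected_hex: str) -> bool:
    """Compare the current digest against a previously recorded one."""
    return hmac.compare_digest(sha256_hex(data), expected_hex)


# Stand-ins for the file contents before and after the one-byte patch.
original = bytes([0xB0, 0x02, 0x04, 0x03])
modified = bytes([0xB0, 0x02, 0x04, 0x01])

recorded = sha256_hex(original)           # stored while the file is trusted
print(is_unmodified(original, recorded))  # True
print(is_unmodified(modified, recorded))  # False
```

Even a single changed byte produces a completely different digest, which is what makes the comparison useful.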
3.2. Digital Signature and Code Signing
- Code signing attaches a digital signature, created with the publisher's private key using public-key cryptography, to the executable file.
- The operating system verifies the signature before execution to confirm that the file comes from a trusted publisher and has not been modified since it was signed.
- Representative examples include Authenticode on Windows and Code Signing on macOS.
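The sign-then-verify idea can be illustrated with a toy example. The sketch below uses textbook RSA with two small, well-known Mersenne primes and no padding. It is far too weak for real use and is not how Authenticode or macOS code signing are implemented. All names and parameters are assumptions chosen for illustration, and `pow(E, -1, ...)` requires Python 3.8 or later.

```python
import hashlib

# Toy key material (assumption: illustration only, trivially breakable).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent


def digest_as_int(data: bytes) -> int:
    """Hash the data and reduce the digest so it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Publisher side: transform the digest with the private exponent."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Verifier side: undo the signature with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(data)


original = bytes([0xB0, 0x02, 0x04, 0x03])
modified = bytes([0xB0, 0x02, 0x04, 0x01])
signature = sign(original)
print(verify(original, signature), verify(modified, signature))
```

Unlike a bare hash, the signature cannot be recomputed by someone who modifies the file, because producing it requires the private exponent.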
3.3. File Permissions and Access Control
- The operating system manages Read, Write, and Execute permissions separately at the file system level.
- This restricts modification of executable files to specific users or processes, preventing unauthorized changes.
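On a POSIX system, removing the write permission from a file can be sketched as follows. The helper names are illustrative, and the demo runs on a temporary file rather than a real executable.

```python
import os
import stat
import tempfile


def remove_write_permission(path: str) -> int:
    """Clear the write bits for owner, group, and others; return the new mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    new_mode = mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
    os.chmod(path, new_mode)
    return new_mode


def demo_on_temp_file() -> int:
    """Apply the change to a throwaway file and report the resulting mode."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.chmod(path, 0o755)          # rwxr-xr-x, typical for an executable
        remove_write_permission(path)
        return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.remove(path)


print(oct(demo_on_temp_file()))
```

Permission bits are only a first line of defense: the file's owner can restore the write bit, and the superuser is not bound by it.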
3.4. Memory Protection Techniques
- W^X (Write XOR Execute)
  - A memory page may be writable or executable, but not both at once, so code that is being executed cannot be overwritten in place.
  - This technique is mainly designed to defend against memory-corruption attacks such as buffer overflows.
- DEP (Data Execution Prevention)
  - Blocks code execution in data regions such as the stack and heap.
- ASLR (Address Space Layout Randomization)
  - Randomizes the memory addresses of code and data each time the program runs, so an attacker cannot predict them.
These memory protection techniques protect the memory of a running program; they do not prevent the executable file on disk from being modified.
| Term | Method | Role |
|---|---|---|
| DEP | Execution is prohibited in data regions. | Restricts execution areas. |
| W^X | Prevents modification of executing code. | Blocks code modification. |
| ASLR | Randomizes memory locations at each execution. | Prevents address prediction. |
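On Linux, the permissions of each memory region of a process are listed in /proc/<pid>/maps. The sketch below scans text in that format for regions that are both writable and executable, which is exactly what W^X forbids. The function name and the sample text are invented for illustration and are not captured from a real process.

```python
def find_writable_executable(maps_text: str) -> list:
    """Return the address ranges whose permission field has both w and x."""
    violations = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        address_range, perms = fields[0], fields[1]
        if "w" in perms and "x" in perms:
            violations.append(address_range)
    return violations


# Made-up sample in the /proc/<pid>/maps layout (assumption: illustration only).
sample_maps = """\
00400000-00401000 r--p 00000000 08:01 1234 /tmp/add
00401000-00402000 r-xp 00001000 08:01 1234 /tmp/add
7f000000-7f001000 rwxp 00000000 00:00 0
7ffd1000-7ffd3000 rw-p 00000000 00:00 0 [stack]
"""
print(find_writable_executable(sample_maps))
# ['7f000000-7f001000']
```

In the sample, the code region is r-x and the stack is rw-, which is consistent with both W^X and DEP. Only the anonymous rwx region is flagged. On a Linux machine, the function could be applied to the contents of /proc/self/maps.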