What is a buffer overflow?

A buffer overflow is a security vulnerability in which an attacker supplies more characters than a field was designed to hold, so that the excess spills into adjacent memory, usually a memory address. In some cases these excess characters can be run as executable code, which lets the attacker bypass security measures and take control of the attacked computer. It is one of the most common means of attack: worms that spread quickly and on a large scale through high-risk operating system vulnerabilities rely on this technique. In theory, a buffer overflow attack can be used against any flawed program, including antivirus software, firewalls and other security products, as well as programs used by banks.

On Unix systems, gaining root privileges through a buffer overflow is a very common hacking technique. In fact, it is the preferred attack for a hacker who already has a basic local account on the system. It is also widely used in remote attacks: there are already many examples of obtaining a remote root shell by overflowing the stack of a daemon process.

Windows systems have the same buffer overflow problem. As Internet service programs have become widespread, more and more poorly written programs on the Windows platform have become fatal weaknesses for the system, because they are just as exposed to remote stack overflows. Moreover, because Windows users and administrators generally lack security awareness, a stack overflow on a Windows system, if exploited maliciously, can put the entire machine under a hacker's control, which may in turn let the entire local area network fall into the hacker's hands. Microsoft's popular IIS 4.0 server was found to have a flaw known as the “illegal HTR request”. According to Microsoft, the flaw allows arbitrary code to run on the server under certain circumstances. But in the words of Firas Bushnaq, CEO of eEye, the Internet security company that found the vulnerability, this is only the tip of the iceberg. Bushnaq said that hackers could exploit the flaw to take complete control of an IIS server, and many e-commerce sites are built on exactly this system.

Let us look at how a buffer overflow works. As everyone knows, the C language does not check array bounds, and many C applications simply assume that a buffer is large enough, that its capacity is certainly greater than the length of the string copied into it. That is not always the case. When the program has a bug, or a malicious user deliberately supplies an overly long string, unexpected things happen: the characters beyond the buffer's capacity overwrite the space of neighboring variables, leaving those variables with unpredictable values. If the array happens to sit near the subroutine's return address, the excess part of the string may overwrite that return address, so that when the subroutine finishes it returns to some unpredictable address and the program's execution flow goes wrong. The process may even fail outright because it accesses an address outside its own address space. This is a mistake that is made very often in programming.
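As a minimal sketch of the check that such programs leave out, the helper below refuses to copy a string that does not fit in the destination. The name copy_checked and its return convention are assumptions made for this illustration; they do not come from any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, including the
 * terminating NUL. Returns 0 on success and -1 if src is too long. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len >= dst_size) {
        return -1;              /* would overflow: reject instead of copying */
    }
    memcpy(dst, src, len + 1);  /* len + 1 also copies the NUL terminator */
    return 0;
}
```

With a 10-byte buffer, a 9-character string is accepted and a 10-character string is rejected, because the terminating NUL needs one byte as well.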

A program that uses a buffer overflow to try to damage or break into a system usually consists of the following parts:

  • Prepare a string of machine code that spawns a shell; below we will call it shellcode.
  • Allocate a buffer and place the machine code in its lower end.
  • Estimate where the machine code will start on the stack and write that address into the upper end of the buffer. This start position is also a parameter that has to be adjusted over repeated runs of the program.
  • Pass the buffer as input to the program that has the buffer overflow bug, and run that program.

The analysis above shows that buffer overflows are a huge threat to system security. On Unix systems, a carefully written exploit aimed at a suid program that contains this kind of bug can easily obtain superuser privileges. When a service program listens on a port, a buffer overflow exploit can easily shut the service down, paralyzing it for some time, and in serious cases can bring the whole system down immediately, which makes it a denial-of-service attack. This kind of bug is not only made by application programmers; it occurs even more often in the implementation of the system itself. Today buffer overflow bugs are continually being found in Unix, Windows, routers, gateways and other network devices, and they make up the largest and most serious category of security threats to systems.

Buffer overflows are vulnerabilities inherent in the code. Apart from taking care to write correct code during development, the general preventive measures available to users are:

  • Close unneeded ports or services. Administrators should know what is installed on their systems and which services are running.
  • Install the software vendor's patches. When a vulnerability is published, major vendors provide patches promptly.
  • Filter specific traffic at the firewall, although this cannot stop overflow attacks launched by internal staff.
  • Check your own key service programs for serious vulnerabilities.
  • Run software with the minimum permissions required.

Buffer overflow attack

This part covers the concept of a buffer overflow and how buffer overflow vulnerabilities are found and exploited.

Of all buffer overflows, stack overflows are usually the easiest to exploit. Before looking at stack overflows, first understand the following concepts:

  • Buffer: In short, a contiguous area of computer memory that can hold multiple instances of the same data type.
  • Stack: A stack is an abstract data type that is often used in computer science. Objects on a stack have one property: the last object placed on the stack is always the first to be taken off. This property is usually called last-in, first-out (LIFO). The stack defines several operations, of which the two most important are PUSH and POP. PUSH adds an element to the top of the stack. POP does the opposite: it removes the element at the top of the stack and reduces the size of the stack by one.
  • Registers ESP, EBP and EIP:
  1. The ESP register of the CPU holds the stack-top pointer of the current thread.
  2. The EBP register holds the stack-base pointer, which marks the bottom of the current stack frame.
  3. The EIP register of the CPU holds the memory address of the next instruction. When the CPU has executed the current instruction, it reads the address of the next instruction from EIP and continues execution there.
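The PUSH and POP operations described above can be sketched as a small array-based stack. This is only an illustration of the abstract data type, not of how the CPU stack is laid out; IntStack, STACK_CAPACITY and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 16   /* arbitrary size chosen for this sketch */

/* Hypothetical fixed-size stack of ints; top counts the stored elements. */
typedef struct {
    int items[STACK_CAPACITY];
    size_t top;
} IntStack;

void stack_init(IntStack *s)
{
    s->top = 0;
}

/* PUSH: add a value on top; returns false when the stack is full. */
bool stack_push(IntStack *s, int value)
{
    if (s->top == STACK_CAPACITY) {
        return false;
    }
    s->items[s->top++] = value;
    return true;
}

/* POP: remove the top value into *out; returns false when the stack is empty. */
bool stack_pop(IntStack *s, int *out)
{
    if (s->top == 0) {
        return false;
    }
    *out = s->items[--s->top];
    return true;
}
```

Pushing 1, 2 and 3 and then popping three times yields 3, 2 and 1, which is the last-in, first-out order. Both operations check their bounds, which is exactly the check a buffer overflow bug lacks.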

Modern computers are designed with high-level languages in mind. The most important technique for structuring programs in a high-level language is the procedure, or function. Viewed one way, a procedure call changes a program's control flow just like a jump instruction, but unlike a jump, when the function finishes its work it returns control to the statement or instruction after the call. This high-level abstraction is implemented with the help of the stack. The stack is also used to dynamically allocate space for the local variables used in a function, to pass parameters to the function, and to return values from it.

The stack consists of logical stack frames. When a function is called, the logical stack frame is pushed onto the stack and the logical stack frame is popped off the stack when the function returns. The stack frame contains the parameters of the function, the local variables of the function, and the data required to restore the previous stack frame, including the value of the instruction pointer (IP) at the time of the function call.

The first thing a routine must do when it is called is save the previous frame pointer (FP) so that it can be restored when the routine exits. It then copies the stack pointer (SP) into FP to create the new FP, and advances SP to reserve space for the local variables. This work is called the routine's prolog. When the routine exits, the stack must be cleaned up again; this is called the routine's epilog. Intel's ENTER and LEAVE instructions and Motorola's LINK and UNLINK instructions are provided to make the prolog and epilog efficient.

Below we use a simple example to show what the stack looks like. example.c:

void function(int a, int b, int c) {
   char buffer1[5];
   char buffer2[10];
}

int main(void) {
   function(1, 2, 3);   /* the three arguments are pushed in reverse order */
   return 0;
}

To understand what the program does to call function(), we compile it with gcc's -S option to produce assembly code output:

$ gcc -S -o example.s example.c

Looking at the assembly language output, we see that the call to function() is translated to:

pushl $3
pushl $2
pushl $1
call function

This pushes the three arguments of the function onto the stack in reverse order, from last to first, and then calls function(). The call instruction also pushes the instruction pointer (IP) onto the stack. We will call this saved IP the return address (RET). The first thing done in the function is the routine's prolog:

pushl %ebp
movl %esp,%ebp
subl $20,%esp

The frame pointer EBP is pushed onto the stack, and the current SP is then copied into EBP, making it the new frame pointer. We will call the saved FP the SFP. Next, SP is decreased to reserve space for the local variables. We must remember that memory can only be addressed in multiples of the word size, where a word is 4 bytes (32 bits). A 5-byte buffer therefore takes up 8 bytes (2 words) of memory, and a 10-byte buffer takes up 12 bytes (3 words). That is why 20 is subtracted from SP. With this in mind, the stack at the time function() is called looks like this, from lower to higher addresses: buffer2 (12 bytes), buffer1 (8 bytes), SFP, RET, and then the arguments a, b and c.
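The word-rounding arithmetic can be checked with a short sketch. It assumes the 4-byte word used in the text; round_up_to_word and example_locals_size are hypothetical names, and a real compiler may reserve more space than this for alignment.

```c
#include <stddef.h>

#define WORD_SIZE 4u   /* 32-bit word, as assumed in the text */

/* Round a size in bytes up to the next multiple of the word size. */
size_t round_up_to_word(size_t bytes)
{
    return (bytes + WORD_SIZE - 1) / WORD_SIZE * WORD_SIZE;
}

/* Space reserved for buffer1[5] and buffer2[10] in the example. */
size_t example_locals_size(void)
{
    return round_up_to_word(5) + round_up_to_word(10);
}
```

Here round_up_to_word(5) is 8 and round_up_to_word(10) is 12, so example_locals_size() is 20, which matches the subl $20,%esp instruction in the prolog.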

Since buffer1 sits just below the SFP and RET on the stack, writing more data into buffer1 than it can hold overwrites the SFP and RET that follow it, which lets us change the function's return address. Let's look at an example.


How to write shellcode, how to place the dangerous code in memory ahead of time, and how to calculate addresses precisely so that the overflow runs that code and then returns to the original return address all involve a great deal of low-level assembly knowledge. I have only scratched the surface of it myself and am no real hacking master. Still, understanding how hackers work is well worth it for improving the security of our own code!

So, in this example, we assume that the so-called dangerous code is already in the source code, as the function bar. The function foo is an ordinary function, called from main, that performs a very unsafe strcpy. Because of the unsafe strcpy, we can pass in a string longer than buf, so that the copy overflows the buffer and replaces the return address RET with the address of bar, which causes bar to be called.

#include <stdio.h>
#include <string.h>
void foo(const char* input)
{
        char buf[10];   /* only 10 bytes */
        printf("My stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n%p\n\n");
        strcpy(buf, input);
        printf("buf = %s\n", buf);
        printf("Now the stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n%p\n\n");   /* no arguments on purpose: dumps stack contents */
}
void bar(void)
{
        printf("Augh! I've been hacked!\n");   /* never called by the program itself */
}
int main(int argc, char* argv[])
{
        printf("Address of foo = %p\n", foo);
        printf("Address of bar = %p\n", bar);
        if (argc != 2)
        {
                printf("Please supply a string as an argument!\n");
                return -1;
        }
        foo(argv[1]);
        return 0;
}

Compile the program with GCC, with its stack protection (the buffer overflow guard) turned off:

gcc -g -fno-stack-protector test.c -o test

To find the return address, I debug the compiled program with gdb.

(gdb) r
Starting program: /media/Personal/MyProject/C/StackOver/test abc
Address of foo = 0x80483d4           
Address of bar = 0x8048419         

Breakpoint 1, main (argc=2, argv=0xbfe5ab24) at test.c:24
24              foo(argv[1]);

(gdb) info registers ebp
ebp            0xbfe5aa88       0xbfe5aa88            
(gdb) n

Breakpoint 2, foo (input=0xbfe5c652 "abc") at test.c:4
4       {
(gdb) n
6               printf("My stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n%p\n\n");

(gdb) info registers ebp
ebp            0xbfe5aa68       0xbfe5aa68             

(gdb) x/ 0xbfe5aa68
0xbfe5aa68:     0xbfe5aa88                    
(gdb) n
My stack looks like:
7               strcpy(buf, input);

(gdb) x/i 0x8048499
0x8048499 <main+108>:   movl   $0x8048653,(%esp)                      
(gdb) disassemble main
Dump of assembler code for function main:
0x0804842d <main+0>:    lea    0x4(%esp),%ecx
0x08048431 <main+4>:    and    $0xfffffff0,%esp
0x08048434 <main+7>:    pushl -0x4(%ecx)
0x08048437 <main+10>:   push   %ebp

0x08048494 <main+103>: call   0x80483d4 <foo>
0x08048499 <main+108>: movl   $0x8048653,(%esp)                
0x080484a0 <main+115>: call   0x8048340 <puts@plt>

Therefore, as long as we pass in a string long enough to overwrite the return address 0x08048499 with the address of bar, 0x8048419, bar will be called. To feed raw bytes such as 0x8048419 to the program we need a Perl or Python script, such as the following Python script:

import os
# Written for Python 2, where a str is a raw byte string.
# 14 filler bytes, then the address of bar (0x08048419), low byte first.
arg = 'ABCDEFGHIJKLMN' + '\x19\x84\x04\x08'
cmd = './test ' + arg
os.system(cmd)

Note that the bytes of the address, 08 04 84 19, are written in reverse order, two hex digits (one byte) at a time, because x86 stores the least significant byte first. Run the script:

  $python hack.py
  Address of foo = 0x80483d4
  Address of bar = 0x8048419

The stack looks like:



Now the stack looks like:

0xbf83246e 0x8048616 0x42412484 0x46454443 0x4a494847 0x4e4d4c4b 0x8048419
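The byte reversal applied to the address of bar in the script can be sketched in C as follows. This is only an illustration of little-endian byte order; encode_le32 is a hypothetical helper name.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value in little-endian byte order,
 * least significant byte first, as x86 lays it out in memory. */
void encode_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xFF);
    out[1] = (unsigned char)((value >> 8) & 0xFF);
    out[2] = (unsigned char)((value >> 16) & 0xFF);
    out[3] = (unsigned char)((value >> 24) & 0xFF);
}
```

For the value 0x08048419 the four output bytes are 0x19, 0x84, 0x04 and 0x08, the same sequence the script appends after the filler characters.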

Heap overflows

A heap is an area of memory that an application uses and that is allocated dynamically at run time. Heap memory differs from stack memory in that it persists across function calls: memory allocated in a function stays allocated until it is explicitly freed. This means that a heap overflow may occur without being noticed until the affected memory is used later. Here is one of the simplest examples of a heap overflow:

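#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    /* Two small heap buffers, allocated one after the other. */
    char *input = malloc(20);
    char *output = malloc(20);

    if (argc != 2 || input == NULL || output == NULL)
        return 1;

    strcpy(output, "normal output");
    /* Unsafe: argv[1] may be longer than 20 bytes and spill into output. */
    strcpy(input, argv[1]);

    printf("input at %p: %s\n", (void *)input, input);
    printf("output at %p: %s\n", (void *)output, output);
    printf("\n\n%s\n", output);
    return 0;
}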

Let's look at the results of running it:

[root@root]# ./heap1 hackshacksuselessdata

input at 0x8049728: hackshacksuselessdata
output at 0x8049740: normal output

normal output

[root@root]# ./heap1 
input at 0x8049728: 
output at 0x8049740: hackshackshackshacks5hacks6hacks7


[root@root]# ./heap1 "hackshacks1hackshacks2hackshacks3hackshacks4what have I done?"
input at 0x8049728: hackshacks1hackshacks2hackshacks3hackshacks4what have I done?
output at 0x8049740: what have I done?
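One way to avoid this kind of heap overflow is to size the allocation from the input instead of assuming a fixed 20 bytes. The following is a minimal sketch under that assumption; copy_to_heap is a hypothetical helper, not part of the program above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of src in a buffer sized to fit
 * it exactly, or NULL if allocation fails. The caller must free() the result. */
char *copy_to_heap(const char *src)
{
    size_t size = strlen(src) + 1;      /* +1 for the NUL terminator */
    char *copy = malloc(size);

    if (copy != NULL) {
        memcpy(copy, src, size);
    }
    return copy;
}
```

Because the buffer is always as large as the string it receives, a long argument can no longer run past the end of the allocation into a neighboring one.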

Format string errors

This error arises when printf, sprintf, fprintf and similar functions are called without a format string. For example, the correct usage is:

printf("%s", input);

If it is instead written directly as:

printf(input);

there is a vulnerability: when the input contains specially crafted characters, memory can be overwritten and illegal instructions executed.
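The safe pattern can be sketched with snprintf, which takes the same kind of format string as printf. The helper name format_as_data is an assumption made for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: write untrusted input into out as plain data.
 * The fixed "%s" format means any % sequences in input are not interpreted.
 * Returns the value of snprintf: the length the full output would have. */
int format_as_data(char *out, size_t out_size, const char *input)
{
    return snprintf(out, out_size, "%s", input);
}
```

If input is the string %x%x%n, the helper copies those six characters verbatim. Passed directly as the format argument, the same string would make the function read values off the stack and, through %n, write to memory.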

Unicode and ANSI buffer sizes do not match

We often need to convert between Unicode and ANSI. The vast majority of Unicode functions measure buffer sizes in wide characters (two bytes each) rather than in bytes, so a careless conversion can cause an overflow. For example, the most commonly attacked function is MultiByteToWideChar; see the following code:

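BOOL GetName(char *szName)
{
    WCHAR wszUserName[256];

    // Convert ANSI name to Unicode.
    MultiByteToWideChar(CP_ACP, 0,
                        szName,
                        -1,
                        wszUserName,
                        sizeof(wszUserName));   // bug: size in bytes, not characters

    // ... rest of the function omitted
    return TRUE;
}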

wszUserName is a wide-character array, so sizeof(wszUserName) is 256 * 2 bytes, while the function expects the buffer size as a number of wide characters. This creates a potential buffer overflow. The call should be written like this:

MultiByteToWideChar(CP_ACP, 0,
                    szName,
                    -1,
                    wszUserName,
                    sizeof(wszUserName) / sizeof(wszUserName[0]));
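The same distinction between bytes and elements can be shown in portable C. The sketch below uses the standard mbstowcs function as a stand-in for the Windows-only MultiByteToWideChar; ARRAY_COUNT and to_wide are hypothetical names introduced for illustration.

```c
#include <stdlib.h>
#include <wchar.h>

/* Number of elements in an array (must be a real array, not a pointer). */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: convert src into dst, which holds dst_count wide
 * characters (a count of elements, not bytes). Always NUL-terminates.
 * Returns the number of wide characters written, or (size_t)-1 on error. */
size_t to_wide(wchar_t *dst, size_t dst_count, const char *src)
{
    size_t written;

    if (dst_count == 0) {
        return (size_t)-1;
    }
    written = mbstowcs(dst, src, dst_count - 1);
    if (written == (size_t)-1) {
        dst[0] = L'\0';
        return (size_t)-1;
    }
    dst[written] = L'\0';   /* mbstowcs does not terminate when it truncates */
    return written;
}
```

A caller would pass ARRAY_COUNT(buf) rather than sizeof(buf) as dst_count. Passing the byte size would overstate the capacity by a factor of sizeof(wchar_t), which is the mistake in GetName.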

Prevention and detection

  • Unsafe functions
    Avoid unsafe string-handling functions and use their safer counterparts instead:

    Unsafe function    Safer function
    strcpy             strncpy
    strcat             strncat
    sprintf            _snprintf
    gets               fgets
  • Visual C++ .NET /GS option
    The /GS option detects stack corruption and so protects the integrity of the stack, but it cannot completely prevent buffer overflows. Against heap overflows, for example, /GS is powerless.
  • Source code scanning
    The simplest source code scan:

    grep strcpy *.c
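The bounded functions listed above still need care. As a minimal sketch, the two hypothetical wrappers below (bounded_copy and bounded_append are names made up for this example) show how strncpy and strncat are typically called so that the result is always NUL-terminated and never exceeds the buffer.

```c
#include <string.h>

/* Copy with strncpy and terminate explicitly: strncpy leaves dst without a
 * NUL when src has dst_size - 1 or more characters. */
void bounded_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return;
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}

/* Append with strncat: its third argument is the number of characters to
 * append, so pass the space left in dst minus one for the NUL. */
void bounded_append(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);

    if (used + 1 >= dst_size) {
        return;                 /* no room left */
    }
    strncat(dst, src, dst_size - used - 1);
}
```

With an 8-byte buffer, both functions truncate their input so that at most 7 characters plus the terminating NUL are stored.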


