Title : Linux x86 kernel function hooking emulation
Author : mayhem
==Phrack Inc.==
Volume 0x0b, Issue 0x3a, Phile #0x08 of 0x0e
|=-----------------=[ IA32 ADVANCED FUNCTION HOOKING ]=------------------=|
|=-----------------------------------------------------------------------=|
|=-------------------=[ mayhem <mayhem@hert.org> ]=---------------------=|
|=-----------------------=[ December 08th 2001 ]=------------------------=|
--[ Contents
1 - Introduction
1.1 - History
1.2 - New requirements
2 - Hooking basics
2.1 - Usual techniques
2.2 - Things not to forget
3 - The code explained
4 - Using the library
4.1 - The API
4.2 - Kernel symbol resolution
4.3 - The hook_t object
5 - Testing the code
5.1 - Loading the module
5.2 - Playing around a bit
5.3 - The code
6 - References
--[ 1 - Introduction
Abusing, logging, patching, or even debugging: there are obvious reasons to
think that hooking matters. We will try to understand how it works, using
the Linux kernel as the demonstration environment. The article ends with a
general purpose hooking library for the Linux 2.4 kernel series, developed
on 2.4.5 and running on IA32. It is called LKH, the Linux Kernel Hooker.
----[ 1.1 - History
One of the references on the function hijacking subject was released in
November 1999 by Silvio Cesare [1] (hi dude ;-). His implementation was
pretty straightforward: the hook consisted of overwriting the first bytes
of the function with a jump to other code, in order to filter access to
the kernel's acct_process function and keep specific processes from
being accounted.
----[ 1.2 - New requirements
Some work has been done since that time, and new requirements have appeared:
- Practical use of redirection often (always?) needs access to the
original parameters, whatever their number and size (for example
if we want to modify and forward IP packets).
- We may need to disable the hook on demand, which is perfect for runtime
kernel configuration. We may want to call the original function
(discrete hooking, used by monitoring programs) or not (aggressive hooking,
used by security patches to manage ACLs, Access Control Lists, on kernel
objects).
- In some cases, we may also want to destroy the hook just after the first
call, for example to gather statistics (we can hook once every second or
once every minute).
--[ 2 - Hooking basics
----[ 2.1 - Usual techniques
Of course, the core hooking code must be written in assembly language, but
the wrapping code is written in C. The LKH high level interface is described
in the API section. Let us first go through some hooking basics.
This is basically what hooking consists of:
- Modify the beginning of a function's code so that it points to other code
(called the 'hooking code'). This is a very old and efficient way
to do what we want. The other way is to patch every call
in the code segment that references the function. This second method
has some advantages (it is very stealthy) but the implementation is a bit
complex (memory area parsing, then code scanning) and not very
fast.
- Modify the function's return address at runtime to take control when the
hooked function's execution is over.
- The hook code must have two different parts: the first one is
executed before the function (prepare the stack for accessing
parameters, launch callbacks, restore the old function code), the second
one is executed after it (set the hook again if needed).
- Default parameters (defining the hook behaviour) must be set during
the hook creation (before modifying the function code). Function
dependent parameters must be fixed at that point.
- Add callbacks. Each callback can access and even modify the original
function parameters.
- Enable, disable, change parameters, add or remove callbacks whenever we
want.
----[ 2.2 - Things not to forget
-> Functions without frame pointer:
An important feature is the ability to hook functions compiled with the
-fomit-frame-pointer gcc option. This requires the hooking code to
be %ebp free, which is why only %esp is used for stack operations.
We also have to update some parts (some bytes here and there) to fix %ebp
relative offsets in the hook code. Look at khook_create() in lkh.c for more
details on that subject.
The hook code also has to be position independent. That is why so many
offsets in the hook code are fixed at runtime (since we are in the kernel,
offsets have to be fixed during the hook creation, but very similar
techniques can be used for function hooking in processes at *runtime*).
-> Recursion
We must be able to call the original function from a callback, so the
original code has to be restored before the execution of any callback.
-> Return values
We must return the correct value in %eax, whether we have callbacks or not,
whether the original function is called or not. In the demonstration, the
return value of the last executed callback is returned if the original
function is not called. If neither a callback nor the original function is
called, the return value is undefined.
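The return value rule above can be modelled in plain userland C. This is
only a sketch of the policy, not the LKH implementation (the real work is
done by the assembly hook code): sim_hook_t, sim_dispatch(), the
'aggressive' flag and the two sample functions are hypothetical names
introduced for illustration.

```c
#include <stddef.h>

#define SIM_MAX_CALLBACKS 8                  /* same limit as the hook slots */

typedef int (*sim_callback_t)(void *args);   /* gets the parameters address */
typedef int (*sim_original_t)(int a, int b); /* hypothetical hooked function */

typedef struct {
    sim_callback_t cb[SIM_MAX_CALLBACKS];    /* NULL means "empty slot" */
    sim_original_t original;
    int            aggressive;               /* 1: never call the original */
} sim_hook_t;

/* Sample callback: increments both parameters and returns 7. */
int sim_cb_inc(void *args)
{
    int *p = args;

    p[0]++;
    p[1]++;
    return 7;
}

/* Sample hooked function: returns the sum of its parameters. */
int sim_sum(int a, int b)
{
    return a + b;
}

/* Run every callback, then the original function unless aggressive. */
int sim_dispatch(sim_hook_t *h, int a, int b)
{
    int    args[2] = { a, b };
    int    ret = 0;  /* the real hook leaves %eax undefined; 0 is our choice */
    size_t i;

    for (i = 0; i < SIM_MAX_CALLBACKS; i++)
        if (h->cb[i] != NULL)
            ret = h->cb[i](args);            /* callbacks may modify args */

    if (!h->aggressive)
        ret = h->original(args[0], args[1]); /* original value wins */
    return ret;
}
```

With sim_cb_inc in slot 0, an aggressive dispatch returns the callback
value, while a discrete one returns what sim_sum computes on the modified
parameters.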
-> POST callbacks
You cannot access the function parameters if you execute callbacks after the
original function has returned, which is why POST callbacks are a bad idea.
However, here is a technique to do it anyway:
- Set the hook as aggressive.
- Call the PRE callbacks.
- Call the original function from a callback with its own parameters.
- Call the POST callbacks.
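The four steps above can be sketched in userland C. This is an illustrative
model only: post_run() stands for an aggressive hook, and every name in it
(post_original, post_cb_pre, post_cb_call, post_cb_post, post_trace) is
hypothetical. The trace array just records the order in which the stages
ran.

```c
#include <stddef.h>

static int post_trace[4];     /* order in which the stages ran */
static int post_trace_len;
static int post_saved_ret;    /* value returned by the original function */

/* Hypothetical hooked function. */
static int post_original(int a, int b)
{
    post_trace[post_trace_len++] = 2;
    return a * b;
}

/* PRE callback: runs before the original, may modify the parameters. */
static int post_cb_pre(void *args)
{
    int *p = args;

    post_trace[post_trace_len++] = 1;
    p[0] += 1;
    return 0;
}

/* Middle callback: calls the original itself, with the current parameters. */
static int post_cb_call(void *args)
{
    int *p = args;

    post_saved_ret = post_original(p[0], p[1]);
    return post_saved_ret;
}

/* POST callback: runs after the original and hands its result back. */
static int post_cb_post(void *args)
{
    (void)args;
    post_trace[post_trace_len++] = 3;
    return post_saved_ret;
}

/* Aggressive dispatch: callbacks only, the last return value is kept. */
int post_run(int a, int b)
{
    int  (*slots[3])(void *) = { post_cb_pre, post_cb_call, post_cb_post };
    int    args[2] = { a, b };
    int    ret = 0;
    size_t i;

    post_trace_len = 0;
    for (i = 0; i < 3; i++)
        ret = slots[i](args);
    return ret;
}
```

Because the hook is aggressive, the hook code itself never calls the
original function; only the middle callback does, which keeps the
parameters reachable for everything that runs after it.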
--[ 3 - The code explained
First we install the hook.
A - Overwrite the first 7 bytes of the hijacked routine
with an indirect jump pointing to the hook code area.
The value loaded in %eax is the absolute address of the hook
code, so each time hijack_me() is called,
the hook code takes control.
Before hijack:
0x80485ec <hijack_me>: mov 0x4(%esp,1),%eax
0x80485f0 <hijack_me+4>: push %eax
0x80485f1 <hijack_me+5>: push $0x8048e00
0x80485f6 <hijack_me+10>: call 0x80484f0 <printf>
0x80485fb <hijack_me+15>: add $0x8,%esp
After the hijack:
0x80485ec <hijack_me>: mov $0x804a323,%eax
0x80485f1 <hijack_me+5>: jmp *%eax
0x80485f3 <hijack_me+7>: movl (%eax,%ecx,1),%es
0x80485f6 <hijack_me+10>: call 0x80484f0 <printf>
0x80485fb <hijack_me+15>: add $0x8,%esp
The instruction displayed right after the jmp does not mean anything:
it is made of the last 3 bytes of the overwritten push, and gdb is
fooled by our hook.
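As an illustration, the 7 bytes written in step A can be built as follows.
This is a minimal sketch and not the LKH code: encode_hijack() and
HIJACK_LEN are hypothetical names. The opcodes are the standard IA32
encodings, 0xB8 followed by a 32 bit immediate for "mov $imm32, %eax" and
0xFF 0xE0 for "jmp *%eax".

```c
#include <stdint.h>

#define HIJACK_LEN 7   /* 5 bytes for the mov, 2 bytes for the jmp */

/* Write "mov $target, %eax ; jmp *%eax" into out[0..6]. */
void encode_hijack(uint8_t out[HIJACK_LEN], uint32_t target)
{
    out[0] = 0xB8;                              /* mov $imm32, %eax */
    out[1] = (uint8_t)(target & 0xFF);          /* imm32, little endian */
    out[2] = (uint8_t)((target >> 8) & 0xFF);
    out[3] = (uint8_t)((target >> 16) & 0xFF);
    out[4] = (uint8_t)((target >> 24) & 0xFF);
    out[5] = 0xFF;                              /* jmp *%eax */
    out[6] = 0xE0;
}
```

For the hook code address 0x804a323 used in the listing, this produces the
bytes b8 23 a3 04 08 ff e0.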
B - Reset the original bytes of the hooked function. We need that if
we want to call the original function without breaking things.
pusha
movl $0x00, %esi (1)
movl $0x00, %edi (2)
push %ds
pop %es
cld
xor %ecx, %ecx
movb $0x07, %cl
rep movsb
The two NULL offsets are actually patched during the hook
creation (since their values depend on the hooked function's address,
we have to patch the hook code at runtime). (1) is replaced with
the address of the buffer containing the first 7 saved bytes of the
original function. (2) is replaced with the original function address.
If you are familiar with x86 assembly language, you know
that these instructions copy %ecx bytes from %ds:%esi to
%es:%edi. Refer to [2] for the INTEL instruction specifications.
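For readers less used to string instructions, the copy can be emulated in
C. The sketch below models "cld ; rep movsb" only; emu_rep_movsb() is a
hypothetical helper, not something found in lkh.c. Keep in mind that %ecx
counts elements, so a 7 byte restore needs the byte form of the
instruction.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Emulate "cld ; rep movsb": copy ecx bytes from esi to edi, lowest
 * address first. Returns the final %edi value (one past the last byte
 * written).
 */
uint8_t *emu_rep_movsb(uint8_t *edi, const uint8_t *esi, size_t ecx)
{
    while (ecx != 0) {
        *edi++ = *esi++;   /* movsb: one byte, both pointers advance */
        ecx--;             /* rep: repeat until the counter reaches 0 */
    }
    return edi;
}
```

Called with the saved bytes as source, the function entry as destination
and a count of 7, it restores exactly the area overwritten in step A and
nothing more.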
C - Initialise the stack to allow read/write access to the parameters and
launch our callbacks. We load the address of the first original
parameter in %eax, then we push it.
leal 8(%esp), %eax
push %eax
nop; nop; nop; nop; nop
nop; nop; nop; nop; nop
nop; nop; nop; nop; nop
nop; nop; nop; nop; nop
nop; nop; nop; nop; nop
nop; nop; nop; nop; nop
nop; nop; nop; nop; nop
nop; nop; nop; nop; nop
Note that empty slots are full of NOP instructions (opcode 0x90),
which perform no operation. When a slot is filled (using the
khook_add_entry function), 5 bytes are used:
- The call opcode (0xE8)
- The callback offset (4 bytes, relative address)
We chose to set a maximum of 8 callbacks. Each inserted
callback is called with one parameter (the pushed %eax value, which is
the address of the original function parameters lying on the stack).
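Filling a slot can be sketched as follows. This is an assumption about how
such a slot may be encoded, not an extract of khook_add_entry():
encode_call_slot(), clear_call_slot() and SLOT_LEN are hypothetical names.
The only facts used are the IA32 encodings: 0xE8 is a near call whose 32
bit operand is relative to the address of the next instruction, and 0x90
is NOP.

```c
#include <stdint.h>
#include <string.h>

#define SLOT_LEN 5      /* one opcode byte + a 4 byte relative offset */

/* Fill the slot living at address slot_addr with "call callback". */
void encode_call_slot(uint8_t out[SLOT_LEN], uint32_t slot_addr,
                      uint32_t callback)
{
    /* rel32 is relative to the instruction following the call */
    uint32_t rel = callback - (slot_addr + SLOT_LEN);

    out[0] = 0xE8;
    out[1] = (uint8_t)(rel & 0xFF);             /* little endian */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
}

/* Empty a slot again: five NOP instructions. */
void clear_call_slot(uint8_t out[SLOT_LEN])
{
    memset(out, 0x90, SLOT_LEN);
}
```

Since the offset is relative, it depends on where the slot itself lives,
which is one more reason why the hook code has to be patched at hook
creation time.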
D - Reset the stack .
add $0x04, %esp
We now remove the original function's parameter address
pushed in (C). That way, %esp is reset to its old value (the
one it had before entering step C). At this moment, the stack
does not contain the original function's stack frame since it
was overwritten in step (A).
E - Modify the return address of the original function on the stack .
On INTEL processors, function return addresses are saved on the stack,
which is not a very good idea for security reasons ;-). This
modification makes us return where we want (to the hook code)
after the original function has executed. Then we call the original
function. On return, the hook code regains control. Let's look at
that carefully:
-> First we get our current %eip and save it in %esi (the 'end'
label points to some code you can easily identify in
step E5). This trick is always used in position independent
code.
1. jmp end
begin:
pop %esi
-> Then we retrieve the old return address, located
at 4(%esp), and save it in %eax.
2. movl 4(%esp), %eax
-> We store that saved return address in the 4 byte offset
at the end of the hook code (see the NULL offset in
step G), so that we can return to the right place at the
end of the hooking process.
3. movl %eax, 20(%esi)
-> We modify the return address of the original function
so that it returns just after the 'call begin' instruction.
4. movl %esi, 4(%esp)
movl $0x00, %eax
-> We call the original function. The 'end' label is used
in step 1, and the 'begin' label points to the code just
after the "jmp end" (still in step 1).
The original function will return just after the 'call begin'
instruction since we changed its return address.
5. jmp *%eax
end:
call begin
F - Back to the hooking code. We set the 7 evil bytes again in the
original function's code. These bytes were reset to their original
values before calling the function, so we need to hook the function
again (like in step A).
This step is nop'ed (replaced by NOP instructions) if the hook is
single-shot (not permanent), so the 7 bytes of our evil indirect
jump (step A) are not copied again. This step is very close to
step (B) since it uses the same copy mechanism (using rep movs*
instructions), so refer to that step for explanations. The NULL
offsets in the code must be fixed during the hook creation:
- The first one (the source buffer) is replaced by the address of the
evil bytes buffer.
- The second one (the destination buffer) is replaced by the original
function entry point address.
movl $0x00, %esi
movl $0x00, %edi
push %ds
pop %es
cld
xor %ecx, %ecx
movb $0x07, %cl
rep movsb
G - Use the original return address (retrieved in step E2) and get
back to the original calling function. The NULL offset you
can see (*) is filled in step E3 with the original function
return address. The %ecx value is then pushed on the stack so the
next ret instruction uses it as if it were a saved %eip
register on the stack. This returns to the (correct) original
place.
movl $0x00, %ecx *
pushl %ecx
ret
--[ 4 - Using the library
----[ 4.1 - The API
The LKH API is pretty easy to use :
hook_t *khook_create(int addr, int mask);
Create a hook on the address 'addr'. Also give the default type
(HOOK_PERMANENT or HOOK_SINGLESHOT), the default state
(HOOK_ENABLED or HOOK_DISABLED) and the default mode (HOOK_AGGRESSIVE
or HOOK_DISCRETE). The type, state and mode are OR'd in the
'mask' parameter.
void khook_destroy(hook_t *h);
Disable, destroy, and free the hook resources.
int khook_add_entry(hook_t *h, char *routine, int range);
Add a callback to the hook, at the 'range' rank . Return -1 if the
given rank is invalid . Otherwise, return 0 .
int khook_remove_entry(hook_t *h, int range);
Remove the callback put in slot 'range', return -1 if the given rank
is invalid . Otherwise return 0 .
void khook_purge(hook_t *h);
Remove all callbacks on this hook .
int khook_set_type(hook_t *h, char type);
Change the type for the hook 'h'. The type can be HOOK_PERMANENT
(the hook code is executed each time the hooked function is called) or
HOOK_SINGLESHOT (the hook code is executed only for 1 hijack, then the
hook is cleanly removed).
int khook_set_state(hook_t *h, char state);
Change the state for the hook 'h' . The state can be HOOK_ENABLED
(the hook is enabled) or HOOK_DISABLED (the hook is disabled) .
int khook_set_mode(hook_t *h, char mode);
Change the mode for the hook 'h' . The mode can be HOOK_AGGRESSIVE
(the hook does not call the hijacked function) or HOOK_DISCRETE
(the hook calls the hijacked function after having executed the
callback routines) . Some part of the hook code is nop'ed
(overwritten by no operation instructions) if the hook is aggressive
(step E and step H) .
int khook_set_attr(hook_t *h, int mask);
Change the mode, state, and/or type using a unique function call.
The function returns 0 in case of success or -1 if the specified
mask contains incompatible options .
Note that you can add or remove entries whenever you want, whatever the
state , type and mode of the used hook .
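The "incompatible options" check of a mask can be pictured like this. It is
only a sketch under loud assumptions: the SK_* bit values below are made up
for the example (the real HOOK_* values are defined in lkh.h and may
differ), and sk_check_mask() is a hypothetical helper, not an LKH function.

```c
/* Hypothetical bit values, chosen for this example only. */
#define SK_PERMANENT   0x01
#define SK_SINGLESHOT  0x02
#define SK_ENABLED     0x04
#define SK_DISABLED    0x08
#define SK_AGGRESSIVE  0x10
#define SK_DISCRETE    0x20

/* 1 if both members of a mutually exclusive pair are set in the mask. */
static int sk_both_set(int mask, int x, int y)
{
    return (mask & x) != 0 && (mask & y) != 0;
}

/*
 * Return 0 if the mask is consistent, -1 if it holds incompatible
 * options. A missing attribute is accepted here (assumption: it would
 * simply keep its previous value).
 */
int sk_check_mask(int mask)
{
    if (sk_both_set(mask, SK_PERMANENT, SK_SINGLESHOT) ||
        sk_both_set(mask, SK_ENABLED, SK_DISABLED) ||
        sk_both_set(mask, SK_AGGRESSIVE, SK_DISCRETE))
        return -1;
    return 0;
}
```

A mask such as SK_PERMANENT | SK_ENABLED | SK_DISCRETE passes, while
SK_ENABLED | SK_DISABLED is rejected.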
----[ 4.2 - Kernel symbol resolution
A symbol resolution function has been added to LKH, allowing you to get
the addresses of exported functions.
int ksym_lookup(char *name);
Note that it returns NULL (0) if the symbol remains unresolved. This lookup
can resolve the symbols contained in the __ksymtab section of the kernel. An
exhaustive list of these symbols is printed when executing 'ksyms -a':
bash-2.03# ksyms -a | wc -l
1136
bash-2.03# wc -l /boot/System.map
14647 /boot/System.map
bash-2.03# elfsh -f /usr/src/linux/vmlinux -s # displaying sections
[SECTION HEADER TABLE]
(nil) --- foffset: (nil) 0 bytes [*Unknown*]
(...)
0xc024d9e0 a-- __ex_table foffset: 0x14e9e0 5520 bytes [Program data]
0xc024ef70 a-- __ksymtab foffset: 0x14ff70 9008 bytes [Program data]
0xc02512a0 aw- .data foffset: 0x1522a0 99616 bytes [Program data]
(...)
(nil) --- .shstrtab foffset: 0x1ad260 216 bytes [String table]
(nil) --- .symtab foffset: 0x1ad680 245440 bytes [Symbol table]
(nil) --- .strtab foffset: 0x1e9540 263805 bytes [String table]
[END]
As a matter of fact, the memory mapped section __ksymtab does not contain
every kernel symbol we would like to hijack.
On the other hand, the non-mapped section .symtab is definitely bigger
(245440 bytes vs 9008 bytes). When using 'ksyms', the __NR_query_module
syscall (or __NR_get_kernel_syms for older kernels) is used internally. This
syscall can only access the __ksymtab section, since the complete kernel
symbol table contained in .symtab is not loaded in memory. The solution
to access the whole symbol table is to pick up offsets in our System.map
file (create it using `nm -a vmlinux > System.map`).
bash-2.03# ksyms -a | grep sys_fork
bash-2.03# grep sys_fork /boot/System.map
c0105898 T sys_fork
bash-2.03#
#define SYS_FORK 0xc0105898
if ((s = khook_create((int) SYS_FORK, HOOK_PERMANENT | HOOK_ENABLED | HOOK_DISCRETE)) == NULL)
KFATAL("init_module: Cant set hook on function *sys_fork* ! \n", -1);
khook_add_entry(s, (int) fork_callback, 0);
#undef SYS_FORK
For systems without a System.map or an uncompressed kernel image (vmlinux),
it is acceptable to uncompress the vmlinuz file (take care, it is not in
standard gzip format! [3] contains very useful information about this) and
manually create a new System.map file.
Another way to resolve non-exported kernel symbols could
be a statistics based lookup: analysing references in the kernel
binary code (looking for call or jmp instructions) could allow us to
predict the symbol values. The difficulty with such a tool would be
portability, since the kernel code changes from one version to another.
Don't forget to change SYS_FORK to your own sys_fork address.
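Picking an address out of System.map by hand can also be automated at build
time. The sketch below is a userland helper, not part of LKH:
sysmap_lookup() is a hypothetical name, and it works on text already
loaded in memory so that it stays self-contained. It only assumes the
usual "address type name" layout of each line, as shown in the grep output
of this section.

```c
#include <stdio.h>
#include <string.h>

/*
 * Search System.map style text ("address type name", one symbol per
 * line) for 'name'. Returns the address, or 0 if the symbol is not
 * listed.
 */
unsigned long sysmap_lookup(const char *map, const char *name)
{
    const char *line = map;

    while (line != NULL && *line != '\0') {
        unsigned long addr;
        char          type;
        char          sym[128];

        /* exact name match, unlike a plain grep */
        if (sscanf(line, "%lx %c %127s", &addr, &type, sym) == 3 &&
            strcmp(sym, name) == 0)
            return addr;

        line = strchr(line, '\n');      /* go to the next line, if any */
        if (line != NULL)
            line++;
    }
    return 0;
}
```

Fed with the line "c0105898 T sys_fork", a lookup of "sys_fork" gives
0xc0105898, which could then be written into a generated header instead of
being edited by hand.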
----[ 4.3 - LKH Internals: the hook_t object
Let's look at the hook_t structure (the hook entity in memory) :
typedef struct s_hook
{
int addr;
int offset;
char saved_bytes[7];
char voodoo_bytes[7];
char hook[HOOK_SIZE];
char cache1[CACHE1_SIZE];
char cache2[CACHE2_SIZE];
} hook_t;
h->addr The address of the original function, used to
enable or disable the hook .
h->offset This field contains the offset from h->addr where we
begin to overwrite in order to set the hijack. Its value is 3 or
0, depending on whether the function has a stack frame
or not.
h->saved_bytes The seven overwritten bytes of the original
function.
h->voodoo_bytes The seven bytes we need to put at the beginning of the
function to redirect it (contains the indirect jump code
seen in step A of section 3).
h->hook The opcode buffer containing the hooking code,
where we insert callback references using
khook_add_entry().
The cache1 and cache2 buffers are used to back up some hook code when we
set the mode to HOOK_AGGRESSIVE (since we have to nop the original function
call, saving this code is necessary in order to be able to set the hook
back to discrete later).
Each time you create a hook, an instance of hook_t is
allocated. You have to create one hook per function you want to
hijack.
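To see how the addr, offset, saved_bytes and voodoo_bytes fields work
together, here is a userland model acting on a plain byte buffer instead
of kernel code. It is a sketch, not the lkh.c implementation: sk_hook_t
and the sk_hook_*() functions are hypothetical, and the hook, cache1 and
cache2 buffers are left out.

```c
#include <stdint.h>
#include <string.h>

#define SK_PATCH_LEN 7

typedef struct {
    uint8_t *addr;                        /* start of the simulated function */
    int      offset;                      /* where the overwrite begins */
    uint8_t  saved_bytes[SK_PATCH_LEN];   /* original bytes */
    uint8_t  voodoo_bytes[SK_PATCH_LEN];  /* redirection bytes */
} sk_hook_t;

/* Remember the original bytes and the redirection to install. */
void sk_hook_init(sk_hook_t *h, uint8_t *func, int offset,
                  const uint8_t voodoo[SK_PATCH_LEN])
{
    h->addr = func;
    h->offset = offset;
    memcpy(h->saved_bytes, func + offset, SK_PATCH_LEN);
    memcpy(h->voodoo_bytes, voodoo, SK_PATCH_LEN);
}

/* Enable: put the indirect jump in place. */
void sk_hook_enable(sk_hook_t *h)
{
    memcpy(h->addr + h->offset, h->voodoo_bytes, SK_PATCH_LEN);
}

/* Disable: put the original bytes back. */
void sk_hook_disable(sk_hook_t *h)
{
    memcpy(h->addr + h->offset, h->saved_bytes, SK_PATCH_LEN);
}
```

With an offset of 3, the three bytes of a classic "push %ebp ; mov %esp,
%ebp" prologue (55 89 e5) stay untouched and only the following 7 bytes
are swapped.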
--[ 5 - Testing the code
Please check http://www.devhell.org/~mayhem/ for fresh code first. The
package (version 1.1) is given at the end of the article.
Just do #include "lkh.c" and play! In this example module using LKH,
we want to hook:
- the hijack_me() function, where you can check that parameters are passed
correctly and that the callbacks modify them as expected.
- the schedule() function, SINGLESHOT hijack.
- the sys_fork() function, PERMANENT hijack.
----[ 5.1 - Loading the module
bash-2.03# make load
insmod lkh.o
Testing a permanent, aggressive, enabled hook with 3 callbacks:
A in hijack_one = 0 -OK-
B in hijack_one = 1 -OK-
A in hijack_zero = 1 -OK-
B in hijack_zero = 2 -OK-
A in hijack_two = 2 -OK-
B in hijack_two = 3 -OK-
--------------------
Testing a disabled hook:
A in HIJACKME!!! = 10 -OK-
B in HIJACKME!!! = 20 -OK-
--------------------
Calling hijack_me after the hook destruction
A in HIJACKME!!! = 1 -OK-
B in HIJACKME!!! = 2 -OK-
SCHEDULING!
----[ 5.2 - Playing around a bit
bash-2.05# ls
FORKING!
Makefile doc example.c lkh.c lkh.h lkh.o user user.c user.h user.o
bash-2.05# pwd
/usr/src/coding/LKH
(FORKING! was not printed since pwd is a shell builtin command :)
bash-2.05# make unload
FORKING!
rmmod lkh;
LKH unloaded - sponsorized by the /dev/hell crew!
bash-2.05# ls
Makefile doc example.c lkh.c lkh.h lkh.o user user.c user.h user.o
bash-2.05#
You can see "FORKING!" each time the sys_fork() kernel function is called
(the hook is permanent) and "SCHEDULING!" when the schedule() kernel function
is called for the first time (since this hook is SINGLESHOT, the schedule()
function is hijacked only once, then the hook is removed).
Here is the commented code for this demo:
----[ 5.3 - The code
/*
** LKH demonstration code, developed and tested on Linux x86 2.4.5
**
** The Library code is attached .
** Please check http://www.devhell.org/~mayhem/ for updates .
**
** This tarball includes a userland code (runnable from GDB), the LKH
** kernel module and its include file, and this file (lkm-example.c)
**
** Suggestions {and,or} bug reports are welcomed ! LKH 1.2 already
** in development .
**
** Special thanks to b1nf for quality control ;)
** Shoutout to kraken, keep the good work on psh man !
**
** Thanks to csp0t (one word to describe you : *elite*)
** and cma4 (EPITECH powa, favorite win32 kernel hax0r)
**
** BigKaas to the devhell crew (r1x and nitrogen fux0r)
** Lightman, Gab and Xfred from chx-labs (stop smoking you junkies ;)
**
** Thanks to the phrackstaff and particularly skyper for his
** great support . Le Havre en force ! Case mais oui je t'aime ;)
*/
#include "lkh.c"
int hijack_me(int a, int b); /* hooked function */
int hijack_zero(void *ptr); /* first callback */
int hijack_one(void *ptr); /* second callback */
int hijack_two(void *ptr); /* third callback */
void hijack_fork(void *ptr); /* sys_fork callback */
void hijack_schedule(void *ptr); /* schedule callback */
static hook_t *h = NULL;
static hook_t *i = NULL;
static hook_t *j = NULL;
int
init_module()
{
int ret;
printk(KERN_ALERT "Change the SYS_FORK value then remove the return \n");
return (-1);
/*
** Create the hooks
*/
#define SYS_FORK 0xc010584c
j = khook_create(SYS_FORK
, HOOK_PERMANENT
| HOOK_ENABLED
| HOOK_DISCRETE);
#undef SYS_FORK
h = khook_create(ksym_lookup("hijack_me")
, HOOK_PERMANENT
| HOOK_ENABLED
| HOOK_AGGRESSIVE);
i = khook_create(ksym_lookup("schedule")
, HOOK_SINGLESHOT
| HOOK_ENABLED
| HOOK_DISCRETE);
/*
** Yet another check
*/
if (!h || !i || !j)
{
printk(KERN_ALERT "Cannot hook kernel functions \n");
return (-1);
}
/*
** Adding some callbacks for the sys_fork and schedule functions
*/
khook_add_entry(i, (int) hijack_schedule, 0);
khook_add_entry(j, (int) hijack_fork, 0);
/*
** Testing the hijack_me() hook .
*/
printk(KERN_ALERT "LKH: perm, aggressive, enabled hook, 3 callbacks:\n");
khook_add_entry(h, (int) hijack_zero, 1);
khook_add_entry(h, (int) hijack_one, 0);
khook_add_entry(h, (int) hijack_two, 2);
ret = hijack_me(0, 1);
printk(KERN_ALERT "--------------------\n");
printk(KERN_ALERT "Testing a disabled hook :\n");
khook_set_state(h, HOOK_DISABLED);
ret = hijack_me(10, 20);
khook_destroy(h);
printk(KERN_ALERT "------------------\n");
printk(KERN_ALERT "Calling hijack_me after the hook destruction\n");
hijack_me(1, 2);
return (0);
}
void
cleanup_module()
{
khook_destroy(i);
khook_destroy(j);
printk(KERN_ALERT "LKH unloaded - sponsorized by the /dev/hell crew!\n");
}
/*
** Function to hijack
*/
int
hijack_me(int a, int b)
{
printk(KERN_ALERT "A in HIJACKME!!! = %u \t -OK- \n", a);
printk(KERN_ALERT "B in HIJACKME!!! = %u \t -OK- \n", b);
return (42);
}
/*
** First callback for hijack_me()
*/
int
hijack_zero(void *ptr)
{
int *a;
int *b;
a = ptr;
b = a + 1;
printk(KERN_ALERT "A in hijack_zero = %u \t -OK- \n", *a);
printk(KERN_ALERT "B in hijack_zero = %u \t -OK- \n", *b);
(*b)++;
(*a)++;
return (0);
}
/*
** Second callback for hijack_me()
*/
int
hijack_one(void *ptr)
{
int *a;
int *b;
a = ptr;
b = a + 1;
printk(KERN_ALERT "A in hijack_one = %u \t -OK- \n", *a);
printk(KERN_ALERT "B in hijack_one = %u \t -OK- \n", *b);
(*a)++;
(*b)++;
return (1);
}
/*
** Third callback for hijack_me()
*/
int
hijack_two(void *ptr)
{
int *a;
int *b;
a = ptr;
b = a + 1;
printk(KERN_ALERT "A in hijack_two = %u \t -OK- \n", *a);
printk(KERN_ALERT "B in hijack_two = %u \t -OK- \n", *b);
(*a)++;
(*b)++;
return (2);
}
/*
** Callback for schedule() (kernel exported symbol)
*/
void hijack_schedule(void *ptr)
{
printk(KERN_ALERT "SCHEDULING! \n");
}
/*
** Callbacks for sys_fork() (kernel non exported symbol)
*/
void
hijack_fork(void *ptr)
{
printk(KERN_ALERT "FORKING! \n");
}
--[ 6 - References
[1] Kernel function hijacking
http://www.big.net.au/~silvio/
[2] INTEL Developers manual
http://developers.intel.com/design/pentium4/manuals/
[3] Linux Kernel Internals
http://www.linuxdoc.org/guides.html
|=[ EOF ]=---------------------------------------------------------------=|