
Attacking the Windows NVIDIA Driver

Posted by Oliver Chang

Modern graphics drivers are complicated and provide a large, promising attack surface for EoPs and sandbox escapes from processes that have access to the GPU (e.g. the Chrome GPU process). In this blog post we’ll take a look at attacking the NVIDIA kernel mode Windows drivers, and a few of the bugs that I found. I did this research as part of a 20% project, during which a total of 16 vulnerabilities were discovered.

Kernel WDDM interfaces

The kernel mode component of a graphics driver is referred to as the display miniport driver. Microsoft’s documentation has a nice diagram that summarises the relationship between the various components:

[Image: diagram from Microsoft’s documentation showing the relationship between the display driver components]

In the DriverEntry() for display miniport drivers, a DRIVER_INITIALIZATION_DATA structure is populated with callbacks to the vendor implementations of functions that actually interact with the hardware, and is passed to dxgkrnl.sys (DirectX subsystem) via DxgkInitialize(). These callbacks can either be called by the DirectX kernel subsystem, or in some cases get called directly from user mode code.

DxgkDdiEscape

A well known entry point for potential vulnerabilities here is the DxgkDdiEscape interface. This can be called straight from user mode, and accepts arbitrary data that is parsed and handled in a vendor specific way (essentially an IOCTL). For the rest of this post, we’ll use the term “escape” to denote a particular command that’s supported by the DxgkDdiEscape function.

NVIDIA has a whopping 400 escapes here at time of writing, so this was where I spent most of my time (the necessity of many of these being in the kernel is questionable):


// (names of these structs are made up by me)
// Represents a group of escape codes
struct NvEscapeRecord {
 DWORD action_num;
 DWORD expected_magic;
 void *handler_func;
 NvEscapeRecordInfo *info;
 _QWORD num_codes;
};

// Information about a specific escape code.
struct NvEscapeCodeInfo {
 DWORD code;
 DWORD unknown;
 _QWORD expected_size;
 WORD unknown_1;
};

NVIDIA implements their private data (pPrivateDriverData in the DXGKARG_ESCAPE struct) for each escape as a header followed by data. The header has the following format:

struct NvEscapeHeader {
 DWORD magic;
 WORD unknown_4;
 WORD unknown_6;
 DWORD size;
 DWORD magic2;
 DWORD code;
 DWORD unknown[7];
};

These escapes are identified by a 32-bit code (first member of the NvEscapeCodeInfo struct above), and are grouped by their most significant byte (from 1 - 9).

There is some validation being done before each escape code is handled. In particular, each NvEscapeCodeInfo contains the expected size of the escape data following the header. This is validated against the size in the NvEscapeHeader, which itself is validated against the PrivateDriverDataSize field given to DxgkDdiEscape. However, it’s possible for the expected size to be 0 (usually when the escape data is expected to be variable sized), which means that the escape handler is responsible for doing its own validation. This has led to some bugs (1, 2).
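The validation chain described above can be sketched roughly as follows. This is only an illustration under my own assumptions, not the driver's actual code: EscapeSizeRule, EscapeCheck, escape_group and check_escape are hypothetical names, and I assume that the size field in the header counts the header plus the data that follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable mirror of the reverse-engineered header shown earlier. */
typedef struct {
    uint32_t magic;
    uint16_t unknown_4;
    uint16_t unknown_6;
    uint32_t size;        /* assumed: header plus payload, in bytes */
    uint32_t magic2;
    uint32_t code;        /* escape code */
    uint32_t unknown[7];
} NvEscapeHeader;

/* Hypothetical reduction of NvEscapeCodeInfo to the fields used here. */
typedef struct {
    uint32_t code;
    uint64_t expected_size; /* 0: variable sized, the handler must check */
} EscapeSizeRule;

typedef enum {
    ESCAPE_REJECT = 0,
    ESCAPE_OK = 1,                 /* payload size matched the expected size */
    ESCAPE_HANDLER_MUST_CHECK = 2  /* dispatcher could not validate payload */
} EscapeCheck;

/* Escapes are grouped by the most significant byte of their code. */
uint32_t escape_group(uint32_t code)
{
    return code >> 24;
}

EscapeCheck check_escape(const void *private_data, size_t private_data_size,
                         const EscapeSizeRule *rule)
{
    const NvEscapeHeader *hdr = private_data;
    size_t payload_size;

    assert(rule != NULL);
    /* The buffer must at least hold a full header. */
    if (private_data == NULL || private_data_size < sizeof(*hdr))
        return ESCAPE_REJECT;
    /* The size claimed in the header must fit inside the buffer. */
    if (hdr->size < sizeof(*hdr) || hdr->size > private_data_size)
        return ESCAPE_REJECT;
    if (hdr->code != rule->code)
        return ESCAPE_REJECT;

    payload_size = hdr->size - sizeof(*hdr);
    /* Expected size 0 leaves every payload check to the handler. */
    if (rule->expected_size == 0)
        return ESCAPE_HANDLER_MUST_CHECK;
    return payload_size == rule->expected_size ? ESCAPE_OK : ESCAPE_REJECT;
}
```

The interesting case is ESCAPE_HANDLER_MUST_CHECK: nothing about the payload has been verified at that point, so any handler that trusts lengths or offsets inside the data is on its own.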

Most of the vulnerabilities found (13 in total) in escape handlers were very basic mistakes, such as writing to user provided pointers blindly, disclosing uninitialised kernel memory to user mode, and incorrect bounds checking. There were also numerous issues that I noticed (e.g. OOB reads) that I didn’t report because they didn’t seem exploitable.

DxgkDdiSubmitBufferVirtual

Another interesting entry point is the DxgkDdiSubmitBufferVirtual function, which is newly introduced in Windows 10 and WDDM 2.0 to support GPU virtual memory (deprecating the old DxgkDdiSubmitBuffer/DxgkDdiRender functions). This function is fairly complicated, and also accepts vendor specific data from the user mode driver for each command submitted. One bug was found here.

Others

There are a few other WDDM functions that accept vendor-specific data, but nothing of interest was found in those after a quick review.

Exposed devices

NVIDIA also exposes some additional devices that can be opened by any user:

  • \\.\NvAdminDevice which appears to be used for NVAPI. A lot of the ioctl handlers seem to call into DxgkDdiEscape.
  • \\.\UVMLite{Controller,Process*}, likely related to NVIDIA’s “unified memory”. 1 bug was found here.
  • \\.\NvStreamKms, installed by default as part of GeForce Experience, but you can opt out during installation. It’s not exactly clear why this particular driver is necessary. 1 bug was found here also.

More interesting bugs

Most of the bugs I found were by manual reversing and analysis, along with some custom IDA scripts. I also ended up writing a fuzzer, which was surprisingly successful given how simple it was.

While most of the bugs were rather boring (simple cases of missing validation), there were a few that were a bit more interesting.

NvStreamKms

This driver registers a process creation notification callback using the PsSetCreateProcessNotifyRoutineEx function. This callback checks if new processes created on the system match image names that were previously set by sending IOCTLs.

This creation notification routine contained a bug:

(Simplified decompiled output)

wchar_t Dst[BUF_SIZE];

...

if ( cur->image_names_count > 0 ) {
 // info_ is the PPS_CREATE_NOTIFY_INFO that is passed to the routine.
 image_filename = info_->ImageFileName;
 buf = image_filename->Buffer;
 if ( buf ) {
   filename_length = 0i64;
   num_chars = image_filename->Length / 2;
   // Look for the filename by scanning for backslash.
   if ( num_chars ) {
     while ( buf[num_chars - filename_length - 1] != '\\' ) {
       ++filename_length;
       if ( filename_length >= num_chars )
         goto DO_COPY;
     }
     buf += num_chars - filename_length;
   }
DO_COPY:
   wcscpy_s(Dst, filename_length, buf);
   Dst[filename_length] = 0;
   wcslwr(Dst);

This routine extracts the image name from the ImageFileName member of PS_CREATE_NOTIFY_INFO by searching backwards for backslash (‘\’). This is then copied to a stack buffer (Dst) using wcscpy_s, but the length passed is the length of the calculated name, and not the length of the destination buffer.
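To make the mistake concrete, here is a minimal portable sketch of a copy that is bounded by the capacity of the destination instead of the length of the source. copy_image_name is a hypothetical helper of mine, written with standard C only; it is not what the driver does and not necessarily how NVIDIA fixed it.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Copies src_len wide characters from src into dst and NUL-terminates.
 * dst_chars is the capacity of dst in wide characters, including room
 * for the terminator. Returns 0 on success, or -1 if the name does not
 * fit, in which case dst is left as an empty string.
 */
int copy_image_name(wchar_t *dst, size_t dst_chars,
                    const wchar_t *src, size_t src_len)
{
    assert(dst != NULL && src != NULL);
    if (dst_chars == 0)
        return -1;
    if (src_len >= dst_chars) {   /* no room for the name plus terminator */
        dst[0] = L'\0';
        return -1;
    }
    wmemcpy(dst, src, src_len);
    dst[src_len] = L'\0';
    return 0;
}
```

In the decompiled code the analogous change would presumably be to pass the element count of Dst as the second argument to wcscpy_s, so that the bound describes the buffer being written and not the data being read.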

Even though Dst is a fixed size buffer, this isn’t a straightforward overflow. Its size is bigger than 255 wchars, and for most Windows filesystems path components cannot be greater than 255 characters. Scanning for backslash is also valid for most cases because ImageFileName is a canonicalised path.

It is, however, possible to pass a UNC path that keeps forward slash (‘/’) as the path separator after being canonicalised (credits to James Forshaw for pointing me to this). This means we can get a filename of the form “aaa/bbb/ccc/...” and cause an overflow.
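The effect of the forward slashes on the length calculation can be reproduced with a small portable re-creation of the scanning loop. name_length_after_backslash is a hypothetical name, and the function only mirrors the simplified decompiled logic shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Walks backwards from the end of buf and counts the characters that
 * follow the last backslash. Forward slashes are not treated as
 * separators, just like in the decompiled loop.
 */
size_t name_length_after_backslash(const wchar_t *buf, size_t num_chars)
{
    size_t filename_length = 0;

    assert(buf != NULL);
    while (filename_length < num_chars &&
           buf[num_chars - filename_length - 1] != L'\\') {
        ++filename_length;
    }
    return filename_length;
}
```

For an ordinary path such as L"C:\\dir\\blah.exe" this returns 8, the length of “blah.exe”. For a path ending in “\\aaaa/bbbb/cccc/blah.exe” it returns 23, because everything after the last backslash counts as the name. Nothing stops an attacker from adding more forward-slash-separated components until that value exceeds the size of Dst.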

For example: CreateProcessW(L"\\\\?\\UNC\\127.0.0.1@8000\\DavWWWRoot\\aaaa/bbbb/cccc/blah.exe", …)

Another interesting note is that the wcslwr following the bad copy doesn’t actually limit the contents of the overflow (the only requirement is valid UTF-16). Since the calculated filename_length doesn’t include the null terminator, wcscpy_s will think that the destination is too small and will clear the destination string by writing a null byte at the beginning (after copying the contents up to filename_length bytes first, so the overflow still happens). This means that the wcslwr is useless, because this wcscpy_s call and this part of the code never worked to begin with.

Exploiting this is trivial, as the driver is not compiled with stack cookies (hacking like it’s 1999). A local privilege escalation exploit is attached in the original issue that sets up a fake WebDAV server to exploit the vulnerability (ROP, pivot stack to user buffer, ROP again to allocate rwx mem containing shellcode and jump to it).

Incorrect validation in UVMLiteController

NVIDIA’s driver also exposes a device at \\.\UVMLiteController that can be opened by any user (including from the sandboxed Chrome GPU process). The IOCTL handlers for this device write results directly to Irp->UserBuffer, which is the output pointer passed to DeviceIoControl (Microsoft’s documentation says not to do this). The IO control codes specify METHOD_BUFFERED, which means that the Windows kernel checks that the address range provided is writeable by the user before passing it off to the driver.

However, these handlers lacked bounds checking for the output buffer, which means that a user mode context could pass a length of 0 with any arbitrary address (which passes the ProbeForWrite check) to result in a limited write-what-where (the “what” here is limited to some specific values, including 32-bit 0xffff, 32-bit 0x1f, 32-bit 0, and 8-bit 0).
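The missing check is small. Below is a user mode model of it, under the assumption that a handler wants to return a single 32-bit result; FakeRequest and write_result_u32 are hypothetical stand-ins and not real kernel types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the parts of an IOCTL request used here. */
typedef struct {
    void  *out_buf;        /* where the result should be written */
    size_t out_len;        /* output length claimed by the caller */
    size_t bytes_written;  /* what the handler reports back */
} FakeRequest;

/* Writes a 32-bit result only if the caller's buffer can hold it. */
int write_result_u32(FakeRequest *req, uint32_t value)
{
    assert(req != NULL);
    req->bytes_written = 0;
    /* A zero-length (or too short) buffer must never be written to. */
    if (req->out_buf == NULL || req->out_len < sizeof(value))
        return -1;
    memcpy(req->out_buf, &value, sizeof(value));
    req->bytes_written = sizeof(value);
    return 0;
}
```

In an actual METHOD_BUFFERED handler the equivalent would be to compare Parameters.DeviceIoControl.OutputBufferLength from the current stack location against the size of the result, and to write into Irp->AssociatedIrp.SystemBuffer instead of Irp->UserBuffer.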

A simple privilege escalation exploit is attached in the original issue.

Remote attack vector?

Given the quantity of bugs that were discovered, I investigated whether any of them can be reached from a completely remote context without having to compromise a sandboxed process first (e.g. through WebGL in a browser, or through video acceleration).

Luckily, this didn’t appear to be the case. This wasn’t too surprising, given that the vulnerable APIs here are very low level and only reached after going through many layers (for Chrome, libANGLE -> Direct3D runtime and user mode driver -> kernel mode driver), and are mostly called with valid arguments constructed in the user mode driver.

NVIDIA’s response

The nature of the bugs found showed that NVIDIA has a lot of work to do. Their drivers contained a lot of code which probably shouldn’t be in the kernel, and most of the bugs discovered were very basic mistakes. One of their drivers (NvStreamKms.sys) also lacks very basic mitigations (stack cookies) even today.

However, their response was mostly quick and positive. Most bugs were fixed well under the deadline, and it seems that they’ve been finding some bugs on their own internally. They also indicated that they’ve been working on re-architecting their kernel drivers for security, but weren’t ready to share any concrete details.

Timeline

2016-07-26
First bug reported to NVIDIA.
2016-09-21
6 of the bugs reported were fixed silently in the 372.90 release. Discussed patch gap issues with NVIDIA.
2016-10-23
Patch released that includes fix for rest (all 14) of the bugs that were reported at the time (375.93).
2016-10-28
Public bulletin released, and P0 bugs derestricted.
2016-11-04
Realised that https://bugs.chromium.org/p/project-zero/issues/detail?id=911 wasn’t fixed properly. Notified NVIDIA.
2016-12-14
Fix for issue 911 released along with bulletin.
2017-02-14
Final 2 bugs fixed.

Patch gap

NVIDIA’s first patch, which included fixes to six of the bugs I reported, did not include a public bulletin (the release notes mention “security updates”). They had planned to release public details a month after the patch was released. We noticed this, and let them know that we didn’t consider this to be good practice, as an attacker can reverse the patch to find the vulnerabilities before the public is made aware of the details, given this large window.

While the first six bugs fixed did not have details released for more than 30 days, the remaining 8 at the time had a patch released 5 days before the first bulletin was released. It looks like NVIDIA has been trying to reduce this gap, but based on recent bulletins it appears to be inconsistent.

Conclusion

Given the large attack surface exposed by graphics drivers in the kernel and the generally lower quality of third party code, it appears to be a very rich target for finding sandbox escapes and EoP vulnerabilities. GPU vendors should try to limit this by moving as much attack surface as they can out of the kernel.
