The following is a guest blog post from an external researcher (i.e. the author is not a Google researcher).
This post is about a Chrome OS exploit I reported to Chrome VRP in September. The folks were nice enough to let me do a guest post about it, so here goes. The report includes a detailed writeup, so this post will have less detail.
1 byte overflow in a DNS library
In April I found a TCP port listening on localhost in Chrome OS. It was an HTTP proxy built into shill, the Chrome OS network manager. The proxy has now been removed as part of a fix, but its source can still be seen in an old revision: shill/http_proxy.cc. The code is simple and doesn't seem to contain any obvious exploitable bugs, although it is very liberal in what it accepts as incoming HTTP. It calls into the c-ares library for resolving DNS. There was a possible 1 byte overflow in c-ares while building the DNS packet. Here is the vulnerable code, stripped heavily from its original to make the bug more visible:
It parses dot-separated labels and writes them into a buffer allocated by malloc(). Each label is prefixed by a length byte and separating dots are omitted. The buffer length calculation is essentially just a strlen(). A dot that follows a label accounts for the length byte. The last label may or may not end with a dot. If it doesn't, then the buffer length is incremented in the first black box to account for the length byte of the last label.
Dots may be escaped though, and an escaped dot is part of a label instead of being a separator. If the last label ends with "\.", an escaped dot, then the first black box wrongly concludes that the length byte of the last label has already been accounted for. The buffer remains short by one byte and the least significant byte of dnsclass overflows. The value of dnsclass is most commonly a constant 1.
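The mismatch between the estimated and the required length can be reproduced with a small model. The sketch below is not the c-ares source: it is a simplified Python model written from the description above, and the function names `estimated_name_len` and `encoded_name_len` are made up for illustration. It assumes the estimate counts an escaped character once and only inspects the last raw character of the name.

```python
def estimated_name_len(name: str) -> int:
    """Model of the flawed estimate: a strlen-like count plus a fix-up."""
    length = 1  # terminating zero-length root label
    i = 0
    while i < len(name):
        if name[i] == "\\" and i + 1 < len(name):
            i += 1  # an escape sequence is stored as a single character
        length += 1
        i += 1
    # Flawed check: a trailing '.' is assumed to be a separator,
    # even when it was escaped and therefore belongs to the label.
    if name and name[-1] != ".":
        length += 1
    return length


def encoded_name_len(name: str) -> int:
    """Bytes actually needed: one length byte per label, its content, root label."""
    labels, current = [], []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch == "\\" and i + 1 < len(name):
            current.append(name[i + 1])  # escaped character is label content
            i += 2
            continue
        if ch == ".":
            labels.append("".join(current))  # unescaped dot ends a label
            current = []
        else:
            current.append(ch)
        i += 1
    if current:
        labels.append("".join(current))
    return sum(1 + len(label) for label in labels) + 1
```

In this model the two functions agree for `example.com` and `example.com.`, while a name ending in an escaped dot needs one byte more than the estimate provides.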
Exploit from JavaScript?
Shill runs as root. A direct exploit from JavaScript would accomplish in a single step what might otherwise take three: renderer code execution -> browser code execution -> privesc to root. This means less work and fewer points of failure. It's convenient that shill and chrome are separate processes, so if the exploit fails and crashes shill, it doesn't take down chrome, and shill is restarted automatically. The direct exploit turned out to be possible, but with difficulties.
There doesn't seem to be an obvious way to get chrome to place "\." at the end of a Host header using HTTP. So instead the exploit uses the TURN protocol with WebRTC. It encodes what looks like HTTP into the username field of TURN. TURN is a binary protocol and it can only be used because HTTP parsing by the proxy is lax.
Also, shill is listening on a random port. The exploit uses TURN again, to scan the localhost ports. It measures connection time to determine if a port was open. The scan also runs into a surprising behaviour explained nicely in here. If the source and destination TCP ports of a localhost connection attempt happen to match, then the kernel connects the socket to itself. Anything sent on a socket is received on the same socket. This causes false positives, so the scan must retry until a single port remains.
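The retry logic can be sketched as repeated filtering of a candidate set. This is only an illustration in Python, not the exploit's JavaScript: `narrow_open_ports` and its `probe` callback are hypothetical names, and the probe stands in for the timing measurement made through TURN.

```python
from typing import Callable, Iterable, Set


def narrow_open_ports(
    probe: Callable[[int], bool],
    ports: Iterable[int],
    max_rounds: int = 10,
) -> Set[int]:
    """Re-probe surviving candidates until at most one port remains.

    `probe(port)` returns True when the port looked open in this round.
    A self-connect false positive is unlikely to repeat on the same port,
    so it drops out in a later round.
    """
    candidates = set(ports)
    for _ in range(max_rounds):
        candidates = {port for port in candidates if probe(port)}
        if len(candidates) <= 1:
            break
    return candidates
```

A port that is genuinely open answers in every round, so it is the one expected to survive the filtering.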
A more difficult issue is that there aren't any decent memory grooming primitives. The proxy allocates the headers into a vector of strings. It applies minimal processing to the Via and Host headers, forwards the headers to another server and frees them. It accepts a single client at a time. The number of headers is limited to <= 0x7f, header size is <= 0x800 bytes and a TURN packet is <= 0x8000 bytes. The rough design is to do grooming over 6 connections or stages. The problem is that different stages need to reliably place allocations at the same location. This is difficult because the memory layout changes between connections in ways that are difficult to predict. The solution is to create what I call a persistent size 0x820 byte hole.
820 hole
First, it should be mentioned that shill uses dlmalloc, which is a best-fit allocator. malloc() uses the smallest free chunk that can fit the request. free() coalesces any neighboring free chunks.
Let's look at the picture of grooming at stage 1. This creates a persistent hole of 0x820 bytes:
Red means that the chunk is in use and green means free. Cyan is the large top chunk of dlmalloc. The number on each chunk is the chunk size in hex. 0x is omitted. In the rest of this post, I'll always refer to chunk sizes in hex, omitting 0x. Also, I'll often refer to chunk sizes as nouns, which is a short way of referring to the chunk with such size. I'll omit the actual grooming primitives used for these allocations, but for those interested, the Host and Via header processing in here is used.
So the first picture shows how the 820 hole is created. Four chunks of size 410 are allocated from the top chunk in [0-3]. In [5,6], the first 410 is freed and replaced with the backing allocation of the vector of headers. Even though the headers themselves are freed after the stage 1 connection closes, the backing allocation of the vector is persistent across connections. The fourth 410 is also freed and the buffer for incoming server data is placed into it. It is also persistent across stages. Then the connection closes, and the two 410 headers in the middle are freed and consolidated into 820.
Why is this 820 hole useful? It is persistent because the previous and next 410 are not freed between stages. Each stage can now start with the steps:
- allocate the 820
- eat all free holes up to the top chunk by doing tons of small allocations
- free the 820
Let's say a stage then allocates a small chunk of 100. dlmalloc uses the smallest free chunk, which is the 820, because the smaller ones were allocated. Now let's say the stage finishes and the 100 is freed. The next stage can use the same algorithm to place a 100 at the same location. This capability allows just enough grooming in stages 2 and 3 to get from a 1 byte overwrite to overlapping chunks.
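The three steps and the follow-up allocation can be tried on a toy best-fit heap. This is a rough Python model under heavy assumptions, not dlmalloc: it ignores chunk headers, bins and merging with the top chunk, and the names `ToyBestFitHeap` and `place_at_hole` are invented for this sketch.

```python
class ToyBestFitHeap:
    """Toy best-fit heap: a set of free holes plus an unbounded top region."""

    def __init__(self, holes, top):
        self.holes = dict(holes)  # start address -> size of each free hole
        self.used = {}            # start address -> size of each live allocation
        self.top = top            # address where the top region begins

    def malloc(self, size):
        fits = [(s, a) for a, s in self.holes.items() if s >= size]
        if fits:
            hole_size, addr = min(fits)      # smallest hole that fits
            del self.holes[addr]
            if hole_size > size:             # keep the remainder as a free hole
                self.holes[addr + size] = hole_size - size
        else:
            addr = self.top                  # nothing fits: carve from the top
            self.top += size
        self.used[addr] = size
        return addr

    def free(self, addr):
        self.holes[addr] = self.used.pop(addr)
        merged, prev = {}, None
        for a in sorted(self.holes):         # coalesce adjacent free holes
            if prev is not None and prev + merged[prev] == a:
                merged[prev] += self.holes[a]
            else:
                merged[a] = self.holes[a]
                prev = a
        self.holes = merged


def place_at_hole(heap, hole_size, request):
    """Run the three steps, then allocate `request`; returns (address, fillers)."""
    hole = heap.malloc(hole_size)            # 1. allocate the persistent hole
    fillers = []
    while heap.holes:                        # 2. eat every other free hole
        fillers.append(heap.malloc(min(heap.holes.values())))
    heap.free(hole)                          # 3. free the hole again
    return heap.malloc(request), fillers
```

In this model, two stages that start from different free-hole layouts still receive the same address for their 100 allocation, as long as the 820 hole is the only free chunk of at least that size.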
But things could go wrong. There might be another 820 hole by chance and different stages might allocate a different 820. Or it could happen that the tons of small allocations fail to eat all holes, because the amount of memory allocated per connection is limited. So the exploit attempts to get rid of most of the free chunks before stage 1 by combining different techniques. An interesting one perhaps is that it intentionally crashes shill. The process is restarted automatically and starts with a clean heap layout. It also uses two techniques to allocate lots of memory, more than what's allowed by the limits mentioned above. I won't go into details here though.
Overlapping chunks
Stage 2 triggers the memory corruption and stage 3 creates overlapping chunks:
First, a 1e0 chunk is allocated in [10-12] by allocating 640, then 1e0 and then freeing 640. Then the query buffer of ares is allocated into the 110 slot at [13]. This leaves a free 530 in the middle. Now is a good time to take a closer look at the dlmalloc chunk header declared here:
This header is kept in front of each chunk. The 3 least significant bits of the size field are used as flags. Most importantly, lsb = 1 indicates that the previous chunk is in use. So looking at [13], the 530 chunk has size = 531 and the 1e0 chunk has prev_size = 530. The prev_size field is only used when the previous chunk is free. Otherwise the previous chunk spans the prev_size field. This means that the size field of 530 immediately follows the query buffer in 110. The single byte that overflows the query buffer overwrites the least significant byte of the size field of 530: 0x31 -> 0x01. So the 3 flags are not affected. But the chunk size is corrupted from 530 to 500, as can be seen from [14].
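The effect of the single byte on the size field can be checked with a few lines of arithmetic. The sketch below assumes a 64-bit little-endian size field, which the text does not state explicitly, and `overflow_lsb` is a name made up for this example.

```python
import struct

FLAG_MASK = 0x7  # the 3 least significant bits of the size field are flags


def overflow_lsb(size_field: int, new_byte: int):
    """Overwrite the lowest byte of a little-endian size field.

    Returns (chunk_size, flags) as seen after the overwrite.
    Assumes an 8-byte size field.
    """
    raw = bytearray(struct.pack("<Q", size_field))
    raw[0] = new_byte  # the byte that lands just past the query buffer
    corrupted = struct.unpack("<Q", bytes(raw))[0]
    return corrupted & ~FLAG_MASK, corrupted & FLAG_MASK


size, flags = overflow_lsb(0x531, 0x01)
print(hex(size), hex(flags))
```

With a size field of 0x531 and an overflowing byte of 0x01 (the low byte of dnsclass), the flags keep the value 1 while the size drops from 0x530 to 0x500.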
What's interesting is that 1e0 doesn't know anything about this corruption and its prev_size remains 530. Now, [15-17] split the free 500 into a free 2e0 and an in-use 220. But dlmalloc is already confused at this point. When it tries to update the prev_size of the chunk following 220, it's off by 30 bytes from 1e0. And 1e0 keeps on believing that prev_size = 530. It also believes that the previous chunk is free even though 220 is in use. So now in [18], 1e0 is freed. It tries to coalesce with a previous 530 chunk. There is a 2e0 where there used to be a 530. dlmalloc is fine with that and creates a large 710 chunk that overlaps the 220.
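The chunk arithmetic behind the overlap can be written out explicitly. In this sketch all offsets are relative to the start of the 820 hole, and two things are assumptions inferred from the text rather than stated in it: the 110 query buffer sits at the start of the hole, and the free 2e0 precedes the in-use 220. The function name `overlap_layout` is invented.

```python
def overlap_layout():
    """Recompute the chunk boundaries described in the text (sizes in hex)."""
    free_start = 0x110                    # free chunk begins after the 110 query buffer
    e0_start = free_start + 0x530         # real position of the 1e0 chunk

    # After the corruption the free chunk claims to be 500 and is split
    # into a free 2e0 followed by an in-use 220.
    c220_start = free_start + 0x2E0
    c220_end = c220_start + 0x220
    misplaced_by = e0_start - c220_end    # where dlmalloc looks vs. where 1e0 is

    # free(1e0) coalesces backwards using the stale prev_size of 530.
    merged_start = e0_start - 0x530
    merged_size = 0x530 + 0x1E0
    return {
        "c220": (c220_start, c220_end),
        "misplaced_by": misplaced_by,
        "merged": (merged_start, merged_start + merged_size),
        "merged_size": merged_size,
    }


layout = overlap_layout()
print({k: (tuple(map(hex, v)) if isinstance(v, tuple) else hex(v))
       for k, v in layout.items()})
```

Under these assumptions the merged free chunk spans the whole range from 0x110 to 0x820, which fully contains the in-use 220.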
These kinds of overlapping chunks are relatively easy to exploit. They're good both for breaking ASLR and getting RCE. This technique for going from a single byte overflow to overlapping chunks is not new. Chris Evans demonstrated it in this post. I'm not sure if anyone has demonstrated it earlier.
What's not shown in the picture for simplicity is that [14-15] is the boundary between stages 2 and 3. The memory corruption of stage 2 occurs in DNS code after the Via and Host headers are processed, so no further grooming is possible. Stage 3 continues with grooming to get overlapping chunks. But the 110 query buffer is actually freed after stage 2. Stage 3 needs to reallocate a 110 chunk at the same location. The method described above is used.
ASLR
Stage 4 breaks ASLR. It first turns the overlapping 220 into a more convenient 810 chunk:
So it allocates the 820, which overwrites the header of 220 and changes the size to 810. It's interesting to note that the fd and bk pointers in the header of 220 are also overwritten. The exploit can't afford to corrupt pointers at this point because it hasn't broken ASLR. But fd and bk are only used when the chunk is free, where they form a doubly linked freelist. [21] frees the overwritten chunk and dlmalloc finds it to be of size 810.
Next, two free 2a0 chunks are crafted into the 810:
So 2a0 is allocated, 2d0 is allocated and 2a0 is freed. Now, the previously mentioned fd and bk pointers are leaked to break ASLR. The two 2a0 chunks have the same size and are placed into the same freelist. With additional grooming at the beginning of stage 4, the exploit can be sure that the two chunks are the only ones in this freelist. Well, there is also a third element linked in: the freelist head allocated statically from libc. So looking at the first 2a0, its fd and bk point to the other 2a0 and into libc. Also, the first 2a0 overlaps with 820, which contains an HTTP header that is forwarded to an attacker-controlled HTTP proxy. So that leaks two pointers that the proxy server forwards to JavaScript. The two pointers are used to calculate the address of 820 and the base address of libc.
To root
ASLR defeated, stages 5 and 6 get code execution:
The rough idea is to overwrite a BindState which holds callback information: a function pointer and arguments. The function pointer is overwritten to point to system() in libc, the base address of which is known. And the first argument is overwritten to point to a shell command string crafted into the 820 slot, the address of which is also known. BindState chunk size is 40, so now 810 is resized to 40. First, [25] frees 2d0, which consolidates to 810. For the 810 chunk to be placed into the size 40 freelist, it is removed from its current freelist by allocating it in [27]. The 810 size is overwritten to 40 by freeing 820 in [26] and reallocating it with new data in [28]. [29] frees the resized 40 and [30] allocates a BindState into it. The BindState now conveniently overlaps with 820. [31-32] reallocates 820 to corrupt the BindState to launch system(). The particular callback used triggers in 30 seconds and system() runs a shell command as root.
Persistence bug
It may sound surprising, but an attacker that has gained root on Chrome OS will lose the privileges after reboot. Chrome OS has verified boot. The bootloader in read-only memory verifies the kernel, which in turn verifies the hash of each disk block that it needs during runtime. This applies to the system partition, which contains all the executable binaries, libraries and scripts. So an attacker can't just set up a script to run at boot. But there is also a stateful partition that can be modified. It is intended for variable stuff like logs, configuration files and caches.
The way this exploit achieves persistence across reboots will sound familiar to anyone who's read about this exploit by geohot. Both use symlinks, dump_vpd_log and modprobe. The dump_vpd_log script itself was fixed to not follow symlinks, but here is a snippet from /etc/init/ui-collect-machine-info.conf:
/var is a stateful partition, so UI_MACHINE_INFO_FILE can be turned into an arbitrary symlink. dump_vpd_log --full --stdout writes /mnt/stateful_partition/unencrypted/cache/vpd/full-v2.txt to stdout. This can be used to create an arbitrary file with arbitrary contents during boot. geohot used dump_vpd_log to write a command into /proc/sys/kernel/modprobe at boot so a following modprobe would execute the command. But there are some extra problems when trying to reuse this approach.
The first issue is that /var/run is a symlink to /run, which is a tmpfs and not persistent. The exploit makes /var/run persistent by relinking it to /var/real_run. Some parts of Chrome OS get confused by that, and it is dealt with by using more symlinks. I'll skip the details here.
modprobe.d config file
So now it's possible to write into arbitrary files during boot. Another issue is that writing into /proc/sys/kernel/modprobe with dump_vpd_log won't work in this case, because the following udevadm writes into the same file and its output can't be controlled. The last write() syscall is what counts when writing into /proc/sys/kernel/modprobe. So instead, the exploit creates /run/modprobe.d, which is a configuration file for modprobe. Parsing of modprobe.d is lax. Any line starting with "install modulename command..." specifies a command to execute when that module is loaded. Any lines that fail to parse are ignored.
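Why lax parsing matters here can be illustrated with a tiny model of such a parser. This is not modprobe's actual implementation, only a Python sketch of the behaviour described above; `install_commands` is a hypothetical name and the command string in the usage lines is a placeholder.

```python
def install_commands(config_text: str) -> dict:
    """Collect `install <module> <command...>` lines, skipping everything else."""
    commands = {}
    for line in config_text.splitlines():
        parts = line.split(None, 2)  # directive, module name, rest of the line
        if len(parts) == 3 and parts[0] == "install":
            commands[parts[1]] = parts[2]
        # Lines that do not match are ignored rather than treated as errors.
    return commands


sample = (
    '"serial_number"="PLACEHOLDER"\n'
    "install some-module /path/to/placeholder-command --flag\n"
    "more unrelated output\n"
)
print(install_commands(sample))
```

Because unparseable lines are dropped, a file that is mostly unrelated key/value output still yields a usable install directive as long as one line has the expected shape.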
Late modprobe
The last problem is that ui-collect-machine-info.conf runs late during boot, when all modprobing is complete. The created configuration file is not of much use. So the last trick is to find a way to trigger modprobe late during boot. The exploit creates a device file with mknod, which has a major number 173. 173 is unknown to the kernel, which means that when something accesses the device file, the kernel will attempt to modprobe a handler module named char-major-173-0. Then it is sufficient to turn some commonly accessed file into a symlink to the device file, and each access to the file will modprobe. The exploit uses /var/lib/metrics/uma-event.
There is yet one more issue. Stateful partitions are mounted with the nodev flag, which blocks access to device files. So the device has to be moved to /dev during startup. This code in /etc/init/cryptohomed.conf is used for that:
The device is created as /mnt/stateful_partition/home/.shadow/attestation.epb and /mnt/stateful_partition/unencrypted/preserve/attestation.epb is turned into a symlink to /dev/net. This moves the device to /dev/net. /dev/net is used instead of /dev because cryptohomed changes the owner of the target attestation.epb. This would change the owner of the whole /dev directory and cause chrome to crash.
So that completes the Rube Goldberg machine of symlinks. dump_vpd_log creates the /run/modprobe.d configuration file with a command to launch as root. cryptohomed moves a device file to /dev/net. Any generated metric accesses the uma-event symlink to the device, which launches modprobe, which launches a command from modprobe.d.
Patches
By now, the issues have been fixed pretty thoroughly. c-ares was patched in Chrome OS and upstream. The HTTP proxy was removed from shill. The TURN implementation was hardened to block JavaScript from sending an arbitrary username to a localhost TCP port. And the symlink issues were fixed here, here, here and here.