
Over The Air: Exploiting Broadcom’s Wi-Fi Stack (Part 2)

Posted by Gal Beniamini

In this blog post we'll continue our journey into gaining remote kernel code execution, by means of Wi-Fi communication alone. Having previously developed a remote code execution exploit giving us control over Broadcom’s Wi-Fi SoC, we are now left with the task of exploiting this vantage point in order to further elevate our privileges into the kernel.


In this post, we’ll explore two distinct avenues for attacking the host operating system. In the first part, we’ll find and exploit vulnerabilities in the communication protocols between the Wi-Fi firmware and the host, resulting in code execution within the kernel. Along the way, we’ll also find a curious vulnerability which persisted until quite recently, using which attackers were able to directly attack the internal communication protocols without having to exploit the Wi-Fi SoC in the first place! In the second part, we’ll explore hardware design choices allowing the Wi-Fi SoC in its current configuration to fully control the host without requiring a vulnerability in the first place.

While the vulnerabilities discussed in the first part have been disclosed to Broadcom and are now fixed, the utilisation of hardware components remains as it is, and is currently not mitigated against. We hope that by publishing this research, mobile SoC manufacturers and driver vendors will be encouraged to create more secure designs, allowing a better degree of separation between the Wi-Fi SoC and the application processor.

Part 1 - The “Hard” Way

The Communication Channel


As we’ve established in the previous blog post, the Wi-Fi firmware produced by Broadcom is a FullMAC implementation. As such, it’s responsible for handling much of the complexity required for the implementation of 802.11 standards (including the majority of the MLME layer).

Yet, while many of the operations are encapsulated within the Wi-Fi chip’s firmware, some degree of control over the Wi-Fi state machine is required within the host’s operating system. Certain events cannot be handled solely by the Wi-Fi SoC, and must therefore be communicated to the host’s operating system. For example, the host must be notified of the results of a Wi-Fi scan in order to be able to present this information to the user.

In order to facilitate these cases where the host and the Wi-Fi SoC wish to communicate with one another, a special communication channel is required.

However, remember that Broadcom produces a wide range of Wi-Fi SoCs, which may be connected to the host via many different interfaces (including USB, SDIO or even PCIe). This means that relying on the underlying communication interface might require re-implementing the shared communication protocol for each of the supported channels - quite a tedious task.


Perhaps there’s an easier way? Well, one thing we can always be sure of is that regardless of the communication channel used, the chip must be able to transmit received frames back to the host. Indeed, perhaps for the very same reason, Broadcom chose to piggyback on top of this channel in order to create the communication channel between the SoC and the host.

When the firmware wishes to notify the host of an event, it does so by simply encoding a “special” frame and transmitting it to the host. These frames are marked by a “unique” EtherType value of 0x886C. They do not contain actual received data, but rather encapsulate information about firmware events which must be handled by the host’s driver.


Securing the Channel


Now, let’s switch over to the host’s side. On the host, the driver can logically be divided into several layers. The lower layers deal with the communication interface itself (such as SDIO, PCIe, etc.) and whatever transmission protocol may be tied to it. The higher layers then deal with the reception of frames, and their subsequent processing (if necessary).

First, the upper layers perform some initial processing on the received frames, such as removing encapsulated data which may have been added on top of them (for example, transmission power indicators added by the PHY module). Then, an important distinction must be made - is this a regular frame that should simply be forwarded to the relevant network interface, or is it in fact an encoded event that the host must handle?

As we’ve just seen, this distinction is easily made! Just take a look at the ethertype and check whether it has the “special” value of 0x886C. If so, handle the encapsulated event and discard the frame.
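
To make this naive check concrete, here is a minimal sketch in C. It assumes an untagged Ethernet II header, in which the EtherType sits at byte offset 12 in big-endian order; the function name has_event_ethertype and the macro names are illustrative and are not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN    14      /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define EVENT_ETHER_TYPE 0x886C  /* the "special" EtherType named in the text */

/* Returns 1 if the frame carries the event EtherType, 0 otherwise.
 * Hypothetical helper: assumes an untagged Ethernet II header. */
int has_event_ethertype(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;

    /* Refuse frames too short to hold a full Ethernet header. */
    if (frame == NULL || len < ETHER_HDR_LEN)
        return 0;

    /* The EtherType is stored big-endian at offsets 12 and 13. */
    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return ethertype == EVENT_ETHER_TYPE;
}
```

The length check comes first so that a truncated frame is never read past its end.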

Or is it?

In fact, there is no guarantee that this ethertype is unused in every single network and by every single device. Incidentally, it seems that the very same ethertype is used for the LARQ protocol used in HPNA chips (initially developed by Epigram, and subsequently purchased by Broadcom).

Regardless of this little oddity - this brings us to our first question: how can the Wi-Fi SoC and host driver distinguish between externally received frames with the 0x886C ethertype (which should be forwarded to the network interface), and internally generated event frames (which should not be received from external sources)?

This is a crucial question; the internal event channel, as we’ll see shortly, is extremely powerful and provides a huge, mostly unaudited, attack surface. If attackers are able to inject frames over-the-air that can subsequently be processed as event frames by the driver, they may very well be able to achieve code execution within the host’s operating system.

Well… Until several months prior to this research (mid 2016), the firmware made no effort to filter these frames. Any frame received as part of the data RX-path, regardless of its ethertype, was simply forwarded blindly to the host. As a result, attackers were able to remotely send frames containing the special 0x886C ethertype, which were then processed by the driver as if they were event frames created by the firmware itself!

So how was this issue addressed? After all, we’ve already established that just filtering the ethertype itself is not sufficient. Observing the differences between the pre- and post-patched versions of the firmware reveals the answer: Broadcom went for a combined patch, targeting both the Wi-Fi SoC’s firmware and the host’s driver.

The patch adds a validation method (is_wlc_event_frame) both to the firmware’s RX path, and to the driver. On the chip’s side, the validation method is called immediately before transmitting a received frame to the host. If the validation method deems the frame to be an event frame, it is discarded. Otherwise, the frame is forwarded to the driver. Then, the driver calls the exact same verification method on received frames with the 0x886C ethertype, and processes them only if they pass the same validation method. Here is a short schematic detailing this flow:

[Figure: schematic of the event frame validation flow in the firmware and the driver]

As long as the validation methods in the driver and the firmware remain identical, externally received frames cannot be processed as events by the driver. So far so good.

However… Since we already have code-execution on the Wi-Fi SoC, we can simply “revert” the patch. All it takes is for us to “patch out” the validation method in the firmware, thereby causing any received frame to once again be forwarded blindly to the host. This, in turn, allows us to inject arbitrary messages into the communication protocol between the host and the Wi-Fi chip. Moreover, since the validation method is stored in RAM, and all of RAM is marked as RWX, this is as simple as writing “MOV R0, #0; BX LR” to the function’s prologue.


The Attack Surface


As we mentioned earlier, the attack surface exposed by the internal communication channel is huge. Tracing the control flow from the entry point for handling event frames (dhd_wl_host_event), we can see that several events receive “special treatment”, and are processed independently (see wl_host_event and wl_show_host_event). Once the initial treatment is done, the frames are inserted into a queue. Events are then dequeued by a kernel thread whose sole purpose is to read events from the queue and dispatch them to their corresponding handler function. This correlation is done by using the event’s internal “event-type” field as an index into an array of handler functions, called evt_handler.
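
The dispatch pattern described here can be sketched roughly as follows. This is a simplified user-space model and not the driver's code: the names handler_table, register_handler, dispatch_event and example_len_handler are hypothetical, and the table size is simply taken from the upper bound of 144 event codes mentioned in this post.

```c
#include <stddef.h>
#include <stdint.h>

#define EVENT_CODE_MAX 144  /* assumed upper bound on event codes */

typedef int (*event_handler_fn)(const void *data, size_t len);

/* Hypothetical handler table; entries for unsupported events stay NULL. */
static event_handler_fn handler_table[EVENT_CODE_MAX];

/* Registers a handler for an event type; returns -1 if out of range. */
int register_handler(uint32_t event_type, event_handler_fn fn)
{
    if (event_type >= EVENT_CODE_MAX)
        return -1;
    handler_table[event_type] = fn;
    return 0;
}

/* Uses the event type as an index into the table and calls the handler.
 * Returns the handler's result, or -1 for out-of-range or unsupported types. */
int dispatch_event(uint32_t event_type, const void *data, size_t len)
{
    event_handler_fn fn;

    if (event_type >= EVENT_CODE_MAX)
        return -1;
    fn = handler_table[event_type];
    if (fn == NULL)
        return -1;
    return fn(data, len);
}

/* Example handler for illustration: reports the payload length. */
int example_len_handler(const void *data, size_t len)
{
    (void)data;
    return (int)len;
}
```

Since the index comes straight from the frame, the range check before the table lookup is the part that matters most in a design like this.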


While there are up to 144 different supported event codes, the host driver for Android, bcmdhd, only supports a much smaller subset of these. Nonetheless, about 35 events are supported within the driver, each including their own elaborate handlers.

Now that we’re convinced that the attack surface is large enough, we can start hunting for bugs! Unfortunately, it seems like the Wi-Fi chip is considered as “trusted”; as a result, some of the validations in the host’s driver are insufficient… Indeed, auditing the relevant handler functions and auxiliary protocol handlers outlined above, we find a substantial number of vulnerabilities.

The Vulnerability


Taking a closer look at the vulnerabilities we’ve found, we can see that they all differ from one another slightly. Some allow for relatively strong primitives, some weaker. However, most importantly, many of them have various preconditions which must be fulfilled to successfully trigger them; some are limited to certain physical interfaces, while others work only in certain configurations of the driver. Nonetheless, one vulnerability seems to be present in all versions of bcmdhd and in all configurations - if we can successfully exploit it, we should be set.

Let’s take a closer look at the event frame in question. Events of type "WLC_E_PFN_SWC" are used to indicate that a “Significant Wi-Fi Change” (SWC) has occurred within the firmware and must be handled by the host. Instead of directly handling these events, the host’s driver simply gathers all the transferred data from the firmware, and broadcasts a “vendor event” packet via Netlink to the cfg80211 layer.

More concretely, each SWC event frame transmitted by the firmware contains an array of events (of type wl_pfn_significant_net_t), a total count (total_count), and the number of events in the array (pkt_count). Since the total number of events can be quite large, it might not fit in a single frame (i.e., it might be larger than the maximal MSDU). In this case, multiple SWC event frames can be sent consecutively - their internal data will be accumulated by the driver until the total count is reached, at which point the driver will process the entire list of events.

Reading through the driver’s code, we can see that when this event code is received, an initial handler is triggered in order to deal with the event. The handler then internally calls into the "dhd_handle_swc_evt" function in order to process the event's data. Let’s take a closer look:

void *dhd_handle_swc_evt(dhd_pub_t *dhd, const void *event_data, int *send_evt_bytes)
{
    ...
    wl_pfn_swc_results_t *results = (wl_pfn_swc_results_t *)event_data;
    ...
    gscan_params = &(_pno_state->pno_params_arr[INDEX_OF_GSCAN_PARAMS].params_gscan);
    params = &(gscan_params->param_significant);
    ...
    if (!params->results_rxed_so_far) {
        if (!params->change_array) {
            /* The array is sized by the firmware-supplied total_count */
            params->change_array = (wl_pfn_significant_net_t *)
                                    kmalloc(sizeof(wl_pfn_significant_net_t) *
                                            results->total_count, GFP_KERNEL);
            ...
        }
    }
    ...
    change_array = &params->change_array[params->results_rxed_so_far];
    /* The copy length is driven by the firmware-supplied pkt_count */
    memcpy(change_array,
           results->list,
           sizeof(wl_pfn_significant_net_t) * results->pkt_count);
    params->results_rxed_so_far += results->pkt_count;
    ...
}

(where "event_data" is the arbitrary data encapsulated in the event passed in from the firmware)

As we can see above, the function first allocates an array to hold the total count of events (if one hasn’t been allocated before) and then proceeds to concatenate the encapsulated data starting from the appropriate index (results_rxed_so_far) in the buffer.

However, the handler fails to verify the relation between the total_count and the pkt_count! It simply “trusts” the assertion that the total_count is sufficiently large to store all the subsequent events passed in. As a result, an attacker with the ability to inject arbitrary event frames can specify a small total_count and a larger pkt_count, thereby triggering a simple kernel heap overflow.
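
For illustration, the missing relation could be expressed as a check along the following lines. This is only a sketch of the idea and not the fix that was actually shipped: the helper name swc_counts_valid and the 32-bit field widths are assumptions, and total_count is taken to be the count the array was originally allocated with.

```c
#include <stdint.h>

/* Returns 1 if pkt_count more entries fit into an array of total_count
 * entries of which rxed_so_far are already filled, and 0 otherwise.
 * Hypothetical helper; field widths are assumed, not taken from the driver. */
int swc_counts_valid(uint32_t total_count, uint32_t rxed_so_far,
                     uint32_t pkt_count)
{
    /* More entries already stored than the array can hold: reject. */
    if (rxed_so_far > total_count)
        return 0;

    /* Compare against the remaining room; written as a subtraction so
     * that the check itself cannot overflow. */
    return pkt_count <= total_count - rxed_so_far;
}
```

With such a check in place, a frame announcing a total_count of 2 and a pkt_count of 50 would be rejected before the memcpy is reached.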

Remote Kernel Heap Shaping


This is all well and good, but how can we leverage this primitive from a remote vantage point? As we’re not locally present on the device, we’re unable to gather any data about the current state of the heap, nor do we have address-space related information (unless, of course, we’re able to somehow leak this information). Many classic exploits targeting kernel heap overflows rely on the ability to shape the kernel’s heap, ensuring a certain state prior to triggering an overflow - an ability we also lack at the moment.

What do we know about the allocator itself? There are a few possible underlying implementations for the kmalloc allocator (SLAB, SLUB, SLOB), configurable when building the kernel. However, on the vast majority of devices, kmalloc uses “SLUB” - an unqueued “slab allocator” with per-CPU caches.

Each “slab” is simply a small region from which identically-sized allocations are carved. The first chunk in each slab contains its metadata (such as the slab’s freelist), and subsequent blocks contain the allocations themselves, with no inline metadata. There are a number of predefined slab size-classes which are used by kmalloc, typically spanning from as little as 64 bytes, to around 8KB. Unsurprisingly, the allocator uses the best-fitting slab (smallest slab that is large enough) for each allocation. Lastly, the slabs’ freelists are consumed linearly - consecutive allocations occupy consecutive memory addresses. However, if objects are freed within the slab, it may become fragmented - causing subsequent allocations to fill in “holes” within the slab instead of proceeding linearly.
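
The best-fit rule can be sketched as follows. This is a simplified model under stated assumptions: it only considers power-of-two size-classes between 64 bytes and 8KB, and the name size_class_for is made up for the example.

```c
#include <stddef.h>

#define MIN_SIZE_CLASS 64u    /* smallest class assumed in this model */
#define MAX_SIZE_CLASS 8192u  /* largest class assumed in this model  */

/* Returns the smallest power-of-two size-class that can hold `size`
 * bytes, or 0 if the request is larger than the largest class. */
size_t size_class_for(size_t size)
{
    size_t cls = MIN_SIZE_CLASS;

    if (size > MAX_SIZE_CLASS)
        return 0;

    /* Double the class until the request fits. */
    while (cls < size)
        cls <<= 1;
    return cls;
}
```

For instance, a 1000-byte request lands in the 1024-byte class under this model. Mainline Linux kernels typically also provide kmalloc-96 and kmalloc-192 caches, which this sketch ignores.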

With this in mind, let’s take a step back and analyse the primitives at hand. First, since we are able to arbitrarily specify any value in total_count, we can choose the overflown buffer’s size to be any multiple of sizeof(wl_pfn_significant_net). This means we can inhabit any slab cache size of our choosing. As such, there’s no limitation on the size of the objects we can target with the overflow. However, this is not quite enough… For starters, we still don’t know anything about the current state of the slabs themselves, nor can we trigger remote allocations in slabs of our choosing.

It seems that first and foremost, we need to find a way to remotely shape slabs. Recall, however, that there are a few obstacles we need to overcome. As SLUB maintains per-CPU caches, the affinity of the kernel thread in which the allocation is performed must be the same as the one from which the overflown buffer is allocated. Gaining a heap shaping primitive on a different CPU core will cause the allocations to be taken from different slabs. The most straightforward way to tackle this issue is to confine ourselves to heap shaping primitives which can be triggered from the same kernel thread on which the overflow occurs. This is quite a substantial constraint… In essence, it forces us to disregard allocations that occur as a result of processes that are external to the event handling itself.

Regardless, with a concrete goal in mind, we can start looking for heap shaping primitives in the registered handlers for each of the event frames. As luck would have it, after going through every handler, we come across a (single) perfect fit!

Event frames of type “WLC_E_PFN_BSSID_NET_FOUND” are handled by the handler function dhd_handle_hotlist_scan_evt. This function accumulates a linked list of scan results. Every time an event is received, its data is appended to the list. Finally, when an event arrives with a flag indicating it is the last event in the chain, the function passes on the collected list of events to be processed. Let’s take a closer look:

void *dhd_handle_hotlist_scan_evt(dhd_pub_t *dhd, const void *event_data,
                                  int *send_evt_bytes, hotlist_type_t type)
{
    struct dhd_pno_gscan_params *gscan_params;
    wl_pfn_scanresults_t *results = (wl_pfn_scanresults_t *)event_data;
    gscan_params = &(_pno_state->pno_params_arr[INDEX_OF_GSCAN_PARAMS].params_gscan);
    ...
    /* One allocation per event, sized by the firmware-supplied count */
    malloc_size = sizeof(gscan_results_cache_t) +
                      ((results->count - 1) * sizeof(wifi_gscan_result_t));
    gscan_hotlist_cache = (gscan_results_cache_t *) kmalloc(malloc_size, GFP_KERNEL);
    ...
    /* Link the new node at the head of the list */
    gscan_hotlist_cache->next = gscan_params->gscan_hotlist_found;
    gscan_params->gscan_hotlist_found = gscan_hotlist_cache;
    ...
    gscan_hotlist_cache->tot_count = results->count;
    gscan_hotlist_cache->tot_consumed = 0;
    plnetinfo = results->netinfo;
    for (i = 0; i < results->count; i++, plnetinfo++) {
        hotlist_found_array = &gscan_hotlist_cache->results[i];
        ... // Populate the entry with the sanitised network information
    }
    if (results->status == PFN_COMPLETE) {
        ... // Process the entire chain
    }
    ...
}

Awesome - looking at the function above, it seems that we’re able to repeatedly cause allocations of size { sizeof(gscan_results_cache_t) + (N-1) * sizeof(wifi_gscan_result_t) | N > 0 } (where N denotes results->count). What’s more, these allocations are performed in the same kernel thread, and their lifetime is completely controlled by us! As long as we don’t send an event with the PFN_COMPLETE status, none of the allocations will be freed.
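
As a worked example of this size formula, the sketch below computes the allocation size for a given N, and the smallest N that lands in a given size-class. The structure sizes are passed in as parameters because the real values of sizeof(gscan_results_cache_t) and sizeof(wifi_gscan_result_t) are not given here; the function names are hypothetical, and the size-classes are assumed to be powers of two, so that a class of size S serves requests in the range (S/2, S].

```c
#include <assert.h>
#include <stddef.h>

/* Size of one hotlist allocation holding n results (n must be > 0):
 * cache_size + (n - 1) * result_size, as in the formula above. */
size_t hotlist_alloc_size(size_t cache_size, size_t result_size, size_t n)
{
    assert(n > 0);
    return cache_size + (n - 1) * result_size;
}

/* Smallest n > 0 whose allocation size falls in (class_size / 2, class_size],
 * or 0 if no such n exists. Assumes power-of-two size-classes. */
size_t count_for_class(size_t cache_size, size_t result_size, size_t class_size)
{
    size_t n;

    if (result_size == 0)
        return 0;

    /* Sizes grow with n, so stop once they exceed the class size. */
    for (n = 1; hotlist_alloc_size(cache_size, result_size, n) <= class_size; n++) {
        if (hotlist_alloc_size(cache_size, result_size, n) > class_size / 2)
            return n;
    }
    return 0;
}
```

With made-up sizes of 80 bytes for the cache structure and 64 bytes per result, N = 8 gives a 528-byte allocation, which is the first to fall into a 1024-byte class.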

Before we move on, we’ll need to choose a target slab size. Ideally, we’re looking for a slab that’s relatively inactive. If other threads on the same CPU choose to allocate (or free) data from the same slab, this would add uncertainty to the slab’s state and may prevent us from successfully shaping it. After looking at /proc/slabinfo and tracing kmalloc allocations for every slab with the same affinity as our target kernel thread, it seems that the kmalloc-1024 slab is mostly inactive. As such, we’ll choose to target this slab size in our exploit.

By using the heap shaping primitive above we can start filling slabs of any given size with “gscan” objects. Each “gscan” object has a short header containing some metadata relating to the scan and a pointer to the next element in the linked list. The rest of the object is then populated by an inline array of “scan results”, carrying the actual data for this node.

Going back to the issue at hand - how can we use this primitive to craft a predictable layout?

Well, by combining the heap shaping primitive with the overflow primitive, we should be able to properly shape slabs of any size-class prior to triggering the overflow. Recall that initially any given slab may be fragmented, like so:

[Figure: a fragmented slab]

However, after triggering enough allocations (e.g. (SLAB_TOTAL_SIZE / SLAB_OBJECT_SIZE) - 1) with our heap shaping primitive, all the holes (if present) in the current slab should get populated, causing subsequent allocations of the same size-class to be placed consecutively.
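
The allocation count quoted above can be written as a small helper. This is a sketch only: the function name allocations_to_fill is hypothetical, and the slab dimensions used in the example afterwards are illustrative rather than measured on a device.

```c
#include <stddef.h>

/* Number of shaping allocations from the expression in the text:
 * (slab_total_size / object_size) - 1.
 * Returns 0 for degenerate inputs (zero-sized or oversized objects). */
size_t allocations_to_fill(size_t slab_total_size, size_t object_size)
{
    if (object_size == 0 || object_size > slab_total_size)
        return 0;
    return slab_total_size / object_size - 1;
}
```

For example, if a slab spanned 32768 bytes and held 1024-byte objects, 31 allocations would be triggered under this expression.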

Now, we can send a single crafted SWC event frame, indicating a total_count resulting in an allocation from the same target slab. However, we don’t want to trigger the overflow yet! We still have to shape the current slab before we do so. To prevent the overflow from occurring, we’ll provide a small pkt_count, thereby only partially filling in the buffer.

Finally, using the heap shaping primitive once again, we can fill the rest of the slab with more of our “gscan” objects, bringing us to the following heap state:

[Figure: the resulting heap state]

Okay… We’re getting there! As we can see above, if we choose to use the overflow primitive at this point, we could overwrite the contents of one of the “gscan” objects with our own arbitrary data. However, we’ve yet to determine exactly what kind of result that would yield…

Analysing The Constraints


In order to determine the effect of overwriting a “gscan” object, let’s take a closer look at the flow that processes a chain of “gscan” objects (that is, the operations performed after an event with a “completion” flag is received). This processing is handled by wl_cfgvendor_send_hotlist_event. The function goes over each of the events in the list, packs the event’s data into an SKB, and subsequently broadcasts the SKB over Netlink to any potential listeners.

However, the function does have a certain obstacle it needs to overcome; any given “gscan” node may be larger than the maximal size of an SKB. Therefore, the node would need to be split into several SKBs. To keep track of this information, the “tot_count” and “tot_consumed” fields in the “gscan” structure are utilised. The “tot_count” field indicates the total number of embedded scan result entries in the node’s inline array, and the “tot_consumed” field indicates the number of entries consumed (transmitted) so far.

As a result, the function slightly modifies the contents of the list while processing it. Essentially, it enforces the invariant that each processed node’s “tot_consumed” field will be modified to match its “tot_count” field. As for the data being transmitted and how it’s packed, we’ll skip those details for brevity’s sake. However, it’s important to note that other than the aforementioned side effect, the function above appears to be quite harmless (that is, no further primitives can be “mined” from it). Lastly, after all the events are packed into SKBs and transmitted to any listeners, they can finally be reclaimed. This is achieved by simply walking over the list, and calling “kfree” on each entry.
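
The two side effects described here (the count update, followed by freeing every node) can be modelled in user space roughly as follows. This is a simplified sketch, not the driver's code: struct gscan_node is a stand-in for the real header, its field widths are assumptions, the SKB packing is left out entirely, and free() stands in for kfree.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for the "gscan" node header (layout assumed). */
struct gscan_node {
    struct gscan_node *next;
    uint8_t tot_count;     /* number of results embedded in the node */
    uint8_t tot_consumed;  /* number of results transmitted so far   */
};

/* Prepends a zero-initialised node; returns the new list head
 * (or the old head if the allocation fails). */
struct gscan_node *push_node(struct gscan_node *head, uint8_t tot_count)
{
    struct gscan_node *node = calloc(1, sizeof(*node));

    if (node == NULL)
        return head;
    node->next = head;
    node->tot_count = tot_count;
    return node;
}

/* Models the processing pass: every node ends up with
 * tot_consumed == tot_count. Returns the number of nodes visited. */
size_t mark_consumed(struct gscan_node *head)
{
    size_t visited = 0;
    struct gscan_node *it;

    for (it = head; it != NULL; it = it->next) {
        it->tot_consumed = it->tot_count;
        visited++;
    }
    return visited;
}

/* Models the reclamation pass: walks the list and frees each entry.
 * Returns the number of nodes freed. */
size_t free_list(struct gscan_node *head)
{
    size_t freed = 0;

    while (head != NULL) {
        struct gscan_node *next = head->next;  /* read before freeing */
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

Both passes simply follow the next pointers until they reach NULL, so whatever the pointer refers to is treated as a list node.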

Putting it all together, where does this leave us with regards to exploitation? Assuming we choose to overwrite one of the “gscan” entries using the overflow primitive, we can modify its “next” field (or rather, must, as it is the first field in the structure) and point it at any arbitrary address. This would cause the processing function to use this arbitrary pointer as if it were an element in the list.

Due to the invariant of the processing function - after processing the crafted entry, its 7th byte (“tot_consumed”) will be modified to match its 6th byte (“tot_count”). In addition, the pointer will then be kfree-d after processing the chain. What’s more, recall that the processing function iterates over the entire list of entries. This means that the first four bytes in the crafted entry (its “next” field) must either point to another memory location containing a “valid” list node (which must then satisfy the same constraints), or must otherwise hold the value 0 (NULL), indicating that this is the last element in the list.

This doesn’t look easy… There’s quite a large number of constraints we need to consider. If we willfully choose to ignore the kfree for a moment, we could try and search for memory locations where the first four bytes are zero, and where it would be beneficial to modify the 7th byte to match the 6th. Of course, this is just the tip of the iceberg; we could repeatedly trigger the same primitive in order to repeatedly copy bytes one position to the left. Perhaps, if we were able to locate a memory address where enough zero bytes and enough bytes of our choosing are present, we could craft a target value by consecutively using these two primitives.

In order to gauge the feasibility of this approach, I’ve encoded the constraints above in a small SMT instance (using Z3), and supplied the actual heap data from the kernel, along with various target values and their corresponding locations. Additionally, since the kernel’s translation table is stored at a constant address in the kernel’s VAS and even slight modifications to it can result in exploitable conditions, its contents (along with corresponding target values) were added to the SMT instance as well. The instance was constructed to be satisfiable if and only if any of the target values could occupy any of the target locations within no more than ten “steps” (where each step is an invocation of the primitive). Unfortunately, the results were quite grim… It seemed like this approach just wasn’t powerful enough.

Moreover, while this idea might be nice in theory, it doesn’t quite work in practice. You see, calling kfree on an arbitrary address is not without side-effects of its own. For starters, the page containing the memory address must be marked as either a “slab” page, or as “compound”. This only holds true (in general) for pages actually used by the slab allocator. Trying to call kfree on an address in a page that isn’t marked as such triggers a kernel panic (thereby crashing the device).

Perhaps, instead, we can choose to ignore the other constraints and focus on the kfree? Indeed, if we are able to consistently locate an allocation whose data can be used for the purpose of the exploit, we could attempt to free that memory address, and then “re-capture” it by using our heap shaping primitive. However, this raises several additional questions. First, will we be able to consistently locate a slab-resident address? Second, even if we were to find such an address, surely it will be associated with a per-CPU cache, meaning that freeing it will not necessarily allow us to reclaim it later on. Lastly, whichever allocation we do choose to target will have to satisfy the constraints above - that is, the first four bytes must be zero, and the 7th byte will be modified to match the 6th.

However, this is where some slight trickery comes in handy! Recall that kmalloc holds a number of fixed-size caches. Yet what happens when a larger allocation is requested? It turns out that in that case, kmalloc simply obtains a number of consecutive free pages (using __get_free_pages) and returns them to the caller. This is done without any per-CPU caching. As such, if we are able to free a large allocation, we should then be able to reclaim it without having to consider which CPU allocated it in the first place.

This may solve the problem of affinity, but it still doesn’t help us locate these allocations. Unfortunately, the slab caches are allocated quite late in the kernel’s boot process, and their contents are very “noisy”. This means that even guessing a single address within a slab is quite difficult, even more so for remote attackers. However, early allocations which use the large allocation flow (that is, which are created using __get_free_pages) do consistently inhabit the same memory addresses! This is as long as they occur early enough during the kernel’s initialisation so that no non-deterministic events happen concurrently.

Combining these two facts, we can search for a large early allocation. After tracing the large allocation path and rebooting the kernel, it seems that there are indeed quite a few such allocations. To help navigate this large trace, we can also compile the Linux kernel with a special [...] previously disclosed privilege escalation to inject code into system_server, we can directly issue the ioctls required to interact with the bcmdhd driver, thus replacing the chip memory access capabilities provided by dhdutil in the above experiment. Similarly, using a previously disclosed kernel exploit, we are able to execute code within the kernel, allowing us to observe changes to the kernel’s code segments.

Putting this together, we can extract the Wi-Fi chip’s (BCM43596) ROM, inspect it, and locate the DMA function as described above. Then, we can insert the same hook, pointing any non-consumed DMA RX descriptors at the kernel code’s physical address. After installing the hook and generating some Wi-Fi traffic, we observe the following result:

[Image: output observed after installing the hook]

Once again we are able to DMA freely into the kernel (bypassing RKP’s protection along the way)! It seems that both Samsung’s Exynos 8890 SoC and Qualcomm’s Snapdragon 810 either lack SMMUs or fail to utilise them.
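The hook described above can be modelled roughly as follows. This is only a toy sketch of the idea: the descriptor layout, the field names and the target address are all hypothetical, and the real firmware hook operates on the chip’s own DMA structures rather than on Python objects.

```python
from dataclasses import dataclass


@dataclass
class RxDescriptor:
    address: int     # physical address the DMA engine will write received data to
    length: int      # size of the buffer in bytes
    consumed: bool   # True once the engine has already used this descriptor


def hook_rx_ring(ring, target_physical_address):
    """Point every non-consumed descriptor at the target and return how many changed."""
    redirected = 0
    for descriptor in ring:
        if not descriptor.consumed:
            descriptor.address = target_physical_address
            redirected += 1
    return redirected


KERNEL_CODE_PHYS = 0x80080000  # made-up placeholder, not a real device address

ring = [
    RxDescriptor(address=0x10000000, length=2048, consumed=True),
    RxDescriptor(address=0x10000800, length=2048, consumed=False),
    RxDescriptor(address=0x10001000, length=2048, consumed=False),
]
print(hook_rx_ring(ring, KERNEL_CODE_PHYS))
```

Without an SMMU restricting which physical addresses the chip may write to, nothing stops the redirected descriptors from landing incoming data on top of kernel code.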

Afterword


In conclusion, we’ve seen that the isolation between the host and the Wi-Fi SoC can, and should, be improved. While flaws exist in the communication protocols between the host and the chip, these can eventually be solved over time. However, the current lack of protection against a rogue Wi-Fi chip leaves much to be desired.

Since mobile SoCs are proprietary, it remains unknown whether current-gen SoCs are capable of facilitating such isolation. We hope that SoCs that do, indeed, have the capability to enable memory protection (for example, by means of an SMMU) choose to do so soon. For the SoCs that are incapable of doing so, perhaps this research will serve as a motivator when designing next-gen hardware.

The current lack of isolation can also have some surprising side effects. For example, Android contexts which are able to interact with the Wi-Fi firmware can leverage the Wi-Fi SoC’s DMA capability in order to directly hijack the kernel. Therefore, these contexts should be thought of as being “as privileged as the kernel”, an assumption which I believe is not currently made by Android’s security architecture.

The combination of an increasingly complex firmware and Wi-Fi’s incessant onwards march hints that firmware bugs will probably be around for quite some time. This hypothesis is supported by the fact that even a relatively shallow inspection of the firmware revealed a number of bugs, all of which were exploitable by remote attackers.

While memory isolation on its own will help defend against a rogue Wi-Fi SoC, the firmware’s defenses can also be bolstered against attacks. Currently, the firmware lacks exploit mitigations (such as stack cookies), and doesn’t make full use of the existing security mechanisms (such as the MPU). Hopefully, future versions will be able to better defend against such attacks by implementing modern exploit mitigations and utilising SoC security mechanisms.
