
Over The Air: Exploiting Broadcom’s Wi-Fi Stack (Part 2)

Posted by Gal Beniamini

In this blog post we'll continue our journey into gaining remote kernel code execution, by means of Wi-Fi communication alone. Having previously developed a remote code execution exploit giving us control over Broadcom’s Wi-Fi SoC, we are now left with the task of exploiting this vantage point in order to further elevate our privileges into the kernel.


In this post, we’ll explore two distinct avenues for attacking the host operating system. In the first part, we’ll discover and exploit vulnerabilities in the communication protocols between the Wi-Fi firmware and the host, resulting in code execution within the kernel. Along the way, we’ll also observe a curious vulnerability which persisted until quite recently, using which attackers were able to directly attack the internal communication protocols without having to exploit the Wi-Fi SoC in the first place! In the second part, we’ll explore hardware design choices allowing the Wi-Fi SoC in its current configuration to fully control the host without requiring a vulnerability in the first place.

While the vulnerabilities discussed in the first part have been disclosed to Broadcom and are now fixed, the utilisation of hardware components remains as it is, and is currently not mitigated against. We hope that by publishing this research, mobile SoC manufacturers and driver vendors will be encouraged to create more secure designs, allowing a better degree of separation between the Wi-Fi SoC and the application processor.

Part 1 - The “Hard” Way

The Communication Channel


As we’ve established in the previous blog post, the Wi-Fi firmware produced by Broadcom is a FullMAC implementation. As such, it’s responsible for handling much of the complexity required for the implementation of 802.11 standards (including the majority of the MLME layer).

Yet, while many of the operations are encapsulated within the Wi-Fi chip’s firmware, some degree of control over the Wi-Fi state machine is required within the host’s operating system. Certain events cannot be handled solely by the Wi-Fi SoC, and must therefore be communicated to the host’s operating system. For example, the host must be notified of the results of a Wi-Fi scan in order to be able to present this information to the user.

In order to facilitate these cases where the host and the Wi-Fi SoC wish to communicate with one another, a special communication channel is required.

However, recall that Broadcom produces a wide range of Wi-Fi SoCs, which may be connected to the host via many different interfaces (including USB, SDIO or even PCIe). This means that relying on the underlying communication interface might require re-implementing the shared communication protocol for each of the supported channels, which is quite a tedious task.


Perhaps there’s an easier way? Well, one thing we can always be sure of is that regardless of the communication channel used, the chip must be able to transmit received frames back to the host. Indeed, perhaps for the very same reason, Broadcom chose to piggyback on top of this channel in order to create the communication channel between the SoC and the host.

When the firmware wishes to notify the host of an event, it does so by simply encoding a “special” frame and transmitting it to the host. These frames are marked by a “unique” EtherType value of 0x886C. They do not contain actual received data, but rather encapsulate information about firmware events which must be handled by the host’s driver.
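To make the marking concrete, here is a minimal sketch of how a receiver might recognise such a frame. It assumes a plain Ethernet II header (six bytes of destination, six of source, two of EtherType); the helper name has_event_ethertype is hypothetical and does not come from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* EtherType that marks firmware event frames, as quoted in the text. */
#define ETHER_TYPE_BRCM 0x886C

/* Hypothetical helper: returns 1 if the buffer starts with an Ethernet II
 * header whose EtherType is 0x886C, and 0 otherwise. */
static int has_event_ethertype(const uint8_t *frame, size_t len)
{
    uint16_t ether_type;

    if (frame == NULL || len < 14)  /* 6 dst + 6 src + 2 EtherType */
        return 0;

    /* The EtherType is stored big-endian at offset 12. */
    ether_type = (uint16_t)((frame[12] << 8) | frame[13]);
    return ether_type == ETHER_TYPE_BRCM;
}
```

Note that this only inspects the EtherType; whether that alone is a sufficient test is a separate question.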


Securing the Channel


Now, let’s switch over to the host’s side. On the host, the driver can logically be divided into several layers. The lower layers deal with the communication interface itself (such as SDIO, PCIe, etc.) and whatever transmission protocol may be tied to it. The higher layers then deal with the reception of frames, and their subsequent processing (if necessary).

First, the upper layers perform some initial processing on the received frames, such as removing encapsulated data which may have been added on top of them (for example, transmission power indicators added by the PHY module). Then, an important distinction must be made - is this a regular frame that should simply be forwarded to the relevant network interface, or is it in fact an encoded event that the host must handle?

As we’ve just seen, this distinction is easily made! Just take a look at the ethertype and check whether it has the “special” value of 0x886C. If so, handle the encapsulated event and discard the frame.

Or is it?

In fact, there is no guarantee that this ethertype is unused in every single network and by every single device. Incidentally, it seems that the very same ethertype is used for the LARQ protocol used in HPNA chips (initially developed by Epigram, and later purchased by Broadcom).

Regardless of this little oddity - this brings us to our first question: how can the Wi-Fi SoC and host driver distinguish between externally received frames with the 0x886C ethertype (which should be forwarded to the network interface), and internally generated event frames (which should not be received from external sources)?

This is a crucial question; the internal event channel, as we’ll see shortly, is extremely powerful and provides a huge, mostly unaudited, attack surface. If attackers are able to inject frames over-the-air that can subsequently be processed as event frames by the driver, they may very well be able to achieve code execution within the host’s operating system.

Well… Until several months prior to this research (mid 2016), the firmware made no effort to filter these frames. Any frame received as part of the data RX-path, regardless of its ethertype, was simply forwarded blindly to the host. As a result, attackers were able to remotely send frames containing the special 0x886C ethertype, which were then processed by the driver as if they were event frames created by the firmware itself!

So how was this issue addressed? After all, we’ve already established that just filtering the ethertype itself is not sufficient. Observing the differences between the pre- and post-patch versions of the firmware reveals the answer: Broadcom went for a combined patch, targeting both the Wi-Fi SoC’s firmware and the host’s driver.

The patch adds a validation method (is_wlc_event_frame) both to the firmware’s RX path, and to the driver. On the chip’s side, the validation method is called immediately before transmitting a received frame to the host. If the validation method deems the frame to be an event frame, it is discarded. Otherwise, the frame is forwarded to the driver. Then, the driver calls the exact same verification method on received frames with the 0x886C ethertype, and processes them only if they pass the same validation method. Here is a short schematic detailing this flow:

[Figure: schematic of the frame validation flow in the firmware and the driver]

As long as the validation methods in the driver and the firmware remain identical, externally received frames cannot be processed as events by the driver. So far so good.

However… Since we already have code-execution on the Wi-Fi SoC, we can simply “revert” the patch. All it takes is for us to “patch out” the validation method in the firmware, thereby causing any received frame to once again be forwarded blindly to the host. This, in turn, allows us to inject arbitrary messages into the communication protocol between the host and the Wi-Fi chip. Moreover, since the validation method is stored in RAM, and all of RAM is marked as RWX, this is as simple as writing “MOV R0, #0; BX LR” to the function’s prologue.
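The two-sided check, and the effect of removing the firmware half of it, can be modelled in a few lines. This is only a sketch under stated assumptions: the frame is reduced to an EtherType plus a flag standing in for whatever structural checks the real validation routine performs, and every name here (frame_model, validate_model, validate_patched_out, rx_over_the_air) is hypothetical rather than taken from the firmware or driver.

```c
#include <stdint.h>

#define ETHER_TYPE_BRCM 0x886C

enum verdict { DROPPED_BY_FIRMWARE, FORWARDED_TO_NETWORK, HANDLED_AS_EVENT };

/* Model of a frame received over the air. 'event_layout' abstracts the
 * structural checks of the real validation routine (an assumption). */
struct frame_model {
    uint16_t ether_type;
    int event_layout;
};

typedef int (*validator_fn)(const struct frame_model *);

/* Stand-in for the validation routine shared by firmware and driver. */
static int validate_model(const struct frame_model *f)
{
    return f->ether_type == ETHER_TYPE_BRCM && f->event_layout;
}

/* What the firmware-side check amounts to once it always returns 0. */
static int validate_patched_out(const struct frame_model *f)
{
    (void)f;
    return 0;
}

/* Path of an externally received frame: the firmware drops anything that
 * looks like an event; the driver treats a frame as an event only if it
 * carries the special EtherType and passes the same validation. */
static enum verdict rx_over_the_air(const struct frame_model *f,
                                    validator_fn firmware_check,
                                    validator_fn driver_check)
{
    if (firmware_check(f))
        return DROPPED_BY_FIRMWARE;
    if (f->ether_type == ETHER_TYPE_BRCM && driver_check(f))
        return HANDLED_AS_EVENT;
    return FORWARDED_TO_NETWORK;
}
```

With both checks intact, a spoofed event frame never reaches the driver's event handling; swapping in validate_patched_out on the firmware side lets the same frame through to it.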


The Attack Surface


As we mentioned earlier, the attack surface exposed by the internal communication channel is huge. Tracing the control flow from the entry point for handling event frames (dhd_wl_host_event), we can see that several events receive “special treatment”, and are processed independently (see wl_host_event and wl_show_host_event). Once the initial treatment is done, the frames are inserted into a queue. Events are then dequeued by a kernel thread whose sole purpose is to read events from the queue and dispatch them to their corresponding handler function. This correlation is done by using the event’s internal “event-type” field as an index into an array of handler functions, called evt_handler.
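The dispatch step described here is a classic table lookup. The following is a simplified sketch, not the driver's code: the table size of 144 is the figure quoted in this post, while the handler signature and the names handler_table, register_handler, dispatch_event and example_handler are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_EVENT_CODES 144  /* upper bound on event codes quoted in the post */

/* Assumed handler signature, for illustration only. */
typedef int (*evt_handler_fn)(const void *data, size_t len);

/* Hypothetical table; entries for unsupported event codes stay NULL. */
static evt_handler_fn handler_table[MAX_EVENT_CODES];

static int register_handler(uint32_t event_type, evt_handler_fn fn)
{
    if (event_type >= MAX_EVENT_CODES)
        return -1;
    handler_table[event_type] = fn;
    return 0;
}

/* Uses the event type as an index into the table. Returns the handler's
 * result, or -1 if the type is out of range or has no handler. */
static int dispatch_event(uint32_t event_type, const void *data, size_t len)
{
    if (event_type >= MAX_EVENT_CODES || handler_table[event_type] == NULL)
        return -1;
    return handler_table[event_type](data, len);
}

/* Trivial example handler: reports the payload length. */
static int example_handler(const void *data, size_t len)
{
    (void)data;
    return (int)len;
}
```

Every populated slot in such a table is a separate parser for data supplied by the chip, which is why the number of supported events matters for the size of the attack surface.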


While there are up to 144 different supported event codes, the host driver for Android, bcmdhd, only supports a much smaller subset of these. Nonetheless, about 35 events are supported within the driver, each including their own elaborate handlers.

Now that we’re convinced that the attack surface is large enough, we can start hunting for bugs! Unfortunately, it seems like the Wi-Fi chip is considered as “trusted”; as a result, some of the validations in the host’s driver are insufficient… Indeed, auditing the relevant handler functions and auxiliary protocol handlers outlined above, we find a substantial number of vulnerabilities.

The Vulnerability


Taking a closer look at the vulnerabilities we’ve found, we can see that they all differ from one another slightly. Some allow for relatively strong primitives, some weaker. However, most importantly, many of them have various preconditions which must be fulfilled to successfully trigger them; some are limited to certain physical interfaces, while others work only in certain configurations of the driver. Nonetheless, one vulnerability seems to be present in all versions of bcmdhd and in all configurations - if we can successfully exploit it, we should be set.

Let’s take a closer look at the event frame in question. Events of type "WLC_E_PFN_SWC" are used to indicate that a “Significant Wi-Fi Change” (SWC) has occurred within the firmware and must be handled by the host. Instead of directly handling these events, the host’s driver simply gathers all the transferred data from the firmware, and broadcasts a “vendor event” packet via Netlink to the cfg80211 layer.

More concretely, each SWC event frame transmitted by the firmware contains an array of events (of type wl_pfn_significant_net_t), a total count (total_count), and the number of events in the array (pkt_count). Since the total number of events can be quite large, it might not fit in a single frame (i.e., it might be larger than the maximal MSDU). In this case, multiple SWC event frames can be sent consecutively - their internal data will be accumulated by the driver until the total count is reached, at which point the driver will process the entire list of events.

Reading through the driver’s code, we can see that when this event code is received, an initial handler is triggered in order to deal with the event. The handler then internally calls into the "dhd_handle_swc_evt" function in order to process the event's data. Let’s take a closer look:

void* dhd_handle_swc_evt(dhd_pub_t *dhd, const void *event_data, int *send_evt_bytes)
{
    ...
    wl_pfn_swc_results_t *results = (wl_pfn_swc_results_t *)event_data;
    ...
    gscan_params = &(_pno_state->pno_params_arr[INDEX_OF_GSCAN_PARAMS].params_gscan);
    params = &(gscan_params->param_significant);
    ...
    /* First frame of a sequence: allocate room for total_count entries */
    if (!params->results_rxed_so_far) {
        if (!params->change_array) {
            params->change_array = (wl_pfn_significant_net_t *)
                                    kmalloc(sizeof(wl_pfn_significant_net_t) *
                                            results->total_count, GFP_KERNEL);
            ...
        }
    }
    ...
    /* Append this frame's pkt_count entries after those already received */
    change_array = &params->change_array[params->results_rxed_so_far];
    memcpy(change_array,
           results->list,
           sizeof(wl_pfn_significant_net_t) * results->pkt_count);
    params->results_rxed_so_far += results->pkt_count;
    ...
}

(where "event_data" is the arbitrary data encapsulated in the event passed in from the firmware)

As we can see above, the function first allocates an array to hold the total count of events (if one hasn’t been allocated before) and then proceeds to concatenate the encapsulated data starting from the appropriate index (results_rxed_so_far) in the buffer.

However, the handler fails to verify the relation between the total_count and the pkt_count! It simply “trusts” the assertion that the total_count is sufficiently large to store all the subsequent events passed in. As a result, an attacker with the ability to inject arbitrary event frames can specify a small total_count and a larger pkt_count, thereby triggering a simple kernel heap overflow.
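For contrast, here is a user-space sketch of the accumulation logic with the missing relation enforced. It is an illustration of the kind of check that is absent, not the vendor's actual fix: the types net_entry_t, swc_results_t and swc_state_t are simplified stand-ins with assumed field sizes, and accumulate_swc is a hypothetical name.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the driver's types; sizes are assumptions. */
typedef struct { uint8_t raw[16]; } net_entry_t;

typedef struct {
    uint32_t pkt_count;       /* entries carried by this frame */
    uint32_t total_count;     /* entries announced for the whole sequence */
    const net_entry_t *list;  /* entries carried by this frame */
} swc_results_t;

typedef struct {
    net_entry_t *change_array;
    uint32_t capacity;        /* entries allocated from the first total_count */
    uint32_t rxed_so_far;     /* entries accumulated so far */
} swc_state_t;

/* Accumulates one frame. Returns 0 on success and -1 if the frame is
 * rejected, leaving the state unchanged in that case. */
static int accumulate_swc(swc_state_t *st, const swc_results_t *res)
{
    if (st->change_array == NULL) {
        if (res->total_count == 0)
            return -1;
        /* calloc also guards against overflow in count * size. */
        st->change_array = calloc(res->total_count, sizeof(net_entry_t));
        if (st->change_array == NULL)
            return -1;
        st->capacity = res->total_count;
        st->rxed_so_far = 0;
    }

    if (res->pkt_count == 0)
        return 0;

    /* The relation the vulnerable handler never verifies: the entries in
     * this frame must fit in the space that is still unused. */
    if (res->pkt_count > st->capacity - st->rxed_so_far)
        return -1;

    memcpy(&st->change_array[st->rxed_so_far], res->list,
           sizeof(net_entry_t) * res->pkt_count);
    st->rxed_so_far += res->pkt_count;
    return 0;
}
```

Without the capacity comparison, the memcpy length is governed entirely by pkt_count, which is exactly the overflow described above.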

Remote Kernel Heap Shaping


This is all well and good, but how can we leverage this primitive from a remote vantage point? As we’re not locally present on the device, we’re unable to gather any data about the current state of the heap, nor do we have address-space related information (unless, of course, we’re able to somehow leak this information). Many classic exploits targeting kernel heap overflows rely on the ability to shape the kernel’s heap, ensuring a certain state prior to triggering an overflow - an ability we also lack at the moment.

What do we know about the allocator itself? There are a few possible underlying implementations for the kmalloc allocator (SLAB, SLUB, SLOB), configurable when building the kernel. However, on the vast majority of devices, kmalloc uses “SLUB” - an unqueued “slab allocator” with per-CPU caches.

Each “slab” is simply a small region from which identically-sized allocations are carved. The first chunk in each slab contains its metadata (such as the slab’s freelist), and subsequent blocks contain the allocations themselves, with no inline metadata. There are a number of predefined slab size-classes which are used by kmalloc, typically spanning from as little as 64 bytes, to around 8KB. Unsurprisingly, the allocator uses the best-fitting slab (smallest slab that is large enough) for each allocation. Lastly, the slabs’ freelists are consumed linearly - consecutive allocations occupy consecutive memory addresses. However, if objects are freed within the slab, it may become fragmented - causing subsequent allocations to fill in “holes” within the slab instead of proceeding linearly.

With this in mind, let’s take a step back and analyse the primitives at hand. First, since we are able to arbitrarily specify any value in total_count, we can choose the overflown buffer’s size to be any multiple of sizeof(wl_pfn_significant_net_t). This means we can inhabit any slab cache size of our choosing. As such, there’s no limitation on the size of the objects we can target with the overflow. However, this is not quite enough… For starters, we still don’t know anything about the current state of the slabs themselves, nor can we trigger remote allocations in slabs of our choosing.
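The mapping from a chosen total_count to a size class can be sketched as follows. This is a rough model under stated assumptions: the size classes are taken to be powers of two from 64 bytes to 8 KiB (real kernels may also provide intermediate classes such as 96 and 192 bytes), and the 48-byte entry size used in the usage note is made up for illustration. The function names are hypothetical.

```c
#include <stddef.h>

/* Best-fitting size class for a request, assuming power-of-two classes
 * from 64 bytes to 8 KiB. Returns 0 if the request is empty or too large
 * to be served from these caches. */
static size_t kmalloc_class(size_t request)
{
    size_t cls;

    if (request == 0 || request > 8192)
        return 0;
    for (cls = 64; cls < request; cls <<= 1)
        ;
    return cls;
}

/* Size class that an array of total_count entries would land in. */
static size_t class_for_total_count(size_t total_count, size_t entry_size)
{
    if (entry_size != 0 && total_count > (size_t)-1 / entry_size)
        return 0;  /* the multiplication would overflow */
    return kmalloc_class(total_count * entry_size);
}
```

For instance, with an assumed 48-byte entry, a total_count of 20 requests 960 bytes and would be served from the 1024-byte class in this model.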

It seems that first and foremost, we need to find a way to remotely shape slabs. Recall, however, that there are a few obstacles we need to overcome. As SLUB maintains per-CPU caches, the affinity of the kernel thread in which the allocation is performed must be the same as the one from which the overflown buffer is allocated. Gaining a heap shaping primitive on a different CPU core will cause the allocations to be taken from different slabs. The most straightforward way to tackle this issue is to confine ourselves to heap shaping primitives which can be triggered from the same kernel thread on which the overflow occurs. This is quite a substantial constraint… In essence, it forces us to disregard allocations that occur as a result of processes that are external to the event handling itself.

Regardless, with a concrete goal in mind, we can start looking for heap shaping primitives in the registered handlers for each of the event frames. As luck would have it, after going through every handler, we come across a (single) perfect fit!

Event frames of type “WLC_E_PFN_BSSID_NET_FOUND” are handled by the handler function dhd_handle_hotlist_scan_evt. This function accumulates a linked list of scan results. Every time an event is received, its data is appended to the list. Finally, when an event arrives with a flag indicating it is the last event in the chain, the function passes on the collected list of events to be processed. Let’s take a closer look:

void *dhd_handle_hotlist_scan_evt(dhd_pub_t *dhd, const void *event_data,
                                  int *send_evt_bytes, hotlist_type_t type)
{
    struct dhd_pno_gscan_params *gscan_params;
    wl_pfn_scanresults_t *results = (wl_pfn_scanresults_t *)event_data;
    gscan_params = &(_pno_state->pno_params_arr[INDEX_OF_GSCAN_PARAMS].params_gscan);
    ...
    /* Allocation size is derived from the event's own count field */
    malloc_size = sizeof(gscan_results_cache_t) +
                      ((results->count - 1) * sizeof(wifi_gscan_result_t));
    gscan_hotlist_cache = (gscan_results_cache_t *) kmalloc(malloc_size, GFP_KERNEL);
    ...
    /* Link the new node at the head of the list */
    gscan_hotlist_cache->next = gscan_params->gscan_hotlist_found;
    gscan_params->gscan_hotlist_found = gscan_hotlist_cache;
    ...
    gscan_hotlist_cache->tot_count = results->count;
    gscan_hotlist_cache->tot_consumed = 0;
    plnetinfo = results->netinfo;
    for (i = 0; i < results->count; i++, plnetinfo++) {
        hotlist_found_array = &gscan_hotlist_cache->results[i];
        ... //Populate the entry with the sanitised network information
    }
    if (results->status == PFN_COMPLETE) {
        ... //Process the entire chain
    }
    ...
}

Awesome - looking at the function above, it seems that we’re able to repeatedly cause allocations of size { sizeof(gscan_results_cache_t) + (N-1) * sizeof(wifi_gscan_result_t) | N > 0 } (where N denotes results->count). What’s more, these allocations are performed in the same kernel thread, and their lifetime is completely controlled by us! As long as we don’t send an event with the PFN_COMPLETE status, none of the allocations will be freed.
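Given the size formula above, picking a count that lands in a particular size class is simple arithmetic. The sketch below assumes hypothetical structure sizes (an 80-byte header that already includes one result, and 72-byte results); the real sizes depend on the driver version and are not stated in this post. The helper names are made up for illustration.

```c
#include <stddef.h>

/* Allocation size for n results (n > 0): the header already contains one
 * result, so only n - 1 additional results are added. */
static size_t hotlist_alloc_size(size_t header_size, size_t result_size, size_t n)
{
    return header_size + (n - 1) * result_size;
}

/* Largest n > 0 whose allocation still fits in 'limit' bytes, or 0 if not
 * even a single-result allocation fits. */
static size_t max_count_for_limit(size_t header_size, size_t result_size, size_t limit)
{
    if (result_size == 0 || header_size > limit)
        return 0;
    return (limit - header_size) / result_size + 1;
}
```

With the assumed sizes, a count of 14 yields a 1016-byte allocation, which is above 512 bytes and at most 1024 bytes, so it would fall in a 1024-byte size class; a count of 15 would already exceed it.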

Before we move on, we’ll need to choose a target slab size. Ideally, we’re looking for a slab that’s relatively inactive. If other threads on the same CPU choose to allocate (or free) data from the same slab, this would add uncertainty to the slab’s state and may prevent us from successfully shaping it. After looking at /proc/slabinfo and tracing kmalloc allocations for every slab with the same affinity as our target kernel thread, it seems that the kmalloc-1024 slab is mostly inactive. As such, we’ll choose to target this slab size in our exploit.

By using the heap shaping primitive above we can start filling slabs of any given size with “gscan” objects. Each “gscan” object has a short header containing some metadata relating to the scan and a pointer to the next element in the linked list. The rest of the object is then populated by an inline array of “scan results”, carrying the actual data for this node.

Going back to the issue at hand - how can we use this primitive to craft a predictable layout?

Well, by combining the heap shaping primitive with the overflow primitive, we should be able to properly shape slabs of any size-class prior to triggering the overflow. Recall that initially any given slab may be fragmented, like so:

[Figure: a fragmented slab]

However, after triggering enough allocations (e.g. (SLAB_TOTAL_SIZE / SLAB_OBJECT_SIZE) - 1) with our heap shaping primitive, all the holes (if present) in the current slab should get populated, causing subsequent allocations of the same size-class to be placed consecutively.
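The reasoning behind that allocation count can be checked with a toy model. The sketch below is a deliberate simplification: it assumes a 4096-byte slab of 1024-byte objects (real slabs are often larger), and it fills the lowest free slot first, whereas a real allocator follows its freelist order. The names slab_model, free_slots and spray are hypothetical.

```c
#include <stddef.h>

#define SLAB_TOTAL_SIZE  4096   /* assumed slab size for this model */
#define SLAB_OBJECT_SIZE 1024   /* assumed object size for this model */
#define SLAB_SLOTS (SLAB_TOTAL_SIZE / SLAB_OBJECT_SIZE)

/* One partially used slab; used[i] is non-zero if slot i is occupied. */
struct slab_model {
    unsigned char used[SLAB_SLOTS];
};

/* Number of holes left in the slab. */
static size_t free_slots(const struct slab_model *s)
{
    size_t i, n = 0;

    for (i = 0; i < SLAB_SLOTS; i++)
        if (!s->used[i])
            n++;
    return n;
}

/* Performs 'count' allocations and returns how many landed in this slab.
 * Allocations that find the slab full would be served from a fresh slab,
 * which this model does not track. */
static size_t spray(struct slab_model *s, size_t count)
{
    size_t i, landed = 0;

    while (count-- > 0) {
        for (i = 0; i < SLAB_SLOTS; i++) {
            if (!s->used[i]) {
                s->used[i] = 1;
                landed++;
                break;
            }
        }
    }
    return landed;
}
```

A slab that is in use holds at least one live object, so it has at most SLAB_SLOTS - 1 holes; spraying that many objects is therefore enough to leave none in this model.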

Now, we can send a single crafted SWC event frame, indicating a total_count resulting in an allocation from the same target slab. However, we don’t want to trigger the overflow yet! We still have to shape the current slab before we do so. To prevent the overflow from occurring, we’ll provide a small pkt_count, thereby only partially filling in the buffer.

Finally, using the heap shaping primitive once again, we can fill the rest of the slab with more of our “gscan” objects, bringing us to the following heap state:

[Figure: the shaped slab, with the partially filled SWC buffer followed by “gscan” objects]

Okay… We’re getting there! As we can see above, if we choose to use the overflow primitive at this point, we could overwrite the contents of one of the “gscan” objects with our own arbitrary data. However, we’ve yet to determine exactly what kind of result that would yield…

Analysing The Constraints


In order to determine the effect of overwriting a “gscan” object, let’s take a closer look at the flow that processes a chain of “gscan” objects (that is, the operations performed after an event with a “completion” flag is received). This processing is handled by wl_cfgvendor_send_hotlist_event. The function goes over each of the events in the list, packs the event’s data into an SKB, and subsequently broadcasts the SKB over Netlink to any potential listeners.

However, the function does have a certain obstacle it needs to overcome; any given “gscan” node may be larger than the maximal size of an SKB. Therefore, the node would need to be split into several SKBs. To keep track of this information, the “tot_count” and “tot_consumed” fields in the “gscan” structure are utilised. The “tot_count” field indicates the total number of embedded scan result entries in the node’s inline array, and the “tot_consumed” field indicates the number of entries consumed (transmitted) so far.

As a result, the function slightly modifies the contents of the list while processing it. Essentially, it enforces the invariant that each processed node’s “tot_consumed” field will be modified to match its “tot_count” field. As for the data being transmitted and how it’s packed, we’ll skip those details for brevity’s sake. However, it’s important to note that other than the aforementioned side effect, the function above appears to be quite harmless (that is, no further primitives can be “mined” from it). Lastly, after all the events are packed into SKBs and transmitted to any listeners, they can finally be reclaimed. This is achieved by simply walking over the list, and calling “kfree” on each entry.

Putting it all together, where does this leave us with regards to exploitation? Assuming we choose to overwrite one of the “gscan” entries using the overflow primitive, we can modify its “next” field (or rather, must, as it is the first field in the structure) and point it at any arbitrary address. This would cause the processing function to use this arbitrary pointer as if it were an element in the list.

Due to the invariant of the processing function - after processing the crafted entry, its seventh byte (“tot_consumed”) will be modified to match its sixth byte (“tot_count”). In addition, the pointer will then be kfree-d after processing the chain. What’s more, recall that the processing function iterates over the entire list of entries. This means that the first four bytes in the crafted entry (its “next” field) must either point to another memory location containing a “valid” list node (which must then satisfy the same constraints), or must otherwise hold the value 0 (NULL), indicating that this is the last element in the list.
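The side effects listed above can be summarised in a small model of the chain processing. This is a sketch of the observable effects only, not the driver's code: node_model keeps just the fields discussed in this section, the 'freed' flag is a model-only stand-in for kfree, and process_chain is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced model of a list node; only the fields discussed here. */
struct node_model {
    struct node_model *next;
    uint8_t tot_count;
    uint8_t tot_consumed;
    int freed;              /* model-only flag standing in for kfree() */
};

/* Models the effects of processing a chain: every reachable node has
 * tot_consumed set to tot_count, and is then released. Returns the number
 * of nodes visited. */
static size_t process_chain(struct node_model *head)
{
    struct node_model *n, *next;
    size_t visited = 0;

    /* Pass 1: "transmit" each node, which enforces the invariant. */
    for (n = head; n != NULL; n = n->next) {
        n->tot_consumed = n->tot_count;
        visited++;
    }

    /* Pass 2: walk the list again and release each entry. */
    for (n = head; n != NULL; n = next) {
        next = n->next;
        n->freed = 1;
    }
    return visited;
}
```

If an overwritten node's next pointer is aimed at some other memory, that memory is treated as a node: one of its bytes is copied over its neighbour, it is released, and its own leading pointer decides whether the walk continues or stops.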

This doesn’t look easy… There’s quite a large number of constraints we need to consider. If we willfully choose to ignore the kfree for a moment, we could try and search for memory locations where the first four bytes are zero, and where it would be beneficial to modify the seventh byte to match the sixth. Of course, this is just the tip of the iceberg; we could repeatedly trigger the same primitive in order to repeatedly copy bytes one position to the left. Perhaps, if we were able to locate a memory address where enough zero bytes and enough bytes of our choosing are present, we could craft a target value by consecutively using these two primitives.

In order to gauge the feasibility of this approach, I’ve encoded the constraints above in a small SMT instance (using Z3), and supplied the actual heap data from the kernel, along with various target values and their corresponding locations. Additionally, since the kernel’s translation table is stored at a constant address in the kernel’s VAS and even slight modifications to it can result in exploitable conditions, its contents (along with corresponding target values) were added to the SMT instance as well. The instance was constructed to be satisfiable if and only if any of the target values could occupy any of the target locations within no more than ten “steps” (where each step is an invocation of the primitive). Unfortunately, the results were quite grim… It seemed like this approach just wasn’t powerful enough.

Moreover, while this idea might be nice in theory, it doesn’t quite work in practice. You see, calling kfree on an arbitrary address is not without side-effects of its own. For starters, the page containing the memory address must be marked as either a “slab” page, or as “compound”. This only holds true (in general) for pages actually used by the slab allocator. Trying to call kfree on an address in a page that isn’t marked as such triggers a kernel panic (thereby crashing the device).

Perhaps, instead, we can choose to ignore the other constraints and focus on the kfree? Indeed, if we are able to consistently locate an allocation whose data can be used for the purpose of the exploit, we could attempt to free that memory address, and then “re-capture” it by using our heap shaping primitive. However, this raises several additional questions. First, will we be able to consistently locate a slab-resident address? Second, even if we were to find such an address, surely it will be associated with a per-CPU cache, meaning that freeing it will not necessarily allow us to reclaim it later on. Lastly, whichever allocation we do choose to target will have to satisfy the constraints above - that is, the first four bytes must be zero, and the seventh byte will be modified to match the sixth.

However, this is where some slight trickery comes in handy! Recall that kmalloc holds a number of fixed-size caches. Yet what happens when a larger allocation is requested? It turns out that in that case, kmalloc simply obtains a number of consecutive free pages (using __get_free_pages) and returns them to the caller. This is done without any per-CPU caching. As such, if we are able to free a large allocation, we should then be able to reclaim it without having to consider which CPU allocated it in the first place.
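The split between cached and page-backed allocations can be pictured with a small helper. This is a model only, under assumed constants: a 4 KiB page and an 8 KiB largest cache (matching the "around 8KB" figure mentioned earlier); the actual thresholds depend on the kernel configuration. The function large_alloc_order and both constants are hypothetical names.

```c
#include <stddef.h>

#define PAGE_SIZE_MODEL      4096u  /* assumed page size */
#define MAX_CACHE_SIZE_MODEL 8192u  /* assumed largest fixed-size cache */

/* Returns -1 if a request of 'size' bytes would be served from one of the
 * fixed-size caches, -2 if it is unreasonably large for this model, and
 * otherwise the page order: the smallest k such that 2^k consecutive
 * pages are enough to hold the request. */
static int large_alloc_order(size_t size)
{
    size_t span = PAGE_SIZE_MODEL;
    int order = 0;

    if (size <= MAX_CACHE_SIZE_MODEL)
        return -1;
    if (size > ((size_t)1 << 30))
        return -2;
    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}
```

In this model a request just above 8 KiB takes four consecutive pages (order 2), and nothing about it is tied to the CPU that made the request.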

This may solve the problem of affinity, but it still doesn’t help us locate these allocations. Unfortunately, the slab caches are allocated quite late in the kernel’s boot process, and their contents are very “noisy”. This means that even guessing a single address within a slab is quite difficult, even more so for remote attackers. However, early allocations which use the large allocation flow (that is, which are created using __get_free_pages) do consistently inhabit the same memory addresses! This is as long as they occur early enough during the kernel’s initialisation so that no non-deterministic events happen concurrently.

Combining these two facts, we can search for a large early allocation. After tracing the large allocation path and rebooting the kernel, it seems that there are indeed quite a few such allocations. To help navigate this large trace, we can also compile the Linux kernel with a special [...] previously disclosed privilege escalation to inject code into system_server, we can directly issue the ioctls required to interact with the bcmdhd driver, thus replacing the chip memory access capabilities provided by dhdutil in the above experiment. Similarly, using a previously disclosed kernel exploit, we are able to execute code within the kernel, allowing us to observe changes to the kernel’s code segments.

Putting this together, we can extract the Wi-Fi chip’s (BCM43596) ROM, inspect it, and locate the DMA function as described above. Then, we can insert the same hook, pointing any non-consumed DMA RX descriptors at the kernel code’s physical address. After installing the hook and generating some Wi-Fi traffic, we observe the following result:

[Image: result of the experiment]

Once again we are able to DMA freely into the kernel (bypassing RKP’s protection along the way)! It seems that both Samsung’s Exynos 8890 SoC and Qualcomm’s Snapdragon 810 either lack SMMUs or fail to utilise them.
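To make the role of the SMMU concrete, here is a toy model of the experiment. It is a sketch only: the descriptor layout, the names `retarget_unconsumed`, `dma_write` and `ToySmmu`, and all addresses are hypothetical and not taken from the firmware or from any SoC.

```python
class DmaFault(Exception):
    """Raised when the toy SMMU rejects a device access."""


class ToySmmu:
    def __init__(self, allowed_ranges):
        self.allowed_ranges = allowed_ranges  # (start, end) ranges the device may touch

    def check(self, addr, length):
        for start, end in self.allowed_ranges:
            if start <= addr and addr + length <= end:
                return
        raise DmaFault(hex(addr))


def retarget_unconsumed(ring, target):
    """Model of the hook: point every not-yet-consumed RX descriptor at target."""
    for desc in ring:
        if not desc["consumed"]:
            desc["addr"] = target


def dma_write(memory, desc, payload, smmu=None):
    """Write payload to the descriptor's address, optionally gated by an SMMU."""
    if smmu is not None:
        smmu.check(desc["addr"], len(payload))
    memory[desc["addr"]] = payload
    desc["consumed"] = True


KERNEL_TEXT = 0x80080000  # made-up physical address of kernel code
RX_BUFFERS = 0x90000000   # made-up region the host intended for RX data

ring = [{"addr": RX_BUFFERS, "consumed": True},
        {"addr": RX_BUFFERS + 0x800, "consumed": False}]
retarget_unconsumed(ring, KERNEL_TEXT)

memory = {}
dma_write(memory, ring[1], b"AAAA")  # no SMMU: the write lands on kernel code
print(KERNEL_TEXT in memory)
```

Passing `smmu=ToySmmu([(RX_BUFFERS, RX_BUFFERS + 0x10000)])` to `dma_write` makes the same retargeted write raise `DmaFault`, which is the behaviour a correctly configured SMMU would be expected to provide.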

Afterword


In conclusion, we’ve seen that the isolation between the host and the Wi-Fi SoC can, and should, be improved. While flaws exist in the communication protocols between the host and the chip, these can eventually be solved over time. However, the current lack of protection against a rogue Wi-Fi chip leaves much to be desired.

Since mobile SoCs are proprietary, it remains unknown whether current-gen SoCs are capable of facilitating such isolation. We hope that SoCs that do, indeed, have the capability to enable memory protection (for example, by means of an SMMU), choose to do so soon. For the SoCs that are incapable of doing so, perhaps this research will serve as a motivator when designing next-gen hardware.

The current lack of isolation can also have some surprising side effects. For example, Android contexts which are able to interact with the Wi-Fi firmware can leverage the Wi-Fi SoC’s DMA capability in order to directly hijack the kernel. Therefore, these contexts should be thought of as being “as privileged as the kernel”, an assumption which I believe is not currently made by Android’s security architecture.

The combination of an increasingly complex firmware and Wi-Fi’s incessant onwards march hints that firmware bugs will probably be around for quite some time. This hypothesis is supported by the fact that even a relatively shallow inspection of the firmware revealed a number of bugs, all of which were exploitable by remote attackers.

While memory isolation on its own will help defend against a rogue Wi-Fi SoC, the firmware’s defenses can also be bolstered against attacks. Currently, the firmware lacks exploit mitigations (such as stack cookies), and doesn’t make full use of the existing security mechanisms (such as the MPU). Hopefully, future versions will be able to better defend against such attacks by implementing modern exploit mitigations and utilising SoC security mechanisms.
