Notes On Windows Uniscribe Fuzzing

Posted by Mateusz Jurczyk of Google

Among the total of 119 vulnerabilities with CVEs fixed by Microsoft in the [...] [1][2] and fuzzing [3][4]. However, what makes this effort a bit different from the previous ones is the fact that Uniscribe is a little-known user-mode component, which had not been widely recognized as a viable attack vector before, as opposed to the kernel-mode font implementations included in the win32k.sys and ATMFD.DLL drivers. In this post, we outline a brief history and description of Uniscribe, explain how we approached at-scale fuzzing of the library, and highlight some of the more interesting discoveries we have made so far. All the raw reports of the bugs we’re referring to (as they were submitted to Microsoft), together with the corresponding proof-of-concept samples, can be found in the official bug tracker [5]. Enjoy!

Introduction

It was November 2016 when we started yet another iteration of our Windows font fuzzing job (whose architecture was thoroughly described in [4]). At that point, the kernel attack surface was mostly fuzz-clean with regard to the techniques we were using, but we still like to play with the configuration and input corpus from time to time to see if we can squeeze out any more bugs with the existing infrastructure. What we ended up with several days later was a bunch of samples which supposedly crashed the guest Windows system running inside of Bochs. When we fed them to our reproduction pipeline, none of the bugchecks occurred again, for unclear reasons. As disappointing as that was, there was also one interesting and unexpected result: for one of the test cases, the user-mode harness itself crashed, without bringing the whole OS down at the same time. This could indicate either that there was a bug in our code, or that there was some unanticipated font parsing going on in ring-3. When we started digging deeper, we found out that the unhandled exception took place in the following context:

(4464.11b4): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=0933d8bf ebx=00000000 ecx=09340ffc edx=00001b9f esi=0026ecac edi=00000009
eip=752378f3 esp=0026ec24 ebp=0026ec2c iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246
USP10!ScriptPositionSingleGlyph+0x28533:
752378f3 668b4c5002 mov cx,word ptr [eax+edx*2+2] ds:002b:09340fff=????

Until that moment, we didn’t fully realize that our tools were triggering any font-handling code beyond the well-known kernel implementation (despite some related bugs having been publicly fixed in the past, e.g. CVE-2016-7274 [6]). As a result, the fuzzing system was not prepared to catch user-mode faults, and thus any such crashes had remained completely undetected in favor of system bugchecks, which caused full machine restarts.

We quickly determined that the usp10.dll library corresponded to “Uniscribe Unicode script processor” (in Microsoft’s own words) [7]. It is a relatively large module (600-800 kB depending on system version and bitness) responsible for rendering Unicode-encoded text, as the name suggests. From a security perspective, it’s important that the code base dates back to Windows 2000, and includes a C++ implementation of the parsing of various complex TrueType/OpenType structures, in addition to what is already implemented in the kernel. The specific tables that Uniscribe touches on are primarily Advanced Typography Tables (“GDEF”, “GSUB”, “GPOS”, “BASE”, “JSTF”), but also “OS/2”, “cmap” and “maxp” to some extent. What’s equally significant is that the code can be reached simply by calling DrawText [8] or another equivalent API with Unicode-encoded text and an attacker-controlled font. Since no special calls other than the typical ones are necessary to execute the most exposed areas of the library, it makes for a great attack vector in applications which use GDI to render text with fonts originating from untrusted sources. This is also evidenced by the stack trace of the original crash, and the fact that it occurred in a program which didn’t include any usp10-specific code:

0:000> kb
ChildEBP RetAddr
0026ec2c 09340ffc USP10!otlChainRuleSetTable::rule+0x13
0026eccc 0133d7d2 USP10!otlChainingLookup::apply+0x7d3
0026ed48 0026f09c USP10!ApplyLookup+0x261
0026ef4c 0026f078 USP10!ApplyFeatures+0x481
0026ef98 09342f40 USP10!SubstituteOtlGlyphs+0x1bf
0026efd4 0026f0b4 USP10!SubstituteOtlChars+0x220
0026f250 0026f370 USP10!HebrewEngineGetGlyphs+0x690
0026f310 0026f370 USP10!ShapingGetGlyphs+0x36a
0026f3fc 09316318 USP10!ShlShape+0x2ef
0026f440 09316318 USP10!ScriptShape+0x15f
0026f4a0 0026f520 USP10!RenderItemNoFallback+0xfa
0026f4cc 0026f520 USP10!RenderItemWithFallback+0x104
0026f4f0 09316124 USP10!RenderItem+0x22
0026f534 2d011da2 USP10!ScriptStringAnalyzeGlyphs+0x1e9
0026f54c 0000000a USP10!ScriptStringAnalyse+0x284
0026f598 0000000a LPK!LpkStringAnalyse+0xe5
0026f694 00000000 LPK!LpkCharsetDraw+0x332
0026f6c8 00000000 LPK!LpkDrawTextEx+0x40
0026f708 00000000 USER32!DT_DrawStr+0x13c
0026f754 0026fa30 USER32!DT_GetLineBreak+0x78
0026f800 0000000a USER32!DrawTextExWorker+0x255
0026f824 ffffffff USER32!DrawTextExW+0x1e

As can be seen here, the Uniscribe functionality was invoked internally by user32.dll through the lpk.dll (Language Pack) library. As soon as we learned about this new attack vector, we jumped at the first chance to fuzz it. Most of the infrastructure was already in place, since user- and kernel-mode font fuzzing share a large number of the pieces. The extra work that we had to do was mostly related to filtering the input corpus, fiddling with the mutator configuration, adjusting the system configuration and implementing logic for the detection of user-mode crashes (both in the test harness and in the Bochs instrumentation). All of these steps are discussed in detail below. After a few days, we had everything working as planned, and after another couple, there were already over 80 crashes at unique addresses waiting for triage. Below is a summary of the issues that were found in the first fuzzing run and reported to Microsoft in December 2016.

Results at a glance

Since 80 was still a fairly manageable number of crashes to triage manually, we tried to reproduce each of them by hand, deduplicating them and writing down their details at the same time. When we finished, we ended up with 8 high-severity issues that could potentially allow remote code execution:

Tracker ID	Memory access type at crash	Crashing function	CVE
1022	Invalid write of n bytes (memcpy)	usp10!otlList::insertAt	CVE-2017-0108
1023	Invalid read / write of 2 bytes	usp10!AssignGlyphTypes	CVE-2017-0084
1025	Invalid write of n bytes (memset)	usp10!otlCacheManager::GlyphsSubstituted	CVE-2017-0086
1026	Invalid write of n bytes (memcpy)	usp10!MergeLigRecords	CVE-2017-0087
1027	Invalid write of 2 bytes	usp10!ttoGetTableData	CVE-2017-0088
1028	Invalid write of 2 bytes	usp10!UpdateGlyphFlags	CVE-2017-0089
1029	Invalid write of n bytes	usp10!BuildFSM and nearby functions	CVE-2017-0090
1030	Invalid write of n bytes	usp10!FillAlternatesList	CVE-2017-0072

All of the bugs but one were triggered through a standard DrawText call and resulted in heap memory corruption. The one exception was issue #1030, which resided in a documented Uniscribe-specific ScriptGetFontAlternateGlyphs API function. The routine is responsible for retrieving a list of alternate glyphs for a specified character, and the interesting fact about the bug is that it wasn’t a problem with operating on any internal structures. Instead, the function failed to honor the value of the cMaxAlternates argument, and could therefore write more output data to the pAlternateGlyphs buffer than was allowed by the function caller. This meant that the buffer overflow was not specific to any particular memory type: depending on what pointer the client passed in, the overflow would take place on the stack, heap or static memory. The exploitability of such a bug would greatly depend on the program design and the compilation options used to build it. We must admit, however, that it is unclear what the real-world clients of the function are, and whether any of them would meet the requirements to become a viable attack target.

Furthermore, we extracted 27 unique crashes caused by invalid memory reads from non-NULL addresses, which could potentially lead to information disclosure of secrets stored in the process address space. Due to the large volume of these crashes, we were unable to analyze each of them in much detail or perform any advanced deduplication. Instead, we partitioned them by the top-level exception address, and filed all of them as a single entry #1031 in the bug tracker:

usp10!otlMultiSubstLookup::apply+0xa8
usp10!otlSingleSubstLookup::applyToSingleGlyph+0x98
usp10!otlSingleSubstLookup::apply+0xa9
usp10!otlMultiSubstLookup::getCoverageTable+0x2c
usp10!otlMark2Array::mark2Anchor+0x18
usp10!GetSubstGlyph+0x2e
usp10!BuildTableCache+0x1ca
usp10!otlMkMkPosLookup::apply+0x1b4
usp10!otlLookupTable::markFilteringSet+0x1a
usp10!otlSinglePosLookup::getCoverageTable+0x12
usp10!BuildTableCache+0x1e7
usp10!otlChainingLookup::getCoverageTable+0x15
usp10!otlReverseChainingLookup::getCoverageTable+0x15
usp10!otlLigCaretListTable::coverage+0x7
usp10!otlMultiSubstLookup::apply+0x99
usp10!otlTableCacheData::FindLookupList+0x9
usp10!ttoGetTableData+0x4b4
usp10!GetSubtableCoverage+0x1ab
usp10!otlChainingLookup::apply+0x2d
usp10!MergeLigRecords+0x132
usp10!otlLookupTable::subTable+0x23
usp10!GetMaxParameter+0x53
usp10!ApplyLookup+0xc3
usp10!ApplyLookupToSingleGlyph+0x6f
usp10!ttoGetTableData+0x19f6
usp10!otlExtensionLookup::extensionSubTable+0x1d
usp10!ttoGetTableData+0x1a77

In the end, it turned out that these 27 crashes manifested 21 actual bugs, which were fixed by Microsoft as CVE-2017-0083, CVE-2017-0091, CVE-2017-0092 and CVE-2017-0111 to CVE-2017-0128 in the MS17-011 security bulletin.

Lastly, we also reported seven unique NULL pointer dereference issues with no deadline, in the hope that having any of them fixed would potentially enable our fuzzer to discover other, more severe bugs. On March 17th, MSRC responded that they had investigated the cases and concluded that they were low-severity DoS problems only, and would not be fixed as part of a security bulletin in the near future.

Input corpus, mutation configuration and adjusting the test harness

Gathering a solid corpus of input samples is arguably one of the most important parts of fuzzing preparation, especially if code coverage feedback is not involved, which makes it impossible for the corpus to gradually evolve into a more optimal form. We were lucky enough to already have several font corpora at our disposal from previous fuzzing runs. We decided to use the same set of files that had helped us find [...] [4]. It was originally generated by running a corpus distillation algorithm over a large number of fonts crawled off the web, using an instrumented build of the FreeType2 open-source library, and consisted of 14848 TrueType and 4659 OpenType files, for a total of 2.4G of disk space. In order to tailor the corpus better for Uniscribe, we reduced it to just the files that contained at least one of the “GDEF”, “GSUB”, “GPOS”, “BASE” or “JSTF” tables, which are parsed by the library. This left us with 3768 TrueType and 2520 OpenType fonts consuming 1.68G on disk, which were much more likely to expose bugs in Uniscribe than any of the removed ones. That was the final corpus that we worked with.
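The filtering step above can be sketched as follows. This is only an illustration and not the tooling we actually used: it assumes a plain single-font sfnt file (TrueType collections are not handled), reads the table directory at the start of the file, and keeps the font if any of the five Advanced Typography tags is present. The names UNISCRIBE_TABLES, sfnt_table_tags, is_relevant_for_uniscribe and build_minimal_sfnt are hypothetical.

```python
import struct

# Advanced Typography tables named in the text as parsed by Uniscribe.
UNISCRIBE_TABLES = {b"GDEF", b"GSUB", b"GPOS", b"BASE", b"JSTF"}


def sfnt_table_tags(data: bytes) -> set:
    """Return the table tags listed in the sfnt table directory of `data`."""
    if len(data) < 12:
        return set()
    # Offset table: sfntVersion (4 bytes) followed by numTables, searchRange,
    # entrySelector and rangeShift (four big-endian uint16 values).
    _version, num_tables, _sr, _es, _rs = struct.unpack_from(">4sHHHH", data, 0)
    tags = set()
    for i in range(num_tables):
        record_offset = 12 + 16 * i  # each table record is 16 bytes long
        if record_offset + 16 > len(data):
            break  # truncated directory: keep the tags read so far
        tags.add(data[record_offset:record_offset + 4])
    return tags


def is_relevant_for_uniscribe(data: bytes) -> bool:
    """True if the font contains at least one table of interest."""
    return bool(sfnt_table_tags(data) & UNISCRIBE_TABLES)


def build_minimal_sfnt(tags) -> bytes:
    """Hypothetical helper: a header-only sfnt blob, used for demonstration."""
    header = struct.pack(">4sHHHH", b"\x00\x01\x00\x00", len(tags), 0, 0, 0)
    records = b"".join(struct.pack(">4sIII", tag, 0, 0, 0) for tag in tags)
    return header + records
```

In practice one would call is_relevant_for_uniscribe on the contents of every file in the corpus directory and copy the matching files to a new location.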

The mutator configuration was also pretty similar to what we did for the kernel: we used the same five standard bitflipping, byteflipping, chunkspew, special ints and binary arithmetic algorithms with the precalculated per-table mutation ratio ranges. The only change made specifically for Uniscribe was to add mutations for the “BASE” and “JSTF” tables, which were previously not accounted for.

Last but not least, we extended the functionality of the guest fuzzing harness, responsible for invoking the tested font-related API (mostly displaying all of the font’s glyphs at various point sizes, but also querying a number of properties etc.). While it was clear that some of the relevant code was executed automatically through user32!DrawText with no modifications required, we wanted to maximize the coverage of Uniscribe code as much as possible. A full reference of all its externally available functions can be found on MSDN [9]. After skimming through the documentation, we added calls to ScriptCacheGetHeight, ScriptGetFontProperties, ScriptGetCMap, ScriptGetFontAlternateGlyphs, ScriptSubstituteSingleGlyph and ScriptFreeCache. This quickly proved to be a successful idea, as it allowed us to discover the aforementioned generic bug in ScriptGetFontAlternateGlyphs. Furthermore, we decided to remove invocations of the GetKerningPairs and GetGlyphOutline API functions, as their corresponding logic was located in the kernel, while our focus had now shifted strictly to user-mode. As such, they wouldn’t lead to the discovery of any new bugs in Uniscribe, but would instead slow the overall fuzzing process down. Apart from these minor modifications, the core of the test harness remained unchanged.

By taking the measures listed above, we hoped to trigger most of the low-hanging-fruit bugs. With this assumption, the only part left was to make sure that the crashes would be reliably caught and reported to the fuzzer. This subject is discussed in the next section.
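To make the idea of a per-table mutation ratio range more concrete, here is a minimal sketch of the simplest of the five algorithms, bitflipping, applied to the byte range of a single table. It is an assumption-laden illustration rather than our actual mutator: the function names and the example ratio range are made up, and the real ratios were precalculated separately for each table.

```python
import random


def bitflip(data: bytes, start: int, length: int, ratio: float,
            rng: random.Random) -> bytes:
    """Flip single random bits inside data[start:start+length].

    The number of flips is `ratio` times the size of the region; the same
    byte may be picked more than once.
    """
    buf = bytearray(data)
    end = min(start + length, len(buf))
    count = int((end - start) * ratio) if end > start else 0
    for _ in range(count):
        pos = rng.randrange(start, end)
        buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def mutate_table(data: bytes, start: int, length: int, ratio_range,
                 rng: random.Random) -> bytes:
    """Pick a ratio uniformly from the table's range, then bitflip the table."""
    low, high = ratio_range
    return bitflip(data, start, length, rng.uniform(low, high), rng)


# Example: mutate a 32-byte "table" at offset 16 with a hypothetical
# ratio range of 1% to 10%, leaving the rest of the file untouched.
example_rng = random.Random(1234)
example_font = bytes(64)
example_mutated = mutate_table(example_font, 16, 32, (0.01, 0.1), example_rng)
```

Restricting each mutation to one table's byte range is what allows the ratio to be tuned per table, so that fragile tables are not damaged so heavily that the font is rejected before the interesting code is reached.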

Crash detection

The first step we took to detect Uniscribe crashes effectively was disabling Special Pools for win32k.sys and ATMFD.DLL (which caused unnecessary overhead for no gain in user-mode), while enabling the PageHeap option in Application Verifier for the harness process. This was done to improve our chances of detecting invalid memory accesses, and to make reproduction and deduplication more reliable.

Thanks to the fact that the fuzz-tested code in usp10.dll executed in the same context as the rest of the harness logic, we didn’t have to write a full-fledged Windows debugger to supervise another process. Instead, we just set up a top-level exception handler with the SetUnhandledExceptionFilter function, which then got called every time a fatal exception was generated in the process. The handler’s job was to send the state of the crashing CPU context (passed in through ExceptionInfo->ContextRecord) to the hypervisor (i.e. the Bochs instrumentation) through the “debug print” hypercall, and then actually report that the crash occurred at the specific address.

In the kernel font fuzzing scenario, crashes were detected by the Bochs instrumentation with the BX_INSTR_RESET instrumentation callback. This approach worked because the guest system was configured to automatically reboot on bugcheck, consequently triggering the bx_instr_reset handler. The easiest way to integrate this approach with user-mode fuzzing would therefore be to just add an ExitWindowsEx call in the epilogue of the exception handler, making everything work out of the box without even touching the existing Bochs instrumentation. However, this method would result in losing information about the crash location, making automated deduplication impossible. In order to address this problem, we introduced a new “crash encountered” hypercall, which received the address of the faulting instruction as an argument from the guest, and passed this information further down our scalable fuzzing infrastructure. Having the crashes grouped by the exception address right from the start saved us a ton of postprocessing time, and limited the number of test cases we had to look at to a bare minimum.

This concludes the list of differences between the Windows kernel font fuzzing setup we’ve been using for nearly two years now, and an equivalent setup for user-mode fuzzing that we only built a few months ago, but which has already proven very effective. Everything else has remained the same as described in the “font fuzzing techniques” article from last year [4].
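The grouping by exception address described above can be illustrated with a few lines of code. This is a simplified sketch and not a part of our infrastructure: it assumes each crash report has already been reduced to a (sample name, faulting address) pair, and the sample names and addresses below are invented.

```python
from collections import defaultdict


def group_by_crash_address(reports):
    """Bucket (sample_name, faulting_address) pairs by faulting address."""
    buckets = defaultdict(list)
    for sample_name, address in reports:
        buckets[address].append(sample_name)
    return dict(buckets)


def representatives(buckets):
    """Pick the first sample seen for each address as the one to triage."""
    return {address: samples[0] for address, samples in buckets.items()}


# Hypothetical reports: three crashing samples, two distinct addresses.
example_reports = [
    ("sample_0001.ttf", 0x752378F3),
    ("sample_0002.otf", 0x7521A0C4),
    ("sample_0003.ttf", 0x752378F3),
]
example_buckets = group_by_crash_address(example_reports)
example_to_triage = representatives(example_buckets)
```

With this scheme only one test case per distinct address needs manual attention, although a single address can still hide several root causes and one root cause can surface at several addresses.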

Conclusions

It is a fascinating but dire realization that even for such a well-known class of bug hunting targets as font parsing implementations, it is still possible to discover new attack vectors dating back to the previous century, having remained largely unaudited until now, and being as exposed as the interfaces we already know about. We believe that this is a great example of how gradually raising the bar for a variety of software can have much more impact than trying to kill every last bug in a narrow range of code. It is also illustrative of the fact that the time spent on thoroughly analyzing the attack surface and looking for little-known targets may turn out to be very fruitful, as the security community still doesn’t have a full understanding of the attack vectors in every important data processing stack (such as the Windows font handling in this case).

This effort and its results show that fuzzing is a very universal technique, and most of its components can be easily reused from one target to another, especially within the scope of a single file format. Finally, it has proven that it is possible to fuzz not just the Windows kernel, but also regular user-mode code, regardless of the environment of the host system (which was Linux in our case). While the Bochs x86 emulator incurs a significant overhead compared to native execution speed, this can often be offset by scaling out, still achieving a net gain in the number of iterations per second. As an interesting fact, issues #993 (Windows kernel registry hive loading), #1042 (EMF+ processing in GDI+), #1052 and #1054 (color profile processing) fixed in the last Patch Tuesday were also found by fuzzing Windows on Bochs, but with slightly different input samples, test harnesses and mutation strategies. :)
