Notes On Windows Uniscribe Fuzzing

Posted past times Mateusz Jurczyk of Google

Among the total of 119 vulnerabilities with CVEs fixed by Microsoft in the [...] [1][2] and fuzzing [3][4]. However, what makes this effort a bit different from the previous ones is the fact that Uniscribe is a little-known user-mode component, which had not been widely recognized as a viable attack vector before, as opposed to the kernel-mode font implementations included in the win32k.sys and ATMFD.DLL drivers. In this post, we outline a brief history and description of Uniscribe, explain how we approached at-scale fuzzing of the library, and highlight some of the more interesting discoveries we have made so far. All the raw reports of the bugs we’re referring to (as they were submitted to Microsoft), together with the corresponding proof-of-concept samples, can be found in the official bug tracker [5]. Enjoy!

Introduction

It was November 2016 when we started yet another iteration of our Windows font fuzzing job (whose architecture was thoroughly described in [4]). At that point, the kernel attack surface was mostly fuzz-clean with regard to the techniques we were using, but we still like to play with the configuration and input corpus from time to time to see if we can squeeze out any more bugs with the existing infrastructure. What we ended up with several days later was a bunch of samples which supposedly crashed the guest Windows system running inside of Bochs. When we fed them to our reproduction pipeline, none of the bugchecks occurred again, for unclear reasons. As disappointing as that was, there was also one interesting and unexpected result: for one of the test cases, the user-mode harness itself crashed, without bringing the whole OS down at the same time. This could indicate either that there was a bug in our code, or that there was some unanticipated font parsing going on in ring-3. When we started digging deeper, we found out that the unhandled exception took place in the following context:

(4464.11b4): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=0933d8bf ebx=00000000 ecx=09340ffc edx=00001b9f esi=0026ecac edi=00000009
eip=752378f3 esp=0026ec24 ebp=0026ec2c iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010246
USP10!ScriptPositionSingleGlyph+0x28533:
752378f3 668b4c5002 mov cx,word ptr [eax+edx*2+2] ds:002b:09340fff=????

Until that moment, we didn’t fully realize that our tools were triggering any font-handling code beyond the well-known kernel implementation (despite some related bugs having been publicly fixed in the past, e.g. CVE-2016-7274 [6]). As a result, the fuzzing system was not prepared to catch user-mode faults, and thus any such crashes had remained completely undetected in favor of system bugchecks, which caused full machine restarts.

We quickly determined that the usp10.dll library corresponded to “Uniscribe Unicode script processor” (in Microsoft’s own words) [7]. It is a relatively large module (600-800 kB depending on system version and bitness) responsible for rendering Unicode-encoded text, as the name suggests. From a security perspective, it’s important that the code base dates back to Windows 2000, and includes a C++ implementation of the parsing of various complex TrueType/OpenType structures, in addition to what is already implemented in the kernel. The specific tables that Uniscribe touches on are primarily Advanced Typography Tables (“GDEF”, “GSUB”, “GPOS”, “BASE”, “JSTF”), but also “OS/2”, “cmap” and “maxp” to some extent. What’s equally significant is that the code can be reached simply by calling the DrawText [8] or other equivalent API with Unicode-encoded text and an attacker-controlled font. Since no special calls other than the typical ones are necessary to execute the most exposed areas of the library, it makes for a great attack vector in applications which use GDI to render text with fonts originating from untrusted sources. This is also evidenced by the stack trace of the original crash, and the fact that it occurred in a program which didn’t include any usp10-specific code:

0:000> kb
ChildEBP RetAddr
0026ec2c 09340ffc USP10!otlChainRuleSetTable::rule+0x13
0026eccc 0133d7d2 USP10!otlChainingLookup::apply+0x7d3
0026ed48 0026f09c USP10!ApplyLookup+0x261
0026ef4c 0026f078 USP10!ApplyFeatures+0x481
0026ef98 09342f40 USP10!SubstituteOtlGlyphs+0x1bf
0026efd4 0026f0b4 USP10!SubstituteOtlChars+0x220
0026f250 0026f370 USP10!HebrewEngineGetGlyphs+0x690
0026f310 0026f370 USP10!ShapingGetGlyphs+0x36a
0026f3fc 09316318 USP10!ShlShape+0x2ef
0026f440 09316318 USP10!ScriptShape+0x15f
0026f4a0 0026f520 USP10!RenderItemNoFallback+0xfa
0026f4cc 0026f520 USP10!RenderItemWithFallback+0x104
0026f4f0 09316124 USP10!RenderItem+0x22
0026f534 2d011da2 USP10!ScriptStringAnalyzeGlyphs+0x1e9
0026f54c 0000000a USP10!ScriptStringAnalyse+0x284
0026f598 0000000a LPK!LpkStringAnalyse+0xe5
0026f694 00000000 LPK!LpkCharsetDraw+0x332
0026f6c8 00000000 LPK!LpkDrawTextEx+0x40
0026f708 00000000 USER32!DT_DrawStr+0x13c
0026f754 0026fa30 USER32!DT_GetLineBreak+0x78
0026f800 0000000a USER32!DrawTextExWorker+0x255
0026f824 ffffffff USER32!DrawTextExW+0x1e

As can be seen here, the Uniscribe functionality was invoked internally by user32.dll through the lpk.dll (Language Pack) library. As soon as we learned about this new attack vector, we jumped at the first chance to fuzz it. Most of the infrastructure was already in place, since user- and kernel-mode font fuzzing share a large number of the pieces. The extra work that we had to do was mostly related to filtering the input corpus, fiddling with the mutator configuration, adjusting the system configuration and implementing logic for the detection of user-mode crashes (both in the test harness and Bochs instrumentation). All of these steps are discussed in detail below. After a few days, we had everything working as planned, and after another couple, there were already over 80 crashes at unique addresses waiting for triage. Below is a summary of the issues that were found in the first fuzzing run and reported to Microsoft in December 2016.

Results at a glance

Since 80 was still a fairly manageable number of crashes to triage manually, we tried to reproduce each of them by hand, deduplicating them and writing down their details at the same time. When we finished, we ended up with 8 separate high-severity issues that could potentially allow remote code execution:

Tracker ID	Memory access type at crash	Crashing function	CVE
1022	Invalid write of n bytes (memcpy)	usp10!otlList::insertAt	CVE-2017-0108
1023	Invalid read / write of 2 bytes	usp10!AssignGlyphTypes	CVE-2017-0084
1025	Invalid write of n bytes (memset)	usp10!otlCacheManager::GlyphsSubstituted	CVE-2017-0086
1026	Invalid write of n bytes (memcpy)	usp10!MergeLigRecords	CVE-2017-0087
1027	Invalid write of 2 bytes	usp10!ttoGetTableData	CVE-2017-0088
1028	Invalid write of 2 bytes	usp10!UpdateGlyphFlags	CVE-2017-0089
1029	Invalid write of n bytes	usp10!BuildFSM and nearby functions	CVE-2017-0090
1030	Invalid write of n bytes	usp10!FillAlternatesList	CVE-2017-0072

All of the bugs but one were triggered through a standard DrawText call and resulted in heap memory corruption. The one exception was issue #1030, which resided in a documented Uniscribe-specific ScriptGetFontAlternateGlyphs API function. The routine is responsible for retrieving a list of alternate glyphs for a specified character, and the interesting fact about the bug is that it wasn’t a problem with operating on any internal structures. Instead, the function failed to honor the value of the cMaxAlternates argument, and could therefore write more output data to the pAlternateGlyphs buffer than was allowed by the function caller. This meant that the buffer overflow was not specific to any particular memory type: depending on what pointer the client passed in, the overflow would take place on the stack, heap or static memory. The exploitability of such a bug would greatly depend on the program design and compilation options used to build it. We must admit, however, that it is unclear what the real-world clients of the function are, and whether any of them would meet the requirements to become a viable attack target.

Furthermore, we extracted 27 unique crashes caused by invalid memory reads from non-NULL addresses, which could potentially lead to information disclosure of secrets stored in the process address space. Due to the large volume of these crashes, we were unable to analyze each of them in much detail or perform any advanced deduplication. Instead, we partitioned them by the top-level exception address, and filed all of them as a single entry #1031 in the bug tracker:

usp10!otlMultiSubstLookup::apply+0xa8
usp10!otlSingleSubstLookup::applyToSingleGlyph+0x98
usp10!otlSingleSubstLookup::apply+0xa9
usp10!otlMultiSubstLookup::getCoverageTable+0x2c
usp10!otlMark2Array::mark2Anchor+0x18
usp10!GetSubstGlyph+0x2e
usp10!BuildTableCache+0x1ca
usp10!otlMkMkPosLookup::apply+0x1b4
usp10!otlLookupTable::markFilteringSet+0x1a
usp10!otlSinglePosLookup::getCoverageTable+0x12
usp10!BuildTableCache+0x1e7
usp10!otlChainingLookup::getCoverageTable+0x15
usp10!otlReverseChainingLookup::getCoverageTable+0x15
usp10!otlLigCaretListTable::coverage+0x7
usp10!otlMultiSubstLookup::apply+0x99
usp10!otlTableCacheData::FindLookupList+0x9
usp10!ttoGetTableData+0x4b4
usp10!GetSubtableCoverage+0x1ab
usp10!otlChainingLookup::apply+0x2d
usp10!MergeLigRecords+0x132
usp10!otlLookupTable::subTable+0x23
usp10!GetMaxParameter+0x53
usp10!ApplyLookup+0xc3
usp10!ApplyLookupToSingleGlyph+0x6f
usp10!ttoGetTableData+0x19f6
usp10!otlExtensionLookup::extensionSubTable+0x1d
usp10!ttoGetTableData+0x1a77

In the end, it turned out that these 27 crashes manifested 21 actual bugs, which were fixed by Microsoft as CVE-2017-0083, CVE-2017-0091, CVE-2017-0092 and CVE-2017-0111 to CVE-2017-0128 in the MS17-011 security bulletin.

Lastly, we also reported seven unique NULL pointer dereference issues with no deadline, in the hope that having any of them fixed would potentially enable our fuzzer to discover other, more severe bugs. On March 17th, MSRC responded that they had investigated the cases and concluded that they were low-severity DoS problems only, and would not be fixed as part of a security bulletin in the near future.

Input corpus, mutation configuration and adjusting the test harness

Gathering a solid corpus of input samples is arguably one of the most important parts of fuzzing preparation, especially if code coverage feedback is not involved, making it impossible for the corpus to gradually evolve into a more optimal form. We were lucky enough to already have several font corpora at our disposal from previous fuzzing runs. We decided to use the same set of files that had helped us discover [...] [4]). It was originally generated by running a corpus distillation algorithm over a large number of fonts crawled off the web, using an instrumented build of the FreeType2 open-source library, and consisted of 14848 TrueType and 4659 OpenType files, for a total of 2.4G of disk space. In order to better tailor the corpus to Uniscribe, we reduced it to just the files that contained at least one of the “GDEF”, “GSUB”, “GPOS”, “BASE” or “JSTF” tables, which are parsed by the library. This left us with 3768 TrueType and 2520 OpenType fonts consuming 1.68G on disk, which were much more likely to expose bugs in Uniscribe than any of the removed ones. That was the final corpus that we worked with.
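To make the filtering step concrete, here is a minimal Python sketch of the idea. It is not the tooling used for this project; it only assumes the standard single-font sfnt layout (a 12-byte header followed by 16-byte table records), and the function names are made up for illustration. TrueType Collection files and any validation beyond the table directory are deliberately ignored.

```python
import struct

# Advanced Typography tables that the text names as parsed by Uniscribe.
UNISCRIBE_TABLES = {b"GDEF", b"GSUB", b"GPOS", b"BASE", b"JSTF"}


def sfnt_table_tags(data: bytes) -> set:
    """Return the table tags listed in a single-font sfnt table directory."""
    if len(data) < 12:
        return set()
    # Header: sfnt version (4 bytes), numTables (uint16), then three
    # uint16 binary-search hints that are not needed here.
    num_tables = struct.unpack_from(">H", data, 4)[0]
    tags = set()
    for i in range(num_tables):
        offset = 12 + 16 * i            # each table record is 16 bytes
        if offset + 16 > len(data):     # truncated directory: stop quietly
            break
        tags.add(data[offset:offset + 4])
    return tags


def is_relevant_for_uniscribe(data: bytes) -> bool:
    """True if the font declares at least one of the tables of interest."""
    return bool(sfnt_table_tags(data) & UNISCRIBE_TABLES)


def make_sfnt_header(tags) -> bytes:
    """Hypothetical helper: build a header-only fake font for demonstration."""
    header = struct.pack(">IHHHH", 0x00010000, len(tags), 0, 0, 0)
    records = b"".join(struct.pack(">4sIII", tag, 0, 0, 0) for tag in tags)
    return header + records


if __name__ == "__main__":
    print(is_relevant_for_uniscribe(make_sfnt_header([b"cmap", b"GSUB"])))
    print(is_relevant_for_uniscribe(make_sfnt_header([b"cmap", b"maxp"])))
```

Running such a check over every file in a corpus and keeping only the matches is one plausible way to arrive at a reduced set like the one described above.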

The mutator configuration was also pretty similar to what we did for the kernel: we used the same five standard bitflipping, byteflipping, chunkspew, special ints and binary arithmetic algorithms with the precalculated per-table mutation ratio ranges. The only change made specifically for Uniscribe was to add mutations for the “BASE” and “JSTF” tables, which were previously not accounted for.
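As a rough illustration of what a per-table mutation ratio range means in practice, the Python sketch below implements only the simplest of the five algorithms (bitflipping). The ratio values, the RATIO_RANGES dictionary and the function names are placeholders invented for this example; they are not the precalculated ranges or the mutator code used in the actual fuzzer.

```python
import random

# Hypothetical (min, max) fraction of bytes to touch in each table.
# These numbers are placeholders, not the ranges used in the real setup.
RATIO_RANGES = {
    "BASE": (0.001, 0.01),
    "JSTF": (0.001, 0.01),
}


def bitflip(data: bytes, ratio: float, rng: random.Random) -> bytes:
    """Flip a single random bit at int(len(data) * ratio) random positions."""
    out = bytearray(data)
    for _ in range(int(len(out) * ratio)):
        pos = rng.randrange(len(out))
        out[pos] ^= 1 << rng.randrange(8)
    return bytes(out)


def mutate_table(tag: str, data: bytes, rng: random.Random) -> bytes:
    """Mutate one table's bytes with a ratio drawn from that table's range."""
    if tag not in RATIO_RANGES:
        return data                      # tables without a range are left alone
    low, high = RATIO_RANGES[tag]
    return bitflip(data, rng.uniform(low, high), rng)


if __name__ == "__main__":
    rng = random.Random(2016)
    original = bytes(256)
    mutated = mutate_table("BASE", original, rng)
    changed = sum(1 for a, b in zip(original, mutated) if a != b)
    print("bytes changed:", changed)
```

Drawing the ratio from a range, rather than fixing it, lets the same table be hit lightly in some iterations and more heavily in others.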

Last but not least, we extended the functionality of the guest fuzzing harness, responsible for invoking the tested font-related API (mostly displaying all of the font’s glyphs at various point sizes, but also querying a number of properties etc.). While it was clear that some of the relevant code was executed automatically through user32!DrawText with no modifications required, we wanted to maximize the coverage of Uniscribe code as much as possible. A full reference of all its externally available functions can be found on MSDN [9]. After skimming through the documentation, we added calls to ScriptCacheGetHeight, ScriptGetFontProperties, ScriptGetCMap, ScriptGetFontAlternateGlyphs, ScriptSubstituteSingleGlyph and ScriptFreeCache. This quickly proved to be a successful idea, as it allowed us to discover the aforementioned generic bug in ScriptGetFontAlternateGlyphs. Furthermore, we decided to remove invocations of the GetKerningPairs and GetGlyphOutline API functions, as their corresponding logic was located in the kernel, while our focus had now shifted strictly to user-mode. As such, they wouldn’t lead to the discovery of any new bugs in Uniscribe, but would instead slow the overall fuzzing process down. Apart from these minor modifications, the core of the test harness remained unchanged.

We hoped that the measures listed above would be sufficient to trigger most of the low-hanging fruit bugs. With this assumption, the only part left was to make sure that the crashes would be reliably caught and reported to the fuzzer. This subject is discussed in the next section.

Crash detection

The first step we took to detect Uniscribe crashes effectively was disabling Special Pools for win32k.sys and ATMFD.DLL (which caused unnecessary overhead for no gain in user-mode), while enabling the PageHeap option in Application Verifier for the harness process. This was done to improve our chances of detecting invalid memory accesses, and to make reproduction and deduplication more reliable.

Thanks to the fact that the fuzz-tested code in usp10.dll executed in the same context as the rest of the harness logic, we didn’t have to write a full-fledged Windows debugger to supervise another process. Instead, we just set up a top-level exception handler with the SetUnhandledExceptionFilter function, which then got called every time a fatal exception was generated in the process. The handler’s job was to send out the state of the crashing CPU context (passed in through ExceptionInfo->ContextRecord) to the hypervisor (i.e. the Bochs instrumentation) through the “debug print” hypercall, and then actually report that the crash occurred at the specific address.

In the kernel font fuzzing scenario, crashes were detected by the Bochs instrumentation with the BX_INSTR_RESET instrumentation callback. This approach worked because the guest system was configured to automatically reboot on bugcheck, consequently triggering the bx_instr_reset handler. The easiest way to integrate this approach with user-mode fuzzing would therefore be to just add an ExitWindowsEx call in the epilogue of the exception handler, making everything work out of the box without even touching the existing Bochs instrumentation. However, the method would result in losing information about the crash location, making automated deduplication impossible. In order to address this problem, we introduced a new “crash encountered” hypercall, which received the address of the faulting instruction in the argument from the guest, and passed this information further down our scalable fuzzing infrastructure. Having the crashes grouped by the exception address right from the start saved us a ton of postprocessing time, and limited the number of test cases we had to look at to a bare minimum.
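The grouping itself is simple once the faulting address travels with each crash report. The Python sketch below shows the general idea only; the report format (a sample name paired with a symbolized address) and the sample file names are assumptions made for this example, not a description of the actual infrastructure.

```python
from collections import defaultdict


def group_by_crash_address(reports):
    """Map each faulting address to the list of samples that crashed there.

    `reports` is an iterable of (sample_name, address) pairs, where the
    address is whatever stable identifier the guest reported, e.g. a
    module+offset string.
    """
    buckets = defaultdict(list)
    for sample, address in reports:
        buckets[address].append(sample)
    return dict(buckets)


if __name__ == "__main__":
    # Hypothetical sample names; the addresses are taken from the list above.
    reports = [
        ("sample_0001.ttf", "usp10!ApplyLookup+0xc3"),
        ("sample_0002.otf", "usp10!GetSubstGlyph+0x2e"),
        ("sample_0003.ttf", "usp10!ApplyLookup+0xc3"),
    ]
    for address, samples in group_by_crash_address(reports).items():
        # One representative per bucket is enough for a first look.
        print(address, len(samples), samples[0])
```

With buckets like these, only one representative per address needs manual attention at first, which is the saving described above.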

This concludes the list of differences between the Windows kernel font fuzzing setup we’ve been using for nearly two years now, and an equivalent setup for user-mode fuzzing that we only built a few months ago, but which has already proven very effective. Everything else has remained the same as described in the “font fuzzing techniques” article from last year [4].

Conclusions

It is a fascinating but dire realization that even for such a well-known class of bug hunting targets as font parsing implementations, it is still possible to discover new attack vectors dating back to the previous century, having remained largely unaudited until now, and being as exposed as the interfaces we already know about. We believe that this is a great example of how gradually raising the bar for a variety of software can have much more impact than trying to kill every last bug in a narrow range of code. It is also illustrative of the fact that the time spent on thoroughly analyzing the attack surface and looking for little-known targets may turn out very fruitful, as the security community still doesn’t have a full understanding of the attack vectors in every important data processing stack (such as the Windows font handling in this case).

This effort and its results show that fuzzing is a very universal technique, and most of its components can be easily reused from one target to another, especially within the scope of a single file format. Finally, it has proven that it is possible to fuzz not just the Windows kernel, but also regular user-mode code, regardless of the environment of the host system (which was Linux in our case). While the Bochs x86 emulator incurs a significant overhead as compared to native execution speed, it can often be scaled out to still achieve a net gain in the number of iterations per second. As an interesting fact, issues #993 (Windows kernel registry hive loading), #1042 (EMF+ processing in GDI+), #1052 and #1054 (color profile processing) fixed in the last Patch Tuesday were also found by fuzzing Windows on Bochs, but with slightly different input samples, test harnesses and mutation strategies. :)
