LINE Engineering
Blog

Buffer overflow in PJSIP, a VoIP open source library

Kim Youngsung 2018.02.27

He's a security engineer at LINE. He enjoys looking for bugs and is highly interested in secure coding.

Hi all, I am Youngsung Kim (Facebook, Twitter) of the Application Security team at LINE, where I am in charge of evaluating the security of LINE services. In this post, I'd like to share a vulnerability (CVE-2017-16872, AST-2017-009) in PJSIP, an open source VoIP library. PJSIP is a multimedia communication library based on the following standard protocols: SIP, SDP, RTP, STUN, TURN, and ICE. The Asterisk framework, widely used in IP-PBX systems and VoIP gateways, has a SIP stack built on PJSIP.

The vulnerability was caused by overlooking sign extension when a signed int is converted to an unsigned long while handling a client's SIP request in a 64-bit environment. There was no channel for reporting security issues to the PJSIP development team, so I reported it through Asterisk's security page. Afterwards, I consulted with George Joseph, an engineer at Asterisk, and the patch (PJSIP patch, Asterisk patch) was applied in pjproject v2.7.1. I'd like to express my gratitude to George for handling the patch.

What is sign extension?

In C, if you pass an argument whose data type differs from that of the declared parameter, the argument is automatically converted to the parameter's type. This is an implicit type conversion, which this post also refers to as type casting. Say we call a function, function(int a), and pass a char as the argument. The argument is automatically converted from char to int. When a char is converted to an int, data that was only 1 byte long becomes 4 bytes long, which creates some "free space" (i.e. spare bits). Whether a plain char is signed depends on the platform; here we assume it is, as on most x86 compilers. A signed type has to record its sign somewhere in the data, and it does so in the top bit. When a signed value gains free space through conversion, the free space is filled with the value of that sign bit. If the top bit of the original data is 0, the spare bits are filled with 0. Likewise, if the top bit is 1, the spare bits are filled with 1. This operation is called sign extension.

Let's look at an example. Say we have a char with the value -5, and we convert it to an int. Since -5 is negative, its two's complement representation as a char is 0xFB. Converting it to an int gives us 3 spare bytes, and because the top bit of 0xFB is 1, all 3 bytes are filled with 1s. Our original value, -5, is now represented as 0xFFFFFFFB.

[Figure: sign_extension_1]
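
The same conversion can be checked with a few lines of C. This is only a minimal sketch: widen_bits() is a name made up for this post, and the fixed-width types int8_t and int32_t stand in for char and int so that the result does not depend on the platform.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only.
 * Widens a signed 8-bit value to 32 bits and returns the resulting
 * bit pattern. The int8_t -> int32_t step is the sign extension;
 * the final cast to uint32_t only exposes the bits. */
static uint32_t widen_bits(int8_t value)
{
    int32_t widened = value;   /* implicit conversion, sign-extended */
    return (uint32_t)widened;
}
```

With this helper, widen_bits(-5) yields 0xFFFFFFFB because the top bit of 0xFB is 1, while widen_bits(5) yields 0x00000005 because the top bit of 0x05 is 0.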

What happens if we convert a value to a type with different signedness? Say we convert a signed char to an unsigned int. Again, we will use -5. The signed char -5 is first sign-extended to 0xFFFFFFFB, just as in the previous example. The difference is that the result is now unsigned, so its decimal value is 4294967291. What a drastic change! Such an unexpected change in value can make it possible to bypass bounds checks or can cause a buffer overflow, so we have to take good care when handling data types.

[Figure: sign_extension_2]
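
To make the bounds-check remark concrete, here is a small sketch. None of it comes from PJSIP: BUF_SIZE, to_unsigned32(), flawed_length_ok(), and effective_copy_size() are all hypothetical names, and the check is deliberately flawed to show how a negative length can slip past an upper-bound test and then turn into a huge size.

```c
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE 16   /* hypothetical buffer size */

/* signed char -> 32-bit unsigned: the value is reduced modulo 2^32,
 * which has the same effect as sign-extending and dropping the sign. */
static uint32_t to_unsigned32(signed char value)
{
    return (uint32_t)value;
}

/* Deliberately flawed check: only the upper bound is tested,
 * so any negative length is accepted. */
static int flawed_length_ok(int len)
{
    return len <= BUF_SIZE;
}

/* The size a copy routine would see once the length is converted
 * to size_t, as happens implicitly when calling memcpy(). */
static size_t effective_copy_size(int len)
{
    return (size_t)len;
}
```

Here to_unsigned32(-5) is 4294967291, flawed_length_ok(-5) reports success, and effective_copy_size(-5) is far larger than BUF_SIZE. Checking the lower bound as well, or using an unsigned length from the start, closes the gap.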

Unexpected value change by sign extension

Now let's look at how sign extension leads to the vulnerability in PJSIP.

The following code is a part of the sip_transaction.c file. As you can see, the create_tsx_key_2543() function passes a CSeq number as the first argument to the pj_utoa() function. The data type of the CSeq number, i.e. rdata->msg_info.cseq->cseq is signed int.

// pjsip/src/pjsip/sip_transaction.c

static pj_status_t create_tsx_key_2543( ... )
{
...
    /* Add CSeq (only the number). */
    len = pj_utoa(rdata->msg_info.cseq->cseq, p); // rdata->msg_info.cseq->cseq : pj_int32_t
                                                  // (defined: pjsip/include/pjsip/sip_msg.h)

In contrast, the first parameter of the pj_utoa() function, val, is declared as an unsigned long. The CSeq number is therefore automatically converted from signed int to unsigned long.

// pjlib/src/pj/string.c

PJ_DEF(int) pj_utoa(unsigned long val, char *buf)
{
    // Note that passing a negative `val` can result in a large unsigned long value.
    // In 64bit machines, the size of unsigned long is 8 bytes.
    // Max unsigned long = 0xFFFFFFFFFFFFFFFF = 18446744073709551615

    return pj_utoa_pad(val, buf, 0, 0);
}

If a negative number, for example -1, is passed as the first argument of the pj_utoa() function, it is sign-extended, loses its sign, and unexpectedly becomes a very large number. In a 64-bit environment, an unsigned long is 8 bytes long, so val can be as large as 18,446,744,073,709,551,615, which is exactly the value that -1 turns into.

[Figure: sign_extension_utoa]

The pj_utoa_pad() function, called by pj_utoa(), writes the decimal representation of val, the first parameter of pj_utoa(), into buf, the second parameter. The maximum value of val is 18,446,744,073,709,551,615, so up to 21 bytes (the 20 digits of "18446744073709551615" plus the terminating NUL) can be written to buf. In short, buf needs at least 21 bytes.

However, as you can see below, when the create_tsx_key_2543() function calculates the size of the buffer that will hold the CSeq number and the Via port number, it reserves only 9 bytes for each, not 21.
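
The 21-byte figure can be reproduced with a stand-alone sketch. This is not PJSIP's implementation: sketch_utoa() and digits_for_int32() are hypothetical helpers, and uint64_t is used to model an 8-byte unsigned long so the sketch behaves the same on every platform.

```c
#include <stdint.h>

/* Hypothetical helper, not PJSIP's code: writes val in decimal,
 * followed by a terminating NUL, and returns the number of digits.
 * buf must have room for at least 21 bytes. */
static int sketch_utoa(uint64_t val, char *buf)
{
    char tmp[20];   /* 2^64 - 1 has 20 decimal digits */
    int n = 0;
    int i;

    do {
        tmp[n++] = (char)('0' + (int)(val % 10));
        val /= 10;
    } while (val > 0);

    for (i = 0; i < n; ++i)     /* digits were produced in reverse */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}

/* Mimics passing a 32-bit signed value where a 64-bit unsigned
 * parameter is declared: the value is sign-extended first. */
static int digits_for_int32(int32_t v, char *buf)
{
    return sketch_utoa((uint64_t)v, buf);
}
```

Calling digits_for_int32(-1, buf) returns 20 and fills buf with "18446744073709551615", so 21 bytes are touched once the NUL is counted. The largest positive int, 2147483647, needs only 10 digits.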

// pjsip/src/pjsip/sip_transaction.c

/*
 * Create key from the incoming data, to be used to search the transaction
 * in the transaction hash table.
 */
static pj_status_t create_tsx_key_2543( ... )
{
...
    /* Calculate length required. */
    // Note that the CSeq number and the Via port are budgeted at 9 bytes each.
    len_required = method->name.slen +      /* Method */
                   9 +                      /* CSeq number */
                   ...
                   9 +                      /* Via port. */
                   16;                      /* Separator+Allowance. */
    key = p = (char*) pj_pool_alloc(pool, len_required);

...
    /* Add CSeq (only the number). */
    len = pj_utoa(rdata->msg_info.cseq->cseq, p);    // rdata->msg_info.cseq->cseq : pj_int32_t
                                                     // Implicit type casting from int to unsigned long
    p += len;
    *p++ = SEPARATOR;

...
    len = pj_utoa(rdata->msg_info.via->sent_by.port, p); // rdata->msg_info.via->sent_by.port : int
                                                         // Implicit type casting from int to unsigned long
    p += len;
    *p++ = SEPARATOR;

    *p++ = '\0';
...
    return PJ_SUCCESS;
}

The len_required variable in create_tsx_key_2543() specifies how many bytes pj_pool_alloc() allocates from the heap. As the code above shows, the CSeq number and the Via port number are budgeted at 9 bytes each. However, if the CSeq number (rdata->msg_info.cseq->cseq) or the Via port number (rdata->msg_info.via->sent_by.port) reaches pj_utoa() as a negative number, up to 21 bytes are written for it instead of 9. The result is a buffer overflow.

Suppose the CSeq number and the Via port number are positive, as they ought to be. Each is then a 4-byte int, and converting it to a decimal string requires at most 11 bytes ("2147483647" plus the terminating NUL). Even then, 9 bytes is not enough. However, the 16 spare bytes added to len_required for separators and allowance make up the shortfall. In short, 9 bytes each for the CSeq number and the Via port number is too little, but the spare bytes allocated for the buffer keep us safe.

So, when does a problem arise? When the CSeq number or the Via port number has a negative value. But nobody would assign a negative value to these fields in the first place, which means they become negative somewhere along the way. Let's have a look at the following code, which parses the CSeq header field of a SIP message. Look at where pj_strtoul() is called: the CSeq number, a decimal string, is converted to an int.
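
The arithmetic can be worked through in code. The sketch below only models the two numeric fields and the 16-byte allowance. It leaves out the elided terms of len_required and the separators, and decimal_digits() and bytes_over_budget() are hypothetical helpers.

```c
#include <stdint.h>

enum {
    BUDGET_PER_NUMBER = 9,   /* bytes reserved per number in len_required */
    ALLOWANCE = 16           /* the "Separator+Allowance" term */
};

/* Number of decimal digits needed to print a 64-bit unsigned value. */
static int decimal_digits(uint64_t v)
{
    int n = 1;
    while (v >= 10) {
        v /= 10;
        ++n;
    }
    return n;
}

/* How many bytes the two printed numbers need beyond their combined
 * 9 + 9 budget, once each has been converted to 64-bit unsigned. */
static int bytes_over_budget(int32_t cseq, int32_t port)
{
    int used = decimal_digits((uint64_t)cseq) + decimal_digits((uint64_t)port);
    return used - 2 * BUDGET_PER_NUMBER;
}
```

With both fields at 2147483647, the numbers run 2 bytes over budget, which the allowance absorbs. With both fields at -1, they run 22 bytes over, more than the entire 16-byte allowance even before any separator is counted.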

// pjsip/src/pjsip/sip_parser.c

/* Parse CSeq header. */
static pjsip_hdr* parse_hdr_cseq( pjsip_parse_ctx *ctx )
{
    pj_str_t cseq, method;
    pjsip_cseq_hdr *hdr;
...
    // Scans only the digits of the CSeq header field and stores them in the cseq variable.
    pj_scan_get( ctx->scanner, &pconst.pjsip_DIGIT_SPEC, &cseq);

    // Converts a cseq value, which is a string in decimal form, to an integer.
    // Type casting from 'unsigned long' to 'pj_int32_t'
    // 'hdr->cseq' can be set to a negative integer.
    hdr->cseq = pj_strtoul(&cseq); // hdr->cseq : pj_int32_t
                                   // (defined: pjsip/include/pjsip/sip_msg.h)
...
}

Let's look at the implementation of the pj_strtoul() function. The CSeq number, passed as a string, is parsed and returned as an unsigned long. Note that the loop performs no range check on the accumulated value.

// pjlib/src/pj/string.c

PJ_DEF(unsigned long) pj_strtoul(const pj_str_t *str)
{
    unsigned long value;
    unsigned i;

    PJ_CHECK_STACK();

    value = 0;
    for (i=0; i<(unsigned)str->slen; ++i) {
        if (!pj_isdigit(str->ptr[i]))
            break;
        value = value * 10 + (str->ptr[i] - '0');
    }
    return value;
}

As we saw earlier, the following code parses the CSeq header field sent by a client. The unsigned long value returned by pj_strtoul() is assigned to hdr->cseq, which is a pj_int32_t.

// pjsip/src/pjsip/sip_parser.c

/* Parse CSeq header. */
static pjsip_hdr* parse_hdr_cseq( pjsip_parse_ctx *ctx )
{
...

    // Converts a cseq value, which is a string in decimal form, to an integer.
    // Type casting from 'unsigned long' to 'pj_int32_t'
    // 'hdr->cseq' can be set to a negative integer.
    hdr->cseq = pj_strtoul(&cseq); // hdr->cseq : pj_int32_t
                                   // (defined: pjsip/include/pjsip/sip_msg.h)
...
}

In other words, the CSeq number is converted from unsigned long to int and can become negative, as illustrated below. For example, a CSeq of 4294967295 (0xFFFFFFFF) ends up as -1.

[Figure: sign_extension_utoa]
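
A short sketch shows how a large decimal string turns into a negative 32-bit value. Both functions are hypothetical and only imitate the behaviour described above. uint64_t models an 8-byte unsigned long, and the narrowing step is spelled out by hand so that the sketch stays well-defined C, whereas a plain assignment does the same thing implicitly on mainstream compilers.

```c
#include <stdint.h>

/* Hypothetical parser in the spirit of the loop above: accumulates
 * leading digits with no overflow or range check. */
static uint64_t sketch_strtoul(const char *s)
{
    uint64_t value = 0;
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + (uint64_t)(*s - '0');
        ++s;
    }
    return value;
}

/* Stores the parsed value in a 32-bit signed field: only the low
 * 32 bits survive and are then read as a two's complement number. */
static int32_t store_as_int32(uint64_t parsed)
{
    uint32_t low = (uint32_t)parsed;              /* modulo 2^32 */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)((int64_t)low - 4294967296LL); /* low - 2^32 */
}
```

With these helpers, the string "4294967295" is stored as -1 and "2147483648" as INT32_MIN, while an ordinary value such as "102" is stored unchanged.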

Type conversions and sign extension that unintentionally change a value like this were found in a number of places in the PJSIP code, not only in the create_tsx_key_2543() function. The patch has fixed them all.

Proof of concept

So, how does sign extension lead to an exploitable bug? We need to prove that it can actually cause a buffer overflow. PJSIP implements a custom heap allocator for performance. A side effect of this allocator is that an overflow may not crash the program, and AddressSanitizer may not detect it. So we needed to figure out how the allocator works before writing the PoC code. The PJSIP heap allocator first allocates one large block on the heap and then serves dynamic memory allocation requests from the space remaining in that block. To prove the vulnerability, in other words, to make the program malfunction, the overflowing data has to be larger than what remains in the block (i.e. the space between the cur and end labels in the following diagram).

[Figure: pool]
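
The idea can be sketched with a toy bump-pointer block. This is not PJSIP's allocator or its data structures: sketch_block and the three functions are hypothetical, alignment is ignored, and a request that does not fit simply fails instead of triggering a new block.

```c
#include <stddef.h>

/* Hypothetical block, loosely modelled on the cur/end labels. */
typedef struct {
    char *cur;   /* next free byte in the block */
    char *end;   /* one past the last byte of the block */
} sketch_block;

/* Hands out size bytes from the block, or NULL if they do not fit. */
static void *sketch_alloc(sketch_block *b, size_t size)
{
    char *p;
    if ((size_t)(b->end - b->cur) < size)
        return NULL;
    p = b->cur;
    b->cur += size;
    return p;
}

/* Bytes still free behind the most recent allocation. */
static size_t sketch_remaining(const sketch_block *b)
{
    return (size_t)(b->end - b->cur);
}

/* An overrun of extra bytes past the last allocation leaves the
 * block only if it is larger than the free space behind it. */
static int overrun_escapes_block(const sketch_block *b, size_t extra)
{
    return extra > sketch_remaining(b);
}
```

In a 64-byte block with 40 bytes handed out, a 22-byte overrun lands entirely in the unused 24 bytes and goes unnoticed. After another 20 bytes are handed out, only 4 bytes remain, and the same 22-byte overrun runs past end.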

A buffer overflow caused by sign extension can only overrun by a limited number of bytes. To make the overflow observable, we need to allocate a large chunk so that the remaining memory in the block is close to zero. That is, the chunk allocated by pj_pool_alloc() has to be about as big as end - cur, so that the overflow runs into the header of the following block. We looked for ways to control the size of the allocated chunk and found that we could change the method->name.slen value in the create_tsx_key_2543() function to bring the remaining space close to zero. The following SIP message manipulates method->name.slen by putting a long method name in the CSeq field.

OPTIONS sip:3 SIP/2.0
f: <sip:2>
t: <sip:1>
i: a
CSeq: 18..(Cseq)..514 AAAAAA..(method)..AAAAAAA
v: SIP/2.0/U 4:186..(port)..14

For the details of the proof of concept and the AddressSanitizer results, see this ticket.

Ending notes

All the code that parses SIP message header fields, such as the CSeq field, the TTL field, and the port number, was potentially affected by the buffer overflow. To address this issue, the patch validates the range of each field value during parsing, so that the fields cannot hold negative values. If you are interested in vulnerabilities caused by integers in C, check out the SEI CERT C Coding Standard for more information.
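
As an illustration of range validation at parse time, here is a minimal sketch. It is not the code from the patch: parse_bounded() is a hypothetical helper, and the limits used in the examples (INT32_MAX for a CSeq-like field, 65535 for a port) are assumptions chosen for the demonstration.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the PJSIP patch.
 * Parses the leading decimal digits of s into *out and rejects any
 * value above max (which must be non-negative).
 * Returns 0 on success, -1 on empty input or out-of-range values. */
static int parse_bounded(const char *s, int32_t max, int32_t *out)
{
    uint64_t value = 0;

    if (*s < '0' || *s > '9')
        return -1;
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + (uint64_t)(*s - '0');
        if (value > (uint64_t)max)
            return -1;   /* stop early; value can never wrap around */
        ++s;
    }
    *out = (int32_t)value;   /* safe: 0 <= value <= max <= INT32_MAX */
    return 0;
}
```

Because every accepted value lies between 0 and max, a later conversion to unsigned long cannot produce more digits than the largest positive int. For instance, "4294967295" is rejected with a limit of INT32_MAX, and "65536" is rejected with a limit of 65535.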

