Comparison of Several Methods for Checking Addition Overflow

In practice, it is sometimes necessary to detect in advance whether arithmetic overflow will occur. C# provides the checked keyword to solve this problem; how should C/C++ handle it? This article compares and analyzes several methods for checking addition overflow at the assembly-instruction level. Limited by the experimental environment and my own knowledge, the scope of this article is limited to the X86-64 architecture.

However, it is worth noting that even other architectures are very likely to provide similar instructions for detecting the overflow bit.[1]

For convenience, the following discussion uses the addition of two 32-bit signed integers as an example. For operations on unsigned integers, it is sufficient to check the Carry bit.[2]

This article uses three methods:

Use embedded assembly code to check the processor’s Overflow flag;
Convert 32-bit addition to 64-bit addition, and then compare the result with INT_MAX or INT_MIN;
Determine and verify the interval in which the operation result lies.

Brief Overview of the Principles #

Using Embedded Assembly to Check the Processor’s Overflow Flag #

The theoretical basis for this method is as follows:[3]

The status register in a traditional processor design includes at least three central flags: Zero, Carry, and Overflow, which are set or cleared automatically as effects of arithmetic and bit manipulation operations.

In the X86 architecture, we can use these flags to determine whether the most recent arithmetic operation overflowed. In particular, X86 provides the JO and JNO instructions for conditional jumps based on the Overflow flag.[4]

Unfortunately, things are not always so ideal. Sometimes the compiler uses the LEA instruction to optimize arithmetic operations. Because the LEA instruction is not an arithmetic-operation instruction, it does not set the Overflow flag, causing this detection method to fail. A compromise solution is to use embedded assembly instructions to force the compiler to generate an ADD instruction for the addition operation.

Converting 32-Bit Addition to 64-Bit Addition, and Then Comparing with `INT_MAX` or `INT_MIN` #

Obviously, a 32-bit addition operation can produce at most a 33-bit result. X86 processors provide some support for 64-bit integer operations, so we can perform the computation using 64-bit integer arithmetic, and then compare the result with the boundary values of 32-bit integers to determine whether overflow occurred.

Determining and Verifying the Interval in Which the Operation Result Lies #

Because of the definition of two’s complement, once overflow occurs, the interval of the result can be determined in advance. If overflow occurs when adding two positive integers, the result must be less than either operand; if overflow occurs when adding two negative integers, the result must be greater than either operand.[5]

Based on this principle, we can verify the interval in which the result lies and thereby determine whether overflow occurred.

Brief Overview of the Experimental Code #

To take maximum advantage of compiler optimizations, I constructed the following code for the experiment:

cpp

inline int Add1(int a, int b)
{
    asm ("addl %1, %0"
        : "=r" (a)
        : "r" (a), "r" (b));
    asm goto ("jo %l0"
        : /* no output */
        : /* no input */
        : /* no clobber */
        : OVERFLOW);
    return a;

OVERFLOW:
    exit(1);
}

inline int Add2(int a, int b)
{
    long long tmp = (long long)a + (long long)b;
    if (tmp <= (long long)INT_MAX && tmp >= (long long)INT_MIN) {
        return (int)tmp;
    } else {
        exit(1);
    }
}

inline int Add3(int a, int b)
{
    int tmp = (int)((unsigned int)a + (unsigned int)b);

    if (a >= 0) {
        if (b >= 0 && tmp < a) {
            exit(1);
        }
    } else {
        if (b <= 0 && tmp > a) {
            exit(1);
        }
    }

    return tmp;
}

void T1(void) {
    int a, b;
    scanf("%d%d", &a, &b);
    int c = Add1(a, b);
    printf("%d + %d = %d\n", a, b, c);
}

void T2(void) {
    int a, b;
    scanf("%d%d", &a, &b);
    int c = Add2(a, b);
    printf("%d + %d = %d\n", a, b, c);
}

void T3(void) {
    int a, b;
    scanf("%d%d", &a, &b);
    int c = Add3(a, b);
    printf("%d + %d = %d\n", a, b, c);
}

Here, the Add1, Add2, and Add3 methods respectively correspond to the C-language implementations of the three methods proposed above. The inline specifier instructs the compiler to inline them for deeper optimization. The Add2 and Add3 methods refer to Microsoft’s SafeInt library.[6]

The T1, T2, and T3 methods respectively correspond to the three test methods. Here, values are not assigned directly to a and b; instead, the scanf function is used for input to prevent the compiler from performing offline computation directly during optimization and writing the generated result directly to the target location.

Assembly Code Analysis #

I used the following statement to generate the assembly code:

gcc -O3 -march=native -S test.c -o test.s

Incidentally, my experimental environment was:

CPU: Intel Core2 P7450
GCC: 4.8.2

Because the complete code is relatively long, only some key code is excerpted here. In the code generated for T1, the fragment related to Add1 (after inlining) is as follows:

gas

	movl 12(%rsp), %edx
	movl 8(%rsp), %esi
	addl %esi, %ecx
	jo   .L26
	# No Overflow
.L26:
	# Overflow

This is very simple: this assembly code stores the result of the scanf function in registers, then performs the addl operation, and then performs a conditional jump (based on the setting of the overflow bit).

In the code generated for T2, the fragment related to Add2 (after inlining) is as follows:

gas

	movl   12(%rsp), %edx
	movl   $2147483648, %edi
	movl   8(%rsp), %esi
	movslq %edx, %rax
	movslq %esi, %rcx
	addq   %rax, %rcx
	movl   $4294967295, %eax
	addq   %rcx, %rdi
	cmpq   %rax, %rdi
	ja     .L30
	# No Overflow
.L30:
	# Overflow

movslq can move data of long length and sign-extend it to a location of quad length. This code first extracts the results of scanf into the 32-bit registers edx and esi, and then respectively sign-extends the results and moves them into the 64-bit registers rax and rcx. Next, it performs 64-bit addition, with the result stored in the rcx register. The compiler then performs an optimization that looks quite complicated to a human reader. I have proven that this optimization is correct, but I am not going to expand on the details here.

In the code generated for T3, the fragment related to Add3 (after inlining) is as follows:

gas

	movl	8(%rsp), %esi
	movl	12(%rsp), %edx
	testl	%esi, %esi
	leal	(%rdx,%rsi), %ecx
	js	.L34
	cmpl	%ecx, %esi
	jg	.L45
.L35:
    # No Overflow
.L34:
	cmpl	%ecx, %esi
	jge	.L35
	movl	%edx, %eax
	shrl	$31, %eax
	testb	%al, %al
	je	.L35
.L36:
    # Overflow
.L45:
	testl	%edx, %edx
	js	.L35
	jmp	.L36

Even without analysis, it is clear that this code is relatively complex, and its runtime efficiency will not be higher than that of the previous two code fragments.

Brief Overview of the Experimental Design #

Using pregenerated random numbers, call Add1, Add2, Add3, and direct addition without overflow checking respectively, and collect timing performance statistics. At this point, these functions need to be modified slightly: when overflow occurs, they cannot call the exit function, but should instead use setjmp and longjmp for handling.

The main code is as follows:

(Because my work platform has migrated to Windows, I am temporarily unable to conduct the experiment, so the following experiment is missing.)

cpp

Experimental Results Analysis #

Summary #

References #

[1] ARM Processors: Condition Codes 1: Condition Flags and Codes. July, 2010. http://community.arm.com/groups/processors/blog/2010/07/16/condition-codes-1-condition-flags-and-codes

[2] Ian! D. Allen. The CARRY flag and OVERFLOW flag in binary arithmetic. http://teaching.idallen.com/dat2343/10f/notes/040_overflow.txt

[3] Status register. http://en.wikipedia.org/wiki/Status_register

[4] X86 instruction listings. http://en.wikipedia.org/wiki/X86_instruction_listings

[5] Randal E. Bryant and David R. O’Hallaron. Computer System: A Programmer’s Perspective, 2ed. Addison-Wesley. February 2010.

[6] SafeInt. http://safeint.codeplex.com/