int

Thanks @MambaWong for pointing out the errors #22 of this article

related file

cpython/Objects/longobject.c
cpython/Include/longobject.h
cpython/Include/longintrepr.h

memory layout

after python3, there's only type named int, the long type in python2.x is int type in python3.x

the structure of long object looks like the structure of tuple object, obviously, there's only one field to store the real int value, that's ob_digit

But how does CPython represent the variable size int in byte level? Let's see

how is element stored inside

ingeter 0

notice, when the value is 0, the ob_digit field doesn't store anything, the value 0 in ob_size indicate that long object represent integer 0

i = 0

ingeter 1

there are two different types of ob_digit depends on your system.

#if PYLONG_BITS_IN_DIGIT == 30
typedef uint32_t digit;
typedef int32_t sdigit;
typedef uint64_t twodigits;
typedef int64_t stwodigits; /* signed variant of twodigits */
#define PyLong_SHIFT    30
#define _PyLong_DECIMAL_SHIFT   9 /* max(e such that 10**e fits in a digit) */
#define _PyLong_DECIMAL_BASE    ((digit)1000000000) /* 10 ** DECIMAL_SHIFT */
#elif PYLONG_BITS_IN_DIGIT == 15
typedef unsigned short digit;
typedef short sdigit; /* signed variant of digit */
typedef unsigned long twodigits;
typedef long stwodigits; /* signed variant of twodigits */
#define PyLong_SHIFT    15
#define _PyLong_DECIMAL_SHIFT   4 /* max(e such that 10**e fits in a digit) */
#define _PyLong_DECIMAL_BASE    ((digit)10000) /* 10 ** DECIMAL_SHIFT */

I've modified the source code to change the value of PYLONG_BITS_IN_DIGIT to 15

but when it's going to represent integer 1, ob_size becomes 1 and field in ob_digit represent the value 1 with type unsigned short

i = 1

ingeter -1

when i becomes -1, the only difference from the integer 1 is the value in ob_size field, CPython interpret ob_size as a signed type to differ the positive and negative sign

i = -1

ingeter 1023

the basic unit is type digit, which provide 2 bytes(16bits) for storage. And 1023 takes the rightmost 10 bits, so the value ob_size field is still 1.

ingeter 32767

the integer 32767 represent in the same way as usual

ingeter 32768

CPython don't use all the 16 bits in digit field, the first bit of every digit is reserved, we will see why later

little endian and big endian

notice, because the digit is the smallest unit in the CPython abstract level, The order between bytes inside a single ob_digit is the same as your machine order(mine is little endian)

Order between digit in the ob_digit array are represent as most-significant-digit-in-the-right-most order(little endian order)

we can have a better understanding with the integer value -262143

the minus sign is stored in the ob_size field

the interger 262143(2^18 = 262144) in binary representation is 00000011 11111111 11111111

reserved bit

why the left-most bit in digit is reserved? Why order between digit in the ob_digit array are represented as little-endian?

let's try to add two integer value

i = 1073741824 - 1 # 1 << 30 == 1073741824
j = 1

k = i + j

first, initialize a temporary PyLongObject with size = max(size(i), size(j)) + 1

step1, sum the firt digit in each ob_digit array to a variable named carray

step2, set the value in temp[0] to (carry & PyLong_MASK)

step3, right shift the carray up to the leftmost bit

step4, add the second digit in each ob_digit array to the result of carray

step5, set the value in temp[1] to (carry & PyLong_MASK)

step6, right shift the carray again

go to step4 and repeat until no more digit left, set the final carray to the last index of temp

the variable temp contains the sum, now, you see the reserved bit is used for the carry or borrow when you add/sub an integer, the digit in ob_digit array are stored in little-endian order so that the add/sub operation can process each digit from left to right

the sub operation is similar to the add operation, so you can read the source code directly

small ints

CPython also use a buffer pool to store the frequently used integer

#define NSMALLPOSINTS           257
#define NSMALLNEGINTS           5
static PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

let's see

c = 0
d = 0
e = 0
print(id(c), id(d), id(e)) # 4480940400 4480940400 4480940400
a = -5
b = -5
print(id(a), id(b)) # 4480940240 4480940240
f = 1313131313131313
g = 1313131313131313
print(id(f), id(g)) # 4484604176 4484604016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

long.md

long.md

int

contents

related file

memory layout

how is element stored inside

ingeter 0

ingeter 1

ingeter -1

ingeter 1023

ingeter 32767

ingeter 32768

little endian and big endian

reserved bit

small ints

Files

long.md

Latest commit

History

long.md

File metadata and controls

int

contents

related file

memory layout

how is element stored inside

ingeter 0

ingeter 1

ingeter -1

ingeter 1023

ingeter 32767

ingeter 32768

little endian and big endian

reserved bit

small ints