forked from python/cpython
Commit
pythonGH-96068: Document object layout (pythonGH-96069)
1 parent
16ebae4
commit 575f888
Showing
5 changed files
with
157 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
# Object layout | ||
|
||
## Common header | ||
|
||
Each Python object starts with two fields: | ||
|
||
* ob_refcnt | ||
* ob_type | ||
|
||
which form the header common to all Python objects, for all versions, | ||
and hold the reference count and class of the object, respectively. | ||
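As a rough illustration of the two-field header described above, here is a minimal C sketch. The names `sketch_type`, `sketch_object` and `sketch_header_words` are hypothetical stand-ins invented for this example, not CPython definitions, and `ptrdiff_t` is only assumed to be a word-sized signed integer on the target platform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a class; the real type object is much larger. */
typedef struct sketch_type sketch_type;

/* Illustrative mirror of the common header: two fields, in this order. */
typedef struct {
    ptrdiff_t    ob_refcnt;  /* reference count */
    sketch_type *ob_type;    /* class of the object */
} sketch_object;

/* Size of the sketched header, measured in pointer-sized words. */
static size_t sketch_header_words(void)
{
    return sizeof(sketch_object) / sizeof(void *);
}
```

On common 32-bit and 64-bit platforms this sketch occupies two words, with `ob_refcnt` at offset zero.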
|
||
## Pre-header | ||
|
||
Since the introduction of the cycle GC, there has also been a pre-header. | ||
Before 3.11, this pre-header was two words in size. | ||
It should be considered opaque to all code except the cycle GC. | ||
|
||
## 3.11 pre-header | ||
|
||
In 3.11 the pre-header was extended to include pointers to the VM managed ``__dict__``. | ||
The reason for moving the ``__dict__`` to the pre-header is that it allows | ||
faster access, as it is at a fixed offset, and it also allows an object's | ||
dictionary to be lazily created when the ``__dict__`` attribute is | ||
specifically asked for. | ||
|
||
In 3.11 the non-GC part of the pre-header consists of two pointers: | ||
|
||
* dict | ||
* values | ||
|
||
The values pointer refers to the ``PyDictValues`` array which holds the | ||
values of the object's attributes. | ||
Should the dictionary be needed, then ``values`` is set to ``NULL`` | ||
and the ``dict`` field points to the dictionary. | ||
|
||
## 3.12 pre-header | ||
|
||
In 3.12 the pointer to the list of weak references is added to the | ||
pre-header. In order to make space for it, the ``dict`` and ``values`` | ||
pointers are combined into a single tagged pointer: | ||
|
||
* weakreflist | ||
* dict_or_values | ||
|
||
If the object has no physical dictionary, then the ``dict_or_values`` | ||
has its low bit set to one, and points to the values array. | ||
If the object has a physical dictionary, then the ``dict_or_values`` | ||
has its low bit set to zero, and points to the dictionary. | ||
|
||
The untagged form is chosen for the dictionary pointer, rather than | ||
the values pointer, to enable the (legacy) C-API function | ||
`_PyObject_GetDictPtr(PyObject *obj)` to work. | ||
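The low-bit tagging scheme described above can be sketched in C as follows. This is only an illustration under stated assumptions: every helper name here (`tag_values`, `tag_dict`, `holds_values`, `get_values`, `get_dict`) is hypothetical and not part of CPython, and the sketch assumes both pointers are at least 2-byte aligned so the low bit is free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a values-array pointer: set the low bit to mark "no dictionary yet". */
static uintptr_t tag_values(void *values)
{
    assert(((uintptr_t)values & 1) == 0);  /* assumes an aligned pointer */
    return (uintptr_t)values | 1;
}

/* Store a dictionary pointer: left untagged, so the word is the pointer. */
static uintptr_t tag_dict(void *dict)
{
    assert(((uintptr_t)dict & 1) == 0);
    return (uintptr_t)dict;
}

/* Low bit one: the word refers to the values array. */
static int holds_values(uintptr_t word)
{
    return (word & 1) != 0;
}

/* Recover the values pointer by clearing the tag bit. */
static void *get_values(uintptr_t word)
{
    assert(holds_values(word));
    return (void *)(word & ~(uintptr_t)1);
}

/* Recover the dictionary pointer; no masking is needed. */
static void *get_dict(uintptr_t word)
{
    assert(!holds_values(word));
    return (void *)word;
}
```

Because the dictionary form needs no masking, code that reads the field as a plain pointer sees a valid dictionary pointer whenever a dictionary exists, which is the property the legacy function relies on.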
|
||
|
||
## Layout of a "normal" Python object in 3.12: | ||
|
||
* weakreflist | ||
* dict_or_values | ||
* GC 1 | ||
* GC 2 | ||
* ob_refcnt | ||
* ob_type | ||
|
||
For a "normal" Python object, that is one that doesn't inherit from a builtin | ||
class or have slots, the header and pre-header form the entire object. | ||
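To make the field order concrete, the following C sketch allocates one block holding the pre-header followed by the header, and hands out a pointer to the header, so the pre-header sits at negative offsets from the object pointer. All names (`sketch_preheader`, `sketch_header`, `sketch_alloc`, `sketch_pre`, `sketch_free`) are hypothetical, and the two GC words are treated as opaque integers; this is not how CPython's allocator is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Pre-header fields, in the order listed above. */
typedef struct {
    void     *weakreflist;
    uintptr_t dict_or_values;
    uintptr_t gc[2];            /* opaque to everything but the cycle GC */
} sketch_preheader;

/* Common header; the object pointer refers to the start of this part. */
typedef struct {
    ptrdiff_t ob_refcnt;
    void     *ob_type;
} sketch_header;

/* Allocate a zeroed pre-header plus header; return the header's address. */
static sketch_header *sketch_alloc(void)
{
    char *block = calloc(1, sizeof(sketch_preheader) + sizeof(sketch_header));
    if (block == NULL) {
        return NULL;
    }
    return (sketch_header *)(block + sizeof(sketch_preheader));
}

/* Step back from the object pointer to reach the pre-header. */
static sketch_preheader *sketch_pre(sketch_header *obj)
{
    return (sketch_preheader *)((char *)obj - sizeof(sketch_preheader));
}

/* Free the whole block, which starts at the pre-header. */
static void sketch_free(sketch_header *obj)
{
    free(sketch_pre(obj));
}
```

Since the pre-header has the same size and field order for every such object, `sketch_pre` needs no per-class information, which mirrors the fixed-offset argument made in the text.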
|
||
![Layout of "normal" object in 3.12](./object_layout_312.png) | ||
|
||
There are several advantages to this layout: | ||
|
||
* It allows lazy `__dict__`s, as described above. | ||
* The regular layout allows us to create tailored traversal and deallocation | ||
functions based on layout, rather than inheritance. | ||
* Multiple inheritance works properly, | ||
as the weakrefs and dict are always at the same offset. | ||
|
||
The full object layout, with an opaque part defined by a C extension | ||
and `__slots__`, looks like this: | ||
|
||
![Layout of "full" object in 3.12](./object_layout_full_312.png) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
digraph ideal { | ||
|
||
rankdir = "LR" | ||
|
||
|
||
object [ | ||
shape = none | ||
label = <<table border="0" cellspacing="0"> | ||
<tr><td><b>object</b></td></tr> | ||
<tr><td port="w" border="1">weakrefs</td></tr> | ||
<tr><td port="dv" border="1">dict or values</td></tr> | ||
<tr><td border="1" >GC info 0</td></tr> | ||
<tr><td border="1" >GC info 1</td></tr> | ||
<tr><td port="r" border="1" >refcount</td></tr> | ||
<tr><td port="h" border="1" >__class__</td></tr> | ||
</table>> | ||
] | ||
|
||
values [ | ||
shape = none | ||
label = <<table border="0" cellspacing="0"> | ||
<tr><td><b>values</b></td></tr> | ||
<tr><td port="0" border="1">values[0]</td></tr> | ||
<tr><td border="1">values[1]</td></tr> | ||
<tr><td border="1">...</td></tr> | ||
</table>> | ||
|
||
] | ||
|
||
class [ | ||
shape = none | ||
label = <<table border="0" cellspacing="0"> | ||
<tr><td><b>class</b></td></tr> | ||
<tr><td port="head" bgcolor="lightgreen" border="1">...</td></tr> | ||
<tr><td border="1" bgcolor="lightgreen">dict_offset</td></tr> | ||
<tr><td border="1" bgcolor="lightgreen">...</td></tr> | ||
<tr><td port="k" border="1" bgcolor="lightgreen">cached_keys</td></tr> | ||
</table>> | ||
] | ||
|
||
keys [label = "dictionary keys"; fillcolor="lightgreen"; style="filled"] | ||
NULL [ label = " NULL"; shape="plain"] | ||
object:w -> NULL | ||
object:h -> class:head | ||
object:dv -> values:0 | ||
class:k -> keys | ||
|
||
oop [ label = "pointer"; shape="plain"] | ||
oop -> object:r | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
digraph ideal { | ||
|
||
rankdir = "LR" | ||
|
||
|
||
object [ | ||
shape = none | ||
label = <<table border="0" cellspacing="0"> | ||
<tr><td><b>object</b></td></tr> | ||
<tr><td port="w" border="1">weakrefs</td></tr> | ||
<tr><td port="dv" border="1">dict or values</td></tr> | ||
<tr><td border="1" >GC info 0</td></tr> | ||
<tr><td border="1" >GC info 1</td></tr> | ||
<tr><td port="r" border="1" >refcount</td></tr> | ||
<tr><td port="h" border="1" >__class__</td></tr> | ||
<tr><td border="1">opaque (extension) data </td></tr> | ||
<tr><td border="1">...</td></tr> | ||
<tr><td border="1">__slot__ 0</td></tr> | ||
<tr><td border="1">...</td></tr> | ||
</table>> | ||
] | ||
|
||
oop [ label = "pointer"; shape="plain"] | ||
oop -> object:r | ||
} |