Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-103295: expose API for writing perf map files #103546

Merged
merged 60 commits into from
May 21, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
60 commits
Select commit Hold shift + click to select a range
f76d2c5
Update osmodule.h
gsallam Apr 14, 2023
de6b63e
Update posixmodule.c
gsallam Apr 14, 2023
4f55b2d
Update pycore_ceval_state.h
gsallam Apr 14, 2023
03fccfb
Update perf_trampoline.c
gsallam Apr 14, 2023
78d543a
Update pylifecycle.c
gsallam Apr 14, 2023
e8c321d
Update pycore_ceval.h
gsallam Apr 14, 2023
3a97933
Update _testinternalcapi.c
gsallam Apr 14, 2023
94441c4
Create test_perfmaps.py
gsallam Apr 14, 2023
a221fd8
Update posixmodule.c
gsallam Apr 14, 2023
392e842
Update osmodule.h
gsallam Apr 14, 2023
4223241
Update osmodule.h
gsallam Apr 14, 2023
804153a
Update osmodule.h
gsallam Apr 14, 2023
ce54c6c
Update posixmodule.c
gsallam Apr 14, 2023
6c3f233
📜🤖 Added by blurb_it.
blurb-it[bot] Apr 14, 2023
71db361
Update osmodule.h
gsallam Apr 15, 2023
020c872
Update test_perfmaps.py
gsallam Apr 15, 2023
e33b45d
Update ignored.tsv
gsallam Apr 15, 2023
04e9716
Fix trampoline APIs on windows, return int instead of void* from the …
czardoz Apr 18, 2023
4b8cd26
fix restructuredtext literals
czardoz Apr 19, 2023
ccc2d5b
add perfmaps docs under utilities
czardoz Apr 19, 2023
71a7578
make "state" a non-required part of trampoline
czardoz Apr 19, 2023
0ede039
make the API no-op on windows, instead of raising errors
czardoz Apr 20, 2023
7356a9b
use PyUnstable as a prefix
gsallam May 8, 2023
b504e9a
move the implementation of the perf-map api to the sys module
gsallam May 8, 2023
99dab1a
Remove the perf-map API from the posixmodule.c
gsallam May 8, 2023
6b95cbc
Update perfmaps.rst to reflect the PyUnstable prefix
gsallam May 8, 2023
d0ef0a1
Update perf_trampoline.c to use PyUnstable_WritePerfMapEntry
gsallam May 8, 2023
c5e766b
Update pylifecycle.c to use PyUnstable_PerfMapState_Fini
gsallam May 8, 2023
b0641a7
expose PyUnstable_PerfMapState_Fini
gsallam May 8, 2023
c392b24
use assertIn instead of assertEqual and tear down the perf_map_state …
gsallam May 8, 2023
262a38b
use malloc instead of fixed buffer size
gsallam May 8, 2023
a52a25a
Fix perf map entry to have the START and SIZE as hex numbers without 0x
gsallam May 8, 2023
0932e30
Add (uintptr_t) casting for the code_addr
gsallam May 8, 2023
5aeb74c
update the perf map entry example to be hex without 0x
gsallam May 8, 2023
b0682e0
Update the doc with the PyUnstable prefix
gsallam May 8, 2023
db00a03
Merge remote-tracking branch 'upstream/main'
czardoz May 8, 2023
e581026
change function name to the new one
czardoz May 8, 2023
e185092
fix tests_perfmaps
gsallam May 9, 2023
c97137b
Update ignored.tsv
gsallam May 9, 2023
344782b
remove extra new line
gsallam May 9, 2023
1ff695a
remove extra trailing white space
gsallam May 9, 2023
2f13a4e
check perf_map_entry for NULL.
gpshead May 9, 2023
faa85f5
Merge remote-tracking branch 'upstream/main'
czardoz May 9, 2023
50d27e1
update docs
czardoz May 9, 2023
717c4e4
remove rst extension from inline doc links
czardoz May 9, 2023
428dfc3
fix the free_state function signature
czardoz May 18, 2023
ca2c0ff
Merge remote-tracking branch 'upstream/main'
czardoz May 18, 2023
e386ef8
trim extra whitespace
czardoz May 18, 2023
d245fbf
Return error codes instead of setting exceptions in PyUnstable_PerfMa…
gsallam May 20, 2023
2027de6
Add a note that holding the GIL is not required for these APIs
gsallam May 20, 2023
0383b80
Remove the init_state call since it is always NULL for the perf backend
gsallam May 20, 2023
59f3c15
trim extra whitespaces
gsallam May 20, 2023
8bc6150
simplify link to perf map docs from perf profiling
carljm May 20, 2023
c7a49b4
document return values, other doc tweaks
carljm May 20, 2023
125513a
move headers to sysmodule.h
carljm May 20, 2023
10630e8
pass through init return value from write-entry
carljm May 20, 2023
386dd42
apparently :c:data:`errno` doesn't exist
carljm May 20, 2023
6585f32
use PyMem_Raw* instead of malloc/free
carljm May 20, 2023
ea5965a
Merge branch 'main' into gsallam/main
carljm May 20, 2023
e6be2a7
fix PyMem_Free to PyMem_RawFree
carljm May 20, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 50 additions & 0 deletions Doc/c-api/perfmaps.rst
carljm marked this conversation as resolved.
Show resolved Hide resolved
carljm marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
.. highlight:: c

.. _perfmaps:

Support for Perf Maps
----------------------

On supported platforms (as of this writing, only Linux), the runtime can take
advantage of *perf map files* to make Python functions visible to an external
profiling tool (such as `perf <https://perf.wiki.kernel.org/index.php/Main_Page>`_).
A running process may create a file in the ``/tmp`` directory, which contains entries
that can map a section of executable code to a name. This interface is described in the
`documentation of the Linux Perf tool <https://git.kernel.org/pub/scm/linux/
kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/jit-interface.txt>`_.

In Python, these helper APIs can be used by libraries and features that rely
on generating machine code on the fly.

Note that holding the Global Interpreter Lock (GIL) is not required for these APIs.

.. c:function:: int PyUnstable_PerfMapState_Init(void)

Open the ``/tmp/perf-$pid.map`` file, unless it's already opened, and create
a lock to ensure thread-safe writes to the file (provided the writes are
done through :c:func:`PyUnstable_WritePerfMapEntry`). Normally, there's no need
to call this explicitly; just use :c:func:`PyUnstable_WritePerfMapEntry`
and it will initialize the state on first call.

Returns ``0`` on success, ``-1`` on failure to create/open the perf map file,
or ``-2`` on failure to create a lock. Check ``errno`` for more information
about the cause of a failure.

.. c:function:: int PyUnstable_WritePerfMapEntry(const void *code_addr, unsigned int code_size, const char *entry_name)

Write one single entry to the ``/tmp/perf-$pid.map`` file. This function is
thread safe. Here is what an example entry looks like::

# address size name
7f3529fcf759 b py::bar:/run/t.py

Will call :c:func:`PyUnstable_PerfMapState_Init` before writing the entry, if
the perf map file is not already opened. Returns ``0`` on success, or the
same error codes as :c:func:`PyUnstable_PerfMapState_Init` on failure.

.. c:function:: void PyUnstable_PerfMapState_Fini(void)

Close the perf map file opened by :c:func:`PyUnstable_PerfMapState_Init`.
This is called by the runtime itself during interpreter shut-down. In
general, there shouldn't be a reason to explicitly call this, except to
handle specific scenarios such as forking.
1 change: 1 addition & 0 deletions Doc/c-api/utilities.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,3 +19,4 @@ and parsing function arguments and constructing Python values from C values.
conversion.rst
reflection.rst
codec.rst
perfmaps.rst
4 changes: 1 addition & 3 deletions Doc/howto/perf_profiling.rst
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ functions to appear in the output of the ``perf`` profiler. When this mode is
enabled, the interpreter will interpose a small piece of code compiled on the
fly before the execution of every Python function and it will teach ``perf`` the
relationship between this piece of code and the associated Python function using
`perf map files`_.
:doc:`perf map files <../c-api/perfmaps>`.

.. note::

Expand Down Expand Up @@ -206,5 +206,3 @@ You can check if your system has been compiled with this flag by running::
If you don't see any output it means that your interpreter has not been compiled with
frame pointers and therefore it may not be able to show Python functions in the output
of ``perf``.

.. _perf map files: https://github.com/torvalds/linux/blob/0513e464f9007b70b96740271a948ca5ab6e7dd7/tools/perf/Documentation/jit-interface.txt
13 changes: 13 additions & 0 deletions Include/sysmodule.h
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,19 @@ Py_DEPRECATED(3.11) PyAPI_FUNC(int) PySys_HasWarnOptions(void);
Py_DEPRECATED(3.11) PyAPI_FUNC(void) PySys_AddXOption(const wchar_t *);
PyAPI_FUNC(PyObject *) PySys_GetXOptions(void);

#if !defined(Py_LIMITED_API)
typedef struct {
FILE* perf_map;
PyThread_type_lock map_lock;
} PerfMapState;

PyAPI_FUNC(int) PyUnstable_PerfMapState_Init(void);

PyAPI_FUNC(int) PyUnstable_WritePerfMapEntry(const void *code_addr, unsigned int code_size, const char *entry_name);

PyAPI_FUNC(void) PyUnstable_PerfMapState_Fini(void);
#endif

#ifndef Py_LIMITED_API
# define Py_CPYTHON_SYSMODULE_H
# include "cpython/sysmodule.h"
Expand Down
19 changes: 19 additions & 0 deletions Lib/test/test_perfmaps.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
import os
import sys
import unittest

from _testinternalcapi import perf_map_state_teardown, write_perf_map_entry

if sys.platform != 'linux':
raise unittest.SkipTest('Linux only')


class TestPerfMapWriting(unittest.TestCase):
def test_write_perf_map_entry(self):
self.assertEqual(write_perf_map_entry(0x1234, 5678, "entry1"), 0)
self.assertEqual(write_perf_map_entry(0x2345, 6789, "entry2"), 0)
with open(f"/tmp/perf-{os.getpid()}.map") as f:
perf_file_contents = f.read()
self.assertIn("1234 162e entry1", perf_file_contents)
self.assertIn("2345 1a85 entry2", perf_file_contents)
perf_map_state_teardown()
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
Introduced :c:func:`PyUnstable_WritePerfMapEntry`, :c:func:`PyUnstable_PerfMapState_Init` and
:c:func:`PyUnstable_PerfMapState_Fini`. These allow extension modules (JIT compilers in
particular) to write to perf-map files in a thread safe manner. The
:doc:`../howto/perf_profiling` also uses these APIs to write
entries in the perf-map file.
27 changes: 27 additions & 0 deletions Modules/_testinternalcapi.c
Original file line number Diff line number Diff line change
Expand Up @@ -759,6 +759,31 @@ clear_extension(PyObject *self, PyObject *args)
Py_RETURN_NONE;
}

static PyObject *
write_perf_map_entry(PyObject *self, PyObject *args)
{
const void *code_addr;
unsigned int code_size;
const char *entry_name;

if (!PyArg_ParseTuple(args, "KIs", &code_addr, &code_size, &entry_name))
return NULL;

int ret = PyUnstable_WritePerfMapEntry(code_addr, code_size, entry_name);
if (ret == -1) {
PyErr_SetString(PyExc_OSError, "Failed to write performance map entry");
return NULL;
}
return Py_BuildValue("i", ret);
}

static PyObject *
perf_map_state_teardown(PyObject *Py_UNUSED(self), PyObject *Py_UNUSED(ignored))
{
PyUnstable_PerfMapState_Fini();
Py_RETURN_NONE;
}

static PyObject *
iframe_getcode(PyObject *self, PyObject *frame)
{
Expand Down Expand Up @@ -815,6 +840,8 @@ static PyMethodDef module_functions[] = {
_TESTINTERNALCAPI_ASSEMBLE_CODE_OBJECT_METHODDEF
{"get_interp_settings", get_interp_settings, METH_VARARGS, NULL},
{"clear_extension", clear_extension, METH_VARARGS, NULL},
{"write_perf_map_entry", write_perf_map_entry, METH_VARARGS},
{"perf_map_state_teardown", perf_map_state_teardown, METH_NOARGS},
{"iframe_getcode", iframe_getcode, METH_O, NULL},
{"iframe_getline", iframe_getline, METH_O, NULL},
{"iframe_getlasti", iframe_getlasti, METH_O, NULL},
Expand Down
84 changes: 16 additions & 68 deletions Python/perf_trampoline.c
Original file line number Diff line number Diff line change
Expand Up @@ -193,75 +193,33 @@ typedef struct trampoline_api_st trampoline_api_t;
#define trampoline_api _PyRuntime.ceval.perf.trampoline_api
#define perf_map_file _PyRuntime.ceval.perf.map_file

static void *
perf_map_get_file(void)
{
if (perf_map_file) {
return perf_map_file;
}
char filename[100];
pid_t pid = getpid();
// Location and file name of perf map is hard-coded in perf tool.
// Use exclusive create flag wit nofollow to prevent symlink attacks.
int flags = O_WRONLY | O_CREAT | O_EXCL | O_NOFOLLOW | O_CLOEXEC;
snprintf(filename, sizeof(filename) - 1, "/tmp/perf-%jd.map",
(intmax_t)pid);
int fd = open(filename, flags, 0600);
if (fd == -1) {
perf_status = PERF_STATUS_FAILED;
PyErr_SetFromErrnoWithFilename(PyExc_OSError, filename);
return NULL;
}
perf_map_file = fdopen(fd, "w");
if (!perf_map_file) {
perf_status = PERF_STATUS_FAILED;
PyErr_SetFromErrnoWithFilename(PyExc_OSError, filename);
close(fd);
return NULL;
}
return perf_map_file;
}

static int
perf_map_close(void *state)
{
FILE *fp = (FILE *)state;
int ret = 0;
if (fp) {
ret = fclose(fp);
}
perf_map_file = NULL;
perf_status = PERF_STATUS_NO_INIT;
return ret;
}

static void
perf_map_write_entry(void *state, const void *code_addr,
unsigned int code_size, PyCodeObject *co)
{
assert(state != NULL);
FILE *method_file = (FILE *)state;
const char *entry = PyUnicode_AsUTF8(co->co_qualname);
if (entry == NULL) {
_PyErr_WriteUnraisableMsg("Failed to get qualname from code object",
NULL);
return;
const char *entry = "";
if (co->co_qualname != NULL) {
entry = PyUnicode_AsUTF8(co->co_qualname);
}
const char *filename = PyUnicode_AsUTF8(co->co_filename);
if (filename == NULL) {
_PyErr_WriteUnraisableMsg("Failed to get filename from code object",
NULL);
const char *filename = "";
if (co->co_filename != NULL) {
filename = PyUnicode_AsUTF8(co->co_filename);
}
size_t perf_map_entry_size = snprintf(NULL, 0, "py::%s:%s", entry, filename) + 1;
char* perf_map_entry = (char*) PyMem_RawMalloc(perf_map_entry_size);
if (perf_map_entry == NULL) {
return;
}
fprintf(method_file, "%" PRIxPTR " %x py::%s:%s\n", (uintptr_t) code_addr, code_size, entry,
filename);
fflush(method_file);
snprintf(perf_map_entry, perf_map_entry_size, "py::%s:%s", entry, filename);
PyUnstable_WritePerfMapEntry(code_addr, code_size, perf_map_entry);
PyMem_RawFree(perf_map_entry);
}

_PyPerf_Callbacks _Py_perfmap_callbacks = {
&perf_map_get_file,
NULL,
&perf_map_write_entry,
&perf_map_close
NULL,
};

static int
Expand Down Expand Up @@ -465,13 +423,6 @@ _PyPerfTrampoline_Init(int activate)
if (new_code_arena() < 0) {
return -1;
}
if (trampoline_api.state == NULL) {
void *state = trampoline_api.init_state();
if (state == NULL) {
return -1;
}
trampoline_api.state = state;
}
extra_code_index = _PyEval_RequestCodeExtraIndex(NULL);
if (extra_code_index == -1) {
return -1;
Expand All @@ -491,10 +442,6 @@ _PyPerfTrampoline_Fini(void)
tstate->interp->eval_frame = NULL;
}
free_code_arenas();
if (trampoline_api.state != NULL) {
trampoline_api.free_state(trampoline_api.state);
trampoline_api.state = NULL;
}
carljm marked this conversation as resolved.
Show resolved Hide resolved
extra_code_index = -1;
#endif
return 0;
Expand All @@ -507,6 +454,7 @@ _PyPerfTrampoline_AfterFork_Child(void)
// Restart trampoline in file in child.
int was_active = _PyIsPerfTrampolineActive();
_PyPerfTrampoline_Fini();
PyUnstable_PerfMapState_Fini();
if (was_active) {
_PyPerfTrampoline_Init(1);
}
Expand Down
1 change: 1 addition & 0 deletions Python/pylifecycle.c
Original file line number Diff line number Diff line change
Expand Up @@ -1775,6 +1775,7 @@ Py_FinalizeEx(void)
*/

_PyAtExit_Call(tstate->interp);
PyUnstable_PerfMapState_Fini();

/* Copy the core config, PyInterpreterState_Delete() free
the core config memory */
Expand Down
80 changes: 79 additions & 1 deletion Python/sysmodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,10 @@ extern const char *PyWin_DLLVersionString;
#include <emscripten.h>
#endif

#ifdef HAVE_FCNTL_H
#include <fcntl.h>
#endif

/*[clinic input]
module sys
[clinic start generated code]*/
Expand Down Expand Up @@ -2144,7 +2148,7 @@ sys_activate_stack_trampoline_impl(PyObject *module, const char *backend)
if (strcmp(backend, "perf") == 0) {
_PyPerf_Callbacks cur_cb;
_PyPerfTrampoline_GetCallbacks(&cur_cb);
if (cur_cb.init_state != _Py_perfmap_callbacks.init_state) {
if (cur_cb.write_state != _Py_perfmap_callbacks.write_state) {
if (_PyPerfTrampoline_SetCallbacks(&_Py_perfmap_callbacks) < 0 ) {
PyErr_SetString(PyExc_ValueError, "can't activate perf trampoline");
return NULL;
Expand Down Expand Up @@ -2240,6 +2244,80 @@ sys__getframemodulename_impl(PyObject *module, int depth)
}


#ifdef __cplusplus
extern "C" {
#endif

static PerfMapState perf_map_state;

PyAPI_FUNC(int) PyUnstable_PerfMapState_Init(void) {
#ifndef MS_WINDOWS
char filename[100];
pid_t pid = getpid();
// Use nofollow flag to prevent symlink attacks.
int flags = O_WRONLY | O_CREAT | O_APPEND | O_NOFOLLOW | O_CLOEXEC;
snprintf(filename, sizeof(filename) - 1, "/tmp/perf-%jd.map",
(intmax_t)pid);
int fd = open(filename, flags, 0600);
if (fd == -1) {
return -1;
}
else{
perf_map_state.perf_map = fdopen(fd, "a");
if (perf_map_state.perf_map == NULL) {
close(fd);
return -1;
}
}
perf_map_state.map_lock = PyThread_allocate_lock();
if (perf_map_state.map_lock == NULL) {
fclose(perf_map_state.perf_map);
return -2;
}
#endif
return 0;
}

PyAPI_FUNC(int) PyUnstable_WritePerfMapEntry(
const void *code_addr,
unsigned int code_size,
const char *entry_name
) {
#ifndef MS_WINDOWS
if (perf_map_state.perf_map == NULL) {
int ret = PyUnstable_PerfMapState_Init();
if(ret != 0){
return ret;
}
}
PyThread_acquire_lock(perf_map_state.map_lock, 1);
fprintf(perf_map_state.perf_map, "%" PRIxPTR " %x %s\n", (uintptr_t) code_addr, code_size, entry_name);
fflush(perf_map_state.perf_map);
PyThread_release_lock(perf_map_state.map_lock);
#endif
return 0;
}

PyAPI_FUNC(void) PyUnstable_PerfMapState_Fini(void) {
#ifndef MS_WINDOWS
if (perf_map_state.perf_map != NULL) {
// close the file
PyThread_acquire_lock(perf_map_state.map_lock, 1);
fclose(perf_map_state.perf_map);
PyThread_release_lock(perf_map_state.map_lock);

// clean up the lock and state
PyThread_free_lock(perf_map_state.map_lock);
perf_map_state.perf_map = NULL;
}
#endif
}

#ifdef __cplusplus
}
#endif


static PyMethodDef sys_methods[] = {
/* Might as well keep this in alphabetic order */
SYS_ADDAUDITHOOK_METHODDEF
Expand Down
1 change: 1 addition & 0 deletions Tools/c-analyzer/cpython/ignored.tsv
Original file line number Diff line number Diff line change
Expand Up @@ -356,6 +356,7 @@ Python/pystate.c - initial -
Python/specialize.c - adaptive_opcodes -
Python/specialize.c - cache_requirements -
Python/stdlib_module_names.h - _Py_stdlib_module_names -
Python/sysmodule.c - perf_map_state -
Python/sysmodule.c - _PySys_ImplCacheTag -
Python/sysmodule.c - _PySys_ImplName -
Python/sysmodule.c - whatstrings -
Expand Down