Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-126868: Add freelist for compact int objects #126865

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from
Draft
2 changes: 2 additions & 0 deletions Include/internal/pycore_freelist_state.h
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ extern "C" {
# define Py_dicts_MAXFREELIST 80
# define Py_dictkeys_MAXFREELIST 80
# define Py_floats_MAXFREELIST 100
# define Py_ints_MAXFREELIST 100
# define Py_slices_MAXFREELIST 1
# define Py_contexts_MAXFREELIST 255
# define Py_async_gens_MAXFREELIST 80
Expand All @@ -35,6 +36,7 @@ struct _Py_freelist {

struct _Py_freelists {
struct _Py_freelist floats;
struct _Py_freelist ints;
struct _Py_freelist tuples[PyTuple_MAXSAVESIZE];
struct _Py_freelist lists;
struct _Py_freelist dicts;
Expand Down
2 changes: 2 additions & 0 deletions Include/internal/pycore_long.h
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,8 @@ extern void _PyLong_FiniTypes(PyInterpreterState *interp);

/* other API */

PyAPI_FUNC(void) _PyLong_ExactDealloc(PyObject *self);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need it to be exported like that or could you live with a simple extern? If not, can you explain which file needs this export?

Copy link
Member

@Fidget-Spinner Fidget-Spinner Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed by the JIT on Windows, as it's used in bytecodes.c. This is a pretty common pattern in the internal C API unfortunately

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For semantic purposes, would it be better to have a macro named differently? (it would be a simple alias but it could help semantics and reviewers)


#define _PyLong_SMALL_INTS _Py_SINGLETON(small_ints)

// _PyLong_GetZero() and _PyLong_GetOne() must always be available
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Increase performance of int by adding a freelist for compact ints.
45 changes: 35 additions & 10 deletions Objects/longobject.c
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
#include "pycore_bitutils.h" // _Py_popcount32()
#include "pycore_initconfig.h" // _PyStatus_OK()
#include "pycore_call.h" // _PyObject_MakeTpCall
#include "pycore_freelist.h" // _Py_FREELIST_FREE, _Py_FREELIST_POP
#include "pycore_long.h" // _Py_SmallInts
#include "pycore_object.h" // _PyObject_Init()
#include "pycore_runtime.h" // _PY_NSMALLPOSINTS
Expand Down Expand Up @@ -42,7 +43,7 @@ static inline void
_Py_DECREF_INT(PyLongObject *op)
{
assert(PyLong_CheckExact(op));
_Py_DECREF_SPECIALIZED((PyObject *)op, (destructor)PyObject_Free);
_Py_DECREF_SPECIALIZED((PyObject *)op, _PyLong_ExactDealloc);
}

static inline int
Expand Down Expand Up @@ -220,11 +221,16 @@ _PyLong_FromMedium(sdigit x)
{
assert(!IS_SMALL_INT(x));
assert(is_medium_int(x));
/* We could use a freelist here */
PyLongObject *v = PyObject_Malloc(sizeof(PyLongObject));

// The small int cache is incompatible with _Py_NewReference which is called
// by _Py_FREELIST_POP.
PyLongObject *v = (PyLongObject *)_Py_FREELIST_POP_MEM(ints);
Copy link
Member

@markshannon markshannon Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you go back to _Py_FREELIST_POP and avoid reinitializing the object on line 237. We don't need to re-assign the ob_type field in objects taken from the freelist. Something like:

    PyLongObject *v = (PyLongObject *)_Py_FREELIST_POP(ints);
    if (v == NULL) {
        v = PyObject_Malloc(sizeof(PyLongObject));
        if (v == NULL) {
            PyErr_NoMemory();
            return NULL;
        }
        _PyObject_Init((PyObject*)v, &PyLong_Type);
    }
    digit abs_x = x < 0 ? -x : x;
    _PyLong_SetSignAndDigitCount(v, x<0?-1:1, 1);
    v->long_value.ob_digit[0] = abs_x;
    return (PyObject*)v;

if (v == NULL) {
PyErr_NoMemory();
return NULL;
v = PyObject_Malloc(sizeof(PyLongObject));
if (v == NULL) {
PyErr_NoMemory();
return NULL;
}
}
digit abs_x = x < 0 ? -x : x;
_PyLong_SetSignAndDigitCount(v, x<0?-1:1, 1);
Expand Down Expand Up @@ -3611,24 +3617,43 @@ long_richcompare(PyObject *self, PyObject *other, int op)
Py_RETURN_RICHCOMPARE(result, 0, op);
}

void
_PyLong_ExactDealloc(PyObject *self)
{
if (_PyLong_IsCompact((PyLongObject *)self)) {
_Py_FREELIST_FREE(ints, self, PyObject_Free);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised that we didnt check for if it's a smallint cached value here. It should be rejected there

Copy link
Member

@Fidget-Spinner Fidget-Spinner Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind. It's not needed because (theoretically) the refcount of a smallint never hits 0 due to it being immortal, so this code will never be called on it. Can you add an assert to be safe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also means my comment about the small ints was wrong. Could you remove that please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment has been adapted. If the refcount can never become zero, then the code in long_dealloc checking for small ints can be removed (perhaps also with an assert). Since this is something unrelated to the freelists I will make a separate PR for that and rebase this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fidget-Spinner I think your first remark was right after all and we do need to check for smallints (at least in the 32-bit stable ABI build). I will add the check once there is consensus on #127120 (which might very well be the status quo)

return;
}
PyObject_Free(self);
}

static void
long_dealloc(PyObject *self)
{
/* This should never get called, but we also don't want to SEGV if
* we accidentally decref small Ints out of existence. Instead,
* since small Ints are immortal, re-set the reference count.
*/
PyLongObject *pylong = (PyLongObject*)self;
Fidget-Spinner marked this conversation as resolved.
Show resolved Hide resolved
if (pylong && _PyLong_IsCompact(pylong)) {
assert(pylong);

if (_PyLong_IsCompact(pylong)) {
stwodigits ival = medium_value(pylong);
if (IS_SMALL_INT(ival)) {
PyLongObject *small_pylong = (PyLongObject *)get_small_int((sdigit)ival);
if (pylong == small_pylong) {
/* This should never get called, but we also don't want to SEGV if
* we accidentally decref small Ints out of existence. Instead,
* since small Ints are immortal, re-set the reference count.
*
* With a fixed refcount for immortal objects this should not happen
*/
_Py_SetImmortal(self);
return;
}
}
if (PyLong_CheckExact(self)) {
_Py_FREELIST_FREE(ints, self, PyObject_Free);
return;
}
}

Py_TYPE(self)->tp_free(self);
}

Expand Down
1 change: 1 addition & 0 deletions Objects/object.c
Original file line number Diff line number Diff line change
Expand Up @@ -906,6 +906,7 @@ _PyObject_ClearFreeLists(struct _Py_freelists *freelists, int is_finalization)
clear_freelist(&freelists->object_stack_chunks, 1, PyMem_RawFree);
}
clear_freelist(&freelists->unicode_writers, is_finalization, PyMem_Free);
clear_freelist(&freelists->ints, is_finalization, free_object);
}

/*
Expand Down
21 changes: 11 additions & 10 deletions Python/bytecodes.c
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
#include "pycore_pyerrors.h" // _PyErr_GetRaisedException()
#include "pycore_pystate.h" // _PyInterpreterState_GET()
#include "pycore_range.h" // _PyRangeIterObject
#include "pycore_long.h" // _PyLong_ExactDealloc()
#include "pycore_setobject.h" // _PySet_NextEntry()
#include "pycore_sliceobject.h" // _PyBuildSlice_ConsumeRefs
#include "pycore_tuple.h" // _PyTuple_ITEMS()
Expand Down Expand Up @@ -514,8 +515,8 @@ dummy_func(

STAT_INC(BINARY_OP, hit);
PyObject *res_o = _PyLong_Multiply((PyLongObject *)left_o, (PyLongObject *)right_o);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)_PyLong_ExactDealloc);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)_PyLong_ExactDealloc);
INPUTS_DEAD();
ERROR_IF(res_o == NULL, error);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand All @@ -527,8 +528,8 @@ dummy_func(

STAT_INC(BINARY_OP, hit);
PyObject *res_o = _PyLong_Add((PyLongObject *)left_o, (PyLongObject *)right_o);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)_PyLong_ExactDealloc);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)_PyLong_ExactDealloc);
INPUTS_DEAD();
ERROR_IF(res_o == NULL, error);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand All @@ -540,8 +541,8 @@ dummy_func(

STAT_INC(BINARY_OP, hit);
PyObject *res_o = _PyLong_Subtract((PyLongObject *)left_o, (PyLongObject *)right_o);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)_PyLong_ExactDealloc);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)_PyLong_ExactDealloc);
INPUTS_DEAD();
ERROR_IF(res_o == NULL, error);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand Down Expand Up @@ -797,7 +798,7 @@ dummy_func(
PyObject *res_o = PyList_GET_ITEM(list, index);
assert(res_o != NULL);
Py_INCREF(res_o);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc);
DEAD(sub_st);
PyStackRef_CLOSE(list_st);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand All @@ -817,7 +818,7 @@ dummy_func(
DEOPT_IF(Py_ARRAY_LENGTH(_Py_SINGLETON(strings).ascii) <= c);
STAT_INC(BINARY_SUBSCR, hit);
PyObject *res_o = (PyObject*)&_Py_SINGLETON(strings).ascii[c];
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc);
DEAD(sub_st);
PyStackRef_CLOSE(str_st);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand All @@ -838,7 +839,7 @@ dummy_func(
PyObject *res_o = PyTuple_GET_ITEM(tuple, index);
assert(res_o != NULL);
Py_INCREF(res_o);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc);
DEAD(sub_st);
PyStackRef_CLOSE(tuple_st);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand Down Expand Up @@ -950,7 +951,7 @@ dummy_func(
PyList_SET_ITEM(list, index, PyStackRef_AsPyObjectSteal(value));
assert(old_value != NULL);
Py_DECREF(old_value);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc);
DEAD(sub_st);
PyStackRef_CLOSE(list_st);
}
Expand Down
20 changes: 10 additions & 10 deletions Python/executor_cases.c.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

20 changes: 10 additions & 10 deletions Python/generated_cases.c.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading