Skip to content

gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access#145957

Open
cmaloney wants to merge 12 commits intopython:mainfrom
cmaloney:textio_nullbuffer
Open

gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access#145957
cmaloney wants to merge 12 commits intopython:mainfrom
cmaloney:textio_nullbuffer

Conversation

@cmaloney
Copy link
Copy Markdown
Contributor

@cmaloney cmaloney commented Mar 15, 2026

TextIOWrapper keeps its underlying stream in a member called self->buffer. That stream can be detached by user code, such as custom .flush() implementations, resulting in self->buffer being set to NULL. The implementation often checked at the start of functions if self->buffer is in a good state but did not always recheck after other Python code which could modify self->buffer.

The cases which need to be re-checked are hard to spot so rather than rely on reviewer effort make a better safety net by changing all self->buffer access to go through helper functions.

Thank you @yihong0618 for the test, NEWS and an initial implementation in gh-143041.

TextIOWrapper keeps its underlying stream in a member called
`self->buffer`. That stream can be detached by user code, such as custom
`.flush` implementations resulting in `self->buffer` being set to NULL.
The implementation often checked at the start of functions if
`self->buffer` is in a good state, but did not always recheck after
other Python code was called which could modify `self->buffer`.

The cases which need to be re-checked are hard to spot so rather than
rely on reviewer effort create better safety by making all self->buffer
access go through helper functions.

Thank you yihong0618 for the test, NEWS and initial implementation in
pythongh-143041.

Co-authored-by: yihong0618 <zouzou0208@gmail.com>
@yihong0618
Copy link
Copy Markdown
Contributor

wow that is cool thanks

@cmaloney cmaloney added needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes labels Mar 15, 2026
@cmaloney cmaloney changed the title gh-143008: Safer TextIOWrapper underlying stream access gh-143008: Fix Null Pointer Dereferences in TextIOWrapper underlying stream access Mar 17, 2026
@cmaloney cmaloney changed the title gh-143008: Fix Null Pointer Dereferences in TextIOWrapper underlying stream access gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access Mar 17, 2026
def test_reentrant_detach_during_flush(self):
# gh-143008: Reentrant detach() during flush should raise RuntimeError
# instead of crashing.
# gh-143008: Reentrant detach() during flush should not crash.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool seems better

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify what the goal / what you want with this comment? Is it more that you'd prefer RuntimeError or some similar piece? Would you like me to not use and update the test you wrote but build a separate one?

Overall trying to get this to a place it is hopefully shippable before the 3.15 beta.

Copy link
Copy Markdown
Contributor

@yihong0618 yihong0618 Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no just agree this sorry for misleading it

@cmaloney
Copy link
Copy Markdown
Contributor Author

@colesbury could you take a look as you've done some recent work on TextIO safety around threading. These cases are generally single threaded but generally the pattern "Implementation usually checked state; then did an operation which could modify it and didn't recheck".

@colesbury
Copy link
Copy Markdown
Contributor

I'm a bit confused about the relationship here between _textiowrapper_buffer_safe and CHECK_ATTACHED. When can buffer be NULL and CHECK_ATTACHED not trigger?

@colesbury colesbury self-requested a review April 17, 2026 17:57
@colesbury
Copy link
Copy Markdown
Contributor

Less important, but please match the CPython style for C code, e.g.:

  • { on new lines for functions definitions
  • when wrapping function calls, indent continuation lines to align with the call site

@cmaloney
Copy link
Copy Markdown
Contributor Author

I'm a bit confused about the relationship here between _textiowrapper_buffer_safe and CHECK_ATTACHED. When can buffer be NULL and CHECK_ATTACHED not trigger?

CHECK_ATTACHED, by way of CHECK_INITIALIZED, checks self->ok first. The self->ok flag is set to 0 at start of __init__, 1 at the end, and cleared/reset to 0 during .clear() as well as at the start of the Py_tp_dealloc. During __init__ there are calls to check the file is in a good state and get some attributes (remember: self->buf is the underlying file-like object not an internal buffer).

The more complicated case is during deletion where there is an interaction with the base class. TextIOWrapper implements Py_tp_dealloc and its base class (TextIOBase which has a base class IOBase) implements a Py_tp_finalize with the function iobase_finalize. That function tries to ensure all data is written in the full I/O "stack" before any part of it is closed out of order. It tries to have Text I/O flush and close, then Buffered I/O (self->buf), then the Raw I/O. If there is an underlying file (self->buf != NULL) during those operations need to try and write data in the TextIOWrapper even though self->ok is 0.

Adding to the complexity here all the objects involved and the multiple operations (flush(), close(), .closed) can be overridden and could call TextIOWrapper.clear() or TextIOWrapper.detach() causing self->buf to become NULL.

Less important, but please match the CPython style for C code, e.g.:

Sorry for missing those bits, doing a review of that and will update for it shortly.

@cmaloney
Copy link
Copy Markdown
Contributor Author

There are a couple cases I'm not sure how to PEP 7, in particular:

/* hits 81 cols */
        PyObject *bytes = _textiowrapper_buffer_callmethod_noargs(self,
                                                                  &_Py_ID(read));
/* hits 81 cols */
            res = _textiowrapper_buffer_callmethod_onearg(self,
                                                          &_Py_ID(_dealloc_warn),
                                                          (PyObject *)self);


PyObject *input_chunk = _textiowrapper_buffer_callmethod_onearg(self,
            &_Py_ID(read), bytes_to_feed);

A couple directions I looked at but don't know how to decide between:

  1. Make the variable name shorter (increases diff)
  2. Make the function name shorter (ex. _textiowrapper_buf_... or _textio_buffer_, _textiowrapper_buf_callmeth_noargs)
  3. Whats the right indentation to just put on the next line?

@colesbury
Copy link
Copy Markdown
Contributor

I think it's fine to make the function names shorter, i.e., buffer_callmethod_onearg instead of _textiowrapper_buffer_callmethod_onearg. File-scope static functions don't need a prefix or underscore.

In general the preference is for wrapping function calls like the following (from PEP 7 example):

PyErr_Format(PyExc_TypeError,
             "cannot create '%.100s' instances",
             type->tp_name);

But if that's not practical due to overly long lines, than just try to fit the style of surround code. For example, from elsewhere in textio.c:

        PyObject *incrementalDecoder = PyObject_CallFunctionObjArgs(
            (PyObject *)state->PyIncrementalNewlineDecoder_Type,
            self->decoder, self->readtranslate ? Py_True : Py_False, NULL);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting review needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants