gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access#145957
gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access#145957cmaloney wants to merge 12 commits intopython:mainfrom
Conversation
TextIOWrapper keeps its underlying stream in a member called `self->buffer`. That stream can be detached by user code, such as custom `.flush` implementations resulting in `self->buffer` being set to NULL. The implementation often checked at the start of functions if `self->buffer` is in a good state, but did not always recheck after other Python code was called which could modify `self->buffer`. The cases which need to be re-checked are hard to spot so rather than rely on reviewer effort create better safety by making all self->buffer access go through helper functions. Thank you yihong0618 for the test, NEWS and initial implementation in pythongh-143041. Co-authored-by: yihong0618 <zouzou0208@gmail.com>
|
wow that is cool thanks |
| def test_reentrant_detach_during_flush(self): | ||
| # gh-143008: Reentrant detach() during flush should raise RuntimeError | ||
| # instead of crashing. | ||
| # gh-143008: Reentrant detach() during flush should not crash. |
There was a problem hiding this comment.
Could you clarify what the goal / what you want with this comment? Is it more that you'd prefer RuntimeError or some similar piece? Would you like me to not use and update the test you wrote but build a separate one?
Overall trying to get this to a place it is hopefully shippable before the 3.15 beta.
There was a problem hiding this comment.
no just agree this sorry for misleading it
|
@colesbury could you take a look as you've done some recent work on TextIO safety around threading. These cases are generally single threaded but generally the pattern "Implementation usually checked state; then did an operation which could modify it and didn't recheck". |
|
I'm a bit confused about the relationship here between |
|
Less important, but please match the CPython style for C code, e.g.:
|
The more complicated case is during deletion where there is an interaction with the base class. Adding to the complexity here all the objects involved and the multiple operations (
Sorry for missing those bits, doing a review of that and will update for it shortly. |
|
There are a couple cases I'm not sure how to PEP 7, in particular: /* hits 81 cols */
PyObject *bytes = _textiowrapper_buffer_callmethod_noargs(self,
&_Py_ID(read));
/* hits 81 cols */
res = _textiowrapper_buffer_callmethod_onearg(self,
&_Py_ID(_dealloc_warn),
(PyObject *)self);
PyObject *input_chunk = _textiowrapper_buffer_callmethod_onearg(self,
&_Py_ID(read), bytes_to_feed);A couple directions I looked at but don't know how to decide between:
|
|
I think it's fine to make the function names shorter, i.e., In general the preference is for wrapping function calls like the following (from PEP 7 example): PyErr_Format(PyExc_TypeError,
"cannot create '%.100s' instances",
type->tp_name);But if that's not practical due to overly long lines, than just try to fit the style of surround code. For example, from elsewhere in PyObject *incrementalDecoder = PyObject_CallFunctionObjArgs(
(PyObject *)state->PyIncrementalNewlineDecoder_Type,
self->decoder, self->readtranslate ? Py_True : Py_False, NULL); |
TextIOWrapper keeps its underlying stream in a member called
self->buffer. That stream can be detached by user code, such as custom.flush()implementations, resulting inself->bufferbeing set to NULL. The implementation often checked at the start of functions ifself->bufferis in a good state but did not always recheck after other Python code which could modifyself->buffer.The cases which need to be re-checked are hard to spot so rather than rely on reviewer effort make a better safety net by changing all
self->bufferaccess to go through helper functions.Thank you @yihong0618 for the test, NEWS and an initial implementation in gh-143041.
TextIOWrapper.truncatevia re-entrantflush#143008