Skip to content

Add HTTP parser memory regression tests#12871

Open
codingkiddo wants to merge 1 commit into
aio-libs:masterfrom
codingkiddo:test/http-parser-memory-regression
Open

Add HTTP parser memory regression tests#12871
codingkiddo wants to merge 1 commit into
aio-libs:masterfrom
codingkiddo:test/http-parser-memory-regression

Conversation

@codingkiddo

Copy link
Copy Markdown

What do these changes do?

This adds HTTP parser memory regression coverage for oversized inputs.

The new tests cover a small first slice of #12736:

  • a very long request line
  • a very long header line

The tests measure peak memory usage while feeding already-created input into the parser, so the large test input allocation itself is not included in the measured peak.

Are there changes in behavior for the user?

No. This is a test-only change and does not change runtime behavior or public APIs.

Is it a substantial burden for the maintainers to support this?

No. The change only adds focused regression tests for existing HTTP parser behavior.

The memory threshold is intentionally conservative to reduce the chance of CI flakiness across platforms and Python versions. The tests are also limited to a small HTTP parser scope, leaving the broader multipart and chunked-parser cases from #12736 for follow-up PRs.

Related issue number

Part of #12736

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
  • Add a new news fragment into the CHANGES/ folder

@psf-chronographer psf-chronographer Bot added the bot:chronographer:provided There is a change note present in this PR label Jun 8, 2026
@codecov

codecov Bot commented Jun 8, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.95%. Comparing base (86c33ba) to head (6996a9e).
⚠️ Report is 4 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #12871   +/-   ##
=======================================
  Coverage   98.95%   98.95%           
=======================================
  Files         131      131           
  Lines       47719    47743   +24     
  Branches     2475     2475           
=======================================
+ Hits        47219    47243   +24     
  Misses        376      376           
  Partials      124      124           
Flag Coverage Δ
Autobahn 22.26% <20.00%> (-0.01%) ⬇️
CI-GHA 98.92% <100.00%> (+<0.01%) ⬆️
OS-Linux 98.68% <100.00%> (+<0.01%) ⬆️
OS-Windows 97.07% <100.00%> (+<0.01%) ⬆️
OS-macOS 97.95% <100.00%> (+<0.01%) ⬆️
Py-3.10 98.17% <100.00%> (-0.01%) ⬇️
Py-3.11 98.42% <100.00%> (+<0.01%) ⬆️
Py-3.12 98.51% <100.00%> (-0.01%) ⬇️
Py-3.13 98.49% <100.00%> (+<0.01%) ⬆️
Py-3.14 98.51% <100.00%> (+<0.01%) ⬆️
Py-3.14t 97.58% <100.00%> (+<0.01%) ⬆️
Py-pypy-3.11 97.42% <48.00%> (-0.04%) ⬇️
VM-macos 97.95% <100.00%> (+<0.01%) ⬆️
VM-ubuntu 98.68% <100.00%> (+<0.01%) ⬆️
VM-windows 97.07% <100.00%> (+<0.01%) ⬆️
cython-coverage 38.17% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

Signed-off-by: Vinod Kumar <codingkiddo@gmail.com>
@codingkiddo codingkiddo force-pushed the test/http-parser-memory-regression branch from bedeaec to 6996a9e Compare June 8, 2026 08:24
@codspeed-hq

codspeed-hq Bot commented Jun 8, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 72 untouched benchmarks
⏩ 72 skipped benchmarks1


Comparing codingkiddo:test/http-parser-memory-regression (6996a9e) with master (86c33ba)

Open in CodSpeed

Footnotes

  1. 72 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Comment thread tests/test_http_parser.py


def _get_peak_memory_used(func: Callable[[], None]) -> int:
tracemalloc = pytest.importorskip("tracemalloc")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what situation is this unavailable? I can't see any notice in the docs suggesting it's not always available..

Comment thread tests/test_http_parser.py
Comment on lines +3132 to +3137
try:
func()
_, peak = tracemalloc.get_traced_memory()
return peak
finally:
tracemalloc.stop()

@Dreamsorcerer Dreamsorcerer Jun 8, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably be:

Suggested change
try:
func()
_, peak = tracemalloc.get_traced_memory()
return peak
finally:
tracemalloc.stop()
try:
func()
else:
_, peak = tracemalloc.get_traced_memory()
finally:
tracemalloc.stop()
return peak

Comment thread tests/test_http_parser.py
Comment on lines +3141 to +3149
text = b"GET /" + (b"-" * 1_000_000) + b" HTTP/1.1\r\nHost: a\r\n\r\n"

def parse() -> None:
with pytest.raises(http_exceptions.LineTooLong):
parser.feed_data(text)

peak = _get_peak_memory_used(parse)

assert peak < 5 * 1024 * 1024

@Dreamsorcerer Dreamsorcerer Jun 8, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just thinking through this test with you here, but I don't this would produce our regression.

I think what we want to do here is something more like:

for i in range(10):
    parser.feed_data(text[chunk_len*i : chun_len*(i+1)])

So we are allocating memory for each new chunk, but the parser should avoid allocating memory for the entire line (i.e. it should be discarding the data or raising an exception as it processes these chunks).

Also, the test currently asserts the peak is <5MiB, but the payload is closer to 1MiB? Maybe we need a larger payload to test this properly?

@Dreamsorcerer Dreamsorcerer Jun 8, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If at all possible, it would great if you could validate the tests against an older release (e.g. 3.12, or 3.10 if you can get it running) where I expect most of them to fail.

Comment thread tests/test_http_parser.py
response.feed_eof() # Should complete without exception


def _get_peak_memory_used(func: Callable[[], None]) -> int:

@Dreamsorcerer Dreamsorcerer Jun 8, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the number of tests we want to look at, probably best to define this as a fixture in conftest.py.

It could even return a context manager possibly? Something that can be used like:

def test_foo(memory_recorder):
    with memory_recorder:
        ...
    assert memory_recorder.peak < foo

Just an idea..

@Dreamsorcerer Dreamsorcerer added the pr-unfinished The PR is unfinished and may need a volunteer to complete it label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bot:chronographer:provided There is a change note present in this PR pr-unfinished The PR is unfinished and may need a volunteer to complete it

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants