Skip to content

Added first fuzzers as step in OSS-Fuzz integration.#5320

Open
DavidKorczynski wants to merge 8 commits into
aio-libs:masterfrom
DavidKorczynski:master
Open

Added first fuzzers as step in OSS-Fuzz integration.#5320
DavidKorczynski wants to merge 8 commits into
aio-libs:masterfrom
DavidKorczynski:master

Conversation

@DavidKorczynski

@DavidKorczynski DavidKorczynski commented Dec 7, 2020

Copy link
Copy Markdown

What do these changes do?

Hi Maintainers,

I would like to set up continuous fuzzing of aiohttp by way of OSS-Fuzz. Essentially, OSS-Fuzz is a free service run by Google that performs continuous fuzzing of important open source projects. OSS-Fuzz recently added support for Python and it would be great to have aiohttp integrated. The only expectation of integrating into OSS-Fuzz is that bugs will be fixed. This is not a "hard" requirement in that no one enforces this and the main point is if bugs are not fixed then it is a waste of resources to run the fuzzers, which we would like to avoid.

In this PR google/oss-fuzz#4764 I have done exactly that, namely created the necessary logic from an OSS-Fuzz perspective to integrate aiohttp. If you would like to integrate, could you please provide a set of email(s) that will get access to the data produced by OSS-Fuzz, such as bug reports, coverage reports and more stats. The emails should be linked to a Google account in order to view the detailed reports and notice the emails affiliated with the project will be public in the OSS-Fuzz repo, as they will be part of a configuration file.

Are there changes in behavior for the user?

No

Related issue number

Fixes #3159.

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

@webknjaz webknjaz left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If accepted, this would need to add the test dependencies, integrate the runs into pytest instead of dropping-in standalone files nobody executes, and make the CI green (it's broken right now).

@DavidKorczynski

DavidKorczynski commented Dec 7, 2020

Copy link
Copy Markdown
Author

Thanks for the response @webknjaz

I would be happy to do those aspects as long as aiohttp would like to integrate into oss-fuzz, do you know if this is the case?

@webknjaz

webknjaz commented Dec 7, 2020

Copy link
Copy Markdown
Member

@DavidKorczynski I don't see any reason why not. But let's wait for @asvetlov to weigh in.

@DavidKorczynski

Copy link
Copy Markdown
Author

Thanks @webknjaz

Notice here that if you prefer to not have the fuzzers in the aiohttp repository then we can also store the fuzzers in the OSS-Fuzz repository for now. However, from a long term perspective it would be great to get fuzzing into the main aiohttp repo.

@asvetlov

asvetlov commented Dec 7, 2020

Copy link
Copy Markdown
Member

I support the idea in general but should learn about OSS-Fuzz more.
I've started by reading https://github.com/google/oss-fuzz README.
From my understanding, fuzzers can help with finding programming errors related to Python C Extensions.
Is it correct?
Maybe you can point on resources that are good starters?
How is it integrated, how can I get a feedback from fuzzing runs, what should we do to write really good and useful fuzzers, what is the fuzzers maintenance procedure, etc., etc.?

Fuzzers can live in aiohttp repo, I see no problem with it.
My email is andrew.svetlov@gmail.com

@codecov

codecov Bot commented Jan 26, 2021

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.17%. Comparing base (91dadb3) to head (42f0d25).
⚠️ Report is 6032 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5320   +/-   ##
=======================================
  Coverage   97.16%   97.17%           
=======================================
  Files          41       41           
  Lines        8739     8768   +29     
  Branches     1402     1404    +2     
=======================================
+ Hits         8491     8520   +29     
- Misses        129      130    +1     
+ Partials      119      118    -1     
Flag Coverage Δ
unit 97.05% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@DavidKorczynski

DavidKorczynski commented Jan 26, 2021

Copy link
Copy Markdown
Author

@asvetlov

Fuzzers have traditionally been used for native code as a way of catching memory corruption bugs and alike. However, recently there has been a move to more languages, e.g. Go, Rust and Python. When fuzzing pure python apps (no native code) the general bug you will be looking for are unhandled execptions, e.g. if function F promises to throw exceptions from the set X fuzzers can be used to try and break things and determine whether function F can ever throw an exception outside the promised set X. These types of bugs are closer to reliability bugs, but still crucial.

The specific fuzzer we will be using in this context is Atheris: https://github.com/google/atheris This is a good place to start for understanding fuzzing in a Python context.

I also created a small video that shows how the Atheris fuzzer works here: https://www.youtube.com/watch?v=Wjjlk_W7WFo

The way this is integrated is by having a set of files over in the OSS-Fuzz repository, that essentially builds and runs the fuzzers in a continuous manner. Once this gets merged, a way of running this by way of OSS-Fuzz will be the following simple commands:

git clone https://github.com/google/oss-fuzz
cd oss-fuzz
python3 infra/helper.py build_image aiohttp
python3 infra/helper.py build_fuzzers aiohttp
python3 infra/helper.py run_fuzzer aiohttp fuzz_http_parser

The specific fuzzers do not need a lot of maintenance, the main thing will be fixing the bugs when they happen. They way to write really good fuzzers is, from a high-level perspective, writing fuzzers that hit high-level functions such that the coverage-guided aspects of the fuzzer will make it reach far through the code.

You will get a ton of feedback about the status of the fuzzers, e.g. statistics, open crashes, input to trigger crashes, stack traces, fixed crashes, etc. from oss-fuzz.com. You will be able to login there and see all of it. In addition to this you will get emails whenever a crash is found.

We can also integrate the fuzzing into the CI, so the fuzzers will be run for each pull request. This can be hugely beneficial as a way of catching bugs before they are shipped.

@DavidKorczynski

Copy link
Copy Markdown
Author

Status update following #6772 (comment) is I will be looking revive this PR in the coming days.

@Dreamsorcerer

Copy link
Copy Markdown
Member

@DavidKorczynski Looking to revive this. Does #12887 look right for CI integration?

Is the idea of this PR that we can maintain the fuzzers in our repo and drop them from the oss-fuzz repo? It appears the current version fails on master due to a missing parameter currently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bot:chronographer:provided There is a change note present in this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[test enhancements] Fuzzing tests

4 participants