fix: bump netty to 4.1.118 in mmlspark/release demo image (MSRC 110886) #2557
Conversation
fix: bump netty to 4.1.118 in mmlspark/release demo image

Addresses MSRC case 110886 / incident 31000000570827. The mmlspark/release image (built from tools/docker/demo/Dockerfile) ships Spark 3.5.4, which pins netty 4.1.96.Final. That version is flagged for multiple CVEs (CVE-2023-44487, CVE-2024-29025, CVE-2025-24970, ...). Spark has not bumped netty in any 3.5.x release. netty 4.1.x is binary-compatible, so we replace all netty-*-4.1.96.Final*.jar files in /opt/spark/jars/ with 4.1.118.Final right after the Spark extract. This includes netty-codec-http2 (the specific artifact named by the finder).

Also removes 'pyspark' from the conda install line. It was pulling a complete second Spark install (PySpark 4.0.1) into /usr/local/lib/python*/site-packages/pyspark/ that nothing in the demo image actually used (SPARK_HOME points at /opt/spark) and that doubled the surface area scanners report on.

Validated locally:
- /opt/spark/jars/netty-*-4.1.96.Final*.jar: 0 matches after build
- /opt/spark/jars/netty-*-4.1.118.Final*.jar: full set present
- /usr/local/lib/.../pyspark: no longer exists
- spark-submit --version: works
- spark.range(5).count(): returns 5

Jetty (shaded inside hadoop-client-runtime-3.3.4.jar at 9.4.43) is OUT OF SCOPE for this PR; that requires a Spark/Hadoop swap and will be tracked separately.
Hey @BrendanWalsh 👋! We use semantic commit messages to streamline the release process. Examples of commit messages with semantic prefixes:

To test your commit locally, please follow our guide on building from source.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report

✅ All modified and coverable lines are covered by tests.

```diff
@@            Coverage Diff             @@
##           master    #2557      +/-   ##
==========================================
- Coverage   84.69%   84.66%   -0.03%
==========================================
  Files         335      335
  Lines       17753    17753
  Branches     1615     1615
==========================================
- Hits        15036    15031       -5
- Misses       2717     2722       +5
```

View full report in Codecov by Sentry.
Summary
Addresses MSRC case 110886 / incident 31000000570827. The finder reports that `io.netty_netty-codec-http2` and `org.eclipse.jetty_jetty-io` are still flagged as vulnerable in `mcr.microsoft.com/mmlspark/release` despite the Spark 3.5.0 -> 3.5.4 bump in #2520. This PR fixes the netty half of that case while staying on Spark 3.5.
Changes in `tools/docker/demo/Dockerfile`

- Replace netty 4.1.96.Final with 4.1.118.Final in `/opt/spark/jars/`. Spark 3.5.x pins netty 4.1.96 upstream and the entire 3.5.x line never bumped it. Since netty 4.1.x is binary-compatible, we drop in 4.1.118 right after the Spark extract step (a sketch of the swap follows this section). Covers all `netty-*` artifacts that ship in upstream Spark 3.5.4, including `netty-codec-http2` (the artifact named on the MSRC case) and the native epoll/kqueue jars for linux-x86_64, linux-aarch_64, osx-x86_64, osx-aarch_64.
- Drop `pyspark` from the `conda install` line. That argument was silently pulling in a complete second Spark install (PySpark 4.0.1) into `/usr/local/lib/python*/site-packages/pyspark/` that nothing in the demo actually uses -- `SPARK_HOME` and `PYTHONPATH` both point at `/opt/spark`. The duplicate install was doubling the surface area scanners report on.

CVEs covered by the netty bump

The 4.1.96.Final -> 4.1.118.Final jump picks up fixes for the CVEs flagged on the case, including CVE-2023-44487, CVE-2024-29025, and CVE-2025-24970.
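For reviewers, a minimal sketch of what the jar swap could look like inside a Dockerfile `RUN` step. This is illustrative only, not the exact lines from this PR: the loop, variable names, and the Maven Central URL layout are assumptions, and it presumes `/opt/spark/jars/` already holds the extracted Spark 3.5.4 distribution and that `curl` is available in the build stage.

```bash
# Illustrative sketch -- not the exact RUN step from this PR.
OLD=4.1.96.Final
NEW=4.1.118.Final
for jar in /opt/spark/jars/netty-*-${OLD}*.jar; do
  base=$(basename "$jar")
  # Artifact name, e.g. netty-codec-http2 or netty-transport-native-epoll.
  artifact=${base%-${OLD}*.jar}
  # Optional native classifier, e.g. -linux-x86_64 (empty for plain jars).
  suffix=${base#${artifact}-${OLD}}
  suffix=${suffix%.jar}
  curl -fsSL -o "/opt/spark/jars/${artifact}-${NEW}${suffix}.jar" \
    "https://repo1.maven.org/maven2/io/netty/${artifact}/${NEW}/${artifact}-${NEW}${suffix}.jar"
  rm -f "$jar"
done
```

Downloading the pinned new version and deleting the old jars in the same layer keeps the image reproducible and avoids shipping both netty versions side by side.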
Local validation
Built locally from this branch and inspected the resulting image:
- `netty-*-4.1.96.Final*.jar` in `/opt/spark/jars/`: 0 matches after build
- `netty-*-4.1.118.Final*.jar` (incl. `netty-codec-http2`): full set present
- `/usr/local/lib/.../pyspark/`: no longer exists
- `/opt/spark` version via `spark-submit --version`: works
- `spark.range(5).count()` smoke test: returns 5
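A hedged sketch of how these checks might be reproduced against the built image; the tag `mmlspark-demo:local` and the exact commands are assumptions, not the ones used for this PR.

```bash
# Hypothetical local validation; assumes the image was built as mmlspark-demo:local.
TAG=mmlspark-demo:local

# Old netty jars should be gone, new ones present.
docker run --rm "$TAG" bash -lc 'ls /opt/spark/jars/netty-*-4.1.96.Final*.jar 2>/dev/null | wc -l'  # expect 0
docker run --rm "$TAG" bash -lc 'ls /opt/spark/jars/netty-*-4.1.118.Final*.jar'                     # expect full set

# The duplicate PySpark tree should no longer exist.
docker run --rm "$TAG" bash -lc 'ls -d /usr/local/lib/python*/site-packages/pyspark 2>/dev/null || echo absent'

# Spark still works from /opt/spark.
docker run --rm "$TAG" /opt/spark/bin/spark-submit --version
docker run --rm "$TAG" bash -lc 'echo "print(spark.range(5).count())" | /opt/spark/bin/pyspark'     # expect 5
```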
Out of scope (tracked separately)

The other half of the MSRC case -- `org.eclipse.jetty_jetty-io` 9.4.43 -- is shaded inside `hadoop-client-runtime-3.3.4.jar` under `org.apache.hadoop.shaded.org.eclipse.jetty.*`. It cannot be hot-swapped the way netty can; it requires either migrating the demo image to a `spark-3.5.x-bin-without-hadoop` tarball plus a user-provided Hadoop 3.4.1 (which ships jetty 9.4.53), or waiting for the `spark4.1` branch to ship. A follow-up PR will track that.
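One way to confirm the shading for yourself (a sketch; the image tag is the same hypothetical one as above, and it assumes `unzip` is available in the image):

```bash
# List the relocated jetty classes baked into the Hadoop runtime jar.
docker run --rm mmlspark-demo:local bash -lc \
  'unzip -l /opt/spark/jars/hadoop-client-runtime-3.3.4.jar | grep "org/apache/hadoop/shaded/org/eclipse/jetty" | head'
```

Because the classes live under a relocated package inside a single fat jar, there is no standalone jetty jar to delete and replace the way the netty jars were swapped.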
Low. netty guarantees binary compatibility across the 4.1.x line, the swap happens at image build time so the build is reproducible, and upstream Spark 4.x releases already ship netty 4.1.118+.