
Commit 577e5c0

Release 25.06.0 (#964)
Merge branch-25.06 into main.

Release notes:
- Updates RAPIDS dependencies to 25.06.
- Spark Rapids Connect ML plugin improvements:
  - Extends the Spark Rapids Connect ML plugin to support accelerated KMeans, LinearRegression, RandomForest regression and classification, and PCA.
  - Adds runtime Spark configs for `verbose`, `float32_inputs`, and `num_workers`, allowing these to be set over Spark Connect when using the accelerated plugin.
  - Improves transfer of RandomForest models from Python to the JVM.
- Bundles the plugin jar for Spark 4.0 in the pip package.
- Bug fixes in UMAP and in LogisticRegression on large datasets.

Known issues:
- RandomForest inference:
  - may fail on nodes with multiple GPUs; as a workaround, convert the model via the `cpu()` API for CPU-based inference.
  - may fail for very wide inputs (e.g. > 10000 features).
- CrossValidator for RandomForest over Spark Connect will fail in Spark 4.0. A fix is pending in Spark 4.1.

Note: merge this PR using **Create a merge commit**.
2 parents 9ccd267 + c316a70 commit 577e5c0

79 files changed

Lines changed: 2537 additions & 429 deletions

Note: some file content (including one filename below) is hidden by default in this large commit view.

README.md

Lines changed: 2 additions & 0 deletions
````diff
@@ -4,6 +4,8 @@ Spark Rapids ML enables GPU accelerated distributed machine learning on [Apache
 
 These APIs seek to minimize any code changes to end user Spark code. After your environment is configured to support GPUs (with drivers, CUDA toolkit, and RAPIDS dependencies), you should be able to just change an import statement or class name to take advantage of GPU acceleration. See [here](./python/README.md#clis-enabling-no-package-import-change) for experimental CLIs that enable GPU acceleration without the need for changing the `pyspark.ml` package names in an existing pyspark ml application.
 
+[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/NVIDIA/spark-rapids-ml)
+
 **Python**
 ```python
 # from pyspark.ml.feature import PCA
````

ci/Dockerfile

Lines changed: 12 additions & 4 deletions
```diff
@@ -17,15 +17,23 @@
 ARG CUDA_VERSION=11.8.0
 FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu22.04
 
+# ubuntu22
+RUN sed -i -e 's|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g' \
+    -e 's|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g' \
+    /etc/apt/sources.list
+# ubuntu24+
+RUN find /etc/apt/sources.list.d/ -name '*.sources' -exec sed -i \
+    -e "s|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g" \
+    -e "s|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g" {} +
 # Install packages to build spark-rapids-ml
 RUN chmod 1777 /tmp
 RUN apt update -y \
-    && DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt install -y openjdk-8-jdk \
-    && apt install -y git numactl software-properties-common wget zip \
+    && DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt install -y openjdk-17-jdk \
+    && apt install -y git numactl software-properties-common wget zip maven \
     && rm -rf /var/lib/apt/lists
 
 # Config JAVA_HOME
-ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
+ENV JAVA_HOME /usr/lib/jvm/java-1.17.0-openjdk-amd64
 
 # Install conda
 ENV PATH="/root/miniconda3/bin:${PATH}"
@@ -38,6 +46,6 @@ RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86
     && conda config --set solver libmamba
 
 # install cuML
-ARG CUML_VER=25.04
+ARG CUML_VER=25.06
 RUN conda install -y -c rapidsai -c conda-forge -c nvidia cuml=$CUML_VER cuvs=$CUML_VER python=3.10 cuda-version=11.8 numpy~=1.0 \
     && conda clean --all -f -y
```
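The mirror rewrite added at the top of this (and the other) Dockerfiles has to handle two apt source formats: Ubuntu 22.04 and earlier use a one-line-per-entry `/etc/apt/sources.list`, while Ubuntu 24.04+ ships deb822-style `*.sources` files under `/etc/apt/sources.list.d/`, hence the second `find ... -exec sed` pass. A minimal sketch of the same rewrite against scratch files (temp paths, not the real `/etc/apt`):

```shell
# Demonstrate the http->https rewrite on throwaway copies of both formats.
tmp=$(mktemp -d)

# Classic one-line format (Ubuntu 22.04 and earlier): sources.list
cat > "$tmp/sources.list" <<'EOF'
deb http://archive.ubuntu.com/ubuntu jammy main
deb http://security.ubuntu.com/ubuntu jammy-security main
EOF
sed -i -e 's|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g' \
    -e 's|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g' \
    "$tmp/sources.list"

# deb822 format (Ubuntu 24.04+): *.sources files under sources.list.d/
mkdir -p "$tmp/sources.list.d"
cat > "$tmp/sources.list.d/ubuntu.sources" <<'EOF'
Types: deb
URIs: http://archive.ubuntu.com/ubuntu
Suites: noble noble-updates
EOF
find "$tmp/sources.list.d" -name '*.sources' -exec sed -i \
    -e 's|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g' \
    -e 's|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g' {} +

grep -c 'https://' "$tmp/sources.list" "$tmp/sources.list.d/ubuntu.sources"
```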

ci/deploy.sh

Lines changed: 12 additions & 1 deletion
```diff
@@ -1,6 +1,6 @@
 #!/bin/bash
 #
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -17,6 +17,17 @@
 
 set -ex
 
+# build plugin jar
+pushd jvm
+mvn clean package -DskipTests
+popd
+
+# copy plugin jar to python package
+JARS_DIR=python/src/spark_rapids_ml/jars
+mkdir -p $JARS_DIR
+rm -f $JARS_DIR/*.jar
+cp jvm/target/*.jar $JARS_DIR
+
 # build whl package
 pushd python
 pip install -r requirements_dev.txt && pip install -e .
```
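The jar-bundling steps added above clear any stale jar before copying the freshly built one, so a previous release's plugin can never ride along inside the wheel. A sketch of that flow against a scratch directory (both jar filenames below are made-up stand-ins):

```shell
# Mirror of deploy.sh's jar-bundling steps using a throwaway directory.
set -e
work=$(mktemp -d)
mkdir -p "$work/jvm/target" "$work/python/src/spark_rapids_ml/jars"
touch "$work/jvm/target/plugin-25.06.0.jar"                       # freshly built jar (hypothetical name)
touch "$work/python/src/spark_rapids_ml/jars/plugin-25.04.0.jar"  # stale jar from a prior build

JARS_DIR="$work/python/src/spark_rapids_ml/jars"
mkdir -p "$JARS_DIR"
rm -f "$JARS_DIR"/*.jar            # drop stale jars first
cp "$work"/jvm/target/*.jar "$JARS_DIR"
ls "$JARS_DIR"                     # only the new jar remains
```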

ci/test.sh

Lines changed: 4 additions & 1 deletion
```diff
@@ -1,6 +1,6 @@
 #!/bin/bash
 #
-# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -44,6 +44,9 @@ pip install -r requirements_dev.txt && pip install -e .
 # benchmark
 ./run_benchmark.sh $bench_args
 
+# plugin tests
+./run_plugin_test.sh
+
 # check compatibility with Spark 3.3 in nightly run
 # also push draft release docs to gh-pages
 if [[ $type == "nightly" ]]; then
```

docker/Dockerfile

Lines changed: 9 additions & 1 deletion
```diff
@@ -1,5 +1,5 @@
 #
-# Copyright (c) 2023, NVIDIA CORPORATION.
+# Copyright (c) 2023-2025, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -38,6 +38,14 @@
 ARG CUDA_VERSION=11.5.2
 FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu20.04
 
+# ubuntu22
+RUN sed -i -e 's|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g' \
+    -e 's|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g' \
+    /etc/apt/sources.list
+# ubuntu24+
+RUN find /etc/apt/sources.list.d/ -name '*.sources' -exec sed -i \
+    -e "s|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g" \
+    -e "s|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g" {} +
 # Install packages to build spark-rapids-ml jars
 RUN apt update -y && \
     DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt install -y openjdk-8-jdk wget git zip
```

docker/Dockerfile.pip

Lines changed: 12 additions & 3 deletions
```diff
@@ -18,12 +18,21 @@ ARG CUDA_VERSION=11.8.0
 FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu22.04
 
 ARG PYSPARK_VERSION=3.3.1
-ARG RAPIDS_VERSION=25.4.0
+ARG RAPIDS_VERSION=25.6.0
 ARG ARCH=amd64
 #ARG ARCH=arm64
+
+# ubuntu22
+RUN sed -i -e 's|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g' \
+    -e 's|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g' \
+    /etc/apt/sources.list
+# ubuntu24+
+RUN find /etc/apt/sources.list.d/ -name '*.sources' -exec sed -i \
+    -e "s|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g" \
+    -e "s|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g" {} +
 # Install packages to build spark-rapids-ml
 RUN apt-get update -y \
-    && DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt install -y openjdk-8-jdk \
+    && DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt install -y openjdk-17-jdk \
     && rm -rf /var/lib/apt/lists
 
 RUN apt-get update -y \
@@ -50,7 +59,7 @@ RUN pip install --no-cache-dir pyspark==${PYSPARK_VERSION} "scikit-learn>=1.2.1"
     numpydoc pydata-sphinx-theme pylint pytest "sphinx<6.0" "twine>=4.0.0"
 
 # Config JAVA_HOME
-ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-$ARCH
+ENV JAVA_HOME /usr/lib/jvm/java-1.17.0-openjdk-$ARCH
 
 ### END OF CACHE ###
 
```
docker/Dockerfile.python

Lines changed: 12 additions & 4 deletions
```diff
@@ -15,18 +15,26 @@
 #
 
 ARG CUDA_VERSION=11.8.0
-FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu20.04
+FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu22.04
 
-ARG CUML_VERSION=25.04
+ARG CUML_VERSION=25.06
 
+# ubuntu22
+RUN sed -i -e 's|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g' \
+    -e 's|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g' \
+    /etc/apt/sources.list
+# ubuntu24+
+RUN find /etc/apt/sources.list.d/ -name '*.sources' -exec sed -i \
+    -e "s|http://archive.ubuntu.com/ubuntu|https://archive.ubuntu.com/ubuntu|g" \
+    -e "s|http://security.ubuntu.com/ubuntu|https://security.ubuntu.com/ubuntu|g" {} +
 # Install packages to build spark-rapids-ml
 RUN apt update -y \
-    && DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt install -y openjdk-8-jdk \
+    && DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt install -y openjdk-17-jdk \
     && apt install -y git numactl software-properties-common wget zip \
     && rm -rf /var/lib/apt/lists
 
 # Config JAVA_HOME
-ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-amd64
+ENV JAVA_HOME /usr/lib/jvm/java-1.17.0-openjdk-amd64
 
 # Install conda
 ENV PATH="/root/miniconda3/bin:${PATH}"
```

docs/site/configuration.md

Lines changed: 4 additions & 0 deletions
```diff
@@ -11,5 +11,9 @@ The following configurations can be supplied as Spark properties.
 | spark.rapids.ml.uvm.enabled | false | if set to true, enables [CUDA unified virtual memory](https://developer.nvidia.com/blog/unified-memory-cuda-beginners/) (aka managed memory) during estimator.fit() operations to allow processing of larger datasets than would fit in GPU memory |
 | spark.rapids.ml.gpuMemRatioForData | None | If set to a float value between 0 and 1, Spark Rapids ML will reserve a portion of free GPU memory on each GPU and incrementally append PySpark data batches into this reserved space. This setting is recommended for large datasets, as it prevents duplicating the entire dataset in GPU memory and reduces the risk of out-of-memory errors. |
 | spark.rapids.ml.cpu.fallback.enabled | false | if set to true and spark-rapids-ml estimator.fit() is invoked with unsupported parameters or parameter values, the pyspark.ml cpu based estimator.fit() and model.transform() will be run; if set to false, an exception is raised in this case (default). |
+| spark.rapids.ml.verbose | None | if set to a boolean value (true/false) or an integer between 0 and 6, controls the verbosity level for cuML logging during estimator.fit() operations. This parameter can be set globally in Spark configuration and will be used if not explicitly set in the estimator constructor. |
+| spark.rapids.ml.float32_inputs | None | if set to a boolean value (true/false), controls whether input data should be converted to float32 precision before being passed to cuML algorithms. Setting this to true can reduce memory usage and potentially improve performance, but may affect numerical precision. This parameter can be set globally in Spark configuration and will be used if not explicitly set in the estimator constructor. |
+| spark.rapids.ml.num_workers | None | if set to an integer value greater than 0, specifies the number of workers to use for distributed training. This parameter can be set globally in Spark configuration and will be used if not explicitly set in the estimator constructor. |
+
 
 Since the algorithms rely heavily on Pandas UDFs, we also require `spark.sql.execution.arrow.pyspark.enabled=true` to ensure efficient data transfer between the JVM and Python processes.
```
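The three new properties are supplied like any other Spark conf. A hypothetical launch command (the application filename is made up for illustration):

```shell
# Hypothetical spark-submit invocation supplying the new runtime properties
# from the table above; my_gpu_ml_app.py is a made-up application name.
spark-submit \
  --conf spark.rapids.ml.verbose=4 \
  --conf spark.rapids.ml.float32_inputs=true \
  --conf spark.rapids.ml.num_workers=2 \
  --conf spark.sql.execution.arrow.pyspark.enabled=true \
  my_gpu_ml_app.py
```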
Lines changed: 7 additions & 0 deletions (filename hidden in the commit view)

```diff
@@ -0,0 +1,7 @@
+---
+title: Spark Connect
+parent: Getting Started
+---
+# Getting Started on Spark Connect
+
+Spark Rapids ML supports Spark Connect via the [Spark Rapids ML Connect Plugin](https://github.com/NVIDIA/spark-rapids-ml/blob/main/jvm). A prebuilt plugin jar compatible with Spark Connect 4.0 is bundled with the `spark-rapids-ml` pip package. See the getting-started [guide](https://github.com/NVIDIA/spark-rapids-ml/blob/main/jvm/README.md) for more information.
```
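Since the jar ships inside the pip package (under the `jars/` directory added by this release's `ci/deploy.sh`), one plausible launch shape is to locate it and hand it to the Spark Connect server. This is a sketch only; the paths, the single-jar assumption, and any plugin-specific `--conf` options are assumptions, and the authoritative steps are in `jvm/README.md`:

```shell
# Plausible shape only: find the bundled plugin jar in the installed
# spark-rapids-ml package and pass it to the Spark Connect 4.0 server.
PLUGIN_JAR=$(ls "$(python -c 'import spark_rapids_ml, os; print(os.path.dirname(spark_rapids_ml.__file__))')"/jars/*.jar)
"$SPARK_HOME"/sbin/start-connect-server.sh --jars "$PLUGIN_JAR"
```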

docs/source/conf.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -23,7 +23,7 @@
 project = 'spark-rapids-ml'
 copyright = '2025, NVIDIA'
 author = 'NVIDIA'
-release = '25.04.0'
+release = '25.06.0'
 
 # -- General configuration ---------------------------------------------------
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
```
