Releases: modin-project/modin

Modin 0.37.1

02 Oct 19:18
0.37.1
bce3707

This release includes a bug fix and a test fix.

Key Features and Updates Since 0.37.0

  • Stability and Bugfixes
    • FIX-#7684: When we exceed max_cost for all available Backends an error may occur (#7685)
  • Update testing suite
    • TEST-#7686: Fix comparisons in caster tests to check the backend instead of type (#7687)
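
FIX-#7684 above concerns the case where every available backend exceeds max_cost. These notes do not show Modin's cost calculator, so the sketch below is only a hypothetical, pure-Python illustration of that edge case. The function `choose_backend`, its parameters, and the choice to stay on the current backend when nothing fits the budget are all assumptions made for illustration, not Modin's API or its actual behavior.

```python
from typing import Mapping


def choose_backend(current: str, move_costs: Mapping[str, float], max_cost: float) -> str:
    """Pick the cheapest backend whose cost does not exceed max_cost.

    Hypothetical helper (not part of Modin). If every candidate is over
    budget, fall back to the current backend instead of raising.
    """
    affordable = {name: cost for name, cost in move_costs.items() if cost <= max_cost}
    if not affordable:
        # Edge case: all backends exceed max_cost, so keep the data where it is.
        return current
    return min(affordable, key=affordable.get)


# One backend fits the budget, so it is selected.
print(choose_backend("Pandas", {"Pandas": 10, "Ray": 4}, max_cost=5))
# No backend fits the budget, so the current backend is kept.
print(choose_backend("Pandas", {"Pandas": 10, "Ray": 8}, max_cost=5))
```

Returning the current backend is one defensive option; a real implementation might instead pick the least expensive candidate regardless of the budget.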

Contributors

@sfc-gh-jkew
@sfc-gh-mvashishtha

Modin 0.37.0

30 Sep 18:06
0.37.0
69739ae

This release includes bug fixes for Series.to_json, DataFrame.rename, and eval, and performance improvements for joins with AutoSwitchBackend enabled.

Key Features and Updates Since 0.36.0

  • Stability and Bugfixes
    • FIX-#7624: Add proper implementation for Series.to_json (#7673)
    • FIX-#7664: Add typing_extensions dependency (#7665)
    • FIX-#7667: Fix axis=None case for DataFrame.rename (#7674)
    • FIX-#7669: Respect eval(inplace=False). (#7670)
    • FIX-#7671: Fix transfer message truncating on larger sizes (#7672)
    • FIX-#7675: Allow backend switching to backends other than provided arguments (#7679)
  • Update testing suite
    • TEST-#7681: Interop Tests should all use Backend.get instead of "Ray" (#7680)
  • New Features
    • FEAT-#7676: in-place casting between DataFrame engines (#7666)

Contributors

@sfc-gh-dpetersohn
@sfc-gh-joshi
@sfc-gh-mvashishtha

Modin 0.36.0

09 Sep 23:51
0.36.0
f23be82

This release includes a bug fix, a performance improvement
for query() and eval(), and changes to the testing suite.

Key Features and Updates Since 0.35.0

  • Stability and Bugfixes
    • FIX-#7653: Respect AutoSwitchBackend for DataFrame.T/Series.T (#7654)
  • Performance enhancements
    • PERF-#7657: Fork pandas eval and query implementation to improve performance. (#7658)
  • Update testing suite
    • TEST-#7659: Ignore ray.init() warning about accelerators environment variable. (#7660)
    • TEST-#7661: Run push-to-main ray tests in parallel. (#7662)

Contributors

@sfc-gh-dpetersohn
@sfc-gh-joshi
@sfc-gh-mvashishtha

Modin 0.35.0

15 Aug 17:32
0.35.0
1551d01

This release includes various bug fixes and improvements, and adds
support for pandas 2.3.

Key Features and Updates Since 0.34.0

  • Stability and Bugfixes
    • FIX-#7622: Fall back to printing backend switching progress when tqdm is not available. (#7623)
    • FIX-#7638: Suppress default to pandas warnings on native pandas backend (#7639)
    • FIX-#7640: Respect AutoSwitchBackend.disable() in init. (#7641)
    • FIX-#7645: Stop raising an error for applying numpy ufuncs. (#7646)
  • Performance enhancements
    • PERF-#7435: Use shallow copies in native pandas mode (#7634)
  • Update testing suite
    • TEST-#7629: Update code for mypy 1.17 (#7630)
    • TEST-#7643: Fix residual failures from pandas 2.3 (#7644)
  • New Features
    • FEAT-#7604: Support pandas 2.3 (#7635)
    • FEAT-#7627: Define move_to and move_from methods (#7628)
    • FEAT-#7636: Make AutoSwitchBackend False by default. (#7637)
    • FEAT-#7647: Shorten the hybrid progress bar text (#7648)

Contributors

@sfc-gh-joshi
@sfc-gh-mvashishtha
@sfc-gh-vrpatel

Modin 0.34.0

07 Jul 17:24
0.34.0
27c6710

This release includes various bug fixes and improvements.

Key Features and Updates Since 0.33.0

  • Stability and Bugfixes
    • FIX-#5961: Preserve dtypes when inserting column to empty frame. (#7601)
    • FIX-#7551: Fix name ambiguity for value_counts() on Pandas backend (#7585)
    • FIX-#7582: Add copy parameter to array methods. (#7584)
    • FIX-#7595: Log backend switching information with the modin logger. (#7597)
    • FIX-#7611: Display 'modin.pandas' instead of 'None' in backend switching information. (#7612)
    • FIX-#7616: Implement array_function stub (#7617)
  • Update testing suite
    • TEST-#7451: Use https for modin-datasets.intel.com (#7596)
    • TEST-#7587: Stop calling np.array(copy=None) for numpy<2 (#7588)
    • TEST-#7598: Allow xgboost to log to root. (#7599)
    • TEST-#7602: Fix test_pickle by correctly using fixtures. (#7603)
    • TEST-#7611: Cap mpi4py<4.1 in CI. (#7614)
  • New Features
    • FEAT-#7606: Consider self_cost in hybrid casting calculator (#7605)
    • FEAT-#7607: Support pinning groupby objects in place. (#7608)
    • FEAT-#7618, FEAT-#7544: Support set_backend() for groupby objects. (#7619)
    • FEAT-#7620: Support pin_backend(inplace=False) for groupby objects. (#7621)

Contributors

@sfc-gh-vrpatel
@sfc-gh-joshi
@sfc-gh-mvashishtha
@sfc-gh-jkew

Modin 0.33.2

19 Jun 00:06
0.33.2
2133edc

This patch release includes some bug fixes.

Key Features and Updates Since 0.33.1

  • Stability and Bugfixes
    • FIX-#5961: Preserve dtypes when inserting column to empty frame. (#7601)
    • FIX-#7551: Fix name ambiguity for value_counts() on Pandas backend (#7585)
    • FIX-#7595: Log backend switching information with the modin logger. (#7597)
  • Update testing suite
    • TEST-#7598: Allow xgboost to log to root. (#7599)
    • TEST-#7602: Fix test_pickle by correctly using fixtures. (#7603)

Contributors

@sfc-gh-vrpatel
@sfc-gh-mvashishtha

Modin 0.33.1

29 May 00:44
0.33.1
7015a94

This patch release fixes a regression introduced in Modin 0.33.0.

Key Features and Updates Since 0.33.0

  • Stability and Bugfixes
    • FIX-#7582: Add copy parameter to array methods. (#7584)

Contributors

@sfc-gh-mvashishtha

Modin 0.33.0

23 May 20:36
0.33.0
cc19143

This release introduces a set of features for switching Modin execution between
multiple backends (e.g. Ray and local Pandas) manually or automatically. It also
includes several bug fixes.

Key Features and Updates Since 0.32.0

  • Stability and Bugfixes
    • FIX-#7327: Use sort parameter of DataFrame.stack (#7396)
    • FIX-#7346: Handle execution on Dask workers to avoid creating conflicting clients (#7347)
    • FIX-#7375: Fix Series.duplicated dropping name (#7395)
    • FIX-#7381: Fix Series binary operators ignoring fill_value (#7394)
    • FIX-#7383: Avoid broadcast issue in partition manager with custom NPartitions (#7399)
    • FIX-#7404: Implement interchange protocol for datetime columns (#7434)
    • FIX-#7405: Internally sort indices for loc/iloc set (#7440)
    • FIX-#7413: Always use positional index before computing argmin/argmax (#7463)
    • FIX-#7461: Set backend correctly with environment variables. (#7462)
    • FIX-#7465: Properly implement Series.rename_axis (#7466)
    • FIX-#7486: Add support for .astype(pandas.CategoricalDtype(…)) (#7487)
    • FIX-#7490: Exclude move_to and _update_inplace from casting. (#7491)
    • FIX-#7495: Separate extensions for aliases. (#7496)
    • FIX-#7521: Fix wrong extension being used when backend is pinned (#7546)
    • FIX-#7528: Dispatch module-level extensions to the correct backend (#7529)
    • FIX-#7532: Display choices in error message of environment vars (#7533)
    • FIX-#7536: setuptools / ray version conflict in pkg_resources._vendor (#7537)
    • FIX-#7538: set_backend should exit early if there is nothing to do (#7539)
    • FIX-#7547: native qc move_to_me_cost does not work with non-subclasses (#7548)
    • FIX-#7553: Fix groupby when AutoSwitchBackend is disabled. (#7554)
    • FIX-#7555: Get the correct extension when AutoSwitchBackend is False. (#7556)
    • FIX-#7559: Create the dummy query compiler just once per backend. (#7560)
    • FIX-#7562: Raise AttributeError for missing extension properties. (#7563)
    • FIX-#7569: Fix handling of pyarrow dtype and empty dataframes (#7570)
    • FIX-#7576: Fix ambiguous AttributeError message (#7577)
    • FIX-#7578: Change groupby extension allow list and fix cached_property extensions (#7579)
  • Performance enhancements
    • PERF-#7397: Avoid materializing index/columns in shape checks (#7398)
  • Refactor Codebase
    • REFACTOR-#7315: Refactor axis checks in squeeze (#7400)
    • REFACTOR-#7418: Rename internal interchange protocol methods. (#7422)
    • REFACTOR-#7427: Require query compilers to expose engine and storage format. (#7430)
    • REFACTOR-#7470: Combine backend casting and extension code at the API layer. (#7485)
    • REFACTOR-#7493: Improve the clarity of the costing functions (#7494)
    • REFACTOR-#7527: Add more costing logic to the base query compiler. (#7530)
    • REFACTOR-#7534: Provide internal, overridable method for max_shape (#7535)
    • REFACTOR-#7564: Fix docstrings for transfer thresholds. (#7565)
  • Update testing suite
    • TEST-#7419: Fix a few errors in CI (#7420)
    • TEST-#7421: Fix unidist with APT-installed MPI (#7423)
    • TEST-#7431: Fix formatting for isort 6 and black 25 (#7432)
    • TEST-#7437: Check execution-filter outputs correctly in CI. (#7438)
    • TEST-#7441: Correctly skip sanity tests if we don't need them. (#7442)
    • TEST-#7457: Fix SSL certificate error in notebooks by using http. (#7458)
    • TEST-#7497: Skip tests requiring lxml on windows. (#7500)
    • TEST-#7571: xfail test_read_csv_s3_issue4658 due to missing s3 bucket (#7572)
  • Documentation improvements
    • DOCS-#7566: Add pandas on snowflake + backend pinning to documentation page (#7567)
  • New Features
    • FEAT-#7433: Replace NativeDataFrameMode with a complete "native" execution. (#7436)
    • FEAT-#7445: Add metrics interface so third-parties can collect metrics from the modin frontend (#7444)
    • FEAT-#7448: Allow QueryCompilerCaster to apply cost-optimization on automatic casting (#7464)
    • FEAT-#7455: Add Backend config variable as an alias for execution. (#7456)
    • FEAT-#7459: Add methods to get and set backend. (#7460)
    • FEAT-#7468: Add progress bar for engine switch (#7469)
    • FEAT-#7472: Add an option to register DataFrame and Series accessors with a particular backend. (#7473)
    • FEAT-#7474: Register general functions with a particular backend. (#7489)
    • FEAT-#7475: Choose the correct init method from extensions and apply casting to init. (#7488)
    • FEAT-#7477: Move the query compiler calculator so it can be used in more places (#7478)
    • FEAT-#7480: Implement max_cost interface (#7481)
    • FEAT-#7482: Add "from_qc" API to QueryCompiler and BackendCostCalculator to handle asymmetric information scenarios (#7483)
    • FEAT-#7492: Allow I/O function accessors. (#7502)
    • FEAT-#7505: Support post-operation automatic backend switch. (#7506)
    • FEAT-#7507: Support pre-operation automatic backend switch. (#7512)
    • FEAT-#7509: Add AutoSwitchBackend configuration variable (#7510)
    • FEAT-#7511: Support pre-operation switch for init by passing arguments to cost functions. (#7531)
    • FEAT-#7521: Support pinning objects to a backend (#7522)
    • FEAT-#7523: Improve formal definition of the automatic switching algorithm (#7524)
    • FEAT-#7540: Ability to configure NativeQueryCompiler AutoSwitch Settings (#7561)
    • FEAT-#7542: Support post-operation backend switch for groupby. (#7545)
    • FEAT-#7543: Let plugins register groupby accessors. (#7575)
    • FEAT-#7549: Emit metrics on auto-switch and casting behavior (#7550)
    • FEAT-#7557: Add operation and size information to backend switch progress (#7558)
    • FEAT-#7573: Dispatch array_ufunc to query compilers (#7574)

Contributors

@CRiddler
@YarShev
@anmyachev
@data-makerman
@devin-petersohn
@emmanuel-ferdman
@mpeleshenko
@noloerino
@sfc-gh-dpetersohn
@sfc-gh-jkew
@sfc-gh-joshi
@sfc-gh-mvashishtha

Modin 0.32.0

11 Sep 13:43
0.32.0
3e951a6

This release introduces support for the Polars API, a new query compiler for small data,
more functions that can use dynamic partitioning, as well as several bug fixes.

Key Features and Updates Since 0.31.0

  • Stability and Bugfixes
    • FIX-#0000: Fix type hint (#7343)
    • FIX-#7113: Fix docstring overrides for subclasses (#7354)
    • FIX-#7134: Use a separate docstring class for BasePandasDataset (#7353)
    • FIX-#7329: Do not sort columns on df.update (#7330)
    • FIX-#7351: Add ipython method calls to non-lookup list (#7352)
    • FIX-#7355: Cpu count would be set incorrectly on a cluster (#7356)
    • FIX-#7357: Fix NoAttributeError on DataFrame.copy (#7358)
    • FIX-#7371: Fix inserting datelike values into a DataFrame (#7372)
    • FIX-#7373: Try a previous version of motoserver/moto service, pin to 5.0.13 (#7374)
    • FIX-#7379: Fix __imul__ performing addition instead of multiplication (#7380)
    • FIX-#7387: Limit the number of pytest workers for tests with Ray engine on Windows (#7388)
    • FIX-#7389: Fix uploading artifacts (#7390)
  • Refactor Codebase
    • REFACTOR-#0000: Update copyright date (#7333)
  • Documentation improvements
    • DOCS-#0000: Update RunLLM Ask AI widget script path (#7345)
    • DOCS-#7335: Fix broken links in Modin Usage Examples page (#7336)
    • DOCS-#7382: Add documentation on how to use Modin Native query compiler (#7386)
  • New Features
    • FEAT-#4605: Add native query compiler (#7259)
    • FEAT-#7308: Interoperability between query compilers (#7376)
    • FEAT-#7331: Initial Polars API (#7332)
    • FEAT-#7337: Using dynamic partitioning in broadcast_apply (#7338)
    • FEAT-#7340: Add more granular lazy flags to query compiler (#7348)
    • FEAT-#7368: Add a new environment variable for using dynamic partitioning (#7369)
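
FIX-#7379 in the list above fixed `__imul__` performing addition instead of multiplication. The class below is a hypothetical stand-in (it is not Modin code) that sketches the kind of check that catches such a mix-up: the in-place `x *= k` should agree with the out-of-place `x * k`.

```python
class Column:
    """Hypothetical stand-in for a column-like object; not a Modin class."""

    def __init__(self, values):
        self.values = list(values)

    def __mul__(self, factor):
        return Column(v * factor for v in self.values)

    def __imul__(self, factor):
        # In-place multiply must scale the values; a copy-paste slip that
        # adds here instead would make `x *= k` disagree with `x * k`.
        self.values = [v * factor for v in self.values]
        return self


col = Column([1, 2, 3])
expected = (col * 3).values  # out-of-place result, computed first
col *= 3                     # in-place result
assert col.values == expected == [3, 6, 9]
```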

Contributors

@MortalHappiness
@Retribution98
@YarShev
@ZhipengXue97
@anmyachev
@arunjose696
@devin-petersohn
@likawind
@sfc-gh-joshi
@sfc-gh-mvashishtha

Modin 0.31.0

26 Jun 16:04
0.31.0
c8bbca8

First release compatible with NumPy 2.0.

Key Features and Updates Since 0.30.0

  • Stability and Bugfixes
    • FIX-#7138: Stop reloading modules for custom docstrings (#7307)
    • FIX-#7263: Empty docstrings should not be inherited (#7264)
    • FIX-#7272: Remove HDK engine (#7275)
    • FIX-#7277: Remove Cudf storage format as unmaintained (#7290)
    • FIX-#7278: Make sure enable_logging decorator preserves type hints (#7279)
    • FIX-#7292: Prepare Modin code for NumPy 2.0 (#7293)
    • FIX-#7295: Unpin numexpr to allow versions >= 2.8.4 to match pandas (#7296)
    • FIX-#7309: Update versioneer with versioneer install --vendor (#7311)
    • FIX-#7320: Bump the github-actions group with 3 updates (#7319)
    • FIX-#7321: Using C engine instead of pyarrow for getting metadata in read_csv (#7322)
  • Performance enhancements
    • PERF-#7299: Avoid using synchronize_labels for combine function (#7300)
  • Refactor Codebase
    • REFACTOR-#7271: Remove instance_type attribute of axis partitions (#7268)
    • REFACTOR-#7273: Remove deprecated functions from utils.py, accessor.py and io.py (#7274)
    • REFACTOR-#7285: Remove deprecated configs (#7286)
    • REFACTOR-#7294: Reduce access of _modin_frame methods from _query_compiler (#7297)
    • REFACTOR-#7313: Add similar methods as in #7294 for operating on columns (#7314)
  • Update testing suite
    • TEST-#0000: Add a Dependabot config to auto-update GitHub action versions (#7318)
    • TEST-#7316: Run a subset of CI tests with python 3.10 and 3.11 on a scheduled basis (#7289)
  • Documentation improvements
    • DOCS-#0000: Adds RunLLM widget to docs (#7326)
    • DOCS-#7287: Update Modin on Dask documentation (#7288)
  • New Features
    • FEAT-#6574: UserWarning no longer displayed when Series/DataFrames are small (#7323)
    • FEAT-#7249: Add reload_modin feature (#7280)
    • FEAT-#7265: Automatic publication of Modin wheel to PyPI (#7262)
    • FEAT-#7283: Introduce MinRowPartitionSize and MinColumnPartitionSize (#7284)
    • FEAT-#7310: NumPy 2.0 support (#7312)

Contributors

@Jayson729
@Retribution98
@YarShev
@anmyachev
@arunjose696
@kurtmckee
@sfc-gh-dpetersohn
@vsreekanti