Optimised eif_new.py #24

Open
lpryszcz wants to merge 7 commits into sahandha:master from lpryszcz:master

@lpryszcz

I've optimised the Python version so that it matches the performance of the C++ version and allows models to be saved.
A runtime example has been added to Notebooks/comparison_py_cxx.ipynb.
The code was rewritten entirely. Some functions are optimised with numba.
The iForest is now a numpy array, which allows fast computation and model dumps with a low storage footprint.
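
For readers who want to see why an array-backed forest is easy to dump, here is a minimal sketch. The node layout (four columns per node), the array shape, and the variable names are assumptions made for illustration; they are not the format used in eif_new.py.

```python
import io
import numpy as np

# Hypothetical layout (an assumption, not the fork's actual format):
# shape = (trees, nodes per tree, 4 values per node).
rng = np.random.default_rng(0)
n_trees, nodes_per_tree = 100, 511
forest = rng.random((n_trees, nodes_per_tree, 4)).astype(np.float32)

# Dump the whole model as one compressed array into an in-memory buffer.
buf = io.BytesIO()
np.savez_compressed(buf, forest=forest)
size_bytes = buf.getbuffer().nbytes

# Load it back and check that the round trip is lossless.
buf.seek(0)
restored = np.load(buf)["forest"]
roundtrip_ok = np.array_equal(forest, restored)
```

Because the model is a single array, saving it needs no custom serialisation code; a file path can be passed to `np.savez_compressed` instead of the buffer.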

@lpryszcz lpryszcz mentioned this pull request Aug 31, 2020
@wundermahn

Is this still an active project?

@lpryszcz
Author

lpryszcz commented Jul 1, 2021

That's a good question, @wundermahn. If you want an optimised Python version, you can get it directly from my fork.

@psmgeelen

Hi there, this would fix my problem as well, wouldn't it? I am currently trying to pickle the isolationForest model, and it fails due to a Cython issue:

File "stringsource", line 2, in eif.iForest.__reduce_cython__
TypeError: no default __reduce__ due to non-trivial __cinit__

@lpryszcz
Author

lpryszcz commented Dec 6, 2021

Hi @psmgeelen, yes, you can't save models from the Cython version. Try my fork: its performance is similar to the Cython version, but it is implemented in Python (with Numba optimisations).
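
As an illustration of why a pure-Python model avoids the `__reduce_cython__` error, the sketch below pickles a plain Python object that holds only ordinary data. The class `ArrayForest` and its attributes are hypothetical stand-ins, not names from the fork.

```python
import pickle


class ArrayForest:
    """Hypothetical stand-in for a pure-Python model that holds plain data."""

    def __init__(self, nodes, sample_size):
        self.nodes = nodes              # nested lists standing in for an array
        self.sample_size = sample_size


model = ArrayForest(
    nodes=[[0.5, 1, 2], [0.1, -1, -1], [0.9, -1, -1]],
    sample_size=256,
)

# A regular Python class gets default pickling support from its __dict__,
# so no custom __reduce__ is needed.
blob = pickle.dumps(model)
restored = pickle.loads(blob)
```

A Cython extension type with a non-trivial `__cinit__` gets no such default, which is what the traceback above reports.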

@psmgeelen

psmgeelen commented Dec 6, 2021

@lpryszcz, you are the best! I will get on it now! So I really only need the eif_new.py file and that's it? Maybe it would be worthwhile to have your version integrated into scikit-learn. I recommended you anyhow in scikit-learn/scikit-learn#16517

EDIT: It works out of the box, I love the script! One small question though: does it make sense to have a threshold that is always 0.5? Instead, you could just output the values directly.
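
To make the question concrete, here is a small sketch of the difference between a fixed 0.5 cut-off and working with the scores directly. It assumes anomaly scores between 0 and 1 where higher means more anomalous; the score values and variable names are invented for illustration and do not come from eif_new.py.

```python
# Hypothetical anomaly scores (assumed range 0..1, higher = more anomalous).
scores = [0.42, 0.48, 0.55, 0.71]

# Fixed threshold: every score above 0.5 is flagged.
fixed_labels = [s > 0.5 for s in scores]

# Working from the raw scores lets the caller choose a stricter cut-off.
custom_threshold = 0.6
custom_labels = [s > custom_threshold for s in scores]

# Raw scores also allow ranking, e.g. picking the single most anomalous point.
most_anomalous = max(range(len(scores)), key=lambda i: scores[i])
```

With the raw values available, the cut-off becomes a choice made by the caller rather than something fixed inside the model.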

@lpryszcz
Author

lpryszcz commented Feb 4, 2022

I'm glad it works for you :) And thanks for the recommendation, @psmgeelen. I'd be more than happy to contribute to scikit-learn if there is interest from their side.
