- Fix issue #198: handle LightGBM models with string categorical features in SHAP/dashboard paths by normalizing tree-SHAP evaluation input for categorical columns.
- Prevent crashes in LightGBM what-if/SHAP flows caused by object/string categorical values during SHAP value computation.
- Fix CatBoost PDP/dashboard callback crashes when categorical values in `X_row` are missing (NaN) by preserving dataframe categorical handling and sanitizing CatBoost categorical prediction inputs.
- Fix issue #146: `ExplainerHub.to_yaml(..., integrate_dashboard_yamls=True)` now honors `pickle_type` instead of hardcoding `.joblib`, and correctly dumps explainer files when `dump_explainers=True`.
- Fix issue #294: align multiclass `model_output='logodds'` semantics across the Prediction Box and Contributions Plot by using per-class raw margins for multiclass logodds displays.
- Fix multiclass PDP highlight predictions in logodds mode to use the same raw-margin scale as SHAP contributions.
- Fix XGBoost multiclass decision-path summary wording to display `prediction (logodds)` when the explainer has `model_output='logodds'`.
- Fix issue #256: add a robust multiclass probability fallback for classifiers that expose `decision_function` but not `predict_proba` (e.g. `LinearSVC`), and use it consistently across kernel SHAP, prediction helpers, PDP, and permutation scorer paths.
- Prevent multiclass class-count mismatches when user-provided/broken `predict_proba` outputs do not match the model class count by falling back to `decision_function`-based probabilities.
- Fix issue #118: add LightGBM decision-tree visualization support (dtreeviz) across explainer auto-detection, tree plotting, and decision-path rendering in dashboard tree tabs.
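The `decision_function` fallback described above can be sketched as a softmax over the raw margins. This is a hedged illustration, not the library's exact implementation, and the helper name is hypothetical:

```python
import numpy as np

def proba_from_decision_function(margins):
    """Turn decision_function margins into probability-like scores via softmax.

    Hypothetical helper sketching a fallback for classifiers (e.g. LinearSVC)
    that expose decision_function but not predict_proba.
    """
    margins = np.asarray(margins, dtype=float)
    if margins.ndim == 1:
        # binary case: treat the single margin as the positive-class logit
        margins = np.column_stack([-margins, margins])
    shifted = margins - margins.max(axis=1, keepdims=True)  # numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)
```

The softmax keeps the class ordering of the margins while guaranteeing rows sum to one, which is what downstream PDP and SHAP code expects from `predict_proba`.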
- Fix dtreeviz callback rendering on macOS by switching matplotlib to a non-interactive backend for off-main-thread tree rendering to prevent dashboard 500 errors.
- Add regression tests for LightGBM with string categorical features covering dashboard initialization, `get_shap_row(...)`, unseen categorical values in `X_row`, and regression dashboard initialization.
- Add CatBoost regression tests for classifier/regression `pdp_df(...)` with `X_row` containing missing categorical values.
- Add hub regression test for integrated hub yaml serialization to verify `pickle_type` is preserved and explainer artifacts are written.
- Add regression tests for issue #294 covering multiclass logodds consistency across the prediction table, contributions, PDP highlight predictions, and XGBoost decision-path summaries.
- Add pipeline tests for transformed feature-name cleanup (`strip_pipeline_prefix`, `feature_name_fn`) and pipeline categorical grouping autodetection.
- Add explainer-method unit tests for binary-like onehot detection, transformed feature-name deduping, inferred pipeline cats, and pipeline extraction warning text.
- Add regression tests for issue #256 covering multiclass `LinearSVC` with kernel SHAP, PDP, and permutation-importances flows using the `decision_function` fallback.
- Add guard tests to confirm multiclass `predict_proba` models (logistic regression) keep working for PDP and permutation-importances paths.
- Add LightGBM tree-visualization regression tests (shadow trees, decision paths, plot_trees, and dtreeviz render contracts) in the boosting-model test suite.
- Add pipeline feature-name cleanup options: `strip_pipeline_prefix=True` and `feature_name_fn=...` for sklearn/imblearn pipeline transformed output columns.
- Add optional `auto_detect_pipeline_cats=True` to infer onehot groups from transformed pipeline columns when `cats` is not provided.
- Preserve the input index in transformed pipeline dataframes produced during pipeline extraction.
- Improve pipeline extraction warning guidance to include concrete checks (`get_feature_names_out`, transform compatibility on `X`/`X_background`).
- Relax onehot grouping validation to also accept binary-like scaled onehot columns (not only strict `0`/`1`) when parsing `cats`.
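As an illustration of the kind of cleanup `strip_pipeline_prefix=True` performs: sklearn's `get_feature_names_out` emits transformed column names like `onehot__Sex_male`, and the prefix before `__` can be dropped. The helper below is a hypothetical stand-in, not the library's code (`feature_name_fn=...` lets you supply an arbitrary renaming function instead):

```python
def strip_transformer_prefix(col: str) -> str:
    """Drop the 'transformer__' prefix that sklearn's get_feature_names_out
    adds, e.g. 'onehot__Sex_male' -> 'Sex_male'. Hypothetical stand-in."""
    return col.split("__", 1)[-1]

cleaned = [strip_transformer_prefix(c)
           for c in ["onehot__Sex_male", "scaler__Age", "Fare"]]
```

Columns without a prefix (like `Fare`) pass through unchanged.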
- Update the `explainerdashboard` GitHub Actions workflow to run a weekly scheduled full test suite (pytest) to detect dependency breakages earlier.
- Allow FeatureInputComponent (what-if inputs) to customize numeric ranges and rounding, and apply min/max/step to inputs.
- Add `input_features` and `hide_features` to `FeatureInputComponent` so what-if fields can be explicitly ordered and selectively hidden, while preserving the full callback input contract.
- Fix issue #220: accept single-row list/array `X_row` inputs in `get_contrib_df`, and harden related row-input paths (`get_col_value_plus_prediction`, `pdp_df`, and classifier/regression `prediction_result_df`) with regression tests.
- Fix issue #262: add feature-based filters to classifier/regression `random_index(...)` (numeric ranges and categorical inclusion), enabling what-if random selection constrained by input feature values.
- Improve compatibility with AutoGluon/custom wrappers by coercing pandas `DataFrame` outputs from `predict_proba`/`predict` to numpy arrays before indexing in classifier/regression helper paths.
- Harden one-vs-all scorer handling so `make_one_vs_all_scorer` also accepts classifiers whose `predict_proba` returns a pandas `DataFrame`.
- Fix `ExplainerHub.add_dashboard_route` after the first request by allowing dynamic dashboard registration/setup during a route-triggered add, and add a regression test for issue #269.
- Fix issue #273: avoid crashes when sorting categorical values containing mixed types and NaNs across explainer setup, PDP/category ordering, and categorical plot paths.
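The issue #262 feature filters behave roughly like the following standalone sketch. The function and parameter names here are hypothetical illustrations, not the library's exact `random_index(...)` signature:

```python
import numpy as np
import pandas as pd

def filtered_random_index(X, num_ranges=None, cat_values=None, seed=None):
    """Pick a random index from rows satisfying numeric-range and
    categorical-inclusion filters (hypothetical stand-in for random_index)."""
    mask = pd.Series(True, index=X.index)
    for col, (lo, hi) in (num_ranges or {}).items():
        mask &= X[col].between(lo, hi)          # numeric range filter
    for col, allowed in (cat_values or {}).items():
        mask &= X[col].isin(allowed)            # categorical inclusion filter
    candidates = X.index[mask]
    if len(candidates) == 0:
        return None
    return np.random.default_rng(seed).choice(candidates)
```

This is the behavior the what-if tab relies on: constrain the candidate pool first, then sample uniformly from what remains.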
- Add regression tests for mixed-type categorical sorting and mixed target-label sorting fallbacks in explainers and plots.
- Fix FeatureInputComponent range calculation for boolean columns (avoid np.round on bools) and add a regression test.
- Ensure save_html includes custom tabs by providing a static-export fallback for tabs without a to_html implementation.
- Support string class labels in ClassifierExplainer by preserving label mappings and avoiding float casts.
- Support CalibratedClassifierCV by using its fitted base estimator for SHAP (avoids falling back to kernel).
- Replace print statements with standard logging and warnings; progress messages are now INFO-level, and user-actionable guidance uses warnings. A one-time warning is emitted if logging is not configured, with instructions to call `enable_default_logging()`.
- Handle missing values in categorical features by surfacing a "NaN" option in inputs and normalizing NaN selections back to real missing values.
- Add tests covering categorical NaN handling for both merged and unmerged input paths.
- Preserve categorical dtypes during permutation importance shuffles and PDP grid generation to prevent dtype-related model errors (e.g., LightGBM).
- Align categorical/boolean dtypes for user-provided `X_row` inputs, and add dtype alignment tests.
- Add support for GPU Tree SHAP explainers via `shap='gputree'` (requires CUDA-enabled SHAP).
- Add SageMaker Studio support: auto-detect the environment, apply proxy prefixes, and add CLI flags for overrides.
- Require Dash >=3.0.4 to support dash-bootstrap-components 2.x
- Relaxed dash-bootstrap-components upper bound to allow 2.x releases
- Updated DropdownMenu alignment to use `align_end` for dbc 2.x
- Adjusted logistic regression test fixture to avoid convergence warnings
- Avoid sklearn feature-name warnings in PDP computations by passing numpy arrays to estimators without `feature_names_in_`
- Consistent model-input handling in PDP and prediction helpers to prevent warning noise
- Allow NumPy 2.x but cap to `<2.4` on Python 3.11+ to avoid numba/llvmlite downgrade issues
- Dropped support for Python 3.8 and 3.9 (Python 3.9 reached end-of-life). Minimum Python version is now 3.10
- Now explicitly supports and tests on Python 3.10, 3.11, 3.12, and 3.13
- Removed upper version constraints for the `dash` and `plotly` dependencies; now supports Dash 2.10+ and 3.0+, and Plotly 5.0+ and 6.0+
- Added backward-compatibility code to support both the Dash 2.x (`app.run_server()`) and Dash 3.x (`app.run()`) APIs
- Fixed Plotly 6.0 compatibility by updating the `titlefont` format to `title.font`
- Improved integration test setup with automatic ChromeDriver management via `webdriver-manager`
- Fixed threading issues with Plotly validator initialization by switching to the recommended `plotly.graph_objects` import
- Made `torch` and `skorch` optional dependencies on Intel Macs (where torch wheels are not available)
- Fixed `SystemExit` warnings in integration tests caused by Plotly validator initialization in multi-threaded contexts
- Updated `.gitignore` to exclude webdriver-manager cache directories and the `uv.lock` file
- XGBoost 3.1+ compatibility: fixed handling of string-formatted predictions and `base_score` values returned by XGBoost 3.1+. Added robust string-to-numeric conversion with a regex fallback to handle various string formats (e.g. `'[3.2967056E1]'`, `'[8.563135E-2,7.169811E-1,1.9738752E-1]'`)
- XGBoost SHAP initialization: fixed `base_score` conversion in both `get_params()` and the booster's internal JSON configuration to ensure the SHAP TreeExplainer initializes correctly with XGBoost 3.1+
- RandomForest dtreeviz compatibility: fixed dtype handling for `y_train` (now uses `int` instead of `int16`) and observation-array conversion for `predict_path()` to work with newer dtreeviz versions
- Dtreeviz decisiontree_view: ensure observations are passed as numpy arrays to avoid pandas label lookup errors when dtreeviz indexes features by integer position
- PyPI packaging: Removed duplicate wheel entries from hatchling build config to fix "Duplicate filename in local headers" upload errors
- Pandas deprecation warnings: removed the deprecated `pd.option_context("future.no_silent_downcasting")` and the `copy=False` parameter from `.infer_objects()` calls
- Runtime warnings: fixed divide-by-zero warnings in classification plots and residuals plots (log-ratio calculations) by adding proper zero checks and using `np.divide()` with the `where` parameter
- fix deprecated `needs_proba` parameter in `make_scorer`
- fix `merge_categorical_columns` when there are no cats
- Handle pandas option setting context in case it doesn't exist
- Remove `is_categorical_dtype` as it is being deprecated
- should now work with the format of shap 0.45, which returns a three-dimensional np.array instead of a list of 2-dimensional np.arrays for classifiers
- Fixed several pandas warnings about soon-to-be-deprecated behaviours
- Add warning to set `shap_kwargs=dict(check_additivity=True)` for skorch models, and switch this on for the tests.
- models that use kernel explainer but output multi-dimensional predictions such as PLSRegression are now supported. Predictions now get squeezed in the kernel function.
- Fixed bug with pandas v2, Pandas v2 now supported
- Fixed a number of user warnings
- Pins dependencies for flask-wtf>1.1, numpy<1.24 and pandas<2 while working to sort out some compatibility issues.
- tries to work around wonky index dropdown search bug introduced by latest dash release.
- Dropdown search now works again, but index propagation is still flaky when the number of idxs > `max_idxs_in_dropdown` (1000 by default)
- displays warning to downgrade to dash 2.6.2 when this happens
- applied black to the codebase
- Now needs dtreeviz>2.1, due to the API change with version v2
- Fixed import and tree display bug with newer version of dtreeviz
- added `routes_pathname_prefix: str = None` and `requests_pathname_prefix: str = None` to ExplainerDashboard to help with running the dashboard on e.g. SageMaker
- Bug with plotly: `showticklabels=False` changed to `tickfont=dict(color="rgba(0, 0, 0, 0)")`
- Imports now comply with the dtreeviz v2 API
- Upgrades the dashboard to `bootstrap5` and `dash-bootstrap-components` v1 (which is also based on bootstrap5); this may break older custom dashboards that included bootstrap components from `dash-bootstrap-components<1`
- Support terminated for Python `3.6` and `3.7`, as the latest version of `scikit-learn` (1.1) dropped support as well, and explainerdashboard depends on the improved pipeline feature naming in `scikit-learn>=1.1`
- Better support for large datasets through dynamic server-side index dropdown option selection. Not all indexes have to be stored client-side in the browser; instead, dropdown options get updated automatically as you start typing. This should help especially with datasets with a large number of indexes. The server-side dynamic index dropdowns get activated when the number of rows > `max_idxs_in_dropdown` (defaults to 1000).
- Both sklearn and imblearn Pipelines are now supported, with feature names generated automatically, as long as all the transformers have a `.get_feature_names_out()` method
- Adds a `shap_kwargs` parameter to the explainers that allows you to pass additional kwargs to the shap-value-generating call, e.g. `shap_kwargs=dict(check_additivity=False)`
- Can now specify an absolute path with `explainerfile_absolute_path` when dumping `dashboard.yaml` with `db.to_yaml(...)`
- Suppresses warnings when extracting final model from pipeline that was not fitted on a dataframe.
- No longer limiting the werkzeug version, due to upstream bug fixes in `dash` and `jupyter-dash`
- Some dropdowns now better aligned.
- Adds support for sklearn Pipelines that add new features (such as those including OneHotEncoder), as long as they support the new `get_feature_names_out()` method. Not all estimators and transformers have this method implemented yet, but if all estimators in your pipeline do, then explainerdashboard will extract the final dataframe and the model from your pipeline. For now this does result in a lot of "this model was fitted on a numpy array but you provided a dataframe" warnings.
- Fixed a bug with sorting pdp features
- Pinned werkzeug<=2.0.3 due to some new features that broke JupyterDash
- Changes use of `pd.append`, which will be deprecated soon and currently generates warnings.
- Forces dash v2 dependency
- fixes bug introduced by a breaking change in pandas 1.4.0
- Switches to dash v2 style imports
- Export your ExplainerHub to static html with the `hub.to_html()` and `hub.save_html()` methods
- Export your ExplainerHub to a zip file of static html exports with the `to_zip()` method
- Manually add pre-calculated shap values with `explainer.set_shap_values()`
- Manually add pre-calculated shap interaction values with `explainer.set_shap_interaction_values()`
- Fixed bug with the What if tab components' static html export (missing `</div>`)
- Static html export! You can export a static version of the dashboard, using the default values that you specified in the components or through kwargs, with `dashboard.to_html()`.
  - for custom components you need to define your own custom `to_html()` methods; see the documentation.
- A toggle is added to the dashboard header that allows you to download a static export of the current live state of the dashboard.
- adds a new toggle and parameter to the `ConfusionMatrixComponent` to average the percentage either over the entire matrix, over the rows, or over the columns. Set `normalize='all'`, `normalize='true'`, or `normalize='pred'`.
- also adds a `save_html(filename)` method to all `ExplainerComponents` and `ExplainerDashboard`
- `ExplainerHub` adds a new parameter `index_to_base_route`: dispatches the Hub index to `/base_route/index` instead of the default `/` and `/index`. Useful when the host root is not reserved for the ExplainerHub
- adds support for `PyTorch` neural networks! (as long as they are wrapped by `skorch`)
- adds `SimplifiedClassifierComposite` and `SimplifiedRegressionComposite` to `explainerdashboard.custom`
- adds flag `simple=True` to load these simplified one-page dashboards: `ExplainerDashboard(explainer, simple=True)`
- adds support for visualizing trees of `ExtraTreesClassifier` and `ExtraTreesRegressor`
- adds `FeatureDescriptionsComponent` to `explainerdashboard.custom` and the Importances tab
- adds the possibility to dynamically add new dashboards to a running ExplainerHub using the `/add_dashboard` route with `add_dashboard_route=True` (will only work if you're running the Hub as a single worker/node though!)
- `ExplainerDashboard.to_yaml("dashboards/dashboard.yaml", dump_explainer=True)` will now dump the explainer in the correct subdirectory (and also defaults to explainer.joblib)
- Fixes incompatibility bug with dtreeviz >= 1.3
- raises a ValueError when passing `shap='deep'`, as it is not yet correctly supported
Highlights:
- Adding support for cross validated metrics
- Better support for pipelines by using kernel explainer
- Making explainer threadsafe by adding locks
- Remove outliers from shap dependence plots
- parameter `permutation_cv` has been deprecated and replaced by parameter `cv`, which now also works to calculate cross-validated metrics besides cross-validated permutation importances.
- metrics now get calculated with cross-validation over `X` when you pass the `cv` parameter to the explainer; this is useful when for some reason you want to pass the training set to the explainer.
- adds winsorization to shap dependence and shap interaction plots
- If `shap='guess'` fails (unable to guess the right type of shap explainer), then default to the model-agnostic `shap='kernel'`.
- Better support for sklearn `Pipelines`: if not able to extract transformer+model, then default to `shap.KernelExplainer` to explain the entire pipeline
- you can now remove outliers from shap dependence/interaction plots with `remove_outliers=True`: filters all outliers beyond 1.5*IQR
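The `remove_outliers=True` filter amounts to the standard 1.5*IQR rule, roughly as follows (a sketch, not the library's exact code):

```python
import numpy as np

def filter_outliers(values):
    """Keep only values within [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return values[(values >= lo) & (values <= hi)]
```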
- Sets proper `threading.Lock`s before making calls to the shap explainer, to prevent race conditions with dashboards calling for shap values in multiple threads. (shap is unfortunately not threadsafe)
- single shap row KernelExplainer calculations now go without tqdm progress bar
- added cutoff tpr and fpr to roc auc plot
- added cutoff precision and recall to pr auc plot
- put a loading spinner on shap contrib table
- `index_dropdown=False` now works for indexes not listed in `set_index_list_func()`, as long as the index can be found by `set_index_exists_func`
- adds `set_index_exists_func` to add a function that checks for index existence besides those listed by `set_index_list_func()`
- bug fix to make `shap.KernelExplainer` (used with explainer parameter `shap='kernel'`) work with `RegressionExplainer`
- bug fix when no explicit `labels` are passed with the index selector
- components only update if `explainer.index_exists()`: no more `IndexNotFoundError`s
- fixed title bug for regression index selector labeled 'Custom'
- `get_y()` now returns `.item()` when necessary
- removed ticks from the confusion matrix plot when no `labels` param is passed (this bug got reintroduced in a recent plotly release)
- new helper function `get_shap_row(index)` to calculate or look up a single row of shap values.
Highlights:
- Control which metrics to show, or use your own custom metrics, with `show_metrics`
- Set the naming for onehot features with all `0`s with `cats_notencoded`
- Speed up plots by displaying only a random sample of markers in scatter plots with `plot_sample`
- make index selection a free text field with `index_dropdown=False`
- new parameter `show_metrics` for `explainer.metrics()`, `ClassifierModelSummaryComponent` and `RegressionModelSummaryComponent`:
  - pass a list of metrics to display only those metrics, in that order
  - you can also pass custom scoring functions, as long as they are of the form `metric_func(y_true, y_pred)`: `show_metrics=[metric_func]`
  - For `ClassifierExplainer`, what is passed to the custom metric function depends on whether the function takes the additional parameters `cutoff` and `pos_label`. If these are not arguments, then `y_true=self.y_binary(pos_label)` and `y_pred=np.where(self.pred_probas(pos_label)>cutoff, 1, 0)`. Else the raw `self.y` and `self.pred_probas` are passed for the custom metric function to do something with.
  - custom functions are also stored to `dashboard.yaml` and imported upon loading `ExplainerDashboard.from_config()`
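A custom metric for `show_metrics` only needs the plain `metric_func(y_true, y_pred)` signature. The metric below is an illustrative sketch (the usage comment assumes a fitted explainer):

```python
import numpy as np

def balanced_error_rate(y_true, y_pred):
    """Average of false negative rate and false positive rate.

    Plain metric_func(y_true, y_pred) signature, so for a ClassifierExplainer
    it receives y_binary(pos_label) and cutoff-thresholded predictions.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    fnr = ((y_true == 1) & (y_pred == 0)).sum() / max((y_true == 1).sum(), 1)
    fpr = ((y_true == 0) & (y_pred == 1)).sum() / max((y_true == 0).sum(), 1)
    return (fnr + fpr) / 2

# usage sketch:
# explainer.metrics(show_metrics=[balanced_error_rate])
```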
- new parameter `cats_notencoded`: a dict indicating how to name the value of a onehot-encoded feature when all its onehot columns equal 0. Defaults to `'NOT_ENCODED'`, but can be adjusted with this parameter, e.g. `cats_notencoded=dict(Deck="Deck not known")`.
- new parameter `plot_sample` to only plot a random sample in the various scatter plots. When you have a large dataset, this may significantly speed up various plots without sacrificing much in expressiveness: `ExplainerDashboard(explainer, plot_sample=1000).run()`
- new parameter `index_dropdown=False` will replace the index dropdowns with a free text field. This can be useful when you have a lot of potential indexes and the user is expected to know the index string. Input will be checked for validity with `explainer.index_exists(index)`, and the field indicates when the input index does not exist. If the index does not exist, it will not be forwarded to other components, unless you also set `index_check=False`.
- adds mean absolute percentage error to the regression metrics. If it is too large, a warning will be printed. Can be excluded with the new `show_metrics` parameter.
- `get_classification_df` added to `ClassificationComponent` dependencies.
- accepting a single-column `pd.DataFrame` for `y`, and automatically converting it to a `pd.Series`
- if the What-If `FeatureInputComponent` detects the presence of missing onehot features (i.e. rows where all columns of the onehot-encoded feature equal 0), it adds `'NOT_ENCODED'` or the matching value from `cats_notencoded` to the dropdown options.
- Generating the `name` parameter for `ExplainerComponents` for which no name is given is now done with a deterministic process instead of a random uuid. This should help with scaling custom dashboards across cluster deployments. Also drops the `shortuuid` dependency.
- `ExplainerDashboard` now prints out the local ip address when starting the dashboard.
- `get_index_list()` is only called once upon starting the dashboard.
This version is mostly about pre-calculating and optimizing the classifier statistics components. Those components should now be much more responsive with large datasets.
- new methods `roc_auc_curve(pos_label)` and `pr_auc_curve(pos_label)`
- new method `get_classification_df(...)` to get a dataframe with the number of labels above and below a given cutoff.
  - this now gets used by `plot_classification(...)`
- new method `confusion_matrix(cutoff, binary, pos_label)`
- added parameter `sort_features` to `FeatureInputComponent`:
  - defaults to `'shap'`: order features by mean absolute shap
  - if set to `'alphabet'`, features are sorted alphabetically
- added parameter `fill_row_first` to `FeatureInputComponent`:
  - defaults to `True`: fill first row first, then next row, etc
  - if False: fill first column first, then second column, etc
- categorical mappings now updateable with pandas<=1.2 and python==3.6
- title now overridable for `RegressionRandomIndexComponent`
- added assert check on `summary_type` for `ShapSummaryComponent`
- pre-calculating `lift_curve_df` only once and then storing it for each `pos_label`
- plus: storing only 100 evenly spaced rows of lift_curve_df
- dashboard should be more responsive for large datasets
- pre-calculating roc_auc_curve and pr_auc_curve
- dashboard should be more responsive for large datasets
- pre-calculating confusion matrices
- dashboard should be more responsive for large datasets
- pre-calculating classification_dfs
- dashboard should be more responsive for large datasets
- confusion matrix: added axis title, moved predicted labels to bottom of graph
- precision plot: when only adjusting cutoff, simply updating the cutoff line, without recalculating the plot.
- new dependency requirement `pandas>=1.2` also implies `python>=3.7`
- updates the `pandas` version to be compatible with categorical feature operations
- updates the dtreeviz version to make the `xgboost` and `pyspark` dependencies optional
This is a major release and comes with lots of breaking changes to the lower-level ClassifierExplainer and RegressionExplainer API. The higher-level ExplainerComponent and ExplainerDashboard API has not been changed, however, except for the deprecation of the cats and hide_cats parameters.
Explainers generated with version explainerdashboard <= 0.2.20.1 will not work
with this version, so if you have stored explainers to disk you either have to
rebuild them with this new version, or downgrade back to explainerdashboard==0.2.20.1!
(hope you pinned your dependencies in production! ;)
Main motivation for these breaking changes was to improve memory usage of the dashboards, especially in production. This lead to the deprecation of the dual cats grouped/not grouped functionality of the dashboard. Once I had committed to that breaking change, I decided to clean up the entire API and do all the needed breaking changes at once.
- onehot-encoded features are now merged by default. This means that the `cats=True` parameter has been removed from all explainer methods, and the `group cats` toggle has been removed from all `ExplainerComponents`. This saves both on code complexity and memory usage. If you wish to see the individual contributions of onehot-encoded columns, simply don't pass them to the `cats` parameter upon construction.
- Deprecated explainer attributes:
  - `BaseExplainer`: `self.shap_values_cats`, `self.shap_interaction_values_cats`, `permutation_importances_cats`, `self.get_dfs()`, `formatted_contrib_df()`, `self.to_sql()`, `self.check_cats()`, `equivalent_col`
  - `ClassifierExplainer`: `get_prop_for_label`
- Naming changes to attributes:
  - `BaseExplainer`:
    - `importances_df()` -> `get_importances_df()`
    - `feature_permutations_df()` -> `get_feature_permutations_df()`
    - `get_int_idx(index)` -> `get_idx(index)`
    - `contrib_df()` -> `get_contrib_df()`
    - `contrib_summary_df()` -> `self.get_summary_contrib_df()`
    - `interaction_df()` -> `get_interactions_df()`
    - `shap_values` -> `get_shap_values_df`
    - `plot_shap_contributions()` -> `plot_contributions()`
    - `plot_shap_summary()` -> `plot_importances_detailed()`
    - `plot_shap_dependence()` -> `plot_dependence()`
    - `plot_shap_interaction()` -> `plot_interaction()`
    - `plot_shap_interaction_summary()` -> `plot_interactions_detailed()`
    - `plot_interactions()` -> `plot_interactions_importance()`
    - `n_features()` -> `n_features`
    - `shap_top_interaction()` -> `top_shap_interactions`
    - `shap_interaction_values_by_col()` -> `shap_interactions_values_for_col()`
  - `ClassifierExplainer`:
    - `self.pred_probas` -> `self.pred_probas()`
    - `precision_df()` -> `get_precision_df()`
    - `lift_curve_df()` -> `get_liftcurve_df()`
  - `RandomForestExplainer`/`XGBExplainer`:
    - `decision_trees` -> `shadow_trees`
    - `decisiontree_df()` -> `get_decisionpath_df()`
    - `decisiontree_summary_df()` -> `get_decisionpath_summary_df()`
    - `decision_path_file()` -> `decisiontree_file()`
    - `decision_path()` -> `decisiontree()`
    - `decision_path_encoded()` -> `decisiontree_encoded()`
- new `Explainer` parameter `precision`: defaults to `'float64'`. Can be set to `'float32'` to save on memory usage: `ClassifierExplainer(model, X, y, precision='float32')`
- new `memory_usage()` method to show which internal attributes take the most memory.
- for multiclass classifiers: new `keep_shap_pos_label_only(pos_label)` method:
  - drops shap values and shap interactions for all labels except `pos_label`
- not needed for binary classifiers.
- drops shap values and shap interactions for all labels except
- added `get_index_list()`, `get_X_row(index)`, and `get_y(index)` methods.
  - these can be overridden with `.set_index_list_func()`, `.set_X_row_func()` and `.set_y_func()`.
  - by overriding these functions you can, for example, sample observations from a database or other external storage instead of from `X_test`, `y_test`.
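For example, the override hooks above let you back the dashboard with an external store. Here is a minimal sketch with a toy in-memory store standing in for a database (the `STORE_*` dicts and helper names are illustrative):

```python
import pandas as pd

# toy in-memory store standing in for a database or feature store
STORE_X = {"customer_1": pd.DataFrame({"age": [31], "fare": [10.5]}),
           "customer_2": pd.DataFrame({"age": [58], "fare": [80.0]})}
STORE_Y = {"customer_1": 1, "customer_2": 0}

def list_indexes():
    return list(STORE_X.keys())

def fetch_X_row(index):
    return STORE_X[index]  # one-row DataFrame for this index

def fetch_y(index):
    return STORE_Y[index]

# wiring it up (sketch, assumes a fitted explainer):
# explainer.set_index_list_func(list_indexes)
# explainer.set_X_row_func(fetch_X_row)
# explainer.set_y_func(fetch_y)
```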
- added `Popout` buttons to all the major graphs that open a large modal showing just the graph. This makes it easier to focus on a particular graph without distraction from the rest of the dashboard and all its toggles.
- added `max_cat_colors` parameter to `plot_importance_detailed`, `plot_dependence` and `plot_interactions_detailed`
  - prevents plotting getting slow with categorical features with many categories
  - defaults to `5`
  - can be set as a `**kwarg` to `ExplainerDashboard`
- adds category limits and sorting to the `RegressionVsCol` component
- adds property `X_merged` that gives a dataframe with the onehot columns merged.
- shap dependence: when no point cloud, do not highlight!
- Fixed bug with calculating contributions plot/table for whatif component, when InputFeatures had not fully loaded, resulting in shap error.
- saving `X.copy()` instead of using a reference to `X`
  - this would result in more memory usage in development though, so you can `del X_test` to save memory.
- `ClassifierExplainer` only stores shap (interaction) values for the positive class: shap values for the negative class are generated on the fly by multiplying with `-1`.
- encoding onehot columns as `np.int8`, saving memory usage
- encoding categorical features as `pd.category`, saving memory usage
- added base `TreeExplainer` class that `RandomForestExplainer` and `XGBExplainer` both derive from
  - will make it easier to extend tree explainers to other models in the future
  - e.g. catboost and lightgbm
- got rid of the callable properties (that were there to assure backward compatibility) and replaced them with regular methods.
- fixes bug allowing a single list of logins for ExplainerDashboard when passed on to ExplainerHub
- fixes bug with explainers generated with explainerdashboard < version 0.2.20 that did not have a `onehot_cols` property
- `WhatIfComponent` deprecated. Use `WhatIfComposite` or connect components yourself to a `FeatureInputComponent`
- renaming properties: `explainer.cats` -> `explainer.onehot_cols`, `explainer.cats_dict` -> `explainer.onehot_dict`
- Adds support for model with categorical features that were not onehot encoded (e.g. CatBoost)
- Adds filter on number of categories to display in violin plots and pdp plot, and how to sort the categories (alphabetical, by frequency or by mean abs shap)
- fixes bug where str tab indicators returned e.g. the old ImportancesTab instead of ImportancesComposite
- No longer depending on PDPbox: built our own partial dependence functions with categorical feature support
- autodetect xgboost.core.Booster or lightgbm.Booster and raise a ValueError advising use of the sklearn-compatible wrappers instead
- Introduces list of categorical columns: `explainer.categorical_cols`
- Introduces dictionary with categorical column categories: `explainer.categorical_dict`
- Introduces list of all categorical features: `explainer.cat_cols`
- ExplainerHub: parameter `user_json` is now called `users_file` (and defaults to a `users.yaml` file)
- Renamed a number of `ExplainerHub` private methods:
  - `_validate_user_json` -> `_validate_users_file`
  - `_add_user_to_json` -> `_add_user_to_file`
  - `_add_user_to_dashboard_json` -> `_add_user_to_dashboard_file`
  - `_delete_user_from_json` -> `_delete_user_from_file`
  - `_delete_user_from_dashboard_json` -> `_delete_user_from_dashboard_file`
- Added NavBar to `ExplainerHub`
- Made `users.yaml` the default file for storing users and hashed passwords for `ExplainerHub`, for easier manual editing.
- Added option `min_height` to `ExplainerHub` to set the size of the iFrame containing the dashboard.
- Added option `fluid=True` to `ExplainerHub` to stretch the bootstrap container to the width of the browser.
- added parameter `bootstrap` to `ExplainerHub` to override the default bootstrap theme.
- added option `dbs_open_by_default=True` to `ExplainerHub` so that no login is required for dashboards for which no specific list of users was declared through `db_users`. So only dashboards for which users have been defined are password protected.
- Added option `no_index` to `ExplainerHub` so that no flask route is created for the index `"/"`, so that you can add your own custom index. The dashboards are still loaded on their respective routes, so you can link to them or embed them in iframes, etc.
- Added a "wizard" perfect prediction to the lift curve.
  - hide it with `hide_wizard=True`; default to not showing it with `wizard=False`.
- `ExplainerHub.from_config()` now works with non-cwd paths
- `ExplainerHub.to_yaml("subdirectory/hub.yaml")` now correctly stores the users.yaml file in the specified subdirectory.
- added a "powered by: explainerdashboard" footer. Hide it with `hide_poweredby=True`.
- added option "None" to shap dependence color col. Also removes the point cloud from the violin plots for categorical features.
- added option `mode` to `ExplainerDashboard.run()` that can override `self.mode`.
- `ExplainerHub` now does user management through `Flask-Login` and a `user.json` file
- adds an `explainerhub` cli to start explainerhubs and do user management.
- Introducing `ExplainerHub`: combine multiple dashboards together behind a single frontend with convenient url paths.
- example:

        db1 = ExplainerDashboard(explainer, title="Dashboard One", name='dashboard1')
        db2 = ExplainerDashboard(explainer, title="Dashboard Two", name='dashboard2')
        hub = ExplainerHub([db1, db2])
        hub.run()

        # store and recover from config:
        hub.to_yaml("hub.yaml")
        hub2 = ExplainerHub.from_config("hub.yaml")
- adds option `dump_explainer` to `ExplainerDashboard.to_yaml` to automatically dump the explainer file along with the yaml.
- adds option `use_waitress` to `ExplainerDashboard.run()` and `ExplainerHub.run()` to use the `waitress` python webserver instead of the `Flask` development server
- adds parameters to `ExplainerDashboard`:
    - `name`: this will be used to assign a url for `ExplainerHub`
    - `description`: this will be used for the title tooltip in the dashboard and in the `ExplainerHub` frontend.
- the `cli` now uses the `waitress` server by default.
- Makes the component `name` property for the default composites deterministic instead of a random uuid, now also working when loading a dashboard `.from_config()`
    - note however that for custom `ExplainerComponents` the user is still responsible for making sure that all subcomponents get assigned a deterministic `name` (otherwise random uuid names get assigned at dashboard start, which might differ across nodes in e.g. docker swarm deployments)
- Calling `self.register_components()` is no longer necessary.
- Makes the component `name` property for the default composites deterministic instead of a random uuid. This should help remedy bugs with deployments using e.g. docker swarm.
    - When you pass a list of `ExplainerComponents` to ExplainerDashboard the tabs will get names `'1'`, `'2'`, `'3'`, etc.
    - If you then make sure that subcomponents get passed a name like `name=self.name+"1"`, then subcomponents will have deterministic names as well.
    - this has been implemented for the default `Composites` that make up the default explainerdashboard
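The deterministic-naming pattern can be sketched in plain Python. The `Component` and `Composite` classes below are hypothetical stand-ins, not the library's actual classes: the point is only that a composite derives its children's names from its own `name` (e.g. `name=self.name+"1"`), so two separately built dashboards agree on all component names.

```python
class Component:
    def __init__(self, name):
        self.name = name

class Composite(Component):
    def __init__(self, name):
        super().__init__(name)
        # subcomponents get deterministic names derived from the parent name,
        # instead of a random uuid generated at construction time
        self.summary = Component(name=self.name + "1")
        self.dependence = Component(name=self.name + "2")

# two builds (e.g. two docker swarm nodes) now agree on component names:
a, b = Composite("shap_tab"), Composite("shap_tab")
print(a.summary.name, a.summary.name == b.summary.name)  # shap_tab1 True
```

With random uuid names, dash callbacks registered on one node would reference element ids that do not exist on another node; deterministic names avoid that class of bug.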
- `hide_whatifcontribution` parameter now called `hide_whatifcontributiongraph`
- added parameter `n_input_cols` to `FeatureInputComponent` to select in how many columns to split the inputs
- Made `PredictionSummaryComponent` and `ShapContributionTableComponent` also work with `FeatureInputComponent`
- added a `PredictionSummaryComponent` and `ShapContributionTableComponent` to the "what if" tab
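As an illustration of the `n_input_cols` idea, here is a minimal sketch of splitting a feature list over a number of columns. The helper name `split_into_cols` is hypothetical, not the component's actual implementation:

```python
import math

def split_into_cols(features, n_input_cols):
    # fill columns top-to-bottom: ceil(len/n) items per column
    per_col = math.ceil(len(features) / n_input_cols)
    return [features[i:i + per_col] for i in range(0, len(features), per_col)]

print(split_into_cols(["Age", "Fare", "Sex", "Deck", "Embarked"], 2))
# [['Age', 'Fare', 'Sex'], ['Deck', 'Embarked']]
```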
- features of `FeatureInputComponent` are now ordered by mean shap importance
- Added range indicator for numerical features in `FeatureInputComponent`
- hide them with `hide_range=True`
- changed a number of dropdowns from `dcc.Dropdown` to `dbc.Select`
- reordered the regression random index selector component a bit
- can now hide entire components on tabs/composites:

        ExplainerDashboard(explainer,
            # importances tab:
            hide_importances=True,
            # classification stats tab:
            hide_globalcutoff=True, hide_modelsummary=True,
            hide_confusionmatrix=True, hide_precision=True,
            hide_classification=True, hide_rocauc=True, hide_prauc=True,
            hide_liftcurve=True, hide_cumprecision=True,
            # regression stats tab:
            # hide_modelsummary=True,
            hide_predsvsactual=True, hide_residuals=True, hide_regvscol=True,
            # individual predictions:
            hide_predindexselector=True, hide_predictionsummary=True,
            hide_contributiongraph=True, hide_pdp=True, hide_contributiontable=True,
            # whatif:
            hide_whatifindexselector=True, hide_inputeditor=True,
            hide_whatifcontribution=True, hide_whatifpdp=True,
            # shap dependence:
            hide_shapsummary=True, hide_shapdependence=True,
            # shap interactions:
            hide_interactionsummary=True, hide_interactiondependence=True,
            # decisiontrees:
            hide_treeindexselector=True, hide_treesgraph=True,
            hide_treepathtable=True, hide_treepathgraph=True,
        ).run()
- Fixed bug where if you passed a default index as `**kwarg`, the random index selector would still fire at startup, overriding the passed index
- Fixed bug where in case of ties in shap values the contributions graph/table would show more than `depth`/`topx` features
- Fixed bug where the favicon was not showing when using a custom bootstrap theme
- Fixed bug where logodds were multiplied by 100 in `ShapContributionTableComponent`
- added checks on the `logins` parameter to give more helpful error messages
    - also now accepts a single pair of logins: `logins=['user1', 'password1']`
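The described `logins` handling could be sketched like this. The `normalize_logins` helper is hypothetical, not the library's actual code: it shows how a single `[user, password]` pair can be accepted alongside a list of pairs, with a clearer error message for malformed input.

```python
def normalize_logins(logins):
    # a single ['user', 'password'] pair gets wrapped into a list of pairs
    if len(logins) == 2 and all(isinstance(item, str) for item in logins):
        logins = [logins]
    for pair in logins:
        if len(pair) != 2 or not all(isinstance(item, str) for item in pair):
            raise ValueError(
                f"logins should be a list of [user, password] pairs, got {pair!r}")
    return logins

print(normalize_logins(['user1', 'password1']))
# [['user1', 'password1']]
```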
- added a `hide_footer` parameter to components with a `CardFooter`
- added `bootstrap` parameter to the dashboard to make theming easier: e.g. `ExplainerDashboard(explainer, bootstrap=dbc.themes.FLATLY).run()`
- added `hide_subtitle=False` parameter to all components with subtitles
- added `description` parameter to all components to adjust the hover-over-title tooltip
- can pass additional `**kwargs` to `ExplainerDashboard.from_config()` to override stored parameters, e.g. `db = ExplainerDashboard.from_config("dashboard.yaml", higher_is_better=False)`
- fixed bug where `drop_na=True` for `explainer.plot_pdp()` was not working.
- `**kwargs` are now also stored when calling `ExplainerDashboard.to_yaml()`
- turned single radioitems into switches
- `RegressionVsColComponent`: hide the "show point cloud next to violin" switch when the feature is not in `cats`
- fixed `RegressionRandomIndexComponent` bug that crashed when `y.astype(np.int64)`; now casting all slider ranges to float.
- fixed pdp bug introduced with setting `X.index` to `self.idxs` where the highlighted index was not the right index
- now hiding the entire `CardHeader` when `hide_title=True`
- index was not initialized in `ShapContributionsGraphComponent` and `ShapContributionsTableComponent`
- Now always have to pass a specific port when terminating a JupyterDash-based (i.e. inline, external or jupyterlab) dashboard: `ExplainerDashboard.terminate(port=8050)`
    - but it now also works as a classmethod, so you don't have to instantiate an actual dashboard just to terminate one!
- ExplainerComponent `_register_callbacks` has been renamed to `component_callbacks` to avoid the confusing underscore
- new: `ClassifierPredictionSummaryComponent`, `RegressionPredictionSummaryComponent`
    - already integrated into the individual predictions tab
    - also added a piechart with predictions
- Wrapped all the ExplainerComponents in `dbc.Card` for a cleaner look to the dashboard.
- added subtitles to all components
- using `go.Scattergl` instead of `go.Scatter` for some plots, which should improve performance with larger datasets
- `ExplainerDashboard.terminate()` is now a classmethod, so you don't have to build an ExplainerDashboard instance in order to terminate a running JupyterDash dashboard.
- added `disable_permutations` boolean argument to `ImportancesComponent` (that you can also pass to `ExplainerDashboard` `**kwargs`)
- Added warning that kwargs get passed down to the ExplainerComponents
- Added exception when trying to use `ClassifierRandomIndexComponent` with a `RegressionExplainer` or `RegressionRandomIndexComponent` with a `ClassifierExplainer`
- dashboard now uses Composites directly instead of the ExplainerTabs
- removed `metrics_markdown()` method. Added `metrics_descriptions()` that describes the metrics in words.
- removed `PredsVsColComponent`, `ResidualsVsColComponent` and `ActualVsColComponent`; these three are now subsumed in `RegressionVsColComponent`.
- Added tooltips everywhere throughout the dashboard to explainer components, plots, dropdowns and toggles of the dashboard itself.
- changed colors on contributions graph: up=green, down=red
- added `higher_is_better` parameter to switch the green and red colors.
- Clarified wording on index selector components
- hiding `group cats` toggle everywhere when no cats are passed
- passing `**kwargs` of ExplainerDashboard down to all tabs and (sub)components, so that you can configure components from an ExplainerDashboard param: e.g. `ExplainerDashboard(explainer, higher_is_better=False).run()` will pass the `higher_is_better` param down to all components. In the case of the `ShapContributionsGraphComponent` and the `XGBoostDecisionTrees` component this will cause the red and green colors to flip (normally green is up and red is down).
- added (very limited) sklearn.Pipeline support. You can pass a `Pipeline` as the `model` parameter as long as the pipeline either:
    - does not add, remove or reorder any input columns, or
    - has a `.get_feature_names()` method that returns the new column names (this is currently being debated in sklearn SLEP007)
- added cutoff slider to CumulativePrecisionComponent
- For RegressionExplainer added ActualVsColComponent and PredsVsColComponent in order to investigate partial correlations between y/preds and various features.
- added `index_name` parameter: name of the index column (defaults to `X.index.name` or `idxs.name`). So when you pass `index_name="Passenger"`, you get a "Random Passenger" button on the index selector instead of "Random Index", etc.
- Fixed a number of bugs for when no labels are passed (`y=None`):
    - fixing `explainer.random_index()` for when y is missing
- Hiding label/y/residuals selector in RandomIndexSelectors
- Hiding y/residuals in prediction summary
- Hiding model_summary tab
- Removing permutation importances from dashboard
- Separated labels for "observed" and "average prediction" better in tree plot
- Renamed "actual" to "observed" in prediction summary
- added unique column check for whatif-component with clearer error message
- model metrics now formatted in a nice table
- removed most of the loading spinners as most graphs are not long loads anyway.
- Explainer parameter `cats` now takes dicts as well, where you can specify your own groups of onehot-encoded columns.
    - e.g. instead of passing `cats=['Sex']` to group `['Sex_female', 'Sex_male', 'Sex_nan']`, you can now do this explicitly: `cats={'Gender': ['Sex_female', 'Sex_male', 'Sex_nan']}`
    - Or combine the two methods: `cats=[{'Gender': ['Sex_female', 'Sex_male', 'Sex_nan']}, 'Deck', 'Embarked']`
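A minimal sketch of how such a mixed `cats` specification could be expanded into explicit groups. The `normalize_cats` helper is hypothetical (not the library's internal implementation) and assumes that bare strings group all columns named with that prefix plus an underscore:

```python
def normalize_cats(cats, columns):
    # dict entries map a group name -> its onehot columns explicitly;
    # a bare string like 'Deck' groups all columns starting with 'Deck_'
    groups = {}
    for entry in cats:
        if isinstance(entry, dict):
            groups.update(entry)
        else:
            groups[entry] = [col for col in columns if col.startswith(entry + "_")]
    return groups

columns = ["Sex_female", "Sex_male", "Sex_nan", "Deck_A", "Deck_B", "Age"]
print(normalize_cats([{"Gender": ["Sex_female", "Sex_male", "Sex_nan"]}, "Deck"], columns))
# {'Gender': ['Sex_female', 'Sex_male', 'Sex_nan'], 'Deck': ['Deck_A', 'Deck_B']}
```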
- You don't have to pass the list of subcomponents to `self.register_components()` anymore: it will infer them automatically from `self.__dict__`.
- ExplainerComponents now automatically stores all parameters to attributes
- ExplainerComponents now automatically stores all parameters to a ._stored_params dict
- ExplainerDashboard.to_yaml() now supports instantiated tabs and stores parameters to yaml
- ExplainerDashboard.to_yaml() now stores the import requirements of subcomponents
- ExplainerDashboard.from_config() now instantiates tabs with stored parameters
- ExplainerDashboard.from_config() now imports classes of subcomponents
- added docstrings to explainer_plots
- added screenshots of ExplainerComponents to docs
- added more gifs to the documentation
- split explainerdashboard.yaml into a explainer.yaml and dashboard.yaml
- Changed UI of the explainerdashboard CLI to reflect this
- This will make it easier in the future to have automatic rebuilds and redeploys when a model file, data file or configuration file changes.
- Load an ExplainerDashboard from a configuration file with the classmethod, e.g.: `ExplainerDashboard.from_config("dashboard.yaml")`
- explainer.dump() to store explainer, explainer.from_file() to load explainer from file
- Explainer.to_yaml() and ExplainerDashboard.to_yaml() can store the configuration of your explainer/dashboard to file.
- explainerdashboard CLI:
    - Start an explainerdashboard from the command-line!
    - start default dashboard from a stored explainer: `explainerdashboard run explainer.joblib`
    - start fully configured dashboard from config: `explainerdashboard run explainerdashboard.yaml`
    - build explainer based on input files defined in .yaml (model.pkl, data.csv, etc): `explainerdashboard build explainerdashboard.yaml`
    - includes new ascii logo :)
- If idxs is not passed use X.index instead
- explainer.idxs performance enhancements
- added whatif component and tab to InlineExplainer
- added cumulative precision component to InlineExplainer
Version 0.2.6:
- more straightforward imports: `from explainerdashboard import ClassifierExplainer, RegressionExplainer, ExplainerDashboard, InlineExplainer`
- all custom imports (such as ExplainerComponents, Composites, Tabs, etc) combined under `explainerdashboard.custom`: `from explainerdashboard.custom import *`
- New dashboard tab: WhatIfComponent/WhatIfComposite/WhatIfTab: allows you to explore what-if scenarios by editing multiple features and observing shap contributions and pdp plots. Switch it off with ExplainerDashboard parameter `whatif=False`.
- New login functionality: you can restrict access to your dashboard by passing a list of `[login, password]` pairs: `ExplainerDashboard(explainer, logins=[['login1', 'password1'], ['login2', 'password2']]).run()`
- Added `target` parameter to explainer, to make more descriptive plots: e.g. by setting `target='Fare'`, plots will show 'Predicted Fare' instead of simply 'Prediction'.
- in detailed shap/interaction summary plots, can now click on single shap value for a particular feature, and have that index highlighted for all features.
- autodetecting Google colab environment and setting mode='external' (and suggesting so for jupyter notebook environments)
- confusion matrix now showing both percentage and counts
- Added classifier model performance summary component
- Added cumulative precision component
- added documentation on how to deploy to heroku
- Cleaned up modebars for figures
- ClassifierExplainer asserts predict_proba attribute of model
- with model_output='logodds' still display probability in prediction summary
- for ClassifierExplainer: check if has predict_proba methods at init
- removed monkeypatching shap_explainer note
- added ExplainerDashboard parameter "responsive" (defaults to True) to make the dashboard layout responsive on mobile devices. Set it to False when e.g. running tests on headless browsers.
- Fixes bug that made RandomForest and xgboost explainers unpicklable
- Added tests for picklability of explainers
- RandomForestClassifierExplainer and RandomForestRegressionExplainer will be deprecated: can now simply use ClassifierExplainer or RegressionExplainer and the mixin class will automatically be loaded.
- Now also support for visualizing individual trees for XGBoost models (XGBClassifier and XGBRegressor)! The XGBExplainer mixin class will be automatically loaded and make decisiontree_df(), decision_path() and plot_trees() methods available. Decision Trees tab and components now also work for XGBoost models.
- new parameter n_jobs for calculations that can be parallelized (e.g. permutation importances)
- contrib_df, plot_shap_contributions: can order by global shap feature importance with sort='importance' (as well as 'abs', 'high-to-low', 'low-to-high')
- added actual outcome to plot_trees (for both RandomForest and XGB)
- optimized code for calculating permutation importance, adding possibility to calculate in parallel
- shap dependence component: if no color col selected, output standard blue dots instead of ignoring update
- added selenium integration tests for dashboards (also working with github actions)
- added tests for multiclass classification, DecisionTree and ExtraTrees models
- added tests for XGBExplainers
- added proper docstrings to explainer_methods.py
- kernel shap bug fixed
- contrib_df bug with topx fixed
- fix for shap v0.36: import approximate_interactions from shap.utils instead of shap.common
- Removed ExplainerHeader from ExplainerComponents
- so also removed parameter `header_mode` from ExplainerComponent parameters
- You can now instead synchronize pos labels across components with a PosLabelSelector and PosLabelConnector.
- In regression plots, instead of the boolean `ratio=True/False`, you now pass `residuals={'difference', 'ratio', 'log-ratio'}`
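The three residuals options amount to the following computations. This is an illustrative sketch (the `residuals` helper is hypothetical, not the library's code), assuming residuals compare observed `y` against predictions:

```python
import math

def residuals(y, preds, kind="difference"):
    if kind == "difference":
        return [yi - pi for yi, pi in zip(y, preds)]
    if kind == "ratio":
        return [yi / pi for yi, pi in zip(y, preds)]
    if kind == "log-ratio":
        # symmetric around 0: over- and under-prediction get comparable scale
        return [math.log(yi / pi) for yi, pi in zip(y, preds)]
    raise ValueError("kind should be 'difference', 'ratio' or 'log-ratio'")

print(residuals([2.0, 4.0], [1.0, 4.0], "difference"))  # [1.0, 0.0]
print(residuals([2.0, 4.0], [1.0, 4.0], "ratio"))       # [2.0, 1.0]
```

The log-ratio is often the most readable option when prediction errors are multiplicative rather than additive.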
- decisiontree_df_summary renamed to decisiontree_summary_df (in line with contrib_summary_df)
- added check all shap values >-1 and <1 for model_output=probability
- added parameter pos_label to all components and ExplainerDashboard to set the initial pos label
- added parameter block_selector_callbacks to ExplainerDashboard to block the global pos label selector's callbacks. If you already have PosLabelSelectors in your layout, this prevents clashes.
- plot actual vs predicted now supports logging only the x axis or only the y axis
- residuals plots now support option residuals='log-ratio'
- residuals-vs-col plot now shows violin plot for categorical features
- added sorting option to contributions plot/graph: sort={'abs', 'high-to-low', 'low-to-high'}
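The sorting options can be illustrated with a small sketch. The `sort_contributions` helper operating on `(feature, shap_value)` pairs is hypothetical, not the library's implementation:

```python
def sort_contributions(contribs, sort="abs"):
    # contribs: list of (feature, shap_value) pairs
    if sort == "abs":
        return sorted(contribs, key=lambda fc: abs(fc[1]), reverse=True)
    if sort == "high-to-low":
        return sorted(contribs, key=lambda fc: fc[1], reverse=True)
    if sort == "low-to-high":
        return sorted(contribs, key=lambda fc: fc[1])
    raise ValueError("sort should be 'abs', 'high-to-low' or 'low-to-high'")

contribs = [("Age", -0.3), ("Fare", 0.2), ("Sex", 0.5)]
print(sort_contributions(contribs, "abs"))
# [('Sex', 0.5), ('Age', -0.3), ('Fare', 0.2)]
```

`'abs'` ranks by magnitude regardless of direction, while `'high-to-low'`/`'low-to-high'` keep positive and negative contributions separated.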
- added final prediction to contributions plot
- Interaction connector bug fixed in detailed summary: click didn't work
- pos label was ignored in explainer.plot_pdp()
- Fixed some UX issues with interactions components
- All `State['tabs', 'value']` conditions have been taken out of callbacks. This used to fix some bugs with dash tabs, but it seems to work even without, so there is also no need to insert dummy_tabs in `ExplainerHeader`.
- All `ExplainerComponents` now have their own pos label selector, meaning that they are now fully self-contained and independent. No global dash elements in component callbacks.
- You can define the layout of ExplainerComponents in a `layout()` method instead of `_layout()`. You should still define `component_callbacks()` to define callbacks, so that all subcomponents that have been registered will automatically get their callbacks registered as well.
- Added regression `self.units` to prediction summary, shap plots, contributions plots/table, pdp plot and trees plot.
- Clearer title for MEAN_ABS_SHAP importance and summary plots
- replace na_fill value in contributions table by "MISSING"
- add string idxs to shap and interactions summary and dependence plots, including the violin plots
- pdp plot for classification now showing percentages instead of fractions
- added hide_title parameter to all components with a title
- DecisionPathGraphComponent not available for RandomForestRegression models for now.
- In contributions graph base value now called 'population average' and colored yellow.
- InlineExplainer api has been completely redefined
- JupyterExplainerDashboard, ExplainerTab and JupyterExplainerTab have been deprecated
- Major rewrite and refactor of the dashboard code, now modularized into ExplainerComponents and ExplainerComposites.
- ExplainerComponents can now be individually accessed through InlineExplainer
- All elements of components can now be switched on or off or be given an initial value.
- Makes it much, much easier to design own custom dashboards.
- ExplainerDashboard can be passed an arbitrary list of components to display as tabs.
- Added sections InlineExplainer, ExplainerTabs, ExplainerComponents, CustomDashboards and Deployment
- Added screenshots to documentation.
- fixes residuals: now computed as y-pred instead of pred-y
- Random Index Selector redesigned
- Prediction summary redesigned
- Tables now follow dbc.Table formatting
- All connections between components now happen through explicit connectors
- Layout of most components redesigned, with all elements made hideable
- Fixed bug with GradientBoostingClassifier where the output format of shap.expected_value was not properly accounted for.
- Cleaned up standalone label selector code
- Added check for shap base values to be between 0 and 1 for model_output=='probability'
- ExplainerDashboardStandaloneTab is now called ExplainerTab
- added support for the jupyter-dash package for inline dashboards in Jupyter notebooks, adding the following dashboard classes:
    - `JupyterExplainerDashboard`
    - `JupyterExplainerTab`
    - `InlineExplainer`