Release v1.2.0
This update enhances the preprocessing and embedding layers in the mambular package, introducing several key improvements:
- Feature-Specific Preprocessing: The `Preprocessor` class now includes a feature preprocessing dictionary, enabling different preprocessing strategies for each feature.
- Support for Unstructured Data: The model can now handle a combination of tabular features and unstructured data, such as images and text.
- Latent Representation Generation: It is now possible to generate latent representations of the input data, improving downstream modeling and interpretability.
These changes enhance flexibility and extend mambular's capabilities to more diverse data modalities.
Preprocessing improvements:
`mambular/preprocessing/preprocessor.py`: Added a `feature_preprocessing` parameter to allow custom preprocessing techniques for individual columns. Updated the `fit` method to use this parameter for both numerical and categorical features. [1] [2] [3] [4] [5]
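The idea behind a per-column preprocessing dictionary can be sketched as follows. This is a dependency-free illustration, not mambular's implementation: the strategy names (`"standardize"`, `"minmax"`), the helper `fit_transform_columns`, and the fallback to a default strategy are all assumptions made for the example. Only the notion of a `feature_preprocessing` mapping from column to strategy comes from the notes above.

```python
from statistics import mean, pstdev


def standardize(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def minmax(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


# Hypothetical registry of strategy names; the real package may use other names.
STRATEGIES = {"standardize": standardize, "minmax": minmax}


def fit_transform_columns(table, feature_preprocessing=None, default="standardize"):
    """Apply a per-column strategy; columns not listed fall back to `default`."""
    feature_preprocessing = feature_preprocessing or {}
    out = {}
    for name, values in table.items():
        strategy = feature_preprocessing.get(name, default)
        out[name] = STRATEGIES[strategy](values)
    return out


table = {"age": [20, 30, 40], "income": [1000, 2000, 4000]}
# Only "age" gets a custom strategy; "income" uses the default.
result = fit_transform_columns(table, feature_preprocessing={"age": "minmax"})
```

Keeping the mapping optional means existing code that sets one global strategy would keep working, while individual columns can opt into something different.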
Embedding layer updates:
`mambular/arch_utils/layer_utils/embedding_layer.py`: Modified the `forward` method to handle different dimensions of categorical embeddings and ensure they are properly processed. [1] [2]
Allow unstructured data as inputs:
`mambular/arch_utils/layer_utils/embedding_layer.py`: Modified the `forward` method to handle `num_features`, `cat_features` and pre-embedded unstructured data. [1] [2]
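As a rough illustration of how three kinds of input can end up in one token sequence, consider the sketch below. It uses plain Python lists instead of PyTorch tensors so that it runs without dependencies, and it is not the actual embedding layer: the function `embed_inputs`, the weight layout, and the assumption that pre-embedded vectors already have the model dimension are all hypothetical.

```python
def embed_inputs(num_features, cat_features, embeddings, num_weights, cat_tables):
    """Return one list of equally sized token vectors for a single row.

    num_features: list of floats, one per numerical column
    cat_features: list of integer category indices, one per categorical column
    embeddings:   list of vectors produced elsewhere (e.g. by an image or text encoder)
    num_weights:  one weight vector per numerical column (hypothetical parameters)
    cat_tables:   one lookup table per categorical column (hypothetical parameters)
    """
    tokens = []
    # Numerical columns: scale a per-column weight vector by the scalar value.
    for value, weight in zip(num_features, num_weights):
        tokens.append([value * w for w in weight])
    # Categorical columns: look up the row for the category index.
    for index, table in zip(cat_features, cat_tables):
        tokens.append(list(table[index]))
    # Pre-embedded unstructured data: assumed to be model-sized already, passed through.
    for vector in embeddings:
        tokens.append(list(vector))
    return tokens


tokens = embed_inputs(
    num_features=[2.0],
    cat_features=[1],
    embeddings=[[0.3, 0.4]],
    num_weights=[[0.5, -1.0]],
    cat_tables=[[[0.0, 0.0], [0.1, 0.2]]],
)
```

Here the row yields three tokens of size two: one from the numerical column, one from the categorical lookup, and one from the externally computed embedding.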
Get latent representation of tables:
`mambular/base_models/basemodel.py`: Updated the `encode` method to accept a single `data` parameter instead of separate `num_features` and `cat_features` parameters. [1] [2]
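The change in calling convention can be pictured with a toy stand-in. The class `ToyModel`, the tuple layout of `data`, and the mean pooling used to form the latent vector are assumptions for illustration only. The notes above state only that `encode` now takes a single `data` argument.

```python
class ToyModel:
    """Stand-in for a model with an encode method; not mambular's base model."""

    def encode(self, data):
        # `data` bundles all input groups; here, a tuple of lists of token vectors.
        tokens = [token for group in data for token in group]
        dim = len(tokens[0])
        # Mean-pool the tokens into one latent vector (assumed pooling choice).
        return [sum(token[i] for token in tokens) / len(tokens) for i in range(dim)]


num_tokens = [[1.0, 2.0]]
cat_tokens = [[3.0, 4.0]]
# One argument carries every input group, so adding a new group
# does not require changing the method signature.
latent = ToyModel().encode((num_tokens, cat_tokens))
```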