Release Notes
BUG FIX: Errors in num_all_caps
The function for counting all-caps words previously had an error in which the logic counted single-letter words. These are now excluded, so the resulting output should more accurately reflect words that are deliberately written in "ALL CAPS," rather than portions of emojis (e.g., :D) or single-letter capitals (e.g., I, A).
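As a rough illustration of the corrected behavior, the sketch below counts only words of two or more capital letters. The function name count_all_caps and the regular expression are assumptions made for this example; they are not the actual num_all_caps implementation.

```python
import re

# Hypothetical pattern: a run of at least two uppercase ASCII letters
# bounded by word boundaries, so "I", "A", and the "D" in ":D" do not match.
ALL_CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")


def count_all_caps(text: str) -> int:
    """Count words of two or more letters written entirely in capitals."""
    return len(ALL_CAPS_WORD.findall(text))


# "SO" and "HAPPY" are counted; "I" and ":D" are not.
print(count_all_caps("I am SO HAPPY :D"))  # prints 2
```

Note that this sketch would still count two-letter text emoticons such as "XD" as all-caps words, and it only considers ASCII letters.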
Updated handling of timestamp formats
We were previously fairly lenient in how we handled timestamps: we allowed the timestamp column to contain null or unparseable values, which sometimes led to the TCT throwing uncaught errors. We now validate the formatting of timestamps up front, when the FeatureBuilder is instantiated, and we do not allow users to proceed without correctly formatted timestamps. The change also removes some code redundancy: previously, we validated timestamps separately each time we used the timestamp column, rather than once at the beginning.
The following is the revised expected behavior in this version:
- If the user does not pass in a timestamp column, timestamp-related features ['Time Difference', 'Team Burstiness'] are removed from the output but execution proceeds normally.
- If the timestamp column is provided, we verify that there are no null values and that all values are parseable by pd.to_datetime(). Otherwise, we halt execution and raise an error so that the user can correct the issues in the timestamp column.
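The two cases above can be sketched roughly as follows. This is a minimal illustration, not the FeatureBuilder's actual code: the function name check_timestamps, its signature, the error type, and the error messages are all assumptions made for the example.

```python
from typing import List, Optional

import pandas as pd

# Timestamp-related features named in these release notes.
TIMESTAMP_FEATURES = ["Time Difference", "Team Burstiness"]


def check_timestamps(df: pd.DataFrame, timestamp_col: Optional[str] = None) -> List[str]:
    """Return the features to drop from the output; raise if the column is invalid.

    Hypothetical helper: the name, signature, and ValueError are assumptions.
    """
    # Case 1: no timestamp column passed in, so drop the timestamp features
    # and let execution proceed normally.
    if timestamp_col is None:
        return list(TIMESTAMP_FEATURES)

    column = df[timestamp_col]

    # Case 2a: null values are not allowed.
    if column.isnull().any():
        raise ValueError(f"Column '{timestamp_col}' contains null values.")

    # Case 2b: every value must be parseable by pd.to_datetime().
    try:
        pd.to_datetime(column)
    except (ValueError, TypeError) as exc:
        raise ValueError(
            f"Column '{timestamp_col}' contains values that pd.to_datetime() cannot parse."
        ) from exc

    # Timestamps are valid, so no features need to be dropped.
    return []
```

Running a check like this once at instantiation is what allows later steps to use the timestamp column without re-validating it.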