It seems beneficial to have more than one test suite per example when demonstrating mutation testing. For example, one test suite could be deliberately "bad", with a low mutation score, while another could be "good", with a high score.
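To make the contrast concrete, here is a minimal sketch of how two suites for the same code can score very differently. Everything in it is hypothetical and for illustration only: `is_adult`, the hand-written mutants, and the two suite functions are not taken from the existing examples, and a real mutation testing tool would generate the mutants itself. The score is computed as killed mutants divided by total mutants.

```python
from typing import Callable, List

Impl = Callable[[int], bool]


def is_adult(age: int) -> bool:
    """Original code under test (hypothetical example)."""
    return age >= 18


# Hand-written mutants of is_adult; a real tool would generate these.
MUTANTS: List[Impl] = [
    lambda age: age > 18,   # boundary mutant: >= replaced by >
    lambda age: age <= 18,  # relational operator flipped: >= replaced by <=
    lambda age: True,       # return value replaced by a constant
]


def bad_suite(fn: Impl) -> bool:
    """The "bad" suite: a single check far away from the boundary."""
    return fn(30) is True


def good_suite(fn: Impl) -> bool:
    """The "good" suite: checks the boundary and both sides of it."""
    return fn(30) is True and fn(18) is True and fn(5) is False


def mutation_score(suite: Callable[[Impl], bool], mutants: List[Impl]) -> float:
    """Fraction of mutants killed, i.e. mutants for which the suite fails."""
    killed = sum(1 for mutant in mutants if not suite(mutant))
    return killed / len(mutants)


print(f"bad suite:  {mutation_score(bad_suite, MUTANTS):.2f}")
print(f"good suite: {mutation_score(good_suite, MUTANTS):.2f}")
```

Both suites pass against the original `is_adult`, but in this sketch the "bad" suite kills only one of the three mutants while the "good" suite kills all of them.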
The script(s) would need to be updated to support choosing which test suite to run. Alternatively, they could always run both.
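One way the selection could look, sketched with Python's standard `argparse` module. The `--suite` flag, the suite names, and the `tests/bad` and `tests/good` paths are all assumptions made for this sketch, not part of the current script(s). Defaulting to `both` covers the "always run both" option while still allowing a single suite to be picked.

```python
import argparse
from typing import Dict, List, Optional

# Hypothetical suite names and locations; adjust to the real layout.
SUITES: Dict[str, str] = {
    "bad": "tests/bad",
    "good": "tests/good",
}


def parse_suites(argv: Optional[List[str]] = None) -> List[str]:
    """Return the names of the suites to run, based on a --suite flag."""
    parser = argparse.ArgumentParser(
        description="Run mutation testing against one or both test suites."
    )
    parser.add_argument(
        "--suite",
        choices=[*SUITES, "both"],
        default="both",  # with no flag given, every suite is run
        help="which test suite to run (default: both)",
    )
    args = parser.parse_args(argv)
    return list(SUITES) if args.suite == "both" else [args.suite]


def suite_paths(argv: Optional[List[str]] = None) -> List[str]:
    """Map the selected suite names to their (hypothetical) directories."""
    return [SUITES[name] for name in parse_suites(argv)]
```

For instance, `suite_paths(["--suite", "good"])` would select only the "good" suite's directory, while `suite_paths([])` would select both. The actual invocation of the mutation testing tool on each path is left out, since it depends on the tool the examples use.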