Thanks for the insightful reply.
I was rereading the concepts this morning and some things came to mind.
Regarding stability, have you considered using cross-validation to assess it? How much the predictivity varies across folds might be a good indicator, or at least a way to rank models relative to each other.
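To make the cross-validation idea concrete, here is a minimal sketch in plain Python. It is only my guess at how this could look, not your method: `fit` and `score` are hypothetical callables standing in for whatever learner and predictivity measure you use, and the spread of the per-fold scores is just one possible stability indicator.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fold_scores(X, y, fit, score, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, repeat for each fold.

    `fit(X_train, y_train)` returns a model and
    `score(model, X_test, y_test)` returns a number; both are
    hypothetical placeholders supplied by the caller.
    """
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(
            score(model, [X[i] for i in held_out], [y[i] for i in held_out])
        )
    return scores


def stability_summary(scores):
    """Mean score plus its spread across folds (smaller spread = more stable)."""
    return {"mean": statistics.mean(scores), "spread": statistics.pstdev(scores)}
```

Two models with a similar mean but very different spread would then be ranked differently on stability, which is the relative ranking I had in mind.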
Regarding simplicity, I am new to this literature, so take my opinion with a grain of salt. I can't really see why the sum of rule lengths is a particularly interesting measure. I would rather have a small set of rules with a few clauses each than either extreme, such as one very long rule or many tiny ones.
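A toy illustration of my concern, with made-up rule sets (each rule is just a list of hypothetical clause names): all three sets below have the same sum of rule lengths, so that measure alone cannot tell them apart, while the number of rules and the longest rule can.

```python
def rule_set_profile(rules):
    """Summarise a rule set given as a list of rules, each a list of clauses."""
    lengths = [len(rule) for rule in rules]
    return {
        "n_rules": len(lengths),
        "total_length": sum(lengths),  # the sum-of-rule-lengths measure
        "max_length": max(lengths, default=0),
    }


# Three hypothetical rule sets with six clauses in total.
one_long = [["a", "b", "c", "d", "e", "f"]]
many_tiny = [["a"], ["b"], ["c"], ["d"], ["e"], ["f"]]
balanced = [["a", "b", "c"], ["d", "e", "f"]]

for name, rules in [("one_long", one_long), ("many_tiny", many_tiny), ("balanced", balanced)]:
    print(name, rule_set_profile(rules))
```

Something that looks at both the number of rules and the longest rule would match my intuition better, though I don't know what is standard in this literature.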
As I understood it, you are using decision trees, so why not use tree concepts? If you set the maximum height to, say, 5, simplicity might be the number of actually used nodes among the 2^5 that could have been used.
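Here is a rough sketch of what I mean, under my own assumptions: I read "nodes" as leaves (a binary tree of height h has at most 2^h leaves and 2^h - 1 internal splits), and I represent a tree as nested tuples, where a 2-tuple is a split and anything else is a leaf. Neither choice comes from your work; they are just for illustration.

```python
def count_leaves(tree):
    """Number of leaves; a tuple (left, right) is a split, anything else a leaf."""
    if isinstance(tree, tuple):
        left, right = tree
        return count_leaves(left) + count_leaves(right)
    return 1


def height(tree):
    """Number of splits on the longest path from the root to a leaf."""
    if isinstance(tree, tuple):
        left, right = tree
        return 1 + max(height(left), height(right))
    return 0


def leaf_usage(tree, max_height):
    """Fraction of the 2**max_height possible leaves that the tree uses."""
    if height(tree) > max_height:
        raise ValueError("tree is taller than max_height")
    return count_leaves(tree) / 2 ** max_height


# Hypothetical tree: one split at the root, one more split on the left branch.
small_tree = (("A", "B"), "C")
print(leaf_usage(small_tree, max_height=5))
```

A lower fraction would mean a simpler tree, and since every leaf corresponds to one rule, this also bounds how many rules you can end up with.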