Found the Kubler Loss quite interesting :) Would love to see you run some experiments on real datasets, especially on NLP and vision tasks.
This loss would really hold water if you trained a dog/cat classifier with it and it output high uncertainty for all frog images, i.e., out-of-distribution inputs.
Also, I didn't find your formulation particularly tied to MSE. I'd also love to see how a categorical cross-entropy (CCE) version of it would fare.
Finally, I don't know if you are aware of Monte Carlo Dropout; it's a really simple technique for extracting uncertainty information from a neural network. Basically, you keep dropout active at inference time, run multiple forward passes on the same input, and measure the variance across the outputs; if the model is very sure of its prediction, the variance should be minimal.
Here is a link to the paper: https://arxiv.org/abs/1506.02142
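For what it's worth, here is a minimal sketch of the idea in PyTorch (just assuming PyTorch here; the toy architecture, input shape, and sample count are placeholders, not anything from your post):

```python
import torch
import torch.nn as nn

# A hypothetical model containing a dropout layer.
model = nn.Sequential(
    nn.Linear(10, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 2),
)

def mc_dropout_predict(model, x, n_samples=50):
    """Run n_samples stochastic forward passes with dropout kept active."""
    model.train()  # keeps dropout stochastic at inference time
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    model.eval()
    # Mean across passes is the prediction; std is the uncertainty estimate.
    return preds.mean(dim=0), preds.std(dim=0)

x = torch.randn(1, 10)  # a placeholder input
mean, std = mc_dropout_predict(model, x)
print(mean, std)  # low std => the model is confident about this input
```

One caveat: `model.train()` flips every layer into training mode, so in a real network with batch norm you'd want to set only the dropout modules to train mode instead.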