Evaluation methods that are "strictly proper" cannot be artificially improved through hedging, which makes them fair methods for assessing the accuracy of probabilistic forecasts. These scoring rules are useful for evaluating machine learning or statistical models that produce probabilities instead of point estimates. In particular, they are widely used for evaluating weather forecasts.
properscoring currently contains optimized and extensively tested routines for calculating the Continuous Ranked Probability Score (CRPS) and the Brier score:

- CRPS for an ensemble forecast
- CRPS for a Gaussian distribution
- CRPS for an arbitrary cumulative distribution function
- Brier score for binary probability forecasts
- Brier score for threshold exceedances with an ensemble forecast
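To illustrate what these scores measure, here is a minimal sketch of the first two CRPS variants and the binary Brier score, written from their standard textbook definitions rather than this package's internals (the function names `crps_empirical`, `crps_gaussian_closed_form`, and `brier_score_binary` are illustrative, not the package's API):

```python
import math
import numpy as np

def crps_empirical(observation, ensemble):
    # Empirical CRPS of an ensemble forecast:
    #   CRPS = E|X - y| - 0.5 * E|X - X'|,
    # where X, X' are independent draws (ensemble members) and y is observed.
    ensemble = np.asarray(ensemble, dtype=float)
    abs_error = np.mean(np.abs(ensemble - observation))
    spread = np.mean(np.abs(ensemble[:, None] - ensemble[None, :]))
    return abs_error - 0.5 * spread

def crps_gaussian_closed_form(observation, mu, sigma):
    # Closed-form CRPS for a Gaussian forecast N(mu, sigma^2).
    z = (observation - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

def brier_score_binary(observation, forecast_probability):
    # Brier score for a binary outcome (0 or 1): (p - o)^2. Lower is better.
    return (forecast_probability - observation) ** 2
```

Note that for a one-member ensemble the empirical CRPS reduces to the absolute error, so CRPS can be read as a probabilistic generalization of mean absolute error.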
If you're interested in these types of metrics, we'd love to hear your thoughts on this package.