References for: doi:10.3233/DS-240059
Full identifier: https://doi.org/10.3233/DS-240059
Nanopublication | Part | Subject | Predicate | Object | Published By | Published On |
---|---|---|---|---|---|---|
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | doi:10.3233/DS-240059 | | | Tobias Kuhn | 2024-07-12T09:07:29.273Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | doi:10.3233/DS-240059 | | Measuring Data Drift with the Unstable Population Indicator | Tobias Kuhn | 2024-07-12T09:07:29.273Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | doi:10.3233/DS-240059 | | 2024-06-26 | Tobias Kuhn | 2024-07-12T09:07:29.273Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | doi:10.3233/DS-240059 | | | Tobias Kuhn | 2024-07-12T09:07:29.273Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | doi:10.3233/DS-240059 | | | Tobias Kuhn | 2024-07-12T09:07:29.273Z |
| | [assertion](http://www.nanopub.org/nschema#hasAssertion "links a nanopublication to its assertion") | doi:10.3233/DS-240059 | | Measuring data drift is essential in machine learning applications where model scoring (evaluation) is done on data samples that differ from those used in training. The Kullback-Leibler divergence is a common measure of shifted probability distributions, for which discretized versions are invented to deal with binned or categorical data. We present the Unstable Population Indicator, a robust, flexible and numerically stable, discretized implementation of Jeffrey's divergence, along with an implementation in a Python package that can deal with continuous, discrete, ordinal and nominal data in a variety of popular data types. We show the numerical and statistical properties in controlled experiments. It is not advised to employ a common cut-off to distinguish stable from unstable populations, but rather to let that cut-off depend on the use case. | Tobias Kuhn | 2024-07-12T09:07:29.273Z |
| | [pubinfo](http://www.nanopub.org/nschema#hasPublicationInfo "links a nanopublication to its pubinfo") | doi:10.3233/DS-240059 | | | Tobias Kuhn | 2024-07-12T09:07:29.273Z |
| | [pubinfo](http://www.nanopub.org/nschema#hasPublicationInfo "links a nanopublication to its pubinfo") | doi:10.3233/DS-240059 | | | Tobias Kuhn | 2024-07-12T09:07:29.273Z |
| | [provenance](http://www.nanopub.org/nschema#hasProvenance "links a nanopublication to its provenance") | doi:10.3233/DS-240059 | | | Tobias Kuhn | 2024-07-12T07:32:15.922Z |
| | [pubinfo](http://www.nanopub.org/nschema#hasPublicationInfo "links a nanopublication to its pubinfo") | doi:10.3233/DS-240059 | | | Tobias Kuhn | 2024-07-12T07:32:15.922Z |
| | [provenance](http://www.nanopub.org/nschema#hasProvenance "links a nanopublication to its provenance") | doi:10.3233/DS-240059 | | | Tobias Kuhn | 2024-07-12T07:32:15.922Z |
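
The abstract recorded above describes a discretized form of the symmetrised Kullback-Leibler divergence, which it calls Jeffrey's divergence, J(P, Q) = sum over bins of (p_i - q_i) * ln(p_i / q_i). As a rough illustration of that quantity for categorical data, here is a minimal sketch. It is not the API of the Python package mentioned in the abstract: the function name `jeffreys_divergence`, the `smoothing` parameter, and the additive smoothing used to avoid empty bins are all assumptions made for this example.

```python
import math
from collections import Counter


def jeffreys_divergence(sample_a, sample_b, smoothing=1.0):
    """Symmetrised KL divergence between two categorical samples.

    Hypothetical helper for illustration only. `smoothing` is an assumed
    additive constant applied to every bin so that no probability is zero.
    """
    if smoothing <= 0:
        raise ValueError("smoothing must be positive to avoid empty bins")

    counts_a = Counter(sample_a)
    counts_b = Counter(sample_b)
    categories = set(counts_a) | set(counts_b)
    if not categories:
        raise ValueError("both samples are empty")

    # Normalising constants after adding `smoothing` to each bin.
    k = len(categories)
    total_a = sum(counts_a.values()) + smoothing * k
    total_b = sum(counts_b.values()) + smoothing * k

    divergence = 0.0
    for category in categories:
        p = (counts_a[category] + smoothing) / total_a
        q = (counts_b[category] + smoothing) / total_b
        # Each term is non-negative: (p - q) and ln(p / q) share a sign.
        divergence += (p - q) * math.log(p / q)
    return divergence


if __name__ == "__main__":
    train = ["red", "red", "red", "blue"]
    score = ["red", "blue", "blue", "blue"]
    print(jeffreys_divergence(train, score))
```

The result is zero for identical bin frequencies and grows as the two samples diverge. In line with the abstract, any threshold for calling a population unstable would have to be chosen per use case rather than fixed in advance.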