Tales from the Binomial Tail: Confidence intervals for balanced accuracy

· 24 min read
Ted Sandler
Senior Applied Scientist at Groundlight
Leo Dirac
CTO and Co-founder at Groundlight

At Groundlight, we put careful thought into measuring the correctness of our machine learning detectors. In the simplest case, this means measuring detector accuracy. But our customers have vastly different performance needs, since our platform allows them to train an ML model for nearly any Yes/No visual question-answering task. A single metric like accuracy is unlikely to provide adequate resolution for all such problems. Some customers might care more about avoiding false positives (high precision), whereas others might care more about avoiding false negatives (high recall).
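To make the point concrete, here is a minimal Python sketch of how two detectors with the same accuracy can differ in precision and recall. The `summarize` helper and the confusion-matrix counts are hypothetical, chosen only for illustration. They are not Groundlight code or real evaluation results.

```python
def summarize(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: of the predicted Yes answers, the fraction that were correct.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    # Recall: of the true Yes examples, the fraction the detector found.
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Two hypothetical detectors, each evaluated on 100 made-up examples.
detector_a = summarize(tp=40, fp=10, fn=0, tn=50)  # errs only with false positives
detector_b = summarize(tp=40, fp=0, fn=10, tn=50)  # errs only with false negatives

for name, stats in [("A", detector_a), ("B", detector_b)]:
    print(name, {k: round(v, 2) for k, v in stats.items()})
```

Both detectors score 0.9 on accuracy, yet detector A has precision 0.8 with recall 1.0, while detector B has precision 1.0 with recall 0.8. A customer who cannot tolerate false positives would prefer B, and accuracy alone would not reveal the difference.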