Description:
<jats:p>
We address two major obstacles to the practical deployment of AI-based models on distributed private data. Whether a model was trained by a federation of cooperating clients or trained centrally, (1) the output scores must be calibrated, and (2) performance metrics must be evaluated, all without assembling labels in one place. In particular, we show how to perform calibration and compute the standard metrics of precision, recall, accuracy and ROC-AUC in the federated setting under three privacy models: (
<jats:italic>i</jats:italic>
) secure aggregation, (
<jats:italic>ii</jats:italic>
) distributed differential privacy, and (
<jats:italic>iii</jats:italic>
) local differential privacy. Our theorems and experiments clarify tradeoffs between privacy, accuracy, and data efficiency. They also help determine whether a given application has sufficient data to support federated calibration and evaluation.
</jats:p>