• Media type: E-Book
  • Title: Online Decision-Making with High-Dimensional Covariates
  • Contributor: Bastani, Hamsa [Author]; Bayati, Mohsen [Author]
  • Published: [S.l.]: SSRN, 2019
  • Extent: 1 online resource (58 p.)
  • Language: English
  • DOI: 10.2139/ssrn.2661896
  • Footnote: In: Operations Research
    According to SSRN, the original version of the document was created on June 5, 2015
  • Description: Big data has enabled decision-makers to tailor decisions at the individual level in a variety of domains such as personalized medicine and online advertising. This involves learning a model of decision rewards conditional on individual-specific covariates. In many practical settings, these covariates are high-dimensional; however, typically only a small subset of the observed features is predictive of a decision’s success. We formulate this problem as a K-armed contextual bandit with high-dimensional covariates, and present a new efficient bandit algorithm based on the LASSO estimator. We prove that our algorithm’s cumulative expected regret scales at most poly-logarithmically in the covariate dimension d; to the best of our knowledge, this is the first such bound for a contextual bandit. The key step in our analysis is proving a new tail inequality that guarantees the convergence of the LASSO estimator despite the non-i.i.d. data induced by the bandit policy. Furthermore, we illustrate the practical relevance of our algorithm by evaluating it on a simplified version of a medication dosing problem. A patient’s optimal medication dosage depends on the patient’s genetic profile and medical records; an incorrect initial dosage may result in adverse consequences such as stroke or bleeding. We show that our algorithm outperforms both existing bandit methods and physicians, correctly dosing a majority of patients.
  • Access State: Open Access