• Media type: E-Book
  • Title: Textual Factors : A Scalable, Interpretable, and Data-driven Approach to Analyzing Unstructured Information
  • Contributor: Cong, Lin William [Author]; Liang, Tengyuan [Other]; Zhang, Xiao [Other]
  • Published: [S.l.]: SSRN, [2019]
  • Extent: 1 online resource (65 p.)
  • Language: English
  • DOI: 10.2139/ssrn.3307057
  • Footnote: According to SSRN, the original version of the document was created on September 1, 2019
  • Description: We introduce a general framework for analyzing large-scale text-based data, combining the strengths of neural-network language processing and generative statistical modeling. Our methodology generates textual factors by (i) representing texts using vector word embeddings, (ii) clustering words using locality-sensitive hashing, and (iii) identifying spanning vector clusters through topic modeling. Our data-driven approach captures complex linguistic structures while ensuring computational scalability and economic interpretability. We also discuss applications of textual factors in (i) prediction and inference, (ii) interpreting (non-text-based) models and variables, and (iii) constructing new text-based metrics and explanatory variables, with illustrations using topics in finance and economics such as macroeconomic forecasting and factor asset pricing.
  • Access State: Open Access