Description:
Web-based innovation indicators may provide new insights into firm-level innovation activities. However, little is yet known about the accuracy and relevance of web-based information. In this study, we use data on 4,485 German firms from the Mannheim Innovation Panel (MIP) 2019 to analyze which website characteristics are related to innovation activities at the firm level. Website characteristics are measured using several text mining methods and serve as features in different Random Forest classification models, which are compared against each other. Our results show that the most relevant website characteristics are the website’s language, the number of subpages, and the total text length. Moreover, the website characteristics predict product innovations and innovation expenditures better than they predict process innovations.
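
As a rough illustration of the three website characteristics named above (language, number of subpages, total text length), the following minimal sketch derives them from already-crawled pages. It is not the study's pipeline: the function names, the input format (a dict mapping URL to raw HTML), and the tiny stopword-based language guess are all assumptions made for this example. In a setup like the one described, such per-firm values would then be passed as features to a classifier.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(html):
    """Return the visible text of one HTML page with whitespace normalized."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


# Tiny illustrative stopword lists (an assumption, not the study's method).
STOPWORDS = {
    "de": {"und", "der", "die", "das", "für", "mit", "wir", "ist"},
    "en": {"and", "the", "for", "with", "we", "is", "our", "of"},
}


def guess_language(text):
    """Crude language guess: the language whose stopwords occur most often."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"


def website_features(pages):
    """Compute features for one firm; `pages` maps URL -> raw HTML (hypothetical format)."""
    texts = [visible_text(html) for html in pages.values()]
    return {
        "n_subpages": len(pages),
        "total_text_length": sum(len(t) for t in texts),  # in characters
        "language": guess_language(" ".join(texts)),
    }


if __name__ == "__main__":
    example_pages = {
        "https://example.com/": "<h1>Wir sind die Firma</h1><script>var x = 1;</script>",
        "https://example.com/produkte": "<p>Produkte für die Industrie</p>",
    }
    print(website_features(example_pages))
```

A real application would likely replace the stopword heuristic with a dedicated language-identification tool and would need to decide how to count subpages (for example, up to a crawl limit); both choices are left open here.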