• Media type: E-Book
  • Title: Big data and social science : data science methods and tools for research and practice
  • Contributor: Foster, Ian [Editor]; Ghani, Rayid [Editor]; Jarmin, Ronald S. [Editor]; Kreuter, Frauke [Editor]; Lane, Julia [Editor]
  • Imprint: Boca Raton, FL; London; New York: CRC Press, Taylor & Francis Group, 2021
  • Published in: Chapman & Hall/CRC statistics in the social and behavioral sciences series
    A Chapman & Hall Book
  • Edition: Second edition
  • Extent: 1 online resource (xx, 391 pages); illustrations, diagrams
  • Language: English
  • ISBN: 9780429324383; 9781000208634
  • Keywords: Big Data > Data Mining > Statistics > Social Sciences
  • Footnote: Description based on publisher-supplied metadata and other sources
  • Description: Cover -- Half Title -- Series Page -- Title Page -- Copyright Page -- Contents -- Preface -- Editors -- Contributors -- 1. Introduction -- 1.1 Why this book? -- 1.2 Defining big data and its value -- 1.3 The importance of inference -- 1.3.1 Description -- 1.3.2 Causation -- 1.3.3 Prediction -- 1.4 The importance of understanding how data are generated -- 1.5 New tools for new data -- 1.6 The book's "use case" -- 1.7 The structure of the book -- 1.7.1 Part I: Capture and curation -- 1.7.2 Part II: Modeling and analysis -- 1.7.3 Part III: Inference and ethics -- 1.8 Resources -- Part I: Capture and Curation -- 2. Working with Web Data and APIs -- 2.1 Introduction -- 2.2 Scraping information from the web -- 2.2.1 Obtaining data from websites -- 2.2.1.1 Constructing the URL -- 2.2.1.2 Obtaining the contents of the page from the URL -- 2.2.1.3 Processing the HTML response -- 2.2.2 Programmatically iterating over the search results -- 2.2.3 Limits of scraping -- 2.3 Application programming interfaces -- 2.3.1 Relevant APIs and resources -- 2.3.2 RESTful APIs, returned data, and Python wrappers -- 2.4 Using an API -- 2.5 Another example: Using the ORCID API via a wrapper -- 2.6 Integrating data from multiple sources -- 2.7 Summary -- 3. Record Linkage -- 3.1 Motivation -- 3.2 Introduction to record linkage -- 3.3 Preprocessing data for record linkage -- 3.4 Indexing and blocking -- 3.5 Matching -- 3.5.1 Rule-based approaches -- 3.5.2 Probabilistic record linkage -- 3.5.3 Machine learning approaches to record linkage -- 3.5.4 Disambiguating networks -- 3.6 Classification -- 3.6.1 Thresholds -- 3.6.2 One-to-one links -- 3.7 Record linkage and data protection -- 3.8 Summary -- 3.9 Resources -- 4. Databases -- 4.1 Introduction -- 4.2 The DBMS: When and why -- 4.3 Relational DBMSs -- 4.3.1 Structured Query Language -- 4.3.2 Manipulating and querying data.