Text Analysis with Topic Models for the Humanities and Social Sciences
Text Analysis with Topic Models for the Humanities and Social Sciences (TAToM) is a series of tutorials on quantitative text analysis. The tutorials cover preparing a text corpus for analysis and exploring a collection of texts using topic models and machine learning.
The procedures range from basic to somewhat advanced, and the tutorials make extensive use of the Python programming language to organize, analyze, and visualize data.
- Getting started
- Working with text
- Feature selection: finding distinctive words
- Topic modeling with MALLET
- Topic modeling in Python
- Visualizing topic models
- Visualizing trends
- Classification, Machine Learning, and Logistic Regression
- Case study: Racine’s early and late tragedies
All materials are published under a Creative Commons Attribution 4.0 International license (CC-BY 4.0).
Comments are welcome, as are reports of bugs and typos. Please use the project’s issue tracker.
These tutorials have been developed with support from the DARIAH-DE initiative, the German branch of DARIAH-EU, the European Digital Research Infrastructure for the Arts and Humanities consortium. Funding has been provided by the German Federal Ministry for Research and Education (BMBF) under the identifier 01UG1110J.