Valence Class Analysis

Social scientists have recently begun to explore network-analytic methods for extracting cognitive schemas from survey data. These methods—collectively referred to as “schematic class analysis” and most prominently illustrated with relational class analysis (RCA) and correlational class analysis (CCA)—are invaluable tools for social scientists because they permit the statistical extraction of actors’ cultural models from data that have long been readily available.

However, there is another source of data that has been available much longer than surveys and in much greater quantities: text. Though the human brain does not store schematic information as discursively articulable and mutually exclusive symbols (although classical metaphors of learning, thinking, and memory generally rely on such imagery), language is nonetheless a pervasive simulator of such information. Recorded natural language, then, is an important and largely untapped source of data for schema extraction. Furthermore, text is generally produced outside of formal research settings and therefore avoids the various biases that survey instruments can introduce. The practical schemas actors use for “on the fly” sense-making are thus more likely to be documented in data produced of the actors’ own volition.

I am actively working on a new form of schematic class analysis that identifies cognitive schemas from raw text. The method—called “valence class analysis”—extracts emotion-weighted schemas from document-topic probability distributions. The method is a novel combination of correlational class analysis, sentiment analysis, and latent Dirichlet allocation (a form of topic modeling), and will be of use to a wide range of social scientists.
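To make the general idea concrete, here is a minimal sketch in Python of how emotion-weighted document-topic profiles might be grouped into classes. It is an illustration under stated assumptions, not the implementation planned for the R package. The function name `valence_classes`, the per-document, per-topic valence scores, the elementwise weighting, and the 0.8 threshold are all hypothetical. Correlational class analysis partitions the correlation network by modularity maximization; the sketch substitutes simple connected components to stay short.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    norm = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    if norm == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / norm


def valence_classes(doc_topics, topic_valence, threshold=0.8):
    """Group documents whose valence-weighted topic profiles correlate strongly.

    doc_topics[i][k]    : probability of topic k in document i (e.g. from LDA)
    topic_valence[i][k] : ASSUMED sentiment score in [-1, 1] of document i toward topic k
    """
    # Step 1 (assumption): weight each topic probability by its valence score.
    weighted = [
        [p * v for p, v in zip(probs, vals)]
        for probs, vals in zip(doc_topics, topic_valence)
    ]
    n = len(weighted)

    # Step 2: link documents whose absolute correlation reaches the threshold.
    # Absolute values are used, as in CCA, so mirrored patterns share a class.
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(weighted[i], weighted[j])) >= threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)

    # Step 3 (stand-in for modularity maximization): connected components.
    classes, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        classes.append(sorted(component))
    return classes


# Toy data, invented for illustration: 4 documents, 3 topics.
doc_topics = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
]
topic_valence = [
    [0.8, -0.2, 0.1],
    [0.9, -0.1, 0.2],
    [0.1, -0.3, 0.9],
    [0.2, -0.2, 0.8],
]

classes = valence_classes(doc_topics, topic_valence)
print(classes)
```

With this toy input, documents 0 and 1 fall into one class and documents 2 and 3 into another. In practice the document-topic matrix would come from a fitted topic model and the valence scores from a sentiment lexicon or classifier. How the two are combined is the substantive modeling choice, and this sketch does not settle it.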

The method will be made publicly available as an R package. The source code will also be available through my GitHub.