"Modelling and measuring social processes such as the dynamics of social identities and norms, community rituals, shared symbols and beliefs etca is an area at once immensely important and immensely difficult. Some progress has been made with modelling such processes, but measuring them has been largely elusive and usually remains the domain of subjective interpretation of social and cultural artifacts.
This work makes some small inroads into methods for quantifying social processes and into the collection of social media data containing the rich dynamics needed to capture their traces. I present a system for collecting richly dynamic Twitter data, using a novel approach to identify a targeted Twitter community. Data was collected from the 'pro-ana' (pro-anorexia) community operating on Twitter over 2 years and 9 months, resulting in a corpus of over 1.2 million tweets from 300 thousand users, with records of over 3.2 million changes to users' follower lists and profile metadata.
Three methods for investigating aspects of social dynamics are presented, with the collected pro-ana data as a test case: an approach that combines Bayesian topic models with word-frequency-based psychometric tools to identify corpus-relevant contexts of the target psychological phenomena; a Bayesian model, applicable to large data sets, that associates topics generated by topic models of text data with communities inferred by network community detection models; and a methodology for identifying linguistic markers of group induction processes."