This project, Developing methods for analysing and evaluating literary engagement in digital contexts, starts from the premise that the rapid rise in the amount of user-generated content produced on the internet, especially on social media sites, offers an extraordinary opportunity to study human interaction in a format that lends itself easily to multiple kinds of computational analysis. For scholars of reading and reception, this growing body of data is particularly exciting. Interviewing individual readers, carrying out surveys and conducting focus groups is time-consuming, and it is also problematic to draw conclusions from artificial contexts in which it is difficult to know how far the answers given have been influenced by the unequal relationship between reader and researcher. Data derived from the internet has plenty of limitations of its own: users of a particular site or service may not be a representative sample of the general population, for instance. Even so, born-digital responses to texts, other readers, and literary events offer researchers the tantalising possibility of grasping aspects of reading that were previously inaccessible. Not only is far more material available than in the past, but digital reception data also often involves readers voluntarily recording their thoughts in the context of a community to which they feel a sense of belonging, rather than reporting them to a stranger.
The challenge for researchers who work on reading but who do not have extensive technical background knowledge is twofold. First, how can they access these rich bodies of data? Second, how can they analyse digital materials alongside their established methods of working with non-digital reception data? A scholar with experience in interpreting marginalia (comments written in the margins of books) is well placed to bring her skills to bear on digital forms of annotation, for instance, but might not know how to get hold of this data or how to process it when the sheer amount of material available exceeds the capacity of a single human reader. Other disciplines have addressed these issues. Corpus linguists have established methods of constructing and analysing large textual corpora, for instance, and computer scientists have developed techniques such as sentiment analysis, which can process large numbers of statements to determine whether they are broadly positive or negative, while various other approaches are being taken by scholars across the digital humanities. For scholars of reading without the technical background to scrape data from websites or set up a Twitter archive, however, there are significant barriers to engaging with this data.
The aim of this project is to lower these barriers by reporting on three kinds of approach to digital reception data that are within the grasp of reception researchers without specialist digital humanities training. First, it examines the thematic content of the textual data that individuals generate when they engage in online discussions about the value of books or literary activities. Second, it investigates what can be learnt from the chronological information attached to these discussions, for example the timestamps on social network posts or tweets. Third, it considers the role played by place in online conversations about reading, using digital mapping tools to visualise the geographic information attached to social media posts. The project will produce a report setting out what can be learnt about the cultural value of reading in the digital age from these three angles, and will supply guides to a number of digital tools that can be used to work with these three kinds of data.
The two types of social media on which the project centres are the micro-blogging service Twitter and the literary social network LibraryThing. Because the focus of the project is the value that reading and book-related activities bring to individuals, I have chosen books and authors that have won or been shortlisted for high-profile prizes such as the Nobel Prize and the Booker Prize, and that have featured in literary competitions with considerable cultural cachet. Using timestamped data, such as that available from the Twitter API, will allow me to examine such things as how the content of discussions about a shortlisted book changes in light of prize announcements, or how the progress of a literary competition might influence the way LibraryThing users position themselves in relation to a particular book as they go about their interactions with other readers on the site. Geography, too, can be considered: as people across a country or around the world take to Twitter to express their opinions about an author who has just won a prize or a competition, what kinds of patterns is it possible to discern from the spatial distribution of tweets? Previously, it was difficult for scholars of reading to access the when and where of reception data with such precision. Given the large amount of material now available online about readers’ preferences and responses to books, it seems an opportune moment to reflect on the methodological opportunities and limitations of this kind of digital work on the cultural value of reading.
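To give a concrete sense of the kind of timestamp-based comparison described above, the following is a minimal sketch in Python rather than the project's own tooling. It assumes that posts have already been collected as pairs of an ISO 8601 timestamp and a text; the sample posts, the announcement time, and the function names `tokenize` and `split_by_announcement` are all hypothetical and invented purely for illustration. The sketch simply counts which words appear before and after a prize announcement.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Hypothetical sample posts (timestamp, text), invented for illustration only.
POSTS = [
    ("2020-01-09T18:00:00+00:00", "Hoping the shortlisted novel wins tonight"),
    ("2020-01-10T20:30:00+00:00", "Still reading the shortlisted novel, slow going"),
    ("2020-01-10T21:50:00+00:00", "The winner is announced and it is a deserved winner"),
    ("2020-01-11T09:00:00+00:00", "Buying the winner on my way to work"),
]

# Hypothetical announcement time, not a real prize date.
ANNOUNCEMENT = datetime(2020, 1, 10, 21, 45, tzinfo=timezone.utc)


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_by_announcement(posts, announcement):
    """Return word counts for posts made before and after the announcement."""
    before, after = Counter(), Counter()
    for stamp, text in posts:
        when = datetime.fromisoformat(stamp)
        # Posts strictly earlier than the announcement count as "before".
        target = before if when < announcement else after
        target.update(tokenize(text))
    return before, after


before, after = split_by_announcement(POSTS, ANNOUNCEMENT)
print("Before:", before.most_common(3))
print("After:", after.most_common(3))
```

A real study would need a much larger sample and would probably filter out very common words, but even this simple split shows how the vocabulary of a discussion can be compared on either side of a single dated event.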