



May 3, 2018

Announcing the Launch of The Novelty Project


I am pleased to announce the launch of my CHI Project, “The Novelty Project.” This is more of a soft launch, really; this collaboration between Arend Hintze, Devin Higgins, and me has been in the works for the past two years now, yielding one forthcoming publication and a grant. I’ve built a website to serve as a companion landing page, where we will expand on published information, include material that we weren’t able to fit into our published work, and share some of the weirder findings that we uncover.

Novelty looks like this!

The major work of my year as a CHI Fellow was not the building of the public site, but the development of a 20th-century Corpus with the HathiTrust Research Center. Our team was awarded an HTRC Advanced Collaborative Support Grant last July, providing us access to in-copyright works in HTRC’s holdings via a Virtual Machine. Throughout the year, we worked with the incredibly talented and patient Eleanor Dickson to develop a corpus of 20th-century novels, split into two categories: canonical and non-canonical. Our goal was to build a corpus that would be large enough for us to (a) replicate our initial proof of concept, and (b) consider some of our more provocative hypotheses regarding literary modernism, postmodernism, and the periodization of the 20th century. Our goal in developing two contrapuntal corpora was to get at the dynamic identified by Algee-Hewitt et al. in Pamphlet 11 from the Stanford Lit Lab, “Canon/Archive: Large-scale Dynamics in the Literary Field.” Our hope was to develop both an admittedly inclusive canon of the 20th-century novel, and an archive against or within which we might understand the canonical, and broader dynamics of the field.

One might suspect that determining the canonicity of our texts would be an impossible task; after all, “what’s in” and “what’s out” has been the subject of much heated debate (to put it lightly). In fact, canonicity was rather straightforward, thanks to a useful tool created by Nathaniel Conroy called Metacanon.* Metacanon collects citation scores from Google Scholar, JSTOR, The New York Times, and several other sources to calculate the most influential novels of any given time period. We used Metacanon’s date-range function to develop our canon list, gathering the top 100 most frequently cited works of fiction published within each decade of the 20th century. This provided us a relatively even spread of publication dates (though the turn of the century skews Jamesian). Once we had identified these texts, we isolated the novels from our results, queried HathiTrust’s holdings, selected a preferred edition, et voilà: a canon corpus.
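As a rough illustration of the per-decade selection described above, here is a minimal Python sketch. It does not use Metacanon's actual interface or data: the `Work` record, its `score` field, the `top_per_decade` helper, and the sample entries are all hypothetical stand-ins for whatever citation scores such a tool reports.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    title: str
    year: int
    score: float  # hypothetical citation score, not a real Metacanon value


def top_per_decade(works, n=100, start=1900, end=1999):
    """Group works by decade and keep the n highest-scoring in each."""
    by_decade = defaultdict(list)
    for work in works:
        if start <= work.year <= end:
            # 1926 -> 1920, 1903 -> 1900, and so on.
            by_decade[work.year // 10 * 10].append(work)
    return {
        decade: sorted(group, key=lambda w: w.score, reverse=True)[:n]
        for decade, group in sorted(by_decade.items())
    }


# Made-up scores, for illustration only.
sample = [
    Work("The Ambassadors", 1903, 3.0),
    Work("The Golden Bowl", 1904, 2.0),
    Work("The Sun Also Rises", 1926, 5.0),
    Work("Bleak House", 1853, 9.0),  # outside the date range, so dropped
]
canon = top_per_decade(sample, n=1)
```

With `n=1`, each decade keeps only its single highest-scoring title; the real list used the top 100 per decade, followed by the manual steps of isolating novels and choosing editions.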

From there, we set out to build our corpus of non-canonical novels. This corpus isolates a hazy middle in the literary field: novels that were important or influential enough to have been digitized, but not important or influential enough to have been cited by scholars. Because we were not starting from a pre-determined list, but working from within the disorientingly rich and complex Hathi library, this process turned out to be rather tedious: how does one identify a novel according to MARC Records? How to distinguish between a book published during the 20th century and a book republished in the 20th century? (Dickens, it seems, gets a reprint every five years.) What of novels spread over multiple volumes (a fad that, fortunately for us, appeared to be falling out of fashion after 1900)? How do we identify, and eliminate, works of criticism that are about novels, but not novels themselves? And what difference does a collection of short stories make in our results? Which versions do we keep, and why? Each of these questions, their answers, and our corresponding actions have the potential to change our results. And while these finely tuned details may make little difference at scale, they mattered significantly to us as we determined what to add and what to cut.

We are in the process of running these texts through our Novelty Filter, in hopes of turning to Phase III of this project over the summer. An online landing page is, unfortunately, a poor substitute for the work that our team has completed. But it provides us a space to consider some weird stuff—such as our comparison between Bestsellers and Prizewinners—and to provide our audience a chance to interact with our (forthcoming) data at a more granular level.

A final note of thanks to HTRC, and, especially, Eleanor Dickson. This project would have been impossible without Eleanor’s efforts, and without HTRC’s generosity. I’m eager to see how The Novelty Project continues to unfold, and hope that you’ll follow along with us.



*While we made great use of Metacanon, it appears that the site is no longer functioning?



September 22, 2017

Introducing CHI Fellow Laura McGrath


Hello! My name is Laura McGrath, and I’m delighted to be returning as a CHI Fellow during 2017-18. I’m a PhD Candidate in the department of English, working on computational approaches to post45 American literature.

My dissertation, Middlemen: Making Literature in the Age of Multimedia Conglomerates, studies the major shifts in the field of literary production in the wake of the mergers and acquisitions that roiled the publishing industry in the 1980s and 1990s—a process that resulted in the formation of what we now call The Big Five. Each chapter examines one influential figure in the publishing industry: the agent, the acquisitions editor, the publicist, and the social media manager. Too often dismissed as “middlemen” or mere bureaucratic functionaries, such professionals are powerful nodes between the artist and the corporation, mediating between the domain of aesthetic or literary value and the managerial imperatives of huge media firms. As such, these overlooked figures are not just powerful gatekeepers, but administrators of literary prestige, value, and “corporate taste” in the contemporary, shaping the form and content of contemporary fiction while providing access to mainstream publication, and cultural consecration.




May 6, 2016

Launching the Armed Services Editions: A Computational Analysis


ASE Website


I am happy to announce the launch of my CHI Project, The Armed Services Editions: A Computational Analysis. On my page, users can navigate through three “Data Narratives”: simple analyses that I conducted to answer critical questions about these data. The Gender Data Narrative considers the distribution of gendered pronoun usage throughout the corpus, and features a basic foray into LDA topic modeling. The Genre Data Narrative considers the types of books that were sent to servicemen, and how the generic representation of books may have shifted over time. Finally, the Geography Data Narrative considers the geographic imagination of the corpus, both domestic and international, using named entity recognition (NER).
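To make the first of these analyses concrete, here is a minimal Python sketch of counting gendered pronouns across a set of texts. It is an illustration under stated assumptions, not the project's code (the analyses on the site were done in R): the pronoun lists, the `pronoun_counts` helper, and the two sample texts are all hypothetical.

```python
import re
from collections import Counter

# Assumed pronoun lists; a real study would need to justify these choices.
FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}


def pronoun_counts(text):
    """Count feminine and masculine pronouns, plus total tokens, in a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        if token in FEMININE:
            counts["feminine"] += 1
        elif token in MASCULINE:
            counts["masculine"] += 1
    counts["total_tokens"] = len(tokens)
    return counts


# Tiny invented texts standing in for OCR'd volumes.
corpus = {
    "sample_a": "He said that she had taken his book.",
    "sample_b": "She read it herself.",
}
table = {name: pronoun_counts(text) for name, text in corpus.items()}
```

Dividing each count by `total_tokens` gives a rate that can be compared across books of different lengths, which is the kind of distribution a descriptive chart would display.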

The first phase of this project is, quite simply, a book history project. To date, the ASE Corpus has not been studied in total. Several scholars have published institutional histories of the Council on Books in Wartime, or discussed the role of specific books, or even discussed the ASEs in relation to a larger sociological project. I am interested in assembling a more thorough, stylistic macro-history of the ASEs that attends to both its sociological import and its formal properties through computational analysis. The data I’ve assembled is descriptive, working toward that end, and is a necessary foundation for the more advanced analysis I will be conducting this summer.

In addition to an analysis of the ASE Corpus, this website is also a record that chronicles the development of my methodological chops. While I had a basic foundation in R (thanks to a fabulous course at HILT), my skills needed (and still need) development. I used two textbooks to improve my skills, testing my dataset throughout. Users familiar with Text Analysis with R for Students of Literature by Matt Jockers and Humanities Data in R by Lauren Tilton and Taylor Arnold will likely be able to trace my data analysis back to the chapter problem sets.

Full disclosure: I feel insecure about this. I would like, eventually, to publish on the ASEs. A record of my fledgling explorations in R and data analysis is… well, nerve-wracking. Yet, as Ethan Watrall has reminded me in a variety of ways, it’s also an important intervention. Over and over again this year, I have been reminded of and impressed by the generosity of my colleagues in DH; I post this basic data analysis in hopes of inviting that same generous conversation.

Only a fraction of the work that was completed on this project is featured on my project website. I should have foreseen this problem and created a time-lapse video of my hours and hours running OCR on hundreds of documents, or adding metadata to my database. Or, better yet, learning how to analyze data in R. For this project, however, I decided to visualize my data using Tableau. Tableau provides far less specificity, for sure, but it also allows for a greater degree of user interactivity. Since my data is, at this stage, largely descriptive, I wanted users to be able to explore with greater flexibility.

It’s been a long year working on this project, and that long year has turned out to be just the beginning. I’m so excited to see how this project continues to develop. Over the summer, I’ll be continuing this project by running these analyses—and much more interesting, advanced analysis (fingers crossed)—on the entirety of my corpus.

The questions motivating this project are increasingly pressing, and continue to motivate me—particularly as a powerful political candidate has remained consistently hostile toward the free exchange of ideas that should define any democratic discourse. Ultimately, this project asks, what (or whose) ideas are acceptable, and what (or whose) ideas aren’t? And what (and who) makes that so? These questions should be asked about 1940, and they should be asked about 2016.



December 8, 2015

Politics and Form : The Armed Services Editions


As a CHI Fellow, I’m undertaking a large-scale text analysis of the Armed Services Editions, a collection of novels sent to US soldiers during WWII to “fight the war of ideas,” to consider issues of politics and literary form. I first stumbled on the Armed Services Editions a few years ago, while researching Ernest Hemingway’s The Sun Also Rises. You may recall Jake’s description of Robert Cohn, early in the novel:

He had been reading W.H. Hudson. That sounds like an innocent occupation, but Cohn had read and reread “The Purple Land.” “The Purple Land” is a very sinister book if read too late in life…For a man to take it at thirty-four as a guide-book to what life holds is about as safe as it would be for a man of the same age to enter Wall Street direct from a French convent, equipped with a set of the more practical Alger books.

I was working on a project on modernist reading networks, and this passage jumped out at me. I looked into The Purple Land and found that it was chosen to be a part of the Armed Services Editions in World War II, 16 years after the publication of The Sun Also Rises. Cursory research into the Armed Services Editions led me to the Council on Books in Wartime, a committee of publishers that assembled during World War II and contracted with the US Military to produce cheap paperback editions for US soldiers abroad. The goal (and slogan) of the Council on Books in Wartime was to use books as “weapons in the war of ideas.” Books had an important role to play in the war effort, the CBW wrote, because “Books can help us recover our past and teach us what a tough-fibered people we can be when we have to. Books can tell us what our enemies are like. Even prizefighters study their opponents carefully.[…]Books can tell us what our allies are like.” All of this was vitally important to such a “total war.”




November 30, 2015

Reading Digitally, Archiving by Smartphone


A friend of mine once joked that so many Victorianists become digital humanists because Victorian novels weigh so much. If the Victorianist is drawn to DH because of the ease—and chiropractic benefits—of digitization, then the Modernist might stay away for similar reasons. Hamstrung by copyright laws, modernist scholars like myself find it quite challenging to undertake a large-scale digital project with the texts we find so interesting. Of course, this is too simple: a number of online repositories, such as the Modernist Journals Project, the Modernist Versions Project, and Editing Modernism in Canada have done so much to increase digitization efforts and make rare texts available to scholars digitally. Yet, I couldn’t help but wonder at the relative lack of digitally-inflected panels, workshops, and seminars at the Modernist Studies Association’s most recent annual conference in Boston last week (2 workshops, 2 roundtables, 1 panel, 1 seminar, and a “digital exhibition,” featuring 8 projects).

I set aside lobster rolls and Sam Adams and oh-so-good East Coast pizza to attend one of the pre-conference workshops that took up this issue. Led by my new #scholarlygirlcrushes Shawna Ross and Claire Battershill, “Digital Modernist Texts in the Classroom,” addressed questions of access and digitization for research and teaching. Shawna and Claire are a part of a group working on Open Modernisms, designed for digitizing, archiving, and anthologizing modernist texts. In many ways, Open Modernisms is a crowdsourcing anthology project: users can upload their own texts to the database and access others, creating their own anthologies for teaching. (Undoubtedly, Open Modernisms will have other uses, but our workshop focused on teaching).




September 25, 2015

CHI Fellow Introduction: Laura McGrath


My name is Laura B. McGrath. I am a fifth-year doctoral candidate (ABD) in the Department of English at MSU. I am very excited to be a CHI Fellow for 2015-2016. I study literary modernism and digital humanities. My dissertation, tentatively titled Modernish: Modernism and Literary Distinction in the 21st Century, uses traditional, ethnographic, and digital methods to explore modernism’s symbolic capital in the field of contemporary publishing. I have also served as Project Manager for the Digital Humanities and Literary Cognition lab (DHLC) in MSU’s English Department. In my capacity as Project Manager, I have written and received a number of grants (including the ACLS Digital Innovations Grant), mentored our undergraduate student researchers, and taught DH methods with the aim of developing our centerpiece project: a study of the neuroscience of reading Jane Austen. I have been fortunate to represent our work at the Literary Lab at Stanford University, and was invited to be a member of the Graduate Student Caucus at the University of Chicago’s Cultural Analytics conference, hosted by the Neubauer Collegium.

I am very much looking forward to my year as a CHI Fellow. I am looking forward to launching a large-scale, text analysis project on the Armed Services Editions. The Armed Services Editions are a collection of ~1,500 novels, repurposed for American soldiers during WWII. These texts were collected by the Council on Books in Wartime, a short-lived (and scary-sounding) office in the Department of Defense. The idea, so the CBW’s story goes, was to select only the most “democratic” novels in order to help American soldiers “fight the war of ideas.” Through a literary-sociological corpus study, I want to figure out what a “democratic” novel or style is– at least, according to the Department of Defense in 1942. This project has the potential to open up some fascinating questions about the intersection of politics and literary form, and hopefully give some insight on the literary field during a crucial period in American culture.

My written work has appeared in Symbolism, Perspectives, and Books and Culture.