A Conversation with: Christopher Warren

Christopher Warren is an Associate Professor of English, co-Director of the English Department's minor in Humanities Analytics (HumAn), and Director of the Bachelor of Arts in English Program. A member of the Dietrich faculty since 2010, he recently wrapped up the digital humanities project, 'Six Degrees of Francis Bacon,' which re-creates the British early modern social network to trace the personal relationships among figures like Bacon, Shakespeare, Isaac Newton and many others.

For readers who aren't familiar with it, describe 'Six Degrees of Francis Bacon.'

Six Degrees of Francis Bacon is an online reconstruction of the early modern social network that scholars and students from all over the world can collaboratively expand, revise, and curate. Unlike published prose, Six Degrees is extensible, collaborative, and interoperable: extensible in that people and associations can always be added, modified, developed, or removed; collaborative in that it synthesizes the work of many scholars; and interoperable in that new work on the network is put into immediate relation to previously studied relationships.

Six Degrees was founded in 2011 by Daniel Shore and me. Working with colleagues and students in Carnegie Mellon University's Statistics and Information Systems Departments, and with support from Google, the Council on Library and Information Resources (CLIR), and the National Endowment for the Humanities (NEH), Six Degrees has pioneered an innovative statistical method to infer large historical networks from textual sources; published a major network dataset of over 15,000 early modern people and over 170,000 relationships; created the first historically specific ontology of social relationships in early modern Britain; and developed a purpose-built website interface where scholars, students, and citizen humanists can visualize, query, and contribute to the dataset.

You collaborated on that project with some of our Libraries faculty. Can you describe how you worked together?

We'd been working on the machine learning part of Six Degrees for a couple of years when we brought in Jessica Otis as a CLIR Postdoctoral Fellow to spearhead data curation. Jessica is trained as a historian of early modern Britain and also has skills as a programmer, which allowed her to combine her insights as a scholar with the technical aspects of the project. Through Jessica's expertise and dedication, we created a much more systematic approach to handling and preserving data. After her two-year fellowship ended, Jessica was hired by the Libraries as its Digital Humanities specialist, and she and I have continued to work closely. We've also worked closely with Scott Weingart, an expert in network analysis for digital humanities who joined the Libraries in 2017. Scott regularly brings his knowledge about networks and DH infrastructure to the project.

On November 17, you held the 'Redesigning Bacon' add-a-thon in Hunt Library, inviting the public to explore a redesigned SixDegreesofFrancisBacon.com. How did it go?

The purpose of Redesigning Bacon was to introduce the results of the recent overhaul of the website and, frankly, to put the new site through its paces before we really tightened the screws and open-sourced the code. The key thought was that getting a bunch of people actually using the site would be the best way to understand the strengths and weaknesses of its design. Humanists and designers don't have a ton of experience collaborating with one another, so we wanted to hold it in a space equally hospitable to both crowds. Hunt Library was the obvious choice. We had a livestreamed presentation and worked remotely with a group at Washington, DC's Folger Shakespeare Library. The event was a great success. We learned a ton of lessons about the site's usability and added hundreds of new relationships.

What's next for the 'Six Degrees' project?

Six Degrees will, I hope, continue to be the broadest, most accessible source of who knew whom in early modern Britain and continue to support both qualitative and quantitative studies of early modern networks. CMU Libraries is a big reason why. We tend not to think very much about 'end of life' issues for digital projects, but what happens to these new kinds of endeavors once underlying technologies become obsolete? To its great credit, CMU Libraries has committed to hosting and maintaining the dynamic web application for the foreseeable future. Scholars will thus be able to continue to refine the network by contributing to the website. Versioned datasets will be archived periodically at the Folger Shakespeare Library as well.

Do you have any future projects you're working on with the Libraries?

Right now, I'm working on a problem that's bedeviled rare books cataloguers and book historians for a long time. Between 1400 and 1700, printing was often dangerous. It was an era of censorship, and, as a result, about one in five books from this time do not have known printers.

With key support from the Libraries, Max G'Sell (Statistics), Taylor Berg-Kirkpatrick (SCS), and I are using artificial intelligence to identify clandestine printers. In short, we're using computers to find examples of damaged type, which can then be the tell-tale fingerprints that reveal a book's printer. To do this, we need a ton of digitized books: not plain text, which many of the databases contain, but images, JPEGs, PDFs. Ethan Pullman and Denise Novak have worked with vendors to procure 2 terabytes of book images for us, and Chris Kellen has backed up the data on library machines. We'd also love to get the relevant MARC records to train our classifiers. These aren't traditional requests, so we deeply appreciate having the help of Libraries colleagues on this. It's actually made me think how awesome it'd be to develop a course called 'Library as Data.'
