Research Summary

My research uses data science methods, especially network analysis and machine learning, to pursue humanistic inquiries.

While I am open to the many untapped benefits that scientific inquiry could bring, I want to emphasize that none of this represents a technocratic approach that seeks to “replace” traditional humanities methods such as close reading, or that claims to be potent enough to take over the indispensable role of deep thinking. Rather, I aim for these methods to assist philosophers in drawing new connections and finding new perspectives that the most recent analytical tools can better illuminate, and to put both the historical thinkers (the materials to be read) and contemporary scholars (the readers) themselves in context. Such tools are not merely skeuomorphic artifacts that mimic what humanities scholars have traditionally done; they also represent a new mode of production that could hold consistently to a working principle and radically reduce silos.

The datasets I analyze with these methods range from small historical and philosophical corpora - for instance, the 58 letters exchanged between Princess Elisabeth of Bohemia and René Descartes in the 17th century, and classical Chinese philosophical texts from the pre-Qin period (before 221 BC) - to large-scale digital information online, such as over 160,000 philosopher entries on Wikipedia and billions of publications in academic databases.

Big Questions

My research centers on the hermeneutic rethinking of humanities computing methodologies, and explores three principal ideas:

1. Algorithm design as lens modeling to prompt reflections / hermeneutical instrument

How can algorithms and modeling be hermeneutical instruments?

Is distant reading [link] really a black box? Is close reading [link] a completely white box?

2. Redesign computational methods for humanities research

How can humanities inquiries and computational thinking be integrated and embedded in a set of methodologies? How can we redesign algorithms so that they highlight and promote, rather than diminish, human thinking? What can we learn from contrasting human-made conclusions with machine-produced results, and how can that contrast prompt reflection on the process of knowledge modeling, representation, and interpretation?

The power of scientific computing lies in its concreteness and scalability, which enable us to investigate questions that could never be addressed manually.

For example, we can link fragmented historical sources and canon-centered literature on a large scale with linked databases, to provide a comprehensive network view and bring to light the “great unread” [link]. We can also improve existing theories or develop new hypotheses by examining the hidden (and possibly surprising) patterns revealed by the data.
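As a minimal sketch of what such a network view might look like in practice, the snippet below links a handful of records into a graph and lists its connected groups, so that clusters cut off from the main network become visible. Everything here is an assumption for illustration: the record format, the function names build_graph and connected_components, and the "Thinker A" to "Thinker D" placeholders are hypothetical and are not data or code from my projects.

```python
from collections import defaultdict

# Hypothetical linked records: (person, person, kind of source linking them).
# Only the first pair comes from the corpora mentioned above; the rest are placeholders.
records = [
    ("René Descartes", "Elisabeth of Bohemia", "letter"),
    ("René Descartes", "Thinker A", "letter"),
    ("Thinker A", "Thinker B", "citation"),
    ("Thinker C", "Thinker D", "pamphlet"),
]


def build_graph(records):
    """Build an undirected adjacency map from pairwise records."""
    graph = defaultdict(set)
    for person_a, person_b, _source in records:
        graph[person_a].add(person_b)
        graph[person_b].add(person_a)
    return graph


def connected_components(graph):
    """Return the connected groups of the graph, largest first."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)


graph = build_graph(records)
components = connected_components(graph)

# Groups outside the largest component are candidates for the "great unread".
for component in components[1:]:
    print("Disconnected from the main network:", sorted(component))
```

A small disconnected group is only a prompt for closer reading, not a finding in itself: it may reflect a real historical separation or simply a gap in the sources that were linked.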

However, large-scale computing is not a panacea, and current data science techniques are not tailored to the reflective nature of humanities research.

I am committed to developing computational methods (and corresponding interpretative frameworks) specifically for research in the humanities, starting with the adjustment and repurposing of existing techniques.

Data Modeling (Representation) for Humanities

Another major challenge for digital humanities is that most humanities data is not yet data. Although digital archives and databases are increasingly available, they mainly handle keywords for objective information (location, name, time, and so on). How can we represent subjective knowledge, such as ideas and concepts?

Missing data and uncertainties

Humanities research is not a once-and-for-all process. It consists of constantly revisiting and rethinking evidence and interpretations from multiple perspectives.

A person’s date of birth may be recorded differently in two sources, and the voices of underrepresented groups may be excluded from the narratives. How do we deal with uncertainties in the historical records? What insights can we gain by studying the absence of certain data [link]?
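One possible way to keep such disagreements and absences visible, rather than silently resolving them, is to store every claim together with its source. The sketch below is only an illustration under my own assumptions: the Assertion class, the helper names find_conflicts and find_gaps, and the "Person X", "Person Y", and "Source" entries are hypothetical placeholders, not records from any real archive.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One claim about a subject, kept together with where it comes from."""
    subject: str
    attribute: str
    value: str
    source: str


# Hypothetical placeholder data: two sources disagree about one birth year,
# and nothing at all is recorded about Person Y's birth.
assertions = [
    Assertion("Person X", "birth_year", "1618", "Source 1"),
    Assertion("Person X", "birth_year", "1619", "Source 2"),
    Assertion("Person X", "birth_place", "Town Z", "Source 1"),
    Assertion("Person Y", "birth_place", "Town Z", "Source 2"),
]


def find_conflicts(assertions):
    """Return attributes for which the sources give more than one value."""
    claims = defaultdict(lambda: defaultdict(list))
    for a in assertions:
        claims[(a.subject, a.attribute)][a.value].append(a.source)
    return {key: dict(values) for key, values in claims.items() if len(values) > 1}


def find_gaps(assertions, subjects, attributes):
    """Return (subject, attribute) pairs about which no source says anything."""
    known = {(a.subject, a.attribute) for a in assertions}
    return [(s, attr) for s in subjects for attr in attributes if (s, attr) not in known]


conflicts = find_conflicts(assertions)
gaps = find_gaps(assertions, ["Person X", "Person Y"], ["birth_year", "birth_place"])

print("Conflicting claims:", conflicts)
print("Missing data:", gaps)
```

The design choice is to treat a conflict or a gap as a result in its own right: the model never picks a "correct" value, so the interpretive decision stays with the reader.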

3. The Philosophical Foundation of Computational Humanities

What makes humanities computing possible and promising? What do “scientific” and “humanistic” mean, and what makes a piece of research fall into one category or the other?

The Story Behind

In my junior year of high school, I was told there was a malignant tumor growing in my body. I was given a few months to live. The effort I had put toward my dream of going to university became trivial. Despite my parents’ support, I realized I was facing death alone. I began skipping classes to find a way to understand my impending death. I spent my time hiding in a bookstore, reading and writing non-stop, even though this led to social isolation. I began to see that people’s understandings of the world are shaped by their subjective experiences. I was searching for a world view to help me face death, but found there was no ‘right’ view.

Three months later, the tumor was removed, and while I was immensely relieved to learn it was benign, the sudden return to normal irked me. My perspective had shifted: I no longer saw one world, but rather a million worlds from a million subjective perspectives. The path expected of me, the traditional path of earning good grades and competing for well-paying jobs, left me stuck. The Chinese education system assigns students to tracks of narrowly defined “science” and “humanities” at the age of 15, but I wanted to bridge these disciplines to show that different perspectives create an integrated view of life. To escape this rigid system, I decided to attend a liberal arts college.

In college, I began searching for a way to show others this integrated view of life. I imagined a giant network that could connect all kinds of knowledge. Obsessed with this idea, I spent the summer of my sophomore year learning about knowledge graphs and network analysis, and chose to major in data science. My curiosity led me to my first digital humanities project, in which I worked with a start-up team to create an interactive novel. Our goal was to use “cold” machine learning methods to create warmhearted virtual friends for people, especially those who were lonely. For example, Sherlock Holmes shares with the user his fear of being replaced by modern technologies, and encourages people to reflect on their own experience. In helping others deal with loneliness, I found a community of creatives and researchers from all kinds of fields who wanted to explore the possibilities and challenges of bridging different perspectives.

Then I met Karen, a philosophy major, and learned how to look at life through yet another lens. Frustrated that women are left out of the canonical narrative of philosophy, we built a network of philosophers on Wikipedia that reveals how women are systematically isolated and given less credit. To reconnect these missing voices to the network, we are developing a digital archive of underrepresented philosophers and their intellectual contributions. This work proved to me that distant and close reading can be combined, and strengthened my belief that we can bridge diverging perspectives and skills to positively impact the world.

Cover image source: Stewart Varner: Digital Humanities in Libraries