My favorite quote on interdisciplinary studies: “None of this implies that the current division of the academic landscape into the natural sciences, the social sciences, and the humanities is groundless; only that it is neither inevitable nor without alternatives. The advantage of studying knowledge practices is not to erase all distinctions but rather to query the necessity of the ones we have—and to imagine others based on criteria closer to how disciplines actually conduct inquiry” (Daston 2017, 147).

Research Summary

My research uses data science methods, especially network analysis and machine learning, to pursue humanistic inquiry.

While I am open to the many untapped benefits that scientific methods can bring, I want to emphasize that none of this amounts to a technocratic approach that seeks to “replace” traditional humanities methods such as close reading, or that claims to take over the indispensable role of deep thinking. Rather, I want these methods to help philosophers draw new connections and find new perspectives that the most recent analytical tools can illuminate, and to put both the historical thinkers (the materials to be read) and contemporary scholars (the readers) in context.

The datasets I analyze with these methods range from small historical and philosophical corpora - for instance, the 58 letters exchanged between Princess Elisabeth of Bohemia and René Descartes in the 17th century, and classical Chinese philosophical texts from the pre-Qin period (before 221 BC) - to large-scale online data, such as over 160,000 philosopher entries on Wikipedia and billions of publications in academic databases.

Big Questions

My research centers on the hermeneutic rethinking of humanities computing methodologies and explores three principal ideas:

1. Algorithm design as lens modeling to prompt reflection: the algorithm as hermeneutical instrument

How can algorithms and models serve as hermeneutical instruments?

Is distant reading [link] really a black box? Is close reading [link] a completely white box?

2. Redesign computational methods for humanities research

How can humanistic inquiry and computational thinking be integrated and embedded in a single set of methodologies? How can we redesign algorithms so that they highlight and promote, rather than diminish, human thinking? What can we learn from contrasting human-made conclusions with machine-produced results, and how can that contrast prompt reflection on the process of knowledge modeling, representation, and interpretation?

The power of scientific computing lies in its concreteness and scalability, which enable investigations that could never be carried out manually.

For example, we can link fragmented historical sources and canon-centered literature on a large scale with linked databases, to provide a comprehensive network view and bring to light the “great unread” [link]. We can also improve existing theories or develop new hypotheses by examining the hidden (and possibly surprising) patterns revealed by the data.

However, large-scale computing is not a panacea, and current data science techniques are not tailored to the reflective nature of humanities research.

I am committed to developing computational methods (and corresponding interpretative frameworks) specifically for research in the humanities, starting with the adjustment and repurposing of existing techniques.

Data Modeling (Representation) for Humanities

Another major challenge for digital humanities is that most humanities data is not yet data. Although digital archives and databases are increasingly available, they mainly capture keywords of objective information (such as locations, names, and dates). How can we represent subjective knowledge, such as ideas and concepts?

Missing data and uncertainties

Humanities research is not a once-and-for-all process. It consists of constantly revisiting and rethinking evidence and interpretations from multiple perspectives.

A person’s date of birth may be recorded differently in two sources, and the voices of underrepresented groups may be excluded from the narratives. How do we deal with uncertainties in the historical records? What insights can we gain by studying the absence of certain data [link]?
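One way to make the birth-date example concrete is to store every source's claim side by side instead of collapsing them into a single value. The following is only a minimal sketch under stated assumptions, not a description of any existing database schema: the class names (`Assertion`, `PersonRecord`), the source labels, and the years are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Assertion:
    """One source's claim about a birth year (hypothetical structure)."""
    source: str                # label of the source making the claim
    birth_year: Optional[int]  # None means the source is silent on this point


@dataclass
class PersonRecord:
    """Keeps all claims about a person rather than one 'true' value."""
    name: str
    assertions: List[Assertion] = field(default_factory=list)

    def birth_year_range(self) -> Optional[Tuple[int, int]]:
        """Earliest and latest recorded birth year, or None if no source gives one."""
        years = [a.birth_year for a in self.assertions if a.birth_year is not None]
        if not years:
            return None
        return (min(years), max(years))

    def is_contested(self) -> bool:
        """True when at least two sources disagree on the year."""
        year_range = self.birth_year_range()
        return year_range is not None and year_range[0] != year_range[1]

    def silent_sources(self) -> List[str]:
        """Sources that say nothing: the absence itself is kept as data."""
        return [a.source for a in self.assertions if a.birth_year is None]


# Placeholder data: the names and years below are invented for illustration.
record = PersonRecord(
    name="Person A",
    assertions=[
        Assertion(source="Source X", birth_year=1618),
        Assertion(source="Source Y", birth_year=1619),
        Assertion(source="Source Z", birth_year=None),
    ],
)

print(record.birth_year_range())
print(record.is_contested())
print(record.silent_sources())
```

The design choice here is that disagreement and silence are preserved in the record instead of being resolved at entry time, so the interpretive decision stays with the researcher.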

3. The Philosophical Foundation of Computational Humanities

What makes humanities computing possible and promising? What do we mean by “scientific” and “humanistic,” and what makes a piece of research fall into one category or the other?

Cover image source: Stewart Varner: Digital Humanities in Libraries


Daston, Lorraine. 2017. “The History of Science and the History of Knowledge.” KNOW: A Journal on the Formation of Knowledge 1 (1): 131–54.