Today I am pleased to announce the launch of my CHI Fellowship project, entitled Linguistic Semantics, Text Organization and NLP. It is an educational website whose primary function is to teach researchers in the Cultural Heritage and Digital Humanities (CHDH) communities about natural language semantics and how professional linguists go about studying “meaning.” It also presents some novel research that I am conducting on how effectively NLP techniques can assist in organizing textual documents whose structure is unknown. Alongside this material are some practical notes on NLP tools and how to get started with them. There is a growing realization in the field that natural language semantic research can better inform our NLP practices, and the site takes a step toward this goal by giving CHDH researchers the chance to learn some semantic fundamentals that they can then use in their experimental designs.

I started thinking about the ideas behind this project after a couple of years of conversations with my colleagues about their current or prospective NLP projects. Many of them were excited to move into the world of NLP but expressed frustration at trying to teach themselves how to use the tools and get started. One of the biggest complaints was that they were overwhelmed: they didn’t know where to begin or what to begin with. There were at least two problems behind this. First, they weren’t really sure what the limitations of the NLP tools were. Essentially, they didn’t know what the machines were really good at versus what they were really bad at. Second, they shared a problem that afflicts all language users: a relative lack of awareness of the power of their own cognitive linguistic systems. That is to say, they did not appreciate all that they themselves were bringing to interpretive language tasks.

The problem boils down to the following: if researchers do not know how their language seems able to provide “meaning” or what a human’s syntactico-semantic abilities are, and they do not know how those abilities differ from what machines do and “learn” (because they do not know what machines do or learn), then it is extremely difficult for them to pose sensible language-related research questions and subsequently select the proper tools to investigate them. Each of my CHDH colleagues had a fantastic idea for an experiment or research project but simply lacked the technical semantic and syntactic diagnostic tools required to design and execute the project they really wanted to pursue. This site was created to help my colleagues with this problem.

The site is an ongoing, developing project. Going forward, I will continue writing and adding tutorials on natural language semantics, as well as updating the pages related to my research on refining a methodology for determining the (conceptual) structural organization of a collection of texts or documents. My hope is that everyone finds the site fun and enjoys using it.

I would like to express my gratitude to the Cultural Heritage Informatics Initiative (CHI), the Lab for Education and Advancement of Digital Research (LEADR) and Michigan State University (MSU) for the opportunity to create this project. Special thanks to Ethan Watrall of the MSU Anthropology Department for being such an excellent teacher and mentor, and for helping me discover a whole new field.