Nov. 11, 2021
by Tom Recht
As an academic linguist who’s made the transition into working in tech, I’ve found it fascinating to see the two worlds’ different approaches to making sense of how people use language. In tech, Natural Language Processing (NLP), the language-centric domain of Machine Learning, is used to interpret language at scale, in a way that’s much more practical and results-oriented than the theoretical approaches linguists generally use to understand language.
NLP tools are growing more and more powerful by the day, and so it’s easy for the tech industry to get the sense that language is almost a solved problem. There are a multitude of language interpretation packages out there for everything from voice recognition to syntactic parsing to semantic search. So do NLP teams really need linguists?
I’m going to make the case that, if your project involves comprehending or producing human language, you probably do need a linguist on your team. The reason is this: language isn’t just a collection of separate modules; it’s a single complex system, and the ways your current NLP tools are underperforming most likely have to do with failing to identify the links between parts of that system.
Here’s a simple example. Say you have an image search engine where each image is tagged with a set of keywords: man, woman, beach, dog, umbrella, etc. The keywords might even include verbs like walk or bite. Your search engine may be great at retrieving images of a dog, an umbrella, or a woman walking on the beach. Now what if I throw the following two queries at it?
dog bites man
man bites dog
If all your search engine knows is single-word tags, it won’t be able to tell the difference between these two queries, and they’ll bring up an identical set of images. What’s missing is knowledge of syntactic structure: understanding that in the first sentence the subject is dog and the object is man, while in the second it’s the other way around.
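To make this concrete, here’s a minimal Python sketch (with made-up three-word queries) of why a bag of single-word tags collapses the two queries, while even a crude subject–verb–object representation keeps them apart:

```python
def tag_set(query):
    """Bag-of-words representation: just the set of keywords."""
    return set(query.split())

q1 = "dog bites man"
q2 = "man bites dog"

# Both queries reduce to the same set of tags...
assert tag_set(q1) == tag_set(q2)  # both are {'dog', 'bites', 'man'}

# ...so a tag-based engine returns identical results for both.
# A structured representation preserves the subject/object distinction:
def svo(query):
    """Naive subject-verb-object parse for a three-word query."""
    subject, verb, obj = query.split()
    return {"subject": subject, "verb": verb, "object": obj}

assert svo(q1) != svo(q2)
assert svo(q1)["subject"] == "dog" and svo(q1)["object"] == "man"
```

Real parsers are far more sophisticated, of course, but the underlying point is the same: word order encodes structure that a set of tags simply throws away.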
Now, there are pretty good tools out there for parsing syntactic dependencies. But here’s where it gets more complicated: sometimes the surface structure of a phrase isn’t enough by itself to tell us what the underlying syntactic dependency structure is. Consider these two phrases:
woman on a beach with an umbrella
woman on a beach with a boardwalk
In the first phrase, the words with an umbrella describe the woman, while in the second, the words with a boardwalk describe the beach. This is what linguists call an attachment ambiguity: which noun is that prepositional phrase supposed to be attached to? As English speakers we know the answer without thinking twice, because our real-world knowledge tells us that people can have umbrellas while beaches can have boardwalks. But a search engine will fail on this task unless it knows how to properly combine two disparate parts of the overall language system: syntactic structure and encyclopedic knowledge of the world.
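Here’s a toy Python sketch of one way the two knowledge sources could be combined: a hand-built “can have” table standing in for encyclopedic knowledge (all the data below is invented for illustration):

```python
# Which head nouns plausibly "have" which objects -- a stand-in for
# real-world knowledge (hypothetical data, not a real knowledge base).
CAN_HAVE = {
    "woman": {"umbrella", "dog", "hat"},
    "beach": {"boardwalk", "lifeguard", "dunes"},
}

def attach(head_candidates, pp_object):
    """Pick the noun the 'with X' phrase most plausibly attaches to."""
    for noun in head_candidates:
        if pp_object in CAN_HAVE.get(noun, set()):
            return noun
    return head_candidates[-1]  # fallback: attach to the nearest noun

# "woman on a beach with an umbrella" -> attaches to "woman"
assert attach(["woman", "beach"], "umbrella") == "woman"
# "woman on a beach with a boardwalk" -> attaches to "beach"
assert attach(["woman", "beach"], "boardwalk") == "beach"
```

In practice this kind of disambiguation is done statistically rather than with hand-written tables, but the sketch shows the shape of the problem: the parser alone can’t decide; some representation of the world has to weigh in.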
Here’s another example, again using image search. Say you want to search for images of a blue balloon. And let’s assume you’ve taught your search engine basic syntax, so it understands it’s supposed to retrieve images of a balloon that is blue, not just ones that contain a balloon plus some other thing that happens to be blue. Your engine now works so well for English that you decide to expand it into other languages -- for example, my native language, Hebrew. Your dictionary tells you that the Hebrew for blue balloon is balon kakhol. Great -- that means the following image will be tagged as balon kakhol, and will be correctly retrieved when a Hebrew-speaking user searches for a blue balloon.
What about the image below? That’s balon kakhol too, right?
Wrong. In English it’s a blue balloon, yes. But in Hebrew, that light shade of blue is its own color with its own name, tkhelet. If you’ve simply tagged it as a “blue balloon” and just run your tags through Google Translate, it’ll wrongly show up on searches for balon kakhol and not show up at all on searches for balon tkhelet.
The point here is that languages carve up the world in different ways: there’s not a one-to-one relationship between color terms in one language and those in another -- not to mention more complex concepts, like verbs of motion or nouns denoting social roles. And the meaning of each term depends systematically on those of the others: you can’t know what kakhol means without knowing what tkhelet means, and vice versa. In our largely monolingual Western world, it’s easy to be blind to this kind of variation across languages, but linguists are trained to be able to quickly identify the countless kinds of ways in which languages describe the world differently.
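As a sketch, tagging colors per language, rather than translating English tags word-for-word, might look like this in Python. The hue boundaries below are invented for illustration, not real measurements of Hebrew color categories:

```python
# Each language maps hue ranges (degrees on a color wheel) to its own
# terms, so tags are assigned per language instead of translated.
# The ranges here are made up purely to illustrate the idea.
COLOR_TERMS = {
    "en": [((180, 260), "blue")],             # one term covers both shades
    "he": [((180, 220), "tkhelet"),           # light blue: its own word
           ((220, 260), "kakhol")],           # darker blue
}

def color_tag(hue, lang):
    """Return the color term for this hue in this language, if any."""
    for (low, high), term in COLOR_TERMS[lang]:
        if low <= hue < high:
            return term
    return None

light_blue, dark_blue = 200, 240
assert color_tag(light_blue, "en") == color_tag(dark_blue, "en") == "blue"
assert color_tag(light_blue, "he") == "tkhelet"
assert color_tag(dark_blue, "he") == "kakhol"
```

The design point is that the lexicon lives on the language side, not the translation side: there is no entry anywhere saying “tkhelet = blue,” because no such equation exists.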
Even a sentence that looks completely unambiguous can have multiple different meanings in a real-life context. Consider the sentence Alice has a dog. Perfectly simple, right? No syntactic or semantic ambiguity? But what if I emphasize it in two different ways…
Alice has a DOG.
ALICE has a dog.
Those two patterns of emphasis actually have different situational meanings. The first one could be an answer to the question, “What pets does Alice have?” The second one could be an answer to the question, “Which of our friends has a dog?” Using the wrong intonation pattern would elicit a blank stare from any human English speaker, because they have an implicit knowledge of the relationship between phonetics (the acoustic features of speech) and pragmatics (the kinds of meaning that depend on specific situations and contexts). If you’re designing a voice chatbot, it will need to understand the rules for how those two parts of the overall language system interact. This is the kind of question linguists study in depth (my own dissertation work was actually on a variant of this problem in ancient Greek!).
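As a very rough Python sketch, one could model the focus–question relationship by treating the emphasized word as the new information and replacing it with a wh-word to recover the implicit question. The lookup table here is invented and wildly oversimplified:

```python
# Crude wh-word lookup: who for people, what otherwise (invented rule).
WH = {"Alice": "Who", "dog": "What"}

def implicit_question(words, focus_index):
    """Recover the implicit question by swapping the focused
    (emphasized) word for a wh-word; the rest is presupposed."""
    qwords = list(words)
    qwords[focus_index] = WH.get(words[focus_index], "What")
    return " ".join(qwords) + "?"

sentence = ["Alice", "has", "a", "dog"]
# "Alice has a DOG." answers, roughly, an echo of "Alice has a what?"
assert implicit_question(sentence, 3) == "Alice has a What?"
# "ALICE has a dog." answers "Who has a dog?"
assert implicit_question(sentence, 0) == "Who has a dog?"
```

A real voice system would need to detect the focus from acoustics (pitch, duration, intensity) before any of this applies, which is exactly the phonetics–pragmatics link the linguist brings to the table.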
A quick final example. Languages not only reflect the social worlds we live in; they play a part in actively creating them. French has two different pronouns meaning “you”: tu, used in informal situations or with people you know well, and vous, used more formally with people you don’t know well. This may sound hard enough to get right, but many languages go much further: some East Asian languages have half a dozen or more types of these “honorifics”. If you’re generating foreign-language content while trying to avoid embarrassing faux pas, your language module needs an awareness of the nuanced relationships between lexicon (for example, pronoun or other word choices), morphology (for example, verb forms), and cultural and social categories. Again, a linguist is more likely than anyone else to recognize this problem, know what prior work has been done on it, and figure out the relevant rules and patterns.
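A minimal Python sketch of how such a choice might be wired up, with simplified, made-up relationship categories:

```python
def french_you(relationship):
    """Pick 'tu' vs 'vous' from the social relationship
    (categories here are simplified assumptions)."""
    informal = {"friend", "family", "child"}
    return "tu" if relationship in informal else "vous"

assert french_you("friend") == "tu"
assert french_you("customer") == "vous"

# The choice ripples into morphology: the verb form must agree.
def conjugate_avoir(pronoun):
    """Present-tense 'avoir' ('to have') for the chosen pronoun."""
    return {"tu": "tu as", "vous": "vous avez"}[pronoun]

assert conjugate_avoir(french_you("stranger")) == "vous avez"
```

Even this two-way French distinction forces the lexicon and the morphology to agree; a system handling Japanese or Korean honorifics would need a much richer model of the social relationship between speaker and addressee.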
I could go on -- there are many more ways that seeing relationships between different parts of a language system can be crucial for effective work in NLP. But I hope this post has convinced you that linguists -- trained pattern-recognizers who can see both the messy details and the big picture, who understand how different areas of language interact as a single complex system, and who are used to diving into data from a language they’ve never seen and figuring out how it works -- are essential to any language-centered project.