- Posted by Stephen Whiteley
- On 02/09/2020
- Computer languages, Natural Language Processing, NLP, Translation
Computers do not really speak human languages — they ‘speak’ ones and zeros. However, we do need to communicate with computers, so they can perform the tasks they were created for. Imagine if we had to learn some sort of computer language or code to be able to use our PC, or its functions? Most likely, computers wouldn’t be so ubiquitous then. Fortunately, there is something that allows computers to understand our language, and that is Natural Language Processing, or NLP.
This is part two of a three-part article on Natural Language Processing. Read part one here:
“Have you ever used Siri, Alexa, Google Assistant, or a similar AI? Talking to machines has become part of our daily life, but how is it even possible?”
Essentially, NLP is a component of artificial intelligence that helps computers understand, interpret, and manipulate human language. In part one of this article, you can read more about NLP, its history, and what makes it so hard.
In this article, we will take a more detailed look at how NLP works and various possible uses for it.
How it works
Modern Natural Language Processing is statistical: it is based on collecting large amounts of language and textual data and analysing them from various perspectives. The computer establishes common patterns and algorithms, which it then uses to understand or produce text and speech.
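To give a rough feel for what "establishing common patterns" can mean, here is a minimal sketch that counts which word tends to follow which in a small body of text, then uses those counts to guess the next word. The function names and the idea of using word pairs (bigrams) are assumptions made for illustration; real systems use far larger data sets and more sophisticated models.

```python
from collections import Counter, defaultdict

def build_bigram_counts(corpus):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair every word with the word that comes directly after it.
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def most_likely_next(counts, word):
    """Return the most frequent follower of a word, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]
```

Fed the three sentences "the cat sat", "the cat ran" and "the dog sat", the sketch would suggest "cat" as the most likely word after "the", simply because that pair occurs most often.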
It may sound simple, but human language is extremely diverse, complex, and rather unstructured — especially from the point of view of a computer. We need to carry out many processes to make natural languages understandable to machines. Here are a few of the main ones.
Lemmatisation
The objective of this process is to reduce a word to its base form and to group together all the different forms of the same word. Verbs in the past tense are changed into the present, and the comparative forms of an adjective are joined together with its basic form. Words with a similar meaning are standardised to their root. This allows computers to analyse different inflected forms of a word as a single item.
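Reducing words to a base form, commonly called lemmatisation, can be sketched with a simple lookup table. The table below is a made-up toy and the function names are hypothetical; a real lemmatiser would rely on a full dictionary and on grammatical information about each word.

```python
# Toy lookup table, for illustration only (not a real dictionary).
LEMMAS = {
    "went": "go", "gone": "go", "goes": "go", "going": "go",
    "better": "good", "best": "good",
    "ran": "run", "running": "run", "runs": "run",
}

def lemmatise(word):
    """Return the base form of a word, or the lower-cased word if unknown."""
    lower = word.lower()
    return LEMMAS.get(lower, lower)

def group_by_lemma(words):
    """Group inflected forms under their shared base form."""
    groups = {}
    for word in words:
        groups.setdefault(lemmatise(word), []).append(word)
    return groups
```

With this table, "went" and "goes" both end up under "go", so a later analysis step can treat them as one item.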
Word segmentation or tokenisation
This is an essential process that involves segmenting the text into separate words and sentences, cutting it into pieces called tokens. It may sound simple to us, but without this process, a computer wouldn’t know where one word or sentence ends and another begins. In some languages, such as Japanese, the process is harder because the writing system does not use spaces to separate individual words.
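For a language that does separate words with spaces, a very naive tokeniser can be sketched with regular expressions. This is only an illustration under simple assumptions (sentences end with ".", "!" or "?", and apostrophes are the plain ASCII kind); it would not cope with abbreviations such as "Dr." or with languages like Japanese.

```python
import re

def split_sentences(text):
    """Naively split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

def tokenise(sentence):
    """Split a sentence into word tokens and single punctuation tokens."""
    # A word may contain one internal apostrophe (e.g. "don't");
    # any other non-space symbol becomes a token of its own.
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sentence)
```

For example, the text "Hello there. How are you?" would be split into two sentences, and the second would yield the tokens "How", "are", "you" and "?".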
Word sense disambiguation
Another difficulty for computers in understanding human language is that meaning depends heavily on context, and one word can have multiple meanings and connotations. Human brains are proficient at word sense disambiguation, but a computer has to be taught how to do it. There are various methods of word sense disambiguation, relying on dictionaries, corpora, or cross-lingual evidence.
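One dictionary-based approach can be sketched as follows: compare the words around an ambiguous word with the dictionary definition of each of its senses, and pick the sense with the most words in common. This is a heavily simplified version of the idea behind the Lesk algorithm. The mini-dictionary and the function name are invented for illustration.

```python
# Hypothetical mini-dictionary: each sense of "bank" has a short definition.
SENSES = {
    "bank": {
        "financial institution": "an institution that holds money deposits and gives loans",
        "river side": "the sloping land beside a river or stream of water",
    }
}

def disambiguate(word, context):
    """Pick the sense whose definition shares the most words with the context."""
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, definition in SENSES.get(word, {}).items():
        overlap = len(context_words & set(definition.split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense
```

In "she deposits money at the bank" the financial sense wins, while "we sat on the bank of the river" selects the river sense. Note that very common words such as "the" also count towards the overlap here, which a more careful version would filter out.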
These are just a couple of examples; there are many more processes involved in making human language understandable for a machine, including stop word removal, stemming, topic modelling, part-of-speech tagging, and others.
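Two of the processes named above, stop word removal and stemming, are simple enough to sketch in a few lines. The stop word list and suffix list below are tiny, made-up samples, and the stemmer is deliberately crude; established stemmers apply many more rules.

```python
# Tiny illustrative lists; real systems use much longer ones.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in"}
SUFFIXES = ("ing", "ed", "s")

def remove_stop_words(tokens):
    """Drop very common words that carry little meaning on their own."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]

def crude_stem(token):
    """Chop off the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token
```

Unlike reducing a word to its dictionary base form, stemming only trims endings, so "jumping" becomes "jump" but an irregular form such as "went" is left unchanged.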
Why all the fuss? Why do we even need to talk to computers? A computer is a practical tool that can perform many tasks, but we need to be able to communicate these tasks to it. In the past, only people who knew coding (the ‘ones and zeros’) could ‘ask’ a computer to perform a task, say, solve a mathematical problem. Nowadays, anyone can do it.
Here are some of the common uses of NLP:
Autocorrect and spellcheck: Have you ever written a text on a smartphone or a document in a word processor? Then you have come across NLP. In autocorrect, it identifies the closest possible term to your misspelling (although it is definitely not perfect). Various spellcheck and writing tools help identify misspellings and grammar mistakes.
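Finding "the closest possible term" to a misspelling can be illustrated with edit distance: the number of single-letter insertions, deletions and substitutions needed to turn one word into another. The sketch below uses the classic Levenshtein distance and a tiny invented word list; it is an assumption about how such a tool might work, not a description of any particular product.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]

# Toy word list, for illustration only.
VOCABULARY = ["language", "computer", "process", "translate"]

def suggest(word, vocabulary=VOCABULARY):
    """Return the vocabulary entry closest to the (possibly misspelt) word."""
    return min(vocabulary, key=lambda candidate: edit_distance(word.lower(), candidate))
```

Here `suggest("langauge")` picks "language", because only two letters need to change. Real autocorrect also weighs how common each word is and what the surrounding words are, which is one reason it still gets things wrong.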
Spam filters: They use NLP technology to analyse email subject lines and body content, and can quite easily identify words and sentence patterns typical of spam emails.
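One well-known way to learn which words are typical of spam is a naive Bayes classifier, which compares how often each word appears in known spam and in known legitimate mail ("ham"). The sketch below is a toy version with four invented training messages and hypothetical function names; production filters train on vast amounts of mail and use many more signals.

```python
import math
from collections import Counter

# Made-up training examples, for illustration only.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]

def train(examples):
    """Count word frequencies per label and the number of messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability (add-one smoothing)."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    scores = {}
    for label in word_counts:
        total = sum(word_counts[label].values())
        # Start from how common the label is overall.
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in text.lower().split():
            # Add one to every count so unseen words do not zero the score.
            score += math.log((word_counts[label][word] + 1) / (total + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)
```

After `word_counts, label_counts = train(TRAINING)`, a message such as "claim your free prize" is labelled spam, while "project meeting on monday" is labelled ham.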
Smart home devices with voice control: According to some data, 58% of millennials use such devices. Busy cooking but want to turn on your favourite music? Thanks to NLP, and voice recognition in particular, you can ask Google Home or Alexa to do it.
Other uses include, but are not limited to:
- Social media monitoring
- Survey analytics
- Smart search
- Translation tools
- Sentiment analysis
- Financial trading
- Aircraft maintenance
It is very likely that there will be more uses for NLP in the future.