An Introduction to Natural Language Processing
Have you ever used Siri, Alexa, Google Assistant, or a similar AI assistant? Nowadays they are so common that we take them for granted. You can ask them to find the next flight out of town, play your favorite music, or even tell a joke. Websites offering legal, travel, or other services often have chatbots that can answer many of your questions. Talking to machines has become part of our daily life. But how is it even possible? The short answer: Natural Language Processing.
What is Natural Language Processing?
It’s possible for us to talk to computers largely because of Natural Language Processing, or NLP. It is a branch of artificial intelligence (AI) that helps computers understand, interpret, and manipulate human language.
NLP draws from many disciplines, including linguistics, computer science, and information engineering. It is primarily concerned with interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.
And although asking Siri to find some pictures of cute kittens online seems to us today like a piece of cake, NLP is actually not that simple.
Why is NLP so hard?
Computers do not speak human languages. They ‘speak’ code, or, at its most basic level, millions of ones and zeros. In the past, programmers used punchcards with ones and zeros to communicate with computers. This was an arduous and time-consuming manual process, and few people were skilled enough to do it.
Human language is complex, diverse, and, to a large extent, unstructured. There are endless ways of expressing things, with subtle connotations, subtext, and cultural references. There are slang, regional dialects and accents, misspellings, abbreviations…
Structuring human language, breaking it down to ‘ones and zeros’, and making it possible for a machine to understand it is no easy task.
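To make the ‘ones and zeros’ idea concrete, here is a minimal sketch in Python of how a short piece of text looks once it is stored as bits. It assumes the common UTF-8 encoding, and the function name `to_bits` is made up for this illustration. Note that this only stores the characters; it says nothing about what the words mean, which is the hard part.

```python
def to_bits(text: str) -> str:
    """Encode text as UTF-8 and show each byte as eight binary digits."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


print(to_bits("Hi"))  # 01001000 01101001
```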
How NLP works
Up until the 1990s, programmers mostly applied so-called ‘symbolic’ NLP. Basically, the programmer gave the computer a set of hand-written rules, like a grammar textbook with questions and matching answers, and the computer used those rules to analyse language data.
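A rule-based system of this kind can be sketched in a few lines of Python. This is only a toy illustration, not how any particular historical system worked: the rules, the canned answers, and the function name `respond` are all invented for the example.

```python
import re

# Hypothetical hand-written rules: each pairs a pattern with a canned answer.
RULES = [
    (re.compile(r"\b(hello|hi)\b", re.IGNORECASE), "Hello! How can I help?"),
    (re.compile(r"\bweather\b", re.IGNORECASE), "I cannot check the weather yet."),
    (re.compile(r"\bjoke\b", re.IGNORECASE), "Why did the parser cross the road?"),
]


def respond(utterance: str) -> str:
    """Return the answer of the first rule whose pattern matches."""
    for pattern, answer in RULES:
        if pattern.search(utterance):
            return answer
    # Anything the rule writer did not anticipate falls through to here.
    return "Sorry, I do not understand."


print(respond("Hi there"))
print(respond("Book a flight"))
```

The second call shows the weakness of the approach: any request the programmer did not write a rule for gets no useful answer.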
Starting in the late 1980s, computing power increased and new approaches to machine learning and language processing emerged. NLP became largely statistical: instead of working with a pre-determined set of rules, systems derived patterns from large amounts of collected language data.
Syntactic analysis and semantic analysis are the main techniques used to complete Natural Language Processing tasks. Here are some examples of the processes involved:
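The statistical idea can be illustrated with one of the simplest possible models: counting which word tends to follow which. The sketch below is a toy under stated assumptions. The four-sentence corpus is made up, and the names `train_bigrams` and `predict_next` are hypothetical; real systems learn from far more data and use far richer models.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real system would learn from millions of sentences.
CORPUS = [
    "play my favorite music",
    "play my favorite song",
    "play my playlist",
    "find my phone",
]


def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


model = train_bigrams(CORPUS)
print(predict_next(model, "my"))  # favorite
```

Nobody wrote a rule saying that "favorite" often comes after "my"; the pattern comes from the counts alone.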
- Lemmatization — reducing the various inflected forms of a word to a single base form for easier analysis (for example, ‘ran’ and ‘running’ to ‘run’)
- Word segmentation — dividing a continuous stretch of text into distinct units, such as words
- Sentence breaking — placing sentence boundaries in a large piece of text
- Word sense disambiguation — choosing the intended meaning of a word that has several meanings, based on the context
- Named entity recognition — finding the parts of a text that can be identified and categorised into preset groups, such as names of people and names of places
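Three of the processes above (sentence breaking, word segmentation, and lemmatization) can be sketched with nothing but Python's standard library. This is a deliberately naive illustration: the function names and the small `LEMMAS` table are invented for the example, and production tools rely on full dictionaries and trained models instead.

```python
import re

# Hypothetical lookup table; real lemmatizers use full dictionaries and
# part-of-speech information.
LEMMAS = {
    "cats": "cat",
    "were": "be",
    "is": "be",
    "sleeping": "sleep",
    "mice": "mouse",
}


def split_sentences(text):
    """Naive sentence breaking: split after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(sentence):
    """Naive word segmentation: keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", sentence)


def lemmatize(word):
    """Map a word to its base form if it appears in the toy table."""
    lowered = word.lower()
    return LEMMAS.get(lowered, lowered)


text = "The cats were sleeping. Is the dog awake?"
for sentence in split_sentences(text):
    print([lemmatize(word) for word in split_words(sentence)])
```

Even this small example hints at why the real tasks are hard: the sentence splitter would wrongly break "Dr. Smith arrived." after "Dr.", and the lemmatizer knows only the five words in its table.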
Applications of NLP
One example of how NLP can be used has already been given above: personal AI assistants, such as Google Assistant, Siri, Cortana, Alexa, and others. But that is far from all that NLP is good for.
There are many applications for NLP, including (but not limited to):
- Various forms of text and speech recognition and synthesis, including OCR, speech-to-text, and text-to-speech
- Machine translation (e.g. Google Translate)
- Filtering textual information (e.g. spam filters in your email)
- Automatic text summarization or rewriting
- Checking text accuracy (e.g. Grammarly)
- Interactive Voice Response (IVR) applications used in call centers to respond to certain users’ requests
Natural Language Processing has definitely made our lives easier. You don’t have to be a programmer or know how to work with punchcards to search for information online or to use text-to-speech and machine translation. In a word, you can ‘talk’ to your computer and ‘ask’ it to do what you need.
It helps people whose jobs involve dealing with large amounts of text, be they linguists, lawyers, or medical professionals. But it may also be taking some work away from people: quite a few already prefer services like Google Translate and Grammarly to human translators and editors.
Part two of this article talks more about the applications, and part three looks at the potential impact of NLP.