Natural Language Processing with R

Version 1.0

Natural Language Processing (NLP) is a sub-field of Artificial Intelligence concerned with processing and analysing large amounts of natural language text. Applications include search engines (Google), text classification (spam filters), identifying sentiment towards a product (sentiment analysis), discovering abstract topics in a collection of documents (topic modelling) and machine translation. In this course you will learn about exploratory analysis of text data, and you will be introduced to sentiment analysis using sentiment lexicons and to the concept of topic modelling (using the topicmodels package).

Course objectives

At the end of this course you should be able to put text data into a form that can be used for analysis, clean text data, carry out exploratory analysis of text data, use meta-data to produce visual displays of features of the text data, carry out sentiment analysis using sentiment lexicons and discover topics in a corpus.

Learning objectives

  • Describe the main components of language structure;
  • Perform pre-processing (cleaning) operations on text;
  • Apply methods from Corpus Linguistics to gain greater insight into a corpus;
  • Produce word clouds, bar charts and other basic visualisations of variables of interest.
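
As a rough illustration of the pre-processing and visualisation objectives above, the following is a minimal sketch in base R. It is not taken from the course materials: the two example sentences and the short stop-word list are assumptions made purely for illustration, and the course itself may use dedicated text-mining packages instead.

```r
# Hypothetical toy corpus (an assumption, not course data).
docs <- c("The cat sat on the mat.", "The dog chased the cat!")

# Pre-processing: lower-case the text, strip punctuation, split on whitespace.
clean <- tolower(docs)
clean <- gsub("[[:punct:]]", "", clean)
tokens <- unlist(strsplit(clean, "[[:space:]]+"))

# Remove a small hand-picked stop-word list (illustrative only).
stop_words <- c("the", "on")
tokens <- tokens[!tokens %in% stop_words]

# Count word frequencies, most frequent first.
freq <- sort(table(tokens), decreasing = TRUE)
print(freq)

# Basic bar chart of the word counts.
barplot(freq, las = 2, main = "Word frequencies")
```

In practice a real corpus would be read from files and a published stop-word list would normally replace the hand-picked one used here.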

Course type

E-learning – Available

Self-learning – Available

Face-to-face – Not available

Skill level

Competency in using the R programming language to perform basic data manipulation is required. If you need to build this competency first, refer to the Intro to R course.


To discuss booking this course for remote delivery, please contact the Data Science Campus Faculty.