Advanced Natural Language Processing in Python

Version 1.0

This course focuses on four key topics in Natural Language Processing: information retrieval, classification, sentiment analysis and topic modelling. The information retrieval section covers the main building block of a search engine, the inverted index, and maps out in detail, with both illustrations and code, how an information retrieval application can be built. Three distinct classical approaches to retrieval will be examined: the boolean retrieval model, the vector space model and language modelling.
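As an informal illustration of the inverted index idea mentioned above, the following is a minimal sketch and not the course's own material. The function names `build_inverted_index` and `boolean_and`, the whitespace tokenisation and the three toy documents are all assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive whitespace tokenisation; a real system would also
        # normalise punctuation, stem words and so on.
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Return the ids of documents that contain every query term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Hypothetical toy collection, keyed by document id.
docs = {
    1: "the cat sat on the mat",
    2: "the dog chased the cat",
    3: "a dog barked",
}
index = build_inverted_index(docs)
print(boolean_and(index, "dog", "cat"))
```

A boolean AND query here reduces to intersecting the posting sets of the query terms, which is why the index stores document ids per token and not tokens per document.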

Classification will then be outlined, focusing on its supervised machine learning foundations. A real-world problem, the classification of news articles, will be illustrated using a BBC news dataset. Classification will then be applied to sentiment analysis, and key challenges in the field of sentiment analysis will be explored. Topic modelling, as the name suggests, is a process for automatically identifying the topics present in a dataset. The course will demonstrate, from a practical perspective, how this can be achieved.

Course objectives

Participants should attain the specialist knowledge and skills needed to develop more challenging language-based applications, such as sentiment analysis tools and search engines.

Learning objectives

  • Describe the boolean retrieval model;
  • Execute the boolean retrieval model over a dataset;
  • Describe the vector space model;
  • Execute the vector space model over a dataset;
  • Describe how a language model can be used to enact retrieval over a dataset;
  • Execute a language modelling approach to enact retrieval over a dataset;
  • Describe supervised machine learning;
  • Describe the pathway of a typical machine learning project;
  • Follow the steps in that pathway to enact classification over a dataset;
  • Describe sentiment analysis;
  • Describe challenges in a sentiment analysis task;
  • Execute sentiment analysis over a dataset using the methods described;
  • Describe the topic modelling task over a dataset;
  • Execute topic modelling over a dataset using the steps and code shown.

Course type

E-learning – Available

Self-learning – Not available

Face-to-face – Not available

Skill level

Participants must have attended the Introduction to Natural Language course.


To discuss booking this course for remote delivery, please contact the Data Science Campus Faculty.

Other information

This course is under review and a new version will be available soon.