This course focuses on four key topics in Natural Language Processing: information retrieval, classification, sentiment analysis and topic modelling. The information retrieval section covers the building blocks of a search engine, chiefly the inverted index, and shows in detail, with both illustrations and code, how an information retrieval application can be built. Three distinct classical approaches are examined for this purpose: the boolean retrieval model, the vector space model and language modelling.
Classification is then outlined, with a focus on its foundations in supervised machine learning. A real-world problem, news classification, is illustrated using a BBC news dataset. Classification is then applied again to carry out sentiment analysis, and key challenges in the field of sentiment analysis are explored. Topic modelling, as the name suggests, is a process for automatically identifying the topics present in a dataset. The course demonstrates from a practical perspective how this can be done.
Participants should gain the specialist knowledge and skills needed to develop more challenging language-based applications, such as sentiment analysis tools and search engines.
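As a rough illustration of the inverted index mentioned above, the following minimal sketch builds one over a tiny made-up corpus and answers a boolean AND query against it. The function names `build_inverted_index` and `boolean_and`, the whitespace tokenisation and the sample documents are all assumptions made for this example; they are not taken from the course materials.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive whitespace tokenisation (an assumption; real systems do more).
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Return the ids of documents that contain every query term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Hypothetical toy corpus, keyed by document id.
docs = {
    1: "the cat sat on the mat",
    2: "the dog chased the cat",
    3: "dogs and cats make good pets",
}

index = build_inverted_index(docs)
print(sorted(boolean_and(index, "cat", "the")))  # prints [1, 2]
```

Storing each posting list as a set keeps the AND query a simple intersection; OR and NOT would map to set union and set difference in the same way.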
- Describe the boolean retrieval model;
- Execute the boolean retrieval model over a dataset;
- Describe the vector space model;
- Execute the vector space model over a dataset;
- Describe how a language model can be used to perform retrieval over a dataset;
- Execute a language modelling approach to perform retrieval over a dataset;
- Describe supervised machine learning;
- Describe the pathway of a typical machine learning project;
- Follow the steps in that pathway to carry out classification over a dataset;
- Describe sentiment analysis;
- Describe challenges in a sentiment analysis task;
- Execute sentiment analysis over a dataset using the methods described;
- Describe the topic modelling task over a dataset;
- Execute topic modelling over a dataset using the steps and code shown.
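To give a flavour of what executing the vector space model over a dataset can involve, the sketch below ranks a few made-up documents against a query using TF-IDF weights and cosine similarity. It is only one possible formulation: the `log(N / df) + 1` weighting, the whitespace tokenisation, the helper names and the sample texts are all assumptions for illustration and are not drawn from the course.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    # Assumed weighting: log(N / df) + 1, so that no term gets zero weight.
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenised
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return document indices ordered from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the corpus are ignored.
    counts = Counter(query.lower().split())
    q_vec = {term: tf * idf[term] for term, tf in counts.items() if term in idf}
    scored = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


# Hypothetical toy corpus.
docs = [
    "stock markets fell sharply today",
    "the football team won the cup final",
    "markets rallied as stock prices rose",
]

print(rank("stock markets", docs))
```

Here the sports document shares no terms with the query, so it scores zero and is ranked last, while the two finance documents are separated only by length normalisation.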
E-learning – Available
Self-learning – Not available
Face-to-face – Not available
Participants must have attended the Introduction to Natural Language course.
To discuss booking this course for remote delivery, please contact the Data Science Campus Faculty.
This course is under review and a new version will be available soon.