We go behind our analysis of preference tariff utilisation using microdata and take a deep dive into the challenges of drawing together new administrative data sources to answer relevant policy questions. Read more on The use of microdata for firm-level analysis of preference tariff utilisation in the UK: technical report
We show how shipping instructions can be used to map the trade routes of critical goods. This will help us understand our reliance on global ports for accessing specific products, and draw insights into the impact of important events such as strikes. Read more on Using new shipping data to improve government understanding of trade flows
We used machine learning to develop a model to identify areas of trees from satellite images in eastern Uganda, where the Mbale Trees Programme has been running since 2010. Read more on Using Sentinel-2 images to measure the change in tree coverage in eastern Uganda: what does it mean for the Mbale Trees Programme?
In this guest blog, data science apprentice Evie Brown from the Social Care Analysis team at the Office for National Statistics (ONS) presents work on grouping online job adverts by social care role. This project was a significant part of the final year of the Level Six Data Science Apprenticeship. Read more on Identifying different roles in the social care sector using online job advertisements
The Department for Business, Energy and Industrial Strategy (BEIS) asked the Campus to see if machine learning techniques could be used to predict the energy efficiency of properties that do not have an official score. Read more about our findings in this blog post. Read more on Predicting the Energy Performance Certificates (EPC) of properties
At the World Expo in Dubai, we supported other National Statistics Organisations to develop their data science capability by delivering a series of workshops, training sessions and discussions. Read more on UK data science on a global stage
In 2021, the ONS-UNECE machine learning group demonstrated the benefits of international cooperation for advancing machine learning in official statistics. Read more on How international collaboration is advancing machine learning in official statistics
This report summarises our work to meet widespread demand for better automated and higher quality solutions to this problem, by using machine learning (ML) methods to improve the match rate and accuracy of automatic SIC and SOC classification. Read more on Automated coding of Standard Industrial and Occupational Classifications (SIC/SOC)
Many countries in the developing world lack regular estimates of road traffic activity. To address this, we applied machine learning techniques to open-source satellite data to generate estimates of traffic volume in Kenya, East Africa. Read more on Detecting Trucks in East Africa
How the Data Science Campus and UNECE are continuing to lead international collaboration on Machine Learning for Official Statistics through the ML 2021 group. Read more on Leading international collaboration in machine learning for official statistics
The Campus is partnering with the United Nations Economic Commission for Europe (UNECE) to develop research, build skills and share resources on Machine Learning developments and applications for official statistics across the global statistical community. Read more on ONS-UNECE Machine Learning 2021 Group
Large volumes of unstructured, free-text data, such as patent applications, contain potentially valuable information for policymakers. Here we describe the pipeline used to process such data sources. Read more on pyGrams: An open source tool for discovering emerging terminology in large text datasets
In this report, we focus primarily on how the pyGrams tool can be used to analyse terms through time using the time stamps in document metadata. Read more on Discovering emerging important terminology in large text datasets using pyGrams: a comparison between net growth and e-score methods
We introduce pyGrams – a new Python tool for extracting, visualising and identifying emerging terms in large document collections, such as patents. Read more on Extracting, visualising and identifying emerging important terminology from patent collections
Our latest project investigates the use of machine learning techniques to predict missing energy performance scores. It also attempts to create a complete picture of the energy efficiency profile for domestic properties in Wales. Read more on Can machine learning be used to predict energy performance scores?