Core Themes
Data Science Campus work focuses on a number of core themes:
Read more on Core Themes
We have explored the feasibility of using open data to analyse local transport performance. In this publication we demonstrate how to produce hyperlocal bus service coverage and punctuality metrics.
Read more on Bus reliability – analysing the morning travel peak across England
Using data science and automation in the production of price statistics
Read more on Transformation in the Ghana Statistical Service
Synthetic data are artificially generated data that are made to resemble real-world, often sensitive, data.
Read more on Enabling Data Access through Privacy Preserving Synthetic Data
This report offers a thorough look into how we created a synthetic version of a large, linked, and confidential dataset while adhering to a formal privacy framework.
Read more on Synthesising the linked 2011 Census and deaths dataset while preserving its confidentiality
We are planning to follow up on previous work we published earlier this year looking at public transport availability across the UK, by providing metrics for urban centres across the UK, as well as their international counterparts.
Read more on Comparing international transport performance in urban centres: upcoming work
In this guest blog, 10DS data science fellows Federico and Robin talk about working with the Campus to create packages that import, process and visualise DfT’s journey time statistics data.
Read more on Guest blog: Enhancing open-access data analysis: introducing the Journey Time Statistics R and Python packages
We go behind our analysis on the use of microdata for the examination of preference tariff utilisation and take a deep dive into challenges of drawing together new administrative data sources to answer relevant policy questions.
Read more on The use of microdata for firm-level analysis of preference tariff utilisation in the UK: technical report
We show how shipping instructions can be used to map the trade routes of critical goods. This will help understand our reliance on global ports for accessing specific products, and draw insights on the impact of important events such as strikes.
Read more on Using new shipping data to improve government understanding of trade flows
We have published data to help researchers and local planners understand how public transport access varies across the UK, using open data and open-source software.
Read more on Using open data to understand hyperlocal differences in UK public transport availability
We used machine learning to develop a model to identify areas of trees from satellite images in eastern Uganda, where the Mbale Trees Programme has been running since 2010.
Read more on Using Sentinel-2 images to measure the change in tree coverage in eastern Uganda: what does it mean for the Mbale Trees Programme?
The Campus has delivered a suite of tools and code to North of Tyne Combined Authority (NTCA) to support their work to prioritise broadband infrastructure improvements in the region.
Read more on Using data science to help inform regional broadband investment
In this guest blog, data science apprentice Evie Brown from the Social Care Analysis team at the Office for National Statistics (ONS) presents work on grouping online job adverts by social care role. This project was a significant part of the final year of the Level Six Data Science Apprenticeship.
Read more on Identifying different roles in the social care sector using online job advertisements
The Department for Business, Energy and Industrial Strategy (BEIS) asked the Campus to see if machine learning techniques could be used to predict the energy efficiency of properties that do not have an official score. Read more about our findings in this blog post.
Read more on Predicting the Energy Performance Certificates (EPC) of properties
Rail services can be affected by temporary factors such as unseasonably hot weather, industrial action and engineering works. We produced maps from open data showing service levels at every station in Great Britain over a 21-day period.
Read more on Visualising rail schedules using open data
Data synthesis is an active area of research for many organisations, including the Office for National Statistics (ONS). SynthGauge is a Python library that provides decision-makers with a range of metrics and visualisations for evaluating synthetic data.
Read more on Evaluating synthetic data using SynthGauge
To support the national fight against coronavirus (COVID-19) in March 2020, BT made aggregate, anonymised mobility data available to the UK Government. We quickly turned this into daily updates, with only one day’s delay between activity and the reporting of it.
Read more on Case study: responding to the coronavirus pandemic using aggregated BT mobility data
Understanding and monitoring the major influences on COVID-19 infection numbers in communities is essential to inform policy making and evaluate the impact of non-pharmaceutical interventions. We have developed a community-level analysis by assembling a large set of static and dynamic data for England.
Read more on Use of hybrid data to understand the community-level influences on coronavirus (COVID-19) incidence
Cattle are central to many people’s livelihoods in South Sudan, but there are very few recent data on the number of animals and their geographic distribution. Building on previous work, this blog explores new methods, based on convolutional neural networks, that can better distinguish cattle camps from similar landscape features.
Read more on Technical report: Predicting cattle camp locations in South Sudan from Sentinel 2 satellite imagery
On 27 March 2017, with an audience of UK and international data science leaders from across the public, private and academic sectors, a team of 8, some brilliant presentations, and a lot of excitement, the Data Science Campus was launched. 5 years later, Louisa Nolan shares what we have learned along the way.
Read more on The Data Science Campus – five years of data science for public good
Preference utilisation rates (PURs) measure the extent to which UK businesses make use of the zero or reduced tariffs available via free trade agreements (FTAs). In this work, we study the take-up of preferential tariffs by UK businesses between 2009 and 2019 and examine their trends and patterns.
Read more on Employing data science to analyse the use of preferential tariffs in free trade agreements
Part of our mission at the Data Science Campus (DSC) is to build data science capability across the public sector. In this project, which grew out of our Data Science Accelerator programme, we worked with the Royal Air Force (RAF) to help upskill their staff in the Python programming language and natural language processing (NLP) methods.
Read more on Technical Report: Project Mertz—novel use of historical RAF flight safety records
We have released a new version of our open codebase to enable the Traffic Cameras project to run on a single, stand-alone laptop as well as on Google Cloud Platform, making it more accessible to users without cloud infrastructure.
Read more on Estimating vehicle and pedestrian activity from town and city traffic cameras
Our FCDO hub’s mentorship model has helped countries, including Rwanda and Vanuatu, deliver in-house data science projects with significant impact, while also creating a flourishing sustainable ecosystem of data scientists.
Read more on Making an international impact with data science mentoring
This report summarises our work to meet widespread demand for better automated and higher quality solutions to this problem, by using machine learning (ML) methods to improve the match rate and accuracy of automatic coding to SIC and SOC classifications.
Read more on Automated coding of Standard Industrial and Occupational Classifications (SIC/SOC)
In this guest blog, Robyn Hunt from the ONS travel and tourism review team looks at using mobility data to model estimates from the International Passenger Survey.
Read more on Using Mobile Phone Data for Enhancing International Passenger Survey Traveller Statistics
Many countries in the developing world lack regular estimates of road traffic activity. To address this, we applied machine learning techniques to open-source satellite data to generate estimates of traffic volume in Kenya, East Africa.
Read more on Detecting Trucks in East Africa
In this report, we talk about our use of novel machine and deep learning techniques to explore changes in population behaviour and well-being during the coronavirus pandemic.
Read more on Using data science to explore changes in behaviour and well-being during the coronavirus (COVID-19) pandemic
We have been exploring whether machine learning methods and publicly available spatial datasets can be used to map and understand HIV risk in Côte d’Ivoire, West Africa.
Read more on Mapping HIV risk in Côte d’Ivoire, West Africa
We have been exploring Facebook data to understand changing patterns of mobility and the impact of lockdown restrictions.
Read more on Using Facebook data to understand changing mobility patterns
This blog outlines some of the work the Campus has been doing, bringing our skills together with new data sources to help inform responses to the coronavirus (COVID-19) pandemic.
Read more on Understanding mobility during the COVID-19 pandemic
A report on the quality and methodology of data used to produce extent and change of surface water statistics for reporting on Sustainable Development Goals indicator 6.6.1 for the UK.
Read more on Quality and methodology: Extent and change of surface water statistics
In this blog, we describe how we have assessed the quality of the novel Global Surface Water Explorer (GSWE) dataset to better understand its value and fitness-for-purpose, producing data to report the UK’s position on indicator 6.6.1.
Read more on Using satellite imagery to report changes to water bodies for SDG 6.6.1
We used text data from over 500,000 business websites to inform survey response-chasing efforts and gain insights into the impact of COVID-19.
Read more on Extracting text data from business website COVID-19 notices
On World Statistics Day 2020, our ONS FCDO Data Science Hub talk about exploring the use of satellite imagery to conduct a cattle census in South Sudan.
Read more on Counting cows in South Sudan
The latest update on our work with Barclays exploring if aggregated and anonymised card transactions data could provide significantly quicker, more granular insight into UK consumer spending.
Read more on Payments Data for Public Good
Large volumes of unstructured, free-text data, such as patent applications, contain potentially valuable information for policymakers. Here we describe the pipeline used to process such data sources.
Read more on pyGrams: An open source tool for discovering emerging terminology in large text datasets
In this report, we focus primarily on how the pyGrams tool can be used to analyse terms through time using the time stamps in document metadata.
Read more on Discovering emerging important terminology in large text datasets using pyGrams: a comparison between net growth and e-score methods
We introduce pyGrams – a new Python tool for extracting, visualising and identifying emerging terms in large document collections, such as patents.
Read more on Extracting, visualising and identifying emerging important terminology from patent collections
We are exploring new data sources such as Google’s Community Mobility Reports to strengthen the information that we have through surveys and other sources.
Read more on Supporting the response to coronavirus (COVID-19)
Our latest project investigates the use of machine learning techniques to predict missing energy performance scores. It also attempts to create a complete picture of the energy efficiency profile for domestic properties in Wales.
Read more on Can machine learning be used to predict energy performance scores?
This article summarises further development we have carried out to improve the timeliness and relevance of the shipping indicators, in response to feedback from our users.
Read more on Faster indicators of UK economic activity: more timely and relevant shipping indicators
How data science is helping to address the challenge of measuring the Sustainable Development Goals in the UK.
Read more on Data science for sustainable development
Extracting socio-economic signals from internet traffic data.
Read more on What can internet use tell us about our society and the economy?
Improving the current approach to identifying the proportion of vegetation for urban residential gardens in Great Britain.
Read more on Green spaces in residential gardens
So, June was notable for (i) more uninterrupted sunshine than I can remember in the UK; (ii) the short-lived excitement…
Read more on The month that was – June 2018
Where do the days go? It’s already time for my latest update – providing an overview of what happened in the…
Read more on The month that was – May 2018
Here’s the latest “monthly” update for the Data Science Campus, covering both March and April 2018. We took a break…
Read more on The months that were – March and April 2018
Here’s our latest monthly update for the Data Science Campus, covering February 2018. We’ve taken the liberty of stretching the…
Read more on The month that was – February 2018 (plus a bit)
1. Executive Summary The Evaluating Calorie Intake for Population Statistical Estimates (ECLIPSE) project was carried out by the Data Science…
Read more on Evaluating Calorie Intake
Following a brief midwinter break, here is our bumper two-month update, covering December 2017 and January 2018. The Campus continues…
Read more on The months that were – December 2017 and January 2018
Here’s our update for November, the month when the Campus won an award and had the privilege of welcoming numerous…
Read more on The month that was – November 2017
Using doubly labelled water to improve our understanding of the UK’s calorie intake. The problem We’re not very good at…
Read more on How much is the UK eating?
This is our monthly update for October 2017 – one of a series of regular snapshots that highlight some of…
Read more on The month that was – October 2017
This is our monthly update for September – one of a series of regular snapshots on what we’ve been doing…
Read more on The month that was – September 2017
Welcome to “the month that was”, providing you with a snapshot of what we’ve been doing this month to support…
Read more on The month that was – August 2017
Welcome to the first of our monthly updates – a regular snapshot on what we’ve been doing to support data…
Read more on The month that was – July 2017
It is 100 days since the Data Science Campus officially opened our doors to help the Office for National Statistics…
Read more on The first 100 days
The Office for National Statistics (ONS) is on a fantastic journey to better serve the UK by improving the quality…
Read more on Building and stocking the survey question library
The Office for National Statistics (ONS) Data Science Campus is keen to collaborate with partner organisations both in the UK…
Read more on Collaborating with Statistics Netherlands towards a data driven society
Urban forest refers to the trees and vegetation present in the streets, parks, gardens, balconies and even green roofs within…
Read more on Mapping the urban forest
The Data Science Campus celebrated a significant achievement on Monday 27 March, as it opened its doors for the first…
Read more on We are officially open for business
On 5 December 2016, the Office for National Statistics and The Alan Turing Institute signed a Memorandum of Understanding signalling their shared…
Read more on When the team met Turing