Transformation in the Ghana Statistical Service
Using data science and automation in the production of price statistics
Read more on Transformation in the Ghana Statistical Service
Synthetic data are artificially generated data that are made to resemble real-world, often sensitive, data.
Read more on Enabling Data Access through Privacy Preserving Synthetic Data
This report offers a thorough look into how we created a synthetic version of a large, linked, and confidential dataset while adhering to a formal privacy framework.
Read more on Synthesising the linked 2011 Census and deaths dataset while preserving its confidentiality
Transparency declarations have the potential to capture a sizeable amount of public sector spending.
Read more on Exploring trends in local government spending through transparency declarations
The LBD is at its core a re-usable longitudinal data spine, with each of its components providing the longitudinal link between business references. The data spine is a new concept.
Read more on The Longitudinal Business Database: Capturing the UK economy with new business microdata
We demonstrate the use of a range of freely available anonymised and aggregated novel datasets to estimate visitation counts to natural areas.
Read more on A data science approach to estimate the use of natural spaces: a feasibility study
We have been exploring how NLP techniques and LLMs could be used in the future to improve website search experience for end-users.
Read more on Using large language models (LLMs) to improve website search experience with StatsChat
We are planning to follow up on previous work we published earlier this year looking at public transport availability across the UK, by providing metrics for urban centres across the UK, as well as their international counterparts.
Read more on Comparing international transport performance in urban centres: upcoming work
We have produced an alternative estimation of the census travel to work matrices annually from 2012 to 2021, bridging the 10-year gap. This report looks at the technical specification of the gravity model, with the summary of input data and initial results.
Read more on Technical report: Estimation of travel to work matrices
Using novel modelling approaches, we have produced an alternative estimation of the travel to work matrices annually between 2012 and 2021, bridging the 10-year gap between Census data.
Read more on Estimation of travel to work matrices
In this guest blog, 10DS data science fellows Federico and Robin talk about working with the Campus to create packages that import, process and visualise DfT’s journey time statistics data.
Read more on Guest blog: Enhancing open-access data analysis: introducing the Journey Time Statistics R and Python packages
This report is part of a programme of work that the ONS has been doing with the Alan Turing Institute to explore the usefulness of various economic nowcasting methods, particularly the signature method.
Read more on Technical report: nowcasting UK household income using the new “signature” method
Nowcasting refers to generating estimates of the current (“now”) state of the economy. We investigate how signature methods can be useful in the context of economic nowcasting.
Read more on Helping decision makers understand the economy quickly through new methods
ONS Data Science Campus (DSC) and Defra’s Spatial Data Science team developed a novel solution for estimating the number of visitors to natural spaces across England.
Read more on Using open-source data to measure our engagement with the natural environment
We go behind our analysis on the use of microdata for the examination of preference tariff utilisation and take a deep dive into challenges of drawing together new administrative data sources to answer relevant policy questions.
Read more on The use of microdata for firm-level analysis of preference tariff utilisation in the UK: technical report
We show how shipping instructions can be used to map the trade routes of critical goods. This will help understand our reliance on global ports for accessing specific products, and draw insights on the impact of important events such as strikes.
Read more on Using new shipping data to improve government understanding of trade flows
We have published data to help researchers and local planners understand how public transport access varies across the UK, using open data and open-source software.
Read more on Using open data to understand hyperlocal differences in UK public transport availability
We used machine learning to develop a model to identify areas of trees from satellite images in eastern Uganda, where the Mbale Trees Programme has been running since 2010.
Read more on Using Sentinel-2 images to measure the change in tree coverage in eastern Uganda: what does it mean for the Mbale Trees Programme?
The Campus has delivered a suite of tools and code to North of Tyne Combined Authority (NTCA) to support their work to prioritise broadband infrastructure improvements in the region.
Read more on Using data science to help inform regional broadband investment
Campus data scientists achieved an impressive third place in the UN PET Lab hackathon, competing against nearly 200 international teams. Read about their experience and approach during the competition.
Read more on Campus in the top three at the UN PET Lab hackathon
This summer, the Campus hosted two interns as part of the HDR UK’s Black Internship Programme. Read more about their experience, embedded in real-life data science projects within our teams.
Read more on Improving career opportunities for Black data scientists in the UK
NHS Test and Trace asked us to extract insights from COVID-19 contact tracing app reviews to help them understand how the app could be improved.
Read more on Understanding NHS coronavirus (COVID-19) app reviews using topic modelling
Rail services can be affected by temporary factors such as unseasonably hot weather, industrial action and engineering works. We produced maps from open data showing service levels at every station in Great Britain over a 21-day period.
Read more on Visualising rail schedules using open data
Data synthesis is an active area of research for many organisations, including the Office for National Statistics (ONS). SynthGauge is a Python library that provides decision-makers with a range of metrics and visualisations for evaluating synthetic data.
Read more on Evaluating synthetic data using SynthGauge
To support the national fight against coronavirus (COVID-19) in March 2020, BT made aggregate, anonymised mobility data available to the UK Government. We quickly turned this into daily updates, with only one day’s delay between activity and the reporting of it.
Read more on Case study: responding to the coronavirus pandemic using aggregated BT mobility data
Understanding and monitoring the major influences on COVID-19 infection numbers in communities is essential to inform policy making and evaluate the impact of non-pharmaceutical interventions. We have developed a community-level analysis by assembling a large set of static and dynamic data for England.
Read more on Use of hybrid data to understand the community-level influences on coronavirus (COVID-19) incidence
The UK’s exit from the European Union created uncertainty about workers across a range of sectors, exacerbated by concerns over workers leaving the country and the impact on labour supply. The coronavirus (COVID-19) pandemic created additional and sudden changes, with sectors being affected heterogeneously and demand switching from services to goods.
Read more on Worker shortages: A window on labour demand during the coronavirus (COVID-19) pandemic
Preference utilisation rates (PURs) measure the extent to which UK businesses make use of the zero or reduced tariffs available via free trade agreements (FTAs). In this work, we study the take-up of preferential tariffs by UK businesses between 2009 and 2019 and examine their trends and patterns.
Read more on Employing data science to analyse the use of preferential tariffs in free trade agreements
The Data Science and Data Visualisation Accelerator mentoring programmes will open for applications in January 2022.
Read more on Boost your data science and visualisation skills in 2022!
In this guest blog, Robyn Hunt from the ONS travel and tourism review team looks at using mobility data to model estimates from the International Passenger Survey.
Read more on Using Mobile Phone Data for Enhancing International Passenger Survey Traveller Statistics
Many countries in the developing world lack regular estimates of road traffic activity. To address this, we applied machine learning techniques to open-source satellite data to generate estimates of traffic volume in Kenya, East Africa.
Read more on Detecting Trucks in East Africa
We have been exploring if machine learning methods and publicly available spatial datasets can be used to map and understand HIV risk in Côte d’Ivoire, West Africa.
Read more on Mapping HIV risk in Côte d’Ivoire, West Africa
We have been exploring Facebook data to understand changing patterns of mobility and the impact of lockdown restrictions.
Read more on Using Facebook data to understand changing mobility patterns
This blog outlines some of the work the Campus has been doing, bringing our skills together with new data sources to help inform responses to the coronavirus (COVID-19) pandemic.
Read more on Understanding mobility during the COVID-19 pandemic
In this blog, we describe how we have assessed the quality of the novel Global Surface Water Explorer (GSWE) dataset to better understand its value and fitness-for-purpose, producing data to report the UK’s position on indicator 6.6.1.
Read more on Using satellite imagery to report changes to water bodies for SDG 6.6.1
We used text data from over 500,000 business websites to inform survey response-chasing efforts and gain insights into the impact of COVID-19.
Read more on Extracting text data from business website COVID-19 notices
On World Statistics Day 2020, our ONS FCDO Data Science Hub talks about exploring the use of satellite imagery to conduct a cattle census in South Sudan.
Read more on Counting cows in South Sudan
The latest update on our work with Barclays exploring if aggregated and anonymised card transactions data could provide significantly quicker, more granular insight into UK consumer spending.
Read more on Payments Data for Public Good
We introduce pyGrams – a new Python tool for extracting, visualising and identifying emerging terms in large document collections, such as patents.
Read more on Extracting, visualising and identifying emerging important terminology from patent collections
We are exploring new data sources such as Google’s Community Mobility Reports to strengthen the information that we have through surveys and other sources.
Read more on Supporting the response to coronavirus (COVID-19)
Our latest project investigates the use of machine learning techniques to predict missing energy performance scores. It also attempts to create a complete picture of the energy efficiency profile for domestic properties in Wales.
Read more on Can machine learning be used to predict energy performance scores?
Tuli Amutenya is a Graduate Data Scientist from the Namibia Statistics Agency who has spent the last 6 months at ONS. Here, Tuli shares her experiences and the learning she will be taking home.
Read more on Bridging data science and statistics for international development
How data science is helping to address the challenge of measuring the Sustainable Development Goals in the UK.
Read more on Data science for sustainable development
Today we published the latest release of new, faster indicators of economic activity constructed from novel data sources.
Read more on Faster indicators of UK economic activity
Can non-standard data sources help us understand the relationship between management practices and high growth?
Read more on Can non-standard data sources help us understand the relationship between management practices and high growth?
Our new Schools Engagement Strategy, in partnership with Techniquest, aims to build science capital in primary-aged children.
Read more on Science outreach in the community
Data Science Accelerator mentee Ciaran Evans from the UK Hydrographic Office attempts to answer the question “What is the beach made of?”
Read more on Mapping beaches with the Data Science Accelerator Programme
Across the public sector, analysts can benefit from the Data Science Accelerator, which opened for a new cohort of applications this week.
Read more on Data Science Accelerator Programme
Every day our digital footprint is growing through simple activities like shopping, meaning nearly every industry is seeking data science skills.
Read more on Increased demand in the data science job market
We are currently using open data sets to develop a better understanding of loneliness in England. Are there places in England where people are more likely to be lonely and why?
Read more on Developing a Loneliness Prescription Index
Synthetic data mimics essential characteristics from the original dataset, creating new, substitute data that does not represent any real person, removing confidentiality requirements.
Read more on Synthetic data for public good and art
Social media is such a key part of everyday life and with the data readily available online, it has the potential to change the way we collect information to understand society. However, it is paramount that data sources used in the production of official statistics are accurate, relevant, unbiased, and most importantly, they must be used ethically.
Read more on Exploring the value of social media data
High growth businesses drive economic growth in the UK. Therefore, understanding the characteristics that may lead to high performance is an area of active research.
Read more on Understanding the characteristics of high growth companies using non-traditional data sources
We explore whether it is possible to classify financial corporations to their detailed Standard Industry Classification 2007 (SIC2007) using financial assets, liabilities and other firm-level data.
Read more on FinBins – granular classification of the UK’s financial sector
Over the last year, we have developed an experimental method for estimating the density of trees and vegetation present at…
Read more on How green is your street – visualising the urban forest
The ability for a household to access a range of services necessary for day-to-day living is of great importance to…
Read more on Access to services using multimodal transport networks
Urban trees provide a wide range of environmental, social and economic benefits, such as improving air quality and are known…
Read more on Mapping the urban forest at street level
The Data Science Campus has begun a research project with Hafod, a social residential tenant organisation, to explore the flows…
Read more on Caring for the future – working with Hafod
Project summary: Many datasets contain variables that have been collected as free-text in an uncontrolled way. In the case where…
Read more on optimus – turning free-text lists into hierarchical datasets
The maritime freight industry is of critical importance to the economic output of the UK, with almost half a billion…
Read more on Analysing port and shipping operations using big data
1. Executive Summary: The Evaluating Calorie Intake for Population Statistical Estimates (ECLIPSE) project was carried out by the Data Science…
Read more on Evaluating Calorie Intake
The Office for National Statistics (ONS) publishes economic information at both a national and regional level on a regular basis. Steven…
Read more on Big fish, little fish – understanding local economies using interactive visualisations
Using doubly labelled water to improve our understanding of the UK’s calorie intake. The problem: We’re not very good at…
Read more on How much is the UK eating?
Can we better understand traffic at British ports and can we use shipping as an early indicator for gross domestic…
Read more on Tracking ships to understand trade
The Office for National Statistics (ONS) is on a fantastic journey to better serve the UK by improving the quality…
Read more on Building and stocking the survey question library
Can we improve self-reported estimates of energy (calorie) consumption? Background: The 2016 ‘Counting Calories’ report highlighted a disparity between self-reported…
Read more on Evaluating calorie intake for population statistical estimates (ECLIPSE)
Urban forest refers to the trees and vegetation present in the streets, parks, gardens, balconies and even green roofs within…
Read more on Mapping the urban forest
Can a national urban vegetation dataset be generated using computer vision and machine learning techniques? Background: The value of urban…
Read more on Measuring the urban forest