The Office for National Statistics’s (ONS) Data Science Campus has a vital role in mobilising the power of data to help the UK make better decisions. At the launch of the new Campus in Newport in March 2017, I said that it would innovate with new methods and data sources providing opportunities to improve existing statistics and develop new outputs by working across government, industry, academia and charities in the UK and internationally. I also highlighted the wide range of training and learning programmes that the Campus would offer to build data science capability across the UK.
I have been greatly encouraged by what has been achieved so far in realising our vision for a new centre for the development and application of data science techniques. I am proud to introduce this first report on the progress made.
As well as recruiting experienced data scientists into the Campus, ONS has developed data science skills across the analytical professions throughout government. I was particularly proud to attend the graduation ceremony for the UK’s first data analytics apprentices in Newport in November 2018. The eight graduates are joining a unique career pathway that will see them use cutting-edge tools and technologies to provide statistics and insights to help shape policy across the country.
Professor Sir Charles Bean’s 2016 Review of Economic Statistics envisaged the Campus recruiting a cadre of data scientists along with active learning and experimentation facilitated through collaboration with relevant partners. The Campus has opened up many possibilities of working together with respected leaders in the field of data science, such as the Alan Turing Institute. These partnerships enable the Campus to develop research, innovate, and improve and exchange knowledge and skills.
Through its research programme, the Campus is actively helping to produce better statistics and is supporting better policy and operational decisions across a diverse span of the UK public sector and beyond, as the case studies in this report demonstrate.
I look forward to the Campus building on the tremendous achievements made so far and playing an increasingly pivotal role in driving forward the advancement of data science for public good.
It was a huge privilege for me to join ONS two years ago as Managing Director of the Data Science Campus. What a unique opportunity to put data science at the heart of decision making in the UK and help influence the most important policy issues facing the country.
We strive to achieve this in three ways:
- applying data science tools, methods and practices to strengthen statistics and evidence for policymaking
- innovating by assessing the value of new data sources and techniques
- improving data science skills across ONS and beyond
We have worked across government to deliver dozens of data science projects covering a broad span of topics including developing faster and more granular economic indicators, understanding trade, contributing to the public health debate and monitoring sustainable development. We have applied new data science tools, techniques and practices and investigated some of the new types of data emerging from the data revolution.
Alongside data science delivery, we are growing capacity and supporting the data science community. We have a target to deliver 500 qualified data scientists for government by 2021 and, by working with ONS colleagues, the UK public sector and international statistics agencies, we expect to exceed this. We have put in place a wide-ranging development programme from school to post-doctoral level, which is attracting global interest.
Finally, how we work is important. By growing an experienced, diverse and creative data science group inside government we can demonstrate the value of government having direct access to world-class data science skills. By working in partnership with academia, industry and civil society organisations we can improve UK public sector access to data and data science skills. By developing a collaborative culture of working openly and supporting reuse of our work, we can maximise the impact of our programme.
I’m really proud of the great start we have made at the Campus. There is much more to be done. The next and critically important phase is to continue scaling the impact of data science across government. So as well as reviewing what we’ve achieved so far, the final section of this report looks forward to the Campus making a vital contribution to the challenges ahead. I can’t wait!
Data Science Campus
3. Who we are
The Data Science Campus (the Campus) is part of the Office for National Statistics (ONS), which is the government’s National Statistical Institute and the UK’s largest independent producer of official statistics. The Campus was created in response to the review of economic statistics published in 2016 by Professor Sir Charles Bean. The review recommended that ONS set up a national hub for data science to harness the power of big data to help Britain make better decisions and improve lives.
Across ONS, data is being mobilised to help Britain make better decisions and improve lives. Improvements in economic statistics, especially productivity, financial flows, prices and trade, mean that ONS is providing the statistics and data that decision-makers need for a modern economy. The upcoming 2021 census will be the first of its kind, “digital first”, drawing on additional sources of information to create the most comprehensive picture of today’s society. The creation of the Campus is part of this data-driven transformation of statistics in the UK.
The Campus launched in March 2017 with a core of well-qualified professionals, recruited mainly from industry and academia. Today, we have a team of nearly 70 experts actively delivering our ambitious research and academic programmes.
Our aim is to strengthen ONS and government expertise in data science across the UK, enabling quick, clear and relevant insight on the public issues at hand. Changes in society and technology have led to an explosion in data, making it more readily available and in richer and more complex forms. These developments mean that ONS has many opportunities to examine existing data and access new data sources and, through the work of the Campus, apply innovative statistical tools to these data sources to help with a better understanding of our society, our economy and our own lives.
What we do
As a response to these opportunities, ONS’s Data Science Campus is delivering innovative research that will positively impact and enhance capability across the public sector using data science, machine learning and artificial intelligence. We are building new skills and applying new tools, methods and practices to support government decision-making and the UK Statistics Authority’s Better Statistics, Better Decisions strategy.
Delivering data science projects
The Campus has delivered a series of high-profile data science projects that have provided valuable new insights for ONS and other stakeholders across government and beyond, making an impact on public policy over the last two years. Our programme spans the economy, trade, the environment and society. We publish technical reports on our projects and routinely make the code available for others to use via GitHub.
Partnership in action
We worked with data scientists at the Department for International Trade (DIT) to analyse over half a million responses from their free trade agreement consultation data. We used a variety of advanced text mining and text summarisation techniques to provide insights that are being used by DIT policy teams to inform decision making, providing invaluable support for delivering free trade agreement outcomes in DIT and other government departments.
The project process at the Campus allows teams to work flexibly while guarding against risks such as delays in accessing data and poor-quality data. A vital component of the project process is a review to ensure that all projects meet the stringent privacy and ethics standards set out by ONS. The agile nature of the project process also enables the Campus to mitigate the risks of projects becoming unfeasible or not progressing at pace to an impactful outcome for the stakeholder. We hold regular check-ups at Project Board meetings, and maintain regular contact with stakeholders.
With a diverse group of data scientists, lecturers and trainers, academic programme and partnership managers, the Knowledge Exchange team delivers a range of capacity building activities that support the growth of data science across the public sector.
From data science and artificial intelligence training and mentoring programmes – developed and delivered by our in-house Data Science Faculty – to apprenticeships, graduate placements and a bespoke MSc programme developed with university partners, the team acts as a conduit for leading edge data science skills to enter the public sector.
We work with UK and international partners, drawing on their expertise and resources and sharing the benefits of our own education and research programmes. Our partnerships with industry, academia and international organisations were vital in jump-starting our operation, and we continue to draw on their valuable support as the Campus grows.
Partnership in action
The Campus collaborated with HSBC and The Alan Turing Institute on The Turing-HSBC-ONS Economic Data Science Awards 2018. A programme of nine economic data science projects has been awarded a total of £750,000 in funding to combine world-leading science with the potential for high impact outside academia to improve our understanding of how the economy works.
We’re making an impact
The goal of the Campus is to investigate the use of new data sources, including administrative data and big data, for public good and to help build data science capability for the benefit of the UK. A new generation of tools and technologies is being used to exploit the growth and availability of these new data sources. We employ innovative methods to provide rich, informed measurement and analyses on the economy, the global environment and wider society.
Our strategic objectives for 2019
Deliver research outputs
- Deliver 30 data science projects for UK public good.
- Publish case studies of new products and outputs used by ONS or elsewhere in government to provide greater insight to decision-makers.
- Set up new mechanisms to monitor the use and impact of Campus outputs in ONS and across government and record lessons learned across our projects.
Develop data science methods
- Publish new methods, code and analytical outputs for use by the wider community.
- Assess the value to ONS of non-traditional data sources and new technologies.
Grow data science skills in ONS and across government analytical professions
- Provide government with 150 qualified data scientists in 2019 (500 by 2021).
- Help departments strengthen their data science skills and understanding.
- Mentor projects on the Data Science Accelerator Programme or the ONS Data Science Academy.
- Support civil servants on the MSc in Data Analytics for Government and through our partners deliver continuous professional development (CPD) modules.
Form partnerships that provide access to new data and attract additional resources
- Increase the number of partnership agreements with universities, research institutes, international agencies and commercial businesses that lead to collaborative research or capability building activities.
- Harness commercial and other data sources and academic expertise to improve insight for public policy.
- Sponsor PhD and MSc students to carry out research in support of Campus objectives.
- Build data science leadership across statistics agencies to enhance knowledge sharing.
Our purpose and mission
We apply data science and build skills for public good across the UK and internationally.
We work at the frontier of data science and AI – building skills and applying tools, methods and practices – to create new understanding and improve decision-making for public good.
Our journey so far…
4. Delivering data science
Data science is at the intersection of mathematics and statistics, computer science and domain-specific expertise. For the Campus, it’s about improving our understanding of the UK’s economy, communities and people, using novel data sources and techniques such as machine learning and natural language processing to better inform decision-making for the public good.
We use a range of innovative data sources, methods and approaches to deliver new outputs and products for ONS and elsewhere in government. Our projects span a range of functions of data science across diverse sectors and themes.
Main sources, approaches and themes of Data Science Campus projects
Our approach to projects
Our ethos is simple: we want to work on projects that create the greatest public policy or delivery impact, or significantly improve our learning. To prioritise project requests, we ask:
- does it add value?
- does it increase our understanding of the UK’s economy and society?
- is it a stakeholder priority?
- what will we learn?
- can the learning be applied to other problems?
- is it ethical?
- is it possible?
- does it have an owner?
We follow a rigorous yet flexible process to deliver our projects, with the stakeholder and their desired outcomes always at the heart of that process. Wherever we can, we publish the findings and make the code available for others to use.
Figure 1: Project delivery process at the Data Science Campus
Table 1: Current and recent projects – Better statistics and data
|Novel approaches to the Living Costs and Food Survey||Explore how we can apply computer vision and natural language processing techniques to the ONS Living Costs and Food Survey […]||ONS Social Surveys|
|Sustainable Development Goals||Automatically identify bodies of water and analyse change over time, as well as explore access to all-weather roads.||ONS Sustainable Development Goals|
|Classification of financial services||Explore classification of financial corporations to their detailed Standard Industry Classification 2007 using firm-level data.||ONS Economic Statistics Group|
|Mapping the urban forest||Assess the contribution of greenery in towns and cities to the UK’s natural capital by creating a local-level dataset from classifying local street images and using image analysis and deep learning.||ONS Natural Capital|
|UK garden green space||Generate a better estimate of the green space within UK gardens, to improve the accuracy of ONS estimates of natural capital.||ONS Natural Capital|
|UN Global Platform – mapping the urban forest||Deploy our image processing pipeline used in the Mapping the urban forest project on to Algorithmia – a distributed computing environment used by the UN Global Platform project.||UN Global Platform|
|How green is your street?||Use vegetation index data produced by the Mapping the urban forest project to produce a data journalism and visualisation output, with the ONS Digital Publishing team.||ONS Digital Publishing|
|Payments data for public good||Work with Barclays to analyse payments data for public good.||Barclays, ONS Economic Statistics Group|
|Approaches for producing granular trade statistics||Develop a tool to support the production of more granular international trade in services (ITIS) output tables while meeting standards in disclosure control and accuracy.||ONS Economic Statistics Group|
The Campus and Barclays are collaborating to investigate new ways of using payments data for public good, including analysis of the night-time economy and developing faster economic indicators to inform economic and monetary policy.
Barclays and the Campus are collaborating to explore the rich potential of payments data for public good. We want to develop new and enhanced economic statistics taking advantage of the rich detail and timeliness of payments data. We are ensuring that we take into account privacy and ethical issues, by using only anonymised, aggregated statistics in which individuals’ bank details cannot be identified.
The Campus hosted a knowledge sharing day with apprentices from Barclays and ONS. Following that, Barclays hosted a joint hackathon at their RISE London venue. The hackathon brought together 50 economists, developers, data scientists and analysts. The hackathon teams investigated ways to enhance or supplement economic statistics and the ideas of the winning teams have been taken forward in an ongoing collaboration.
This collaboration has led to two important pilot projects.
The first project looks at the night-time economy, and what payments data categorised by region and time of day can tell us about the scale and diversity of economic activity taking place at night.
The second project focuses on developing faster economic indicators. Payments data generated and collected very quickly could feed into the creation of a leading indicator for economic health. Using techniques such as anomaly detection and predictive models, the joint ONS and Barclays team – including economists, statisticians and data scientists – hope to produce new and important indexes and indicators to feed into national and regional economic policy.
The project will enhance local authority and government decision-making in a number of ways. Bank and card transactions and financial data offer a rich source of information about the economy and the project will help timely regional economic indicators to be constructed, which are important for informing effective economic and monetary policy. In addition, understanding the night-time economy enhances local authority decision-making about the local economy.
The urban forest project developed a tool using artificial intelligence to detect trees and vegetation in Google StreetView images. Our work could be used to improve estimates of natural capital by ONS.
We used artificial intelligence to create a tool to detect urban trees and vegetation on the streets of 112 major towns and cities in the UK. Urban trees provide a wide range of environmental, social and economic benefits, such as improving air quality, and are known to be associated with lower crime levels and greater community cohesion. Within ONS, the Natural Capital Accounts team wanted to create an inventory of natural capital across the UK with a focus on detecting urban street vegetation.
We used computer vision technology, typically deployed in the emerging field of self-driving cars, to create a tool to detect trees and vegetation along roads. We made use of the Google StreetView platform as a data source to acquire street-level imagery for all major towns and cities in Great Britain at 10-metre intervals. The tree detection algorithm gives a score to each image of the density of vegetation present, accurate to over 90%. The data source and cutting-edge algorithms used in this work showcased the Campus’s ability to operate at the very forefront of data science and AI within ONS.
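The scoring step described above can be sketched as follows. This is a minimal illustration only, assuming a detection model has already produced a per-pixel label mask for each street-level image; the label value, the toy mask and the function name are hypothetical and are not taken from the Campus's published code.

```python
from typing import List

VEGETATION = 1  # hypothetical label assigned to tree and vegetation pixels


def vegetation_score(mask: List[List[int]]) -> float:
    """Return the share of pixels labelled as vegetation, as a percentage."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask contains no pixels")
    green = sum(1 for row in mask for label in row if label == VEGETATION)
    return 100.0 * green / total


# Toy 4x4 mask in which 6 of the 16 pixels are labelled as vegetation.
toy_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]
score = vegetation_score(toy_mask)
```

Scores of this kind, computed for images sampled at regular intervals along a road, could then be averaged to give a street-level figure.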
We demonstrated our ability to describe, in detail, the visual components of a city in high resolution, including building density, number of cars, bicycles, people, signage, street furniture and other objects that describe an urban scene. This is a highly interesting geospatial dataset that could be used for a range of public policy applications.
This work could be used in the urban analytics domain, to feed into natural capital estimates. There are numerous studies linking green space to various social, environmental and economic indicators. For example, exploring the relationship between green space and other factors, such as indicators of well-being, offers an exciting direction for future research.
In collaboration with ONS Digital Publishing, an online “How green is your street?” interactive tool has been published, allowing visitors to the ONS website to check postcodes in Cardiff and Newport and see the percentage of greenery on their street.
Table 2: Current and recent projects – Insight and analysis
|[…] lorries in cross-border […]||[…] of freight at UK ports by processing unlabelled list data that are collected manually in lorry manifests with no […] to allow aggregation of data.||Department for Environment, Food and Rural Affairs|
|[…]||Enhance understanding of public access to important services by creating a tool using multimodal (private and public) […]||[…]|
|[…]||Assess service accessibility by deploying our initial access to services project for generic use across the UK.||[…]|
|Risk factors for loneliness||Determine the risk factors for loneliness across the UK with good geography, using health data as an outcome measure of loneliness and treating loneliness as a hidden variable.||ONS Public Policy|
|[…]||Explore new sources of evidence for indicators of housing quality, for example, energy efficiency.||[…]|
|[…]||Investigate what data science methods could be used to create the “go-to” source of materials information in the UK, open for […]||Department for Business, Energy and Industrial Strategy, Department for Environment, Food and Rural Affairs|
|Flows of tenants||Explore mathematical models to simulate tenant flows, and clustering techniques to represent the different patterns of support and care provided.||[…] Scotland, ONS Public […]|
|Identifying emergent trends from patent data||Identify ground-breaking products and technologies by applying machine learning to patents and data on emerging technologies.||Department for Business, Energy and Industrial Strategy, Cabinet Office, Intellectual Property Office, ONS|
|Understanding characteristics of high-growth firms||Explore how non-traditional data sources such as geographical features and web-scraped data can be combined with more conventional business data to help understand the characteristics and behaviours of high-growth companies.||Department for Business, Energy and Industrial Strategy|
|Extracting economic signals from internet bandwidth consumption data||Explore if it is possible to extract economic signals and insights from publicly available internet bandwidth consumption data.||ONS Economic Statistics Group|
|Economic impact of the UK fishing industry on local areas||Publish an online, interactive tool for policymakers, using ONS data to produce local-level economic indicators and data visualisations for the fishing industry.||Department for Environment, Food and Rural Affairs, Scottish Government, Welsh Government|
|Analysis of Automatic Identification System (AIS) data to understand shipping and ports||Explore the operation, use and relationships between ports in the UK at a macro level and the behaviour and operational characteristics of ships at a micro level.||Maritime and Coastguard Agency, Department for International Trade|
|Evaluating calorie intake||Support public health policies by improving our understanding of how much the UK is eating.||ONS Health Analysis and Life Events|
This project, which identified emerging trends in patent data, enabled the Cabinet Office and the Department for Business, Energy and Industrial Strategy (BEIS) to analyse the innovation landscape using data from patent applications. It also used artificial intelligence to enable BEIS to better understand the progress of the Clean Growth Grand Challenge policy.
Our data scientists developed an information retrieval tool to retrieve popular technical terminology from patent abstracts and applied a method to quantify this. Thousands of new patents are granted each year, with thousands more applications filed unsuccessfully. The Cabinet Office and the Department for Business, Energy and Industrial Strategy (BEIS) want to use the data entries from patent applications to analyse the innovation landscape in the UK and globally.
BEIS also asked the Campus to look at grant documents to help them understand what proportion of government grants given to businesses support the Clean Growth Grand Challenge – one of the four Grand Challenges from the Industrial Strategy.
We developed an open source tool using a popular information retrieval method in natural language processing to extract popular terminology and other useful information from large document collections, such as patent applications.
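To illustrate how term extraction of this kind can work, the sketch below scores terms with TF-IDF (term frequency-inverse document frequency), one widely used information retrieval method. The report does not name the method used, so the choice of TF-IDF, the toy abstracts and the function names are all assumptions made for illustration, not a description of the Campus tool.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_scores(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Score each term in each tokenised document by TF-IDF."""
    n_docs = len(docs)
    # Document frequency: the number of documents in which each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(docs: List[List[str]], k: int = 3) -> List[str]:
    """Rank terms by their summed TF-IDF score across the collection."""
    totals: Counter = Counter()
    for doc_scores in tfidf_scores(docs):
        totals.update(doc_scores)
    return [term for term, _ in totals.most_common(k)]


# Made-up stand-ins for patent abstracts.
abstracts = [
    "a battery cell with a solid electrolyte".split(),
    "a solar panel with a battery store".split(),
    "a method of making a solid electrolyte".split(),
]
scores = tfidf_scores(abstracts)
ranked = top_terms(abstracts, k=3)
```

A term that appears in every document, such as "a" here, scores zero, which is why the method tends to surface distinctive terminology rather than common words.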
We were also able to apply artificial intelligence, information retrieval and word embedding analysis to classify a data set of 70,000 business grants, based on whether they did or did not support the Clean Growth Grand Challenge. Our classification tool achieved 87% accuracy when checked against a manually classified sample of 250 grants.
BEIS data scientists have installed the patent technical term extraction tool on their servers and we have delivered training to their analysts on how to use these solutions to help shape policy decisions. The same text extraction software was also used on the ONS results for the Civil Service People Survey to identify key trends and terms being used in free-text sections of the questionnaire.
The project has also enabled BEIS to better understand the progress of the Clean Growth Grand Challenge policy, and will enable the identification and classification of clean growth-related applications in line with the Industrial Strategy without the need for manual processing.
This project established a method to process messy, unstructured data on the contents of lorries passing through ports, using natural language processing. By enabling products to be grouped into categories and subgroups, the output has enabled the Department for Environment, Food and Rural Affairs to gain insights that were previously unavailable.
Our data scientists used natural language processing (NLP) and a pre-trained word-embedding model (FastText) to group item descriptions based on meanings of free-text words rather than syntactic similarity. Lorries travelling through UK ports provide short descriptions of their contents but entries are often written in the driver’s shorthand and include misspellings or typographical errors. The Department for Environment, Food and Rural Affairs (DEFRA) wants to better understand UK trade on sea routes and the type of goods travelling through UK ports but these insights were not possible from the pre-existing unstructured data.
Using NLP, we developed a tool that groups items into categories such as chemicals, vehicles, food, building materials and metals. The model can even identify subgroups, such as classifying cars by country of manufacture.
We were then able to create a pipeline, which used an optimised version of FastText to cluster items, automatically generate cluster labels and organise the data into a hierarchical dataset with named clusters and subclusters.
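The grouping idea can be sketched in a few lines. This is a simplified illustration, not the Campus pipeline: the three-dimensional vectors below are made-up stand-ins for a pre-trained word-embedding model, the greedy clustering rule and the similarity threshold are assumptions, and the sketch does not reproduce FastText's subword handling of misspellings, the automatic cluster labels or the hierarchy of subclusters.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Made-up vectors standing in for a pre-trained word-embedding model.
TOY_VECTORS: Dict[str, Sequence[float]] = {
    "apples": (0.9, 0.1, 0.0),
    "bananas": (0.8, 0.2, 0.0),
    "frozen": (0.6, 0.1, 0.1),
    "cars": (0.0, 0.9, 0.1),
    "vans": (0.1, 0.8, 0.1),
    "steel": (0.0, 0.1, 0.9),
    "coils": (0.1, 0.2, 0.8),
}


def embed(description: str) -> List[float]:
    """Average the vectors of the known words in a description."""
    vectors = [TOY_VECTORS[w] for w in description.lower().split() if w in TOY_VECTORS]
    if not vectors:
        raise ValueError(f"no known words in {description!r}")
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster(descriptions: List[str], threshold: float = 0.8) -> List[List[str]]:
    """Greedy clustering: join the first cluster whose seed item is similar enough."""
    clusters: List[Tuple[List[float], List[str]]] = []
    for text in descriptions:
        vec = embed(text)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(text)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append((vec, [text]))
    return [members for _, members in clusters]


groups = cluster(["frozen apples", "cars", "bananas", "steel coils", "vans"])
```

With these toy vectors, descriptions are grouped by what the words mean rather than by how they are spelled, so "bananas" joins "frozen apples" even though the strings share no words.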
The project has provided DEFRA with previously inaccessible insights into the movement of goods through ports, with analysis carried out on 12 major shipping routes around the UK to date. The next stage of the project is to work with DEFRA to evaluate the performance of the pipeline and apply it to other datasets, which will allow the Campus team to refine the methodology.
The Campus is making the tool open source and a number of other organisations have expressed an interest in using the method to apply to projects which involve similar free-text variables.
We analysed Automatic Identification System (AIS) data to understand and predict the movement of ships in and around UK ports. These insights could be used by stakeholders to analyse port statistics and model port relationships to predict delays and maximise efficiency.
This project explores the operation, use and relationships between ports in the UK at a macro level and the behaviour and operational characteristics of ships at a micro level. A team at the Campus began this work in conjunction with Statistics Netherlands, while the Maritime and Coastguard Agency (MCA) provided the necessary Automatic Identification System (AIS) and Consolidated European Reporting System (CERS) data. Together, these two datasets provide frequent snapshots of the position, speed, heading, bearing and rate of turn for each ship, as well as details such as destination port and expected time of arrival for the voyage of each ship.
We developed functions to extract, decode, sort and filter AIS messages, and used machine learning algorithms to classify the ships’ moving behaviour and expected arrival times. These analyses allow the prediction of inbound delays and the modelling of the interactions between UK and international ports.
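As an illustration of the decoding step, the sketch below unpacks a single-fragment AIS position report (message types 1 to 3) from its 6-bit ASCII payload. It is a minimal sketch rather than the Campus code: the function names are hypothetical, the sample sentence is a widely circulated test message and not project data, and the sketch skips checksum validation, multi-fragment messages and the special "not available" field values.

```python
from typing import Dict


def payload_to_bits(payload: str) -> str:
    """Expand the 6-bit ASCII armouring of an AIVDM payload into a bit string."""
    bits = []
    for char in payload:
        value = ord(char) - 48
        if value > 40:
            value -= 8
        bits.append(format(value, "06b"))
    return "".join(bits)


def to_int(bits: str, signed: bool = False) -> int:
    """Interpret a bit string as an unsigned or two's-complement integer."""
    value = int(bits, 2)
    if signed and bits[0] == "1":
        value -= 1 << len(bits)
    return value


def decode_position_report(sentence: str) -> Dict[str, float]:
    """Decode a single-fragment AIS position report (message types 1 to 3)."""
    fields = sentence.split(",")
    bits = payload_to_bits(fields[5])  # the sixth field carries the payload
    msg_type = to_int(bits[0:6])
    if msg_type not in (1, 2, 3):
        raise ValueError(f"not a position report: type {msg_type}")
    return {
        "type": msg_type,
        "mmsi": to_int(bits[8:38]),
        "speed_knots": to_int(bits[50:60]) / 10,
        # Coordinates are stored in units of 1/10000 of a minute of arc.
        "longitude": to_int(bits[61:89], signed=True) / 600000,
        "latitude": to_int(bits[89:116], signed=True) / 600000,
        "course": to_int(bits[116:128]) / 10,
        "heading": to_int(bits[128:137]),
    }


sample = "!AIVDM,1,1,,B,177KQJ5000G?tO`K>RA1wUbN0TKH,0*5C"
report = decode_position_report(sample)
```

Decoded records like this one can then be sorted by ship identifier and timestamp and filtered by area before any modelling is attempted.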
The maritime freight industry is of critical importance to the economic output of the UK, with almost half a billion tonnes of freight handled by UK ports in 2017, according to the Department for Transport. As the demands upon shipping freight are likely to increase in the future, a more in-depth understanding of the UK maritime shipping industry becomes increasingly important. Both the MCA and Department for International Trade will use the outputs of this project to:
- process big data containing location of ships and reports containing itinerary information
- analyse port statistics based on several criteria
- model port relationships between UK and international ports
- classify ship travelling behaviour and predict delayed arrivals of freight ships
The Department for Transport is reusing part of our code to decode AIS messages and generate port statistics that are currently produced by an external organisation.
In addition to these direct impacts, the Campus developed its ability to work with big data. We are applying the learning from this to work on projects developing faster economic indicators, including analysis of trade in goods and estimates of gross domestic product.
The ECLIPSE project supports public health policy by improving our understanding of how much the UK is eating. We analysed the discrepancy between self-reported calorie intake data and actual calorie intake. Our findings were reported extensively on the front pages of several UK national newspapers and attracted mainstream broadcast media attention. The methods and code were replicated by Public Health England to analyse the latest available data.
Project ECLIPSE (Evaluating Calorie Intake for Population Statistical Estimates) researched methods for improving estimates of the national population’s energy consumption and explored data sources from within and outside of ONS.
An earlier report by the Behavioural Insights Team highlighted a disparity between self-reported calorie intake in official statistics and UK obesity levels. The report showed that reported calorie intake has decreased over time while obesity levels have risen.
Doubly labelled water (DLW) measures – a physical and far more accurate measure than reporting by individuals – were used to measure energy expenditure as an approximation of true calorie intake. The ECLIPSE project used this energy expenditure data from the National Diet and Nutrition Survey to understand the extent of the apparent reporting error by individuals and the factors that affect under-reporting.
Data scientists at the Campus found that the average under-reporting error for participants in the dataset was 32%. The DLW study also showed no statistical evidence of a decline over time in calorie consumption, supporting the conclusions of the Behavioural Insights Team.
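One way to read the under-reporting figure is as the shortfall of reported intake relative to measured energy expenditure, averaged over participants. The sketch below works through that calculation under this assumption; the report does not give the exact formula, and the participant values are made up for illustration rather than drawn from the National Diet and Nutrition Survey.

```python
from statistics import mean
from typing import List, Tuple


def under_reporting(reported_kcal: float, expenditure_kcal: float) -> float:
    """Shortfall of reported intake as a percentage of measured energy expenditure."""
    if expenditure_kcal <= 0:
        raise ValueError("energy expenditure must be positive")
    return 100.0 * (expenditure_kcal - reported_kcal) / expenditure_kcal


# Hypothetical (reported intake, DLW energy expenditure) pairs in kcal per day.
participants: List[Tuple[float, float]] = [
    (1800.0, 2400.0),
    (2000.0, 2500.0),
    (2100.0, 3000.0),
]
errors = [under_reporting(reported, spent) for reported, spent in participants]
average_error = mean(errors)
```

For these made-up values the individual errors are 25%, 20% and 30%, giving an average of 25%.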
This study raised interest around the issue of obesity and calorie intake, and received extensive and long-term press attention, featuring on the front pages of The Times and The Daily Telegraph newspapers, as well as receiving coverage on the BBC News website, Sky and The Guardian.
Collaboration with topic experts across government increased the impact of this project – we provided our code to Public Health England who have reused it to validate the findings on later datasets, and they were featured on the Today Programme on BBC Radio 4.
Our approach provides a practical solution for improving the accuracy of the calorie intake estimates at a national level, making use of an existing government data source.
This project delivered new insights for social policy in Wales by developing a tool to analyse access to services using public transport. The tool uses open source data to calculate transport times to and from public services. The Welsh Government is exploring the potential of our tool to improve estimations for the Welsh Index of Multiple Deprivation.
The public transport access to services project developed a new tool to calculate travel times to and from public services including the nearest food shop, pharmacy, post office and public library. The Welsh Government asked the Campus to create an application that would enable it to improve its estimations for the access to services component of the Welsh Index of Multiple Deprivation (WIMD), the official measurement of relative deprivation for small areas in Wales.
We used the open source route planner, OpenTripPlanner (OTP) to host public transport timetable data, and designed a tool in R to extract information from OTP about the route. We used the information to create informative visualisations that show the area that is accessible from a specific location within set timescales.
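The idea of finding the area that is accessible from a location within a set time can be sketched with a shortest-path search over a small travel-time network. This is a minimal illustration in Python, not the Campus tool, which was written in R and queries OpenTripPlanner. The network, stop names and journey times below are hypothetical.

```python
import heapq


def reachable_within(graph, origin, max_minutes):
    """Return {place: minutes} for every place reachable within the time budget.

    graph maps a place to a list of (neighbour, travel_minutes) pairs.
    Uses Dijkstra's algorithm and ignores any route that exceeds the budget.
    """
    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        elapsed, place = heapq.heappop(queue)
        if elapsed > best.get(place, float("inf")):
            continue  # a faster route to this place was already found
        for neighbour, minutes in graph.get(place, []):
            total = elapsed + minutes
            if total <= max_minutes and total < best.get(neighbour, float("inf")):
                best[neighbour] = total
                heapq.heappush(queue, (total, neighbour))
    return best


# Hypothetical network: walking and bus legs with travel times in minutes.
toy_network = {
    "home": [("bus_stop", 5)],
    "bus_stop": [("pharmacy", 12), ("town_centre", 20)],
    "town_centre": [("library", 4), ("post_office", 15)],
}

within_30 = reachable_within(toy_network, "home", 30)
print(within_30)
```

A real timetable-based router also has to account for departure times and waiting, which this fixed-time sketch leaves out.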
The Welsh Government is currently scoping our tool’s use for improving WIMD measurements by increasing the frequency of estimations and enabling real-time monitoring. Other Welsh Government initiatives, such as the Valleys Taskforce programme, the Cadw website and mobile app, and the South Wales Metro programme, could also benefit from this work in the future, as it provides insights into public transport provision.
The next stage for this project is to open up the tool for online access by multiple users, including businesses and the public, to analyse location data.
Table 3: Current and recent projects – Operations and automation
|Creating a business prices processing prototype system in Python||Develop a prototype for processing business prices in Python, rather than in proprietary systems.||ONS Economic Statistics Group|
|Automated report generation||Create a pipeline for automated report generation with access to online application programming interfaces (APIs).||Department for Exiting the European Union|
|Synthetic data using generative models||Create synthetic data using neural networks to enable safer data sharing between organisations, and augment incomplete data.||Department for Business, Energy and Industrial Strategy, ONS Methodology|
|Improving the ONS search engine||Investigate the challenges of searching the ONS website and make recommendations||ONS Digital Publishing|
Table 4: Current and recent projects – Other government projects
|Data science support across government||Four further projects have provided vital support to ONS and other parts of government in areas such as reporting platforms, corporate analytics and developing economic indicators.||ONS, other government departments and Devolved Administrations|
This project assessed traditional and non-traditional methods for creating synthetic data. High-quality synthetic data can be used to improve the speed and security with which data are shared between organisations and to increase privacy.
We are using a range of traditional and non-traditional methods to create synthetic data to make data sharing quicker and more secure. Government organisations, businesses, academia and other decision-making bodies would like to exploit big data, but the organisations that collect this information are often unable to share the granular data due to its sensitive nature. In this project, we proposed methods that generate synthetic data to replace the raw data for the purposes of processing and analysis.
There are twin aims to this project. The first is to create high-quality synthetic data that closely resemble the real data and are a suitable substitute for processing and analysis. The second is to ensure privacy – the synthetic dataset must not contain any identifiable data.
Our data scientists working on this are using a range of traditional and non-traditional methods to create synthetic data. Traditional methods for synthesising data include synthetic minority over-sampling. Non-traditional methods include state-of-the-art algorithms such as generative adversarial networks (GANs), variational autoencoders (VAEs) and autoregressive models.
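To give a flavour of the traditional end of this range, the sketch below generates synthetic records in the style of synthetic minority over-sampling: each new point is placed at a random position on the line between a real record and one of its nearest neighbours. It is a simplified, standard-library illustration with hypothetical data and function names, not the Campus implementation, and interpolation of this kind does not by itself guarantee privacy.

```python
import random


def smote_like_samples(records, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between near neighbours.

    records is a list of equal-length numeric tuples (at least two of them).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(records))
        base = records[i]
        # Rank the other records by squared Euclidean distance to the base record.
        others = [p for j, p in enumerate(records) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random position along the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature records.
real_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote_like_samples(real_points, n_new=5)
for point in new_points:
    print(point)
```

Because every synthetic point lies between two real records, the output stays within the range of the original data but can sit close to individual records, which is one reason the privacy aim has to be assessed separately.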
The Campus is in discussions with other government departments, including the Department for Business, Energy and Industrial Strategy, on how our work on generating synthetic data could improve their processes. Overall, the project is contributing towards a safer, easier and faster way to share data between ONS and the research community in cases where the real data are sensitive.
5. Building a world-class knowledge centre
The ambition of the UK government is to have one of the most digitally skilled populations of civil servants in the world. The Campus plays an important role in this by building public sector data science and AI capability through a range of learning and development programmes. These are delivered both directly, and in collaboration with important partners in ONS, the Government Digital Service, industry and academia.
Our ambition is to tackle some of society’s big issues and use data to deliver public good. Together with our partners at ONS and across government, we want to embed new skills and embrace new ways of thinking to create a skilled and interconnected data science community. We aim to recruit new generations of talent from all walks of life and to involve partners, stakeholders and experts from across government, academia and industry. We want to think big, exchange knowledge, and encourage participation so we can bring our work to life for the whole data science community to share.
We are aiming high. Under the UK Statistics Authority Business Plan, the Campus is tasked with training 150 qualified data scientists for government by the end of 2019, and 500 by 2021. We are already well ahead of schedule, and with the recent recruitment of new data science trainers and lecturers, we expect this number to increase significantly. We remain committed to strengthening our multi-strand capability programme, and are ready to meet the needs of both analysts and dedicated data scientists within ONS and across government.
Harmonised career pathway
The Campus is supporting the development of a harmonised career pathway for data scientists in government, in collaboration with the government Analysis Function and the Digital, Data and Technology Profession. The pathway is supported by a wide range of Campus-developed and delivered learning and development programmes that cater to four levels of practitioner skill, from Awareness through to Expert.
Available to staff across ONS, the wider civil service and the public sector, these programmes include:
- awareness training for senior civil servants
- classroom-based training
- mentoring opportunities
- a bespoke MSc programme
- a guest lecture series from national and international partners
Head of data science / lead data scientist
These roles provide leadership and direction across a programme of multidisciplinary data science projects, managing resources to ensure delivery. They are recognised as a strategic authority with technical expertise in cutting-edge techniques, defining the organisation’s vision.
They are a role model to other data scientists and champion adoption of best practice. They communicate with senior stakeholders and convince them of the strategic value of data science. They are champions for the use of data science across government.
Senior data scientist
Senior data scientists are experienced data scientists who provide support and guidance to teams. They are recognised authorities on a number of data science specialisms within government, with some knowledge of cutting-edge techniques. They may work on projects of high political exposure, value or complexity. They engage with senior stakeholders and champion the value of data science. They line manage more junior colleagues.
Data scientist
Data scientists are proficient in data science. They have recognised technical ability in a number of data science specialisms and provide detailed technical advice on their area of expertise. They draw on other technical and analytical standards from across government and industry. They promote and present data science work both within and outside of the organisation. They engage with stakeholders to demonstrate the value of data science and propagate data science skills in other teams. They line manage and mentor junior data scientists and manage small project teams.
Junior / associate data scientist
Junior or Associate Data Scientists are responsible for aspects of existing data science projects, whilst gaining valuable hands-on experience. They are able to apply certain data science techniques and work to develop their technical ability. They adhere to the data science ethics framework. They work as part of a multidisciplinary team with data architects, data engineers, analysts and others and provide limited advice on data science projects within teams. They identify and communicate lessons learnt during projects and follow good practice. They clearly communicate the value of data science work to stakeholders.
Trainee data scientist / apprentice
Trainee Data Scientists and Apprentices are given experience of practical data science work under supervision from more senior colleagues.
They move from a strong awareness of core data science skills of coding, machine learning and statistics to a more effective working knowledge and develop their understanding of how to apply data science to business problems.
Data Science Faculty
Our Data Science Faculty manages a curriculum of classroom-based courses aimed at leaders, policy teams and frontline staff to demystify data science. We run courses at both the Awareness and Working levels of the harmonised career pathway.
We develop and deliver these courses in-house. Our future plan is to work with partners to build further courses aimed at the Expert level. The Campus Faculty also runs “train the trainer” programmes, one of which was successfully deployed at the Ministry of Justice to help them achieve a goal of 250 analysts trained in the commonly used statistical programming language R. To date, we have trained around 130 analysts at the Ministry of Justice.
Awareness level workshops
Art of the Possible — 1 to 2 hours
This short workshop gives an overview of the use of data science in government. It is designed to demonstrate the value of data science to non-technical staff through a series of examples from across government.
Total persons trained in Awareness-level workshops in 2018 — 686
Working level workshops
Introduction to R — 12 hours
Introduction to Python — 12 hours
These courses are designed for people who are new to programming and to the R or Python languages. They provide the basic skills needed to work in R or Python. After completing either course, students can perform basic data manipulation and visualisation.
Data science with R — 2 x 12 hours
Data science with Python — 2 x 12 hours
These modules are hands-on and focus on the reflection, collection and preparation stages of the data science process. We teach how to import data from almost any format into R or Python, and how to transform messy datasets into tidy ones. Students explore techniques such as visualisation to prepare data for analysis.
Spark and distributed systems — 4 hours
ONS is currently migrating operations to a more advanced platform (Cloudera) that hosts a series of distributed technologies. This course gives a quick introduction to the basics of using the main technology for analysis (Spark) while touching on other important technologies such as Hadoop (HDFS).
Total persons trained in Working-level workshops in 2018 — 194
Masters in Data Analytics for Government
The new Masters in Data Analytics for Government (MDataGov) is a collaborative project between the Data Science Campus, ONS Learning Academy and academic partners across the UK. Launched in October 2017, this flexible, part-time programme aims to build data science capability across government by equipping civil servants with an important set of skills required of a modern government data analyst.
Students can choose to complete the programme within two to five years depending on their personal circumstances. There are four compulsory modules (Data Science Foundations, Statistics in Government, Survey Fundamentals and Statistical Programming) and eight optional modules from a range of courses in statistics and data science.
Total persons registered on MDataGov in 2018 — 32
Continuous professional development
Civil servants and the Campus’s national and international partners can study modules as stand-alone (assessed or non-assessed) courses for continuing professional development (CPD). These modules form the basis of the Practitioner level training managed by the Campus.
Total persons completed a CPD course in 2018 — 85
Data Analytics and Data Science apprenticeships
With the ONS Learning Academy, the Campus has been instrumental in developing and delivering two levels of apprenticeships.
Level 4 Data Analytics apprenticeship
The Campus recruited its first intake of eight apprentices in October 2016, the first Level 4 Data Analytics apprentices in Wales. Delivered in collaboration with ALS Training, the apprenticeship is a two-year programme, equivalent to the first year of an undergraduate degree in data analytics. Two years later, all of our first cohort had graduated and secured roles in government as data scientists.
The Campus recruited a further five apprentices in September 2017. They have worked alongside our experienced data scientists and been involved with a variety of projects, and have now all begun their work placements across ONS ahead of their graduation later this year.
The apprentices have been an integral part of the Campus. As well as making a valuable contribution to our projects, they have been active in our school outreach programme, working with our STEM initiative and promoting data science and apprenticeships across ONS and beyond.
Level 6 (Degree) Data Science apprenticeship
Trailblazers is an initiative run by the Institute for Apprenticeships and is made up of a group of employers that come together as the creators and early adopters of new apprenticeship standards. ONS and the Campus lead the Trailblazer Group in England with over 50 employers (public and private sector) and 20 universities.
As a result, we now offer a three-year Level 6 (Degree) Data Science apprenticeship. The Welsh Government agreed to incorporate this into their degree apprenticeship delivery pilot and the Campus, in partnership with Cardiff Metropolitan University, is the first adopter. Level 6 apprentices have the exciting opportunity to combine the study of data science theory at university with working at the Campus alongside our experienced data scientists.
We began recruiting for apprentices in October 2018 and expect new apprentices to start in March 2019. The Campus is recruiting four apprentices and the Welsh Government will recruit a further two. We hope that this close partnership will allow us to build a network of degree-level apprentices and guide them through their early career in data science.
Recognition for apprenticeships
The Campus and the ONS Learning Academy have worked hard over the last two years to raise the profile of apprenticeships both within and outside the organisation, with ONS now having over 100 apprentices in many different disciplines. Being nominated as a finalist in the Large Employer of the Year at the Wales Apprenticeship Awards was recognition of this.
We believe mentoring is one of the most valuable and effective development opportunities we offer. Analysts from across the public sector can sign up to a range of different mentoring options:
- Data Science Accelerator
- Data Science Academy
- External mentoring of other government departments
The Data Science Accelerator is a capability-building programme that gives analysts from across the public sector the opportunity to develop their data science skills. It started in 2015, and is backed by the Government Digital Service (GDS), ONS, the Government Office for Science and the Analysis Function. We have been the South West and Wales hub since 2016.
Participants work on a three-month data science project. Having this protected time is an important benefit of the programme. Participants commit to spending one day a week at the Campus working on their project. Each participant is assigned a dedicated mentor (an experienced data scientist) and also benefits from peer support from other participants in their cohort.
We also run a similar programme called the Data Science Academy exclusively for ONS staff. Finally, we provide mentoring to teams across the public sector. These arrangements tend to be more flexible, with timetables built around individual work commitments.
Table 5 provides a list of the projects that have taken part in these mentoring schemes.
Table 5: Projects and departments taking part in the mentoring schemes
Data Science Accelerator
|Automation of object detection from satellite imagery||UK Hydrographic Office|
|Forecasting the condition of the||Department for Education|
|Reducing potential harm through improved risk profiling||NHS Wales|
|Use machine learning to match individually collected prices to web-scraped prices||ONS|
|Pathway mining and analysis for patient-level data||Public Health Wales|
|Developing a tool to maximise the use of Trafficmaster data in Welsh Government||Welsh Government|
|AIS-derived products for improved defence situational awareness||UK Hydrographic Office|
|Sounding selection tool||UK Hydrographic Office|
|Better analysis and dissemination of the annual June survey of agriculture in England||Department for Environment, Food and Rural Affairs|
|Using machine learning techniques in economic statistics to improve survey methodology results||ONS|
|The Welsh name strategy||Pembrokeshire Local Authority|
|Automated patent casework allocation||Intellectual Property Office|
|Organisations’ engagement with
|Project MERTZ||Royal Air Force|
|Understanding the relationship between social care data and Ofsted inspections||Ofsted|
|Understanding public perceptions of teaching: automated analysis of free text data from online||Department for Education|
|What just happened? Using natural language processing to summarise patient notes and save doctor time||NHS Wales Informatics Service|
|Beach composition classification||UK Hydrographic Office|
|Text mining for public research impact evidence||UK Research and Innovation|
Data Science Academy
|Identifying holding companies of special purpose entities||ONS|
|Big data and visualisations for apportioned regional tax revenues||ONS|
|Propensity matching with clothing and formula effect||ONS|
|Improving the accessibility of statistics on specific crime types||ONS|
|Machine learning as an alternative estimation method for later period VAT returns||ONS|
|Investigating possible markers of wealth and income in postcode sectors||ONS|
|Automatic classification of individual consumption by purpose (COICOP)||ONS|
External mentoring of other government departments
|Probability of success for the Training for Success programme||Northern Ireland Statistics and Research Agency|
|Stroke patients and effective prescribing||Northern Ireland Statistics and Research Agency|
|Modelling student loan repayments||Government Actuary’s Department|
Katie Davidson enrolled on one of the earlier rounds of the Data Science Accelerator before ONS and the Campus became a regional hub. She subsequently received one of the first sponsorships by the Campus to complete an MSc in Data Science at Birkbeck, University of London. During this time, she was promoted to Head of Data Science. We are delighted to say that we are now helping Katie up-skill her team and she now sits on the board of the cross-government Data Science Skills Working Group.
Data Science Accelerator in action
Since 2016, the UK Hydrographic Office (UKHO) has sent four analysts on the ONS Data Science Accelerator programme, working on a range of projects. For example, one project developed a sounding selection tool designed to detect seabed changes. The work focused on creating a process to significantly reduce the manual element of selecting the correct soundings from a survey to chart. It included using different data sources to select the most relevant depths for mariners.
The first project mentored by the Data Science Accelerator programme at the Campus helped Catherine Seale from UKHO automate the identification of objects in the sea. This project has since been developed into a live system in use at UKHO. It detects objects visible in the ocean on satellite imagery, such as wind turbines or oil and gas platforms. The system was presented at the Government Digital Service Sprint 18 meeting in May 2018, and to date, has processed satellite imagery covering 881,280 square kilometres of ocean, uncovering 342 hazards that were unknown to UKHO.
UKHO also announced at Sprint 18 that it would become a new hub location for the Data Science Accelerator programme, with a focus on geospatial projects.
Leading the way in infrastructure
The Campus has recently created a dedicated network, isolated from the core ONS IT infrastructure. The Campus network spans two physical secure data centres, providing high-availability, resilience and security. Users can connect from both secure corporate laptops and off-network devices such as MacBooks and Microsoft Surface Pros.
The environment is suited to exploratory and development work, and data scientists benefit from less restricted internet access, local administrator rights and the ability to install software packages without restriction. Virtual machines can be rapidly deployed with the latest data science tools and software. Data scientists also have access to General-Purpose Graphics Processing Units (GPGPUs) for machine learning, allowing data to be processed far faster than with traditional computing methods.
The infrastructure provides users with ample compute resource (processing, memory and large-scale storage) enabling virtual machines to scale as projects demand. This removes the previous limitations that users experienced when working solely off their laptops. Adhering to ONS policies and standards, non-sensitive data are now stored centrally in highly available data stores providing a single point of truth, enabling data scientists to better collaborate on projects. Users also benefit from being able to connect to external cloud providers such as Microsoft Azure and Amazon Web Services while using the Campus network.
The Campus network also contains a training environment providing students with a mixture of Microsoft Windows and Linux virtual machines, which are used to deliver various courses such as Python, R, Git, Apache Spark and natural language processing.
Working with others
Our ambition from day one was to be at the forefront of public sector data science in the UK. This can only happen through our partnerships and the exchange of data science knowledge between government, industry and academic practitioners – both in the UK and abroad.
Knowledge exchanges increase the use and understanding of data science within ONS and wider government; they enable access to data, tools, approaches and techniques developed at the leading edge of UK and international research. They also allow insights and methodologies developed within the public sector to be shared for the betterment of data science and the UK public as a whole.
In December 2016, we announced our first UK partnership with the signing of a Memorandum of Understanding with the Alan Turing Institute. Since then, we have been working and signing agreements with a wide range of prospective university, international and commercial partners to explore opportunities for collaborative research and joint programmes that advance the state of data science within government and across the entire field.
We actively welcome partners from academia, government and industry who wish to help us meet the demands and challenges posed by the evolving economy, and work to push the boundaries of data science research within ONS and beyond.
We have built an extensive network with academic partners throughout the UK and beyond, providing funding opportunities for MSc and PhD candidates and delivering joint research programmes with a wide range of national and international partners.
Already, our partnerships with universities have allowed us to research, innovate and inject additional capability into the field of data science. We have shared research and resources and collaborated on various continuous learning initiatives to expand and improve knowledge across the UK.
University partnership and collaboration
The Campus has undertaken collaborative research with universities from all corners of the UK. We sponsor PhD students and provide a range of challenging short-term projects for groups of PhD and MSc students. Examples include a project for MSc students at Manchester University on how to capture changes of opinion for different groups in Twitter feeds. We are particularly excited to support the University of Warwick on the 2019 “Data Science for Social Good” Summer Fellowship through collaboration with the Alan Turing Institute.
Hankui Peng received her degree in statistics and MSc in statistical practice. The Campus is sponsoring her PhD that focuses on exploring space clustering with application to text data. This research has applications in diverse areas including product categorisation, fraud detection and sentiment analysis.
Our three- to six-month project-based paid internships have proved very popular. Students can choose projects that support their own field of research or they can work on existing projects undertaken by the Campus. To date, six MSc and PhD students from different universities have worked on projects alongside our experienced data scientists. We also sponsor four PhD students through the Alan Turing Institute. Students have access to our data science resources and research projects and benefit from a six-month placement with us during their studies.
Our knowledge sharing events and support of forums such as university “data dives” have gone from strength to strength in 2018. For example, we have:
- collaborated with university faculties such as SAMBa (Statistical Applied Mathematics, Bath) to give students the opportunity to work with our data scientists and develop their own research
- hosted a knowledge exchange event with Lancaster University to explore the latest research and ideas into time series
- delivered lectures to PhD students at the Alan Turing Institute to outline the work and projects we work on for the public good
- held data science showcases at University of Warwick and University College London for PhD students
Twice a year, SAMBa hold a week-long Integrative Think Tank workshop, where they invite non-mathematical partners to set high-level challenges that the students can formulate into mathematical problems and work together to identify routes to a solution. In 2018, the Campus provided two challenges, and a team of Campus data scientists spent a week working with the SAMBa students, supporting and mentoring them.
The future looks bright
We are developing and strengthening our academic collaborations into 2019 and beyond through the support we have pledged to our existing and proposed EPSRC and UKRI Centres for Doctoral Training in AI, Statistics and Data Science. We also value our membership of Doctoral Training Centres advisory boards, helping to steer the direction of the training, projects and opportunities that aid the development of the skills of PhD students.
|Alan Turing Institute||Royal Statistical Society|
|Birkbeck, University of London||STEM Cymru|
|Cardiff Metropolitan University||STEM Learning UK|
|Cardiff University||The Datalab|
|Consumer Data Research Centre||University College London|
|Gower College Swansea||University of Bath|
|Imperial College London||University of Bristol|
|Institute of Coding||University of Edinburgh|
|King’s College London||University of Exeter|
|Lancaster University||University of Glasgow|
|London School of Hygiene and Tropical Medicine||University of Manchester|
|Manchester University||University of Oxford|
|Nesta||University of Plymouth|
|NIESR||University of Portsmouth|
|Open Data Institute||University of South Wales|
|Oxford Brookes University||University of Southampton|
|Queen Mary, University of London||University of Sussex|
|Queen’s University Belfast||University of Swansea|
|Royal Holloway||University of the West of England|
|University of London||University of Warwick|
|Royal Society||Urban Big Data Centre, Glasgow|
The Campus is driving data science capability across government.
We have worked with the government Analysis Function and Digital, Data and Technology Profession to develop a harmonised career pathway for data scientists, working with important government departments to focus on the development of agreed standards in technical skills for different job grades, as well as a consistent approach to skills assessment in recruitment and progression.
Public sector data science audit
As part of the Government Data Science Partnership, we launched a government data science skills working group. We agreed to conduct a data science skills survey across central government, leading to an HM Treasury request to widen the scope to the wider public sector. This was included in the 2018 Budget Red Book. The first phase of this audit began in January 2019 and we expect to publish the final report at the 2019 Government Data Science Conference in the autumn. We hosted the 2018 Government Data Science Partnership Conference in February 2018 and the first Government Data Science Community meet-up in November 2018, with 80 attendees from across government and the public sector. Planning is now underway for the next meet-up and the 2019 Government Data Science Conference.
National Materials DataHub
We have carried out an initial scoping exercise on the potential for data science to inform a National Materials DataHub, in collaboration with the Department for Business, Energy and Industrial Strategy (BEIS). The external advisory board was chaired by Campus Managing Director Tom Smith and included attendees from industry partners. The meeting discussed options for the next phase of work by the cross-government virtual team led jointly by ONS, BEIS and the Department for Environment, Food and Rural Affairs.
Economic Intelligence Wales
The Campus has been a lead partner in the creation of a new research unit – Economic Intelligence Wales. This is a new collaboration between the Development Bank of Wales, Cardiff Business School and ONS and was formally launched by the Welsh Government’s Cabinet Secretary for the Economy in June 2018. The new research unit has responsibility for collating and analysing data to create an independent, robust and reliable platform to inform timely policy and funding decisions. Data gaps will be identified and addressed as part of this process.
|Cabinet Office||NHS Digital|
|Department for Business, Energy and Industrial Strategy||NHS Scotland|
|Department for Education||NHS Wales|
|Department for Environment, Food and Rural Affairs||NHS Wales Informatics Service|
|Department for Exiting the European Union||Northern Ireland Statistics and Research Agency|
|Department for Health and Social Care||Ofsted|
|Department for International Trade||Pembrokeshire Local Authority|
|Government Actuary’s Department||Public Health Wales|
|Government Digital Service||Royal Air Force|
|Government Office for Science||Scottish Government|
|Innovate UK||UK Hydrographic Office|
|Intellectual Property Office||UK Research and Innovation|
|Maritime and Coastguard Agency||Welsh Government|
We work with industry partners on a range of non-commercial activities focused on outcomes for public good. In 2018, we partnered with Barclays to explore the development of rapid regional economic indicators using payment data. This led to the secondment of ONS analysts into Barclays, where they were able to work with the rich source of payments data held by Barclays, and benefit from the expertise and specific knowledge of their staff.
Other recent industry partnerships include:
- collaborating with Deloitte to raise awareness of data science across the Northern Ireland Civil Service
- partnering with PwC on a range of activities including their #GreatWales campaign, which focused on the impact of digital on Wales’s economy, public services and infrastructure
We also supported a range of cross-sector groups aiming to increase the use and application of data science skills across the UK. These groups include:
- Royal Society Dynamics of Data Science Group
- Alan Turing Institute’s Data Skills Taskforce
- Institute of Coding
Partnership in action
Glass AI is a large-scale artificial intelligence system that reads, interprets and monitors the open internet. The company is building a new research resource for social, economic and market analysis. So far, Glass AI has digitally mapped the UK economy, tracking any topic of interest across hundreds of millions of web pages, and over 1.5 million organisations. We partnered with Glass AI on a project aiming to understand the characteristics of high-growth companies using non-traditional data sources, to inform policy decisions on investment and employment.
Glass AI supplied us with data on a random sample of 30,000 UK active companies, including descriptions, sector classifications, mentions, news articles, job adverts and biographies of staff published on the organisation’s website. This partnership has allowed the Campus and ONS to understand more about the use of non-traditional data sources in modern statistics.
|Data for Policy||PwC|
|EvolutionAI||The Behavioural Insights Team|
|Wales Council for Voluntary Action|
|Hafod Housing||Welsh Data Science Graduate Programme|
The world of statistics and data is constantly evolving and national statistical institutes (NSIs) from around the world are keen to hear more about the transformational journey of ONS and our own rapidly growing success story. We continue to be an active participant in the agreement between ONS and the Department for International Development (DFID) to support the modernisation of official statistics, initially in four African countries and with the UN Economic Commission for Africa (UNECA).
We regularly visit and receive visits from NSIs from across the world. In the last quarter of 2018 alone, the Campus welcomed delegates from China, the Republic of Korea, Singapore, New Zealand, Indonesia and Australia, as part of wider visits to ONS organised by ONS’s International team. We have a formal memorandum of understanding with a number of these international institutions.
Rwanda’s data revolution
ONS is supporting the Data Revolution for Rwanda initiative through the National Institute of Statistics of Rwanda (NISR). The Rwandan Government wants to build “an innovative data-enabled industry to harness rapid socio-economic development”. Funded by DFID, the Campus has been providing strategic advice on the design and implementation of a sustainable and efficient data science capability plan.
Teams from the Campus have visited Rwanda on several occasions and collaborated with ONS colleagues to:
- provide advice on the legal, ethical and good practice aspects of data management and sharing
- advise on and support the creation of effective partnerships across the Rwandan public sector and academia
- provide advice on technical infrastructure
- assess skills requirements across the Rwandan Government, and provide training for 25 NISR staff
The Campus is currently collaborating on two data science projects, one with NISR, and one jointly with NISR and the National Bank of Rwanda.
Results are encouraging. During 2018, we saw Rwanda’s Data Revolution policy taking shape and the building of the new Data Science and Training Centre.
Data scientists at the Campus have been working with the UN Global Pulse Lab in Jakarta, a joint initiative of the United Nations and the Indonesian government. It is the first innovation lab of its kind in Asia. Pulse Lab Jakarta is working to close information gaps in the development and humanitarian sectors through the adoption of big data, real-time analytics and artificial intelligence.
The Campus has been helping the Lab to deploy several existing open source projects, including components of our Urban Forest image-processing pipeline, onto a platform in a format where they can be quickly adopted by other researchers in the international community.
The Campus is also contributing to the ONS work supporting the development of a regional data science campus for Africa at the UNECA headquarters in Addis Ababa, by facilitating a workshop on the use of data science within ONS.
International Monetary Fund
The Campus is leading on a joint International Monetary Fund and ONS project on mobile phone payments and remittances using commercial data. The project goal is to use anonymous peer-to-peer mobile phone money-transfer data to investigate the feasibility of producing economic and Sustainable Development Goal indicators. If successful, we could develop a “tool-box” that could potentially be used by other countries with similar data infrastructure in the future.
|Australian Bureau of Statistics||Statistics Canada|
|Brazilian Institute of Geography and Statistics||Statistics Centre – Abu Dhabi|
|Health Quality & Safety Commission New Zealand||Statistics Korea|
|International Monetary Fund||Statistics Netherlands|
|National Bank of Rwanda||Statistics Norway|
|National Bureau of Statistics of China||Statistics Poland|
|National Institute of Statistics and Census of Argentina||United Nations Data Forum|
|National Institute of Statistics and Geography (Mexico)||United Nations Economic Commission for Africa|
|National Institute of Statistics of Rwanda||United Nations Global Platform|
|New Zealand Embassy||World Bank|
|Singapore Institute of Statistics|
Harnessing the power of data science offers huge benefits for the UK government and the public at large. However, this is an emerging discipline and presents new challenges. Technology and statistical innovation are moving at pace and we need to constantly evolve our codes and ethics to match the highest standards demanded by the public sector.
We are in the business of harnessing the power of data to support the most important decisions facing the country, while ensuring that data are securely held and properly used.
Our governance supports these aims. The Campus is a Directorate within the UK’s Office for National Statistics (ONS), which is itself the executive office of the UK Statistics Authority. The Authority is an independent body at arm’s length from government, reporting directly to Parliament, with a statutory objective of promoting and safeguarding the production and publication of official statistics that serve the public good.
The Campus is led by Managing Director Tom Smith, who reports to Heather Savory, Deputy National Statistician and Director General for Data Capability.
We have an Advisory Board that meets three times per year, and is chaired by the ONS Director General for Data Capability. The Advisory Board’s main roles are to:
- provide advice on Data Science Campus activities and the delivery of its strategic objectives
- provide guidance on the development of the Campus and help the ONS executive give assurance to the Authority Board that the infrastructure is established and maintained in ways that serve the public good
- review how the Campus is working across ONS and government
- advise on the principles, policies and procedures of the Campus
- help resolve any high-level issues that inhibit the Campus achieving its goals
- help identify strategic risks to meeting Campus objectives and advise on their mitigation
- help oversee and guide public engagement and communications strategies
- advise on the opportunities for the development of the Campus.
Our Advisory Board members are:
Deputy National Statistician and Director General for Data Capability at the Office for National Statistics
Dr Tom Smith
Managing Director at the Data Science Campus
Director of Digital and Tech Policy at the Department of Digital, Culture, Media and Sport.
Chief Statistician for the Welsh Government.
Professor David Hand
Emeritus Professor of Mathematics and Senior Research Investigator at Imperial College, London.
Chief Executive Officer of Local Trust.
Professor Sofia Olhede
Professor of Statistics of Mathematics at University College London (UCL) and Director of the UCL Centre for Data Science.
Professor Piyushimita Thakuriah
Distinguished Professor and Dean of Edward J. Bloustein School of Planning and Public Policy at Rutgers University.
Chief Data Officer for the Ordnance Survey.
Executive Director of the Global Partnership for Sustainable Development Data.
Co-founder of Privitar, an enterprise software company.
Executive Director of the Royal Statistical Society.
Dr Sofie De Broe
Head of Methodology and Scientific Director of the Centre for Big Data statistics at Statistics Netherlands.
Professor Martin Weale
Professor of Economics at King’s College London.
Chief Executive of Bury Council
We follow policies and frameworks that operate across ONS to uphold ethical principles and safeguard data, including new policies, published in January 2019, that set out how ONS looks after and uses data for public benefit.
As with all ONS work, if we have a concern about the ethics of any of our research projects we consult the National Statistician’s Data Ethics Advisory Committee (NSDEC). This committee was set up to provide independent and transparent ethical advice and to ensure that the use of data for research and statistical purposes is ethical and for the public good.
We helped the NSDEC develop a framework to help researchers, statisticians and data scientists assess the ethics of their research by scoring projects against six principles. This framework is now being used for the Campus’s Research Programme to assess projects in their early stages. Any with a high-risk score are referred to NSDEC for a full ethical review.
Further information on the NSDEC data ethics self-assessment process is available via the UK Statistics Authority website.
6. Outreach and volunteering
We are involved with a range of outreach and charitable activities. These range from delivering data science awareness training to charitable organisations, to supporting outreach in schools and youth groups. Our staff also volunteer through Business In The Community and the Wales Council for Voluntary Action.
STEM Ambassadors are part of a national scheme led by STEM Learning, a partnership between government, charitable trusts and employers. The scheme enables volunteers to engage with young people and promote STEM (science, technology, engineering and mathematics) through career talks, mentoring, practical workshops and exhibitions. The Campus is proud to support several STEM Ambassadors.
ONS introduced a pilot programme in 2018 in partnership with STEM Cymru and the Engineering Education Scheme Wales. STEM Ambassadors from the Campus hosted groups of female school pupils for “Girls into STEM” workshops.
Our Ambassadors shared their energy and enthusiasm for STEM subjects with the visitors during a hands-on workshop, which included games to promote mathematics and data science as well as an introduction-to-coding session with our team of programmable Lego Boost robots.
Our 10 ambassadors are data scientists and data analytics apprentices. They have supported several engaging outreach activities in schools, clubs and within ONS. In all, our STEM outreach activity has engaged with a total of 222 young people and led to the Campus receiving the 2018 ONS Excellence Award for Building Capability.
While this pilot programme is now complete, in 2019 the Campus is looking to partner with bodies such as the Institute of Coding in Wales to identify ways to scale STEM engagement at a national level.
Through the outreach charity Business In The Community, we have supported local organisations with our volunteering time.
Two Campus volunteering days were held in 2018:
- part of the Campus team spent a day at Global Gardens in Cardiff undertaking a range of gardening and manual activities – the Global Gardens Project is about bringing communities together with a vision to create a growing space that supports community-based sharing of food and cultures
- in November, Campus staff carried out internal and external building maintenance at Ystrad OAP Association, and socialised with the members over lunch – Ystrad OAP Association provides forums for those aged 60 years and over to meet up and plan activities
Data in the community
In early 2018, the Campus hosted a group of Welsh voluntary sector organisations. We showcased our range of projects and discussed opportunities to build data science skills across their community.
To encourage this sector to make better use of their data, we agreed to support the flagship conference Gofod 3. This is an event organised by Wales Council for Voluntary Action in collaboration with charity organisations across Wales. Over 400 delegates attended and our data scientists delivered a condensed version of “The Art of the Possible” to several conference delegates exploring how data science could relate to their charitable work.
7. Looking to the future
We’ve only just begun our journey but already we have achieved so much.
Two years on, what does the future hold for the ONS’s Data Science Campus? The case studies in this report demonstrate how our early projects have delivered and we look forward to many more making a real impact for public good in the UK, providing new economic insights, using novel data sources to explore societal trends and assessing progress towards more sustainable development.
Over the past two years, we’ve built up a strong knowledge exchange team of academic managers, lecturers and trainers so that we can meet – and exceed – the target set by John Manzoni, Chief Executive of the Civil Service, to produce 500 data analysts across government trained in data science, by 2021. We’ve just recruited our first degree-level apprentices into the Campus and plan to extend ONS’s apprenticeship programme to higher academic levels over the coming years.
We’ve seen our partnerships with academic institutions blossom and are looking forward to working with six of the recently announced Centres for Doctoral Training in mathematics, statistics and AI. Moving forward we will be exploring new collaborations, such as fellowships, to enhance the benefits to the country of our close links with academia.
We have barely scratched the surface of the potential to collaborate with industry for public good, and we look forward to building on the success we have seen, for example, in our partnership with Barclaycard, which is helping provide new perspectives on the UK economy.
Internally to ONS we will continue to work to embed data science skills across the organisation and to further the use of new types of data and analytical methods in the production of official statistics such as the Census and our economic outputs. Across government, we’ve already worked with most UK government departments and have created a significant hub of data science activity based in London. In March we reached agreement with the Department for International Development to establish a new Campus hub at their office in East Kilbride in Scotland, focused exclusively on international development. As we establish the Campus further, we expect to set up new Campus hubs in different parts of the country to ensure that the work we do is at the heart of UK public policy decisions – finding innovative ways to illuminate the unknown challenges ahead.
ONS is increasingly being recognised as among the front-runners of modernising statistical institutes worldwide, including in our ability to use big data and other novel data sources to improve national statistics. Overall we want to cement our place as a world leader in data science, working in collaboration with partners, including working through UN agencies and other bodies – developing partnership programmes that help countries across the world to build their data science capability.
With our talented and experienced Advisory Board now in place, we have a great launch pad so the Campus can help build on the achievements set out in this review of our first two years and help deliver a step change in the application of data science across the UK public sector. Above all, we want to help the UK public sector deliver the maximum benefit it can from the better use of data.
8. How to work with us
For the last two years, we have been proud to work with, and support, our partners and colleagues across government, academia and industry – both in the UK and globally. We always welcome the opportunity to work with others to harness the power of data science to create new understanding and improve decision-making for public good.
We actively welcome partners from academia, government and industry who wish to join us as we seek to meet the demands and challenges posed by the evolving economy, and work to push the boundaries of data science research within ONS and beyond.