When the team met Turing

On 5 December 2016, the Office for National Statistics and The Alan Turing Institute signed a Memorandum of Understanding signalling their shared commitment to creating global impact in data science through research, training and leadership.

Just a month later, the entire Data Science Campus (DSC) team found themselves on the 6:23 train from Cardiff to London. Although this proved challenging for some of us, we managed to arrive at the British Library in one piece and got through the security check without any red light warnings. The Alan Turing Institute (ATI) had organised a well-structured workshop for us, to present our current work and discuss possible ways of collaborating with some of its fellows.

If you’re not aware, the newly founded ATI is headed by the universities of Cambridge, Edinburgh, Oxford, Warwick and University College London and has already attracted some of the best data scientists across the globe to work across several data science disciplines. In the scope of future collaboration between the DSC and ATI, a workshop was organised for us to demonstrate a couple of prototype digital tools that the Campus is building for other government teams and present 4 projects that are being scoped at the moment. This would be a great opportunity for both institutes to discuss current data science issues and explore areas of common interest that could potentially lead to fruitful future collaborations.

During the first section of the workshop, we presented our projects to a very engaged audience. During the second section, we formed groups of people with common interests, discussed issues concerning our projects and arranged future collaborations with some of the ATI fellows.

The 6 projects we presented in the first session cover a variety of subjects from different business areas.

So what did we get up to?

Fellow Data Scientist Lanthao Benedikt kicked off the presentations by talking about her project on Tourism and Migration. The project aims to identify new data sources to supplement survey data and produce cheaper, faster and more precise statistics. The research also investigates innovative data science and visualisation techniques to draw insights from data and tell the stories behind the numbers. This is a joint research project between the DSC, the Department for Culture, Media and Sport (DCMS) and the Data Science Lab at Warwick Business School (WBS). The lab at WBS is co-directed by Suzy Moat and Tobias Preis, who are very active ATI fellows and both attended the workshop.

I demonstrated 2 application prototypes that we are currently building for the collection and dissemination of data within the government environment. The first tool is being developed for the Sustainable Development Goals (SDG) team at ONS and will be used to collect and disseminate data for reporting on the SDG indicators set by the United Nations in 2015. The tool allows the SDG team to acquire the data relevant to the 231 indicators and use appropriate techniques to visualise the UK’s progress on them.

The second tool is being developed for the Economic and Domestic Affairs Secretariat (EDS) team at the Cabinet Office and is intended to cover the team’s requirements for disseminating employment statistics to the local authority districts. The tool uses geographical and hexagonal maps to visualise the UK local authorities, provides statistics for numerous measures and allows the local authorities to easily and interactively compare their performance with other local authorities.

Steven Hopkins explained how we can track economic movement in local industries that are associated with fishing. The fishing industry provides valuable support to coastal communities, many of which are categorised as rural and deprived. As the industry faces the challenge of remaining economically viable in the face of declining global fish stocks and rising operational costs, the coastal communities that rely most on fishing activities are vulnerable. The project is a collaboration with the Department for Environment, Food and Rural Affairs (Defra) and will improve understanding of the economic and social benefits to coastal communities of having strong fishing activities. It aims to consolidate ONS and publicly available data into a targeted dataset focusing on industries related to fishing, which can then be used for analysis and to inform further studies.

Philip Stubbings discussed an early-stage project, currently scoped as a collaboration between the DSC, ONS Natural Capital Branch and Defra. It concerns the wide-ranging social, economic and environmental benefits of trees and canopy in urban environments and studies the importance of including natural assets in decision-making processes. The project aims to examine ways to count, classify and potentially monitor the status of trees in cities, utilising advances in computer vision and alternative data sources such as Google Street View as a means to augment existing satellite-based object detection methods.

Rowena Bailey closed the presentations session by setting the scene for a new project currently in the scoping phase, aimed at investigating Data Science tools and techniques that can be used to derive population-based calorie consumption estimates.

The objective is to identify potential improvements to the accuracy of official estimates of calorie intake by comparing alternative data sources and methods to the current survey-based approach. The talk generated a lively conversation and workshop attendees offered expert insight into the problem.

During the afternoon breakout session, we were divided into groups according to our areas of interest and interacted with ATI fellows and members of the public. These groups focused on:

  • the application prototypes that we are building. The discussions revolved around ways to create tools that have access to the vast amount of public government data and can effectively disseminate this data to interested parties from the public and private sectors.
  • the aspects and challenges in evaluating population-level calorie consumption. The group consisted of a range of professions from academia and industry and included statisticians, data scientists, health researchers and economists.
  • discussions around data science methods that are going to be used for the Tourism project and how these methods might be applied to the Natural Capital Accounts project. For example, techniques developed for locating scenic sites from satellite images in the Tourism project could be adapted for estimating land use in Natural Capital Accounts.
  • data ethics related to data science and how the 2 institutes can collaborate at a higher level.

It was a great opportunity to mingle with other data scientists and reflect on various problems. A number of potential research areas were identified and presented back to the wider group at the end of the workshop. These are being developed into future collaborations and work plans, so watch this space to find out how these projects progress!