Can machine learning be used to predict energy performance scores?

The Data Science Campus explores innovative methods to provide richer insights into important policy themes.

Our latest project investigates the use of machine learning techniques to predict missing energy performance scores. It also attempts to create a complete picture of the energy efficiency profile for domestic properties in Wales.

We have produced a report reviewing the data sources and techniques used. We are also making the code for this project available in a GitHub repository.

About 53% [1] of properties in Wales do not currently have energy efficiency information. This gap in reporting makes it harder to achieve the Welsh Government’s ambition to reach net-zero carbon emissions by 2050 in Wales. For this initiative to succeed, buildings will need to operate at close to zero emissions, as stated in Prosperity for all: A low carbon Wales.

The UK residential sector accounts for 18% of greenhouse gas emissions, which come predominantly from heating homes. An improvement in home energy efficiency in Wales could therefore contribute significantly to reaching the net-zero carbon emissions target over the next three decades.

Our results showed that the current data sources do not give a conclusive answer. However, access to supplementary data sources could enable machine learning techniques to predict (impute) the missing energy performance scores for domestic properties in Wales more effectively.


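  1. This is calculated by dividing the number of unique properties in Wales with an Energy Performance Certificate (EPC) (666,137) by the total dwelling stock estimate for Wales (1.43 million). About 47% of properties therefore have an EPC, leaving about 53% without one.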
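
The arithmetic behind the 53% figure can be reproduced with a short sketch. It uses only the two numbers quoted in the note above; the variable names are illustrative and are not taken from the project's code, and the dwelling stock figure is the rounded estimate of 1.43 million.

```python
# Figures quoted in the note above.
properties_with_epc = 666_137   # unique properties in Wales with an EPC
total_dwellings = 1_430_000     # dwelling stock estimate for Wales (rounded)

# Share of properties that have an EPC, and the share that do not.
share_with_epc = properties_with_epc / total_dwellings
share_without_epc = 1 - share_with_epc

print(f"With an EPC:    {share_with_epc:.1%}")     # 46.6%
print(f"Without an EPC: {share_without_epc:.1%}")  # 53.4%
```

Because the dwelling stock estimate is rounded, the result should be read as approximate, in line with the "about 53%" wording above.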