The Problem

As a facility or energy manager, it can be very difficult to know whether your building is operating efficiently when weather variations play such a major part in energy usage. A typical example of this problem, which we are sure most people have struggled with, is trying to establish whether a new energy-saving measure has been a worthwhile investment.

Perhaps you have purchased a new boiler for your building and now need to investigate the effect it has had on your energy efficiency. You compare bills and are relieved to learn that you’re spending less money on oil. But how do you know these savings are from the new boiler, and not just the consequence of a particularly mild winter? Or maybe your utility bills are higher than expected and it seems as if the retrofit is not delivering the savings you had planned for. Is the new boiler ineffective, or are your energy improvements being obscured by the effects of atypical weather?

The fact is that weather can have a huge influence on a facility’s energy usage, and consequently can act as a blocker for those who need an accurate understanding of their energy efficiency. To combat this problem and bring clarity to your energy data, Dattica has introduced a new Regression Analysis tool.

The Solution

Regression analysis is a statistical tool that allows you to normalise the impact that weather has on your energy usage. Use of the tool ensures that fluctuating weather conditions will not compromise your energy saving figures.

Simply put, rather than comparing one year’s usage to another’s, we regress your energy usage against weather data. This lets us compare how much energy you would have been expected to use this year, given the weather, with how much energy you actually used.

The premise of this analysis is that demand for your building’s heating energy will tend to vary according to how cold the weather is. To quantify this effect in the energy industry we use a specialist type of weather data called heating degree days.

Heating Degree Days

Heating degree days provide a measure of how much (in degrees) and for how long (in days) the outside temperature was below a specific base temperature. When the outside temperature is at, or above, this base temperature, your building is assumed not to require any heating.
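To make the definition concrete, here is a minimal Python sketch of one common way to calculate heating degree days: for each day, take the shortfall of the daily mean temperature below the base temperature, and add those shortfalls up. The function name, the sample temperatures, and the use of daily means are assumptions for illustration; degree day providers often use more refined methods.

```python
def heating_degree_days(daily_mean_temps, base_temp=15.0):
    """Sum the daily shortfall below base_temp (simple mean-temperature method)."""
    # A day at or above the base temperature contributes zero.
    return sum(max(0.0, base_temp - t) for t in daily_mean_temps)

# Hypothetical daily mean outside temperatures for one week,
# in the same units as base_temp.
week_temps = [4.0, 6.5, 9.0, 15.0, 17.0, 12.0, 10.0]

weekly_hdd = heating_degree_days(week_temps)
print(weekly_hdd)
```

The two days at 15 and 17 add nothing to the total, which reflects the assumption that no heating is needed at or above the base temperature.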

The most appropriate, or optimal, base temperature for any particular building depends on the temperature that the building is heated to, and the nature of the building i.e. the heat-generating occupants and equipment within it.

With regard to timescale, weekly heating degree day data is the most suitable for regression analysis, as it evens out weekend-related inaccuracies. Using this method, the figures in your report are weekly energy consumption and weekly heating degree day values.
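As a rough illustration of why weekly totals help, the Python sketch below rolls hypothetical daily figures up into weeks. The data and the function name are invented for this example; note how oil use drops on the sixth and seventh day of each week even though the weather does not, which is the kind of weekend effect that weekly totals absorb.

```python
def weekly_totals(daily_values, days_per_week=7):
    """Sum consecutive 7-day chunks; an incomplete final week is dropped."""
    full_weeks = len(daily_values) // days_per_week
    return [
        sum(daily_values[w * days_per_week:(w + 1) * days_per_week])
        for w in range(full_weeks)
    ]

# Hypothetical daily heating degree days and daily oil use (litres) for 15 days.
daily_hdd = [10, 9, 8, 8, 7, 9, 10, 11, 10, 9, 9, 8, 10, 11, 12]
daily_litres = [700, 690, 680, 680, 660, 250, 240,
                720, 700, 690, 690, 680, 260, 250, 730]

weekly_hdd = weekly_totals(daily_hdd)
weekly_litres = weekly_totals(daily_litres)
print(weekly_hdd, weekly_litres)
```

The fifteenth day is left out because it does not belong to a complete week.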


This bar chart displays the weekly HDD values for 2014

This represents the weekly demand for energy to heat a typical building with a base temperature of 15°

How Does it Work?

For a heated building we assume that the energy consumption required to heat that building for a particular period is driven by the number of heating degree days over that period.

To demonstrate how user-friendly our regression analysis tool is, here’s a quick tour of the new functionality.

When you click on the Regression Analysis tab, you are prompted to select the utility type, measure, and date range that you would like to use in your model.



Using this baseline set of energy consumption data, and heating degree day data set at the optimal base temperature for your building, Dattica generates a linear regression model and displays the results in chart form along with a summary table.
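For readers curious about what fitting a linear regression model involves, here is a minimal sketch using ordinary least squares, the standard method for fitting a straight line. It is an illustration only and not a description of Dattica's internals; the function name and the weekly figures are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical weekly heating degree days and weekly oil use (litres).
weekly_hdd = [10, 20, 30, 40, 50]
weekly_litres = [1800, 2350, 2900, 3400, 3950]

slope, intercept = fit_line(weekly_hdd, weekly_litres)
print(slope, intercept)
```

For this made-up data the fit gives a slope of 53.5 litres per degree day and an intercept of 1,275 litres per week.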

Understanding the Report

The report has two basic elements to understand:

  • Regression Model Scatter Plot
  • Linear Model Summary

Regression Model Scatter Plot

The regression model graph is a scatter plot of your energy consumption against weather data. Scatter plots are an easy way to visualise whether there is a relationship between two variables or not.


Your energy data are plotted as points on the graph and will typically scatter a bit

Regression creates the single line that best summarises the distribution of these points

This line of best fit is called the regression line, and is also known as the trend line or performance characteristic line

The shaded band is the 95% confidence interval for the regression line: if the analysis were repeated on new data, an interval built this way would be expected to contain the true line about 95% of the time


A good regression model will display a straight line graph with minimal scatter as demonstrated in our example above.

Linear Model Summary

This table summarises all the important information you need to know about your regression model.


Regression Equation

The regression equation is the mathematical formula for the line of best fit in our regression model. It is expressed through the basic linear equation: energy usage = (m × degree days) + intercept, where m is the slope of the line.


This equation is extremely important as it enables us to estimate or predict energy usage for known values of degree days.


The slope of the line characterises the relationship between your energy usage and the weather. In our example, m = 52.7, which implies that an extra 52.7 litres of oil are consumed for each additional heating degree day recorded in a week.


The intercept represents the baseload energy consumption: the amount of energy your building consumes that is not weather-dependent, i.e. when no degree days are recorded. Our example suggests that 1,290 litres of oil are consumed in our building per week, regardless of the outside temperature.
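The short Python sketch below shows how the slope and intercept from our example (52.7 and 1,290) could be used to estimate expected oil usage for a week and compare it with actual usage. It assumes the usual straight-line form, usage = slope × degree days + intercept; the week's degree day total and the actual usage figure are hypothetical.

```python
SLOPE_LITRES_PER_HDD = 52.7   # slope m from the example model
BASELOAD_LITRES = 1290.0      # intercept from the example model

def expected_litres(weekly_hdd):
    """Expected weekly oil use for a given weekly heating degree day total."""
    return SLOPE_LITRES_PER_HDD * weekly_hdd + BASELOAD_LITRES

hdd_this_week = 40.0      # hypothetical weekly heating degree days
actual_litres = 3100.0    # hypothetical metered usage for the same week

expected = expected_litres(hdd_this_week)
saving = expected - actual_litres   # positive means less oil used than expected
print(round(expected, 1), round(saving, 1))
```

In this made-up week the model expects about 3,398 litres, so actual usage of 3,100 litres is roughly 298 litres below what the weather alone would predict.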

The coefficient of determination, commonly known as R² (pronounced R-squared), is probably the most important output of the regression analysis. This statistic, derived from the regression equation, quantifies your model’s performance based on how close the data points are to the fitted regression line. Basically, it is a measure of how reliable your model is.

R² ranges from 0 to 1. The higher the value, the better the model fits your data. A lower number reflects a weaker relationship and suggests that building controls may need to be examined.
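As a sketch of where this number comes from, the Python below uses the standard definition of R²: one minus the ratio of the unexplained variation (squared gaps between actual and predicted usage) to the total variation in actual usage. The weekly data are hypothetical, and the slope of 53.5 and intercept of 1,275 are the least-squares values for that made-up data, not figures from our example report.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual variation / total variation)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical weekly heating degree days and weekly oil use (litres).
weekly_hdd = [10, 20, 30, 40, 50]
weekly_litres = [1800, 2350, 2900, 3400, 3950]

# Predictions from the least-squares line for this hypothetical data.
predicted = [53.5 * h + 1275.0 for h in weekly_hdd]

r2 = r_squared(weekly_litres, predicted)
print(round(r2, 4))
```

The made-up points sit almost exactly on the line, so the result is very close to 1; real building data scatters more and gives lower values.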

To help you validate your model we have added general guideline ratings for R² values in the model summary. In our sample model, an R² value of almost 0.7 is quite reasonable, signifying a strong relationship between oil usage and the weather. However, it also suggests that a higher R² value is attainable by exploring stricter energy controls.

P Value

The p value is another valuable diagnostic measure. It indicates whether or not heating degree days are a significant predictor of your facility’s energy consumption. Similar to the guideline ratings for R², we have provided ratings for p values to help you diagnose your model.

A good model will have a low p value and be labelled significant, as in our example above. This tells us that weather is a meaningful factor in our model and we can trust this regression analysis. On the other hand, a high p value implies that weather is not a significant predictor variable for your energy usage and other driving factors should be explored.
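One way to get a feel for what a p value measures is a permutation test: shuffle the energy figures so that any real link with the weather is broken, refit the slope, and count how often chance alone produces a slope at least as steep as the one observed. This is a different technique from the t-test that regression software normally uses to report a p value, and is shown here only because it needs nothing beyond the Python standard library. The data, function names, shuffle count, and seed are all assumptions for illustration.

```python
import random

def fitted_slope(x, y):
    """Least-squares slope of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def permutation_p_value(x, y, shuffles=2000, seed=0):
    """Share of random shuffles whose slope is at least as steep as the observed one."""
    rng = random.Random(seed)   # fixed seed so the sketch is repeatable
    observed = abs(fitted_slope(x, y))
    shuffled = list(y)
    as_extreme = 0
    for _ in range(shuffles):
        rng.shuffle(shuffled)
        if abs(fitted_slope(x, shuffled)) >= observed:
            as_extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (as_extreme + 1) / (shuffles + 1)

# Hypothetical weekly heating degree days and weekly oil use (litres).
weekly_hdd = [12, 25, 31, 44, 18, 52, 38, 27]
weekly_litres = [1950, 2600, 2850, 3700, 2300, 4000, 3250, 2750]

p_value = permutation_p_value(weekly_hdd, weekly_litres)
print(p_value)
```

Because oil use in this made-up data rises steadily with degree days, almost no shuffle matches the observed slope and the p value comes out very small.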

Other Diagnostics…

The results of your linear model are only reliable if your R² and p value are satisfactory. Hence, assessing these statistics is a crucial step in the analysis.

If you have a deeper understanding of regression analysis and are interested in more advanced diagnostics, Dattica also provides the option to examine the residual plots of your model.


Examining these plots for unwanted residual patterns is an extremely effective way of detecting bias in your model.
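For context, a residual is simply the gap between actual and predicted usage for one data point. The Python sketch below calculates residuals for hypothetical weekly data, using the least-squares line for that data (slope 53.5, intercept 1,275), and lists them in order of increasing degree days. All names and figures are invented for illustration.

```python
def residuals(hdd, litres, slope, intercept):
    """Actual minus predicted usage for each week."""
    return [y - (slope * x + intercept) for x, y in zip(hdd, litres)]

# Hypothetical weekly heating degree days and weekly oil use (litres).
weekly_hdd = [10, 20, 30, 40, 50]
weekly_litres = [1800, 2350, 2900, 3400, 3950]

res = residuals(weekly_hdd, weekly_litres, slope=53.5, intercept=1275.0)

# Long runs of the same sign as degree days increase would hint at a curve
# or a missing driving factor; a mix of signs is what a sound model shows.
for h, r in sorted(zip(weekly_hdd, res)):
    print(f"HDD {h:>3}: residual {r:+.0f} litres")
```

Here the residuals are small and change sign from week to week, with no obvious pattern.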

What’s Next?

Regression analysis is often referred to as the ‘go-to method in analytics’ as it is so effective in explaining different types of phenomena people want to understand, predict, and make decisions on in business. In our previous post we outlined its applications in developing budgets, designing facility additions, setting energy use targets, pinpointing potential energy wastage, or calculating precise energy-saving figures.

This implementation using heating degree data is our first step in helping you understand regression and your data better. In addition to weather, there are many other influential driving factors affecting your energy consumption that this analysis can be applied to, e.g. sales, production, building occupancy, etc.

To reap the full benefits of regression analysis, Dattica is investing time and effort into exploring its potential. Here is a list of improvements to look out for in the next few months:

  • Regression against factors other than heating degree days
  • Multivariable regression, which means performing regression against several driving factors in one model
  • Measurement and Verification using Regression and CUSUM analysis