Case Study: Crawling for Market Surveillance and Policy Development

Swedish Energy Agency

NordCrawl is a digital tool that can help governments to reduce the cost of market surveillance of energy efficiency standards and labelling programmes and reduce the loss of energy savings associated with non-compliance.

In existence since the 1970s and operating in more than 80 countries around the world, national energy efficiency standards and labelling (EESL) programmes are the cornerstone of most national energy efficiency and climate change mitigation programmes. Programmes covering a broad range of products save between 10% and 25% of national or relevant sectoral energy consumption. They are cost-effective with national benefits outweighing costs by a ratio of at least 3 to 1 (4E, 2015). However, programme success hinges on effective market compliance regimes. Between 25% and 50% of energy savings expected from EESL programmes are lost due to non-compliant products (CLASP, 2019).

Despite the obvious importance of compliance, governments struggle to allocate the resources required for the data collection and analysis that effective compliance depends on. Some governments rely on purchasing expensive market data from commercial companies. Others carry out time- and resource-intensive market assessments that include manual internet checks and shop visits. Market surveillance also incorporates expensive product testing to verify compliance. As the number of regulated products grows, resource requirements will increase, which calls for improved and more effective methods.

In recent years, web crawler techniques have developed at a rapid pace. With intelligent software, it is possible to scrape (collect) large volumes of data from publicly available data sources on the Internet, practically in real time. In the realm of product policies, such as energy standards and labelling, this offers alternative, more dynamic and cheaper means to track products on the marketplace.

Web crawlers are computer programs that scan the web to find information. This information (e.g. product type, model name and number; different technical specifications and energy performance data) is then collected, processed and analysed.
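
The collection step can be illustrated with a minimal sketch. It assumes a shop page that lists specifications in a simple two-column table; the page fragment, field names and the SpecTableParser class are hypothetical and are not taken from the NordCrawl software. Real web shops differ in layout, so each would need its own extraction rules.

```python
from html.parser import HTMLParser


class SpecTableParser(HTMLParser):
    """Collects <th>label</th><td>value</td> pairs from a product page."""

    def __init__(self):
        super().__init__()
        self.specs = {}
        self._tag = None    # cell currently being read: "th", "td" or None
        self._label = None  # most recent <th> text still waiting for its <td>

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "th":
            self._label = text
        elif self._tag == "td" and self._label is not None:
            self.specs[self._label] = text
            self._label = None


# Hypothetical page fragment, invented for illustration.
SAMPLE_PAGE = """
<h1>Example fridge-freezer</h1>
<table>
  <tr><th>Model</th><td>XK-200</td></tr>
  <tr><th>Energy class</th><td>D</td></tr>
  <tr><th>Annual energy use (kWh)</th><td>215</td></tr>
  <tr><th>Price (EUR)</th><td>499</td></tr>
</table>
"""


def extract_specs(html):
    """Return the label/value pairs found in the page as a dictionary."""
    parser = SpecTableParser()
    parser.feed(html)
    return parser.specs


specs = extract_specs(SAMPLE_PAGE)
print(specs)
```

The extracted values are still plain text at this point; converting them to numbers and units belongs to the processing step.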

In 2013, the IEA Technology Collaboration Platform for Energy Efficient End-Use appliances and Equipment (4E) started exploring the possibility of using web crawling to access data for energy efficiency programme market surveillance. The concept was further refined through a project financed by the Nordic Council of Ministers (budget EUR 258 000) and carried out by the Nordic countries (Sweden, Denmark, Finland, Iceland) during 2015-2017. During the project, the NordCrawl software platform, which is based on web crawler data, was developed. The scraped data were compared with purchased data and judged good enough for both market surveillance and policy analysis.

The process of web crawling for policy insights

Applications are numerous and include market insights, compliance monitoring and policy evaluation, as detailed below.

Market insights:

  • Snapshots of the market: how it looks right now.
  • Time series showing the development of the market over time and indicating trends, e.g. whether the market is moving and in what direction.
  • Monitoring market responses to changes, such as policy interventions, campaigns etc.

Indication of compliance or non-compliance with regulations:

  • Compliance with minimum energy performance standard requirements
  • Compliance with energy labelling requirements
  • Indications and analyses of possible loopholes used by manufacturers
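
An automated screening of this kind can be sketched as follows. The Listing record, the MAX_ANNUAL_KWH limit and the sample data are all hypothetical, chosen only to show the idea; actual requirements depend on the product group and the regulation in force. A flag is an indication for an inspector to follow up, not a finding of non-compliance.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Listing:
    model: str
    annual_kwh: Optional[float]  # None if the shop did not state it
    energy_class: Optional[str]  # None if no label class was shown


# Hypothetical limit for illustration only, not a real regulatory value.
MAX_ANNUAL_KWH = 250.0


def flag_listing(listing: Listing) -> List[str]:
    """Return the reasons a listing should be reviewed by an inspector."""
    reasons = []
    if listing.energy_class is None:
        reasons.append("energy class not displayed")
    if listing.annual_kwh is None:
        reasons.append("energy use not stated")
    elif listing.annual_kwh > MAX_ANNUAL_KWH:
        reasons.append("energy use above limit")
    return reasons


# Invented sample listings.
listings = [
    Listing("XK-200", 215.0, "D"),
    Listing("XK-300", 310.0, "F"),
    Listing("ZP-10", None, None),
]

# Keep only the listings that raised at least one flag.
flagged = {}
for item in listings:
    reasons = flag_listing(item)
    if reasons:
        flagged[item.model] = reasons

for model, reasons in flagged.items():
    print(model, "->", "; ".join(reasons))
```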

Policy analysis and evaluation:

  • Adaptation patterns – when does the market start to change and how fast does it move, in relation to when a regulation enters into force. 
  • Adaptation rates – how fast does the market adapt to new requirements? 
  • Technology development
  • Detailed data with a high time resolution can reveal the dynamics of the market, which can be used for studies of innovation rates, learning curves etc.
  • Price development
  • Estimations of sales volumes
  • Refined policy design based on the above
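
As a rough illustration of how adaptation patterns could be read from repeated crawls, the sketch below computes the share of listed models that already meet a new requirement at each crawl date. The observations and the NEW_LIMIT_KWH value are invented, and counting listed models is only one possible measure; it says nothing about sales volumes.

```python
from collections import defaultdict

# Hypothetical snapshots: (crawl month, model, annual kWh). Invented data.
observations = [
    ("2016-01", "A", 300), ("2016-01", "B", 240),
    ("2016-01", "C", 320), ("2016-01", "D", 330),
    ("2016-07", "A", 300), ("2016-07", "B", 240),
    ("2016-07", "E", 230), ("2016-07", "F", 210),
    ("2017-01", "B", 240), ("2017-01", "E", 230),
    ("2017-01", "F", 210), ("2017-01", "G", 200),
]

NEW_LIMIT_KWH = 250  # hypothetical requirement, not a real regulatory value


def compliant_share_by_month(rows, limit):
    """Share of listed models at or below the limit, per crawl month."""
    total = defaultdict(int)
    ok = defaultdict(int)
    for month, _model, kwh in rows:
        total[month] += 1
        if kwh <= limit:
            ok[month] += 1
    return {month: ok[month] / total[month] for month in sorted(total)}


shares = compliant_share_by_month(observations, NEW_LIMIT_KWH)
for month, share in shares.items():
    print(month, f"{share:.0%}")
```

Plotting such shares against the date a regulation enters into force shows when the market starts to move and how quickly it adapts.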

The initial setup of the system took a couple of years and cost about EUR 245 000.

The solution gives access to new data sets and time series, makes it possible to track market developments in near real time, and saves resources. Automated checks for potential non-compliance make it possible to focus human and financial resources on the areas of highest risk, and near-real-time data analysis allows faster responses to potentially non-compliant products. It also opens up opportunities to design and implement energy efficiency policy differently, e.g. through more proactive monitoring and enforcement in which actors are informed before and during web crawling and the impact is tracked. Overall, market surveillance becomes more effective and less costly. The solution also creates new opportunities for enhanced cross-border cooperation, data sharing and harmonisation.

Since operating costs are low, there are no real limits to the level of detail or time resolution that can be achieved, as long as the data can be processed and stored systematically. More sophisticated algorithms can be developed to track non-compliance. The system can be integrated into policy development and evaluation processes and linked to government product registries. The technique can be applied to any type of openly available data on the internet, which opens up further possibilities for improving policy analysis.

This type of crawling-based solution will work best in markets where online shopping for appliances is widespread. It is, however, also possible to use the technique to scrape information from manufacturer websites and other sources and collect information valuable for market surveillance and policy development and evaluation.

International experience indicates that investments in market surveillance have a high return on investment in terms of the value of the energy savings they protect. In the Nordic region, collaborative market surveillance at a cost of EUR 2.1 million would avoid the loss of EUR 28 million worth of energy savings (due to non-compliant products), a return on investment of roughly a factor of 13.

NordCrawl can help reduce the investment needed for market surveillance; how much depends on how the surveillance process is set up. For a surveillance process that emphasises market assessments and document controls, considerable savings are possible. For a process that relies primarily on planned tests, NordCrawl would save a smaller share of the overall costs.

Although the crawler only collects public data, the ethics of crawling are not completely clear. Crawling can also be seen as intrusive, and it can slow down websites. The programme needs regular updating (as new needs arise or if companies start blocking crawling), and different products may need different approaches.
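
One common way to limit intrusiveness is to honour a site's robots.txt rules and to pause between requests. The sketch below uses Python's standard urllib.robotparser for this; it is not a description of how NordCrawl handles the issue. The robots.txt content, the shop.example URLs, the user agent name and the default delay are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download the file
# from the shop it is about to visit.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Crawl-delay: 5
"""

USER_AGENT = "example-surveillance-bot"  # hypothetical crawler name


def build_rules(robots_text):
    """Parse robots.txt text into a rule set that can be queried."""
    rules = RobotFileParser()
    rules.parse(robots_text.splitlines())
    return rules


def polite_delay(rules, default_seconds=2.0):
    """Seconds to wait between requests: the site's Crawl-delay if set."""
    delay = rules.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default_seconds


rules = build_rules(ROBOTS_TXT)
allowed = rules.can_fetch(USER_AGENT, "https://shop.example/products/fridges")
blocked = rules.can_fetch(USER_AGENT, "https://shop.example/checkout/cart")

print("product page allowed:", allowed)
print("checkout page allowed:", blocked)
print("wait between requests (s):", polite_delay(rules))
```

A crawler would call time.sleep with the returned delay between requests and skip any URL for which can_fetch returns False.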

  • Bennich, P. and K. Mogensen (2019), Nordcrawl – The Nordic framework for collection, analysis and surveillance of market data based on automated and frequent crawling of retail web shops.
  • Bennich, P. et al. (2017), Using webcrawler techniques for improved market surveillance – new possibilities for compliance and energy policy, ECEEE Summer Study Proceedings.
  • CLASP (2019), Compliance, https://clasp.ngo/impact/compliance
  • Fjordbak Larsen, T. (2015), The Nordic Ecodesign Effect Project: Estimating benefits of Nordic market surveillance of ecodesign and energy labeling, TemaNord: 563.
  • IEA 4E (2015), Standards and labelling programs: A Global Assessment.
  • Lopes, C. (2019), Nordcrawl – A tool for product policy development, market surveillance and evaluation.