Developing a Complex Computer Vision System, a Case Study: Solar-Panel-Equipped Roofs.

At nam.R we are working hard to build the Digital Twin of France. To this end, we use many sources of information, such as aerial images. Extracting useful information from unstructured data like images? Sounds like a job for the computer vision team!

To detect all solar panels on the roofs of French buildings, we used aerial imagery and the known outlines of the buildings, through a pipeline consisting of a solar panel outline detector and a filtering algorithm.

We took inspiration from the projects revolving around the state-of-the-art instance segmentation deep learning algorithm known as Mask-RCNN. This algorithm is the newest member of a family of algorithms developed by Ross Girshick et al., in direct continuation of RCNN, Fast-RCNN and Faster-RCNN.

The chosen pipeline consists of two complementary parts (sketched just after this list):
– an object detector, more specifically an instance segmentation algorithm, meant to detect solar panels and extract their contours;
– a filtering algorithm that takes all detections and filters out those that don't match our business rules.
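In code, the overall flow boils down to a composition of these two stages. This is only a structural sketch: `detect` and `keep` are hypothetical stand-ins for the segmentation model and the business-rule filter discussed below, not our actual implementation.

```python
from typing import Callable, List

from shapely.geometry import Polygon

def run_pipeline(
    image,
    roof: Polygon,
    detect: Callable[..., List[Polygon]],
    keep: Callable[[Polygon, Polygon], bool],
) -> List[Polygon]:
    """Two-stage flow: detect candidate panels, then keep only those
    that satisfy the business rules."""
    candidates = detect(image)
    return [panel for panel in candidates if keep(panel, roof)]
```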

While the filtering algorithm could easily be developed using the expert rules we chose to consider (the size of the detected solar panels and their position relative to the considered roof), the deep learning model depends directly on the data we feed it.
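A minimal sketch of such a rule-based filter, using shapely; the thresholds below are illustrative placeholders, as this post does not disclose the actual values we used:

```python
from shapely.geometry import Polygon

# Hypothetical thresholds, for illustration only.
MIN_AREA_M2 = 1.0        # detections smaller than a panel are noise
MAX_AREA_M2 = 400.0      # implausibly large detections are discarded
MIN_ROOF_OVERLAP = 0.5   # fraction of the panel that must lie on the roof

def passes_business_rules(panel: Polygon, roof: Polygon) -> bool:
    """Keep a detection only if its size and its position relative
    to the roof are plausible."""
    if not MIN_AREA_M2 <= panel.area <= MAX_AREA_M2:
        return False
    overlap = panel.intersection(roof).area / panel.area
    return overlap >= MIN_ROOF_OVERLAP
```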

The first part of the project was, accordingly, to generate a dataset of roofs equipped with solar panels, along with the matching labels. There are multiple existing tools for image annotation (VGG VIA, MIT LabelMe, …) that can be used as-is. We chose the VGG Image Annotator. After a few (hundreds of) clicks, we ended up with a dataset we're quite proud of.
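To feed training, the annotator's export has to be turned into polygons. A minimal sketch, assuming the VIA 2.x JSON export format (each region stores its polygon as parallel lists of x and y coordinates; the layout differs between VIA versions):

```python
import json

def load_via_annotations(path: str) -> dict:
    """Map each image filename to its list of panel polygons,
    assuming a VIA 2.x JSON export."""
    with open(path) as f:
        via = json.load(f)
    annotations = {}
    for entry in via.values():
        polygons = [
            list(zip(region["shape_attributes"]["all_points_x"],
                     region["shape_attributes"]["all_points_y"]))
            for region in entry["regions"]
            if region["shape_attributes"]["name"] == "polygon"
        ]
        annotations[entry["filename"]] = polygons
    return annotations
```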

Only then were we able to train the Mask-RCNN model to detect solar panels, along the lines of the sketch below.
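This post does not name the exact implementation we used; the following sketch assumes the popular open-source matterport/Mask_RCNN Keras implementation, with `dataset_train` and `dataset_val` standing for its Dataset objects built from the annotations above (not shown here):

```python
from mrcnn.config import Config
from mrcnn import model as modellib

class SolarPanelConfig(Config):
    NAME = "solar_panels"
    NUM_CLASSES = 1 + 1   # background + solar panel
    IMAGES_PER_GPU = 2
    STEPS_PER_EPOCH = 500

config = SolarPanelConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="logs/")

# Start from COCO pre-trained weights and fine-tune the heads on the
# solar panel dataset.
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30, layers="heads")
```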
The first version of our model wasn't performing all that well, and it was necessary to add more data to the training stage. We turned to semi-supervised learning with automatic labelling.

This technique consists of using the model to compute more labels, which are then checked by human operators and used as training data for a new, more robust version of the model. Checking whether the proposed labels were right and correcting the wrong ones was far simpler than labelling hundreds of images by hand. Basically, we used our first model as a replacement for crowdsourcing!
After a few loops we had gathered more data and matching labels, and were able to train a model with acceptable performance.
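One round of this loop can be sketched as follows; `model` and `review` are hypothetical callables standing for the current model and the human operators who validate or correct each proposal:

```python
def labelling_round(model, unlabelled_images, review):
    """One round of model-assisted labelling: the model proposes
    labels, humans validate or fix them, and the corrected pairs
    become training data for the next, more robust model."""
    new_examples = []
    for image in unlabelled_images:
        proposed = model.predict(image)
        corrected = review(image, proposed)
        new_examples.append((image, corrected))
    return new_examples
```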

We transformed the raw output of the model into polygons in the same format and projection as our building polygons, using geometric algorithms (Marching Squares, Douglas-Peucker) and geographic transformations. This enabled us to directly filter out potential false positives. We found out that roof windows, glass roofs and blue awning fabric were likely to be mistaken for solar panels due to their similar visual textures.
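A minimal sketch of this vectorisation step, assuming scikit-image for Marching Squares and shapely for Douglas-Peucker simplification (the reprojection of pixel coordinates into the buildings' coordinate reference system is omitted):

```python
import numpy as np
from shapely.geometry import Polygon
from skimage import measure

def mask_to_polygons(mask: np.ndarray, tolerance: float = 1.0) -> list:
    """Vectorise a binary mask into simplified polygons.

    Marching Squares extracts the iso-contours of the mask, then
    Douglas-Peucker (shapely's `simplify`) removes redundant vertices.
    """
    contours = measure.find_contours(mask.astype(float), 0.5)
    polygons = []
    for contour in contours:
        if len(contour) < 3:
            continue
        # skimage returns (row, col) pairs; swap them to (x, y)
        polygon = Polygon(np.fliplr(contour)).simplify(tolerance)
        if polygon.is_valid and not polygon.is_empty:
            polygons.append(polygon)
    return polygons
```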

The first conclusion we drew is that the combination of machine learning and expert rules can become a reliable framework, harnessing the power of machine learning algorithms and the robustness of business rules.

The second was that our first, imperfect model could help us label more data. Real data but synthetic labels, a great example of human-machine cooperation, isn't it?

Finally, there are many different ways to measure the performance of this kind of pipeline. The deep learning model itself can be evaluated using metrics such as its mean average precision, but we were mainly interested in the performance of the whole flow. Thus, we chose metrics that are less image-centric and more oriented towards information retrieval: precision, recall and overall accuracy. We added a geometric metric that indicates how well our predicted panels match the actual ones, the Intersection over Union (IoU).
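A minimal sketch of these metrics with shapely, assuming panels are represented as polygons; how a detection is matched to an actual panel (e.g. an IoU threshold) is a design choice not detailed in this post:

```python
from shapely.geometry import Polygon

def iou(predicted: Polygon, actual: Polygon) -> float:
    """Intersection over Union between two panel polygons."""
    union_area = predicted.union(actual).area
    return predicted.intersection(actual).area / union_area if union_area else 0.0

def precision_recall(true_pos: int, false_pos: int, false_neg: int):
    """Information-retrieval view of the whole flow: a detection
    counts as a true positive when it matches an actual panel."""
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall
```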

We achieved 96% overall algorithm accuracy and 84% IoU on our test set, values we're quite proud of.

The predictions of solar panels were integrated into nam.R's Digital Twin, and the information is already being put to good use!
