At nam.R, we are working hard to build the Digital Twin of France. To this end, we use many sources of information, such as aerial images. Extracting useful information from unstructured data like images? Sounds like a job for the computer vision team!
To detect all solar panels on the roofs of French buildings, we used aerial imagery and the known outlines of the buildings, through a pipeline consisting of a solar panel outline detector and a filtering algorithm.
We took inspiration from projects built around Mask-RCNN, a state-of-the-art deep learning algorithm for instance segmentation. It is the latest member of a family of algorithms developed by Ross Girshick et al., in direct continuation of RCNN, Fast-RCNN and Faster-RCNN.
The chosen pipeline consists of two complementary parts:
– an object detector, more specifically an instance segmentation algorithm, meant to detect solar panels and extract their contours;
– a filtering algorithm that takes all detections and filters out those that don’t match our business rules.
While the filtering algorithm can easily be developed from the expert rules we chose to consider (the size of the detected solar panels and their position relative to the roof in question), the deep learning model depends directly on the data we feed it.
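To make the idea of expert rules concrete, here is a minimal sketch of such a filter in plain Python. It is an illustration only: the function names, the thresholds (`min_area`, `max_area_ratio`) and the rule that every panel vertex must fall inside the roof outline are assumptions made for this example, not the rules actually used in our pipeline.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # vertices in order, first vertex not repeated at the end


def polygon_area(poly: Polygon) -> float:
    """Absolute area of a simple polygon (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(poly):
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def point_in_polygon(pt: Point, poly: Polygon) -> bool:
    """Ray casting: count how many edges a horizontal ray from pt crosses."""
    x, y = pt
    inside = False
    for i, (x1, y1) in enumerate(poly):
        x2, y2 = poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y):  # the edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def keep_detection(
    panel: Polygon,
    roof: Polygon,
    min_area: float = 1.0,        # hypothetical threshold, in map units squared
    max_area_ratio: float = 1.0,  # hypothetical: panel may not exceed the roof
) -> bool:
    """Return True if a detected panel passes the illustrative rules."""
    area = polygon_area(panel)
    if area < min_area:
        return False  # too small to be a plausible panel
    if area > max_area_ratio * polygon_area(roof):
        return False  # larger than the roof it is supposed to sit on
    # Position rule (assumed): every vertex must lie inside the roof outline.
    return all(point_in_polygon(vertex, roof) for vertex in panel)
```

A detection is kept only if `keep_detection(panel, roof)` returns `True`. Points lying exactly on the roof boundary are not handled specially here, so a production filter would likely add a small tolerance.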
The first part of the project was, accordingly, to generate a dataset of roofs equipped with solar panels, along with the matching labels. Several existing image annotation tools (VGG VIA, MIT LabelMe,…) can be used as is. We chose the VGG Image Annotator. After a few (hundred) clicks, we ended up with a dataset we’re quite proud of.
Only then were we able to train the Mask-RCNN model to detect solar panels.
The first version of our model wasn’t performing all that well, and it was necessary to add more data to the training stage. We did so through semi-supervised learning with automatic labelling.
This technique consists of using the model to compute additional labels, which are then checked by human operators and used as training data for a new, more robust version of the model. Checking whether the proposed labels were right and correcting the wrong ones was far simpler than labelling hundreds of images by hand. Basically, we used our first model as a replacement for crowdsourcing!
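The loop described above can be sketched in a few lines of Python. This is a schematic outline under stated assumptions: `predict`, `review` and `train` are hypothetical callables standing in for model inference, the human check and a training run, and they do not correspond to any particular library.

```python
from typing import Any, Callable, Iterable, List, Optional, Sequence, Tuple

Example = Tuple[Any, Any]  # (image, label)


def label_with_model_in_the_loop(
    model: Any,
    labelled: Sequence[Example],
    unlabelled_batches: Iterable[Sequence[Any]],
    predict: Callable[[Any, Any], Any],            # hypothetical: (model, image) -> proposed label
    review: Callable[[Any, Any], Optional[Any]],   # hypothetical: human check, None = rejected
    train: Callable[[List[Example]], Any],         # hypothetical: labelled data -> new model
) -> Tuple[Any, List[Example]]:
    """Grow the training set with model proposals checked by a human."""
    dataset = list(labelled)
    for batch in unlabelled_batches:
        for image in batch:
            proposal = predict(model, image)   # the model proposes a label
            checked = review(image, proposal)  # an operator accepts, fixes or rejects it
            if checked is not None:
                dataset.append((image, checked))
        model = train(dataset)                 # retrain on the enlarged dataset
    return model, dataset
```

Each pass over a batch corresponds to one loop: the current model proposes labels, an operator validates or corrects them, and a new model is trained on everything collected so far.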
After a few such loops, we had gathered more data and matching labels and were able to train a model with acceptable performance.
We transformed the raw output of the model into polygons in the same format and projection as our building polygons using geometric algorithms (Marching Squares, Douglas-Peucker) and geographic transformations. This enables us to filter out potential false positives directly. We found that roof windows, glass roofs and blue awning fabric were likely to be mistaken for solar panels because of their similar visual textures.
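As an illustration of the simplification step, here is a compact pure-Python version of the Douglas-Peucker algorithm, which reduces a dense contour (such as one traced by Marching Squares) to a few significant vertices. It is a sketch rather than the code we ran: the function names and the `epsilon` tolerance are chosen for this example, and in practice a geometry library would typically provide this step.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def perpendicular_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the line through a and b (to a itself if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def douglas_peucker(points: Sequence[Point], epsilon: float) -> List[Point]:
    """Simplify a polyline, dropping points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the point farthest from the chord joining the two endpoints.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], first, last)
        if dist > max_dist:
            index, max_dist = i, dist
    if max_dist <= epsilon:
        return [first, last]  # everything in between is negligible
    # Otherwise keep that point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # avoid duplicating the shared point
```

For example, `douglas_peucker(contour, epsilon=0.5)` keeps only the vertices that deviate by more than 0.5 units from the simplified outline. A larger `epsilon` gives coarser polygons.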
The first conclusion we drew is that combining machine learning with expert rules can yield a reliable framework, harnessing both the power of machine learning algorithms and the robustness of business rules.
The second is that our first, imperfect model could help us label more data. Real data but synthetic labels: a great example of human-machine cooperation, isn’t it?
Finally, there are many ways to measure the performance of this kind of pipeline. The deep learning model itself can be evaluated using metrics such as mean average precision, but we were mainly interested in the performance of the whole flow. We therefore chose metrics that are less image-centric and more oriented towards information retrieval: precision, recall and overall accuracy. We added a geometric metric that indicates how well our predicted panels match the actual ones: the Intersection over Union (IoU).
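For reference, these metrics can be computed as follows. This is a minimal sketch using the standard definitions. It assumes that detections have already been counted as true/false positives and negatives, and it computes IoU on small binary masks rather than on polygons to keep the example self-contained. The function names are ours.

```python
from typing import Sequence, Tuple


def precision_recall_accuracy(
    tp: int, fp: int, fn: int, tn: int
) -> Tuple[float, float, float]:
    """Standard retrieval metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


def mask_iou(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Intersection over Union of two binary masks of the same shape."""
    intersection = 0
    union = 0
    for row_pred, row_truth in zip(pred, truth):
        for p, t in zip(row_pred, row_truth):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly by convention (an assumption of this sketch).
    return intersection / union if union else 1.0
```

Precision answers "how many of the detected panels are real?", recall answers "how many of the real panels were detected?", and IoU measures how closely each predicted shape overlaps the true one.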
We achieved 96% overall accuracy for the algorithm and 84% IoU on our test set, values we’re quite proud of.
The solar panel predictions were integrated into nam.R’s Digital Twin, and the information is already being put to good use!