At Luxcarta, we recently developed a novel deep learning technique for 3D building extraction from textured meshes. The process significantly speeds up the creation of accurate 3D maps of dense urban areas.
Key takeaways:
In recent years, vast swathes of the planet’s surface have been photographed using satellites, aircraft, drones, and other methods. Using powerful computers, it is now possible to stitch these images together to create a ‘textured mesh’. 3D building extraction takes these meshes a step further: it cuts building footprints, together with their heights, out of the mesh, which allows us to identify individual structures.
This is an incredibly powerful tool. Polygonal extraction of building footprints allows urban planners, architects, utilities providers, engineers, and many other professionals to achieve a far deeper understanding of urban areas and building heights for all sorts of purposes.
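To make the idea of "footprints with heights" more concrete, here is a minimal Python sketch. It is not Luxcarta's method. It assumes a segmentation step has already produced a binary building mask on a regular grid, that an elevation value is available for every grid cell, and that the ground is flat. The function name `extract_buildings` and its parameters are hypothetical and used for illustration only.

```python
from collections import deque
from statistics import median


def extract_buildings(mask, elevation, ground_level=0.0):
    """Group building pixels into footprints and estimate a height for each.

    mask:         2D list of 0/1 values (1 = building pixel); assumed to be
                  the output of some segmentation step.
    elevation:    2D list of floats with the same shape as mask.
    ground_level: assumed flat ground elevation (a strong simplification).
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    buildings = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected building pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Median roof elevation minus the assumed ground level.
            height = median(elevation[y][x] for y, x in pixels) - ground_level
            buildings.append(
                {"pixels": pixels, "area_px": len(pixels), "height": height}
            )
    return buildings


# Toy example: two separate buildings on ground at elevation 2.0.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
elevation = [
    [12.0, 12.0, 2.0, 2.0],
    [12.0, 12.0, 2.0, 8.0],
    [2.0, 2.0, 2.0, 8.0],
]
buildings = extract_buildings(mask, elevation, ground_level=2.0)
for b in buildings:
    print(b["area_px"], b["height"])
```

In a real workflow the pixel groups would still need to be converted into clean polygons, and the ground level would come from a terrain model rather than a single constant.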
However, 3D mesh building modeling is typically very time-consuming and resource-intensive. So, we decided to experiment with a new deep learning method for building segmentation from colour images with elevation data, one that speeds up the process of generating accurate LoD2 buildings. We demonstrate the performance and potential of our new method by evaluating it on three cities around the world with different characteristics. We presented this work at the SPIE conference in October 2023 (you can read the paper here).
For many years, cartographers have been able to manually ‘cut out’ building footprints from images and add them as a layer in their GIS mapping systems. However, this process is very time-consuming. Similarly, various techniques also exist for turning 2D aerial or satellite images into a 3D model. But again, this process tends to be resource-intensive and can take several days or even weeks to complete.
This is problematic for several reasons.
First and foremost, it adds a significant delay to any project. Imagine that a city wanted to create a map of the urban environment to help plan its flood defences. Creating a detailed 3D map would usually add several weeks to the process – and may also require skilled (and expensive) consultants.
There’s also the issue of change. In modern cities, new buildings – both permitted and unofficial (i.e., informal housing) – can be added rapidly, so existing maps can quickly go out of date. If a utilities business wants to build new electricity lines, it needs the most up-to-date maps to know where buildings are and how tall they are. If new structures have appeared in formerly empty space, this could seriously disrupt the plans. Being able to create up-to-date and accurate 3D maps is therefore very valuable.
Another common problem is image noise and distortion. Satellite and aerial images must be orthorectified (corrected so that they appear as if the photo was taken from directly above). But in urban environments, this can be very challenging – sometimes tall buildings obscure lower-level buildings next to them. Identifying and correcting these sorts of issues usually requires highly experienced technicians.
Recent advances in deep learning techniques present tantalising possibilities for polygonal extraction of building footprints from imagery. At Luxcarta, we wanted to explore what semantic segmentation of textured 3D meshes could offer.
First, some definitions can be helpful:
The results were impressive. Our model demonstrated high (90%+) levels of accuracy (precision and recall), automatically identifying large numbers of building structures, their elevations, and footprints in very different urban environments – from suburban US cities, to compact semi-formal structures in Brazil, through to mixed building types in France. Most importantly, the process was significantly faster than manual polygonal extraction of building footprints. We estimate it could deliver a fourfold increase in productivity.
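For readers less familiar with the two metrics mentioned above, the following Python sketch shows how precision and recall can be computed for a predicted building mask against a reference mask. It is an illustration only: it assumes a pixel-wise comparison on small binary grids, whereas the evaluation in the paper may be defined differently (for example, per building). The function name `precision_recall` is hypothetical.

```python
def precision_recall(predicted, reference):
    """Pixel-wise precision and recall for two binary masks of equal shape.

    predicted: 2D list of 0/1 values produced by a model (1 = building).
    reference: 2D list of 0/1 ground-truth values.
    """
    tp = fp = fn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, t in zip(pred_row, ref_row):
            if p and t:
                tp += 1  # building correctly detected
            elif p and not t:
                fp += 1  # predicted building where there is none
            elif t and not p:
                fn += 1  # real building that was missed
    # Precision: share of predicted building pixels that are correct.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of real building pixels that were found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example: the model finds every building pixel plus one false alarm.
predicted = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
]
reference = [
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
precision, recall = precision_recall(predicted, reference)
print(precision, recall)
```

High precision means few false detections, while high recall means few missed buildings, so reporting both gives a fuller picture than a single accuracy figure.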
Qualitative evaluation on Rio de Janeiro, Brazil: segmentation
Qualitative evaluation on Rio de Janeiro, Brazil: polygonization
Our new method for building extraction via semantic segmentation of textured 3D meshes has potential use cases in almost any sector that requires accurate maps of towns and cities. It offers a much faster and more accurate way of building 3D models of large areas than was previously possible, which is especially valuable in challenging areas. Here are just some example use cases:
Our new technique for 3D building extraction from meshes is robust, reliable, accurate, and fast. By applying deep learning to the semantic segmentation of textured 3D meshes, we can significantly speed up the mapping of towns and cities.
For support with rapidly and accurately mapping your town or city, contact Luxcarta today.