Addressing the Problem of Labeled Industrial Data Scarcity through Synthetic Generation of Point Clouds for Training Deep Neural Networks for Semantic Segmentation

Abstract

Osipov A.V., Katechkin A.M., Marinich A.N., Osipova M.A.

Article received: 03.01.2026


The paper presents a methodology for overcoming the scarcity of labeled industrial data needed to train deep neural networks for semantic segmentation. It proposes a platform that synthetically generates training point cloud datasets from a minimal number of real laser scans of mechanical, electrical, and plumbing (MEP) networks. The algorithm detects the axes of cylindrical elements using the Random Sample Consensus (RANSAC) method, constructs joint planes perpendicular to those axes, and applies affine transformations to build assemblies of 2–7 elements. This expands the training set from 8 real scans to more than 800 synthetic examples and raises the segmentation accuracy of PointNet++, a deep hierarchical architecture for learning on point clouds, from 72% to 89% as measured by the Intersection over Union (IoU) metric. The resulting system enables automated creation of BIM models of engineering infrastructure that match design parameters with 90–95% accuracy.
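The three geometric steps named in the abstract (RANSAC axis detection, perpendicular joint planes, affine placement) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that unit surface normals are already available for each point and uses a noise-free synthetic pipe. All function names (`ransac_axis_direction`, `fit_axis`, `rotation_between`, `joint_transform`, `make_cylinder`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def ransac_axis_direction(normals, iters=200, tol=0.05, rng=None):
    """RANSAC estimate of a cylinder axis direction from unit surface normals.
    Normals of an ideal cylinder are perpendicular to its axis, so the cross
    product of two non-parallel normals is a direction hypothesis."""
    rng = np.random.default_rng(rng)
    best_dir, best_count = None, -1
    for _ in range(iters):
        i, j = rng.choice(len(normals), size=2, replace=False)
        d = np.cross(normals[i], normals[j])
        length = np.linalg.norm(d)
        if length < 1e-3:                 # nearly parallel normals: degenerate sample
            continue
        d /= length
        count = int(np.sum(np.abs(normals @ d) < tol))  # normals perpendicular to d
        if count > best_count:
            best_dir, best_count = d, count
    return best_dir

def fit_axis(points, direction):
    """Return (axis_point, radius, t_min, t_max) for a known axis direction."""
    helper = np.array([1.0, 0.0, 0.0]) if abs(direction[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(direction, helper)
    u /= np.linalg.norm(u)
    v = np.cross(direction, u)            # (u, v) span the plane perpendicular to the axis
    x, y = points @ u, points @ v
    # Algebraic circle fit: x^2 + y^2 = 2*a*x + 2*b*y + c, with c = r^2 - a^2 - b^2
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    (a, b, c), *_ = np.linalg.lstsq(A, x**2 + y**2, rcond=None)
    t = points @ direction                # position of each point along the axis
    return a * u + b * v, np.sqrt(c + a**2 + b**2), t.min(), t.max()

def rotation_between(a, b):
    """Rotation matrix turning unit vector a onto unit vector b (Rodrigues formula)."""
    v, c = np.cross(a, b), float(a @ b)
    if np.isclose(c, -1.0):               # opposite vectors: half turn about an axis perpendicular to a
        helper = np.array([1.0, 0.0, 0.0]) if abs(a[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        k = np.cross(a, helper)
        k /= np.linalg.norm(k)
        return 2 * np.outer(k, k) - np.eye(3)
    K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    return np.eye(3) + K + K @ K / (1 + c)

def joint_transform(src_point, src_dir, dst_point, dst_dir):
    """4x4 affine matrix mapping the source joint plane onto the destination joint plane."""
    R = rotation_between(src_dir, dst_dir)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = dst_point - R @ src_point
    return T

def make_cylinder(radius, length, n, rng):
    """Synthetic stand-in for a scanned pipe: surface points and normals, axis along z."""
    theta, z = rng.uniform(0, 2 * np.pi, n), rng.uniform(0, length, n)
    normals = np.column_stack([np.cos(theta), np.sin(theta), np.zeros(n)])
    return radius * normals + np.column_stack([np.zeros(n), np.zeros(n), z]), normals

pts, nrm = make_cylinder(radius=0.05, length=1.0, n=500, rng=np.random.default_rng(0))
d = ransac_axis_direction(nrm, rng=0)
p0, r, t_min, t_max = fit_axis(pts, d)
start, end = p0 + t_min * d, p0 + t_max * d   # centres of the two joint planes (normal = d)

# Attach a copy of the pipe at the end plane, turned to run along +x (an elbow-like joint).
new_dir = np.array([1.0, 0.0, 0.0])
T = joint_transform(start, d, end, new_dir)
moved = (np.column_stack([pts, np.ones(len(pts))]) @ T.T)[:, :3]
assembly = np.vstack([pts, moved])            # two-element synthetic assembly
```

Repeating the last step with randomly chosen elements and joint directions would give assemblies of several elements, with per-point labels carried over from the source scans. Real scans would also require normal estimation, noise handling, and outlier rejection, all of which this sketch omits.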

Keywords: synthetic data generation, point clouds, semantic segmentation, laser scanning, Random Sample Consensus method, shortage of labeled data, BIM modeling, engineering networks, deep learning