Proceedings of the 2019 Annual Scientific Conference (Tuyển tập Hội nghị Khoa học thường niên năm 2019). ISBN: 978-604-82-2981-8

CLUSTERING ALGORITHM FOR RECOGNITION OF COMPUTER AIDED DESIGN IMAGES

Nguyễn Văn Nam
Thuyloi University

1. INTRODUCTION

Computer Aided Design (CAD) images are technical drawings that are frequently used in engineering domains. In mechanical factories, CAD drawings must be converted to the corresponding Computer Numerical Control (CNC) machine commands for cutting material. Most CAD software can perform this conversion automatically for CAD files. However, the conversion of scanned CAD images is currently done by hand. Most recent recognition methods based on deep neural networks are not effective enough because the CAD objects are too small and noisy. In this paper, we rely on the old but efficient DBSCAN clustering algorithm, which achieves more than 90% accuracy for CAD recognition. To the best of our knowledge, this is the first work to address this problem.

Given only four 2D projections of a 3D object, as in Fig. 1, an engineer needs to recognize the details of the CAD drawing, including projections (front, rear, left, right, top, bottom), lines (solid, dotted, distance), circles (semicircle, disk), arcs, and text boxes. The engineer then draws a flattened image corresponding to the real object, as can be seen in Fig. 2. This final CAD file can then be converted to CNC commands.

Figure 1. Real 3D object and its four 2D projections

Figure 2. Flattened CAD image

2. RESEARCH METHOD

Figure 3. Scanned CAD drawing

As shown in Fig. 3, a CAD drawing includes a rectangular bounding box, a notes section placed at one side of the bounding box, a text box describing the material type of the object, and several CAD projections showing the detailed shape and size of the object viewed from at most six directions: top, bottom, left, right, front, and rear. A projection in a CAD drawing consists of distance lines, alignment lines, text boxes, a closed contour of the whole
object and some circles, disks, and boxes inside the closed contour. The closed contour may be as simple as a rectangle, polygon, or circle, or any more complex combination of them. Since the final CAD image contains only closed contours, we describe our method for extracting this contour in every projection and then linking the contours together.

As the analysis above shows, CAD drawing images can be clustered into distinct regions. This leads us to consider clustering algorithms from machine learning.

K-Means [1] is a partitioning spatial clustering method that assigns each pixel to the cluster with the nearest mean. K-Means segments the data space into Voronoi cells. This method cannot be applied in this case, because the largest rectangular bounding box would become the only K-Means partition.

Ward [2] is a hierarchical clustering algorithm. It is a bottom-up algorithm that merges small groups into bigger ones based on an agglomerative criterion. Once more, the largest rectangle bounds the whole image, so Ward would consider it as the only partition.

DBSCAN [3] is a density-based clustering method. It separates high-density clusters from low-density regions. Therefore, it can find clusters of arbitrary shape, especially closed contours, provided their points are close enough to their neighbors. DBSCAN defines core points and outliers. The former must form a group of at least minPts points in which the distance from each point to its closest neighbor in the same group is less than eps. The latter are all the remaining points. The DBSCAN algorithm is shown in Fig. 4.

[Flowchart: Begin with the data set D = {p}, the parameters eps and minPts, and cluster = []. Any unlabeled p in D? If no, End. If yes, find all points reachable from p in D based on eps. At least minPts found? If yes, record a new cluster and label its points; then return to the check for unlabeled points.]

Figure 4. DBSCAN clustering algorithm

The algorithm starts from any unvisited point p in the data space and finds all points that are reachable from p. A point q is reachable from p if there is a path from p to q in which every point is close enough (within eps) to the previous one. If the number of reachable points is at least minPts, then a new cluster is recorded and its points are labelled. Otherwise, they are outliers.

Based on DBSCAN clustering, our CAD recognition method consists of five steps. The first step extracts the outer and notes partitions. Second, the CAD projections are partitioned. The third step removes all distance lines and alignment lines. Next, the largest contour of the object in each projection is revealed. Finally, the largest contours of all projections are linked together to form the flattened image.

Figure. The rear and left (right) projections

3. RESEARCH RESULTS

The method is tested on 20 random CAD drawing images at 300 dpi. eps and minPts are chosen as [...] and 10, respectively. In 90% of the cases, the rectangular bounding box and the notes are extracted accurately. The failures occur because some notes are placed far from the bounding box and some are too close to the projections. 98% of the projection partitions are correct. The errors arise because small projections may be linked to their larger depictions. After removing all the alignment and distance lines, nearly 100% of the largest contours are extracted from the projections.

Figure. Final flattened result image

Figs. 5, 6, and 7 show the results of our method for the CAD drawing in Fig. 3.

4. CONCLUSION

In this paper, we target the problem of CAD drawing recognition. The method is based on the DBSCAN clustering algorithm. It produces excellent experimental results, with more than 90% clustering accuracy on 20 random CAD drawings. In future work, we will continue with the recognition of small
CAD items such as small circles and disks.

Figure. The front projection

Figure. Top (bottom) projections

5. REFERENCES

[1] Hartigan, J. A. & Wong, M. A. (1979). A k-means clustering algorithm. JSTOR: Applied Statistics, 28, 100-108.
[2] Ward, J. H. (1963). Hierarchical Grouping to Optimize an Objective Function. Journal of the American Statistical Association, 58, 236-244.
[3] Ester, M., Kriegel, H.-P., Sander, J. & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD'96), Evangelos Simoudis, Jiawei Han, and Usama Fayyad (Eds.). AAAI Press, 226-231.
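
The DBSCAN procedure summarized in the Research Method section can be sketched in a few lines of Python. This is a minimal illustration that follows the standard formulation in [3]; it is not the implementation used in the experiments. The function names (`region_query`, `dbscan`), the use of Euclidean distance on pixel coordinates, and the sample values `eps=1.5` and `min_pts=3` are all assumptions chosen for the example and are not taken from the paper.

```python
from math import hypot

NOISE = -1  # label given to outliers


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)          # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE              # may still become a border point later
            continue
        cluster_id += 1                    # points[i] is a core point: new cluster
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:                       # grow the cluster through reachable points
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id     # former outlier becomes a border point
            if labels[j] is not None:
                continue                   # already assigned, do not expand again
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # only core points extend the cluster
    return labels


# Hypothetical data: the pixels of a 5x5 square contour plus one distant pixel.
square = ([(x, 0) for x in range(5)] + [(x, 4) for x in range(5)]
          + [(0, y) for y in range(1, 4)] + [(4, y) for y in range(1, 4)])
labels = dbscan(square + [(50, 50)], eps=1.5, min_pts=3)
print(labels)
```

In this toy input, the 16 contour pixels are chained together through neighbors that lie within eps, so they end up in a single cluster, while the isolated pixel is labelled as an outlier. This mirrors why a density-based method suits closed contours: the cluster follows the shape of the contour rather than a centroid.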