Fast Shape Retrieval Based on Differential Chain Code Descriptor and Fuzzy Contour Matching

This work targets fast retrieval of similar CAD shapes from large drawing repositories. Instead of comparing bitmaps or expensive local curvature descriptors directly, it converts contour geometry into differential chain-code sequences and performs efficient sequence matching.
Problem setting
CAD drawings often contain many contour-like components with limited texture or intensity information, so shape is the most reliable retrieval cue. The challenge is to make contour matching both fast and robust: direct point-set matching can be expensive, while ordered chain-code matching is sensitive to noise because one wrong local point can shift many subsequent correspondences.
The method therefore treats retrieval as a two-stage problem. First, the contour is normalized into an ordered and uniformly sampled representation. Second, matching is performed on short differential-code sub-sequences rather than the entire contour at once.
Contour preprocessing
For binary CAD images, the method first uses run-length encoding to identify white regions in each row. Connectivity analysis converts the run representation into graph structures, and depth-first search extracts outer and inner contours. Because the raw run vertices are sparse and uneven, the contour is smoothed and uniformly resampled before coding.

Contour preprocessing. Run-length information is converted into contour graphs, then the raw contour is smoothed and resampled at a fixed spacing.
Differential chain-code descriptor
The descriptor is based on Freeman-style chain codes, but the paper computes differential codes directly from triples of consecutive points. Each middle point is assigned a direction-change code according to the rotation angle between adjacent contour vectors. This keeps the representation compact and translation-invariant while preserving the local turning structure of the contour.

Differential chain coding. Local turning angles are quantized into directional codes; short code sub-sequences are then matched against a longer model sequence.
Segmented fuzzy matching
The key robustness step is segmented fuzzy matching. A local point set is split into multiple parts, each part is matched against the model contour as a short chain-code sub-sequence, and each match produces a cost array. This avoids letting a single noisy point propagate errors through a long ordered sequence.

Segmented fuzzy matching. The local point set is split into parts, each part produces a cost array on the model contour, and candidate positions are later selected from the combined costs.
The cost arrays are then aggregated. A sliding window suppresses local noise by assigning the minimum value inside the window to its center position. Since the resulting cost distribution forms valley-like segments, local minima are retained as candidate matching positions.

Cost calculation. Sub-sequence costs are accumulated, smoothed by local windows, and converted into candidate matching positions from local minima.
Finally, candidate matches are verified geometrically. For each possible matching position, corresponding point pairs are formed, a transformation matrix is estimated with SVD, and the average post-transform point distance is used to reject false matches. ICP refinement can then further improve the final alignment.
Results
The experiments use shapes converted from DXF files and CAD drawings captured directly by measuring instruments. The paper compares matching quality and time against HALCON and OPT SMART. In the quantitative table over eight test images, the average error values are approximately 0.868 / 0.393 for HALCON, 0.742 / 0.320 for SMART, 0.498 / 0.222 for the proposed method, and 0.397 / 0.170 for the proposed method with ICP refinement.

Representative experimental row. The result panel shows a query shape and several matched/aligned contour outputs from the evaluation set.
Takeaway
The practical value is speed and robustness for CAD-like industrial shape retrieval. Differential chain codes make contour comparison a lightweight sequence problem, while segmentation, cost aggregation, SVD verification, and optional ICP refinement reduce the noise sensitivity that usually comes with ordered contour descriptors.