Local Registration of Multi-Focus Images via Optical Flow Tracking and Delaunay Triangulation

Significance 

Multi-focus image registration is critical for computational imaging tasks such as image fusion and shape-from-focus reconstruction, both of which depend on precisely aligned image sequences captured at varying focal depths. When a camera lens shifts focus across different planes, the sharp regions move while out-of-focus areas lose structural information. This spatial variation complicates alignment, because conventional registration algorithms, typically optimized for globally sharp or rigid scenes, struggle to maintain correspondence within defocused regions. The loss of local texture and edge definition weakens feature extraction, which leads to erroneous global transformations and lower fusion quality. Despite decades of progress in computer vision and medical imaging, the problem of registering multi-focus images has received comparatively little attention. Early studies adopted global registration strategies that rely on affine or polynomial transformations derived from salient keypoints such as Speeded-Up Robust Features (SURF), Scale-Invariant Feature Transform (SIFT), or Harris corners. These methods perform adequately when sufficient features are available across the image. However, when large regions are defocused, the density of extracted features drops sharply, leaving the transformation model constrained by sparse and unevenly distributed correspondences. As a result, global alignment tends to minimize average error at the cost of local precision, introducing small distortions that propagate into downstream fusion or 3D reconstruction tasks.

Previous studies also attempted to overcome this limitation through hybrid schemes that combine feature-based and intensity-based methods or that apply global and local corrections in sequence. These hybrid frameworks remain limited, however, by their reliance on mutual information in low-texture regions or by assumptions of uniform scale change. Deep learning approaches, while powerful for medical or remote-sensing registration, require extensive domain-specific datasets and generalize poorly to the highly variable defocus characteristics of natural scenes. Training such models to infer accurate transformations in blurred zones remains particularly challenging. To address this challenge, a new research paper led by Professor Xiaohua Xia and published in Signal Processing describes a multi-focus image registration method that combines optical flow tracking with Delaunay triangulation to improve alignment precision. By extracting both salient and non-salient features, the algorithm registers blurred regions that traditional techniques often neglect. Local mesh-based interpolation then maintains geometric consistency across the image, and the resulting alignment significantly outperforms global affine models.

The team implemented their registration algorithm in C++ with OpenCV and ran the experiments on an Intel i7 workstation. They tested it on a diverse set of multi-focus image pairs, each composed of a reference image and a target image captured at distinct focal depths. The workflow began with the extraction of salient features using the SURF detector, set with a Hessian threshold of 1000, a value chosen to balance sensitivity against computational cost. These features, concentrated in the sharply focused regions, provided a reliable scaffold for the initial motion estimation. Because many areas of multi-focus images are blurred and feature-poor, the researchers extended the coverage with optical flow tracking. This technique propagated motion vectors from the detected keypoints into surrounding defocused regions, assuming local brightness constancy and small inter-frame displacements. The authors thereby obtained a much denser and more continuous map of correspondences, one that bridged clear and blurred zones and laid the groundwork for a more faithful geometric transformation.
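The brightness-constancy and small-displacement assumptions mentioned above are the basis of gradient-based trackers of the Lucas-Kanade type. The article does not say which optical flow variant the authors used, so the following is only a minimal single-window sketch in pure Python, not their C++/OpenCV implementation. The function name `lucas_kanade_window`, the synthetic test image, the window size, and the shift of (0.3, -0.2) pixels are all assumptions chosen for illustration.

```python
import math


def make_image(width, height, dx=0.0, dy=0.0):
    """Smooth synthetic image whose content is shifted by (dx, dy) pixels (illustrative)."""
    return [
        [math.sin(0.2 * (x - dx)) + math.cos(0.15 * (y - dy)) for x in range(width)]
        for y in range(height)
    ]


def lucas_kanade_window(img0, img1, cx, cy, half=10):
    """Estimate one displacement (u, v) for the window centred on (cx, cy).

    Solves the least-squares form of Ix*u + Iy*v + It = 0 (brightness
    constancy, first-order expansion) over all pixels in the window.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            ix = (img0[y][x + 1] - img0[y][x - 1]) / 2.0  # central difference in x
            iy = (img0[y + 1][x] - img0[y - 1][x]) / 2.0  # central difference in y
            it = img1[y][x] - img0[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # Flat or one-directional texture: the 2x2 system is not solvable.
        raise ValueError("window has too little texture to estimate motion")
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


reference = make_image(41, 41)
target = make_image(41, 41, dx=0.3, dy=-0.2)
u, v = lucas_kanade_window(reference, target, cx=20, cy=20)
print(f"estimated displacement: u={u:.3f}, v={v:.3f}")
```

The estimate is only approximate because the linearisation ignores higher-order terms, which is why the method depends on small displacements. A production tracker would typically add image pyramids and iterative refinement to handle larger motion.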

Next, the authors used Delaunay triangulation to partition the feature space into a network of local, non-overlapping triangles. Within each triangle, barycentric interpolation determined the pixel-wise mapping between the reference and target images. This local approach let each triangle of the mesh adjust its transformation independently, a flexibility that helped correct spatial variations and depth-dependent distortions often missed by global affine registration. The visual outcome was striking: transitions across defocus boundaries appeared smooth, and the warping or ghosting artifacts common in global models were largely absent. To evaluate performance, the authors employed three metrics: mean squared error (MSE), normalized correlation coefficient (NCC), and Euclidean distance between corresponding features. Across eight representative image sets, their method consistently achieved the lowest MSE and the highest NCC, in many cases approaching perfect alignment. The Euclidean distance results further demonstrated its precision, showing reductions in displacement error of up to half compared with improved affine models. When benchmarked against previously published methods, such as Xia’s scale-factor model and De’s two-step hybrid approach, the proposed algorithm reduced root mean square error values by roughly 20% and increased NCC by several percentage points. In practical fusion tests using the NSCT, CVT, BRW, and ECNN frameworks, the registered images also yielded fused outputs of noticeably higher structural similarity and mutual information, and the method remained robust on datasets with steep blur gradients or microscopic detail.
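The per-triangle mapping and two of the evaluation metrics described above can be illustrated with a short pure-Python sketch. It assumes the triangulation has already been computed (for example by a library routine) and handles one matched pair of triangles. The function names, the example coordinates, and the zero-mean form of NCC are assumptions for illustration, and the paper's exact metric definitions may differ.

```python
import math


def barycentric(p, tri):
    """Barycentric weights (w0, w1, w2) of point p with respect to triangle tri."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    det = (y1 - y2) * (x0 - x2) + (x2 - x1) * (y0 - y2)
    if det == 0:
        raise ValueError("degenerate triangle")
    w0 = ((y1 - y2) * (p[0] - x2) + (x2 - x1) * (p[1] - y2)) / det
    w1 = ((y2 - y0) * (p[0] - x2) + (x0 - x2) * (p[1] - y2)) / det
    return w0, w1, 1.0 - w0 - w1


def map_point(p, src_tri, dst_tri):
    """Map p from src_tri to dst_tri by reusing its barycentric weights."""
    w = barycentric(p, src_tri)
    x = sum(wi * vx for wi, (vx, _) in zip(w, dst_tri))
    y = sum(wi * vy for wi, (_, vy) in zip(w, dst_tri))
    return x, y


def mse(a, b):
    """Mean squared error between two equal-length intensity sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def ncc(a, b):
    """Zero-mean normalized correlation coefficient (assumed definition)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = math.sqrt(sum((ai - ma) ** 2 for ai in a) * sum((bi - mb) ** 2 for bi in b))
    return num / den


# Hypothetical matched triangles: vertices in the reference and target images.
src_tri = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
dst_tri = ((1.0, 1.0), (5.0, 1.0), (1.0, 6.0))

weights = barycentric((1.0, 1.0), src_tri)
mapped = map_point((1.0, 1.0), src_tri, dst_tri)
print("weights:", weights, "mapped point:", mapped)
print("MSE:", mse([1, 2, 3], [1, 2, 5]), "NCC:", ncc([1, 2, 3], [2, 4, 6]))
```

Because every triangle carries its own vertex correspondences, the mapping is affine inside each triangle and continuous across shared edges, which is what lets the mesh follow local distortions that a single global affine model cannot.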

In conclusion, Professor Xiaohua Xia and colleagues have made a meaningful departure from the conventional global registration paradigm by developing a locally adaptive model that handles one of the most enduring challenges in multi-focus image processing: maintaining feature coherence across regions with vastly different degrees of sharpness. By building optical flow tracking into their framework, the researchers ensured that even the texture-poor, blurred portions of an image contributed valuable correspondences, reinforcing geometric continuity between focal layers. This combination of dense optical flow correspondence and Delaunay-based local mapping significantly advances the precision achievable in multi-focus registration. What makes this achievement significant, we argue, is its broad effect. Enhanced registration accuracy does not exist in isolation; it directly improves downstream applications. In image fusion, the improvement appears as clearer composite images with balanced focus and far fewer artifacts. In shape-from-focus 3D reconstruction, refined alignment translates into more accurate depth recovery, an outcome with tangible benefits for industrial inspection, microscopy, and metrology. The same principles could stabilize real-time imaging pipelines in autonomous vision systems, where maintaining consistent registration under shifting focal conditions remains an unsolved problem.

Another major advantage of the proposed approach is its simplicity and generalizability. Built on well-established principles of gradient-based motion estimation and geometric triangulation, it avoids the data hunger and computational overhead typical of deep learning methods. This independence from large annotated datasets allows it to be deployed in specialized imaging scenarios where data scarcity or limited computational resources pose serious constraints. The work also opens several promising research avenues. While the current design focuses on static scenes, the same principles could be extended to dynamic multi-focus sequences through adaptive optical flow acceleration or progressive mesh updates. Integrating granular-ball computing or lightweight adaptive neural modules may further improve resilience to non-rigid motion and reduce computational latency. More broadly, the authors’ work underscores a growing realization in the vision community: registration accuracy is not a peripheral step but a fundamental determinant of downstream analytical reliability. Xia and his team have set a strong benchmark for spatial coherence in multi-focus imaging, one that future signal processing research is likely to build upon.

Reference

Xiaohua Xia, Dianbin Yang, Shaobo Huo, Jianhong Sun, Huatao Xiang, Multi-focus image registration based on optical flow tracking and Delaunay triangulation, Signal Processing, Volume 228, 2025, 109763.
