1. Progressively Complementarity-aware Fusion Network for RGB-D Salient Object Detection
Published in 2018 CVPR
paper: Progressively Complementarity-aware Fusion Network for RGB-D Salient Object Detection
- Main novelty:
- Design a complementarity-aware fusion (CA-Fuse) module, which introduces cross-modal residual functions and complementarity-aware supervision (side losses)
- Add level-wise supervision densely, from deep to shallow layers
- Overall architecture
- CA-Fuse module
- Details: Add a large-kernel convolution (Conv6, 13×13) and include five sets of side loss functions (all weights set to 1), plus a loss that encourages an informative combination of all side outputs.
- Results
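
The cross-modal residual idea can be sketched as follows: the depth stream contributes a residual that is added onto the RGB features. This is a minimal numpy sketch, assuming a single hypothetical 1×1 projection `w` for the depth branch (the paper's actual CA-Fuse module uses more layers and per-level supervision).

```python
import numpy as np

def cross_modal_residual_fuse(rgb_feat, depth_feat, w):
    """Sketch of the cross-modal residual function: fused = RGB + f(depth).
    `w` is a hypothetical 1x1 projection, not the paper's exact layers."""
    residual = np.maximum(depth_feat @ w, 0.0)  # ReLU(depth * W)
    return rgb_feat + residual                  # cross-modal residual connection

rng = np.random.default_rng(0)
rgb = rng.standard_normal((49, 64))    # 7x7 spatial grid flattened, 64 channels
depth = rng.standard_normal((49, 64))
w = rng.standard_normal((64, 64)) * 0.01
fused = cross_modal_residual_fuse(rgb, depth, w)
print(fused.shape)  # (49, 64)
```

The residual form means the depth branch only has to model what the RGB branch misses, rather than re-learning the full saliency signal.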
2. Attention-aware Cross-modal Cross-level Fusion Network for RGB-D Salient Object Detection
Published in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
paper: Attention-aware Cross-modal Cross-level Fusion Network for RGB-D Salient Object Detection
- Main novelty: Proposed an attention-aware cross-modal cross-level fusion (ACCF) module to fuse RGB and depth features from different levels. The ACCF module is similar to the SE (squeeze-and-excitation) block.
- Details: Add a large-kernel convolution (Conv6, 13×13) and include five loss functions (all weights set to 1).
- Results (worse than those of the CVPR paper above)
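
Since the ACCF module is described as similar to the SE block, its gating step can be sketched as SE-style channel attention. This is a rough numpy analogue under assumed weights `w1`, `w2`; the actual module also mixes cross-modal and cross-level features, which is omitted here.

```python
import numpy as np

def se_channel_attention(feat, w1, w2):
    """SE-style recalibration: squeeze (global pool), excite (two FCs),
    then reweight channels. `w1`, `w2` are hypothetical FC weights."""
    s = feat.mean(axis=0)                  # squeeze: global average pool -> (C,)
    z = np.maximum(s @ w1, 0.0)            # excitation: FC + ReLU (bottleneck)
    g = 1.0 / (1.0 + np.exp(-(z @ w2)))    # FC + sigmoid gate in (0, 1)
    return feat * g                        # channel-wise reweighting

rng = np.random.default_rng(1)
feat = rng.standard_normal((49, 32))       # 7x7 spatial grid, 32 channels
w1 = rng.standard_normal((32, 8)) * 0.1    # reduction to 8 channels
w2 = rng.standard_normal((8, 32)) * 0.1
out = se_channel_attention(feat, w1, w2)
print(out.shape)  # (49, 32)
```

Because the gate is a sigmoid, each channel is attenuated rather than amplified, which matches the SE block's "recalibration" interpretation.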
3. Multi-Modal Fusion Network with Multi-Scale Multi-Path and Cross-Modal Interactions for RGB-D Salient Object Detection
Published in 2019 Pattern Recognition
- Main novelty:
- Propose a global understanding branch (pooling) and a local capturing branch (realized with dilated convolutions).
- Multi-layer fusion by element-wise summation.
- Architecture
- Details
- Train R_SalNet first with VGG initialization. Then train D_SalNet initialized from R_SalNet. Finally, fine-tune the whole network on paired inputs.
- The FC layer output (a 3136-dimensional vector) is reshaped into a 56 × 56 saliency map.
- Results
- Achieves better results than state-of-the-art methods (though they appear even worse than the IROS paper's).
- The fusion direction matters (depth-to-RGB works best). The authors state that the 'MP+CI-Bi' variant introduces too many parameters and that the bi-directional connections may destroy the fragile architecture. However, since fusion is done by summation, no additional parameters should be introduced, so the first reason seems unreasonable.
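
The FC-to-map step above is simple arithmetic: 3136 = 56 × 56, so the saliency map is recovered from the FC output by a plain reshape. A minimal sketch with a stand-in vector:

```python
import numpy as np

# The FC head emits a 3136-dim vector; 3136 = 56 * 56, so the
# saliency map is recovered by reshaping the vector into a grid.
fc_out = np.linspace(0.0, 1.0, 3136)   # stand-in for the real FC output
sal_map = fc_out.reshape(56, 56)
print(sal_map.shape)  # (56, 56)
```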