Preset method
Cascade RCNN
Introduction
Cascade RCNN is a multi-stage object detection method. On top of the visual features extracted by the base model, it trains a cascade of detectors with increasing IoU thresholds, where each stage takes the refined output of the previous stage as its input. This addresses the overfitting and noisy detections that result from training with a single fixed IoU threshold.
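To make the role of the per-stage IoU threshold concrete, the following is a minimal sketch of how one proposal would be labeled as positive or negative at each stage. The helper names `iou` and `label_per_stage` are hypothetical, the thresholds 0.5, 0.6 and 0.7 are assumed here for illustration, and the box refinement performed between stages is deliberately left out.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_per_stage(proposal, gt_box, thresholds=(0.5, 0.6, 0.7)):
    """Return one positive/negative label per stage (thresholds are assumed).

    A real cascade refines the proposal between stages; this sketch keeps
    the proposal fixed so that only the effect of the threshold is visible.
    """
    overlap = iou(proposal, gt_box)
    return [overlap >= t for t in thresholds]


gt = (0, 0, 10, 10)
proposal = (0, 0, 10, 6.5)  # covers 65% of the ground-truth box
labels = label_per_stage(proposal, gt)
print(labels)
```

The same proposal counts as a positive sample for the first two stages and as a negative sample for the last one, which is why later stages only train well when they receive the better-localized boxes produced by earlier stages.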
Citation
@article{2017Cascade,
title={Cascade R-CNN: Delving into High Quality Object Detection},
author={Cai, Zhaowei and Vasconcelos, Nuno},
year={2017},
}
DINO
Introduction
DINO is a DETR-type detector that treats object detection as a set prediction problem. Because it is trained end to end, it eliminates the hand-designed processing modules, such as non-maximum suppression, that traditional multi-stage object detection algorithms rely on.
The main idea of DINO is to refine the encoder-decoder structure of DETR so that the mapping from the image feature sequence to the set of predictions becomes both faster and more accurate. The set is decoded from queries that act as learnable positional encodings. The improvements fall into three areas:
- Introduce both positive and negative noised samples during training to improve the detector's ability to recognize negative samples.
- Adopt a mixed query selection scheme: the top-K positions output by the encoder initialize the anchor boxes, while the content queries remain learnable parameters.
- Refine gradient propagation in the decoder: the gradient paths for the box used as a layer's prediction and for the box passed on to the next layer's iterative update are handled separately.
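As an illustration of the second item only, the sketch below pairs the K highest-scoring encoder positions (used as anchor boxes) with K learnable content queries. It uses plain Python lists instead of tensors, and the function name `mixed_query_selection`, the box format, and the string placeholders standing in for learnable embeddings are all assumptions made for this example, not the DINO implementation.

```python
def mixed_query_selection(encoder_scores, encoder_boxes, content_queries):
    """Pair the top-K encoder boxes (as anchors) with K content queries.

    encoder_scores:  one confidence score per encoder position
    encoder_boxes:   one box per encoder position (format assumed: cx, cy, w, h)
    content_queries: K placeholders standing in for learnable embeddings
    """
    k = len(content_queries)
    # Indices of the K highest-scoring encoder positions, best first.
    top_k = sorted(range(len(encoder_scores)),
                   key=lambda i: encoder_scores[i], reverse=True)[:k]
    # Positional part comes from the encoder; content part stays learnable.
    anchors = [encoder_boxes[i] for i in top_k]
    return list(zip(anchors, content_queries))


scores = [0.1, 0.9, 0.4, 0.7]
boxes = [
    (0.1, 0.1, 0.2, 0.2),
    (0.5, 0.5, 0.3, 0.3),
    (0.8, 0.2, 0.1, 0.1),
    (0.3, 0.7, 0.4, 0.2),
]
content = ["content_0", "content_1"]  # hypothetical stand-ins for embeddings
queries = mixed_query_selection(scores, boxes, content)
print(queries)
```

Only the positional half of each query depends on the input image here; the content half is independent of the encoder output, which is the distinction the list item draws.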
Citation
@article{2022DINO,
title={DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection},
author={Zhang, Hao and Li, Feng and Liu, Shilong and Zhang, Lei and Su, Hang and Zhu, Jun and Ni, Lionel M. and Shum, Heung Yeung},
journal={arXiv e-prints},
year={2022},
}