Preset method

Cascade RCNN

Introduction

Cascade RCNN is a multi-stage object detection method. It passes the visual features extracted by the base model through a cascade of detectors, each trained with a different (increasing) IoU threshold. This addresses the overfitting and noisy detections that arise when a detector is trained with a single fixed IoU threshold.
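
To make the role of the IoU threshold concrete, the following minimal sketch labels the same proposals as positive or negative under several thresholds. The boxes, the helper names (`iou`, `label_proposals`) and the thresholds 0.5, 0.6 and 0.7 are illustrative assumptions, and the sketch keeps the proposals fixed, whereas a real cascade also refines the boxes between stages.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_proposals(proposals, gt_box, iou_threshold):
    """Mark each proposal as positive (1) or negative (0) for one stage."""
    return [1 if iou(p, gt_box) >= iou_threshold else 0 for p in proposals]


# Hypothetical data: one ground-truth box and three proposals of rising quality.
gt_box = (0.0, 0.0, 10.0, 10.0)
proposals = [
    (0.0, 0.0, 10.0, 5.5),  # IoU 0.55 with the ground truth
    (0.0, 0.0, 10.0, 6.5),  # IoU 0.65
    (0.0, 0.0, 10.0, 8.0),  # IoU 0.80
]

# Assumed per-stage thresholds; each later stage demands a tighter overlap.
stage_thresholds = (0.5, 0.6, 0.7)
labels_per_stage = {
    t: label_proposals(proposals, gt_box, t) for t in stage_thresholds
}

for threshold, labels in labels_per_stage.items():
    print(f"IoU >= {threshold}: {labels}")
```

A low threshold accepts loose boxes as positives, while a high threshold leaves few positives to train on. Cascading stages lets each later detector train on the better-localized output of the previous one.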

Citation

@article{2017Cascade,
  title={Cascade R-CNN: Delving into High Quality Object Detection},
  author={Cai, Zhaowei and Vasconcelos, Nuno},
  year={2017},
}

DINO

Introduction

DINO is a DETR-type detector that treats object detection as a set prediction problem. Because it is trained end to end, it eliminates the manually designed processing modules, such as non-maximum suppression, that traditional multi-stage object detection algorithms rely on.
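
As an illustration of what set prediction means in practice, the sketch below matches a fixed set of predicted boxes one-to-one against the ground-truth boxes and treats every unmatched prediction as "no object". It is a simplified stand-in, not the detector's actual matcher: DETR-type models use the Hungarian algorithm with a cost that also includes class and overlap terms, whereas this example uses a brute-force search over an L1 box cost only. All names and values are hypothetical.

```python
from itertools import permutations


def l1_cost(pred, target):
    """L1 distance between two (cx, cy, w, h) boxes."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def match_predictions(preds, targets):
    """Find the one-to-one assignment with the lowest total cost.

    Returns (assignment, cost), where assignment[j] is the index of the
    prediction matched to target j. Brute force, so only for tiny inputs.
    """
    best_assignment, best_cost = None, float("inf")
    for candidate in permutations(range(len(preds)), len(targets)):
        cost = sum(
            l1_cost(preds[i], targets[j]) for j, i in enumerate(candidate)
        )
        if cost < best_cost:
            best_assignment, best_cost = candidate, cost
    return best_assignment, best_cost


# Hypothetical normalized boxes: three predictions, two ground-truth objects.
preds = [
    (0.50, 0.50, 0.20, 0.20),
    (0.10, 0.10, 0.10, 0.10),
    (0.80, 0.80, 0.30, 0.30),
]
targets = [
    (0.82, 0.80, 0.30, 0.30),
    (0.50, 0.52, 0.20, 0.20),
]

assignment, total_cost = match_predictions(preds, targets)
# Predictions that received no target are supervised as "no object".
unmatched = [i for i in range(len(preds)) if i not in assignment]

print("target -> prediction:", assignment)
print("no-object predictions:", unmatched)
```

Because every ground-truth object is assigned to exactly one prediction, duplicate detections are penalized during training, so no separate duplicate-removal step is needed afterwards.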

The main idea of DINO is to optimize the encoder-decoder structure of the DETR model, further improving the speed and accuracy of converting the image feature sequence into a set of predictions. Each element of this set is decoded from a query that acts as a learnable positional encoding. The specific improvements fall into three areas:

  1. Introduce both positive and negative noised samples during training (contrastive denoising), so that the detector learns to reject inaccurate boxes as well as to reconstruct accurate ones.
  2. Adopt a mixed (hybrid) query selection method. The top-K positions output by the encoder are used to initialize the anchor boxes, while the content queries remain learnable parameters.
  3. Optimize gradient propagation in the decoder. The gradient used to update a predicted box is separated from the gradient of the box that is passed on to the next iteration of refinement.
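
The second improvement above can be sketched as follows. This is a minimal illustration under assumed inputs, not the reference implementation: `encoder_scores` and `encoder_boxes` stand for per-token confidence scores and box proposals from the encoder, and `select_top_k_anchors` is a hypothetical helper name.

```python
def select_top_k_anchors(encoder_scores, encoder_boxes, k):
    """Return the indices and boxes of the k highest-scoring encoder tokens."""
    ranked = sorted(
        range(len(encoder_scores)),
        key=lambda i: encoder_scores[i],
        reverse=True,
    )
    top_indices = ranked[:k]
    return top_indices, [encoder_boxes[i] for i in top_indices]


# Hypothetical encoder output: one score and one (cx, cy, w, h) box per token.
encoder_scores = [0.10, 0.85, 0.40, 0.92, 0.05]
encoder_boxes = [
    (0.10, 0.10, 0.05, 0.05),
    (0.30, 0.40, 0.20, 0.25),
    (0.55, 0.50, 0.10, 0.10),
    (0.70, 0.65, 0.30, 0.20),
    (0.90, 0.90, 0.05, 0.05),
]

num_queries = 2
embed_dim = 4

# Positional part: anchor boxes initialized from the top-K encoder outputs.
top_indices, anchor_boxes = select_top_k_anchors(
    encoder_scores, encoder_boxes, num_queries
)

# Content part: learnable parameters that do not depend on the encoder output.
# Zeros are placeholders for values that training would update.
content_queries = [[0.0] * embed_dim for _ in range(num_queries)]

# Each decoder query pairs an image-dependent anchor with a learned content vector.
decoder_queries = list(zip(anchor_boxes, content_queries))
print(top_indices)
print(decoder_queries[0])
```

Only the anchor boxes depend on the input image here; the content vectors stay the same for every image until training updates them.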

Citation

@article{2022DINO,
  title={DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection},
  author={Zhang, Hao and Li, Feng and Liu, Shilong and Zhang, Lei and Su, Hang and Zhu, Jun and Ni, Lionel M. and Shum, Heung-Yeung},
  journal={arXiv e-prints},
  year={2022},
}