From bbe2e5694b47b77872e2e4e69a1b3e0322355171 Mon Sep 17 00:00:00 2001 From: zhanli <719901725@qq.com> Date: Tue, 10 Oct 2023 00:02:31 +0800 Subject: [PATCH] update. --- main/README.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/main/README.md b/main/README.md index ece2c25..0146213 100644 --- a/main/README.md +++ b/main/README.md @@ -1,9 +1,11 @@ -This repository hosts the source code of our paper: [[CVPR 2022] Cascade Transformers for End-to-End Person Search](https://arxiv.org/abs/2203.09642). In this work, we developed a novel Cascaded Occlusion-Aware Transformer (COAT) model for end-to-end person search. The COAT model outperforms **state-of-the-art** methods on the PRW benchmark dataset by a large margin and achieves state-of-the-art performance on the CUHK-SYSU dataset. +# **COAT代码使用说明** -| Dataset | mAP | Top-1 | Model | -| --------- | ---- | ----- | ------------------------------------------------------------ | -| CUHK-SYSU | 94.2 | 94.7 | [model](https://drive.google.com/file/d/1LkEwXYaJg93yk4Kfhyk3m6j8v3i9s1B7/view?usp=sharing) | -| PRW | 53.3 | 87.4 | [model](https://drive.google.com/file/d/1vEd_zzFN88RgxbRMG5-WfJZgD3vmP0Xg/view?usp=sharing) | +​ 这个存储库托管了论文的源代码:[[CVPR 2022] Cascade Transformers for End-to-End Person Search](https://arxiv.org/abs/2203.09642)。在这项工作中,我们开发了一种新颖的级联遮挡感知Transformer(COAT)模型,用于端到端的人物搜索。COAT模型在PRW基准数据集上以显著的优势胜过了最先进的方法,并在CUHK-SYSU数据集上取得了最先进的性能。 + +| 数据集(Datasets) | mAP | Top-1 | Model | +| ---------------- | ---- | ----- | ------------------------------------------------------------ | +| CUHK-SYSU | 94.2 | 94.7 | [model](https://drive.google.com/file/d/1LkEwXYaJg93yk4Kfhyk3m6j8v3i9s1B7/view?usp=sharing) | +| PRW | 53.3 | 87.4 | [model](https://drive.google.com/file/d/1vEd_zzFN88RgxbRMG5-WfJZgD3vmP0Xg/view?usp=sharing) | **Abstract**: The goal of person search is to localize a target person from a gallery set of scene images, which is extremely challenging due to large scale variations, pose/viewpoint changes, and occlusions. In this paper, we propose the Cascade Occluded Attention Transformer (COAT) for end-to-end person search. Specifically, our three-stage cascade design focuses on detecting people at the first stage, then progressively refines the representation for person detection and re-identification simultaneously at the following stages. The occluded attention transformer at each stage applies tighter intersection over union thresholds, forcing the network to learn coarse-to-fine pose/scale invariant features. Meanwhile, we calculate the occluded attention across instances in a mini-batch to differentiate tokens from other people or the background. In this way, we simulate the effect of other objects occluding a person of interest at the token-level. Through comprehensive experiments, we demonstrate the benefits of our method by achieving state-of-the-art performance on two benchmark datasets.