UniFormer

2022/10/04 PaperNotes 1,690 characters, about a 5-minute read

Paper: UniFormer: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird’s-Eye-View

Intro

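The paper addresses how to incorporate temporal information into the BEV representation.

Previous methods all rely on warp-based temporal fusion: warping past BEV features to the current time according to the positions of BEV spaces at different time steps.

Problems with this family of methods:

  1. They cannot model long sequences.
  2. Information loss: as the figure shows, the BEV grid hard-caps the perception range, yet the camera range (100 m) is often larger than the BEV range (50 m).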
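The information loss in item 2 can be illustrated with a minimal NumPy sketch of warp-based fusion. This is not the code of any particular method: the function name `warp_past_bev`, the square grid, the nearest-neighbour lookup, and the 2D rigid transform `cur_from_past` are all assumptions made for illustration. It shows that once the ego vehicle moves, some current BEV cells fall outside the past BEV grid and receive no past feature, even if a past camera actually saw that region.

```python
import numpy as np

def warp_past_bev(past_bev, cur_from_past, bev_range_m):
    """Nearest-neighbour warp of a past BEV grid into the current ego frame.

    past_bev:      (H, W) feature map covering [-bev_range_m, bev_range_m] in x and y.
    cur_from_past: 3x3 homogeneous 2D rigid transform (past ego -> current ego).
    Returns the warped (H, W) map and a boolean mask of cells that found a source.
    """
    h, w = past_bev.shape
    # Metric coordinates of the current grid's cell centers.
    xs = (np.arange(w) + 0.5) / w * 2 * bev_range_m - bev_range_m
    ys = (np.arange(h) + 0.5) / h * 2 * bev_range_m - bev_range_m
    gx, gy = np.meshgrid(xs, ys)
    cur_pts = np.stack([gx.ravel(), gy.ravel(), np.ones(h * w)])
    # Where each current cell center lies in the past ego frame.
    past_pts = np.linalg.inv(cur_from_past) @ cur_pts
    col = np.floor((past_pts[0] + bev_range_m) / (2 * bev_range_m) * w).astype(int)
    row = np.floor((past_pts[1] + bev_range_m) / (2 * bev_range_m) * h).astype(int)
    valid = (col >= 0) & (col < w) & (row >= 0) & (row < h)
    warped = np.zeros(h * w, dtype=past_bev.dtype)
    warped[valid] = past_bev[row[valid], col[valid]]
    return warped.reshape(h, w), valid.reshape(h, w)

# Ego moved 10 m forward along x between the two frames (no rotation),
# so a point's x coordinate shrinks by 10 m in the current frame.
cur_from_past = np.array([[1.0, 0.0, -10.0],
                          [0.0, 1.0, 0.0],
                          [0.0, 0.0, 1.0]])
warped, valid = warp_past_bev(np.ones((100, 100)), cur_from_past, 50.0)
print("fraction of current cells with a past feature:", valid.mean())
```

With a 50 m BEV range and 1 m cells, the front 10 m strip of the current grid has no counterpart in the past grid, which is the kind of loss the notes describe.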

Method

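How to align the BEV coordinates with the coordinates of each view.

The temporal fusion method proposed by the authors: virtual views, which start from the views themselves to find the correspondence between the past BEV space and the current BEV space.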

Virtual views are defined as the views of sensors that do not present in the current time step, and these past views are rotated and translated according to the ego BEV space as if they are present in the current time step.
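A minimal sketch of the "rotated and translated according to the ego BEV space" step, assuming each frame provides an ego pose in a shared world frame and a fixed camera-to-ego extrinsic. The helper names (`rigid`, `virtual_view_extrinsic`, `to_virtual_cam`) and the pose convention are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def rigid(yaw, tx, ty, tz=0.0):
    """4x4 rigid transform: rotation about z by yaw, then translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = [tx, ty, tz]
    return T

def virtual_view_extrinsic(world_from_ego_cur, world_from_ego_past, ego_from_cam):
    """Pose of a past camera expressed in the current ego (BEV) frame."""
    cur_from_past = np.linalg.inv(world_from_ego_cur) @ world_from_ego_past
    return cur_from_past @ ego_from_cam

def to_virtual_cam(point_cur_ego, cur_ego_from_cam):
    """Express a 3D point given in the current ego frame in the virtual camera's frame."""
    p = np.append(point_cur_ego, 1.0)
    return (np.linalg.inv(cur_ego_from_cam) @ p)[:3]

# Past ego at the world origin, current ego 10 m ahead, camera at the ego origin.
virtual = virtual_view_extrinsic(rigid(0.0, 10.0, 0.0), rigid(0.0, 0.0, 0.0), np.eye(4))
# A point 10 m ahead of the current ego lies 20 m ahead of the past camera.
print(to_virtual_cam(np.array([10.0, 0.0, 0.0]), virtual))
```

Once a past camera has such an extrinsic, a BEV location in the current frame can be related to that past view in the same way as to a current view, with no intermediate past BEV grid.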

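Next come the formulas derived by the authors:

My question: I did not follow this derivation.

The benefit is twofold. First, unlike warp-based methods, fusion is no longer limited to adjacent frames and can handle long temporal sequences. Second, the fusion can run in parallel, whereas warp-based fusion can only run serially.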

Network

The first novelty is cross-attention: the BEV queries perform cross-attention directly over the mappings of the current and past views.

The authors state: "The cross attention module can iterate over features from different time steps, which brings another important property, i.e., adaptive temporal fusion."

In the ablation study: "To verify this, we directly average the P temporal features before feeding them into the Transformer as the counterpart for comparison, which can be viewed as a fixed equal-weighted fusion."
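The contrast between adaptive and fixed equal-weighted fusion can be sketched for a single BEV query. This is a simplified single-head dot-product attention, assumed here for illustration only; the function names and the one-feature-per-time-step setup are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fusion(query, feats):
    """query: (C,), feats: (P, C), one sampled feature per time step.

    Attention weights depend on the query, so each time step's
    contribution is decided per query rather than fixed in advance.
    """
    scores = feats @ query / np.sqrt(query.shape[0])
    weights = softmax(scores)
    return weights @ feats, weights

def average_fusion(feats):
    """Fixed equal-weighted fusion: every time step gets weight 1/P."""
    return feats.mean(axis=0)

feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # P = 3, C = 2
fused, weights = adaptive_fusion(np.array([5.0, 0.0]), feats)
print("adaptive weights:", weights)
print("equal-weight result:", average_fusion(feats))
```

When the scores are all equal (for example, a zero query), the adaptive weights collapse to 1/P and the two schemes coincide, so averaging is a special case of the attention-based fusion in this sketch.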

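My question: how is the mapping done?

The second novelty is self-regression: the Transformer's input is not the BEV query alone but the concatenation of the BEV query and the previous Transformer output. For the first iteration, the BEV query is simply doubled and used as the input.

Self-regression effectively fuses in the BEV query features of the previous frame, and it is simpler and blunter than BEVFormer's approach.

The authors argue that the concatenation of warped BEV features and BEV queries is what makes BEVFormer work so well. They also explain that the advantage of this design is to ` implicitly deepen and double the number of the Transformer’s layers`, so a simple scheme like self-regression works well too.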
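A minimal sketch of the self-regression input scheme as described in these notes. The `toy_transformer` stand-in (a single linear map with `tanh`) and the loop over `num_iters` passes are assumptions for illustration; the real module is a full Transformer, and the notes do not spell out how many passes are run.

```python
import numpy as np

def toy_transformer(x, w):
    """Hypothetical stand-in for the Transformer: maps (N, 2C) inputs to (N, C)."""
    return np.tanh(x @ w)

def self_regression(bev_query, w, num_iters):
    """Feed [BEV query, previous output] back into the same module.

    On the first pass there is no previous output yet, so the BEV query
    is concatenated with itself (the "doubled" query).
    """
    prev_out = bev_query
    for _ in range(num_iters):
        x = np.concatenate([bev_query, prev_out], axis=-1)  # (N, 2C)
        prev_out = toy_transformer(x, w)                    # (N, C)
    return prev_out

rng = np.random.default_rng(0)
bev_query = rng.normal(size=(6, 4))   # N = 6 queries, C = 4 channels
w = rng.normal(size=(8, 4))           # maps 2C -> C
print(self_regression(bev_query, w, num_iters=2).shape)
```

Reusing the same weights on a second pass is one way to read the quoted remark about implicitly deepening and doubling the number of layers.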

The segmentation head is ERFNet.

Here Non-bt-1D refers to residual module (c) in the figure below.

Experiments

The number of multi-scale features is set to L = 4.

The default number of previous time steps is set to P = 6.

The number of sampling heights is set to Z = 4. The height range is (−5m,3m] with a stride of 2m.
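As a quick check of the numbers above, one reading of the half-open range (−5m, 3m] with a 2 m stride is that samples sit at the upper edge of each 2 m step, which gives exactly Z = 4 heights. That placement is an assumption; the notes do not say where within each step the sample is taken.

```python
import numpy as np

def sampling_heights(low, high, stride):
    """Heights in the half-open interval (low, high], spaced by stride.

    Assumes samples sit at low + stride, low + 2 * stride, ..., up to high.
    """
    return np.arange(low + stride, high + stride / 2, stride)

heights = sampling_heights(-5.0, 3.0, 2.0)
print(heights, len(heights))
```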

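BEV plane sizes: 60m × 30m / 100m × 100m / 160m × 100m

One set of results: higher computational complexity than BEVFormer (understandable, since features from the previous P = 6 frames are fused), but a clear accuracy gain.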
