
Generalized Intersection over Union

May 20, 2019 • Read: 828 • Object Detection

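Background

The DF traffic sign detection competition has been driving me up the wall lately. The main problem is that the IoU threshold of 0.9 is extremely strict: a predicted box that is off from the GT by half a pixel is already counted as an error (many of the targets have a very small area). After trying a lot of approaches without any significant improvement, I decided to reimplement GIoU Loss and give it a try.

Paper: Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression

I will not walk through the whole paper here, only the core parts.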

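1. Definition of GIoU

Algorithm:

  • Input: two arbitrary convex shapes A, B
  • Compute $ IoU = \frac {|A\cap B|}{|A \cup B|} $
  • Compute $GIoU = IoU - \frac {|C\setminus(A\cup B)|}{|C|}$

Note: C is the smallest enclosing shape that contains both A and B. The term subtracted in GIoU is the fraction of C that is not covered by the union of A and B.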
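As a quick sanity check of the definition, here is a small worked example. The two axis-aligned squares are chosen purely for illustration and do not come from the paper:

$A = [0,2]\times[0,2]$, $B = [1,3]\times[1,3]$, $C = [0,3]\times[0,3]$
$|A\cap B| = 1$, $|A\cup B| = 4 + 4 - 1 = 7$, $|C| = 9$
$IoU = \frac{1}{7} \approx 0.143$, $GIoU = \frac{1}{7} - \frac{9 - 7}{9} = -\frac{5}{63} \approx -0.079$

Even though the squares overlap, the empty corners of C pull GIoU slightly below zero.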

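2. Properties of GIoU

According to the authors, GIoU has the following properties:

  • Like IoU, it can be used as a distance measure; as a loss it can be computed as $Loss_{GIoU} = 1 - GIoU$;
  • Also like IoU, GIoU is insensitive to the scale of the objects;
  • From the formula above, GIoU is always less than or equal to IoU. IoU takes values in $[0,1]$, whereas GIoU takes values in $[-1,1]$. When the two shapes coincide exactly, GIoU = IoU = 1. When the two shapes do not overlap, IoU is 0 and GIoU is the negative of the subtracted term; that term tends to 1 as the shapes move far apart relative to their size, so GIoU approaches -1 in that limit;
  • Because GIoU brings in the shape C enclosing both A and B, it still changes with the relative position of A and B when they do not overlap, so optimization remains possible in that case.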
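The properties above can be checked numerically. The following is a minimal sketch in plain Python for axis-aligned boxes; the helper name `box_iou_giou` and the sample boxes are assumptions made for illustration, not something taken from the paper.

```python
def box_iou_giou(a, b):
    """Return (IoU, GIoU) for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; assumes x2 > x1 and y2 > y1.
    """
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Intersection: clamp width and height at 0 for disjoint boxes.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = area_a + area_b - inter
    # Smallest enclosing box C.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    iou = inter / union
    giou = iou - (c_area - union) / c_area
    return iou, giou


a = (0.0, 0.0, 2.0, 2.0)
b = (1.0, 1.0, 3.0, 3.0)

# Identical boxes: IoU = GIoU = 1.
same = box_iou_giou(a, a)

# Scale insensitivity: multiplying every coordinate by 10 leaves both values unchanged.
small = box_iou_giou(a, b)
large = box_iou_giou(tuple(10 * v for v in a), tuple(10 * v for v in b))

# Disjoint boxes: IoU is stuck at 0, while GIoU still depends on the distance.
near = box_iou_giou(a, (3.0, 0.0, 5.0, 2.0))
far = box_iou_giou(a, (100.0, 0.0, 102.0, 2.0))

print(same, small, large, near, far)
```

For the two disjoint cases IoU is 0 both times, while GIoU moves from -0.2 towards -1 as the second box is pushed away, which is what gives the loss a useful signal when the boxes do not overlap.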

3. Using GIoU as a loss function for 2D object detection

The paper gives a detailed algorithm, which I walk through below:

Here P denotes the predicted box and GT denotes the ground-truth label.

  • Input: the coordinates of the predicted box $Box^P$ and of the ground-truth box $Box^{GT}$,

$Box^P = (x_1^P,y_1^P,x_2^P,y_2^P)$, $Box^{GT}=(x_1^{GT},y_1^{GT},x_2^{GT},y_2^{GT})$

  • Output: $Loss_{IoU}$, $Loss_{GIoU}$
  • step1: For the predicted box $Box^P$, make sure that $x_2^P > x_1^P$ and $y_2^P > y_1^P$:

In practice this means taking:
$\hat x_1^P = min(x_1^P,x_2^P)$, $\hat x_2^P = max(x_1^P,x_2^P)$,
$\hat y_1^P = min(y_1^P,y_2^P)$, $\hat y_2^P = max(y_1^P,y_2^P)$

  • step2: Compute the area of the ground-truth box: $Area^{GT} = (x_2^{GT} - x_1^{GT}) * (y_2^{GT} - y_1^{GT})$
  • step3: Compute the area of the predicted box: $Area^{P} = (\hat x_2^{P} - \hat x_1^{P}) * (\hat y_2^{P} - \hat y_1^{P})$
  • step4: Compute the intersection of the predicted box and the ground-truth box:

$x_1^I = max(\hat x_1^P,x_1^{GT})$, $x_2^I = min(\hat x_2^P,x_2^{GT})$
$y_1^I = max(\hat y_1^P,y_1^{GT})$, $y_2^I = min(\hat y_2^P,y_2^{GT})$
Intersection: $I =\begin{cases}(x_2^I - x_1^I) * (y_2^I-y_1^I), & \text{if } x_2^I>x_1^I,\ y_2^I>y_1^I\\0, & \text{otherwise}\end{cases}$

  • step5: Find the coordinates of the smallest enclosing box C, denoted $Box^C$:

$x_1^C = min(\hat x_1^P,x_1^{GT})$, $x_2^C = max(\hat x_2^P,x_2^{GT})$
$y_1^C = min(\hat y_1^P,y_1^{GT})$, $y_2^C = max(\hat y_2^P,y_2^{GT})$

  • step6: Compute the area of C, denoted $Area^C$:

$Area^C = (x_2^C - x_1^C) * (y_2^C - y_1^C)$

  • step7: Compute IoU, $IoU = \frac {I}{U}$, where $U = Area^P + Area^{GT} - I$
  • step8: $GIoU = IoU - \frac {Area^C - U}{Area^C}$
  • step9: $Loss_{IoU} = 1 - IoU$, $Loss_{GIoU} = 1-GIoU$
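The nine steps can be sketched for a single pair of boxes in plain Python, with one comment per step. The function name `iou_giou_losses` and the sample boxes are hypothetical, chosen only to make the flow concrete; the prediction is deliberately given with swapped x-coordinates so that step1 has something to do.

```python
def iou_giou_losses(box_p, box_gt):
    """Follow step1 to step9 for one pair of (x1, y1, x2, y2) boxes.

    Returns (loss_iou, loss_giou). Illustrative sketch only; assumes the
    ground-truth box already satisfies x2 > x1 and y2 > y1.
    """
    x1p, y1p, x2p, y2p = box_p
    x1g, y1g, x2g, y2g = box_gt

    # step1: order the predicted corners so that x2 > x1 and y2 > y1.
    x1p, x2p = min(x1p, x2p), max(x1p, x2p)
    y1p, y2p = min(y1p, y2p), max(y1p, y2p)

    # step2: area of the ground-truth box.
    area_gt = (x2g - x1g) * (y2g - y1g)
    # step3: area of the (reordered) predicted box.
    area_p = (x2p - x1p) * (y2p - y1p)

    # step4: intersection, which is 0 unless the boxes overlap on both axes.
    x1i, x2i = max(x1p, x1g), min(x2p, x2g)
    y1i, y2i = max(y1p, y1g), min(y2p, y2g)
    inter = (x2i - x1i) * (y2i - y1i) if (x2i > x1i and y2i > y1i) else 0.0

    # step5: corners of the smallest enclosing box C.
    x1c, x2c = min(x1p, x1g), max(x2p, x2g)
    y1c, y2c = min(y1p, y1g), max(y2p, y2g)
    # step6: area of C.
    area_c = (x2c - x1c) * (y2c - y1c)

    # step7: IoU from intersection and union.
    union = area_p + area_gt - inter
    iou = inter / union
    # step8: GIoU.
    giou = iou - (area_c - union) / area_c
    # step9: both losses.
    return 1.0 - iou, 1.0 - giou


# Prediction with x1 and x2 swapped; after step1 it becomes (1, 1, 3, 3).
loss_iou, loss_giou = iou_giou_losses((3.0, 1.0, 1.0, 3.0), (0.0, 0.0, 2.0, 2.0))
print(loss_iou, loss_giou)
```

For this pair the intersection is 1, the union is 7 and the enclosing box has area 9, so the losses come out as 6/7 and 68/63 respectively.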

Emmm, that is a lot of formulas, but work through them once and you will find it is actually quite simple....

4. PyTorch implementation

Below is a PyTorch implementation. Once the competition is over, I will also post the code for GIoU Loss in mmdetection.

import torch

def generalized_iou(Box_p,Box_gt):
    """
    Input:
        Box_p : predicted box coordinates, shape (n,4), format (x1,y1,x2,y2)
        Box_gt: ground-truth box coordinates, shape (n,4), format (x1,y1,x2,y2)
    Output:
        loss_giou: mean GIoU loss over the n box pairs
    """
    assert Box_p.shape == Box_gt.shape
    # Convert to float
    Box_p = Box_p.float()
    Box_gt = Box_gt.float()
    # Make sure the predicted boxes satisfy x2>x1, y2>y1
    xp_1 = torch.min(Box_p[:,0],Box_p[:,2]).reshape(-1,1)
    xp_2 = torch.max(Box_p[:,0],Box_p[:,2]).reshape(-1,1)
    yp_1 = torch.min(Box_p[:,1],Box_p[:,3]).reshape(-1,1)
    yp_2 = torch.max(Box_p[:,1],Box_p[:,3]).reshape(-1,1)
    Box_p = torch.cat([xp_1,yp_1,xp_2,yp_2],1)
    # Area of the predicted boxes
    box_p_area =  (Box_p[:,2]  - Box_p[:,0])  * (Box_p[:,3]  - Box_p[:,1])
    # Area of the ground-truth boxes
    box_gt_area = (Box_gt[:,2] - Box_gt[:,0]) * (Box_gt[:,3] - Box_gt[:,1])
    # Corners of the intersection of predicted and ground-truth boxes
    xI_1 = torch.max(Box_p[:,0],Box_gt[:,0])
    xI_2 = torch.min(Box_p[:,2],Box_gt[:,2])
    yI_1 = torch.max(Box_p[:,1],Box_gt[:,1])
    yI_2 = torch.min(Box_p[:,3],Box_gt[:,3])
    # Intersection area; width and height are clamped at 0 so that disjoint boxes give 0
    inter_w = torch.clamp(xI_2 - xI_1, min=0)
    inter_h = torch.clamp(yI_2 - yI_1, min=0)
    intersection = inter_w * inter_h
    # Corners of the smallest enclosing box C
    xC_1 = torch.min(Box_p[:,0],Box_gt[:,0])
    xC_2 = torch.max(Box_p[:,2],Box_gt[:,2])
    yC_1 = torch.min(Box_p[:,1],Box_gt[:,1])
    yC_2 = torch.max(Box_p[:,3],Box_gt[:,3])
    # Area of the smallest enclosing box C
    c_area = (xC_2 - xC_1) * (yC_2 - yC_1)
    # Union and IoU
    union = box_p_area + box_gt_area - intersection
    iou = intersection / union
    # GIoU
    giou = iou - (c_area - union) / c_area
    # GIoU loss
    loss_giou = 1 - giou

    return loss_giou.mean()

if __name__ == "__main__":
    box_p = torch.tensor([[125,456,321,647],
                          [25,321,216,645],
                          [111,195,341,679],
                          [30,134,105,371]])
    box_gt = torch.tensor([[132,407,301,667],
                           [29,322,234,664],
                           [109,201,315,680],
                           [41,140,115,384]])
    giou_loss =  generalized_iou(box_p,box_gt)
    print(giou_loss)
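One detail in the intersection step is worth spelling out. If the intersection is computed as the plain product (yI_2 - yI_1) * (xI_2 - xI_1) without clamping, two boxes that are disjoint along both axes produce two negative factors and therefore a spurious positive area. The sketch below shows the effect in plain Python; the helper `intersection_area` and its `clamp` flag are hypothetical and exist only for this demonstration. In the tensor version, torch.clamp(x, min=0) applies the same clamping element-wise.

```python
def intersection_area(box_a, box_b, clamp=True):
    """Intersection area of two (x1, y1, x2, y2) boxes.

    Hypothetical helper: `clamp=False` reproduces the unclamped product.
    """
    w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if clamp:
        # Negative width or height means no overlap on that axis.
        w, h = max(w, 0.0), max(h, 0.0)
    return w * h


a = (0.0, 0.0, 1.0, 1.0)
b = (2.0, 2.0, 3.0, 3.0)  # disjoint from `a` along both x and y

unclamped = intersection_area(a, b, clamp=False)  # (-1) * (-1)
clamped = intersection_area(a, b, clamp=True)
print(unclamped, clamped)
```

The sample boxes in the test code above all overlap, so this case does not show up there, but it can occur early in training when predictions are far from their targets.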