知行编程网  2022-05-04 12:00

A Roundup of Tricks Used in Object Detection Competitions

Source | Zhihu    Author | 初识CV

https://zhuanlan.zhihu.com/p/102817180


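The end of this article includes code written with the MMDetection framework, in two parts:

1. A walkthrough of each part of the config, faster_rcnn_r50_fpn_1x.py.
2. The code for the tricks, cascade_rcnn_r50_fpn_1x.py.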

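1.Data Augmentation


Data augmentation is a common way to improve the robustness and generalization of deep models. Random flipping, random cropping, added noise and similar transforms have all been brought into the training of detection models. Personally, I think feeding in data (supervision signal) at the right time may be the more promising direction.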
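In detection, a geometric augmentation has to transform the box annotations together with the pixels. The following is a minimal sketch of a random horizontal flip, assuming boxes are stored as (x1, y1, x2, y2) in pixel coordinates; the function name and the `prob` parameter are illustrative and not taken from the original article.

```python
import numpy as np

def random_horizontal_flip(image, boxes, rng, prob=0.5):
    """Flip an HxWxC image and its (x1, y1, x2, y2) boxes left-right.

    Hypothetical helper for illustration; `prob` is the flip probability.
    """
    if rng.random() >= prob:
        return image, boxes          # leave the sample unchanged
    width = image.shape[1]
    flipped = image[:, ::-1].copy()  # reverse the column (x) axis
    new_boxes = boxes.copy()
    # The old right edge becomes the new left edge, and vice versa.
    new_boxes[:, 0] = width - boxes[:, 2]
    new_boxes[:, 2] = width - boxes[:, 0]
    return flipped, new_boxes
```

Cropping and noise follow the same pattern: whatever is done to the pixels must be mirrored in the annotations, or the supervision becomes inconsistent.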

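2.Multi-scale Training/Testing


Input image size has a marked effect on detector performance; in fact, multi-scale is one of the tricks that improves accuracy most visibly. The backbone often produces feature maps tens of times smaller than the original image, so the features of small objects are hard for the detection network to capture. Training on larger images at more sizes makes the detector somewhat more robust to object size, and introducing multi-scale only at test time still captures part of the gain from larger and more varied sizes.

Multi-scale training/testing first appeared in [1]. During training, several fixed scales are defined in advance and one is chosen at random for each epoch. At test time, feature maps are generated at several scales; each Region Proposal has a different size on each feature map, and the one whose size is closest to a fixed size (the input size of the detection head) is used as the input to later stages. In [2], selecting a single scale is replaced by Maxout (element-wise max): two adjacent scales are picked at random, pooled, and merged with Maxout, as shown in the figure below.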

[Figure: two adjacent scales merged with Maxout after pooling]

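Recent work such as FPN already performs detection on feature maps at different scales, but multi-scale training/testing is still used as an effective performance-boosting trick in competitions such as MS COCO.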
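The training-time half of the idea can be sketched as follows: pick one of several predefined scales at random, then rescale the image size and the boxes by the same factor. The scale list and function names here are assumptions made for illustration, and the image resize itself is left out (any image library can perform it once the target size is known).

```python
import numpy as np

# Hypothetical scale list: (max long side, target short side) in pixels.
SCALES = [(1333, 640), (1333, 800), (1333, 960)]

def rescale_factor(height, width, scale):
    """Largest factor that fits the short side to `scale[1]`
    without letting the long side exceed `scale[0]`."""
    max_long, target_short = scale
    return min(target_short / min(height, width),
               max_long / max(height, width))

def pick_scale_and_resize_boxes(height, width, boxes, rng, scales=SCALES):
    """Randomly choose a scale; return it with the new (h, w) and scaled boxes."""
    scale = scales[rng.integers(len(scales))]
    factor = rescale_factor(height, width, scale)
    new_size = (int(round(height * factor)), int(round(width * factor)))
    return scale, new_size, boxes * factor
```

For multi-scale testing, the same `rescale_factor` could be applied once per scale and the per-scale detections merged afterwards, for example with NMS.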

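3.Global Context


This trick was proposed in the ResNet paper [3]. The whole image is treated as one RoI, RoI Pooling is applied to it, and the resulting feature vector is concatenated to the feature vector of every RoI as auxiliary information for the subsequent R-CNN sub-network. Some approaches now also pass RoIs from adjacent scales to each other as context.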
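The concatenation step is simple enough to sketch directly. The snippet below assumes the per-RoI features and the whole-image feature have already been pooled and flattened into vectors; the function name and shapes are illustrative only.

```python
import numpy as np

def add_global_context(roi_feats, global_feat):
    """Append the whole-image feature vector to every RoI feature vector.

    roi_feats:   (N, C) array, one pooled feature vector per RoI
    global_feat: (C_g,) array, pooled from the full image treated as one RoI
    returns:     (N, C + C_g) array
    """
    n_rois = roi_feats.shape[0]
    tiled = np.broadcast_to(global_feat, (n_rois, global_feat.shape[0]))
    return np.concatenate([roi_feats, tiled], axis=1)
```

The R-CNN head that consumes these vectors then needs an input width of C + C_g instead of C.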

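4.Box Refinement/Voting


Box refinement and box voting were proposed in [4]; the former is also known as Iterative Localization. Refinement originally started from the Region Proposals produced by Selective Search (SS) and ran the detection head over them for several iterations to obtain a series of boxes. In the ResNet paper, the authors apply NMS (see the section below) jointly to the Region Proposals fed into the R-CNN sub-network and the boxes that the sub-network predicts. Finally, for each box kept by NMS, the predicted boxes whose IoU with it exceeds a threshold are averaged, weighted by their scores, to give the final prediction. Voting can be understood as using the top-scoring box to select a set of first-rate boxes, then letting those boxes decide the result through weighted voting.

Predictions from different training strategies and different epochs can be fused with NMS, or with Soft-NMS.
Parameters to tune:

  • the box voting threshold,

  • the minimum number of inputs in which a box must appear before it is allowed in the output,

  • the score threshold: a box whose score falls below it is discarded.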
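The voting procedure and its three tunable parameters can be sketched as below. This is a minimal single-class version under stated assumptions: boxes pooled from all inputs are passed in together, `min_votes` counts overlapping boxes as a stand-in for "number of inputs the box appears in", and all names and default values are illustrative rather than taken from [4].

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_thr):
    """Plain greedy NMS; returns indices of the kept boxes."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(best)
        order = rest[iou_one_to_many(boxes[best], boxes[rest]) <= iou_thr]
    return np.array(keep, dtype=int)

def box_voting(boxes, scores, nms_thr=0.5, vote_thr=0.5,
               score_thr=0.05, min_votes=1):
    """Refine each NMS survivor by the score-weighted mean of its neighbours."""
    valid = scores >= score_thr                  # score threshold
    boxes, scores = boxes[valid], scores[valid]
    voted_boxes, voted_scores = [], []
    for idx in nms(boxes, scores, nms_thr):
        voters = iou_one_to_many(boxes[idx], boxes) >= vote_thr  # voting threshold
        if voters.sum() < min_votes:             # minimum support
            continue
        w = scores[voters]
        voted_boxes.append((boxes[voters] * w[:, None]).sum(axis=0) / w.sum())
        voted_scores.append(scores[idx])
    return np.array(voted_boxes).reshape(-1, 4), np.array(voted_scores)
```

Each NMS survivor always votes for itself (its IoU with itself is 1), so `min_votes=1` never drops a box; raising it filters out boxes that only one model or epoch produced.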


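5.Stochastic Weight Averaging (SWA)


Stochastic Weight Averaging comes close to the performance of Fast Geometric Ensembling (FGE) at a small fraction of its compute. SWA can be applied to any architecture and dataset and tends to perform well. According to the experiments in the paper, SWA leads to wider minima. In the classical sense SWA is not an ensemble, because at the end of training you have only one model, but it outperforms snapshot ensembles and comes close to FGE (which averages multiple models).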

[Figure: loss-surface plots showing W1, W2, W3 and W_SWA, compared with the SGD solution]

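Taken together with the fact that W_SWA outperforms SGD on the test set, this means that although W_SWA has a higher training loss, it generalizes better.

The intuition behind SWA comes from an empirical observation: the local minima reached at the end of each learning-rate cycle tend to accumulate at the edge of the low-loss region of the loss surface (in the left panel of the figure above, the brown region has low error, and the points W1, W2 and W3 denote three independently trained networks lying on its edge). Averaging these points can yield a wide, well-generalizing solution with lower loss (W_SWA in the left panel of the figure above).

Here is how SWA works. It keeps only two models, rather than an ensemble of many:

1. The first model stores the running average of the model weights (W_SWA). After training ends, it is the final model used for prediction.

2. The second model (W) traverses weight space, exploring it under a cyclical learning-rate schedule.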

[Figure: SWA weight-update formula]

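At the end of each learning-rate cycle, the current weights of the second model are used to update the weights of the first model (formula above). So during training only one model is trained, and two models are kept in memory. At prediction time only the averaged model is needed, which is much faster than the ensembles described earlier, where several models have to be run and their predictions averaged.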
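The update at the end of each cycle is a running average: W_SWA ← (W_SWA · n_models + W) / (n_models + 1), where n_models is the number of weight snapshots averaged so far. The sketch below applies it to a dictionary of NumPy arrays standing in for a model's parameters; the class and attribute names are hypothetical, and it is not the code from either repository linked in this section.

```python
import numpy as np

class SWAAverager:
    """Keeps the running average W_SWA of weight snapshots (illustrative)."""

    def __init__(self):
        self.avg = None      # W_SWA, a dict: parameter name -> array
        self.n_models = 0    # number of snapshots averaged so far

    def update(self, weights):
        """Fold the current weights W into W_SWA; call once per LR cycle."""
        if self.avg is None:
            self.avg = {k: np.array(v, dtype=float) for k, v in weights.items()}
        else:
            for k, v in weights.items():
                self.avg[k] = (self.avg[k] * self.n_models + np.asarray(v)) \
                    / (self.n_models + 1)
        self.n_models += 1
        return self.avg
```

Only `avg` and the model being trained need to live in memory, which is the "two models" cost the text describes.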

Implementations:

The authors of the paper provide their own PyTorch implementation:

https://github.com/timgaripov/swa

An SWA implementation based on the fast.ai library is also available:

https://github.com/fastai/fastai/pull/276/commits

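6.OHEM (Online Hard Example Mining)


OHEM (Online Hard Example Mining) comes from [5]. In two-stage detectors, before the proposed RoIs are fed into the R-CNN sub-network, we have a chance to adjust the ratio of positive to negative samples (foreground and background classes). Background RoI Proposals usually far outnumber foreground ones. Fast R-CNN handles this by randomly over-sampling and under-sampling the two kinds of samples so that each batch keeps a positive-to-negative ratio of 1:3. This eases the class-imbalance problem, is one of the advantages two-stage methods have over single-stage ones, and has been adopted by most later work.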


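In the OHEM paper, however, the authors propose using the scores that the R-CNN sub-network predicts for the RoI Proposals to decide which samples each batch uses. The RoI Proposals fed into the R-CNN sub-network are then always the ones it currently handles poorly, which makes the supervised learning more efficient. In practice, two identical copies of the R-CNN sub-network are maintained: one only runs the forward pass to guide the selection of RoI Proposals, while the other is a normal R-CNN that takes part in the loss computation and the weight update, and its weights are copied to the former to keep the two branches in sync.

OHEM trades the overhead of an extra R-CNN sub-network for better RoI Proposal quality and more effective use of the supervision in the data, and has become one of the common components for improving two-stage models.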
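The selection step can be sketched in a few lines, assuming the forward-only branch has already produced one loss value per RoI (ranking by per-RoI loss is one common way to define "performs poorly"; the function and argument names are illustrative).

```python
import numpy as np

def select_hard_examples(per_roi_loss, batch_size):
    """Return the indices of the `batch_size` RoIs with the highest loss.

    per_roi_loss: (N,) array from the forward-only (read-only) branch.
    The returned indices are sorted so they can index the RoI array directly.
    """
    k = min(batch_size, per_roi_loss.shape[0])
    hard_idx = np.argsort(-per_roi_loss)[:k]   # descending by loss
    return np.sort(hard_idx)
```

Only the selected RoIs are then passed through the trainable branch, so the backward pass is computed on the hard examples alone.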

7.Soft NMS


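NMS (Non-Maximum Suppression) is the standard post-processing step for detection models. It removes predicted boxes with high overlap (IoU), keeping only the highest-scoring box as the detection output. Soft NMS was proposed in [6]. In traditional NMS, boxes whose overlap with the highest-scoring box exceeds a threshold are discarded outright, which the authors argue hurts the detection of adjacent objects. Their improvement penalizes each box's score according to its IoU and filters by score at the end. Combined with Deformable Convnets (to be covered in a later article), Soft NMS achieved the best results on MS COCO at the time. The modified algorithm is as follows: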

[Figure: Soft-NMS algorithm pseudocode]

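The function f in the figure above is the softening function, usually linear or Gaussian, with the latter working slightly better. Of course, along with this gain Soft-NMS also introduces some hyperparameters, and the best configuration has to be found by trial for each dataset.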
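A minimal single-class sketch of the procedure, with both softening functions, is given below. The linear variant multiplies the score by (1 - IoU) when the IoU exceeds a threshold, and the Gaussian variant multiplies it by exp(-IoU² / sigma). The default values of `iou_thr`, `sigma` and `score_thr` are placeholders for the hyperparameters mentioned above, not recommended settings.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def soft_nms(boxes, scores, method="gaussian",
             iou_thr=0.3, sigma=0.5, score_thr=0.001):
    """Return (kept original indices, their possibly decayed scores)."""
    boxes = boxes.astype(float).copy()
    scores = scores.astype(float).copy()
    idxs = np.arange(len(scores))
    keep_idx, keep_scores = [], []
    while len(scores) > 0:
        top = np.argmax(scores)              # current highest-scoring box
        top_box = boxes[top]
        keep_idx.append(idxs[top])
        keep_scores.append(scores[top])
        mask = np.ones(len(scores), dtype=bool)
        mask[top] = False
        boxes, scores, idxs = boxes[mask], scores[mask], idxs[mask]
        if len(scores) == 0:
            break
        overlaps = iou(top_box, boxes)
        if method == "linear":
            decay = np.where(overlaps > iou_thr, 1.0 - overlaps, 1.0)
        else:                                # Gaussian penalty
            decay = np.exp(-(overlaps ** 2) / sigma)
        scores = scores * decay              # penalise instead of discarding
        alive = scores > score_thr           # final filtering by score
        boxes, scores, idxs = boxes[alive], scores[alive], idxs[alive]
    return np.array(keep_idx, dtype=int), np.array(keep_scores)
```

With hard NMS at an IoU threshold of 0.5, a box overlapping the top box at IoU 0.8 would vanish; here it survives with a reduced score, which is what helps with adjacent objects.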

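8.RoIAlign


RoIAlign was proposed in Mask R-CNN [7]. It addresses the varying degrees of rounding that occur when an RoI is pooled, which affects the computation of the mask loss in instance segmentation. The paper uses bilinear interpolation to make the RoI representation finer and obtains a clear performance gain. The trick has also been adopted by later work such as Light-Head R-CNN.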
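The core of the idea is to read the feature map at fractional coordinates instead of rounding them. The sketch below shows bilinear sampling on a single-channel feature map and the averaging of a small grid of samples inside one output bin; the function names are illustrative, and averaging the samples is an assumption of this sketch (the `samples` argument plays the role of a setting such as `sample_num=2`).

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Sample a 2-D feature map at the fractional location (y, x)."""
    h, w = feat.shape
    y = min(max(y, 0.0), h - 1.0)            # clamp to the valid range
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0, x0] * (1 - dx) + feat[y0, x1] * dx
    bottom = feat[y1, x0] * (1 - dx) + feat[y1, x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align_bin(feat, y_start, x_start, bin_h, bin_w, samples=2):
    """Average samples x samples bilinear reads inside one output bin.

    The bin's position and size stay as floats: no rounding anywhere.
    """
    vals = []
    for i in range(samples):
        for j in range(samples):
            yy = y_start + (i + 0.5) * bin_h / samples
            xx = x_start + (j + 0.5) * bin_w / samples
            vals.append(bilinear_sample(feat, yy, xx))
    return float(np.mean(vals))
```

RoI Pooling would instead round `y_start`, `x_start` and the bin size to integers, shifting the bin by up to a full feature-map cell, which is many pixels in the original image.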

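9.Odds and Ends


Besides the tricks listed above, a few other practices are worth noting:

  • Better priors (YOLOv2): cluster the sizes and aspect ratios of the annotated boxes in the data to choose a better anchor box configuration

  • Better pre-trained models: the backbone of a detector is usually initialized from a model trained on ImageNet (typically ImageNet-1k); pre-training the backbone on a larger dataset (ImageNet-5k) also helps accuracy

  • Hyperparameter tuning: some work has also found that adjusting the IoU threshold in NMS (from 0.3 to 0.5) improves accuracy, but there is no reference best configuration for this yet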
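The first item in the list, clustering box annotations to obtain anchor priors, can be sketched as k-means over (width, height) pairs with IoU as the similarity measure, so that assigning a box to the centroid with the highest IoU is the same as minimising 1 - IoU. The function names, the random initialisation and the iteration cap are assumptions of this sketch.

```python
import numpy as np

def wh_iou(wh, centroids):
    """IoU between (N, 2) box sizes and (K, 2) centroids, corners aligned."""
    inter = (np.minimum(wh[:, None, 0], centroids[None, :, 0])
             * np.minimum(wh[:, None, 1], centroids[None, :, 1]))
    union = (wh[:, 0:1] * wh[:, 1:2]
             + (centroids[:, 0] * centroids[:, 1])[None, :] - inter)
    return inter / union

def kmeans_anchors(wh, k, iters=50, seed=0):
    """Cluster annotated box sizes into k anchor sizes, sorted by area."""
    rng = np.random.default_rng(seed)
    centroids = wh[rng.choice(len(wh), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmax(wh_iou(wh, centroids), axis=1)  # nearest = max IoU
        new_centroids = centroids.copy()
        for c in range(k):
            members = wh[assign == c]
            if len(members) > 0:                 # keep empty clusters in place
                new_centroids[c] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]
```

The resulting widths and heights can then be converted into whatever scale and ratio settings the detector's anchor generator expects.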


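Finally, ensembling, as a general-purpose technique, is also used in competitions.

Code


1.Walkthrough of each part of the config, faster_rcnn_r50_fpn_1x.py:


First, a word about the framework this config file describes: it is a Faster R-CNN object detection network with a ResNet-50 backbone and 5 FPN feature levels, trained for the standard 12 epochs.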


<section style="border-radius: 4px;font-size: 0.85em;margin: 0px 8px;background: rgb(40, 44, 52);color: rgb(171, 178, 191);display: block;padding: 6px;overflow-x: auto;white-space: nowrap;"><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 105px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># model settings</span><br mpa-from-tpl="t"  />model = dict(<br mpa-from-tpl="t"  />  type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 79px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'FasterRCNN'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 70px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># model类型</span><br mpa-from-tpl="t"  />    pretrained=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 138px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'modelzoo://resnet50'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 197px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 预训练模型:imagenet-resnet50</span><br mpa-from-tpl="t"  />    backbone=dict(<br mpa-from-tpl="t"  />        type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 52px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'ResNet'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 90px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># backbone类型</span><br mpa-from-tpl="t"  />        depth=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: 
normal;">50</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 61px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 网络层数</span><br mpa-from-tpl="t"  />        num_stages=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">4</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 122px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># resnet的stage数量</span><br mpa-from-tpl="t"  />        out_indices=(<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 6px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">3</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 118px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 输出的stage的序号</span><br mpa-from-tpl="t"  />        frozen_stages=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 413px;text-decoration: 
none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 冻结的stage数量,即该stage不更新参数,-1表示所有的stage都更新参数</span><br mpa-from-tpl="t"  />        style=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 60px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'pytorch'</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 721px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 网络风格:如果设置pytorch,则stride为2的层是conv3x3的卷积层;如果设置caffe,则stride为2的层是第一个conv1x1的卷积层</span><br mpa-from-tpl="t"  />    neck=dict(<br mpa-from-tpl="t"  />        type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'FPN'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 63px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># neck类型</span><br mpa-from-tpl="t"  />        in_channels=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">256</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">512</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1024</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2048</span>], <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 
154px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 输入的各个stage的通道数</span><br mpa-from-tpl="t"  />        out_channels=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">256</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 133px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 输出的特征层的通道数</span><br mpa-from-tpl="t"  />        num_outs=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">5</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 121px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 输出的特征层的数量</span><br mpa-from-tpl="t"  />    rpn_head=dict(<br mpa-from-tpl="t"  />        type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 59px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'RPNHead'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 81px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># RPN网络类型</span><br mpa-from-tpl="t"  />        in_channels=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">256</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 129px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># RPN网络的输入通道数</span><br mpa-from-tpl="t"  />        feat_channels=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 
0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">256</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 97px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 特征层的通道数</span><br mpa-from-tpl="t"  />        anchor_scales=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 6px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">8</span>], <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 421px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 生成的anchor的baselen,baselen = sqrt(w*h),w和h为anchor的宽和高</span><br mpa-from-tpl="t"  />        anchor_ratios=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 19px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.5</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 19px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1.0</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2.0</span>], <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 101px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># anchor的宽高比</span><br mpa-from-tpl="t"  />        anchor_strides=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">4</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 
102);font-weight: 400;font-style: normal;">8</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">16</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">32</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 14px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">64</span>], <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 269px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 在每个特征层上的anchor的步长(对应于原图)</span><br mpa-from-tpl="t"  />        target_means=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">.0</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 14px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">.0</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">.0</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">.0</span>], <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 均值</span><br mpa-from-tpl="t"  />        target_stds=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 
20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1.0</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1.0</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1.0</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1.0</span>], <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 方差</span><br mpa-from-tpl="t"  />        use_sigmoid_cls=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">True</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 354px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 是否使用sigmoid来进行分类,如果False则使用softmax来分类</span><br mpa-from-tpl="t"  />    bbox_roi_extractor=dict(<br mpa-from-tpl="t"  />        type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 132px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SingleRoIExtractor'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 116px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># RoIExtractor类型</span><br mpa-from-tpl="t"  />        roi_layer=dict(type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 
66px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'RoIAlign'</span>, out_size=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">7</span>, sample_num=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 362px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># ROI具体参数:ROI类型为ROIalign,输出尺寸为7,sample数为2</span><br mpa-from-tpl="t"  />        out_channels=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">256</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 输出通道数</span><br mpa-from-tpl="t"  />        featmap_strides=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 6px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">4</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 6px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">8</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">16</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">32</span>]), <span style="color: 
rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 85px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 特征图的步长</span><br mpa-from-tpl="t"  />    bbox_head=dict(<br mpa-from-tpl="t"  />        type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 118px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SharedFCBBoxHead'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 全连接层类型</span><br mpa-from-tpl="t"  />        num_fcs=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 全连接层数量</span><br mpa-from-tpl="t"  />        in_channels=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">256</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 74px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 输入通道数</span><br mpa-from-tpl="t"  />        fc_out_channels=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1024</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 74px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 输出通道数</span><br mpa-from-tpl="t"  />        roi_feat_size=<span 
style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">7</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 93px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># ROI特征层尺寸</span><br mpa-from-tpl="t"  />        num_classes=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">81</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 292px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 分类器的类别数量+1,+1是因为多了一个背景的类别</span><br mpa-from-tpl="t"  />        target_means=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 14px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.</span>], <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 38px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 均值</span><br mpa-from-tpl="t"  />        target_stds=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid 
rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.1</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.1</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.2</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.2</span>], <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 38px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 方差</span><br mpa-from-tpl="t"  />        reg_class_agnostic=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">False</span>)) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 1031px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 是否采用class_agnostic的方式来预测,class_agnostic表示输出bbox时只考虑其是否为前景,后续分类的时候再根据该bbox在网络中的类别得分来分类,也就是说一个框可以对应多个类别</span><br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 244px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># model training and testing settings</span><br mpa-from-tpl="t"  />train_cfg = dict(<br mpa-from-tpl="t"  />    rpn=dict(<br mpa-from-tpl="t"  />        assigner=dict(<br mpa-from-tpl="t"  />            type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 106px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'MaxIoUAssigner'</span>, <span 
style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 141px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># RPN网络的正负样本划分</span><br mpa-from-tpl="t"  />            pos_iou_thr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.7</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 105px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 正样本的iou阈值</span><br mpa-from-tpl="t"  />            neg_iou_thr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.3</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 105px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 负样本的iou阈值</span><br mpa-from-tpl="t"  />            min_pos_iou=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.3</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 739px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 正样本的iou最小值。如果assign给ground truth的anchors中最大的IOU低于0.3,则忽略所有的anchors,否则保留最大IOU的anchor</span><br mpa-from-tpl="t"  />            ignore_iof_thr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">-1</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 447px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: 
italic;"># 忽略bbox的阈值,当ground truth中包含需要忽略的bbox时使用,-1表示不忽略</span><br mpa-from-tpl="t"  />        sampler=dict(<br mpa-from-tpl="t"  />            type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 99px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'RandomSampler'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 121px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 正负样本提取器类型</span><br mpa-from-tpl="t"  />            num=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">256</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 133px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 需提取的正负样本数量</span><br mpa-from-tpl="t"  />            pos_fraction=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.5</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 正样本比例</span><br mpa-from-tpl="t"  />            neg_pos_ub=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">-1</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 327px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 最大负样本比例,大于该比例的负样本忽略,-1表示不忽略</span><br mpa-from-tpl="t"  />            add_gt_as_proposals=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: 
inline;width: 33px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">False</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 241px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 把ground truth加入proposal作为正样本</span><br mpa-from-tpl="t"  />        allowed_border=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 6px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 184px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 允许在bbox周围外扩一定的像素</span><br mpa-from-tpl="t"  />        pos_weight=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">-1</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 219px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 正样本权重,-1表示不改变原始的权重</span><br mpa-from-tpl="t"  />        smoothl1_beta=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1</span> / <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">9.0</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 75px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 平滑L1系数</span><br mpa-from-tpl="t"  />        debug=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid 
rgb(198, 120, 221);font-weight: 400;font-style: normal;">False</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 70px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># debug模式</span><br mpa-from-tpl="t"  />    rcnn=dict(<br mpa-from-tpl="t"  />        assigner=dict(<br mpa-from-tpl="t"  />            type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 106px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'MaxIoUAssigner'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 136px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># RCNN网络正负样本划分</span><br mpa-from-tpl="t"  />            pos_iou_thr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.5</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 105px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 正样本的iou阈值</span><br mpa-from-tpl="t"  />            neg_iou_thr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.5</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 105px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 负样本的iou阈值</span><br mpa-from-tpl="t"  />            min_pos_iou=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.5</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 
739px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># minimum IoU for a positive sample: if the best IoU among the proposals matched to a ground truth is below 0.5, none of them is forced to be positive; otherwise the proposal with the highest IoU is kept</span><br mpa-from-tpl="t"  />            ignore_iof_thr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">-1</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 447px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># threshold for ignoring bboxes, used when the ground truth contains bboxes to ignore; -1 means none are ignored</span><br mpa-from-tpl="t"  />        sampler=dict(<br mpa-from-tpl="t"  />            type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 99px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'RandomSampler'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 121px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># type of positive/negative sampler</span><br mpa-from-tpl="t"  />            num=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">512</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 133px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># number of positive and negative samples to draw</span><br mpa-from-tpl="t"  />            pos_fraction=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.25</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 
fraction of positive samples</span><br mpa-from-tpl="t"  />            neg_pos_ub=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">-1</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 327px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># upper bound on the negative-to-positive ratio; negatives beyond it are dropped, -1 means no limit</span><br mpa-from-tpl="t"  />            add_gt_as_proposals=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">True</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 241px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># add the ground-truth boxes to the proposals as positive samples</span><br mpa-from-tpl="t"  />        pos_weight=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">-1</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 219px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># weight of positive samples; -1 keeps the original weight</span><br mpa-from-tpl="t"  />        debug=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">False</span>)) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 70px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># debug mode</span><br mpa-from-tpl="t"  />test_cfg = dict(<br mpa-from-tpl="t"  />    rpn=dict( <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 105px;text-decoration: none solid 
rgb(92, 99, 112);font-weight: 400;font-style: italic;"># RPN settings at inference time</span><br mpa-from-tpl="t"  />        nms_across_levels=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">False</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 137px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># whether to run NMS across all FPN levels together</span><br mpa-from-tpl="t"  />        nms_pre=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2000</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 254px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># number of top-scoring proposals kept before NMS</span><br mpa-from-tpl="t"  />        nms_post=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2000</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 254px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># number of top-scoring proposals kept after NMS</span><br mpa-from-tpl="t"  />        max_num=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2000</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 222px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># number of proposals kept after post-processing</span><br mpa-from-tpl="t"  />        nms_thr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 
20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.7</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 57px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># NMS threshold</span><br mpa-from-tpl="t"  />        min_bbox_size=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 88px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># minimum bbox size</span><br mpa-from-tpl="t"  />    rcnn=dict(<br mpa-from-tpl="t"  />        score_thr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.05</span>, nms=dict(type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 32px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'nms'</span>, iou_thr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.5</span>), max_per_img=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">100</span>) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 247px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># max_per_img is the number of detection bboxes finally output per image</span><br mpa-from-tpl="t"  />    <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 297px;text-decoration: none solid rgb(92, 99, 
112);font-weight: 400;font-style: italic;"># soft-nms is also supported for rcnn testing</span><br mpa-from-tpl="t"  />    <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 578px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05) # soft_nms settings</span><br mpa-from-tpl="t"  />)<br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># dataset settings</span><br mpa-from-tpl="t"  />dataset_type = <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'CocoDataset'</span>                <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># dataset type</span><br mpa-from-tpl="t"  />data_root = <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 79px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'data/coco/'</span>                    <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 85px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># dataset root directory</span><br mpa-from-tpl="t"  />img_norm_cfg = dict(<br mpa-from-tpl="t"  />    mean=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 46px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">123.675</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 40px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: 
normal;">116.28</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 39px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">103.53</span>], std=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 40px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">58.395</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">57.12</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 40px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">57.375</span>], to_rgb=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">True</span>) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 414px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># input image normalization: subtract the mean and divide by the standard deviation std; to_rgb converts BGR to RGB</span><br mpa-from-tpl="t"  />data = dict(<br mpa-from-tpl="t"  />    imgs_per_gpu=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 141px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># number of images processed per GPU</span><br mpa-from-tpl="t"  />    workers_per_gpu=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 6px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2</span>, <span style="color: rgb(92, 99, 112);background: 
rgba(0, 0, 0, 0);display: inline;width: 129px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># number of data-loading workers per GPU</span><br mpa-from-tpl="t"  />    train=dict(<br mpa-from-tpl="t"  />        type=dataset_type, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># dataset type</span><br mpa-from-tpl="t"  />        ann_file=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 251px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'annotations/instances_train2017.json'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 139px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># path to the dataset annotation file</span><br mpa-from-tpl="t"  />        img_prefix=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 79px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'train2017/'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 109px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># path to the dataset images</span><br mpa-from-tpl="t"  />        img_scale=(<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1333</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">800</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 227px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 
input image scale: long side 1333, short side 800</span><br mpa-from-tpl="t"  />        img_norm_cfg=img_norm_cfg, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 97px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># image normalization settings</span><br mpa-from-tpl="t"  />        size_divisor=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 14px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">32</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 430px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># size granularity after resizing: 32 means every image ends up with sides that are multiples of 32 (in practice by padding)</span><br mpa-from-tpl="t"  />        flip_ratio=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.5</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 157px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># probability of a random horizontal flip</span><br mpa-from-tpl="t"  />        with_mask=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">False</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 99px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># include masks during training</span><br mpa-from-tpl="t"  />        with_crowd=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">True</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 168px;text-decoration: none solid rgb(92, 99, 
112);font-weight: 400;font-style: italic;"># include difficult (crowd) samples during training</span><br mpa-from-tpl="t"  />        with_label=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">True</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 106px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># include labels during training</span><br mpa-from-tpl="t"  />    val=dict(<br mpa-from-tpl="t"  />        type=dataset_type, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        ann_file=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 238px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'annotations/instances_val2017.json'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        img_prefix=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 66px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'val2017/'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        img_scale=(<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1333</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 
0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">800</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        img_norm_cfg=img_norm_cfg, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        size_divisor=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 14px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">32</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        flip_ratio=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        with_mask=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">False</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        with_crowd=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none 
solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">True</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        with_label=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">True</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />    test=dict(<br mpa-from-tpl="t"  />        type=dataset_type, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        ann_file=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 238px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'annotations/instances_val2017.json'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        img_prefix=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 66px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'val2017/'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        img_scale=(<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: 
inline;width: 27px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1333</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">800</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        img_norm_cfg=img_norm_cfg, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        size_divisor=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 14px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">32</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        flip_ratio=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        with_mask=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">False</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: 
italic;"># same as above</span><br mpa-from-tpl="t"  />        with_label=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">False</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># same as above</span><br mpa-from-tpl="t"  />        test_mode=<span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">True</span>)) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># run the dataset in test mode</span><br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 72px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># optimizer</span><br mpa-from-tpl="t"  />optimizer = dict(type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SGD'</span>, lr=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.02</span>, momentum=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.9</span>, weight_decay=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 39px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">0.0001</span>) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 
0, 0);display: inline;width: 434px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># optimizer settings: lr is the learning rate, momentum the momentum factor, weight_decay the weight-decay factor</span><br mpa-from-tpl="t"  />optimizer_config = dict(grad_clip=dict(max_norm=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 14px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">35</span>, norm_type=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 6px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">2</span>)) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 85px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># gradient clipping settings (clip the L2 norm of the gradients to 35)</span><br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 112px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># learning policy</span><br mpa-from-tpl="t"  />lr_config = dict(<br mpa-from-tpl="t"  />    policy=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 40px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'step'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 61px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># learning-rate schedule policy</span><br mpa-from-tpl="t"  />    warmup=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 53px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'linear'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 257px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># warmup strategy for the initial learning rate; linear means a linear increase</span><br mpa-from-tpl="t"  />    
warmup_iters=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">500</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 213px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># the learning rate ramps up over the first 500 iterations</span><br mpa-from-tpl="t"  />    warmup_ratio=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 20px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1.0</span> / <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 6px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">3</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 85px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># starting learning rate, as a fraction of lr</span><br mpa-from-tpl="t"  />    step=[<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 6px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">8</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">11</span>]) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 186px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># lower the learning rate at epochs 8 and 11</span><br mpa-from-tpl="t"  />checkpoint_config = dict(interval=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1</span>) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 
149px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># save a checkpoint every epoch</span><br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 92px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># yapf:disable</span><br mpa-from-tpl="t"  />log_config = dict(<br mpa-from-tpl="t"  />    interval=<span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">50</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 155px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># log once every 50 batches</span><br mpa-from-tpl="t"  />    hooks=[<br mpa-from-tpl="t"  />        dict(type=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 105px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'TextLoggerHook'</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 133px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># style of the console log output</span><br mpa-from-tpl="t"  />        <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 237px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># dict(type='TensorboardLoggerHook')</span><br mpa-from-tpl="t"  />    ])<br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># yapf:enable</span><br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 
runtime settings</span><br mpa-from-tpl="t"  />total_epochs = <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 13px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">12</span>                               <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 83px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># maximum number of epochs</span><br mpa-from-tpl="t"  />dist_params = dict(backend=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 40px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'nccl'</span>) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 74px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># distributed training parameters</span><br mpa-from-tpl="t"  />log_level = <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 40px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'INFO'</span>                              <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 134px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># logging verbosity level</span><br mpa-from-tpl="t"  />work_dir = <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 238px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'./work_dirs/faster_rcnn_r50_fpn_1x'</span> <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 165px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># directory where log files and model files are stored</span><br mpa-from-tpl="t"  />load_from = <span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid 
rgb(198, 120, 221);font-weight: 400;font-style: normal;">None</span>                                <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 256px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># checkpoint to load weights from; None means only the pretrained backbone weights are used</span><br mpa-from-tpl="t"  />resume_from = <span style="color: rgb(198, 120, 221);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(198, 120, 221);font-weight: 400;font-style: normal;">None</span>                              <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 122px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># checkpoint to resume training from</span><br mpa-from-tpl="t"  />workflow = [(<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 46px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'train'</span>, <span style="color: rgb(209, 154, 102);background: rgba(0, 0, 0, 0);display: inline;width: 7px;text-decoration: none solid rgb(209, 154, 102);font-weight: 400;font-style: normal;">1</span>)] <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 98px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># workflow: run the 'train' phase for 1 epoch per cycle</span></section>
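
To make the runtime settings above concrete, here is a minimal sketch of the arithmetic they imply: how many iterations, log lines, and checkpoints a 12-epoch run produces when logging every 50 batches and saving every epoch. The function name `schedule_summary` and the example values (4000 images, 2 images per GPU, 8 GPUs) are assumptions made for illustration only; they are not taken from the config, and the real MMDetection runner may count log lines slightly differently.

```python
import math


def schedule_summary(num_images, imgs_per_gpu, num_gpus,
                     total_epochs=12, log_interval=50, checkpoint_interval=1):
    """Rough bookkeeping for an epoch-based training schedule.

    Hypothetical helper: it only mirrors the meaning of total_epochs,
    log_config.interval and the per-epoch checkpoint interval.
    """
    # Effective batch size is images per GPU times the number of GPUs.
    batch_size = imgs_per_gpu * num_gpus
    # One iteration consumes one batch; the last batch may be partial.
    iters_per_epoch = math.ceil(num_images / batch_size)
    # A log line is assumed to be emitted every `log_interval` iterations.
    logs_per_epoch = iters_per_epoch // log_interval
    return {
        "batch_size": batch_size,
        "iters_per_epoch": iters_per_epoch,
        "total_iters": iters_per_epoch * total_epochs,
        "log_lines": logs_per_epoch * total_epochs,
        "checkpoints": total_epochs // checkpoint_interval,
    }


# Assumed example values, not read from the config.
summary = schedule_summary(num_images=4000, imgs_per_gpu=2, num_gpus=8)
print(summary)
```

With these assumed numbers each epoch has 250 iterations, so the run writes 12 checkpoints into `work_dir` and about 5 log lines per epoch.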

2. Trick code, cascade_rcnn_r50_fpn_1x.py:


<section style="border-radius: 4px;font-size: 0.85em;margin: 0px 8px;background: rgb(40, 44, 52);color: rgb(171, 178, 191);display: block;padding: 6px;overflow-x: auto;white-space: nowrap;"><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 99px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># fp16 settings</span><br mpa-from-tpl="t"  />fp16 = dict(loss_scale=512.)<br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 105px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># model settings</span><br mpa-from-tpl="t"  />model = dict(<br mpa-from-tpl="t"  />    <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'CascadeRCNN'</span>,<br mpa-from-tpl="t"  />    num_stages=3,<br mpa-from-tpl="t"  />    pretrained=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 158px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'torchvision://resnet50'</span>,<br mpa-from-tpl="t"  />    backbone=dict(<br mpa-from-tpl="t"  />        <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 52px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'ResNet'</span>,<br mpa-from-tpl="t"  />        depth=50,<br mpa-from-tpl="t"  />        num_stages=4,<br mpa-from-tpl="t"  />      
  out_indices=(0, 1, 2, 3),<br mpa-from-tpl="t"  />        frozen_stages=1,<br mpa-from-tpl="t"  />        style=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 60px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'pytorch'</span>,<br mpa-from-tpl="t"  />        <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 262px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;">#dcn=dict( # add deformable convolution to the last three stages</span><br mpa-from-tpl="t"  />         <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 449px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># modulated=False, deformable_groups=1, fallback_on_stride=False),</span><br mpa-from-tpl="t"  />          <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 283px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># stage_with_dcn=(False, True, True, True)</span><br mpa-from-tpl="t"  />        ),<br mpa-from-tpl="t"  />    neck=dict(<br mpa-from-tpl="t"  />        <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'FPN'</span>,<br mpa-from-tpl="t"  />        in_channels=[256, 512, 1024, 2048],<br mpa-from-tpl="t"  />        out_channels=256,<br mpa-from-tpl="t"  />        num_outs=5),<br mpa-from-tpl="t"  />    rpn_head=dict(<br mpa-from-tpl="t"  />        <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: 
normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 59px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'RPNHead'</span>,<br mpa-from-tpl="t"  />        in_channels=256,<br mpa-from-tpl="t"  />        feat_channels=256,<br mpa-from-tpl="t"  />        anchor_scales=[8],<br mpa-from-tpl="t"  />        anchor_ratios=[0.2, 0.5, 1.0, 2.0, 5.0], <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 160px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># added ratios 0.2 and 5.0; a figure will be posted in a couple of days</span><br mpa-from-tpl="t"  />        anchor_strides=[4, 8, 16, 32, 64],<br mpa-from-tpl="t"  />        target_means=[.0, .0, .0, .0],<br mpa-from-tpl="t"  />        target_stds=[1.0, 1.0, 1.0, 1.0],<br mpa-from-tpl="t"  />        loss_cls=dict(<br mpa-from-tpl="t"  />            <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'FocalLoss'</span>, use_sigmoid=True, loss_weight=1.0), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 268px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># loss changed to balance hard/easy samples and the positive/negative ratio</span><br mpa-from-tpl="t"  />        loss_bbox=dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 92px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SmoothL1Loss'</span>, 
beta=1.0 / 9.0, loss_weight=1.0)),<br mpa-from-tpl="t"  />    bbox_roi_extractor=dict(<br mpa-from-tpl="t"  />        <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 132px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SingleRoIExtractor'</span>,<br mpa-from-tpl="t"  />        roi_layer=dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 65px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'RoIAlign'</span>, out_size=7, sample_num=2),<br mpa-from-tpl="t"  />        out_channels=256,<br mpa-from-tpl="t"  />        featmap_strides=[4, 8, 16, 32]),<br mpa-from-tpl="t"  />    bbox_head=[<br mpa-from-tpl="t"  />        dict(<br mpa-from-tpl="t"  />            <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SharedFCBBoxHead'</span>,<br mpa-from-tpl="t"  />            num_fcs=2,<br mpa-from-tpl="t"  />            in_channels=256,<br mpa-from-tpl="t"  />            fc_out_channels=1024,<br mpa-from-tpl="t"  />            roi_feat_size=7,<br mpa-from-tpl="t"  />            num_classes=11,<br mpa-from-tpl="t"  />            target_means=[0., 0., 0., 0.],<br mpa-from-tpl="t"  />            target_stds=[0.1, 0.1, 
0.2, 0.2],<br mpa-from-tpl="t"  />            reg_class_agnostic=True,<br mpa-from-tpl="t"  />            loss_cls=dict(<br mpa-from-tpl="t"  />                <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'CrossEntropyLoss'</span>, use_sigmoid=False, loss_weight=1.0),<br mpa-from-tpl="t"  />            loss_bbox=dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 92px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SmoothL1Loss'</span>, beta=1.0, loss_weight=1.0)),<br mpa-from-tpl="t"  />        dict(<br mpa-from-tpl="t"  />            <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SharedFCBBoxHead'</span>,<br mpa-from-tpl="t"  />            num_fcs=2,<br mpa-from-tpl="t"  />            in_channels=256,<br mpa-from-tpl="t"  />            fc_out_channels=1024,<br mpa-from-tpl="t"  />            roi_feat_size=7,<br mpa-from-tpl="t"  />            num_classes=11,<br mpa-from-tpl="t"  />            target_means=[0., 0., 0., 0.],<br mpa-from-tpl="t"  />            target_stds=[0.05, 0.05, 0.1, 0.1],<br mpa-from-tpl="t"  />            
reg_class_agnostic=True,<br mpa-from-tpl="t"  />            loss_cls=dict(<br mpa-from-tpl="t"  />                <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'CrossEntropyLoss'</span>, use_sigmoid=False, loss_weight=1.0),<br mpa-from-tpl="t"  />            loss_bbox=dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 92px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SmoothL1Loss'</span>, beta=1.0, loss_weight=1.0)),<br mpa-from-tpl="t"  />        dict(<br mpa-from-tpl="t"  />            <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SharedFCBBoxHead'</span>,<br mpa-from-tpl="t"  />            num_fcs=2,<br mpa-from-tpl="t"  />            in_channels=256,<br mpa-from-tpl="t"  />            fc_out_channels=1024,<br mpa-from-tpl="t"  />            roi_feat_size=7,<br mpa-from-tpl="t"  />            num_classes=11,<br mpa-from-tpl="t"  />            target_means=[0., 0., 0., 0.],<br mpa-from-tpl="t"  />            target_stds=[0.033, 0.033, 0.067, 0.067],<br mpa-from-tpl="t"  />            reg_class_agnostic=True,<br mpa-from-tpl="t"  />            
loss_cls=dict(<br mpa-from-tpl="t"  />                <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'CrossEntropyLoss'</span>, use_sigmoid=False, loss_weight=1.0),<br mpa-from-tpl="t"  />            loss_bbox=dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 92px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SmoothL1Loss'</span>, beta=1.0, loss_weight=1.0))<br mpa-from-tpl="t"  />    ])<br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 244px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># model training and testing settings</span><br mpa-from-tpl="t"  />train_cfg = dict(<br mpa-from-tpl="t"  />    rpn=dict(<br mpa-from-tpl="t"  />        assigner=dict(<br mpa-from-tpl="t"  />            <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 106px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'MaxIoUAssigner'</span>,<br mpa-from-tpl="t"  />            pos_iou_thr=0.7,<br mpa-from-tpl="t"  />            neg_iou_thr=0.3,<br mpa-from-tpl="t"  />            min_pos_iou=0.3,<br mpa-from-tpl="t"  />            
ignore_iof_thr=-1),<br mpa-from-tpl="t"  />        sampler=dict(<br mpa-from-tpl="t"  />            <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 99px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'RandomSampler'</span>,<br mpa-from-tpl="t"  />            num=256,<br mpa-from-tpl="t"  />            pos_fraction=0.5,<br mpa-from-tpl="t"  />            neg_pos_ub=-1,<br mpa-from-tpl="t"  />            add_gt_as_proposals=False),<br mpa-from-tpl="t"  />        allowed_border=0,<br mpa-from-tpl="t"  />        pos_weight=-1,<br mpa-from-tpl="t"  />        debug=False),<br mpa-from-tpl="t"  />    rpn_proposal=dict(<br mpa-from-tpl="t"  />        nms_across_levels=False,<br mpa-from-tpl="t"  />        nms_pre=2000,<br mpa-from-tpl="t"  />        nms_post=2000,<br mpa-from-tpl="t"  />        max_num=2000,<br mpa-from-tpl="t"  />        nms_thr=0.7,<br mpa-from-tpl="t"  />        min_bbox_size=0),<br mpa-from-tpl="t"  />    rcnn=[<br mpa-from-tpl="t"  />        dict(<br mpa-from-tpl="t"  />            assigner=dict(<br mpa-from-tpl="t"  />                <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 106px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'MaxIoUAssigner'</span>,<br mpa-from-tpl="t"  />                pos_iou_thr=0.4, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 37px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># changed</span><br 
mpa-from-tpl="t"  />                neg_iou_thr=0.4,<br mpa-from-tpl="t"  />                min_pos_iou=0.4,<br mpa-from-tpl="t"  />                ignore_iof_thr=-1),<br mpa-from-tpl="t"  />            sampler=dict(<br mpa-from-tpl="t"  />                <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'OHEMSampler'</span>,<br mpa-from-tpl="t"  />                num=512,<br mpa-from-tpl="t"  />                pos_fraction=0.25,<br mpa-from-tpl="t"  />                neg_pos_ub=-1,<br mpa-from-tpl="t"  />                add_gt_as_proposals=True),<br mpa-from-tpl="t"  />            pos_weight=-1,<br mpa-from-tpl="t"  />            debug=False),<br mpa-from-tpl="t"  />        dict(<br mpa-from-tpl="t"  />            assigner=dict(<br mpa-from-tpl="t"  />                <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 106px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'MaxIoUAssigner'</span>,<br mpa-from-tpl="t"  />                pos_iou_thr=0.5,<br mpa-from-tpl="t"  />                neg_iou_thr=0.5,<br mpa-from-tpl="t"  />                min_pos_iou=0.5,<br mpa-from-tpl="t"  />                ignore_iof_thr=-1),<br mpa-from-tpl="t"  />            sampler=dict(<br mpa-from-tpl="t"  />                <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: 
normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'OHEMSampler'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 253px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># handles the hard/easy sample imbalance and also the positive/negative ratio</span><br mpa-from-tpl="t"  />                num=512,<br mpa-from-tpl="t"  />                pos_fraction=0.25,<br mpa-from-tpl="t"  />                neg_pos_ub=-1,<br mpa-from-tpl="t"  />                add_gt_as_proposals=True),<br mpa-from-tpl="t"  />            pos_weight=-1,<br mpa-from-tpl="t"  />            debug=False),<br mpa-from-tpl="t"  />        dict(<br mpa-from-tpl="t"  />            assigner=dict(<br mpa-from-tpl="t"  />                <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 106px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'MaxIoUAssigner'</span>,<br mpa-from-tpl="t"  />                pos_iou_thr=0.6,<br mpa-from-tpl="t"  />                neg_iou_thr=0.6,<br mpa-from-tpl="t"  />                min_pos_iou=0.6,<br mpa-from-tpl="t"  />                ignore_iof_thr=-1),<br mpa-from-tpl="t"  />            sampler=dict(<br mpa-from-tpl="t"  />                <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'OHEMSampler'</span>,<br 
mpa-from-tpl="t"  />                num=512,<br mpa-from-tpl="t"  />                pos_fraction=0.25,<br mpa-from-tpl="t"  />                neg_pos_ub=-1,<br mpa-from-tpl="t"  />                add_gt_as_proposals=True),<br mpa-from-tpl="t"  />            pos_weight=-1,<br mpa-from-tpl="t"  />            debug=False)<br mpa-from-tpl="t"  />    ],<br mpa-from-tpl="t"  />    stage_loss_weights=[1, 0.5, 0.25])<br mpa-from-tpl="t"  />test_cfg = dict(<br mpa-from-tpl="t"  />    rpn=dict(<br mpa-from-tpl="t"  />        nms_across_levels=False,<br mpa-from-tpl="t"  />        nms_pre=1000,<br mpa-from-tpl="t"  />        nms_post=1000,<br mpa-from-tpl="t"  />        max_num=1000,<br mpa-from-tpl="t"  />        nms_thr=0.7,<br mpa-from-tpl="t"  />        min_bbox_size=0),<br mpa-from-tpl="t"  />    rcnn=dict(<br mpa-from-tpl="t"  />        score_thr=0.05, nms=dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 32px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'nms'</span>, iou_thr=0.5), max_per_img=20)) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 138px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># soft_nms can be used here instead</span><br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># dataset settings</span><br mpa-from-tpl="t"  />dataset_type = <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'CocoDataset'</span><br mpa-from-tpl="t"  />data_root = <span 
style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 310px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'../../data/chongqing1_round1_train1_20191223/'</span><br mpa-from-tpl="t"  />img_norm_cfg = dict(<br mpa-from-tpl="t"  />    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)<br mpa-from-tpl="t"  />train_pipeline = [<br mpa-from-tpl="t"  />    dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 126px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'LoadImageFromFile'</span>),<br mpa-from-tpl="t"  />    dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 112px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'LoadAnnotations'</span>, with_bbox=True),<br mpa-from-tpl="t"  />    dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 53px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'Resize'</span>, img_scale=(492,658), keep_ratio=True), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 161px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># multi-scale training can be enabled here with a list of scales: [(),()]</span><br mpa-from-tpl="t"  />    
dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 79px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'RandomFlip'</span>, flip_ratio=0.5),<br mpa-from-tpl="t"  />    dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'Normalize'</span>, **img_norm_cfg),<br mpa-from-tpl="t"  />    dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'Pad'</span>, size_divisor=32),<br mpa-from-tpl="t"  />    dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 139px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'DefaultFormatBundle'</span>),<br mpa-from-tpl="t"  />    dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 
121);background: rgba(0, 0, 0, 0);display: inline;width: 60px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'Collect'</span>, keys=[<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'img'</span>, <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'gt_bboxes'</span>, <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'gt_labels'</span>]),<br mpa-from-tpl="t"  />]<br mpa-from-tpl="t"  />test_pipeline = [<br mpa-from-tpl="t"  />    dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 126px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'LoadImageFromFile'</span>),<br mpa-from-tpl="t"  />    dict(<br mpa-from-tpl="t"  />        <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 125px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'MultiScaleFlipAug'</span>,<br mpa-from-tpl="t"  />        img_scale=(492,658),<br mpa-from-tpl="t"  />        flip=False,<br mpa-from-tpl="t"  />        transforms=[<br mpa-from-tpl="t"  />            dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: 
inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 53px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'Resize'</span>, keep_ratio=True),<br mpa-from-tpl="t"  />            dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 79px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'RandomFlip'</span>),<br mpa-from-tpl="t"  />            dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 73px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'Normalize'</span>, **img_norm_cfg),<br mpa-from-tpl="t"  />            dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'Pad'</span>, size_divisor=32),<br mpa-from-tpl="t"  />            dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 
99px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'ImageToTensor'</span>, keys=[<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'img'</span>]),<br mpa-from-tpl="t"  />            dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 59px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'Collect'</span>, keys=[<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'img'</span>]),<br mpa-from-tpl="t"  />        ])<br mpa-from-tpl="t"  />]<br mpa-from-tpl="t"  />data = dict(<br mpa-from-tpl="t"  />    imgs_per_gpu=8, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 504px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># the batch size is changed here: this is the number of images each GPU processes at once</span><br mpa-from-tpl="t"  />    workers_per_gpu=2,<br mpa-from-tpl="t"  />    train=dict(<br mpa-from-tpl="t"  />        <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=dataset_type,<br mpa-from-tpl="t"  />        ann_file=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 158px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'fixed_annotations.json'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: 
inline;width: 123px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 更换自己的json文件</span><br mpa-from-tpl="t"  />        img_prefix=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 60px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'images/'</span>, <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 77px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># images目录</span><br mpa-from-tpl="t"  />        pipeline=train_pipeline),<br mpa-from-tpl="t"  />    val=dict(<br mpa-from-tpl="t"  />        <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=dataset_type,<br mpa-from-tpl="t"  />        ann_file=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 158px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'fixed_annotations.json'</span>,<br mpa-from-tpl="t"  />        img_prefix=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 60px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'images/'</span>,<br mpa-from-tpl="t"  />        pipeline=test_pipeline),<br mpa-from-tpl="t"  />    <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 27px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">test</span>=dict(<br mpa-from-tpl="t"  />        <span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=dataset_type,<br mpa-from-tpl="t"  />        ann_file=data_root + 
<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 158px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'fixed_annotations.json'</span>,<br mpa-from-tpl="t"  />        img_prefix=data_root + <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 60px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'images/'</span>,<br mpa-from-tpl="t"  />        pipeline=test_pipeline))<br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 72px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># optimizer</span><br mpa-from-tpl="t"  />optimizer = dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 33px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'SGD'</span>, lr=0.001, momentum=0.9, weight_decay=0.0001) <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 321px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># lr = 0.00125*batch_size,不能过大,否则梯度爆炸。</span><br mpa-from-tpl="t"  />optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))<br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 112px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># learning policy</span><br mpa-from-tpl="t"  />lr_config = dict(<br mpa-from-tpl="t"  />    policy=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 40px;text-decoration: none solid rgb(152, 195, 121);font-weight: 
400;font-style: normal;">'step'</span>,<br mpa-from-tpl="t"  />    warmup=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 53px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'linear'</span>,<br mpa-from-tpl="t"  />    warmup_iters=500,<br mpa-from-tpl="t"  />    warmup_ratio=1.0 / 3,<br mpa-from-tpl="t"  />    step=[6, 12, 19])<br mpa-from-tpl="t"  />checkpoint_config = dict(interval=1)<br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 92px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># yapf:disable</span><br mpa-from-tpl="t"  />log_config = dict(<br mpa-from-tpl="t"  />    interval=64,<br mpa-from-tpl="t"  />    hooks=[<br mpa-from-tpl="t"  />        dict(<span style="color: rgb(230, 192, 123);background: rgba(0, 0, 0, 0);display: inline;width: 26px;text-decoration: none solid rgb(230, 192, 123);font-weight: 400;font-style: normal;">type</span>=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 105px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'TextLoggerHook'</span>), <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 133px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 控制台输出信息的风格</span><br mpa-from-tpl="t"  />        <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 536px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># dict(type='TensorboardLoggerHook') # 需要安装tensorflow and tensorboard才可以使用</span><br mpa-from-tpl="t"  />    ])<br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 86px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># yapf:enable</span><br 
mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 119px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># runtime settings</span><br mpa-from-tpl="t"  />total_epochs = 20<br mpa-from-tpl="t"  />dist_params = dict(backend=<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 40px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'nccl'</span>)<br mpa-from-tpl="t"  />log_level = <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 40px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'INFO'</span><br mpa-from-tpl="t"  />work_dir = <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 251px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'../work_dirs/cascade_rcnn_r50_fpn_1x'</span> <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 61px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 日志目录</span><br mpa-from-tpl="t"  />load_from = <span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 323px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'../work_dirs/cascade_rcnn_r50_fpn_1x/latest.pth'</span> <span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 109px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;"># 模型加载目录文件</span><br mpa-from-tpl="t"  /><span style="color: rgb(92, 99, 112);background: rgba(0, 0, 0, 0);display: inline;width: 706px;text-decoration: none solid rgb(92, 99, 112);font-weight: 400;font-style: italic;">#load_from = '../work_dirs/cascade_rcnn_r50_fpn_1x/cascade_rcnn_r50_coco_pretrained_weights_classes_11.pth'</span><br mpa-from-tpl="t" 
 />resume_from = None<br mpa-from-tpl="t"  />workflow = [(<span style="color: rgb(152, 195, 121);background: rgba(0, 0, 0, 0);display: inline;width: 46px;text-decoration: none solid rgb(152, 195, 121);font-weight: 400;font-style: normal;">'train'</span>, 1)]</section>
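
To make the lr_config above more concrete, here is a minimal sketch of how a step schedule with linear warmup turns those settings into a learning rate. It is plain Python and does not call MMDetection. The function name lr_at, the zero-based epoch index, and the decay factor gamma=0.1 are assumptions made for illustration; the values for base_lr, steps, warmup_iters and warmup_ratio are copied from the config.

```python
def lr_at(epoch, cur_iter, base_lr=0.001, steps=(6, 12, 19), gamma=0.1,
          warmup_iters=500, warmup_ratio=1.0 / 3):
    """Learning rate for a step policy with linear warmup (illustrative only).

    epoch    -- zero-based epoch index (assumption)
    cur_iter -- global iteration count since training started
    gamma    -- decay factor applied at each milestone (assumed to be 0.1)
    """
    # Step decay: multiply by gamma once for every milestone already reached.
    num_decays = sum(1 for s in steps if epoch >= s)
    lr = base_lr * gamma ** num_decays

    # Linear warmup: start at warmup_ratio * lr and rise linearly to lr
    # over the first warmup_iters iterations.
    if cur_iter < warmup_iters:
        k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
        lr *= (1 - k)
    return lr


if __name__ == "__main__":
    for epoch, it in [(0, 0), (0, 250), (0, 500), (6, 10000), (12, 20000), (19, 30000)]:
        print(f"epoch={epoch:2d} iter={it:6d} lr={lr_at(epoch, it):.8f}")
```

Under these assumptions, training starts at one third of the base learning rate, reaches 0.001 after 500 iterations, and is then divided by 10 at epochs 6, 12 and 19.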

References:
[1] https://arxiv.org/abs/1406.4729
[2] https://arxiv.org/abs/1504.06066
[3] https://arxiv.org/abs/1512.03385
[4] https://arxiv.org/abs/1505.01749
[5] https://arxiv.org/abs/1604.03540
[6] https://arxiv.org/abs/1704.04503
[7] https://arxiv.org/abs/1703.06870

This article originally appeared on: 深度学习这件小事
