CGFT Certification Knowledge Point: ResNet Explained and Implemented

ResNet

Overview

ResNet is built mainly from residual blocks. Before residual networks were proposed, networks could not be made very deep: the convolutional network in VGG reached 19 layers, and GoogLeNet reached 22. As the number of layers keeps growing, networks suffer from degradation: the training loss first decreases, then saturates, and if the depth is increased further the training loss actually rises again. Once residual blocks are introduced, the network can be made very deep and its accuracy improves accordingly.

Motivation

ResNet was designed to solve the degradation problem in deep networks: the deeper the network, the worse its performance on the dataset becomes. The original paper illustrates this degradation with the experiments described below.

The authors tested 20-layer and 56-layer networks on the CIFAR-10 dataset, and the 56-layer network had higher training error and higher test error than the shallower 20-layer network. This is the degradation problem that ResNet sets out to solve, and adopting the residual structure eliminates it. The ImageNet results show the same pattern: without the residual structure, the 34-layer plain-34 network has a larger error than the 18-layer plain-18 network, whereas with the residual structure the 34-layer ResNet-34 has a smaller error than the 18-layer ResNet-18. With the ResNet structure, deeper networks therefore perform better.


How ResNet Works

Compared with the purely sequential structure of a plain network, a residual unit adds a skip connection that sums the input directly with the output, supplementing feature information lost during convolution. This is similar to the skip connections in U-Net, but with an essential difference: the skip connection in ResNet performs an Add (element-wise sum), whereas the skip connection in U-Net performs a Concatenate along the channel dimension.
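A tiny shape check makes the difference concrete (the tensors below are arbitrary placeholders, not taken from any real network): Add keeps the channel count unchanged, while Concatenate doubles it.

import torch

x = torch.randn(1, 64, 56, 56)         # block input
f = torch.randn(1, 64, 56, 56)         # output of the convolutional branch

print((x + f).shape)                   # ResNet-style skip: torch.Size([1, 64, 56, 56])
print(torch.cat([x, f], dim=1).shape)  # U-Net-style skip:  torch.Size([1, 128, 56, 56])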

As the residual-block diagram shows, let H(x) denote the underlying mapping that a network block needs to learn. Learning H(x) directly is difficult, so the block is instead trained on the residual function F(x) = H(x) - x, which is much easier to fit: the training target is to push F(x) toward zero, so that the block approximates an identity mapping, rather than to fit some particular mapping from scratch. The final mapping is then recovered by adding F(x) and x.

The output of the block is therefore:

y = F(x, {W_i}) + x

Because the addition requires F(x) and x to have the same dimensions, this can be written in the more general form below, where a linear projection W_s on the shortcut is used to match dimensions:

y = F(x, {W_i}) + W_s x

There are two ways to match the dimensions: (A) increase the dimension with zero-padding, or (B) increase the dimension with a 1x1 convolution.
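As a rough sketch of the idea (not the torchvision implementation given later, and with illustrative channel sizes), a residual unit with an option-(B) projection shortcut might look like this:

import torch
import torch.nn as nn

class SimpleResidualUnit(nn.Module):
    """Minimal residual unit: y = relu(F(x) + shortcut(x))."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # F(x): two 3x3 convolutions with batch norm
        self.f = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Shortcut: identity if shapes already match, otherwise a 1x1 projection (W_s)
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )
        else:
            self.shortcut = nn.Identity()
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.f(x) + self.shortcut(x))

# The projection shortcut lets channels and resolution change: 64x56x56 -> 128x28x28
block = SimpleResidualUnit(64, 128, stride=2)
print(block(torch.randn(1, 64, 56, 56)).shape)   # torch.Size([1, 128, 28, 28])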

ResNet Architecture

In the architecture diagram, each black curved arrow denotes one residual connection, a dashed arrow denotes downsampling with a stride-2 convolution, and residual units drawn in the same colour belong to the same stage (block). Each of these layers is in fact either a BasicBlock or a Bottleneck structure.
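For reference, the stage layouts of the standard variants, which correspond to the `layers` arguments passed to the ResNet constructor in the code below, are:

# (block type, number of blocks in each of the four stages)
RESNET_LAYOUTS = {
    'ResNet18':  ('BasicBlock', [2, 2, 2, 2]),
    'ResNet34':  ('BasicBlock', [3, 4, 6, 3]),
    'ResNet50':  ('Bottleneck', [3, 4, 6, 3]),
    'ResNet101': ('Bottleneck', [3, 4, 23, 3]),
    'ResNet152': ('Bottleneck', [3, 8, 36, 3]),
}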


BasicBlock and Bottleneck Structures


In the figure above, the left diagram shows the BasicBlock structure and the right diagram shows the Bottleneck structure.

BasicBlock Structure

The BasicBlock consists of two 3x3 convolutional layers, both with 64 channels in this example. Note the skip line, i.e. the Shortcut Connection, which adds the input x to the output.

Bottleneck Structure

The Bottleneck starts with a 1x1 convolutional layer, followed by a 3x3 convolutional layer and then another 1x1 convolutional layer. Note that the channel count changes along the way: the 1x1 convolutions change the number of channels of the feature map so that the result can be added to the identity shortcut x. Just as importantly, changing the dimensions with 1x1 convolutions greatly reduces the number of parameters, which is why the deeper networks use Bottleneck rather than BasicBlock.
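A quick back-of-the-envelope comparison shows the saving. Taking the 256-channel case as an assumed example and counting only convolution weights (no biases or batch-norm parameters):

# Bottleneck: 1x1 (256->64), 3x3 (64->64), 1x1 (64->256)
bottleneck_weights = 1*1*256*64 + 3*3*64*64 + 1*1*64*256   # 69,632
# Two plain 3x3 convolutions at 256 channels (BasicBlock-style at this width)
plain_weights = 2 * (3*3*256*256)                          # 1,179,648
print(plain_weights / bottleneck_weights)                  # roughly 17x more parameters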

Code Implementation

# python3
# -*- coding: utf-8 -*-
# @File   : ResNets.py
# @Author : axjing
# @Time   : 2021-09-16 13
# Description:

import torch
from torch import Tensor
import torch.nn as nn
from torch.hub import load_state_dict_from_url
# from .._internally_replaced_utils import load_state_dict_from_url  # torchvision-internal equivalent
from typing import Type, Any, Callable, Union, List, Optional
from torchsummary import summary

__all__ = ['ResNet', 'ResNet18', 'ResNet34', 'ResNet50', 'ResNet101',
           'ResNet152', 'resnext50_32x4d', 'resnext101_32x8d',
           'wide_ResNet50_2', 'wide_ResNet101_2']

model_urls = {
    'ResNet18': 'https://download.pytorch.org/models/resnet18-f37072fd.pth',
    'ResNet34': 'https://download.pytorch.org/models/resnet34-b627a593.pth',
    'ResNet50': 'https://download.pytorch.org/models/resnet50-0676ba61.pth',
    'ResNet101': 'https://download.pytorch.org/models/resnet101-63fe2227.pth',
    'ResNet152': 'https://download.pytorch.org/models/resnet152-394f9c45.pth',
    'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth',
    'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth',
    'wide_ResNet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth',
    'wide_ResNet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth',
}


def conv3x3(in_planes: int, out_planes: int, stride: int = 1, groups: int = 1, dilation: int = 1) -> nn.Conv2d:
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False, dilation=dilation)


def conv1x1(in_planes: int, out_planes: int, stride: int = 1) -> nn.Conv2d:
    """1x1 convolution"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)


class BasicBlock(nn.Module):
    expansion: int = 1

    def __init__(
        self,
        inplanes: int,
        planes: int,
        stride: int = 1,
        downsample: Optional[nn.Module] = None,
        groups: int = 1,
        base_width: int = 64,
        dilation: int = 1,
        norm_layer: Optional[Callable[..., nn.Module]] = None
    ) -> None:
        super(BasicBlock, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
        # Both self.conv1 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = norm_layer(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = norm_layer(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x: Tensor) -> Tensor:
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out


class Bottleneck(nn.Module):
    # Bottleneck in torchvision places the stride for downsampling at the 3x3 convolution (self.conv2)
    # while the original implementation places the stride at the first 1x1 convolution (self.conv1)
    # according to "Deep residual learning for image recognition" https://arxiv.org/abs/1512.03385.
    # This variant is also known as ResNet V1.5 and improves accuracy according to
    # https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch.
    expansion: int = 4

    def __init__(
        self,
        inplanes: int,
        planes: int,
        stride: int = 1,
        downsample: Optional[nn.Module] = None,
        groups: int = 1,
        base_width: int = 64,
        dilation: int = 1,
        norm_layer: Optional[Callable[..., nn.Module]] = None
    ) -> None:
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # Both self.conv2 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x: Tensor) -> Tensor:
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out


class ResNet(nn.Module):

    def __init__(
        self,
        block: Type[Union[BasicBlock, Bottleneck]],
        layers: List[int],
        num_classes: int = 1000,
        zero_init_residual: bool = False,
        groups: int = 1,
        width_per_group: int = 64,
        replace_stride_with_dilation: Optional[List[bool]] = None,
        norm_layer: Optional[Callable[..., nn.Module]] = None
    ) -> None:
        super(ResNet, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        self._norm_layer = norm_layer

        self.inplanes = 64
        self.dilation = 1
        if replace_stride_with_dilation is None:
            # each element in the tuple indicates if we should replace
            # the 2x2 stride with a dilated convolution instead
            replace_stride_with_dilation = [False, False, False]
        if len(replace_stride_with_dilation) != 3:
            raise ValueError("replace_stride_with_dilation should be None "
                             "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
        self.groups = groups
        self.base_width = width_per_group
        self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = norm_layer(self.inplanes)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2,
                                       dilate=replace_stride_with_dilation[0])
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2,
                                       dilate=replace_stride_with_dilation[1])
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2,
                                       dilate=replace_stride_with_dilation[2])
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

        # Zero-initialize the last BN in each residual branch,
        # so that the residual branch starts with zeros, and each residual block behaves like an identity.
        # This improves the model by 0.2~0.3% according to https://arxiv.org/abs/1706.02677
        if zero_init_residual:
            for m in self.modules():
                if isinstance(m, Bottleneck):
                    nn.init.constant_(m.bn3.weight, 0)  # type: ignore[arg-type]
                elif isinstance(m, BasicBlock):
                    nn.init.constant_(m.bn2.weight, 0)  # type: ignore[arg-type]

    def _make_layer(self, block: Type[Union[BasicBlock, Bottleneck]], planes: int, blocks: int,
                    stride: int = 1, dilate: bool = False) -> nn.Sequential:
        norm_layer = self._norm_layer
        downsample = None
        previous_dilation = self.dilation
        if dilate:
            self.dilation *= stride
            stride = 1
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                norm_layer(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample, self.groups,
                            self.base_width, previous_dilation, norm_layer))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes, groups=self.groups,
                                base_width=self.base_width, dilation=self.dilation,
                                norm_layer=norm_layer))

        return nn.Sequential(*layers)

    def _forward_impl(self, x: Tensor) -> Tensor:
        # See note [TorchScript super()]
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)

        return x

    def forward(self, x: Tensor) -> Tensor:
        return self._forward_impl(x)


def _ResNet(
    arch: str,
    block: Type[Union[BasicBlock, Bottleneck]],
    layers: List[int],
    pretrained: bool,
    progress: bool,
    **kwargs: Any
) -> ResNet:
    model = ResNet(block, layers, **kwargs)
    if pretrained:
        state_dict = load_state_dict_from_url(model_urls[arch],
                                              progress=progress)
        model.load_state_dict(state_dict)
    return model


def ResNet18(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNet-18 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/abs/1512.03385>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _ResNet('ResNet18', BasicBlock, [2, 2, 2, 2], pretrained, progress,
                   **kwargs)


def ResNet34(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNet-34 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/abs/1512.03385>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _ResNet('ResNet34', BasicBlock, [3, 4, 6, 3], pretrained, progress,
                   **kwargs)


def ResNet50(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNet-50 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/abs/1512.03385>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _ResNet('ResNet50', Bottleneck, [3, 4, 6, 3], pretrained, progress,
                   **kwargs)


def ResNet101(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNet-101 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/abs/1512.03385>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _ResNet('ResNet101', Bottleneck, [3, 4, 23, 3], pretrained, progress,
                   **kwargs)


def ResNet152(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNet-152 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/abs/1512.03385>`_.
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _ResNet('ResNet152', Bottleneck, [3, 8, 36, 3], pretrained, progress,
                   **kwargs)


def resnext50_32x4d(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNeXt-50 32x4d model from
    `"Aggregated Residual Transformation for Deep Neural Networks" <https://arxiv.org/abs/1611.05431>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['groups'] = 32
    kwargs['width_per_group'] = 4
    return _ResNet('resnext50_32x4d', Bottleneck, [3, 4, 6, 3],
                   pretrained, progress, **kwargs)


def resnext101_32x8d(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNeXt-101 32x8d model from
    `"Aggregated Residual Transformation for Deep Neural Networks" <https://arxiv.org/abs/1611.05431>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['groups'] = 32
    kwargs['width_per_group'] = 8
    return _ResNet('resnext101_32x8d', Bottleneck, [3, 4, 23, 3],
                   pretrained, progress, **kwargs)


def wide_ResNet50_2(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""Wide ResNet-50-2 model from
    `"Wide Residual Networks" <https://arxiv.org/abs/1605.07146>`_.

    The model is the same as ResNet except for the bottleneck number of channels
    which is twice larger in every block. The number of channels in outer 1x1
    convolutions is the same, e.g. the last block in ResNet-50 has 2048-512-2048
    channels, and in Wide ResNet-50-2 has 2048-1024-2048.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['width_per_group'] = 64 * 2
    return _ResNet('wide_ResNet50_2', Bottleneck, [3, 4, 6, 3],
                   pretrained, progress, **kwargs)


def wide_ResNet101_2(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""Wide ResNet-101-2 model from
    `"Wide Residual Networks" <https://arxiv.org/abs/1605.07146>`_.

    The model is the same as ResNet except for the bottleneck number of channels
    which is twice larger in every block. The number of channels in outer 1x1
    convolutions is the same, e.g. the last block in ResNet-50 has 2048-512-2048
    channels, and in Wide ResNet-50-2 has 2048-1024-2048.
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['width_per_group'] = 64 * 2
    return _ResNet('wide_ResNet101_2', Bottleneck, [3, 4, 23, 3],
                   pretrained, progress, **kwargs)


if __name__ == '__main__':
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    _model = ResNet50(False).to(device)
    summary(_model, input_size=(3, 512, 512), batch_size=8)
    print("*" * 30 + " | EndOfProgram | " + "*" * 30)

ResNet50 parameters (torchsummary output; the repetitive middle Bottleneck stages are elided here):

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [8, 64, 256, 256]           9,408
       BatchNorm2d-2          [8, 64, 256, 256]             128
              ReLU-3          [8, 64, 256, 256]               0
         MaxPool2d-4          [8, 64, 128, 128]               0
            Conv2d-5          [8, 64, 128, 128]           4,096
       BatchNorm2d-6          [8, 64, 128, 128]             128
              ReLU-7          [8, 64, 128, 128]               0
            Conv2d-8          [8, 64, 128, 128]          36,864
       BatchNorm2d-9          [8, 64, 128, 128]             128
             ReLU-10          [8, 64, 128, 128]               0
           Conv2d-11         [8, 256, 128, 128]          16,384
      BatchNorm2d-12         [8, 256, 128, 128]             512
           Conv2d-13         [8, 256, 128, 128]          16,384
      BatchNorm2d-14         [8, 256, 128, 128]             512
             ReLU-15         [8, 256, 128, 128]               0
       Bottleneck-16         [8, 256, 128, 128]               0
               ...                          ...              ...
      Bottleneck-172          [8, 2048, 16, 16]               0
AdaptiveAvgPool2d-173          [8, 2048, 1, 1]                0
          Linear-174                  [8, 1000]        2,049,000
================================================================
Total params: 25,557,032
Trainable params: 25,557,032
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 24.00
Forward/backward pass size (MB): 11976.19
Params size (MB): 97.49
Estimated Total Size (MB): 12097.68
----------------------------------------------------------------
****************************** | EndOfProgram | ******************************
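As a quick usage sketch (assuming the pretrained weight URLs above are reachable, and with a hypothetical import path matching the file name ResNets.py), a model built from these factory functions behaves like any other PyTorch classifier:

import torch
from ResNets import ResNet50   # hypothetical import of the file above

model = ResNet50(pretrained=True)      # downloads ImageNet weights on first use
model.eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)                    # torch.Size([1, 1000])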


Source: 未来现相 (Zhihu)
