
Three ways to implement models in Keras and how they differ

lousu-xi @ 2020-07-06

Preface

I. Keras provides three ways to define models

1. Sequential API

The Sequential API lets you build a model for most problems by stacking layers one on top of another. Although this approach is simple and covers many common deep-learning architectures, it has a limitation: it cannot express models with shared layers or with multiple inputs or outputs.
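For example, a tiny CIFAR-10 classifier could be stacked layer by layer like this (a minimal sketch, not taken from the original post; the layer sizes are arbitrary):

  from keras.models import Sequential
  from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

  # Layers are appended in order; each one has exactly one input and one output.
  model = Sequential()
  model.add(Conv2D(16, (3, 3), padding="same", activation="relu", input_shape=(32, 32, 3)))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Flatten())
  model.add(Dense(10, activation="softmax"))
  model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])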

2. Functional API

The Keras functional API offers a more flexible way to build network models.

It lets you define models with multiple inputs or outputs, as well as models that share layers. Beyond that, it lets you define ad-hoc acyclic network graphs.

A model is defined by creating layer instances and wiring them directly to one another, then constructing a Model that specifies which layers serve as the model's inputs and outputs.
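As a quick illustration (a minimal sketch, not from the post), a skip connection, which the Sequential API cannot express, is straightforward with the functional API:

  from keras.layers import Input, Dense, add
  from keras.models import Model

  inputs = Input(shape=(64,))
  h = Dense(64, activation="relu")(inputs)
  h = Dense(64, activation="relu")(h)
  h = add([h, inputs])                 # shared graph / skip connection
  outputs = Dense(10, activation="softmax")(h)

  # The Model is told explicitly which tensors are its inputs and outputs
  model = Model(inputs=inputs, outputs=outputs)
  model.compile(optimizer="adam", loss="categorical_crossentropy")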

3. Subclassing API

With the subclassing API, the model is defined by subclassing keras.Model: layers are created in __init__ and the forward pass is written imperatively in call(), much like PyTorch's forward().
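A minimal sketch of the subclassing style (assuming a Keras version that ships keras.Model subclassing, i.e. Keras 2.2+ or tf.keras; not code from the original post):

  import keras
  from keras.layers import Dense

  class MyModel(keras.Model):
      def __init__(self, num_classes=10):
          super(MyModel, self).__init__()
          self.dense1 = Dense(64, activation="relu")
          self.dense2 = Dense(num_classes, activation="softmax")

      def call(self, inputs):
          # The forward pass is written imperatively, layer by layer
          x = self.dense1(inputs)
          return self.dense2(x)

  model = MyModel()
  model.compile(optimizer="adam", loss="categorical_crossentropy")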

Additional notes: building models in Keras vs. PyTorch

Using the CIFAR-10 dataset, we build a Residual_Network in both frameworks as an example and compare how they differ.

Dataset formats

PyTorch dataset format

  import torch
  import torch.nn as nn
  import torchvision

  # Download and construct CIFAR-10 dataset.
  train_dataset = torchvision.datasets.CIFAR10(root='../../data/',
                                               train=True,
                                               transform=torchvision.transforms.ToTensor(),
                                               download=True)

  # Fetch one data pair (read data from disk).
  image, label = train_dataset[0]
  print(image.size())                # torch.Size([3, 32, 32])
  print(label)                       # 6
  print(train_dataset.data.shape)    # (50000, 32, 32, 3)
  # type(train_dataset.targets) == list
  print(len(train_dataset.targets))  # 50000

  # Data loader (this provides queues and threads in a very simple way).
  train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                             batch_size=64,
                                             shuffle=True)
  """
  # Demonstrate the data structure returned by DataLoader
  # When iteration starts, queue and thread start to load data from files.
  data_iter = iter(train_loader)

  # Mini-batch images and labels.
  images, labels = next(data_iter)
  print(images.shape)  # torch.Size([64, 3, 32, 32])
  print(labels.shape)  # torch.Size([64]); after the DataLoader, labels turn from a list into PyTorch tensors
  """
  # Typical usage is the loop below.
  # Actual usage of the data loader is as below.
  for images, labels in train_loader:
      # Training code should be written here.
      pass

 

Keras dataset format

  import keras
  from keras.datasets import cifar10

  (train_x, train_y), (test_x, test_y) = cifar10.load_data()
  print(train_x.shape)  # ndarray of shape (50000, 32, 32, 3)
  print(train_y.shape)  # (50000, 1)

 

Differences in the data format fed to the network

  """  1: pytorch 都是內置torch.xxTensor輸入網絡,而keras的則是原生ndarray類型  2: 對於multi-class的其中一種loss,即cross-entropy loss 而言,    pytorch的api為 CorssEntropyLoss, 但y_true不能用one-hoe編碼!這與keras,tensorflow	    都不同。tensorflow相應的api為softmax_cross_entropy    他們的api都僅限於multi-class classification  3*: 其實上面提到的api都屬於categorical cross-entropy loss,    又叫 softmax loss,是函數內部先進行了 softmax 激活,再經過cross-entropy loss。    這個loss是cross-entropy loss的變種,    cross-entropy loss又叫logistic loss 或 multinomial logistic loss。    實現這種loss的函數不包括激活函數,需要自定義。    pytorch對應的api為BCEloss(僅限於 binary classification),    tensorflow 對應的api為 log_loss。    cross-entropy loss的第二個變種是 binary cross-entropy loss 又叫 sigmoid cross-  entropy loss。    函數內部先進行了sigmoid激活,再經過cross-entropy loss。    pytorch對應的api為BCEWithLogitsLoss,    tensorflow對應的api為sigmoid_cross_entropy  """    # pytorch  criterion = nn.CrossEntropyLoss()  ...  for epoch in range(num_epochs):    for i, (images, labels) in enumerate(train_loader):      images = images.to(device)      labels = labels.to(device)            # Forward pass      outputs = model(images)      # 對於multi-class cross-entropy loss      # 輸入y_true不需要one-hot編碼      loss = criterion(outputs, labels)  ...    # keras  # 對於multi-class cross-entropy loss  # 輸入y_true需要one-hot編碼  train_y = keras.utils.to_categorical(train_y,10)  ...  model.fit_generator(datagen.flow(train_x, train_y, batch_size=128),            validation_data=[test_x,test_y],            epochs=epochs,steps_per_epoch=steps_per_epoch, verbose=1)  ...

 

Overall workflow

Keras workflow

  model = myModel()
  model.compile(optimizer=Adam(0.001), loss="categorical_crossentropy", metrics=["accuracy"])
  model.fit_generator(datagen.flow(train_x, train_y, batch_size=128),
                      validation_data=[test_x, test_y],
                      epochs=epochs, steps_per_epoch=steps_per_epoch, verbose=1, workers=4)
  # Evaluate the accuracy of the test dataset
  accuracy = model.evaluate(x=test_x, y=test_y, batch_size=128)
  # Save the whole network
  model.save("cifar10model.h5")
  """
  # https://blog.csdn.net/jiandanjinxin/article/details/77152530
  # To load it back:
  # keras.models.load_model("cifar10model.h5")

  # Save the architecture only
  # json_string = model.to_json()
  # open('my_model_architecture.json', 'w').write(json_string)

  # To load it back:
  # from keras.models import model_from_json
  # model = model_from_json(open('my_model_architecture.json').read())

  # Save the weights only
  # model.save_weights('my_model_weights.h5')
  # An identical model must first be built in code
  # model.load_weights('my_model_weights.h5')
  # To load the weights into a different architecture (with some identical layers),
  # e.g. for fine-tuning or transfer learning, load by layer name:
  # model.load_weights('my_model_weights.h5', by_name=True)
  """
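The datagen used in fit_generator is never defined in the post; it is presumably a Keras ImageDataGenerator. A sketch of what it might look like (the augmentation settings, epochs value, and steps_per_epoch below are assumptions, not the author's):

  from keras.preprocessing.image import ImageDataGenerator

  # Hypothetical augmentation setup for the datagen.flow(...) calls above
  datagen = ImageDataGenerator(width_shift_range=0.1,
                               height_shift_range=0.1,
                               horizontal_flip=True)

  epochs = 80                             # assumed value
  steps_per_epoch = len(train_x) // 128   # one pass over the training set per epoch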

 

PyTorch workflow

  model = myModel()
  # Loss and optimizer
  criterion = nn.CrossEntropyLoss()
  optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

  for epoch in range(num_epochs):
      for i, (images, labels) in enumerate(train_loader):
          images = images.to(device)
          labels = labels.to(device)

          # Forward pass
          outputs = model(images)
          loss = criterion(outputs, labels)

          # Backward and optimize
          # Clear the gradients accumulated in the previous iteration
          optimizer.zero_grad()
          # Backpropagate and compute the gradients
          loss.backward()
          # Update the weights
          optimizer.step()

  # model.eval() switches the model to test mode; dropout and batch normalization
  # behave differently during training and testing.
  # In eval() mode PyTorch freezes BN and dropout and uses the learned statistics
  # instead of per-batch averages.
  # Otherwise, if the test batch_size is too small, the BN layers can easily
  # distort the results badly (e.g. heavy colour distortion in generated images).
  model.eval()
  with torch.no_grad():
      correct = 0
      total = 0
      for images, labels in test_loader:
          images = images.to(device)
          labels = labels.to(device)
          outputs = model(images)
          _, predicted = torch.max(outputs.data, 1)
          total += labels.size(0)
          correct += (predicted == labels).sum().item()

      print('Accuracy of the model on the test images: {} %'.format(100 * correct / total))

  # Save the model checkpoint
  # This saves only the weights
  torch.save(model.state_dict(), 'resnet.ckpt')
  """
  # To load them back:
  # myModel.load_state_dict(torch.load('resnet.ckpt'))
  # To save the whole network (architecture + weights):
  # torch.save(model, 'model.ckpt')
  # To load it back:
  # model = torch.load('model.ckpt')
  """

 

Workflow comparison

  # https://blog.csdn.net/dss_dssssd/article/details/83892824
  """
  1: Prepare the data (note the different data formats)
  2: Define the network structure (model)
  3: Define the loss function
  4: Define the optimization algorithm (optimizer)
  5: Training - Keras
     5.1: Compile the model (pass in the loss function, optimizer, etc.)
     5.2: Train the model (fit or fit_generator, passing in the data)
  5: Training - PyTorch
     Iterate over the training data:
     5.1: Prepare the input data and labels as tensors (optional)
     5.2: Forward pass to compute the network output and the loss
     5.3: Backward pass to update the parameters;
          none of the following three lines can be omitted:
          5.3.1: Clear the gradients from the previous iteration
                 optimizer.zero_grad()
          5.3.2: Backpropagate and compute the gradients
                 loss.backward()
          5.3.3: Update the weights
                 optimizer.step()
  6: Evaluate on the test set - Keras
     model.evaluate
  6: Evaluate on the test set - PyTorch
     Iterate over the test set and compute the metric yourself
  7: Save the network (optional); see the code above for details
  """

 

Building the network

Network comparison

1. In Keras you do not need to pass input_channels; the framework infers them internally, whereas PyTorch requires input_channels to be declared explicitly.

2. PyTorch's Conv2d needs an explicit padding value, whereas Keras only offers the two options same and valid (valid means padding=0).

3. Keras's Flatten operation corresponds to view in PyTorch.

4. With a TensorFlow backend, Keras dimensions are usually ordered (H, W, C), whereas PyTorch uses (C, H, W).

5. The concrete mapping can be seen in the code below; since I have not studied PyTorch and have only just started with Keras, I cannot guarantee correctness and will revisit this once I have learned more.
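As a quick illustration of points 2-4 (padding, Flatten vs. view, and the dimension order), here is a minimal sketch; it is not from the original post and the tensor names are made up:

  import numpy as np
  import torch
  import torch.nn as nn
  from keras.layers import Conv2D, Flatten, Input
  from keras.models import Model

  # Dimension order: Keras (TF backend) uses NHWC, PyTorch uses NCHW
  x_nhwc = np.random.rand(1, 32, 32, 3).astype("float32")            # Keras layout
  x_nchw = torch.from_numpy(x_nhwc.transpose(0, 3, 1, 2).copy())     # PyTorch layout

  # 3x3 convolution that keeps the spatial size:
  # Keras: padding="same"; PyTorch: explicit padding=1 (and explicit in_channels)
  inp = Input((32, 32, 3))
  k_out = Conv2D(16, (3, 3), padding="same")(inp)
  k_flat = Flatten()(k_out)                  # Keras Flatten ...
  k_model = Model(inp, k_flat)

  p_conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
  p_out = p_conv(x_nchw)
  p_flat = p_out.view(p_out.size(0), -1)     # ... corresponds to view in PyTorch

  print(k_model.predict(x_nhwc).shape)       # (1, 16384)
  print(p_flat.shape)                        # torch.Size([1, 16384])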

Building the Residual-network in PyTorch

  import torch
  import torch.nn as nn
  import torchvision
  import torchvision.transforms as transforms


  # Device configuration
  device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

  # Hyper-parameters
  num_epochs = 80
  learning_rate = 0.001

  # Image preprocessing modules
  transform = transforms.Compose([
      transforms.Pad(4),
      transforms.RandomHorizontalFlip(),
      transforms.RandomCrop(32),
      transforms.ToTensor()])

  # CIFAR-10 dataset
  # train_dataset.data.shape
  # Out[31]: (50000, 32, 32, 3)
  # train_dataset.targets is a list
  # len(train_dataset.targets) == 50000
  train_dataset = torchvision.datasets.CIFAR10(root='./data/',
                                               train=True,
                                               transform=transform,
                                               download=True)

  test_dataset = torchvision.datasets.CIFAR10(root='../../data/',
                                              train=False,
                                              transform=transforms.ToTensor())

  # Data loader
  train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                             batch_size=100,
                                             shuffle=True)

  test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                            batch_size=100,
                                            shuffle=False)

  # 3x3 convolution
  def conv3x3(in_channels, out_channels, stride=1):
      return nn.Conv2d(in_channels, out_channels, kernel_size=3,
                       stride=stride, padding=1, bias=False)

  # Residual block
  class ResidualBlock(nn.Module):
      def __init__(self, in_channels, out_channels, stride=1, downsample=None):
          super(ResidualBlock, self).__init__()
          self.conv1 = conv3x3(in_channels, out_channels, stride)
          self.bn1 = nn.BatchNorm2d(out_channels)
          self.relu = nn.ReLU(inplace=True)
          self.conv2 = conv3x3(out_channels, out_channels)
          self.bn2 = nn.BatchNorm2d(out_channels)
          self.downsample = downsample

      def forward(self, x):
          residual = x
          out = self.conv1(x)
          out = self.bn1(out)
          out = self.relu(out)
          out = self.conv2(out)
          out = self.bn2(out)
          if self.downsample:
              residual = self.downsample(x)
          out += residual
          out = self.relu(out)
          return out

  # ResNet
  class ResNet(nn.Module):
      def __init__(self, block, layers, num_classes=10):
          super(ResNet, self).__init__()
          self.in_channels = 16
          self.conv = conv3x3(3, 16)
          self.bn = nn.BatchNorm2d(16)
          self.relu = nn.ReLU(inplace=True)
          self.layer1 = self.make_layer(block, 16, layers[0])
          self.layer2 = self.make_layer(block, 32, layers[1], 2)
          self.layer3 = self.make_layer(block, 64, layers[2], 2)
          self.avg_pool = nn.AvgPool2d(8)
          self.fc = nn.Linear(64, num_classes)

      def make_layer(self, block, out_channels, blocks, stride=1):
          downsample = None
          if (stride != 1) or (self.in_channels != out_channels):
              downsample = nn.Sequential(
                  conv3x3(self.in_channels, out_channels, stride=stride),
                  nn.BatchNorm2d(out_channels))
          layers = []
          layers.append(block(self.in_channels, out_channels, stride, downsample))
          self.in_channels = out_channels
          for i in range(1, blocks):
              layers.append(block(out_channels, out_channels))
          # [*[1, 2, 3]]
          # Out[96]: [1, 2, 3]
          return nn.Sequential(*layers)

      def forward(self, x):
          out = self.conv(x)  # out.shape: torch.Size([100, 16, 32, 32])
          out = self.bn(out)
          out = self.relu(out)
          out = self.layer1(out)
          out = self.layer2(out)
          out = self.layer3(out)
          out = self.avg_pool(out)
          out = out.view(out.size(0), -1)
          out = self.fc(out)
          return out


  model = ResNet(ResidualBlock, [2, 2, 2]).to(device)

  # pip install torchsummary or
  # git clone https://github.com/sksq96/pytorch-summary
  from torchsummary import summary
  # input_size = (C, H, W)
  summary(model, input_size=(3, 32, 32))

  images, labels = next(iter(train_loader))
  outputs = model(images.to(device))

  # Loss and optimizer
  criterion = nn.CrossEntropyLoss()
  optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

  # For updating learning rate
  def update_lr(optimizer, lr):
      for param_group in optimizer.param_groups:
          param_group['lr'] = lr

  # Train the model
  total_step = len(train_loader)
  curr_lr = learning_rate
  for epoch in range(num_epochs):
      for i, (images, labels) in enumerate(train_loader):
          images = images.to(device)
          labels = labels.to(device)

          # Forward pass
          outputs = model(images)
          loss = criterion(outputs, labels)

          # Backward and optimize
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()

          if (i+1) % 100 == 0:
              print("Epoch [{}/{}], Step [{}/{}] Loss: {:.4f}"
                    .format(epoch+1, num_epochs, i+1, total_step, loss.item()))

      # Decay learning rate
      if (epoch+1) % 20 == 0:
          curr_lr /= 3
          update_lr(optimizer, curr_lr)

  # Test the model
  model.eval()
  with torch.no_grad():
      correct = 0
      total = 0
      for images, labels in test_loader:
          images = images.to(device)
          labels = labels.to(device)
          outputs = model(images)
          _, predicted = torch.max(outputs.data, 1)
          total += labels.size(0)
          correct += (predicted == labels).sum().item()

      print('Accuracy of the model on the test images: {} %'.format(100 * correct / total))

  # Save the model checkpoint
  torch.save(model.state_dict(), 'resnet.ckpt')

 

The corresponding network-building code in Keras

  """  #pytorch  def conv3x3(in_channels, out_channels, stride=1):    return nn.Conv2d(in_channels, out_channels, kernel_size=3,              stride=stride, padding=1, bias=False)  """    def conv3x3(x,out_channels, stride=1):    #out = spatial_2d_padding(x,padding=((1, 1), (1, 1)), data_format="channels_last")    return Conv2D(filters=out_channels, kernel_size=[3,3], strides=(stride,stride),padding="same")(x)    """  # pytorch  # Residual block  class ResidualBlock(nn.Module):    def __init__(self, in_channels, out_channels, stride=1, downsample=None):      super(ResidualBlock, self).__init__()      self.conv1 = conv3x3(in_channels, out_channels, stride)      self.bn1 = nn.BatchNorm2d(out_channels)      self.relu = nn.ReLU(inplace=True)      self.conv2 = conv3x3(out_channels, out_channels)      self.bn2 = nn.BatchNorm2d(out_channels)      self.downsample = downsample          def forward(self, x):      residual = x      out = self.conv1(x)      out = self.bn1(out)      out = self.relu(out)      out = self.conv2(out)      out = self.bn2(out)      if self.downsample:        residual = self.downsample(x)      out += residual      out = self.relu(out)      return out  """  def ResidualBlock(x, out_channels, stride=1, downsample=False):    residual = x    out = conv3x3(x, out_channels,stride)    out = BatchNormalization()(out)    out = Activation("relu")(out)    out = conv3x3(out, out_channels)    out = BatchNormalization()(out)    if downsample:      residual = conv3x3(residual, out_channels, stride=stride)      residual = BatchNormalization()(residual)    out = keras.layers.add([residual,out])    out = Activation("relu")(out)    return out  """  #pytorch  def make_layer(self, block, out_channels, blocks, stride=1):      downsample = None      if (stride != 1) or (self.in_channels != out_channels):        downsample = nn.Sequential(          conv3x3(self.in_channels, out_channels, stride=stride),          nn.BatchNorm2d(out_channels))      layers = []      layers.append(block(self.in_channels, out_channels, stride, downsample))      self.in_channels = out_channels      for i in range(1, blocks):        layers.append(block(out_channels, out_channels))      # [*[1,2,3]]      # Out[96]: [1, 2, 3]      return nn.Sequential(*layers)  """  def make_layer(x, out_channels, blocks, stride=1):      # tf backend: x.output_shape[-1]==out_channels      #print("x.shape[-1] ",x.shape[-1])      downsample = False      if (stride != 1) or (out_channels != x.shape[-1]):        downsample = True      out = ResidualBlock(x, out_channels, stride, downsample)      for i in range(1, blocks):        out = ResidualBlock(out, out_channels)      return out    def KerasResidual(input_shape):    images = Input(input_shape)    out = conv3x3(images,16) # out.shape=(None, 32, 32, 16)     out = BatchNormalization()(out)    out = Activation("relu")(out)    layer1_out = make_layer(out, 16, layers[0])    layer2_out = make_layer(layer1_out, 32, layers[1], 2)    layer3_out = make_layer(layer2_out, 64, layers[2], 2)    out = AveragePooling2D(pool_size=(8,8))(layer3_out)    out = Flatten()(out)    # pytorch 的nn.CrossEntropyLoss()會首先執行softmax計算    # 當換成keras時,沒有tf類似的softmax_cross_entropy    # 自帶的categorical_crossentropy不會執行激活操作,因此得在Dense層加上activation    out = Dense(units=10, activation="softmax")(out)    model = Model(inputs=images,outputs=out)    return model    input_shape=(32, 32, 3)  layers=[2, 2, 2]  mymodel = KerasResidual(input_shape)  mymodel.summary()

 

PyTorch model summary

  ----------------------------------------------------------------
          Layer (type)               Output Shape         Param #
  ================================================================
              Conv2d-1           [-1, 16, 32, 32]             432
         BatchNorm2d-2           [-1, 16, 32, 32]              32
                ReLU-3           [-1, 16, 32, 32]               0
              Conv2d-4           [-1, 16, 32, 32]           2,304
         BatchNorm2d-5           [-1, 16, 32, 32]              32
                ReLU-6           [-1, 16, 32, 32]               0
              Conv2d-7           [-1, 16, 32, 32]           2,304
         BatchNorm2d-8           [-1, 16, 32, 32]              32
                ReLU-9           [-1, 16, 32, 32]               0
      ResidualBlock-10           [-1, 16, 32, 32]               0
             Conv2d-11           [-1, 16, 32, 32]           2,304
        BatchNorm2d-12           [-1, 16, 32, 32]              32
               ReLU-13           [-1, 16, 32, 32]               0
             Conv2d-14           [-1, 16, 32, 32]           2,304
        BatchNorm2d-15           [-1, 16, 32, 32]              32
               ReLU-16           [-1, 16, 32, 32]               0
      ResidualBlock-17           [-1, 16, 32, 32]               0
             Conv2d-18           [-1, 32, 16, 16]           4,608
        BatchNorm2d-19           [-1, 32, 16, 16]              64
               ReLU-20           [-1, 32, 16, 16]               0
             Conv2d-21           [-1, 32, 16, 16]           9,216
        BatchNorm2d-22           [-1, 32, 16, 16]              64
             Conv2d-23           [-1, 32, 16, 16]           4,608
        BatchNorm2d-24           [-1, 32, 16, 16]              64
               ReLU-25           [-1, 32, 16, 16]               0
      ResidualBlock-26           [-1, 32, 16, 16]               0
             Conv2d-27           [-1, 32, 16, 16]           9,216
        BatchNorm2d-28           [-1, 32, 16, 16]              64
               ReLU-29           [-1, 32, 16, 16]               0
             Conv2d-30           [-1, 32, 16, 16]           9,216
        BatchNorm2d-31           [-1, 32, 16, 16]              64
               ReLU-32           [-1, 32, 16, 16]               0
      ResidualBlock-33           [-1, 32, 16, 16]               0
             Conv2d-34             [-1, 64, 8, 8]          18,432
        BatchNorm2d-35             [-1, 64, 8, 8]             128
               ReLU-36             [-1, 64, 8, 8]               0
             Conv2d-37             [-1, 64, 8, 8]          36,864
        BatchNorm2d-38             [-1, 64, 8, 8]             128
             Conv2d-39             [-1, 64, 8, 8]          18,432
        BatchNorm2d-40             [-1, 64, 8, 8]             128
               ReLU-41             [-1, 64, 8, 8]               0
      ResidualBlock-42             [-1, 64, 8, 8]               0
             Conv2d-43             [-1, 64, 8, 8]          36,864
        BatchNorm2d-44             [-1, 64, 8, 8]             128
               ReLU-45             [-1, 64, 8, 8]               0
             Conv2d-46             [-1, 64, 8, 8]          36,864
        BatchNorm2d-47             [-1, 64, 8, 8]             128
               ReLU-48             [-1, 64, 8, 8]               0
      ResidualBlock-49             [-1, 64, 8, 8]               0
          AvgPool2d-50             [-1, 64, 1, 1]               0
             Linear-51                   [-1, 10]             650
  ================================================================
  Total params: 195,738
  Trainable params: 195,738
  Non-trainable params: 0
  ----------------------------------------------------------------
  Input size (MB): 0.01
  Forward/backward pass size (MB): 3.63
  Params size (MB): 0.75
  Estimated Total Size (MB): 4.38
  ----------------------------------------------------------------

 

Keras model summary

  __________________________________________________________________________________________________
  Layer (type)                    Output Shape         Param #     Connected to
  ==================================================================================================
  input_26 (InputLayer)           (None, 32, 32, 3)    0
  __________________________________________________________________________________________________
  conv2d_103 (Conv2D)             (None, 32, 32, 16)   448         input_26[0][0]
  __________________________________________________________________________________________________
  batch_normalization_99 (BatchNo (None, 32, 32, 16)   64          conv2d_103[0][0]
  __________________________________________________________________________________________________
  activation_87 (Activation)      (None, 32, 32, 16)   0           batch_normalization_99[0][0]
  __________________________________________________________________________________________________
  conv2d_104 (Conv2D)             (None, 32, 32, 16)   2320        activation_87[0][0]
  __________________________________________________________________________________________________
  batch_normalization_100 (BatchN (None, 32, 32, 16)   64          conv2d_104[0][0]
  __________________________________________________________________________________________________
  activation_88 (Activation)      (None, 32, 32, 16)   0           batch_normalization_100[0][0]
  __________________________________________________________________________________________________
  conv2d_105 (Conv2D)             (None, 32, 32, 16)   2320        activation_88[0][0]
  __________________________________________________________________________________________________
  batch_normalization_101 (BatchN (None, 32, 32, 16)   64          conv2d_105[0][0]
  __________________________________________________________________________________________________
  add_34 (Add)                    (None, 32, 32, 16)   0           activation_87[0][0]
                                                                   batch_normalization_101[0][0]
  __________________________________________________________________________________________________
  activation_89 (Activation)      (None, 32, 32, 16)   0           add_34[0][0]
  __________________________________________________________________________________________________
  conv2d_106 (Conv2D)             (None, 32, 32, 16)   2320        activation_89[0][0]
  __________________________________________________________________________________________________
  batch_normalization_102 (BatchN (None, 32, 32, 16)   64          conv2d_106[0][0]
  __________________________________________________________________________________________________
  activation_90 (Activation)      (None, 32, 32, 16)   0           batch_normalization_102[0][0]
  __________________________________________________________________________________________________
  conv2d_107 (Conv2D)             (None, 32, 32, 16)   2320        activation_90[0][0]
  __________________________________________________________________________________________________
  batch_normalization_103 (BatchN (None, 32, 32, 16)   64          conv2d_107[0][0]
  __________________________________________________________________________________________________
  add_35 (Add)                    (None, 32, 32, 16)   0           activation_89[0][0]
                                                                   batch_normalization_103[0][0]
  __________________________________________________________________________________________________
  activation_91 (Activation)      (None, 32, 32, 16)   0           add_35[0][0]
  __________________________________________________________________________________________________
  conv2d_108 (Conv2D)             (None, 16, 16, 32)   4640        activation_91[0][0]
  __________________________________________________________________________________________________
  batch_normalization_104 (BatchN (None, 16, 16, 32)   128         conv2d_108[0][0]
  __________________________________________________________________________________________________
  activation_92 (Activation)      (None, 16, 16, 32)   0           batch_normalization_104[0][0]
  __________________________________________________________________________________________________
  conv2d_110 (Conv2D)             (None, 16, 16, 32)   4640        activation_91[0][0]
  __________________________________________________________________________________________________
  conv2d_109 (Conv2D)             (None, 16, 16, 32)   9248        activation_92[0][0]
  __________________________________________________________________________________________________
  batch_normalization_106 (BatchN (None, 16, 16, 32)   128         conv2d_110[0][0]
  __________________________________________________________________________________________________
  batch_normalization_105 (BatchN (None, 16, 16, 32)   128         conv2d_109[0][0]
  __________________________________________________________________________________________________
  add_36 (Add)                    (None, 16, 16, 32)   0           batch_normalization_106[0][0]
                                                                   batch_normalization_105[0][0]
  __________________________________________________________________________________________________
  activation_93 (Activation)      (None, 16, 16, 32)   0           add_36[0][0]
  __________________________________________________________________________________________________
  conv2d_111 (Conv2D)             (None, 16, 16, 32)   9248        activation_93[0][0]
  __________________________________________________________________________________________________
  batch_normalization_107 (BatchN (None, 16, 16, 32)   128         conv2d_111[0][0]
  __________________________________________________________________________________________________
  activation_94 (Activation)      (None, 16, 16, 32)   0           batch_normalization_107[0][0]
  __________________________________________________________________________________________________
  conv2d_112 (Conv2D)             (None, 16, 16, 32)   9248        activation_94[0][0]
  __________________________________________________________________________________________________
  batch_normalization_108 (BatchN (None, 16, 16, 32)   128         conv2d_112[0][0]
  __________________________________________________________________________________________________
  add_37 (Add)                    (None, 16, 16, 32)   0           activation_93[0][0]
                                                                   batch_normalization_108[0][0]
  __________________________________________________________________________________________________
  activation_95 (Activation)      (None, 16, 16, 32)   0           add_37[0][0]
  __________________________________________________________________________________________________
  conv2d_113 (Conv2D)             (None, 8, 8, 64)     18496       activation_95[0][0]
  __________________________________________________________________________________________________
  batch_normalization_109 (BatchN (None, 8, 8, 64)     256         conv2d_113[0][0]
  __________________________________________________________________________________________________
  activation_96 (Activation)      (None, 8, 8, 64)     0           batch_normalization_109[0][0]
  __________________________________________________________________________________________________
  conv2d_115 (Conv2D)             (None, 8, 8, 64)     18496       activation_95[0][0]
  __________________________________________________________________________________________________
  conv2d_114 (Conv2D)             (None, 8, 8, 64)     36928       activation_96[0][0]
  __________________________________________________________________________________________________
  batch_normalization_111 (BatchN (None, 8, 8, 64)     256         conv2d_115[0][0]
  __________________________________________________________________________________________________
  batch_normalization_110 (BatchN (None, 8, 8, 64)     256         conv2d_114[0][0]
  __________________________________________________________________________________________________
  add_38 (Add)                    (None, 8, 8, 64)     0           batch_normalization_111[0][0]
                                                                   batch_normalization_110[0][0]
  __________________________________________________________________________________________________
  activation_97 (Activation)      (None, 8, 8, 64)     0           add_38[0][0]
  __________________________________________________________________________________________________
  conv2d_116 (Conv2D)             (None, 8, 8, 64)     36928       activation_97[0][0]
  __________________________________________________________________________________________________
  batch_normalization_112 (BatchN (None, 8, 8, 64)     256         conv2d_116[0][0]
  __________________________________________________________________________________________________
  activation_98 (Activation)      (None, 8, 8, 64)     0           batch_normalization_112[0][0]
  __________________________________________________________________________________________________
  conv2d_117 (Conv2D)             (None, 8, 8, 64)     36928       activation_98[0][0]
  __________________________________________________________________________________________________
  batch_normalization_113 (BatchN (None, 8, 8, 64)     256         conv2d_117[0][0]
  __________________________________________________________________________________________________
  add_39 (Add)                    (None, 8, 8, 64)     0           activation_97[0][0]
                                                                   batch_normalization_113[0][0]
  __________________________________________________________________________________________________
  activation_99 (Activation)      (None, 8, 8, 64)     0           add_39[0][0]
  __________________________________________________________________________________________________
  average_pooling2d_2 (AveragePoo (None, 1, 1, 64)     0           activation_99[0][0]
  __________________________________________________________________________________________________
  flatten_2 (Flatten)             (None, 64)           0           average_pooling2d_2[0][0]
  __________________________________________________________________________________________________
  dense_2 (Dense)                 (None, 10)           650         flatten_2[0][0]
  ==================================================================================================
  Total params: 197,418
  Trainable params: 196,298
  Non-trainable params: 1,120
  __________________________________________________________________________________________________

 


                                                       

   

