AI Security Competition: 2nd-Place Solution Write-Up

This project presents the 2nd-place solution of the AI Security Competition and can be fully reproduced. Team name: 我不和大家玩了 (a one-person team). Member: Zhang Xin (张鑫), a second-year master's student at Xidian University. Preliminary round: ranked 6th with 58 submissions. Final round: ranked 2nd with 84 submissions.

Competition Background

AI and machine learning techniques are now widely used in human-computer interaction, recommender systems, security defense, and many other areas, so how easily they can be attacked and how robust they are has drawn great attention from the industry. Typical scenarios include speech and image recognition, credit scoring, fraud prevention, spam filtering, and defense against malicious code and network attacks. Attackers try to bypass or directly attack AI models through various means. In human-computer interaction, with the spread of mobile devices, speech and images have become popular input channels thanks to their convenience and practicality, which makes the accuracy of image recognition critical to the AI industry. This is also the link most easily exploited by attackers: by making tiny modifications to the input data that users cannot perceive, they can make the machine perform the wrong operation. Such attacks can lead to AI systems being compromised and wrong commands being executed, and the resulting chain reaction can cause serious consequences.

The competition tasks and data are provided by Baidu Security and the Baidu Big Data Lab, and the competition platform AI Studio is provided by Baidu's AI Technology Ecosystem department. Participants are encouraged to take this opportunity to learn the theory of adversarial examples and improve their deep learning engineering skills. Developers worldwide are welcome to participate, and university teachers are encouraged to provide guidance.

Competition Description

  • Preliminary round: contestants add perturbations to the given images so that the target models misclassify them. There are three target models: a ResNeXt50 model with public structure and parameters (white box), a MobileNetV2 model with public structure and parameters (white box), and one model whose structure and parameters are not disclosed (black box). For an image of class A, the attack counts as successful as long as the target model predicts any class other than A after the perturbation; the smaller the perturbation, the better.
  • Final round: the goal is the same as in the preliminary round: turn the 120 given images into adversarial examples. The organizers evaluate the submitted samples against five target models on the back end (one is the same ResNeXt50 white-box model as in the preliminary round, one is a manually hardened model (gray box), and the other three are black-box models, including a model trained with AutoDL). An attack counts as successful whenever a target model's prediction differs from the label; the more samples attacked successfully and the smaller the perturbation, the higher the score.
 

1 Getting Familiar with the Baseline

The solution adds functions on top of the baseline, so let's first walk through the baseline.

 

1.1 The baseline directory structure

In[ ]
#unzip the code archive
import zipfile
tar = zipfile.ZipFile('/home/aistudio/data/data19725/attack_by_xin.zip','r')
tar.extractall()
In[ ]
cd baidu_attack_by_xin/
/home/aistudio/baidu_attack_by_xin
 

The baseline contains the following directories and files:

  • attack — core attack algorithm code
  • models — model structure definitions
  • models_parameters — model parameters
  • input_image — the 120 input images and the label file
  • output_image — output images
  • utils.py — common utilities such as image loading/preprocessing and argument printing
  • attack_FGSM.py — the main script, including model definition and algorithm invocation; this file is shown directly in the notebook below

note: the parts shown in bold in the original notebook are the ones my solution modifies.

 

1.2 Walking through the baseline

The code is roughly divided into the following parts:

  1. Define the model and load its parameters.

    The model is ResNeXt50_32x4d with a cross-entropy loss, exactly as when training a classifier. The difference lies in how the parameters are updated, which is explained in the FGSM section below.

  2. Read the images one by one, call the FGSM algorithm, and generate the adversarial examples.

In[ ]
#coding=utf-8

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import functools
import numpy as np
import paddle.fluid as fluid

#load custom modules
import models
##################################################
##################################################
#import the FGSM and PGD attacks
#the functions of my solution are also defined in attack/attack_pp.py
from attack.attack_pp import FGSM, PGD
##################################################
##################################################
from utils import init_prog, save_adv_image, process_img, tensor2img, calc_mse, add_arguments, print_arguments

path = "/home/aistudio/baidu_attack_by_xin/"
######Init args
image_shape = [3,224,224]
class_dim=121
input_dir = path + "input_image/"
output_dir = path +  "output_image/"
model_name="ResNeXt50_32x4d"
pretrained_model= path + "models_parameters/86.45+88.81ResNeXt50_32x4d"

val_list = 'val_list.txt'
use_gpu=True

######Attack graph
adv_program=fluid.Program()
#initialization
with fluid.program_guard(adv_program):
    input_layer = fluid.layers.data(name='image', shape=image_shape, dtype='float32')
    #allow gradients to be computed with respect to the input
    input_layer.stop_gradient=False

    # model definition
    model = models.__dict__[model_name]()
    out_logits = model.net(input=input_layer, class_dim=class_dim)
    out = fluid.layers.softmax(out_logits)

    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())

    #load the model parameters
    fluid.io.load_persistables(exe, pretrained_model)

#set the state of the BN layers in adv_program
init_prog(adv_program)

#clone an evaluation program for testing
eval_program = adv_program.clone(for_test=True)

#define the gradients
with fluid.program_guard(adv_program):
    label = fluid.layers.data(name="label", shape=[1] ,dtype='int64')
    loss = fluid.layers.cross_entropy(input=out, label=label)
    gradients = fluid.backward.gradients(targets=loss, inputs=[input_layer])[0]

######Inference
def inference(img):
    fetch_list = [out.name]

    result = exe.run(eval_program,
                     fetch_list=fetch_list,
                     feed={ 'image':img })
    result = result[0][0]
    pred_label = np.argmax(result)
    pred_score = result[pred_label].copy()
    return pred_label, pred_score

######FGSM attack
#untargeted attack
def attack_nontarget_by_FGSM(img, src_label):
    pred_label = src_label

    step = 8.0/256.0
    eps = 32.0/256.0
    while pred_label == src_label:
        #generate the adversarial example
        adv=FGSM(adv_program=adv_program,eval_program=eval_program,gradients=gradients,o=img,
                 input_layer=input_layer,output_layer=out,step_size=step,epsilon=eps,
                 isTarget=False,target_label=0,use_gpu=use_gpu)

        pred_label, pred_score = inference(adv)
        step *= 2
        if step > eps:
            break

    print("Test-score: {0}, class {1}".format(pred_score, pred_label))

    adv_img=tensor2img(adv)
    return adv_img

####### Main #######
def get_original_file(filepath):
    with open(filepath, 'r') as cfile:
        full_lines = [line.strip() for line in cfile]
    cfile.close()
    original_files = []
    for line in full_lines:
        label, file_name = line.split()
        original_files.append([file_name, int(label)])
    return original_files

def gen_adv():
    ######## if you have no idea where to start, start reading from here ########
    mse = 0
    original_files = get_original_file(input_dir + val_list)

    for filename, label in original_files:
        img_path = input_dir + filename
        print("Image: {0} ".format(img_path))
        ##read the image, transpose its dimensions, normalize##########
        img=process_img(img_path)
        ####feed the image to attack_nontarget_by_FGSM to obtain the attacked image#######
        adv_img = attack_nontarget_by_FGSM(img, label)
        image_name, image_ext = filename.split('.')
        ##save the adversarial image
        save_adv_image(adv_img, output_dir+image_name+'.jpg')

        org_img = tensor2img(img)
        ##compare the attacked image with the original and compute the MSE
        score = calc_mse(org_img, adv_img)
        mse += score
    print("ADV {} files, AVG MSE: {} ".format(len(original_files), mse/len(original_files)))


gen_adv()
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02085620_10074.jpg 
Non-Targeted attack target_label=o_label=1
Non-Targeted attack target_label=o_label=1
Non-Targeted attack target_label=o_label=1
Test-score: 0.1829851120710373, class 1
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02085782_1039.jpg 
Non-Targeted attack target_label=o_label=2
Non-Targeted attack target_label=o_label=2
Non-Targeted attack target_label=o_label=2
Test-score: 0.7980572581291199, class 2
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02085936_10130.jpg 
Non-Targeted attack target_label=o_label=3
Test-score: 0.558245837688446, class 54
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02086079_10600.jpg 
Non-Targeted attack target_label=o_label=4
Test-score: 0.4213048815727234, class 5
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02086240_1059.jpg 
Non-Targeted attack target_label=o_label=5
Non-Targeted attack target_label=o_label=5
Non-Targeted attack target_label=o_label=5
Test-score: 0.6555399894714355, class 5
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02086646_1002.jpg 
Non-Targeted attack target_label=o_label=6
Non-Targeted attack target_label=o_label=6
Test-score: 0.13172577321529388, class 68
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02086910_1048.jpg 
Non-Targeted attack target_label=o_label=7
Test-score: 0.24971207976341248, class 1
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02087046_1206.jpg 
Non-Targeted attack target_label=o_label=8
Test-score: 0.7929812669754028, class 108
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02087394_11337.jpg 
Non-Targeted attack target_label=o_label=9
Test-score: 0.33581840991973877, class 93
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088094_1003.jpg 
Non-Targeted attack target_label=o_label=10
Non-Targeted attack target_label=o_label=10
Non-Targeted attack target_label=o_label=10
Test-score: 0.24336178600788116, class 10
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088238_10013.jpg 
Non-Targeted attack target_label=o_label=11
Non-Targeted attack target_label=o_label=11
Test-score: 0.18260332942008972, class 1
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088364_10108.jpg 
Non-Targeted attack target_label=o_label=12
Test-score: 0.4022131860256195, class 6
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088466_10083.jpg 
Non-Targeted attack target_label=o_label=13
Non-Targeted attack target_label=o_label=13
Test-score: 0.17899833619594574, class 96
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02088632_101.jpg 
Non-Targeted attack target_label=o_label=14
Non-Targeted attack target_label=o_label=14
Non-Targeted attack target_label=o_label=14
Test-score: 0.748593807220459, class 14
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02089078_1064.jpg 
Non-Targeted attack target_label=o_label=15
Test-score: 0.8527135848999023, class 13
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02089867_1029.jpg 
Non-Targeted attack target_label=o_label=16
Test-score: 0.9962345957756042, class 17
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02089973_1066.jpg 
Non-Targeted attack target_label=o_label=17
Test-score: 0.9775213003158569, class 16
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02090379_1272.jpg 
Non-Targeted attack target_label=o_label=18
Test-score: 0.9822365045547485, class 13
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02090622_10343.jpg 
Non-Targeted attack target_label=o_label=19
Test-score: 0.8759220242500305, class 22
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02090721_1292.jpg 
Non-Targeted attack target_label=o_label=20
Test-score: 0.9765266180038452, class 27
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091032_10079.jpg 
Non-Targeted attack target_label=o_label=21
Test-score: 0.8322737812995911, class 22
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091134_10107.jpg 
Non-Targeted attack target_label=o_label=22
Test-score: 0.0860852524638176, class 102
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091244_1000.jpg 
Non-Targeted attack target_label=o_label=23
Non-Targeted attack target_label=o_label=23
Non-Targeted attack target_label=o_label=23
Test-score: 0.8737558126449585, class 23
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091467_1110.jpg 
Non-Targeted attack target_label=o_label=24
Non-Targeted attack target_label=o_label=24
Test-score: 0.12693266570568085, class 99
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091635_1319.jpg 
Non-Targeted attack target_label=o_label=25
Test-score: 0.16966181993484497, class 32
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02091831_10576.jpg 
Non-Targeted attack target_label=o_label=26
Test-score: 0.26041099429130554, class 8
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02092002_10699.jpg 
Non-Targeted attack target_label=o_label=27
Test-score: 0.9970411658287048, class 20
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02092339_1100.jpg 
Non-Targeted attack target_label=o_label=28
Test-score: 0.22998812794685364, class 59
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093256_11023.jpg 
Non-Targeted attack target_label=o_label=29
Test-score: 0.3322082757949829, class 9
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093428_10947.jpg 
Non-Targeted attack target_label=o_label=30
Non-Targeted attack target_label=o_label=30
Test-score: 0.4645201563835144, class 117
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093647_1037.jpg 
Non-Targeted attack target_label=o_label=31
Non-Targeted attack target_label=o_label=31
Test-score: 0.09869368374347687, class 33
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093754_1062.jpg 
Non-Targeted attack target_label=o_label=32
Test-score: 0.35046255588531494, class 111
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093859_1003.jpg 
Non-Targeted attack target_label=o_label=33
Non-Targeted attack target_label=o_label=33
Test-score: 0.23188208043575287, class 83
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02093991_1026.jpg 
Non-Targeted attack target_label=o_label=34
Test-score: 0.6639878749847412, class 35
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02094114_1173.jpg 
Non-Targeted attack target_label=o_label=35
Test-score: 0.9812427759170532, class 20
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02094258_1004.jpg 
Non-Targeted attack target_label=o_label=36
Test-score: 0.990171492099762, class 35
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02094433_10126.jpg 
Non-Targeted attack target_label=o_label=37
Test-score: 0.7961386442184448, class 43
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02095314_1033.jpg 
Non-Targeted attack target_label=o_label=38
Test-score: 0.22626255452632904, class 17
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02095570_1031.jpg 
Non-Targeted attack target_label=o_label=39
Test-score: 0.3102056682109833, class 46
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02095889_1003.jpg 
Non-Targeted attack target_label=o_label=40
Non-Targeted attack target_label=o_label=40
Non-Targeted attack target_label=o_label=40
Test-score: 0.2282048910856247, class 38
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096051_1110.jpg 
Non-Targeted attack target_label=o_label=41
Test-score: 0.6431247591972351, class 39
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096177_10031.jpg 
Non-Targeted attack target_label=o_label=42
Test-score: 0.7510316967964172, class 116
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096294_1111.jpg 
Non-Targeted attack target_label=o_label=43
Non-Targeted attack target_label=o_label=43
Non-Targeted attack target_label=o_label=43
Test-score: 0.1718287616968155, class 43
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096437_1055.jpg 
Non-Targeted attack target_label=o_label=44
Test-score: 0.1869039088487625, class 25
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02096585_10604.jpg 
Non-Targeted attack target_label=o_label=45
Test-score: 0.517105758190155, class 8
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097047_1412.jpg 
Non-Targeted attack target_label=o_label=46
Non-Targeted attack target_label=o_label=46
Test-score: 0.6849876642227173, class 48
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097130_1193.jpg 
Non-Targeted attack target_label=o_label=47
Test-score: 0.41658449172973633, class 33
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097209_1038.jpg 
Non-Targeted attack target_label=o_label=48
Test-score: 0.16146445274353027, class 47
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097298_10676.jpg 
Non-Targeted attack target_label=o_label=49
Test-score: 0.609655499458313, class 33
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097474_1070.jpg 
Non-Targeted attack target_label=o_label=50
Test-score: 0.9852504134178162, class 54
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02097658_1018.jpg 
Non-Targeted attack target_label=o_label=51
Test-score: 0.997583270072937, class 43
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02098105_1078.jpg 
Non-Targeted attack target_label=o_label=52
Test-score: 0.2030351161956787, class 50
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02098286_1009.jpg 
Non-Targeted attack target_label=o_label=53
Test-score: 0.43629899621009827, class 31
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02098413_11385.jpg 
Non-Targeted attack target_label=o_label=54
Test-score: 0.2666652798652649, class 50
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099267_1018.jpg 
Non-Targeted attack target_label=o_label=55
Test-score: 0.24120940268039703, class 64
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099429_1039.jpg 
Non-Targeted attack target_label=o_label=56
Test-score: 0.5451071858406067, class 105
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099601_100.jpg 
Non-Targeted attack target_label=o_label=57
Non-Targeted attack target_label=o_label=57
Test-score: 0.2970142066478729, class 63
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099712_1150.jpg 
Non-Targeted attack target_label=o_label=58
Test-score: 0.8002893924713135, class 29
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02099849_1068.jpg 
Non-Targeted attack target_label=o_label=59
Test-score: 0.4128360450267792, class 70
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02100236_1244.jpg 
Non-Targeted attack target_label=o_label=60
Test-score: 0.5225626826286316, class 70
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02100583_10249.jpg 
Non-Targeted attack target_label=o_label=61
Test-score: 0.7810189127922058, class 59
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02100735_10064.jpg 
Non-Targeted attack target_label=o_label=62
Test-score: 0.20395173132419586, class 12
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02100877_1062.jpg 
Non-Targeted attack target_label=o_label=63
Test-score: 0.1305573433637619, class 62
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02101006_135.jpg 
Non-Targeted attack target_label=o_label=64
Test-score: 0.2646207809448242, class 96
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02101388_10017.jpg 
Non-Targeted attack target_label=o_label=65
Non-Targeted attack target_label=o_label=65
Non-Targeted attack target_label=o_label=65
Test-score: 0.8180195689201355, class 65
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02101556_1116.jpg 
Non-Targeted attack target_label=o_label=66
Test-score: 0.966513991355896, class 69
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102040_1055.jpg 
Non-Targeted attack target_label=o_label=67
Test-score: 0.9941025376319885, class 62
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102177_1160.jpg 
Non-Targeted attack target_label=o_label=68
Non-Targeted attack target_label=o_label=68
Non-Targeted attack target_label=o_label=68
Test-score: 0.24618981778621674, class 68
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102318_10000.jpg 
Non-Targeted attack target_label=o_label=69
Test-score: 0.4054173231124878, class 68
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102480_101.jpg 
Non-Targeted attack target_label=o_label=70
Test-score: 0.3680818974971771, class 69
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02102973_1037.jpg 
Non-Targeted attack target_label=o_label=71
Test-score: 0.7708333730697632, class 116
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02104029_1075.jpg 
Non-Targeted attack target_label=o_label=72
Test-score: 0.10377514362335205, class 58
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02104365_10071.jpg 
Non-Targeted attack target_label=o_label=73
Test-score: 0.2432803213596344, class 64
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105056_1165.jpg 
Non-Targeted attack target_label=o_label=74
Test-score: 0.40509727597236633, class 55
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105162_10076.jpg 
Non-Targeted attack target_label=o_label=75
Test-score: 0.15848565101623535, class 85
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105251_1588.jpg 
Non-Targeted attack target_label=o_label=76
Test-score: 0.10340812057256699, class 52
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105412_1159.jpg 
Non-Targeted attack target_label=o_label=77
Test-score: 0.34471365809440613, class 87
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105505_1018.jpg 
Non-Targeted attack target_label=o_label=78
Test-score: 0.5453677177429199, class 106
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105641_10051.jpg 
Non-Targeted attack target_label=o_label=79
Test-score: 0.29593008756637573, class 40
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02105855_10095.jpg 
Non-Targeted attack target_label=o_label=80
Test-score: 0.4963936507701874, class 81
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106030_11148.jpg 
Non-Targeted attack target_label=o_label=81
Test-score: 0.9998124241828918, class 80
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106166_1205.jpg 
Non-Targeted attack target_label=o_label=81
Non-Targeted attack target_label=o_label=81
Non-Targeted attack target_label=o_label=81
Test-score: 0.49886953830718994, class 82
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106382_1005.jpg 
Non-Targeted attack target_label=o_label=83
Test-score: 0.9096139669418335, class 101
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106550_10048.jpg 
Non-Targeted attack target_label=o_label=84
Test-score: 0.9285635948181152, class 64
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02106662_10122.jpg 
Non-Targeted attack target_label=o_label=85
Non-Targeted attack target_label=o_label=85
Non-Targeted attack target_label=o_label=85
Test-score: 0.26929906010627747, class 85
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107142_10952.jpg 
Non-Targeted attack target_label=o_label=86
Test-score: 0.09236325323581696, class 87
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107312_105.jpg 
Non-Targeted attack target_label=o_label=87
Test-score: 0.9715318083763123, class 8
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107574_1026.jpg 
Non-Targeted attack target_label=o_label=88
Test-score: 0.8894961476325989, class 91
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107683_1003.jpg 
Non-Targeted attack target_label=o_label=89
Test-score: 0.2595520317554474, class 97
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02107908_1030.jpg 
Non-Targeted attack target_label=o_label=90
Test-score: 0.6221421957015991, class 91
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108000_1087.jpg 
Non-Targeted attack target_label=o_label=91
Test-score: 0.9019123911857605, class 90
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108089_1104.jpg 
Non-Targeted attack target_label=o_label=92
Test-score: 0.36616942286491394, class 45
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108422_1096.jpg 
Non-Targeted attack target_label=o_label=93
Non-Targeted attack target_label=o_label=93
Test-score: 0.36447763442993164, class 103
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108551_1025.jpg 
Non-Targeted attack target_label=o_label=94
Test-score: 0.715351939201355, class 64
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02108915_10564.jpg 
Non-Targeted attack target_label=o_label=95
Test-score: 0.5345490574836731, class 8
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02109047_10160.jpg 
Non-Targeted attack target_label=o_label=96
Test-score: 0.5757254362106323, class 13
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02109525_10032.jpg 
Non-Targeted attack target_label=o_label=97
Non-Targeted attack target_label=o_label=97
Non-Targeted attack target_label=o_label=97
Test-score: 0.19706158339977264, class 97
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02109961_11224.jpg 
Non-Targeted attack target_label=o_label=98
Test-score: 0.6545760631561279, class 99
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110063_11105.jpg 
Non-Targeted attack target_label=o_label=99
Test-score: 0.6142775416374207, class 112
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110185_10116.jpg 
Non-Targeted attack target_label=o_label=100
Test-score: 0.8787350058555603, class 98
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110627_10147.jpg 
Non-Targeted attack target_label=o_label=101
Test-score: 0.3279683589935303, class 48
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110806_1214.jpg 
Non-Targeted attack target_label=o_label=102
Test-score: 0.3767395615577698, class 91
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02110958_10378.jpg 
Non-Targeted attack target_label=o_label=103
Non-Targeted attack target_label=o_label=103
Non-Targeted attack target_label=o_label=103
Test-score: 0.5809877514839172, class 103
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02111129_1111.jpg 
Non-Targeted attack target_label=o_label=104
Test-score: 0.5515199899673462, class 94
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02111277_10237.jpg 
Non-Targeted attack target_label=o_label=105
Test-score: 0.5786005258560181, class 71
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02111500_1048.jpg 
Non-Targeted attack target_label=o_label=106
Non-Targeted attack target_label=o_label=106
Test-score: 0.07512383162975311, class 72
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02111889_10059.jpg 
Non-Targeted attack target_label=o_label=107
Test-score: 0.3027220666408539, class 72
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02112018_10158.jpg 
Non-Targeted attack target_label=o_label=108
Non-Targeted attack target_label=o_label=108
Non-Targeted attack target_label=o_label=108
Test-score: 0.8465248346328735, class 108
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02112137_1005.jpg 
Non-Targeted attack target_label=o_label=109
Non-Targeted attack target_label=o_label=109
Non-Targeted attack target_label=o_label=109
Test-score: 0.4224722981452942, class 80
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02112350_10079.jpg 
Non-Targeted attack target_label=o_label=110
Non-Targeted attack target_label=o_label=110
Non-Targeted attack target_label=o_label=110
Test-score: 0.5601344108581543, class 110
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02112706_105.jpg 
Non-Targeted attack target_label=o_label=111
Test-score: 0.3139760494232178, class 119
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113023_1136.jpg 
Non-Targeted attack target_label=o_label=112
Test-score: 0.9765301942825317, class 113
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113186_1030.jpg 
Non-Targeted attack target_label=o_label=113
Test-score: 0.9946261048316956, class 112
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113624_1461.jpg 
Non-Targeted attack target_label=o_label=114
Test-score: 0.8437007069587708, class 115
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113712_10525.jpg 
Non-Targeted attack target_label=o_label=115
Test-score: 0.9968068599700928, class 3
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113799_1155.jpg 
Non-Targeted attack target_label=o_label=116
Test-score: 0.7794128656387329, class 71
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02113978_1034.jpg 
Non-Targeted attack target_label=o_label=117
Test-score: 0.05417395010590553, class 120
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02115641_10261.jpg 
Non-Targeted attack target_label=o_label=118
Test-score: 0.21042458713054657, class 117
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02115913_1010.jpg 
Non-Targeted attack target_label=o_label=119
Test-score: 0.5643325448036194, class 104
Image: /home/aistudio/baidu_attack_by_xin/input_image/n02116738_10024.jpg 
Non-Targeted attack target_label=o_label=120
Non-Targeted attack target_label=o_label=120
Non-Targeted attack target_label=o_label=120
Test-score: 0.19919021427631378, class 120
ADV 120 files, AVG MSE: 4.72762380070619
 

Before introducing FGSM, let's recall the gradient descent update rule:

$\theta := \theta - \alpha \nabla J(\theta)$

where $\theta$ are the model parameters, $\alpha$ is the step size, and $J(\theta)$ is the objective function.

Iterating this update makes $J(\theta)$ smaller and smaller.

In the code cell above, the objective is defined as the cross entropy between the model's output probabilities and the label, so iterating with the rule above would make the model's predictions more accurate.

Our goal, however, is to confuse the model so that it can no longer predict the correct label. All we need to do is flip the sign (here the "parameters" being updated are the input image pixels rather than the network weights):

$\theta := \theta + \nabla J(\theta)$

That is roughly the idea behind FGSM. In addition:

  • FGSM applies a sign function to the gradient, which means that if the gradient in some dimension is -0.000000000001, it becomes -1 after the sign function.
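For instance, applying numpy's sign function to a tiny gradient value (a one-line check, names are mine):

import numpy as np
np.sign(np.array([-1e-12, 0.3]))   # -> array([-1.,  1.])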
In[ ]
""" Explaining and Harnessing Adversarial Examples, I. Goodfellow et al., ICLR 2015 实现了FGSM 支持定向和非定向攻击的单步FGSM input_layer:输入层 output_layer:输出层 step_size:攻击步长 adv_program:生成对抗样本的prog eval_program:预测用的prog isTarget:是否认向攻击 target_label:定向攻击标签 epsilon:约束linf大小 o:原始数据 use_gpu:是否使用GPU 返回: 生成的对抗样本 """
def FGSM(adv_program,eval_program,gradients,o,input_layer,output_layer,step_size=16.0/256,epsilon=16.0/256,isTarget=False,target_label=0,use_gpu=False):
    
    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
   
    result = exe.run(eval_program,
                     fetch_list=[output_layer],
                     feed={ input_layer.name:o })
    result = result[0][0]
   
    o_label = np.argsort(result)[::-1][:1][0]
    
    if not isTarget:
        #untargeted attack: target_label is automatically set to the original label
        print("Non-Targeted attack target_label=o_label={}".format(o_label))
        target_label=o_label
    else:
        print("Targeted attack target_label={} o_label={}".format(target_label,o_label))
        
        
    target_label=np.array([target_label]).astype('int64')
    target_label=np.expand_dims(target_label, axis=0)
    
    #compute the gradient
    g = exe.run(adv_program,
                     fetch_list=[gradients],
                     feed={ input_layer.name:o,'label': target_label  }
               )
    g = g[0][0]
    
    
    if isTarget:
        adv=o-np.sign(g)*step_size
    else:
        #################################
        #note the sign here
        adv=o+np.sign(g)*step_size
    
    #enforce the L_inf constraint
    adv=linf_img_tenosr(o,adv,epsilon)
    
    return adv
 

1.3 Comparing the adversarial example with the original image

In[ ]
#define a helper to visualize the difference between two images
def show_images_diff(original_img,adversarial_img):
    #original_img = np.array(Image.open(original_img))
    #adversarial_img = np.array(Image.open(adversarial_img))
    original_img=cv2.resize(original_img.copy(),(224,224))
    adversarial_img=cv2.resize(adversarial_img.copy(),(224,224))

    plt.figure(figsize=(10,10))

    #original_img=original_img/255.0
    #adversarial_img=adversarial_img/255.0

    plt.subplot(1, 3, 1)
    plt.title('Original Image')
    plt.imshow(original_img)
    plt.axis('off')

    plt.subplot(1, 3, 2)
    plt.title('Adversarial Image')
    plt.imshow(adversarial_img)
    plt.axis('off')

    plt.subplot(1, 3, 3)
    plt.title('Difference')
    difference = 0.0+adversarial_img - original_img
        
    l0 = np.where(difference != 0)[0].shape[0]*100/(224*224*3)
    l2 = np.linalg.norm(difference)/(256*3)
    linf=np.linalg.norm(difference.copy().ravel(),ord=np.inf)
    # print(difference)
    print("l0={}% l2={} linf={}".format(l0, l2,linf))
    
    #(-1,1) -> (0,1)
    #a gray background makes the difference easier to see
    difference=difference/255.0
        
    difference=difference/2.0+0.5
   
    plt.imshow(difference)
    plt.axis('off')

    plt.show()
    

    #plt.savefig('fig_cat.png')
In[ ]
from PIL import Image, ImageOps
import cv2
import matplotlib.pyplot as plt
original_img=np.array(Image.open("/home/aistudio/baidu_attack_by_xin/input_image/n02085782_1039.jpg"))
adversarial_img=np.array(Image.open("/home/aistudio/baidu_attack_by_xin/output_image/n02085782_1039.jpg"))
show_images_diff(original_img,adversarial_img)
l0=92.0014880952381% l2=3.203206511496744 linf=31.0
 

To the naked eye the adversarial example looks unchanged, but the model can no longer recognize this dog.

 

2 Improving the Baseline

The baseline uses FGSM and attacks only a single model, so there are two directions for improvement:

  • First, train more models and attack more of them to gain transferability.
  • Second, improve the algorithm: try more sophisticated, more effective methods.
 

2.1 More models

The models in the ensemble are chosen for diversity: only with as many different models as possible can we hope to approximate the black-box models behind the competition.

  • The models in the orange and blue boxes (of the original figure) were fine-tuned with PyTorch, using the original Stanford Dogs train/test split, 25 epochs, and a learning rate of 0.001. The PyTorch models were then exported to ONNX and converted to Paddle models with x2paddle (see the sketch after the note below); I will cover the conversion details in a later article of this series.

  • The models in the red box were trained directly with PaddlePaddle for 20 epochs, with the remaining settings the same as above.

  • A manually hardened ResNeXt50_32x4d model

The gray-box model in the final is a manually hardened model with a ResNeXt50 structure. To attack the black-box models, I trained a hardened model locally as an approximation of the gray-box model. Training it involves choosing a training set and a training method. The training set has two parts. For the first part, I attacked the preliminary-round white-box model (which has the same network structure as the gray-box model) with several different methods and used the n resulting sample sets as part of the training data, as illustrated in Figure 3.2 (of the original write-up). The underlying assumption is that adversarial examples generated by different attack methods behave differently on the real gray-box model, and some of them will still be recognized correctly. By combining the n sample sets, one can construct an image set that fully succeeds in attacking the gray-box model.

The second part consists of 8,000 images randomly sampled from the Stanford Dogs dataset plus the original 120 images; these are included to preserve the model's generalization ability.

note: this method is very effective; from a competition point of view, if the model you train has the same structure as the black box behind the scenes, its transferability is much better than that of models with other structures.
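A minimal sketch of the PyTorch → ONNX → Paddle conversion pipeline mentioned above; the model and file names are illustrative only, and the x2paddle flags may differ between versions:

import torch
import torchvision

# export a fine-tuned PyTorch model (DenseNet121 as an example) to ONNX
model = torchvision.models.densenet121(pretrained=False)
model.eval()
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "densenet121.onnx")

# then convert the ONNX model to a Paddle model on the command line:
#   x2paddle --framework=onnx --model=densenet121.onnx --save_dir=pd_densenet121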

 

Ensemble attack in code

Fortunately the parameters of the different models are named differently, so loading them all at once does not cause conflicts. The code is as follows.

In[ ]
#coding=utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys
import os
import numpy as np
import paddle.fluid as fluid
import pandas as pd
import models
from attack.attack_pp import FGSM, PGD,linf_img_tenosr,ensem_mom_attack_threshold_9model,\
ensem_mom_attack_threshold_9model2,ensem_mom_attack_threshold_9model_tarversion
from utils import init_prog, save_adv_image, process_img, tensor2img, calc_mse, add_arguments, print_arguments

image_shape = [3, 224, 224]
class_dim=121
input_dir = "./input_image/"
output_dir = "./output_image_attack/"
os.makedirs("./output_image_attack") 
#######################################################################
#these are all the models used
model_name1="ResNeXt50_32x4d"
pretrained_model1="./models_parameters/86.45+88.81ResNeXt50_32x4d"

model_name2="MobileNetV2"
pretrained_model2="./models_parameters/MobileNetV2"

model_name4="VGG16"
pretrained_model4="./models_parameters/VGG16"

model_name3="Densenet121"
pretrained_model3="./models_parameters/Densenet121"

model_name5="mnasnet1_0"
pretrained_model5="./models_parameters/mnasnet1_0"

model_name6="wide_resnet"
pretrained_model6="./models_parameters/wide_resnet"

model_name7="googlenet"
pretrained_model7="./models_parameters/googlenet"

model_name8="nas_mobile_net"
pretrained_model8="./models_parameters/nas_mobile_net"

model_name9="alexnet"
pretrained_model9="./models_parameters/alexnet"
########################################################################
val_list = 'val_list.txt'
use_gpu=True

mydict = {0: 1, 1: 10, 2: 100, 3: 101, 4: 102, 5: 103, 6: 104, 7: 105, 8: 106, 9: 107, 10: 108, 11: 109, 12: 11, 13: 110, 14: 111, 15: 112, 16: 113, 17: 114, 18: 115, 19: 116, 20: 117, 21: 118, 22: 119, 23: 12, 24: 120, 25: 13, 26: 14, 27: 15, 28: 16, 29: 17, 30: 18, 31: 19, 32: 2, 33: 20, 34: 21, 35: 22, 36: 23, 37: 24, 38: 25, 39: 26, 40: 27, 41: 28, 42: 29, 43: 3, 44: 30, 45: 31, 46: 32, 47: 33, 48: 34, 49: 35, 50: 36, 51: 37, 52: 38, 53: 39, 54: 4, 55: 40, 56: 41, 57: 42, 58: 43, 59: 44, 60: 45, 61: 46, 62: 47, 63: 48, 64: 49, 65: 5, 66: 50, 67: 51, 68: 52, 69: 53, 70: 54, 71: 55, 72: 56, 73: 57, 74: 58, 75: 59, 76: 6, 77: 60, 78: 61, 79: 62, 80: 63, 81: 64, 82: 65, 83: 66, 84: 67, 85: 68, 86: 69, 87: 7, 88: 70, 89: 71, 90: 72, 91: 73, 92: 74, 93: 75, 94: 76, 95: 77, 96: 78, 97: 79, 98: 8, 99: 80, 100: 81, 101: 82, 102: 83, 103: 84, 104: 85, 105: 86, 106: 87, 107: 88, 108: 89, 109: 9, 110: 90, 111: 91, 112: 92, 113: 93, 114: 94, 115: 95, 116: 96, 117: 97, 118: 98, 119: 99}
origdict = {1: 0, 2: 32, 3: 43, 4: 54, 5: 65, 6: 76, 7: 87, 8: 98, 9: 109, 10: 1, 11: 12, 12: 23, 13: 25, 14: 26, 15: 27, 16: 28, 17: 29, 18: 30, 19: 31, 20: 33, 21: 34, 22: 35, 23: 36, 24: 37, 25: 38, 26: 39, 27: 40, 28: 41, 29: 42, 30: 44, 31: 45, 32: 46, 33: 47, 34: 48, 35: 49, 36: 50, 37: 51, 38: 52, 39: 53, 40: 55, 41: 56, 42: 57, 43: 58, 44: 59, 45: 60, 46: 61, 47: 62, 48: 63, 49: 64, 50: 66, 51: 67, 52: 68, 53: 69, 54: 70, 55: 71, 56: 72, 57: 73, 58: 74, 59: 75, 60: 77, 61: 78, 62: 79, 63: 80, 64: 81, 65: 82, 66: 83, 67: 84, 68: 85, 69: 86, 70: 88, 71: 89, 72: 90, 73: 91, 74: 92, 75: 93, 76: 94, 77: 95, 78: 96, 79: 97, 80: 99, 81: 100, 82: 101, 83: 102, 84: 103, 85: 104, 86: 105, 87: 106, 88: 107, 89: 108, 90: 110, 91: 111, 92: 112, 93: 113, 94: 114, 95: 115, 96: 116, 97: 117, 98: 118, 99: 119, 100: 2, 101: 3, 102: 4, 103: 5, 104: 6, 105: 7, 106: 8, 107: 9, 108: 10, 109: 11, 110: 13, 111: 14, 112: 15, 113: 16, 114: 17, 115: 18, 116: 19, 117: 20, 118: 21, 119: 22, 120: 24}
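# (mydict maps a converted model's output index back to the original competition label;
#  origdict maps an original label to the index used by the x2paddle-converted models
#  (models 3-9 below), which appear to use a different class ordering than the Paddle-trained ones)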

adv_program=fluid.Program()
startup_program = fluid.Program()

new_scope = fluid.Scope()
#initialization
with fluid.program_guard(adv_program):
    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    label = fluid.layers.data(name="label", shape=[1] ,dtype='int64')
    label2 = fluid.layers.data(name="label2", shape=[1] ,dtype='int64')
    adv_image = fluid.layers.create_parameter(name="adv_image",shape=(1,3,224,224),dtype='float32')
    
    model1 = models.__dict__[model_name1]()
    out_logits1 = model1.net(input=adv_image, class_dim=class_dim)
    out1 = fluid.layers.softmax(out_logits1)

    model2 = models.__dict__[model_name2](scale=2.0)
    out_logits2 = model2.net(input=adv_image, class_dim=class_dim)
    out2 = fluid.layers.softmax(out_logits2)

    _input1 = fluid.layers.create_parameter(name="_input_1", shape=(1,3,224,224),dtype='float32')
    
    model3 = models.__dict__[model_name3]()
    input_layer3,out_logits3 = model3.x2paddle_net(input =adv_image )
    out3 = fluid.layers.softmax(out_logits3[0])
    
    model4 = models.__dict__[model_name4]()
    input_layer4,out_logits4 = model4.x2paddle_net(input =adv_image )
    out4 = fluid.layers.softmax(out_logits4[0])


    model5 = models.__dict__[model_name5]()
    input_layer5,out_logits5 = model5.x2paddle_net(input =adv_image )
    out5 = fluid.layers.softmax(out_logits5[0])

    model6 = models.__dict__[model_name6]()
    input_layer6,out_logits6 = model6.x2paddle_net(input =adv_image)
    out6 = fluid.layers.softmax(out_logits6[0])

    model7 = models.__dict__[model_name7]()
    input_layer7,out_logits7 = model7.x2paddle_net(input =adv_image)
    out7 = fluid.layers.softmax(out_logits7[0])

    model8 = models.__dict__[model_name8]()
    input_layer8,out_logits8 = model8.x2paddle_net(input =adv_image)
    out8 = fluid.layers.softmax(out_logits8[0])
    
    
    model9 = models.__dict__[model_name9]()
    input_layer9,out_logits9 = model9.x2paddle_net(input =adv_image)
    out9 = fluid.layers.softmax(out_logits9[0])

    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())

    one_hot_label = fluid.one_hot(input=label, depth=121)
    one_hot_label2 = fluid.one_hot(input=label2, depth=121)
    smooth_label = fluid.layers.label_smooth(label=one_hot_label, epsilon=0.1, dtype="float32")[0]
    smooth_label2 = fluid.layers.label_smooth(label=one_hot_label2, epsilon=0.1, dtype="float32")[0]



    ze = fluid.layers.fill_constant(shape=[1], value=-1, dtype='float32')
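    # ensemble loss: ze = -1 negates each model's cross entropy; the white-box
    # ResNeXt50 (out1) is weighted 1.2, MobileNetV2 (out2) 0.2, and the seven
    # converted models (out3-out9) 1.0 each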
    loss = 1.2*fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out1, label=label[0]))\
    + 0.2*fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out2, label=label[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out3, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out4, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out5, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out6, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out7, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out8, label=label2[0]))\
    + fluid.layers.matmul(ze, fluid.layers.cross_entropy(input=out9, label=label2[0]))
    
    avg_loss=fluid.layers.reshape(loss ,[1])  #the loss is modified here

init_prog(adv_program)
eval_program = adv_program.clone(for_test=True)

with fluid.program_guard(adv_program): 
    #the variable-name clash problem is not solved here
    #this part loads the model parameters
    def if_exist(var):
        b = os.path.exists(os.path.join(pretrained_model1, var.name))
        return b
    def if_exist2(var):
        b = os.path.exists(os.path.join(pretrained_model2, var.name))
        return b
    def if_exist3(var):
        b = os.path.exists(os.path.join(pretrained_model3, var.name))
        return b
    def if_exist4(var):
        b = os.path.exists(os.path.join(pretrained_model4, var.name))
        return b
    def if_exist5(var):
        b = os.path.exists(os.path.join(pretrained_model5, var.name))
        return b
    def if_exist6(var):
        b = os.path.exists(os.path.join(pretrained_model6, var.name))
        return b
    def if_exist7(var):
        b = os.path.exists(os.path.join(pretrained_model7, var.name))
        return b
    def if_exist8(var):
        b = os.path.exists(os.path.join(pretrained_model8, var.name))
        return b
    def if_exist9(var):
        b = os.path.exists(os.path.join(pretrained_model9, var.name))
        return b
    fluid.io.load_vars(exe,
                       pretrained_model1,
                       fluid.default_main_program(),
                       predicate=if_exist)
    fluid.io.load_vars(exe,
                       pretrained_model2,
                       fluid.default_main_program(),
                       predicate=if_exist2)
    fluid.io.load_vars(exe,
                       pretrained_model3,
                       fluid.default_main_program(),
                       predicate=if_exist3)
    fluid.io.load_vars(exe,
                       pretrained_model4,
                       fluid.default_main_program(),
                       predicate=if_exist4)
    fluid.io.load_vars(exe,
                       pretrained_model5,
                       fluid.default_main_program(),
                       predicate=if_exist5)
    fluid.io.load_vars(exe,
                       pretrained_model6,
                       fluid.default_main_program(),
                       predicate=if_exist6)
    fluid.io.load_vars(exe,
                       pretrained_model7,
                       fluid.default_main_program(),
                       predicate=if_exist7)
    fluid.io.load_vars(exe,
                       pretrained_model8,
                       fluid.default_main_program(),
                       predicate=if_exist8)

    fluid.io.load_vars(exe,
                       pretrained_model9,
                       fluid.default_main_program(),
                       predicate=if_exist9)
    gradients = fluid.backward.gradients(targets=avg_loss, inputs=[adv_image])[0]
    #gradients = fluid.backward.gradients(targets=avg_loss, inputs=[adv_image])
    #print(gradients.shape)
    
def attack_nontarget_by_ensemble(img, src_label,src_label2,label,momentum): #src_label2 is the label remapped for the converted models
    adv,m=ensem_mom_attack_threshold_9model_tarversion(adv_program=adv_program,eval_program=eval_program,gradients=gradients,o=img,
                src_label = src_label,
                src_label2 = src_label2,
                label = label,
                out1 = out1,out2 = out2 ,out3 = out3 ,out4 = out4,out5 = out5,out6 = out6,out7 = out7 ,out8 = out8,out9 = out9,mm = momentum)  #mm (momentum) added

    adv_img=tensor2img(adv)
    return adv_img,m

def get_original_file(filepath):
    with open(filepath, 'r') as cfile:
        full_lines = [line.strip() for line in cfile]
    cfile.close()
    original_files = []
    for line in full_lines:
        label, file_name = line.split()
        original_files.append([file_name, int(label)])
    return original_files
    
def gen_adv():
    mse = 0
    original_files = get_original_file(input_dir + val_list)
    #the initial gradient direction for the next image is the final value from the previous one
    global momentum
    momentum=0
    
    for filename, label in original_files:
        img_path = input_dir + filename
        print("Image: {0} ".format(img_path))
        img=process_img(img_path)
        #adv_img = attack_nontarget_by_ensemble(img, label,origdict[label],label)
        adv_img,m = attack_nontarget_by_ensemble(img, label,origdict[label],label,momentum)
        #m is the last gradient/momentum value from the previous sample
        momentum = m
        #adv_img has already been converted; its range is 0-255

        image_name, image_ext = filename.split('.')
        ##Save adversarial image(.png)
        save_adv_image(adv_img, output_dir+image_name+'.png')
        org_img = tensor2img(img)
        score = calc_mse(org_img, adv_img)
        print("Image:{0}, mase = {1} ".format(img_path,score))
        mse += score
    print("ADV {} files, AVG MSE: {} ".format(len(original_files), mse/len(original_files)))
 

note: the attack function here has been replaced by my own implementation, which is introduced in the next section.

 

2.2 Improving the algorithm

Adding momentum to the iteration

A momentum term is a common way to escape local optima. Since we still treat adversarial example generation as an optimization problem, using momentum is a natural choice.

The corresponding code is as follows:
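(The original screenshot is not included; below is a minimal, self-contained sketch of the momentum update. The same update appears inside ensem_mom_attack_threshold_9model_tarversion at the end of this section; the helper name momentum_step is mine.)

import numpy as np

def momentum_step(g, momentum, decay_factor=0.90):
    # L1-normalize the raw gradient before accumulating it into the momentum
    velocity = g / (np.linalg.norm(g.flatten(), ord=1) + 1e-10)
    momentum = decay_factor * momentum + velocity
    # L2-normalize the accumulated direction actually used for the step
    norm_m = momentum / (np.linalg.norm(momentum.flatten(), ord=2) + 1e-10)
    return momentum, norm_m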

 

Random gradient reversal

  • Coarse-grained: this method is inspired by [2]. The paper searches along two paths: one performs regular gradient ascent (the green line in the paper's figure), while the other first performs gradient descent to reach the local optimum of the current class and only then ascends, hoping to find a faster ascent path (the blue line in Figure 3.3). In my implementation I simplified this and perform gradient descent only in the very first iteration step.

The corresponding code is as follows:
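(A sketch of the simplified two-direction trick; the helper name is mine. The signs follow the full function at the end of this section, whose ensemble loss is already negated, so the i == 0 branch moves toward the source class and all later steps move away from it.)

def curls_style_step(adv, epsilon, norm_m_m, i):
    # only the very first iteration steps in the opposite direction
    if i == 0:
        return adv + epsilon * norm_m_m
    return adv - epsilon * norm_m_m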

  • Fine-grained: randomly pick about 5% of the gradient entries and flip their sign; this can be seen as pixel-level gradient reversal, with the flip ratio as a hyperparameter.

The code is as follows. The ratio is a hyperparameter: to pick a random percentage of entries, generate random numbers with the same shape as the gradient and multiply the entries below a chosen threshold by -1.
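(A minimal sketch of the pixel-level sign flip described above; the helper name and the fixed flip_ratio are mine. The full function below instead varies the ratio with the step index i via the threshold 0.15 - i/900.)

import numpy as np

def random_sign_flip(norm_m_m, flip_ratio=0.05):
    # generate random numbers with the same shape as the gradient and flip
    # (multiply by -1) roughly flip_ratio of the entries
    dir_mask = np.where(np.random.rand(*norm_m_m.shape) > flip_ratio, 1.0, -1.0)
    return norm_m_m * dir_mask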

 

Adding Gaussian noise

This method is inspired by [6]. The authors argue that the gradients of the attacked model are noisy, which hurts transferability, and they replace the original gradient with the average gradient over a batch of noised copies of the original image, which improves results. My understanding differs: I add noise precisely to inject noise into the gradient so as to escape local optima. Moreover, computing the gradient many times is very expensive, so I add noise only once per step, with mean 0 and a variance that is a hyperparameter.

The code is as follows:
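(A minimal sketch of the single noise injection; the helper name is mine. In the full function below the scale is 0.5 + epsilon/90 for the first 50 steps and 0.1 afterwards.)

import numpy as np

def add_gaussian_noise(adv, scale):
    # perturb the current adversarial image with zero-mean Gaussian noise
    # before feeding it to the ensemble to compute gradients
    return (adv + np.random.normal(loc=0.0, scale=scale, size=adv.shape)).astype('float32')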

 

Following up with a targeted attack

This method is also inspired by [2]. After successfully crossing the decision boundary, the authors remove useless noise, which they argue strengthens the transferability of the adversarial example. My approach is different: I believe we should not only cross the boundary but also walk down into the valley of the wrong class. The underlying assumption is that although different models have different decision boundaries, the features they learn should be similar. The idea is illustrated by the red arrows in Figure 3.4 (of the original write-up), where the circled numbers denote iteration steps. Therefore, after the attack succeeds I add two more targeted attack steps, whose target is the class the image was misclassified as at the moment of success. In the ensemble attack, the target is the mode of the misclassified labels.

Targeted attack code: the target label is the mode of the nine models' predictions.
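(A sketch of how the target label is chosen once the untargeted attack succeeds; predictions is the list of the nine models' argmax labels, already remapped to the original label space as in the full function below.)

import pandas as pd

def pick_target_label(predictions):
    # the follow-up targeted steps aim at the most common wrong class
    return pd.Series(predictions).mode()[0]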

 

Putting all of the above strategies together, the full core function is shown in the next code cell. I named the function I designed: ensem_mom_attack_threshold_9model_tarversion

In[ ]
def ensem_mom_attack_threshold_9model_tarversion(adv_program,eval_program,gradients,o,src_label2,src_label,out1,out2,out3,out4,out5,out6,out7,out8,out9,label,mm,iteration=20,use_gpu = True):
    origdict = {1: 0, 2: 32, 3: 43, 4: 54, 5: 65, 6: 76, 7: 87, 8: 98, 9: 109, 10: 1, 11: 12, 12: 23, 13: 25, 14: 26, 15: 27, 16: 28, 17: 29, 18: 30, 19: 31, 20: 33, 21: 34, 22: 35, 23: 36, 24: 37, 25: 38, 26: 39, 27: 40, 28: 41, 29: 42, 30: 44, 31: 45, 32: 46, 33: 47, 34: 48, 35: 49, 36: 50, 37: 51, 38: 52, 39: 53, 40: 55, 41: 56, 42: 57, 43: 58, 44: 59, 45: 60, 46: 61, 47: 62, 48: 63, 49: 64, 50: 66, 51: 67, 52: 68, 53: 69, 54: 70, 55: 71, 56: 72, 57: 73, 58: 74, 59: 75, 60: 77, 61: 78, 62: 79, 63: 80, 64: 81, 65: 82, 66: 83, 67: 84, 68: 85, 69: 86, 70: 88, 71: 89, 72: 90, 73: 91, 74: 92, 75: 93, 76: 94, 77: 95, 78: 96, 79: 97, 80: 99, 81: 100, 82: 101, 83: 102, 84: 103, 85: 104, 86: 105, 87: 106, 88: 107, 89: 108, 90: 110, 91: 111, 92: 112, 93: 113, 94: 114, 95: 115, 96: 116, 97: 117, 98: 118, 99: 119, 100: 2, 101: 3, 102: 4, 103: 5, 104: 6, 105: 7, 106: 8, 107: 9, 108: 10, 109: 11, 110: 13, 111: 14, 112: 15, 113: 16, 114: 17, 115: 18, 116: 19, 117: 20, 118: 21, 119: 22, 120: 24}
    mydict = {0: 1, 1: 10, 2: 100, 3: 101, 4: 102, 5: 103, 6: 104, 7: 105, 8: 106, 9: 107, 10: 108, 11: 109, 12: 11, 13: 110, 14: 111, 15: 112, 16: 113, 17: 114, 18: 115, 19: 116, 20: 117, 21: 118, 22: 119, 23: 12, 24: 120, 25: 13, 26: 14, 27: 15, 28: 16, 29: 17, 30: 18, 31: 19, 32: 2, 33: 20, 34: 21, 35: 22, 36: 23, 37: 24, 38: 25, 39: 26, 40: 27, 41: 28, 42: 29, 43: 3, 44: 30, 45: 31, 46: 32, 47: 33, 48: 34, 49: 35, 50: 36, 51: 37, 52: 38, 53: 39, 54: 4, 55: 40, 56: 41, 57: 42, 58: 43, 59: 44, 60: 45, 61: 46, 62: 47, 63: 48, 64: 49, 65: 5, 66: 50, 67: 51, 68: 52, 69: 53, 70: 54, 71: 55, 72: 56, 73: 57, 74: 58, 75: 59, 76: 6, 77: 60, 78: 61, 79: 62, 80: 63, 81: 64, 82: 65, 83: 66, 84: 67, 85: 68, 86: 69, 87: 7, 88: 70, 89: 71, 90: 72, 91: 73, 92: 74, 93: 75, 94: 76, 95: 77, 96: 78, 97: 79, 98: 8, 99: 80, 100: 81, 101: 82, 102: 83, 103: 84, 104: 85, 105: 86, 106: 87, 107: 88, 108: 89, 109: 9, 110: 90, 111: 91, 112: 92, 113: 93, 114: 94, 115: 95, 116: 96, 117: 97, 118: 98, 119: 99}
    
    place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
    exe = fluid.Executor(place)
    
    target_label=np.array([src_label]).astype('int64')
    target_label=np.expand_dims(target_label, axis=0)
    target_label2=np.array([src_label2]).astype('int64')
    target_label2=np.expand_dims(target_label2, axis=0)
    
    img = o.copy()
    decay_factor = 0.90
    steps=90
    epsilons = np.linspace(5, 388, num=75)
    flag_traget = 0  #0 means untargeted attack
    flag2=0  #exit flag
    for epsilon in epsilons[:]:
        #print("now momentum is {}".format(momentum))
        if flag_traget==0:
            #momentum = mm
            momentum = 0
            adv=img.copy()
            for i in range(steps):
                
                if i<50:
                    adv_noise = (adv+np.random.normal(loc=0.0, scale=0.5+epsilon/90,size = (3,224,224))).astype('float32')
                else:
                    adv_noise = (adv+np.random.normal(loc=0.0, scale=0.1,size = (3,224,224))).astype('float32')
                g,resul1,resul2,resul3,resul4,resul5,resul6,resul7,resul8,resul9 = exe.run(adv_program,
                             fetch_list=[gradients,out1,out2,out3,out4,out5,out6,out7,out8,out9],
                             feed={'label2':target_label2,'adv_image':adv_noise,'label': target_label })
               
                #print(g[0][0].shape,g[0][1].shape,g[0][2].shape)
                g = (g[0][0]+g[0][1]+g[0][2])/3 #average the gradient over the three channels
                #print(g.shape)
                velocity = g / (np.linalg.norm(g.flatten(),ord=1) + 1e-10)
                momentum = decay_factor * momentum + velocity
                #print(momentum.shape)
                norm_m = momentum / (np.linalg.norm(momentum.flatten(),ord=2) + 1e-10)
                #print(norm_m.shape)
                _max = np.max(abs(norm_m))
                tmp = np.percentile(abs(norm_m), [25, 99.45, 99.5])#limit the changed pixels to 0.5% of the image
                thres = tmp[2]
                mask = abs(norm_m)>thres
                norm_m_m = np.multiply(norm_m,mask)
                if i<50: #for the first 50 steps, reverse ~2% of the gradient, decreasing with i; try 5% as well
                    dir_mask = np.random.rand(3,224,224)
                    #print(dir_mask)
                    dir_mask = dir_mask>(0.15-i/900)  
                    #print(dir_mask)
                    dir_mask[dir_mask==0] = -1
                    #print(dir_mask)
                    norm_m_m = np.multiply(norm_m_m,dir_mask)
                    #print(norm_m_m.shape)
                #the step size also decays with the step index
                if i==0:
                    adv=adv+epsilon*norm_m_m 
                else:
                    adv=adv-epsilon*norm_m_m 
                    #adv=adv-(epsilon-i/30)*norm_m_m 
                #enforce the L_inf constraint
                adv=linf_img_tenosr(img,adv,epsilon)
        else:
            for i in range(2):
                adv_noise = (adv+np.random.normal(loc=0.0, scale=0.1,size = (3,224,224))).astype('float32')
                target_label=np.array([t_label]).astype('int64')
                target_label=np.expand_dims(target_label, axis=0)
                target_label2=np.array([origdict[t_label]]).astype('int64')
                target_label2=np.expand_dims(target_label2, axis=0)
                g,resul1,resul2,resul3,resul4,resul5,resul6,resul7,resul8,resul9 = exe.run(adv_program,
                         fetch_list=[gradients,out1,out2,out3,out4,out5,out6,out7,out8,out9],
                         feed={'label2':target_label2,'adv_image':adv_noise,'label': target_label }
                          )
                g = (g[0][0]+g[0][1]+g[0][2])/3 #average the gradient over the three channels
                velocity = g / (np.linalg.norm(g.flatten(),ord=1) + 1e-10)
                momentum = decay_factor * momentum + velocity
                #print(momentum.shape)
                norm_m = momentum / (np.linalg.norm(momentum.flatten(),ord=2) + 1e-10)
                #print(norm_m.shape)
                _max = np.max(abs(norm_m))
                tmp = np.percentile(abs(norm_m), [25, 99.45, 99.5])#limit the changed pixels to 0.5% of the image
                thres = tmp[2]
                mask = abs(norm_m)>thres
                norm_m_m = np.multiply(norm_m,mask)
                adv=adv+epsilon*norm_m_m
                #enforce the L_inf constraint
                adv=linf_img_tenosr(img,adv,epsilon)
            flag2=1
            
        print("epsilon is {}".format(epsilon))
        print("label is:{}; model1:{}; model2:{}; model3:{}; model4:{}; model5:{}; model6:{}; model7:{}; model8:{} ; model9:{} ".format(label,resul1.argmax(),resul2.argmax(),mydict[resul3.argmax()],mydict[resul4.argmax()],\
        mydict[resul5.argmax()],mydict[resul6.argmax()],mydict[resul7.argmax()],mydict[resul8.argmax()],mydict[resul9.argmax()]))#mydict maps the converted models' output index back to the real label
        

        if((label!=resul1.argmax()) and(label!=resul2.argmax())and(origdict[label]!=resul3.argmax())and(origdict[label]!=resul4.argmax())and(origdict[label]!=resul5.argmax())\
        and(origdict[label]!=resul6.argmax())and(origdict[label]!=resul7.argmax())and(origdict[label]!=resul8.argmax())and(origdict[label]!=resul9.argmax())):
            res_list = [resul1.argmax(),resul2.argmax(),mydict[resul3.argmax()],mydict[resul4.argmax()],mydict[resul5.argmax()],mydict[resul6.argmax()],mydict[resul7.argmax()],mydict[resul8.argmax()],mydict[resul9.argmax()]]
            ser = pd.Series(res_list)
            t_label = ser.mode()[0]#use the mode as target_label
            flag_traget=1
            if(flag2 == 1):
                break
    return adv,momentum
 

3 Reproducing the Solution

Now that the algorithm has been described, let's run the complete solution and generate the adversarial examples.

note: because the ensemble contains many models, the code takes about one hour to run. You can stop it early (click the Run menu at the top right of the notebook and choose Interrupt) and inspect the adversarial examples generated so far.

In[ ]
gen_adv()
 

Next, let's see how the adversarial examples differ from the originals.

In[ ]
#define a helper to visualize the difference between two images
def show_images_diff(original_img,adversarial_img):
    #original_img = np.array(Image.open(original_img))
    #adversarial_img = np.array(Image.open(adversarial_img))
    original_img=cv2.resize(original_img.copy(),(224,224))
    adversarial_img=cv2.resize(adversarial_img.copy(),(224,224))

    plt.figure(figsize=(10,10))

    #original_img=original_img/255.0
    #adversarial_img=adversarial_img/255.0

    plt.subplot(1, 3, 1)
    plt.title('Original Image')
    plt.imshow(original_img)
    plt.axis('off')

    plt.subplot(1, 3, 2)
    plt.title('Adversarial Image')
    plt.imshow(adversarial_img)
    plt.axis('off')

    plt.subplot(1, 3, 3)
    plt.title('Difference')
    difference = 0.0+adversarial_img - original_img
        
    l0 = np.where(difference != 0)[0].shape[0]*100/(224*224*3)
    l2 = np.linalg.norm(difference)/(256*3)
    linf=np.linalg.norm(difference.copy().ravel(),ord=np.inf)
    # print(difference)
    print("l0={}% l2={} linf={}".format(l0, l2,linf))
    
    #(-1,1) -> (0,1)
    #a gray background makes the difference easier to see
    difference=difference/255.0
        
    difference=difference/2.0+0.5
   
    plt.imshow(difference)
    plt.axis('off')

    plt.show()
    

    #plt.savefig('fig_cat.png')
In[ ]
from PIL import Image, ImageOps
import cv2
import matplotlib.pyplot as plt
#########################################
##replace pname with the image you want to inspect
pname = "n02085620_10074.jpg"
#########################################
image_name, image_ext = pname.split('.')
pname_attack = image_name + ".png"
original_img=np.array(Image.open("/home/aistudio/baidu_attack_by_xin/input_image/" + pname))
adversarial_img=np.array(Image.open("/home/aistudio/baidu_attack_by_xin/output_image_attack/" + pname_attack))
show_images_diff(original_img,adversarial_img)
l0=22.507440476190474% l2=5.516279998781479 linf=221.0
 

The adversarial examples generated by this solution transfer well, good enough for 2nd place in the AI security competition. Scoring is done as follows (the original formula image is not reproduced here):

M denotes a defense model and y the true label of sample I. If the defense model still recognizes the sample correctly, the attack fails and the perturbation is set directly to the upper bound of 128. If the attack succeeds, the perturbation between the adversarial example and the original sample is computed as the mean L2 distance. Each adversarial example is evaluated on m defense models, n is the number of samples, and all perturbation values are averaged to give the overall distance score of the attack; the smaller the score, the better.
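A hedged reconstruction of that scoring rule (the exact formula image is omitted; $\bar{d}$ denotes the mean L2 distance between the adversarial image and the original):

$$\text{Score} \;=\; \frac{1}{m\,n}\sum_{j=1}^{m}\sum_{i=1}^{n} d_j(I_i), \qquad
d_j(I_i) \;=\; \begin{cases} 128, & M_j(I_i^{adv}) = y_i \ \text{(attack failed)} \\ \bar{d}\,(I_i^{adv},\, I_i), & \text{otherwise (attack succeeded)} \end{cases}$$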

Measured by this metric, the images generated by my solution score 3.78089.

 

4 The Final Push

But that is not enough. In a competition where everyone has their eyes on the prize money, you have to keep improving. So one final push is needed: a post-processing step.

Truncating small perturbations

With the methods above, my score fluctuated between 95 and 96. To improve further, I post-processed the image set with the highest score (96.53). The post-processing compares each attacked image with its original and truncates perturbations below a certain threshold. By gradually raising the threshold, I found that a threshold of 17 (pixel values range from 0 to 255) works best; this gained roughly 0.3 points.

The code is as follows:

Heads-up: running the code below requires all adversarial examples to have been generated first.

In[ ]
#coding=utf-8

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import functools
import numpy as np
import paddle.fluid as fluid

#load custom modules
import models
from attack.attack_pp import FGSM, PGD
from utils import *

######Init args
image_shape = [3,224,224]
class_dim=121
input_dir = "./input_image/"
attacked_dir = "./output_image_attack/"
output_dir = "./posopt_output_image/"
drop_thres = 10
os.makedirs("./posopt_output_image") 
val_list = 'val_list.txt'
use_gpu=True


####### Main #######
def get_original_file(filepath):
    with open(filepath, 'r') as cfile:
        full_lines = [line.strip() for line in cfile]
    cfile.close()
    original_files = []
    for line in full_lines:
        label, file_name = line.split()
        original_files.append([file_name, int(label)])
    return original_files

def gen_diff():
    original_files = get_original_file(input_dir + val_list)

    for filename, label in original_files:
        image_name, image_ext = filename.split('.')
        img_path = input_dir + filename
        print("Image: {0} ".format(img_path))
        img=process_img(img_path)
        adv_img_path = attacked_dir + image_name+'.png'
        adv=process_img(adv_img_path)
        
        org_img = tensor2img(img)
        adv_img = tensor2img(adv)
        #truncate all perturbations smaller than drop_thres (10/256 here)
        diff = abs(org_img-adv_img)<drop_thres   #1 where the change is < drop_thres
        diff_max = abs(org_img-adv_img)>=drop_thres  #1 where the change is >= drop_thres
        #keep org_img where the change is below the threshold
        tmp1 = np.multiply(org_img,diff)
        #keep adv_img where the change is at or above the threshold
        tmp2 = np.multiply(adv_img,diff_max)
        final_img = tmp1+tmp2
        
        save_adv_image(final_img, output_dir+image_name+'.png')


gen_diff()
 

Closing thoughts after the competition

  1. That is the complete solution that won me 2nd place in the AI security competition. If you want to run it from a terminal, the usage instructions are given below. Thanks for reading.
  2. During the final I sat at the top of the leaderboard for more than half a month, racking my brains over various attack methods, constantly reading papers and testing ideas. I really enjoyed the process; by following examples I quickly picked up Paddle and grew a lot.
  3. Still, I was overtaken in the last half hour. Talking with the winners afterwards, I learned they had attacked more than ten models in sequence, while I had stopped at an ensemble of nine. Maybe sitting at the top had made me a little complacent.
  4. To the reader: this is a great opportunity to get started with adversarial examples; working carefully through the baseline and my solution will introduce you to four or five algorithms in this field.
  5. Thanks to AI Studio for access to a V100; my own desktop has a 1050 that cannot even run a years-old VGG.
 

References

[1] Liu Y, Chen X, Liu C, et al. Delving into Transferable Adversarial Examples and Black-box Attacks. 2016.

[2] Shi Y, Wang S, Han Y. Curls & Whey: Boosting Black-Box Adversarial Attacks. 2019.

[3] Narodytska N, Kasiviswanathan S P. Simple Black-Box Adversarial Perturbations for Deep Networks. 2016.

[4] Huang Q, Katsman I, He H, et al. Enhancing Adversarial Example Transferability with an Intermediate Level Attack. 2019.

[5] Sharif M, Bhagavatula S, Bauer L, Reiter M K. Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition. CCS 2016. https://www.cs.cmu.edu/~sbhagava/papers/face-rec-ccs16.pdf

[6] Understanding and Enhancing the Transferability of Adversarial Examples.

 

Usage Instructions

Dependencies:

  • python3
  • paddle
  • numpy

Steps:

  • In a terminal, run:
  • cd baidu_attack_by_xin/
  • python 9model_ensemble_attack.py
  • python pert_drop.py
  • The results are saved in posopt_output_image

Try the project with one click on AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/296291

