Deploying .NET Machine Learning Models with ML.NET, ASP.NET Core, Docker, and Azure Container Instances

Date: 2019-12-08


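In this article we will build a machine learning classification model with ML.NET, expose it through an ASP.NET Core Web API, package it into a Docker container, and deploy it to the cloud with Azure Container Instances.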

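Prerequisites

This article assumes some familiarity with Docker. Building and deploying the sample application also requires the tools used in the steps below: the .NET Core SDK, Docker, a Docker Hub account, and the Azure CLI. Note that the application was built on an Ubuntu 16.04 PC, but all of the software is cross-platform and should work in any environment.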

Setting Up the Project

The first thing we want to do is create a folder for our solution.

mkdir mlnetacidemo

Then we create a solution inside the newly created folder.

cd mlnetacidemo
dotnet new sln

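Building the Model

Inside our solution folder, we want to create a new console application. This is where we will build and test our machine learning model.

Setting Up the Model Project

First, we create the project. From the solution folder, enter: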

dotnet new console -o model

Now we add this new project to our solution.

dotnet sln mlnetacidemo.sln add model/model.csproj

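Adding Dependencies

Since we will be using the ML.NET framework, we need to add it to our model project. The code in this article uses the LearningPipeline API from the early ML.NET preview releases, so later package versions may not compile it as written.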

cd model
dotnet add package Microsoft.ML
dotnet restore

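Before we start training the model, we need to download the data we will train on. We do so by creating a directory called data and downloading the data file into it.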

mkdir data
curl -o data/iris.txt https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

If we take a look at the data file, it should look something like this:

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
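
Before training, it can help to check that the download is complete and that every class is represented. The following is a minimal sketch, not part of the original walkthrough: count_labels is a hypothetical helper name, and the sketch assumes the label sits in the fifth comma-separated column, as in the sample rows above.

```shell
# count_labels FILE: print how many rows each class label has.
# The label is the fifth comma-separated column; blank lines are skipped
# because the downloaded file may end with one.
count_labels() {
  grep -v '^$' "$1" | cut -d, -f5 | sort | uniq -c
}

# Run from the model directory, where data/iris.txt was downloaded.
if [ -f data/iris.txt ]; then
  count_labels data/iris.txt
fi
```

A missing class or a very uneven count would suggest a truncated or corrupted download.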

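Training the Model

Now that all of our dependencies are set up, it's time to build the model. I used the demo from the ML.NET Getting Started site.

Defining the Data Structures

In the root directory of our model project, let's create two classes called IrisData and IrisPrediction, which define our features and our predicted attribute, respectively. Both use Microsoft.ML.Runtime.Api to add attributes to their fields.

Here is what IrisData looks like: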

using Microsoft.ML.Runtime.Api;

namespace model
{
    public class IrisData
    {
        [Column("0")]
        public float SepalLength;

        [Column("1")]
        public float SepalWidth;

        [Column("2")]
        public float PetalLength;
        
        [Column("3")]
        public float PetalWidth;

        [Column("4")]
        [ColumnName("Label")]
        public string Label;
    }       
}

Similarly, here is IrisPrediction:

using Microsoft.ML.Runtime.Api;

namespace model
{
    public class IrisPrediction
    {
        [ColumnName("PredictedLabel")]
        public string PredictedLabels;
    }
}

Building the LearningPipeline

using Microsoft.ML.Data;
using Microsoft.ML;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;
using Microsoft.ML.Models;
using System;
using System.Threading.Tasks;

namespace model
{
    class Model
    {
        
        public static async Task<PredictionModel<IrisData,IrisPrediction>> Train(LearningPipeline pipeline, string dataPath, string modelPath)
        {
            // Load Data
            pipeline.Add(new TextLoader(dataPath).CreateFrom<IrisData>(separator:',')); 

            // Transform Data
            // Assign numeric values to text in the "Label" column, because 
            // only numbers can be processed during model training   
            pipeline.Add(new Dictionarizer("Label"));

            // Vectorize Features
            pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"));

            // Add Learner
            pipeline.Add(new StochasticDualCoordinateAscentClassifier());

            // Convert Label back to text 
            pipeline.Add(new PredictedLabelColumnOriginalValueConverter() {PredictedLabelColumn = "PredictedLabel"});

            // Train Model
            var model = pipeline.Train<IrisData,IrisPrediction>();

            // Persist Model
            await model.WriteAsync(modelPath);

            return model;
        }
    }
}

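In addition to building the LearningPipeline and training our machine learning model, the Train method serializes the model and saves it to the file given by modelPath (model.zip in this example) for future use.

Testing the Model

Now it's time to test everything to make sure it works.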

using System;
using Microsoft.ML;

namespace model
{
    class Program
    {
        static void Main(string[] args)
        {

            string dataPath = "model/data/iris.txt";

            string modelPath = "model/model.zip";

            var model = Model.Train(new LearningPipeline(),dataPath,modelPath).Result;

            // Test data for prediction
            var prediction = model.Predict(new IrisData() 
            {
                SepalLength = 3.3f,
                SepalWidth = 1.6f,
                PetalLength = 0.2f,
                PetalWidth = 5.1f
            });

            Console.WriteLine($"Predicted flower type is: {prediction.PredictedLabels}");
        }
    }
}

Everything is set to run. We can do so by entering the following command from the solution directory:

dotnet run -p model/model.csproj

After running the application, the following output should appear on the console.

Automatically adding a MinMax normalization transform, use 'norm=Warn' or 'norm=No' to turn this behavior off.
Using 2 threads to train.
Automatically choosing a check frequency of 2.
Auto-tuning parameters: maxIterations = 9998.
Auto-tuning parameters: L2 = 2.667734E-05.
Auto-tuning parameters: L1Threshold (L1/L2) = 0.
Using best model from iteration 882.
Not training a calibrator because it is not needed.
Predicted flower type is: Iris-virginica

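Additionally, you'll notice that a file called model.zip was created in the root directory of our model project. This persisted model can now be used outside of our application to make predictions, which is what we will do next through an API.

Exposing the Model

Once a machine learning model is built, you want to deploy it so it can start making predictions. One way to do that is through a REST API. At its core, all the API needs to do is accept data input from a client and respond with a prediction. To help us do that, we will use an ASP.NET Core API.

Setting Up the API Project

The first thing we want to do is create the project.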

dotnet new webapi -o api

Then we add this new project to our solution.

dotnet sln mlnetacidemo.sln add api/api.csproj

Adding Dependencies

Because we will load our model and make predictions through our API, we need to add the ML.NET package to our api project.

cd api
dotnet add package Microsoft.ML
dotnet restore

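Referencing the Model

In the previous step, when we built our machine learning model, it was saved to a file called model.zip. This is the file our API will reference to make predictions. To reference it in our API, simply copy it from the model project directory into our api project directory.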
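
The copy step can be sketched as follows, assuming the commands are run from the mlnetacidemo solution root and that the console application has already written model/model.zip. The SRC and DEST variable names are only illustrative.

```shell
# Run from the mlnetacidemo solution root.
SRC=model/model.zip
DEST=api/model.zip

# Copy the serialized model produced by the console app into the API project.
if [ -f "$SRC" ]; then
  cp "$SRC" "$DEST"
  ls -l "$DEST"
else
  echo "$SRC not found; run the model project first" >&2
fi
```

The controller shown later loads "model.zip" by relative path, so the file needs to sit in the directory the API process starts from.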

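Creating the Data Models

Our model was built using the data structures IrisData and IrisPrediction to define the features and the predicted attribute. When the model makes predictions through our API, it needs to reference these data types as well, so we need to define the IrisData and IrisPrediction classes inside the api project. The contents of the classes are nearly identical to those in the model project; the only difference is that the namespace changes from model to api.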

using Microsoft.ML.Runtime.Api;

namespace api
{
    public class IrisData
    {
        [Column("0")]
        public float SepalLength;

        [Column("1")]
        public float SepalWidth;

        [Column("2")]
        public float PetalLength;
        
        [Column("3")]
        public float PetalWidth;

        [Column("4")]
        [ColumnName("Label")]
        public string Label;
    }    
}

using Microsoft.ML.Runtime.Api;

namespace api
{
    public class IrisPrediction
    {
        [ColumnName("PredictedLabel")]
        public string PredictedLabels;
    }
}

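Building the Controller

Now that our project is set up, it's time to add a controller to handle prediction requests from the client. In the Controllers directory of our api project, we can create a new class called PredictController with a single POST endpoint. The contents of the file should look like this: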

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.ML;

namespace api.Controllers
{
    [Route("api/[controller]")]
    public class PredictController : Controller
    {
        // POST api/predict
        [HttpPost]
        public string Post([FromBody] IrisData instance)
        {
            var model = PredictionModel.ReadAsync<IrisData,IrisPrediction>("model.zip").Result;
            var prediction = model.Predict(instance);
            return prediction.PredictedLabels;
        }
    }
}

Testing the API

Once our predict controller is coded, it's time to test it. From the root directory of our mlnetacidemo solution, enter the following command.

dotnet run -p api/api.csproj

In a client such as Postman or Insomnia, send an HTTP POST request to the endpoint http://localhost:5000/api/predict. The body of the request should look similar to the snippet below:

{
    "SepalLength": 3.3,
    "SepalWidth": 1.6,
    "PetalLength": 0.2,
    "PetalWidth": 5.1
}
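
As a command-line alternative to a graphical client, the same request can be sent with curl. This is a minimal sketch that assumes the API started in the previous step is listening on localhost:5000; the URL and BODY variable names are only illustrative.

```shell
# Assumes the API from the previous step is listening on localhost:5000.
URL="http://localhost:5000/api/predict"
BODY='{"SepalLength": 3.3, "SepalWidth": 1.6, "PetalLength": 0.2, "PetalWidth": 5.1}'

# -s silences the progress meter; the Content-Type header lets ASP.NET Core
# bind the JSON body to the IrisData parameter marked with [FromBody].
curl -s -X POST -H "Content-Type: application/json" -d "$BODY" "$URL" \
  || echo "request failed; is the API running?" >&2
```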

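If successful, the output returned should be Iris-virginica, just as in our console application. Great!

Packaging the Application

Now that our application runs successfully locally, it's time to package it into a Docker container and push it to Docker Hub.

Creating the Dockerfile

In our mlnetacidemo solution directory, create a Dockerfile with the following content: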

FROM microsoft/dotnet:2.0-sdk AS build
WORKDIR /app

# copy csproj and restore as distinct layers
COPY *.sln .
COPY api/*.csproj ./api/
RUN dotnet restore

# copy everything else and build app
COPY api/. ./api/
WORKDIR /app/api
RUN dotnet publish -c release -o out


FROM microsoft/aspnetcore:2.0 AS runtime
WORKDIR /app
COPY api/model.zip .
COPY --from=build /app/api/out ./
ENTRYPOINT ["dotnet", "api.dll"]

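Building the Image

To build the image, enter the following command at the command prompt. This will take a while because it needs to download the .NET Core SDK and ASP.NET Core runtime Docker images.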

docker build -t <DOCKERUSERNAME>/<IMAGENAME>:latest .

Testing the Image Locally

We need to test our image locally to make sure it can run in the cloud. To do so, we can use the docker run command.

docker run -d -p 5000:80 <DOCKERUSERNAME>/<IMAGENAME>:latest

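Although the API is exposed on port 80 inside the container, we bind it to local port 5000 to keep our earlier API request unchanged. When a POST request with the appropriate body is sent to http://localhost:5000/api/predict, the response should again be Iris-virginica. Because the container was started detached with -d, Ctrl + C will not stop it; use docker stop with the container ID reported by docker ps instead.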
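
Finding and stopping the detached container can be sketched as follows. The image name is the same placeholder used in the build command and must be replaced with your own tag; the IMAGE and CONTAINER_ID variable names are only illustrative.

```shell
# Placeholder image name; replace with the tag used in docker build.
IMAGE="<DOCKERUSERNAME>/<IMAGENAME>:latest"

# Find running containers started from that image (-q prints only IDs).
# The variable is left empty if the lookup fails.
CONTAINER_ID=$(docker ps -q --filter "ancestor=$IMAGE") || CONTAINER_ID=""

if [ -n "$CONTAINER_ID" ]; then
  # Show the API's startup logs, then stop the detached container.
  docker logs "$CONTAINER_ID"
  docker stop "$CONTAINER_ID"
else
  echo "no running container found for $IMAGE" >&2
fi
```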

Pushing to Docker Hub

Now that the Docker image runs successfully locally, it's time to push it to Docker Hub. Again, we use the Docker CLI to do this.

docker login
docker push <DOCKERUSERNAME>/<IMAGENAME>:latest

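Deploying to the Cloud

Now comes the final step: deploying our machine learning model and API and exposing them to the world. Our deployment will go through Azure Container Instances because it requires almost no server provisioning or management.

Preparing the Deployment Manifest

Although the deployment can be performed entirely from the command line, it is usually better to put all of the configuration in a file, both for documentation and to avoid retyping the parameters every time. With Azure, we can do that with a JSON file.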

{
  "$schema":
    "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "containerGroupName": {
      "type": "string",
      "defaultValue": "mlnetacicontainergroup",
      "metadata": {
        "description": "Container Group name."
      }
    }
  },
  "variables": {
    "containername": "mlnetacidemo",
    "containerimage": "<DOCKERUSERNAME>/<IMAGENAME>:latest"
  },
  "resources": [
    {
      "name": "[parameters('containerGroupName')]",
      "type": "Microsoft.ContainerInstance/containerGroups",
      "apiVersion": "2018-04-01",
      "location": "[resourceGroup().location]",
      "properties": {
        "containers": [
          {
            "name": "[variables('containername')]",
            "properties": {
              "image": "[variables('containerimage')]",
              "resources": {
                "requests": {
                  "cpu": 1,
                  "memoryInGb": 1.5
                }
              },
              "ports": [
                {
                  "port": 80
                }
              ]
            }
          }
        ],
        "osType": "Linux",
        "ipAddress": {
          "type": "Public",
          "ports": [
            {
              "protocol": "tcp",
              "port": "80"
            }
          ]
        }
      }
    }
  ],
  "outputs": {
    "containerIPv4Address": {
      "type": "string",
      "value":
        "[reference(resourceId('Microsoft.ContainerInstance/containerGroups/', parameters('containerGroupName'))).ipAddress.ip]"
    }
  }
}

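We can save this template to a file called azuredeploy.json in the root directory of our mlnetacidemo solution. The only thing that needs to change is the containerimage value: replace it with your Docker Hub username and the name of the image you just pushed to Docker Hub.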
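
The placeholder can also be replaced from the command line instead of editing the file by hand. The sketch below is one possible approach: set_image is a hypothetical helper, and exampleuser/mlnetacidemo is a made-up image name standing in for your own.

```shell
# set_image TEMPLATE IMAGE: hypothetical helper that swaps the placeholder
# image reference in the template for a real one. A backup of the original
# file is kept next to it with a .bak suffix.
set_image() {
  sed -i.bak "s|<DOCKERUSERNAME>/<IMAGENAME>:latest|$2|" "$1"
}

# Run from the solution root; the image name below is only an example.
if [ -f azuredeploy.json ]; then
  set_image azuredeploy.json "exampleuser/mlnetacidemo:latest"
  grep '"containerimage"' azuredeploy.json
fi
```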

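Deploying

To deploy our application, we need to make sure we are logged in to our Azure account. To do so through the Azure CLI, type the following at the command prompt: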

az login

Follow the prompts to log in. Once logged in, it's time to create a resource group for the container.

az group create --name mlnetacidemogroup --location eastus

After the group has been created successfully, we can deploy our application.

az group deployment create --resource-group mlnetacidemogroup --template-file azuredeploy.json
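
After the deployment completes, the public address can also be read back from Azure and used to test the API in one step. This is a sketch under the assumption that the resource group and container group keep the names used in this article (mlnetacidemogroup and the template default mlnetacicontainergroup); the variable names are only illustrative.

```shell
# Names taken from the template and the commands in this article.
RESOURCE_GROUP="mlnetacidemogroup"
CONTAINER_GROUP="mlnetacicontainergroup"

# Read the public IP of the container group (--query takes a JMESPath
# expression). The variable is left empty if the lookup fails.
IP=$(az container show --resource-group "$RESOURCE_GROUP" \
      --name "$CONTAINER_GROUP" --query ipAddress.ip --output tsv) || IP=""

if [ -n "$IP" ]; then
  # Send the same request body used for the local tests.
  curl -s -X POST -H "Content-Type: application/json" \
    -d '{"SepalLength": 3.3, "SepalWidth": 1.6, "PetalLength": 0.2, "PetalWidth": 5.1}' \
    "http://$IP/api/predict"
else
  echo "no IP address returned; check the deployment" >&2
fi
```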

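Give the deployment a few minutes to initialize. If it succeeds, you should see some output on the command line. Look for the containerIPv4Address output value, which is the IP address at which the container can be reached. Change the request URL to http://<ContainerIPv4Address>/api/predict, where ContainerIPv4Address stands for that value, and send the POST request again. If successful, the response should be Iris-virginica, just like the earlier requests.

Once finished, you can clean up the resources with the following command:

az group delete --name mlnetacidemogroup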

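Conclusion

In this article, we built a classification machine learning model with ML.NET that predicts the class of an iris flower from four measurement features, exposed it through an ASP.NET Core REST API, packaged it into a container, and deployed it to the cloud with Azure Container Instances. Although these steps grow more involved as the model changes and becomes more complex, the process is standardized enough that extending this example should require only minimal changes.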