Azure Media Services' new Indexer V2 can automatically recognize the speech in a video file and turn it into a WebVTT subtitle file. It integrates conveniently with Azure Media Player, so a video that originally had no subtitles can automatically be given subtitles in the corresponding language. The speech recognition engine supports a number of languages, including:
As the list above shows, Indexer V2 supports Chinese recognition. If your company already has a large number of talk or course videos that have no subtitles, the Indexer V2 feature of Media Services can be a great help.
Let's try calling this Media Services feature from Java code.
To reference the Media Services SDK, we need to add a few dependencies to pom.xml:
```xml
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure</artifactId>
    <version>1.0.0-beta2</version>
</dependency>
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-media</artifactId>
    <version>0.9.4</version>
</dependency>
```
First we prepare the basic information for accessing Media Services, such as the account name and the access key:
```java
// Media Services account credentials configuration
private static String mediaServiceUri = "https://media.windows.net/API/";
private static String oAuthUri = "https://wamsprodglobal001acs.accesscontrol.windows.net/v2/OAuth2-13";
private static String clientId = "wingsample";
private static String clientSecret = "p8BDkk+kLYZzpnvP0B5KFy98uLTv7ALGuSX7F9LmHtk=";
private static String scope = "urn:WindowsAzureMediaServices";
```
Then we create a context for accessing Media Services:
```java
public static MediaContract getMediaService() {
    Configuration configuration = MediaConfiguration.configureWithOAuthAuthentication(
            mediaServiceUri, oAuthUri, clientId, clientSecret, scope);
    MediaContract mediaService = MediaService.create(configuration);
    return mediaService;
}
```
To run the indexing with a media processor, we need a method that looks up the processor by name:
```java
public static MediaProcessorInfo getLatestProcessorByName(MediaContract mediaService, String processorName) {
    try {
        ListResult<MediaProcessorInfo> mediaProcessors = mediaService
                .list(MediaProcessor.list().set("$filter", String.format("Name eq '%s'", processorName)));
        // Use the latest version of the media processor.
        // Note: the original code returned inside the loop, which would hand
        // back the first match instead of the highest version.
        MediaProcessorInfo mediaProcessor = null;
        for (MediaProcessorInfo info : mediaProcessors) {
            if (mediaProcessor == null || info.getVersion().compareTo(mediaProcessor.getVersion()) > 0) {
                mediaProcessor = info;
            }
        }
        return mediaProcessor;
    } catch (ServiceException e) {
        e.printStackTrace();
    }
    return null;
}
```
We also need a method that fetches a specific asset by its id:
```java
public static AssetInfo getAssetById(MediaContract mediaService, String assetId) throws ServiceException {
    AssetInfo resultAsset = mediaService.get(Asset.get(assetId));
    return resultAsset;
}
```
With the AssetInfo, the processor, and the Media Services context, we can run Indexer V2:
```java
public static String index2(MediaContract mediaService, AssetInfo assetInfo) {
    try {
        logger.info("start index2: " + assetInfo.getName());
        String config = "{"
                + "\"version\":\"1.0\","
                + "\"Features\":"
                + "["
                + "{"
                + "\"Options\": {"
                + "\"Formats\":[\"WebVtt\",\"ttml\"],"
                + "\"Language\":\"enUs\","
                + "\"Type\":\"RecoOptions\""
                + "},"
                + "\"Type\":\"SpReco\""
                + "}]"
                + "}";
        String taskXml = "<taskBody><inputAsset>JobInputAsset(0)</inputAsset>"
                + "<outputAsset assetCreationOptions=\"0\"" // AssetCreationOptions.None
                + " assetName=\"" + assetInfo.getName() + "index 2"
                + "\">JobOutputAsset(0)</outputAsset></taskBody>";
        System.out.println("config: " + config);

        MediaProcessorInfo indexerMP = getLatestProcessorByName(mediaService, "Azure Media Indexer 2 Preview");

        // Create a task with the Indexer media processor
        Task.CreateBatchOperation task = Task.create(indexerMP.getId(), taskXml)
                .setConfiguration(config)
                .setName(assetInfo.getName() + "_Indexing");

        Job.Creator jobCreator = Job.create()
                .setName(assetInfo.getName() + "_Indexing")
                .addInputMediaAsset(assetInfo.getId())
                .setPriority(2)
                .addTaskCreator(task);

        final JobInfo jobInfo;
        final String jobId;
        synchronized (mediaService) {
            jobInfo = mediaService.create(jobCreator);
            jobId = jobInfo.getId();
        }
        return jobId;
    } catch (Exception e) {
        logger.error("Exception occurred while running indexing job: " + e.getMessage());
    }
    return "";
}
```
With these methods in place, we can call them directly from the main function:
```java
public static void main(String[] args) {
    try {
        MediaContract mediaService = getMediaService();
        AssetInfo asset = getAssetById(mediaService, "nb:cid:UUID:13144339-d09b-4e6f-a86b-3113a64dbabe");
        String result = index2(mediaService, asset);
        System.out.println("Job:" + result);
    } catch (ServiceException e) {
        e.printStackTrace();
    }
}
```
Two things in the index2 function are particularly important. The first is the JSON preset structure: if we want to change the recognition language or the output formats, we only need to modify this JSON.
```json
{
    "version":"1.0",
    "Features":
    [
        {
            "Options": {
                "Formats":["WebVtt","ttml"],
                "Language":"enUs",
                "Type":"RecoOptions"
            },
            "Type":"SpReco"
        }
    ]
}
```
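Since the language and output formats are usually the only fields that change, the preset string can be built with a small helper instead of being hard-coded. The sketch below is an assumption, not part of the original article: the `IndexerPresetBuilder` class and the `"ZhCn"` language code are hypothetical (the code follows the `"enUs"`-style naming in the preset above; verify the exact codes against the Media Services documentation).

```java
// Hypothetical helper that renders the Indexer 2 preset JSON for a
// given language and set of output formats. Pure string building,
// no Azure SDK dependency.
public class IndexerPresetBuilder {

    public static String buildIndexerPreset(String language, String... formats) {
        // Render the Formats array, e.g. ["WebVtt","ttml"]
        StringBuilder fmt = new StringBuilder();
        for (int i = 0; i < formats.length; i++) {
            if (i > 0) fmt.append(",");
            fmt.append("\"").append(formats[i]).append("\"");
        }
        return "{"
                + "\"version\":\"1.0\","
                + "\"Features\":[{"
                + "\"Options\":{"
                + "\"Formats\":[" + fmt + "],"
                + "\"Language\":\"" + language + "\","
                + "\"Type\":\"RecoOptions\""
                + "},"
                + "\"Type\":\"SpReco\""
                + "}]"
                + "}";
    }

    public static void main(String[] args) {
        // "ZhCn" is an assumed language code for Chinese recognition
        System.out.println(buildIndexerPreset("ZhCn", "WebVtt"));
    }
}
```

The returned string can be passed straight to `setConfiguration(...)` in place of the hard-coded `config` variable.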
The second is the task description XML, which specifies which asset the task reads from and which asset the result is written to. Essentially every Media Services encoding or analysis job is driven by such a task XML combined with a corresponding preset file.
```xml
<?xml version="1.0" encoding="utf-16"?>
<taskBody>
    <inputAsset>JobInputAsset(0)</inputAsset>
    <outputAsset assetCreationOptions="0" assetName="ep48_mid.mp4index 2">JobOutputAsset(0)</outputAsset>
</taskBody>
```
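Because the output asset name is spliced into an XML attribute, names containing characters such as `&` or `"` would produce invalid XML. A small helper that escapes the name before building the `taskBody` is a reasonable precaution; the `TaskXmlBuilder` class below is a hypothetical sketch, not part of the original article (`assetCreationOptions="0"` means `AssetCreationOptions.None`, as in the example above).

```java
// Hypothetical helper that renders the taskBody XML for a single
// input/output asset pair, escaping the asset name for safe use
// inside an XML attribute value.
public class TaskXmlBuilder {

    public static String buildTaskBody(String outputAssetName) {
        // Escape characters that are illegal inside an XML attribute
        String escaped = outputAssetName
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
        return "<taskBody><inputAsset>JobInputAsset(0)</inputAsset>"
                + "<outputAsset assetCreationOptions=\"0\" assetName=\""
                + escaped + "\">JobOutputAsset(0)</outputAsset></taskBody>";
    }

    public static void main(String[] args) {
        System.out.println(buildTaskBody("ep48_mid.mp4 index 2"));
    }
}
```

The result can replace the hand-concatenated `taskXml` string in the `index2` method above.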
The analysis services in Media Services are very powerful; they also include motion detection, face detection, emotion recognition, and more:
https://azure.microsoft.com/zh-cn/documentation/articles/media-services-analytics-overview/