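Background:
The company runs its own IDC data center and has built a big data cluster on top of it, so the cluster's resources need to be monitored. The cluster is a CDH cluster, and metric collection is split into two parts: metrics related to HDFS and YARN, and metrics of the IDC machines themselves.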
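Note: Some readers may wonder why we need to present this ourselves when the CM UI already provides monitoring charts. The reason is that this information has to be integrated into our internal data platform, turned into the corresponding reports, and shown as visualizations on that platform.
The implementation can broadly take one of two approaches: use the Java API provided by CM, or use the REST API provided by CM.
The two are essentially the same: the Java API provided by CM is implemented on top of the REST API, and the two are kept consistent.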
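As a rough illustration of the REST route, the sketch below builds the URL for the time-series endpoint and the HTTP Basic `Authorization` header using only the JDK; it does not send any request. The class name `CmRestRequest`, the host, the credentials, and the sample query are placeholders, and the `v18` in the path simply mirrors the `RootResourceV18` used in this post, so adjust it to whatever your CM version supports.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of the CM client library.
final class CmRestRequest {

    /** URL of GET /api/v18/timeseries with the tsquery and time range as query parameters. */
    static String timeSeriesUrl(String host, int port, String tsquery, String from, String to) {
        return "http://" + host + ":" + port + "/api/v18/timeseries"
                + "?query=" + encode(tsquery)
                + "&from=" + encode(from)
                + "&to=" + encode(to);
    }

    /** Value of the Authorization header for HTTP Basic authentication. */
    static String basicAuth(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    // Form-style encoding: spaces become '+', reserved characters become %XX.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

The Java client issues equivalent requests under the hood, which is why the two approaches return the same data.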
The core code is as follows:
    import java.text.ParseException;
    import java.util.ArrayList;
    import java.util.List;

    // Logger import assumes SLF4J, which matches the LoggerFactory.getLogger(Class) call below
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    import com.cloudera.api.ClouderaManagerClientBuilder;
    import com.cloudera.api.DataView;
    import com.cloudera.api.model.ApiHost;
    import com.cloudera.api.model.ApiTimeSeries;
    import com.cloudera.api.model.ApiTimeSeriesData;
    import com.cloudera.api.model.ApiTimeSeriesResponse;
    import com.cloudera.api.model.ApiTimeSeriesResponseList;
    import com.cloudera.api.v10.HostsResourceV10;
    import com.cloudera.api.v11.TimeSeriesResourceV11;
    import com.cloudera.api.v18.RootResourceV18;

    public class IdcHostResource {

        private static final Logger LOGGER = LoggerFactory.getLogger(IdcHostResource.class);

        static RootResourceV18 apiRoot;

        // TODO: connection settings are hard-coded; needs improvement
        static {
            apiRoot = new ClouderaManagerClientBuilder()
                    .withHost("cm ip")
                    .withPort(7180)
                    .withUsernamePassword("user", "passwd")
                    .build()
                    .getRootV18();
        }

        /**
         * Fetches a fixed set of basic resource information for every host.
         */
        public static List<IdcHostBasicInfo> getAllHostResource() {
            List<IdcHostBasicInfo> hosts = new ArrayList<IdcHostBasicInfo>();
            HostsResourceV10 hostsResourceV10 = apiRoot.getHostsResource();
            List<ApiHost> hostLists = hostsResourceV10.readHosts(DataView.SUMMARY).getHosts();
            LOGGER.info("Total " + hostLists.size() + " Host");
            for (ApiHost hostList : hostLists) {
                // Re-read each host by ID to get its full details
                IdcHostBasicInfo host = formatHost(hostsResourceV10.readHost(hostList.getHostId()));
                LOGGER.info("Host Name:" + host.getHostName());
                LOGGER.info("Host Health Summary:" + host.gethostHealthSummary());
                LOGGER.info("Host Physical Memory:" + host.getTotalPhysMemBytes());
                hosts.add(host);
            }
            return hosts;
        }

        /** Copies the fields we care about from ApiHost into our own model. */
        public static IdcHostBasicInfo formatHost(ApiHost apiHost) {
            IdcHostBasicInfo idcHostBasicInfo = new IdcHostBasicInfo();
            idcHostBasicInfo.sethostHealthSummary(apiHost.getHealthSummary().toString());
            idcHostBasicInfo.setHostName(apiHost.getHostname());
            idcHostBasicInfo.setTotalPhysMemBytes(apiHost.getTotalPhysMemBytes());
            return idcHostBasicInfo;
        }

        /**
         * Dynamically fetches the matching metrics info via a tsquery.
         *
         * @param query     the tsquery statement
         * @param startTime start of the time range
         * @param endTime   end of the time range
         * @return the metrics returned for the query
         */
        public static List<IdcMetricInfo> getHostMetrics(String query, String startTime, String endTime)
                throws ParseException {
            TimeSeriesResourceV11 timeSeriesResourceV11 = apiRoot.getTimeSeriesResource();
            ApiTimeSeriesResponseList responseList =
                    timeSeriesResourceV11.queryTimeSeries(query, startTime, endTime);
            List<ApiTimeSeriesResponse> apiTimeSeriesResponseList = responseList.getResponses();
            List<IdcMetricInfo> metrics = formatApiTimeSeriesResponseList(apiTimeSeriesResponseList);
            return metrics;
        }

        public static List<IdcMetricInfo> formatApiTimeSeriesResponseList(
                List<ApiTimeSeriesResponse> apiTimeSeriesResponseList) throws ParseException {
            List<IdcMetricInfo> metrics = new ArrayList<IdcMetricInfo>();
            DateUtils dateUtils = new DateUtils();
            for (ApiTimeSeriesResponse apiTimeSeriesResponse : apiTimeSeriesResponseList) {
                List<MetricData> dataList = new ArrayList<MetricData>();
                List<ApiTimeSeries> apiTimeSeriesResponseLists = apiTimeSeriesResponse.getTimeSeries();
                for (ApiTimeSeries apiTimeSeries : apiTimeSeriesResponseLists) {
                    LOGGER.info("query sql is: " + apiTimeSeries.getMetadata().getExpression());
                    IdcMetricInfo metric = new IdcMetricInfo();
                    metric.setMetricName(apiTimeSeries.getMetadata().getMetricName());
                    metric.setEntityName(apiTimeSeries.getMetadata().getEntityName());
                    metric.setStartTime(apiTimeSeries.getMetadata().getStartTime().toString());
                    metric.setEndTime(apiTimeSeries.getMetadata().getEndTime().toString());
                    for (ApiTimeSeriesData apiTimeSeriesData : apiTimeSeries.getData()) {
                        MetricData data = new MetricData();
                        // Store the EntityName in each data point to avoid producing duplicate data
                        data.setHostname(apiTimeSeries.getMetadata().getEntityName());
                        // CM returns timestamps as EEE MMM dd HH:mm:ss 'CST' yyyy by default;
                        // convert them to yyyy-MM-dd HH:mm:ss
                        data.setTimestamp(dateUtils.parse(apiTimeSeriesData.getTimestamp().toString()));
                        data.setType(apiTimeSeriesData.getType());
                        data.setValue(apiTimeSeriesData.getValue());
                        dataList.add(data);
                    }
                    metric.setData(dataList);
                    metrics.add(metric);
                }
            }
            return metrics;
        }
    }
Note:
The DateUtils class used in the code has to be implemented yourself.
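For reference, one possible shape for such a DateUtils is sketched below. It is not the original implementation: the `String` return type is an assumption (the post does not show what `setTimestamp` accepts), and the input pattern is taken from the comment in the code above, where `'CST'` is matched as literal text. That literal only fits a JVM whose default time zone prints as CST, so the pattern needs adjusting elsewhere.

```java
import java.text.ParseException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// One possible DateUtils; the String return type is an assumption.
final class DateUtils {

    // Input pattern from the post; 'CST' is treated as literal text, not as a zone.
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss 'CST' yyyy", Locale.US);
    private static final DateTimeFormatter OUTPUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Converts e.g. "Tue Mar 05 14:30:00 CST 2019" to "2019-03-05 14:30:00". */
    String parse(String cmTimestamp) throws ParseException {
        try {
            return LocalDateTime.parse(cmTimestamp, INPUT).format(OUTPUT);
        } catch (DateTimeParseException e) {
            // Rethrow as the checked exception the calling code declares
            ParseException pe = new ParseException(e.getMessage(), e.getErrorIndex());
            pe.initCause(e);
            throw pe;
        }
    }
}
```

Because the zone label is matched literally, the wall-clock fields pass through unchanged and no time zone conversion takes place.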
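With this code, the metrics of the IDC cluster can be retrieved by passing in a tsquery; from here on, we only need to implement the retrieval of each monitoring metric in a ServiceImpl.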
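A ServiceImpl method mostly has to assemble three strings: the tsquery and the two ends of the time range. The sketch below shows one way to do that; the class `HostMetricQuery`, the metric name, and the host name are illustrative assumptions, and the `category = HOST and hostname = "..."` filter should be checked against the tsquery reference for your CM version.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for a ServiceImpl; not part of the CM client library.
final class HostMetricQuery {

    final String tsquery;
    final String from; // ISO-8601 start of the range
    final String to;   // ISO-8601 end of the range

    private HostMetricQuery(String tsquery, String from, String to) {
        this.tsquery = tsquery;
        this.from = from;
        this.to = to;
    }

    /** Builds a query for one metric of one host over the window ending at {@code end}. */
    static HostMetricQuery forHost(String metric, String hostname, Instant end, Duration window) {
        String tsquery = "select " + metric
                + " where category = HOST and hostname = \"" + hostname + "\"";
        return new HostMetricQuery(tsquery, end.minus(window).toString(), end.toString());
    }
}
```

The resulting fields can then be handed to the method shown earlier, for example `IdcHostResource.getHostMetrics(q.tsquery, q.from, q.to)`.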
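If you want to integrate the CM API with Spring Boot, you will run into two more problems:
Dependency conflicts, mainly between Jackson and CXF; these can be resolved by excluding the conflicting JARs.
A regex parsing error. This is a pitfall when using CM and is still under investigation; it shows up as follows:
There is a space in the pattern, so a regex parsing error is reported right away during the build. However, the problem no longer exists in the CM 6.x version of the API:
So the problem can be solved by simply upgrading the API version, but then the API version no longer matches the CM version running in production (5.13.2), so the right fix still needs some thought. Testing shows, however, that using the CM 6.x version of the API does not affect retrieval of the metrics from the version currently in production.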