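When we use a cloud platform built on OpenStack, the overview page usually reserves a prominent spot for the cluster's current resource usage: the total, used, and remaining amounts of CPU, memory, disk, and so on. Whenever we scale the cluster out, the resource totals on that page grow automatically. We all know that the Nova service in OpenStack manages these compute resources, but have you ever wondered how Nova obtains them in the first place?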
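How Nova gathers resource statistics

Resource accounting is an internal mechanism of the Nova service. Because later operations (such as creating a virtual machine or a disk) depend on its results, we can infer that this mechanism must run ahead of other services.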
From this simple analysis, plus some necessary debugging, we find that the mechanism is triggered in the nova.service.WSGIService.start method:
    def start(self):
        """Start serving this service using loaded configuration.

        Also, retrieve updated port number in case '0' was passed in, which
        indicates a random port should be used.

        :returns: None

        """
        if self.manager:
            self.manager.init_host()
            self.manager.pre_start_hook()
            if self.backdoor_port is not None:
                self.manager.backdoor_port = self.backdoor_port
        self.server.start()
        if self.manager:
            self.manager.post_start_hook()
Here, self.manager.pre_start_hook() is the call that fetches the resource information. It resolves directly to nova.compute.manager.pre_start_hook, shown below:
    def pre_start_hook(self):
        """After the service is initialized, but before we fully bring
        the service up by listening on RPC queues, make sure to update
        our available resources (and indirectly our available nodes).
        """
        self.update_available_resource(nova.context.get_admin_context())

    ...

    @periodic_task.periodic_task
    def update_available_resource(self, context):
        """See driver.get_available_resource()

        Periodic process that keeps that the compute host's understanding of
        resource availability and usage in sync with the underlying hypervisor.

        :param context: security context
        """
        new_resource_tracker_dict = {}
        nodenames = set(self.driver.get_available_nodes())
        for nodename in nodenames:
            rt = self._get_resource_tracker(nodename)
            rt.update_available_resource(context)
            new_resource_tracker_dict[nodename] = rt

        # Delete orphan compute node not reported by driver but still in db
        compute_nodes_in_db = self._get_compute_nodes_in_db(context,
                                                            use_slave=True)

        for cn in compute_nodes_in_db:
            if cn.hypervisor_hostname not in nodenames:
                LOG.audit(_("Deleting orphan compute node %s") % cn.id)
                cn.destroy()

        self._resource_tracker_dict = new_resource_tracker_dict
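The orphan cleanup at the end of the manager's update_available_resource boils down to a set-membership check: any node still recorded in the database but no longer reported by the driver is deleted. A minimal sketch of that idea, assuming plain lists of host names instead of Nova's compute node objects; split_nodes is a hypothetical helper and not part of Nova:

```python
def split_nodes(reported_nodenames, db_hostnames):
    """Return (nodes to refresh, orphan DB records to delete).

    reported_nodenames: names the driver currently reports (assumed input).
    db_hostnames: names still stored in the database (assumed input).
    """
    reported = set(reported_nodenames)
    # A DB record is an orphan when the driver no longer reports its node.
    orphans = [name for name in db_hostnames if name not in reported]
    return sorted(reported), orphans


to_refresh, orphans = split_nodes(["node-a", "node-b"], ["node-a", "node-c"])
print(to_refresh)
print(orphans)
```

In the real manager, each node in the first group gets its resource tracker refreshed, and each orphan record is destroyed.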
The rt.update_available_resource() call in the manager code above resolves to nova.compute.resource_tracker.update_available_resource(), shown below:
    def update_available_resource(self, context):
        """Override in-memory calculations of compute node resource usage based
        on data audited from the hypervisor layer.

        Add in resource claims in progress to account for operations that have
        declared a need for resources, but not necessarily retrieved them from
        the hypervisor layer yet.
        """
        LOG.audit(_("Auditing locally available compute resources"))
        resources = self.driver.get_available_resource(self.nodename)

        if not resources:
            # The virt driver does not support this function
            LOG.audit(_("Virt driver does not support "
                        "'get_available_resource' Compute tracking is disabled."))
            self.compute_node = None
            return
        resources['host_ip'] = CONF.my_ip

        # TODO(berrange): remove this once all virt drivers are updated
        # to report topology
        if "numa_topology" not in resources:
            resources["numa_topology"] = None

        self._verify_resources(resources)

        self._report_hypervisor_resource_view(resources)

        return self._update_available_resource(context, resources)
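In the resource tracker code above, self._update_available_resource synchronizes the database records with the actual resource usage on the compute node; we will not expand on it here. self.driver.get_available_resource() is what retrieves the node's hardware resource information, and the actual call lands in: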
    class LibvirtDriver(driver.ComputeDriver):

        def get_available_resource(self, nodename):
            """Retrieve resource information.

            This method is called when nova-compute launches, and
            as part of a periodic task that records the results in the DB.

            :param nodename: will be put in PCI device
            :returns: dictionary containing resource info
            """
            # Temporary: convert supported_instances into a string, while keeping
            # the RPC version as JSON. Can be changed when RPC broadcast is removed
            stats = self.get_host_stats(refresh=True)
            stats['supported_instances'] = jsonutils.dumps(
                stats['supported_instances'])
            return stats

        def get_host_stats(self, refresh=False):
            """Return the current state of the host.

            If 'refresh' is True, run update the stats first.
            """
            return self.host_state.get_host_stats(refresh=refresh)

        def _get_vcpu_total(self):
            """Get available vcpu number of physical computer.

            :returns: the number of cpu core instances can be used.
            """
            if self._vcpu_total != 0:
                return self._vcpu_total

            try:
                # Index 2 of getInfo() is the CPU count; use it as-is.
                total_pcpus = self._conn.getInfo()[2]
            except libvirt.libvirtError:
                LOG.warn(_LW("Cannot get the number of cpu, because this "
                             "function is not implemented for this platform. "))
                return 0

            if CONF.vcpu_pin_set is None:
                self._vcpu_total = total_pcpus
                return self._vcpu_total

            available_ids = hardware.get_vcpu_pin_set()
            if sorted(available_ids)[-1] >= total_pcpus:
                raise exception.Invalid(_("Invalid vcpu_pin_set config, "
                                          "out of hypervisor cpu range."))
            self._vcpu_total = len(available_ids)
            return self._vcpu_total

        .....

    class HostState(object):
        """Manages information about the compute node through libvirt."""

        def __init__(self, driver):
            super(HostState, self).__init__()
            self._stats = {}
            self.driver = driver
            self.update_status()

        def get_host_stats(self, refresh=False):
            """Return the current state of the host.

            If 'refresh' is True, run update the stats first.
            """
            if refresh or not self._stats:
                self.update_status()
            return self._stats

        def update_status(self):
            """Retrieve status info from libvirt."""
            ...
            data["vcpus"] = self.driver._get_vcpu_total()
            data["memory_mb"] = self.driver._get_memory_mb_total()
            data["local_gb"] = disk_info_dict['total']
            data["vcpus_used"] = self.driver._get_vcpu_used()
            data["memory_mb_used"] = self.driver._get_memory_mb_used()
            data["local_gb_used"] = disk_info_dict['used']
            data["hypervisor_type"] = self.driver._get_hypervisor_type()
            data["hypervisor_version"] = self.driver._get_hypervisor_version()
            data["hypervisor_hostname"] = self.driver._get_hypervisor_hostname()
            data["cpu_info"] = self.driver._get_cpu_info()
            data['disk_available_least'] = _get_disk_available_least()
            ...
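Note the docstring of get_available_resource: it matches our initial inference exactly. Taking vcpus alone as the example, let us keep tracing the flow. self.driver._get_vcpu_total resolves to LibvirtDriver._get_vcpu_total (already shown in the driver code above). If the vcpu_pin_set option is not set, the resulting _vcpu_total is self._conn.getInfo()[2]. (self._conn can be thought of as the libvirt adapter: it represents an abstract connection to the underlying virtualization tools such as kvm and qemu. getInfo() is a thin wrapper around libvirtmod.virNodeGetInfo; it returns a list whose third element is the CPU count used for vcpus.) This is far enough for our purposes: anything below this point is libvirt's C code rather than Python.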
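To make the index lookup concrete, here is a small sketch that labels the positions of a getInfo()-shaped list. The field order follows the libvirt Python binding as I understand it (model, memory, CPUs, MHz, NUMA nodes, sockets, cores, threads); the sample values and the describe_node_info helper are invented for illustration, and no real libvirt connection is opened:

```python
# Assumed field order of the list returned by a libvirt connection's
# getInfo(); only index 2 (the CPU count) is confirmed by the Nova code.
NODE_INFO_FIELDS = (
    "model", "memory", "cpus", "mhz",
    "numa_nodes", "sockets", "cores", "threads",
)


def describe_node_info(node_info):
    """Pair each positional value with a readable field name."""
    return dict(zip(NODE_INFO_FIELDS, node_info))


# Made-up sample standing in for self._conn.getInfo().
sample = ["x86_64", 16384, 8, 2400, 1, 1, 4, 2]
info = describe_node_info(sample)
print(info["cpus"])  # the same value as sample[2]
```

Against a real hypervisor the list would come from an open libvirt connection rather than a hard-coded sample.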
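On the other hand, if vcpu_pin_set is configured, hardware.get_vcpu_pin_set parses it into a set of usable CPU indices. After checking that the largest index falls within the hypervisor's CPU range, the code takes the size of that set, which again gives the final vcpus count.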
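The two branches can be sketched as follows. The parser is a simplified stand-in, not Nova's actual hardware.get_vcpu_pin_set; it assumes a spec of the form `4-12,^8,15`, where a leading `^` excludes an index. Both function names are hypothetical:

```python
def parse_cpu_spec_sketch(spec):
    """Parse a spec like "4-12,^8,15" into a set of CPU indices."""
    included, excluded = set(), set()
    for rule in spec.split(","):
        rule = rule.strip()
        if not rule:
            continue
        # A leading "^" marks indices to exclude (assumed syntax).
        target = included
        if rule.startswith("^"):
            target, rule = excluded, rule[1:]
        if "-" in rule:
            start, end = (int(part) for part in rule.split("-", 1))
            if start > end:
                raise ValueError("invalid range: %s" % rule)
            target.update(range(start, end + 1))
        else:
            target.add(int(rule))
    return included - excluded


def vcpu_total_sketch(total_pcpus, pin_set_spec=None):
    """Mirror the two branches: no pin set, or the size of the pin set."""
    if pin_set_spec is None:
        return total_pcpus
    available_ids = parse_cpu_spec_sketch(pin_set_spec)
    if not available_ids or max(available_ids) >= total_pcpus:
        raise ValueError("pin set is empty or out of hypervisor cpu range")
    return len(available_ids)


print(vcpu_total_sketch(16))                # no pin set: all physical CPUs
print(vcpu_total_sketch(16, "4-12,^8,15"))  # pin set: count of allowed ids
```

As in the driver code, an index at or beyond the physical CPU count is rejected rather than silently dropped.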
That is the whole logical flow by which Nova counts a node's hardware resources (using vcpus as the example).
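For readers who want to see the chain end to end, the toy model below condenses it: a manager hook asks a per-node tracker to refresh, the tracker asks the driver, and the result is stored. Every class here is a hypothetical stand-in with hard-coded numbers; a plain dict replaces the database, and none of this is Nova's real code:

```python
class DriverSketch:
    """Stands in for the virt driver; returns fixed, made-up numbers."""

    def get_available_nodes(self):
        return ["compute-1"]

    def get_available_resource(self, nodename):
        return {"hypervisor_hostname": nodename, "vcpus": 8}


class ResourceTrackerSketch:
    """Stands in for the per-node resource tracker."""

    def __init__(self, driver, nodename, db):
        self.driver = driver
        self.nodename = nodename
        self.db = db

    def update_available_resource(self):
        resources = self.driver.get_available_resource(self.nodename)
        # A dict write replaces the real database synchronization.
        self.db[self.nodename] = resources


class ManagerSketch:
    """Stands in for the compute manager."""

    def __init__(self, driver):
        self.driver = driver
        self.db = {}

    def pre_start_hook(self):
        # Runs once before the service starts serving requests.
        self.update_available_resource()

    def update_available_resource(self):
        # In Nova this method is also scheduled as a periodic task.
        for nodename in self.driver.get_available_nodes():
            tracker = ResourceTrackerSketch(self.driver, nodename, self.db)
            tracker.update_available_resource()


manager = ManagerSketch(DriverSketch())
manager.pre_start_hook()
print(manager.db)
```

The sketch leaves out caching, claims, and error handling; it only shows the order in which the layers call each other.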