osquery Tables Explained: Facebook's Open-Source, SQL-Based Operating System Instrumentation and Monitoring Framework

Date: 2019-11-26

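Preface

The previous article covered some basic osquery usage, namely how to query system information with SQL statements. This article explains how a table is defined and how the data in a table is obtained.
The uptime and processes tables serve as examples. The osquery version discussed here is 1.7.6.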

uptime

The uptime table reports how long the system has been running since the last boot:

osquery> select * from uptime;
+------+-------+---------+---------+---------------+
| days | hours | minutes | seconds | total_seconds |
+------+-------+---------+---------+---------------+
| 1    | 23    | 19      | 53      | 170393        |
+------+-------+---------+---------+---------------+

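How is this row of the uptime table obtained?
Generally speaking, a table is described in two parts: a spec and an impl.

The spec declares the table's name, its schema, and the function that implements it; spec files live mainly under /specs.
The impl is the actual implementation of the table; its code lives mainly under /osquery/tables.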

Spec

First, look at uptime.table:

table_name("uptime")
description("Track time passed since last boot.")
schema([
    Column("days", INTEGER, "Days of uptime"),
    Column("hours", INTEGER, "Hours of uptime"),
    Column("minutes", INTEGER, "Minutes of uptime"),
    Column("seconds", INTEGER, "Seconds of uptime"),
    Column("total_seconds", BIGINT, "Total uptime seconds"),
])
implementation("system/uptime@genUptime")

The uptime table has five columns: days, hours, minutes, seconds, and total_seconds.

Its implementation is the genUptime function in system/uptime.

Impl

Now look at the implementation itself, uptime.cpp:

QueryData genUptime(QueryContext& context) {
  Row r;
  QueryData results;
  long uptime_in_seconds = getUptime(); // Get the uptime in seconds (each platform has its own implementation)

  if (uptime_in_seconds >= 0) {
    r["days"] = INTEGER(uptime_in_seconds / 60 / 60 / 24);
    r["hours"] = INTEGER((uptime_in_seconds / 60 / 60) % 24);
    r["minutes"] = INTEGER((uptime_in_seconds / 60) % 60);
    r["seconds"] = INTEGER(uptime_in_seconds % 60);
    r["total_seconds"] = BIGINT(uptime_in_seconds);
    results.push_back(r);
  }

  return results;
}

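Row r is a single row of data; it corresponds to one row of the SQL result and holds a value for each column of the table.
QueryData results is the collection of all rows returned by the query and can contain any number of rows.

The function first obtains the uptime, then fills in the corresponding fields of Row r.
It then appends the row to the returned data with results.push_back(r); and finally returns the result.

Because the uptime table only reports a single duration, it has exactly one row, so genUptime fills in and returns just one row (or none, if getUptime() returns a negative value).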
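
To make the arithmetic concrete, here is a minimal standalone sketch of the same row-building logic. It does not use osquery's headers: the Row and QueryData aliases below are assumed stand-ins (a string-to-string map and a vector of such maps), and makeUptimeRows is a hypothetical helper name, not an osquery function.

```cpp
#include <map>
#include <string>
#include <vector>

// Assumption: simplified stand-ins for osquery's Row and QueryData types.
using Row = std::map<std::string, std::string>;
using QueryData = std::vector<Row>;

// Hypothetical helper: builds the single uptime row from a seconds count,
// using the same integer arithmetic as genUptime.
QueryData makeUptimeRows(long uptime_in_seconds) {
  QueryData results;
  if (uptime_in_seconds >= 0) {
    Row r;
    r["days"] = std::to_string(uptime_in_seconds / 60 / 60 / 24);
    r["hours"] = std::to_string((uptime_in_seconds / 60 / 60) % 24);
    r["minutes"] = std::to_string((uptime_in_seconds / 60) % 60);
    r["seconds"] = std::to_string(uptime_in_seconds % 60);
    r["total_seconds"] = std::to_string(uptime_in_seconds);
    results.push_back(r);
  }
  return results; // empty when the uptime could not be determined
}
```

Calling makeUptimeRows(170393) with the total_seconds value from the sample output yields days=1, hours=23, minutes=19, seconds=53, which matches the row shown at the start of this section.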

uptime is a fairly simple table. The next section analyzes a more complex one, processes.

Process

The processes table is somewhat more complex; it provides information about the processes currently running on the system.

Spec

First, look at processes.table.
The table has many columns, which are not described one by one here.

table_name("processes")
description("All running processes on the host system.")
schema([
    Column("pid", BIGINT, "Process (or thread) ID", index=True),
    Column("name", TEXT, "The process path or shorthand argv[0]"),
    Column("path", TEXT, "Path to executed binary"),
    Column("cmdline", TEXT, "Complete argv"),
    Column("state", TEXT, "Process state"),
    Column("cwd", TEXT, "Process current working directory"),
    Column("root", TEXT, "Process virtual root directory"),
    Column("uid", BIGINT, "Unsigned user ID"),
    Column("gid", BIGINT, "Unsigned group ID"),
    Column("euid", BIGINT, "Unsigned effective user ID"),
    Column("egid", BIGINT, "Unsigned effective group ID"),
    Column("suid", BIGINT, "Unsigned saved user ID"),
    Column("sgid", BIGINT, "Unsigned saved group ID"),
    Column("on_disk", INTEGER,
        "The process path exists yes=1, no=0, unknown=-1"),
    Column("wired_size", BIGINT, "Bytes of unpagable memory used by process"),
    Column("resident_size", BIGINT, "Bytes of private memory used by process"),
    Column("phys_footprint", BIGINT, "Bytes of total physical memory used"),
    Column("user_time", BIGINT, "CPU time spent in user space"),
    Column("system_time", BIGINT, "CPU time spent in kernel space"),
    Column("start_time", BIGINT,
        "Process start in seconds since boot (non-sleeping)"),
    Column("parent", BIGINT, "Process parent's PID"),
    Column("pgroup", BIGINT, "Process group"),
    Column("nice", INTEGER, "Process nice level (-20 to 20, default 0)"),
])
implementation("system/processes@genProcesses")
examples([
  "select * from processes where pid = 1",
])

Its implementation is the genProcesses function in system/processes.

Impl

genProcesses has a separate implementation for each supported platform. This article analyzes the Linux one, linux/processes.cpp.

Start with the implementation function, genProcesses:

QueryData genProcesses(QueryContext& context) {
  QueryData results;

  auto pidlist = getProcList(context);
  for (const auto& pid : pidlist) {
    genProcess(pid, results);
  }

  return results;
}

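The function has two main parts:

Get the list of PIDs based on the context.
Walk the PID list and collect the information for each PID.

Getting the PID list

getProcList builds the PID list based on the query context.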

std::set<std::string> getProcList(const QueryContext& context) {
  std::set<std::string> pidlist;
  if (context.constraints.count("pid") > 0 &&
      context.constraints.at("pid").exists(EQUALS)) {
    for (const auto& pid : context.constraints.at("pid").getAll(EQUALS)) {
      if (isDirectory("/proc/" + pid)) {
        pidlist.insert(pid);
      }
    }
  } else {
    osquery::procProcesses(pidlist);
  }

  return pidlist;
}

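As the code shows, the PID list can be narrowed by the query's constraints. If the query contains where pid=xxxx, the condition
if (context.constraints.count("pid") > 0 && context.constraints.at("pid").exists(EQUALS)) holds, so only that PID needs to be added to pidlist (provided /proc/<pid> exists).
The benefit is that with a where pid=xxxx constraint there is no need to enumerate every PID; only the requested PID's information is fetched.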
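
The effect of this shortcut can be sketched in isolation. The code below is an illustration under stated assumptions, not osquery's API: PidEquals is a hypothetical stand-in for the equality constraints on the pid column, and the exists and scanAll callbacks replace the real /proc lookups so the logic can be exercised without a filesystem.

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the equality constraints on the pid column:
// an empty vector means the query had no "pid = ..." condition.
using PidEquals = std::vector<std::string>;

// Chooses which PIDs to inspect. `exists` says whether a PID is live
// (osquery checks for a /proc/<pid> directory); `scanAll` enumerates
// every PID and is only called when there is no constraint.
std::set<std::string> selectPids(
    const PidEquals& wanted,
    const std::function<bool(const std::string&)>& exists,
    const std::function<std::set<std::string>()>& scanAll) {
  if (wanted.empty()) {
    return scanAll(); // no constraint: fall back to the full scan
  }
  std::set<std::string> pidlist;
  for (const auto& pid : wanted) {
    if (exists(pid)) {
      pidlist.insert(pid); // keep only requested PIDs that are alive
    }
  }
  return pidlist;
}
```

With a constraint present, scanAll is never invoked, which is the whole point: the cost of the query scales with the number of requested PIDs rather than with the number of processes on the host.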
Without such a constraint, all PIDs are collected using the procProcesses function:

const std::string kLinuxProcPath = "/proc";
Status procProcesses(std::set<std::string>& processes) {
  // Iterate over each process-like directory in proc.
  boost::filesystem::directory_iterator it(kLinuxProcPath), end;
  try {
    for (; it != end; ++it) {
      if (boost::filesystem::is_directory(it->status())) {
        // See #792: std::regex is incomplete until GCC 4.9
        if (std::atoll(it->path().leaf().string().c_str()) > 0) {
          processes.insert(it->path().leaf().string());
        }
      }
    }
  } catch (const boost::filesystem::filesystem_error& e) {
    VLOG(1) << "Exception iterating Linux processes " << e.what();
    return Status(1, e.what());
  }

  return Status(0, "OK");
}

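As the code shows, collecting all PIDs means iterating over every directory under /proc and checking whether its name is numeric (here, whether std::atoll returns a positive value); matching names are added to the processes set.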
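
The same scan can be sketched without boost. The version below swaps in POSIX opendir/readdir/stat for boost::filesystem and adds a strict all-digits check next to the std::atoll test, since std::atoll also accepts names that merely start with digits. The function names isAllDigits, parsesToPositive, and listPidDirs are hypothetical and chosen for this illustration only.

```cpp
#include <dirent.h>
#include <sys/stat.h>

#include <cctype>
#include <cstdlib>
#include <set>
#include <string>

// Strict check: true only when every character is a decimal digit.
bool isAllDigits(const std::string& name) {
  if (name.empty()) {
    return false;
  }
  for (unsigned char c : name) {
    if (!std::isdigit(c)) {
      return false;
    }
  }
  return true;
}

// The looser check used in procProcesses: accepts any name whose leading
// digits parse to a positive number (so "12abc" would pass too).
bool parsesToPositive(const std::string& name) {
  return std::atoll(name.c_str()) > 0;
}

// Collects the names of all-digit subdirectories of `root` (e.g. "/proc").
// Returns false if the directory cannot be opened.
bool listPidDirs(const std::string& root, std::set<std::string>& out) {
  DIR* dir = opendir(root.c_str());
  if (dir == nullptr) {
    return false;
  }
  while (struct dirent* entry = readdir(dir)) {
    const std::string name = entry->d_name;
    struct stat st;
    if (isAllDigits(name) && ::stat((root + "/" + name).c_str(), &st) == 0 &&
        S_ISDIR(st.st_mode)) {
      out.insert(name);
    }
  }
  closedir(dir);
  return true;
}
```

On a Linux host, listPidDirs("/proc", pids) should fill pids with the same kind of entries procProcesses collects. Names such as self or sys are rejected by both checks.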

Getting process information

With the PID list in hand, the next step is to walk it and collect the information for each PID.

void genProcess(const std::string& pid, QueryData& results) {
  // Parse the process stat and status.
  auto proc_stat = getProcStat(pid);

  Row r;
  r["pid"] = pid;
  r["parent"] = proc_stat.parent;
  r["path"] = readProcLink("exe", pid);
  r["name"] = proc_stat.name;
  r["pgroup"] = proc_stat.group;
  r["state"] = proc_stat.state;
  r["nice"] = proc_stat.nice;
  // Read/parse cmdline arguments.
  r["cmdline"] = readProcCMDLine(pid);
  r["cwd"] = readProcLink("cwd", pid);
  r["root"] = readProcLink("root", pid);
  r["uid"] = proc_stat.real_uid;
  r["euid"] = proc_stat.effective_uid;
  r["suid"] = proc_stat.saved_uid;
  r["gid"] = proc_stat.real_gid;
  r["egid"] = proc_stat.effective_gid;
  r["sgid"] = proc_stat.saved_gid;

  // If the path of the executable that started the process is available and
  // the path exists on disk, set on_disk to 1. If the path is not
  // available, set on_disk to -1. If, and only if, the path of the
  // executable is available and the file does NOT exist on disk, set on_disk
  // to 0.
  r["on_disk"] = osquery::pathExists(r["path"]).toString();

  // size/memory information
  r["wired_size"] = "0"; // No support for unpagable counters in linux.
  r["resident_size"] = proc_stat.resident_size;
  r["phys_footprint"] = proc_stat.phys_footprint;

  // time information
  r["user_time"] = proc_stat.user_time;
  r["system_time"] = proc_stat.system_time;
  r["start_time"] = proc_stat.start_time;

  results.push_back(r);
}

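The function first calls getProcStat to obtain the information for the PID.
getProcStat is not analyzed in detail here; it mainly reads the /proc/<pid>/stat file and extracts the corresponding fields.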
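
For readers who want a feel for that parsing step, here is a minimal sketch of extracting a few fields from one line in /proc/<pid>/stat format. It is not osquery's getProcStat: the StatFields struct and parseStatLine function are hypothetical, and only the field positions (state, ppid, pgrp, and nice as fields 3, 4, 5, and 19 of the stat format) are taken from the proc(5) manual page.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical subset of the values that genProcess reads from proc_stat.
struct StatFields {
  std::string name;
  std::string state;
  std::string parent;
  std::string group;
  std::string nice;
};

// Parses one line in /proc/<pid>/stat format. Returns false on malformed input.
bool parseStatLine(const std::string& line, StatFields& out) {
  // The command name sits in parentheses and may itself contain spaces or
  // ')' characters, so locate it via the first '(' and the last ')'.
  const auto open = line.find('(');
  const auto close = line.rfind(')');
  if (open == std::string::npos || close == std::string::npos || close < open) {
    return false;
  }
  out.name = line.substr(open + 1, close - open - 1);

  // Everything after ")" is space separated: state, ppid, pgrp, ...
  std::istringstream rest(line.substr(close + 1));
  std::vector<std::string> fields;
  for (std::string token; rest >> token;) {
    fields.push_back(token);
  }
  if (fields.size() < 17) {
    return false;
  }
  out.state = fields[0];  // field 3 in proc(5)
  out.parent = fields[1]; // field 4: ppid
  out.group = fields[2];  // field 5: pgrp
  out.nice = fields[16];  // field 19: nice
  return true;
}
```

Searching for the last ')' rather than splitting the whole line on spaces is the main design choice here: a process name such as "my (odd) proc" would otherwise shift every later field.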

genProcess then copies the values returned by getProcStat into the corresponding columns of row r and finally appends r to the result set.