Linux Operating System: Processes/Threads (2)

Date: 2020-09-14


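Preface

In the previous post of this series I introduced the basic meaning of processes and threads along with some related data structures. Now let's look at how Linux manages processes.

The process list

The Linux kernel defines a list_head structure as follows: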

struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

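The next and prev fields are the forward and backward pointers of a generic doubly linked list. Note that a pointer in a list_head field stores the address of another list_head field, not the address of the data structure that contains it.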
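Since the pointers only connect list_head fields, the enclosing structure has to be recovered by subtracting the field's offset. The following is a minimal user-space sketch of that idea, not kernel code: struct item is a hypothetical payload type, and list_init, list_add_tail, sum_items and container_of are simplified re-implementations written for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

/* Recover the enclosing structure from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical payload type, used only for this illustration. */
struct item {
	int value;
	struct list_head node;
};

/* An empty circular list is a head that points at itself. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link new_node just before head, i.e. at the tail of the circular list. */
static void list_add_tail(struct list_head *new_node, struct list_head *head)
{
	assert(new_node != head);
	new_node->prev = head->prev;
	new_node->next = head;
	head->prev->next = new_node;
	head->prev = new_node;
}

/* Walk the list and add up the payloads of the enclosing items. */
static int sum_items(struct list_head *head)
{
	int sum = 0;

	for (struct list_head *pos = head->next; pos != head; pos = pos->next)
		sum += container_of(pos, struct item, node)->value;
	return sum;
}
```

After two items are added, head.next holds the address of the first item's node field, and container_of turns that address back into a pointer to the item itself.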

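The process descriptor (task_struct) introduced in the previous post also embeds this structure, forming what is called the process list. The process list is a circular doubly linked list that links together the descriptors of all processes. Every task_struct contains a field named tasks of type list_head, whose prev and next point to the tasks field of the previous and next task_struct respectively.

Initially this circular list contains only init_task, the kernel's first process. It is set up "by hand" from statically allocated memory, whereas every other process is initialized from dynamically allocated memory. Each time a process is created, the SET_LINKS macro adds its task_struct to the list. Note, however, that when a process creates an additional thread (other than its main thread), that is, a lightweight process, the new thread is not added to this list. The for_each_process macro walks all processes starting from init_task. Older, pre-2.6 kernels spelled the same loop as for_each_task and followed a next_task pointer stored directly in task_struct: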

#define for_each_task(p) \
	for (p = &init_task ; (p = p->next_task) != &init_task ; )
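To see how this loop behaves on a circular list, here is a small user-space sketch. It assumes a stripped-down struct task with only a pid and the two link pointers; link_task and count_tasks are hypothetical helpers, and link_task only imitates what SET_LINKS does for the list.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's task_struct (illustration only). */
struct task {
	int pid;
	struct task *next_task;
	struct task *prev_task;
};

/* Statically allocated first task; the list starts out pointing at itself. */
static struct task init_task = { 0, &init_task, &init_task };

#define for_each_task(p) \
	for (p = &init_task; (p = p->next_task) != &init_task; )

/* Link a new task in just before init_task, i.e. at the tail of the list. */
static void link_task(struct task *t)
{
	assert(t != &init_task);
	t->next_task = &init_task;
	t->prev_task = init_task.prev_task;
	init_task.prev_task->next_task = t;
	init_task.prev_task = t;
}

/* Count every task except init_task, which the macro skips by design. */
static int count_tasks(void)
{
	struct task *p;
	int n = 0;

	for_each_task(p)
		n++;
	return n;
}
```

Because the loop advances p before the comparison, init_task itself is never visited, and the walk ends as soon as the list wraps around to it.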

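The run queue (runqueue)

When the kernel looks for a new process to run on the CPU, it must consider only runnable processes, that is, processes in state TASK_RUNNING. These runnable processes are organized into a circular doubly linked list called the run queue (runqueue).
In early kernels, task_struct defines two pointers for this purpose: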

struct task_struct *next_run, *prev_run;

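This circular doubly linked list, made up of the processes that are running or ready to run (all in state TASK_RUNNING), is the run_queue ready queue. Its forward and backward pointers are next_run and prev_run, and both the head and the tail of the list are init_task (process 0).
However, to pick the "best" runnable process in constant time, the kernel divides the priorities of runnable processes into the range 0-139 and maintains 140 lists of runnable processes, one per priority, to organize the processes in state TASK_RUNNING.
The kernel defines a structure of type prio_array_t to manage these 140 lists. Every runnable process sits on exactly one of them, linked through the run_list field of its process descriptor, which is again a list_head. enqueue_task inserts a process descriptor into one of the runnable lists, and dequeue_task removes it. The prio_array_t structures themselves live in the arrays member of the runqueue structure, which in the 2.6 O(1) scheduler holds two of them (an active set and an expired set).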
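The per-priority lists can be sketched in user space as follows. This is only an illustration under simplifying assumptions: struct task and struct prio_array are reduced stand-ins, the "bitmap" is a plain byte array scanned linearly (the kernel uses a real bitmap with a find-first-bit operation), and pick_next and prio_array_init are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PRIO 140

struct list_head {
	struct list_head *next, *prev;
};

/* Simplified stand-ins for the kernel structures (illustration only). */
struct task {
	int prio;                   /* 0 (highest) .. 139 (lowest) */
	struct list_head run_list;  /* links the task into queue[prio] */
};

struct prio_array {
	int nr_active;                     /* tasks queued in total */
	unsigned char bitmap[MAX_PRIO];    /* 1 if queue[i] is non-empty */
	struct list_head queue[MAX_PRIO];  /* one list per priority */
};

static void prio_array_init(struct prio_array *a)
{
	a->nr_active = 0;
	for (int i = 0; i < MAX_PRIO; i++) {
		a->bitmap[i] = 0;
		a->queue[i].next = a->queue[i].prev = &a->queue[i];
	}
}

/* Append the task to the tail of the list for its priority. */
static void enqueue_task(struct task *t, struct prio_array *a)
{
	struct list_head *head;

	assert(t->prio >= 0 && t->prio < MAX_PRIO);
	head = &a->queue[t->prio];
	t->run_list.prev = head->prev;
	t->run_list.next = head;
	head->prev->next = &t->run_list;
	head->prev = &t->run_list;
	a->bitmap[t->prio] = 1;
	a->nr_active++;
}

/* Unlink the task and clear the bitmap entry if its list became empty. */
static void dequeue_task(struct task *t, struct prio_array *a)
{
	t->run_list.prev->next = t->run_list.next;
	t->run_list.next->prev = t->run_list.prev;
	a->nr_active--;
	if (a->queue[t->prio].next == &a->queue[t->prio])
		a->bitmap[t->prio] = 0;
}

/* Return the highest-priority queued task, or NULL if none is runnable. */
static struct task *pick_next(struct prio_array *a)
{
	for (int i = 0; i < MAX_PRIO; i++)
		if (a->bitmap[i])
			return (struct task *)((char *)a->queue[i].next -
					       offsetof(struct task, run_list));
	return NULL;
}
```

Choosing the next task costs at most one scan over a fixed number of priorities, regardless of how many tasks are queued, which is the point of splitting the run queue by priority.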

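Data structures

The pidhash tables

Finding a process descriptor from a pid by walking the list that links all processes would clearly be inefficient. To speed up the lookup, the kernel uses four hash tables. There are four of them so that a descriptor can be found by different kinds of pid: the pid of the process itself, the pid of the thread group leader, the pid of the process group leader, and the pid of the session leader. In every table the hash value is computed with the pid_hashfn(x) macro. A process may be in all four hash tables at once, so the process descriptor has a pids member of type struct pid through which the process is added to the tables. The pid structure contains a pid_chain member of type hlist_node, which resolves hash collisions, and a pid_list member of type list_head, which chains together the entries that share the same pid value.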

struct pid {
    int nr;                        // numeric value of the pid
    struct hlist_node pid_chain;   // hash collision chain
    struct list_head pid_list;     // entries sharing the same pid value
};

struct task_struct {
    …
    struct pid pids[4];
    …
};
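The lookup idea can be sketched with a single chained hash table in user space. Everything here is an assumption made for illustration: the table size, the multiplicative constant inside pid_hashfn, the singly linked collision chain, and the helper names hash_insert and hash_lookup. The kernel's own pid_hashfn and hlist-based chains differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIDHASH_BITS 4
#define PIDHASH_SIZE (1u << PIDHASH_BITS)

/* Simplified stand-in for a hashed process descriptor (illustration only). */
struct task {
	int pid;
	struct task *hash_next;   /* collision chain, playing the role of pid_chain */
};

static struct task *pid_hash[PIDHASH_SIZE];

/* Multiplicative hash; the constant is an assumption chosen for this sketch. */
static unsigned int pid_hashfn(int pid)
{
	assert(pid >= 0);
	return ((uint32_t)pid * UINT32_C(2654435761)) >> (32 - PIDHASH_BITS);
}

/* Push the task onto the front of its bucket's chain. */
static void hash_insert(struct task *t)
{
	unsigned int bucket = pid_hashfn(t->pid);

	t->hash_next = pid_hash[bucket];
	pid_hash[bucket] = t;
}

/* Walk only the one bucket the pid hashes to, not the whole process list. */
static struct task *hash_lookup(int pid)
{
	for (struct task *t = pid_hash[pid_hashfn(pid)]; t; t = t->hash_next)
		if (t->pid == pid)
			return t;
	return NULL;
}
```

A lookup touches only the entries that collide in one bucket, which is why it beats a linear walk over every process.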

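Linux process security context: struct cred

In 2.6 kernels, a new struct task_security_struct was defined and attached to the void *security pointer of task_struct. In 3.x kernels, however, task_struct no longer has a security member: the security-related information has been split out into a structure called cred, which holds the security context of a process.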

/*
 * The security context of a task
 *
 * The parts of the context break down into two categories:
 *
 *  (1) The objective context of a task.  These parts are used when some other
 *      task is attempting to affect this one.
 *
 *  (2) The subjective context.  These details are used when the task is acting
 *      upon another object, be that a file, a task, a key or whatever.
 *
 * Note that some members of this structure belong to both categories - the
 * LSM security pointer for instance.
 *
 * A task has two security pointers.  task->real_cred points to the objective
 * context that defines that task's actual details.  The objective part of this
 * context is used whenever that task is acted upon.
 *
 * task->cred points to the subjective context that defines the details of how
 * that task is going to act upon another object.  This may be overridden
 * temporarily to point to another security context, but normally points to the
 * same context as task->real_cred.
 */
struct cred {
	atomic_t	usage;
#ifdef CONFIG_DEBUG_CREDENTIALS
	atomic_t	subscribers;	/* number of processes subscribed */
	void		*put_addr;
	unsigned	magic;
#define CRED_MAGIC	0x43736564
#define CRED_MAGIC_DEAD	0x44656144
#endif
	uid_t		uid;		/* real UID of the task */
	gid_t		gid;		/* real GID of the task */
	uid_t		suid;		/* saved UID of the task */
	gid_t		sgid;		/* saved GID of the task */
	uid_t		euid;		/* effective UID of the task */
	gid_t		egid;		/* effective GID of the task */
	uid_t		fsuid;		/* UID for VFS ops */
	gid_t		fsgid;		/* GID for VFS ops */
	unsigned	securebits;	/* SUID-less security management */
	kernel_cap_t	cap_inheritable; /* caps our children can inherit */
	kernel_cap_t	cap_permitted;	/* caps we're permitted */
	kernel_cap_t	cap_effective;	/* caps we can actually use */
	kernel_cap_t	cap_bset;	/* capability bounding set */
#ifdef CONFIG_KEYS
	unsigned char	jit_keyring;	/* default keyring to attach requested
					 * keys to */
	struct key	*thread_keyring; /* keyring private to this thread */
	struct key	*request_key_auth; /* assumed request_key authority */
	struct thread_group_cred *tgcred; /* thread-group shared credentials */
#endif
#ifdef CONFIG_SECURITY
	void		*security;	/* subjective LSM security */
#endif
	struct user_struct *user;	/* real user ID subscription */
	struct user_namespace *user_ns; /* cached user->user_ns */
	struct group_info *group_info;	/* supplementary groups for euid/fsgid */
	struct rcu_head	rcu;		/* RCU deletion hook */
};

Much like the relationship between uid and euid, a task_struct carries two cred identities:

struct task_struct {
	...
	/* process credentials */
	const struct cred __rcu *real_cred; /* objective and real subjective task credentials (COW) */
	const struct cred __rcu *cred;  /* effective (overridable) subjective task credentials (COW) */
	...
};

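The rest of this section explains what this security context is for.
In Linux, a security check is usually required when one object acts on another. For example, when a process operates on a file, the kernel checks whether the process has permission to do so.
The credential mechanism in the Linux kernel is an abstraction of the permissions needed for such accesses: the subject presents credentials describing its own permissions, the object presents credentials describing what is required to access it, and the security check is made from both sets of credentials plus the requested operation.
Credential management terminology:
Object: a system object that user-space programs can operate on directly, such as a process, file, message queue, semaphore, or shared memory segment. Every object carries a set of credentials, and each kind of object has its own credential set.
Object ownership: part of an object's credential set identifies its owner; for a file, the uid identifies the file's owner.
Subject: an object that acts on another object. Apart from processes, most system objects are not subjects, but some become subjects in special circumstances. For example, after F_SETOWN has been set on a file, the file can send SIGIO to a process; the file is then the subject and the process is the object.
Action: what the subject does to the object, such as reading, writing, or executing a file.
Objective context: the set of credentials that describes what is required when the object is accessed.
Subjective context: the set of credentials that describes the subject's permissions.
Rules: what the security check applies when a subject acts on an object.
When a subject acts on an object, the security calculation takes the subjective context, the objective context, and the action, and looks up the rules to decide whether the subject is allowed to act on the object.
In the process descriptor, the cred and real_cred fields point to the subjective and objective credentials respectively.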
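As a concrete instance of "subjective context + objective context + action", here is a user-space sketch of the classic owner/group/other permission check. It is an assumption-laden simplification, not the kernel's implementation: struct subject_cred, struct object_ctx and may_access are hypothetical names, and supplementary groups, capabilities and LSM hooks are all ignored.

```c
#include <assert.h>

#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/* Subjective context: who the acting task is for file access (illustration only). */
struct subject_cred {
	unsigned int fsuid;
	unsigned int fsgid;
};

/* Objective context of a file-like object: owner, group, permission bits. */
struct object_ctx {
	unsigned int uid;
	unsigned int gid;
	unsigned int mode;   /* e.g. 0640 */
};

/* Return 1 if the subject may perform every action in mask, else 0. */
static int may_access(const struct subject_cred *s,
		      const struct object_ctx *o, unsigned int mask)
{
	unsigned int bits;

	assert((mask & ~7u) == 0);
	if (s->fsuid == o->uid)
		bits = (o->mode >> 6) & 7;   /* owner class */
	else if (s->fsgid == o->gid)
		bits = (o->mode >> 3) & 7;   /* group class */
	else
		bits = o->mode & 7;          /* other class */
	return (bits & mask) == mask;
}
```

For a file with mode 0640, the owner may read and write, a member of the file's group may only read, and everyone else is refused.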

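usage: reference count used to manage the lifetime of the credential
uid: the real user ID (real UID of the task, that is, the uid of the user who created the process)
gid: the real group ID
suid: the saved user ID (saved UID of the task). For example, when a privileged process needs to lower its privileges temporarily, it changes its EUID to an unprivileged UID and keeps the original EUID in SUID; to regain its privileges it sets EUID back to the value stored in SUID.
sgid: the saved group ID
euid: the effective user ID (effective UID of the task), used for access checks when the process touches a resource. In most cases EUID equals UID, but the two can differ, for instance when the EUID is changed at run time.
egid: the effective group ID
securebits: security management flags that control how credentials are manipulated and inherited
cap_inheritable: capabilities that can be inherited across execve
cap_permitted: capabilities that may be assigned to cap_effective (via capset)
cap_effective: capabilities the process can actually use
cap_bset: mainly relevant when uid=0 or euid=0, where it bounds the capabilities gained across execve: cap_permitted = cap_inheritable | cap_bset and cap_effective = cap_permitted. Capabilities in cap_bset can be added to cap_inheritable by calling capset.
user: per-user information, such as the number of processes and open files of that user
rcu: hook used for RCU deletion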
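The interplay of uid, euid and suid described above can be modelled in a few lines. This is a toy model only, assuming a hypothetical struct ids with three fields; it does not reproduce the permission rules that the real setuid-family system calls enforce.

```c
#include <assert.h>

/* Toy model of the three user IDs kept in struct cred (illustration only). */
struct ids {
	unsigned int uid;    /* real UID */
	unsigned int euid;   /* effective UID, used for access checks */
	unsigned int suid;   /* saved UID */
};

/* Temporarily drop privileges: remember euid in suid, then act as uid. */
static void drop_privileges(struct ids *c)
{
	assert(c);
	c->suid = c->euid;
	c->euid = c->uid;
}

/* Restore the effective UID that drop_privileges() saved. */
static void restore_privileges(struct ids *c)
{
	assert(c);
	c->euid = c->suid;
}

/* Access checks look at the effective UID only. */
static int is_privileged(const struct ids *c)
{
	return c->euid == 0;
}
```

Starting from uid=1000 and euid=0, dropping privileges leaves euid=1000 with 0 parked in suid, and restoring brings euid back to 0.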

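struct cred in kernel pwn

Note: I have not studied kernel pwn yet, so this section only briefly introduces the role struct cred plays in privilege escalation and gives no concrete example.
If code running in kernel mode executes commit_creds(prepare_kernel_cred(0)), the current process obtains root privileges (for root, uid and gid are both 0).
The source code is as follows: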

/* /kernel/cred.c */
/**
 * prepare_kernel_cred - Prepare a set of credentials for a kernel service
 * @daemon: A userspace daemon to be used as a reference
 *
 * Prepare a set of credentials for a kernel service.  This can then be used to
 * override a task's own credentials so that work can be done on behalf of that
 * task that requires a different subjective context.
 *
 * @daemon is used to provide a base for the security record, but can be NULL.
 * If @daemon is supplied, then the security data will be derived from that;
 * otherwise they'll be set to 0 and no groups, full capabilities and no keys.
 *
 * The caller may change these controls afterwards if desired.
 *
 * Returns the new credentials or NULL if out of memory.
 *
 * Does not take, and does not return holding current->cred_replace_mutex.
 */
struct cred *prepare_kernel_cred(struct task_struct *daemon)
{
	const struct cred *old;
	struct cred *new;

	new = kmem_cache_alloc(cred_jar, GFP_KERNEL);
	if (!new)
		return NULL;

	kdebug("prepare_kernel_cred() alloc %p", new);

	if (daemon)
		old = get_task_cred(daemon);
	else
		old = get_cred(&init_cred);

	validate_creds(old);

	*new = *old;
	new->non_rcu = 0;
	atomic_set(&new->usage, 1);
	set_cred_subscribers(new, 0);
	get_uid(new->user);
	get_user_ns(new->user_ns);
	get_group_info(new->group_info);

#ifdef CONFIG_KEYS
	new->session_keyring = NULL;
	new->process_keyring = NULL;
	new->thread_keyring = NULL;
	new->request_key_auth = NULL;
	new->jit_keyring = KEY_REQKEY_DEFL_THREAD_KEYRING;
#endif

#ifdef CONFIG_SECURITY
	new->security = NULL;
#endif
	if (security_prepare_creds(new, old, GFP_KERNEL) < 0)
		goto error;

	put_cred(old);
	validate_creds(new);
	return new;

error:
	put_cred(new);
	put_cred(old);
	return NULL;
}
EXPORT_SYMBOL(prepare_kernel_cred);

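prepare_kernel_cred()
According to the comment in the source, this function returns a cred structure that can replace a task's own credentials so that work requiring a different subjective context can be done. If @daemon is supplied, the security data is derived from it. The parameter may also be NULL, in which case the code above copies init_cred, giving IDs of 0, no groups, full capabilities and no keys. With uid and gid both 0, that is a root credential.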

/* /kernel/cred.c */
/**
 * commit_creds - Install new credentials upon the current task
 * @new: The credentials to be assigned
 *
 * Install a new set of credentials to the current task, using RCU to replace
 * the old set.  Both the objective and the subjective credentials pointers are
 * updated.  This function may not be called if the subjective credentials are
 * in an overridden state.
 *
 * This function eats the caller's reference to the new credentials.
 *
 * Always returns 0 thus allowing this function to be tail-called at the end
 * of, say, sys_setgid().
 */
int commit_creds(struct cred *new)
{
	struct task_struct *task = current;
	const struct cred *old = task->real_cred;

	kdebug("commit_creds(%p{%d,%d})", new,
	       atomic_read(&new->usage),
	       read_cred_subscribers(new));

	BUG_ON(task->cred != old);
#ifdef CONFIG_DEBUG_CREDENTIALS
	BUG_ON(read_cred_subscribers(old) < 2);
	validate_creds(old);
	validate_creds(new);
#endif
	BUG_ON(atomic_read(&new->usage) < 1);

	get_cred(new); /* we will require a ref for the subj creds too */

	/* dumpability changes */
	if (!uid_eq(old->euid, new->euid) ||
	    !gid_eq(old->egid, new->egid) ||
	    !uid_eq(old->fsuid, new->fsuid) ||
	    !gid_eq(old->fsgid, new->fsgid) ||
	    !cred_cap_issubset(old, new)) {
		if (task->mm)
			set_dumpable(task->mm, suid_dumpable);
		task->pdeath_signal = 0;
		/*
		 * If a task drops privileges and becomes nondumpable,
		 * the dumpability change must become visible before
		 * the credential change; otherwise, a __ptrace_may_access()
		 * racing with this change may be able to attach to a task it
		 * shouldn't be able to attach to (as if the task had dropped
		 * privileges without becoming nondumpable).
		 * Pairs with a read barrier in __ptrace_may_access().
		 */
		smp_wmb();
	}

	/* alter the thread keyring */
	if (!uid_eq(new->fsuid, old->fsuid))
		key_fsuid_changed(task);
	if (!gid_eq(new->fsgid, old->fsgid))
		key_fsgid_changed(task);

	/* do it
	 * RLIMIT_NPROC limits on user->processes have already been checked
	 * in set_user().
	 */
	alter_cred_subscribers(new, 2);
	if (new->user != old->user)
		atomic_inc(&new->user->processes);
	rcu_assign_pointer(task->real_cred, new);
	rcu_assign_pointer(task->cred, new);
	if (new->user != old->user)
		atomic_dec(&old->user->processes);
	alter_cred_subscribers(old, -2);

	/* send notifications */
	if (!uid_eq(new->uid,   old->uid)  ||
	    !uid_eq(new->euid,  old->euid) ||
	    !uid_eq(new->suid,  old->suid) ||
	    !uid_eq(new->fsuid, old->fsuid))
		proc_id_connector(task, PROC_EVENT_UID);

	if (!gid_eq(new->gid,   old->gid)  ||
	    !gid_eq(new->egid,  old->egid) ||
	    !gid_eq(new->sgid,  old->sgid) ||
	    !gid_eq(new->fsgid, old->fsgid))
		proc_id_connector(task, PROC_EVENT_GID);

	/* release the old obj and subj refs both */
	put_cred(old);
	put_cred(old);
	return 0;
}
EXPORT_SYMBOL(commit_creds);

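According to the comment in the source, this function sets both real_cred and cred of the current task to the new set of credentials. To sum up: prepare_kernel_cred(0) produces a root cred, and commit_creds() installs it on the current task, so commit_creds(prepare_kernel_cred(0)) escalates the current process to root.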
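The data flow of the two functions can be followed in a user-space mock. This is a sketch under heavy assumptions and cannot change real privileges: struct cred and struct task are reduced to a handful of fields, mock_init_cred stands in for init_cred, and the mock_ functions only imitate the copy, the reference counting and the pointer swap seen in the kernel source above (no RCU, no LSM hooks, no keyrings).

```c
#include <assert.h>
#include <stdlib.h>

/* Heavily reduced stand-ins for the kernel types (illustration only). */
struct cred {
	int usage;                          /* reference count */
	unsigned int uid, gid, euid, egid;
};

struct task {
	struct cred *real_cred;   /* objective context */
	struct cred *cred;        /* subjective context */
};

/* Plays the role of init_cred: all IDs are 0, i.e. root. */
static struct cred mock_init_cred = { 1, 0, 0, 0, 0 };

/* Drop one reference and free heap-allocated credentials at zero. */
static void mock_put_cred(struct cred *c)
{
	assert(c->usage > 0);
	if (--c->usage == 0 && c != &mock_init_cred)
		free(c);
}

/* Copy the daemon's credentials, or the root template when daemon is NULL. */
static struct cred *mock_prepare_kernel_cred(const struct task *daemon)
{
	struct cred *new = malloc(sizeof(*new));

	if (!new)
		return NULL;
	*new = daemon ? *daemon->cred : mock_init_cred;
	new->usage = 1;
	return new;
}

/* Install new as both credential pointers, consuming the caller's reference. */
static int mock_commit_creds(struct task *task, struct cred *new)
{
	struct cred *old = task->real_cred;

	assert(task->cred == old);
	new->usage++;             /* second reference for the subjective pointer */
	task->real_cred = new;
	task->cred = new;
	mock_put_cred(old);       /* release the old objective reference */
	mock_put_cred(old);       /* release the old subjective reference */
	return 0;
}
```

Calling mock_commit_creds(&t, mock_prepare_kernel_cred(NULL)) on a task whose credentials had uid 1000 leaves both t.cred and t.real_cred pointing at a copy of the root template.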