[6. C++ Basics] Locks

Date: 2021-02-16



What locks provide

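Atomicity + visibility
Atomicity: at any moment only one thread executes the code inside the lock. Visibility: reads inside the lock happen after the code before the lock has finished, and writes made inside the lock are visible by the time the lock is released.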
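
To make the two guarantees concrete, here is a minimal sketch using std::mutex. The function name locked_count and its parameters are made up for illustration. Several threads increment a plain long; the lock makes each increment indivisible, and each unlock publishes the new value to the next thread that takes the lock.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical demo: `threads` workers each add 1 to a shared counter `iters` times.
long locked_count(int threads, int iters) {
    long counter = 0;   // plain, non-atomic shared data
    std::mutex m;       // protects `counter`
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            for (int j = 0; j < iters; ++j) {
                std::lock_guard<std::mutex> guard(m);  // one thread at a time
                ++counter;                             // read-modify-write as a unit
            }   // guard unlocks here; the write is visible to the next locker
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```

Without the lock_guard the same loop would be a data race and the final count would usually come out short.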

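Atomics

Operations

The kernel's own atomics are implemented with atomic instructions: https://code.woboq.org/linux/...
The following methods of the atomic library can carry a memory barrier to strengthen visibility.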

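store // atomic write
load // atomic read
exchange // atomic swap

compare_exchange_weak // compare-and-set. It can be faster, but it may spuriously return false even when the two values are equal. For a.compare_exchange_weak(expect, val): if a == expect, it does a.store(val) and returns true; otherwise it sets expect = a and returns false.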

bool compare_exchange_weak (T& expected, T val, memory_order sync = memory_order_seq_cst) volatile noexcept;
Compares the contents of the atomic object's contained value with expected:
- if true, it replaces the contained value with val (like store).
- if false, it replaces expected with the contained value .
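
Because the weak form may fail spuriously, it is normally called in a loop. The sketch below is one hedged example of that pattern: an "atomic max" built on std::atomic<int>. The helper name atomic_store_max is made up for this note and is not part of the standard library.

```cpp
#include <atomic>

// Hypothetical helper: raise `a` to at least `candidate`.
// Returns true if this call performed the store.
bool atomic_store_max(std::atomic<int>& a, int candidate) {
    int expected = a.load();
    while (expected < candidate) {
        // On failure (including a spurious one) `expected` is refreshed with
        // the current value of `a`, so the loop re-checks and retries.
        if (a.compare_exchange_weak(expected, candidate)) {
            return true;
        }
    }
    return false;  // `a` was already >= candidate
}
```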

 __asm__ __volatile__("" : : : "memory");
 inline void* Acquire_Load() const {
    void* result = rep_;
    MemoryBarrier();
    return result;
  }
  inline void Release_Store(void* v) {
    MemoryBarrier();
    rep_ = v;
  }

compare_exchange_strong // same contract as compare_exchange_weak, but it never fails spuriously

2. Memory barriers

typedef enum memory_order {
        memory_order_relaxed, // no ordering guarantee, only atomicity
        memory_order_acquire, // A load operation with this memory order performs the acquire operation on the affected memory location: no reads or writes in the current thread can be reordered before this load. All writes in other threads that release the same atomic variable are visible in the current thread (see Release-Acquire ordering below)
        memory_order_release, // A store operation with this memory order performs the release operation: no reads or writes in the current thread can be reordered after this store. All writes in the current thread are visible in other threads that acquire the same atomic variable (see Release-Acquire ordering below) and writes that carry a dependency into the atomic variable become visible in other threads that consume the same atomic (see Release-Consume ordering below).
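        memory_order_acq_rel, // both memory_order_acquire and memory_order_release
        memory_order_consume, // in this thread, later operations that depend on the value loaded by this atomic operation must execute after it
        memory_order_seq_cst // all such operations execute in one total order (sequentially consistent)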
    } memory_order;
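
As a hedged illustration of release-acquire ordering, the sketch below passes a plain int from one thread to another through an atomic flag. The function name publish_and_read is made up for this note.

```cpp
#include <atomic>
#include <thread>

// Hypothetical demo: hand `value` from a producer thread to a consumer thread.
int publish_and_read(int value) {
    int payload = 0;                  // plain data
    std::atomic<bool> ready{false};   // the flag that carries the ordering
    int seen = -1;

    std::thread producer([&] {
        payload = value;                                // (1) write the data
        ready.store(true, std::memory_order_release);   // (2) publish: (1) cannot move after this
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {  // (3) pairs with (2)
            std::this_thread::yield();
        }
        seen = payload;                                 // (4) guaranteed to observe (1)
    });
    producer.join();
    consumer.join();
    return seen;
}
```

With memory_order_relaxed on both sides the flag itself would still be atomic, but step (4) would no longer be guaranteed to see step (1).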

Lock-free queue (singly linked list)

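#include <atomic>   // std::atomic<std::shared_ptr<T>> requires C++20
#include <memory>
#include <utility>
using namespace std;

template <typename T>
class slist {
   struct Node { T t; shared_ptr<Node> next; };
   atomic<shared_ptr<Node>> head;
public:
   slist() = default;
   ~slist() = default;
   // Holds a shared_ptr so the node stays alive while the caller uses it.
   class reference {
      shared_ptr<Node> p;
   public:
      reference(shared_ptr<Node> p_) : p{p_} {}
      T& operator*() { return p->t; }
      T* operator->() { return &p->t; }
   };
   auto find(T t) const {
      auto p = head.load();
      while (p && p->t != t)
         p = p->next;
      return reference{move(p)};
   }
   void push_front(T t) {
      auto p = make_shared<Node>();
      p->t = t;
      p->next = head.load();
      // On failure p->next is refreshed with the current head; retry until the CAS succeeds.
      while (!head.compare_exchange_weak(p->next, p))
         {}
   }
   void pop_front() {
      auto p = head.load();
      // On failure p is refreshed with the current head; stop when the list is empty.
      while (p && !head.compare_exchange_weak(p, p->next))
         {}
   }
};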

### mutex

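std::mutex => pthread_mutex_lock
On Linux, glibc's pthread mutex comes in several kinds. The normal one goes straight to futex; the adaptive one spins first.
It calls CAS in a loop and waits on the futex.
cmpxchgl checks whether the futex (that is, the __lock member) is 0 (lock not held); if so, it sets it to 1 (lock held).
pthread_cond_wait:
It also releases the mutex first, then futex-waits on the cond (lll_futex_wait (&cond->__data.__futex, futex_val, pshared);), then locks the mutex again.
More on pthread locks: https://casatwy.com/pthreadde...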
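
The release, wait, re-lock sequence of pthread_cond_wait is the same contract std::condition_variable exposes. A minimal sketch, assuming the standard library wrapper rather than raw pthread calls (the function name wait_for_value is made up):

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical demo: block until a producer thread hands over `value`.
int wait_for_value(int value) {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
    int data = 0;

    std::thread producer([&] {
        {
            std::lock_guard<std::mutex> guard(m);
            data = value;
            ready = true;
        }                   // unlock before notifying so the waiter can proceed at once
        cv.notify_one();
    });

    std::unique_lock<std::mutex> lock(m);
    // wait() releases `m` while blocked and re-locks it before returning.
    // The predicate guards against spurious wakeups.
    cv.wait(lock, [&] { return ready; });
    int result = data;
    lock.unlock();
    producer.join();
    return result;
}
```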

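Usage

Both boost and std provide these. boost is said to be somewhat more efficient than std.

Definitions: the mutex objects are boost::shared_mutex and boost::mutex.
lock_guard, shared_lock and unique_lock are all class templates used to manage a mutex.

The T in boost::shared_lock<T> has to be a shared mutex type such as shared_mutex.
The T in unique_lock<T> can be any of the mutex classes. With shared_mutex, the constructor of boost::unique_lock<boost::shared_mutex> calls shared_mutex's lock method (exclusive) and the destructor calls unlock; it is boost::shared_lock<boost::shared_mutex> that calls lock_shared in its constructor and unlock_shared in its destructor. With boost::unique_lock<boost::mutex>, lock and unlock are called in the same way.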

Read-write lock:
typedef boost::shared_lock<boost::shared_mutex> readLock;
typedef boost::unique_lock<boost::shared_mutex> writeLock;
boost::shared_mutex rwmutex;
In use:
readLock lock(rwmutex);   // the guard must be a named object; it unlocks at end of scope

Mutual exclusion lock:
typedef boost::unique_lock<boost::mutex> exclusiveLock;
boost::mutex m;
exclusiveLock lock(m);    // the guard must be a named object; it unlocks at end of scope
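
The same read-write pattern can be written with the std equivalents (std::shared_mutex needs C++17). The sketch below is only an illustration; the Registry class and its methods are made up for this note.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical example class: many readers, occasional writers.
class Registry {
    mutable std::shared_mutex rw_;
    std::map<std::string, int> data_;
public:
    void set(const std::string& key, int value) {
        std::unique_lock<std::shared_mutex> w(rw_);   // exclusive: lock() / unlock()
        data_[key] = value;
    }
    int get(const std::string& key, int fallback) const {
        std::shared_lock<std::shared_mutex> r(rw_);   // shared: lock_shared() / unlock_shared()
        auto it = data_.find(key);
        return it == data_.end() ? fallback : it->second;
    }
};
```

Both guards are named local objects, so the lock is held until the end of the enclosing scope.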

tips

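With one writer and many readers, or many writers and many readers, the thread-safety problems that end in a coredump all come from address access: the start of the data being read has been deleted, a container reallocates (reserve), a map rebalances its tree, a hash table rehashes, an element is deleted outright, and so on. A lone ++ cannot crash this way, so it needs no protection for that reason.

The other issues are visibility and atomicity. Multiple writers without a lock (no atomicity or visibility guarantee) interleave and overwrite each other, so for example fewer ++ take effect than were executed. Readers may see stale data, and a value used in an if condition may not take effect immediately because it still sits in a register or in another CPU's cache.
As for volatile, all it does is stop the compiler from optimizing the access, so the value is not taken from a register. It controls nothing else: later instructions can still be reordered ahead of it, the CPU still has its cache, and the cache's MESI protocol does not give atomicity without a locked instruction, so stale reads can still happen. Use a memory barrier, or simply use atomics or a lock and reduce lock contention.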
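
As a small sketch of the "just use atomics" advice: the counter below stays exact under concurrent ++ because fetch_add is a single atomic read-modify-write. A volatile long in its place would carry no such guarantee. The function name atomic_count is made up for this note.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical demo: `threads` workers each increment the counter `iters` times.
long atomic_count(int threads, int iters) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            for (int j = 0; j < iters; ++j) {
                // Relaxed is enough here: only the count matters, and join()
                // below makes the final value visible to the caller.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter.load();
}
```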

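Kernel primitives (spinlocks, mutexes, memory barriers, etc.) make concurrent access to shared data safe, and they also prevent unwanted optimizations. If these synchronization primitives are used correctly, there is no need for volatile.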
https://lwn.net/Articles/233482/

barrier();
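Stops the compiler from reordering instructions across it. Values held in registers are not reused; they are loaded from memory again.
https://zhuanlan.zhihu.com/p/...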

spinlock

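User space and the kernel handle spinning very differently. The kernel can control a specific CPU, so its logic is much more complex.
A user-space spin can still fall into the kernel and block. The kernel cannot: there it is a true busy loop, so performance has to be considered.

Writing your own spinlock

pthread has a spinlock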

while (!condition) {
    if (count > xxx)  break;       // spin budget used up
    count++;
    __asm__ volatile ("pause");    // x86 hint that this is a spin-wait loop
  }

  mutex();                         // fall back to a blocking mutex
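
A portable version of this spin-then-block idea can be sketched with std::mutex::try_lock. The helper name spin_then_lock and the use of std::this_thread::yield() in place of the x86 pause instruction are assumptions made for this illustration.

```cpp
#include <mutex>
#include <thread>

// Hypothetical helper: try the lock up to `max_spins` times, then block.
// Returns how many attempts failed before the lock was obtained by spinning,
// or `max_spins` if it had to fall back to a blocking lock().
int spin_then_lock(std::mutex& m, int max_spins) {
    for (int spins = 0; spins < max_spins; ++spins) {
        if (m.try_lock()) {
            return spins;             // acquired while spinning
        }
        std::this_thread::yield();    // portable stand-in for `pause`
    }
    m.lock();                         // spin budget used up: block in the kernel
    return max_spins;
}
```

The caller owns the mutex on return and must call m.unlock() itself.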

Kernel spin

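Naive version, not atomic (two CPUs can both see 0 and both set 1):
while (lock->locked);
        lock->locked = 1;
=> make the check and the set one atomic step:
while (test_and_set(&lock->locked));
=> spin on a plain read first and only then try the test-and-set:
while (lock->locked || test_and_set(&lock->locked));
With this scheme, which waiter wins after each release is arbitrary, so some waiters can starve.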
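
The last form above (test, then test-and-set) can be sketched in user-space C++ with std::atomic<bool>. This is an illustration of the idea, not kernel code; the names TtasLock and ttas_count are made up, and yield() is used only to keep the demo polite on machines with few cores.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical test-and-test-and-set spinlock.
class TtasLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        for (;;) {
            // test: spin on a plain load so waiters only read the cache line
            while (locked_.load(std::memory_order_relaxed)) {
                std::this_thread::yield();
            }
            // test-and-set: exchange returns the old value; false means we won
            if (!locked_.exchange(true, std::memory_order_acquire)) {
                return;
            }
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
};

// Hypothetical demo: count under the spinlock from several threads.
long ttas_count(int threads, int iters) {
    TtasLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            for (int j = 0; j < iters; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```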
Fix: introduce an owner and a queue (a ticket lock)
struct spinlock {
        unsigned short owner;
        unsigned short next;
};
void spin_lock(struct spinlock *lock)
{
        unsigned short next = xadd(&lock->next, 1);
        while (lock->owner != next);
}
void spin_unlock(struct spinlock *lock)
{
        lock->owner++;
}
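
The same ticket scheme can be sketched in portable C++, with fetch_add playing the role of xadd. The names TicketLock and ticket_count are made up for this note, and the memory orders are one reasonable choice rather than the kernel's.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical user-space ticket lock.
class TicketLock {
    std::atomic<unsigned> next_{0};    // next ticket to hand out
    std::atomic<unsigned> owner_{0};   // ticket currently being served
public:
    void lock() {
        unsigned my = next_.fetch_add(1, std::memory_order_relaxed);  // take a ticket
        while (owner_.load(std::memory_order_acquire) != my) {
            std::this_thread::yield();   // wait for our turn, in FIFO order
        }
    }
    void unlock() {
        // Only the holder writes owner_, so a load followed by a store is safe here.
        owner_.store(owner_.load(std::memory_order_relaxed) + 1,
                     std::memory_order_release);
    }
};

// Hypothetical demo: count under the ticket lock from several threads.
long ticket_count(int threads, int iters) {
    TicketLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            for (int j = 0; j < iters; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```

Because tickets are served in order, no waiter can be overtaken indefinitely.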
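Every time a CPU takes or releases the spinlock, the write invalidates the spinlock's cache line on the other CPUs, so the line bounces between all the CPU caches. => Give each CPU its own structure to spin on and link these structures into a list (the idea behind the MCS lock).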
https://zhuanlan.zhihu.com/p/89058726
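
A hedged user-space sketch of the per-waiter node idea, following the classic MCS lock: each waiter spins only on its own node, and the nodes form a linked queue. The names McsNode, McsLock and mcs_count are made up here; the kernel's implementation differs in many details.

```cpp
#include <atomic>
#include <thread>
#include <vector>

struct McsNode {                          // one node per waiting thread
    std::atomic<McsNode*> next{nullptr};
    std::atomic<bool> locked{false};
};

class McsLock {
    std::atomic<McsNode*> tail_{nullptr}; // last waiter in the queue
public:
    void lock(McsNode& me) {
        me.next.store(nullptr, std::memory_order_relaxed);
        me.locked.store(true, std::memory_order_relaxed);
        McsNode* prev = tail_.exchange(&me, std::memory_order_acq_rel);
        if (prev == nullptr) return;      // queue was empty: the lock is ours
        prev->next.store(&me, std::memory_order_release);  // link behind predecessor
        while (me.locked.load(std::memory_order_acquire)) {
            std::this_thread::yield();    // spin on our own node only
        }
    }
    void unlock(McsNode& me) {
        McsNode* succ = me.next.load(std::memory_order_acquire);
        if (succ == nullptr) {
            McsNode* expected = &me;
            // No successor visible: try to mark the queue empty.
            if (tail_.compare_exchange_strong(expected, nullptr,
                                              std::memory_order_acq_rel)) {
                return;
            }
            // A successor is enqueueing; wait until it has linked itself.
            while ((succ = me.next.load(std::memory_order_acquire)) == nullptr) {
                std::this_thread::yield();
            }
        }
        succ->locked.store(false, std::memory_order_release);  // hand over the lock
    }
};

// Hypothetical demo: count under the MCS lock from several threads.
long mcs_count(int threads, int iters) {
    McsLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            McsNode node;                 // this thread's private queue node
            for (int j = 0; j < iters; ++j) {
                lock.lock(node);
                ++counter;
                lock.unlock(node);
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```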

Semaphores

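A semaphore can sleep and can admit more than one holder.
Originally pthread_mutex did not work across processes. Support was added later, but not every platform has it. Semaphores were the original cross-process mechanism.
Lock (down): under the protection of a spinlock, add itself to the wait list, release the spinlock and schedule out; after coming back, take the spinlock again and check whether up has been set; return if it has, otherwise loop.
Unlock (up): under the protection of a spinlock, take the first entry from the wait list, remove it, set its up flag, and wake it.
https://zhuanlan.zhihu.com/p/...
Used in pfs for inter-process synchronization.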
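
In user space the same down/up behaviour can be sketched with a mutex and a condition variable standing in for the spinlock and the wait list. This is an illustration only, not the kernel code; the names Semaphore and max_holders are made up.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counting semaphore.
class Semaphore {
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
public:
    explicit Semaphore(int initial) : count_(initial) {}
    void down() {                        // may sleep until a permit is free
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return count_ > 0; });
        --count_;
    }
    void up() {                          // return a permit and wake one waiter
        {
            std::lock_guard<std::mutex> guard(m_);
            ++count_;
        }
        cv_.notify_one();
    }
};

// Hypothetical demo: returns the largest number of threads that were
// inside the guarded region at the same time.
int max_holders(int permits, int threads) {
    Semaphore sem(permits);
    std::atomic<int> inside{0};
    std::atomic<int> peak{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            sem.down();
            int now = inside.fetch_add(1) + 1;
            int prev = peak.load();
            while (prev < now && !peak.compare_exchange_weak(prev, now)) {}
            std::this_thread::yield();
            inside.fetch_sub(1);
            sem.up();
        });
    }
    for (auto& t : pool) t.join();
    return peak.load();
}
```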

ABA

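The ABA problem in rocksdb's lock-free queue
Suppose location V stores the head node of a linked list. In a list hit by the ABA problem, the original head is node1 and thread 2 changes the head twice: most likely it first makes node2 the head, and then inserts node1 at the front again as the new head (in C++ this can also be a newly allocated node3 whose pointer happens to equal that of the already freed node1).

For thread 1 the head is still node1 (or rather the head's value is, because in C++ the address is the same but the content may now be node3's), so its CAS succeeds, but the state of the sublist after the head is no longer predictable.

Create a global array HP hp[N] whose elements are pointers, called hazard pointers. The size of the array is the number of threads, so each thread owns one HP.
By convention each thread may only modify its own HP and never another thread's HP, but it may read the other threads' HP values.
When a thread is about to access a critical data node, it first stores that node's pointer into its own HP, which tells everyone else not to free this node.
Each thread keeps a private list (free list). When the thread wants to free a node, it puts the node on its own list, and once the list reaches a set size R it walks the list and frees every node that can be freed.
When a thread is about to free a node, it checks the global HP array. If no thread's HP equals the node's pointer, it frees the node; otherwise it does not free it and leaves the node on its own list.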
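
The retire-and-scan part of the scheme above can be sketched as follows. It shows only the bookkeeping (global HP array, private retire list, threshold R), not a complete lock-free structure. All names (kThreads, kR, Node, hp, RetireList) are made up for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cstddef>
#include <vector>

constexpr int kThreads = 4;      // N: one hazard pointer slot per thread (assumed)
constexpr std::size_t kR = 2;    // R: retire-list size that triggers a scan (assumed)

struct Node { int value; };

// Global HP array. Thread i writes only hp[i]; every thread may read all slots.
std::array<std::atomic<Node*>, kThreads> hp{};

// Thread-private list of nodes waiting to be freed.
struct RetireList {
    std::vector<Node*> pending;
    int freed = 0;               // number of nodes actually deleted so far

    void retire(Node* n) {
        pending.push_back(n);
        if (pending.size() >= kR) scan();
    }

    // Free every pending node that no hazard pointer currently protects.
    void scan() {
        std::vector<Node*> keep;
        for (Node* n : pending) {
            bool hazardous = std::any_of(
                hp.begin(), hp.end(),
                [n](const std::atomic<Node*>& slot) { return slot.load() == n; });
            if (hazardous) {
                keep.push_back(n);   // someone is still using it: try again later
            } else {
                delete n;
                ++freed;
            }
        }
        pending.swap(keep);
    }
};
```

A reader would publish a node with hp[my_index].store(node) before dereferencing it and reset the slot to nullptr afterwards.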
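Isn't this the same as a file that cannot be deleted while someone else still holds it? In a lock-free list with no delete, it is next that gets compared. The problem is that CAS compares the raw pointer directly.
https://www.drdobbs.com/lock-... This solves the freeing problem: it amounts to keeping a queue of nodes to free and not freeing them yet. But it does not solve the case where a new allocation gets the same block of memory and CAS then compares that value. The free can be delayed but the assignment cannot, so a version is still needed.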
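
One hedged way to picture "carrying a version": pack the head together with a counter that is bumped on every successful update, so a stale CAS fails even when the head value itself has come back to A. The sketch assumes the head is a 32-bit index into some node pool, which is an assumption of this example and not something the notes specify. All names are made up.

```cpp
#include <atomic>
#include <cstdint>

// Pack a 32-bit head index and a 32-bit version into one 64-bit word.
constexpr std::uint64_t pack(std::uint32_t index, std::uint32_t version) {
    return (static_cast<std::uint64_t>(version) << 32) | index;
}
constexpr std::uint32_t index_of(std::uint64_t h) {
    return static_cast<std::uint32_t>(h);
}
constexpr std::uint32_t version_of(std::uint64_t h) {
    return static_cast<std::uint32_t>(h >> 32);
}

// Hypothetical helper: replace the head index only if `head` still equals the
// snapshot `expected`. Every successful update bumps the version.
bool replace_head(std::atomic<std::uint64_t>& head,
                  std::uint64_t expected,
                  std::uint32_t new_index) {
    std::uint64_t desired = pack(new_index, version_of(expected) + 1);
    return head.compare_exchange_strong(expected, desired);
}
```

If thread 1 snapshots the head as (A, v0) and thread 2 then moves it A -> B -> A, the head becomes (A, v2), so thread 1's CAS against (A, v0) fails although the index matches.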