Java for Beginners (5): The volatile Keyword

1. Introduction

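Earlier we analyzed the properties of the synchronized keyword: atomicity, visibility, ordering, and reentrancy. The JDK keeps optimizing this built-in lock; as mentioned in the "Advanced" article, it has four states (no lock -> biased lock -> lightweight lock -> heavyweight lock). Under heavy contention, however, it still ends up inflating to a heavyweight lock.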

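In "Beginner (2)" we also mentioned the volatile keyword in passing. Its difference from synchronized is that volatile does not guarantee atomicity!
Note: that does not mean a volatile variable has no atomicity at all!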

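Why is that?
Because synchronized guards a whole block of code: through the monitor, the entire block (or, for a synchronized method, the entire method, identified by the ACC_SYNCHRONIZED flag) executes as one atomic unit. volatile, by contrast, is atomic only for a single read or a single write; a compound operation is not atomic.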

For example:

public class Demo {
    private volatile int i = 0;
    public void increase() {
        i++; // not an atomic operation
    }
}

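The "i++" operation actually consists of several steps: the language lets you write it as one statement, but the compiler splits it into multiple instructions.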


Inspecting the bytecode with javap:

public class com.chris.demo.Demo {
  public com.chris.demo.Demo();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: aload_0
       5: iconst_0
       6: putfield      #2                  // Field i:I
       9: return

  public void increase();
    Code:
       0: aload_0
       1: dup
       2: getfield      #2                  // Field i:I
       5: iconst_1
       6: iadd
       7: putfield      #2                  // Field i:I
      10: return
}

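The bytecode of increase() breaks down into three steps:
Offsets 0 to 2: read the value of field i from memory onto the operand stack;
Offsets 5 to 6: add 1 to that value;
Offset 7: write the new value of i back to memory.
Each step is atomic on its own, but another thread can run between the read and the write, so concurrent increments can overwrite each other.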
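
To make the lost update concrete, here is a minimal sketch that increments a volatile int and an AtomicInteger side by side from several threads. The class name CounterDemo, the helper run, and the thread and iteration counts are illustrative assumptions, not part of the original example. AtomicInteger is just one standard way to get an atomic increment; a synchronized method would work as well.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    private volatile int volatileCount = 0;
    private final AtomicInteger atomicCount = new AtomicInteger();

    void increaseBoth() {
        volatileCount++;               // read, add, write: another thread can interleave
        atomicCount.incrementAndGet(); // one atomic read-modify-write
    }

    // Starts `threads` workers that each call increaseBoth() `perThread` times.
    // Returns { final volatile count, final atomic count }.
    static int[] run(int threads, int perThread) {
        CounterDemo demo = new CounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    demo.increaseBoth();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return new int[] { demo.volatileCount, demo.atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] result = run(4, 100_000);
        System.out.println("volatile int:  " + result[0] + " (can be less than 400000)");
        System.out.println("AtomicInteger: " + result[1]);
    }
}
```

The atomic counter always reaches the full total, while the volatile counter can fall short by however many increments were overwritten in a given run.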

2. What volatile Does

2.1 Preventing Reordering

Let's start with a classic example. There are many ways to implement a singleton; one of them is DCL (Double-Checked Locking), implemented as follows:

If you write it like this, it is wrong:

public class Singleton {
    private static Singleton singleton;
    private Singleton() {}

    public static Singleton getInstance() {
        if (singleton == null) {
            synchronized (Singleton.class) {
                if (singleton == null) {
                    singleton = new Singleton();
                }
            }
        }
        return singleton;
    }
}

The correct version is:

public class Singleton {
    private volatile static Singleton singleton; // note the added volatile
    private Singleton() {}

    public static Singleton getInstance() {
        if (singleton == null) {
            synchronized (Singleton.class) {
                if (singleton == null) {
                    singleton = new Singleton();
                }
            }
        }
        return singleton;
    }
}

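Let's analyze why the first version is wrong and the second is required. The key is the JVM's execution order: it may reorder instructions that do not depend on each other. In the incorrect DCL, "singleton = new Singleton()" can therefore execute in either of two orders:

First order:

a. Allocate memory
b. Initialize the memory
c. Assign the memory address to the instance reference

Second order:

a. Allocate memory
c. Assign the memory address to the instance reference
b. Initialize the memory

You may find this odd: the second order is clearly wrong, so why can it happen?

Steps b and c both depend on step a, but they do not depend on each other, so the JVM may reorder them. With the second order, another thread can pass the first null check after step c but before step b, and return an object that has not been initialized yet.
Once we add the volatile keyword, the second order is ruled out (reordering is prevented), so the reference is assigned only after the object has been initialized.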
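
As a quick sanity check of the volatile version, the sketch below calls getInstance() from several threads at once and counts how many distinct objects come back. The names LazySingleton, DclDemo, and distinctInstances are hypothetical and exist only for this illustration. Such a test cannot expose the reordering bug of the non-volatile version on demand; it only shows how the correct version is used.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

final class LazySingleton {
    private static volatile LazySingleton instance;

    private LazySingleton() {}

    static LazySingleton getInstance() {
        if (instance == null) {                      // first check, no lock
            synchronized (LazySingleton.class) {
                if (instance == null) {              // second check, under the lock
                    // Conceptually: (a) allocate, (b) initialize, (c) assign.
                    // The volatile write keeps (c) from becoming visible before (b).
                    instance = new LazySingleton();
                }
            }
        }
        return instance;
    }
}

class DclDemo {
    // Releases `threads` workers at the same moment and counts distinct instances seen.
    static int distinctInstances(int threads) {
        Set<LazySingleton> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(LazySingleton.getInstance());
            });
            workers[t].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct instances: " + distinctInstances(8));
    }
}
```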

2.2 Providing Visibility

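Visibility was explained in the synchronized article: when multiple threads access a shared variable and one thread modifies it, the other threads can immediately read the latest value. Visibility problems exist because each thread has its own fast cache, the thread's working memory (a memory local to each thread; the point is that it is fast).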

Let's look at an example:

public class Demo {
    private int a = 1;
    private int b = 2;

    private void change() {
        a = 3;
        b = a;
    }

    private void print() {
        System.out.println("print => b = " + b + ", a = " + a);
    }

    public static void main(String[] args) {
        while (true) {
            final Demo test = new Demo();
            new Thread(() -> {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                test.change();
            }).start();

            new Thread(() -> {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                test.print();
            }).start();
        }
    }
}

Can you guess what gets printed (stop the program after it has run for a few dozen seconds)?

...
print => b = 2, a = 1
...
print => b = 3, a = 3
...
print => b = 3, a = 1
...

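In theory, if every write became visible in program order, the output would be "b = 2, a = 1" or "b = 3, a = 3" (or "b = 2, a = 3" when print() runs between the two assignments). In practice "b = 3, a = 1" also shows up. Why?
Because the first thread's write of 3 to a was not yet visible to the second thread, even though its later write to b was. If both variables are declared volatile, their writes become visible to all threads in order.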
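
The a/b race above is hard to reproduce on demand, so here is a simpler sketch of the same visibility guarantee: a stop flag read by one thread and written by another. The class StopFlagDemo and its method names are assumptions made for this illustration. With volatile, the worker is guaranteed to observe the write and leave its loop; without it, the JVM is allowed to keep using a stale value and the loop may never end.

```java
class StopFlagDemo {
    // Remove `volatile` and the worker below is no longer guaranteed to stop.
    private volatile boolean running = true;

    void spin() {
        while (running) {
            // busy-wait until another thread clears the flag
        }
    }

    void stop() {
        running = false;
    }

    // Starts a spinning worker, clears the flag, and reports whether the
    // worker terminated within the given timeout.
    static boolean workerStops(long timeoutMillis) {
        StopFlagDemo demo = new StopFlagDemo();
        Thread worker = new Thread(demo::spin);
        worker.setDaemon(true); // do not keep the JVM alive if it never stops
        worker.start();
        try {
            Thread.sleep(50);   // let the worker enter its loop
            demo.stop();
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStops(5_000));
    }
}
```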

2.3 Guaranteeing Atomicity

Atomicity has also been explained before, and the JLS describes it as well:

17.7 Non-Atomic Treatment of double and long

For the purposes of the Java programming language memory model, a single write to a non-volatile 
long or double value is treated as two separate writes: one to each 32-bit half. This can result 
in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the 
second 32 bits from another write.

Writes and reads of volatile long and double values are always atomic.

Writes to and reads of references are always atomic, regardless of whether they are implemented 
as 32-bit or 64-bit values.

Some implementations may find it convenient to divide a single write action on a 64-bit long or 
double value into two write actions on adjacent 32-bit values. For efficiency's sake, this behavior 
is implementation-specific; an implementation of the Java Virtual Machine is free to perform writes 
to long and double values atomically or in two parts.

Implementations of the Java Virtual Machine are encouraged to avoid splitting 64-bit values where 
possible. Programmers are encouraged to declare shared 64-bit values as volatile or synchronize 
their programs correctly to avoid possible complications.

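In short:
Reads and writes of long and double are not guaranteed to be atomic. For the purposes of the JMM, a single write to a non-volatile long or double is treated as two separate writes, one to the high 32 bits and one to the low 32 bits. It is therefore recommended to declare shared long and double variables as volatile, whose reads and writes are always atomic.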
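
The guarantee can be sketched as follows: a writer alternates a volatile long between all bits clear (0L) and all bits set (-1L), while a reader checks that it never sees a mixture of the two halves. TornReadDemo and countTornReads are hypothetical names for this illustration. Because the field is volatile, the count must be zero; on most 64-bit JVMs a non-volatile long is unlikely to tear either, but the specification does not promise that.

```java
class TornReadDemo {
    // volatile makes every read and write of this 64-bit value atomic
    private volatile long value = 0L;

    // Runs one writer thread for `writes` assignments while the calling
    // thread reads concurrently; returns the number of torn values observed.
    static long countTornReads(int writes) {
        TornReadDemo demo = new TornReadDemo();
        Thread writer = new Thread(() -> {
            for (int n = 0; n < writes; n++) {
                demo.value = (n % 2 == 0) ? -1L : 0L; // all ones, then all zeros
            }
        });
        writer.start();

        long torn = 0;
        while (writer.isAlive()) {
            long seen = demo.value;
            if (seen != 0L && seen != -1L) {
                torn++; // would mean the halves came from two different writes
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return torn;
    }

    public static void main(String[] args) {
        System.out.println("torn reads: " + countTornReads(5_000_000));
    }
}
```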

3. How volatile Works

I will cover the deeper principles of volatile in a separate article, since they involve the JMM (Java Memory Model).

3.1 How Visibility Works

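Every thread has its own working memory, a local memory. A thread does not exchange data with main memory directly; it performs its operations through working memory, and that is why data written by one thread may be invisible to the others.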

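A volatile variable is visible to other threads because:

Writing the variable forces the new value to be flushed to the corresponding variable in main memory; and
The copies of the variable in other threads' working memory are invalidated, so those threads must re-read the value from main memory.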

3.2 How Ordering Works

This involves Java's happens-before rules, which JSR 133 defines as follows:

Two actions can be ordered by a happens-before relationship. If one action happens before another, then the first is visible to and ordered before the second.

Put simply, if a happens-before b, then everything a did is visible to b. (Keep this in mind, because the term happens-before is easily misread as being only about order in time.)

Now let's see which happens-before rules JSR 133 defines:

Each action in a thread happens before every subsequent action in that thread.
An unlock on a monitor happens before every subsequent lock on that monitor.
A write to a volatile field happens before every subsequent read of that volatile.
A call to start() on a thread happens before any actions in the started thread.
All actions in a thread happen before any other thread successfully returns from a join() on that thread.
If an action a happens before an action b, and b happens before an action c, then a happens before c.

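In other words:

Within a single thread, each action happens-before every later action. (That is, within one thread the code behaves as if executed in program order. The compiler and processor may still reorder operations as long as the single-threaded result is unchanged, and that is legal. In other words, this rule by itself does not rule out compiler reordering or instruction reordering.)
An unlock on a monitor happens-before every subsequent lock on that monitor. (synchronized rule)
A write to a volatile variable happens-before every subsequent read of it. (volatile rule)
A call to start() on a thread happens-before every action in the started thread. (thread start rule)
All actions in a thread happen-before another thread's successful return from join() on that thread.
If a happens-before b and b happens-before c, then a happens-before c. (transitivity)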
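
The program-order rule, the volatile rule, and transitivity combine into a common publication pattern, sketched below. PublishDemo and its members are hypothetical names chosen for this illustration. The write to the plain field data happens-before the volatile write to ready; once the reader sees ready == true, that volatile read happens-before its read of data, so the reader must see 42.

```java
class PublishDemo {
    private int data = 0;                   // plain, non-volatile field
    private volatile boolean ready = false; // the volatile "gate"

    void write() {
        data = 42;    // (1) plain write
        ready = true; // (2) volatile write, ordered after (1)
    }

    int read() {
        while (!ready) {
            // (3) spin on the volatile read until the write is observed
        }
        return data;  // (4) plain read, guaranteed to see the value from (1)
    }

    // Runs one reader and one writer thread and returns what the reader saw.
    static int publishAndRead() {
        PublishDemo demo = new PublishDemo();
        int[] result = new int[1];
        Thread readerThread = new Thread(() -> result[0] = demo.read());
        Thread writerThread = new Thread(demo::write);
        readerThread.start();
        writerThread.start();
        try {
            readerThread.join(); // join() makes result[0] visible to this thread
            writerThread.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw data = " + publishAndRead()); // prints 42
    }
}
```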

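Here we focus on the third rule, the one that gives volatile variables their ordering guarantee. To implement volatile memory semantics, the JMM restricts both kinds of reordering (compiler and processor) around volatile variables.

[Table: reordering rules the JMM specifies for volatile variables]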

3.3 Memory Barriers

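To implement volatile's visibility and happens-before semantics, the JVM relies on something called a "memory barrier". A memory barrier, also called a memory fence, is a set of processor instructions that constrain the order of memory operations.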

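LoadLoad barrier

Execution order: Load1 -> LoadLoad -> Load2

Ensures that the data loaded by Load1 is available before Load2 and all subsequent load instructions load their data.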

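StoreStore barrier

Execution order: Store1 -> StoreStore -> Store2

Ensures that the data written by Store1 is visible to other processors before Store2 and all subsequent store instructions execute.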

LoadStore barrier

Execution order: Load1 -> LoadStore -> Store2

Ensures that the data loaded by Load1 is available before Store2 and all subsequent store instructions execute.

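StoreLoad barrier

Execution order: Store1 -> StoreLoad -> Load2

Ensures that the data written by Store1 is visible to other processors before Load2 and all subsequent load instructions read their data.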
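
Barriers themselves are not visible in Java source, but their effect is. The sketch below shows the observable result of the store-then-load ordering described above: each thread writes one volatile field and then reads the other. Assuming the JVM separates a volatile store from a following volatile load with a StoreLoad barrier (a common implementation strategy, not something this article confirms for any particular JVM), the two threads can never both read 0. StoreLoadDemo and countBothZero are hypothetical names for this illustration.

```java
class StoreLoadDemo {
    private volatile int x = 0;
    private volatile int y = 0;
    private int r1; // what thread A read from y
    private int r2; // what thread B read from x

    // Repeats the two-thread experiment `rounds` times and counts how often
    // both threads read 0, which volatile semantics rule out.
    static int countBothZero(int rounds) {
        int bothZero = 0;
        for (int i = 0; i < rounds; i++) {
            StoreLoadDemo d = new StoreLoadDemo();
            Thread a = new Thread(() -> {
                d.x = 1;    // Store1 (volatile)
                d.r1 = d.y; // Load2 (volatile), must not move above Store1
            });
            Thread b = new Thread(() -> {
                d.y = 1;
                d.r2 = d.x;
            });
            a.start();
            b.start();
            try {
                a.join();
                b.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
            if (d.r1 == 0 && d.r2 == 0) {
                bothZero++;
            }
        }
        return bothZero;
    }

    public static void main(String[] args) {
        System.out.println("both-zero outcomes: " + countBothZero(2_000));
    }
}
```

If x and y were plain fields, the both-zero outcome would be permitted, because a store could be delayed past the following load.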

4. Summary

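Overall, volatile is fairly hard to understand. If it has not fully clicked yet, don't worry: full understanding takes time, and volatile will show up in many more scenarios in later articles. For now, the goal is a basic grasp of its fundamentals and principles.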

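In short, volatile is an optimization in concurrent programming that can replace synchronized in some scenarios. It cannot replace synchronized completely, though; it only fits certain special cases. Specifically, both of the following conditions must hold for volatile to be thread-safe in a concurrent environment: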

Writes to the variable do not depend on its current value.
The variable does not take part in an invariant with other variables.
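
As an illustration of the second condition, consider a range whose two bounds must satisfy lower <= upper. Each setter has to check one field against the other before writing, and making both fields volatile would not make that check-then-write atomic. The sketch below therefore uses synchronized instead; the class NumberRange and its methods are hypothetical names for this example.

```java
class NumberRange {
    // Invariant: lower <= upper. Both fields are guarded by `this`.
    private int lower;
    private int upper;

    NumberRange(int lower, int upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower must not exceed upper");
        }
        this.lower = lower;
        this.upper = upper;
    }

    // The check and the write happen under one lock, so no other thread
    // can change `upper` between them.
    synchronized boolean setLower(int value) {
        if (value > upper) {
            return false;
        }
        lower = value;
        return true;
    }

    synchronized boolean setUpper(int value) {
        if (value < lower) {
            return false;
        }
        upper = value;
        return true;
    }

    synchronized boolean isValid() {
        return lower <= upper;
    }

    // Two threads hammer the setters; returns whether the invariant survived.
    static boolean holdsUnderContention(int attempts) {
        NumberRange range = new NumberRange(0, 10);
        Thread a = new Thread(() -> {
            for (int n = 0; n < attempts; n++) range.setLower(n % 20);
        });
        Thread b = new Thread(() -> {
            for (int n = 0; n < attempts; n++) range.setUpper(n % 20);
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return range.isValid();
    }
}
```

A status flag like a simple on/off switch satisfies both conditions, which is why volatile alone is enough there.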