Meltdown CPU Bug for Dummies
  • Processors have an MMU (Memory Management Unit). It is responsible for checking which memory a given process may access. For example, it prevents a process from reading or writing outside its designated space.
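
    As a rough picture of that check, here is a minimal sketch in C. It assumes a toy model in which a process owns one contiguous range of addresses; real MMUs work with page tables, and toy_region and toy_mmu_allows are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: a process owns addresses [base, base + size). */
typedef struct {
    size_t base;
    size_t size;
} toy_region;

/* Returns true if the process may touch addr, false otherwise. */
bool toy_mmu_allows(const toy_region *r, size_t addr) {
    return addr >= r->base && addr - r->base < r->size;
}
```

    With a region of base 0 and size 10000, address 9999 is allowed and address 15000 is refused.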

    The issue is that during speculative execution the CPU can run ahead of the MMU check for a while. Although it stops properly once the check fails, an attacker can still obtain the results of that execution indirectly.

    All Intel CPUs since the Pentium Pro era are affected. Why is AMD not affected? Its speculative execution unit works differently and asks the MMU sooner, so it stops earlier. Why did Intel do this? Because the current approach makes CPUs faster and cooler.

    Suppose we have code like this:

    if (x < array1_size) { y = array1[x]; }

    With sequential execution the processor always checks the condition first, in this case that x is within the bounds of the array, and executes the following line only if the condition is true.

    During speculative execution statistical branch prediction is used. For example, after a certain number of such ifs turn out true, the processor starts to assume that the condition is almost always true, so it begins executing the second line before the condition has been checked.

    What happens if the condition turns out to be false? The processor flushes the results it calculated ahead of time and executes the proper branch instead. It also tells its branch predictor that the probability of this branch being taken must be reduced.
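
    The training and un-training described above can be pictured with a two-bit saturating counter, a common textbook predictor model. This is only a minimal sketch and not how any particular Intel or AMD predictor works; toy_predictor and the functions around it are hypothetical names.

```c
#include <stdbool.h>

/* Toy two-bit saturating counter:
   values 0..1 predict "condition false", values 2..3 predict "condition true". */
typedef struct {
    unsigned counter;
} toy_predictor;

/* What the predictor currently guesses for the next if. */
bool toy_predict(const toy_predictor *p) {
    return p->counter >= 2;
}

/* Feed the real outcome back: move towards 3 on true, towards 0 on false. */
void toy_update(toy_predictor *p, bool condition_was_true) {
    if (condition_was_true) {
        if (p->counter < 3) p->counter++;
    } else {
        if (p->counter > 0) p->counter--;
    }
}

/* Start from 0, observe n true outcomes, return the guess for the next one. */
bool toy_prediction_after_training(unsigned n) {
    toy_predictor p = { 0 };
    for (unsigned i = 0; i < n; i++) {
        toy_update(&p, true);
    }
    return toy_predict(&p);
}
```

    In this model two true outcomes in a row are enough to make the predictor guess "true", and a single wrong guess from the saturated state lowers the counter without flipping the prediction yet.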

    What if the index x lies not only outside our array, but so far outside that it is also outside our process's memory space? The processor does the same thing. It still executes the second line ahead of the condition, because it does not wait for the MMU to say whether the process is allowed to read this address; that answer arrives only later. As before, if the MMU reports that the process has no right to access this memory, the processor flushes the result. A modern processor can execute many instructions (around 20) before the flush happens.

    So far all is well: the results of the incorrect branch are flushed and are not stored in RAM or in CPU registers. All you can really detect is a small speed hit caused by flushing and returning to the correct branch.

    Here comes the fun part. The CPU also has a cache. It works entirely on its own and holds the data the CPU has used recently. The issue is that if the incorrect branch above read from array1 at index x and the branch was then flushed, the cache still holds the value. The cache is a fully internal structure, so you can't just read it. But it turns out you can figure out what it holds, including our value.

    The next step down the rabbit hole is timing. If software accesses the same address again, it can determine whether the read was fast (from cache) or slow (from RAM). So for now we can't read the value directly, but we can determine whether someone accessed it recently.
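
    The fast-or-slow test can be sketched with a simulated cache. This is a toy model, not a real measurement: the cycle counts TOY_HIT_COST and TOY_MISS_COST are invented for illustration, and toy_cache and its functions are hypothetical names. A real attack would time actual loads with the CPU's cycle counter instead.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_LINES     256
#define TOY_HIT_COST  4u    /* assumed cost of a read served from cache */
#define TOY_MISS_COST 200u  /* assumed cost of a read served from RAM */

typedef struct {
    bool cached[TOY_LINES];
} toy_cache;

/* Empty the simulated cache. */
void toy_flush(toy_cache *c) {
    for (size_t i = 0; i < TOY_LINES; i++) {
        c->cached[i] = false;
    }
}

/* Read one line: report the simulated latency and leave the line cached. */
unsigned toy_access(toy_cache *c, size_t line) {
    unsigned cost = c->cached[line] ? TOY_HIT_COST : TOY_MISS_COST;
    c->cached[line] = true;
    return cost;
}

/* The side channel: a fast read means somebody touched this line recently.
   Note that probing itself caches the line, just as a real probe does. */
bool toy_was_recently_accessed(toy_cache *c, size_t line) {
    return toy_access(c, line) < TOY_MISS_COST;
}
```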

    This is where CPU instructions come in: the idea is to use indirect addressing.

    Like this:

    y = array1[x];
    z = array2[y];

    Just in assembler.

    Remember that our wrong branch can be quite long.
    The idea is not only to read the value at the specified address, but also to use that value for indirect addressing of another location (one whose address we are allowed to access!).

    Until the patches of the last few days, each process had the OS kernel (and hence all other memory) mapped into its own address space, but protected by the MMU.

    Here is the attack, built from the elements discussed above:

    Suppose our own address space occupies addresses 0..9999, and the kernel is mapped at addresses 10000..20000.

    We build a special execution branch that will run in speculative execution mode.

    • First it reads memory_space at address 15000 (which is in the kernel!): x=memory_space[15000]. Suppose x is equal to 78, although our software has no idea of that yet.
    • Using indirect addressing, our code then reads y=memory_space[x].
    • The MMU stops the execution and the results are flushed (no exception is raised, as this was just a wrong prediction!).
    • Now our application reads its own memory space, z=memory_space[i], with "i" going from zero to the maximum possible value of x. The reads must be in random (not sequential) order, otherwise the hardware prefetcher can mask the timing difference.
    • At address 78 the access time will be significantly shorter, because the processor already accessed this address in the second step, during speculative execution. So we now know that x was equal to 78.
    • Hence we can effectively read kernel memory.
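
    The steps above can be tied together in a deterministic toy simulation. It is only a sketch under strong assumptions: nothing here is actually speculative or timed, the cache is a plain array of flags, the cycle counts are invented, and all sim_ names are hypothetical. It uses the address layout from the example (user space 0..9999, kernel 10000..20000).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_MEM_SIZE 20001  /* addresses 0..20000; kernel starts at 10000 */
#define SIM_VALUES   256    /* one byte can hold 256 different values */

typedef struct {
    unsigned char memory_space[SIM_MEM_SIZE];
    bool cached[SIM_VALUES];  /* cache state of user addresses 0..255 */
} sim_machine;

/* Steps 1-3: both reads run before the MMU verdict arrives.
   x and y are thrown away by the flush, the cache footprint is not. */
void sim_speculate(sim_machine *m, size_t kernel_addr) {
    unsigned char x = m->memory_space[kernel_addr]; /* forbidden read */
    m->cached[x] = true;  /* y = memory_space[x] pulls address x into cache */
}

/* Simulated latency of reading user address i (assumed cycle counts). */
unsigned sim_read_cost(const sim_machine *m, size_t i) {
    return m->cached[i] ? 4u : 200u;
}

/* Steps 4-5: probe every candidate in a scrambled order, keep the fast one.
   Returns the recovered byte, or -1 if no candidate was fast. */
int sim_recover_byte(sim_machine *m, size_t kernel_addr) {
    memset(m->cached, 0, sizeof m->cached);  /* start with a cold cache */
    sim_speculate(m, kernel_addr);
    int found = -1;
    for (size_t k = 0; k < SIM_VALUES; k++) {
        size_t i = (k * 167 + 13) % SIM_VALUES;  /* visits 0..255 once each, non-sequentially */
        if (sim_read_cost(m, i) < 100u) {
            found = (int)i;
        }
    }
    return found;
}
```

    If memory_space[15000] holds 78, sim_recover_byte(&m, 15000) returns 78 without the probing code ever reading address 15000 itself. Real exploits typically multiply the leaked value by a page-sized stride so that each candidate lands on its own cache line; the sketch skips that detail.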

    CPUs affected

    • ARM Cortex-A15, Cortex-A57, Cortex-A72 and Cortex-A75.
    • All Intel CPUs including Xeon, Core, all modern Atoms, Celerons, Pentiums.

    Fix

    Removing the kernel memory mapping from user-level process memory.
    This results in a penalty for each syscall (call to kernel functions).
    In addition, each switch causes a Translation Lookaside Buffer flush, further degrading performance.
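
    The effect of the fix can be sketched in the same toy address layout. This is an illustration under assumptions, not the actual kernel patch: toy_page_table and both functions are hypothetical, and the switch count assumes one page-table switch on entering the kernel and one on leaving it, each costing a TLB flush.

```c
#include <stdbool.h>
#include <stddef.h>

#define KPTI_USER_END   10000  /* user space: 0..9999 */
#define KPTI_KERNEL_END 20001  /* kernel: 10000..20000 */

/* The page table a process runs on while in user mode. */
typedef struct {
    bool kernel_mapped;  /* true before the fix, false after it */
} toy_page_table;

/* True if the table holds any translation for addr (permissions aside).
   Without a translation there is nothing for speculation to read. */
bool toy_has_translation(const toy_page_table *pt, size_t addr) {
    if (addr < KPTI_USER_END) {
        return true;
    }
    return pt->kernel_mapped && addr < KPTI_KERNEL_END;
}

/* Page-table switches caused by a number of syscalls:
   with the fix, one on entry and one on exit; without it, none. */
unsigned long toy_table_switches(bool fix_enabled, unsigned long syscalls) {
    return fix_enabled ? 2 * syscalls : 0;
}
```

    In this model a user-mode table with kernel_mapped set to false has no translation for address 15000, so the first step of the attack has nothing to fetch, while 1000 syscalls now cost 2000 table switches instead of none.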