The bug is really cool: https://groups.google.com/forum/#!topic/golang-dev/MEATuOi_ei4. It impacts not only crypto/tls, but also crypto/openpgp. If you are using Go to sign messages, it's time to change your RSA private keys. Someone asked me to explain the bug in layman term, and below is my take.
The RSA function calculates $m^e \pmod{pq}$, where $p$ and $q$ are large prime numbers. Modular operations are expensive, people usually use two tricks to speed them up:
a/ Montgomery reduction: calculating 33500 % 99 is hard because 99 is not a nice number. It'll be awesome if we can replace 99 with 100, but that's exactly the trick that Montgomery found.
b/ Chinese Remainder Theorem: in high school you were probably asked to find a number $n$ satisfying $n \pmod{2} = 1$, $n \pmod{3} = 5$, and $n \pmod{5} = 7$. Perhaps your teachers didn't tell you, but you can solve this problem using the Chinese Remainder Theorem. We can use the same trick here, by computing $m^e \pmod{p}$ and $m^e \pmod{q}$ then combining the result we can save a lot of CPU cycles.
Go uses both tricks, but it did the Montgomery reduction incorrectly in a few cases. Instead of returning 0 for x % x, it returned x. It also returned $x + \delta$ for $(x + \delta) \pmod{x}$ -- the correct value should be $\delta$. It's surprising that such a small mistake could lead to leaking RSA private keys.
Suppose there's a RSA oracle that accepts a message $c$ and returns $c^d \pmod {pq}$, where $d$ is the private exponent. Most TLS servers expose such an oracle. Let's see how one can obtain $p$ or $q$ by querying the oracle. Let's look at how CRT is used in RSA. The following values are precomputed and stored as part of the private key inside the oracle:
Suppose that $m_2$ is incorrect and equal to $q + \delta$ (the extra $q$ is there because the Montgomery reduction forgot to remove it). That means $m = m' + q$, in other words $m \equiv m' \pmod{q}$. Raising both sides to the public exponent we have $m^e \equiv m'^{e} \equiv c^{ed} \equiv c \pmod{q}$. The last equation is due to the fact that $ed \equiv 1 \pmod{q - 1}$. Thus, $m^e - c$ is a multiple of $q$, and we can compute $q$ as the gcd of $m^e - c$ and $pq$. In summary, if we raise the signature to the public exponent and subtract the original message, we'll end up with a multiple of $q$. Thanks Adam Langley for this trick :).
Can we exploit if $m_2$ is correct, but either $h$, $m_1$, or both is incorrect? It's unclear, but I'll leave that as an exercise for the readers. Comments are welcome!
The RSA function calculates $m^e \pmod{pq}$, where $p$ and $q$ are large prime numbers. Modular operations are expensive, people usually use two tricks to speed them up:
a/ Montgomery reduction: calculating 33500 % 99 is hard because 99 is not a nice number. It'll be awesome if we can replace 99 with 100, but that's exactly the trick that Montgomery found.
b/ Chinese Remainder Theorem: in high school you were probably asked to find a number $n$ satisfying $n \pmod{2} = 1$, $n \pmod{3} = 5$, and $n \pmod{5} = 7$. Perhaps your teachers didn't tell you, but you can solve this problem using the Chinese Remainder Theorem. We can use the same trick here, by computing $m^e \pmod{p}$ and $m^e \pmod{q}$ then combining the result we can save a lot of CPU cycles.
Go uses both tricks, but it did the Montgomery reduction incorrectly in a few cases. Instead of returning 0 for x % x, it returned x. It also returned $x + \delta$ for $(x + \delta) \pmod{x}$ -- the correct value should be $\delta$. It's surprising that such a small mistake could lead to leaking RSA private keys.
Suppose there's a RSA oracle that accepts a message $c$ and returns $c^d \pmod {pq}$, where $d$ is the private exponent. Most TLS servers expose such an oracle. Let's see how one can obtain $p$ or $q$ by querying the oracle. Let's look at how CRT is used in RSA. The following values are precomputed and stored as part of the private key inside the oracle:
- $p$ and $q$: the primes from the key generation,
- $d_p = d \pmod{p - 1}$,
- $d_q = d \pmod{q - 1}$, and
- $q_{inv} = q^{-1} \pmod{p}$.
These values allow the oracle to compute the exponentiation $c^d \pmod{pq}$ more efficiently as follows:
- $m_1 = c^{d_p} \pmod{p}$
- $m_2 = c^{d_q} \pmod{q}$
- $h = q_{inv}(m_1 - m_2) \pmod{p}$
- $m = m_2 + hq$
Suppose that $m_2$ is incorrect and equal to $q + \delta$ (the extra $q$ is there because the Montgomery reduction forgot to remove it). That means $m = m' + q$, in other words $m \equiv m' \pmod{q}$. Raising both sides to the public exponent we have $m^e \equiv m'^{e} \equiv c^{ed} \equiv c \pmod{q}$. The last equation is due to the fact that $ed \equiv 1 \pmod{q - 1}$. Thus, $m^e - c$ is a multiple of $q$, and we can compute $q$ as the gcd of $m^e - c$ and $pq$. In summary, if we raise the signature to the public exponent and subtract the original message, we'll end up with a multiple of $q$. Thanks Adam Langley for this trick :).
Can we exploit if $m_2$ is correct, but either $h$, $m_1$, or both is incorrect? It's unclear, but I'll leave that as an exercise for the readers. Comments are welcome!