State preparation The initial state of the system is: : |\Psi_0\rangle = |0\rangle^{\otimes n}|\psi\rangle , where |\psi\rangle is the m-qubit state that evolves through U. We first apply the
n-qubit Hadamard gate operation H^{\otimes n} on the first register, which produces the state:|\Psi_1\rangle = (H^{\otimes n}\otimes I_m)|\Psi_0\rangle = \frac{1}{2^{\frac{n}{2}}}(|0\rangle + |1\rangle)^{\otimes n}|\psi\rangle = \frac{1}{2^{n/2}} \sum_{j = 0}^{2^n - 1} |j\rangle |\psi\rangle.Note that here we are switching between binary and n-ary representation for the n-qubit register: the ket |j\rangle on the right-hand side is shorthand for the n-qubit state |j\rangle\equiv \bigotimes_{\ell=0}^{n-1} |j_\ell\rangle, where j=\sum_{\ell=0}^{n-1} j_\ell 2^\ell is the binary decomposition of j.
Controlled-U operations This state |\Psi_1\rangle is then evolved through the controlled-unitary evolution U_C whose action can be written as U_C(|k\rangle\otimes|\psi\rangle) = |k\rangle\otimes( U^{k}|\psi\rangle), for all k=0,...,2^n-1. This evolution can also be written concisely asU_C = \sum_{k=0}^{2^n-1} |k\rangle\!\langle k|\otimes U^k, which highlights its controlled nature: it applies U^k to the second register conditionally to the first register being |k\rangle. Remembering the eigenvalue condition holding for |\psi\rangle, applying U_C to |\Psi_1\rangle thus gives|\Psi_2\rangle \equiv U_C|\Psi_1\rangle = \left(\frac{1}{2^{n/2}}\sum_{k=0}^{2^n-1} e^{2\pi i \theta k}|k\rangle\right)\otimes |\psi\rangle, where we used U^{k}| \psi\rangle = e^{ 2\pi i k\theta}|\psi \rangle. To show that U_C can also be implemented efficiently, observe that we can write U_C = \prod_{\ell=0}^{n-1} C_\ell(U^{2^\ell}), where C_\ell(U^{2^\ell}) denotes the operation of applying U^{2^\ell} to the second register conditionally to the \ell-th qubit of the first register being |1\rangle. Formally, these gates can be characterized by their action asC_\ell(U^k) (|j\rangle\otimes |\psi\rangle) = |j\rangle\otimes ( U^{j_\ell k}|\psi\rangle).This equation can be interpreted as saying that the state is left unchanged when j_\ell=0, that is, when the \ell-th qubit is |0\rangle, while the gate U^k is applied to the second register when the \ell-th qubit is |1\rangle. The composition of these controlled-gates thus gives\prod_{\ell=0}^{n-1} C_\ell(U^{2^\ell})(|j\rangle\otimes|\psi\rangle) = |j\rangle\otimes\left(U^{\sum_{\ell=0}^{n-1} j_\ell 2^\ell} |\psi\rangle\right)= U_C \left( |j\rangle\otimes |\psi\rangle\right), with the last step directly following from the binary decomposition j=\sum_{\ell=0}^{n-1} j_\ell 2^\ell. From this point onwards, the second register is left untouched, and thus it is convenient to write |\Psi_2\rangle=|\tilde\Psi_2\rangle\otimes|\psi\rangle, with |\tilde\Psi_2\rangle the state of the n-qubit register, which is the only one we need to consider for the rest of the algorithm.
Apply inverse quantum Fourier transform The final part of the circuit involves applying the inverse
quantum Fourier transform (QFT) \mathcal{QFT} on the first register of |\Psi_2\rangle:|\tilde\Psi_3\rangle = \mathcal{QFT}^{-1}_{2^n} |\tilde\Psi_2\rangle.The QFT and its inverse are characterized by their action on basis states as\begin{align} \mathcal{QFT}_N|k\rangle &= N^{-1/2}\sum_{j=0}^{N-1} e^{\frac{2\pi i}{N}jk}|j\rangle, \\ \mathcal{QFT}_N^{-1}|k\rangle &= N^{-1/2}\sum_{j=0}^{N-1} e^{-\frac{2\pi i}{N}jk}|j\rangle. \end{align} It follows that :|\tilde\Psi_3\rangle = \frac{1}{2^{\frac{n}{2}}}\sum_{k=0}^{2^n - 1} e^{2\pi i \theta k} \left( \frac{1}{2^{\frac{n}{2}}}\sum_{x=0}^{2^n - 1} e^{\frac{-2\pi i k x}{2^n}}|x\rangle \right) = \frac{1}{2^{n}}\sum_{x=0}^{2^n - 1} \sum_{k=0}^{2^n - 1}e^{-\frac{2\pi i k}{2^n} \left ( x - 2^n \theta \right )} |x\rangle. Decomposing the state in the computational basis as |\tilde\Psi_3\rangle = \sum_{x=0}^{2^n-1} c_x |x\rangle, the coefficients thus equal c_x \equiv \frac{1}{2^n} \sum_{k=0}^{2^n-1} e^{-\frac{2\pi ik}{2^n}(x-2^n \theta) } = \frac{1}{2^{n}} \sum_{k=0}^{2^n - 1} e^{-\frac{2\pi i k}{2^n} \left ( x-a \right )} e^{2 \pi i \delta k},where we wrote 2^n \theta = a + 2^n \delta, with a is the nearest integer to 2^n \theta. The difference 2^n\delta must by definition satisfy 0 \leqslant |2^n\delta| \leqslant \tfrac{1}{2}. This amounts to approximating the value of \theta \in [0, 1] by rounding 2^n \theta to the nearest integer.
Measurement The final step involves performing a
measurement in the computational basis on the first register. This yields the outcome |y\rangle with probability\Pr(y) = |c_y|^2 = \left| \frac{1}{2^{n}} \sum_{k=0}^{2^n-1} e^{\frac{-2\pi i k}{2^n}(y-a)} e^{2 \pi i \delta k} \right |^2. It follows that \operatorname{Pr}(a)=1 if \delta=0, that is, when \theta can be written as \theta=a/2^n, one always finds the outcome y=a. On the other hand, if \delta\neq0, the probability reads\operatorname{Pr}(a)=\frac{1}{2^{2n}} \left | \sum_{k=0}^{2^n-1} e^{2 \pi i \delta k} \right |^2 = \frac{1}{2^{2n}} \left | \frac{1- {e^{2 \pi i 2^n \delta}}}{1-{e^{2 \pi i \delta}}} \right|^2. From this expression we can see that \Pr(a) \geqslant \frac{4}{\pi^2} \approx 0.405 when \delta\neq0. To see this, we observe that from the definition of \delta we have the inequality |\delta| \leqslant \tfrac{1}{2^{n+1}}, and thus:\begin{align} \Pr(a) &= \frac{1}{2^{2n}} \left | \frac{1- {e^{2 \pi i 2^n \delta}}}{1-{e^{2 \pi i \delta}}} \right |^2 && \text{for } \delta \neq 0 \\ &= \frac{1}{2^{2n}} \left | \frac{2 \sin \left ( \pi 2^n \delta\right)}{ 2\sin( \pi \delta)} \right |^2 && \left| 1-e^{2ix}\right|^2 = 4\left| \sin (x)\right|^2 \\ &= \frac{1}{2^{2n}} \frac {\left | \sin\left(\pi 2^n \delta\right) \right |^2}{| \sin( \pi \delta) |^2} \\ &\geqslant \frac{1}{2^{2n}} \frac {\left | \sin\left(\pi 2^n \delta\right) \right |^2}{| \pi \delta |^2} && | \sin(\pi \delta) | \leqslant | \pi \delta | \\ &\geqslant \frac{1}{2^{2n}} \frac {|2 \cdot 2^n \delta|^2}{| \pi \delta |^2} && | 2\cdot2^n \delta | \leqslant | \sin(\pi 2^n\delta) | \text{ for } |\delta| \leqslant \frac{1}{2^{n+1}} \\ &\geqslant \frac {4}{\pi^2} .\end{align} We conclude that the algorithm provides the best n-bit estimate (i.e., one that is within 1/2^n of the correct answer) of \theta with probability at least 4/\pi^2. By adding a number of extra qubits on the order of O(\log(1/\epsilon)) and truncating the extra qubits the probability can increase to 1 - \epsilon. == Toy examples ==