Figure 1.7: Illustration of circular convolution $[a \ast b]$ (left) and circular correlation $[a \star b]$ (right) of two 3-dimensional vectors $a$ and $b$. For example, the vector after circular convolution reads $[a \ast b]_0 = a_0 b_0 + a_2 b_1 + a_1 b_2$, $[a \ast b]_1 = a_1 b_0 + a_0 b_1 + a_2 b_2$, and $[a \ast b]_2 = a_2 b_0 + a_1 b_1 + a_0 b_2$. The vector after circular correlation reads $[a \star b]_0 = a_0 b_0 + a_1 b_1 + a_2 b_2$, $[a \star b]_1 = a_2 b_0 + a_0 b_1 + a_1 b_2$, and $[a \star b]_2 = a_1 b_0 + a_2 b_1 + a_0 b_2$.
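The component formulas in the caption of Figure 1.7 can be checked numerically. The following is a minimal sketch, assuming the index conventions $[a \ast b]_k = \sum_i a_i b_{(k-i) \bmod n}$ and $[a \star b]_k = \sum_i a_i b_{(k+i) \bmod n}$, which reproduce the listed terms for $n = 3$; the function names and the example vectors are illustrative only.

```python
import numpy as np

def circular_convolution(a, b):
    """[a * b]_k = sum_i a_i b_{(k - i) mod n}."""
    n = len(a)
    return np.array([sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)])

def circular_correlation(a, b):
    """[a (star) b]_k = sum_i a_i b_{(k + i) mod n}."""
    n = len(a)
    return np.array([sum(a[i] * b[(k + i) % n] for i in range(n)) for k in range(n)])

# Illustrative 3-dimensional vectors (not taken from the text).
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

conv = circular_convolution(a, b)   # components a0b0+a2b1+a1b2, a1b0+a0b1+a2b2, a2b0+a1b1+a0b2
corr = circular_correlation(a, b)   # components a0b0+a1b1+a2b2, a2b0+a0b1+a1b2, a1b0+a2b1+a0b2
print(conv, corr)
```

For real-valued vectors both operations can also be evaluated with the fast Fourier transform, which is the usual reason for preferring them over a full tensor product.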
a global learning module, missing links, or implicit knowledge, in the knowledge graph can be inferred as well. In [68], we adopt a simple 2-layer neural network that uses the holistic representations as input features to learn the global relational patterns hidden in the knowledge graph. This simple neural network with a bottleneck structure outperforms several baselines, especially when the holistic representations are encoded from the Cauchy initialization. More importantly, it is even capable of inferring implicit knowledge about unobserved entities given only a few semantic triples that contain those unobserved entities, without retraining or fine-tuning the weights. More experiments and a rigorous analysis of the holistic representations can be found in [68] and Chapter 3.
1.6 Variational Quantum Circuit for Knowledge Graph Embedding
1.6.1 Variational Quantum Circuit
Knowledge graphs are extracted from various unstructured text data, e.g., webpages, newspaper articles, and scientific reports, through two steps: named entity recognition and relation extraction. The task of named entity recognition (NER), also known as named entity classification, is to recognize and locate the mentioned entities in unstructured texts and classify them into predefined categories [82]. Previous NER approaches use language-specific knowledge to design hand-crafted rules and annotate mentioned entities in corpora. With the development of deep learning-based natural language processing, neural architectures for NER were introduced in [60], which apply bidirectional LSTMs and conditional random fields. After the entities have been annotated, relations are extracted from the corpora and integrated into knowledge bases as semantic triples.
The number of recognized entities and extracted triples continually increases as knowledge graphs collect and merge information from different data sources. The growing number of semantic triples and entities slows down inference on knowledge graphs for a new query. To see this, consider an unobserved query with an unknown object. The computational complexity of inferring the potentially correct object is $O(N_e R^3)$ for the Tucker model, where $R$ represents the rank and $N_e$ the number of entities. This estimate comes from the observation that evaluating the score function costs $O(R^3)$ for the Tucker model, and the score function needs to be calculated $N_e$ times and ranked afterward to determine the potential object. Hence, in this dissertation, we investigate the first quantum approaches for statistical relational learning to accelerate the learning process and inference on knowledge graphs. In this section, we briefly sketch the idea of our first quantum approach, whose underlying building block is the parameterized quantum circuit, also known as the variational quantum circuit.
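The counting argument above can be made concrete with a small classical sketch. It assumes, for illustration only, that entities and predicates all have $R$-dimensional representations and that the Tucker score is the full contraction of an $R \times R \times R$ core tensor with the three vectors; the names `tucker_score` and `rank_objects` are hypothetical and the data are random.

```python
import numpy as np

def tucker_score(core, e_s, r_p, e_o):
    """Score of one triple: full contraction of the R x R x R core tensor, O(R^3)."""
    return float(np.einsum("ijk,i,j,k->", core, e_s, r_p, e_o))

def rank_objects(core, entities, e_s, r_p):
    """Evaluate the score for every candidate object (N_e evaluations), then rank."""
    scores = np.array([tucker_score(core, e_s, r_p, e_o) for e_o in entities])
    return np.argsort(-scores), scores

# Random illustrative data (not from the text): rank R and N_e entities.
rng = np.random.default_rng(0)
R, N_e = 4, 10
core = rng.normal(size=(R, R, R))
entities = rng.normal(size=(N_e, R))
r_p = rng.normal(size=R)

ranking, scores = rank_objects(core, entities, entities[0], r_p)
print(ranking[:3])  # indices of the three highest-scoring candidate objects
```

The loop over all candidates is what makes inference scale with $N_e$, which is the dependence that the quantum approaches in this section aim to reduce.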
Figure 1.8: Quantum circuit part of the quantum-classical hybrid architecture for supervised learning. The input feature is first normalized and encoded as the amplitudes of the quantum state $|\Phi\rangle$, which is then evolved by a parameterized unitary transformation $U_\theta$, where $\theta$ represents the parameters of the transformation. The predicted binary label is encoded in the measurement statistics of the auxiliary qubit.
The parameterized quantum circuit is the building block of quantum-classical hybrid machine learning algorithms. Hybrid approaches combine a low-depth parameterized quantum circuit with a classical optimization unit and learn a task by tuning and updating the parameters in the circuit. The hybrid architecture has made the parameterized quantum circuit an active research topic since it is better suited to near-term noisy quantum devices. An overview of variational algorithms with a quantum-classical hybrid optimization scheme can be found in [75, 78], which show through numerical simulations that parameterized quantum circuits can approximate some nonlinear functions. The variational approach can also be applied to combinatorial optimization problems, such as MaxCut on regular graphs [26], by reformulating them as ground-state problems of Ising models. Moreover, [101] investigated a supervised learning algorithm using the quantum-classical hybrid architecture, where inputs are normalized and encoded into the amplitudes of quantum states.
Figure 1.8 illustrates the quantum part of the hybrid architecture for supervised learning. This quantum supervised learning architecture encodes the normalized input features as the amplitudes of a quantum state, which is then evolved by a sequence of unitary transformations. The unitary transformations are usually composed of parameterized single and two-qubit gates. An objective function for the binary classification is associated with the measurement statistics of an auxiliary qubit, which is entangled with the qubits for amplitude encoding. This binary quantum classifier is then optimized by updating the parameters in the unitary transformations and minimizing the objective function.
In this dissertation, we restrict ourselves to the case where the unitary transformation $U_\theta$ is composed of a sequence of parameterized single and two-qubit gates. A single-qubit gate is a 2-dimensional matrix representation of the special unitary group SU(2), which, after ignoring a global phase, can be parameterized as
$$G(\alpha, \beta, \gamma) = \begin{pmatrix} e^{i\beta}\cos\alpha & e^{i\gamma}\sin\alpha \\ -e^{-i\gamma}\sin\alpha & e^{-i\beta}\cos\alpha \end{pmatrix}, \tag{1.3}$$
where $\{\alpha, \beta, \gamma\}$ are tunable parameters of the single-qubit gate. The two-qubit gates that we adopt in the Ansätze are controlled gates, where one qubit acts as a control for operations on another qubit. For instance, the controlled gate $C_i(G_j)$, which applies a unitary transformation to the $j$-th qubit conditioned on the state of the $i$-th qubit, can be written as
$$C_i(G_j)\, |x\rangle_i \otimes |y\rangle_j = |x\rangle_i \otimes G_j^{x}\, |y\rangle_j ,$$
where $|x\rangle_i$ and $|y\rangle_j$ represent the quantum states of qubits $i$ and $j$, respectively.
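As a minimal numerical sketch of Eq. (1.3) and of the controlled gate, the snippet below builds $G(\alpha, \beta, \gamma)$ and applies the rule $|x\rangle \otimes G^x |y\rangle$ for a control qubit in a computational-basis state $x \in \{0, 1\}$. The helper name `controlled_action` and the parameter values are illustrative assumptions.

```python
import numpy as np

def G(alpha, beta, gamma):
    """Single-qubit gate of Eq. (1.3)."""
    return np.array([
        [np.exp(1j * beta) * np.cos(alpha), np.exp(1j * gamma) * np.sin(alpha)],
        [-np.exp(-1j * gamma) * np.sin(alpha), np.exp(-1j * beta) * np.cos(alpha)],
    ])

def controlled_action(gate, x, y_state):
    """Return |x> (tensor) gate^x |y> for a control bit x in {0, 1}."""
    control = np.eye(2)[x]                                 # basis state |x>
    target = np.linalg.matrix_power(gate, x) @ y_state     # identity if x = 0, gate if x = 1
    return np.kron(control, target)

g = G(0.3, 0.7, -1.1)        # arbitrary illustrative parameters
y = np.array([0.6, 0.8])     # normalized target state

print(np.allclose(g.conj().T @ g, np.eye(2)))   # unitarity check
print(np.isclose(np.linalg.det(g), 1.0))        # unit determinant, as expected for SU(2)
```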
Quantum algorithms using $n$ fully entangled qubits can perform computations on $2^n$ amplitudes. Hence, an $n$-qubit quantum circuit can encode input data with maximal
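down the $(2^n \times 2^n)$-dimensional matrix representation of each unitary gate acting on the $n$-qubit system. Suppose that $U_\theta$ consists of $L$ unitary operations and let the $l$-th unitary operation $U_l$ be a single-qubit gate acting on the $k$-th qubit. Then its matrix representation reads
$$U_l = \mathbb{1}_1 \otimes \cdots \otimes G_k \otimes \cdots \otimes \mathbb{1}_n .$$
If the $l$-th unitary operation is a controlled gate $C_i(G_j)$, which acts on the $j$-th qubit and is conditioned on the $i$-th qubit, then $U_l$ possesses the following matrix representation
$$U_l = \mathbb{1}_1 \otimes \cdots \otimes \underbrace{P_0}_{i\text{-th}} \otimes \cdots \otimes \underbrace{\mathbb{1}_j}_{j\text{-th}} \otimes \cdots \otimes \mathbb{1}_n + \mathbb{1}_1 \otimes \cdots \otimes \underbrace{P_1}_{i\text{-th}} \otimes \cdots \otimes \underbrace{G_j}_{j\text{-th}} \otimes \cdots \otimes \mathbb{1}_n ,$$
where $P_0 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $P_1 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$. Therefore, the $(2^n \times 2^n)$-dimensional matrix representation of $U_\theta$ can be written as $U_\theta = U_L \cdots U_1$.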
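The Kronecker-product constructions above translate directly into a small simulator. The sketch below is one possible realization under stated assumptions: qubits are indexed from 0 in the code (the text counts from 1), the first Kronecker factor is the first qubit, and the Pauli-X gate is used only as a convenient example gate. The function names are hypothetical.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
P0 = np.array([[1, 0], [0, 0]], dtype=complex)   # projector onto |0>
P1 = np.array([[0, 0], [0, 1]], dtype=complex)   # projector onto |1>

def kron_all(factors):
    """Kronecker product of a list of 2 x 2 matrices, first factor = first qubit."""
    return reduce(np.kron, factors)

def single_qubit_op(gate, k, n):
    """Identity on all qubits except qubit k, where `gate` acts."""
    factors = [I2] * n
    factors[k] = gate
    return kron_all(factors)

def controlled_op(gate, i, j, n):
    """Controlled gate: P0 on qubit i leaves qubit j alone, P1 on qubit i applies `gate` to qubit j."""
    idle = [I2] * n
    idle[i] = P0
    active = [I2] * n
    active[i] = P1
    active[j] = gate
    return kron_all(idle) + kron_all(active)

def circuit_unitary(ops):
    """U_theta = U_L ... U_1 for operations listed in order of application."""
    U = np.eye(ops[0].shape[0], dtype=complex)
    for op in ops:
        U = op @ U
    return U

X = np.array([[0, 1], [1, 0]], dtype=complex)    # example gate only
cnot = controlled_op(X, 0, 1, 2)                 # reproduces the CNOT matrix
U = circuit_unitary([single_qubit_op(X, 0, 2), cnot])
print(U.shape)
```

The matrices grow as $2^n \times 2^n$, so this explicit representation is only practical for classical simulation of small circuits.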
To optimize the circuit model, gradients with respect to the parameters can be estimated from the same circuit architecture using the parameter shift rule. This technique was recently proposed in [39, 101, 27], and it shows that the partial derivative of the expectation of a quantum observable with respect to a circuit parameter can be decomposed into a sum of unitary operators. Hence, the partial derivatives with respect to the circuit parameters can be derived from the measurement statistics of the same auxiliary qubit using the parameter shift rule. For instance, let us consider the parameterized single-qubit gate $G(\alpha, \beta, \gamma)$, whose partial derivatives with respect to $\alpha$, $\beta$, and $\gamma$ read
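$$\begin{aligned}
\frac{\partial}{\partial \alpha} G(\alpha, \beta, \gamma) &= G\left(\alpha + \tfrac{\pi}{2}, \beta, \gamma\right), \\
\frac{\partial}{\partial \beta} G(\alpha, \beta, \gamma) &= \tfrac{1}{2}\, G\left(\alpha, \beta + \tfrac{\pi}{2}, 0\right) + \tfrac{1}{2}\, G\left(\alpha, \beta + \tfrac{\pi}{2}, \pi\right), \\
\frac{\partial}{\partial \gamma} G(\alpha, \beta, \gamma) &= \tfrac{1}{2}\, G\left(\alpha, 0, \gamma + \tfrac{\pi}{2}\right) + \tfrac{1}{2}\, G\left(\alpha, \pi, \gamma + \tfrac{\pi}{2}\right).
\end{aligned}$$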
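The three derivative identities can be verified numerically by comparing the shifted gates with central finite differences of $G$. The following is a minimal sketch; the helper `finite_difference`, the step size, and the test parameters are illustrative choices, not part of the method described in the text.

```python
import numpy as np

def G(alpha, beta, gamma):
    """Single-qubit gate of Eq. (1.3)."""
    return np.array([
        [np.exp(1j * beta) * np.cos(alpha), np.exp(1j * gamma) * np.sin(alpha)],
        [-np.exp(-1j * gamma) * np.sin(alpha), np.exp(-1j * beta) * np.cos(alpha)],
    ])

def dG_dalpha(a, b, g):
    return G(a + np.pi / 2, b, g)

def dG_dbeta(a, b, g):
    return 0.5 * G(a, b + np.pi / 2, 0.0) + 0.5 * G(a, b + np.pi / 2, np.pi)

def dG_dgamma(a, b, g):
    return 0.5 * G(a, 0.0, g + np.pi / 2) + 0.5 * G(a, np.pi, g + np.pi / 2)

def finite_difference(params, index, eps=1e-6):
    """Central finite difference of G with respect to params[index]."""
    plus, minus = list(params), list(params)
    plus[index] += eps
    minus[index] -= eps
    return (G(*plus) - G(*minus)) / (2 * eps)

params = (0.4, -0.9, 1.3)   # arbitrary illustrative parameters
shifted = [dG_dalpha(*params), dG_dbeta(*params), dG_dgamma(*params)]
numeric = [finite_difference(params, k) for k in range(3)]
print(all(np.allclose(s, n, atol=1e-6) for s, n in zip(shifted, numeric)))
```

Since every derivative is itself a combination of gates of the form $G(\cdot, \cdot, \cdot)$, it can be evaluated on the same circuit architecture with shifted parameters.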
1.6.2 Modeling Knowledge Graphs with Variational Quantum Circuit
With this background on variational quantum circuits, we can introduce quantum embedding models for knowledge graphs. In the pioneering work [70], we contribute two different
a quantum representation, which is encoded as the amplitudes of a quantum state. The only difference is how these quantum representations of entities are prepared, that is, loaded as the amplitudes of quantum states. In the QCE model, the representations are stored in a tree-structured classical memory, which can be accessed by a quantum algorithm to load them as quantum states. This memory structure (see Figure 1.9) is a special Quantum Random Access Memory (QRAM) [33], which allows the vector representations to be loaded with an exponential acceleration in the vector dimension. Since the QCE model is training-based, we have shown that the iterative parameter updates might ruin the exponential speedup gained during the preparation of quantum states. Hence, we were motivated to propose the fully-parameterized Quantum Circuit Embedding (fQCE).
[Tree diagram: root $\|x\|^2$; internal nodes $x_1^2 + x_2^2$ and $x_3^2 + x_4^2$; leaves $x_i^2$ together with $\mathrm{sgn}(x_i)$ for $i = 1, \dots, 4$.]
Figure 1.9: Classical memory structure with quantum access for creating the quantum state $|x\rangle = x_1 |00\rangle + x_2 |01\rangle + x_3 |10\rangle + x_4 |11\rangle$. In this example, a 4-dimensional real-valued normalized vector can be encoded as the amplitudes of a 2-qubit quantum state via three conditioned unitary rotations. In general, an $R$-dimensional real-valued vector can be encoded as the amplitudes of a $\lceil \log R \rceil$-qubit quantum state via $O(\log R)$ conditioned unitary rotations. More details are given in [90, 54].
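The bookkeeping behind Figure 1.9 can be illustrated classically. The sketch below assumes a vector whose length is a power of two, stores partial sums of squared entries level by level, and then recovers the amplitudes by walking down the tree, where each split plays the role of one conditioned rotation. It only simulates the arithmetic of the tree and is not an implementation of QRAM; the function names are hypothetical.

```python
import numpy as np

def build_tree(x):
    """Levels of the binary tree, root first: ||x||^2, pairwise sums, ..., x_i^2."""
    levels = [x ** 2]
    while len(levels[-1]) > 1:
        levels.append(levels[-1].reshape(-1, 2).sum(axis=1))
    return levels[::-1]

def load_amplitudes(x):
    """Recover the amplitudes by splitting each node's weight between its two children."""
    levels = build_tree(x)
    amps = np.sqrt(levels[0])                               # root: ||x||
    for parent, child in zip(levels[:-1], levels[1:]):
        safe_parent = np.where(parent > 0, parent, 1.0)     # avoid division by zero
        ratios = child.reshape(-1, 2) / safe_parent[:, None]
        amps = (amps[:, None] * np.sqrt(ratios)).reshape(-1)
    return amps * np.sign(x)                                # signs are stored at the leaves

x = np.array([0.5, -0.5, 0.5, 0.5])    # normalized 4-dimensional example
tree = build_tree(x)
print(len(tree) - 1)                   # depth of the tree, log2(4) = 2
print(np.allclose(load_amplitudes(x), x))
```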
In the fQCE model, vector representations are not stored in the classical memory structure described in Figure 1.9. Instead, quantum representations are prepared via additional variational quantum circuits with entity-dependent gate parameters. Entity-dependency means that the quantum circuit architecture for preparing entity quantum representations remains the same for all entities, while each entity possesses a unique set of gate parameters. In other words, the quantum representation of an entity is prepared by iteratively applying parameterized gates to a maximally entangled state.
To evaluate the score function of a semantic triple, for both models, after preparing a quantum state for the subject, denoted as $|s\rangle$, a predicate-dependent circuit evolves the quantum state $|s\rangle$ to the resulting state $|sp\rangle$. Moreover, the quantum state for the object $|o\rangle$ is prepared analogously and is entangled with the state $|sp\rangle$ via an auxiliary qubit. After performing another Hadamard gate on the auxiliary qubit, the inner product of the quantum states $|o\rangle$ and $|sp\rangle$ is encoded in the state of the auxiliary qubit. Therefore, we can derive the score function from the measurement statistics of the auxiliary qubit. More details of the circuit architecture can be found in [70] and Chapter 4.
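To illustrate how an inner product can end up in the measurement statistics of an auxiliary qubit, the sketch below simulates one common construction. It assumes that the joint state before the final Hadamard gate is $(|0\rangle |sp\rangle + |1\rangle |o\rangle)/\sqrt{2}$, in which case the probability of measuring the auxiliary qubit in $|0\rangle$ equals $(1 + \mathrm{Re}\,\langle sp | o \rangle)/2$. This specific state is an assumption made for illustration; the actual circuit is described in [70] and Chapter 4.

```python
import numpy as np

HADAMARD = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def ancilla_probability_zero(state_sp, state_o):
    """P(ancilla = 0) after a Hadamard on the ancilla of (|0>|sp> + |1>|o>)/sqrt(2)."""
    dim = len(state_sp)
    joint = np.concatenate([state_sp, state_o]) / np.sqrt(2)    # ancilla is the first qubit
    joint = np.kron(HADAMARD, np.eye(dim)) @ joint
    return float(np.sum(np.abs(joint[:dim]) ** 2))

def estimated_overlap(state_sp, state_o):
    """Real part of the inner product recovered from the ancilla statistics."""
    return 2.0 * ancilla_probability_zero(state_sp, state_o) - 1.0

# Random normalized states standing in for |sp> and |o> (illustrative only).
rng = np.random.default_rng(1)
sp = rng.normal(size=4)
sp /= np.linalg.norm(sp)
o = rng.normal(size=4)
o /= np.linalg.norm(o)

print(np.isclose(estimated_overlap(sp, o), np.dot(sp, o)))
```

On a quantum device the probability would be estimated from repeated measurements, so the score is obtained only up to sampling error.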
By replacing the tree-structured memory storage with a variational quantum circuit for preparing the quantum representations, we realize a circuit-centric model for knowledge graph embedding. Recall that an $R$-dimensional classical vector representation can be encoded as the amplitudes of a quantum state with $O(\log R)$ fully entangled qubits. Therefore, in the circuit-centric fQCE model, if the variational circuit for the entity preparation is shallow enough, with a depth of order $O(\log R)$, the computational complexity of the score function can be reduced to $O(\log R)$.
Furthermore, we can realize an acceleration with respect to the number of entities when inferring unobserved triples after training. The basic idea is to introduce a quantum register for indices, which is entangled with the qubits for encoding quantum representations and the auxiliary qubit. Consider a query $(s, p, ?)$ with an unknown object. Using the quantum register for indices, we first prepare the states $|sp\rangle$ and $\sum_i |i\rangle |e_i\rangle$, where $\sum_i |i\rangle |e_i\rangle$ represents the entanglement of all entity indices with the corresponding quantum representations. In this way, the inner products between $|sp\rangle$ and all $|e_i\rangle$ can be evaluated and encoded as the amplitudes of the register qubits. Correct objects might subsequently be read out by measuring the register qubits. This algorithm heuristically realizes a quadratic speedup in the number of entities during inference. More details about the algorithm can be found in Section 6 of [70] and Chapter 4.