Article

Neural-Network Quantum States for Spin-1 Systems: Spin-Basis and Parameterization Effects on Compactness of Representations

H.H. Wills Physics Laboratory, University of Bristol, Bristol BS8 1TL, UK
*
Authors to whom correspondence should be addressed.
Entropy 2021, 23(7), 879; https://doi.org/10.3390/e23070879
Submission received: 21 May 2021 / Revised: 7 July 2021 / Accepted: 7 July 2021 / Published: 9 July 2021
(This article belongs to the Special Issue Entropy and Information in Quantum Many-Body Systems)

Abstract

Neural network quantum states (NQS) have been widely applied to spin-1/2 systems, where they have proven to be highly effective. The application to systems with larger on-site dimension, such as spin-1 or bosonic systems, has been explored less and predominantly using spin-1/2 Restricted Boltzmann Machines (RBMs) with a one-hot/unary encoding. Here, we propose a more direct generalization of RBMs for spin-1 that retains the key properties of the standard spin-1/2 RBM, specifically trivial product state representations, labeling freedom for the visible variables, and gauge equivalence to the tensor network formulation. To test this new approach, we present variational Monte Carlo (VMC) calculations for the spin-1 anti-ferromagnetic Heisenberg (AFH) model and benchmark it against the one-hot/unary encoded RBM, demonstrating that it achieves the same accuracy with substantially fewer variational parameters. Furthermore, we investigate how the hidden unit complexity of NQS depends on the local single-spin basis used. Exploiting the tensor network version of our RBM, we construct an analytic NQS representation of the Affleck-Kennedy-Lieb-Tasaki (AKLT) state in the $xyz$ spin-1 basis using only $M = 2N$ hidden units, compared to $M \sim O(N^2)$ required in the $S^z$ basis. Additional VMC calculations provide strong evidence that the AKLT state in fact possesses an exact compact NQS representation in the $xyz$ basis with only $M = N$ hidden units. These insights help to further unravel how to most effectively adapt the NQS framework for more complex quantum systems.

1. Introduction

The strongly-interacting quantum many-body problem is crucial to our understanding of many intriguing physical phenomena, but it is also inherently difficult to treat numerically owing to the exponential growth of the Hilbert space with system size. A commonly used approximate strategy is the variational method where a trial state, characterized by a tractable number of variational parameters, is optimized in energy. The effectiveness of this approach is highly dependent on the ansatz having an expressive form that can be systematically improved, to minimize bias, while also allowing relevant observables to be evaluated efficiently. Tensor networks have provided several examples of such ansatzes, with matrix product states (MPS) [1,2] displaying impressive accuracy in one-dimensional systems, along with projected entangled-pair states (PEPS) [1,3], tree tensor networks (TTN) [4,5], and multi-scale entanglement renormalization ansatz (MERA) [6] making two-dimensional systems accessible. Recently, artificial neural networks (ANNs) have emerged as another class of highly flexible variational ansatzes with many variants, such as restricted Boltzmann machines (RBM) [7], Deep Boltzmann Machines (DBM) [8,9,10], convolutional neural networks (CNN) [11,12,13,14,15], and feed-forward neural networks (FFNN) [16,17,18,19]. An important advantage of ANNs is that they are highly flexible and can be applied to any number of spatial dimensions, making them a powerful method for tackling the subtle physics seen in two-dimensional systems.
Although one of the simplest ANN variants, RBMs have seen widespread applications, including for open quantum systems [20,21,22,23], frustrated spin problems [24,25], quantum circuit simulation [26,27,28], and more. There are several reasons for their continued use. First, their simple structure allows for efficient sampling crucial for applying variational Monte Carlo (VMC) [29,30]. Second, RBMs are also a good candidate for a weakly biased ansatz, given that they are capable of exactly representing arbitrary states when their hidden unit number M scales exponentially with system size N. Third, RBMs are capable of representing states with volume-law entanglement [31], which further distinguishes them from tensor networks [32], despite their conceptual similarities [33,34,35]. Finally, there are also numerous classes of states with efficient exact RBM representations, including graph states [8]; spin Jastrow states, such as Laughlin states [33,36,37]; and general stabilizer states, such as the toric code [38,39,40,41], as well as more exotic hypergraph states and XS-stabilizer states [40]. Recently, we found that all but the last class listed, in fact, have RBM representations requiring M < N hidden units [42], illustrating how even very modestly sized RBMs have significant representational power.
Despite their efficacy for spin-1/2 systems, the application of RBMs to systems with a local on-site dimension $d > 2$, such as spin-1 or bosonic systems, has been limited, with convolutional or feedforward neural networks generally being favored [16,17,43]. The typical approach in machine learning to handle models with multinomial or categorical variables is so-called “one-hot” or “unary” encoding [44]. Rather than representing a physical degree of freedom directly with one visible unit, this approach encodes the possible local physical states into a set of binary visible units. While this approach leverages the power of binary or spin-1/2 RBMs, it multiplies the number of visible units by a factor $d$, significantly increasing the parameter count and complexity of the optimization. As a consequence, studies utilizing unary encoding so far, for example, on the Bose Hubbard model [45,46], have been limited to small system sizes. Thus, there is a need to devise more efficient RBM constructions tailored for $d > 2$ systems.
Progress has been made in this direction in recent work [47], where multivalued RBMs were applied directly to the one-dimensional spin-1 anti-ferromagnetic Heisenberg (AFH) model and substantially enhanced by incorporating a transformation to a coupled SU(2) symmetric basis. Complementary to this, here we propose and study a direct generalization of RBMs to spin-1 systems that retains key properties of the spin-1/2 RBM with a minimal increase in variational parameters (as we will not be examining other network architectures, we will use RBM and NQS interchangeably in this paper). Specifically, these properties are the ability to describe arbitrary product states without hidden units, invariance of the parameterization to the values assigned to visible variables (labeling freedom), and equivalence to the tensor network formulation. This leads to the introduction of new quadratic bias and interaction weights in the RBM effective energy function. We demonstrate the effectiveness of the new formulation via VMC calculations for the spin-1 AFH model, where it is seen to deliver the same accuracy as unary encoding but with substantially fewer variational parameters. Additionally, we investigate how the local single-spin basis affects the hidden unit complexity of a state by performing calculations in both the $S^z$ and $xyz$ spin-1 bases. For the AFH model with periodic boundary conditions, we find that the $S^z$ basis is more accurate. A useful advantage of our new spin-1 RBM formulation is that it permits tensor network based analytic constructions. Focusing on the paradigmatic Affleck-Kennedy-Lieb-Tasaki (AKLT) model, we give explicit exact NQS representations in both the $S^z$ and $xyz$ bases. Our $S^z$ basis NQS construction displays the expected [36] $M \sim O(N^2)$ scaling, while the simplification of the amplitude structure in the $xyz$ basis gives an NQS construction with $M = 2N$ hidden units. By using VMC calculations, we find compelling evidence that the AKLT state, in fact, only requires $M = N$ hidden units to be represented exactly in the $xyz$ basis.
The structure of this paper is as follows. We briefly outline the VMC method applied to the many-body problem in Section 2 and discuss desirable properties for variational ansatzes. Next, in Section 3, we discuss the RBM in its spin-1/2 form and analyze its key properties. In Section 4, we introduce a new generalization of NQS to spin-1 systems designed to mimic these properties and present VMC results for the AFH model. We then introduce analytic constructions for the AKLT model in Section 5, followed by VMC calculations in both the $S^z$ and $xyz$ bases. Finally, in Section 6, we conclude and discuss some open problems.

2. The Many-Body Problem and Variational Monte Carlo

2.1. Quantum Many-Body Problem

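In this work, we will focus on physical systems composed of $N$ localized spinful particles. Each particle is described by a vector of spin operators $\hat{\mathbf{S}}_j = (\hat{S}^x_j, \hat{S}^y_j, \hat{S}^z_j)$ for $j = 1, 2, \dots, N$. Typically, the eigenstates $|S_j\rangle$ of $\hat{S}^z_j$ are used as the local spin basis, from which the $S^z$ basis for the full system is constructed as $|\mathbf{S}\rangle = |S_1\rangle \otimes \cdots \otimes |S_N\rangle$, where $\mathbf{S} = (S_1, S_2, \dots, S_N)$. Any many-body quantum state for these $N$ spins can then be expanded in this basis as
$$|\Psi\rangle = \sum_{\mathbf{S}} \Psi(\mathbf{S})\,|\mathbf{S}\rangle,$$
via its complex amplitudes $\Psi(\mathbf{S})$. In the spin-1/2 case, we have $\hat{S}^\alpha_j = \frac{1}{2}\hat{\sigma}^\alpha_j$ (taking $\hbar = 1$ throughout) for $\alpha = \{x, y, z\}$ defined by the Pauli matrices, and $S_j \in \{+\frac{1}{2}, -\frac{1}{2}\}$. In the spin-1 case, we have
$$\hat{S}^x_j = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad \hat{S}^y_j = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \hat{S}^z_j = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix},$$
with $S_j \in \{+1, 0, -1\}$.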
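As a quick sanity check of the spin-1 matrices above, the following minimal NumPy sketch builds them in the $S^z$ eigenbasis ordered as $(+1, 0, -1)$ and verifies the angular momentum algebra. The variable and function names are illustrative choices, not taken from the paper.

```python
import numpy as np

# Spin-1 operators in the S^z eigenbasis, ordered (+1, 0, -1), with hbar = 1.
Sx = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex) / np.sqrt(2)
Sy = np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex) / np.sqrt(2)
Sz = np.diag([1.0, 0.0, -1.0]).astype(complex)


def commutator(A, B):
    """Return the matrix commutator [A, B]."""
    return A @ B - B @ A


# The three operators should close under commutation: [S^x, S^y] = i S^z (and cyclic).
closes_xy = np.allclose(commutator(Sx, Sy), 1j * Sz)
closes_yz = np.allclose(commutator(Sy, Sz), 1j * Sx)
closes_zx = np.allclose(commutator(Sz, Sx), 1j * Sy)

# The total spin satisfies S^2 = s(s+1) = 2 times the identity for s = 1.
S_squared = Sx @ Sx + Sy @ Sy + Sz @ Sz
```

All three commutator checks evaluate to `True`, and `S_squared` equals twice the $3 \times 3$ identity.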
A key challenge in many-body physics is to find the ground state $|\Psi_0\rangle$ of a system governed by an interacting Hamiltonian $\hat{H}$. In the context of spin systems, $\hat{H}$ will include terms that are products of the spin operators $\hat{\mathbf{S}}_j$ over two or more spins. Given that the expectation value of any observable $\hat{A}$ for a general (unnormalized) state $|\Psi\rangle$ is
$$\langle A \rangle_\Psi = \frac{\langle \Psi | \hat{A} | \Psi \rangle}{\langle \Psi | \Psi \rangle},$$
the variational approach reformulates the eigenvalue problem $\hat{H}|\Psi_0\rangle = E_0 |\Psi_0\rangle$ as the minimization of the energy $E = \langle \hat{H} \rangle_\Psi$ over the exponentially many amplitudes $\Psi(\mathbf{S})$. Performing this task exactly is only feasible for small systems $N \sim O(10)$ [48].

2.2. Variational Monte Carlo Method

A way to circumvent the “curse of dimensionality” is instead to restrict the optimization to a specialized class of states $|\Psi_{\mathbf{p}}\rangle$, dependent on parameters $\mathbf{p}$ whose number $n_{\rm params}$ scales polynomially with $N$. The variational principle $E_0 \le E_{\mathbf{p}_0} = \min_{\mathbf{p}} \langle \hat{H} \rangle_{\Psi_{\mathbf{p}}}$ then provides a route to finding the best approximation $|\Psi_{\mathbf{p}_0}\rangle$ within the ansatz for $|\Psi_0\rangle$. The flexibility and utility of a variational ansatz are greatly enhanced if, instead of computing expectation values $\langle A \rangle_{\Psi_{\mathbf{p}}}$ exactly, we evaluate them approximately using Monte Carlo sampling. This approach is called variational Monte Carlo (VMC) and is described in detail in Appendix A. A key feature of VMC is that only ratios of amplitudes $\Psi_{\mathbf{p}}(\mathbf{S})/\Psi_{\mathbf{p}}(\mathbf{S}')$ between different spin states $\mathbf{S}$ and $\mathbf{S}'$ are needed within the algorithm, allowing us to ignore the normalization of quantum states in this work. To be efficient, we thus require that these amplitude ratios for our ansatz can be evaluated with an effort scaling polynomially with $N$.
All numerical calculations presented in this paper employ the powerful stochastic reconfiguration method [30,49] for optimizing the parameters $\mathbf{p}$. While ansatzes with a larger number of parameters may describe more varied amplitude structure, in principle allowing greater accuracy, the effort to optimize them can scale as $O(n_{\rm params}^3)$. It is, therefore, crucial that the functional form of any candidate ansatz $\Psi_{\mathbf{p}}$ is judicious in its use of variational parameters to avoid excessive redundancies.
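To illustrate why only amplitude ratios are needed, the sketch below uses the standard VMC identity $\langle A \rangle_\Psi = \sum_{\mathbf{S}} p(\mathbf{S})\, A_{\rm loc}(\mathbf{S})$, with $p(\mathbf{S}) \propto |\Psi(\mathbf{S})|^2$ and local estimator $A_{\rm loc}(\mathbf{S}) = \sum_{\mathbf{S}'} A_{\mathbf{S}\mathbf{S}'}\, \Psi(\mathbf{S}')/\Psi(\mathbf{S})$. It checks the identity by exact enumeration on a small random example. The random observable and state are purely illustrative assumptions, and in an actual VMC calculation the sum over $\mathbf{S}$ would be replaced by Monte Carlo sampling from $p(\mathbf{S})$.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 9  # e.g. the Hilbert space of two spin-1 sites

# Illustrative random Hermitian observable and random unnormalized state.
X = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
A = (X + X.conj().T) / 2
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)

# Direct evaluation of <Psi|A|Psi> / <Psi|Psi>.
direct = (psi.conj() @ A @ psi) / (psi.conj() @ psi)

# Evaluation through the local estimator, which only involves amplitude ratios.
prob = np.abs(psi) ** 2 / np.sum(np.abs(psi) ** 2)  # p(S)
ratios = psi[None, :] / psi[:, None]                # ratios[S, S'] = Psi(S') / Psi(S)
local = np.sum(A * ratios, axis=1)                  # A_loc(S)
via_local = np.sum(prob * local)
```

The two results, `direct` and `via_local`, agree to numerical precision. Rescaling `psi` by any nonzero constant leaves both unchanged, which is why normalization can be ignored.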

3. Neural-Network Quantum States in Spin-1/2

In this section, we review NQS in terms of RBMs, discussing their form from both an energy function and tensor network perspective. In doing so, we highlight key properties of NQS that are desirable for expressiveness and the ability to represent important families of states.

3.1. Restricted Boltzmann Machine Approach

Restricted Boltzmann machines consist of two sets of units: $N$ visible units representing the physical system, and $M$ hidden units to be marginalized out. The units are characterized by a Boltzmann-like combined probability distribution $p(\mathbf{v}, \mathbf{h}) = \exp(-E_\lambda)$ with an effective “energy” function
$$E_\lambda(\mathbf{v}, \mathbf{h}) = -\sum_{j=1}^{N} a_j v_j - \sum_{i=1}^{M} b_i h_i - \sum_{i=1}^{M} \sum_{j=1}^{N} w_{ij} h_i v_j,$$
where $\lambda = \{\mathbf{a}, \mathbf{b}, \mathbf{w}\}$ is the set of $MN + M + N$ complex parameters consisting of $N$ visible biases $\mathbf{a} = (a_1, \dots, a_N)$, $M$ hidden biases $\mathbf{b} = (b_1, \dots, b_M)$, and $M \times N$ hidden-visible interaction weights $\mathbf{w} = [w_{11}, \dots, w_{1N}, \dots, w_{MN}]$. While $p(\mathbf{v}, \mathbf{h})$ is Boltzmann, once the hidden units are traced out we are left with a more complex marginal distribution for the visible units, from which the NQS amplitudes are derived [7]:
$$\Psi_\lambda(\mathbf{S} \mapsto \mathbf{v}) = \sum_{\mathbf{h}} \exp\left[-E_\lambda(\mathbf{v}, \mathbf{h})\right].$$
The RBM architecture is shown diagrammatically in Figure 1.
Typically, visible $v_j$ and hidden $h_i$ unit variables are taken as two-valued. The pair of unique values $\{\mu, \nu\}$ taken by units within the energy function Equation (3) can be freely chosen, and they need not coincide with the eigenvalues $S_j \in \{+\frac{1}{2}, -\frac{1}{2}\}$ of the local operator $\hat{S}^z$ chosen to define the physical basis. In Equation (4), we emphasize this labeling freedom by explicitly introducing a mapping between physical and visible configurations, $\mathbf{S} \mapsto \mathbf{v}$. Commonly, with NQS, an implicit choice $\mathbf{v} = \mathbf{S}$ is made, nullifying the need for this distinction, but it will prove to be useful here. Canonical choices for $v_j$ are number-like $\{0, 1\}$ or Ising-like $\{+1, -1\}$ values. We will adopt Ising-like visible variables, in which case it follows that an arbitrary visible unit taking values
$$S_j = \left\{ \begin{array}{c} \uparrow \;\mapsto\; \mu \\ \downarrow \;\mapsto\; \nu \end{array} \right\} = \tilde{v}_j$$
is generated by a shift and rescaling of $v_j \in \{+1, -1\}$ as $\tilde{v}_j = \frac{1}{2}(\mu + \nu) + \frac{1}{2}(\mu - \nu) v_j$. Since this is a linear transformation, it is entirely accommodated by redefining the energy function $E_\lambda(\mathbf{v}, \mathbf{h})$ parameters $\lambda$ without changing its functional form. We shall see in Section 4 that labeling freedom is a crucial ingredient for generalizing RBMs to spin-1 systems using higher-dimensional visible units. Tracing out Ising-like hidden variables results in the amplitude expansion [7]
$$\Psi_\lambda(\mathbf{S} \mapsto \mathbf{v}) = \prod_{j=1}^{N} e^{a_j v_j} \prod_{i=1}^{M} 2\cosh\left(b_i + \sum_{j=1}^{N} w_{ij} v_j\right),$$
that is commonly used and numerically convenient.
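The step from the hidden-unit sum to the product of hyperbolic cosines can be checked numerically. The following is a minimal sketch, assuming Ising-like $\pm 1$ values for both visible and hidden units and the sign convention $E_\lambda = -\mathbf{a}\cdot\mathbf{v} - \mathbf{b}\cdot\mathbf{h} - \mathbf{h}^{T}\mathbf{w}\mathbf{v}$ with amplitudes $\sum_{\mathbf{h}} e^{-E_\lambda}$. The function names and the random parameters are illustrative only.

```python
import itertools
import numpy as np


def energy(v, h, a, b, w):
    """Effective RBM energy E(v, h) = -a.v - b.h - h^T w v, with w of shape (M, N)."""
    return -(a @ v) - (b @ h) - (h @ w @ v)


def amplitude_brute_force(v, a, b, w):
    """Sum exp(-E) explicitly over all 2^M Ising-like hidden configurations."""
    M = len(b)
    return sum(np.exp(-energy(v, np.array(h), a, b, w))
               for h in itertools.product([1, -1], repeat=M))


def amplitude_closed_form(v, a, b, w):
    """Closed form: prod_j exp(a_j v_j) * prod_i 2 cosh(b_i + sum_j w_ij v_j)."""
    return np.exp(a @ v) * np.prod(2 * np.cosh(b + w @ v))


rng = np.random.default_rng(1)
N, M = 4, 3
a = 0.3 * (rng.normal(size=N) + 1j * rng.normal(size=N))
b = 0.3 * (rng.normal(size=M) + 1j * rng.normal(size=M))
w = 0.3 * (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N)))

# Compare both evaluations on every visible configuration.
configs = [np.array(v) for v in itertools.product([1, -1], repeat=N)]
agree = all(np.isclose(amplitude_brute_force(v, a, b, w),
                       amplitude_closed_form(v, a, b, w)) for v in configs)
```

The brute-force sum costs $O(2^M)$ per configuration, whereas the closed form costs $O(MN)$, which is what makes the ansatz practical for sampling.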
A key property of spin-1/2 RBMs is that they can represent any generic product state of the form
$$\left(c^{(1)}_{+} |{\uparrow}\rangle + c^{(1)}_{-} |{\downarrow}\rangle\right) \otimes \cdots \otimes \left(c^{(N)}_{+} |{\uparrow}\rangle + c^{(N)}_{-} |{\downarrow}\rangle\right),$$
without hidden units by simply setting their visible biases to
$$a_j = \frac{\ln c^{(j)}_{+} - \ln c^{(j)}_{-}}{\mu - \nu}.$$
Hidden units are, thus, necessary to describe entangled states; however, unlike tensor networks, there is no direct quantitative relation between the number of hidden units and the amount of entanglement. Indeed, Ref. [31] reports that, for random NQS, adding hidden units actually decreases the amount of entanglement in a state. Nevertheless, increasing the number $M$ of hidden units increases the expressiveness of the NQS, allowing more complex correlations within $\Psi_\lambda(\mathbf{S} \mapsto \mathbf{v})$ to be encoded. Formally, the NQS ansatz can represent any arbitrary state exactly in the limit of $M \sim 2^N$ [50]. More usefully, there are important classes of states that possess highly accurate approximate or even exact NQS representations with an efficient scaling $M \sim \mathrm{poly}(N)$ [33,36,38,39,40,41]. A particularly tractable NQS hidden unit complexity is typified by the following:
Definition 1
(Compact NQS). States with an exact NQS representation where $M \le N$ will be denoted as compact.
As we demonstrated in ref. [42], important classes of state, including Jastrow, graph, and stabilizer states, all have compact NQS representations.

3.2. Tensor Network Approach

An alternative formulation of NQS views them as a tensor network. From this perspective, the amplitudes $\Psi(\mathbf{S})$ are recast as elements of an order-$N$ tensor $\Psi_{S_1 S_2 \cdots S_N}$, and this structureless tensor can be decomposed into a set of lower order tensors contracted together [1,2]. We will make repeated use of tensor network diagrams, which form an important analytical tool. In these diagrams, generic tensors of any order are represented as shapes, most often a circle ∘ color shaded to guide the eye, with protruding legs for each index they possess. Contraction of tensors is then represented by the joining of legs via graphical equations like this:
[Graphical equation: image not reproduced]
depicting $C_{abxz} = \sum_{\alpha} A_{ab\alpha} B_{x\alpha z}$. When there is a symbol inside a shape, it represents tensors with fixed elements or some specific structure. For example, the order-1 tensors with ↑ or ↓ symbols are the representation of the spin-1/2 $S^z$ basis states $|{\uparrow}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|{\downarrow}\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, while the tensor shown in Figure 2a represents $|+\rangle = |{\uparrow}\rangle + |{\downarrow}\rangle$. A special tensor we will use frequently is the COPY tensor [33,51], denoted by a dot •, which is the multidimensional generalization of the $2 \times 2$ identity matrix. Its order-3 variant is shown in Figure 2b. It possesses a number of important properties. First, contracting any COPY tensor leg with $|+\rangle$ reduces the order of the COPY tensor by 1, as shown in Figure 2c. Second, contracting any COPY tensor leg with an $S^z$ basis state factorizes it by duplicating the basis vector across all legs, as shown in Figure 2d, motivating its name. This property makes COPY tensors a useful glue for constructing sampleable tensor networks. An order-$N$ COPY tensor can itself be viewed as a representation of a GHZ state, shown in Figure 2e. Finally, Figure 2f shows that a generic diagonal matrix attached to a leg of a COPY tensor can be commuted across to any of the COPY dot’s other legs.
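The COPY tensor properties listed above are easy to verify with explicit arrays. The sketch below is an illustration using `numpy.einsum`, where `copy_tensor` is a hypothetical helper name and $|+\rangle$ is taken as the unnormalized sum $|{\uparrow}\rangle + |{\downarrow}\rangle$.

```python
import numpy as np


def copy_tensor(order, dim=2):
    """COPY tensor: 1 when all indices are equal, 0 otherwise."""
    T = np.zeros((dim,) * order)
    for s in range(dim):
        T[(s,) * order] = 1.0
    return T


up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
plus = up + down  # unnormalized |+>

copy3 = copy_tensor(3)

# Contracting one leg with |+> lowers the order by one (order-2 COPY = identity).
reduced = np.einsum('abc,c->ab', copy3, plus)

# Contracting one leg with a basis state copies that state onto the other legs.
factored = np.einsum('abc,c->ab', copy3, up)

# An order-N COPY tensor, flattened, is the (unnormalized) GHZ state.
ghz = copy_tensor(3).reshape(-1)

# A diagonal matrix on one leg can be moved to any other leg.
d = np.array([0.7, -1.3])
on_last_leg = np.einsum('abc,c->abc', copy3, d)
on_first_leg = np.einsum('abc,a->abc', copy3, d)
```

Here `reduced` is the $2 \times 2$ identity, `factored` equals the outer product of `up` with itself, and the two placements of the diagonal matrix give identical tensors.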
With these concepts in place, the bipartite graph structure of the RBM readily translates to a tensor network:
[Tensor network diagram of the NQS: image not reproduced]
in which the vertices representing the visible and hidden units are replaced by COPY tensors, and order-2 tensors, or $2 \times 2$ coupling matrices $C^{(ij)}$, are introduced on each edge of the graph. Based on Figure 2e, we can thus view each hidden unit in an NQS as contributing a locally deformed GHZ state to an amplitude-wise product. The NQS tensor network readily displays the key properties of the spin-1/2 RBM. It can trivially represent product states in Equation (7) by using one hidden unit with rank-1 coupling matrices $P$ containing the coefficients $\{c^{(j)}_{+}, c^{(j)}_{-}\}$ that subsequently factorize out as
[Graphical equation: image not reproduced]
Moreover, since the elements of coupling matrices $C^{(ij)}$ have no functional dependence on any visible variables, they manifestly display labeling freedom. Crucially, while the NQS tensor network appears to have $4MN$ variational parameters, it exhibits significant gauge freedom due to the ability to reshuffle diagonal matrices across COPY tensors, as shown in Figure 2f. Consequently, most of the elements in the coupling matrices can be extracted and combined at the COPY tensors, reducing the number of independent parameters to $N + M + NM$, identical to the RBM formulation [42]. This equivalence allows either the NQS tensor network or RBM approach to be used interchangeably, and any direct generalization of RBMs to spin-1 should retain this feature.

4. Generalization to Spin-1 Systems

A tentative definition of a spin-1 RBM can be made by simply using three-valued visible variables $v_j$ with a natural Ising-like physical to visible mapping,
$$S_j = \left\{ \begin{array}{c} \Uparrow \;\mapsto\; +1 \\ 0 \;\mapsto\; 0 \\ \Downarrow \;\mapsto\; -1 \end{array} \right\} = v_j,$$
in the standard linear energy function Equation (3). By retaining two-valued Ising-like hidden units $h_i \in \{+1, -1\}$, the amplitudes $\Psi_\lambda(\mathbf{S} \mapsto \mathbf{v}) = \sum_{\mathbf{h}} \exp[-E_\lambda(\mathbf{v}, \mathbf{h})]$ continue to be given by Equation (6), but now admitting three-valued visible variables.
This spin-1 generalization of the RBM lacks the key properties of the spin-1/2 variant. First, $\Psi_\lambda(\mathbf{S} \mapsto \mathbf{v})$ cannot easily describe a generic product spin state
$$\left(c^{(1)}_{+} |{\Uparrow}\rangle + c^{(1)}_{0} |0\rangle + c^{(1)}_{-} |{\Downarrow}\rangle\right) \otimes \cdots \otimes \left(c^{(N)}_{+} |{\Uparrow}\rangle + c^{(N)}_{0} |0\rangle + c^{(N)}_{-} |{\Downarrow}\rangle\right),$$
with coefficients $c^{(j)}_{\alpha}$, since the visible bias term $\exp(a_j v_j)$ cannot discriminate each of the local spin-1 states. This points to issues of expressiveness since we are using the same set of $MN + M + N$ parameters $\{\mathbf{a}, \mathbf{b}, \mathbf{w}\}$ to describe a bigger state space. Second, the energy function Equation (3) does not possess labeling freedom for the visible unit values. An arbitrary physical to visible mapping,
$$S_j = \left\{ \begin{array}{c} \Uparrow \;\mapsto\; \mu \\ 0 \;\mapsto\; \nu \\ \Downarrow \;\mapsto\; \omega \end{array} \right\} = \tilde{v}_j,$$
is generated from an Ising-like variable $v_j \in \{+1, 0, -1\}$ via the quadratic transformation $\tilde{v}_j = \nu + \frac{1}{2}(\mu - \omega) v_j + \frac{1}{2}(\mu + \omega - 2\nu) v_j^2$. Consequently, transforming visible variables in this way cannot be accommodated in $E_\lambda(\mathbf{v}, \mathbf{h})$ by changing only the parameters $\{\mathbf{a}, \mathbf{b}, \mathbf{w}\}$. One way to avoid these issues is to use “one-hot” or “unary” encoding, which has been successfully applied to both spin-1 [40] and bosonic systems [45,46].
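Both limitations of the naive spin-1 RBM can be made concrete with a few lines of NumPy. The sketch below first checks that the quadratic transformation maps $v = (+1, 0, -1)$ onto arbitrary labels $(\mu, \nu, \omega)$, and then shows that a purely linear bias $e^{a v}$ always yields local amplitudes $(e^{a}, 1, e^{-a})$ whose outer two multiply to the square of the middle one, so only a restricted family of local states is reachable. The numerical values are arbitrary illustrations.

```python
import numpy as np

v = np.array([1, 0, -1])  # Ising-like spin-1 visible values


def relabel(v, mu, nu, omega):
    """Quadratic map sending v = +1, 0, -1 to the labels mu, nu, omega."""
    return nu + 0.5 * (mu - omega) * v + 0.5 * (mu + omega - 2 * nu) * v ** 2


relabelled = relabel(v, mu=2.0, nu=5.0, omega=-3.0)  # gives (mu, nu, omega)

# The v**2 coefficient vanishes only if nu is the midpoint of mu and omega,
# so a generic relabelling is not a linear function of v.
quadratic_coefficient = 0.5 * (2.0 + (-3.0) - 2 * 5.0)

# With a linear bias only, the local amplitudes are (e^a, 1, e^-a).
a = 0.7 - 0.2j
amps = np.exp(a * v)
constraint = amps[0] * amps[2] / amps[1] ** 2  # equals 1 for every choice of a
```

Since `constraint` is always 1, a local state such as $c_+ = c_- = 1$, $c_0 = 2$ cannot be produced by the linear bias alone, whatever the value of $a$.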

4.1. Unary Encoding Approach

Unary encoding applies the spin-1/2 RBM formalism of Equation (4) to the larger state space of each spin-1 by mapping the physical system into one comprising a larger number of spin-1/2 particles. Specifically, the local $S^z$ basis of each spin-1 is mapped onto three spin-1/2’s as
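$$|{\Uparrow}\rangle = |{\uparrow\downarrow\downarrow}\rangle, \quad |0\rangle = |{\downarrow\uparrow\downarrow}\rangle, \quad |{\Downarrow}\rangle = |{\downarrow\downarrow\uparrow}\rangle.$$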
Physical states of the spin-1 system are now contained in the unary encoded subspace where only a single excitation occurs within any three-site unit cell, which facilitates the efficient projection of the representation. This naturally generalizes to a $d$-dimensional local state space.
By using this mapping, a spin-1/2 RBM can be applied, inheriting its labeling freedom. Owing to the single-excitation projection, product states in Equation (13) are readily described by the visible biases $a^{a}_{j}$, $a^{b}_{j}$ and $a^{c}_{j}$ associated with the three spin-1/2’s $\{a, b, c\}$ encoding a given spin-1. In general, for a $d$-dimensional local state space, unary encoding accounts for the enlarged state space by increasing the number of variational parameters to $n_{\rm params} = M + dN + dNM$ compared to the naive spin-1 RBM. However, there are some obvious deficiencies of this approach. First, unary encoding appears to unnecessarily enlarge the parameter count, as evidenced by the fact that it increases it even for $d = 2$. This will significantly increase the computational cost. Second, splitting the interaction weights $\mathbf{w}$ across a unary cell makes the interpretation of any individual hidden unit’s contribution to the physical state rather opaque.
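A minimal sketch of unary (one-hot) encoding for spin-1 configurations is given below. The particular assignment of which of the three spin-1/2's is "up" for each spin-1 state is an assumption made for illustration, as are the helper names. The only essential feature is that each three-site cell carries exactly one excitation.

```python
import numpy as np

# Assumed one-hot assignment (illustrative): each spin-1 value maps to three
# Ising-like spin-1/2 values with exactly one +1 ("up") per cell.
UNARY = {+1: (1, -1, -1), 0: (-1, 1, -1), -1: (-1, -1, 1)}


def unary_encode(spins):
    """Map a spin-1 configuration of length N to 3N Ising-like visible units."""
    return np.array([bit for s in spins for bit in UNARY[s]])


def is_valid_unary(bits, d=3):
    """True if every d-site cell contains exactly one excitation (+1)."""
    cells = np.asarray(bits).reshape(-1, d)
    return bool(np.all(np.sum(cells == 1, axis=1) == 1))


encoded = unary_encode([1, 0, -1, 0])
valid = is_valid_unary(encoded)

# A configuration outside the unary subspace: two excitations in the first cell.
invalid = is_valid_unary([1, 1, -1, -1, 1, -1])
```

During sampling, only configurations for which `is_valid_unary` returns `True` correspond to physical spin-1 states.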

4.2. Defining a Spin-1 RBM and Tensor Network

These deficiencies show that a more general RBM energy function is needed, one that is sensitive to the three-valued nature of the visible units through the inclusion of terms involving the square of visible variables (a similar approach is likely to have been used in Ref. [47] already, although it was not explicitly stated). This motivates the following definition:
Definition 2
(spin-1 NQS). We introduce a direct spin-1 RBM as the ansatz with amplitudes $\Psi_\Lambda(\mathbf{S} \mapsto \mathbf{v}) = \sum_{\mathbf{h}} \exp\left[E_\Lambda(\mathbf{v}, \mathbf{h})\right]$ via the energy function
$$E_\Lambda(\mathbf{v}, \mathbf{h}) = \sum_{j=1}^{N} a_j v_j + \sum_{j=1}^{N} A_j v_j^2 + \sum_{i=1}^{M} b_i h_i + \sum_{i=1}^{M} \sum_{j=1}^{N} w_{ij} h_i v_j + \sum_{i=1}^{M} \sum_{j=1}^{N} W_{ij} h_i v_j^2,$$
defined by the parameters $\Lambda = \{\mathbf{a}, \mathbf{b}, \mathbf{w}, \mathbf{W}, \mathbf{A}\}$.
The new contributions to a spin-1 RBM are $\mathbf{W}$, an $M \times N$-dimensional matrix of quadratic interactions, and $\mathbf{A}$, an $N$-dimensional vector of quadratic visible biases. There are now $2MN + 2N + M$ complex parameters in total. Tracing out the two-valued Ising-like hidden units gives amplitudes
$$\Psi_\Lambda(\mathbf{S} \mapsto \mathbf{v}) = \prod_{j=1}^{N} e^{a_j v_j + A_j v_j^2} \prod_{i=1}^{M} 2\cosh\left(b_i + \sum_{j=1}^{N} w_{ij} v_j + \sum_{j=1}^{N} W_{ij} v_j^2\right).$$
The inclusion of a quadratic visible bias now allows any generic spin-1 product state of the form given at the start of this section to be described without hidden units. Requiring $e^{a_j + A_j} : 1 : e^{-a_j + A_j} = c^{(j)}_{+} : c^{(j)}_{0} : c^{(j)}_{-}$ for $v_j = +1, 0, -1$ gives
$$a_j = \log\sqrt{c^{(j)}_{+} / c^{(j)}_{-}} \quad \text{and} \quad A_j = \log\left(\sqrt{c^{(j)}_{+} c^{(j)}_{-}} \,/\, c^{(j)}_{0}\right),$$
while the quadratic interaction term ensures labeling freedom for the visible variable.
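The product-state construction can be checked directly. With no hidden units, the local amplitudes are $e^{a v + A v^2}$ for $v = +1, 0, -1$, i.e. $(e^{a + A}, 1, e^{-a + A})$, and matching these to $(c_+, c_0, c_-)$ up to an overall factor gives $a = \frac{1}{2}(\log c_+ - \log c_-)$ and $A = \frac{1}{2}(\log c_+ + \log c_-) - \log c_0$. The sketch below assumes all three coefficients are nonzero. The function names are illustrative.

```python
import numpy as np


def spin1_product_biases(c_plus, c_zero, c_minus):
    """Linear and quadratic visible biases reproducing (c_+, c_0, c_-) up to scale."""
    c_plus, c_zero, c_minus = (complex(c) for c in (c_plus, c_zero, c_minus))
    a = 0.5 * (np.log(c_plus) - np.log(c_minus))
    A = 0.5 * (np.log(c_plus) + np.log(c_minus)) - np.log(c_zero)
    return a, A


def local_amplitudes(a, A):
    """Amplitudes exp(a v + A v^2) for v = +1, 0, -1."""
    v = np.array([1, 0, -1])
    return np.exp(a * v + A * v ** 2)


# Arbitrary nonzero complex coefficients for one site.
c = (0.3 + 0.4j, -0.5, 0.2j)
a, A = spin1_product_biases(*c)

# The RBM amplitudes are normalized so that the v = 0 amplitude is 1,
# so multiplying by c_0 should recover the original coefficients.
recovered = local_amplitudes(a, A) * c[1]
```

Writing the biases as sums and differences of separate logarithms means $a + A = \log c_+ - \log c_0$ and $-a + A = \log c_- - \log c_0$ hold exactly, independent of the branch chosen for the complex logarithm.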
A strong justification for Equation (16) being the appropriate spin-1 generalization of RBMs is its relation to an NQS tensor network for spin-1. To handle spin-1 systems, we introduce a COPY tensor with three-dimensional legs copying the $S^z$ basis states $\{|{\Uparrow}\rangle, |0\rangle, |{\Downarrow}\rangle\}$. Its properties are summarized in Figure 3a–f and are straightforward generalizations of the spin-1/2 case given in Figure 2a–f and discussed in Section 3.2. The major difference is that we now distinguish three-dimensional legs with ‘=’ lines instead of ‘−’. As before, the NQS tensor network follows from the conversion of the RBM graph in Figure 1, except that visible vertices are now replaced by three-dimensional COPY tensors giving:
[Tensor network diagram of the spin-1 NQS: image not reproduced]
Owing to the mixed dimensionality of the COPY tensors in this network, it now requires a rectangular $2 \times 3$ coupling matrix $C^{(ij)}$ between the $i$-th hidden and $j$-th visible unit. For the $i$-th hidden unit, its set of coupling matrices $\Upsilon^{(i)} = \{C^{(i1)}, C^{(i2)}, \dots, C^{(iN)}\}$ can be explicitly tabulated as
$$\begin{array}{cc|ccc|ccc|c|ccc}
 & & & v_1 & & & v_2 & & \cdots & & v_N & \\
 & & +1 & 0 & -1 & +1 & 0 & -1 & \cdots & +1 & 0 & -1 \\ \hline
h_i & +1 & C^{(i1)}_{++} & C^{(i1)}_{+0} & C^{(i1)}_{+-} & C^{(i2)}_{++} & C^{(i2)}_{+0} & C^{(i2)}_{+-} & \cdots & C^{(iN)}_{++} & C^{(iN)}_{+0} & C^{(iN)}_{+-} \\
 & -1 & C^{(i1)}_{-+} & C^{(i1)}_{-0} & C^{(i1)}_{--} & C^{(i2)}_{-+} & C^{(i2)}_{-0} & C^{(i2)}_{--} & \cdots & C^{(iN)}_{-+} & C^{(iN)}_{-0} & C^{(iN)}_{--}
\end{array}$$
The amplitudes $\Upsilon^{(i)}(\mathbf{v})$ of this hidden unit’s correlator then follow by summing the product of coupling matrix elements selected by $\mathbf{v}$ along each row, giving
$$\Upsilon^{(i)}(\mathbf{v}) = \sum_{h_i = \pm 1} \prod_{j=1}^{N} C^{(ij)}_{h_i v_j} = C^{(i1)}_{+ v_1} C^{(i2)}_{+ v_2} \cdots C^{(iN)}_{+ v_N} + C^{(i1)}_{- v_1} C^{(i2)}_{- v_2} \cdots C^{(iN)}_{- v_N}.$$
The amplitudes of the NQS tensor network are then the product of each of these hidden unit correlators
$$\Psi_{\rm NQS}(\mathbf{S} \mapsto \mathbf{v}) = \prod_{i=1}^{M} \Upsilon^{(i)}(\mathbf{v}),$$
which, like an RBM, can be exactly and efficiently sampled.
The spin-1 NQS tensor network appears to have $5MN$ complex variational parameters; however, again, gauge freedom allows the shuffling of diagonal matrices (of an appropriate dimension) through the COPY tensors, reducing this count. Specifically, its equivalence to the generalized spin-1 RBM proposed in Equation (16) is made using the coupling matrix decomposition
$$\begin{pmatrix} C^{(ij)}_{++} & C^{(ij)}_{+0} & C^{(ij)}_{+-} \\ C^{(ij)}_{-+} & C^{(ij)}_{-0} & C^{(ij)}_{--} \end{pmatrix} = e^{c} \begin{pmatrix} e^{\tilde{b}_{ij}} & 0 \\ 0 & e^{-\tilde{b}_{ij}} \end{pmatrix} \begin{pmatrix} e^{w_{ij} + W_{ij}} & 1 & e^{-w_{ij} + W_{ij}} \\ e^{-w_{ij} - W_{ij}} & 1 & e^{w_{ij} - W_{ij}} \end{pmatrix} \times \begin{pmatrix} e^{\tilde{a}_{ij} + \tilde{A}_{ij}} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{-\tilde{a}_{ij} + \tilde{A}_{ij}} \end{pmatrix}.$$
The solution for the weights $w_{ij}$, $W_{ij}$ and partial biases $\tilde{a}_{ij}$, $\tilde{A}_{ij}$, $\tilde{b}_{ij}$ is outlined in Appendix B, from which the full RBM biases are formed as $a_j = \sum_{i=1}^{M} \tilde{a}_{ij}$, $A_j = \sum_{i=1}^{M} \tilde{A}_{ij}$ and $b_i = \sum_{j=1}^{N} \tilde{b}_{ij}$. The spin-1 NQS tensor network, thus, reduces to $n_{\rm params} = 2MN + 2N + M$ complex parameters.
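The correspondence between the spin-1 RBM amplitudes and the product of hidden-unit correlators can be verified numerically in the RBM-to-tensor-network direction. The sketch below builds $2 \times 3$ coupling matrices from RBM parameters, with rows ordered $h = +1, -1$ and columns ordered $v = +1, 0, -1$, and compares the two evaluations on every configuration. Splitting the biases evenly as $\tilde{a}_{ij} = a_j / M$, $\tilde{A}_{ij} = A_j / M$, $\tilde{b}_{ij} = b_i / N$ and dropping the overall scale $e^{c}$ are simplifying assumptions made here, since any split with the correct sums would do. The function names are illustrative.

```python
import itertools
import numpy as np


def rbm_amplitude(v, a, A, b, w, W):
    """Spin-1 RBM amplitude with linear and quadratic biases and weights."""
    v = np.asarray(v)
    return np.exp(a @ v + A @ v ** 2) * np.prod(2 * np.cosh(b + w @ v + W @ v ** 2))


def coupling_matrix(b_t, w, W, a_t, A_t):
    """2x3 coupling matrix: rows h = +1, -1; columns v = +1, 0, -1."""
    left = np.diag([np.exp(b_t), np.exp(-b_t)])
    core = np.array([[np.exp(w + W), 1.0, np.exp(-w + W)],
                     [np.exp(-w - W), 1.0, np.exp(w - W)]])
    right = np.diag([np.exp(a_t + A_t), 1.0, np.exp(-a_t + A_t)])
    return left @ core @ right


def tn_amplitude(v, C):
    """Product over hidden units of correlators built from coupling matrices C[i][j]."""
    col = {1: 0, 0: 1, -1: 2}
    amp = 1.0
    for Ci in C:
        plus = np.prod([Ci[j][0, col[vj]] for j, vj in enumerate(v)])
        minus = np.prod([Ci[j][1, col[vj]] for j, vj in enumerate(v)])
        amp *= plus + minus
    return amp


rng = np.random.default_rng(2)
N, M = 3, 2
rand = lambda *s: 0.3 * (rng.normal(size=s) + 1j * rng.normal(size=s))
a, A, b, w, W = rand(N), rand(N), rand(M), rand(M, N), rand(M, N)

# Even split of the biases across edges (one valid choice among many).
C = [[coupling_matrix(b[i] / N, w[i, j], W[i, j], a[j] / M, A[j] / M)
      for j in range(N)] for i in range(M)]

configs = list(itertools.product([1, 0, -1], repeat=N))
agree = all(np.isclose(rbm_amplitude(v, a, A, b, w, W), tn_amplitude(v, C))
            for v in configs)
```

The check passes for all $3^N$ configurations, reflecting the fact that the diagonal factors can be gathered at the COPY tensors to form the full biases.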
This correspondence between the spin-1 RBM and the spin-1 NQS tensor network highlights an advantage over unary encoding. Specifically, coupling matrices provide an intuitive tool for engineering the correlations and structures that a given hidden unit imprints on the amplitudes of an NQS. A trivial case is when all the elements of the j-th coupling matrix are 1’s, denoted generically as I , equivalent to the hidden unit being disconnected from that visible unit. A more complex example with conditional correlations is a hidden unit with coupling matrices
{ I , , I , C c k , C 0 , , C 0 } ,
built from
C c = 0 1 1 1 0 0 and C 0 = 1 1 1 1 1 1 .
This hidden unit generates a correlator Υ ( v ) = δ v k , + δ v k , ( 1 ) C ( v ) , where a factor ( 1 ) C ( v ) is introduced conditional on the kth spin being in the state , with C ( v ) being the number of 0 and states in the configuration v between spins k + 1 and N. Similar types of hidden units will be used extensively in Section 5.1 to construct an exact representation of a state.

4.3. Projection of Unary Encoding into a Spin-1 RBM

The tensor network formalism provides further evidence that unary encoding from Section 4.1 is an over-parameterization of RBMs for spin-1 systems, with $\delta n_{\rm params} = N(1 + M)$ redundant parameters. Unary projection is implemented by an order-4 tensor $U$ obeying:
[Graphical equation defining the projection tensor U: image not reproduced]
By directly applying this projector to the spin- 1 2 NQS tensor network for a unary encoded state and performing a graphical rewrite, we obtain the spin-1 NQS introduced in Equation (18). Figure 4 summarizes the crucial manipulations required.
In Figure 4a, some representative examples of contractions between the projection tensor $U$ and hidden units are shown. There are three steps to rewriting the network. The first step, shown in Figure 4b, essentially pulls $U$ through the three unary two-dimensional COPY tensors, leaving behind a single three-dimensional COPY tensor representing the physical spin-1. Two important cases are shown in the example in Figure 4b. A hidden unit may have connections to each of the unary spin-1/2’s, whereupon they get bundled up by the $U$ tensor. Alternatively, a hidden unit may connect to only a subset of the unary spin-1/2’s, which is handled by plugging the unused legs of the $U$ tensor with $|+\rangle$. If a hidden unit couples to more than one of the unary spins, then the second step, shown in Figure 4c, involves splitting the hidden unit’s two-dimensional COPY tensor to separate those connections. The final step is then to contract the split COPY tensor, coupling matrices and the projection $U$ to form a rectangular $2 \times 3$ coupling matrix, as depicted in Figure 4d. If a hidden unit has connections exclusively within the unary spin-1/2’s, then it becomes an entirely local visible bias contribution in the spin-1 NQS.
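The parameter counts quoted in this section can be compared with a short calculation. The sketch below encodes the three counting formulas stated in the text ($MN + M + N$ for the spin-1/2 RBM, $M + dN + dNM$ for unary encoding, and $2MN + 2N + M$ for the spin-1 RBM) and confirms that the unary encoding carries $N(1 + M)$ more parameters than the spin-1 RBM. The function names are illustrative.

```python
def n_params_rbm(N, M):
    """Standard spin-1/2 RBM: visible biases, hidden biases, weights."""
    return M * N + M + N


def n_params_unary(N, M, d=3):
    """Unary-encoded RBM for a d-dimensional local state space."""
    return M + d * N + d * N * M


def n_params_spin1(N, M):
    """Spin-1 RBM with additional quadratic biases and quadratic weights."""
    return 2 * M * N + 2 * N + M


# Redundancy of unary encoding relative to the spin-1 RBM for a few sizes.
redundancy = {(N, M): n_params_unary(N, M) - n_params_spin1(N, M)
              for N in (4, 8, 16) for M in (1, 8, 32)}

# Even for d = 2, unary encoding has more parameters than the plain RBM.
d2_excess = n_params_unary(8, 8, d=2) - n_params_rbm(8, 8)
```

For example, with $N = 8$ and $M = 8$ the unary encoding uses 224 parameters against 152 for the spin-1 RBM, a difference of $N(1 + M) = 72$.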

4.4. Change of Local Spin Basis

For tensor network representations, such as MPS or PEPS, the complexity (internal bond dimension) of a given state’s description is rooted in its entanglement structure. As such, changing the local basis of the spins used in a calculation has no effect on this complexity. Moreover, transforming a representation from one basis to another is accomplished by simply admixing the local tensors. Within VMC, a change of local spin basis leaves the locality and the sparsity of the Hamiltonian essentially unchanged. However, it is pivotal to the method that the amplitudes $\Psi_{\mathbf{p}}(\mathbf{S})$ of whatever ansatz is used can be efficiently evaluated in this new basis. This is not generally true of NQS since their sampleability is intimately tied to the basis that factorizes the COPY tensors they are built from.
To understand how NQS behave, consider a representation of some state $|\Psi\rangle$, for instance, of four spin-1’s sampleable in the $S^z$ basis
[Tensor network diagram of a four-spin NQS: image not reproduced]
Now, suppose we transform the local $S^z$ basis $|S\rangle$ to a new basis $|\chi\rangle$ via a unitary $|\chi_S\rangle = \hat{B}|S\rangle$. Formally, from a tensor network perspective, we find the $\chi$ basis NQS representation by sandwiching $\mathbb{1} = \hat{B}\hat{B}^\dagger$ on each physical leg and computing the $S^z$ basis NQS representation of $(\hat{B}^\dagger)^{\otimes N}|\Psi\rangle$. However, currently, there is no known procedure for updating exactly an NQS after the application of an arbitrary single-spin unitary, even allowing for an increased number of hidden units. While we are guaranteed that an NQS representation of $(\hat{B}^\dagger)^{\otimes N}|\Psi\rangle$ exists, as illustrated here schematically
[Schematic tensor network diagram: image not reproduced]
there is no guarantee it will be efficient, even if the representation of $|\Psi\rangle$ was originally. An example of such a catastrophic loss of NQS efficiency has been presented by Gao and Duan [8]. They show that, so long as the polynomial hierarchy in computational complexity theory does not collapse, a two-dimensional cluster state, which has an efficient NQS representation, has no efficient NQS representation after a specific layer of translation-invariant single-spin unitaries is applied. Thus, NQS complexity depends non-trivially on the local spin basis used.
Motivated by this, we will consider NQS calculations in two different local spin-1 bases to examine how the complexity varies, specifically, the standard S z basis and the x y z basis defined as
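$$ |x\rangle = \frac{1}{\sqrt{2}}\left(|\Downarrow\rangle - |\Uparrow\rangle\right), \quad |y\rangle = \frac{i}{\sqrt{2}}\left(|\Downarrow\rangle + |\Uparrow\rangle\right), \quad |z\rangle = |0\rangle . $$
In the $xyz$ basis, the individual spin-1 operators all acquire the same off-diagonal form
$$ \hat{S}^x_j = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \hat{S}^y_j = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}, \quad \hat{S}^z_j = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, $$
as a consequence of them all contributing one eigenstate to the basis. On a practical level, using the $xyz$ basis for the spin-1 RBM in Equation (17) simply requires replacing $S$ with $\alpha$ and choosing a physical-visible mapping $\alpha \to v$, such as
$$ \alpha_j = \left\{ \begin{array}{c} x \to +1 \\ y \to 0 \\ z \to -1 \end{array} \right\} = v_j . $$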
Equivalently, the visible unit COPY tensors for this NQS tensor network can be considered to be rotated into this x y z basis.
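The basis change above can be checked numerically. The following is a minimal sketch, assuming the standard spin-1 matrices in the $S^z$ basis ordered as $(\Uparrow, 0, \Downarrow)$ and the phase convention $|x\rangle = (|\Downarrow\rangle - |\Uparrow\rangle)/\sqrt{2}$, $|y\rangle = i(|\Downarrow\rangle + |\Uparrow\rangle)/\sqrt{2}$, $|z\rangle = |0\rangle$; the names `U` and `to_xyz` are illustrative only.

```python
import numpy as np

# Spin-1 operators in the S^z basis, ordered (Uparrow, 0, Downarrow).
s = 1 / np.sqrt(2)
Sx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Sy = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Sz = np.diag([1, 0, -1]).astype(complex)

# Columns of U are |x>, |y>, |z> written in the S^z basis (assumed phases).
up, zero, down = np.eye(3, dtype=complex)
U = np.column_stack([s * (down - up), 1j * s * (down + up), zero])

def to_xyz(op):
    """Return the matrix elements <a|op|b> with a, b running over (x, y, z)."""
    return U.conj().T @ op @ U

Sx_xyz, Sy_xyz, Sz_xyz = (to_xyz(op) for op in (Sx, Sy, Sz))
print(np.round(Sz_xyz, 12))
```

With this convention each operator is purely off-diagonal and annihilates the basis state carrying its own label, for instance $\hat{S}^z|z\rangle = 0$.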

4.5. Numerical Example—Spin-1 Anti-Ferromagnetic Heisenberg Model

To confirm the effectiveness of our spin-1 NQS, we performed VMC calculations to reach the ground state of the well-known anti-ferromagnetic Heisenberg (AFH) model in one dimension. The Hamiltonian is given by
$$ \hat{H} = J \sum_{i} \hat{\mathbf{S}}_i \cdot \hat{\mathbf{S}}_{i+1}, $$
where $J > 0$ is the magnetic interaction strength and with periodic boundary conditions $N+1 \equiv 1$. We focus on small systems allowing direct comparison to the ground state calculated from exact diagonalization via the overlap $O = |\langle \Psi_{\rm ED} | \Psi_{\Lambda} \rangle|^2$. Additionally, we performed NQS optimizations in both the $S^z$ and $xyz$ bases to compare any differences in performance. This basis change alters the fixed quantum numbers of the system. In particular, in the $S^z$ basis, the AFH model preserves the total $S^z$ projection of the system, and its ground state lies in the $\sum_{j=1}^{N} S^z_j = 0$ sector of the full Hilbert space, while, in the $xyz$ basis, the AFH model preserves the “parity” of the total $x$, $y$, $z$ spin populations in the system, and, for even $N$, the ground state lies in the subspace of configurations where there is an even number of each basis state. As is common for VMC calculations, we only select configurations from the relevant subspace during sampling. We perform the calculations with hidden unit numbers $M = [1, 2, \dots, 2N]$, initializing $M = 1$ with random small complex parameters. For each successive calculation, we seed the NQS with the optimized parameters for $M-1$ and initialize the $M$th hidden unit with random small parameters, gradually increasing the size of the network in a sequential manner. To check the robustness of the qualitative features we have discussed against initialization bias, we rerun optimization sequences 5 to 10 times and present the best results here.
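The exact-diagonalization reference and the overlap $O$ described above can be sketched for a very small ring. This is an illustrative dense-matrix construction, not the authors' code; the helper names `site_op`, `heisenberg_ring` and `overlap` are hypothetical, and the approach is only practical for small $N$ because the dimension grows as $3^N$.

```python
import numpy as np
from functools import reduce

s = 1 / np.sqrt(2)
SX = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
SY = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
SZ = np.diag([1, 0, -1]).astype(complex)
OPS = (SX, SY, SZ)

def site_op(op, j, n):
    """Embed a single-site operator at site j of an n-site chain."""
    mats = [np.eye(3, dtype=complex)] * n
    mats[j] = op
    return reduce(np.kron, mats)

def heisenberg_ring(n, J=1.0):
    """Dense spin-1 AFH Hamiltonian with periodic boundary conditions."""
    H = np.zeros((3**n, 3**n), dtype=complex)
    for j in range(n):
        k = (j + 1) % n
        for op in OPS:
            H += J * site_op(op, j, n) @ site_op(op, k, n)
    return H

def overlap(psi_a, psi_b):
    """O = |<a|b>|^2 for two (possibly unnormalized) state vectors."""
    a = psi_a / np.linalg.norm(psi_a)
    b = psi_b / np.linalg.norm(psi_b)
    return abs(np.vdot(a, b)) ** 2

n = 4
H = heisenberg_ring(n)
energies, states = np.linalg.eigh(H)
E0, psi0 = energies[0], states[:, 0]
Sz_total = sum(site_op(SZ, j, n) for j in range(n))
print(E0)  # ground state energy of the N = 4 ring in units of J
```

For $N = 4$ the ring Hamiltonian equals $J(\hat{\mathbf{S}}_1 + \hat{\mathbf{S}}_3)\cdot(\hat{\mathbf{S}}_2 + \hat{\mathbf{S}}_4)$, so the ground state energy is $-6J$, which gives a quick sanity check of the construction.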
In Figure 5, we show how the accuracy of the spin-1 NQS and unary encoding representations improves with an increasing number of hidden units $M$ for both the $S^z$ and $xyz$ bases, plotted in terms of the variational parameter count. The spin-1 NQS achieves a superior accuracy to unary encoding for a similar $n_{\rm params}$. The inset of Figure 5a shows the collapse of the same $S^z$ data plotted against $M$, indicating that the spin-1 NQS and unary encoding have in fact located the same solution for a given $M$. However, by using $\delta n_{\rm params} = N(1+M)$ fewer parameters, the spin-1 NQS is considerably more efficient to optimize, especially noting that $\delta n_{\rm params}$ scales with both system size and hidden unit number (for example, the $N = 12$, $M = 12$ spin-1 NQS has a comparable parameter count to an $N = 12$, $M = 8$ unary NQS). In Figure 5b, we observe a noticeable drop in accuracy for both NQS variants in the $xyz$ basis compared to the $S^z$ basis. This suggests that the AFH ground state amplitude structure with periodic boundaries is inherently more complicated in the $xyz$ basis regardless of encoding. Moreover, this confirms that the hidden unit number $M$ of an NQS is a basis-dependent quantity and cannot be used as a proxy for the entanglement.
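The parameter saving quoted above can be reproduced with a short count. This sketch assumes a particular bookkeeping that is not spelled out in this section: a linear and a quadratic weight per visible-hidden pair plus a linear and a quadratic bias per visible unit for the spin-1 RBM, and a standard spin-1/2 RBM on $3N$ one-hot visible units for the unary encoding, each with one bias per hidden unit. Both function names are hypothetical.

```python
def n_params_spin1(n, m):
    # Assumed count: 2 weights per (visible, hidden) pair,
    # 2 biases per visible unit, 1 bias per hidden unit.
    return 2 * n * m + 2 * n + m

def n_params_unary(n, m):
    # Assumed count: standard spin-1/2 RBM on 3N one-hot visible units.
    return 3 * n * m + 3 * n + m

n = 12
print(n_params_spin1(n, 12))  # 324
print(n_params_unary(n, 8))   # 332
print(n_params_unary(n, 12) - n_params_spin1(n, 12))  # 156 = N(1 + M)
```

Under these assumptions the difference is exactly $N(1+M)$, and the two example networks from the text differ by only eight parameters.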

5. Revisiting the AKLT Model

We now move on to benchmark our spin-1 NQS against the analytically solvable AKLT model [52], which is a spin-1 chain governed by a bilinear-biquadratic SU(2)-isotropic Heisenberg Hamiltonian of the form
$$ \hat{H}_{\rm AKLT} = \sum_{j=1}^{N} \left[ \hat{\mathbf{S}}_j \cdot \hat{\mathbf{S}}_{j+1} + \beta \left( \hat{\mathbf{S}}_j \cdot \hat{\mathbf{S}}_{j+1} \right)^2 + \frac{2}{3} \right], $$
with periodic boundary conditions $N+1 \equiv 1$. It has special significance since it was the first solvable spin-1 chain model that exhibits the ‘Haldane gap’ [53]. The AKLT state $|\Psi_{\rm AKLT}\rangle$ is the ground state of $\hat{H}_{\rm AKLT}$ at the AKLT point $\beta = \frac{1}{3}$, and has an energy of exactly zero.
As is well-known, $|\Psi_{\rm AKLT}\rangle$ has a special structure of correlations which are related to a valence bond solid. Specifically, each spin-1 is envisaged as being a pair of spin-$\frac{1}{2}$ particles, each of which is entangled in a singlet state with a partner spin-$\frac{1}{2}$ in the nearest neighboring spin-1 on the chain. The AKLT state is then the projection $P$ of the local pairs of spin-$\frac{1}{2}$ particles into the triplet subspace, as depicted in Figure 6. This also leads to $|\Psi_{\rm AKLT}\rangle$ possessing a very compact MPS representation with matrices
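$$ A^{+} = \frac{1}{\sqrt{2}} \hat{\sigma}^{+}, \quad A^{0} = -\frac{1}{2} \hat{\sigma}^{z}, \quad A^{-} = -\frac{1}{\sqrt{2}} \hat{\sigma}^{-}, $$
where $\hat{\sigma}^{\pm} = \frac{1}{2}(\hat{\sigma}^{x} \pm i \hat{\sigma}^{y})$, such that the (unnormalized) amplitudes of the ground state in the $\hat{S}^z$ basis follow as
$$ \Psi_{\rm AKLT}(S) = \mathrm{tr}\left( A^{S_1} A^{S_2} \cdots A^{S_N} \right). \tag{26} $$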
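As a sanity check of the MPS form, one can build the amplitudes for a short ring and confirm that the resulting vector is annihilated by $\hat{H}_{\rm AKLT}$ at $\beta = 1/3$. The sketch below assumes the sign convention $A^{+} = \hat\sigma^{+}/\sqrt{2}$, $A^{0} = -\hat\sigma^{z}/2$, $A^{-} = -\hat\sigma^{-}/\sqrt{2}$ with the local basis ordered as $(+, 0, -)$; the helper names are illustrative.

```python
import numpy as np
from functools import reduce
from itertools import product

s = 1 / np.sqrt(2)
SX = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
SY = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
SZ = np.diag([1, 0, -1]).astype(complex)

sig_p = np.array([[0, 1], [0, 0]], dtype=complex)
sig_m = np.array([[0, 0], [1, 0]], dtype=complex)
sig_z = np.diag([1, -1]).astype(complex)
# MPS matrices for S^z = +1, 0, -1 (assumed sign convention).
A = [s * sig_p, -0.5 * sig_z, -s * sig_m]

def aklt_state(n):
    """Unnormalized amplitudes tr(A^{S_1} ... A^{S_N}) for all 3^N configurations."""
    configs = product(range(3), repeat=n)
    return np.array([np.trace(reduce(np.matmul, [A[c] for c in cfg])) for cfg in configs])

def site_op(op, j, n):
    mats = [np.eye(3, dtype=complex)] * n
    mats[j] = op
    return reduce(np.kron, mats)

def aklt_ring(n, beta=1 / 3):
    """Dense bilinear-biquadratic Hamiltonian with periodic boundaries."""
    dim = 3**n
    H = np.zeros((dim, dim), dtype=complex)
    for j in range(n):
        k = (j + 1) % n
        bond = sum(site_op(op, j, n) @ site_op(op, k, n) for op in (SX, SY, SZ))
        H += bond + beta * bond @ bond + (2 / 3) * np.eye(dim)
    return H

n = 4
psi = aklt_state(n)
H = aklt_ring(n)
residual = np.linalg.norm(H @ psi) / np.linalg.norm(psi)
print(residual)  # numerically zero if psi is the zero-energy ground state
```

Flipping the overall sign of $A^{0}$ or conjugating all matrices by $\hat\sigma^{z}$ describes the same state up to a global sign, so the check is insensitive to those gauge choices.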
Since the AKLT point of H ^ AKLT lies in the gapped Haldane phase, Ψ AKLT has finite-ranged magnetic correlations,
$$ O^{zz}_{\ell} = \langle \psi_{\rm AKLT} | \hat{S}^z_0 \hat{S}^z_\ell | \psi_{\rm AKLT} \rangle \sim e^{-\ell/\xi}, \quad \text{with} \quad \xi = \frac{1}{\ln(3)}, $$
yet it also has an unbroken spin rotation symmetry which is a hallmark of a symmetry protected topological order. Specifically, the string-order parameter
$$ O_{\rm string} = \lim_{N \to \infty} \lim_{\ell \to \infty} \langle \psi_{\rm AKLT} | \hat{S}^z_0 \prod_{j=1}^{\ell-1} e^{i \pi \hat{S}^z_j} \hat{S}^z_\ell | \psi_{\rm AKLT} \rangle \to -\frac{4}{9}, $$
reveals the presence of infinite-ranged anti-ferromagnetic correlations. This is evident from the structure of the MPS amplitudes. Any matrix product $A^{\pm} A^{0} \cdots A^{0} A^{\pm} = 0$, so any configuration containing a ferromagnetic segment, like “+ 0 0 0 +”, with any number of 0’s is not allowed. In contrast, allowed configurations contain only anti-ferromagnetic segments, such as “– 0 + 0 0 0 – + 0”, arising from sequences, like $A^{\pm} A^{0} \cdots A^{0} A^{\mp}$, where every ± is partnered with a ∓ separated by an arbitrary string of 0’s.
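The selection rule on configurations can be verified by brute force on a short ring. This is a minimal sketch under the same assumed sign convention for the MPS matrices as above; `amplitude` and `is_antiferro` are hypothetical helper names, and configurations are written as strings over the characters `+`, `0`, `-`.

```python
import numpy as np
from functools import reduce
from itertools import product

s = 1 / np.sqrt(2)
sig_p = np.array([[0, 1], [0, 0]], dtype=complex)
sig_m = np.array([[0, 0], [1, 0]], dtype=complex)
sig_z = np.diag([1, -1]).astype(complex)
A = {"+": s * sig_p, "0": -0.5 * sig_z, "-": -s * sig_m}

def amplitude(config):
    """tr(A^{S_1} ... A^{S_N}) for a configuration on a periodic ring."""
    return np.trace(reduce(np.matmul, [A[c] for c in config]))

def is_antiferro(config):
    """True if the non-zero entries alternate in sign around the ring."""
    t = [c for c in config if c != "0"]
    if not t:
        return len(config) % 2 == 0  # all-zero string survives only for even N
    return len(t) % 2 == 0 and all(t[i] != t[(i + 1) % len(t)] for i in range(len(t)))

print(abs(amplitude("+00+0-")))     # ferromagnetic segment: vanishes
print(abs(amplitude("-0+000-+0")))  # alternating example from the text: 1/128
```

The second value follows from four factors of $1/\sqrt{2}$ and five factors of $1/2$ multiplying a trace of unit magnitude.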
Despite its simple MPS representation, it is surprisingly non-trivial to capture the non-commutative matrix products making up the AKLT amplitudes with an NQS. Direct conversion to an NQS from the MPS representation gives two reasons why it must contain long-ranged hidden units. First, it has been shown [38] that any short-ranged translationally invariant NQS cast into an MPS form by mapping hidden units into virtual bonds has $A$ matrices that are at most rank-1. Since the matrix $A^{0}$ in the AKLT state is rank-2, it fails this condition. Second, if we divide the chain into a sequence of three contiguous parts $a, b, c$, then once we make $b$ larger than the longest range of any hidden unit, so that no hidden unit connects to visible units in both $a$ and $c$, the NQS amplitudes factorize as
$$ \psi^{\rm sr}_{\rm NQS}(S_a, S_b, S_c) = \psi_{ab}(S_a, S_b)\, \psi_{bc}(S_b, S_c), $$
implying that $S_a$ and $S_c$ are uncorrelated [34]. The AKLT amplitudes $\Psi_{\rm AKLT}(S)$ do not satisfy this property since region $b$ can be any length of 0’s, and there will always be non-zero amplitudes “+ 0 0 0 ... 0 –” and “– 0 0 0 ... 0 +” encoding string order correlations that are not factorizable.
It has been previously found that the AKLT state in the $S^z$ basis requires an NQS with $M \sim O(N^2)$ long-ranged hidden units [36], and this was borne out in numerical calculations for small systems. In Appendix C, we explicitly construct an NQS for the $S^z$ basis AKLT amplitudes using $M = 2N^2 + N \lfloor \frac{1}{2}(N-1) \rfloor + 1$ hidden units, many of which are extensive over the system. The $O(N^2)$ scaling can be readily understood as a consequence of having hidden units that each eliminate disallowed configurations, such as “± 0 0 0 ±”, and impose the sign for allowed configurations, such as “± 0 0 0 ∓”, for all $N$ separations and $N$ translations over the system. This is rather less efficient than the compact spin-$\frac{1}{2}$ NQS found for Jastrow, graph and stabilizer states in ref. [42]. The AKLT state can be expressed with $O(N)$ hidden units but at the expense of needing a 2-layer DBM network [38] that cannot in general be exactly sampled, complicating its use numerically. However, as we saw for the AFH model numerical results, the hidden unit complexity is basis dependent. Surprisingly, we will show next that an exact $M \sim O(N)$ spin-1 NQS representation of the AKLT state is obtained in the $xyz$ basis.

5.1. Exact Spin-1 NQS for AKLT State in the x y z Basis

The AKLT state provides an instructive example of how a single spin basis change can significantly alter the amplitude structure. Transforming the MPS representation into the x y z basis yields matrices
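$$ B^{x} = -\frac{1}{\sqrt{2}} \left( A^{+} - A^{-} \right) = -\frac{1}{2} \hat{\sigma}^{x}, \quad B^{y} = -\frac{i}{\sqrt{2}} \left( A^{+} + A^{-} \right) = \frac{1}{2} \hat{\sigma}^{y}, \quad B^{z} = A^{0} = -\frac{1}{2} \hat{\sigma}^{z}, $$
and, thus, renders the amplitudes into products of Pauli matrices
$$ \Psi_{\rm AKLT}(\alpha) = \mathrm{tr}\left( B^{\alpha_1} B^{\alpha_2} \cdots B^{\alpha_N} \right), $$
where $\alpha = (\alpha_1, \dots, \alpha_N)$ with $\alpha_j \in \{x, y, z\}$ labels the $xyz$ basis. As expected, there is no change in the complexity/internal dimension of the MPS representation.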
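The admixing of local tensors can be checked against a brute-force rotation of the full state vector. This sketch assumes the sign and phase conventions $A^{+} = \hat\sigma^{+}/\sqrt{2}$, $A^{0} = -\hat\sigma^{z}/2$, $A^{-} = -\hat\sigma^{-}/\sqrt{2}$ and $|x\rangle = (|\Downarrow\rangle - |\Uparrow\rangle)/\sqrt{2}$, $|y\rangle = i(|\Downarrow\rangle + |\Uparrow\rangle)/\sqrt{2}$, $|z\rangle = |0\rangle$; with other phase choices the individual signs of the $B$ matrices change, but they remain proportional to Pauli matrices.

```python
import numpy as np
from functools import reduce
from itertools import product

s = 1 / np.sqrt(2)
sig_p = np.array([[0, 1], [0, 0]], dtype=complex)
sig_m = np.array([[0, 0], [1, 0]], dtype=complex)
sig_x = np.array([[0, 1], [1, 0]], dtype=complex)
sig_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sig_z = np.diag([1, -1]).astype(complex)

# S^z-basis MPS matrices, ordered (Uparrow, 0, Downarrow).
A = np.array([s * sig_p, -0.5 * sig_z, -s * sig_m])

# Rows are the kets |x>, |y>, |z> in the S^z basis; V[a, S] = <a|S>.
up, zero, down = np.eye(3, dtype=complex)
kets = np.array([s * (down - up), 1j * s * (down + up), zero])
V = kets.conj()

# Admix the local tensors: B^a = sum_S <a|S> A^S.
B = np.einsum("as,sij->aij", V, A)

def mps_amplitudes(mats, n):
    configs = product(range(3), repeat=n)
    return np.array([np.trace(reduce(np.matmul, [mats[c] for c in cfg])) for cfg in configs])

n = 4
psi_z = mps_amplitudes(A, n)
psi_xyz_direct = mps_amplitudes(B, n)
psi_xyz_rotated = reduce(np.kron, [V] * n) @ psi_z
print(np.allclose(psi_xyz_direct, psi_xyz_rotated))
```

The bond dimension stays at two throughout, in line with the statement that the basis change leaves the MPS complexity untouched.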
The structure of the amplitudes $\Psi_{\rm AKLT}(\alpha)$ in the $xyz$ basis is significantly simpler than $\Psi_{\rm AKLT}(S)$ in the $S^z$ basis. Amplitudes are now evaluated by tracking the anticommutations of Pauli matrices required to make the matrices of each type form a contiguous sequence, e.g., $x \cdots x\, y \cdots y\, z \cdots z$, and then reducing the product repeatedly via $(\hat{\sigma}^{\alpha})^2 = \mathbb{1}$. The resulting matrix trace is non-zero only when the overall product is proportional to $\mathbb{1}$, and so all non-zero amplitudes have an equal magnitude. Depending on whether $N$ is even or odd, this condition requires that there is either an even or odd number of $x$, $y$ and $z$’s in any configuration string, respectively. Using this, we arrive at the following result:
Theorem 1
(AKLT state x y z NQS). The AKLT state in the x y z spin basis has an exact spin-1 NQS representation requiring M = 2 N hidden units.
Proof. 
We establish this result using a direct and intuitive construction for Ψ AKLT ( α ) in which hidden units are devised to implement the nodal structure and sign structure of this state. The rules governing the amplitudes are as follows:
  • To implement the parity constraint on the number of $x$, $y$ and $z$’s in any configuration string $\alpha$, we introduce the following $2 \times 3$ coupling matrices:
    $$ C_{xy} = \begin{pmatrix} 1 & 1 & 1 \\ -1 & -1 & 1 \end{pmatrix}, \quad C_{yz} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & -1 \end{pmatrix}. $$
    By defining two hidden units from these matrices as $\Upsilon_{xy} = \{C_{xy}, C_{xy}, \dots, C_{xy}\}$ and $\Upsilon_{yz} = \{C_{yz}, C_{yz}, \dots, C_{yz}\}$, we arrive at the product filter
    $$ \Upsilon_{xy}(\alpha)\, \Upsilon_{yz}(\alpha) = \left[ 1 + (-1)^{\# x\text{'s} + \# y\text{'s}} \right] \left[ 1 + (-1)^{\# y\text{'s} + \# z\text{'s}} \right], $$
    in which the hidden units cancel out any strings $\alpha$ that have an odd combined number of $x$’s and $y$’s, or of $y$’s and $z$’s, respectively. Together, these hidden units completely establish for any $N$ the nodal structure of the AKLT state amplitudes in this basis.
  • To reproduce the sign structure arising from anticommuting Pauli matrices into a contiguous sequence, we require two types of hidden units. The first type of hidden unit uses a conditional coupling matrix for the local state $x$
    $$ C_{x} = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix}, $$
    along with $C_{yz}$ to define a hidden unit of the form
    $$ \Upsilon_{x}^{(k)} = \{ C_{yz}, \dots, C_{yz}, \underbrace{C_{x}}_{k}, I, \dots, I \}, $$
    where $C_{x}$ appears in the $k$th position in the sequence. The action of $\Upsilon_{x}^{(k)}$ is to induce on a configuration a factor $(-1)^{\# y\text{'s} + \# z\text{'s}}$, counted between site $k-1$ and the left boundary, conditional on site $k$ being in state $x$. This is the sign that would occur if a $\hat{\sigma}^{x}$ matrix was anticommuted to this boundary through the corresponding product of Pauli matrices. The second type of hidden unit uses two further coupling matrices
    $$ C_{y} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & 1 \end{pmatrix}, \quad C_{z} = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, $$
    defining a hidden unit of the form
    $$ \Upsilon_{z}^{(k)} = \{ I, \dots, I, \underbrace{C_{z}}_{k}, C_{y}, \dots, C_{y} \}, $$
    where $C_{z}$ appears in the $k$th position in the sequence. The action of $\Upsilon_{z}^{(k)}$ is to induce on a configuration a factor $(-1)^{\# y\text{'s}}$, counted between site $k+1$ and the right boundary, conditional on site $k$ being in state $z$. This is the sign that would occur if a $\hat{\sigma}^{z}$ matrix was anticommuted to the right boundary through the corresponding product of Pauli matrices, assuming that any $\hat{\sigma}^{x}$’s have already been anticommuted to the left boundary. Capturing all locations $k$ for both types thus requires $2(N-1)$ hidden units, which entirely establish the sign structure of the AKLT state amplitudes in this basis.
This gives a total of M = 2 N hidden units. □
The resulting amplitude-wise product decomposition of the AKLT state into hidden unit correlators is depicted in Figure 7 for $N = 6$.
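The construction in the proof can be tested by enumerating all $3^N$ configurations for small $N$. The sketch below is an illustration rather than the authors' implementation. It assumes that the columns of each coupling matrix are ordered $(x, y, z)$, that the rows correspond to the two hidden unit values, that $I$ denotes the $2 \times 3$ matrix of ones, and that a hidden unit contributes the sum over its two values of the product of the selected matrix elements. The MPS matrices are taken as $B^{x} = -\hat\sigma^{x}/2$, $B^{y} = \hat\sigma^{y}/2$, $B^{z} = -\hat\sigma^{z}/2$ (assumed signs).

```python
import numpy as np
from functools import reduce
from itertools import product

ONES = np.ones((2, 3))  # assumed meaning of "I": a trivial coupling
C_xy = np.array([[1, 1, 1], [-1, -1, 1]], dtype=float)
C_yz = np.array([[1, 1, 1], [1, -1, -1]], dtype=float)
C_x = np.array([[0, 1, 1], [1, 0, 0]], dtype=float)
C_y = np.array([[1, 1, 1], [1, -1, 1]], dtype=float)
C_z = np.array([[1, 1, 0], [0, 0, 1]], dtype=float)

def hidden_units(n):
    """List of hidden units, each given as n coupling matrices (one per site)."""
    units = [[C_xy] * n, [C_yz] * n]                       # nodal structure
    for k in range(1, n):                                  # sigma^x moved left
        units.append([C_yz] * k + [C_x] + [ONES] * (n - k - 1))
    for k in range(n - 1):                                 # sigma^z moved right
        units.append([ONES] * k + [C_z] + [C_y] * (n - k - 1))
    return units

def nqs_amplitude(config, units):
    amp = 1.0
    for mats in units:
        rows = np.array([m[:, v] for m, v in zip(mats, config)])
        amp *= rows.prod(axis=0).sum()  # sum over the two hidden values
    return amp

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.diag([1, -1]).astype(complex)]
B = [-0.5 * sig[0], 0.5 * sig[1], -0.5 * sig[2]]

def compare(n):
    """Return NQS and MPS amplitudes over all configurations of length n."""
    units = hidden_units(n)
    configs = list(product(range(3), repeat=n))
    nqs = np.array([nqs_amplitude(c, units) for c in configs])
    mps = np.array([np.trace(reduce(np.matmul, [B[a] for a in c])) for c in configs])
    return nqs, mps, len(units)

nqs, mps, m_hidden = compare(4)
nonzero = nqs != 0
print(m_hidden, np.unique(np.round(mps[nonzero] / nqs[nonzero], 12)))
```

If the construction is right, the two sets of amplitudes vanish on the same configurations and differ elsewhere only by a single configuration-independent constant, which is all that an unnormalized ansatz needs.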

5.2. Analytic Example—AKLT Unary Stabilizer State

An explicit NQS construction for the AKLT state in the $xyz$ basis has been given before in Ref. [40] using unary encoded cells of $\{a, b, c\}$ spin-$\frac{1}{2}$’s. Their construction involves initializing the $b$ spin-$\frac{1}{2}$’s in state + while entangling the $a$ and $c$ spin-$\frac{1}{2}$’s between adjacent unary cells in the state + . As a tensor network, this is represented as
[Tensor network diagram: initial state of the chain of unary cells.]
where each box is a unary cell, and H denotes the Hadamard matrix. Each unary cell then has the following unitary applied to it
[Circuit diagram: unitary applied to each unary cell.]
where S is the phase gate [40] (we have switched the controlled-NOT gates in the circuit given in Figure 5b of ref. [40] into controlled-Z gates here to expose the graph state equivalence). Putting these pieces together, the unary encoded spin- 1 2 state Ψ unary is a stabilizer state constructed by the circuit
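[Circuit diagram: graph state preparation (above the dashed line) followed by local Clifford gates (below the dashed line).]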
where the top part (above the dashed line) generates a graph state, and the bottom part applies local Clifford gates. Once unary projection to the spin-1 is applied, Ψ unary generates the AKLT state [40].
Previously, in ref. [42], we showed how any stabilizer state for $N$ spin-$\frac{1}{2}$’s can be readily converted into an NQS with $M \leq N - 1$. Here, we just summarize the basic process. The first step in this conversion is to use the local Clifford equivalence of stabilizer states to graph states to relocate all the non-diagonal Clifford gates to independent vertices of the graph. This conversion takes the simple chain-like graph state and pattern of Clifford gates from Equation (32) and gives the following for $N = 12$ spins:
[Graph diagram: transformed graph state and Clifford gates for $N = 12$ spins.]
As required, the resulting graph has diagonal Clifford gates on all vertices, except for a small independent set $I = \{10, 12\}$ highlighted. Notice that the three-site translational invariance of $|\Psi_{\rm unary}\rangle$, mirrored by the initial chain graph, is still formally present in the transformed graph but is now obscured by its highly connected topology. By forming a vertex cover $C = \{I, 1, 2, 4, 5, 7, 8\}$, we obtain an NQS [42] with $M = 8$ hidden units (despite the more complex graph, this is the same number of hidden units required to describe the initial chain graph state as an NQS):
[Tensor network diagram: resulting NQS with $M = 8$ hidden units.]
More generally, for $N$ spin-$\frac{1}{2}$’s, this procedure generates an NQS with $M = 2N/3$ hidden units.
After applying the unary projection and contraction process directly to this spin-$\frac{1}{2}$ NQS, we obtain a spin-1 NQS tensor network, whose schematic structure is shown here for $N = 4$:
[Tensor network diagram: schematic spin-1 NQS for $N = 4$.]
It is evident that pairs of hidden units possess coordination 4, 3 and 2, while one pair gets projected down to a spin-1 visible bias, giving $M = 6$ overall. The same structure applies for general $N$ with pairs of equal coordination spanning $N, N-1, \dots, 2$, giving a spin-1 NQS for the AKLT state in the $xyz$ basis requiring $M = 2N - 2$ hidden units. This representation is essentially identical to the one presented in Section 5.1, except that the hidden units implementing the nodal structure (the fully connected pair) also contribute to the sign structure, reducing the total hidden unit count by one pair. This raises an interesting question of whether an even more compressed spin-1 NQS representation of the AKLT state is possible. We finish by examining this using direct numerical optimizations.

5.3. Numerical Example—AKLT in x y z and S z Bases

Although the AKLT state is translationally invariant, the hidden units encoding the sign structure of this solution are neither individually translationally invariant nor do their translates appear. Consequently, we performed VMC calculations with increasing M using both the spin-1 NQS and unary NQS for N = { 6 , 8 , 10 , 12 } and compared against exact diagonalization. As with the AFH model (which is H ^ AKLT with β = 0 ), earlier in Section 4.5, we considered both the S z and x y z basis with their corresponding nodal structure enforced by sampling. We also utilize the same sequential growth scheme as we used in the Heisenberg calculations, again confirming the robustness of our qualitative conclusions by performing reruns of the optimization sequence and presenting the best results here.
As we are performing stochastic optimization, intrinsic sampling noise will limit the accuracy to which any formally exact solution can be found. It is, therefore, crucial to quantitatively characterize when exactness may have been reached numerically. For a gapped Hamiltonian, the average energy of an approximate state E can be related to its infidelity with the ground state using [30]
$$ 1 - O \leq \frac{E - E_0}{\delta} = \frac{\epsilon}{\delta}, \tag{35} $$
where $\delta$ is the energy gap of the Hamiltonian, and $\epsilon = E - E_0$ is the energy deviation. As the ground state energy $E_0$ of the AKLT Hamiltonian is zero, the energy deviation is simply the sampled energy of the state $\epsilon_{\rm samp}$. As shown in Figure 8a, even if the exact analytic spin-1 NQS solution is used, $\epsilon_{\rm samp}$ fluctuates when using a finite number of samples typical of an optimization step. For all optimizations presented in this paper, we used the following hyperparameters: number of samples per optimization step $n_{\rm samp} = 8000$, number of optimization steps $n_{\rm step} \sim 5000$. Typically, the full wavefunction and its fidelity are calculated and checked every 1000 steps to gauge whether the solution has converged or requires further optimization. To account for fluctuations caused by the finite $n_{\rm samp}$ employed throughout the stochastic optimization, we therefore use $\epsilon = \Delta\epsilon$, the standard deviation of the sampled energy, in Equation (35). As shown in Figure 8b, $\Delta\epsilon$ vanishes as $1/\sqrt{n_{\rm samp}}$, and we estimate an algorithmic fidelity resolution of $R = \Delta\epsilon/\delta \approx 1.2 \times 10^{-5}$ for $N = 12$ sites, below which it may be hard to discriminate an exact solution from an extremely good approximate one.
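The infidelity bound can be illustrated on a small ring where the gap is available from exact diagonalization. The following sketch is a toy check rather than a reproduction of the paper's resolution estimate: it perturbs the exact ground state with seeded random noise (an arbitrary choice made here) and compares $1 - O$ with $\epsilon/\delta$. The helper names are hypothetical.

```python
import numpy as np
from functools import reduce

s = 1 / np.sqrt(2)
SX = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
SY = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
SZ = np.diag([1, 0, -1]).astype(complex)

def site_op(op, j, n):
    mats = [np.eye(3, dtype=complex)] * n
    mats[j] = op
    return reduce(np.kron, mats)

def aklt_ring(n, beta=1 / 3):
    dim = 3**n
    H = np.zeros((dim, dim), dtype=complex)
    for j in range(n):
        k = (j + 1) % n
        bond = sum(site_op(op, j, n) @ site_op(op, k, n) for op in (SX, SY, SZ))
        H += bond + beta * bond @ bond + (2 / 3) * np.eye(dim)
    return H

n = 4
H = aklt_ring(n)
energies, states = np.linalg.eigh(H)
E0, psi0 = energies[0], states[:, 0]
gap = energies[1] - energies[0]  # delta, assuming a unique ground state

# Approximate state: ground state plus small random noise (illustrative only).
rng = np.random.default_rng(1)
trial = psi0 + 0.05 * (rng.normal(size=psi0.size) + 1j * rng.normal(size=psi0.size))
trial /= np.linalg.norm(trial)

E = np.real(np.vdot(trial, H @ trial))
infidelity = 1 - abs(np.vdot(psi0, trial)) ** 2
bound = (E - E0) / gap
print(infidelity, bound)
```

The bound follows from expanding the trial state in energy eigenstates: every excited component costs at least $\delta$ in energy, so the excited weight $1 - O$ cannot exceed $\epsilon/\delta$.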
In Figure 9, we show the smooth decrease in $1 - O$ against $M$ for the $S^z$ basis. While the $N = 6$ curve in Figure 9a drops below $R$, this is not attained for the larger $N$ shown in Figure 9b–d within the $M$’s tested. This indicates that no “exact” NQS solutions have been located for these larger systems. As with the AFH model, the spin-1 NQS and unary NQS achieve a similar accuracy versus $M$, although the former utilizes fewer variational parameters.
The analogous results in Figure 10 for the $xyz$ basis display remarkable features in comparison. For each $N$, a sharp drop in $1 - O$ by over 4 orders of magnitude is observed at $M = N - 2$ that consistently pushes the infidelity below $R$. After reaching this point, $1 - O$ versus $M$ plateaus, and subsequent hidden units have negligible bearing on the accuracy of the wavefunction due to statistical fluctuations originating from the stochastic optimization. These features are consistently produced by both the spin-1 NQS and unary NQS, aside from the largest system size $N = 12$ in Figure 10d, indicating that the increasing number of redundant variational parameters in the unary encoding is complicating the optimization. The overall behavior of $1 - O$ observed in this basis is strong evidence of a numerically exact solution with $M = N$ (once the 2 hidden units implementing the nodal structure are included). This is substantially smaller than the analytic solutions introduced and is very suggestive of there being a compact exact spin-1 NQS representation of the AKLT state in the $xyz$ basis.

6. Conclusions

We have introduced the most natural and direct generalization of the RBM from spin-$\frac{1}{2}$ to spin-1. This necessitated including a quadratic visible bias and a quadratic visible-hidden interaction in the RBM energy function to ensure trivial product state representation, labeling freedom and gauge equivalence to the tensor network formulation. We demonstrated its use numerically for the spin-1 AFH model in both the $S^z$ and $xyz$ bases, illustrating how the choice of basis can affect the accuracy and hidden unit complexity of an NQS representation. Using our spin-1 NQS, we then re-examined how to represent the AKLT state exactly. In the $S^z$ basis, it is known to require $M \sim O(N^2)$ hidden units, yet, by changing to the $xyz$ basis, we construct an NQS with $M \sim O(N)$ hidden units.
Numerical VMC calculations have indicated that, by capturing the nodal structure, either implicitly within the sampling or explicitly through the inclusion of extra hidden units, the optimization can find even more efficient constructions for the sign structure. The resulting spin-1 NQS for the AKLT state in the x y z basis requires M = N hidden units in total making it compact. This example raises the interesting possibility of improving the efficiency and accuracy of NQS calculations by including single-spin basis transformations to lower the hidden unit complexity.
Several important open questions remain about NQS representations. In particular, it would be instructive to build representations of classes of bosonic states using multinomial RBMs. In this case, a local Fock basis is typically employed; however, our findings suggest that it could be useful to explore a local basis that breaks the particle number symmetry when describing condensates. Moreover, the elevation of visible units from binary to multinomial raises the question of whether also using multinomial hidden units can enhance the expressiveness of NQS. This has been explored in the context of binary visible units in ref. [54] in an analytical context to precisely represent certain two- and three-body interactions. The use of multinomial hidden units for numerical VMC calculations has been largely unexplored and is the subject of forthcoming work [55] for the Bose Hubbard model in two dimensions.   

Author Contributions

Conceptualization: M.Y.P. and S.R.C.; methodology, M.Y.P. and S.R.C.; software, M.Y.P. and S.R.C.; validation, M.Y.P. and S.R.C.; formal analysis, M.Y.P. and S.R.C.; investigation, M.Y.P.; resources, S.R.C.; data curation, M.Y.P.; writing—original draft preparation, S.R.C.; writing—review and editing, M.Y.P.; visualization, M.Y.P. and S.R.C.; supervision, S.R.C.; project administration, S.R.C.; funding acquisition, S.R.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Engineering and Physical Sciences Research Council (EPSRC) under grants No. EP/P025110/2 and EP/T028424/1. M.Y.P. also acknowledges the University of Bristol Advanced Computing Research Centre for the use of their High Performance Computing facility (BlueCrystal Phase 3 and 4) in performing the VMC calculations presented.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

MATLAB scripts and .mat files containing the data shown in the figures are available in the data repository ref. [56], doi:10.5523/bris.1ln9kyt6i86n12ehhftht27edp.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Monte Carlo Method and Sampling

Any many-body quantum state can be expressed in terms of its amplitudes on a set of configuration states in a chosen basis, but exhaustively computing the amplitudes for every possible configuration in the basis is often prohibitively expensive. Monte Carlo methods circumvent this by randomly selecting a subset of these configurations, and calculating an estimate of key observables using this selection. In this case, we require only that the (unnormalized) amplitudes $\Psi_p(S)$ of the ansatz (defined by parameters $p$) in a fixed basis, such as $S^z$, can be computed in a time with polynomial complexity in $N$. We then rewrite the expectation value as
$$ \langle \hat{A} \rangle_{\Psi_p} = \sum_{S} P(S)\, A(S), \quad \text{where} \quad P(S) = \frac{|\Psi_p(S)|^2}{\sum_{S'} |\Psi_p(S')|^2} \tag{A1} $$
is the probability of the spin configuration $S$, and
$$ A(S) = \sum_{S'} \frac{\Psi_p(S')}{\Psi_p(S)} \langle S | \hat{A} | S' \rangle \tag{A2} $$
is the local estimator of $\hat{A}$. The sum over $S'$ in Equation (A2) is restricted to only those configurations for which the matrix element $\langle S | \hat{A} | S' \rangle \neq 0$, and so, as long as $\hat{A}$ is sparse in the chosen fixed basis, $A(S)$ can be efficiently evaluated. Typical product terms comprising short-ranged Hamiltonians $\hat{H}$ fulfill this requirement. Thus, an unbiased estimate of $\langle \hat{A} \rangle_{\Psi_p}$ is made by drawing independent samples of spin configurations $S$ from the distribution $P(S)$. A powerful way to accomplish this is Markov-chain Monte Carlo, in which a sequence of spin configurations $S_0 \to S_1 \to \cdots$ is generated using the Metropolis-Hastings algorithm that accepts a proposed configuration $S_{\rm prop}$ for the $n$th iteration with a probability $P_n(S_{\rm prop}) = \min\left[ 1, |\Psi_p(S_{\rm prop})|^2 / |\Psi_p(S_{n-1})|^2 \right]$. A subset of visited configurations in the sequence, suitably separated to ensure they are decorrelated, is then used to estimate $\langle \hat{A} \rangle_{\Psi_p}$. Crucially, both $A(S)$ and Monte Carlo sampling rely only on the ratio of ansatz amplitudes, so the intractable state normalization $\sum_{S} |\Psi_p(S)|^2$ is entirely avoided.
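The local estimator and the Metropolis-Hastings acceptance rule can be sketched on a toy problem small enough to enumerate exactly. In the example below the state is a random complex vector over nine basis states, the observable is a random Hermitian matrix, and proposals are drawn uniformly over all basis states; all three are illustrative assumptions, not the setup used in the paper.

```python
import numpy as np

def local_estimator(op, psi, idx):
    """A(S) = sum_{S'} <S|A|S'> psi(S') / psi(S), over non-zero matrix elements only."""
    row = op[idx]
    connected = np.nonzero(row)[0]
    return np.sum(row[connected] * psi[connected]) / psi[idx]

def metropolis_chain(psi, n_steps, rng):
    """Toy Metropolis-Hastings walk over basis-state indices (symmetric uniform proposals)."""
    prob = np.abs(psi) ** 2  # unnormalized: only ratios are ever used
    idx = int(rng.integers(len(psi)))
    chain = []
    for _ in range(n_steps):
        prop = int(rng.integers(len(psi)))
        if rng.random() < min(1.0, prob[prop] / prob[idx]):
            idx = prop
        chain.append(idx)
    return np.array(chain)

rng = np.random.default_rng(0)
dim = 9  # e.g. two spin-1 sites
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
op = M + M.conj().T  # random Hermitian observable

# Exact check of <A> = sum_S P(S) A(S) by full enumeration.
prob = np.abs(psi) ** 2 / np.sum(np.abs(psi) ** 2)
exact = np.vdot(psi, op @ psi) / np.vdot(psi, psi)
weighted = sum(prob[i] * local_estimator(op, psi, i) for i in range(dim))

# Stochastic estimate from a thinned Markov chain.
chain = metropolis_chain(psi, 20000, rng)
estimate = np.mean([local_estimator(op, psi, i) for i in chain[::5]])
print(exact.real, estimate.real)
```

Neither function ever needs the normalization of `psi`, which mirrors the point made above about amplitude ratios.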
Variational Monte Carlo proceeds by evaluating the ansatz energy $E_p = \langle \hat{H} \rangle_{\Psi_p}$ and its variance, along with their gradient vectors with respect to parameters $p$. The parameters can then be updated by a small step along the direction of steepest descent, with this iterated until convergence to the minimum $p_0$ [29]. The efficiency of this minimization process is strongly problem and ansatz dependent. In challenging cases, more sophisticated approaches, such as modified stochastic optimization [57], the ‘linear method’ [58,59], and stochastic reconfiguration [30,49], are needed. All numerical calculations in this paper employ stochastic reconfiguration. The method involves the construction of an $n_{\rm params} \times n_{\rm params}$ matrix, requiring $O(n_{\rm samp} n_{\rm params}^2)$ operations, though with a conjugate gradient method this can be brought down to $O(n_{\rm samp} n_{\rm params})$ [60]. The parameter changes can then be calculated from a set of linear equations involving the matrix, which scales as $O(n_{\rm params}^2)$ using matrix-vector product methods. As a rule of thumb, the matrix requires a number of samples $n_{\rm samp} > 10\, n_{\rm params}$ to ensure the sampled matrix is not rank-deficient [30], giving an overall minimum scaling of $O(n_{\rm params}^2)$ for the algorithm. Typically, though, it is prudent to have a large number of samples $n_{\rm samp} \gg n_{\rm params}$, as the error of any sampled observable, including the elements of the matrix, typically scales as $1/\sqrt{n_{\rm samp}}$ [30].

Appendix B. Boltzmann Parameterization of Coupling Matrices

Every $2 \times 3$ coupling matrix within the spin-1 NQS tensor network can be parameterized in Boltzmann form using the generalized energy function introduced in Section 4.2 as
$$ C^{(ij)}_{h_i v_j} = \exp\left( \tilde{c}_{ij} + w_{ij} h_i v_j + W_{ij} h_i v_j^2 + \tilde{b}_{ij} h_i + \tilde{a}_{ij} v_j + \tilde{A}_{ij} v_j^2 \right), $$
for $h_i \in \{+1, -1\}$ and $v_j \in \{+1, 0, -1\}$, which includes a quadratic weight $W_{ij}$ and quadratic partial bias $\tilde{A}_{ij}$. These complex parameters are found from the coupling matrix elements as
$$ \begin{aligned} \tilde{a}_{ij} &= \tfrac{1}{4}\left[ \log C^{(ij)}_{++} + \log C^{(ij)}_{-+} - \log C^{(ij)}_{+-} - \log C^{(ij)}_{--} \right], \\ \tilde{b}_{ij} &= \tfrac{1}{2}\left[ \log C^{(ij)}_{+0} - \log C^{(ij)}_{-0} \right], \\ \tilde{c}_{ij} &= \tfrac{1}{2}\left[ \log C^{(ij)}_{+0} + \log C^{(ij)}_{-0} \right], \\ w_{ij} &= \tfrac{1}{4}\left[ \log C^{(ij)}_{++} - \log C^{(ij)}_{-+} - \log C^{(ij)}_{+-} + \log C^{(ij)}_{--} \right], \\ W_{ij} &= \tfrac{1}{4}\left[ \log C^{(ij)}_{++} - \log C^{(ij)}_{-+} - 2\log C^{(ij)}_{+0} + 2\log C^{(ij)}_{-0} + \log C^{(ij)}_{+-} - \log C^{(ij)}_{--} \right], \\ \tilde{A}_{ij} &= \tfrac{1}{4}\left[ \log C^{(ij)}_{++} + \log C^{(ij)}_{-+} - 2\log C^{(ij)}_{+0} - 2\log C^{(ij)}_{-0} + \log C^{(ij)}_{+-} + \log C^{(ij)}_{--} \right], \end{aligned} $$
where the first subscript of $C^{(ij)}$ labels $h_i$ and the second labels $v_j$. In both cases, the presence of zero matrix elements in $C^{(ij)}$, which appear frequently in our ‘by hand’ constructions, returns diverging parameters. However, as discussed in ref. [42], this can be handled by softening the zeros via
$$ C^{(ij)}_{h_i v_j} \mapsto \max\left( C^{(ij)}_{h_i v_j}, e^{-S} \right), $$
where typically $S \approx 5$–10.
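The conversion between a coupling matrix and its Boltzmann parameters is a small linear solve on the logarithms of the matrix elements, and it can be checked by a round trip. This sketch assumes rows ordered as $h = (+1, -1)$ and columns as $v = (+1, 0, -1)$. The `soften` helper is one possible reading of the softening rule: it replaces entries whose magnitude is below $e^{-S}$, which coincides with the max rule for non-negative real entries.

```python
import numpy as np

H_VALS = (1, -1)
V_VALS = (1, 0, -1)

def soften(C, S=8.0):
    """Replace (near-)zero entries by exp(-S) so that the logarithms stay finite."""
    C = np.asarray(C, dtype=complex).copy()
    C[np.abs(C) < np.exp(-S)] = np.exp(-S)
    return C

def coupling_to_params(C):
    """Boltzmann parameters of a 2x3 coupling matrix with non-zero entries."""
    L = np.log(np.asarray(C, dtype=complex))
    (pp, p0, pm), (mp, m0, mm) = L  # L[h][v]: first label h, second label v
    return {
        "a": (pp + mp - pm - mm) / 4,
        "b": (p0 - m0) / 2,
        "c": (p0 + m0) / 2,
        "w": (pp - mp - pm + mm) / 4,
        "W": (pp - mp - 2 * p0 + 2 * m0 + pm - mm) / 4,
        "A": (pp + mp - 2 * p0 - 2 * m0 + pm + mm) / 4,
    }

def params_to_coupling(p):
    """Rebuild the coupling matrix from its Boltzmann parameters."""
    return np.array([[np.exp(p["c"] + p["w"] * h * v + p["W"] * h * v**2
                             + p["b"] * h + p["a"] * v + p["A"] * v**2)
                      for v in V_VALS] for h in H_VALS])

C_xy = np.array([[1, 1, 1], [-1, -1, 1]])
rebuilt = params_to_coupling(coupling_to_params(C_xy))
print(np.round(rebuilt.real, 12))
```

Negative entries simply produce parameters with imaginary parts that are multiples of $i\pi/4$ or $i\pi/2$, so the branch of the complex logarithm does not affect the reconstructed matrix.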

Appendix C. AKLT State NQS in S z Basis

In this appendix, we construct a spin-1 NQS for the AKLT state amplitudes in the S z basis given in Equation (26) by using hidden units as successive filters. The rules governing the structure of the amplitudes can be summarized as follows:
  • Zero out the amplitude for any configuration containing, for any $\ell = 0, 1, \dots, N-2$, a substring of the form $+\,[0 \cdots 0]_\ell\,+$ or $-\,[0 \cdots 0]_\ell\,-$, where $[0 \cdots 0]_\ell$ is a string of 0’s of length $\ell$. To implement this rule, we introduce the following $2 \times 3$ coupling matrices:
    $$ C_{\Uparrow i} = \begin{pmatrix} 1 & 1 & 1 \\ i & 0 & 0 \end{pmatrix}, \quad C_{\Downarrow i} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & i \end{pmatrix}, \quad C_{+0} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \end{pmatrix}, $$
    which single out the $S^z$ basis spin-1 states $\Uparrow$, $\Downarrow$ and $0$, respectively. To cancel out in any configuration the substrings $+\,[0 \cdots 0]_\ell\,+$ and $-\,[0 \cdots 0]_\ell\,-$, starting at a given site $k$ with a given separation $\ell$, we construct pairs of hidden units with
    $$ \{ I, \dots, I, \underbrace{C_{\Uparrow i}}_{k}, C_{+0}, \dots, C_{+0}, \underbrace{C_{\Uparrow i}}_{k+1+\ell}, I, \dots, I \}, \quad \text{and} \quad \{ I, \dots, I, \underbrace{C_{\Downarrow i}}_{k}, C_{+0}, \dots, C_{+0}, \underbrace{C_{\Downarrow i}}_{k+1+\ell}, I, \dots, I \}. $$
    To capture all separations $\ell = 0, 1, \dots, N-2$ starting on any site $k = 1, 2, \dots, N$, we require $2N(N-1)$ hidden units.
  • Apply a factor of $(-1)^{\ell}$ to the amplitude of a configuration for each substring of the form $+\,[0 \cdots 0]_\ell\,-$ it contains. To implement this rule, we introduce two additional coupling matrices:
    $$ C_{\Uparrow 2} = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & 0 \end{pmatrix}, \quad C_{\Downarrow -} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & -1 \end{pmatrix}. $$
    A factor $(-1)$ is induced on any configuration containing the substring $+\,[0 \cdots 0]_\ell\,-$, starting at a given site $k$ with a given separation $\ell$, by a hidden unit with
    $$ \{ I, \dots, I, \underbrace{C_{\Uparrow 2}}_{k}, C_{+0}, \dots, C_{+0}, \underbrace{C_{\Downarrow -}}_{k+1+\ell}, I, \dots, I \}. $$
    To capture all odd separations $\ell = 1, 3, \dots, N-2$ starting on any site $k = 1, 2, \dots, N$ requires $N \lfloor \frac{1}{2}(N-1) \rfloor$ hidden units.
  • Zero out the single excitation configurations $+\,[0 \cdots 0]_{N-1}$ and $-\,[0 \cdots 0]_{N-1}$ and all their translates. To implement this rule, we introduce another coupling matrix:
    $$ C_{\Uparrow -} = \begin{pmatrix} 1 & 1 & 1 \\ -1 & 0 & 0 \end{pmatrix}. $$
    Configurations $+\,[0 \cdots 0]_{N-1}$ and $-\,[0 \cdots 0]_{N-1}$ are then cancelled out by a pair of hidden units with $\{ C_{\Uparrow -}, C_{+0}, \dots, C_{+0} \}$ and $\{ C_{\Downarrow -}, C_{+0}, \dots, C_{+0} \}$. Capturing all the translates is achieved with $2N$ hidden units.
  • For odd $N$, zero out the amplitude for configuration $[0 \cdots 0]_N$, or, for even $N$, double its amplitude. To implement this rule, we introduce one final coupling matrix:
    $$ C_{-0} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & -1 & 0 \end{pmatrix}, $$
    and define a single hidden unit as $\{ C_{-0}, C_{-0}, \dots, C_{-0} \}$.
  • Apply a factor of $2^{(\# +\text{'s})} \times (-1)^{(\# 0\text{'s})} \times (-1)^{(\# -\text{'s})}$ to the amplitudes of all configurations. This does not require any additional hidden units. Instead, we can pick any hidden unit and right multiply each of its $N$ coupling matrices by $D = \mathrm{diag}(2, -1, -1)$. In RBM formalism, this is equivalent to setting non-zero visible unit biases.
This gives a total of $M = 2N^2 + N \lfloor \frac{1}{2}(N-1) \rfloor + 1$ hidden units. Notice that the $O(N^2)$ scaling originates from spatial translations, so if this symmetry is directly imposed within the NQS ansatz, as it often is numerically, then the AKLT state in the $S^z$ basis instead has its number of unique hidden units scaling as $M = 2N + \lfloor \frac{1}{2}(N-1) \rfloor + 1$.
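The filter rules above can be checked by brute force against the MPS amplitudes for small $N$. The sketch below is a reading of the construction under several assumptions: columns of each coupling matrix are ordered $(\Uparrow, 0, \Downarrow)$, $I$ is the $2 \times 3$ matrix of ones, site indices wrap around the ring, the MPS matrices carry the signs $A^{+} = \hat\sigma^{+}/\sqrt{2}$, $A^{0} = -\hat\sigma^{z}/2$, $A^{-} = -\hat\sigma^{-}/\sqrt{2}$, and the final diagonal factor is applied separately rather than absorbed into a hidden unit. All helper names are hypothetical.

```python
import numpy as np
from functools import reduce
from itertools import product

UP, ZERO, DOWN = 0, 1, 2
ONES = np.ones((2, 3), dtype=complex)  # assumed meaning of "I"

def single(col, val):
    """Coupling matrix whose second row is `val` in column `col` and zero elsewhere."""
    m = ONES.copy()
    m[1] = 0
    m[1, col] = val
    return m

C_up_i, C_dn_i, C_p0 = single(UP, 1j), single(DOWN, 1j), single(ZERO, 1)
C_up_2, C_dn_m = single(UP, 2), single(DOWN, -1)
C_up_m, C_m0 = single(UP, -1), single(ZERO, -1)

def build(n, placed):
    mats = [ONES] * n
    for site, m in placed.items():
        mats[site] = m
    return mats

def hidden_units(n):
    units = []
    for k in range(n):
        for l in range(n - 1):  # separations 0 .. N-2
            mid = {(k + 1 + j) % n: C_p0 for j in range(l)}
            end = (k + 1 + l) % n
            units.append(build(n, {k: C_up_i, **mid, end: C_up_i}))  # kill + 0..0 +
            units.append(build(n, {k: C_dn_i, **mid, end: C_dn_i}))  # kill - 0..0 -
            if l % 2 == 1:
                units.append(build(n, {k: C_up_2, **mid, end: C_dn_m}))  # sign for + 0..0 -
        rest = {(k + j) % n: C_p0 for j in range(1, n)}
        units.append(build(n, {k: C_up_m, **rest}))  # kill single +
        units.append(build(n, {k: C_dn_m, **rest}))  # kill single -
    units.append([C_m0] * n)  # all-zero configuration
    return units

def nqs_amplitude(config, units):
    amp = 1.0 + 0j
    for mats in units:
        rows = np.array([m[:, v] for m, v in zip(mats, config)])
        amp *= rows.prod(axis=0).sum()
    n_up, n_zero, n_down = (config.count(c) for c in (UP, ZERO, DOWN))
    return amp * 2**n_up * (-1) ** n_zero * (-1) ** n_down

s = 1 / np.sqrt(2)
A = [s * np.array([[0, 1], [0, 0]], dtype=complex),
     -0.5 * np.diag([1, -1]).astype(complex),
     -s * np.array([[0, 0], [1, 0]], dtype=complex)]

def compare(n):
    units = hidden_units(n)
    configs = list(product(range(3), repeat=n))
    nqs = np.array([nqs_amplitude(c, units) for c in configs])
    mps = np.array([np.trace(reduce(np.matmul, [A[v] for v in c])) for c in configs])
    return nqs, 2**n * mps, len(units)

nqs, mps_scaled, m_hidden = compare(4)
print(m_hidden, np.allclose(nqs, mps_scaled))
```

With translations imposed as a symmetry, only the units for a single starting site `k` would be stored, which is the origin of the linear count of unique hidden units quoted above.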

References

  1. Verstraete, F.; Murg, V.; Cirac, J. Matrix product states, projected entangled pair states, and variational renormalization group methods for quantum spin systems. Adv. Phys. 2008, 57, 143–224.
  2. Orús, R. A practical introduction to tensor networks: Matrix product states and projected entangled pair states. Ann. Phys. 2014, 349, 117–158.
  3. Cirac, J.I.; Sierra, G. Infinite matrix product states, Conformal Field Theory and the Haldane-Shastry model. Phys. Rev. B 2010, 81, 104431.
  4. Shi, Y.Y.; Duan, L.M.; Vidal, G. Classical simulation of quantum many-body systems with a tree tensor network. Phys. Rev. A 2006, 74, 022320.
  5. Murg, V.; Verstraete, F.; Legeza, O.; Noack, R.M. Simulating strongly correlated quantum systems with tree tensor networks. Phys. Rev. B 2010, 82, 205105.
  6. Evenbly, G.; Vidal, G. Quantum Criticality with the Multi-scale Entanglement Renormalization Ansatz. In Strongly Correlated Systems; Springer Series in Solid-State Sciences; Springer: Berlin/Heidelberg, Germany, 2013; Volume 176, pp. 99–130.
  7. Carleo, G.; Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 2017, 355, 602–606.
  8. Gao, X.; Duan, L.M. Efficient representation of quantum many-body states with deep neural networks. Nat. Commun. 2017, 8, 662.
  9. Carleo, G.; Nomura, Y.; Imada, M. Constructing exact representations of quantum many-body systems with deep neural networks. Nat. Commun. 2018, 9, 5322.
  10. He, H.; Zheng, Y.; Bernevig, B.A.; Sierra, G. Multi-Layer Restricted Boltzmann Machine Representation of 1D Quantum Many-Body Wave Functions. arXiv 2019, arXiv:1910.13454.
  11. Choo, K.; Neupert, T.; Carleo, G. Two-dimensional frustrated J1-J2 model studied with neural network quantum states. Phys. Rev. B 2019, 100, 125124.
  12. Irikura, N.; Saito, H. Neural-network quantum states at finite temperature. Phys. Rev. Res. 2020, 2, 013284.
  13. Schmitt, M.; Heyl, M. Quantum Many-Body Dynamics in Two Dimensions with Artificial Neural Networks. Phys. Rev. Lett. 2020, 125, 100503.
  14. Liang, X.; Dong, S.J.; He, L. Hybrid convolutional neural network and projected entangled pair states wave functions for quantum many-particle states. Phys. Rev. B 2021, 103, 035138.
  15. Levine, Y.; Sharir, O.; Cohen, N.; Shashua, A. Quantum Entanglement in Deep Learning Architectures. Phys. Rev. Lett. 2019, 122, 065301.
  16. Saito, H.; Kato, M. Machine Learning Technique to Find Quantum Many-Body Ground States of Bosons on a Lattice. J. Phys. Soc. Jpn. 2018, 87, 014001.
  17. Choo, K.; Carleo, G.; Regnault, N.; Neupert, T. Symmetries and Many-Body Excitations with Neural-Network Quantum States. Phys. Rev. Lett. 2018, 121, 167204.
  18. Luo, D.; Clark, B.K. Backflow Transformations via Neural Networks for Quantum Many-Body Wave Functions. Phys. Rev. Lett. 2019, 122, 226401.
  19. Adams, C.; Carleo, G.; Lovato, A.; Rocco, N. Variational Monte Carlo Calculations of A ≤ 4 Nuclei with an Artificial Neural-Network Correlator Ansatz. Phys. Rev. Lett. 2021, 127, 022502.
  20. Torlai, G.; Melko, R.G. Latent Space Purification via Neural Density Operators. Phys. Rev. Lett. 2018, 120, 240503.
  21. Vicentini, F.; Biella, A.; Regnault, N.; Ciuti, C. Variational Neural-Network Ansatz for Steady States in Open Quantum Systems. Phys. Rev. Lett. 2019, 122, 250503.
  22. Hartmann, M.J.; Carleo, G. Neural-Network Approach to Dissipative Quantum Many-Body Dynamics. Phys. Rev. Lett. 2019, 122, 250502.
  23. Yoshioka, N.; Hamazaki, R. Constructing neural stationary states for open quantum many-body systems. Phys. Rev. B 2019, 99, 214306.
  24. Westerhout, T.; Astrakhantsev, N.; Tikhonov, K.S.; Katsnelson, M.I.; Bagrov, A.A. Neural Quantum States of frustrated magnets: Generalization and sign structure. Nat. Commun. 2020, 11, 1593.
  25. Nomura, Y. Helping restricted Boltzmann machines with quantum-state representation by restoring symmetry. J. Phys. Condens. Matter 2021, 33, 174003.
  26. Jónsson, B.; Bauer, B.; Carleo, G. Neural-network states for the classical simulation of quantum computing. arXiv 2018, arXiv:1808.05232.
  27. Freitas, N.; Morigi, G.; Dunjko, V. Neural network operations and Suzuki–Trotter evolution of neural network states. Int. J. Quantum Inf. 2018, 16, 1840008.
  28. Bausch, J.; Leditzky, F. Quantum codes from neural networks. New J. Phys. 2020, 22, 023005.
  29. Gubernatis, J.; Kawashima, N.; Werner, P. Quantum Monte Carlo Methods: Algorithms for Lattice Models; Cambridge University Press: Cambridge, UK, 2016.
  30. Becca, F.; Sorella, S. Quantum Monte Carlo Approaches for Correlated Systems; Cambridge University Press: Cambridge, UK, 2017.
  31. Deng, D.L.; Li, X.; Das Sarma, S. Quantum Entanglement in Neural Network States. Phys. Rev. X 2017, 7, 021021.
  32. Eisert, J.; Cramer, M.; Plenio, M.B. Colloquium: Area laws for the entanglement entropy. Rev. Mod. Phys. 2010, 82, 277.
  33. Clark, S.R. Unifying neural-network quantum states and correlator product states via tensor networks. J. Phys. A Math. Theor. 2018, 51, 135301.
  34. Chen, J.; Cheng, S.; Xie, H.; Wang, L.; Xiang, T. Equivalence of restricted Boltzmann machines and tensor network states. Phys. Rev. B 2018, 97, 085104.
  35. Collura, M.; Dell’Anna, L.; Felser, T.; Montangero, S. On the descriptive power of Neural-Networks as constrained Tensor Networks with exponentially large bond dimension. SciPost Phys. Core 2021, 4.
  36. Glasser, I.; Pancotti, N.; August, M.; Rodriguez, I.D.; Cirac, J.I. Neural-Network Quantum States, String-Bond States, and Chiral Topological States. Phys. Rev. X 2018, 8, 011006.
  37. Kaubruegger, R.; Pastori, L.; Budich, J.C. Chiral topological phases from artificial neural networks. Phys. Rev. B 2018, 97, 195136.
  38. Zheng, Y.; He, H.; Regnault, N.; Bernevig, B.A. Restricted Boltzmann Machines and Matrix Product States of 1D Translational Invariant Stabilizer Codes. Phys. Rev. B 2019, 99, 155129.
  39. Zhang, Y.H.; Jia, Z.A.; Wu, Y.C.; Guo, G.C. An Efficient Algorithmic Way to Construct Boltzmann Machine Representations for Arbitrary Stabilizer Code. arXiv 2018, arXiv:1809.08631.
  40. Lu, S.; Gao, X.; Duan, L.M. Efficient representation of topologically ordered states with restricted Boltzmann machines. Phys. Rev. B 2019, 99, 155136.
  41. Jia, Z.A.; Zhang, Y.H.; Wu, Y.C.; Kong, L.; Guo, G.C.; Guo, G.P. Efficient Machine Learning Representations of Surface Code with Boundaries, Defects, Domain Walls and Twists. Phys. Rev. A 2019, 99, 012307.
  42. Pei, M.Y.; Clark, S.R. Compact Neural-network Quantum State representations of Jastrow and Stabilizer states. arXiv 2021, arXiv:2103.09146.
  43. Saito, H. Solving the Bose–Hubbard Model with Machine Learning. J. Phys. Soc. Jpn. 2017, 86, 093001.
  44. Guo, C.; Berkhahn, F. Entity Embeddings of Categorical Variables. arXiv 2016, arXiv:1604.06737.
  45. McBrian, K.; Carleo, G.; Khatami, E. Ground state phase diagram of the one-dimensional Bose-Hubbard model from restricted Boltzmann machines. J. Phys. Conf. Ser. 2019, 1290, 012005.
  46. Vargas-Calderón, V.; Vinck-Posada, H.; González, F.A. Phase Diagram Reconstruction of the Bose–Hubbard Model with a Restricted Boltzmann Machine Wavefunction. J. Phys. Soc. Jpn. 2020, 89, 094002.
  47. Vieijra, T.; Casert, C.; Nys, J.; De Neve, W.; Haegeman, J.; Ryckebusch, J.; Verstraete, F. Restricted Boltzmann Machines for Quantum States with Non-Abelian or Anyonic Symmetries. Phys. Rev. Lett. 2020, 124, 097201.
  48. Läuchli, A.M.; Sudan, J.; Sørensen, E.S. Ground-state energy and spin gap of spin-1/2 Kagomé-Heisenberg antiferromagnetic clusters: Large-scale exact diagonalization results. Phys. Rev. B 2011, 83, 212401.
  49. Sorella, S. Generalized Lanczos algorithm for variational quantum Monte Carlo. Phys. Rev. B 2001, 64, 024512.
  50. Le Roux, N.; Bengio, Y. Representational Power of Restricted Boltzmann Machines and Deep Belief Networks. Neural Comput. 2008, 20, 1631–1649.
  51. Biamonte, J.D.; Clark, S.R.; Jaksch, D. Categorical Tensor Network States. AIP Adv. 2011, 1, 042172.
  52. Affleck, I.; Kennedy, T.; Lieb, E.H.; Tasaki, H. Rigorous results on valence-bond ground states in antiferromagnets. Phys. Rev. Lett. 1987, 59, 799–802.
  53. Haldane, F.D.M. Nonlinear Field Theory of Large-Spin Heisenberg Antiferromagnets: Semiclassically Quantized Solitons of the One-Dimensional Easy-Axis Néel State. Phys. Rev. Lett. 1983, 50, 1153–1156.
  54. Rrapaj, E.; Roggero, A. Exact representations of many-body interactions with restricted-Boltzmann-machine neural networks. Phys. Rev. E 2021, 103, 013302.
  55. Pei, M.Y.; Clark, S.R. Neural-network quantum states for bosons revisited. 2021; in preparation.
  56. Clark, S.R.; Pei, M.Y. NQS Spin-1 Numerics Data. Data Repository for NQS Spin-1 Calculation Data and Plotting Scripts. Available online: https://data.bris.ac.uk/data/dataset/1ln9kyt6i86n12ehhftht27edp (accessed on 3 May 2021).
  57. Lou, J.; Sandvik, A.W. Variational ground states of two-dimensional antiferromagnets in the valence bond basis. Phys. Rev. B 2007, 76, 104432.
  58. Nightingale, M.P.; Melik-Alaverdian, V. Optimization of Ground- and Excited-State Wave Functions and van der Waals Clusters. Phys. Rev. Lett. 2001, 87, 043401.
  59. Toulouse, J.; Umrigar, C.J. Optimization of quantum Monte Carlo wave functions by energy minimization. J. Chem. Phys. 2007, 126, 084102.
  60. Neuscamman, E.; Umrigar, C.J.; Chan, G.K.L. Optimizing large parameter sets in variational quantum Monte Carlo. Phys. Rev. B 2012, 85, 045103.
Figure 1. The bipartite graph of an RBM depicting interaction weights w as edges shown as solid arcs between hidden and visible units. For completeness, the biases a and b on each unit are depicted here as self-loop edges with dotted arcs to distinguish them from the interactions. The diagram here has been duplicated from ref. [42] and is presented here again for clarity.
Figure 2. Useful spin-$\frac{1}{2}$ tensor network diagrams. (a) The $|+\rangle$ state. (b) An order-3 COPY tensor. (c) Termination of a COPY tensor leg with the $|+\rangle$ state. (d) Copying of basis state inputs by an order-4 COPY tensor. (e) Expansion of an order-$N$ COPY tensor into a GHZ state. (f) The commutativity of diagonal order-2 tensors across a COPY tensor [42].
Figure 3. Useful spin-1 tensor network diagrams, directly generalizing the spin-$\frac{1}{2}$ properties outlined in Section 3.2. (a) The $|\#\rangle$ state. (b) An order-3 COPY tensor. (c) Termination of a COPY tensor leg with the $|\#\rangle$ state. (d) Copying of basis state inputs by an order-4 COPY tensor. (e) Expansion of an order-$N$ COPY tensor into a spin-1 GHZ state. (f) The commutativity of diagonal order-2 tensors across a COPY tensor.
Figure 4. (a) Representative examples of unary projection contractions with hidden units possessing different patterns of connectivity. (b) The first step in the contraction involves pulling the projection through the two-dimensional COPY tensors, leaving behind a single three-dimensional COPY tensor. (c) Connections to the unary spins are isolated by splitting the hidden unit COPY tensors. (d) The resulting blocks of the network are then contracted to obtain the 2 × 3 spin-1 NQS coupling matrices.
Figure 5. Plots of the infidelity of unary (red) and spin-1 NQS (blue) with the spin-1 AFH ground state with periodic boundary conditions. The calculations presented in both plots have the same number of sites $N = 12$, up to $M = 2N = 24$ hidden units. (a) The infidelity $1 - O$ of the two NQS formulations for the $S^z$ basis versus the RBM parameter count $n_{\text{params}}$. The inset shows the collapse of the same data when it is plotted in terms of the hidden unit number $M$. (b) The same calculations performed in the $xyz$ basis.
Figure 6. Valence bond solid construction of the spin-1 AKLT state $|\Psi_{\mathrm{AKLT}}\rangle$ from the projection $P$ of pairs of spin-$\frac{1}{2}$ particles shared between neighboring sites on the chain.
Figure 7. The $3^6$ (unnormalized) amplitudes of the $N = 6$ AKLT state in the $xyz$ basis, $\Psi_{\mathrm{AKLT}}(\alpha)$, are displayed here as a $3^3 \times 3^3$ matrix, with the color of an element designating which of the values $+1, 0, -1$ the amplitude has. The bottom left corner element corresponds to the amplitude of $|xxxxxx\rangle$, while the top right corner corresponds to $|zzzzzz\rangle$. The amplitudes $\Psi_{\mathrm{AKLT}}(\alpha)$ are reconstructed exactly by the product of the $M = 2N = 12$ hidden unit filters given in the main text. The first two filters $\Upsilon_{xy}(\alpha)$ and $\Upsilon_{yz}(\alpha)$ establish the nodal structure, while the other ten filters $\Upsilon_x^{(k)}(\alpha)$ and $\Upsilon_z^{(k)}(\alpha)$ imprint the sign structure.
Figure 8. (a) The energy $\epsilon_{\text{samp}}$ (red squares) for 500 independent Monte Carlo runs of the $N = 12$ exact AKLT spin-1 NQS, each consisting of $n_{\text{samp}} = 8000$ individual sampling steps separated by $N$ individual Markov chain moves to reduce autocorrelation effects. The standard deviation $\Delta\epsilon$ for these samples around the exact zero ground state energy is shown by the blue band. (b) The standard deviations $\Delta\epsilon$ of energies sampled versus the number of samples $n_{\text{samp}}$ used. Each point is also calculated from 500 independent samples, with the number of sampling steps per evaluation being $n_{\text{samp}} = \{1000, 2000, 4000, 8000, 16{,}000\}$. The fluctuations in energy closely follow a $1/\sqrt{n_{\text{samp}}}$ scaling (dashed line), as expected for a Monte Carlo sampling process [30]. The red dotted lines are to guide the eye to the point for $n_{\text{samp}} = 8000$, which is representative of an optimization step for our VMC calculations.
Figure 9. Plots of the infidelity of unary (red stars) and spin-1 (blue squares) NQS with the AKLT state in the $S^z$ basis. The infidelity bound $R$ is also plotted as a dotted green line. Results for four system sizes are plotted: (a) $N = 6$, (b) $N = 8$, (c) $N = 10$, (d) $N = 12$.
Figure 10. Plots of the infidelity of unary (red stars) and spin-1 (blue squares) NQS with the AKLT state for the $xyz$ basis. The infidelity resolution $R$ is plotted as a green dotted line. Results for four system sizes are plotted: (a) $N = 6$, (b) $N = 8$, (c) $N = 10$, (d) $N = 12$.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Pei, M.Y.; Clark, S.R. Neural-Network Quantum States for Spin-1 Systems: Spin-Basis and Parameterization Effects on Compactness of Representations. Entropy 2021, 23, 879. https://doi.org/10.3390/e23070879
