Abstract
In this work we introduce a multi-species generalization of the Hopfield model for associative memory, where neurons are divided into groups and both inter-group and intra-group pairwise interactions are considered, with different intensities. Thus, this system contains two of the main ingredients of modern deep neural-network architectures: Hebbian interactions to store patterns of information and multiple layers coding different levels of correlations. The model is completely solvable in the low-load regime with a suitable generalization of the Hamilton–Jacobi technique, even though the Hamiltonian can be a non-definite quadratic form of the Mattis magnetizations. The family of multi-species Hopfield models includes, as special cases, the three-layer Restricted Boltzmann Machine with Gaussian hidden layer and the Bidirectional Associative Memory model.
Notes
The random variable z is chosen to be symmetrically distributed and, typically, its probability density is taken as \(p(z) = [1- \tanh ^2(z)]/2\).
The intensive pressure, here denoted as \(f_N(\beta )\), is strictly related to the (possibly more familiar) intensive free energy \(\tilde{f}_N(\beta )\) by \(f_N(\beta ) = - \beta \tilde{f}_N(\beta )\). Therefore, the existence and uniqueness of \(\tilde{f}_N(\beta )\) also ensure the existence and uniqueness of \(f_N(\beta )\), and vice versa, while the convexity of \(\tilde{f}_N(\beta )\) ensures the concavity of \(f_N(\beta )\). As a result, the thermodynamic equilibrium can be detected as a minimum of the free energy or as a maximum of the pressure.
Acknowledgements
E.A. acknowledges financial support by Sapienza Università di Roma (Project No. RG11715C7CC31E3D). D.T. is supported by Scuola Normale Superiore and National Group of Mathematical Physics GNFM-INdAM.
Appendix A. Proof of Lemma 1
Before proving Lemma 1 we define the matrix \(\tilde{\mathbf{J }}^c\) as
where \(k^*\) is defined as in Lemma 1. Then the following holds
Lemma 2
The quadratic form stemming from (A.1) is positive definite if \(c>1+\displaystyle \frac{\nu -1}{k^*}\). Moreover, we get \(\tilde{\mathbf{J }}^c = \tilde{\mathbf{P }}^T\tilde{\mathbf{P }}\), where the rows of \(\tilde{\mathbf{P }}\) are the linearly independent vectors given by
Proof
Let \(\mathbf{A }=\mathbf{diag }(\alpha _1, \dots , \alpha _{\nu })\) and define
such that we can write \(\tilde{\mathbf{J }}^c=\mathbf{A }\tilde{\mathbf{T }}^c\mathbf{A }\). We now derive the eigenvectors of \(\tilde{\mathbf{T }}^c\). In order for \(\mathbf{w }\) to be an eigenvector with eigenvalue \(\lambda \), the following homogeneous system must be fulfilled
Focusing on the ith component
If \(\displaystyle \sum _{a=1}^{\nu } w_{a}\ne 0\), then \(\mathbf{w }^1={\nu }^{-1/2}(1, \dots , 1)\) is an eigenvector with eigenvalue \(\lambda = (c-1)k^*+1-\nu \), which is positive if \(c>1+(\nu -1)/k^*\).
The remaining eigenvectors live in the \((\nu -1)\)-dimensional subspace characterized by \(\displaystyle \sum _{a=1}^{\nu } w_{a}=0\), and they all share the same eigenvalue \(\lambda =(c-1)k^*+1\). Normalizing, we obtain the following basis of eigenvectors for the subspace orthogonal to \(\mathbb {R} \mathbf{w }^1\):
Let \(\mathbf{P }^\prime \) be the \(\nu \times \nu \) matrix whose rows are the vectors \(\mathbf{w }^a\), \(a=1, \dots , \nu \). This matrix satisfies \({\mathbf{P }^\prime }^T\mathbf{P }^\prime ={\tilde{\mathbf{T }}}^c\); therefore, setting \(\tilde{\mathbf{P }}=\mathbf{P }^\prime \mathbf{A }\), we get
Finally, we notice that the rows of \(\tilde{\mathbf{P }}\) are
\(\square \)
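The spectrum derived above can be checked numerically. The explicit form of the matrix in (A.1) is not reproduced here, so the sketch below *assumes* that \(\tilde{\mathbf{T }}^c\) takes the form \(((c-1)k^*+1)\,\mathbf{I }-\mathbf{1 }\mathbf{1 }^T\) (diagonal entries \((c-1)k^*\), off-diagonal entries \(-1\)), which is the unique symmetric matrix reproducing exactly the two eigenvalues obtained in the proof; the values of \(\nu \), \(k^*\) and \(c\) are illustrative choices.

```python
import numpy as np

# Illustrative parameters (nu, k_star, c are hypothetical choices).
nu, k_star = 4, 2.0
c = 1 + (nu - 1) / k_star + 0.5  # safely above the threshold c > 1 + (nu-1)/k*

# Assumed form of T^c, consistent with the spectrum derived in Lemma 2:
# T^c = ((c-1)k* + 1) I - 1 1^T.
T = ((c - 1) * k_star + 1) * np.eye(nu) - np.ones((nu, nu))

eigvals = np.sort(np.linalg.eigvalsh(T))

# Smallest eigenvalue belongs to w^1 = nu^{-1/2} (1, ..., 1):
assert np.isclose(eigvals[0], (c - 1) * k_star + 1 - nu)
# The remaining nu-1 eigenvalues are degenerate and equal to (c-1)k* + 1:
assert np.allclose(eigvals[1:], (c - 1) * k_star + 1)
# Positive definiteness holds since c exceeds 1 + (nu-1)/k*:
assert eigvals[0] > 0
```

With \(c\) pushed below the threshold \(1+(\nu -1)/k^*\), the same script yields a negative smallest eigenvalue, confirming that the bound in Lemma 2 is sharp.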
We are now ready to prove Lemma 1.
Proof
We want to prove that, if the condition \(c>1+\displaystyle \frac{\nu -1}{k^*}\) holds, then (5.1) is positive definite. Throughout the proof we exploit the fact that \(\tilde{\mathbf{J }}^c\), defined in (A.1), is positive definite, and we consider the diagonal matrix given by the difference between (5.1) and (A.1):
Therefore, if \(c>1+\displaystyle \frac{\nu -1}{k^*}\), then \(\mathbf{J }^c-\tilde{\mathbf{J }}^c >0\). For an arbitrary nonzero vector \(\mathbf{u }\in \mathbb {R}^\nu \) we then have
from which we get
The last inequality is ensured by Lemma 2, and this concludes the proof. \(\square \)
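The decomposition used in the proof, namely that adding a positive diagonal matrix to a positive-definite matrix preserves positive definiteness, can also be sketched numerically. Since the explicit forms of (5.1) and (A.1) are not reproduced here, the snippet assumes \(\tilde{\mathbf{J }}^c=\mathbf{A }\tilde{\mathbf{T }}^c\mathbf{A }\) with \(\mathbf{A }=\mathbf{diag }(\alpha _1,\dots ,\alpha _\nu )\) as above, takes the same illustrative form for \(\tilde{\mathbf{T }}^c\), and stands in a hypothetical positive diagonal matrix \(D\) for the difference \(\mathbf{J }^c-\tilde{\mathbf{J }}^c\).

```python
import numpy as np

rng = np.random.default_rng(0)
nu, k_star = 4, 2.0
c = 3.0  # above the threshold 1 + (nu-1)/k* = 2.5

# tilde{J}^c = A T^c A with A = diag(alpha); the alpha_a > 0 are illustrative.
alpha = rng.uniform(0.5, 1.5, size=nu)
A = np.diag(alpha)
T = ((c - 1) * k_star + 1) * np.eye(nu) - np.ones((nu, nu))  # assumed form, as above
J_tilde = A @ T @ A

# J^c differs from tilde{J}^c by a diagonal matrix, positive for c above the
# threshold; a hypothetical positive diagonal D stands in for that difference.
D = np.diag(rng.uniform(0.1, 1.0, size=nu))
J = J_tilde + D

# u^T J u = u^T D u + u^T tilde{J} u > 0 for every nonzero u,
# i.e. all eigenvalues of J are strictly positive:
assert np.all(np.linalg.eigvalsh(J_tilde) > 0)
assert np.all(np.linalg.eigvalsh(J) > 0)
```

The two assertions mirror the two steps of the proof: \(\tilde{\mathbf{J }}^c>0\) by Lemma 2, and the diagonal shift then forces \(\mathbf{J }^c>0\).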
Cite this article
Agliari, E., Migliozzi, D. & Tantari, D. Non-convex Multi-species Hopfield Models. J Stat Phys 172, 1247–1269 (2018). https://doi.org/10.1007/s10955-018-2098-6