How opinion dynamics can be used for mathematical optimization
In both animal and human societies, the power of collective intelligence has long fascinated observers and scientists. From flocks of birds navigating the skies to colonies of ants efficiently foraging for food, nature has in many places perfected the art of cooperation and consensus-building among its members. Similarly, in human societies, we constantly experience how communication and collaboration steer our individual actions and help us reach agreements, make decisions, and solve complex problems. At the heart of these phenomena lies the concept of “swarm intelligence,” a term that abstractly captures the tendency of groups of individuals, whether animals, humans, or even artificial agents, to reach consensus by exchanging information.
Mathematicians have long picked up on these concepts in their search for efficient optimisation methods for complicated, non-convex objective functions. Typical algorithms proposed along these lines simulate several agents at once, all of which explore the domain of the optimisation problem at hand while exchanging information about their current positions and their quality as solutions of the problem. While mathematical formulations of such methods have been around since at least the 1950s, rigorously investigating them is often difficult because of the complicated interplay between individual agents and collective behaviour.
A line of work on a particular class of methods, termed consensus-based algorithms, has recently been able to shed some light on this interplay between the local terms driving individual particles and the emergent collective behaviour. The main idea behind consensus-based optimization (CBO), proposed originally in 2016, is to define individual particles, or agents, that evolve in time according to a certain stochastic differential equation (SDE). Instead of studying the behaviour of the system of finitely many particles directly, the analysis is concerned with the mean-field limit, i.e. the limit of infinitely many particles, described by their distribution across the domain of the problem. The resulting mean-field limit partial differential equation (PDE) is simpler to study, and the theoretical results in turn justify the simulation of finite numbers of particles.
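To give a flavour of the dynamics, in the original (isotropic) formulation each of the N particles X_t^i follows, roughly, an SDE of the form

\[
\mathrm{d}X_t^i = -\lambda\,\bigl(X_t^i - v_\alpha(\widehat{\rho}_t^N)\bigr)\,\mathrm{d}t + \sigma\,\bigl|X_t^i - v_\alpha(\widehat{\rho}_t^N)\bigr|\,\mathrm{d}B_t^i,
\qquad
v_\alpha(\widehat{\rho}_t^N) = \frac{\sum_{i=1}^{N} X_t^i\, e^{-\alpha f(X_t^i)}}{\sum_{i=1}^{N} e^{-\alpha f(X_t^i)}},
\]

where f is the objective function, the B_t^i are independent Brownian motions, and \lambda, \sigma, \alpha > 0 are parameters. The notation here follows the common CBO literature and is meant as a schematic illustration rather than the exact formulation used in the paper. The drift pulls every particle towards the weighted average v_\alpha, which favours particles with small objective values, while the noise lets the particles explore the domain with a strength proportional to their distance from this consensus point.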
After the original CBO scheme was proposed and analysed, the techniques could naturally be transferred to modified algorithms, and the hunt for more efficient methods began. In particular, it is not a priori clear what type of local, individual information a particle should use, and how exactly it should best interact with the other particles. The CBO scheme above, for example, contains a single ‘individual’ term (the noise that lets particles explore the domain randomly) and a ‘collective’ drift term which drives all particles towards a consensus weighted by the quality of their individual positions. Follow-up works have investigated schemes that, for example, feature a different type of randomness or allow the individual agents to keep a memory of their past positions. Recent research, carried out by Konstantin Riedl at TU Munich and published in the European Journal of Applied Mathematics, explores and analyses such modifications of the CBO algorithm.
The author analyses a variant of CBO which incorporates the aforementioned memory effects as well as local gradient information. The ‘memory’ enters the algorithm both on the individual and on the collective level. For every particle, the method keeps track of that particle’s historically best position in terms of its objective value. By storing this best past position, one can add an individual force term pushing each particle towards it. At the collective level, the consensus is formed not between the particles’ current random positions, but between these best past positions. The ‘gradient information’ considered in the paper enters as yet another term in the SDE defining the scheme. This term pushes particles in the direction of the objective’s negative gradient, as in gradient descent, one of the simplest and most popular optimisation methods.
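As a rough illustration of how such a scheme might be implemented, the following is a minimal Python sketch of a discretised consensus-based method with personal-best memory and an added gradient drift. The parameterisation, the parameter names (lam, beta, kappa, sigma) and the exact placement of the noise term are illustrative choices, not the precise scheme analysed in the paper.

import numpy as np

def cbo_memory_gradient(f, grad_f, X0, alpha=30.0, lam=1.0, beta=0.5,
                        kappa=0.1, sigma=1.0, dt=0.01, steps=2000):
    """Euler-Maruyama sketch of a consensus-based scheme with personal-best
    memory and an extra gradient drift.

    f maps an (N, d) array of positions to an (N,) array of objective values,
    grad_f maps (N, d) to (N, d), and X0 holds the initial particle positions.
    Returns an approximation of the minimiser (the final consensus point).
    """
    X = X0.copy()          # current particle positions
    Y = X0.copy()          # each particle's best position seen so far
    fY = f(Y)              # objective values at those best positions

    for _ in range(steps):
        # Consensus point: Gibbs-weighted average of the *best* positions,
        # so that for large alpha the best particles dominate.
        w = np.exp(-alpha * (fY - fY.min()))
        v = (w[:, None] * Y).sum(axis=0) / w.sum()

        # Drift towards the consensus point (collective term), towards the
        # personal best (memory term) and along the negative gradient (local term).
        drift = -lam * (X - v) - beta * (X - Y) - kappa * grad_f(X)

        # Random exploration scaled by the distance to the consensus point.
        noise = (sigma * np.linalg.norm(X - v, axis=1, keepdims=True)
                 * np.sqrt(dt) * np.random.randn(*X.shape))

        X = X + dt * drift + noise

        # Update each particle's historical best position.
        fX = f(X)
        better = fX < fY
        Y[better], fY[better] = X[better], fX[better]

    # Final consensus point, computed from the best positions found.
    w = np.exp(-alpha * (fY - fY.min()))
    return (w[:, None] * Y).sum(axis=0) / w.sum()

In this sketch the memory enters twice, mirroring the description above: through the drift of each particle towards its own best past position, and through the consensus point being computed from those best positions rather than from the current ones.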
Needless to say, the additional terms complicate the analysis of the particles’ mean-field limit. Nevertheless, as the author states, the new “results for CBO with memory effects and gradient information hold for a vast class of objective functions under minimal assumptions on the initialisation of the method”. The main theoretical work of the article goes into deriving the mean-field limit PDE and showing that its solution contracts exponentially fast to a point mass located at the correct minimiser.
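Schematically, for the plain CBO dynamics displayed earlier, the mean-field limit is a nonlinear Fokker-Planck equation for the law \rho_t of the particles,

\[
\partial_t \rho_t = \lambda\, \nabla \cdot \bigl( (x - v_\alpha(\rho_t))\, \rho_t \bigr) + \frac{\sigma^2}{2}\, \Delta \bigl( |x - v_\alpha(\rho_t)|^2\, \rho_t \bigr),
\]

to which the memory and gradient mechanisms contribute additional terms. Convergence ‘in mean-field law’ is then, roughly speaking, a statement of the form

\[
W_2^2\bigl(\rho_t, \delta_{x^*}\bigr) \le W_2^2\bigl(\rho_0, \delta_{x^*}\bigr)\, e^{-c t},
\]

where x^* is the global minimiser, \delta_{x^*} the point mass located at it, W_2 the Wasserstein-2 distance and c > 0 a rate depending on the parameters, valid up to an error that can be made arbitrarily small by tuning the weight parameter \alpha. The precise assumptions, constants and the equation actually derived are those given in the paper.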
The article also presents numerical evidence that, in practice too, both the memory and the gradient information can be vital for the consensus-based algorithm to find the correct minimum. An example computation is carried out for the notoriously non-convex and difficult problem of training a neural network. In another application, to a non-convex formulation of a standard signal processing problem, the new algorithm easily outperforms the standard CBO scheme in its ability to find the correct minimum.
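Purely for illustration (and not to reproduce any of the paper’s experiments), the sketch given earlier could be run on the Rastrigin function, a standard highly non-convex benchmark with many local minima and global minimum at the origin:

import numpy as np

# Rastrigin function and its gradient; the global minimum is f(0) = 0.
def rastrigin(X):
    return 10 * X.shape[1] + (X**2 - 10 * np.cos(2 * np.pi * X)).sum(axis=1)

def rastrigin_grad(X):
    return 2 * X + 20 * np.pi * np.sin(2 * np.pi * X)

X0 = np.random.uniform(-3, 3, size=(100, 20))    # 100 particles in 20 dimensions
x_hat = cbo_memory_gradient(rastrigin, rastrigin_grad, X0)
print(np.linalg.norm(x_hat))                     # small if a consensus formed near the minimiser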
The research by Konstantin Riedl shows how the CBO framework allows for a rigorous analysis even when the dynamics are driven by increasingly complex terms at both the individual and the collective level, and demonstrates that the powerful technique of analysing mean-field limits of particle systems can be effectively applied to more refined optimisation algorithms.
‘Leveraging memory effects and gradient information in consensus-based optimisation: On global convergence in mean-field law’ is published open access in the European Journal of Applied Mathematics and is freely available to read, download, and share.
Since 2008 EJAM surveys have been expanded to cover Applied and Industrial Mathematics. Coverage of the journal has been strengthened in probabilistic applications, while still focusing on those areas of applied mathematics inspired by real-world applications, and at the same time fostering the development of theoretical methods with a broad range of applicability. Survey papers contain reviews of emerging areas of mathematics, either in core areas or with relevance to users in industry and other disciplines. Research papers may be in any area of applied mathematics, with special emphasis on new mathematical ideas, relevant to modelling and analysis in modern science and technology, and the development of interesting mathematical methods of wide applicability.