Advanced features of MSM: a guide

MSM is an advanced tool for cortical surface mapping, which has been shown to outperform competing methods for the alignment of multimodal data, proving fundamental to the development of the Human Connectome Project’s Multimodal Parcellation of the Human Cerebral Cortex (Nature 2016)

Although there is already support for use of this method via a page on the FSL website, this does not include usage guidance for the advanced features of MSM introduced in the 2018 paper. This includes the support of much smoother and biologically plausible warps of noisy data such as fMRI and regularisation of warps between cortical anatomy. This purpose of this blog is therefore to provide top-up support for these features.

Note, code and binaries for MSM HOCR are available from github. I will respond to any questions submitted either to the github or the FSL list. Support for use with developing brains can also be reached via the dHCP neurostars list

MSM basics

MSM is a spherical registration tool which learns a mapping between multimodal (or multivariate/multichannel) cortical feature maps from two brains. Examples of the types of data which you can use can include: morphological (sulcal depth or curvature) maps, cortical myelination, task or rest fMRI, tract density maps.

Fig 1. Multimodality feature sets projected onto the cortex

If you want to align combinations of maps you can merge their gifti metric files using the HCP workbench command (available here) e.g.

wb_command -metric-merge output_name.func.gii -metric metric1.func.gii -metric metric2.func.gii -metric metric3.func.gii

A basic call to MSM is as follows

msm --indata=in_data.func.gii --refdata=ref_data.func.gii -o ~/mydirname/L.

where, there are three things you really need to know:

  1. MSM is a spherical registration tool. Thus, like FreeSurfer registration you are learning mappings between spherical projections of the cortical anatomy. –in_mesh and –ref_mesh have to be spheres. This is true whatever version of MSM you choose to use.
  2. MSM estimates a multi-resolution alignment parameterised over a series of low resolution regularly sampled spheres known as icospheres. These are referred to as control point grids in the papers and the documentation. They add regularisation and supports larger deformations.
  3. MSM mappings are learnt though Discrete Optimisation. This means, rather than allowing a control-point grid vertex to displace continuously to any new location (at each iteration) it is instead constrained to displace to one of a finite set of end points: as shown in the diagram below.
Fig 2. The control point grid (A) and discrete optimisation framework (B) of MSM. C) shows displacement of a control point onto the location of a discrete label point

The discrete cost function is as follows:

COST=\sum_{c1 \in C_D} c(\mathbf{l}_{c1}) + \lambda \sum_{c2 \in C_R } V(\mathbf{l}_{c2})

This is a Markov Random Field (MRF) cost which seeks to assign labels to each data point so as to maximise some estimate of data likelihood whilst penalising sharp changes in labelling points between neighbouring points. Thus, c(\mathbf{l}_{c1}) is the similarity for some image patch (or clique c1). And V(\mathbf{l}_{c2}) is the smoothness penalty for clique c2. We will discuss cliques in more detail later but for now let’s assume, as done in the original version of the algorithm that c1 is a univariate and just represents some similarity around a control point p.

Fig 3. The registration seeks to improve the correspondence between patches of features located around each control point (in the moving mesh) and their overlapping locations (in the target mesh)

Most often for MSM similarity is assessed through cross-correlations, either for overlapping image patches (as shown in the figure), or (if you are using multiple feature maps) an average of correlations of feature vectors at each corresponding point in space.

When finally estimated discrete displacements of the control point grid are then upsampled (or interpolated) to the original sphere using barycentric interpolation to get the final warp.

For more details on the basic use of MSM, how to use different types of data and change the basic parameters of the warp please refer to the FSL guide.


In discrete optimisation cliques are groups of data points, for example they can be single \{p\}, pairs \{p,q\}, triplets \{p,q,r\} or higher orders. In MSM the highest we go to is 3 and this reflect triplets of points, reflecting the vertex corners of each triangular face in the mesh.

Fig 4 A tri-clique (otherwise known as a mesh face)

Limitations of the original MSM

In the original version of MSM the highest order of clique considered is two. This reflects the fact that standard discrete optimisation libraries optimise MRFs where the maximum clique size is two (pairs); this is largely due to the significant computational challenges involved in programming approximations that work for such increasingly complex combinatorial problems. The issue with this is it only allows first-order smoothness penalties – i.e. you can only look at the rate of change of displacement between neighbouring points \{p,q\}. Most successful modern registration aim to penalise second order effects using, for example, bending energy penalties (see Rueckert 1999 or Ashburner 2007).


HOCR stands for Higher Order Clique Reduction, which is based on work done by Hiroshi Ishikawa (2010, 2014) in which algorithms were proposed which reduce higher (than 2) order clique terms to a linear combination of pairwise and univariate terms through polynomial approximation. This allows higher-order MRFs to be solved with standard MRF solvers such as Fast-PD (as used by the original MSM). This allows MSM to move from using a fairly weak pairwise penalty to a tri-order biomechanical strain based regularisation:

\mathbf{W_{pqr}}=\frac{\mu}{2}(R^k+R^{-k}-2)+\frac{\kappa}{2}(J^k + J^{-k} - 2)

This penalises both scale J=\lambda_2\lambda_2 and shape R=\lambda_1/\lambda_2 distortions of the triangles of the control-point mesh, where \lambda_1 and \lambda_2 are the principal stretches (or eigenvalues of the 2D deformation tensor \mathbf{F_{pqr}} and the bulk  (\kappa) and shear (\mu) moduli are parameters which weight the relative importance of the two terms.

Fig 5 Showing an affine transformation (F) of one triangle of the control point grid mesh.


aMSM – anatomical regularisation

Strain based regularisation has proved extremely vital for improving quality of alignment for noisy datasets such as fMRI. However, this is not it’s only benefit. It can also be used to indirectly infer a smoothed and regularised mapping of the cortical anatomy (see figure below)

Fig 6 aMSM infers a smoothed and regularised warp between source and target cortical anatomies (SAS and TAS).

Here, warps  continue to be inferred between source and targets spheres (SSS and TSS) via displacements of points to discrete locations defined on the sphere (c). However, due to one-to-one correspondence between vertices for each subject’s sphere and (white/pial/midthickness) surface, it means it is possible to infer an anatomical warp  (SAS->DAS) by projecting the moving surface mesh topology (SAS) onto the target anatomical surface (TAS) shape, using the  location of source vertices (shown in yellow) relative to target vertices (shown in pink: TSS) to perform the weighted (barycentric) resampling. This leads to a deformed configuration of the source anatomy (DAS) where the 2D transformations (\mathbf{F_{pqr}}) of mesh faces (from SAS-DAS) can then be regularised using biomechanical strain (\mathbf{W_{pqr}})

Running MSM HOCR from the command line

So, having gone through potential applications, let’s discuss how to access the new features of the method: from the command line and through edits to the configuration files.

 1. Config files:

A config files for running MSM HOCR for spheres looks like this

Screen Shot 2020-04-05 at 13.40.36
sMSM strain config

Other examples can be found here

There are few changes here relative to the (original) FSL version of MSM. However, the following parameters are new

  • –rescaleL this argument updates the discrete label space at each iteration to allow more thorough sampling of the entire space (for more details see the paper). It makes the code run slower but improves performance
  • –bulkmod and –shearmod are the bulk and shear modulus parameters of the strain function defined above. Investigations in the 2018 MSM paper showed that a) this default parametrisation worked for adult and neonatal brains and b) the method was robust to a range of choices of values. There should be no reason to change from these default values unless you want to do advanced biomechanical modelling of the type described in Garcia et al 2018
  • –k_exponent is the k from the strain equation. Again, it is not recommended that you change this
  • –triclique this uses tricliques for similarity estimation. In general we found these didn’t help much so don’t generally use them
  • –regoption This determines whether you are running MSM with the original (pairwise) regularisation of the 2014 paper (for this use  –regoption=1), strain based regularisation of spheres (option 3) and strain-based regularisation of cortical anatomies (option 5). Option 2 (and 4) reflect an angular scale-based penalty used for the 2016 HCP parcellation paper. They are not as good as strain so I don’t recommend using them
  • –dopt – will determine how the method is optimised. Set –dopt=HOCR for advanced regularisation

For all other config parameters we advise checking the FSL MSM doc

In addition if you seek to run aMSM and penalise anatomical deformations you might consider the following options:

  • –anatgrid – this uses a downsampled version of the anatomical grid (e.g. as shown in Fig 5) to estimate the distortions F_pqr. The downsampled meshes are estimaed through projection of different icospheric mesh resolutions onto the anatomical surface. We found that going lower than ico 3 downgraded performance and advise starting with 4 (see example aMSM config below)
Screen Shot 2020-04-05 at 13.53.41
               aMSM strain config               (used as file aMSM_strain_conf in aMSM command line call below)

One other thing to note for aMSM is that it requires many more iterations and due to the computational load (and likely inefficient implementation) it takes a long time (up to 24 hours to run). Generally MSM HOCR is a slower algorithm on account of how Ishikawa’s implementation breaks the problem up into a number of separate optimisations for each label (rather than fully integrating as for standard FastPD).

2. Command line.

In general the command line calls will remain exactly the same. The exception being when you want to run aMSM. Here, you need to supply the input and target cortical anatomy files using the –inanat and –refanat options. Here, the anatomy can be the white, pial, midthickness (or even inflated) surfaces. Each will place different constraints on the alignment. An example call to aMSM is as follows:

msm --indata=in_data.func.gii --refdata=ref_data.func.gii -o ~/mydirname/L. --conf=aMSM_strainconf

For more guidance on the impact of parameter choices for aMSM, and guidance on applying to your data please refer to our paper. If you are expressly interested in Biomechanical modelling, detailed optimisation and configuration guidance is supplied in the supplementary information of Garcia et al 2018. where aMSM was used to build a model of cortical growth in preterm neonates. The below figure shows how this lead to evidence of differential rates of growth across the cortex over time.

Fig 5 from Garcia at al 2018. This shows (A and B) how rates of growth are initially higher in the lateral parietal, temporal, occipital, and frontal regions (red) and lower in the medial frontal and insular regions (blue).  However  relative expansion increases in the temporal lobe over time (A and C). For more details see the paper