***Unofficial*** Guide to the HCP surface file formats

The WU-Minn Human Connectome Project has provided an immensely powerful resource: functional and structural imaging data for 1000 adult subjects, all projected to the surface. This resource has already begun to generate high-impact publications and huge leaps forward in understanding macroscale brain structure and its influence on human behaviour [1][2]. Nevertheless, the data sets provided by the HCP are extensive and not always straightforward to interpret, so I am providing a simple guide.

This is not intended to be a complete guide, and it only represents the data structure as I understand it (and have used it). I do not intend to speak for the HCP in any official way. I also consider this a work in progress, so if any points are still unclear, or there is a consensus that this misrepresents the data structures in any way, I would be happy to edit it.

I’ll start by summarising the surface file formats. These are found in the “structural” subfolder for each subject. The relevant file structure is as follows:

  • structural/MNINonlinear/, with subfolders
  • structural/MNINonlinear/Native
  • structural/MNINonlinear/fsaverage_LR32k

An important thing to recognise first about the HCP surface file format is that it has two versions of the atlas space: 164k_FS_LR and 32k_FS_LR. These atlases are regularly spaced and represent a left-right symmetric atlas developed by WashU in [3]. FS stands for FreeSurfer, and indicates that the atlas is related to the FreeSurfer atlas fsaverage.

For each subject the top directory structural/MNINonlinear/ has surface topologies and features resampled onto the high (164k) dimensional atlas space. The fsaverage_LR32k folder has data resampled to the low (32k) dimensional atlas space. All functional and diffusion data will have been resampled to the 32k surface in order to reduce the size of the files, but also because the original data resolution does not support a higher resolution.

The Native folder represents each subject’s ‘native’ surfaces, that is, those extracted directly from each subject’s T1 using FreeSurfer. From my understanding of the FreeSurfer pipeline, the white matter mesh is first fit to a white matter (and deep) tissue segmentation; this is then expanded out towards the pial surface, and further to form an inflated surface (as documented in [4]); then the inflated surface is projected onto a sphere. Midthickness is derived as a surface half way between the pial and white. All surfaces have vertex correspondence, which means the same vertex on each surface represents roughly the same point in the brain.
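Because of this vertex correspondence, a "half way" surface can be sketched as a plain vertex-by-vertex average. The snippet below is only a minimal illustration under that assumption (that midthickness is the simple mean of matching white and pial coordinates); the function name and the two made-up vertices are hypothetical, and real surfaces would of course be read from the GIFTI files rather than typed in.

```python
def midthickness(white, pial):
    """Average matching white and pial vertices.

    Both inputs are lists of (x, y, z) tuples. Vertex i on one surface is
    assumed to correspond to vertex i on the other.
    """
    if len(white) != len(pial):
        raise ValueError("surfaces must have the same number of vertices")
    return [
        tuple((w + p) / 2.0 for w, p in zip(white_vertex, pial_vertex))
        for white_vertex, pial_vertex in zip(white, pial)
    ]


# Two made-up vertices, purely for illustration.
white = [(0.0, 0.0, 0.0), (10.0, 2.0, 4.0)]
pial = [(2.0, 0.0, 0.0), (12.0, 4.0, 4.0)]

mid = midthickness(white, pial)
print(mid)  # [(1.0, 0.0, 0.0), (11.0, 3.0, 4.0)]
```

The length check matters: averaging only makes sense when the two meshes share the same vertex indexing, which is exactly what vertex correspondence provides.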

In the HCP file system, for subject X and hemisphere L, the native surface files are named:

  • X.L.white.native.surf.gii
  • X.L.midthickness.native.surf.gii
  • X.L.pial.native.surf.gii
  • X.L.inflated.native.surf.gii
  • X.L.sphere.native.surf.gii
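The naming pattern above is regular enough to generate. Here is a small sketch of a helper that builds the expected path to a native surface; the function name is hypothetical, and I am assuming that every native surface (including the sphere) carries the .surf.gii extension and that the right hemisphere uses "R" in the same position as "L".

```python
from pathlib import PurePosixPath

# Surface names taken from the list in this guide.
NATIVE_SURFACES = ("white", "midthickness", "pial", "inflated", "sphere")


def native_surface_path(subject_dir, subject, hemi, surface):
    """Return the expected path of a native surface file (nothing is opened)."""
    if hemi not in ("L", "R"):  # "R" is an assumption, by analogy with "L"
        raise ValueError(f"unknown hemisphere: {hemi!r}")
    if surface not in NATIVE_SURFACES:
        raise ValueError(f"unknown surface: {surface!r}")
    filename = f"{subject}.{hemi}.{surface}.native.surf.gii"
    return (PurePosixPath(subject_dir) / "structural" / "MNINonlinear"
            / "Native" / filename)


path = native_surface_path("X", "X", "L", "white")
print(path)  # X/structural/MNINonlinear/Native/X.L.white.native.surf.gii
```

The helper only builds a path, so it is worth checking that the file actually exists in your own download before relying on it.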

Of these, the inflated (and indeed very inflated) surfaces are predominantly for visualisation, and the spheres provide a 2D surface on which to perform simplified registration using spherical alignment methods such as FreeSurfer [5], Spherical Demons [6] or MSM [7].

I have an extensive blog post on MSM in preparation, so I won’t go into details other than to say that MSM is a multimodal surface matching approach that has thus far been used for alignment of resting state and task fMRI, cortical folding, cortical myelin and retinotopy. For this reason the HCP recently replaced FreeSurfer with MSM for their pipelines.

Therefore, where the HCP speak of MSMSulc and MSMAll, they refer to data that has been aligned using just folding (MSMSulc), or folding and function (MSMAll). Each registration comes with a different spherical warp, i.e.:

  • X.L.sphere.MSMSulc.native.surf.gii
  • X.L.sphere.MSMAll.native.surf.gii

And, as a legacy of the FreeSurfer alignments, each subject also comes with FreeSurfer-warped surfaces:

  • X.L.sphere.reg.native.surf.gii (fsaverage aligned)
  • X.L.sphere.reg.reg_LR.native.surf.gii (fs_LR aligned)
  • X.L.sphere.rot.native.surf.gii (rotated into alignment with FS_LR)

The latter is necessary as FS_LR is rotated with respect to the FreeSurfer average space. X.L.sphere.rot.native.surf.gii provides rough alignment prior to application of MSMSulc.

The trickiest concept is the relationship between these meshes and the template spaces. In short, as MSM is now used for the full HCP pipeline, the data in each subject’s fsaverage_LR32k and MNINonlinear folders correspond to the result of resampling the subject’s data after alignment with MSMSulc (the default) or MSMAll, with all data resampled with MSMAll clearly identified by the filename.

Thus, if you ever want to resample from the atlas space to the Native space using workbench or MSM tools, the most important thing to remember is that the MSMAll template datasets correspond to X.L.sphere.MSMAll.native.surf.gii, and the default template datasets correspond to X.L.sphere.MSMSulc.native.surf.gii.
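To make the pairing concrete, here is a sketch that picks the native sphere matching a given registration and assembles a workbench resampling call. It only builds the argument list and does not run anything. The helper name and the three example file names (atlas_data.func.gii, atlas_sphere.surf.gii, native_data.func.gii) are placeholders I made up; I am also assuming the wb_command -metric-resample argument order (metric in, current sphere, new sphere, method, metric out) and the BARYCENTRIC method, so please check these against the help text of your workbench version.

```python
# Registration name -> native sphere suffix, as listed in this guide.
NATIVE_SPHERES = {
    "MSMSulc": "sphere.MSMSulc.native.surf.gii",
    "MSMAll": "sphere.MSMAll.native.surf.gii",
}


def atlas_to_native_command(subject, hemi, registration,
                            metric_in, atlas_sphere, metric_out):
    """Build (but do not run) a command resampling atlas data to native space."""
    if registration not in NATIVE_SPHERES:
        raise ValueError(f"unknown registration: {registration!r}")
    native_sphere = f"{subject}.{hemi}.{NATIVE_SPHERES[registration]}"
    return [
        "wb_command", "-metric-resample",
        metric_in,       # data currently on the atlas mesh
        atlas_sphere,    # sphere the data currently live on
        native_sphere,   # native sphere registered with the same method
        "BARYCENTRIC",   # assumed interpolation method
        metric_out,
    ]


cmd = atlas_to_native_command(
    "X", "L", "MSMAll",
    "atlas_data.func.gii", "atlas_sphere.surf.gii", "native_data.func.gii",
)
print(" ".join(cmd))
```

The point of routing the choice through a single lookup is that MSMAll data can never be paired with the MSMSulc sphere by accident, which is the mistake this section warns about.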

Feature sets provided by the HCP include MyelinMaps, which can be smoothed and/or bias corrected (BC) with respect to differences with the template. Folding is provided as sulcal depth (sulc), which represents how far the surface is projected inwards or outwards during inflation, or as curvature (principal curvatures). Thickness features are also available, and on the 32k surface, resting state fMRI, task activations and diffusion-derived structural connectivity features are also available.

Other relevant surface data files provided as output of the structural pipeline include areal distortion maps, which provide a measure of the change in area between the original and distorted spheres (expressed as a log2 ratio).
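As a worked example of the log2 measure, the sketch below computes the area of a single triangle before and after distortion and takes log2 of the ratio. This is my own simplified illustration, not the pipeline's code: I am assuming the ratio is distorted area over original area, and I use one made-up triangle rather than per-vertex areas on a real sphere.

```python
import math


def triangle_area(a, b, c):
    """Area of a 3D triangle from its three (x, y, z) corners."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def log2_areal_distortion(area_before, area_after):
    """log2 of the area ratio: 0 means no change, +1 doubled, -1 halved."""
    return math.log2(area_after / area_before)


# A made-up triangle whose edges double in length, so its area quadruples.
before = triangle_area((0, 0, 0), (1, 0, 0), (0, 1, 0))  # 0.5
after = triangle_area((0, 0, 0), (2, 0, 0), (0, 2, 0))   # 2.0
print(log2_areal_distortion(before, after))  # 2.0
```

The log2 scale is convenient because expansion and compression are symmetric around zero: a doubling of area gives +1 and a halving gives -1.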

Hopefully all this provides some clarification. In my next post I will talk in more detail about the MSM registration method. Watch this space!


[1] Smith, Stephen M., et al. “A positive-negative mode of population covariation links brain connectivity, demographics and behavior.” Nature Neuroscience 18.11 (2015): 1565-1567.

[2] Atasoy, Selen, Isaac Donnelly, and Joel Pearson. “Human brain networks function in connectome-specific harmonic waves.” Nature Communications 7 (2016).

[3] Van Essen, David C. “A population-average, landmark- and surface-based (PALS) atlas of human cerebral cortex.” NeuroImage 28.3 (2005): 635-662.

[4] Fischl, Bruce, Martin I. Sereno, and Anders M. Dale. “Cortical surface-based analysis: II: inflation, flattening, and a surface-based coordinate system.” NeuroImage 9.2 (1999): 195-207.

[5] Fischl, Bruce, et al. “High-resolution intersubject averaging and a coordinate system for the cortical surface.” Human Brain Mapping 8.4 (1999): 272-284.

[6] Yeo, B. T. Thomas, et al. “Spherical demons: fast diffeomorphic landmark-free surface registration.” IEEE Transactions on Medical Imaging 29.3 (2010): 650-668.

[7] Robinson, Emma C., et al. “MSM: A new flexible framework for Multimodal Surface Matching.” NeuroImage 100 (2014): 414-426.
