Using SQUASHFS archives
Squashfs is a compressed archive format that can be mounted to give immediate, read-only access to the files inside it, without extracting them.
On MASSIVE, this is a way to reduce file counts and file sizes, and it can make reading files, particularly many small files, much more efficient.
This page is a tutorial on how to create squashfs archives and mount them.
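Before archiving, it can help to see how many files and how much space a directory currently uses. The following is a minimal sketch using only standard tools; summarise_dir is a hypothetical helper name and is not part of the squashfs tools.

```shell
# Hypothetical helper: report the file count and disk usage of a directory.
# Assumes file names do not contain newline characters.
summarise_dir() {
    local dir files size_kb
    dir=$1
    # Count regular files only (directories and symbolic links are not counted).
    files=$(find "$dir" -type f | wc -l | tr -d ' ')
    # du -sk reports disk usage in 1024-byte blocks.
    size_kb=$(du -sk "$dir" | cut -f1)
    echo "$dir: $files files, $size_kb KiB"
}
```

Calling summarise_dir foo prints a single line giving the number of files and the size of foo, which can be compared with the size of foo.squashfs afterwards.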
Squashfs archives are created with the mksquashfs utility; see man mksquashfs for a full synopsis. The basic command is
mksquashfs [files/folders] <output>.squashfs [options]
We will deal with the case of making an archive of a single directory and all its contents. Let the directory be called foo. We will make an archive called foo.squashfs which will contain its contents.
mksquashfs foo foo.squashfs -no-xattrs
The -no-xattrs option disables storing extended attributes, which is needed to avoid a lot of warnings. The default options use gzip as the compressor, which we don't want, so for different types of data I recommend:
mksquashfs foo foo.squashfs -no-xattrs -comp zstd -Xcompression-level 19 -processors `nproc`
or
mksquashfs foo foo.squashfs -no-xattrs -comp xz -processors `nproc`
Where there is no point trying hard to get further compression, use the fastest compression options:
mksquashfs foo foo.squashfs -no-xattrs -comp zstd -Xcompression-level 1 -processors `nproc`
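The compression choices above can be wrapped in a small helper so that scripts pick one by name. This is only a sketch: squash_cmd and the profile names small and fast are made up for illustration, and the function prints the command rather than running it, so it can be checked first.

```shell
# Hypothetical helper: print (not run) a mksquashfs command for a directory.
# The profile names "small" and "fast" are illustrative, not mksquashfs options.
# -comp selects the compressor; -Xcompression-level is a compressor-specific option.
squash_cmd() {
    local dir profile opts
    dir=$1
    profile=${2:-small}
    case "$profile" in
        small) opts="-comp zstd -Xcompression-level 19" ;;  # favour archive size
        fast)  opts="-comp zstd -Xcompression-level 1" ;;   # favour speed
        *) echo "unknown profile: $profile" >&2; return 1 ;;
    esac
    echo "mksquashfs $dir $dir.squashfs -no-xattrs $opts -processors $(nproc)"
}
```

For example, squash_cmd foo fast prints the command for the fastest compression option, and squash_cmd foo prints the zstd level 19 command.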
In Linux/UNIX, to mount a filesystem or archive is to make its contents available within a directory. We can mount squashfs archives to give us read-only access to the contents. To mount a squashfs archive, use the squashfuse utility.
First, you need to create an empty directory to act as the mount point. Unfortunately, Lustre doesn't allow its directories to be used as mount points, so we need to use the system /tmp directory.
Either make a fixed directory and mount the archive there
mkdir /tmp/foo
squashfuse foo.squashfs /tmp/foo
or, the better option, use a generated temporary directory
D=`mktemp -d`
squashfuse foo.squashfs $D
The second option is better because mktemp always creates a new, uniquely named directory, so you cannot clash with a directory that is already there.
The directory contents will be accessible at /tmp/foo or the temporary directory $D.
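The temporary-directory approach can be wrapped in a function that also checks that the mount worked. A minimal sketch, assuming squashfuse and the mountpoint utility (from util-linux) are on your PATH; mount_squash is a hypothetical name.

```shell
# Hypothetical helper: mount an archive on a new temporary directory, confirm
# that the mount is active, and print the mount point.
# Returns non-zero (and leaves no directory behind) if the mount fails.
mount_squash() {
    local archive mnt
    archive=$1
    mnt=$(mktemp -d) || return 1
    if squashfuse "$archive" "$mnt" && mountpoint -q "$mnt"; then
        echo "$mnt"
    else
        rmdir "$mnt"    # remove the unused, still-empty temporary directory
        return 1
    fi
}
```

It can be used as D=$(mount_squash foo.squashfs), after which $D is used and unmounted exactly as described on this page.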
You can make a symbolic link in your current directory to the mounted directory, so that you can access the contents at their original location, e.g.
ln -sf /tmp/foo foo
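Note that if foo still exists as a real directory, ln -sf creates the link inside foo instead of replacing it, so it can be worth guarding the link step. A minimal sketch; link_mount is a hypothetical helper, and -n is the GNU ln option that treats an existing symbolic link to a directory as a plain file.

```shell
# Hypothetical helper: point the symbolic link NAME at TARGET, but only if
# NAME is absent or is already a symbolic link.
link_mount() {
    local target name
    target=$1
    name=$2
    # Refuse to touch a real file or directory with the same name.
    if [ -e "$name" ] && [ ! -L "$name" ]; then
        echo "$name exists and is not a symbolic link; move it aside first" >&2
        return 1
    fi
    # -n replaces an existing link rather than creating a new link inside its target.
    ln -sfn "$target" "$name"
}
```

For example, link_mount /tmp/foo foo creates or updates the link, and fails with a message if the original foo directory is still in place.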
When you have finished, make sure you unmount, i.e.
umount /tmp/foo
or
umount $D
Bear in mind that /tmp is specific to the current node you are working on. So the symbolic link trick will only work if you are running on a single node. If you are running grid jobs, you will need to use the $D option in your scripts, i.e.
D=`mktemp -d`
squashfuse foo.squashfs $D
cmds $D/filenames...
umount $D
You can do this for each separate grid job and each of them will have their own unique "copy" of the squashfs archive mounted.
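In a job script it is easy to skip the unmount when a command fails part-way. The sketch below, which assumes squashfuse is available on the compute node, always unmounts and passes the command's exit status back to the caller; run_on_archive is a hypothetical name.

```shell
# Hypothetical helper: mount an archive, run a command with the mount point
# appended as its last argument, then unmount even if the command fails.
run_on_archive() {
    local archive mnt status
    archive=$1
    shift
    if [ ! -f "$archive" ]; then
        echo "no such archive: $archive" >&2
        return 1
    fi
    mnt=$(mktemp -d) || return 1
    if ! squashfuse "$archive" "$mnt"; then
        rmdir "$mnt"
        return 1
    fi
    "$@" "$mnt"
    status=$?
    # umount is the command used on this page; on some systems a FUSE mount
    # is released with fusermount -u instead.
    umount "$mnt"
    rmdir "$mnt"
    return "$status"
}
```

For example, run_on_archive foo.squashfs ls -l lists the top level of the archive and leaves nothing mounted afterwards.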
Squashfs files can also be treated like normal archives (7z, zip): you can list their contents and extract files from them with the unsquashfs utility.
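As a brief illustration of unsquashfs, the sketch below wraps listing and extraction. squash_ls and squash_extract are hypothetical helper names; -l (list without extracting) and -d (destination directory) are unsquashfs options, and any further arguments name the paths inside the archive to extract.

```shell
# Hypothetical helper: list the contents of an archive without extracting it.
squash_ls() {
    unsquashfs -l "$1"
}

# Hypothetical helper: extract an archive, or only the given paths inside it,
# into a destination directory that must not exist yet.
squash_extract() {
    local archive dest
    if [ "$#" -lt 2 ]; then
        echo "usage: squash_extract ARCHIVE DEST [PATH...]" >&2
        return 2
    fi
    archive=$1
    dest=$2
    shift 2
    if [ -e "$dest" ]; then
        echo "refusing to extract into existing path: $dest" >&2
        return 1
    fi
    unsquashfs -d "$dest" "$archive" "$@"
}
```

For example, squash_extract foo.squashfs foo_copy extracts everything into a new directory foo_copy (a placeholder name).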