Using SQUASHFS archives

chrisadamsonmcri edited this page Feb 4, 2026 · 1 revision

Squashfs is a compressed archive format that can be mounted for immediate, read-only access to the files it contains.

On MASSIVE, this is a way to reduce file counts and file sizes, and potentially to make reading files, particularly many small files, much more efficient.

This page is a tutorial on how to create squashfs archives and mount them.

Creating archives

Squashfs archives are created with the mksquashfs utility; see man mksquashfs for a full synopsis. The basic command is

mksquashfs [files/folders] <output>.squashfs [options]

We will deal with the case of making an archive of a single directory and all its contents. Let the directory be called foo. We will make an archive called foo.squashfs which will contain its contents.

mksquashfs foo foo.squashfs -no-xattrs

The -no-xattrs option disables storing extended attributes, which avoids a large number of warnings. By default mksquashfs compresses with gzip, which we don't want, so I recommend the following options for different types of data:

Moderate to highly compressible data

mksquashfs foo foo.squashfs -no-xattrs -comp zstd -Xcompression-level 19 -processors `nproc`

or

mksquashfs foo foo.squashfs -no-xattrs -comp xz -processors `nproc`

Already compressed data

mksquashfs foo foo.squashfs -no-xattrs -comp zstd -Xcompression-level 1 -processors `nproc`

These are the fastest compression settings; there is no point working hard to squeeze further compression out of data that is already compressed.
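The recommendations above can be wrapped in a small helper so that the right options are chosen by name. This is only a minimal sketch: the function names squash_opts and squash_dir and the profile names compressible, xz and precompressed are made up for illustration; only the mksquashfs options themselves come from the commands above.

```shell
# Hypothetical helper: print the mksquashfs options for a named data profile.
squash_opts() {
    case "$1" in
        compressible)  echo "-no-xattrs -comp zstd -Xcompression-level 19" ;;
        xz)            echo "-no-xattrs -comp xz" ;;
        precompressed) echo "-no-xattrs -comp zstd -Xcompression-level 1" ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: archive directory $1 into $1.squashfs using profile $2.
squash_dir() {
    local dir opts
    dir=${1%/}                          # drop a trailing slash, if any
    opts=$(squash_opts "$2") || return 1
    # $opts is intentionally unquoted so it splits into separate options.
    mksquashfs "$dir" "$dir.squashfs" $opts -processors "$(nproc)"
}
```

With these defined, squash_dir foo compressible would create foo.squashfs using zstd at level 19, and squash_dir foo precompressed would use the fast settings.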

Mounting squashfs archives

In Linux/UNIX, to mount a filesystem or archive is to make its contents available within a directory. We can mount squashfs archives to get read-only access to their contents. To mount a squashfs archive, use the squashfuse utility.

Firstly, you need to create an empty directory to act as the mount point. Unfortunately, Lustre doesn't allow its directories to be used as mount points, so we need to use the system /tmp directory.

Either make a fixed directory and mount

mkdir /tmp/foo
squashfuse foo.squashfs /tmp/foo

or, better, a generated temporary directory

D=`mktemp -d`
squashfuse foo.squashfs $D

The second option is better because mktemp -d always creates a new, uniquely named directory, so it cannot clash with a directory that already exists.

The directory contents will be accessible at /tmp/foo or the temporary directory $D.

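You can make a symbolic link in your current directory that points to the mount point, so the contents appear at their original location. This assumes the original foo directory has been moved or removed; if it still exists, ln creates the link inside it instead. For example: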

ln -sf /tmp/foo foo

When you have finished, make sure you unmount, i.e.

umount /tmp/foo

or

umount $D

Bear in mind that /tmp is local to the node you are working on, so the symbolic link trick only works if everything runs on a single node. If you are running grid jobs, use the $D approach in your scripts, i.e.

D=`mktemp -d`
squashfuse foo.squashfs $D

cmds $D/filenames...

umount $D

You can do this for each separate grid job and each of them will have their own unique "copy" of the squashfs archive mounted.
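For grid jobs, the mount, run, unmount sequence can be wrapped in a function that also cleans up when the command fails. The following is a minimal sketch, assuming squashfuse and umount behave as described above; with_squashfs is a hypothetical name, not part of any tool.

```shell
# Hypothetical helper: mount archive $1 on a fresh temporary directory,
# run the remaining arguments as a command with the mount point appended
# as its last argument, then unmount and remove the directory.
with_squashfs() {
    local archive mnt status
    archive=$1
    shift
    mnt=$(mktemp -d) || return 1
    if ! squashfuse "$archive" "$mnt"; then
        rmdir "$mnt"                    # mount failed: remove the empty directory
        return 1
    fi
    status=0
    "$@" "$mnt" || status=$?            # remember the command's exit status
    umount "$mnt"
    rmdir "$mnt"
    return "$status"
}
```

For example, with_squashfs foo.squashfs ls -l would list the top level of the archive and then unmount it again. The function returns the exit status of the command it ran, so a job script can still detect failures.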

Extracting/listing files

Squashfs files can also be treated like ordinary archives (7z, zip): you can list their contents and extract files from them without mounting, using the unsquashfs utility.
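Two common unsquashfs invocations are sketched below. The wrapper names sq_list and sq_extract are hypothetical; the -l option lists the archive contents and -d sets the destination directory. Check man unsquashfs on your system for the full set of options.

```shell
# List every file and directory in archive $1 without extracting anything.
sq_list() {
    unsquashfs -l "$1"
}

# Extract archive $1 into directory $2. Any further arguments are paths
# inside the archive and limit the extraction to those entries.
sq_extract() {
    local archive dest
    archive=$1
    dest=$2
    shift 2
    unsquashfs -d "$dest" "$archive" "$@"
}
```

For example, sq_list foo.squashfs prints the contents, and sq_extract foo.squashfs foo_restored unpacks the whole archive into a directory called foo_restored.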
