Where to store data¶
VSC-4 and VSC-5 provide two kinds of storage for data: a high-performance IBM Storage Scale parallel filesystem (also known as GPFS) and node-local flash storage. The parallel filesystem is the same on both clusters, so projects that exist on both clusters can access the same data.
They are accessible under:
- IBM Spectrum Scale, where XXXXX is the project number:
  $HOME expands to /home/fsXXXXX/username
  $DATA expands to /gpfs/data/fsXXXXX/username
- Local SSD:
  /local
Home Storage¶
$HOME is the location of the user's UNIX home directory. It resides entirely on NVMe disks.
Quota on $HOME is 100 GB and 10^6 files for the entire project. The storage size cannot be extended, but the file limit can be increased upon request.
Project Storage¶
$DATA is a tiered file system containing 500 TB of flash and around 5 PB of HDD storage. It stores up to 10% of the data and all metadata on NVMe disks and the rest on spinning disks. Frequently used files are automatically moved to the NVMe tier, while unused files are moved back to the HDD tier.
Quota on $DATA is 10 TB and 10^6 files for the project; the storage size can be extended up to 100 TB.
If even more storage is needed, specific arrangements can be made.
Access permissions
The files on $DATA are usually group-readable and group-writable so that project members can exchange data.
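If a directory under $DATA has ended up without group access, the standard `chmod` tool can restore it. The following is a minimal sketch; the function name `share_with_group` is hypothetical and not part of the cluster tooling.

```shell
# Hypothetical helper: make a directory tree readable and writable
# for the owning group.
# The capital X adds execute/search permission only to directories
# and to files that are already executable.
share_with_group() {
    chmod -R g+rwX "$1"
}
```

It would be called with the directory to share as its only argument, for example a subdirectory of $DATA.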
Check quota
You can check your current quota usage on each of the two storage systems with:
mmlsquota --block-size auto -j data_fs7XXXX data [or home_fs7XXXX home]
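The bracket notation above stands for two separate commands, one per fileset. As a sketch, the two command lines can be derived from the project directory name in a path such as $DATA. The helper names `project_of` and `quota_commands` are hypothetical, and the sketch assumes the fileset names follow the `data_fs...` / `home_fs...` pattern shown above. It only prints the commands and does not run them.

```shell
# Derive the project directory name (fsXXXXX) from a path such as
# /gpfs/data/fsXXXXX/username.
project_of() {
    basename "$(dirname "$1")"
}

# Print (not run) the two quota commands for the project that owns
# the given path. Assumes fileset names of the form data_<project>
# and home_<project>.
quota_commands() {
    local proj
    proj="$(project_of "$1")"
    printf 'mmlsquota --block-size auto -j data_%s data\n' "$proj"
    printf 'mmlsquota --block-size auto -j home_%s home\n' "$proj"
}

# Falls back to the placeholder path when $DATA is not set.
quota_commands "${DATA:-/gpfs/data/fsXXXXX/username}"
```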
For running but not for storing¶
/local¶
/local is a locally mounted NVMe disk. The size is about 450 GB on VSC-4 and 1.8 TB on VSC-5. Data retention is not guaranteed and all data is lost on a reboot, so anything you want to keep must be copied to permanent storage after use.
/tmp¶
/tmp can be used for I/O-intensive work. It can take up to half of a compute node's memory, because the data resides in the shared memory (RAM) of the node.
The data is deleted after the job, so move results to $DATA.
Warning
/local and /tmp might be useful for running jobs, but they are NOT for permanent storage.
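A common way to use /local or /tmp safely is to stage inputs there, run the job, and copy the results back to $DATA before the job ends. The sketch below illustrates that pattern. The function name `stage_and_run` and the `in/` and `out/` subdirectory layout are assumptions made for this example, not a site convention.

```shell
# Hypothetical helper: stage inputs to fast scratch storage, run a
# command there, then copy the results back and clean up.
# Usage: stage_and_run INPUT_DIR SCRATCH_BASE RESULT_DIR COMMAND [ARGS...]
stage_and_run() {
    local input_dir="$1"     # where the inputs live, e.g. under $DATA
    local scratch_base="$2"  # e.g. /local or /tmp
    local result_dir="$3"    # where results should end up, e.g. under $DATA
    shift 3                  # the remaining arguments are the command

    local work status=0
    work="$(mktemp -d "${scratch_base%/}/job.XXXXXX")" || return 1
    mkdir "$work/in" "$work/out"

    # Stage the inputs into the scratch directory.
    cp -r "$input_dir/." "$work/in/" || { rm -rf "$work"; return 1; }

    # Run the command inside the scratch directory; it is expected to
    # read from in/ and write to out/.
    ( cd "$work" && "$@" ) || status=$?

    # Copy results back even if the command failed, then remove scratch.
    mkdir -p "$result_dir" && cp -r "$work/out/." "$result_dir/" || status=1
    rm -rf "$work"
    return "$status"
}
```

In a job script this might be called as `stage_and_run "$DATA/myrun/input" /local "$DATA/myrun/results" ./my_program`, where the paths and program name are placeholders.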
Requesting extra storage¶
Submit your request via the project website.
What to store where¶
$HOME¶
User settings and various caches and config directories are placed here automatically. Additionally, $HOME can be used for custom configuration, code and scripts, custom software and environments (such as conda). Do NOT store any scientific or research data here; this includes original input data as well as final results. Such data grows too large very quickly, and heavy I/O to $HOME should also be avoided.
$DATA¶
This is the main project volume, so all scientific/research data must go there, including raw input data, final results and all types of intermediate data. Please remove any data (especially temporary or intermediate data) that is no longer in use - the system is not set up for long-term archiving.
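To find candidates for cleanup, files that have not been modified for some time can be listed with the standard `find` tool. This is a minimal sketch; the helper name `list_stale_files` and the 90-day default are arbitrary choices for illustration. It only lists files and deletes nothing.

```shell
# Hypothetical helper: list regular files under a directory that were
# last modified more than a given number of days ago.
list_stale_files() {
    local dir="$1"         # e.g. "$DATA" or a subdirectory of it
    local days="${2:-90}"  # age threshold in days; 90 is an arbitrary default
    find "$dir" -type f -mtime +"$days" -print
}
```

Review the resulting list before removing anything, since modification time alone does not show whether a file is still needed.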