TUT HPC Cluster Wiki

Cluster System Usage Tips

Share any tips you have discovered for using the Cluster System. You can edit this article after logging in.

Log of the information sharing mailing list for research users

The log of the information sharing mailing list for research users is available at: http://lists.imc.tut.ac.jp/pipermail/research-users/

Measuring resource usage (e.g. memory usage)

Resources (e.g. memory) used by a process can be measured with the GNU time command (/usr/bin/time). The -v option prints a detailed resource report for the process, as shown below (see the time command man page).

-bash-4.1$ /usr/bin/time --version
GNU time 1.7

-bash-4.1$ /usr/bin/time -v whoami
my016
        Command being timed: "whoami"
        User time (seconds): 0.00
        System time (seconds): 0.00
        Percent of CPU this job got: 0%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 3088
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 243
        Voluntary context switches: 3
        Involuntary context switches: 1
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Maximum resident set size shows the peak memory usage. Note that GNU time version 1.7 reports this figure as 4 times the actual memory usage, which is a known bug (see Massive bug in GNU time (Japanese)).

You can also run the GNU time command on a calculation node. Note, however, that its report is written to standard error, not standard output.
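Because the report goes to standard error, it is often easier to save it to a file and pull out the one figure of interest. The following is a minimal sketch, not a site-provided tool: max_rss_kb is a hypothetical helper name, time.log is a hypothetical file name, and the divisor of 4 simply follows the GNU time 1.7 bug described above.

```shell
#!/bin/bash
# Extract "Maximum resident set size" (kbytes) from a GNU time -v report
# read on standard input. The optional first argument is a divisor:
# pass 4 to compensate for the GNU time 1.7 bug (default: 1, no correction).
max_rss_kb() {
    awk -v div="${1:-1}" -F': ' \
        '/Maximum resident set size/ { print int($2 / div) }'
}

# Typical use (GNU time writes its report to the file given with -o):
#   /usr/bin/time -v -o time.log perl i_love_cats.pl
#   max_rss_kb 4 < time.log
```

For the whoami report shown above, max_rss_kb 4 would print 772, that is, 3088 divided by 4.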

Batch Deletion of Submitted Jobs

End users are not permitted to use the -all option of the qdel command, and qdel -t does not function correctly. For this reason, use the following script to delete multiple submitted jobs at once.

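#!/bin/bash
# Delete submitted jobs in bulk: either all of them or a range of job IDs.

if [ $# -ne 1 ] && [ $# -ne 2 ]; then
    echo "Usage: bash $0 [ 'all' | firstId lastId | lastId ]"
    exit 1
fi

# Is the first argument a numerical value?
if expr "$1" : '[0-9][0-9]*$' > /dev/null ; then
    # With a single argument, seq counts from 1 up to lastId.
    ids=$(seq "$1" $2)
elif [ "$1" = all ]; then
    # Keep the number before the first dot and drop the two qstat header lines.
    ids=$(qstat | cut -d . -f 1 | tail -n +3)
else
    # Refuse anything else so that a typo does not delete every job.
    echo "Unrecognized argument: $1"
    exit 1
fi

echo qdel $ids
qdel $ids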

Specifying all as the first argument deletes all submitted jobs, regardless of whether they are running or still queued. To delete a range of jobs, specify the first job ID as the first argument and the last job ID as the second argument. If the script is saved as qdel.sh, run it as follows:

-bash-4.1$ bash qdel.sh all

-bash-4.1$ bash qdel.sh 100 200
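The range form passes every number between firstId and lastId to qdel, including IDs that do not correspond to any of your jobs. One possible refinement is to intersect the range with the IDs that qstat actually lists. The sketch below assumes that qstat prints two header lines followed by rows beginning with the job ID and a dot, as the script above also assumes; job_ids_from_qstat and ids_in_range are hypothetical helper names.

```shell
#!/bin/bash
# Print the numeric job IDs found in qstat output read on standard input.
# Assumes two header lines, then rows that start with "<id>.<server>".
job_ids_from_qstat() {
    tail -n +3 | cut -d . -f 1
}

# Keep only the IDs within [first, last], one per line.
ids_in_range() {
    awk -v lo="$1" -v hi="$2" '$1 + 0 >= lo && $1 + 0 <= hi { print $1 }'
}

# Typical use on the cluster:
#   qdel $(qstat | job_ids_from_qstat | ids_in_range 100 200)
```

Both helpers read standard input, so they can be checked against saved qstat output before being combined with qdel.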

Script to Submit Specified Command as a Job

The -v option of the qsub command passes environment variables to a job script. The following script uses this feature to run an arbitrary command as a job on a calculation node. The script merges standard error into standard output (-j oe) and logs the start and end times and the node on which the job ran (date and hostname).

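#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -q wLrchq
#PBS -j oe

# Move to the directory from which the job was submitted.
cd "$PBS_O_WORKDIR"

date
hostname
# Log the command, then run it; eval lets JOB_CMD contain pipes and redirections.
echo "$JOB_CMD"
eval "$JOB_CMD"
date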

Save the script as qsub.sh and pass the command to run as the value of JOB_CMD. Running qsub as follows submits the command as a job.

-bash-4.1$ qsub -v JOB_CMD="/usr/bin/time -v perl i_love_cats.pl" qsub.sh

-bash-4.1$ qsub -v JOB_CMD="perl catching_cats.pl | perl counting_cats.pl" qsub.sh
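The second example works only because the job script runs the command through eval. Without eval, the shell would pass the pipe character to the first command as a literal argument instead of building a pipeline. The following sketch reproduces the last steps of the job script locally, without qsub; run_job_cmd is a hypothetical name used only for this illustration.

```shell
#!/bin/bash
# Local stand-in for the job script body: log the command string, then
# let the shell re-parse it with eval so that pipes and redirections work.
run_job_cmd() {
    printf '%s\n' "$1"
    eval "$1"
}

# Prints the command string on the first line, then 2
# (the number of input lines that contain "cat").
run_job_cmd 'printf "cat\ncat\ndog\n" | grep -c cat'
```

Trying a command string this way before submission is a quick check that the quoting survives.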

List of All Users' Jobs in the Queue

The /usr/local/maui/bin/showq command lists all jobs in the queue, including those of other users.

Displaying Job Status Repeatedly

The watch command is useful for monitoring the execution status of submitted jobs. The examples below run qstat every 5 seconds and display the latest status each time. The -d option highlights the differences from the previous output. Note that watch cannot run a shell alias; specify the actual command.

-bash-4.1$ watch -n 5 qstat -a

-bash-4.1$ watch -n 5 qstat -Q

-bash-4.1$ watch -n 5 -d qstat -Q

Limitation by ulimit -t (the CPU time)

The development node limits the CPU time of each process, as shown by the ulimit -t command. This limit appears to count CPU time that the time command does not report (hidden values). If a job aborts for no apparent reason, this limit could be the cause. For example, synchronizing a large volume of data with the rsync command produces an error such as the following.

-bash-4.1$ rsync --progress -avh /tmp/source /destination/
sending incremental file list

...

rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: connection unexpectedly closed (96 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [sender=3.0.6]

The CPU time consumed by rsync appears to depend on the volume of data transferred, regardless of whether --bwlimit is specified. Because remote synchronization encrypts the transfer with ssh, it consumes approximately twice as much CPU time as local synchronization.
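To check whether this limit could be the cause, ulimit -t prints the current value in seconds (or unlimited). The sketch below also reproduces the failure in a harmless way: a subshell lowers its own soft limit and runs a busy loop until the system terminates it. run_with_cpu_limit and busy_loop are hypothetical helper names, and the 1-second limit is chosen only to keep the demonstration short.

```shell
#!/bin/bash
# Show the current soft CPU-time limit in seconds (or "unlimited").
ulimit -S -t

# Run a command in a subshell with its own soft CPU-time limit (seconds).
# The limit applies only inside the subshell, not to the calling shell.
run_with_cpu_limit() {
    ( ulimit -S -t "$1"; shift; "$@" )
}

# Consume CPU time until terminated.
busy_loop() {
    while :; do :; done
}

# Typical use: the subshell is terminated by a signal, so the exit
# status is greater than 128.
#   run_with_cpu_limit 1 busy_loop; echo "exit status: $?"
```

An exit status above 128 means that the command was terminated by a signal, which is how a process that exceeds its CPU-time limit ends.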

Job Scheduling Order

The Wide-Area Coordinated Cluster System for Education and Research

The job scheduler allocates jobs to the calculation nodes in descending order of host number: wsnd30, wsnd29, ..., wsnd00.