
Wide-Area Coordinated Cluster System for Education and Research

Operation ended in August 2019. Please refer to the top page for information on the new system.

Test operation commenced on August 1, 2014.

Logging In

Log in to the development server using an SSH client such as Tera Term or PuTTY. Different hostnames are provided for internal users and external users.

The development server is a group of servers for developing and compiling programs. Do not perform large-scale calculations on it; run actual calculations on the cluster as jobs submitted via Torque.

Note that the development server is redundantly configured with two servers to distribute the load.

For Internal Users

The hostname of the development server is wdev.edu.tut.ac.jp. Enter the account information provided by the Information and Media Center in the user name and password fields.

$ ssh wdev.edu.tut.ac.jp

For External Users

The hostname of the development server is lark.imc.tut.ac.jp. Use an account issued through the account registration system (par followed by 7 digits) to log in to the server. External users are authenticated with public keys. Register your public key on the Change Profile page, and refer to the How to use SSH client page for the procedure for creating a key pair.

$ ssh lark.imc.tut.ac.jp
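If your private key is not stored in the default location, it can be passed explicitly with the -i option. In the example below, par1234567 is a hypothetical account name and the key path is only an assumption; substitute your own values.

$ ssh -i ~/.ssh/id_rsa par1234567@lark.imc.tut.ac.jp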

Queue Configuration

The following is the tentative configuration.

Queue name | No. of available nodes | Timeout/job | Max. processes/node | Max. memory/node | Notes
wSrchq     | 30 nodes               | 1 hour      | 20                  | 100GB            | -
wLrchq     | 30 nodes               | 336 hours   | 20                  | 100GB            | -
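As a concrete illustration of these limits, the following is a minimal, hypothetical Torque script that stays within the wSrchq limits (1 hour per job, at most 20 processes per node). The script name, resource request, and program name (a.out) are examples only.

sample_queue.sh

#!/bin/sh
#PBS -q wSrchq
#PBS -l nodes=1:ppn=20
#PBS -l walltime=01:00:00

cd $PBS_O_WORKDIR
./a.out

Submit the script with qsub sample_queue.sh and check its state with qstat.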

System Configuration

Hardware Configuration

Category | Host name | Model | CPU | Main memory capacity | Computing performance | Accelerator | OS
Development processing server | wdev | HA8000-tc/HT210 | Xeon E5-2680 v2 2.8GHz 10-core x 2 | 128GB | 448GFLOPS | Xeon Phi | RHEL6.4
Calculation node (Xeon Phi processors installed) | wsnd00〜wsnd15 | HA8000-tc/HT210 | Xeon E5-2680 v2 2.8GHz 10-core x 2 | 128GB | 448GFLOPS | Xeon Phi | RHEL6.4
Calculation node | wsnd16〜wsnd31 | HA8000-tc/HT210 | Xeon E5-2680 v2 2.8GHz 10-core x 2 | 128GB | 448GFLOPS | - | RHEL6.4

* It is not possible to execute jobs using wsnd17 and wsnd31.

File System Configuration

Area | Path | Notes
Home area | /home/numeric_characters/user_name/ | The same home area as the Windows terminals for education; it appears as the Z:\ drive on Windows devices.
Work area | /gpfs/work/user_name/ | Appears as the V:\ drive on the Windows terminals for education; it can also be accessed as /work/user_name/.
Software area | /common/ | -
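To check your own areas from the development server, something like the following should work; the exact home path depends on the numeric directory shown above, and $USER is assumed to expand to your login name.

% ls -ld $HOME               # home area
% ls -ld /gpfs/work/$USER    # work area
% ls -ld /work/$USER         # work area via the alternate path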

Compiler

Compiler | Version | Installed directory
Intel | 14.0.0 Build 20130728 | /common/intel-2013SP1/
PGI | 14.3-0 | /common/pgi-14.3/
GNU | 4.4.7 | /usr/bin/
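As a rough illustration, the usual C driver commands for the three compilers are shown below. Whether the Intel and PGI environments are already on your PATH, or first need to be set up from the installed directories above, depends on the login environment, so treat these as a sketch rather than a guaranteed invocation.

% icc  -O2 sample.c -o sample_intel     # Intel C compiler
% pgcc -O2 sample.c -o sample_pgi       # PGI C compiler
% gcc  -O2 sample.c -o sample_gnu       # GNU C compiler (4.4.7 in /usr/bin)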

Message Passing Interface (MPI)

Library | Version | Installed directory
Intel MPI | 14.0.0 Build 20130728 | /common/intel-2013SP1/
Open MPI | 1.6.5 | /common/openmpi-1.6.5/
MPICH 3 | 3.1 | /common/mpich-3.1/
MPICH 1 | 1.2.7p1 | /common/mpich-1.2.7p1/
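Each MPI library provides its own compiler wrapper and launcher. Assuming the Open MPI (or MPICH) installation above is first in your PATH, a program can be built and test-run as in the sketch below; the wrapper and launcher names are the standard ones, and sample_c_mpi.c is the MPI sample program mentioned later on this page.

% mpicc sample_c_mpi.c -o sample_c_mpi
% mpirun -np 4 ./sample_c_mpi

For production runs, submit the program through Torque as a job instead of running mpirun interactively on the development server.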

Software Configuration

Software name | Version | Description | Installed directory

Structural analysis
ANSYS Multiphysics | 14.5 | Multiphysics analysis tool | /common/ansys14.5/
ANSYS CFX | 14.5 | General-purpose thermal-hydraulics software | /common/ansys14.5/
ANSYS Fluent | 14.5 | General-purpose thermal-hydraulics software | /common/ansys14.5/
ANSYS LS-DYNA | 14.5 | Crash analysis tool | /common/ansys14.5/
ANSYS HFSS | 15.0.3 | High-frequency 3D electromagnetic field analysis software | /common/ansys_hfss-15.0.3/
ABAQUS | 6.12 | General-purpose non-linear finite element analysis program | /common/abaqus-6.12-3/
Patran | 2013 | Integrated CAE environment pre/post-processing software | /common/patran-2013/
DEFORM-3D | 10.2 | FEA-based 3D forming process simulation system | /common/deform-3d-10.2/
COMSOL | 4.4 | FEA-based general-purpose physics simulation system | /common/comsol44/

Computational Materials Science
PHASE (Serial version) | 2014.01 | First-principles pseudopotential calculation software (Serial version) | /common/phase0-2014.01-serial/
PHASE (Parallel version) | 2014.01 | First-principles pseudopotential calculation software (Parallel version) | /common/phase0-2014.01-parallel/
PHASE-Viewer | 3.2.0 | Integrated GUI environment software | /common/phase-viewer-v320/
UVSOR (Serial version) | 3.42 | First-principles pseudopotential dielectric-response analysis software (Serial version) | /common/uvsor-v342-serial/
UVSOR (Parallel version) | 3.42 | First-principles pseudopotential dielectric-response analysis software (Parallel version) | /common/uvsor-v342-parallel/
OpenMX (Serial version) | 3.7 | First-principles quantum simulator based on the relative quantum mechanics bottom-up causation theory (Serial version) | /common/openmx-3.7/
OpenMX (Parallel version) | 3.7 | First-principles quantum simulator based on the relative quantum mechanics bottom-up causation theory (Parallel version) | /common/openmx-3.7/

Computational chemistry
Gaussian | 09 Rev.C.01 | Electronic structure program | /common/gaussian09-C.01/
NWChem (Serial version) | 6.3.2 | Comprehensive and scalable open-source solution for large-scale molecular simulations (Serial version) | /common/nwchem-6.3.2-serial/
NWChem (Parallel version) | 6.3.2 | Comprehensive and scalable open-source solution for large-scale molecular simulations (Parallel version) | /common/nwchem-6.3.2-parallel/
GAMESS (Serial version) | 2013.R1 | General ab initio quantum chemistry package (Serial version) | /common/gamess-2013.r1-serial/
GAMESS (Parallel version) | 2013.R1 | General ab initio quantum chemistry package (Parallel version) | /common/gamess-2013.r1-parallel/
MPQC | 3.0-alpha | Massively Parallel Quantum Chemistry Program | /common/mpqc-3.0.0a-2014.03.20/
Amber (Serial version) | 12 | Molecular dynamics package (Serial version) | /common/amber12-serial/
Amber (Parallel version) | 12 | Molecular dynamics package (Parallel version) | /common/amber12-parallel/
AmberTools (Serial version) | 12 | Set of independently developed packages that work well by themselves and with Amber itself (Serial version) | /common/amber12-serial/AmberTools/
AmberTools (Parallel version) | 12 | Set of independently developed packages that work well by themselves and with Amber itself (Parallel version) | /common/amber12-parallel/AmberTools/
CONFLEX (Serial version) | 7 | General-purpose molecular dynamics computation software (Serial version) | /common/conflex7/
CONFLEX (Parallel version) | 7 | General-purpose molecular dynamics computation software (Parallel version) | /common/conflex7/
CHEMKIN-PRO | 15112 | Detailed chemical reaction analysis support software | /common/chemkin-15112/

Technical processing
MATLAB | R2013a | Numerical computing language | /common/matlab-R2013a/

・You must be a Type A user to use ANSYS, ABAQUS, Patran, DEFORM-3D, COMSOL, GAUSSIAN, CHEMKIN-PRO, and MATLAB.

・To apply for registration as a Type A user, see: http://imc.tut.ac.jp/en/research/form.

Using the Software

For instructions on using the software, see Using Cluster Systems.

Using the Xeon Phi Processor

Jobs that use the Xeon Phi processors can be executed in native mode or offload mode. In native mode, each Xeon Phi processor is used as a single calculation node, and MPI programs can be used as they are. In offload mode, specific sections of the source code are offloaded to the Xeon Phi processor and executed there, so the Xeon Phi processor can be used in much the same way as a GPGPU is used through OpenACC.

Native Execution

Sample Source Program

sample_phi.c

#include <stdio.h>
#include <unistd.h>   /* gethostname() */
#include <mpi.h>

int main(int argc, char **argv)
{
    int myid, nprocs;
    char hname[128] = "";

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    gethostname(hname, sizeof(hname));
    if (myid == 0)
        printf("NUMBER OF PROCESSES: %3d\n", nprocs);
    printf("HELLO WORLD! (HOST NAME: %10s, MYID: %3d)\n", hname, myid);

    MPI_Finalize();

    return 0;
}

* The source content is the same as the sample source program for MPI (sample_c_mpi.c). Only the file name has been changed.

Compiling

It is necessary to create an execution file for the Xeon CPU and another execution file for the Xeon Phi coprocessor. Use the Intel compiler.

Creating the execution file for the Xeon CPU

% mpiicc sample_phi.c -o sample_phi

Creating the execution file for the Xeon Phi coprocessor

% mpiicc -mmic sample_phi.c -o sample_phi.mic

Notes

Append ".mic" to the name of the execution file for the Xeon Phi coprocessor.

Apart from the ".mic" suffix, the names of the execution files for the Xeon CPU and the Xeon Phi coprocessor must be identical.

Sample Script to Submit Jobs

phi_native.sh

### sample

#!/bin/sh
#PBS -q wSrchq
#PBS -l nodes=3:ppn=2:Phi

MIC0_PROCS=3
MIC1_PROCS=1
source /common/torque/MIC/mkmachinefile.sh

cd $PBS_O_WORKDIR
mpirun -machinefile ${MACHINEFILE} -n ${PBS_NP} ./sample_phi
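Assuming the script above is saved as phi_native.sh, it is submitted through Torque in the usual way:

% qsub phi_native.sh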

Note on #PBS -q

A job can be submitted to either the wSrchq or the wLrchq queue.

Note on #PBS -l

Always specify :Phi in #PBS -l. This setting selects calculation nodes equipped with Xeon Phi processors.

Note on MIC0_PROCS and MIC1_PROCS

Two Xeon Phi processors are installed in each node of the system. MIC0_PROCS= and MIC1_PROCS= specify the number of processes to start on the respective Xeon Phi processors of each calculation node. In the script file shown above, three calculation nodes are used (nodes=3); on each node, one Xeon Phi processor starts 3 processes and the other starts 1 process. Each Xeon Phi processor has 60 cores/240 threads, so the values specified in MIC0_PROCS= and MIC1_PROCS= must be 240 or less.

Note on the number of parallel processes

The script file shown above starts 18 processes in total and executes them in parallel: 3 nodes × (2 host processes + 3 processes on Xeon Phi0 + 1 process on Xeon Phi1) = 18. Assume that wsnd00, wsnd02, and wsnd03 are selected by nodes=3, and call the two Xeon Phi processors on each calculation node Xeon Phi0 and Xeon Phi1. The 18 processes are then distributed as follows: 2 processes on wsnd00, 3 on Xeon Phi0 of wsnd00, 1 on Xeon Phi1 of wsnd00; 2 processes on wsnd02, 3 on Xeon Phi0 of wsnd02, 1 on Xeon Phi1 of wsnd02; and 2 processes on wsnd03, 3 on Xeon Phi0 of wsnd03, 1 on Xeon Phi1 of wsnd03.

Others

Change the following items as appropriate: the queue to which the job is submitted, the values of nodes=, ppn=, MIC0_PROCS=, and MIC1_PROCS=, and the execution file name (sample_phi in the above script file).

Offload Execution

Sample Source Program

tbo_sort.c

Location: /common/intel-2013SP1/composer_xe_2013_sp1.0.080/Samples/en_US/C++/mic_samples/LEO_tutorial/tbo_sort.c
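The referenced sample is not reproduced here. As a rough, hypothetical sketch of what offload-mode source code looks like (this is not the contents of tbo_sort.c), a code section can be marked for execution on the Xeon Phi coprocessor with the Intel offload pragma:

offload_sketch.c

#include <stdio.h>

#define N 1024

/* Hypothetical example: double each element of an array on the coprocessor. */
int main(void)
{
    static int a[N], b[N];
    int i;

    for (i = 0; i < N; i++)
        a[i] = i;

    /* The block below runs on the Xeon Phi; the arrays are copied
       in before and out after the offload region. */
    #pragma offload target(mic) in(a) out(b)
    {
        #pragma omp parallel for
        for (i = 0; i < N; i++)
            b[i] = 2 * a[i];
    }

    printf("b[%d] = %d\n", N - 1, b[N - 1]);
    return 0;
}

Such a file compiles with the same icc -openmp invocation shown in the Compiling step below.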

Compiling

When compiling, always specify the -openmp option, and use an Intel compiler.

% icc -openmp tbo_sort.c -o tbo_sort

Sample Script to Submit Jobs

phi_offload.sh

### sample

#!/bin/sh
#PBS -q wSrchq
#PBS -l nodes=1:ppn=20:Phi

cd $PBS_O_WORKDIR
./tbo_sort

Note on #PBS -q

A job can be submitted to either the wSrchq or the wLrchq queue.

Note on #PBS -l

There is no need to change #PBS -l nodes=1:ppn=20:Phi. This line requests exclusive use of a single calculation node equipped with Xeon Phi processors.

Others

Change the queue to which the job is submitted and the execution file name (tbo_sort in the above script file) as appropriate.

Limitations

● Specify the IP address of the Xeon Phi when using mpirun -machinefile. Normally, the value for -machinefile is generated automatically by mkmachinefile.sh.

● Only TCP is available for MPI communication. TCP is used by default in mkmachinefile.sh.

● Native execution is available on wsnd00 to wsnd09, wsnd14, and wsnd15 as of October 2015. The job scheduler chooses the nodes automatically unless they are specified explicitly.