How to run WRF requiring large memory

Postby bruceyoung01 » Thu May 25, 2017 4:51 pm

Hi all,

I am running WRF with a domain setup that requires a lot of memory: the horizontal grid is about 300 x 250 points, with 50 vertical layers. I submit the job to our campus supercomputer to run in parallel, but every time it is killed with an error about exceeding the memory limit.
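
For scale, a single 3D single-precision field on the full domain is only about 14 MB, so I assume the footprint comes from the hundreds of such fields (plus halos and I/O buffers) that WRF keeps around; this is just a rough back-of-envelope, not an exact accounting:

# size of one 3D real*4 field: 300 x 250 x 50 points x 4 bytes each
echo $(( 300 * 250 * 50 * 4 / 1024 / 1024 ))   # ~14 MB per field

Here is an example of the error log: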

starting wrf task 0 of 1
starting wrf task 0 of 1
starting wrf task 0 of 1
slurmstepd: Step 5607917.0 exceeded virtual memory limit (72054376 > 70963200), being killed
slurmstepd: *** STEP 5607917.0 ON n64 CANCELLED AT 2017-05-25T15:14:47 ***
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: n65: tasks 15-29: Killed
srun: Terminating job step 5607917.0
srun: error: n64: tasks 0-14: Killed
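
If I read the slurmstepd message correctly, the numbers are in KB: the 70963200 KB limit is exactly my --mem=63000 request (in MB, see the script below) converted to KB and scaled by 110%, which I presume is the VSizeFactor configured on our cluster:

# 63000 MB --mem request, converted to KB and scaled by a presumed VSizeFactor of 110%
echo $(( 63000 * 1024 * 110 / 100 ))   # prints 70963200, the limit from the log

So the step is killed for exceeding a virtual memory cap derived from my --mem request, not for exhausting the physical memory of a node.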


I use SLURM to submit the job; here is my submission script.

#!/bin/bash
#SBATCH --job-name=wrf_long
#SBATCH --output=slurm.out_wrf_long
#SBATCH --error=slurm.err_wrf_long
#SBATCH --partition=batch
#SBATCH --qos=long_contrib
#SBATCH --time=120:00:00
#SBATCH --constraint=hpcf2013
##SBATCH --mem=max
#SBATCH --mem=63000
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=15
#CHANGE TO JOB DIRECTORY
#RUN PROGRAM
srun ./wrf.exe >& wrf.log
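
For reference, this requests 8 x 15 = 120 MPI tasks, and all 15 tasks on a node share that node's 63000 MB --mem cap, so each task gets only about 4.2 GB:

# per-node memory cap divided among the tasks placed on that node
echo $(( 63000 / 15 ))   # = 4200 MB per MPI task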


I am not sure how to reduce the memory required on each node, or how to split the whole job into several smaller pieces so that each piece needs less memory.
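
Is something like the following the right direction? The idea would be to spread the run over more nodes with fewer tasks per node, so each task gets a larger share of the per-node memory cap (the node count here is just a guess at what our partition allows):

#SBATCH --nodes=16
#SBATCH --ntasks-per-node=8   # 16 x 8 = 128 tasks; the 63000 MB cap now covers 8 tasks (~7.9 GB each) instead of 15
##SBATCH --mem-per-cpu=7800   # alternative: request memory per task instead of per node

Or should I keep the task layout and instead ask for a larger --mem per node?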

Do you have any ideas on this? Thank you very much.

Best,
Zhifeng
