[molpro-user] Re: performance on Sun ultraSparc, solaris 9: disk i/o bottleneck?
Dr Seth OLSEN
s.olsen1 at uq.edu.au
Tue Mar 8 06:05:27 GMT 2005
Hi Molpro-Users,
I have increased shmmax on my ultraSPARC smp (Solaris 9, Sun Studio 8, 8cpus) to 256MB, and raised semmni and semmns to 128. I am still seeing low CPU usage. The top section of a prstat gives:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
1071 seth 4377M 3859M sleep 60 0 0:14:43 7.6% molprop_2002_6_/1
1073 seth 4377M 3859M cpu3 22 0 0:14:04 3.7% molprop_2002_6_/1
1070 seth 4377M 3864M cpu2 13 0 0:14:04 3.5% molprop_2002_6_/1
1072 seth 4377M 3859M sleep 60 0 0:13:58 2.1% molprop_2002_6_/1
1229 root 4536K 2704K sleep 59 0 0:00:00 0.0% sshd/1
1236 seth 4608K 4416K cpu5 49 0 0:00:00 0.0% prstat/1
1231 seth 2576K 1848K sleep 49 0 0:00:00 0.0% bash/1
263 root 2944K 2248K sleep 59 0 0:00:00 0.0% nscd/22
60 root 7528K 6736K sleep 59 0 0:00:04 0.0% picld/12
Here one can see the molpro processes at the top, each using only a small percentage of CPU.
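For reference, I made these changes in /etc/system (a sketch, assuming the standard Solaris 9 tunable names for these limits; a reboot is needed for the settings to take effect):

```
* /etc/system fragment (Solaris 9 tunable names; comments start with '*')
set shmsys:shminfo_shmmax=268435456   * 256 MB max shared-memory segment
set semsys:seminfo_semmni=128         * max number of semaphore identifiers
set semsys:seminfo_semmns=128         * max number of semaphores system-wide
```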
vmstat 3 3 gives the following output:
r b w swap free re mf pi po fr de sr m0 m1 m4 m5 in sy cs us sy id
0 0 0 18551736 20035272 1495 377 1146 3 2 0 7 1 0 0 0 587 11415 510 19 3 78
0 0 0 12820880 14982248 11956 3 14521 0 0 0 0 1 0 0 1 1185 537 448 11 10 80
0 0 0 12820880 15002072 13217 92 24173 8 8 0 0 0 0 0 1 1181 5439 401 14 11 75
According to what I have read, this may indicate the beginning of a bottleneck, because 'sy' is a large fraction of 'us'. mpstat gives:
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 71 0 14981 7 1 80 30 0 50 0 2340 31 5 5 58
1 45 1 13513 179 3 74 25 0 40 0 2430 31 3 7 60
2 33 0 12438 6 1 82 29 0 31 0 1877 26 2 5 66
3 42 1 11727 6 2 81 30 0 36 0 1622 23 3 5 70
4 46 0 10309 6 1 58 11 0 37 0 1401 19 3 5 73
5 68 0 8761 5 1 66 20 1 36 0 1035 11 3 7 79
6 41 0 8748 17 1 50 4 1 29 0 445 6 3 4 87
7 29 0 1501 367 264 22 6 0 11 0 204 3 1 1 95
which shows that most of the CPUs are spending much of their time idle. At this point I am tempted to conclude that an I/O bottleneck is occurring; however, I note that the data above have been invariant to the number of filesystems I dedicated to temporary files (up to 3 filesystems on 3 separate disks). The results of iostat -cxm 3 3 are:
extended device statistics cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b us sy wt id
md0 0.6 0.2 5.8 0.2 0.0 0.0 19.8 0 0 19 3 5 73
md1 0.0 0.0 0.2 0.2 0.0 0.0 15.3 0 0
md4 0.0 0.0 0.0 0.0 0.0 0.0 15.0 0 0
md5 0.0 0.2 0.0 0.4 0.0 0.0 19.8 0 0
md6 0.0 0.0 0.0 0.1 0.0 0.0 18.6 0 0
md10 0.3 0.2 2.7 0.2 0.0 0.0 19.5 0 0
md11 0.0 0.0 0.1 0.1 0.0 0.0 16.4 0 0
md14 0.0 0.0 0.0 0.0 0.0 0.0 12.0 0 0
md15 0.0 0.2 0.0 0.4 0.0 0.0 9.7 0 0
md16 0.0 0.0 0.0 0.1 0.0 0.0 12.1 0 0
md20 0.3 0.2 3.1 0.2 0.0 0.0 19.2 0 0
md21 0.0 0.0 0.1 0.1 0.0 0.0 13.4 0 0
md24 0.0 0.0 0.0 0.0 0.0 0.0 18.5 0 0
md25 0.0 0.2 0.0 0.4 0.0 0.0 9.9 0 0
md26 0.0 0.0 0.0 0.1 0.0 0.0 12.3 0 0
sd6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd0 20.7 55.9 1179.1 9605.6 8.8 27.5 474.3 6 33
ssd1 0.4 0.7 3.2 1.0 0.0 0.0 15.2 0 1
ssd2 1.9 1.3 146.7 822.3 0.0 0.3 92.5 0 3
ssd3 0.2 3.0 1.8 1088.5 0.0 1.0 312.3 0 3
ssd4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd5 0.4 0.7 2.9 1.0 0.0 0.0 15.3 0 1
nfs1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
extended device statistics cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b us sy wt id
md0 0.0 0.3 0.0 0.5 0.0 0.0 39.4 1 1 5 10 30 56
md1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md5 0.0 0.3 0.0 0.2 0.0 0.0 35.0 1 0
md6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md10 0.0 0.3 0.0 0.5 0.0 0.0 20.8 0 1
md11 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md14 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md15 0.0 0.3 0.0 0.2 0.0 0.0 9.2 0 0
md16 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md20 0.0 0.3 0.0 0.5 0.0 0.0 16.8 0 1
md21 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md24 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md25 0.0 0.3 0.0 0.2 0.0 0.0 14.2 0 0
md26 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd0 93.2 0.3 14997.2 0.2 0.0 3.5 37.5 0 60
ssd1 0.0 1.3 0.0 1.0 0.0 0.0 17.5 0 2
ssd2 26.6 32.6 213.1 27450.2 0.0 1.8 30.0 0 92
ssd3 3.7 80.9 29.3 37813.5 0.0 26.5 313.5 0 100
ssd4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd5 0.0 1.3 0.0 1.0 0.0 0.0 15.2 0 2
nfs1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
extended device statistics cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b us sy wt id
md0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 4 8 27 62
md1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md10 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md11 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md14 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md15 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md16 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md20 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md21 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md24 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md25 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md26 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd0 100.0 0.0 15909.1 0.0 0.0 5.6 56.4 0 61
ssd1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd2 28.0 33.7 224.0 28500.9 0.0 1.7 28.2 0 91
ssd3 10.0 73.0 80.0 34224.1 0.0 25.9 311.5 0 99
ssd4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
nfs1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
Now, it would seem that the devices named ssd0 and ssd3 (and also ssd2) are quite bottlenecked in terms of their service time and their percent of time busy. Unfortunately, I don't know how to verify that these correspond to the scratch filesystems (I suspect they do), because I can't find the mapping between the device names reported by iostat and the device files listed in /etc/mnttab. Given this data, can someone on the list confirm my suspicion that there is a disk bottleneck on these systems and, if possible, recommend a way to find out which iostat device names correspond to which device files on my system? I would be very grateful for any suggestions.
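In case it is useful to others doing the same triage, here is a rough sketch (plain Python, written for this post; the function name and thresholds are my own invention, not part of MOLPRO or Solaris) that scans iostat -xm device lines and flags any device whose service time or percent-busy looks high. The sample lines are copied from the third interval above:

```python
# Sketch: flag busy devices in `iostat -xm` output.
# Thresholds are illustrative, not from any official tool.
def flag_busy(iostat_lines, svc_t_max=100.0, busy_max=50.0):
    """Return device names whose average service time (svc_t, ms)
    or percent-busy (%b) exceeds the given thresholds."""
    flagged = []
    for line in iostat_lines:
        fields = line.split()
        # Expected columns: device r/s w/s kr/s kw/s wait actv svc_t %w %b [cpu]
        if len(fields) < 10:
            continue
        try:
            svc_t = float(fields[7])
            busy = float(fields[9])
        except ValueError:
            continue  # skip header or other non-data lines
        if svc_t > svc_t_max or busy > busy_max:
            flagged.append(fields[0])
    return flagged

# Sample lines copied from the third iostat interval above:
sample = [
    "ssd0 100.0 0.0 15909.1 0.0 0.0 5.6 56.4 0 61",
    "ssd2 28.0 33.7 224.0 28500.9 0.0 1.7 28.2 0 91",
    "ssd5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0",
]
print(flag_busy(sample))  # → ['ssd0', 'ssd2']
```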
Cheers,
Seth
ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms
Dr Seth Olsen, PhD
Postdoctoral Fellow, Computational Systems Biology Group
Centre for Computational Molecular Science
Chemistry Building,
The University of Queensland
Qld 4072, Brisbane, Australia
tel (617) 33653732
fax (617) 33654623
email: s.olsen1 at uq.edu.au
Web: www.ccms.uq.edu.au
ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms
----- Original Message -----
From: Ross Nobes <Ross.Nobes at uk.fujitsu.com>
Date: Monday, March 7, 2005 11:11 pm
Subject: RE: [molpro-user] performance on Sun ultraSparc, solaris 9
> Hi Seth,
>
> Are you running the serial version or a parallel version of MOLPRO
> please? If it is the former, what %CPU values are you seeing from
> prstat, keeping in mind that the most you would ever see on an 8-CPU
> machine is 12.5%?
>
> Best wishes,
> Ross
> ---
> Ross Nobes
> Manager, Physical and Life Sciences Research Group
> Fujitsu Laboratories of Europe
> Hayes Park Central, Hayes End Road
> Hayes, Middlesex UB4 8FE, UK
> Phone +44 (0) 77 7195 6113
> Fax +44 (0) 20 8606 4539
> E-mail Ross.Nobes at uk.fujitsu.com
>
> > -----Original Message-----
> > From: owner-molpro-user at molpro.chem.cf.ac.uk [owner-molpro-
> > user at molpro.chem.cf.ac.uk] On Behalf Of Dr Seth OLSEN
> > Sent: 07 March 2005 01:54
> > To: molpro-user at molpro.net
> > Subject: [molpro-user] performance on Sun ultraSparc, solaris 9
> >
> >
> > Hi Molpro-users,
> >
> > I'm running MolPro on an 8-cpu Sun UltraSPARC SMP box (running
> > Solaris 9) with 30G RAM. I have been trying some test jobs and it
> > seems that MolPro is not making good use of the available resources,
> > in the sense that I see low %CPU values when I check prstat. Are
> > there system parameters that should be tweaked when running MolPro
> > on this system? I have already reset shmmax to 4GB. Is this
> > recommended? Is there a better value for shmmax? I know that the low
> > CPU utilization can't be due to excessive disk i/o, because I see
> > this phenomenon even if I run under GDIRECT. Are there any insights
> > available as to how I can get the most out of MolPro on this system?
> >
> > Cheers,
> >
> > Seth
> >
> >
> > ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms
> >
> > Dr Seth Olsen, PhD
> > Postdoctoral Fellow, Computational Systems Biology Group
> > Centre for Computational Molecular Science
> > Chemistry Building,
> > The University of Queensland
> > Qld 4072, Brisbane, Australia
> >
> > tel (617) 33653732
> > fax (617) 33654623
> > email: s.olsen1 at uq.edu.au
> > Web: www.ccms.uq.edu.au
> >
> > ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms
> >
> >
> >
>
>