[molpro-user] Re: performance on Sun ultraSparc, solaris 9: disk i/o bottleneck?
Dr Seth OLSEN
s.olsen1 at uq.edu.au
Tue Mar 8 06:05:27 GMT 2005
Hi Molpro-Users,
I have increased shmmax on my ultraSPARC smp (Solaris 9, Sun Studio 8, 8cpus) to 256MB, and raised semmni and semmns to 128. I am still seeing low CPU usage. The top section of a prstat gives:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
1071 seth 4377M 3859M sleep 60 0 0:14:43 7.6% molprop_2002_6_/1
1073 seth 4377M 3859M cpu3 22 0 0:14:04 3.7% molprop_2002_6_/1
1070 seth 4377M 3864M cpu2 13 0 0:14:04 3.5% molprop_2002_6_/1
1072 seth 4377M 3859M sleep 60 0 0:13:58 2.1% molprop_2002_6_/1
1229 root 4536K 2704K sleep 59 0 0:00:00 0.0% sshd/1
1236 seth 4608K 4416K cpu5 49 0 0:00:00 0.0% prstat/1
1231 seth 2576K 1848K sleep 49 0 0:00:00 0.0% bash/1
263 root 2944K 2248K sleep 59 0 0:00:00 0.0% nscd/22
60 root 7528K 6736K sleep 59 0 0:00:04 0.0% picld/12
Here one can see the molpro processes at the top, each using only a small percentage of CPU.
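For reference, I made these changes in /etc/system (a sketch, assuming the standard Solaris 9 tunable names for these limits; a reboot is needed for the settings to take effect):

```
* /etc/system fragment (Solaris 9 tunable names; comments start with '*')
set shmsys:shminfo_shmmax=268435456   * 256 MB max shared-memory segment
set semsys:seminfo_semmni=128         * max number of semaphore identifiers
set semsys:seminfo_semmns=128         * max number of semaphores system-wide
```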
vmstat 3 3 gives the following output:
r b w swap free re mf pi po fr de sr m0 m1 m4 m5 in sy cs us sy id
0 0 0 18551736 20035272 1495 377 1146 3 2 0 7 1 0 0 0 587 11415 510 19 3 78
0 0 0 12820880 14982248 11956 3 14521 0 0 0 0 1 0 0 1 1185 537 448 11 10 80
0 0 0 12820880 15002072 13217 92 24173 8 8 0 0 0 0 0 1 1181 5439 401 14 11 75
According to what I have read, this may indicate the beginning of a bottleneck, because 'sy' is a large fraction of 'us'. mpstat gives:
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 71 0 14981 7 1 80 30 0 50 0 2340 31 5 5 58
1 45 1 13513 179 3 74 25 0 40 0 2430 31 3 7 60
2 33 0 12438 6 1 82 29 0 31 0 1877 26 2 5 66
3 42 1 11727 6 2 81 30 0 36 0 1622 23 3 5 70
4 46 0 10309 6 1 58 11 0 37 0 1401 19 3 5 73
5 68 0 8761 5 1 66 20 1 36 0 1035 11 3 7 79
6 41 0 8748 17 1 50 4 1 29 0 445 6 3 4 87
7 29 0 1501 367 264 22 6 0 11 0 204 3 1 1 95
which shows that most of the CPUs are spending much of their time idle. At this point I am tempted to conclude that an I/O bottleneck is occurring; however, I note that the data above have been invariant to the number of filesystems I dedicated to temporary files (up to 3 filesystems on 3 separate disks). The results of iostat -cxm 3 3 are:
extended device statistics cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b us sy wt id
md0 0.6 0.2 5.8 0.2 0.0 0.0 19.8 0 0 19 3 5 73
md1 0.0 0.0 0.2 0.2 0.0 0.0 15.3 0 0
md4 0.0 0.0 0.0 0.0 0.0 0.0 15.0 0 0
md5 0.0 0.2 0.0 0.4 0.0 0.0 19.8 0 0
md6 0.0 0.0 0.0 0.1 0.0 0.0 18.6 0 0
md10 0.3 0.2 2.7 0.2 0.0 0.0 19.5 0 0
md11 0.0 0.0 0.1 0.1 0.0 0.0 16.4 0 0
md14 0.0 0.0 0.0 0.0 0.0 0.0 12.0 0 0
md15 0.0 0.2 0.0 0.4 0.0 0.0 9.7 0 0
md16 0.0 0.0 0.0 0.1 0.0 0.0 12.1 0 0
md20 0.3 0.2 3.1 0.2 0.0 0.0 19.2 0 0
md21 0.0 0.0 0.1 0.1 0.0 0.0 13.4 0 0
md24 0.0 0.0 0.0 0.0 0.0 0.0 18.5 0 0
md25 0.0 0.2 0.0 0.4 0.0 0.0 9.9 0 0
md26 0.0 0.0 0.0 0.1 0.0 0.0 12.3 0 0
sd6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd0 20.7 55.9 1179.1 9605.6 8.8 27.5 474.3 6 33
ssd1 0.4 0.7 3.2 1.0 0.0 0.0 15.2 0 1
ssd2 1.9 1.3 146.7 822.3 0.0 0.3 92.5 0 3
ssd3 0.2 3.0 1.8 1088.5 0.0 1.0 312.3 0 3
ssd4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd5 0.4 0.7 2.9 1.0 0.0 0.0 15.3 0 1
nfs1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
extended device statistics cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b us sy wt id
md0 0.0 0.3 0.0 0.5 0.0 0.0 39.4 1 1 5 10 30 56
md1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md5 0.0 0.3 0.0 0.2 0.0 0.0 35.0 1 0
md6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md10 0.0 0.3 0.0 0.5 0.0 0.0 20.8 0 1
md11 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md14 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md15 0.0 0.3 0.0 0.2 0.0 0.0 9.2 0 0
md16 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md20 0.0 0.3 0.0 0.5 0.0 0.0 16.8 0 1
md21 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md24 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md25 0.0 0.3 0.0 0.2 0.0 0.0 14.2 0 0
md26 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd0 93.2 0.3 14997.2 0.2 0.0 3.5 37.5 0 60
ssd1 0.0 1.3 0.0 1.0 0.0 0.0 17.5 0 2
ssd2 26.6 32.6 213.1 27450.2 0.0 1.8 30.0 0 92
ssd3 3.7 80.9 29.3 37813.5 0.0 26.5 313.5 0 100
ssd4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd5 0.0 1.3 0.0 1.0 0.0 0.0 15.2 0 2
nfs1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
extended device statistics cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b us sy wt id
md0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 4 8 27 62
md1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md10 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md11 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md14 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md15 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md16 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md20 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md21 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md24 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md25 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
md26 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd0 100.0 0.0 15909.1 0.0 0.0 5.6 56.4 0 61
ssd1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd2 28.0 33.7 224.0 28500.9 0.0 1.7 28.2 0 91
ssd3 10.0 73.0 80.0 34224.1 0.0 25.9 311.5 0 99
ssd4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
ssd5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
nfs1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
Now, it would seem that the devices named ssd0 and ssd3 (and also ssd2) are quite bottlenecked in terms of their service time and their percent of time busy. Unfortunately, I don't know how to verify that these correspond to the scratch filesystems (I suspect they do), because I can't find the mapping between the device names reported by iostat and the device files listed in /etc/mnttab. Given this data, can someone on the list confirm my suspicion that there is a disk bottleneck on these systems and, if possible, recommend a way to find out which iostat device names correspond to which device files on my system? I would be very grateful for any suggestions.
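In case it is useful to others doing the same triage, here is a rough sketch (plain Python, written for this post; the function name and thresholds are my own invention, not part of MOLPRO or Solaris) that scans iostat -xm device lines and flags any device whose service time or percent-busy looks high. The sample lines are copied from the third interval above:

```python
# Sketch: flag busy devices in `iostat -xm` output.
# Thresholds are illustrative, not from any official tool.
def flag_busy(iostat_lines, svc_t_max=100.0, busy_max=50.0):
    """Return device names whose average service time (svc_t, ms)
    or percent-busy (%b) exceeds the given thresholds."""
    flagged = []
    for line in iostat_lines:
        fields = line.split()
        # Expected columns: device r/s w/s kr/s kw/s wait actv svc_t %w %b [cpu]
        if len(fields) < 10:
            continue
        try:
            svc_t = float(fields[7])
            busy = float(fields[9])
        except ValueError:
            continue  # skip header or other non-data lines
        if svc_t > svc_t_max or busy > busy_max:
            flagged.append(fields[0])
    return flagged

# Sample lines copied from the third iostat interval above:
sample = [
    "ssd0 100.0 0.0 15909.1 0.0 0.0 5.6 56.4 0 61",
    "ssd2 28.0 33.7 224.0 28500.9 0.0 1.7 28.2 0 91",
    "ssd5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0",
]
print(flag_busy(sample))  # → ['ssd0', 'ssd2']
```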
Cheers,
Seth
ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms
Dr Seth Olsen, PhD
Postdoctoral Fellow, Computational Systems Biology Group
Centre for Computational Molecular Science
Chemistry Building,
The University of Queensland
Qld 4072, Brisbane, Australia
tel (617) 33653732
fax (617) 33654623
email: s.olsen1 at uq.edu.au
Web: www.ccms.uq.edu.au
ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms
----- Original Message -----
From: Ross Nobes <Ross.Nobes at uk.fujitsu.com>
Date: Monday, March 7, 2005 11:11 pm
Subject: RE: [molpro-user] performance on Sun ultraSparc, solaris 9
> Hi Seth,
>
> Are you running the serial version or a parallel version of MOLPRO
> please? If it is the former, what %CPU values are you seeing from
> prstat, keeping in mind that the most you would ever see on an 8-CPU
> machine is 12.5%?
>
> Best wishes,
> Ross
> ---
> Ross Nobes
> Manager, Physical and Life Sciences Research Group
> Fujitsu Laboratories of Europe
> Hayes Park Central, Hayes End Road
> Hayes, Middlesex UB4 8FE, UK
> Phone +44 (0) 77 7195 6113
> Fax +44 (0) 20 8606 4539
> E-mail Ross.Nobes at uk.fujitsu.com
>
> > -----Original Message-----
> > From: owner-molpro-user at molpro.chem.cf.ac.uk [owner-molpro-
> > user at molpro.chem.cf.ac.uk] On Behalf Of Dr Seth OLSEN
> > Sent: 07 March 2005 01:54
> > To: molpro-user at molpro.net
> > Subject: [molpro-user] performance on Sun ultraSparc, solaris 9
> >
> >
> > Hi Molpro-users,
> >
> > I'm running MolPro on an 8-cpu Sun UltraSPARC SMP box (running
> > Solaris 9) with 30G RAM. I have been trying some test jobs and it
> > seems that MolPro is not making good use of the available resources,
> > in the sense that I see low %CPU values when I check prstat. Are
> > there system parameters that should be tweaked when running MolPro
> > on this system? I have already reset shmmax to 4GB. Is this
> > recommended? Is there a better value for shmmax? I know that the low
> > CPU utilization can't be due to excessive disk i/o, because I see
> > this phenomenon even if I run under GDIRECT. Are there any insights
> > available as to how I can get the most out of MolPro on this system?
> >
> > Cheers,
> >
> > Seth
> >
> >
> > ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms
> >
> > Dr Seth Olsen, PhD
> > Postdoctoral Fellow, Computational Systems Biology Group
> > Centre for Computational Molecular Science
> > Chemistry Building,
> > The University of Queensland
> > Qld 4072, Brisbane, Australia
> >
> > tel (617) 33653732
> > fax (617) 33654623
> > email: s.olsen1 at uq.edu.au
> > Web: www.ccms.uq.edu.au
> >
> > ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms
> >
> >
> >
>
>