libperfsuite, the
PerfSuite core software library, contains a small number of
routines that collect data for the PerfSuite command-line,
graphical, and Web-based tools. Supporting those tools was
the primary motivation for writing the library.
libperfsuite is a stand-alone
library that requires no external library support to function.
The library is targeted for Linux/Intel (x86/ia64) platforms.
The routines within libperfsuite also provide output
and functionality that are useful independently of the
PerfSuite tools, so the library may be used entirely on its own.
The routines currently contained in the library are summarized on
this web page. All libperfsuite routines begin with the
prefix "ps_" (for C) and "PSF_"
(for Fortran).
Note: this web page describes an earlier release of the
libperfsuite library (version 0.5). The version
of libperfsuite installed at NCSA and available
for download is more recent and contains additional functionality
not described here. Documentation for libperfsuite
is currently being rewritten in DocBook format to be distributed
as part of the final 0.6 release of PerfSuite software. If you
have any questions or feedback, please contact us through the
email address at the bottom of this page.
This description is specific to the installation of PerfSuite
on NCSA Linux clusters, which is rooted at the directory
/usr/apps/tools/perfsuite. Usage elsewhere should
adjust all paths accordingly to reflect the local installation.
Compiling
All C-based applications should include the main PerfSuite
header file <perfsuite.h>. Fortran-based
applications should include <fperfsuite.h>.
No other header files
are necessary to use these routines.
When you compile your program, include the flag:
-I/usr/apps/tools/perfsuite/include
Linking
When you link your program, include the flags:
-L/usr/apps/tools/perfsuite/lib -lperfsuite
On IA-32 systems, libperfsuite is only available as a
static (.a) library, so you will not need to adjust
your LD_LIBRARY_PATH environment variable after
linking. You will need to relink your program to use
updated versions of libperfsuite as the
library is improved in the future.
This routine allows you to retrieve the CPU time accumulated by
an arbitrary (active) process on each processor of the system.
You can supply any process ID as the input parameter pid
or you can set pid=0, which will cause the
CPU times for the current process to be returned.
The CPU times returned are expressed in units of "jiffies"
(clock ticks). One jiffy is 1/100 of a second on x86 and
1/1024 of a second on ia64. The number of clock ticks per second
is available from sysconf(3) as _SC_CLK_TCK.
The CPU times returned by the C version of the routine
will be contained in the cputimes_t
structure, which is defined in <perfsuite.h> as:
struct cputimes {
int n_cpus;
long total_utime;
long total_stime;
long *cpu_utime;
long *cpu_stime;
};
typedef struct cputimes cputimes_t;
The Fortran version of the routine will supply the total user and
system CPU time for the process in the total_utime
and total_stime parameters, and will supply the
number of CPUs actually found on the system in the n_cpus
parameter. Per-CPU user and system times will be returned in
the cpu_* arrays, up to the limit supplied as input
in maxcpus.
Memory Considerations
The C version of this routine obtains memory for the
cpu_utime and cpu_stime arrays via
malloc(). You should free() this memory
when you are done with the values. Note: internally,
ps_cputimes obtains new memory on each call and assigns it
to these two pointers, so if you don't deallocate the memory
from a previous call, or if you've assigned other memory to these
pointers, the earlier memory is leaked.
All internal memory allocation done by the Fortran
version (PSF_cputimes) is freed before the routine returns.
Successful completion of this routine returns zero as a result
of the C function. Successful completion with the Fortran
version will have ierr set to zero.
C
int ps_memusage(pid,vsize,rss)
pid_t pid;
float *vsize, *rss;
Fortran
subroutine PSF_memusage(pid,vsize,rss,ierr)
integer pid, ierr
real vsize, rss
This is a convenience routine that allows you to retrieve
information about memory consumption of a process. You can
supply any active process ID in the parameter pid
or you can supply the value 0, which will return the information
relevant to the calling process. The
current virtual size of the process is returned in vsize
and the current resident set size of the process is returned
in rss. Both are expressed in units of megabytes
(1048576 bytes). This routine calls ps_procstat() to
obtain this information from the /proc filesystem.
A return (or ierr) value of
zero indicates successful completion.
integer pid
integer pinfo(PS_PROCSTATSIZE)
character*(PS_MAXCOMMSIZ) comm
character state
integer ierr
This is a convenience routine that allows you to retrieve information
about a process from the /proc filesystem. It accepts
a process ID in the pid argument (set pid=0 to
retrieve information about the calling process) and returns information
about the process in the remaining arguments. A lot of information
is returned by this routine; the fields contained in the ps_procstat_t
structure (or, in the Fortran version, the pinfo array)
are listed in <perfsuite.h> and
<fperfsuite.h> respectively.
A number of constants have been defined for the Fortran version
that allow you to reference individual elements in the pinfo
array by symbolic names; these are listed in
<fperfsuite.h>.
Note: There is a possibility of overflow of values when calling
this routine using the Fortran interface. In particular, any of the
values that are declared as unsigned in the C header file
may overflow when being converted to a 32-bit signed
integer for the Fortran version, and will then appear to be negative numbers.
If you're interested in obtaining the current memory usage of your
application, you may instead prefer to use PSF_memusage(),
described above, which guards against this overflow.
Refer to the manual page for proc(5) for more information
about the information returned by this routine (the fields are those
under each process' "stat" entry). You can also view
the libperfsuite C header file that declares the various fields
to get an idea of what's available.
A zero return value for the C routine or a zero in the ierr
parameter indicates success.
This routine samples a hardware-based time counter on the system
and returns the current value.
Typically, you would sample this twice, both before and after a section
of code for which you are interested in obtaining a high-precision
timing value. The difference between the "after" and
"before" values will be the amount of wall-clock
time elapsed between the two samples, expressed in timer
clicks. You can use this value in conjunction with the clock
speed returned by ps_cpuspeed to convert the
clicks to a time value.
ps_rtc always returns zero.
Note that gettimeofday(2) also returns a high-resolution
time and may be preferable to using ps_rtc for your
purposes.
C
int ps_cpuspeed(speed)
double *speed;
Fortran
subroutine PSF_cpuspeed(speed, ierr)
real speed
integer ierr
This routine returns the clock speed of the machine, expressed in
megahertz (MHz, or millions of cycles per second). It retrieves this
information from /proc/cpuinfo, which is set at
system boot.
ps_cpuspeed returns zero on success (as the return
value of the C function or within the ierr parameter
for the Fortran routine).