er_print - Oracle Developer Studio 12.5 Man Pages

er_print(1)

Name

er_print - print an ASCII report from one or more performance experiments

Synopsis

er_print [ -script script | -command | -V ] experiment-list

Description

er_print is a utility that generates a plain-text version of the various displays supported by Performance Analyzer. The output is displayed on the standard output. Experiment files are generated using the collect command, or the dbx collector commands. The experiment-list can contain one or more experiment names or experiment group names. An experiment group is defined by a file that contains the names of the experiments in the group. You can read experiments on descendant processes either by referring to them explicitly or by constructing an experiment group for them as described in the section "Creating Experiment Groups" below.

Based on the data collected, various metrics of performance can be computed for functions, callers and callees, source files, and disassembly listings. The data collected and metrics generated are described in the collect (1) man page. The graphical displays available are described in the Oracle Developer Studio 12.5: Performance Analyzer and the Performance Analyzer help.

Options

-script script: Read er_print commands from the file script which contains one command per line.
-command: Process the given command, described in the Sub-Commands section.
-V: Display version information and exit.

The options are processed in the order they appear. Options can be repeated. Scripts and explicit commands can be mixed in any order. If no command or script arguments are supplied, er_print enters interactive mode to read commands from the input terminal. Input from the input terminal is terminated with the quit command.

Any line that ends in \ has the \ character removed, and the content of the next line appended before the line is parsed. There is no limit (other than total memory) to how many continuation lines you can use. Any arguments that contain blanks must be surrounded by double quotes ("), whether on the command line, or read from scripts, or read from a .er.rc file.
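
The continuation and quoting rules above can be sketched in Python as follows. This is an illustrative model, not the er_print parser: the function names are hypothetical, and the handling of anything beyond trailing backslashes and double quotes is an assumption.

```python
def join_continuations(lines):
    """Merge each line that ends in a backslash with the line after it."""
    merged, pending = [], ""
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\\"):
            pending += line[:-1]  # drop the backslash and keep accumulating
            continue
        merged.append(pending + line)
        pending = ""
    if pending:
        merged.append(pending)  # a dangling continuation at end of input
    return merged


def split_arguments(line):
    """Split on blanks, keeping double-quoted text together as one argument."""
    args, current, in_quotes, quoted = [], "", False, False
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
            quoted = True
        elif ch.isspace() and not in_quotes:
            if current or quoted:
                args.append(current)
            current, quoted = "", False
        else:
            current += ch
    if current or quoted:
        args.append(current)
    return args
```

With this model, a script line split across two physical lines is parsed as one command, and a demangled C++ name containing blanks stays a single argument when it is quoted.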

After each command is processed, any error or warning messages arising from the processing are written. Summary statistics on the processing can be printed with the procstats command.

When invoked on more than one experiment or experiment-group, er_print displays data in aggregation mode by default. See "COMPARISON MODE" for more information.

Sub Commands

The commands accepted by the er_print utility are listed below. Any command can be abbreviated by a shorter string, as long as the command is unambiguous.
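
As an illustration of abbreviation matching, the following Python sketch resolves a shortened command against a list of names. It assumes that an abbreviation is a leading prefix and that an exact match always wins; neither detail is stated in this page, and resolve_command is a hypothetical helper.

```python
COMMANDS = ["functions", "fsummary", "fsingle", "filters", "metrics", "sort"]


def resolve_command(word, commands=COMMANDS):
    """Return the single command that `word` abbreviates, or raise ValueError."""
    if word in commands:
        return word  # assumption: an exact name is never ambiguous
    matches = [name for name in commands if name.startswith(word)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown command: " + word)
    raise ValueError("ambiguous command: " + word + " matches " + ", ".join(matches))
```

Under these assumptions "fsu" resolves to fsummary, while "fs" is rejected because it matches both fsummary and fsingle.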

Commands Controlling the Function List

functions

Write the function list with the current set of metrics. The function list includes any load objects whose functions are hidden with the object_select command. Load objects include executables and shared objects loaded by the program.

metrics metric_spec

Set the function list metrics (the same metrics are also used for source and disassembly, and for lines and pcs). metric_spec is a list of metric keywords separated by colons. For dynamic metrics, which are metrics based on measured data, each keyword is of the form <flavor><visibility><metric-name>. For static metrics, which are metrics based on the static properties of the load objects in the experiments (name, address, and size), each keyword is of the form [<visibility>]<metric-name>, with the <visibility> setting being optional.

<flavor> can be either "i" for inclusive or "e" for exclusive. The combinations "ie" and "ei" are expanded: for example "ie<visibility><metric-name>" is expanded to "i<visibility><metric-name>:e<visibility><metric-name>".

<visibility> can be any combination of "." (to show the metric as a time), "%" (to show it as a percentage), and "+" (to show it as a count). If the metric can be shown only as a time or only as a count, "." and "+" have the same meaning. For hardware counter profiling experiments and counters that count in cycles, the metric is normally shown as a time ("."); it can be shown as a count using "+" in its <visibility> field. The order of appearance of time, percent, and count is fixed: it is not affected by the order of the characters in the <visibility> setting. For static metrics, "+", ".", and "%" all have the same meaning.

If two specific counters, "cycles" and "insts", are collected, two additional metrics are available, "CPI" and "IPC", meaning cycles-per-instruction and instructions-per-cycle, respectively. They are always shown as a ratio, and not as a time, count, or percentage, whether specified by "+", ".", or "%". A high value of CPI or low value of IPC indicates code that runs inefficiently in the machine; conversely, a low value of CPI or a high value of IPC indicates code that runs efficiently in the pipeline.
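
The two ratios are reciprocals of each other, as this small Python sketch shows. The function name and the sample counts are made up for illustration.

```python
def cpi_and_ipc(cycles, insts):
    """Return (cycles-per-instruction, instructions-per-cycle)."""
    if cycles <= 0 or insts <= 0:
        raise ValueError("both counter values must be positive")
    return cycles / insts, insts / cycles


# Hypothetical counts: 2000 cycles spent retiring 1000 instructions.
cpi, ipc = cpi_and_ipc(2000, 1000)
```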

<visibility> can also be specified as a "!", which means turn off the metric. It is typically not used except with the dmetrics command (see below), to set default metrics to override the built-in visibility defaults for each type of metric.

The same metric cannot appear multiple times in the metric_spec; if it does, it is reported as an error. If the metric "name" is not in the list, it is appended to the list. A list of all the available <metric-name> values for the experiments loaded can be obtained with the metric_list command. See the collect (1) man page for more information on metrics.
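
The expansion, duplicate, and "name" rules can be modeled with the Python sketch below. It is not the er_print implementation. It assumes that "ei" expands in letter order, that two keywords denote the same metric when they differ only in visibility characters, and that metric names contain none of ".", "%", "+", or "!".

```python
import re

_VISIBILITY_CHARS = ".%+!"
_COMBINED = re.compile(r"(ie|ei)([.%+!]+)(.+)")


def expand_metric_spec(metric_spec):
    """Expand combined flavors, reject repeated metrics, and append "name"."""
    keywords = []
    for keyword in metric_spec.split(":"):
        match = _COMBINED.fullmatch(keyword)
        if match:
            flavors, visibility, name = match.groups()
            # "ie.user" becomes "i.user" followed by "e.user"
            keywords.extend(flavor + visibility + name for flavor in flavors)
        else:
            keywords.append(keyword)

    strip_visibility = str.maketrans("", "", _VISIBILITY_CHARS)
    seen = set()
    for keyword in keywords:
        key = keyword.translate(strip_visibility)  # flavor + metric name only
        if key in seen:
            raise ValueError("metric appears more than once: " + keyword)
        seen.add(key)

    if "name" not in seen:
        keywords.append("name")
    return keywords
```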

By default, use the metric setting based on the dmetrics command, processed from .er.rc files, as described in "Default-setting Commands" below. If a metrics command explicitly sets metric_spec to default, restore the default settings appropriate to the data recorded. When metrics are reset, the default sort metric is set in the new list. If metric_spec is omitted, print the current metrics setting.

In addition to setting the metrics for the function list, the command sets the metrics for callers-callees, for data-derived output, and for index-objects to match the metrics settings. The callers-callees metrics show the attributed metrics corresponding to those metrics in the functions list whose inclusive or exclusive metrics are shown, as well as the static metrics.

If a metrics command has an error in it, or no metrics in the specification correspond to the current data, the command is ignored with a warning, and the previous settings remain in effect.

sort metric_spec

Sort the function list by the given metric. The metric_spec is described under metrics; it can be preceded by a "-" sign to specify reverse sort. For example the following command means to sort by inclusive user time:

sort i.user

The <visibility> in the metric name does not affect the sort order. If more than one metric is named in the metric_spec, er_print uses the first one that is visible. If none of the metrics named are visible, er_print ignores the sort command.

By default, er_print uses the metric sort setting based on the dsort command, processed from .er.rc files, as described in "Default-setting Commands" below. If a sort command explicitly sets metric_spec to default, er_print uses the default settings.

If metric_spec is omitted, er_print prints the current sort metric.

fsummary

Write the summary panel for each function in the function list, in the order specified by the current sort metric. Include any load objects in the function list whose functions are hidden with the object_select command.

fsingle function_name [N | ADDR]

Write the summary panel for the named function. The optional parameter is needed for those cases where the function name is ambiguous; see the source command for more information.

Commands Controlling the Callers-Callees List

callers-callees

Write the callers-callees panel for each function, based on the last metrics specification, in the order specified by the function sort metric (sort).

csingle function_name [N | ADDR]

Write the callers-callees panel for the named function. The optional parameter is needed for those cases where the function name is ambiguous; see the source command for more information.

The callers-callees panel can also be generated for a callstack fragment. The initial frame in the fragment is set by the csingle command. Additional frames can be added by the cprepend or cappend commands; frames can be removed by the crmfirst and crmlast commands. After each frame is added to or removed from the callstack fragment, er_print prints the panel showing all callers into the current fragment and all callees from that fragment.

cprepend function_name [N | ADDR]

Prepend the named function to the current callstack fragment. The optional parameter is needed where the function name is ambiguous; see the source command for more information.

cappend function_name [N | ADDR]

Append the named function to the current callstack fragment. The optional parameter is needed where the function name is ambiguous; see the source command for more information.

crmfirst

Remove the top frame from the callstack fragment.

crmlast

Remove the bottom frame from the callstack fragment.
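
The fragment-editing commands behave like operations on a double-ended queue. The Python sketch below models only the frame bookkeeping, not the panel output. It assumes the prepended end is the "top" frame that crmfirst removes, and the function names in the usage lines are placeholders.

```python
from collections import deque


class CallstackFragment:
    """Toy model of the fragment edited by csingle and the c* commands."""

    def __init__(self, function_name):
        # csingle sets the initial frame of the fragment
        self.frames = deque([function_name])

    def cprepend(self, function_name):
        self.frames.appendleft(function_name)

    def cappend(self, function_name):
        self.frames.append(function_name)

    def crmfirst(self):
        self.frames.popleft()  # assumed to be the top frame

    def crmlast(self):
        self.frames.pop()  # assumed to be the bottom frame


fragment = CallstackFragment("compute")  # csingle compute
fragment.cprepend("main")                # fragment: main, compute
fragment.cappend("helper")               # fragment: main, compute, helper
fragment.crmfirst()                      # fragment: compute, helper
```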

Command Controlling the Calltree List

calltree | ctree: Write the dynamic call graph from the experiment and show the hierarchical metrics at each level.

Commands Common to Tracing Data

datasize: Write the distribution of the size of the data referred to in tracing data in a logarithmic scale. For Heap Tracing, the size is the allocation or leak size. For I/O Tracing, the size is the number of bytes transferred.
duration: Write the distribution of the duration of the events in tracing data in a logarithmic scale. For Synchronization Tracing, the duration is the Synchronization Delay. For I/O Tracing, the duration is the time spent in the I/O operation.

Commands Controlling Memory Leak and Allocation Lists

leaks: Write the list of leaks sorted by size along with the call stack for each. Aggregate the entries in the leak list by common call stack.
allocs: Write the list of allocations sorted by size along with the call stack for each. Aggregate the entries in the allocations list by common call stack.
heap: Write the list of allocations and leaks, aggregated by common callstack.
heapstat: Write the overall statistics of heap usage including the peak memory usage for the application.
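
Aggregation by common call stack can be pictured with the following Python sketch. The event layout (a call stack plus a byte count) and the choice to sort by total bytes in descending order are assumptions made for illustration; the page says only that the lists are sorted by size.

```python
from collections import defaultdict


def aggregate_by_callstack(events):
    """events: iterable of (callstack, size_in_bytes) pairs.

    Returns rows of (callstack, event_count, total_bytes), largest total first.
    """
    totals = defaultdict(lambda: [0, 0])
    for callstack, size in events:
        entry = totals[tuple(callstack)]
        entry[0] += 1      # number of events sharing this call stack
        entry[1] += size   # bytes attributed to this call stack
    rows = [(stack, count, total) for stack, (count, total) in totals.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```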

Commands Controlling the I/O Activity Report

ioactivity: Write the report of all I/O activity sorted by file.
iodetail: Write the report of all I/O activity sorted by virtual file descriptor. A different virtual file descriptor is generated for each open of a file, even if the same file descriptor is returned from the open.
iocallstack: Write the report of all I/O activity sorted by callstack, and aggregated over all events with the same callstack. For each aggregated callstack, include the callstack trace.
iostat: Write summary statistics for all I/O activity.

Commands Controlling Source and Disassembly Listings

pcs

Write a list of program counters (PCs) and their metrics, ordered by the current sort metric. Include lines in the list that show aggregated metrics for each load object whose functions are hidden with the object_select command.

psummary

Write the summary metrics panel for each PC in the PC list, in the order specified by the current sort metric.

lines

Write a list of source lines and their metrics, ordered by the current sort metric. Include lines in the list that show aggregated metrics for each function that does not have line-number information, or whose source file is unknown. Also include lines that show aggregated metrics for each load object whose functions are hidden with the object_select command.

lsummary

Write the summary metrics panel for each line in the lines list, in the order specified by the current sort metric.

source { filename | function_name } [N | ADDR]

Write annotated source of the given object file or of the object file that contains the given function. If the name of the function is that of a C++ function or a Java method, you can use the demangled name in either short or long form, or the mangled name. If the demangled name contains spaces, it must be surrounded by double quotes. The optional parameter, N or ADDR, is needed for those cases where the file name or function name is ambiguous. If the N form is used, pick the Nth possible choice (with the numbering starting from 1). If there is more than one possibility and the N supplied is not within the possible range, er_print reports an error. If there is only one possibility, and the N supplied is not within the possible range, the error is ignored.

If the ADDR form is used, it must be written as @segment-number:address. The segment-number:address values should be specified exactly as they appear as the address metric for the function.

If an ambiguous name is given without the specifier, er_print displays a list of choices instead of the annotated source. Each list item contains the number that can be used for N, the name of the object module that references the function or file and, in the case of an ambiguous function, the function name.

The default source context for any function is defined as the source file to which the first instruction in that function is attributed. Normally this is the source file that was compiled to produce the object module that contains the function. Immediately following the first instruction, er_print adds an index line for the function, which is displayed as text within angle brackets in the form shown below:

<Function: f_name>

Alternate source contexts consist of other files that have instructions in the function attributed to them. Such contexts include instructions coming from include files and instructions from functions inlined into the named function. If there are any alternate source contexts, er_print includes a list of extended index lines at the beginning of the default source context to indicate where the alternate source contexts are located in the following form:

<Function: f, instructions from source file src.h>

The function name can also be specified as function`file`, where file is used to specify an alternate source context for the function.

Note: If the -source argument is used when invoking er_print on the command line, each backquote around the file name must be preceded by a backslash escape character. In other words, the function name is written as function\`file\`. Do not use the backslash when er_print is in interactive mode.

Normally when the default source context is used, metrics are shown for all functions from the source file. If the source file is explicitly used as an alternate source context, metrics are shown only for the named function.

If any compiler commentary has been selected, er_print interleaves the commentary with the source lines in the source listing. The string "##" is prepended to lines with metrics that are equal to or exceed a threshold percentage of the maximum for that metric within the file, to make it easier for you to find the important lines. You can control the threshold using the sthresh command, and control the classes of source compiler commentary to show using the scc command.

er_print searches for the file using the search path specified by the setpath and addpath commands. If the file is not found, er_print searches under the absolute pathname as recorded in the executable. If you have moved the sources, or the experiment was recorded in a different file system, you can put a symbolic link from the current directory to the real source location in order to see the annotated source, or copy the source code and load objects into the experiment.

src { filename | function_name } [N | ADDR]

Same as source.

disasm { filename | function_name } [N | ADDR]

Write annotated disassembly of the given object file, or of the object file containing the given function. Ambiguities are resolved in the same way as for the source command.

The string "##" is prepended to lines with a metric that equals or exceeds a threshold percentage of the maximum value of that metric. This makes it easier for you to find the important lines. The threshold is set using the dthresh command. The classes of commentary shown are set by the dcc command.

er_print searches for the source file corresponding to the specified disassembly as described for the source command, and interleaves the disassembly with the source and index lines. If the function includes code from an alternate source context, er_print inserts an index line referring to the alternate context, followed by the raw disassembly without compiler commentary. If the source file cannot be found, er_print shows the disassembly without the source and without the compiler commentary.

scc com_spec

Specify which classes of compiler commentary are shown with annotated source. com_spec is a colon-separated list of classes. Each of the classes refers to specific types of messages. The allowed classes are:

b[asic]: Show basic messages from all classes.
v[ersion]: Show version messages.
w[arn]: Show warning messages.
pa[rallel]: Show parallelization messages.
q[uery]: Show questions from the compiler.
l[oop]: Show loop transformation messages.
pi[pe]: Show pipelining messages.
i[nline]: Show inlining messages.
m[emops]: Show messages about memory operations.
f[e]: Show front-end messages.
co[degen]: Show code generator messages.
cf: Show compile flags at the bottom of the source.
all: Show all messages.
none: Do not show messages.

The classes "all" and "none" cannot be used with other classes.

For compatibility, the threshold for flagging important lines can be included in the list with any of the classes, including "all" and "none".

t[hreshold]=nn: Flag a line as important if it has a metric value which is greater than nn percent of the largest value of that metric on any line in the file. The default value of nn is 75.

For example, the following command specifies to show loop transformation messages and pipelining messages, and sets the threshold to 50 percent:

scc l:pi:t=50

If no scc command is given, er_print uses the default setting:

scc all

If com_spec is omitted, compiler commentary is turned off. The scc command is normally used only in a .er.rc file.
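
A com_spec such as l:pi:t=50 can be decoded with the Python sketch below. It reads the bracket notation as "the part before the bracket is the shortest accepted abbreviation", which is an interpretation of this page rather than documented parser behavior, and parse_com_spec is a hypothetical helper.

```python
# (shortest accepted form, full class name), taken from the list above
CLASSES = [
    ("b", "basic"), ("v", "version"), ("w", "warn"), ("pa", "parallel"),
    ("q", "query"), ("l", "loop"), ("pi", "pipe"), ("i", "inline"),
    ("m", "memops"), ("f", "fe"), ("co", "codegen"), ("cf", "cf"),
    ("all", "all"), ("none", "none"),
]


def parse_com_spec(com_spec):
    """Return (list of full class names, threshold or None)."""
    classes, threshold = [], None
    for token in com_spec.split(":"):
        if "=" in token:
            key, value = token.split("=", 1)
            if not (key.startswith("t") and "threshold".startswith(key)):
                raise ValueError("unknown setting: " + token)
            threshold = int(value)
            continue
        for shortest, full in CLASSES:
            if token.startswith(shortest) and full.startswith(token):
                classes.append(full)
                break
        else:
            raise ValueError("unknown commentary class: " + token)
    if len(classes) > 1 and ("all" in classes or "none" in classes):
        raise ValueError('"all" and "none" cannot be used with other classes')
    return classes, threshold
```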

sthresh value

Set the threshold for flagging important lines in the source. er_print flags a line as important if it has a metric value that is greater than value percent of the largest value of that metric on any line in the file. The default value of value is 75.
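
The flagging rule amounts to a comparison against a fraction of the per-file maximum, sketched here in Python. The sketch uses a strict "greater than" test as worded for sthresh; the source command description says "equal to or exceed", so treat the boundary case as an assumption. The function name and sample data are illustrative only.

```python
def flag_hot_lines(metric_by_line, threshold_percent=75):
    """Return the set of line numbers that would carry the "##" marker."""
    if not metric_by_line:
        return set()
    peak = max(metric_by_line.values())
    if peak <= 0:
        return set()  # nothing to flag when no line has a positive metric
    cutoff = peak * threshold_percent / 100.0
    # Assumption: strict comparison, following the sthresh wording.
    return {line for line, value in metric_by_line.items() if value > cutoff}
```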

dcc com_spec

Specify which classes of compiler commentary are shown with annotated disassembly. The com_spec specification for this command can include any of the scc classes and the following additional specifications:

h[ex]: Show the hexadecimal representation of each instruction.
noh[ex]: Do not show the hexadecimal representation of each instruction.
s[rc]: Interleave source in disassembly (default).
as[rc]: Interleave annotated source, with source-line metrics, in disassembly.
nos[rc]: Do not interleave source in disassembly.

If no dcc command is given, er_print uses the default setting:

dcc all:src

This command is normally used only in a .er.rc file.

cc com_spec

Specify compiler-commentary for both source and disassembly.

dthresh value

Set the threshold for flagging important lines in the disassembly. er_print flags a line as important if it has a metric value which is greater than value percent of the largest value of that metric on any line in the file. The default value of value is 75.

Commands Controlling Searching for Files

er_print looks for load object files first in the archives directory of the experiment. If the files are not found there, er_print uses the same algorithm as for source and object files, described below.

In most experiments, source and object files are recorded in the form of full paths. In addition, Java source files also have a package name, listing the directory structure to the file. When you look at experiments on a different machine, those full paths may not be accessible.

Two complementary methods are used to locate source and object files: path mapping, and a search path. (The same methods are used for load object files, if they are not found in the archives subdirectory.)

Path mapping is applied first, and specifies how to replace the leading component in any full file path, with a different path. For example, if a file is specified as:

/a/b/c/d/e/sourcefile

A pathmap directive could specify mapping /a/ to /x/y, or specify mapping /a/b/c/d/ to /x, and so forth.

If path mapping did not find the file, the search path is used. The search path gives a list of directories to be searched for a file with the given basename (sourcefile, in the example above). The search path can be set with a setpath command, or can be appended to with an addpath command. For Java files, both the package name and the basename are tried, in that order.

Each directory in the search path is used to construct a full path to try, or two full paths in the case of Java source files. Each of those full paths will have path mapping applied to them, and if none of the mapped paths point to the file, the next search path directory is tried.

Finally, if neither of those mechanisms has found the file, and no pathmap prefix matched the original full path, that original full path is tried. If any pathmap prefix matched the original full path, but the file was not found, the original full path will not be tried.
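
The lookup order described above can be summarized with the Python sketch below. It is a simplified model, not the er_print implementation: Java package names are omitted, prefixes are matched on whole path components with trailing slashes ignored, and each search-path candidate is tried before its mapped variants. All three choices are assumptions, and the helper names are hypothetical.

```python
import posixpath


def _apply_pathmaps(path, pathmaps):
    """Return every remapped form of `path`, in pathmap order."""
    mapped = []
    for old_prefix, new_prefix in pathmaps:
        old = old_prefix.rstrip("/")
        new = new_prefix.rstrip("/")
        if path == old or path.startswith(old + "/"):
            mapped.append(new + path[len(old):])
    return mapped


def candidate_paths(full_path, pathmaps, search_path):
    """List the paths to try, in order, for a recorded full path."""
    candidates = list(_apply_pathmaps(full_path, pathmaps))  # 1. path mapping
    basename = posixpath.basename(full_path)
    for directory in search_path:                            # 2. search path
        built = posixpath.join(directory, basename)
        candidates.append(built)
        candidates.extend(_apply_pathmaps(built, pathmaps))
    if not _apply_pathmaps(full_path, pathmaps):             # 3. original path
        candidates.append(full_path)
    return candidates


def locate(full_path, pathmaps, search_path, exists):
    """Return the first candidate for which exists(path) is true, else None."""
    for path in candidate_paths(full_path, pathmaps, search_path):
        if exists(path):
            return path
    return None
```

Passing the existence check as a parameter keeps the sketch independent of any real file system; os.path.exists could be supplied in practice.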

Note that the default search path includes the current directory, ., and the experiment directories, so one way to make source files accessible is to copy them to either of those places, or to put symbolic links in those places pointing to the current location of the source file.

pathmap old_prefix new_prefix

If a file cannot be found using the path_list as set by addpath or setpath, you can use pathmap to remap one or more paths. Any pathname for a source file, object file, or shared object that begins with the old_prefix has that prefix replaced by the new_prefix, and the resulting path is used to find the file. Multiple pathmap commands can be supplied, and they are each tried until the file is found. A pathmap directive with no arguments prints the current path mapping list.

setpath path_list

Set a search path used to find source files, object files, and so on. The path_list is a colon-separated list of directories, jar files, or zip files. If any directory has a colon character in it, you must escape the colon with a backslash. The special directory name $expts refers to the set of current experiments, in the order in which they were loaded. It can be abbreviated with a single $ character. Only the founder experiment is looked at when searching $expts; no descendant experiments are examined.

The default search path is: "$expts:.". Use the compiled-in full pathname if a file cannot be found by searching the current path.

setpath with no argument means print the current path.
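
The path_list syntax can be illustrated with the following Python sketch, which splits on unescaped colons and expands $expts (or $) into the loaded experiments. The function name is hypothetical, and dropping empty entries is an assumption.

```python
def split_path_list(path_list, experiments):
    """Split a colon-separated path_list, honoring backslash-escaped colons."""
    entries, current = [], ""
    i = 0
    while i < len(path_list):
        ch = path_list[i]
        if ch == "\\" and i + 1 < len(path_list) and path_list[i + 1] == ":":
            current += ":"  # an escaped colon is part of the directory name
            i += 2
            continue
        if ch == ":":
            entries.append(current)
            current = ""
        else:
            current += ch
        i += 1
    entries.append(current)

    expanded = []
    for entry in entries:
        if entry in ("$expts", "$"):
            expanded.extend(experiments)  # experiments in load order
        elif entry:
            expanded.append(entry)
    return expanded
```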

setpath commands cannot be used in .er.rc files.

addpath path_list

Append path_list to the current setpath settings.

addpath commands can be used in .er.rc files, and will be concatenated.

Commands Controlling the Data Space List

Dataspace commands apply only to hardware counter experiments where memoryspace/dataspace data was recorded, either by default or explicitly for precise counters on either Oracle Solaris x86 or SPARC systems. See the collect (1) man page for more information.

Dataspace data is only available for profile hits that occurred in functions that were compiled with the -xhwcprof flag. The -xhwcprof flag is available with the Studio C, C++ and Fortran compilers and is only meaningful on SPARC platforms; it is ignored on other platforms. See the compiler manual for further information.

data_objects

Write the list of data objects with their metrics.

data_single name [N]

Write the summary metrics panel for the named data object. The optional parameter N is needed for those cases where the object name is ambiguous.

data_layout

Write the annotated data object layouts for all program data objects with data-derived metric data, sorted by the current data sort metric values for the structures as a whole. Show each aggregate data object with the total metrics attributed to it, followed by all of its elements in offset order, each with their own metrics and an indicator of its size and location relative to 32-byte blocks, where:

<: Element fits block entirely.
/: Element starts a block.
|: Element is inside a block.
\: Element completes a block.
#: Element size requires multiple blocks.
X: Element spans multiple blocks but could fit within one block.
?: Undefined.

Commands Controlling the Memory Object Lists

Memory Object commands are applicable only to hardware counter experiments where memoryspace/dataspace data was recorded, either by default for precise counters on x86, or where backtracking was specified for non-precise counters on SPARC Oracle Solaris systems only. See the collect (1) man page for more information.

Memory Objects are components in the memory subsystem such as cache-lines, pages, memory-banks, and so on. The object is determined from an index computed from the virtual and/or physical address as recorded. Memory objects are predefined for virtual and physical pages, for sizes of 8 KB, 64 KB, 512 KB, and 4 MB. You can define others with the memobj_define command.

The following commands control the Memory Object Lists.

memobj mobj_type

Write the list of the memory objects of the given type with the current metrics. Metrics used and sorting are the same as for the Data Space List. The name mobj_type can also be used directly as the command.

memobj_list | mobj_list

Write the list of known types of memory objects, as used for mobj_type in the memobj command.

memobj_define | mobj_define mobj_type index_exp

Define a new type of memory objects with a mapping of VA/PA to the object given by the index_exp. The syntax of the expression is described under "EXPRESSION GRAMMAR", below. The mobj_type must not already be defined, and it cannot match any existing command, or any Index Object type (see below). Its name is case-insensitive, and must be entirely composed of alphanumeric characters or the '_' character, and begin with an alphabetic character. The index_exp must be syntactically correct. If not, an error is returned and the definition is ignored. If the index_exp contains any blanks, it must be surrounded by double quotes ("). The <Unknown> memory object has index -1, and the expression used to define a new memory object should support recognizing <Unknown>. For example, for VADDR-based objects, the expression should be of the form:

VADDR>255?<expression>:-1

and for PADDR-based objects, the expression should be of the form:

PADDR>0?<expression>:-1
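
To make the naming rule and the <Unknown> convention concrete, here is a Python sketch. It is written in Python, not in the er_print expression grammar. The 8 KB page computation is a hypothetical stand-in for <expression>, and both function names are made up for illustration.

```python
import re


def is_valid_type_name(name, existing_names):
    """Check the naming rule: a letter first, then letters, digits, or '_'.

    existing_names stands for the commands and object types already defined;
    the comparison is case-insensitive, as the rule above requires.
    """
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        return False
    return name.lower() not in {other.lower() for other in existing_names}


def vpage_8k_index(vaddr):
    """Python rendering of the form VADDR>255?<expression>:-1.

    The page number (vaddr >> 13) is a hypothetical choice of <expression>.
    """
    return (vaddr >> 13) if vaddr > 255 else -1
```

Returning -1 for small addresses is what lets such records be grouped under the <Unknown> memory object.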

memobj_drop mobj_type

Drop the memory object of the given type.

machinemodel model_name

Create memory objects as defined in the named machine model. The model_name is a file name, either in the user's current directory, or in the user's home directory, or it is the name of a machine model defined in the release. Machine model files are stored with a suffix of .ermm. If the model_name on the machinemodel command does not end with that suffix, the model_name with .ermm appended will be used. If the model_name begins with a /, it is assumed to be an absolute path, and only that path (with .ermm appended, if needed) will be tried. If the model_name contains a /, only that pathname relative to the current directory, or the user's home directory will be tried.
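
The name-resolution rules can be sketched in Python as shown below. The search order among the current directory, the home directory, and the release is an assumption, and release_dir is a hypothetical parameter because this page does not say where release-defined models are stored.

```python
import posixpath


def machine_model_candidates(model_name, cwd, home, release_dir):
    """List the files that a machinemodel command might try, in order."""
    if not model_name:
        raise ValueError("an empty model_name unloads the model instead")
    if not model_name.endswith(".ermm"):
        model_name += ".ermm"
    if model_name.startswith("/"):
        return [model_name]  # absolute path: only that path is tried
    candidates = [
        posixpath.join(cwd, model_name),
        posixpath.join(home, model_name),
    ]
    if "/" not in model_name:
        # Plain names may also refer to a model shipped with the release.
        candidates.append(posixpath.join(release_dir, model_name))
    return candidates
```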

A machine model file can contain comments and mobj_define commands. Any other commands are ignored. A machinemodel command can appear in a .er.rc file. If a machine model had been loaded, either by the command, or by reading an experiment with a machine model recorded in it, a subsequent machinemodel command will remove any definitions coming from the previous machine model.

If the model_name is missing, print a list of all known machine models. If the model_name is a zero-length string, unload any loaded machine model.

Commands Controlling the Index Object Lists

Index Object commands are applicable to all experiments. An Index Object is a class of objects that can be indexed by a formula that is computed from the header of a packet. Index objects are predefined for Threads, CPUs, Processes, Process-wide Samples, and Seconds, among others. You can get the full list with the indxobj_list command. You can define others with the indxobj_define command.

The following commands control the Index Object Lists.

indxobj type

Write the list of the index objects of the given type with the current index object metrics. Metrics used and sorting are the same as for the Functions list, but containing Exclusive metrics only. The name of the indxobj type can also be used directly as the command.

Use the following predefined index objects directly with er_print as commands without the preceding indxobj command. For example, er_print -threads test.1.er shows metrics for threads.

threads
cpus
samples
seconds
processes
experiment_ids
GCEvents

indxobj_list

Write the list of known types of index objects, as used for type in the indxobj command.

indxobj_define type index_exp

Define a new type of index objects with a mapping of packets to the object given by the index_exp. The syntax of the expression is described under "EXPRESSION GRAMMAR", below. The type must not already be defined, and it cannot match any existing command, or any Memory Object type (see above). Its name is case-insensitive, and must be entirely composed of alphanumeric characters or the '_' character, and begin with an alphabetic character. The index_exp must be syntactically correct. If not, an error is returned and the definition is ignored. If the index_exp contains any blanks, it must be surrounded by double quotes (").

Commands Controlling the Thread Analyzer Reports

races: Write the report on any detected groups of data-races in the experiment. Data-race reports are available only from experiments with data-race-detection data.
rdetail [ race_id ]: Write detailed information on the specified data-race. If race_id is given as all, write the detailed information for all data-races. Data-race reports are available only from experiments with data-race-detection data.
rsummary [ race_id ]: Same as rdetail.
deadlocks: Write the report on any detected deadlocks in the experiment. Deadlock reports are available only from experiments with deadlock-detection data.
ddetail [ deadlock_id ]: Write detailed information on the specified deadlock. If deadlock_id is given as all, write the detailed information for all deadlocks. Deadlock reports are available only from experiments with deadlock-detection data.
dsummary [ deadlock_id ]: Same as ddetail.

Commands Listing Experiments, Process-Wide Samples, Threads, and LWPs

experiment_list: Display the list of experiments that are loaded. Each experiment is listed with an index, which is used when selecting samples, threads, or LWPs, and a PID, which can be used for advanced filtering.
sample_list: Display the list of process-wide resource-utilization samples.
lwp_list: Display the list of LWPs processed during the experiment(s).
thread_list: Display the list of threads processed during the experiment(s).
cpu_list: Display the list of CPUs used during the experiment(s).

Commands Controlling Filtering of Experiment Data

There are two ways to specify filtering of experiment data, either by specifying a filter expression, which is evaluated for each data record to determine whether or not the record should be included, or with the older commands to select experiments, process-wide samples, threads, CPUs, and LWPs for filtering.

The filtering by expression is controlled by the following command.

filters filter_exp: filter_exp is an expression which evaluates as true for any data record that should be included, and false for records that should not be included.

The grammar of the expression in the filters command is described under "EXPRESSION GRAMMAR", below. If the expression contains any blanks, it must be surrounded by double quotes (").

describe: Print the list of tokens that can be used to build a filter expression.

The following old-style commands select the experiments, process-wide samples, threads, CPUs, and LWPs for which data is displayed.

sample_select sample_spec: sample_spec is a sample list, as described below.
lwp_select lwp_spec: lwp_spec is an LWP list, as described below.
thread_select thread_spec: thread_spec is a thread list, as described below.
cpu_select cpu_spec: cpu_spec is a CPU list, as described below.

Each of the lists above can be a single number, a range of numbers (n-m), a comma-separated list of numbers and ranges, or the explicit string "all". Each list can optionally be preceded by an experiment list with a similar format, separated from the list by a colon (:). Multiple lists can be concatenated, separated by a plus sign. Lists must not contain spaces. If no experiment list is included, the list applies to all experiments.

The experiment selection from any of the select commands applies to all four select targets: threads, LWPs, CPUs, and process-wide samples. Any existing selection of threads, LWPs, CPUs, or process-wide samples is retained for each experiment in the experiment list; if no experiment is specified, all experiments are selected. Selections are turned off for experiments that are not in the experiment list.

Some examples:

thread_select 1: Select thread 1 from all experiments.
thread_select all:1: Select thread 1 from all experiments.
thread_select all:1,3,5: Select threads 1, 3, and 5 from all experiments.
thread_select 1,2:all: Select all threads from experiments 1 and 2, as listed by experiment_list.
thread_select 1:1+2:2: Select thread 1 from experiment 1 and thread 2 from experiment 2.
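
The list syntax described above can be illustrated with a small Python sketch. This is not how er_print parses its arguments; the function names parse_numbers and parse_selection are hypothetical and only model the rules stated in this section (numbers, ranges, "all", an optional experiment list before a colon, and "+" to join lists).

```python
def parse_numbers(text):
    """Expand '1,3-5' into [1, 3, 4, 5]; the string 'all' is returned unchanged."""
    if text == "all":
        return "all"
    numbers = []
    for part in text.split(","):
        if "-" in part:
            low, high = part.split("-")
            numbers.extend(range(int(low), int(high) + 1))
        else:
            numbers.append(int(part))
    return numbers


def parse_selection(spec):
    """Split a spec such as '1:1+2:2' into (experiments, items) pairs."""
    if " " in spec:
        raise ValueError("lists must not contain spaces")
    pairs = []
    for piece in spec.split("+"):
        if ":" in piece:
            experiments, items = piece.split(":")
        else:
            # No experiment list given: the list applies to all experiments.
            experiments, items = "all", piece
        pairs.append((parse_numbers(experiments), parse_numbers(items)))
    return pairs


if __name__ == "__main__":
    for example in ("1", "all:1,3,5", "1,2:all", "1:1+2:2"):
        print(example, "->", parse_selection(example))
```

Each pair holds the experiment list first and the thread, LWP, CPU, or sample list second, mirroring the order in which they are written in a select command.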

Commands Controlling Load Object Function Expand/Collapse

object_show object1,object2,...

Set all named load objects to show all their functions. The names of the objects can be either full pathnames or the basename. If the name contains a comma character, the name must be surrounded by double quotes. If the string all is used to name the load object, functions are shown for all load objects.

object_hide object1,object2,...

Set all named load objects to hide all their functions. The names of the objects can be either full pathnames or the basename. If the name contains a comma character, the name must be surrounded by double quotes. If the string all is used to name the load object, functions are hidden for all load objects.

object_api object1,object2,...

Set all named load objects to show only the functions representing entry points into the library. The names of the objects can be either full pathnames or the basename. If the name contains a comma character, the name must be surrounded by double quotes. If the string all is used to name the load object, only the entry-point functions are shown for all load objects.

objects_default

Set all load objects according to the initial defaults from .er.rc file processing.

object_list

Display a list of all load objects. The first column shows the status: show if the load object's functions are shown in the function list (expanded), API-only if only the functions representing entry points into the load object are shown, or hide if its functions are not shown in the function list (collapsed). All functions of a collapsed load object map to a single entry in the function list that represents the entire load object.

The object_show, object_hide, or object_api commands are processed in the order encountered, and any subsequent command will override any previous commands for that load object (or all load objects).
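
The "later command overrides earlier command" rule can be modeled in a few lines of Python. This is only a sketch of the ordering behavior described above, not er_print code; apply_object_commands and the load object names are hypothetical, and every load object is assumed to start in the show state.

```python
def apply_object_commands(load_objects, commands):
    """Return the final status of each load object after applying commands in order.

    commands is a list of (status, names) pairs, where status is one of
    'show', 'hide', or 'API-only' and names is a list of load object names
    or ['all'].
    """
    state = {name: "show" for name in load_objects}  # assumed starting point
    for status, names in commands:
        targets = load_objects if names == ["all"] else names
        for name in targets:
            if name in state:
                # A later command replaces whatever an earlier one set.
                state[name] = status
    return state


if __name__ == "__main__":
    objects = ["a.out", "libc.so.1", "libm.so.2"]  # hypothetical names
    result = apply_object_commands(
        objects,
        [("hide", ["all"]), ("show", ["a.out"]), ("API-only", ["libc.so.1"])],
    )
    for name, status in result.items():
        print(status, name)
```

Hiding everything first and then showing selected objects is the kind of sequence where the ordering matters: reversing the two commands would leave all objects hidden.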

object_select object1,object2,...

Set the list of active load objects. Functions from all named load objects are shown; functions from all others are hidden. The names of the objects can be either full pathnames or the basename. If the name contains a comma character, the name must be surrounded by double quotes.

If functions from a load object are shown, all functions that have non-zero metrics are shown in the function list. If functions from a load object are hidden, its functions are collapsed, and only a single line with metrics for the entire load object is shown instead of its individual functions.

By default, show functions from all load objects, except the <Unknown> object.

Commands That List Metrics

metric_list: Display the currently selected function list metrics and a list of metrics and keyword names that can be used to reference them in other commands, such as sort. The format of the metric keywords is described under metrics. The available metrics depend on the data collected. See the collect (1) man page.
cmetric_list: Display the currently selected caller-callee metrics and a list of metrics and keyword names for the callers-callees report. Display the list in the same way as the metric_list output, but include attributed metrics additionally.
data_metric_list: Display the currently selected data-derived metrics and a list of metrics and keyword names for all data-derived reports. Display the list in the same way as the metric_list output, but include only those metrics that have a data-derived flavor, and static metrics.
indx_metric_list: Display the currently selected index object metrics and a list of metrics and keyword names for all index object reports. Display the list in the same way as the metric_list output, but include only those metrics that have an exclusive flavor, and static metrics.

Commands Controlling the Output

outfile filename

Close any open output file and open filename for subsequent output. When opening filename, clear any preexisting content. If filename is a minus sign (-), write output to stdout. If filename is a pair of minus signs (--), write output to stderr.

appendfile filename

Close any open output file and open filename, preserving any preexisting content, so that subsequent output is appended to the end of the file. If filename does not exist, the functionality of appendfile is the same as for outfile.

limit n

Limit any output to the first n entries of the report, where n is an unsigned integer. If n is zero, remove any limit. If n is omitted, print the current limit.

name { long | short | mangled } [:{soname | nosoname }]

Use the long, short, or mangled form of C++ function names. If soname is specified, append the shared-object name to the function name.

viewmode { user | expert | machine }

Set the viewing mode to user, expert or machine.

For Java experiments, user mode shows Java call stacks for Java threads, and does not show housekeeping threads. The function list includes a function <JVM-System>, representing aggregated time from non-Java threads. When the JVM software does not report a Java call stack, time is reported against the function <no Java callstack recorded>. Lists of functions, PCs, etc., do not include compiled methods. All variants of a method are aggregated together. The Disassembly view should show Java bytecodes.

Expert mode shows Java call stacks for Java threads when your Java code is being executed, and a machine call stack when JVM code is being executed or when the JVM software does not report a Java call stack. It shows machine call stacks for non-user-Java (housekeeping) threads.

Machine mode shows machine call stacks for all threads. Lists of functions, PCs, etc., include only compiled methods. Each variant of a method is listed separately. The Disassembly view should show native-code instructions.

For OpenMP experiments, user mode shows reconstructed call stacks similar to those obtained when the program is compiled without OpenMP.

Expert mode exposes compiler-generated functions representing parallelized loops, tasks, and so on, which are aggregated with user functions in user mode. In both modes, special functions, with names of the form <OMP-*>, are shown when the OpenMP runtime is performing certain operations. Functions from the OpenMP runtime code (libmtsk.so) are suppressed.

In machine mode, the actual native stacks as captured are shown.

For all other experiments, all three modes show the same data.

compare [on|off|delta|ratio]

Set comparison mode off (compare off, the default), or on (compare on), or delta (compare delta) or ratio (compare ratio). If comparison mode is off, when multiple experiments are read the data is aggregated. If comparison is enabled, when multiple experiments are loaded separate columns of metrics are shown for the data from each experiment. If comparison mode is delta, the base experiment shows absolute metrics but the comparison experiment shows differences between it and the base. If comparison mode is ratio, the comparison experiment shows ratios between it and the base.

Comparison mode will treat each experiment or experiment-group as a separate compare group. The first experiment or experiment-group argument is the base group. If you want to include more than one experiment in a compare group, you must create an experiment-group file to use as a single argument to er_print.
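
As a rough illustration of the delta and ratio modes, the following Python sketch computes the value pair that would be shown for one metric. The direction of the arithmetic (comparison minus base, comparison divided by base) and the assumption that the base column stays absolute in ratio mode are readings of the description above, not confirmed behavior; comparison_columns is a hypothetical helper.

```python
def comparison_columns(base, other, mode):
    """Return (base_column, comparison_column) for a single metric value.

    Assumes delta means other - base and ratio means other / base.
    A base value of zero is not handled in ratio mode.
    """
    if mode == "on":
        return base, other
    if mode == "delta":
        return base, other - base
    if mode == "ratio":
        return base, other / base
    raise ValueError("mode must be 'on', 'delta', or 'ratio'")


if __name__ == "__main__":
    base_seconds, other_seconds = 2.0, 3.0  # made-up metric values
    for mode in ("on", "delta", "ratio"):
        print(mode, comparison_columns(base_seconds, other_seconds, mode))
```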

printmode string

Set the print mode from the string. If the string is text, printing will be done as in previous versions -- tabular form. If the string is a single character, printing will be done as a delimiter-separated list, with the single character as the delimiter. If the string is html, printing will be formatted for an HTML table. Any other string is invalid, and the command will be ignored.

The printmode setting is used only for those commands that generate tables (functions, memobj, indxobj, and so on); the setting is ignored for other printing commands.
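
The rules for interpreting the printmode string can be summarized in a short Python sketch. The function classify_printmode and its return labels are hypothetical; only the three accepted forms (text, html, a single delimiter character) come from the description above.

```python
def classify_printmode(value):
    """Map a printmode string to (kind, delimiter) following the rules above."""
    if value == "text":
        return ("table", None)
    if value == "html":
        return ("html", None)
    if len(value) == 1:
        # Any single character is taken as the delimiter.
        return ("delimited", value)
    # Any other string is invalid; er_print ignores the command.
    return ("invalid", None)


if __name__ == "__main__":
    for candidate in ("text", ",", "|", "html", "csv"):
        print(repr(candidate), classify_printmode(candidate))
```

A comma or a vertical bar as the delimiter produces output that is convenient to load into a spreadsheet or to split in a script.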

Commands That Print Other Information

header [ exp_id ]

Write descriptive information about the specified experiment. exp_id is the numeric identifier of the experiment as given by the experiment_list command. If exp_id is all or is omitted, write the headers of all experiments. Following each header, print any errors or warnings. Separate headers for each experiment by a line of dashes.

If the experiment directory contains a file named notes, the contents of that file are prepended to the header information.

objects

List the load objects with any error or warning messages that result from the use of the load object for performance analysis.

overview

Write an overview of all data summed over all experiments. Function list metrics are indicated with [X] while hot metrics have asterisks highlighting their values.

sample_detail [ exp_id ]

Write the details of all the process-wide resource-utilization samples for the specified experiment. exp_id is the numeric identifier of the experiment as given by the experiment_list command. If exp_id is omitted, or is all, write the sum and the statistics for all samples in all experiments.

The report now generated by sample_detail used to be printed with the overview command.

statistics [ exp_id ]

Write the execution statistics data for the specified experiment, aggregating data over the current set of process-wide resource utilization samples. exp_id is the numeric identifier of the experiment as given by the experiment_list command. If exp_id is omitted, write the sum across all experiments, as selected by their samples. If exp_id is all, write the sum and the statistics for the selected samples in each experiment.

ifreq

Write the instruction-execution summary. Instruction frequency reports are available only from experiments with count data.

Commands for Experiments

These commands are for use in scripts and interactive mode only. They are not allowed on the command line.

add_exp exp_name: Add the named experiment or experiment group to the current session.
drop_exp exp_name: Drop the named experiment from the current session.
open_exp exp_name: Drop all loaded experiments from the session, and then load the named experiment or experiment group.

Default-setting Commands

Defaults for many of the reports in the er_print utility and the er_src utility can be set in a resource file named .er.rc. The er_print utility processes a system-wide .er.rc file, then a .er.rc file in your home directory, if present, then a .er.rc file in the current directory, if present. Directives read from each file take precedence over directives read previously.
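
The precedence rule can be pictured as successive dictionary updates. The Python sketch below is only a model: effective_settings is a hypothetical helper, each file is represented as a dictionary of command name to argument, and the values are made up.

```python
def effective_settings(system, home, current):
    """Merge directives in processing order; later files override earlier ones."""
    settings = {}
    for directives in (system, home, current):
        settings.update(directives)
    return settings


if __name__ == "__main__":
    system_rc = {"viewmode": "user", "printmode": "text"}
    home_rc = {"viewmode": "expert"}
    current_rc = {"printmode": ","}
    print(effective_settings(system_rc, home_rc, current_rc))
```

This simple override model does not apply to dmetrics and dsort, whose specifications are concatenated across the files rather than replaced.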

The er_print utility and the er_src utility print a message to stderr naming the path to the .er.rc files that are processed.

These files can contain scc, sthresh, dcc, dthresh, addpath, pathmap, name, mobj_define, indxobj_define, object_show, object_hide, object_api, compare, printmode, machinemodel, and viewmode commands, as described above. They can also contain the following commands, which cannot be used on either the command line or in scripts:

dmetrics metric_spec

Specifies the default order and visibility of metrics. Multiple dmetrics commands can be given in any .er.rc file, and are concatenated. dmetrics settings from the various files are concatenated in the order: current directory, your home directory, and system.

The metric_spec is described under the metrics command above with the following additions:

The <visibility> can be "!" which means that no version of the metric is visible. This allows you to specify the order of a metric without making it visible by default.

Two generic metric names can be specified. "hwc" means any hardware counter metric, and "any" means any metric at all.

For all metrics computed from the experiments that have been loaded, the concatenated list of all dmetrics is scanned for a match. The first matching entry determines both the visibility and the ordering of the metrics in the function list and the callers-callees list.

dsort metric_spec

Specifies the metric to be used by default for sorting the function list. The first metric in the dsort specification that matches any metric in the experiment(s) is used to determine the sort metric, subject to the following conditions. If the entry in the dsort metric_spec has a <visibility> of "!", the first metric whose name matches is used, whether it is visible or not; if any other setting of <visibility> is used, the first visible metric whose name matches is used. Like dmetrics, dsort specifications from the various .er.rc files are concatenated in the order: current directory, your home directory, and system.

en_desc option

Set the mode for reading descendant experiments according to option. The allowed values of option are:

on: Show all experiments on descendant processes (the default)
off: Do not show any experiments on descendant processes
=<regex>: Show those experiments on descendant processes whose lineage or executable name matches the regular expression.

If any experiments that are loaded have descendants that are not loaded as a result of the en_desc setting, a message is printed from er_print.

Miscellaneous Commands

procstats: Print the accumulated statistics from processing data.
script script: Process commands from the named script.
version: Print the current release version of er_print.
quit: Exit interactive mode. If used in a script, no more commands from that script are processed.
exit: An alias for quit.
help: Print help information.
# ...: Comment line; used in scripts or a .er.rc file.

EXPRESSION GRAMMAR

A common grammar is used for an expression defining a filter and an expression used to compute a memory object index.

The grammar specifies an expression as a combination of operators and operands. For filters, if the expression evaluates to true, the packet is included; if the expression evaluates to false, the packet is excluded. For memory objects, the expression is evaluated to an index that defines the particular memory object referenced in the packet.

Operands in an expression are either constants, or fields within a data record, as listed with the describe command. The operands include THRID, LWPID, CPUID, USTACK, XSTACK, MSTACK, LEAF, VIRTPC, PHYSPC, VADDR, PADDR, DOBJ, TSTAMP, SAMPLE, EXPID, PID, or the name of a memory object. Operand names are case-insensitive.

USTACK, XSTACK, MSTACK represent the function call stacks in user view, expert view, and machine view, respectively.

VIRTPC, PHYSPC, VADDR, and PADDR are non-zero only when "+" is specified for hardware-counter profiling or clock profiling. Furthermore, VADDR is less than 256 when the real virtual address could not be determined. PADDR is zero if VADDR could not be determined, or if the virtual address could not be mapped to a physical address. Likewise, VIRTPC is zero if backtracking failed, or was not requested, and PHYSPC is zero if either VIRTPC is zero, or the VIRTPC could not be mapped to a physical address.

Operators include the usual logical operators and arithmetic (including shift) operators, in C notation, with C precedence rules, and an operator for determining whether an element is in a set (IN) or whether any or all of a set of elements is contained in a set (SOME IN or IN, respectively). An additional operator, ORDERED IN, determines whether all elements from the left operand appear in the same sequence in the right operand. Note that the IN operator requires all elements from the left operand to appear in the right operand but does not enforce the order. If-then-else constructs are specified as in C, with the ? and : operators. Parentheses should be used to ensure proper parsing of all expressions. On the er_print command line, the expression cannot be split across lines. In scripts, or on the command line, the expression must be inside double quotes if it contains any blanks.
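
The difference between the three set operators can be sketched in Python. The helper names are hypothetical, and the text does not say whether ORDERED IN requires the elements to be adjacent; this sketch assumes an in-order subsequence is enough.

```python
def is_in(left, right):
    """IN: every element of left appears somewhere in right, in any order."""
    return all(item in right for item in left)


def some_in(left, right):
    """SOME IN: at least one element of left appears in right."""
    return any(item in right for item in left)


def ordered_in(left, right):
    """ORDERED IN: elements of left appear in right in the same order.

    Assumption: gaps between the matched elements are allowed.
    """
    remaining = iter(right)
    # 'item in remaining' advances the iterator, so matches must occur in order.
    return all(item in remaining for item in left)


if __name__ == "__main__":
    stack = [100, 314, 200, 272]  # made-up function IDs
    print(is_in([272, 314], stack))       # order is not enforced
    print(ordered_in([314, 272], stack))  # same order as in the stack
    print(ordered_in([272, 314], stack))  # reversed order
    print(some_in([1, 272], stack))       # one of the two is present
```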

Filter expressions evaluate to a boolean, true if the packet should be included, and false if it should not be included. Thread, LWP, CPU, experiment-id, process pid, and sample filtering are based on a relational expression between the appropriate keyword and an integer, or using the IN operator and a comma-separated list of integers.

Time-filtering is used by specifying one or more relational expressions between TSTAMP and a time, given in integer nanoseconds from the start of the experiment whose packets are being processed. Times for process-wide resource-utilization samples can be obtained from the sample_detail command. Times in the sample_detail command are given in seconds, and must be converted to nanoseconds for time-filtering. Times can also be obtained from Performance Analyzer's Timeline view.
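
Because sample times are reported in seconds while TSTAMP is in nanoseconds, a small helper can do the conversion. The Python sketch below builds a filter string for a half-open time window; time_filter is a hypothetical name, and the resulting string would still need to be passed to the filters command inside double quotes.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def time_filter(start_seconds, end_seconds):
    """Build a TSTAMP filter expression from times given in seconds."""
    start_ns = round(start_seconds * NANOSECONDS_PER_SECOND)
    end_ns = round(end_seconds * NANOSECONDS_PER_SECOND)
    return f"TSTAMP >= {start_ns} && TSTAMP < {end_ns}"


if __name__ == "__main__":
    print(time_filter(5, 9))
    print(time_filter(1.25, 2.5))
```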

Function filtering can be based either on the leaf function, or on any function in the stack. Filtering by leaf function is specified by a relational expression between the LEAF keyword and an integer function id, or using the IN operator and the construct FNAME("<regex>"), where <regex> is a regular expression, as specified on the regex(5) man page. The entire name of the function, as given by the current setting of name, must match.
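
The requirement that the entire function name match is easy to overlook. The following Python sketch illustrates it with re.fullmatch; Python's regular expression syntax is not identical to regex(5), so this only demonstrates the whole-name rule, and fname_matches is a hypothetical helper.

```python
import re


def fname_matches(pattern, function_name):
    """Return True only if the pattern matches the whole function name."""
    return re.fullmatch(pattern, function_name) is not None


if __name__ == "__main__":
    print(fname_matches("myfunc", "myfunc"))           # exact name
    print(fname_matches("myfunc", "myfunc_helper"))    # partial match is rejected
    print(fname_matches("myfunc.*", "myfunc_helper"))  # wildcard covers the rest
```

A pattern intended to match a family of names therefore needs an explicit wildcard such as .* at the appropriate end.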

Filtering based on any function in the call stack is specified by determining if any function in the construct FNAME("<regex>") is in the array of functions represented by the keyword USTACK:

(FNAME("myfunc") SOME IN USTACK)

Data object filtering is analogous to stack function filtering, using the DOBJ keyword and the construct DNAME("<regex>") enclosed in parentheses.

Memory object filtering is specified by using the name of the memory object, as shown in the mobj_list command, and the integer index of the object, or the indices of a set of objects. (The <Unknown> memory object has index -1.)

Data object filtering and memory object filtering are meaningful only for hardware counter packets with memoryspace/dataspace data; all other packets are excluded under such filtering.

Direct filtering of virtual addresses or physical addresses is specified by a relational expression between VADDR or PADDR, and the address.

Memory object definitions (the mobj_define command) use an expression that evaluates to an integer index, using either the VADDR keyword or PADDR keyword. They are applicable only to hardware counter packets for memory counters and memoryspace/dataspace data. The expression should return an integer, or -1 for the <Unknown> memory object.

Example Filter Expressions

To see events from thread 1 when it was running on CPU 2 only, use:

THRID == 1 && CPUID == 2

If an index object, THRCPU, is defined as "CPUID<<16|THRID", the following filter is equivalent to the one above:

THRCPU == 0x20001
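
As a cross-check of the packed value, the index expression can be evaluated directly. In C precedence (and in Python) the shift binds tighter than the bitwise OR, so CPUID<<16|THRID is (CPUID<<16)|THRID. The helper names below are hypothetical, and unpack assumes the thread ID fits in 16 bits.

```python
def thrcpu(cpuid, thrid):
    """Evaluate CPUID<<16|THRID; the shift is applied before the OR."""
    return cpuid << 16 | thrid


def unpack(index):
    """Recover (cpuid, thrid), assuming thrid occupies the low 16 bits."""
    return index >> 16, index & 0xFFFF


if __name__ == "__main__":
    packed = thrcpu(cpuid=2, thrid=1)  # thread 1 running on CPU 2
    print(hex(packed))
    print(unpack(packed))
```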

To filter events from experiment 2 that occurred during the period between second 5 and second 9:

EXPID==2 && TSTAMP >= 5000000000 && TSTAMP < 9000000000

To filter events that have any method from a particular Java class in the stack (in user view mode):

FNAME("myClass.*") SOME IN USTACK

If function IDs are known (for example, from the Analyzer GUI), to filter events that contain a particular call sequence in the machine call stack:

(314,272) ORDERED IN MSTACK

If the describe command lists the following properties for a clock profiling experiment:

MSTATE    UINT32  Thread state
NTICK     UINT32  Duration

then you can specify a filter to retain events in a particular state:

MSTATE == 1

If you want to filter events for duration:

MSTATE == 1 && NTICK > 1

COMPARISON MODE

When er_print is invoked on more than one experiment or experiment group, it aggregates the data. If the compare command is used to enable comparison, the function list shows separate columns of metrics for each experiment or group, so that the data can be compared.

Creating Experiment Groups

To create an experiment group, create a plain text file whose first line is as follows:

#analyzer experiment group

Then add the names of the experiments on subsequent lines. The file extension of the experiment group text file must be .erg. You can use experiment groups also to load only specific descendant experiments if you want to isolate their data away from their founder experiment.
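
An experiment group file can also be generated from a script. The Python sketch below writes the required first line followed by one experiment name per line; the function names and the experiment and file names in the example are hypothetical.

```python
from pathlib import Path

HEADER = "#analyzer experiment group"


def experiment_group_text(experiments):
    """Return the file content: the header line, then one experiment per line."""
    lines = [HEADER]
    lines.extend(experiments)
    return "\n".join(lines) + "\n"


def write_experiment_group(path, experiments):
    """Write an experiment group file; the name must end in .erg."""
    path = Path(path)
    if path.suffix != ".erg":
        raise ValueError("experiment group files must use the .erg extension")
    path.write_text(experiment_group_text(experiments))
    return path


if __name__ == "__main__":
    # Hypothetical experiment names, for illustration only.
    group = write_experiment_group("runs.erg", ["test.1.er", "test.2.er"])
    print(group.read_text(), end="")
```

The resulting file can then be given to er_print as a single experiment-list argument, which is also how several experiments are placed in one compare group.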

COMPATIBILITY

er_print will only work on experiments recorded with the current version of the tools. It will report an error for experiments recorded with any other version. You should use the version of er_print from the release with which the experiment was recorded.