Aprobe FAQ

Frequently Asked Questions for Aprobe (All Platforms) Updated May 2017

This document describes aspects of the "Aprobe" product from OC Systems, Inc. (www.ocsystems.com).

It consists of questions asked by evaluators and customers, as well as "artificial" questions intended to provide an introduction to the use of the products.

More complete and detailed descriptions of RootCause are provided by the User's Guides for those products, but this FAQ may provide answers not easily found there, and also includes specific code examples not applicable to a general User's Guide.

Aprobe underlies RootCause, which provides support for constructing sets of probes which can be deployed remotely. See "What is RootCause?" for more information.

Users are encouraged to send questions (and answers!) to support@ocsystems.com.


Note to Windows and Solaris Users:

The last updates to RootCause/Aprobe for the Windows and Solaris platforms were version 2.1.4b/4.3.4b in mid-2006. Support for these platforms was officially dropped in 2011. A recent update of this FAQ has removed all questions and answers that are specific to those platforms. If by some unlucky chance you're still using them, here is the old version of the FAQ (rc_aprobe_faq-2007.html).


Note to 64-bit RootCause/Aprobe Users:

Wherever you read APROBE in the questions and answers below, replace it with APROBE64. Different file names and environment variables must be used to allow both 32- and 64-bit versions to co-exist.

This FAQ is Copyright (c) 2017 by OC Systems, Inc. ALL RIGHTS RESERVED.

Aprobe

What is Aprobe?

Aprobe is a suite of tools and libraries which support dynamic modification and extension of a program by dynamically patching the program executable and/or shared libraries.

A dictionary defines "Probe" as "Device for exploring an otherwise inaccessible place or object." "Aprobe" stands for "Algorithmic Probe". It is hence a tool for exploring your program with the help of user-written algorithmic probes. These probes are installed into your program with the help of OC Systems' patented "dynamic action linking" technology.

A user runs a program with the "aprobe" tool, indicating that certain "probes" are to be patched into the program and executed as the program itself runs.

A "probe" consists of "actions" composed in C, with some special syntax added to indicate where in the program the actions are to be invoked.

There are a number of predefined probes included in Aprobe; there is a tool to generate simple probes directly from a linked or unlinked object file; or the user may easily compose his own probes in a simple extension of the C language.

See also "What is RootCause?"

What is ProbePak?

ProbePak was an experiment in introducing users to the power of Aprobe and RootCause by making a subset available for free download. It didn't work out, and ProbePak is no longer supported. See the main page www.ocsystems.com for information on our current products.

What are some potential uses of Aprobe?

Read more about uses of Aprobe in the Product section of the web site or read the white papers in the Resources section. See also "What are some potential uses of RootCause?"

How do I get started quickly with Aprobe?

The best way to get started writing probes is to look at examples, and make some small changes.

If you have RootCause and have been using the GUI, you can use the Custom... button in the Trace Setup window to generate a probe, and look at that. If that looks too daunting, or you want a more tutorial approach, try the graduated examples in the $APROBE/examples (or $APROBE/ada_examples) and $APROBE/demo/Aprobe subdirectories of the Aprobe installation. Check out $APROBE/examples/evaluate/README.

Who can use Aprobe?

Technical people who are developing, testing, and maintaining software.

What different versions of Aprobe are there?

The current version of Aprobe is 4.4.6d, released in May 2017.

The original version of Aprobe is version 2 for AIX, included as part of OC Systems' LegacyAda/OATS product, and in earlier versions of OC Systems' "PowerAda" product. While it shares the "probe" concept with the newer versions, the user interface and details of Aprobe Version 2 differ substantially from Versions 3 and 4.

For which platforms is Aprobe available?

Same as those for RootCause: See "For which platforms is RootCause available?".

How do I get Aprobe?

E-mail , and we will arrange for you to receive the software.

What documentation is available for Aprobe?

The new Aprobe User Guide is available online in this DocsWiki: Aprobe User Guide.

There is a series of graduated examples that come with their own text documentation in the $APROBE/examples and $APROBE/demo subdirectories of the Aprobe installation. You should read $APROBE/examples/evaluate/README and try at least some of the examples under that directory before trying Aprobe on your own application or looking through this FAQ for answers.

What tools make up Aprobe?

  • apcgen - generates APC for some or all functions in the specified object file(s)
  • apc - compiles and links the specified APC file(s) into a UAL.
  • aprobe - runs the specified program after loading and applying patches in the specified UAL(s).
  • apformat - formats any data logged in the specified APD file(s).

These tools are described further in other questions below. A number of additional tools and scripts for specific situations are also provided. See Tools Reference.

How is Aprobe licensed?

Same as RootCause. See "How is RootCause licensed?".

Is there a point-and-click (GUI) interface to Aprobe?

Yes. It's called RootCause. See "What is RootCause?".

Also, some predefined probes (see "Using Predefined Probes" below) include a Java GUI to specify configuration parameters for that probe.

Can I run Aprobe on any executable program file?

Yes. You can run aprobe (without any probes) on any application at all unless:

  • It is a secure application which a debugger doesn't have authority to attach to. In this case you should get a clear explanatory message.
  • The application does something very strange like replacing some low-level system routines with its own versions that do something different.
  • There's a bug in Aprobe.

If you find that using aprobe causes your application to crash, you should try running aprobe without any probes. If it still crashes, it should be reported as a bug to support@ocsystems.com.

A slightly different question is, "Can I use Aprobe to put probes on any program?" To actually apply probes to a native module, there are three basic requirements:

  • Symbols

For Aprobe to do what it does, it must be able to figure out where the subroutines you are trying to probe have been linked and loaded. We call this location information "symbols". All symbolic debuggers have the same requirement. See "How do I tell what symbols a program has available?".

The symbols may be as originally added to the application (i.e., not stripped; see Q12.16), or they may have been saved separately by Aprobe using apmkadi (see apmkadi).

Most programs delivered with the operating system, and off-the-shelf software, are stripped, so you can't use Aprobe directly on the application code, but you can generally probe the shared libraries (DLLs) that support them.

  • Standard Call/Return behavior

If the program uses a mechanism that transfers control other than by the normal call and return mechanism, such as setjmp/longjmp or an unsupported exception mechanism, and there is an active probe at the time of that non-standard transfer of control, the program will likely crash.

  • Supported exception mechanism

Ada and C++ (and Java, but that's a separate issue) support exceptions, which are non-standard transfers of control. Each compiler does this in a different way, and each must be explicitly supported by the Aprobe runtime. See "What compiler(s) must have been used to compile my program?".

In what language(s) can my program be written?

Same as for RootCause. See "In what language(s) can my program be written?"

What compiler(s) must have been used to compile my program?

Same as for RootCause. See "What compiler(s) must have been used to compile my program?" in the RootCause FAQ.

How do I tell if a program file is "stripped"?

Use the "file" command, e.g.:

AIX:

$ file a.out
a.out:      executable (RISC System/6000) or object module not stripped
$ file /bin/ls
/bin/ls:    executable (RISC System/6000) or object module

Linux:

$ file a.out
a.out: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses shared libs), not stripped
$ file /bin/ls
/bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses shared libs), stripped
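
If you want to automate this check, the following is a minimal C sketch that classifies one line of "file" output. It assumes, based only on the AIX and Linux samples above, that the phrase "not stripped" appears exactly when symbols are still present; the function name is hypothetical and not part of Aprobe.

```c
#include <string.h>

/* Returns 1 if one line of `file` output reports that symbols are present.
 * Assumption: the line contains "not stripped" exactly when the symbol
 * table remains, as in the AIX and Linux samples shown above. Note that
 * the AIX sample for a stripped file simply omits the phrase. */
int file_output_has_symbols(const char *file_output)
{
    return file_output != NULL && strstr(file_output, "not stripped") != NULL;
}
```

Searching for the whole phrase "not stripped" rather than the word "stripped" matters, because the word alone appears in both the stripped and unstripped Linux output.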

How do I tell what symbols a program has available?

The apcgen tool will list the Aprobe function symbols in any compiled object module, for example:

apcgen -L C:\WinNT\system32\kernel32.dll
apcgen -L /usr/lib/libc.so
apcgen -L /work/programs/prog.exe

There are other apcgen options such as -m to show "mangled" names and -v to show file names--use apcgen -h for usage.

The RootCause Trace Setup window shows a tree of all the functions organized by module, directory and file, using the same mechanism used by apcgen.

If you want information about data symbols, or want to confirm that a function may actually be probed, you can use the apinfo command, which runs the info predefined probe. This only works on executable programs. For example:

apinfo -d /work/programs/prog.exe

will show all the global and file-static data symbols found when prog.exe is loaded by aprobe. There are lots of other options: use apinfo -h to see them.

What do I do to get symbols in my program?

On Unix, every program has its symbols unless they're explicitly stripped (see "How do I tell if a program file is "stripped"?").

What do I do to get "debug information" in my program?

This is documented in the RootCause User's Guide, "Building a Traceable Application", and in the Aprobe User's Guide "Writing APC Probes", but it's summarized here:

  • For C/C++ compilers, compile with -g.
  • With PowerAda you get debug information by default, but you need the PowerAda program library available just as you would for adbg.

In addition to compiling with the right option to generate the debug information, you also must retain that information and have it available where it's supposed to be:

  • For gcc-based compilers, including GNAT, and IBM's C and C++ compilers, debug information is collected at link-time into the executable, and is retained unless you explicitly use the "strip" command.
  • PowerAda line information is recorded in the executable and can't be stripped, so you don't need any debug information at run-time. However, for `apc' and the RootCause GUI, you need the Ada program library, which must be consistent with the executable and available at the same location recorded in the executable. If the library is moved, you can specify its location with the environment variable APROBE_POWERADA_LIBRARY, for example:
export APROBE_POWERADA_LIBRARY=/builds/old/prog1/adalib

How do I tell if a program file has "debug information"?

The apcgen command will list those functions that have debug information associated with them:

apcgen -Ld a.out

This should be all you need, but there are some system utilities that look in the object files themselves that may also be used:

  • On Linux you should find objdump:
objdump -W a.out | grep "DW_TAG_subprogram" | awk '{ print $NF }'
  • On AIX, you can use the dump utility with the -t option to dump symbol information, including debug "stab" strings. For example:
dump -t a.out | grep ":F"
will show the functions that have debug information.

What is a "probe"?

A "probe" is a "user action" associated with a specific location in a program. The user action is executed whenever control passes through the location with which it is associated. A "probe" is described in an extension of C called "APC", for example:

probe thread
{
  probe "foo"
  {
    on_entry
    {
      printf("Entering foo.\n");
    }
  }
}

The block following the "on_entry" is the "user action". The syntax surrounding it describes exactly where and when the action should be executed: immediately upon entering function "foo()" in each thread.

What is a "UAL" (.ual file)?

A UAL is a "User Action Library". It is the output of the "apc" command, and is a shared library consisting of the object code generated from your apc files. Not just any shared library (DLL) may be used as a UAL, and a UAL may not be renamed after creation, because it has specially-named entry points based on its filename which are called by the Aprobe runtime to perform initialization.

What is "logging"?

With respect to Aprobe, "logging" means "writing data to a file for later analysis." Aprobe provides a built-in logging facility that allows saving raw data in a time- and space-efficient way, and using "apformat" to display the logged data later. See "Logging Data" for related questions.

What is an ".apd" file?

An ".apd" file is one that contains the data generated by a program run under aprobe. These are binary files which are read with the "apformat" tool.

There is always a ".apd" file generated giving aprobe invocation information, even if no "log" statements are executed. If log statements were executed there will be a "-1.apd" file, and maybe "-2.apd" files as well.

What can't I do if my executable or library doesn't have debug information?

You can't reference source-level information in your probes. It's just like using a source level debugger in this respect, and for the same reason. A good rule is, if the debugger can print the value of a variable x at line 15, then you can do "on_line(15) log($x)" in your probe.

More specifically, you need to specify "-x exe_or_library " on the apc command, and the exe_or_library must contain debugging information, if you use a construct in your probe that cannot be resolved without specific debug information from the program. Such constructs are:

(a) target expressions: names from the probed program preceded by $, or $* ($1, $2 are ok, as are hardware-register references starting with '$$').

and

(b) references to specific source lines;

Note that there are lots of probes you can write; for example, all but one of the predefined probes provided with Aprobe will work fine in the absence of debug information, and the one that does require it (coverage) does so in order to get source line number information.

Does use of on_line() require the application to have debug information?

Yes, but things aren't that simple. To build a probe that requires debug information (including line information), the debug information must be available when the probe is compiled. However, the debug information can then be stripped and the probe run against the stripped executable.

For the symbol table, the necessary symbols must be present at runtime, either in the application (or application libraries) or in a .adi file, which is generated with the tool apmkadi. That tool allows you to capture the symbol table in an internal form and then strip the executable.

Also, PowerAda programs always contain source line information -- this is not considered debug information.

Finally, for low-level hacking, you can instrument specific offsets using on_offset.

What is the maximum number of probes allowed?

For probes you are just limited by paging space. For UALs there is a more practical limit - we limit the total number of modules to 255 and that includes UALs.

Is there access to C++ private/protected variables?

Yes, if it's in the debug information we can see it. We don't look at whether the debug information says it's private, protected or public - we just use it.

Is there any way to attach with Aprobe to a running application?

No. This question is very frequently asked. It sounds great in theory, but in practice Aprobe is a tool for tracking problems that have yet to happen, not those that have just happened. There is also quite a bit of work done by Aprobe when an application starts up; often doing this to a running application is as big an issue as restarting the application. Finally, for Java you wouldn't be able to change the classpath to see our classes or intercept classes that have already loaded.

Is there a way to probe a function for which no symbol is available?

Yes, if you know its address and size, you can define a symbol for it using ap_RecordDynamicFunctionSymbol() in the Aprobe Runtime Library and then apply probes using the defined symbol.

Here are example C and APC files illustrating how to use it.

defsym.c

 #include <stdio.h>
 #include <string.h>
 
 static char *image(char *s)
 {
    char *s1 = strdup(s);
    return s1;
 }
 
 int main (void)
 {
   printf (image("Hello\n"));
   return 0;
 }
 

defsym.apc

 //---------------------------------------------------------------------------
 // Define Dynamic Function Symbol Example
 //
 // This is an example of using ap_RecordDynamicFunctionSymbol()
 // to define symbols when no debug information is available.
 //
 // NOTE:  If the offset for symbols is wrong the program will
 // likely crash because you will have directed Aprobe to instrument
 // the wrong piece of code.
 //---------------------------------------------------------------------------
 
 #include "aprobe.h"
 
 // To define your symbols early enough to be instrumented and
 // probed, you have to define them from a UAL initialize function.
 // The initial part of the name must be InitializeUal_, and the first
 // character following that must be lower in the ASCII collating order
 // than the first character of the UAL name. '0' is the lowest legal
 // character.
 
 void InitializeUal_0_defsym_apc()
 {
    // In this example I just define an alias for the symbol "main"
    // and probe that instead.  You have to know the correct offset
    // and size of the function (though size is not so critical).
    // The offset is the offset in the module, not just the text
    // section.
    ap_SymbolIdT NewSym =
       ap_RecordDynamicFunctionSymbol (
          ap_ApplicationModuleId(),
          "MyAliasForMain",
          ap_ExternSymbol,
          ap_IntegerToOffset(0x10),
          0x1d,
          0);
    if (ap_IsNoSymbolId(NewSym))
    {
       printf("Couldn't define symbol...\n");
    }
 }
 
 probe thread
 {
    // You'll get a warning about the symbol not being defined
    // when you compile this with apc, but it's OK.
    probe "MyAliasForMain"
    {
       on_entry  printf("Hello again...\n");
    }
 }
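
The naming rule described in the comments of defsym.apc can be checked mechanically. The following is a minimal plain C sketch of that check, not part of the Aprobe runtime; the function name is hypothetical, and it encodes only the rule as stated in those comments (the "InitializeUal_" prefix, followed by a character that sorts before the first character of the UAL name).

```c
#include <string.h>

/* Returns 1 if func_name follows the naming rule quoted in defsym.apc:
 * it starts with "InitializeUal_" and the next character sorts before
 * the first character of the UAL name in ASCII order. Returns 0 otherwise. */
int init_name_sorts_before_ual(const char *func_name, const char *ual_name)
{
    static const char prefix[] = "InitializeUal_";
    size_t prefix_len = sizeof prefix - 1;

    if (strncmp(func_name, prefix, prefix_len) != 0)
        return 0;
    if (func_name[prefix_len] == '\0' || ual_name[0] == '\0')
        return 0;
    return (unsigned char)func_name[prefix_len] < (unsigned char)ual_name[0];
}
```

For the example above, "InitializeUal_0_defsym_apc" passes for a UAL named "defsym" because '0' sorts before 'd'.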
 

13. Using the "aprobe" Command

13.1 What does "aprobe" do?

Aprobe locates the specified UALs (if any), loads them as well as the Aprobe runtime, patches the executable to invoke the probes described in the UAL files, and starts execution of the specified program.

13.2 How do I specify options to my program when using aprobe?

The executable program name is the last argument on the aprobe command line. All options after that are passed as arguments to the executable. For example, if your regular command-line would be:

  mygrep "a_string" *.txt

Then with aprobe it would be:

  aprobe -u mygrep.ual mygrep "a_string" *.txt

The most reliable way to do this, used by RootCause, is with the aprobe "-execvp" option. In this case you specify a filename in place of the parameters, and the file lists all arguments, including the "argv[0]" that is to be passed as the executable name. For example, in the above case:

aprobe -execvp -u mygrep.ual mygrep mygrep.args

where mygrep.args might contain the lines:

mygrep.exe
"a_string"
file1.txt
file2.txt
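
If you generate such an arguments file from another program, the layout shown above (one argument per line, argv[0] first) can be produced with a few lines of C. This is only a sketch under assumptions: the function name is hypothetical, arguments are written verbatim, and it does not address whether aprobe interprets quotes such as those around "a_string" above, so check the Aprobe User's Guide for the exact rules.

```c
#include <stdio.h>
#include <string.h>

/* Formats argv into buf, one argument per line, argv[0] first.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if buf is too small.
 * Assumption: arguments contain no newlines and need no quoting. */
int format_args_file(char *buf, size_t size, int argc, const char *const argv[])
{
    size_t used = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (int i = 0; i < argc; i++) {
        int n = snprintf(buf + used, size - used, "%s\n", argv[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;              /* output was truncated */
        used += (size_t)n;
    }
    return (int)used;
}
```

The resulting buffer can then be written to a file such as mygrep.args with fputs().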

13.3 How do I specify options to my probes?

Options and parameters can be passed to each UAL as well. This is done by following the UAL name with the -p option followed by the options in quotes. This is most commonly seen when invoking a predefined probe that is part of Aprobe, for example:

   aprobe -u info -p "-sa" mygrep.exe

The options to the info probe are "-sa".

13.4 How do I print my output at run time instead of sending to the APD file?

The "-if" ("immediate format") option on the aprobe command does this, e.g.,

   aprobe -if -u fooTest foo

13.5 Can I suppress generating an ".apd" file?

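Not with the options shown here: even if you do "aprobe -if -n 0 ... " you get the basic .apd file. See Q13.10 for the flag added in Aprobe 4.2.5 that prevents generation of any APD files.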

13.6 How can I run my probes without invoking aprobe?

Use RootCause. That's one of its key features. If for some reason you can't do that, you can do one of the things described in Chapter 4 of the Aprobe User's Guide, "Loading Probes without aprobe":

  • Substitute the Aprobe command line for the command that starts your program
  • Rename your application and replace it with a hard-coded script that calls aprobe on the renamed executable. On Unix there is a script delivered that facilitates this, called "run_with_aprobe_edit", which documents its use.
  • Use the "run_with_aprobe_apo" script. The script text includes documentation on its use.
  • Link aprobe into your application by linking it with the shared library "libdal.so".

13.7 How do I probe a function in a dynamically-loaded shared library?

If your program explicitly loads a file by calling dlopen("dynamic.so"), Aprobe does not support this directly since it does all its patching when the executable and any shared libraries linked in are first loaded into memory. So the only shared libraries you can probe are those listed by the command

ldd exe_name

You can force a shared library to be pre-loaded at startup by specifying its full path as the argument to the "-dll" option on the aprobe command line, for example:

aprobe -dll /my/application/dynamic.so /my/application/app.exe

This assumes that dynamic.so is not dependent on any other dynamically loaded shared libraries and that it doesn't hurt to be initialized earlier than would have been the case with dlopen.

13.8 Can I probe a function in native C or C++ code loaded by a Java application?

In general, if you're using Java, you should be using RootCause and not Aprobe directly. However, you can do this using the apjava -dll option. If your Java class named "JTest" contains LoadLibrary("native"), then this should work:

apjava -dll /full/path/libnative.so -u native_probes.ual -java JTest

13.9 Is there a way I can use Aprobe in a target environment where my application has no symbol or debug information with it (is stripped)?

If you have a program that can be probed, you can run the tool apmkadi on it to create an Aprobe Debug Information (ADI) file. You can then remove the symbols from the executable (using the strip command) and ship it to the target site. When you want to run Aprobe on that, you would then specify not only the UAL file(s) containing the probes, but the ADI file(s) as well, which contain only the symbolic information needed by Aprobe. See Appendix A, "apmkadi" for more information.

13.10 Can I run aprobe but produce no APD files?

Yes. The "-p" flag, which prevents generation of any APD files, was introduced in Aprobe version 4.2.5. This is useful if your probes don't log any data using the default log method.

13.11 Why does my program crash when using aprobe, and not without?

The possibilities are:

  1. There's a bug in your probe, for example, one of your action routines is dereferencing a null pointer. See "Debugging Your Probes" near the end of Chapter 3 of the Aprobe User's Guide.
  2. Your application provides its own "malloc()" function which requires initialization before its first use. Since aprobe gets control before your application does, and uses the application's malloc(), this could cause a crash on startup. See also Q13.14.
  3. Your probe is accessing or logging data on_exit to a function or thread, but the on_exit action is being called in an exception or thread exit condition and so may not have valid data available. In order to check for this, put on_exit code within a block that checks the ap_ProbeActionReason implicit parameter, e.g.:
 on_exit {
    if (ap_ProbeActionReason == ap_ExitAction)
    {
      log("foo returns ", $return);
    }
    else
    {
      log("foo exits abnormally for: ", ap_ProbeActionReason);
    }
  }
  4. Your program is very time-critical, or is such that timing may change the order in which order-dependent operations are executed. Aprobe introduces some overhead, and your probes likely introduce a lot more overhead, which can change your program's behavior. You can use aprobe itself to find out what's happening, and to force synchronization between threads -- contact OC Systems.
  5. There's an Aprobe capacity problem. This may happen with the predefined probes if you select all functions, or (equivalently) specify "*" IN "*" in the configuration file. (See Q. 4.9). You can either reduce the number of functions you're probing, or increase the default probe stack size with "aprobe -q stacksize=20000000" (or some other big number).
  6. You're probing a function that doesn't follow standard generated code conventions. This can happen when you try to probe everything in a shared system library such as libc.so on Linux or libc.a(shr.o) on AIX. See if it reproduces when you only probe known entry points in the system library, or limit your probes to your application module only.
  7. There's a bug in Aprobe. Contact our sales department for more information.

13.12 AIX: Aprobe version 3.2 had the -s1 option to prevent conflicts with my application's shared memory. Is there a similar feature in version 4.2?

We hoped that by getting rid of shmat() from our code that we would no longer cause conflicts. Unfortunately we didn't realize that the OS would choose memory map addresses that would conflict, so the problem immediately reappeared. We added a different flag to allow you to specify the memory area that should be used: -q mmap=address where address is the address that should be passed to mmap() when Aprobe requests its shared memory. For example:

aprobe -q mmap=0xd0000000 -u myprobes myapp.exe

If you don't have this flag, you'll need an updated version of Aprobe but you might be able to get around it:

Many users find that they can avoid shared memory conflicts simply by reducing the size of the APD files. The default maximum size is 256M for the persistent APD file and 256M for the user APD file. By using a ring (aprobe -n flag) you can vastly reduce the user APD size, and you can use the -sp flag to specify a reduced persistent file. For instance, the following:

 aprobe -sp 16000000 -n 5

will create a persistent file of approx 16M and up to 5 APD files of 2M each.
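
To compare settings before a run, the arithmetic behind that example can be written out. The following C sketch computes an upper bound from the persistent file size, the ring count, and the size of each ring file; the function name is hypothetical, the 2M-per-file figure is taken from the text above rather than computed, and the small base .apd file is ignored.

```c
/* Upper bound, in bytes, on APD data for one run: the persistent file
 * plus every file in the ring. ring_file_bytes is an input; the 2M figure
 * used with this example comes from the FAQ text, not from this code. */
unsigned long long apd_max_bytes(unsigned long long persistent_bytes,
                                 unsigned int ring_files,
                                 unsigned long long ring_file_bytes)
{
    return persistent_bytes + (unsigned long long)ring_files * ring_file_bytes;
}
```

For "-sp 16000000 -n 5" with 2,000,000-byte ring files this gives 16,000,000 + 5 × 2,000,000 = 26,000,000 bytes, compared with the 256M + 256M defaults.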

13.13 Why does Aprobe ask for such a large memory-mapped file on startup, when I've specified only a 4M APD file with "-s"?

The size of the persistent APD file is controlled independently of the size of the APD ring files. You can use the -sp option to lower this significantly. The default is 256Mbytes because we need to set it to the maximum at the beginning. However we've found that 16M is generally sufficient in practice.

If you look and see how big your persistent files grow, you can use that as a baseline. The main things that get logged to the persistent file after program start are:

  • New Java classes / methods
  • New threads
  • Tracebacks recorded with the traceback to ID mechanism
  • LOAD_SHED functions

13.14 On Linux when I run my application under Aprobe it crashes during initialization with a problem in malloc. This doesn't happen without Aprobe. Why?

The application might have a poor implementation of malloc built in. On Linux an application can provide its own implementation of malloc, free, etc. and this will be used. Most local versions of malloc are well behaved. Some, however, require initialization by the application before first use. Since Aprobe gets in earlier than main(), this can cause a malloc request to be made before malloc has been initialized.

If you have control over the code you should fix this by making the malloc self-initializing. If you don't then, unfortunately, you will not be able to run the application under Aprobe.

14. Using the "apformat" Command

14.1 What does apformat do?

apformat reads one or more related APD (.apd) files and formats the data they contain. For example, if the command

    aprobe -u a.ual a.exe

produced the files

   a.apd a-1.apd

Then the command

    apformat a.apd
  1. Reads a.apd to find out the executable (a.exe) and UAL(s) (a.ual or a.dll) that were used by aprobe to generate the file, and what other APD files were generated (a-1.apd).
  2. Reads the data records contained in a-1.apd, and for each one, invokes the associated format routine contained in the UAL file, passing the data in the record as parameters to the format routine.

14.2 Which of the ".apd" files do I specify on the command-line?

If you specify the "base" one, without any number at the end (e.g., a.apd), all of the files that were written to during the most recent invocation will be formatted. If you specify an individual data file, such as "a-2.apd", only the data in that specific APD file will be formatted.
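
When scripting around apformat, it can help to tell a base file from a numbered data file by name. The following C sketch does that, assuming only the naming pattern visible in this FAQ (a.apd, a-1.apd, a-2.apd); the function name is hypothetical, and a program whose own name ends in "-N" would be misclassified.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Classifies an APD file name by its suffix.
 * Returns 0 for a base file such as "a.apd", N (> 0) for a numbered data
 * file such as "a-2.apd", and -1 if the name does not end in ".apd".
 * Assumption: numbered files are always named <base>-<N>.apd. */
int apd_file_index(const char *name)
{
    size_t len = strlen(name);
    if (len <= 4 || strcmp(name + len - 4, ".apd") != 0)
        return -1;

    size_t end = len - 4;           /* index just past the stem */
    size_t pos = end;
    while (pos > 0 && isdigit((unsigned char)name[pos - 1]))
        pos--;

    /* Numbered file: at least one digit, preceded by '-' and a non-empty base. */
    if (pos < end && pos >= 2 && name[pos - 1] == '-')
        return atoi(name + pos);    /* atoi stops at the '.' */
    return 0;
}
```

A script could pass names that return 0 to apformat to format a whole run, and names that return a positive number to format a single data file.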

14.3 Can I restrict the apformat output to just that generated by one of the several UALs provided at aprobe time?

Yes. Use the "-z" option to indicate that no UALs are to be loaded implicitly, then use "-u" to explicitly state which one you want to use:

  apformat -z -u first myprog.apd

14.4 Can I restrict the apformat output to just that generated by one or two of my format routines?

Yes. If you provided your own format routines, you can do it by editing those routines and regenerating the UAL with the same name as the original.

Let's say you have "dumpall.apc", from which you generated "dumpall.ual". Copy "dumpall.apc" to "dumpall.apc.save". Then edit "dumpall.apc" and comment out the bodies of all the format routines except for the one(s) you want to keep. Use `apc' to compile "dumpall.apc" into "dumpall.ual", e.g., apc dumpall.apc -x myprog, then do:

  apformat -z -u dumpall myprog.apd

The UAL name must be preserved because the basename of each UAL is part of the "key" used to map formats to data in the APD.

14.5 Can I programmatically filter which formats are used?

Yes, and this is actually preferable:

  1. Define global flags corresponding to the different kinds of filtering you want, initializing all to (say) "true".
  2. Code your format routines such that each has "if (FormatFlag1 || FormatFlag2) { ... }" guarding the execution of the print actions in the format routine.
  3. In the "on_entry" part of a "probe format", read the command-line arguments to the UAL (ap_UalArgc, ap_UalArgv), or an environment variable, or file, or whatever, to determine the desired settings of the flags.
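
The three steps above can be sketched in plain C. This is a stand-in, not APC: in a real UAL the flag setup would live in the on_entry part of "probe format" and read ap_UalArgc and ap_UalArgv, and the guarded body would be a format routine. The option names "-no1" and "-no2" and both function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Step 1: global filter flags, all enabled by default. */
static int FormatFlag1 = 1;
static int FormatFlag2 = 1;

/* Step 3 stand-in: read the arguments and adjust the flags.
 * The option names here are hypothetical. */
void set_format_flags(int argc, const char *const argv[])
{
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-no1") == 0)
            FormatFlag1 = 0;
        else if (strcmp(argv[i], "-no2") == 0)
            FormatFlag2 = 0;
    }
}

/* Step 2 stand-in: a format routine guards its print actions with the
 * flags. Returns 1 if it printed, 0 if the output was filtered out. */
int format_entry(const char *function_name)
{
    if (FormatFlag1 || FormatFlag2) {
        printf("entered %s\n", function_name);
        return 1;
    }
    return 0;
}
```

Because the flags are ordinary globals, the same UAL can be reused with different filters simply by changing the arguments passed at apformat time.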

14.6 Can I do the previous 2 if I'm using automatically generated formats?

No. The formats are generated automatically and there's no way to put your own conditions within them. (Of course you can put conditions around the log statement at run time, so that no data is recorded to begin with, but this is a different issue.)

14.7 When do I need to specify the UAL file to apformat?

When you want to use UALs different from, or in addition to, the ones that were specified when you ran aprobe. You might want to do this in order to only process part of the data, or use different format routines. Use apformat -z if you want to use only those UALs explicitly specified on the apformat command line.

14.8 Can I use "apformat" without an APD file?

No. There must be a valid APD file generated by aprobe.

14.9 Aprobe works fine, but I get a crash from apformat; why?

This is almost certainly because there's a bug in one of your format routines. See Debugging Your Probes near the end of Chapter 3 of the Aprobe User's Guide.

However, if you didn't write any of your own format routines, either because you're using a predefined probe, or because you just used "log(something);", then this is probably OC Systems' fault and you should contact Aprobe support.

14.10 Can I use ap_UalArgv in "probe format ... on_entry" to get arguments passed at run-time (aprobe time)?

No. ap_UalArgv at apformat time is for reading arguments passed to the UAL on the apformat command line, as in:

   apformat -u my_probe -p "param1 param2" t.apd

You would have to log the data you need from run-time yourself, and format it later. This can be done by including the following APC file into your APC file prior to the "probe format" or other format routine in which you want to use the arguments. You can then use the variables ap_RuntimeUalArgc and ap_RuntimeUalArgv just as you would use ap_UalArgc/v at run time.

 /* logualargs.apc
  * Include this once per UAL to record runtime arguments for format time use.
  */
 
 #ifndef _LOGUALARGS_APC_
 #define _LOGUALARGS_APC_
 
 static int ap_RuntimeUalArgc = 0;
 static ap_NameT *ap_RuntimeUalArgv = NULL;
 static void ap_RuntimeUalArgStart(ap_Uint32 *argc)
 {
    ap_SizeT size = ((*argc) + 1) * sizeof(ap_NameT);
    ap_RuntimeUalArgc = *argc;
    ap_RuntimeUalArgv = (ap_NameT*)(ap_Malloc(size));
    memset(ap_RuntimeUalArgv, 0, size);
 }
 
 static void ap_RuntimeUalArgAdd(int *pos, ap_NameT Arg)
 {
    ap_RuntimeUalArgv[*pos] = ap_StrDup(Arg);
 }
 
 probe program
 {
    on_entry
    {
        int i;
        log (ap_UalArgc)
           with ap_RuntimeUalArgStart to ap_PersistentLogMethod;
        for (i = 0; i < ap_UalArgc; i++)
        {
           log(i, ap_StringValue(ap_UalArgv[i]))
            with ap_RuntimeUalArgAdd to ap_PersistentLogMethod;
        }
    }
 }
 #endif
 

For example:

 #include "logualargs.apc"
 
 probe thread
 {
 }
 
 probe format
 {
    on_entry
    {
        int i;
        // Run-time arguments to this UAL
        printf("ap_RuntimeUalArgc = %d\n", ap_RuntimeUalArgc);
        for (i = 0; i < ap_RuntimeUalArgc; i++)
        {
           printf("ap_RuntimeUalArgv[%d] = \"%s\"\n", i, ap_RuntimeUalArgv[i]);
        }
        // Format-time arguments to this UAL
        for (i = 0; i < ap_UalArgc; i++)
        {
           printf("ap_UalArgv[%d] = \"%s\"\n", i, ap_UalArgv[i]);
        }
    }
 }
 

15. Using Predefined Probes

15.1 What is a predefined probe?

This is just a UAL containing probes written by OC Systems for a specific purpose. They are generally more complex than ones you would write yourself, and are designed to work on any program that can be probed. Most of these probes include a Java GUI to simplify parameterization of the probe for your specific program, such as specifying the functions to be probed.

All predefined probes are in $APROBE/ual_lib/*.ual; the source code is $APROBE/probes/*.apc. The documentation for these probes is in Appendix D of the User's Guide.

15.2 Do I have to use "apc" to build these probes myself?

No! The UALs for all of the predefined probes are already built and located in $APROBE/ual_lib. This is in the UAL search path, so the simple name of the UAL is sufficient. For example:

    aprobe -u info myprog.exe

15.3 The examples show invocation of predefined probes using aprobe -u info myprog.exe. How does aprobe find these UALs when they're not in the current directory?

The directory $APROBE/ual_lib is always searched for UALs after the working directory. The environment variable APROBE_LIBPATH may also be defined to add additional directories.

15.4 Can I use Coverage without using the Java configuration GUI?

Yes. In fact, that's the default. There is no GUI for `info'. The coverage, profile and trace probes provide a GUI to assist in building or modifying a configuration file that defines what should be done, but this is just a text file that can be edited by hand.

The `memwatch' predefined probe provides a "runtime" GUI to monitor memory usage as the program is running, and to take interactive snapshots of the allocation data.

See the documentation for each probe in Appendix D of the Aprobe User's Guide.

15.5 The trace probe really slows down the program--how can I speed it up?

You should see the Aprobe User's Guide documentation about this probe. However, you can try these things in this order:

  1. Use Load Shedding by specifying "LoadShedThreshold 10" in your configuration file.
  2. Don't use wildcards like "Trace *", but rather use apcgen -L to list specific functions you want to trace and just name those.
  3. Use the TRIGGER configuration parameter to specify a specific call-tree you want to trace.
  4. Use the circular-buffer mechanism, by specifying SaveTraceDataTo CIRCULAR_BUFFER in the configuration file, rather than logging data in real time. Note that your program must complete in a well-behaved way in order to get a snapshot of the data logged to the circular buffer.
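
Item 4 trades completeness for speed: a circular buffer keeps only the most recent entries in memory and overwrites the oldest ones. The idea can be sketched in plain C as below. This is not the trace probe's implementation; TraceRing, RingAdd, RingSnapshot and the capacity of 4 are made up for illustration.

```c
#include <stddef.h>

#define RING_CAPACITY 4   /* illustrative size only */

typedef struct {
    int    entries[RING_CAPACITY]; /* e.g. IDs of traced functions */
    size_t next;                   /* slot the next entry goes into */
    size_t count;                  /* valid entries, at most RING_CAPACITY */
} TraceRing;

/* Record one entry, overwriting the oldest once the ring is full. */
static void RingAdd(TraceRing *ring, int entry)
{
    ring->entries[ring->next] = entry;
    ring->next = (ring->next + 1) % RING_CAPACITY;
    if (ring->count < RING_CAPACITY)
        ring->count++;
}

/* Copy the retained entries, oldest first, into out[]; returns how many. */
static size_t RingSnapshot(const TraceRing *ring, int *out)
{
    size_t start = (ring->next + RING_CAPACITY - ring->count) % RING_CAPACITY;
    size_t i;
    for (i = 0; i < ring->count; i++)
        out[i] = ring->entries[(start + i) % RING_CAPACITY];
    return ring->count;
}
```

Nothing is written out until a snapshot is taken, which is why the data is lost if the program never reaches a point where the snapshot can happen.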

15.6 How can I get a snapshot of my predefined probe data before my program dumps core?

The ability to take a snapshot when an unexpected signal occurs is provided by combining the predefined probe of your choice with the "sigsegv" probe:

   // my_coverage.apc
   #include "sigsegv.h"
   #include "coverage.h"
   static void MyHandler(int sig, void *Data)
   {
      ap_Coverage_DoSnapshot("Snapshot on signal.");
   }
   probe program
   {
      on_entry
      {
         ap_Sigsegv_AddCallback(MyHandler, NULL);
      }
   }
Then you link this with the existing predefined probes:
  $ apc my_coverage.apc coverage.ual sigsegv.ual # creates my_coverage.ual

15.7 Is there a way to invoke predefined probe operations from within my probes?

An API for each predefined probe is defined by the ".h" file corresponding to it in $APROBE/probes. For example, "profile.h" defines "ap_Profile_DoSnapshotForAll()". To call this, you would #include "profile.h" in your APC file (it's in $APROBE/include as well, which is always searched for include files). Then when you compile your apc file, specify the UAL as if it were just another object file to link with:

    apc myprofile.apc profile.ual

This will produce myprofile.ual.

15.8 How can my probes use the Java GUI facilities that the predefined probes use?

There are two interfaces to the Java GUI objects used by the predefined probes. The one to start with is defined in $APROBE/include/quick_gui.h and implemented in quick_gui.ual. This supports simple graphs, and interactive message, Yes/No, and confirmation dialogs. An example of using this is given in the example $APROBE/examples/learn/visualize_data/.

The full GUI interface used by the predefined probes like profile.ual is apGUI.h, but this is only for fearless experts.

15.9 I'd like to customize a predefined probe -- how do I rebuild it?

This is a bit ugly because the Makefile for building probes relies on adjacent directories, so you have to rebuild in place (after saving the original) or copy and/or soft-link them locally:

mkdir my_aprobe ; cd my_aprobe
ln -s $APROBE/include $APROBE/lib $APROBE/bin .
mkdir ual_lib
mkdir probes
cd probes
cp $APROBE/probes/memwatch.apc . # if you wanted to edit memwatch
ln -s $APROBE/probes/* .         # to get everything else
chmod +w memwatch.apc
# edit memwatch.apc (or whatever) as desired
cd ../ual_lib
make -f $APROBE/ual_lib/Makefile memwatch.ual # or whatever

If you have problems or questions, contact OCS Support.

15.10 How do I use the coverage probe with multiple test cases?

The `atcmerge' tool merges formatted results from different runs on the same or different executables. You can use the aprobe "-d" option to create different APD filesets and corresponding ".tc" files for each run, and use the "atcmerge" tool to merge these. See Aprobe\Examples\Advanced\Test_Coverage for an example.

15.11 Where did the "heap" probe go?

heap.ual has been superseded by memwatch.ual. This is a simpler, more robust probe that provides information about allocation patterns, but does not save all the additional data necessary to do error checking. Contact OC Systems if you need a probe with this allocation-checking functionality.

Other memory analysis probes provided are:

  • memstat.ual - statistical memory tracking for long-running programs
  • java_memstat.ual - memstat for Java applications
  • memleak.ual - a light-weight heuristic allocation tracker
  • memcheck.ual - a memory-corruption checker

15.12 How do I use this "events" probe everyone's talking about?

With RootCause 2.0.5 (Aprobe 4.2.5) there's an example under examples/predefined_probes/events, and documentation in Appendix D of the Aprobe User's Guide. Here's a quick summary we sent to a user:

You must have an app_name.events.cfg file, otherwise events does nothing. Let's take a simple case with the routines one() and two() which both call routine three() which, in turn, calls routine four():

   main()
      one()
         three()
            four()
      two()
         three()
            four()

The simplest configuration file is:

EVENT FUNCTION one()
EVENT FUNCTION two()
EVENT FUNCTION three()
EVENT FUNCTION four()

To just look at the calls nested under one() you would add:

FOCUS one()

If you wanted to restrict this at runtime:

FOCUS RUNTIME one()

Let's say that the processing for one() becomes more complex and you want the event to end in another routine. This would do the trick:

EVENT START MyEvent one() ON ENTRY
EVENT START MyEvent another() ON_ENTRY

FOCUS MyEvent
FOCUS RUNTIME MyEvent

15.13 In the `profile' probe, what do "Calls to Self/Child" columns mean?

Assume we have a program foo with two functions outer() and inner(). outer loops and calls inner, which does some work. We set up the foo.profile.cfg file to profile both of them.

If we look at the output for routine outer we would expect to see Calls to Self being one - it's just called once. Calls to Child should be something like 10 or however many times inner is called.

Similarly the two tables show individual and cumulative time. The individual time for outer would be much lower than the cumulative time since the individual time has all of the recorded times for inner subtracted from it.

Finally, note that this only applies to routines profiled. If outer also calls routine another() which is not profiled, another's call counts do not show and its time is recorded as part of outer's individual time.
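
The bookkeeping described above can be sketched in plain C as follows. This only illustrates how the four numbers relate; it is not the profile probe's code. ProfileTotals, ProfileEnter and ProfileExit are hypothetical names, and timestamps are passed in explicitly so that the arithmetic is easy to follow.

```c
#define MAX_DEPTH 32   /* sketch only: nesting is assumed to stay below this */

typedef struct {
    unsigned calls_to_self;   /* times this function was entered */
    unsigned calls_to_child;  /* profiled calls made directly from it */
    double   individual;      /* time excluding profiled callees */
    double   cumulative;      /* time from entry to exit */
} ProfileTotals;

typedef struct {
    ProfileTotals *totals;
    double entered_at;
    double child_time;        /* cumulative time of profiled callees */
} Frame;

static Frame stack[MAX_DEPTH];
static int   depth = 0;

/* Call on entry to a profiled function; 'now' is any monotonic clock. */
static void ProfileEnter(ProfileTotals *t, double now)
{
    if (depth > 0)
        stack[depth - 1].totals->calls_to_child++;
    stack[depth].totals = t;
    stack[depth].entered_at = now;
    stack[depth].child_time = 0.0;
    depth++;
    t->calls_to_self++;
}

/* Call on exit from the most recently entered profiled function. */
static void ProfileExit(double now)
{
    Frame *f = &stack[--depth];
    double elapsed = now - f->entered_at;
    f->totals->cumulative += elapsed;
    f->totals->individual += elapsed - f->child_time;
    if (depth > 0)
        stack[depth - 1].child_time += elapsed;
}
```

Time spent in a function that never calls ProfileEnter (the unprofiled another() above) is never added to child_time, so it stays in the caller's individual time.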

15.14 Why don't memstat, memwatch, heap probes work on my application?

The most likely reason is that your application doesn't use the default system allocation routines. These might be actual replacements for malloc(), etc. in your own application or in another library such as libsafe or libefence.

Sometimes if you explicitly replace malloc() it can break RootCause/Aprobe completely: see Q13.14.

If Aprobe mostly works except for memory probes, then you can override the default routines used by memwatch by registering your own allocation routines, or by changing the probe itself. This will require writing or editing some apc code, depending on your exact situation. Contact OC Systems for further assistance.

15.15 Can you please explain the fields "Alloc Count" and "Free Count" in the memstat "Outstanding Allocation" report?

A specific allocation point (see below) might be reached just once (usually at initialization) and will have an Alloc Count of 1. It may or may not ever free that so the Free Count will be 0 or 1. But many (most!) applications have allocation points that give rise to more than one allocation. For instance:

for (i = 0; i < 10; i++)
{
   linkedList.add (new MyObject (i));
}

Obviously each instance of MyObject was created from the same allocation point. Most growth happens this way - in fact we don't count any allocations we only see once as growth.

What is an allocation point? For native code it's the unique traceback up to the current maximum depth, something like:

  Line 10 of a()
  called from Line 15 of b()
  called from Line 32 of c()

For Java each allocation point is a combination of a traceback and the object type allocated there.
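
As a rough illustration of how per-allocation-point counts accumulate, here is a plain C sketch. It is not memstat's implementation: the traceback is reduced to a string, the table is a small fixed array, and AllocPoint, RecordAlloc and RecordFree are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define MAX_POINTS 16   /* illustrative table size */

typedef struct {
    const char *traceback;   /* stands in for the unique call chain */
    unsigned    alloc_count; /* allocations seen from this point */
    unsigned    free_count;  /* how many of those were later freed */
} AllocPoint;

static AllocPoint points[MAX_POINTS];
static int        n_points = 0;

/* Look up an allocation point, optionally creating it if unknown. */
static AllocPoint *FindPoint(const char *traceback, int create)
{
    int i;
    for (i = 0; i < n_points; i++)
        if (strcmp(points[i].traceback, traceback) == 0)
            return &points[i];
    if (!create || n_points == MAX_POINTS)
        return NULL;
    points[n_points].traceback   = traceback;
    points[n_points].alloc_count = 0;
    points[n_points].free_count  = 0;
    return &points[n_points++];
}

static void RecordAlloc(const char *traceback)
{
    AllocPoint *p = FindPoint(traceback, 1);
    if (p != NULL)
        p->alloc_count++;
}

/* A real tracker would map the freed address back to the point that
 * allocated it; here the caller passes that point's traceback directly. */
static void RecordFree(const char *traceback)
{
    AllocPoint *p = FindPoint(traceback, 0);
    if (p != NULL)
        p->free_count++;
}
```

The loop in the example above would give a single allocation point an Alloc Count of 10; its Free Count rises only as those particular objects are released.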

15.16 Can I use memstat to track all allocations and frees?

The default setting of the memstat probes is to pinpoint leaks in a longer-running program. However, you can change the options. From the main RC window select the memstat probe in the UAL list, right click and choose Edit UAL. From the Runtime tab change the Sampling Ratio to 1 so you see every allocation.

From the Format tab check the Display Freed Allocations box. You might also find the Display Zero Growth Allocations useful. Next run, you'll start seeing those freed allocations.

Click the OK button and then the Build button. Re-format (either through the Index or Examine button) and the reports should have the information you need.

15.17 Is there a way to only report allocations in a certain module based on the stack traceback entries?

This mechanism wasn't available in memstat until version 2.1.4b (June 2005); before that it existed only in memwatch, which is more focused on individual allocations. For earlier versions, you could edit and build your own custom version of "combined_memstat.apc" that has filtering: see filtermemory.apc.

Version 2.1.4b also introduced EXCLUDE filters in memstat and memwatch, which eliminate the named stack traces and show all others. See $APROBE/probes/[java]_memstat.cfg or $APROBE/probes/memwatch.cfg for usage information.

15.18 Is there a predefined probe for detecting memory corruption?

Yes. The "memcheck" probe, introduced in version 2.1.3 (February 2004), uses a "fence" mechanism to detect corruption of allocated (but not stack/local) memory. It also reports double deallocations.
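
For readers unfamiliar with the term, a fence is a band of known bytes placed on either side of each allocated block and checked later for damage. The plain C sketch below shows the general technique only; it makes no claim about how memcheck itself lays out or checks its fences, and FencedAlloc, FenceIntact, FencedFree, the fence size and the fill byte are all assumptions.

```c
#include <stdlib.h>
#include <string.h>

#define FENCE_SIZE 8      /* bytes of fence on each side (illustrative) */
#define FENCE_BYTE 0xA5   /* fill pattern (illustrative) */

/* Block layout: [size_t size][front fence][user data][rear fence].
 * Alignment of the user data is not addressed in this sketch. */
static void *FencedAlloc(size_t size)
{
    unsigned char *raw =
        (unsigned char *)malloc(sizeof(size_t) + 2 * FENCE_SIZE + size);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof(size_t));
    memset(raw + sizeof(size_t), FENCE_BYTE, FENCE_SIZE);
    memset(raw + sizeof(size_t) + FENCE_SIZE + size, FENCE_BYTE, FENCE_SIZE);
    return raw + sizeof(size_t) + FENCE_SIZE;
}

/* Returns 1 if both fences still hold the pattern, 0 if either was hit. */
static int FenceIntact(const void *user)
{
    const unsigned char *data  = (const unsigned char *)user;
    const unsigned char *front = data - FENCE_SIZE;
    size_t size, i;
    memcpy(&size, front - sizeof(size_t), sizeof(size_t));
    for (i = 0; i < FENCE_SIZE; i++)
        if (front[i] != FENCE_BYTE || data[size + i] != FENCE_BYTE)
            return 0;
    return 1;
}

static void FencedFree(void *user)
{
    free((unsigned char *)user - FENCE_SIZE - sizeof(size_t));
}
```

A write one byte past the end of a block lands in the rear fence, so the next check reports it even though the program itself has not yet misbehaved.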

15.19 Is there a predefined probe for tracking down lock contentions?

We have done some work in this area for customers, but we have not productized it, because the platform- and problem-specifics are not easily generalized. If you want some unsupported probes to start from please contact us.

15.20 What options in the trace.cfg file are obsolete, and why?

Many changes have occurred as the Trace predefined probe has been adapted to support RootCause users. A number of the options have been deprecated, and others apply only when used directly outside of RootCause.

The following options have been deprecated.

  • MaxDepthOfTracedCalls, DefaultLevels - These were synonyms. It is no longer possible to specify a maximum depth at which tracing is disabled.
  • LogTimes - Times are always logged.
  • LogLines - Lines are logged if and only if specified on each TRACE line with LINES.
  • TracingEnabledInitially - Tracing is enabled initially if and only if no TRIGGER lines appear. If there are one or more TRIGGER lines, then tracing is only enabled when executing the functions specified by the TRIGGER(s).
  • CallCountOptions, ExactCallCounts - Call counts are now done at format-time by the RC trace display rather than by the trace probe, so these options don't apply.
  • IndexSymbols - Symbols cannot be indexed.
  • MaxIndentLevelsBeforeWrap, IndentColumns, AlwaysShowNumericNestingLevel - These used to control formatting, but now custom formatting is done by providing alternative formatting routines. The ones provided for the RootCause Trace Display are in $APROBE/probes/rc_formats.ual.

15.21 Why does the memstat summary file say it can't do the analysis because I only have one sample?

Some possibilities are:

  • You didn't run for long enough. A couple of minutes isn't long enough if you want to run the statistical sampling.
  • You didn't format all of the available data. You can do this by selecting all of the apd files for the ring instead of the default (which is just the last one).
  • You don't have a real problem. This is more common than you think: people often see instability that they take for a memory leak when it isn't one.
For more information see the Memory Probes page.

15.22 How do I force a snapshot from a predefined probe?

The coverage, memcheck, memwatch, profile, and statprof probes record data in memory and dump it only at normal program termination, or when explicitly requested with a programmatic snapshot. A snapshot can be forced without terminating the program by calling the entry point provided by the probe:

  • coverage - ap_Coverage_DoSnapshot( "comment" );
  • memcheck - ap_Memcheck_DoCheckpoint( "comment" );
  • memwatch - ap_Memwatch_DoSnapshot( "comment" );
  • profile - ap_Profile_DoSnapshotForAll( "comment", 1 );
  • statprof - ap_Statprof_Snapshot( "comment" );

The second parameter to ap_Profile_DoSnapshotForAll() is 1 (TRUE) if it will be the final snapshot, and 0 (FALSE) if it will be called again via a snapshot or normal program completion.

There are three ways these can be called:

Use demand.ual

Aprobe version 4.4.4a (March, 2013) introduced demand.ual, which along with its header file demand.h and supporting command-line tool apdemand provides a framework for "demanding" action from another probe from the command-line at run-time, independent of what the probed program might be doing. This is an advanced feature, but can be powerful in the right circumstances. To learn more, copy the example directory $APROBE/examples/predefined_probes/demand to a working area, and start with the README file there. It shows how you can use demand.ual in conjunction with profile.ual to take a performance snapshot "on demand". The "RemoteControl" file in that same directory provides more detail on how to use this to control your own probes or your application. Contact support@ocsystems.com if you have questions.

Use 'call' from dbx or gdb

A very convenient way is to attach with dbx or gdb and use the "call" operation. For example if ps says that the PID of application appdriver is 12345, then you can do:

   $  dbx -a 12345
   (dbx) call ap_Statprof_Snapshot( "dbx" );
   (dbx) detach
   $  apformat appdriver.apd

Even when using detach it's possible that the program will terminate at this point, so you shouldn't use this if it's important that the program continue.

Call the Snapshot function from a custom probe

An alternative is to link your own custom version of the predefined probe with a probe which takes a snapshot at a certain point in the program, for example:

  // my_profile.apc
  #include "profile.h"
  probe thread {
    probe "abnormal_end_signal_was_handled" {
      on_entry ap_Profile_DoSnapshotForAll( "probe snap",  FALSE );
    }
  }

Then you link this with the existing predefined probe:

  $ apc my_profile.apc profile.ual # creates my_profile.ual

Note that the name abnormal_end_signal_was_handled is only a suggestion, not a name in the Aprobe runtime. An application programmer may offer another name which is called when the application averts an abnormal end. If not, an application programmer may need to help by creating and calling this dummy function at the right time for the snapshot probe, which is when the application averts an abnormal end. Part of the challenge is finding programmers who know that much about the application.

A special case of this is to take a snapshot when an unexpected signal occurs: see Q15.6.

15.23 Could you explain the memstat summary's "Leaked Memory" and "Total Leakage" values?

The statistical part works like this. Say you have a setting (Sampling rate) of one in thirty. Every 30th allocation is recorded in a table. Every free is looked up in that table: if the freed block is there, the free is recorded; if it isn't, the free is ignored. So the sampling applies only to the allocations, not the frees.

In the table, the totals (including leaked memory) and counts are multiplied by the sampling rate. If you have enough samples, this will be entirely valid.

We record what you pass to the O/S, not necessarily what the O/S actually allocates. This could under-estimate the amount of memory in certain cases (e.g., if the memory manager always allocates in quad-word steps, it would allocate 16 bytes when you requested 4).

The statistics that identify certain allocation points as "Growth" are based on least squares linear regression analysis.
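
The two calculations mentioned here, scaling sampled values by the sampling rate and fitting a least-squares line, can be sketched in plain C as below. This is textbook arithmetic, not memstat's code, and the function names EstimateTotal and GrowthSlope are hypothetical.

```c
/* Scale a sampled count (or byte total) back up to an estimate of the
 * real total, given that one allocation in every 'sampling_rate' was kept. */
static unsigned long EstimateTotal(unsigned long sampled,
                                   unsigned long sampling_rate)
{
    return sampled * sampling_rate;
}

/* Ordinary least-squares slope of y against x over n samples.
 * x might be sample time and y the outstanding bytes at one allocation
 * point; a clearly positive slope suggests growth. Returns 0.0 when the
 * slope is undefined (fewer than two distinct x values). */
static double GrowthSlope(const double *x, const double *y, int n)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0, denom;
    int i;
    for (i = 0; i < n; i++) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    denom = n * sxx - sx * sx;
    if (denom == 0.0)
        return 0.0;
    return (n * sxy - sx * sy) / denom;
}
```

With a single sample the denominator is zero and no slope can be computed, which is consistent with the "only one sample" message discussed in Q15.21.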

15.24 How can I define a memstat (or memwatch) filter matching any number of call levels?

That is, is there a way to do something like the following?

   FILTER      extern:"malloc_y_heap()" in "libc.a(shr.o)"
        ==> **** any number of levels matching anything ****
        ==> "ap_demangle.c":"Demangle_Xlc_Symbol_Name()" at line 2103 (ap_demangle.c)

No, the best you can do is enumerate all the possible matches from your test cases. Wildcards of one or more levels may be implemented in the future.

15.25 Is there a predefined probe to check for stack corruption?

Not really. Long ago we wrote stackcheck.apc for a customer. This version is just for Windows, which is no longer supported, but it might give you an idea of how you could write one for yourself. It checks that the return address is not corrupted on_entry and on_exit to all instrumented functions. Instrumentation is hard-coded in the probe for now. A configuration file or separate configuration probe could be added to handle specifying the instrumentation points.

16. Using the "apc" Command

16.1 What does apc do?

The apc command translates one or more APC files into C, and then uses a native C compiler to compile these into object code, and link them with other files specified on the command-line to form a shared library called a UAL. A UAL has a suffix of .ual.

16.2 How do I indicate what C compiler and options apc should use?

The compiler is defined in the file $APROBE/lib/compiler_profiles and by the APROBE_CC_COMMAND environment variable. This is described in the Files Reference (Appendix B) of the Aprobe User's Guide.

Options to the compiler can also be specified on the apc command line by including them in quotes after the "-compiler" option, for example,

apc foo.apc -compiler "-v"

16.3 Do I need to specify an object file or executable to apc?

You need to specify "-x object module" if you use a construct in your APC that cannot be resolved without specific symbol table or debug information from the program. Such constructs are:

  1. target expressions: names from the probed program preceded by `$', or "$*" ($1, $2 are ok, as are hardware-register references starting with '$$'), and
  2. references to specific source lines.

In general, probes that you compose to gather information about specific parts of your program will contain one of the above, and you'll want to include the executable or an object file.

For probes on shared libraries which don't contain any debug information, or for probes that should apply to any program (like the predefined probes included with Aprobe), you generally will not provide an object module.

16.4 How do I specify other object files to link into my UAL?

Just include them on the apc command-line. Linker options are specified in quotes after the "-linker" flag, for example,

apc foo.apc -linker "-lX11"

16.5 apc says my function name's not known--why not?

There are a number of possibilities. If you specified "-x ... " on the apc command line, then it means it couldn't find the named function in that file's symbol table. Since apc works pretty hard to match incomplete function names, the name is probably wrong in case or spelling, or, if you provided a parameter profile, it's probably not exactly what the C compiler encoded as the name for the function.

You could try using apcgen to generate a probe template for all the functions in the source file (or object file, if it's a template instance) containing the function you want, or the tool apinfo or apsymbols to dump out all the function names in the whole program.

16.6 How do I generate debug information for my APC files so line and function information show up in tracebacks?

As with C, use the -g flag; this passes the appropriate debug options to the C compiler, and saves the generated C source file.

16.7 Can I specify an environment variable for the compiler path in the compiler_profiles file?

Yes! If "ls -l ${CC_PATH}/bin/gcc" on the command-line shows that the compiler exists, then a stanza like:

CC_COMMAND ${CC_PATH}/bin/gcc

will work.

Also note that the environment variable APROBE_COMPILER_PROFILES can be used to override the default of $APROBE/lib/compiler_profiles and point to your own variant of this file. See compiler_profiles file in the user's guide.

16.8 How do I compile a probe for a 32-bit app when running 64-bit Linux?

If you build your application with the compilation option -m32, then to build your probe you'll need to pass -m32 to apc's backend compiler, plus define the i386 macro to the preprocessor. For example:

 apc -Di386 -compiler -m32 -linker -melf_i386 foo.apc

The link stage just invokes ld directly which should automatically build a 32-bit shared library from a 32-bit object file.

If you're going to be doing this regularly you should edit $APROBE/lib/compiler_profiles to update the CFLAGS and PREPROCESS lines so these options are applied automatically.

17. Writing Probes in APC

This section contains questions and answers about writing probes in APC for native (C, C++, Ada) programs.

17.1 How do I use "apcgen" to generate a probe automatically?

You need an object file or executable that contains debug information, i.e., was compiled with debug (see Q12.19), or a C header file. For example:

apcgen foo.exe > foo.apc
apc foo.apc -x foo.exe

generates foo.apc, an APC file probing all the user-defined functions in foo.exe that have debug information, then compiles that into a UAL.

apcgen -qparams -p sin -o math_sin.apc /usr/include/math.h

apc math_sin.apc -x /usr/include/math.h

generates and compiles math_sin.apc containing a probe on the sin() function which logs the parameter and return value. Use apcgen -h to see what options are available to control the output.

Note that RootCause provides this functionality in a point-and-click GUI.

17.2 How do I write a "probe"?

One way is to start with a file generated by "apcgen" (see the previous question). Or you can compose one in your favorite text editor. It's pretty much like writing C, but there's some syntax needed to indicate where and when your probe should be executed. Here's a very simple one:

probe thread
{
  probe "main"
  {
    on_entry
    {
      printf("Entering main.\n");
    }
  }
}

If you put this in the file "foo.apc", then you would compile it:

apc foo.apc

which produces "foo.ual", which you can then probe your program with:

aprobe -u foo foo.exe

17.3 What is the difference between APC and straight C?

There are several differences:

1) There is special syntax to indicate where and when the probe should be executed, such as "probe", "on_entry", "on_exit", "on_line", etc.

2) There is a special keyword called "log" for recording data at run time and defining the format with which it should be displayed afterward.

3) There are special data references, called "target expressions" which start with `$' and refer to values in the probed program.

All of these are expanded or converted to ANSI C by the apc compiler.

In addition, there is an implicit #include "aprobe.h", which makes available the extensive Aprobe API defined in $APROBE/include/aprobe.h and documented in Appendix C of the Aprobe User's Guide.

17.4 Why do I need a "probe thread"?

This is an artifact of the clever Aprobe scoping rules. When one probe is nested within another (that is, defined in the declarative part of an enclosing probe), it not only gives visibility to the enclosing probe's data as you would expect, it also means that the inner probe is "active" (its actions may be executed) only if the outer probe is active.

Since every function is executed within some thread of execution, if a function probe weren't inside a thread probe it would never be active.

Anyway, just put in the probe thread{ .. }. It's what works.

17.5 What's the difference between "probe thread" and "probe program"?

The on_entry and on_exit actions of a "probe program" occur once each, before calling main() (or WinMain(), etc.) and after returning from main(), respectively. The corresponding actions of a "probe thread" occur at the creation and destruction of each separate thread.

Data defined in the declarative part of a "probe thread" is global to all probes, but is unique for each thread. There is always at least one, the "main" thread, which is conceptually nested immediately within the probe program.

17.6 When exactly are the "on_entry" and "on_exit" parts of a function probe executed?

The on_entry actions are executed before the first instruction of the function itself. In particular, the function's local stack frame hasn't been created yet.

The on_entry actions are executed at the very first instruction pointed to by the function's linker symbol, before any compiler-generated saves of parameters or other values.

The on_exit actions are executed after the stack frame has been discarded, so local data is not available. The next (target program) instruction executed will be the one following the call to the probed function.

17.7 Why can't I dump some parameters in the on_exit part?

Parameters passed by value are essentially local data. They are stored on the stack and the stack frame has been discarded by the time the on_exit part is executed.

If you want to be able to access the input parameters you can save them in the on_entry part, for example:

 probe thread
 {
   probe "foo"
   {
     int parm1;

     on_entry
     {
       parm1 = $1;
     }
 
     on_exit
     {
       if (parm1 == 1)
       {
          ...
       }
     }
   }
 }

C++ reference parameters, and composite parameters passed by reference to Ada, are available by-name on_exit because `apc' implicitly generates code in an on_entry section to save the address passed in. GNAT Ada OUT and IN OUT parameters can be displayed because these are implemented as fields of a 'struct' returned by the function.

17.8 Why is my local variable "unknown" in on_entry and on_exit parts?

The on_entry and on_exit parts are conceptually outside the scope of the function, so the local data is not visible. Local data is visible only within an "on_line" action.

17.9 Is there a way to probe "the first line" or "the last line" in my function?

Yes. Simply write on_line(first) or on_line(last). You can use this to do function-relative line numbers as well, such as on_line(first+5).

17.10 How do I specify which of several overloaded functions I want to probe?

In C++, you must specify the exact parameter profile encoded in the symbol table by the C++ compiler. The best way to get this is either to look at the output of "apcgen -vL" applied to the object file generated by the compiler, or to use `apinfo -sa myprog' to list the probe names of the function symbols in your application.

17.11 How do I reference a hardware register?

A hardware register is referenced within a user action (e.g., on_entry) by preceding the name commonly used for the register by "$$". The exact register names are documented in Appendix B, "Files Reference", under "APC File".

Note that the value you get for the register is the value it had at the point the target program called the probed routine.

17.12 How do I query the parameters to a function?

If the function is compiled with debug (see Q12.19) you can reference a parameter by name ($param) and reference all parameters with "$*".

Whether or not a function is compiled with debug, or there's an object module available, you can reference the first parameter with "$1", the second with "$2", etc., up to $8.

Note, however, that if there is no debug information provided, you must cast the "$1" to its proper type.

17.13 Can I use automatic formatting if I don't have an executable with debug information?

Yes, but you must (a) include the definition of each logged item's type in the APC file (if it's not a predefined type), and (b) cast each item to that type. This is how one can log parameters to system routines, for example:

#include <stdio.h> // includes the struct FILE
probe thread
{
  probe "fopen"
  {
    /* fopen returns *FILE, defined in stdio.h */
    on_exit
      log("fopen() returns ", (FILE *)$return, " = ", *(FILE*)$return);
  }
  probe "fclose"
  {
    /* first parameter to fclose is *FILE */
    on_entry
      log("fclose() called with ", (FILE *)$1, " = ",
        *(FILE*)$1 );
  }
}

17.14 How do I change the return value from a function?

on_exit { $return = desired_value; }

17.15 How do I log the value of a string parameter?

ap_StringValue is a macro which logs everything from the address provided up to the first null character:

on_entry { log("NameParam = ", ap_StringValue($NameParam)); }

Note: this only applies to null-terminated (C, C++) strings. It does not apply to the Ada predefined string type -- see [#q17.25 Q17.25] .

17.16 How do I log the contents of an array?

You must specify the bounds of the array in the log statement:

on_entry { log("Items = ", $Items[0 .. 9]); }

If the array bounds are dynamic (as most are), you can compute them first:

   on_entry
    {
      int last;
      for (last = 0; $Items[last] != 0; last++);
      log ("Items = ", $Items[0 .. last-1]);
    }


17.17 How do I "stub out" the probed function so it does nothing?

Use the "ap_StubRoutine" macro in the on_entry part of a function, and be sure to return something sensible if necessary in the on_exit part, e.g.,

  probe "foo" {
    on_entry ap_StubRoutine;
    on_exit  $return = 0;
  }

Note that you can't assign the return value in the on_entry part, since the return register is reset as part of the stub implementation.

17.18 How do I query the data in a class when probing a member function?

All data in a class is defined as a field of the local variable "this", so to get at the class data item "NCalls" you would do:

log("NCalls = ", $this->NCalls);

17.19 How do I query a global (or static) variable when there's a local one of the same name?

To specify that you want a data item other than the one visible by default, add an expression context string to the target expression:

log("static NItems = ", $(NItems, "-file items.c"));

To get the global one, if any:

log("global NItems = ", $(NItems, "-module foo.exe"));

17.20 Can I reference a static variable that wouldn't normally be visible to my probed function?

Yes. See the previous Q. You can reference a static item by name in any file:

log("static NItems = ", $(NItems, "-file items.c"));

even if the probed function this appears in is not in file "items.c".

17.21 Can I call a function in my program from within a probe?

If your program is compiled with debugging enabled, you can precede its name with a `$'. This is often useful for using a probe to call debugging-support routines, e.g.,

probe thread
{
  probe "ReadSymbolTable"
  {
    on_exit
      $DumpSymbolTable($0);
  }
}

In the absence of debug information, you can get the symbol address from Aprobe and cast that to the correct type.

Calling C++ methods is more complex (they require a "this" pointer, and the naming can be tricky)--see [#q17.64 Q17.64].

17.22 Can my APC files reference names in one another like a C program?

Yes, but if they do they must all be compiled in the same "apc" command into a single UAL.

17.23 Can I call a function in another UAL?

Yes. A UAL is just a shared object library (a DLL), so you must do the following:

1) Export the symbol for the function to be called, using the apc "-e" option, when you build the UAL to be referenced, e.g.,

apc funcdef.apc -e func

2) Specify the referenced UAL as an input file on the command line when you compile the probe that contains the external reference:

apc main.apc funcdef.ual

17.24 How do I change the return code from my Unix program?

From $APROBE/examples/learn/probe_exit/exit.apc:

probe thread {
  probe "exit" in "libc.so" // "libc.a(shr.o)" on AIX
  {
    on_entry {
      /* return 0 even if an error occurred: */
      $1 = 0;
    }
  }
}

17.25 How do I print or change a GNAT Ada string value in my probe?

An unconstrained string is represented as a record with two components. The first is a pointer to the string (which is not null-terminated) and the second is a pointer to another record which contains the bounds of the string.

The "apc" tool recognizes this special type and displays it appropriately, if debug information is available. Since its length is known, ap_StringValue is not used. For example:

probe thread {
  probe "hello.qualify_name" {
     on_entry
     {
        // log the input parameter then stub the routine itself
        log("qualify_name called with: ", $1);
        ap_StubRoutine;
     }
   }
}

In the absence of debug information (e.g., for Ada.Text_IO.Put_Line ), or when you want to assign to an unconstrained string, you can use macros defined in gnatstrings.h. For example:

#include "gnatstrings.h" 
probe thread {
  probe "hello.qualify_name" {
     on_exit
     {
        // return what we want to:
         ap_SetGnatUCString(
            $return,
            ap_CatenateStrings(
               "/home/ocs/",
               ap_ExtractGnatUCString
               ($1),
               NULL));
     }
  }
}
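
The two-component representation described above can be pictured in plain C. The following is only a sketch under stated assumptions: the struct and field names (FatString, StringBounds, first, last) are hypothetical and chosen for illustration, and the real GNAT layout and the macros in gnatstrings.h may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout, for illustration only. */
typedef struct {
    int first;  /* lower bound (Ada strings commonly start at 1) */
    int last;   /* upper bound; last < first means an empty string */
} StringBounds;

typedef struct {
    const char         *data;    /* the characters, NOT null-terminated */
    const StringBounds *bounds;  /* separate record holding the bounds */
} FatString;

/* Copy the characters into a freshly allocated, null-terminated C string.
   The caller owns the result and must free() it. Returns NULL if the
   allocation fails. */
static char *fat_string_to_c(const FatString *s)
{
    size_t len;
    char  *copy;

    assert(s != NULL && s->bounds != NULL);
    len = (s->bounds->last >= s->bounds->first)
              ? (size_t)(s->bounds->last - s->bounds->first + 1)
              : 0;
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    if (len > 0)
        memcpy(copy, s->data, len);
    copy[len] = '\0';
    return copy;
}
```

The point of the sketch is that the length comes from the bounds record rather than from a terminating null, which is why ap_StringValue is not appropriate for these strings.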

17.26 How can I just log some data and format it as hex?

This is an example of an APC file to log a buffer's worth of data and format it as hex.

// Example APC file to demonstrate logging a block of data and
 // formatting it as hex.
 
 // Use this macro to provide a buffer and length of data you wish to log
 // and be formatted as hex. e.g. LogAsHex (MyBuffer, 100);
 #define LogAsHex(B,L)                              \
 log (((ap_Byte *) ((ap_Byte *) B)) [0 .. ((L)-1)], \
      (ap_Uint32) (L),                              \
      (ap_Uint32) (B)) with HexFormat
 
 // Buffer is the actual data, Length the length and StartAddress the
 // address of the data at runtime.
 static void HexFormat (ap_Byte    *Buffer,
                        ap_Uint32  *Length,
                        ap_Uint32  *StartAddress)
 {
    ap_Uint32 PrintAddress;
    ap_Uint32 EndAddress;
 
    // We start printing at the first 16 byte boundary below StartAddress
    // which might be below where we actually need to show characters. So
    // we check if we are in range before printing a character
    PrintAddress = *StartAddress & 0xfffffff0;
    EndAddress = *StartAddress + *Length;
 
    while (PrintAddress < EndAddress)
    {
       int i;
 
       // Print out the hex bytes
       printf ("%08x: ", PrintAddress);
       for (i = 0; i < 16; i++)
       {
          // Check we're in range
          if ((PrintAddress + i) < *StartAddress ||
              (PrintAddress + i) >= EndAddress)
          {
             printf ("  ");
          }
          else
          {
             printf ("%02x", Buffer [PrintAddress - *StartAddress + i]);
          }
 
          if (i && i % 4 == 0)
          {
             printf ("  ");
          }
       }
 
       // Print out the ascii
       printf ("   ");
       for (i = 0; i < 16; i++)
       {
          // Check it's in range
          if ((PrintAddress + i) < *StartAddress ||
              (PrintAddress + i) >= EndAddress)
          {
             printf (" ");
          }
          else
          {
             ap_Byte c = Buffer [PrintAddress - *StartAddress + i];
 
             // Is this a printable character?
             if (c >= 32 && c <= 127)
             {
                printf ("%c", c);
             }
             else
             {
                printf (".");
             }
          }
       }
 
       printf ("\n");
       PrintAddress += 16;
    }
 }
 
 // This is an example of using the above log mechanism - the first
 // parameter must be an address (e.g. an array, a pointer, etc.). The 2nd
 // parameter is the number of bytes.
 probe thread
 {
    probe "fred()"
    {
       on_entry LogAsHex ($1, $2);
    }
 }
 

A C file follows to test it with:

void fred (const char *Buffer, int Length)
{
   ;
}

int main (int argc, char *argv[])
{
   char Buffer [100];
   int  i;

   for (i = 0; i < 100; i++)
   {
      Buffer [i] = (char) i;
   }
   fred ((const char *) Buffer, 100);
   return 0;
}
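
To experiment with the row layout outside of Aprobe, the formatting step can be tried in plain C. This is a minimal sketch, not the HexFormat routine above: the function name format_hex_row and its "hex bytes, then |ascii|" layout are assumptions made for illustration, and it formats a single row into a caller-supplied buffer instead of printing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format up to 16 bytes of buf as "hh hh ... |ascii|" into out.
   Missing bytes in a short row are padded with spaces so the ASCII
   column always starts at the same offset.
   Returns the number of input bytes consumed. */
static size_t format_hex_row(const unsigned char *buf, size_t len,
                             char *out, size_t out_size)
{
    size_t n = (len < 16) ? len : 16;
    size_t pos = 0;
    size_t i;

    assert(buf != NULL && out != NULL);
    assert(out_size >= 16 * 3 + 16 + 3);  /* hex column, ascii, bars, NUL */

    for (i = 0; i < 16; i++) {
        if (i < n)
            pos += (size_t)snprintf(out + pos, out_size - pos, "%02x ", buf[i]);
        else
            pos += (size_t)snprintf(out + pos, out_size - pos, "   ");
    }
    out[pos++] = '|';
    for (i = 0; i < n; i++) {
        unsigned char c = buf[i];
        /* 127 (DEL) is treated as non-printable here */
        out[pos++] = (c >= 32 && c < 127) ? (char)c : '.';
    }
    out[pos++] = '|';
    out[pos] = '\0';
    return n;
}
```

A caller would loop, advancing the buffer pointer by the returned count and printing each row, until all bytes are consumed.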

17.27 How do I log information about each thread as it starts?

You log the Thread ID using a format routine that prints information about it, since the information, especially the thread entry point, may not be available on_entry to the thread:

void PrintThreadInfo(ap_ThreadIdT *ThreadIdPtr)
{
  printf("Thread %d: ", *ThreadIdPtr);
  ap_PrintSymbol(
     ap_AddressToSymbol(
        ap_ThreadEntryPoint(*ThreadIdPtr)));
}

probe thread
{
  on_entry
  {
     log(ap_ThreadId()) with PrintThreadInfo;
  }
}

Note that the thread entry point symbol will probably be a system function.

17.28 GNAT turns SIGSEGV into CONSTRAINT_ERROR; can I use Aprobe to get a core dump?

Yes. Here's a probe which stubs (disables) the call the GNAT runtime makes to sigaction() to register a signal handler. This allows the default action to occur when the signal occurs.

#include <signal.h>

probe thread
{
   probe "sigaction()" in "libthread.so"
   {
      ap_BooleanT Stubbed = FALSE;

      on_entry
      {
         if ($1 == SIGSEGV)
         {
            printf ("Stubbing sigaction(SIGSEGV)\n");
            Stubbed = TRUE;
            ap_StubRoutine;
         }
      }
      on_exit if (Stubbed) $0 = 0;
   }
}
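
For comparison only: when you are able to modify and rebuild the program, a similar effect can be had in plain POSIX C by resetting the signal's disposition to the default after the runtime has installed its handler. This is a different technique from the stubbing probe above, and the helper names restore_default_action and has_default_action are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Restore the default disposition for signo.
   Returns 0 on success, -1 on error. */
static int restore_default_action(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    sa.sa_handler = SIG_DFL;
    sa.sa_flags = 0;
    if (sigemptyset(&sa.sa_mask) != 0)
        return -1;
    return sigaction(signo, &sa, NULL);
}

/* Report whether signo currently has the default disposition. */
static int has_default_action(int signo)
{
    struct sigaction current;

    assert(signo > 0);
    if (sigaction(signo, NULL, &current) != 0)
        return 0;
    return current.sa_handler == SIG_DFL;
}
```

The advantage of the probe is that it needs no source change or rebuild; the C version is shown only to make clear what "allowing the default action" means.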

17.29 How can I get Aprobe actions to happen when my program dumps core?

First, you should be running with sigsegv.ual: it will provide a traceback and exit actions in these cases. If you want to add additional exit actions, such as a predefined probe snapshot, see [#q15.6 Q15.6], or you can copy and extend $APROBE/probes/sigsegv.apc to build your own probe.

17.30 Is there a way to find out where a signal occurs when it doesn't cause a core dump?

The sigsegv.ual predefined probe will log a traceback for the following signals:

  • 3 - SIGQUIT
  • 4 - SIGILL
  • 8 - SIGFPE
  • 10 - SIGBUS
  • 11 - SIGSEGV
  • 15 - SIGTERM
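
The numbers in the list above are platform-specific; on Linux x86, for instance, SIGBUS is 7 rather than 10. A small C helper that maps through the <signal.h> macros avoids hard-coding them. This is an illustrative sketch, and the function name traced_signal_name is hypothetical rather than part of Aprobe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Return the symbolic name for the signals listed above, or NULL for
   any other signal. The comparison uses the <signal.h> macros, so the
   mapping follows whatever numbers the local platform assigns. */
static const char *traced_signal_name(int signo)
{
    assert(signo >= 0);
    if (signo == SIGQUIT) return "SIGQUIT";
    if (signo == SIGILL)  return "SIGILL";
    if (signo == SIGFPE)  return "SIGFPE";
    if (signo == SIGBUS)  return "SIGBUS";
    if (signo == SIGSEGV) return "SIGSEGV";
    if (signo == SIGTERM) return "SIGTERM";
    return NULL;
}
```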

17.31 How can I reduce the overhead of my probes?

The most obvious way is to use #pragma nofloat in probes that don't use floating point; this eliminates the need to save/restore floating point registers. See also Aprobe Performance Considerations in Chapter 4 of the Aprobe User's Guide.

probe thread
{
  probe "your_routine"
  {
    #pragma nofloat
    // Your probes
  }
}

17.32 Can I use Aprobe on JOVIAL or Fortran programs?

Yes, but there will be no "debug" information found, so you won't be able to use named target expressions (e.g., "$x", "$*") or do on_line probes. Furthermore, no type information is available for parameters, etc., like "$1".

17.33 How can I log a composite object without using debug information?

Declare or #include a C type that maps to the structure you want, then cast your target expression to a dereference of a pointer to this C type. For example:

typedef struct
{
   int Field1;
   float Field2;
} MyStruct;

probe thread
{
  probe "foo"
  {
    on_entry
    {
      if (((MyStruct *) $1)->Field1 > 0)
      {
        log(*((MyStruct *) $1));
      }
    }
  }
}

or perhaps a bit cleaner is:

probe thread
{
  probe "foo"
  {
    on_entry
    {
      MyStruct *Param1 = (MyStruct *)$1;
      if (Param1->Field1 > 0)
      {
        log(*Param1);
      }
    }
  }
}

17.34 How can I cast a value to a type name from the program?

"I have part of my program without debug info, but I know the type of a parameter passed in that "no debug" part, and furthermore, I know that the type name is defined in a part that does have debug info. How can I cast an "unknown-type" parameter to the known type name?"

This is similar to the previous question, except instead of defining the type in your APC, refer to the type in your program by its name and file, wrapped in "typeof", within your probe declarative part, as follows:

probe thread
{
  probe "foo"
  {
    typedef typeof($(MyStruct, "-file debug_part.c")) MyStruct;
    on_entry
    {
      MyStruct *Param1 = (MyStruct *)$1;
      if (Param1->Field1 > 0)
      {
        log(*Param1);
      }
    }
  }
}

17.35 Is there a special editor or editor mode for APC?

No, but it's pretty close to C. The C mode for Emacs, Lemmy, or other editor works pretty well. Contact OC Systems if you think we should put work into this.

17.36 How do I execute a probe only if a certain data condition is met?

In Aprobe version 2, you could do something like:

 probe .outer_routine
    on entry
       if $r3 = 3 then
          probe .inner_routine
            null; -- inner_routine stuff
          end probe;
       end if;
    end probe;

Probes in Aprobe2 were executable, but in Aprobe3 they are declarative. You declare a named probe, and make explicit calls to enable or disable it. For example:

probe thread
{
  probe "outer_routine"
  {
    // Note that this probe has a name "InnerProbe"
    probe "inner_routine"
    {
      ; // Inner routine stuff
    } InnerProbe;

    // Entry to outer_routine
    on_entry
    {
      if ($param1 == 3)
      {
        // We can enable or disable the probe
        ap_EnableProbe (InnerProbe);
      }
      else
      {
        // Disable the inner probe
        ap_DisableProbe (InnerProbe);
      }
    }
  }
}

17.37 How can I interactively modify the parameters to a routine in my application?

The basic approach is simple. In the little C example "t.c" below, main() calls Test() every 5 seconds, passing to it a float and an integer. Subprogram "Test" prints these values out. In t.apc we put a probe on Test, and replace the parameters with values we retrieve from the environment. The trick is how to retrieve values from the environment.

One obvious way is to prompt to stdout and read from stdin. This may work for some applications, but not many. A more general approach is to check whether a user created a file "Test.cfg" in the directory where the program is run and, if so, read the new parameter values with a call to fscanf(). This approach works pretty well as long as the overhead of the fopen() call on entry to "Test" is acceptable. When it is not, one could move this call somewhere else and store the new values in global APC variables.

Note that this "read-a-file" approach can be used for a wide range of program interaction. One could simply use the presence of a file as a "switch" to enable or disable certain probes.

t.c
#include <stdio.h>
#include <unistd.h>   /* sleep() */

void Test(float parm1, int parm2)
{
  printf("Test(%f,%d)\n", parm1, parm2);
}

int main(void)
{
  while(1)
  {
    Test(0.0, 0);
    sleep(5);
  }
}
t.apc
#include <stdio.h>

#define CONFIG_FILE "Test.cfg"

probe thread
{
  probe "Test"
  {
    on_entry
    {
      FILE *fd = fopen(CONFIG_FILE, "r");

      if (fd != NULL)
      {
         // We have a file with new values
        float Parm1;
        int   Parm2;
        fscanf(fd, "Test(%f,%d)", &Parm1, &Parm2);

        // Now update the target parameters with new values
        $parm1 = Parm1;
        $parm2 = Parm2;
        fclose(fd);
        remove(CONFIG_FILE);
      }
    }
  }
}
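
The "file as a switch" idea mentioned above needs nothing more than a check for the file's existence. Here is a minimal C sketch of that check; the helper names switch_file_present and consume_switch_file are hypothetical, and in a probe you would call something like them from an on_entry action.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if the switch file exists and can be opened for reading,
   otherwise 0. */
static int switch_file_present(const char *path)
{
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* One-shot variant: report the switch and then delete the file, so the
   action fires once per file creation (as t.apc does with remove()). */
static int consume_switch_file(const char *path)
{
    if (!switch_file_present(path))
        return 0;
    remove(path);
    return 1;
}
```

The same overhead caveat applies: each check costs an fopen() call, so it is best kept out of very hot routines.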

17.38 I'm trying to stub a function called by my program, but APC can't seem to find it.

The Ada code looks like:

    function Plock (N : in Types.Integer_T) return Types.Integer_T;
    pragma Import (C, Plock, "plock");

Plock is a system call that locks or unlocks a process, text, or data in memory. I get a warning message from apc stating: Function "....plock[1]" not found in the modules(s) provided to apc. I also get an error message from apc stating: Could not resolve function name: "......plock[1]"

plock() is a system function - it is not defined within your application. The following will work:

probe thread
{
  probe "plock()" in "libc.so"
  {
    on_entry ap_StubRoutine;
    on_exit $0 = 0;   // Or whatever return you want
  }
}

17.39 I only want to probe malloc() if it's called by realloc(). How would I do that?

Here's one way, which also illustrates some other useful idioms.

#define MyCallerFunctionId              \
   ap_SymbolToFunction(                 \
      ap_AddressToSymbol(               \
         ap_LocationAddress(            \
            ap_CallerLocation(          \
               ap_CurrentLocation))))

#define NamedFunctionId(SYMBOL,MODULE)  \
    ap_SymbolToFunction (               \
      ap_SymbolNameToId(                \
        ap_ModuleNameToId (MODULE),     \
        SYMBOL,                         \
        ap_NoName,                      \
        ap_FunctionSymbol))

probe program
{
  int MallocCalls = 0;
  int ReallocCalls = 0;

  ap_FunctionIdT ReallocFunctionId = NamedFunctionId("realloc()", "libc.so");

  probe thread
  {
    int NestingLevel = 0;

    probe "malloc()" in "libc.so"
    {
      #pragma nofloat
      on_entry
      {
        ap_FunctionIdT CallerFunctionId = MyCallerFunctionId;

        if (! ap_FunctionIdsEqual(CallerFunctionId, ReallocFunctionId))
        {
           MallocCalls++;
        }
      }
    }

    probe "realloc()" in "libc.so"
    {
      #pragma nofloat
      on_entry
        ReallocCalls++;
    }
  }

  on_exit // from program:
  {
    log("Heap statistics on program exit");
    log("-------------------------------");
    log("Number of calls to \"malloc()\"  => ", MallocCalls);
    log("Number of calls to \"realloc()\" => ", ReallocCalls);
  }
}

17.40 I have a GNAT Ada procedure that I'm stubbing out, but want to return a string value. The procedure has a declaration similar to the one below. What's the APC?

   procedure Read_Foo (File : in  File_Type;
                       Item : out String;
                       Size : out Integer);

For routines like this, although the Item is an out parameter, GNAT implements it as if it were an in parameter (but modifiable) since the bounds of the string must already be set. The following probe shows an example of changing this:

static const char *NewString = "Aprobe string";

probe thread
{
  probe "read_package.read_foo"
  {
    on_entry
    {
      sprintf ((char *) $item.P_ARRAY, NewString);
      ap_StubRoutine;
    }
    on_exit
    {
      $return.size = strlen (NewString);
    }
  }
}

17.41 Is there a simple probe that just traces the lines in one routine?

The following gives output similar to:

   MyPackage.MyRoutine line: 120
   MyPackage.MyRoutine line: 122

when formatted:

probe thread
{
  // Replace your name here
  probe "MyPackage.MyRoutine"
  {
    on_line (all)
    {
      log ("MyPackage.MyRoutine line: ",
      ap_StringValue (ap_LineIdToNumber (ap_CurrentLineId)));
    }
  }
}

17.42 How do I reference enumeration literals in APC?

Here is an example:

a.cpp
#include <iostream.h>
#define VALUE satu

enum TYPE { sund, mond, tues, wedn, thur, frid, satu };

int main (void)
{
  TYPE bar = satu;
  cout << "Hello World\n";
}
a.apc
probe thread
{
  probe "main"
  {
    on_line (11)
    {
      if ($bar == $satu)
      {
        log ("Match");
      }
      else
      {
        log ("No Match");
      }
    }
  }
}

If the enumeration literals are defined in a class, you can qualify them. So for:

class a
{
  enum TYPE { sund, mond, tues, wedn, thur, frid, satu };

  private:
    TYPE bar;

  public:
    void seta(){ bar = VALUE; }
};

a test;

You could use

  if ($test.bar == ($("a::satu")))

17.43 Why does including <math.h> in my APC keep it from compiling? (I want to call the "pow()" function in my probe.)

The problem here is that "log" is an Aprobe directive and it is also defined as a function in the mathematical library. So, you need a small workaround to use any function other than 'log' from the mathematical library. Here is an example:

#undef log               /* 1. undefine definition in aprobe.h */
#include <math.h>        /* 2. process math.h */
#undef log               /* 3. remove math.h's log define (AIX) */
#define log aPl          /* 4. restore aprobe's definition */

probe thread
{
  probe "main"
  {
    on_exit
    {
      log("pow(2,3) = ", pow(2,3));
    }
  }
}

The workaround is to add the preprocessor lines numbered 1 through 4 above.

If you need to use the math.h log function in an APC file, omit steps 3 and 4 above and use 'aPl' instead of Aprobe's log operation everywhere thereafter. That is:

#undef log               /* 1. undefine definition in aprobe.h */
#include <math.h>        /* 2. process math.h */

probe thread
{
  probe "main"
  {
    on_exit
    {
      aPl("log(2.0) = ", log(2.0));
    }
  }
}

In either case, when compiling your APC file on Unix, you must pass the linker flags "-lm" as follows:

apc xxx.apc -linker -lm

because linking any routines from the libm.a library requires the -lm flag.

You can see the macros for the keywords that Aprobe uses (e.g., #define log aPl) at the top of aprobe.h, preceded by #ifdef APROBE_KEYWORDS, which is only defined when the file is being processed by the APC compiler.

17.44 How do I query an environment variable from within a probe?

Call getenv() , as in the following example:

#include <stdlib.h> /* defines getenv() */
ap_NameT LOG_LEVEL = NULL;

static ap_BooleanT IsSevereLogLevel()
{
  return LOG_LEVEL && (strcmp(LOG_LEVEL, "severe") == 0);
}

probe program {
  on_entry
    LOG_LEVEL=getenv("LOG_LEVEL");  /* can set LOG_LEVEL to NULL */

  probe thread
  {
    probe "main()"
    {
      on_entry
        if (IsSevereLogLevel()) printf("Severe\n");
    }
  }
}

17.45 The above looks like a useful utility. How can I structure my probes so it can be shared?

Here's one way, if your "utility" is pure C and doesn't use aprobe stuff.

  1. Write "loglevel.h", "loglevel.c" in the obvious way, e.g.
loglevel.h
extern void InitializeLogLevel(void);
extern ap_BooleanT IsSevereLogLevel(void);


loglevel.c
#include <stdlib.h>                /* defines getenv() */
#include <string.h>                /* defines strcmp() */
#include <aprobe.h>                /* defines ap_NameT */

static ap_NameT LOG_LEVEL = NULL;

void InitializeLogLevel(void)
{
    LOG_LEVEL = getenv("LOG_LEVEL");  /* can set LOG_LEVEL to NULL */
}

ap_BooleanT IsSevereLogLevel(void)
{
  return LOG_LEVEL && (strcmp(LOG_LEVEL, "severe") == 0);
}
  2. Compile loglevel.c into loglevel.o. If you #include aprobe.h, just put $APROBE/include in your include path:
cc -c -I$APROBE/include loglevel.c
  3. Write the probe:
t.apc
#include "loglevel.h"
probe program
{
  on_entry InitializeLogLevel();

  probe thread
  {
    probe "main()"
    {
      on_entry
        if (IsSevereLogLevel()) printf("Severe\n");
    }
  }
}
  4. Compile the probe, referencing loglevel.o:
apc -g t.apc loglevel.o

17.46 Can I define functions in one APC file and call them from another APC file?

Yes. See also [#q17.22 Q17.22] and [#q17.23 Q17.23] . This is how our predefined probes are structured. The difference is that you must provide both UALs on the aprobe command-line. One could restructure the above example like so:

  1. Define the header file:
loglevel.h
extern ap_BooleanT IsSevereLogLevel(void);
  2. Write the probe:
loglevel.apc
#include <stdlib.h>                   /* defines getenv() */

static ap_NameT LOG_LEVEL = NULL;

// the externally callable function:
ap_BooleanT IsSevereLogLevel(void)
{
  return LOG_LEVEL && (strcmp(LOG_LEVEL, "severe") == 0);
}

// initialization of data accessed by the above function:
probe program
{
  on_entry
    LOG_LEVEL = getenv("LOG_LEVEL");  /* can set LOG_LEVEL to NULL */
}
  3. Compile the probe into loglevel.ual, exporting IsSevereLogLevel:
apc -g loglevel.apc -e IsSevereLogLevel
  4. Write the "client" probe:
t.apc
#include "loglevel.h"
probe thread
{
  probe "main()"
  {
    on_entry
      if (IsSevereLogLevel()) printf("Severe\n");
  }
}
  5. Compile the client probe, referencing loglevel.ual:
apc -g t.apc loglevel.ual
  6. When you run an application, you need both t and loglevel:
aprobe -u t -u loglevel my_program

17.47 I am trying to write a probe that will call an Ada routine in a package body, but the routine never seems to get called. Why?

Presumably because the probe on that function is not triggered. That's because we disable probes whilst in an entry action. This is pretty easy to understand given an example. Suppose you have the following probe:

probe thread
{
   probe "printf()" in "libc.so"
   {
      on_entry printf ("We're in printf\n");
   }
}

Obviously if Aprobe didn't do anything specific, you would end up in an infinite loop: Your code would call printf() which would call the entry action for printf which would call printf which would call the entry action ... So what we do is disable the probes while you're in an action. That way the call to printf() from your probe wouldn't trigger the probe on printf itself.
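
The guard described above can be modelled in a few lines of C. This is only a conceptual sketch: disable_count is a hypothetical stand-in for the per-thread count Aprobe maintains, and traced_function plays the role of a probed routine whose entry action calls the routine itself.

```c
#include <assert.h>

static int disable_count = 0;  /* hypothetical stand-in for the per-thread count */
static int actions_run   = 0;  /* how many times the entry action has run */

static void traced_function(void);

/* The "probe action": it calls the very function it is attached to. */
static void entry_action(void)
{
    actions_run++;
    traced_function();          /* would recurse forever without the guard */
}

static void traced_function(void)
{
    assert(disable_count >= 0);
    if (disable_count == 0) {
        disable_count++;        /* probes off for the duration of the action */
        entry_action();
        disable_count--;        /* probes back on */
    }
    /* ...the function's real work would go here... */
}
```

Each outer call runs the action exactly once; the nested call made from inside the action sees a non-zero count and skips it.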

In your example you are calling a routine while probes are disabled so the probe on that routine doesn't get triggered. Of course you can manually turn probes on yourself (although it is then your responsibility that you won't allow an infinite loop). The description of this in aprobe.h was improved in version 3.1.7, to the following:

These two routines

     extern void ap_IncrementDisableProbesCount (ap_ThreadContextPtrT);
     extern void ap_DecrementDisableProbesCount (ap_ThreadContextPtrT);

can be used to turn off / on probes for the thread. Normally when a probe is hit, Aprobe disables further probes in the thread for the duration of the action. This is to prevent recursive loops (for instance imagine if a probe on "printf()" called "printf()" and we did nothing about it). Sometimes you may want to temporarily enable probes. For instance, suppose on_entry to routine A you make a call to another routine in your application (say B) which calls routine C. You have a probe on C which you want to happen. You could bracket the call as follows:

on_entry
      {
         // Turn on probes before the call
         ap_DecrementDisableProbesCount (ap_ThreadContextPtr);
         // Make the call
         $B (1, 2, 3);
         // Turn probes back off
         ap_IncrementDisableProbesCount (ap_ThreadContextPtr);
      }

So, your probe becomes:

probe thread
{
    probe "test.adb":"test.x[1]"
    {
      on_entry
        ...
      on_exit
      {
        ...
        ap_DecrementDisableProbesCount (ap_ThreadContextPtr);
        $("test.y[1]");
        ap_IncrementDisableProbesCount (ap_ThreadContextPtr);
      }
    }
}

17.48 How can I log a string passed to a library function like strdup() where there's no debug information?

In the absence of debug information all parameters would be assumed to be of type 'int' and only positional ($1, $2, etc.) references will be allowed.

If you know the type of such a parameter you can cast it to the right type. The strdup() function doesn't have debug information, but you can still compile and use the following apc file:

probe thread
{
   probe extern:"strdup()" in "libc.so"
   {
   on_entry
      log("strdup(", ap_StringValue($1), ")");
   }
}

Note that ap_StringValue is a macro which among other things casts the argument to a string.

For a complete list of subprograms that you can probe in shared libraries do:

aprobe -u info -p -sa <your_executable_here>

It is best not to mix apc code that relies on debug information with the apc code that should compile without it. This way when you compile the apc code that doesn't require debug info you may omit the -x option altogether and you would not have any warnings from the apc compiler.

17.49 Can I use Aprobe to change the command run by a call to system() from my application to run my own little script instead?

Yes: replace the parameter to system() with a path to your script. In this example, the new path fits in the space occupied by the old. Imagine the possibilities...

my_ls.apc
// change these 2 lines to work on a different command:
static char cmd_to_change[] = "/bin/ls";
static char my_script[]     = "/tmp/my_ls ";

probe thread
{
  probe "system()" in "libc.so" // or libc.a(shr.o) for AIX
  {
    ap_NameT new_command = NULL;

    on_entry
    {
      char *command = (char *)$1;

      // for debugging, give some info about where we are:
      log("system() called with ", ap_StringValue($1));
      ap_LogTraceback(99);

      // make sure we only replace the right command
      {
        char *cmdpos = strstr(command, cmd_to_change);
        if (cmdpos == command)
        { // replace it
          char *argstring = command + strlen(cmd_to_change);
          new_command = ap_CatenateStrings(my_script, argstring, NULL);
          $1 = (int)new_command;
          log("*** changed to: ", ap_StringValue($1));
        }
      }
    }

    on_exit
    {
      // indicate the return code for the command:
      log("system() returns ", $0);
      // free our string:
      ap_StrFree(new_command);
    }
  }
}
my_ls script
echo "MY_LS: --->"
ls -ltF
echo "<---- MY_LS"
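
The string handling in the probe (match the command at the start, keep the arguments, substitute the new path) can be tried on its own in C. This sketch uses malloc and memcpy in place of ap_CatenateStrings; the function name replace_command_prefix is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* If command starts with old_prefix, return a newly allocated string in
   which that prefix is replaced by new_prefix and the rest (the argument
   string) is kept. Returns NULL if the prefix does not match or the
   allocation fails. The caller must free() the result. */
static char *replace_command_prefix(const char *command,
                                    const char *old_prefix,
                                    const char *new_prefix)
{
    size_t old_len, new_len, rest_len;
    char  *result;

    assert(command != NULL && old_prefix != NULL && new_prefix != NULL);
    old_len = strlen(old_prefix);
    if (strncmp(command, old_prefix, old_len) != 0)
        return NULL;

    new_len  = strlen(new_prefix);
    rest_len = strlen(command + old_len);
    result = malloc(new_len + rest_len + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, new_prefix, new_len);
    memcpy(result + new_len, command + old_len, rest_len + 1); /* copies NUL */
    return result;
}
```

Matching only at the start of the string mirrors the probe's "cmdpos == command" test, so a command that merely mentions the path as an argument is left alone.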

17.50 Is there a way to catch and suppress exceptions?

Suppressing C++ exceptions is supported on AIX Aprobe only. The syntax is:

probe "fred"
{
  on_exit
      if (ap_ProbeActionReason ==
        ap_CppExceptionPropagated)
          ap_SuppressException;
}

You can catch exceptions in the on_exit section of your probes. To catch exceptions all you have to do is to distinguish between a normal exit from your subprogram and an exception exit from it as both would trigger your probe's on_exit actions. For example, if subprogram "fred()" may leave via exception you could test for this as follows:

  probe thread
  {
    probe "fred()"
    {
      on_exit
        switch(ap_ProbeActionReason)
        {
          case ap_AdaExceptionPropagated:
          case ap_CppExceptionPropagated:
            log("Exception exit from fred()\n");
        }
    }
  }

If you need to, you can find other action reasons defined in aprobe.h.

The example above works well when you know where the exception may be raised. When you don't know, you can log all exceptions raised in your program using the following probe:

probe thread
{
   ap_LogExceptionsInThread;
}

There are also other macros for this: ap_PrintExceptionsInThread and ap_PrintAndLogExceptionsInThread. These are all defined in aprobe.h.

17.51 Can I track stack usage with Aprobe?

A probe to track stack usage is available here for [stack_usage_aix.apc AIX]. This should be easily extended for Linux.

17.52 Is there a way to access local variables that doesn't depend on a hard-coded line number?

Yes. Function-relative line numbers are supported using an expression consisting of a constant offset from the special values 'first' and 'last'. For example:

   probe "Outer"
   {
      // Assume that 30 is the relative line number for the next line
      // after the call to Inner
      on_line (first + 30)
      {
         $i = 99;
      }
   }

To be sure you're using the right value, you'll have to know the probe-able lines in your function (see [#q17.61 Q17.61]). The offset is then the difference between the first probe-able line and the probe-able line you want (e.g., if the first line is 12, and you want line 22, then probe on_line (first + 10)).

Now if the file changes, your probe will still work unless you modify Outer (which is obviously less of a concern since that's the one you're working with anyway).

17.53 Can I use Aprobe to query a caller's local data that wouldn't be visible by normal visibility rules?

What you might want to do is hold the address of the variable and then change that.

   probe thread
   {
      int *i;

      probe "Outer"
      {
         on_line (first)
         {
            // Store the address of i
            i = &$i;
         }
      }

      probe "Inner"
      {
         on_entry
         {
            // Change the value of i
            *i = 100;
         }
      }
   }
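
The same idea in plain C, as a sketch: one function publishes the address of its local, and a callee that could not normally see that local writes through the saved pointer. The names outer, inner, and saved_i are hypothetical stand-ins for the probed routines and the probe's "int *i".

```c
#include <assert.h>
#include <stddef.h>

static int *saved_i = NULL;   /* plays the role of "int *i" in the probe */

/* What the probe on "Inner" does on entry: write through the saved address. */
static void inner(void)
{
    assert(saved_i != NULL);
    *saved_i = 100;
}

/* What on_line(first) in "Outer" does: remember where the local lives. */
static int outer(void)
{
    int i = 1;
    int result;

    saved_i = &i;
    inner();
    result = i;
    saved_i = NULL;           /* the address is invalid once outer() returns */
    return result;
}
```

The saved address is only valid while the owning function is still active, so a probe that stores it should not use it after that function has exited.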

Obviously this is harder for types that aren't straight integers, etc. The typeof expression can be useful here:

   probe thread
   {
      typeof ($("myrecordt", "-file types.ads")) *RecordPtr;
   }

17.54 In APC I can reference some class members as fields of class objects, but others I cannot. Why?

Here are some general limitations and workarounds for accessing class data and methods:

  1. Class static data is not part of the object; it is a global and is referenced using a qualified name, like
          $("Screen::nNumScreens")
     If you're unsure of the full name of a static data item you can use:
          apinfo -d myprog.exe
  2. A class object is always called $this within a method. However, static class methods do not have a $this argument.
  3. To see what's really in the class object, use "log(*$this);" on_entry to a method.
  4. If you're unsure of the full method name in class "Class", you can use
          apcgen -L <dll-or-exe> | grep "Class::"
     or
          apinfo -sa myprog.exe | grep "Class::"

Here's a simple example:

////////////////////////////////////////////////////////////
// TestStatic.apc
////////////////////////////////////////////////////////////
probe program
{
  on_entry
    printf ("  p. Static1.exe execution has started\n");
  on_exit
    printf ( "  p. Static1.exe execution has completed\n");

}
probe thread
{
  probe "Screen::Screen"
  {
    on_entry
      printf ("  p. New screen has been constructed!\n");
  }
  probe "Screen::~Screen"
  {
    on_entry
      printf ("  p. A SCREEN HAS BEEN DESTRUCTED!\n");
  }
  probe "Screen::Update(void)"
  {
    on_entry
    {
      printf ("  p. A screen update has started!\n");
      printf ("  p. Within Update, Current nNumScreens =%d\n",$Screen::nNumScreens);
    }
  }
  probe "Screen::GetNumScreens(void)"
  {
    on_entry
    {
      printf ("  p. GetNumScreens has started!\n");
      printf ("  p. Current nNumScreens = %d\n",$Screen::nNumScreens);
    }
  }
  probe "main()"
  {
    on_entry
      printf ("  p. Main() has been started!\n");
  }
}

17.55 How can I enable and disable probes externally while my program runs?

You can do this by periodically checking for the existence of a file and enabling the probe when the file is found. The probe can delete the file itself if you want a single-action check, or you can delete it yourself when you want to disable the action again. The example below checks every 15 seconds and toggles the probe each time the file appears:

static ap_BooleanT MemsetProbeEnabled = FALSE;

probe thread
{
   probe extern:"memset()"
   {
   // We are not using floating point registers.
   // Use nofloat pragma to avoid saving them and
   // speed things up a little.
   #pragma nofloat
   on_entry
      if (MemsetProbeEnabled)
      {
         // Log parameters, traceback, etc.
      }
   }
}

#define CONFIG_FILE "/tmp/memset.cfg"

static void PeriodicAction(void *EP)
{
   FILE *fd = fopen(CONFIG_FILE, "r");

   if (fd != NULL)
   {
      // Toggle the value of MemsetProbeEnabled
      MemsetProbeEnabled = !MemsetProbeEnabled;

      fclose(fd);
      remove(CONFIG_FILE);
   }
}

probe program
{
on_entry
   ap_DoPeriodically(
      PeriodicAction,
      15, // interval in seconds
      NULL);
}
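
The file-check part of this technique is ordinary C and can be tried outside Aprobe. The sketch below is a minimal stand-alone version under stated assumptions: poll_trigger_file() and is_probe_enabled() are hypothetical names, and the caller is expected to invoke the poll function on whatever schedule it likes (in the probe above, ap_DoPeriodically plays that role).

```c
#include <stdbool.h>
#include <stdio.h>

/* Flag that a (hypothetical) probe action would consult. */
static bool probe_enabled = false;

/*
 * Check once for the trigger file. If it exists, flip the flag and
 * delete the file, so that creating the file once causes exactly one
 * toggle. Returns true when a toggle happened.
 */
bool poll_trigger_file(const char *path)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return false;          /* no request pending */

    fclose(fp);
    probe_enabled = !probe_enabled;
    remove(path);
    return true;
}

/* Accessor used by the code that decides whether to do the work. */
bool is_probe_enabled(void)
{
    return probe_enabled;
}
```

With this arrangement, creating the file from a shell (for example with touch) switches the action on at the next poll, and creating it again switches it off.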

17.56 AIX: How do I convert my pre-version-3 APC file to current one?

Aprobe version 2, which was delivered with OC Systems' PowerAda and OATS products as well as being sold separately for C and C++, was fundamentally different in its processing and expression of APC.

The best way isn't to "convert" at all, but to understand what the probes in your old APC file are trying to do, read the current documentation about Aprobe, and then write a probe to do the same thing in Aprobe version 4. This answer will just enumerate a few of the key differences, and rely on you to look in the user's guide for details:

Version 2 Aprobe was available only for the AIX platform, and used low-level AIX register and symbol names. Aprobe versions 3 and newer support multiple platforms.

In Version 2, "aprobe" actually compiled each APC file at run-time. In Version 4, you use the new `apc' program to compile the APC file(s) into a linkable UAL file, and name the UAL files on the aprobe command line.

Version 4 APC is C with a few extra keywords. Version 2 was an invented language based on Ada syntax. So, for example, instead of case $r3 is ... you'd write switch($$r3) { ...

In Version 4 there's an underscore to make "on entry" one word: "on_entry", "on_exit", "on_line".

In Version 2 you could write

probe .sym1, .sym2, on entry ....

In Version 4 each probe can name only one symbol, but there is the new concept of a "probe type" or "typedef probe" which may be defined and then applied to many symbols. So you'd do

typedef probe { on_entry ... } CoolProbeT;
CoolProbeT Sym1Probe("sym1");
CoolProbeT Sym2Probe("sym2");

In Version 2 APC there were only registers ($r3). In Version 4 you can reference parameters by position ($1, $2, etc.) and the return value on_exit as $0, not to mention accessing program variables by their source names.

Because version 2 APC was so low-level, there was another tool "apgen" which read an "apg" file that supported a few operations on source-level variables and generated APC to access them. In Version 4, you can reference a source-level name anywhere, provided that name is available in the debug information of the executable provided to the apc compiler.

In Version 2 a `format' was required for each log statement, and was a special syntax that could be named or unnamed in-line. In version 4 a format routine is just a C routine which can be automatically generated based on the types of the `log' arguments.

Here is a reply given to a customer who asked this question:

> It is my understanding that the new aprobe is more "C like" than "ADA like".
>Beyond that, I could use a little help.

That's true - it is basically ANSI C with some extra keywords. I take it you have gone through the examples in $APROBE/examples/evaluate to get yourself acquainted with the syntax? If not do that first and then come back to your larger problem.

> I wasn't sure if the [Aprobe v2] words format and bytes were aprobe terms.

Yes they are. In v2 you had `format start' and `format finish'. These have been made consistent with all other probes in v4, so you would use:

probe format
{
  on_entry
  {
    // Put the equivalent actions to the format start here
  }

  on_exit
  {
    // Put the equivalent actions to the format finish here
  }
}

The bytes operator was a v2 thing. In v4 you would express the code in
terms of C so you would probably use char [] :

  on_entry
  {
    char CmdText [200];
  }

> I wasn't sure about the $function.

This is where v4 is much better than v2. Since you are writing your probes in C, you can just include the header files and call the functions directly. For instance, you wish to call the `creat' function. All you need to do is:

// Include the header files
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

probe format
{
  on_entry
  {
    int fd;
    fd = creat ("Filename", 0644);
  }
}

and the same for access, system, printf, sprintf, write, etc. You'll find your probes look much better!

> The probe part I did I am pretty sure is wrong.

`probe' is quite different and you have to account for the different names used by the newer compiler and hardware, if any. Here's an example which will be close:

probe thread {
   // I'm guessing on the name here: If you have trouble finding the
   // routine, run `aprobe -u info.ual -p -s <exe name> > syms' and all of
   // the routines will be placed in the syms file.
   probe "Queuing_Services.Read_From_Q[2]"
   {
      // Store the parameters on entry since the registers aren't
      // available on_exit
      int   SrNum = $2;            // Second parameter was $r4 on AIX
      int   Length = $3;           // Third parameter was $r5 on AIX
      char *Data = (char *) $4;    // Fourth parameter was $r6 on AIX

      on_exit
      {
         // Log the data
         log (SrNum, Length, Data [0 .. Length - 1]) with DitFormat;
      }
   }
}

Your format routine should be defined above this; in v4 they are regular C routines but the important thing is that they take pointers to the data, so:

void DitFormat (int *SrNum, int *Length, char *Data)
{
   // Do your processing here
}

A couple of comments on the new file: It is recommended that you use C++ style comments (//), since they are less error prone, unless you wish to keep code common with some existing C code.

Make sure your format routine only has pointers for its parameters.

Hope this helps - like I said, make sure you understand how to write simple probes, logs and logs with formats and then you should be fine to tackle this exercise.

17.57 (Unix) Is there a probe to see when my application "exec's" another program?

[faq_exec.apc Here] is the source for a probe that should do the trick. It will record calls to all of the exec routines, including the file, calling user/group IDs, file user, group and mode information and the environment. It was written for Solaris but should work on other Unixes.

To compile, just save to your local disk and do apc faq_exec.apc.

To use this probe you will need to have a new or existing workspace for the process you want to watch. Then either,

  • Copy the exec.ual file into the workspace directory, or
  • Use the Add Ual option from the Setup menu in the main RC window. In the "Ual file" field type or browse to the exec.ual file. Uncheck the copy UAL checkbox and (optionally) give it a description like "Record exec calls" and click OK. The UAL will be listed in the list on the left hand side of the window; check the checkbox for it and click Build.

Although the first option is simpler, using Add Ual will make it easier to turn on or off later.

Now rerun, format and look for the exec calls. If necessary, the probe can be expanded to record parameters to help identify a particular call.

17.58 How can I cast an enumeration value to print its numeric value?

Yeah, the "obvious" direct cast doesn't work. The trick is to get that byte into something you can safely cast. The reliable way to do it is as shown below.

probe thread {
  probe "qts.write_to_q" {
    on_exit
    {
       /* this doesn't work: log("rc=", $rc, "=", (int)($rc)) */
       char rc_val = *(char *)&$rc;
       log("rc=", $rc, "=",  (int)rc_val);
    }
  }
}

17.59 How can I detect memory overwrites on dynamically allocated (malloc'd) memory?

A crash can happen because memory allocated using malloc() or its variants is being corrupted by code that writes past the end (or before the beginning) of the memory that's returned, corrupting malloc's internal pointers or adjacent data.

The predefined "memcheck" probe detects this by putting a "fence" at the end of allocated memory, and checking the fence is intact when the memory is freed: see [#q15.18 Q15.18].
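
To make the "fence" idea concrete, here is a minimal sketch in plain C. It is not the memcheck probe's implementation, which is not shown in this FAQ; fenced_malloc(), fenced_free(), the fence pattern and the header layout are all assumptions made for illustration. A known byte pattern is placed just past the user's block and compared when the block is released.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FENCE_SIZE 8

/* Assumed fence pattern; any unlikely byte sequence would do. */
static const unsigned char FENCE[FENCE_SIZE] =
    { 0xDE, 0xAD, 0xBE, 0xEF, 0xDE, 0xAD, 0xBE, 0xEF };

/* Header kept in front of the user's block so the size is known at
 * free time. The extra members only force a conservative alignment. */
typedef union {
    size_t      user_size;
    long double align_ld;
    long long   align_ll;
    void       *align_ptr;
} fence_header;

/* Allocate size bytes followed by a fence. Returns NULL on failure. */
void *fenced_malloc(size_t size)
{
    unsigned char *raw;

    if (size > SIZE_MAX - sizeof(fence_header) - FENCE_SIZE)
        return NULL;

    raw = malloc(sizeof(fence_header) + size + FENCE_SIZE);
    if (raw == NULL)
        return NULL;

    ((fence_header *)raw)->user_size = size;
    memcpy(raw + sizeof(fence_header) + size, FENCE, FENCE_SIZE);
    return raw + sizeof(fence_header);
}

/* Release a block from fenced_malloc().
 * Returns 1 if the fence is intact, 0 if it was overwritten. */
int fenced_free(void *ptr)
{
    unsigned char *user = ptr;
    fence_header  *hdr;
    int            intact;

    if (ptr == NULL)
        return 1;

    hdr = (fence_header *)(user - sizeof(fence_header));
    intact = memcmp(user + hdr->user_size, FENCE, FENCE_SIZE) == 0;
    free(hdr);
    return intact;
}
```

Note that a check of this kind reports the damage only when the block is freed, not at the moment of the overwrite, and this sketch guards only the end of the block.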

17.60 How do I know when my application has forked?

You can use ap_AddNewProcessCallback to register a callback that is invoked when Aprobe detects a new process. Pass it a handler, which will be called in the child process. For instance:

static void MyNewProcessHandler (ap_ThreadContextPtrT ThreadContext)
{
   log ("Here is my new process");
}

17.61 How do I know what lines I can probe in a function?

The most reliable way is to use:

apcgen -qlines -p function_name -x module_name

This generates an on_line section for each line in the given function. You can redirect the output to a file and edit the file with your on_line actions.

For an executable module you can use:

apinfo -l exe_name

which lists all the symbols and their lines, if any. This output is simply the "raw" line information, sorted by code offset, so is not as useful for writing probes, though the output may be a good reference for use with test coverage or a debugger.

17.62 Is there a routine available to find symbol ids by mangled name, or one that will demangle for us?

You can generally pass a mangled name as the name to ap_NameToSymbolId() and you'll get the correct Symbol ID. However, there is also the following (defined in aprobe.h, of course):

extern void ap_Demangle(
   ap_DemangledNameT *Result,
   ap_NameT          MangledName,
   ap_BooleanT       IsSubprogram,
   ap_CompilerKindT  CompilerKind);

Here is an example of how to use it:

{
   ap_DemangledNameT DemangledName;

   ap_Demangle(
      &DemangledName,
      ".sec_fdk_Nam_Svc_Def__ELAB",
      TRUE,
      ap_AIXpa4_CompilerKind);


   // Now we can use DemangledName.FullName
   SymbolId =
      ap_SymbolNameToId(
         ap_ApplicationModuleId(),
         DemangledName.FullName,
         ap_ExternSymbol,
         ap_FunctionSymbol);
}

17.63 Is there a way to suppress (or force) the warning when probing a symbol that is undefined?

Yes, this was introduced in RootCause 2.1.3/Aprobe 4.3.3 (February 2004). The way to do it is specify #pragma optional in column 1 immediately inside the probe (or typedef probe), for example:

probe thread {
   probe extern:"PrintDebug()" {
#pragma optional
    ...
   }
}

Conversely there is also a #pragma required which forces a warning in the case where the module is undefined. By default, a warning is not generated on probes on missing modules. For example:

probe thread {
   probe extern:"open()" in "libpthread.so" {
#pragma required
   }
}

would force a warning if libpthread.so was not among the libraries loaded by the application.

This was possible prior to version 2.1.3 but was harder since it required use of a typedef probe and programmatic checking and instrumentation using the Aprobe API. (See for example AllocationFunctions[] array in memwatch.apc.)

17.64 Can I call a C++ method from a probe?

Yes, if:

  • You know the full method name, and
  • You have a this pointer for that method's class available (or else the method is static).

In these cases, you call it just like a C function (see [#q17.21 Q17.21]), except that you pass this as the first parameter. For example, suppose you have a class that looks something like this:

class Example {
public:
   void doIt(const string& s);
   void debugIt(const string& s);
};

And you want to call debugIt() on entry to doIt(). The following works:

probe thread {
  probe "Example::doIt" {
    on_entry {
      $("Example::debugIt")($this, &($s));
    }
  }
}

(Note the & when passing the string parameter: APC automatically dereferences reference parameters, so you need to "restore" the reference.)

But obviously this is a very simple example. In many real cases you have template instances with long and subtly different names. In such cases, you can use apcgen -vL to list the methods in an individual object file and "grep" for the methods you're looking for and try to match up the line numbers.

When you have dynamically dispatched calls, you are limited to methods in common base classes, or else you need to use some conditional test to determine which specific method to call.

Often the best choice is to use a separate extern "C" C++ module as an interface between your probe and the call, as described in [#q20.8 Q20.8].

As always, if you have problems or questions, contact support@ocsystems.com.

17.65 How do I print/change a C++ std::string object?

This is provided by the predefined probe cppstring.ual and its associated header file $APROBE/include/cppstring.h. You can learn more from the example at $APROBE/examples/predefined_probes/cppstring/

This is a good example of how to combine some simple C++ with a probe to avoid having to reverse-engineer the C++.

18. Writing Java Probes

18.1 How do I use Aprobe on a Java application?

See Chapter 5 of the Aprobe User's Guide.

18.2 Can I change the return value of a Java function?

Yes. Here's a simple application, a probe, and the xmj file:

// The application Simple.java
public class Simple
{
   int doIt ()
   {
      return 10;
   }

   public static void main (String[] args)
   {
      System.out.println ("doIt returns " + new Simple ().doIt ());
   }
}

// The probe SimpleProbe.java
public class SimpleProbe extends com.ocsystems.aprobe.ProbeMethod
{
   public Object onExit (Object returnValue)
   {
      return new Integer (11);
   }
}

<!-- The xmj file simple.xmj -->
<probe_deployment>
   <probe class="SimpleProbe" parameters="readonly">
      <target value="Simple::doIt"/>
   </probe>
</probe_deployment>


$ javac Simple.java
$ javac -classpath $APROBE/lib/aprobe.jar SimpleProbe.java
$ apjava -u simple.xmj -java Simple
doIt returns 11

18.3 Can I throw an arbitrary Java exception from my probe?

Unfortunately not. Java requires that all exceptions, other than RuntimeException and its descendants, be declared by the method or caught. We cannot specify that the base Aprobe Patch class throws a specific exception because that would require that all methods that called it would have to either catch the exception or specify that it throws it. However, you can throw any RuntimeException.

18.4 When using a Java custom probe, can I get output to appear in the Trace Display tree?

Yes there are a few ways:

  1. Use the methods in com.ocsystems.aprobe.Logger to log objects (including strings).
  2. Use the com.ocsystems.aprobe.TraceBean.logComment method to log a comment. You'll get an exception if you have de-selected trace for the run because you are calling a native method directly.
  3. Write some custom apc to go along with the custom java; have the custom apc define specific format routines for the logged data and export some native methods; have the probe bean call the exported native methods. Needless to say this option is about as advanced as you can get and we don't really document it. No user has got to the stage of doing it yet. If you are there, contact support@ocsystems.com.

18.5 Is it possible to "stub" a Java method so it does not execute the code in the original method?

Yes, starting with RootCause version 2.1.3a (April 2004). To stub a method, simply call the stub() method at the end of the onEntry probe method, for example:

import com.ocsystems.aprobe.*;
public class TestProbe1 extends ProbeMethod
{
   public boolean onEntry(Object[] parameters)
   {
      stub();
      return true;
   }
}

18.6 Is there any way to probe classes from rt.jar, e.g., java.io.*?

Sorry but you cannot probe any classes in the bootpath, which includes rt.jar. This is a limitation basically imposed by the JVM because you cannot call methods which are not in the bootpath from within bootpath classes. That is, you could never apply a probe because that class would be in the child's class loader so the parent wouldn't have visibility. In informal discussions with engineers in Sun's JVM group they said it was a bad limitation of the JVM because it made bytecode patching, which was a "preferred" technology, very difficult.

We have kicked around the ideas of having a bridge to native code in the bootpath classes and then the native code calling the probes but the technical issues are difficult.

For some problems, instead of probing these classes it's possible to probe the native methods underneath. For example, probe the file access routines in the libc library rather than the java.io methods.

18.7 How do I call another method in the same class instance from within my Java method probe?

The 'this' object is the first parameter (params[0]). So if you're probing a method in class SquareID, and you want to call otherMethod() there, then it'd be something like:

...
  SquareID id = (SquareID) params[0];

  id.otherMethod();
  return true;
...

Note that the code has to import the SquareID class, too:
import SquareID;

See Custom Java Probes in the RootCause Java user guide for more basic information.

18.8 Can I add custom Java probes within the RootCause GUI?

No. Most or all of it must be done from the command line. In a GUI you can click on "Custom" button in the setup options, but this would only bring up a help dialog with the instructions on how to set the XMJ and the corresponding Java code. You can cut and paste from this dialog to create your .xmj file in the workspace. After that, you would probably only use the workspace and intercept mechanism to deliver your probes to the application in an automated fashion. You could apply these probes directly to your application using the apjava command. RootCause just hides this from the user of the application.

18.9 Can I change the value of parameters passed to a Java method?

Yes, starting with RootCause version 2.1.3a (April 2004). There are two parts:

  1. In the deployment descriptor XML file, indicate that the parameters are read/write (not the default of read-only):
<?xml version="1.0" encoding="UTF-8"?>
<probe_deployment>
   <probe class="TestParamsProbe">
        <target value="ParamsTester::callIt(java.lang.String,boolean)"
                parameters="readwrite" />
   </probe>
</probe_deployment>

  2. In the probe itself, simply assign new Objects to the params vector:
import com.ocsystems.aprobe.*;
public class TestParamsProbe extends ProbeMethod
{
   public boolean onEntry (Object[] params)
   {
      // params [0], the 'this' parameter, can't and won't be changed.
      params [1] = new String ("This is a new string");
      params [2] = new Boolean (true);

      return true;
   }

   public Object onExit (Object returnValue)
   {
      int value = ((Integer) returnValue).intValue ();

      return new Integer (value + 1);
   }
}

18.10 Can I log any Java variables other than method parameters?

The Variables pane in the RootCause Trace Setup dialog only supports logging Java parameters (all or none). In a custom probe, you can access individual parameters by position, and the return value. From a custom Java probe, you can access public class data just as you would from another class in your Java application. There is no access to method local data or class private data.

18.11 Is there a way to define nested probes in Java similar to that supported in APC?

Yes. In APC you'd write something like:

   probe "a()" {
     probe "b()" {
       on_entry
          do_something();
     }
   }

For Java it's not quite as clean as with APC because of the split between the probes in Java and the definition in XML. The file Example14.java has two Probe Methods; the MyUmbrellaProbe is the equivalent of the "a()" in the above example. It creates a new MyNestedMethodProbe probe (i.e., "b()") in its onEntry method. The file Example14.xml is the probe deployment descriptor. We just define both targets in it. Note that you don't specify the hierarchy in the XML: it's defined by the Java probe.

19. Logging Data

19.1 What's the difference between "logging" and "printing"?

Printing you understand. You call "printf()" or "puts()" and it displays what you passed to it directly to standard output (or some other file if you used fprintf()) as soon as the call is executed.

Logging, as implemented by the "log" directive in APC, is more complicated. It writes the data you specified within the parentheses to a memory-mapped APD file, and associates a "format routine" with that data. The format routine is not called, and the data is not displayed, until later when the "apformat" command is run over the APD file.

Another important difference between printing and logging is that the Aprobe log mechanism is lock-free, whereas printing requires a lock to get exclusive access for the printing thread. This gives a significant advantage to the log operation in multi-threaded applications where performance and deadlock are considerations.

19.2 Why do I get data mismatch warnings logging to my very simple format routine?

All parameters to a format routine must be *addresses*. So if you do

        log((int) x) with myformat;

then you must have

        static void myformat(int *i) { ... };

If you had declared "myformat(int i)" then you would get a warning from the C compiler invoked from `apc'.

19.3 Why do my format routine parameters (usually) have to be pointers to the type logged?

The short answer is, "Because that's how it works." There are two real reasons. The first has to do with the whole logging/formatting concept. Data is copied to a memory-mapped file when logged. When formatting, we memory-map the APD file. To pass the data to the format routine directly, we'd have to allocate temporary space of the right size and copy it again.

It's much more elegant to pass everything -- scalars, structs, and arrays -- by pointer. That way, when you log an `int' value, you write it to the APD file, and when you format it, you just pass its address in the memory-mapped apd file directly to the format routine. This allows ints, arrays of ints, and structs to all work the same way.

The second reason is related to the first, and has to do with the fact that C doesn't have an array "type", but rather treats any adjacent locations in memory as an array. Here's what our chief designer has to say on this subject:

When designing the APC extensions such as 'log' statements we had to make sure that they would work with any data types, including scalars, structs and arrays. It was array types that gave us the most problems, mostly due to the fact that C has very little support for arrays.

Even though one can declare an array with a given number of elements, such declarations are limited as to where they can appear (e.g. you can not use a pointer to an array declaration inside of a formal parameter list) and operations for array types are essentially the same as operations for pointer types.

Now consider these 2 log statements below:

int foo[10];

log(foo[0]) with MyFormat;
log(foo[0..9]) with MyFormat;

The format for the first log statement could have used 'int' like you suggest, but what about the second log statement? Of course, we could have treated the first log statement differently from the second one, since the first one clearly logs one element, while the other logs a range of elements. If we did so we would use 'int' in the format declaration for the first 'log' statement and 'int *' for the second. But even so, you would still have cases like this:

log(foo[0..0]) with ...        // Do you use 'int *' here or 'int'?
log(foo[Var1..Var2]) with ...  // We don't even know the number of elements here.

The requirement that all formats use pointers to the data as arguments allowed us not to make any distinction between the way we log scalars and arrays. If this seems confusing to you, you can always use a simpler interface, where you don't have to provide any formatting routine at all.

log("foo[0] => ", foo[0]);

If this doesn't make sense to you, you are not alone. Some of us didn't like the way this had to be done either, unfortunately no one came up with a better solution than the one we have right now. If you have such suggestions, feel free to share them with us.
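
The design argued for in this answer can be illustrated with a small model in plain C. This is only a sketch: the in-memory array stands in for the memory-mapped APD file, and log_ints(), format_record() and sum_format() are hypothetical names, not Aprobe API. What it shows is that data is copied once at log time, and that at format time the routine simply receives the address of that data, so one routine serves a single int and a range of ints alike.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* A format routine receives the address of the logged data. */
typedef void (*int_format_fn)(const int *data, size_t count,
                              char *out, size_t out_len);

typedef struct {
    int_format_fn format;   /* routine associated with the data */
    size_t        offset;   /* where the data starts in log_data */
    size_t        count;    /* number of ints logged */
} log_record;

#define LOG_INTS    64
#define LOG_RECORDS 16

static int        log_data[LOG_INTS];      /* stands in for the APD file */
static log_record log_index[LOG_RECORDS];
static size_t     log_used;
static size_t     log_count;

/* "Log time": copy the values and remember which routine formats them.
 * Returns 0 on success, -1 if the model's buffers are full. */
int log_ints(const int *data, size_t count, int_format_fn format)
{
    if (log_count == LOG_RECORDS || count > LOG_INTS - log_used)
        return -1;

    memcpy(&log_data[log_used], data, count * sizeof *data);
    log_index[log_count].format = format;
    log_index[log_count].offset = log_used;
    log_index[log_count].count  = count;
    log_used += count;
    log_count++;
    return 0;
}

/* "Format time": pass the address of the stored data, with no copy. */
void format_record(size_t i, char *out, size_t out_len)
{
    if (i >= log_count) {
        if (out_len > 0)
            out[0] = '\0';
        return;
    }
    log_index[i].format(&log_data[log_index[i].offset],
                        log_index[i].count, out, out_len);
}

/* One routine handles both foo[0] and foo[0..9]. */
void sum_format(const int *data, size_t count, char *out, size_t out_len)
{
    long   sum = 0;
    size_t k;

    for (k = 0; k < count; k++)
        sum += data[k];
    snprintf(out, out_len, "count=%lu sum=%ld", (unsigned long)count, sum);
}
```

Logging &foo[0] with a count of 1 and foo with a count of 10 both go through sum_format unchanged, which is the uniformity the pointer convention buys.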

19.4 How can I control the size of the APD file produced?

This is specified as a parameter to the aprobe command. By default there is a single 256M file. You can specify the number of files (see the next Q.) and/or the maximum size of each file. You set the maximum size of each file (in bytes) with "-s n_bytes". You set the number of files with "-n num_files", where num_files must be in the range 0-9. If you specify 0, all logged output is discarded. If you specify 2 or more, but don't explicitly set the size with "-s", the maximum size is set to 2 megabytes.

19.5 What is an "APD ring"?

The "APD ring" is how the aprobe logging mechanism deals with large quantities of data. By default there's a single APD file produced by aprobe, with a maximum size of 256 MB. If you try to log more than that, the last (newest) data is lost.

If you specify more than one file, the files conceptually form a "ring" so that the most recent data is always kept, and the oldest data is lost. The ring is really more like a fixed-length stack where data falls off the bottom when additional data is pushed onto a full stack.

Details are described under "APD File" in Appendix B (Files Reference) of the Aprobe User's Guide.
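
The multi-file behavior can be pictured with a toy model in plain C. This is a sketch under stated assumptions, not Aprobe's implementation: each "file" holds a fixed number of int records instead of bytes, a file is emptied as a whole when the ring comes back around to it, and apd_ring, ring_log() and ring_dump() are hypothetical names. It does not model the single-file case.

```c
#include <stddef.h>

#define RING_FILES       3   /* like asking for three APD files */
#define RECORDS_PER_FILE 4   /* stands in for the per-file size limit */

typedef struct {
    int    records[RING_FILES][RECORDS_PER_FILE];
    size_t used[RING_FILES];   /* records currently held by each file */
    size_t current;            /* index of the file being written */
} apd_ring;

void ring_init(apd_ring *r)
{
    size_t f;

    for (f = 0; f < RING_FILES; f++)
        r->used[f] = 0;
    r->current = 0;
}

/* Append one record. When the current file is full, move to the next
 * file in the ring and discard whatever it held (the oldest data). */
void ring_log(apd_ring *r, int value)
{
    if (r->used[r->current] == RECORDS_PER_FILE) {
        r->current = (r->current + 1) % RING_FILES;
        r->used[r->current] = 0;
    }
    r->records[r->current][r->used[r->current]++] = value;
}

/* Copy the surviving records, oldest first, into out.
 * Returns the number of records copied. */
size_t ring_dump(const apd_ring *r, int *out, size_t out_len)
{
    size_t n = 0, step, k;

    /* The file after the current one is the oldest; current is newest. */
    for (step = 1; step <= RING_FILES; step++) {
        size_t f = (r->current + step) % RING_FILES;

        for (k = 0; k < r->used[f] && n < out_len; k++)
            out[n++] = r->records[f][k];
    }
    return n;
}
```

Logging the values 1 through 14 into this model leaves 5 through 14: the first file's worth (1 through 4) was dropped when the ring wrapped, while the most recent data is always retained.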

19.6 How can I control what goes into each APD file?

You can't log data to whatever file you want, but you can register a callback routine that is called whenever the logging mechanism changes to a new file in the ring. This is illustrated by the example in $APROBE/examples/learn/apd_ring included with Aprobe.

19.7 How can I reduce the time that is spent logging data in my probes?

See the section "Log Statement Overhead", under "Aprobe Performance Considerations", in Chapter 4 of the Aprobe User's Guide.

19.8 How can I log data so it's guaranteed to be available when I format, even if the APD ring wraps around?

The appropriate place for such data is the persistent APD file. You can log to it like this:

   log (...) with blahformat to ap_PersistentLogMethod;

Since the persistent file is always formatted first this would mean that you would get your data earlier than you would if you logged to the apd files, in the format on_entry part.

20. Other Aprobe Questions

20.1 Where does aprobe get its "time" from (e.g., for the profile probe)?

On AIX, Aprobe reads the realtime clock directly using read_real_time, then converts to ap_TimeT using time_base_to_time, both defined in sys/time.h.

On Linux, Aprobe just calls gettimeofday() defined in sys/time.h.
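
If you want to compare Aprobe's timings with your own measurements on Linux, reading the same clock is straightforward. The sketch below shows one way to call gettimeofday() and reduce the result to a single microsecond count; the helper names are hypothetical, and how Aprobe itself converts the value to ap_TimeT is not shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Convert a struct timeval to microseconds since the Epoch. */
int64_t timeval_to_usec(const struct timeval *tv)
{
    return (int64_t)tv->tv_sec * 1000000 + (int64_t)tv->tv_usec;
}

/* Read the wall clock. Returns -1 if gettimeofday() fails. */
int64_t now_usec(void)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    return timeval_to_usec(&tv);
}
```

Because gettimeofday() reports wall-clock time, intervals computed from two readings can be distorted if the system clock is adjusted between them.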

20.2 Why do my threads execute in different order under aprobe?

Almost certainly it's timing. Each time a thread is created, aprobe collects some information. This can delay thread creation somewhat and change the order in which threads are executed. Also, your probes take some time, and delay a thread that executes a probe relative to another that does not.

20.3 It looks like if I run "aprobe -if", both the probe program and probe format get executed, which messes up initialization. How can I avoid this?

There's a function ap_CurrentAprobeState() that returns either ap_AprobeRunTime or ap_AprobeFormatTime. So you can do:

   if (ap_CurrentAprobeState() == ap_AprobeFormatTime) { ... } 

in your probe format. This is the preferable way.

  probe program {
    on_entry {
      DumpInfo();
      // Don't run the program.  Exit after printing all the info.
      // (MAGIC exit code tells runtime this is *not* an error)
      exit(APROBE_MAGIC_EXIT_CODE);
    }
  }
  probe format {
    on_entry {
      if (ap_CurrentAprobeState () == ap_AprobeFormatTime)
      {
        DumpInfo();
        /* Don't do any formatting.  Exit after printing all the info. */
        exit(0);
      }
    }
  }

20.4 I have a probe on_exit to a function to change the struct that is returned. It causes a core-dump when the probed function is called as a procedure. What's the problem?

On some architectures, a structure returned by value is written to space on the stack allocated by the caller. However, if the caller is discarding the returned value by calling the function as a procedure, no space is allocated. In this case, a probe which may normally attempt to change the return value should not do so, as it will likely corrupt memory. In order to allow users to handle this problem, the following macro is provided:

#define ap_StructValueReturnExpected private

This would be used as a boolean expression in an on_exit part as follows:

probe "UpdateCoordinates()"
{
on_exit
  if (ap_StructValueReturnExpected)
    $return.x = $return.y = $return.z = 0;
}

20.5 I want to capture the address of a target expression on entry in a pointer to the right target type. How do I declare this?

There are (at least) 3 possibilities, illustrated in the APC file below:

probe thread {
  // Method 1: Use the APC "typeof" operator on the type name directly as a
  //           target expression:
  probe "foo" {
    typeof($MyStructT) *Param1 = &($MyS);
    on_exit {
      log("Param1 => ", *Param1);
    }
  }
  // Method 2: Use the APC "typeof" operator on the target
  //           expression for the parameter name:
  probe "foo" {
    typeof($MyS) *Param1 = &($MyS);
    on_exit {
      log("Param1 => ", *Param1);
    }
  }
  // Method 3: Use the "typeof" operator on the target expression
  //           for the positional parameter.
  probe "foo" {
    typeof($1) *Param1 = &($1);
    on_exit {
      log("Param1 => ", *Param1);
    }
  }
}

This applies whether you're capturing a parameter or global value, or even assigning an APC value to a target expression. The type declaration is the important point here.

Of course target expressions apply only if you have debug information available for the definition of the various names. Otherwise, you must reproduce or include the C type declaration directly in the APC, and reference it there.

20.6 I want to probe a method in a template class. How do I refer to the method in the function probe on that method?

This can be tricky. What you need to do is get a list of all the functions as Aprobe will reference them. The info.ual predefined probe is provided precisely for this purpose, and "apcgen -L" also works. In this case, if your executable were named "myprog.exe", and the method you wanted to probe were called Method, try:

  aprobe -u info -p -s myprog.exe | grep Method

or

  apcgen -Lv myprog.exe | grep Method

This gives each function name which can be probed, and the file and line on which it's declared. This can still be pretty tricky for template instances, but it's the best we have at the moment.

20.7 In what order do separate probes on the same function execute?

A user wrote: "What I want to do is:

probe thread
{
   probe "myfunc()"
   {
      ap_BooleanT IsEnabled = TRUE;
      on_entry
         if (some_expression)
         {
            IsEnabled = FALSE;
            return;
         }
      on_exit
         if (! IsEnabled) return;
      on_entry
         do_something();
      on_exit
         do_something();
   }
}

The first on_entry/on_exit pair would be the wrapper part and would prevent any second on_entry/on_exit pair from executing. Can I count on the first pair executing in order?"

The answer is yes. The on_entry and on_exit parts within a probe execute in lexical order. If you have multiple probes on the same routine, their on_entry parts also execute in lexical order, but their on_exit parts execute in reverse order to ensure proper nesting. The on_entry actions of probe program are executed before those of probe thread, which in turn are executed before those of any subprogram probes.

UALs on the aprobe command-line (or in the RootCause workspace's aprobe script) are initialized in reverse order, i.e., right-to-left. Similarly, two probes on the same function in different UALs are executed right to left, for example:

$ aprobe -u t1 -u t2 t.exe
enter t2:main()
enter t1:main()
exit t1:main()
exit t2:main()
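The nesting shown in this output can be modeled in a few lines of plain C. This is only an illustrative model of the ordering rule, not Aprobe code: the function name trace_probed_call and the trace format are hypothetical, and the array is assumed to list the probes in the order their on_entry parts run.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model: builds the entry/exit trace for one call that is
   wrapped by n probes. names[] lists the probes in on_entry order. */
void trace_probed_call(const char *names[], int n, char *out, size_t cap)
{
    int i;
    out[0] = '\0';
    for (i = 0; i < n; i++) {        /* on_entry parts: first to last */
        size_t used = strlen(out);
        snprintf(out + used, cap - used, "enter %s;", names[i]);
    }
    for (i = n - 1; i >= 0; i--) {   /* on_exit parts: reverse order */
        size_t used = strlen(out);
        snprintf(out + used, cap - used, "exit %s;", names[i]);
    }
}
```

Calling it with the names "t2" and "t1" yields the same enter/exit nesting as the aprobe run above.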

20.8 Is it possible to reference C++ files from my application from within my UAL?

Yes, but it's a bit tricky. On Linux you can specify "-linker g++" and on AIX "-linker xlC" on the apc command line to indicate that the UAL should be linked with the C++ compiler rather than directly with the linker ld. This causes C++ static initialization to run and includes the C++ shared library. This in turn allows you to link an object code archive of already-compiled C++ files with your APC, and use extern "C" { ... } to access it. This is done by the cppstring.ual predefined probe illustrated in $APROBE/examples/predefined_probes/cppstring/.

If your C++ code is already linked into a shared library then you can link to it when you build your UAL, for example:

# link to /path/to/my/lib/libmy.so:
apc my.apc -linker "-L/path/to/my/lib -lmy"

There are probably other ways as well -- at base, a UAL is just a shared library built from gcc-compiled C files.

20.9 Can I force a snapshot of my predefined probe data by sending a signal to my application?

Yes. The following apc code registers for SIGPROF and does a snapshot:

#include <signal.h>
#include "memwatch.h"

static void Handler (int sig, siginfo_t *siginfo, void *ucp)
{
   printf ("Taking snapshot on signal %d\n", sig);
   ap_Memwatch_DoSnapshot ("snapshot signal");
}

probe program
{
   on_entry
   {
      ap_RegisterSignalHandler (SIGPROF,
                                ap_CallBeforeUserAction,
                                Handler);
   }
}

If this file were named memwatch_sig.apc you would compile it with:

apc memwatch_sig.apc memwatch.ual

and then use memwatch_sig.ual instead of memwatch.ual when running. Then send the signal (kill -PROF pid) to generate a snapshot.

20.10 How do I log multi-dimensional Ada arrays?

Aprobe only supports getting one slice at a time -- for the right-most index. For individual elements, therefore, it's trivial:

  log ($available_overlays [1] [1]);

or you could use a single slice:

  log ($available_overlays [1] [1 .. 10]);

Multi-dimensional arrays should scale up fine. Since the arrays are stored contiguously you could cheat and cast it to a one-dimensional array if you're clever about your labeling.
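The "cast it to a one-dimensional array" trick relies on simple index arithmetic. The following plain C sketch shows that arithmetic, assuming row-major storage and zero-based indices; the function names are hypothetical, and for an Ada array you would first subtract each index range's lower bound.

```c
#include <stddef.h>

/* Row-major offset of element (row, col) in a 2-D array that has `cols`
   columns. Indices are zero-based. */
size_t flat_offset(size_t row, size_t col, size_t cols)
{
    return row * cols + col;
}

/* Reads element (row, col) through a one-dimensional view of the same
   contiguous storage. */
int read_flat(const int *flat, size_t row, size_t col, size_t cols)
{
    return flat[flat_offset(row, col, cols)];
}
```

With this mapping, a label for flat element k can be recovered as row k / cols and column k % cols.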

20.11 AIX: Why isn't my UAL world-readable?

The apc command does a `chmod 640' on the UAL it generates after a successful link. This is necessary because the permissions affect how the shared module is loaded at run-time. Here's an excerpt from AIX 'info' output for 'dlopen()', which is the runtime routine used to load UALs when running aprobe:

  • If the module being loaded has read-other permission, the module is loaded into the global shared library segment. Modules loaded into the global shared library segment are not unloaded even if they are no longer being used. Use the slibclean command to remove unused modules from the global shared library segment.

Clearly we don't want individual users' UALs in the global shared library segment: multiple edit/apc/aprobe cycles could result in bizarre behavior, and the slibclean command can only be run by an account with superuser privileges.

20.12 AIX: When I use pthreads calls in my probes, the UAL won't link. Do I need to explicitly specify the library or change my compiler_profiles file?

We strongly advise against linking probes with a thread library since it can cause major problems when run against a single threaded application. The recommended approach on AIX, although a little painful, is to look up the symbol dynamically and call it by pointer. Here is an example for pthread_attr_getstacksize:

// Define a type to map to the routine
typedef int (*pthread_attr_getstacksize_subprogram_T)
   (pthread_attr_t *, size_t *);

// Declare a variable to hold the address
static pthread_attr_getstacksize_subprogram_T
   pthread_attr_getstacksize_subprogram_ptr = NULL;

probe program
{
   on_entry
   {
      pthread_attr_getstacksize_subprogram_ptr =
         (pthread_attr_getstacksize_subprogram_T)
            ap_FunctionPointer (ap_ModuleNameToId (PthreadModuleId (),
                               "pthread_attr_getstacksize()",
                               ap_NoName));

      // Call it - note don't do this on program entry until you have the
      // fix for that!
      if (pthread_attr_getstacksize_subprogram_ptr)
      {
         pthread_attr_getstacksize_subprogram_ptr (&Attributes, &Size);
      }
   }
}

The PthreadModuleId () routine would look something like:

static ap_ModuleIdT PthreadModuleId(void)
{
   ap_ModuleIdT Result;
   /* First, the 4.3 case */
   Result = ap_ModuleNameToId("libpthreads.a(shr_xpg5.o)");

   if (ap_IsNoModuleId(Result))
   {
      /* Didn't find it in shr_xpg5.o, so if we don't find it in shr.o ... */
      Result = ap_ModuleNameToId("libpthreads.a(shr.o)");
      /* ...we'll give back that null result. */
   }
   return Result;
}

On Linux the same approach can be used. In that case the module is always "libpthread.so".

20.13 Is there a way I can manage thread-specific data without using native thread-management routines?

Yes. Defining and referencing thread-specific data is built into Aprobe. Here is an example:

int *GetThreadSpecificInt();

probe thread
{
   int ThreadSpecificItem = 0;

   int *GetThreadSpecificInt()
   {
      return &ThreadSpecificItem;
   }
}

Now you can call the GetThreadSpecificInt() function from anywhere to get hold of the thread-specific data item. This should work equally well on all platforms and is usually much faster than using pthread functions.

You can report or take actions when each thread starts and stops as well:

probe thread
{
   on_entry
     printf("Entering thread\n");
   on_exit
     printf("Exiting thread\n");
}

The predefined probes in $APROBE/probes have many sophisticated examples of this. A simple example is available on Unix platforms in $APROBE/examples/evaluate/5.threads.

20.14 How does using Aprobe for C++ differ from using Aprobe for C or Ada?

These are interesting differences:

  • memory
  • objects
  • mangling
  • exceptions
  • generics/templates

Aprobe tries to make the probe interface common, but language differences may get in the way:

  • C++ applications tend to be much bigger, which can make the RootCause GUI very slow.
  • C++ calls extra procedures, like class constructor/destructor.
  • C++ has objects whose members Aprobe tries to access, with varying degrees of success.
  • C++ has name mangling that Aprobe tries to hide, with varying degrees of success.
  • Aprobe does not support throwing C++ exceptions as it does for Ada exceptions.
  • C++ throws objects with exceptions, while Aprobe only logs the object's address.
  • C++ uses standard templates whose expanded form is all that Aprobe sees.
  • C++ has multiple inheritance whose rules are resolved by the compiler.

These differences can make Aprobe a little harder to use on a C++ application, or a little less satisfying when a probe logs data to be formatted for easy reading. For example, constructors and destructors may get profiled/traced, but most of the time, they just clutter the report; objects may be shown with member addresses instead of member data; mangled names sometimes show in reports or apc input; exception object content may be needed but lacking; output may show the internal form of an expanded template rather than the source form written by the programmer; a probe's references to inherited data may need compilation by the C++ compiler to be right.

OC Systems is developing a strategy whereby C++ can be linked with the probes to circumvent many of these problems--contact us to learn more.

20.15 Why does my C++ application crash when run with Aprobe?

If your application is bigger than, say, 100 MB, chances are that it's running out of memory. You can verify this by running the "apsymbols" command on it, for example, apsymbols c2.eab. If apsymbols crashes, then that's the problem. If it doesn't, the problem might be elsewhere. See [#q13.11 Q13.11].

There are two known reasons why aprobe may cause the application to run out of memory:

  • Demangler memory leaks - On AIX, the IBM C++ runtime is used to "demangle" the C++ symbols. There was a memory leak in all versions before 7.0.0.3. You can use the command "lslpp -l xlC.aix50.rte" on AIX to see what version is installed on your AIX box.
  • Old Aprobe Version - Aprobe creates a symbol table in memory. Prior to version 4.3.4a (RootCause 2.1.4a) it used the same memory as the application itself, and so there would be insufficient memory for both aprobe and the application. Run aprobe -h | head to see what version you have.

See the next question for possible workarounds.

20.16 (AIX) My application, aprobe, or one of its tools runs out of memory. What can I do?

This is a side-effect of having huge C++ applications as described above. On AIX there's a way to give your application more memory. AIX supports a concept called the Large Address-Space Model. This may be applied via an environment variable when running, for example:

   LDR_CNTRL=MAXDATA=0x20000000 c2.eab

or

   LDR_CNTRL=MAXDATA=0x20000000 apformat c2.apd

This allocates two full memory segments (segments 3 and 4, 256 MB each) for your application's data. If you need even more memory you could try 0x30000000, but this may not work at run time because some applications hard-code use of segment 5.
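The relationship between a MAXDATA value and the number of segments is plain arithmetic, given that a segment in the 32-bit AIX address space is 256 MB (0x10000000 bytes). The C sketch below works it out; the function and macro names are hypothetical.

```c
/* Size of one segment in the 32-bit AIX address space: 256 MB. */
#define AIX_SEGMENT_BYTES 0x10000000UL

/* Number of whole 256 MB segments covered by a MAXDATA value. */
unsigned long maxdata_segments(unsigned long maxdata)
{
    return maxdata / AIX_SEGMENT_BYTES;
}

/* The same MAXDATA value expressed in megabytes. */
unsigned long maxdata_megabytes(unsigned long maxdata)
{
    return maxdata / (1024UL * 1024UL);
}
```

So MAXDATA=0x20000000 covers two segments (512 MB), and 0x30000000 covers three, the third being segment 5.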

20.17 My application, aprobe, or one of its tools is very slow starting up. What can I do?

This is, again, because of the huge symbol tables in ERAM C++ programs. The workaround is to use Aprobe's ADI (Aprobe Debug Information) mechanism to pre-construct the symbol table for an executable. Here's how it works:

  1. Assume m2.exe is an executable that still has its symbol table and line information.
  2. Create an ADI file for that executable using the apmkadi command, for example:
       cd /u/m2
       apmkadi -o m2.adi m2.exe
  3. Reference the ADI file just like a UAL, for example:
       aprobe -u m2.adi -u trace.ual m2.exe
  4. Run as you do now.
  5. Explicitly reference the ADI file when you format, for example:
       apformat -u m2.adi m2.apd
  6. If an ADI file of the default name is found in the same directory as the executable, and its checksum matches, it is used automatically.

20.18 (AIX) Why is the C++ exception raised in my libxml++-1.0.a library not reported by exceptions.ual?

The AIX C++ compiler, unlike other compilers, generates a copy of the C++ runtime exception-catching function in every shared library, rather than just the C++ runtime library. Aprobe automatically instruments this function, "__Throw" in the predefined libC.a library, but not in user-provided libraries. For that, you must use a special probe, cppexcmodules.apc, edited to name your library or libraries.

20.19 Why don't my on_line probes work?

This is likely because the code you are probing was compiled with optimization. Check your Makefile to see whether CFLAGS or CXXFLAGS contains -O.

20.20 How do I probe a C application's CPU usage?

Unless you're an Aprobe or RootCause power user, the way to do this is with the statprof predefined probe (Unix platforms only). If possible, use it in an environment where the application terminates normally or with Ctrl-C (but not "kill -9"). Simply put "-u statprof" on the command-line or in the .apo file, and when you format, a table will be generated showing which functions used what percentage of CPU. Details are in the user's guide.

If your application doesn't terminate normally you'll need to force a snapshot, as described [#SNAPSHOT below]. If the output of statprof says something like:

  56.7    0.59     Other functions (not in profiled module)

then you can see the usage throughout all modules by re-running with -u statprof -p -c, where -c means "coarse" and will show the usage of all modules. If the usage was mostly in, say, "libXm.a(shr4.o)" then you can rerun again to analyze just that one with -u statprof -p "libXm.a(shr4.o)".

20.21 How do I probe a C application's memory usage?

Aprobe has predefined uals for watching memory use:

  • memcheck
  • memwatch
  • memstat

Some read configuration files, though a ual will generate a default configuration file if it doesn't read a user file.

The memcheck ual watches for things like spilling over the limit of a memory area. memcheck requires no configuration files and simply checks standard allocation and deallocation routines. It checks the validity of allocated data on normal program termination, memory signal, or explicit request via call to [#SNAPSHOT ap_Memcheck_DoCheckpoint].

The memwatch ual can detect things like unfreed memory accumulating. It doesn't have a configuration file, but requires that the program terminate normally to dump its data. If the program doesn't terminate normally you can use dbx to force a [#SNAPSHOT snapshot].

The memstat probe is used primarily with the RootCause GUI because it requires some configuration, but is much more usable with respect to overhead and analysis. For more details on this probe, see RootCause Memory Tracking Probes on the web site.

20.22 How can I interactively debug an application in real time?

Debugging a real-time application with dbx (or gdb) is usually tricky, because the debugger must attach to the process in real-time. Aside from the problem of hitting a moving target, hitting the target stops the process. With Aprobe, both problems are easily solved using a custom probe. The model for the probe is below, but an introduction to the concept is needed:

The Aprobe solution is to write a probe which monitors for a reason to debug, and forks a copy of the real-time process when the monitor sees a need. The parent process then continues, while the copy stalls itself in the probe so dbx can attach to the copy. Here is the model, and a talk-through follows the model:

#include <sys/types.h>
#include <unistd.h>
probe thread {
 probe "somewhere_where_there_can_be_a_problem" {
   on_line (where_there_can_be_a_problem) { // or on_entry or on_exit
     if (the elusive problem the user is watching for has occurred) {
// here is the guts of the probe
       int normal = $some_reference; // save a normal state, explained below
       pid_t child;
       child = fork();
       if (child) fprintf(stderr,
          "Oops -- such-and-such happened -- gdb xxx %d\n", child);
       else while (++child) {
         if (child > 600) exit(1); // kill if unused in ten minutes
         if ($some_reference==normal) sleep(1); // stay in the probe
         else {$some_reference = normal; break;} // leave the probe
       }
     }
   }
 }
}

The 10-minute stall loop stops counting as soon as dbx attaches. If the user finishes digging and detaches dbx, loop counting would resume and the probe would kill the application copy if the user forgot to kill it. But if the user wants to set breakpoints and resume the application copy out of the stall to debug it, the method is to use dbx-set to change a chosen piece of static data and dbx-continue. The probe sees a state change, restores the saved state, and returns from the probe. This is the only way the throwaway child process would execute beyond the probe.

Debugging the forked process over a breakpointed path goes beyond interactive data digging at the point of a problem, and may not be needed for every problem. If not, there is no need to choose a static integer visible to dbx and the probe.

This "living dump" concept is useful for distributed applications, because the parent application process is unaffected by this probe. The whole distributed operation should be unaffected. Yet the user would have an attachable copy of a troubled process that might have stalled itself while the cause of a problem was still visible. Digging for the problem can be leisurely, since it makes no difference if the parent process continues or ends.

20.23 How do I get the size of my "std::list<std::string>" object generated by g++?

Different compilers have different low-level implementations for these and it's best to just call the C++ size method if possible. This worked on our RH8 gcc 2.95.2 system:

probe thread
{
   probe extern:"::myroutine(void)"
   {
      on_entry
      {
         // The list is in a variable called my_list.
         // We need to call list.size ():
         log
($("list<basic_string<char,string_char_traits<char>,__default_alloc_template<false,0> >,allocator<basic_string<char,string_char_traits<char>,__default_alloc_template<false,0> > > >::size(void)const") (&$my_list));
      }
   }
}

I found the routine's fully qualified name using apsymbols (or apcgen) and grepping for "size".

20.24 What do I do if my program dumps core when run with Aprobe?

For possible reasons for such crashes, see questions [#q13.11 Q13.11], [#q13.14 Q13.14] and [#q20.15 Q20.15]. If you have a core file, keep reading.

The first thing to check is whether any probes you have written are responsible for illegal memory references. These will cause core dumps just as in any C or C++ program. If you have a machine-level debugger installed, you can usually use it to get a stack trace.

On AIX:

 dbx /full/path/of/your-application /path/to/core

On Linux:

 gdb /full/path/of/your-application -c /path/to/core.12345

(That is, the first argument is the name of your executable, and the second is the path to the core file it dropped, which should be in the program's working directory.) Then enter the command where, which gives the stack trace at the point of the core dump.

(On AIX you need to have the bos.adt.debug fileset installed.)

If the stack trace includes a function name which looks like:

 OnExit_0094_L0013(...

then the core dump probably occurred in one of your own probes. Look at the integer in the third part of the name: this is the line number of the 'probe' directive in the .apc file (in this case, 13). You may also see names beginning 'OnEntry' or 'OnOffset'.
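If you have many stack traces to sift through, the line number can be pulled out of such a name mechanically. The C sketch below assumes only the name shape shown above (a prefix such as OnExit, a numeric field, then L followed by the line number); the function name is hypothetical and the middle field is not interpreted.

```c
#include <stdio.h>

/* Extracts the line number from a generated probe function name such as
   "OnExit_0094_L0013". Returns the line number, or -1 if the name does
   not have the <prefix>_<digits>_L<digits> shape. */
int probe_line_from_symbol(const char *symbol)
{
    char prefix[32];
    int middle, line;

    /* %d reads the zero-padded fields as decimal numbers. */
    if (sscanf(symbol, "%31[A-Za-z]_%d_L%d", prefix, &middle, &line) == 3)
        return line;
    return -1;
}
```

For the name shown above this gives 13, the line of the 'probe' directive in the .apc file.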

If the debugger complains that the core file doesn't match your application, you should run:

On AIX:

 dbx $APROBE/bin/aprobe.exe /find/the/core-file

On Linux:

 gdb $APROBE/bin/aprobe -c /find/the/core-file

Send the output of the where command to support@ocsystems.com and it should give us a clue. Remember to state what version of RootCause/Aprobe you are running (this is reported by 'apconfig' or 'aprobe -h | head').

AIX only: slibclean - to correct shared module problems:

Lastly, run 'slibclean' and see if that fixes the problem. 'slibclean' is an AIX utility which removes unused shared modules from the system's memory. It requires root access, but some sites elect to make this application 'setuid' so it can be run by ordinary users.

Allowing full core files

In the event dbx complains about a truncated core file, you should verify that your environment allows full core dumps. This entails the following steps:

  1. Login as the same user that runs the application and run: ulimit -c
  2. If this does not report 'unlimited', then the account ulimit for core files needs to be set with: ulimit -c unlimited
  3. If this command returns an error, contact your sysadmin to adjust the account's 'hard core file limit'. If you are running your application from a login shell, you will need to logout and login again for the change to take effect.
  4. AIX only: check that the operating system allows full core dumps with:
       lsattr -E -l sys0 | grep fullcore
    This should report fullcore true. If not, the sysadmin needs to enable full core files through smitty System Environments->Change / Show Characteristics of Operating System->Enable full CORE dump.
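The same core-file limit that "ulimit -c" reports can also be inspected from inside a program with the standard getrlimit() call. The following C sketch is one way to do that; the function names are hypothetical, and it only covers the per-process limit, not the AIX fullcore attribute.

```c
#include <sys/resource.h>

/* Returns 1 if the soft core-file limit is unlimited, 0 if it is finite,
   and -1 if the limit cannot be read. */
int core_limit_is_unlimited(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return rl.rlim_cur == RLIM_INFINITY;
}

/* Raises the soft core-file limit as far as the hard limit allows.
   Returns 0 on success, -1 on failure. */
int raise_core_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

If the hard limit itself is finite, only the sysadmin can raise it, as noted in step 3.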

21. Licensing

OC Systems spends a surprising amount of time helping users get licensing set up on their machines. Here are a few of the most common questions and answers.

21.1 What do we do with a license key that looks like "ocs-Aprobe-48833..."?

This is a decimal format key for use in the prompt that appears during installation. It is a single text string with no blanks or line breaks.

  • If you haven't yet installed RootCause/Aprobe, go ahead and install it and give this key at the prompt. When installation has completed, RootCause will be ready to use.
  • If you've already got an installation, append the exact single text line to the file $APROBE/licenses/license.dat.

21.2 What do we do with a license key that looks like "FEATURE ..."?

This is a human-readable format key, and can't be used at the prompt that appears during installation.

  • If you haven't yet installed RootCause/Aprobe, go ahead and install it, but give no key. Just confirm that it's OK to proceed without a key. Then proceed to the next step.
  • If you've already got an installation, append the exact text lines given in the mail message to the file $APROBE/licenses/license.dat.

21.3 How do I start a second license server just for Aprobe?

When there is already a license server running on the machine and you want to start another one just for Aprobe, here's how to do it. This applies to all Unix hosts.

The procedure for running a second license server on the same host is very simple.

When you are issued a concurrent-use license for Aprobe, it will include a line like the following:

SERVER my.server.name 000347b371fe

You should amend this line by adding a third parameter to the SERVER directive, which will be the port the license server will listen on for client requests. The server and clients all read this same file. The default port for FlexLM is 27000 but any available port number can be specified. One convention is to use the next available port higher than 27000, for example:

SERVER my.server.name 000347b371fe 27001

This parameter is the only one needed to support multiple flex servers on the same host.
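If you maintain license files with a script or tool, the edit amounts to appending the port to a two-field SERVER line. The C sketch below shows that transformation under the assumption that the line has exactly the form shown above; the function name is hypothetical and it handles nothing else in the license file.

```c
#include <stdio.h>

/* Writes "SERVER <host> <hostid> <port>" to out when line is a SERVER
   line that has a host and hostid but no port yet. Returns 0 on success,
   -1 if the line is not in that form or out is too small. */
int add_server_port(const char *line, int port, char *out, size_t cap)
{
    char host[128], hostid[64], extra[16];
    int fields, written;

    fields = sscanf(line, "SERVER %127s %63s %15s", host, hostid, extra);
    if (fields != 2)            /* not a SERVER line, or port already set */
        return -1;

    written = snprintf(out, cap, "SERVER %s %s %d", host, hostid, port);
    if (written < 0 || (size_t)written >= cap)
        return -1;
    return 0;
}
```

A line that already carries a third field is rejected rather than modified.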

21.4 AIX: How do I start lmgrd when the machine boots?

These instructions apply to any services which need to be started at boot time on AIX, not just lmgrd.

The details of these instructions may or may not be applicable to your own situation, depending on the exact configuration of your systems. You should consult your local policies and support organizations and convince yourself that the suggestions made here are appropriate before putting them into practice.

That said, this is fairly basic stuff.

We will use the 'mkitab' command to add an entry to the /etc/inittab file. This command is used in place of simply editing inittab, because it helps to ensure that the integrity of inittab is maintained. If you were to make even a small error while editing inittab, the system might become unbootable. The mkitab command helps to alleviate this risk.

With root authority, execute the following command:

mkitab -i rcnfs rclocal:2:once:/etc/rc.local

This adds an entry to inittab immediately following the 'rcnfs' entry, which instructs the init program to run /etc/rc.local and not wait for it to complete before proceeding with the rest of system initialization. Because the entry follows rcnfs, /etc/rc.local will probably be able to use services that are available only on NFS filesystems. (This depends on the exact configuration of the system you are installing on; its inittab may not even mention rcnfs, in which case you would need to determine the correct point at which to add your local startup script.)

You should then create the /etc/rc.local file, set its execute permission, and add to it the appropriate commands to start lmgrd and log its output, as well as whatever other site-specific initializations you may need to perform, not limited to OCS products.

Reboot the system and verify correct operation.