IRISA
Celtique Project

Nit: Nullability Inference Tool
-----------------------

Documentation

Table of contents

User documentation
Developer documentation

Installation

Binary for Windows

No installation is needed: simply download the archives and unzip it. To use the Eclipse plug-in, go to the plugin page.

Nit is a command line tool and should be used from the command line or in combination with the Eclipse plugin.

Compilation from source

Under windows: The easier way to compile this software under windows is probably to install the MinGW version of OCaml. You need to install the packages gcc-core, binutils, gcc-mingw-core, mingw-runtime, w32-api, mingw-zlib, gcc, make. You will also need FlexDLL.

  • Install GNU Make (version >= 3.80), OCaml (version >= 3.10) and Zlib.
  • Download the archive, uncompress, untar and cd to the directory (tar jxf nit-<version>.tar.bz2 && cd nit-<version>).
  • Edit Makefile.config if you are under Windows. You can the compile with make. The output is the file nit.opt. If something did not work as expected, you can try do edit Makefile.config.

User documentation

Command line options

Usage: nit [options ... ] --main <string>

Nit infers nullness properties for the reference variables of a Java program and can therefore be used to infer null-ability annotations for fields, local variables, methods parameters and return values.

It takes options but no argument and analyze the program starting from the given main method (with --main option). At least one output option must be given.

Options:

--classpath <string> It takes a list of directories and .jar files concatenated with ':' (or ';' under Windows) where needed classes (including boot strap classes) will be looked for. It should therefore contain the path of your runtime (or bootclasspath in Java). If no --classpath option is given, then the environment variable CLASSPATH is used.
--main <string> Set the main method of the program. It takes as argument a class name (such as foo.Bar) optionnaly followed by a method signature in the JVM (internal) syntax (such as '.m(IIL)V'). If no method signature is given, the default is '.main([Ljava/lang/String;)V', which correspond to the void main(String[]) method. This option must be specified at least once and can be specified more than once.
--html-output <directory> Outputs the result as HTML code in the given directory (including annotated opcodes).
--text-output <f.txt> Outputs the result as text in the given file (including annotated opcodes).
--xml-output <f.xml> Outputs the result as XML in the given file (excluding opcodes, i.e. only method and field annotations).
--anno-file-output <f.jaif> Outputs the result (only method and field annotations and only Nullable/NonNull, ie no raw annotations) in the given file in a format suitable for use with the Annotation File Utilities (http://types.cs.washington.edu/annotation-file-util ities/).
--warnings Output warning for each unsafe dereference on the standard output (stdout).
--version Prints version information and exit.
--class-path <string> DEPRECATED. Same as --classpath.
--debug Prints debugging information (can be used multiple times).
--verbose Prints debugging information (can be used multiple times).
-v Prints debugging information (can be used multiple times).
--quiet Prints less debugging information (can be used multiple times).
--stats Output statistics in CSV format on the standard output (stdout).
--alias-output Prints alias information (when available) in text and HTML output. This option is incompatible with --no-alias-analysis.
--remove-package <string> It takes a comma separated list of package names (e.g. "java,sun" or "java.lang") and removes all classes in these packages from the output. This option can be specified multiple times. Note: these packages are still analyzed, the option just removes the information concerning these packages from the output.
--class-analysis {RTA|CRA} Either RTA or CRA (RTA is the default option). It selects the analysis used to load the program (RTA is much more precise than CRA but is also very sensitive to native methods and may therefore produce incomplete programs) (the type system then add co-variant and contra-variant constraints on types so it does not change the analysis).
--no-compact Avoid compacting the result of intermediate analyses (alias and instanceof) (It can sometimes save time but always costs memory).
--no-alias-analysis Do not perform any alias analysis prior to the (instanceof and) non-null analysis (save time and memory, but less precise). This option is incompatible with --alias-output.
--no-instanceof-analysis Do not perform an analysis of instanceof instructions prior to the non-null analysis (save time and memory, but less precise).
--no-field-annotation Deactivate field annotations. It reduces the precision of the analysis without significantly improving time and memory consumption.
--no-method-annotation Deactivate method annotations. It reduces the precision of the analysis without significantly improving time and memory consumption.
--unsafe-exception Annotate caught exception object as if they were always initialized (NonNull) objects. It improves the results but it is incorrect (as it is possible to throw the current object this in a constructor).
--unsafe-native Annotate native methods as if they were always returning initialized objects (NullableInit). It improves the results but it is incorrect.
--unsafe-static Annotate static fields as if no class initializer could initialize a static field to not-fully initialized value, i.e to a Raw value. It improves the results but it is incorrect.
--unsafe-array Annotate arrays as if none of their cells were ever assigned a Raw value. It improves the results but it is incorrect.
--aggressive-native Annotate native methods as if their return value were always NonNull. It improves the results but it is incorrect.
--aggressive-array Annotate arrays as if all their cell were cells always NonNull. It improves the results but it is incorrect.
--aggressive-static Annotate static fields as if they were always initialized by their class initializer with a NonNull value. It improves the results but it is incorrect.
--aggressive-field Annotate instance fields as if they were always NonNull. It improves the results but it is incorrect.
--aggressive-parameter Annotate method parameters as if they were always NonNull. It improves the results but it is incorrect.
--aggressive-return Annotate return value of methods as if they were always NonNull. It improves the results but it is incorrect.
--aggressive-jsr Consider the code after a JSR instruction as being unreachable. JSR are not handled precisely by this implementation and this option improves the results although it is incorrect.
-help Display this list of options
--help Display this list of options

Output formats

XML output format

In the following grammar, the star (*) means the previous token is present any number of times (including zero). The question mark (?) means the previous token is optional. The plus sign (+) means the previous token is present any number of times and at least once. The parentheses are not tokens, they are here to group tokens together so the following star, question mark or plus applies to the list of tokens.

PROGRAM ::= <nit>CLASS*</nit>
CLASS ::= <class name="NAME">FIELD* METHOD* WARNING*</class>
FIELD ::= <field descriptor="NAME:TYPE">VALUE</field>
WARNING ::= <warning line="INTEGER">WARNTYPE</warning>
WARNTYPE ::= <getfield signature="NAME:TYPE" info="STRING" /> | <putfield signature="NAME:TYPE" info="STRING" /> | <invoke signature="NAME:(TYPE*)RETURN" info="STRING" /> | <arraylength info="STRING" /> | <vaload signature="TYPE" info="STRING" /> | <vastore signature="TYPE" info="STRING" />
METHOD ::= <method descriptor="name:(TYPE*)RETURN">
(<arguments>(VALUE*)</arguments>)?
(<return>VALUE</return>)?
(<this>THIS</this><post>THIS</post)?
THIS ::= <AllDef /> | <Top /> | <Def field="package.class.fs:type" />+
VALUE ::= <NonNull /> | <NullableInit /> | <Nullable /> | <Raw class="package.class" /> | <Raw>

< and > in method names are respectively replaced by ^ and $ in the XML output. Method and field descriptors are composed of the name of the method or field, respectively, and their "type" (or descriptor) as documented in the JVM Specification (Field Descriptors, Method Descriptors).

Annotation File format (produced with option --anno-file-output))

Jaime Quinonez has developed an annotation file format and a tool suite (which can be found at http://types.cs.washington.edu/annotation-file-utilities/) that allows to extract annotations from Java source file or Java bytecode (.class) files to this file format and to insert annotations for a such file into Java source or Java bytecode files. This tool suite is supposed to work on the annotations defined in the JSR 308, which should be part of Java 7. It now seems to be maintained by Michael Ernst.

Text and HTML output formats

The HTML output format is documented with the text output format as they should be identical, except that some class, interface, field and method names are links to their definition and that some parts can be dynamically hidden. Although it might be easier to navigate in a hypertext file, as this file is output as one unique file, it can be huge (hundreds of MB). A good text editor can handle such files easily but web-browsers have much more difficulties.

The text output is the actual bytecode in a human readable manner (disassembled) in a close way to what would output javap -verbose -private on all class files of the program (except for classes in packages given with the --remove-package option).

Field declarations are annotated with annotation starting with @ written just after the field name. Method annotations are written after the field signature. They include the list of parameters, including the self reference in the case of a virtual method. One-word non-reference type are annotated with NonNull, two-word non-reference type are annotated with two NonNull annotations.

Instruction are also annotated with the flow-sensitive information computed by the analysis: the set of fields definitely set at that point (Bot or Top meaning all or none of them, respectively), the annotations for local variables (where last local variables may be omitted if they are all inferred as NonNull), and the operand stack where the left-most element is the peek of the stack.

In addition to those annotations computed by the analysis (the real purpose of the program) and to the information javap -verbose -private usually outputs, there are also some other information outputted to help to navigate in the program and to understand the computed annotations. Their name should be sufficiently meaningful to be understandable without further documentation. Here is the kind of information added for each kind of data.

DataInformation added
Class The direct sub-classes
Interface The direct implementations (sub-classes) and direct sub-interfaces.
Non-abstract method The interfaces that define this method and the classes in which this method is overridden.
Abstract method The class or interface that define a method that this method definition overrides, the class or interface that overrides this method definition and the class that implements this method.
Statistics output format

Statistics can be made on the result of the analysis via option --stats. It counts dereferences in field reads (getfield), field writes (putfield) and virtual method calls (invokevirtual, invokeinterface and invokespecial) of each type (@Nullable, @NonNull, @Raw). It also counts annotations of each type for fields, method formal parameters and return values (including static methods) if the Java type is of type reference (object or array) and if the code is not proved dead (field actually read, method actually called).

The output format is the CSV format where the first line is the legend and the second line contains the data. The column separator is the ';' character. The title of each column is explained here.

Field nameDescription
Name Gives the name of the main class of the program that has been analyzed.
Free Unused.
RAW Fields Number of field annotated as Raw.
MBN fields Number of field annotated as Nullable.
NN fields Number of field annotated as NonNull.
total invokes Total number of method calls taken in account (reachable invokevirtual, invokespecial, invokeinterface and invokestatic).
Raw invokes Number of virtual method calls on reference annotated as Raw.
NN invokes Number of virtual method calls on reference annotated as NonNull.
MBN invokes Number of virtual method calls on reference annotated as Nullable.
invokes Raw Number of method calls that returns values annotated as Raw.
invokes NN Number of method calls that returns values annotated as NonNull.
invokes MBN Number of method calls that returns values annotated as Nullable.
total arguments Total number of actual arguments taken in account (i.e. value of type reference that have been passed as arguments of method calls taken in account). Note: Total arguments = Raw arguments + MBN arguments)
Raw arguments Number of arguments that are annotated as Raw.
MBN arguments Number of arguments that are annotated as Nullable.
total getfields Total number of field read taken in account. Note: Total getfields = getfields on MBN + getfields on Raw + getfields on NN.
getfields on MBN Number of field read on values annotated as Nullable.
getfields on Raw Number of field reads on values annotated as Raw.
getfields of MBN Number of field reads that have return values annotated as Nullable.
getfields of RAW Number of field reads that have return values annotated as Raw.
getfields of NN Number of field reads that have return values annotated as NonNull.
total putfields Total number of field writes that have taken in account. Note: Total putfields = putfields on MBN + putfields on Raw + putfields on NN.
putfields on MBN Number of field writes on values annotated as Nullable.
putfields on RAW Number of field writes on values annotated as Raw.
putfields of MBN Number of field writes where written values are annotated as Nullable.
putfields of RAW Number of field writes where written values are annotated as Raw.
putfields of NN Number of field writes where written values are annotated as NonNull.
total parameters Total number of formal parameters taken in account (i.e. formal parameters of type reference of methods (virtual or static) that have been taken in account).Note: Total parameters = MBN parameters + Raw parameters + NN parameters.
MBN parameters Number of formal parameters annotated as Nullable.
Raw parameters Number of formal parameters annotated as Raw.
total return Total number of return annotation taken in account (i.e. one for each method taken in account if the return type of the method is of type reference). Note: Total return = MBN return + NN return.
MBN return Number return annotated as Nullable.
Raw return Number return annotated as Raw.
MBN ArrayLoad Number of array load on array references annotated as Nullable.
NN ArrayLoad Number of array load on array references annotated as NonNull or Raw.
MBN ArrayStore Number of array store on array references annotated as Nullable.
NN ArrayStore Number of array store on array references annotated as NonNull or Raw.
MBN ArrayLength Number of arraylength instructions on array references annotated as Nullable.
NN ArrayLoad Number of arraylength instructions on array references annotated as NonNull or Raw.
Mem usage (MB) Maximum quantity of memory used by the process during its whole life in megabytes.
Time (s) Quantity of CPU time used by or for the process (user time and system time) in seconds.
Version A string with the compile-time options that were used when building the program and the command line arguments given for that execution. (The version is to identify the statistics, not the software version.)
Last modification: 2009-09-25 19:19:48.000000000 +0200
This website uses Google Analytics