alien method: "From the perspective of a class C, an alien method is one whose behavior is not fully specified by C. This includes methods in other classes as well as overridable methods (neither private nor final) in C itself" [Goetz 2006].

atomicity: When applied to an operation on primitive data, indicates that other threads that might access the data might see the data as it exists before the operation occurs or after the operation has completed but may never see an intermediate value of the data.

big-endian: "Multibyte data items are always stored in big-endian order, where the high bytes come first" [JVMSpec 2013, Chapter 4, "The class File Format"]. This term refers to the tension between Lilliput and Blefuscu (regarding whether to open soft-boiled eggs from the large or the small end) in Jonathan Swift's satirical novel Gulliver's Travels; it was first applied to the question of byte-ordering by Danny Cohen [Cohen 1981].

canonicalization: Reducing the input to its equivalent simplest known form.

class variable: "A class variable is a field declared using the keyword static within a class declaration, or with or without the keyword static within an interface declaration. A class variable is created when its class or interface is prepared and is initialized to a default value. The class variable effectively ceases to exist when its class or interface is unloaded" [JLS 2013, §4.12.3, "Kinds of Variables"].

condition predicate: An expression constructed from the state variables of a class that must be true for a thread to continue execution. The thread pauses execution, via Object.wait(), Thread.sleep(), or some other mechanism, and is resumed later, presumably when the requirement is true and when it is notified [Goetz 2006].
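The waiting pattern described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the Gate class, its opened field, and the demo() helper are hypothetical names introduced here, and the condition predicate is simply "opened == true". The call to wait() sits inside a loop so that the predicate is retested after every wakeup.

```java
// Hypothetical example: the condition predicate is "opened == true".
final class Gate {
    private boolean opened = false; // state variable the predicate is built from

    // Blocks until the predicate holds; the loop retests it after each wakeup.
    synchronized void awaitOpen() throws InterruptedException {
        while (!opened) {
            wait(); // releases the monitor while waiting
        }
    }

    // Makes the predicate true and notifies all waiting threads.
    synchronized void open() {
        opened = true;
        notifyAll();
    }

    // Demo helper: returns true if a waiting thread is released after open().
    static boolean demo() {
        Gate gate = new Gate();
        Thread waiter = new Thread(() -> {
            try {
                gate.awaitOpen();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        gate.open();
        try {
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }
}
```

Testing the predicate in a while loop rather than an if statement means a thread that wakes up while the predicate is still false simply waits again.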

conflicting accesses: Two accesses to (reads of or writes to) the same variable provided that at least one of the accesses is a write [JLS 2013, §17.4.1, "Shared Variables"].

controlling expression: The top-level expression in the conditional expression of an if, while, do...while, or switch statement.

data race: "When a program contains two conflicting accesses that are not ordered by a happens-before relationship, it is said to contain a data race" [JLS 2013, §17.4.5, "Happens-before Order"].

deadlock: Two or more threads are said to have deadlocked when each blocks waiting for a lock held by another. None of the threads can make any progress.

happens-before order: "Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second. . . . It should be noted that the presence of a happens-before relationship between two actions does not necessarily imply that they have to take place in that order in an implementation. If the reordering produces results consistent with a legal execution, it is not illegal. . . . More specifically, if two actions share a happens-before relationship, they do not necessarily have to appear to have happened in that order to any code with which they do not share a happens-before relationship. Writes in one thread that are in a data race with reads in another thread may, for example, appear to occur out of order to those reads" [JLS 2013, §17.4.5, "Happens-before Order"].

heap memory: "Memory that can be shared between threads is called shared memory or heap memory. All instance fields, static fields and array elements are stored in heap memory....Local variables, formal method parameters or exception handler parameters are never shared between threads and are unaffected by the memory model" [JLS 2013, §17.4.1, "Shared Variables"].

hide: One class field hides a field in a superclass if they have the same identifier. The hidden field is not accessible from the class by its simple name. Likewise, a class (static) method hides a method in a superclass if they have the same identifier and the signature of the hiding method is a subsignature of the signature of the hidden method. The hidden method is not accessible from the class by its simple name. See Java Language Specification, §8.4.8.2, "Hiding (by Class Methods)" [JLS 2013] for the formal definition. Contrast with override.

immutable: When applied to an object, immutable means that its state cannot be changed after being initialized. An object is immutable if

  • Its state cannot be modified after construction;
  • All its fields are final; and
  • It is properly constructed (the this reference does not escape during construction). [Goetz 2006]

It is technically possible to have an immutable object without all fields being final. String is such a class, but this relies on delicate reasoning about benign data races that requires a deep understanding of the Java Memory Model.
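The three conditions listed above can be sketched as follows. The Route class and its fields are hypothetical names chosen for illustration. All fields are final, the constructor makes a defensive copy of its mutable argument, and the this reference is not passed anywhere during construction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable class: final fields, defensive copy, no escaping "this".
final class Route {
    private final String name;
    private final List<String> stops;

    Route(String name, List<String> stops) {
        this.name = name;
        // Copy the caller's list so later changes to it cannot alter this object.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
    }

    String getName() {
        return name;
    }

    // Returns an unmodifiable view; attempts to modify it throw
    // UnsupportedOperationException.
    List<String> getStops() {
        return stops;
    }

    // Demo helper: number of stops seen after the caller mutates its own list.
    static int demo() {
        List<String> source = new ArrayList<>();
        source.add("A");
        source.add("B");
        Route route = new Route("north", source);
        source.add("C"); // does not affect route
        return route.getStops().size(); // 2
    }
}
```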

initialization safety: "An object is considered to be completely initialized when its constructor finishes. A thread that can see a reference to an object only after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields" [JLS 2013, §17.5, "final Field Semantics"].

interruption policy: "Determines how a thread interprets an interruption request—what it does (if anything) when one is detected, what units of work are considered atomic with respect to interruption, and how quickly it reacts to interruption" [Goetz 2006].

instance variable: "A field declared within a class declaration without using the keyword static. If a class T has a field a that is an instance variable, then a new instance variable a is created and initialized to a default value as part of each newly created object of class T or of any class that is a subclass of T. The instance variable effectively ceases to exist when the object of which it is a field is no longer referenced, after any necessary finalization of the object has been completed" [JLS 2013, §4.12.3, "Kinds of Variables"].

liveness: Every operation or method invocation executes to completion without interruptions, even if it goes against safety.

memoization: An optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs [White 2003].
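As a minimal sketch of memoization, the hypothetical MemoFib class below caches Fibonacci numbers in a HashMap so that each input is computed only once. The class name, the cache field, and the computations counter are assumptions introduced for illustration. The class is not thread-safe as written.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: memoized Fibonacci for a single thread.
final class MemoFib {
    private final Map<Integer, Long> cache = new HashMap<>();
    private int computations = 0; // counts results actually calculated

    long fib(int n) {
        if (n < 2) {
            return n; // base cases need no caching
        }
        Long cached = cache.get(n);
        if (cached != null) {
            return cached; // previously processed input: reuse the result
        }
        computations++;
        long result = fib(n - 1) + fib(n - 2);
        cache.put(n, result);
        return result;
    }

    int computations() {
        return computations;
    }
}
```

Calling fib(50) on a fresh instance returns 12586269025 after 49 calculations (one for each n from 2 to 50); a second call with the same argument performs none.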

memory model: "The rules that determine how memory accesses are ordered and when they are guaranteed to be visible are known as the memory model of the Java programming language" [Arnold 2006]. "A memory model describes, given a program and an execution trace of that program, whether the execution trace is a legal execution of the program" [JLS 2013, §17.4, "Memory Model"].

normalization: Lossy conversion of the data to its simplest known (and anticipated) form. "When implementations keep strings in a normalized form, they can be assured that equivalent strings have a unique binary representation" [Davis 2008].
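For Unicode strings, normalization is available in Java through java.text.Normalizer. The sketch below uses a hypothetical wrapper class, UnicodeNormalization, to contrast two of the standard forms: NFC, which composes canonically equivalent sequences, and NFKC, which additionally replaces compatibility characters and therefore loses information.

```java
import java.text.Normalizer;

// Hypothetical wrapper around java.text.Normalizer.
final class UnicodeNormalization {
    // NFC: canonically equivalent strings end up with one representation.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // NFKC: also folds compatibility characters; formatting distinctions are lost.
    static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }
}
```

For example, nfc applied to the letter e followed by the combining acute accent (U+0301) yields the single character U+00E9, and nfkc applied to the ligature U+FB01 yields the two letters "fi", whereas nfc leaves the ligature unchanged.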

normalization (URI): The process of removing unnecessary "." and ".." segments from the path component of a hierarchical URI. Each "." segment is simply removed. A ".." segment is removed only if it is preceded by a non-".." segment. Normalization has no effect on opaque URIs [API 2013, Class URI].
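The rules above can be observed with java.net.URI.normalize(). The UriNormalization class is a hypothetical helper, and the example.com URIs are placeholders chosen for illustration.

```java
import java.net.URI;

// Hypothetical helper around java.net.URI.normalize().
final class UriNormalization {
    static String normalize(String uri) {
        return URI.create(uri).normalize().toString();
    }
}
```

With this helper, "http://example.com/a/./b/../c" becomes "http://example.com/a/c". The relative URI "../a/./b" becomes "../a/b", because the leading ".." segment is not preceded by a non-".." segment. The opaque URI "mailto:user@example.com" is returned unchanged.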

obscure: One scoped identifier obscures another identifier in a containing scope if the two identifiers are the same, but the obscuring identifier does not shadow the obscured identifier. This can happen if the obscuring identifier is a variable and the obscured identifier is a type, for example. See Java Language Specification, §6.4.2, "Obscuring" [JLS 2013], for more information.
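A small sketch of obscuring, using hypothetical names: inside ObscureDemo, the field named Integer obscures the type java.lang.Integer, so the simple name Integer is interpreted as the variable. The type remains reachable through its fully qualified name.

```java
// Hypothetical example: a variable obscures a type of the same name.
final class ObscureDemo {
    static final String Integer = "abc"; // obscures java.lang.Integer in this class

    // "Integer" here is the String variable, so this calls String.length().
    static int viaVariable() {
        return Integer.length(); // 3
    }

    // The obscured type is still available through its qualified name.
    static int viaType() {
        return java.lang.Integer.parseInt("42");
    }
}
```

Following the usual naming conventions (lowercase first letter for variables, uppercase for types) makes this situation unlikely in practice.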

obsolete reference: "An obsolete reference is simply a reference that will never be dereferenced again" [Bloch 2008].
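A common place for obsolete references is a collection that manages its own storage. The sketch below is a hypothetical array-backed stack (SimpleStack and its members are names introduced here, loosely modeled on the well-known stack example from the cited book). After pop() the vacated array slot will never be read again, so it is set to null to let the garbage collector reclaim the object.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

// Hypothetical array-backed stack that clears obsolete references.
final class SimpleStack {
    private Object[] elements = new Object[4];
    private int size = 0;

    void push(Object e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size); // grow when full
        }
        elements[size++] = e;
    }

    Object pop() {
        if (size == 0) {
            throw new EmptyStackException();
        }
        Object result = elements[--size];
        elements[size] = null; // eliminate the obsolete reference
        return result;
    }

    // Demo helper: true if the slot just above the top holds no reference.
    boolean slotAboveTopIsNull() {
        return size == elements.length || elements[size] == null;
    }
}
```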

open call: "An alien method invoked outside of a synchronized region is known as an open call" [Lea 2000, §2.4.1.3]. See also Effective Java, 2nd ed. [Bloch 2008].

override: One class method overrides a method in a superclass if they have compatible signatures. The overridden method is still accessible from the class via the super keyword. See Java Language Specification, §8.4.8.1, "Overriding (by Instance Methods)" [JLS 2013], for the formal definition. Contrast with hide.
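The contrast between overriding and hiding can be sketched with two hypothetical classes, Shape and Circle. The instance method describe() is overridden, so the call is dispatched on the runtime class, and the overridden version is reached through super. The field label is hidden, so the field that is read depends on the compile-time type of the reference.

```java
// Hypothetical classes illustrating override versus hide.
class Shape {
    String label = "shape";

    String describe() {
        return "generic shape";
    }
}

class Circle extends Shape {
    String label = "circle"; // hides Shape.label

    @Override
    String describe() { // overrides Shape.describe()
        return "circle, which is a " + super.describe();
    }
}

final class OverrideDemo {
    // Dynamic dispatch: returns "circle, which is a generic shape".
    static String viaMethod() {
        Shape s = new Circle();
        return s.describe();
    }

    // Fields are not overridden: returns "shape" because s has static type Shape.
    static String viaField() {
        Shape s = new Circle();
        return s.label;
    }
}
```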

partial order: An order defined for some, but not necessarily all, pairs of items. For instance, the sets {a, b} and {a, c, d} are subsets of {a, b, c, d}, but neither is a subset of the other. So "subset of" is a partial order on sets [Black 2004].
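The subset example above can be checked directly with java.util.Set.containsAll. The SubsetOrder class and its helper methods are hypothetical names used only for this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: "subset of" as a partial order on sets of strings.
final class SubsetOrder {
    // True if every element of x is also in y.
    static boolean isSubset(Set<String> x, Set<String> y) {
        return y.containsAll(x);
    }

    static Set<String> setOf(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }
}
```

Here isSubset(setOf("a", "b"), setOf("a", "b", "c", "d")) is true, while isSubset returns false for {a, b} and {a, c, d} in either argument order, so that pair is not comparable.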

program order: The order that interthread actions are performed by a thread according to the intrathread semantics of the thread. "Program order [can be described] as the order of bytecodes present in the .class file, as they would execute based on control flow values" (David Holmes, JMM Mailing List).

publishing objects: "Publishing an object means making it available to code outside of its current scope, such as by storing a reference to it where other code can find it, returning it from a nonprivate method, or passing it to a method in another class" [Goetz 2006].

race condition: "General races cause nondeterministic execution and are failures in programs intended to be deterministic" [Netzer 1992]. "A race condition occurs when the correctness of a computation depends on the relative timing or interleaving of multiple threads by the runtime" [Goetz 2006].

relativization (URI): "The inverse of resolution. For example, relativizing the URI http://java.sun.com/j2se/1.3/docs/guide/index.html against the base URI http://java.sun.com/j2se/1.3 yields the relative URI docs/guide/index.html" [API 2013, Class URI].
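A minimal sketch of relativization and its inverse using java.net.URI. The UriRelativization class is a hypothetical helper. As an assumption made for this example, the base URI is written with a trailing slash so that resolving the relative result against the same base reproduces the original URI.

```java
import java.net.URI;

// Hypothetical helper around URI.relativize() and URI.resolve().
final class UriRelativization {
    // Returns target expressed relative to base, or target itself if the
    // two URIs do not share scheme, authority, and path prefix.
    static String relativize(String base, String target) {
        return URI.create(base).relativize(URI.create(target)).toString();
    }

    // Inverse operation: resolves a relative URI against the base.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }
}
```

With base "http://java.sun.com/j2se/1.3/" and target "http://java.sun.com/j2se/1.3/docs/guide/index.html", relativize yields "docs/guide/index.html", and resolve with that result returns the target again. If the target is on a different host, relativize returns it unchanged.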

safe publication: "To publish an object safely, both the reference to the object and the [state of the object] must be made visible to other threads at the same time. A properly constructed object can be safely published by:

  • Initializing an object reference from a static initializer;
  • Storing a reference to it into a volatile field or AtomicReference;
  • Storing a reference to it into a final field of a properly constructed object; or
  • Storing a reference to it into a field that is properly guarded by a lock" [Goetz 2006, §3.5 "Safe Publication"].
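The four idioms listed above can be sketched in one place. Settings and SettingsRegistry are hypothetical classes introduced for illustration; the sketch only shows where each reference is stored and is not a recommendation to combine all four in real code.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical value class; properly constructed ("this" does not escape).
final class Settings {
    private final int timeoutSeconds;

    Settings(int timeoutSeconds) {
        this.timeoutSeconds = timeoutSeconds;
    }

    int getTimeoutSeconds() {
        return timeoutSeconds;
    }
}

final class SettingsRegistry {
    // 1. Reference initialized from a static initializer.
    static final Settings DEFAULTS = new Settings(30);

    // 2. Reference stored in a volatile field or an AtomicReference.
    private volatile Settings current = DEFAULTS;
    private final AtomicReference<Settings> pending = new AtomicReference<>(DEFAULTS);

    // 3. Reference stored in a final field of a properly constructed object.
    private final Settings fallback = new Settings(60);

    // 4. Reference stored in a field guarded by a lock.
    private final Object lock = new Object();
    private Settings guarded = DEFAULTS;

    void setCurrent(Settings s) { current = s; }
    Settings getCurrent() { return current; }

    void setPending(Settings s) { pending.set(s); }
    Settings getPending() { return pending.get(); }

    Settings getFallback() { return fallback; }

    void setGuarded(Settings s) {
        synchronized (lock) { guarded = s; }
    }

    Settings getGuarded() {
        synchronized (lock) { return guarded; }
    }
}
```

In the lock-guarded case, both the write and the read must hold the same lock; synchronizing only the setter would not publish the object safely.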

safety: Its main goal is to ensure that all objects maintain consistent states in a multithreaded environment [Lea 2000].

sanitization: Validating input and transforming it to a representation that conforms to the input requirements of a complex subsystem. For example, a database may require all invalid characters to be escaped or eliminated before their storage. Input sanitization is the elimination of unwanted characters from the input by means of removing, replacing, encoding, or escaping the characters.

security flaw: A software defect that poses a potential security risk [Seacord 2013].

sensitive code: Any code that performs operations that would be forbidden to untrusted code. Also, any code that accesses sensitive data. For example, code whose correct operation requires enhanced privileges is typically considered to be sensitive.

sensitive data: Any data that must be kept secure. Consequences of this security requirement include the following:

  • Untrusted code is forbidden to access sensitive data.
  • Trusted code is forbidden to leak sensitive data to untrusted code.

Examples of sensitive data include passwords and personally identifiable information.

sequential consistency: "A very strong guarantee that is made about visibility and ordering in an execution of a program. Within a sequentially consistent execution, there is a total order over all individual actions (such as reads and writes) which is consistent with the order of the program, and each individual action is atomic and is immediately visible to every thread. . . . If a program is correctly synchronized, then all executions of the program will appear to be sequentially consistent" [JLS 2013, §17.4.3, "Programs and Program Order"]. Sequential consistency implies there will be no compiler optimizations in the statements of the action. Adopting sequential consistency as the memory model and disallowing other primitives can be overly restrictive because, under this condition, the compiler is not allowed to make optimizations and reorder code.

shadow: One scoped identifier shadows another identifier in a containing scope if the two identifiers are the same and they both reference variables. They may also both reference methods or types. The shadowed identifier is not accessible in the scope of the shadowing identifier. See Java Language Specification, §6.4.1, "Shadowing" [JLS 2013], for more information. Contrast with obscure.
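A short sketch of shadowing with hypothetical names: inside both(), the local variable count shadows the field count, so the simple name count refers to the local variable. The field can still be reached there with the qualified form this.count.

```java
// Hypothetical example: a local variable shadows a field.
final class ShadowDemo {
    private final int count = 10;

    // Returns { value of the local variable, value of the field }.
    int[] both() {
        int count = 99; // shadows the field within this method
        return new int[] { count, this.count };
    }
}
```

Calling new ShadowDemo().both() returns the pair 99 and 10.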

synchronization: "The Java programming language provides multiple mechanisms for communicating between threads. The most basic of these methods is synchronization, which is implemented using monitors. Each object in Java is associated with a monitor, which a thread can lock or unlock. Only one thread at a time may hold a lock on a monitor. Any other threads attempting to lock that monitor are blocked until they can obtain a lock on that monitor" [JLS 2013, §17.1, "Synchronization"].

starvation: A condition wherein one or more threads prevent other threads from accessing a shared resource over extended periods of time. For instance, a thread that invokes a synchronized method that performs a time-consuming operation starves other threads.

tainted data: Data that either originate from an untrusted source or result from an operation whose inputs included tainted data. Tainted data can be sanitized (also untainted) through suitable data validation. Note that all outputs from untrusted code must be considered to be tainted [Jovanovic 2006].

thread-safe: An object is thread-safe if it can be shared by multiple threads without the possibility of any data races. "A thread-safe object performs synchronization internally, so multiple threads can freely access it through its public interface without further synchronization" [Goetz 2006]. Immutable classes are thread-safe by definition. Mutable classes may also be thread-safe if they are properly synchronized.

total order: An order defined for all pairs of items of a set. For instance, <= (less than or equal to) is a total order on integers; that is, for any two integers, one of them is less than or equal to the other [Black 2006].

trusted code: Code that is loaded by the primordial class loader regardless of whether or not it constitutes the Java API. In this text, this meaning is extended to include code that is obtained from a known entity and given permissions that untrusted code lacks. By this definition, untrusted and trusted code can coexist in the namespace of a single class loader (not necessarily the primordial class loader). In such cases, the security policy must make this distinction clear by assigning appropriate privileges to trusted code while denying the same from untrusted code.

untrusted code: Code of unknown origin that can potentially cause some harm when executed. Untrusted code may not always be malicious, but whether it is malicious is usually hard to determine automatically. Consequently, untrusted code should be run in a sandboxed environment.

volatile: "A write to a volatile field happens-before every subsequent read of that field" [JLS 2013, §17.4.5, "Happens-before Order"]. "Operations on the master copies of volatile variables on behalf of a thread are performed by the main memory in exactly the order that the thread requested" [JVMSpec 1999]. Accesses to a volatile variable are sequentially consistent, which also means that the operations are exempt from compiler optimizations. Declaring a variable volatile ensures that all threads see the most up-to-date value of the variable if any thread modifies it. Volatile guarantees atomic reads and writes of primitive values, but it does not guarantee the atomicity of composite operations such as variable incrementation (read-modify-write sequence).
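The difference between visibility and atomicity can be sketched as follows. The Counters class and its members are hypothetical names introduced for illustration. The volatile counter is incremented with ++, a read-modify-write sequence that may lose updates under contention; the AtomicInteger is shown only as one possible atomic alternative.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: volatile gives visibility, not atomic increments.
final class Counters {
    private volatile int volatileCount = 0;
    private final AtomicInteger atomicCount = new AtomicInteger();

    void incrementBoth() {
        volatileCount++;               // not atomic: concurrent updates may be lost
        atomicCount.incrementAndGet(); // atomic read-modify-write
    }

    int getVolatileCount() { return volatileCount; }
    int getAtomicCount() { return atomicCount.get(); }

    // Runs the given number of threads, each incrementing both counters.
    static Counters run(int threads, int perThread) {
        Counters counters = new Counters();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counters.incrementBoth();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counters;
    }
}
```

After Counters.run(4, 10000) completes, getAtomicCount() is 40000, whereas getVolatileCount() can be any value up to 40000 depending on how the threads interleave.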

vulnerability: "A set of conditions that allows an attacker to violate an explicit or implicit security policy" [Seacord 2013].

3 Comments

  1. Dhruv,

    I would love to rely on the JLS for the definition of volatile, if only it provided one. It doesn't; it merely provides properties, including the ones in your definition. And your and my definitions are not equivalent, as my definition also provides indication of order of access (see JLS §17 for examples), which your definition does not.

    I'd rather my definition came from a more well-known source, but I still prefer my definition over yours, for the sake of completeness.

    1. David,

      Sorry for not including more reasons for eliminating the definition (the comment space is slightly awkward). Here are some detailed comments:

      1. We should try to use the JLS as far as possible. It is fine to use online resources or websites; however, as with the C standard, it may not be a good idea to rely on less reliable sources for definitions or personal opinions (Wikipedia, forums, personal websites unless they are vetted). Of course, exceptions can be made in some cases where there is no equivalent resource. This also applies to other definitions on the page which are currently ad hoc.

      2. We should quote exactly in "" if some text is taken verbatim from the source.

      Wrt the other definition of volatile itself -

      "The volatile keyword is used on variables that may be modified simultaneously by other threads."

      What about only reads from other threads?

      "This warns the compiler to fetch them fresh each time, rather than caching them in registers. This also inhibits certain optimisations that assume no other thread will change the values unexpectedly."

      A Java programmer won't usually care about registers. The optimizations point is interesting, but I am unsure if the JLS/JVM spec talks about this wrt volatile. So how do we know?

      "Since other threads cannot see local variables, there is never any need to mark local variables volatile."

      This is implied. If you try to declare a volatile variable in a method, you'll get a compiler error.

      " You need synchronized to co-ordinate changes to variables from different threads, but often volatile will do just to look at them."

      That's just one way. New concurrency utilities are another. I'm fine with listing these but it doesn't tell the reader exactly how volatile is better or worse (it is not self-explanatory). IOW, this sentence is good but a little vague for a definition because it uses words like "often".

      "volatile does not guarantee you atomic access, e.g. a ++;"

      Yes. (smile)

      Lastly, even though comparisons with C/C++ are informative, we should try to avoid them.

      I think it will be good to come up with criteria by which we can use definitions consistently. That said, this page will probably evolve into something that makes sense and gathers some kind of consensus. Also, the editors might be better equipped than me to comment on this new Definitions section. Meanwhile, do feel free to edit and supplement the definitions.

      Thanks.

  2. I think we need a definition for 'race condition', and we need to compare/contrast that with 'data race', either here or in an appropriate rule, if we want to enforce that data race & race condition are two different concepts. I think 'data race' is-a-type-of 'race condition', but is a big enough problem that it is considered distinct within the Java Memory Model.