26. Variations, Virtual Machines

Part of CS:2820 Object Oriented Software Development Notes, Fall 2015
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

 

Eliminating Methods?

In the spirit of the variations discussed in the previous lecture, it is worth noting that any Java program can be written without using any methods (except, as we will see, methods that do nothing useful).

Consider this example method:

class A {
	private int v; // a member of some class

	int m( int p ) { // a method of that class
		int t = v;
		v = p;	// this is just a useless demo computation
		return t;
	}
}

	// elsewhere in the program
	A v = new A();
	x = v.m(5);

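To eliminate method m(), we replace it with an inner class M within A, moving the code of m() into the initializer (in Java terms, the constructor) for that new class. Since an initializer cannot return a value, the result is left in a public field, ret. This gives us: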

class A {
	private int v; // a member of some class

	class M { // the class representing a method
		public int ret; // the return value of the method
		private int t;

		M( int p ) { // the initializer
			t = v;
			v = p;	// this is just a useless demo computation
			ret = t; // an initializer cannot return a value
		}
	}

	// elsewhere in class A
	x = new M(5).ret;

As long as all the calls to the method m() are inside non-static methods of class A and we want to apply those calls to this, the current instance of the class, this code works. We have difficulty if the call is from outside the class or if we want to apply m() to some other object. The problem is that the body of M refers to v, a field of one particular instance of A, so creating an M requires saying which instance of A it belongs to. Outside of class A, what we want to write for a call to the former method m() is the following:

	// outside of class A
	A v = new A();
	x = v.new M(5).ret;

We could solve this problem by making M a static class, so that we replace v.new M(5) with new M(v, 5). As we have already discussed this alternative, let's try something different: Java is perfectly happy to allow inner classes that are not static, that is, where each instance of the inner class is created relative to some specific object of the enclosing class. Java does accept the qualified form v.new M(5) wherever class M is visible, but the notation is obscure and it forces outside code to name M directly. If we would rather not rely on it, we can move the call to the initializer inside the class:

class A {
	private int v; // a member of some class

	class M { // the class representing a method
		public int ret; // the return value of the method
		private int t;

		M( int p ) { // the initializer
			t = v;
			v = p;	// this is just a useless demo computation
			ret = t; // an initializer cannot return a value
		}
	}

	M newM( int p ) { return new M( p ); } // crutch
}

	// elsewhere outside class A
	A v = new A();
	x = v.newM(5).ret;

This works just fine. While, from a technical point of view, our newM() code is a Java method, the useful code of our original method m() has all been moved into the initializer for class M. The annoying bit of code we referred to as a crutch above is there simply to let outside callers reach the initializer without writing the awkward qualified form v.new M(5). The newM() method performs no useful computation; it only connects an outside call to the initializer of the inner class.

The code would have been simpler if the method m() had not had a return value, and it would have been even simpler if the method m() were static. In that case, the class M would also be static, and there would be no need for the crutch we have named newM().
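
The following is a minimal runnable sketch of the two variants discussed above: the inner class M reached through the crutch newM(), and a static nested class that takes the object as an explicit parameter. The names MS and EliminateDemo are assumptions made up for this sketch so that both variants can live in one class; t is kept as a local variable here for brevity.

```java
class A {
    private int v; // a member of some class

    class M { // inner class standing in for the method m()
        public int ret; // the return value of the former method

        M(int p) { // the initializer holds the body of m()
            int t = v;
            v = p;
            ret = t;
        }
    }

    M newM(int p) { return new M(p); } // the crutch

    // Static variant (the name MS is hypothetical): the object that m()
    // would have been applied to is passed as an explicit parameter.
    static class MS {
        public int ret;

        MS(A a, int p) {
            int t = a.v;
            a.v = p;
            ret = t;
        }
    }
}

class EliminateDemo {
    public static void main(String[] args) {
        A a = new A();
        int x = a.newM(5).ret;      // old value of v is 0, v becomes 5
        int y = new A.MS(a, 7).ret; // old value of v is 5, v becomes 7
        System.out.println(x + " " + y); // prints "0 5"
    }
}
```

In both variants, the only code doing real work is inside an initializer; the objects created exist only long enough to carry the result back in ret.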

Activation Records

The whole point of the above transformation is to illustrate the essential identity of two concepts that Java tries to distinguish: members of a class and local variables of a method. Deep inside the implementation of any programming language that has these two concepts, they are fundamentally similar. An object is a collection of members, and when you call a method, a special object, called an activation record, is allocated to hold the local variables of that method. When the method returns, access to the activation record is lost. The activation records for a given method are, logically speaking, all instances of a class unique to that method.

So, while the above transformation is not one we would suggest that anyone make in a real program, it serves to illustrate the activation record concept within the Java language.
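
One way to see activation records as objects is to apply the same transformation to a recursive computation. In the sketch below (the class names Fact and ActivationDemo are assumptions, not part of the original notes), each recursive "call" allocates a new instance of Fact, and that instance plays the role of the activation record, holding the parameter n and the result ret.

```java
class ActivationDemo {
    // Each instance of Fact acts as one activation record of a
    // recursive factorial function.
    static class Fact {
        final int n;    // the parameter of this activation
        final long ret; // the return value of this activation

        Fact(int n) {
            this.n = n;
            if (n <= 1) {
                ret = 1;
            } else {
                // the recursive call allocates a fresh activation record
                ret = n * new Fact(n - 1).ret;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new Fact(5).ret); // prints 120
    }
}
```

Once the initializer finishes and ret has been read, nothing refers to the Fact object any more, which mirrors the loss of access to an activation record when a method returns.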

O-functions and V-functions

When a class gets large, with a large collection of methods, it is useful to find some way of organizing those methods to help readers understand the class. David Parnas, one of the early developers of ideas relating to object-oriented programming, proposed the following general classification system for methods:

O-functions
Methods that Operate on the value of an object.

V-functions
Methods that work in terms of the Value of an object without changing it.

Note: Parnas used the term function, in part, because the term method was not yet in general use. Today, he might have called them methods. Regardless of what you call them, the term remains useful. Consider this simple class:

class C {
	private int f; // some field of the object
	public void set( int v ) { // a simple o-function
		f = v;
	}
	public int get() { // a simple v-function
		return f;
	}
}

When a class gets large, separating the o-functions from the v-functions can be one useful way of organizing the class definition to make it easier to read.

You might imagine that the o-functions do all the complex work of a class, while the v-functions trivially inspect and return fields, but this is not always the case. Consider the problem of computing perspective projections from some viewpoint:

class Perspective {
	// fields describing the viewpoint

	public void set( Viewpoint v ) { // an o-function
		// do whatever is needed
	}

	public point transform( point p ) { // a v-function
		// return the perspective transformation of point p
	}
}

Here, it turns out, the fields describing the perspective transformation from a particular viewpoint are essentially a matrix. When the viewpoint is set (by providing the coordinates of a point and the direction of the view from that point), the code in our o-function must compute this matrix.

When the coordinates of a point are transformed into the coordinates as seen from that viewpoint, the essential computation involves multiplying the coordinate vector times the perspective transformation matrix. Here, our v-function does quite a bit of work. Of course, we can also construct alternative problems where the o-functions naturally do all the work while the v-functions do very little.
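
To make this concrete, here is a deliberately simplified sketch. It assumes the viewer sits at the origin looking along the z axis, so the whole viewpoint is reduced to a single number d, the distance to the projection plane; the signatures set(double) and transform(double[]) are assumptions of this sketch, not the Viewpoint and point types used above.

```java
import java.util.Arrays;

class Perspective {
    // fields describing the viewpoint: a 4x4 homogeneous matrix
    private final double[][] m = new double[4][4];

    // an o-function: computes and stores the matrix
    void set(double d) {
        for (double[] row : m) {
            Arrays.fill(row, 0.0);
        }
        m[0][0] = 1.0;
        m[1][1] = 1.0;
        m[2][2] = 1.0;
        m[3][2] = 1.0 / d; // makes w = z / d
    }

    // a v-function: matrix times vector, then divide by w;
    // it does real work but changes nothing in the object
    double[] transform(double[] p) {
        double[] h = { p[0], p[1], p[2], 1.0 };
        double[] r = new double[4];
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 4; j++) {
                r[i] += m[i][j] * h[j];
            }
        }
        return new double[] { r[0] / r[3], r[1] / r[3], r[2] / r[3] };
    }
}

class PerspectiveDemo {
    public static void main(String[] args) {
        Perspective view = new Perspective();
        view.set(2.0);
        double[] q = view.transform(new double[] { 4.0, 6.0, 4.0 });
        System.out.println(Arrays.toString(q)); // prints [2.0, 3.0, 2.0]
    }
}
```

A point at depth 4 seen through a plane at distance 2 has its x and y coordinates halved, which is what the matrix product followed by the division by w produces.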

Virtual Machines

The term virtual machine is used fairly frequently these days, and in its broadest sense, it is very relevant to the problem of organizing large programs. First, however, let's look at its primary meanings.

software-defined virtual machines
When you have executable code in the instruction set of one machine, but the hardware you have executes a different instruction set, you can still run that code if you have a virtual (software) implementation of the instruction set you want. Such a virtual implementation is also referred to as an instruction-set emulator.

For example, many computer architecture courses are taught using the ARM and MIPS instruction sets, yet most of the students taking those courses only have access to computers with Intel x86-family processors. So, to run their programming assignments, they use virtual MIPS or ARM processors implemented in software by an appropriate emulator.

For a second example, most Java compilers produce output in a machine language called J-code (now usually called Java bytecode), and the simplest Java run-time systems use a virtual J machine, that is, a J-code emulator, to run the code. Briefly, Rockwell Collins Corp. of Cedar Rapids, Iowa, offered a hardware J-machine for sale. They did this because they had already developed, in house, a microprocessor called the AAMP that was so similar to the J-machine that converting the AAMP to run J-code was fairly easy.
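
The core of an instruction-set emulator is a fetch-decode-execute loop written in software. The sketch below emulates a tiny stack machine; its four opcodes (PUSH, ADD, MUL, HALT) are invented for this illustration and are not taken from J-code or from any real instruction set.

```java
class TinyVM {
    // Hypothetical opcodes, made up for this sketch
    static final int PUSH = 0; // push the following word onto the stack
    static final int ADD = 1;  // pop two values, push their sum
    static final int MUL = 2;  // pop two values, push their product
    static final int HALT = 3; // stop and return the top of the stack

    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0; // stack pointer: index of the next free slot
        int pc = 0; // program counter
        while (true) {
            int op = code[pc++]; // fetch
            switch (op) {        // decode and execute
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("bad opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // computes (2 + 3) * 4
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program)); // prints 20
    }
}
```

The program array is "machine code" for a machine that exists only as the run() method; a real emulator has the same shape, with a much larger switch statement and a model of the emulated machine's registers and memory.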

virtual machine monitors
A virtual machine monitor is an operating system that creates one or more virtual machines running on one physical machine. The virtual machines are typically identical in their instruction set to the physical machine, except that each has access to only a subset of the physical resources.

A well-written virtual machine monitor on a computer designed to be virtualizable will typically run user code at a speed very close to the actual hardware speed. Most machine instructions on a virtual machine will be executed by the physical hardware. The only instructions that are executed in software are the instructions that control access to physical resources such as I/O devices and the memory management unit.

For example, VMware (the product built by the company of the same name) tries to virtualize the x86 family instruction set. Their software is widely used on server farms that provide cloud computing services in order to prevent users of cloud services from interfering with other users. An unfortunate problem with the Intel x86 family is that it was not originally designed to be virtualizable, and finding all of the parts of the instruction set that need virtualization has proven to be very difficult.

The VM operating system from IBM that runs on their enterprise server architecture is a far better example, as well as being the first fully developed example in this category.

virtual machines (in general)
The most general definition of a virtual machine is that it is any set of hardware and software resources that, taken together, create an environment in which applications can be developed.

An important point to note, here, is that most applications are developed in the context of a hierarchy of virtual machines. Recognizing that many of the components of large programs are actually defining new virtual machine layers can help make sense of large programs that would otherwise be difficult to digest.

Transparency

Virtual machines in a hierarchy may be transparent or opaque; these terms were originally defined by Parnas in the early 1970s.

Transparent virtual machines
A virtual machine is transparent if it does not prevent its user from inspecting and making arbitrary changes to the state of underlying virtual (or physical) machines.

Opaque virtual machines
A virtual machine is opaque if it prevents its user from inspecting or making changes to the underlying state.

In general, opaque virtual machines offer security benefits, preventing user code from accessing or manipulating things that are dangerous, while transparent virtual machines offer greater flexibility.

Parnas's original illustration for this concept was a 4-wheeled vehicle, perhaps an automobile. The low-level virtual machine has two fixed wheels at the rear end and two steerable wheels at the front. The front wheels can be steered independently (imagine two steering levers, one you can hold in each hand).

The low level vehicle is very flexible. It can follow any path a vehicle might want to follow, and it can even turn on a dime. Just position the rear axle over the dime, and then turn the two front wheels so that the lines of their axles also pass over the dime. This would make parallel parking incredibly easy, but this vehicle is extremely unsafe. If you are driving at any significant speed and you turn the front wheels so they are not parallel, you risk tumbling the vehicle tail over head. I would hate to drive this vehicle on I-80.

So, we build a higher-level virtual machine on top of the low-level one by linking the two front wheels to a steering wheel so that they always turn (approximately) in parallel. Our new vehicle is far safer, but it is less flexible because our new virtual machine is opaque. Parallel parking is now a difficult skill involving much reversing and rocking of the vehicle.
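
The vehicle example can be sketched in Java as two layers. The class and method names below (Chassis, Car, steerLeft and so on) are assumptions invented for this illustration. Chassis is the low-level machine with independently steerable wheels, and Car is the opaque higher-level machine built on top of it.

```java
// The low-level machine: each front wheel is steered independently.
class Chassis {
    private double leftAngle;  // degrees
    private double rightAngle; // degrees

    void steerLeft(double degrees) { leftAngle = degrees; }
    void steerRight(double degrees) { rightAngle = degrees; }
    double leftAngle() { return leftAngle; }
    double rightAngle() { return rightAngle; }
}

// The higher-level machine: one steering wheel. The chassis is private,
// so users of Car cannot set the wheels to different angles.
class Car {
    private final Chassis chassis = new Chassis();

    void steer(double degrees) {
        chassis.steerLeft(degrees);
        chassis.steerRight(degrees);
    }

    boolean wheelsParallel() {
        return chassis.leftAngle() == chassis.rightAngle();
    }
}

class SteeringDemo {
    public static void main(String[] args) {
        Chassis raw = new Chassis();
        raw.steerLeft(30);
        raw.steerRight(-30); // flexible, but dangerous at speed
        System.out.println(raw.leftAngle() == raw.rightAngle()); // prints false

        Car car = new Car();
        car.steer(15);
        System.out.println(car.wheelsParallel()); // prints true
    }
}
```

If Car offered a method returning its Chassis, the layer would become transparent again: flexible enough to turn on a dime, and just as unsafe as the bare chassis.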

Most virtual machines are a mix of transparent and opaque parts. Security problems such as the ability to install rootkits in operating systems tend to involve virtual machines that were intended to be opaque but had small (and typically difficult to find) transparent spots.

In Java, the primary tool we have to control the transparency of virtual machine layers in large programming projects is the ability to declare components of objects private. If we are careless in designing the methods of an abstraction, though, we can end up providing the user with a backdoor that allows some method or combination of methods to set a private field to an arbitrary and unsafe value.
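
Here is a small sketch of such a backdoor. The class Tank and its methods are hypothetical, invented for this example. The field levels is private and setLevel() checks its argument, but a carelessly written accessor hands out the private array itself, so any caller can bypass the check.

```java
class Tank {
    // intended invariant: every level is between 0 and 100
    private final int[] levels = new int[3];

    void setLevel(int i, int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("level out of range");
        }
        levels[i] = percent;
    }

    int level(int i) { return levels[i]; }

    // backdoor: returns a reference to the private array itself
    int[] leakyLevels() { return levels; }

    // opaque alternative: the caller only ever sees a copy
    int[] safeLevels() { return levels.clone(); }
}

class BackdoorDemo {
    public static void main(String[] args) {
        Tank t = new Tank();
        t.setLevel(0, 50);

        t.safeLevels()[0] = 999; // changes only the copy
        System.out.println(t.level(0)); // prints 50

        t.leakyLevels()[0] = 999; // bypasses the range check
        System.out.println(t.level(0)); // prints 999
    }
}
```

The private declaration alone does not make the layer opaque; every method that returns or accepts a reference to mutable internal state is a potential transparent spot.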