40. Virtual Machines
Part of
CS:2820 Object Oriented Software Development Notes, Spring 2016
|
The term virtual machine refers to any set of hardware and software resources that, taken together, create an environment in which applications can be developed. Of course, you can develop applications directly on top of a real machine, but that means you have nothing but hardware to use as a starting point. Generally, we develop our applications on top of a combination of hardware and pre-existing software. We might use:
Consider an application running on a server in the cloud (Amazon, Google, Microsoft, it hardly matters). This is probably running on a multi-layered stack of virtual machines:
The actual neuron-network simulation code we wrote can be considered to rest on top of this stack of virtual machine layers. Almost all large software systems involve such stacks of virtual machines. Many of the layers described above can be further broken down into layers. For example, the internal code of many operating systems contains a kernel that is used to implement services that are used to implement other services that sit under application programs. From the outside, we consider the operating system to be a single layer, but to a system programmer working on developing the system, it has many layers.
The virtual machine layers in the above hierarchy differ in some important ways. Software implementations of instruction sets have a significant performance penalty because it typically takes from 10 to 100 physical machine instructions to implement each virtual machine instruction. At the same time, a software implementation of an instruction set can completely isolate the user from knowing anything about the physical machine; this creates portability, and it also has the potential to be very secure.
In contrast, with a virtual machine consisting of a software library plus some lower level tools, the user is not obligated to use the library, so the user has full access to the lower level machine at its native speed. With full access comes danger, if some of the features of that machine are insecure or unsafe.
Virtual machines in a hierarchy may be transparent or opaque; these terms were originally defined by Parnas in the early 1970s.
In general, opaque virtual machines offer security benefits, preventing user code from accessing or manipulating things that are dangerous, while transparent virtual machines offer greater flexibility and potentially greater transparency.
Parnas's original illustration for this concept was a 4-wheeled vehicle, perhaps an automobile. The low level virtual machine has 2 fixed wheels at the rear end, and two steerable wheels at the front. The front wheels can be steered independently (imagine two steering levers, one you can hold in each hand).
The low level vehicle is very flexible. It can follow any path a vehicle might want to follow, and it can even turn on a dime. Just position the rear axle over the dime, and then turn the two front wheels so that the lines of their axles also pass over the dime. This would make parallel parking incredibly easy, but this vehicle is extremely unsafe. If you are driving at any significant speed and you turn the front wheels so they are not parallel, you risk tumbling the vehicle tail over head. I would hate to drive this vehicle on I-80.
So, we build a higher level virtual machine on top of the low level one by linking the two front wheels to a steering wheel so that they always turn (approximately) in parallel. Our new vehicle is incredibly safer, but it is less flexible because our new virtual machine is opaque. Parallel parking is now a difficult skill involving much reversing and rocking of the vehicle.
Most virtual machines are a mix of transparent and opaque parts. Security problems such as the ability to install rootkits in operating systems tend to involve virtual machines that were intended to be opaque but had small (and typically difficult to find) transparent spots.
In Java, the primary tool we have to control the transparency of virtual machine layers in large programming projects are the ability to declare components of objects private. If we are careless in designing the methods of an abstraction, though, we can end up providing the user with a backdoor that allows some method or combination of methods to set a private field to an arbitrary and unsafe value.