This chapter documents the virtual devices. Each virtual device provides a virtualisation of one of the system's resources. All devices which a virtual machine wishes to address (including the CPU) must be emulated by a virtual device.
A virtual machine is a special type of task (see section Task Handling) used to virtualise (emulate) the functionality of a normal 8086 processor.
All functions described in this section of the manual are members of the `vm' module, defined in the C header file `<vmm/vm.h>'.
Using the V86 mode of the 80386, the vm module creates and controls tasks which exactly resemble the environment of a standard 80x86 processor running in real (16-bit) mode. Since each virtual machine is built on top of a standard system task, multiple virtual machines may run simultaneously using the standard system scheduling primitives.
Although V86 mode allows the processor to run 16-bit code in a protected mode environment, there is still a lot of the standard environment to be virtualised by hand. For example, since it is not safe to let virtual machines adjust the physical interrupt-mask flag in the FLAGS register, all access to this register has to be trapped and emulated using a virtual copy of the register.
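The virtual-copy idea can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from the vm module: the structure `flag_state` and the helpers `pushf_image` and `popf_apply` are hypothetical names. The only hardware fact relied on is that the interrupt-enable flag is bit 9 of FLAGS.

```c
#include <stdint.h>

#define FLAGS_IF 0x0200u  /* interrupt-enable flag, bit 9 of FLAGS */

/* Hypothetical stand-in for the per-VM flag state; not the real struct vm. */
struct flag_state {
    uint32_t real_flags;     /* flags the underlying task really runs with */
    uint32_t virtual_flags;  /* the VM's private copy, only IF is used here */
};

/* Value a trapped PUSHF would store: the real flags, but with the
   virtual IF substituted for the physical one. */
uint32_t pushf_image(const struct flag_state *s)
{
    return (s->real_flags & ~FLAGS_IF) | (s->virtual_flags & FLAGS_IF);
}

/* Apply a value popped by a trapped POPF: IF only reaches the virtual
   copy, every other bit reaches the real flags. */
void popf_apply(struct flag_state *s, uint32_t popped)
{
    s->virtual_flags = (s->virtual_flags & ~FLAGS_IF) | (popped & FLAGS_IF);
    s->real_flags = (s->real_flags & FLAGS_IF) | (popped & ~FLAGS_IF);
}
```

The physical IF never changes in response to guest code, so a virtual machine that masks interrupts cannot stall the host.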
It is not possible to let the virtual machines use the host system's physical hardware either, since the operating systems running in the virtual machines would assume that they were the sole users of the hardware devices. Every hardware device therefore has to be virtualised: all I/O port accesses are trapped and redirected to the appropriate virtual device, which then emulates the I/O operation.
All virtual machines are given a standard set of virtual hardware when they are created. This includes a tty to provide a virtual video adaptor and virtual keyboard hardware (see section TTY Driver). Other virtual devices may be added afterwards through the use of a standard mechanism (see section Virtual Device Structure).
Obviously all this emulation imposes a massive overhead on running systems in this way; unfortunately there is no way to avoid this and still maintain total system integrity. We believe that the overall effect (multiple virtual machines running side-by-side) is well worth the overheads incurred. Although we have attempted to make the system as fast as possible we have not gone to ridiculous lengths to achieve this, favouring total protection over speed in all cases.
Each virtual machine is represented by an instance of the following data structure:
    struct vm {
        /* The thread of execution running this VM. */
        struct task *task;

        /* Local I/O port handlers for this VM. */
        struct io_handler *local_io;

        /* This VM's tty. */
        struct tty *tty;

        /* The VM's virtual IF */
        u_long virtual_eflags;

        /* Functions to call when the VM is killed. */
        struct vm_kill_handler *kill_list;

        /* True if a HLT instruction happened and this VM is suspended. */
        bool hlted;

        /* Is NMI enabled? */
        bool nmi_sts;

        /* Is A20 enabled? */
        bool a20_state;

        /* Storage for the virtual A20 stuff. */
        u_long himem_ptes[16];

        /* Per-VM storage pointers for virtual devices. */
        void *slots[32];

        /* Defines the hardware virtualised for this VM. */
        struct cookie_jar hardware;
    };
Each task that is being used to provide a virtual machine has its TASK_VM flag set and its user_data field set to point to its instance of the above structure.
vm Function: struct vm * create_vm (const char *name, u_long virtual_mem, const char *display_type)
This function creates a new virtual machine called name (this string is copied). virtual_mem defines the total amount of memory to give it, in kilobytes; a value between 640 and 1024 will only result in 640K of available memory because of the special region starting at 640K.
The parameter display_type is a string naming the type of virtual video adaptor the virtual machine should have. A tty is opened using this argument (see section TTY Driver).
When the virtual machine begins executing its CS:IP will be FFFF:0, its virtual interrupt-flag will be set to zero (i.e. interrupts masked) and its virtual A20 line will be disabled (i.e. addresses above one megabyte wrap around to zero).
Note that the task will be suspended to allow other virtual devices to be added to the virtual machine before it is started. To begin execution use the kernel module's wake_task function.
If this function is successful it will return a pointer to this virtual machine's data structure, otherwise a null pointer will be returned.
vm Function: void kill_vm (struct vm *vm)
This function stops the virtual machine vm from executing, calls all the kill handlers registered with the virtual machine (see section Kill Handlers), closes the virtual machine's tty and finally deletes the virtual machine's task and its structure.
Almost all hardware devices in a PC use programmed I/O for their interface with the processor. Therefore virtual devices need to be able to virtualise all I/O port operations performed by virtual machines. Luckily the 80386 provides a means of doing this: all IN and OUT instructions can be made to trap, allowing the virtual machine's general protection fault handler to emulate I/O instructions.
Each virtual device must register the I/O ports which it wishes to virtualise; there are two levels of registration: either globally throughout the system or locally to a single virtual machine. When the general protection fault handler decodes an I/O instruction, it finds the port being addressed and then looks for a handler for this port for the current virtual machine. First it checks the local handlers for the virtual machine, then if no handler has been found, it searches the list of global handlers.
For each sequence of I/O ports that a virtual device wishes to virtualise it must create an instance of the following structure, fill in its fields as necessary and call the add_io_handler function to register it with the system.
    struct io_handler {
        struct io_handler *next;

        /* The name of this handler. */
        const char *name;

        /* The first and last (inclusive) I/O ports which this
           handler should trap. */
        u_short low_port, high_port;

        /* Functions to be called for read and write accesses of an
           I/O port respectively. PORT is the I/O port being addressed,
           SIZE defines the number of bytes (1, 2 or 4) being accessed. */
        u_long (*in)(struct vm *vm, u_short port, int size);
        void (*out)(struct vm *vm, u_short port, int size, u_long val);
    };
vm Function: void add_io_handler (struct vm *local, struct io_handler *ioh)
This function registers an I/O handler, ioh, with the system. If local is a non-null pointer the I/O handler is registered locally to the virtual machine local; otherwise the handler is registered globally.
Note that the structure pointed to by ioh must not be deleted or reused while it is still registered.
vm Function: void remove_io_handler (struct vm *local, struct io_handler *ioh)
This function does the opposite of add_io_handler: it removes the I/O handler ioh from the system. If local is non-null it defines the virtual machine to which the I/O handler was locally registered.
vm Function: struct io_handler * get_io_handler (struct vm *vm, u_short port)
This function attempts to locate an I/O handler for an access of the I/O port numbered port from the virtual machine vm.
If a suitable handler is found a pointer to it is returned, otherwise the function returns a null pointer.
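The local-then-global search order described above can be sketched as follows. This is a minimal model, assuming singly linked handler lists; the `sketch_` functions, the `global_io` list head, the `demo_in` handler and the cut-down `struct vm` are hypothetical and are not the vm module's code. The value -1 for an unhandled port follows the description of IN emulation in this chapter.

```c
#include <stddef.h>

struct vm;  /* forward declaration for the handler prototypes */

/* Copy of the io_handler structure described above, with plain C types. */
struct io_handler {
    struct io_handler *next;
    const char *name;
    unsigned short low_port, high_port;
    unsigned long (*in)(struct vm *vm, unsigned short port, int size);
    void (*out)(struct vm *vm, unsigned short port, int size, unsigned long val);
};

/* Toy VM holding only the local handler list. */
struct vm {
    struct io_handler *local_io;
};

struct io_handler *global_io;  /* hypothetical head of the global list */

/* Push a handler onto LOCAL's list, or onto the global list if LOCAL is null. */
void sketch_add_io_handler(struct vm *local, struct io_handler *ioh)
{
    struct io_handler **head = local ? &local->local_io : &global_io;
    ioh->next = *head;
    *head = ioh;
}

/* Return the first handler in the list whose port range covers PORT. */
static struct io_handler *find_in_list(struct io_handler *h, unsigned short port)
{
    while (h != NULL && (port < h->low_port || port > h->high_port))
        h = h->next;
    return h;
}

/* Local handlers take priority over global ones. */
struct io_handler *sketch_get_io_handler(struct vm *vm, unsigned short port)
{
    struct io_handler *h = find_in_list(vm->local_io, port);
    return h ? h : find_in_list(global_io, port);
}

/* Emulate a trapped IN: ports nobody handles read as -1 (all ones). */
unsigned long sketch_port_in(struct vm *vm, unsigned short port, int size)
{
    struct io_handler *h = sketch_get_io_handler(vm, port);
    return (h != NULL && h->in != NULL) ? h->in(vm, port, size)
                                        : (unsigned long)-1;
}

/* Trivial read handler used only to exercise the sketch. */
unsigned long demo_in(struct vm *vm, unsigned short port, int size)
{
    (void)vm; (void)port; (void)size;
    return 0x55;
}
```

Because the handler structure itself carries the list link, it has to stay allocated for as long as it is registered, which is the restriction noted for add_io_handler.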
Virtual devices usually have to store some per-virtual-machine state information. There is no obvious way to do this efficiently without being able to edit the struct vm used to represent a virtual machine. Since one of our main objectives was to make the system dynamically extendable (i.e. without having to recompile it all), editing the structure is simply not acceptable.
To solve this problem we came up with the idea of reserving a certain amount of space in each virtual machine structure to hold arbitrary pointers. Each pointer is called a slot and each virtual machine currently contains 32 slots. When a virtual device needs to store state for each virtual machine it can simply allocate an unused slot (using the alloc_vm_slot function) then use the slot with that index to store a pointer to a structure containing the state for that virtual machine. Note that each virtual device should usually only allocate one slot; the same index is used for all virtual machines.
Each slot has the type void *, i.e. a pointer of any type; the virtual device should cast this to the type of the structure being stored for each virtual machine. Normally these structures will be allocated dynamically (with the kernel's malloc function) for each virtual machine.
vm Function: int alloc_vm_slot (void)
This function finds an unused virtual machine slot, marks it as being in use and returns its index. If all slots are in use the value -1 is returned.
vm Function: void free_vm_slot (int slot)
This function marks the virtual machine slot with index slot as being free for allocation.
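One way such a slot allocator could work is a 32-bit usage bitmap shared by all virtual machines. The sketch below is an assumption about the mechanism, not the vm module's implementation; the `sketch_` names and the `slots_in_use` variable are hypothetical.

```c
#include <assert.h>

#define VM_SLOTS 32  /* matches the slots[32] array in struct vm */

unsigned long slots_in_use;  /* bit N set means slot N is allocated */

/* Find the lowest free slot, mark it used and return its index,
   or return -1 if every slot is taken. */
int sketch_alloc_vm_slot(void)
{
    for (int i = 0; i < VM_SLOTS; i++) {
        if (!(slots_in_use & (1UL << i))) {
            slots_in_use |= 1UL << i;
            return i;
        }
    }
    return -1;
}

/* Mark SLOT as free again. */
void sketch_free_vm_slot(int slot)
{
    assert(slot >= 0 && slot < VM_SLOTS);
    slots_in_use &= ~(1UL << slot);
}
```

A virtual device would typically allocate its slot once when its module is initialised and then store its per-machine state with something like `vm->slots[index] = state` each time it is installed in a virtual machine.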
Most virtual devices need to be able to perform some clean-up operations when a virtual machine that they are providing a virtual device for is deleted. Usually they will need to deallocate the resources being used by that particular virtual machine. Our system provides a method for virtual devices to hook into the kill_vm function used to destroy a virtual machine: a kill handler may be registered with the vm module for a particular virtual machine.
Each kill handler is represented by a structure, normally located in the structure used by the virtual device to store its per-vm state information.
    struct vm_kill_handler {
        struct vm_kill_handler *next;

        /* The function to be called when the virtual machine is
           killed. VM points to the virtual machine in question. */
        void (*func)(struct vm *vm);
    };
To install a kill handler in a virtual machine one of these structures must be allocated somehow, its func field set to point at a suitable function and the add_vm_kill_handler function called with the virtual machine to install it in and the address of the handler structure. Note that each kill handler structure can only be registered with a single virtual machine.
vm Function: void add_vm_kill_handler (struct vm *vm, struct vm_kill_handler *kh)
This function adds the kill handler pointed to by kh to the list of kill handlers that will be called when the virtual machine vm is killed.
Do not attempt to add the same kill handler to more than one virtual machine; the handler's next field is used to make a list of handlers in each virtual machine.
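The list mechanics behind kill handlers can be sketched as below. This is a toy model under stated assumptions: `struct vm` is cut down to the kill list plus a demonstration counter, and `sketch_run_kill_handlers` and `demo_cleanup` are hypothetical names standing in for what kill_vm and a device's clean-up function might do.

```c
#include <stddef.h>

struct vm;

struct vm_kill_handler {
    struct vm_kill_handler *next;
    void (*func)(struct vm *vm);
};

/* Toy VM: only the kill list, plus a counter used by the demo handler. */
struct vm {
    struct vm_kill_handler *kill_list;
    int freed_devices;  /* demonstration only, not in the real structure */
};

/* Link KH onto the front of VM's kill list. */
void sketch_add_vm_kill_handler(struct vm *vm, struct vm_kill_handler *kh)
{
    kh->next = vm->kill_list;
    vm->kill_list = kh;
}

/* Call every registered handler once, as kill_vm is described as doing. */
void sketch_run_kill_handlers(struct vm *vm)
{
    struct vm_kill_handler *kh = vm->kill_list;
    vm->kill_list = NULL;
    while (kh != NULL) {
        /* Fetch the link first: the handler may free the structure
           that contains KH. */
        struct vm_kill_handler *next = kh->next;
        kh->func(vm);
        kh = next;
    }
}

/* Example handler: a real device would release its per-VM state here. */
void demo_cleanup(struct vm *vm)
{
    vm->freed_devices++;
}
```

Since each handler structure holds exactly one `next` link, it can sit on only one list at a time, which is why a handler must not be added to two virtual machines.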
The task used to run a virtual machine is given a set of special exception handlers; the function of these handlers ranges from simply passing the exception straight through to the virtual machine to emulating the instruction causing the exception.
Exceptions 0, 1, 3 and 4 (Divide-by-zero, Debug, Breakpoint and Overflow) are handled in the simplest manner: when the instruction causing the exception is in the virtual machine (as opposed to being part of the 32-bit system code) an interrupt of the correct number is emulated on the virtual machine's stack.
The most complex of a virtual machine's exception handlers is its general protection handler. The reason for this is that a GP fault can happen in many different circumstances and many of these circumstances require that we emulate the instruction causing the fault.
The first action of the handler is to decode all prefix opcodes in the instruction into a bit-mask. It then does a case switch on the actual instruction opcode, attempting to emulate the instruction if necessary. If the instruction can't be emulated (or the fault is actually a fatal error) the virtual machine is stopped and its registers are printed to the console.
The following table describes the instructions which can be emulated and how the handler emulates each of them.
IN
INS
Emulates the IN instructions; the handler finds the I/O port being addressed, looks for an I/O handler for this port and if one exists calls it. The result of the operation is either stored in part of the EAX register or, for the string I/O opcodes, at the address pointed to by ES:DI. Note that if no I/O handler exists for the port the value used as the result is -1.
OUT
OUTS
Emulates the OUT instructions; these are handled in much the same way as the IN instructions.
CLI
STI
These clear and set the virtual interrupt flag. Note that on a real processor the STI instruction doesn't actually unmask interrupts until after the following instruction; unfortunately our emulation doesn't do this: it immediately unmasks interrupts. Because of this the instruction combination STI; HLT is detected and handled as a special case.
PUSHF
PUSHFD
The virtual EFLAGS register is combined with the task's actual EFLAGS register and this value pushed onto the virtual machine's stack. If an opcode prefix preceded the opcode all 32 bits are pushed, otherwise only the low 16 bits.
POPF
POPFD
Pops the FLAGS or EFLAGS register (depending on an opcode prefix) stored on the virtual machine's stack and uses it to set the machine's virtual and physical EFLAGS registers. If the virtual interrupt-enable flag is set any pending interrupts will be dispatched.
INT n
INT 3
Emulates a software interrupt by pushing the virtual machine's FLAGS, CS and IP registers onto its stack and setting CS and IP to the values from the vector table.
Note that the 80386 reference manual states that an INT 3 instruction causes a real Interrupt #3 to happen but our experience suggests that it may actually cause a GP fault.
vm Function: void simulate_int (struct vm *vm, struct trap_regs *regs, int type)
This function simulates an interrupt of type number type in the virtual machine represented by vm. The register frame regs should be the current frame of the virtual machine. This function should be used with care; a good place to call it from is in the task's return-hook, this ensures atomic access to regs.
IRET
IRETD
Emulates an IRET instruction; if an opcode prefix byte preceded the opcode an IRETD instruction is emulated. Basically the old IP, CS and FLAGS values are popped from the virtual machine's stack into its register frame.
HLT
Emulated by setting the hlted field of the virtual machine structure then suspending the machine's task.
REP INS
REP OUTS
Handled as for the same instructions without the REP prefix except that a block of data is handled at once.
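The INT n entry in the table above amounts to the real-mode interrupt sequence: push FLAGS, CS and IP, then load CS:IP from the four-byte vector at address n * 4. The sketch below models this on a flat byte array. It is an illustration only; `struct regs16`, `sketch_simulate_int` and the helpers are hypothetical, since the layout of the real `struct trap_regs` is not shown in this manual, and the `flags` field here stands for the virtual machine's view of FLAGS.

```c
#include <stdint.h>

#define FLAGS_TF 0x0100u  /* trap flag */
#define FLAGS_IF 0x0200u  /* interrupt-enable flag */

/* Hypothetical 16-bit register frame for the sketch. */
struct regs16 {
    uint16_t ip, cs, flags, sp, ss;
};

/* Real-mode address translation: segment * 16 + offset. */
static uint32_t linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Push a 16-bit value onto the VM's stack (little-endian). */
static void push16(uint8_t *mem, struct regs16 *r, uint16_t val)
{
    r->sp = (uint16_t)(r->sp - 2);
    uint32_t a = linear(r->ss, r->sp);
    mem[a] = (uint8_t)(val & 0xff);
    mem[a + 1] = (uint8_t)(val >> 8);
}

static uint16_t read16(const uint8_t *mem, uint32_t a)
{
    return (uint16_t)(mem[a] | (mem[a + 1] << 8));
}

/* Emulate interrupt TYPE: save the return state on the VM's stack,
   mask interrupts and jump through the vector table at address 0. */
void sketch_simulate_int(uint8_t *mem, struct regs16 *r, int type)
{
    push16(mem, r, r->flags);
    push16(mem, r, r->cs);
    push16(mem, r, r->ip);
    r->flags &= (uint16_t)~(FLAGS_IF | FLAGS_TF);
    r->ip = read16(mem, (uint32_t)type * 4);
    r->cs = read16(mem, (uint32_t)type * 4 + 2);
}
```

An emulated IRET is the mirror image: it pops IP, CS and FLAGS back off the same stack in that order.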
Virtual machines use a special page fault handler; it has two functions:
IP
over it and let the
virtual machine continue executing.
Note that the virtual A20 gate adds some complications to the above procedure: if the address is above 1M and the A20 gate is disabled the address is truncated. Also any addresses in the first 64K of the address space have two page-table-entries when A20 is disabled.
The invalid opcode fault is caused when the processor attempts to execute an instruction which it doesn't recognise; usually the action of our handler for this exception is simply to cause the related interrupt in the virtual machine.
There is however one special case: when the opcode faulted on is an ARPL instruction the system's list of Arpl handlers is scanned for one matching the byte stored after the opcode. If one is found the function it is associated with is called. The reason for this is to provide a safe method for 16-bit code running in a virtual machine to call functions in kernel mode. See section Arpl Handlers.
As briefly mentioned in the previous section, Arpl handlers provide a rudimentary system call interface for code running in virtual machines. This is mainly used in the Virtual BIOS to allow the 16-bit BIOS functions to call back to code in kernel mode to handle the BIOS call.
In a piece of 16-bit code an ARPL opcode (value 0x63) followed by a byte defining the Arpl handler to call is used to invoke an Arpl handler. The virtual machine's Invalid Opcode exception handler recognises that it is an Arpl call and invokes the correct handler.
For example to call the Arpl handler number 20 the following piece of assembly language could be used:
.byte 0x63, 20
Each Arpl handler is represented by a separate instance of the following structure:
    struct arpl_handler {
        struct arpl_handler *next;

        /* The name of this handler. */
        const char *name;

        /* The range of Arpl functions covered by this handler, both
           numbers are inclusive. */
        u_short low, high;

        /* The function to call when the handler is invoked by a
           virtual machine. REGS is the register frame of the machine,
           VM, which invoked the handler. SVC is the Arpl function
           invoked. */
        void (*func)(struct vm *vm, struct vm86_regs *regs, u_short svc);
    };
To register a handler use the function add_arpl_handler; when the handler must be deregistered the function remove_arpl_handler should be called.
vm Function: void add_arpl_handler (struct arpl_handler *ah)
Register the Arpl handler pointed to by ah with the system.
vm Function: void remove_arpl_handler (struct arpl_handler *ah)
Deregister the Arpl handler ah.
vm Function: struct arpl_handler * get_arpl_handler (u_short func)
Attempt to locate an Arpl handler to call to handle the Arpl function number func. If such a handler is found a pointer to it is returned, otherwise the function returns a null pointer.
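How an Arpl call might be decoded and matched against the registered ranges can be sketched as follows. This is an assumption-laden model, not the vm module's code: the `sketch_` functions and the `arpl_list` head are hypothetical, and the handler list is assumed to be a simple linked list searched in order.

```c
#include <stddef.h>
#include <stdint.h>

struct vm;
struct vm86_regs;

/* Copy of the arpl_handler structure described above, with plain C types. */
struct arpl_handler {
    struct arpl_handler *next;
    const char *name;
    unsigned short low, high;
    void (*func)(struct vm *vm, struct vm86_regs *regs, unsigned short svc);
};

struct arpl_handler *arpl_list;  /* hypothetical head of the handler list */

void sketch_add_arpl_handler(struct arpl_handler *ah)
{
    ah->next = arpl_list;
    arpl_list = ah;
}

void sketch_remove_arpl_handler(struct arpl_handler *ah)
{
    struct arpl_handler **p = &arpl_list;
    while (*p != NULL && *p != ah)
        p = &(*p)->next;
    if (*p != NULL)
        *p = ah->next;  /* unlink AH from the list */
}

/* Find the handler whose inclusive [low, high] range covers FUNC. */
struct arpl_handler *sketch_get_arpl_handler(unsigned short func)
{
    struct arpl_handler *ah = arpl_list;
    while (ah != NULL && (func < ah->low || func > ah->high))
        ah = ah->next;
    return ah;
}

/* Decode the two bytes at the faulting CS:IP. Returns the Arpl function
   number, or -1 if the opcode is not ARPL (0x63). */
int sketch_decode_arpl(const uint8_t *code)
{
    return code[0] == 0x63 ? code[1] : -1;
}
```

After the handler function returns, the invalid opcode handler would presumably also advance the virtual machine's IP past the two bytes so that execution resumes after the call.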
In the beginning the 8086 processor had a one megabyte address space; however, using the base and offset memory addressing model used by processors in the x86 line, addresses up to one megabyte plus 64K can be generated. In the 8086 these addresses were simply truncated to the low 64K of the system's address space. When later members of the x86 family were given bigger address spaces a problem occurred: how to maintain compatibility with the 8086 without stopping operating systems from using the memory above one megabyte.
The accepted solution was to allow systems to ignore bit 20 of all physical addresses (address line A20) if they wanted to. This is called the A20 gate: when it is enabled, addresses with a 1 in bit 20 work as normal; when it is disabled, these addresses are effectively truncated to a lower part of the address space. For some perverse reason the keyboard controller actually controls the A20 gate.
To ensure compatibility with real systems, virtual machines have to be able to virtualise the A20 gate. The obvious way to do this is to use the virtual machine's page tables to map the bottom 64K of memory into the space above 1M when the A20 gate is disabled.
When the Virtual Keyboard receives the command to set the state of the virtual machine's A20 gate it calls the following function.
vm Function: void set_gate_a20 (struct vm *vm, bool state)
This function controls the A20 gate in the virtual machine vm.
When the parameter state is TRUE the address line is enabled, when it is FALSE it is disabled.
The himem_ptes field of the virtual machine structure is used to store the old values of the page-table-entries for the 64K range starting at 1M while A20 is disabled and copies of the machine's low 64K entries are mapped into this region.
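With 4K pages, the 64K region above 1M is covered by 16 page-table-entries, which is why himem_ptes has 16 elements. The swap can be sketched on a toy page table as below. This is a simplified model, not the real set_gate_a20: `struct vm_a20`, its flat `page_table` array and `sketch_set_gate_a20` are hypothetical, and a real implementation would also have to flush the affected TLB entries.

```c
#include <stdbool.h>
#include <string.h>

#define PAGES_64K    16   /* 64K divided into 4K pages */
#define PTE_INDEX_1M 256  /* index of the page-table-entry for address 1M */

/* Toy VM: one flat page table covering 0 .. 1M+64K. */
struct vm_a20 {
    bool a20_state;
    unsigned long himem_ptes[PAGES_64K];
    unsigned long page_table[PTE_INDEX_1M + PAGES_64K];
};

void sketch_set_gate_a20(struct vm_a20 *vm, bool state)
{
    unsigned long *himem = &vm->page_table[PTE_INDEX_1M];

    if (state == vm->a20_state)
        return;  /* nothing to change */

    if (!state) {
        /* Disabling A20: remember the real entries for 1M..1M+64K, then
           alias the bottom 64K there so high addresses appear to wrap. */
        memcpy(vm->himem_ptes, himem, sizeof vm->himem_ptes);
        memcpy(himem, vm->page_table, sizeof vm->himem_ptes);
    } else {
        /* Enabling A20: put the saved entries back. */
        memcpy(himem, vm->himem_ptes, sizeof vm->himem_ptes);
    }
    vm->a20_state = state;
}
```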
Throughout the previous chapter we have referred to virtual devices, pieces of kernel code which virtualise the functionality of a piece of hardware found in a PC. Most virtual devices conform to a standard structure, allowing the user to specify the virtual devices given to a virtual machine in a very straightforward manner.
Each virtual device is stored in a module of the same name: for example the Virtual IDE device is stored in a module called `vide' (see section Modules). Instead of using the standard module structure as its base type a virtual device uses a special form of the module structure:
    struct vxd_module {
        /* A normal module structure. */
        struct module vxd_base;

        /* This function is called to create a virtual device of this
           type in the virtual machine VM. ARGC and ARGV define the
           arguments to the device. */
        bool (*create_vxd)(struct vm *vm, int argc, char **argv);
    };
Each virtual device simply declares one of these structures at the start of its module structure instead of the usual struct module; the shell commands to initialise virtual machines do all the rest. For example the Virtual IDE device defines its module structure as follows:
    struct vide_module {
        struct vxd_module base;
        /* Member functions follow... */
It then defines an instance of this structure as follows:
    struct vide_module vide_module = {
        { MODULE_INIT("vide", vide_init, NULL, NULL, vide_expunge),
          create_vide },
        /* Member functions follow... */
The following function description documents how the create_vxd function in the virtual device structure is called.
Function: bool create_vxd (struct vm *vm, int argc, char **argv)
This function is called to install a virtual device of this type into the new virtual machine vm. The virtual machine's task will not have started executing yet.
The two parameters argc and argv define the string arguments given to the shell command vmvxd when this virtual device was specified. They are similar to the parameters given to the C main function except that argv[0] really is an argument, not the name of the device; because of this, argc is one less than it would be in main.
The function should return TRUE if it succeeds in installing a virtual device in the virtual machine, FALSE otherwise.
Note that the module containing the function is closed immediately after the function returns. This means that the module's reference count won't be correct. To solve this, the function should normally increment the open_count field of the module; then, when an instance of the virtual device is deleted, open_count can be decremented. See section Modules.
As you can see the virtual device structure says nothing about how to delete a virtual device when the virtual machine is killed. Usually each virtual device adds a kill handler to the virtual machine as it is created. See section Kill Handlers.
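The argument convention (argv[0] is the first real argument) can be illustrated with a small parser of the kind a create_vxd function might call. This is a hypothetical sketch: `struct demo_config` and `demo_parse_args` are invented for the example, and the idea that a device takes an I/O port as its first argument is only suggested by a command such as `vmvxd vprinter 0x278`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-device configuration filled in from the arguments. */
struct demo_config {
    unsigned short base_port;
};

/* Parse the arguments as create_vxd would receive them: argv[0] is the
   first argument itself, not the device name. Returns false if the
   argument is missing or is not a valid 16-bit port number. */
bool demo_parse_args(int argc, char **argv, struct demo_config *cfg)
{
    char *end;
    unsigned long port;

    if (argc < 1)
        return false;

    errno = 0;
    port = strtoul(argv[0], &end, 0);  /* base 0 accepts 0x278 or 632 */
    if (errno != 0 || end == argv[0] || *end != '\0' || port > 0xFFFF)
        return false;

    cfg->base_port = (unsigned short)port;
    return true;
}
```

A create_vxd function built on this would return FALSE when parsing fails, so that the shell can report that the device was not installed.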
The shell is used to configure and launch a virtual machine; this involves initialising a virtual machine, specifying the hardware virtualised by this machine in a series of shell commands and a final command to actually launch the newly created virtual machine.
Shell Command: vminit [name] [memory-size] [display-type]
This command starts a new virtual machine initialisation block for a virtual machine called name.
The parameter memory-size defines the number of kilobytes of memory given to the new virtual machine while display-type names the type of virtual video adaptor given to the machine (see section Video Drivers).
If any of the optional parameters aren't specified suitable default values are chosen.
Shell Command: vmvxd module-name [args ...]
This command is used to install a virtual device into the virtual machine currently being configured (i.e. the one started by the most recent vminit command).
The parameter module-name names the virtual device module containing the virtualisation code; the optional argument strings args are passed to the virtual device's initialisation function.
Shell Command: vmlaunch
Use this command to end an initialisation block started by the vminit command. The virtual machine then starts executing.
Using the above shell commands blocks of commands completely configuring a virtual machine can be built. If these commands are saved in files they can be used as shell scripts to start a particular type of virtual machine. An example initialisation block follows (lines starting with `#' are considered comments by the shell).
    # Start the initialisation block for a VM with
    # 2 megabytes of memory and a CGA display.
    vminit 2048 CGA

    # Give it virtual PIC, PIT, DMA, CMOS and BIOS devices,
    vmvxd vpic
    vmvxd vpit
    vmvxd vdma
    vmvxd vcmos
    vmvxd vbios

    # a virtual IDE disk,
    vmvxd vide /usr/dos-hd.image

    # and a virtual printer.
    vmvxd vprinter 0x278

    # Now launch the new machine.
    vmlaunch
The PIC is the system's Programmable Interrupt Controller, used to receive interrupt requests from external devices and pass them along to the processor when possible. A PC has two PICs, providing a total of 15 different IRQs.
The virtual device module `vpic' should be installed in each virtual machine; it creates two virtual PICs for the task. It supports most of the features that a real PIC offers though some of the less often used features have been left out to keep things simple. The virtualisation should be able to cope with the usual ways in which the system's interrupt controllers are used. Supported features include:
simulate_vm_irq
.
vpic Function: void simulate_irq (struct vm *vm, u_char irq)
This function is used to simulate an IRQ of level irq in the virtual machine represented by vm. It is usually used by virtual devices when they need to virtualise an interrupt from the device they are emulating.
vpic Function: void IF_enabled (struct vm *vm)
This function must be called each time the virtual machine's virtual interrupt-flag is set (i.e. by the STI emulation). If any interrupt requests are pending they will be dispatched to the machine as soon as possible.
vpic Function: void IF_disabled (struct vm *vm)
This must be called when the virtual interrupt-flag of the machine vm is cleared (i.e. by the CLI emulation).
vpic Function: void set_mask (struct vm *vm, bool set, u_short mask)
This function alters the state of the mask of the Virtual PIC installed in vm. When set is TRUE all interrupt requests whose bit is set in mask are masked out; when set is FALSE all those IRQs whose bit in mask is set are enabled.
This function is usually called by a virtual device's create_vxd function to enable the interrupts that it virtualises. (To start with all IRQs are disabled except for IRQ2, the cascade for the slave PIC.)
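The mask behaviour can be sketched with a 16-bit word in which bit N covers IRQ N and a set bit means "masked". This is a minimal model under stated assumptions, not the vpic module's code: `struct vpic_sketch`, its `pending` field and the `sketch_` functions are hypothetical names.

```c
#include <stdbool.h>

/* Initial mask: everything masked except IRQ2, the cascade input. */
#define VPIC_INITIAL_MASK ((unsigned short)~(1u << 2))

/* Hypothetical combined state of the two virtual PICs. */
struct vpic_sketch {
    unsigned short mask;     /* bit N set: IRQ N is masked out */
    unsigned short pending;  /* bit N set: IRQ N has been requested */
};

/* SET true masks the IRQs named in MASK, SET false enables them. */
void sketch_set_mask(struct vpic_sketch *p, bool set, unsigned short mask)
{
    if (set)
        p->mask |= mask;
    else
        p->mask &= (unsigned short)~mask;
}

/* An IRQ can be delivered only if it is pending and not masked. */
bool sketch_irq_deliverable(const struct vpic_sketch *p, int irq)
{
    unsigned bit = 1u << irq;
    return (p->pending & bit) != 0 && (p->mask & bit) == 0;
}
```

A device virtualising IRQ7, for example, would enable its interrupt with a call along the lines of `set_mask (vm, FALSE, 1 << 7)`.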
The PIT is the Programmable Interval Timer, used by PCs to provide accurate time keeping of short intervals. The PIT has three channels, one of which is connected to IRQ0; each channel can be put into one of six different modes of operation (modes 0 to 5) to provide different types of timer (one-shot, continuous, etc.).
The virtual device module `vpit' should be installed in each virtual machine to give it a virtual PIT device; this emulates the actions of a real PIT. All the PIT I/O ports are virtualised and used to respond to the commands made of the device. A 1024Hz timer is used to simulate the channel zero interrupt up to a maximum rate of 200 times per second.
The following table lists the I/O ports virtualised:
The three channels are initialised to the same state that the BIOS initialises them to in a real PC. That is:
One of the main problems with virtualising the PIT is that virtual machines don't have the processor all of the time. This means that some thought has to be given to the way in which a timer is implemented. The two obvious alternatives are to run the timer in real-time (i.e. it still ticks even when the virtual machine is not running), or to run it in "virtual" time in which case the timer only ticks when the virtual machine actually has the CPU.
We decided to use the first option: run the timer channels in real time. This was chosen because it most closely resembles what a real PIT does: namely to measure short intervals of time accurately. It also gives time-of-day clocks ticking on IRQ0 a better chance of keeping time. One drawback of this method is that if a virtual machine's channel zero timer interrupts while the machine is not running the interrupt will not be received until the virtual machine returns to the head of the run queue. Even worse, if lots of tasks are waiting to run and a virtual machine is suspended for long enough that multiple timer interrupts occur the virtual machine will only receive one interrupt.
As noted above, only channel zero actually uses a timer to measure its intervals (see section Timers); this is because it is the only channel linked to an IRQ and therefore the only one needing to be asynchronous. The other channels simply store the time at which they were started; when their counter registers are polled the virtual PIT simply calculates the counter's value from the current time and the time at which the timer was started.
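Computing a counter value from the start time can be sketched as below. The sketch makes several assumptions that are not taken from the vpit module: times are measured in microseconds, the channel counts down repeatedly from its reload value, and the function name `sketch_pit_counter` is hypothetical. The input clock of a PC's PIT is about 1.193182 MHz, and a reload value of 0 stands for 65536.

```c
#include <stdint.h>

#define PIT_HZ 1193182ULL  /* PIT input clock in Hz (approximate) */

/* Return the value a poll of the counter register would see, given the
   channel's reload value, the time it was started and the current time
   (both in microseconds). */
uint16_t sketch_pit_counter(uint16_t reload, uint64_t start_us, uint64_t now_us)
{
    uint64_t period = reload ? reload : 65536ULL;
    /* Number of PIT clock ticks that have elapsed since the start. */
    uint64_t ticks = (now_us - start_us) * PIT_HZ / 1000000ULL;
    /* The counter runs down from PERIOD and reloads when it expires;
       a full count of 65536 reads back as 0 after truncation. */
    return (uint16_t)(period - ticks % period);
}
```

Nothing has to run while the channel is idle with this approach; the cost is paid only when the virtual machine actually reads the counter.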
vpit Function: struct vpit * get_vpit (struct vm *vm)
Returns a pointer to the structure representing the virtual PIT of machine vm. If no such device has been installed in vm a null pointer is returned.
There are two DMA chips in the PC. The first DMA chip is responsible for 8-bit transfers and occupies ports 0x00 - 0x0F. The second DMA chip is responsible for 16-bit transfers and occupies ports 0xC0 - 0xDF. Both chips also occupy the port range 0x80 - 0x90 where the page registers for both chips reside.
The emulation of the DMA chipset is very simple. It allows for programs to read and write the values of the DMA registers. It is the responsibility of other virtual devices to use the DMA information appropriately.
The following functions allow virtual devices to access the DMA chipset:
vdma Function: void get_dma_info (struct vm *vm, struct channel_info *info)
This function returns information about the settings of a DMA channel.
vm is the virtual machine.
channel is the DMA channel (0 - 7).
info is the structure where the channel information is placed.
This function returns nothing.
vdma Function: void set_dma_info (struct vm *vm, struct channel_info *info)
This function sets information about the settings of a DMA channel.
vm is the virtual machine.
channel is the DMA channel (0 - 7).
info is the structure where the channel information is obtained.
This function returns nothing.
The header file `vdma.h' defines the channel information structure as follows:
    struct channel_info {
        /* Memory address page */
        u_int8 page;

        /* Memory offset */
        u_int16 address;

        /* Length of transfer */
        u_int16 len;

        /* Transfer Mode */
        u_int8 mode;
    };
The BIOS is the PC's Basic Input/Output System: a set of assembly language routines to provide a standard interface to the devices found in most systems. These routines are divided into groups for each piece of hardware or concept; each group being called through a software interrupt.
Our virtual BIOS (a standard virtual device in a module called `vbios') handles most of the standard BIOS functions in a way that sits well with the rest of the system. Although in theory it would be possible to simply map a standard BIOS into the address space of each virtual machine (since our system virtualises the devices it must communicate with), in practice it is much better to use a custom BIOS, since it can be designed to use the system at the kernel level without incurring the large overhead of talking to each virtual device through the virtual I/O mechanism. For example the BIOS function to wait for a short time period uses a standard kernel timer instead of the virtual PIT.
Each virtual machine which has a virtual BIOS installed in it has a copy of some 16-bit code copied into its system ROM area. This code provides the assembly language functions called through the interrupt vector table when the virtual machine's operating system wishes to call a BIOS function. Some of these stubs are able to handle the BIOS function themselves (for example the keyboard services on INT 16) while others simply use an ARPL instruction to invoke a function in the vbios module itself to handle the BIOS function (see section Arpl Handlers).
The CMOS is a device introduced with the PC/AT used to store setup information relating to the hardware attached to the PC. This information includes the type and number of hard disks and floppy disks. The standard CMOS hardware allows storage of up to 64 bytes of information, although modern PCs can have up to 2K bytes of storage, this being used in a non-standard way according to the wishes of the manufacturer to hold information relating to non-standard extensions.
The hardware also includes an RTC (Real Time Clock) used to store the current time and date and also an alarm. The RTC can also provide a periodic interrupt at rates from 2Hz up to 8192Hz.
The CMOS is battery backed to allow the clock to keep the correct time even when the machine is switched off and to allow the permanent storage of the BIOS settings.
The CMOS is emulated by the `vcmos' module. Emulation of the time facilities is simulated by having a function activated every second to update the time and date and to check if the alarm should be set off. Allowance is also made for the periodic interrupt. Because each virtual machine's virtual CMOS has a separate time update function, each virtual machine can set its RTC to reflect any time and date without interfering with other virtual machines or the main clock used by the operating system.
Upon creation of a virtual CMOS, it is set up in the following way:
To allow for virtualisation of the CMOS, the following port references are trapped and redirected to the `vcmos' module.
Access to the CMOS is quite simple. The CMOS memory address (0 - 0x3F) is written to port 0x70 and then the data is either written or read by writing to or reading from port 0x71.
Because of the nature of the CMOS, the only CMOS memory address that needs special handling is 0x0C, which holds the status of the last interrupt to be activated. After this address has been read, its value must be reset to 0.
Another function of the CMOS is to enable and disable the NMI (Non-Maskable Interrupt). If bit 7 of the memory address written to port 0x70 is set then the NMI is enabled, else it is disabled. This affects the value of the nmi_sts flag in the vm structure.
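The port protocol described above (an address written to port 0x70, data moved through port 0x71, address 0x0C cleared once read, and bit 7 of the address controlling the NMI state) can be sketched in C as follows. This is a minimal model under stated assumptions: the structure, the function names and the 0xFF value returned for other ports are all hypothetical, and the handling of bit 7 simply follows the description in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CMOS_SIZE      0x40   /* 64 bytes of standard CMOS memory */
#define CMOS_ADDR_PORT 0x70
#define CMOS_DATA_PORT 0x71
#define CMOS_STATUS_C  0x0C   /* status of the last interrupt */

/* Hypothetical state for one virtual CMOS.  */
struct vcmos_state {
    unsigned char mem[CMOS_SIZE];
    unsigned char index;   /* address latched by the last write to 0x70 */
    bool nmi_sts;          /* mirrors the flag described in the text */
};

/* Emulate a byte written to one of the trapped ports.  */
void
vcmos_out (struct vcmos_state *s, unsigned int port, unsigned char val)
{
    assert (s != NULL);
    if (port == CMOS_ADDR_PORT) {
        s->index = val & 0x3F;
        /* Bit 7 selects the NMI state, following the text above.  */
        s->nmi_sts = (val & 0x80) != 0;
    } else if (port == CMOS_DATA_PORT) {
        s->mem[s->index] = val;
    }
}

/* Emulate a byte read from one of the trapped ports.  */
unsigned char
vcmos_in (struct vcmos_state *s, unsigned int port)
{
    unsigned char val = 0xFF;   /* assumed value for other ports */
    assert (s != NULL);
    if (port == CMOS_DATA_PORT) {
        val = s->mem[s->index];
        /* Reading the interrupt status clears it.  */
        if (s->index == CMOS_STATUS_C)
            s->mem[CMOS_STATUS_C] = 0;
    }
    return val;
}
```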
The `vcmos' module provides the following functions:
vcmos Function: u_char get_vcmos_byte (struct vm *vm, u_int addr)
This allows another module to get a byte in the Virtual CMOS memory.
vm is the virtual machine.
addr is the memory address in the range 0 - 0x3F.
This function returns the byte at addr.
vcmos Function: void set_vcmos_byte (struct vm *vm, u_int addr, u_char val)
This allows another module to set a byte in the Virtual CMOS memory. After the memory value has been set, the checksum for the CMOS is automatically recalculated.
vm is the virtual machine.
addr is the memory address in the range 0 - 0x3F.
val is the new value to set the memory location to.
This function does not return a value.
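The checksum recalculation mentioned for set_vcmos_byte is not detailed in this manual. As a sketch, the conventional PC/AT scheme sums the bytes at addresses 0x10 to 0x2D and stores the 16-bit result with the high byte at 0x2E and the low byte at 0x2F; whether `vcmos' uses exactly this range is an assumption, and the function name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Conventional PC/AT checksum layout (assumed, not taken from vcmos). */
#define CMOS_SUM_FIRST 0x10
#define CMOS_SUM_LAST  0x2D
#define CMOS_SUM_HI    0x2E
#define CMOS_SUM_LO    0x2F

/* Recompute the checksum of a 64-byte CMOS image in place.  */
void
cmos_update_checksum (unsigned char *mem)
{
    unsigned int sum = 0;
    unsigned int i;

    assert (mem != NULL);
    for (i = CMOS_SUM_FIRST; i <= CMOS_SUM_LAST; i++)
        sum += mem[i];
    mem[CMOS_SUM_HI] = (sum >> 8) & 0xFF;
    mem[CMOS_SUM_LO] = sum & 0xFF;
}
```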
vcmos Function: void get_bcd_time (struct vm *vm, struct vm86_regs *regs)
This function allows for emulation of the BIOS service INT 0x1A function 0x02. It reads the current time from the CMOS clock.
vm is the virtual machine.
regs is the register set of the virtual machine.
The register values are filled as follows:
CH
CL
DH
DL
EFLAGS
This function does not return a value.
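The time and date functions exchange values in packed BCD (two decimal digits per byte). The following C sketch shows the conversions involved. The helper names are hypothetical, and pack_cx assumes the conventional PC BIOS layout for INT 0x1A function 0x02 (hours in CH, minutes in CL) rather than anything stated in this manual.

```c
#include <assert.h>

/* Convert a binary value in the range 0-99 to packed BCD.  */
unsigned char
to_bcd (unsigned int n)
{
    assert (n <= 99);
    return (unsigned char) (((n / 10) << 4) | (n % 10));
}

/* Convert a packed BCD byte back to binary.  */
unsigned int
from_bcd (unsigned char b)
{
    return (b >> 4) * 10u + (b & 0x0F);
}

/* Build a 16-bit CX value, assuming CH = hours and CL = minutes.  */
unsigned int
pack_cx (unsigned int hour, unsigned int min)
{
    return ((unsigned int) to_bcd (hour) << 8) | to_bcd (min);
}
```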
vcmos Function: void set_bcd_time (struct vm *vm, struct vm86_regs *regs)
This function allows for emulation of the BIOS service INT 0x1A function 0x03. It sets the current time in the CMOS clock.
vm is the virtual machine.
regs is the register set of the virtual machine.
The clock is set according to the register values as follows:
CH
CL
DH
DL
This function does not return a value.
vcmos Function: void get_bcd_date (struct vm *vm, struct vm86_regs *regs)
This function allows for emulation of the BIOS service INT 0x1A function 0x04. It reads the current date from the CMOS clock.
vm is the virtual machine.
regs is the register set of the virtual machine.
The register values are filled as follows:
CH
CL
DH
DL
EFLAGS
This function does not return a value.
vcmos Function: void set_bcd_date (struct vm *vm, struct vm86_regs *regs)
This function allows for emulation of the BIOS service INT 0x1A function 0x05. It sets the current date in the CMOS clock.
vm is the virtual machine.
regs is the register set of the virtual machine.
The clock is set according to the register values as follows:
CH
CL
DH
DL
This function does not return a value.
vcmos Function: void set_alarm (struct vm *vm, struct vm86_regs *regs)
This function allows for emulation of the BIOS service INT 0x1A function 0x06. It sets an alarm in the CMOS clock.
vm is the virtual machine.
regs is the register set of the virtual machine.
The alarm is set according to the register values as follows:
CH
CL
DH
The carry flag in the EFLAGS register is cleared if the alarm is set successfully; it is set if an alarm is already set or the clock is stopped.
This function does not return a value.
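The carry flag convention used by set_alarm can be sketched in C as below. The carry flag is bit 0 of EFLAGS; the structure is a hypothetical stand-in for struct vm86_regs (whose real layout is not shown in this section), and the function name is also hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EFLAGS_CF 0x00000001UL   /* carry flag is bit 0 of EFLAGS */

/* Hypothetical stand-in for the register set of a virtual machine. */
struct fake_regs {
    unsigned long eflags;
};

/* Report the outcome of an alarm request through the carry flag. */
void
report_alarm_result (struct fake_regs *regs, bool alarm_was_set)
{
    assert (regs != NULL);
    if (alarm_was_set)
        regs->eflags &= ~EFLAGS_CF;   /* success: clear carry */
    else
        regs->eflags |= EFLAGS_CF;    /* failure: set carry */
}
```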
vcmos Function: void reset_alarm (struct vm *vm, struct vm86_regs *regs)
This function allows for emulation of the BIOS service INT 0x1A function 0x07. It clears any pending alarm request on the CMOS clock.
vm is the virtual machine.
regs is the register set of the virtual machine.
The alarm is cleared.
This function does not return a value.
The `vfloppy' driver provides an interface that mimics floppy drives under a virtual machine. It presently only provides support for accesses performed under BIOS control, using the INT 13H interface. Most applications will only use this interface rather than attempt to access the floppy at the I/O port level.
During the installation of the Virtual Floppy device, with the vmvxd shell command, the driver requires a single parameter thus:
image
The image argument is either the name of a floppy device followed by a colon (e.g. `fd0:', `fd1:', ...) or the name of a file. Presently, support is only given for linkage to a file.
The header file `<vmm/vfloppy.h>' defines the Virtual Floppy module. The functions that it provides (as well as the standard virtual device creation function) are as follows.
vfloppy Function: void kill_vfloppy (struct vm *vm)
Remove the Virtual Floppy device from the virtual machine specified by vm.
vfloppy Function: bool change_vfloppy (struct vm *vm, const char *new_file)
If the Virtual Floppy has not already been linked to a file or device, then this function will do so, linking the driver with the file specified in new_file in virtual machine vm. If the driver is already linked to a file, then this function will close the existing link and relink to the newly specified file.
vfloppy Function: int vfloppy_read_sectors (struct vm *vm, u_int drvno, u_int head, u_int track, u_int sector, int count, void *buf)
This function reads count 512-byte sized sectors from the Virtual Floppy device of the virtual machine vm to the buffer buf, a logical address in user space. Reading starts at the sector on the virtual disk defined by track, head and sector on the disk numbered drvno (this must be zero since only one virtual disk is supported by the controller).
The value returned is either the number of sectors successfully read or -1 to denote an error before any sectors could be read.
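Since the virtual floppy is backed by an image file, a head, track and sector triple has to be turned into a byte offset within that file. The following C sketch shows the usual calculation. It assumes the geometry of a 1.44M disk and that the image stores sectors in track-major order; neither assumption comes from the `vfloppy' module, and the function name is hypothetical.

```c
#include <assert.h>

#define SECTOR_SIZE 512u

/* Hypothetical geometry of a 1.44M floppy image.  */
#define FD_TRACKS   80u
#define FD_HEADS    2u
#define FD_SECTORS  18u   /* sectors per track, numbered from 1 */

/* Byte offset in the image file of the given sector.  */
unsigned long
floppy_chs_to_offset (unsigned int head, unsigned int track,
                      unsigned int sector)
{
    unsigned long lba;

    assert (head < FD_HEADS && track < FD_TRACKS);
    assert (sector >= 1 && sector <= FD_SECTORS);
    lba = ((unsigned long) track * FD_HEADS + head) * FD_SECTORS
          + (sector - 1);
    return lba * SECTOR_SIZE;
}
```

Note that sector numbers start at 1 while heads and tracks start at 0, which is why only the sector is decremented.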
vfloppy Function: bool vfloppy_get_status (struct vm *vm, u_char *statp, u_char *errp)
If the virtual machine vm has a Virtual Floppy installed
in it, the contents of the locations statp and errp are set
to the values of the controller's status and error registers
respectively, then the value TRUE
is returned. Otherwise, when
no Virtual Floppy is installed the value FALSE
is
returned and statp and errp left unmodified.
The virtual device module `vide' provides a complete emulation of a simple IDE controller with a single hard disk attached to it. When installing this virtual device in a virtual machine the module must be told how to virtualise the disk. This can be done either with a file in the system's filing system (see section The Filing System) or by using all of a partition on a physical hard disk. Note that using the partition representing the whole of a hard disk allows a virtual machine to see the same hard disk as the physical machine does.
For maximum performance the Virtual IDE device provides two levels of accessing the contents of a virtual hard disk:
When installing a Virtual IDE device in a virtual machine (using the vmvxd shell command) the arguments to the device define how the virtual disk is stored. The argument template is as follows:
image [size]
The image argument is either the name of a hard disk partition followed by a colon (e.g. `hda:', `hda1:', ...) or the name of a file. The optional size argument is only used when image is a file name; it defines the maximum number of blocks that the disk should contain (the actual number of blocks may be slightly less to allow for a suitable disk geometry).
Example commands to install a Virtual IDE device in the virtual machine currently being defined could be:
    # A 20M disk from the file `/usr/hd.image'
    vmvxd vide /usr/hd.image 40960

    # A disk using the file `/usr/dos.image', it will use the
    # size of the file to get its size parameter.
    vmvxd vide /usr/dos.image

    # Use all of the first physical disk as a virtual disk
    vmvxd vide hda:
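The remark that the actual number of blocks may be slightly less than the requested size follows from rounding the disk down to a whole number of cylinders. The C sketch below illustrates this with a fixed 16 heads and 63 sectors per track. Those values, the structure and the function name are assumptions made for the example; the geometry that `vide' actually chooses is not documented here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed values; the vide module may choose differently. */
#define GEOM_HEADS   16u
#define GEOM_SECTORS 63u

struct disk_geom {
    unsigned int heads, cyls, sectors;
};

/* Derive a geometry holding at most max_blocks blocks and return
   the number of blocks that are actually usable.  */
unsigned long
geom_from_size (unsigned long max_blocks, struct disk_geom *g)
{
    assert (g != NULL);
    g->heads = GEOM_HEADS;
    g->sectors = GEOM_SECTORS;
    g->cyls = (unsigned int) (max_blocks / (GEOM_HEADS * GEOM_SECTORS));
    return (unsigned long) g->cyls * g->heads * g->sectors;
}
```

With these assumed values a request for 40960 blocks yields 40 cylinders and 40320 usable blocks.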
When creating virtual machines note that it is probably not a good idea to share virtual disks between more than one virtual machine at any one time; it may well lead to catastrophic disk corruption!
The header file `<vmm/vide.h>' defines the Virtual IDE module. The functions that it provides (as well as the standard virtual device creation function) are as follows.
vide Function: void delete (struct vm *vm)
Remove the Virtual IDE device from the virtual machine vm.
vide Function: int read_user_blocks (struct vm *vm, u_int drvno, u_int head, u_int cyl, u_int sector, int count, void *buf)
This function reads count 512-byte sized blocks from the Virtual IDE device of the virtual machine vm to the buffer buf, a logical address in user space. The blocks will be read from the block on the virtual disk defined by cyl, head and sector on the disk numbered drvno (this must be zero since only one virtual disk is supported by the controller).
The value returned is either the number of blocks successfully read or -1 to denote an error before any blocks could be read.
vide Function: int write_user_blocks (struct vm *vm, u_int drvno, u_int head, u_int cyl, u_int sector, int count, void *buf)
This is similar to the read_user_blocks function except that blocks are written to the virtual disk instead of being read from it. The result is determined in the same way.
vide Function: bool get_status (struct vm *vm, u_char *statp, u_char *errp)
If the virtual machine vm has a virtual IDE controller installed in it, the contents of the locations statp and errp are set to the values of the controller's status and error registers respectively, then the value TRUE is returned. Otherwise, when no IDE is installed, the value FALSE is returned and statp and errp are left unmodified.
vide Function: bool get_geom (struct vm *vm, u_int *headsp, u_int *cylsp, u_int *sectsp)
This function sets the contents of the locations headsp, cylsp and sectsp to the number of heads, cylinders and sectors that the virtual hard disk of the virtual machine vm has, and then returns TRUE. If vm has no virtual hard disk the value FALSE is returned.
Every virtual machine is given a virtual keyboard, as part of its tty, allocated when the virtual machine is created. For this reason the `vkbd' module is not a standard virtual device and should not be included in the configuration procedure of a virtual machine.
A virtual keyboard is another type of logical keyboard; it exactly emulates a real keyboard and its 8042 controller device. As key codes are received by the logical keyboard they are translated back into scan codes and stored in a buffer in the virtual keyboard. If enabled, a virtual IRQ is generated to let the virtual machine know that new keyboard input is available.
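The buffering scheme described above can be modelled as a small ring buffer of scan codes with a flag recording that a virtual IRQ is due. The following C sketch is only an illustration: the structure, its capacity and the function names are hypothetical, and the translation from key codes to scan codes is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VKBD_BUF_SIZE 16u   /* hypothetical capacity */

struct vkbd_buf {
    unsigned char codes[VKBD_BUF_SIZE];
    unsigned int head, count;  /* index of oldest code, number queued */
    bool irq_enabled;
    bool irq_pending;          /* set when a virtual IRQ should be raised */
};

/* Queue a scan code; returns false if the buffer is full.  */
bool
vkbd_put (struct vkbd_buf *b, unsigned char code)
{
    assert (b != NULL);
    if (b->count == VKBD_BUF_SIZE)
        return false;
    b->codes[(b->head + b->count) % VKBD_BUF_SIZE] = code;
    b->count++;
    if (b->irq_enabled)
        b->irq_pending = true;
    return true;
}

/* Remove the oldest scan code, as a read by the virtual machine
   would.  Returns false if nothing is queued.  */
bool
vkbd_get (struct vkbd_buf *b, unsigned char *code)
{
    assert (b != NULL && code != NULL);
    if (b->count == 0)
        return false;
    *code = b->codes[b->head];
    b->head = (b->head + 1) % VKBD_BUF_SIZE;
    b->count--;
    return true;
}
```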
The I/O ports virtualised by this device are:
0x60
0x61
0x64
The printer port allows the connection of a printer. This device is virtualised by trapping three ports. Because more than one printer port can exist on a standard PC, the ports are given as offsets from the base address of each printer port. The standard printer port base addresses are 0x278, 0x378 and 0x3BC.
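Because the trapped ports are expressed as offsets from a base address, an I/O handler first has to work out which printer port an access belongs to. The following C sketch does this for the three base addresses given above. The table order and the function name are hypothetical and do not imply any particular LPT numbering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Base addresses listed in the text.  */
static const unsigned int lpt_bases[] = { 0x278, 0x378, 0x3BC };
#define LPT_COUNT (sizeof lpt_bases / sizeof lpt_bases[0])
#define LPT_REGS  3u   /* three ports are trapped per printer port */

/* Split an I/O port into a table index and a register offset.
   Returns false if the port belongs to no printer port.  */
bool
lpt_decode (unsigned int port, unsigned int *index, unsigned int *offset)
{
    unsigned int i;

    assert (index != NULL && offset != NULL);
    for (i = 0; i < LPT_COUNT; i++) {
        if (port >= lpt_bases[i] && port < lpt_bases[i] + LPT_REGS) {
            *index = i;
            *offset = port - lpt_bases[i];
            return true;
        }
    }
    return false;
}
```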
The ports virtualised by the `vprinter' module are:
Virtualisation is made more complicated because a method has to be found to distinguish between the end of one file printed to the printer and the start of a new file. Also, programs can write either directly to the printer port or can use BIOS functions. The `vprinter' module therefore tries to provide a reasonable emulation of the device, accepting that there will be limitations, rather than attempting to provide a perfect emulation.
The `vprinter' module also allows for emulation of some of the services provided by BIOS service INT 0x17.
The `vprinter' module provides the following functions:
vprinter Function: void printer_write_char (struct vm *vm, struct vm86_regs *regs)
This function allows for emulation of the BIOS service INT 0x17 function 0x00. It writes a character to the spool file associated with the given port of the virtual machine. If no spool file exists, it is opened.
vm is the virtual machine.
regs is the register set of the virtual machine.
The data is written to the spool file according to the register values as follows:
AL
DX
The AH register is set to the status of the printer port.
This function does not return a value.
vprinter Function: void printer_initialise (struct vm *vm, struct vm86_regs *regs)
This function allows for emulation of the BIOS service INT 0x17 function 0x01. It initialises the printer port: any previously opened spool file associated with this port is closed and submitted for printing, and a new spool file is opened.
vm is the virtual machine.
regs is the register set of the virtual machine.
The port is initialised according to the register values as follows:
DX
The AH register is set to the status of the printer port.
This function does not return a value.
vprinter Function: void printer_get_status (struct vm *vm, struct vm86_regs *regs)
This function allows for emulation of the BIOS service INT 0x17 function 0x02. It returns the status of the specified port.
vm is the virtual machine.
regs is the register set of the virtual machine.
The port is specified according to the register values as follows:
DX
The AH register is set to the status of the printer port.
This function does not return a value.