Linux-Kongreß '97

The Big Bang: Booting Linux on an Alpha PC

David A Rusling
Have you ever wondered just what happens when you switch on your machine? I have spent most of the last 2 years of my working life trying to make sure that the right things happen when an Alpha AXP processor is powered on. The right thing is, of course, to boot Linux. I believe that, just like the Universe, the most interesting things happen in the first few moments. This article takes a brief tour through the sequence of events from power on to the login prompt on an Alpha AXP based Linux system.

Chips with Everything

If you open up an Alpha AXP based Linux box you may be a little disappointed: the insides look very like those of a standard Intel X86 PC. This is because the major components of an Alpha based Linux system are mostly the same as those of an Intel PC. The most obvious difference is the processor itself, which is likely to be buried underneath a rather large heat sink; it is basic physics, electrical energy tends to end up as heat, and Pentium Pro and PowerPC based processors have the same heat dissipation problems. Next in importance (and speed) come the external cache and main memory, followed by the I/O buses. All but one of the Alpha AXP systems that run Linux include ISA and PCI buses. On those buses sit the hardware controllers which drive the physical devices in the system: the video, disk, keyboard and mouse devices. The memory, the data buses and the hardware controllers must all be set up by some software in the system before the Linux kernel can use these hardware resources.

When a microprocessor is turned on it must run some initialization software. Typically that software is held in non-volatile memory, either a flash or a ROM device; these days it is mostly flash. The startup software for Intel PC systems is known as the Basic Input Output System, or BIOS. This startup software has an aura of mystique about it because it plays such an important part in configuring the system; in reality, it is no more complex than an average Linux device driver. At power up all Intel chips start operating exactly the way that the Intel 8086 CPU used to: they start executing from address 0xFFFF0, near the top of the first megabyte, which is where the flash memory is mapped.

Alpha AXP PCs also use flash ROM based code to set up the system. Alpha AXP based systems all put the flash or ROM device in a separate address space from the system's main memory. Unfortunately, the Alpha AXP microprocessors have fewer addressing modes than Intel's X86 CPUs and they cannot reach any external address space without help from a support chipset. The Alpha AXP architecture has no instruction which means "read from I/O address foo and put the result in register bar". Instead, the support chips all provide I/O space access through a sparse memory mapping scheme. The sparse mapping schemes vary, but they all involve mapping some part of the Alpha's rather large address space onto I/O space. For systems using the APECS (2107x) chipset this means that the PCI I/O address space starts at 0xFFFFFC01C0000000. All of this means that the Alpha CPU cannot access the flash until after the main memory has been set up, and it most certainly cannot run instructions directly from flash or ROM. To solve this problem, an Alpha AXP system uses a serial ROM, or SROM, to set up the system's memory before it starts executing code that it copies from the flash ROM into the system's main memory.
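
To make this concrete, here is a rough sketch in C of what a single byte read from PCI I/O space looks like under a sparse mapping, modelled loosely on the way the Linux/Alpha inb() routines work. The base address is the APECS value quoted above, but the 32-byte stride and the byte-lane extraction are illustrative assumptions rather than a precise chipset description.

    /* A rough sketch of a sparse-space byte read.  The stride and the
     * byte-lane extraction are illustrative; a real chipset header
     * defines the exact encoding. */
    #define APECS_PCI_IO_BASE 0xFFFFFC01C0000000UL   /* as quoted above */

    static unsigned int sparse_inb(unsigned long port)
    {
        volatile unsigned int *addr;
        unsigned int longword;

        /* Each I/O port is spread ("sparsely") over 32 bytes of Alpha
         * address space; the low bits of the address select the size
         * of the PCI cycle. */
        addr = (volatile unsigned int *)(APECS_PCI_IO_BASE + (port << 5));

        longword = *addr;                             /* one PCI I/O read */
        return (longword >> ((port & 3) * 8)) & 0xff; /* pick the byte    */
    }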

The SROM

In an Alpha AXP based system the SROM code detects and sets up the system's banks of memory and loads the contents of the flash ROM image into memory. As power is first applied to the CPU, the contents of the SROM are clocked serially into the on-chip instruction cache via three pins on the processor (the image in the SROM even includes the cache's tag bits). At this point not even the external caches and main memory are running. The SROM code executes directly out of the internal instruction cache and runs in a privileged mode known as PALmode; PAL stands for Privileged Architecture Library. This code, unlike ordinary code, has access to all of the CPU's internal registers and state and, in a very real sense, is in complete control of the system; even kernel mode does not have the same access. PALcode can be thought of as a personality layer for the underlying CPU hardware.

It is the SROM code which detects the size and speed of the memory SIMMs that have been fitted and sets up the memory refresh timers. Once these have been set up, the SROM code's next task is to load a flash ROM image into memory. As flash has become cheaper there is more of it, sometimes as much as 1 Mbyte, which means that more than one image can fit into it. So that the SROM code can figure out which image to load, each image has a header. The header includes such information as where in memory to load the image and a checksum that allows the SROM code to detect whether the image is damaged. Additionally there must be a way to select a particular image. On some systems a jumper is used; on others a spare byte in the battery-backed RAM of the TOY (time of year) clock. This might sound a little bizarre, but system software designers tend to use these little tricks; the settings of a bank of jumpers can be read from a register which is mapped somewhere in the system's memory.
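
The header itself can be very small. The structure below is only an illustration of the fields just described; the names and the exact layout are assumptions, not the real SROM header definition.

    /* Illustrative shape of a flash image header; the field names and
     * layout are assumptions, only the contents described above matter. */
    struct flash_image_header {
        unsigned long load_address;   /* where in memory to copy the image  */
        unsigned long image_size;     /* how many bytes to copy             */
        unsigned long checksum;       /* lets the SROM spot a damaged image */
        char          name[16];      /* which image this is, e.g. "MILO"   */
    };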

Once the SROM has loaded the ROM image into memory there is a slight problem. The image was written to main memory as data by the SROM code running out of the internal instruction cache. This means that the data cache now contains parts of the image to be run but the instruction cache does not have any of the ROM image in it. This is a classic problem with Harvard architecture CPUs, where the instruction and data caches are separate. So, just before control is transferred to the image, the instruction cache is flushed, clearing it out ready for the console code. Of course, flushing the cache whilst you are executing out of it is tricky; you have to branch somewhere that is not in the cache and execute some pipelined NOPs whilst the cache is flushed.
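
The SROM has to do this flush by hand, but once PALcode is running the same job is wrapped up in a single unprivileged PALcall, IMB (instruction memory barrier). A minimal sketch of the usual C idiom is shown below; 0x86 is the IMB function number used by the Linux/Alpha headers (PAL_imb).

    /* A minimal sketch of flushing the instruction stream from C once
     * PALcode is available; the SROM itself cannot do it this way. */
    static inline void imb(void)
    {
        __asm__ __volatile__ ("call_pal 0x86" : : : "memory");
    }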

BIOS

Since the dawn of the PC, IBM's original PC of 1981, there has been BIOS code. The task of this code is to set up the system before loading software into main memory and transferring control to it. The software loaded can either be the operating system or a loader for the operating system; it lives in the first block (the boot block) of a block structured device, for example a floppy or a hard disk. Most BIOSs have a setup menu which allows some aspects of the system to be configured and, in particular, which devices to boot from to be selected. The Intel X86 architecture allows the BIOS code to be accessed via special callbacks and this means that, as well as setting up the system, the BIOS code can also be queried about the system topology by the running operating system. A good example of this is the setting up of the PCI subsystem. At boot time, a PCI based system must work out the topology of the PCI subsystem, what devices are in which slots and so on. In an Intel X86 based system this is done by the BIOS and, when Linux boots, it makes PCI BIOS callbacks to discover the layout of the PCI subsystem.

Mostly, though, Linux reinitializes the devices and subsystems of the PC; for example, it completely reinitializes the SCSI subsystem. This is logical, as there is no easy way to figure out just how a subsystem was set up without reinitializing it. This tendency of the Linux kernel to take complete control of the system is useful for non-Intel architectures such as Alpha AXP. One interesting feature of the BIOS is its ability to make use of the software supplied with hardware devices. For example, video devices contain on-board ROMs holding BIOS setup code that gets loaded and run by the system BIOS; this code is, of course, Intel X86 machine code. The BIOS code is specific to a particular system but, once you have the BIOS, all Intel X86 based systems look very much alike to the software.
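
When no BIOS callbacks are available (or trusted), the same PCI topology can be discovered directly. The sketch below walks bus 0 using PCI configuration mechanism #1 (the 0xCF8/0xCFC port pair); inl() and outl() are assumed to be the usual port I/O helpers, and error handling is left out.

    #include <stdio.h>

    #define PCI_CONFIG_ADDRESS 0xCF8
    #define PCI_CONFIG_DATA    0xCFC

    /* Assumed port I/O helpers, in the style of the kernel's own. */
    extern void outl(unsigned int value, unsigned short port);
    extern unsigned int inl(unsigned short port);

    static void scan_pci_bus0(void)
    {
        unsigned int slot, id;

        for (slot = 0; slot < 32; slot++) {
            /* enable bit | bus 0 | device | function 0 | register 0 */
            outl(0x80000000u | (slot << 11), PCI_CONFIG_ADDRESS);
            id = inl(PCI_CONFIG_DATA);
            if ((id & 0xffff) != 0xffff)        /* 0xFFFF means empty slot */
                printf("slot %2u: vendor %04x device %04x\n",
                       slot, id & 0xffff, id >> 16);
        }
    }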

The Linux kernel itself can be written directly into the boot block, but most systems instead use LILO, which is flexible enough to allow more than one operating system to be loaded; most dual-boot Intel X86 systems use this mechanism. LILO uses the BIOS callbacks and the underlying device drivers to load the Linux kernel into memory and transfer control of the system to it. The BIOS code remains resident but it does not take up much memory; after all, most of it is in flash, with only its data structures left in system memory.

All in all, BIOS is a rather nice thing to have; unfortunately, only Intel X86 PCs have it. So, what do other architectures do?

Consoles

Traditional minicomputers and mainframes also have special code, known as a console, which, on the face of it, does pretty much the same job as a PC's BIOS. The console must initialize the system's hardware, load the operating system kernel into memory and start executing it. When the operating system is running, the console provides it with information about the system's configuration. On some of the larger systems the console ran on a separate CPU, or even a separate machine; its job was not only to boot the system but also to monitor its execution. A number of consoles allowed you to stop the operating system, query its state and then allow it to continue running.

These non-Intel X86 systems have usually been built from non-standard components that, compared with those in the Intel PC market, are expensive. Furthermore, they did not, by and large, use generic devices that could be shared between machines; you could not take the graphics device from a VAXstation and put it into a SPARCstation. Every manufacturer wrote their own boot code for their own ends and had no need for it to be standard in any way. You could not walk up to one vendor's console and expect it to have the same interface as another vendor's machine; sometimes you could not even expect different machines from the same vendor to have the same console interface. By contrast, you can pretty much guarantee to boot any Intel PC, as they are very much alike even though their BIOS code may have been written by different companies.

Historically, then, non-PC computers have had their own boot code, known as a console. Digital is no different from the other manufacturers: our console is known as the SRM console (named after the System Reference Manual) and it is used to boot both OpenVMS and Digital Unix. The SRM console is an operating system in its own right, including such features as multi-threading and loadable device drivers; it is also somewhat bigger than the Linux kernel and Digital charges rather a lot of money for it. As it is itself an operating system, device drivers have to be written for it, quite apart from those in the operating system it boots. On the other hand, it was designed to boot Digital Unix and, when Linux was first ported to the Alpha, this is the console that was used. As well as loading the Linux kernel, the SRM console provides it with Digital Unix PALcode. Later versions of the SRM console include our BIOS emulation library, which allows us to run the Intel BIOS code from video cards in emulation, thus giving our systems a wider choice of video cards. BIOS emulation is also starting to be used for the SCSI subsystem.

Over the last few years the higher-end, workstation class of machines has moved towards using standard PC components, because they are widely available, cheap and now often very fast; a 64-bit PCI graphics card, for example, is very powerful. I cannot speak for other computer manufacturers, but Digital's move towards PC components allied with fast processors has been very marked. Five years ago Digital were shipping bespoke systems with almost no standard components; today there is very little difference between a workstation and a PC. Whilst the marketing people have seen Microsoft's Windows NT as their goal, the Alpha PC is also a big opportunity for Linux and, hopefully, for Digital.

The Windows NT ARC Console

There have been several attempts at a unified console architecture. One of the more successful was the ACE consortium's initiative, whose console architecture is what eventually became the Windows NT ARC console. Another is the OpenBoot standard, which uses a Forth interpreter to run the system's device startup code; it is an elegant solution but it has not achieved wide popularity. Digital ships the Windows NT ARC console on its Windows NT systems and, even though it is now owned by Microsoft, it still bears many of the features that were there in the ACE consortium's proposal. ARC is an environment for running operating system loaders. It includes standard callbacks for finding out the system's characteristics as well as device drivers which can be used by the OS loader. These device drivers have to be written specially for it, although it also uses Digital's BIOS emulation code. Like the SRM console, it remains in memory and can provide services to the running operating system. Being owned by Microsoft means that it is not free code (you need a firmware development license to work on it) and, as it runs only on Alpha AXP systems, it is not portable. It does, however, have the considerable advantage of being supplied at no charge. Unlike the SRM console, it does support filesystems: DOS and NTFS.

Unfortunately it has a large disadvantage so far as running Linux is concerned: it is based on Windows NT PALcode, which will not support any UNIX operating system and, worst of all, runs in superpage addressing mode rather than KSEG addressing mode, so none of the Linux code would even run.

MILO

The Milo image as it is first loaded is a small stub which itself contains a compressed copy of the full Milo image as data. This is done to save space when the Linux kernel is running. This compressed Milo image is re-entrant so that the Linux kernel can return control to it after a "shutdown -r now" command. The compressed Milo is left in memory (it takes up around 500 Kbytes) and, when Milo describes just what memory is available to the Linux kernel, it describes the memory containing the compressed Milo as not available. The compressed Milo works out what mode the system is now running in and, if necessary, changes to the correct mode. It also moves itself to where it should be in memory. This might seem rather weird behaviour, but Milo is designed to be loaded from several places: from flash, from the Windows NT ARC console and even from the SRM console. The compressed Milo image contains a very small piece of PALcode, whereas the full, uncompressed Milo contains the full Digital Unix PALcode for the system.
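
In outline, the stub's job looks something like the sketch below. The names (inflate(), MILO_LINK_ADDRESS and the section symbols) are purely illustrative; they are not the real Milo symbols.

    /* Purely illustrative; the symbol names are not the real Milo ones. */
    #define MILO_LINK_ADDRESS 0x200000UL          /* assumed link address */

    extern unsigned char __compressed_start[], __compressed_end[];
    extern void inflate(const void *src, unsigned long len, void *dst);

    void stub_main(void)
    {
        void (*milo_entry)(void) = (void (*)(void)) MILO_LINK_ADDRESS;

        /* Unpack the full Milo image to the address it was linked for... */
        inflate(__compressed_start,
                (unsigned long)(__compressed_end - __compressed_start),
                (void *) MILO_LINK_ADDRESS);

        /* ...and jump to it.  The unpacked image then checks what mode it
         * is running in and relocates itself if it has to. */
        milo_entry();
    }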

When the uncompressed Milo starts to run, pretty much the only thing it assumes has been configured is memory. Milo first sets up the serial ports of the system and then writes information about the boot sequence to COM port 1; if Milo fails to boot on a particular system, it is often a good idea to hook up a serial line to that port to see what is happening. After that, Milo's boot sequence is very similar to the Linux kernel's, except that Milo does not initialize quite as many things as the Linux kernel.
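
Getting characters out of COM port 1 needs nothing more than polled writes to a 16550-style UART, which is why it is such a reliable way of seeing what an otherwise silent machine is doing. A minimal sketch, assuming inb()/outb() port I/O helpers and the standard PC address 0x3F8:

    #define COM1_BASE 0x3F8
    #define UART_THR  (COM1_BASE + 0)   /* transmit holding register */
    #define UART_LSR  (COM1_BASE + 5)   /* line status register      */

    /* Assumed port I/O helpers; this is a sketch, not Milo's driver. */
    extern unsigned int inb(unsigned short port);
    extern void outb(unsigned char value, unsigned short port);

    static void com1_putc(char c)
    {
        while ((inb(UART_LSR) & 0x20) == 0)   /* wait until transmitter is empty */
            ;
        outb(c, UART_THR);
    }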

This similarity to the Linux kernel is no accident: Milo uses device drivers from the Linux kernel as is. This means that, theoretically at least, any device driver that works for Alpha Linux will work in Milo. So far, this is true for the floppy driver, the IDE driver, the NCR 810 SCSI driver, the Adaptec 2940 and 3940 SCSI drivers and the QLogic SCSI driver. It does not make sense, however, to include all of the Linux kernel inside Milo. Instead Milo uses pseudo-kernel code which maintains the interfaces that the drivers expect; for example, the timer code is Milo's but the PCI code is the kernel's. The drawback is that as the kernel code changes, Milo must change to keep up; I have learnt a lot about the Linux kernel this way. Another example of Milo's pseudo-kernel code is the kernel memory allocation/deallocation code. The Milo buffer code is much simpler and less elegant than the kernel code, but it gets the job done reasonably well, if not particularly efficiently.
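
To give a flavour of how much simpler a boot loader's allocator can be, here is a sketch in the bump-allocator style: hand memory out of a fixed arena and never really give it back. The names and sizes are illustrative, not Milo's actual buffer code.

    #define ARENA_SIZE (256 * 1024)               /* size chosen for illustration */

    static unsigned char arena[ARENA_SIZE];
    static unsigned long arena_used;

    void *boot_alloc(unsigned long size)
    {
        void *p;

        size = (size + 7) & ~7UL;                 /* keep 8-byte alignment */
        if (arena_used + size > ARENA_SIZE)
            return 0;                             /* out of boot memory    */
        p = &arena[arena_used];
        arena_used += size;
        return p;
    }

    void boot_free(void *p)
    {
        (void) p;    /* freeing individual blocks is not worth the trouble */
    }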

If it finds a VGA device, Milo will use Digital's BIOS emulation library to run (in X86 emulation) the on-board BIOS code from the video device. This emulation code is only shipped as a library and, unlike the rest of Milo, is not freely licensed. The Milo source tree also includes a freeware alternative that works reasonably well with S3 based chips.

Milo's role in life is to find and load the kernel. Milo includes a keyboard driver and so it has a simple command interface. Unlike Lilo, which is table driven, Milo accepts commands and even stores configuration information in NVRAM. Milo's interface is simple; it does just enough to find out which disk partition to load the Linux kernel from and what arguments to pass to the kernel. Milo supports filesystems (DOS, EXT2, ISO9660) and so the boot commands make human sense, for example:

MILO > boot sda2:vmlinuz root=/dev/sda2

Did I mention that Milo also figures out if the image is gzip'd and, if it is, automatically uncompresses it? When Milo has loaded the Linux kernel into the appropriate place in physical memory, it sets the page tables to their initial state and builds the HWRPB (Hardware Restart Parameter Block). It is the HWRPB which tells Linux what sort of system this is and what physical memory is free. Almost immediately, the Alpha AXP setup code discards the HWRPB and Linux goes on to reinitialize the system itself; it does, however, take notice of which physical memory Milo said was free and which was not.
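
To give an idea of the kind of information involved, the structure below sketches a few of the things a HWRPB carries. It is a deliberate simplification: the real layout is fixed by the Alpha architecture and mirrored in the kernel's asm-alpha/hwrpb.h, and these field names are not the real ones.

    /* A deliberately simplified sketch; not the real HWRPB layout. */
    struct hwrpb_sketch {
        unsigned long id;              /* the string "HWRPB" as a magic value   */
        unsigned long size;            /* size of the whole block               */
        unsigned long sys_type;        /* what sort of Alpha system this is     */
        unsigned long pagesize;        /* page size the PALcode was set up with */
        unsigned long memdesc_offset;  /* offset of the memory map saying which */
                                       /* physical memory is free and which not */
    };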

Milo has been very successful: most of the Alpha Linux systems sold include it, and it ships with the Alpha Linux distributions, Red Hat AXP Linux for example.