| Presentation

    Welcome to the first of a series of articles about the secrets of the
    Linux kernel. You have probably already taken a look at the kernel
    sources at some point. If so, you will have noticed that the initial
    couple of 100 KB compressed files have grown into more than 300 files
    containing more than 2 million lines of source code and taking up as
    much as 9 megabytes of compressed storage.

    This series is intended for advanced programmers rather than for
    newcomers. You are of course free to read it anyway, and the author
    will do his best to answer any questions or doubts you send by e-mail.

    New bugs are discovered and new patches are published almost every
    day. Nowadays it is almost impossible to understand the source code
    as a whole. It is co-written by many different programmers who try to
    keep a homogeneous coding style, but in practice their styles differ.

Linux: The Internet Operating System

    Linux is a freely distributable operating system for PC
    architecture and others. It is compatible with the POSIX 1003.1
    standard and includes a large number of features from Unix System
    V and BSD 4.3. Many substantial parts of the Linux kernel covered in
    this series were written by Linus Torvalds, a Finnish computer
    science student. The first kernel, version 0.01, was released in
    September 1991.

Main Features

    Linux meets almost all the needs of a current Unix-based operating system:
    Multitasking
    
    Linux supports true multitasking. All processes are
    independent, and none of them has to release the processor voluntarily
    for another process to run.

    Multiuser accessibility
    
    Linux is not only a multiuser operating system; it also offers multiuser
    access. It can share the same system resources among users connected
    through different terminals attached to the host.

    Executables loaded on demand
    
    Only the parts of a program that are actually needed are loaded into
    memory for execution.

    Memory paging
    
    If system memory is exhausted, Linux looks for 4 KB memory pages that
    can be released from memory and stored on the hard disk. If any of these
    pages is required again, Linux restores it from disk back into memory.
    Old Unix systems and some current platforms, including Microsoft Windows,
    swap memory to disk instead: all memory pages belonging to a task are
    saved on disk when there is a memory shortage, which is less efficient.

    Dynamic disk cache
    
    MS-DOS users are used to working with SmartDrive, a program that reserves
    a fixed area of system memory for disk caching. Linux instead has a much
    more dynamic disk caching system: the memory reserved for the cache grows
    while memory is unused and shrinks as needed when the system or user
    processes demand more memory.

    Shared libraries
    
    Libraries are sets of routines used by programs to process data. A number
    of standard libraries are used by more than one process at the same time.
    In old systems these libraries are included in every executable file and
    loaded redundantly into memory every time a new process using the same
    library is executed, which wastes memory. In modern systems like Linux,
    shared code is loaded just once and shared among all processes that use it.

    100% POSIX 1003.1 compliance, with some System V and BSD features supported
    
    POSIX 1003.1 defines a standard interface for Unix operating
    systems. This interface is described as a set of C routines and is
    currently supported by all modern operating systems; even Microsoft
    Windows NT has support for POSIX 1003.1. Linux 1.2 is 100%
    compliant with POSIX. Additionally, some System V and BSD
    interfaces are supported or are being implemented for further
    compatibility.

    Several executable file formats
    
    Who would not like to run any DOS, Windows 95, FreeBSD or OS/2
    application under Linux? DOS, Windows and Windows 95 emulators
    are under development for that purpose. Linux is also able to run
    binaries from other Intel-based Unix platforms that comply with
    iBCS2 (the Intel Binary Compatibility Standard, version 2).

    Several filesystem formats
    
    Linux supports a large number of file system formats. The most
    commonly used format nowadays is the Second Extended File
    System (Ext2). Another supported format is the File Allocation
    Table (FAT) used by DOS-based systems, but by design FAT is not
    suited to security or multiuser access.

    Networking
    
    Linux can be integrated into any local area network. All the usual Unix
    services are supported, including Network File System (NFS), remote login
    (telnet, rlogin), dial-up SLIP and PPP, and so on. Integration as a server
    or client for other networks is also supported, including file sharing and
    printing with Macintosh, NetWare and Windows.

    System V IPC
    
    Linux uses this technology to provide inter-process message queuing,
    semaphores and shared memory.

Compiling the Kernel

    Let's take a look at the kernel source code before studying the
    kernel itself.

Source tree structure:

Linux kernel sources are commonly located under the /usr/src/linux directory,
so directories are given relative to this location. As a result of the
ports to non-Intel architectures, the kernel tree was reorganized after version
1.0. Architecture-dependent code is located under the arch/ hierarchy. Code
for Intel 386, 486, Pentium and Pentium Pro processors is under arch/i386. The
arch/mips directory is for MIPS-based systems, arch/sparc for Sun SPARC-based
platforms, arch/ppc for PowerPC/Power Macintosh systems, and so on. We'll
concentrate on the Intel architecture, as it is the one most widely used with Linux.

The Linux kernel is just a standard C program. There are only two important
differences. First, the starting point for programs written in the C language is the
main(int argc, char **argv) routine, whereas the Linux kernel uses start_kernel(void).
Second, the program environment does not exist yet when the system is starting up
and the kernel is about to be loaded, so a few things have to be done
before the first C routine is called. The assembler code that performs this
task is located under the arch/i386/boot/ directory.

The appropriate assembler routine loads the kernel into the
absolute 0x100000 (1 MB) memory address, then installs the interrupt
service routines, the global descriptor table and the interrupt descriptor
table, which are used only during the initialization process. At this
point, the processor is switched into protected mode.

The init/ directory contains everything needed to initialize the kernel. Here is the
start_kernel() routine, which initializes the kernel properly, taking into
account all the boot parameters passed to it. The first process is created
without using system calls (the system itself is not loaded yet). This is the
famous idle process, the one that uses processor time when no other process
needs it.

The kernel/ and arch/i386/kernel/ directories contain, as suggested by their
path names, the main parts of the kernel. This is where the main system calls are
located, along with other tasks including the time handler,
the scheduler, the DMA manager, the interrupt handler and the signal
handler.

Code handling system memory is located in mm/ and arch/i386/mm/.
This area is devoted to allocating and releasing memory for processes.
Memory paging is also implemented here.

The Virtual File System (VFS) is under the fs/ directory.
The different supported file system
formats are located in separate subdirectories.
The most important file systems are Ext2 and Proc. We'll take a detailed look at
them later.

All operating systems require a set of drivers for hardware components.
In the Linux kernel, these are located under drivers/.

Under ipc/ you will find the Linux implementation of System V IPC.

Source code implementing several network protocols, sockets and Internet
domains is stored under net/.

Some standard C routines are implemented in lib/, so that the kernel
itself can use familiar C programming habits.

Loadable modules generated during kernel compilation are saved in
modules/, which is empty until the first kernel compilation is done.

Probably the most important directory for programmers is include/.
Here you find all the C header files used specifically by the kernel. Kernel
header files specific to Intel platforms are under include/asm-i386/.

Compiling:

A new kernel is basically generated in just three steps:
 
First, configure the customizable kernel options with "make config",
"make menuconfig" or "make xconfig" (different interfaces for the same
configuration stage).
Then, rebuild all source code dependencies with "make depend".
Finally, perform the actual kernel compilation with "make".
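
As a minimal sketch, the three steps above could be wrapped in a small shell function. The KERNEL_SRC and MAKE variables and the build_kernel name are assumptions made for this example only; they are not part of the kernel build system. Keep in mind that "make config" asks its questions interactively.

```shell
#!/bin/sh
# Sketch of the three build steps described in this article.
# KERNEL_SRC, MAKE and build_kernel are illustrative names introduced here.
KERNEL_SRC="${KERNEL_SRC:-/usr/src/linux}"   # usual location of the source tree
MAKE="${MAKE:-make}"                         # make program to invoke

build_kernel() {
    cd "$KERNEL_SRC" || return 1   # all steps run from the top of the tree
    "$MAKE" config || return 1     # step 1: configure (or menuconfig / xconfig)
    "$MAKE" depend || return 1     # step 2: rebuild source dependencies
    "$MAKE"                        # step 3: compile the kernel
}
```

Calling build_kernel from a shell runs the three stages in order and stops at the first one that fails. Replace "config" with "menuconfig" or "xconfig" to use one of the other configuration interfaces.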
We will go into detail about the background of these scripts and how to modify
them to introduce new configuration options in upcoming articles.

I hope you enjoyed this article. You are free to e-mail your comments, suggestions
and criticisms to elesende@nextwork.net. |