
Differences between μClinux and Linux

μClinux stands for microcontroller Linux (the Greek letter mu (μ) denoting micro, and C for controller). The name μClinux normally refers to the complete distribution, not just the kernel. Like Red Hat, Debian, Gentoo, or Damn Small Linux, μClinux is the name of a collection of userspace applications, userspace libraries, system libraries, and the Linux kernel.

There is an option within the mainline Linux kernel that allows it to be used on devices that do not support virtual memory, which enables Linux to run on low-cost devices without a Memory Management Unit (MMU). Since the mainline Linux kernel simply removes the two main features the MMU provides - (1) memory protection and (2) virtual memory - the differences between a noMMU Linux kernel and an MMU Linux kernel are small. This trickles down to few differences in userspace software as well.

Since the Blackfin processor has an MPU (Memory Protection Unit), which offers memory protection but not virtual memory, the μClinux distribution running on the Blackfin processor has better protection than other noMMU systems, and will not crash as often as noMMU Linux running on processors without an MPU.

Memory Protection

One consequence of operating without memory protection is that an invalid pointer reference (null pointer, or pointer to an odd address), even by an unprivileged process, may trigger an address error, and potentially corrupt or even shut down the system. Obviously code running on such a system must be programmed carefully and tested diligently to ensure robustness and security.

The Blackfin MPU does provide partial memory protection and can segment user space from kernel space, so this is less likely to be an issue on the Blackfin architecture than it is on other devices that run noMMU Linux.

On the Blackfin/Linux kernel, we start the kernel at 0x1000, leaving the first 4k of memory as a buffer for bad pointers. If an application dereferences a bad pointer that reads or writes into the first 4k of memory, the application can optionally be halted. To illustrate, take the trivial application:

#include <stdio.h>

int main ()
{
        int *i;

        i = 0;
        *i = 0xDEAD;
        printf("%i : %x\n", (int)i, *i);
        return 0;
}

If it is run on a Linux host (x86), you get:

rgetz@test:~/test> gcc -O3 test.c -o test
rgetz@test:~/test> ./test
Segmentation fault

If we cross compile,

rgetz@test:~/test> bfin-uclinux-gcc -Wl,-elf2flt -O3 ./test.c -o ./test

and run it on the target, we get:

root:~> ./test
0 : dead

Or

root:~> ./test
SIGSEGV

This is controlled by the kernel configuration option Halt on reads/writes to null pointer (DEBUG_HUNT_FOR_ZERO), under Kernel Hacking → Debug kernel. By default, this option is turned on, as it adds almost no run-time overhead and only occupies one additional locked CPLB entry. If you are really sure that your code is perfect, you can turn this option off and increase the speed and efficiency of the kernel by a tiny fraction.

Additional protection is provided against misaligned accesses: if an application follows a bad pointer that reads or writes main memory at an unaligned address (for example, accessing a 32-bit value at a 16-bit-aligned address), the application will halt. Other implementations of noMMU Linux may instead silently start writing over the kernel. This ensures that random crashes due to pointer corruption are less likely to happen on Blackfin/Linux.

The Blackfin MPU can also protect Linux applications from each other and from writing into kernel space, but this protection incurs a large run-time overhead.

Virtual Memory

Application Stack

When the Linux kernel is running on an architecture with an MMU, the MMU provides Linux programs with essentially unlimited stack and heap space for every application. This is done by the virtualization of physical memory. When a Linux application needs more memory, it asks the memory manager; the memory manager finds empty physical memory (if it can't find any, it swaps existing physical memory out to the hard drive and frees it) and maps this physical memory into the virtual address space of the application.

Without the virtual-to-physical address mapping, you cannot move physical memory around without breaking applications. This means the kernel cannot swap chunks of memory out to some form of storage media, so memory usage is limited to the amount of physical memory. Because it cannot support virtual memory, a noMMU kernel allocates stack space at the end of the data segment of the executable. If the application stack grows too large on noMMU Linux, it will overwrite the static data and code areas. This means that the developer, who previously could be oblivious to stack usage within the application, must now be aware of the application's stack requirements.

On Blackfin/Linux there is a compiler option to turn on stack checking. If the option -fstack-limit-symbol=_stack_start is given, the compiler will add extra code that checks at run time that the stack limit is not exceeded. This ensures that random crashes due to stack corruption will not happen on Blackfin/Linux.

Position Independent Code

One consequence of running Linux without virtual memory is that processes loaded by the kernel must be able to run independently of their position in memory. One way to achieve this is to fix up address references in a program once it is loaded into RAM; the other is to generate code that uses only relative addressing (referred to as PIC, or Position Independent Code). noMMU Linux supports both of these methods. This is really only an issue for assembly-level programmers, as the detail is hidden from most developers.

Fragmentation

Another consequence arises from memory allocation and deallocation occurring within a flat memory model. Heavy dynamic memory allocation can result in fragmentation, which can starve the system. The Linux kernel keeps track of all the memory in the system in power-of-two multiples of the page size (4k), resulting in a memory pool of 4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, 1M, 2M, 4M, 8M, 16M, and 32M chunks. When the kernel needs memory, it splits these chunks apart and hands them out to the various applications in the system. The problem is that some applications keep their memory while others give it back to the kernel, and with this dynamic behavior the kernel has a difficult time recombining chunks for future large allocations. This is commonly referred to as fragmentation: you may have 25M of free memory, but an allocation of 7M may fail unless an 8M contiguous chunk is available.

One way to improve the robustness of applications that perform dynamic memory allocation is to replace malloc() calls with requests from a preallocated buffer pool. Swapping pages in and out of memory is not implemented, since it cannot be guaranteed that the pages would be loaded to the same location in RAM (this requires virtual memory). In embedded systems it is also unlikely that it would be acceptable to suspend an application in order to use more RAM than is physically available.

Overcommit

The lack of virtual memory prevents the kernel from overcommitting memory. Overcommitting is where the kernel lets applications allocate large chunks of virtual memory without verifying that there is enough physical memory to back them. The reason this normally works is that applications commonly ask for huge buffers but only ever use small portions of them. So if the kernel lets an application think it has 20M while it only ever writes to 1M (and thus uses only 1M of physical memory), everything runs much more smoothly. Unfortunately, this behavior is achieved by delaying the physical allocation until the virtual memory is actually touched, which requires virtual memory support. So on a noMMU system, if your application asks for 20M, the kernel has to actually reserve 20M from the physical memory pool, possibly wasting 19M of memory if your application only needed 1M. Be mindful of your memory usage and wastage!

On Demand Loading

A common way to access files is via the mmap() (map to memory) call. On a system with virtual memory, this allocates a buffer of virtual memory the size of the file, not backed by physical memory until that part of the file is accessed. For example, if you read the last 4k of a 10M file, the only thing read into physical memory is that last 4k (at a 10M offset from where the beginning of the file was mmaped). You can randomly access parts of the file by treating it as a large buffer, and the MMU will load (or write) the file as it is accessed. Without an MMU there is no way to do demand loading, so the entire file must be read into physical memory before the mmap() call can succeed. This can lead to various out-of-memory conditions. It is recommended to use fopen/fseek/read/write when accessing large files on a noMMU system.

System Interfaces

The lack of memory management hardware on uClinux distribution target processors requires some changes to the Linux system interface. Perhaps the greatest difference is the absence of the fork() and brk() system calls. A call to fork() clones a process to create a child. Under Linux, fork() is implemented using copy-on-write pages. Without an MMU, the Linux kernel cannot completely and reliably clone a process, nor does it have access to copy-on-write. A noMMU Linux kernel implements vfork() in order to compensate for the lack of fork(). When a parent process calls vfork() to create a child, both processes share all their memory space, including the stack. vfork() then suspends the parent's execution until the child process either calls exit() or execve(). Note that multitasking is not otherwise affected. Programs which rely heavily on the fork() system call may require substantial reworking to perform the same task under noMMU Linux. Also, noMMU Linux has neither an auto-growing stack nor brk(), so user space programs must use the mmap() system call to allocate memory. For convenience, the C library implements malloc() as a wrapper to mmap(). There is a compile-time option as well as a run-time application (flthdr) to set the stack size of a program.

Memory Subsystem

The architecture-generic memory management subsystem of noMMU Linux has been modified to remove the reliance on MMU hardware, providing basic memory management functions within the kernel software itself. This is the role of the directory mmnommu, derived from and replacing the directory mm. Several subsystems were modified, added, removed, or rewritten; kernel and user memory allocation and deallocation routines have been reimplemented; and support for transparent swapping/paging has been removed. Program loaders which support Position Independent Code (PIC) have been added, and a new binary object code format named flat, which supports PIC, was created. Other program loaders, such as the one for ELF, have been modified to support formats which use absolute references instead of PIC. It is then the responsibility of the kernel to fix up these references at load time. Both methods have advantages and disadvantages: traditionally PIC is quick and compact but has a size restriction on some architectures, while the run-time fix-up technique removes this size restriction at the cost of extra overhead when the program is loaded by the kernel.

C library difference

The uClinux distribution uses uClibc as the standard C library instead of glibc.

uClibc is much smaller than the GNU C Library (glibc), the C library normally used with desktop Linux distributions. While both glibc and uClibc are intended to fully support all relevant C standards across a wide range of platforms, uClibc is specifically focused on embedded Linux. Features can be enabled or disabled according to space requirements, and there are features in glibc, not required by any standard, which uClibc does not support.

More information

There is a good article by David McCullough: uClinux for Linux Programmers.