Cross compilation

The NXT brick runs an ARM7 CPU, so your stock system compiler will not produce code that runs on it. You need a cross-compiler for the arm-elf target (ARM CPU, ELF binary format). ARM toolchains can be built in several variants. This page describes the variant required by NxOS, and explains how to build a gcc cross-compiler with the correct options set.

Interworking support

ARM7TDMI processors can execute two different instruction sets: the traditional ARM instruction set, and the reduced Thumb instruction set. Thumb uses 16-bit opcodes instead of ARM's 32-bit opcodes, restricts most instructions to the lower half of the general purpose registers (r0 to r7), and is missing instructions for a few critical operations, such as manipulating the processor status register (CPSR).

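However, the Thumb instruction set has the major advantage of a much smaller memory footprint: each opcode is half the size of an ARM opcode, so Thumb code is considerably more compact even though it may need more instructions to do the same work. Furthermore, the core can fetch two Thumb instructions in a single 32-bit memory read, which reduces the load on the memory.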

To switch between the two instruction sets, the ARM7TDMI has the BX opcode (Branch and eXchange instruction sets), which lets you jump from ARM code to Thumb code and back. This is all well and good for manual jumps in hand-coded assembler, but for code produced by the C compiler, you need to explicitly tell the compiler to allow jumps between code written in either mode, and to generate the required "glue code" to get from one to the other.

By default, gcc generates opcodes in the ARM instruction set, with interworking disabled (no way to mix ARM and Thumb code). The -mthumb option instructs the compiler to output Thumb opcodes, and the -mthumb-interwork option instructs it to allow mixing ARM and Thumb code, and to generate the appropriate glue code.
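One way to check which of these options a given file was actually built with is to test the macros that gcc predefines on ARM targets. The following is a minimal sketch, assuming gcc's __arm__, __thumb__ and __THUMB_INTERWORK__ macros. The function names nx_build_mode and nx_interwork_enabled are hypothetical and are not part of NxOS.

```c
/* Hypothetical helpers (not part of NxOS) that report how this
 * translation unit was compiled. They rely only on macros that gcc
 * predefines on ARM targets; on any other target none is defined. */

/* Returns "thumb" when built with -mthumb, "arm" when built for the
 * ARM instruction set, and "not-arm" on a non-ARM compiler. */
const char *nx_build_mode(void)
{
#if defined(__thumb__)
    return "thumb";
#elif defined(__arm__)
    return "arm";
#else
    return "not-arm";
#endif
}

/* Returns 1 when built with -mthumb-interwork, 0 otherwise. */
int nx_interwork_enabled(void)
{
#if defined(__THUMB_INTERWORK__)
    return 1;
#else
    return 0;
#endif
}
```

Built with the cross-compiler using -mthumb -mthumb-interwork, nx_build_mode() should return "thumb" and nx_interwork_enabled() should return 1. Built with a native compiler on a PC, they return "not-arm" and 0.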

Soft float

The ARM7 processor in the NXT brick does not have any floating point arithmetic support. However, the abstract ARMv4T architecture does specify floating point opcodes, such as ADF. This is because the ARM architecture supports extension through coprocessors: when the ARM core reads an instruction that it cannot handle, it offers the opcode on the coprocessor bus, in the hope that a coprocessor will signal that it can handle it. If no coprocessor replies, the CPU raises an undefined instruction exception, which is ARM's name for an illegal instruction exception.

In the NXT's case, such opcodes will always raise this exception, since there are no coprocessors installed.

A little background information (skip to the next paragraph if you don't care). In this situation, there are two ways to handle floating point in software. The first is to let the CPU raise an undefined instruction exception, and use the exception handler to parse the offending opcode, figure out what to do, do it in software, and return control to the original code, just after that opcode. This method has the advantage that programs do not need to know that the instructions are not handled in hardware, but it is a very costly form of emulation (every operation requires an exception and runtime parsing of the opcode). The other solution is to tell gcc, at compile time, that there is no floating point support on the target architecture. In this case, gcc builds in soft-float mode, and inserts a call to a libgcc routine in each place where it would otherwise use a floating point instruction. The libgcc routine simply emulates that operation using integer instructions. Thus, all problematic opcodes are replaced by calls to emulation functions, directly at compile time. This means that the compiled program is in some way aware that it is running on a machine without floating point hardware, but that knowledge allows it to emulate faster than going through an exception and parsing opcodes at runtime.

On the Mindstorms NXT, since there are no coprocessors and no way to add any, the sensible choice is compile-time software floating point emulation, which is enabled by passing the -msoft-float option to gcc. For assembler source files, the corresponding GNU as command-line flag is -mfpu=softfpa.
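To make the effect of soft-float mode concrete, here is a small sketch of ordinary C floating point code, with comments naming the libgcc routines that gcc is documented to call for each operation when no floating point hardware is available. The function nx_raw_to_percent and the 0 to 1023 input range are hypothetical, chosen only for illustration, and the exact symbol names may differ depending on the toolchain's ABI.

```c
/* Hypothetical helper (not part of NxOS): scale a raw reading in the
 * range 0..1023 to a percentage in the range 0..100.
 *
 * The source is plain C. In soft-float mode, gcc replaces each
 * floating point operation with a call into libgcc, for example:
 *   (float)raw      -> __floatsisf  (int to float conversion)
 *   x * 100.0f      -> __mulsf3     (single precision multiply)
 *   x / 1023.0f     -> __divsf3     (single precision divide)
 * These are the generic libgcc names; some ABIs use other symbols. */
float nx_raw_to_percent(int raw)
{
    return ((float)raw * 100.0f) / 1023.0f;
}
```

The source code is identical whether or not the target has a floating point unit; only the generated code changes. Each of those calls costs many integer instructions, so floating point arithmetic is best kept out of time-critical code on the NXT.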

Compiler

To build gcc, you first need to install its dependencies. The two biggest (besides a working compiler toolchain for your machine!) are the GMP and MPFR libraries. If you are running Debian or Ubuntu Linux, you can get them simply by installing the development packages:

apt-get install libgmp3-dev libmpfr-dev

Once you have the dependencies, you need to actually build the compiler. This is not an easy task, since it takes a little creative navigating to build newlib (an embedded libc) alongside gcc, when each one depends on the other. Fortunately, we have worked this all out for you: just create an empty directory, cd to it, and from there run the source:nxos/scripts/build-arm-toolchain.sh script, which will download, extract, compile and install a full cross-compiler toolchain in that directory.

Once the script has finished running, the install subdirectory contains the installed toolchain. The build and src directories can be deleted, since they only contain the source code and intermediate build files.

The compiler script is currently based on gcc 4.1.1. A few tests with gcc 4.2 suggest that it would also work; the script just needs to be updated. Either version is fine for working with the NXT.