Writing ARM Assembly

From OMAPpedia

Revision as of 19:36, 3 October 2013 by Emrainey

Overview

This page covers the basics of writing ARM assembly for the OMAP platform using the GCC family of compilers and assemblers. If you have assembly in NASM format, you can port it using the guide at Porting NASM Assembly to GCC. For OMAP4 specifics, see Assembly Optimizations for OMAP4.

Reference

For instruction references, see ARM's site at http://infocenter.arm.com/help/index.jsp and look up the specific processor type in the OMAP you are using.

To find out which instruction set you can use, and therefore whether NEON or some subset of parallel (SIMD) instructions is available, see this table:

OMAP Type   ARM Type          ARM Version   SIMD
OMAP1xxx    ARM926EJ-S (1)    ARMv5         No
OMAP2xxx    ARM1136           ARMv6         Some
OMAP3xxx    ARM Cortex A8     ARMv7         NEON
OMAP4xxx    ARM Cortex A9     ARMv7         NEON
OMAP5xxx    ARM Cortex A15    ARMv7         NEON

(1) Some variation exists.

Makefiles

Make sure that your Makefile supports cross compiling with the ARM assembler. See OMAP Platform Support Tools. When compiling or assembling the assembly files, be sure to set $(CC) and $(AS) to the cross tools:

CC=$(CROSS_COMPILE)gcc
AS=$(CROSS_COMPILE)as

Assembly Files

Assembly files have historically been named with a .S or .s extension. Use .S (upper case) so that GCC passes the file through the C preprocessor before assembling it.

In the examples on this page, C parameters are named r0-r3 to show which register carries each argument. Arguments beyond the fourth are passed on the stack. If you can't avoid more than four, you can load the extra arguments from the stack into the r4-r11 registers in the prolog.

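Comments

Write comments either as /* comment */ or as line comments. On ARM, the GNU assembler's line comment character is @. A # starts a comment only at the start of a line, because elsewhere # introduces an immediate operand.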

Calling C functions from Assembly

Calling C functions from assembly is largely a matter of setting up the parameters correctly and then branching to the function.

In your C file, define the function (a separate declaration is only needed if other C code calls it).

int somefunc(int r0, int r1, int r2, int r3)
{
    // does something
    return 0; // the return value reaches the assembly caller in r0
}

In the Assembly File:

.extern somefunc

And in the subroutine itself:

    # move parameters manually to r0, r1, r2, r3
    bl somefunc
    # return value is in r0; r4-r11 are preserved by the callee,
    # but r0-r3, r12 and lr may have changed (bl itself overwrites lr)

If you push additional parameters onto the stack, you must also remove them after the function call to keep the sp correct.

Calling Assembly Functions from C

First, declare your function in a C header file so that the C/C++ code can find its prototype.

/** This is simple function which just returns 0 */
int function(int r0, int r1, int r2, int r3);

Second, define the function's symbol in the assembly file. Marking it .global exports the label so that the linker can resolve the reference from the C file.

.global function
function:

EABI Calling conventions

The EABI specification (http://en.wikipedia.org/wiki/Application_binary_interface#EABI) defines how functions are called, how stacks are used, which registers do what, and so on. This allows assembly and C to link together successfully, even across different compilers that support EABI. The calling conventions are summarized at http://en.wikipedia.org/wiki/Calling_convention#ARM. The EABI standard dictates that the ARM stack be "Full Descending", which means that a store decrements sp beforehand and a load increments sp afterward. You can use the actual addressing modes "DB" (decrement before, on stores) and "IA" (increment after, on loads), or just "FD" on both kinds of instruction.
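The "Full Descending" rule can be sketched as a small host-side model in C. This is only an illustration, not ARM code: the type fd_stack, the functions fd_init, fd_push and fd_pop, and the 16-word size are all hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of a full-descending stack (hypothetical names). */
#define STACK_WORDS 16

typedef struct {
    uint32_t mem[STACK_WORDS];
    int sp; /* index of the last word pushed; STACK_WORDS when empty */
} fd_stack;

static void fd_init(fd_stack *s)
{
    s->sp = STACK_WORDS; /* the stack starts at the top and grows down */
}

/* Store side ("DB"): decrement sp first, then store. */
static void fd_push(fd_stack *s, uint32_t value)
{
    assert(s->sp > 0);
    s->sp -= 1;
    s->mem[s->sp] = value;
}

/* Load side ("IA"): load first, then increment sp. */
static uint32_t fd_pop(fd_stack *s)
{
    assert(s->sp < STACK_WORDS);
    uint32_t value = s->mem[s->sp];
    s->sp += 1;
    return value;
}
```

In this model sp always points at the most recently stored word, which is what "full" means; pushing two values and popping them returns them in reverse order and leaves sp where it started.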

Prolog

The prolog typically saves registers r4 through r11 (you can save as many or as few as you need, but those are the usual ones). The ! after sp writes the decremented address back to the stack pointer (sp).

stmdb sp!, {r4-r11} /* Pushes 8 words; sp is decremented before each store */

If there are additional parameters on the stack, you can load them after the stmdb instruction, but you must offset sp by the amount you just pushed. The offsets below *assume* that you saved {r4-r11}, which is 8 words.

ldr r4, [sp, #(4*8)]  /* Loads parameter 5, which is now 8 words "up" the stack */
ldr r5, [sp, #(4*9)]  /* Loads parameter 6, which is now 9 words "up" the stack */
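As a cross-check of the offset arithmetic, here is a small host-side sketch in C. It assumes the ARM procedure call standard layout, where the fifth argument sits at [sp] on entry to the function, and that every argument occupies one 32-bit word. The helper name stack_arg_offset is hypothetical and not part of any toolchain.

```c
#include <stdint.h>

/* Hypothetical helper: byte offset from sp, after the prolog, of the
 * n-th integer argument (n >= 5) when saved_regs 32-bit registers
 * were pushed. Assumes each argument is one 32-bit word and that the
 * fifth argument was at [sp] on entry. */
static uint32_t stack_arg_offset(uint32_t saved_regs, uint32_t n)
{
    return 4u * (saved_regs + (n - 5u));
}
```

With 8 saved registers this gives 32 for parameter 5 and 36 for parameter 6. Saving a ninth register, such as lr, moves parameter 5 to offset 36.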

Epilog

The epilog restores the previous register set from the stack back to the registers and updates the sp value.

ldmia sp!, {r4-r11}

Return

The return sequence places the return value in r0 and moves the lr (the return address) into the pc. The next instruction fetched is then the one after the call to the function.

mov r0, #0
mov pc, lr

Optimized Return

You can reduce your code size by saving the LR in the prolog and popping it straight into the PC in the epilog, which also acts as the "return" statement. This example uses the "FD" stack mode.

stmfd sp!,{r4-r11,lr} /* stack save + return address */
...
/* 9 words are saved now, so parameters on the stack start at offset 4*9 */
ldr r4, [sp, #(4*9)]
...
ldmfd sp!,{r4-r11,pc} /* stack restore + return */

Register Renaming

With GAS-style assemblers, you can give registers aliases to aid readability.

name .req register

Example:

pixels .req r0
width .req r1
height .req r2 

    mul pixels, width, height

Complete Listing

.global function
function:
    # prolog
    stmdb sp!, {r4-r11}
    ... 
    # epilog
    ldmia sp!, {r4-r11}
    # return value goes into r0, here it's zero
    mov r0, #0
    mov pc, lr

Defining Strings

The assembler lets you define NUL-terminated strings, including escape sequences such as \n, with the .string directive:

.global final_message
final_message:
.string "Sorry for the Inconvenience\n"

Use a label before the string in order to reference it.

Defining Constants

The GNU assembler takes constants in the form of

.equ symbol, value

Such that you could do this (capitalization is optional):

.equ ANSWER_TO_LIFE_UNIVERSE_EVERYTHING, 42

Defining Data Arrays

When you need to define large static arrays of data (tables, precomputed values, multiple constants, etc.), you can lay them out with data directives under a label. The data is emitted into whichever section is active at that point, which is not necessarily the .data section.

.global my_array
my_array:
.long 127
.long 28
.long 94
.long 23

This symbol can then be used to load the values into registers for use in calculations, etc.

ldr r4, =my_array
ldr r5, [r4, #0x0]
ldr r6, [r4, #0x4]
ldr r7, [r4, #0x8]
ldr r8, [r4, #0xC]
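For readers who think in C, the loads above correspond to reading 32-bit words at byte offsets 0x0, 0x4, 0x8 and 0xC from the label. The sketch below is a host-side illustration only: it defines a stand-in copy of my_array so that it builds on its own, and load_word is a hypothetical helper name. With the real assembly file linked in, the stand-in definition would be replaced by the extern declaration shown in the comment (an assumption about how you would expose the symbol to C).

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the assembly table so this sketch builds on its own.
 * With the real .S file linked in, use instead:
 *     extern const uint32_t my_array[4];
 */
const uint32_t my_array[4] = { 127, 28, 94, 23 };

/* Read the 32-bit word at a byte offset from the label, in the spirit
 * of ldr rX, [r4, #offset]. The offset must be a multiple of 4. */
static uint32_t load_word(const uint32_t *base, uint32_t byte_offset)
{
    uint32_t value;
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}
```

Byte offset 0x4 is therefore the same element as my_array[1] in C, 0x8 is my_array[2], and so on.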

Types

Each directive takes zero or more expressions, separated by commas.

.byte 247         /* is 8 bit  */
.hword 2098       /* is 16 bit (.short also works; on ARM, .word is 32 bit) */
.long 10238476    /* is 32 bit */
.quad 23487928374 /* is 64 bit */
.octa 928374928734982734 /* is 128 bit */
.float 3.141528   /* is 32 bit IEEE floating point. */

.byte 0xEF, 0xBE, 0xAD, 0xDE /* Byte sequence 0xDEADBEEF in LITTLE ENDIAN */
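To see why that byte order yields 0xDEADBEEF, here is a short host-side sketch in C that combines four bytes the way a little-endian 32-bit load does, with the byte at the lowest address becoming the least significant byte. The function name load_le32 is hypothetical.

```c
#include <stdint.h>

/* Combine four bytes, lowest address first, the way a little-endian
 * 32-bit load would. */
static uint32_t load_le32(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Feeding it the sequence 0xEF, 0xBE, 0xAD, 0xDE produces 0xDEADBEEF.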

Defining Macros

The GNU assembler also allows macros which can be used to simplify some assembly routines.

.macro name operand [,operand,...]
   [instructions]
.endm

Here's an example that computes a four-value average:

# avg = (a+b+c+d)/4;
.macro average avg,sum,a,b,c,d
	add \sum, \a, \b
	add \sum, \c, \sum
	add \sum, \d, \sum
	mov \avg, \sum, lsr #2	
.endm
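The macro body can be modeled on a host in C to check what it computes. This is a sketch under two assumptions: the inputs are treated as unsigned 32-bit values (lsr is a logical shift), and the name average4 is hypothetical.

```c
#include <stdint.h>

/* Host-side model of the macro body: three adds, then a logical
 * shift right by 2. */
static uint32_t average4(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t sum = a + b;   /* first add: sum = a + b */
    sum = c + sum;          /* second add: sum = c + sum */
    sum = d + sum;          /* third add: sum = d + sum */
    return sum >> 2;        /* lsr #2 divides by 4, rounding down */
}
```

Two properties follow from this model: the result is rounded down, and the intermediate sum wraps at 32 bits, so very large inputs give a wrong average. For signed inputs an arithmetic shift (asr) would be the closer match.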

Odds 'n' Ends

You should start your assembly file with .text and finish it with .end.

.text 
...
.end 

Enabling NEON

If you are assembling ARMv7 instructions, including NEON, you must say so in the Makefile's AFLAGS with -march=armv7-a and -mfpu=neon. You can also say so in the assembly file itself:

.arch armv7-a
.fpu neon

Register Usage

There's a good table reference for which registers are used for what in GCC (during inline assembly at least) at [1], under "Register Usage".

Reference

DVP YUV NEON Color Convert Functions
