RISC World

ARM code for Beginners
Part 5

Stacks and Subroutines - Brain Pickard explains how anyone can program in ARM code.

Last time I introduced the method of saving data to and loading data from memory. In this part I will extend these ideas by introducing a data structure called a STACK and show how this can be used to save and load values in registers. But first the solutions for the problems set in part 4.

Write an ARM routine using a lookup table to produce the answer to the (cube-square) of integer from 1 to 256. This can be done in a few lines.

     DIM code% 2048
     FOR pass%=0 TO 3 STEP3
     [ OPT pass%
     ADR R2,table%        ;R2 points to start of the table
     SUB R2,R2,#4         ;subtract 4 since we are multiplying
                          ;the index R1by 4 (starting value
     LDR R0,[R2,R1,LSL#2] ;of R1 is 1 B% in BASIC) using LSL#2
                          ;to move to the correct word
     MOV PC,R14           ;in the table
     FOR n%=1 TO 256
     OPT FNformula(n%,pass%) ;this function sets up the lookup table
     PRINT"Formula macro Cubes-Squares"
     INPUT"Enter an integer (1 to 256) "B%
                           REM this means R1 will contain index value
     IF B%>0 THEN PRINT"Its Cube-Square is ";USR(code%)
     UNTIL B%=0
     DEF FNformula(n%,p%)
     EQUD INT((n%^3)-(n%^2)) ;works out the
                             ;cube-square for number held in n%

Write a routine to input a set of 10 positive integers storing them in memory, and then printing them out. You can use BASIC for inputting, printing and ARM code for storing in memory. This is slightly more involved.

     DIM code% 2048
     FOR pass%=0 TO 3 STEP3
     [OPT pass%
     LDR R1,index%    ;we need to remember the position
                      ;in memory for each store
     ADR R2,memstore% ;R2 points to the start of the memory block
     STR R0,[R2,R1]   ;store the value at R2+R1
     ADD R1,R1,#4     ;update index to next word ready
                      ;for next number
     STR R1,index%    ;store this value
     MOV PC,R14
     EQUD 0
     FOR n%=1 TO 10
     [OPT pass%
     EQUD 0
     PRINT”Storing 10 numbers in memory•
     FOR n%=1 TO 10
     INPUT”Enter an integer •A%
     CALL code%
     FOR n%=0 TO 9

Write a routine which will allow a user to enter a password. Each key press should produce a * on the screen. The password should then be checked against one already in memory and if correct a beep should be heard. The password should be case insensitive and at least 10 characters long. (Hint there are a set of SWI's which output single characters: SWI256+42 will output a "*" and a beep is SWI256+7). This is the most difficult one!

MODE28 DIM code% 2048 FOR pass%=0 TO 3 STEP3 P%=code% [OPT pass% ADR R0,message% ;R0 points to message 'Enter password' SWI”XOS_Write0• ;this SWI writes it to the screen ADR R1,memstore%;R1 points to memory store for user ;entered password MOV R2,#0 ;R2 to be used as counter/index for ;storing chrs in memory .inputlp% ;start of the input loop SWI”XOS_ReadC• ;this SWI waits till key pressed ASC code ;of key then loaded in R0 CMP R0,#13 ;has the user pressed Return i.e. finished ;entering password BEQ chkpass% ;if branch to checking routine ORR R0,R0,#32 ;if not then make all letters ;lowercase (case insensitive) CMP R0,#ASC(”a•);check range of key pressed if the key ;is below a or above z BLT inputlp% ;then reject this character CMP R0,#ASC(”z•);do not save it to memory and do not BGT inputlp% ;print a * SWI256+42 ;SWI256+ swi's print to screen the ;character whose ASCII code added STRB R0,[R1,R2] ;store the letter in memory note the ;byte suffix only require 1 byte/chr CMP R2,#10 ;is this the tenth letter ADDLT R2,R2,#1 ;if not then add one to index for next ;letter else stay at tenth position B inputlp% ;branch back for next input .chkpass% ;this is checking routine ADR R3,password%;R3 points to the memory block ;containing the correct password MOV R2,#0 ;R2 is the index .comparelp% ;here starts the comparison loop LDRB R0,[R1,R2] ;load user entered letter again note ;the use of byte suffix LDRB R4,[R3,R2] ;load correct password letter ADD R2,R2,#1 ;add one to index CMP R0,R4 ;compare these letters if they are not equal BNE getout% ;then branch to getout condition NOT EQUAL CMP R0,#0 ;see if we have reached the end of the ;user entered password (the BGT comparelp% ;memory block ends with a zero byte). If ;not go and compare next CMP R4,#0 ;see if we have reached the end of ;the correct password again this BGT comparelp% ;ends with a zero byte. If not go back ;for next comparison .getout% ;this part is reached under two conditions. ;If equal then entered SWIEQ256+7 ;password correct so beep. If incorrect ;then condition was not equal. MOV PC,R14 ;return to BASIC. .message% ;start of memory block for message EQUS”Enter password • EQUB 0 EQUB 0 .memstore% ;start of memory store for user entered ;password. EQUD 0 ;this is twelve bytes long (3 words) so ;that there will always be EQUD 0 ;a zero byte after the last allowed letter EQUD 0 .password% EQUS”mypassword•;this is the allowed password can be altered ;but must be 10 chrs long EQUB 0 EQUB 0 ] NEXT CALL code%

As I have mentioned before if you have found a different way which works your solution is correct. In fact if you have managed a short cut please let me know!


A stack is a block of memory which can be used to store and retrieve data. This data structure is used in many systems such as programming languages, assemblers etc. In fact the PostScript printer language invented by Adobe is based round stacks to store data and procedures.

If you imagine building a pile of coins (pence or pounds depending how rich you feel!). Placing coins on the pile is like saving data items onto a stack and removing coins off the pile is like retrieving data from a stack. Note the order in which you place and then remove the coins. You can only get to the top coin at any one time, so the last coin on will be the first one removed. This is called Last In First Out or LIFO. The action of saving data to a stack is called pushing and retrieving data is called pulling.

A computer must remember where the last data was placed in the block of memory. A stack pointer (SP) is used (see below). Referring to the diagram below.

This shows a small stack of 5 items of data. If we push an item then the length of the stack will grow as shown in the middle diagram. If we then pull 3 items of data off the stack it will shrink to 3 items in length as shown on the right.

Types of Stack

There are two types of stack called ascending and descending. The coin analogy was an example of an ascending stack. As data is pushed onto the stack the stack grows upwards (i.e. up the memory block).

A descending stack starts at the top of the memory block. Pushing data onto this type makes the stack 'grow' downwards (i.e. down the memory block).

The Stack Pointer (SP)

This pointer can be updated in different ways. Consider the following.

Here SP always points to the last item of data pushed onto or the next item of data to be pulled off the stack. Under these conditions, for an ascending stack the SP is updated as follows.

Pushing data

SP incremented upwards (or downwards for a descending stack).Data then pushed into the location SP is pointing.

Pulling data

Data pulled from the location SP is pointing, SP decremented downwards (upwards for an descending stack).

So for pushing SP is updated BEFORE the data is pushed and for pulling SP is updated AFTER the data is pushed. There is no reason why SP cannot be updated to point to the next available location (i.e. empty location) as shown below.

Under these conditions SP is updated as follows (for an ascending stack).

Pushing data:
Data pushed into the location SP is pointing.
SP incremented upwards (or downwards for a descending stack).

Pulling data:
SP decremented downwards (upwards for an descending stack).
Data pulled from the location SP is pointing so the location becomes the next free location.

As you can see this is the reverse of the previous case.

Full and Empty Stacks

A stack setup with SP pointing to the location containing data is called a FULL STACK. A stack setup with SP pointing to the next free location is called an EMPTY STACK.

So there are four ways of implementing a stack

  • Full ascending
  • Full descending
  • Empty ascending
  • Empty descending

Implementing Stacks in ARM code

There are two commands LDM and STM for loading or storing data to or from one or more registers. (LoaD Multiple registers and STore Multiple registers) These are used with suffices IA, IB, DA and DB to implement the four types of stack.

STMIA SP!,{register list}
This stores the register list Incrementing SP After each store. Note the use of the curly brackets { } This is pushing data onto an empty ascending stack.

LDMDB SP!,{register list}
This loads the register list Decrementing SP Before each load. This is pulling data from an empty ascending stack.

Note the use of the ! Write back is essential for stack implementation. Without it SP would not be updated! SP is a register whose contents point to the required position in memory being used as a stack. Inside {} a list of registers containing the data to be pushed or pulled. This list can be written different ways.

     LDMDB SP!,{R0,R3,R5}
     LDMDB SP!,{R0-R4}
     LDMDB SP!,{R0-R3,R6-R9,R11}

The use of the - means an inclusive list e.g. R0-R3 means R0 to R3. Register 13 is often used for SP (see later).

I will leave you to work out the suffices needed for implementing the above for the other types of stack.

Suffices FA,FD,EA and ED

As you might have noticed this is all a bit confusing! So the designers of the ARM assembler include the above suffices which make stack operations simpler.

FA and FD stand for Full Ascending and Descending stacks.
EA and ED stand for Empty Ascending and Descending stacks.

This simplifies the LDM and STM commands.

  • To implement a full descending stack use STMFD and LDMFD
  • For a full ascending stack STMFA and LDMFA
  • Empty descending STMED and LDMED
  • Empty ascending STMEA and LDMEA

It is up to the user to decide which type of stack to use. (I normally use full descending as this is the type of stack set up by Risc OS when using ARM code)

ARM code examples STMFD R13!,{R0-R8,R14} stores contents of R0 to R8 inclusive then R14 in a full descending stack using the contents of R13 as the stack pointer. LDMFD R13!,{R0-R8,R14} would then load the contents back into the registers. STMEA R0!,{R0,R5-R8} stores contents of R0 (which is the SP) then R5 to R8 in an empty ascending stack.

Using Stacks

This has been quite a long description but it is important to understand stacks. Now we come to their use. When you enter an ARM code routine from BASIC Risc OS treats it as a subroutine. It places the return address in R14, hence we have used MOV PC,R14 to return to BASIC. What if we wish to use R14 or modify its contents during our routine? We could use the STR R14,returnadd% then to return to BASIC we would have to use LDR PC,returnadd% This is fine for R14 but what if we wish to return to BASIC with the contents of ALL the registers preserved? This is where a stack is useful.

RISC OS sets up a full descending stack with the stack pointer in R13 when we enter an ARM routine.

     STMFD R13!,{R0-R12,R14}
will push the values of all useable registers onto the stack.
     LDMFD R13!,{R0-R12,PC}
will pull the values back into the registers and return us back to BASIC.


Duplicating code in any language leads to large and less easily serviced routines. Consider the problem of potting the following design.

There are two routines required to plot (red square, white circle). This could be done in a loop but what if we wish to plot the design smaller or larger, or even change the colours?

It would be better if we could produce code which acts like procedures in BASIC. These would take the three parameters (position, size and colour) and produce the design. Such routines are called subroutines.

ARM code Subroutines

To call a subroutine inside ARM code we use the branch command with an added link suffix thus:

     BL subroutine%

     BL plot%

The subroutine is defined else where in the code thus.

     MOV R0,R3  ;start of the subroutine coding
     MOV PC,R14     ;returning back to the main routine.

Subroutines within subroutines

In our design example the three plotting subroutines should be placed in a subroutine thus

     BL plotsquare%
     BL plotcircle%
     MOV PC,R14

There is a problem with the above structure. On entering the plotdesign subroutine R14 will contain the return address to the main routine. Branching to plotsquare will cause R14 to be over written with the return address to plotdesign. Similarly when branching to the other plot routines. I will leave it to you to see why the MOV PC,R14 make the routine go into endless loop!

To get over this problem we must store the contents of R14 before we branch to a sub routine, then restore the value when we return from a subroutine using it to return thus:

     STMFD R13!,{,R14} 
     BL plotsquare%
     BL plotcircle%
     LDM R13!,{, PC}

I have included the design program called subrtine for you to try out. It starts with a black screen. Clicking the left mouse button produces the design with the top left hand corner of the square at the pointer position.

Clicking at another position will draw the design at that position. Clicking with the right mouse button will change the size of the design and change the colours. Clicking with the middle mouse button will exit the routine.

The SWI"OS_Mouse" is used to determine the mouse state. This SWI returns the following data: R0 screen x coordinate, R1 screen y coordinate, R2 button state and R3 the time when button state changed.

Using this method of storing contents of registers is similar to the use of local variables in BASIC.

Try these problems:

  • Write an arm routine containing a sub routine to plot two concentric circles of differing radii and colours. Use the left mouse button to increase the radius and the right mouse button to change the colour.
  • Using a full descending stack write a routine in which the user enters integers which are saved on the stack. The routine should then print the numbers as it pulls them off the stack, hence showing the stack is LIFO.

That's all for this time. Next time I will return to the BASIC CALL command and show how we can use BASIC variables in our ARM code.

Brian Pickard