An Engineer in Wonderland – Help Alice with assembler code

I am trying to write some code for a microcontroller – a PIC – and have stumbled across a problem, and a sort-of answer.

As I have no formal training in well-structured assembler code, I suspect my answer is sub-optimal – or even sub-passable – and could do with some help.

My hope is that some kind soul will say ‘What you need is to use Shubunkin’s inferior parameter pass, or the Smyth-Hamilton’s data swap’, or some such.

Anyway, this is the problem:

My interrupt service routine (ISR) stores a number in a byte of memory for the main programme loop to use. For the sake of argument, the byte is called John.

The main loop transfers this number to a working register to use it, then clears John.  

The problem that I envisage is that the ISR will very occasionally put a new value into John in the 1µs between the main loop reading John and the main loop clearing John.

Which will mean the main loop occasionally misses a new value from the ISR.

How do you get over that?

My clunky answer is to have the ISR read John to make sure it is clear before storing a new value into it. If John is not clear, the ISR will have to store the spare value until the next time tick and have another go at putting it into John.

The additional time delay will be no problem, but the ISR may have to store one or two spare values pending putting them into John.
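
The scheme described above can be modelled in C as a rough sketch. It assumes that 0 means "John is empty" (so 0 can never be a real data value) and that the ISR keeps a single spare slot; the names `isr_tick`, `main_take` and `isr_spare` are made up for illustration. One detail matters: the main loop clears John only when it actually read a non-zero value, otherwise an unconditional clear could still wipe a value that arrived just after the read.

```c
#include <stdint.h>
#include <stdbool.h>

/* Shared mailbox byte. 0 means "empty", so 0 cannot be used as data. */
volatile uint8_t john = 0;

/* One spare slot private to the ISR (hypothetical). 0 means nothing held. */
static uint8_t isr_spare = 0;

/* ISR side: called once per tick with a new non-zero value.
   Returns false if the value was dropped because both slots were full. */
bool isr_tick(uint8_t new_value)
{
    /* First try to deliver a value held over from an earlier tick. */
    if (john == 0 && isr_spare != 0) {
        john = isr_spare;
        isr_spare = 0;
    }
    if (john == 0) {
        john = new_value;        /* mailbox free: deliver directly */
        return true;
    }
    if (isr_spare == 0) {
        isr_spare = new_value;   /* mailbox busy: hold until next tick */
        return true;
    }
    return false;                /* nowhere to put it */
}

/* Main loop side: returns the value, or 0 if nothing new has arrived. */
uint8_t main_take(void)
{
    uint8_t value = john;
    if (value != 0) {
        /* The ISR never writes while John is non-zero,
           so this clear cannot destroy a newer value. */
        john = 0;
    }
    return value;
}
```

Because the ISR only writes John when it is 0 and the main loop only clears it when it is non-zero, each side changes the byte only in a state the other side leaves alone.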

And is there a book on good structured assembler programming that would help?


Please don’t respond below as our spam blocking system doesn’t work and the inbox is overwhelmed by all kinds of generous offers from a multitude of rather annoying people.

Should you feel the need, respond to with ‘Assembler’ in the title.

No email addresses are collected for marketing (or any other) purposes from responses to this blog. I will keep it that way for as long as possible.



  1. Thanks Mike.
    Richard L has explained that approach to me.
    As I understand it, it is like a queue, but with the top linked to the bottom.
    I am shying away from circular buffers and long queues because they need index pointers, and they are in the ‘too difficult’ category for now.
    Ian C
    I had a good look at your code and I think I now understand that yours is the semaphore approach, with a two entry queue in the ISR.
    I am tempted by this because the short queue doesn’t need index pointers.
    Thanks again to all who have taken the trouble. I have learned a lot and really am just trying to find some time to start coding it all.

  2. Hi,
    Use a circular buffer.

  3. Thanks Ian C, particularly for actually coding it.
    This is another one to print out and ponder.
    BTW, it took quite a while to get all those triangular brackets and spaces through the parser! Thanks to Alun, Electronics Weekly’s gallant webmaster, for showing me how.

  4. Dear “Alice”
    I’ve just read your article in the online magazine, and think I can help. Please see the suggestion below…
    Best regards
    Ian C
    What you need is double-buffering! 
    For this you need a second byte register, call it Keith, and a control register, Larry, with status bits J_full, K_full and overflow.
    The ISR then has two routines, Transfer (sub-routine) and New_Byte (in-line code):
    Transfer:
       If J_full = 1 And K_full = 0
          Copy John into Keith, set K_full, clear J_full
    New_Byte:       (ISR has new data)
       If J_full = 1
          set overflow
       Write new data into John, set J_full
    Assuming the ISR always has new data when it runs, the procedure is
    call Transfer             (move any data from John into Keith)
    New_Byte routine    (write new data into John)
    call Transfer             (move data into Keith, if Keith is currently empty)
    The main code is even simpler:
       If K_full = 1
          consume data from Keith, clear K_full
    In PIC assembler, this would be:
    John     equ   0x20        ; register at data memory address 0x20
    Keith    equ   0x21        ; register at address 0x21
    Larry    equ   0x22        ; at 0x22
    J_full   equ   1           ;bit 1 of Larry, Larry[1]
    K_full   equ   2           ;Larry[2]
    overflow equ   3           ;Larry[3]
    ;ISR Routine
    ;   :
       call  Transfer                                  ;move data between buffers
       btfsc Larry,J_full                              ;test John status
       bsf   Larry,overflow                            ;data is about to be overwritten! 
       movf  New_Data,w                                ;load WREG with new data (New_Data is a placeholder name for the data source)
       movwf John                                      ;store into John
       bsf   Larry,J_full                              ;set status as full
       call  Transfer                                  ;move data between buffers (again)
    ;   :
    ;   :
    ;   :                                              ;etc…
    ;Transfer sub-routine
    Transfer:
       btfsc   Larry, K_full                           ;test Keith buffer
       return                                          ;Keith is already full
       btfss   Larry, J_full                           ;test John buffer
       return                                          ;John is empty – nothing to transfer
       movf    John,w       ;move data between buffers
       movwf   Keith                                   ;
       bsf     Larry, K_full                           ;update buffer status
       bcf     Larry, J_full                           ;
       return                                          ;Transfer routine complete
    ;   :
    ;Main code
       btfss   Larry,K_full
       bra     Main_Other                              ;branch to do something else
       movf    Keith,w
       bcf     Larry,K_full
    ;do something with the data in WREG…
    ;   :
    ;   :
    ;   :
    Main_Other:                                        ;do something else…
    ;   :
    ;   :
    ;   :

  5. Phew. Much thanks to all.
    It has taken a while to understand (hopefully) all that information.
    Will, I think I get the idea, simply wait until just after the update and assume there is a distinct gap before the next one?
    Dave H, I am now thinking of a two or three element queue.
    Andy J, I thought about masking the interrupts, and that still might happen.
    To be honest, the application is not too demanding compared with some of the amazing stuff described by the guys above. It is a bit of a personal challenge to do the code ‘properly’ and also keep the interrupts open all of the time.
    Peter L, I had to print that out and take it home to read in peace and quiet!
    OK, that is the second suggestion to disable interrupts, so that is on the more-things-to-consider list now.
    BTW, as I understand it, the mid-range PICs do not have a pushable stack.
    And you said: …. a system which has been written by a hardware engineer who has ‘done’ a bit of software. No offense intended; but this is what happens…..
    And no offence taken. You have exactly pinned down the situation and I am suitably humbled :o) Although I will carry on as the application is not professional-grade and I am enjoying having my brain cells stretched.
    On the subject of C, I think that will have to be my next evening class.
    Hello again Richard H.
    I understand it now thanks. And I quite like that PIC mechanism – it is a bit like the junior edition of indexed addressing. I suppose it keeps the number of instructions down.
    And I didn’t realise multiply came from ROM now.
    The asymmetry I remember in 6800s was that the B I/O port had very different electrical properties to the A port. And I remember needing to supply that awful complex clock.
    Thanks again to all.
    It really is now a question of finding time to make the PCB and put the easy code in before playing with the data transfer between ISR and loop.

  6. 1. You asked: I have no experience at all with PIC index registers. Are they really there, or are they PIC-style registers-added-together?
    There are PIC registers called FSRs – File Select Registers. These are 16-bit registers (though the upper bits are not used) and they are employed to point at user storage.
    Although they are like index registers, the means of access to data is via indirection, rather than indexing.
    Associated with each FSR is a conceptual register called INDF from which data may be retrieved, or to which data may be sent.
    The process is to make sure the FSR is pointing at the correct location, then perform a move instruction (MOVF to read, MOVWF to write) on the chosen INDF (INDF0, INDF1 etc).
    The number of FSRs and INDFs varies from PIC family to family.
    The FSRs have handy features so that they can be incremented or decremented after the data transfer has been done, within the same instruction and instruction time.
    To do this, instead of making a transfer via INDFn, the transfer is made via POSTINCn or POSTDECn.
    This is the same as INDFn with an increment or decrement following.
    The process using INDF or POSTINC or POSTDEC is selected merely by the choice of a conceptual location allocated during the assembly process.
    The user does not need to pay heed to its being done.
    There is also a self-explanatory PREINCn as well as a PLUSWn.
    The PLUSW offsets the transfer by the contents of the accumulator, which is very handy.
    Unlike some processors relying on indexing, there is therefore no need to update the index register(s) subsequently.
    In addition to indirect storage, there is also direct random-access storage to a limited number of bytes.
    2. You asked: I do faintly remember that the 6809 had two of them. Wouldn’t they be useful for this???
    Yes, the 6809 does have 2 called X and Y.
    As it happens, I used to do subcontract lectures for Motorola on this processor and others in the 6800 family, in the 1970s and early 1980s.
    The ABX (add B to X) instruction of the 6809 enabled a form of INX to be operated, but there wasn’t a corresponding ABY.
    The X and Y registers were true 16-bit registers, but they had to be managed separately (and therefore subsequently).
    Revisiting all this 6800 stuff seems a bit odd.
    In those days, bit products were actually calculated using circuitry (eg the MUL instruction did serial add and shift, taking 11 cycles), whereas now the processors contain pre-calculated values in ROM so that the instruction decode may access the required result in one operation.
    The benefits of cheap memory show themselves readily here: in the 1970s, in-chip memory was generally used to provide access to the sub-instruction codes, rather than the precalculated results, merely because the larger ROMs we now take for granted had not been developed at that time.
    Does this help?
    Richard H
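
For readers more at home in C, the indirect-access modes named in the comment above behave much like pointer expressions. The following is only an analogy written in C, not PIC code; the function names are made up, and the signed treatment of W in `read_plusw` assumes PIC18-style behaviour.

```c
#include <stdint.h>

/* INDFn: access the byte the pointer refers to; the pointer is unchanged. */
uint8_t read_indf(const uint8_t *fsr)
{
    return *fsr;
}

/* POSTINCn: access the byte, then advance the pointer by one. */
uint8_t read_postinc(const uint8_t **fsr)
{
    return *(*fsr)++;
}

/* POSTDECn: access the byte, then move the pointer back by one. */
uint8_t read_postdec(const uint8_t **fsr)
{
    return *(*fsr)--;
}

/* PREINCn: advance the pointer by one, then access the byte. */
uint8_t read_preinc(const uint8_t **fsr)
{
    return *++(*fsr);
}

/* PLUSWn: access the byte at pointer + W; the pointer is unchanged.
   W is modelled here as a signed 8-bit offset (assumption). */
uint8_t read_plusw(const uint8_t *fsr, int8_t w)
{
    return fsr[w];
}
```

In each case the access and the pointer update happen together, which is why no separate "update the index register" step is needed afterwards.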

  7. You are absolutely correct when you assert that an interrupt could occur between reading John and clearing John. In that case the new value of John will be cleared and the data lost.
    Of course I know very little about the system that you are attempting to construct but let us suppose that you are reading a character from a UART and that you have arranged things so that an interrupt is asserted when the UART receive buffer is full. The main program will be interrupted and the ISR associated with the interrupt will be entered. Since the accumulator register A will be used in the ISR then we will save A on the stack:
    // ISR for UART RX interrupt – NB ISR is uninterruptible
    PUSH A // save A
    A <- UARTRX // read the received character (register name assumed)
    JOHN <- A
    POP A // restore A
    // program main loop
    // do stuff
    A <- JOHN
    // * interrupt ?
    JOHN <- 0
    // do stuff with A
    As you quite rightly say, it is possible that an interrupt will occur at the point marked with the asterisk, thus losing some data from the UART by clearing John. This problem can be solved by inhibiting UARTRX interrupts between storing the value of John in the A register and clearing John, so we have:
    LOOP: // main program
    // disable UARTRX interrupts
    A <- JOHN
    JOHN <- 0
    // enable UARTRX interrupts
    Usually the interrupt register for the given device will allow the program to set/reset bits to enable/disable various interrupt sources. In the worst case we can disable all interrupt sources (GDI) and then re-enable all interrupt sources (GEI), but it is not good practice.
    The above is a general solution and my notation is free-form and not to be confused with any processor on the market. Furthermore, I doubt that any software engineer would adopt such a primitive solution. If the system is complex then it is likely that a kernel of some sort would be used and the transfer of data from the ISR to the main body of the program be effected by using queues and/or protected variables. And of course C is now the preferred language for real-time programming at the level of 8/16/32 bit microprocessors.
    On the whole I would say, hire a professional software engineer who has a good 10 years’ experience, or more, in real-time programming at the microprocessor level. The extra cost for 4 weeks work will more than make up for the delays and the frustration of ending up with a system which has been written by a hardware engineer who has ‘done’ a bit of software. No offense intended; but this is what happens.
    I doubt that any one book will help: you will need a book on operating systems or kernels for microprocessors and if I recall there was a book written about 20 years ago which had as its title structured assembly language programming [for the PC]. It is an unfortunate fact; but there is not one book on the market which can address all your requirements. When I began programming micros at the assembler level I used to code in pseudo-Pascal and then literally translate the high level Pascal to assembler on a line by line basis. Nowadays it is better to find a good C compiler for the target processor and code entirely in C. Most C compilers will turn out concise code at the assembler level. Very rarely you might find it more effective to write an ISR in assembler, but that can be done within the C compiler.
    In any event I wish you the best of luck with your project, and, I am, yours sincerely
    Peter L
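
The interrupt-masking idea in the comment above can be tried out on a desktop with a small C model. This is a sketch under stated assumptions: the enable bit, the pending flag and the receive register are ordinary variables standing in for hardware, and every name (`hw_receive`, `rx_irq_disable`, `rx_irq_enable`, `main_take`) is hypothetical. On a real part the two helper functions would each become a single write to the relevant interrupt-enable bit.

```c
#include <stdint.h>
#include <stdbool.h>

volatile uint8_t john = 0;                    /* shared byte */

static volatile bool    rx_irq_enabled = true;   /* stand-in for the enable bit */
static volatile bool    rx_irq_pending = false;  /* stand-in for the hardware flag */
static volatile uint8_t rx_data = 0;             /* stand-in for the receive register */

/* The ISR: copy the received byte into John and clear the flag. */
static void rx_isr(void)
{
    john = rx_data;
    rx_irq_pending = false;
}

/* Simulates the hardware receiving a byte and raising the interrupt. */
void hw_receive(uint8_t byte)
{
    rx_data = byte;
    rx_irq_pending = true;
    if (rx_irq_enabled) {
        rx_isr();
    }
}

void rx_irq_disable(void)
{
    rx_irq_enabled = false;
}

/* Re-enabling services a pending interrupt straight away,
   which is how a latched interrupt flag behaves. */
void rx_irq_enable(void)
{
    rx_irq_enabled = true;
    if (rx_irq_pending) {
        rx_isr();
    }
}

/* Main loop: read and clear John with the interrupt masked. */
uint8_t main_take(void)
{
    rx_irq_disable();
    uint8_t value = john;
    john = 0;
    rx_irq_enable();
    return value;
}
```

A byte that arrives while the interrupt is masked is not lost in this model: the pending flag stays set and the ISR runs as soon as the interrupt is re-enabled, after John has been cleared.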

  8. I’ve done a bit of code over the years (was taught a bit at Uni- but long forgotten!) and in the last month I decided to teach my 15 year old son a bit of PIC programming… as the “state” system is more into just teaching how to drive Microsoft products!
    Your own solution should work, but yes it’s a bit “clunky”
    I think the most elegant solution is to simply disable the generation of interrupts in the Opcode before you do the read & clear, then re-enable in the opcode after. This means no interrupt will be handled during that microsecond you are worried about – but if an interrupt was pending it WILL be handled immediately afterwards. Even if it’s a regular interrupt from an internal timer or whatever the flag will just sit there to be picked up as soon as the interrupt is enabled and the overall time will not be altered.
    An alternative is some sort of “handshake” – passing another variable or “flag” between the two bits of code to ensure that they sync correctly.
    …but I’d go for disabling the interrupt – it’s called “masking” the interrupt – it’s what the Interrupt Enable bits in the PIC are for!!
    Best regards
    Andy J

  9. What you need to use is something called a QUEUE or FIFO.
    You simply put values into one end of the queue in your ISR and remove them from the other end of the queue in the main processing loop.
    David H

  10. What you are describing is a common problem for any driver writer who has to deal with concurrent accesses. Probably the simplest way to deal with this is the semaphore mechanism, but I would not go overboard in the implementation. Consider the real-time clock in every PC. It’s not OK to read the time while an update is in progress, so there is an Update-In-Progress (UIP) status bit that the user can read as a flag. While UIP is ‘busy’, each user simply polls the bit. Remember that to do this correctly (since you cannot guarantee atomic access to the busy bit), you ideally need to do something like:
    ->Take control
    At this point you should be OK to access the resource. It really doesn’t matter that this resource is a memory location or a hardware register.
    FYI: Personally I also agree with the suggestion to use ‘C’ code – even as a former BIOS developer used to assembler & C.

  11. My word, that was a comprehensive answer Richard. Thanks for that.
    I have had a thorough read, and feel somewhat ashamed that I have such a simple problem, and such a surfeit of microcontroller riches to solve it with compared with the 1960s.
    I have no experience at all with PIC index registers. Are they really there, or are they PIC-style registers-added-together?
    I do faintly remember that the 6809 had two of them. Wouldn’t they be useful for this???
    As it happens, I think I am going to go for the simple semaphore technique, and run a two byte stack – which is unlikely to overflow – in the ISR.
    If it is full, I shall throw away incoming data until it has some space again.
    Thank you all for my education.

  12. Hi Alice,
    The answer is to store the data in a small set of bytes with an index pointer for the storage process, and a separate index pointer for the retrieval process.
    Let us call this set of bytes a “conceptual rotating drum”. The storage process index pointer is updated by the ISR, and the retrieval index pointer is updated by the application program which uses the data.
    Let’s say you have the risk of which you speak, but (Case 1) it is fairly infrequent. In that case, the drum can be quite small, maybe even just two bytes.
    When the data is stored, the drum index pointer for storage is incremented.
    In the application routine, you periodically check whether the two index pointers, one for the storage and one for the retrieval, match each other.
    If they do not, it follows that the ISR has stored a byte which then needs to be retrieved.
    When the application program has control of the program counter, if it sees that the index pointers do not match, it then gets the next byte out of the drum and then updates its retrieval pointer.
    In a 2-byte example, this would then make the two pointers the same again, meaning there is no further work to do.
    Case 2
    Where data may come in bursts, faster than the application program can take it out of the drum, you merely make the drum bigger.
    It is often convenient in PIC to make the drum size a power of 2, so that you can then compute the rotating address offset merely by ANDing the last n bits in the offset where 2^n is the drum size.
    Additional fun
    If there is a danger of drum overflow, then it is possible for the ISR to check the displacement between the storage index and the retrieval index, so that if there is a danger of drum wrap-around, then the data source which is driving the ISR can be slowed or stopped, or data can be diverted to a second or third processor etc.
    Once, back in the late 1960s (just as the dinosaurs were leaving the earth…), I was working on an application where the data came in from what we would nowadays call a screenshot, and there were about 1850 interrupts each shot with a couple of bytes of data, and each interrupt came in every few microseconds. In those days, this was FAST!
    At the time, the computer had a machine cycle time of 960ns, and some instructions required several cycles.
    To overcome the risk of data crash (overwrite), a drum was created that was getting on for 4K bytes in length.
    This was filled by the incoming high-speed burst of CRT data, and then it was unpacked painstakingly by the relatively slow application program.
    This program only permitted a “screen-shot refill” when it was getting towards the end of the previously-stored data.
    Thus, the ISR was switched on and off by the application program.
    If such a thing occurred nowadays, and there was so much data that one processor could not handle it, then a corollary of this process I mention in the paragraph immediately above could be operated by turning on and off a second/third/fourth processor, and subdividing the task. (One early version of the Scotland Yard fingerprint analysis program used a self-controllable multiprocessor set just like this.
    How the fingerprint scheme works now I do not know – it may still use the same strategy.)
    I hope this helps.
    Best wishes,
    Richard L
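
The "rotating drum" described in the comment above can be sketched in C as follows. The drum size of 8, the names, and the choice to leave one slot unused (so that "indexes match" always means "empty" and never "full") are assumptions made for this sketch. With that choice a drum of N bytes holds N-1 values.

```c
#include <stdint.h>
#include <stdbool.h>

#define DRUM_SIZE 8u                  /* must be a power of 2 */
#define DRUM_MASK (DRUM_SIZE - 1u)    /* ANDing with this wraps the index */

static volatile uint8_t drum[DRUM_SIZE];
static volatile uint8_t store_idx = 0;     /* written only by the ISR */
static volatile uint8_t retrieve_idx = 0;  /* written only by the main loop */

/* ISR side: store one value. Returns false if the drum is full,
   so the caller can slow or stop the data source. */
bool drum_store(uint8_t value)
{
    uint8_t next = (uint8_t)((store_idx + 1u) & DRUM_MASK);
    if (next == retrieve_idx) {
        return false;                 /* storing would wrap onto unread data */
    }
    drum[store_idx] = value;
    store_idx = next;                 /* publish only after the data is written */
    return true;
}

/* Main loop side: fetch one value. Returns false if the indexes match,
   meaning there is nothing to retrieve. */
bool drum_retrieve(uint8_t *out)
{
    if (retrieve_idx == store_idx) {
        return false;
    }
    *out = drum[retrieve_idx];
    retrieve_idx = (uint8_t)((retrieve_idx + 1u) & DRUM_MASK);
    return true;
}
```

Each index is written by one side only, and each is a single byte, so on an 8-bit processor neither side can see a half-updated index and interrupts can stay enabled throughout.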

  13. Thanks Alistair
    Two days ago I had heard of neither semaphore* nor mutex, and now I have two ways of sorting the problem, and will ponder the options.
    I am rather pleased that the way I was originally proposing to do it is almost the semaphore technique, with 00 in the byte replacing the semaphore bit.
    Thanks to both of you for permission to stack in the ISR 🙂
    * except in the arm-waving sense

  14. Hi Alice,
    I read your latest blog post, and this is a classic case of a Mutex (mutual exclusion). Wikipedia has a good reference.
    I guess the simplest thing to do would be to temporarily switch off the ‘John’ interrupt while you are clearing the old value. However that merely shifts the problem to the ISR – you could potentially miss a new event while the ISR is off. It depends on what your interrupt event is – is it a timer (you get a small bit of clock drift), or is it an external event (how long does this event occur? Is it edge sensitive?)
    If you simply must catch every event, then the ISR would have to push the ‘John’ values onto a stack, and have the main routine pull these off the stack. Obviously the main routine must be able to keep up with the ISR or you get stack overflow.
    Hope this helps,

  15. Peter.
    Thanks for such a comprehensive answer.
    I will study your words at my leisure, and the wikipedia entry.

  16. Looks like you need a simple semaphore.
    Wikipedia goes overboard here [Semaphore-(programming)] but you just need a ‘ready’ flag that either says ‘ready for new data’ or ‘haven’t caught up with the last data yet’.
    Interrupt service routine goes like this:
    …………….Generate John
    …………….If ready flag is set, write John and clear ready flag.
    …………….If ready flag is not set, foreground wasn’t ready.
    …………………………..(Do some kind of queueing, and attempt to catch up.)
    Foreground goes like this:
    …………….Set ready flag
    …………….loop: Is ready clear?
    …………………………..Read John
    …………………………..Set ready flag
    …………….(other stuff)
    …………….goto loop……………..(excuse all the dots. Indents don’t work. Editor)
    So the interrupt clears ready, but never sets it. The foreground sets ready, but never clears it.
    You were asking for books on structured assembly. In my commercial and open-source-hacker experience, I’ve found that if an assembly program is large enough that it needs a structure, you should strongly consider coding in C instead – that’s exactly what the language was originally designed for. I’ve very rarely seen code in assembly that can’t be written in C, and the C version is typically easier to read, easier to modify, runs just as fast, uses just as little memory and is much more portable. Just like BBC Basic, C allows you to drop to assembly when you really need to using the asm() keyword.
    There are many C compilers for PIC, AVR, MSP430 etc. including several free or open source versions.
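
The ready-flag handshake outlined in the comment above might look like this in C. It is a minimal sketch: the names `isr_offer` and `foreground_poll` are hypothetical, and the queueing that the ISR should do when the foreground is not ready is left to the caller.

```c
#include <stdint.h>
#include <stdbool.h>

volatile uint8_t john = 0;
volatile bool ready = true;   /* true: foreground is ready for new data */

/* ISR side: clears ready, never sets it.
   Returns false if the foreground has not caught up yet,
   in which case the caller should queue the value and retry. */
bool isr_offer(uint8_t value)
{
    if (!ready) {
        return false;
    }
    john = value;
    ready = false;            /* signal: new data waiting in John */
    return true;
}

/* Foreground side: sets ready, never clears it.
   Returns true and fills *out when new data was waiting. */
bool foreground_poll(uint8_t *out)
{
    if (ready) {
        return false;         /* nothing new */
    }
    *out = john;
    ready = true;             /* signal: John may be overwritten again */
    return true;
}
```

Unlike a scheme that treats 0 in John as "empty", the separate flag lets 0 be passed as an ordinary data value.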
