Meeting and talking with customers about their register needs and requirements is one of my favorite things to do.  Not too long ago I met with an embedded firmware guru over coffee and we discussed various topics about registers and register management tooling.  We discussed registers in the context of multi-processor system on chips (MPSoCs) with a lot of different interconnect channels or bridges and buses. In such a system, where there is concurrent software that may access the same registers, we discussed the pains of read-modify-write and how to reduce the need for such operations.  Some of our conclusions on this are discussed in this posting, which was spurred by one Gary Strigham’s recent Embedded Bridge issues.  Gary has hinted that his upcoming issue will discuss “an atomic register design that provides robust support for concurrent driver access to common registers” and I’m curious to see how his techniques compare to the ones discussed herein.

What is  a register read-modify-write operation?

Firmware often needs to modify only one single bit or bit-field of an addressable register without modifying other bits in the register.  This requires the firmware to read the register, change the desired bit(s), then write the register back.

Problem 1: Atomicity for register read-modify-write operation

In a system that has concurrent software accessing the registers, read-modify-writes are a real concern.  Without proper inter-process synchronization (using mutexes or some other form of guaranteed atomicity) there is a danger of race conditions.  These dangers are well described in Issue #37 of Gary Stringham’s Embedded Bridge.

Problem 2: Latency of read-modify-write operations

In a complex MPSoC, with a complex interconnect fabric, register read operations can be painfully slow when compared to pure register write operations.  System performance can be greatly improved by reducing the number of read operations required.

How to trade register read-modify-write transactions with a single register write

The trick to replacing the need for read-modify-write of a register with a register write operation is to create additional CLEAR and SET write-only registers.

Consider the following FLAGS register as an example which uses 8 bit registers for simplicity sake.

reg name bit 7 bit 6 bit 5 bit 4 bit 3 bit 2 bit 1 bit 0
FLAGS flag_a flag_b flag_c flag_d flag_e flag_f flag_g flag_h

Without any supporting CLEAR/SET registers, to modify flag_f would require a read-modify-write operation. However, when we add the supporting CLEAR and SET write-only registers and related RTL logic then each bit can be set or cleared independently.  The flag_f bit can be set by writing 0×04 to FLAGS_SET and cleared by writing ox04 to the FLAG_CLEAR register.  The following table shows how the three related registers would look.

reg name bit 7 bit 6 bit 5 bit 4 bit 3 bit 2 bit 1 bit 0
FLAGS flag_a flag_b flag_c flag_d flag_e flag_f flag_g flag_h
FLAGS_CLEAR clear_a clear_b clear_c clear_d clear_e clear_f clear_g clear_h
FLAGS_SET set_a set_b set_c set_d set_e set_f set_g set_h

Here the complexities of read-modify-write in a concurrent software environment are traded for additional complexity within the Verilog or VHDL RTL code, and the related verification code. A register management tool like SpectaReg can be setup to allow easy specification of such register patterns in a graphical environment and auto-generation of the related RTL, verification code, and the firmware abstraction of the bits. With such a register management tool, the related work is greatly simplified when compared to the tedious and error-prone process of doing it manually.  An additional advantage relates to architectural exploration - with an automated path between the specification and generation it’s much easier to switch between the two techniques: i) a single read/write register requiring read-modify-write, and ii) a read only register with supporting SET and CLEAR registers.

In addition to register read-modify-writes discussed at the coffee talk with the firmware guru, we also chatted about how specialized hardware registers offer valuable diagnostics, performance profiling, optimization and debugging of MPSoC systems - perhaps a good topic for a future posting so stay tuned.

Lastly, if you have any thoughts to contribute, be sure to leave a comment.

While on an ocean side walk, I daydreamed of being struck by a great IP/subsystem idea with potential for royalty licensing.  I imagined organizing a team and jumping into action, developing the RTL logic, and processor integration.  Should I choose Virtual and/or FPGA prototyping?

English Bay, Oceanside

ocean side walk

Virtual Platform

It’s all about getting the software working as soon as possible.  The Google Android Emulator is an excellent example of how Google was able to get the software working without requiring developers to possess the device hardware.  The Android Emulator is described as a “mobile device emulator — a virtual mobile device that runs on your computer.”  Android abstracts the hardware, ARM processor, and Linux kernel with an Eclipse based Java framework, targeting Android’s Dalvik Virtual Machine, a register-based architecture that’s more memory efficient than the Java VM.

Virtual Prototyping

Clearly the virtual Android emulator/platform makes software development easier. Similarly, a virtual prototype makes abstract product validation easier. With a virtual prototype, the developers can explore different algorithms and architectures across the hardware and software abstractions.  Things like instruction set, on-chip interconnect, acceleration, memory, interrupt, and caching architecture can be explored by digital designers and firmware ahead of the VHDL and Verilog solidification.  Still, despite any effort spent on virtual prototyping, physical validation is essential before offering an IP for license.  FPGA prototyping is a good choice for all but the most complex and highest performance IP.

FPGA Prototyping

Those who know firmware and RTL coding, should have no problem getting basic examples running on an FPGA dev kit in the first day or two.  Those with experience validating a chip in the lab understand that so much really comes down to using embedded software to drive the tests.  Today’s Embedded System FPGA kits provide on-chip processors and the whole development environment.  Some available embedded FPGA processor options are listed in the following table:

Vendor/Processor On-chip interfaces Tools
Altera NIOS II

(soft core)

Avalon SOPC Builder

Quartus II

NIOS II IDE, with Eclipse CDT &

GNU compiler/debugger

Xilinx PowerPC

(hard core)

CoreConnect PLB & OPB Embedded Development Kit (EDK), with Eclipse CDT GNU compiler/debugger

Xilinx Platform Studio (XPS)


Xilinx MicroBlaze

(soft core)

CoreConnect PLB & OPB Embedded Development Kit (EDK), with Eclipse CDT GNU compiler/debugger

Xilinx Platform Studio (XPS)


Actel ARM

(soft core)

AMBA AHB & APB Libero IDE, CoreConsole, SoftConsole (Eclipse, GNU Compiler/debugger)

Short of having deep pockets for an ASIC flow, or a platform provider lined up to license the IP and spin prototype silicon, FPGA prototyping makes good sense.  Virtual platforms make sense for reaching software developers, so they can interface with the IP as soon as possible.  One problem with providing models for developing software before the hardware is complete is that it’s difficult to keep the different perspectives aligned as the design evolves.  Tools like SpectaReg that model hardware/software interfacing and auto-generate dependent code keep the different stakeholders aligned, resulting in quicker time to revenue for the IP.

So back to my oceanside daydream… how would I get the IP to market and start making money?  It depends on the target market.  If the end user targets embedded system FPGAs then it’s a no brainer - go straight to FPGA prototyping and don’t worry about virtual platform.  If the target market is mobile silicon platforms, then virtual prototyping/platforming makes sense.  Having validated silicon in hand is ideal but impractical in many circumstances. FPGA prototyping is pretty darn compelling when you consider the speedy turnaround times, and the low startup costs.