1.3 The Computing Elements of Real-Time Systems
1.3.1 Overview
In real-time systems, computing elements are destined for use in either general-purpose or specialized applications (Figure 1.17):
To use these effectively, the software designer should have a good understanding of their features. After all, what might be an excellent design solution for one application might be ghastly (or even unusable) in others.
1.3.2 General-Purpose Microprocessors
General-purpose microprocessors were originally the core building blocks of microcomputer systems. Although they are far less common nowadays, they form a good starting point for this topic.
By itself, the processor is only one element within the microprocessor system. To turn it into a computing machine, certain essential elements need to be added (Figure 1.18):
The program code itself is stored in memory, which, for embedded systems, must be retained on power down. That is, the memory must be "non-volatile." Older designs typically used ultraviolet-erasable programmable ROM (EPROM). The drawback to this device is that (normally) it must be removed from the computer for erasure and reprogramming. However, where in-circuit reprogramming is required, code is located in electrically erasable/programmable non-volatile storage, the alternatives being:
- Electrically erasable programmable ROM (EEPROM)
- Flash memory (a particular type of EEPROM technology)
- Ferroelectric random-access memory (FRAM)
Flash memory has, to a large extent, replaced EPROM in new designs.
When large production quantities are concerned, two approaches may be used:
- Mask-programmable devices
- One-time programmable ROM (OTPROM)
In the first case, the program is set in memory by the chip manufacturer; as such, it is unalterable. The second method is essentially an EPROM device without a light window. Nowadays, this market sector usually uses single-chip microcomputers rather than general-purpose ones.
All data that is subject to regular change is located in read/write random-access memory (RAM). The name is a confusing one, as the locations of most memory devices, ROM included, can be accessed randomly. This data includes program variables, stack data, process descriptors, and dynamic data items.
The final element is the address decoder unit. Its function is to identify the element being accessed by the processor.
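The decoder's job can be illustrated with a minimal C sketch. It assumes a hypothetical 16-bit address space split into four equal regions selected by the top two address bits; the memory map, the region names, and the function `decode_address` are illustrative assumptions, not details of any particular system.

```c
#include <stdint.h>

/* Hypothetical device selects for an illustrative 16-bit memory map. */
typedef enum { DEV_NONE, DEV_ROM, DEV_RAM, DEV_IO } device_t;

/* Identify the element being accessed from the top two address bits,
   much as a simple hardware decoder would using a few gates. */
device_t decode_address(uint16_t addr)
{
    switch (addr >> 14) {          /* top 2 bits of the 16-bit address */
    case 0:  return DEV_ROM;       /* 0x0000-0x3FFF: program code      */
    case 1:  return DEV_RAM;       /* 0x4000-0x7FFF: read/write data   */
    case 2:  return DEV_IO;        /* 0x8000-0xBFFF: peripherals       */
    default: return DEV_NONE;      /* 0xC000-0xFFFF: unpopulated       */
    }
}
```

With this assumed map, an access to address 0x4000 selects the RAM, while 0xFFFF selects nothing at all.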
Taken together, these items form the heart of the microcomputer. However, to make it usable in real-time applications, extra elements need to be added. The key items are:
- Interrupt controllers
- Real-time clocks
- Hardware timers
- Watchdog timers
- Serial communication controllers
Items that should also be considered at the design stage include:
- Direct memory access (DMA) controllers
- I/O peripheral controllers (only where a large volume of data transfer is required)
These may be essential in some systems but not in others. Each of the items listed above is now described in turn:
- Interrupt controllers:
As pointed out earlier, real-time systems must support both periodic and aperiodic tasks. In most designs, "guaranteed" response times are obtained by using interrupts.
- Real-time clock:
The function of the real-time clock is to provide a highly accurate record of elapsed time. It is normally used in conjunction with an interrupt function. Real-time clocks shouldn't be confused with calendar clocks (although they may be used for calendar functions). When an operating system is incorporated within the software, the clock acts as the basic timing element (the "tick").
- Hardware timers:
Accurate timing, especially that involving long time periods, cannot normally be done in software. Without the timing support of the tick in an operating system, hardware timers have to be used. Even when an operating system is used, these timers provide great flexibility. Generally, these are software programmable (Figure 1.19), both in terms of timing and modes of operation (for example, square-wave generation, "one-shot" pulse outputs, and retriggerable operation):
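As a minimal sketch of what "software programmable" timing involves, the following C function works out the count that would be loaded into a timer to produce a required period. The 16-bit counter width, the simple divide-by-N prescaler model, and the name `timer_reload` are assumptions made for illustration; real devices differ in detail.

```c
#include <stdint.h>

/* Compute the count for a hypothetical 16-bit timer.
   clock_hz  : timer input clock in hertz
   prescaler : divide ratio applied to the clock (must be non-zero)
   period_us : required period in microseconds
   Returns 0 if the period cannot be represented in 16 bits. */
uint16_t timer_reload(uint32_t clock_hz, uint32_t prescaler, uint32_t period_us)
{
    if (prescaler == 0) {
        return 0;
    }
    /* 64-bit arithmetic so that the intermediate product cannot overflow. */
    uint64_t ticks = ((uint64_t)clock_hz * period_us)
                   / ((uint64_t)prescaler * 1000000u);
    if (ticks == 0 || ticks > UINT16_MAX) {
        return 0;
    }
    return (uint16_t)ticks;
}
```

For example, with an 8 MHz clock and a prescaler of 8, a 10 ms period needs a count of 10,000. Without the prescaler, the same period would need 80,000 counts, which does not fit in the assumed 16-bit counter.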
- Watchdog timers:
The purpose of the watchdog timer is to act as the last line of defense against program malfunction. It normally consists of a retriggerable monostable or one-shot timer, activated by a program write command (Figure 1.20). Each time the timer is signaled, it is retriggered, with the output staying in the "normal" state. If, for any reason, it isn't retriggered, then a time-out occurs and the output goes into the alarm condition. The usual course of action is then to generate a non-maskable interrupt (NMI), so setting a recovery program into action. In some instances, external warnings are also produced. In others, especially digital control systems, warnings are produced and the controller is then isolated from the controlled process.
Address decoding of the watchdog timer is, for critical systems, performed over all bits of the address. In these circumstances, the address is a unique one; hence retriggering by accident is virtually eliminated.
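The retriggering behavior of a watchdog can be modeled in a few lines of C. This is a software simulation only, written under assumed names (`watchdog_t`, `wdt_kick`, `wdt_tick`); in a real system the countdown is done by hardware and the "kick" is the program write to the watchdog's address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of a retriggerable one-shot watchdog (hypothetical). */
typedef struct {
    uint32_t timeout_ticks;  /* value reloaded on every retrigger     */
    uint32_t remaining;      /* ticks left before a time-out          */
    bool     alarm;          /* latched once a time-out has occurred  */
} watchdog_t;

void wdt_init(watchdog_t *w, uint32_t timeout_ticks)
{
    w->timeout_ticks = timeout_ticks;
    w->remaining     = timeout_ticks;
    w->alarm         = false;
}

/* Models the program write command: restart the timing period.
   Once the alarm is latched, retriggering has no effect. */
void wdt_kick(watchdog_t *w)
{
    if (!w->alarm) {
        w->remaining = w->timeout_ticks;
    }
}

/* Models one period of the watchdog's own clock. */
void wdt_tick(watchdog_t *w)
{
    if (w->alarm) {
        return;
    }
    if (w->remaining > 0) {
        w->remaining--;
    }
    if (w->remaining == 0) {
        w->alarm = true;   /* real hardware would raise an NMI here */
    }
}
```

A healthy program calls `wdt_kick` more often than once per time-out period, so `alarm` never becomes true. If the program hangs, the kicks stop and the alarm is raised after `timeout_ticks` clock periods.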
- Serial communication controllers:
Serial communication facilities are integral parts of many modern embedded systems. However, even where they aren't needed by the application, it is worthwhile designing in a USB and/or an RS232-compatible communication channel. These can be used as major aids in the development and debugging of the application software.
- DMA controllers:
The DMA controller (Figure 1.21) is used where data has to be moved about quickly and/or in large amounts (data rates can exceed 1 gigabyte/sec):
DMA techniques are widely used in conjunction with bulk memory storage devices such as hard disks and compact disks. For many real-time systems, they are frequently used where high-speed serial communication links have to be supported.
In normal circumstances (that is, the "normal" mode of operation; see Figure 1.21 (a)), the controller acts just like any other slave device, being controlled by the processor. However, when a DMA request is generated by a peripheral device, control is taken over by the DMA controller (Figure 1.21 (b)). In this case, the micro is electrically disconnected from the rest of the system. Precise details of data transfer operations are usually programmed into the controller by the micro.
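The two-step pattern described here (the processor programs the transfer details, then the controller carries out the transfer when requested) can be sketched in C. This is a simulation under assumed names (`dma_channel_t`, `dma_program`, `dma_service_request`); a real controller is programmed through device-specific registers and moves the data in hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one DMA channel's programmed state. */
typedef struct {
    const uint8_t *src;    /* where the data comes from           */
    uint8_t       *dst;    /* where the data goes to              */
    size_t         count;  /* bytes still to be transferred       */
    bool           busy;   /* true while the controller owns bus  */
} dma_channel_t;

/* Slave mode: the processor writes the transfer details. */
void dma_program(dma_channel_t *ch, const uint8_t *src,
                 uint8_t *dst, size_t count)
{
    ch->src   = src;
    ch->dst   = dst;
    ch->count = count;
    ch->busy  = false;
}

/* Stand-in for the hardware response to a peripheral's DMA request:
   the controller takes over and moves the whole block.
   Returns the number of bytes moved. */
size_t dma_service_request(dma_channel_t *ch)
{
    size_t moved = 0;
    ch->busy = true;
    while (ch->count > 0) {
        *ch->dst++ = *ch->src++;
        ch->count--;
        moved++;
    }
    ch->busy = false;
    return moved;
}
```

In this model, the processor does no per-byte work after calling `dma_program`; that is the point of using DMA.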
- I/O peripherals:
I/O peripherals are used either as controllers or as interfacing devices. When used as a controller, their function is to offload routine I/O processing, control, and high-speed transfer work from the processor itself (Figure 1.22):
One of the most common uses of such devices is to handle high-speed, large-volume data transfers to and from hard disks. They are especially useful in dealing with replicated memory storage units, as with redundant array of independent disks (RAID) technology. Other applications include intelligent bus, network, and communications interfacing.
The I/O controller's basic operation is similar to that of a DMA controller, but with two major differences. First, it can work cooperatively with the processor, using system resources when the processor is busy. Second, I/O processors are much more powerful than DMA devices. For example, the Intel i960 IOP includes (among other items) a high-speed parallel bus bridge, a specialized serial bus interface, internal DMA controllers, and a performance monitoring unit.
In other applications, I/O devices are used to provide compact, simple, and low-cost interfaces between the processor and peripheral equipment (Figure 1.23). Input/output pins are user-programmable to set up the desired connections to such equipment. These interface chips function as slave devices to the processing unit.
1.3.3 Highly Integrated Microprocessors
Highly integrated processors are those that contain many of the standard elements of a microcomputer system on a single chip. A typical example is the NXP MPC8240-integrated processor (Figure 1.24):
A comparison of Figure 1.18 and Figure 1.24 shows just what can be achieved on one chip (the MPC8240, for example, reduces the chip count from eight to one). Naturally, such processors are more expensive than the basic general-purpose device. However, the integration of many devices onto one chip usually reduces the overall system cost. Moreover, it makes a major impact on board-packing densities, which also reduces manufacturing and test costs. In short, these are highly suited for use in embedded systems design.
1.3.4 Single-Chip Microcomputers
With modern technology, complete microcomputers can be implemented on a single chip, eliminating the need for external components. Using the single-chip solution reduces the following:
- Package count
- Size
- Overall costs
One widely used device of this type is the 8052 microcomputer; a Microchip variant is shown in Figure 1.25.
By now, all the on-chip devices will be familiar. Note that the interfacing to the outside world may be carried out through the I/O port subsystem. This is a highly flexible structure that, in smaller systems, minimizes the component count. However, with only 8 kBytes of ROM and 256 bytes of RAM, it is clearly intended for use in small systems (the memory size can, of course, be extended by using external devices):
1.3.5 Single-Chip Microcontrollers
Microcontrollers are derivatives of microcomputers but aimed specifically at the embedded control market (though the boundary between the two is becoming somewhat blurred). Like single-chip microcomputers, they are designed to provide all the necessary computing functions in a single package. Broadly speaking, there are two categories: general-purpose (sector-independent) and sector-specific. These differ only in the actual internal devices included on the chip. For sector-specific units, the on-chip devices are chosen to provide support specifically for that sector. In particular, they try to provide all required functionality on the chip, so minimizing the need for (extra) external hardware.
An example of such a device, aimed at automotive body electronic applications, is shown in Figure 1.26, the STMicroelectronics SPC560 series chip:
Like many modern microcontrollers, it contains an impressive set of functions:
(A) Memory:
- Up to 512 Kbytes Code Flash, with error-correcting code (ECC)
- 64 Kbytes Data Flash, with error-correcting code
- Up to 48 Kbytes SRAM, with error-correcting code
- Memory protection unit (MPU)
(B) Interrupts:
- Up to 24 external interrupts
(C) GPIO:
- Between 45 and 123, depending on the IC package type
(D) Timers:
- 6-channel periodic interrupt timers
- 4-channel system timer module
- Software watchdog timer
- Real-time clock timer
(E) I/O:
- Up to 56 channels counter-time-triggered I/Os
(F) Communications Interface:
- Up to 6 CAN network interfaces
- 4 LIN network interfaces
- Others: Serial Peripheral (SPI) and I2C Interfaces
(G) ADC:
- Up to 36 channel 10-bit ADC
You might have noticed that the diagram of Figure 1.26 doesn't show any connections to the outside world. There is a simple reason for this. Although the device has many, many functions, not all of these may be accessed simultaneously. In practice, what you can actually use at any one time is limited by the package pin count. The SPC560 series, for example, comes in a number of chip sizes, including 64, 100, and 144 pin types. In many cases, a number of functions may be provided on individual pins, being accessed as a shared (multiplexed) item. Clearly, such functions are available only in a mutually exclusive way.
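The mutually exclusive nature of shared pins can be shown with a small C model. The four-pin "package," the function names, and the `pinmux_claim`/`pinmux_release` interface are hypothetical and chosen only for illustration; on a real microcontroller, the selection is made through device-specific configuration registers.

```c
#include <stdbool.h>

#define NUM_PINS 4   /* hypothetical, deliberately tiny package */

/* Functions that might share a pin (illustrative only). */
typedef enum { FN_NONE = 0, FN_GPIO, FN_UART_TX, FN_ADC_IN } pin_fn_t;

typedef struct {
    pin_fn_t owner[NUM_PINS];   /* function currently routed to each pin */
} pinmux_t;

/* Route a function to a pin. Fails if the pin does not exist or is
   already in use: only one function can own a pin at any one time. */
bool pinmux_claim(pinmux_t *m, unsigned pin, pin_fn_t fn)
{
    if (pin >= NUM_PINS || fn == FN_NONE) {
        return false;
    }
    if (m->owner[pin] != FN_NONE) {
        return false;
    }
    m->owner[pin] = fn;
    return true;
}

/* Free a pin so that another function can use it. */
void pinmux_release(pinmux_t *m, unsigned pin)
{
    if (pin < NUM_PINS) {
        m->owner[pin] = FN_NONE;
    }
}
```

Here, once pin 1 has been claimed for the UART transmit function, an attempt to use it as an ADC input is refused until the pin is released.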
1.3.6 Digital Signal Processors
There are numerous applications that need to process analog signals very quickly. These include instrumentation, speech processing, telecommunications, radar, sonar, and control systems. In the past, such processing was done using analog techniques. However, because of the disadvantages of analog processors (filters), designers have, where possible, moved to digital techniques. Central to this is the use of digital filtering calculations, typified by sets of multiply and add (accumulate) instructions (the so-called "sum of products" computation). The important characteristics of such systems are that they:
- Have extremely high throughputs
- Are optimized for numerical operations
- Employ a small number of repetitive numerical calculations
- Are usually low-cost
These needs have, for some time now, been met by a device specialized for such work: the digital signal processor (DSP).
To achieve high processing speeds, the basic computing engine is organized around a high-speed multiplier/accumulator combination (Figure 1.27). In these designs, the von Neumann structure is replaced by the Harvard architecture, which has separate paths for instructions and data. The system form shown in Figure 1.27 is fairly typical of DSPs. Obviously, specific details vary from processor to processor; you can refer to http://www.ti.com/processors/dsp/overview.html for further information.
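The "sum of products" computation at the heart of digital filtering is short enough to write out in C. The sketch below computes one output of a finite impulse response (FIR) filter using 16-bit data and a wide accumulator; the function name `fir_output` and the choice of word lengths are assumptions for illustration. A DSP would typically perform each loop iteration as a single multiply-accumulate instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* One filter output: acc = sum over k of coeff[k] * samples[k].
   The accumulator is wider than the data so that a long run of
   products can be summed without overflowing. */
int64_t fir_output(const int16_t *coeff, const int16_t *samples, size_t taps)
{
    int64_t acc = 0;
    for (size_t k = 0; k < taps; k++) {
        acc += (int32_t)coeff[k] * (int32_t)samples[k];  /* multiply-accumulate */
    }
    return acc;
}
```

For coefficients {1, 2, 3} and samples {4, 5, 6}, the result is 1×4 + 2×5 + 3×6 = 32.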
Programming DSPs is a demanding task, especially working at an assembly language level. The instruction sets are carefully chosen to perform fast, efficient, and effective arithmetic. Among those instructions are ones that invoke complex multipurpose operations. Added to this is the need to produce compact and efficient code if the whole program is to fit into the on-chip ROM. And finally, there is the need to handle extensive fixed-point computations without running into overflow problems.
It used to be said that in fixed-point DSP programming, "90% of the effort goes into worrying about where the decimal point is." Fortunately, this is much less of a problem nowadays as word lengths of 32 or 64 bits are commonplace.
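To show what keeping track of the point involves, here is a minimal C sketch of arithmetic in the widely used Q15 format, where a 16-bit integer n represents the fraction n/32768. The type and function names (`q15_t`, `q15_mul`, `q15_add`) are assumptions for illustration. Results that would overflow are saturated, that is, clamped to the largest or smallest representable value instead of being allowed to wrap around.

```c
#include <stdint.h>

typedef int16_t q15_t;   /* value represented = n / 32768 */

/* Clamp a 32-bit intermediate result into the 16-bit Q15 range. */
static q15_t q15_saturate(int32_t v)
{
    if (v > INT16_MAX) { return INT16_MAX; }
    if (v < INT16_MIN) { return INT16_MIN; }
    return (q15_t)v;
}

/* Multiply: Q15 x Q15 gives a Q30 product, so shift right by 15 to
   put the point back. Adding 1 << 14 first rounds to nearest.
   Note: this relies on >> of a negative value being an arithmetic
   shift, which is the behavior of mainstream compilers. */
q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    return q15_saturate((product + (1 << 14)) >> 15);
}

/* Add with saturation. */
q15_t q15_add(q15_t a, q15_t b)
{
    return q15_saturate((int32_t)a + (int32_t)b);
}
```

In this format, 16384 represents 0.5, so `q15_mul(16384, 16384)` gives 8192, which represents 0.25. Adding 30000 and 10000 saturates at 32767 rather than wrapping to a negative number.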
A final point: the classical DSP processor is being challenged by "conventional" processors that include DSP instructions. One such example is the ARM NEON SIMD (single-instruction multiple-data) architecture extension for their Cortex series processors.
You can find out more at https://developer.arm.com/technologies/dsp.
1.3.7 Mixed-Signal Processors
Mixed-signal processors, as their name suggests, are designed to interface simultaneously to analog and digital components. The Texas MSP430, for example (Figure 1.28, the G2 variant), is aimed at battery-powered applications (such as multimeters and intelligent sensing) where low power consumption is paramount. Quoted figures (typical) for power requirements are:
- Active: 230 μA at 1 MHz, 2.2 volts
- Standby: 0.5 μA
- Off mode (RAM retention): 0.1 μA
Work like this could be done by a standard microcontroller, but this is a relatively costly solution. Hence, mixed-signal processors are optimized for use in low-cost, high-volume products.
1.3.8 System-On-Chip Designs – Overview
A system-on-chip (SOC) device is an integrated circuit (IC) that integrates all components of a microcomputer or microcontroller (or other electronic system) into a single chip. Thus the Microchip AT89C52 (Figure 1.25) and the MSP430 (Figure 1.28) are, in fact, SOC devices. ICs like these have capabilities designed for general use within specific sectors (for example, the 89C52 for embedded controller applications and the MSP430 for metering systems). Their place in the SOC technology range is shown in Figure 1.29 (please note that this is a simplified view of the topic, being limited to the more important device types):
One of the key aspects here is that their hardware functionality is fixed by the manufacturer; it cannot be changed by the user. What these devices also have in common is that the processors themselves are company-specific. However, in the past few years, there has been a major trend for chip designers to buy in the processor designs. Such components are usually called virtual components, virtual cores (VCs), or intellectual property (IP) cores. Probably the most important company in this area is ARM (especially in the 32- and 64-bit field); you'll find their "products" incorporated in many, many SOC devices.
One drawback to using general-purpose SOC ICs is that designs may have to be extensively tailored to meet specific application needs. Other factors may also be important, for example, power consumption, temperature range, radiation hardness, and so on. Such needs can be met by using bespoke SOC devices, these essentially being specialized single-chip application-specific designs.
Now, application-specific integrated circuit (ASIC) technology is not new. Electronic engineers have been using it for many years to produce specialized devices (for example, ICs for advanced signal processing, image stabilization, and digital filtering). Design is performed typically using computer-aided design (CAD) methods based on VHSIC hardware description language (VHDL) programming. SOC design methods are fundamentally the same but the implementations are much more complex. In particular, they incorporate microprocessor(s) and memory devices to form full on-chip microcomputers or microcontrollers. Applications include digital cameras, wireless communication, engine management, specialized peripherals, and complex signal processing. Typical of this technology is the Snapdragon SOC suite from Qualcomm Inc, intended for use in mobile devices (https://www.qualcomm.com/products/snapdragon).
Figure 1.30 is a representative structure of an SOC unit; though, by definition, there are many variations of such structures:
The subsystems shown here fall into two groups, custom and others. Anything designed by the chip designer is labeled custom; the others represent bought-in items (for example, a microprocessor, RAM, or ROM). Of course, because this is chip fabrication, we cannot plug such items into the chip. What happens typically is that VHDL descriptions of the components are used within the overall design process. The end result of the design process is a manufacturing file that is then sent to a chip manufacturer.
1.3.9 Programmable SOCs – FPGA-Embedded Processors
The customized SOC technology described previously has some major drawbacks. First, having a specialized chip manufactured can be quite costly. Second, it isn't exactly a fast process; typically, the manufacturing process takes 6 to 8 weeks, done in highly specialized semiconductor fabrication plants. Third, modifying the design and producing a new chip is both costly and time-consuming. Fourth, it isn't suitable for low-volume product production because of the costs involved. Fortunately, for many applications, these obstacles can be overcome by using programmable SOC (PSOC) technology. And one of the most important devices here is the Field Programmable Gate Array (FPGA).
An FPGA is a general-purpose digital device that consists of sets of electronic building blocks. You, the designer, set the functionality of the FPGA by configuring these blocks, typically using VHDL design methods. This work requires specialized design knowledge and is normally done by hardware engineers. Software engineers haven't generally concerned themselves with the detailed aspects of FPGAs, treating them merely as peripheral devices. However, things have changed as a result of FPGA chip manufacturers embedding processor cores into their devices, giving us the FPGA-embedded processor. As a result, we have some very compelling reasons to go down the FPGA route, such as:
- Producing custom products at a reasonable cost
- Minimizing component count (especially important when size is constrained)
- Maximizing performance by being able to make trade-offs between software and hardware
An example of this technology is the Intel Nios processor (Figure 1.31). Its use in an application is shown in Figure 1.32, where it forms part of a Cyclone V FPGA:
Here, the overall functionality of the device is split between hardware and software. For devices like these, all programming may be done in C: standard C for software and SystemC (or its equivalent) for the hardware. Such an approach has two significant benefits:
- It's much easier to get hold of C programmers than VHDL designers.
- Algorithms, and so on, coded in software can be readily transferred to hardware (by making only minor changes to the C code and then recompiling using SystemC).
And remember, the device functionality can always be modified without us having to make actual physical hardware changes. Thus devices are not only programmable, they're also reprogrammable.
1.3.10 SOC Devices – Single and Multicore
A single-core device is defined to be a chip that contains just one CPU (thus all conventional microprocessors can be considered to be examples of single-core designs). However, SOC technology has given rise to devices that consist of two or more cores, called multicore chips. This structure is now a very important feature of high-performance microcontrollers; at present, the claimed record for the greatest number of cores on a single chip is 100 ARM CPUs, made by the company EZchip (www.tilera.com).
Figure 1.33 shows the makeup of a typical small multicore-based SOC integrated circuit. Here, the multicore processor consists of two CPUs together with key processor-related devices: interrupt management, memory, and timers. This processor is embedded within a single chip microcontroller that also includes various real-world devices:
From a hardware perspective, multicore processors come in two forms: symmetric and asymmetric. Essentially, with a symmetric multiprocessor design, all the processing units are identical; with asymmetric multiprocessors, the units differ.
An example of an embedded multicore symmetric multiprocessor is the Arm Cortex A9; Figure 1.34 gives a simplified description of its key features. This has four identical processing units (cores), each one consisting of a CPU, hardware accelerator, debug interface, and cache memory. It can be seen that several on-chip resources are shared by all the processing units. From a software perspective, the device can be used in two ways. First, each core can be allocated specific tasks, and hence is considered to be a dedicated resource. Second, any core can run any task, thus being treated as an anonymous resource:
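The difference between the dedicated and anonymous views of a core can be sketched in C. Both functions below are hypothetical illustrations, not part of any operating system API: the first looks up a fixed task-to-core allocation made at design time, and the second simply picks whichever core happens to be idle.

```c
#include <stdbool.h>

#define NUM_CORES 4   /* four identical cores, as in the example above */

/* Dedicated resource: each task has a core fixed at design time.
   core_of_task[i] holds the core allocated to task i.
   Returns -1 for an unknown task. */
int select_core_dedicated(const int *core_of_task, int num_tasks, int task)
{
    if (task < 0 || task >= num_tasks) {
        return -1;
    }
    return core_of_task[task];
}

/* Anonymous resource: any core can run any task, so take the first
   idle one. Returns -1 if every core is busy. */
int select_core_anonymous(const bool busy[NUM_CORES])
{
    for (int core = 0; core < NUM_CORES; core++) {
        if (!busy[core]) {
            return core;
        }
    }
    return -1;
}
```

With dedicated allocation, the answer for a given task never changes; with anonymous allocation, it depends on the state of the cores at the moment the task becomes ready to run.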
An example of an asymmetric multicore multiprocessor is the Texas TMS320DM6443 (Figure 1.35):
In this device, there are two distinct processing units, one for general-purpose computing and the other for DSP.
One final small point: cores are defined to be "hard" or "soft." Where the actual silicon of a processor is used as the core, it is said to be hard. But when the core is implemented using design file data, then it is a soft one.