As you already know, the hardirq code is meant to do only the bare minimum of setup and interrupt handling, leaving the bulk of the interrupt processing to be performed safely via the deferred functionality mechanisms we've been talking about: the tasklet and/or softirq. This 'bottom half' and other deferred work is carried out in priority order: first the kernel timer softirq, then tasklets (both of these are just special cases of the underlying softirq mechanism), then threaded interrupts, and finally workqueues (the latter two run in underlying kernel threads).
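To make the ordering concrete, here is a minimal user-space sketch in C. It is not kernel code. It only models the idea that the hardirq marks work as pending, and that the pending work then runs in the priority order just described. Every name in it (`deferred_kind`, `model_hardirq()`, `model_run_deferred()`, and so on) is hypothetical and exists only for this illustration. Treating threaded interrupts and workqueues as entries in the same pending mask is a deliberate simplification, since in the real kernel they are kernel threads run by the scheduler.

```c
#include <stdio.h>

/* Deferred-work classes, listed in the priority order the text describes.
 * These are made-up labels for this model, not kernel symbols. */
enum deferred_kind {
    DW_TIMER_SOFTIRQ,   /* kernel timer softirq: highest priority */
    DW_TASKLET,         /* tasklets, also built on softirqs */
    DW_THREADED_IRQ,    /* threaded interrupt handlers (kernel threads) */
    DW_WORKQUEUE,       /* workqueues (kernel worker threads): lowest */
    DW_NR
};

const char *const dw_name[DW_NR] = {
    "timer softirq", "tasklet", "threaded irq", "workqueue"
};

/* Bit k set means work of kind k is waiting to run. */
unsigned int dw_pending;

/* Model of the hardirq ('top half'): do the bare minimum, which here
 * means only flagging that deferred work of the given kind is needed. */
void model_hardirq(enum deferred_kind kind)
{
    dw_pending |= 1u << kind;
}

/* Model of the deferred phase: consume everything pending, highest
 * priority first. Stores the run order in 'order', returns the count. */
int model_run_deferred(enum deferred_kind order[DW_NR])
{
    int n = 0;

    for (int k = 0; k < DW_NR; k++) {
        if (dw_pending & (1u << k)) {
            dw_pending &= ~(1u << k);   /* clear before 'running' it */
            order[n++] = (enum deferred_kind)k;
        }
    }
    return n;
}

/* Convenience helper: run the deferred phase and print what ran. */
void model_print_deferred(void)
{
    enum deferred_kind order[DW_NR];
    int n = model_run_deferred(order);

    for (int i = 0; i < n; i++)
        printf("%d: %s\n", i + 1, dw_name[order[i]]);
}
```

If you call `model_hardirq()` for a workqueue item, then a tasklet, then a timer softirq, `model_run_deferred()` still hands them back as timer softirq, tasklet, workqueue. The order in which the 'interrupts' arrived does not matter; only the priority of the deferred mechanism does.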
So, the big question is: when you're writing your driver, which of these should you use? Should you use a deferred mechanism at all? It really depends on how long your complete interrupt processing takes. If your complete interrupt...