Explain why side-effects must be as small as possible (in terms of code size).
In principle, a side-effect should contain only the code that actually performs the side-effect. In practice, especially when existing libraries are used, the side-effecting part often carries additional logic with it. That logic can then only be tested together with the side-effect, probably by using mocks. Since side-effects are more difficult to test than pure code, they should be stripped to the bare minimum.
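As an illustration, here is a minimal sketch of that separation. The settings format and the names `parse_settings`, `read_text`, and `load_settings` are hypothetical and chosen only for this example. All parsing and validation lives in a pure function, and the side-effect shrinks to a single file read.

```python
import json
from pathlib import Path


def parse_settings(text: str) -> dict:
    """Pure code: turn raw JSON text into validated settings (hypothetical format)."""
    data = json.loads(text)
    port = int(data.get("port", 8080))
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return {"host": data.get("host", "localhost"), "port": port}


def read_text(path: Path) -> str:
    """Side-effect: the only function that touches the file system."""
    return path.read_text(encoding="utf-8")


def load_settings(path: Path) -> dict:
    """Thin glue: combine the minimal side-effect with the pure logic."""
    return parse_settings(read_text(path))
```

With this split, `parse_settings` can be tested with plain strings and no mocks. Only `read_text`, which contains no logic of its own, needs a real or fake file.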
What is the advantage of a protocol implemented without I/O?
A protocol implemented without I/O can be written entirely as pure code. As a consequence, such an implementation is easier to test, and it can be reused on top of transport layers other than the one originally intended.
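The idea can be sketched with a toy protocol, assuming newline-delimited messages. The class `LineProtocol` and its methods are hypothetical and not taken from any particular library. The protocol object only consumes and produces bytes; whoever owns the socket, pipe, or file decides how those bytes travel.

```python
class LineProtocol:
    """Newline-delimited message framing that performs no I/O itself."""

    def __init__(self) -> None:
        # Bytes received so far that do not yet form a complete message.
        self._buffer = bytearray()

    def receive_data(self, data: bytes) -> list[bytes]:
        """Feed raw bytes in; return every message completed by them."""
        self._buffer += data
        # The last element is always the (possibly empty) unfinished tail.
        *complete, rest = bytes(self._buffer).split(b"\n")
        self._buffer = bytearray(rest)
        return complete

    @staticmethod
    def encode(message: bytes) -> bytes:
        """Return the bytes that the caller should send for one message."""
        return message + b"\n"
```

A test can feed byte fragments directly, for example `receive_data(b"hel")` followed by `receive_data(b"lo\n")`, without opening a connection. The same class could sit on top of a TCP socket, a serial line, or an in-memory queue.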
What is the purpose of splitting logging and monitoring from features?
A feature, logging things that happen when providing this feature, and monitoring...