Developing safety cages to prevent models from breaking the entire system
Because GenAI systems such as MLMs and AEs create new content, there is a risk that what they generate either breaks the entire software system or is unethical.
Therefore, software engineers often use the concept of a safety cage to shield the model from inappropriate input and the rest of the system from inappropriate output. For an MLM such as RoBERTa, this can be a simple wrapper that checks whether the generated content is problematic. Conceptually, this is illustrated in Figure 11.8:
Figure 11.8 – Safety-cage concept for MLMs
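The following is a minimal sketch of the safety-cage idea in Python. It assumes a generic model callable and two hypothetical check functions, which are placeholders rather than parts of any specific library; the point is only that the model is never called directly, so every prompt and every generated response passes through a check before it reaches the rest of the system:

```python
class SafetyCage:
    """Wraps a generative model with input and output checks (illustrative sketch)."""

    def __init__(self, model, input_check, output_check):
        self.model = model                # any callable mapping a prompt to generated text
        self.input_check = input_check    # returns True if the prompt is acceptable
        self.output_check = output_check  # returns True if the generated content is acceptable

    def generate(self, prompt: str) -> str:
        # Reject problematic prompts before they ever reach the model
        if not self.input_check(prompt):
            raise ValueError("Prompt rejected by the safety cage")
        candidate = self.model(prompt)
        # Reject problematic outputs before they reach the rest of the system
        if not self.output_check(candidate):
            raise ValueError("Generated content rejected by the safety cage")
        return candidate
```

The design choice here is that the cage fails closed: if either check does not pass, nothing is returned to the surrounding system, so a problematic generation cannot silently propagate.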
In the example of the wolfBERTa model, this means checking that the generated code does not contain cybersecurity vulnerabilities that could allow attackers to take over our system. In other words, every program generated by the wolfBERTa model should be scanned with a tool such as SonarQube or CodeSonar for cybersecurity vulnerabilities...
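As a rough illustration of such an output check, the sketch below uses a few regular-expression patterns as stand-ins for the kind of findings a real static-analysis tool such as SonarQube or CodeSonar would report; in practice, the generated program would be handed to the actual analyzer and accepted only if the scan comes back clean. The pattern list and function name are assumptions made for this example:

```python
import re

# Illustrative stand-ins for real static-analysis findings
SUSPICIOUS_PATTERNS = [
    r"\beval\s*\(",                            # arbitrary code execution
    r"\bexec\s*\(",                            # arbitrary code execution
    r"subprocess\.\w+\(.*shell\s*=\s*True",    # shell-injection risk
    r"pickle\.loads\s*\(",                     # unsafe deserialization
]

def code_is_safe(generated_code: str) -> bool:
    """Return True if none of the suspicious patterns appear in the generated code."""
    return not any(re.search(p, generated_code) for p in SUSPICIOUS_PATTERNS)
```

Such a function could then be plugged into the safety cage shown earlier as its output check, so that any generated program containing a flagged construct is rejected before it is executed or stored.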