Implementing multi-LLM model validation
We’ve learned how to use LLM-based APIs to accomplish different testing tasks, including unit-level testing and linting at the repository and CI-runner levels. Instead of relying only on Claude2 or any single engine, we can also bring in additional LLM models. Poe.com makes this fairly trivial: you simply swap the names of the “bots” and repeat the API call for the same detection. If you’re using a different platform, such as a mix of the OpenAI SDK and Google’s Vertex AI, you would have to write and adjust separate scripts and then place them in a runner. Even so, from a sequence-abstraction perspective, this remains straightforward for one or two models.
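As a rough sketch of that fan-out pattern, the loop below sends one detection prompt to several bots and collects the answers. The bot names are illustrative, and `query_bot` is a hypothetical placeholder for whatever client call your platform provides (a Poe bot call, an OpenAI SDK call, or a Vertex AI call); swap in the real request there.

```python
# Fan the same detection prompt out to several "bots" and collect answers.
# NOTE: bot names are illustrative; query_bot is a hypothetical stand-in
# for the real provider API call.

BOTS = ["Claude-instant", "GPT-4o", "Gemini-Pro"]  # assumed example names

def query_bot(bot: str, prompt: str) -> str:
    # Placeholder: a real implementation would call the provider's API here.
    return f"[{bot}] response to: {prompt}"

def run_detection(prompt: str, bots: list[str] = BOTS) -> dict[str, str]:
    """Send the same prompt to every bot and return {bot: raw answer}."""
    return {bot: query_bot(bot, prompt) for bot in bots}

if __name__ == "__main__":
    results = run_detection(
        "Rate the severity of this finding as low, medium, high, or unknown."
    )
    for bot, answer in results.items():
        print(bot, "->", answer)
```

Because the prompt is identical for every bot, adding another engine is just one more name in the `BOTS` list; the per-provider differences stay isolated inside `query_bot`.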
A more interesting use case is a voting scheme, in which every model returns the same severity labels (low, medium, high, and unknown) and each label is mapped to a fixed quantitative score, such as the following:
- Unknown = 0
- Low = 1
- Medium = 2
- High = 3
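A minimal sketch of that voting step is shown below. It assumes the mapping continues the progression above (medium = 2, high = 3) and combines the models' votes by simple averaging; other aggregations (majority label, maximum score) are equally valid design choices.

```python
# Voting sketch: map each model's severity label to a fixed score and
# average across models. Assumes medium = 2 and high = 3, continuing the
# progression given in the text.

SCORES = {"unknown": 0, "low": 1, "medium": 2, "high": 3}

def vote(labels: list[str]) -> float:
    """Average the fixed scores for the labels returned by each model."""
    if not labels:
        raise ValueError("need at least one model label")
    return sum(SCORES[label.lower()] for label in labels) / len(labels)

if __name__ == "__main__":
    # Three models rate the same finding:
    print(vote(["low", "medium", "high"]))  # -> 2.0
```

Averaging keeps the result on the same 0-3 scale, so a threshold (say, flag anything above 1.5) can be applied directly to the combined score.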