Exploring models we cannot access
The visual interfaces explored in this chapter are fascinating. However, there is still a lot of work to do!
For example, OpenAI’s GPT-3 model is only available online or through an API, so we cannot access its weights. The same is true of other Software as a Service (SaaS) transformer models. This trend will likely grow in the years to come. Corporations that spend millions of dollars on research and computing power will tend to provide pay-as-you-go services, not open-source applications.
Even if we had access to the source code or output weights of a GPT-3 model, using a visual interface to analyze the 9,216 attention heads (96 layers x 96 heads) would be quite challenging.
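To get a feel for the scale, we can sketch the arithmetic in a few lines of Python. The layer and head counts are the ones quoted above. The function name `attention_inspection_load` and the 10-token sequence length are hypothetical choices made only for this illustration; each head produces one attention score per pair of tokens, so the number of values to look at grows with the square of the sequence length.

```python
# Rough count of what a head-by-head visual inspection would involve.
# num_layers and heads_per_layer are the figures quoted in the text;
# seq_len is an illustrative assumption, not a property of any real deployment.

def attention_inspection_load(num_layers: int, heads_per_layer: int, seq_len: int) -> dict:
    """Return the number of heads and attention scores for one input sequence."""
    total_heads = num_layers * heads_per_layer
    # Each head yields a seq_len x seq_len matrix of attention scores.
    scores_per_head = seq_len * seq_len
    return {
        "total_heads": total_heads,
        "scores_per_head": scores_per_head,
        "total_scores": total_heads * scores_per_head,
    }

load = attention_inspection_load(num_layers=96, heads_per_layer=96, seq_len=10)
print(load)
```

Even for a short 10-token sentence, this gives 9,216 separate attention maps, far more than anyone could inspect one by one in a visual interface.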
Finding what is wrong will still require some human involvement in many cases.
For example, the polysemy issue of the word coach in English-to-French translation often represents a problem. In English, a coach can be a person who trains people, or a bus. The word coach exists...