LLMs for Data Science
This chapter is about how generative AI can automate data science. Generative AI, in particular large language models (LLMs), has the potential to accelerate scientific progress across many domains, especially by analyzing research data efficiently and by assisting with literature reviews. Many current approaches in the domain of Automated Machine Learning (AutoML) can help data scientists increase their productivity and make data science processes more repeatable. In this chapter, we’ll first discuss how generative AI affects data science and then give an overview of automation in data science.
Next, we’ll discuss how we can use code generation and tools in diverse ways to answer questions related to data science, for example by running a simulation or by enriching a dataset with additional information. Finally, we’ll shift the focus to the exploratory analysis of structured datasets. We can set up agents...