AI-Assisted Programming for Web and Machine Learning

You're reading from AI-Assisted Programming for Web and Machine Learning: Improve your development workflow with ChatGPT and GitHub Copilot

Product type: Paperback
Published in: Aug 2024
Publisher: Packt
ISBN-13: 9781835086056
Length: 602 pages
Edition: 1st Edition
Authors (5): Marina Fernandez, Ajit Jaokar, Anjali Jain, Christoffer Noring, Ayşe Mutlu

Prompt strategy for data science

Let’s run the same thought experiment for data science that we did for web development. We’ll apply the two guidelines presented earlier, “problem breakdown” and “generate prompts,” and, just as in the web development section, draw some general conclusions about the domain and present them as a prompt strategy for data science.

Problem breakdown: predict sales

Let’s say we’re building a machine-learning model to predict sales. At a high level, we understand what the system should do. To solve the problem though, we need to divide it into smaller parts, which in data science usually entails the following components:

  • Data: The data is the part of the system that stores information. The data can come from many places like databases, web endpoints, static files, and more.
  • Model: The model is responsible for learning from the data and producing a prediction that’s as accurate as possible. To predict, you give the model an input, and it produces one or more outputs as the prediction.
  • Training: The training is the part of the system that trains the model. Here, you typically use one part of your data for training and hold out another part, the sample (test) data, for evaluation.
  • Evaluation: To ensure your model works as intended, you need to evaluate it. Evaluation means taking the data and model and producing a score that indicates how well the model performs.
  • Visualization: Visualization is where graphs turn the results into insights that are valuable for the business. This part is very important because it’s the part most visible to the business.

Further breakdown into features/steps for data science

At this point, you’re at too high a level to start writing prompts. We can break it down further by looking at each step:

  • Data: The data part has many steps, including collecting the data, cleaning it, and transforming it. Here’s how you can break it down:
    1. Collect data: The data needs to be collected from somewhere. It could be a database, a web endpoint, a static file, and so on.
    2. Clean data: The data needs to be cleaned. Cleaning means removing data that’s not relevant, removing duplicates, and so on.
    3. Transform data: The data needs to be transformed. Transformation means changing the data to a format that’s useful for the model.
  • Training: Just like the data part, the training part has many steps to it. Here’s how you can break it down:
    1. Split data: The data needs to be split into training data and sample (test) data. The training data is used to train the model and the sample data is used to evaluate it.
    2. Train model: The model needs to be trained. Training means taking the training data and learning from it.
  • Evaluation: The evaluation part is usually a single step but can be broken down further.
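
To make the steps above concrete, here is a minimal sketch of how they could map onto code. It is an illustration under stated assumptions, not the book's implementation: the column names (ad_spend, sales), the values, and all function names are hypothetical, and a straight-line fit with NumPy stands in for whatever model you would actually choose.

```python
import numpy as np
import pandas as pd

# 1. Collect data: an in-memory stand-in for a database, web endpoint, or file.
# Column names and values are made up for illustration.
raw = pd.DataFrame({
    "ad_spend": [10.0, 20.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0, None],
    "sales": [25.0, 45.0, 45.0, 65.0, 85.0, 105.0, 125.0, 145.0, 165.0, 185.0, 205.0, 50.0],
})


def clean(df):
    """2. Clean data: drop exact duplicate rows and rows with missing values."""
    return df.drop_duplicates().dropna().reset_index(drop=True)


def transform(df):
    """3. Transform data: express ad spend in thousands as the model feature."""
    out = df.copy()
    out["ad_spend_k"] = out["ad_spend"] / 1000.0
    return out


def split(df, train_fraction=0.8, seed=0):
    """4. Split data: a random training part and a held-out sample part."""
    train = df.sample(frac=train_fraction, random_state=seed)
    held_out = df.drop(train.index)
    return train, held_out


def train_model(train):
    """5. Train model: least-squares fit of sales = slope * ad_spend_k + intercept."""
    slope, intercept = np.polyfit(train["ad_spend_k"], train["sales"], deg=1)
    return slope, intercept


def evaluate(model, held_out):
    """6. Evaluation: mean absolute error on the held-out part."""
    slope, intercept = model
    predictions = slope * held_out["ad_spend_k"] + intercept
    return float((predictions - held_out["sales"]).abs().mean())


data = transform(clean(raw))
train, held_out = split(data)
model = train_model(train)
mae = evaluate(model, held_out)
print(f"rows: {len(data)}, train: {len(train)}, held out: {len(held_out)}, MAE: {mae:.6f}")
```

Each function corresponds to one step, which is also roughly the granularity at which a single prompt can be written.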

Generate prompts for each step

Note how our breakdown for data science looks a bit different from web development. Instead of identifying features like Add inventory, we have features like Collect data.

However, we’re on the correct level to author a prompt, so let’s use the Collect data feature as our example:

[Prompt]

Collect data from data.xls and read it into a DataFrame using Pandas library.

[End of prompt]

The preceding prompt is both general and specific at the same time. It’s general in the sense that it tells the assistant to “collect data” but specific in that it names the library to use and even the data structure (DataFrame). It’s entirely possible that a simpler prompt would have worked for the preceding step, like so:

[Prompt]

Collect data from data.xls.

[End of prompt]

Whether the simpler prompt is enough may vary depending on whether you use a tool like ChatGPT or GitHub Copilot.
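
For reference, either prompt might yield code along the following lines. This is a sketch of a plausible result, not the output of any particular assistant; the function name collect_data and the choice to dispatch on the file extension are assumptions made for illustration.

```python
from pathlib import Path

import pandas as pd


def collect_data(path: str) -> pd.DataFrame:
    """Read a spreadsheet or CSV file into a DataFrame, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix in {".xls", ".xlsx"}:
        # read_excel relies on an optional engine package (for example, xlrd for .xls).
        return pd.read_excel(path)
    if suffix == ".csv":
        return pd.read_csv(path)
    raise ValueError(f"Unsupported file type: {suffix!r}")


# Usage, assuming data.xls exists in the working directory:
# df = collect_data("data.xls")
# print(df.head())
```

Naming the library and the data structure in the prompt makes a result like this more likely; with the simpler prompt, the assistant has to guess both.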

Identify some basic principles for data science, “a prompt strategy for data science”

Here, we’ve identified principles similar to those in the web development example:

  • Provide context – filename: A data file can have any name. It’s important to specify the name of the file.
  • Specify how – libraries: There are many ways to load a data file, and even though the Pandas library is a common choice, it’s important to specify it. There are other libraries to work with, and you might need a solution for Java, C#, or Rust, for example, where libraries are named differently.
  • Iterate: It’s worth iterating on the prompt, rephrasing it, and adding separators like a comma, a colon, and so on.
  • Be context-aware: Here too, context matters a lot; if you’re working in a notebook, previous cells will be available to GitHub Copilot, previous conversations will be available to ChatGPT, and so on.

As you can see from the preceding guidance, the strategy is very similar to the one for web development. Here too we list “Provide context,” “Specify how,” “Iterate,” and “Be context-aware”; the big difference lies in the details.

However, there’s an alternative strategy that works in data science: lengthy prompts. Even though we’ve broken down the data science problem into features, we don’t need to write a prompt per feature. Instead, you can express everything you want carried out in one large prompt. Such a prompt could look like so:

[Prompt]

You want to predict sales on the file data.xls. Use Python and Pandas library. Here are the steps that you should carry out:

  • Collect data
  • Clean data
  • Transform data
  • Split data
  • Train model
  • Evaluate model

[End of prompt]

In future chapters on data science and machine learning, you will see examples that use both smaller and lengthier prompts. You decide which approach you want to use.
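
The two approaches can also be contrasted in code. The sketch below assembles either one lengthy prompt or one smaller prompt per step from the same pieces of context (task, filename, libraries). The helper build_prompt and its exact wording are hypothetical; it only composes text and does not call ChatGPT or GitHub Copilot.

```python
def build_prompt(instruction, filename, libraries, steps=None):
    """Compose a prompt that names the file and libraries and optionally lists steps."""
    lines = [f"{instruction}. The data is in {filename}. Use {' and '.join(libraries)}."]
    if steps:
        lines.append("Here are the steps that you should carry out:")
        lines.extend(f"- {step}" for step in steps)
    return "\n".join(lines)


steps = [
    "Collect data",
    "Clean data",
    "Transform data",
    "Split data",
    "Train model",
    "Evaluate model",
]
libraries = ["Python", "Pandas"]

# One lengthy prompt that covers the whole problem.
long_prompt = build_prompt("Predict sales", "data.xls", libraries, steps)

# One smaller prompt per step, each repeating the file and library context.
small_prompts = [build_prompt(step, "data.xls", libraries) for step in steps]

print(long_prompt)
print()
print(small_prompts[0])
```

Both variants carry the same context; the difference is whether the assistant sees the whole plan at once or one step at a time.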
