





















































Hi,
Welcome to a brand new issue of PythonPro!
In today’s Expert Insight, we bring you an excerpt from the recently published book, Building AI Applications with OpenAI APIs - Second Edition, which discusses how to create a language translation desktop app using OpenAI's ChatGPT API and Microsoft Word.
News Highlights: Protect AI to release Vulnhuntr, an AI tool for detecting Python zero-day vulnerabilities; Amazon launches SageMaker Core, a Python SDK simplifying machine learning with object-oriented interfaces; and PyCharm becomes the official IDE of OpenCV as JetBrains joins as a Silver Member.
And today’s Featured Study presents ChangeGuard, a tool designed to compare code behaviour before and after changes to detect functionality modifications.
Stay awesome!
Divya Anne Selvaraj
Editor-in-Chief
P.S.: This month's survey is still live. Do take the opportunity to leave us your feedback, request a learning resource, and earn your one Packt credit for this month.
... None when no else condition is provided, similar to Ruby.
... map(), filter(), and sorted(), along with advantages, limitations, and best practices for effective use in simplifying code.
... assert for internal checks during development and raise for handling user-facing errors in ML/AI projects to ensure robust error handling.
In "ChangeGuard: Validating Code Changes via Pairwise Learning-Guided Execution," Gröninger et al. present a tool called ChangeGuard, which compares code behaviour before and after changes to determine whether the modifications alter functionality.
Validating whether code changes preserve intended behaviour is a key challenge in software development, particularly when changes are deep within complex projects. Developers may make modifications to improve readability, performance, or to fix bugs, but unintended changes in functionality can lead to errors. Current methods, such as regression testing, often fail to catch these subtle changes. This study is relevant because it introduces a more reliable approach—ChangeGuard, which uses pairwise learning-guided execution. This approach involves running two versions of a code snippet simultaneously and predicting values to ensure the code runs correctly, even in complex scenarios.
This paper will be most useful for software developers, especially those working with large and complex codebases. It provides practical insights into validating code changes more effectively than existing methods, offering a way to catch unintended behaviour early in the development process. Developers using automated refactoring tools or large language models like GPT-4 will particularly benefit from ChangeGuard's ability to detect subtle, behaviour-altering modifications.
ChangeGuard's methodology is based on pairwise learning-guided execution, an extension of an existing technique. It predicts missing values dynamically, ensuring more execution paths are covered than with previous approaches. The tool was evaluated on 224 annotated code changes from popular Python open-source projects, showing high accuracy in detecting semantic changes. Additionally, ChangeGuard was applied to code changes produced by automated refactoring tools and large language models like GPT-3.5 and GPT-4, where it found 87 out of 187 and 143 out of 258 code changes to unexpectedly alter behaviour. This comprehensive testing provides strong evidence for ChangeGuard's reliability and robustness.
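The core idea of comparing behaviour before and after a change can be illustrated with a much simpler differential check. The following is only a minimal sketch, not ChangeGuard's implementation: it leaves out the learning-guided value prediction described in the paper and simply runs two hypothetical versions of a function (mean_old and mean_new, invented for this example) on the same inputs, recording any disagreement in results or raised exceptions.

```python
def observe(fn, args):
    """Run fn(*args) and capture either its result or the exception type."""
    try:
        return ("ok", fn(*args))
    except Exception as exc:  # an exception is part of the observable behaviour
        return ("error", type(exc).__name__)


def behaviour_differences(old_fn, new_fn, inputs):
    """Return (args, before, after) for every input where the versions disagree."""
    differences = []
    for args in inputs:
        before = observe(old_fn, args)
        after = observe(new_fn, args)
        if before != after:
            differences.append((args, before, after))
    return differences


# Hypothetical "before" version of a function.
def mean_old(values):
    return sum(values) / len(values)


# Hypothetical "after" version: looks like a harmless cleanup,
# but it changes what happens for an empty list.
def mean_new(values):
    return sum(values) / len(values) if values else 0.0


sample_inputs = [([1, 2, 3],), ([],)]
differences = behaviour_differences(mean_old, mean_new, sample_inputs)
for args, before, after in differences:
    print(f"input={args} before={before} after={after}")
```

A check like this only covers the inputs you think to supply, which is the gap that approaches such as ChangeGuard aim to narrow.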
You can learn more by reading the entire paper and accessing ChangeGuard.
Here’s an excerpt from “Chapter 6: Language Translation Desktop App with the ChatGPT API and Microsoft Word” in the book, Building AI Applications with OpenAI APIs - Second Edition by Martin Yanev, published in October 2024.
In this section, we will explore how to set up our project and install the docx Python library to extract text from Word documents. The docx library is a Python package that allows us to read and write Microsoft Word (.docx) files and provides a convenient interface to access information stored in these files.
The first step is to create a new directory called Translation App and open it in VSCode. This gives you a dedicated place to write and organize your translation app code. Activate your virtual environment from the terminal window following the steps outlined in Chapter 1, Getting Started with the ChatGPT API for NLP Tasks.
To run the language translation desktop app, you will need to install the following libraries:
openai: The openai library allows you to interact with the OpenAI API and perform various NLP tasks
docx: The docx library allows you to read and write Microsoft Word .docx files using Python
tkinter: The tkinter library is a built-in Python library that allows you to create Graphical User Interfaces (GUIs) for your desktop app
As tkinter is a built-in library, there is no need for installation since it already exists within your Python environment. To install the openai and docx libraries, access the VSCode terminal, and then execute the following commands:
pip install openai
pip install python-docx
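Note that the package is installed as python-docx but imported as docx. As a quick sanity check after installation, a small sketch like the following reports which of the required modules Python can locate, without actually importing them. The helper name check_modules is hypothetical and not part of any library.

```python
import importlib.util


def check_modules(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # These are import names, not pip package names:
    # python-docx is imported as "docx".
    for module, found in check_modules(["openai", "docx", "tkinter"]).items():
        print(f"{module}: {'found' if found else 'missing'}")
```

If tkinter is reported as missing, it may need to be installed separately on your system; some Linux distributions, for example, package it apart from the main Python installation.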
To access and read the contents of a Word document, you will need to create a sample Word file inside your project. Here are the steps to create a new Word file:
1. Inside your project, create a new folder called files.
2. Open the files folder and select New File.
3. Give the new file a name with the .docx extension, for example, info.docx.
You can now add some text or content to this file, which we will later access and read using the docx library in Python. For this example, we have created an article about New York City. You can find the complete article here: https://en.wikipedia.org/wiki/New_York_City. However, you can choose any Word document containing text that you want to analyze:
The United States’ most populous city, often referred to as New York City or NYC, is New York. In 2020, its population reached 8,804,190 people across 300.46 square miles, making it the most densely populated major city in the country and over two times more populous than the nation’s second-largest city, Los Angeles. The city’s population also exceeds that of 38 individual U.S. states. Situated at the southern end of New York State, New York City serves as the Northeast megalopolis and New York metropolitan area’s geographic and demographic center - the largest metropolitan area in the country by both urban area and population. Over 58 million people also live within 250 miles of the city. A significant influencer on commerce, health care and life sciences, research, technology, education, politics, tourism, dining, art, fashion, and sports, New York City is a global cultural, financial, entertainment, and media hub. It houses the headquarters of the United Nations, making it a significant center for international diplomacy, and is often referred to as the world’s capital.
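Keep in mind that an empty file that merely carries the .docx extension is generally not a valid Word document, so the file needs real content saved by Word or a compatible tool. If Word is not at hand, one option is to generate the sample file with python-docx itself. The sketch below assumes the files/info.docx location used in this section; the helper names sample_docx_path and create_sample_docx are hypothetical.

```python
from pathlib import Path


def sample_docx_path(project_root="."):
    """Return the assumed location of the sample file: <root>/files/info.docx."""
    return Path(project_root) / "files" / "info.docx"


def create_sample_docx(path, paragraphs):
    """Write the given paragraphs to a new .docx file using python-docx."""
    import docx  # imported here so the path helper works without python-docx

    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create files/ if needed
    document = docx.Document()  # start from an empty document
    for paragraph in paragraphs:
        document.add_paragraph(paragraph)
    document.save(str(path))
    return path


# Example usage (requires `pip install python-docx`):
# create_sample_docx(sample_docx_path(), ["New York City is a global hub."])
```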
Now that you have created the Word file inside your project, you can move on to the next step, which is to create a new Python file called app.py inside the Translation App root directory. This file will contain the code to read and manipulate the contents of the Word file using the docx library. With the Word file and the Python file in place, you are ready to start writing the code to extract data from the document and use it in your application.
To test whether we can read Word files with the python-docx library, we can implement the following code in our app.py file:
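import docx

# Open the Word document (replace the placeholder with a real path)
doc = docx.Document("<full_path_to_docx_file>")

# Concatenate the text of every paragraph into one string
text = ""
for para in doc.paragraphs:
    text += para.text
print(text)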
Make sure to replace <full_path_to_docx_file> with the actual path to your Word document file. To obtain the path, right-click your .docx file in VSCode and select the Copy Relative Path option from the drop-down menu. A relative path is resolved against the directory you run app.py from, so run the script from the project root.
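Because a relative path only works when the script is run from the expected directory, it can help to resolve and validate the path before opening the document. Here is a minimal sketch, assuming the files/info.docx location used in this section. The helper resolve_docx_path is hypothetical and not part of python-docx.

```python
from pathlib import Path


def resolve_docx_path(relative_path, base_dir=None):
    """Turn a project-relative path into an absolute one and check its extension."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = (base / relative_path).resolve()
    if path.suffix.lower() != ".docx":
        raise ValueError(f"Expected a .docx file, got: {path.name}")
    return path


# Example usage inside app.py (the path shown is the one assumed in this section):
# docx_path = resolve_docx_path("files/info.docx")
# if not docx_path.exists():
#     raise FileNotFoundError(docx_path)
# doc = docx.Document(str(docx_path))
```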
Once you have done that, run the app.py file and verify the output. This code will read the contents of your Word document and print them to the console. If the text extraction works correctly, you should see the text of your document printed in the console (see Figure 6.1). The text variable now holds the data from info.docx as a Python string.
Figure 6.1 – Word text extraction console output
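The loop in the excerpt appends paragraphs directly to one another, so the last word of one paragraph runs into the first word of the next. A possible variation, not taken from the book, joins the paragraphs with a separator and skips empty ones. The sketch below relies only on the .paragraphs and .text attributes used in the excerpt, so a stand-in object is enough to demonstrate it; extract_text is a hypothetical helper name.

```python
from types import SimpleNamespace


def extract_text(document, separator="\n"):
    """Join the non-empty paragraph texts of a document-like object."""
    parts = [paragraph.text.strip() for paragraph in document.paragraphs]
    return separator.join(part for part in parts if part)


# Stand-in for a python-docx Document, so the sketch runs without a .docx file.
fake_doc = SimpleNamespace(
    paragraphs=[
        SimpleNamespace(text="New York City"),
        SimpleNamespace(text=""),
        SimpleNamespace(text="is a global hub."),
    ]
)
sample = extract_text(fake_doc)
print(sample)

# With python-docx installed, the same helper can be applied to a real file:
# import docx
# text = extract_text(docx.Document("<full_path_to_docx_file>"))
```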
Packt library subscribers can continue reading the entire book for free. You can buy Building AI Applications with OpenAI APIs - Second Edition, here.
And that’s a wrap.
We have an entire range of newsletters with focused content for tech pros. Subscribe to the ones you find the most useful here. The complete PythonPro archives can be found here.
If you have any suggestions or feedback, or would like us to find you a Python learning resource on a particular subject, take the survey or just respond to this email!