Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon

Author Posts - Artificial Intelligence

5 Articles
article-image-enhancing-data-quality-with-cleanlab
Prakhar Mishra
11 Dec 2024
10 min read
Save for later

Enhancing Data Quality with Cleanlab

Prakhar Mishra
11 Dec 2024
10 min read
IntroductionIt is a well-established fact that your machine-learning model is only as good as the data it is fed. ML model trained on bad-quality data usually has a number of issues. Here are a few ways that bad data might affect machine-learning models -1. Predictions that are wrong may be made as a result of errors, missing numbers, or other irregularities in low-quality data. The model's predictions are likely to be inaccurate if the data used to train is unreliable.2. Bad data can also bias the model. The ML model can learn and reinforce these biases if the data is not representative of the real-world situations, which can result in predictions that are discriminating.3. Poor data also disables the the ability of ML model to generalize on fresh data. Poor data may not effectively depict the underlying patterns and relationships in the data.4. Models trained on bad-quality data might need more retraining and maintenance. The overall cost and complexity of model deployment could rise as a result.As a result, it is critical to devote time and effort to data preprocessing and cleaning in order to decrease the impact of bad data on ML models. Furthermore, to ensure the model's dependability and performance, it is often necessary to use domain knowledge to recognize and address data quality issues.It might come as a surprise, but gold-standard datasets like ImageNet, CIFAR, MNIST, 20News, and more also contain labeling issues. I have put in some examples below for reference -The above snippet is from the Amazon sentiment review dataset , where the original label was Neutral in both cases, whereas Cleanlab and Mechanical turk said it to be positive (which is correct).The above snippet is from the MNIST dataset, where the original label was marked to be 8 and 0 respectively, which is incorrect. Instead, both Cleanlab and Mechanical Turk said it to be 9 and 6 (which is correct).Feel free to check out labelerrors to explore more such cases in similar datasets.Introducing CleanlabThis is where Cleanlab can come in handy as your best bet. It helps by automatically identifying problems in your ML dataset, it assists you in cleaning both data and labels. This data centric AI software uses your existing models to estimate dataset problems that can be fixed to train even better models. The graphic below depicts the typical data-centric AI model development cycle:Apart from the standard way of coding all the way through finding data issues, it also offers Cleanlab Studio - a no-code platform for fixing all your data errors. For the purpose of this blog, we will go the former way on our sample use case.Getting Hands-on with CleanlabInstallationInstalling cleanlab is as easy as doing a pip install. I recommend installing optional dependencies as well, you never know what you need and when. I also installed sentence transformers, as I would be using them for vectorizing the text. Sentence transformers come with a bag of many amazing models, we particularly use ‘all-mpnet-base-v2’ as our choice of sentence-transformers for vectorizing text sequences. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. Feel free to check out this for the list of all models and their comparisons.pip install ‘cleanlab[all]’ pip install sentence-transformersDatasetWe picked the SMS Spam Detection dataset as our choice of dataset for doing the experimentation. It is a public set of labeled SMS messages that have been collected for mobile phone spam research with total instances of roughly ~5.5k. The below graphic gives a sneak peek of some of the samples from the dataset.Data PreviewCodeLet’s now delve into the code. For demonstration purposes, we inject a 5% noise in the dataset, and see if we are able to detect them and eventually train a better model.Note: I have also annotated every segment of the code wherever necessary for better understanding.import pandas as pd from sklearn.model_selection import train_test_split, cross_val_predict       from sklearn.preprocessing import LabelEncoder               from sklearn.linear_model import LogisticRegression       from sentence_transformers import SentenceTransformer       from cleanlab.classification import CleanLearning       from sklearn.metrics import f1_score # Reading and renaming data. Here we set sep=’\t’ because the data is tab       separated.       data = pd.read_csv('SMSSpamCollection', sep='\t')       data.rename({0:'label', 1:'text'}, inplace=True, axis=1)       # Dropping any instance of duplicates that could exist       data.drop_duplicates(subset=['text'], keep=False, inplace=True)       # Original data distribution for spam and not spam (ham) categories       print (data['label'].value_counts(normalize=True))       ham 0.865937       spam 0.134063       # Adding noise. Switching 5% of ham data to ‘spam’ label       tmp_df = data[data['label']=='ham']               examples_to_change = int(tmp_df.shape[0]*0.05)       print (f'Changing examples: {examples_to_change}')       examples_text_to_change = tmp_df.head(examples_to_change)['text'].tolist() changed_df = pd.DataFrame([[i, 'spam'] for i in examples_text_to_change])       changed_df.rename({0:'text', 1:'label'}, axis=1, inplace=True)       left_data = data[~data['text'].isin(examples_text_to_change)]       final_df = pd.concat([left_data, changed_df])               final_df.reset_index(drop=True, inplace=True)       Changing examples: 216       # Modified data distribution for spam and not spam (ham) categories       print (final_df['label'].value_counts(normalize=True))       ham 0.840016       spam 0.159984    raw_texts, raw_labels = final_df["text"].values, final_df["label"].values # Converting label into integers encoder = LabelEncoder() encoder.fit(raw_train_labels)       train_labels = encoder.transform(raw_train_labels)       test_labels = encoder.transform(raw_test_labels)       # Vectorizing text sequence using sentence-transformers transformer = SentenceTransformer('all-mpnet-base-v2') train_texts = transformer.encode(raw_train_texts)       test_texts = transformer.encode(raw_test_texts)       # Instatiating model instance model = LogisticRegression(max_iter=200) # Wrapping the sckit model around CL cl = CleanLearning(model) # Finding label issues in the train set label_issues = cl.find_label_issues(X=train_texts, labels=train_labels) # Picking top 50 samples based on confidence scores identified_issues = label_issues[label_issues["is_label_issue"] == True] lowest_quality_labels =       label_issues["label_quality"].argsort()[:50].to_numpy()       # Beauty print the label issue detected by CleanLab def print_as_df(index):    return pd.DataFrame(              {    "text": raw_train_texts,              "given_label": raw_train_labels,           "predicted_label": encoder.inverse_transform(label_issues["predicted_label"]),       },       ).iloc[index]       print_as_df(lowest_quality_labels[:5]) As we can see, Cleanlab assisted us in automatically removing the incorrect labels and training a better model with the same parameters and settings. In my experience, people frequently ignore data concerns in favor of building more sophisticated models to increase accuracy numbers. Improving data, on the other hand, is a pretty simple performance win. And, thanks to products like Cleanlab, it's become really simple and convenient.Feel free to access and play around with the above code in the Colab notebook hereConclusionIn conclusion, Cleanlab offers a straightforward solution to enhance data quality by addressing label inconsistencies, a crucial step in building more reliable and accurate machine learning models. By focusing on data integrity, Cleanlab simplifies the path to better performance and underscores the significance of clean data in the ever-evolving landscape of AI. Elevate your model's accuracy by investing in data quality, and explore the provided code to see the impact for yourself.Author BioPrakhar has a Master’s in Data Science with over 4 years of experience in industry across various sectors like Retail, Healthcare, Consumer Analytics, etc. His research interests include Natural Language Understanding and generation, and has published multiple research papers in reputed international publications in the relevant domain. Feel free to reach out to him on LinkedIn
Read more
  • 2
  • 0
  • 1643

article-image-artificial-intelligence-in-game-development-understanding-behavior-trees
Marco Secchi
18 Nov 2024
10 min read
Save for later

Artificial Intelligence in Game Development: Understanding Behavior Trees

Marco Secchi
18 Nov 2024
10 min read
IntroductionIn the wild world of videogames, you'll inevitably encounter a foe that needs to be both engaging and captivating. This opponent isn't just a bunch of nice-to-see polygons and textures; it needs to be a challenge that'll keep your players hooked to the screen.Let's be honest, as a game developer, crafting a truly engaging opponent is often a challenge that rivals the one your players will face!In video games, we often use the term Artificial Intelligence (AI) to describe characters that are not controlled by the player, whether they are enemies or friendly entities. There are countless ways to develop compelling characters in video games. In this article, we'll explore one specific solution offered by Unreal Engine: behavior trees.NoteCitations come from my Artificial Intelligence in Unreal Engine 5 book.Using the Unreal Shooting Gym ProjectFor this article, I have created a dedicated project called Unreal Shooting Gym. You can freely download it from GitHub: https://github.com/marcosecchi/unreal-shooting-gym and open it up with Unreal Engine 5.4.Once opened, you should see a level showing a lab with a set of targets and a small robot armed with a gun (A.K.A. RoboGun), as shown in Figure 1: Figure 1. The project level.If you hit the Play button, you should notice the RoboGun rotating toward a target while shooting. Once the target has been hit, the RoboGun will start rotating towards another one. All this logic has been achieved through a behavior tree, so let’s see what it is all about.Behavior Trees“In the universe of game development, behavior trees are hierarchical structures that govern the decision-making processes of AI characters, determining their actions and reactions during gameplay.”Unreal Engine offers a solid framework for handling behavior trees based on two main elements: the blackboard and behavior tree assets.Blackboard Asset“In Unreal Engine, the Blackboard [...] acts as a memory space – some sort of brain – where AI agents can read and write data during their decision-making process.“By opening the AI project folder, you can double-click the BB_Robogun asset to open it. You will be presented with the blackboard that, as you can see from Figure 2, is quite simple to understand. Figure 2. The AI blackboardAs you can see there’s a couple of variables – called keys – that are used to store a reference to the actor owning the behavior tree – in this case, the RoboGun – and to the target object that will be used to rotate the RoboGun.Behavior Tree Asset“In Unreal Engine, behavior trees are assets that are edited in a similar way to Blueprints – that is, visually – by adding and linking a set of nodes with specific functionalities to form a behavior tree graph.”Now, double-click the BT_RoboGun asset located in the AI folder to open the behavior tree. You should see the tree structure depicted in Figure 3:Figure 3. The AI behavior treeAlthough this is a pretty simple behavior logic, there’s a lot of things involved here. First of all, you will notice that there is a Root node; this is where the behavior logic starts from.After that, you will see that there are three gray-colored nodes; these are defined composite nodes.“Composite nodes define the root of a branch and set the rules for its execution.”Each of them behaves differently, but it is sufficient to say that they control the subtree that will be executed; as an example, the Shoot Sequence node will execute all the subtrees one after the other.The purple-colored nodes are called tasks and they are basically the leaves of the tree, whose aim is to execute actions. Unreal Engine comes with some predefined tasks, but you will be able to create your own through Blueprints or C++.As an example, consider the Shoot task depicted in Figure 4: Figure 4. The Shoot task In this Blueprint, when the task is executed, it will call the Shoot method – by means of a ShootInterface – and then end the execution with success. For a slightly more complex task, please check the  BTTask_SeekTarget asset.Get back to the behavior tree, and you will notice that the Find Random Target node has a blue-colored section called Is Target Set? This is a decorator. “Decorators provide a way to add additional functionality or conditions to the execution of a portion of a behavior tree.”In our case, the decorator is checking if the TargetActor blackboard key is not set; the corresponding task will be executed only if that key is not set – that is, we have no viable target. If the target is set, this decorator will block task execution and the parent selector node – the Root Selector node – will execute the next subtree.Environment QueriesUnreal Engine provides an Environment Query System (EQS) framework that allows data collection about the virtual environment. AI agents will be able to make informed decisions based on the results.In our behavior tree, we are running an environment query to find a viable target in the Find Random Target task. The query I have created – called EQ_FindTarget – is pretty simple as it just queries the environment looking for instances of the class BP_Target, as shown in Figure 5:Figure 5. The environment queryPawn and ControllerOnce you have created your behavior tree, you will need to execute it through an AIController, the class that is used to possess pawns or characters in order to make them proper AI agents. In the Blueprints folder, you can double-click on the RoboGunController asset to check the pretty self-explanatory code depicted in Figure 6:Figure 6. The character controller codeAs you can see, it’s just a matter of running a behavior tree asset. Easy, isn’t it?If you open the BP_RoboGun asset, you will notice that, in the Details panel, I have set the AI Controller Class to the RoboGunController; this will make the RoboGun pawn be automatically possessed by the RoboGunController.ConclusionThis concludes this brief overview of the behavior tree system; I encourage you to explore the possibilities and more advanced features – such as writing your code the C++ way – by reading my new book “Artificial Intelligence in Unreal Engine 5”; I promise you it will be an informative and, sometimes, funny journey!Author BioMarco Secchi is a freelance game programmer who graduated in Computer Engineering at the Polytechnic University of Milan. He is currently lecturer of the BA in Creative Technologies and of the MA in Creative Media Production. He also mentors BA students in their final thesis projects. In his spare time, he reads (a lot), plays (less than he would like) and practices (to some extent) Crossfit.
Read more
  • 0
  • 0
  • 782

article-image-simplifying-ai-pipelines-using-the-fti-architecture
Paul Iusztin
08 Nov 2024
15 min read
Save for later

Simplifying AI pipelines using the FTI Architecture

Paul Iusztin
08 Nov 2024
15 min read
IntroductionNavigating the world of data and AI systems can be overwhelming.Their complexity often makes it difficult to visualize how data engineering, research (data science and machine learning), and production roles (AI engineering, ML engineering, MLOps) work together to form an end-to-end system.As a data engineer, your work finishes when standardized data is ingested into a data warehouse or lake.As a researcher, your work ends after training the optimal model on a static dataset and registering it.As an AI or ML engineer, deploying the model into production often signals the end of your responsibilities.As an MLOps engineer, your work finishes when operations are fully automated and adequately monitored for long-term stability.But is there a more intuitive and accessible way to comprehend the entire end-to-end data and AI ecosystem?Absolutely—through the FTI architecture.Let’s quickly dig into the FTI architecture and apply it to a production LLM & RAG use case. Figure 1: The mess of bringing structure between the common elements of an ML system.Introducing the FTI architectureThe FTI architecture proposes a clear and straightforward mind map that any team or person can follow to compute the features, train the model, and deploy an inference pipeline to make predictions.The pattern suggests that any ML system can be boiled down to these 3 pipelines: feature, training, and inference.This is powerful, as we can clearly define the scope and interface of each pipeline. Ultimately, we have just 3 instead of 20 moving pieces, as suggested in Figure 1, which is much easier to work with and define.Figure 2 shows the feature, training, and inference pipelines. We will zoom in on each one to understand its scope and interface.Figure 2: FTI architectureBefore going into the details, it is essential to understand that each pipeline is a separate component that can run on different processes or hardware. Thus, each pipeline can be written using a different technology, by a different team, or scaled differently.The feature pipelineThe feature pipeline takes raw data as input, processes it, and outputs the features and labels required by the model for training or inference.Instead of directly passing them to the model, the features and labels are stored inside a feature store. Its responsibility is to store, version, track, and share the features.By saving the features into a feature store, we always have a state of our features. Thus, we can easily send the features to the training and inference pipelines.The training pipelineThe training pipeline takes the features and labels from the features stored as input and outputs a trained model(s).The models are stored in a model registry. Its role is similar to that of feature stores, but the model is the first-class citizen this time. Thus, the model registry will store, version, track, and share the model with the inference pipeline.The inference pipelineThe inference pipeline takes as input the features and labels from the feature store and the trained model from the model registry. With these two, predictions can be easily made in either batch or real-time mode.As this is a versatile pattern, it is up to you to decide what you do with your predictions. If it’s a batch system, they will probably be stored in a DB. If it’s a real-time system, the predictions will be served to the client who requested them.The most important thing you must remember about the FTI pipelines is their interface. It doesn’t matter how complex your ML system gets — these interfaces will remain the same.The final thing you must understand about the FTI pattern is that the system doesn’t have to contain only 3 pipelines. In most cases, it will include more.For example, the feature pipeline can be composed of a service that computes the features and one that validates the data. Also, the training pipeline can comprise the training and evaluation components.Applying the FTI architecture to a use caseThe FTI architecture is tool-agnostic, but to better understand how it works, let’s present a concrete use case and tech stack.Use case: Fine-tune an LLM on your social media data (LinkedIn, Medium, GitHub) and expose it as a real-time RAG application. Let’s call it your LLM Twin.As we build an end-to-end system, we split it into 4 pipelines:The data collection pipeline (owned by the DE team)The FTI pipelines (owned by the AI teams)As the FTI architecture defines a straightforward interface, we can easily connect the data collection pipeline to the ML components through a data warehouse, which, in our case, is a MongoDB NoSQL DB.The feature pipeline (the second ML-oriented data pipeline) can easily extract standardized data from the data warehouse and preprocess it for fine-tuning and RAG.The communication between the two is done solely through the data warehouse. Thus, the feature pipeline isn’t aware of the data collection pipeline and how it collected the raw data. Figure 3: LLM Twin high-level architectureThe feature pipeline does two things:chunks, embeds and loads the data to a Qdrant vector DB for RAG;generates an instruct dataset and loads it into a versioned ZenML artifact.The training pipeline ingests a specific version of the instruct dataset, fine-tunes an open-source LLM from HuggingFace, such as Llama 3.1, and pushes it to a HuggingFace model registry.During the research phase, we use a Comet ML experiment tracker to compare multiple fine-tuning experiments and push only the best one to the model registry.During production, we can automate the training job and use our LLM evaluation strategy or canary tests to check if the new LLM is fit for production.As the input dataset and output model registry are decoupled, we can quickly launch our training jobs using ML platforms like AWS SageMaker.ZenML orchestrates the data collection, feature, and training pipelines. Thus, we can easily schedule them or run them on demand orThe end-to-end RAG application is implemented in the inference pipeline side, which accesses fresh documents from the Qdrant vector DB and the latest model from the HuggingFace model registry.Here, we can implement advanced RAG techniques such as query expansion, self-query and rerank to improve the accuracy of our retrieval step for better context during the generation step.The fine-tuned LLM will be deployed to AWS SageMaker as an inference endpoint. Meanwhile, the rest of the RAG application is hosted as a FastAPI server, exposing the end-to-end logic as REST API endpoints.The last step is to collect the input prompts and generated answers with a prompt monitoring tool such as Opik to evaluate the production LLM for things such as hallucinations, moderation or domain-specific problems such as writing tone and style.SummaryThe FTI architecture is a powerful mindmap that helps you connect the dots in the complex data and AI world, as illustrated in the LLM Twin use case.Unlock the full potential of Large Language Models with the "LLM Engineer's Handbook" by Paul Iusztin and Maxime Labonne. Dive deeper into real-world applications, like the FTI architecture, and learn how to seamlessly connect data engineering, ML pipelines, and AI production. With practical insights and step-by-step guidance, this handbook is an essential resource for anyone looking to master end-to-end AI systems. Don’t just read about AI—start building it. Get your copy today and transform how you approach LLM engineering!Author BioPaul Iusztin is a senior ML and MLOps engineer at Metaphysic, a leading GenAI platform, serving as one of their core engineers in taking their deep learning products to production. Along with Metaphysic, with over seven years of experience, he built GenAI, Computer Vision and MLOps solutions for CoreAI, Everseen, and Continental. Paul's determined passion and mission are to build data-intensive AI/ML products that serve the world and educate others about the process. As the Founder of Decoding ML, a channel for battle-tested content on learning how to design, code, and deploy production-grade ML, Paul has significantly enriched the engineering and MLOps community. His weekly content on ML engineering and his open-source courses focusing on end-to-end ML life cycles, such as Hands-on LLMs and LLM Twin, testify to his valuable contributions.
Read more
  • 0
  • 0
  • 1646
Banner background image

article-image-learn-transformers-for-natural-language-processing-with-denis-rothman
Expert Network
31 Aug 2021
7 min read
Save for later

Learn Transformers for Natural Language Processing with Denis Rothman

Expert Network
31 Aug 2021
7 min read
Key takeaways The transformer architecture has proved to be revolutionary in outperforming the classical RNN and CNN models in use today. Artificial intelligence is simply a recent form of automation, just like all other automation. AI consultants will always be necessary to implement AI. Understand transformers from a cognitive science perspective with the book Transformers for Natural Language Processing. The transformer architecture is both revolutionary and disruptive making it the hottest Algorithm in AI. It is a game-changer for Natural Language Understanding (NLU), a subset of Natural Language Processing (NLP), which has become one of the pillars of artificial intelligence in a global digital economy.​ Transformers can outperform the classical RNN and CNN models in use today. We interviewed artificial intelligence expert Denis Rothman about transformers, it's advancement in artificial intelligence & NLP, and his recent book Transformers for Natural Language Processing. What's the significance of AI language understanding in the tech world today and what role do transformers play in it? Artificial intelligence-driven language understanding is expanding exponentially. It has become the pillar of language modeling, chatbots, personal assistants, question answering, text summarizing, speech-to-text, sentiment analysis, machine translation, and more. The Transformer, introduced by Google, provides novel approaches to language understanding through a novel self-attention architecture. OpenAI offers transformer technology, and Facebook's AI Research department provides high-quality datasets. Overall, the Internet giants have made transformers available to all, as you will discover in my book. The transformer architecture is both revolutionary and disruptive. The Transformer and subsequent transformer architectures and models are revolutionary because they changed the way we think of NLP and artificial intelligence itself. The architecture of the Transformer is not an evolution. It breaks with the past, leaving RNNs and CNNs behind. It takes us closer to seamless machine intelligence that will match human intelligence in the years to come. What should deep learning & NLP practitioners keep in mind while starting their career with transformers? The world of artificial intelligence is undergoing an exponential evolution in NLP due to the amount of data available. As this evolution expands to all domains, new abilities are required. NLP will not just be about downloading a model and getting to work in terms of software. You will have to analyze the quality of what a transformer model produces to fine-tune it. In turn, to analyze NLP properly, a minimum knowledge in linguistics will become mandatory. Linguistics will enable you to understand the building blocks and structure of a language. Grammar will increase your ability to analyze the output of a transformer.  Otherwise, your team will have to hire a linguist, which will increase the project's cost and threaten the Return On Investment(ROI) of the team. What are some future advancements that you anticipate in transformers and NLP? Transformers have wiped RNNs off the map at this point. They represent the industrialization of artificial intelligence. As artificial intelligence, transformers are taking AI from the hype to an industrial level. Unlike traditional deep learning models, transformers contain optimized layers for GPUs and CPUs. In the future, creating NLP models will require machine architecture awareness. Machine performance will be the key to more efficient models. Not everybody can purchase or rent a supercomputer to train a model. Learning how to design tailored transformer models based on optimized datasets will become mandatory to face competition. What are some of the popular myths around transformers prevalent in the tech market? Many people believe that transformers can perform all NLP tasks with a model such as GPT-3. Nothing can be further from the truth. Google, Microsoft, Facebook, and Amazon, for example, need data for their everyday business and powerful NLP transformer models to analyze the billions of words coming in every day. However, the tasks are limited to their marketing usage. If you need to implement a transformer in a specific area, you will have to build datasets. You will also have to build pipelines with classical algorithms and queries to process the data, the inputs, and manage the outputs. In real-life, that means that artificial intelligence is only a component in a long chain of classical algorithms and processes. How was your experience building one of the very first word2matrix embedding solutions? In the early 1980s, I managed a company with many students who wanted to learn a language. I had a choice. Increase the number of teachers or automate vast portions of the process. I decided to go for automation. Any intelligent system requires calculations. I found that converting words and word pieces into numbers was far more efficient than directly analyzing the words. I thus create a word2vector system, patented it in 1982, wrote a textbook, and implemented it in our company. Students began to take specific courses independently in our lab without a teacher. I then went further in the next few years, writing one of the first Cognitive NLP Chatbots with was successfully implemented for an industrial amount of students. Being the author of three cutting-edge AI solutions, what is your take on the shrinkage of job opportunities due to AI? Automation began centuries ago with water mills, windmills, textile machines, locomotives, and more recently, motorized personal vehicles in the early 20th century. Tractors replaced millions of jobs in the fields. Services are no exception. In the 1950s, hundreds of thousands of tellers, actual humans, worked in banks around the world. Today everybody goes to an ATM. ATM stands for Automated Teller Machine(ATM). “Automated teller,” says it all. A person performing a service was automated. Software is the automation of human tasks from the beginning, from accounting to stock market management and thousands of tasks. Artificial intelligence is simply a recent form of automation, just like all other automation. AI cannot replace traditional mathematics in physics. The calculation of differential equations driving rockets and satellites requires classical software precision, not artificial intelligence. AI is only a component of automation, like when cars replaced horses and all of the jobs that went with horse-driven transportation. AI will not replace everything because AI is useless in many fields. AI consultants will always be necessary to implement AI. Why has Python become the most suitable language for natural language processing? It’s important not to confuse the concepts of “most used” and “most suitable.” Python is a great intuitive language to learn AI and NLP. But it’s not a prerequisite. Python is easy to use and run, making it the shortest path, at this point, to take to learn AI. But do not be mistaken. C++ skills will also be required in large real-life projects, for example. My advice. Learn AI with Python at full speed. Do some implementations with Python. But learn other languages such as C++, Java, and more. Real-life pipelines require classical processes and algorithms, not only AI. In some projects, C++ will boost performances, for example. Tell us about your book, Transformers for Natural Language Processing. What trajectory does your book follow to help its readers master transformers? Reading my book on transformers will help you save weeks and maybe months of effort trying to understand how they work by watching videos and reading blogs. The reader will begin by learning the original Transformer in depth. Once the transformer's building blocks are mastered, the reader will learn how to train and fine-tune a transformer. The reader will then build and run the main transformer models such as BERT, RoBERTa, GPT-2, T5, and more. The models will be applied to NLP tasks such as document summarization, Q&As, semantic analysis, and a wide range of NLP tasks. The book contains a method to analyze fake news with transformers. The book also goes beyond the architecture of transformers and into the world of usage. You will learn how to build, train, fine-tune, and implement transformers.
Read more
  • 0
  • 0
  • 8308

article-image-bringing-ai-to-the-b2b-world-catching-up-with-sidetrade-cto-mark-sheldon-interview
Packt Editorial Staff
24 Feb 2020
13 min read
Save for later

Bringing AI to the B2B world: Catching up with Sidetrade CTO Mark Sheldon [Interview]

Packt Editorial Staff
24 Feb 2020
13 min read
Sidetrade is an organization that is on a mission to transform customer engagement in the world of B2B marketing with the help of artificial intelligence. With its own AI technology - Aimie - it's now in a strong position to carve out a niche for itself in a market that shows no signs of slowing down. What makes the company even more exciting for us at Packt is that they're just a stones throw away from our offices in Birmingham. To get the lowdown on Sidetrade we spoke to CTO Mark Sheldon about the company's evolution and what the future might hold. Read the interview below... Packt: Tell us a bit about your background and what you're up to today. Mark Sheldon: I started my career as a developer and moved into the management of a large technical team, at one of the ‘big six’ utility companies in the UK. Back in 2013, when the AI buzz was in its infancy, I co-founded a predictive analytics software company called BrightTarget. It was clear there was a better way for B2B organisations to gain more value from their data, and cloud computing and machine learning were clear market changers. In 2017 BrightTarget was acquired by Sidetrade and at this point I became part of their technical leadership team, with the goal of making Sidetrade an AI-driven company. More recently I moved into the Group CTO position (as part of the global leadership team), responsible for more than 85 staff. Sidetrade has a total of 250 staff across six offices in Europe, with expansion planned in 2020. The AI boom and its impact on the B2B landscape Packt: Gartner predicts that this year 30% of B2B companies will use AI to augment at least one of their primary sales processes. What's your take on this? Mark: Yes, only 30%, so this market is still just emerging. Although machine learning has been around for decades, there's still a lot of confusion around AI in many B2B organisations, mostly caused by all of the market and vendor hype. Very few have successfully deployed machine learning and are able to demonstrate value. However, for those that have, the potential for commercial gain deploying AI is huge. The most common processes impacted in sales & marketing are those which involve interactions with customers or prospects at scale; where the decision making of a human can be augmented or improved. e.g. Identifying customers most at risk of churn, customers with the best opportunity to sell more product or prospects with the highest propensity to become a customer. AI really allows sales & marketing teams to optimize their time and marketing spend. BrightTarget and Sidetrade Packt: You co-founded BrightTarget which was acquired by Sidetrade in 2016. Could you tell us a little bit about BrightTarget?   Mark: BrightTarget was founded in 2014, on the principle of helping B2B organisations deploy AI without the need for expensive and hard to find data scientists. We invested significantly in automating the process of data loading, processing (feature generation), model building and monitoring. We achieved strong traction with some large enterprise accounts and were recognized by Forrester as a “Strong Performer”. Packt: How did the acquisition come about? Mark: At this time [when BrightTarget was founded], Olivier Novasque (the Sidetrade CEO and founder) had a clear vision to transform Sidetrade into an AI-driven business. So the acquisition of BrightTarget in November 2016 was a natural fit with the ambitions of Sidetrade and their goals. This has proven to be a great move with the launch of Aimee (AI engine) which has contributed significantly to the subsequent revenue growth following the acquisition. Aimie: Sidetrade's AI technology Packt: Tell us a bit more about Aimie. How does it work? What's the thinking behind it? Mark: Aimie is Sidetrade’s propriety AI technology that helps our customers augment their daily experience within our products. For example, Aimie helps every cash collector make the very optimum collection decisions, even if they have only joined the company two weeks ago! This AI technology is at the heart of our SaaS platforms – Augmented Revenue (helping B2B organisations to manage their Revenue; including managing revenue at risk and finding opportunities to grow revenue from existing customers) and Augmented Cash (again, helping B2B organsisations improve working capital by better cash collection). We also have an unrivalled data lake built up over 20 years. We now have 230 million B2B payment experiences, totaling sales of over 700 billion Euros [£621 bn] which we train our AI on, and enriches our client’s own data. More good quality data for AI to train upon means for better predictions and outcomes. For example (reported in Fortune and Forbes), one of our enterprise clients is Manpower, one of the biggest recruitment firms in the world. With an annual income of €4 bn per year, Manpower France collects 1.3 million receivables from 80,000 companies. To handle this volume, and increasingly complex payment procedures, Manpower’s finance department started using Sidetrade technology in 2013, and introduced Aimie in 2018-19. Manpower started Aimie off with two customer portfolios for a period of two months. Aimie analyzed what worked before for Manpower, directly executed automatic follow-up actions, and established which past-dues to target first. She considered available resources (staff hours, workloads) in order to take optimal actions. Encouraged by the results, Manpower ramped up their use of Aimie. Within four months, Aimie was managing nearly 60% of their single-site customers, which represents over 5,000 accounts, and nearly 10,000 follow-up actions per month. Manpower has over 700 payer centers to manage, making it impossible for a manager to call all the debtors in their portfolio. Aimie helped them decide which customers to contact first. After nine months of testing, the results were clear: with support from Aimie, effectiveness of recovery actions grew 12%. That’s a good improvement in cash collection which boosts working capital, vital for business. Sidetrade's data science team Packt: Sidetrade has a data science team, what is it and how does it function? How does your team of data engineers, data scientists work in tandem with the product teams to create AI powered B2B solutions for customers? Do they also work on customized solutions? Mark: Dr Clement Chastagnol (PhD in AI and robotics) leads our data science team. We currently have a team focused more on research topics, who really push the boundaries on some of the latest aspects of AI. However, the majority of our sata scientists work directly within product-led squads (with a mixture of different data, application & ops engineers). The reason for this is to ensure we deliver actionable AI/ML into our products on a regular basis, to ensure we are customer (therefore product) focused. As a SaaS company with 1000’s of customers/users, almost all of the work we do is to improve our overall products, and adding features that benefit the majority. This also applies to data science, although we have a very advanced M/L platform which allows us to automatically build and manage 1000’s of M/L models, that are often client-specific. In terms of research, each year we work with the French Government’s business ombudsman, to research and produce an index report of all B2B business payment disputes, including figures by industry, and length of delay. This involves our data scientists analysing over 9,000 French customer companies representing, 91% of large corporations and organizations with 250 to 5,000 employees. The data analyzed covers over 2.8 million invoices totaling €12bn. Also as part of our research work, we have received funding from the French Government, EU Commission agencies, the French national agency for research, and the DataAi Institute on the following projects: Eurofirmo, which is an index of all 26 million businesses in the EU and Britain, including headcount and revenue which has never been done before. Re-search Alps, which is a collaboration with academics from four universities that aims to track all research-active research institutes across seven European countries. It records their research projects, funding, publications, patents, and other academic output. Dirty Data: Two research axis are funded. One revolves around dirty data integration, funded by the ANR (Association National de Recherche). The other strives to develop new techniques to analyze incomplete data. It is funded by the DataAI institute. As part of this project, we’ve worked with Gaël Varoquaux (ML Researcher & creator of Scikit-learn) which has been great. Alongside all these projects, the team has recently worked with Facebook Research on the topic of data drift, as well as publishing research in journal papers, academic conference attendance and presentations, support for PhD students, hackathons and guest lectures at universities. Developing new talent in the AI space Packt: Sidetrade recently launched The Code Academy, what is it? How can developers take part in this initiative and how will it benefit them? What are other key initiatives by Sidetrade? Mark: The Code Academy is designed to develop the next generation of AI talent and is important for Sidetrade to maintain its position as a leading AI powered customer platform. The Code Academy, which was piloted in 2018, is part of Sidetrade’s commitment to providing engineering skills and jobs for young people in the Midlands, to keep the UK at the forefront of the AI industry. It’s a new, rapid approach to training and job creation. We welcome trainees with computing and non-computing backgrounds who can demonstrate ability and passion for technology. It’s rapid, as we design and deliver the academy in-house over four weeks, with a lot of support from senior developers in the team. We train for job roles, rather than just impart coding. And it’s offered without cost to the trainee, so money isn’t a barrier. At the end of four weeks the trainees are given a challenge, and asked to present their work to an audience of senior staff. Academy modules include: • Becoming familiar with Git • Setting up VSCode for .NET & Web development • How to load a relational data set through pgAdmin (CSV) • Learning how to write TSQL to analyse and find trends within a data set • Learning about the concept of & develop a basic RESTFul service • Introduction to angular (using http://angular.io/start) • Learn to connect all layers of the stack • Use Kanban (Trello) to manage projects • How to define an MVP In 2018 we trained 10 people and offered software and data engineer roles to three. In 2019, we revamped the academy, making it much more practical, and selected 12 trainees from 50 applicants. The quality of the talent was so good we offered five trainees data and software engineer roles with our professional services and R&D teams. Expanding the team Packt: You’re about to move into a new, much bigger office in central Birmingham. What are your challenges in terms of expanding the team? Can you elaborate more on the challenges faced by the team in terms of working with AIOps. Mark: That's right, we’re going to open a new Tech Hub that will house a much bigger team of data and software engineers working across the full stack. We’ll also run our 2020 Code Academy from the hub. A special launch event, Together for Tech, will make the opening official on 27th February 2020, with VIP guests, tech and business stakeholders. We’ll also be announcing a major investment in R&D and jobs creation. There is huge potential for Birmingham to become a tech powerhouse within Europe. The challenge for me is hiring enough senior level tech professionals. These people are needed to lead teams, develop staff, and keep pushing the boundary of what we can do. There’s overall a challenge to hire enough qualified professionals for the tech sector, and that is more acute at the senior levels. I think there’s a temptation for experienced tech types to head to London or even America, so it’s a challenge for the region to retain great talent. The team has spent a lot of time on ‘AI ops’, which has emerged in the past three years. So, the other challenge is how do we actually productionise all of the models and data engineering that our team and platform is producing. How do we deploy & run them in production? How do we monitor them? This is all about managing Machine Learning models at scale. For me, the sheer volume the team have to deal with is the biggest thing. We train thousands of different predictive models for different clients, and they are changing all of the time in terms of the data they are trained on. So actually, building workflows and processes to help monitor those in production at that kind of scale without having to scale the team in proportion is probably the biggest challenge. The future of AI and automation for B2B marketing and sales Packt: AI is a really broad term. It often gets used interchangeably with machine learning and deep learning. Do you think this confusion is risky or dangerous? Do you think people should simply stop talking about AI in favour of machine learning and deep learning? Mark: I think we’ve reached a point where AI has become a buzz word and a catch-all phrase. I think we can and should start being more sophisticated about what we mean, it’s a work of education. From a vendor and business point of view, AI is no longer a differentiator as everyone is talking about it, so it makes it harder to stand out. But as decision makers become more educated on the topic, it's clear which vendors have the expertise and depth of data to deliver true AI-powered solutions. Packt: What do you expect to come next in the B2B Sales and Marketing space? And how is automation of this space likely to impact other industries? Mark: My prediction for the next big thing to come in the AI space would be a major breakthrough in Quantum computing by either Google or one of the startups specializing in the field. In the B2B sales and marketing space I think the next step is just wider adoption and trust that AI can augment or even outperform humans. Most businesses will need to go through a cultural and often organisational shift that is required to get the full commercial benefits out of AI. Thanks for taking the time to speak to us Mark! We'll be watching Sidetrade closely over the months and years to come. It's also great to see such an exciting and innovative company growing in Birmingham, right near the Packt office. Learn more about Sidetrade: www.sidetrade.com
Read more
  • 0
  • 0
  • 12221
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime