Artificial Intelligence | 170 articles | Tech News, Tutorials & Expert Insights

article-image-5-ways-artificial-intelligence-is-upgrading-software-engineering

02 Sep 2018

8 min read

5 ways artificial intelligence is upgrading software engineering

02 Sep 2018

0
0
17781

article-image-8-machine-learning-best-practices

Melisha Dsouza

02 Sep 2018

9 min read

8 Machine learning best practices [Tutorial]

Melisha Dsouza

02 Sep 2018

9 min read

Machine Learning introduces a huge potential to reduce costs and generate new revenue in an enterprise. Application of machine learning effectively helps in solving practical problems smartly within an organization. Machine learning automates tasks that would otherwise need to be performed by a live agent. It has made drastic improvements in the past few years, but many a time, a machine needs the assistance of a human to complete its task. This is why it is necessary for organizations to learn best practices in machine learning which you will learn in this article today. This article is an excerpt from a book written by Chiheb Chebbi titled Mastering Machine Learning for Penetration Testing Feature engineering in machine learning Feature engineering and feature selection are essential to every modern data science product, especially machine learning based projects. According to research, over 50% of the time spent building the model is occupied by cleaning, processing, and selecting the data required to train the model. It is your responsibility to design, represent, and select the features. Most machine learning algorithms cannot work on raw data. They are not smart enough to do so. Thus, feature engineering is needed, to transform data in its raw status into data that can be understood and consumed by algorithms. Professor Andrew Ng once said: "Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering." Feature engineering is a process in the data preparation phase, according to the cross-industry standard process for data mining: The term Feature Engineering itself is not a formally defined term. It groups together all of the tasks for designing features to build intelligent systems. It plays an important role in the system. If you check data science competitions, I bet you have noticed that the competitors all use the same algorithms, but the winners perform the best feature engineering. If you want to enhance your data science and machine learning skills, I highly recommend that you visit and compete at www.kaggle.com: When searching for machine learning resources, you will face many different terminologies. To avoid any confusion, we need to distinguish between feature selection and feature engineering. Feature engineering transforms raw data into suitable features, while feature selection extracts necessary features from the engineered data. Featuring engineering is selecting the subset of all features, without including redundant or irrelevant features. Machine learning best practices Feature engineering enhances the performance of our machine learning system. We discuss some tips and best practices to build robust intelligent systems. Let's explore some of the best practices in the different aspects of machine learning projects. Information security datasets Data is a vital part of every machine learning model. To train models, we need to feed them datasets. While reading the earlier chapters, you will have noticed that to build an accurate and efficient machine learning model, you need a huge volume of data, even after cleaning data. Big companies with great amounts of available data use their internal datasets to build models, but small organizations, like startups, often struggle to acquire such a volume of data. International rules and regulations are making the mission harder because data privacy is an important aspect of information security. Every modern business must protect its users' data. To solve this problem, many institutions and organizations are delivering publicly available datasets, so that others can download them and build their models for educational or commercial use. Some information security datasets are as follows: The Controller Area Network (CAN) dataset for intrusion detection (OTIDS): http://ocslab.hksecurity.net/Dataset/CAN-intrusion-dataset The car-hacking dataset for intrusion detection: http://ocslab.hksecurity.net/Datasets/CAN-intrusion-dataset The web-hacking dataset for cyber criminal profiling: http://ocslab.hksecurity.net/Datasets/web-hacking-profiling The API-based malware detection system (APIMDS) dataset: http://ocslab.hksecurity.net/apimds-dataset The intrusion detection evaluation dataset (CICIDS2017): http://www.unb.ca/cic/datasets/ids-2017.html The Tor-nonTor dataset: http://www.unb.ca/cic/datasets/tor.html The Android adware and general malware dataset: http://www.unb.ca/cic/datasets/android-adware.html Use Project Jupyter The Jupyter Notebook is an open source web application used to create and share coding documents. I highly recommend it, especially for novice data scientists, for many reasons. It will give you the ability to code and visualize output directly. It is great for discovering and playing with data; exploring data is an important step to building machine learning models. Jupyter's official website is http://jupyter.org/: To install it using pip, simply type the following: python -m pip install --upgrade pip python -m pip install jupyter Speed up training with GPUs As you know, even with good feature engineering, training in machine learning is computationally expensive. The quickest way to train learning algorithms is to use graphics processing units (GPUs). Generally, though not in all cases, using GPUs is a wise decision for training models. In order to overcome CPU performance bottlenecks, the gather/scatter GPU architecture is best, performing parallel operations to speed up computing. TensorFlow supports the use of GPUs to train machine learning models. Hence, the devices are represented as strings; following is an example: "/device:GPU:0" : Your device GPU "/device:GPU:1" : 2nd GPU device on your Machine To use a GPU device in TensorFlow, you can add the following line: with tf.device('/device:GPU:0'): <What to Do Here> You can use a single GPU or multiple GPUs. Don't forget to install the CUDA toolkit, by using the following commands: Wget "http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.44-1_amd64.deb" sudo dpkg -i cuda-repo-ubuntu1604_8.0.44-1_amd64.deb sudo apt-get update sudo apt-get install cuda Install cuDNN as follows: sudo tar -xvf cudnn-8.0-linux-x64-v5.1.tgz -C /usr/local export PATH=/usr/local/cuda/bin:$PATH export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64" export CUDA_HOME=/usr/local/cuda Selecting models and learning curves To improve the performance of machine learning models, there are many hyper parameters to adjust. The more data that is used, the more errors that can happen. To work on these parameters, there is a method called GridSearchCV. It performs searches on predefined parameter values, through iterations. GridSearchCV uses the score() function, by default. To use it in scikit-learn, import it by using this line: from sklearn.grid_search import GridSearchCV Learning curves are used to understand the performance of a machine learning model. To use a learning curve in scikit-learn, import it to your Python project, as follows: from sklearn.learning_curve import learning_curve Machine learning architecture In the real world, data scientists do not find data to be as clean as the publicly available datasets. Real world data is stored by different means, and the data itself is shaped in different categories. Thus, machine learning practitioners need to build their own systems and pipelines to achieve their goals and train the models. A typical machine learning project respects the following architecture: Coding Good coding skills are very important to data science and machine learning. In addition to using effective linear algebra, statistics, and mathematics, data scientists should learn how to code properly. As a data scientist, you can choose from many programming languages, like Python, R, Java, and so on. Respecting coding's best practices is very helpful and highly recommended. Writing elegant, clean, and understandable code can be done through these tips: Comments are very important to understandable code. So, don't forget to comment your code, all of the time. Choose the right names for variables, functions, methods, packages, and modules. Use four spaces per indentation level. Structure your repository properly. Follow common style guidelines. If you use Python, you can follow this great aphorism, called the The Zen of Python, written by the legend, Tim Peters: "Beautiful is better than ugly. Explicit is better than implicit. Simple is better than complex. Complex is better than complicated. Flat is better than nested. Sparse is better than dense. Readability counts. Special cases aren't special enough to break the rules. Although practicality beats purity. Errors should never pass silently. Unless explicitly silenced. In the face of ambiguity, refuse the temptation to guess. There should be one-- and preferably only one --obvious way to do it. Although that way may not be obvious at first unless you're Dutch. Now is better than never. Although never is often better than *right* now. If the implementation is hard to explain, it's a bad idea. If the implementation is easy to explain, it may be a good idea. Namespaces are one honking great idea -- let's do more of those!" Data handling Good data handling leads to successfully building machine learning projects. After loading a dataset, please make sure that all of the data has loaded properly, and that the reading process is performing correctly. After performing any operation on the dataset, check over the resulting dataset. Business contexts An intelligent system is highly connected to business aspects because, after all, you are using data science and machine learning to solve a business issue or to build a commercial product, or for getting useful insights from the data that is acquired, to make good decisions. Identifying the right problems and asking the right questions are important when building your machine learning model, in order to solve business issues. In this tutorial, we had a look at somes tips and best practices to build intelligent systems using Machine Learning. To become a master at penetration testing using machine learning with Python, check out this book Mastering Machine Learning for Penetration Testing Why TensorFlow always tops machine learning and artificial intelligence tool surveys Intelligent Edge Analytics: 7 ways machine learning is driving edge computing adoption in 2018 Tackle trolls with Machine Learning bots: Filtering out inappropriate content just got easy

0
0
10438

article-image-12-ubiquitous-artificial-intelligence-powered-apps-that-are-changing-lives

Bhagyashree R

30 Aug 2018

11 min read

12 ubiquitous artificial intelligence powered apps that are changing lives

Bhagyashree R

30 Aug 2018

11 min read

0
0
11194

article-image-nvidia-leads-the-ai-hardware-race-but-which-of-its-gpus-should-you-use-for-deep-learning

Prasad Ramesh

29 Aug 2018

8 min read

NVIDIA leads the AI hardware race. But which of its GPUs should you use for deep learning?

Prasad Ramesh

29 Aug 2018

8 min read

For readers who are new to deep learning and who might be wondering what a GPU is, let’s start there. To make it simple, consider deep learning as nothing more than a set of calculations - complex calculations, yes, but calculations nonetheless. To run these calculations, you need hardware. Ordinarily, you might just use a normal processor like the CPU inside your laptop. However, this isn’t powerful enough to process at the speed at which deep learning computations need to happen. GPUs, however, can. This is because while a conventional CPU has only a few complex cores, a GPU can have thousands of simple cores. With a GPU, training a deep learning data set can take just hours instead of days. However, although it’s clear that GPUs have significant advantages over CPUs, there is nevertheless a range of GPUs available, each having their own individual differences. Selecting one is ultimately a matter of knowing what your needs are. Let’s dig deeper and find out how to go about shopping for GPUs… What to look for before choosing a GPU? There are a few specifications to consider before picking a GPU. Memory bandwidth: This determines the capacity of a GPU to handle large amounts of data. It is the most important performance metric, as with faster memory bandwidth more data can be processed at higher speeds. Number of cores: This indicates how fast a GPU can process data. A large number of CUDA cores can handle large datasets well. CUDA cores are parallel processors similar to cores in a CPU but their number is in thousands and are not suited for complex calculations that a CPU core can perform. Memory size: For computer vision projects, it is crucial for memory size to be as much as you can afford. But with natural language processing, memory size does not play such an important role. Our pick of GPU devices to choose from The go to choice here is NVIDIA; they have standard libraries that make it simple to set things up. Other graphics cards are not very friendly in terms of the libraries supported for deep learning. NVIDIA CUDA Deep Neural Network library also has a good development community. “Is NVIDIA Unstoppable In AI?” -Forbes “Nvidia beats forecasts as sales of graphics chips for AI keep booming” -SiliconANGLE AMD GPUs are powerful too but lack library support to get things running smoothly. It would be really nice to see some AMD libraries being developed to break the monopoly and give more options to the consumers. NVIDIA RTX 2080 Ti: The RTX line of GPUs are to be released in September 2018. The RTX 2080 Ti will be twice as fast as the 1080 Ti. Price listed on NVIDIA website for founder’s edition is $1,199. RAM: 11 GB Memory bandwidth: 616 GBs/second Cores: 4352 cores @ 1545 MHz NVIDIA RTX 2080: This is more cost efficient than the 2080 Ti at a listed price of $799 on NVIDIA website for the founder’s edition. RAM: 8 GB Memory bandwidth: 448 GBs/second Cores: 2944 cores @ 1710 MHz NVIDIA RTX 2070: This is more cost efficient than the 2080 Ti at a listed price of $599 on NVIDIA website. Note that the other versions of the RTX cards will likely be cheaper than the founder’s edition around a $100 difference. RAM: 8 GB Memory bandwidth: 448 GBs/second Cores: 2304 cores @ 1620 MHz NVIDIA GTX 1080 Ti: Priced at $650 on Amazon. This is a higher end option but offers great value for money, and can also do well in Kaggle competitions. If you need more memory but cannot afford the RTX 2080 Ti go for this. RAM: 11 GB Memory bandwidth: 484 GBs/second Cores: 3584 cores @ 1582 MHz NVIDIA GTX 1080: Priced at $584 on Amazon. This is a mid-high end option only slightly behind the 1080Ti. VRAM: 8 GB Memory bandwidth: 320 GBs/second Processing power: 2560 cores @ 1733 MHz NVIDIA GTX 1070 Ti: Priced at around $450 on Amazon. This is slightly less performant than the GTX 1080 but $100 cheaper. VRAM: 8 GB Memory bandwidth: 256 GBs/second Processing power: 2438 cores @ 1683 MHz NVIDIA GTX 1070: Priced at $380 on Amazon is currently the bestseller because of crypto miners. Somewhat slower than the 1080 GPUs but cheaper. VRAM: 8 GB Memory bandwidth: 256 GBs/second Processing power: 1920 cores @ 1683 MHz NVIDIA GTX 1060 6GB: Priced at around $290 on Amazon. Pretty cheap but the 6 GB VRAM limits you. Should be good for NLP but you’ll find the performance lacking in computer vision. VRAM: 6 GB Memory bandwidth: 216 GBs/second Processing power: 1280 cores @ 1708 MHz NVIDIA GTX 1050 Ti: Priced at around $200 on Amazon. This is the cheapest workable option. Good to get started with deep learning and explore if you’re new. VRAM: 4 GB Memory bandwidth: 112 GBs/second Processing power: 768 cores @ 1392 MHz NVIDIA Titan XP: The Titan XP is also an option but gives only a marginally better performance while being almost twice as expensive as the GTX 1080 Ti, it has 12 GB memory, 547.7 GB/s bandwidth and 3840 cores @ 1582 MHz. On a side note, NVIDIA Quadro GPUs are pretty expensive and don’t really help in deep learning they are more of use in CAD and working with heavy graphics production tasks. The graph below does a pretty good job of visualizing how all the GPUs above compare: Source: Slav Ivanov Blog, processing power is calculated as CUDA cores times the clock frequency Does the number of GPUs matter? Yes, it does. But how many do you really need? What’s going to suit the scale of your project without breaking your budget? 2 GPUs will always yield better results than just one - but it’s only really worth it if you need the extra power. There are two options you can take with multi-GPU deep learning. On the one hand, you can train several different models at once across your GPUs, or, alternatively distribute one single training model across multiple GPUs known as “multi-GPU training”. The latter approach is compatible with TensorFlow, CNTK, and PyTorch. Both of these approaches have advantages. Ultimately, it depends on how many projects you’re working on and, again, what your needs are. Another important point to bear in mind is that if you’re using multiple GPUs, the processor and hard disk need to be fast enough to feed data continuously - otherwise the multi-GPU approach is pointless. Source: NVIDIA website It boils down to your needs and budget, GPUs aren’t exactly cheap. Other heavy devices There are also other large machines apart from GPUs. These include the specialized supercomputer from NVIDIA, the DGX-2, and Tensor processing units (TPUs) from Google. The NVIDIA DGX-2 If you thought GPUs were expensive, let me introduce you to NVIDIA DGX-2, the successor to the NVIDIA DGX-1. It’s a highly specialized workstation; consider it a supercomputer that has been specially designed to tackle deep learning. The price of the DGX-2 is (*gasp*) $399,000. Wait, what? I could buy some new hot wheels for that, or Dual Intel Xeon Platinum 8168, 2.7 GHz, 24-cores, 16 NVIDIA GPUs, 1.5 terabytes of RAM, and nearly 32 terabytes of SSD storage! The performance here is 2 petaFLOPS. Let’s be real: many of us probably won’t be able to afford it. However, NVIDIA does have leasing options, should you choose to try it. Practically speaking, this kind of beast finds its use in research work. In fact, the first DGX-1 was gifted to OpenAI by NVIDIA to promote AI research. Visit the NVIDIA website for more on these monster machines. There are also personal solutions available like the NVIDIA DGX Workstation. TPUs Now that you’ve caught your breath after reading about AI dream machines, let’s look at TPUs. Unlike the DGX machines, TPUs run on the cloud. A TPU is what’s referred to as an application-specific integrated circuit (ASIC) that has been designed specifically for machine learning and deep learning by Google. Here’s the key stats: Cloud TPUs can provide up to 11.5 petaflops of performance in a single pod. If you want to learn more, go to Google’s website. When choosing GPUs you need to weigh up your options The GTX 1080 Ti is most commonly used by researchers and competitively for Kaggle, as it gives good value for money. Go for this if you are sure about what you want to do with deep learning. The GTX 1080 and GTX 1070 Ti are cheaper with less computing power, a more budget friendly option if you cannot afford the 1080 Ti. GTX 1070 saves you some more money but is slower. The GTX 1060 6GB and GTX 1050 Ti are good if you’re just starting off in the world of deep learning without burning a hole in your pockets. If you must have the absolute best GPU irrespective of the cost then the RTX 2080 Ti is your choice. It offers twice the performance for almost twice the cost of a 1080 Ti. Nvidia unveils a new Turing architecture: “The world’s first ray tracing GPU” Nvidia GPUs offer Kubernetes for accelerated deployments of Artificial Intelligence workloads Nvidia’s Volta Tensor Core GPU hits performance milestones. But is it the best?

0
0
14741

article-image-a-machine-learning-roadmap-for-web-developers

Sugandha Lahoti

27 Aug 2018

7 min read

A Machine learning roadmap for Web Developers

Sugandha Lahoti

27 Aug 2018

7 min read

Now that you’ve opened this article, I’ll assume you’re a web developer who is all excited with the prospect of building a machine learning project. You may be here for one of these reasons. Either you have been in a circle of people who find web development is dying? (Is it really dying or just unwell?). Or maybe you are stagnating in your current trajectory. And so, you want to learn something different, something trending, something like Artificial Intelligence. Or you/your employer/your client is aware of the capabilities of machine learning and want to include it in some part of your web app to make it more powerful. Or like the majority of the folks, you just want to see first hand if all the fuss about artificial intelligence is really worth all the effort to switch gears, by building a side toy ML project. Either way, there are different approaches to fulfill these needs. Learning Machine Learning for the Web with Javascript Learning machine learning coming from a web development background comes with its own constraints. You might worry about having to learn entirely different concepts from scratch - from different algorithms to programming languages like Python to mathematical concepts like linear algebra, calculus, and statistics. However, chances are you can skip learning a new language. You probably know some Javascript in some form or the other thanks to your web development experience. As such, you can learn Machine Learning in JavaScript (You don’t have to learn another programming language from scratch) and take it right to your browsers with WebGL. There are some advantages to using JavaScript for ML. Its popularity is one; while ML in JavaScript is not as popular as Python’s ML ecosystem, at the moment, the language itself is. As demand for ML applications rises, and as hardware becomes faster and cheaper, it's only natural for machine learning to become more prevalent in the JavaScript world. The JavaScript ecosystem offers a rich set of libraries suited for most Machine Learning tasks. Math: math.js Data Analysis: d3.js Server: node.js (express, koa, hapi) Performance: Tensorflow.js (e.g. GPU accelerated via WebGL API in the browser), Keras.js etc. Read also: 5 JavaScript machine learning libraries you need to know BRIIM is a good collection of materials to get you started as web developer or JavaScript enthusiast in machine learning. In case you’re interested in learning Python instead of Javascript, here are the set of libraries you should pick. Math: numpy Data Analysis: Pandas Data Mining: PySpark Server: Flask, Django Performance: TensorFlow (because it is written with a Python API over a C/C++ engine) or Keras (sits on top of TensorFlow). Using Machine Learning as a service If you don’t want to spend your time learning frameworks, tools, and languages suited for machine learning, you can adopt Machine Learning as a service or MLaaS. These services provide machine learning tools as part of cloud computing services. So basically, you can benefit from machine learning without the allied cost, time and risk of establishing an in-house internal machine learning team. All you need is sufficient knowledge of incorporating APIs. All Machine Learning tasks including data pre-processing, model training, model evaluation, and predictions can be completed through MLaaS. Read also: How machine learning as a service is transforming cloud A large number of companies provide Machine Learning as a service. Most prominent ones include: Amazon Machine Learning Amazon ML makes it easy for web developers to build smart applications using simple APIs. This includes applications for fraud detection, demand forecasting, targeted marketing, and click prediction. They provide a Developer Guide, which provides a conceptual overview of Amazon ML and includes detailed instructions for using the service. They also have a API reference, which describes all the API operations and provides sample requests and responses for supported web service protocols. Azure ML web app templates The web app templates available in the Azure Marketplace can build a custom web app that knows your web service's input data and expected results. All you need to do is give the web app access to your web service and data, and the template does the rest. There are two available templates: Azure ML Request-Response Service Web App Template Azure ML Batch Execution Service Web App Template Each template creates a sample ASP.NET application by using the API URI and key for your web service. The template then deploys the application as a website to Azure. No coding is required to use these templates. You just supply the API key and URI, and the template builds the application for you. Google Cloud based APIs Google also provides machine learning services, with pre-trained models and a service to generate your own tailored models. Google’s Cloud AutoML is a suite of machine learning products that enables developers with limited machine learning expertise to train high-quality models specific to their business needs. Cloud AutoML is used by Disney on their website shopDisney to enhance guest experience through more relevant search results, expedited discovery, and product recommendations. Building Conversational Interfaces As a web developer, another thing you might be looking into, is developing conversational interfaces or chatbots to enhance your web apps. Amazon, Google, and Microsoft provide Machine learning powered tools to help developers with building their own chatbots. Amazon Lex You can embed chatbots in your web apps with the Amazon Lex featuring ASR (Automatic Speech Recognition) and NLP (Natural Language Processing) capabilities. The API can recognize written and spoken text and the Lex interface allows you to hook the recognized inputs to various back-end solutions. Lex currently supports deploying chatbots for Facebook Messenger, Slack, and Twilio. Google Dialogflow Google’s Dialogflow can build voice and text-based conversational interfaces, such as voice apps and chatbots, powered by AI. Dialogflow incorporates Google's machine learning expertise and products such as Google Cloud Speech-to-Text. The API can be tweaked and customized for needed intents using Java, Node.js, and Python. It is also available as an enterprise edition. Microsoft Azure Cognitive Services Microsoft Cognitive Services simplify a variety of AI-based tasks, giving you a quick way to add intelligence technologies to your bots with just a few lines of code. It provides tools and APIs for aiding the development of conversational interfaces. These include: Translator Speech API Bing Speech API to convert text into speech and speech into text Speaker Recognition API for voice verification tasks Custom Speech Service to apply Azure NLP capacities using own data and models Language Understanding Intelligent Service (LUIS) is an API that analyzes intentions in text to be recognized as commands Text Analysis API for sentiment analysis and defining topics Bing Spell Check Translator Text API Web Language Model API that estimates probabilities of words combinations and supports word autocompletion Linguistic Analysis API used for sentence separation, tagging the parts of speech, and dividing texts into labeled phrases Read also: Top 4 chatbot development frameworks for developers These tools should be enough to get your feet off the ground quickly and move into the specific area of machine learning. Ultimately your choice of tool relies on the kind of application you want to build, your level of expertise, and how much time and effort you’re willing to put to learn. Obviously, depending on your area of choice, you would have to do more research and develop yourself in those areas. How should web developers learn machine learning? 5 examples of Artificial Intelligence in Web apps The most valuable skills for web developers to learn in 2018

0
0
13270

article-image-tensorflow-always-tops-machine-learning-artificial-intelligence-tool-surveys

Sunith Shetty

23 Aug 2018

9 min read

Why TensorFlow always tops machine learning and artificial intelligence tool surveys

Sunith Shetty

23 Aug 2018

9 min read

TensorFlow is an open source machine learning framework for carrying out high-performance numerical computations. It provides excellent architecture support which allows easy deployment of computations across a variety of platforms ranging from desktops to clusters of servers, mobiles, and edge devices. Have you ever thought, why TensorFlow has become so popular in such a short span of time? What made TensorFlow so special, that we seeing a huge surge of developers and researchers opting for the TensorFlow framework? Interestingly, when it comes to artificial intelligence frameworks showdown, you will find TensorFlow emerging as a clear winner most of the time. The major credit goes to the soaring popularity and contributions across various forums such as GitHub, Stack Overflow, and Quora. The fact is, TensorFlow is being used in over 6000 open source repositories showing their roots in many real-world research and applications. How TensorFlow came to be The library was developed by a group of researchers and engineers from the Google Brain team within Google AI organization. They wanted a library that provides strong support for machine learning and deep learning and advanced numerical computations across different scientific domains. Since the time Google open sourced its machine learning framework in 2015, TensorFlow has grown in popularity with more than 1500 projects mentions on GitHub. The constant updates made to the TensorFlow ecosystem is the real cherry on the cake. This has ensured all the new challenges developers and researchers face are addressed, thus easing the complex computations and providing newer features, promises, and performance improvements with the support of high-level APIs. By open sourcing the library, the Google research team have received all the benefits from a huge set of contributors outside their existing core team. Their idea was to make TensorFlow popular by open sourcing it, thus making sure all new research ideas are implemented in TensorFlow first allowing Google to productize those ideas. Read Also: 6 reasons why Google open sourced TensorFlow What makes TensorFlow different from the rest? With more and more research and real-life use cases going mainstream, we can see a big trend among programmers, and developers flocking towards the tool called TensorFlow. The popularity for TensorFlow is quite evident, with big names adopting TensorFlow for carrying out artificial intelligence tasks. Many popular companies such as NVIDIA, Twitter, Snapchat, Uber and more are using TensorFlow for all their major operations and research areas. On one hand, someone can make a case that TensorFlow’s popularity is based on its origins/legacy. TensorFlow being developed under the house of “Google” enjoys the reputation of the household name. There’s no doubt, TensorFlow has been better marketed than some of its competitors. Source: The Data Incubator However that’s not the full story. There are many other compelling reasons why small scale to large scale companies prefer using TensorFlow over other machine learning tools TensorFlow key functionalities TensorFlow provides an accessible and readable syntax which is essential for making these programming resources easier to use. The complex syntax is the last thing developers need to know given machine learning’s advanced nature. TensorFlow provides excellent functionalities and services when compared to other popular deep learning frameworks. These high-level operations are essential for carrying out complex parallel computations and for building advanced neural network models. TensorFlow is a low-level library which provides more flexibility. Thus you can define your own functionalities or services for your models. This is a very important parameter for researchers because it allows them to change the model based on changing user requirements. TensorFlow provides more network control. Thus allowing developers and researchers to understand how operations are implemented across the network. They can always keep track of new changes done over time. Distributed training The trend of distributed deep learning began in 2017, when Facebook released a paper showing a set of methods to reduce the training time of a convolutional neural network model. The test was done on RESNET-50 model on ImageNet dataset which took one hour to train instead of two weeks. 256 GPUs spread over 32 servers were used. This revolutionary test has open the gates for many research work which have massively reduced the experimentation time by running many tasks in parallel on multiple GPUs. Google’s distributed TensorFlow has allowed all the researchers and developers to scale out complex distributed training using in-built methods and operations that optimizes distributed deep learning among servers. . Google’s distributed TensorFlow engine which is part of the regular TensorFlow repo, works exceptionally well with the existing TensorFlow’s operations and functionalities. It has allowed exploring two of the most important distributed methods: Distribute the training time of a neural network model over many servers to reduce the training time. Searching for good hyperparameters by running parallel experiments over multiple servers. Google has given distributed TensorFlow engine the required power to steal the share of the market acquired by other distributed projects such as Microsoft’s CNTK, AMPLab's SparkNet, and CaffeOnSpark. Even though the competition is tough, Google has still managed to become more popular when compared to the other alternatives in the market. From research to production Google has, in some ways, democratized deep learning., The key reason is TensorFlow’s high-level APIs making deep learning accessible to everyone. TensorFlow provides pre-built functions and advanced operations to ease the task of building different neural network models. It provides the required infrastructure and hardware which makes them one of the leading libraries used extensively by researchers and students in the deep learning domain. In addition to research tools, TensorFlow extends the services by bringing the model in production using TensorFlow Serving. It is specifically designed for production environments, which provides a flexible, high-performance serving system for machine learning models. It provides all the functionalities and operations which makes it easy to deploy new algorithms and experiments as per changing requirements and preferences. It provides an excellent feature of out-of-the-box integration with TensorFlow models which can be easily extended to serve other types of models and data. TensorFlow’s API is a complete package which is easier to use and read, plus provides helpful operators, debugging and monitoring tools, and deployment features. This has led to growing use of TensorFlow library as a complete package within the ecosystem by the emerging body of students, researchers, developers, production engineers from various fields who are gravitating towards artificial intelligence. There is a TensorFlow for web, mobile, edge, embedded and more TensorFlow provides a range of services and modules within their existing ecosystem making them as one of the ground-breaking end-to-end tools to provide state-of-the-art deep learning. TensorFlow.js for machine learning on the web JavaScript library for training and deploying machine learning models in the browser. This library provides flexible and intuitive APIs to build and train new and pre-existing models from scratch right in the browser or under Node.js. TensorFlow Lite for mobile and embedded ML It is a TensorFlow lightweight solution used for mobile and embedded devices. It is fast since it enables on-device machine learning inference with low latency. It supports hardware acceleration with the Android Neural Networks API. The future releases of TensorFlow Lite will bring more built-in operators, performance improvements, and will support more models to simplify the developer’s experience of bringing machine learning services within mobile devices. TensorFlow Hub for reusable machine learning A library which is used extensively to reuse machine learning models. Thus you can transfer learning by reusing parts of machine learning models. TensorBoard for visual debugging While training a complex neural network model, the computations you use in TensorFlow can be very confusing. TensorBoard makes it very easy to understand and debug your TensorFlow programs in the form of visualizations. It allows you to easily inspect and understand your TensorFlow runs and graphs. Sonnet Sonnet is a DeepMind library which is built on top of TensorFlow extensively used to build complex neural network models. All of this factors have made the TensorFlow library immensely appealing for building a wide spectrum of machine learning and deep learning projects. This tool has become a preferred choice for everyone from space research giant NASA and other confidential government agencies, to an impressive roster of private sector giants. Road Ahead for TensorFlow TensorFlow no doubt is better marketed compared to the other deep learning frameworks. The community appears to be moving very fast. In any given hour, there are approximately 10 people around the world contributing or improving the TensorFlow project on GitHub. TensorFlow dominates the field with the largest active community. It will be interesting to see what new advances TensorFlow and other utilities make possible for the future of our digital world. Continuing the recent trend of rapid updates, the TensorFlow team is making sure they address all the current and active challenges faced by the contributors and the developers while building machine learning and deep learning models. TensorFlow 2.0 will be a major update, we can expect the release candidate by next year early March. The preview version of this major milestone is expected to hit later this year. The major focus will be on ease of use, additional support for more platforms and languages, and eager execution will be the central feature of TensorFlow 2.0. This breakthrough version will add more functionalities and operations to handle current research areas such as reinforcement learning, GANs, building advanced neural network models more efficiently. Google will continue to invest and upgrade their existing TensorFlow ecosystem. According to Google’s CEO, Sundar Pichai “artificial intelligence is more important than electricity or fire.” TensorFlow is the solution they have come up with to bring artificial intelligence into reality and provide a stepping stone to revolutionize humankind. Read more The 5 biggest announcements from TensorFlow Developer Summit 2018 The Deep Learning Framework Showdown: TensorFlow vs CNTK Tensor Processing Unit (TPU) 3.0: Google’s answer to cloud-ready Artificial Intelligence

0
0
14493

article-image-ai-tools-data-scientists-might-not-know

Amey Varangaonkar

22 Aug 2018

8 min read

5 artificial intelligence tools data scientists might not know

Amey Varangaonkar

22 Aug 2018

8 min read

With Artificial Intelligence going mainstream, it is not at all surprising to see the number of tools and platforms for AI development go up as well. Open source libraries such as Tensorflow, Keras and PyTorch are very popular today. Not just those - enterprise platforms such as Azure AI Platform, Google Cloud AI and Amazon Sagemaker are commonly used to build scalable production-grade AI applications. While you might be already familiar with these tools and frameworks, there are quite a few relatively unknown AI tools and services which can make your life as a data scientist much, much easier! In this article, we look at 5 such tools for AI development which you may or may not have heard of before. Wit.ai One of the most popular use-cases of Artificial Intelligence today is building bots that facilitate effective human-computer interaction. Wit.ai, a platform for building these conversational chatbots, finds applications across various platforms, including mobile apps, IoT as well as home automation. Used by over 150,000 developers across the world, this platform gives you the ability to build conversational UI that supports text categorization, classification, sentiment analysis and a whole host of other features. Why you should try this machine learning tool out There are a multitude of reasons why wit.ai is so popular among developers for creating conversational chatbots. Some of the major reasons are: Support for text as well as voice, which gives you more options and flexibility in the way you want to design your bots Support for multiple languages such as Python, Ruby and Node.js which facilitates better integration of your app with the website or the platform of your choice The documentation is very easy to follow Lots of built-in entities to ease the development of your chatbots Intel OpenVINO Toolkit Bringing together two of the most talked about technologies today, i.e. Artificial Intelligence and Edge Computing, we had to include Intel’s OpenVINO Toolkit in this list. Short for Open Visual Inference and Neural Network Optimization, this toolkit brings comprehensive computer vision and deep learning capabilities to the edge devices. It has proved to be an invaluable resource to industries looking to set up smart IoT systems for image recognition and processing using edge devices. The OpenVINO toolkit can be used with the commonly used popular frameworks such as OpenCV, Tensorflow as well as Caffe. It can be configured to leverage the power of the traditional CPUs as well as customized AI chips and FPGAs. Not just that, this toolkit also has support for the Vision Processing Unit, a processor developed specifically for machine vision. Why you should try this AI tool out Allows you to develop smart Computer Vision applications for IoT-specific use-cases Support for a large number of deep learning and image processing frameworks. Also, it can be used with the traditional CPUs as well as customized chips for AI/Computer Vision Its distributed capability allows you to develop scalable applications, which again is invaluable when deployed on edge devices You can know more about OpenVINO’s features and capabilities in our detailed coverage of the toolkit. Apache PredictionIO This one is for the machine learning engineers and data scientists looking to build large-scale machine learning solutions using the existing Big Data infrastructure. Apache PredictionIO is an open source, state-of-the-art Machine Learning server which can be easily integrated with the popular Big Data tools such as Apache Hadoop, Apache Spark and Elasticsearch to deploy smart applications. Source: PredictionIO System architecture As can be seen from the architecture diagram above, PredictionIO has modules that interact with the different components of the Big Data system and uses an App Server to communicate the results of the analysis to the outside devices. Why you should try this machine learning tool out Let’s you build production-ready models which can also be deployed as web services You can also leverage the machine learning capabilities of Apache Spark to build large-scale machine learning models Pre-built performance evaluation measures available to check the accuracy of your predictive models Most importantly, this tool helps you simplify your Big Data infrastructure without adding too many complexities IBM Snap ML A machine learning library that is 46 times faster than Tensorflow. If that’s not a reason to start using IBM’s Snap ML, what is? IBM have been taking some giant strides in the field of AI research in a bid to compete with the heavyweights in this space - mainly Google, Microsoft and Amazon. With Snap ML, they seem to have struck a goldmine. A library that can be used for high-speed machine learning models using the cutting edge CPU/GPU technology, Snap ML allows for agile development of models while scaling to process massive datasets. Why you should try this machine learning tool out It is insanely fast. Snap ML was used to train a logistic regression classifier on a terabyte-scale dataset in just under 100 seconds. It allows for GPU acceleration to avoid large data transfer overheads. With the enhanced GPU technology available today, Snap ML is one of the best tools you can have at your disposal to train models quickly and efficiently It allows for distributed model training and works on sparse data structures as well You should definitely check out our detailed coverage of Snap ML where we go into the depth of its features and understand why this is a very special tool. Crypto-ML It is common knowledge that cryptocurrency, especially Bitcoin, can be traded more efficiently and profitably by leveraging the power of machine learning. Large financial institutions and trading firms have been using the machine learning tools to great effect. However, it’s the individuals, on the other hand, who have relied on historical data and outdated techniques to forecast the trends. All that has now changed, thanks to Crypto-ML. Crypto-ML is a cryptocurrency trading platform designed specifically for individuals who want to get the most out of their investments in the most reliable, error-free ways. Using state-of-the-art deep learning techniques, Crypto-ML uses historical data to build models that predict future price movement. At the same time, it eliminates any human error or mistakes arising out of emotions. Why you should try this machine learning tool out No expertise in cryptocurrency trading is required if you want to use this tool Crypto-ML only makes use of historical data and builds data models to predict future prices without any human intervention Per the Crypto-ML website, the average gain on winning trades is close to 53%, whereas the average loss on losing trades is just close to 6%. If you are a data scientist or a machine learning developer with an interest in finance and cryptocurrency, this platform can also help you customize your own models for efficient trading. Here’s where you can read on how Crypto-ML works, in more detail. Other notable mentions Apart from the tools we mentioned above, there are a quite a few other tools that could not make it to the list, but deserve a special mention. Some of them are: ABBYY’s Real-time Recognition SDK for document recognition, language processing and data capturing is worth checking out. Vertex.ai’s PlaidML is an open source tool that allows you to build smart deep learning models across a variety of platforms. It leverages the power of Tile, a new machine learning language that facilitates tensor manipulation. Facebook recently open sourced MUSE, a Python library for efficient word embedding and other NLP tasks. This one’s worth keeping an eye on for sure! If you’re interested in browser-based machine learning, MachineLabs recently open sourced the entire code base of their machine learning platform. NVIDIA’s very own NVVL, their open source offering that provides GPU-accelerated video decoding for training deep learning models The vast ecosystem of tools and frameworks available for building smart, intelligent use-cases across various domains just points to the fact that AI is finding practical applications with every passing day. It is not an overstatement anymore to suggest that that AI is slowly becoming indispensable to businesses. This is not the end of it by any means either - expect to see more such tools spring to life in the near future, with some having game-changing, revolutionary consequences. So which tools are you planning to use for your machine learning / AI tasks? Is there any tool we missed out? Let us know! Read more Predictive Analytics with AWS: A quick look at Amazon ML Four interesting Amazon patents in 2018 that use machine learning, AR, and robotics How to earn $1m per year? Hint: Learn machine learning

0
1
4204

article-image-5-examples-of-artificial-intelligence-in-web-apps

Sugandha Lahoti

20 Aug 2018

7 min read

5 examples of Artificial Intelligence in Web apps

Sugandha Lahoti

20 Aug 2018

7 min read

Modern day web app development is increasingly focused on building a customer-facing front-end presence with the use of Artificial Intelligence. Web apps, use Artificial Intelligence not just for intelligent automation, but also for building recommendation engines, website implementation, and image recognition, among other application areas. In this post, we look at five key areas, illustrated by real-world examples, where web apps are employing Artificial intelligence to automate some part of their system. Recommendation Engines of Amazon and Netflix Curating content based on the user’s context is one of the most widely used AI features in web apps. Amazon, for instance, uses item-based collaborative filtering for product classification. Amazon’s recommendation system uses a combination of goods-based recommendation (users are recommended for those similar to what they liked in the past) and buddy-based recommendation (users are recommended things which their Facebook friends like.) Not just for their recommendation system, Amazon has been using AI for multiple tasks. Their AI Management Strategy is called The Flywheel, where one part of Amazon acts as a catalyst for AI and machine learning growth in other areas. Read more: Four interesting Amazon patents in 2018 that use machine learning, AR, and robotics Another popular example is Netflix, who revamped their recommendation algorithm based on visual impressions. One of their research projects indicated that the artwork was not only the biggest influencer to a viewer's decision to watch content, but it also drew over 82% of their focus while browsing Netflix. This made them develop a new image recommendation algorithm which works in real time to project the image it thinks the user will respond to. They use implicit (user behavior) and Explicit data (user activity) and then feed this data to machine learning algorithms to figure out the relevant content for each user. For each title, users get the image with the highest rank based on their profile. Side by side, it continues collecting data from its 100 million other subscribers to improve its engine’s performance. Read more: What software stack does Netflix use? Google and Microsoft using Image recognition Image recognition can serve multiple uses for web apps including object and pattern recognition, locating duplicates (exact or partial), image search by fragments, and more. Two such unique applications of image recognition are Google’s Quickdraw and Microsoft’s Captionbot.ai. Quick Draw is Google’s AI-powered web app game, where users have to draw an everyday object that a neural network tries to recognize. Players are given 20 seconds to draw a random item, and Google’s neural network tries to match it with other 50 million hand-drawn sketches by other players to identify the correct one. Quickdraw aims to generate the world’s largest doodling data set, which is shared publicly to help further machine learning research. The data preserves user privacy by collecting only anonymous metadata, including timestamp, country code, whether or not the drawing was recognized, and which word the drawing corresponded to. This dataset was used in SketchRNN, a neural network that can draw words and interpolate between drawings. Another image recognition web app is Microsoft’s Captionbot.ai. The system can automatically generate a caption for an uploaded photograph. Users can rate how accurately it has detected what was on display. The algorithm learns from the rating, to make the captions more accurate. It uses three separate services to process the images. The Computer Vision API identifies the components of the photo, then mixes it with data from the Bing Image API, and runs any faces it spots through Emotion API. The Emotion API analyses facial expressions to detect anger, contempt, disgust, fear, and other traits. Based on the results from these APIs, the caption is generated. Google Docs powered by Natural Language Processing Modern Web apps can also be fueled with cognitive capabilities to make them stand apart from other apps. Instances of this include transforming human speech to text or conversing with people in natural language. One such example of a web app which includes natural language processing is Google Docs. Google Docs and Slides have an Explore feature to show text, images, and other features relevant to the document that a user is working on at any given point. Docs can also use natural language to search through data and reports, and automatically generate formulas in Sheets. Google Docs recently incorporated an AI grammar checker, announced at Google Cloud Next. It uses a machine translation algorithm to recognize errors and suggest corrections as users type. Google Docs can also be integrated with Natural Language API to recognize the sentiment of selected text in a Google Doc and highlight it based on that sentiment. Web-based artificial intelligence Chatbots Web-based chatbots are just like app-based chatbots albeit they interact with users in the website browser. They use AI techniques such as natural language understanding and pattern recognition to store and distinguish between the context of the information provided and elicit a suitable response for future replies. An example of web-based chatbots are the Live Chat bots where the conversation with a visitor on a website is automated using a chatbot. Many live chat software companies are already experimenting with chatbots. Examples include the Operator bot used by Intercom, a company building customer messaging platform or Driftbot by Drift which gives your website a personal assistant. Read More: Top 4 chatbot development frameworks for developers Another example, are AI based chatbots which help in creating full websites. Right Click is a startup that introduced an A.I.-powered chatbot which uses Artificial Intelligence in a conversational interface to create websites. It asks general questions during the conversation like “What industry you belong to?” and “Why do you want to make a website?” and creates customized templates as per the given answers. Similarly, Wix’s Artificial Intelligence Design bot can tailor websites by learning about each person’s or business’ own needs. Web-based code helpers using AI Intelligent coding assistants are gaining popularity with their ability to understand the code and provide right suggestions at the right time. They can analyze code on the web and give fast and smart completions. Codota for Chrome is a smart web-based IDE which can build predictive models of code and suggest code completions and related content based on the current context present in the code. It combines program analysis, natural language processing, and machine learning to learn from the code. Users can look for Codota’s Icon on every code snippet on their browsers - in GitHub, StackOverflow and others. Another example is Deep Cognition’s Deep Learning Studio – Cloud. It is not exactly an IDE, but it features AI-powered drag & drop interface to help design deep learning models with ease. It features assisted modeling, for automated tensor size calculations and real-time validation. It also has AutoML feature to automatically build a neural network. [dropcap]E[/dropcap]ven though AI is a great choice to enhance your web apps, an important facet to keep in mind is ensuring fairness, accuracy, and transparency of your web apps. For instance, web apps powered by natural language should not discriminate people based on caste, color, or creed or hurt user sentiments. Similarly, those using neural networks for recognizing images should ensure the filtering of obscene images. Creating such types of artificial intelligence systems would require a hybrid of designers, programmers, ML engineers, and researchers. This collective group will have a good grasp of user experience, will be comfortable thinking in abstracts and algorithms, and equally well versed with the social impacts of artificial intelligence. Read More: 20 lessons on bias in machine learning systems by Kate Crawford at NIPS 2017 Uber introduces Fusion.js, a plugin-based web development framework for high-performance apps. Electron Fiddle: A ‘code playground’ for experimenting with cross-platform native apps. Warp: Rust’s new web framework for implementing WAI (Web Application Interface)

0
0
27376

article-image-tackle-trolls-machine-learning-filtering-inappropriate-content

Amarabha Banerjee

15 Aug 2018

4 min read

Tackle trolls with Machine Learning bots: Filtering out inappropriate content just got easy

Amarabha Banerjee

15 Aug 2018

4 min read

The most feared online entities in the present day are trolls. Trolls, a fearsome bunch of fake or pseudo online profiles, tend to attack online users, mostly celebrities, sports person or political profiles using a wide range of methods. One of these methods is to post obscene or NSFW (Not Safe For Work) content on your profile or website where User Generated Content (USG) is allowed. This can create unnecessary attention and cause legal troubles for you too. The traditional way out is to get a moderator (or a team of them). Let all the USGs pass through this moderation system. This is a sustainable solution for a small platform. But if you are running a large scale app, say a publishing app where you publish one hundred stories a day, and the success of these stories depend on the user interaction with them, then this model of manual moderation becomes unsustainable. More the number of USGs, more is the turn-around time, larger the moderation team size. This results in escalating costs, for a purpose that’s not contributing to your business growth in any manner. That’s where Machine Learning could help. Machine Learning algorithms that can scan images and content for possible abusive or adult content is a better solution that manual moderation. Tech giants like Microsoft, Google, Amazon have a ready solution for this. These companies have created APIs which are commercially available for developers. You can incorporate these APIs in your application to weed out the filth served by the trolls. The different APIs available for this purpose are Microsoft moderation, Google Vision, AWS Rekognition & Clarifai. Dataturks have made a comparative study on using these APIs on one particular dataset to measure their efficiency. They used a YACVID dataset with 180 images, manually labelled 90 of these images as nude and the rest as non-nude. The dataset was then fed to the 4 APIs mentioned above, their efficiency was tested based on the following parameters. True Positive (TP): Given a safe photo, the API correctly says so False Positive (FP): Given an explicit photo but the API incorrectly classifies it as safe. False negative (FN): Given a safe photo but the API is not able to detect so and True negative(TN): Given an explicit photo and the API correctly says so. TP and TN are two cases which meant the system behaved correctly. An FP meant that the app was vulnerable to attacks from trolls, FN meant the efficiency of the systems were low and hence not practically viable. 10% of the cases would be such that the API can’t decide whether its explicit or not. Those would be sent for manual moderation. This would bring down the maintenance cost of the moderation team. The results that they received are shown below: Source: Dataturks As it is evident from the above table, the best standalone API is Google vision with a 99% accuracy and 94% recall value. Recall value implies that if the same images are repeated, it can recognize them with 94% precision. The best results however were received with the combination of Microsoft and Google. The comparison of the response times are mentioned below: Source: dataturks The response time might have been affected with the fact that all the images accessed by the APIs were stored in Amazon S3. Hence AWS API might have had an unfair advantage on the response time. The timings were noted for 180 image calls per API. The cost is the lowest for AWS Rekognition - $1 for 1000 calls to the API. It’s $1.2 for Clarifai, $1.5 for both Microsoft and Google. The one notable drawback of the Amazon API was that the images had to be stored as S3 objects, or converted into that. All the other APIs accepted any web links as possible source of images. What this study says is that the power of filtering out negative and explicit content in your app is much easier now. You might still have to have a small team of moderators, but their jobs will be made a lot easier with the ML models implemented in these APIs. Machine Learning is paving the way for us to be safe from the increasing menace of Trolls, a threat to free speech and open sharing of ideas which were the founding stones of internet and the world wide web as a whole. Will this discourage Trolls from continuing their slandering or will it create a counter system to bypass the APIs and checks? We can only know in time. Facebook launches a 6-part Machine Learning video series Google’s new facial recognition patent uses your social network to identify you! Microsoft’s Brad Smith calls for facial recognition technology to be regulated

0
0
4469

article-image-predictive-analytics-with-amazon-ml

Natasha Mathur

09 Aug 2018

9 min read

Predictive Analytics with AWS: A quick look at Amazon ML

Natasha Mathur

09 Aug 2018

9 min read

0
0
6914

article-image-write-python-code-or-pythonic-code

Aaron Lazar

08 Aug 2018

5 min read

Do you write Python Code or Pythonic Code?

Aaron Lazar

08 Aug 2018

5 min read

If you’re new to Programming, and Python in particular, you might have heard the term Pythonic being brought up at tech conferences, meetups and even at your own office. You might have also wondered why the term and whether they’re just talking about writing Python code. Here we’re going to understand what the term Pythonic means and why you should be interested in learning how to not just write Python code, rather write Pythonic code. What does Pythonic mean? When people talk about pythonic code, they mean that the code uses Python idioms well, that it’s natural or displays fluency in the language. In other words, it means the most widely adopted idioms that are adopted by the Python community. If someone said you are writing un-pythonic code, they might actually mean that you are attempting to write Java/C++ code in Python, disregarding the Python idioms and performing a rough transcription rather than an idiomatic translation from the other language. Okay, now that you have a theoretical idea of what Pythonic (and unpythonic) means, let’s have a look at some Pythonic code in practice. Writing Pythonic Code Before we get into some examples, you might be wondering if there’s a defined way/method of writing Pythonic code. Well, there is, and it’s called PEP 8. It’s the official style guide for Python. Example #1 x=[1, 2, 3, 4, 5, 6] result = [] for idx in range(len(x)); result.append(x[idx] * 2) result Output: [2, 4, 6, 8, 10, 12] Consider the above code, where you’re trying to multiply some elements, “x” by 2. So, what we did here was, we created an empty list to store the results. We would then append the solution of the computation into the result. The result now contains a function which is 2 multiplied by each of the elements. Now, if you were to write the same code in a Pythonic way, you might want to simply use list comprehensions. Here’s how: x=[1, 2, 3, 4, 5, 6] [(element * 2) for element in x] Output: [2, 4, 6, 8, 10] You might have noticed, we skipped the entire for loop! Example #2 Let’s make the previous example a bit more complex, and place a condition that the elements should be multiplied by 2 only if they are even. x=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] result = [] for idx in range(len(x)); if x[idx] % 2 == 0; result.append(x[idx] * 2) else; result.append(x[idx]) result Output: [1, 4, 3, 8, 5, 12, 7, 16, 9, 20] We’ve actually created an if else statement to solve this problem, but there is a simpler way of doing things the Pythonic way. [(element * 2 if element % 2 == 0 else element) for element in x] Output: [1, 4, 3, 8, 5, 12, 7, 16, 9, 20] If you notice what we’ve done here, apart from skipping multiple lines of code, is that we used the if-else statement in the same sentence. Now, if you wanted to perform filtering, you could do this: x=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [element * 2 for element in x if element % 2 == 0] Output: [4, 8, 12, 16, 20] What we’ve done here is put the if statement after the for declaration, and Voila! We’ve achieved filtering. If you’re using a nice IDE like Jupyter Notebooks or PyCharm, they will help you format your code as per the PEP 8 suggestions. Why should you write Pythonic code? Well firstly, you’re saving loads of time writing humongous piles of cowdung code, so you’re obviously becoming a smarter and more productive programmer. Python is a pretty slow language, and when you’re trying to do something in Python, which is acquired from another language like Java or C++, you’re going to worsen things. With idiomatic, Pythonic code, you’re improving the speed of your programs. Moreover, idiomatic code is far easier to comprehend and understand for other developers who are working on the same code. It helps a great deal when you’re trying to refactor someone else’s code. Fearing Pythonic idioms Well, I don’t mean the idioms themselves are scary. Rather, quite a few developers and organisations have begun discriminating on the basis of whether someone can or cannot write Pythonic code. This is wrong, because, at the end of the day, though the PEP 8 exists, the idea of the term Pythonic is different for different people. To some it might mean picking up a new style guide and improving the way you code. To others, it might mean being succinct and not repeating themselves. It’s time we stopped judging people on whether they can or can’t write Pythonic code and instead, we should appreciate when someone is able to present readable, easily maintainable and succinct code. If you find them writing a bit of clumsy code, you can choose to talk to them about improving their design considerations. And the world will be a better place! If you’re interested in learning how to write more succinct and concise Python code, check out these resources: Learning Python Design Patterns - Second Edition Python Design Patterns [Video] Python Tips, Tricks and Techniques [Video]

0
2
21755

article-image-amazon-patents-2018-machine-learning-ar-robotics

Natasha Mathur

06 Aug 2018

7 min read

Four interesting Amazon patents in 2018 that use machine learning, AR, and robotics

Natasha Mathur

06 Aug 2018

7 min read

"There are two kinds of companies, those that work to try to charge more and those that work to charge less. We will be the second."-- Jeff Bezos, CEO Amazon When Jeff Bezos launched Amazon.com in 1994, it was an online bookselling site. This was during a time when bookstores such as Barnes & Noble, Waldenbooks and Crown Books were the leading front runners in the bookstore industry in the American shopping malls. Today, Amazon’s name has become almost synonymous with online retail for most people and has now spread its wings to cloud computing, electronics, tech gadgets and the entertainment world. With market capitalization worth $897.47B as of August 3rd 2018, it’s hard to believe that there was a time when Amazon sold only books. Amazon is constantly pushing to innovate and as new inventions come to shape, there are “patents” made that helps the company have a competitive advantage over technologies and products in order to attract more customers. [box type="shadow" align="" class="" width=""]According to United States Patent and Trademark Office (USPTO), Patent is an exclusive right to invention and “the right to exclude others from making, using, offering for sale, or selling the invention in the United States or “importing” the invention into the United States”.[/box] As of March 20, 2018, Amazon owned 7,717 US patents filed under two business entities, Amazon Technologies, Inc. (7,679), and Amazon.com, Inc (38). Looking at the chart below, you can tell that Amazon Technologies, Inc., was one among the top 15 companies in terms of number of patents granted in 2017. Top 15 companies, by number of patents granted by USPTO, 2017 Amazon competes closely with the world’s leading tech giants in terms of patenting technologies. The below table only considers US patents. Here, Amazon holds only few US patents than IBM, Microsoft, Google, and Apple. Number of US Patents Containing Emerging-Technology Keywords in Patent Description Some successfully patented Amazon innovations in 2018 There are thousands of inventions that Amazon is tied up with and for which they have filed for patents. These include employee surveillance AR goggles, a real-time accent translator, robotic arms tossing warehouse items, one-click buying, drones,etc. Let’s have a look at these remarkable innovations by Amazon. AR goggles for improving human-driven fulfillment (or is it to track employees?) Date of Patent: August 2, 2018 Filed: March 20, 2017 Assignee: Amazon Technologies, Inc. AR Goggles Features: Amazon has recently patented a pair of augmented reality goggles that could be used to keep track of its employees.The patent is titled “Augmented Reality User interface facilitating fulfillment.” As per the patent application, the application is a wearable computing device such as augmented reality glasses that are worn on user’s head. The user interface is rendered upon one or more lenses of the augmented reality glasses and it helps to show the workers where to place objects in Amazon's fulfillment centers. There’s also a feature in the AR glasses which provides workers with turn-by-turn directions to the destination within the fulfillment centre. This helps them easily locate the destination as all the related information gets rendered on the lenses. AR Goggles steps The patent has received criticism over concerns that this application might hamper the privacy of employees within the warehouses, tracking employees’ every single move. However, Amazon has defended the application by saying that it has got nothing to do with “employee surveillance”. As this is a patent, there’s no guarantee if it will actually hit the market. Robotic arms that toss warehouse items Date of Patent: July 17, 2018 Filed: September 29, 2015 Assignee: Amazon Technologies, Inc. Features: Amazon won a patent titled “Robotic tossing of items in inventory system” last month. As per the patent application, “Robotic arms or manipulators can be used to toss inventory items within an inventory system. Tossing strategies for the robotic arms may include information about how a grasped item is to be moved and released by a robotic arm to achieve a trajectory for moving the item to a receiving location”. Robotic Arms Utilizing a robotic arm to toss an item to a receiving location can help improve throughput through the inventory system. This is possible as the robotic arms will help with reducing the amount of time that may otherwise be spent on placing a grasped item directly onto a surface for receiving the item. “The tossing strategy may be based at least in part upon a database containing information about the item, characteristics of the item, and/or similar items, such as information indicating tossing strategies that have been successful or unsuccessful for such items in the past,” the patent reads. Robotic Arms Steps Amazon’s aim with this is to eliminate the challenges faced by modern inventory systems like supply chain distribution centers, airport luggage systems, etc, while responding to requests for inventory items. The patent received criticism over the concern that one of the examples in the application was a dwarf figurine and could possibly mock people of short stature. But, according to Amazon, “The intention was simply to illustrate a robotic arm moving products, and it should not be taken out of context.” Real-time accent translator Date of Patent: June 21, 2018 Filed: December 21, 2016 Assignee: Amazon Technologies, Inc. Features: Amazon won a patent for an audio system application, titled “Accent translation” back in June this year, which will help with translating the accent of the speaker to the listener’s accent. The aim with this app is to get rid of the possible communication barriers which may arise due to different accents as they can be difficult to understand at times. Accent translation system The accent translation system collects a number of audio samples from different sources such as phone call, television, movies, broadcasts, etc. Each audio sample will have its association with at least one of the accent sample sets present in its database. For instance, german accent will be associated with the german accent sample set. Accent translation system steps In a two-party dialog, acquired audio is analyzed and if it associates with one among a wide range of saved accents then the audio from both the sides is outputted based on the accent of the opposite party. The possibilities with this application are endless. One major use case is the customer care industry where people have to constantly talk to different people with different accents. Drone that uses Human gestures and voice commands Date of Patent: March 20, 2018 Filed: July 18, 2016 Assignee: Amazon Technologies, Inc. Features: Amazon patented for a drone, titled “Human interaction with unmanned aerial vehicles”, earlier this year, that would use human gestures and voice commands for package delivery. Amazon Drone makes use of propulsion technology which will help with managing the speed, trajectory, and direction of the drone. Drones As per the patent application, “an unmanned aerial vehicle is provided which includes propulsion device, sensor device and a management system. The management system is configured to receive human gestures via the sensor device and in response, instruct the propulsion device to affect and adjustment to the behavior of the unnamed aerial vehicle. Human gestures include-- visible gestures, audible gestures, and other gestures capable of recognition by the unmanned vehicle”. Working structure of drones The concept for drones started when Amazon CEO, Jeff Bezos, promised, back in 2013, that the company aims to make 30-minute deliveries, of packages up to 2.25 kgs or 5 pounds. Amazon’s patents are a clear indication of its efforts and determination for inventing cutting-edge technologies for optimizing its operations so that it can pass on the benefits to its customers in the form of competitively priced product offerings. As Amazon has been putting its focus on machine learning, the drones and robotic arms will make the day-to-day tasks of the facility workers easier and more efficient. In fact, Amazon has stepped up its game big time and is incorporating Augmented reality, with its AR glasses to further scale efficiencies. The real-time accent translators help eliminate the communication barriers, making Amazon cover a wide range of areas and perhaps provide a seamless customer care experience in the coming days. Amazon Echo vs Google Home: Next-gen IoT war Amazon is selling facial recognition technology to police

0
3
7304

article-image-5g-mobile-data-propel-artificial-intelligence

Neil Aitken

02 Aug 2018

7 min read

How 5G Mobile Data will propel Artificial Intelligence (AI) progress

Neil Aitken

02 Aug 2018

7 min read

Like it’s predecessors, 3G and 4G, 5G refers to the latest ‘G’ – Generation – of mobile technology. 5G will give us very fast - effectively infinitely fast - mobile data download bandwidth. Downloading a TV show to your phone over 5G, in its entirety, in HD, will take less than a second, for example. A podcast will be downloaded within a fraction of a second of you requesting it. Scratch the surface of 5G, however, and there is a great deal more to see than just fast mobile data speeds. 5G is the backbone on which a lot of emerging technologies such as AI, blockchains, IoT among others will reach mainstream adoption. Today, we look at how 5G will accelerate AI growth and adoption. 5G will create the data AI needs to thrive One feature of 5G with ramifications beyond data speed is ‘Latency.’ 5G offers virtually ‘Zero Latency’ as a service. Latency is the time needed to transmit a packet of data from one device to another. It includes the period of time between when the request was made, to the time the response is completed. [caption id="attachment_21251" align="aligncenter" width="580"] 5G will be superfast – but will also benefit from near zero ‘latency’[/caption] Source: Economist At the moment, we keep files (music, pictures or films) in our phones’ memory permanently. We have plenty of processing power on our devices. In fact, the main upgrade between phone generations these days is a faster processor. In a 5G world, we will be able to use cheap parts in our devices – processors and memory in our new phones. Data downloads will be so fast, that we can get them immediately when we need them. We won’t need to store information on the phone unless we want to. Even if the files are downloaded from the cloud, because the network has zero latency – he or she feels like the files are on the phone. In other words, you are guaranteed a seamless user experience in a 5G world. The upshot of all this is that the majority of any new data which is generated from mobile products will move to the cloud for storage. At their most fundamental level, AI algorithms are pattern matching tools. The bigger the data trove, the faster and better performing the results of AI analysis is. These new structured data sets, created by 5G, will be available from the place where it is easiest to extract and manipulate (‘Analyze’) it – the cloud. There will be 100 billion 5G devices connected to cellular networks by 2025, according to Huawei. 5G is going to generate data from those devices, and all the smartphones in the world and send it all back to the cloud. That data is the source of the incredible power AI gives businesses. 5G driving AI in autonomous vehicles 5G’s features and this Cloud / Connected Device future, will manifest itself in many ways. One very visible example is how 5G will supercharge the contribution, especially to reliability and safety, that AI can make to self driving cars. A great deal of the AI processing that is required to keep a self driving car operating safely, will be done by computers on board the vehicle. However, 5G’s facilities to communicate large amounts of data quickly will mean that any unusual inputs (for example, the car is entering or in a crash situation) can be sent to bigger computing equipment on the cloud which could perform more serious processing. Zero latency is important in these situations for commands which might come from a centralized accident computer, designed to increase safety– for example issuing the command ‘break.’ In fact, according to manufacturers, it’s likely that, ultimately, groups of cars will be coordinated by AI using 5G to control the vehicles in a model known as swarm computing. 5G will make AI much more useful with ‘context’ - Intel 5G will power AI by providing location information which can be considered in establishing the context of questions asked of the tool – according to Intel’s Data Center Group. For example, asking your Digital Assistant where the tablets are means something different depending on whether you’re in a pharmacy or an electronics store. The nature of 5G is that it’s a mobile service. Location information is both key to context and an inherent element of information sent over a 5G connection. By communicating where they are, 5G sensors will help AI based Digital Assistants solve our everyday problems. 5G phones will enable AI calculations on ‘Edge’ network devices - ARM 5G will push some processing to the ‘Edge’ of the network, for manipulation by a growing range of AI chips on to the processors of phones. In this regard, smartphones like any Internet Of Things connected processor ‘in the field’ are simply an ‘AI platform’. Handset manufacturers are including new software features in their phones that customers love to use – including AI based search interfaces which allow them to search for images containing ‘heads’ and see an accurate list. [caption id="attachment_21252" align="aligncenter" width="1918"] Arm are designing new types of chips targeted at AI calculations on ‘Edge’ network devices.[/caption] Source: Arm's Project Trillium ARM, one of the world’s largest CPU producers are creating specific, dedicated AI chip sets, often derived from the technology that was behind their Graphics Processing Units. These chips process AI based calculations up to 50 times faster than standard microprocessors already and their performance is set to improve 50x over the next 3 years, according to the company. AI is part of 5G networks - Huawei Huawei describes itself as an AI company (as well as a number of other things including handset manufacturer.) They are one of the biggest electronic manufacturers in China and are currently in the process of selling networking products to the world’s telecommunications companies, as they prepare to roll out their 5G networks. Based on the insight that 70% of network system downtime comes from human error, Huawei is now eliminating humans from the network management component of their work, to the degree that they can. Instead, they’re implementing automated AI based predictive maintenance systems to increase data throughput across the network and reduce downtime. The way we use cellular networks is changing. Different applications require different backend traffic to be routed across the network, depending on the customer need. Someone watching video, for example, has a far lower tolerance for a disruption to the data throughput (the ‘stuttering Netflix’ effect) than a connected IoT sensor which is trying to communicate the temperature of a thermometer. Huawei’s network maintenance AI software optimizes these different package needs, maintaining the near zero latency that the standard demands at a lower cost. AI based network maintenance complete a virtuous loop in which 5G devices on new cellular networks give AI the raw data they need, including valuable context information, and AI helps the data flow across the 5G network better. Bringing it all together 5G and artificial intelligence (AI) are revolutionary technologies that will evolve alongside each other. 5G isn’t just fast data, it’s one of the most important technologies ever devised. Just as the smartphone did, it will fundamentally change how we relate to information, partly, because it will link us to thousands of newly connected devices on the Internet of Things. Ultimately, it could be the secondary effects of 5G, the network’s almost zero latency, which could provide the largest benefit – by creating structured data sets from billions of connected devices, in an easily accessible place – the cloud which can be used to fuel the AI algorithms which run on them. Networking equipment, chip manufacturers and governments have all connected the importance of AI with the potential of 5G. Commercial sales of 5G start in The US, UK and Australia in 2019. 7 Popular Applications of Artificial Intelligence in Healthcare Top languages for Artificial Intelligence development Cognitive IoT: How Artificial Intelligence is remoulding Industrial and Consumer IoT

0
0
3897

article-image-top-automl-libraries-for-building-ml-pipelines

Sunith Shetty

01 Aug 2018

9 min read

Top AutoML libraries for building your ML pipelines

Sunith Shetty

01 Aug 2018

9 min read

What is AutoML? When talking about AutoML we mostly refer to automated data preparation (namely feature preprocessing, generation, and selection) and model training (model selection and hyperparameter optimization). The number of possible options for each step of this process can vary vastly depending on the problem type. AutoML allows researchers and practitioners to automatically build ML pipelines out of the possible options for every step to find high-performing ML models for a given problem. AutoML libraries carefully set up experiments for various ML pipelines, which covers all the steps from data ingestion, data processing, modeling, and scoring. In this article we deal with understanding what AutoML is and cover popular AutoML libraries with practical examples. This article is an excerpt from a book written by Sibanjan Das, Umit Mert Cakmak titled Hands-On Automated Machine Learning. Overview of AutoML libraries There are many popular AutoML libraries, and in this section you will get an overview of commonly used ones in the data science community. Featuretools Featuretools is a good library for automatically engineering features from relational and transactional data. The library introduces the concept called Deep Feature Synthesis (DFS). If you have multiple datasets with relationships defined among them such as parent-child based on columns that you use as unique identifiers for examples, DFS will create new features based on certain calculations, such as summation, count, mean, mode, standard deviation, and so on. Let's go through a small example where you will have two tables, one showing the database information and the other showing the database transactions for each database: import pandas as pd # First dataset contains the basic information for databases. databases_df = pd.DataFrame({"database_id": [2234, 1765, 8796, 2237, 3398], "creation_date": ["2018-02-01", "2017-03-02", "2017-05-03", "2013-05-12", "2012-05-09"]}) databases_df.head() You get the following output: The following is the code for the database transaction: # Second dataset contains the information of transaction for each database id db_transactions_df = pd.DataFrame({"transaction_id": [26482746, 19384752, 48571125, 78546789, 19998765, 26482646, 12484752, 42471125, 75346789, 16498765, 65487547, 23453847, 56756771, 45645667, 23423498, 12335268, 76435357, 34534711, 45656746, 12312987], "database_id": [2234, 1765, 2234, 2237, 1765, 8796, 2237, 8796, 3398, 2237, 3398, 2237, 2234, 8796, 1765, 2234, 2237, 1765, 8796, 2237], "transaction_size": [10, 20, 30, 50, 100, 40, 60, 60, 10, 20, 60, 50, 40, 40, 30, 90, 130, 40, 50, 30], "transaction_date": ["2018-02-02", "2018-03-02", "2018-03-02", "2018-04-02", "2018-04-02", "2018-05-02", "2018-06-02", "2018-06-02", "2018-07-02", "2018-07-02", "2018-01-03", "2018-02-03", "2018-03-03", "2018-04-03", "2018-04-03", "2018-07-03", "2018-07-03", "2018-07-03", "2018-08-03", "2018-08-03"]}) db_transactions_df.head() You get the following output: The code for the entities is as follows: # Entities for each of datasets should be defined entities = { "databases" : (databases_df, "database_id"), "transactions" : (db_transactions_df, "transaction_id") } # Relationships between tables should also be defined as below relationships = [("databases", "database_id", "transactions", "database_id")] print(entities) You get the following output for the preceding code: The following code snippet will create feature matrix and feature definitions: # There are 2 entities called ‘databases’ and ‘transactions’ # All the pieces that are necessary to engineer features are in place, you can create your feature matrix as below import featuretools as ft feature_matrix_db_transactions, feature_defs = ft.dfs(entities=entities, relationships=relationships, target_entity="databases") The following output shows some of the features that are generated: You can see all feature definitions by looking at the following features_defs: feature_defs The output is as follows: This is how you can easily generate features based on relational and transactional datasets. Auto-sklearn Scikit-learn has a great API for developing ML models and pipelines. Scikit-learn's API is very consistent and mature; if you are used to working with it, auto-sklearn will be just as easy to use since it's really a drop-in replacement for scikit-learn estimators. Let's see a little example: # Necessary imports import autosklearn.classification import sklearn.model_selection import sklearn.datasets import sklearn.metrics from sklearn.model_selection import train_test_split # Digits dataset is one of the most popular datasets in machine learning community. # Every example in this datasets represents a 8x8 image of a digit. X, y = sklearn.datasets.load_digits(return_X_y=True) # Let's see the first image. Image is reshaped to 8x8, otherwise it's a vector of size 64. X[0].reshape(8,8) The output is as follows: You can plot a couple of images to see how they look: import matplotlib.pyplot as plt number_of_images = 10 images_and_labels = list(zip(X, y)) for i, (image, label) in enumerate(images_and_labels[:number_of_images]): plt.subplot(2, number_of_images, i + 1) plt.axis('off') plt.imshow(image.reshape(8,8), cmap=plt.cm.gray_r, interpolation='nearest') plt.title('%i' % label) plt.show() Running the preceding snippet will give you the following plot: Splitting the dataset to train and test data: # We split our dataset to train and test data X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1) # Similarly to creating an estimator in Scikit-learn, we create AutoSklearnClassifier automl = autosklearn.classification.AutoSklearnClassifier() # All you need to do is to invoke fit method to start experiment with different feature engineering methods and machine learning models automl.fit(X_train, y_train) # Generating predictions is same as Scikit-learn, you need to invoke predict method. y_hat = automl.predict(X_test) print("Accuracy score", sklearn.metrics.accuracy_score(y_test, y_hat)) # Accuracy score 0.98 That was easy, wasn't it? MLBox MLBox is another AutoML library that supports distributed data processing, cleaning, formatting, and state-of-the-art algorithms such as LightGBM and XGBoost. It also supports model stacking, which allows you to combine an information ensemble of models to generate a new model aiming to have better performance than the individual models. Here's an example of its usage: # Necessary Imports from mlbox.preprocessing import * from mlbox.optimisation import * from mlbox.prediction import * import wget file_link = 'https://apsportal.ibm.com/exchange-api/v1/entries/8044492073eb964f46597b4be06ff5ea/data?accessKey=9561295fa407698694b1e254d0099600' file_name = wget.download(file_link) print(file_name) # GoSales_Tx_NaiveBayes.csv The GoSales dataset contains information for customers and their product preferences: import pandas as pd df = pd.read_csv('GoSales_Tx_NaiveBayes.csv') df.head() You get the following output from the preceding code: Let's create a test set from the same dataset by dropping a target column: test_df = df.drop(['PRODUCT_LINE'], axis = 1) # First 300 records saved as test dataset test_df[:300].to_csv('test_data.csv') paths = ["GoSales_Tx_NaiveBayes.csv", "test_data.csv"] target_name = "PRODUCT_LINE" rd = Reader(sep = ',') df = rd.train_test_split(paths, target_name) The output will be similar to the following: Drift_thresholder will help you to drop IDs and drifting variables between train and test datasets: dft = Drift_thresholder() df = dft.fit_transform(df) You get the following output: Optimiser will optimize the hyperparameters: opt = Optimiser(scoring = 'accuracy', n_folds = 3) opt.evaluate(None, df) You get the following output by running the preceding code: The following code defines the parameters of the ML pipeline: space = { 'ne__numerical_strategy':{"search":"choice", "space":[0]}, 'ce__strategy':{"search":"choice", "space":["label_encoding","random_projection", "entity_embedding"]}, 'fs__threshold':{"search":"uniform", "space":[0.01,0.3]}, 'est__max_depth':{"search":"choice", "space":[3,4,5,6,7]} } best = opt.optimise(space, df,15) The following output shows you the selected methods that are being tested by being given the ML algorithms, which is LightGBM in this output: You can also see various measures such as accuracy, variance, and CPU time: Using Predictor, you can use the best model to make predictions: predictor = Predictor() predictor.fit_predict(best, df) You get the following output: TPOT Tree-Based Pipeline Optimization Tool (TPOT) uses genetic programming to find the best performing ML pipelines, built on top of scikit-learn. Once your dataset is cleaned and ready to be used, TPOT will help you with the following steps of your ML pipeline: Feature preprocessing Feature construction and selection Model selection Hyperparameter optimization Once TPOT is done with its experimentation, it will provide you with the best performing pipeline. TPOT is very user-friendly as it's similar to using scikit-learn's API: from tpot import TPOTClassifier from sklearn.datasets import load_digits from sklearn.model_selection import train_test_split # Digits dataset that you have used in Auto-sklearn example digits = load_digits() X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, train_size=0.75, test_size=0.25) # You will create your TPOT classifier with commonly used arguments tpot = TPOTClassifier(generations=10, population_size=30, verbosity=2) # When you invoke fit method, TPOT will create generations of populations, seeking best set of parameters. Arguments you have used to create TPOTClassifier such as generations and population_size will affect the search space and resulting pipeline. tpot.fit(X_train, y_train) print(tpot.score(X_test, y_test)) # 0.9834 tpot.export('my_pipeline.py') Once you have exported your pipeline in the Python my_pipeline.py file, you will see the selected pipeline components: import numpy as np import pandas as pd from sklearn.model_selection import train_test_split from sklearn.neighbors import KNeighborsClassifier # NOTE: Make sure that the class is labeled 'target' in the data file tpot_data = pd.read_csv('PATH/TO/DATA/FILE', sep='COLUMN_SEPARATOR', dtype=np.float64) features = tpot_data.drop('target', axis=1).values training_features, testing_features, training_target, testing_target = train_test_split(features, tpot_data['target'].values, random_state=42) exported_pipeline = KNeighborsClassifier(n_neighbors=6, weights="distance") exported_pipeline.fit(training_features, training_target) results = exported_pipeline.predict(testing_features) To summarize, you learnt about Automated ML and practiced your skills using popular AutoML libraries. This is definitely not the whole list, and AutoML is an active area of research. You should check out other libraries such as Auto-WEKA, which also uses the latest innovations in Bayesian optimization, and Xcessive, which is a user-friendly tool for creating stacked ensembles. To know how AutoML can be further used to automate parts of Machine Learning, check out the book Hands-On Automated Machine Learning. Read more Anatomy of an automated machine learning algorithm (AutoML) AutoML: Developments and where is it heading to AmoebaNets: Google’s new evolutionary AutoML

0
0
8247

article-image-earn-1m-per-year-hint-learn-machine-learning

Neil Aitken

01 Aug 2018

10 min read

How to earn $1m per year? Hint: Learn machine learning

Neil Aitken

01 Aug 2018

10 min read

0
0
18209

Tech Guides - Artificial Intelligence

5 ways artificial intelligence is upgrading software engineering

8 Machine learning best practices [Tutorial]

12 ubiquitous artificial intelligence powered apps that are changing lives

NVIDIA leads the AI hardware race. But which of its GPUs should you use for deep learning?

A Machine learning roadmap for Web Developers

Why TensorFlow always tops machine learning and artificial intelligence tool surveys

5 artificial intelligence tools data scientists might not know

5 examples of Artificial Intelligence in Web apps

Tackle trolls with Machine Learning bots: Filtering out inappropriate content just got easy

Predictive Analytics with AWS: A quick look at Amazon ML

Trending Topics

Do you write Python Code or Pythonic Code?

Four interesting Amazon patents in 2018 that use machine learning, AR, and robotics

How 5G Mobile Data will propel Artificial Intelligence (AI) progress

Top AutoML libraries for building your ML pipelines

How to earn $1m per year? Hint: Learn machine learning