Data | 281 articles | Tech News, Tutorials & Expert Insights

article-image-ai-tools-data-scientists-might-not-know

22 Aug 2018

8 min read

5 artificial intelligence tools data scientists might not know

22 Aug 2018

With Artificial Intelligence going mainstream, it is not at all surprising to see the number of tools and platforms for AI development go up as well. Open source libraries such as Tensorflow, Keras and PyTorch are very popular today. Not just those - enterprise platforms such as Azure AI Platform, Google Cloud AI and Amazon Sagemaker are commonly used to build scalable production-grade AI applications. While you might be already familiar with these tools and frameworks, there are quite a few relatively unknown AI tools and services which can make your life as a data scientist much, much easier! In this article, we look at 5 such tools for AI development which you may or may not have heard of before. Wit.ai One of the most popular use-cases of Artificial Intelligence today is building bots that facilitate effective human-computer interaction. Wit.ai, a platform for building these conversational chatbots, finds applications across various platforms, including mobile apps, IoT as well as home automation. Used by over 150,000 developers across the world, this platform gives you the ability to build conversational UI that supports text categorization, classification, sentiment analysis and a whole host of other features. Why you should try this machine learning tool out There are a multitude of reasons why wit.ai is so popular among developers for creating conversational chatbots. Some of the major reasons are: Support for text as well as voice, which gives you more options and flexibility in the way you want to design your bots Support for multiple languages such as Python, Ruby and Node.js which facilitates better integration of your app with the website or the platform of your choice The documentation is very easy to follow Lots of built-in entities to ease the development of your chatbots Intel OpenVINO Toolkit Bringing together two of the most talked about technologies today, i.e. Artificial Intelligence and Edge Computing, we had to include Intel’s OpenVINO Toolkit in this list. Short for Open Visual Inference and Neural Network Optimization, this toolkit brings comprehensive computer vision and deep learning capabilities to the edge devices. It has proved to be an invaluable resource to industries looking to set up smart IoT systems for image recognition and processing using edge devices. The OpenVINO toolkit can be used with the commonly used popular frameworks such as OpenCV, Tensorflow as well as Caffe. It can be configured to leverage the power of the traditional CPUs as well as customized AI chips and FPGAs. Not just that, this toolkit also has support for the Vision Processing Unit, a processor developed specifically for machine vision. Why you should try this AI tool out Allows you to develop smart Computer Vision applications for IoT-specific use-cases Support for a large number of deep learning and image processing frameworks. Also, it can be used with the traditional CPUs as well as customized chips for AI/Computer Vision Its distributed capability allows you to develop scalable applications, which again is invaluable when deployed on edge devices You can know more about OpenVINO’s features and capabilities in our detailed coverage of the toolkit. Apache PredictionIO This one is for the machine learning engineers and data scientists looking to build large-scale machine learning solutions using the existing Big Data infrastructure. Apache PredictionIO is an open source, state-of-the-art Machine Learning server which can be easily integrated with the popular Big Data tools such as Apache Hadoop, Apache Spark and Elasticsearch to deploy smart applications. Source: PredictionIO System architecture As can be seen from the architecture diagram above, PredictionIO has modules that interact with the different components of the Big Data system and uses an App Server to communicate the results of the analysis to the outside devices. Why you should try this machine learning tool out Let’s you build production-ready models which can also be deployed as web services You can also leverage the machine learning capabilities of Apache Spark to build large-scale machine learning models Pre-built performance evaluation measures available to check the accuracy of your predictive models Most importantly, this tool helps you simplify your Big Data infrastructure without adding too many complexities IBM Snap ML A machine learning library that is 46 times faster than Tensorflow. If that’s not a reason to start using IBM’s Snap ML, what is? IBM have been taking some giant strides in the field of AI research in a bid to compete with the heavyweights in this space - mainly Google, Microsoft and Amazon. With Snap ML, they seem to have struck a goldmine. A library that can be used for high-speed machine learning models using the cutting edge CPU/GPU technology, Snap ML allows for agile development of models while scaling to process massive datasets. Why you should try this machine learning tool out It is insanely fast. Snap ML was used to train a logistic regression classifier on a terabyte-scale dataset in just under 100 seconds. It allows for GPU acceleration to avoid large data transfer overheads. With the enhanced GPU technology available today, Snap ML is one of the best tools you can have at your disposal to train models quickly and efficiently It allows for distributed model training and works on sparse data structures as well You should definitely check out our detailed coverage of Snap ML where we go into the depth of its features and understand why this is a very special tool. Crypto-ML It is common knowledge that cryptocurrency, especially Bitcoin, can be traded more efficiently and profitably by leveraging the power of machine learning. Large financial institutions and trading firms have been using the machine learning tools to great effect. However, it’s the individuals, on the other hand, who have relied on historical data and outdated techniques to forecast the trends. All that has now changed, thanks to Crypto-ML. Crypto-ML is a cryptocurrency trading platform designed specifically for individuals who want to get the most out of their investments in the most reliable, error-free ways. Using state-of-the-art deep learning techniques, Crypto-ML uses historical data to build models that predict future price movement. At the same time, it eliminates any human error or mistakes arising out of emotions. Why you should try this machine learning tool out No expertise in cryptocurrency trading is required if you want to use this tool Crypto-ML only makes use of historical data and builds data models to predict future prices without any human intervention Per the Crypto-ML website, the average gain on winning trades is close to 53%, whereas the average loss on losing trades is just close to 6%. If you are a data scientist or a machine learning developer with an interest in finance and cryptocurrency, this platform can also help you customize your own models for efficient trading. Here’s where you can read on how Crypto-ML works, in more detail. Other notable mentions Apart from the tools we mentioned above, there are a quite a few other tools that could not make it to the list, but deserve a special mention. Some of them are: ABBYY’s Real-time Recognition SDK for document recognition, language processing and data capturing is worth checking out. Vertex.ai’s PlaidML is an open source tool that allows you to build smart deep learning models across a variety of platforms. It leverages the power of Tile, a new machine learning language that facilitates tensor manipulation. Facebook recently open sourced MUSE, a Python library for efficient word embedding and other NLP tasks. This one’s worth keeping an eye on for sure! If you’re interested in browser-based machine learning, MachineLabs recently open sourced the entire code base of their machine learning platform. NVIDIA’s very own NVVL, their open source offering that provides GPU-accelerated video decoding for training deep learning models The vast ecosystem of tools and frameworks available for building smart, intelligent use-cases across various domains just points to the fact that AI is finding practical applications with every passing day. It is not an overstatement anymore to suggest that that AI is slowly becoming indispensable to businesses. This is not the end of it by any means either - expect to see more such tools spring to life in the near future, with some having game-changing, revolutionary consequences. So which tools are you planning to use for your machine learning / AI tasks? Is there any tool we missed out? Let us know! Read more Predictive Analytics with AWS: A quick look at Amazon ML Four interesting Amazon patents in 2018 that use machine learning, AR, and robotics How to earn $1m per year? Hint: Learn machine learning

0
1
6298

article-image-5-examples-of-artificial-intelligence-in-web-apps

Sugandha Lahoti

20 Aug 2018

7 min read

5 examples of Artificial Intelligence in Web apps

Sugandha Lahoti

20 Aug 2018

7 min read

Modern day web app development is increasingly focused on building a customer-facing front-end presence with the use of Artificial Intelligence. Web apps, use Artificial Intelligence not just for intelligent automation, but also for building recommendation engines, website implementation, and image recognition, among other application areas. In this post, we look at five key areas, illustrated by real-world examples, where web apps are employing Artificial intelligence to automate some part of their system. Recommendation Engines of Amazon and Netflix Curating content based on the user’s context is one of the most widely used AI features in web apps. Amazon, for instance, uses item-based collaborative filtering for product classification. Amazon’s recommendation system uses a combination of goods-based recommendation (users are recommended for those similar to what they liked in the past) and buddy-based recommendation (users are recommended things which their Facebook friends like.) Not just for their recommendation system, Amazon has been using AI for multiple tasks. Their AI Management Strategy is called The Flywheel, where one part of Amazon acts as a catalyst for AI and machine learning growth in other areas. Read more: Four interesting Amazon patents in 2018 that use machine learning, AR, and robotics Another popular example is Netflix, who revamped their recommendation algorithm based on visual impressions. One of their research projects indicated that the artwork was not only the biggest influencer to a viewer's decision to watch content, but it also drew over 82% of their focus while browsing Netflix. This made them develop a new image recommendation algorithm which works in real time to project the image it thinks the user will respond to. They use implicit (user behavior) and Explicit data (user activity) and then feed this data to machine learning algorithms to figure out the relevant content for each user. For each title, users get the image with the highest rank based on their profile. Side by side, it continues collecting data from its 100 million other subscribers to improve its engine’s performance. Read more: What software stack does Netflix use? Google and Microsoft using Image recognition Image recognition can serve multiple uses for web apps including object and pattern recognition, locating duplicates (exact or partial), image search by fragments, and more. Two such unique applications of image recognition are Google’s Quickdraw and Microsoft’s Captionbot.ai. Quick Draw is Google’s AI-powered web app game, where users have to draw an everyday object that a neural network tries to recognize. Players are given 20 seconds to draw a random item, and Google’s neural network tries to match it with other 50 million hand-drawn sketches by other players to identify the correct one. Quickdraw aims to generate the world’s largest doodling data set, which is shared publicly to help further machine learning research. The data preserves user privacy by collecting only anonymous metadata, including timestamp, country code, whether or not the drawing was recognized, and which word the drawing corresponded to. This dataset was used in SketchRNN, a neural network that can draw words and interpolate between drawings. Another image recognition web app is Microsoft’s Captionbot.ai. The system can automatically generate a caption for an uploaded photograph. Users can rate how accurately it has detected what was on display. The algorithm learns from the rating, to make the captions more accurate. It uses three separate services to process the images. The Computer Vision API identifies the components of the photo, then mixes it with data from the Bing Image API, and runs any faces it spots through Emotion API. The Emotion API analyses facial expressions to detect anger, contempt, disgust, fear, and other traits. Based on the results from these APIs, the caption is generated. Google Docs powered by Natural Language Processing Modern Web apps can also be fueled with cognitive capabilities to make them stand apart from other apps. Instances of this include transforming human speech to text or conversing with people in natural language. One such example of a web app which includes natural language processing is Google Docs. Google Docs and Slides have an Explore feature to show text, images, and other features relevant to the document that a user is working on at any given point. Docs can also use natural language to search through data and reports, and automatically generate formulas in Sheets. Google Docs recently incorporated an AI grammar checker, announced at Google Cloud Next. It uses a machine translation algorithm to recognize errors and suggest corrections as users type. Google Docs can also be integrated with Natural Language API to recognize the sentiment of selected text in a Google Doc and highlight it based on that sentiment. Web-based artificial intelligence Chatbots Web-based chatbots are just like app-based chatbots albeit they interact with users in the website browser. They use AI techniques such as natural language understanding and pattern recognition to store and distinguish between the context of the information provided and elicit a suitable response for future replies. An example of web-based chatbots are the Live Chat bots where the conversation with a visitor on a website is automated using a chatbot. Many live chat software companies are already experimenting with chatbots. Examples include the Operator bot used by Intercom, a company building customer messaging platform or Driftbot by Drift which gives your website a personal assistant. Read More: Top 4 chatbot development frameworks for developers Another example, are AI based chatbots which help in creating full websites. Right Click is a startup that introduced an A.I.-powered chatbot which uses Artificial Intelligence in a conversational interface to create websites. It asks general questions during the conversation like “What industry you belong to?” and “Why do you want to make a website?” and creates customized templates as per the given answers. Similarly, Wix’s Artificial Intelligence Design bot can tailor websites by learning about each person’s or business’ own needs. Web-based code helpers using AI Intelligent coding assistants are gaining popularity with their ability to understand the code and provide right suggestions at the right time. They can analyze code on the web and give fast and smart completions. Codota for Chrome is a smart web-based IDE which can build predictive models of code and suggest code completions and related content based on the current context present in the code. It combines program analysis, natural language processing, and machine learning to learn from the code. Users can look for Codota’s Icon on every code snippet on their browsers - in GitHub, StackOverflow and others. Another example is Deep Cognition’s Deep Learning Studio – Cloud. It is not exactly an IDE, but it features AI-powered drag & drop interface to help design deep learning models with ease. It features assisted modeling, for automated tensor size calculations and real-time validation. It also has AutoML feature to automatically build a neural network. [dropcap]E[/dropcap]ven though AI is a great choice to enhance your web apps, an important facet to keep in mind is ensuring fairness, accuracy, and transparency of your web apps. For instance, web apps powered by natural language should not discriminate people based on caste, color, or creed or hurt user sentiments. Similarly, those using neural networks for recognizing images should ensure the filtering of obscene images. Creating such types of artificial intelligence systems would require a hybrid of designers, programmers, ML engineers, and researchers. This collective group will have a good grasp of user experience, will be comfortable thinking in abstracts and algorithms, and equally well versed with the social impacts of artificial intelligence. Read More: 20 lessons on bias in machine learning systems by Kate Crawford at NIPS 2017 Uber introduces Fusion.js, a plugin-based web development framework for high-performance apps. Electron Fiddle: A ‘code playground’ for experimenting with cross-platform native apps. Warp: Rust’s new web framework for implementing WAI (Web Application Interface)

0
0
30931

article-image-how-everyone-at-netflix-uses-jupyter-notebooks-from-data-scientists-machine-learning-engineers-to-data-analysts

Bhagyashree R

18 Aug 2018

4 min read

How everyone at Netflix uses Jupyter notebooks from data scientists, machine learning engineers, to data analysts

Bhagyashree R

18 Aug 2018

4 min read

Netflix uses a variety of tools to do data analysis. One of the big ways that data scientists and engineers at Netflix interact with their data is through Jupyter notebooks. In addition to providing execution environments to users, Netflix invests in various parts of the Jupyter ecosystem and tooling. They are “reimagining what a notebook can be, who can use it, and what they can do with it.” Netflix aims to provide personalized content to their 130 million viewers. For this every day more than 1 trillion events are written into a streaming ingestion pipeline. To support this, they’ve built an industry-leading data platform which is flexible, powerful, and complex. There are so many diverse users of this platform, such as analytics engineers, data engineers, and data scientists, requiring different sets of tools and languages. To help the platform scale, they wanted to minimize the number of tools and the solution to this was the open-source tool: Jupyter notebooks. Why Jupyter notebook is so compelling for Netflix? These are the functionalities provided by notebook that benefits Netflix’s data scientists and engineers: Standard messaging API: The Jupyter protocol provides a standard messaging API with the kernels that act as computational engines. It separates where the content is written and where the content is executed. This makes it language agnostic. Editable file format: It provides an editable file format that stores the code and results together. Web-based UI: It is web-based which helps interactively writing and running code as well as visualizing outputs. How Netflix uses Jupyter Notebooks? The following are some of the use cases they use Jupyter notebooks for: Data access: Notebooks were first introduced for workflows and their adoption grew among the data scientists. Seeing this, Netflix decided to leverage its versatility and architecture for general data access. Notebooks provide an user-friendly interface for interactively running code, exploring the outputs, and visualizing data all from a single cloud-based development environment. Notebook Templates: They introduced parameterized notebooks, which allow the use of parameters in the code and take values as input at runtime. These templates help: Data scientists to run an experiment with different coefficients and summarize the results Data engineers to execute data quality audits Data analysts to share prepared queries and visualizations Software engineers to email the results of a troubleshooting script Scheduling notebooks: Next they are using notebooks for creating a unifying layer for scheduling workflows. Notebooks are used for interactive work and allows smooth move to scheduling that work to run recurrently. Many users create an entire workflow in a notebook and just copy/paste it into separate files for scheduling when they’re ready to deploy it. Notebook infrastructure: The three fundamental components of the infrastructure are: storage, compute, and interface. Source: Netflix Tech Blog Storage: The Netflix Data Platform is made of Amazon S3 and EFS for cloud storage, which notebooks treat as virtual filesystems. Each user has a home directory on EFS containing a personal workspace for notebooks. This workspace is for storing any notebook created or uploaded by a user. When a user launches a notebook interactively, all the reading and writing happens at the workspace. Compute: All the jobs on the data platform run on containers including queries, pipelines and notebooks. A container with reasonable default resources is provisioned when a user launches a notebook. Users can request more resources if they find that the provided resources are not enough. A unified execution environment with a prepared container image is provided, which has common libraries and an array of default kernels preinstalled. The orchestration and environments are managed with Titus, their container management platform. Interface: They are using nteract, a React-based frontend for Jupyter notebooks, which emphasizes simplicity and composability as core design principles.They’re also introducing native support for parameterization, which makes it easier to schedule notebooks and create reusable templates. Netflix is planning to make investments in both the frontend and backend to improve the overall notebook experience. This year they are also sponsoring JupyterCon. To read more about how Jupyter is offering value to Netflix read Netflix’s original post at Medium. 10 reasons why data scientists love Jupyter notebooks What’s new in Jupyter Notebook 5.3.0 Netflix open sources Zuul 2 cloud gateway

0
0
15180

article-image-tackle-trolls-machine-learning-filtering-inappropriate-content

Amarabha Banerjee

15 Aug 2018

4 min read

Tackle trolls with Machine Learning bots: Filtering out inappropriate content just got easy

Amarabha Banerjee

15 Aug 2018

4 min read

The most feared online entities in the present day are trolls. Trolls, a fearsome bunch of fake or pseudo online profiles, tend to attack online users, mostly celebrities, sports person or political profiles using a wide range of methods. One of these methods is to post obscene or NSFW (Not Safe For Work) content on your profile or website where User Generated Content (USG) is allowed. This can create unnecessary attention and cause legal troubles for you too. The traditional way out is to get a moderator (or a team of them). Let all the USGs pass through this moderation system. This is a sustainable solution for a small platform. But if you are running a large scale app, say a publishing app where you publish one hundred stories a day, and the success of these stories depend on the user interaction with them, then this model of manual moderation becomes unsustainable. More the number of USGs, more is the turn-around time, larger the moderation team size. This results in escalating costs, for a purpose that’s not contributing to your business growth in any manner. That’s where Machine Learning could help. Machine Learning algorithms that can scan images and content for possible abusive or adult content is a better solution that manual moderation. Tech giants like Microsoft, Google, Amazon have a ready solution for this. These companies have created APIs which are commercially available for developers. You can incorporate these APIs in your application to weed out the filth served by the trolls. The different APIs available for this purpose are Microsoft moderation, Google Vision, AWS Rekognition & Clarifai. Dataturks have made a comparative study on using these APIs on one particular dataset to measure their efficiency. They used a YACVID dataset with 180 images, manually labelled 90 of these images as nude and the rest as non-nude. The dataset was then fed to the 4 APIs mentioned above, their efficiency was tested based on the following parameters. True Positive (TP): Given a safe photo, the API correctly says so False Positive (FP): Given an explicit photo but the API incorrectly classifies it as safe. False negative (FN): Given a safe photo but the API is not able to detect so and True negative(TN): Given an explicit photo and the API correctly says so. TP and TN are two cases which meant the system behaved correctly. An FP meant that the app was vulnerable to attacks from trolls, FN meant the efficiency of the systems were low and hence not practically viable. 10% of the cases would be such that the API can’t decide whether its explicit or not. Those would be sent for manual moderation. This would bring down the maintenance cost of the moderation team. The results that they received are shown below: Source: Dataturks As it is evident from the above table, the best standalone API is Google vision with a 99% accuracy and 94% recall value. Recall value implies that if the same images are repeated, it can recognize them with 94% precision. The best results however were received with the combination of Microsoft and Google. The comparison of the response times are mentioned below: Source: dataturks The response time might have been affected with the fact that all the images accessed by the APIs were stored in Amazon S3. Hence AWS API might have had an unfair advantage on the response time. The timings were noted for 180 image calls per API. The cost is the lowest for AWS Rekognition - $1 for 1000 calls to the API. It’s $1.2 for Clarifai, $1.5 for both Microsoft and Google. The one notable drawback of the Amazon API was that the images had to be stored as S3 objects, or converted into that. All the other APIs accepted any web links as possible source of images. What this study says is that the power of filtering out negative and explicit content in your app is much easier now. You might still have to have a small team of moderators, but their jobs will be made a lot easier with the ML models implemented in these APIs. Machine Learning is paving the way for us to be safe from the increasing menace of Trolls, a threat to free speech and open sharing of ideas which were the founding stones of internet and the world wide web as a whole. Will this discourage Trolls from continuing their slandering or will it create a counter system to bypass the APIs and checks? We can only know in time. Facebook launches a 6-part Machine Learning video series Google’s new facial recognition patent uses your social network to identify you! Microsoft’s Brad Smith calls for facial recognition technology to be regulated

0
0
7092

article-image-budget-demand-forecasting-markov-model-in-sas

Sunith Shetty

10 Aug 2018

8 min read

Budget and Demand Forecasting using Markov model in SAS [Tutorial]

Sunith Shetty

10 Aug 2018

8 min read

0
1
6604

article-image-predictive-analytics-with-amazon-ml

Natasha Mathur

09 Aug 2018

9 min read

Predictive Analytics with AWS: A quick look at Amazon ML

Natasha Mathur

09 Aug 2018

9 min read

0
0
10222

article-image-write-python-code-or-pythonic-code

Aaron Lazar

08 Aug 2018

5 min read

Do you write Python Code or Pythonic Code?

Aaron Lazar

08 Aug 2018

5 min read

If you’re new to Programming, and Python in particular, you might have heard the term Pythonic being brought up at tech conferences, meetups and even at your own office. You might have also wondered why the term and whether they’re just talking about writing Python code. Here we’re going to understand what the term Pythonic means and why you should be interested in learning how to not just write Python code, rather write Pythonic code. What does Pythonic mean? When people talk about pythonic code, they mean that the code uses Python idioms well, that it’s natural or displays fluency in the language. In other words, it means the most widely adopted idioms that are adopted by the Python community. If someone said you are writing un-pythonic code, they might actually mean that you are attempting to write Java/C++ code in Python, disregarding the Python idioms and performing a rough transcription rather than an idiomatic translation from the other language. Okay, now that you have a theoretical idea of what Pythonic (and unpythonic) means, let’s have a look at some Pythonic code in practice. Writing Pythonic Code Before we get into some examples, you might be wondering if there’s a defined way/method of writing Pythonic code. Well, there is, and it’s called PEP 8. It’s the official style guide for Python. Example #1 x=[1, 2, 3, 4, 5, 6] result = [] for idx in range(len(x)); result.append(x[idx] * 2) result Output: [2, 4, 6, 8, 10, 12] Consider the above code, where you’re trying to multiply some elements, “x” by 2. So, what we did here was, we created an empty list to store the results. We would then append the solution of the computation into the result. The result now contains a function which is 2 multiplied by each of the elements. Now, if you were to write the same code in a Pythonic way, you might want to simply use list comprehensions. Here’s how: x=[1, 2, 3, 4, 5, 6] [(element * 2) for element in x] Output: [2, 4, 6, 8, 10] You might have noticed, we skipped the entire for loop! Example #2 Let’s make the previous example a bit more complex, and place a condition that the elements should be multiplied by 2 only if they are even. x=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] result = [] for idx in range(len(x)); if x[idx] % 2 == 0; result.append(x[idx] * 2) else; result.append(x[idx]) result Output: [1, 4, 3, 8, 5, 12, 7, 16, 9, 20] We’ve actually created an if else statement to solve this problem, but there is a simpler way of doing things the Pythonic way. [(element * 2 if element % 2 == 0 else element) for element in x] Output: [1, 4, 3, 8, 5, 12, 7, 16, 9, 20] If you notice what we’ve done here, apart from skipping multiple lines of code, is that we used the if-else statement in the same sentence. Now, if you wanted to perform filtering, you could do this: x=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] [element * 2 for element in x if element % 2 == 0] Output: [4, 8, 12, 16, 20] What we’ve done here is put the if statement after the for declaration, and Voila! We’ve achieved filtering. If you’re using a nice IDE like Jupyter Notebooks or PyCharm, they will help you format your code as per the PEP 8 suggestions. Why should you write Pythonic code? Well firstly, you’re saving loads of time writing humongous piles of cowdung code, so you’re obviously becoming a smarter and more productive programmer. Python is a pretty slow language, and when you’re trying to do something in Python, which is acquired from another language like Java or C++, you’re going to worsen things. With idiomatic, Pythonic code, you’re improving the speed of your programs. Moreover, idiomatic code is far easier to comprehend and understand for other developers who are working on the same code. It helps a great deal when you’re trying to refactor someone else’s code. Fearing Pythonic idioms Well, I don’t mean the idioms themselves are scary. Rather, quite a few developers and organisations have begun discriminating on the basis of whether someone can or cannot write Pythonic code. This is wrong, because, at the end of the day, though the PEP 8 exists, the idea of the term Pythonic is different for different people. To some it might mean picking up a new style guide and improving the way you code. To others, it might mean being succinct and not repeating themselves. It’s time we stopped judging people on whether they can or can’t write Pythonic code and instead, we should appreciate when someone is able to present readable, easily maintainable and succinct code. If you find them writing a bit of clumsy code, you can choose to talk to them about improving their design considerations. And the world will be a better place! If you’re interested in learning how to write more succinct and concise Python code, check out these resources: Learning Python Design Patterns - Second Edition Python Design Patterns [Video] Python Tips, Tricks and Techniques [Video]

0
2
28097

article-image-amazon-patents-2018-machine-learning-ar-robotics

Natasha Mathur

06 Aug 2018

7 min read

Four interesting Amazon patents in 2018 that use machine learning, AR, and robotics

Natasha Mathur

06 Aug 2018

7 min read

"There are two kinds of companies, those that work to try to charge more and those that work to charge less. We will be the second."-- Jeff Bezos, CEO Amazon When Jeff Bezos launched Amazon.com in 1994, it was an online bookselling site. This was during a time when bookstores such as Barnes & Noble, Waldenbooks and Crown Books were the leading front runners in the bookstore industry in the American shopping malls. Today, Amazon’s name has become almost synonymous with online retail for most people and has now spread its wings to cloud computing, electronics, tech gadgets and the entertainment world. With market capitalization worth $897.47B as of August 3rd 2018, it’s hard to believe that there was a time when Amazon sold only books. Amazon is constantly pushing to innovate and as new inventions come to shape, there are “patents” made that helps the company have a competitive advantage over technologies and products in order to attract more customers. [box type="shadow" align="" class="" width=""]According to United States Patent and Trademark Office (USPTO), Patent is an exclusive right to invention and “the right to exclude others from making, using, offering for sale, or selling the invention in the United States or “importing” the invention into the United States”.[/box] As of March 20, 2018, Amazon owned 7,717 US patents filed under two business entities, Amazon Technologies, Inc. (7,679), and Amazon.com, Inc (38). Looking at the chart below, you can tell that Amazon Technologies, Inc., was one among the top 15 companies in terms of number of patents granted in 2017. Top 15 companies, by number of patents granted by USPTO, 2017 Amazon competes closely with the world’s leading tech giants in terms of patenting technologies. The below table only considers US patents. Here, Amazon holds only few US patents than IBM, Microsoft, Google, and Apple. Number of US Patents Containing Emerging-Technology Keywords in Patent Description Some successfully patented Amazon innovations in 2018 There are thousands of inventions that Amazon is tied up with and for which they have filed for patents. These include employee surveillance AR goggles, a real-time accent translator, robotic arms tossing warehouse items, one-click buying, drones,etc. Let’s have a look at these remarkable innovations by Amazon. AR goggles for improving human-driven fulfillment (or is it to track employees?) Date of Patent: August 2, 2018 Filed: March 20, 2017 Assignee: Amazon Technologies, Inc. AR Goggles Features: Amazon has recently patented a pair of augmented reality goggles that could be used to keep track of its employees.The patent is titled “Augmented Reality User interface facilitating fulfillment.” As per the patent application, the application is a wearable computing device such as augmented reality glasses that are worn on user’s head. The user interface is rendered upon one or more lenses of the augmented reality glasses and it helps to show the workers where to place objects in Amazon's fulfillment centers. There’s also a feature in the AR glasses which provides workers with turn-by-turn directions to the destination within the fulfillment centre. This helps them easily locate the destination as all the related information gets rendered on the lenses. AR Goggles steps The patent has received criticism over concerns that this application might hamper the privacy of employees within the warehouses, tracking employees’ every single move. However, Amazon has defended the application by saying that it has got nothing to do with “employee surveillance”. As this is a patent, there’s no guarantee if it will actually hit the market. Robotic arms that toss warehouse items Date of Patent: July 17, 2018 Filed: September 29, 2015 Assignee: Amazon Technologies, Inc. Features: Amazon won a patent titled “Robotic tossing of items in inventory system” last month. As per the patent application, “Robotic arms or manipulators can be used to toss inventory items within an inventory system. Tossing strategies for the robotic arms may include information about how a grasped item is to be moved and released by a robotic arm to achieve a trajectory for moving the item to a receiving location”. Robotic Arms Utilizing a robotic arm to toss an item to a receiving location can help improve throughput through the inventory system. This is possible as the robotic arms will help with reducing the amount of time that may otherwise be spent on placing a grasped item directly onto a surface for receiving the item. “The tossing strategy may be based at least in part upon a database containing information about the item, characteristics of the item, and/or similar items, such as information indicating tossing strategies that have been successful or unsuccessful for such items in the past,” the patent reads. Robotic Arms Steps Amazon’s aim with this is to eliminate the challenges faced by modern inventory systems like supply chain distribution centers, airport luggage systems, etc, while responding to requests for inventory items. The patent received criticism over the concern that one of the examples in the application was a dwarf figurine and could possibly mock people of short stature. But, according to Amazon, “The intention was simply to illustrate a robotic arm moving products, and it should not be taken out of context.” Real-time accent translator Date of Patent: June 21, 2018 Filed: December 21, 2016 Assignee: Amazon Technologies, Inc. Features: Amazon won a patent for an audio system application, titled “Accent translation” back in June this year, which will help with translating the accent of the speaker to the listener’s accent. The aim with this app is to get rid of the possible communication barriers which may arise due to different accents as they can be difficult to understand at times. Accent translation system The accent translation system collects a number of audio samples from different sources such as phone call, television, movies, broadcasts, etc. Each audio sample will have its association with at least one of the accent sample sets present in its database. For instance, german accent will be associated with the german accent sample set. Accent translation system steps In a two-party dialog, acquired audio is analyzed and if it associates with one among a wide range of saved accents then the audio from both the sides is outputted based on the accent of the opposite party. The possibilities with this application are endless. One major use case is the customer care industry where people have to constantly talk to different people with different accents. Drone that uses Human gestures and voice commands Date of Patent: March 20, 2018 Filed: July 18, 2016 Assignee: Amazon Technologies, Inc. Features: Amazon patented for a drone, titled “Human interaction with unmanned aerial vehicles”, earlier this year, that would use human gestures and voice commands for package delivery. Amazon Drone makes use of propulsion technology which will help with managing the speed, trajectory, and direction of the drone. Drones As per the patent application, “an unmanned aerial vehicle is provided which includes propulsion device, sensor device and a management system. The management system is configured to receive human gestures via the sensor device and in response, instruct the propulsion device to affect and adjustment to the behavior of the unnamed aerial vehicle. Human gestures include-- visible gestures, audible gestures, and other gestures capable of recognition by the unmanned vehicle”. Working structure of drones The concept for drones started when Amazon CEO, Jeff Bezos, promised, back in 2013, that the company aims to make 30-minute deliveries, of packages up to 2.25 kgs or 5 pounds. Amazon’s patents are a clear indication of its efforts and determination for inventing cutting-edge technologies for optimizing its operations so that it can pass on the benefits to its customers in the form of competitively priced product offerings. As Amazon has been putting its focus on machine learning, the drones and robotic arms will make the day-to-day tasks of the facility workers easier and more efficient. In fact, Amazon has stepped up its game big time and is incorporating Augmented reality, with its AR glasses to further scale efficiencies. The real-time accent translators help eliminate the communication barriers, making Amazon cover a wide range of areas and perhaps provide a seamless customer care experience in the coming days. Amazon Echo vs Google Home: Next-gen IoT war Amazon is selling facial recognition technology to police

0
3
10882

article-image-product-development-need-developers-and-product-managers-collaborate

Packt Editorial Staff

04 Aug 2018

16 min read

Effective Product Development needs developers and product managers collaborating on success metrics

Packt Editorial Staff

04 Aug 2018

16 min read

Modern product development is witnessing a drastic shift. Disruptive ideas and ambiguous business conditions have changed the way products are developed. Product development is no longer guided by existing processes or predefined frameworks. Delivering on time is a baseline metric, as is software quality. Today, businesses are competing to innovate. They are willing to invest in groundbreaking products with cutting-edge technology. Cost is no longer the constraint—execution is. Can product managers then continue to rely upon processes and practices aimed at traditional ways of product building? How do we ensure that software product builders look at the bigger picture and do not tie themselves to engineering practices and technology viability alone? Understanding the business and customer context is essential for creating valuable products. In this article, we are going to identify what success means to us in terms of product development. This article is an excerpt from the book Lean Product Management written by Mangalam Nandakumar. For the kind of impact that we predict our feature idea to have on the Key Business Outcomes, how do we ensure that every aspect of our business is aligned to enable that success? We may also need to make technical trade-offs to ensure that all effort on building the product is geared toward creating a satisfying end-to-end product experience. When individual business functions take trade-off decisions in silo, we could end up creating a broken product experience or improvising the product experience where no improvement is required. For a business to be able to align on trade-offs that may need to be made on technology, it is important to communicate what is possible within business constraints and also what is not achievable. It is not necessary for the business to know or understand the specific best practices, coding practices, design patterns, and so on, that product engineering may apply. However, the business needs to know the value or the lack of value realization, of any investment that is made in terms of costs, effort, resources, and so on. The section addresses the following topics: The need to have a shared view of what success means for a feature idea Defining the right kind of success criteria Creating a shared understanding of technical success criteria "If you want to go quickly, go alone. If you want to go far, go together. We have to go far — quickly." Al Gore Planning for success doesn't come naturally to many of us. Come to think of it, our heroes are always the people who averted failure or pulled us out of a crisis. We perceive success as 'not failing,' but when we set clear goals, failures don't seem that important. We can learn a thing or two about planning for success by observing how babies learn to walk. The trigger for walking starts with babies getting attracted to, say, some object or person that catches their fancy. They decide to act on the trigger, focusing their full attention on the goal of reaching what caught their fancy. They stumble, fall, and hurt themselves, but they will keep going after the goal. Their goal is not about walking. Walking is a means to reaching the shiny object or the person calling to them. So, they don't really see walking without falling as a measure of success. Of course, the really smart babies know to wail their way to getting the said shiny thing without lifting a toe. Somewhere along the way, software development seems to have forgotten about shiny objects, and instead focused on how to walk without falling. In a way, this has led to an obsession with following processes without applying them to the context and writing perfect code, while disdaining and undervaluing supporting business practices. Although technology is a great enabler, it is not the end in itself. When applied in the context of running a business or creating social impact, technology cannot afford to operate as an isolated function. This is not to say that technologists don't care about impact. Of course, we do. Technologists show a real passion for solving customer problems. They want their code to change lives, create impact, and add value. However, many technologists underestimate the importance of supporting business functions in delivering value. I have come across many developers who don't appreciate the value of marketing, sales, or support. In many cases, like the developer who spent a year perfecting his code without acquiring a single customer, they believe that beautiful code that solves the right problem is enough to make a business succeed. Nothing can be further from the truth Most of this type of thinking is the result of treating technology as an isolated function. There is a significant gap that exists between nontechnical folks and software engineers. On the one hand, nontechnical folks don't understand the possibilities, costs, and limitations of software technology. On the other hand, technologists don't value the need for supporting functions and communicate very little about the possibilities and limitations of technology. This expectation mismatch often leads to unrealistic goals and a widening gap between technology teams and the supporting functions. The result of this widening gap is often cracks opening in the end-to-end product experience for the customer, thereby resulting in a loss of business. Bridging this gap of expectation mismatch requires that technical teams and business functions communicate in the same language, but first they must communicate. Setting SMART goals for team In order to set the right expectations for outcomes, we need the collective wisdom of the entire team. We need to define and agree upon what success means for each feature and to each business function. This will enable teams to set up the entire product experience for success. Setting specific, measurable, achievable, realistic, and time-bound (SMART) metrics can resolve this. We cannot decouple our success criteria from the impact scores we arrived at earlier. So, let's refer to the following table for the ArtGalore digital art gallery: The estimated impact rating was an indication of how much impact the business expected a feature idea to have on the Key Business Outcomes. If you recall, we rated this on a scale of 0 to 10. When the estimated impact of a Key Business Outcomes is less than five, then the success criteria for that feature is likely to be less ambitious. For example, the estimated impact of "existing buyers can enter a lucky draw to meet an artist of the month" toward generating revenue is zero. What this means is that we don't expect this feature idea to bring in any revenue for us or put in another way, revenue is not the measure of success for this feature idea. If any success criteria for generating revenue does come up for this feature idea, then there is a clear mismatch in terms of how we have prioritized the feature itself. For any feature idea with an estimated impact of five or above, we need to get very specific about how to define and measure success. For instance, the feature idea "existing buyers can enter a lucky draw to meet an artist of the month" has an estimated impact rating of six towards engagement. This means that we expect an increase in engagement as a measure of success for this feature idea. Then, we need to define what "increase in engagement" means. My idea of "increase in engagement" can be very different from your idea of "increase in engagement." This is where being S.M.A.R.T. about our definition of success can be useful. Success metrics are akin to user story acceptance criteria. Acceptance criteria define what conditions must be fulfilled by the software in order for us to sign off on the success of the user story. Acceptance criteria usually revolve around use cases and acceptable functional flows. Similarly, success criteria for feature ideas must define what indicators can tell us that the feature is delivering the expected impact on the KBO. Acceptance criteria also sometimes deal with NFRs (nonfunctional requirements). NFRs include performance, security, and reliability. In many instances, nonfunctional requirements are treated as independent user stories. I also have seen many teams struggle with expressing the need for nonfunctional requirements from a customer's perspective. In the early days of writing user stories, the tendency for myself and most of my colleagues was to write NFRs from a system/application point of view. We would say, "this report must load in 20 seconds," or "in the event of a network failure, partial data must not be saved." These functional specifications didn't tell us how/why they were important for an end user. Writing user stories forces us to think about the user's perspective. For example, in my team we used to have interesting conversations about why a report needed to load within 20 seconds. This compelled us to think about how the user interacted with our software. It is not uncommon for visionary founders to throw out very ambitious goals for success. Having ambitious goals can have a positive impact in motivating teams to outperform. However, throwing lofty targets around, without having a plan for success, can be counter-productive. For instance, it's rather ambitious to say, "Our newsletter must be the first to publish artworks by all the popular artists in the country," or that "Our newsletter must become the benchmark for art curation." These are really inspiring words, but can mean nothing if we don't have a plan to get there. The general rule of thumb for this part of product experience planning is that when we aim for an ambitious goal, we also sign up to making it happen. Defining success must be a collaborative exercise carried out by all stakeholders. This is the playing field for deciding where we can stretch our goals, and for everyone to agree on what we're signing up to, in order to set the product experience up for success. Defining key success metrics For every feature idea we came up with, we can create feature cards that look like the following sample. This card indicates three aspects about what success means for this feature. We are asking these questions: what are we validating? When do we validate this? What Key Business Outcomes does it help us to validate? The criteria for success demonstrates what the business anticipates as being a tangible outcome from a feature. It also demonstrates which business functions will support, own, and drive the execution of the feature. That's it! We've nailed it, right? Wrong. Success metrics must be SMART, but how specific is the specific? The preceding success metric indicates that 80% of those who sign up for the monthly art catalog will enquire about at least one artwork. Now, 80% could mean 80 people, 800 people, or 8000 people, depending on whether we get 100 sign-ups, 1000, or 10,000, respectively! We have defined what external (customer/market) metrics to look for, but we have not defined whether we can realistically achieve this goal, given our resources and capabilities. The question we need to ask is: are we (as a business) equipped to handle 8000 enquiries? Do we have the expertise, resources, and people to manage this? If we don't plan in advance and assign ownership, our goals can lead to a gap in the product experience. When we clarify this explicitly, each business function could make assumptions. When we say 80% of folks will enquire about one artwork, the sales team is thinking that around 50 people will enquire. This is what the sales team at ArtGalore is probably equipped to handle. However, marketing is aiming for 750 people and the developers are planning for 1000 people. So, even if we can attract 1000 enquiries, sales can handle only 50 enquiries a month! If this is what we're equipped for today, then building anything more could be wasteful. We need to think about how we can ramp up the sales team to handle more requests. The idea of drilling into success metrics is to gauge whether we're equipped to handle our success. So, maybe our success metric should be that we expect to get about 100 sign-ups in the first three months and between 40-70 folks enquiring about artworks after they sign up. Alternatively, we can find a smart way to enable sales to handle higher sales volumes. Before we write up success metrics, we should be asking a whole truckload of questions that determine the before-and-after of the feature. We need to ask the following questions: What will the monthly catalog showcase? How many curated art items will be showcased each month? What is the nature of the content that we should showcase? Just good high-quality images and text, or is there something more? Who will put together the catalog? How long must this person/team(s) spend to create this catalog? Where will we source the art for curation? Is there a specific date each month when the newsletter needs to go out? Why do we think 80% of those who sign up will enquire? Is it because of the exclusive nature of art? Is it because of the quality of presentation? Is it because of the timing? What's so special about our catalog? Who handles the incoming enquiries? Is there a number to call or is it via email? How long would we take to respond to enquiries? If we get 10,000 sign-ups and receive 8000 enquiries, are we equipped to handle these? Are these numbers too high? Can we still meet our response time if we hit those numbers? Would we still be happy if we got only 50% of folks who sign up enquiring? What if it's 30%? When would we throw away the idea of the catalog? This is where the meat of feature success starts taking shape. We need a plan to uncover underlying assumptions and set ourselves up for success. It's very easy for folks to put out ambitious metrics without understanding the before-and-after of the work involved in meeting that metric. The intent of a strategy should be to set teams up for success, not for failure. Often, ambitious goals are set without considering whether they are realistic and achievable or not. This is so detrimental that teams eventually resort to manipulating the metrics or misrepresenting them, playing the blame game, or hiding information. Sometimes teams try to meet these metrics by deprioritizing other stuff. Eventually, team morale, productivity, and delivery take a hit. Ambitious goals, without the required capacity, capability, and resources to deliver, are useless. Technology to be in line with business outcomes Every business function needs to align toward the Key Business Outcomes and conform to the constraints under which the business operates. In our example here, the deadline is for the business to launch this feature idea before the Big Art show. So, meeting timelines is already a necessary measure of success. The other indicators of product technology measures could be quality, usability, response times, latency, reliability, data privacy, security, and so on. These are traditionally clubbed under NFRs (nonfunctional requirements). They are indicators of how the system has been designed or how the system operates, and are not really about user behavior. There is no aspect of a product that is nonfunctional or without a bearing on business outcomes. In that sense, nonfunctional requirements are a misnomer. NFRs are really technical success criteria. They are also a business stakeholder's decision, based on what outcomes the business wants to pursue. In many time and budget-bound software projects, technical success criteria trade-offs happen without understanding the business context or thinking about the end-to-end product experience. Let's take an example: our app's performance may be okay when handling 100 users, but it could take a hit when we get to 10,000 users. By then, the business has moved on to other priorities and the product isn't ready to make the leap. This depends on how each team can communicate the impact of doing or not doing something today in terms of a cost tomorrow. What that means is that engineering may be able to create software that can scale to 5000 users with minimal effort, but in order to scale to 500,000 users, there's a different level of magnitude required. There is a different approach needed when building solutions for meeting short-term benefits, compared to how we might build systems for long-term benefits. It is not possible to generalize and make a case that just because we build an application quickly, that it is likely to be full of defects or that it won't be secure. By contrast, just because we build a lot of robustness into an application, this does not mean that it will make the product sell better. There is a cost to building something, and there is also a cost to not building something and a cost to a rework. The cost will be justified based on the benefits we can reap, but it is important for product technology and business stakeholders to align on the loss or gain in terms of the end-to-end product experience because of the technical approach we are taking today. In order to arrive at these decisions, the business does not really need to understand design patterns, coding practices, or the nuanced technology details. They need to know the viability to meet business outcomes. This viability is based on technology possibilities, constraints, effort, skills needed, resources (hardware and software), time, and other prerequisites. What we can expect and what we cannot expect must both be agreed upon. In every scope-related discussion, I have seen that there are better insights and conversations when we highlight what the business/customer does not get from this product release. When we only highlight what value they will get, the discussions tend to go toward improvising on that value. When the business realizes what it doesn't get, the discussions lean toward improvising the end-to-end product experience. Should a business care that we wrote unit tests? Does the business care what design patterns we used or what language or software we used? We can have general guidelines for healthy and effective ways to follow best practices within our lines of work, but best practices don't define us, outcomes do. To summarize we learned before commencing on the development of any feature idea, there must be a consensus on what outcomes we are seeking to achieve. The success metrics should be our guideline for finding the smartest way to implement a feature. Developer’s guide to Software architecture patterns Hey hey, I wanna be a Rockstar (Developer) The developer-tester face-off needs to end. It’s putting our projects at risk

0
0
4587

article-image-neo4j-most-popular-graph-database

Amey Varangaonkar

02 Aug 2018

7 min read

Why Neo4j is the most popular graph database

Amey Varangaonkar

02 Aug 2018

7 min read

Neo4j is an open source, distributed data store used to model graph problems. It departs from the traditional nomenclature of database technologies, in which entities are stored in schema-less, entity-like structures called nodes, which are connected to other nodes via relationships or edges. In this article, we are going to discuss the different features and use-cases of Neo4j. This article is an excerpt taken from the book 'Seven NoSQL Databases in a Week' written by Aaron Ploetz et al. Neo4j's best features Aside from its support of the property graph model, Neo4j has several other features that make it a desirable data store. Here, we will examine some of those features and discuss how they can be utilized in a successful Neo4j cluster. Clustering Enterprise Neo4j offers horizontal scaling through two types of clustering. The first is the typical high-availability clustering, in which several slave servers process data overseen by an elected master. In the event that one of the instances should fail, a new master is chosen. The second type of clustering is known as causal clustering. This option provides additional features, such as disposable read replicas and built-in load balancing, that help abstract the distributed nature of the clustered database from the developer. It also supports causal consistency, which aims to support Atomicity Consistency Isolation and Durability (ACID) compliant consistency in use cases where eventual consistency becomes problematic. Essentially, causal consistency is delivered with a distributed transaction algorithm that ensures that a user will be able to immediately read their own write, regardless of which instance handles the request. Neo4j Browser Neo4j ships with Neo4j Browser, a web-based application that can be used for database management, operations, and the execution of Cypher queries. In addition to, monitoring the instance on which it runs, Neo4j Browser also comes with a few built-in learning tools designed to help new users acclimate themselves to Neo4j and graph databases. Neo4j Browser is a huge step up from the command-line tools that dominate the NoSQL landscape. Cache sharding In most clustered Neo4j configurations, a single instance contains a complete copy of the data. At the moment, true sharding is not available, but Neo4j does have a feature known as cache sharding. This feature involves directing queries to instances that only have certain parts of the cache preloaded, so that read requests for extremely large data sets can be adequately served. Help for beginners One of the things that Neo4j does better than most NoSQL data stores is the amount of documentation and tutorials that it has made available for new users. The Neo4j website provides a few links to get started with in-person or online training, as well as meetups and conferences to become acclimated to the community. The Neo4j documentation is very well-done and kept up to date, complete with well-written manuals on development, operations, and data modeling. The blogs and videos by the Neo4j, Inc. engineers are also quite helpful in getting beginners started on the right path. Additionally, when first connecting to your instance/cluster with Neo4j Browser, the first thing that is shown is a list of links directed at beginners. These links direct the user to information about the Neo4j product, graph modeling and use cases, and interactive examples. In fact, executing the play movies command brings up a tutorial that loads a database of movies. This database consists of various nodes and edges that are designed to illustrate the relationships between actors and their roles in various films. Neo4j's versatility demonstrated in its wide use cases Because of Neo4j's focus on node/edge traversal, it is a good fit for use cases requiring analysis and examination of relationships. The property graph model helps to define those relationships in meaningful ways, enabling the user to make informed decisions. Bearing that in mind, there are several use cases for Neo4j (and other graph databases) that seem to fit naturally. Social networks Social networks seem to be a natural fit for graph databases. Individuals have friends, attend events, check in to geographical locations, create posts, and send messages. All of these different aspects can be tracked and managed with a graph database such as Neo4j. Who can see a certain person's posts? Friends? Friends of friends? Who will be attending a certain event? How is a person connected to others attending the same event? In small numbers, these problems could be solved with a number of data stores. But what about an event with several thousand people attending, where each person has a network of 500 friends? Neo4j can help to solve a multitude of problems in this domain, and appropriately scale to meet increasing levels of operational complexity. Matchmaking Like social networks, Neo4j is also a good fit for solving problems presented by matchmaking or dating sites. In this way, a person's interests, goals, and other properties can be traversed and matched to profiles that share certain levels of equality. Additionally, the underlying model can also be applied to prevent certain matches or block specific contacts, which can be useful for this type of application. Network management Working with an enterprise-grade network can be quite complicated. Devices are typically broken up into different domains, sometimes have physical and logical layers, and tend to share a delicate relationship of dependencies with each other. In addition, networks might be very dynamic because of hardware failure/replacement, organization, and personnel changes. The property graph model can be applied to adequately work with the complexity of such networks. In a use case study with Enterprise Management Associates (EMA), this type of problem was reported as an excellent format for capturing and modeling the inter dependencies that can help to diagnose failures. For instance, if a particular device needs to be shut down for maintenance, you would need to be aware of other devices and domains that are dependent on it, in a multitude of directions. Neo4j allows you to capture that easily and naturally without having to define a whole mess of linear relationships between each device. The path of relationships can then be easily traversed at query time to provide the necessary results. Analytics Many scalable data store technologies are not particularly suitable for business analysis or online analytical processing (OLAP) uses. When working with large amounts of data, coalescing desired data can be tricky with relational database management systems (RDBMS). Some enterprises will even duplicate their RDBMS into a separate system for OLAP so as not to interfere with their online transaction processing (OLTP) workloads. Neo4j can scale to present meaningful data about relationships between different enterprise-marketing entities, which is crucial for businesses. Recommendation engines Many brick-and-mortar and online retailers collect data about their customers' shopping habits. However, many of them fail to properly utilize this data to their advantage. Graph databases, such as Neo4j, can help assemble the bigger picture of customer habits for searching and purchasing, and even take trends in geographic areas into consideration. For example, purchasing data may contain patterns indicating that certain customers tend to buy certain beverages on Friday evenings. Based on the relationships of other customers to products in that area, the engine could also suggest things such as cups, mugs, or glassware. Is the customer also a male in his thirties from a sports-obsessed area? Perhaps suggesting a mug supporting the local football team may spark an additional sale. An engine backed by Neo4j may be able to help a retailer uncover these small troves of insight. To summarize, we saw Neo4j is widely used across all enterprises and businesses, primarily due to its speed, efficiency and accuracy. Check out the book Seven NoSQL Databases in a Week to learn more about Neo4j and the other popularly used NoSQL databases such as Redis, HBase, MongoDB, and more. Read more Top 5 programming languages for crunching Big Data effectively Top 5 NoSQL Databases Is Apache Spark today’s Hadoop?

0
0
12295

article-image-guide-to-safe-cryptocurrency-trading

Guest Contributor

02 Aug 2018

8 min read

A Guide to safe cryptocurrency trading

Guest Contributor

02 Aug 2018

8 min read

So, you’ve decided to take a leap of faith and start trading in cryptocurrency. But, do you know how to do it safely? Cryptocurrency has risen in popularity as of late- especially since its market reached half a trillion dollars in 2017! This is good news to you if you ever wanted to trade in a system that veers away from tradition or if you simply distrust the traditional market with all their brokers and bankers. Cryptocurrency trading is, however, not without risks. Hackers work hard every day to steal and scam you out of your hard-earned crypto cash by stealing or coaxing your private keys directly from you. The problem is there’s nowhere to run in case you lose your money since cryptocurrency is largely unregulated. So, should you steer clear of cryptocurrency after all? Heck, no! Read this guide and you’ll be a few steps closer to safe cryptocurrency trading in no time. Know the basics As with any endeavor that involves money, you should at least learn the basic ins and outs of cryptocurrency trading. Remember to always exercise prudence when dealing in cryptocurrency. Also, look for books or reliable sites to guide you through the various risks you might face in cryptocurrency trading. Finally, keep up to date with the latest news and trends involving cryptocurrency-related cybersecurity threats. Use a VPN Most people believe that cryptocurrencies are great for privacy because they don’t need any personal information to buy or sell. In short, they’re anonymous. But, this couldn’t be further from the truth. Cryptocurrencies are pseudonymous- not anonymous. Each coin acts as your pseudonym which means that if your transactions are ever linked to your identity (via your IP address stored in the blockchain), you’ll suddenly find yourself out in the open. A VPN hides this trail by hiding your IP address and encrypting your personal data (like your location and ISP). To ensure that your sensitive transactions (especially those made over public Wi-Fi), use only the best VPN you can afford. The keyword here is “afford”. Never use free VPNs while trading cryptocurrency because free VPNs have been known to share/sell your personal information to their partners or third parties. Worse still, these free VPNs aren’t exactly the most secure. This was the case of popular crypto service MyEtherWallet, which suffered a serious security issue after popular free VPN Hola was compromised for 5 hours. This doesn’t really come as a surprise since Hola was never a secure VPN, to begin with. Check out this Hola VPN review to see for yourself. If you want better VPN options for cryptocurrency trading, try out ZenMate and F-Secure Freedome. Install an antivirus program You can add another layer of safety by installing a high-quality antivirus program. These programs protect you from malware that could take over your computer or device. An antivirus program also protects you from ransomware which hackers use to wrest control over your computer or device by encrypting some or all of your data contained therein and keeping it in stasis until you pay the ransom- which costs $133,000 on average. Now, unlike VPNs, you can get quality protection from free antivirus programs. The best ones, so far, are Avast Free Antivirus and Bitdefender Antivirus. Keep your private key to yourself Your private key is basically the password you use to access your cryptocurrency and it’s the only thing a hacker needs to access to your cryptocurrency. Never share your private key with anyone. Don’t even show a QR code containing your private key. With that said: It’s important to note that your private key is usually stored in your cryptocurrency wallet- which is either “hot” or “cold”. A “hot” wallet is one that is always online and is always ready to use while a “cold” wallet is usually offline and only goes online when you need to use it. Hot wallets are provided by cryptocurrency exchanges when you register an account. They are easy to use and make your cryptocurrency more accessible. However, being provided by an exchange means that you might lose all the funds in that wallet if that exchange ever gets hacked- which usually results in that company shutting down (like Bitfinex, Mt. Gox, and Youbit). How do you avoid this? Easy. Just keep the exact amount you need to spend in your hot wallet and keep the rest in your cold wallet a.k.a cold storage- which, as I’ve already mentioned, is entirely offline. This way, if your hot wallet provider ever gets hacked and goes out of business, you would have only experienced a relatively lesser loss. Now, there are three types of cold wallets to choose from. When choosing which one to use, it’s always important to keep in mind your purpose and the amount of cryptocurrency you plan to keep in that wallet. That said, the three types are: Hardware wallet: By far the most popular type, this wallet takes the form of a device that you plug into your computer’s USB drive. To date, there has yet been any record of cryptocurrency being stolen from a hardware wallet- which makes it useful for when you plan to acquire large amounts of cryptocurrency. This form of cold wallet is also convenient as you don’t need to type in your details each time you buy or sell cryptocurrency. Check out this list for the best cryptocurrency hardware wallets. Paper wallet: This simply involves you printing out your public and private keys on a piece of paper, thus, preventing hackers from accessing them. However, this does make it a bit tedious to type in your keys every time you need to use them online. You also run the risk of losing all your funds if it somehow winds up in someone else’s hands. So, remember to keep your paper wallet safe and secure. Brainwallet: This type of wallet involves you keeping your keys in your brain! This is usually done by memorizing a seed phrase. This means that, as long as you don’t record your seed phrase anywhere else, you are the only one who’ll ever know your keys, thus, making this the most secure wallet of all. However, If the owner of the seed phrase ever forgets it (or worse, dies), the cryptocurrency connected to that seed phrase is lost forever. Beware of phishing Phishing attacks are usually experienced through deceptive emails and websites. This is where a hacker employs fraudulent (usually psychological) tactics to get you to divulge private details. This type of cyber attack is responsible for over $115 million in stolen Etherium just last year. Now, you might be thinking “Why don’t they just avoid suspicious emails or messages?”, right? The thing is, they’re hard to resist. If you want to avoid falling for phishing attempts, check out this post for how to tell if someone is phishing for your cryptocurrency. Trade in secure exchanges Cryptocurrencies are usually bought and sold in a cryptocurrency exchange. However, not all exchanges can be trusted as some have already been proven fake. The problem here is that there’s no inherent protection and nowhere to run to for help if you lose your money. This is because cryptocurrency is, for the most part, unregulated- although the world is starting to catch up. That said, make sure to do your research before investing your money in any cryptocurrency exchange. You can also check out these 20 security tips for a more detailed list of safe trading practices. Conclusion Cryptocurrency trading can be hard, confusing, and downright risky. But, if you follow this guide, you’re at least a few steps closer to safe cryptocurrency trading. Arm yourself with at least the basic knowledge of how cryptocurrency trading works. Don’t fall for the illusion of anonymity that has fooled others and get yourself the best VPN you can afford and remember to install a reliable antivirus program to avoid malware or ransomware. Never reveal your private key. Hot wallets are fine if they only contain the exact amount you want to spend but it’s better to keep all your keys safe in a cold wallet that fits your purpose. Be wary of suspicious sites, emails, or messages that could turn out to be phishing scams and only trade in secure cryptocurrency exchanges. About Author: Dana Jackson, an U.S. expat living in Germany and the founder of PrivacyHub. She loves all things related to security and privacy. She holds a degree in Political Science, and loves to call herself a scientist. Dana also loves morning coffee and her dog Paw. Cryptocurrency-based firm, Tron acquires BitTorrent Can Cryptocurrency establish a new economic world order? Top 15 Cryptocurrency Trading Bots

0
0
6331

article-image-5g-mobile-data-propel-artificial-intelligence

Neil Aitken

02 Aug 2018

7 min read

How 5G Mobile Data will propel Artificial Intelligence (AI) progress

Neil Aitken

02 Aug 2018

7 min read

Like it’s predecessors, 3G and 4G, 5G refers to the latest ‘G’ – Generation – of mobile technology. 5G will give us very fast - effectively infinitely fast - mobile data download bandwidth. Downloading a TV show to your phone over 5G, in its entirety, in HD, will take less than a second, for example. A podcast will be downloaded within a fraction of a second of you requesting it. Scratch the surface of 5G, however, and there is a great deal more to see than just fast mobile data speeds. 5G is the backbone on which a lot of emerging technologies such as AI, blockchains, IoT among others will reach mainstream adoption. Today, we look at how 5G will accelerate AI growth and adoption. 5G will create the data AI needs to thrive One feature of 5G with ramifications beyond data speed is ‘Latency.’ 5G offers virtually ‘Zero Latency’ as a service. Latency is the time needed to transmit a packet of data from one device to another. It includes the period of time between when the request was made, to the time the response is completed. [caption id="attachment_21251" align="aligncenter" width="580"] 5G will be superfast – but will also benefit from near zero ‘latency’[/caption] Source: Economist At the moment, we keep files (music, pictures or films) in our phones’ memory permanently. We have plenty of processing power on our devices. In fact, the main upgrade between phone generations these days is a faster processor. In a 5G world, we will be able to use cheap parts in our devices – processors and memory in our new phones. Data downloads will be so fast, that we can get them immediately when we need them. We won’t need to store information on the phone unless we want to. Even if the files are downloaded from the cloud, because the network has zero latency – he or she feels like the files are on the phone. In other words, you are guaranteed a seamless user experience in a 5G world. The upshot of all this is that the majority of any new data which is generated from mobile products will move to the cloud for storage. At their most fundamental level, AI algorithms are pattern matching tools. The bigger the data trove, the faster and better performing the results of AI analysis is. These new structured data sets, created by 5G, will be available from the place where it is easiest to extract and manipulate (‘Analyze’) it – the cloud. There will be 100 billion 5G devices connected to cellular networks by 2025, according to Huawei. 5G is going to generate data from those devices, and all the smartphones in the world and send it all back to the cloud. That data is the source of the incredible power AI gives businesses. 5G driving AI in autonomous vehicles 5G’s features and this Cloud / Connected Device future, will manifest itself in many ways. One very visible example is how 5G will supercharge the contribution, especially to reliability and safety, that AI can make to self driving cars. A great deal of the AI processing that is required to keep a self driving car operating safely, will be done by computers on board the vehicle. However, 5G’s facilities to communicate large amounts of data quickly will mean that any unusual inputs (for example, the car is entering or in a crash situation) can be sent to bigger computing equipment on the cloud which could perform more serious processing. Zero latency is important in these situations for commands which might come from a centralized accident computer, designed to increase safety– for example issuing the command ‘break.’ In fact, according to manufacturers, it’s likely that, ultimately, groups of cars will be coordinated by AI using 5G to control the vehicles in a model known as swarm computing. 5G will make AI much more useful with ‘context’ - Intel 5G will power AI by providing location information which can be considered in establishing the context of questions asked of the tool – according to Intel’s Data Center Group. For example, asking your Digital Assistant where the tablets are means something different depending on whether you’re in a pharmacy or an electronics store. The nature of 5G is that it’s a mobile service. Location information is both key to context and an inherent element of information sent over a 5G connection. By communicating where they are, 5G sensors will help AI based Digital Assistants solve our everyday problems. 5G phones will enable AI calculations on ‘Edge’ network devices - ARM 5G will push some processing to the ‘Edge’ of the network, for manipulation by a growing range of AI chips on to the processors of phones. In this regard, smartphones like any Internet Of Things connected processor ‘in the field’ are simply an ‘AI platform’. Handset manufacturers are including new software features in their phones that customers love to use – including AI based search interfaces which allow them to search for images containing ‘heads’ and see an accurate list. [caption id="attachment_21252" align="aligncenter" width="1918"] Arm are designing new types of chips targeted at AI calculations on ‘Edge’ network devices.[/caption] Source: Arm's Project Trillium ARM, one of the world’s largest CPU producers are creating specific, dedicated AI chip sets, often derived from the technology that was behind their Graphics Processing Units. These chips process AI based calculations up to 50 times faster than standard microprocessors already and their performance is set to improve 50x over the next 3 years, according to the company. AI is part of 5G networks - Huawei Huawei describes itself as an AI company (as well as a number of other things including handset manufacturer.) They are one of the biggest electronic manufacturers in China and are currently in the process of selling networking products to the world’s telecommunications companies, as they prepare to roll out their 5G networks. Based on the insight that 70% of network system downtime comes from human error, Huawei is now eliminating humans from the network management component of their work, to the degree that they can. Instead, they’re implementing automated AI based predictive maintenance systems to increase data throughput across the network and reduce downtime. The way we use cellular networks is changing. Different applications require different backend traffic to be routed across the network, depending on the customer need. Someone watching video, for example, has a far lower tolerance for a disruption to the data throughput (the ‘stuttering Netflix’ effect) than a connected IoT sensor which is trying to communicate the temperature of a thermometer. Huawei’s network maintenance AI software optimizes these different package needs, maintaining the near zero latency that the standard demands at a lower cost. AI based network maintenance complete a virtuous loop in which 5G devices on new cellular networks give AI the raw data they need, including valuable context information, and AI helps the data flow across the 5G network better. Bringing it all together 5G and artificial intelligence (AI) are revolutionary technologies that will evolve alongside each other. 5G isn’t just fast data, it’s one of the most important technologies ever devised. Just as the smartphone did, it will fundamentally change how we relate to information, partly, because it will link us to thousands of newly connected devices on the Internet of Things. Ultimately, it could be the secondary effects of 5G, the network’s almost zero latency, which could provide the largest benefit – by creating structured data sets from billions of connected devices, in an easily accessible place – the cloud which can be used to fuel the AI algorithms which run on them. Networking equipment, chip manufacturers and governments have all connected the importance of AI with the potential of 5G. Commercial sales of 5G start in The US, UK and Australia in 2019. 7 Popular Applications of Artificial Intelligence in Healthcare Top languages for Artificial Intelligence development Cognitive IoT: How Artificial Intelligence is remoulding Industrial and Consumer IoT

0
0
6868

article-image-top-automl-libraries-for-building-ml-pipelines

Sunith Shetty

01 Aug 2018

9 min read

Top AutoML libraries for building your ML pipelines

Sunith Shetty

01 Aug 2018

9 min read

What is AutoML? When talking about AutoML we mostly refer to automated data preparation (namely feature preprocessing, generation, and selection) and model training (model selection and hyperparameter optimization). The number of possible options for each step of this process can vary vastly depending on the problem type. AutoML allows researchers and practitioners to automatically build ML pipelines out of the possible options for every step to find high-performing ML models for a given problem. AutoML libraries carefully set up experiments for various ML pipelines, which covers all the steps from data ingestion, data processing, modeling, and scoring. In this article we deal with understanding what AutoML is and cover popular AutoML libraries with practical examples. This article is an excerpt from a book written by Sibanjan Das, Umit Mert Cakmak titled Hands-On Automated Machine Learning. Overview of AutoML libraries There are many popular AutoML libraries, and in this section you will get an overview of commonly used ones in the data science community. Featuretools Featuretools is a good library for automatically engineering features from relational and transactional data. The library introduces the concept called Deep Feature Synthesis (DFS). If you have multiple datasets with relationships defined among them such as parent-child based on columns that you use as unique identifiers for examples, DFS will create new features based on certain calculations, such as summation, count, mean, mode, standard deviation, and so on. Let's go through a small example where you will have two tables, one showing the database information and the other showing the database transactions for each database: import pandas as pd # First dataset contains the basic information for databases. databases_df = pd.DataFrame({"database_id": [2234, 1765, 8796, 2237, 3398], "creation_date": ["2018-02-01", "2017-03-02", "2017-05-03", "2013-05-12", "2012-05-09"]}) databases_df.head() You get the following output: The following is the code for the database transaction: # Second dataset contains the information of transaction for each database id db_transactions_df = pd.DataFrame({"transaction_id": [26482746, 19384752, 48571125, 78546789, 19998765, 26482646, 12484752, 42471125, 75346789, 16498765, 65487547, 23453847, 56756771, 45645667, 23423498, 12335268, 76435357, 34534711, 45656746, 12312987], "database_id": [2234, 1765, 2234, 2237, 1765, 8796, 2237, 8796, 3398, 2237, 3398, 2237, 2234, 8796, 1765, 2234, 2237, 1765, 8796, 2237], "transaction_size": [10, 20, 30, 50, 100, 40, 60, 60, 10, 20, 60, 50, 40, 40, 30, 90, 130, 40, 50, 30], "transaction_date": ["2018-02-02", "2018-03-02", "2018-03-02", "2018-04-02", "2018-04-02", "2018-05-02", "2018-06-02", "2018-06-02", "2018-07-02", "2018-07-02", "2018-01-03", "2018-02-03", "2018-03-03", "2018-04-03", "2018-04-03", "2018-07-03", "2018-07-03", "2018-07-03", "2018-08-03", "2018-08-03"]}) db_transactions_df.head() You get the following output: The code for the entities is as follows: # Entities for each of datasets should be defined entities = { "databases" : (databases_df, "database_id"), "transactions" : (db_transactions_df, "transaction_id") } # Relationships between tables should also be defined as below relationships = [("databases", "database_id", "transactions", "database_id")] print(entities) You get the following output for the preceding code: The following code snippet will create feature matrix and feature definitions: # There are 2 entities called ‘databases’ and ‘transactions’ # All the pieces that are necessary to engineer features are in place, you can create your feature matrix as below import featuretools as ft feature_matrix_db_transactions, feature_defs = ft.dfs(entities=entities, relationships=relationships, target_entity="databases") The following output shows some of the features that are generated: You can see all feature definitions by looking at the following features_defs: feature_defs The output is as follows: This is how you can easily generate features based on relational and transactional datasets. Auto-sklearn Scikit-learn has a great API for developing ML models and pipelines. Scikit-learn's API is very consistent and mature; if you are used to working with it, auto-sklearn will be just as easy to use since it's really a drop-in replacement for scikit-learn estimators. Let's see a little example: # Necessary imports import autosklearn.classification import sklearn.model_selection import sklearn.datasets import sklearn.metrics from sklearn.model_selection import train_test_split # Digits dataset is one of the most popular datasets in machine learning community. # Every example in this datasets represents a 8x8 image of a digit. X, y = sklearn.datasets.load_digits(return_X_y=True) # Let's see the first image. Image is reshaped to 8x8, otherwise it's a vector of size 64. X[0].reshape(8,8) The output is as follows: You can plot a couple of images to see how they look: import matplotlib.pyplot as plt number_of_images = 10 images_and_labels = list(zip(X, y)) for i, (image, label) in enumerate(images_and_labels[:number_of_images]): plt.subplot(2, number_of_images, i + 1) plt.axis('off') plt.imshow(image.reshape(8,8), cmap=plt.cm.gray_r, interpolation='nearest') plt.title('%i' % label) plt.show() Running the preceding snippet will give you the following plot: Splitting the dataset to train and test data: # We split our dataset to train and test data X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1) # Similarly to creating an estimator in Scikit-learn, we create AutoSklearnClassifier automl = autosklearn.classification.AutoSklearnClassifier() # All you need to do is to invoke fit method to start experiment with different feature engineering methods and machine learning models automl.fit(X_train, y_train) # Generating predictions is same as Scikit-learn, you need to invoke predict method. y_hat = automl.predict(X_test) print("Accuracy score", sklearn.metrics.accuracy_score(y_test, y_hat)) # Accuracy score 0.98 That was easy, wasn't it? MLBox MLBox is another AutoML library that supports distributed data processing, cleaning, formatting, and state-of-the-art algorithms such as LightGBM and XGBoost. It also supports model stacking, which allows you to combine an information ensemble of models to generate a new model aiming to have better performance than the individual models. Here's an example of its usage: # Necessary Imports from mlbox.preprocessing import * from mlbox.optimisation import * from mlbox.prediction import * import wget file_link = 'https://apsportal.ibm.com/exchange-api/v1/entries/8044492073eb964f46597b4be06ff5ea/data?accessKey=9561295fa407698694b1e254d0099600' file_name = wget.download(file_link) print(file_name) # GoSales_Tx_NaiveBayes.csv The GoSales dataset contains information for customers and their product preferences: import pandas as pd df = pd.read_csv('GoSales_Tx_NaiveBayes.csv') df.head() You get the following output from the preceding code: Let's create a test set from the same dataset by dropping a target column: test_df = df.drop(['PRODUCT_LINE'], axis = 1) # First 300 records saved as test dataset test_df[:300].to_csv('test_data.csv') paths = ["GoSales_Tx_NaiveBayes.csv", "test_data.csv"] target_name = "PRODUCT_LINE" rd = Reader(sep = ',') df = rd.train_test_split(paths, target_name) The output will be similar to the following: Drift_thresholder will help you to drop IDs and drifting variables between train and test datasets: dft = Drift_thresholder() df = dft.fit_transform(df) You get the following output: Optimiser will optimize the hyperparameters: opt = Optimiser(scoring = 'accuracy', n_folds = 3) opt.evaluate(None, df) You get the following output by running the preceding code: The following code defines the parameters of the ML pipeline: space = { 'ne__numerical_strategy':{"search":"choice", "space":[0]}, 'ce__strategy':{"search":"choice", "space":["label_encoding","random_projection", "entity_embedding"]}, 'fs__threshold':{"search":"uniform", "space":[0.01,0.3]}, 'est__max_depth':{"search":"choice", "space":[3,4,5,6,7]} } best = opt.optimise(space, df,15) The following output shows you the selected methods that are being tested by being given the ML algorithms, which is LightGBM in this output: You can also see various measures such as accuracy, variance, and CPU time: Using Predictor, you can use the best model to make predictions: predictor = Predictor() predictor.fit_predict(best, df) You get the following output: TPOT Tree-Based Pipeline Optimization Tool (TPOT) uses genetic programming to find the best performing ML pipelines, built on top of scikit-learn. Once your dataset is cleaned and ready to be used, TPOT will help you with the following steps of your ML pipeline: Feature preprocessing Feature construction and selection Model selection Hyperparameter optimization Once TPOT is done with its experimentation, it will provide you with the best performing pipeline. TPOT is very user-friendly as it's similar to using scikit-learn's API: from tpot import TPOTClassifier from sklearn.datasets import load_digits from sklearn.model_selection import train_test_split # Digits dataset that you have used in Auto-sklearn example digits = load_digits() X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, train_size=0.75, test_size=0.25) # You will create your TPOT classifier with commonly used arguments tpot = TPOTClassifier(generations=10, population_size=30, verbosity=2) # When you invoke fit method, TPOT will create generations of populations, seeking best set of parameters. Arguments you have used to create TPOTClassifier such as generations and population_size will affect the search space and resulting pipeline. tpot.fit(X_train, y_train) print(tpot.score(X_test, y_test)) # 0.9834 tpot.export('my_pipeline.py') Once you have exported your pipeline in the Python my_pipeline.py file, you will see the selected pipeline components: import numpy as np import pandas as pd from sklearn.model_selection import train_test_split from sklearn.neighbors import KNeighborsClassifier # NOTE: Make sure that the class is labeled 'target' in the data file tpot_data = pd.read_csv('PATH/TO/DATA/FILE', sep='COLUMN_SEPARATOR', dtype=np.float64) features = tpot_data.drop('target', axis=1).values training_features, testing_features, training_target, testing_target = train_test_split(features, tpot_data['target'].values, random_state=42) exported_pipeline = KNeighborsClassifier(n_neighbors=6, weights="distance") exported_pipeline.fit(training_features, training_target) results = exported_pipeline.predict(testing_features) To summarize, you learnt about Automated ML and practiced your skills using popular AutoML libraries. This is definitely not the whole list, and AutoML is an active area of research. You should check out other libraries such as Auto-WEKA, which also uses the latest innovations in Bayesian optimization, and Xcessive, which is a user-friendly tool for creating stacked ensembles. To know how AutoML can be further used to automate parts of Machine Learning, check out the book Hands-On Automated Machine Learning. Read more Anatomy of an automated machine learning algorithm (AutoML) AutoML: Developments and where is it heading to AmoebaNets: Google’s new evolutionary AutoML

0
0
11406

article-image-earn-1m-per-year-hint-learn-machine-learning

Neil Aitken

01 Aug 2018

10 min read

How to earn $1m per year? Hint: Learn machine learning

Neil Aitken

01 Aug 2018

10 min read

0
0
20822

article-image-are-distributed-networks-decentralized-systems-same

Amarabha Banerjee

31 Jul 2018

3 min read

Are distributed networks and decentralized systems the same?

Amarabha Banerjee

31 Jul 2018

3 min read

The emergence of Blockchain has paved way for the implementation of non-centralized network architecture. The seeds of distributed network architecture was sown back in 1964, by Paul Baran in his paper “On Distributed Networks“. Since then, there have been many attempts at implementing this architecture in network systems. The most recent implementation has been aided by the discovery of Blockchain technology, in 2009, by the anonymous Satoshi Nakamoto. Terminologies: Network: A collection of interlinked nodes that exchange information. Node: The most basic part of the network; for example, a user or computer. Link: The connection between two nodes. Server: A node that has connections to a relatively large amount of other nodes The mother of all the network architectures is a centralized Network. Here, the primary decision making control rests with one Node. The message is carried on by Sub Nodes. Distributed networks are in a way, a conglomeration of small centralized networks. It consists of multiple Nodes which themselves are miniature versions of centralized networks. Decentralized networks consist of individual nodes and every one of these Nodes are capable of independent decision making. Hence there is no central control in Decentralized networks. Source: Meshworld A common misconception is that Distributed and Decentralized systems are the same; they just have different nomenclature and slightly different functionalities. In a way this is true, but not completely. A distributed system still has one central control algorithm, that takes the final call over the process and protocol to be followed. As an example, let’s consider distributed digital ledgers. Each of these ledgers are independent network nodes. These ledgers get updated with individual transactions and other details and this information is not passed on to the other Nodes. This particular feature makes this system secure and decentralized. The other Nodes are not aware of the information stored. This is how a decentralized network behaves. Now the same system behaves a bit differently, when the Nodes communicate with each other, (in-case of Ethereum, by the movement of “Ether” between nodes). Although the individual Node’s information stays secure, the information about the state of the Nodes is passed on, finally to a central peer. Then the peer takes on the decision of which state to finally change to and what is the optimum process to change the state. This is decided by the votes of the individual nodes. The Nodes then change to the new state, preserving information about the previous state. This makes the system dynamic, secure and distributed, because although the Nodes get to vote based on their individual states, the final decision is taken by the centralized peer. This is a distributed system. Hence we can clearly state that decentralized systems are a subset of distributed systems, more independent and minus any central controlling authority. This presents a familiar question to us - are the current blockchain based apps purely decentralized, or are they just distributed systems with a central control? Could this be the reason why we have not yet reached the ultimate Cryptocurrency based alternative economy? Put differently, is the invisible central control hindering the evolution of blockchain based systems to a purely decentralized system and economy? Only time and more dedicated research, along with better practical implementation of decentralized applications will answer these questions. A brief history of Blockchain Blockchain can solve tech’s trust issues – Imran Bashir Partnership alliances of Kontakt.io and IOTA Foundation for IoT and Blockchain

0
0
9967

Tech Guides - Data

5 artificial intelligence tools data scientists might not know

5 examples of Artificial Intelligence in Web apps

How everyone at Netflix uses Jupyter notebooks from data scientists, machine learning engineers, to data analysts

Tackle trolls with Machine Learning bots: Filtering out inappropriate content just got easy

Budget and Demand Forecasting using Markov model in SAS [Tutorial]

Predictive Analytics with AWS: A quick look at Amazon ML

Do you write Python Code or Pythonic Code?

Four interesting Amazon patents in 2018 that use machine learning, AR, and robotics

Effective Product Development needs developers and product managers collaborating on success metrics

Why Neo4j is the most popular graph database

Trending Topics

A Guide to safe cryptocurrency trading

How 5G Mobile Data will propel Artificial Intelligence (AI) progress

Top AutoML libraries for building your ML pipelines

How to earn $1m per year? Hint: Learn machine learning

Are distributed networks and decentralized systems the same?