Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon

Tech Guides

852 Articles
article-image-polycloud-a-better-alternative-to-cloud-agnosticism
Richard Gall
16 May 2018
3 min read
Save for later

Polycloud: a better alternative to cloud agnosticism

Richard Gall
16 May 2018
3 min read
What is polycloud? Polycloud is an emerging cloud strategy that is starting to take hold across a range of organizations. The concept is actually pretty simple: instead of using a single cloud vendor, you use multiple vendors. By doing this, you can develop a customized cloud solution that is suited to your needs. For example, you might use AWS for the bulk of your services and infrastructure, but decide to use Google's cloud for its machine learning capabilities. Polycloud has emerged because of the intensely competitive nature of the cloud space today. All three major vendors - AWS, Azure, and Google Cloud - don't particularly differentiate their products. The core features are pretty much the same across the market. Of course, there are certain subtle differences between each solution, as the example above demonstrates; taking a polycloud approach means you can leverage these differences rather than compromising with your vendor of choice. What's the difference between a polycloud approach and a cloud agnostic approach? You might be thinking that polycloud sounds like cloud agnosticism. And while there are clearly many similarities, the differences between the two are very important. Cloud agnosticism aims for a certain degree of portability across different cloud solutions. This can, of course, be extremely expensive. It also adds a lot of complexity, especially in how you orchestrate deployments across different cloud providers. True, there are times when cloud agnosticism might work for you; if you're not using the services being provided to you, then yes, cloud agnosticism might be the way to go. However, in many (possibly most) cases, cloud agnosticism makes life harder. Polycloud makes it a hell of a lot easier. In fact, it ultimately does what many organizations have been trying to do with a cloud agnostic strategy. It takes the parts you want from each solution and builds it around what you need. Perhaps one of the key benefits of a polycloud approach is that it gives more power back to users. Your strategic thinking is no longer limited to what AWS, Azure or Google offers - you can instead start with your needs and build the solution around that. How quickly is polycloud being adopted? Polycloud first featured in Thoughtworks' Radar in November 2017. At that point it was in the 'assess' stage of Thoughtworks' cycle; this means it's simply worth exploring and investigating in more detail. However, in its May 2018 Radar report, polycloud had moved into the 'trial' phase. This means it is seen as being an approach worth adopting. It will be worth watching the polycloud trend closely over the next few months to see how it evolves. There's a good chance that we'll see it come to replace cloud agnosticism. Equally, it's likely to impact the way AWS, Azure and Google respond. In many ways, the trend a reaction to the way the market has evolved; it may force the big players in the market to evolve what they offer to customers and clients. Read next Serverless computing wars: AWS Lambdas vs Azure Functions How to run Lambda functions on AWS Greengrass
Read more
  • 0
  • 0
  • 21264

article-image-common-problems-in-delphi-parallel-programming
Pavan Ramchandani
27 Jul 2018
12 min read
Save for later

Common problems in Delphi parallel programming

Pavan Ramchandani
27 Jul 2018
12 min read
This tutorial will be explaining how to find performance bottlenecks and apply the correct algorithm to fix them when working with Delphi. Also, teach you how to improve your algorithms before taking you through parallel programming. The article is an excerpt from a book written by Primož Gabrijelčič, titled Delphi High Performance. Never access UI from a background thread Let's start with the biggest source of hidden problems—manipulating a user interface from a background thread. This is, surprisingly, quite a common problem—even more so as all Delphi resources on multithreaded programming will simply say to never do that. Still, it doesn't seem to touch some programmers, and they will always try to find an excuse to manipulate a user interface from a background thread. Indeed, there may be a situation where VCL or FireMonkey may be manipulated from a background thread, but you'll be treading on thin ice if you do that. Even if your code works with the current Delphi, nobody can guarantee that changes in graphical libraries introduced in future Delphis won't break your code. It is always best to cleanly decouple background processing from a user interface. Let's look at an example which nicely demonstrates the problem. The ParallelPaint demo has a simple form, with eight TPaintBox components and eight threads. Each thread runs the same drawing code and draws a pattern into its own TPaintBox. As every thread accesses only its own Canvas, and no other user interface components, a naive programmer would therefore assume that drawing into paintboxes directly from background threads would not cause problems. A naive programmer would be very much mistaken. If you run the program, you will notice that although the code paints constantly into some of the paint boxes, others stop to be updated after some time. You may even get a Canvas does not allow drawing exception. It is impossible to tell in advance which threads will continue painting and which will not. The following image shows an example of an output. The first two paint boxes in the first row, and the last one in the last row were not updated anymore when I grabbed the image: The lines are drawn in the DrawLine method. It does nothing special, just sets the color for that line and draws it. Still, that is enough to break the user interface when this is called from multiple threads at once, even though each thread uses its own Canvas: procedure TfrmParallelPaint.DrawLine(canvas: TCanvas; p1, p2: TPoint; color: TColor); begin Canvas.Pen.Color := color; Canvas.MoveTo(p1.X, p1.Y); Canvas.LineTo(p2.X, p2.Y); end; Is there a way around this problem? Indeed there is. Delphi's TThread class implements a method, Queue, which executes some code in the main thread. Queue takes a procedure or anonymous method as a parameter and sends it to the main thread. After some short time, the code is then executed in the main thread. It is impossible to tell how much time will pass before the code is executed, but that delay will typically be very short, in the order of milliseconds. As it accepts an anonymous method, we can use the magic of variable capturing and write the corrected code, as shown here: procedure TfrmParallelPaint.QueueDrawLine(canvas: TCanvas; p1, p2: TPoint; color: TColor); begin TThread.Queue(nil, procedure begin Canvas.Pen.Color := color; Canvas.MoveTo(p1.X, p1.Y); Canvas.LineTo(p2.X, p2.Y); end); end; In older Delphis you don't have such a nice Queue method but only a version of Synchronize that accepts a normal  method. If you have to use this method, you cannot count on anonymous method mechanisms to handle parameters. Rather, you have to copy them to fields and then Synchronize a parameterless method operating on these fields. The following code fragment shows how to do that: procedure TfrmParallelPaint.SynchronizedDraw; begin FCanvas.Pen.Color := FColor; FCanvas.MoveTo(FP1.X, FP1.Y); FCanvas.LineTo(FP2.X, FP2.Y); end; procedure TfrmParallelPaint.SyncDrawLine(canvas: TCanvas; p1, p2: TPoint; color: TColor); begin FCanvas := canvas; FP1 := p1; FP2 := p2; FColor := color; TThread.Synchronize(nil, SynchronizedDraw); end; If you run the corrected program, the final result should always be similar to the following image, with all eight  TPaintBox components showing a nicely animated image: Simultaneous reading and writing The next situation which I'm regularly seeing while looking at a badly-written parallel code is simultaneous reading and writing from/to a shared data structure, such as a list.  The SharedList program demonstrates how things can go wrong when you share a data structure between threads. Actually, scrap that, it shows how things will go wrong if you do that. This program creates a shared list, FList: TList<Integer>. Then it creates one background thread which runs the method ListWriter and multiple background threads, each running the ListReader method. Indeed, you can run the same code in multiple threads. This is a perfectly normal behavior and is sometimes extremely useful. The ListReader method is incredibly simple. It just reads all the elements in a list and does that over and over again. As I've mentioned before, the code in my examples makes sure that problems in multithreaded code really do occur, but because of that, my demo code most of the time also looks terribly stupid. In this case, the reader just reads and reads the data because that's the best way to expose the problem: procedure TfrmSharedList.ListReader; var i, j, a: Integer; begin for i := 1 to CNumReads do for j := 0 to FList.Count - 1 do a := FList[j]; end; The ListWriter method is a bit different. It also loops around, but it also sleeps a little inside each loop iteration. After the Sleep, the code either adds to the list or deletes from it. Again, this is designed so that the problem is quick to appear: procedure TfrmSharedList.ListWriter; var i: Integer; begin for i := 1 to CNumWrites do begin Sleep(1); if FList.Count > 10 then FList.Delete(Random(10)) else FList.Add(Random(100)); end; end; If you start the program in a debugger, and click on the Shared lists button, you'll quickly get an EArgumentOutOfRangeException exception. A look at the stack trace will show that it appears in the line a := FList[j];. In retrospect, this is quite obvious. The code in ListReader starts the inner for loop and reads the FListCount. At that time, FList has 11 elements so Count is 11. At the end of the loop, the code tries to read FList[10], but in the meantime ListWriter has deleted one element and the list now only has 10 elements. Accessing element [10] therefore raises an exception. We'll return to this topic later, in the section about Locking. For now you should just keep in mind that sharing data structures between threads causes problems. Sharing a variable OK, so rule number two is "Shared structures bad". What about sharing a simple variable? Nothing can go wrong there, right? Wrong! There are actually multiple ways something can go wrong. The program IncDec demonstrates one of the bad things that can happen. The code contains two methods: IncValue and DecValue. The former increments a shared FValue: integer; some number of times, and the latter decrements it by the same number of times: procedure TfrmIncDec.IncValue; var i: integer; value: integer; begin for i := 1 to CNumRepeat do begin value := FValue; FValue := value + 1; end; end; procedure TfrmIncDec.DecValue; var i: integer; value: integer; begin for i := 1 to CNumRepeat do begin value := FValue; FValue := value - 1; end; end; A click on the Inc/Dec button sets the shared value to 0, runs IncValue, then DecValue, and logs the result: procedure TfrmIncDec.btnIncDec1Click(Sender: TObject); begin FValue := 0; IncValue; DecValue; LogValue; end; I know you can all tell what FValue will hold at the end of this program. Zero, of course. But what will happen if we run IncValue and DecValue in parallel? That is, actually, hard to predict! A click on the Multithreaded button does almost the same, except that it runs IncValue and DecValue in parallel. How exactly that is done is not important at the moment (but feel free to peek into the code if you're interested): procedure TfrmIncDec.btnIncDec2Click(Sender: TObject); begin FValue := 0; RunInParallel(IncValue, DecValue); LogValue; end; Running this version of the code may still sometimes put zero in FValue, but that will be extremely rare. You most probably won't be able to see that result unless you are very lucky. Most of the time, you'll just get a seemingly random number from the range -10,000,000 to 10,000,000 (which is the value of the CNumRepeatconstant). In the following image, the first number is a result of the single-threaded code, while all the rest were calculated by the parallel version of the algorithm: To understand what's going on, you should know that Windows (and all other operating systems) does many things at once. At any given time, there are hundreds of threads running in different programs and they are all fighting for the limited number of CPU cores. As our program is the active one (has focus), its threads will get most of the CPU time, but still they'll sometimes be paused for some amount of time so that other threads can run. Because of that, it can easily happen that IncValue reads the current value of FValue into value (let's say that the value is 100) and is then paused. DecValue reads the same value and then runs for some time, decrementing FValue. Let's say that it gets it down to -20,000. (That is just a number without any special meaning.) After that, the IncValue thread is awakened. It should increment the value to -19,999, but instead of that it adds 1 to 100 (stored in value), gets 101, and stores that into FValue. Ka-boom! In each repetition of the program, this will happen at different times and will cause a different result to be calculated. You may complain that the problem is caused by the two-stage increment and decrement, but you'd be wrong. I dare you—go ahead, change the code so that it will modify FValue with Inc(FValue) and Dec(FValue) and it still won't work correctly. Well, I hear you say, so I shouldn't even modify one variable from two threads at the same time? I can live with that. But surely, it is OK to write into a variable from one thread and read from another? The answer, as you can probably guess given the general tendency of this section, is again—no, you may not. There are some situations where this is OK (for example, when a variable is only one byte long) but, in general, even simultaneous reading and writing can be a source of weird problems. The ReadWrite program demonstrates this problem. It has a shared buffer, FBuf: Int64, and a pointer variable used to read and modify the data, FPValue: PInt64. At the beginning, the buffer is initialized to an easily recognized number and a pointer variable is set to point to the buffer: FPValue := @FBuf; FPValue^ := $7777777700000000; The program runs two threads. One just reads from the location and stores all the read values into a list. This value is created with Sorted and Duplicates properties, set in a way that prevents it from storing duplicate values: procedure TfrmReadWrite.Reader; var i: integer; begin for i := 1 to CNumRepeat do FValueList.Add(FPValue^); end; The second thread repeatedly writes two values into the shared location: procedure TfrmReadWrite.Writer; var i: integer; begin for i := 1 to CNumRepeat do begin FPValue^ := $7777777700000000; FPValue^ := $0000000077777777; end; end; At the end, the contents of the FValueList list are logged on the screen. We would expect to see only two values—$7777777700000000 and $0000000077777777. In reality, we see four, as the following screenshot demonstrates: The reason for that strange result is that Intel processors in 32-bit mode can't write a 64-bit number (as int64 is) in one step. In other words, reading and writing 64-bit numbers in 32-bit code is not atomic. When multithreading programmers talk about something being atomic, they want to say that an operation will execute in one indivisible step. Any other thread will either see a state before the operation or a state after the operation, but never some undefined intermediate state. How do values $7777777777777777 and $0000000000000000 appear in the test application? Let's say that FValue^ contains $7777777700000000. The code then starts writing $0000000077777777 into FValue by firstly storing a $77777777 into the bottom four bytes. After that it starts writing $00000000 into the upper four bytes of FValue^, but in the meantime Reader reads the value and gets $7777777777777777. In a similar way, Reader will sometimes see $0000000000000000 in the FValue^. We'll look into a way to solve this situation immediately, but in the meantime, you may wonder—when is it okay to read/write from/to a variable at the same time? Sadly, the answer is—it depends. Not even just on the CPU family (Intel and ARM processors behave completely differently), but also on a specific architecture used in a processor. For example, older and newer Intel processors may not behave the same in that respect. You can always depend on access to byte-sized data being atomic, but that is that. Access (reads and writes) to larger quantities of data (words, integers) is atomic only if the data is correctly aligned. You can access word sized data atomically if it is word aligned, and integer data if it is double-word aligned. If the code was compiled in 64-bit mode, you can also atomically access in 64 data if it is quad-word aligned. When you are not using data packing (such as packed records) the compiler will take care of alignment and data access should automatically be atomic. You should, however, still check the alignment in code, if nothing else to prevent stupid programming errors. If you want to write and read larger amounts of data, modify the data, or if you want to work on shared data structures, correct alignment will not be enough. You will need to introduce synchronization into your program. If you found this post useful, do check out the book Delphi High Performance to learn more about the intricacies of how to perform High-performance programming with Delphi. Delphi: memory management techniques for parallel programming Parallel Programming Patterns Concurrency programming 101: Why do programmers hang by a thread?
Read more
  • 0
  • 1
  • 20846

article-image-7-ai-tools-mobile-developers-need-to-know
Bhagyashree R
20 Sep 2018
11 min read
Save for later

7 AI tools mobile developers need to know

Bhagyashree R
20 Sep 2018
11 min read
Advancements in artificial intelligence (AI) and machine learning has enabled the evolution of mobile applications that we see today. With AI, apps are now capable of recognizing speech, images, and gestures, and translate voices with extraordinary success rates. With a number of apps hitting the app stores, it is crucial that they stand apart from competitors by meeting the rising standards of consumers. To stay relevant it is important that mobile developers keep up with these advancements in artificial intelligence. As AI and machine learning become increasingly popular, there is a growing selection of tools and software available for developers to build their apps with. These cloud-based and device-based artificial intelligence tools provide developers a way to power their apps with unique features. In this article, we will look at some of these tools and how app developers are using them in their apps. Caffe2 - A flexible deep learning framework Source: Qualcomm Caffe2 is a lightweight, modular, scalable deep learning framework developed by Facebook. It is a successor of Caffe, a project started at the University of California, Berkeley. It is primarily built for production use cases and mobile development and offers developers greater flexibility for building high-performance products. Caffe2 aims to provide an easy way to experiment with deep learning and leverage community contributions of new models and algorithms. It is cross-platform and integrates with Visual Studio, Android Studio, and Xcode for mobile development. Its core C++ libraries provide speed and portability, while its Python and C++ APIs make it easy for you to prototype, train, and deploy your models. It utilizes GPUs when they are available. It is fine-tuned to take full advantage of the NVIDIA GPU deep learning platform. To deliver high performance, Caffe2 uses some of the deep learning SDK libraries by NVIDIA such as cuDNN, cuBLAS, and NCCL. Functionalities Enable automation Image processing Perform object detection Statistical and mathematical operations Supports distributed training enabling quick scaling up or down Applications Facebook is using Caffe2 to help their developers and researchers train large machine learning models and deliver AI on mobile devices. Using Caffe2, they significantly improved the efficiency and quality of machine translation systems. As a result, all machine translation models at Facebook have been transitioned from phrase-based systems to neural models for all languages. OpenCV - Give the power of vision to your apps Source: AndroidPub OpenCV short for Open Source Computer Vision Library is a collection of programming functions for real-time computer vision and machine learning. It has C++, Python, and Java interfaces and supports Windows, Linux, Mac OS, iOS and Android. It also supports the deep learning frameworks TensorFlow and PyTorch. Written natively in C/C++, the library can take advantage of multi-core processing. OpenCV aims to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in the commercial products. The library consists of more than 2500 optimized algorithms including both classic and state-of-the-art computer vision algorithms. Functionalities These algorithms can be used for the following: To detect and recognize faces Identify objects Classify human actions in videos Track camera movements and moving objects Extract 3D models of objects Produce 3D point clouds from stereo cameras Stitch images together to produce a high-resolution image of an entire scene Find similar images from an image database Applications Plickers is an assessment tool, that lets you poll your class for free, without the need for student devices. It uses OpenCV as its graphics and video SDK. You just have to give each student a card called a paper clicker, and use your iPhone/iPad to scan them to do instant checks-for-understanding, exit tickets, and impromptu polls. Also check out FastCV BoofCV TensorFlow Lite and Mobile - An Open Source Machine Learning Framework for Everyone Source: YouTube TensorFlow is an open source software library for building machine learning models. Its flexible architecture allows easy model deployment across a variety of platforms ranging from desktops to mobile and edge devices. Currently, TensorFlow provides two solutions for deploying machine learning models on mobile devices: TensorFlow Mobile and TensorFlow Lite. TensorFlow Lite is an improved version of TensorFlow Mobile, offering better performance and smaller app size. Additionally, it has very few dependencies as compared to TensorFlow Mobile, so it can be built and hosted on simpler, more constrained device scenarios. TensorFlow Lite also supports hardware acceleration with the Android Neural Networks API. But the catch here is that TensorFlow Lite is currently in developer preview and only has coverage to a limited set of operators. So, to develop production-ready mobile TensorFlow apps, it is recommended to use TensorFlow Mobile. Also, TensorFlow Mobile supports customization to add new operators not supported by TensorFlow Mobile by default, which is a requirement for most of the models of different AI apps. Although TensorFlow Lite is in developer preview, its future releases “will greatly simplify the developer experience of targeting a model for small devices”. It is also likely to replace TensorFlow Mobile, or at least overcome its current limitations. Functionalities Speech recognition Image recognition Object localization Gesture recognition Optical character recognition Translation Text classification Voice synthesis Applications The Alibaba tech team is using TensorFlow Lite to implement and optimize speaker recognition on the client side. This addresses many of the common issues of the server-side model, such as poor network connectivity, extended latency, and poor user experience. Google uses TensorFlow for advanced machine learning models including Google Translate and RankBrain. Core ML - Integrate machine learning in your iOS apps Source: AppleToolBox Core ML is a machine learning framework which can be used to integrate machine learning model in your iOS apps. It supports Vision for image analysis, Natural Language for natural language processing, and GameplayKit for evaluating learned decision trees. Core ML is built on top of the following low-level APIs, providing a simple higher level abstraction to these: Accelerate optimizes large-scale mathematical computations and image calculations for high performance. Basic neural network subroutines (BNNS) provides a collection of functions using which you can implement and run neural networks trained with previously obtained data. Metal Performance Shaders is a collection of highly optimized compute and graphic shaders that are designed to integrate easily and efficiently into your Metal app. To train and deploy custom models you can also use the Create ML framework. It is a machine learning framework in Swift, which can be used to train models using native Apple technologies like Swift, Xcode, and Other Apple frameworks. Functionalities Face and face landmark detection Text detection Barcode recognition Image registration Language and script identification Design games with functional and reusable architecture Applications Lumina is a camera designed in Swift for easily integrating Core ML models - as well as image streaming, QR/Barcode detection, and many other features. ML Kit by Google - Seamlessly build machine learning into your apps Source: Google ML Kit is a cross-platform suite of machine learning tools for its Firebase mobile development platform. It comprises of Google's ML technologies, such as the Google Cloud Vision API, TensorFlow Lite, and the Android Neural Networks API together in a single SDK enabling you to apply ML techniques to your apps easily. You can leverage its ready-to-use APIs for common mobile use cases such as recognizing text, detecting faces, identifying landmarks, scanning barcodes, and labeling images. If these APIs don't cover your machine learning problem, you can use your own existing TensorFlow Lite models. You just have to upload your model on Firebase and ML Kit will take care of the hosting and serving. These APIs can run on-device or in the cloud. Its on-device APIs process your data quickly and work even when there’s no network connection. Its cloud-based APIs leverage the power of Google Cloud Platform's machine learning technology to give you an even higher level of accuracy. Functionalities Automate tedious data entry for credit cards, receipts, and business cards, or help organize photos. Extract text from documents, which you can use to increase accessibility or translate documents. Real-time face detection can be used in applications like video chat or games that respond to the player's expressions. Using image labeling you can add capabilities such as content moderation and automatic metadata generation. Applications A popular calorie counter app, Lose It! uses Google ML Kit Text Recognition API to quickly capture nutrition information to ensure it’s easy to record and extremely accurate. PicsArt uses ML Kit custom model APIs to provide TensorFlow–powered 1000+ effects to enable millions of users to create amazing images with their mobile phones. Dialogflow - Give users new ways to interact with your product Source: Medium Dialogflow is a Natural Language Understanding (NLU) platform that makes it easy for developers to design and integrate conversational user interfaces into mobile apps, web applications, devices, and bots. You can integrate it on Alexa, Cortana, Facebook Messenger, and other platforms your users are on. With Dialogflow you can build interfaces, such as chatbots and conversational IVR that enable natural and rich interactions between your users and your business. It provides this human-like interaction with the help of agents. Agents can understand the vast and varied nuances of human language and translate that to standard and structured meaning that your apps and services can understand. It comes in two types: Dialogflow Standard Edition and Dialogflow Enterprise Edition. Dialogflow Enterprise Edition users have access to Google Cloud Support and a service level agreement (SLA) for production deployments. Functionalities Provide customer support One-click integration on 14+ platforms Supports multilingual responses Improve NLU quality by training with negative examples Debug using more insights and diagnostics Applications Domino’s simplified the process of ordering pizza using Dialogflow’s conversational technology. Domino's leveraged large customer service knowledge and Dialogflow's NLU capabilities to build both simple customer interactions and increasingly complex ordering scenarios. Also check out Wit.ai Rasa NLU Microsoft Cognitive Services - Make your apps see, hear, speak, understand and interpret your user needs Source: Neel Bhatt Cognitive Services is a collection of APIs, SDKs, and services to enable developers easily add cognitive features to their applications such as emotion and video detection, facial, speech, and vision recognition, among others. You need not be an expert in data science to make your systems more intelligent and engaging. The pre-built services come with high-quality RESTful intelligent APIs for the following: Vision: Make your apps identify and analyze content within images and videos. Provides capabilities such as image classification, optical character recognition in images, face detection, person identification, and emotion identification. Speech: Integrate speech processing capabilities into your app or services such as text-to-speech, speech-to-text, speaker recognition, and speech translation. Language: Your application or service will understand the meaning of the unstructured text or the intent behind a speaker's utterances. It comes with capabilities such as text sentiment analysis, key phrase extraction, automated and customizable text translation. Knowledge: Create knowledge-rich resources that can be integrated into apps and services. It provides features such as QnA extraction from unstructured text, knowledge base creation from collections of Q&As, and semantic matching for knowledge bases. Search: Using Search API you can find exactly what you are looking for across billions of web pages. It provides features like ad-free, safe, location-aware web search, Bing visual search, custom search engine creation, and many more. Applications To safeguard against fraud, Uber uses the Face API, part of Microsoft Cognitive Services, to help ensure the driver using the app matches the account on file. Cardinal Blue developed an app called PicCollage, a popular mobile app that allows users to combine photos, videos, captions, stickers, and special effects to create unique collages. Also check out AWS machine learning services IBM Watson These were some of the tools that will help you integrate intelligence into your apps. These libraries make it easier to add capabilities like speech recognition, natural language processing, computer vision, and many others, giving users the wow moment of accomplishing something that wasn’t quite possible before. Along with choosing the right AI tool, you must also consider other factors that can affect your app performance. These factors include the accuracy of your machine learning model, which can be affected by bias and variance, using correct datasets for training, seamless user interaction, and resource-optimization, among others. While building any intelligent app it is also important to keep in mind that the AI in your app is solving a problem and it doesn’t exist because it is cool. Thinking from the user’s perspective will allow you to assess the importance of a particular problem. A great AI app will not just help users do something faster, but enable them to do something they couldn’t do before. With the growing popularity and the need to speed up the development of intelligent apps, many companies ranging from huge tech giants to startups are providing AI solutions. In the future we will definitely see more developer tools coming into the market, making AI in apps a norm. 6 most commonly used Java Machine learning libraries 5 ways artificial intelligence is upgrading software engineering Machine Learning as a Service (MLaaS): How Google Cloud Platform, Microsoft Azure, and AWS are democratizing Artificial Intelligence
Read more
  • 0
  • 0
  • 20079
Banner background image

article-image-what-difference-between-functional-and-object-oriented-programming
Antonio Cucciniello
17 Sep 2017
5 min read
Save for later

What is the difference between functional and object oriented programming?

Antonio Cucciniello
17 Sep 2017
5 min read
There are two very popular programming paradigms in software development that developers design and program to. They are known as object oriented programming and functional programming. You've probably heard of these terms before, but what exactly are they and what is the difference between functional and object oriented programming? Let's take a look. What is object oriented programming? Object oriented programming is a programming paradigm in which you program using objects to represent things you are programming about (sometimes real world things). These objects could be data structures. The objects hold data about them in attributes. The attributes in the objects are manipulated through methods or functions that are given to the object. For instance, we might have a Person object that represents all of the data a person would have: weight, height, skin color, hair color, hair length, and so on. Those would be the attributes. Then the person object would also have things that it can do such as: pick box up, put box down, eat, sleep, etc. These would be the functions that play with the data the object stores. Engineers who program using object oriented design say that it is a style of programming that allows you to model real world scenarios much simpler. This allows for a good transition from requirements to code that works like the customer or user wants it to. Some examples of object oriented languages include C++, Java, Python, C#, Objective-C, and Swift. Want to learn object oriented programming? We recommend you start with Learning Object Oriented Programming. What is functional programming? Functional programming is the form of programming that attempts to avoid changing state and mutable data. In a functional program, the output of a function should always be the same, given the same exact inputs to the function. This is because the outputs of a function in functional programming purely relies on arguments of the function, and there is no magic that is happening behind the scenes. This is called eliminating side effects in your code. For example, if you call function getSum() it calculates the sum of two inputs and returns the sum. Given the same inputs for x and y, we will always get the same output for sum. This allows the function of a program to be extremely predictable. Each small function does its part and only its part. It allows for very modular and clean code that all works together in harmony. This is also easier when it comes to unit testing. Some examples of Functional Programming Languages include Lisp, Clojure, and F#. Problems with object oriented programming There are a few problems with object oriented programing. Firstly, it is known to be not as reusable. Because some of your functions depend on the class that is using them, it is hard to use some functions with another class. It is also known to be typically less efficient and more complex to deal with. Plenty of times, some object oriented designs are made to model large architectures and can be extremely complicated. Problems with functional programming Functional programming is not without its flaws either. It really takes a different mindset to approach your code from a functional standpoint. It's easy to think in object oriented terms, because it is similar to how the object being modeled happens in the real world. Functional programming is all about data manipulation. Converting a real world scenario to just data can take some extra thinking. Due to its difficulty when learning to program this way, there are fewer people that program using this style, which could make it hard to collaborate with someone else or learn from others because there will naturally be less information on the topic. A comparison between functional and object oriented programming Both programming concepts have a goal of wanting to create easily understandable programs that are free of bugs and can be developed fast. Both concepts have different methods for storing the data and how to manipulate the data. In object oriented programming, you store the data in attributes of objects and have functions that work for that object and do the manipulation. In functional programming, we view everything as data transformation. Data is not stored in objects, it is transformed by creating new versions of that data and manipulating it using one of the many functions. I hope you have a clearer picture of what the difference between functional and object oriented programming. They can both be used separately or can be mixed to some degree to suite your needs. Ultimately you should take into the consideration the advantages and disadvantages of using both before making that decision. Antonio Cucciniello is a Software Engineer with a background in C, C++ and JavaScript (Node.js) from New Jersey. His most recent project called Edit Docs is an Amazon Echo skill that allows users to edit Google Drive files using your voice. He loves building cool things with software, reading books on self-help and improvement, finance, and entrepreneurship. Follow him on Twitter @antocucciniello, and follow him on GitHub here.
Read more
  • 0
  • 1
  • 19963

article-image-building-better-bundles-why-processenvnodeenv-matters-optimized-builds
Mark Erikson
14 Nov 2016
5 min read
Save for later

Building Better Bundles: Why process.env.NODE_ENV Matters for Optimized Builds

Mark Erikson
14 Nov 2016
5 min read
JavaScript developers are keenly aware of the need to reduce the size of deployed assets, especially in today's world of single-page apps. This usually means running increasingly complex JavaScript codebases through build steps that produce a minified bundle for deployment. However, if you read a typical tutorial on setting up a build tool like Browserify or Webpack, you'll see numerous references to a variable called process.env.NODE_ENV. Tutorials always talk about how this needs to be set to a value like "production" in order to produce a properly optimized bundle, but most articles never really spell out why this value matters and how it relates to build optimization. Here's an explanation of why process.env.NODE_ENV is used and how it fits into the typical build process. Operating system environment variables are widely used as a method of configuring applications, especially as a way to activate behavior based on different deployment environments (such as development vs testing vs production). Node.js exposes the current process's environment variables to the script as an object called process.env. From there, the Express web server framework popularized using an environment variable called NODE_ENV as a flag to indicate whether the server should be running in "development" mode vs "production" mode. At runtime, the script looks up that value by checking process.env.NODE_ENV. Because it was used within the Node ecosystem, browser-focused libraries also started using it to determine what environment they were running in, and using it to control optimizations and debug mode behavior. For example, React uses it as the equivalent of a C preprocessor #ifdef to act as conditional checking for debug logging and perf tracking, roughly like this: function someInternalReactFunction() { // do actual work part 1 if(process.env.NODE_ENV === "development") { // do debug-only work, like recording perf stats } // do actual work part 2 } If process.env.NODE_ENV is set to "production", all those if clauses will evaluate to false, and the potentially expensive debug code won't run. In addition, in conjunction with a tool like UglifyJS that does minification and removal of dead code blocks, a clause that is surrounded with if(process.env.NODE_ENV === "development") will become dead code in a production build and be stripped out, thus reducing bundled code size and execution time. However, because the NODE_ENV environment variable and the corresponding process.env.NODE_ENV runtime field are normally server-only concepts, by default those values do not exist in client-side code. This is where build tools such as Webpack's DefinePlugin or the Browserify Envify transform come in, which perform search-and-replace operations on the original source code. Since these build tools are doing transformation of your code anyway, they can force the existence of global values such as process.env.NODE_ENV. (It's also important to note that because DefinePlugin in particular does a direct text replacement, the value given to DefinePlugin must include actual quotes inside of the string itself. Typically, this is done either with alternate quotes, such as '"production"', or by using JSON.stringify("production")). Here's the key: the build tool could set that value to anything, based on any condition that you want, as you're defining your build configuration. For example, I could have a webpack.production.config.js Webpack config file that always uses the DefinePlugin to set that value to "production" throughout the client-side bundle. It wouldn't have to be checking the actual current value of the "real" process.env.NODE_ENV variable while generating the Webpack config, because as the developer I would know that any time I'm doing a "production" build, I would want to set that value in the client code to "production'. This is where the "code I'm running as part of my build process" and "code I'm outputting from my build process" worlds come together. Because your build script is itself most likely to be JavaScript code running under Node, it's going to have process.env.NODE_ENV available to it as it runs. Because so many tools and libraries already share the convention of using that field's value to determine their dev-vs-production status, the common convention is to use the current value of that field inside the build script as it's running to also determine the value of that field as applied to the client code being transformed. Ultimately, it all comes down to a few key points: NODE_ENV is a system environment variable that Node exposes into running scripts. It's used by convention to determine dev-vs-prod behavior, by both server tools, build scripts, and client-side libraries. It's commonly used inside of build scripts (such as Webpack config generation) as both an input value and an output value, but the tie between the two is still just convention. Build tools generally do a transform step on the client-side code, replace any references to process.env.NODE_ENV with the desired value, and the resulting code will contain dead code blocks as debug-only code is now inside of an if(false)-type condition, ensuring that code doesn't execute at runtime. Minifier tools such as UglifyJS will strip out the dead code blocks, leaving the production bundle smaller. So, the next time you see process.env.NODE_ENV mentioned in a build script, hopefully you'll have a much better idea why it's there. About the author Mark Erikson is a software engineer living in southwest Ohio, USA, where he patiently awaits the annual heartbreak from the Reds and the Bengals. Mark is author of the Redux FAQ, maintains the React/Redux Links list and Redux Addons Catalog, and occasionally tweets at @acemarke. He can be usually found in the Reactiflux chat channels, answering questions about React and Redux. He is also slightly disturbed by the number of third-person references he has written in this bio!
Read more
  • 0
  • 0
  • 19860

article-image-dl-wars-pytorch-vs-tensorflow
Savia Lobo
15 Sep 2017
6 min read
Save for later

Is Facebook-backed PyTorch better than Google's TensorFlow?

Savia Lobo
15 Sep 2017
6 min read
[dropcap]T[/dropcap]he rapid rise of tools and techniques in Artificial Intelligence and Machine learning of late has been astounding. Deep Learning, or “Machine learning on steroids” as some say, is one area where data scientists and machine learning experts are spoilt for choice in terms of the libraries and frameworks available. There are two libraries that are starting to emerge as frontrunners. TensorFlow is the best in class, but PyTorch is a new entrant in the field that could compete. So, PyTorch vs TensorFlow, which one is better? How do the two deep learning libraries compare to one another? TensorFlow and PyTorch: the basics Google’s TensorFlow is a widely used machine learning and deep learning framework. Open sourced in 2015 and backed by a huge community of machine learning experts, TensorFlow has quickly grown to be THE framework of choice by many organizations for their machine learning and deep learning needs. PyTorch, on the other hand, a recently developed Python package by Facebook for training neural networks is adapted from the Lua-based deep learning library Torch. PyTorch is one of the few available DL frameworks that uses tape-based autograd system to allow building dynamic neural networks in a fast and flexible manner. Pytorch vs TensorFlow Let's get into the details - let the Python vs TensorFlow match up begin... What programming languages support PyTorch and TensorFlow? Although primarily written in C++ and CUDA, Tensorflow contains a Python API sitting over the core engine, making it easier for Pythonistas to use. Additional APIs for C++, Haskell, Java, Go, and Rust are also included which means developers can code in their preferred language. Although PyTorch is a Python package, there’s provision for you to code using the basic C/ C++ languages using the APIs provided. If you are comfortable using Lua programming language, you can code neural network models in PyTorch using the Torch API. How easy are PyTorch and TensorFlow to use? TensorFlow can be a bit complex to use if used as a standalone framework, and can pose some difficulty in training Deep Learning models. To reduce this complexity, one can use the Keras wrapper which sits on top of TensorFlow’s complex engine and simplifies the development and training of deep learning models. TensorFlow also supports Distributed training, which PyTorch currently doesn’t. Due to the inclusion of Python API, TensorFlow is also production-ready i.e., it can be used to train and deploy enterprise-level deep learning models. PyTorch was rewritten in Python due to the complexities of Torch. This makes PyTorch more native to developers. It has an easy to use framework that provides maximum flexibility and speed. It also allows quick changes within the code during training without hampering its performance. If you already have some experience with deep learning and have used Torch before, you will like PyTorch even more, because of its speed, efficiency, and ease of use. PyTorch includes custom-made GPU allocator, which makes deep learning models highly memory efficient. Due to this, training large deep learning models becomes easier. Hence, large organizations such as Facebook, Twitter, Salesforce, and many more are embracing Pytorch. In this PyTorch vs TensorFlow round, PyTorch wins out in terms of ease of use. Training Deep Learning models with PyTorch and TensorFlow Both TensorFlow and PyTorch are used to build and train Neural Network models. TensorFlow works on SCG (Static Computational Graph) that includes defining the graph statically before the model starts execution. However, once the execution starts the only way to tweak changes within the model is using tf.session and tf.placeholder tensors. PyTorch is well suited to train RNNs( Recursive Neural Networks) as they run faster in PyTorch than in TensorFlow. It works on DCG (Dynamic Computational Graph) and one can define and make changes within the model on the go. In a DCG, each block can be debugged separately, which makes training of neural networks easier. TensorFlow has recently come up with TensorFlow Fold, a library designed to create TensorFlow models that works on structured data. Like PyTorch, it implements the DCGs and gives massive computational speeds of up to 10x on CPU and more than 100x on GPU! With the help of Dynamic Batching, you can now implement deep learning models which vary in size as well as structure. Comparing GPU and CPU optimizations TensorFlow has faster compile times than PyTorch and provides flexibility for building real-world applications. It can run on literally any kind of processor from a CPU, GPU, TPU, mobile devices, to a Raspberry Pi (IoT Devices). PyTorch, on the other hand, includes Tensor computations which can speed up deep neural network models upto 50x or more using GPUs. These tensors can dwell on CPU or GPU. Both CPU and GPU are written as independent libraries; making PyTorch efficient to use, irrespective of the Neural Network size. Community Support TensorFlow is one of the most popular Deep Learning frameworks today, and with this comes a huge community support. It has great documentation, and an eloquent set of online tutorials. TensorFlow also includes numerous pre-trained models which are hosted and available on github. These models aid developers and researchers who are keen to work with TensorFlow with some ready-made material to save their time and efforts. PyTorch, on the other hand, has a relatively smaller community since it has been developed fairly recently. As compared to TensorFlow, the documentation isn’t that great, and codes are not readily available. However, PyTorch does allow individuals to share their pre-trained models with others. PyTorch and TensorFlow - A David & Goliath story As it stands, Tensorflow is clearly favoured and used more than PyTorch for a variety of reasons. Tensorflow best suited for a wide range of practical purposes. It is the obvious choice for many machine learning and deep learning experts because of its vast array of features. Its maturity in the market is important too. It has a better community support along with multiple language APIs available. It has a good documentation and is production-ready due to the availability of ready-to-use code. Hence, it is better suited for someone who wants to get started with Deep Learning, or for organizations wanting to productize their Deep Learning models. PyTorch is relatively new and has a smaller community than TensorFlow, but it is fast and efficient. In short, it gives you all the power of Torch wrapped in the usefulness and ease of Python. Because of its efficiency and speed, it's a good option for small, research based projects. As mentioned earlier, companies such as Facebook, Twitter, and many others are using Pytorch to train deep learning models. However, its adoption is yet to go mainstream. The potential is evident, PyTorch is just not ready yet to challenge the beast that is TensorFlow. However considering its growth, the day is not far when PyTorch is further optimized and offers more functionalities - to the point that it becomes the David to TensorFlow’s Goliath.
Read more
  • 0
  • 0
  • 19789
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
article-image-7-best-practices-for-logging-in-node-js
Guest Contributor
05 Mar 2019
5 min read
Save for later

7 Best Practices for Logging in Node.js

Guest Contributor
05 Mar 2019
5 min read
Node.js is one of the easiest platforms for prototyping and agile development. It’s used by large companies looking to scale their products quickly. However, using a platform on its own isn’t enough for most big projects today. Logging is also a key part of ensuring your web or mobile app runs smoothly for all users. Application logging is the practice of recording information about your application’s runtime. These files are usually saved a logging platform which helps identify potential problems. While no app is perfect 100% of the time, logging helps developers cut down on errors and even cyber attacks. The nature of software is complex. We can’t always predict how an application will react to data, errors, or system changes. Logging helps us better understand our own programs. So how do you handle logging in Node.js specifically? Following are some of the best practices for logging in Node.js to get the best results. 1. Understand the Regulations Let’s discuss the current legal regulations about what you can and cannot log. You should never log sensitive information or personal data. That means excluding credentials like passwords, credit card number or even email addresses. Recent changes to regulation like Europe’s GDPR make this even more essential. You don’t want to get tied up in the legal red tape of sensitive data. When in doubt, stick to the 3 things that are needed for a solid log message: timestamp, log level, and description. Beyond this, you don’t need any extensive framework. 2. Take advantage of Winston Node.js is built with a logging framework known as Winston. Winston is defined as transport for your logs, and you can install it directly into your application. Follow this guide to install Winston on your own. Winston is a powerful tool that comes with different logging levels with values. You can fully customize this console as well with colors, messages, and output details. The most recent version available is 3.0.0, but always make sure you have the latest edition to keep your app running smoothly. 3. Add Morgan In addition to Winston, Morgan is an HTTP request logger that collects server logs and standardizes them. Think of it as a logger simplification. Morgan. While you’re free to use Morgan on its own, most developers choose to use it with Winston since they make a powerful team. Morgan also works well with express.js. 4. Consider the Intel Package While Winston and Morgan are a great combination, they’re not your only option. Intel is another package solution with similar features as well as unique options. While you’ll see a lot of overlap in what they offer, Intel also includes a stack trace object. These features will come in handy when it’s time to actually debug. Because it gives a stack trace as a JSON object, it’s much easier to pass messages up the logger chain. Think of Intel like the breadcrumbs taking your developers to the error. 5. Use Environment Variables You’ll hear a lot of discussion about configuration management in the Node.js world. Decoupling your code from services and database is no straightforward process. In Node.js, it’s best to use environment variables. You can also look up values from process.env within your code. To determine which environment your program is running on, look up the NODE_ENV variables. You can also use the nconf module found here. 6. Choose a Style Guide No developer wants to spend time reading through lines of code only to have to change the spaces to tabs, reformat the braces, etc. Style guides are a must, especially when logging on Node.js. If you’re working with a team of developers, it’s time to decide on a team style guide that everyone sticks to across the board. When the code is written in a consistent style, you don’t have to worry about opinionated developers fighting for a say. It doesn’t matter which style you stick with, just make sure you can actually stick to it. The Googe style guide for Java is a great place to start if you can’t make a single decision. 7. Deal with Errors Finally, accept that errors will happen and prepare for them. You don’t want an error to bring down your entire software or program. Exception management is key. Use an asyn structure to cleanly handle any errors. Whether the app simply restarts or moves on to the next stage, make sure something happens. Users need their errors to be handled. As you can see, there are a few best practices to keep in mind when logging in Node.js. Don’t rely on your developers alone to debug the platform. Set a structure in place to handle these problems as they arise. Your users expect quality experience every time. Make sure you can deliver with these tips above. Author Bio Ashley Lipman Content marketing specialist Ashley is an award-winning writer who discovered her passion for providing creative solutions for building brands online. Since her first high school award in Creative Writing, she continues to deliver awesome content through various niches. Introducing Zero Server, a zero-configuration server for React, Node.js, HTML, and Markdown 5 reasons you should learn Node.js Deploying Node.js apps on Google App Engine is now easy
Read more
  • 0
  • 0
  • 19670

article-image-earn-1m-per-year-hint-learn-machine-learning
Neil Aitken
01 Aug 2018
10 min read
Save for later

How to earn $1m per year? Hint: Learn machine learning

Neil Aitken
01 Aug 2018
10 min read
Internet job portal ‘Indeed.com’ links potential employers with people who are looking to take the next step in their careers. The proportion of job posts on their site, relating to ‘Data Science’, a specific job in the AI category, is growing fast (see chart below). More broadly, Artificial Intelligence & machine learning skills, of which ‘Data Scientist’ is just one example, are in demand. No wonder that it has been termed as the sexiest job role of the 21st century. Interest comes from an explosion of jobs in the field from big companies and Start-Ups, all of which are competing to come up with the best AI business and to earn the money that comes with software that automates tasks. The skills shortage associated with Artificial Intelligence represents an opportunity for any developer. There has never been a better time to consider whether reskilling or upskilling in AI could be a lucrative path for you. Below : Indeed.com. Proportion of job postings containing Data Scientist or Data Science. [caption id="attachment_21240" align="aligncenter" width="1525"] Artificial Intelligence skills are increasingly in demand and create a real opportunity for those prepared to reskill or upskill.[/caption] Source: Indeed  The AI skills gap the market is experiencing comes from the difficulty associated with finding an individual demonstrating a competent mixture of the very disparate faculties that AI roles require. Artificial Intelligence and it’s near equivalents such as Machine Learning and Neural Networks operate at the intersection of what have mostly been two very different disciplines – statistics and software development. In simple terms, they are half coding, half maths. Hamish Ogilvy, CEO of AI based Internal Search company Sajari is all too familiar with the problem. He’s on the front line, hiring AI developers. “The hardest part”, says Ogilvy, “is that AI is pretty complex and the average developer/engineer does not have the background in maths/stats/science to actually understand what is happening. On the flip side the trouble with the stats/maths/science people is that they typically can't code, so finding people in that sweet spot that have both is pretty tough.” He’s right. The New York Times suggests that the pool of qualified talent is only 10,000 people, worldwide. Those who do have jobs are typically happily ensconced, paid well, treated appropriately and given no reason whatsoever to want to leave. [caption id="attachment_21244" align="aligncenter" width="1920"] Judged by $ investments in the area alone, AI skills are worth developing for those wishing to stay current on technology skills.[/caption] In fact, an instinct to develop AI skills will serve any technology employee well. No One can have escaped the many estimates, from reputable consultancies, suggesting that Automation will replace up to 30% of jobs in the next 10 years. No job is safe. Every industry is touched by AI in some form. Any responsible individual with a view to the management of their own skills could learn ML and AI skills to stay relevant in current times. Even if you don't want to move out of your current job, learning ML will probably help you adapt better in your industry. What is a typical AI job and what will it pay? OpenAI, a world class Artificial Intelligence research laboratory, revealed the salaries of some of its key Data Science employees recently. Those working in the AI field with a specialization can earn $300 to $500k in their first year out of university. Experts in Artificial Intelligence now command salaries of up to $1m. [caption id="attachment_21242" align="aligncenter" width="432"] The New York Times observes AI salaries[/caption] [caption id="attachment_21241" align="aligncenter" width="1121"] The New York Times observes AI salaries[/caption] Source: The New York times Indraneil Roy, an Expert in AI and Talent Acquisition who works for Edge Networks puts it this way when outlining the difficulties of hiring the right skills and to explain why wages in the field are so high. “The challenge is the quality of resources. As demand is high for this skill, we are already seeing candidates with fake experience and work pedigree not up to standards.” The phenomenon is also causing a ‘brain drain’ in Universities. About a third of jobs in the AI field will go to someone with a Ph.D., and all of those are drawn from universities working on the discipline, often lured by the significant pay packages which are available. So, with huge demand and the universities drained, where will future AI employees come from? 3 ways to skill up to become an AI expert (And earn all that money?) There is still not a lot of agreed terminology or even job roles and responsibility in the sector. However, some things are clear. Those wishing to evolve in to the field of AI must understand the conceptual thinking involved, as a starting point, whether that view is found on the job or as part of an informal / formal educational course. Specifically, most jobs in the specialty require a working knowledge of neural networks, data / analytics, predictive analytics, with some basic programming and database skills. There are some great resources available online to train you up. Most, as you’d expect, are available on your smartphone so there really is no excuse for not having a look. 1. Free online course: Machine Learning & Statistics and probability Hamish Ogilvy summed the online education which is available in the area well. There are “so many free courses now on AI from Stanford,” he said, “that people are able to educate themselves and make up for the failings of antiquated university courses. AI is just maths really,” he says “complex models and stats. So that's what people need grounding in to be successful.” Microsoft offer free AI courses for technical professionals: Microsoft’s training materials are second to none. They’re also provided free and provide a shortcut to a credible understanding in an area simply because it comes from a technical behemoth. Importantly, they also have a list of AI services which you can play with, again for free. For example, a Natural Language engine offers a facility for you to submit text from Instant Messaging conversations and establish the sentiment being felt by the writer. Practical experience of the tools, processes and concepts involved will set you apart. See below. [caption id="attachment_21245" align="aligncenter" width="1999"] Check out Microsoft’s free AI training program for developers.[/caption] Google are taking a proactive stance on Machine Learning. They see it’s potential to improve efficiency in every industry and also offer free ML training courses on their site. 2. Take courses on AI/ML Packt’s machine learning courses, books and videos: Packt is working towards a mission to help the world put software to work in new ways, through the delivery of effective learning and information services to IT professionals. It has published over 6,000 books and videos so far, providing IT professionals with the actionable knowledge they need to get the job done - whether that's specific learning on an emerging technology or optimizing key skills in more established tools. You can choose from a variety of Packt’s books, videos and courses for AI/ML. Here’s a list of top ones: Artificial Intelligence by Example [Book] Artificial Intelligence for Big data [Book] Learn Artificial Intelligence with TensorFlow [Video] Introduction to Artificial Intelligence with Java [Video] Advanced Artificial Intelligence Projects with Python [Video] Python Machine learning - Second Edition [Book] Machine Learning with R - Second Edition [Book] Coursera’s machine learning courses Coursera is a company which make training courses, for a variety of subjects, available online. Taken from actual University course content and delivered with tests, videos and training notes, all accessed online, each course is roughly a University Module. Students pick up an ‘up to under-graduate’ level of understanding of the content involved. Coursera’s courses are often cited as merit worthy and are recognizable in the industry. Costs vary but are typically between $2k and $5k per course. 3. Learn by doing Familiarize yourself with relevant frameworks and tools including Tensor Flow, Python and Keras. TensorFlow from Google is the most used open source AI software library. You can use existing code in your experiments and experiment with neural networks in much the same way as you can in Microsoft’s. Python is a programming language written for a big data world. Its proponents will tell you that Python saves developers hundreds of lines of code, allowing you to tie together information and systems faster than ever before. Python is used extensively in ML and AI applications and should be at the top of your study list. Keras, a deep learning library is similarly ubiquitous. It’s a high level Neural Network API designed to allow prototyping of your software as fast as possible. Finally, a lesser known but still valuable resources is the Accord.net. It is one final example of the many  software elements with which you can engage with to train yourself up. Accord Framework.net will expose you to image libraries, natural learning and real time facial recognition. Earn extra points with employers AI has several lighthouse tasks which are proving the potential of the technology in these still early stages. We’ve included a couple of examples, Natural Language processing and image recognition, above. Practical expertise in these areas specifically, image or voice recognition or pattern matching are valued highly by employers. Alternatively, have you patented something? A registered patent in your name is highly prized. Especially something to do with Machine Learning. Both will help you showcase Extra skills / achievements that will help your application.’ The specifics of how to apply for patents differ by country but you can find out more about the overall principles of how to submit an idea here. Passion and engagement in the subject are also, clearly appealing characteristics for potential employers to see in applicants. Participating in competitions like Kaggle, and having a portfolio of projects you can showcase on facilities like GitHub are also well prized. Of all of these suggestions, for those employed, any on the job experience you can get will stand you in the best stead. Indraneil says "Individual candidates need to spend more time doing relevant projects while in employment. Start ups involved in building products and platforms on AI seem to have better talent." The fact that there are not many AI specialists around is a bad sign There is a demand for employees with AI skills and an investment in relevant training may pay you well. Unfortunately, the underlying problem this situation reveals could be far worse than the problems experienced so far. Together, once found, all these AI scientists are going to automate millions of jobs, in every industry, in every country around the world. If Industry, Governments and Universities cannot train enough people to fill the roles being created by an evolving skills market, we may rightly be concerned to worry about how they will deal with retraining all those displaced by AI, for whom there may be no obvious replacement role available. 18 striking AI Trends to watch in 2018 – Part 1 DeepMind, Elon Musk, and others pledge not to build lethal AI Attention designers, Artificial Intelligence can now create realistic virtual textures
Read more
  • 0
  • 0
  • 19632

article-image-best-game-engines-for-ai-game-development
Natasha Mathur
24 Aug 2018
8 min read
Save for later

Best game engines for Artificial Intelligence game development

Natasha Mathur
24 Aug 2018
8 min read
"A computer would deserve to be called intelligent if it could deceive a human into believing that it was human" — Alan Turing It is quite common to find games which are initially exciting but take a boring turn eventually, making you want to quit the game. Then, there are games which are too difficult to hold your interest and you end up quitting in the beginning phase itself.  These are also two of the most common problems that game developers face when building games. This is where AI comes to your rescue, to spice things up. Why use Artificial Intelligence in games? The major reason for using AI in games is to provide a challenging opponent to make the game more fun to play. But, AI in the gaming industry is not a recent news. The gaming world has been leveraging the wonders of AI for a long time now. One of the first examples of AI is the computerized game, Nim, was created back in 1951. Other games such as Façade, Black & White, The Sims, Versu, and F.E.A.R. are all great AI games, that hit the market long time back. Even modern-day games like Need for Speed, Civilization, or Counter-Strike use AI. AI controls a lot of elements in games and is usually behind characters such as enemy creeps, neutral merchants, or even animals. AI in games is used to enable the non-human characters (NPCs) with responsive, adaptive, and intelligent behaviors similar to human-like intelligence. AI helps make NPCs seem intelligent as they are able to actively change their level of skills based on the person playing the game. This makes the game seem more personalized to the gamer. Playing video games is fun, and developing these games is equally fun. There are different game engines on the market to help with the development of games. A game engine is a software that provides game creators with the necessary set of features to build games quickly and efficiently. Let’s have a look at the top game engines for Artificial Intelligence game development. Unity3D Developer:  Unity Technologies Release Date: June 8, 2005 Unity is a cross-platform game engine which provides users with the ability to create games in both 2D and 3D. It is extremely popular and loved by game designers from large and small studios alike. Apart from 3D, and 2D games, it also helps with simulations for desktops, laptops, home consoles, smart TVs, and mobile devices. Key AI features: Unity offers a machine learning agents toolkit to the game developers, which help them include AI agents within games. As per the Unity team, “machine Learning Agents Toolkit (ML-Agents) is an open-source Unity plugin that enables games and simulations to serve as environments for training intelligent agents”. Unity AI - Unity 3D Artificial Intelligence  The ML-Agents SDK transforms games and simulations created using the Unity Editor into environments for training intelligent agents. These ML agents are trained using deep Reinforcement Learning, imitation learning, neuroevolution, or other machine learning methods via Python APIs. There’s also a TensorFlow based algorithm provided by Unity to allow game developers to easily train intelligent agents for 2D, 3D, and VR/AR games. These trained agents are then used for controlling the NPC behavior within games. The ML-Agents toolkit is beneficial for both game developers and AI researchers. Apart from this, Unity3D is easy to use and learn, compatible with every game platform and provides great community support. Learning Resources: Unity AI Programming Essentials Unity 2017 Game AI programming - Third Edition Unity 5.x Game AI Programming Cookbook Unreal Engine 4 Developer: Epic games Release Date: May 1998 Unreal Engine is widely used among developers all around the world. It is a collection of integrated tools for game developers which helps them build games, simulations, and visualization. It is also among the top game engines which are used to develop high-end AAA titles. Gears of War, Batman: Arkham Asylum and Mass Effect are some of the popular games developed using Unreal Engine. Key AI features: Unreal Engine uses a set of tools which helps add AI capabilities to a game. It uses tools such as behavior Tree, navigation Component, blackboard asset, enumeration, target point, AI Controller, and Navigation Volumes. Behavior tree creates different states and the logic behind AI. Navigation Component helps with handling movement for AI. Blackboard Asset stores information and acts as the local variable for AI. Enumeration creates states. It also allows alternating between these states. Target Point creates a basic Path node form. The AI Controller and Character tool is responsible for handling the communication between the world and the controlled pawn for AI. At last, the Navigation Volumes feature creates Navigation Mesh in the environment to allow easy Pathfinding for AI. There are also features such as Blueprint Visual Scripting which can be converted into performant C++ code, AIComponents, and the Environment Query System (EQS) which provides agents the ability to perceive their environment. Apart from its AI capabilities, the Unreal engine offers the largest community support with lifetime hours of video tutorials and assets. It is also compatible with a variety of operating platforms such as iOS, Android, Linux, Mac, Windows, and most game consoles. But there are certain inbuilt-tools in Unreal Engine which can be hard for beginners to learn. Learning resources: Unreal Engine 4 AI programming essentials CryEngine 3 Developer: Crytek Release Date: May 2, 2002 CryEngine is a powerful game development platform that comes packed with a set of tools and features to create world-class gaming experiences. It is the game engine behind games such as Sniper: Ghost Warrior 2, SNOW, etc. Key AI features: CryEngine comes with an AI system designed for easy creation of custom AI actors. This is flexible enough to handle a larger set of complex and different worlds. The core of CryEngine’s AI system is based on lots of scripting. There are different AI elements within this system that add the AI capabilities to the NPCs within the game. Some of these elements are AI Actions which allows the developers to script AI behaviors without creating new code. AI Actors Logger can log AI events and signals to files. AI Control Objects use AI object to control AI entities/actors. AI Debug Draw is the primary tool offered by CryEngine for information on the current state of the AI System and AI actors. AI Debugger registers the inputs that AI agents receive and the decisions that they make in real-time during a game session. AI Sequence system works in parallel to FG and AI systems. This helps to simplify and group AI control. CryEngine offers the easiest A.l. coding of any tech currently on the market. Since CryEngine is relatively new as compared to other game engines, it does not have a very flourishing community yet. Despite the easy AI coding, the overall learning curve of Unreal Engine is high. Panda3D Developer: Disney Interactive until 2010,  Walt Disney Imagineering, Carnegie Mellon University Release Date: 2002 Panda3D is a game engine, a framework for 3D rendering and game development for Python and C++ programs. It includes graphics, audio, I/O, collision detection, and other abilities for the creation of 3D games. Key AI features: Panda3D comes packed with an AI library named PandAI v1.0. PandAI is an AI library which provides 'Artificially Intelligent' behavior in NPC (Non-Playable Characters) in games. The PandAI library offers functionality for Steering Behaviors (Seek, Flee, Pursue, Evade, Wander, Flock, Obstacle Avoidance, Path Following) and path finding (helps the NPCs to intelligently avoiding obstacles via the shortest path ). This AI library is composed of several different entities. For instance, there’s a main AIWorld Class to update any AICharacters added to it. Each AICharacter has its own AIBehavior object for tracking all the position and rotation updates. Each AIBehavior object has the functionality to implement all the steering behaviors and pathfinding behaviors. These features within Panda3D gives you the ability to call the respective functions. Panda3D is a relatively simple game engine which lets you add AI capabilities within your games. The community is not as robust and has a low learning curve. AI is a fantastic tool which makes the entities in games seem more organic, alive, and real. The main goal here is not to copy the entire human thought process but to just sell the illusion of life. These game engines provide the developers with the entire framework needed to add AI capabilities to their games. The entire game development process is more fun as there is no need to create all systems including the physics, graphics, and AI, from scratch. Now, if you’re wondering about the best AI game engines out of the four mentioned in this article then there is no specific answer to that as selecting the best AI game engine depends on the requirements of your project. Game Engine Wars: Unity vs Unreal Engine Unity switches to WebAssembly as the output format for the Unity WebGL build target Developing Games Using AI  
Read more
  • 0
  • 1
  • 19586

article-image-5-ways-artificial-intelligence-is-upgrading-software-engineering
Melisha Dsouza
02 Sep 2018
8 min read
Save for later

5 ways artificial intelligence is upgrading software engineering

Melisha Dsouza
02 Sep 2018
8 min read
47% of digitally mature organizations, or those that have advanced digital practices, said they have a defined AI strategy (Source: Adobe). It is estimated that  AI-enabled tools alone will generate $2.9 trillion in business value by 2021.  80% of enterprises are smartly investing in AI. The stats speak for themselves. AI clearly follows the motto “go big or go home”. This explosive growth of AI in different sectors of technology is also beginning to show its colors in software development. Shawn Drost, co-founder and lead instructor of coding boot camp ‘Hack Reactor’ says that AI still has a long way to go and is only impacting the workflow of a small portion of software engineers on a minority of projects right now. AI promises to change how organizations will conduct business and to make applications smarter. It is only logical then that software development, i.e., the way we build apps, will be impacted by AI as well. Forrester Research recently surveyed 25 application development and delivery (AD&D) teams, and respondents said AI will improve planning, development and especially testing. We can expect better software created under traditional environments. 5 areas of Software Engineering AI will transform The 5 major spheres of software development-  Software design, Software testing, GUI testing, strategic decision making, and automated code generation- are all areas where AI can help. A majority of interest in applying AI to software development is already seen in automated testing and bug detection tools. Next in line are the software design precepts, decision-making strategies, and finally automating software deployment pipelines. Let's take an in-depth look into the areas of high and medium interest of software engineering impacted by AI according to the Forrester Research report.     Source: Forbes.com #1 Software design In software engineering, planning a project and designing it from scratch need designers to apply their specialized learning and experience to come up with alternative solutions before settling on a definite solution. A designer begins with a vision of the solution, and after that retracts and forwards investigating plan changes until they reach the desired solution. Settling on the correct plan choices for each stage is a tedious and mistake-prone action for designers. Along this line, a few AI developments have demonstrated the advantages of enhancing traditional methods with intelligent specialists. The catch here is that the operator behaves like an individual partner to the client. This associate should have the capacity to offer opportune direction on the most proficient method to do design projects. For instance, take the example of AIDA- The Artificial Intelligence Design Assistant, deployed by Bookmark (a website building platform). Using AI, AIDA understands a users needs and desires and uses this knowledge to create an appropriate website for the user. It makes selections from millions of combinations to create a website style, focus, image and more that are customized for the user. In about 2 minutes, AIDA designs the first version of the website, and from that point it becomes a drag and drop operation. You can get a detailed overview of this tool on designshack. #2 Software testing Applications interact with each other through countless  APIs. They leverage legacy systems and grow in complexity everyday. Increase in complexity also leads to its fair share of challenges that can be overcome by machine-based intelligence. AI tools can be used to create test information, explore information authenticity, advancement and examination of the scope and also for test management. Artificial intelligence, trained right, can ensure the testing performed is error free. Testers freed from repetitive manual tests thus have more time to create new automated software tests with sophisticated features. Also, if software tests are repeated every time source code is modified, repeating those tests can be not only time-consuming but extremely costly. AI comes to the rescue once again by automating the testing for you! With AI automated testing, one can increase the overall scope of tests leading to an overall improvement of software quality. Take, for instance, the Functionize tool. It enables users to test fast and release faster with AI enabled cloud testing. The users just have to type a test plan in English and it will be automatically get converted into a functional test case. The tool allows one to elastically scale functional, load, and performance tests across every browser and device in the cloud. It also includes Self-healing tests that update autonomously in real-time. SapFix is another AI Hybrid tool deployed by Facebook which can automatically generate fixes for specific bugs identified by 'Sapienz'. It then proposes these fixes to engineers for approval and deployment to production.   #3 GUI testing Graphical User Interfaces (GUI) have become important in interacting with today's software. They are increasingly being used in critical systems and testing them is necessary to avert failures. With very few tools and techniques available to aid in the testing process, testing GUIs is difficult. Currently used GUI testing methods are ad hoc. They require the test designer to perform humongous tasks like manually developing test cases, identifying the conditions to check during test execution, determining when to check these conditions, and finally evaluate whether the GUI software is adequately tested. Phew! Now that is a lot of work. Also, not forgetting that if the GUI is modified after being tested, the test designer must change the test suite and perform re-testing. As a result, GUI testing today is resource intensive and it is difficult to determine if the testing is adequate. Applitools is a GUI tester tool empowered by AI. The Applitools Eyes SDK automatically tests whether visual code is functioning properly or not. Applitools enables users to test their visual code just as thoroughly as their functional UI code to ensure that the visual look of the application is as you expect it to be. Users can test how their application looks in multiple screen layouts to ensure that they all fit the design. It allows users to keep track of both the web page behaviour, as well as the look of the webpage. Users can test everything they develop from the functional behavior of their application to its visual look. #4 Using Artificial Intelligence in Strategic Decision-Making Normally, developers have to go through a long process to decide what features to include in a product. However, machine learning AI solution trained on business factors and past development projects can analyze the performance of existing applications and help both teams of engineers and business stakeholders like project managers to find solutions to maximize impact and cut risk. Normally, the transformation of business requirements into technology specifications requires a significant timeline for planning. Machine learning can help software development companies to speed up the process, deliver the product in lesser time, and increase revenue within a short span. AI canvas is a well known tool for Strategic Decision making.The canvas helps identify the key questions and feasibility challenges associated with building and deploying machine learning models in the enterprise. The AI Canvas is a simple tool that helps enterprises organize what they need to know into seven categories, namely- Prediction, Judgement, Action, Outcome, Input, Training and feedback. Clarifying these seven factors for each critical decision throughout the organization will help in identifying opportunities for AIs to either reduce costs or enhance performance.   #5 Automatic Code generation/Intelligent Programming Assistants Coding a huge project from scratch is often labour intensive and time consuming. An Intelligent AI programming assistant will reduce the workload by a great extent. To combat the issues of time and money constraints, researchers have tried to build systems that can write code before, but the problem is that these methods aren’t that good with ambiguity. Hence, a lot of details are needed about what the target program aims at doing, and writing down these details can be as much work as just writing the code. With AI, the story can be flipped. ”‘Bayou’- an A.I. based application is an Intelligent programming assistant. It began as an initiative aimed at extracting knowledge from online source code repositories like GitHub. Users can try it out at askbayou.com. Bayou follows a method called neural sketch learning. It trains an artificial neural network to recognize high-level patterns in hundreds of thousands of Java programs. It does this by creating a “sketch” for each program it reads and then associates this sketch with the “intent” that lies behind the program. This DARPA initiative aims at making programming easier and less error prone. Sounds intriguing? Now that you know how this tool works, why not try it for yourself on i-programmer.info. Summing it all up Software engineering has seen massive transformation over the past few years. AI and software intelligence tools aim to make software development easier and more reliable. According to a Forrester Research report on AI's impact on software development, automated testing and bug detection tools use AI the most to improve software development. It will be interesting to see the future developments in software engineering empowered with AI. I’m expecting faster, more efficient, more effective, and less costly software development cycles while engineers and other development personnel focus on bettering their skills to make advanced use of AI in their processes. Implementing Software Engineering Best Practices and Techniques with Apache Maven Intelligent Edge Analytics: 7 ways machine learning is driving edge computing adoption in 2018 15 millions jobs in Britain at stake with AI robots set to replace humans at workforce
Read more
  • 0
  • 0
  • 19265
article-image-how-to-create-a-web-designer-resume-that-lands-you-a-job
Guest Contributor
19 Jul 2018
7 min read
Save for later

How to create a web designer resume that lands you a Job

Guest Contributor
19 Jul 2018
7 min read
Clearly, there’s lot of competition for web designer jobs, - with the salary rising each year - so it’s crucial you find a way to make your resume stand out. You need to balance creativity with professionalism, all while making sure your experience, personality, and skills shine through. Over the years, people have created numerous resumes for web designers and everybody knows what it takes to get that job. Follow this guide to write a creative and attention-grabbing web designer resume. Note: All images in this article are courtesy of zety.com resume templates page and the guide. Format is a window to your mind Because you’re applying for a design position, the look and format of your resume is very important. It gives prospective employers a sense of your design philosophies. Use white space, and clear, legible fonts to help a hiring manager easily find your information. To stand apart from the crowd, avoid using a word processor to create your resume. Instead use InDesign or Illustrator to design something creative and less cookie-cutter like. Submit your resume in PDF format to avoid formatting errors that will ruin the look of your document. Sometimes a job posting will specifically forbid submitting in PDF though, so watch out for that. Highlight your Experience The key to writing a good experience section for a web designer is keeping it brief and relevant, while highlighting your career achievements. Add no more than three to five bullet points with measurable achievements per past position. Don’t just list your past company and when you worked there. “Discuss what you did and include some tangible accomplishments. If you created a custom graphic set for clients, mention that, and also what percentage were satisfied (hopefully it’s very high.) Prove you have the necessary experience to do the job well,” advises Terrence Wood, resume proofreader at Paper Fellows. Education Your education is not nearly as important as your experience, but you still need to present it right. That means using this section to talk about your strong points. Include coursework and achievements that are relevant to the job description. Maybe you wrote a column about web design in your college’s newspaper, things like this help you stand out to a hiring manager, especially if you are just starting your career in web design. Also, include the GPA here. Showcase your skills Everyone is going to list their skills, if you want that interview you need to do something to make yourself seem exceptional. The first thing you’ll do is take a good look at the job description and note what skills and responsibilities they mention. Now you know what skills to include, but including them is not enough. You need to prove you have them by giving examples of times you used them at past jobs. Don’t just list that you are proficient in Adobe Creative Suite, prove it by describing how you used it to tackle web design for 90% of client projects. Up the ante even more and include samples of your past work. This is a good spot to include your portfolio, any certifications that you have or anything that can help you let them know just how good you are at your job and confirm your skills. If you've had a predominantly freelance career, list the companies or individuals that you've worked for and include examples of work for each of them. You can do the same thing if you had a '9 to 5' career – simply list your previous jobs and show examples of your best work. It's a good idea to let your potential employers know about any future skills you plan to acquire. Use infographics You can give your resume a really unique look by using infographics, while still keeping it professional. Divide your resume layout into a grid with two columns and four or five rows. Now place one category of data into each square of your grid. Next, transform each section into an infographic. Use icons to represent different skills or awards. You can use your design software’s shape tools to create charts and graphs. Programs like Adobe InDesign can be used to create your infographics. You can also use Canva or Visme. Keep it professional It’s a great idea to inject some creativity into you resume. You want to stand out, and after all, it is a design resume. But it’s also important to balance that creativity with professionalism. A hiring manager will make some judgements about your personality based on how your resume looks. Be subtle in your creativity. Use colors that are easy on the eyes, and keep fonts reasonable. There can be a lot of beauty in simplicity. Stick to the basics, place content in an order familiar to recruiters to avoid making them have to work for the information they need. Remember that your primary goal is to communicate your information clearly. Write a cover letter Some people say that it's not really necessary but it's your chance to stand out. Maybe there is something that isn't on your resume or you want to seem more appealing and human – cover letter is a good chance to do all of that. Cover letter is a great place to elaborate how you'll be able to meet their needs. It's a good opportunity to also show them that you have done your research and know their company. Writing resources for your resume ViaWriting and Writing Populist: These are grammar resources you can use to check over your resume for grammatical mistakes. Resume Service: This is a resume service you can use to improve the quality of your web designer resume. Boomessays and EssayRoo: These are online proofreading tools, suggested by Revieweal, you can use to make sure you resume is polished and free of errors. My Writing Way and Academadvisor: Check out these career writing blogs for tips and ideas on how to improve your resume. You’ll find posts here by people who have written web designer resumes before. OXEssays and UKWritings: These are editing tools, recommended by UKWritings review, you can use to go over your resume for typos and other errors. StateofWriting and SimpleGrad: Check out these writing guides for suggestions on how to improve your resume. Even experienced writers can benefit from some extra guidance now and then. ResumeLab: Learn what to include in a cover letter. The job market for web designers is competitive. Make sure you lead with your best qualities and skills, and be sure they fit the job description as closely as possible. Be creative, but ensure you keep your resume professional as well. Now go have fun using this guide to write a creative and attention-grabbing web design resume. Author bio Grace Carter is a resume proofreader at Assignment Writing Service and at Australian Help, where she helps with CV editing and cover letter proofreading. Also, Grace teaches business writing at Academized educational website.   Is your web design responsive? “Be objective, fight for the user, and test with real users on the go!” – Interview with design purist, Will Grant Tips and tricks to optimize your responsive web design  
Read more
  • 0
  • 2
  • 19175

article-image-libraries-for-geospatial-analysis
Aarthi Kumaraswamy
22 May 2018
12 min read
Save for later

Top 7 libraries for geospatial analysis

Aarthi Kumaraswamy
22 May 2018
12 min read
The term geospatial refers to finding information that is located on the earth's surface. This can include, for example, the position of a cellphone tower, the shape of a road, or the outline of a country. Geospatial data often associates some piece of information with a particular location. Geospatial development is the process of writing computer programs that can access, manipulate, and display this type of information. Internally, geospatial data is represented as a series of coordinates, often in the form of latitude and longitude values. Additional attributes, such as temperature, soil type, height, or the name of a landmark, are also often present. There can be many thousands (or even millions) of data points for a single set of geospatial data. In addition to the prosaic tasks of importing geospatial data from various external file formats and translating data from one projection to another, geospatial data can also be manipulated to solve various interesting problems. Obvious examples include the task of calculating the distance between two points, calculating the length of a road, or finding all data points within a given radius of a selected point. We use libraries to solve all of these problems and more. Today we will look at the major libraries used to process and analyze geospatial data. GDAL/OGR GEOS Shapely Fiona Python Shapefile Library (pyshp) pyproj Rasterio GeoPandas This is an excerpt from the book, Mastering Geospatial Analysis with Python by Paul Crickard, Eric van Rees, and Silas Toms. Geospatial Data Abstraction Library (GDAL) and the OGR Simple Features Library The Geospatial Data Abstraction Library (GDAL)/OGR Simple Features Library combines two separate libraries that are generally downloaded together as a GDAL. This means that installing the GDAL package also gives access to OGR functionality. The reason GDAL is covered first is that other packages were written after GDAL, so chronologically, it comes first. As you will notice, some of the packages covered in this post extend GDAL's functionality or use it under the hood. GDAL was created in the 1990s by Frank Warmerdam and saw its first release in June 2000. Later, the development of GDAL was transferred to the Open Source Geospatial Foundation (OSGeo). Technically, GDAL is a little different than your average Python package as the GDAL package itself was written in C and C++, meaning that in order to be able to use it in Python, you need to compile GDAL and its associated Python bindings. However, using conda and Anaconda makes it relatively easy to get started quickly. Because it was written in C and C++, the online GDAL documentation is written in the C++ version of the libraries. For Python developers, this can be challenging, but many functions are documented and can be consulted with the built-in pydoc utility, or by using the help function within Python. Because of its history, working with GDAL in Python also feels a lot like working in C++ rather than pure Python. For example, a naming convention in OGR is different than Python's since you use uppercase for functions instead of lowercase. These differences explain the choice for some of the other Python libraries such as Rasterio and Shapely, which are also covered in this chapter, that has been written from a Python developer's perspective but offer the same GDAL functionality. GDAL is a massive and widely used data library for raster data. It supports the reading and writing of many raster file formats, with the latest version counting up to 200 different file formats that are supported. Because of this, it is indispensable for geospatial data management and analysis. Used together with other Python libraries, GDAL enables some powerful remote sensing functionalities. It's also an industry standard and is present in commercial and open source GIS software. The OGR library is used to read and write vector-format geospatial data, supporting reading and writing data in many different formats. OGR uses a consistent model to be able to manage many different vector data formats. You can use OGR to do vector reprojection, vector data format conversion, vector attribute data filtering, and more. GDAL/OGR libraries are not only useful for Python programmers but are also used by many GIS vendors and open source projects. The latest GDAL version at the time of writing is 2.2.4, which was released in March 2018. GEOS The Geometry Engine Open Source (GEOS) is the C/C++ port of a subset of the Java Topology Suite (JTS) and selected functions. GEOS aims to contain the complete functionality of JTS in C++. It can be compiled on many platforms, including Python. As you will see later on, the Shapely library uses functions from the GEOS library. In fact, there are many applications using GEOS, including PostGIS and QGIS. GeoDjango, also uses GEOS, as well as GDAL, among other geospatial libraries. GEOS can also be compiled with GDAL, giving OGR all of its capabilities. The JTS is an open source geospatial computational geometry library written in Java. It provides various functionalities, including a geometry model, geometric functions, spatial structures and algorithms, and i/o capabilities. Using GEOS, you have access to the following capabilities—geospatial functions (such as within and contains), geospatial operations (union, intersection, and many more), spatial indexing, Open Geospatial Consortium (OGC) well-known text (WKT) and well-known binary (WKB) input/output, the C and C++ APIs, and thread safety. Shapely Shapely is a Python package for manipulation and analysis of planar features, using functions from the GEOS library (the engine of PostGIS) and a port of the JTS. Shapely is not concerned with data formats or coordinate systems but can be readily integrated with such packages. Shapely only deals with analyzing geometries and offers no capabilities for reading and writing geospatial files. It was developed by Sean Gillies, who was also the person behind Fiona and Rasterio. Shapely supports eight fundamental geometry types that are implemented as a class in the shapely.geometry module—points, multipoints, linestrings, multilinestrings, linearrings, multipolygons, polygons, and geometrycollections. Apart from representing these geometries, Shapely can be used to manipulate and analyze geometries through a number of methods and attributes. Shapely has mainly the same classes and functions as OGR while dealing with geometries. The difference between Shapely and OGR is that Shapely has a more Pythonic and very intuitive interface, is better optimized, and has a well-developed documentation. With Shapely, you're writing pure Python, whereas with GEOS, you're writing C++ in Python. For data munging, a term used for data management and analysis, you're better off writing in pure Python rather than C++, which explains why these libraries were created. For more information on Shapely, consult the documentation. This page also has detailed information on installing Shapely for different platforms and how to build Shapely from the source for compatibility with other modules that depend on GEOS. This refers to the fact that installing Shapely will require you to upgrade NumPy and GEOS if these are already installed. Fiona Fiona is the API of OGR. It can be used for reading and writing data formats. The main reason for using it instead of OGR is that it's closer to Python than OGR as well as more dependable and less error-prone. It makes use of two markup languages, WKT and WKB, for representing spatial information with regards to vector data. As such, it can be combined well with other Python libraries such as Shapely, you would use Fiona for input and output, and Shapely for creating and manipulating geospatial data. While Fiona is Python compatible and our recommendation, users should also be aware of some of the disadvantages. It is more dependable than OGR because it uses Python objects for copying vector data instead of C pointers, which also means that they use more memory, which affects the performance. Python shapefile library (pyshp) The Python shapefile library (pyshp) is a pure Python library and is used to read and write shapefiles. The pyshp library's sole purpose is to work with shapefiles—it only uses the Python standard library. You cannot use it for geometric operations. If you're only working with shapefiles, this one-file-only library is simpler than using GDAL. pyproj The pyproj is a Python package that performs cartographic transformations and geodetic computations. It is a Cython wrapper to provide Python interfaces to PROJ.4 functions, meaning you can access an existing library of C code in Python. PROJ.4 is a projection library that transforms data among many coordinate systems and is also available through GDAL and OGR. The reason that PROJ.4 is still popular and widely used is two-fold: Firstly, because it supports so many different coordinate systems Secondly, because of the routes it provides to do this—Rasterio and GeoPandas, two Python libraries covered next, both use pyproj and thus PROJ.4 functionality under the hood The difference between using PROJ.4 separately instead of using it with a package such as GDAL is that it enables you to re-project individual points, and packages using PROJ.4 do not offer this functionality. The pyproj package offers two classes—the Proj class and the Geod class. The Proj class performs cartographic computations, while the Geod class performs geodetic computations. Rasterio Rasterio is a GDAL and NumPy-based Python library for raster data, written with the Python developer in mind instead of C, using Python language types, protocols, and idioms. Rasterio aims to make GIS data more accessible to Python programmers and helps GIS analysts learn important Python standards. Rasterio relies on concepts of Python rather than GIS. Rasterio is an open source project from the satellite team of Mapbox, a provider of custom online maps for websites and applications. The name of this library should be pronounced as raster-i-o rather than ras-te-rio. Rasterio came into being as a result of a project called the Mapbox Cloudless Atlas, which aimed to create a pretty-looking basemap from satellite imagery. One of the software requirements was to use open source software and a high-level language with handy multi-dimensional array syntax. Although GDAL offers proven algorithms and drivers, developing with GDAL's Python bindings feels a lot like C++. Therefore, Rasterio was designed to be a Python package at the top, with extension modules (using Cython) in the middle, and a GDAL shared library on the bottom. Other requirements for the raster library were being able to read and write NumPy ndarrays to and from data files, use Python types, protocols, and idioms instead of C or C++ to free programmers from having to code in two languages. For georeferencing, Rasterio follows the lead of pyproj. There are a couple of capabilities added on top of reading and writing, one of them being a features module. Reprojection of geospatial data can be done with the rasterio.warp module. Rasterio's project homepage can be found on Github. GeoPandas GeoPandas is a Python library for working with vector data. It is based on the pandas library that is part of the SciPy stack. SciPy is a popular library for data inspection and analysis, but unfortunately, it cannot read spatial data. GeoPandas was created to fill this gap, taking pandas data objects as a starting point. The library also adds functionality from geographical Python packages. GeoPandas offers two data objects—a GeoSeries object that is based on a pandas Series object and a GeoDataFrame, based on a pandas DataFrame object, but adding a geometry column for each row. Both GeoSeries and GeoDataFrame objects can be used for spatial data processing, similar to spatial databases. Read and write functionality is provided for almost every vector data format. Also, because both Series and DataFrame objects are subclasses from pandas data objects, you can use the same properties to select or subset data, for example .loc or .iloc. GeoPandas is a library that employs the capabilities of newer tools, such as Jupyter Notebooks, pretty well, whereas GDAL enables you to interact with data records inside of vector and raster datasets through Python code. GeoPandas takes a more visual approach by loading all records into a GeoDataFrame so that you can see them all together on your screen. The same goes for plotting data. These functionalities were lacking in Python 2 as developers were dependent on IDEs without extensive data visualization capabilities which are now available with Jupyter Notebooks. We've provided an overview of the most important open source packages for processing and analyzing geospatial data. The question then becomes when to use a certain package and why. GDAL, OGR, and GEOS are indispensable for geospatial processing and analyzing, but were not written in Python, and so they require Python binaries for Python developers. Fiona, Shapely, and pyproj were written to solve these problems, as well as the newer Rasterio library. For a more Pythonic approach, these newer packages are preferable to the older C++ packages with Python binaries (although they're used under the hood). Now that you have an idea of what options are available for a certain use case and why one package is preferable over another, here’s something you should always remember. As is often the way in programming, there might be multiple solutions for one particular problem. For example, when dealing with shapefiles, you could use pyshp, GDAL, Shapely, or GeoPandas, depending on your preference and the problem at hand. Introduction to Data Analysis and Libraries 15 Useful Python Libraries to make your Data Science tasks Easier “Pandas is an effective tool to explore and analyze data”: An interview with Theodore Petrou Using R to implement Kriging – A Spatial Interpolation technique for Geostatistics data  
Read more
  • 0
  • 0
  • 19099

article-image-building-a-scalable-postgresql-solution
Natasha Mathur
14 Apr 2019
12 min read
Save for later

Building a scalable PostgreSQL solution 

Natasha Mathur
14 Apr 2019
12 min read
The term  Scalability means the ability of a software system to grow as the business using it grows. PostgreSQL provides some features that help you to build a scalable solution but, strictly speaking, PostgreSQL itself is not scalable. It can effectively utilize the following resources of a single machine: It uses multiple CPU cores to execute a single query faster with the parallel query feature When configured properly, it can use all available memory for caching The size of the database is not limited; PostgreSQL can utilize multiple hard disks when multiple tablespaces are created; with partitioning, the hard disks could be accessed simultaneously, which makes data processing faster However, when it comes to spreading a database solution to multiple machines, it can be quite problematic because a standard PostgreSQL server can only run on a single machine.  In this article, we will look at different scaling scenarios and their implementation in PostgreSQL. The requirement for a system to be scalable means that a system that supports a business now, should also be able to support the same business with the same quality of service as it grows. This article is an excerpt taken from the book  'Learning PostgreSQL 11 - Third Edition' written by Andrey Volkov and Salahadin Juba. The book explores the concepts of relational databases and their core principles.  You’ll get to grips with using data warehousing in analytical solutions and reports and scaling the database for high availability and performance. Let's say a database can store 1 GB of data and effectively process 100 queries per second. What if with the development of the business, the amount of data being processed grows 100 times? Will it be able to support 10,000 queries per second and process 100 GB of data? Maybe not now, and not in the same installation. However, a scalable solution should be ready to be expanded to be able to handle the load as soon as it is needed. In scenarios where it is required to achieve better performance, it is quite common to set up more servers that would handle additional load and copy the same data to them from a master server. In scenarios where high availability is required, this is also a typical solution to continuously copy the data to a standby server so that it could take over in case the master server crashes.  Scalable PostgreSQL solution Replication can be used in many scaling scenarios. Its primary purpose is to create and maintain a backup database in case of system failure. This is especially true for physical replication. However, replication can also be used to improve the performance of a solution based on PostgreSQL. Sometimes, third-party tools can be used to implement complex scaling scenarios. Scaling for heavy querying Imagine there's a system that's supposed to handle a lot of read requests. For example, there could be an application that implements an HTTP API endpoint that supports the auto-completion functionality on a website. Each time a user enters a character in a web form, the system searches in the database for objects whose name starts with the string the user has entered. The number of queries can be very big because of the large number of users, and also because several requests are processed for every user session. To handle large numbers of requests, the database should be able to utilize multiple CPU cores. In case the number of simultaneous requests is really large, the number of cores required to process them can be greater than a single machine could have. The same applies to a system that is supposed to handle multiple heavy queries at the same time. You don't need a lot of queries, but when the queries themselves are big, using as many CPUs as possible would offer a performance benefit—especially when parallel query execution is used. In such scenarios, where one database cannot handle the load, it's possible to set up multiple databases, set up replication from one master database to all of them, making each them work as a hot standby, and then let the application query different databases for different requests. The application itself can be smart and query a different database each time, but that would require a special implementation of the data-access component of the application, which could look as follows: Another option is to use a tool called Pgpool-II, which can work as a load-balancer in front of several PostgreSQL databases. The tool exposes a SQL interface, and applications can connect there as if it were a real PostgreSQL server. Then Pgpool-II will redirect the queries to the databases that are executing the fewest queries at that moment; in other words, it will perform load-balancing: Yet another option is to scale the application together with the databases so that one instance of the application will connect to one instance of the database. In that case, the users of the application should connect to one of the many instances. This can be achieved with HTTP load-balancing: Data sharding When the problem is not the number of concurrent queries, but the size of the database and the speed of a single query, a different approach can be implemented. The data can be separated into several servers, which will be queried in parallel, and then the result of the queries will be consolidated outside of those databases. This is called data sharding. PostgreSQL provides a way to implement sharding based on table partitioning, where partitions are located on different servers and another one, the master server, uses them as foreign tables. When performing a query on a parent table defined on the master server, depending on the WHERE clause and the definitions of the partitions, PostgreSQL can recognize which partitions contain the data that is requested and would query only these partitions. Depending on the query, sometimes joins, grouping and aggregation could be performed on the remote servers. PostgreSQL can query different partitions in parallel, which will effectively utilize the resources of several machines. Having all this, it's possible to build a solution when applications would connect to a single database that would physically execute their queries on different database servers depending on the data that is being queried. It's also possible to build sharding algorithms into the applications that use PostgreSQL. In short, applications would be expected to know what data is located in which database, write it only there, and read it only from there. This would add a lot of complexity to the applications. Another option is to use one of the PostgreSQL-based sharding solutions available on the market or open source solutions. They have their own pros and cons, but the common problem is that they are based on previous releases of PostgreSQL and don't use the most recent features (sometimes providing their own features instead). One of the most popular sharding solutions is Postgres-XL, which implements a shared-nothing architecture using multiple servers running PostgreSQL. The system has several components: Multiple data nodes: Store the data A single global transaction monitor (GTM): Manages the cluster, provides global transaction consistency Multiple coordinator nodes: Supports user connections, builds query-execution plans, and interacts with the GTM and the data nodes Postgres-XL implements the same API as PostgreSQL, therefore the applications don't need to treat the server in any special way. It is ACID-compliant, meaning it supports transactions and integrity constraints. The COPY command is also supported. The main benefits of using Postgres-XL are as follows: It can scale to support more reading operations by adding more data nodes It can scale for to support more writing operations by adding more coordinator nodes The current release of Postgres-XL (at the time of writing) is based on PostgreSQL 10, which is relatively new The main downside of Postgres-XL is that it does not provide any high-availability features out of the box. When more servers are added to a cluster, the probability of the failure of any of them increases. That's why you should take care with backups or implement replication of the data nodes themselves. Postgres-XL is open source, but commercial support is available. Another solution worth mentioning is Greenplum. It's positioned as an implementation of a massive parallel-processing database, specifically designed for data warehouses. It has the following components: Master node: Manages user connections, builds query execution plans, manages transactions Data nodes: Store the data and perform queries Greenplum also implements the PostgreSQL API, and applications can connect to a Greenplum database without any changes. It supports transactions, but support for integrity constraints is limited. The COPY command is supported. The main benefits of Greenplum are as follows: It can scale to support more reading operations by adding more data nodes. It supports column-oriented table organization, which can be useful for data-warehousing solutions. Data compression is supported. High-availability features are supported out of the box. It's possible (and recommended) to add a secondary master that would take over in case a primary master crashes. It's also possible to add mirrors to the data nodes to prevent data loss. The drawbacks are as follows: It doesn't scale to support more writing operations. Everything goes through the single master node and adding more data nodes does not make writing faster. However, it's possible to import data from files directly on the data nodes. It uses PostgreSQL 8.4 in its core. Greenplum has a lot of improvements and new features added to the base PostgreSQL code, but it's still based on a very old release; however, the system is being actively developed. Greenplum doesn't support foreign keys, and support for unique constraints is limited. There are commercial and open source editions of Greenplum. Scaling for many numbers of connections Yet another use case related to scalability is when the number of database connections is great.  However, when a single database is used in an environment with a lot of microservices and each has its own connection pool, even if they don't perform too many queries, it's possible that hundreds or even thousands of connections are opened in the database. Each connection consumes server resources and just the requirement to handle a great number of connections can already be a problem, without even performing any queries. If applications don't use connection pooling and open connections only when they need to query the database and close them afterwards, another problem could occur. Establishing a database connection takes time—not too much, but when the number of operations is great, the total overhead will be significant. There is a tool, named PgBouncer, that implements a connection-pool functionality. It can accept connections from many applications as if it were a PostgreSQL server and then open a limited number of connections towards the database. It would reuse the same database connections for multiple applications' connections. The process of establishing a connection from an application to PgBouncer is much faster than connecting to a real database because PgBouncer doesn't need to initialize a database backend process for the session. PgBouncer can create multiple connection pools that work in one of the three modes: Session mode: A connection to a PostgreSQL server is used for the lifetime of a client connection to PgBouncer. Such a setup could be used to speed up the connection process on the application side. This is the default mode. Transaction mode: A connection to PostgreSQL is used for a single transaction that a client performs. That could be used to reduce the number of connections at the PostgreSQL side when only a few translations are performed simultaneously. Statement mode: A database connection is used for a single statement. Then it is returned to the pool and a different connection is used for the next statement. This mode is similar to the transaction mode, though more aggressive. Note that multi-statement transactions are not possible when statement mode is used. Different pools can be set up to work in different modes. It's possible to let PgBouncer connect to multiple PostgreSQL servers, thus working as a reverse proxy. The way PgBouncer could be used is represented in the following diagram: PgBouncer establishes several connections to the database. When an application connects to PgBouncer and starts a transaction, PgBouncer assigns an existing database connection to that application, forwards all SQL commands to the database, and delivers the results back. When the transaction is finished, PgBouncer will dissociate the connections, but not close them. If another application starts a transaction, the same database connection could be used. Such a setup requires configuring PgBouncer to work in transaction mode. PostgreSQL provides several ways to implement replication that would maintain a copy of the data from a database on another server or servers. This can be used as a backup or a standby solution that would take over in case the main server crashes. Replication can also be used to improve the performance of a software system by making it possible to distribute the load on several database servers. In this article, we discussed the problem of building scalable solutions based on PostgreSQL utilizing the resources of several servers. We looked at scaling for querying, data sharding, as well as scaling for many numbers of connections.  If you enjoyed reading this article and want to explore other topics, be sure to check out the book 'Learning PostgreSQL 11 - Third Edition'. Handling backup and recovery in PostgreSQL 10 [Tutorial] Understanding SQL Server recovery models to effectively backup and restore your database Saving backups on cloud services with ElasticSearch plugins
Read more
  • 0
  • 0
  • 18520
article-image-a-non-programmers-guide-to-learning-machine-learning
Natasha Mathur
05 Sep 2018
11 min read
Save for later

A non programmer’s guide to learning Machine learning

Natasha Mathur
05 Sep 2018
11 min read
Artificial intelligence might seem intimidating, but it isn’t actually as complex as you might think. Many of the tools that have been developed over the last decade or so have all helped to make artificial intelligence and machine learning more accessible to engineers with varying degrees of experience and knowledge. Today, we’ve got to a stage where it’s now accessible even to people who have barely written a line of code in their life! Pretty exciting, right? But if you’re completely new to the field, it can be challenging to know how to get started - fortunately, we’re about to help you overcome that first hurdle. If you are an AI denier, then be sure to first read ‘why learn Machine Learning as a non-techie’ before you move forward. A strong purpose and belief is the first step to learning anything new. Alright, now here’s how you can get started with artificial intelligence and machine learning techniques quickly. 0. Use a free MLaaS or a no code interactive machine learning tool to experience first hand what is possible with learning machine learning: Some popular examples of no code machine learning as a service option are Microsoft Azure, BigML, Orange, and Amazon ML. Read Q2 under the FAQ section below to know more on this topic. 1. Learn Linear Algebra: Linear Algebra is the elementary unit for ML. It helps you effectively comprehend the theory behind the Machine learning algorithms and how they work. It also improves your math skills such as statistics, programming skills, which are all other skills that helps in ML. Learning Resources: Linear Algebra for Beginners: Open Doors to Great Careers Linear algebra Basics 2. Learn just enough Python or any programming: Now, you can get started with any language of your interest, but we suggest Python as  it’s great for people who are new to programming. It’s easy to learn due to its simple syntax. You’ll be able to quickly implement the ML algorithms. Also,  It has a rich development ecosystem that offers a ton of libraries and frameworks in Machine Learning such as Scikit Learn, Lasagne, Numpy, Scipy, Theano, Tensorflow, etc. Learning Resources: Python Machine Learning Learn Python in 7 Days Python for Beginners 2017 [Video] Learn Python with codecademy Python editor for beginner programmers 3. Learn basic Probability Theory and statistics: A lot of fundamental Statistical and Probability Theories form the basis for ML. You’ve probably already learned Probability and statistics in school, it easy to dive into advanced statistics for ML. Machine learning in its currently widely used form is a way to predict odds and see patterns. Knowing statistics and probability is important as it will help you with better understanding of why any machine learning algorithm works. For example, your grounding in this area, will help to ask the right questions, choose the right set of algorithms and know what to expect as answers from your ML model on questions such as: What are the odds of this person also liking this movie given their current movie watching choices ( Collaborative filtering and content-based filtering) How similar is this user to that group of users who brought a bunch of stuff on my site (clustering, collaborative filtering, and classification) Could this person be at risk of cancer given a certain set of traits and health indicator observations (logistic regression) Should you buy that stock (decision tree) Also, check out our interview with James D. Miller to know more about why learning stats is important in this field. Learning resources: Statistics for Data Science [Video] 4. Learn machine learning algorithms: Do not get intimidated!  You don’t have to be an expert to learn ML algorithms. Knowing basic ML algorithms that are majorly used in the real world applications like linear regression, naive Bayes, and decision trees, are enough to get you started. Learn what they do and how they are used in Machine Learning. 5. Learn numpy sci-kit learn,Keras or any other popular machine learning framework: It can be confusing initially to decide which framework to learn. Each one has its own advantages and disadvantages. Numpy is a linear algebra library which is useful for performing mathematical and logical operations. You can easily work with large multidimensional arrays using Numpy. Sci-kit learn helps with quick implementation of popular algorithms on datasets as just one line of code makes different algorithms available for you. Keras is minimalistic and straightforward with high-levels of extensibility, so it is easier to approach. Learning Resources:  Hands-on Machine Learning with TensorFlow [Video]  Hands-on Scikit-learn for Machine Learning [Video] If you have reached till here, it is time to put your learning into practice. Go ahead and create a simple linear regression model using some publicly available dataset in your area of interest. Kaggle, ourworldindata.org, UC Irvine Machine Learning repository, elitedatascience, all have a rich set of clean datasets in varied fields. Now, it is necessary to commit and put in daily efforts to practise these skills. Quora, Reddit, Medium, and stackoverflow will be your best friends when it comes to solving doubts regarding any of these skills. Data Helpers is another great resource that provides newcomers with help on queries regarding entering the ML field and related topics. Additionally, once you start getting hang of these skills, identify your strengths and interests, to realign your career goals. Research on the kind of work you want to put your newly gained Machine Learning skill to use. It needn’t be professional or serious, it just needs to be something that you deeply care about or are passionate about. This will pull you through your learning milestones, should you feel low at some point. Also, don’t forget to collaborate with other people and learn from them. You can work with web developers, software programmers, data analysts, data administrators, game developers etc. Finally, keep yourself updated with all the latest happenings in the ML world. Follow top experts and influencers on social media, top blogs on Machine Learning, and conferences. Once you are done checking off these steps off your list, you’ll be ready to start off with your ML project.                                                  Now, we’ll be looking at the most frequently asked questions by beginners in the field of Machine learning. Frequently asked questions by Beginners in ML As a beginner, it’s natural to have a lot of questions regarding ML. We’ll be addressing the top three frequently asked questions by beginners or non-programmers when it comes to Machine learning: Q.1 I am looking to make a career in Machine learning but I have no prior programming experience. Do I need to know programming for Machine learning? In a nutshell, Yes. If you want a career in Machine learning then having some form of programming knowledge really helps. As mentioned earlier in this article, learning a programming language can really help you with implementing ML algorithms. It also lets you know the internal mechanism behind Machine learning. So, having programming as a prior skill is great. Again, as mentioned before, you can get started with Python which is the easiest and the most common languages for ML. However, programming is just a part of Machine learning. For instance, “machine learning engineers” typically write more code than develop models, while “research scientists” work more on modelling and analyzing different models. Now, ML is based on the principles of statistical inference and for talking statistically to the computer, we need a language, there comes Coding. So, even though the nature of your job in ML might not require you to code as much, there’s still some amount of coding required. Read Also: Why is Python so good for AI and ML? 5 Python Experts Explain Top languages for Artificial Intelligence development Q.2 Are there any tools that can help me with Machine learning without touching a single line of code? Yes. With the rise of MLaaS (Machine learning as a service), there are certain tools that help you get started with machine learning right-away. These are especially useful for business applications of ML, such as predictive modelling and clustering. Read Also: How MLaaS is transforming cloud Some of the most popular ones are: BigML:  This cloud based web-service lets you upload your data, prepare it and run algorithms on it. It’s great for people with not so extensive data science backgrounds. It offers a clean and easy to use interfaces for configuring algorithms (decision trees) and reviewing the results. Being focused “only” on Machine Learning, it comes with a wide set of features, all well integrated within a usable Web UI. Other than that, it also offers an API so that if you like it you can build an application around it. Microsoft Azure: The Microsoft Azure ML studio is a “GUI-based integrated development environment for constructing and operationalizing Machine Learning workflow on Azure”. So, via an integrated development environment called ML Studio, people without data science background or non-programmers can also build data models with the help of drag-and-drop gestures and simple data flow diagrams. This also saves a lot of time through ML Studio's library of sample experiments. Learning resources: Microsoft Azure Machine Learning Machine Learning In The Cloud With Azure ML[Video] Orange: This is an open source machine learning and data visualization studio for novice and experts alike. It provides a toolbox comprising of text mining (topic modelling) and image recognition. It also offers a design tool for visual programming which allows you to connect together data preparation, algorithms, and result evaluation, thereby, creating machine learning “programs”. Apart from that, it provides over 100 widgets for the environment and there’s also a Python API and library available which you can integrate into your application. Amazon ML: Amazon ML is a part of Amazon Web Services ( AWS ) that combines powerful machine learning algorithms with interactive visual tools to guide you towards easily creating, evaluating, and deploying machine learning models. So, whether you are a data scientist or a newbie, it offers ML services and tools tailored to meet your needs and level of expertise. Building ML models using Amazon ML consists of three operations: data analysis, model training, and evaluation. Learning Resources: Effective Amazon Machine Learning Q.3  Do I need to know advanced mathematics ( college graduate level ) to learn Machine learning? It depends. As mentioned earlier, understanding of the following mathematical topics: Probability, Statistics and Linear Algebra can really make your machine learning journey easier and also help simplify your code. These help you understand the “why” behind the working of the machine learning algorithms, which is quite fundamental to understanding ML. However, not knowing advanced mathematics is not an excuse to not learning Machine Learning. There a lot of libraries which makes the task of applying an ML algorithm to solve a task easier. One such example is the widely used Python’s scikit-learn library. With scikit-learn, you just need one line of code and you’ll have the most common algorithms there for you, ready to be used. But, if you want to go deeper into machine learning then knowing advanced mathematics is a prerequisite as it will help you understand the algorithms, the formulas, how the learning is done and many other Machine Learning concepts. Also, with so many courses and tutorials online, you can always learn advanced mathematics on the side while exploring Machine learning. So, we looked at the three most asked questions by beginners in the field of Machine Learning. In the past, machine learning has provided us with self-driving cars, effective web search, speech recognition, etc. Machine learning is extremely pervasive, in fact, many researchers believe that ML is the best way to make progress towards human-level AI. Learning ML is not an easy task but its not next to impossible either. In the end, it all depends on the amount of dedication and efforts that you’re willing to put in to get a grasp of it. We just touched the tip of the iceberg in this article, there’s a lot more to know in Machine Learning which you will get a hang of as you get your feet dirty in it. That being said, all the best for the road ahead! Facebook launches a 6-part ML video series 7 of the best ML conferences for the rest of 2018 Google introduces Machine Learning courses for AI beginners
Read more
  • 0
  • 0
  • 18202

article-image-6-artificial-intelligence-cybersecurity-tools-you-need-to-know
Savia Lobo
25 Aug 2018
7 min read
Save for later

6 artificial intelligence cybersecurity tools you need to know

Savia Lobo
25 Aug 2018
7 min read
Recently, most of the organizations experienced severe downfall due to an undetected malware, Deeplocker, which secretly evaded even the stringent cyber security mechanisms. Deeplocker leverages the AI model to attack the target host by using indicators such as facial recognition, geolocation and voice recognition. This incidence speaks volumes about the big role AI plays in the cybersecurity domain. In fact, some may even go on to say that AI for cybersecurity is no longer a nice to have tech rather a necessity. Large and small organizations and even startups are hugely investing in building AI systems to analyze the huge data trove and in turn, help their cybersecurity professionals to identify possible threats and take precautions or immediate actions to solve it. If AI can be used in getting the systems protected, it can also harm it. How? The hackers and intruders can also use it to launch an attack--this would be a much smarter attack--which would be difficult to combat. Phishing, one of the most common and simple social engineering cyber attack is now easy for attackers to master. There are a plethora of tools on the dark web that can help anyone to get their hands on phishing. In such trying conditions, it is only imperative that organizations take necessary precautions to guard their information castles. What better than AI? How 6 tools are using artificial intelligence for cybersecurity Symantec’s Targeted attack analytics (TAA) tool This tool was developed by Symantec and is used to uncover stealthy and targeted attacks. It applies AI and machine learning on the processes, knowledge, and capabilities of the Symantec’s security experts and researchers. The TAA tool was used by Symantec to counter the Dragonfly 2.0 attack last year. This attack targeted multiple energy companies and tried to gain access to operational networks. Eric Chein, Technical Director of Symantec Security says, “ With TAA, we’re taking the intelligence generated from our leading research teams and uniting it with the power of advanced machine learning to help customers automatically identify these dangerous threats and take action.” The TAA tools analyze incidents within the network against the incidents found in their Symantec threat data lake. TAA unveils suspicious activity in individual endpoints and collates that information to determine whether each action indicate hidden malicious activity. The TAA tools are now available for Symantec Advanced Threat Protection (ATP) customers. Sophos’ Intercept X tool Sophos is a British security software and hardware company. Its tool, Intercept X, uses a deep learning neural network that works similar to a human brain. In 2010, the US Defense Advanced Research Projects Agency (DARPA) created their first Cyber Genome Program to uncover the ‘DNA’ of malware and other cyber threats, which led to the creation of algorithm present in the Intercept X. Before a file executes, the Intercept X is able to extract millions of features from a file, conduct a deep analysis, and determine if a file is benign or malicious in 20 milliseconds. The model is trained on real-world feedback and bi-directional sharing of threat intelligence via an access to millions of samples provided by the data scientists. This results in high accuracy rate for both existing and zero-day malware, and a lower false positive rate. Intercept X utilizes behavioral analysis to restrict new ransomware and boot-record attacks.  The Intercept X has been tested on several third parties such as NSS labs and received high-scores. It is also proven on VirusTotal since August of 2016. Maik Morgenstern, CTO, AV-TEST said, “One of the best performance scores we have ever seen in our tests.” Darktrace Antigena Darktrace Antigena is Darktrace’s active self-defense product. Antigena expands Darktrace’s core capabilities to detect and replicate the function of digital antibodies that identify and neutralize threats and viruses. Antigena makes use of Darktrace’s Enterprise Immune System to identify suspicious activity and responds to them in real-time, depending on the severity of the threat. With the help of underlying machine learning technology, Darktrace Antigena identifies and protects against unknown threats as they develop. It does this without the need for human intervention, prior knowledge of attacks, rules or signatures. With such automated response capability, organizations can respond to threats quickly, without disrupting the normal pattern of business activity. Darktrace Antigena modules help to regulate user and machine access to the internet, message protocols and machine and network connectivity via various products such as Antigena Internet, Antigena Communication, and Antigena network. IBM QRadar Advisor IBM’s QRadar Advisor uses the IBM Watson technology to fight against cyber attacks. It uses AI to auto-investigate indicators of any compromise or exploit. QRadar Advisor uses cognitive reasoning to give critical insights and further accelerates the response cycle. With the help of IBM’s QRadar Advisor, security analysts can assess threat incidents and reduce the risk of missing them. Features of the IBM QRadar Advisor Automatic investigations of incidents QRadar Advisor with Watson investigates threat incidents by mining local data using observables in the incident to gather broader local context. It later quickly assesses the threats regarding whether they have bypassed layered defenses or were blocked. Provides Intelligent reasoning QRadar identifies the likely threat by applying cognitive reasoning. It connects threat entities related to the original incident such as malicious files, suspicious IP addresses, and rogue entities to draw relationships among these entities. Identifies high priority risks With this tool, one can get critical insights on an incident, such as whether or not a malware has executed, with supporting evidence to focus your time on the higher risk threats. Then make a decision quickly on the best response method for your business. Key insights on users and critical assets IBM’s QRadar can detect suspicious behavior from insiders through integration with the User Behavior Analytics (UBA) App and understands how certain activities or profiles impact systems. Vectra’s Cognito Vectra’s Cognito platform uses AI to detect attackers in real-time. It automates threat detection and hunts for covert attackers. Cognito uses behavioral detection algorithms to collect network metadata, logs and cloud events. It further analyzes these events and stores them to reveal hidden attackers in workloads and user/IoT devices. Cognito platform consists of Cognito Detect and Cognito Recall. Cognito Detect reveals hidden attackers in real time using machine learning, data science, and behavioral analytics. It automatically triggers responses from existing security enforcement points by driving dynamic incident response rules. Cognito Recall determines exploits that exist in historical data. It further speeds up detection of incident investigations with actionable context about compromised devices and workloads over time. It’s a quick and easy fix to find all devices or workloads accessed by compromised accounts and identify files involved in exfiltration. Just as diamond cuts diamond, AI cuts AI. By using AI to attack and to prevent on either side, AI systems will learn different and newer patterns and also identify unique deviations to security analysts. This provides organizations to resolve an attack on the way much before it reaches to the core. Given the rate at which AI and machine learning are expanding, the days when AI will redefine the entire cybersecurity ecosystem are not that far. DeepMind AI can spot over 50 sight-threatening eye diseases with expert accuracy IBM’s DeepLocker: The Artificial Intelligence powered sneaky new breed of Malware 7 Black Hat USA 2018 conference cybersecurity training highlights Top 5 cybersecurity trends you should be aware of in 2018  
Read more
  • 0
  • 0
  • 18099