
Tech Guides

The rise of machine learning in the investment industry

Natasha Mathur
15 Feb 2019
13 min read
The investment industry has evolved dramatically over the last several decades and continues to do so amid increased competition, technological advances, and a challenging economic environment. In this article, we will review several key trends that have shaped the investment environment in general, and the context for algorithmic trading more specifically. This article is an excerpt taken from the book 'Hands-On Machine Learning for Algorithmic Trading' written by Stefan Jansen. The book explores the strategic perspective, conceptual understanding, and practical tools to add value from applying ML to the trading and investment process.

The trends that have propelled algorithmic trading and ML to their current prominence include:

- Changes in the market microstructure, such as the spread of electronic trading and the integration of markets across asset classes and geographies
- The development of investment strategies framed in terms of risk-factor exposure, as opposed to asset classes
- The revolutions in computing power, data generation and management, and analytic methods
- The outperformance of the pioneers of algorithmic trading relative to human, discretionary investors

In addition, the financial crises of 2001 and 2008 have affected how investors approach diversification and risk management and have given rise to low-cost passive investment vehicles in the form of exchange-traded funds (ETFs). Amid low yields and low volatility after the 2008 crisis, cost-conscious investors shifted $2 trillion from actively managed mutual funds into passively managed ETFs. Competitive pressure is also reflected in lower hedge fund fees, which dropped from the traditional 2% annual management fee and 20% take of profits to an average of 1.48% and 17.4%, respectively, in 2017. Let's have a look at how ML has come to play a strategic role in algorithmic trading.

Factor investing and smart beta funds

The return provided by an asset is a function of the uncertainty or risk associated with the financial investment. An equity investment implies, for example, assuming a company's business risk, and a bond investment implies assuming default risk. To the extent that specific risk characteristics predict returns, identifying and forecasting the behavior of these risk factors becomes a primary focus when designing an investment strategy. It yields valuable trading signals and is the key to superior active-management results. The industry's understanding of risk factors has evolved very substantially over time and has impacted how ML is used for algorithmic trading.

Modern Portfolio Theory (MPT) introduced the distinction between idiosyncratic and systematic sources of risk for a given asset. Idiosyncratic risk can be eliminated through diversification, but systematic risk cannot. In the early 1960s, the Capital Asset Pricing Model (CAPM) identified a single factor driving all asset returns: the return on the market portfolio in excess of T-bills. The market portfolio consisted of all tradable securities, weighted by their market value. The systematic exposure of an asset to the market is measured by beta, which captures how strongly the asset's returns co-move with those of the market portfolio (the covariance of the two return series scaled by the variance of the market's returns). The recognition that the risk of an asset does not depend on the asset in isolation, but rather on how it moves relative to other assets and the market as a whole, was a major conceptual breakthrough.
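As a rough illustration of the idea (this is a minimal sketch, not code from the book), the beta described above can be estimated from two historical return series with a few lines of pandas; the function name and the use of daily percentage returns are assumptions made for this example:

```python
import pandas as pd

def estimate_beta(asset_returns: pd.Series, market_returns: pd.Series) -> float:
    """Estimate CAPM beta as Cov(asset, market) / Var(market)."""
    aligned = pd.concat(
        [asset_returns, market_returns], axis=1, keys=["asset", "market"]
    ).dropna()
    return aligned["asset"].cov(aligned["market"]) / aligned["market"].var()

# Hypothetical usage with daily price series indexed by date:
# beta = estimate_beta(stock_prices.pct_change(), index_prices.pct_change())
```

A beta above 1 indicates that the asset tends to amplify market moves, while a beta below 1 indicates that it dampens them.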
In other words, assets do not earn a risk premium because of their specific, idiosyncratic characteristics, but because of their exposure to underlying factor risks. However, a large body of academic literature and long investing experience have disproved the CAPM prediction that asset risk premiums depend only on their exposure to a single factor measured by the asset's beta. Instead, numerous additional risk factors have since been discovered. A factor is a quantifiable signal, attribute, or any variable that has historically correlated with future stock returns and is expected to remain correlated in the future.

These risk factors were labeled anomalies since they contradicted the Efficient Market Hypothesis (EMH), which held that market equilibrium would always price securities according to the CAPM, so that no other factors should have predictive power. The economic theory behind factors can be either rational, where factor risk premiums compensate for low returns during bad times, or behavioral, where agents fail to arbitrage away excess returns.

Well-known anomalies include the value, size, and momentum effects that help predict returns while controlling for the CAPM market factor. The size effect rests on small firms systematically outperforming large firms, discovered by Banz (1981) and Reinganum (1981). The value effect (Basu 1982) states that firms with low valuation metrics outperform. It suggests that firms with low price multiples, such as the price-to-earnings or the price-to-book ratios, perform better than their more expensive peers (as suggested by the inventors of value investing, Benjamin Graham and David Dodd, and popularized by Warren Buffett). The momentum effect, discovered in the late 1980s by, among others, Clifford Asness, the founding partner of AQR, states that stocks with good momentum, in terms of recent 6-12 month returns, have higher returns going forward than poor-momentum stocks with similar market risk.

Researchers also found that value and momentum factors explain returns for stocks outside the US, as well as for other asset classes, such as bonds, currencies, and commodities, along with additional risk factors. In fixed income, the value strategy is called riding the yield curve and is a form of the duration premium. In commodities, it is called the roll return, with a positive return for an upward-sloping futures curve and a negative return otherwise. In foreign exchange, the value strategy is called carry. There is also an illiquidity premium: securities that are more illiquid trade at low prices and have high average excess returns, relative to their more liquid counterparts. Bonds with higher default risk tend to have higher returns on average, reflecting a credit risk premium. Since investors are willing to pay for insurance against high volatility when returns tend to crash, sellers of volatility protection in options markets tend to earn high returns.

Multifactor models define risks in broader and more diverse terms than just the market portfolio. In 1976, Stephen Ross proposed arbitrage pricing theory, which asserted that investors are compensated for multiple systematic sources of risk that cannot be diversified away. The three most important macro factors are growth, inflation, and volatility, in addition to productivity, demographic, and political risk. In 1992, Eugene Fama and Kenneth French combined the equity risk factors size and value with a market factor into a single model that better explained cross-sectional stock returns.
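A multifactor model of this kind is typically estimated as a time-series regression of a stock's excess returns on the factor returns. The following sketch is only an illustration of that workflow (not the book's code) and assumes you already have a DataFrame of factor returns with columns named MKT, SMB, and HML, plus a Series of the stock's excess returns over the risk-free rate:

```python
import pandas as pd
import statsmodels.api as sm

def factor_loadings(excess_returns: pd.Series, factors: pd.DataFrame) -> pd.Series:
    """Regress excess returns on factor returns and return alpha plus factor betas."""
    data = pd.concat([excess_returns.rename("excess"), factors], axis=1).dropna()
    X = sm.add_constant(data[factors.columns])  # the constant estimates alpha
    return sm.OLS(data["excess"], X).fit().params

# Hypothetical usage with monthly data:
# loadings = factor_loadings(stock_excess, ff_factors[["MKT", "SMB", "HML"]])
```

The estimated loadings show how much of the stock's return co-moves with each factor, and the intercept estimates the portion left unexplained by them.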
They later extended the model to include bond risk factors so that it could simultaneously explain returns for both asset classes. A particularly attractive aspect of risk factors is their low or negative correlation. Value and momentum risk factors, for instance, are negatively correlated, reducing the risk and increasing risk-adjusted returns above and beyond the benefit implied by the individual factors. Furthermore, using leverage and long-short strategies, factor strategies can be combined into market-neutral approaches. The combination of long positions in securities exposed to positive risks with underweight or short positions in the securities exposed to negative risks allows for the collection of dynamic risk premiums.

As a result, the factors that explained returns above and beyond the CAPM were incorporated into investment styles that tilt portfolios in favor of one or more factors, and assets began to migrate into factor-based portfolios. The 2008 financial crisis underlined how asset-class labels could be highly misleading and create a false sense of diversification when investors do not look at the underlying factor risks, as asset classes came crashing down together.

Over the past several decades, quantitative factor investing has evolved from a simple approach based on two or three styles to multifactor smart or exotic beta products. Smart beta funds crossed $1 trillion in AUM in 2017, testifying to the popularity of the hybrid investment strategy that combines active and passive management. Smart beta funds take a passive strategy but modify it according to one or more factors, for example favoring cheaper stocks or screening them according to dividend payouts, to generate better returns. This growth has coincided with increasing criticism of the high fees charged by traditional active managers as well as heightened scrutiny of their performance. The ongoing discovery and successful forecasting of risk factors that, either individually or in combination with other risk factors, significantly impact future asset returns across asset classes is a key driver of the surge in ML in the investment industry.

Algorithmic pioneers outperform humans at scale

The track record and growth of Assets Under Management (AUM) of firms that spearheaded algorithmic trading has played a key role in generating investor interest and subsequent industry efforts to replicate their success. Systematic funds differ from HFT in that trades may be held significantly longer while seeking to exploit arbitrage opportunities as opposed to advantages from sheer speed. Systematic strategies that mostly or exclusively rely on algorithmic decision-making were most famously introduced by mathematician James Simons, who founded Renaissance Technologies in 1982 and built it into the premier quant firm. Its secretive Medallion Fund, which is closed to outsiders, has earned an estimated annualized return of 35% since 1982.

DE Shaw, Citadel, and Two Sigma, three of the most prominent quantitative hedge funds that use systematic strategies based on algorithms, rose to the all-time top-20 performers for the first time in 2017, in terms of total dollars earned for investors, after fees, and since inception. DE Shaw, founded in 1988 and managing $47 billion in AUM in 2018, joined the list at number 3. Citadel, started in 1990 by Kenneth Griffin, manages $29 billion and ranks 5, and Two Sigma, started only in 2001 by DE Shaw alumni John Overdeck and David Siegel, has grown from $8 billion in AUM in 2011 to $52 billion in 2018.
Bridgewater, started in 1975 and with over $150 billion in AUM, continues to lead thanks to its Pure Alpha Fund, which also incorporates systematic strategies. Similarly, on the Institutional Investor 2017 Hedge Fund 100 list, five of the top six firms rely largely or completely on computers and trading algorithms to make investment decisions, and all of them have been growing their assets in an otherwise challenging environment. Several quantitatively focused firms climbed several ranks and in some cases grew their assets by double-digit percentages. Number 2-ranked Applied Quantitative Research (AQR) grew its hedge fund assets 48% in 2017 to $69.7 billion and managed $187.6 billion firm-wide.

Among all hedge funds ranked by compounded performance over the last three years, the quant-based funds run by Renaissance Technologies achieved ranks 6 and 24, Two Sigma rank 11, D.E. Shaw ranks 18 and 32, and Citadel ranks 30 and 37. Beyond the top performers, algorithmic strategies have worked well in the last several years: in the past five years, quant-focused hedge funds gained about 5.1% per year while the average hedge fund rose 4.3% per year over the same period.

ML-driven funds attract $1 trillion AUM

The familiar three revolutions in computing power, data, and ML methods have made the adoption of systematic, data-driven strategies not only more compelling and cost-effective but a key source of competitive advantage. As a result, algorithmic approaches are not only finding wider application in the hedge-fund industry that pioneered these strategies but across a broader range of asset managers and even passively managed vehicles such as ETFs. In particular, predictive analytics using machine learning and algorithmic automation play an increasingly prominent role in all steps of the investment process across asset classes, from idea generation and research to strategy formulation and portfolio construction, trade execution, and risk management.

Estimates of industry size vary because there is no objective definition of a quantitative or algorithmic fund, and many traditional hedge funds or even mutual funds and ETFs are introducing computer-driven strategies or integrating them into a discretionary environment in a human-plus-machine approach. Morgan Stanley estimated in 2017 that algorithmic strategies have grown at 15% per year over the past six years and control about $1.5 trillion between hedge funds, mutual funds, and smart beta ETFs. Other reports suggest the quantitative hedge fund industry was about to exceed $1 trillion in AUM, nearly doubling its size since 2010 amid outflows from traditional hedge funds. In contrast, total hedge fund industry capital hit $3.21 trillion according to the latest global Hedge Fund Research report. The market research firm Preqin estimates that almost 1,500 hedge funds make a majority of their trades with help from computer models. Quantitative hedge funds are now responsible for 27% of all US stock trades by investors, up from 14% in 2013.

But many use data scientists, or quants, who, in turn, use machines to build large statistical models (WSJ). In recent years, however, funds have moved toward true ML, where artificially intelligent systems can analyze large amounts of data at speed and improve themselves through such analyses. Recent examples include Rebellion Research, Sentient, and Aidyia, which rely on evolutionary algorithms and deep learning to devise fully automatic Artificial Intelligence (AI)-driven investment platforms.
From the core hedge fund industry, the adoption of algorithmic strategies has spread to mutual funds and even passively managed exchange-traded funds in the form of smart beta funds, and to discretionary funds in the form of quantamental approaches.

The emergence of quantamental funds

Two distinct approaches have evolved in active investment management: systematic (or quant) and discretionary investing. Systematic approaches rely on algorithms for a repeatable and data-driven approach to identify investment opportunities across many securities; in contrast, a discretionary approach involves an in-depth analysis of a smaller number of securities. These two approaches are becoming more similar as fundamental managers take more data-science-driven approaches. Even fundamental traders now arm themselves with quantitative techniques, accounting for $55 billion of systematic assets, according to Barclays. Agnostic to specific companies, quantitative funds trade patterns and dynamics across a wide swath of securities. Quants now account for about 17% of total hedge fund assets, data compiled by Barclays shows. Point72 Asset Management, with $12 billion in assets, has been shifting about half of its portfolio managers to a man-plus-machine approach. Point72 is also investing tens of millions of dollars into a group that analyzes large amounts of alternative data and passes the results on to traders.

Investments in strategic capabilities

Rising investments in related capabilities (technology, data and, most importantly, skilled humans) highlight how significant algorithmic trading using ML has become for competitive advantage, especially in light of the rising popularity of passive, indexed investment vehicles, such as ETFs, since the 2008 financial crisis. Morgan Stanley noted that only 23% of its quant clients say they are not considering or not already using ML, down from 44% in 2016. Guggenheim Partners LLC built what it calls a supercomputing cluster for $1 million at the Lawrence Berkeley National Laboratory in California to help crunch numbers for Guggenheim's quant investment funds. Electricity for the computers costs another $1 million a year.

AQR is a quantitative investment group that relies on academic research to identify and systematically trade factors that have, over time, proven to beat the broader market. The firm used to eschew the purely computer-powered strategies of quant peers such as Renaissance Technologies or DE Shaw. More recently, however, AQR has begun to seek profitable patterns in markets using ML to parse through novel datasets, such as satellite pictures of shadows cast by oil wells and tankers. The leading firm BlackRock, with over $5 trillion in AUM, also bets on algorithms to beat discretionary fund managers by heavily investing in SAE, a systematic trading firm it acquired during the financial crisis. Franklin Templeton bought Random Forest Capital, a debt-focused, data-led investment company, for an undisclosed amount, hoping that its technology can support the wider asset manager.

We looked at how ML plays a role in different industry trends around algorithmic trading. If you want to learn more about the design and execution of algorithmic trading strategies, and use cases of ML in algorithmic trading, be sure to check out the book 'Hands-On Machine Learning for Algorithmic Trading'.

Using machine learning for phishing domain detection [Tutorial]
Anatomy of an automated machine learning algorithm (AutoML)
10 machine learning algorithms every engineer needs to know

Top 6 Java Machine Learning/Deep Learning frameworks you can’t miss

Kartikey Pandey
08 Dec 2017
4 min read
The data science tech market is buzzing with new and interesting Machine Learning libraries and tools almost every day. In such a rapidly growing market, it becomes difficult to choose the right tool or set of tools. More importantly, Artificial Intelligence and Deep Learning based projects require a different approach than traditional programming, which makes it tricky to zero in on one library or framework. The choice of a framework is largely based upon the type of problem one is expecting to solve. But there are other considerations too. Speed is one such factor that more or less always plays an important role in decision making. Other considerations include how open-ended it is, its architecture, functions, complexity of use, support for algorithms, and so on. Here, we present to you six Java libraries for your next Deep Learning and Artificial Intelligence project that you shouldn't miss if you are a Java loyalist or simply a web developer who wants to enter the world of deep learning.

DeepLearning4j (DL4J)

One of the first, commercial-grade, and most popular deep learning frameworks developed in Java. It also supports other JVM languages (Clojure, Scala). What's interesting about DL4J is that it comes with built-in GPU support for the training process. It also supports Hadoop YARN for distributed application management. It is popular for solving problems related to image recognition, fraud detection, and NLP.

MALLET

Mallet (Machine Learning for Language Toolkit) is an open source Java Machine Learning toolkit. It supports NLP, clustering, modelling, and classification. The most important capability of Mallet is its support for a wide variety of algorithms such as Naive Bayes and Decision Trees. Another useful feature is its topic modelling toolkit. Topic models are useful when analyzing large collections of unlabelled texts.

Massive Online Analysis (MOA)

MOA is an open source data streaming and mining framework for real-time analytics. It has a strong and growing community and is similar and related to Weka. It also has the ability to deal with massive data streams.

Encog

This framework supports a wide array of algorithms and neural networks, such as artificial neural networks, Bayesian networks, and genetic programming.

Neuroph

Neuroph, as the name suggests, offers great simplicity when working on neural networks. The main USP of Neuroph is its incredibly useful GUI (Graphical User Interface) tool that helps in creating and training neural networks. Neuroph is a good choice of framework when you have a quick project on hand and you don't want to spend hours learning the theory. Neuroph helps you get up and running quickly when putting neural networks to work for your project.

Java Machine Learning Library

The Java Machine Learning Library offers a great set of reference implementations of algorithms that you can't miss for your next Machine Learning project. Some of the key highlights are support vector machines and clustering algorithms.

These are a few key frameworks and tools you might want to consider when working on your next research work. The Java ML library ecosystem is vast, with many tools and libraries to support it, and we just touched the tip of that iceberg in this article. One particular tool that deserves an honourable mention is the Environment for Developing KDD-Applications Supported by Index-Structure (ELKI). It is designed particularly with researchers and research students in mind.
The main focus of ELKI is its broad coverage of data mining algorithms, which makes it a natural fit for research work. What's really important while choosing any of the above tools, or tools outside of this list, is a good understanding of the requirements and the problems you intend to solve. To reiterate, some of the key considerations to bear in mind before zeroing in on a tool are: support for algorithms, implementation of neural networks, dataset size (small, medium, large), and speed.

Why is everyone going crazy over WebAssembly?

Amarabha Banerjee
09 Sep 2018
4 min read
The history of the web has seen a few major events in the past three decades. One of them was the launch of JavaScript 22 years ago on December 4, 1995. Since then, JavaScript has slowly evolved to become the de-facto standard of front-end web development. The present-day web is much more dynamic and data intensive. Heavy graphics based games and applications require a much more robust browser. That is why developers are going crazy over the concept of WebAssembly. Is it here to replace JavaScript? Or is it like any other hype that will fade away with time? The answer is neither of the two.

Why use WebAssembly when you have JavaScript?

To understand the buzz around WebAssembly, we will have to understand what JavaScript does best and what its limitations are. JavaScript is compiled into machine code as it runs in the browser. Machine code is the language that communicates with the PC and instructs it what to do. Not only that, the JavaScript engine also parses, analyzes, and optimizes Emscripten-generated JavaScript while loading the application. That's what makes the browser slow in compute-heavy applications.

JavaScript is a dynamically typed language. Nothing is compiled and stored as functions in advance, so when the compiler in your browser runs JavaScript, it doesn't know which function call is going to come next. That might seem very inconvenient, but that feature is what makes JavaScript based browsers so intuitive and interactive. This feature ensures that your system does not have to install a standalone desktop application; the same application can be run from the browser.

Graphical representation of an assembler (source: LogRocket)

The above image shows how an assembly-level language is transformed into machine code when it is compiled. This is exactly what happens when WebAssembly code runs in the browser. But since WebAssembly is in binary format, it becomes much easier for the compiler to convert it into machine code.

Unfortunately, JavaScript is not suitable for every single application. For example, gaming is an area where running JavaScript code in the browser for a highly interactive multiplayer game is not the best solution. It takes a heavy toll on the system resources. That's where WebAssembly comes in. WebAssembly is a low-level binary language that runs alongside JavaScript. Its biggest advantages are speed, portability, and flexibility. The speed comes from the fact that WebAssembly is binary. JavaScript is a high-level language, and compiling it to machine code puts significant pressure on the JavaScript engine. Compared to that, WebAssembly binary files are much smaller in size (in KB) and easy to execute and convert to machine code.

Functioning of WASM (source: LogRocket)

The code optimization in WebAssembly happens during the compilation of the source code, unlike JavaScript. WebAssembly manages memory manually, just like languages such as C and C++, so there's no garbage collection either. This enables performance similar to native code. You can also compile other languages like Rust, C, and C++ into the WASM format. This enables developers to run their native code in the browser without knowing much JavaScript. WASM is not something that you write as code by hand; it's a format that is created from your native code and that compiles directly into machine code. This allows it to run alongside HTML5, CSS, and JavaScript code, giving you the best of both worlds.

So, is WebAssembly going to replace JavaScript?

JavaScript is clearly not replaceable.
It's just that for heavy graphics, audio, or AI based apps, a lot of function calls are made in the browser, which makes the browser slow. WebAssembly eases this burden. There are separate compilers that can turn your C, C++, or Rust code into WASM code, which is then used in the browser as JavaScript objects. Since these modules are very small in size, they make the application fast. Support for WebAssembly has been rolled out by all major browsers, so the majority of browsers in use today can run it. Until JavaScript capabilities improve, WebAssembly will work alongside JavaScript to make your apps perform better and to make your browser interactive, intuitive, and lightweight.

Golang 1.11 rc1 is here with experimental port for WebAssembly!
Unity switches to WebAssembly as the output format for the Unity WebGL build target
Introducing Life: A cross-platform WebAssembly VM for decentralized Apps written in Go
Grain: A new functional programming language that compiles to Webassembly

Streamline your application development process in 5 simple steps

Guest Contributor
23 Apr 2019
7 min read
Chief Information Officers (CIOs) are under constant pressure to deliver substantial results that meet business goals. Planning a project and seeing it through to the end is a critical requirement of an effective development process. In the fast-paced world of software development, getting results is an essential key for businesses to flourish. There is a certain pleasure you get from ticking off tasks from your to-do lists. However, this becomes a burden when you are drowning under a lot of tasks. Signs of inefficient processes are prevalent in every business. Unhappy customers, stressed out colleagues, disappointing code reviews, missed deadlines, and increases in costs are just some of the examples that are the direct result of dysfunctional processes. By streamlining your workflow, you will be better placed to take advantage of modern technologies like Machine Learning and Artificial Intelligence. Gaining access to such technologies will also help you to automate the workflow, making your daily processes even smoother. Listed below are 5 steps that can help you in streamlining your development process.

Step 1: Creating a Workflow

This is a preliminary step for companies who have not considered creating a better workflow. A task is not just something you can write down, complete, and tick off. Complex, software-related tasks are not like the "do-the-dishes" type of tasks. Usually, there are many stages in software development tasks, like planning, organizing, reviewing, and releasing. Regardless of the niche of your tasks, the workflow should be clear. You can always use software tools such as Zapier, Nintex, and ProcessMaker to customize your workflow and assign levels of importance to particular tasks. This might appear as micro-management at first, but once it becomes a part of the daily routine, it starts to get easier. Creating a workflow is probably the most important factor to consider when you are preparing to streamline your software development processes. There are several steps involved when creating a workflow:

Mapping the Process
Process mapping mainly focuses on the visualization of the current development process, which allows a top-down view of how things are working. You can do process mapping via tools such as Draw.io, Lucidchart, and Microsoft Visio.

Analyze the Process
Once you have a flowchart or a swimlane diagram set up, use it to investigate the problems within the process. The problems can range from costs, time, and employee motivation to other bottlenecks.

Redesign the Process
When you have identified the problems, you should try to solve them step by step. Working with people who are directly involved in the process (e.g. software developers) and gaining on-the-ground insight can prove very useful when redesigning the processes.

Acquire Resources
You now need to secure the resources that are required to implement the new processes. With regards to our topic, this can range from buying licensed software to faster computers.

Implementing Change
It is highly likely that your business processes will change along with existing systems, teams, and processes. Allocate your time to solving these problems, while keeping the regular operations running in the process.

Process Review
This phase might seem the easiest, but it is not. Once the changes are in place, you need to review them regularly so that the old problems do not rise up again.

Once the workflow is set in place, all you have to do is identify the bugs in your workflow plan.
The bugs can range anywhere from slow tasks and re-opening of finished tasks to dead tasks. What we have observed about workflows is that you do not get them right the first time. You need to take your time to edit and review the workflow while still being in the loop of the workflow. The more transparent and active your process is, the easier it gets to spot problems and figure out solutions.

Step 2: Backlog Maintenance

Many times you assume all the tasks in your backlog to be important. They might well be, but this makes the backlog a little too jam-packed. Your backlog will not serve a purpose unless you are actively taking part in keeping it organized. A backlog, while being a good place to store tasks, is also home to tasks that will never see the light of day. A good practice, therefore, would be to either clean up your backlog of dead tasks or combine them with tasks that have more importance in your overall workflow. If some of the tasks are relatively low-priority, we would recommend creating a separate backlog altogether. Backlogs are meant to be a database of tasks, but do not let that fact get over your head. You should not worry about deleting something important from your backlog; if the task is important, it will come back. You can use tools like Trello or Slack to create and maintain a backlog.

Step 3: Standardized Procedure for Tasks

You should have an accurate definition of "done". With respect to software development, there are several things you need to consider before actually calling a task done. These include:

- Ensure all the features have been applied
- The unit tests are finished
- Software information is up-to-date
- Quality assurance tests have been carried out
- The code is in the master branch
- The code is deployed in production

This is simply a template of what you can consider "done" with respect to a software development project. Like any template, it gets even better when you make your own additions and subtractions to it. Having a standardized definition of "done" helps remove confusion from the project, so that every employee has an understanding of every stage until they are finished, and it also gives you time to think about what you are trying to achieve. Lastly, it is always wise to spend a little extra time completing a task phase, so that you do not have to revisit it several times.

Step 4: Work in Progress (WIP) Control

The ultimate factor that kills workflow is multitasking. Overloading your employees with constant tasks results in an overall decline in output. Therefore, it is important that you do not burden your employees with multiple tasks, which only increases their work in progress. In order to fight the problem of multitasking, you need to reduce your cycle times by having fewer tasks in flight at one time. Consider setting a WIP limit inside your workflow by introducing limits for daily and weekly tasks. This helps to keep control of employee tasks and reduces their burden.

Step 5: Progress Visualization

When you have everything set up in your workflow, it is time to present that data to current and potential stakeholders. You need to make clear which features are completed and which ones you are currently working on, and whether you will be releasing the product on time. A good way to present data to senior management is through visualizations. For visualizations, you can use tools like Jira or Trello to make your data shine even more.
In terms of data representation, you can use various free online tools, or buy software like Microsoft PowerPoint or Excel. Whatever tools you might use, your end goal should be to make the information as simple as possible for the stakeholders. You need to avoid clutter and too much technical information. These are not the only methods you can use, however. Look around your company and see where you are lacking in your current processes. Take note of all of those areas, and research how you can change them for the better.

Author Bio

Shawn Mike has been working with writing-challenged clients for over five years. He provides ghostwriting and copywriting services. His educational background in the technical field and business studies has given him the edge to write on many topics. He occasionally writes blogs for Dynamologic Solutions.

Microsoft Store updates its app developer agreement, to give developers up to 95% of app revenue
React Native Vs Ionic: Which one is the better mobile app development framework?
9 reasons to choose Agile Methodology for Mobile App Development

A Quick look at ML in algorithmic trading strategies

Natasha Mathur
14 Feb 2019
6 min read
Algorithmic trading relies on computer programs that execute algorithms to automate some, or all, elements of a trading strategy. Algorithms are a sequence of steps or rules to achieve a goal and can take many forms. In the case of machine learning (ML), algorithms pursue the objective of learning other algorithms, namely rules, to achieve a target based on data, such as minimizing a prediction error. In this article, we have a look at use cases of ML and how it is used in algorithmic trading strategies.

These algorithms encode various activities of a portfolio manager who observes market transactions and analyzes relevant data to decide on placing buy or sell orders. The sequence of orders defines the portfolio holdings that, over time, aim to produce returns that are attractive to the providers of capital, taking into account their appetite for risk. This article is an excerpt taken from the book 'Hands-On Machine Learning for Algorithmic Trading' written by Stefan Jansen. The book explores effective trading strategies in real-world markets using NumPy, spaCy, pandas, scikit-learn, and Keras.

Ultimately, the goal of active investment management consists in achieving alpha, that is, returns in excess of the benchmark used for evaluation. The fundamental law of active management applies the information ratio (IR) to express the value of active management as the ratio of portfolio returns above the returns of a benchmark, usually an index, to the volatility of those returns. It approximates the information ratio as the product of the information coefficient (IC), which measures the quality of forecasts as their correlation with outcomes, and the breadth of a strategy, expressed as the square root of the number of bets. The use of ML for algorithmic trading, in particular, aims for more efficient use of conventional and alternative data, with the goal of producing both better and more actionable forecasts, hence improving the value of active management.

Quantitative strategies have evolved and become more sophisticated in three waves:

- In the 1980s and 1990s, signals often emerged from academic research and used a single or very few inputs derived from market and fundamental data. These signals are now largely commoditized and available as ETFs, such as basic mean-reversion strategies.
- In the 2000s, factor-based investing proliferated. Funds used algorithms to identify assets exposed to risk factors like value or momentum to seek arbitrage opportunities. Redemptions during the early days of the financial crisis triggered the quant quake of August 2007 that cascaded through the factor-based fund industry. These strategies are now also available as long-only smart-beta funds that tilt portfolios according to a given set of risk factors.
- The third era is driven by investments in ML capabilities and alternative data to generate profitable signals for repeatable trading strategies. Factor decay is a major challenge: the excess returns from new anomalies have been shown to drop by a quarter from discovery to publication, and by over 50% after publication, due to competition and crowding.
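In notation, the fundamental law of active management mentioned above is commonly summarized as follows (a standard textbook formulation rather than a quote from the book):

```latex
\mathrm{IR} = \frac{\mathbb{E}[R_p - R_b]}{\sigma(R_p - R_b)} \approx \mathrm{IC} \times \sqrt{\text{breadth}}
```

Here, R_p and R_b are the portfolio and benchmark returns, IC is the correlation between forecasts and outcomes, and breadth is the number of independent bets the strategy makes.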
There are several categories of trading strategies that use algorithms to execute trading rules:

- Short-term trades that aim to profit from small price movements, for example, due to arbitrage
- Behavioral strategies that aim to capitalize on anticipating the behavior of other market participants
- Programs that aim to optimize trade execution
- A large group of trading strategies based on predicted pricing

The HFT funds discussed above most prominently rely on short holding periods to benefit from minor price movements based on bid-ask arbitrage or statistical arbitrage. Behavioral algorithms usually operate in lower-liquidity environments and aim to anticipate moves by a larger player likely to significantly impact the price. The expectation of the price impact is based on sniffing algorithms that generate insights into other market participants' strategies, or market patterns such as forced trades by ETFs.

Trade-execution programs aim to limit the market impact of trades and range from the simple slicing of trades to match time-weighted average pricing (TWAP) or volume-weighted average pricing (VWAP). Simple algorithms leverage historical patterns, whereas more sophisticated algorithms take into account transaction costs, implementation shortfall, or predicted price movements. These algorithms can operate at the security or portfolio level, for example, to implement multileg derivative or cross-asset trades. Let's now have a look at different applications in trading where ML is of key importance.

Use Cases of ML for Trading

ML extracts signals from a wide range of market, fundamental, and alternative data, and can be applied at all steps of the algorithmic trading-strategy process. Key applications include:

- Data mining to identify patterns and extract features
- Supervised learning to generate risk factors or alphas and create trade ideas
- Aggregation of individual signals into a strategy
- Allocation of assets according to risk profiles learned by an algorithm
- The testing and evaluation of strategies, including through the use of synthetic data
- The interactive, automated refinement of a strategy using reinforcement learning

Supervised learning for alpha factor creation and aggregation

The main rationale for applying ML to trading is to obtain predictions of asset fundamentals, price movements, or market conditions. A strategy can leverage multiple ML algorithms that build on each other. Downstream models can generate signals at the portfolio level by integrating predictions about the prospects of individual assets, capital market expectations, and the correlation among securities. Alternatively, ML predictions can inform discretionary trades, as in the quantamental approach outlined above. ML predictions can also target specific risk factors, such as value or volatility, or implement technical approaches, such as trend following or mean reversion.

Asset allocation

ML has been used to allocate portfolios based on decision-tree models that compute a hierarchical form of risk parity. As a result, risk characteristics are driven by patterns in asset prices rather than by asset classes, and achieve superior risk-return characteristics.

Testing trade ideas

Backtesting is a critical step in selecting successful algorithmic trading strategies. Cross-validation using synthetic data is a key ML technique to generate reliable out-of-sample results when combined with appropriate methods to correct for multiple testing.
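To make the backtesting point concrete, below is a minimal walk-forward validation sketch using scikit-learn (an illustration, not the book's code), in which every test fold lies strictly after its training fold so the model never peeks at future data; the synthetic features and labels stand in for a time-ordered dataset:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))   # hypothetical time-ordered features
y = rng.normal(size=500)         # hypothetical next-period returns

scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = Ridge().fit(X[train_idx], y[train_idx])   # fit only on the past
    preds = model.predict(X[test_idx])                # evaluate on the future
    scores.append(mean_squared_error(y[test_idx], preds))

print(f"Mean out-of-sample MSE: {np.mean(scores):.4f}")
```

The same splitting discipline applies to any model; what matters is that training data always precedes the data used for evaluation.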
The time series nature of financial data requires modifications to the standard approach to avoid look-ahead bias or otherwise contaminate the data used for training, validation, and testing. In addition, the limited availability of historical data has given rise to alternative approaches that use synthetic data.

Reinforcement learning

Trading takes place in a competitive, interactive marketplace. Reinforcement learning aims to train agents to learn a policy function based on rewards.

In this article, we briefly discussed how ML has become a key ingredient for different stages of algorithmic trading strategies. If you want to learn more about trading strategies that use ML, be sure to check out the book 'Hands-On Machine Learning for Algorithmic Trading'.

Using machine learning for phishing domain detection [Tutorial]
Anatomy of an automated machine learning algorithm (AutoML)
10 machine learning algorithms every engineer needs to know

The 10 best graphics and rendering tools for game developers

Raka Mahesa
22 Oct 2017
5 min read
While it's true that a tool is only as good as its user, there's also another saying, that a good carpenter should sharpen the axe before chopping down a tree. So yes, effective tools matter, whether it's for something physical like carpentry or something digital like video game development. And that's why in this post we're going to be talking about the best graphics and rendering tools for game development.

Before we continue though, let's take a moment to discuss what counts as a graphics and rendering tool, exactly. For starters, game engines and frameworks are not going to be included in this list. Yes, that software is used to render stuff, but they are game creation tools, not tools specifically for graphics. General image editors and 3D editors are also not going to be included here, because those tools are meant to be general and not specifically tailored for video game development.

What are the best graphics and rendering tools?

So, with that out of the way, let's start listing the very best graphics and rendering tools. We will start with tools that are specific to 2D games, then tools that are for 3D games, and lastly, tools that can be used for either 2D or 3D games.

Aseprite

Aseprite is an image editor geared specifically toward pixel art. It has various features to make creating pixel art sprites easier, like a color palette editor, a pixel-perfect pencil tool, a frame-based animation editor, and a smart image rotation algorithm that avoids pixel distortion. And of course, it has the usual features of a modern image editor, like layer and transparency control.

Spine

Spine, in short, is a tool for creating 2D skeletal animation specifically for games. By using skeletal animation, artists no longer need to create animation frame by frame, and can simply animate the required part. So instead of making 10 full images of a character walking, an artist just needs to move the body part images to the desired positions to create a walking animation. That animation can then be exported in JSON format and used in a game engine.

Enlighten

Enlighten is a tool that can be integrated into a game to provide real-time, physics-based lighting. Physics-based lighting is usually not used in video games because it is slow to compute; however, Enlighten manages to approximate this lighting system with a much faster calculation process. Enlighten is also the main technology behind the real-time global illumination feature that was introduced with Unity 5.

SpeedTree

Foliage has always been one of the hardest things to achieve in 3D rendering. Fortunately, we have SpeedTree now, which is a tool that enables video games to render vegetation easily. SpeedTree provides a vegetation modeling tool that allows developers to quickly create 3D trees and plants, as well as an SDK that can be integrated into a game to render vegetation beautifully and efficiently.

Substance Designer

Substance Designer is a material authoring tool, which is a tool to process and create textures for 3D objects. Using this software, game developers can decide how an object should look in game and create the appropriate textures and configuration for the object.

Umbra

3D rendering is quite a heavy task for computers, especially when a video game features a gigantic, complex environment. So, optimizing the rendering process is really important to make sure these games always run smoothly, and Umbra is a tool that can help game developers do just that.
Umbra can process a 3D scene and calculate which objects are visible and which are not, making sure the GPU only renders the necessary objects in the scene.

CrazyBump

3D objects usually have additional data that describes how a particular object should look when rendered. One piece of this additional data is the normal map, a texture that describes the orientation of an object's surface. Normal maps are important because they can make a flat surface appear rough and detailed. CrazyBump allows developers to quickly generate a normal map from a texture. So if you have a rocky texture, CrazyBump can analyze the contrast of that texture and generate the appropriate normal map.

Littera

Many video games use a technique called bitmap fonts to render text on screen. This technique uses an image containing all the letters written in a font and renders those letters to form text. Littera is a tool to generate such an image from a font. With Littera, developers can also customize the rendered font further by adding outlines or using gradients to color the letters.

STG

STG stands for Seamless Texture Generator, and, as the name implies, it's a tool that provides game developers with seamless textures that can be tiled. STG is able to process a digital photo and generate a seamless texture based on that photo. This is a really handy tool for creating realistic ground, grass, wall, and other textures that can be applied to a big surface.

TexturePacker

The last one on this list is TexturePacker, and being the last one certainly doesn't mean it's the least important, because this is a really useful tool. TexturePacker is a tool that can pack multiple images into a single texture, using the most efficient layout possible. This technique is called a texture atlas, and it's a really great thing to have in video games, because having fewer texture files reduces the rendering load and optimizes the game.

About the Author

Raka Mahesa is a game developer at Chocoarts (http://chocoarts.com/), who is interested in digital technology in general. Outside of work hours, he likes to work on his own projects, with Corridoom VR being his latest released game. Raka also regularly tweets as @legacy99.

How Artificial Intelligence and Machine Learning can turbocharge a Game Developer's career

Guest Contributor
06 Sep 2018
7 min read
Gaming, whether board games or games set in the virtual realm, has been a massively popular form of entertainment since time immemorial. In the pursuit of creating more sophisticated, thrilling, and intelligent games, game developers have delved into ML and AI technologies to fuel innovation in the gaming sphere. The gaming domain is the ideal experimentation bed for evolving technologies because not only do games put up complex and challenging problems for ML and AI to solve, they also serve as a ground for creativity, a meeting ground for machine learning and the art of interaction.

Machine Learning and Artificial Intelligence in Gaming

The reliance on AI for gaming is not a recent development. In fact, it dates back to 1949, when the famous cryptographer and mathematician Claude Shannon made his musings public about how a supercomputer could be made to master Chess. Then again, in 1952, a graduate student in the UK developed an AI that could play tic-tac-toe with ultimate perfection.

However, it isn't just ML and AI that are progressing through experimentation on games. Game development, too, has benefited a great deal from these pioneering technologies. AI and ML have helped enhance the gaming experience on many fronts, such as game design, the interactive quotient, as well as the inner functionalities of games. These AI use cases focus on two primary things: one is to impart enhanced realism to the virtual gaming environment, and the second is to create a more naturalistic interface between the gaming environment and the players.

As of now, the focus of game developers, data scientists, and ML researchers lies in two specific categories of the gaming domain: games of perfect information and games of imperfect information. In games of perfect information, a player is aware of all the aspects of the game throughout the playing session, whereas, in games of imperfect information, players are oblivious to specific aspects of the game. When it comes to games of perfect information such as Chess and Go, AI has shown various instances of overpowering human intelligence. Back in 1997, IBM's Deep Blue successfully defeated world Chess champion Garry Kasparov in a six-game match. In 2016, Google's AlphaGo emerged as the victor in a Go match, scoring 4-1 after defeating South Korean Go champion Lee Sedol. One of the most advanced chess AIs developed yet, Stockfish, uses a combination of advanced heuristics and brute force to compute numeric values for each and every move in a specific position in Chess. It also effectively eliminates bad moves using the alpha-beta pruning search algorithm.

While the progress and contribution of AI and ML to the field of games of perfect information is laudable, researchers are now intrigued by games of imperfect information. Games of imperfect information offer much more challenging situations that are essentially difficult for machines to learn and master. Thus, the next evolution in the world of gaming will be to create spontaneous gaming environments using AI technology, in which developers will build only the gaming environment and its mechanics instead of creating a game with pre-programmed/scripted plots. In such a scenario, the AI will have to confront and solve spontaneous challenges with personalized scenarios generated on the spot. Games like StarCraft and StarCraft II have stirred up massive interest among game researchers and developers.
In these games, the players are only partially aware of the gaming aspects, and the game is largely determined not just by the AI moves and the previous state of the game, but also by the moves of other players. Since in these games you have little knowledge about your rival's moves, you have to take decisions on the go and your moves have to be spontaneous. The recent win of OpenAI Five over amateur human players in Dota 2 is a good case in point. OpenAI Five is a team of five neural networks that leverages an advanced version of Proximal Policy Optimization and uses a separate LSTM to learn identifiable strategies. The progress of OpenAI Five shows that even without human data, reinforcement learning can facilitate long-term planning, thus allowing us to make further progress in games of imperfect information.

Career in Game Development With ML and AI

As ML and AI continue to penetrate the gaming industry, they are creating a huge demand for talented and skilled game developers who are well-versed in these technologies. Today, game development is at a place where it's no longer necessary to build games using time-consuming manual techniques. ML and AI have made the task of game developers easier: by leveraging these technologies, they can design and build innovative gaming environments and test them automatically. The integration of AI and ML in the gaming domain is giving birth to new job positions like Gameplay Software Engineer (AI), Gameplay Programmer (AI), and Game Security Data Scientist, to name a few. The salaries of traditional game developers are in stark contrast with those of developers who have AI/ML skills. While the average salary of game developers is usually around $44,000, it can scale up to and over $120,000 if one possesses AI/ML skills.

Gameplay Engineer
Average salary: $73,000 - $116,000
Gameplay engineers are usually part of the core game dev team and are entrusted with the responsibility of enhancing the existing gameplay systems to enrich the player experience. Companies today demand gameplay engineers who are proficient in C/C++ and well-versed in AI/ML technologies.

Gameplay Programmer
Average salary: $98,000 - $149,000
Gameplay programmers work in close collaboration with the production and design teams to develop cutting-edge features in existing and upcoming gameplay systems. Programming skills are a must, and knowledge of AI/ML technologies is an added bonus.

Game Security Data Scientist
Average salary: $73,000 - $106,000
The role of a game security data scientist is to combine both security and data science approaches to detect anomalies and fraudulent behavior in games. This calls for a high degree of expertise in AI, ML, and other statistical methods.

With impressive salaries and exciting job opportunities cropping up fast in the game development sphere, the industry is attracting some major talent. Game developers and software developers around the world are choosing the field due to the promise of rapid career growth. If you wish to bag better and more challenging roles in the domain of game development, you should definitely try to upskill your talent and knowledge base by mastering the fields of ML and AI. Packt Publishing is the leading UK provider of Technology eBooks, Coding eBooks, Videos and Blogs, helping IT professionals to put software to work. It offers several books and videos on game development with AI and machine learning. It's never too late to learn new disciplines and expand your knowledge base.
There are numerous online platforms that offer great artificial intelligence courses. The perk of learning from a registered online platform is that you can learn and grow at your own pace and according to your convenience. So, enroll yourself in one and spice up your career in game development!

About the Author

Abhinav Rai is a Data Analyst at UpGrad, an online education platform providing industry-oriented programs in collaboration with world-class institutes, some of which are MICA, IIIT Bangalore, and BITS, as well as various industry leaders including MakeMyTrip, Ola, and Flipkart.

Best game engines for AI game development
Implementing Unity game engine and assets for 2D game development [Tutorial]
How to use arrays, lists, and dictionaries in Unity for 3D game development

DeOldify: Colorising and restoring B&W images and videos using a NoGAN approach

Savia Lobo
17 May 2019
5 min read
Wouldn't it be magical if we could watch old black and white movie footage and images in color? Deep learning, and more precisely GANs, can help here. A recent project by software researcher Jason Antic, tagged 'DeOldify', is a deep learning-based project for colorizing and restoring old images and film footage.

https://twitter.com/johnbreslin/status/1127690102560448513
https://twitter.com/johnbreslin/status/1129360541955366913

In one of the sessions at the recent Facebook Developer Conference, held from April 30 to May 1, 2019, Antic, along with Jeremy Howard and Uri Manor, talked about how GANs can be used to reconstruct images and videos, such as increasing their resolution or adding color to a black and white film. However, they also pointed out that GANs can be slow, and difficult and expensive to train. They demonstrated how to colorize old black and white movies and drastically increase the resolution of microscopy images using new PyTorch-based tools from fast.ai, the Salk Institute, and DeOldify that can be trained in just a few hours on a single GPU.

https://twitter.com/citnaj/status/1123748626965114880

DeOldify makes use of NoGAN training, which combines the benefits of GAN training (wonderful colorization) while eliminating its nasty side effects (like flickering objects in the video). NoGAN training is crucial for getting stable and colorful images and videos. An example of DeOldify working towards a stable video is shown in the project repository (Source: GitHub). Antic said, "the video is rendered using isolated image generation without any sort of temporal modeling tacked on. The process performs 30-60 minutes of the GAN portion of "NoGAN" training, using 1% to 3% of Imagenet data once. Then, as with still image colorization, we "DeOldify" individual frames before rebuilding the video."

The three models in DeOldify

DeOldify includes three models: video, stable, and artistic. Each of the models has its strengths and weaknesses, and its own use cases. The Video model is for video and the other two are for images.

Stable

https://twitter.com/johnbreslin/status/1126733668347564034

This model achieves the best results with landscapes and portraits and produces fewer "zombies" (where faces or limbs stay gray rather than being colored in properly). It generally has fewer unusual miscolorations than artistic, but it is also less colorful in general. This model uses a resnet101 backbone on a UNet with an emphasis on the width of layers on the decoder side. It was trained with 3 critic pretrain/GAN cycle repeats via NoGAN, in addition to the initial generator/critic pretrain/GAN NoGAN training, at 192px. This adds up to a total of 7% of Imagenet data trained once (3 hours of direct GAN training).

Artistic

https://twitter.com/johnbreslin/status/1129364635730272256

This model achieves the highest-quality results in image coloration with respect to interesting details and vibrance. However, to achieve this, one has to adjust the rendering resolution, or render_factor. Additionally, the model does not do as well as 'stable' in a few key common scenarios: nature scenes and portraits. The artistic model uses a resnet34 backbone on a UNet with an emphasis on the depth of layers on the decoder side. It was trained with 5 critic pretrain/GAN cycle repeats via NoGAN, in addition to the initial generator/critic pretrain/GAN NoGAN training, at 192px. This adds up to a total of 32% of Imagenet data trained once (12.5 hours of direct GAN training).
Video

https://twitter.com/citnaj/status/1124719757997907968

The Video model is optimized for smooth, consistent and flicker-free video. It is the least colorful of the three models, while being close to 'stable'. In terms of architecture, this model is the same as 'stable'; however, it differs in training. It is trained on a mere 2.2% of Imagenet data once at 192px, using only the initial generator/critic pretrain/GAN NoGAN training (1 hour of direct GAN training).

DeOldify was achieved by combining several approaches, including:

Self-Attention Generative Adversarial Network: Here, Antic has modified the generator, a pre-trained U-Net, to use spectral normalization and self-attention (a short illustrative sketch follows at the end of this article).
Two Time-Scale Update Rule: This is simply one-to-one generator/critic iterations with a higher critic learning rate. It is modified to incorporate a "threshold" critic loss that makes sure the critic has "caught up" before moving on to generator training. This is particularly useful for the "NoGAN" method.
NoGAN: This doesn't have a separate research paper. It is, in fact, a new type of GAN training developed to solve some key problems in the previous DeOldify model. NoGAN provides the benefits of GAN training while spending minimal time doing direct GAN training.

Antic says, "I'm looking to make old photos and film look reeeeaaally good with GANs, and more importantly, make the project useful." "I'll be actively updating and improving the code over the foreseeable future. I'll try to make this as user-friendly as possible, but I'm sure there's going to be hiccups along the way", he further added. To know more about the hardware components and other details, head over to Jason Antic's GitHub page.

Training Deep Convolutional GANs to generate Anime Characters [Tutorial]
Sherin Thomas explains how to build a pipeline in PyTorch for deep learning workflows
Using deep learning methods to detect malware in Android Applications
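If you are curious what a self-attention block with spectral normalization looks like in practice, here is a minimal PyTorch sketch of a SAGAN-style block. This is our own illustration of the general technique, not code from the DeOldify repository; the channel size and the usage example at the bottom are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import spectral_norm

class SelfAttention2d(nn.Module):
    """SAGAN-style self-attention over the spatial positions of a feature map."""
    def __init__(self, channels):
        super().__init__()
        # 1x1 convs produce query/key/value projections; spectral_norm keeps the
        # Lipschitz constant of each projection in check, as in the SAGAN paper.
        self.query = spectral_norm(nn.Conv2d(channels, channels // 8, 1))
        self.key = spectral_norm(nn.Conv2d(channels, channels // 8, 1))
        self.value = spectral_norm(nn.Conv2d(channels, channels, 1))
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2)                 # (b, c//8, h*w)
        k = self.key(x).flatten(2)                   # (b, c//8, h*w)
        v = self.value(x).flatten(2)                 # (b, c,    h*w)
        attn = F.softmax(torch.bmm(q.transpose(1, 2), k), dim=-1)   # (b, hw, hw)
        out = torch.bmm(v, attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                  # residual connection

# Hypothetical usage: drop the block into a decoder stage of a U-Net-like generator.
block = SelfAttention2d(64)
features = torch.randn(2, 64, 32, 32)
print(block(features).shape)  # torch.Size([2, 64, 32, 32])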

5 JavaScript machine learning libraries you need to know

Pravin Dhandre
08 Jun 2018
3 min read
Technologies like machine learning, predictive analytics, natural language processing and artificial intelligence are among the most trending and innovative technologies of the 21st century. Whether it is enterprise software or a simple photo-editing application, more and more products are backed by and rooted in machine learning, making them smart enough to be genuinely helpful to their users. Until now, the tools and frameworks capable of running machine learning were mostly developed in languages like Python, R and Java. Recently, however, the web ecosystem has picked up machine learning and is bringing this transformation to web applications. In this article, we will look at the most useful and popular libraries for performing machine learning in your browser, without the need for extra software, compilers, installations or GPUs.

TensorFlow.js

GitHub: 7.5k+ stars

With the growing popularity of TensorFlow among machine learning and deep learning enthusiasts, Google recently released TensorFlow.js, the JavaScript version of TensorFlow. With this library, JavaScript developers can train and deploy their machine learning models faster in the browser without much hassle. The library is fast, flexible, scalable and a great way to get a practical taste of machine learning. With TensorFlow.js, importing existing models and retraining a pretrained model is a piece of cake. To check out examples of tensorflow.js, visit the GitHub repository.

ConvNetJS

GitHub: 9k+ stars

ConvNetJS provides a neural network implementation in JavaScript, with numerous demos of neural networks available in its GitHub repository. The framework has a good number of active followers among programmers and coders. The library supports various neural network modules and popular machine learning techniques such as classification and regression. Developers who are interested in getting reinforcement learning onto the browser, or in training complex convolutional networks, can visit the ConvNetJS official page.

Brain.js

GitHub: 8k+ stars

Brain.js is another addition to the web development ecosystem that brings smart features onto the browser with just a few lines of code. Using Brain.js, one can easily create simple neural networks and develop smart functionality in browser applications without much complexity. It is already preferred by web developers for client-side applications like in-browser games, ad placement, or character recognition. You can check out its GitHub repository to see a complete demonstration of approximating the XOR function using brain.js.

Synaptic

GitHub: 6k+ stars

Synaptic is a well-liked machine learning library for training recurrent neural networks, as it has a built-in, architecture-free generalized algorithm. A few of the built-in architectures include multilayer perceptrons, LSTM networks and Hopfield networks. With Synaptic, you can develop various in-browser applications such as Paint an Image, Learn Image Filters, Self-Organizing Map or Reading from Wikipedia.

Neurojs

GitHub: 4k+ stars

Another recently developed framework, built especially for reinforcement learning tasks in your browser, is neurojs. It mainly focuses on Q-learning, but can be used for any type of neural-network-based task, whether it is building a browser game or an autonomous driving application. Some of the exciting features this library offers are a full-stack neural network implementation, extended support for reinforcement learning tasks, import/export of weight configurations and many more.
To see the complete list of features, visit the GitHub page. How should web developers learn machine learning? NVIDIA open sources NVVL, library for machine learning training Build a foodie bot with JavaScript

Understanding Sentiment Analysis and other key NLP concepts

Sunith Shetty
20 Dec 2017
12 min read
Note: This article is an excerpt taken from the book Big Data Analytics with Java written by Rajat Mehta. This book will help you learn to perform big data analytics tasks using machine learning concepts such as clustering, recommending products, data segmentation and more.

In this post, you will learn what sentiment analysis is and how it is used to analyze the emotions expressed in text. You will also learn key NLP concepts such as tokenization and stemming, among others, and how they are used for sentiment analysis.

What is sentiment analysis?

One form of text analysis is sentiment analysis. As the name suggests, this technique is used to figure out the sentiment or emotion associated with the underlying text. So if you have a piece of text and you want to understand what kind of emotion it conveys, for example anger, love, hate, positive, negative, and so on, you can use sentiment analysis. Sentiment analysis is used in various places, for example:

To analyze the reviews of a product, whether they are positive or negative. This can be especially useful to predict how successful your new product is by analyzing user feedback.
To analyze the reviews of a movie to check if it's a hit or a flop.
Detecting the use of bad language (such as heated language, negative remarks, and so on) in forums, emails, and social media.
To analyze the content of tweets or information on other social media to check if a political party's campaign was successful or not.

Thus, sentiment analysis is a useful technique, but before we see the code for our sample sentiment analysis example, let's understand some of the concepts needed to solve this problem.

Note: For working on a sentiment analysis problem we will be using some techniques from natural language processing, and we will explain some of those concepts here.

Concepts for sentiment analysis

Before we dive into the fully-fledged problem of analyzing the sentiment behind text, we must understand some concepts from the NLP (Natural Language Processing) perspective.

Tokenization

From the perspective of machine learning, one of the most important tasks is feature extraction and feature selection. When the data is plain text, we need some way to extract the information out of it. We use a technique called tokenization, where tokens or words are extracted from the text content. A token can be a single word or a group of words. There are various ways to extract the tokens, as follows:

By using regular expressions: Regular expressions can be applied to textual content to extract words or tokens from it.
By using a pre-trained model: Apache Spark ships with a pre-trained model (machine learning model) that is trained to pull tokens from text. You can apply this model to a piece of text and it will return the predicted results as a set of tokens.

To understand a tokenizer with an example, let's look at a simple sentence:

Sentence: "The movie was awesome with nice songs"

Once you extract tokens from it you will get an array of strings as follows:

Tokens: ['The', 'movie', 'was', 'awesome', 'with', 'nice', 'songs']

Note: The type of tokens you extract depends on the type of tokens you are interested in.
Here we extracted single tokens, but tokens can also be groups of words, for example 'very nice', 'not good', 'too bad', and so on.

Stop words removal

Not all the words present in the text are important. Some are common words used in the English language that are needed for maintaining correct grammar, but which might not be important at all from the perspective of conveying information or emotion, for example common words such as is, was, were, the, and so on. To remove these words there are again some common techniques that you can use from natural language processing, such as:

Store stop words in a file or dictionary and compare your extracted tokens with the words in this dictionary or file. If they match, simply ignore them.
Use a pre-trained machine learning model that has been taught to remove stop words. Apache Spark ships with one such model in the Spark feature package.

Let's try to understand stop words removal using an example:

Sentence: "The movie was awesome with nice songs"

From the sentence we can see that the common words with no special meaning to convey are the, was, and with. So after applying the stop words removal program to this data you will get:

After stop words removal: ['movie', 'awesome', 'nice', 'songs']

Note: In the preceding sentence, the stop words the, was, and with are removed.

Stemming

Stemming is the process of reducing a word to its base or root form. For example, look at the set of words shown here:

car, cars, car's, cars'

From our perspective of sentiment analysis, we are only interested in the main words, or the main word that they refer to. The reason for this is that the underlying meaning of the word is the same in any case. So whether we pick car's or cars, we are referring to a car only. Hence the stem or root word for the previous set of words will be:

car, cars, car's, cars' => car (stem or root word)

For English words you can again use a pre-trained model and apply it to a set of data to figure out the stem word. Of course there are more complex and better ways (for example, you can retrain the model with more data), or you may have to use a completely different model or technique if you are dealing with languages other than English. Diving into stemming in detail is beyond the scope of this book, and we would encourage readers to check out some documentation on natural language processing from Wikipedia and the Stanford NLP website.

Note: To keep the sentiment analysis example in this book simple we will not be stemming our tokens, but we urge readers to try the same to get better predictive results.

N-grams

Sometimes a single word conveys the meaning of the context; other times a group of words conveys it better. For example, 'happy' is a word in itself that conveys happiness, but 'not happy' changes the picture completely: 'not happy' is the exact opposite of 'happy'. If we are extracting only single words then, in the example shown before, 'not' and 'happy' would be two separate words and the entire sentence might be classified as positive by the classifier. However, if the classifier picks bi-grams (that is, two words in one token), then it would be trained with 'not happy' and would classify similar sentences containing 'not happy' as 'negative'.
Therefore, for training our models we can use uni-grams, bi-grams (where we have two words per token), or, as the name suggests, n-grams (where we have 'n' words per token). It all depends on which token set trains our model well and improves the accuracy of its predictive results. To see examples of n-grams refer to the following table:

Sentence: The movie was awesome with nice songs
Uni-grams: ['The', 'movie', 'was', 'awesome', 'with', 'nice', 'songs']
Bi-grams: ['The movie', 'was awesome', 'with nice', 'songs']
Tri-grams: ['The movie was', 'awesome with nice', 'songs']

For the purpose of this case study we will only be looking at unigrams, to keep our example simple. By now we know how to extract words from text and remove the unwanted words, but how do we measure the importance of words, or the sentiment that originates from them? There are a few popular approaches for this, and we will now discuss two of them.

Term presence and term frequency

Term presence just means that if the term is present we mark the value as 1, or else 0. Later we build a matrix out of it where the rows represent the words and the columns represent each sentence. This matrix is later used for text analysis by feeding its content to a classifier. Term frequency, as the name suggests, just depicts the count or number of occurrences of the word or token within the document. Let's refer to the example in the following table where we find the term frequency:

Sentence: The movie was awesome with nice songs and nice dialogues.
Tokens (unigrams only for now): ['The', 'movie', 'was', 'awesome', 'with', 'nice', 'songs', 'and', 'dialogues']
Term frequency: ['The = 1', 'movie = 1', 'was = 1', 'awesome = 1', 'with = 1', 'nice = 2', 'songs = 1', 'dialogues = 1']

As seen in the preceding table, the word 'nice' is repeated twice in the sentence and hence it will get more weight in determining the opinion expressed by the sentence. Plain term frequency is not a precise approach for the following reasons:

There could be some redundant, irrelevant words, for example the, it, and they, that might have a large frequency or count and might impact the training of the model.
There could be some important rare words that convey the sentiment of the document, yet their frequency might be low and hence they might not have much impact on the training of the model.

For these reasons, a better approach, TF-IDF, is chosen, as shown in the next section.

TF-IDF

TF-IDF stands for Term Frequency and Inverse Document Frequency, and in simple terms it measures the importance of a term to a document. It works in two simple steps, as follows:

It counts the number of occurrences of a term in the document. The more often a term occurs, the more important it is to the document.
Counting just the frequency of words in a document is not a very precise way to find the importance of the words. The simple reason for this is that there could be too many stop words with high counts, so their importance might get elevated above the importance of real, informative words. To fix this, TF-IDF checks for the availability of these stop words in other documents as well. If the words appear in other documents in large numbers, it means these words could be grammatical words such as they, for, is, and so on, and TF-IDF decreases their importance or weight.
Let's try to understand TF-IDF using the following figure: As seen in the preceding figure, doc-1, doc-2, and so on are the documents from which we extract the tokens or words and then from those words we calculate the TF-IDFs. Words that are stop words or regular words such as for , is, and so on, have low TF-IDFs, while words that are rare such as 'awesome movie' have higher TF-IDFs. TF-IDF is the product of Term Frequency and Inverse document frequency. Both of them are explained here: Term Frequency: This is nothing but the count of the occurrences of the words in the document. There are other ways of measuring this, but the simplistic approach is to just count the occurrences of the tokens. The simple formula for its calculation is:      Term Frequency = Frequency count of the tokens Inverse Document Frequency: This is the measure of how much information the word provides. It scales up the weight of the words that are rare and scales down the weight of highly occurring words. The formula for inverse document frequency is: TF-IDF: TF-IDF is a simple multiplication of the Term Frequency and the Inverse Document Frequency. Hence: This simple technique is very popular and it is used in a lot of places for text analysis. Next let's look into another simple approach called bag of words that is used in text analytics too. Bag of words As the name suggests, bag of words uses a simple approach whereby we first extract the words or tokens from the text and then push them in a bag (imaginary set) and the main point about this is that the words are stored in the bag without any particular order. Thus the mere presence of a word in the bag is of main importance and the order of the occurrence of the word in the sentence as well as its grammatical context carries no value. Since the bag of words gives no importance to the order of words you can use the TF-IDFs of all the words in the bag and put them in a vector and later train a classifier (naïve bayes or any other model) with it. Once trained, the model can now be fed with vectors of new data to predict on its sentiment. Summing it up, we have got you well versed with sentiment analysis techniques and NLP concepts in order to apply sentimental analysis. If you want to implement machine learning algorithms to carry out predictive analytics and real-time streaming analytics you can refer to the book Big Data Analytics with Java.    

A serverless online store on AWS could save you money. Build one.

Savia Lobo
14 Jun 2018
9 min read
In this article you will learn to build an entire serverless project for an AWS online store, beginning with a React SPA frontend hosted on AWS, followed by a serverless backend with API Gateway and Lambda functions. This article is an excerpt taken from the book Building Serverless Web Applications written by Diego Zanon. In this book, you will be introduced to the AWS services, and you'll learn how to estimate costs and how to set up and use the Serverless Framework.

The serverless architecture of AWS' online store

We will build a real-world use case of a serverless solution. This sample application is an online store with the following requirements:

A list of available products
Product details with user ratings
Adding products to a shopping cart
Account creation and login pages

For a better understanding of the architecture, take a look at the diagram in the book, which gives a general view of how the different services are organized and how they interact.

Estimating costs

In this section, we will estimate the costs of our sample application demo based on some usage assumptions and Amazon's pricing model. All pricing values used here are from mid-2017 and consider the cheapest region, US East (Northern Virginia). This section covers an example to illustrate how costs are calculated. Since the billing model and prices can change over time, always refer to the official sources to get updated prices before making your own estimations. You can use Amazon's calculator, which is accessible at this link: http://calculator.s3.amazonaws.com/index.html. If you still have any doubts after reading the instructions, you can always contact Amazon's support for free to get commercial guidance.

Assumptions

For our pricing example, we can assume that our online store will receive the following traffic per month:

100,000 page views
1,000 registered user accounts
200 GB of data transferred, considering an average page size of 2 MB
5,000,000 code executions (Lambda functions) with an average of 200 milliseconds per request

Route 53 pricing

We need a hosted zone for our domain name and it costs US$ 0.50 per month. Also, we need to pay US$ 0.40 per million DNS queries to our domain. As this is a prorated cost, 100,000 page views will cost only US$ 0.04.

Total: US$ 0.54

S3 pricing

Amazon S3 charges US$ 0.023 per GB/month stored, US$ 0.004 per 10,000 requests to your files, and US$ 0.09 per GB transferred. However, as we are using CloudFront, transfer costs will be charged at CloudFront prices and will not be considered in the S3 bill. If our website occupies less than 1 GB of static files and has an average page of 2 MB made up of 20 files, we can serve 100,000 page views for less than US$ 20. Considering CloudFront, S3 costs go down to US$ 0.82, while the CloudFront usage is paid for in its own section below. Real costs would be even lower because CloudFront caches files and would not need to make 2,000,000 file requests to S3, but let's skip this detail to reduce the complexity of this estimation. On a side note, the cost would be much higher if you had to provision machines to handle this number of page views for a static website with the same availability and scalability.

Total: US$ 0.82

CloudFront pricing

CloudFront is slightly more complicated to price, since you need to estimate how much traffic comes from each region, as regions are priced differently.
The following table shows an example estimation:

Region - Estimated traffic - Cost per GB transferred - Cost per 10,000 HTTPS requests
North America - 70% - US$ 0.085 - US$ 0.010
Europe - 15% - US$ 0.085 - US$ 0.012
Asia - 10% - US$ 0.140 - US$ 0.012
South America - 5% - US$ 0.250 - US$ 0.022

As we have estimated 200 GB of files transferred with 2,000,000 requests, the total will be US$ 21.97.

Total: US$ 21.97

Certificate Manager pricing

Certificate Manager provides SSL/TLS certificates for free. You only need to pay for the AWS resources you create to run your application.

IAM pricing

There is no charge specifically for IAM usage. You will be charged only for the AWS resources your users consume.

Cognito pricing

Each user has an associated profile that costs US$ 0.0055 per month. However, there is a permanent free tier that allows 50,000 monthly active users without charges, which is more than enough for our use case. Besides that, we are charged for Cognito Sync operations on our user profiles. This costs US$ 0.15 for each 10,000 sync operations and US$ 0.15 per GB/month stored. If we estimate 1,000 active and registered users with less than 1 MB per profile and fewer than 10 visits per month on average, we can estimate a charge of US$ 0.30.

Total: US$ 0.30

IoT pricing

IoT charges start at US$ 5 per million messages exchanged. As each page view will make at least two requests, one to connect and another to subscribe to a topic, we can estimate a minimum of 200,000 messages per month. We need to add 1,000 messages if we suppose that 1% of the users will rate the products, and we can ignore other requests like disconnect and unsubscribe because they are excluded from billing. In this setting, the total cost would be US$ 1.01.

Total: US$ 1.01

SNS pricing

We will use SNS only for internal notifications, when CloudWatch triggers a warning about issues in our infrastructure. SNS charges US$ 2.00 per 100,000 e-mail messages, but it offers a permanent free tier of 1,000 e-mails. So, it will be free for us.

CloudWatch pricing

CloudWatch charges US$ 0.30 per metric/month and US$ 0.10 per alarm, and offers a permanent free tier of 50 metrics and 10 alarms per month. If we create 20 metrics and expect 20 alarms in a month, we can estimate a cost of US$ 1.00.

Total: US$ 1.00

API Gateway pricing

API Gateway starts charging US$ 3.50 per million API calls received and US$ 0.09 per GB transferred out to the Internet. If we assume 5 million requests per month, with each response averaging 1 KB, the total cost of this service will be US$ 17.93.

Total: US$ 17.93

Lambda pricing

When you create a Lambda function, you need to configure the amount of RAM that will be available for use. It ranges from 128 MB to 1.5 GB. Allocating more memory means additional costs. This breaks the philosophy of avoiding provisioning, but at least it's the only thing you need to worry about. The good practice here is to estimate how much memory each function needs and run some tests before deploying to production. A bad provision may result in errors or higher costs. Lambda has the following billing model:

US$ 0.20 per 1 million requests
US$ 0.00001667 per GB-second

Running time is counted in fractions of seconds, rounding up to the nearest multiple of 100 milliseconds. Furthermore, there is a permanent free tier that gives you 1 million requests and 400,000 GB-seconds per month without charges. In our use case scenario, we have assumed 5 million requests per month with an average of 200 milliseconds per execution.
We can also assume that the allocated RAM is 512 MB per function:

Request charges: Since 1 million requests are free, you pay for 4 million, which will cost US$ 0.80.
Compute charges: Here, 5 million executions of 200 milliseconds each give us 1 million seconds. As we are running with a 512 MB capacity, this results in 500,000 GB-seconds, of which 400,000 GB-seconds are free, leaving a charge for 100,000 GB-seconds that costs US$ 1.67.

Total: US$ 2.47

SimpleDB pricing

Take a look at the following SimpleDB billing, where the free tier is valid for new and existing users:

US$ 0.14 per machine-hour (25 hours free)
US$ 0.09 per GB transferred out to the Internet (1 GB is free)
US$ 0.25 per GB stored (1 GB is free)

Take a look at the following charges:

Compute charges: Considering 5 million requests with an average of 200 milliseconds of execution time, where 50% of this time is spent waiting for the database engine to execute, we estimate 139 machine-hours per month. Discounting 25 free hours, we have an execution cost of US$ 15.96.
Transfer costs: Since we'll transfer data between SimpleDB and AWS Lambda, there is no transfer cost.
Storage charges: If we assume a 5 GB database, this results in US$ 1.00, since 1 GB is free.

Total: US$ 16.96, but this will not be added to the final estimation since we will run our application using DynamoDB.

DynamoDB

DynamoDB requires you to provision the throughput capacity that you expect your tables to offer. Instead of provisioning hardware, memory, CPU, and other factors, you need to say how many read and write operations you expect, and AWS will handle the necessary machine resources to meet your throughput needs with consistent, low-latency performance. One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for objects up to 4 KB in size. Regarding write capacity, one unit means that you can write one object of up to 1 KB per second. Considering these definitions, AWS offers a permanent free tier of 25 read units and 25 write units of throughput capacity, in addition to 25 GB of free storage. It charges as follows:

US$ 0.47 per month for every Write Capacity Unit (WCU)
US$ 0.09 per month for every Read Capacity Unit (RCU)
US$ 0.25 per GB/month stored
US$ 0.09 per GB transferred out to the Internet

Since our estimated database will have only 5 GB, we are within the free tier, and we will not pay for transferred data because there is no transfer cost to AWS Lambda. Regarding read/write capacities, we have estimated 5 million requests per month. If we distribute them evenly, we get two requests per second. In this case, we will consider that this is one read and one write operation per second. We now need to estimate how many objects are affected by each read and write operation. For a write operation, we can estimate that we will manipulate 10 items on average, and a read operation will scan 100 objects. In this scenario, we would need to reserve 10 WCU and 100 RCU. As we have 25 WCU and 25 RCU for free, we only need to pay for 75 RCU per month, which costs US$ 6.75.

Total: US$ 6.75

Total pricing

Let's summarize the cost of each service in the following table:

Service - Monthly cost
Route 53 - US$ 0.54
S3 - US$ 0.82
CloudFront - US$ 21.97
Cognito - US$ 0.30
IoT - US$ 1.01
CloudWatch - US$ 1.00
API Gateway - US$ 17.93
Lambda - US$ 2.47
DynamoDB - US$ 6.75
Total - US$ 52.79

It results in a total cost of roughly US$ 50 per month in infrastructure to serve 100,000 page views.
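As a quick sanity check on the arithmetic above, here is a small Python sketch that reproduces the Lambda estimate from the stated assumptions. The prices and free-tier figures are the mid-2017 values quoted in this excerpt, so treat them as illustrative rather than current.

# Mid-2017 Lambda pricing used in this estimation (illustrative only).
PRICE_PER_MILLION_REQUESTS = 0.20     # USD, after 1M free requests
PRICE_PER_GB_SECOND = 0.00001667      # USD, after 400,000 free GB-seconds
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000

requests = 5_000_000                  # monthly invocations
avg_duration_s = 0.2                  # 200 ms per execution
memory_gb = 0.5                       # 512 MB allocated per function

request_cost = max(requests - FREE_REQUESTS, 0) / 1_000_000 * PRICE_PER_MILLION_REQUESTS
gb_seconds = requests * avg_duration_s * memory_gb
compute_cost = max(gb_seconds - FREE_GB_SECONDS, 0) * PRICE_PER_GB_SECOND

print(f"Request charges: US$ {request_cost:.2f}")                  # US$ 0.80
print(f"Compute charges: US$ {compute_cost:.2f}")                  # US$ 1.67
print(f"Lambda total:    US$ {request_cost + compute_cost:.2f}")   # US$ 2.47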
If you have a conversion rate of 1%, you get 1,000 sales per month, which means that you pay about US$ 0.05 in infrastructure for each product that you sell. Thus, in this article you learned about the serverless architecture of an AWS online store and how to estimate its costs. If you've enjoyed reading the excerpt, do check out Building Serverless Web Applications to monitor the performance, efficiency and errors of your apps, and to learn how to test and deploy your applications.

Google Compute Engine Plugin makes it easy to use Jenkins on Google Cloud Platform
Serverless computing wars: AWS Lambdas vs Azure Functions
Using Amazon Simple Notification Service (SNS) to create an SNS topic

Top 5 penetration testing tools for ethical hackers

Vijin Boricha
27 Apr 2018
5 min read
Software systems are vulnerable. That's down to a range of things, from the constant changes our software systems undergo to the extent of the opportunities for criminals to take advantage of the gaps and vulnerabilities within these systems. Fortunately, penetration testers - or ethical hackers - are a vital line of defence. Yes, you need to properly understand the nature of cyber security threats before you take steps to tackle them, but penetration testing tools are the next step towards securing your software. There's a famous saying from Stephane Nappo that sums up cyber security today:

It takes 20 years to build a reputation and few minutes of cyber-incident to ruin it.

So, make sure you have the right people with the right penetration testing tools to protect not only your software but your reputation too.

The most popular penetration testing tools

Kali Linux

Kali Linux is a Linux distro designed for digital forensics and penetration testing. The successor to BackTrack, it has grown in adoption to become one of the most widely used penetration testing tools. Kali Linux is based on Debian - most of its packages are imported from Debian repositories. Kali includes more than 500 preinstalled penetration testing programs that make it possible to exploit wired, wireless, and ARM devices. The recent release of Kali Linux 2018.1 supports cloud penetration testing: Kali has collaborated with some of the planet's leading cloud platforms, such as AWS and Azure, helping to change the way we approach cloud security.

Metasploit

Metasploit is another popular penetration testing framework. It was created in 2003 using Perl and was acquired by Rapid7 in 2009, by which time it had been completely rewritten in Ruby. It is a collaboration between the open source community and Rapid7, with the outcome being the Metasploit Project, well known for its anti-forensic and evasion tools. Metasploit is built around the concept of an 'exploit', code that is capable of bypassing security measures and entering vulnerable systems. Once through the security firewalls, it runs a 'payload', code that performs operations on a target machine, creating an ideal framework for penetration testing.

Wireshark

Wireshark is one of the world's primary network protocol analyzers, also popular as a packet analyzer. It was initially released as Ethereal back in 1998 and, due to trademark issues, was renamed to Wireshark in 2006. Users typically use Wireshark for network analysis, troubleshooting, and software and communication protocol development. Wireshark operates on the second to seventh layers of network protocols, and the analysis it produces is presented in a human-readable form. Security Operations Center analysts and network forensics investigators use this protocol analysis technique to analyze the bits and bytes flowing through a network. Its ease of use and the fact that it is open source make Wireshark one of the most popular packet analyzers for security professionals and network administrators, including those who want to quickly earn money as freelancers.

Burp Suite

Threats to web applications have grown in recent years. Ransomware and cryptojacking have become increasingly common techniques used by cybercriminals to attack users in the browser. Burp, or Burp Suite, is one of the most widely used graphical tools for testing web application security.
There are two versions of this tool: a paid version that includes all the functionality, and a free version that comes with a few important features. The tool ships with basic functionality that will help you with web application security checks. If you are looking at getting into web penetration testing, this should definitely be your first choice, as it works on Linux, Mac and Windows.

Nmap

Nmap, also known as Network Mapper, is a security scanner. As the name suggests, it builds a map of the network to discover hosts and services on a computer network. Nmap works by sending crafted packets to the target host and then analysing the responses. It was initially released in 1997 and has since gained a variety of features for detecting vulnerabilities and network glitches. A major reason to opt for Nmap is that it is capable of adapting to network conditions like delay and congestion during a scan.

To keep your environment protected from security threats you should take the necessary measures. There are any number of penetration testing tools out there with exceptional capabilities. The most important thing is to choose the right tool based on your environment's requirements. You can pick and choose from the tools mentioned above: they were shortlisted because they are effective, well supported, easy to understand and, most importantly, open source.

Learn some of the most important penetration testing tools in cyber security:

Kali Linux - An Ethical Hacker's Cookbook
Metasploit Penetration Testing Cookbook - Third Edition
Network Analysis using Wireshark 2 Cookbook - Second Edition

For a complete list of books and videos on this topic, check out our penetration testing products.

So you want to be a DevOps engineer

Darrell Pratt
20 Oct 2016
5 min read
The DevOps movement has come about to accomplish the long-sought-after goal of removing the barriers between the traditional development and operations organizations. Historically, development teams have written code for an application and passed that code over to the operations team to both test and deploy onto the company's servers. This practice generates many mistakes and misunderstandings in the software development lifecycle, in addition to the lack of ownership that grows amongst developers when they don't own more of the deployment pipeline and production responsibilities.

The new DevOps teams that are appearing now start as blended groups of developers, system administrators, and release engineers. The thought is that the developers can assist the operations team members in the process of building and more deeply understanding the applications, and the operations team members can shed light on the environments and deployment processes that they must master to keep the applications running. As these teams evolve, we are seeing the trend to specifically hire people into the role of the DevOps engineer. What this role is, and what type of skills you might need to succeed as a DevOps engineer, is what we will cover in this article.

The Basics

Almost every job description you are going to find for a DevOps engineer is going to require some level of proficiency in the desired production operating systems. Linux is probably the most common. You will need a very good understanding of how to administer and use a Linux-based machine. Words like grep, sed, awk, chmod, chown, ifconfig, netstat and others should not scare you. In the role of DevOps engineer, you are the go-to person for developers when they have issues with the server or cloud. Make sure that you have a good understanding of where the failure points can be in these systems and the commands that can be used to pinpoint the issues. Learn the package manager systems for the various distributions of Linux to better understand how they work under the hood. From RPM and Yum to Apt and Apk, the managers vary widely, but the common ideas are very similar in each. You should understand how to use the managers to script machine configurations and understand how modern containers are built.

Coding

The type of language you need for a DevOps role is going to depend quite a bit on the particular company. Java, C#, JavaScript, Ruby and Python are all popular languages. If you are a devout Java follower, then choosing a .NET shop might not be your best choice. Use your discretion here, but the job is going to require a working knowledge of coding in one or more languages. At a minimum, you will need to understand how the build chain of the language works, and you should be comfortable reading the error logging of the system and understanding what those logs are telling you.

Cloud Management

Gone are the days of uploading a WAR file to a directory on the server. It's very likely that you are going to be responsible for getting applications up and running on a cloud provider. Amazon Web Services is the gorilla in the space, and having good hands-on experience with the various services that make up a standard AWS deployment is a much sought-after skill set. From standard AMIs to load balancing, CloudFormation and security groups, AWS can be complicated, but luckily it is very inexpensive to experiment with, and there are many training classes on the different components.
Source Code Control

Git is the tool of choice currently for source code control. Git gives a team a decentralized SCM system that is built to handle branching and merging operations with ease. Workflows that teams use are varied, but a good understanding of how to merge branches, rebase and fix commit issues is required in the role. The DevOps engineers are usually looked to for help on addressing "interesting" Git issues, so good, hands-on experience is vital.

Automation Tooling

A new automation tool has probably been released in the time it takes to read this article. There will be new tools and platforms in this part of the DevOps space, but the most common are Chef, Puppet and Ansible. Each system provides a framework for treating the setup and maintenance of your infrastructure as code. Each system has a slightly different take on the method for writing the configurations and deploying them, but the concepts are similar, and a good background in any one of these is more often than not a requirement for any DevOps role. Each of these systems requires a good understanding of either Ruby or Python, and these languages appear quite a bit in the various tools used in the DevOps space.

A desire to improve systems and processes

While not an exhaustive list, mastering this set of skills will accelerate anyone's journey towards becoming a DevOps engineer. If you can augment these skills with a strong desire to improve upon the systems and processes that are used in the development lifecycle, you will be an excellent DevOps engineer.

About the author

Darrell Pratt is the director of software development and delivery at Cars.com, where he is responsible for a wide range of technologies that drive the Cars.com website and mobile applications. He is passionate about technology and still finds time to write a bit of code and hack on hardware projects. You can find him on Twitter here: @darrellpratt.

Looking for an alternative to Celery? Try Huey

Bálint Csergő
09 Aug 2016
5 min read
Yes, Celery is amazing in its way; it is the most commonly used "Distributed Task Queue" library, and this did not happen accidentally. But there are cases when you don't need the whole feature set offered by Celery, like multi-broker support. Let's say you just want to execute your tasks asynchronously, without adding a lot of extra bulk to your Python dependencies, and let's say you happen to have a Redis instance lying around. Then Huey might be the one for you.

What does Huey offer?

It's a lightweight task queue. Lightweight, you say? Yes, the only dependency is the Python Redis client. Yes, really. But compared to its skinny requirements list, it can do lots of amazing things. And believe me, you'll see that in this short article. We used Huey actively for our e-commerce start-up Endticket, and it turned out to be a good design decision.

Defining your Huey instances

from huey import RedisHuey

huey = RedisHuey('my-queue', host='your.redis.host')

Of course you can define multiple instances in your code. You can actually run a Huey consumer per queue, giving you control over how many consumers you want per queue, and you can also organize tasks at the queue level. So, important tasks can be put into one queue and less important ones into a different one. Easy, right?

# config.py
from huey import RedisHuey

importanthuey = RedisHuey('my-important-queue', host='your.redis.host')
lowpriohuey = RedisHuey('my-lowprio-queue', host='your.redis.host')

We have a Huey instance! But how do I define a task?

You just have to import your defined Huey instance, and use its task decorator to decorate any regular function. When you apply the task decorator to a function, every time it gets called, instead of the function being executed directly, an instance of the QueueTask class is created.

# tasks.py
from config import huey  # import the huey we instantiated in config.py

@huey.task()
def write_post(quality):
    print("I'm writing a(n) %s post" % quality)
    return True

Tasks are supposed to be called, right?

As mentioned before, decorated functions won't get executed instantly when you call them; instead a QueueTask object is created and the task gets enqueued for later execution. However, enqueueing a task involves no magic formulas and hard-to-remember syntax. You simply call the function like you would normally.

# main.py
from tasks import write_post  # import our task
from config import huey

if __name__ == '__main__':
    quality = raw_input("What quality post are you writing?")
    write_post(quality)  # QueueTask instance created and enqueued; it will be executed instantly in the worker!
    write_post.schedule(args=(quality,), delay=120)  # Task enqueued and scheduled to be executed 120 seconds later
    print('Enqueued job to write a(n) %s post' % quality)

Task execution in Huey

Ok, you successfully managed to enqueue your task, and it's resting in the Redis queue waiting to be executed, but how will the system execute it? You just have to fire up a task executor!

$ huey_consumer.py main.huey

If you did everything well in the previous steps, the executor should be running and executing the tasks you enqueue. If you have multiple queues defined, you must run a consumer for every queue.

$ huey_consumer.py main.importanthuey
$ huey_consumer.py main.lowpriohuey

If your system is distributed and there are multiple nodes for processing async tasks, that's not a problem. You can run one executor on every host; Huey can handle multiple workers for a single queue without a problem.
Periodic tasks, using Huey in a crontab-like manner

Yes, you can do this as well, and to be honest, in a simple way. You just have to use the periodic_task decorator of your Huey instance, and the system does the rest.

from huey import crontab  # helper for defining cron-style schedules
from config import huey  # import the huey we instantiated in config.py

@huey.periodic_task(crontab(minute='0', hour='3'))
def update_caches():
    update_all_cache_entries()

You just have to make sure you have at least one consumer running for the given queue, and your tasks will be executed periodically.

I want to see results, not promises!

When you enqueue a task it returns an AsyncData object:

# main.py
from tasks import write_post  # import our task
from huey.exceptions import DataStoreTimeout
from config import huey

if __name__ == '__main__':
    quality = raw_input("What quality post are you writing?")
    result = write_post.schedule(args=(quality,), delay=120)
    print(type(result))  # huey.api.AsyncData

    # Try to get the result: if the job has already finished, the result is
    # returned, else you get a None.
    result.get()

    try:
        # You can block until you get your result, and of course you can specify
        # a timeout. Be aware that if the timeout is reached and still no result
        # is available, a "huey.exceptions.DataStoreTimeout" will be raised.
        result.get(blocking=True, timeout=120)
    except DataStoreTimeout:
        print("Too slow mate :( ")

Running in production

We run Huey using Supervisord inside a Docker container. Your supervisor config should look something like this. If your queue happens to die, you can see it in the supervisor logs, and it will also get auto-restarted by supervisor. Nice and easy.

[program:my-queue]
command=bash -c "echo starting main.huey && sleep $(( $RANDOM / 2500 + 4)) && exec huey_consumer.py main.huey"
environment=PYTHONPATH=%(here)s/..
numprocs=1
process_name=%(program_name)s-%(process_num)d
stopwaitsecs=5
stdout_logfile=%(here)s/huey.log
redirect_stderr=true

About the author

Bálint Csergő is a software engineer from Budapest, currently working as an infrastructure engineer at Hortonworks. He loves Unix systems, PHP, Python, Ruby, the Oracle database, Arduino, Java, C#, music, and beer.

Are Recurrent Neural Networks capable of warping time?

Savia Lobo
07 May 2018
2 min read
'Can recurrent neural networks warp time?' is a paper authored by Corentin Tallec and Yann Ollivier, to be presented at ICLR 2018. The paper explains that plain RNNs cannot account for warpings, leaky RNNs can account for uniform time scalings but not irregular warpings, and gated RNNs can adapt to irregular warpings.

The gating mechanism of LSTMs (and GRUs) and time invariance / warping

What problem is the paper trying to solve?

In this paper, the authors prove that learnable gates in a recurrent model formally provide quasi-invariance to general time transformations in the input data. Further, the authors try to recover part of the LSTM architecture from a simple axiomatic approach. This leads to a new way of initializing gate biases in LSTMs and GRUs. Experimentally, this new chrono initialization is shown to greatly improve learning of long-term dependencies, with minimal implementation effort.

Paper summary

The authors derive the self-loop feedback gating mechanism of recurrent networks from first principles, via a postulate of invariance to time warpings. Gated connections appear to regulate the local time constants in recurrent models. With this in mind, the chrono initialization, a principled way of initializing gate biases in LSTMs, is introduced (a short initialization sketch follows at the end of this article). Experimentally, chrono initialization is shown to bring notable benefits when facing long-term dependencies.

Key takeaways

The authors show that postulating invariance to time transformations in the data (taking invariance to time warping as an axiom) necessarily leads to a gate-like mechanism in recurrent models.
The paper provides precise prescriptions on how to initialize gate biases depending on the range of time dependencies to be captured.
The empirical benefits of the new initialization have been tested on both synthetic and real-world data. The authors observed a substantial improvement with long-term dependencies, and slight gains or no change when short-term dependencies dominate.

Reviewer comments summary

Overall score: 25/30
Average score: 8

According to one reviewer, the core insight of the paper is the link between recurrent network design and its effect on how the network reacts to time transformations. This insight is simple, elegant and valuable, as per the reviewer. A minor complaint highlighted is that there is an unnecessarily large number of paragraph breaks, which makes reading slightly jarring.

Recurrent neural networks and the LSTM architecture
Build a generative chatbot using recurrent neural networks (LSTM RNNs)
How to recognize Patterns with Neural Networks in Java
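For readers who want to experiment with the chrono initialization discussed above, here is a short PyTorch sketch based on our reading of the paper; it is not code released by the authors. It assumes PyTorch's standard LSTM bias layout, where each bias vector concatenates the input, forget, cell, and output gate chunks, and it takes a hypothetical parameter t_max for the longest dependency range you expect to capture.

import torch
import torch.nn as nn

def chrono_init(lstm, t_max):
    # Chrono initialization as we understand it from Tallec & Ollivier (2018):
    # draw the forget gate bias as log(U[1, t_max - 1]) and set the input gate
    # bias to its negative, so the memory time constants cover the expected
    # range of dependencies.
    hidden = lstm.hidden_size
    for name, bias in lstm.named_parameters():
        if "bias" not in name:
            continue
        with torch.no_grad():
            bias.zero_()
            if name.startswith("bias_ih"):
                b_f = torch.log(torch.empty(hidden).uniform_(1.0, float(t_max) - 1.0))
                bias[hidden:2 * hidden] = b_f   # forget gate chunk
                bias[0:hidden] = -b_f           # input gate chunk
    return lstm

# Hypothetical usage: t_max ~ 784 for pixel-by-pixel MNIST-style sequences.
lstm = chrono_init(nn.LSTM(input_size=32, hidden_size=64, num_layers=1), t_max=784)
print(lstm.bias_ih_l0[64:128][:5])  # a few of the resulting forget-gate biases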