Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon

Tech Guides

852 Articles
article-image-uses-of-machine-learning-in-gaming
Natasha Mathur
22 Oct 2018
5 min read
Save for later

Uses of Machine Learning in Gaming

Natasha Mathur
22 Oct 2018
5 min read
All around us, our perception of learning and intellect is being challenged daily with the advent of new and emerging technologies. From self-driving cars, playing Go and Chess, to computers being able to beat humans at classic Atari games, the advent of a group of technologies we colloquially call Machine Learning have come to dominate a new era in technological growth – a new era of growth that has been compared with the same importance as the discovery of electricity and has already been categorized as the next human technological age. Games and simulations are no stranger to AI technologies and there are numerous assets available to the Unity developer in order to provide simulated machine intelligence. These technologies include content like Behavior Trees, Finite State Machine, navigation meshes, A*, and other heuristic ways game developers use to simulate intelligence. So, why Machine Learning and why now? The reason is due in large part to the OpenAI initiative, an initiative that encourages research across academia and the industry to share ideas and research on AI and ML. This has resulted in an explosion of growth in new ideas, methods, and areas for research. This means for games and simulations that we no longer have to fake or simulate intelligence. Now, we can build agents that learn from their environment and even learn to beat their human builders. This article is an excerpt taken from the book 'Learn Unity ML-Agents – Fundamentals of Unity Machine Learning'  by Micheal Lanham. In this article, we look at the role that machine learning plays in game development. Machine Learning is an implementation of Artificial Intelligence. It is a way for a computer to assimilate data or state and provide a learned solution or response. We often think of AI now as a broader term to reflect a "smart" system. A full game AI system, for instance, may incorporate ML tools combined with more classic AIs like Behavior Trees in order to simulate a richer, more unpredictable AI. We will use AI to describe a system and ML to describe the implementation. How Machine Learning is useful in gaming Game engines have embraced the idea of incorporating ML into all aspects of its product and not just for use as a game AI. While most developers may try to use ML for gaming, it certainly helps game development in the following areas: Map/Level Generation: There are already plenty of examples where developers have used ML to auto-generate everything from dungeons to the realistic terrain. Getting this right can provide a game with endless replayability, but it can be some of the most challenging ML to develop. Texture/Shader Generation: Another area that is getting the attention of ML is texture and shader generation. These technologies are getting a boost brought on by the attention of advanced generative adversarial networks, or GAN. There are plenty of great and fun examples of this tech in action; just do a search for DEEP FAKES in your favorite search engine. Model Generation: There are a few projects coming to fruition in this area that could greatly simplify 3D object construction through enhanced scanning and/or auto-generation. Imagine being able to textually describe a simple model and having ML build it for you, in real-time, in a game or other AR/VR/MR app, for example. Audio Generation: Being able to generate audio sound effects or music on the fly is already being worked on for other areas, not just games. Yet, just imagine being able to have a custom designed soundtrack for your game developed by ML. Artificial Players: This encompasses many uses from the gamer themselves using ML to play the game on their behalf to the developer using artificial players as enhanced test agents or as a way to engage players during low activity. If your game is simple enough, this could also be a way of auto testing levels. NPCs or Game AI: Currently, there are better patterns out there to model basic behavioral intelligence in the form of Behavior Trees. While it's unlikely that BTs or other similar patterns will go away any time soon, imagine being able to model an NPC that may actually do an unpredictable, but rather cool behavior. This opens all sorts of possibilities that excite not only developers but players as well. So, we learned about different areas of the gaming world such as model generation, artificial players, NPCs, level generation, etc, where Machine learning can be extensively used. If you found this post useful, be sure to check out the book 'Learn Unity ML-Agents – Fundamentals of Unity Machine Learning' to learn more machine learning concepts in gaming. 5 Ways Artificial Intelligence is Transforming the Gaming Industry How should web developers learn machine learning? Deep Learning in games – Neural Networks set to design virtual worlds
Read more
  • 0
  • 0
  • 10846

article-image-how-artificial-intelligence-can-improve-pentesting
Melisha Dsouza
21 Oct 2018
8 min read
Save for later

How artificial intelligence can improve pentesting

Melisha Dsouza
21 Oct 2018
8 min read
686 cybersecurity breaches were reported in the first three months of 2018 alone, with unauthorized intrusion accounting for 38.9% of incidents. And with high-profile data breaches dominating headlines, it’s clear that while modern, complex software architecture might be more adaptable and data-intensive than ever, securing that software is proving a real challenge. Penetration testing (or pentesting) is a vital component within the cybersecurity toolkit. In theory, it should be at the forefront of any robust security strategy. But it isn’t as simple as just rolling something out with a few emails and new software - it demands people with great skills, as well a culture where stress testing and hacking your own system is viewed as a necessity, not an optional extra. This is where artificial intelligence comes in - the automation that you can achieve through artificial intelligence could well help make pentesting much easier to do consistently and at scale. In turn, this would help organizations tackle both issues of skills and culture, and get serious about their cybersecurity strategies. But before we dive deeper into artificial intelligence and pentesting, let’s take a look at where we are now, and the shortcomings of established pentesting methods. The shortcomings of established methods of pentesting Typically, pentesting is carried out in 5 stages: Source: Incapsula Every one of these stages, when carried out by humans, opens up the chance of error. Yes, software is important, but contextual awareness and decisions are required.. This process, then, provides plenty of opportunities for error. From misinterpreting data - like thinking a system is secure, when actually it isn’t - to taking care of evidence and thoroughly and clearly recording the results of pentests, even the most experienced pentester will get things wrong. But even if you don’t make any mistakes, this whole process is hard to do well at scale. It requires a significant amount of time and energy to test a piece of software, which, given the pace of change created by modern processes, makes it much harder to maintain the levels of rigor you ultimately want from pentesting. This is where artificial intelligence comes in. The pentesting areas that artificial intelligence can impact Let’s dive into the different stages of pentesting that AI can impact. #1 Reconnaissance Stage The most important stage in pentesting is the Reconnaissance or information gathering stage. As rightly said by many in cybersecurity, "The more information gathered, the higher the likelihood of success." Therefore, a significant amount of time should be spent obtaining as much information as possible about the target. Using AI to automate this stage would provide accurate results as well as save a lot of time invested. Using a combination of Natural Language Processing, Computer Vision, and Artificial Intelligence, experts can identify a wide variety of details that can be used to build a profile of the company, its employees, the security posture, and even the software/hardware components of the network and computers. #2 Scanning Stage Comprehensive coverage is needed In the scanning phase. Manually scanning through thousands if systems in an organization is not ideal. NNor is it ideal to interpret the results returned by scanning tools. AI can be used to tweak the code of the scanning tools to scan systems as well as interpret the results of the scan. It can help save pentesters time and help in the overall efficiency of the pentesting process. AI can focus on test management and the creation of test cases automatically that will check if a particular program can be tagged having security flaw. They can also be used to check how a target system responds to an intrusion. #3 Gaining and Maintaining access stage Gaining access phase involves taking control of one or more network devices in order to either extract data from the target, or to use that device to then launch attacks on other targets. Once a system is scanned for vulnerabilities, the pentesters need to ensure that the system does not have any loopholes that attackers can exploit to get into the network devices. They need to check that the network devices are safely protected with strong passwords and other necessary credentials. AI-based algorithms can try out different combinations of passwords to check if the system is susceptible for a break-in. The algorithms can be trained to observe user data, look for trends or patterns to make inferences about possible passwords used. Maintaining access focuses on establishing other entry points to the target. This phase is expected to trigger mechanisms, to ensure that the penetration tester’s security when accessing the network. AI-based algorithms should be run at equal intervals to time to guarantee that the primary path to the device is closed. The algorithms should be able to discover backdoors, new administrator accounts, encrypted channels, new network access channels, and so on. #4 Covering Tracks And Reporting The last stage tests whether an attacker can actually remove all traces of his attack on the system. Evidence is most often stored in user logs, existing access channels, and in error messages caused by the infiltration process. AI-powered tools can assist in the discovery of hidden backdoors and multiple access points that haven't been left open on the target network; All of these findings should be automatically stored in a report with a proper timeline associated with every attack done. A great example of a tool that efficiently performs all these stages of pentesting is CloudSEK’s X-Vigil. This tool leverages AI to extract data, derive analysis and discover vulnerabilities in time to protect an organization from data breach. Manual vs automated vs AI-enabled pentesting Now that you have gone through the shortcomings of manual pen testing and the advantages of AI-based pentesting, let’s do a quick side-by-side comparison to understand the difference between the two.   Manual Testing Automated Testing AI enabled pentesting Manual testing is not accurate at all times due to human error This is more likely to return false positives AI enabled pentesting is accurate as compared to automated testing Manual testing is time-consuming and takes up human resources.   Automated testing is executed by software tools, so it is significantly faster than a manual approach.   AI enabled testing does not consume much time. The algorithms can be deployed for thousands of systems at a single instance. Investment is required for human resources.   Investment is required for testing tools. AI will save the investment for human resources in pentesting. Rather, the same employees can be used to perform less repetitive and more efficient tasks Manual testing is only practical when the test cases are run once or twice, and frequent repetition is not required..   Automated testing is practical when tools find test vulnerabilities out of programmable bounds AI-based pentesting is practical in organizations with thousands of systems that need to be tested at once to save time and resources.   AI-based pentesting tools Pentoma is an AI-powered penetration testing solution that allows software developers to conduct smart hacking attacks and efficiently pinpoint security vulnerabilities in web apps and servers. It identifies holes in web application security before hackers do, helping prevent any potential security damages. Pentoma analyzes web-based applications and servers to find unknown security risks.In Pentoma, with each hacking attempt, machine learning algorithms incorporate new vulnerability discoveries, thus continuously improving and expanding threat detection capability. Wallarm Security Testing is another AI based testing tool that discovers network assets, scans for common vulnerabilities, and monitors application responses for abnormal patterns. It discovers application-specific vulnerabilities via Automated Threat Verification. The content of a blocked malicious request is used to create a sanitized test with the same attack vector to see how the application or its copy in a sandbox would respond. With such AI based pentesting tools, pentesters can focus on the development process itself, confident that applications are secured against the latest hacking and reverse engineering attempts, thereby helping to streamline a product’s time to market. Perhaps it is the increase in the number of costly data breaches or the continually expanding attack and proliferation of sensitive data and the attempt to secure them with increasingly complex security technologies that businesses lack in-house expertise to properly manage. Whatever be the reason, more organizations are waking up to the fact that if vulnerabilities are not caught in time can be catastrophic for the business. These weaknesses, which can range from poorly coded web applications, to unpatched databases to exploitable passwords to an uneducated user population, can enable sophisticated adversaries to run amok across your business.  It would be interesting to see the growth of AI in this field to overcome all the aforementioned shortcomings. 5 ways artificial intelligence is upgrading software engineering Intelligent Edge Analytics: 7 ways machine learning is driving edge computing adoption in 2018 8 ways Artificial Intelligence can improve DevOps
Read more
  • 0
  • 0
  • 13445

article-image-julia-for-machine-learning-will-the-new-language-pick-up-pace
Prasad Ramesh
20 Oct 2018
4 min read
Save for later

Julia for machine learning. Will the new language pick up pace?

Prasad Ramesh
20 Oct 2018
4 min read
Machine learning can be done using many languages, with Python and R being the most popular. But one language has been overlooked for some time—Julia. Why isn’t Julia machine learning a thing? Julia isn't an obvious choice for machine learning simply because it's a new language that has only recently hit version 1.0. While Python is well-established, with a large community and many libraries, Julia simply doesn't have the community to shout about it. And that's a shame. Right now Julia is used in various fields. From optimizing milk production in dairy farms to parallel supercomputing for astronomy, Julia has a wide range of applications. A common theme here is that these actions all require numerical, scientific, and sometimes parallel computation. Julia is well-suited to the sort of tasks where intensive computation is essential. Viral Shah, CEO of Julia Computing said to Forbes “Amazon, Apple, Disney, Facebook, Ford, Google, Grindr, IBM, Microsoft, NASA, Oracle and Uber are other Julia users, partners and organizations hiring Julia programmers.” Clearly, Julia is powering the analytical nous of some of the most high profile organizations on the planet. Perhaps it just needs more cheerleading to go truly mainstream. Why Julia is a great language for machine learning Julia was originally designed for high-performance numerical analysis. This means that everything that has gone into its design is built for the very things you need to do to build effective machine learning systems. Speed and functionality Julia combines the functionality from various popular languages like Python, R, Matlab, SAS and Stata with the speed of C++ and Java. A lot of the standard LaTeX symbols can be used in Julia, with the syntax usually being the same as LaTeX. This mathematical syntax makes it easy for implementing mathematical formulae in code and make Julia machine learning possible. It also has in-built support for parallelism which allows utilization of multiple cores at once making it fast at computations. Julia’s loops and functions features are pretty fast, fast enough that you would probably notice significant performance differences against other languages. The performance can be almost comparable to C with very little code actually used. With packages like ArrayFire, generic code can be run on GPUs. In Julia, the multiple dispatch feature is very useful for defining number and array-like datatypes. Matrices, data tables work with good compatibility and performance. Julia has automatic garbage collection, a collection of libraries for mathematical calculations, linear algebra, random number generation, and regular expression matching. Libraries and scalability Julia machine learning can be done with powerful tools like MLBase.jl, Flux.jl, Knet.jl, that can be used for machine learning and artificial intelligence systems. It also has a scikit-learn implementation called ScikitLearn.jl. Although ScikitLearn.jl is not an official port, it is a useful additional tool for building machine learning systems with Julia. As if all those weren’t enough, Julia also has TensorFlow.jl and MXNet.jl. So, if you already have experience with these tools, in other implementations, the transition is a little easier than learning everything from scratch. Julia is also incredibly scalable. It can be deployed on large clusters quickly, which is vital if you’re working with big data across a distributed system. Should you consider Julia machine learning? Because it’s fast and possesses a great range of features, Julia could potentially overtake both Python and R to be the choice of language for machine learning in the future. Okay, maybe we shouldn’t get ahead of ourselves. But with Julia reaching the 1.0 milestone, and the language rising on the TIOBE index, you certainly shouldn’t rule out Julia when it comes to machine learning. Julia is also available to use in the popular tool Jupyter Notebook, paving a path for wider adoption. A note of caution, however, is important. Rather than simply dropping everything for Julia, it will be worth monitoring the growth of the language. Over the next 12 to 24 months we’ll likely see new projects and libraries, and the Julia machine learning community expanding. If you start hearing more noise about the language, it becomes a much safer option to invest your time and energy in learning it. If you are just starting off with machine learning, then you should stick to other popular languages. An experienced engineer, however, who already has a good grip on other languages shouldn’t be scared of experimenting with Julia - it gives you another option, and might just help you to uncover new ways of working and solving problems. Julia 1.0 has just been released What makes functional programming a viable choice for artificial intelligence projects? Best Machine Learning Datasets for beginners
Read more
  • 0
  • 0
  • 8445
Banner background image

article-image-4-key-benefits-of-using-firebase-for-mobile-app-development
Guest Contributor
19 Oct 2018
6 min read
Save for later

4 key benefits of using Firebase for mobile app development

Guest Contributor
19 Oct 2018
6 min read
A powerful backend solution is essential for building sophisticated mobile apps. In recent years, Firebase has emerged to prominence as a power-packed Backend-as-a-Solution (BaaS), thanks to its wide-ranging features and performance boosting elements. After being acquired in 2014 by Google, several of its features further got a performance boost. These features have made  Firebase quite a popular backend solution for app developers and other emerging IT sectors. Let us look at its 4 key benefits for cross-platform mobile app development. Unleashing the power of Google Analytics Google Analytics for Firebase is a completely free solution with unconstrained reporting on many aspects. The reporting feature allows you to evaluate client behavior, report on broken links, user interactions and all other aspects of user experience and user interface. The reporting helps developers make informed decisions while optimizing the UI and the app performance. The unmatched scale of reporting: Firebase analytics allows access to unlimited reports on as many as 500 different events. The developers can also create custom events for reporting as their need suits. Robust audience segmentation: The Firebase analytics also allows segmenting the app audience on different parameters and grounds. The integrated console allows segmenting the audience on the basis of device information, custom events, and user characteristics. Crash reporting to fix Bugs Firebase also helps to address performance issues of an app by fixing bugs right from its backend solution. It is also equipped with robust crash reporting feature. Its crash reporting helps to deliver intricate and detailed bug and crash reports to address all the coding errors in an app. The reporting feature is capable of grouping together the issues in different categories as per the characteristics of the problem. Here are some of the attributes of this reporting feature. Monitoring errors: It is capable of monitoring fatal errors for iOS apps and both fatal and non-fatal errors for Android apps. Generally, reports are initiated as per the impact caused by such errors on the user experience. Required data collection to fix errors: The reports also enlist all the details concerning the device in use, performance shortfalls and user scenarios concerning the erroneous events. According to the contributing factors and other similarities, the issues are grouped in different categories. Email alerts: It also allows sending email alerts as and when such issues or problems are detected. The configuration of error reporting: The error reporting can also be configured remotely to control who can access the reports and list of events that occurred before an event. It is free: Crash and bug reporting is free with Firebase. You don't need to pay a penny to access this feature. Synchronizing data with real-time database With Firebase you can sync the offline and online data through NoSQL database. This makes the application data available on both offline and online states of the app. This boosts collaboration on the application data in real time. Here are some of its benefits. Real-time: Unlike the so-called HTTP requests that work to update the data across interfaces, the Real-time Database of firebase syncs data with every change thus helping to reflect the change in real time across any device in use. Offline: As Firebase Real-time Database SDK helps save your data in local disk, you can always access the data offline. As and when connectivity is back, the changes are synced with the present state of the server. Access from multiple devices: The Firebase Real-time Database allows accessing application data from multiple devices and interfaces including mobile devices and web. Splitting and scaling your data: Thanks to Firebase Real-time Database, you can split your data across multiple databases within the same project and set rules for each database instances. Firebase is feature rich for futuristic app development In addition to the above, Firebase is fully empowered with a host of rich features required for building sophisticated and most feature-rich mobile apps. Let us have a look at some of the key features of Firebase that made it a reliable platform for cross-platform development. Hosting: The hosting feature of Firebase allows developers to update their contents in the Content Delivery Network (CDN) during production. Firebase offers full hosting support with a custom domain, Global CDN, and an automatically provided SSL Certificate. Authentication: Firebase backend service offers a powerful authentication feature. It comes equipped with simple SDKs and easy to use libraries to integrate authentication feature with any mobile app. Storage: Firebase storage feature is powered by Google Cloud Storage and allows users to easily download media files and visual contents. This feature is also helpful in making use of user-generated content. Cloud Messaging: With Cloud Messaging, a mobile app powered can easily send a message to users and indulge in real-time communication. Remote Configuration: This feature of Firebase allows developers to incorporate certain changes in the app remotely. Thanks to this, the changes are reflected in the existing version, and the user does not need to download the latest updated version. Test Lab: With Test lab, developers can easily test the app in all the devices listed in the Google data center. It can even do the testing without requiring any test code of the respective app. Notifications: This feature gives developers a console to manage and send user-focused custom notifications to the users. App Indexing: This feature allows developers to index the app in Google Search and achieve higher search ranks in app marketplaces like Play Store and App Store. Dynamic Links: Firebase also equips the app to create dynamic links or smart URLs to present the respective app across all digital platforms including social media, mobile app, web, email, and other channels. All the above-mentioned benefits and useful features that empower mobile app developers to create dynamic user experience helped Firebase achieve such unprecedented popularity among developers worldwide. No wonder, in a short time span it has become a very popular backend solution for so many successful cross-platform mobile apps. Some exemplary use cases of Firebases Here we have picked two use cases of Firebase, respectively for one relatively new and successful app and one leading app in its niche. Fabulous Fabulous is a unique app that trains users to dispose of bad habits and get used to good habits to ensure health and wellbeing. The app by customizing the onboarding process through Firebase managed to double the retention rate. The app could incorporate custom user experience for different groups of users as per their preference. Onefootball This leading mobile soccer app OneFootBall experienced more than 5% increase in user session time thanks to Firebase. The new backend solution powered by Firebase helped the game app engage the audience more efficiently than ever before. The custom contents created by this popular app can enjoy better traction with users thanks to higher engagement. Author Bio: Juned Ahmed works as an IT consultant at IndianAppDevelopers, a leading Mobile app development company which offers to hire app developers in India for mobile solutions. He has more than 10 years of experience in developing and implementing marketing strategies. How to integrate Firebase on Android/iOS applications natively. Build powerful progressive web apps with Firebase. How to integrate Firebase with NativeScript for cross-platform app development.
Read more
  • 0
  • 0
  • 14103

article-image-why-uber-created-hudi-an-open-source-incremental-processing-framework-on-apache-hadoop
Bhagyashree R
19 Oct 2018
3 min read
Save for later

Why did Uber created Hudi, an open source incremental processing framework on Apache Hadoop?

Bhagyashree R
19 Oct 2018
3 min read
In the process of rebuilding its Big Data platform, Uber created an open-source Spark library named Hadoop Upserts anD Incremental (Hudi). This library permits users to perform operations such as update, insert, and delete on existing Parquet data in Hadoop. It also allows data users to incrementally pull only the changed data, which significantly improves query efficiency. It is horizontally scalable, can be used from any Spark job, and the best part is that it only relies on HDFS to operate. Why is Hudi introduced? Uber studied its current data content, data access patterns, and user-specific requirements to identify problem areas. This research revealed the following four limitations: Scalability limitation in HDFS Many companies who use HDFS to scale their Big Data infrastructure face this issue. Storing large numbers of small files can affect the performance significantly as HDFS is bottlenecked by its NameNode capacity. This becomes a major issue when the data size grows above 50-100 petabytes. Need for faster data delivery in Hadoop Since Uber operates in real time, there was a need for providing services the latest data. It was important to make the data delivery much faster, as the 24-hour data latency was way too slow for many of their use cases. No direct support for updates and deletes for existing data Uber used snapshot-based ingestion of data, which means a fresh copy of source data was ingested every 24 hours. As Uber requires the latest data for its business, there was a need for a solution which supports update and delete operations for existing data. However, since their Big Data is stored in HDFS and Parquet, direct support for update operations on existing data is not available. Faster ETL and modeling ETL and modeling jobs were also snapshot-based, requiring their platform to rebuild derived tables in every run. ETL jobs also needed to become incremental to reduce data latency. How Hudi solves the aforementioned limitations? The following diagram shows Uber's Big Data platform after the incorporation of Hudi: Source: Uber Regardless of whether the data updates are new records added to recent date partitions or updates to older data, Hudi allows users to pass on their latest checkpoint timestamp and retrieve all the records that have been updated since. This data retrieval happens without running an expensive query that scans the entire source table. Using this library Uber has moved to an incremental ingestion model leaving behind the snapshot-based ingestion. As a result, the data latency was reduced from 24 hrs to less than one hour. To know about Hudi in detail, check out Uber’s official announcement. How can Artificial Intelligence support your Big Data architecture? Big data as a service (BDaaS) solutions: comparing IaaS, PaaS and SaaS Uber’s Marmaray, an Open Source Data Ingestion and Dispersal Framework for Apache Hadoop
Read more
  • 0
  • 0
  • 12821

article-image-5-best-practices-to-perform-data-wrangling-with-python
Savia Lobo
18 Oct 2018
5 min read
Save for later

5 best practices to perform data wrangling with Python

Savia Lobo
18 Oct 2018
5 min read
Data wrangling is the process of cleaning and structuring complex data sets for easy analysis and making speedy decisions in less time. Due to the internet explosion and the huge trove of IoT devices there is a massive availability of data, at present. However, this data is most often in its raw form and includes a lot of noise in the form of unnecessary data, broken data, and so on. Clean up of this data is essential in order to use it for analysis by organizations. Data wrangling plays a very important role here by cleaning this data and making it fit for analysis. Also, Python language has built-in features to apply any wrangling methods to various data sets to achieve the analytical goal. Here are 5 best practices that will help you out in your data wrangling journey with the help of Python. And at the end, all you’ll have is a clean and ready to use data for your business needs. 5 best practices for data wrangling with Python Learn the data structures in Python really well Designed to be a very high-level language, Python offers an array of amazing data structures with great built-in methods. Having a solid grasp of all the capabilities will be a potent weapon in your repertoire for handling data wrangling task. For example, dictionary in Python can act almost like a mini in-memory database with key-value pairs. It supports extremely fast retrieval and search by utilizing a hash table underneath. Explore other built-in libraries related to these data structures e.g. ordered dict, string library for advanced functions. Build your own version of essential data structures like stack, queues, heaps, and trees, using classes and basic structures and keep them handy for quick data retrieval and traversal. Learn and practice file and OS handling in Python How to open and manipulate files How to manipulate and navigate directory structure Have a solid understanding of core data types and capabilities of Numpy and Pandas How to create, access, sort, and search a Numpy array. Always think if you can replace a conventional list traversal (for loop) with a vectorized operation. This will increase speed of your data operation. Explore special file types like .npy (Numpy’s native storage) to access/read large data set with much higher speed than usual list. Know in details all the file types you can read using built-in Pandas methods. This will simplify to a great extent your data scraping. Almost all of these methods have great data cleaning and other checks built in. Try to use such optimized routines instead of writing your own to speed up the process. Build a good understanding of basic statistical tests and a panache for visualization Running some standard statistical tests can quickly give you an idea about the quality of the data you need to wrangle with. Plot data often even if it is multi-dimensional. Do not try to create fancy 3D plots. Learn to explore simple set of pairwise scatter plots. Use boxplots often to see the spread and range of the data and detect outliers. For time-series data, learn basic concepts of ARIMA modeling to check the sanity of the data Apart from Python, if you want to master one language, go for SQL As a data engineer, you will inevitably run across situations where you have to read from a large, conventional database storage. Even if you use Python interface to access such database, it is always a good idea to know basic concepts of database management and relational algebra. This knowledge will help you build on later and move into the world of Big Data and Massive Data Mining (technologies like Hadoop/Pig/Hive/Impala) easily. Your basic data wrangling knowledge will surely help you deal with such scenarios. Although Data wrangling may be the most time-consuming process, it is the most important part of the data management. Data collected by businesses on a daily basis can help them make decisions on the latest information available. It also allows businesses to find the hidden insights and use it in the decision-making processes and provide them with new analytic initiatives, improved reporting efficiency and much more. About the authors Dr. Tirthajyoti Sarkar works in San Francisco Bay area as a senior semiconductor technologist where he designs state-of-the-art power management products and applies cutting-edge data science/machine learning techniques for design automation and predictive analytics. He has 15+ years of R&D experience and is a senior member of IEEE. Shubhadeep Roychowdhury works as a Sr. Software Engineer at a Paris based Cyber Security startup where he is applying the state-of-the-art Computer Vision and Data Engineering algorithms and tools to develop cutting edge product. Data cleaning is the worst part of data analysis, say data scientists Python, Tensorflow, Excel and more – Data professionals reveal their top tools Manipulating text data using Python Regular Expressions (regex)
Read more
  • 0
  • 0
  • 8894
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
article-image-4-misconceptions-about-data-wrangling
Sugandha Lahoti
17 Oct 2018
4 min read
Save for later

4 misconceptions about data wrangling

Sugandha Lahoti
17 Oct 2018
4 min read
Around 80% of the time in data analysis is spent on cleaning and preparing data for analysis. This is, however, an important task, and is a prerequisite to the rest of the data analysis workflow, including visualization, analysis, and reporting. Although, being an important task given its nature, there are certain myths associated with data wrangling which developers should be cautious of. In this post, we will discuss four such misconceptions. Myth #1: Data wrangling is all about writing SQL query There was a time when data processing needed data to be presented in a relational manner so that SQL queries could be written. Today, there are many other types of data sources in addition to the classic static SQL databases, which can be analyzed. Often, an engineer has to pull data from diverse sources such as web portals, Twitter feeds, sensor fusion streams, police or hospital records. Static SQL query can help only so much in those diverse domains. A programmatic approach, which is flexible enough to interface with myriad sources and is able to parse the raw data through clever algorithmic techniques and use of fundamental data structures (trees, graphs, hash tables, heaps), will be the winner. Myth #2: Knowledge of statistics is not required for data wrangling Quick statistical tests and visualizations are always invaluable to check the ‘quality’ of the data you sourced. These tests can help detect outliers and wrong data entry, without running complex scripts. For effective data wrangling, you don’t need to have knowledge of advanced statistics. However, you must understand basic descriptive statistics and know how to execute them using built-in Python libraries. Myth #3: You have to be a machine learning expert to do great data wrangling Deep knowledge of machine learning is certainly not a pre-requisite for data wrangling. It is true that the end goal of data wrangling is often to prepare the data so that it can be used in a machine learning task downstream. As a data wrangler, you do not have to know all the nitty-gritties of your project’s machine learning pipeline. However, it is always a good idea to talk to the machine learning expert who will use your data and understand the data structure interface and format he/she needs to run the model fast and accurately. Myth #4: Deep knowledge of programming is not required for data wrangling As explained above, the diversity and complexity of data sources require that you are comfortable with deep notions of fundamental data structures and how a programming language paradigm handles them. Increasing deep knowledge of the programming framework (Python for example) will surely help you to come up with innovative methods for dealing with data source interfacing and data cleaning issues. The speed and efficiency of your data processing pipeline can often be benefited from using advanced knowledge of basic algorithms e.g. search, sort, graph traversal, hash table building, etc. Although built-in methods in standard libraries are optimized, having this knowledge gives you an edge for any situation. You read a guest post from Tirthajyoti Sarkar and Shubhadeep Roychowdhury, the authors of Data Wrangling with Python. We hope that these misconceptions would help you realize that data wrangling is not as difficult as it seems. Have fun wrangling data! About the authors Dr. Tirthajyoti Sarkar works as a Sr. Principal Engineer in the semiconductor technology domain where he applies cutting-edge data science/machine learning techniques for design automation and predictive analytics. Shubhadeep Roychowdhury works as a Sr. Software Engineer at a Paris based Cyber Security startup. He holds a Master Degree in Computer Science from West Bengal University Of Technology and certifications in Machine Learning from Stanford. Don’t forget to check out Data Wrangling with Python to learn the essential basics of data wrangling using Python. 30 common data science terms explained Python, Tensorflow, Excel and more – Data professionals reveal their top tools How to create a strong data science project portfolio that lands you a job
Read more
  • 0
  • 0
  • 5103

article-image-how-is-artificial-intelligence-changing-the-mobile-developer-role
Bhagyashree R
15 Oct 2018
10 min read
Save for later

How is Artificial Intelligence changing the mobile developer role?

Bhagyashree R
15 Oct 2018
10 min read
Last year, at Google I/O, Sundar Pichai, the CEO of Google, said: “We are moving from a mobile-first world to an AI-first world” Is it only applicable to Google? Not really. In the recent past, we have seen several advancements in Artificial Intelligence and in parallel a plethora of intelligent apps coming into the market. These advancements are enabling developers to take their apps to the next level by integrating recommendation service, image recognition, speech recognition, voice translation, and many more cool capabilities. Artificial Intelligence is becoming a potent tool for mobile developers to experiment and innovate. The Artificial Intelligence components that are integral to mobile experiences, such as voice-based assistants and location-based services, increasingly require mobile developers to have a basic understanding of Artificial Intelligence to be effective. Of course, you don’t have to be Artificial Intelligence experts to include intelligent components in your app. But, you should definitely understand something about what you’re building into your app and why. After all AI in mobile is not just limited to calling an API, isn't it? There’s more to it and in this article we will explore how Artificial Intelligence will shape the mobile developer role in the immediate future. Read also: AI on mobile: How AI is taking over the mobile devices marketspace What is changing in the mobile developer role? Focus shifting to data With Artificial Intelligence becoming more and more accessible, intelligent apps are becoming the new norm for businesses. Artificial Intelligence strengthens the relationship between brands and customers, inspiring developers to build smart apps that increase user retention. This also means that developers have to direct their focus to data. They have to understand things like how the data will be collected? How will the data be fed to machines and how often will data input be needed? When nearly 1 in 4 people abandon an app after its first use, as a mobile app developer, you need to rethink how you drive in-app personalization and engagement. Explore “humanized” way of user-app interaction With so many chatbots such as Siri and Google Assistant coming into the market, we can see that “humanizing” the interaction between the user and the app is becoming mainstream. “Humanizing” is the process where the app becomes relatable to the user, and the more effective it is conducted, the more the end user will interact with the app. Users now want easy navigation and searching system and Artificial Intelligence fits perfectly in the scenario. The advances in technologies like text-to-speech, speech-to-text, Natural Language Processing, and cloud services, in general, have contributed to the mass adoption of these types of interfaces. Companies are increasingly expecting mobile developers to be comfortable working with AI functionalities Artificial Intelligence is the future. Companies are now expecting their mobile developers to know how to handle the huge amount of data generated every day and how to use it. Here's is an example of what Google wants their engineers to do: “We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day.” This open-ended requirement list shows that it is the right time to learn and embrace Artificial Intelligence as soon as possible. What skills do you need to build intelligent apps? Ideally, data scientists are the ones who conceptualize mathematical models and machine learning engineers are the ones who translate it into the code and train the model. But, when you are working in a resource-tight environment, for example in a start-up, you will be responsible for doing the end-to-end job. It is not as scary as it sounds, because you have several resources to get started with! Taking your first steps with machine learning as a service Learning anything starts with motivating yourself. Directly diving into the maths and coding part of machine learning might exhaust and bore you. That's why it's a good idea to know what the end goal of your entire learning process is going to be and what types of solutions are possible using machine learning. There are many products available that you can try to quickly get started such as Google Cloud AutoML (Beta), Firebase MLKit (Beta), and Fritz Mobile SDK, among others. Read also: Machine Learning as a Service (MLaaS): How Google Cloud Platform, Microsoft Azure, and AWS are democratizing Artificial Intelligence Getting your hands dirty After getting a “warm-up” the next step will involve creating and training your own model. This is where you’ll be introduced to TensorFlow Lite, which is going to be your best friend throughout your journey as a machine learning mobile developer. There are many other machine learning tools coming into the market that you can make use of. These tools make building AI in mobile easier. For instance, you can use Dialogflow, a Natural Language Understanding (NLU) platform that makes it easy for developers to design and integrate conversational user interfaces into mobile apps, web applications, devices, and bots. You can then integrate it on Alexa, Cortana, Facebook Messenger, and other platforms your users are on. Read also: 7 Artificial Intelligence tools mobile developers need to know For practicing you can leverage an amazing codelab by Google, TensorFlow For Poets. It guides you through creating and training a custom image classification model. Through this codelab you will learn the basics of data collection, model optimization, and other key components involved in creating your own model. The codelab is divided into two parts. The first part covers creating and training the model, and the second part is focused on TensorFlow Lite which is the mobile version of TensorFlow that allows you to run the same model on a mobile device. Mathematics is the foundation of machine learning Love it or hate it, machine learning and Artificial Intelligence are built on mathematical principles like calculus, linear algebra, probability, statistics, and optimization. You need to learn some essential foundational concepts and the notation used to express them. There are many reasons why learning mathematics for machine learning is important. It will help you in the process of selecting the right algorithm which includes giving considerations to accuracy, training time, model complexity, number of parameters and number of features. Maths is needed when choosing parameter settings and validation strategies, identifying underfitting and overfitting by understanding the bias-variance tradeoff. Read also: Bias-Variance tradeoff: How to choose between bias and variance for your machine learning model [Tutorial] Read also: What is Statistical Analysis and why does it matter? What are the key aspects of Artificial Intelligence for mobile to keep in mind? Understanding the problem Your number one priority should be the user problem you are trying to solve. Instead of randomly integrating a machine learning model into an application, developers should understand how the model applies to the particular application or use case. This is important because you might end up building a great machine learning model with excellent accuracy rate, but if it does not solve any problem, it will end up being redundant. You must also understand that while there are many business problems which require machine learning approaches, not all of them do. Most business problems can be solved through simple analytics or a baseline approach. Data is your best friend Machine learning is dependent on data; the data that you use, and how you use it, will define the success of your machine learning model. You can also make use of thousands of open source datasets available online. Google recently launched a tool for dataset search named, Google Dataset Search which will make it easier for you to search the right dataset for your problem. Typically, there’s no shortage of data; however, the abundant existence of data does not mean that the data is clean, reliable, or can be used as intended. Data cleanliness is a huge issue. For example, a typical company will have multiple customer records for a single individual, all of which differ slightly. If the data isn’t clean, it isn’t reliable. The bottom line is, it’s a bad practice to just grabbing the data and using it without considering its origin. Read also: Best Machine Learning Datasets for beginners Decide which model to choose A machine learning algorithm is trained and the artifact that it creates after the training process is called the machine learning model. An ML model is used to find patterns in data without the developer having to explicitly program those patterns. We cannot look through such a huge amount of data and understand the patterns. Think of the model as your helper who will look through all those terabytes of data and extract knowledge and insights from the data. You have two choices here: either you can create your own model or use a pre-built model. While there are several pre-built models available, your business-specific use cases may require specialized models to yield the desired results. These off-the-shelf model may also need some fine-tuning or modification to deliver the value the app is intended to provide. Read also: 10 machine learning algorithms every engineer needs to know Thinking about resource utilization is important Artificial Intelligence-powered apps or apps, in general, should be developed with resource utilization in mind. Though companies are working towards improving mobile hardware, currently, it is not the same as what we can accomplish with GPU clusters in the cloud. Therefore, developers need to consider how the models they intend to use would affect resources including battery power and memory usage. In terms of computational resources, inferencing or making predictions is less costly than training. Inferencing on the device means that the models need to be loaded into RAM, which also requires significant computational time on the GPU or CPU. In scenarios that involve continuous inferencing, such as audio and image data which can chew up bandwidth quickly, on-device inferencing is a good choice. Learning never stops Maintenance is important, and to do that you need to establish a feedback loop and have a process and culture of continuous evaluation and improvement. A change in consumer behavior or a market trend can make a negative impact on the model. Eventually, something will break or no longer work as intended, which is another reason why developers need to understand the basics of what it is they’re adding to an app. You need to have some knowledge of how the Artificial Intelligence component that you just put together is working or how it could be made to run faster. Wrapping up Before falling for the Artificial Intelligence and machine learning hype, it’s important to understand and analyze the problem you are trying to solve. You should examine whether applying machine learning can improve the quality of the service, and decide if this improvement justifies the effort of deploying a machine learning model. If you just want a simple API endpoint and don’t want to dedicate much time in deploying a model, cloud-based web services are the best option for you. Tools like ML Kit for Firebase looks promising and seems like a good choice for startups or developers just starting out. TensorFlow Lite and Core ML are good options if you have mobile developers on your team or if you’re willing to get your hands a little dirty. Artificial Intelligence is influencing the app development process by providing us a data-driven approach for solving user problems. It wouldn't be surprising if in the near future Artificial Intelligence becomes a forerunning factor for app developers in their expertise and creativity. 10 useful Google Cloud Artificial Intelligence services for your next machine learning project [Tutorial] How Artificial Intelligence is going to transform the Data Center How Serverless computing is making Artificial Intelligence development easier
Read more
  • 0
  • 0
  • 7542

article-image-4-myths-about-git-and-github-you-should-know-about
Savia Lobo
07 Oct 2018
3 min read
Save for later

4 myths about Git and GitHub you should know about

Savia Lobo
07 Oct 2018
3 min read
With an aim to replace BitKeeper, Linus Torvalds created Git in 2005 to support the development of the Linux kernel. However, Git isn’t necessarily limited to code, any product or project that requires or exhibits characteristics such as having multiple contributors, requiring release management and versioning stands to have an improved workflow through Git. Just as every solution or tool has its own positives and negatives, Git is also surrounded by myths. Alex Magana and Joseph Mul, the authors of Introduction to Git and GitHub course discuss in this post some of the myths about the Git tool and GitHub. Git is GitHub Due to the usage of Git and GitHub as the complete set that forms the version control toolkit, adopters of the two tools misconceive Git and GitHub as interchangeable tools. Git is a tool that offers the ability to track changes on files that constitute a project. Git offers the utility that is used to monitor changes and persists the changes. On the other hand, GitHub is akin to a website hosting service. The difference here is that with GitHub, the hosted content is a repository. The repository can then be accessed from this central point and the codebase shared. Backups are equivalent to version control This emanates from a misunderstanding of what version control is and by extension what Git achieves when it’s incorporated into the development workflow. Contrary to archives created based on a team’s backup policy, Git tracks changes made to files and maintains snapshots of a repository at a given point in time. Git is only suitable for teams With the usage of hosting services such as GitHub, the element of sharing and collaboration, may be perceived as a preserve of teams. Git offers gains beyond source control. It lends itself to the delivery of a feature or product from the point of development to deployment. This means that Git is a tool for delivery. It can, therefore, be utilized to roll out functionality and manage changes to source code for teams and individuals alike. To effectively use Git, you need to learn every command to work When working as an individual or a team, the common commands required to allow you to contribute a repository encompass commands for initiating tracking of specific files, persisting changes made to tracked files, reverting changes made to files incorporating changes introduced by other developers working on the same project you are on. The four myths mentioned by the authors provides a clarification on both Git and GitHub and its uses. If you found this post useful, do check out the course titled Introduction to Git and GitHub by Alex and Joseph. GitHub addresses technical debt, now runs on Rails 5.2.1 GitLab 11.3 released with support for Maven repositories, protected environments and more GitLab raises $100 million, Alphabet backs it to surpass Microsoft’s GitHub  
Read more
  • 0
  • 0
  • 5232

article-image-what-role-does-linux-play-in-securing-android-devices
Sugandha Lahoti
07 Oct 2018
9 min read
Save for later

What role does Linux play in securing Android devices?

Sugandha Lahoti
07 Oct 2018
9 min read
In this article, we will talk about the Android Model particularly the Linux Kernel layer, over which Android is built. We will also talk about Android's security features and offerings and how Linux plays a role to secure Android OS. This article is taken from the book Practical Mobile Forensics - Third Edition by Rohit Tamma et al. In this book, you will investigate, analyze, and report iOS, Android, and Windows devices. The Android architecture Android is open source and the code is released under the Apache license. Practically, this means anyone (especially device manufacturers) can access it, freely modify it, and use the software according to the requirements of any device. This is one of the primary reasons for its wide acceptance. Notable players that use Android include Samsung, HTC, Sony, and LG. As with any other platform, Android consists of a stack of layers running one above the other. To understand the Android ecosystem, it's essential to have a basic understanding of what these layers are and what they do. The following figure summarizes the various layers involved in the Android software stack: Android architecture Each of these layers performs several operations that support specific operating system functions. Each layer provides services to the layers lying on top of it. The Linux kernel layer Android OS is built on top of the Linux kernel, with some architectural changes made by Google. There are several reasons for choosing the Linux kernel. Most importantly, Linux is a portable platform that can be compiled easily on different hardware. The kernel acts as an abstraction layer between the software and hardware present on the device. Consider the case of a camera click. What happens when you take a photo using the camera button on your device? At some point, the hardware instruction (pressing a button) has to be converted to a software instruction (to take a picture and store it in the gallery). The kernel contains drivers to facilitate this process. When the user presses on the button, the instruction goes to the corresponding camera driver in the kernel, which sends the necessary commands to the camera hardware, similar to what occurs when a key is pressed on a keyboard. In simple words, the drivers in the kernel command control the underlying hardware. The Linux kernel is responsible for managing the core functionality of Android, such as process management, memory management, security, and networking. Linux is a proven platform when it comes to security and process management. Android has taken leverage of the existing Linux open source OS to build a solid foundation for its ecosystem. Each version of Android has a different version of the underlying Linux kernel. The Marshmallow Android version is known to use Linux kernel 3.18.10, whereas the Nougat version is known to use Linux kernel 4.4.1. Android security Android was designed with a specific focus on security. Android as a platform offers and enforces certain features that safeguard the user data present on the mobile through multi-layered security. There are certain safe defaults that will protect the user, and certain offerings that can be leveraged by the development community to build secure applications. The following are issues that are to be kept in mind while incorporating Android security controls: Protecting user-related data Safeguarding the system resources Making sure that one application cannot access the data of another application The next few sections will help us understand more about Android's security features and offerings. Secure kernel Linux has evolved as a trusted platform over the years, and Android has leveraged this fact using it as its kernel. The user-based permission model of Linux has, in fact, worked well for Android. As mentioned earlier, there is a lot of specific code built into the Linux kernel. With each Android version release, the kernel version has also changed. The following table shows Android versions and their corresponding kernel versions: Android version Linux kernel version 1 2.6.25 1.5 2.6.27 1.6 2.6.29 2.2 2.6.32 2.3 2.6.35 3.0 2.6.36 4.0 3.0.1 4.1 3.0.31 4.2 3.4.0 4.2 3.4.39 4.4 3.8 5.0 3.16.1 6.0 3.18.1 7.0 4.4.1 The permission model As shown in the following screenshot, any Android application must be granted permissions to access sensitive functionality, such as the internet, dialer, and so on, by the user. This provides an opportunity for the user to know in advance which functions on the device is being accessed by the application. Simply put, it requires the user's permission to perform any kind of malicious activity (stealing data, compromising the system, and so on). This model helps the user to prevent attacks, but if the user is unaware and gives away a lot of permissions, it leaves them in trouble (remember, when it comes to installing malware on any device, the weakest link is always the user). Until Android 6.0, users needed to grant the permissions during install time. Users had to either accept all the permissions or not install the application. But, starting from Android 6.0, users grant permissions to apps while the app is running. This new permission system also gives the user more control over the app's functionality by allowing the user to grant selective permissions. For example, a user can deny a particular app access to his location but provide access to the internet. The user can revoke the permissions at any time by going to the app's Settings screen. Application sandbox In Linux systems, each user is assigned a unique user ID (UID), and users are segregated so that one user cannot access the data of another user. However, all applications under a particular user are run with the same privileges. Similarly, in Android, each application runs as a unique user. In other words, a UID is assigned to each application and is run as a separate process. This concept ensures an application sandbox at the kernel level. The kernel manages the security restrictions between the applications by making use of existing Linux concepts, such as UID and GID. If an application attempts to do something malicious, say to read the data of another application, this is not permitted as the application does not have user privileges. Hence, the operating system protects an application from accessing the data of another application. Secure inter-process communication Android offers secure inter-process communication through which one's activity in an application can send messages to another activity in the same application or a different application. To achieve this, Android provides inter-process communication (IPC) mechanisms: intents, services, content providers, and so on. Application signing It is mandatory that all of the installed applications are digitally signed. Developers can place their applications in Google's Play Store only after signing the applications. The private key with which the application is signed is held by the developer. Using the same key, a developer can provide updates to their application, share data between the applications, and so on. Security-Enhanced Linux Security-Enhanced Linux (SELinux) is a security feature that was introduced in Android 4.3 and fully enforced in Android 5.0. Until this addition, Android security was based on Discretionary Access Control (DAC), which means applications can ask for permissions, and users can grant or deny those permissions. Thus, malware can create havoc on phones by gaining those permissions. But, SE Android uses Mandatory Access Control (MAC), which ensures that applications work in isolated environments. Hence, even if a user installs a malware app, the malware cannot access the OS and corrupt the device. SELinux is used to enforce MAC over all the processes, including the ones running with root privileges. SELinux operates on the principle of default denial: anything that is not explicitly allowed is denied. SELinux can operate in one of the two global modes: permissive mode, in which permission denials are logged but not enforced, and enforcing mode, in which denials are both logged and enforced. Full Disk Encryption With Android 6.0 Marshmallow, Google has mandated Full Disk Encryption (FDE) for most devices, provided that the hardware meets certain minimum standards. Encryption is the process of converting data into cipher text using a secret key. On Android devices, full disk encryption refers to the process of encrypting all user data using a secret key. This key is then encrypted by the lock screen PIN/pattern/password before being securely stored in a trusted location. Once a device is encrypted, all user-created data is automatically encrypted before writing it to disk, and all reads automatically decrypt data before returning it to the calling process. Full disk encryption in Android works only with an Embedded Multimedia Card (eMMC) and similar flash devices that present themselves to the kernel as block devices. Staring from Android 7.x, Google decided to shift the encryption feature from full-disk encryption to file-based encryption. In file-based encryption, different files are encrypted with different keys. By doing so, those files can be unlocked independently without requiring an entire partition to be decrypted at once. As a result of this, the system can now decrypt and use files needed to boot the system, and open notifications without having to wait until the user unlocks the phone. Trusted Execution Environment Trusted Execution Environment (TEE) is an isolated area (typically a separate microprocessor) intended to guarantee the security of data stored inside it, and also to execute code with integrity. The main processor on mobile devices is considered untrusted and cannot be used to store secret data (such as cryptographic keys). Hence, TEE is used specifically to perform such operations, and the software running on the main processor delegates any operations that require the use of secret data to the TEE processor. Thus we talked about the Linux Kernel layer, over which Android is built. We also talked about Android's security features and offerings and how Linux plays a role to secure Android OS. To learn more about methods for accessing the data stored on Android devices, read our book Practical Mobile Forensics - Third Edition. The kernel community attempting to make Linux more secure. Google open sources Filament – a physically based rendering engine for Android, Windows, Linux and macOS Google becomes a new platinum member of the Linux Foundation
Read more
  • 0
  • 0
  • 10020
article-image-6-common-use-cases-of-reverse-proxy-scenarios
Guest Contributor
05 Oct 2018
6 min read
Save for later

6 common use cases of Reverse Proxy scenarios

Guest Contributor
05 Oct 2018
6 min read
Proxy servers are used as intermediaries between a client and a website or online service. By routing traffic through a proxy server, users can disguise their geographic location and their IP address. Reverse proxies, in particular, can be configured to provide a greater level of control and abstraction, thereby ensuring the flow of traffic between clients and servers remains smooth. This makes them a popular tool for individuals who want to stay hidden online, but they are also widely used in enterprise settings, where they can improve security, allow tasks to be carried out anonymously, and control the way employees are able to use the internet. What is a Reverse Proxy? A reverse proxy server is a type of proxy server that usually exists behind the firewall of a private network. It directs any client requests to the appropriate server on the backend. Reverse proxies are also used as a means of caching common content and compressing inbound and outbound data, resulting in a faster and smoother flow of traffic between clients and servers. Furthermore, the reverse proxy can handle other tasks, such as SSL encryption, further reducing the load on web servers. There is a multitude of scenarios and use cases in which having a reverse proxy can make all the difference to the speed and security of your corporate network. By providing you with a point at which you can inspect traffic and route it to the appropriate server, or even transform the request, a reverse proxy can be used to achieve a variety of different goals. Load Balancing to route incoming HTTP requests This is probably the most familiar use of reverse proxies for many users. Load balancing involves the proxy server being configured to route incoming HTTP requests to a set of identical servers. By spreading incoming requests across these servers, the reverse proxies are able to balance out the load, therefore sharing it amongst them equally. The most common scenario in which load balancing is employed is when you have a website that requires multiple servers. This happens due to the volume of requests, which are too much for one server to handle efficiently. By balancing the load across multiple servers, you can also move away from an architecture that features a single point of failure. Usually, the servers will all be hosting the same content, but there are also situations in which the reverse proxy will also be retrieving specific information from one of a number of different servers. Provide security by monitoring and logging traffic By acting as the mediator between clients and your system’s backend, a reverse proxy server can hide the overall structure of your backend servers. This is because the reverse proxy will capture any requests that would otherwise go to those servers and handle them securely. A reverse proxy can also improve security by providing businesses with a point at which they can monitor and log traffic flowing through their network. A common use case in which a reverse proxy is used to bolster the security of a network would be the use of a reverse proxy as an SSL gateway. This allows you to communicate using HTTP behind the firewall without compromising your security. It also saves you the trouble of having to configure security for each server behind the firewall individually. A rotating residential proxy, also known as a backconnect proxy, is a type of proxy that frequently changes the IP addresses and connections that the user uses. This allows users to hide their identity and generate a large number of requests without setting alarms off. A reverse rotating residential proxy can be used to improve the security of a corporate network or website. This is because the servers in question will display the information for the proxy server while keeping their own information hidden from potential attackers. No need to install certificates on your backend servers with SSL Termination SSL termination process occurs when an SSL connection server ends, or when the traffic shifts between encrypted and unencrypted requests. By using a reverse proxy to handle any incoming HTTPS connections, you can have the proxy server decrypt the request, and then pass on the unencrypted request to the appropriate server. Taking this approach offers practical benefits. For example, it eliminates the need to install certificates on your backend servers. It also provides you with a single configuration point for managing SSL/TLS. Removing the need for your web servers to undertake this decryption means that you are also reducing the processing load on the server. Serve static content on behalf of backend servers Some reverse proxy servers can be configured to also act as web servers. Websites contain a mixture of dynamic content, which changes over time, and static content, which always remains the same. If you can configure your reverse proxy server to serve up static content on behalf of backend servers, you can greatly reduce the load, freeing up more power for dynamic content rendering. Alternatively, a reverse proxy can be configured to behave like a cache. This allows it to store and serve content that is frequently requested, thereby further reducing the load on backend servers. URL Rewriting before they go on to the backend servers Anything that a business can do to easily to improve their SEO score is worth considering. Without an investment in your SEO, your business or website will remain invisible to search engine users. With URL rewriting, you can compensate for any legacy systems you use, which produce URLs that are less than ideal for SEO. With a reverse proxy server, the URLs can be automatically reformatted before they are passed on to the backend servers. Combine Different Websites into a Single URL Space It is often desirable for a business to adopt a distributed architecture whereby different functions are handled by different components. With a reverse proxy, it is easy to route a single URL to a multitude of components. To anyone who uses your URL, it will simply appear as if they are moving to another page on the website. In fact, each page within that URL might actually be connecting to a completely different backend service. This is an approach that is widely used for web service APIs. To sum up, the primary function of a reverse proxy is load balancing, ensuring that no individual backend server becomes inundated with more traffic or requests than it can handle. However, there are a number of other scenarios in which a reverse proxy can potentially offer enormous benefits. About the author Harold Kilpatrick is a cybersecurity consultant and a freelance blogger. He's currently working on a cybersecurity campaign to raise awareness around the threats that businesses can face online. Read Next HAProxy introduces stick tables for server persistence, threat detection, and collecting metrics How to Configure Squid Proxy Server Acting as a proxy (HttpProxyModule)
Read more
  • 0
  • 0
  • 27507

article-image-what-is-statistical-analysis-and-why-does-it-matter
Sugandha Lahoti
02 Oct 2018
6 min read
Save for later

What is Statistical Analysis and why does it matter?

Sugandha Lahoti
02 Oct 2018
6 min read
As a data developer, the concept or process of data analysis may be clear to your mind. However, although there happen to be similarities between the art of data analysis and that of statistical analysis, there are important differences to be understood as well. This article is taken from the book Statistics for Data Science by James D. Miller. This book takes you through an entire journey of statistics, from knowing very little to becoming comfortable in using various statistical methods for data science tasks. In this article, we've broken things into the following topics: What is statistical analysis and it's best practices? How to establish the nature of data? What is Statistical analysis? Some in the study of statistics sometimes describe statistical analysis as part of statistical projects that involves the collection and scrutiny of a data source in an effort to identify trends within the data. With data analysis, the goal is to validate that the data is appropriate for a need, and with statistical analysis, the goal is to make sense of, and draw some inferences from, the data. There is a wide range of possible statistical analysis techniques or approaches that can be considered. How to perform a successful statistical analysis It is worthwhile to mention some key points, dealing with ensuring a successful (or at least productive) statistical analysis effort. As soon as you can, decide on your goal or objective. You need to know what the win is, that is, what the problem or idea is that is driving the analysis effort. In addition, you need to make sure that, whatever is driving the analysis, the result obtained must be measurable in some way. This metric or performance indicator must be identified early. Identify key levers. This means that once you have established your goals and a way to measure performance towards obtaining those goals, you also need to find out what has an effect on the performance towards obtaining each goal. Conduct a thorough data collection. Typically, the more data the better, but in the absence of quantity, always go with quality. Clean your data. Make sure your data has been cleaned in a consistent way so that data issues would not impact your conclusions. Model, model, and model your data. Modeling drives modeling. The more you model your data, the more questions you'll have asked and answered, and the better results you'll have. Take time to grow in your statistical analysis skills. It's always a good idea to continue to evolve your experiences and style of statistical analysis. The way to improve is to do it. Another approach is to remodel the data you may have on hand for other projects to hone your skills. Optimize and repeat. As always, you need to take the time for standardizing, following proven practices, using templates, and testing and documenting your scripts and models, so that you can re-use your best efforts over and over again. You will find that this time will be well spent and even your better efforts will improve with use. Finally, share your work with others! The more eyes, the better the product. Some interesting advice on ensuring success with statistical projects includes the following quote: It's a good idea to build a team that allows those with an advanced degree in statistics to focus on data modeling and predictions, while others in the team-qualified infrastructure engineers, software developers and ETL experts-build the necessary data collection infrastructure, data pipeline and data products that enable streaming the data through the models and displaying the results to the business in the form of reports and dashboards.                                                                                                            - G Shapira, 2017 Establishing the nature of data When asked about the objectives of statistical analysis, one often refers to the process of describing or establishing the nature of a data source. Establishing the nature of something implies gaining an understanding of it. This understanding can be found to be both simple as well as complex. For example, can we determine the types of each of the variables or components found within our data source; are they quantitative, comparative, or qualitative? A more advanced statistical analysis aims to identify patterns in data; for example, whether there is a relationship between the variables or whether certain groups are more likely to show certain attributes than others. Exploring the relationships presented in data may appear to be similar to the idea of identifying a foreign key in a relational database, but in statistics, relationships between the components or variables are based upon correlation and causation. Further, establishing the nature of a data source is also, really, a process of modeling that data source. During modeling, the process always involves asking questions such as the following (in an effort establish the nature of the data): What? Some common examples of this (what) are revenue, expenses, shipments, hospital visits, website clicks, and so on. In the example, we are measuring quantities, that is, the amount of product that is being moved (sales). Why? This (why) will typically depend upon your project's specific objectives, which can vary immensely. For example, we may want to track the growth of a business, the activity on a website, or the evolution of a selected product or market interest. Again, in our current transactional data example, we may want to identify over- and under-performing sales types, and determine if, new or repeat customers provide more or fewer sales? How? The how will most likely be over a period of time (perhaps a year, month, week, and so on) and then by some other related measure, such as a product, state, region, reseller, and so on. Within our transactional data example, we've focused on the observation of quantities by sale type. Another way to describe establishing the nature of your data is adding context to it or profiling it. In any case, the objective is to allow the data consumer to better understand the data through visualization. Another motive for adding context or establishing the nature of your data can be to gain a new perspective on the data. In this article, we explored the purpose and process of statistical analysis and listed the steps involved in a successful statistical analysis. Next, to learn about statistical regression and why it is important to data science, read our book Statistics for Data Science. Estimating population statistics with Point Estimation. Why You Need to Know Statistics To Be a Good Data Scientist. Why choose IBM SPSS Statistics over R for your data analysis project.
Read more
  • 0
  • 0
  • 6352

article-image-9-reasons-to-choose-agile-methodology-for-mobile-app-development
Guest Contributor
01 Oct 2018
6 min read
Save for later

9 reasons to choose Agile Methodology for Mobile App Development

Guest Contributor
01 Oct 2018
6 min read
As mobile application development is becoming the new trend in the business world, the competition to enter the mobile market has intensified. Business leaders are hunting for every possible way to reach the market at the earliest and outshine the competition. Albeit, without compromising on the quality and quantity of the opportunities in the market. One such method prevalent in the market is the adoption of Agile methodology. Agile mobile app development methodology The Agile methodology is an incremental and iterative mobile application development approach, where the complete app development process cycle is divided into multiple sub-modules, considered as mini-projects. Every submodule is assigned to an individual team and subjected to the complete development cycle, right from designing to development, testing, and delivery. Image Source: AppInventiv Benefits of Agile Mobile App Development Approach The technology, due to this iterative nature, is highly recommended in the market. Here are 9 reasons to choose Agile for your mobile app development project Faster Development In the case of the Agile model, the complete mobile app project is divided into smaller modules which are treated like independent sub-projects. These sub-projects are handled by different teams independently, with little-to-no dependencies on each other. Besides, everyone has a clear idea of what their contribution will be and the associated resources and deadline, which accelerates the development process. Every developer puts their best efforts into completing their part in the mobile app development project, an outcome of which is a more streamlined app development process with faster delivery. Reduced Risks With the changing market needs and trends, it is quite risky to launch your own application. Many times, it happens that the market data you have taken into consideration while developing your app gets outdated by the time you launch your app. The outcome is poor ROI and a shady future in the market. Agile, in this situation, is a tool that allows you to take calculated risks and improve your project’s market scope. In other words, the methodology enables a mobile app development company to make changes to any particular sprint without disturbing the code of the previous sprints. Thus, making their application more suitable for the market. Better Quality Agile, unlike the traditional app development models, does not test the app at the end of the development phase. Rather, it fosters testing of every single module at the primitive level. This reduces the risk of encountering a bug at the time of quality testing of the complete project. It also helps mobile app developers to inspect the app elements at every stage of the development process and make adjustments as per the requirement, eventually helping in delivering a higher quality of services. Seamless Project Management By transforming the complete app development project into multiple individual modules, the Agile methodology provides you with the facility to manage your project easily. You can easily assign the tasks to different teams and reduce the dependencies and discussions at the inter-team level. You can also keep a record of the activities performed on each mini project and in this way, determine if something is missing or not working as per the proposed plan. Besides, you can check the productivity of every individual and put your efforts into making/hiring more efficient experts. Enhanced Customer Experience The Agile mobile app development approach puts a strong emphasis on people and collaboration, which renders the development team with an opportunity to work closely with their clients and understand their vision. Besides, the projects are delivered to the clients in the form of multiple sprints, which brings transparency to the process. It also enables the team to determine if both parties are on the same page and if not, allow them to make the required changes before proceeding further. This lessens the chances of launching app that does not fulfill to the idea behind, and therefore, ensures enhanced customer experience. Lower Development Cost Since every step is well planned, executed, and delivered, you can easily calculate the cost of making an app and thus, justify your app budget. Besides, if at any stage of development, you feel the need to raise the app budget, you can easily do it with Agile methodology. In this way, you can avoid leaving the project incomplete due to lack of required resources and funds. Customization The agile mobile app development approach also provides developers with an opportunity to customize their development process. There’ are no rules to create an app in a particular way. Experts can look for different ways to develop and launch the mobile app, and integrate the cutting-edge technologies and tools into the process. In a nutshell, the Agile process enables developers to customize the development timeline as per their choice and deliver a user-centric solution. Higher ROI The Agile methodology lets the mobile app development companies enter the market with the most basic app (MVP) and update the app with each iteration. This makes it easier for the app owners to test their idea, gather required data and insights, build a brand presence, and thus, deliver the best features as per the customer needs and market trends. This aids the app owners and associated mobile app development company to take the right decision for gaining better ROI in the market. Earlier Market Reach By dividing the complete app project into submodules, the Agile mobile app development approach encourages the team to deliver every module with the stipulated deadline. Lagging behind the deadlines is not practically possible. The outcome of this is that the complete app project is designed and delivered on-time or even earlier to it, which means earlier market reach. The Agile methodology can do wonders for mobile app development. However, it is required that everyone is well aware of the end goal and contributes to this approach wisely. Only then can you enjoy all of the aforementioned benefits. With this, are you ready to go Agile? Are you ready to develop a mobile app with an Agile mobility strategy?. Author Bio Holding a Bachelor’s degree in Technology and 2 years of work experience in a mobile app development company, Bhupinder is focused on making technology digestible to all. Being someone who stays updated with the latest tech trends, she’s always armed to write and spread the knowledge. When not found writing, you will find her answering on Quora while sipping coffee.
Read more
  • 0
  • 0
  • 26430
article-image-is-youtubes-ai-algorithm-evil
Amarabha Banerjee
30 Sep 2018
6 min read
Save for later

Is YouTube's AI Algorithm evil?

Amarabha Banerjee
30 Sep 2018
6 min read
YouTube is at the center of content creation, content distribution, and advertising activities for some time now. The impact of YouTube can be estimated from the 1.8 billion YouTube users worldwide. While the YouTube video hosting concept has been a great success story for content creators, the video viewing and recommendation model has been in the middle of a brewing controversy lately. The Controversy Logan Paul was already a top rated YouTube star when he stumbled across a hanging dead body in a Japanese forest which is famous as a suicide spot. After the initial shock and awe, Logan Paul seemed quite amused and commented “Dude, his hands are purple,” then he turned to his friends and giggled. “You ever stand next to a dead guy?”. This particular instance was a shocking moment for YouTubers all across the globe. Disapproving reactions had poured in and the video was taken down 24 hours later by YouTube. In those 24 hours, the video managed to garner 6 million views. Even after the furious backlash, users complained that they were still seeing recommendations of Logan Paul’s videos. That brought the emphasis back on the recommendation system that YouTube uses. YouTube Video Recommendation Back in 2005, when YouTube first started out, it had a uniform homepage for all users. This meant that every YouTube user would see the same homepage and the creators who would feature there, would get a huge boost in their viewership. Their selection was based on their subscriber count, views and user engagement metrics e.g. likes, comments, shares etc. This inspired other users to become creators and start contributing content to become a part of the YouTube family. In 2006, YouTube was bought by Google. Their policies and homepage started evolving gradually. As ads started showing on YouTube videos, the scenario changed quite quickly. Also, with the rapid rise in the number of users, Google had thought it to be a good idea to curate the homepage as per each user’s watch history, subscriptions, and likes. This was a good move in principle since it helped the users to see what they wanted to see. As a part of their next level innovation, a machine learning model was created to suggest or recommend videos to users. The goal of this deep neural network based recommendation engine was to increase watch time of every video so that users stay longer on the platform. What did it change and How When Youtube’s machine learning algorithm shows a few videos in your feed as “Recommended for you”, it predicts what you want to see from your watch history and watch history of similar users. If you interact with any of these videos and watch it for a certain amount of time, the recommendation engine considers it as a success and starts curating a list based on your interactions with its suggested videos. The more data it gathers about your choices and watch history, the more confident it becomes of its own video decisions. The major goal of Youtube’s recommendation engine is to attract your attention and get you hooked to the platform to get more watch time. More watch time means more revenue and more scope for targeted ads. What this changes, is the fundamental concept of choice and the exercising of user discretion. The moment the YouTube Algorithm considers watch time as the most important metric to recommend videos to you, less importance goes into the organic interactions on YouTube, which includes liking, commenting and subscribing to videos and channels. Users get to see video recommendations based on the YouTube Algorithm’s user understanding and its goal of maximizing watch time, with less importance given to user choices. Distorted Reality and YouTube This attention maximizing model is the fundamental working mechanism of mostly all social media networks. But YouTube has not been implicated in the accusation of distorting reality and spreading the fake news as much as Facebook has been in mainstream media. But times are changing and so are the viewpoints related to YouTube’s influence on the global population and its ability to manipulate important public opinion. Guillaume Chaslot, a 36-year-old French computer programmer with a Ph.D. in artificial intelligence, was one of those engineers who was in the core team to develop and perfect the YouTube algorithm. In his own words “YouTube is something that looks like reality, but it is distorted to make you spend more time online. The recommendation algorithm is not optimizing for what is truthful, or balanced, or healthy for democracy.” Chaslot explains that the algorithm never stays the same. It is constantly changing the weight it gives to different signals; the viewing patterns of a user, for example, or the length of time a video is watched before someone clicks away.” Chaslot was fired by Google in 2013 over performance issues. His claim was that he wanted to bring about a change in the approach of the YouTube algorithm to make it more aligned with democratic values instead of being devoted to just increasing the watch time. Where are we headed I am not qualified or righteous enough to answer the direct question - is YouTube good or bad. YouTube creates opportunities for millions of creators worldwide to showcase their talent and present it to a global audience without worrying about country or boundaries. This itself is a huge power for an internet application. But the crucial point to remember here is whether YouTube is using this power to just make the users glued to the screen. Do they really care if you are seeing divisive content or prejudiced flat earther conspiracies as recommended videos? The algorithm can be tweaked to include parameters which will remove unintended bias such as whether a video is propagating fake news or influencing voters minds in an unlawful way. But that is near impossible as machines lack morality or empathy or even common sense. To incorporate humane values such as honesty and morality into an AI system is like creating an AI that is more human than a machine. This is why machine augmented human intelligence will play a more and more crucial role in the near future. The possibilities are endless, be it good or bad. Whether we progress or digress, might not be in our hands anymore. But what might be in our hands is to come together to put effective checkpoints to identify and course correct scenarios where algorithms rule wild. Sex robots, artificial intelligence, and ethics: How desire shapes and is shaped by algorithms Like newspapers, Google algorithms are protected by the First amendment California replaces cash bail with algorithms
Read more
  • 0
  • 0
  • 9369

article-image-what-is-core-ml
Savia Lobo
28 Sep 2018
5 min read
Save for later

What is Core ML?

Savia Lobo
28 Sep 2018
5 min read
Introduced by Apple, CoreML is a machine learning framework that powers the iOS app developers to integrate machine learning technology into their apps. It supports natural language processing (NLP), image analysis, and various other conventional models to provide a top-notch on-device performance with minimal memory footprint and power consumption. This article is an extract taken from the book Machine Learning with Core ML written by Joshua Newnham. In this article, you will get to know the basics of what CoreML is and its typical workflow. With the release of iOS 11 and Core ML, performing inference is just a matter of a few lines of code. Prior to iOS 11, inference was possible, but it required some work to take a pre-trained model and port it across using an existing framework such as Accelerate or metal performance shaders (MPSes). Accelerate and MPSes are still used under the hood by Core ML, but Core ML takes care of deciding which underlying framework your model should use (Accelerate using the CPU for memory-heavy tasks and MPSes using the GPU for compute-heavy tasks). It also takes care of abstracting a lot of the details away; this layer of abstraction is shown in the following diagram: There are additional layers too; iOS 11 has introduced and extended domain-specific layers that further abstract a lot of the common tasks you may use when working with image and text data, such as face detection, object tracking, language translation, and named entity recognition (NER). These domain-specific layers are encapsulated in the Vision and natural language processing (NLP) frameworks; we won't be going into any details of these frameworks here, but you will get a chance to use them in later chapters: It's worth noting that these layers are not mutually exclusive and it is common to find yourself using them together, especially the domain-specific frameworks that provide useful preprocessing methods we can use to prepare our data before sending to a Core ML model. So what exactly is Core ML? You can think of Core ML as a suite of tools used to facilitate the process of bringing ML models to iOS and wrapping them in a standard interface so that you can easily access and make use of them in your code. Let's now take a closer look at the typical workflow when working with Core ML. CoreML Workflow As described previously, the two main tasks of a ML workflow consist of training and inference. Training involves obtaining and preparing the data, defining the model, and then the real training. Once your model has achieved satisfactory results during training and is able to perform adequate predictions (including on data it hasn't seen before), your model can then be deployed and used for inference using data outside of the training set. Core ML provides a suite of tools to facilitate getting a trained model into iOS, one being the Python packaged released called Core ML Tools; it is used to take a model (consisting of the architecture and weights) from one of the many popular packages and exporting a .mlmodel file, which can then be imported into your Xcode project. Once imported, Xcode will generate an interface for the model, making it easily accessible via code you are familiar with. Finally, when you build your app, the model is further optimized and packaged up within your application. A summary of the process of generating the model is shown in the following diagram:   The previous diagram illustrates the process of creating the .mlmodel;, either using an existing model from one of the supported frameworks, or by training it from scratch. Core ML Tools supports most of the frameworks, either internal or as third party plug-ins, including  Keras, Turi, Caffe, scikit-learn, LibSVN, and XGBoost frameworks. Apple has also made this package open source and modular for easy adaption for other frameworks or by yourself. The process of importing the model is illustrated in this diagram: In addition; there are frameworks with tighter integration with Core ML that handle generating the Core ML model such as Turi Create, IBM Watson Services for Core ML, and Create ML. We will be introducing Create ML in chapter 10; for those interesting in learning more about Turi Create and IBM Watson Services for Core ML then please refer to the official webpages via the following links: Turi Create; https://github.com/apple/turicreate IBM Watson Services for Core ML; https://developer.apple.com/ibm/ Once the model is imported, as mentioned previously, Xcode generates an interface that wraps the model, model inputs, and outputs. Thus, in this post, we learned about the workflow of training and how to import a model. If you've enjoyed this post, head over to the book  Machine Learning with Core ML  to delve into the details of what this model is and what Core ML currently supports. Emotional AI: Detecting facial expressions and emotions using CoreML [Tutorial] Build intelligent interfaces with CoreML using a CNN [Tutorial] Watson-CoreML: IBM and Apple’s new machine learning collaboration project
Read more
  • 0
  • 0
  • 7685