Tech Guides

2018 is the year of graph databases. Here's why.

Amey Varangaonkar
04 May 2018
5 min read
With the explosion of data, businesses are looking to innovate as they connect their operations to a whole host of different technologies. The need for consistency across all data elements is now stronger than ever. That’s where graph databases come in handy. Because they allow for a high level of flexibility when it comes to representing your data and also while handling complex interactions within different elements, graph databases are considered by many to be the next big trend in databases. In this article, we dive deep into the current graph database scene, and list out 3 top reasons why graph databases will continue to soar in terms of popularity in 2018.

What are graph databases, anyway?

Simply put, graph databases are databases that follow the graph model. What is a graph model, then? In mathematical terms, a graph is simply a collection of nodes, with different nodes connected by edges. Each node contains some information about the graph, while edges denote the connection between the nodes. How are graph databases different from the relational databases, you might ask? Well, the key difference between the two is the fact that graph data models allow for more flexible and fine-grained relationships between data objects, as compared to relational models. There are some more differences between the graph data model and the relational data model, which you should read through for more information. Often, you will see that graph databases are without a schema. This allows for a very flexible data model, much like the document or key/value store database models. A unique feature of the graph databases, however, is that they also support relationships between the data objects like a relational database. This is useful because it allows for a more flexible and faster database, which can be invaluable to your project which demands a quicker response time.

(Image courtesy: DB-Engines)

The rise in popularity of the graph database models over the last 5 years has been stunning, but not exactly surprising. If we were to drill down the 3 key factors that have propelled the popularity of graph databases to a whole new level, what would they be? Let’s find out.

Major players entering the graph database market

About a decade ago, the graph database family included just Neo4j and a couple of other less-popular graph databases. More recently, however, all the major players in the industry such as Oracle (Oracle Spatial and Graph), Microsoft (Graph Engine), SAP (SAP Hana as a graph store) and IBM (Compose for JanusGraph) have come up with graph offerings of their own. The most recent entrant to the graph database market is Amazon, with Amazon Neptune announced just last year. According to Andy Jassy, CEO of Amazon Web Services, graph databases are becoming a part of the growing trend of multi-model databases. Per Jassy, these databases are finding increased adoption on the cloud as they support a myriad of useful data processing methods. The traditional over-reliance on relational databases is slowly breaking down, he says.

Rise of the Cypher Query Language

With graph databases slowly getting mainstream recognition and adoption, the major companies have identified the need for a standard query language for all graph databases. Similar to SQL, Cypher has emerged as a standard and is a widely-adopted alternative to write efficient and easy to understand graph queries. As of today, the Cypher Query Language is used in popular graph databases such as Neo4j, SAP Hana, Redis graph and so on.
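To make the node-and-edge model and a Cypher query concrete, here is a minimal Python sketch that runs a Cypher statement through the official neo4j driver. The connection details, credentials, and the Person/FRIEND_OF schema are illustrative assumptions for this example, not details from the article.

    from neo4j import GraphDatabase

    # Assumed local Neo4j instance and credentials -- adjust for your own setup.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    def friends_of(name):
        # A declarative Cypher query: match Person nodes connected by FRIEND_OF edges.
        query = (
            "MATCH (p:Person {name: $name})-[:FRIEND_OF]->(friend:Person) "
            "RETURN friend.name AS friend"
        )
        with driver.session() as session:
            return [record["friend"] for record in session.run(query, name=name)]

    if __name__ == "__main__":
        print(friends_of("Alice"))
        driver.close()

The query describes the pattern of nodes and relationships to match rather than a sequence of joins, which is the flexibility the graph model is praised for above.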
The OpenCypher project, which develops and maintains Cypher, has also released Cypher for popular Big Data frameworks like Apache Spark. Cypher’s popularity has risen tremendously over the last few years. The primary reason for this is that, like SQL, Cypher is declarative: users state what results they want from their graph data without having to spell out how the query should be executed.

Finding critical real-world applications

Graph databases were in the news as early as 2016, when the Panama Papers leak was unravelled with the help of Neo4j and Linkurious, a data visualization tool. In more recent times, graph databases have also found increased applications in online recommendation engines, as well as for performing tasks that include fraud detection and managing social media. Facebook’s search app also uses graph technology to map social relationships. Graph databases are also finding applications in virtual assistants to drive conversations - eBay’s virtual shopping assistant is an example. Even NASA uses the knowledge graph architecture to find critical data.

What next for graph databases?

With growing adoption of graph databases, we expect graph-based platforms to soon become the foundational elements of many corporate tech stacks. The next focus area for these databases will be practical implementations such as graph analytics and building graph-based applications. The rising number of graph databases would also mean more competition, and that is a good thing - competition will bring more innovation, and enable incorporation of more cutting-edge features. With a healthy and steadily growing community of developers, data scientists and even business analysts, this evolution may be on the cards sooner than we might expect.

Amazon Neptune: A graph database service for your applications
When, why and how to use Graph analytics for your big data

The future of Python: 3 experts' views

Richard Gall
27 Mar 2018
7 min read
Python is the fastest growing programming language on the planet. This year’s Stack Overflow survey produces clear evidence that it is growing at an impressive rate. And it’s not really that surprising - versatile, dynamic, and actually pretty easy to learn, it’s a language that is accessible and powerful enough to solve problems in a range of fields, from statistics to building APIs. But what does the future hold for Python? How will it evolve to meet the needs of its growing community of engineers and analysts? Read the insights from 3 Python experts on what the future might hold for the programming language, taken from Python Interviews, a book that features 20 conversations with leading figures from the Python community. In the future, Python will spawn other more specialized languages Steve Holden (@HoldenWeb), CTO of Global Stress Index and former chairman and director of The PSF: I'm not really sure where the language is going. You hear loose talk of Python 4. To my mind though, Python is now at the stage where it's complex enough. Python hasn't bloated in the same way that I think the Java environment has. At that maturity level, I think it's rather more likely that Python's ideas will spawn other, perhaps more specialized, languages aimed at particular areas of application. I see this as fundamentally healthy and I have no wish to make all programmers use Python for everything; language choices should be made on pragmatic grounds. I've never been much of a one for pushing for change. Enough smart people are thinking about that already. So mostly I lurk on Python-Dev and occasionally interject a view from the consumer side, when I think that things are becoming a little too esoteric. The needs of the Python community are going to influence where the language goes in future Carol Willing (@WillingCarol), former director of The Python Foundation, core developer of CPython, and Research Software Engineer at Project Jupyter. I think we're going to continue to see growth in the scientific programming part of Python. So things that support the performance of Python as a language and async stability are going to continue to evolve. Beyond that, I think that Python is a pretty powerful and solid language. Even if you stopped development today, Python is a darn good language. I think that the needs of the Python community are going to feed back into Python and influence where the language goes. It's great that we have more representation from different groups within the core development team. Smarter minds than mine could provide a better answer to your question. I'm sure that Guido has some things in mind for where he wants to see Python go. Mobile development has been an Achilles' heel for Python for a long time. I'm hoping that some of the BeeWare stuff is going to help with the cross-compilation. A better story in mobile is definitely needed. But you know, if there's a need then Python will get there. I think that the language is going to continue to move towards the stuff that's in Python 3. Some big code bases, like Instagram, have now transitioned from Python 2 to 3. While there is much Python 2.7 code still in production, great strides have been made by Instagram, as they shared in their PyCon 2017 keynote. There's more tooling around Python 3 and more testing tools, so it's less risky for companies to move some of their legacy code to Python 3, where it makes business sense to. 
It will vary by company, but at some point, business needs, such as security and maintainability, will start driving greater migration to Python 3. If you're starting a new project, then Python 3 is the best choice. New projects, especially when looking at microservices and AI, will further drive people to Python 3. Organizations that are building very large Python codebases are adopting type annotations to help new developers Barry Warsaw (@pumpichank), member of the Python Foundation team at LinkedIn, former project leader of GNU Mailman: In some ways it's hard to predict where Python is going. I've been involved in Python for 23 years, and there was no way I could have predicted in 1994 what the computing world was going to look like today. I look at phones, IoT (Internet of things) devices, and just the whole landscape of what computing looks like today, with the cloud and containers. It's just amazing to look around and see all of that stuff. So there's no real way to predict what Python is going to look like even five years from now, and certainly not ten or fifteen years from now. I do think Python's future is still very bright, but I think Python, and especially CPython, which is the implementation of Python in C, has challenges. Any language that's been around for that long is going to have some challenges. Python was invented to solve problems in the 90s and the computing world is different now and is going to become different still. I think the challenges for Python include things like performance and multi-core or multi-threading applications. There are definitely people who are working on that stuff and other implementations of Python may spring up like PyPy, Jython, or IronPython. Aside from the challenges that the various implementations have, one thing that Python has as a language, and I think this is its real strength, is that it scales along with the human scale. For example, you can have one person write up some scripts on their laptop to solve a particular problem that they have. Python's great for that. Python also scales to, let's say, a small open source project with maybe 10 or 15 people contributing. Python scales to hundreds of people working on a fairly large project, or thousands of people working on massive software projects. Another amazing strength of Python as a language is that new developers can come in and learn it easily and be productive very quickly. They can pull down a completely new Python source code for a project that they've never seen before and dive in and learn it very easily and quickly. There are some challenges as Python scales on the human scale, but I feel like those are being solved by things like the type annotations, for example. On very large Python projects, where you have a mix of junior and senior developers, it can be a lot of effort for junior developers to understand how to use an existing library or application, because they're coming from a more statically-typed language. So a lot of organizations that are building very large Python codebases are adopting type annotations, maybe not so much to help with the performance of the applications, but to help with the onboarding of new developers. I think that's going a long way in helping Python to continue to scale on a human scale. To me, the language's scaling capacity and the welcoming nature of the Python community are the two things that make Python still compelling even after 23 years, and will continue to make Python compelling in the future. 
I think if we address some of those technical limitations, which are completely doable, then we're really setting Python up for another 20 years of success and growth.
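As a small illustration of the type annotations described above, here is a hedged Python sketch; the Order class and function are invented purely for this example, and a checker such as mypy can use the annotations to catch mismatched calls before runtime.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Order:
        order_id: int
        total: float
        coupon: Optional[str] = None  # a coupon code, or None if no coupon was used

    def apply_discount(order: Order, percent: float) -> Order:
        """Return a new Order with the percentage discount applied."""
        discounted = order.total * (1 - percent / 100)
        return Order(order.order_id, round(discounted, 2), order.coupon)

    if __name__ == "__main__":
        print(apply_discount(Order(1, 100.0), 15.0))

The annotations change nothing at runtime; their value is exactly the onboarding benefit Warsaw describes, since a new developer (or a type checker) can read the expected shapes of the data without digging through the call sites.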

RxSwift Part 1: Where to Start? Beginning with Hot and Cold Observables

Darren Karl
09 Feb 2017
6 min read
In the earlier articles, we gave a short introduction to RxSwift and talked about the advantages of the functional aspect of Rx, by using operators and composing a stream of operations. In my journey to discover and learn Rx, I was drawn to it after finding various people talking about its benefits. After I finally bought into the idea that I wanted to learn it, I began to read the documentation. I was overwhelmed by how many objects, classes, or operators were provided. There were loads of various terminologies that I encountered in the documentation. The documentation was (and still is) there, but because I was still at page one, my elementary Rx vocabulary prevented me from actually being able to appreciate, maximize, and use RxSwift. I had to go through months of soaking in the documentation until it saturated in my brain and things finally clicked. I found that I wasn’t the only one who was experiencing this after talking with some of my RxSwift community members in Slack. This is the gap. RxSwift is a beautifully designed API (I’ll talk about why exactly, later), but I personally didn’t know how long it would take to go from my working non-Rx knowledge to slowly learning the well-designed tools that Rx provides. The problem wasn’t that the documentation was lacking, because it was sufficient. It was that while reading the documentation, I found that I didn't even know what questions to ask or which documentation answered what questions I had. What I did know was programming concepts in the context of application development, in a non-Rx way. What I wanted to discover was how things would be done in RxSwift, along with the thought processes that led to the design of elegant units of code, such as the various operators like flatMap or concatMap, or the units of code such as Subjects or Drivers. This article aims to walk you through real programming situations that software developers encounter, while gradually introducing the Rx concepts that can be used. This article assumes that you’ve read through the last two articles on RxSwift I’ve written, which are linked above, and that you’ve found and read some of the documentation but don’t know where to start. It also assumes that you’re familiar with how network calls or database queries are made and how to wrap them using Rx.

A simple queuing application

Let’s start with something simple, such as a mobile application, for queuing. We can have multiple queues, which contain zero to many people in order. Let’s say that we have the following code that performs a network query to get the queue data from your REST API. We assume that these are network requests wrapped using Observable.create():

    private func getQueues() -> Observable<[Queue]>
    private func getPeople(in queue: Queue) -> Observable<[Person]>
    private var disposeBag = DisposeBag()

An example of the Observable code for getting the queue data is available here.

Where do I write my subscribe code?
Initially, a developer might write the following code in the viewDidLoad() method and bind it to some UITableView:

    override func viewDidLoad() {
        getQueues()
            .subscribeOn(ConcurrentDispatchQueueScheduler(queue: networkQueue))
            .observeOn(MainScheduler.instance)
            .bindTo(tableView.rx.items(cellIdentifier: "Cell")) { index, model, cell in
                cell.textLabel?.text = model
            }
            .addDisposableTo(disposeBag)
    }

However, if the getQueues() observable code loads the data from a cold observable network call, then, by definition, the cold observable will only perform the network call once during viewDidLoad(), load the data into the views, and it is done. The table view will not update in case the queue is updated by the server, unless the view controller gets disposed and viewDidLoad() is performed again. Note that should the network call fail, we can use the catchError() operator right after and swap in a database query or a cache instead, assuming we’ve persisted the queue data through a file or database. This way, we’re assured that this view controller will always have data to display.

Introduction to cold and hot observables

By cold observable, we mean that the observable code (that is, the network call to get the data) will only begin emitting items on subscription (which is currently in viewDidLoad). This is the difference between a hot and a cold observable: hot observables can be emitting items even when there are no observers subscribed, while cold observables will only run once an observer is subscribed. Examples of cold observables are things you’ve wrapped using Observable.create(), while examples of hot observables are things like UIButton.rx.tap or UITextField.rx.text, which can be emitting items such as Void for a button press or String for a text field, even when there aren’t any observers subscribed to them. Inherently, we are wrong to use a cold observable here because its definition will simply not meet the demands of our application.

A quick fix might be to write it in viewWillAppear

Going back to our queuing example, one could write it in the viewWillAppear() life cycle state of the app so that it will refresh its data every time the view appears. The problem that arises from this solution is that we perform a network query too frequently. Furthermore, every time viewWillAppear is called, note that a new subscription is added to the disposeBag. If, for some reason, the last subscription does not dispose (that is, it is still processing and emitting items and has not yet entered into the onComplete or onError state) and you’ve begun to perform a network query again, then it means that you have a possibility of a memory leak! Here’s an example of the (impractical) code that refreshes on every view. The code will work (it will refresh every time), but this isn’t good code:

    override func viewWillAppear(_ animated: Bool) {
        getQueues()
            .bindTo(tableView.rx.items(cellIdentifier: "Cell")) { index, model, cell in
                cell.textLabel?.text = model
            }
            .addDisposableTo(self.disposeBag)
    }

So, if we don’t want to query only one time and we don’t want to query too frequently, it begs the question, “How many times should the queries really be performed?” In part 2 of this article, we'll discuss what the right amount of querying is.

About the Author

Darren Karl Sapalo is a software developer, an advocate of UX, and a student taking up his Master's degree in Computer Science. He enjoyed developing games in his free time when he was twelve.
He finally finished with his undergraduate thesis on computer vision and took up some industry work with Apollo Technologies Inc. developing for both Android and iOS platforms.

5 Ways Artificial Intelligence is Transforming the Gaming Industry

Amey Varangaonkar
01 Dec 2017
7 min read
Imagine yourself playing a strategy game, like Age of Empires perhaps. You are in a world that looks real and you are pitted against the computer, and your mission is to protect your empire and defeat the computer, at the same time. What if you could create an army of soldiers who could explore the map and attack the enemies on their own, based on just a simple command you give them? And what if your soldiers could have real, unscripted conversations with you as their commander-in-chief to seek instructions? And what if the game’s scenes change spontaneously based on your decisions and interactions with the game elements, like a movie? Sounds too good to be true? It’s not far-fetched at all - thanks to the rise of Artificial Intelligence! The gaming industry today is a market worth over a hundred billion dollars. The Global Games Market Report says that about 2.2 billion gamers across the world are expected to generate an incredible $108.9 billion in game revenue by the end of 2017. As such, gaming industry giants are seeking newer and more innovative ways to attract more customers and expand their brands. While terms like Virtual Reality, Augmented Reality and Mixed Reality come to mind immediately as the future of games, the rise of Artificial Intelligence is an equally important stepping stone in making games smarter and more interactive, and as close to reality as possible. In this article, we look at the 5 ways AI is revolutionizing the gaming industry, in a big way! Making games smarter While scripting is still commonly used for control of NPCs (Non-playable character) in many games today, many heuristic algorithms and game AIs are also being incorporated for controlling these NPCs. Not just that, the characters also learn from the actions taken by the player and modify their behaviour accordingly. This concept can be seen implemented in Nintendogs, a real-time pet simulation video game by Nintendo. The ultimate aim of the game creators in the future will be to design robust systems within games that understand speech, noise and other sounds within the game and tweak the game scenario accordingly. This will also require modern AI techniques such as pattern recognition and reinforcement learning, where the characters within the games will self-learn from their own actions and evolve accordingly. The game industry has identified this and some have started implementing these ideas - games like F.E.A.R and The Sims are a testament to this. Although the adoption of popular AI techniques in gaming is still quite limited, their possible applications in the near-future has the entire gaming industry buzzing. Making games more realistic This is one area where the game industry has grown leaps and bounds over the last 10 years. There have been incredible advancements in 3D visualization techniques, physics-based simulations and more recently, inclusion of Virtual Reality and Augmented Reality in games. These tools have empowered game developers to create interactive, visually appealing games which one could never imagine a decade ago. Meanwhile, gamers have evolved too. They don’t just want good graphics anymore; they want games to resemble reality. This is a massive challenge for game developers, and AI is playing a huge role in addressing this need. Imagine a game which can interpret and respond to your in-game actions, anticipate your next move and act accordingly. 
Not the usual scripts where an action X will give a response Y, but an AI program that chooses the best possible alternative to your action in real-time, making the game more realistic and enjoyable for you. Improving the overall gaming experience Let’s take a real-world example here. If you’ve played EA Sports’ FIFA 17, you may be well-versed with their Ultimate Team mode. For the uninitiated, it’s more of a fantasy draft, where you can pick one of the five player choices given to you for each position in your team, and the AI automatically determines the team chemistry based on your choices. The team chemistry here is important, because the higher the team chemistry, the better the chances of your team playing well. The in-game AI also makes the playing experience better by making it more interactive. Suppose you’re losing a match against an opponent - the AI reacts by boosting your team’s morale through increased fan chants, which in turn affects player performances positively. Gamers these days pay a lot of attention to detail - this not only includes the visual appearance and the high-end graphics, but also how immersive and interactive the game is, in all possible ways. Through real-time customization of scenarios, AI has the capability to play a crucial role in taking the gaming experience to the next level. Transforming developer skills The game developer community have always been innovators in adopting cutting edge technology to hone their technical skills and creativity. Reinforcement Learning, a sub-set of Machine Learning, and the algorithm behind the popular AI computer program AlphaGo, that beat the world’s best human Go player is a case in point. Even for the traditional game developers, the rising adoption of AI in games will mean a change in the way games are developed. In an interview with Gamasutra, AiGameDev.com’s Alex Champandard says something interesting: “Game design that hinges on more advanced AI techniques is slowly but surely becoming more commonplace. Developers are more willing to let go and embrace more complex systems.” It’s safe to say that the notion of Game AI is changing drastically. Concepts such as smarter function-based movements, pathfinding, inclusion of genetic algorithms and rule-based AI such as fuzzy logic are being increasingly incorporated in games, although not at a very large scale. There are some implementation challenges currently as to how academic AI techniques can be brought more into games, but with time these AI algorithms and techniques are expected to embed more seamlessly with traditional game development skills. As such, in addition to knowledge of traditional game development tools and techniques, game developers will now have to also skill up on these AI techniques to make smarter, more realistic and more interactive games. Making smarter mobile games The rise of the mobile game industry today is evident from the fact that close to 50% of the game revenue in 2017 will come from mobile games - be it smartphones or tablets. The increasingly high processing power of these devices has allowed developers to create more interactive and immersive mobile games. However, it is important to note that the processing power of the mobile games is yet to catch up to their desktop counterparts, not to mention the lack of a gaming console, which is beyond comparison at this stage. 
To tackle this issue, mobile game developers are experimenting with different machine learning and AI algorithms to impart ‘smartness’ to mobile games, while still adhering to the processing power limits. Compare today’s mobile games to the ones 5 years back, and you’ll notice a tremendous shift in terms of the visual appearance of the games, and how interactive they have become. New machine learning and deep learning frameworks & libraries are being developed to cater specifically to the mobile platform. Google’s TensorFlow Lite and Facebook’s Caffe2 are instances of such development. Soon, these tools will come to developers’ rescue to build smarter and more interactive mobile games. In Conclusion Gone are the days when games were just about entertainment and passing time. The gaming industry is now one of the most profitable industries of today. As it continues to grow, the demands of the gaming community and the games themselves keep evolving. The need for realism in games is higher than ever, and AI has an important role to play in making games more interactive, immersive and intelligent. With the rate at which new AI techniques and algorithms are developing, it’s an exciting time for game developers to showcase their full potential. Are you ready to start building AI for your own games? Here are some books to help you get started: Practical Game AI Programming Learning game AI programming with Lua
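To ground the reinforcement learning idea mentioned above in something concrete, here is a minimal, illustrative tabular Q-learning sketch in Python. The tiny "patrol corridor" environment and every name in it are invented for this example and are not taken from any shipping game or engine.

    import random

    # Toy "patrol corridor": states 0..4; the NPC is rewarded for reaching position 4.
    N_STATES, N_ACTIONS = 5, 2          # actions: 0 = step left, 1 = step right
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

    q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

    def step(state, action):
        """Apply an action and return (next_state, reward, done)."""
        nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
        reached_goal = nxt == N_STATES - 1
        return nxt, (1.0 if reached_goal else 0.0), reached_goal

    def choose_action(state):
        """Epsilon-greedy selection with random tie-breaking."""
        if random.random() < EPSILON:
            return random.randrange(N_ACTIONS)
        best = max(q_table[state])
        return random.choice([a for a in range(N_ACTIONS) if q_table[state][a] == best])

    for episode in range(300):
        state, done = 0, False
        while not done:
            action = choose_action(state)
            nxt, reward, done = step(state, action)
            # Q-learning update: nudge the estimate toward reward + discounted best future value.
            q_table[state][action] += ALPHA * (reward + GAMMA * max(q_table[nxt]) - q_table[state][action])
            state = nxt

    # After training, "step right" should have the higher value in every non-terminal state.
    print([("right" if q[1] > q[0] else "left") for q in q_table[:-1]])

The character learns a behaviour purely from its own trial-and-error interactions rather than from a hand-written script, which is the shift the article describes.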

How to choose components to build a basic robot 

Prasad Ramesh
31 Dec 2018
10 min read
This post will show you how to choose a robot chassis kit with wheels and motors, a motor controller, and some power for the robot, talking through the trade-offs and things to avoid. This article is an excerpt from a book written by Danny Staple titled Learn Robotics Programming. In this book, you'll gain experience of building a next-generation collaboration robot.

Choosing a robot chassis kit

The chassis, like the controller, is a fundamental decision when making a robot. Although these can be self-made using 3D printing or toy hacking, the most simple place to start is with a robot chassis kit. These kits contain sets of parts to start off your robot build. A chassis can be changed, but it would mean rebuilding the robot. The internet has plenty of robot chassis kits around. Too many, so how do you choose one?

Size

Getting the size for a robot right matters too. Take a look at the following photos: Chassis 1 is 11 cm in size and just about fits a controller in it, but is too tiny. This will make it hard to build your robot. Squeezing the controller, power, and all the sensors into this small space would need skill and experience beyond the scope of a first robot build. Chassis 2 is Armbot. This large robot is 33 cm by 30 cm, with an arm reach of another 300 mm. It needs eight AA batteries, big motors, and a big controller. These add to the expense and may cause issues around power handling for a new builder. It has lots of space, but issues around weight and rigidity. Armbot is one of my most expensive robots, excluding the cost of the arm! Chassis 3 in the preceding image will fit the Pi, batteries, and sensor, but without being large and bulky. It is around the right dimensions, being between 15-20 cm long and 10-15 cm wide. Those that have split levels might be great for this, but only one or two levels, as three or four will make a robot top heavy and may cause it to topple. This has enough space and is relatively easy to build.

Wheel count

Some robot chassis kits have elaborate movement methods: legs, tank tracks, and tri-star wheels, to name a few. While these are fun and I encourage experimenting with them, this is not the place to start. So, I recommend a thoroughly sensible, if basic, wheels-on-motors version. There are kits with four-wheel drive and six-wheel drive. These can be quite powerful and will require larger motor controllers. They may also chew through batteries, and you are increasing the likelihood of overloading something. This also makes for trickier wiring, as seen in the following: Two-wheel drive is the simplest to wire in. It usually requires a third wheel for balance. This can be a castor wheel, roller ball, or just a Teflon sled for tiny robots. Two wheels are also the easiest to steer, avoiding some friction issues seen with robots using four or more wheels. Two wheels won't have the pulling power of four or six-wheel drive, but they are simple and will work. They are also less expensive.

Wheels and motors

A kit for a beginner should come with the wheels and the motors. The wheels should have simple non-pneumatic rubber tires. The most obvious style for inexpensive robots is shown in the following photo. There are many kits with these in them. The kit should also come with two motors, one for each wheel, and include the screws or parts to mount them onto the chassis. I recommend DC gear motors, as the gearing will keep the speed usable while increasing the mechanical pushing power the robot has.
Importantly, the motors should have the wires connected, like the first motor in the following photo. It is tricky to solder or attach these wires to the small tags on motors, and poorly attached ones do have a frustrating habit of coming off. The kits you will want to start with have these wires attached, as can be seen in the following. Another point to note is that where the motors are mounted, the kits should have some encoder wheels, and a slot to read them through. The encoder wheels are also known as odometry, tacho, or tachometer wheels.

Simplicity

You don't want to use a complex or hard-to-assemble kit for your first robot build. I've repeated this throughout with two-wheel drive, two motors with the wires soldered on, and steering clear of large robots or unusual and interesting locomotion systems, not because they are flawed, but because it's better to start simple. There is a limit to this: a robot kit that is a fully built and enclosed robot leaves little room for learning or experimentation and would actually require toy hacking skills to customize.

Cost

Related to simplicity is cost. Robot chassis kits can be bought from around $15, up to thousands of dollars. Larger and more complex robots tend to be far more costly. Here, I am aiming to keep to the less costly options or at least show where they are possible.

Conclusion

So, now you can choose a chassis kit, with two wheels and a castor, two motors with wires soldered on them, slots, and encoder wheels. These are not expensive, and widely available on popular internet shopping sites as "Smart Car Chassis," with terms like "2WD". The kit I'm working with looks like the preceding photo when assembled without the Raspberry Pi.

Choosing a motor controller

The next important part you'll need is a motor controller. Much like the motors, there are a number of trade-offs and considerations before buying one.

Integration level

Motor controllers can be as simple as motor power control driven from GPIO pins directly, such as the L298. This is the cheapest solution: a generic L298N motor controller can be connected to some of the IO pins on the Raspberry Pi. These are reasonably robust and have been easily available for a long time. They are flexible, but using parts like this will take up more space and need to be wired point to point, adding complexity to the build. Others are as complex as whole IO controller boards, many of which hide their own controller similar to an Arduino, along with motor control chips. Although the cheapest and most flexible ways are the most basic controllers, those with higher integration will reduce size, keep the pin usage count low (handy when you are connecting a lot to the robot), and may simplify your robot build. They often come integrated with a power supply too. Motor controllers can be bought as fully integrated Raspberry Pi hats, boards designed to fit exactly on top of a Raspberry Pi. These tend to have a high level of integration, as discussed before, but may come at the cost of flexibility, especially if you plan to use other accessories.

Pin usage

When buying a motor controller in Raspberry Pi hat form, pin usage is important. If we intend to use microphones (PCM/I2S), servo motors, and I2C and SPI devices with this robot, having boards that make use of these pins is less than ideal. Simply being plugged into pins doesn't mean they are all used, so only a subset of the pins is usually actually connected on a hat.
To get an idea of how pins on different boards interact on the Raspberry Pi, take a look at https://pinout.xyz, which lets you select Raspberry Pi boards and see the pin configuration for them. Controllers that use the I2C or serial bus are great because they make efficient use of pins and that bus can be shared. At the time of writing, PiConZero, the Stepper Motor Hat, and ZeroBorg all use I2C pins. The Full Function Stepper Motor Hat is able to control DC motors and servo motors, is cheap, and is widely available. It also has the pins available straight through on the top and an I2C connector on the side. It's designed to work with other hats and allow more expansion.

Size

The choice of this depends on the chassis, specifically the size of the motors you have. In simple terms, the larger your chassis, the larger a controller you will need. The power handling capacity of a motor controller is specified in amps. For a robot like the kit shown earlier, around 1 to 1.5 amps per channel is good. The consequence of too low a rating can be disaster, resulting in a robot that barely moves, while the components cook themselves or violently go bang. Too large a controller has consequences for space, weight, and cost. The level of integration can also contribute to size. A tiny board that stacks on a Pi would take up less space than separate boards. Related to size is whether the board keeps the camera port on the Raspberry Pi accessible.

Soldering

As you choose boards for a robot, you will note that some come as kits themselves, requiring parts to be soldered on. If you are already experienced with this, it may be an option. For experienced builders, this becomes a small cost in time depending on the complexity of the soldering. A small header is going to be a very quick and easy job, and a board that comes as a bag of components with a bare board will be a chunk of an evening. Here, I will recommend components that require the least soldering.

Connectors

Closely related to soldering are the connectors for the motors and batteries. I tend to prefer the screw type connectors. Other types may require matching motors or crimping skills.

Conclusion

Our robot is space constrained; for this reason, we will be looking at the Raspberry Pi hat type form factor. We are also looking to keep the number of pins it binds to really low. An I2C-based hat will let us do this. The Full Function Stepper Motor Hat (also known as the Full Function Robot Expansion Board) gets us access to all the Pi pins while being a powerful motor controller. It's available in most countries, has space for the ribbon for the camera, and controls servo motors. I also recommend the 4tronix PiConZero hat, or assembling a stack of PiBorg hats, though these may be harder to source outside of the UK. The reader will need to adapt the code, and consider a tiny shim to retain access to the GPIO pins, if using a different board.

In this article, we learned about selecting the parts needed to build a basic robot. We looked at the size, wheels, cost, and connectors for the robot chassis and a controller. To learn more about robotics and build your own robot, check out the book Learn Robotics Programming.

Real-time motion planning for robots made faster and efficient with RapidPlan processor
Boston Dynamics adds military-grade mortor (parkour) skills to its popular humanoid Atlas Robot
Sex robots, artificial intelligence, and ethics: How desire shapes and is shaped by algorithms
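As a footnote to the motor controller discussion above, here is a hedged Python sketch of the simplest option mentioned: driving one channel of a generic L298N directly from the Pi's GPIO pins with the RPi.GPIO library. The pin numbers are illustrative assumptions; check them against your own wiring before running anything.

    import time
    import RPi.GPIO as GPIO

    # Illustrative BCM pin numbers for one L298N channel -- adjust to your wiring.
    IN1, IN2, EN = 17, 27, 22

    GPIO.setmode(GPIO.BCM)
    GPIO.setup([IN1, IN2, EN], GPIO.OUT)

    pwm = GPIO.PWM(EN, 1000)   # 1 kHz PWM on the enable pin controls motor speed
    pwm.start(0)

    def forward(speed_percent):
        # IN1/IN2 set the direction; the PWM duty cycle sets the speed.
        GPIO.output(IN1, GPIO.HIGH)
        GPIO.output(IN2, GPIO.LOW)
        pwm.ChangeDutyCycle(speed_percent)

    def stop():
        pwm.ChangeDutyCycle(0)

    try:
        forward(60)       # run one motor forward at 60% duty cycle for two seconds
        time.sleep(2)
        stop()
    finally:
        pwm.stop()
        GPIO.cleanup()

This is exactly the point-to-point wiring trade-off described earlier: cheap and flexible, but it consumes three GPIO pins per motor channel, which is what the I2C-based hats avoid.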

What is the future of on-demand e-commerce apps?

Guest Contributor
18 Jun 2019
6 min read
On-demand apps almost came as a movement in the digital world and transformed the way we avail services and ready-to-use business deliverables. E-commerce stores like Amazon and eBay were the first on-demand apps, and over time the business model penetrated other niches. Now, from booking a taxi ride online to ordering food delivery to booking accommodation in a distant city, on-demand apps are making space for every kind of customer interaction. As these on-demand apps gradually build the foundation for a fully-fledged on-demand economy, the future of e-commerce will depend on how new and cutting-edge features are introduced and how the user experience can be boosted with new UI and UX elements. But before taking a look into the future of on-demand e-commerce, it is essential to understand the evolution of on-demand apps in recent years. Let us have a brief look at various facets of this ongoing evolution.

Mobile push for change: Mobile search has already surpassed desktop search in both volume and frequency. Moreover, mobile has become a lifestyle factor allowing instant access to services and content. A mobile device’s round-the-clock connectivity and ease of keeping in constant touch have made it key to the thriving on-demand economy.

Overwhelming social media penetration: The penetration of social media across all spheres of life has helped people stay connected while communicating about almost anything and everything, giving businesses a never-before opportunity to cater to customer demands.

Addressing value as well as convenience: With the proliferation of on-demand apps, we can see two broad categories of consumers: the value-oriented and the convenience-oriented. Besides giving priority to more value at a lower cost, on-demand apps now facilitate more convenient and timely delivery of products.

Frictionless business process: Allowing easy and smooth purchases with the least friction in the business process has become the main demand of most consumers. A frictionless, smooth customer experience and delivery are the two most important criteria that on-demand apps fulfill.

How to cater to customers with on-demand e-commerce apps?

If, as a business, you want to cater to your customers with on-demand apps, there are several ways you can do that. When providing customers more value is your priority, you can only ensure this with an easier, connected, and smooth e-shopping experience. Here are four specific ways you can cater to your customers with on-demand e-commerce apps:

By trying and testing various services, you can easily get a first-hand feel of how these services work. Next, evaluate what the services do best and what they don’t. Now, think about how you can deliver a better service for your customers.

To transform your existing business into an on-demand business, you can also partner with a service provider who can ensure same-day delivery of your products to the customers. You can partner with services like Google Express, Instacart, Amazon, PostMates, or Uber Rush.

You can also utilize the BOPUS (buy online, pick up in store) model to cater to the many customers who find this helpful. Always make sure to minimize the time and trouble it takes customers to pick up products from your store.

Providing on-site installation of the product can also boost customer experience. You can partner with a service provider to install the product and guide the customers about its usage.
How on-Demand apps are transforming the face of business? The on-demand economy is experiencing a never-before boom and there are too many examples of how it has transformed businesses. The emergence of Uber and Airbnb is an excellent example of how on-demand apps deliver popular service for several daily needs. Just as Uber transformed the way we think of transport, Airbnb transformed the way we conceive booking accommodations and hotels in places of travel. Similarly, apps like Swiggy, Just Eat and Uber Eats are continuing to change the way we order foods from restaurants and food chains. The same business model is slowly penetrating across other niches and products. From the daily consumable goods to the groceries, now almost everything is being delivered through on-demand apps to our doorstep. Thanks to customer-centric UI and UX elements in mobile apps and an increasing number of businesses paving the way for unique and innovative shop fronts, personalization has become one of the biggest driving factors for on-demand mobile apps. Consumers also have got the taste of personalized shopping experience, and they are increasingly demanding products, services and shopping experience that suit their specific needs and preferences. This is one area where on-demand apps within the same niche are competitive in a bid to deliver better customer experience and win more business. The Future of On-demand eCommerce Apps The future of the on-demand e-commerce apps will mainly revolve around new concepts and breakthrough ideas of providing customers more ease and convenience. From gesture-based checkout and payment processing to product search through images to video chat, a lot of breakthrough features will shape the future of on-demand e-commerce apps. Conversational Marketing Unlike the conventional marketing channels that follow the one-way directive, in the new era of on-demand e-commerce apps, conversational marketing will play a bigger role. From intelligent Chatbots to real-time video chat communication, we have a lot of avenues to utilise conversational marketing methods. Image-Based Product Search By integrating image search technology with the e-commerce interfaces customers can be provided with an easy and effortless ways of searching for products online. They can take photos of nearby objects and can search for those items across e-commerce stores.   Real-time Shopping Apps What about getting access to products just when and where you need them? Well, such ease of shopping in real-time may not be a distant thing of the future, thanks to real-time shopping apps. Just when you need a particular product, you can shop it then and there and based upon availability, the order can be accepted and delivered from the nearest store in time. Gesture-Based Login Biometrics is already part and parcel of smart user experience. Gestures are also used in the latest mobile handsets for login and authentication. So, those days are not far when the gestures will be used for customer login and authentication in the e-commerce store. This will make the entire shopping experience easier, effortless and least time-consuming. Conclusion The future of on-demand e-commerce apps is bright. In the years to come, the on-demand apps are going to be more mainstream and commonplace to transform the business process and the way customers are served by retailers across the niches. Author Bio Atman Rathod is the Co-founder at CMARIX TechnoLabs Pvt. Ltd. with 13+ years of experience. 
He loves to write about technology, startups, entrepreneurship and business. His creative abilities, academic track record and leadership skills made him one of the key industry influencers as well. You can find him on Linkedin, Twitter, and Medium. Declarative UI programming faceoff: Apple’s SwiftUI vs Google’s Flutter What Elon Musk can teach us about Futurism & Technology Forecasting 12 Visual Studio Code extensions that Node.js developers will love [Sponsored by Microsoft]
Julia for machine learning. Will the new language pick up pace?

Prasad Ramesh
20 Oct 2018
4 min read
Machine learning can be done using many languages, with Python and R being the most popular. But one language has been overlooked for some time—Julia. Why isn’t Julia machine learning a thing? Julia isn't an obvious choice for machine learning simply because it's a new language that has only recently hit version 1.0. While Python is well-established, with a large community and many libraries, Julia simply doesn't have the community to shout about it. And that's a shame. Right now Julia is used in various fields. From optimizing milk production in dairy farms to parallel supercomputing for astronomy, Julia has a wide range of applications. A common theme here is that these actions all require numerical, scientific, and sometimes parallel computation. Julia is well-suited to the sort of tasks where intensive computation is essential. Viral Shah, CEO of Julia Computing said to Forbes “Amazon, Apple, Disney, Facebook, Ford, Google, Grindr, IBM, Microsoft, NASA, Oracle and Uber are other Julia users, partners and organizations hiring Julia programmers.” Clearly, Julia is powering the analytical nous of some of the most high profile organizations on the planet. Perhaps it just needs more cheerleading to go truly mainstream. Why Julia is a great language for machine learning Julia was originally designed for high-performance numerical analysis. This means that everything that has gone into its design is built for the very things you need to do to build effective machine learning systems. Speed and functionality Julia combines the functionality from various popular languages like Python, R, Matlab, SAS and Stata with the speed of C++ and Java. A lot of the standard LaTeX symbols can be used in Julia, with the syntax usually being the same as LaTeX. This mathematical syntax makes it easy for implementing mathematical formulae in code and make Julia machine learning possible. It also has in-built support for parallelism which allows utilization of multiple cores at once making it fast at computations. Julia’s loops and functions features are pretty fast, fast enough that you would probably notice significant performance differences against other languages. The performance can be almost comparable to C with very little code actually used. With packages like ArrayFire, generic code can be run on GPUs. In Julia, the multiple dispatch feature is very useful for defining number and array-like datatypes. Matrices, data tables work with good compatibility and performance. Julia has automatic garbage collection, a collection of libraries for mathematical calculations, linear algebra, random number generation, and regular expression matching. Libraries and scalability Julia machine learning can be done with powerful tools like MLBase.jl, Flux.jl, Knet.jl, that can be used for machine learning and artificial intelligence systems. It also has a scikit-learn implementation called ScikitLearn.jl. Although ScikitLearn.jl is not an official port, it is a useful additional tool for building machine learning systems with Julia. As if all those weren’t enough, Julia also has TensorFlow.jl and MXNet.jl. So, if you already have experience with these tools, in other implementations, the transition is a little easier than learning everything from scratch. Julia is also incredibly scalable. It can be deployed on large clusters quickly, which is vital if you’re working with big data across a distributed system. Should you consider Julia machine learning? 
Because it’s fast and possesses a great range of features, Julia could potentially overtake both Python and R to be the choice of language for machine learning in the future. Okay, maybe we shouldn’t get ahead of ourselves. But with Julia reaching the 1.0 milestone, and the language rising on the TIOBE index, you certainly shouldn’t rule out Julia when it comes to machine learning. Julia is also available to use in the popular tool Jupyter Notebook, paving a path for wider adoption. A note of caution, however, is important. Rather than simply dropping everything for Julia, it will be worth monitoring the growth of the language. Over the next 12 to 24 months we’ll likely see new projects and libraries, and the Julia machine learning community expanding. If you start hearing more noise about the language, it becomes a much safer option to invest your time and energy in learning it. If you are just starting off with machine learning, then you should stick to other popular languages. An experienced engineer, however, who already has a good grip on other languages shouldn’t be scared of experimenting with Julia - it gives you another option, and might just help you to uncover new ways of working and solving problems. Julia 1.0 has just been released What makes functional programming a viable choice for artificial intelligence projects? Best Machine Learning Datasets for beginners

What is security chaos engineering and why is it important?

Amrata Joshi
21 Nov 2018
6 min read
Chaos engineering is, at its root, all about stress testing software systems in order to minimize downtime and maximize resiliency. Security chaos engineering takes these principles forward into the domain of security. The central argument of security chaos engineering is that current security practices aren’t fit for purpose. “Despite spending more on security, data breaches are continuously getting bigger and more frequent across all industries,” write Aaron Rinehart and Charles Nwatu in a post published on opensource.com in January 2018. “We hypothesize that a large portion of data breaches are caused not by sophisticated nation-state actors or hacktivists, but rather simple things rooted in human error and system glitches.” The rhetorical question they’re asking is clear: should we wait for an incident to happen in order to work on it? Or should we be looking at ways to prevent incidents from happening at all?

Why do we need security chaos engineering today?

There are two problems that make security chaos engineering so important today. One is the way in which security breaches and failures are understood culturally across the industry. Security breaches tend to be seen as either isolated attacks or ‘holes’ within software - anomalies that should have been thought of but weren’t. In turn, this leads to a spiral of failures. Rather than thinking about cybersecurity in a holistic and systematic manner, the focus is all too often on simply identifying weaknesses when they happen and putting changes in place to stop them from happening again. You can see this approach even in the way organizations communicate after high-profile attacks have taken place - ‘we’re taking steps to ensure nothing like this ever happens again.’ While that sentiment is important for both customers and shareholders to hear, it also betrays exactly the problems Rinehart, Wong and Nwatu appear to be talking about. The second problem is more about the nature of software today. As the world moves to distributed systems, built on a range of services, and with an extensive set of software dependencies, vulnerabilities naturally begin to increase too. “Where systems are becoming more and more distributed, ephemeral, and immutable in how they operate… it is becoming difficult to comprehend the operational state and health of our systems' security,” Rinehart and Nwatu explain. When you take the cultural issues and the evolution of software together, it becomes clear that the only way cybersecurity is going to properly tackle today’s challenges is by doing an extensive rethink of how and why things happen.

What security chaos engineering looks like in practice

If you want to think about what the transition to security chaos engineering actually means in practice, a good way to think about it is as a shift in mindset. It’s a mindset that doesn’t focus on isolated issues but instead on the overall health of the system. Essentially, you start with a different question: don’t ask ‘where are the potential vulnerabilities in our software?’, ask ‘where are the potential points of failure in the system?’ Rinehart and Nwatu explain: “Failures can consist not only of IT, business, and general human factors but also the way we design, build, implement, configure, operate, observe, and manage security controls.
People are the ones designing, building, monitoring, and managing the security controls we put in place to defend against malicious attackers.” By focusing on questions of system design and decision making, you can begin to capture security threats that you might otherwise miss. So, while malicious attacks might account for 47% of all security breaches, human error and system glitches combined account for 53%. This means that while we’re all worrying about the hooded hacker that dominates stock imagery, someone made a simple mistake that just about any software-savvy criminal could take advantage of. How is security chaos engineering different from penetration testing? Security chaos engineering looks a lot like penetration testing, right? After all, the whole point of pentesting is, like chaos engineering, determining weaknesses before they can have an impact. But there are some important differences that shouldn’t be ignored. Again, the key difference is the mindset behind both. Penetration testing is, for the most part, an event. It’s something you do when you’ve updated or changed something significant. It also has a very specific purpose. That’s not a bad thing, but with such a well-defined testing context you might miss security issues that you hadn’t even considered. And if you consider the complexity of a given software system, in which its state changes according to the services and requests it is handling, it’s incredibly difficult - not to mention expensive - to pentest an application in every single possible state. Security chaos engineering tackles that by actively experimenting on the software system to better understand it. The context in which it takes place is wide-reaching and ongoing, not isolated and particular. ChaoSlingr, the security chaos engineering tool ChaoSlingr is perhaps the most prominent tool out there to help you actually do security chaos engineering. Built for AWS, it allows you to perform a number of different ‘security chaos experiments’ in the cloud. Essentially, ChaosSlingr pushes failures into the system in a way that allows you to not only identify security issues but also to better understand your infrastructure. This SlideShare deck, put together by Aaron Rinehart himself, is a good introduction to how it works in a little more detail. Security teams have typically always focused on preventive security measures. ChaosSlingr empowers teams to dig deeper into their systems and improve it in ways that mitigate security risks. It allows you to be proactive rather than reactive. The future is security chaos engineering Chaos engineering has not quite taken off - yet. But it’s clear that the principles behind it are having an impact across software engineering. In particular, at a time when ever-evolving software feels so vulnerable - fragile even - applying it to cybersecurity feels incredibly pertinent and important. It’s true that the shift in mindset is going to be tough. But if we can begin to distrust our assumptions, experiment on our systems, and try to better understand how and why they work the way they do, we are certainly moving towards a healthier and more secure software world. Chaos Conf 2018 Recap: Chaos engineering hits maturity as community moves towards controlled experimentation Chaos engineering platform Gremlin announces $18 million series B funding and new feature for “full-stack resiliency” Gremlin makes chaos engineering with Docker easier with new container discovery feature
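To make the ChaoSlingr-style experiment loop more tangible, here is a minimal, hypothetical Python sketch. It is not ChaoSlingr's actual code or API; the target names and the two helper functions are invented stubs you would replace with calls into your own cloud inventory and alerting system.

    import random
    import time

    # Hypothetical targets -- in a real experiment these would come from your own
    # inventory of instances, security groups, or IAM roles.
    TARGETS = ["sg-app-frontend", "sg-app-backend", "sg-batch-workers"]

    def inject_open_port(target: str) -> None:
        """Stub: open an unexpected port on the target (the injected 'failure')."""
        print(f"[experiment] opening port 9999 on {target}")

    def alert_fired_for(target: str, timeout_s: int = 60) -> bool:
        """Stub: poll your monitoring/alerting system to see if the change was detected."""
        time.sleep(1)          # placeholder for real polling
        return random.random() < 0.5

    def run_experiment() -> None:
        target = random.choice(TARGETS)
        inject_open_port(target)
        detected = alert_fired_for(target)
        # The experiment's output is knowledge: did our controls catch the change?
        print(f"[experiment] detection on {target}: {'PASS' if detected else 'FAIL -- investigate'}")

    if __name__ == "__main__":
        run_experiment()

The point of the loop is the proactive question the article raises: rather than waiting for an incident, you deliberately introduce a small, controlled failure and measure whether your people and tooling actually notice it.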

30 common data science terms explained

Aarthi Kumaraswamy
16 May 2018
27 min read
Let's begin at the beginning. What do terms like statistical population, statistical comparison, and statistical inference mean? What good are munging, coding, boosting, regularization, and the like? On a scale of 1 to 30 (1 being the lowest and 30, the highest), rate yourself as a data scientist. No matter what you have scored yourself, we hope to have improved that score at least by a little by the end of this post.

Let's start with a basic question: What is data science?

[box type="shadow" align="" class="" width=""]The following is an excerpt from the book, Statistics for Data Science written by James D. Miller and published by Packt Publishing.[/box]

The idea of how data science is defined is a matter of opinion. I personally like the explanation that data science is a progression or, even better, an evolution of thought or steps, as shown in the following figure:

Although a progression or evolution implies a sequential journey, in practice, this is an extremely fluid process; each of the phases may inspire the data scientist to reverse and repeat one or more of the phases until they are satisfied. In other words, all or some phases of the process may be repeated until the data scientist determines that the desired outcome is reached. Depending on your sources and individual beliefs, you may say the following: Statistics is data science, and data science is statistics.

Based upon personal experience, research, and various industry experts' advice, someone delving into the art of data science should take every opportunity to understand and gain experience as well as proficiency with the following list of common data science terms:

- Statistical population
- Probability
- False positives
- Statistical inference
- Regression
- Fitting
- Categorical data
- Classification
- Clustering
- Statistical comparison
- Coding
- Distributions
- Data mining
- Decision trees
- Machine learning
- Munging and wrangling
- Visualization
- D3
- Regularization
- Assessment
- Cross-validation
- Neural networks
- Boosting
- Lift
- Mode
- Outlier
- Predictive modeling
- Big data
- Confidence interval
- Writing

Statistical population

You can perhaps think of a statistical population as a recordset (or a set of records). This set or group of records will be of similar items or events that are of interest to the data scientist for some experiment. For a data developer, a population of data may be a recordset of all sales transactions for a month, and the interest might be reporting to the senior management of an organization which products are the fastest sellers and at which time of the year. For a data scientist, a population may be a recordset of all emergency room admissions during a month, and the area of interest might be to determine the statistical demographics for emergency room use.

[box type="note" align="" class="" width=""]Typically, the terms statistical population and statistical model are or can be used interchangeably. Once again, data scientists continue to evolve with their alignment on their use of common terms. [/box]

Another key point concerning statistical populations is that the recordset may be a group of (actually) existing objects or a hypothetical group of objects. Using the preceding example, you might draw the comparison like this: the actual objects are those sales transactions recorded for the month, while the hypothetical objects are the sales transactions that are expected, forecast, or presumed (based upon observations, experienced assumptions, or other logic) to occur during a month.
Finally, through the use of statistical inference, the data scientist can select a portion or subset of the recordset (or population) with the intention that it will represent the total population for a particular area of interest. This subset is known as a statistical sample. If a sample of a population is chosen accurately, characteristics of the entire population (that the sample is drawn from) can be estimated from the corresponding characteristics of the sample. Probability Probability is concerned with the laws governing random events.                                           -www.britannica.com When thinking of probability, you think of possible upcoming events and the likelihood of them actually occurring. This compares to a statistical thought process that involves analyzing the frequency of past events in an attempt to explain or make sense of the observations. In addition, the data scientist will associate various individual events, studying the relationship of these events. How these different events relate to each other governs the methods and rules that will need to be followed when we're studying their probabilities. [box type="note" align="" class="" width=""]A probability distribution is a table that is used to show the probabilities of various outcomes in a sample population or recordset. [/box] False positives The idea of false positives is a very important statistical (data science) concept. A false positive is a mistake or an errored result. That is, it is a scenario where the results of a process or experiment indicate a fulfilled or true condition when, in fact, the condition is not true (not fulfilled). This situation is also referred to by some data scientists as a false alarm and is most easily understood by considering the idea of a recordset or statistical population (which we discussed earlier in this section) that is determined not only by the accuracy of the processing but by the characteristics of the sampled population. In other words, the data scientist has made errors during the statistical process, or the recordset is a population that does not have an appropriate sample (or characteristics) for what is being investigated. Statistical inference What developer at some point in his or her career, had to create a sample or test data? For example, I've often created a simple script to generate a random number (based upon the number of possible options or choices) and then used that number as the selected option (in my test recordset). This might work well for data development, but with statistics and data science, this is not sufficient. To create sample data (or a sample population), the data scientist will use a process called statistical inference, which is the process of deducing options of an underlying distribution through analysis of the data you have or are trying to generate for. The process is sometimes called inferential statistical analysis and includes testing various hypotheses and deriving estimates. When the data scientist determines that a recordset (or population) should be larger than it actually is, it is assumed that the recordset is a sample from a larger population, and the data scientist will then utilize statistical inference to make up the difference. [box type="note" align="" class="" width=""]The data or recordset in use is referred to by the data scientist as the observed data. 
Inferential statistics can be contrasted with descriptive statistics, which is only concerned with the properties of the observed data and does not assume that the recordset came from a larger population. [/box]

Regression

Regression is a process or method (selected by the data scientist as the best fit technique for the experiment at hand) used for determining the relationships among variables. If you're a programmer, you have a certain understanding of what a variable is, but in statistics, we use the term differently. Variables are determined to be either dependent or independent.

An independent variable (also known as a predictor) is the one that is manipulated by the data scientist in an effort to determine its relationship with a dependent variable. A dependent variable is a variable that the data scientist is measuring.

[box type="note" align="" class="" width=""]It is not uncommon to have more than one independent variable in a data science progression or experiment. [/box]

More precisely, regression is the process that helps the data scientist comprehend how the typical value of the dependent variable (or criterion variable) changes when any one or more of the independent variables is varied while the other independent variables are held fixed.

Fitting

Fitting is the process of measuring how well a statistical model or process describes a data scientist's observations pertaining to a recordset or experiment. These measures will attempt to point out the discrepancy between observed values and probable values. The probable values of a model or process are known as a distribution or a probability distribution.

Therefore, a probability distribution fitting (or distribution fitting) is when the data scientist fits a probability distribution to a series of data concerning the repeated measurement of a variable phenomenon. The objective of a data scientist performing a distribution fitting is to predict the probability of, or to forecast the frequency of, the occurrence of the phenomenon at a certain interval.

[box type="note" align="" class="" width=""]One of the most common uses of fitting is to test whether two samples are drawn from identical distributions.[/box]

There are numerous probability distributions a data scientist can select from. Some will fit better to the observed frequency of the data than others will. The distribution giving a close fit is supposed to lead to good predictions; therefore, the data scientist needs to select a distribution that suits the data well.

Categorical data

Earlier, we explained how variables in your data can be either independent or dependent. Another type of variable definition is a categorical variable. This type of variable is one that can take on one of a limited, and typically fixed, number of possible values, thus assigning each individual to a particular category.

Often, the collected data's meaning is unclear. Categorical data is a method that a data scientist can use to put meaning to the data. For example, if a numeric variable is collected (let's say the values found are 4, 10, and 12), the meaning of the variable becomes clear if the values are categorized. Let's suppose that based upon an analysis of how the data was collected, we can group (or categorize) the data by indicating that this data describes university students, and there is the following number of players:

- 4 tennis players
- 10 soccer players
- 12 football players

Now, because we grouped the data into categories, the meaning becomes clear.
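To make the idea of categorical data concrete, here is a minimal sketch in Python using pandas, continuing the student/sport example above; the column names are illustrative only, and turning the categories into integer codes previews the 'coding' term defined a little later in this post.

```python
import pandas as pd

# Raw observations collected for university students, mirroring the example above.
students = pd.DataFrame({
    "sport": ["tennis"] * 4 + ["soccer"] * 10 + ["football"] * 12,
})

# Declare the column as categorical data: a limited, fixed set of possible values.
students["sport"] = students["sport"].astype("category")

# A frequency distribution of the categories.
print(students["sport"].value_counts())

# 'Coding' the categories as consistent integer codes for use in a model.
students["sport_code"] = students["sport"].cat.codes
print(students[["sport", "sport_code"]].drop_duplicates())
```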
Some other examples of categorized data might be individual pet preferences (grouped by the type of pet), or vehicle ownership (grouped by the style of a car owned), and so on. So, categorical data, as the name suggests, is data grouped into some sort of category or multiple categories. Some data scientists refer to categories as sub-populations of data. [box type="note" align="" class="" width=""]Categorical data can also be data that is collected as a yes or no answer. For example, hospital admittance data may indicate that patients either smoke or do not smoke. [/box] Classification Statistical classification of data is the process of identifying which category (discussed in the previous section) a data point, observation, or variable should be grouped into. The data science process that carries out a classification process is known as a classifier. Read this post: Classification using Convolutional Neural Networks [box type="note" align="" class="" width=""]Determining whether a book is fiction or non-fiction is a simple example classification. An analysis of data about restaurants might lead to the classification of them among several genres. [/box] Clustering Clustering is the process of dividing up the data occurrences into groups or homogeneous subsets of the dataset, not a predetermined set of groups as in classification (described in the preceding section) but groups identified by the execution of the data science process based upon similarities that it found among the occurrences. Objects in the same group (a group is also referred to as a cluster) are found to be more analogous (in some sense or another) to each other than to those objects found in other groups (or found in other clusters). The process of clustering is found to be very common in exploratory data mining and is also a common technique for statistical data analysis. Statistical comparison Simply put, when you hear the term statistical comparison, one is usually referring to the act of a data scientist performing a process of analysis to view the similarities or variances of two or more groups or populations (or recordsets). As a data developer, one might be familiar with various utilities such as FC Compare, UltraCompare, or WinDiff, which aim to provide the developer with a line-by-line comparison of the contents of two or more (even binary) files. In statistics (data science), this process of comparing is a statistical technique to compare populations or recordsets. In this method, a data scientist will conduct what is called an Analysis of Variance (ANOVA), compare categorical variables (within the recordsets), and so on. [box type="note" align="" class="" width=""]ANOVA is an assortment of statistical methods that are used to analyze the differences among group means and their associated procedures (such as variations among and between groups, populations, or recordsets). This method eventually evolved into the Six Sigma dataset comparisons. [/box] Coding Coding or statistical coding is again a process that a data scientist will use to prepare data for analysis. In this process, both quantitative data values (such as income or years of education) and qualitative data (such as race or gender) are categorized or coded in a consistent way. 
Coding is performed by a data scientist for various reasons, such as the following:

- More effective for running statistical models
- Computers understand the variables
- Accountability--so the data scientist can run models blind, or without knowing what variables stand for, to reduce programming/author bias

[box type="shadow" align="" class="" width=""]You can imagine the process of coding as the means to transform data into a form required for a system or application. [/box]

Distributions

The distribution of a statistical recordset (or of a population) is a visualization showing all the possible values (sometimes referred to as intervals) of the data and how often they occur. When a distribution of categorical data (which we defined earlier in this chapter) is created by a data scientist, it attempts to show the number or percentage of individuals in each group or category. Linking an earlier defined term with this one, a probability distribution, stated in simple terms, can be thought of as a visualization showing the probability of occurrence of different possible outcomes in an experiment.

Data mining

With data mining, one is usually more absorbed in the data relationships (or the potential relationships between points of data, sometimes referred to as variables) and cognitive analysis. To further define this term, we can say that data mining is sometimes more simply referred to as knowledge discovery or even just discovery, based upon processing through or analyzing data from new or different viewpoints and summarizing it into valuable insights that can be used to increase revenue, cut costs, or both.

Using software dedicated to data mining is just one of several analytical approaches to data mining. Although there are tools dedicated to this purpose (such as IBM Cognos BI and Planning Analytics, Tableau, SAS, and so on), data mining is all about the analysis process finding correlations or patterns among dozens of fields in the data, and that can be effectively accomplished using tools such as MS Excel or any number of open source technologies.

[box type="note" align="" class="" width=""]A common approach to data mining is the creation of custom scripts using tools such as R or Python. In this way, the data scientist has the ability to customize the logic and processing to their exact project needs. [/box]

Decision trees

A statistical decision tree uses a diagram that looks like a tree. This structure attempts to represent optional decision paths and a predicted outcome for each path selected. A data scientist will use a decision tree to support, track, and model decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is a common way to display the logic of a data science process.

Machine learning

Machine learning is one of the most intriguing and exciting areas of data science. It conjures all forms of images around artificial intelligence, which includes Neural Networks, Support Vector Machines (SVMs), and so on. Fundamentally, we can describe the term machine learning as a method of training a computer to make or improve predictions or behaviors based on data or, specifically, relationships within that data. Continuing, machine learning is a process by which predictions are made based upon recognized patterns identified within data, and additionally, it is the ability to continuously learn from the data's patterns, therefore continually making better predictions.
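As a minimal illustration of the last two terms, here is a sketch using scikit-learn that trains a small decision tree on a made-up dataset and then uses it to make a prediction; the feature names and numbers are invented purely for demonstration.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy training data: [hours_studied, hours_slept] -> passed exam (1) or not (0).
X = [[1, 4], [2, 8], [6, 7], [8, 6], [3, 5], [9, 8]]
y = [0, 0, 1, 1, 0, 1]

# Train a small decision tree, a common and very readable machine learning model.
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(X, y)

# The learned decision paths, printed as a tree of if/else rules.
print(export_text(model, feature_names=["hours_studied", "hours_slept"]))

# Predict the outcome for a new, unseen observation.
print(model.predict([[7, 7]]))
```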
It is not uncommon for someone to mistake the process of machine learning for data mining, but data mining focuses more on exploratory data analysis and is known as unsupervised learning. Machine learning can be used to learn and establish baseline behavioral profiles for various entities and then to find meaningful anomalies.

Here is the exciting part: the process of machine learning (using data relationships to make predictions) is known as predictive analytics. Predictive analytics allows data scientists to produce reliable, repeatable decisions and results and uncover hidden insights through learning from historical relationships and trends in the data.

Munging and wrangling

The terms munging and wrangling are buzzwords or jargon meant to describe one's efforts to affect the format of data, a recordset, or a file in some way in an effort to prepare the data for continued or otherwise processing and/or evaluations. With data development, you are most likely familiar with the idea of Extract, Transform, and Load (ETL). In somewhat the same way, a data developer may mung or wrangle data during the transformation steps within an ETL process.

Common munging and wrangling may include removing punctuation or HTML tags, data parsing, filtering, all sorts of transforming, mapping, and tying together systems and interfaces that were not specifically designed to interoperate. Munging can also describe the processing or filtering of raw data into another form, allowing for more convenient consumption of the data elsewhere.

Munging and wrangling might be performed multiple times within a data science process and/or at different steps in the evolving process. Sometimes, data scientists use munging to include various data visualization, data aggregation, training a statistical model, as well as much other potential work. To this point, munging and wrangling may follow a flow beginning with extracting the data in a raw form, performing the munging using various logic, and lastly, placing the resulting content into a structure for use.

Although there are many valid options for munging and wrangling data, preprocessing and manipulation, a tool that is popular with many data scientists today is a product named Trifacta, which claims that it is the number one (data) wrangling solution in many industries.

[box type="note" align="" class="" width=""]Trifacta can be downloaded for your personal evaluation from https://www.trifacta.com/. Check it out! [/box]
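Here is a minimal sketch of what munging and wrangling often look like in practice, using Python and pandas; the sample data, column names, and cleaning rules are invented for illustration.

```python
import pandas as pd

# A small, messy sample recordset; in practice this would come from a file or API.
raw = pd.DataFrame({
    "customer": [" Alice ", "BOB", "bob", None],
    "amount":   ["10.50", "N/A", "7", "3.25"],
})

# Wrangling: trim whitespace, normalize case, and drop rows with no customer.
clean = raw.dropna(subset=["customer"]).copy()
clean["customer"] = clean["customer"].str.strip().str.title()

# Munging the amount column: coerce bad values to NaN, then filter them out.
clean["amount"] = pd.to_numeric(clean["amount"], errors="coerce")
clean = clean.dropna(subset=["amount"])

# A tidy recordset, ready for analysis or visualization.
print(clean)
```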
Visualization

The main point (although there are other goals and objectives) when leveraging a data visualization technique is to make something complex appear simple. You can think of visualization as any technique for creating a graphic (or similar) to communicate a message. Other motives for using data visualization include the following:

- To explain the data or put the data in context (such as highlighting demographic statistics)
- To solve a specific problem (for example, identifying problem areas within a particular business model)
- To explore the data to reach a better understanding or add clarity (such as what periods of time does this data span?)
- To highlight or illustrate otherwise invisible data (such as isolating outliers residing in the data)
- To predict, such as potential sales volumes (perhaps based upon seasonality sales statistics)
- And others

Statistical visualization is used in almost every step in the data science process, within the obvious steps such as exploring and visualizing, analyzing and learning, but it can also be leveraged during collecting, processing, and the end game of using the identified insights.

D3

D3, or D3.js, is essentially an open source JavaScript library designed with the intention of visualizing data using today's web standards. D3 helps put life into your data, utilizing Scalable Vector Graphics (SVG), Canvas, and standard HTML. D3 combines powerful visualization and interaction techniques with a data-driven approach to DOM manipulation, providing data scientists with the full capabilities of modern browsers and the freedom to design the right visual interface that best depicts the objective or assumption. In contrast to many other libraries, D3.js allows inordinate control over the visualization of data. D3 is embedded within an HTML webpage and uses pre-built JavaScript functions to select elements, create SVG objects, style them, or add transitions, dynamic effects, and so on.

Regularization

Regularization is one possible approach that a data scientist may use for improving the results generated from a statistical model or data science process, such as when addressing a case of overfitting in statistics and data science.

[box type="note" align="" class="" width=""]We defined fitting earlier (fitting describes how well a statistical model or process describes a data scientist's observations). Overfitting is a scenario where a statistical model or process seems to fit too well or appears to be too close to the actual data.[/box]

Overfitting usually occurs with an overly simple model. This means that you may have only two variables and are drawing conclusions based on the two. For example, using our previously mentioned example of daffodil sales, one might generate a model with temperature as an independent variable and sales as a dependent one. You may see the model fail since it is not as simple as concluding that warmer temperatures will always generate more sales. In this example, there is a tendency to add more data to the process or model in hopes of achieving a better result. The idea sounds reasonable. For example, you have information such as average rainfall, pollen count, fertilizer sales, and so on; could these data points be added as explanatory variables?

[box type="note" align="" class="" width=""]An explanatory variable is a type of independent variable with a subtle difference. When a variable is independent, it is not affected at all by any other variables. When a variable isn't independent for certain, it's an explanatory variable. [/box]

Continuing to add more and more data to your model will have an effect but will probably cause overfitting, resulting in poor predictions since it will closely resemble the data, which is mostly just background noise. To overcome this situation, a data scientist can use regularization, introducing a tuning parameter (additional factors such as a data point's mean value or a minimum or maximum limitation, which gives you the ability to change the complexity or smoothness of your model) into the data science process to solve an ill-posed problem or to prevent overfitting.
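Here is a minimal sketch of regularization in practice, using scikit-learn's Ridge regression on made-up data; the point is only to show how the tuning parameter (alpha) shrinks the coefficients of irrelevant explanatory variables compared with an ordinary, unregularized fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)

# Toy data: 10 noisy observations, 6 candidate explanatory variables.
X = rng.normal(size=(10, 6))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=10)  # only the first variable matters

plain = LinearRegression().fit(X, y)
regularized = Ridge(alpha=1.0).fit(X, y)  # alpha is the tuning parameter

# Regularization shrinks the coefficients of the irrelevant variables toward zero,
# which helps keep the model from fitting the background noise.
print("ordinary least squares:", np.round(plain.coef_, 2))
print("ridge (regularized):  ", np.round(regularized.coef_, 2))
```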
Assessment When a data scientist evaluates a model or data science process for performance, this is referred to as assessment. Performance can be defined in several ways, including the model's growth of learning or the model's ability to improve (with) learning (to obtain a better score) with additional experience (for example, more rounds of training with additional samples of data) or accuracy of its results. One popular method of assessing a model or processes performance is called bootstrap sampling. This method examines performance on certain subsets of data, repeatedly generating results that can be used to calculate an estimate of accuracy (performance). The bootstrap sampling method takes a random sample of data, splits it into three files--a training file, a testing file, and a validation file. The model or process logic is developed based on the data in the training file and then evaluated (or tested) using the testing file. This tune and then test process is repeated until the data scientist is comfortable with the results of the tests. At that point, the model or process is again tested, this time using the validation file, and the results should provide a true indication of how it will perform. [box type="note" align="" class="" width=""]You can imagine using the bootstrap sampling method to develop program logic by analyzing test data to determine logic flows and then running (or testing) your logic against the test data file. Once you are satisfied that your logic handles all of the conditions and exceptions found in your testing data, you can run a final test on a new, never-before-seen data file for a final validation test. [/box] Cross-validation Cross-validation is a method for assessing a data science process performance. Mainly used with predictive modeling to estimate how accurately a model might perform in practice, one might see cross-validation used to check how a model will potentially generalize, in other words, how the model can apply what it infers from samples to an entire population (or recordset). With cross-validation, you identify a (known) dataset as your validation dataset on which training is run along with a dataset of unknown data (or first seen data) against which the model will be tested (this is known as your testing dataset). The objective is to ensure that problems such as overfitting (allowing non-inclusive information to influence results) are controlled and also provide an insight into how the model will generalize a real problem or on a real data file. The cross-validation process will consist of separating data into samples of similar subsets, performing the analysis on one subset (called the training set) and validating the analysis on the other subset (called the validation set or testing set). To reduce variability, multiple iterations (also called folds or rounds) of cross-validation are performed using different partitions, and the validation results are averaged over the rounds. Typically, a data scientist will use a models stability to determine the actual number of rounds of cross-validation that should be performed. Neural networks Neural networks are also called artificial neural networks (ANNs), and the objective is to solve problems in the same way that the human brain would. Google will provide the following explanation of ANN as stated in Neural Network Primer: Part I, by Maureen Caudill, AI Expert, Feb. 
1989: [box type="note" align="" class="" width=""]A computing system made up of several simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs. [/box] To oversimplify the idea of neural networks, recall the concept of software encapsulation, and consider a computer program with an input layer, a processing layer, and an output layer. With this thought in mind, understand that neural networks are also organized in a network of these layers, usually with more than a single processing layer. Patterns are presented to the network by way of the input layer, which then communicates to one (or more) of the processing layers (where the actual processing is done). The processing layers then link to an output layer where the result is presented. Most neural networks will also contain some form of learning rule that modifies the weights of the connections (in other words, the network learns which processing nodes perform better and gives them a heavier weight) per the input patterns that it is presented with. In this way (in a sense), neural networks learn by example as a child learns to recognize a cat from being exposed to examples of cats. Boosting In a manner of speaking, boosting is a process generally accepted in data science for improving the accuracy of a weak learning data science process. [box type="note" align="" class="" width=""]Data science processes defined as weak learners are those that produce results that are only slightly better than if you would randomly guess the outcome. Weak learners are basically thresholds or a 1-level decision tree. [/box] Specifically, boosting is aimed at reducing bias and variance in supervised learning. What do we mean by bias and variance? Before going on further about boosting, let's take note of what we mean by bias and variance. Data scientists describe bias as a level of favoritism that is present in the data collection process, resulting in uneven, disingenuous results and can occur in a variety of different ways. A sampling method is called biased if it systematically favors some outcomes over others. A variance may be defined (by a data scientist) simply as the distance from a variable mean (or how far from the average a result is). The boosting method can be described as a data scientist repeatedly running through a data science process (that has been identified as a weak learning process), with each iteration running on different and random examples of data sampled from the original population recordset. All the results (or classifiers or residue) produced by each run are then combined into a single merged result (that is a gradient). This concept of using a random subset of the original recordset for each iteration originates from bootstrap sampling in bagging and has a similar variance-reducing effect on the combined model. In addition, some data scientists consider boosting a means to convert weak learners into strong ones; in fact, to some, the process of boosting simply means turning a weak learner into a strong learner. Lift In data science, the term lift compares the frequency of an observed pattern within a recordset or population with how frequently you might expect to see that same pattern occur within the data by chance or randomly. If the lift is very low, then typically, a data scientist will expect that there is a very good probability that the pattern identified is occurring just by chance. 
The larger the lift, the more likely it is that the pattern is real.

Mode

In statistics and data science, when a data scientist uses the term mode, he or she refers to the value that occurs most often within a sample of data. Mode is not calculated but is determined manually or through processing of the data.

Outlier

Outliers can be defined as follows:

- A data point that is way out of keeping with the others
- That piece of data that doesn't fit
- Either a very high value or a very low value
- Unusual observations within the data
- An observation point that is distant from all others

Predictive modeling

The development of statistical models and/or data science processes to predict future events is called predictive modeling.

Big Data

Again, we have some variation of the definition of big data. A large assemblage of data, data sets that are so large or complex that traditional data processing applications are inadequate, and data about every aspect of our lives have all been used to define or refer to big data. In 2001, then Gartner analyst Doug Laney introduced the 3V's concept. The 3V's, as per Laney, are volume, variety, and velocity. The V's make up the dimensionality of big data: volume (or the measurable amount of data), variety (meaning the number of types of data), and velocity (referring to the speed of processing or dealing with that data).

Confidence interval

The confidence interval is a range of values that a data scientist will specify around an estimate to indicate their margin of error, combined with a probability that a value will fall in that range. In other words, confidence intervals are good estimates of the unknown population parameter.

Writing

Although visualizations grab much more of the limelight when it comes to presenting the output or results of a data science process or predictive model, writing skills are not only an important part of how a data scientist communicates but are also still considered an essential skill for all data scientists to be successful.

Did we miss any of your favorite terms? Now that you are at the end of this post, we ask you again: On a scale of 1 to 30 (1 being the lowest and 30, the highest), how do you rate yourself as a data scientist?

Why You Need to Know Statistics To Be a Good Data Scientist [interview]
How data scientists test hypotheses and probability
6 Key Areas to focus on while transitioning to a Data Scientist role
Soft skills every data scientist should teach their child

Streamline your application development process in 5 simple steps

Guest Contributor
23 Apr 2019
7 min read
Chief Information Officers (CIOs) are under constant pressure to deliver substantial results that meet business goals. Planning a project and seeing it through to the end is a critical requirement of an effective development process. In the fast-paced world of software development, getting results is an essential key for businesses to flourish. There is a certain pleasure you get from ticking off tasks from your to-do lists. However, this becomes a burden when you are drowning with a lot of tasks on your head. Signs of inefficient processes are prevalent in every business. Unhappy customers, stressed out colleagues, disappointing code reviews, missed deadlines, and increases in costs are just some of the examples that are the direct result of dysfunctional processes. By streamlining your workflow you will be able to compete with modern technologies like Machine Learning and Artificial Intelligence. Gaining access to such technologies will also help you to automate the workflow, making your daily processes even smoother. Listed below are 5 steps that can help you in streamlining your development process. Step 1: Creating a Workflow This is a preliminary step for companies who have not considered creating a better workflow. A task is not just something you can write down, complete, and tick-off. Complex, software related tasks are not like the “do-the-dishes” type of tasks. Usually, there are many stages in software development tasks like planning, organizing, reviewing, and releasing. Regardless of the niche of your tasks, the workflow should be clear. You can always use software tools such as Zapier, Nintex, and ProcessMaker, etc. to customize your workflow and assign levels-of-importance to particular tasks. This might appear as micro-management at first, but once it becomes a part of the daily routine, it starts to get easier. Creating a workflow is probably the most important factor to consider when you are preparing to streamline your software development processes. There are several steps involved when creating a workflow: Mapping the Process Process mapping mainly focuses on the visualization of the current development process which allows a top-down view of how things are working. You can do process mapping via tools such as Draw.io, LucidCharts, and Microsoft Visio, etc. Analyze the Process Once you have a flowchart or a swim lane diagram setup, use it to investigate the problems within the process. The problems can range from costs, time, employee motivation, and other bottlenecks. Redesign the Process When you have identified the problems, you should try to solve them step by step. Working with people who are directly involved in the process (e.g Software Developers) and gaining an on-the-ground insight can prove very useful when redesigning the processes. Acquire Resources You now need to secure the resources that are required to implement the new processes. With regards to our topic, it can range from buying licensed software, faster computers, etc. Implementing Change It is highly likely that your business processes change with existing systems, teams, and processes. Allocate your time to solving these problems, while keeping the regular operations in the process. Process Review This phase might seem the easiest, but it is not. Once the changes are in place, you need to review them accordingly so that they do not rise up again Once the workflow is set in place, all you have to do is to identify the bugs in your workflow plan. 
The bugs can range anywhere from slow tasks, re-opening of finished tasks, to dead tasks. What we have observed about workflows is that you do not get it right the first time. You need to take your time to edit and review the workflow while still being in the loop of the workflow. The more transparent and active your process is, the easier it gets to spot problems and figure out solutions. Step 2: Backlog Maintenance Many times you assume all the tasks in your backlog to be important. They might have, however, this makes the backlog a little too jam-packed. Well, your backlog will not serve a purpose unless you are actively taking part in keeping it organized. A backlog, while being a good place to store tasks, is also home to tasks that will never see the light of day. A good practice, therefore, would be to either clean up your backlog of dead tasks or combine them with tasks that have more importance in your overall workflow. If some of the tasks are relatively low-priority, we would recommend creating a separate backlog altogether. Backlogs are meant to be a database of tasks but do not let that fact get over your head. You should not worry about deleting something important from your backlog, if the task is important, it will come back. You can use sites like Trello or Slack to create and maintain a backlog. Step 3: Standardized Procedure for Tasks You should have an accurate definition of “done”. With respect to software development, there are several things you need to consider before actually accomplishing a task. These include: Ensure all the features have been applied The unit tests are finished Software information is up-to-date Quality assurance tests have been carried out The code is in the master branch The code is deployed in the production This is simply a template of what you can consider “done” with respect to a software development project. Like any template, it gets even better when you include your additions and subtractions to it. Having a standardized definition of “done” helps remove confusion from the project so that every employee has an understanding of every stage until they are finished. and also gives you time to think about what you are trying to achieve. Lastly, it is always wise to spend a little extra time completing a task phase, so that you do not have to revisit it several times. Step 4: Work in Progress (WIP) Control The ultimate factor that kills workflow is multi-tasking. Overloading your employees with constant tasks results in an overall decline in output. Therefore, it is important that you do not exert your employees with multiple tasks, which only increases their work in progress. In order to fight the problem of multitasking, you need to reduce your cycle times by having fewer tasks at one time. Consider setting a WIP limit inside your workflow by introducing limits for daily and weekly tasks. This helps to keep control of the employee tasks and reduces their burden. Step 5: Progress Visualization When you have everything set up in your workflow, it is time to represent that data to present and potential stakeholders. You need to make it clear that all of the features are completed and the ones you are currently working on. And if you will be releasing the product on time or no? A good way to represent data to senior management is through visualizations. With visualizations, you can use tools like Jira or Trello to make your data shine even more. 
In terms of data representation, you can use various free online tools, or buy software like Microsoft PowerPoint or Excel. Whatever tools you might use, your end-goal should be to make the information as simple as possible to the stakeholders. You need to avoid clutter and too much technical information. However, these are not the only methods you can use. Look around your company and see where you are lacking in your current processes. Take note of all of them, and research on how you can change them for the better. Author Bio Shawn Mike has been working with writing challenging clients for over five years. He provides ghostwriting, and copywriting services. His educational background in the technical field and business studies has given him the edge to write on many topics. He occasionally writes blogs for Dynamologic Solutions. Microsoft Store updates its app developer agreement, to give developers up to 95% of app revenue React Native Vs Ionic: Which one is the better mobile app development framework? 9 reasons to choose Agile Methodology for Mobile App Development

A serverless online store on AWS could save you money. Build one.

Savia Lobo
14 Jun 2018
9 min read
In this article you will learn to build an entire serverless project of an AWS online store, beginning with a React SPA frontend hosted on AWS, followed by a serverless backend with API Gateway and Lambda functions. This article is an excerpt taken from the book, 'Building Serverless Web Applications' written by Diego Zanon. In this book, you will be introduced to the AWS services, and you'll learn how to estimate costs, and how to set up and use the Serverless Framework.

The serverless architecture of AWS' online store

We will build a real-world use case of a serverless solution. This sample application is an online store with the following requirements:

- List of available products
- Product details with user rating
- Add products to a shopping cart
- Create account and login pages

For a better understanding of the architecture, take a look at the following diagram, which gives a general view of how the different services are organized and how they interact:

Estimating costs

In this section, we will estimate the costs of our sample application demo based on some usage assumptions and Amazon's pricing model. All pricing values used here are from mid 2017 and consider the cheapest region, US East (Northern Virginia). This section covers an example to illustrate how costs are calculated. Since the billing model and prices can change over time, always refer to the official sources to get updated prices before making your own estimations. You can use Amazon's calculator, which is accessible at this link: http://calculator.s3.amazonaws.com/index.html. If you still have any doubts after reading the instructions, you can always contact Amazon's support for free to get commercial guidance.

Assumptions

For our pricing example, we can assume that our online store will receive the following traffic per month:

- 100,000 page views
- 1,000 registered user accounts
- 200 GB of data transferred, considering an average page size of 2 MB
- 5,000,000 code executions (Lambda functions) with an average of 200 milliseconds per request

Route 53 pricing

We need a hosted zone for our domain name and it costs US$ 0.50 per month. Also, we need to pay US$ 0.40 per million DNS queries to our domain. As this is a prorated cost, 100,000 page views will cost only US$ 0.04.

Total: US$ 0.54

S3 pricing

Amazon S3 charges you US$ 0.023 per GB/month stored, US$ 0.004 per 10,000 requests to your files, and US$ 0.09 per GB transferred. However, as we are considering CloudFront usage, transfer costs will be charged at CloudFront prices and will not be considered in the S3 billing. If our website occupies less than 1 GB of static files and each page averages 2 MB across 20 files, we can serve 100,000 page views for less than US$ 20. Considering CloudFront, S3 costs will go down to US$ 0.82, while you need to pay for CloudFront usage in another section. Real costs would be even lower because CloudFront caches files and it would not need to make 2,000,000 file requests to S3, but let's skip this detail to reduce the complexity of this estimation.

On a side note, the cost would be much higher if you had to provision machines to handle this number of page views to a static website with the same availability and scalability.

Total: US$ 0.82
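To make this arithmetic easy to re-run with your own traffic assumptions, here is a minimal sketch in Python; the prices are the mid-2017 figures quoted above and will differ today, and the page-size and file-count inputs match the assumptions in this article.

```python
# Mid-2017 prices quoted in this article (US East); check current pricing before relying on them.
S3_STORAGE_PER_GB = 0.023      # per GB-month stored
S3_PER_10K_REQUESTS = 0.004    # per 10,000 requests
ROUTE53_HOSTED_ZONE = 0.50     # per month
ROUTE53_PER_M_QUERIES = 0.40   # per million DNS queries

page_views = 100_000
files_per_page = 20
avg_page_mb = 2

transferred_gb = page_views * avg_page_mb / 1024   # ~195 GB, rounded to 200 GB in the text
s3_requests = page_views * files_per_page          # 2,000,000 requests (ignoring CDN caching)

s3_cost = 1 * S3_STORAGE_PER_GB + (s3_requests / 10_000) * S3_PER_10K_REQUESTS
route53_cost = ROUTE53_HOSTED_ZONE + (page_views / 1_000_000) * ROUTE53_PER_M_QUERIES

print(f"Estimated data transferred: {transferred_gb:.0f} GB")
print(f"Route 53: US$ {route53_cost:.2f}  S3 (storage + requests): US$ {s3_cost:.2f}")
```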
CloudFront pricing

CloudFront is slightly more complicated to price since you need to guess how much traffic comes from each region, as they are priced differently. The following table shows an example of estimation:

Region          Estimated traffic   Cost per GB transferred   Cost per 10,000 HTTPS requests
North America   70%                 US$ 0.085                 US$ 0.010
Europe          15%                 US$ 0.085                 US$ 0.012
Asia            10%                 US$ 0.140                 US$ 0.012
South America   5%                  US$ 0.250                 US$ 0.022

As we have estimated 200 GB of files transferred with 2,000,000 requests, the total will be US$ 21.97.

Total: US$ 21.97

Certificate Manager pricing

Certificate Manager provides SSL/TLS certificates for free. You only need to pay for the AWS resources you create to run your application.

IAM pricing

There is no charge specifically for IAM usage. You will be charged only for the AWS resources your users are consuming.

Cognito pricing

Each user has an associated profile that costs US$ 0.0055 per month. However, there is a permanent free tier that allows 50,000 monthly active users without charges, which is more than enough for our use case.

Besides that, we are charged for Cognito syncs of our user profiles. It costs US$ 0.15 for each 10,000 sync operations and US$ 0.15 per GB/month stored. If we estimate 1,000 active and registered users with less than 1 MB per profile, and fewer than 10 visits per month on average, we can estimate a charge of US$ 0.30.

Total: US$ 0.30

IoT pricing

IoT charges start at US$ 5 per million messages exchanged. As each page view will make at least 2 requests, one to connect and another to subscribe to a topic, we can estimate a minimum of 200,000 messages per month. We need to add 1,000 messages if we suppose that 1% of the users will rate the products, and we can ignore other requests like disconnect and unsubscribe because they are excluded from billing. In this setting, the total cost would be US$ 1.01.

Total: US$ 1.01

SNS pricing

We will use SNS only for internal notifications, when CloudWatch triggers a warning about issues in our infrastructure. SNS charges US$ 2.00 per 100,000 e-mail messages, but it offers a permanent free tier of 1,000 e-mails. So, it will be free for us.

CloudWatch pricing

CloudWatch charges US$ 0.30 per metric/month and US$ 0.10 per alarm, and offers a permanent free tier of 50 metrics and 10 alarms per month. If we create 20 metrics and expect 20 alarms in a month, we can estimate a cost of US$ 1.00.

Total: US$ 1.00

API Gateway pricing

API Gateway starts charging US$ 3.50 per million API calls received and US$ 0.09 per GB transferred out to the Internet. If we assume 5 million requests per month, each with a response of 1 KB on average, the total cost of this service will be US$ 17.93.

Total: US$ 17.93

Lambda pricing

When you create a Lambda function, you need to configure the amount of RAM memory that will be available for use. It ranges from 128 MB to 1.5 GB. Allocating more memory means additional costs. It breaks the philosophy of avoiding provisioning, but at least it's the only thing you need to worry about. The good practice here is to estimate how much memory each function needs and make some tests before deploying to production. A bad provision may result in errors or higher costs.

Lambda has the following billing model:

- US$ 0.20 per 1 million requests
- US$ 0.00001667 per GB-second

Running time is counted in fractions of seconds, rounding up to the nearest multiple of 100 milliseconds. Furthermore, there is a permanent free tier that gives you 1 million requests and 400,000 GB-seconds per month without charges. In our use case scenario, we have assumed 5 million requests per month with an average of 200 milliseconds per execution.
We can also assume that the allocated RAM memory is 512 MB per function:

- Request charges: Since 1 million requests are free, you pay for 4 million, which will cost US$ 0.80.
- Compute charges: Here, 5 million executions of 200 milliseconds each gives us 1 million seconds. As we are running with a 512 MB capacity, it results in 500,000 GB-seconds, where 400,000 GB-seconds of these are free, resulting in a charge of 100,000 GB-seconds that costs US$ 1.67.

Total: US$ 2.47
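The Lambda arithmetic above is easy to reproduce, which is useful when your own request volume, duration, or memory setting differs. Here is a minimal sketch; the prices and free-tier figures are the mid-2017 numbers used in this article.

```python
# Recomputing the Lambda estimate above; prices are the mid-2017 figures used in this article.
PRICE_PER_M_REQUESTS = 0.20        # US$ per million requests (first million free)
PRICE_PER_GB_SECOND = 0.00001667   # US$ per GB-second (first 400,000 GB-seconds free)
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000

requests = 5_000_000
avg_duration_ms = 200              # already a multiple of the 100 ms billing increment
memory_gb = 512 / 1024

billable_requests = max(0, requests - FREE_REQUESTS)
request_cost = billable_requests / 1_000_000 * PRICE_PER_M_REQUESTS

gb_seconds = requests * (avg_duration_ms / 1000) * memory_gb
billable_gb_seconds = max(0, gb_seconds - FREE_GB_SECONDS)
compute_cost = billable_gb_seconds * PRICE_PER_GB_SECOND

print(f"Requests: US$ {request_cost:.2f}, compute: US$ {compute_cost:.2f}, "
      f"total: US$ {request_cost + compute_cost:.2f}")   # US$ 0.80 + US$ 1.67 = US$ 2.47
```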
SimpleDB pricing

Take a look at the following SimpleDB billing, where the free tier is valid for new and existing users:

- US$ 0.14 per machine-hour (25 hours free)
- US$ 0.09 per GB transferred out to the internet (1 GB is free)
- US$ 0.25 per GB stored (1 GB is free)

Take a look at the following charges:

- Compute charges: Considering 5 million requests with an average of 200 milliseconds of execution time, where 50% of this time is waiting for the database engine to execute, we estimate 139 machine hours per month. Discounting 25 free hours, we have an execution cost of US$ 15.96.
- Transfer costs: Since we'll transfer data between SimpleDB and AWS Lambda, there is no transfer cost.
- Storage charges: If we assume a 5 GB database, it results in US$ 1.00, since 1 GB is free.

Total: US$ 16.96, but this will not be added to the final estimation since we will run our application using DynamoDB.

DynamoDB

DynamoDB requires you to provision the throughput capacity that you expect your tables to offer. Instead of provisioning hardware, memory, CPU, and other factors, you need to say how many read and write operations you expect, and AWS will handle the necessary machine resources to meet your throughput needs with consistent and low-latency performance.

One read capacity unit represents one strongly consistent read per second or two eventually consistent reads per second, for objects with a size up to 4 KB. Regarding the writing capacity, one unit means that you can write one object of size 1 KB per second.

Considering these definitions, AWS offers in the permanent free tier 25 read units and 25 write units of throughput capacity, in addition to 25 GB of free storage. It charges as follows:

- US$ 0.47 per month for every Write Capacity Unit (WCU)
- US$ 0.09 per month for every Read Capacity Unit (RCU)
- US$ 0.25 per GB/month stored
- US$ 0.09 per GB transferred out to the Internet

Since our estimated database will have only 5 GB, we are on the free tier, and we will not pay for transferred data because there is no transfer cost to AWS Lambda. Regarding read/write capacities, we have estimated 5 million requests per month. If we evenly distribute them, we will get two requests per second. In this case, we will consider that it's one read and one write operation per second.

We now need to estimate how many objects are affected by a read and a write operation. For a write operation, we can estimate that we will manipulate 10 items on average, and a read operation will scan 100 objects. In this scenario, we would need to reserve 10 WCU and 100 RCU. As we have 25 WCU and 25 RCU for free, we only need to pay for 75 RCU per month, which costs US$ 6.75.

Total: US$ 6.75

Total pricing

Let's summarize the cost of each service in the following table:

Service        Monthly costs
Route 53       US$ 0.54
S3             US$ 0.82
CloudFront     US$ 21.97
Cognito        US$ 0.30
IoT            US$ 1.01
CloudWatch     US$ 1.00
API Gateway    US$ 17.93
Lambda         US$ 2.47
DynamoDB       US$ 6.75
Total          US$ 52.79

It results in a total cost of ~ US$ 50 per month in infrastructure to serve 100,000 page views. If you have a conversion rate of 1%, you can get 1,000 sales per month, which means that you pay US$ 0.05 in infrastructure for each product that you sell.

Thus, in this article you learned about the serverless architecture of an AWS online store and also how to estimate its costs. If you've enjoyed reading the excerpt, do check out Building Serverless Web Applications to monitor the performance, efficiency and errors of your apps and also learn how to test and deploy your applications.

Google Compute Engine Plugin makes it easy to use Jenkins on Google Cloud Platform
Serverless computing wars: AWS Lambdas vs Azure Functions
Using Amazon Simple Notification Service (SNS) to create an SNS topic

Everything you need to know about Ethereum

Packt Editorial Staff
10 Apr 2018
8 min read
Ethereum was first conceived of by Vitalik Buterin in November 2013. The critical idea proposed was the development of a Turing-complete language that allows the development of arbitrary programs (smart contracts) for Blockchain and decentralized applications. This concept is in contrast to Bitcoin, where the scripting language is limited in nature and allows necessary operations only. This is an excerpt from the second edition of Mastering Blockchain by Imran Bashir.

The following table shows all the releases of Ethereum, starting from the first release to the planned final release:

Version                                  Release date
Olympic                                  May, 2015
Frontier                                 July 30, 2015
Homestead                                March 14, 2016
Byzantium (first phase of Metropolis)    October 16, 2017
Metropolis                               To be released
Serenity (final version of Ethereum)     To be released

The first version of Ethereum, called Olympic, was released in May, 2015. Two months later, a second version was released, called Frontier. After about a year, another version named Homestead, with various improvements, was released in March, 2016. The latest Ethereum release is called Byzantium. This is the first part of the development phase called Metropolis. This release implemented a planned hard fork at block number 4,370,000 on October 16, 2017. The second part of this release, called Constantinople, is expected in 2018, but there is no exact time frame available yet. The final planned release of Ethereum is called Serenity. It's planned for Serenity to introduce the final version of a PoS-based blockchain instead of PoW.

The yellow paper

The Yellow Paper, written by Dr. Gavin Wood, serves as a formal definition of the Ethereum protocol. Anyone can implement an Ethereum client by following the protocol specifications defined in the paper. While this paper is a challenging read, especially for those who do not have a background in algebra or mathematics, it contains a complete formal specification of Ethereum. This specification can be used to implement a fully compliant Ethereum client.

The list of symbols used in the paper is provided here with their meanings, in the anticipation that it will make reading the yellow paper more accessible. Once symbol meanings are known, it will be much easier to understand how Ethereum works in practice.

Symbol     Meaning
≡          Is defined as
≤          Less than or equal to
=          Is equal to
σ          Sigma, World state
≠          Is not equal to
μ          Mu, Machine state
║...║      Length of
Υ          Upsilon, Ethereum state transition function
∈          Is an element of
Π          Block level state transition function
∉          Is not an element of
.          Sequence concatenation
∀          For all
∃          There exists
∪          Union
Λ          Contract creation function
∧          Logical AND
           Increment
:          Such that
⌊ ⌋        Floor, lowest element
{}         Set
⌈ ⌉        Ceiling, highest element
()         Function of tuple
           No of bytes
[]         Array indexing
⊕          Exclusive OR
∨          Logical OR
(a, b)     Real numbers >= a and < b
>          Is greater than
∅          Empty set, null
+          Addition
-          Subtraction
∑          Summation
{          Describing various cases of if, otherwise

Ethereum blockchain

Ethereum, like any other blockchain, can be visualized as a transaction-based state machine. This definition is referred to in the Yellow Paper. The core idea is that in the Ethereum blockchain, a genesis state is transformed into a final state by executing transactions incrementally. The final transformation is then accepted as the absolute undisputed version of the state.
In the following diagram, the Ethereum state transition function is shown, where a transaction execution has resulted in a state transition. In the example, a transfer of two Ether from address 4718bf7a to address 741f7a2 is initiated. The initial state represents the state before the transaction execution, and the final state is what the morphed state looks like. Mining plays a central role in state transition, and we will elaborate on the mining process in detail in the later sections. The state is stored on the Ethereum network as the world state. This is the global state of the Ethereum blockchain.

How Ethereum works from a user's perspective

For all the conversation around cryptocurrencies, it's very rare for anyone to actually explain how it works from the perspective of a user. Let's take a look at how it works in practice. In this example, I'll use the example of one man (Bashir) transferring money to another (Irshad). You may also want to read our post on if Ethereum will eclipse bitcoin. For the purposes of this example, we're using the Jaxx wallet. However, you can use any cryptocurrency wallet for this.

1. First, either a user requests money by sending the request to the sender, or the sender decides to send money to the receiver. The request can be sent by sending the receiver's Ethereum address to the sender. For example, there are two users, Bashir and Irshad. If Irshad requests money from Bashir, then she can send a request to Bashir by using a QR code. Once Bashir receives this request, he will either scan the QR code or manually type in Irshad's Ethereum address and send Ether to Irshad's address. This request is encoded as a QR code, shown in the following screenshot, which can be shared via email, text or any other communication method.
2. Once Bashir receives this request, he will either scan this QR code or copy the Ethereum address into the Ethereum wallet software and initiate a transaction. This process is shown in the following screenshot, where the Jaxx Ethereum wallet software on iOS is used to send money to Irshad. The screenshot shows that the sender has entered both the amount and the destination address for sending Ether. Just before sending the Ether, the final step is to confirm the transaction, which is also shown here.
3. Once the request (transaction) of sending money is constructed in the wallet software, it is then broadcast to the Ethereum network. The transaction is digitally signed by the sender as proof that he is the owner of the Ether.
4. This transaction is then picked up by nodes called miners on the Ethereum network for verification and inclusion in the block. At this stage, the transaction is still unconfirmed.
5. Once it is verified and included in the block, the PoW process begins. Once a miner finds the answer to the PoW problem, by repeatedly hashing the block with a new nonce, this block is immediately broadcast to the rest of the nodes, which then verify the block and the PoW.
6. If all the checks pass, then this block is added to the blockchain, and miners are paid rewards accordingly.
7. Finally, Irshad gets the Ether, and it is shown in her wallet software.

On the blockchain, this transaction is identified by the following transaction hash: 0xc63dce6747e1640abd63ee63027c3352aed8cdb92b6a02ae25225666e171009e

Details regarding this transaction can be visualized from the block explorer, as shown in the following screenshot. This walkthrough should give you some idea of how it works.
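The state transition idea described above can be illustrated with a deliberately simplified sketch in Python. Real Ethereum state transitions also involve signatures, nonces, gas accounting, and contract code, all of which are omitted here, and the addresses and balances are made up (the short addresses echo the example in the text).

```python
# A deliberately simplified, toy model of a balance-transfer state transition.
# Real Ethereum also checks signatures and nonces, charges gas, and runs contract code.

def apply_transaction(state: dict, tx: dict) -> dict:
    """Return the new world state produced by applying one transfer transaction."""
    sender, recipient, value = tx["from"], tx["to"], tx["value"]
    if state.get(sender, 0) < value:
        raise ValueError("insufficient balance; transaction is invalid")
    new_state = dict(state)                      # treat each state as an immutable snapshot
    new_state[sender] = new_state[sender] - value
    new_state[recipient] = new_state.get(recipient, 0) + value
    return new_state

# Genesis-like starting state: balances in Ether (made-up numbers and shortened addresses).
genesis = {"0x4718bf7a": 10, "0x741f7a2": 1}

tx = {"from": "0x4718bf7a", "to": "0x741f7a2", "value": 2}
final_state = apply_transaction(genesis, tx)

print(genesis)      # {'0x4718bf7a': 10, '0x741f7a2': 1}
print(final_state)  # {'0x4718bf7a': 8, '0x741f7a2': 3}
```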
Different Ethereum networks

The Ethereum network is a peer-to-peer network where nodes participate in order to maintain the blockchain and contribute to the consensus mechanism. Networks can be divided into three types, based on requirements and usage. These types are described in the following subsections.

Mainnet

Mainnet is the current live network of Ethereum. The current version of mainnet is Byzantium (Metropolis) and its chain ID is 1. The chain ID is used to identify the network. A block explorer, which shows detailed information about blocks and other relevant metrics, is available here and can be used to explore the Ethereum blockchain.

Testnet

Testnet is the widely used test network for the Ethereum blockchain. This test blockchain is used to test smart contracts and DApps before they are deployed to the live production blockchain. Because it is a test network, it allows experimentation and research. The main testnet is called Ropsten, and it contains all the features of the other smaller, special-purpose testnets that were created for specific releases. For example, other testnets include Kovan and Rinkeby, which were developed for testing Byzantium releases. The changes that were implemented on these smaller testnets have also been implemented on Ropsten, so the Ropsten test network now contains all the properties of Kovan and Rinkeby.

Private net

As the name suggests, this is a private network that can be created by generating a new genesis block. This is usually the case in private blockchain distributed ledger networks, where a private group of entities start their own blockchain and use it as a permissioned blockchain. The following list shows the Ethereum networks with their network IDs. These network IDs are used by Ethereum clients to identify the network.

Ethereum mainnet: 1
Morden: 2
Ropsten: 3
Rinkeby: 4
Kovan: 42
Ethereum Classic mainnet: 61

You should now have a good foundation of knowledge to get started with Ethereum. To learn more about Ethereum and other cryptocurrencies, check out the new edition of Mastering Blockchain.

Other posts from this book:
A brief history of Blockchain
Write your first Blockchain: Learning Solidity Programming in 15 minutes
15 ways to make Blockchains scalable, secure and safe!
What is Bitcoin

Why do IT teams need to transition from DevOps to DevSecOps?

Guest Contributor
13 Jul 2019
8 min read
Does your team perform security testing during development? If not, why not? Cybercrime is on the rise, and formjacking, ransomware, and IoT attacks have increased alarmingly in the last year. This makes security a priority at every stage of development. In this kind of threat environment, development teams around the globe should take a more proactive approach to threat detection. This can be done in a number of ways. There are some basic techniques that development teams can use to protect their development environments. But ultimately, what is needed is the integration of threat identification and management into the development process itself. Integrated processes like this are referred to as DevSecOps, and in this guide, we'll take you through some of the advantages of transitioning to DevSecOps.

Protect Your Development Environment

First, though, let's look at some basic measures that can help to protect your development environment. For both individuals and enterprises, online privacy is perhaps the most valuable currency of all. Proxy servers, Tor, and virtual private networks (VPNs) have slowly crept into the lexicon of internet users as cost-effective privacy tools to consider if you want to avoid drawing the attention of hackers. But what about enterprises? Should they use the same tools? They would prefer to avoid hackers as well. The answer is more complicated. Encryption and authentication should be addressed early in the development process, especially given the common practice of using open source libraries for app coding. The advanced security protocols that power many popular consumer VPN services make them a good first step towards protecting code and any proprietary technology. Additional controls, like using two-factor authentication and limiting who has access, will further protect the development environment and procedures. Beyond these basic measures, though, it is also worth looking in detail at your entire development process and integrating security management at every stage. This is sometimes referred to as integrating DevOps and DevSecOps.

DevOps vs. DevSecOps: What's the Difference?

DevOps and DevSecOps are not separate entities, but different facets of the development process. Traditionally, DevOps teams work to integrate software development and implementation in order to facilitate the rapid delivery of new business applications. Since this process omits security testing and solutions, many security flaws and vulnerabilities aren't addressed early enough in the development process. With a new approach, DevSecOps, this omission is addressed by automating security-related tasks and integrating controls and functions like composition analysis and configuration management into the development process. Previously, DevSec focused only on automating security code testing, but it is gradually transitioning to incorporate an operations-centric approach. This helps in reconciling two environments that are opposite by nature: DevOps is forward-looking because it is geared toward rapid deployment, while development security looks backward to analyze and predict future issues. By prioritizing security analysis and automation, teams can still improve delivery speed without the need to retroactively find and deal with threats.

Best Practices: How DevSecOps Should Work

The goal of current DevSecOps best practices is to implement a shift towards real-time threat detection rather than relying on after-the-fact analysis.
This enables more efficient application development that recognizes and deals with issues as they happen, rather than waiting until there's a problem. This can be done by developing a more effective strategy while adopting DevSecOps practices. When all areas of concern are addressed, it results in:

Automatic code procurement: Automatic code procurement eliminates the problem of human error and of incorporating weak or flawed code. This benefits developers by allowing vulnerabilities and flaws to be discovered and corrected earlier in the process.

Uninterrupted security deployment: Uninterrupted security deployment through the use of automation tools that work in real time. This is done by creating closed-loop testing and reporting, and real-time threat resolution.

Leveraged security resources: Leveraged security resources through automation. Automated DevSecOps tooling typically addresses areas related to threat assessment, event monitoring, and code security. This frees your IT or security team to focus on other areas, like threat remediation and elimination.

There are five areas that need to be addressed in order for DevSecOps to be effective:

Code analysis: By delivering code in smaller modules, teams are able to identify and address vulnerabilities faster.

Management changes: Adapting the protocol for changes in management or admins allows users to act on changes faster, as well as enabling security teams to analyze their impact in real time. This eliminates the problem of getting calls about problems with system access after the application is deployed.

Compliance: Addressing compliance with the Payment Card Industry Data Security Standard (PCI DSS) and the new General Data Protection Regulation (GDPR) earlier helps prevent failed audits and heavy fines. It also ensures that you have all of your reporting ready to go in the event of a compliance audit.

Automating threat and vulnerability detection: Threats evolve and proliferate fast, so security should be agile enough to deal with emerging threats each time code is updated or altered. Automating threat detection earlier in the development process improves response times considerably.

Training programs: Comprehensive security response begins with proper IT security training. Developers should craft a training protocol that ensures all personnel who are responsible for security are up to date and on the same page. Organizations should bring security and IT staff into the process sooner. That means advising current team members of current procedures and ensuring that all new staff are thoroughly trained.

Finding the Right Tools for DevSecOps Success

Does a doctor operate with a chainsaw? Hopefully not. Likewise, all of the above points are nearly impossible to achieve without the right tools to get the job done with precision. What should your DevSec team keep in their toolbox?

Automation tools

Automation tools provide scripted remediation recommendations for the security threats they detect. One such tool is Automate DAST, which scans new or modified code against the Open Web Application Security Project's (OWASP) list of the most common flaws, such as SQL injection errors. These are flaws you might have missed during static analysis of your application code.
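As a rough illustration of what this kind of automation looks like in practice, here is a minimal sketch of a dependency check that could run on every commit. It is not how Automate DAST or any specific scanner works; the advisory list and package names are entirely made up, and a real pipeline would pull advisories from a vulnerability database and fail the build when a match is found.

```python
# Sketch of an automated dependency check run on every commit. The advisory
# "feed" is hard-coded and fictional; a real pipeline would pull advisories
# from a vulnerability database and break the build when a match is found.
import sys

ADVISORIES = {                                   # package -> (affected version, note); fictional
    "example-web": ("1.2.0", "SQL injection in query builder"),
    "example-auth": ("0.9.1", "weak password hashing default"),
}

def scan(dependencies):
    findings = []
    for pkg, version in dependencies.items():
        advisory = ADVISORIES.get(pkg)
        if advisory and advisory[0] == version:
            findings.append(f"{pkg}=={version}: {advisory[1]}")
    return findings

if __name__ == "__main__":
    deps = {"example-web": "1.2.0", "example-utils": "2.4.7"}   # e.g. parsed from a lock file
    problems = scan(deps)
    for problem in problems:
        print("VULNERABLE:", problem)
    sys.exit(1 if problems else 0)               # a non-zero exit fails the CI stage
```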
Attack modeling tools

Attack modeling tools create models of possible attack matrices and map their implications. There are plenty of attack modeling tools available, but a good one for identifying cloud vulnerabilities is Infection Monkey, which simulates attacks against the parts of your infrastructure that run on major public cloud hosts like Google Cloud, AWS, and Azure, as well as most cloud storage providers like Dropbox and pCloud.

Visualization tools

Visualization tools are used for evolving, identifying, and sharing findings with the operations team. An example of this type of tool is PortVis, developed by a team led by Professor Kwan-Liu Ma at the University of California, Davis. PortVis is designed to display activity by host or port in three different modes: a grid visualization, in which all network activity is displayed on a single grid; a volume visualization, which extends the grid to a three-dimensional volume; and a port visualization, which allows devs to visualize the activity on specific ports over time. Using this tool, different types of attack can be easily distinguished from each other.

Alerting tools

Alerting tools prioritize threats and send alerts so that the most hazardous vulnerabilities can be addressed immediately. WhiteSource Bolt, for instance, is a useful tool of this type, designed to improve the security of open source components. It does this by checking these components against known security threats and providing security alerts to devs. These alerts also auto-generate issues within GitHub, where devs can see details such as references for the CVE, its CVSS rating, and a suggested fix; there is even an option to assign the vulnerability to another team member using the milestones feature.

The Bottom Line

Combining DevOps and DevSec is not a meshing of two separate disciplines, but rather the natural transition of development to a more comprehensive approach that takes security into account earlier in the process, and does it in a more meaningful way. This saves a lot of time and hassle by addressing enterprise security requirements before deployment rather than probing for flaws later. The sooner your team hops on board with DevSecOps, the better.

Author Bio

Gary Stevens is a front-end developer. He's a full-time blockchain geek and a volunteer working for the Ethereum Foundation, as well as an active GitHub contributor.

Is DevOps really that different from Agile? No, says Viktor Farcic [Podcast]
Does it make sense to talk about DevOps engineers or DevOps tools?
How Visual Studio Code can help bridge the gap between full-stack development and DevOps

8 Myths about RPA (Robotic Process Automation)

Savia Lobo
08 Nov 2017
9 min read
Many say we are on the cusp of the fourth industrial revolution, which promises to blur the lines between the real, virtual, and biological worlds. Among the many trends, Robotic Process Automation (RPA) is one of the buzzwords caught up in the hype of the fourth industrial revolution. Although poised to be a $6.7 trillion industry by 2025, RPA is shrouded in just as much fear as it is brimming with potential. We have heard time and again how automation can improve productivity, efficiency, and effectiveness while conducting business in transformative ways. We have also heard how automation, and machine-driven automation in particular, can displace humans and thereby lead to a dystopian world. As humans, we make assumptions based on what we see and understand. But sometimes those assumptions become so ingrained that they evolve into myths that many start accepting as facts. Here is a closer look at some of the myths surrounding RPA.

Myth 1: RPA means robots will automate processes

The term robot evokes in our minds a picture of a metal humanoid with stiff joints that speaks in a monotone. RPA does mean robotic process automation, but the robot doing the automation is nothing like the ones we are used to seeing in the movies. These are software robots that perform routine processes within organizations. They are often referred to as virtual workers or a digital workforce, complete with their own identity and credentials. They essentially consist of algorithms programmed by RPA developers with the aim of automating mundane business processes. These processes are repetitive, highly structured, fall within a well-defined workflow, consist of a finite set of tasks/steps, and may often be monotonous and labor intensive. Let us consider a real-world example here: automating the invoice generation process. The RPA system will run through all the emails in the system and download the PDF files containing details of the relevant transactions. Then, it will fill a spreadsheet with the details and maintain all the records therein. Later, it will log on to the enterprise system and generate appropriate invoice reports for each entry in the spreadsheet. Once the invoices are created, the system will send a confirmation mail to the relevant stakeholders. Here, the RPA user only specifies the individual tasks that are to be automated, and the system takes care of the rest of the process. So, yes, while it is true that RPA involves robots automating processes, it is a myth that these robots are physical entities or that they can automate all processes.

Myth 2: RPA is useful only in industries that rely heavily on software

"Almost anything that a human can do on a PC, the robot can take over without the need for IT department support." - Richard Bell, former Procurement Director at Averda

RPA is software that can be injected into a business process. Traditional industries such as banking and finance, healthcare, and manufacturing, which have significant routine tasks that depend on software for some of their functioning, can benefit from RPA. Loan processing and patient data processing are some examples. RPA, however, cannot help with automating the assembly line in a manufacturing unit or with performing regular tests on patients. Even in industries that maintain daily essential utilities such as cooking gas, electricity, and telephone services, RPA can be put to use for generating automated bills, invoices, meter readings, and so on.
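To make the invoice-generation example from the first myth a little more concrete, here is a minimal sketch of that workflow in Python. Every helper is a stub returning dummy data; in a real deployment each step would be an activity configured in whichever RPA platform you use, so treat the names and data as placeholders.

```python
# Sketch of the invoice-generation bot described under Myth 1. All helpers are
# stubs with dummy data; each would map to an activity in a real RPA tool.
def fetch_transaction_emails():                  # stub: pretend two emails arrived
    return [{"id": 1, "pdfs": [{"customer": "Acme", "amount": 120}]},
            {"id": 2, "pdfs": [{"customer": "Globex", "amount": 80}]}]

def extract_details(pdf):                        # stub: a real bot would parse the PDF
    return {"customer": pdf["customer"], "amount": pdf["amount"]}

def update_spreadsheet(records):                 # stub: keep a record of every transaction
    print("spreadsheet updated with", len(records), "rows")

def generate_invoice(record):                    # stub: log on and create the invoice report
    return f"INV-{record['customer']}-{record['amount']}"

def send_confirmation(invoice):                  # stub: notify the relevant stakeholders
    print("confirmation sent for", invoice)

def run_invoice_bot():
    records = [extract_details(pdf)
               for email in fetch_transaction_emails()
               for pdf in email["pdfs"]]
    update_spreadsheet(records)
    for record in records:
        send_confirmation(generate_invoice(record))

run_invoice_bot()
```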
By adopting RPA, businesses, irrespective of the industry they belong to, can achieve significant cost savings, operational efficiency, and higher productivity. To leverage the benefits of RPA, users need a clear understanding of business workflow processes and domain knowledge, rather than of the SDLC process. Industry professionals can be easily trained on how to put RPA into practice. The bottom line: RPA is not limited to industries that rely heavily on software to exist. But it is true that RPA can be used only in situations where some form of software is used to perform tasks manually.

Myth 3: RPA will replace humans in most frontline jobs

Many organizations employ a large workforce in frontline roles to do routine tasks such as data entry operations, managing processes, customer support, IT support, and so on. But frontline jobs are just as diverse as the people performing them. Take sales reps, for example. They bring in new business through their expert understanding of the company's products and their potential customer base, coupled with the associated soft skills. Currently, they spend significant time on administrative tasks such as developing and finalizing business contracts, updating the CRM database, and making daily status reports. Imagine the spike in productivity if these aspects could be taken off the plates of sales reps and they could just focus on cultivating relationships and converting leads. By replacing human effort in mundane tasks within frontline roles, RPA can help employees focus on higher value-yielding tasks. In conclusion, RPA will not replace humans in most frontline jobs. It will, however, replace humans in a few roles that are very rule-based and narrow in scope, such as simple data entry operators or basic invoice processing executives. In most frontline roles, like sales or customer support, RPA is quite likely to change, at least in some ways, how people see their job responsibilities. Also, the adoption of RPA will generate new job opportunities around the development, maintenance, and sale of RPA-based software.

Myth 4: Only large enterprises can afford to deploy RPA

The cost of implementing and maintaining the RPA software and training employees to use it can be quite high. This can make it an unfavorable business proposition for SMBs with fairly simple organizational processes and cross-departmental considerations. On the other hand, large organizations with higher revenue generation capacity, complex business processes, and a large army of workers can deploy an RPA system to automate high-volume tasks quite easily and recover that cost within a few months. It is obvious that large enterprises will benefit from RPA systems due to the economies of scale offered and the faster recovery of the investments made. SMBs (small to medium-sized businesses) can also benefit from RPA to automate their business processes, but this is possible only if they look at RPA as a strategic investment whose cost will be recovered over a longer time period of, say, 2-4 years.

Myth 5: RPA adoption should be owned and driven by the organization's IT department

The RPA team handling the automation process need not be from the IT department. The main role of the IT department is to provide the resources necessary for the software to function smoothly. An RPA reliability team trained in using RPA tools is made up of business operations professionals rather than IT professionals.
In simple terms, RPA is not owned by the IT department but by the whole business, and it is driven by the RPA team.

Myth 6: RPA is an AI virtual assistant specialized to do a narrow set of tasks

An RPA bot performs a narrow set of tasks based on the given data and instructions. It is a system of rule-based algorithms that can be used to capture, process, and interpret streams of data, trigger appropriate responses, and communicate with other processes. However, it cannot learn on its own, which is a key trait of an AI system. Advanced AI concepts such as reinforcement learning and deep learning are yet to be incorporated into robotic process automation systems. Thus, an RPA bot is not an AI virtual assistant like Apple's Siri, for example. That said, it is not impractical to think that in the future, these systems will be able to think on their own, decide the best possible way to execute a business process, and learn from their own actions to improve the system.

Myth 7: To use the RPA software, one needs to have basic programming skills

Surprisingly, this is not true. Associates who use the RPA system need not have any programming knowledge. They only need to understand how the software works on the front end and how they can assign tasks to the RPA worker for automation. On the other hand, RPA system developers do require some programming skills, such as knowledge of scripting languages. Today, there are various platforms for developing RPA tools, such as UiPath, Blue Prism, and more, which empower RPA developers to build these systems without any hassle, reducing their coding responsibilities even more.

Myth 8: RPA software is fully automated and does not require human supervision

This is a big myth. RPA is often misunderstood as a completely automated system. Humans are indeed required to program the RPA bots, to feed them tasks for automation, and to manage them. The automation factor lies in aggregating and performing various tasks that would otherwise require more than one human to complete. There's also the efficiency factor that comes into play: RPA systems are fast, and they almost completely avoid faults in the system or the process that are otherwise caused by human error. Having a digital workforce in place is far more profitable than recruiting a human workforce.

Conclusion

One of the most talked-about areas of technological innovation, RPA is clearly still in its early days and is surrounded by a lot of myths. However, there's little doubt that its adoption will take off rapidly as RPA systems become more scalable, more accurate, and faster to deploy. AI-, cognitive-, and analytics-driven RPA will take it up a notch or two, and help businesses improve their processes even more by taking dull, repetitive tasks away from people. Hype can get ahead of reality, as we've seen quite a few times, but RPA is an area definitely worth keeping an eye on despite all the hype.

Is Linux hard to learn?

Jay LaCroix
30 Jan 2018
6 min read
This post is an extract from Linux Mint Essentials by Jay LaCroix. Quite often, I am asked whether or not Linux is hard to learn. The reputation Linux has of being hard to use and learn most likely stems from the early days, when typical distributions actually were quite difficult to use. I remember a time when simply installing a video card driver required manually recompiling the kernel (which took many hours), and enabling support for media such as MP3s required multiple manual commands. Nowadays, however, how difficult Linux is to learn and use is determined by which distribution you pick. If, for example, you're a beginner and you choose a distribution tailored for advanced users, you are likely to find yourself frustrated very quickly. In fact, there are distros available that make you do everything manually, such as choosing which version of the kernel to run and installing and configuring the desktop environment. This level of customizability is wonderful for advanced users who wish to build their own Linux system from the ground up, though it is more likely that beginners would be put off by it. General purpose distributions such as Mint are actually very easy to learn, and in some cases, some tasks in Mint are even easier to perform than in other operating systems. The ease of use we enjoy with a number of Linux distributions is due in part to the advancements that Ubuntu has made in usability. Around the time when Windows Vista was released, a renaissance of sorts occurred in the Linux community. At that time, quite a few people were so outraged by Windows Vista that a lot more effort was put into making Ubuntu easier to use. It can be argued that the Vista period saw the fastest growth in usability that Linux has ever seen. Tasks that were once rites of passage (such as installing drivers and media codecs) became trivial. The exciting changes in Ubuntu during that time inspired other distributions to make similar changes. Nowadays, usage of Ubuntu is beginning to decline due to the fact that not everyone is pleased with its new user interface (Unity); however, there is no denying the positive impact it had on Linux usability. Being based on Ubuntu, Mint inherits many of those benefits, but it also aims to improve on Ubuntu's perceived weaknesses. Due to its great reception, it eventually went on to surpass Ubuntu itself. Mint currently sits at the very top of the charts on Distrowatch.com, and with good reason: it's an amazing distribution. Distributions such as Mint are incredibly user friendly. Even the installation procedure is a cinch, and most can get through it by simply accepting the defaults. Installing new software is also straightforward, as everything is included in software repositories and managed through a graphical application. In fact, I recently acquired an HP printer that comes with a CD full of required software for Windows, but when connected to my Mint computer, it just worked. No installation of any software was required. Linux has never been easier!

Why use Linux Mint

When it comes to Linux, there are many distributions available, each vying for your attention. But which Linux distribution should you use? In this post, taken from Linux Mint Essentials, we'll explore why you should choose Linux Mint rather than larger distributions such as Fedora and Ubuntu. In the first instance, the user-friendly nature of Linux Mint is certainly a good reason to use it. However, there's much more to it than just that.
Of course, it's true that Ubuntu is the big player when it comes to Linux distributions, but because Linux Mint is built on Ubuntu, it has the strength of those foundations. That means that by choosing Mint, you're not compromising on what has become a standard in Linux. So, Linux Mint takes the already solid foundation of Ubuntu and improves on it by using a different user interface, adding custom tools, and including a number of further tweaks so that common media formats are recognized right from the start. It's not uncommon for a Linux distribution to be based on another distribution. This is because it's much easier to build a distribution on an already existing foundation, since building your own base is quite time consuming (and expensive). By utilizing the existing foundation of Ubuntu, Mint benefits from the massive software repository that Ubuntu has at its disposal, without having to reinvent the wheel and recreate everything from the ground up. The development time saved by doing this allows the Linux Mint developers to focus on adding exciting features and tweaks to improve its ease of use. Given the fact that Ubuntu is open source, it's perfectly fine to use it as a base for a completely separate distribution. Unlike in the proprietary software market, the developers of Mint aren't at risk of being sued for recycling the package base of another distribution. In fact, Ubuntu itself is built on the foundation of another distribution (Debian), and Mint is not the only distribution to use Ubuntu as a base. As mentioned before, Mint utilizes a different user interface than Ubuntu. Ubuntu ships with the Unity interface, which (so far) has not been highly regarded by the majority of the Linux community. Unity split Ubuntu's user community in half, as some people loved the new interface, while others were not so enthused and made their distaste well known. Rather than adopt Unity during this transition, Mint opted for two primary environments instead: Cinnamon and MATE. Cinnamon is recommended for more modern computers, and MATE is useful for older computers that are lower in processing power and memory. MATE is also useful for those who prefer the older style of Linux environments, as it is a fork of GNOME 2.x. Many people consider Cinnamon to be the default desktop environment in Linux Mint, but that is open to debate, as the Mint developers have yet to declare either of them the default. Mint actually ships five different versions (also known as spins) of its distribution. Four of them (Cinnamon, MATE, KDE, and Xfce) differ mainly in the user interface they feature, while the fifth is a completely different distribution that is based on Debian instead of Ubuntu. Due to its popularity, Cinnamon is the closest thing to a default in Mint and, as such, it is a recommended starting point.