Tech Guides

5 pen testing rules of engagement: What to consider while performing Penetration testing

Fatema Patrawala
14 May 2018
7 min read
Penetration testing and ethical hacking are proactive ways of testing web applications by performing attacks that are similar to a real attack that could occur on any given day. They are executed in a controlled way with the objective of finding as many security flaws as possible and of providing feedback on how to mitigate the risks posed by such flaws. Security-conscious corporations have integrated penetration testing, vulnerability assessments, and source code reviews into their software development cycle. Thus, when they release a new application, it has already been through various stages of testing and remediation. When planning to execute a penetration testing project, be it for a client as a professional penetration tester or as part of a company's internal security team, there are aspects that always need to be considered before starting the engagement.

This article is an excerpt from the book Web Penetration Testing with Kali Linux - Third Edition, written by Gilberto Najera-Gutierrez and Juned Ahmed Ansari.

Rules of Engagement for pen testing

The Rules of Engagement (RoE) is a document that deals with the manner in which the penetration test is to be conducted. Some of the directives that should be clearly spelled out in the RoE before you start the penetration test are as follows:

- The type and scope of testing
- Client contact details
- Client IT team notifications
- Sensitive data handling
- Status meetings and reports

Type and scope of penetration testing

The type of testing can be black box, white box, or an intermediate gray box, depending on how the engagement is performed and the amount of information shared with the testing team. There are things that can and cannot be done in each type of testing. With black box testing, the testing team works from the view of an attacker who is external to the organization, as the penetration tester starts from scratch and tries to identify the network map, the defense mechanisms implemented, the internet-facing websites and services, and so on. Even though this approach may be more realistic in simulating an external attacker, you need to consider that such information may be easily gathered from public sources, or that the attacker may be a disgruntled employee or ex-employee who already possesses it. Thus, it may be a waste of time and money to take a black box approach if, for example, the target is an internal application meant to be used by employees only.

White box testing is where the testing team is provided with all of the available information about the targets, sometimes even including the source code of the applications, so that little or no time is spent on reconnaissance and scanning. A gray box test, then, would be when partial information, such as URLs of applications, user-level documentation, and/or user accounts, is provided to the testing team. Gray box testing is especially useful when testing web applications, as the main objective is to find vulnerabilities within the application itself, not in the hosting server or network. Penetration testers can work with user accounts to adopt the point of view of a malicious user or an attacker who gained access through social engineering.
Note: When deciding on the scope of testing, the client, along with the testing team, needs to evaluate what information is valuable and necessary to be protected, and based on that, determine which applications/networks need to be tested and with what degree of access to the information.

Client contact details

We can agree that even when we take all of the necessary precautions when conducting tests, at times the testing can go wrong, because it involves making computers do nasty stuff. Having the right contact information on the client side really helps. A penetration test can easily turn into an unintended Denial-of-Service (DoS) attack. The technical team on the client side should be available 24/7 in case a computer goes down and a hard reset is needed to bring it back online.

Note: Penetration testing web applications has the advantage that it can be done in an environment that has been specially built for that purpose, allowing the testers to reduce the risk of negatively affecting the client's productive assets.

Client IT team notifications

Penetration tests are also used as a means to check the readiness of the support staff in responding to incidents and intrusion attempts. You should discuss with the client whether it is an announced or unannounced test. If it's an announced test, make sure that you inform the client of the time and date, as well as the source IP addresses from where the testing (attack) will be done, in order to avoid any real intrusion attempts being missed by their IT security team. If it's an unannounced test, discuss with the client what will happen if the test is blocked by an automated system or network administrator. Does the test end there, or do you continue testing? It all depends on the aim of the test, whether it's conducted to test the security of the infrastructure or to check the response of the network security and incident handling team. Even if you are conducting an unannounced test, make sure that someone in the escalation matrix knows about the time and date of the test. Web application penetration tests are usually announced.

Sensitive data handling

During test preparation and execution, the testing team will be provided with, and may also find, sensitive information about the company, the system, and/or its users. Sensitive data handling needs special attention in the RoE, and proper storage and communication measures should be taken (for example, full disk encryption on the testers' computers, encrypting reports if they are sent by email, and so on; a short encryption sketch appears at the end of this excerpt). If your client is covered by regulations such as the Health Insurance Portability and Accountability Act (HIPAA), the Gramm-Leach-Bliley Act (GLBA), or the European data privacy laws, only authorized personnel should be able to view personal user data.

Status meetings and reports

Communication is key to a successful penetration test. Regular meetings should be scheduled between the testing team and the client organization, and routine status reports issued by the testing team. The testing team should present how far they have reached and what vulnerabilities have been found up to that point. The client organization should also confirm whether their detection systems have triggered any alerts resulting from the penetration attempt. If a web server is being tested and a WAF was deployed, it should have logged and blocked the attack attempts.
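One lightweight habit that supports both the status reporting just described and the log correlation discussed next is keeping a timestamped record of every test action. The sketch below is a minimal illustration only; the file name, fields, and example values are assumptions, not something prescribed by the excerpt.

```python
# Minimal tester-side activity log (file name and fields are assumptions).
# Timestamps are recorded in UTC so the client's security team can correlate
# entries with WAF/IDS logs later.
import csv
from datetime import datetime, timezone

LOG_FILE = "pentest_activity_log.csv"  # hypothetical file name

def log_activity(tester, source_ip, target, action):
    """Append one timestamped entry describing a test action."""
    with open(LOG_FILE, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([
            datetime.now(timezone.utc).isoformat(),
            tester,
            source_ip,
            target,
            action,
        ])

# Example usage during an engagement (all values are placeholders)
log_activity("tester01", "203.0.113.10", "app.example.com", "SQL injection probe on /login")
```

Recording times in UTC avoids timezone confusion when the entries are lined up against the client's server and WAF logs.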
As a best practice, the testing team should also document the time when each test was conducted. This will help the security team correlate their logs with the penetration tests.

Note: WAFs work by analyzing the HTTP/HTTPS traffic between clients and servers, and they are capable of detecting and blocking the most common attacks on web applications.

To build a defense against web attacks with Kali Linux and understand the concepts of hacking and penetration testing, check out the book Web Penetration Testing with Kali Linux - Third Edition.

Related reading:
- Top 5 penetration testing tools for ethical hackers
- Essential skills required for penetration testing
- Approaching a Penetration Test Using Metasploit
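Picking up the sensitive-data-handling directive above: findings and evidence should never leave the testers' machines unencrypted. The following is a minimal sketch using the third-party cryptography package's Fernet recipe; the package choice, file names, and key handling are illustrative assumptions, not requirements from the book.

```python
# Hypothetical sketch: encrypt a findings report before it is emailed,
# using the third-party cryptography package (pip install cryptography).
# File names are placeholders; key management is out of scope here.
from cryptography.fernet import Fernet

def encrypt_report(path_in: str, path_out: str, key: bytes) -> None:
    """Encrypt a report file with a symmetric Fernet key."""
    with open(path_in, "rb") as f:
        plaintext = f.read()
    token = Fernet(key).encrypt(plaintext)
    with open(path_out, "wb") as f:
        f.write(token)

key = Fernet.generate_key()  # exchange with the client out of band, never alongside the report
encrypt_report("findings_report.pdf", "findings_report.pdf.enc", key)
```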

The 10 most common types of DoS attacks you need to know

Savia Lobo
05 Jun 2018
11 min read
Some businesses are highly dependent on services hosted online, and it's important that their servers are up and running smoothly during business hours. Stock markets and casinos are examples of such institutions: they deal with huge sums of money and expect their servers to work properly during their core business hours. Hackers may extort money by threatening to take down or block these servers during those hours. A denial of service (DoS) attack is the most common methodology used to carry out such threats. In this post, we will get to know DoS attacks and their various types.

This article is an excerpt taken from the book 'Preventing Ransomware', written by Abhijit Mohanta, Mounir Hahad, and Kumaraguru Velmurugan. In this book, you will learn how to respond quickly to ransomware attacks to protect yourself.

What are DoS attacks?

DoS is one of the oldest forms of cyber extortion attack. As the term indicates, a denial of service attack denies service to legitimate users. If a railway website is brought down, it fails to serve the people who want to book tickets. Let's take a peek into some of the details. A DoS attack can happen in two ways:

- Specially crafted data: If specially crafted data is sent to the victim and the victim is not set up to handle it, there is a chance that the victim may crash. This does not involve sending too much data; it involves sending specially crafted data packets that the victim fails to handle. This can involve manipulating fields in the network protocol packets, exploiting servers, and so on. Ping of death and teardrop attacks are examples of such attacks.
- Flooding: Sending too much data to the victim can also slow it down, so it spends its resources consuming the attacker's data and fails to serve legitimate requests. UDP flooding and SYN flooding are examples of such attacks.

Attacks can also use a combination of both. There is another form of DoS attack called a DDoS attack: a DoS attack uses a single computer to carry out the attack, while a distributed denial of service (DDoS) attack uses a series of computers. Sometimes the target server is flooded with so much data that it can't handle it; another way is to exploit the workings of internal protocols. A DDoS attack that deals with extortion is often termed a ransom DDoS. We will now talk about the various types of DoS attacks that might occur.

Teardrop attacks or IP fragmentation attacks

In this type of attack, the hacker sends a specially crafted packet to the victim. To understand this, one must have knowledge of the TCP/IP protocol. In order to transmit data across networks, IP packets are broken down into smaller packets. This is called fragmentation. When the packets finally reach their destination, they are reassembled to get the original data. In the process of fragmentation, some fields are added to the fragmented packets so that they can be tracked at the destination while reassembling. In a teardrop attack, the attacker crafts packets that overlap with each other. Consequently, the operating system at the destination gets confused about how to reassemble the packets, and hence it crashes.

User Datagram Protocol flooding

User Datagram Protocol (UDP) is an unreliable protocol: the sender of the data does not care whether the receiver has received it.
In UDP flooding, many UDP packets are sent to the victim on random ports. When the victim gets a packet on a port, it looks for an application that is listening on that port. When it does not find one, it replies with an Internet Control Message Protocol (ICMP) packet; ICMP packets are used to send error messages. When a lot of UDP packets are received, the victim consumes a lot of resources replying with ICMP packets. This can prevent the victim from responding to legitimate requests.

SYN flood

TCP is a reliable protocol: it makes sure that the data sent by the sender is completely received by the receiver. To start a communication between the sender and receiver, TCP follows a three-way handshake. SYN denotes the synchronization packet and ACK stands for acknowledgment. The sender starts by sending a SYN packet and the receiver replies with SYN-ACK; the sender then sends back an ACK packet followed by the data.

In SYN flooding, the sender is the attacker and the receiver is the victim. The attacker sends a SYN packet and the server responds with SYN-ACK, but the attacker never replies with the final ACK. The server waits for that ACK for some time before timing out. The attacker sends a lot of SYN packets, and the server exhausts its resources waiting for ACKs that never arrive. This kind of attack is called SYN flooding (a small detection sketch appears at the end of this excerpt).

Ping of death

While transmitting data over the internet, the data is broken into smaller chunks of packets, and the receiving end reassembles these broken packets in order to derive a conclusive meaning. In a ping of death attack, the attacker sends a packet larger than 65,535 bytes, the maximum size of a packet allowed by the IP protocol. The packets are split and sent across the internet, but when they are reassembled at the receiving end, the operating system is clueless about how to handle the oversized packet, so it crashes.

Exploits

Exploits against servers can also cause a DoS condition. A lot of web applications are hosted on web servers, such as Apache and Tomcat. If there is a vulnerability in these web servers, the attacker can launch an exploit against it. The exploit need not necessarily take control of the server; it can simply crash the web server software, causing a DoS. There are easy ways for hackers to find out the web server and its version if the server has default configurations. The attacker finds out the possible vulnerabilities and exploits for that web server, and if the server is not patched, the attacker can bring it down by sending an exploit.

Botnets

Botnets can be used to carry out DDoS attacks. A botnet herd is a collection of compromised computers. The compromised computers, called bots, act on commands from a C&C server. These bots, on the commands of the C&C server, can send a huge amount of data to the victim server, and as a result the victim server is overloaded.

Reflective DDoS attacks and amplification attacks

In this kind of attack, the attacker uses a legitimate computer to launch an attack against the victim while hiding its own IP address. The usual way is that the attacker sends a small packet to a legitimate machine after forging the sender of the packet to look as if it has been sent from the victim. The legitimate machine will, in turn, send the response to the victim. If the response data is large, the impact is amplified.
We can call the legitimate computers reflectors, and this kind of attack, where the attacker sends small data and the victim receives a larger amount of data, is called an amplification attack. Since the attacker does not directly use computers controlled by him and instead uses legitimate computers, it's called a reflective DDoS attack. The reflectors are not compromised machines, unlike botnets; they are simply machines that respond to a particular kind of request, such as a DNS request or a Network Time Protocol (NTP) request.

DNS amplification attacks, WordPress pingback attacks, and NTP attacks are amplification attacks. In a DNS amplification attack, the attacker sends a forged packet to the DNS server containing the IP address of the victim, and the DNS server replies to the victim instead, with larger data. Other kinds of amplification attacks involve SMTP, SSDP, and so on. We will look at an example of such an attack in the next section.

There are several groups of cyber criminals responsible for carrying out ransom DDoS attacks, such as DD4BC, Armada Collective, Fancy Bear, XMR-Squad, and Lizard Squad. These groups target enterprises. They will first send out an extortion email, followed by an attack if the victim does not pay the ransom.

DD4BC

The DD4BC group was seen operating in 2014. It charged Bitcoins as the extortion fee and mainly targeted media, entertainment, and financial services. The group would send a threatening email stating that a low-intensity DoS attack would be carried out first, claim that they would protect the organization against larger attacks, and threaten to publish information about the attack on social media to bring down the reputation of the company.

DD4BC is known to exploit the WordPress pingback vulnerability. We don't want to go into too much detail about this bug. Pingback is a feature provided by WordPress through which the original author of a WordPress site or blog gets notified when their site has been linked or referenced. We can call the site that refers to the original site the referrer, and the original site the original. If the referrer links to the original, it sends a pingback request to the original containing its own URL; this is a notification from the referrer informing the original site of the link. As per the protocol designed by WordPress, the original site then downloads the referrer site as a response to the pingback request, and this action is termed a reflection. The WordPress sites used in the attack are called reflectors. An attacker can misuse this by creating a forged pingback request containing the URL of a victim site and sending it to many WordPress sites. Put simply, the attack notifies the WordPress sites that the victim has referenced them, so all the WordPress sites try to connect to the victim, which overloads it. If the victim's web page is large and the WordPress sites try to download it, then it chokes the bandwidth, and this is the amplification.

Armada Collective

The Armada Collective group was first seen in 2015. They attacked various financial services and web hosting sites in Russia, Switzerland, Greece, and Thailand.
They re-emerged in Central Europe in October 2017 and would carry out a demo DDoS attack to threaten the victim. This group is known to carry out reflective DDoS attacks through NTP. NTP is a protocol used to synchronize computer clocks over a network, and it provides support for a monlist command for administrative purposes. When an administrator sends the monlist command to an NTP server, the server responds with a list of 600 hosts that are connected to that NTP server. The attacker can exploit this by creating a forged NTP packet that contains a monlist command with the IP address of the victim, and then sending multiple copies to the NTP server. The NTP server thinks the monlist request has come from the victim's address and sends it the response, which contains a list of 600 computers connected to the server. Thus the victim receives far more data than the attacker sent, and it can crash.

Fancy Bear

Fancy Bear is one of the hacker groups we have known about since 2010. Fancy Bear threatened to use the Mirai botnet in its attacks. Mirai was known to target Linux operating systems used in IoT devices, and mostly infected CCTV cameras.

We have talked about a few groups that are infamous for carrying out DoS extortion and some of the techniques they use, and we explored the different types of DoS attacks and how they occur. If you've enjoyed this excerpt, check out 'Preventing Ransomware' to learn in detail about the latest ransomware attacks involving WannaCry, Petya, and BadRabbit.

Related reading:
- Anatomy of a Crypto Ransomware
- Barracuda announces Cloud-Delivered Web Application Firewall service
- Top 5 penetration testing tools for ethical hackers
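As referenced in the SYN flood section above, one coarse way to spot a flood of half-open connections on a Linux server is to watch how many sockets are stuck in the SYN_RECV state. The sketch below reads /proc/net/tcp directly; the threshold is an arbitrary placeholder, and this is a monitoring aid, not a defense by itself.

```python
# Defensive sketch (Linux-only): count half-open connections per local port by
# reading /proc/net/tcp. TCP state 0x03 is SYN_RECV, meaning the server sent
# SYN-ACK and is still waiting for the final ACK. A sudden spike in this count
# is one rough indicator of a possible SYN flood.
from collections import Counter

SYN_RECV = "03"
THRESHOLD = 200  # placeholder; tune for your environment

def count_half_open():
    counts = Counter()
    with open("/proc/net/tcp") as f:
        next(f)  # skip the header line
        for line in f:
            fields = line.split()
            local, state = fields[1], fields[3]
            if state == SYN_RECV:
                port = int(local.split(":")[1], 16)  # local port is hex-encoded
                counts[port] += 1
    return counts

for port, n in count_half_open().items():
    if n > THRESHOLD:
        print(f"Possible SYN flood: {n} half-open connections on port {port}")
```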

5 engines to build games without coding

Sam Wood
17 Feb 2016
4 min read
Let's start with a disclaimer: if you want to make a video game the best it can be, you're going to need to learn how to code. But you can build games without coding. So, if the prospect of grinding C++ in order to make the next Minecraft doesn't quite appeal to you, here are 5 accessible tools to help you get into game development without writing a single line of code.

GameMaker - drag and drop game development

GameMaker is one of the premier engines that offers users the chance to make complete mobile games using just a drag-and-drop interface. Specifically designed so that novice computer programmers can make computer games without much programming knowledge, it's an excellent choice for anyone looking to make a cross-platform game app without reams of code. In addition, GameMaker boasts its own language for when you want to add extra custom features and refine your game experience.

Unreal Engine - AAA game development without writing code

Unreal Engine is a AAA engine, used to make some of the biggest names out there. If you're just getting into games development and are unsure about coding, you might be surprised to see it on this list. What UE4 offers to beginners and non-coders, though, is the power of its Blueprints visual scripting. With Blueprints, you can create (reasonably) complex games all without typing a single line of C++. Based on a node-based interface, Blueprints gives non-programmers access to gameplay elements including camera control, player input, items and triggers, and more.

Unity - the definitive game engine and a good place to begin

Unity is the tool of pro game developers; in our 2015 Skill Up Survey, it was revealed as the tech that was most important for earning the top salaries in the industry. Unity has no built-in visual scripting like Unreal - but what it does have is a massive community and a huge supply of code snippets and assets available for almost every requirement. You can do a lot in its editor just by dragging scripts onto in-game objects. Whilst (like Unreal) you'll want to pick up some coding skills when you start to make your games more complex, you can get quite a way standing on the shoulders of your fellow developers. If that's not working for you, though, why not check out PlayMaker? This visual scripting plugin for Unity offers a whole new array of options for when you want something a little more custom.

GameSalad - an amazing behavior library

Much like GameMaker, GameSalad is an intuitive drag-and-drop game creator. What makes GameSalad stand out from the crowd, though, is its amazing behavior library. This library lets developers implement really complex behaviors, of a kind that would be challenging or even impossible for someone to muddle through without a knowledge of coding. There are thousands of successful games on Google Play and the App Store built with GameSalad - why not add yours to their number?

Lumberyard

Okay, so I lied a bit in the title of this blog - currently, there's nothing to suggest that Amazon's new game engine will be particularly friendly to non-coders. So what is Lumberyard? Why is it on here? Lumberyard is Amazon's new game engine, derived from CryENGINE. It's built to get people deploying their games to Amazon Web Services (AWS) but is otherwise free to use. What's interesting about Lumberyard is its visual scripting tool, supposedly made for designers and engineers with little to no backend experience, to add cloud-connected features to a game.
These features can include a "community news feed, daily gifts, or server-side combat resolution" - added within minutes through drag-and-drop visual scripting. Lumberyard is still super new, so we'll have to wait and see if it delivers on its promises - but we may well find ourselves with a serious contender to the likes of Unity and Unreal.

Check out other related posts:
- Construct Game Development: Platformer Revisited, a 2D Shooter
- C++, SFML, Visual Studio, and Starting the first game

What are the best programming languages for building APIs?

Antonio Cucciniello
11 Jun 2017
4 min read
Are you in the process of designing your first web application? Maybe you have built some in the past but are looking for a change in language to increase your skill set, or to try out something new. If you fit into those categories, you are in the right place. With all of the information out there, it can be hard to decide which programming language to select for your next product or project. Although any programming language can ultimately be used to write APIs, some are better and more efficient to use than others. Today we will discuss what should be taken into consideration when choosing the programming language to build out APIs for your web app.

Comfort is essential when it comes to programming languages

This goes out to any developer who has experience in a certain language. If you already have experience in a language, you will ultimately have an easier time developing and understanding the concepts involved, and you will be able to make more progress right out of the gate. This translates to improved code and performance as well, because you can spend more time on that rather than learning a brand new programming language. For example, if I have been developing in Python for a few years and have the option between using PHP or Python for the project, I simply select Python due to the time already spent learning it. This is extremely important because when trying to do something new, you want to limit the number of unknowns in the project. That will help your learning and help you achieve better results. If you are a brand new developer with zero programming experience, the following section might help you narrow your options.

Libraries and frameworks that support developing APIs

The next question to ask in the process of eliminating potential programming languages for your API is: does the language come with plenty of options for libraries or frameworks that aid in developing APIs? To continue with the Python example from the previous section, there is the Django REST framework, which is built on top of Django, a web development framework for Python, and is made to create APIs faster and more easily. Did you hear faster and easier? Why yes you did, and that is why this is important. These libraries and frameworks allow you to speed up the development process because they contain functions and objects that handle plenty of the repetitive or dirty work of building an API. Once you have spent some time researching what is available to you in terms of libraries and frameworks, it is time to check out how active the communities are.

Support and community

The next question to ask yourself is: are the frameworks and libraries for this programming language still being supported? If so, how active is the community of developers? Do they have continuous or regular updates to their software and capabilities? Do the updates help improve security and usability? If not many people use the language and it is not being updated with bug fixes, you may not want to continue using it. Another thing to pay attention to is the community of users. Are there plenty of resources for you to learn from? How clear and available is the documentation? Are there experienced developers who have blog posts on the necessary topics? Are there questions being asked and answered on Stack Overflow?
Are there any hard resources, such as magazines or textbooks, that show you how to use these languages and frameworks?

Potential languages for building APIs

From my experience, a handful of languages stand out. Here is an example framework for each of them, which you can use to start developing your next API:

- Java - Spring
- JavaScript (Node) - Express
- Python - Django
- PHP - Laravel
- Ruby - Ruby on Rails

Ultimately, the programming language you select depends on several factors: your experience with the language, the frameworks available for API building, and how active both the support and the community are. Do not be afraid to try something new! You can always learn, but if you are concerned about speed and ease of development, use these criteria to help select the language. Leave a comment down below and let us know which programming language is your favorite and how you will use it in your future applications!
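To make the Django REST framework example above concrete, here is a minimal sketch of what a small API looks like with it. The Book model, field names, and route are illustrative assumptions, and the snippet presumes an already configured Django project with rest_framework installed.

```python
# A minimal sketch of an API built with Django REST framework.
# Place the model in an app's models.py and the rest in a views/urls module.
from django.db import models
from rest_framework import serializers, viewsets, routers

class Book(models.Model):
    title = models.CharField(max_length=200)
    author = models.CharField(max_length=100)

class BookSerializer(serializers.ModelSerializer):
    class Meta:
        model = Book
        fields = ["id", "title", "author"]

class BookViewSet(viewsets.ModelViewSet):
    """Provides list/retrieve/create/update/delete endpoints for Book."""
    queryset = Book.objects.all()
    serializer_class = BookSerializer

router = routers.DefaultRouter()
router.register(r"books", BookViewSet)
# In urls.py: urlpatterns = [path("api/", include(router.urls))]
```

The framework generates the repetitive parts (serialization, routing, pagination hooks), which is exactly the "dirty work" the article argues a good framework should take off your hands.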

Rust as a Game Programming Language: Is it any good?

Amarabha Banerjee
22 Sep 2018
4 min read
We have moved light years away from the handheld gaming days. The good old Tetris and Mario games were easy to use and low on graphics, yet super difficult to program in spite of their apparently simple appearance. Although it's difficult to trace back the language in which all of these games were written, many of them were written in the C family of languages, which contributed to the difficulty in programming them. Rust has been touted as one of the successors of C, which in turn brings the question back: if C was difficult to code in, then how exactly is Rust going to be different?

The answer to this question lies in the approach of Rust. Rust was designed primarily as a systems programming language by the Mozilla Foundation. The primary game development languages over the past 20 years have mostly been C and C++. Rust brings a fresh change in approach: from object-oriented to data-oriented. The problem with object-oriented programming was summarized nicely by Catherine West from Chucklefish. According to her, treating game elements like NPCs and game worlds as objects might work well at a small scale, but when you are trying to create your own game engine, treating game elements as objects implies creating a lot of super-sized objects with complex layers of dependencies. The Rust approach, on the other hand, is data-oriented: every element is treated as data, which simplifies the process of creating mid-sized game engines a lot (a small sketch of the difference appears at the end of this piece). Chucklefish being a significant name in 2D game development, this statement from Catherine West comes as a major boost for developers who want to use Rust for developing 2D games. She has, however, expressed doubts about using Rust for 3D game development.

Another important personality who has recently come out in support of Rust is Andrea Pessino, CTO of Ready at Dawn. Ready at Dawn is a well-established game studio known for games such as The Order: 1886, Daxter, and various God of War titles. His public endorsement is another feather in Rust's cap for game development.

The present state of game development in Rust is quite encouraging. There are quite a few low-level graphics libraries like GFX. GFX is a low-level abstraction layer over platform-specific graphics interfaces (OpenGL, Metal, Vulkan). It offers handy wrappers over windowing backends (glutin, the Rust one, wrappers around the Vulkan system, GLFW, and more). GFX is still at a very early stage of development, with the present version being 0.17.

Although major game engines like Unity and Unreal are yet to support Rust for game development, there exist a few complete game engines that allow you to create complete games with Rust using their framework. The first one is Piston. It is the oldest game engine for Rust, and also the most stable and the one with great documentation. However, many people find Piston confusing and hard to use, as it is super-modular by design. Sometimes it is even hard to understand which module to load to achieve a certain goal or build a certain component of a game.

Amethyst is a more recent game engine/framework inspired by commercial monolithic game engines. It comes with all the necessary dependencies in its package. However, it is evolving quickly, and hence the present documentation is already outdated. There is, though, a vibrant community which is looking to bring more and more developers into its fold. Hence this gives new developers an opportunity to get into game development with Rust and get involved with a game engine as well.
GGEZ is a simple 2D game engine inspired by the LÖVE engine. This library is better suited to creating simple 2D games for hobbyists. GGEZ is also very new and changes quickly, but its design simplicity is an incentive for indie developers and hobbyists to start creating games with it.

Some other popular libraries include:
- noise-rs / a noise generator
- rlua / high-level bindings between Rust and Lua
- sfxr / reimplementation of DrPetter's "sfxr" sound effect generator as a Rust library

The conclusion that we can draw from here is that Rust has a lot of promise when it comes to game development. With the data-oriented approach, easy memory management, and access to low-level performance enhancement techniques, Rust can become a full-fledged game development language in the near future.

Related reading:
- Best game engines for Artificial Intelligence game development
- Implementing Unity game engine and assets for 2D game development
- How to use arrays, lists, and dictionaries in Unity for 3D game development
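The object-oriented versus data-oriented contrast referenced earlier is easier to see in code. The sketch below uses Python purely for brevity; it is a language-agnostic illustration, not Rust or any engine's API, and the entity layout is invented for the example (in Rust the second half maps to plain structs and Vecs).

```python
# Object-oriented: each entity is a heavyweight object carrying its own state
# and behaviour, which tends to grow tangled dependencies in a real engine.
class Npc:
    def __init__(self, x, y, health):
        self.x, self.y, self.health = x, y, health

    def update(self, dt):
        self.x += 1.0 * dt  # every object drags its own behaviour around

# Data-oriented: entities are just indices into flat arrays of components,
# which is friendlier to the CPU cache and easy to process in bulk.
positions_x = [0.0, 5.0, 9.0]
positions_y = [0.0, 2.0, 1.0]
healths     = [100, 80, 60]

def update_positions(dt):
    for i in range(len(positions_x)):
        positions_x[i] += 1.0 * dt

update_positions(1 / 60)
```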

Essential skills for penetration testing

Hari Vignesh
11 Jun 2017
6 min read
Cybercriminals are continually developing new and more sophisticated ways to exploit software vulnerabilities, making it increasingly difficult to defend our systems. Today, then, we need to be proactive in how we protect our digital properties. That's why penetration testers are so in demand. Although risk analysis can easily be done by internal security teams, support from skilled penetration testers can be the difference between security and vulnerability. These highly trained professionals can "think like the enemy" and employ creative ways to identify problems before they occur, going beyond the use of automated tools. Pen testers can perform technological offensives, but also simulate spear phishing campaigns to identify weak links in the security posture of a company and pinpoint training needs. The human element is essential to simulate a realistic attack and uncover all of the infrastructure's critical weaknesses.

Being a pen tester can be financially rewarding, because trained and skilled testers can normally secure good wages. Employers are willing to pay top dollar to attract and retain talent. Most pen testers enjoy sizable salaries depending on where they live and their level of experience and training. According to a PayScale salary survey, the average salary is approximately $78K annually, ranging from $44K to $124K on the higher end.

To be a better pen tester, you need to upgrade or master your craft in certain areas. The following skills will make you stand out in the crowd and make you a better and more effective pen tester. I know what you're thinking: this seems like an awful lot of work to learn penetration testing, right? Wrong. You can still learn how to penetration test and become a penetration tester without these things, but learning all of them will make it easier and help you understand both how and why things are done a certain way. Bad pen testers know that things are vulnerable. Good pen testers know how things are vulnerable. Great pen testers know why things are vulnerable.

Mastering the command line

Notice that even in modern hacker films and series, the hackers always have a little black box on the screen with text going everywhere. It's a cliché, but it's based in reality. Hackers and penetration testers alike use the command line a lot, and most of the tools are command-line based. It's not showing off; it's just the most efficient way to do our jobs. If you want to become a penetration tester, you need to be, at the very least, comfortable with a DOS or PowerShell prompt or a terminal. The best way to develop this sort of skill set is to learn how to write DOS Batch or PowerShell scripts. There are various command-line tools that make the life of a pen tester easy, so learning to use and master those tools will enable you to pen test your environment efficiently.

Mastering OS concepts

If you look at penetration testing or hacking sites and tutorials, there's a strong tendency to use Linux. If you start with something like Ubuntu, Mint, Fedora, or Kali as a main OS and try to spend some time tinkering under the hood, it'll help you become more familiar with the environment. Setting up a VM to install and break into a Linux server is a great way to learn.
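In the spirit of that tinkering advice, here is a tiny, hedged exercise: walk a directory tree and flag world-writable files, the kind of permission weakness discussed next. The starting path is just a placeholder; run it only on systems you are allowed to inspect.

```python
# Small tinkering exercise: find world-writable files under a directory.
# The root path is a placeholder for whatever lab VM you are exploring.
import os
import stat

def world_writable(root="/tmp"):
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.stat(path).st_mode
            except OSError:
                continue  # broken symlink, permission denied, and so on
            if mode & stat.S_IWOTH:  # the "other write" bit
                print(f"{oct(mode & 0o777)}  {path}")

world_writable()
```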
You wouldn't expect to be able to comfortably find and exploit file permission weaknesses if you don't understand how Linux file permissions work, nor should you expect to exploit the latest vulnerabilities comfortably and effectively without understanding how they affect a system. A basic understanding of Unix file permissions, processes, shell scripting, and sockets will go a long way.

Mastering networking and protocols to the packet level

TCP/IP seems really scary at first, but the basics can be learned in a day or two. While learning, you can use a packet-sniffing tool called Wireshark to see what's really going on when you send traffic to a target, instead of blindly accepting documented behavior without understanding what's happening. You'll also need to know not only how HTTP works over the wire, but also the Document Object Model (DOM) and enough about how backends work to understand how web-based vulnerabilities occur. You can become a penetration tester without learning a huge volume of things, but you'll struggle and it'll be a much less rewarding career.

Mastering programming

If you can't program, then you're at risk of losing out to candidates who can. At best, you're possibly going to lose money from that starting salary. Why? You need sufficient knowledge of a programming language to understand its source code and find vulnerabilities in it. For instance, only if you know PHP and how it interacts with a database will you be able to exploit SQL injection. Your prospective employer is going to need to give you time to learn these things if they're going to get the most out of you, so don't steal money from your own career: learn to program. It's not hard. Being able to program means you can write tools, automate activities, and be far more efficient. Aside from basic scripting, you should ideally become at least semi-comfortable with one programming language and cover the basics in another. Web people like Ruby. Python is popular amongst reverse engineers. Perl is particularly popular amongst hardcore Unix users. You don't need to be a great programmer, but being able to program is worth its weight in gold, and most languages have online tutorials to get you started.

Final thoughts

Employers will hire a bad junior tester if they have to, and a good junior tester if there's no one better, but they'll usually hire a potentially great junior pen tester in a heartbeat. If you don't spend time learning the basics to make yourself a great pen tester, you're stealing from your own potential salary. If you're missing some or all of the things above, don't be upset. You can still work towards getting a job in penetration testing, and you don't need to be an expert in any of these things. They're simply technical qualities that make you a much better (and probably better paid) candidate from a hiring manager's and supporting interviewer's perspective.

About the author

Hari Vignesh Jayapalan is a Google Certified Android app developer, IDF Certified UI & UX Professional, street magician, fitness freak, technology enthusiast, and wannabe entrepreneur. He can be found on Twitter @HariofSpades.
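Returning to the packet-level point above: it helps to have seen what an HTTP request actually looks like on the wire. The sketch below speaks HTTP/1.1 over a plain socket; the host is a placeholder, and you should only point it at servers you own or are authorized to test.

```python
# What "HTTP over the wire" really is: a few CRLF-separated lines of text.
import socket

host = "example.com"  # placeholder; use a host you are authorized to probe
request = (
    "GET / HTTP/1.1\r\n"
    f"Host: {host}\r\n"
    "Connection: close\r\n"
    "\r\n"
)

with socket.create_connection((host, 80), timeout=5) as sock:
    sock.sendall(request.encode("ascii"))
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

# The status line and headers are plain text, separated from the body by a blank line
headers, _, body = response.partition(b"\r\n\r\n")
print(headers.decode("iso-8859-1"))
print(f"Body length: {len(body)} bytes")
```

Comparing this raw exchange with the same request captured in Wireshark is a quick way to connect the documented behavior of HTTP to what actually crosses the network.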
Unity and Unreal comparison

Raka Mahesa
26 Jan 2018
5 min read
If you want to find out how to get into game development, you've probably come across the two key game engines in the industry: Unreal Engine and Unity. But how do Unity and Unreal Engine compare? What are the differences between them? Is one better than the other? Explore the newest and most popular Unity eBooks and video courses. Discover Unreal eBooks and video courses here.

Unity and Unreal price comparison

Unreal Engine has a simple pricing scheme: you get everything for free, but you have to pay 5 percent of your earnings. Unity also has a free tier that includes the core features of the engine, but if your company has an annual revenue of more than $100,000, you have to use the paid tier, which will cost you $35 per month. The paid tier also gives you additional features, including a custom splash screen, an enhanced analytics feature, and expanded multiplayer hosting. The question here is which pricing scheme fits your business model (and budget). If you have a small, nimble team, Unity might be the better option, but if you have a big team developing a complex game, Unreal Engine might be more cost effective. The good thing is that, without spending a dime, you can get the full capability of both tools, so you can't really go wrong starting with either of them.

How do Unity's and Unreal's capabilities compare?

We'll start with a simple but important question: what platforms do Unreal Engine and Unity support? Unreal Engine supports developing games for mobile platforms like iOS and Android, for consoles like PS4, Xbox One, and Nintendo Switch, and for desktop operating systems like Windows, Mac, and Linux. It also has support for VR platforms such as Oculus, SteamVR, PSVR, Google Daydream, and Samsung Gear VR. Unity, on the other hand, not only supports all of those platforms, it also supports smart TV platforms like Android TV and Samsung Smart TV, as well as augmented reality platforms like Apple ARKit and Google ARCore. And Unity doesn't simply support more platforms than Unreal; it is also usually the first game engine to provide compatibility when a new platform is launched. Unity is the clear winner when it comes to compatibility, and if you're looking to release your game on as many platforms as possible, then Unity is your best choice.

Comparing Unity and Unreal's feature sets

Even though both engines have similar capabilities, Unreal Engine provides more built-in tools that make game development easier. Unreal has an extensive built-in material editor as well as a built-in cinematic editor that allows developers to easily create cinematic sequences in their games. Meanwhile, Unity relies on third-party add-ons from its Asset Store to provide similar functionality. That said, the 2D development tool provided by Unity is much more effective than Unreal's. Do keep in mind that features can't be judged by their numbers alone. One of the most important qualities of a tool is how easy it is to use. Ease of use is, of course, relatively subjective – what one person loves using might be a nightmare for another.

Is Unity or Unreal easier to use?

Based on the built-in tools provided by the engine, we can see that Unreal is the more powerful of the two options. But that also means Unity is simpler to use. The same comparison can be seen in their programming aspect. Unity uses C# as its main programming language, which is easier to use and learn.
Unreal, on the other hand, uses C++, which is much more powerful but is also harder to learn and more prone to mistakes. Fortunately, Unreal makes up for its complexity by providing an alternative, easy-to-use scripting language: Blueprint. Blueprint is a scripting language where developers simply connect nodes together to program gameplay elements. Using this tool, non-programmers like artists and writers are able to script gameplay events without relying on programmers.

Comparing the Unity and Unreal communities

The last point we're going to address is something not directly related to the engine itself, but it is nevertheless pretty important: the community. A big community makes it much easier to get help when you run into trouble; it also means more tool and resource development. Unity is the winner on this front, as can be seen from the huge number of tutorials and third-party libraries created for it.

It's important to remember one thing: both development tools are fully capable of producing great games with amazing graphics and good performance that can sell millions. One tool may need more work than the other to get the same result, but that result is perfectly achievable with both engines. So you don't need to worry that choosing one tool over the other will negatively affect your end product. So, have you made up your mind about which tool you're going to use?

Raka Mahesa is a game developer at Chocoarts who is interested in digital technology in general. Outside of work, he enjoys working on his own projects, with Corridoom VR being his latest released game. Raka also regularly tweets @legacy99.
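As a rough way to reason about the pricing section above, here is a toy comparison built only from the numbers quoted in this article: a $35/month paid tier for Unity (assumed here to be per seat) versus a flat 5 percent royalty for Unreal. Real licensing terms include revenue thresholds, exemptions, and tier changes that this sketch deliberately ignores; which model is cheaper depends on revenue, team size, and those details.

```python
# Toy cost comparison using only the figures quoted in the article.
# Assumptions: Unity's $35/month tier is per seat; Unreal charges a flat 5%
# royalty on gross revenue. Thresholds and exemptions are ignored on purpose.
def unity_annual_cost(seats: int) -> float:
    return 35.0 * 12 * seats

def unreal_annual_cost(gross_revenue: float) -> float:
    return 0.05 * gross_revenue

for revenue in (10_000, 100_000, 1_000_000):
    print(
        f"revenue ${revenue:>9,}: "
        f"Unity (3 seats) ${unity_annual_cost(3):>8,.0f} vs "
        f"Unreal royalty ${unreal_annual_cost(revenue):>8,.0f}"
    )
```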

Common big data design patterns

Sugandha Lahoti
08 Jul 2018
17 min read
Design patterns have provided many ways to simplify the development of software applications. Now that organizations are beginning to tackle applications that leverage new sources and types of big data, design patterns for big data are needed. These big data design patterns aim to reduce complexity, boost the performance of integration, and improve the results of working with new and larger forms of data. This article intends to introduce readers to the common big data design patterns based on various data layers, such as the data sources and ingestion layer, the data storage layer, and the data access layer.

This article is an excerpt from Architectural Patterns by Pethuru Raj, Anupama Raman, and Harihara Subramanian. In this book, you will learn the importance of architectural and design patterns in business-critical applications.

Data sources and ingestion layer

Enterprise big data systems face a variety of data sources with non-relevant information (noise) alongside relevant (signal) data. The noise-to-signal ratio is very high, so filtering the noise from the pertinent information, handling high volumes, and keeping up with the velocity of data are significant challenges. This is the responsibility of the ingestion layer. The common challenges in the ingestion layer are as follows:

- Multiple data source load and prioritization
- Ingested data indexing and tagging
- Data validation and cleansing
- Data transformation and compression

The book's accompanying diagram depicts the building blocks of the ingestion layer and its various components. We need patterns to address the challenges of data source to ingestion layer communication that take care of performance, scalability, and availability requirements. In this section, we will discuss the following ingestion and streaming patterns and how they help to address the challenges in ingestion layers. We will also touch upon some common workload patterns, including:

- Multisource extractor
- Multidestination
- Protocol converter
- Just-in-time (JIT) transformation
- Real-time streaming pattern

Multisource extractor

An approach to ingesting multiple data types from multiple data sources efficiently is termed a multisource extractor. Efficiency covers many factors, such as data velocity, data size, data frequency, and managing various data formats over an unreliable network, mixed network bandwidth, different technologies, and systems. The multisource extractor system ensures high availability and distribution. It also ensures that the vast volume of data gets segregated into multiple batches across different nodes. The single-node implementation is still helpful for lower volumes from a handful of clients, and, of course, for a significant amount of data from multiple clients processed in batches. Partitioning into small volumes in clusters produces excellent results. Data enrichers help to do initial data aggregation and data cleansing; enrichers ensure file transfer reliability, validation, noise reduction, compression, and transformation from native formats to standard formats. Collection agent nodes represent intermediary cluster systems, which help with final data processing and data loading to the destination systems.
The following are the benefits of the multisource extractor:

- Provides reasonable speed for storing and consuming the data
- Better data prioritization and processing
- Drives improved business decisions
- Decoupled and independent from data production to data consumption
- Data semantics and detection of changed data
- Scalable and fault-tolerant system

The following are the impacts of the multisource extractor:

- Difficult or impossible to achieve near real-time data processing
- Need to maintain multiple copies in enrichers and collection agents, leading to data redundancy and a mammoth data volume in each node
- High availability trade-off, with high costs to manage system capacity growth
- Infrastructure and configuration complexity increases to maintain batch processing

Multidestination pattern

In multisourcing, we saw the raw data ingested into HDFS, but in most common cases the enterprise needs to ingest raw data not only into new HDFS systems but also into its existing traditional data storage, such as Informatica or other analytics platforms. In such cases, the additional number of data streams leads to many challenges, such as storage overflow, data errors (also known as data regret), an increase in time to transfer and process data, and so on. The multidestination pattern is considered a better approach to overcome all of the challenges mentioned previously. This pattern is very similar to multisourcing until it is ready to integrate with multiple destinations. The router publishes the improved data and then broadcasts it to the subscriber destinations (already registered with a publishing agent on the router). Enrichers can act as publishers as well as subscribers. Deploying routers in the cluster environment is also recommended for high volumes and a large number of subscribers.

The following are the benefits of the multidestination pattern:

- Highly scalable, flexible, fast, resilient to data failure, and cost-effective
- The organization can start to ingest data into multiple data stores, including its existing RDBMS as well as NoSQL data stores
- Allows you to use simple query languages, such as Hive and Pig, along with traditional analytics
- Provides the ability to partition the data for flexible access and decentralized processing
- Possibility of decentralized computation in the data nodes
- Due to replication on HDFS nodes, there are no data regrets
- Self-reliant data nodes can add more nodes without any delay

The following are the impacts of the multidestination pattern:

- Needs complex or additional infrastructure to manage distributed nodes
- Needs to manage distributed data in secured networks to ensure data security
- Needs enforcement, governance, and stringent practices to manage the integrity and consistency of data

Protocol converter

This is a mediatory approach to provide an abstraction for the incoming data of various systems. The protocol converter pattern provides an efficient way to ingest a variety of unstructured data from multiple data sources and different protocols. The message exchanger handles synchronous and asynchronous messages from various protocols and handlers.
It performs various mediator functions, such as file handling, web services message handling, stream handling, serialization, and so on. In the protocol converter pattern, the ingestion layer holds responsibilities such as identifying the various channels of incoming events, determining incoming data structures, providing mediated service for multiple protocols into suitable sinks, providing one standard way of representing incoming messages, providing handlers to manage various request types, and providing abstraction from the incoming protocol layers.

Just-In-Time (JIT) transformation pattern

The JIT transformation pattern is the best fit in situations where raw data needs to be preloaded into the data stores before the transformation and processing can happen. In this kind of business case, the pattern runs independent preprocessing batch jobs that clean, validate, correlate, and transform the data, and then store the transformed information in the same data store (HDFS/NoSQL); that is, it can coexist with the raw data. Note that the data enricher of the multi-data-source pattern is absent in this pattern, and more than one batch job can run in parallel to transform the data as required in the big data storage, such as HDFS, MongoDB, and so on.

Real-time streaming pattern

Most modern businesses need continuous and real-time processing of unstructured data for their enterprise big data applications. Real-time streaming implementations need to have the following characteristics:

- Minimize latency by using large in-memory capacity
- Event processors are atomic and independent of each other, and so are easily scalable
- Provide an API for parsing the real-time information
- Independently deployable script for any node, with no centralized master node implementation

The real-time streaming pattern suggests introducing an optimum number of event processing nodes to consume different input data from the various data sources, and introducing listeners to process the events generated by those nodes in the event processing engine. Event processing engines (event processors) have a sizeable in-memory capacity, and the event processors get triggered by a specific event. The trigger or alert is responsible for publishing the results of the in-memory big data analytics to the enterprise business process engines, which in turn get redirected to various publishing channels (mobile, CIO dashboards, and so on).

Big data workload patterns

Workload patterns help to address data workload challenges associated with different domains and business cases efficiently. The big data design pattern manifests itself in the solution construct, and so the workload challenges can be mapped to the right architectural constructs and thus service the workload. Workload design patterns help to simplify and decompose the business use cases into workloads, which can then be methodically mapped to the various building blocks of the big data solution architecture.

Data storage layer

The data storage layer is responsible for acquiring all the data gathered from various data sources, and it is also liable for converting (if needed) the collected data to a format that can be analyzed. The following sections discuss data storage layer patterns in more detail.
ACID versus BASE versus CAP

Traditional RDBMSs follow atomicity, consistency, isolation, and durability (ACID) to provide reliability for any user of the database. However, searching high volumes of big data and retrieving data from those volumes consumes an enormous amount of time if the storage enforces ACID rules. So, big data follows basically available, soft state, eventually consistent (BASE), a philosophy for undertaking any search in big data space. Database theory suggests that a NoSQL big database may predominantly satisfy two properties and relax standards on the third, and those properties are consistency, availability, and partition tolerance (CAP). With the ACID, BASE, and CAP paradigms, the big data storage design patterns have gained momentum and purpose. We will look at those patterns in some detail in this section. The patterns are:

- Façade pattern
- NoSQL pattern
- Polyglot pattern

Façade pattern

This pattern provides a way to use existing or traditional data warehouses along with big data storage (such as Hadoop). It can act as a façade for the enterprise data warehouses and business intelligence tools. In the façade pattern, the data from the different data sources gets aggregated into HDFS before any transformation, or even before loading to the traditional existing data warehouses. The façade pattern allows structured data storage even after ingestion to HDFS, in the form of structured storage in an RDBMS, in NoSQL databases, or in a memory cache. The façade pattern ensures a reduced data size, as only the necessary data resides in the structured storage, as well as faster access from the storage.

NoSQL pattern

This pattern entails getting NoSQL alternatives in place of a traditional RDBMS to facilitate the rapid access and querying of big data. The NoSQL database stores data in a columnar, non-relational style. It can store data on local disks as well as in HDFS, as it is HDFS-aware. Thus, data can be distributed across data nodes and fetched very quickly. Let's look at four types of NoSQL databases in brief:

- Column-oriented DBMS: Simply called a columnar store or big table data store, it has a massive number of columns for each tuple. Each column has a column key. Column family qualifiers represent related columns so that the columns and the qualifiers are retrievable, as each column has a column key as well. These data stores are suitable for fast writes.
- Key-value pair database: A key-value database is a data store that, when presented with a simple string (key), returns an arbitrarily large piece of data (value). The key is bound to the value until it gets a new value assigned. The key-value data store does not need to have a query language; it provides a way to add and remove key-value pairs. A key-value store is a dictionary kind of data store, where it has a list of words and each word represents one or more definitions.
- Graph database: This is a representation of a system that contains a sequence of nodes and relationships that creates a graph when combined. A graph represents three data fields: nodes, relationships, and properties. Some types of graph store are referred to as triple stores because of their node-relationship-node structure. You may be familiar with applications that provide evaluations of similar or likely characteristics as part of a search (for example, "a user who bought this item also bought..." is a good illustration of graph store implementations).
- Document database: We can represent a document data store as a tree structure. Document trees have a single root element, or sometimes multiple root elements. Beneath the root element there is a sequence of branches, sub-branches, and values. Each branch can have an expression or relative path that determines the traversal path from the origin node (the root) to any given branch, sub-branch, or value, and each branch may have a value associated with it. Sometimes the existence of a branch has a specific meaning, and sometimes a branch must have a given value to be interpreted correctly.

The following list summarizes some of the NoSQL use cases, providers, tools, and scenarios that might call for the NoSQL pattern. Most of these implementations are already part of various vendor offerings, and they come as out-of-the-box, plug-and-play options, so any enterprise can start leveraging them quickly.

- Columnar database. Scenario: applications that need to fetch an entire related column family based on a given string, for example, search engines. Vendors/tools: SAP HANA, IBM DB2 BLU, eXtremeDB, EXASOL, IBM Informix, MS SQL Server, MonetDB.
- Key-value pair database. Scenario: needle-in-a-haystack applications (refer to the big data workload patterns given in this section). Vendors/tools: Redis, Oracle NoSQL Database, Linux DBM, Dynamo, Cassandra.
- Graph database. Scenario: recommendation engines and applications that provide "similar to"/"like" evaluations, for example, "a user who bought this item also bought...". Vendors/tools: ArangoDB, Cayley, DataStax, Neo4j, Oracle Spatial and Graph, OrientDB, Teradata Aster.
- Document database. Scenario: applications that evaluate churn management of social media data or other non-enterprise data. Vendors/tools: CouchDB, Elasticsearch, Informix, Jackrabbit, MongoDB, Apache Solr.

Polyglot pattern

In the polyglot pattern, traditional storage (RDBMS) and multiple other storage types (files, CMS, and so on) coexist with big data stores (NoSQL/HDFS) to solve business problems. Most modern business cases need legacy databases to remain in place while the latest big data techniques are adopted alongside them; replacing the entire system is neither viable nor practical. The polyglot pattern provides an efficient way to combine and use multiple types of storage mechanism, such as Hadoop and RDBMS, with big data appliances coexisting in the same storage solution. The preceding diagram represents the polyglot way of storing data in different storage types, such as RDBMS, key-value stores, NoSQL databases, CMS systems, and so on. Unlike the traditional approach of storing all information in one single data source, the polyglot approach routes data coming from applications across multiple sources (RDBMS, CMS, Hadoop, and so on) into different storage mechanisms, such as in-memory stores, RDBMS, HDFS, CMS, and so on.

Data access layer

Data access in traditional databases involves JDBC connections and HTTP access for documents. However, in big data, conventional data access takes too much time even with cache implementations, because the volume of data is so high. So we need a mechanism to fetch the data efficiently and quickly, with a reduced development life cycle, lower maintenance cost, and so on.
Data access patterns mainly focus on accessing big data resources through two primary approaches:

- End-to-end user-driven APIs (access through simple queries)
- Developer APIs (access provided through API methods)

In this section, we will discuss the following data access patterns, which enable efficient data access, improved performance, reduced development life cycles, and low maintenance costs for broader data access:

- Connector pattern
- Lightweight stateless pattern
- Service locator pattern
- Near real-time pattern
- Stage transform pattern

The preceding diagram represents the big data architecture layout in which these data access patterns operate. We discuss each of these mechanisms in detail in the following sections.

Connector pattern

The developer API approach entails fast data transfer and data access services through APIs. It creates optimized datasets for efficient loading and analysis. Some big data appliances abstract the data in NoSQL databases even though the underlying data is in HDFS or a custom filesystem implementation, so that data access is very efficient and fast. The connector pattern entails providing a developer API and an SQL-like query language to access the data, which significantly reduces development time. As we saw in the earlier diagram, big data appliances come with a connector pattern implementation. The big data appliance itself is a complete big data ecosystem: it supports virtualization, redundancy, and replication using protocols such as RAID, and some appliances host NoSQL databases as well. The preceding diagram shows a sample connector implementation for Oracle big data appliances. The data connector can connect to Hadoop as well as to the big data appliance; it is an example of the kind of custom implementation described earlier, which facilitates faster data access with less development time.

Lightweight stateless pattern

This pattern entails providing data access through web services, so it is independent of platform or language implementations. The data is fetched through RESTful HTTP calls, making this pattern the most sought after in cloud deployments. WebHDFS and HttpFS are examples of lightweight stateless pattern implementations for HTTP access to HDFS; they use the HTTP REST protocol. The HDFS system exposes a REST API (web services) for consumers who analyze big data. This pattern reduces the cost of ownership (pay-as-you-go) for the enterprise, as the implementations can be part of an integration Platform as a Service (iPaaS). The preceding diagram depicts a sample implementation for HDFS storage that exposes HTTP access through the HTTP web interface; a short code sketch of such a call follows below.
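As a small illustration of the lightweight stateless pattern, the sketch below lists an HDFS directory over WebHDFS using plain, stateless HTTP. The op=LISTSTATUS operation is part of the standard WebHDFS REST API, but the namenode host, port, path, and security settings shown here are placeholders that depend on your cluster.

```python
import requests

# Placeholder endpoint: substitute your namenode host/port and an HDFS path that exists.
NAMENODE = "http://namenode.example.com:9870"
HDFS_PATH = "/data/raw/events"


def list_hdfs_directory(path: str) -> list:
    """Call the WebHDFS REST API (stateless HTTP) and return the entry names in a directory."""
    url = f"{NAMENODE}/webhdfs/v1{path}"
    response = requests.get(url, params={"op": "LISTSTATUS"}, timeout=10)
    response.raise_for_status()
    statuses = response.json()["FileStatuses"]["FileStatus"]
    return [entry["pathSuffix"] for entry in statuses]


if __name__ == "__main__":
    for name in list_hdfs_directory(HDFS_PATH):
        print(name)
```

Because every request is self-contained, the same call works equally well from a laptop, a CI job, or an iPaaS integration flow, which is exactly why this pattern suits cloud deployments.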
Near real-time pattern

For any enterprise implementing real-time or near real-time data access, the key challenges to be addressed are:

- Rapid determination of data: ensure rapid determination of the data and make swift decisions (within a few seconds, not minutes) before the data becomes meaningless
- Rapid analysis: the ability to analyze the data in real time, spot anomalies and relate them to business events, provide visualization, and generate alerts at the moment the data arrives

Some examples of systems that need real-time data analysis are:

- Radar systems
- Customer service applications
- ATMs
- Social media platforms
- Intrusion detection systems

Storm and in-memory applications such as Oracle Coherence, Hazelcast IMDG, SAP HANA, TIBCO, Software AG (Terracotta), VMware, and Pivotal GemFire XD are some of the in-memory computing vendor/technology platforms that can implement near real-time data access pattern applications. As shown in the preceding diagram, with a multi-cache implementation at the ingestion phase, and with filtered, sorted data in multiple storage destinations (here, one of the destinations is a cache), one can achieve near real-time access. The cache can be a NoSQL database, or it can be any of the in-memory tools mentioned earlier. The preceding diagram depicts a typical implementation of a log search with Solr as the search engine.

Stage transform pattern

In the big data world, a massive volume of data can enter the data store; however, not all of that data is required or meaningful in every business case. The stage transform pattern provides a mechanism for reducing the amount of data scanned by fetching only the relevant data. HDFS holds the raw data, while business-specific data lives in a NoSQL database that can provide application-oriented structures and serve only the relevant data in the required format (a minimal code sketch of this idea appears at the end of this article). Combining the stage transform pattern and the NoSQL pattern is the recommended approach in cases where a reduced data scan is the primary requirement. The preceding diagram depicts one such case for a recommendation engine, where we need a significant reduction in the amount of data scanned for an improved customer experience. Virtualizing the data from HDFS into a NoSQL database, integrated with a big data appliance, is a highly recommended mechanism for rapid or accelerated data fetches.

We discussed big data design patterns by layer: the data sources and ingestion layer, the data storage layer, and the data access layer. To learn more about patterns associated with object-oriented, component-based, client-server, and cloud architectures, read our book Architectural Patterns.

Why we need Design Patterns?
Implementing 5 Common Design Patterns in JavaScript (ES8)
An Introduction to Node.js Design Patterns
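Here is the minimal sketch promised in the stage transform discussion above: instead of scanning raw HDFS data, the application fetches an already transformed, application-oriented record from a NoSQL store (MongoDB in this hypothetical example). The connection string, database, collection, and field names are all illustrative, not taken from any specific product.

```python
from pymongo import MongoClient

# Hypothetical connection string and collection holding pre-transformed records.
client = MongoClient("mongodb://localhost:27017")
recommendations = client["analytics"]["user_recommendations"]


def fetch_recommendations(user_id):
    """Read only the relevant, already-transformed document instead of scanning raw data."""
    return recommendations.find_one(
        {"user_id": user_id},
        projection={"_id": 0, "user_id": 1, "top_items": 1},
    )


if __name__ == "__main__":
    print(fetch_recommendations("user-123"))
```

The batch jobs of the stage transform pattern are what keep such a collection populated; the application itself only ever touches this small, query-friendly slice of the data.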
Pentest tool in focus: Metasploit

Savia Lobo
30 May 2018
5 min read
Security over the web is of the highest priority these days, as most of our transactions and storage take place on the web, and our systems are ripe targets for attackers. How can we improve the security around our systems? Metasploit is one solution cybersecurity professionals look to in order to lock down their security and reduce the risk of intruders.

Metasploit, an open source project, allows individuals or organizations to identify security vulnerabilities and develop code with which network administrators can break into their own systems, identify potential risks, and then prioritize which vulnerabilities need to be addressed. The Metasploit project offers:

- Penetration (pen) testing software
- Tools for automating the comparison of a program's vulnerabilities
- Anti-forensic and advanced evasion tools

Some tools are also built into the Metasploit Framework. The Metasploit Framework is a collection of tools, libraries, modules, and so on. It is popular among cybersecurity professionals and ethical hackers for carrying out penetration testing or hacking. They can use it to exploit vulnerabilities on a network and also to build Trojans, backdoors, botnets, phishing campaigns, and so on. You can check out our article on 12 common malware types you should know to learn about the different types of malware.

The Metasploit Framework is supported by various operating systems, including Linux, macOS, Windows, Android, and so on. Metasploit comes in both free and paid versions: the free versions (the Metasploit Framework and Metasploit Community) can be used to find basic exploits, but the full paid version (Metasploit Pro) is preferred, as it allows one to carry out deep pen tests and use other advanced features. The paid version offers:

- Collection of integrations via remote APIs
- Automation of several tasks, including smart exploitation, penetration testing reports, and much more
- Dynamic payloads to evade the top antivirus solutions

To use this hacking tool, one can make use of the different interfaces it offers.

Metasploit Interfaces

Msfconsole

Msfconsole is one of the most popular interfaces in the Metasploit Framework. Once you have the hang of this interface and its syntax, it provides coherent access to all the options within the Metasploit Framework. Some advantages of msfconsole include:

- With msfconsole, one can access all the features in the MSF
- It is the most stable interface and provides a console-based workflow
- With msfconsole, executing external commands is possible
- One gets full readline support, tabbing, and command completion

A scripted example of driving msfconsole appears at the end of this section.

Msfcli

Msfcli provides a powerful command-line interface to the framework. Some features of this interface include:

- Support for launching exploits and auxiliary modules
- Great for use in scripts and basic automation

However, one should be careful while using msfcli, as variables are case-sensitive and are assigned using an equals (=) sign.

MsfGUI

Msfgui is the GUI of the framework and a tool for carrying out demonstrations to clients and management. Msfgui:

- Provides a point-and-click interface for exploitation
- Provides a GTK wizard-based interface for using the Metasploit Framework

Armitage

Developed by Raphael Mudge, Armitage is an open source, Java-based front-end GUI for the Metasploit Framework. Its primary aim is to help security professionals understand hacking by getting to know the true potential of Metasploit.
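Since msfconsole can also be driven non-interactively, here is the scripted example mentioned above: a small Python helper that writes a Metasploit resource script and launches it with msfconsole -r. The target address is a placeholder, and the module shown (auxiliary/scanner/portscan/tcp) is just one commonly used scanner; run this only against systems you are explicitly authorized to test.

```python
import subprocess
import tempfile

# Placeholder target: use only against hosts you are authorized to test.
TARGET = "192.168.56.101"

RESOURCE_SCRIPT = f"""\
use auxiliary/scanner/portscan/tcp
set RHOSTS {TARGET}
set PORTS 1-1024
run
exit
"""


def run_metasploit_scan() -> None:
    """Write a resource (.rc) file and hand it to msfconsole in quiet mode."""
    with tempfile.NamedTemporaryFile("w", suffix=".rc", delete=False) as rc_file:
        rc_file.write(RESOURCE_SCRIPT)
        rc_path = rc_file.name
    # msfconsole -q suppresses the banner; -r executes the resource script.
    subprocess.run(["msfconsole", "-q", "-r", rc_path], check=True)


if __name__ == "__main__":
    run_metasploit_scan()
```

Resource scripts like this are one of the simplest ways to make msfconsole work repeatable, which is useful when the same checks need to run as part of a scheduled job.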
Advantages of using Metasploit

One can automate each phase of penetration testing. Metasploit allows pen testers and cybersecurity professionals to automate all phases of the penetration test, which matters because the amount of time required to carry out a complete and thorough pen test is huge. Metasploit automates tasks right from selecting the appropriate exploit to streamlining the evidence collection and reporting of the attack.

Credentials can be gathered and reused. Credentials are the keys to any network, and the biggest prize for a penetration tester. With Metasploit, one can catalog and track user credentials for reporting. Professionals and hackers can also reuse these credentials across every system in the network using a simple credential domino wizard.

Become a next-level pen tester. If one has already worked with the Metasploit Framework for years, its Pro version is definitely the next step. With Metasploit Pro, the expert can easily move through a network using its pivoting and antivirus evasion capabilities, and can also create instant reports on progress and evidence. Best of all, one can seamlessly use custom scripts by dropping into the command-line framework.

Metasploit in competition with other pen testing tools

Metasploit is not the only tool that offers penetration testing, but it is one of the preferred ones. There are a number of other tools on the market that can give Metasploit tough competition, including Wireshark, Nessus, and Nmap.

Wireshark is a well-known network protocol analyzer. It can read captured information from other applications and is multiplatform. Its only real drawback is a steep learning curve.

Nessus is a vulnerability scanner and a popular tool among security professionals. It has a huge library of vulnerabilities and corresponding tests to identify them, and it relies on the response from the target host to identify a flaw. Here, Metasploit is used as the exploitation tool to check whether the detected flaw is actually exploitable.

Nmap (Network Mapper) is a highly capable pen testing tool used for network mapping and discovery. Compared to Metasploit, it has a rudimentary GUI.

Metasploit is moving into web application security with its 3.5.0 release. The community has also added native PHP and Java payloads, which makes it easy to acquire advanced functionality through web application and Java server vulnerabilities. The community plans to port more exploits and modules to the Metasploit platform, and additional modules that target embedded devices, hardware devices, and bus systems, such as K-Line, could be added in the near future.

5 pen testing rules of engagement: What to consider while performing Penetration testing
How to secure a private cloud using IAM
Top 5 penetration testing tools for ethical hackers
How to plan a system migration in 10 steps

Hari Vignesh
14 Nov 2017
6 min read
How do I plan a system migration? A system migration refers to the process of moving an application from one environment to another (such as from an on-premises enterprise server to a cloud-based environment, from one server to another, or from cloud to cloud). You might, for example, migrate to or from custom-built platforms like Microsoft Azure, the Google App Engine, Force.com, MySQL, or Amazon Web Services. Software migration is always a challenge, but fortunately many system migrations can be managed, and even automated, by a third-party middleware solution.

Sometimes a system migration might be something smaller scale. You might want to move installed applications and data from one piece of hardware to another (as you would when you give your team new computers), rather than moving an app's entire development environment. While this is pretty easy in technical terms, making sure it's carefully managed for users is nevertheless important.

Why migration?

Migrations are done to improve efficiency or to bring all applications from a legacy system into a current one. That's why migration is becoming such a pressing issue for many organizations as they seek to undergo 'digital transformation' or optimize their existing setup. Often, organizations will want to virtualize their software. This is ultimately about disassociating it from specific operating systems, instead hosting programs in separate environments for sandboxing at runtime. Here are some migration scenarios:

Example 1: You want to move your team using Adobe Creative Cloud (CC) from old PCs to new Macs. You need to ensure that once team members are working on Macs with Adobe CC installed, they're still able to use paths to the server to access all creative assets.

Example 2: Your team uses custom software developed on one type of cloud environment, like Amazon Web Services (AWS), and now your organization is moving en masse to Google Cloud Platform (GCP). You need to map each piece of functionality your app had on AWS to GCP, despite the major differences in how each environment operates.

How to successfully plan a system migration

The hard bit is actually planning and executing a system migration. There's a lot that can go wrong, from both a technical and a people perspective. Here are 10 steps you should follow to remain (relatively) calm and in control when you're making a move.

1. Establish your cross-functional representatives. Because of the many hands required to see a software migration project through, and its long timeframe and far-off ROI, you need a champion in your corner from every corner of the business. Get one key representative from each business function relevant to the software that's moving, be it production, sales, accounting, IT, or another department. These people will help you maintain support for the project while it has yet to show ROI and when its budget comes under threat during, say, a lean quarter.

2. Frame the project for stakeholders. Be it department heads, the C-suite, or the board of directors, lay out the plan and just how essential it is for growth. Set out what the project entails, what it isn't, and lay out goalposts for each phase. Whenever it comes under review, you'll have this initial framework you and the stakeholders agreed upon.

3. Build a team of internal experts. Find technical experts within your organization who can assist with each part of the migration, even if you're ultimately using a third-party vendor or software for the migration.
Put these people in charge of cleaning existing data (or writing programs to clean it), knowing where everything is stored, and understanding the limitations of the platforms on each end of the migration. Depending on the size of your organization, each member of this team may lead their own small team to handle their portion of the project.

4. Take inventory of assets. There's no way to judge a migration as successful if you're not sure whether you lost any data along the way. In the case of data, some of your internal experts can check what is stored, make backups, and export to lightweight .CSV files or hard copies (in the case of legal and other vital documentation); a minimal sketch of such an export appears at the end of this article. For software or applications, take inventory of each action and function possible with the software, how it interfaces with its databases, what it's compatible with and what it isn't, and the unique custom configurations that separate it from off-the-shelf software's documentation.

5. Create a risk assessment report. Using the section above on challenges, determine all relevant risks to the migration, including opportunity costs and compliance issues. This will be vital for getting final approval from stakeholders, and it will insulate project runners from being blindsided later. One of these risk assessment matrix templates can help you get started.

6. Determine technical, time, and financial requirements. Work with individuals in finance to work out long-term budget needs and rates of approval over the whole project. Work with IT, developers, and engineering to figure out the technical aspects and requirements, which method of migration is appropriate, and who will be forced into downtime at which stages of the project. Compile all of this to figure out realistic timing and checkpoints for the migration.

7. Create a project management system for all parties. With the data you gathered in the previous step, and all the teams you've assembled (technical, cross-functional, and stakeholder teams), create a common project management hub where everyone can see progress, send messages, attach files and findings, and generally gain visibility into the process. It should be intuitive for all users. Set up the project management software with the budget and time expectations agreed upon for each phase. You can present this information to the stakeholders for final approval prior to project kickoff, and use it to submit regular reports to them as they request.

8. Perform the migration in phases. Using the appropriate methods, perform the migration and document every step. Use the project management tool to keep everyone informed and to gather documentation. Along the way, when some employees inevitably leave or join the team, you can use this tool to quickly get newcomers up to speed.

9. Test after each phase. After each phase, test whatever you've migrated into the new environment, and document the outcomes. Regular testing and sandboxing will allow your team to catch problems early and regroup or change direction before data is lost and progress is wasted.

10. Record the results. Once the migration is complete, record the final results and compare them to the goalposts set up and tracked in your project management tool. Combine all documentation and deliver a final report to stakeholders, and begin reaping the rewards of your newer, faster, better software, operating system, cloud environment, or whatever else you migrated.

By following the steps above, you should find your system migration a little more stress-free than it might otherwise be!
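As a tiny illustration of step 4 (taking inventory and exporting to lightweight .CSV files), here is a sketch that dumps an asset list to CSV with Python's standard library. The asset fields shown are hypothetical; use whatever attributes your own inventory actually tracks.

```python
import csv

# Hypothetical inventory; in practice this would come from a CMDB, spreadsheet, or discovery tool.
assets = [
    {"name": "crm-db-01", "type": "database", "owner": "sales", "location": "on-prem"},
    {"name": "billing-api", "type": "application", "owner": "finance", "location": "AWS"},
]


def export_inventory(path: str) -> None:
    """Write the asset inventory to a lightweight CSV file for backup and migration tracking."""
    fieldnames = ["name", "type", "owner", "location"]
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(assets)


if __name__ == "__main__":
    export_inventory("migration_inventory.csv")
```

A flat file like this is deliberately boring: it can be diffed before and after each migration phase, which makes it easy to prove that nothing was lost along the way.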
Hari Vignesh Jayapalan is a Google Certified Android app developer, IDF Certified UI & UX Professional, street magician, fitness freak, technology enthusiast, and wannabe entrepreneur. He can be found on Twitter @HariofSpades.
Promising DevOps Projects

Julian Ursell
29 Jan 2015
3 min read
The DevOps movement is currently driving a wave of innovations in technology, which are contributing to the development of powerful systems and software development architectures, as well as generating a transformation in "systems thinking". Game changers like Docker have revolutionized the way system engineers, administrators, and application developers approach their jobs, and there is now a concerted push to iterate on the new paradigms that have emerged. The crystallization of containerization and virtualization methods is producing a different perspective on service infrastructures, enabling a modularity and granularity scarcely imaginable a decade ago. Powerful configuration management tools such as Chef, Puppet, and Ansible allow infrastructure to be defined literally as code. The flame of innovation is burning brightly in this space, and the concept of the "DevOps engineer" is becoming a reality rather than the idealistic myth it appeared to be before.

Now that DevOps teams know roughly where they're going, a feverish development drive is gathering pace, with projects looking to build upon the flagship technologies that contributed to the initial spark. The next few years are going to be fascinating in terms of seeing how the DevOps foundations laid down will be built upon moving forward.

The major foundation of modern DevOps development is the refinement of the concept and implementation of containerization. Docker has demonstrated how containers can be leveraged to host, run, and deploy applications, servers, and services in an incredibly lightweight fashion, abstracting resources by isolating parts of the operating system in separate containers. The sea change in thinking this has created has been resounding. Still, a particular challenge for DevOps engineers working at scale with containers is developing effective orchestration services.

Enter Kubernetes (apparently meaning "helmsman" in Greek), the project open sourced by Google for the orchestration and management of container clusters. The value of Kubernetes is that it works alongside Docker, building beyond simply booting containers to allow a finer degree of management and monitoring. It utilizes units called "pods" that facilitate communication and data sharing between Docker containers, and the grouping of application-specific containers (a short example of talking to a cluster follows below). The Docker project has actually taken the orchestration service Fig under its wing for further development, but there are myriad ways in which containers can be orchestrated. Kubernetes illustrates how the wave of DevOps-oriented technologies like Docker is driving large-scale companies to open source their own solutions and contribute to the spirit of open source collaboration that underlines the movement.

Other influences of DevOps can be seen in the reappraisal of operating system architectures. CoreOS, for example, is a Linux distribution that has been designed with scale, flexibility, and lightweight resource consumption in mind. It hosts applications as Docker containers, and it makes the development of large-scale distributed systems easier by being "natively" clustered, meaning it is adapted naturally for use over multiple machines. Under the hood it offers powerful tools, including Fleet (CoreOS' cluster orchestration system) and etcd for service discovery and information sharing between cluster nodes.
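To make the pod concept a little more tangible, here is the short example mentioned above: it uses the official Kubernetes Python client to list the pods running in a cluster. It assumes you have a working kubeconfig (for example, from a local Minikube or a cloud cluster) and the kubernetes package installed.

```python
from kubernetes import client, config


def list_pods() -> None:
    """Connect using the local kubeconfig and print every pod the cluster is running."""
    config.load_kube_config()  # reads ~/.kube/config; use load_incluster_config() when running inside a pod
    v1 = client.CoreV1Api()
    for pod in v1.list_pod_for_all_namespaces(watch=False).items:
        print(f"{pod.metadata.namespace}/{pod.metadata.name} -> {pod.status.pod_ip}")


if __name__ == "__main__":
    list_pods()
```

Even this trivial listing shows why Kubernetes adds value beyond booting containers: every pod, in every namespace, is visible and addressable through one consistent API.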
A tool to watch out for in the future is Terraform (built by the same team behind Vagrant), which offers at its core the ability to build infrastructure from combined resources across multiple service providers, such as DigitalOcean, AWS, and Heroku, describing this infrastructure as code with an abstracted configuration syntax. It will be fascinating to see whether Terraform catches on and opens up to a greater number of major service providers. Kubernetes, CoreOS, and Terraform all convey the immense development pull generated by the DevOps movement, one that looks set to roll on for some time yet.
Why containers are driving DevOps

Diego Rodriguez
12 Jun 2017
5 min read
It has been a long ride since the days when one application would take up a full room of computing hardware. Research and innovation in information technology (IT) have taken us far, and will surely keep moving even faster every day. Let's talk a bit about the present state of DevOps, and how containers are driving the scene.

What are containers?

According to Docker (the most popular container platform), a container is a stand-alone, lightweight package that has everything needed to execute a piece of software. It packs your code, runtime environment, system tools, libraries, binaries, and settings. It's available for Linux and Windows apps. It runs the same every time, regardless of where you run it. It adds a layer of isolation, helping reduce conflicts between teams running different software on the same infrastructure.

Containers sit one level deeper in the virtualization stack, allowing lighter environments, more isolation, more security, more standardization, and many more blessings. There are tons of benefits you could take advantage of. Instead of having to virtualize the whole operating system (like virtual machines [VMs] do), containers take advantage of sharing most of the core of the host system and just add the required binaries and libraries that are not already on the host; no more gigabytes of disk space lost to bloated operating systems full of repeated components.

This means a lot of things: your deployments can be packed into a much smaller image than one that runs alone on a full operating system; each deployment boots up much faster; idle resource usage is lower; there is less configuration and more standardization (remember "convention over configuration"); and fewer things to manage plus more isolated apps mean fewer ways to screw something up, therefore less attack surface and, in turn, more security. But keep in mind, not everything is perfect, and there are many factors you need to take into account before getting into the containerization realm.

Considerations

It has been less than 10 years since containerization started, and in the technology world that is a lot, considering how fast other technologies such as web front-end frameworks and artificial intelligence (AI) are moving. In just a few years, this widely deployed technology has become mature and production-ready; coupled with microservices, the boost has taken it to new parts of the DevOps world, and it is now the de facto solution for many companies in their application and service deployment flow. Just before all this exciting movement started, VMs were the go-to answer for the many problems encountered by IT people, including myself. And although VMs are a great way to solve many of these problems, there was still room for improvement.

Nowadays, the horizon seems really promising, with top technology companies backing tools, frameworks, services, and products all around containers, benefiting most of the code we develop, test, debug, and deploy on a daily basis. These days, thanks to the work of many, it's possible to have a consistent, all-around lightweight way to run, test, debug, and deploy code from whichever platform you work on. So, if you code on Linux using Vim, but your coworker uses Windows with VS Code, both can have the same local container with the same binaries and libraries where the code runs; a minimal sketch of this idea follows below.
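As that minimal sketch, the snippet below uses the Docker SDK for Python to run a command inside the same image on any machine that has Docker installed, so everyone gets identical binaries and libraries. The image tag is just an example; any image your team agrees on would do.

```python
import docker


def run_shared_environment() -> str:
    """Run a command inside the team's agreed-upon image so everyone gets identical libraries."""
    client = docker.from_env()  # talks to the local Docker daemon
    output = client.containers.run(
        image="python:3.11-slim",                # example image; pin whatever your team standardizes on
        command=["python", "-c", "import sys; print(sys.version)"],
        remove=True,                             # clean up the container after it exits
    )
    return output.decode().strip()


if __name__ == "__main__":
    print(run_shared_environment())
```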
This removes a lot of incompatibility issues and allows teams to enjoy production-like environments on their own machines, without having to worry about sharing the same configuration files, misconfiguration, versioning hassles, and so on. It gets even better: not only is there no need to maintain the same configuration files across the different services, there is less configuration to handle as a whole. Templates do most of the work for us, allowing you and your team to focus on creating and deploying your products, improving and iterating your services, and changing and enhancing your code. In fewer than 10 lines you can specify a working template containing everything needed to run a simple Node.js service, a Ruby on Rails application, or even a Scala cron job. Containerization supports most, if not all, languages and stacks.

Containers and virtualization

Virtualization has accelerated the speed at which we build things for many years, and it will continue to provide us with better solutions as time goes by. Just as we went from Infrastructure as a Service (IaaS) to Platform as a Service (PaaS) and finally Software as a Service (SaaS) and others (Anything as a Service? AaaS?), I am certain that we will find more abstraction beyond containers, making our lives easier every day.

Like most of today's tools, many virtualization and containerization tools are open source, with huge communities and support boards around them, but keep the trust in good ol' Stack Overflow. So remember to give something back to the amazing open source community: open issues, report bugs, share what works well, and help fix and improve the parts that are lacking. But really, just try to learn these new and promising technologies that give us IT people a huge bump in efficiency in pretty much all aspects.

About the author

Diego Rodriguez Baquero is a full stack developer specializing in DevOps and SysOps. He is also a WebTorrent core team member. He can be found at https://diegorbaquero.com/.
Top frameworks for building your Progressive Web Apps (PWA)

Sugandha Lahoti
15 May 2018
6 min read
The hype around progressive web apps is tremendous, and their rise has been rapid. A PWA is basically a web application that feels like a native application to the user. By making your app a PWA, not only do you acquire new users, but you can also retain them for longer. Here's a quick rundown of all the things that are good about a PWA:

- Reliable: loads instantly, even under poor network conditions.
- Lightning fast and app-like: responds to the user's actions quickly and with smooth interaction.
- Engaging and responsive: gives the feeling that it was made specifically for that device, while still working across all platforms.
- Protected and secure: served over HTTPS to make sure the contents of the app are not tampered with.

If you're not already developing your next app as a PWA, here are 5 reasons why you should do that asap. And if you're unsure about choosing the best framework for developing your PWA, here are the top 3 frameworks available to make your next app a PWA.

Ionic

Ionic is one of the most popular frameworks for building a progressive web app. Let's look at a few reasons why you should choose Ionic as your PWA framework.

Free and open source: Ionic is open source and licensed under MIT. Open source means developers can manage the code structure easily, saving time, money, and effort. There is also a worldwide community forum for connecting with other Ionic developers, asking questions, and helping out others.

Cross-platform with one codebase: Ionic allows seamless building of apps across popular operating systems, such as Android, iOS, and Windows. Apps are deployed through Apache Cordova with a single codebase, and the application adapts automatically to the device it is running on.

Rich UI: Ionic is equipped with pre-built components that are used to customize design themes and elements. Its UI is based on Sass, with rich features to create fast, robust, interactive, native-like applications.

Powerful functionality: Ionic is supported by Angular. The component API of Angular helps developers create interactive hybrid and web apps. Ionic is equipped with Cordova plugins for accessing various native features, like the camera, GPS, and so on. It also features a powerful CLI for building, testing, and deploying apps across multiple platforms.

Read our Hybrid Mobile Development with Ionic to build a complete, professional-quality, hybrid mobile application with Ionic. You can also check out Hybrid Mobile apps: What you need to know, for a quick rundown of all there is to know about hybrid mobile apps.

Polymer

Google's Polymer App Toolbox is another contender for the development of PWAs. It is a collection of web components, tools, and templates for building progressive web apps.

Blends PWAs with web components: Polymer allows developers to architect a component-based web app using Polymer and Web Components. Web components form encapsulated and reusable custom HTML elements. They are independent of frameworks because they are made of pure HTML/CSS/JS, unlike the framework-dependent UI components in React or Angular. The web components are provided through a lightweight Polymer library for creating framework-independent custom components. More features include:

- Responsive design using the app layout components.
- Modular routing using the <app-route> elements.
- Localization with <app-localize-behavior>.
- Turnkey support for local storage with app storage elements.
- Offline caching as a progressive enhancement, using service workers.
- Build tooling to support serving the app in multiple ways: unbundled for delivery over HTTP/2 with server push, and bundled for delivery over HTTP/1.

Each component, whether used separately or together, can be used to build a full-featured progressive web app. Most importantly, each component is additive: for a simple app one only needs the app layout, and as the app gets more complicated, developers can add routing, offline caching, and a high-performance server as required. Read our Getting Started with Polymer book to create responsive web apps using Polymer.

Angular

Angular, probably the most popular front-end web application platform, can also be used to make robust, reliable, and responsive PWAs. Before the release of version 5, supporting progressive web apps in Angular required a lot of expertise on the developer's part. Version 5 comes equipped with a new version of the Angular service worker for built-in PWA support, and Angular 6 (released a few days back) adds two new CLI commands. Both of these versions make it very simple to make a web application downloadable and installable, just like a native mobile application.

Service worker updates: With Angular 5, the development of service workers becomes significantly easier. By using the Angular CLI, developers can choose to add service worker functionality by default. The Angular service worker functionality is provided by the module @angular/service-worker. A service worker can power up an application through some JSON configuration alone, instead of writing the code manually. The key difference from other service worker generators (like Workbox and sw-precache) is that you do not regenerate the service worker file itself; you only update its control file.

New CLI commands: Angular 6 also introduces two new commands apart from the service worker updates. The first, ng update, is a CLI command for updating dependencies and code. The second command, ng add, supports turning applications into progressive web apps, which support offline web pages.

Apart from these frameworks, React is also a good alternative. Backed by Facebook, it has the Create React App generator, which is the official scaffolding tool for generating a React app. Get started with Scott Domes's Progressive Web Apps with React as your first step for building PWA applications. Yet another popular choice would be Webpack. Webpack plugins can generate the service worker and manifest required for a PWA to be registered, using a Google project called Workbox, which provides tools that help make offline support for web apps easier to set up.

The bottom line is that the frameworks for building progressive web apps are growing and expanding at a rapid rate, with regular updates every couple of months. Choosing a particular framework thus doesn't make much difference to the app's behavior; it mostly comes down to the developer's area of interest and expertise.

Windows launches progressive web apps… that don't yet work on mobile
How to Secure and Deploy an Android App
How Android app developers can convert iPhone apps
What software stack does Netflix use?

Richard Gall
03 Sep 2017
5 min read
Netflix is a company that has grown at an incredible pace. In July 2017 it reached 100 million subscribers around the world; for a company that started life as a DVD subscription service, Netflix has proved to be adaptable, constantly one step ahead of changes in the market and changes in user behaviour. It's an organization that has been able to scale while maintaining a strong focus on user experience. This flexibility and adaptability has been driven, or at least enabled, by its approach to software. But what software does Netflix use, exactly? How and why has it made decisions about its software stack?

Netflix's front-end development tools

User experience is critical for Netflix. That's why React is such a valuable framework for the engineering team. The team moved to React in 2014 as a means to completely overhaul their UI, essentially to make it fit for purpose for the future. As they outline in this piece from January 2015, their core considerations were startup speed, runtime performance, and modularity; or, to summarize, managing scale in terms of users, content, and devices. The piece goes on to explain why React fit the bill. It mentions how important isomorphic JavaScript is to the way they work: "React enabled us to build JavaScript UI code that can be executed in both server (e.g. Node.js) and client contexts. To improve our start up times, we built a hybrid application where the initial markup is rendered server-side and the resulting UI elements are subsequently manipulated as done in a single-page application."

How Netflix manages microservices at scale

A lot has been written about Netflix and microservices (we recommend this blog post as a good overview of how microservices have been used by Netflix over the last decade). A big challenge for a platform that has grown like Netflix has is managing those microservices. To do this, the team built their own tool, called Netflix Conductor: "In a microservices world, a lot of business process automations are driven by orchestrating across services. Conductor enables orchestration across services while providing control and visibility into their interactions."

To get microservices to interact with one another, the team write shell/Python scripts; however, this ran into issues, largely due to scale. To combat this, the engineers developed a tool called Scriptflask. Scriptflask "exposes the functionality of utilities as REST endpoints." It was developed using Flask (hence the name Scriptflask), which offered good interoperability with Python, a language used across a wide range of Netflix applications (a minimal sketch of this idea appears at the end of this article). The team also use Node.js with Docker to manage services; it's well worth watching this video from Node.js Interactive in December 2016, where Yunong Xiao, Principal Software Engineer at Netflix, talks about 'slaying monoliths'.

Netflix and AWS migration

Just as we saw with Airbnb, AWS has proven crucial to Netflix's success. In fact, for the last few years it has been undergoing a huge migration project to move the bulk of its architecture into AWS, a migration that was finally completed at the start of 2016. "We chose Amazon Web Services (AWS) as our cloud provider because it provided us with the greatest scale and the broadest set of services and features," writes Yuri Izrailevsky, VP of cloud and platform engineering at Netflix.

Netflix and big data

The move to AWS has, of course, been driven by data-related challenges. In fact, the team use Amazon S3 as their data warehouse.
The scale of the data is stunning: it's been said that the data warehouse is 60 petabytes. This post from InfoQ elaborates on the Netflix big data infrastructure.

How Netflix does DevOps

For a company that has proven itself so adaptable at a macro level, it's unsurprising that the way the engineering teams at Netflix build code is incredibly flexible and agile too. It's worth quoting this from the team; it says a lot about the culture: "The Netflix culture of freedom and responsibility empowers engineers to craft solutions using whatever tools they feel are best suited to the task. In our experience, for a tool to be widely accepted, it must be compelling, add tremendous value, and reduce the overall cognitive load for the majority of Netflix engineers."

Clearly the toolchain that supports development teams is open ended; it's constantly subject to revision and change. But we wanted to flag some of the key tools that help keep the culture running. First, there is Gradle; the team write that "Gradle was chosen because it was easy to write testable plugins, while reducing the size of a project's build file." It also makes sense given that Java makes up such a large proportion of the Netflix codebase. To provide additional support, the team also developed something called Nebula, "an opinionated set of plugins for the Gradle build system, to help with the heavy lifting around building applications".

When it comes to integration, Jenkins is essential for Netflix: "We started with a single massive Jenkins master in our datacenter and have evolved to running 25 Jenkins masters in AWS". This is just a snapshot of the tools used by the team to build and deploy code; for a much deeper exploration, we recommend this post on Medium.
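Returning to the Scriptflask idea mentioned earlier, exposing utility functionality as REST endpoints, here is a minimal, hypothetical sketch of what such a wrapper can look like in Flask. It is not Netflix's actual implementation; the endpoint and the wrapped utility (a simple disk-usage check) are invented for illustration.

```python
import subprocess

from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/utils/disk-usage/<path:directory>")
def disk_usage(directory: str):
    """Wrap a small operational utility (here, `du`) behind a REST endpoint."""
    result = subprocess.run(
        ["du", "-sh", f"/{directory}"],   # list form avoids shell injection
        capture_output=True, text=True, check=False,
    )
    return jsonify({
        "directory": f"/{directory}",
        "usage": result.stdout.strip(),
        "error": result.stderr.strip() or None,
    })


if __name__ == "__main__":
    app.run(port=5000)
```

Wrapping shell utilities like this trades a pile of one-off scripts for a single HTTP surface that other services, dashboards, and orchestration tools can call consistently.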
Bridging the gap between data science and DevOps with DataOps

Richard Gall
23 Mar 2016
5 min read
What's the real value of data science? Data science was hailed as the sexiest job of the 21st century just a few years ago, yet there are rumors that it's not quite proving its worth. Gianmario Spacagna, a data scientist for Barclays bank in London, told Computing magazine at Spark Summit Europe in October 2015 that, in many instances, there's not enough impact from data science teams. "It's not a playground. It's not academic," he said. His solution sounds simple: we need to build a bridge between data science and DevOps, and DataOps is perhaps the answer. He says:

"If you're a start-up, the smartest person you want to hire is your DevOps guy, not a data scientist. And you need engineers, machine learning specialists, mathematicians, statisticians, agile experts. You need to cover everything otherwise you have a very hard time to actually create proper applications that bring value."

This idea makes a lot of sense. It's become clear over the past few years that 'data' itself isn't enough; it might even be distracting for some organizations. Sometimes too much time is spent in spreadsheets and not enough time is spent actually doing stuff. Making decisions, building relationships, building things: that's where real value comes from. What Spacagna has identified is ultimately a strategic flaw in how data science is used in many organizations. There's often too much focus on what data we have and what we can get, rather than on who can access it and what they can do with it. If data science isn't joining the dots, DevOps can help.

True, a large part of the problem is strategic, but DevOps engineers can also provide practical solutions by building dashboards and creating APIs. These sorts of things immediately give data additional value by making it more accessible and, put simply, more usable. Even for a modest, medium-sized business, data scientists and analysts will have minimal impact if they are not successfully integrated into the wider culture. While it's true that many organizations still struggle with this, Airbnb demonstrate how to do it incredibly effectively. Take a look at their Airbnb Engineering and Data Science publication on Medium. In this post, they talk about the importance of scaling knowledge effectively. Although they don't specifically refer to DevOps, it's clear that DevOps thinking has informed their approach. In the products they've built to scale knowledge, for example, the team demonstrate a very real concern for accessibility and efficiency. What they build is created so people can do exactly what they want and get what they need from data. It's a form of strict discipline that is underpinned by a desire for greater freedom.

If you keep reading Airbnb's publication, another aspect of 'DevOps thinking' emerges: a relentless focus on customer experience. By this, I don't simply mean that the work done by the Airbnb engineers is specifically informed by a desire to improve customer experiences; that's obvious. Instead, it's the sense that the tools through which internal collaboration and decision making take place should actually resemble a customer experience. They need to be elegant, engaging, and intuitive. This doesn't mean seeing every relationship as purely transactional, based on some perverse logic of self-interest, but rather having a deeper respect for how people interact and share ideas. If DevOps is an agile methodology that bridges the gap between development and operations, it can also help to bridge the gap between data and operations.
DataOps: bringing DevOps and data science together

This isn't a new idea. As much as I'd like to, I can't claim credit for inventing 'DataOps'. But there's not really much point in asserting that distinction. DataOps is simply another buzzword for the managerial class, and while some buzzwords have value, I'm not so sure that we need another one. More importantly, why create another gap between data and development? That gap doesn't make sense in the world we're building with software today. Even for web developers and designers, the products they are creating are so driven by data that separating the data from the dev is absurd.

Perhaps, then, it's not enough just to ask more from our data science, as Gianmario Spacagna does. DevOps offers a solution, but we're going to miss the bigger picture if we simply ask for more DevOps engineers and some space for them to sit next to the data team. We also need to ask how data science can inform DevOps. It's about opening up a dialogue between these different elements. While DevOps evangelists might argue that DevOps has already started that, the way forward is to push for more dialogue, more integration, and more collaboration. As we look towards the future, with the API economy becoming more and more important to the success of both startups and huge corporations, the relationships between all these different areas are going to become more and more complex. If we want to build better and build smarter, we're going to have to talk more. DevOps and DataOps both offer us a good place to start the conversation, but it's important to remember it's just the start.