
Tech Guides

Richard Gall
29 Mar 2018
9 min read

Should you move to Python 3? 7 Python experts' opinions

Python is one of the most used programming languages on the planet. But when something is so established and popular across a number of technical domains, the pace of change slows. Moving to Python 3 appears to be a challenge for many development teams and organizations. So, is switching to Python 3 worth the financial investment, the training and the stress? Mike Driscoll spoke to a number of Python experts about whether developers should move to Python 3 for Python Interviews, a book that features 20 interviews with leading Python programmers and community contributors.

The transition to Python 3 can be done gradually

Brett Cannon (@brettsky), Python core developer and Principal Software Developer at Microsoft:

As someone who helped to make Python 3 come about, I'm not exactly an unbiased person to ask about this. I obviously think people should make the switch to Python 3 immediately, to gain the benefits of what has been added to the language since Python 3.0 first came out. I hope people realize that the transition to Python 3 can be done gradually, so the switch doesn't have to be abrupt or especially painful. Instagram switched in nine months, while continuing to develop new features, which shows that it can be done.

Anyone starting out with Python should learn Python 3

Steve Holden (@HoldenWeb), CTO of Global Stress Index and former chairman and director of The PSF:

Only when they need to. There will inevitably be systems written in 2.7 that won't get migrated. I hope that their operators will collectively form an industry-wide support group, to extend the lifetimes of those systems beyond the 2020 deadline for Python-Dev support. However, anyone starting out with Python should clearly learn Python 3, and that is increasingly the case.
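The gradual approach Cannon describes often starts with "straddling" code that runs on both interpreters. A minimal sketch (the `average` function is just an illustrative example; on Python 3 the `__future__` imports are harmless no-ops):

```python
# Straddling code: these __future__ imports give Python 2.7 the
# Python 3 semantics, and are no-ops when run under Python 3 itself.
from __future__ import absolute_import, division, print_function

def average(values):
    # True division everywhere: 7 / 2 == 3.5 on both interpreters,
    # instead of silently truncating to 3 on Python 2.
    return sum(values) / len(values)

print(average([3, 4]))  # 3.5 under either interpreter
```

Once the codebase runs cleanly this way, dropping Python 2 support becomes a one-line change to the supported-versions list rather than a rewrite.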
Python 3 resolves a lot of inconsistencies

Glyph Lefkowitz (@glyph), founder of Twisted, a Python network programming framework, awarded The PSF's Community Service Award in 2017:

I'm in Python 3 in my day job now and I love it. After much blood, sweat and tears, I think it actually is a better programming language than Python 2 was. I think that it resolves a lot of inconsistencies. Most improvements should mirror quality of life issues, and the really interesting stuff going on in Python is all in the ecosystem. I absolutely cannot wait for a PyPy 3.5, because one of the real downsides of using Python 3 at work is that I now have to deal with the fact that all of my code is 20 times slower.

When I do stuff for the Twisted ecosystem, and I run stuff on Twisted's infrastructure, we use Python 2.7 as a language everywhere, but we use PyPy as the runtime. It is just unbelievably fast! If you're running services, then they can run with a tenth of the resources. A PyPy process will take 80 MB of memory, but once you're running that it will actually take more memory per interpreter, but less memory per object. So if you're doing any Python stuff at scale, I think PyPy is super interesting.

One of my continued bits of confusion about the Python community is that there's this thing out there which, for Python 2 anyway, just makes all of your code 20 times faster. This wasn't really super popular; in fact, PyPy download stats still show that it's not as popular as Python 3, and Python 3 is really experiencing a huge uptick in popularity. I do think that, given that the uptake in popularity has happened, the lack of a viable Python 3 implementation for PyPy is starting to hurt it quite a bit. But it was around and very fast for a long time before Python 3 had even hit 10% of PyPy's downloads. So I keep wanting to predict that this is the year of PyPy on the desktop, but it just never seems to happen.
Most actively maintained libraries support Python 3

Doug Hellmann (@doughellmann), the man behind Python Module of the Week and a fellow of The PSF:

The long lifetime for Python 2.7 recognizes the reality that rewriting functional software based on backwards-incompatible upstream changes isn't a high priority for most companies. I encourage people to use the latest version of Python 3 that is available on their deployment platform for all new projects. I also advise them to carefully reconsider porting their remaining legacy applications, now that most actively maintained libraries support Python 3.

Migration from Python 2 to 3 is difficult

Massimo Di Pierro (@mdipierro), Professor at the School of Computing at De Paul University in Chicago and creator of web2py, an open source web application framework written in Python:

Python 3 is a better language than Python 2, but I think that migration from Python 2 to Python 3 is difficult. It cannot be completely automated and often it requires understanding the code. People do not want to touch things that currently work. For example, the str function in Python 2 converts to a string of bytes, but in Python 3, it converts to Unicode. So this makes it impossible to switch from Python 2 to Python 3 without actually going through the code and understanding what type of input is being passed to the function, and what kind of output is expected. A naïve conversion may work very well as long as you don't have any strange characters in your input (like byte sequences that do not map into Unicode). When that happens, you don't know if the code is doing what it was supposed to do originally or not.

Consider banks, for example. They have huge codebases in Python, which have been developed and tested over many years. They are not going to switch easily, because it is difficult to justify that cost. Consider this: some banks still use COBOL. There are tools to help with the transition from Python 2 to Python 3.
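Di Pierro's point about str is easy to demonstrate from the Python 3 side. A small sketch (the byte string is an arbitrary Latin-1-encoded example):

```python
# In Python 2, str(x) produced a byte string; in Python 3, str is
# Unicode text, so bytes must be explicitly decoded.
data = b"caf\xe9"          # 'café' encoded as Latin-1, not valid UTF-8

# Calling str() on bytes in Python 3 does NOT decode - it gives the repr:
print(str(data))           # b'caf\xe9'

# A naive decode that assumes UTF-8 fails on this input:
try:
    data.decode("utf-8")
except UnicodeDecodeError:
    print("not UTF-8 - you must know the real encoding")

# Decoding works only once you know what the bytes actually contain:
print(data.decode("latin-1"))
```

This is exactly the case where a mechanical 2-to-3 conversion cannot help: only a human who knows which encoding the input uses can pick the right decode call.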
I'm not really an expert on those tools, so a lot of the problems I see may have a solution that I'm not aware of. But I still found that each time I had to convert code, this process was not as straightforward as I would like.

The divide between the worlds of Python 2 and 3 will exist well beyond 2020

Marc-Andre Lemburg (@malemburg), co-founder of The PSF and CEO of eGenix:

Yes, you should, but you have to consider the amount of work which has to go into a port from Python 2.7 to 3.x. Many companies have huge code bases written for Python 2.x, including my own company eGenix. Commercially, it doesn't always make sense to port to Python 3.x, so the divide between the two worlds will continue to exist well beyond 2020.

Python 2.7 does have its advantages, because it became the LTS version of Python. Corporate users generally like these long-term support versions, since they reduce porting efforts from one version to the next. I believe that Python will have to come up with an LTS 3.x version as well, to be able to sustain success in the corporate world. Once we settle on such a version, this will also make a more viable case for a Python 2.7 port, since the investment will then be secured for a good number of years.

Python 3 has tons of amazing new features

Barry Warsaw (@pumpichank), member of the Python Foundation team at LinkedIn, former project leader of GNU Mailman:

We all know that we've got to get on Python 3, so Python 2's life is limited. I made it a mission inside of Ubuntu to try to get people to get on Python 3. Similarly, within LinkedIn, I'm really psyched, because all of my projects are on Python 3 now. Python 3 is so much more compelling than Python 2. You don't even realize all of the features that you have in Python 3. One of the features that I think is really awesome is the async I/O library. I'm using that in a lot of things and think it is a very compelling new feature that started with Python 3.4.
Even with Python 3.5, with the new async keywords for I/O-based applications, asyncio was just amazing. There are tons of these features that once you start to use them, you just can't go back to Python 2. It feels so primitive. I love Python 3 and use it exclusively in all of my personal open source projects. I find that dropping back to Python 2.7 is often a chore, because so many of the cool things you depend on are just missing, although some libraries are available in Python 2 compatible back ports.

I firmly believe that it's well past the time to fully embrace Python 3. I wouldn't write a line of new code that doesn't support it, although there can be business reasons to continue to support existing Python 2 code. It's almost never that difficult to convert to Python 3, although there are still a handful of dependencies that don't support it, often because those dependencies have been abandoned. It does require resources and careful planning though, but any organization that routinely addresses technical debt should have conversion to Python 3 in their plans.

That said, the long life of Python 2.7 has been great. It's provided two important benefits, I think. The first is that it provided a very stable version of Python, almost a long-term support release, so folks didn't have to even think about changes in Python every 18 months (the typical length of time new versions are in development). Python 2.7's long life also allowed the rest of the ecosystem to catch up with Python 3. So the folks who were very motivated to support it could sand down the sharp edges and make it much easier for others to follow.

I think we now have very good tools, experience, and expertise in how to switch to Python 3 with the greatest chance of success. I think we reached the tipping point somewhere around the Python 3.5 release. Regardless of what the numbers say, we're well past the point where there's any debate about choosing Python 3, especially for new code.
Python 2.7 will end its life in mid-2020 and that's about right, although not soon enough for me! At some point, it's just more fun to develop in and on Python 3. That's where you are seeing the most energy and enthusiasm from Python developers.
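As a taste of the asyncio library Warsaw praises, here is a minimal sketch using the modern async/await syntax (Python 3.7+; the task names and delays are made up):

```python
import asyncio

async def fetch(name, delay):
    # Simulate an I/O-bound operation (e.g. a network call) without
    # blocking the event loop.
    await asyncio.sleep(delay)
    return name

async def main():
    # Run both "requests" concurrently; total time is roughly the
    # slowest task, not the sum of the two.
    results = await asyncio.gather(fetch("a", 0.01), fetch("b", 0.02))
    return results

print(asyncio.run(main()))  # ['a', 'b']
```

The `async def`/`await` keywords are exactly the Python 3.5 additions mentioned above; there is no equivalent syntax in Python 2.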

Sugandha Lahoti
11 Jul 2018
6 min read

Top 5 automated testing frameworks

The world is abuzz with automation. It is everywhere today and is becoming an integral part of organizations and processes. Software testing, an intrinsic part of website/app/software development, has also been taken over by test automation tools. However, as happens in many software markets, a surplus of tools complicates the selection process. We have identified the top 5 testing frameworks used by most developers for automating the testing process. These automation testing frameworks cover a broad range of devices and support different scripting languages. Each framework has its own unique pros, cons, and learning approaches.

Selenium

Creator: Jason Huggins
Language: Java
Current version: 3.11.0
Popularity: 11,031 stars on GitHub

Selenium is probably the most popular test automation framework, primarily used for testing web apps. However, Selenium can also be used in cloud-based services, load-testing services and for monitoring, quality assurance, test architecture, regression testing, performance analysis, and mobile testing. It is open source, i.e., the source code can be altered and modified if you want to customize it for your testing purposes. It is flexible enough for you to write your own scripts and add functionality to test scripts and the framework. The Selenium suite consists of four different tools: Selenium IDE, Selenium Grid, Selenium RC, and Selenium WebDriver. It also supports a wide range of programming languages such as C#, Java, Python, PHP, Ruby, Groovy, and Perl. Selenium is portable, so it can be run anywhere, eliminating the need to configure it specifically for a particular machine. It becomes quite handy when you are working in varied environments and platforms, supporting various system environments (Windows, Mac, Linux) and browsers (Chrome, Firefox, IE, and headless browsers).
Most importantly, Selenium has a great community, which implies more forums, more resources, examples, and solved problems.

Appium

Creator: Dan Cuellar
Language: C#
Current version: 1.8.1
Popularity: 7,432 stars on GitHub

Appium is an open source test automation framework for testing native, hybrid, and mobile web applications. It allows you to run automated tests on actual devices, emulators (Android), and simulators (iOS). It provides cross-platform solutions for native and hybrid mobile apps, which means that the same test cases will work on multiple platforms (iOS, Android, Windows, Mac). Appium also allows you to talk to other Android apps that are integrated with the App Under Test (AUT).

Appium has a client-server architecture. It extends the WebDriver client libraries, which are already written in most popular programming languages. So, you are free to use any programming language to write the automation test scripts. With Appium, you can also run your test scripts in the cloud using services such as Sauce Labs and Testdroid. Appium is available on GitHub with documentation and tutorials to learn all that is needed. The Appium team is alive, active, and highly responsive as far as solving an issue is concerned. Developers can expect a reply no more than 36 hours after an issue is opened. The community around Appium is also pretty large and growing every month.

Katalon Studio

Creator: Katalon LLC
Language: Groovy
Current version: 5.4.2

Katalon Studio is another test automation solution for web applications, mobile, and web services. Katalon Studio uses Groovy, a language built on top of Java. It is built on top of the Selenium and Appium frameworks, taking advantage of these two for integrated web and mobile test automation.
Unlike Appium and Selenium, which are more suitable for testers who possess good programming skills, Katalon Studio can be used by testers with limited technical knowledge. Katalon Studio has an interactive UI with drag-and-drop features, letting you select keywords and test objects to form test steps. It has a manual mode for technically strong users and a scripting mode that supports development facilities like syntax highlighting, code suggestion and debugging. On the downside, Katalon has to load many extra libraries for parsing test data and test objects, and for logging. Therefore, it may be a bit slower for long test cases as compared to other testing frameworks which use Java.

Robot Framework

Creators: Pekka Klärck, Janne Härkönen et al.
Language: Python
Current version: 3.0.4
Popularity: 2,393 stars on GitHub

Robot Framework is a Python-based, keyword-driven, acceptance test automation framework. It is a general purpose test automation framework primarily used for acceptance testing, and it streamlines this into mainstream development, thus giving rise to the concept of acceptance test driven development (ATDD). It was created by Pekka Klärck as part of his master's thesis and was developed within Nokia Siemens Networks in 2005. Its core framework is written in Python, but it also supports IronPython (.NET), Jython (JVM) and PyPy. The keyword-driven approach simplifies tests and makes them readable. There is also provision for creating reusable higher-level keywords from existing ones. Robot Framework stands out from other testing tools by working on easy-to-use tabular test files that provide different approaches towards test creation. It is the extensible nature of the tool that makes it so versatile. It can be adjusted to different scenarios and used with different software backends, such as via Python and Java libraries, and also via different APIs.
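Robot Framework's actual tests live in tabular files, but the keyword-driven idea itself (readable keyword rows mapped onto small reusable functions) can be sketched in plain Python; the keywords below are hypothetical examples, not Robot Framework syntax:

```python
# A toy keyword-driven runner: a test case is a list of rows, each a
# keyword name plus arguments, and each keyword is a reusable function.
KEYWORDS = {}

def keyword(func):
    # Register "open_cart" under the readable name "Open Cart".
    KEYWORDS[func.__name__.replace("_", " ").title()] = func
    return func

@keyword
def open_cart(state):
    state["cart"] = []

@keyword
def add_item(state, item):
    state["cart"].append(item)

@keyword
def cart_should_contain(state, item):
    assert item in state["cart"], f"{item} missing from cart"

# A "tabular" test: readable rows instead of imperative code.
test_case = [
    ("Open Cart",),
    ("Add Item", "book"),
    ("Cart Should Contain", "book"),
]

state = {}
for name, *args in test_case:
    KEYWORDS[name](state, *args)
print("PASS")
```

The appeal is that the rows read like plain English, so non-programmers can write and review them, while the underlying keyword library stays reusable across test cases.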
Watir

Creators: Bret Pettichord, Charley Baker, and more
Language: Ruby
Current version: 6.7.2
Popularity: 1,126 stars on GitHub

Watir is a powerful test automation tool based on a family of Ruby libraries. It stands for Web Application Testing In Ruby. Watir can connect to databases, export XML, structure code as reusable libraries, and read data files and spreadsheets, all thanks to Ruby. It supports cross-browser and data-driven testing, and the tests are easy to read and maintain. It also integrates with BDD tools such as Cucumber and Test/Unit, with BrowserStack or Sauce Labs for cross-browser testing, and with Applitools for visual testing. Whilst Watir supports only Internet Explorer on Windows, Watir-WebDriver, the modern version of the Watir API based on Selenium, supports Chrome, Firefox, Internet Explorer and Opera, and can also run in headless mode (HTMLUnit).

All the frameworks that we discussed above offer unique benefits based on their target platforms and respective audiences. One should avoid selecting a framework based solely on technical requirements. Instead, it is important to identify what is suitable to developers, their team, and the project. For instance, even though general-purpose frameworks cover a broad range of devices, they often lack hardware support. And frameworks which are device-specific often lack support for different scripting languages and approaches. Work with what suits your project and your team requirements best.

Amey Varangaonkar
04 May 2018
5 min read

2018 is the year of graph databases. Here's why.

With the explosion of data, businesses are looking to innovate as they connect their operations to a whole host of different technologies. The need for consistency across all data elements is now stronger than ever. That's where graph databases come in handy. Because they allow for a high level of flexibility when it comes to representing your data, and also while handling complex interactions within different elements, graph databases are considered by many to be the next big trend in databases. In this article, we dive deep into the current graph database scene, and list the top 3 reasons why graph databases will continue to soar in popularity in 2018.

What are graph databases, anyway?

Simply put, graph databases are databases that follow the graph model. What is a graph model, then? In mathematical terms, a graph is simply a collection of nodes, with different nodes connected by edges. Each node contains some information about the graph, while edges denote the connection between the nodes.

How are graph databases different from relational databases, you might ask? Well, the key difference between the two is the fact that graph data models allow for more flexible and fine-grained relationships between data objects, as compared to relational models. There are some more differences between the graph data model and the relational data model, which you should read through for more information.

Often, you will see that graph databases are without a schema. This allows for a very flexible data model, much like the document or key/value store database models. A unique feature of graph databases, however, is that they also support relationships between the data objects like a relational database. This is useful because it allows for a more flexible and faster database, which can be invaluable to a project which demands a quick response time.
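The node-and-edge model described above can be sketched with plain Python dictionaries (illustrative data only; real graph databases add indexing, persistence and a query language on top):

```python
# Nodes carry properties; edges are labelled relationships between them.
nodes = {
    "alice": {"kind": "Person", "name": "Alice"},
    "bob":   {"kind": "Person", "name": "Bob"},
    "neo4j": {"kind": "Database", "name": "Neo4j"},
}
edges = [
    ("alice", "KNOWS", "bob"),
    ("alice", "USES", "neo4j"),
    ("bob",   "USES", "neo4j"),
]

def neighbours(node, rel):
    # Traverse edges of one relationship type - the core graph operation,
    # which a relational database would express as a join.
    return [dst for src, r, dst in edges if src == node and r == rel]

print(neighbours("alice", "USES"))  # ['neo4j']
```

Note that nothing here enforces a schema: a new node or edge type can be added at any time, which is exactly the flexibility the paragraph above describes.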
Image courtesy: DB-Engines

The rise in popularity of graph database models over the last 5 years has been stunning, but not exactly surprising. If we were to drill down to the 3 key factors that have propelled the popularity of graph databases to a whole new level, what would they be? Let's find out.

Major players entering the graph database market

About a decade ago, the graph database family included just Neo4j and a couple of other less-popular graph databases. More recently, however, all the major players in the industry, such as Oracle (Oracle Spatial and Graph), Microsoft (Graph Engine), SAP (SAP HANA as a graph store) and IBM (Compose for JanusGraph), have come up with graph offerings of their own. The most recent entrant to the graph database market is Amazon, with Amazon Neptune announced just last year. According to Andy Jassy, CEO of Amazon Web Services, graph databases are becoming a part of the growing trend of multi-model databases. Per Jassy, these databases are finding increased adoption on the cloud as they support a myriad of useful data processing methods. The traditional over-reliance on relational databases is slowly breaking down, he says.

Rise of the Cypher Query Language

With graph databases slowly getting mainstream recognition and adoption, the major companies have identified the need for a standard query language for all graph databases. Similar to SQL, Cypher has emerged as a standard and is a widely-adopted alternative for writing efficient and easy-to-understand graph queries. As of today, the Cypher Query Language is used in popular graph databases such as Neo4j, SAP HANA, RedisGraph and so on. The openCypher project, which develops and maintains Cypher, has also released Cypher for popular Big Data frameworks like Apache Spark. Cypher's popularity has risen tremendously over the last few years.
The primary reason for this is the fact that, like SQL, Cypher's declarative nature allows users to state the actions they want performed on their graph data without specifying exactly how to perform them.

Finding critical real-world applications

Graph databases were in the news as early as 2016, when the Panama Papers leaks were revealed with the help of Neo4j and Linkurious, a data visualization software. In more recent times, graph databases have also found increased applications in online recommendation engines, as well as for performing tasks that include fraud detection and managing social media. Facebook's search app also uses graph technology to map social relationships. Graph databases are also finding applications in virtual assistants to drive conversations - eBay's virtual shopping assistant is an example. Even NASA uses the knowledge graph architecture to find critical data.

What next for graph databases?

With the growing adoption of graph databases, we expect graph-based platforms to soon become foundational elements of many corporate tech stacks. The next focus area for these databases will be practical implementations such as graph analytics and building graph-based applications. The rising number of graph databases will also mean more competition, and that is a good thing - competition will bring more innovation and enable the incorporation of more cutting-edge features. With a healthy and steadily growing community of developers, data scientists and even business analysts, this evolution may be on the cards sooner than we might expect.

Melisha Dsouza
25 Sep 2018
7 min read

How will AI impact job roles in cybersecurity?

"If you want a job for the next few years, work in technology. If you want a job for life, work in cybersecurity." - Aaron Levie, chief executive of cloud storage vendor Box

The field of cybersecurity faces some dire, but somewhat conflicting, predictions about the availability of qualified cybersecurity professionals over the next four or five years. The Global Information Security Workforce Study from the Center for Cyber Safety and Education predicts that the cybersecurity workforce gap will hit 1.8 million workers by 2022. On the flip side, the Cybersecurity Jobs Report, created by the editors of Cybersecurity Ventures, highlights that there will be 3.5 million cybersecurity job openings by 2021, with cybercrime more than tripling the number of job openings over the next 5 years.

Living in the midst of a digital revolution caused by AI, we can safely say that AI will be the solution to the dilemma of "what will become of human jobs in cybersecurity?". Tech enthusiasts believe that we will see a new generation of robots that can work alongside humans and complement or, maybe, replace them in ways not envisioned previously. AI will not only make jobs easier to accomplish, but will also bring about new job roles for the masses. Let's find out how.

Will AI destroy or create jobs in cybersecurity?

AI-driven systems have started to replace humans in numerous industries. However, that doesn't appear to be the case in cybersecurity. While automation can sometimes reduce operational errors and make it easier to scale tasks, using AI to spot cyberattacks isn't completely practical, because such systems yield a large number of false positives. AI lacks the contextual awareness of humans, which can lead to attacks being wrongly identified or missed completely. As anyone who's ever tried to automate something knows, automated machines aren't great at dealing with exceptions that fall outside of the parameters to which they have been programmed.
Eventually, human expertise is needed to analyze potential risks or breaches and make critical decisions. It's also worth noting that completely relying on artificial intelligence to manage security only leads to more vulnerabilities - attacks could, for example, exploit the machine element in automation.

Automation can support cybersecurity professionals - but shouldn't replace them

Supported by the right tools, humans can do more. They can focus on critical tasks where an automated machine or algorithm is inappropriate. In the context of cybersecurity, artificial intelligence can do much of the 'legwork' at scale in processing and analyzing data, to help inform human decision making. Ultimately, this isn't a zero-sum game - humans and AI can work hand in hand to great effect.

AI2

Take, for instance, the project led by the experts at the Massachusetts Institute of Technology's Computer Science and Artificial Intelligence Lab. AI2 (Artificial Intelligence + Analyst Intuition) is a system that combines the capabilities of AI with the intelligence of human analysts to create an adaptive cybersecurity solution that improves over time. The system uses the PatternEx machine learning platform, and combs through data looking for meaningful, predefined patterns. For instance, a sudden spike in postback events on a webpage might indicate an attempt at staging a SQL injection attack. The top results are then presented to a human analyst, who separates out any false positives and flags legitimate threats. The information is then fed into a virtual analyst that uses human input to learn and improve the system's detection rates. In future iterations, a more refined dataset is presented to the human analyst, who goes through the results and once again "teaches" the system to make better decisions. AI2 is a perfect example that shows man and machine can complement each other's strengths to create something even more effective.
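The feedback loop just described can be caricatured in a few lines of Python. This is purely an illustrative sketch of the human-in-the-loop pattern, not the PatternEx algorithm; the events and scoring rule are made up:

```python
# Toy human-in-the-loop loop: a model ranks events, an "analyst"
# labels the top one, and the label feeds back into future scoring.
events = [
    {"id": 1, "postbacks": 900},   # suspicious spike in postback events
    {"id": 2, "postbacks": 12},
    {"id": 3, "postbacks": 700},
]
labels = {}                         # grows with each analyst review

def score(event):
    # Unsupervised pass: flag unusually high postback rates...
    s = event["postbacks"] / 1000
    # ...but defer to any human label we already have.
    if event["id"] in labels:
        s = 1.0 if labels[event["id"]] == "attack" else 0.0
    return s

# Present the top-ranked event to the analyst, who labels it.
top = max(events, key=score)
labels[top["id"]] = "attack"        # analyst confirms a real threat

print(top["id"], score(top))        # the next pass trusts the feedback
```

Each pass through this loop shrinks the pile of false positives the human has to look at, which is the "adaptive" behaviour the AI2 description is getting at.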
It's worth remembering that in any company that uses AI for cybersecurity, automated tools and techniques require significant algorithm training and data markup.

New cybersecurity job roles and the evolution of the job market

The bottom line of this discussion is that AI will not destroy cybersecurity jobs, but it will drastically change them. The primary focus of many cybersecurity jobs today is going through the hundreds of security tools available and determining which tools and techniques are most appropriate for their organization's needs. Of course, as systems move to the cloud, these decisions will already be made, because cloud providers will offer in-built security solutions. This means that the number of companies that need a full staff of cybersecurity experts will be drastically reduced. Instead, companies will need more individuals who understand issues like the potential business impact and risk of different projects and architectural decisions. This demands a very different set of skills and knowledge compared to the typical current cybersecurity role - it is less directly technical and will require more integration with other key business decision makers. AI can provide assistance, but it can't offer easy answers.

Humans and AI working together

Companies concerned with cybersecurity legal compliance and effective real-world solutions should note that cybersecurity and information technology professionals are best suited for tasks such as risk analysis, policy formulation and cyber attack response. Human intervention can help AI systems learn and evolve. Take the example of the Spain-based antivirus company Panda Security, which had a number of people reverse-engineering malicious code and writing signatures. Today, to keep pace with overflowing amounts of data, the company would need hundreds of thousands of engineers to deal with malicious code.
Enter AI, and only a small team of engineers is required to look at more than 200,000 new malware samples per day.

Is AI going to steal cybersecurity engineers' jobs?

So what about the former employees who used to perform this job? Have they been laid off? The answer is a straight no! But they will need to upgrade their skill set. In the world of cybersecurity, AI is going to create new jobs, as it throws up new problems to be analyzed and solved. It's going to create what are being called "new collar" jobs - this is something that IBM's hiring strategy has already taken into account. Once graduates enter the IBM workforce, AI enters the equation to help them get a fast start. Even junior analysts can have the ability to investigate new malware infecting the mobile phones of employees. AI quickly researches the new malware impacting the phones, identifies the characteristics reported by others, and provides a recommended course of action. This relieves analysts of the manual work of going through reams of data and lines of code - in theory, it should make their job more interesting and more fun.

Artificial intelligence and the human workforce, then, aren't in conflict when it comes to cybersecurity. Instead, they can complement each other to create new job opportunities that will test the skills of the upcoming generation, and lead experienced professionals into new and maybe more interesting directions. It will be interesting to see how the cybersecurity workforce makes use of AI in the future.

Amarabha Banerjee
19 Jun 2018
6 min read

5 things you need to learn to become a server-side web developer

Demand for back end web developers is ringing out loud, and companies seek qualified server-side developers for their teams. The back-end specialist's comprehensive set of knowledge and skills helps them realize their potential in versatile web development projects. Before diving into what it takes to succeed at back end development as a profession, let's look at what it's about.

In simple words, the back end is the invisible part of any application that activates all its internal elements. If the front end answers the question of "how does it look", then the back end, or server-side web development, deals with "how does it work". A back end developer is the one who deals with the administrative part of the web application, the internal content of the system, and server-side technologies such as the database, architecture and software logic. If you intend to become a professional server-side developer, then there are a few basic steps which will ease your journey. In this article we have listed five aspects of server-side development - servers, databases, networks, queues and frameworks - which you must master to become a successful server-side web developer.

Servers and databases

At the heart of server-side development are servers: computers and storage devices, connected to the internet, that hold an application's data. So every time you ask your browser to load a web page, the data stored on the servers is accessed and sent to the browser in a certain format. The bigger the application, the larger the amount of data stored server-side. The larger the data, the higher the possibility of lag and slow performance. Databases are the structured stores in which that data is kept. There are two different types of databases - relational and non-relational. Both have their own pros and cons.
Some of the popular options you can learn to take your skills to the next level are SQL Server and MySQL on the relational side, and NoSQL stores such as MongoDB and DynamoDB.

Static and dynamic servers: Static servers are physical drives where application data, CSS and HTML files, pictures, and images are stored. Dynamic servers add another layer between storage and the browser; they are often known as application servers. The primary function of these application servers is to process the data and format it for the web page when a data-fetching operation is initiated from the browser. This makes saving data much easier and loading it much faster. For example, Wikipedia's servers hold huge amounts of data, but it is not stored as HTML pages; it is stored as raw data. When the browser sends a query, the application server processes the raw data, formats it as HTML, and sends it to the browser. This makes the process much faster and saves physical storage space.

If you want to go a step further and think futuristically, the latest trend is moving your servers to the cloud, which means the server-side tasks are performed by cloud-based services like Amazon AWS and Microsoft Azure. This simplifies your task as a back-end developer: you simply decide which services you need to best run your application, and the rest is taken care of by the cloud service provider. Another aspect of server-side development generating a lot of interest among developers is serverless development. It is based on the idea that the cloud provider allocates server space on demand, so you don't have to manage back-end resources and capacity yourself. In a way the name "serverless" is a misnomer: the servers are still there, they are just in the cloud, and you don't have to worry about them.
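Stepping back to application servers for a moment, their core job, turning raw stored data into HTML only when the browser asks, can be sketched with a tiny Python function. The function name and record fields here are hypothetical, standing in for whatever templating a real application server performs:

```python
# Hypothetical "application server" step: raw records come back from storage,
# and a rendering function formats them into HTML on demand.
def render_article(record):
    """Format a raw data record into an HTML fragment only when requested."""
    return "<article><h1>{title}</h1><p>{body}</p></article>".format(**record)

# The stored form is compact raw data, not pre-built HTML pages.
raw_record = {
    "title": "Dynamic servers",
    "body": "Data is stored raw and rendered on request.",
}
html = render_article(raw_record)
print(html)
```

This is why dynamic serving saves storage: only the raw record is kept, and the (larger) HTML exists just long enough to be sent to the browser.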
The primary role of a back-end developer in a serverless system is to figure out the best possible services, optimize the running cost on the cloud, and deploy and monitor the system for non-stop, robust performance.

The communication protocol: The protocol that defines data transfer between the client side and the server side is called the HyperText Transfer Protocol (HTTP). When a search request is typed in the browser, an HTTP request with a URL is sent to the server, and the server sends back a response message indicating either success or an error such as "page not found". When an HTML page is returned for a query, it is rendered by the web browser. While processing the response, the browser may discover links to other resources (for example, an HTML page usually references JavaScript and CSS files) and send separate HTTP requests to download them. Both static and dynamic websites use exactly the same communication protocol and patterns.

We have progressed a long way from the initial communication protocols, and newer technologies like TLS and IPv6 have taken over the web communication domain. Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), now deprecated by the Internet Engineering Task Force (IETF), are cryptographic protocols that provide communications security over a computer network. They were introduced primarily to protect user data. Similarly, new addressing schemes had to be introduced in the late '90s to cater to the growing number of internet users: the original IPv4 scheme, which identifies each server by its IP address, is gradually being replaced by IPv6, which can provide 2^128 (about 3.4×10^38) addresses.

Message queuing: This is one of the most important aspects of creating fast and dynamic web applications.
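Before moving on to message queuing, the HTTP request/response exchange described above can be sketched in plain text, since HTTP/1.1 messages are just structured strings. The helper names below are illustrative, not part of any library:

```python
def build_request(host, path="/"):
    """Assemble a minimal HTTP/1.1 GET request message by hand."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"  # blank line terminates the header section
    )

def parse_status(response_text):
    """Pull the numeric status code out of the response's status line."""
    status_line = response_text.split("\r\n", 1)[0]  # e.g. "HTTP/1.1 404 Not Found"
    return int(status_line.split(" ")[1])

request = build_request("example.com", "/missing-page")
print(request.splitlines()[0])                                   # GET /missing-page HTTP/1.1
print(parse_status("HTTP/1.1 404 Not Found\r\n\r\n<html></html>"))  # 404
```

In practice you would let a client library (or the browser) build and parse these messages, but seeing the raw format makes the "request succeeded" vs "page not found" responses mentioned above concrete.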
Message queuing is the stage where data is queued according to the different responses and then delivered to the browser. The process is asynchronous, which means the server and the browser need not interact with the message queue at the same time. Popular tools such as RabbitMQ and ActiveMQ, and protocols such as MQTT, provide real-time message-queuing functionality.

Server-side frameworks and languages: Last comes one of the most important pointers. If you are a developer with a particular language in mind, you can use a framework for that language to add functionality to your application easily and efficiently. Some of the popular server-side frameworks are Node.js for JavaScript, Django for Python, Laravel for PHP, and Spring for Java. Using these frameworks effectively, however, requires some experience in the respective language.

Now that you have a broad understanding of what server-side web development is and what its components are, you can jump right into server-side development, databases, and protocol management on your way to becoming a successful professional back-end web developer.

The best backend tools in web development
Preparing the Spring Web Development Environment
Is novelty ruining web development?
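The asynchronous hand-off that message queues provide can be sketched with Python's standard library. Real brokers like RabbitMQ add persistence, routing, and networking on top, but the essential idea, that producer and consumer never talk directly and never have to be ready at the same moment, is the same:

```python
import queue
import threading

# The queue is the only thing the two sides share: neither waits for the
# other to be ready, they just put and get at their own pace.
message_queue = queue.Queue()

def producer():
    for i in range(3):
        message_queue.put(f"response-{i}")
    message_queue.put(None)  # sentinel: nothing more to deliver

def consumer(delivered):
    while True:
        msg = message_queue.get()
        if msg is None:
            break
        delivered.append(msg)

delivered = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(delivered,))
t1.start(); t2.start()
t1.join(); t2.join()
print(delivered)  # ['response-0', 'response-1', 'response-2']
```

Swap the in-process `queue.Queue` for a broker connection and the producer and consumer can even live on different machines, which is exactly the decoupling that makes queued web back ends fast and resilient.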

How is Artificial Intelligence changing the mobile developer role?

Bhagyashree R
15 Oct 2018
10 min read
Last year at Google I/O, Sundar Pichai, the CEO of Google, said: "We are moving from a mobile-first world to an AI-first world." Is this only applicable to Google? Not really. In the recent past we have seen several advancements in Artificial Intelligence and, in parallel, a plethora of intelligent apps coming onto the market. These advancements enable developers to take their apps to the next level by integrating recommendation services, image recognition, speech recognition, voice translation, and many more capabilities.

Artificial Intelligence is becoming a potent tool for mobile developers to experiment and innovate with. The AI components that are integral to mobile experiences, such as voice-based assistants and location-based services, increasingly require mobile developers to have a basic understanding of Artificial Intelligence to be effective. Of course, you don't have to be an Artificial Intelligence expert to include intelligent components in your app, but you should understand something about what you're building into your app and why. After all, AI on mobile is not just a matter of calling an API, is it? There's more to it, and in this article we will explore how Artificial Intelligence will shape the mobile developer role in the immediate future.

Read also: AI on mobile: How AI is taking over the mobile devices marketspace

What is changing in the mobile developer role?

Focus shifting to data: With Artificial Intelligence becoming more and more accessible, intelligent apps are becoming the new norm for businesses. Artificial Intelligence strengthens the relationship between brands and customers, inspiring developers to build smart apps that increase user retention. This also means that developers have to direct their focus to data. They have to understand things like: how will the data be collected? How will it be fed to machines, and how often will data input be needed?
When nearly 1 in 4 people abandon an app after its first use, as a mobile app developer you need to rethink how you drive in-app personalization and engagement.

Explore a "humanized" way of user-app interaction: With assistants such as Siri and Google Assistant now mainstream, "humanizing" the interaction between the user and the app has become the norm. "Humanizing" is the process by which the app becomes relatable to the user; the more effectively it is done, the more the end user will interact with the app. Users now want easy navigation and search, and Artificial Intelligence fits the scenario perfectly. Advances in technologies like text-to-speech, speech-to-text, Natural Language Processing, and cloud services in general have contributed to the mass adoption of these types of interfaces.

Companies increasingly expect mobile developers to be comfortable working with AI functionality: Artificial Intelligence is the future. Companies now expect their mobile developers to know how to handle the huge amount of data generated every day and how to use it. Here is an example of what Google wants its engineers to do:

"We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day."

This open-ended requirement list shows that it is the right time to learn and embrace Artificial Intelligence as soon as possible.

What skills do you need to build intelligent apps? Ideally, data scientists are the ones who conceptualize mathematical models, and machine learning engineers are the ones who translate them into code and train the models.
But when you are working in a resource-tight environment, for example in a start-up, you may be responsible for doing the end-to-end job. It is not as scary as it sounds, because you have several resources to get started with!

Taking your first steps with machine learning as a service: Learning anything starts with motivating yourself. Diving directly into the maths and coding of machine learning might exhaust and bore you. That's why it's a good idea to know first what the end goal of your learning process is and what types of solutions are possible using machine learning. There are many products you can try to get started quickly, such as Google Cloud AutoML (beta), Firebase ML Kit (beta), and the Fritz mobile SDK, among others.

Read also: Machine Learning as a Service (MLaaS): How Google Cloud Platform, Microsoft Azure, and AWS are democratizing Artificial Intelligence

Getting your hands dirty: After this warm-up, the next step is creating and training your own model. This is where you'll be introduced to TensorFlow Lite, which is going to be your best friend throughout your journey as a machine learning mobile developer. There are many other machine learning tools coming onto the market that make building AI into mobile apps easier. For instance, you can use Dialogflow, a Natural Language Understanding (NLU) platform that makes it easy for developers to design and integrate conversational user interfaces into mobile apps, web applications, devices, and bots. You can then integrate it with Alexa, Cortana, Facebook Messenger, and other platforms your users are on.

Read also: 7 Artificial Intelligence tools mobile developers need to know

For practice, you can leverage an excellent codelab by Google, TensorFlow for Poets, which guides you through creating and training a custom image classification model.
Through this codelab you will learn the basics of data collection, model optimization, and the other key components involved in creating your own model. The codelab is divided into two parts: the first covers creating and training the model, and the second focuses on TensorFlow Lite, the mobile version of TensorFlow that allows you to run the same model on a mobile device.

Mathematics is the foundation of machine learning: Love it or hate it, machine learning and Artificial Intelligence are built on mathematical principles like calculus, linear algebra, probability, statistics, and optimization. You need to learn the essential foundational concepts and the notation used to express them. Mathematics helps you select the right algorithm, taking into account accuracy, training time, model complexity, number of parameters, and number of features. It is also needed when choosing parameter settings and validation strategies, and when identifying underfitting and overfitting through the bias-variance tradeoff.

Read also: Bias-Variance tradeoff: How to choose between bias and variance for your machine learning model [Tutorial]
Read also: What is Statistical Analysis and why does it matter?

What are the key aspects of Artificial Intelligence for mobile to keep in mind?

Understanding the problem: Your number one priority should be the user problem you are trying to solve. Instead of randomly integrating a machine learning model into an application, developers should understand how the model applies to the particular application or use case. This is important because you might end up building a great machine learning model with an excellent accuracy rate, but if it does not solve any problem, it will end up being redundant.
You must also understand that while many business problems require machine learning approaches, not all of them do; most business problems can be solved through simple analytics or a baseline approach.

Data is your best friend: Machine learning is dependent on data; the data that you use, and how you use it, will define the success of your machine learning model. You can make use of the thousands of open source datasets available online. Google recently launched a dataset search tool, Google Dataset Search, which makes it easier to find the right dataset for your problem. Typically there's no shortage of data; however, the abundance of data does not mean that it is clean, reliable, or usable as intended. Data cleanliness is a huge issue. For example, a typical company will have multiple customer records for a single individual, all of which differ slightly. If the data isn't clean, it isn't reliable. The bottom line is that it's bad practice to just grab the data and use it without considering its origin.

Read also: Best Machine Learning Datasets for beginners

Decide which model to choose: A machine learning algorithm is trained, and the artifact created by the training process is called the machine learning model. An ML model is used to find patterns in data without the developer having to explicitly program those patterns. We cannot look through such a huge amount of data and understand the patterns ourselves; think of the model as a helper that looks through all those terabytes of data and extracts knowledge and insights from it.

You have two choices here: you can create your own model or use a pre-built one. While several pre-built models are available, your business-specific use cases may require specialized models to yield the desired results. These off-the-shelf models may also need some fine-tuning or modification to deliver the value the app is intended to provide.
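The data cleanliness point above, multiple slightly different records for one customer, can be sketched in a few lines of Python. The normalization rules here are deliberately naive and purely illustrative; real record-linkage pipelines use far more sophisticated matching:

```python
def normalize(record):
    """Reduce a customer record to a comparable key: differences in case,
    spacing, and stray whitespace should not create 'different' customers."""
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return (name, email)

def deduplicate(records):
    seen = {}
    for rec in records:
        key = normalize(rec)
        seen.setdefault(key, rec)  # keep the first record seen for each key
    return list(seen.values())

# Two of these three rows describe the same person.
customers = [
    {"name": "Jane  Doe", "email": "JANE@example.com"},
    {"name": "jane doe",  "email": "jane@example.com "},
    {"name": "John Roe",  "email": "john@example.com"},
]
unique = deduplicate(customers)
print(len(unique))  # 2
```

Even this toy example shows why cleaning must happen before training: feed the raw list to a model and "Jane Doe" silently counts twice.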
Read also: 10 machine learning algorithms every engineer needs to know

Thinking about resource utilization is important: Artificial Intelligence-powered apps, and apps in general, should be developed with resource utilization in mind. Though companies are working towards improving mobile hardware, it currently cannot match what we can accomplish with GPU clusters in the cloud. Therefore, developers need to consider how the models they intend to use will affect resources, including battery power and memory usage. In terms of computational resources, inferencing (making predictions) is less costly than training. Inferencing on the device means that models must be loaded into RAM and still require significant computational time on the GPU or CPU. In scenarios that involve continuous inferencing, such as audio and image data that can chew up bandwidth quickly, on-device inferencing is a good choice.

Learning never stops: Maintenance is important, and to do it you need to establish a feedback loop and a process and culture of continuous evaluation and improvement. A change in consumer behavior or a market trend can have a negative impact on the model. Eventually something will break or no longer work as intended, which is another reason why developers need to understand the basics of what they're adding to an app. You need some knowledge of how the Artificial Intelligence component you just put together works, and how it could be made to run faster.

Wrapping up: Before falling for the Artificial Intelligence and machine learning hype, it's important to understand and analyze the problem you are trying to solve. Examine whether applying machine learning can improve the quality of the service, and decide whether this improvement justifies the effort of deploying a machine learning model.
If you just want a simple API endpoint and don't want to dedicate much time to deploying a model, cloud-based web services are the best option for you. Tools like ML Kit for Firebase look promising and seem like a good choice for startups or developers just starting out. TensorFlow Lite and Core ML are good options if you have mobile developers on your team or if you're willing to get your hands a little dirty. Artificial Intelligence is influencing the app development process by giving us a data-driven approach to solving user problems. It wouldn't be surprising if, in the near future, Artificial Intelligence becomes a defining factor in app developers' expertise and creativity.

10 useful Google Cloud Artificial Intelligence services for your next machine learning project [Tutorial]
How Artificial Intelligence is going to transform the Data Center
How Serverless computing is making Artificial Intelligence development easier
Looking at the different types of Lookup cache

Savia Lobo
20 Nov 2017
6 min read
[box type="note" align="" class="" width=""]The following is an excerpt from the book Learning Informatica PowerCenter 10.x by Rahul Malewar. In this article we walk through the various types of lookup cache, based on how each cache is defined.[/box]

A cache is the temporary memory created when you execute a process. It is created automatically when a process starts and deleted automatically once the process completes. The amount of cache memory is decided based on the property you define at the transformation or session level. You usually leave the property at its default, which allows the cache to grow as required. If the size required for caching the data exceeds the defined cache size, the process fails with an overflow error. There are different types of caches available.

Building the lookup cache: sequential or concurrent. You can define the session property to create the cache either sequentially or concurrently.

Sequential cache: When you choose to build the cache sequentially, the Integration Service caches the data row by row as records enter the lookup transformation. When the first record enters the lookup transformation, the lookup cache is created and stores the matching record from the lookup table or file. This way the cache stores only matching data, which saves cache space by not storing unnecessary rows.

Concurrent cache: When you choose to build the cache concurrently, the Integration Service does not wait for data to flow from the source; it caches the complete lookup data first. Once caching is complete, it allows data to flow from the source. With a concurrent cache, performance improves compared to a sequential cache, since scanning happens internally against the data already stored in the cache.

Persistent cache, the permanent one: You can configure the cache to save the data permanently.
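Before moving on, the sequential vs concurrent distinction above can be illustrated with a small Python sketch. Informatica builds these caches internally from session properties, so the class names and the dictionary standing in for the lookup source are purely illustrative:

```python
# Stand-in for the lookup table or file.
lookup_table = {1: "retail", 2: "wholesale", 3: "online"}

class SequentialCache:
    """Row-wise: a key is fetched from the source and cached only
    when an incoming record actually asks for it."""
    def __init__(self, source):
        self.source, self.cache = source, {}

    def lookup(self, key):
        if key not in self.cache:
            self.cache[key] = self.source.get(key)
        return self.cache[key]

class ConcurrentCache:
    """Eager: the entire lookup source is cached up front,
    before any rows flow from the source."""
    def __init__(self, source):
        self.cache = dict(source)

    def lookup(self, key):
        return self.cache.get(key)

seq = SequentialCache(lookup_table)
seq.lookup(2)
print(len(seq.cache))   # 1 -- only the matched key has been cached
con = ConcurrentCache(lookup_table)
print(len(con.cache))   # 3 -- everything cached before processing starts
```

The trade-off mirrors the text: the sequential version saves space by caching only matched rows, while the concurrent version pays the full caching cost up front so that every later lookup is a fast in-memory hit.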
By default, the cache is created as non-persistent; that is, it is deleted once the session run completes. If the lookup table or file does not change across session runs, you can reuse an existing persistent cache. Suppose you have a process scheduled to run every day, and you use a lookup transformation against a reference table that is not supposed to change for six months. With a non-persistent cache, the same data is cached anew every day, wasting time and space. If you choose a persistent cache instead, the Integration Service makes the cache permanent, in the form of a file in the $PMCacheDir location, so you save the time spent creating and deleting the cache memory every day. When the data in the lookup table does change, you need to rebuild the cache. You can define a condition in the session task to rebuild the cache by overwriting the existing one; to do so, check the rebuild option in the session properties.

Sharing the cache: named or unnamed. You can enhance performance and save cache memory by sharing the cache when multiple lookup transformations are used in a mapping. If both lookup transformations have the same structure, sharing the cache enhances performance by creating the cache only once instead of multiple times. The cache you share can be either named or unnamed.

Sharing an unnamed cache: If multiple lookup transformations are used in a single mapping, you can share an unnamed cache. Since the lookup transformations are present in the same mapping, naming the cache is not mandatory. The Integration Service creates the cache while processing the first record in the first lookup transformation and shares the cache with the other lookups in the mapping.
Sharing a named cache: You can share a named cache with multiple lookup transformations in the same mapping or in another mapping. Since the cache is named, you can reference the same cache by name in the other mapping. When you process the first mapping with a lookup transformation, the cache is saved in the defined cache directory under the defined cache file name. When you process the second mapping, it looks in the same location for the same cache file and uses its data; if the Integration Service does not find the cache file, it creates a new cache. If you run multiple sessions simultaneously that use the same cache file, the Integration Service processes them all successfully only if the lookup transformations are configured as read-only against the cache. If both lookup transformations try to update the cache file, or one tries to read the cache while the other tries to update it, the session fails because of the processing conflict. Sharing the cache enhances performance by reusing an already-created cache, saving processing time and repository space by not storing the same data multiple times for lookup transformations.

Modifying the cache: static or dynamic. When you create a cache, you can configure it to be static or dynamic.

Static cache: A cache is said to be static if it does not change with changes in the lookup table; a static cache is not synchronized with the lookup table. By default, the Integration Service creates a static cache. The lookup cache is created as soon as the first record enters the lookup transformation, and the Integration Service does not update the cache while processing the data.

Dynamic cache: A cache is said to be dynamic if it changes with changes in the lookup table; a dynamic cache is synchronized with the lookup table.
You can set a lookup transformation property to make the cache dynamic. The lookup cache is created as soon as the first record enters the lookup transformation, and the Integration Service keeps updating the cache while processing the data. The Integration Service marks a record as an insert for each new row added to the dynamic cache, marks a record as an update when it changes, and marks every record that doesn't change as unchanged. You use a dynamic cache when processing slowly changing dimension tables: for every record inserted into the target, the record is inserted into the cache, and for every record updated in the target, the record is updated in the cache. A similar process happens for deleted and rejected records.
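The insert/update/unchanged bookkeeping described above can be sketched in Python. Informatica performs this classification internally while rows flow through the lookup transformation; the hypothetical function below just mirrors the logic:

```python
def sync_dynamic_cache(cache, row_key, row_value):
    """Classify an incoming row the way a dynamic cache would:
    a new key is an insert, a changed value is an update,
    and everything else is unchanged."""
    if row_key not in cache:
        cache[row_key] = row_value
        return "insert"
    if cache[row_key] != row_value:
        cache[row_key] = row_value
        return "update"
    return "unchanged"

cache = {}
print(sync_dynamic_cache(cache, 101, "Alice"))   # insert
print(sync_dynamic_cache(cache, 101, "Alicia"))  # update
print(sync_dynamic_cache(cache, 101, "Alicia"))  # unchanged
```

This is exactly the behavior that makes a dynamic cache suitable for slowly changing dimensions: the cache stays synchronized with the target as each row is classified and applied.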

How self-service analytics is changing modern-day businesses

Amey Varangaonkar
20 Nov 2017
6 min read
To stay competitive in today's economic environment, organizations can no longer rely on just their IT team for all their data consumption needs. At the same time, the need to get quick insights for smarter and more accurate business decisions is now stronger than ever. As a result, there has been a sharp rise in a new kind of analytics, where the information seekers can themselves create and access a specific set of reports and dashboards, without IT intervention. This is popularly termed self-service analytics.

Gartner defines self-service analytics as: "A form of business intelligence (BI) in which line-of-business professionals are enabled and encouraged to perform queries and generate reports on their own, with nominal IT support."

Expected to become a $10 billion market by 2022, self-service analytics is characterized by simple, intuitive, and interactive BI tools that have basic analytic and reporting capabilities with a focus on easy data access. It empowers business users to access relevant data and extract insights from it without needing to be experts in statistical analysis or data mining. Today, many tools and platforms for self-service analytics are already on the market, with Tableau, Microsoft Power BI, IBM Watson, QlikView, and Qlik Sense being some of the major ones. Not only have these empowered users to perform all kinds of analytics with accuracy, but their reasonable pricing, in-tool guidance, and sheer ease of use have also made them very popular among business users.

Rise of the citizen data scientist: The rise in popularity of self-service analytics has led to the coining of a media-favored term: the "citizen data scientist". But what does the term mean? Citizen data scientists are business users and other professionals who can perform less intensive data-related tasks, such as data exploration, visualization, and reporting, on their own using just the self-service BI tools.
If Gartner's predictions are to be believed, there will be more citizen data scientists than traditional data scientists in 2019, performing a variety of analytics-related tasks.

How self-service analytics benefits businesses: Allowing end users within a business to perform their own analysis has some important advantages over the traditional BI platforms:

- The time taken to arrive at crucial business insights is drastically reduced, because teams don't have to rely on the IT team to deliver specific reports and dashboards based on the organizational data.
- Quicker insights from self-service BI tools mean businesses can make decisions faster and with higher confidence, and deploy appropriate strategies to maximize business goals.
- Because of their relative ease of use, business users can get up to speed with self-service BI tools in no time and with very little training, compared to being trained on complex BI solutions. This means relatively lower training costs and a democratization of BI analytics, which in turn reduces the workload on the IT team and allows them to focus on their own core tasks.
- Self-service analytics helps users manage data from disparate sources more efficiently, allowing organizations to be more agile in handling new business requirements.

Challenges in self-service analytics: While self-service analytics platforms offer many benefits, they come with their own set of challenges too. Let's look at some of them:

- Defining a clear role for the IT team within the business, by addressing concerns such as:
  - Identifying the right BI tool for the business: among the many tools out there, identifying the best fit is very important.
  - Identifying which processes and business groups can make the best use of self-service BI, and who may require assistance from IT.
  - Setting up the right infrastructure and support system for data analysis and reporting.
  - Answering questions like: who will design complex models and perform high-level data analysis?

Thus, rather than becoming secondary to the business, the role of the IT team becomes even more important when adopting a self-service business intelligence solution.

- Defining a strict data governance policy: This is a critical task, as unauthorized access to organizational data can be detrimental to the business. Identifying the right "power users" (the users who need access to the data and the tools), deciding the level of access to give them, and ensuring the integrity and security of the data are some of the key factors to keep in mind. The IT team plays a major role in establishing strict data governance policies and ensuring the data is safe, secure, and shared only with the right users for self-service analytics.
- Asking the right kind of questions of the data: When users who aren't analysts get access to data and the self-service tools, asking the right questions of the data, in order to get useful, actionable insights from it, becomes highly important. Incorrect analysis can result in wrong or insufficient findings, which might lead to bad decision-making. Regular training sessions and support systems can help a business overcome this challenge.

To read more about the limitations of self-service BI, check out this interesting article.

In conclusion: IDC has predicted that spending on self-service BI tools will grow 2.5 times faster than spending on traditional IT-controlled BI tools by 2020. This is an indicator that many organizations worldwide, of all sizes, will increasingly see self-service analytics as a feasible and profitable way forward.
Today, mainstream adoption of self-service analytics still appears to be in its early stages, due to a general lack of awareness among businesses. Many organizations still depend on the IT team or an internal analytics team for all their data-driven decision-making. As we have already seen, this comes with a lot of limitations, limitations that can easily be overcome by adopting a self-service culture in analytics, boosting the speed, ease of use, and quality of the analytics. By shifting most of the reporting work to power users, and by establishing the right data governance policies, businesses with a self-service BI strategy can grow a culture that fuels agile thinking and innovation, and is thus ready for success in the marketplace.

If you're interested in learning more about popular self-service BI tools, these are some of our premium products to help you get started:

Learning Tableau 10
Tableau 10 Business Intelligence Cookbook
Learning IBM Watson Analytics
QlikView 11 for Developers
Microsoft Power BI Cookbook

5 reasons to learn Generative Adversarial Networks (GANs) in 2018

Generative Adversarial Networks (GANs) are a prominent branch of machine learning research today. Deep neural networks require a lot of data to train on, and they perform poorly when the data provided is insufficient. GANs can overcome this problem by generating new, realistic data, without resorting to tricks like data augmentation. As the application of GANs in the machine learning industry is still in its infancy, it is considered a highly desirable niche skill, and hands-on experience raises the bar even higher in the job market. It can fetch you higher pay than your colleagues and make your resume stand out.

Source: Gartner's Hype Cycle 2017

GANs, along with CNNs and RNNs, are part of the in-demand deep neural network experience in the industry. Here are five reasons why you should learn GANs today, and how Kuntal Ganguly's book, Learning Generative Adversarial Networks, helps you do just that. Kuntal is a big data analytics engineer at Amazon Web Services, with around seven years of experience building large-scale, data-driven systems using big data frameworks and machine learning. He has designed, developed, and deployed several large-scale distributed applications, and is a seasoned author with books ranging across the data science spectrum, from machine learning and deep learning to Generative Adversarial Networks. The book shows how to implement GANs in your machine learning models in a quick and easy format, with plenty of real-world examples and hands-on tutorials.

1. Unsupervised learning now a cakewalk with GANs

A major challenge of unsupervised learning is the massive amount of unlabelled data one needs to work through as part of data preparation. In traditional neural networks, this labeling of data is both costly and time-consuming. A creative aspect of deep learning is now possible using Generative Adversarial Networks.
Here, neural networks are capable of generating realistic images from real-world datasets (such as MNIST and CIFAR). GANs provide an easy way to train DL algorithms by slashing the amount of data required to train the neural network models - and with no labeling of data required. This book uses a semi-supervised approach to solve the problem of unsupervised learning for classifying images; the approach can easily be carried over to a developer's own problem domain.

2. GANs help you change a horse into a zebra using image style transfer

https://www.youtube.com/watch?v=9reHvktowLY

Turning an apple into an orange is magic, and GANs can perform this magic without casting a spell, by transferring image style: the styling of one image is applied to another. GANs can perform image-to-image translations across various domains (such as changing an apple into an orange, or a horse into a zebra) using Cycle-Consistent Generative Adversarial Networks (CycleGANs). Detailed examples of how to turn the image of an apple into an orange using TensorFlow, and how to turn an image of a horse into a zebra using a GAN model, are given in the book.

3. GANs take your text as input and output an image

Generative Adversarial Networks can also be utilized for text-to-image synthesis - for example, generating a photo-realistic image from a caption. To do this, a dataset of images with their associated captions is given as training data. The dataset is first encoded using a hybrid neural network called a character-level convolutional recurrent neural network, which creates a joint representation of the text and images in multimodal space for both the generator and the discriminator.
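The cycle-consistency idea behind the CycleGANs mentioned in point 2 can be illustrated without any deep learning library. In the toy sketch below, the two "generators" are simple invertible functions standing in for trained networks (an assumption made purely for illustration); the point is that the cycle loss penalizes mappings that fail to map an input back to itself:

```python
def cycle_consistency_loss(G, F, xs, ys):
    """L1 cycle loss: x -> G(x) -> F(G(x)) should land back on x,
    and y -> F(y) -> G(F(y)) should land back on y."""
    forward = sum(abs(F(G(x)) - x) for x in xs) / len(xs)
    backward = sum(abs(G(F(y)) - y) for y in ys) / len(ys)
    return forward + backward

# Toy "generators": perfect inverses of each other, so the cycle loss is zero.
G = lambda x: x + 1.0   # stand-in for horse -> zebra
F = lambda y: y - 1.0   # stand-in for zebra -> horse

xs, ys = [0.0, 2.0, 5.0], [1.0, 3.0]
print(cycle_consistency_loss(G, F, xs, ys))   # 0.0
```

In a real CycleGAN, G and F are convolutional networks and this loss is added to the usual adversarial losses, pushing the two translators to stay consistent with each other.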
In this book, Kuntal showcases the technique of stacking multiple generative networks to generate realistic images from textual information using StackGANs. Further, the book goes on to explain the coupling of two generative networks, to automatically discover relationships among various domains (such as the relationship between shoes and handbags, or actors and actresses), using DiscoGANs.

4. GANs + transfer learning = no more model generation from scratch

Source: Learning Generative Adversarial Networks

Data is the basis for training any machine learning model; scarcity of data can lead to a poorly trained model with a high chance of failure. Some real-life scenarios may not have sufficient data, hardware, or resources to train bigger networks to the desired accuracy. So, is training from scratch a must? A well-known deep learning technique that adapts an existing trained model to a similar task at hand is known as transfer learning. This book showcases transfer learning with hands-on examples, and further combines transfer learning with GANs to generate high-resolution realistic images from facial datasets. You will also learn how to create artistic hallucinations on images beyond GANs.

5. GANs help you take machine learning models to production

Most machine learning tutorials, video courses, and books explain the training and evaluation of models. But how do we take a trained model to production, put it to use, and make it available to customers? In this book, the author takes an example - developing a facial correction system using the LFW dataset - to automatically correct corrupted images using your trained GAN model.
This book also covers several techniques for deploying machine learning and deep learning models in production, both in data centers and in the cloud, with microservice-based containerized environments. You will also learn how to run deep models in a serverless environment and with managed cloud services. This article just scratches the surface of what is possible with GANs and why learning them will change how you think about deep neural networks. To know more, grab your copy of Kuntal Ganguly's book on GANs: Learning Generative Adversarial Networks.
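The transfer-learning idea from point 4 above boils down to freezing the weights of a pre-trained base and training only a small new head. The framework-free sketch below is a conceptual illustration: the Layer class and all layer names are invented, and in a real framework such as Keras this corresponds to setting layer.trainable = False. It shows which parameters end up being updated:

```python
class Layer:
    def __init__(self, name, trainable=True):
        self.name, self.trainable = name, trainable

def build_transfer_model(pretrained_base, head):
    # Freeze every layer of the pre-trained base...
    for layer in pretrained_base:
        layer.trainable = False
    # ...so that only the newly added head gets trained.
    return pretrained_base + head

base = [Layer("conv1"), Layer("conv2"), Layer("conv3")]
head = [Layer("dense_new"), Layer("softmax_new")]
model = build_transfer_model(base, head)
print([layer.name for layer in model if layer.trainable])
# ['dense_new', 'softmax_new']
```

Because only the two head layers remain trainable, far less data and compute are needed than when training the whole stack from scratch.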

Sugandha Lahoti
02 Oct 2018
6 min read

What is Statistical Analysis and why does it matter?

As a data developer, the concept or process of data analysis may be clear in your mind. However, although there are similarities between the art of data analysis and that of statistical analysis, there are important differences to be understood as well. This article is taken from the book Statistics for Data Science by James D. Miller. The book takes you on a journey through statistics, from knowing very little to becoming comfortable using various statistical methods for data science tasks. In this article, we've broken things into the following topics:

- What is statistical analysis, and what are its best practices?
- How to establish the nature of data?

What is statistical analysis?

Statistical analysis is sometimes described as the part of a statistical project that involves the collection and scrutiny of a data source in an effort to identify trends within the data. With data analysis, the goal is to validate that the data is appropriate for a need; with statistical analysis, the goal is to make sense of, and draw inferences from, the data. There is a wide range of possible statistical analysis techniques and approaches that can be considered.

How to perform a successful statistical analysis

It is worthwhile to mention some key points for ensuring a successful (or at least productive) statistical analysis effort:

- As soon as you can, decide on your goal or objective. You need to know what the win is - the problem or idea driving the analysis effort. Whatever is driving the analysis, the result obtained must be measurable in some way, and this metric or performance indicator must be identified early.
- Identify key levers. Once you have established your goals and a way to measure performance towards obtaining them, you also need to find out what has an effect on that performance.
- Conduct a thorough data collection. Typically, the more data the better, but in the absence of quantity, always go with quality.
- Clean your data. Make sure your data has been cleaned in a consistent way, so that data issues do not impact your conclusions.
- Model, model, and model your data. Modeling drives modeling: the more you model your data, the more questions you'll have asked and answered, and the better your results will be.
- Take time to grow your statistical analysis skills. It's always a good idea to keep evolving your experience and style of statistical analysis. The way to improve is to do it; another approach is to remodel data you have on hand from other projects to hone your skills.
- Optimize and repeat. Take the time to standardize, follow proven practices, use templates, and test and document your scripts and models, so that you can reuse your best efforts over and over again. You will find this time well spent, and even your better efforts will improve with use.
- Finally, share your work with others! The more eyes, the better the product.

Some interesting advice on ensuring success with statistical projects includes the following quote:

"It's a good idea to build a team that allows those with an advanced degree in statistics to focus on data modeling and predictions, while others in the team - qualified infrastructure engineers, software developers, and ETL experts - build the necessary data collection infrastructure, data pipeline, and data products that enable streaming the data through the models and displaying the results to the business in the form of reports and dashboards."

- G Shapira, 2017

Establishing the nature of data

When asked about the objectives of statistical analysis, one often refers to the process of describing or establishing the nature of a data source.
Establishing the nature of something implies gaining an understanding of it. This understanding can be both simple and complex. For example, can we determine the type of each of the variables or components found within our data source - are they quantitative, comparative, or qualitative? A more advanced statistical analysis aims to identify patterns in data; for example, whether there is a relationship between the variables, or whether certain groups are more likely than others to show certain attributes. Exploring the relationships presented in data may appear similar to identifying a foreign key in a relational database, but in statistics, relationships between components or variables are based upon correlation and causation. Further, establishing the nature of a data source is really a process of modeling that data source. During modeling, the process always involves asking questions such as the following (in an effort to establish the nature of the data):

- What? Common examples are revenue, expenses, shipments, hospital visits, website clicks, and so on. In our example, we are measuring quantities - the amount of product being moved (sales).
- Why? The why will typically depend upon your project's specific objectives, which can vary immensely. For example, we may want to track the growth of a business, the activity on a website, or the evolution of a selected product or market interest. In our transactional data example, we may want to identify over- and under-performing sales types, and determine whether new or repeat customers provide more or fewer sales.
- How? The how will most likely be over a period of time (perhaps a year, month, or week) and then by some other related measure, such as a product, state, region, or reseller. Within our transactional data example, we've focused on the observation of quantities by sale type.
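A first pass at the "what" question - determining which variables are quantitative and which are qualitative - can be automated with nothing but the standard library. The sketch below uses invented example records; a real project would of course reach for a richer profiling tool:

```python
from statistics import mean

def profile(records):
    """Classify each field as quantitative or qualitative, with a basic
    summary (mean, or count of distinct values), as a first step in
    establishing the nature of a data source."""
    summary = {}
    for field in records[0]:
        values = [r[field] for r in records]
        if all(isinstance(v, (int, float)) for v in values):
            summary[field] = ("quantitative", round(mean(values), 2))
        else:
            summary[field] = ("qualitative", len(set(values)))
    return summary

sales = [
    {"region": "East", "sale_type": "new",    "quantity": 12},
    {"region": "West", "sale_type": "repeat", "quantity": 30},
    {"region": "East", "sale_type": "repeat", "quantity": 22},
]
print(profile(sales))
# {'region': ('qualitative', 2), 'sale_type': ('qualitative', 2),
#  'quantity': ('quantitative', 21.33)}
```

Even a crude profile like this immediately suggests the "what" (quantities sold) and the dimensions (region, sale type) along which to slice it.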
Another way to describe establishing the nature of your data is adding context to it, or profiling it. In either case, the objective is to allow the data consumer to better understand the data through visualization. Another motive for adding context or establishing the nature of your data can be to gain a new perspective on it. In this article, we explored the purpose and process of statistical analysis and listed the steps involved in a successful statistical analysis. To learn about statistical regression and why it is important to data science, read the book Statistics for Data Science.

- Estimating population statistics with Point Estimation
- Why You Need to Know Statistics To Be a Good Data Scientist
- Why choose IBM SPSS Statistics over R for your data analysis project
Vincy Davis
27 Jul 2019
10 min read

Understanding security features in the Google Cloud Platform (GCP)

Google's long experience and success in protecting itself against cyberattacks plays to our advantage as customers of the Google Cloud Platform (GCP). From years of warding off security threats, Google is well aware of the security implications of the cloud model. Thus, they provide a well-secured structure for their operational activities, data centers, customer data, organizational structure, hiring process, and user support. Google uses a global-scale infrastructure to provide security to commercial services, such as Gmail, Google Search, and Google Photos, and enterprise services, such as GCP and G Suite. This article is an excerpt from the book Google Cloud Platform for Architects, written by Vitthal Srinivasan, Janani Ravi, and others. In this book, you will learn about the Google Cloud Platform (GCP) and how to manage robust, highly available, and dynamic solutions that drive business objectives. This article gives an insight into the security features of the Google Cloud Platform, the tools that GCP provides for users' benefit, and some best practices and design choices for security.

Security features at Google and on the GCP

Let's start by discussing what we get directly by virtue of using the GCP. These are security protections that we would not be able to engineer for ourselves. Let's go through some of the many layers of security provided by the GCP.

Datacenter physical security: Only a small fraction of Google employees ever get to visit a GCP data center. Those data centers - the zones that we have been talking so much about - would probably seem straight out of a Bond film to those that did: security lasers, biometric detectors, alarms, cameras, and all of that cloak-and-dagger stuff.

Custom hardware and trusted booting: A specific form of security attack, the privileged access attack, is on the rise. These attacks involve malicious code running from the least likely spots you'd expect: the OS image, the hypervisor, or the boot loader.
The only way to really protect against these is to design and build every single element in-house. Google has done exactly that, including the hardware, a firmware stack, curated OS images, and a hardened hypervisor. Google data centers are populated with thousands of servers connected to a local network. Google selects and validates building components from vendors, and designs custom secure server boards and networking devices for its server machines. Google has cryptographic signatures on all low-level components, such as the BIOS, bootloader, kernel, and base OS, to validate that the correct software stack is booting up.

Data disposal: The detritus of the persistent disks and other storage devices that we use is also cleaned thoroughly by Google. This data destruction process involves several steps: an authorized individual wipes the disk clean using a logical wipe, then a different authorized individual inspects the wiped disk. The results of the erasure are stored and logged, and the erased drive is released into inventory for reuse. If a disk is damaged and cannot be wiped clean, it is stored securely and not reused; such devices are periodically destroyed. Each facility where data disposal takes place is audited once a week.

Data encryption: By default, GCP always encrypts all customer data at rest as well as in motion. This encryption is automatic and requires no action on the user's part. Persistent disks, for instance, are already encrypted using AES-256, and the keys themselves are encrypted with master keys. All this key management and rotation is managed by Google.
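The key-wrapping scheme just described - data encrypted with a data key, and that data key itself encrypted with a master key - is commonly called envelope encryption. The sketch below illustrates only the pattern: the XOR "cipher" is a deliberately trivial stand-in for a real cipher such as AES-256, and the key names are invented for illustration:

```python
import secrets

def xor(data: bytes, key: bytes) -> bytes:
    # Toy stand-in for a real cipher such as AES-256; never use for real data.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

master_key = secrets.token_bytes(32)   # master key (KEK), held by the key service
data_key = secrets.token_bytes(32)     # per-object data key (DEK)

ciphertext = xor(b"customer data at rest", data_key)   # data encrypted with the DEK
wrapped_key = xor(data_key, master_key)                # DEK encrypted with the KEK
# Only (ciphertext, wrapped_key) is persisted; the plaintext DEK is discarded.

recovered_key = xor(wrapped_key, master_key)
print(xor(ciphertext, recovered_key))   # b'customer data at rest'
```

The benefit of the pattern is that rotating the master key only requires re-wrapping the small data keys, not re-encrypting every byte of stored data.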
In addition to this default encryption, a couple of other encryption options exist as well; more on those in the following diagram.

Secure service deployment: Google's security documentation often refers to secure service deployment, and it is important to understand that in this context the term service has a specific meaning: a service is the application binary that a developer writes and runs on the infrastructure. Secure service deployment is based on three attributes:

Identity: Each service running on Google infrastructure has an associated service account identity. A service has to submit cryptographic credentials to prove its identity when making or receiving remote procedure calls (RPCs) to other services. Clients use these identities to make sure they are connecting to the intended server, and the server uses them to restrict access to data and methods to specific clients.

Integrity: Google uses a cryptographic authentication and authorization technique at the application layer to provide strong access control at the abstraction level for interservice communication. Google has ingress and egress filtering at various points in its network to avoid IP spoofing. With this approach, Google is able to maximize its network's performance and availability.

Isolation: Google has effective sandbox techniques to isolate services running on the same machine. These include Linux user separation, language- and kernel-based sandboxes, and hardware virtualization. Google also secures the operation of sensitive services, such as cluster orchestration in GKE, on exclusively dedicated machines.

Secure interservice communication: The term interservice communication refers to GCP's resources and services talking to each other. To allow this, the owners of services maintain individual whitelists of services that can access them.
Using these whitelists, the owner of a service can also allow specific IAM identities to connect to the services they manage. Apart from that, the Google engineers on the backend who are responsible for keeping the services running smoothly and without downtime are given special identities to access the services (to manage them, not to modify user-input data). Google encrypts interservice communication by encapsulating application-layer protocols in RPC mechanisms, to isolate the application layer and remove any dependency on network security.

Using Google Front End: Whenever we want to expose a service using GCP, the TLS certificate management, service registration, and DNS are managed by Google itself. This facility is called the Google Front End (GFE) service. For example, a simple file of Python code can be hosted as an application on App Engine, and that application will have its own IP, DNS name, and so on.

In-built DDoS protections: Distributed Denial-of-Service attacks are very well studied, and precautions against such attacks are already built into many GCP services, notably networking and load balancing. Load balancers can be thought of as hardened bastion hosts that serve as lightning rods to attract attacks, and so are suitably hardened by Google to ensure that they can withstand them. HTTP(S) and SSL proxy load balancers, in particular, can protect your backend instances from several threats, including SYN floods, port exhaustion, and IP fragment floods.

Insider risk and intrusion detection: Google constantly monitors the activities of all devices in its infrastructure for suspicious activity. To secure employees' accounts, Google has replaced phishable OTP second factors with U2F-compatible security keys. Google also monitors the devices that its employees use to operate its infrastructure.
Google also conducts periodic checks on the status of OS images and security patches on these devices. Google has a special mechanism for granting access privileges, named application-level access management control, which exposes internal applications only to specific users coming from correctly managed devices and from expected network and geographic locations. Google has a very strict and secure way of managing its administrative access privileges, with rigorous monitoring of employee activities and predefined limits on administrative access.

Google-provided tools and options for security

As we've just seen, the platform already does a lot for us, but we could still end up leaving ourselves vulnerable to attack if we don't design our cloud infrastructure carefully. To begin with, let's understand a few facilities provided by the platform for our benefit.

Data encryption options: We have already discussed Google's default encryption; this encrypts pretty much everything and requires no user action. For instance, all persistent disks are encrypted with AES-256 keys that are automatically created, rotated, and themselves encrypted by Google. In addition to default encryption, a couple of other encryption options are available to users.

Customer-managed encryption keys (CMEK) using Cloud KMS: This option involves a user taking control of the keys that are used, while still storing those keys securely on the GCP using the Key Management Service. The user is now responsible for managing the keys - for creating, rotating, and destroying them. The only GCP service that currently supports CMEK is BigQuery; support for Cloud Storage is in the beta stage.

Customer-supplied encryption keys (CSEK): Here, the user specifies which keys are to be used, but those keys never leave the user's premises.
To be precise, the keys are sent to Google as part of API service calls, but Google only uses these keys in memory and never persists them in the cloud. CSEK is supported by two important GCP services: Cloud Storage buckets and persistent disks on GCE VMs. There is an important caveat here, though: if you lose your key after having encrypted some GCP data with it, you are entirely out of luck - there will be no way for Google to recover that data.

Cloud security scanner: Cloud Security Scanner is a GCP-provided security scanner for common vulnerabilities. It has long been available for App Engine applications, and is now also available in alpha for Compute Engine VMs. This handy utility will automatically scan for and detect the following four common vulnerabilities:

- Cross-site scripting (XSS)
- Flash injection
- Mixed content (HTTP in HTTPS)
- The use of outdated/insecure libraries

Like most security scanners, it automatically crawls an application, follows links, and tries out as many different types of user input and event handlers as possible.

Some security best practices

Here is a list of design choices that you could exercise to cope with security threats such as DDoS attacks:

- Use hardened bastion hosts, such as load balancers (particularly HTTP(S) and SSL proxy load balancers).
- Make good use of the firewall rules in your VPC network. Ensure that incoming traffic from unknown sources, or on unknown ports or protocols, is not allowed through.
- Use managed services such as Dataflow and Cloud Functions wherever possible; these are serverless and so have smaller attack surfaces.
- If your application lends itself to App Engine, it has several security benefits over GCE or GKE, and it can also autoscale quickly, damping the impact of a DDoS attack.
- If you are using GCE VMs, consider the use of API rate limits to ensure that the number of requests to a given VM does not increase in an uncontrolled fashion.
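One common way to implement the kind of API rate limit suggested above is a token bucket. The sketch below is a minimal, framework-free version (the class and parameter names are our own, not a GCP API); timestamps are passed in explicitly to keep the example deterministic:

```python
class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilling at `rate` tokens/sec."""
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, 0.0

    def allow(self, now):
        # Refill proportionally to the time elapsed, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=2)
# Two requests burst through, the third is throttled, and a later one passes
# once the bucket has refilled.
print([bucket.allow(t) for t in (0.0, 0.1, 0.2, 1.5)])
# [True, True, False, True]
```

In production the `now` argument would come from a monotonic clock, and the limiter would sit in front of the VM's API endpoint.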
- Use NAT gateways and avoid public IPs wherever possible, to ensure network isolation.
- Use Google CDN to offload incoming requests for static content. In the event of a storm of incoming user requests, the CDN servers will be at the edge of the network, and traffic into the core infrastructure will be reduced.

Summary

In this article, you learned that the GCP benefits from Google's long experience countering cyber-threats and security attacks targeted at other Google services, such as Google Search, YouTube, and Gmail. There are several built-in security features that already protect users of the GCP from threats that might not even be recognized as existing in an on-premise world. In addition to these built-in protections, all GCP users have various tools at their disposal to scan for security threats and to protect their data. To learn more about the Google Cloud Platform (GCP), head over to the book Google Cloud Platform for Architects.

- Ansible 2 for automating networking tasks on Google Cloud Platform [Tutorial]
- Build Hadoop clusters using Google Cloud Platform [Tutorial]
- Machine learning APIs for Google Cloud Platform

Guest Contributor
19 Jan 2018
7 min read

What you need to know about Generative Adversarial Networks

[box type="note" align="" class="" width=""]We have come to you with another guest post by Indra den Bakker, an experienced deep learning engineer and a mentor on Udacity for many budding data scientists. Indra has also written one of our best-selling titles, Python Deep Learning Cookbook, which covers solutions to various problems in modeling deep neural networks.[/box]

In 2014, we took a significant step in AI with the introduction of Generative Adversarial Networks - better known as GANs - by Ian Goodfellow, amongst others. The real breakthrough of GANs didn't come until 2016; however, the original paper includes many novel ideas that would be exploited in the years to come. Deep learning had already revolutionized many industries by achieving above-human performance, yet many critics argued that these deep learning models couldn't compete with human creativity. With the introduction of GANs, Ian showed that these critics could be wrong.

Figure 1: example of style transfer with deep learning

The idea behind GANs is to create new examples based on a training set - for example, to demonstrate the ability to create new paintings or new handwritten digits. In GANs, two competing deep learning models are trained simultaneously. These networks compete against each other: one model tries to generate new realistic examples; this network is called the generator. The other network tries to classify whether an example originates from the training set or from the generator; this one is called the discriminator. In other words, the generator tries to mislead the discriminator by generating new examples. In the figure below we can see the general structure of GANs.

Figure 2: GAN structure with X as training examples and Z as noise input

GANs are fundamentally different from other machine learning applications. The task of a GAN is unsupervised: we try to extract patterns and structure from data without additional information. Therefore, we don't have a truth label.
GANs shouldn't be confused with autoencoder networks. With autoencoders, we know what the output should be: the same as the input. In the case of GANs, however, we try to create new examples that look like the training examples but are different. It's a new way of teaching an agent to learn complex tasks by imitating an "expert". If the generator is able to fool the discriminator, one could argue that the agent has mastered the task - think of the Turing test.

The best way to explain GANs is to use images as an example, and the resulting output can be fascinating. The most used dataset for GANs is the popular MNIST dataset, which has appeared in many deep learning papers, including the original Generative Adversarial Nets paper.

Figure 3: example of MNIST training images

Let's say that as input we have a bunch of handwritten digits, and we want our model to take these examples and create new handwritten digits. We want the model to learn to write digits in such a way that its output looks like handwritten digits. Note that we don't care which digits the model creates, as long as each looks like one of the digits from 0 to 9. As you may suspect, there is a thin line between generating examples that are exact copies of the training set and newly created images. We need to make sure that the generator generates new images that follow the distribution of the training examples but are slightly different. This is where the creativity needs to come in. In Figure 2, we showed that the generator uses noise - random values - as input. This noise is random, to make sure that the generator creates different output each time. Now that we know what we need and what we want to achieve, let's have a closer look at both model architectures, starting with the generator. We will feed the generator with random noise: a vector of 100 values randomly drawn between -1 and 1. Next, we stack multiple fully connected layers with the Leaky ReLU activation function.
Our training images are grayscale and sized 28x28 pixels, which means that, flattened, we need 784 units in the final layer of our generator - the output of the generator should match the size of the training images. As the activation function for the final layer we use tanh, to make sure the resulting values are squeezed between -1 and 1. The final model architecture of our generator looks as follows:

Figure 4: model architecture of the generator

Next, we define our discriminator model. The most common choice is a mirrored version of the generator, with 784 values as input and, as the final layer, a fully connected layer with one neuron and a sigmoid activation function for binary classification. Keep in mind that both the generator and the discriminator are trained at the same time. The model looks like this:

Figure 5: model architecture of the discriminator

In general, generating new images is the harder task, so it can be beneficial to train the generator twice for each step while the discriminator is trained only once. Another option is to set the learning rate for the discriminator a bit smaller than that of the generator. Tracking the performance of GANs can be tricky: sometimes a lower loss doesn't represent better output. That's why it's a good idea to output the generated images during the training process. In the following figure we can see the digits generated by a GAN after 20 epochs.

Figure 6: example output of generated MNIST images

As we stated in the introduction, GANs didn't get much traction until 2016. They were mostly unstable and hard to train: small adjustments to the model or training parameters resulted in unsatisfying output. Advancements in model architecture and other improvements fixed some of these limitations and unlocked the real potential of GANs. An important improvement was introduced by Deep Convolutional GANs (DCGANs).
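Before moving on to DCGANs, the fully connected architecture described above can be made concrete without pulling in a deep learning framework. The pure-Python sketch below pushes a 100-value noise vector through an untrained single-hidden-layer "generator" ending in tanh (784 outputs), and a mirrored "discriminator" ending in a sigmoid. The hidden size of 128 is an invented simplification; a real implementation would stack several larger layers and actually train the weights:

```python
import math, random

random.seed(0)

def dense(x, n_out):
    # Fully connected layer with small random, untrained weights.
    return [sum(xi * random.uniform(-0.1, 0.1) for xi in x) for _ in range(n_out)]

def leaky_relu(h):
    return [max(0.2 * v, v) for v in h]

def generator(noise):            # 100 values in -> 784 values out, squeezed by tanh
    hidden = leaky_relu(dense(noise, 128))
    return [math.tanh(o) for o in dense(hidden, 784)]

def discriminator(image):        # 784 values in -> one sigmoid "real or fake" score
    hidden = leaky_relu(dense(image, 128))
    return 1 / (1 + math.exp(-dense(hidden, 1)[0]))

noise = [random.uniform(-1, 1) for _ in range(100)]
fake = generator(noise)
print(len(fake))                 # 784, i.e. a flattened 28x28 image
```

The shapes line up exactly as in Figures 4 and 5: the generator's tanh output matches the discriminator's 784-value input, which is what lets the two be trained against each other.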
DCGAN is a network architecture in which both the discriminator and the generator are fully convolutional. The output is more stable for datasets with higher translation invariance, like the Fashion MNIST dataset.

Figure 7: example of Fashion MNIST images generated by a Deep Convolutional Generative Adversarial Network (DCGAN)

There is so much more to discover with GANs, and huge potential is still to be unlocked. According to Yann LeCun, one of the fathers of deep learning, GANs are the most important advancement in machine learning in the last 20 years. GANs can be used for many different applications, ranging from 3D face generation to upscaling the resolution of images and text-to-image synthesis. GANs might be the stepping stone we have been waiting for to add creativity to machines.

[author title="Author's Bio"]Indra den Bakker is an experienced deep learning engineer and mentor on Udacity. He is the founder of 23insights, part of NVIDIA's Inception program - a machine learning start-up building solutions that transform the world's most important industries. For Udacity, he mentors students pursuing a Nanodegree in deep learning and related fields, and he is also responsible for reviewing student projects. Indra has a background in computational intelligence and worked for several years as a data scientist for IPG Mediabrands and Screen6 before founding 23insights.[/author]

Melisha Dsouza
13 Sep 2018
7 min read

How AI is going to transform the Data Center

According to Gartner analyst Dave Cappuccio, 80% of enterprises will have shut down their traditional data centers by 2025, compared to just 10% today. The figure is fitting considering the host of problems faced by traditional data centers, and the solution lies right in front of us: incorporating intelligence into them. Supporting this claim, Gartner also predicts that by 2020 more than 30 percent of data centers that fail to implement AI and machine learning will cease to be operationally and economically viable.

Across the globe, data science and AI are influencing the design and development of modern data centers. With the daily surge in the amount of data, traditional data centers will eventually slow down and produce inefficient output. By utilizing AI in ingenious ways, data center operators can drive efficiency up and costs down. A fitting example of this is the tier-two automated control system implemented at Google to cool its data centers autonomously. The system makes all the cooling-plant tweaks on its own, continuously and in real time, saving up to 30% of the plant's energy annually. Source: Data Center Knowledge

AI has enabled data center operators to add more workloads on the same physical silicon architecture. They can aggregate and analyze data quickly and generate productive outputs, which is especially beneficial to companies that deal with immense amounts of data, such as hospitals, genomic systems, airports, and media companies.

How is AI facilitating data centers

Let's look at some of the ways that intelligent data centers solve the issues faced by traditionally operated data centers.

#1 Energy Efficiency

The Delta Airlines data center outage in 2016, attributed to an electrical-system failure over a three-day period, cost the airline around $150 million and grounded about 2,000 flights.
This situation could easily have been averted had the data centers used machine learning in their operations. As data centers get ever bigger, more complex, and increasingly connected to the cloud, artificial intelligence is becoming an essential tool for keeping things from overheating while saving power at the same time. According to the Energy Department's U.S. Data Center Energy Usage Report, the power usage of data centers in the United States has grown at about a 4 percent rate annually since 2010 and is expected to hit 73 billion kilowatt-hours by 2020, more than 1.8 percent of the country's total electricity use. Data centers also contribute about 2 percent of the world's greenhouse gas emissions. AI techniques can do a lot to make these processes more efficient, more secure, and less expensive. One of the keys to better efficiency is keeping things cool, a necessity in any area of computing. Google and DeepMind's (Alphabet Inc.'s AI division) use of AI to directly control a data center has reduced energy use for cooling by about 30 percent.

#2 Server optimization

Data centers have to maintain physical servers and storage equipment. AI-based predictive analysis can help data centers distribute workloads across their many servers, making loads more predictable and more easily manageable. The latest load-balancing tools with built-in AI capabilities are able to learn from past data and distribute load more efficiently. Companies will be able to better track server performance, disk utilization, and network congestion, making it faster to optimize server storage systems, find possible fault points in the system, improve processing times, and reduce risk factors.

#3 Failure prediction / troubleshooting

Unplanned downtime in a data center can lead to significant financial loss.
Data center operators need to quickly identify the root cause of a failure, so they can prioritize troubleshooting and get the data center up and running before any data loss or business impact takes place. Self-managing data centers make use of AI-based deep learning (DL) applications to predict failures ahead of time, and ML-based recommendation systems can apply appropriate fixes to the system in time. Take, for instance, the HPE artificial intelligence predictive engine that identifies and solves trouble in the data center. Signatures are built to identify other users that might be affected, and rules are then developed to instigate a solution, which can be automated. The AI/machine learning solution can quickly intervene across the entire system and stop others from inheriting the same issue.

#4 Intelligent monitoring and storing of data

By incorporating machine learning, AI can take over the mundane job of monitoring huge amounts of data and make IT professionals more efficient in terms of the quality of tasks they handle. Litbit has developed the first AI-powered data center operator, Dac. It uses a human-to-machine learning interface that combines existing employee knowledge with real-time data. Incorporating over 11,000 pieces of innate knowledge, Dac has the potential to hear when a machine is close to failing, feel vibration patterns that are bad for HDD I/O, and spot intruders. Dac is proof of how AI can help monitor networks efficiently.

Along with monitoring, it is also necessary to store vast amounts of data securely. AI holds the potential to make more intelligent decisions on storage optimization and tiering, transforming storage management by learning I/O patterns and data lifecycles.

Mixed views on the future of AI in data centers?

Let's face it: the complexity that comes with huge masses of data is often difficult to handle.
Humans are not as scalable as an automated solution when it comes to handling data with precision and efficiency. Take Cisco's M5 Unified Computing or HPE's InfoSight as examples: both try to compensate for the fact that humans are increasingly unable to deal with the complexity of a modern data center.

One consequence of using automated systems is the possibility of humans losing their jobs and being replaced by machines, to varying degrees depending on the nature of the job role. AI is predicted to open the door to robots and automated machines that will soon perform repetitive tasks in data centers. On the bright side, organizations could allow employees, freed from repetitive and mundane tasks, to invest their time in the more productive and creative aspects of running a data center.

Beyond jobs, the capital involved in setting up and maintaining a data center is huge; add AI to the data center and you may have to invest double or even triple that amount to keep everything running smoothly. Managing and storing all of the operational log data for analysis also comes with its own set of issues: the log data that feeds these ML systems can become a larger data set than the application data itself, so firms need a proper plan in place to manage all of it.

Embracing AI in data centers promises financial benefits from the outset while attracting more customers. It will be interesting to see which tech companies follow in Google's footsteps and implement AI in their data centers. Tech companies should definitely watch this space to take their data center operations up a notch.

5 ways artificial intelligence is upgrading software engineering

Intelligent Edge Analytics: 7 ways machine learning is driving edge computing adoption in 2018

15 millions jobs in Britain at stake with Artificial Intelligence robots set to replace humans at workforce
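The ML-based failure prediction described in section #3 above can be illustrated with a toy example. The sketch below flags anomalous server temperature readings with a simple z-score test; the threshold and the readings are made-up values for illustration, and production systems use far richer models and telemetry:

```python
import statistics

def find_anomalies(readings, threshold=2.5):
    """Return indices of readings more than `threshold` std devs from the mean."""
    mean = statistics.fmean(readings)
    stdev = statistics.pstdev(readings)
    if stdev == 0:
        return []  # all readings identical: nothing anomalous
    return [i for i, r in enumerate(readings)
            if abs(r - mean) / stdev > threshold]

# Hypothetical hourly inlet-temperature readings (deg C) for one rack
temps = [22.1, 22.4, 22.0, 21.9, 22.3, 22.2, 35.0, 22.1, 22.0, 22.2]
alerts = find_anomalies(temps)  # flags index 6, the 35.0 spike
```

In a real deployment, such alerts would feed a ticketing or remediation pipeline rather than a list.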
Packt Editorial Staff
10 Apr 2018
8 min read

Everything you need to know about Ethereum

Ethereum was first conceived by Vitalik Buterin in November 2013. The critical idea proposed was the development of a Turing-complete language that allows the development of arbitrary programs (smart contracts) for blockchains and decentralized applications. This is in contrast to Bitcoin, where the scripting language is limited in nature and allows only the necessary operations. This is an excerpt from the second edition of Mastering Blockchain by Imran Bashir.

The following list shows all the releases of Ethereum, from the first release to the planned final release:

Olympic: May 2015
Frontier: July 30, 2015
Homestead: March 14, 2016
Byzantium (first phase of Metropolis): October 16, 2017
Metropolis: to be released
Serenity (final version of Ethereum): to be released

The first version of Ethereum, called Olympic, was released in May 2015. Two months later, a second version, called Frontier, was released. After about a year, another version named Homestead, with various improvements, was released in March 2016. The latest Ethereum release is called Byzantium. It is the first part of the development phase called Metropolis and implemented a planned hard fork at block number 4,370,000 on October 16, 2017. The second part of this phase, called Constantinople, is expected in 2018, but there is no exact time frame available yet. The final planned release of Ethereum is called Serenity, which is intended to introduce a proof-of-stake (PoS) blockchain in place of the current proof-of-work (PoW) mechanism.

The yellow paper

The Yellow Paper, written by Dr. Gavin Wood, serves as a formal definition of the Ethereum protocol. Anyone can implement an Ethereum client by following the protocol specifications defined in the paper. While this paper is a challenging read, especially for those who do not have a background in algebra or mathematics, it contains a complete formal specification of Ethereum.
This specification can be used to implement a fully compliant Ethereum client. The list of symbols used in the paper is provided here, with their meanings, in the hope that it will make reading the Yellow Paper more accessible. Once the symbol meanings are known, it becomes much easier to understand how Ethereum works in practice.

≡ Is defined as
= Is equal to
≠ Is not equal to
‖...‖ Length of, number of bytes
∈ Is an element of
∉ Is not an element of
∀ For all
∃ There exists
∪ Union
∧ Logical AND
∨ Logical OR
⊕ Exclusive OR
: Such that
{} Set
() Function of tuple
[] Array indexing
(a, b) Real numbers ≥ a and < b
≤ Less than or equal to
> Is greater than
+ Addition
- Subtraction
∑ Summation
⌊ ⌋ Floor, lowest element
⌈ ⌉ Ceiling, highest element
∅ Empty set, null
σ (Sigma) World state
μ (Mu) Machine state
Υ (Upsilon) Ethereum state transition function
Π Block-level state transition function
Λ Contract creation function
· Sequence concatenation
{ Describes various cases of if, otherwise

Ethereum blockchain

Ethereum, like any other blockchain, can be visualized as a transaction-based state machine; this definition comes from the Yellow Paper. The core idea is that in the Ethereum blockchain, a genesis state is transformed into a final state by executing transactions incrementally. The final transformation is then accepted as the absolute, undisputed version of the state. In the following diagram, the Ethereum state transition function is shown, where a transaction execution has resulted in a state transition: In the example, a transfer of two Ether from address 4718bf7a to address 741f7a2 is initiated. The initial state represents the state before the transaction execution, and the final state is what the transformed state looks like. Mining plays a central role in state transition, and we will describe the mining process in detail in later sections. The state is stored on the Ethereum network as the world state.
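The transaction-based state machine described above can be sketched in a few lines of plain Python. This is a toy model of the idea only; the shortened addresses come from the example above, and the gas-free transfer rule is a simplification, not how an Ethereum client is implemented:

```python
# Toy world state: address -> account (balance in Ether, nonce)
world_state = {
    "0x4718bf7a": {"balance": 10, "nonce": 0},
    "0x741f7a2":  {"balance": 0,  "nonce": 0},
}

def apply_transaction(state, sender, recipient, value):
    """A simplified state transition: validate, then morph the state."""
    if state[sender]["balance"] < value:
        raise ValueError("insufficient balance")
    state[sender]["balance"] -= value
    state[recipient]["balance"] += value
    state[sender]["nonce"] += 1  # each transaction increments the sender's nonce
    return state

# The two-Ether transfer from the example above
apply_transaction(world_state, "0x4718bf7a", "0x741f7a2", 2)
```

Executing transactions one after another in this way is exactly the incremental genesis-to-final-state transformation described above, minus gas accounting, signatures, and contract code.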
This is the global state of the Ethereum blockchain.

How Ethereum works from a user's perspective

For all the conversation around cryptocurrencies, it's very rare for anyone to actually explain how they work from the perspective of a user. Let's take a look at how it works in practice, using the example of one person (Bashir) transferring money to another (Irshad). You may also want to read our post on whether Ethereum will eclipse Bitcoin. For the purposes of this example we're using the Jaxx wallet; however, you can use any cryptocurrency wallet.

First, either the receiver requests money from the sender, or the sender decides to send money to the receiver. A request can be made by sharing the receiver's Ethereum address with the sender. In our example, if Irshad requests money from Bashir, she shares her Ethereum address with him, encoded as a QR code that can be sent via email, text, or any other communication method. This QR code is shown in the following screenshot.

Once Bashir receives this request, he either scans the QR code or copies Irshad's Ethereum address into his Ethereum wallet software and initiates a transaction. This process is shown in the following screenshot, where the Jaxx Ethereum wallet on iOS is used to send money to Irshad. The screenshot shows that the sender has entered both the amount and the destination address. Just before sending the Ether, the final step is to confirm the transaction, which is also shown here:

Once the request (transaction) of sending money is constructed in the wallet software, it is broadcast to the Ethereum network. The transaction is digitally signed by the sender as proof that he is the owner of the Ether.
This transaction is then picked up by nodes called miners on the Ethereum network for verification and inclusion in a block. At this stage, the transaction is still unconfirmed. Once it is verified and included in a block, the PoW process begins. Once a miner finds the answer to the PoW problem, by repeatedly hashing the block with a new nonce, the block is immediately broadcast to the rest of the nodes, which then verify the block and the PoW. If all the checks pass, the block is added to the blockchain and the miner is paid rewards accordingly.

Finally, Irshad receives the Ether, and it is shown in her wallet software. This is shown here: On the blockchain, this transaction is identified by the following transaction hash: 0xc63dce6747e1640abd63ee63027c3352aed8cdb92b6a02ae25225666e171009e Details regarding this transaction can be viewed in a block explorer, as shown in the following screenshot: This walkthrough should give you some idea of how it works.

Different Ethereum networks

The Ethereum network is a peer-to-peer network in which nodes participate in order to maintain the blockchain and contribute to the consensus mechanism. Networks can be divided into three types, based on requirements and usage. These types are described in the following subsections.

Mainnet

Mainnet is the current live network of Ethereum. The current version of mainnet is Byzantium (Metropolis) and its chain ID is 1. The chain ID is used to identify the network. A block explorer, which shows detailed information about blocks and other relevant metrics, can be used to explore the Ethereum blockchain.

Testnet

Testnet is the widely used test network for the Ethereum blockchain. This test blockchain is used to test smart contracts and DApps before they are deployed to the live production blockchain. Because it is a test network, it allows experimentation and research.
The main testnet is called Ropsten, which contains all the features of the smaller, special-purpose testnets that were created for specific releases. For example, the Kovan and Rinkeby testnets were developed for testing Byzantium releases. The changes implemented on these smaller testnets have also been implemented on Ropsten, so the Ropsten test network now contains all the properties of Kovan and Rinkeby.

Private net

As the name suggests, this is a private network that can be created by generating a new genesis block. This is usually the case in private blockchain distributed ledger networks, where a private group of entities start their own blockchain and use it as a permissioned blockchain.

The following list shows the Ethereum networks with their network IDs. These network IDs are used by Ethereum clients to identify the network.

Ethereum mainnet: 1
Morden: 2
Ropsten: 3
Rinkeby: 4
Kovan: 42
Ethereum Classic mainnet: 61

You should now have a good foundation of knowledge to get started with Ethereum. To learn more about Ethereum and other cryptocurrencies, check out the new edition of Mastering Blockchain.

Other posts from this book

A brief history of Blockchain
Write your first Blockchain: Learning Solidity Programming in 15 minutes
15 ways to make Blockchains scalable, secure and safe!
What is Bitcoin
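The network IDs listed above can be kept in a small lookup table, for example when labeling which chain a node reports. This is a hypothetical helper for illustration, not part of any Ethereum client:

```python
# Network IDs from the list above
ETHEREUM_NETWORKS = {
    1: "Ethereum mainnet",
    2: "Morden",
    3: "Ropsten",
    4: "Rinkeby",
    42: "Kovan",
    61: "Ethereum Classic mainnet",
}

def network_name(network_id):
    """Map a reported network ID to a human-readable name."""
    return ETHEREUM_NETWORKS.get(network_id, "unknown / private net")

name = network_name(3)  # "Ropsten"
```

Any ID outside the table, such as one chosen for a freshly generated genesis block, falls through to the private-net case.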

Raka Mahesa
25 Apr 2017
6 min read

Raspberry Pi Zero W: What you need to know and why it's great

On February 28th, 2017, the Raspberry Pi Foundation announced the latest product in the Raspberry Pi series: the Raspberry Pi Zero W. The new product adds wireless connectivity to the Raspberry Pi Zero and retails for just $10. This is great news for enthusiasts and hobbyists all around the world.

Wait, wait. Raspberry Pi? Raspberry Pi Zero? Wireless? What are we talking about? To understand the idea behind the Raspberry Pi Zero W and the benefits it brings, we need to back up a bit and talk about the Raspberry Pi series of products and its history.

The Raspberry Pi's history

The Raspberry Pi is a computer the size of a credit card that was made available to the public for the low price of $35. And yes, despite its size and price, it's a full-fledged computer capable of running an operating system like Linux or Android, though Windows is a bit too heavy for it. It comes with 2 USB ports and an HDMI port, so you can plug in your keyboard, mouse, and monitor and treat it just like your everyday computer.

The first generation of the Raspberry Pi was released in February 2012 and was an instant hit among the DIY and hobbyist crowd. The small, low-priced computer proved to be perfect for powering their DIY projects. By the time this post was written, 10 million Raspberry Pi computers had been sold and countless projects using the miniature computer have been made. It has been used in projects including home arcade boxes, automated pet feeders, media centers, security cameras, and many, many others.

The second generation of the Raspberry Pi was launched in February 2015. The computer now offered a higher-clocked, quad-core processor with 1 GB of RAM and was still sold at $35. Then, a year later in February 2016, the Raspberry Pi 3 was launched.
While the price remained the same, this latest generation of the computer boasted higher performance as well as wireless connectivity via WiFi and Bluetooth.

What's better than a $35 computer?

The Raspberry Pi has come a long way but, with all of that said, do you know what's better than a $35 computer? A $5 computer that's even smaller, which is exactly what was launched in November 2015: the Raspberry Pi Zero. Despite its price, this new computer is actually faster than the original Raspberry Pi and, by using micro USB and mini HDMI instead of normal-sized ports, the Raspberry Pi Zero managed to shrink down to just half the size of a credit card.

Unfortunately, using micro USB and mini HDMI ports leads to another set of problems. Most people need additional dongles or converters to connect to those ports, and those accessories can be as expensive as the computer itself. For example, a micro-USB to Ethernet connector costs $5, a micro-USB to USB connector costs $4, and a micro-USB WiFi adapter costs $10.

Welcome the Raspberry Pi Zero W

Needing additional dongles and accessories that cost as much as the computer itself pretty much undermines the point of a cheap computer. To mitigate that, the Raspberry Pi Zero W, a Raspberry Pi Zero with integrated WiFi and Bluetooth connectivity, was introduced in February 2017 at the price of $10. Here are the hardware specifications of the Raspberry Pi Zero W:

Broadcom BCM2835 single-core CPU @ 1GHz
512MB LPDDR2 SDRAM
Micro USB data port
Micro USB power port
Mini HDMI port with 1080p60 video output
Micro SD card slot
HAT-compatible 40-pin header
Composite video and reset headers
CSI camera connector
802.11n wireless LAN
Bluetooth 4.0

Its dimensions are 65mm x 30mm x 5mm (for comparison, a Raspberry Pi 3 measures 85mm x 56mm x 17mm).

There are several things to note about the hardware. One of them is that the 40-pin GPIO connector is not soldered out of the box; you have to solder it yourself.
The unsoldered connector is part of what allows the computer to be so slim, and it suits people who don't need a GPIO connection. Another thing to note is that the wireless chip is the same one found in the Raspberry Pi 3, so the two should behave and perform similarly. And because the rest of the hardware is basically the same as that of the Raspberry Pi Zero, you can think of the Raspberry Pi Zero W as a fusion of both series.

Is the wireless connectivity worth the added cost?

You may wonder if the wireless connectivity is worth the additional $5. Well, it really depends on your use case. For example, in my home everything is already wireless and I don't have any LAN cables I can plug in to connect to the Internet, so wireless connectivity is a really big deal for me.

And really, there are a lot of projects and places where having wireless connectivity could help a lot. Imagine you want to set up a camera in front of your home that sends you an email every time it spots a particular type of car. Without a WiFi connection, you would have to pull your Ethernet cable all the way out there to get an Internet connection. And it's not just the Internet to consider: having Bluetooth connectivity is a really practical way to connect to other devices, like your phone.

All in all, the Raspberry Pi Zero W is a great addition to the Raspberry Pi line of computers. It's affordable, it's highly capable, and with the addition of wireless connectivity it has become practical to use too. So go get your hands on one and start your own project today.

About the author

Raka Mahesa is a game developer at Chocoarts (chocoarts.com), who is interested in digital technology in general. In his spare time, he likes to work on his own projects, with Corridoom VR being his latest released game. Raka also regularly tweets as @legacy99.