
Tech Guides


How to create a strong data science project portfolio that lands you a job

Aaron Lazar
13 Feb 2018
8 min read
Okay, you're probably here because you've got just a few months to graduate and the projects section of your resume is blank. Or maybe you're an inquisitive nerd scraping the web for ways to crack that dream job. Either way, you're not alone; thousands of others are trying to build a great data science portfolio to land a good job. Look no further, we'll do our best to help you build a portfolio that catches the recruiter's eye. David "Trent" Salazar's portfolio is a great example of a well-rounded one, and Sajal Sharma's shows how to present a data science portfolio on a platform like GitHub.

Companies are on the lookout for employees who can add value to the business. To showcase this effectively on your resume, the first step is to understand the different ways in which you can add value.

4 things you need to show in a data science portfolio

Data science can be broken down into four broad areas:

- Obtaining insights from data and presenting them to business leaders
- Designing an application that directly benefits the customer
- Designing an application or system that directly benefits other teams in the organisation
- Sharing expertise on data science with other teams

Your portfolio should cover all, or at least most, of these areas to carry you through a job selection process. So let's see what we can do to make a great portfolio.

Demonstrate that you know what you're doing

The idea is to show the recruiter that you can perform the critical aspects of data science: import a data set, clean the data, extract useful information from it using various techniques, and finally visualise the findings and communicate them (a minimal sketch of this loop follows at the end of this section). Beyond the technical skills, a few soft skills are expected as well, for instance the ability to communicate and collaborate with others, and the ability to reason and take the initiative when required. If your project communicates these things, you're in!

Stay focused and be specific

You might know a lot, but rather than throwing all your skills, projects and knowledge in the employer's face, it's better to focus on doing one thing and doing it right. Just as you keep your resume short and sweet, apply the same principle to your portfolio. Always remember: the interviewer is looking for specific skills.

Research the data science job market

Find five or six jobs that interest you, perhaps on LinkedIn or Indeed, and go through their descriptions thoroughly. Understand what skills the employer is looking for: it could be classification, machine learning, statistical modeling or regression. Pick up the tools required for the job, for example Python, R, TensorFlow or Hadoop, and if you don't know a tool yet, skill up as you work through your projects. Also identify the kind of data they'd want you to work on, such as text or numerical data. Once you have this information at hand, start building your projects around these skills and tools.

Be a problem solver

Projects that don't solve an actual problem won't stand out in your portfolio. The closer your projects are to the real world, the easier it is for the recruiter to decide in your favour. Real problems also showcase your analytical skills and how you've applied data science to a prevailing problem.
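To make that import, clean, analyse and visualise loop concrete, here is a minimal Python sketch. The CSV file and its column names are hypothetical placeholders, not a prescribed dataset; swap in whatever "dirty" data you pick.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Import: load a (hypothetical) dirty CSV of tweets
df = pd.read_csv("tweets.csv")

# Clean: drop exact duplicates and rows missing the text field,
# and coerce the timestamp column into proper datetimes
df = df.drop_duplicates()
df = df.dropna(subset=["text"])
df["created_at"] = pd.to_datetime(df["created_at"], errors="coerce")

# Extract: a simple derived feature (tweet length), summarised per day
df["length"] = df["text"].str.len()
daily = df.set_index("created_at")["length"].resample("D").mean()

# Visualise and communicate the finding
daily.plot(title="Average tweet length per day")
plt.xlabel("Date")
plt.ylabel("Characters")
plt.tight_layout()
plt.savefig("tweet_length.png")
```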
Put at least 3 diverse projects in your data science portfolio

A nice way to build a portfolio is to list three good projects that are diverse in nature. Here are some project types to get you started.

Data cleaning and wrangling

Data cleaning is one of the most critical tasks a data scientist performs. By taking a group of diverse data sets, consolidating them and making sense of them, you give the recruiter confidence that you know how to prep data for analysis. For example, you could take Twitter or WhatsApp data and clean it for analysis. The process is simple: find a "dirty" data set, spot an interesting angle to approach the data from, clean it up, perform your analysis, and finally present your findings.

Data storytelling

Storytelling showcases not only your ability to draw insight from raw data but also how well you can convey those insights to others and persuade them. For example, you could use data from your country's bus system to identify which stops incur the most delays, something that could be fixed by rerouting. Make sure your analysis is descriptive and your code and logic can be followed. Here's what you do: find a good dataset, explore it and spot correlations, visualise them, and then write up your narrative. Tackle the data from various angles and pick the most interesting one; if it's interesting to you, it will most probably be interesting to whoever reviews it. Break down each step and each code snippet as if you were describing it to a friend. The idea is to teach the reviewer something new as you walk through the analysis.

End-to-end data science

If you're more into machine learning or algorithm writing, you should do an end-to-end data science project, one capable of taking in data, processing it and finally learning from it, every step of the way. For example, you could pick up fuel pricing data for your city, or stock market data; it should be dynamic and updated regularly. The trick here is to keep the code simple so that the project is easy to set up and run. First identify a good topic. Note that you won't be working with a single prepared dataset; you'll need to import and parse all the data and bring it together into one dataset yourself. Next, split out training and test data and make predictions. Document your code and your findings and you're good to go. A minimal sketch of this workflow follows.
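Here is one way that end-to-end flow might look, assuming a hypothetical CSV of daily fuel prices with date and price columns (the file and column names are placeholders for illustration):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Take in data: a hypothetical feed of daily fuel prices
df = pd.read_csv("fuel_prices.csv", parse_dates=["date"])

# Process it: sort chronologically and build simple lag features
df = df.sort_values("date")
df["prev_price"] = df["price"].shift(1)
df["prev_week_mean"] = df["price"].rolling(7).mean().shift(1)
df = df.dropna()

X = df[["prev_price", "prev_week_mean"]]
y = df["price"]

# Learn from it: hold out the most recent 20% as a test set;
# shuffle=False keeps the time ordering intact
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)
model = LinearRegression().fit(X_train, y_train)

# Evaluate the predictions on unseen days
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```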
Prove you have the data science skill set

If you want that job, you've got to have the appropriate tools to get the job done. Here's a list of some of the most popular skills to demonstrate.

Data science languages

There are a number of key languages in data science that are essential. It might seem obvious, but making sure they're on your resume and demonstrated in your portfolio is incredibly important. Include things like:

- Python
- R
- Java
- Scala
- SQL

Big data tools

If you're applying for big data roles, demonstrating your experience with the key technologies is a must. It not only proves you have the skills but also shows you're aware of which tools can be used to build a big data solution or project. You'll need:

- Hadoop
- Spark
- Hive

Machine learning frameworks

With machine learning so in demand, if you can prove you've used a number of machine learning frameworks, you've already done a lot to impress. Remember, many organizations won't actually know as much about machine learning as you might think; in fact, they might even be hiring you with a view to building out this capability. Remember to include:

- TensorFlow
- Caffe2
- Keras
- PyTorch

Data visualisation tools

Data visualisation is a crucial component of any data science project. If you can visualise and communicate data effectively, you immediately demonstrate that you can collaborate with others and make your insights accessible and useful to the wider business. Include tools like:

- D3.js
- Excel charts
- Tableau
- ggplot2

So there you have it: you know what to do to build a decent data science portfolio. It's also well worth entering competitions and challenges. They will not only keep your skills up to date and well oiled, but also give you a broader picture of what people are actually working on and which tools they use to solve problems.


How do you become a developer advocate?

Packt Editorial Staff
11 Oct 2019
8 min read
Developer advocates are people with a strong technical background whose job is to help developers be successful with a platform or technology. They act as a bridge between the engineering team and the developer community. A developer advocate not only fills the gap between developers and the platform, but also looks after developers' own growth, helping their projects gain traction and make progress. Developer advocacy is broadly referred to as "developer relations". Those who practice developer advocacy have fallen into the profession in one way or another; as the processes and theories of the programming world have evolved over the years, so has the idea of developer advocacy, shaped by advocates working in the wild on their own initiative.

This article is an excerpt from the book Developer, Advocate! by Geertjan Wielenga. The book serves as a rallying cry to inspire and motivate tech enthusiasts and burgeoning developer advocates to take their first steps within the tech community.

The question then arises: how does one become a developer advocate? Here are some experiences shared by well-known developer advocates on how they started the journey that landed them in this role.

Is developer advocacy taught in universities?

Bruno Borges, Principal Product Manager at Microsoft, says that for most developer advocates or developer relations personnel, it was something that just happened. Developer advocacy is not a discipline that is taught in universities; there's no training specifically for this. Most often, somebody will come to realize that what they already do is developer relations. This is a discipline that is a conjunction of several other roles: software engineering, product management, and marketing.

I started as a software engineer and then I became a product manager. As a product manager, I was engaged with marketing divisions and sales divisions directly on a weekly basis. Maybe in some companies, sales, marketing, and product management are pillars that are not needed. I think it might vary, but in my opinion, those pillars are essential for doing a proper developer relations job. Trying to aim for those pillars is a great foundation. Just as in computer science, when we go to college for four years, sometimes we don't use some of that background, but it gives us a good foundation. From outsourcing companies that just built business software, I then went to vendor companies. That's where I landed as a person helping users take full advantage of the software they needed to build their own solutions. That process is, ideally, what I see happening to others.

The journey of a regular tech enthusiast to a developer advocate

Ivar Grimstad, a developer advocate at the Eclipse Foundation, speaks about his journey from attending conferences as a regular tech enthusiast to standing on stage as an advocate for his company. Ivar says: I have attended many different conferences in my professional life and I always really enjoyed going to them. After some years of regularly attending conferences, I came to the point of thinking, "That guy isn't saying anything that I couldn't say. Why am I not up there?" I just wanted to try speaking, so I started submitting abstracts. I already gave talks at local meetups, but now I felt comfortable enough to approach conferences. I continued submitting abstracts until I got accepted.
As it turned out, while I was becoming interested in speaking, my company was struggling to raise its profile. Nobody, even in Sweden, knew what we did. So, my company was super happy for any publicity it could get. I could provide it with that by just going out and talking about tech. It didn't have to be related to anything we did; I just had to be there with the company name on the slides. That was good enough in the eyes of my company. After a while, about 50% of my time became dedicated to activities such as speaking at conferences and contributing to open source projects.

The tables turned: from engineer to developer advocate

Mark Heckler, a Spring developer and advocate at Pivotal, narrates how the tables turned for him, going from full-time engineering to Pivotal Principal Technologist & Developer Advocate. He says: initially, I was doing full-time engineering work and presenting on the side, occasionally taking a few days here and there to travel and present at events and conferences. I think many people realized that I had this public-facing level of activity; I was out there enough that they felt I was either doing this full-time or maybe should be. A good friend of mine reached out and said, "I know you're doing this anyway, so how would you like to make this your official role?" That sounded pretty great, so I interviewed, and I was offered a full-time gig doing, essentially, what I was already doing in my spare time.

A hobby turned into a profession

Matt Raible, a developer advocate at Okta, worked as an independent consultant for 20 years and did advocacy as a side hobby. He talks about his experience as a consultant and walks through his progression. I started a blog in 2002 and wrote about Java a lot. This was before Stack Overflow, so I used Struts and Java EE. I posted my questions, which you would now post on Stack Overflow, on that blog with stack traces, and people would find them and help. It was a collaborative community. I've always done speaking at conferences on the side. I started working for Stormpath two years ago, as a part-time contractor, while also working at Computer Associates: I was doing Java in the morning at Stormpath and JavaScript in the afternoon at Computer Associates. I really liked the people I was working with at Stormpath and they tried to hire me full-time. I told them to make me an offer that I couldn't refuse, and they said, "We don't know what that is!" I wanted to be able to blog and speak at conferences, so I spent a month coming up with my dream job. Stormpath wanted me to be its Java lead. The problem was that I like Java, but it's not my favorite thing; I tend to do more UI work. The opportunity went away for a month and then I said, "There's a way to make this work! Can I do Java and JavaScript?" Stormpath agreed that instead of being more of a technical leader and owning the Java SDK, I could be one of its advocates. There were a few other people on board in the advocacy team. Six months later, Stormpath got bought by Okta. As an independent consultant, I was used to switching jobs every six months, but I didn't expect that to happen once I went full-time. That's how I ended up at Okta!
Developer advocacy means weighing the highs and lows of the tech world

Scott Davis, a Principal Engineer at Thoughtworks, was a classroom instructor teaching software classes to business professionals before becoming a developer advocate. As he puts it, tech really is a world of strengths and weaknesses. Advocacy, I think, is where you honestly say, "If we balance out the pluses and the minuses, I'm going to send you down the path where there are more strengths than weaknesses. But I also want to make sure that you are aware of the sharp, pointy edges that might nick you along the way." I spent eight years in the classroom as a software instructor and that has really informed my entire career. It's one thing to sit down and kind of understand how something works when you're cowboy coding on your own. It's another thing altogether when you're standing up in front of an audience of tens, or hundreds, or thousands of people.

Discover how developer advocates are putting developer interests at the heart of the software industry in companies including Microsoft and Google with Developer, Advocate! by Geertjan Wielenga. This book is a collection of in-depth conversations with leading developer advocates that reveal the world of developer relations today.

- 6 reasons why employers should pay for their developers' training and learning resources
- "Developers need to say no" – Elliot Alderson on the FaceApp controversy in a BONUS podcast episode [Podcast]
- GitHub has blocked an Iranian software developer's account
- How do AWS developers manage Web apps?
- Are you looking at transitioning from being a developer to manager? Here are some leadership roles to consider


Uses of Machine Learning in Gaming

Natasha Mathur
22 Oct 2018
5 min read
All around us, our perception of learning and intellect is being challenged daily by the advent of new and emerging technologies. From self-driving cars and programs that play Go and chess, to computers that beat humans at classic Atari games, the group of technologies we colloquially call machine learning has come to define a new era of technological growth, one whose importance has been compared to the discovery of electricity and which has already been called the next human technological age.

Games and simulations are no strangers to AI technologies, and numerous assets are available to the Unity developer for simulating machine intelligence. These include Behavior Trees, Finite State Machines, navigation meshes, A*, and other heuristics game developers use to simulate intelligence. So, why machine learning, and why now? The reason is due in large part to the OpenAI initiative, which encourages research across academia and the industry to share ideas and research on AI and ML. This has resulted in an explosion of growth in new ideas, methods, and areas for research. For games and simulations, it means we no longer have to fake or simulate intelligence: we can now build agents that learn from their environment and even learn to beat their human builders (a minimal example of such a learning agent is sketched below).

This article is an excerpt taken from the book 'Learn Unity ML-Agents – Fundamentals of Unity Machine Learning' by Micheal Lanham. In this article, we look at the role that machine learning plays in game development.

Machine learning is an implementation of artificial intelligence. It is a way for a computer to assimilate data or state and provide a learned solution or response. We often think of AI as a broader term for a "smart" system; a full game AI system, for instance, may combine ML tools with more classic AI such as Behavior Trees to simulate a richer, more unpredictable opponent. We will use AI to describe a system and ML to describe the implementation.
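As a concrete illustration of an agent learning from its environment, here is a minimal tabular Q-learning sketch in Python on a toy one-dimensional level. It is not from the book or the ML-Agents toolkit; the level, rewards, and hyperparameters are invented purely for illustration.

```python
import random

# Toy level: positions 0..5, with the goal at position 5
N_STATES, GOAL = 6, 5
ACTIONS = [-1, +1]  # step left or step right
EPISODES, ALPHA, GAMMA, EPSILON = 500, 0.1, 0.9, 0.1

# Q-table: estimated future reward for each (state, action) pair
Q = [[0.0, 0.0] for _ in range(N_STATES)]

for _ in range(EPISODES):
    state = 0
    while state != GOAL:
        # Epsilon-greedy policy: mostly exploit, sometimes explore
        if random.random() < EPSILON:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] >= Q[state][1] else 1
        nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
        reward = 1.0 if nxt == GOAL else 0.0
        # The Q-learning update rule
        Q[state][action] += ALPHA * (
            reward + GAMMA * max(Q[nxt]) - Q[state][action]
        )
        state = nxt

# After training, the learned policy walks straight to the goal
print(["left" if q[0] > q[1] else "right" for q in Q[:GOAL]])
```

After a few hundred episodes the printed policy is "right" at every position: behavior the agent was never explicitly programmed with, only rewarded for discovering.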
How machine learning is useful in gaming

Game engines have embraced the idea of incorporating ML into all aspects of their products, not just game AI. While game AI may be the first use that comes to mind, ML helps game development in the following areas:

- Map/level generation: There are already plenty of examples where developers have used ML to auto-generate everything from dungeons to realistic terrain. Getting this right can give a game endless replayability, but it can be some of the most challenging ML to develop.
- Texture/shader generation: Another area getting attention is texture and shader generation, boosted by advances in generative adversarial networks (GANs). There are plenty of great and fun examples of this tech in action; just search for "deep fakes" in your favorite search engine.
- Model generation: A few projects coming to fruition in this area could greatly simplify 3D object construction through enhanced scanning and/or auto-generation. Imagine being able to describe a simple model in text and have ML build it for you, in real time, in a game or other AR/VR/MR app.
- Audio generation: Generating sound effects or music on the fly is already being worked on in other areas, not just games. Yet, just imagine having a custom-designed soundtrack for your game developed by ML.
- Artificial players: This encompasses many uses, from gamers using ML to play a game on their behalf, to developers using artificial players as enhanced test agents or as a way to engage players during periods of low activity. If your game is simple enough, this could also be a way of auto-testing levels.
- NPCs and game AI: Currently, there are better patterns, such as Behavior Trees, for modelling basic behavioral intelligence (a hand-authored sketch follows this list). While it's unlikely that BTs or similar patterns will go away any time soon, imagine being able to model an NPC that occasionally exhibits unpredictable but rather cool behavior. This opens up all sorts of possibilities that excite not only developers but players as well.
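For contrast with the learning agent earlier, here is a minimal hand-authored behavior tree, the classic pattern the excerpt mentions for NPC intelligence, sketched in Python. The node types are the standard Selector/Sequence composites; the guard NPC and its blackboard fields are invented for illustration.

```python
# Minimal behavior-tree composites: a Sequence succeeds only if all
# of its children succeed; a Selector succeeds on the first child
# that succeeds. Both tick their children in order.
class Sequence:
    def __init__(self, *children):
        self.children = children

    def tick(self, npc):
        return all(child.tick(npc) for child in self.children)


class Selector:
    def __init__(self, *children):
        self.children = children

    def tick(self, npc):
        return any(child.tick(npc) for child in self.children)


class Condition:
    def __init__(self, predicate):
        self.predicate = predicate

    def tick(self, npc):
        return self.predicate(npc)


class Action:
    def __init__(self, name):
        self.name = name

    def tick(self, npc):
        print(npc["name"], "->", self.name)
        return True


# A guard NPC: attack if an enemy is visible, otherwise patrol
tree = Selector(
    Sequence(Condition(lambda npc: npc["enemy_visible"]), Action("attack")),
    Action("patrol"),
)

tree.tick({"name": "guard", "enemy_visible": False})  # guard -> patrol
tree.tick({"name": "guard", "enemy_visible": True})   # guard -> attack
```

Note how the behavior here is entirely authored by hand, which is exactly the predictability that learning-based NPCs promise to break out of.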
So, we looked at the different areas of the gaming world, such as level generation, texture generation, model generation, artificial players, and NPCs, where machine learning can be used extensively. If you found this post useful, be sure to check out the book 'Learn Unity ML-Agents – Fundamentals of Unity Machine Learning' to learn more machine learning concepts in gaming.

- 5 Ways Artificial Intelligence is Transforming the Gaming Industry
- How should web developers learn machine learning?
- Deep Learning in games – Neural Networks set to design virtual worlds


React Native vs Ionic: Which one is the better mobile app development framework?

Guest Contributor
01 Mar 2019
6 min read
Today, mobile app development has come a long way; it isn't what it used to be. In earlier days, the development process involved only simple decisions such as design, features, and the cost of creating the app. That scenario has changed. Nowadays, mobile application development starts with selecting the right app development framework, and there are lots of options to choose from, like Flutter, AngularJS, Ionic, and React Native. In this post, we compare two powerful mobile app development frameworks, Ionic and React Native, to figure out the best option for your app development needs.

React Native - An introduction

React Native is developed by Facebook and uses JavaScript, one of the most popular languages among mobile developers. React Native allows you to create high-end applications for specific operating systems. Developers can reuse code from this framework rather than building an application from scratch, which makes it a helpful tool for creating applications for both Android and iOS.

Features and benefits of React Native

- Because code is reusable across Android and iOS, it saves development time and cost.
- With virtual-DOM support, it allows you to view changes in real time.
- There is a huge community of React Native developers.
- Code written by one developer can be read, studied, understood, and extended easily by other developers.
- Once the code is developed, it can be used on both iOS and Android.
- Issues with React Native apps for Android or iOS can be resolved quickly.
- It's consistently improving, and with every new release app development becomes more interesting and convenient.

Ionic - An introduction

Ionic is developed by Drifty using TypeScript. It's an open-source platform for developing hybrid mobile applications using HTML5, JavaScript, and CSS technologies. Apps built with the Ionic framework are mainly focused on the UI: their appearance and feel. As it utilizes a combination of Apache Cordova and Angular, Ionic is the first choice of many developers for app development. It provides tools such as HTML5, CSS, and Sass to develop top-notch hybrid mobile apps that run on Windows, Android, and iOS.

Features and benefits of Ionic

- Ionic is an open source framework for developing hybrid mobile applications, built on top of AngularJS and Apache Cordova.
- The Ionic Framework comes with a command line interface (CLI) that empowers developers to build and test apps on any platform.
- It offers the functionality available in native app development SDKs, letting developers build apps, customize them for different operating systems, and deploy through Cordova.
- Apps require one-time development with Ionic and can be deployed on Android, iOS, and Windows platforms.
- Apps can be built using HTML5, CSS, and JavaScript technologies.
- Apps developed with Ionic focus heavily on the UI to provide a better user experience.
- It offers a multitude of exciting elements to choose from during development.

Ionic 4 is the newest release of Ionic so far. The release is a complete rebuild of the popular JavaScript framework for developing mobile and desktop apps. Although Ionic has until now been built on Angular components, this new version has instead been built using Web Components. This is significant, as it changes the whole ball game for the project: the Ionic Framework is now an app development framework that can be used alongside any front-end framework, not just Angular.
React Native vs Ionic: A comparison

The points below contrast the two frameworks on different bases.

- Ease of learning: React Native offers few pre-developed elements, so learning takes time. Ionic ships with plenty of pre-developed and pre-designed elements, so learning is easier and shorter.
- Code language: React Native uses JSX, a syntax extension to JavaScript used to optimize code before compilation into JS. Ionic uses TypeScript, a typed superset of JavaScript for compiling clean and simple JS code for any browser.
- Code reusability: React Native allows the same code to be used to develop Windows, Android, and iOS mobile apps. With Ionic, the same code can be used to create apps for iOS, Android, and Windows, as well as web and PWA.
- Performance: React Native has excellent performance, as it doesn't use WebView. Ionic's performance is average because it relies on WebView.
- Community support: Strong for both.
- Ease of development: React follows the approach "learn once, write anywhere". Ionic code is written once and can be executed on any platform.
- Phone hardware accessibility: Ionic accesses phone hardware through Apache Cordova plugins, while React Native uses its own native modules and needs no third-party bridge.
- Code testing: React Native requires an emulator or a real device for testing. Ionic apps can be tested in any web browser.
- Documentation: React Native's documentation is quite basic. Ionic's documentation is simple, clear, and consistent.
- Developer: React Native is developed by Facebook; Ionic by Drifty.co.

By now you should have a sense of the basic differences between Ionic and React Native. The two frameworks differ from each other and offer distinguishing features. Let us now investigate both frameworks on some broader parameters.

Performance

Android apps developed with React Native usually score better on performance than ones developed with Ionic. This is because Ionic uses WebView for mobile app development and React Native does not.

Design

Ionic comes with plenty of pre-developed elements that allow you to create elegant apps with an excellent UI. This is what makes Ionic beat React Native when it comes to design, as React Native offers comparatively few pre-developed elements.

Cost

Developing apps with Ionic is cheaper than developing with React Native, because in Ionic the same code can be utilized across different platforms.

Final words

So which technology should you use? That's not easy to say. There are several factors you can consider, like cost, features, requirements, platforms, and team size, when deciding on the best app development framework. The two serve different purposes, and choosing between them may not be easy. If you have a low budget, Ionic can be your choice for building an appealing application with good performance. On the other hand, React Native lets you build native-like apps, but the cost of development may be much higher than with Ionic. Depending on your requirements and preferences, you can decide to choose either framework.

Author bio

David Meyer is a senior web developer at CSSChopper, a front-end and custom web development company catering to customers across the globe. David has a passion for web development and likes to share his knowledge through informative blogs and articles.


What does a data science team look like?

Fatema Patrawala
21 Nov 2019
11 min read
Until a couple of years ago, people barely knew the term "data science", which has now evolved into an extremely popular career field. The Harvard Business Review dubbed the data scientist the sexiest job of the 21st century, and expert professionals jumped on the "data is the new oil" bandwagon. As per the Figure Eight Report 2018, which takes the pulse of the data science community in the US, a lot has changed rapidly in the data science field over the years. For the 2018 report, they surveyed approximately 240 data scientists and found that machine learning projects have multiplied and that more and more data is required to power them. Data science and machine learning jobs are among LinkedIn's fastest-growing jobs, and the internet creates 2.5 quintillion bytes of data to process and analyze each day. With all these changes, data science teams across organizations have inevitably had to evolve.

The data science team is responsible for delivering complex projects where systems analysis, software engineering, data engineering, and data science are used to deliver the final solution. To achieve all of this, the team does not only have a data scientist or a data analyst but also includes other roles like business analyst, data engineer or architect, and chief data officer. In this post, we will differentiate and discuss the various job roles within a data science team, the skill sets required, and the compensation for each of them. For an in-depth understanding of data science teams, read the book Managing Data Science by Kirill Dubovikov, which has interesting case studies on building successful data science teams. He also explores how the team can efficiently manage data science projects through the use of DevOps and ModelOps.

Now let's get into understanding the individual data science roles and functions, but before that, let's take a look at the structure of the team. There are three basic team structures to match different stages of AI/ML adoption.

IT-centric team structure

At times, hiring a data science team is not an option for a company, which then has to leverage in-house talent. In such situations, it takes advantage of a fully functional in-house IT department. The IT team manages functions like data preparation, training models, creating user interfaces, and model deployment within the corporate IT infrastructure. This approach is fairly limited, but it is made practical by MLaaS solutions. Environments like Microsoft Azure or Amazon Web Services (AWS) are equipped with approachable user interfaces to clean datasets, train models, evaluate them, and deploy. Microsoft Azure, for instance, supports its users with detailed documentation for a low entry threshold. The documentation helps with fast training and early deployment of models even without expert data scientists on board.

Integrated team structure

Within the integrated structure, companies have a data science team which focuses on dataset preparation and model training, while IT specialists take charge of the interfaces and infrastructure for model deployment. Combining machine learning expertise with IT resources is the most viable option for constant and scalable machine learning operations. Unlike the IT-centric approach, the integrated method requires an experienced data scientist within the team. This approach ensures better operational flexibility in terms of available techniques.
Additionally, the team can leverage a deeper understanding of machine learning tools and libraries, like TensorFlow or Theano, which are aimed specifically at researchers and data science experts.

Specialized data science team

Companies can also have an independent data science department to build all-encompassing machine learning applications and frameworks. This approach entails the highest cost. All operations, from data cleaning and model training to building front-end interfaces, are handled by a dedicated data science team. It doesn't necessarily mean that all team members should have a data science background, but they should have a technology background with certain service management skills. A specialized structure helps address complex data science tasks that include research, the use of multiple ML models tailored to various aspects of decision-making, or multiple ML-backed services. Today's most successful Silicon Valley tech companies operate with specialized data science teams that are custom-built and wired for specific tasks to achieve different business goals. The team structure at Airbnb is one of the most interesting use cases: Martin Daniel, a data scientist at Airbnb, explains in this talk how the team emphasizes an experimentation-centric culture and applies machine learning rigorously to address unique product challenges.

Job roles and responsibilities within a data science team

As discussed earlier, there are many roles within a data science team. As per Michael Hochster, Director of Data Science at Stitch Fix, there are two types of data scientists: Type A and Type B. Type A stands for analysis: these are statisticians who make sense of data without necessarily having strong programming knowledge, performing data cleaning, forecasting, modeling, visualization, and so on. Type B stands for building: these individuals use data in production. They're good software engineers with strong programming knowledge and a statistics background, and they build recommendation systems, personalization use cases, and the like. It is rare for one expert to fit into a single category, but understanding these two functions helps make sense of the roles described below.

Chief data officer / chief analytics officer

The chief data officer (CDO) role has been taking organizations by storm. A recent NewVantage Partners Big Data Executive Survey 2018 found that 62.5% of Fortune 1000 business and technology decision-makers said their organization had appointed a chief data officer. The role involves overseeing a range of data-related functions that may include data management, ensuring data quality, and creating a data strategy. He or she may also be responsible for data analytics and business intelligence, the process of drawing valuable insights from data. Even though chief data officer and chief analytics officer (CAO) are two distinct roles, they are often handled by the same person. Expert professionals and leaders in analytics also own the data strategy and decide how a company should treat its data. This makes sense, as analytics provides insights from, and adds value to, the data. Hence, with a combined CDO and CAO, companies can take advantage of a good data strategy and proper data management without losing quality. According to compensation analysis from PayScale, the median chief data officer salary is $177,405 per year, including bonuses and profit share, ranging from $118,427 to $313,791 annually.
Skill sets required: data science and analytics, programming skills, domain expertise, and leadership and visionary abilities.

Data analyst

The data analyst role implies proper data collection and interpretation activities. The person in this role ensures that collected data is relevant and exhaustive while also interpreting the results of the data analysis. Some companies also require data analysts to have visualization skills to convert alienating numbers into tangible insights through graphics. As per Indeed, the average salary for a data analyst is $68,195 per year in the United States.

Skill sets required: programming languages like R, Python, JavaScript, C/C++, and SQL. Critical thinking, data visualization, and presentation skills are also good to have.

Data scientist

Data scientists are data experts who have the technical skills to solve complex problems and the curiosity to explore what problems need to be solved. A data scientist develops machine learning models to make predictions, is well versed in algorithm development and computer science, and knows the complete lifecycle of model development. A data scientist requires large amounts of data to develop hypotheses, make inferences, and analyze customer and market trends. Basic responsibilities include gathering and analyzing data and using various types of analytics and reporting tools to detect patterns, trends, and relationships in data sets. According to Glassdoor, the current U.S. average salary for a data scientist is $118,709.

Skill sets required: knowledge of big data platforms and tools like Seahorse (powered by Apache Spark), JupyterLab, TensorFlow, and MapReduce; programming languages including SQL, Python, Scala, and Perl; and statistical computing languages such as R. They should also have cloud computing capabilities and knowledge of various cloud platforms like AWS and Microsoft Azure. You can also read this post on how to ace a data science interview to learn more.

Machine learning engineer

At times a data scientist is confused with a machine learning engineer, but the machine learning engineer is a distinct role with different responsibilities. A machine learning engineer combines software engineering and machine modeling skills, determining which model to use and what data should be used for each model. Probability and statistics are also their forte. Everything that goes into training, monitoring, and maintaining a model is the ML engineer's job. The average machine learning engineer salary is $146,085 in the US, and the role is ranked No. 1 on Indeed's Best Jobs in 2019 list.

Skill sets required: expertise in computer science and programming languages like R, Python, Scala, and Java, as well as probability techniques and data modelling and evaluation techniques.

Data architects and data engineers

Data architects and data engineers work in tandem to conceptualize, visualize, and build an enterprise data management framework. The data architect visualizes the complete framework and creates a blueprint, which the data engineer uses to build the digital framework. The data engineering role has recently evolved from the traditional software engineering field.
Recent enterprise data management experiments indicate that data-focused software engineers need to work alongside data architects to build a strong data architecture. The average salary for a data architect in the US ranges from $122,000 to $129,000 annually, as per a recent LinkedIn survey.

Skill sets required: a keen interest in, and experience with, languages and frameworks like HTML5, RESTful services, Spark, Python, Hive, Kafka, and CSS. They should have the knowledge and experience to handle database technologies such as PostgreSQL, MapReduce, and MongoDB, and visualization platforms such as Tableau and Spotfire.

Business analyst

A business analyst (BA) essentially handles the chief analytics officer's role at the operational level. This implies converting business expectations into data analysis. If your core data scientist lacks domain expertise, a business analyst can bridge the gap. They are responsible for using data analytics to assess processes, determine requirements, and deliver data-driven recommendations and reports to executives and stakeholders. BAs engage with business leaders and users to understand how data-driven changes to processes, products, services, software, and hardware will be implemented, and they articulate these ideas while balancing them against what is technologically feasible and financially reasonable. The average salary for a business analyst is $75,078 per year in the United States, as per Indeed.

Skill sets required: excellent domain and industry expertise. Good communication, data visualization skills, and knowledge of business intelligence tools are also good to have.

Data visualization engineer

This role is not present in every data science team, as its responsibilities are often covered by a data analyst or a data architect; it is usually needed only in the specialized data science model. The role involves a solid understanding of UI development to create custom data visualization elements for stakeholders. Regardless of the technology, successful data visualization engineers have to understand principles of design, both graphical and, more generally, user-centered design. As per PayScale, the average salary for a data visualization engineer is $98,264.

Skill sets required: rigorous knowledge of data visualization methods and the ability to produce various charts and graphs to represent data, along with an understanding of fundamental design principles and the visual display of information.

To sum it up, the data science team has evolved to create a number of job roles and opportunities, but companies still face challenges in building the team from scratch and find it hard to figure out where to start. If you are facing a similar dilemma, check out the book Managing Data Science, written by Kirill Dubovikov. It covers concepts and methodologies to manage and deliver top-notch data science solutions, while also providing guidance on hiring, growing, and sustaining a successful data science team.

- How to learn data science: from data mining to machine learning
- How to ace a data science interview
- Data science vs. machine learning: understanding the difference and what it means today
- 30 common data science terms explained
- 9 Data Science Myths Debunked


6 reasons why Google open-sourced TensorFlow

Kunal Parikh
13 Sep 2017
7 min read
On November 9, 2015, a storm loomed over the SF Bay Area, creating major outages. At Mountain View, California, Google engineers were busy creating a storm of their own. That day, Sundar Pichai announced to the world that TensorFlow, Google's machine learning system, was going open source. He said: "...today we're also open-sourcing TensorFlow. We hope this will let the machine learning community—everyone from academic researchers, to engineers, to hobbyists—exchange ideas much more quickly, through working code rather than just research papers." The tech world may not have fully grasped the gravity of the announcement that day, but those in the know understood it was a pivotal moment in Google's transformational journey into an AI-first world.

How did TensorFlow begin?

TensorFlow grew out of a former Google system called DistBelief. DistBelief powered a program called DeepDream, built so that scientists and engineers could visualise how deep neural networks process images. As fate would have it, the algorithm went viral and people everywhere started generating abstract, psychedelic art with it. Although people were having fun playing with the image forms, most were unaware of the technology that powered those images, neural networks and deep learning, the exact things TensorFlow was built for.

TensorFlow is a machine learning platform that allows one to run a wide range of algorithms, like the aforementioned neural networks and deep learning based projects. With its flexibility, high performance, portability, and production-readiness, TensorFlow is changing the landscape of artificial intelligence and machine learning. Be it face recognition, music and art creation, or detecting clickbait headlines for blogs, the use cases are immense. With Google open sourcing TensorFlow, the platform that powers Google Search and other smart Google products is now accessible to everyone: researchers, scientists, machine learning experts, students, and others.
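To give a feel for what working with the platform looks like, here is a minimal sketch that defines and trains a tiny model with TensorFlow's Keras API. Note that this high-level API postdates the article, and the random data is purely for illustration.

```python
import numpy as np
import tensorflow as tf

# Toy data: 100 samples with 4 features and a binary label
# (random noise, for illustration only)
X = np.random.rand(100, 4).astype("float32")
y = (X.sum(axis=1) > 2.0).astype("float32")

# A small fully connected network
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Train, then evaluate on the same toy data
model.fit(X, y, epochs=5, batch_size=16, verbose=0)
loss, acc = model.evaluate(X, y, verbose=0)
print(f"accuracy: {acc:.2f}")
```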
So why did Google open source TensorFlow?

Yes, Google made a world of difference to the machine learning community at large by open sourcing TensorFlow. But what was in it for Google? As it turns out, a whole lot. Let's look at a few reasons.

Google is feeling the heat from rival deep learning frameworks

Major deep learning frameworks like Theano and Keras were already open source. Keeping a framework proprietary was becoming a strategic disadvantage, as most core deep learning users, scientists, engineers, and academics, prefer open source software for their work. "Pure" researchers and aspiring PhDs are key groups that file major patents in the world of AI. By open sourcing TensorFlow, Google gave this community access to a platform it backs for their research, making it theoretically possible to migrate the world's algorithms from other deep learning tools onto TensorFlow. AI as a trend is clearly here to stay, and Google wants a platform that leads the trend.

An open source TensorFlow can better support the Google Brain project

Behind all the PR, Google does not say much about its pet project, Google Brain. When Sundar Pichai talks of Google's transformation from Search to AI, this project is doing the work behind the scenes. Google Brain is headed by some of the best minds in the industry, like Jeff Dean, Geoffrey Hinton, and Andrew Ng, among many others. They developed TensorFlow, and they may still have state-of-the-art features up their sleeves privy only to them. After all, they have done a plethora of stunning research in areas like parallel computing, machine intelligence, and natural language processing. With TensorFlow now open sourced, this team can accelerate development of the platform and make significant inroads into the areas they are currently researching. This research can then develop into future products for Google, allowing it to expand its AI and cloud clout, especially in the enterprise market.

Tapping into the collective wisdom of the academic intelligentsia

Most innovations and breakthroughs come from universities before they go mainstream and become major products in enterprises. AI is still making this transition and will need a lot of investment in research. To work on difficult algorithms, researchers need access to sophisticated ML frameworks. Selling TensorFlow to universities is the old-school way to solve this problem; that's why we no longer hear about products like LabVIEW. Instead, by open sourcing TensorFlow, the team at Google now has some of the world's best minds working on difficult AI problems on its platform for free. As these researchers write papers on AI using TensorFlow, they keep adding to the existing body of knowledge, with access to bleeding-edge algorithms that are not yet available in the market. Google's engineers can simply pick and choose what they like and start developing commercially ready services.

Google wants to develop TensorFlow as a platform-as-a-service for AI application development

An advantage of open sourcing a tool is that it accelerates time to build and test through collaborative app development. This means most of the basic infrastructure and modules needed to build a variety of TensorFlow-based applications will already exist on the platform. TensorFlow developers can develop and ship interesting modular products by mixing and matching code and providing a further layer of customization or abstraction. What Amazon did for storage with AWS, Google can do for AI with TensorFlow. It won't come as a surprise if Google came up with its own integrated AI ecosystem, with TensorFlow on Google Cloud promising all the AI resources your company would need. Suppose you want a voice-based search function in your e-commerce mobile application: instead of completely reinventing the wheel, you could buy TensorFlow-powered services provided by Google and, with easy APIs, get voice-based search while saving substantial developer cost and time.

Open sourcing TensorFlow helps Google extend its talent pipeline in a competitive Silicon Valley jobs market

Hiring for AI development is competitive in Silicon Valley, as all the major companies vie for attention from the same niche talent pool. With TensorFlow made freely available, Google's HR team can quickly reach out to a talent pool already well versed in the technology and save on training costs. Just look at the interest TensorFlow has generated on a forum like Stack Overflow: a growing number of users are asking and inquiring about TensorFlow, and some of them will become the power users Google's HR team can tap into. A developer pool at this scale would never have been possible with a proprietary tool.

Replicating the success of, and learning from, Android

Agreed, a direct comparison with Android is not possible.
However, the size of the mobile market and Google's strategic goal of mobile-first when it introduced Android bear a striking similarity to the nascent AI ecosystem we have today and Google's current AI-first rhetoric. In just a decade since its launch, Android owns more than 85% of the smartphone OS market. Piggybacking on Android's success, Google now has control of mobile search (96.19%), services (Google Play), a strong connection with the mobile developer community, and even a viable entry into the mobile hardware market. Open sourcing Android did not stop Google from making money; it monetized through other channels like mobile search, mobile advertisements, Google Play, devices like the Nexus, and mobile payments. Google did not have all this infrastructure planned and ready before Android was open sourced; it innovated, improvised, and created along the way. In the future, we can expect Google to take key learnings from the Android growth story and apply them to TensorFlow's market expansion strategy. We can also expect supporting infrastructure and models for commercialising TensorFlow to emerge for enterprise developers.

The road to AI world domination for Google rides on the back of an open sourced TensorFlow platform. It appears not just exciting but also promises exponential growth, crowdsourced innovation, and learnings drawn from other highly successful Google products and services. The storm that started two years ago is surely morphing into a hurricane. As Professor Michael Guerzhoy of the University of Toronto says in Business Insider, "Ten years ago, it took me months to do something that for my students takes a few days with TensorFlow."

Top languages for Artificial Intelligence development

Natasha Mathur
05 Jun 2018
11 min read
Artificial Intelligence is one of the hottest technologies today. From work colleagues to your boss, chances are that most people (yourself included) want to create the next big AI project. Artificial Intelligence is a vast field, and with so many languages to choose from, it can be difficult to pick the one that will bring the most value to your project. For anyone wanting to dive into the AI space, the initial stage of choosing the right language can really decelerate the development process. Moreover, the right choice of language for AI development depends on your skills and needs. Here are the top five programming languages for Artificial Intelligence development.

1. Python

Python is, hands down, the number one programming language when it comes to AI development. Not only is it one of the most popular languages in data science, machine learning, and Artificial Intelligence in general, it is also popular among game developers, web developers, cybersecurity professionals, and others. It offers a ton of powerful machine learning and deep learning libraries and frameworks that are essential for AI development, such as TensorFlow, Theano, Keras, and scikit-learn. Python is the go-to language for AI development for most people, novices and experts alike.

Pros

- It's easy to learn thanks to its simple syntax, which helps you implement AI algorithms quickly.
- Development is faster in Python compared to Java, C++ or Ruby.
- It is a multi-paradigm language, supporting object-oriented, functional, and procedural programming styles.
- Python has a ton of libraries and tools to offer; libraries such as scikit-learn, NumPy, and CNTK are quite popular.
- It is a portable language and can be used on multiple operating systems, including Windows, macOS, Linux, and Unix.

Cons

- Integrating AI systems with non-Python infrastructure can be awkward. For example, for an infrastructure built around Java, it would be advisable to build deep learning models in Java rather than Python.

If you are a data scientist, a machine learning developer, or a domain expert such as a bioinformatician who hasn't yet learned a programming language, Python is your best bet. It is easy to learn, translates equations and logic into a few lines of code, and has a rich development ecosystem. The short sketch below gives a taste of this.
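As an illustration of how compactly Python expresses learning algorithms, here is a tiny perceptron trained on the logical AND function using nothing but NumPy. It is a hedged example of the language's brevity, not code from any particular library.

```python
import numpy as np

# Training data: the logical AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 0.0, 0.0, 1.0])

# Perceptron parameters: weights, bias, and a fixed learning rate
w, b, lr = np.zeros(2), 0.0, 0.1

for _ in range(20):  # a few passes over the data suffice
    for xi, target in zip(X, y):
        pred = float(np.dot(w, xi) + b > 0)  # step activation
        err = target - pred
        w += lr * err * xi  # the classic perceptron update
        b += lr * err

print([int(np.dot(w, xi) + b > 0) for xi in X])  # -> [0, 0, 0, 1]
```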
2. C++

C++ comes second on the list. There are cases where C++ supersedes Python, even though it is not the most common language for AI development. For instance, when working in an embedded environment where you don't want the overhead of a Java Virtual Machine or the Python interpreter, C++ is a perfect choice. C++ also has some popular libraries and frameworks for AI, machine learning, and deep learning, namely mlpack, Shark, OpenNN, Caffe, and Dlib.

Pros

- Execution in C++ is very fast, which makes it the go-to language for AI projects that are time-sensitive.
- It offers substantial support for algorithms and uses statistical AI techniques quite effectively.
- Data hiding and inheritance make it possible to reuse existing code during development.
- It is also suitable for machine learning and neural networks.

Cons

- It follows a bottom-up approach, which makes it very complex for large-scale projects.

If you are a game developer, you've already dabbled with C++ in some form or other. Given its popularity among developers, it goes without saying that choosing C++ can kickstart your AI development process for building smarter, more interactive games.

3. Java

Java is a close contender to C++. From machine learning to natural language processing, Java comes with a plethora of libraries for all aspects of AI development; it has all the infrastructure you need to create your next big AI project. Some popular Java libraries and frameworks are Deeplearning4j, Weka, and Java-ML.

Pros

- Java follows the "Write Once, Run Anywhere" (WORA) principle. Thanks to its virtual machine technology, it runs on any platform without recompilation, which makes it time-efficient.
- Java works well for search algorithms, neural networks, and NLP.
- It is a multi-paradigm language, supporting object-oriented, procedural, and functional programming styles.
- It is easy to debug.

Cons

- Java has a complex and verbose code structure, which can be time-consuming and increases development time.

If you are in software, web, or mobile development, or anywhere in between, you've worked with Java at some point, and probably still are. Most commercial apps have Java baked into them. The familiarity and robustness that Java offers is a good reason to pick it for AI development, especially if you want to enter well-established domains like banking that are historically built on Java-based systems.

4. Scala

Just like Java, Scala belongs to the JVM family. Scala is a fairly new language in the AI space, but it has been finding recognition recently in many corporations and startups. It has a lot to offer in terms of convenience, which is why developers enjoy working with it. Tools and libraries like ScalaNLP and Deeplearning4j make the AI development process easier with Scala. Let's look at the features that make it a good choice for AI development.

Pros

- It's good for projects that need scalability.
- It combines the strengths of the functional and imperative programming models into a powerful tool for building highly concurrent applications while reaping the benefits of an OO approach at the same time.
- It provides good concurrency support, which helps with projects involving real-time parallelized analytics.
- Scala has a good open source community when it comes to statistical learning, information theory, and Artificial Intelligence in general.

Cons

- Scala falls short when it comes to machine learning libraries.
- Scala includes concepts such as implicits and type classes, which may be unfamiliar to programmers coming from the object-oriented world.
- The learning curve in Scala is steep.

Even though Scala lacks machine learning libraries, its scalability and concurrency support make it a good option for AI development. With more companies such as IBM and Lightbend collaborating to build more AI applications with Scala, it's no secret that demand for Scala in AI development continues to grow.

5. R

R is a language that has been catching up in the race for AI development recently. Primarily used in academic research, R was written by statisticians, and it provides basic data management that makes analysis tasks really easy.
It’s not as pricey as statistical software namely Matlab or SAS, which makes it a great substitute for this software and a golden child of data science. Pros R comes with plenty packages that help boost its performance. There are packages available for pre-modeling, modeling and post modeling stages in data analysis. R is very efficient in tasks such as continuous regression, model validation, and data visualization. R being a statistical language offers very robust statistical model packages for data analysis such as caret, ggplot, dplyr, lattice, etc which can help boost the AI development process. Major tasks can be done with little code developed in an interactive environment which makes it easy for the developers to try out new ideas and verify them with varied graphics functions that come with R. Cons R’s major drawback is its inconsistency due to third-party algorithms. Development speed is quite slow when it comes to R as you have to learn new ways for data modeling. You also have to make predictions every time when using a new algorithm. R is one of those skills that’s mainly demanded by recruiters in data science and machine learning. Overall, R is a very clever language. It is freely available, runs on server as well as common hardware. R can help amp up your AI development process to a great extent. Other languages worth mentioning There are three other languages that deserve a mention in this article: Go, Lisp and Prolog. Let’s have a look at what makes these a good choice for AI development. Go Go has been receiving a lot of attention recently. There might not be as many projects available in AI development using Go as for now but the language is on its path to continuous growth these days. For instance, AlphaGo, is a first computer program in Go that was able to defeat the world champion human Go player, proves how powerful the language is in terms of features that it can offer. Pros You don’t have to call out to libraries, you can make use of Go’s existing machine learning libraries. It doesn’t consist of classes. It only consists of packages which make the code cleaner and clear. It doesn’t support inheritance which makes it easy to modify the code in Go. Cons There aren’t many solid libraries for core AI development tasks. With Go, it is possible to pull off core ML and some reinforcement learning tasks as well, despite the lack of libraries. But given other versatile features of Go, the future looks bright for this language with it finding more applications in AI development. Lisp Lisp is one of the oldest languages for AI development and as such gets an honorary mention. It is a very popular language in AI academic research and is equally effective in the AI development process as well. However, it is not such a usual choice among the developers of recent times. Also, most modern libraries in machine learning, deep learning, and AI are written in popular languages such as C++, Python, etc. But I wouldn’t write off Lisp yet. It still has an immense capacity to build some really innovative AI projects, if take the time to learn it. Pros Its flexible and extendable nature enables fast prototyping, thereby, providing developers with the needed freedom to quickly test out ideas and theories. Since it was custom built for AI, its symbolic information processing capability is above par. It is suitable for machine learning and inductive learning based projects. Recompilation of functions alongside the running program is possible which saves time. 
Cons Since it is an old language, not a lot of developers are well-versed with it. Also, new software and hardware have to be configured to be able to accommodate using Lisp. Given the vintage nature of Lisp for the AI world, it is quite interesting to see how things work in Lisp for AI development.  The most famous example of a lisp-based AI project is DART (Dynamic Analysis and Replanning Tool), used by the U.S. military. Prolog Finally, we have Prolog, which is another old language primarily associated with AI development and symbolic computation. Pros It is a declarative language where everything is dictated by rules and facts. It supports mechanisms such as tree-based data structuring, automatic backtracking, nondeterminism and pattern matching which is helpful for AI development. This makes it quite a powerful language for AI development. Its varied features are quite helpful in creating AI projects for different fields such as medical, voice control, networking and other such Artificial development projects. It is flexible in nature and is used extensively for theorem proving, natural language processing, non-numerical programming, and AI in general. Cons High level of difficulty when it comes to learning Prolog as compared to other languages. Apart from the above-mentioned features, implementation of symbolic computation in other languages can take up to tens of pages of indigestible code. But the same algorithms implemented in Prolog results in a clear and concise program that easily fits on one page. So those are the top programming languages for Artificial Intelligence development. Choosing the right language eventually depends on the nature of your project. If you want to pick an easy to learn language go for Python but if you are working on a project where speed and performance are most critical then pick C++. If you are a creature of habit, Java is a good choice. If you are a thrill seeker who wants to learn a new and different language, choose Scala, R or Go, and if you are feeling particularly adventurous, explore the quaint old worlds of Lisp or Prolog. Why is Python so good for AI and Machine Learning? 5 Python Experts Explain Top 6 Java Machine Learning/Deep Learning frameworks you can’t miss 15 Useful Python Libraries to make your Data Science tasks Easier

Why Neo4j is the most popular graph database

Amey Varangaonkar
02 Aug 2018
7 min read
Neo4j is an open source, distributed data store used to model graph problems. It departs from the traditional nomenclature of database technologies: entities are stored in schema-less structures called nodes, which are connected to other nodes via relationships or edges. In this article, we are going to discuss the different features and use cases of Neo4j. This article is an excerpt taken from the book 'Seven NoSQL Databases in a Week' written by Aaron Ploetz et al.

Neo4j's best features
Aside from its support of the property graph model, Neo4j has several other features that make it a desirable data store. Here, we will examine some of those features and discuss how they can be utilized in a successful Neo4j cluster.

Clustering
Enterprise Neo4j offers horizontal scaling through two types of clustering. The first is the typical high-availability clustering, in which several slave servers process data overseen by an elected master. In the event that one of the instances should fail, a new master is chosen. The second type of clustering is known as causal clustering. This option provides additional features, such as disposable read replicas and built-in load balancing, that help abstract the distributed nature of the clustered database from the developer. It also supports causal consistency, which aims to provide Atomicity, Consistency, Isolation, and Durability (ACID) compliant consistency in use cases where eventual consistency becomes problematic. Essentially, causal consistency is delivered with a distributed transaction algorithm that ensures that a user will be able to immediately read their own writes, regardless of which instance handles the request.

Neo4j Browser
Neo4j ships with Neo4j Browser, a web-based application that can be used for database management, operations, and the execution of Cypher queries. In addition to monitoring the instance on which it runs, Neo4j Browser also comes with a few built-in learning tools designed to help new users acclimate themselves to Neo4j and graph databases. Neo4j Browser is a huge step up from the command-line tools that dominate the NoSQL landscape.

Cache sharding
In most clustered Neo4j configurations, a single instance contains a complete copy of the data. At the moment, true sharding is not available, but Neo4j does have a feature known as cache sharding. This feature involves directing queries to instances that only have certain parts of the cache preloaded, so that read requests for extremely large data sets can be adequately served.

Help for beginners
One of the things that Neo4j does better than most NoSQL data stores is the amount of documentation and tutorials that it has made available for new users. The Neo4j website provides a few links to get started with in-person or online training, as well as meetups and conferences to become acclimated to the community. The Neo4j documentation is very well done and kept up to date, complete with well-written manuals on development, operations, and data modeling. The blogs and videos by the Neo4j, Inc. engineers are also quite helpful in getting beginners started on the right path. Additionally, when first connecting to your instance/cluster with Neo4j Browser, the first thing that is shown is a list of links directed at beginners. These links direct the user to information about the Neo4j product, graph modeling and use cases, and interactive examples. In fact, executing the play movies command brings up a tutorial that loads a database of movies.
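To give a feel for what querying that movie graph looks like in practice, here is a minimal sketch in Kotlin using the official Neo4j Java driver. The Bolt URI and credentials are placeholders for a local instance with the play movies data set loaded, and a recent 4.x driver release is assumed:

import org.neo4j.driver.AuthTokens
import org.neo4j.driver.GraphDatabase

fun main() {
    // Placeholder connection details for a local Neo4j instance
    val driver = GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic("neo4j", "password"))
    driver.session().use { session ->
        // Person, ACTED_IN, and Movie come from the play movies tutorial graph
        val result = session.run(
            "MATCH (a:Person)-[:ACTED_IN]->(m:Movie) RETURN a.name AS actor, m.title AS title LIMIT 5")
        result.forEach { record ->
            println("${record["actor"].asString()} acted in ${record["title"].asString()}")
        }
    }
    driver.close()
}

Each record comes back as a row of named values, so traversing relationships feels much like iterating over rows in a relational driver, except that the MATCH pattern walks edges rather than joining tables.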
This database consists of various nodes and edges that are designed to illustrate the relationships between actors and their roles in various films.

Neo4j's versatility demonstrated in its wide use cases
Because of Neo4j's focus on node/edge traversal, it is a good fit for use cases requiring the analysis and examination of relationships. The property graph model helps to define those relationships in meaningful ways, enabling the user to make informed decisions. Bearing that in mind, there are several use cases for Neo4j (and other graph databases) that seem to fit naturally.

Social networks
Social networks seem to be a natural fit for graph databases. Individuals have friends, attend events, check in to geographical locations, create posts, and send messages. All of these different aspects can be tracked and managed with a graph database such as Neo4j. Who can see a certain person's posts? Friends? Friends of friends? Who will be attending a certain event? How is a person connected to others attending the same event? In small numbers, these problems could be solved with a number of data stores. But what about an event with several thousand people attending, where each person has a network of 500 friends? Neo4j can help to solve a multitude of problems in this domain, and appropriately scale to meet increasing levels of operational complexity.

Matchmaking
Like social networks, Neo4j is also a good fit for solving problems presented by matchmaking or dating sites. In this way, a person's interests, goals, and other properties can be traversed and matched to profiles that share certain levels of equality. Additionally, the underlying model can also be applied to prevent certain matches or block specific contacts, which can be useful for this type of application.

Network management
Working with an enterprise-grade network can be quite complicated. Devices are typically broken up into different domains, sometimes have physical and logical layers, and tend to share a delicate relationship of dependencies with each other. In addition, networks might be very dynamic because of hardware failure/replacement, organization, and personnel changes. The property graph model can be applied to adequately work with the complexity of such networks. In a use case study with Enterprise Management Associates (EMA), the property graph was reported to be an excellent format for capturing and modeling the interdependencies that can help to diagnose failures. For instance, if a particular device needs to be shut down for maintenance, you would need to be aware of other devices and domains that are dependent on it, in a multitude of directions. Neo4j allows you to capture that easily and naturally, without having to define a whole mess of linear relationships between each device. The path of relationships can then be easily traversed at query time to provide the necessary results.

Analytics
Many scalable data store technologies are not particularly suitable for business analysis or online analytical processing (OLAP) uses. When working with large amounts of data, coalescing desired data can be tricky with relational database management systems (RDBMS). Some enterprises will even duplicate their RDBMS into a separate system for OLAP so as not to interfere with their online transaction processing (OLTP) workloads. Neo4j can scale to present meaningful data about relationships between different enterprise-marketing entities, which is crucial for businesses.

Recommendation engines
Many brick-and-mortar and online retailers collect data about their customers' shopping habits. However, many of them fail to properly utilize this data to their advantage. Graph databases, such as Neo4j, can help assemble the bigger picture of customer habits for searching and purchasing, and even take trends in geographic areas into consideration. For example, purchasing data may contain patterns indicating that certain customers tend to buy certain beverages on Friday evenings. Based on the relationships of other customers to products in that area, the engine could also suggest things such as cups, mugs, or glassware. Is the customer also a male in his thirties from a sports-obsessed area? Perhaps suggesting a mug supporting the local football team may spark an additional sale. An engine backed by Neo4j may be able to help a retailer uncover these small troves of insight. To summarize, we saw that Neo4j is widely used across enterprises and businesses, primarily due to its speed, efficiency, and accuracy. Check out the book Seven NoSQL Databases in a Week to learn more about Neo4j and other popularly used NoSQL databases such as Redis, HBase, and MongoDB.

Top 5 programming languages for crunching Big Data effectively
Top 5 NoSQL Databases
Is Apache Spark today's Hadoop?

Why Oracle is losing the Database Race

Aaron Lazar
06 Apr 2018
3 min read
When you think of databases, the first names that come to mind are Oracle or IBM. Oracle has been ruling the database world for decades now, and it has been able to acquire tonnes of applications that use its databases. However, that's changing now, and if you didn't know already, you might be surprised to learn that Oracle is losing the database race.

Oracle = Goliath
Oracle was and still is ranked number one among databases, owing to its legacy in the database ballpark.
Source - DB Engines
The main reason Oracle has managed to hold its position is lock-in, a CIO's worst nightmare. Migrating data that's accumulated over the years is not a walk in the park and usually has top management flinching every time it's mentioned. Another reason is that Oracle is known to be aggressive when it comes to maintaining and enforcing licensing terms. You won't be surprised to find Oracle 'agents' at the doorstep of your organisation, slapping you with a big fine for non-compliance!

Oracle != Goliath for everyone
You might wonder whether even the biggies are in the same position, locked in with Oracle. Well, the Amazons and Salesforces of the world have quietly moved away from lock-in hell and have their applications now running on open source projects. In fact, Salesforce plans to be completely free of Oracle databases by 2023 and has even codenamed this project "Sayonara". I wonder what inspired the name!

Enter the "Davids" of Databases
While Oracle's databases have been declining, alternatives like SQL Server and PostgreSQL have been steadily growing. SQL Server has been doing it in leaps and bounds, with a growth rate of over 30%. Amazon and Microsoft's cloud-based databases have seen close to 10x growth. While one might think that all cloud solutions would have dominated the database world, databases like Google Cloud SQL and IBM Cognos have seen very slow to no growth, as the question of lock-in arises again, only this time with a cloud vendor. MongoDB has been another shining star in the database race. Several large organisations like HSBC, Adobe, eBay, Forbes, and MTV have adopted MongoDB as their database solution. Newer organisations have been adopting these databases instead of looking to Oracle. However, this is not really eating into Oracle's existing market, at least not yet.

Is 18c Oracle's silver bullet?
Oracle bragged a lot about 18c last year, positioning it as a database that needs little to no human interference thanks to its ground-breaking machine learning, one that operates with less than 30 minutes of downtime a year, and much more. Does this make Microsoft and Amazon break into a sweat? Hell no! Although Oracle has strategically positioned 18c as a database that lowers operational cost by cutting down on the human element, it is still quite expensive compared to its competitors - they haven't dropped their price one bit. Moreover, it can't really automate "everything", and there's always a need for a human administrator - not really convincing enough. Quite naturally, customers will be drawn towards the competition. In the end, the way I look at it, Oracle already had a head start and is now inches from the elusive finish line, probably sniggering away at all the customers that it has on a leash, all while cloud databases are slowly catching up and will soon leave Oracle in a heap of dirt. Reminds me of that fable mum used to read to me... what's it called... The hare and the tortoise.

What software stack does Airbnb use?

Richard Gall
20 Aug 2017
4 min read
Airbnb is one of the most disruptive organizations of the last decade. Since its inception in 2008, the company has developed a platform that allows people to 'belong anywhere' (to quote their own mission statement). In doing so, the very nature of tourism has changed. But what software does Airbnb use? What tools are enabling their level of innovation?

How Airbnb develops a dynamic front end
Let's start with the key challenge for Airbnb. Like many similar platforms, one of the central difficulties is handling data in a way that's incredibly dynamic. That means you need to ensure your JavaScript is working hard for you without taking too much strain. That's where a reactive approach comes in. As an asynchronous paradigm, it's able to manage how data moves from source to the components that react to it. But the paradigm can only do so much. By using ReactJS, Airbnb have a library that is capable of giving you the necessary dynamism in your UI. The Airbnb team have written a lot on their love for ReactJS, making it their canonical front end framework in 2015. But they've also built a large number of other tools around React to make life easier for their engineers. In this post, for example, the team discuss React Sketch.app, which 'allows you to write React components that render to Sketch documents.' Elsewhere, Ruby also forms an important part of the development stack. However, as with React, the team are committed to innovating with the tools at their disposal. In this post, they discuss how they built 'blazing fast thrift bindings for Ruby with C extensions.'

How Airbnb manages data
If managing data on the front end has been a crucial part of their software consideration, what about the tools that actually manage and store data? The company use MySQL to manage core business data; this hasn't been without challenges - not least because of scalability. However, the team have found ways of making MySQL work to their advantage. Redis is also worth a mention here - read here how Airbnb use Redis to monitor customer issues at scale. But Airbnb have always been a big data company at heart - that's why Hadoop is so important to their data infrastructure. A number of years ago, Airbnb ran Hadoop on Mesos, which allows you to deploy a single configuration on different servers; this worked for a while but, owing to a number of challenges (which you can read about here), the team moved away from Mesos and now run a more straightforward Hadoop infrastructure. Spark is also an important tool for Airbnb. The team actually built something called Airstream, a computational framework that sits on top of Spark Streaming and Spark SQL, allowing engineers and the data team to get quick insights. Ultimately, for an organization that depends on predictions and machine learning, something like Spark - alongside other open source machine learning libraries - is crucial in the Airbnb stack.

Cloud - how Airbnb takes advantage of AWS
If you take a close look at how they work, the Airbnb team have a true hacker mentality, where it's about playing, building, and creating new tools to tackle new challenges. This has arguably been enabled by the way they use AWS. It's perhaps no coincidence that around the time Airbnb was picking up speed and establishing itself, the Amazon cloud offering was reaching maturity. Airbnb adopted a number of AWS services, such as S3 and EC2, early on. But the reason Airbnb have stuck with AWS comes down to cultural fit.
"For us, an investment in AWS is really about making sure our engineers are focused on the things that are uniquely core to our business. Everything that we do in engineering is ultimately about creating great matches between people," Kevin Rice, Director of Engineering, has said.

How Airbnb creates a DevOps culture
But there's more to it than AWS; there's a real DevOps culture inside Airbnb that further facilitates a mixture of agility and creativity. The tools used for DevOps are an interesting mix - some unsurprising, like GitHub and Nginx (which powers some of the busiest sites on the planet), and some slightly more surprising choices, such as Kibana, which the company uses to monitor data alongside Elasticsearch. When it comes to developing and provisioning environments, Airbnb use Vagrant and Chef. It's easy to see the benefits here - they make setting up and configuring environments incredibly easy and fast. And if you're going to live by the principles of DevOps, this is essential - it's the foundation of everything you do.

How to develop a game concept

Raka Mahesa
18 Sep 2017
5 min read
You may have an idea or a concept for a game, and you may want to make a full game based on that concept. Congratulations, you're now taking the first step in the game development process. But you may be unsure of what to do next with your game concept. Fortunately, that's what we're here to discuss.

How to find inspiration for a game idea
A game idea or concept can come from a variety of places. You may be inspired by another medium, such as a film or a book, you may have had an exciting experience and want to share it with others, you may be playing another game and think you can do better, or you may just have a sudden flash of inspiration out of nowhere. Because ideas can come from a variety of sources, they can take on a number of different forms and levels of robustness. So it's important to take a step back and have another look at this idea of yours.

How to create a game prototype
What should you do after your game concept has been fleshed out? Well, the next step is to create a simple prototype based on your game concept to see if it is viable and actually fun to play. Wait, what if this is your first foray into game development and you barely have any programming skill? Well, fortunately, developing a game prototype is a good entry to the world of programming. There are many game development tools out there, like GameMaker, Stencyl, and Construct 2, that can help you quickly create a prototype without having to write too many lines of code. These tools are so useful that even seasoned programmers use them to quickly build a prototype.

Should I use a game engine to prototype?
Should you use full-featured, professional game engines for making a prototype? Well, it's completely up to you, but one of the purposes of making a prototype is to be able to test out your ideas easily, so that when an idea doesn't work out, you can tweak it quickly. A full-featured game engine, even though it's powerful, may take longer for simple tasks, and you end up not being able to iterate on your game quickly enough. That's also why most game prototypes are made with just simple shapes or very simple graphics. Creating those kinds of graphics doesn't take a lot of time and allows you to iterate on your game concept quickly. Imagine you're testing out a game concept and find out that enemies that just randomly hop around aren't fun, so you decide to make those enemies simply run on the ground. If you're just using a red square for your hopping enemies, you can use the same square for running enemies. But if you're using, say, frog images for those enemies, you will have to switch to a different image when you want the enemies to run.

Why is prototyping so important in game development?
You may wonder why the emphasis is on creating a prototype instead of building the actual game. After all, isn't fleshing out a game concept supposed to make sure the game is fun to play? Well, unfortunately, what seems fun in theory may not be actually fun in practice. Maybe you thought that having a jump-stamina mechanic would make things more exciting for a player, but after prototyping such a system, you may discover that it actually slows things down and makes the game less fun. Also, prototyping is not just useful for measuring a game's fun; it's also useful for making sure the player has the kinds of experiences that the game concept wants to deliver. Maybe you have an idea for a game where the hero fights many enemies at once so the player can experience an epic battle. But after you prototype it, you may find that the game feels chaotic instead of epic. Fortunately, with a prototype, you can quickly tweak the variables of your enemies to make the game feel more epic and less chaotic.

Using simple graphics
Using simple graphics is important for a game prototype. If players can have a good experience with a prototype that uses simple graphics, imagine the fun they'll have with the final graphics. Simple graphics are good because the experience the player feels is due to the game's functions, and not because of how the game looks.

Next steps
After you're done building the prototype and have proven that your game concept is fun to play, you can move on to the next step in the game development process. Your next step depends on the sort of game you want to make. If it's a massive game with many systems, you might want to create a proper game design document that includes how you want to expand the mechanics of your game. But if the game is on the small side with simple mechanics, you can start building the final product and assets. Good luck on your game development journey!

Raka Mahesa is a game developer at Chocoarts (http://chocoarts.com/), who is interested in digital technology in general. Outside of work hours, he likes to work on his own projects, with Corridoom VR being his latest released game. Raka also regularly tweets as @legacy99.

5 DIY IoT projects you can build under $50

Vijin Boricha
29 Jun 2018
5 min read
Lately, IoT has begun to play an integral part in various industries, be it at the consumer level or on the enterprise side. With a lot of big players like Apple, Microsoft, Amazon, and Google entering this market, IoT adoption has scaled tremendously. It has jumped from a hobbyist pursuit to an industry infrastructure where everything runs on smart devices that can talk to each other. The bulk release of popular IoT products proves that this market is getting bigger, and a lot of individuals have been amazed by home automation products such as Amazon Alexa, Apple HomePod, Google Home, and others. These devices are among the most sought-after things for hobbyists and enthusiasts interested in doing simple automation with sensors. Following are 5 IoT project ideas that you can build without burning a hole in your pocket. To learn how to actually build similar kinds of projects, check out our books:

Internet of Things with Raspberry Pi 3
Smart Internet of Things Projects
Raspberry Pi 3 Home Automation Projects

Weather monitoring station
This project will not only help you measure the room temperature but will also help you measure the altitude and the pressure in the room. For this project you will need the Adafruit Starter Pack for Windows 10 IoT Core on the latest Raspberry Pi kit. Along with the Raspberry Pi kit, you will also be using other sensors that read temperature, pressure, and altitude. To make your weather station more advanced, you can connect the device to your cloud account to store the weather data.
Hardware: Raspberry Pi 2 or 3; breadboard; Adafruit BMP280 Barometric Pressure & Altitude Sensor
Software: Windows 10 IoT Core
Approximate total cost: less than $60

Facial recognition door
Self-built home security projects are some of the most popular DIY projects, because they can be cheaper and simpler than bulky professional installations. Here's a project that controls entry access using facial recognition, thanks to Microsoft Project Oxford. This project from Mazudo, based on Raspberry Pi and Windows IoT, is posted on Hackster.io. This is a handy project for DIY enthusiasts who want to build a quick security lock for their homes.
Hardware: Raspberry Pi 3; breadboard; USB camera; relay switch; speaker
Software: Windows 10 IoT Core
Approximate total cost: less than $50

Your very own Alexa Echo
Alexa Echo has always been a handy device, which can take notes, schedule reminders for your appointments, and play podcasts for you. Brilliant, isn't it? You can build a fully functional, customized Alexa Echo with all the features of Alexa, apart from access to official music servers like Amazon Prime. It will also integrate with recently included third-party apps like Todoist and Any.do. This DIY Echo can also be connected to your phone to manage notifications when the timer goes off, and so on. The only thing your DIY Echo will be missing is the ability to function as a Bluetooth speaker.
Hardware: Raspberry Pi 3; breadboard; USB speaker and mic
Software: Raspbian
Approximate total cost: less than $50

Pet feeder
You surely don't want your pet to starve when you're away, do you? This customized pet feeder is controlled via the internet; set timings and it feeds your pet automatically later. These pet feeders connect directly to WiFi using the ESP8266 chip. You can easily add features like controlling the device from your phone and building dashboards using Freeboard. This project can later be upgraded, or simply reprogrammed, to fill your snack bowl at regular intervals as well.
Hardware: Arduino; PIR motion sensor; ESP8266 ESP-01
Software: Arduino IDE; ESP8266Flasher.exe
Approximate total cost: less than $40

Video surveillance robot
Video surveillance is the process of monitoring a scenario, person, or environment as a whole. A video surveillance robot can capture the activities happening in the surroundings where it is deployed and can be controlled using a GUI. For further enhancements, you can even connect your device to the cloud and save the recorded data there.
Hardware: Raspberry Pi; ARM Cortex-A7 CPU; L293 motor driver
Software: Raspbian
Approximate total cost: less than $50

These are a few economical yet highly useful IoT projects which can be leveraged to improve your daily activities. Still not convinced? Think of it this way: buying the microchip board is a one-time investment, as it can be reused in separate projects, and the sensors and other peripherals aren't that expensive. You might say it's just way easier to buy an IoT device. I would argue that buying an IoT device is not as satisfying as building one for the same purpose. In the end, there are multiple advantages to building one: you can brag about it to your friends and, most importantly, include it in your resume to give you that edge over others in an interview.

Cognitive IoT: How Artificial Intelligence is remoulding Industrial and Consumer IoT
Windows 10 IoT Core: What you need to know
5 reasons to choose AWS IoT Core for your next IoT project

Top 5 open source static site generators

Sugandha Lahoti
21 May 2018
6 min read
Static sites are back and stronger than ever. A large number of businesses have realized the value of sticking to trendy, beautiful, static websites, which involve less hassle with server maintenance and fewer security exploits. For example, Nest and MailChimp, two companies known for strong design, use static site generators for their primary websites. Vox Media has built an entire publishing system around the Middleman static site generator. A static website contains web pages coded in HTML, with fixed content, so they look the same to every user. These websites are made using static site generators, which automate the process of creating websites, with minimal coding required from developers. If you're looking to implement static sites in your next business project, we have compiled a list of the top 5 static site generators to help you design interactive and fast websites. Before we dive in, let's first understand when choosing static sites makes sense.

Why choose static sites?
Static sites typically take content stored in flat files, as opposed to dynamic sites, where databases serve as content stores. This content is applied against templates and used to generate a structure of static HTML files. These static files function as the website for the users. Agreed, they lack real-time content and have limited functionality. But these static sites come in really handy when you want to avoid the hassle of server maintenance while also keeping your pocket light. Not to mention, they are the best option available when your product doesn't require frequent updates. Another important factor contributing to their popularity is the ability to be indexed easily by search engines such as Google. Since Google has indicated site speed to be one of the signals used to rank pages, static sites have truly shone through. With pure HTML static websites, you have total control over your SEO, and the HTML and CSS are fully understood by search engines. Unlike with dynamic websites, you don't need a special plugin to manage your SEO or to optimize page load time. Static sites are fast loading, secure, and, most importantly, well prepared for traffic surges. This is why their popularity keeps surging alongside the growth of online content publishing. Now that I have ignited your interest in making your next website static, let's look at some frameworks for building these sites. Static site generators have exploded in popularity in recent years, with a total of more than 100,000 GitHub stars across static website generator repositories. Navigating the wide range of choices can be difficult, so here are my top five picks to get you started.

Jekyll: The most mature player
Jekyll is perhaps the most mature and popular static site generator (quite obvious from the GitHub stars). It is built with Ruby and is typically used for transforming plain text into static websites and blogs. It takes a directory filled with text files, renders that content with Markdown and Liquid templates, and generates a publish-ready static website. Jekyll comes with the big bonus of being natively supported by GitHub Pages, so you can easily deploy your site using GitHub for free. It also has a huge community and a wide array of plugins, making it easier for WordPress and Drupal developers to import content.

Hugo: The fastest player
Blazingly fast, Hugo is a static HTML and CSS website generator built around Google's Go programming language. It is optimized for speed, ease of use, and configurability. As with Jekyll, Hugo takes a directory of text files and templates, albeit written in Go, and generates them into a full HTML website.
- It is extremely fast, with build times of less than 1 ms per page.
- It is cross-platform, with easy installation on macOS, Linux, Windows, and more.
- It renders changes on the fly with LiveReload as you develop.
- It provides full i18n support for multi-language sites.

Hexo: The one-command player
Hexo is a powerful framework built with Node.js. It offers super fast rendering, even for extremely large sites.
- Hexo is highly extensible, offering support for GitHub Flavored Markdown and most Octopress plugins.
- It has one-command deployment to GitHub Pages, Heroku, and other sites.
- Hexo also features a powerful plugin system. You can install plugins for Jade and CoffeeScript, and use many Jekyll plugins with minor adjustments.

Gatsby: The multi-tasker
Gatsby is a static site generator for React. It is optimized for speed, loading only the critical parts first. Once loaded, Gatsby prefetches resources for other pages so that clicking around the site feels incredibly fast.
- Gatsby.js can also be used to generate static Progressive Web Apps.
- It does automatic routing based on the directory structure. The HTML code is generated server-side, and no additional code needs to be included to configure the router.
- Gatsby has a pre-configured Webpack-based build system and allows easy data integration from CMSs, SaaS services, APIs, databases, and filesystems.

VuePress: The new player
VuePress, the new player in town, is a minimalistic static site generator powered by Vue.js. VuePress creates a single-page application with pre-rendered static HTML from Markdown files, powered by Vue, Vue Router, and Webpack. It is composed of two parts:
- A theming system
- A default theme optimized for writing technical documentation, with header-based search, a customizable navbar and sidebar, an optional homepage, auto-generated GitHub links, and page edit links.
VuePress also comes with integrated Google Analytics and multi-language support.

Here's a short summary of all five static site generators:

Jekyll - 34k+ GitHub stars - Ruby - Liquid templates - Most mature and popular; supported by GitHub Pages; wide array of plugins
Hugo - 25k+ GitHub stars - Go - Go templates - Extremely fast; cross-platform; renders changes on the fly; full i18n support for multi-language sites
Hexo - 22k+ GitHub stars - JavaScript - EJS, Pug - Highly extensible; one-command deployment; powerful plugin system
Gatsby - 21k+ GitHub stars - JavaScript - React - Optimized for speed; generates static PWAs; pre-configured Webpack-based build system
VuePress - 7k+ GitHub stars (growing fast) - JavaScript - Vue - Theming system; default theme; Google Analytics support; multi-language support

Apart from these, you also have Next, GitBook, Nuxt, and Pelican, among others, to choose from. Before settling on a static site generator, you first need to make an informed decision on whether a static site is right for your next project. Consider your website's needs and the kind of business you're running. If your website has too much going on, it may be killing your traffic. In such cases, having a fast, secure, and beautiful static site is much more beneficial than a massive, unwieldy dynamic website.
Firefox 60 arrives with exciting updates for web developers: Quantum CSS engine, new Web APIs and more [news]
Get ready for Bootstrap v4.1; Web developers to strap up their boots [news]
How to create a generic reusable section for a single page based website [tutorial]

What role does Linux play in securing Android devices?

Sugandha Lahoti
07 Oct 2018
9 min read
In this article, we will talk about the Android model, particularly the Linux kernel layer, over which Android is built. We will also talk about Android's security features and offerings and the role Linux plays in securing Android OS. This article is taken from the book Practical Mobile Forensics - Third Edition by Rohit Tamma et al. In this book, you will investigate, analyze, and report on iOS, Android, and Windows devices.

The Android architecture
Android is open source, and the code is released under the Apache license. Practically, this means anyone (especially device manufacturers) can access it, freely modify it, and use the software according to the requirements of any device. This is one of the primary reasons for its wide acceptance. Notable players that use Android include Samsung, HTC, Sony, and LG. As with any other platform, Android consists of a stack of layers running one above the other. To understand the Android ecosystem, it's essential to have a basic understanding of what these layers are and what they do. The following figure summarizes the various layers involved in the Android software stack:
Android architecture
Each of these layers performs several operations that support specific operating system functions. Each layer provides services to the layers lying on top of it.

The Linux kernel layer
Android OS is built on top of the Linux kernel, with some architectural changes made by Google. There are several reasons for choosing the Linux kernel. Most importantly, Linux is a portable platform that can be compiled easily on different hardware. The kernel acts as an abstraction layer between the software and hardware present on the device. Consider the case of a camera click. What happens when you take a photo using the camera button on your device? At some point, the hardware instruction (pressing a button) has to be converted to a software instruction (to take a picture and store it in the gallery). The kernel contains drivers to facilitate this process. When the user presses the button, the instruction goes to the corresponding camera driver in the kernel, which sends the necessary commands to the camera hardware, similar to what occurs when a key is pressed on a keyboard. In simple words, the drivers in the kernel control the underlying hardware. The Linux kernel is responsible for managing the core functionality of Android, such as process management, memory management, security, and networking. Linux is a proven platform when it comes to security and process management. Android has leveraged the existing Linux open source OS to build a solid foundation for its ecosystem. Each version of Android has a different version of the underlying Linux kernel. The Marshmallow Android version is known to use Linux kernel 3.18.10, whereas the Nougat version is known to use Linux kernel 4.4.1.

Android security
Android was designed with a specific focus on security. Android as a platform offers and enforces certain features that safeguard the user data present on the mobile through multi-layered security. There are certain safe defaults that will protect the user, and certain offerings that can be leveraged by the development community to build secure applications. The following are issues to be kept in mind while incorporating Android security controls:
- Protecting user-related data
- Safeguarding the system resources
- Making sure that one application cannot access the data of another application
The next few sections will help us understand more about Android's security features and offerings.

Secure kernel
Linux has evolved as a trusted platform over the years, and Android has leveraged this fact by using it as its kernel. The user-based permission model of Linux has, in fact, worked well for Android. As mentioned earlier, there is a lot of Android-specific code built into the Linux kernel. With each Android version release, the kernel version has also changed. The following table shows Android versions and their corresponding kernel versions:

Android version - Linux kernel version
1 - 2.6.25
1.5 - 2.6.27
1.6 - 2.6.29
2.2 - 2.6.32
2.3 - 2.6.35
3.0 - 2.6.36
4.0 - 3.0.1
4.1 - 3.0.31
4.2 - 3.4.0
4.3 - 3.4.39
4.4 - 3.8
5.0 - 3.16.1
6.0 - 3.18.1
7.0 - 4.4.1

The permission model
Any Android application must be granted permissions by the user to access sensitive functionality, such as the internet, dialer, and so on. This provides an opportunity for the user to know in advance which functions on the device are being accessed by the application. Simply put, an app needs the user's permission before it can do anything potentially harmful (stealing data, compromising the system, and so on). This model helps the user to prevent attacks but, if the user is unaware and gives away a lot of permissions, it leaves them in trouble (remember, when it comes to installing malware on any device, the weakest link is always the user). Until Android 6.0, users needed to grant the permissions at install time. Users had to either accept all the permissions or not install the application. But, starting from Android 6.0, users grant permissions to apps while the app is running. This new permission system also gives the user more control over the app's functionality by allowing the user to grant selective permissions. For example, a user can deny a particular app access to his location but provide access to the internet. The user can revoke the permissions at any time by going to the app's Settings screen.

Application sandbox
In Linux systems, each user is assigned a unique user ID (UID), and users are segregated so that one user cannot access the data of another user. However, all applications under a particular user are run with the same privileges. Similarly, in Android, each application runs as a unique user. In other words, a UID is assigned to each application, and it is run as a separate process. This concept ensures an application sandbox at the kernel level. The kernel manages the security restrictions between the applications by making use of existing Linux concepts, such as UIDs and GIDs. If an application attempts to do something malicious, say to read the data of another application, this is not permitted, as the application does not have the required user privileges. Hence, the operating system protects an application from accessing the data of another application.

Secure inter-process communication
Android offers secure inter-process communication through which one activity in an application can send messages to another activity in the same application or a different application. To achieve this, Android provides inter-process communication (IPC) mechanisms: intents, services, content providers, and so on.
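To tie the runtime permission model described above to concrete code, here is a minimal Kotlin sketch of an Activity that checks and requests a single permission on Android 6.0+, using the AndroidX support libraries. The permission chosen and the request code are arbitrary placeholders, and the same permission is assumed to be declared in the app's manifest:

import android.Manifest
import android.content.pm.PackageManager
import android.os.Bundle
import androidx.appcompat.app.AppCompatActivity
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat

class MainActivity : AppCompatActivity() {

    private val locationRequestCode = 42  // arbitrary request code

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        // Check whether the user has already granted the permission
        if (ContextCompat.checkSelfPermission(this, Manifest.permission.ACCESS_FINE_LOCATION)
                != PackageManager.PERMISSION_GRANTED) {
            // On Android 6.0+ this shows the system dialog at runtime;
            // the user can grant or deny just this one permission
            ActivityCompat.requestPermissions(
                this, arrayOf(Manifest.permission.ACCESS_FINE_LOCATION), locationRequestCode)
        }
    }

    override fun onRequestPermissionsResult(
        requestCode: Int, permissions: Array<out String>, grantResults: IntArray) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults)
        if (requestCode == locationRequestCode &&
            grantResults.firstOrNull() == PackageManager.PERMISSION_GRANTED) {
            // Safe to use location APIs from here on
        }
    }
}

Because the user can revoke a permission from the Settings screen at any time, well-behaved apps repeat the checkSelfPermission() call before each sensitive operation rather than caching the result.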
Application signing
It is mandatory that all installed applications are digitally signed. Developers can place their applications in Google's Play Store only after signing them. The private key with which the application is signed is held by the developer. Using the same key, a developer can provide updates to their application, share data between applications, and so on.

Security-Enhanced Linux
Security-Enhanced Linux (SELinux) is a security feature that was introduced in Android 4.3 and fully enforced in Android 5.0. Until this addition, Android security was based on Discretionary Access Control (DAC), which means applications can ask for permissions, and users can grant or deny those permissions. Thus, malware can create havoc on phones by gaining those permissions. But SE Android uses Mandatory Access Control (MAC), which ensures that applications work in isolated environments. Hence, even if a user installs a malware app, the malware cannot access the OS and corrupt the device. SELinux is used to enforce MAC over all the processes, including the ones running with root privileges. SELinux operates on the principle of default denial: anything that is not explicitly allowed is denied. SELinux can operate in one of two global modes: permissive mode, in which permission denials are logged but not enforced, and enforcing mode, in which denials are both logged and enforced.

Full Disk Encryption
With Android 6.0 Marshmallow, Google has mandated Full Disk Encryption (FDE) for most devices, provided that the hardware meets certain minimum standards. Encryption is the process of converting data into ciphertext using a secret key. On Android devices, full disk encryption refers to the process of encrypting all user data using a secret key. This key is then encrypted by the lock screen PIN/pattern/password before being securely stored in a trusted location. Once a device is encrypted, all user-created data is automatically encrypted before being written to disk, and all reads automatically decrypt data before returning it to the calling process. Full disk encryption in Android works only with an Embedded Multimedia Card (eMMC) and similar flash devices that present themselves to the kernel as block devices. Starting from Android 7.x, Google decided to shift from full-disk encryption to file-based encryption. In file-based encryption, different files are encrypted with different keys. By doing so, those files can be unlocked independently without requiring an entire partition to be decrypted at once. As a result, the system can now decrypt and use the files needed to boot the system, and show notifications, without having to wait until the user unlocks the phone.

Trusted Execution Environment
A Trusted Execution Environment (TEE) is an isolated area (typically a separate microprocessor) intended to guarantee the security of data stored inside it, and also to execute code with integrity. The main processor on mobile devices is considered untrusted and cannot be used to store secret data (such as cryptographic keys). Hence, the TEE is used specifically to perform such operations, and the software running on the main processor delegates any operations that require the use of secret data to the TEE processor. Thus, we talked about the Linux kernel layer, over which Android is built. We also talked about Android's security features and offerings, and how Linux plays a role in securing Android OS.
To learn more about methods for accessing the data stored on Android devices, read our book Practical Mobile Forensics - Third Edition.

The kernel community attempting to make Linux more secure
Google open sources Filament – a physically based rendering engine for Android, Windows, Linux and macOS
Google becomes a new platinum member of the Linux Foundation

Forget C and Java. Learn Kotlin: the next universal programming language

Sugandha Lahoti
11 May 2018
14 min read
Kotlin is fast moving towards becoming the universal programming language. What is a universal programming language? From a simplistic view, the expectation could be that one language is used for all types of programming. While that may be far-fetched in today's complex world, the expectation could be adjusted to one language becoming the dominant programming language. Most certainly, it is the single most important language to master.

This article is an excerpt from the book Kotlin Blueprints, written by Ashish Belagali, Hardik Trivedi, and Akshay Chordiya. With this book, you will learn how to design and prototype professional-grade applications using various features of Kotlin.

Historically, different languages have used strategies appropriate for their times to become the universal programming language:

In the 1970s, C became the universal programming language. Prior to C, the programming languages of the world were divided between low-level and high-level languages, the former being the languages that were close to machine code and the latter being ones that were more concise and worked better for human understanding. The C programming language was developed as a single language that could work as both a low-level and a high-level language. The Unix operating system was showcased as one that was built ground-up entirely on C, without needing another low-level language.

In the 1990s, Java became the universal programming language with the Write Once Run Anywhere strategy. Prior to Java, developers needed to create different programs to run on different platforms (different operating systems running on different hardware needed different programs to run). However, with Java, programs could be written targeting a single platform, namely the Java Virtual Machine (JVM). The JVM is available on all the popular platforms and takes care of all platform-specific nuances. The Java language became the universal language by being the language in which to write programs for the JVM.

Another two decades have passed, and the stage is all set to welcome the next universal language. Let's examine Kotlin's strategy to become that. Why can Kotlin be described as a better Java than any other language? How does Kotlin address areas beyond the Java world? What is Kotlin's winning strategy? What does this all mean for a smart developer?

Why Kotlin vs Java?
Why is being a better Java important for a language? For over a decade, Java has consistently been the world's most widely used programming language. Therefore, a language that gets crowned as being a better Java should automatically attract the attention of the world's single largest community of programmers: the Java programmers. The TIOBE index is widely referred to as a gauge of the popularity of programming languages. As of August 2017, the interesting point is that while Java has been the #1 programming language in the world for the last 15 years or so, it has been in a steady state of decline for many years now. Many new languages have kept coming, and existing ones have kept improving, chipping steadily into Java's developer base; however, none of them have managed to take the #1 position from Java so far. Today, Kotlin is poised to become the most serious challenger for the better Java crown, and subsequently, to take the first place, for reasons that we will see shortly. Presently in 41st place, Kotlin is marching ahead at a fast pace. In May 2017, Google announced Kotlin to be an officially supported language for Android development, in league with Java. This has turned out to be a major boost for Kotlin, and the rate of its adoption has accelerated ever since.

Why not other languages?
Many languages prior to Kotlin have tried to become a better Java. Let's see why they could never become one. Every language attracts the programmer community by giving them the ability to do something that was cumbersome before. Their adoption is directly driven by how much value the promise has for them and how much faith the community can put into that promise. All languages or frameworks that claimed to be a better Java and offered something worthwhile beyond what Java offers also took something back in turn. Here are a few examples:

The .NET framework has been the longtime rival of Java and has supported multiple languages from day one. Based on the lessons learned from Java, the .NET designers came up with better language constructs. However, the biggest hurdle for .NET was that it was a proprietary technology, and that created an impediment to its adoption. Also, .NET was more aggressive in adding newer language constructs. While the framework evolved quickly as a result of that, it broke its backward compatibility many times.

Ruby (and Python) offered shortened code, enticing programming constructs, and greater expressiveness as opposed to the boring Java; however, they took away static typing support (which helps to make robust programs) and made the programs slower.

Scala offered shortened code and advanced programming constructs, without sacrificing typing safety. However, Scala is complex and has a substantially high learning curve. It supports multiple coding styles, so there is a danger that Scala code written by one developer may not be understood easily by another. These are risk factors for any project that includes a team of developers and where the application is expected to be supported over a long period, which is true of most applications anyway.

Why Kotlin?
Unlike other languages, Kotlin offers a lot of power over Java, while not taking anything away. Let's see how. Kotlin is interoperable with Java. It is possible to write applications containing both Java and Kotlin code, calling one from the other. Calling Java code from Kotlin is simpler than the other way around, but the former will be the case most of the time anyway, where new Kotlin code is added on top of legacy Java code. Kotlin is interoperable and can use all the Java libraries and legacy code without requiring any code conversion. It is possible to inject Kotlin into a Java project without boiling the ocean.

Concise yet expressive code
While being interoperable, Kotlin code is far superior to Java code. Like Scala, Kotlin uses type inference to cut down on a lot of boilerplate code and make it concise. (Type inference is a better feature than dynamic typing, as it reduces the code without sacrificing the robustness of the end product.) However, unlike Scala, Kotlin code is easy to read and understand, even for someone who may not know Kotlin.
Kotlin's data class construct is the most prominent example of this conciseness:

data class Employee(val id: Long, var name: String)

Compared to its Java counterpart, the preceding line packs in the class definition, member variables, constructor, getter-setter methods, and also utility methods such as equals() and hashCode(). This would easily take 15-20 lines of Java code. The data class construct is not an isolated example. There are many others where the syntax is concise and expressive. Consider the following additional examples (both are illustrated in the sketch at the end of this section):

- Kotlin's default values for function parameters save the need to overload functions
- Kotlin's extension functions can be used to add domain-specific functionality to existing classes, making it easy for someone from the domain to understand

Enhanced robustness

Statically typed languages have a built-in safety net because of the assurance that the compiler will catch any incorrect type cast. Both Java and Kotlin support static typing. With generics introduced in Java 1.5, both fare better than the Java releases prior to 1.5. However, Kotlin takes a big step further in addressing the null pointer error. This error causes a lot of checks in Java programs:

String s = someOperation();
if (s != null) {
    ...
}

One can see that the null check is not needed if someOperation() never returns null. On the other hand, it is possible for a programmer to omit the null check while someOperation() returning null is a valid case. With Kotlin, the definition of someOperation() itself declares whether it returns String or String?, and this has implications for the subsequent code, so the developer just cannot go wrong. Refer to the following snippets:

fun someOperation(): String    // not nullable
fun someOperation(): String?   // nullable

val s = someOperation()        // non-nullable version
if (s != null) {               // null check not needed – editor warning
    ...
}

val s = someOperation()        // nullable version
val n = s.length               // error: null check imposed by the compiler
val n = s?.length ?: 0         // handling the null condition

One may point out that Java developers can use the @Nullable and @NotNull annotations or the Optional class; however, these were added quite late, most developers are not aware of them, and they can always get away with not using them, as the code does not break. Finally, they are not as elegant as putting a question mark. There is also a subtle point here. If Kotlin developers are careless, they write just the type name, which automatically becomes a non-nullable declaration. To make it nullable, they have to key in that extra question mark deliberately. Thus, the language keeps you on the side of caution as far as robust code is concerned.

Another example of this robustness is found in the val/var declarations. Seasoned programmers know that most variables get a value assigned to them only once. In Kotlin, you declare such a variable with val. At the time of variable declaration, the programmer has to choose between val and var, and so puts some thought into it. In Java, on the other hand, you can get away with just declaring the type with its name, and you will rarely find Java code that defines a variable with the final keyword, which is Java's way of declaring that a variable can be assigned a value only once. Basically, with the same maturity level of programmers, you can expect relatively more robust code in Kotlin than in Java, and that's a big win from the business perspective.
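The following minimal sketch pulls together the constructs discussed in this section: default parameter values, an extension function, and val/var declarations. All names (connect, toSlug, and so on) are hypothetical and only for illustration:

// Default parameter values replace a family of Java-style overloads
fun connect(host: String, port: Int = 80, secure: Boolean = false) {
    println("Connecting to $host:$port (secure=$secure)")
}

// An extension function adds domain-specific behaviour to an existing class
fun String.toSlug() = trim().replace(" ", "-")

fun main() {
    connect("example.com")                // uses both defaults
    connect("example.com", port = 8080)   // overrides only the port

    val title = "Kotlin Blueprints"       // val: assigned exactly once
    var hits = 0                          // var: may be reassigned
    hits += 1
    println("${title.toSlug()} has $hits hit(s)")
}

In Java, connect() would need three overloads, toSlug() would need a separate utility class, and nothing would nudge the programmer towards making title effectively final.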
Excellent IDE support from day one

Kotlin comes from JetBrains, who also develop a well-known Java integrated development environment (IDE): IntelliJ IDEA. The JetBrains developers made sure that Kotlin has first-class support in IDEA. Not only that, they also developed a Kotlin plugin for Eclipse, the most widely used Java IDE.

Contrast this with the situation when Java appeared on the scene roughly two decades ago. There was no good IDE support; programmers had to use simple text editors. Coding Java was hard, with no safety net provided by an IDE, until the Eclipse editor was open sourced. In the case of Kotlin, editor suggestions have been available from day one, which means developers can learn the language faster, make fewer mistakes, and write good-quality compilable code with relative ease. Clearly, Kotlin does not want to waste any time in climbing the ladder of popularity.

Beyond being a better Java

We saw that on the JVM platform, Kotlin is neat and quite superior. However, Kotlin has set its eyes beyond the JVM. Its strategy is to win based on its superior and modern feature set. Before we go ahead, let's list the top five appeals of Kotlin:

- Static typing (as in C or Java) means there is built-in type safety. The compiler catches any incorrect type assignments. This makes programs robust.
- Kotlin is concise and expressive. Being concise implies that there is less to read and maintain; being expressive implies better maintainability.
- Being a JVM language, Kotlin programs can take advantage of the features built into the JVM, such as its cross-platform nature, memory management, high performance, and sandbox security.
- Kotlin has inbuilt null safety. Null references are famous as the billion-dollar mistake, as their inventor Tony Hoare admitted, and they cost programs a great deal of unnecessary null checks. Kotlin eliminates those and makes programs more robust.
- Kotlin is easy to learn, especially for Java developers. Its syntax is clean and easy to understand, so Kotlin programs are fun for developers to write and easy for their peers to read and enhance. From a business angle, they are more robust and easier to maintain.

Kotlin is in the winning camp

The features of Kotlin receive good validation when one considers that other languages with similar features are also growing in popularity:

- The Crystal language attracts Ruby programmers by adding static typing support. Similarly, TypeScript adds static typing support to JavaScript and has become the preferred language for some JavaScript frameworks.
- Scala and F# add functional programming support to traditionally non-functional paradigms without sacrificing type safety and, hence, are more attractive. Kotlin uses functional programming just enough to ease programming in a lot of cases, but not so much as to make it complex.
- Like Kotlin, Swift and Rust also have inbuilt null safety. Kotlin and Swift are often compared, as their syntaxes resemble each other a lot.
- For server-side languages designed after parallel computing became a mainstream phenomenon, inbuilt constructs that ease the programmer's work with concurrency became a chief requirement. One finds these in both Kotlin (coroutines) and Rust; a minimal coroutine sketch follows this list.
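As an illustrative sketch (not an example from the book) of what those concurrency constructs look like, here is a minimal Kotlin coroutine using the kotlinx.coroutines library; fetchGreeting is a hypothetical helper standing in for a slow I/O call:

import kotlinx.coroutines.*

// Hypothetical helper: delay() suspends the coroutine
// without blocking the underlying thread
suspend fun fetchGreeting(name: String): String {
    delay(100)
    return "Hello, $name"
}

fun main() = runBlocking {
    // Start two tasks concurrently and await both results
    val first = async { fetchGreeting("Kotlin") }
    val second = async { fetchGreeting("coroutines") }
    println("${first.await()} / ${second.await()}")
}

Both calls run concurrently, yet the code reads like straightforward sequential logic, which is exactly the ease-of-use argument made above.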
Go native strategy

The Kotlin developers figured that the same strategy used on the JVM platform could be used on other platforms too. On no platform does Kotlin disrupt the platform's existing technology:

- The JVM works with Java bytecode, and Kotlin gives an alternative to Java for generating that same bytecode. (Kotlin is by no means the first alternative, as there are already 200+ languages that work with the JVM, but it is the most elegant one, for all the reasons we have seen previously.)
- On modern browsers, where JavaScript is the de facto standard, Kotlin works by transpiling to JavaScript. Again, this means that Kotlin is friendly with existing browsers without requiring any special effort.
- On the Node.js platform, where JavaScript is used on the server side, Kotlin code likewise transpiles into JavaScript, so no changes are needed in the Node.js framework for Kotlin to run.
- In a similar way, Kotlin/Native plans to work with other technologies natively.

Since the platform's technology is not disrupted, zero changes are needed at the platform level to adopt Kotlin. Kotlin's compatibility with a given platform can be taken for granted from day one. This eliminates a big business risk.

Kotlin's winning strategy

Kotlin's winning strategy is the sum of the various factors we have seen so far. It has a two-pronged approach: win over developers with the coolness of the language and the ease of working with it, and win over business users with its business benefits. The benefits of using Kotlin include:

- The growing popularity of the language
- Endorsement from Google, which made Kotlin an officially supported language in May 2017
- Kotlin-specific development frameworks emerging
- Leading Java frameworks, such as Spring, offering Kotlin-specific improvements
- The growing number of applications being tried out with Kotlin
- The user groups spread across Kotlin developer hubs
- The growing number of technology companies using Kotlin

With this in mind, the winning strategy for smart programmers is to master Kotlin and learn to work with it on various platforms. Being ahead of the curve, as opposed to following the world after Kotlin is already big, will be a quick path to being recognized as a leader. Further chapters of this book will help you in exactly this mission. Apart from going through this book, we strongly suggest you join the community:

- Join the Kotlin weekly mailing list at http://kotlinweekly.net
- Join the nearest Kotlin user group at http://kotlinlang.org/community/user-groups.html
- Join Kotlin's community on Slack at https://kotlinlang.slack.com/

We saw how Kotlin is well positioned to take off as the universal programming language. It offers an opportunity for smart programmers to establish themselves at the forefront of this rising tide.

This article was taken from the book Kotlin Blueprints. If you liked reading this piece, check out the book to build comprehensive applications using Kotlin features.

Getting started with Kotlin programming
Build your first Android app with Kotlin
How to convert Java code into Kotlin