
Author Posts

122 Articles

"Developers need to say no" - Elliot Alderson on the FaceApp controversy in a BONUS podcast episode [Podcast]

Richard Gall
12 Aug 2019
5 min read
Last month there was a huge furore around FaceApp, the mobile application that ages your photographs to show you what you might look like as you get older. This was caused by a rapid cycle of misinformation and conjecture. It was thanks to cybersecurity researcher Elliot Alderson - who you might remember from last week's podcast episode - that the world was able to get beyond speculation and find out what was really going on.

We got in touch with Elliot shortly after the story broke. He was kind enough to speak to us about the FaceApp furore, and explained what caused the confusion and how he managed to get to the bottom of what was actually going on. You can listen to what he had to say in this special short bonus episode: https://soundcloud.com/packt-podcasts/bonus-security-researcher-elliot-alderson-on-the-faceapp-furore

Elliot says that although FaceApp is problematic, it isn't unique. It poses exactly the same threat to our privacy as the platforms and applications that millions of people use every day. "There is an issue with FaceApp," he tells us. "But there is an issue with Facebook, with Snapchat, with Twitter - it's never a good idea for someone to upload a photo of your face to a random application." This line of argument can be found elsewhere, and it is arguably the most important lesson we can learn. In this article from Wired, journalist Brian Barrett writes: "should you be worried about FaceApp? Sure. But not necessarily more than any other app you let into your photo library."

Should you use FaceApp?

Although you might assume that a security professional would simply warn everyone against using these sorts of applications, Elliot takes a more pragmatic view: "this application is really trendy. You can see a lot of stars using it on social media, so this is normal - you want to use this application."

What you need to consider if you want to use FaceApp

However, if you do want to use it, you should be careful. "You have to step back a little bit before using it and ask yourself a question" about how money is being made. "This is a free application... there are developers behind this application, they need to live, they need to eat - they need to earn money - and in general the answer is with your data."

"You are the information," Elliot says. "You can decide to use it, and say okay, I'm ready to lose this part of my privacy in order to use this cool service... or you will... think no, it's not worth it. FaceApp seems to be cool, but my privacy is more important than something trendy like this."

The key, then, is to check the terms and conditions of the application. "You have to know that you will have lost a part of your privacy. And if you're okay with that then - okay, go for it, and use the application."

"Developers need to say no sometimes."

Developer responsibility and code ethics

There are clearly question marks for users about FaceApp, or, indeed, any other free application that has access to your data. But what about the developers building these applications? Do they have a part to play in ensuring that applications respect user consent and privacy?

"It's complicated for a developer to say no to their project manager," says Elliot. However, this doesn't mean developers should be content to follow orders from management. "Developers need to raise their level... and say okay, but ethics is also important..."

Elliot continues: "As a technical guy I need to spread the message internally in my company, and say to the project manager, to the business, to the marketing department: okay, this is a cool feature but no, we won't do that because this is against our users."

"Developers need to say no sometimes - and companies need to understand that it's not okay to dump as much data as possible from their users."

How did Elliot Alderson uncover the truth about FaceApp?

One thing that is often forgotten in these stories is the technical process through which the truth is uncovered. Sure, that might be a little dry or complicated for some, but the fact that there is real detective work in understanding what's actually going on inside an application is incredibly interesting. It also highlights that while software might sometimes appear mysterious or even impenetrable, with the right skills and tools we can see how things actually work. That's not only useful from a technical perspective, it's also a way for all of us to retrieve a small sense of power back from applications built and owned by companies worth billions of dollars.

"It's not that easy, but it's not super complicated either," says Elliot. Although he tells us that "the first time you want to do it you need to spend some time on it for sure," once you're set up and ready to go you can find things out remarkably fast. Using Burp Suite, an intercepting proxy that lets you inspect the traffic an application sends and receives, the whole process was complete in a matter of moments. "Checking FaceApp took literally 5 minutes for me, because everything is already set up on my computer and I just have to install the application and look at the network requests."

Learn more about Burp Suite with Packt's selection of eBooks and videos here.

Follow Elliot Alderson on Twitter: @fs0c131y


Prof. Rowel Atienza discusses the intuition behind deep learning, advances in GANs & techniques to create cutting-edge AI models

Packt Editorial Staff
30 Sep 2019
6 min read
In recent years, deep learning has made unprecedented progress in vision, speech, natural language processing and understanding, and other areas of data science. Developments in deep learning techniques, including GANs, variational autoencoders and deep reinforcement learning, are creating impressive AI results. For example, DeepMind's AlphaGo Zero became a game changer in AI research when it beat world champions in the game of Go.

In this interview, Professor Rowel Atienza, author of the book Advanced Deep Learning with Keras, talks about recent developments in the field of deep learning. The book is a comprehensive guide to the advanced deep learning techniques available today, so you can create your own cutting-edge AI, and it strikes a balance between advanced concepts in deep learning and practical implementations with Keras.

Key takeaways from the interview

- The intuition of deep learning is built on the fact that the deeper the network gets, the more feature representations the network learns in order to solve complex real-world problems.
- The objective of meta learning ("learning to learn") is to enable agents to be more robust to unforeseen events and to lessen the dependency on huge data.
- Advances in GANs enable us to generate high-dimensional fake data, such as high-resolution images or videos, that look very convincing.
- Deep learning tackles the curse of dimensionality by finding efficient data structures and layers that can represent complex data in the most efficient manner.

The interview in detail

What is the intuition behind deep learning? What are the recent developments in deep learning?

Rowel Atienza: Deep learning is built on the intuition that the deeper the network gets, the more feature representations the network learns in order to solve complex real-world problems. Unlike traditional machine learning, deep learning learns these features automatically from data, with different degrees of supervision. There are many recent developments in deep learning. There are advances in graph neural networks because people are realizing the limits of NLP (Natural Language Processing), CNNs (Convolutional Neural Networks), and RNNs (Recurrent Neural Networks) in representing more complex data structures such as social networks, 3D shapes, molecular structures, etc. Implementing causality in reasoning on data is another area of strong interest; deep learning is strong on correlation, not on discovering causality in data. Meta learning, or learning to learn, is also another area of interest. The objective is to enable agents to be more robust to unforeseen events and to lessen the dependency on huge data.

What are the different deep learning techniques used to create successful AI?

RA: A successful AI is dependent on two things: 1) deep domain knowledge and 2) deep understanding of state-of-the-art techniques that will work on the domain problem. Domain knowledge comes from someone who is very familiar with the domain problem. This person is not necessarily knowledgeable in AI. This domain knowledge is then modelled in AI to automate the process of problem solving.

How does deep learning tackle the curse of dimensionality?

RA: One of the goals of deep learning is to keep on finding efficient data structures and layers that can represent complex data in the most efficient manner. For example, geometric deep learning is able to circumvent the limitations of representing and learning from 3D data by avoiding inefficient 3D convolutions. There is still so much to be done in this space.

What are autoencoders? What is the need for autoencoders in deep learning? How do you create an autoencoder?

RA: Autoencoders compress high-dimensional data into a low-dimensional code without losing important information. The low-dimensional code is suitable for further processing by other deep learning models, such as generative models like GANs and VAEs. An autoencoder can easily be implemented using two networks, an encoder and a decoder. The depth, width, and type of layers depend on the original data to be encoded.
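As a rough illustration of that encoder-decoder pairing, here is a minimal sketch of a fully connected autoencoder using the Keras functional API. The 784-dimensional input and the layer sizes are illustrative assumptions, not an example taken from the book:

```python
from tensorflow.keras import layers, Model

# Encoder: compress a 784-dimensional input (e.g. a flattened 28x28 image)
# into a 32-dimensional code.
inputs = layers.Input(shape=(784,))
x = layers.Dense(128, activation="relu")(inputs)
code = layers.Dense(32, activation="relu")(x)

# Decoder: reconstruct the original input from the code.
x = layers.Dense(128, activation="relu")(code)
reconstruction = layers.Dense(784, activation="sigmoid")(x)

autoencoder = Model(inputs, reconstruction)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# The training target is the input itself:
# autoencoder.fit(x_train, x_train, epochs=10, batch_size=256)
```

The 32-dimensional code produced by the encoder half is the compressed representation that downstream models such as GANs or VAEs can then work with.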
Why are GANs so innovative?

RA: GANs are innovative because they are good at generating fake data that looks real. That is something that is hard to accomplish using other generative models. The advances in GANs enable us to generate high-dimensional fake data, such as high-resolution images or video, that look very convincing.

Tell us a little bit about this book. What makes it necessary? What gap does it fill?

RA: Advanced Deep Learning with Keras focuses on recent advances in deep learning. It starts with a quick review of deep learning concepts (MLPs, CNNs, and RNNs). Discussions of deep neural networks, autoencoders, generative adversarial networks (GANs), variational autoencoders (VAEs), and deep reinforcement learning (DRL) follow. The book is important for everyone who would like to understand advanced concepts in deep learning and their corresponding implementation in Keras. The current version has an in-depth focus on generative models (autoencoders, GANs, VAEs) that can be used in practical settings. The DRL chapters explain the core concepts of value-based and policy-based methods in reinforcement learning and the corresponding working implementations in Keras, which are difficult to get right.

About the Book

Advanced Deep Learning with Keras is a comprehensive guide to the advanced deep learning techniques available today, so you can create your own cutting-edge AI. Using Keras as an open-source deep learning library, you'll find hands-on projects throughout that show you how to create more effective AI with the latest techniques.

About the Author

Rowel Atienza is an Associate Professor at the Electrical and Electronics Engineering Institute of the University of the Philippines, Diliman. He holds the Dado and Maria Banatao Institute Professorial Chair in Artificial Intelligence. Rowel has been fascinated with intelligent robots since he graduated from the University of the Philippines. He received his MEng from the National University of Singapore for his work on an AI-enhanced four-legged robot. He finished his PhD at The Australian National University for his contribution to the field of active gaze tracking for human-robot interaction.

Read next:
- Deep learning models have massive carbon footprints, can photonic chips help reduce power consumption?
- Machine learning experts on how we can use machine learning to mitigate and adapt to the changing climate
- Google launches beta version of Deep Learning Containers for developing, testing and deploying ML applications


Listen: How ActiveState is tackling "dependency hell" by providing enterprise-level support for open source programming languages [Podcast]

Richard Gall
08 Oct 2019
2 min read
"Open source back in the late nineties - and even throughout the 2000s - was really hard to use," ActiveState CEO Bart Copeland says. "Our job," he continues, "was to make it much easier for developers to use open source and much easier for enterprises to use open source."

How does ActiveState work?

But how does ActiveState actually do this? Copeland explains: "ActiveState is exactly like Red Hat. So what Red Hat did to Linux - providing enterprise-grade Linux distributions - ActiveState does for open source programming languages."

Clearly ActiveState is an interesting product that's playing an important part in helping enterprises better manage the widespread migration to open source technology. For the latest edition of the Packt Podcast we spoke to Copeland about ActiveState and the growth of open source over the last decade. We think you'll find what he has to say interesting...

Listen: https://soundcloud.com/packt-podcasts/activestate-making-open-source-more-accessible-for-the-enterprise-interview-with-bart-copeland

Read next: Can a modified MIT ‘Hippocratic License’ to restrict misuse of open source software prompt a wave of ethical innovation in tech?

Key quotes from Bart Copeland

Copeland on the relationship between enterprise management and developers: "If you look at the enterprise… they want to make sure that it works and it doesn't cause security threats and they're in compliance with all the licenses. And the result is, due to the complexities of open source, management within the enterprise will often limit developers on what languages and what open source stacks they can use, because the more stacks you have, the more complexity you have in an organization."

Copeland on developer freedom: "A developer is a very technical and creative individual and they want to be able to use the right tools to build the right solution. And so if a developer is handcuffed to certain technology stacks, they may not be able to use the best technology to solve the problem."

Learn more about ActiveState here.


Listen: Puppet's VP of Ecosystem Engineering Nigel Kersten talks about key DevOps challenges [Podcast]

Richard Gall
23 Jul 2019
4 min read
We've been talking about DevOps a lot on the Packt Podcast. The reason for that is simple: it's a critical part of how we actually build software, from both a technical and an organizational perspective. And anything that draws us closer to the relationship between people and software can only be a good thing, right?

For this edition of the Packt Podcast we spoke to Nigel Kersten, VP of Ecosystem Engineering at Puppet. With Puppet playing an important role in the evolution of DevOps over the last decade or so, we thought he would be a great person to give an insight not only into how Puppet has been adapting to industry trends (yes, we're waving at you, Kubernetes), but also into the challenges engineering teams face when trying to implement DevOps.

Listen to the episode: https://soundcloud.com/packt-podcasts/puppets-vp-of-engineering-nigel-kersten-on-the-organizational-challenges-of-devops

Nigel Kersten talks DevOps

We covered a diverse range of topics in the episode, from Nigel's move from Google to Puppet (which, he tells us, slightly upset his mom...) through to the challenges - and pitfalls - engineering teams face when trying to implement DevOps.

Read next: DevOps engineering and full-stack development – 2 sides of the same agile coin

Key quotes from this podcast episode

How to automate workflows effectively

"One thing we definitely tell people to do is… don't automate one service from end to end. Don't pick one complicated three-tier web application, put a small team on it and say 'your job is to puppetize all of this infrastructure.' What, instead, is a more powerful way to work is to ask: what are those low-level building blocks that are across all of your infrastructure? What are the things that are common across all of your infrastructure? Automate those things, because they're often really simple to do, and the rewards are huge."

"Look at the things that are causing you pain in production. If you go and talk to the people who are on call, in charge of deployments, any of those parts of your infrastructure, and ask them what would be the one thing that you would fix that would make your infrastructure more reliable, they will always have a shortlist of things… and when you do this, you start building trust across the whole organization."

The fear of automation

"There's always fear about adopting automation. There's always fear about people's jobs changing and adopting new tools and disciplines - sort of in an endless cycle of new tool adoption, people being told that they have to learn new things - the more you can actually show value across the whole organization that this thing's relatively easy, a small investment for large returns, the more powerful an effect you're actually going to have."

DevOps challenges

"I think it's a huge mistake if people think they're embarking on a DevOps journey and they're not willing to actually make some of the cultural and organizational changes - it's about creating more cross-functional teams, it's about giving them more autonomy, and it's about actually letting people work across organizational boundaries without having to go up and down the hierarchy of the organization."

"Most people are actually struggling pre-DevOps in many ways… the people who we've seen fail are the ones who have gone, look, we're going to jump exactly from where we are now and try to move to an incredibly automated environment without putting a lot of the groundwork in place - like building up trust within the org, giving teams more autonomy, allowing service owners to configure monitoring themselves - I think all of those sorts of things are really prerequisites for a whole organization succeeding at DevOps."


Listen to Uber engineer Yuri Shkuro discuss distributed tracing and observability [Podcast]

Richard Gall
17 May 2019
2 min read
We've been talking a lot about observability on the Packt Hub over the last few months. Back in March we spoke to Honeycomb CEO Charity Majors, who told us why observability is so important and why it can be so challenging for engineering teams to implement. It's clear it's a big topic with plenty of perspectives - but one that could have a ripple effect across the software industry.

To get a further perspective on the topic, we spoke to Yuri Shkuro, who's an engineer at Uber and author of Mastering Distributed Tracing (which was published in February), to talk about how distributed tracing can help engineers build more observable systems. Yuri spoke in detail in the podcast about the value of observability in the context of complex distributed systems, as well as some of the challenges in implementing distributed tracing. As one of the creators of Jaeger, an open source tool built specifically for distributed tracing, he's well placed to comment on how the ecosystem is evolving and how organizations can start thinking more seriously about observability.

Read an extract from Yuri's book here.

The episode covers:

- The difference between monitoring and observability
- Some of the misconceptions around distributed tracing
- Who can benefit from distributed tracing - from DevOps to SREs
- Practical advice for getting started with distributed tracing

Listen on SoundCloud: https://soundcloud.com/packt-podcasts/if-youre-on-call-you-need-observability-tools-uber-engineer-yuri-shkuro-on-distributed-tracing

"Tracing is conceptually a white box instrumentation technique. You cannot do tracing in an application by purely observing it from the outside, because that feature of context propagation is simply not possible - if you have 10 incoming requests into an application concurrently, and it does 100 outbound requests, then how do you know which ones correlate to the incoming requests? That's what context propagation allows us to achieve: it allows us to establish causality within events."
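To make the idea of context propagation concrete, here is a minimal, library-free sketch in Python. The header name and helper functions are illustrative assumptions rather than Jaeger's actual API, but they show how every outbound call made while serving a request can carry the identifier of the request that caused it:

```python
import uuid

TRACE_HEADER = "x-trace-id"  # hypothetical header name, for illustration only

def extract_or_start_trace(incoming_headers: dict) -> dict:
    """Reuse the caller's trace id if one arrived, otherwise start a new trace."""
    trace_id = incoming_headers.get(TRACE_HEADER) or uuid.uuid4().hex
    return {"trace_id": trace_id}

def inject_context(ctx: dict) -> dict:
    """Headers to attach to every outbound request made while handling this one."""
    return {TRACE_HEADER: ctx["trace_id"]}

def handle_request(incoming_headers: dict) -> None:
    ctx = extract_or_start_trace(incoming_headers)
    # Each downstream call carries the same trace id, so a tracing backend can
    # correlate the outbound requests with the incoming request that caused them.
    for _ in range(3):  # stand-in for calls to downstream services
        outbound_headers = inject_context(ctx)
        # http_client.get("http://downstream/api", headers=outbound_headers)
        print(outbound_headers)

handle_request({})  # no incoming trace id, so a new trace is started
```

Real tracers layer span identifiers, sampling decisions and timing data on top of this, but the in-process hand-off of an identifier is the essential ingredient the quote describes.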


Why you should use Keras for deep learning

Amey Varangaonkar
13 Sep 2017
5 min read
A lot of people rave about TensorFlow and Theano, but there is one complaint you hear fairly regularly: that they can be a little challenging to use if you're directly building deep learning models. That's where Keras comes to the rescue. It's a high-level deep learning library written in Python that can be used as a wrapper on top of TensorFlow or Theano, to simplify the model training process and to make models more efficient.

Sujit Pal is Technology Research Director at Elsevier Labs. He has been working with Keras for some time, and is an expert in Semantic Search, Natural Language Processing and Machine Learning. He's also the co-author of Deep Learning with Keras, which is why we spoke to him about why you should start using Keras (he's very convincing).

5 reasons you should start using Keras

1. Keras is easy to get started with if you've worked with Python before and have some basic knowledge of neural networks.
2. It works on top of Theano and TensorFlow seamlessly to create efficient deep learning models.
3. It offers just the right amount of abstraction - allowing you to focus on the problem at hand rather than worry about the complexity of using the framework.
4. It is a handy tool to use if you're looking to build models related to Computer Vision or Natural Language Processing.
5. Keras is a very expressive framework that allows for rapid prototyping of models.

Why I started using Keras

Packt: Why did you start using Keras?

Sujit Pal: My first deep learning toolkit was actually Caffe, then TensorFlow, both for work-related projects. I learned Keras for a personal project and I was impressed by the Goldilocks (i.e. just right) quality of the abstraction. Thinking at the layer level was far more convenient than having to think in terms of the matrix multiplication that TensorFlow makes you do, and at the same time I liked the control I got from using a programming language (Python) as opposed to using JSON in Caffe. I've used Keras for multiple projects now.

Packt: How has this experience been different from other frameworks and tools? What problems does it solve exclusively?

Sujit: I think Keras has the right combination of simplicity and power. In addition, it allows you to run against either TensorFlow or Theano backends. I understand that it is being extended to support two other backends - CNTK and MXNet. The documentation on the Keras site is extremely good and the API itself (both the Sequential and Functional ones) is very intuitive. I personally took to it like a fish to water, and I have heard from quite a few other people that their experiences were very similar.

What you need to know to start using Keras

Packt: What are the prerequisites to learning Keras? And what aspects are tricky to learn?

Sujit: I think you need to know some basic Python and have some idea about neural networks. I started with neural networks from the Google/edX course taught by Vincent Vanhoucke. It's pretty basic (and taught using TensorFlow) but you can start building networks with Keras even with that kind of basic background. Also, if you have used numpy or scikit-learn, some of the API is easier to pick up because of the similarities.

I think the one aspect I have had a few problems with is building custom layers. While there is some documentation that is just enough to get you started, I think Keras would be usable in many more situations if the documentation for custom layers was better, maybe more in line with the rest of Keras - things like how to signal that a layer supports masking or multiple tensors, debugging layers, etc.
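As a rough sketch of the kind of custom layer Sujit is describing, here is a minimal example using the Keras Layer subclassing API via tf.keras. The layer itself - a learnable per-feature scale that passes masks through - is a made-up illustration, not something taken from the interview or the book:

```python
import tensorflow as tf

class FeatureScale(tf.keras.layers.Layer):
    """Hypothetical custom layer: learns one multiplicative scale per input feature."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Signal that this layer simply passes any mask through unchanged.
        self.supports_masking = True

    def build(self, input_shape):
        # One trainable weight per input feature, created once the input size is known.
        self.scale = self.add_weight(
            name="scale",
            shape=(input_shape[-1],),
            initializer="ones",
            trainable=True,
        )
        super().build(input_shape)

    def call(self, inputs):
        return inputs * self.scale

# Usage: drop it into a model like any built-in layer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(16,)),
    FeatureScale(),
    tf.keras.layers.Dense(1),
])
```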
Packt: Why do you use Keras in your day-to-day programming and data science tasks?

Sujit: I have spent most of the last year working with image classification and similarity, and I've used Keras to build most of my more recent models. This year I am hoping to do some work with NLP as it relates to images, such as generating image captions. On the personal projects side, I have used Keras for building question answering and disease prediction models, both with data from Kaggle competitions.

How Keras could be improved

Packt: As a developer, what do you think are the areas of development for Keras as a library? Where do you struggle the most?

Sujit: As I mentioned before, the Keras API is quite comprehensive and most of the time Keras is all you need to build networks, but occasionally you do hit its limits. So I think the biggest area of Keras that could be improved would be extensibility, using its backend interface. Another thing I am excited about is the contrib.keras package in TensorFlow; I think it might open up even more opportunity for customization, or at least the potential to mix and match TensorFlow with Keras.

Why You Need to Know Statistics To Be a Good Data Scientist

Amey Varangaonkar
09 Jan 2018
9 min read
Data science has popularly been dubbed the sexiest job of the 21st century - so much so that everyone wants to become a data scientist. But what do you need to get started with data science? Do you need a degree in statistics? Why is having sound knowledge of statistics so important to being a good data scientist? We seek answers to these questions, and look at data science through a statistical lens, in an interesting conversation with James D. Miller.

James is an IBM certified expert and a creative innovator. He has over 35 years of experience in applications and system design and development across multiple platforms and technologies. Jim has also been responsible for managing and directing multiple resources in various management roles, including project and team leader, lead developer and applications development director. He is the author of several popular books, such as Big Data Visualization, Learning IBM Watson Analytics, Mastering Splunk, and many more. In addition, Jim has written a number of whitepapers and continues to write on a number of relevant topics based upon his personal experiences and industry best practices.

In this interview, we look at some of the key challenges faced by many while transitioning from a data developer role to a data scientist. Jim talks about his new book, Statistics for Data Science, and discusses how statistics plays a key role when it comes to finding unique, actionable insights from data in order to make crucial business decisions.

Key takeaways - Statistics for Data Science

- Data science attempts to uncover the hidden context of data by going beyond answering generic questions such as "what is happening" to tackling questions such as "what should be done next".
- Statistics for data science cultivates "structured thinking".
- For most data developers who are transitioning to the role of data scientist, the biggest challenge often comes in calibrating their thought process - from being data design-driven to being more insight-driven.
- Having a sound knowledge of statistics differentiates good data scientists from mediocre ones - it helps them accurately identify patterns in data that can potentially cause changes in outcomes.
- Statistics for Data Science attempts to bridge the learning gap between database development and data science by implementing statistical concepts and methodologies in R to build intuitive and accurate data models. These methodologies and their implementations are easily transferable to other popular programming languages such as Python.
- While many data science tasks are being automated these days using different tools and platforms, the statistical concepts and methodologies will continue to form their backbone. Investing in statistics for data science is worth every penny!

Full Interview

Everyone wants to learn data science today as it is one of the most in-demand skills out there. In order to be a good data scientist, having a strong foundation in statistics has become a necessity. Why do you think this is the case? What importance does statistics have in data science?

With statistics, it has always been about "explaining" data. With data science, the objective is to go beyond questions such as "what happened?" and "what is happening?" to try to determine "what should be done next?". Understanding the fundamentals of statistics allows one to apply "structured thinking" to interpret knowledge and insights sourced from statistics.

You are a seasoned professional in the field of data science with over 30 years of experience. We would like to know how your journey in data science began, and what changes you have observed in this domain over the three decades.

I have been fortunate to have had a career that has traversed many platforms and technological trends (in fact, over 37 years of diversified projects). Starting as a business applications and database developer, I have almost always worked for the office of finance. Typically, these experiences started with the collection - and then management of - data to be able to report results or assess performance. Over time, the industry has evolved and this work has become a "commodity", with many mature tool options available and plenty of seasoned professionals available to perform the work. Businesses have now become keen to "do something more" with their data assets and are looking to move into the world of data science. The world before us offers enormous opportunities, not only for those with a statistical background but also for someone with a business background who understands and can apply the statistical data sciences to identify new opportunities or competitive advantages.

What are the key challenges involved in the transition from being a data developer to becoming a data scientist? How does the knowledge of statistics affect this transition? Does one need a degree in statistics before jumping into data science?

Someone who has been working actively with data already has a "head start" in that they have experience with managing and manipulating data and data sources. They would also most likely have programming experience and possess the ability to apply logic to data. The challenge will be to "retool" their thinking from data developer to data scientist - for example, going from data querying to data mining. Happily, there is much that the data developer already knows about data science, and my book Statistics for Data Science attempts to point out the skills and experiences that the data developer will recognize as the same, or at least as having significant similarities. You will find that the field of data science is still evolving and the definition of "data scientist" depends upon the industry, project or organization you are referring to. This means that there are many roles that may involve data science, each having perhaps quite different prerequisites (such as a statistics degree).

You have authored a lot of books, such as Big Data Visualization and Learning IBM Watson Analytics, with the latest being Statistics for Data Science. Please tell us something about your latest book.

The latest book, Statistics for Data Science, looks to point out the synergies between a data developer and a data scientist and hopes to evolve the data developer's thinking "beyond database structures". It also introduces key concepts and terminologies such as probability, statistical inference, model fitting, classification, regression and more, that can be used to journey into statistics and data science.

How is statistics used when it comes to cleaning and pre-processing the data? How does it help the analysis? What other tasks can these statistical techniques be used for?

Simple examples of the use of statistics when cleaning and/or pre-processing data (by a data developer) include data-typing, min/max limitation, addressing missing values, and so on. A really good opportunity for the use of statistics in data or database development is while modeling data to design appropriate storage structures. Using statistics in data development applies a methodical, structured approach to the process, and can be a competitive advantage to any data development project.
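The book implements its techniques in R, but as a rough, language-neutral illustration of the kind of statistically informed cleaning Jim describes, here is a minimal pandas sketch (the file and column names are hypothetical):

```python
import pandas as pd

df = pd.read_csv("orders.csv")  # hypothetical input file

# Data-typing: coerce columns to the types downstream analysis expects;
# values that cannot be converted become missing rather than silently wrong.
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce")

# Min/max limitation: clip values to a plausible range.
df["quantity"] = df["quantity"].clip(lower=0, upper=1000)

# Addressing missing values: impute with a simple summary statistic.
df["quantity"] = df["quantity"].fillna(df["quantity"].median())
```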
In the book, for practical purposes, you have shown the implementation of the different statistical techniques using the popular R programming language. Why do you think R is favored by statisticians so much? What advantages does it offer?

R is a powerful, feature-rich, extendable, free language with many easy-to-use packages available for download. In addition, R has "a history" within the data science industry. R is also quite easy to learn and be productive with quickly. It also includes many graphics and other abilities "built in".

Do you foresee a change in the way statistics for data science is used in the near future? In other words, will the dependency on statistical techniques for performing different data science tasks reduce?

Statistics will continue to be important to data science. I do see more "automation" of data science tasks through the availability of "off the shelf" packages that can be downloaded, installed and used. Also, the more popular tools will continue to incorporate statistical functions over time. This will allow for the mainstreaming of statistics and data science into even more areas of life. The key will be for the user to have an understanding of the key statistical concepts and uses.

What advice would you like to give to 1) those transitioning from the developer to the data scientist role, and 2) absolute beginners who want to take up statistics and data science as a career option?

Buy my book! But seriously, keep reading and researching. Expose yourself to as many statistics and data science use cases and projects as possible. Most importantly, as you read about the topic, look for similarities between what you do today and what you are reading about. How does it relate? Always look for opportunities to use something that is new to you to do something you do routinely today.

Your book Statistics for Data Science highlights different statistical techniques for data analysis and finding unique insights from data. What are the three key takeaways for the readers of this book?

Again, I see (and point out in the book) key synergies between data or database development and data science. I would urge the reader - or anyone looking to move from data developer to data scientist - to learn through these and perhaps additional examples he or she may be able to find and leverage on their own. Using this technique, one can navigate laterally, rather than losing the time it would take to "start over" at the beginning (or bottom?) of the data science learning curve. Additionally, I would suggest to the reader that the time taken to get acquainted with the R programs and the logic used for statistical computations (this book should be a good start) is time well spent.


Why the Industrial Internet of Things (IIoT) needs Architects

Aaron Lazar
21 Nov 2017
8 min read
The Industrial Internet, the IIoT, the 4th Industrial Revolution or Industry 4.0 - whatever you may call it - has gained a lot of traction in recent times. Many leading companies are driving this revolution, connecting smart edge devices to cloud-based analysis platforms and solving their business challenges in new and smarter ways. To ensure the smooth integration of such machines and devices, effective architectural strategies based on accepted principles, best practices, and lessons learned must be applied.

In this interview, Shyam Nath throws light on his new book, Architecting the Industrial Internet, and shares expert insights into the world of IIoT, Big Data, Artificial Intelligence and more.

Shyam Nath

Shyam is the director of technology integrations for Industrial IoT at GE Digital. His area of focus is building go-to-market solutions. His technical expertise lies in big data and analytics architecture and solutions with a focus on IoT. He joined GE in September 2013, prior to which he worked at IBM, Deloitte, Oracle, and Halliburton. He is the founder/president of the BIWA Group, a global community of professionals in Big Data, analytics, and IoT. He has often been listed as one of the top social media influencers for Industrial IoT. You can follow him on Twitter @ShyamVaran.

He talks about the IIoT, the impact that technologies like AI and Deep Learning will have on it, and the direction in which IIoT is headed. He also talks about the challenges architects face while architecting IIoT solutions and how his book will help them overcome such issues.

Key Takeaways

- The fourth Industrial Revolution will break silos and bring IT and Ops teams together to function more smoothly.
- Choosing the right technology to work with involves taking risks and experimenting with custom solutions.
- The Predix platform and Predix.io allow developers and architects to quickly learn from others and build working prototypes that can be used to get quick feedback from business users.
- Interoperability issues and a lack of understanding of all the security ramifications of the hyper-connected world are a few of the challenges that IIoT adoption must overcome.
- Supporting technologies like AI, Deep Learning, AR and VR will have major impacts on the Industrial Internet.

In-depth Interview

On the promise of a future with the Industrial Internet

The 4th Industrial Revolution is evolving at a terrific pace. Can you highlight some of the most notable aspects of Industry 4.0?

The Industrial Internet is the 4th Industrial Revolution. It will have a profound impact on both industrial productivity and the future of work. Due to more reliable power, cleaner water, and intelligent cities, the standard of living will improve, at large, for the world's citizens. The Industrial Internet will forge new collaborations between IT and OT in organizations, and each side will develop a better appreciation of the problems and technologies of the other. They will work together to create smoother overall operations by breaking the silos.

On Shyam's IIoT toolbox that he uses on a day-to-day basis

You have a solid track record of architecting IIoT applications in the Big Data space over the years. What tools do you use on a day-to-day basis?

In order to build Industrial Internet applications, GE's Predix is my preferred IIoT platform. It is built for digital industrial solutions, with security and compliance baked into it. Customer IIoT solutions can be quickly built on Predix and extended with the services in the marketplace from the ecosystem. For asset health monitoring and for reducing downtime, Asset Performance Management (APM) can be used to get a jump start, and its extensibility framework can be used to extend it.

On how to begin one's journey into building Industry 4.0

For an IIoT architect, what would your recommended learning plan be? What aspects of architecting Industry 4.0 applications are tricky to master, and how does your book, Architecting the Industrial Internet, prepare its readers to be industry ready?

An IIoT architect can start with the book Architecting the Industrial Internet to get a good grasp of the area broadly. The book provides a diverse set of perspectives and architectural principles from authors who work at GE Digital, Oracle and Microsoft. End-to-end IIoT applications involve an understanding of sensors, machines, control systems, connectivity and cloud or server systems, along with an understanding of the associated enterprise data, so the architect needs to focus on a limited solution or proof of concept first. The book provides coverage of the end-to-end requirements of IIoT solutions for architects, developers and business managers. The extensive set of use cases and case studies provides examples from many different industry domains to allow readers to easily relate to them. The book is written in a style that will not overwhelm the reader, yet explains the workings of the architecture and the solutions. It will be best suited for enterprise architects and data architects who are trying to understand how IIoT solutions differ from traditional IT solutions. The layer-by-layer description of the IIoT architecture provides a systematic approach to help architects develop a deep understanding. IoT developers who have some understanding of this area can learn the IIoT platform-based approach to building solutions quickly.

On how to choose the best technology solution to optimize ROI

There are so many IIoT technologies that manufacturers are confused as to how to choose the best technology to obtain the best ROI. What would your advice to manufacturers be in this regard?

Manufacturers and operations leaders look for quick solutions to known issues, in a proven way. Hence, they often do not have the appetite to experiment with a custom solution; rather, they like to know where the solution provider has solved similar problems and what the outcome was. The collection of use cases and case studies will help business leaders get an idea of the potential ROI while evaluating the solution.

Getting to know Predix, GE's IIoT platform, better

Let's talk a bit about Predix, GE's IIoT platform. What advantages does Predix offer developers and architects? Do you foresee any major improvements coming to Predix in the near future?

GE's Predix platform has a growing developer community that is approaching 40,000 strong. Likewise, the ecosystem of partners is approaching 1,000. Coupled with free access to create developer accounts on Predix.io, developers and architects can quickly learn from others and build working prototypes that can be used to get quick feedback from business users. The catalog of microservices at Predix.io will continue to expand. Likewise, applications written on top of Predix, such as APM and OPM (Operations Performance Management), will continue to become feature-rich, providing coverage for many common digital industrial challenges.

On the impact of other emerging technologies like AI on IIoT

What, according to you, will be the impact of AI and Deep Learning on IIoT?

AI and Deep Learning help to build robust digital twins of industrial assets. These digital twins will make the job of predictive maintenance and optimization much easier for the operators of these assets. Further, IIoT will benefit from many new advances in technologies like AI and AR/VR, making the job of field service technicians easier. IIoT is already widely used in energy generation and distribution, and in intelligent cities for law enforcement and to ease traffic congestion. The field of healthcare is evolving due to the increasing use of wearables. Finally, precision agriculture is enabled by IoT as well.

On likely barriers to IIoT adoption

What are the roadblocks you expect in the adoption of IIoT?

Today, the challenges to rapid adoption of IIoT are interoperability issues and a lack of understanding of all the security ramifications of the hyper-connected world. Finally, how to explain the business case of IIoT to decision makers and different stakeholders is still evolving.

On why Architecting the Industrial Internet is a must-read for architects

Would you like to give architects three reasons why they should pick up your book?

- It is written by IIoT practitioners from large companies who are building solutions for both internal and external consumption.
- The book captures architectural best practices and advocates a platform-based approach to solutions.
- The theory is put into practice in the form of use cases and case studies, to provide a comprehensive guide for architects.

If you enjoyed this interview, do check out Shyam's latest book, Architecting the Industrial Internet.


Nate Chamberlain talks about the Microsoft Enterprise Mobility and Security suite and becoming M365 certified

Savia Lobo
06 Dec 2019
7 min read
Security is an important concern for organizations, and securing the devices that contain confidential data - personal or professional - is absolutely essential. Microsoft Enterprise Mobility + Security, an intelligent mobility management and security platform, offers a suite of services that helps in securing employee devices, thus protecting and securing the organization.

Our recent chat with Nate Chamberlain, a Business Analyst at DH Pace, Kansas, helped us understand more about the Microsoft 365 Enterprise Mobility + Security suite. Nate is also a Microsoft MVP in Office Apps and Services. His recently published book, Microsoft 365 Mobility and Security – Exam Guide MS-101, helps users plan, deploy, and manage Microsoft Office 365 services and gain the skills required to pass the MS-101 exam. In this interview, Nate also shares his favorite services from the suite, the importance of controlling Shadow IT while ensuring Cloud App Security, how the M365 Certified Enterprise Administrator Expert certification has given his career a boost, and much more.

On the Microsoft Enterprise Mobility and Security suite

The Microsoft Enterprise Mobility and Security suite helps professionals secure the devices used within an enterprise and also helps in identifying breaches before they cause any major damage. The suite comes in two offerings, Enterprise Mobility + Security E3 and E5.

Talking about the popular Microsoft solutions in the suite, Nate said, "Azure Active Directory, Intune, and the Microsoft 365 Security & Compliance Center are big players in the overall EM+S suite. Taking the time to get to know each of them has the potential to significantly enhance your organization's security."

Nate said one of his favorite features of the Microsoft 365 Enterprise Mobility + Security suite is "the ability to be extremely granular in building conditional access policies. That, paired with the ability to utilize AI and zero-day security information in policies and practices, continually impresses me. It'll be interesting to see where Endpoint Manager takes us."

On what features he would like to see added to the suite in the future, he said, "the biggest improvement I would hope for currently is licensing simplification, and making sure admins are able to secure their organization and its users without breaking the budget."

On the new Microsoft Endpoint Manager and controlling 'Shadow IT' for Cloud App Security

Last month, Microsoft announced its new Endpoint Manager, a convergence of two of its popular tools, System Center Configuration Manager (ConfigMgr) and Microsoft Intune. Both ConfigMgr and Intune offer integrated cloud-powered management tools and unique co-management options to provision, deploy, manage, and secure endpoints and applications across an organization. Endpoint Manager offers end-to-end management solutions without the need to worry too much about the complexity involved during migration, thus helping customers make a smooth cloud transition.

According to Nate, "Microsoft Endpoint Manager takes a lot of the licensing guesswork out of building a secure solution for your organization." In addition to Intune and ConfigMgr, Microsoft Endpoint Manager includes the Device Management Admin Center (DMAC) and Desktop Analytics. Nate further adds that Microsoft Endpoint Manager includes nearly everything discussed in his exam prep book, Microsoft 365 Mobility and Security – Exam Guide MS-101, including Intune and ConfigMgr.

Shadow IT and Cloud App Security

In his book, Nate has written about controlling the use of 'Shadow IT' for Cloud App Security. Shadow IT, also known as Stealth IT, is technology built and used without the knowledge of the IT or security group within the organization. We asked Nate why Shadow IT arises in organizations, why it is a threat, and how organizations can minimize its usage.

Clearing the clouds on Shadow IT, Nate explains, "Shadow IT is often a consequence of being too restrictive without providing alternative means of productivity and collaboration solutions. And sometimes, even if you provide alternatives or company-licensed tools, it's the lack of ongoing education that failed to spread awareness and competency that led users to more familiar, comfortable means of accomplishing goals. When users need to accomplish something, they'll find a way with or without the organization's assistance."

He further adds, "It's IT administrators' responsibility to make sure productivity and collaboration solutions are provisioned and configured for secure, appropriate usage, and that education is provided to get users on board."

A few key takeaways from Nate's book, an MS-101 exam guide, and his recommendations for further Microsoft 365 certifications

According to Microsoft's official website, the skills measured in "Exam MS-101: Microsoft 365 Mobility and Security" include implementing modern device services, implementing Microsoft 365 security and threat management, and managing Microsoft 365 governance and compliance. These skills help companies sift through candidates and identify those who genuinely know the suite.

Talking about the key takeaways from his book, Nate says, "I hope readers find the content to be challenging, but accessible. The best takeaway I could hope for is that readers retain information that not only helps them in the exam but in their jobs. The whole point of taking exams and obtaining certifications is to demonstrate proficiency, knowledge, and skill. Ultimately, it's practising those skills in the real world that matters - not the score on the exam. But the exam is absolutely a first step toward building confidence and career growth."

We also asked him what other certifications he would recommend next, to which Nate said, "Once readers pass MS-101, they should aim for passing MS-100 if they haven't already. After that, they're just one prerequisite certification away from becoming a Microsoft 365 Enterprise Administrator Expert."

On Nate's journey as a Microsoft SharePoint Systems Engineer and beyond

Nate worked as a SharePoint Systems Engineer at LMH Health, Kansas, and is currently a Business Analyst at DH Pace in Olathe, KS. He is also an M365 Certified Enterprise Administrator Expert, and he shared why certifications are important for career growth: "My certification certainly looks great on my resume to potential employers and I like to think it's part of what made me competitive in pursuing my current role. Certifications are verified proof of skill and competency. It alleviates some risk a company would otherwise assume in hiring someone for highly technical work like we find in our industry."

He also shared his journey and how learning SharePoint transformed his role: "My journey has been one of self-teaching, fueled by inspiring tech solutions coming out of Microsoft. I was once tasked to learn what I can about SharePoint at the University of Kansas, and it turned into a SharePoint-specific role there. That opened doors for me which brought me to LMH Health and ultimately DH Pace."

He continues, "Somewhere along the way, I started sharing what I was learning via my blog, NateChamberlain.com, and by speaking at conferences around the country regularly. I also started a SharePoint user group, LSPUG, in Lawrence, KS. For these reasons and perhaps others, I was awarded the Microsoft MVP for Office Apps and Services."

Certifications are indeed verified proof of skill and competency. So go ahead and check out Nate's book, Microsoft 365 Mobility and Security – Exam Guide MS-101, to get up to speed with planning, deploying, and managing Microsoft Office 365 services and gain the skills you need to pass the MS-101 exam. With this book, you'll explore everything from mobile device management and compliance through to data governance and auditing, and by the end you'll have covered the concepts and techniques you need to know to pass the MS-101 exam. Written in a succinct way, the book includes chapter-wise self-assessment questions, exam tips, and mock exams with answers.

Read next:
- Microsoft technology evangelist Matthew Weston on how Microsoft PowerApps is democratizing app development [Interview]
- How PyTorch is bridging the gap between research and production at Facebook: PyTorch team at F8 conference
- SOLIDWORKS specialist Tayseer Almattar takes us into the world of 3D modeling using SOLIDWORKS 2020 [Interview]


“Be objective, fight for the user, and test with real users on the go!” - Interview with design purist, Will Grant

Packt Editorial Staff
17 Jul 2018
8 min read
Too often, as designers and developers we fail to make interfaces that are usable, fail to make software that is intuitive, and fail to make products that normal people can understand. By coating design rigour with a layer of brand fluff, and putting form over function again and again, we build products that serve nobody but the internal needs of our corporations and brands. In this interview with Will Grant, a web technology entrepreneur and veteran, we discuss ways to solve 101 UX design problems clearly and single-mindedly. We also discuss about his upcoming book 101 UX Principles, in which Will has defined and refined what it means to build products people intuitively know how to use. Author’s Bio Will Grant is a British UI/UX expert and graduate of Birmingham City University, where he studied human computer interaction and usable design. Following his degree, he trained with Jakob Nielsen and Bruce Tognazzini, pioneers in UX design. Will has been building intuitive usable software products since the birth of the consumer web over 20 years ago, through to the present day, where Will's work has reached more than a billion users. He is the co-founder and the design lead at UX-focused analytics tool Prodlytic. Key takeaways: The vast majority of UX is still about concepts, journey and the tasks we help users to achieve. The tools to deliver great UX have changed, but UX is still about familiarity, consistency and empathy. The 101 UX Principles are a shortcut for UX professionals. Designers can apply them to their products and make usable software 99% of the time for 99% of users. Over reliance on ‘brand’ and internal goals, trying to reinvent the wheel, and forgetting to put oneself in the place of the user are some common reasons why UX design fails. Many UX people forget that design – UI design in particular – isn’t art, it’s design to perform a function: to serve users. Follow Will’s 10 commandments for effective UX design to create more usable and successful products. There’s another 91 in the book 101 UX Principles too. Full Interview Of the 100+ UX design principles that you explore in your new book, if we asked you to pick the top 10, what would those be? Will’s 10 commandments for effective UX design, so to speak. Test with real users Don’t join the dark side Make your buttons look like buttons Label your icon buttons Use 2 font families, maximum Make ‘blank slates’ more than just empty views Hide ‘advanced’ settings from most users Decide if an interaction should be obvious, easy, or possible Anyone can be a UX professional Use device-native input features where possible Just following these 10 and applying them to your software design will create more usable, successful products. There’s another 91 great commandments in the book too. Will, as this book is about 101 UX Principles, what makes your principles right? Nothing is perfect, but these principles are a ‘shortcut’ for UX professionals. Instead of reinventing the wheel, designers can apply these principles to their products and make usable software 99% of the time for 99% of users. I’ve spent over 20 years, since the birth of the consumer web, building interfaces for 100s of products and over a billion users. My approach isn’t perfect, but it has been tested and proven to work at scale. This guide will help you avoid common mistakes and start with a product that’s extremely usable and intuitive - for the widest possible section of users. Why do people keep making UX mistakes? 
It's usually a combination of factors: over-reliance on 'brand' and internal goals, trying to reinvent the wheel, and forgetting to put yourself in the place of the user. Too often the internal goals of an organisation supersede the design teams who are genuinely trying to 'fight for the user'. The CEO wants it to look a certain way (but he/she has no design background), or the marketing team decide that a certain typeface has to be used (even though it's unreadable).

The paradox is that, as UX and UI people, we're over-exposed to components, controls, patterns and interfaces in general. It's the curse of knowledge, and we are the last people who should be designing interfaces - unless we can do the hard bit: objectivity.

Name a big company that gets UX right, and one that gets it wrong

This is impossible. Even today, after 20+ years of consumer web products, the experience people see is wildly different from product to product - regardless of the company. Generally, large companies with lots of internal bureaucracy and hierarchy produce end products that are the least usable. This is where small, nimble startups can often produce a better product: not because they are 'better' overall, but because they haven't yet lost sight of the importance of UX. And, crucially, startup teams are less encumbered by legacy baggage and are more free to follow best practice in design.

Who inspires you the most within the UX community?

Donald Norman and Jakob Nielsen have both been hugely influential to me. Don Norman's book "The Design of Everyday Things" pretty much kicked off and 'invented' the whole field of human-computer interaction, which these days we call 'UX'. Nielsen and Norman are sometimes derided as 'too purist', but that's what appeals to me most. Stripping back interfaces to the bare minimum, removing clutter and making things simple are things I try to do in my work every day.

I worked for a boss in my early 20s - he wasn't a designer - but he did fly into an apoplectic rage at the slightest mistake I might make. It taught me to check, check and re-check my designs, and despite him being a horrible person, my work is better for it.

What was the last app that made you throw down your phone in frustration?

Easy - it was the HSBC app, yesterday, with its dreadful 'update' process. Apple have gone to great lengths to build an App Store which auto-updates your apps in the background while you're asleep and your phone is on charge. HSBC decided that their banking app should do its own half-assed updates, whenever it feels like it, inside the app - just when you open the app and you're about to use it. A classic example of reinventing the wheel, building a new experience that fails because nobody has thought of the user - only of their internal needs.

In your more than two decades of UX design experience, how has the web evolved from a user experience perspective? What were some of the biggest surprises in UX design trends for you? What design ideas have remained unaltered by time?

I think it's remarkable how little has changed - in terms of design ideas that 'just work', at least. Yes, software has changed massively over that time - from basic websites and browsers on desktop computers through to web apps and native apps on smartphones and tablets. However, the vast majority of UX is still about the concepts, the journey and the tasks you're helping the user to achieve. The tools to deliver great UX have changed, but UX itself is still about familiarity, consistency and empathy.
With emerging technologies like machine learning, AR, VR, IoT etc. increasingly impacting how we design for the web, where do you see UX design heading in the coming years? What are some general rules worth keeping in mind when designing for the future? What are some opportunities and challenges you foresee for UX designers?

It's more of a hope than a prediction, but perhaps we designers will stop doing things because we can and start asking if we should. A greater sense of social responsibility, and a reduction in sneaky 'dark pattern' UX, would be great for everyone.

Somewhere along the way, many UX people forgot that design - UI design in particular - isn't art; it's design to perform a function: to serve users. Too many designers are slavishly following the latest design trend, applying 'flat design' to every app, or trying to be different for the sake of it, with custom-designed interfaces and arbitrary visual metaphors. The solution is simple, too: try to be objective, fight for the user, and test with real users as you go.

101 UX Principles provides 101 ways to solve 101 UX problems clearly and single-mindedly. There are thousands of methods to apply to each and every interaction in your product, but this book is a 'shortcut' to a method that works. The book is available to pre-order now and is expected to be published soon.

Read next:
- What UX designers can teach Machine Learning Engineers? To start with: Model Interpretability
- Is your web design responsive?
- A UX strategy is worthless without a solid usability test plan

Red Badger Tech Director Viktor Charypar talks monorepos, lifelong learning, and the challenges facing open source software [Interview]

Richard Gall
10 May 2019
7 min read
Back in February, Viktor Charypar, Tech Director at Red Badger, explained the benefits of using a monorepo. For many teams, especially those without the resources or a highly developed and well-supported engineering culture, the idea of a monorepo might sound a little strange. Following on from this piece, I spoke to Viktor to get a little more detail on the benefits of a monorepo and why engineering teams should seriously consider using them.

But I didn't just speak to him about monorepos. I was also interested in how Red Badger builds a forward-thinking engineering culture that can empower its clients, and how the team embraces continuous learning to ensure everyone is on top of the trends and tools that are going to be impacting digital transformation in the future.

So, let's take a look at what Viktor had to say...

Why monorepos now?

RG: Why a monorepo now? If you're dealing with multiple microservices doesn't it make sense to separate source code?

VC: It would seem to make sense, but microservices architecture actually introduces a new level of complexity that needs to be managed, which monorepos make much easier. The main issue is in dependencies between the services and the contracts they agree to exchange data. If one side of the contract changes in an incompatible way without the other side adapting to that change, the system no longer works. This problem grows with the number of services in the system, and so it's especially prominent in microservices architectures.

Managing each service's source code in a separate repository makes it more difficult to understand its ties to the rest of the system. This forces you to adopt some kind of external versioning scheme, such as semantic versioning, to express which revisions of services work together as things change. These versions are decided by engineers manually as changes are made, according to a set of rules, and then the dependent services are updated to refer to the latest version of the service they consume, when they are changed to be compatible. This is time-consuming and error-prone.

In a monorepo, all the components of the system are versioned together and changes can be made across the system. This not only means an external versioning scheme is not required, but it also makes it easier to test and enforce contracts between services.

Monorepos really come into their own when they are coupled with a Continuous Integration system aware of the dependencies between components in the repo. Given a change made by a developer and the knowledge of the dependency "graph", we can deterministically decide which system components can be affected by the change and therefore need to be retested. It is then up to the owners of each service to do a level of testing of their dependencies to make sure their behaviour didn't change significantly enough to break their own functionality. All this is automated and can be executed without human intervention. Humans just make changes to the software and express expectations on their dependencies in terms of contracts and tests.
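To make the retesting idea concrete, here is a minimal Python sketch of the kind of dependency-graph walk Viktor describes. This is an illustration only, not Red Badger's actual CI tooling, and the component names are hypothetical: given each component's direct dependencies and the set of components touched by a commit, it walks the reversed graph to find everything that could be affected and therefore needs retesting.

from collections import defaultdict, deque

def affected_components(dependencies, changed):
    """Return every component that (transitively) depends on a changed one.

    dependencies: dict mapping component -> set of components it depends on
    changed: set of components touched by a commit
    """
    # Invert the graph: for each component, record who depends on it
    dependents = defaultdict(set)
    for component, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(component)

    # Breadth-first walk from the changed components through their dependents
    to_test = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in to_test:
                to_test.add(dependent)
                queue.append(dependent)
    return to_test

# Hypothetical monorepo: a web app and an admin UI both consume an orders
# service, which in turn consumes a shared library.
deps = {
    "web-app": {"orders-service"},
    "admin-ui": {"orders-service"},
    "orders-service": {"shared-lib"},
    "shared-lib": set(),
}
print(affected_components(deps, {"shared-lib"}))
# {'shared-lib', 'orders-service', 'web-app', 'admin-ui'}

A CI pipeline could run something like this over a commit's changed paths and trigger only the affected components' test suites, which is how a monorepo build can stay fast as the system grows.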
Read next: Mozilla's updated policies will ban extensions with obfuscated code

Digital transformation challenges

RG: What common problems are clients coming to you with?

VC: In general, our clients recognise they need help with their digital product capabilities, i.e. delivering interesting propositions to their customers as digital products, typically websites and mobile apps. In large enterprise companies, this ability is generally predicated on going through a digital transformation: adopting agile delivery methods, breaking down functional silos, and working in cross-discipline, vertically aligned teams that can decide things quickly and adapt to how customers respond to their product offering.

Our clients typically come to us with one of a few problems, ranging from needing help with product strategy (what to offer their customers and how to find which of the many ideas has a market fit), through knowing what to do but struggling to deliver it at pace, all the way to already having a digital product offering that doesn't perform as expected - either from the perspective of customer behaviour (e.g. a low conversion rate) or from a technical quality perspective (the website is unstable, struggles under high load, or suffers long outages).

While our strength is traditionally in fast product delivery and quality, we can help across the board, from product strategy to what we call empower and embed: demonstrating how to deliver digital products quickly, sustainably and with high quality, helping to build internal capability and then handing over to the client. Essentially we want to help our clients build sustainable businesses.

Read next: Linux forms Urban Computing Foundation: Set of open source tools to build autonomous vehicles and smart infrastructure

Learning and assessing new software and tools

RG: How do you stay on top of new tools? Do you have a learning culture at Red Badger?

VC: We absolutely do, from simple day-to-day things like all engineers being encouraged to pair program to learn from each other, or everyone in the company having a yearly training budget as one of the benefits, to things like a yearly internal mini conference called Tech Lab for all the engineers to get together and share their latest learnings and general experience from projects.

We have actually recently published a report which started with an activity at the last Tech Lab, which answers a lot of the questions above. It's available on our website here (and we'll also follow it with a series of events). We also run a few regular meetups in London, the biggest being the London React Meetup, which we've been hosting regularly for about four years.

RG: How do you assess tools?

VC: There are a few things we generally look at. The first is obviously experimenting with the tool to work out what it does and how. We're in a privileged position of starting new projects, often greenfield ones, fairly regularly. We typically use about 80% of tools we know and trust and about 20% of new ones, which we want to try out "in anger" and learn about.

We also look at who is behind these tools, which are generally open source, and whether there is momentum behind them and support from the community. Open source software typically goes through a period of rapid innovation and competition in a certain area and then, eventually, the community settles on a few options that work the best and fit the different problems people are trying to solve.

The future of open source - is it sustainable?

RG: How do you see the future of open source - is it sustainable on its current model?

VC: That's an interesting thing to think about! It seems like the open source model is widely misunderstood as software being built by dedicated developers in their free time.
But in reality, most large, popular open source projects are backed by large software companies, and people maintain them as their day job - for example, Linux, Kubernetes, and React. Even the web standards are set by standards bodies comprised of professionals supported by the major browser vendors.

I think the model with a sole maintainer working on something in their spare time doesn't really work if their project gets very popular and the demand on their time grows. We all know how people tend to behave on the internet, and the software industry is no exception, so maintainers who do it as a hobby are at a pretty high risk of burning out. For the major open source projects, this seems to be more of an exception, as they are typically maintained by a team of people employed by a company invested in the project. The sponsor benefits from the community contributions and, if the project gets popular, from controlling the direction of a de facto standard, and the community benefits from someone else doing the lion's share of the work.

I look at it as being similar to science, where different people publicly contribute to push the boundaries of knowledge, just because pooling resources makes more sense and doesn't stop any individual contributor from profiting from the results. In that sense, I think it's a pretty sustainable model and leads to better quality, more versatile software.

Enhancing Data Quality with Cleanlab

Prakhar Mishra
11 Dec 2024
10 min read
Introduction

It is a well-established fact that your machine-learning model is only as good as the data it is fed. A model trained on bad-quality data usually has a number of issues. Here are a few ways that bad data might affect machine-learning models:

1. Wrong predictions may be made as a result of errors, missing values, or other irregularities in low-quality data. The model's predictions are likely to be inaccurate if the data used to train it is unreliable.
2. Bad data can also bias the model. The model can learn and reinforce these biases if the data is not representative of real-world situations, which can result in discriminatory predictions.
3. Poor data also limits the ability of the model to generalize to fresh data, because it may not effectively depict the underlying patterns and relationships in the data.
4. Models trained on bad-quality data might need more retraining and maintenance. The overall cost and complexity of model deployment could rise as a result.

As a result, it is critical to devote time and effort to data preprocessing and cleaning in order to decrease the impact of bad data on ML models. Furthermore, to ensure the model's dependability and performance, it is often necessary to use domain knowledge to recognize and address data quality issues.

It might come as a surprise, but even gold-standard datasets like ImageNet, CIFAR, MNIST, 20News, and more contain labeling issues. A couple of examples: in the Amazon sentiment review dataset, there are reviews whose original label was Neutral, whereas both Cleanlab and Mechanical Turk identified them as positive (which is correct); in the MNIST dataset, there are images originally labeled 8 and 0 which are actually 9 and 6, as both Cleanlab and Mechanical Turk confirmed. Feel free to check out labelerrors to explore more such cases in similar datasets.

Introducing Cleanlab

This is where Cleanlab can come in handy as your best bet. By automatically identifying problems in your ML dataset, it assists you in cleaning both data and labels. This data-centric AI software uses your existing models to estimate dataset problems that can be fixed to train even better models, following the typical data-centric AI model development cycle. Apart from the standard way of coding all the way through finding data issues, it also offers Cleanlab Studio, a no-code platform for fixing all your data errors. For the purpose of this blog, we will go the former way on our sample use case.

Getting Hands-on with Cleanlab

Installation

Installing cleanlab is as easy as doing a pip install. I recommend installing the optional dependencies as well; you never know what you need and when. I also installed sentence-transformers, as I would be using them for vectorizing the text. Sentence-transformers come with a bag of many amazing models; we particularly use 'all-mpnet-base-v2' as our choice for vectorizing text sequences. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search. Feel free to check out this for the list of all models and their comparisons.

pip install 'cleanlab[all]'
pip install sentence-transformers
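As a quick, self-contained sanity check of the 768-dimension claim above, here is a small example of encoding a couple of SMS-style strings with 'all-mpnet-base-v2'. The sample sentences are purely illustrative.

from sentence_transformers import SentenceTransformer

# Load the pretrained model (weights are downloaded on first use)
model = SentenceTransformer('all-mpnet-base-v2')

# Two illustrative SMS-style messages
sentences = [
    "Congratulations! You have won a free prize, call now to claim.",
    "Sorry, running late - see you at 7.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768): one 768-dimensional vector per sentence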
Dataset

We picked the SMS Spam Detection dataset for the experimentation. It is a public set of labeled SMS messages that has been collected for mobile phone spam research, with roughly 5.5k instances in total.

Code

Let's now delve into the code. For demonstration purposes, we inject 5% noise into the dataset and see if we are able to detect it and eventually train a better model.

Note: I have also annotated every segment of the code wherever necessary for better understanding.

import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_predict
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression
from sentence_transformers import SentenceTransformer
from cleanlab.classification import CleanLearning
from sklearn.metrics import f1_score

# Reading and renaming data. We set sep='\t' because the data is tab separated
# (header=None because the file has no header row).
data = pd.read_csv('SMSSpamCollection', sep='\t', header=None)
data.rename({0: 'label', 1: 'text'}, inplace=True, axis=1)

# Dropping any duplicates that could exist
data.drop_duplicates(subset=['text'], keep=False, inplace=True)

# Original data distribution for spam and not-spam (ham) categories
print(data['label'].value_counts(normalize=True))
# ham     0.865937
# spam    0.134063

# Adding noise: switching 5% of the ham data to the 'spam' label
tmp_df = data[data['label'] == 'ham']
examples_to_change = int(tmp_df.shape[0] * 0.05)
print(f'Changing examples: {examples_to_change}')
# Changing examples: 216
examples_text_to_change = tmp_df.head(examples_to_change)['text'].tolist()
changed_df = pd.DataFrame([[i, 'spam'] for i in examples_text_to_change])
changed_df.rename({0: 'text', 1: 'label'}, axis=1, inplace=True)
left_data = data[~data['text'].isin(examples_text_to_change)]
final_df = pd.concat([left_data, changed_df])
final_df.reset_index(drop=True, inplace=True)

# Modified data distribution for spam and not-spam (ham) categories
print(final_df['label'].value_counts(normalize=True))
# ham     0.840016
# spam    0.159984

raw_texts, raw_labels = final_df["text"].values, final_df["label"].values

# Splitting into train and test sets. This step was implicit in the original
# snippet, which refers to raw_train_* / raw_test_* below; the split fractions
# here are illustrative.
raw_train_texts, raw_test_texts, raw_train_labels, raw_test_labels = train_test_split(
    raw_texts, raw_labels, test_size=0.2, random_state=42)

# Converting labels into integers
encoder = LabelEncoder()
encoder.fit(raw_train_labels)
train_labels = encoder.transform(raw_train_labels)
test_labels = encoder.transform(raw_test_labels)

# Vectorizing text sequences using sentence-transformers
transformer = SentenceTransformer('all-mpnet-base-v2')
train_texts = transformer.encode(raw_train_texts)
test_texts = transformer.encode(raw_test_texts)

# Instantiating the model instance
model = LogisticRegression(max_iter=200)

# Wrapping the scikit-learn model with CleanLearning
cl = CleanLearning(model)

# Finding label issues in the train set
label_issues = cl.find_label_issues(X=train_texts, labels=train_labels)

# Picking the top 50 samples based on label-quality scores
identified_issues = label_issues[label_issues["is_label_issue"] == True]
lowest_quality_labels = label_issues["label_quality"].argsort()[:50].to_numpy()

# Pretty-print the label issues detected by Cleanlab
def print_as_df(index):
    return pd.DataFrame(
        {
            "text": raw_train_texts,
            "given_label": raw_train_labels,
            "predicted_label": encoder.inverse_transform(label_issues["predicted_label"]),
        },
    ).iloc[index]

print_as_df(lowest_quality_labels[:5])
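The snippet above stops at inspecting the detected issues. As a rough sketch of how the "train a better model" comparison might look (reusing the train/test variables defined above; exact scores will vary with the split and noise), CleanLearning exposes a scikit-learn-style fit/predict interface that re-runs label-issue detection via cross-validation and prunes the suspect examples before fitting the final model:

# Baseline: LogisticRegression trained directly on the noisy labels
baseline = LogisticRegression(max_iter=200)
baseline.fit(train_texts, train_labels)
print("Baseline F1:", f1_score(test_labels, baseline.predict(test_texts)))

# CleanLearning: detects and drops the suspect labels internally, then fits
# the same model on the cleaned training data
cl = CleanLearning(LogisticRegression(max_iter=200))
cl.fit(train_texts, train_labels)
print("CleanLearning F1:", f1_score(test_labels, cl.predict(test_texts)))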
As we can see, Cleanlab assisted us in automatically removing the incorrect labels and training a better model with the same parameters and settings. In my experience, people frequently ignore data concerns in favor of building more sophisticated models to increase accuracy numbers. Improving data, on the other hand, is a pretty simple performance win. And, thanks to products like Cleanlab, it has become really simple and convenient. Feel free to access and play around with the above code in the Colab notebook here.

Conclusion

In conclusion, Cleanlab offers a straightforward solution to enhance data quality by addressing label inconsistencies, a crucial step in building more reliable and accurate machine learning models. By focusing on data integrity, Cleanlab simplifies the path to better performance and underscores the significance of clean data in the ever-evolving landscape of AI. Elevate your model's accuracy by investing in data quality, and explore the provided code to see the impact for yourself.

Author Bio

Prakhar has a Master's in Data Science with over 4 years of industry experience across various sectors like Retail, Healthcare, and Consumer Analytics. His research interests include Natural Language Understanding and generation, and he has published multiple research papers in reputed international publications in the relevant domain. Feel free to reach out to him on LinkedIn.

Ride the third wave of BI with Microsoft Power BI

Amey Varangaonkar
09 Oct 2017
8 min read
Self-service Business Intelligence is the buzzword everyone's talking about today. It gives modern business users the ability to find unique insights from their data without any hassle. Amidst a myriad of BI tools and platforms out there in the market, Microsoft's Power BI has emerged as a powerful, all-encompassing BI solution - empowering users to tailor and manage Business Intelligence to suit their unique needs and scenarios.

Brett Powell is a Microsoft Power BI partner, and the founder and owner of Frontline Analytics LLC, a BI and analytics research and consulting firm. Brett has contributed to the design and development of Microsoft BI stack and Power BI solutions of diverse scale and complexity across the retail, manufacturing, financial, and services industries. He regularly blogs about the latest happenings in Microsoft BI and Power BI features at Insight Quest. He is also an organizer of the Boston BI User Group.

In this two-part interview Brett talks about his new book, Microsoft Power BI Cookbook, and shares his insights and expertise in the area of BI and data analytics with a particular focus on Power BI. In part one, Brett shares his views on topics ranging from what it takes to be successful in the field of BI and data analytics to why he thinks Microsoft is going to lead the way in shaping the future of the BI landscape. In part two of the interview, he shares his expertise with us on the unique features that differentiate Power BI from other tools and platforms in the BI space.

Key Takeaways

- Ease of deployment across multiple platforms, efficient data-driven insights, ease of use, and support for a data-driven corporate culture are factors to consider while choosing a Business Intelligence solution for enterprises.
- Power BI leads in self-service BI because it's the first Software-as-a-Service (SaaS) platform to offer 'End User BI', where anyone, not just a business analyst, can leverage powerful tools to obtain greater value from data.
- Microsoft Power BI has been identified as a leader in Gartner's Magic Quadrant for BI and Analytics platforms, and provides a visually rich and easy-to-access interface that modern business users require.
- You can isolate report authoring from dataset development in Power BI, or quickly scale a Power BI dataset up or down as per your needs.
- Power BI is much more than just a tool for reports and dashboards. With a thorough understanding of the query and analytical engines of Power BI, users can customize more powerful and sustainable BI solutions.

Part One Interview Excerpts - Power BI from a Bird's Eye View

On choosing the right BI solution for your enterprise needs

What are some key criteria one must evaluate while choosing a BI solution for enterprises? How does Power BI fare against these criteria as compared with other leading solutions from IBM, Oracle and Qlikview?

Enterprises require a platform which can be implemented on their terms and adapted to their evolving needs. For example, the platform must support on-premises, cloud, and hybrid deployments with seamless integration, allowing organizations to both leverage on-premises assets and fully manage their cloud solution. Additionally, the platform must fully support both corporate business intelligence processes, such as staged deployments across development and production environments, and self-service tools which empower business teams to contribute to BI projects and a data-driven corporate culture.
Furthermore, enterprises must consider the commitment of the vendor to BI and analytics, the full cost of scaling and managing the solution, as well as the vendor's vision for delivering emerging capabilities such as artificial intelligence and natural language.

Microsoft Power BI has been identified as a leader in Gartner's Magic Quadrant for BI and Analytics platforms based on both its current ability to execute and its vision. Particularly now, with the Power BI Premium, Power BI Report Server, and Power BI Embedded offerings, Power BI truly offers organizations the ability to tailor and manage BI to their unique needs and scenarios.

Power BI's mobile application, available on all common platforms (iOS, Android), in addition to continued user experience improvements in the Power BI service, provides a visually rich and common interface for the 'anytime access' that modern business users require. Additionally, since Power BI's self-service authoring tool, Power BI Desktop, shares the same engine as SQL Server Analysis Services, Power BI has a distinct advantage in enabling organizations to derive value from both self-service and corporate BI.

The BI landscape is very competitive, and other vendors such as Tableau and Qlikview have obtained significant market share. However, as organizations fully consider the features distinguishing the products, in addition to the licensing structures and the integration with Microsoft Azure, Office 365, and common existing BI assets such as Excel and SQL Server Reporting Services and Analysis Services, they will (and are) increasingly concluding that Power BI provides compelling value.

On the future of BI and why Brett is betting on Microsoft to lead the way

Self-service BI as a trend has become mainstream. How does Microsoft Power BI lead this trend? Where do you foresee the BI market heading next, i.e. are there other trends we should watch out for?

Power BI leads in self-service BI because it's the first software-as-a-service (SaaS) platform to offer 'End User BI', in which anyone, not just a business analyst, can leverage powerful tools to obtain greater value from data. This 'third wave' of BI, as Microsoft suggests, follows and supplements the first and second waves of BI: corporate and self-service BI, respectively. For example, Power BI's Q&A experience with natural language queries and integration with Cortana goes far beyond the traditional self-service process of an analyst finding field names and dragging and dropping items on a canvas to build a report. Additionally, an end user has the power of machine learning algorithms at their fingertips with features such as Quick Insights, now built into Power BI Desktop.

Furthermore, it's critical to understand that Microsoft has a much larger vision for self-service BI than other vendors. Self-service BI is not exclusively the visualization layer over a corporate IT-controlled data model - it's also the ability for self-service solutions to be extended and migrated to corporate solutions as part of a complete BI strategy. Given their common underlying technologies, Microsoft is able to remove friction between corporate and self-service BI and allows organizations to manage modern, iterative BI project lifecycles.

On staying ahead of the curve in the data analytics & BI industry

For someone just starting out in the data analytics and BI fields, what would your advice be? How can one keep up with the changes in this industry?
I would focus on building a foundation in the areas which don't change frequently, such as math, statistics, and dimensional modeling. You don't need to become a data scientist or a data warehouse architect to deliver great value to organizations, but you do need to know the basic tools of storing and analysing data to answer business questions. To succeed in this industry over time you need to consistently invest in your skills in the areas and technologies relevant to your chosen path. You need to hold yourself accountable for becoming a better data professional, and this can be accomplished through certification exams, authoring technical blogs, giving presentations, or simply taking notes from technical books and testing out tools and code on your machine.

For hard skills I'd recommend standard SQL, relational database fundamentals, data warehouse architecture and dimensional model design, and at least a core knowledge of common data transformation processes and/or tools such as SQL Server Integration Services (SSIS) and SQL stored procedures. You'll need to master an analytical language as well, and for Microsoft BI projects that language is increasingly DAX.

For soft skills, you need to move beyond simply looking for a list of requirements for your projects. You need to learn to become flexible and active: someone who offers ideas and looks to show value and consistently improve projects rather than just 'deliver requirements'. You need to be able to have both a deeply technical conversation and a very practical conversation with business stakeholders. You need to be able to build relationships with both business and IT. You don't ever want to dominate or try to impress anyone, but if you're truly passionate about your work then this will be visible in how you speak about your projects and the positive energy you bring to work every day and to your ongoing personal development.

If you enjoyed this interview, check out Brett's latest book, Microsoft Power BI Cookbook. In part two of the interview, Brett shares 5 Power BI features to watch out for, 7 reasons to choose Power BI to build enterprise solutions, and more. Visit us tomorrow to read part two of the interview.

Listen: We discuss why chaos engineering and observability will be important in 2019 [Podcast]

Richard Gall
21 Dec 2018
1 min read
This week I published a post that explored some of the key trends in software infrastructure that security engineers, SREs, and SysAdmins should be paying attention to in 2019. There was clearly a lot to discuss, which is why I sat down with my colleague Stacy Matthews to explore some of the topics from the post in a little more depth. Enjoy! https://soundcloud.com/packt-podcasts/why-observability-and-chaos-engineering-will-be-vital-in-2019 What do you think? Is chaos engineering too immature for widespread adoption? And how easy will it be to begin building for observability?

Expert Insights: How sports analytics is empowering better decision-making

Amey Varangaonkar
14 Nov 2017
11 min read
Analytics is slowly changing the face of the sports industry as we know it. Data-driven insights are being used to improve team and individual performance, and to get that all-important edge over the competition. But what exactly is sports analytics? And how is it being used? What better way to get answers to these questions than asking an expert himself!

Gaurav Sundararaman is a Senior Stats Analyst at ESPN, currently based in Bangalore, India. With over 10 years of experience in the field of analytics, Gaurav worked as a research analyst and a consultant in the initial phase of his career. He then ventured into sports analytics in 2012 and played a major role in the Analytics division of SportsMechanics India Pvt. Ltd., where he was the analytics consultant for the T20 World Cup winning West Indies team in 2016.

In this interview, Gaurav takes us through the current landscape of sports analytics and talks about how analytics is empowering better decision-making in sports.

Key Takeaways

- Sports analytics is about finding actionable, useful insights from sports data which teams can use to gain a competitive advantage over the opposition.
- Instincts backed by data make on- and off-field decisions more powerful and accurate.
- The rise of IoT and wearable technology has boosted sports analytics. With more data available for analysis, insights can be unique and very helpful.
- Analytics is being used in sports for everything from improving player performance to optimizing ticket prices and understanding fan sentiment.
- Knowledge of tools for data collection, analysis and visualization such as R, Python and Tableau is essential for a sports analyst.
- A thorough understanding of the sport, an up-to-date skill set, and strong communication with players and management are equally important for effective analytics.
- Adoption of analytics within sports has been slow but steady. More and more teams are now realizing the benefits of sports analytics and are adopting an analytics-based strategy.

Complete Interview

Analytics is finding widespread applications in almost every industry today - how has the sports industry changed over the years? What role is analytics playing in this transformation?

The sports industry has been relatively late in adopting analytics. That said, the use of analytics in sports has also varied geographically. In the west, analytics plays a big role in helping teams, as well as individual athletes, take decisions. Better infrastructure and quick adoption of the latest trends in technology are important factors here. Also, investment in sports starts from a very young age in the west, which makes a huge difference. In contrast, many countries in Asia are still lagging behind when it comes to adopting analytics, and still rely on traditional techniques to solve problems. A combination of analytics with traditional knowledge from experience would go a long way in helping teams, players and businesses succeed.

Previously the sports industry was a very closed community. Now, with the advent of analytics, the industry has managed to expand its horizons. We see more non-sportsmen playing a major part in decision making. They understand the dynamics of the sports business and how to use data-driven insights to influence it.

Many major teams across different sports such as Football (Soccer), Cricket, American Football, Basketball and more have realized the value of data and analytics. How are they using it?
What advantages does analytics offer to them?

One thing I firmly believe is that analytics can't replace skills or guarantee wins. What it can do is ensure there is logic behind certain plans and decisions. Instincts backed by data make decisions more powerful. I always tell coaches and players: go with your gut and instincts as Plan A. If that does not work out, your fallback could be a Plan B based on trends and patterns derived from data. It turns out to be a win-win for both. Analytics offers a neutral perspective which sometimes players or coaches may not realize. Each sport has a unique way of applying analytics to make decisions and obviously, as analysts, we need to understand the context and map the relevant data. As far as using the analytics is concerned, the goals are pretty straightforward: be the best, beat the opponents and aim for sustained success. Analytics helps you achieve each of these objectives.

The rise of IoT and wearable technology over the last few years has been incredible. How has it affected sports, and sports analytics in particular?

It is great to see that many companies are investing in such technologies. It is important to identify where wearables and IoT can be used in sport and where they can cause maximum impact. These devices allow in-game monitoring of players, their performance, and their current physical state. Also, I believe that more than on-field, these technologies will be very useful in engaging fans as well. Data derived from these devices could be used in broadcasting as well as in providing a good experience for fans in the stadiums. This will encourage more and more people to watch games in stadiums and not in the comfort of their homes. We have already seen a beginning, with a few stadiums around the world leveraging technology (IoT). The Mercedes-Benz Stadium (home of the Atlanta Falcons) is a high-tech stadium powered by IBM. Sacramento is building a state-of-the-art facility for the Sacramento Kings. This is just the start, and it will only get better with time.

How does one become a sports analyst? Are there any particular courses/certifications that one needs to complete in order to become one? Can you share with us your journey in sports analytics?

To be honest, there are no professional courses yet in India to become an analyst. There are a couple of colleges which have just started offering sports analytics as a course in their post-graduation programs. However, there are a few companies (Sports Mechanics and Kadamba Technologies in Chennai) that offer jobs that can enable you to become a sports analyst if you are really good. If you are a freelancer, then my advice would be to ensure you brand yourself well, showcase your knowledge through social media platforms, and get a breakthrough via contacts.

After my MBA, Sports Mechanics (a leader in this space), a company based in Chennai, was looking for someone to start their data practice. I was just lucky to be at the right place at the right time. I worked there for four years and was able to learn a lot about the industry and what works and what does not. Being a small company, I was lucky to don multiple hats and work on different projects across the value chain. I then moved on and joined the lovely team at ESPNCricinfo, where I work for their stats team.

What are the tools and frameworks that you use for your day-to-day tasks? How do they make your work easier?

There are no specific tools or frameworks; it depends on the enterprise you are working for.
Usually, they are proprietary tools of the company. Most of these tools are used either to collect, mine or visualize data. Interpreting the information and presenting it in a manner which users understand is important, and that is where certain applications or frameworks are used. However, to be ready for the future it would be good to be skilled in tools that support data collection, analysis and visualization, namely R, Python and Tableau, to name a few.

Do sports analysts have to interact with players and the coaching staff directly? How do you communicate your insights and findings with the relevant stakeholders?

Yes, they have to interact with players and management directly. If not, the impact will be minimal. Communicating insights is very important in this industry. Too much analysis could lead to paralysis. We need to identify what exactly each player or coach is looking for, based on their game, and try to provide them the information in a crisp manner which helps them make decisions on and off the field. For each stakeholder the magnitude of the information provided is different. For the coach and management, the insights can be in detail, while for the players we need to keep it short and to the point.

The insights you generate must not only be limited to enhancing the performance of a team on the field but much more than that. Could you give us some examples?

Insights can vary. For the management, it could deal with how to maximise revenue or save some money in an auction. For coaches, it could help them learn about their team's as well as the opposition's strengths and weaknesses from a different perspective. For captains, data could help in identifying some key strategies on the field. For example, in cricket, it could help the captain determine which bowler to bring on against which opposition batsman, or where to place the fielders.

Off the field, one area where analytics could play a big role would be in grassroots development and the tracking of an athlete from an early age to ensure he is prepared for the biggest stage. Monitoring performance, improving physical attributes by following a specific regimen, assessing injury records and designing specific training programs are some ways in which this could be done.

What are some of the other challenges that you face in your day-to-day work?

Growth in this industry can be slow sometimes. You need to be very patient, work hard and ensure you follow the sport very closely. There are not many analytical obstacles as such, but understanding the requirements and what exactly the data needs are can be quite a challenge.

Despite all the buzz, there are quite a few sports teams and organizations who are still reluctant to adopt an analytics-based strategy - why do you think that is the case? What needs to change?

The reason for the slow adoption could be the lack of successful case studies and awareness. In most sports, when so many decisions are taken on the field, the players' ability and skill can seem far superior to anything else. As more instances of successful execution of data-based trends come up, we are likely to see more teams adopting a data-based strategy. Like I mentioned earlier, analytics needs to be used to help the coach and captain take the most logical and informed decisions. Decision-makers need to be aware of the way it is used and how much impact it can cause. This awareness is vital to increasing the adoption of analytics in sports.

Where do you see sports analytics in the next 5-10 years?
Today in sports, many decisions are taken on gut feeling, and I believe there should be a balance. That is where analytics can help. In a sport like cricket, only around 30% of the available data is used, and more emphasis is given to video. Meanwhile, if we look at soccer or basketball, the usage of data and video analytics is close to 60-70% of its potential. Through awareness and trying out new plans based on data, we can increase the usage of analytics in cricket to 60-70% in the next few years.

Despite the current shortcomings, it is fair to say that there is progressive and positive change at the grassroots level across the world. Data-based coaching and access to technology are slowly being made available to teams as well as budding sportsmen and women. Another positive is that investment in the sports industry is growing steadily. I am confident that in a couple of years, we will see more job opportunities in sports. Maybe in five years, the entire ecosystem will be more structured and professional. We will witness analytics playing a much bigger role in helping stakeholders make informed decisions, as data-based insights become even more crucial.

Lastly, what advice do you have for aspiring sports analysts?

My only advice would be: be passionate, build a strong network of people around you, and constantly be on the lookout for opportunities. It is also important to keep updating your skill set in terms of the tools and techniques needed to perform efficient and faster analytics. Newer and better tools keep coming up very quickly, which make your work easier and faster. Be on the lookout for such tools!

One also needs to identify their own niche based on their strengths and try to build on that. The industry is on the cusp of growth, and as budding analysts we need to be prepared to take off when the industry matures. Build your brand and talk to more people in the industry - figure out what you want to do to keep yourself in the best position to grow with the industry.