
Tech Guides - Data

281 Articles

What we learnt from IBM Research’s ‘5 in 5’ predictions presented at Think 2018

Amey Varangaonkar
16 Apr 2018
4 min read
IBM’s mission has always been to innovate and, in the process, change the way the world operates. With this objective in mind, IBM Research started an annual conversation called ‘5 in 5’ back in 2012, presenting its top five predictions each year on how technology will change the world; the 2018 edition was presented at IBM Think 2018. These predictions usually become the drivers for IBM's research and innovation, with the goal of eventually solving the problems they describe with efficient solutions. Here are the five predictions made by IBM Research for 2018:

More secure Blockchain products: In order to counter counterfeit products, Blockchain technology will be coupled with cryptographic solutions to develop decentralized solutions. Digital transactions are often subject to fraud, and securing them with crypto-anchors is seen as the way forward. Want to know how this can be achieved? You might want to check out IBM’s blog on crypto-anchors and their real-world applications. If you are like me, you’d rather watch IBM researcher Andres Kind explain what crypto-anchors are in a fast-paced science slam session.

Sophisticated cyber attacks will continue to happen: Cyber attacks that leak or steal confidential data are nothing new. The bigger worry is that the current methodologies for preventing these attacks are not proving good enough. IBM predicts this is only going to get worse, with more advanced and sophisticated cyber attacks breaking into today's secure systems with ease. IBM Research also predicted the rise of ‘lattice cryptography’, a new security mechanism offering a more sophisticated layer of protection for these systems. You can read more about lattice cryptography on IBM’s official blog, or watch IBM researcher Cecilia Boschini explain what lattice cryptography is in 5 minutes in one of IBM’s famous science slam sessions.

Artificial Intelligence-powered bots will help clean the oceans: Our marine ecosystem seems to be going from bad to worse, mainly due to the pollution and toxic waste being dumped into it. IBM predicts that AI-powered autonomous bots, deployed and controlled from the cloud, can help relieve this situation by monitoring water bodies for water quality and pollution levels. You can learn more about how these autonomous bots will help save the seas in this interesting talk by Tom Zimmerman.

An unbiased AI system: Artificially designed systems are only as good as the data used to build them. This data may be impure, or may contain flaws or bias pertaining to color, race, gender and so on. Going forward, new models which mitigate these biases and ensure more standard, bias-free predictions will be designed. With these models, certain human values and principles will be considered for effective decision-making. IBM researcher Francesca Rossi talks about bias in AI and the importance of building fair systems that help us make better decisions.

Quantum Computing will go mainstream: IBM predicts that quantum computing will move out of research labs and gain mainstream adoption in the next five years. Problems considered difficult or unsolvable today due to their sheer scale or complexity could be tackled with the help of quantum computing. To know more, let IBM researcher Talia Gershon take you through the different aspects of quantum computing and why it is expected to be a massive hit.

Amazingly, most of IBM's past predictions have turned out to be true. For instance, IBM predicted the rise of Computer Vision technology in 2012, where computers would be able to not only process images but also understand their ‘features’. It remains to be seen how true this year’s predictions will turn out to be. However, considering the rate at which research in AI and other tech domains is progressing and being put to practical use, we won’t be surprised if they all become a reality soon. What do you think?


GDPR is good for everyone: businesses, developers, customers

Richard Gall
14 Apr 2018
5 min read
Concern around GDPR is palpable, and time is running out. It appears that many businesses don’t really know what they’re doing. At his Congressional hearing, Mark Zuckerberg’s notes read “don’t say we’re already GDPR compliant” - if Facebook aren’t ready yet, how could the average medium-sized business be? But the truth is that GDPR couldn’t have come at a better time. Thanks in part to the Facebook and Cambridge Analytica scandal, the question of user data and online privacy has never been so audible within public discourse. That level of public interest wasn’t around a year ago. Ultimately, GDPR is the best way to tackle these issues. It forces businesses to adopt a different level of focus - and care - towards their users. It forces everyone to ask: what counts as personal data? Who has access to it? Who is accountable for the security of that data? These aren’t just points of interest for EU bureaucrats. They are fundamental questions about how businesses own and manage relationships with customers.

GDPR is good news for web developers

In turn, this means GDPR is good news for those working in development too. If you work in web development or UX, it’s likely that you’ve experienced frustration when working against the requirements and feedback of senior stakeholders. Often, the needs of users are misunderstood or even ignored at the expense of what the business needs. This is especially true when management lacks technical knowledge and makes too many assumptions. At its worst, it can lead down the path of ‘dark patterns’, where UX is designed in such a way as to ‘trick’ customers into behaving in a certain way. But even when intentions aren’t that evil, a mindset that refuses to take user privacy - and simple user desires - seriously can be damaging. Ironically, the problems this sort of negligence causes aren’t just legal. It’s also bad for business. That’s because when you engineer everything around what’s best for the business in a crude and thoughtless way, you make life hard for users and customers. This means:

Customers simply have a bad experience and could get a better one elsewhere
Customers lose trust
Your brand is damaged

GDPR will change bad habits in businesses

What GDPR does, then, is force businesses to get out of the habit of lazy thinking. It makes issues around UX and data protection so much more important than they otherwise would be. It also forces businesses to start taking the way software is built and managed much more seriously. This could mean a change in the way that developers work within their businesses in the future. Silos won’t just be inefficient; they might just lead to a legal crisis. Development teams will have to work closely with legal, management and data teams to ensure that the software they are developing is GDPR compliant. Of course, this will also require a good deal of developer training so that teams are fully briefed on the new landscape. It also means we might see new roles like Chief Data Officer becoming more prominent. But it’s worth remembering that for non-developers, GDPR is also going to require much more technical understanding. If recent scandals have shown us anything, it’s that a lot of people don’t fully understand the capabilities that even the smallest organizations have at their disposal. GDPR will force the non-technical to become more informed about how software and data interact - and, most importantly, how software can sometimes exploit or protect users.

GDPR will give developers a renewed focus on the code they write

Equally, for developers, GDPR forces a renewed focus on the code they write. Discussions around standards have been a central point of contention in the open source world for some time; there has always been an unavoidable, quiet tension between innovation and standards compliance. Writing in Smashing Magazine, digital law expert Heather Burns has some very good advice on this: "Your coding standards must be preventive as well. You should disable unsafe or unnecessary modules, particularly in APIs and third-party libraries. An audit of what constitutes an unsafe module should be about privacy by design, such as the unnecessary capture and retention of personal data, as well as security vulnerabilities. Likewise, code reviews should include an audit for privacy by design principles, including mapping where data is physically and virtually stored, and how that data is protected, encrypted, and sandboxed." Sure, all of this seems like a headache, but it should make life better for users and customers. And while it might seem frustrating not to be able to track users in the way that we might have in the old world, by forcing everyone to focus on what users really want - not what we want them to want - we’ll ultimately get to a place that’s better for everyone.
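Burns' point about privacy by design can be made concrete with a small sketch. The snippet below is a hypothetical illustration, not part of any particular library: the field names, the secret key and the pseudonymize helper are invented for the example. It shows one way a team might minimise and pseudonymise personal data before it is stored or logged, so a code review can check these steps explicitly.

```python
import hashlib
import hmac

# Placeholder secret for this sketch; in practice, load it from a secrets store.
SECRET_KEY = b"rotate-me-and-keep-me-out-of-source-control"

# Only the fields the feature actually needs -- data minimisation.
ALLOWED_FIELDS = {"user_id", "country", "signup_date"}

def pseudonymize(value: str) -> str:
    """Replace a direct identifier with a keyed hash so records can still be
    linked internally without exposing the raw identifier."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def prepare_for_storage(raw_record: dict) -> dict:
    """Drop fields we have no basis to keep, and pseudonymise the user id."""
    record = {k: v for k, v in raw_record.items() if k in ALLOWED_FIELDS}
    if "user_id" in record:
        record["user_id"] = pseudonymize(str(record["user_id"]))
    return record

if __name__ == "__main__":
    signup = {
        "user_id": "42",
        "email": "jane@example.com",  # not needed downstream, so it is dropped
        "country": "DE",
        "signup_date": "2018-04-14",
    }
    print(prepare_for_storage(signup))
```

The design choice worth reviewing is exactly the one Burns describes: the list of retained fields and the pseudonymisation step are visible in one place, so an audit can ask why each field is kept and how identifiers are protected.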


Data science on Windows is a big no

Aaron Lazar
13 Apr 2018
5 min read
I read a post from a LinkedIn connection about a week ago. It read: “The first step in becoming a data scientist: forget about Windows.” Even if you’re not a programmer, that's pretty controversial. The first nerdy thought I had was: that’s not true. The first step to Data Science is not choosing an OS, it’s statistics! Anyway, I kept wondering what exactly is wrong with doing data science on Windows. Why is the legacy product (Windows), created by one of the leaders in Data Science and Artificial Intelligence, not suitable to support the very thing it is driving? As a publishing professional who has worked with a lot of authors, one of the main issues I’ve faced while collaborating with them is the compatibility of platforms, especially when it comes to sharing documents, working with code, etc. At least 80 percent of the authors I’ve worked with have been using something other than Windows. They are extremely particular about the platform they’re working on, and have usually chosen Linux. I don’t know if they consider it a punishable offence, but I’ve been using Windows since I was 12, even though I have played around with Macs and machines running Linux/Unix. I’ve never been affectionately drawn towards those machines as much as my beloved laptop that is happily rolling on Windows 10 Pro.

Why is data science on Windows a bad idea?

When Microsoft created Windows, its main idea was to make the platform as user-friendly as possible, and it focused every ounce of energy on that and voila! It created one of the simplest operating systems that one could ever use. Microsoft wanted to make computing easy for everyone - teachers, housewives, kids, business professionals. However, it did not cater to the developer community as much as to its users. Now, that’s not to say that you can’t use a Windows machine to code. Of course, you can run Python or R programs. But you’re likely to face issues with compatibility and speed. If you’re choosing to use the command line, and something goes wrong, it’s a real PITA to debug on Windows. Also, if you’re doing cluster computing with other Linux/Mac machines, it’s better to have one of them yourself. Many would agree that Windows is more likely to suffer a BSoD (Blue Screen of Death) than a Mac or a Unix machine, messing up an algorithm that’s been running for a long time. Check out our most read post, 15 useful Python libraries to make your Data science tasks easier.

Is it all that bad? Well, not really. In fact, if you need to pump in a couple more gigs of RAM, you can’t think of doing that on a Mac. Although you might still encounter some of the weird issues mentioned above on a Windows PC, you can always Google a workaround. Don’t beat yourself up if you own a PC. You can always set up a dual boot, running a Linux distribution in parallel. You might want to check out Vagrant for this. Also, you’ll be surprised to learn that if you’re a Mac owner planning some heavy-duty Deep Learning on a GPU, you can’t really run CUDA without messing things up: CUDA will only work well with NVIDIA's GPUs on a PC. In Joey Tribbiani's words, “This is a moo point.” To me, data science is really OS agnostic. For instance, now with Docker you don’t really have to worry much about which OS you’re running - so from that perspective, data science on Windows may work for you.

Still feel for Windows? Well, there are obviously drawbacks. You’ll still keep living with the fear of isolation that Microsoft tries to create in the minds of customers. Moreover, you’ll be faced with “slowdom”, if that’s a word, what with all the background processes eating away your computing power! You’ll be defying everything that modern computing is defined by - KISS, Open Source, Agile, etc. Another important thing to keep in mind is that when you’re working with so much data, you really don’t want to get hacked! Last but not least, if you’re intending to dabble with AI and Blockchain, your best bet is not going to be Windows. All said and done, if you’re a budding data scientist looking to buy some new equipment, you might want to consider a few things before you invest in your machine. Think about what you’ll be working with and what tools you might want to use; if you want to play safe, it’s best to go with a Linux system. If you have the money and want to flaunt it, while still enjoying support from most tools, think about a Mac. And finally, if you’re brave and not worried about having two OSes running on your system, go in for a Windows PC. So the next time someone decides to gift you a Windows PC, don’t politely decline right away. Grab it and swiftly install a Linux distro! Happy coding! :) *I will put an asterisk here, for the thoughts in this article are completely my personal opinion and might differ from person to person. Go ahead and share your thoughts in the comments section below.
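If you do end up doing data science on a Windows machine, one practical habit is to keep scripts OS-agnostic and to check for GPU support at runtime rather than assuming it. The sketch below uses only the Python standard library; treating the presence of the nvidia-smi tool as a sign of CUDA readiness is a rough heuristic, not a definitive test.

```python
import platform
import shutil
import subprocess

def describe_environment() -> dict:
    """Report the OS and whether an NVIDIA driver appears to be installed."""
    info = {
        "os": platform.system(),          # 'Windows', 'Linux' or 'Darwin'
        "python": platform.python_version(),
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
    }
    if info["nvidia_smi"]:
        # Ask the driver which GPUs it can see; ignore failures quietly.
        try:
            out = subprocess.run(
                ["nvidia-smi", "--list-gpus"],
                capture_output=True, text=True, check=True,
            )
            info["gpus"] = out.stdout.strip().splitlines()
        except (OSError, subprocess.CalledProcessError):
            info["gpus"] = []
    return info

if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

The same script runs unchanged on Windows, Linux or macOS, which is really the point of the Docker argument above: make the environment explicit instead of baking OS assumptions into your workflow.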


GDPR is pushing the Chief Data Officer role center stage

Aaron Lazar
13 Apr 2018
9 min read
Gartner predicted that by 2020 90% of large organizations in regulated industries will have a Chief Data Officer role. With the recent heat around Facebook, Mark Zuckerberg, and the fast-approaching GDPR compliance deadline, it’s quite likely that 2018 will be the year of the Chief Data Officer. This article was first published in October, 2017 and has been updated to keep up with the latest trends in GDPR. In 2014 around 400 CEOs and top business execs were asked how they recognize data as a corporate asset. They responded with a mixed set of reactions and viewed the worth of data in their organization in varied ways. Now, in 2018 these reactions have drastically changed – more and more organizations have realized the importance of Data-as-an-Asset. More importantly, the European Union (EU) has made it mandatory that General Data Protection Regulation (GDPR) compliance be sought by the 25th of May, 2018. The primary reason for creating the Chief Data Officer role was to connect the ends between functional management and IT teams in an organization. But now, it looks like the CDO will also be primarily focusing on setting up and driving GDPR compliance, to avoid a fine up to €20 million or 4% of the annual global turnover of the previous year, whichever is higher. We’re going to spend a few minutes breaking down the Chief Data Officer role for you, revealing several interesting insights along the way. Let’s start with the obvious question. What might the Chief Data Officer’s responsibilities be? Like other C-suite execs, a Chief Data Officer is expected to have a well-blended mix of technical know-how and business acumen. Their role is very diverse and sometimes comes across as a pain to define the scope. Here are some of the key responsibilities of a Chief Data Officer: Data Policies and GDPR Compliance Data security is one of the most important elements that any business must consider. It needs to comply with regulatory standards and requirements of the country where it operates. A Chief Data Officer is be responsible for ensuring the compliance of policies across all branches of business and the associated compliance requirement taxonomies, on a global level. What is GDPR? If you’re thinking there were no data protection laws before the General Data Protection Regulation 2016/679, that’s not true. The major difference, however, is that GDPR focuses more on customer data privacy and protection. GDPR requirements will change the way organizations store, process and protect their customers’ personal data. They will need solutions for assessing, implementing and maintaining GDPR compliance, and that’s where Chief Data Officers fit in. Using data to gain a competitive edge Chief Data Officers need to have sound knowledge of the business’ customers, markets where they operate, and strong analytical skills to leverage the right data, at the right time, at the right place. This would eventually give the business an edge over its competitors in the market. For example, a ferry service could use data to identify the rates that customers would be willing to pay at a certain time of the day. Setting in motion the best practices of data governance Organizations span across the globe these days and often employees from different parts of the world work on the same data. This can often result in data moving through unconnected systems, and ending up as inefficient or disjointed pieces of business information. 
A Chief Data Officer needs to ensure that this information is aggregated and maintained in such a way that clear information ownership across the organization is established. Architecting future-proof data solutions A Chief Data Officer often acts like a Data Architect. They will sometimes take on responsibility for planning, designing, and building Big Data systems and ensuring successful integration with other systems in an organization. Designing systems that can provide answers to the user’s problems now and in the future is vital. Chief Data Officers are often found asking themselves key questions like how to generate data with maximum reusability, while also making sure it’s as accurate and relevant as possible. Defining Information Management tools Different business units across the globe tend to use different tools, technologies, and systems to work on, store, and share information on an enterprise level. This greatly affects a company’s ability to access and leverage data for effective decision making and various other duties. A Chief Data Officer is responsible for establishing data-oriented standards across the business and for driving all arms of the business to comply with the standards and embrace change, to ensure the integrity of data. Spotting new opportunities A Chief Data Officer is responsible for spotting new opportunities where the business can venture into through careful analysis of data and past records. For example, a motor company would leverage certain sales information to make an informed decision on what age group to target with their new SUV in-the-making, to maximize sales. These are just the tip of the iceberg when it comes to a Chief Data Officer’s responsibilities. Responsibilities go hand-in-hand with skill and traits required to execute those responsibilities. Below are some key capabilities sought after in a CDO. Key Chief Data Officer Skills The right person for the job is expected to possess impeccable leadership and C-Suite level communication skills, as well as strong business acumen. They are expected to have strong knowledge of GDPR software tools and solutions to enable the organizations hiring them to swiftly transform and adopt the new regulations. They are also expected to possess knowledge of IT architecture, including a familiarity with leading architectural standards such as TOGAF or the Zachman framework. They need to be experienced in driving data governance as well as data quality and integrity, while also possessing a strong knowledge of data analytics, visualization, and storytelling. Familiar with Big Data solutions like Hadoop, MapReduce, and HBase is a plus. How much do Chief Data Officers earn?  Now, let’s take a look at what kind of compensation a Chief Data Officer is likely to be offered. To tell you the truth, the answer to this question is still a bit hazy, but it’s sure to pick up speed with the recent developments in the regulatory and legal areas related to data. About a year ago, a blog post from careeraddict revealed that the salary for a CDO in the US was around $112,000 annually. A job listing seen on Indeed quoted $200,000 as the annual salary. Indeed shows 7 jobs posted for a CDO in the last 15 days. We took the estimated salaries of CDOs and compared them with those of CIOs and CTOs in the same company. It turns out that most were on par, with a few CDO compensations falling slightly short of the CIO and CTO salary. These are just basic salary figures. Bonuses add on the side, amounting up to 50% in some cases. 
However, please note that these salary figures vary heavily based on the type of organization and the industry. Do businesses even need a Chief Data Officer? One might argue that some of the skills expected of a Chief Data Officer would also be held by the CIO or the Chief Digital Officer of the organization or the Data Protection Officer (if they have one). Then, why have a Chief Data Officer at all and incur an extra significant cost to the company? With the rapid change in tech and the rate at which data is generated, used, and discarded, most data pointers, point in the direction of having a separate Chief Data Officer, working alongside the CIO. It’s critical to have a clearly defined need for both roles to co-exist. Blurring the boundaries of the two roles can be detrimental and organizations must, therefore, be painstakingly mindful of the defined KRAs. The organisation should clearly define the two roles to keep the business structure running smoothly. A Chief Data Officer’s main focus will be on the latest data-centric technological innovations, their compliance to the new standards while also boosting customer engagement, privacy and in turn, loyalty and the business’s competitive advantage. The CIO, on the other hand, focuses on improving the bottom line by owning business productivity metrics, cost-cutting initiatives, making IT investments etc. – i.e., an inward facing data-management and architecture role. The CIO is the person who is therefore responsible for leading digital initiatives at a board level. In addition to managing data and governing information, if the CIO’s responsibilities were to also include implementing analytics in fresh ways to generate value for the business, it is going to be a tall order. To put it simply, it is more practical for the CIO to own the systems and the CDO to oversee all the bits and bytes that flow through these systems. Moreover, in several cases, the Chief Data Officer will act as a liaison between the business and IT. Thus, Chief Data Officers and CIOs both need to work together and support each other for a better functioning business. The bottom line: a Chief Data Officer is essential For an organization dealing with a lot of data, a Chief Data Officer is a must. Failure to have one on board can result in being fined €10 million euros or 2% of the organization’s worldwide turnover (depending on which is higher). Here are the criteria for an organization to have a dedicated personnel managing Data Protection. The organization’s core activities should: Have data processing operations which require regular and systematic monitoring of data subjects on a large scale or monitoring of individuals Be processing a large scale of special categories of data (i.e. sensitive data such as health, religion, race, sexual orientation etc.) Have data processing carried out by a public authority or a body processing personal data, except for courts operating in their judicial capacity Apart from this mandate, a Chief Data Officer can add immense value by aligning data-driven insights with its vision and goals. A CDO can bridge the gap between the CMO and the CIO, by focusing on meeting customer requirements through data-driven products. For those in data and insights centric roles such as data scientists, data engineers, data analysts and others, the CDO is a natural destination for their career progression journey. The Chief Data Officer role is highly attractive in terms of the scope of responsibilities, the capabilities and of course, the pay. 
Certification courses like this one are popping up to help individuals shape themselves for the role. All-in-all, this new C-suite position in most organizations, is the perfect pivot between old and new, bridging silos, and making a future where data privacy is intact.
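The penalty figures quoted above (up to €20 million or 4% of annual global turnover for the most serious infringements, and €10 million or 2% for the lesser tier, whichever is higher) are simple to compute. A quick sketch, using the thresholds as given in the article:

```python
def gdpr_max_fine(annual_turnover_eur: float, serious: bool = True) -> float:
    """Upper bound of a GDPR fine: a fixed cap or a percentage of annual
    global turnover, whichever is higher (figures as cited in the article)."""
    if serious:
        fixed_cap, pct = 20_000_000, 0.04   # EUR 20M or 4%
    else:
        fixed_cap, pct = 10_000_000, 0.02   # EUR 10M or 2%
    return max(fixed_cap, pct * annual_turnover_eur)

# A company turning over EUR 2 billion a year faces up to EUR 80M for a
# serious breach, and up to EUR 40M under the lower tier.
print(gdpr_max_fine(2_000_000_000))
print(gdpr_max_fine(2_000_000_000, serious=False))
```

For any organization with turnover above €500 million, the percentage term dominates the fixed cap, which is why large enterprises are treating the Chief Data Officer role as risk management rather than overhead.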


Amazon Sagemaker makes machine learning on the cloud easy

Amey Varangaonkar
12 Apr 2018
5 min read
Amazon Sagemaker was launched by Amazon back in November 2017. It was built with the promise of simplifying machine learning on the cloud. The software was a response not only to the increasing importance of machine learning, but also the fact that there is a demand to perform machine learning in the cloud. Amazon Sagemaker is clearly a smart move by Amazon that will consolidate the dominance of AWS in the cloud market. What is Amazon Sagemaker? Amazon Sagemaker is Amazon’s premium cloud-based service which serves as a platform for machine learning developers and data scientists to build, train and deploy machine learning models on the cloud. One of the features that makes Sagemaker stand out from the rest is that it is business-ready. This means machine learning models can be optimized for high performance and deployed at scale to work on data with varying sizes and complexity. The basic intention of Sagemaker, as Vogels mentioned in his keynote, is to remove any barriers that slow down the machine learning process for developers. In a standard machine learning process, a developer spends most of the time doing the following standard tasks: Collecting, cleaning and preparing the training data set. Selecting the most appropriate algorithm for the machine learning problem Training the model for accurate prediction Optimizing the model’s performance Integrating the model with the application Deploying the application to production Most of these tasks require a lot of expertise, and more importantly, time and efforts. Not to mention the computational resources such as storage space and processing memory. The larger the dataset, the bigger this problem becomes. Amazon Sagemaker removes these complexities by providing a solid platform with built-in modules that can be used together or individually to complete each of the above tasks with relative ease. How Amazon Sagemaker Works Amazon Sagemaker offers a lot of options for machine learning developers to train and optimize their machine learning models to work at scale. For starters, Sagemaker comes integrated with hosted Jupyter notebooks to allow developers to visually explore and analyze their dataset. You can also move your data directly from popular Amazon databases such as RDS, DynamoDB and Redshift into S3 and conduct your analysis there. The simple block diagram below demonstrates the core working of Amazon Sagemaker: Amazon Sagemaker includes 12 high performance, production-ready algorithms which can be used to build and deploy models at scale. Some of the popular ones include k-means clustering, Principal Component Analysis (PCA), neural topic modeling, and more. It comes pre-configured with popular machine learning and deep learning frameworks such as Tensorflow, PyTorch, Apache MXNet and more, but you can also use your own framework without any hassle. Once your model is trained, Sagemaker makes use of the AWS’ auto-scaled clusters to deploy the model, making sure the model doesn’t lack in performance and is highly available at all times. Not just that, Sagemaker also includes built-in testing capabilities for you to test and check your model for any issues, before it can be deployed for production. Benefits of using Amazon Sagemaker Business are likely to adopt Amazon Sagemaker, mainly because of the fact that it makes the whole machine learning process so effortless. With Sagemaker, it becomes very easy to build and deploy smarter applications that give accurate predictions, and thereby help increase the business profitability. 
Significantly reduces time: With built-in modules, Sagemaker significantly reduces the time required to do a variety of machine learning tasks, and the models can be deployed to production in very little time. This is important for businesses, as near-real time insights obtained from smart applications help them optimize their processes quickly, and effectively get an edge over their competition. Effortless and more productive machine learning: By virtue of the one-click training and deployment feature offered by Sagemaker, machine learning engineers and developers can now focus on asking the right questions of the data, and focus on the results rather than the process. They can also devote more time to optimizing the model rather than focusing on collecting and cleaning the data, which takes up most of their time. Flexibility in using the algorithms and frameworks: With Sagemaker, developers have the freedom to choose the best-possible algorithm and tool for performing machine learning effectively. Easy integration, access and optimization: The models trained using Sagemaker can be integrated into an existing business application seamlessly, and are optimized for speed and high performance. Backed by the computational power of AWS, business can rest assured their applications will continue to perform optimally without any risk of failure. Sagemaker - Amazon’s answer to Cloud Auto ML In a 3-way cloud war between Google, Microsoft and Amazon, it is clear Google and Amazon are trying to go head to head in order to establish their supremacy in the market, especially in the AI space. Sagemaker is Amazon’s answer to Google’s Cloud Auto ML, which was made publicly available in January, and delivers a similar promise - making machine learning easier than ever for developers. With Amazon serving a large customer-base, a platform like Sagemaker helps them to create a system that runs at scale and handles vast amounts of data quite effortlessly.  Amazon is yet to release any technical paper on how Sagemaker’s streaming algorithms work, but that will certainly be something to look out for in the near future. Considering Amazon identifies AI as key to their future product development, to think of Sagemaker as a better, more complete cloud service which also has deep learning capabilities is definitely not far-fetched.
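To give a rough idea of what the build-train-deploy workflow looks like in code, here is a minimal sketch using the SageMaker Python SDK and the built-in k-means algorithm mentioned above. Treat it as illustrative rather than definitive: the role ARN, bucket name and instance types are placeholders, and parameter names have shifted across SDK versions.

```python
import numpy as np
import sagemaker
from sagemaker import KMeans

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # placeholder role ARN

# Toy training data; in practice this would come from S3 or your data pipeline.
train_data = np.random.rand(1000, 10).astype("float32")

kmeans = KMeans(
    role=role,
    train_instance_count=1,
    train_instance_type="ml.c4.xlarge",
    output_path="s3://my-example-bucket/kmeans-output",  # placeholder bucket
    k=5,
)

# Train on a managed, ephemeral cluster, then deploy behind a hosted endpoint.
kmeans.fit(kmeans.record_set(train_data))
predictor = kmeans.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")

result = predictor.predict(train_data[:3])   # cluster assignments for 3 records
print(result)

predictor.delete_endpoint()                  # avoid paying for an idle endpoint
```

The appeal for the businesses described above is that the cluster provisioning, model hosting and scaling all sit behind those few calls, so the team's time goes into the data and the questions rather than the plumbing.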


BrickCoin might just change your mind about cryptocurrencies

Savia Lobo
11 Apr 2018
3 min read
At the start of 2018, the cryptocurrency boom seemed to be at an end. Bitcoin's price plunged in a matter of weeks from a mid-December 2017 high of $20,000 to less than $10,000. Suddenly everything seemed unpredictable and volatile. The cryptocurrency honeymoon was at an end. However, while many are starting to feel cautious about investing, a new cryptocurrency might change the game. BrickCoin might well be the cryptocurrency to reinvigorate a world that's shifted from optimism to pessimism in just a couple of months. But what is BrickCoin? How is it different from other cryptocurrencies? And most importantly, why might you be confident in its success?

What is BrickCoin?

BrickCoin is also a Blockchain-based currency, but one backed by real estate held in REITs (real estate investment trusts). BrickCoin aims to be the first regulated, KYC- and AML-compliant real estate crypto. Real estate is comprehensible to regulators and is accepted by many as a robust asset class. This is a major distinguishing point between BrickCoin and other cryptocurrencies. Traditional money-saving methods - savings accounts and fixed deposits - are not inflation-proof, and they offer very low levels of interest at the moment. On the other hand, complex investment options such as hedge funds are typically only available to wealthy individuals, as they require large initial investments. They also do not offer ready liquidity and are vulnerable to bankruptcy. BrickCoin comes to the rescue here, as it claims to be inflation-proof. Find out more about BrickCoin here.

The key features of BrickCoin

It is a savings token which can be bought with traditional currency or digital currency.
It represents an investment in a piece of commercial, debt-free real estate.
The real estate is held as part of a very secure, high-value, debt-free REIT.
BrickCoins are kept in a mobile digital wallet.
All transactions are fully managed, validated and trackable by blockchain technology.
BrickCoins can be converted into fiat currency instantly.

Also read about Crypto-ML, a machine learning powered cryptocurrency platform.

BrickCoin is essentially the next step in the evolution of cryptocurrency. It is a savings scheme backed by a non-inflationary asset - commercial, debt-free real estate - to deliver stable capital preservation. As a cryptocurrency, it allows savers to convert their money to and from BrickCoin tokens with the full security and convenience of blockchain technology. BrickCoin will be the first cryptocurrency that bridges the gap between the necessary reliance on fiat currencies and the asset-backed wealth-creation opportunities that are often out of reach for many ordinary savers.

Crypto News
Cryptojacking is a growing cybersecurity threat, report warns
Coinbase Commerce API launches
Crypto-ML, a machine learning powered cryptocurrency platform

Crypto Op-Ed
There and back again: Decrypting Bitcoin's 2017 journey from $1000 to $20000
Will Ethereum eclipse Bitcoin?
Beyond the Bitcoin: How cryptocurrency can make a difference in hurricane disaster relief

Cryptocurrency Tutorials
Predicting Bitcoin price from historical and live data
How to mine bitcoin with your Raspberry Pi
Protecting Your Bitcoins

Everything you need to know about Ethereum

Packt Editorial Staff
10 Apr 2018
8 min read
Ethereum was first conceived of by Vitalik Buterin in November 2013. The critical idea proposed was the development of a Turing-complete language that allows the development of arbitrary programs (smart contracts) for Blockchain and decentralized applications. This concept is in contrast to Bitcoin, where the scripting language is limited in nature and allows only the necessary operations. This is an excerpt from the second edition of Mastering Blockchain by Imran Bashir.

The following list shows all the releases of Ethereum, from the first release to the planned final release:

Olympic - May 2015
Frontier - July 30, 2015
Homestead - March 14, 2016
Byzantium (first phase of Metropolis) - October 16, 2017
Metropolis - To be released
Serenity (final version of Ethereum) - To be released

The first version of Ethereum, called Olympic, was released in May 2015. Two months later, a second version was released, called Frontier. After about a year, another version named Homestead was released in March 2016 with various improvements. The latest Ethereum release is called Byzantium. This is the first part of the development phase called Metropolis. This release implemented a planned hard fork at block number 4,370,000 on October 16, 2017. The second part of this release, called Constantinople, is expected in 2018, but there is no exact time frame available yet. The final planned release of Ethereum is called Serenity, which is intended to introduce the final version of a PoS-based blockchain in place of PoW.

The yellow paper

The Yellow Paper, written by Dr. Gavin Wood, serves as a formal definition of the Ethereum protocol. Anyone can implement an Ethereum client by following the protocol specifications defined in the paper. While this paper is a challenging read, especially for those who do not have a background in algebra or mathematics, it contains a complete formal specification of Ethereum. This specification can be used to implement a fully compliant Ethereum client. The list of symbols used in the paper is provided here, with their meanings, in the hope that it will make reading the Yellow Paper more accessible. Once the symbol meanings are known, it becomes much easier to understand how Ethereum works in practice.

≡ - Is defined as
= - Is equal to
≠ - Is not equal to
║...║ - Length of
∈ - Is an element of
∉ - Is not an element of
∀ - For all
∃ - There exists
∪ - Union
∧ - Logical AND
∨ - Logical OR
⊕ - Exclusive OR
: - Such that
{} - Set
() - Function of tuple
[] - Array indexing
. - Sequence concatenation
> - Is greater than
≤ - Less than or equal to
(a, b) - Real numbers >= a and < b
+ - Addition
- - Subtraction
∑ - Summation
⌊ ⌋ - Floor, lowest element
⌈ ⌉ - Ceiling, highest element
∅ - Empty set, null
{ - Describing various cases of if, otherwise
σ - Sigma, world state
μ - Mu, machine state
Υ - Upsilon, Ethereum state transition function
Π - Block-level state transition function
Λ - Contract creation function

The paper also defines symbols for increment and for the number of bytes.

Ethereum blockchain

Ethereum, like any other blockchain, can be visualized as a transaction-based state machine. This definition is referred to in the Yellow Paper. The core idea is that in the Ethereum blockchain, a genesis state is transformed into a final state by executing transactions incrementally. The final transformation is then accepted as the absolute undisputed version of the state.
In the following diagram, the Ethereum state transition function is shown, where a transaction execution has resulted in a state transition: In the example above, a transfer of two Ether from address 4718bf7a to address 741f7a2 is initiated. The initial state represents the state before the transaction execution, and the final state is what the morphed state looks like. Mining plays a central role in state transition, and we will elaborate the mining process in detail in the later sections. The state is stored on the Ethereum network as the world state. This is the global state of the Ethereum blockchain. How Ethereum works from a user's perspective For all the conversation around cryptocurrencies, it's very rare for anyone to actually explain how it works from the perspective of a user. Let's take a look at how it works in practice. In this example, I'll use the example of one man (Bashir) transferring money to another (Irshad). You may also want to read our post on if Ethereum will eclipse bitcoin. For the purposes of this example, we're using Jaxx wallet. However, you can use any cryptocurrency wallet for this. First, either a user requests money by sending the request to the sender, or the sender decides to send money to the receiver. The request can be sent by sending the receivers Ethereum address to the sender. For example, there are two users, Bashir and Irshad. If Irshad requests money from Bashir, then she can send a request to Bashir by using QR code. Once Bashir receives this request he will either scan the QR code or manually type in Irshad's Ethereum address and send Ether to Irshad's address. This request is encoded as a QR code shown in the following screenshot which can be shared via email, text or any other communication methods.2. Once Bashir receives this request he will either scan this QR code or copy the Ethereum address in the Ethereum wallet software and initiate a transaction. This process is shown in the following screenshot where the Jaxx Ethereum wallet software on iOS is used to send money to Irshad. The following screenshot shows that the sender has entered both the amount and destination address for sending Ether. Just before sending the Ether the final step is to confirm the transaction which is also shown here: Once the request (transaction) of sending money is constructed in the wallet software, it is then broadcasted to the Ethereum network. The transaction is digitally signed by the sender as proof that he is the owner of the Ether. This transaction is then picked up by nodes called miners on the Ethereum network for verification and inclusion in the block. At this stage, the transaction is still unconfirmed. Once it is verified and included in the block, the PoW process begins. Once a miner finds the answer to the PoW problem, by repeatedly hashing the block with a new nonce, this block is immediately broadcasted to the rest of the nodes which then verifies the block and PoW. If all the checks pass then this block is added to the blockchain, and miners are paid rewards accordingly. Finally, Irshad gets the Ether, and it is shown in her wallet software. This is shown here: On the blockchain, this transaction is identified by the following transaction hash: 0xc63dce6747e1640abd63ee63027c3352aed8cdb92b6a02ae25225666e171009e Details regarding this transaction can be visualized from the block explorer, as shown in the following screenshot: Thiswalkthroughh should give you some idea of how it works. 
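For readers who prefer code to screenshots, the same kind of transfer can be expressed programmatically. Below is a hedged sketch using the web3.py library against a local node; the node URL and the receiving address are placeholders, and the camelCase method names follow the web3.py releases current at the time of writing (later versions expose snake_case equivalents).

```python
from web3 import Web3

# Placeholder endpoint: a local node such as geth exposing JSON-RPC.
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))

bashir = w3.eth.accounts[0]  # sender: an unlocked account on the local node
irshad = "0x0000000000000000000000000000000000000001"  # placeholder receiver

# Construct and broadcast a transfer of 2 Ether, as in the walkthrough above.
tx_hash = w3.eth.sendTransaction({
    "from": bashir,
    "to": irshad,
    "value": w3.toWei(2, "ether"),
})

# Block until a miner includes the transaction in a block.
receipt = w3.eth.waitForTransactionReceipt(tx_hash)
print("Mined in block:", receipt.blockNumber)
print("Receiver balance:", w3.fromWei(w3.eth.getBalance(irshad), "ether"), "ETH")
```

The flow mirrors the walkthrough: the signed transaction is broadcast, picked up by miners, and only once it is included in a block does the receipt (and the updated balance) become available.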
Different Ethereum networks The Ethereum network is a peer-to-peer network where nodes participate in order to maintain the blockchain and contribute to the consensus mechanism. Networks can be divided into three types, based on requirements and usage. These types are described in the following subsections. Mainnet Mainnet is the current live network of Ethereum. The current version of mainnet is Byzantium (Metropolis) and its chain ID is 1. Chain ID is used to identify the network. A block explorer which shows detailed information about blocks and other relevant metrics is available here. This can be used to explore the Ethereum blockchain. Testnet Testnet is the widely used test network for the Ethereum blockchain. This test blockchain is used to test smart contracts and DApps before being deployed to the production live blockchain. Because it is a test network, it allows experimentation and research. The main testnet is called Ropsten which contains all features of other smaller and special purpose testnets that were created for specific releases. For example, other testnets include Kovan and Rinkeby which were developed for testing Byzantium releases. The changes that were implemented on these smaller testnets has also been implemented on Ropsten. Now the Ropsten test network contains all properties of Kovan and Rinkeby. Private net As the name suggests, this is the private network that can be created by generating a new genesis block. This is usually the case in private blockchain distributed ledger networks, where a private group of entities start their blockchain and use it as a permissioned blockchain. The following table shows the list of Ethereum network with their network IDs. These network IDs are used to identify the network by Ethereum clients. Network name Network ID / Chain ID Ethereum mainnet 1 Morden 2 Ropsten 3 Rinkeby 4 Kovan 42 Ethereum Classic mainnet 61   You should now have a good foundation of knowledge to get started with Ethereum. To learn more about Ethereum and other cryptocurrencies, check out the new edition of Mastering Blockchain. Other posts from this book A brief history of Blockchain Write your first Blockchain: Learning Solidity Programming in 15 minutes 15 ways to make Blockchains scalable, secure and safe! What is Bitcoin


A brief history of Blockchain

Packt Editorial Staff
09 Apr 2018
6 min read
History - where do we start? Blockchain was introduced with the invention of Bitcoin in 2008, and its first practical implementation followed in 2009. Of course, Blockchain and Bitcoin are very different things, but you can't tell the full story behind the history of Blockchain without starting with Bitcoin.

Electronic cash before Blockchain

The concept of electronic cash or digital currency is not new. Since the 1980s, e-cash protocols have existed based on a model proposed by David Chaum. This is an extract from the new edition of Mastering Blockchain. Just as you need to understand the concept of distributed systems to properly understand Blockchain, you also need to understand electronic cash. This concept pre-dates Blockchain and Bitcoin, but without it, we would certainly not be where we are today. Two fundamental e-cash system issues need to be addressed: accountability and anonymity. Accountability is required to ensure that cash is spendable only once (the double-spend problem) and that it can only be spent by its rightful owner. The double-spend problem arises when the same money can be spent twice. As it is quite easy to make copies of digital data, this becomes a big issue in digital currencies, since you can make many copies of the same digital cash. Anonymity is required to protect users' privacy. As with physical cash, it should be almost impossible to trace spending back to the individual who actually paid the money. David Chaum solved both of these problems during his work in the 1980s by using two cryptographic operations, namely blind signatures and secret sharing. Blind signatures allow for signing a document without actually seeing it, and secret sharing is a concept that enables the detection of double spending, that is, using the same e-cash token twice. In 2009, the first practical implementation of an electronic cash (e-cash) system named Bitcoin appeared. The term cryptocurrency emerged later. For the very first time, it solved the problem of distributed consensus in a trustless network. It used public key cryptography with a Proof of Work (PoW) mechanism to provide a secure, controlled, and decentralized method of minting digital currency. The key innovation was the idea of an ordered list of blocks composed of transactions and cryptographically secured by the PoW mechanism. Other technologies that acted as precursors to Bitcoin include Merkle trees, hash functions, and hash chains. Looking at all the technologies mentioned earlier and their history, it is easy to see how concepts from electronic cash schemes and distributed systems were combined to create Bitcoin and what is now known as Blockchain.

Blockchain and Satoshi Nakamoto

In 2008, a groundbreaking paper entitled Bitcoin: A Peer-to-Peer Electronic Cash System was written on the topic of peer-to-peer electronic cash under the pseudonym Satoshi Nakamoto. It introduced the term chain of blocks. No one knows the actual identity of Satoshi Nakamoto. After introducing Bitcoin in 2009, he remained active in the Bitcoin developer community until 2011. He then handed over Bitcoin development to its core developers and simply disappeared. Since then, there has been no communication from him whatsoever, and his existence and identity are shrouded in mystery. The term chain of blocks evolved over the years into the word Blockchain. Since that point, the history of Blockchain has really been the history of its application in different industries. The most notable area is, unsurprisingly, finance. Blockchain has been shown to improve the speed and security of financial transactions. While it hasn't yet become embedded in the mainstream of the financial sector, it surely remains only a matter of time before it begins to take hold.

How it has evolved in recent years

In Blockchain: Blueprint for a New Economy, Melanie Swan identifies three different tiers of Blockchain. These tiers all showcase how Blockchain is currently evolving. It's worth noting that these tiers or versions aren't simple chronological points in the history of Blockchain. The lines between each are blurred, and which features and capabilities appear ultimately depends on how the Blockchain technology is being applied.

Blockchain 1.0: This tier was introduced with the invention of Bitcoin, and it is primarily used for cryptocurrencies. As Bitcoin was the first implementation of a cryptocurrency, it makes sense for this first generation of Blockchain technology to include only cryptographic currencies. All alternative cryptocurrencies, as well as Bitcoin, fall into this category. It includes core applications such as payments. This generation started in 2009, when Bitcoin was released, and ended in early 2010.

Blockchain 2.0: This second Blockchain generation is used by financial services and smart contracts. This tier includes various financial assets, such as derivatives, options, swaps, and bonds. Applications that go beyond currency, finance, and markets are incorporated at this tier. Ethereum, Hyperledger, and other newer Blockchain platforms are considered part of Blockchain 2.0. This generation started when ideas about using blockchain for other purposes began to emerge in 2010.

Blockchain 3.0: This third Blockchain generation is used to implement applications beyond the financial services industry, in government, health, media, the arts, and justice. Again, as in Blockchain 2.0, Ethereum, Hyperledger, and newer blockchains with the ability to code smart contracts are considered part of this tier. This generation emerged around 2012, when multiple applications of Blockchain technology in different industries were being researched.

Blockchain X.0: This generation represents a vision of Blockchain singularity, where one day there will be a public Blockchain service available that anyone can use, just like the Google search engine. It will provide services for all realms of society. It will be a public and open distributed ledger with general-purpose rational agents (Machina economicus) running on a Blockchain, making decisions and interacting with other intelligent autonomous agents on behalf of people, regulated by code instead of law or paper contracts. This does not mean that law and contracts will disappear; rather, law and contracts will be implementable in code.

Like any history, this history of Blockchain isn't exhaustive, but it does hopefully give you an idea of how it has developed to where we are today. Check out this tutorial to write your first Blockchain program.
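The "ordered list of blocks secured by Proof of Work" idea described above fits in a few lines of code. The following is a toy sketch, not a real cryptocurrency: it chains blocks by hash and mines each one by searching for a nonce that gives the block hash a few leading zeros.

```python
import hashlib
import json

DIFFICULTY = 4  # number of leading zeros required in the block hash

def block_hash(block: dict) -> str:
    """Hash the block's contents deterministically with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def mine_block(previous_hash: str, transactions: list) -> dict:
    """Proof of Work: increment the nonce until the block hash meets the target."""
    block = {
        "transactions": transactions,
        "previous_hash": previous_hash,
        "nonce": 0,
    }
    while not block_hash(block).startswith("0" * DIFFICULTY):
        block["nonce"] += 1
    return block

if __name__ == "__main__":
    genesis = mine_block("0" * 64, ["genesis"])
    block_1 = mine_block(block_hash(genesis), ["alice pays bob 5"])
    print("genesis:", block_hash(genesis))
    print("block 1:", block_hash(block_1))
```

Because each block embeds the hash of its predecessor, tampering with an old transaction changes that block's hash and breaks every later link, which is the property that makes the chain of blocks tamper-evident.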


Why Oracle is losing the Database Race

Aaron Lazar
06 Apr 2018
3 min read
When you think of databases, the first thing that comes to mind is Oracle or IBM. Oracle has been ruling the database world for decades now, and it has been able to acquire tonnes of applications that use its databases. However, that’s changing now, and if you didn’t know already, you might be surprised to know that Oracle is losing the database race. Oracle = Goliath Oracle was and still is ranked number one among databases, owing to its legacy in the database ballpark. Source - DB Engines The main reason why Oracle has managed to hold its position is because of lock-in, a CIO’s worst nightmare. Migrating data that’s accumulated over the years is not a walk in the park and usually has top management flinching every time it’s mentioned. Another reason is because Oracle is known to be aggressive when it comes to maintaining and enforcing licensing terms. You won’t be surprised to find Oracle ‘agents’ at the doorstep of your organisation, slapping you with a big fine for non-compliance! Oracle != Goliath for everyone You might wonder whether even the biggies are in the same position, locked-in with Oracle. Well, the Amazons and Salesforces of the world have quietly moved away from lock-in hell and have their applications now running on open-source projects. In fact, Salesforce plans to be completely free of Oracle databases by 2023 and has even codenamed this project “Sayonara”. I wonder what inspired the name! Enter the “Davids” of Databases While Oracle’s databases have been declining, alternatives like SQL Server and PostgreSQL have been steadily growing. SQL Server has been doing it in leaps and bounds, with a growth rate of over 30%. Amazon and Microsoft’s cloud based databases have seen close to 10x growth. While one might think that all Cloud solutions would have dominated the database world, databases like Google Cloud SQL and IBM Cognos have been suffering very slow to no growth as the question of lock-in arises again, only this time with a cloud vendor. MongoDB has been another shining star in the database race. Several large organisations like HSBC, Adobe, Ebay, Forbes and MTV have adopted MongoDB as their database solution. Newer organisations have been resorting to adopt these databases instead to looking to Oracle. However, it’s not really eating into Oracle’s existing market, at least not yet. Is 18c Oracle’s silver bullet? Oracle bragged a lot about 18c, last year, positioning it as a database that needs little to no human interference thanks to its ground-breaking machine learning; one that operates at less than 30 minutes of downtime a year and many more features. Does this make Microsoft and Amazon break into a sweat? Hell no! Although Oracle has strategically positioned 18c as a database that lowers operational cost by cutting down on the human element, it still is quite expensive when compared to its competitors - they haven’t dropped their price one bit. Moreover, it can’t really automate “everything” and there’s always a need for a human administrator - not really convincing enough. Quite naturally customers will be drawn towards competition. In the end, the way I look at it, Oracle already had a head start and is now inches from the elusive finish line, probably sniggering away at all the customers that it has on a leash. All while cloud databases are slowly catching up and will soon be leaving Oracle in a heap of dirt. Reminds me of that fable mum used to read to me...what’s it called...The hare and the tortoise.


Top 10 Tools for Computer Vision

Aaron Lazar
05 Apr 2018
7 min read
The adoption of Computer Vision has been steadily picking up pace over the past decade, but there’s been a spike in adoption of various computer vision tools in recent times, thanks to its implementation in fields like IoT, manufacturing, healthcare, security, etc. Computer vision tools have evolved over the years, so much so that computer vision is now also being offered as a service. Moreover, the advancements in hardware like GPUs, as well as machine learning tools and frameworks make computer vision much more powerful in the present day. Major cloud service providers like Google, Microsoft and AWS have all joined the race towards being the developers’ choice. But which tool should you choose? Today I’ll take you through a list of the top tools and will help you understand which one to pick up, based on your need. Computer Vision Tools/Libraries OpenCV: Any post on computer vision is incomplete without the mention of OpenCV. OpenCV is a great performing computer vision tool and it works well with C++ as well as Python. OpenCV is prebuilt with all the necessary techniques and algorithms to perform several image and video processing tasks. It’s quite easy to use and this makes it clearly the most popular computer vision library on the planet! It is multi-platform, allowing you to build applications for Linux, Windows and Android. At the same time, it does have some drawbacks. It gets a bit slow when working through massive data sets or very large images. Moreover, on its own, it doesn’t have GPU support and relies on CUDA for GPU processing. Matlab: Matlab is a great tool for creating image processing applications and is widely used in research. The reason being that Matlab allows quick prototyping. Another interesting aspect is that Matlab code is quite concise, as compared to C++, making it easier to read and debug. It tackles errors before execution by proposing some ways to make the code faster. On the downside, Matlab is a paid tool. Also, it can get quite slow during execution time, if that’s something that concerns you much. Matlab is not your go to tool in an actual production environment, as it was basically built for prototyping and research. AForge.NET/Accord.NET: You’ll be excited to know that image processing is possible even if you’re a C# and .NET developer, thanks to AForge/Accord. It’s a great tool that has a lot of filters and is great for image manipulation and different transforms. The Image Processing Lab allows for filtering capabilities like edge detection and more. AForge is extremely simple to use as all you need to do is adjust parameters from a user interface. Moreover, its processing speeds are quite good. However, AForge doesn’t possess the power and capabilities of other tools like OpenCV, like advanced motion picture analysis or even advanced processing on images. TensorFlow: TensorFlow has been gaining popularity over the past couple of years, owing to its power and ease of use. It lets you bring the power of Deep Learning to computer vision and has some great tools to perform image processing/classification - it’s API-like graph tensor. Moreover, you can make use of the Python API to perform face and expression detection. You can also perform classification using techniques like regression. Tensorflow also allows you to perform computer vision of tremendous magnitudes. One of the main drawbacks of Tensorflow is that it’s extremely resource hungry and can devour a GPU’s capabilities in no time, quite uncalled for. 
Moreover, if you want to learn how to perform image processing with TensorFlow, you’d have to understand what machine learning and deep learning are, write your own algorithms, and then go forward from there.

CUDA: CUDA is a platform for parallel computing, invented by NVIDIA. It enables great boosts in computing performance by leveraging the power of GPUs. The CUDA Toolkit includes the NVIDIA Performance Primitives library, which is a collection of signal, image, and video processing functions. If you have large, GPU-intensive images to process, CUDA is a good choice. CUDA is easy to program and is quite efficient and fast. On the downside, it is extremely high on power consumption, and you will find yourself reformulating your code for memory distribution in parallel tasks.

SimpleCV: SimpleCV is a framework for building computer vision applications. It gives you access to a multitude of computer vision tools such as OpenCV, pygame, etc. If you don’t want to get into the depths of image processing and just want to get your work done, this is the tool to get your hands on. If you want to do some quick prototyping, SimpleCV will serve you best. However, if your intention is to use it in heavy production environments, you cannot expect it to perform on the level of OpenCV. Moreover, the community forum is not very active, and you might find yourself running into walls, especially with the installation.

GPUImage: GPUImage is a framework, or rather an iOS library, that allows you to apply GPU-accelerated effects and filters to images, live motion video, and movies. It is built on OpenGL ES 2.0. Running custom filters on a GPU normally calls for a lot of code to set up and maintain; GPUImage cuts down on all of that boilerplate and gets the job done for you.

Computer Vision as a Service

Google Cloud and Mobile Vision APIs: Google Cloud Vision API enables developers to perform image processing by encapsulating powerful machine learning models in a simple REST API that can be called from an application. Its Optical Character Recognition (OCR) functionality also enables you to detect text in your images. The Mobile Vision API lets you detect objects in photos and video, using real-time on-device vision technology. It also lets you scan and recognise barcodes and text.

Amazon Rekognition: Amazon Rekognition is a deep learning-based image and video analysis service that makes adding image and video analysis to your applications a piece of cake. The service can identify objects, text, people, scenes and activities, and it can also detect inappropriate content, apart from providing highly accurate facial analysis and facial recognition for sentiment analysis.

Microsoft Azure Computer Vision API: Microsoft’s API is quite similar to its peers and allows you to analyse images, read text in them, and analyse video in near-real time. You can also flag adult content, generate thumbnails of images and recognise handwriting.

Bonus: SciPy and NumPy: I thought I’d add these in as well, since I’ve seen quite a few developers use Python to build computer vision applications (without OpenCV, that is). SciPy and NumPy are powerful enough to perform image processing. scikit-image is a Python package dedicated to image processing, which uses native NumPy and SciPy arrays as image objects. Moreover, you get to use the cool IPython interactive computing environment, and you can also choose to include OpenCV if you want to do some more hardcore image processing.
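To give a flavour of how little code a basic pipeline takes with one of these libraries, here is a minimal OpenCV sketch in Python that loads an image and runs Canny edge detection. The file names and threshold values are placeholders chosen for illustration, not something prescribed by any of the tools above.

    import cv2

    # Load an image from disk (replace 'input.jpg' with your own file)
    image = cv2.imread('input.jpg')

    # Convert to grayscale, since edge detectors work on single-channel images
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Smooth the image to reduce noise before edge detection
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # Detect edges with the Canny algorithm; 50 and 150 are the hysteresis thresholds
    edges = cv2.Canny(blurred, 50, 150)

    # Save the result next to the input
    cv2.imwrite('edges.jpg', edges)

The same few steps — load, convert, filter, detect — are what most of the tools in this list wrap in their own APIs.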
Well, there you have it - these were the top tools for computer vision and image processing. Head on over and check out these resources to get working with some of the top tools used in the industry.

Top 5 programming languages for crunching Big Data effectively

Amey Varangaonkar
04 Apr 2018
8 min read
One of the most important decisions that Big Data professionals have to make, especially the ones who are new to the scene or are just starting out, is choosing the best programming language for big data manipulation and analysis. Understanding the Big Data problem and framing the architecture to solve it is not quite enough these days - the execution needs to be perfect as well, and choosing the right language goes a long way.

The best languages for big data

In this article, we look at 5 of the most popularly used - not to mention highly effective - programming languages for developing Big Data solutions.

Scala

A beautiful crossover of the object-oriented and functional programming paradigms, Scala is fast and robust, and a popular choice of language for many Big Data professionals. The fact that two of the most popular Big Data processing frameworks - Apache Spark and Apache Kafka - have been built on top of Scala tells you everything you need to know about the power of the language. Scala runs on the JVM, which means code written in Scala can be easily used within a Java-based Big Data ecosystem. One significant factor that differentiates Scala from Java, though, is that Scala is a lot less verbose in comparison: you can write hundreds of lines of confusing-looking Java code in fewer than 15 lines of Scala. One negative aspect of Scala, though, is its steep learning curve when compared to languages like Go and Python, and this may put off beginners looking to use it.

Why use Scala for big data?
- Fast and robust
- Suitable for working with Big Data tools like Apache Spark for distributed Big Data processing
- JVM compliant, can be used in a Java-based ecosystem

Python

Python was declared one of the fastest growing programming languages in 2018 as per the recently held Stack Overflow Developer Survey. Its general-purpose nature means it can be used across a broad spectrum of use-cases, and Big Data programming is one major area of application. Many libraries for data analysis and manipulation which are increasingly being used in Big Data frameworks to clean and manipulate large chunks of data - such as pandas, NumPy and SciPy - are all Python-based. Not just that, most popular machine learning and deep learning frameworks such as scikit-learn, TensorFlow and many more are also written in Python and are finding increasing application within the Big Data ecosystem.

One drawback of using Python, and a reason why it is not yet a first-class citizen when it comes to Big Data programming, is that it’s slow. Although Python is very easy to use, Big Data professionals have found systems built with languages such as Java or Scala faster and more robust than systems built with Python. However, Python makes up for this limitation with other qualities. As Python is primarily a scripting language, interactive coding and development of analytical solutions for Big Data becomes very easy. Python can integrate effortlessly with existing Big Data frameworks such as Apache Hadoop and Apache Spark, allowing you to perform predictive analytics at scale without any problem.

Why use Python for big data?
- General-purpose
- Rich libraries for data analysis and machine learning
- Easy to use
- Supports iterative development
- Rich integration with Big Data tools
- Interactive computing through Jupyter notebooks

R

It won’t come as a surprise to many that those who love statistics, love R.
The ‘language of statistics’, as it is popularly called, R is used to build data models which can be used for effective and accurate data analysis. Powered by a large repository of R packages (CRAN, the Comprehensive R Archive Network), with R you have just about every type of tool to accomplish any task in Big Data processing - right from analysis to data visualization. R can be integrated seamlessly with Apache Hadoop and Apache Spark, among other popular frameworks, for Big Data processing and analytics.

One issue with using R as a programming language for Big Data is that it is not very general-purpose. This means the code written in R is not production-deployable and generally has to be translated to some other programming language such as Python or Java. That said, if your goal is only to build statistical models for Big Data analytics, R is an option you should definitely consider.

Why use R for big data?
- Built for data science
- Support for Hadoop and Spark
- Strong statistical modeling and visualization capabilities
- Support for Jupyter notebooks

Java

Then there’s always the good old Java. Some of the traditional Big Data frameworks such as Apache Hadoop, and all the tools within its ecosystem, are Java-based and still in use today in many enterprises. Not to mention the fact that Java is the most stable and production-ready language among all the languages we have discussed so far! Using Java to develop your Big Data applications gives you the ability to use a large ecosystem of tools and libraries for interoperability, monitoring and much more, most of which have already been tried and tested.

One major drawback of Java is its verbosity. The fact that you have to write hundreds of lines of code in Java for a task which can be written in barely 15-20 lines of code in Python or Scala can turn off many budding programmers. However, the introduction of lambda functions in Java 8 does make life quite a bit easier. Java also does not support iterative development, unlike newer languages like Python, and this is an area of focus for future Java releases. Despite the flaws, Java remains a strong contender when it comes to the preferred language for Big Data programming because of its history and the continued reliance on traditional Big Data tools and frameworks.

Why use Java for big data?
- Traditional Big Data tools and frameworks are written in Java
- Stable and production-ready
- Large ecosystem of tried and tested tools and libraries

Go

Last but not the least, there’s Go - one of the fastest rising programming languages in recent times. Designed by a group of Google engineers who were frustrated with C++, we think Go is a good shout in this list - simply because it powers so many tools used in the Big Data infrastructure, including Kubernetes, Docker and many more. Go is fast, easy to learn, and fairly easy to develop applications with, not to mention deploy them. More importantly, as businesses look at building data analysis systems that can operate at scale, Go-based systems are being used to integrate machine learning and parallel processing of data. It is also possible to interface other languages with Go-based systems with relative ease.

Why use Go for big data?
- Fast, easy to use
- Many tools used in the Big Data infrastructure are Go-based
- Efficient distributed computing

There are a few other languages you might want to consider - Julia, SAS and MATLAB being some major ones which are useful in their own right.
However, when compared to the languages we talked about above, we thought they fell a bit short in some aspects - be it speed, efficiency, ease of use, documentation, or community support, among other things.

Let’s take a quick look at the comparison table of all the languages we discussed above. Note that we have used the ✓ symbol for the best possible language/s for each criterion, to help you make an informed decision. This is just our view, and that’s not to say that the other languages are any worse!

|                                  | Scala | Python | R | Java | Go |
|----------------------------------|-------|--------|---|------|----|
| Speed                            | ✓     |        |   | ✓    | ✓  |
| Ease of use                      |       | ✓      | ✓ |      | ✓  |
| Quick learning curve             |       | ✓      |   |      | ✓  |
| Data analysis capability         | ✓     | ✓      | ✓ |      |    |
| General-purpose                  | ✓     | ✓      |   | ✓    | ✓  |
| Big Data support                 | ✓     | ✓      | ✓ | ✓    | ✓  |
| Interfacing with other languages | ✓     | ✓      |   |      | ✓  |
| Production-ready                 | ✓     |        |   | ✓    | ✓  |

So...which language should you choose? To answer the question in short - it all depends on the use-case you want to develop. If your focus is hardcore data analysis which involves a lot of statistical computing, R would be your go-to language. On the other hand, if you want to develop streaming applications for your Big Data, Scala can be a preferable choice. If you wish to use Machine Learning to leverage your Big Data and build predictive models, Python will come to your rescue. Lastly, if you plan to build Big Data solutions using just the traditionally-available tools, Java is the language for you.

You also have the option of combining the power of two languages to get a more efficient and powerful solution. For example, you can train your machine learning model in Python and deploy it on Spark in a distributed mode. Ultimately, it all depends on how efficiently your solution can function, and more importantly, how fast and accurate it is.

Which language do you prefer for crunching your Big Data? Do let us know!
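As a small illustration of how naturally Python sits on top of Spark, here is a minimal PySpark sketch that reads a CSV and fits a regression model in a distributed fashion. The file name and column names are placeholders invented for this example, not something from the article.

    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.regression import LinearRegression

    # Start (or reuse) a Spark session
    spark = SparkSession.builder.appName("big-data-example").getOrCreate()

    # Load a CSV file into a distributed DataFrame (path and schema are placeholders)
    df = spark.read.csv("sales.csv", header=True, inferSchema=True)

    # Assemble the feature columns into a single vector column for MLlib
    assembler = VectorAssembler(inputCols=["units", "price"], outputCol="features")
    train = assembler.transform(df)

    # Fit a simple linear regression model across the cluster
    model = LinearRegression(featuresCol="features", labelCol="revenue").fit(train)
    print(model.coefficients)

    spark.stop()

The same script runs unchanged on a laptop or on a multi-node cluster, which is a big part of Python's appeal in the Big Data ecosystem.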

Why DeepMind made Sonnet open source

Sugandha Lahoti
03 Apr 2018
3 min read
DeepMind has always open sourced their projects with a bang. Last year, it announced that it was going to open source Sonnet, a library for quickly building neural network modules with TensorFlow. DeepMind shifted from Torch to TensorFlow as its framework of choice in early 2016, having been acquired by Google in 2014.

Why Sonnet if you have TensorFlow?

Since adopting TensorFlow as their framework of choice, DeepMind has enjoyed the flexibility and adaptiveness of TF for building higher-level frameworks. In order to build neural network modules with TensorFlow, they created a framework called Sonnet. Sonnet doesn’t replace TensorFlow; it simply eases the process of constructing neural networks. Prior to Sonnet, DeepMind developers were forced to become intimately familiar with the underlying TensorFlow graphs in order to correctly architect their applications. With Sonnet, the creation of neural network components is quite easy: you first construct Python objects which represent some part of a neural network, and then separately connect these objects into the TensorFlow computation graph.

What makes Sonnet special?

Sonnet uses Modules. Modules encapsulate elements of a neural network, which in turn abstracts low-level aspects of TensorFlow applications. Sonnet enables developers to build their own Modules using a simple programming model. These Modules simplify neural network training and can help to implement individual neural networks that can be combined to build higher-level networks. Developers can also easily extend Sonnet by implementing their own modules. Using Sonnet, it becomes easier to switch between different models, allowing engineers to freely conduct experiments without worrying about hampering their entire projects.

Why open source Sonnet?

The announcement of Sonnet’s open sourcing came on April 7, 2017. Most people appreciated it as a move in the right direction. One of the focal purposes of DeepMind open sourcing Sonnet was to let the developer community use Sonnet to take their own research forward. According to FossBytes, "DeepMind foresees Sonnet to be used by the community as a research propellant." With this open sourcing, the machine learning community can more actively contribute back by utilizing Sonnet in their own projects. Moreover, if the community becomes accustomed and acquainted with DeepMind’s internal libraries, it will become easier for the DeepMind group to release other machine learning models alongside research papers.

Certain experienced developers also point out that using TensorFlow and Sonnet together is similar to using TensorFlow and Torch together, with a Reddit comment stating “DeepMind's trying to turn TensorFlow into Torch”. Nevertheless, the open sourcing of Sonnet is seen as part of DeepMind’s broader commitment to open source AI research. Also, as Sonnet is adopted by the community, more similar frameworks are likely to develop that make neural network construction easier using TensorFlow as the underlying runtime - a further step towards the democratization of machine learning. Sonnet is already available on GitHub and will be regularly updated by the DeepMind team to match the in-house version.
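To give a sense of the "construct Python objects, then connect them into the graph" idea, here is a minimal sketch assuming the Sonnet 1.x API on top of TensorFlow 1.x. It is a toy example for illustration, not DeepMind's own code; the layer sizes and names are arbitrary.

    import sonnet as snt
    import tensorflow as tf

    # Construct Python objects that represent parts of a network...
    inputs = tf.placeholder(tf.float32, shape=[None, 784])
    hidden = snt.Linear(output_size=128, name="hidden")
    output = snt.Linear(output_size=10, name="output")

    # ...and only then connect them into the TensorFlow graph.
    # Calling a module on a tensor wires it into the computation graph.
    logits = output(tf.nn.relu(hidden(inputs)))

    # The modules create and manage their own variables.
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

Because the same module object can be called on several tensors, its variables are shared automatically - one of the conveniences Sonnet adds on top of raw TensorFlow.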

Emoji Scavenger Hunt showcases TensorFlow.js

Richard Gall
03 Apr 2018
3 min read
What is Emoji Scavenger Hunt?

Emoji Scavenger Hunt is a game built using neural networks. Developed by Google using TensorFlow.js, a version of the machine learning library designed to run in browsers, the game showcases how machine learning can be brought to web applications. But more importantly, TensorFlow.js, which was announced at the end of March at the TensorFlow Developer Summit, looks like it could be a tool to define the next few years of web development, making machine learning more accessible to JavaScript developers than ever before. Start playing now.

At the moment Emoji Scavenger Hunt is pretty basic, but the central idea is pretty cool. When you open up the web page in your browser and click 'Let's Play', the app asks for access to your camera. The game then starts: you'll see a countdown, before your camera opens and the web application asks you to find an example of an emoji in the real world. If you find yourself easily irritated, you're probably not going to get addicted, as Google seem to have done their best to cultivate an emoji-esque mise en scène. But the game nevertheless highlights not only how neural networks work, but also, in the context of TensorFlow.js, how they might operate in a browser.

Of course, one of the reasons Emoji Scavenger Hunt is so basic is because a core part of the game is training the neural network. Presumably, as more people play it, the neural network will improve at 'guessing' which objects in the real world relate to which emoji on your keyboard.

TensorFlow.js will bring machine learning to the browser

What's exciting is how TensorFlow.js might help shape the future of web development. It's going to make it much easier for JavaScript developers to get started with machine learning - on Reddit a number of users were thankful that they could now use TensorFlow without touching a line of Python code. On the other hand - perhaps a little less likely - TensorFlow.js might lead to more machine learning developers using JavaScript. If games like Emoji Scavenger Hunt become the norm, engineers and data scientists will have a new way to train algorithms - getting users to do it for them.

TensorFlow.js and deeplearn.js

Eagle-eyed readers who have been watching TensorFlow closely might be thinking here - what about deeplearn.js? Fortunately, the TensorFlow team have an answer: TensorFlow.js "...is the successor to deeplearn.js, which is now called TensorFlow.js Core."

TensorFlow.js and the future of machine learning

The announcement of TensorFlow.js highlights that Google and the core development team behind TensorFlow have a clear focus on the future. They're already behind the definitive library for machine learning and deep learning. What this will do is spread its dominance into new domains. Emoji Scavenger Hunt is pointing the way - we're sure to see plenty of machine learning imitators and innovators over the next few years.

Paper in Two minutes: i-RevNet, a deep invertible convolutional network

Sugandha Lahoti
02 Apr 2018
4 min read
The ICLR 2018 accepted paper, i-RevNet: Deep Invertible Networks, introduces i-RevNet, an invertible convolutional network that does not discard any information about the input while classifying images. The paper is authored by Jörn-Henrik Jacobsen, Arnold W.M. Smeulders, and Edouard Oyallon. The 6th annual ICLR conference is scheduled to happen between April 30 - May 03, 2018.

i-RevNet, a deep invertible convolutional network

What problem is the paper attempting to solve?

A CNN is generally composed of a cascade of linear and nonlinear operators. These operators are very effective in classifying images of all sorts, but reveal little about the contribution of the internal representation to the classification. The learning process of a CNN works by a regular reduction of large amounts of uninformative variability in the images, in order to reveal the essence of the visual class. However, the extent to which information is discarded is lost somewhere in the intermediate nonlinear processing steps. There is also a widespread belief that discarding information is essential for learning representations that generalize well to unseen data. The authors of this paper show that discarding information is not necessary and support this claim with empirical evidence. The paper also provides an understanding of the variability reduction process by proposing an invertible convolutional network. The i-RevNet does not discard any information about the input while classifying images. It has a built-in pseudo-inverse, allowing for easy inversion. It basically uses linear and invertible operators for performing downsampling, instead of non-invertible variants like spatial pooling.

Paper summary

i-RevNet is an invertible deep network which builds upon the recently introduced RevNet, where the non-invertible components of the original RevNets are replaced by invertible ones. i-RevNets retain all information about the input signal in any of their intermediate representations up until the last layer. They achieve the same performance on ImageNet compared to similar non-invertible RevNet and ResNet architectures.

The above image describes the blocks of an i-RevNet. The strategy implemented by an i-RevNet consists of an alternation between additions and nonlinear operators, while progressively down-sampling the signal. The pair of outputs from the final layer is concatenated through a merging operator. Using this architecture, the authors avoid the non-invertible modules of a RevNet (e.g. max-pooling or strides), which are necessary to train them in a reasonable time and are designed to build invariance w.r.t. translation variability. Their method replaces the non-invertible modules with linear and invertible modules Sj, which can reduce the spatial resolution while maintaining the layer’s size by increasing the number of channels.

Key Takeaways

- This work provides solid empirical evidence that learning invertible representations does not discard any information about their input on large-scale supervised problems.
- i-RevNet, the invertible network proposed, is a class of CNN which is fully invertible and permits exact recovery of the input from its last convolutional layer.
- i-RevNets achieve the same classification accuracy in the classification of complex datasets, as illustrated on ILSVRC-2012, when compared to RevNet and ResNet architectures with a similar number of layers.
- The inverse network is obtained for free when training an i-RevNet, requiring only minimal adaptation to recover inputs from the hidden representations.

Reviewer feedback summary

Overall Score: 25/30
Average Score: 8.3

Reviewers agreed the paper is a strong contribution, despite some comments about the significance of the result; i.e., why is invertibility a "surprising" property for learnability, in the sense that F(x) = {x, phi(x)}, where phi is a standard CNN, satisfies both properties: it is invertible, and linear measurements of F produce good classification. Having said that, the reviewers agreed that the paper is well written and easy to follow, and considered it to be a great contribution to the ICLR conference.
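To make the invertibility idea concrete, here is a tiny NumPy sketch of an additive coupling block in the spirit of RevNet/i-RevNet. It illustrates the general principle only - it is not the authors' exact architecture, and the residual function F below is an arbitrary placeholder rather than the convolutional bottleneck used in the paper.

    import numpy as np

    def F(x):
        # Placeholder residual function; in the real networks this is a small
        # convolutional block. Any function works - it never needs an inverse.
        return 0.5 * np.tanh(x)

    def forward(x1, x2):
        # Additive coupling: each output half is a simple function of the inputs...
        y1 = x2
        y2 = x1 + F(x2)
        return y1, y2

    def inverse(y1, y2):
        # ...so the inputs can be recovered exactly from the outputs.
        x2 = y1
        x1 = y2 - F(y1)
        return x1, x2

    x1, x2 = np.random.randn(4), np.random.randn(4)
    y1, y2 = forward(x1, x2)
    r1, r2 = inverse(y1, y2)
    print(np.allclose(x1, r1), np.allclose(x2, r2))  # True True

Stacking many such blocks, together with invertible downsampling, gives a deep network whose input can be reconstructed from any intermediate representation - the property the paper studies.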

Unity Machine Learning Agents: Transforming Games with Artificial Intelligence

Amey Varangaonkar
30 Mar 2018
4 min read
Unity has undoubtedly been one of the leaders when it comes to developing cross-platform products - going from strength to strength in developing visually stimulating 2D as well as 3D games and simulations. With Artificial Intelligence revolutionizing the way games are being developed these days, Unity have identified the power of Machine Learning and introduced Unity Machine Learning Agents. With this, they plan on empowering game developers and researchers in their quest to develop intelligent games, robotics and simulations.

What are Unity Machine Learning Agents?

Traditionally, game developers have hard-coded the behaviour of game agents. Although effective, this is a tedious task and it also limits the intelligence of the agents. Simply put, the agents are not smart enough. To overcome this obstacle, Unity have simplified the training process for game developers and researchers by introducing Unity Machine Learning Agents (ML-Agents, in short). Through just a simple Python API, the game agents can now be trained using deep reinforcement learning, an advanced form of machine learning, to learn from their actions and modify their behaviour accordingly. These agents can then be used to dynamically modify the difficulty of the game.

How do they work?

As mentioned earlier, the Unity ML-Agents are designed to work based on the concept of deep reinforcement learning, a branch of machine learning where the agents are trained to learn from their own actions. Here is a simple flowchart to demonstrate how reinforcement learning works:

The reinforcement learning training cycle

The learning environment to be configured for the ML-Agents consists of 3 primary objects:

- Agent: Every agent has a unique set of states, observations and actions within the environment, and is assigned rewards for particular events.
- Brain: A brain decides what action any agent is supposed to take in a particular scenario. Think of it as a regular human brain, which basically controls the bodily functions.
- Academy: This object contains all the brains within the environment.

To train the agents, a variety of scenarios are made possible by varying the connections between the different components (explained above) of the environment. Some involve single agents, some simultaneous single agents, and others co-operative and competitive multi-agents, and more. You can read more about these possibilities on the official Unity blog.

Apart from the way these agents are trained, Unity are also adding some cool new features to these ML-Agents. Some of these are:

- Monitoring the agents’ decision-making to make it more accurate
- Incorporating curriculum learning, by which the complexity of the tasks can eventually be increased to aid more effective learning
- Imitation learning, a newly-introduced feature wherein the agents simply mimic the actions we want them to perform, rather than learning on their own

What next for Unity Machine Learning Agents?

Unity recently announced the release of the v0.3 beta SDK of the ML-Agents, and have been making significant progress in this domain to develop smarter, more intelligent game agents which can be used with the Unity game engine. Still very much in the research phase, these agents can also be used as an example by academic researchers to study the complex behaviour of trained models in different environments and scenarios where the variables associated with the in-game physics and visual appearance can be altered.
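The observe-act-reward cycle in the flowchart above can be summarised in a few lines of Python. This is a self-contained toy sketch of the general reinforcement learning loop, not the actual Unity ML-Agents API; the environment and "brain" classes here are made up purely for illustration.

    import random

    class ToyEnvironment:
        # A stand-in for a game environment: the agent starts at 0 and is
        # rewarded when it reaches position 5.
        def reset(self):
            self.position = 0
            return self.position

        def step(self, action):
            # Apply the action, then return the new observation, a reward,
            # and whether the episode is finished.
            self.position += action
            done = self.position >= 5
            reward = 1.0 if done else -0.01
            return self.position, reward, done

    class RandomBrain:
        # A stand-in "brain": it picks moves at random. A real brain would
        # adjust its behaviour based on the rewards it receives.
        def decide(self, observation):
            return random.choice([-1, 1])

    env, brain = ToyEnvironment(), RandomBrain()
    for episode in range(3):
        observation, done, total_reward, steps = env.reset(), False, 0.0, 0
        while not done and steps < 200:
            action = brain.decide(observation)             # brain chooses an action
            observation, reward, done = env.step(action)   # environment responds
            total_reward += reward                         # rewards would drive learning
            steps += 1
        print("episode", episode, "reward", round(total_reward, 2))

In ML-Agents, the environment is the Unity scene itself and the brain is trained with deep reinforcement learning, but the underlying cycle of observation, action and reward is the same.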
Going forward, these agents can also be used by enterprises for large scale simulations, in robotics and also in the development of autonomous vehicles. These are interesting times for game developers, and Unity in particular, in their quest for developing smarter, cutting-edge games. Inclusion of machine learning in their game development strategy is a terrific move, although it will take some time for this to be perfected and incorporated seamlessly. Nonetheless, all the research and innovation being put into this direction certainly seems well worth it!