
How-To Tutorials - Data

1210 Articles
Sugandha Lahoti
10 Dec 2019
6 min read

Brad Miro talks TensorFlow 2.0 features and how Google is using it internally

TensorFlow 2.0, released in October, has developers excited about its many new features and its ease of use. At the EuroPython Conference 2019, Brad Miro, developer programs engineer at Google, talked about the updates made in TensorFlow 2.0. He also gave an overview of how Google is using TensorFlow, then moved on to why Python is important for TensorFlow development and how to migrate from TF 1.x to TF 2.0. EuroPython is one of the most popular Python community conferences. Below are some highlights from Brad’s talk at EuroPython.

What is TensorFlow?

TensorFlow is an open-source deep learning library developed at Google and first released in 2015. It’s a Python framework that includes a number of utilities for writing deep neural networks, with support for both GPUs and TPUs. A lot of deep learning involves mathematics, statistics, and algebra, as well as low-level optimizations for your system. TensorFlow abstracts much of that away, leaving you to focus on actually writing your model.

How TensorFlow is used internally at Google

TensorFlow is used internally at Google to power all of its machine learning and AI. Google’s data centers use AI and TensorFlow to optimize their usage: reducing bandwidth, keeping network connections optimized, and reducing power consumption. TensorFlow is also useful for performing global localization in Google Maps, and it is used heavily in the Google Pixel range of smartphones to help optimize the software. These technologies are also used in medical research, specifically in the field of computer vision. For example, TensorFlow is used to distinguish the retinal image of a healthy eye from that of an eye with diabetic retinopathy.
Furthermore, Google is using AI and TensorFlow to predict whether or not objects in space are planets. In short, the models predict whether fluctuations in the brightness of an object are due to it being a planet.

Further Learning

If you want to learn to build more computer vision applications with TensorFlow 2.0, check out the book Hands-On Computer Vision with TensorFlow 2 by Benjamin Planche and Eliot Andres. This book by Packt Publishing is a practical guide to building high-performance systems for object detection, segmentation, video processing, smartphone applications, and more. By the end of the book, you will have both the theoretical understanding and practical skills to solve advanced computer vision problems with TensorFlow 2.0.

Why Python is so important for TensorFlow

Python has always been the language of choice for TensorFlow because it is extremely easy to use and has a rich ecosystem for data science, including tools such as NumPy, scikit-learn, and pandas. When TensorFlow was being built, the idea was that it should have the simplicity of NumPy and the performance of C, with the ease of use of Python.

What does TensorFlow 2.0 bring to the table

TensorFlow 2.0 is powerful, flexible, scalable and easily deployable.

What’s gone (removed, or no longer needed in everyday code)

  • Session.run
  • tf.control_dependencies
  • tf.global_variables_initializer
  • tf.cond, tf.while_loop
  • tf.contrib

What’s new

  • Eager execution enabled by default
  • tf.function
  • Keras as the main high-level API
  • Distribution Strategy API
  • SavedModel API

TensorFlow 2.0 had a major API cleanup. Many API symbols were removed or renamed for better consistency and clarity. Session.run has been replaced with eager execution, which effectively means that your TensorFlow code runs like NumPy code. Eager execution enables fast iteration and intuitive debugging without building a graph. It also makes creating and experimenting with models in TensorFlow easier, and it can be especially useful when using the tf.keras model subclassing API.
TensorFlow 2.0 has tf.function, a Python decorator that lets you write regular Python code, which AutoGraph then compiles into TensorFlow graph code. The Distribution Strategy API in TensorFlow 2.0 allows machine learning researchers to distribute training across a wide variety of compute configurations. This release also allows distributed training with Keras’ model.fit and with custom training loops.

Keras is introduced as the main high-level API. Keras is a popular high-level API used for easy and fast prototyping, building, and training of deep learning models, and the integration lets developers easily use its various model-building APIs. There are two main ways of using Keras with TensorFlow.

Symbolic (Keras sequential)

  • Your model is a graph of layers
  • Any graph you compile will run
  • TensorFlow helps you debug by catching errors at compile time

Imperative (Keras subclassing)

  • Your model is Python bytecode
  • Complete flexibility and control
  • Harder to debug and harder to maintain

Each method has pros and cons; which one to use depends on your specific use case.

The SavedModel API allows you to save your trained ML model in a language-neutral format. With TensorFlow 2.0, all TensorFlow ecosystem projects, including TensorFlow Lite, TensorFlow.js, TensorFlow Serving, and TensorFlow Hub, support SavedModels. On TensorFlow Hub, you can store and download pre-built models. TensorFlow Extended is a Python library that can be run on your servers to productionize your models. TensorFlow Lite lets you run your TensorFlow models on edge devices. With TensorFlow.js, you can run machine learning models using JavaScript in the browser, or on servers using Node. TensorFlow also has Swift for TensorFlow to help developers use Swift to develop machine learning models.
“Swift for TensorFlow provides a new programming model that combines the performance of graphs with the flexibility and expressivity of Eager execution, with a strong focus on improved usability at every level of the stack. This is not just a TensorFlow API wrapper written in Swift; we added compiler and language enhancements to Swift to provide a first-class user experience for machine learning developers.”

Other packages in the TensorFlow ecosystem that serve niche use cases include TF Probability, TF Agents (reinforcement learning), Tensor2Tensor, TF Ranking, TF Text (natural language processing), TF Federated, TF Privacy and more.

How to upgrade from TensorFlow 1.x to TensorFlow 2.0

There are several migration guides available on TensorFlow’s website. You can also use the tf.compat.v1 module for backwards compatibility, and the tf_upgrade_v2 script, which you can run on any Python script to convert TF 1.x code to 2.0 code. You can also read more about TF 2.0 migration in our book Hands-On Computer Vision with TensorFlow 2, which introduces the automatic migration tool and compares TensorFlow 1 concepts with their TensorFlow 2 counterparts, with a detailed guide on migrating to idiomatic TensorFlow 2 code.

You can watch Brad’s full talk on YouTube. This video is licensed under the CC BY-NC-SA 3.0 license.

Read also:

  • TensorFlow.js contributor Kai Sasaki on how TensorFlow.js eases web-based machine learning application development
  • Introducing Spleeter, a Tensorflow based python library that extracts voice and sound from any music track
  • TensorFlow 2.0 released with tighter Keras integration, eager execution enabled by default, and more!

Sugandha Lahoti
10 Dec 2019
6 min read

How Microsoft, Airbnb, Genentech, and Toyota are using PyTorch to build and deploy production-ready AI

Built by Facebook engineers and researchers, PyTorch is an open-source, Python-based deep learning framework for developing new machine learning models, exploring neural network architectures, and deploying them at scale in production. PyTorch is known for its advanced indexing and functions, imperative style, integration support, and API simplicity. These are among the key reasons why developers prefer this framework for research and hackability. PyTorch is also the second-fastest-growing open source project on GitHub, with a community that ranges from developers just getting acquainted with AI to some of the best-known AI researchers and some of the best-known companies doing AI.

At its annual F8 developer conference, Facebook shared how production-ready PyTorch 1.0 is being adopted by the community and the industry. If you want to learn how to use this framework to build projects in machine intelligence and deep learning, you may go through our book PyTorch Deep Learning Hands-On by Sherin Thomas and Sudhanshu Passi. This book works through numerous examples and dynamic AI applications and demonstrates the simplicity and efficiency of PyTorch.

A number of companies are using PyTorch for research and for production. At the F8 developer conference this year, Jerome Pesenti, Vice President of AI at Facebook, introduced representatives from Microsoft, Airbnb, Genentech, and Toyota Research Institute, who talked about how the framework is helping them build, train, and deploy production-ready AI. Below are some excerpts from their talks.

Read also: How PyTorch is bridging the gap between research and production at Facebook: PyTorch team at F8 conference

How Microsoft uses PyTorch for its language modeling service

David Aronchick, Head of Open Source Machine Learning Strategy at Microsoft Azure

At Microsoft, PyTorch is being used in the company’s language modeling service.
The language modeling service uses state-of-the-art language models for both 1P (first-party) and 3P (third-party) use. Microsoft explored a number of deep learning frameworks but kept running into several issues. These included a slow transition from research to production, inconsistent and frequently changing APIs, and a trade-off between high-level ease of use and low-level flexibility.

To overcome these issues, Microsoft, in partnership with Facebook, built an internal language modeling toolkit on top of PyTorch. Using PyTorch’s native extensibility, Microsoft was able to build advanced and custom tasks and architectures. PyTorch also eased the onboarding of new users and came with an active and inviting community. As a result of this work, Microsoft was able to scale the language modeling features to billions of words. It also led to intuitive, static, and consistent APIs, which resulted in a seamless migration from language modeling toolkit v0.4 to 1.0. They also saw improvements in model sizes.

Microsoft has partnered with ICS.ai to deliver conversational AI bots across the public sector in the UK. ICS.ai, based in Basingstoke, has trained its Microsoft AI-driven chat bots to scale to the demands of large county councils, healthcare trusts and universities.

How Airbnb is using conversational AI tools in PyTorch to enhance customer experience

Cindy Chen, Senior Machine Learning Data Scientist at Airbnb

Airbnb has built a dialog assistant that integrates smart replies to enhance its customer experience. The core of this dialog assistant for customer service is powered by PyTorch. The team built the smart replies recommendation model by treating it as a machine translation problem: a sequence-to-sequence model translates the customer’s input message into agent responses. They use the PyTorch-based Open Neural Machine Translation (OpenNMT) library to build the sequence-to-sequence model.
Using PyTorch has significantly sped up Airbnb’s model development cycle, as PyTorch provides them with state-of-the-art technologies such as various attention mechanisms and beam search.

How Genentech uses PyTorch in drug discovery and cancer therapy

Daniel Bozinov, Head of AI - Early clinical development informatics, Genentech

At Genentech, PyTorch is being used to develop personalized cancer medicine as well as for drug discovery and cancer therapy. For drug development, Genentech has built domain-specific deep learning models that predict properties of molecules, such as toxicity. They are also applying AI to come up with new cancer therapies. They identify unique molecules that are produced only by cancer cells, potentially sensitizing the immune system to attack those cells and treat them, in effect, like an infection.

PyTorch has been their deep learning framework of choice because of features such as easier debugging, more flexible control structures, being natively Pythonic, and its dynamic graphs, which yield faster execution. Their model architecture is inspired by textual entailment in natural language processing. They use a partially recurrent neural network as well as a straightforward feed-forward network, combine the outputs of these two networks, and predict the peptide binding.

Toyota Research Institute adds new driver support features in cars

Adrien Gaidon, Machine Learning Lead, Toyota Research Institute

Toyota developed a cutting-edge cloud platform for distributed deep learning on high-resolution sensory inputs, especially video. It was designed to add new driver support features to cars. PyTorch was instrumental in scaling up Toyota’s deep learning system because of its simple API, its integration with the wider Python ecosystem, and an overall great user experience for fast exploration. It is also fast for training at very large scale.
In addition to amping up TRI’s creativity and expertise, PyTorch has also amplified Toyota’s ability to iterate quickly from idea to real-world use cases. The team at TRI is excited about new PyTorch production features that will help them accelerate Toyota even further.

In this post, we have only summarized the talks. At F8, these researchers spoke at length about each of their company’s projects and how PyTorch has been instrumental in their growth. You can watch the full video on YouTube. If you are inspired to build your own PyTorch-based deep learning and machine learning models, we recommend our book PyTorch Deep Learning Hands-On.

Read also:

  • Facebook releases PyTorch 1.3 with named tensors, PyTorch Mobile, 8-bit model quantization, and more
  • François Chollet, creator of Keras on TensorFlow 2.0 and Keras integration, tricky design decisions in Deep Learning, and more
  • PyTorch announces the availability of PyTorch Hub for improving machine learning research reproducibility

Sugandha Lahoti
10 Dec 2019
6 min read

François Chollet, creator of Keras on TensorFlow 2.0 and Keras integration, tricky design decisions in Deep Learning, and more

TensorFlow 2.0 was made available in October. One of the major highlights of this release was the integration of Keras into TensorFlow. Keras is an open-source deep learning library designed to enable fast, user-friendly experimentation with deep neural networks. It serves as an interface to several deep learning libraries, the most popular of which is TensorFlow, and it was integrated into the main TensorFlow codebase in TensorFlow 2.0.

In September, Lex Fridman, a research scientist at MIT popularly known for his podcasts, spoke to François Chollet, the author of Keras, about Keras, deep learning, and the progress of AI. In this post, we highlight François’ views on the Keras and TensorFlow 2.0 integration, the early days of Keras, and the importance of design decisions in building deep learning models. We recommend the full podcast, which is available on Fridman’s YouTube channel.

Want to build Neural Networks?

If you want to build multiple neural network architectures such as CNN, RNN, LSTM in Keras, we recommend you read Neural Networks with Keras Cookbook by V Kishore Ayyadevara. This book features over 70 recipes covering object detection and classification, building self-driving car applications, understanding data encoding for image, text and recommender systems, and more.

Early days of Keras and how it was integrated into TensorFlow

“I started working on Keras in 2015,” says Chollet. At that time Caffe, a C++-based deep learning library, was the popular choice, particularly for building computer vision projects. Chollet was interested in Recurrent Neural Networks (RNNs), which were a niche topic at the time. Back then, there was no good solution or reusable open-source implementation of RNNs and LSTMs, so he decided to build his own, and that’s how Keras started. “It was going to be mostly around RNNs and LSTMs and the models would be defined by Python code, which was going against mainstream,” he adds.
Later, he joined Google’s research team working on image classification. There he was exposed to an early internal version of TensorFlow, which was an improved version of Theano. When TensorFlow was released in 2015, he refactored Keras to run on TensorFlow. In essence, he abstracted all the backend functionality into one module so that the same codebase could run on top of multiple backends. A year later, the TensorFlow team asked him to integrate the Keras API into TensorFlow more tightly. They built a temporary TensorFlow-only version of Keras that lived in tf.contrib for a while. Keras finally moved into TensorFlow core in 2017.

TensorFlow 2.0 gives both usability and flexibility to Keras

Keras has been a very easy-to-use high-level interface for deep learning. However, it lacked flexibility: the Keras framework was not the optimal way to do things compared to writing everything from scratch. TensorFlow 2.0 offers both usability and flexibility to Keras. You have the usability of the high-level interface along with the flexibility of the lower-level interface. You get a spectrum of workflows offering more or less usability and flexibility, with the trade-offs depending on your needs. It is very flexible, easy to debug, and powerful, yet it also integrates seamlessly with higher-level features, up to classic Keras workflows. “You have the same framework offering the same set of APIs that enable a spectrum of workflows that are more or less high level and are suitable for, you know, profiles ranging from researchers to data scientists and everything in between,” says Chollet.

Design decisions are especially important while integrating Keras with TensorFlow

“Making design decisions is as important as writing code”, claims Chollet.
A lot of thought and care goes into these decisions, taking into account the diverse user base of TensorFlow: small-scale production users, large-scale production users, startups, and researchers. Chollet says, “A lot of the time I spend at Google is actually discussing design. This includes writing design docs, participating in design review meetings, etc.” Making a design decision is about satisfying a set of constraints, but also about doing so in the simplest way possible, because simplicity is what can be maintained and expanded in the future. You want to design APIs that are modular and hierarchical so that their API surface is as small as possible. You want this modular, hierarchical architecture to reflect the way domain experts think about the problem.

On the future of Keras and TensorFlow. What’s going to happen in TensorFlow 3.0?

Chollet says that he’s really excited about developing even higher-level APIs with Keras. He’s also excited about hyperparameter tuning through automated machine learning. He adds, “The future is not just, you know, defining a model, it's more like an automatic model.”

Limits of deep learning wrt function approximators that try to generalize from data

Chollet emphasizes that “Neural Networks don't generalize well, humans do.” Deep learning models are huge parametric, differentiable models that go from an input space to an output space, trained with gradient descent. They learn a continuous geometric morphing from an input vector space to an output space. Because this is done point by point, a deep neural network can only make sense of points in space that are very close to things it has already seen in training data. At best, it can interpolate across points.
However, that means that to train your network you need a dense sampling of the input, almost a point-by-point sampling, which can be very expensive if you’re dealing with complex real-world problems like autonomous driving or robotics.

In contrast, you can look at very simple rule-based algorithms. A symbolic rule can apply to a very large set of inputs because it is abstract; it is not obtained by doing a point-by-point mapping. Deep learning is really like point-by-point geometric morphing, while abstract rules can generalize much better. Chollet thinks the future lies in systems that can combine the two.

Chollet also talks about self-improving Artificial General Intelligence, concerns about short-term and long-term threats in AI, program synthesis, what makes a good test for intelligence, and more. The full podcast is available on Lex’s YouTube channel. If you want to implement neural network architectures in Keras for varied real-world applications, you may go through our book Neural Networks with Keras Cookbook.

Read also:

  • TensorFlow.js contributor Kai Sasaki on how TensorFlow.js eases web-based machine learning application development
  • 10 key announcements from Microsoft Ignite 2019 you should know about
  • What does a data science team look like?

Guest Contributor
10 Dec 2019
9 min read

How to perform exception handling in Python with ‘try, catch and finally’

Handling exceptions is an integral part of writing Python. There are primarily two types of exceptions: built-in exceptions and user-defined exceptions. When an error occurs, the state of execution at that moment is saved, the normal program flow is interrupted, and a special function or piece of code called an exception handler is executed. There are many types of errors, such as ‘division by zero’ or ‘file open error’, where an error handler needs to fix the issue. This allows the program to continue based on the state that was saved.

Exception handling in Python works much as it does in Java. Code that might raise an exception is placed in a try block. Where Java uses catch clauses to catch exceptions, Python uses the same sort of clause, beginning with the keyword except. Python also lets you raise exceptions yourself, including custom ones, with the raise statement, which forces a specified exception to occur.

Reason to use exceptions

Errors are always to be expected when a Python program runs, so the program needs a backup mechanism. That mechanism handles any errors that are encountered; without it, an error may crash the program completely. Equipping a Python program with exception handling defines a backup plan for any error situation that arises during execution.

Catch exceptions in Python

The try statement is used for handling exceptions in Python. A critical operation that can raise an exception is placed inside the try clause, and the code that handles the exception is written in the except clause. What to do once the exception has been caught is up to the programmer. Consider a program that loops until the user enters an integer that has a valid reciprocal. The part of the code that can trigger an exception is contained inside the try block.
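The reciprocal program described above can be sketched as follows. This is a minimal illustration rather than the article's original listing: the function name `first_valid_reciprocal` and the sample entries are assumptions, and the list of entries stands in for values a user would type, so that the example runs without interactive input.

```python
def first_valid_reciprocal(entries):
    """Return (entry, reciprocal) for the first entry that is a non-zero integer.

    `entries` is a hypothetical stand-in for successive user inputs.
    """
    for entry in entries:
        try:
            # Both statements can fail: int() raises ValueError for text such
            # as 'a' or '1.3', and the division raises ZeroDivisionError for 0.
            value = int(entry)
            reciprocal = 1 / value
        except ValueError:
            print(f"{entry!r} is not an integer, trying the next entry.")
        except ZeroDivisionError:
            print(f"{entry!r} has no reciprocal, trying the next entry.")
        else:
            # Reached only when the try block raised nothing.
            return entry, reciprocal
    raise LookupError("no entry had a valid reciprocal")


entry, reciprocal = first_valid_reciprocal(["a", "1.3", "0", "2"])
print(f"The reciprocal of {entry} is {reciprocal}")
```

Each failing entry is reported by its own except clause and the loop moves on; only an entry that passes both steps reaches the else branch and ends the search.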
If no exception occurs, the except block is skipped and normal execution continues. If an exception is raised, it is caught by the except block. In the example described above, the exception can be identified by name using the exc_info() function from the sys module, after which the program asks the user to make another attempt. Any unexpected value such as 'a' or '1.3' triggers a ValueError, and an input of 0 leads to a ZeroDivisionError.

Exception handling in Python: try, except and finally

Suspicious code that may raise exceptions is placed inside a try block. The code dedicated to handling those exceptions is placed within the except block. Below is the general shape of the try and except statement in Python.

    try:
        ...  # operational/suspicious code
    except SomeException:
        ...  # code to handle the exception

How do they work in Python: The try block is executed first, to check whether any exception occurs within its code. If no exception occurs, the except block (containing the exception-handling statements) is skipped once the try block has finished. If an exception occurs and matches the name given in the except clause ('SomeException' here), the except block handles it and the program continues. If no handler matches the exception, program execution halts with an error message describing it.

Defining Except without the exception

Defining an except clause without naming an exception is rarely a good option, whichever programming language is used. Such a try-except statement handles every possible type of exception, so it keeps users ignorant about whether an exception was even raised in the first place.
Python does allow the except statement to be used without naming an exception, as in this outline:

    try:
        ...  # do your operations here
    except:
        ...  # if any exception is raised, execute this block
    else:
        ...  # if there is no exception, execute this block

Here is an example where the intent is to catch an exception raised while opening a file. This is useful when the program wants to read a file that may not exist.

    try:
        fp = open('example.txt', 'r')
    except FileNotFoundError:
        print('File is not found')
    else:
        # the file was opened successfully, so close it again
        fp.close()

This example tries to open 'example.txt'. When the file is not found or does not exist, the code executes the except block and prints the message 'File is not found'.

Defining except clause for multiple exceptions

It is possible to deal with multiple exceptions in a single try statement by specifying several exception handlers. Naming the particular exceptions you expect is also good programming practice. When the interpreter finds an except clause that matches the raised exception, the code written under that clause is executed. One way to handle several exceptions with the same code is to list them as a tuple in a single except clause.
The example below shows how to define such exceptions (Exception1 to ExceptionN are placeholder names):

    try:
        ...  # do something
    except (Exception1, Exception2, ..., ExceptionN):
        # handle multiple exceptions
        pass
    except:
        ...  # handle all other exceptions

The same except statement can be combined with an else clause. In schematic form, with square brackets marking optional parts:

    try:
        You do your operations here;
        ......................
    except (Exception1[, Exception2[, ...ExceptionN]]):
        If there is an exception from the given exception list,
        then execute this block.
        ......................
    else:
        If there is no exception then execute this block.

Exception handling in Python using the try-finally clause

Apart from combining try and except blocks, it is also a good idea to pair try with a finally block. The finally block holds all the statements that must be executed regardless of whether an exception is raised in the try block. One benefit of this approach is that it helps release external resources and clear cached memory. Here is the outline of the try...finally clause.

    try:
        ...  # perform operations
    finally:
        ...  # these statements are always executed

Defining exceptions in try... finally block

The example given below closes the file once all the operations are completed.

    # open before the try block so that fp is always bound in finally
    fp = open("example.txt", 'r')
    try:
        ...  # file operations
    finally:
        fp.close()

As this shows, the try statement comes with an optional finally clause. Its code is executed under all circumstances and is usually used to release external resources. Developers are often connected to a remote data centre over a network, or working with a file or a graphical user interface. In such situations the program has to clean up the resources it has used.
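The clean-up guarantee described above can be illustrated with a short, runnable sketch. The `Connection` class is a hypothetical stand-in for an external resource such as a network link or a file handle, and `fetch` is an illustrative helper; neither comes from the original article.

```python
class Connection:
    """Hypothetical external resource that must be released after use."""

    def __init__(self):
        self.is_open = True

    def read(self, fail=False):
        # Simulate a dropped link when asked to fail.
        if fail:
            raise ConnectionError("link dropped")
        return "payload"

    def close(self):
        self.is_open = False


def fetch(conn, fail=False):
    """Read from conn and always close it, whether or not reading fails."""
    try:
        return conn.read(fail=fail)
    finally:
        # Runs on the normal return path and while an exception propagates.
        conn.close()


ok = Connection()
print(fetch(ok), ok.is_open)

broken = Connection()
try:
    fetch(broken, fail=True)
except ConnectionError as exc:
    print("failed:", exc, broken.is_open)
```

In both calls the connection ends up closed. The finally block does not swallow the error, so the caller still sees the ConnectionError and can decide how to respond.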
Cleaning up after use is good practice even when the operation on the resource succeeds. Placing actions such as shutting down the GUI, closing a file, or disconnecting from a network in the finally block guarantees that they are executed. The finally block defines what must be executed regardless of raised exceptions. The file operations example below illustrates this:

    # open before the try block so that f is always bound in finally
    f = open("test.txt", encoding='utf-8')
    try:
        ...  # perform file operations
    finally:
        f.close()

Or, in simpler terms:

    try:
        You do your operations here;
        ......................
        Due to any exception, this may be skipped.
    finally:
        This would always be executed.
        ......................

Constructing such a block ensures the file is closed even if an exception has taken place. Note that an else clause cannot be used with a try...finally statement that has no except clause; else is only allowed after at least one except clause.

Understanding user-defined exceptions

Python users can create their own exceptions by deriving classes from the standard built-in exceptions. This is useful when specific information must be displayed to users once the exception is caught. In the example here, the class is derived from RuntimeError. The try block raises the user-defined exception and the except block catches it; the variable e refers to the instance of the class Networkerror. Below is the class definition:

    class Networkerror(RuntimeError):
        def __init__(self, arg):
            # let the base class store the message in self.args as a tuple
            super().__init__(arg)

Once the class is defined, the exception can be raised using the following syntax.
    try:
        raise Networkerror("Bad hostname")
    except Networkerror as e:
        print(e.args)

Key points to remember

  • An exception is an error that occurs while the program is executing; it signals an event that happens relatively rarely.
  • As mentioned in the examples above, the most common exceptions are ‘division by zero’, ‘attempt to access a non-existent file’ and ‘adding two incompatible types’.
  • Wrap code in a try statement whenever you are not sure whether it will raise an exception.
  • Add an else block to a try-except statement for code that should run only when no exception is raised in the try block.

Author bio

Shahid Mansuri is co-founder of Peerbits, a software development company founded in the USA in 2011 that provides Python development services. Under his leadership, Peerbits used Python on a project to embed reports and research on a platform that gave every user access to a freely available dashboard as well as to an exclusive one. His visionary leadership and flamboyant management style have yielded fruitful results for the company. He believes in sharing his strong knowledge base, with a particular focus on entrepreneurship and business.

Read also:

  • Introducing Spleeter, a Tensorflow based python library that extracts voice and sound from any music track
  • Fake Python libraries removed from PyPi when caught stealing SSH and GPG keys, reports ZDNet
  • There’s more to learning programming than just writing code
Vincy Davis
04 Dec 2019
7 min read

How PyTorch is bridging the gap between research and production at Facebook: PyTorch team at F8 conference

PyTorch, the machine learning library originally developed as a research framework by a Facebook intern in 2017, has grown into a popular deep learning framework. One of Facebook’s most loved products, PyTorch is free, open source, and used for applications like computer vision and natural language processing (NLP).

At the F8 conference held this year, the PyTorch team, consisting of Joe Spisak, the project manager for PyTorch at Facebook AI, and Dmytro Dzhulgakov, the tech lead at Facebook AI, gave a talk on how Facebook is developing and scaling AI experiences with PyTorch.

Spisak describes PyTorch as combining eager and graph-based execution in a define-by-run style: when a user executes Python code, the graph is generated on the fly. It is dynamic in nature, yet it also allows compilation to a static graph. Because the neural networks are dynamic, the user can change parameters very quickly, which comes in handy for applications such as control flow in NLP. Another important feature of PyTorch, according to Spisak, is the ability to run accurate distributed training on models with close to a billion parameters, including cutting-edge ones. It also has a simple, intuitive API, a quality that Spisak claims has endeared PyTorch to many developers.

Become a pro at Deep Learning with PyTorch! If you want to become an expert in building and training neural network models with high speed and flexibility in text, vision, and advanced analytics using PyTorch 1.x, read our book Deep Learning with PyTorch 1.x - Second Edition written by Sri. Yogesh K., Laura Mitchell, et al. It will give you an insight into solving real-world problems using CNNs, RNNs, and LSTMs, along with discovering state-of-the-art modern deep learning architectures, such as ResNet, DenseNet, and Inception.

How PyTorch is bridging the gap between research and production at Facebook

Dzhulgakov points out that general advances in AI are driven by innovative research in academia or industry, and explains why it is necessary to bridge the big lag between research and production. He says, “If you have a new idea and you want to take it all the way through to deployment, you usually need to go through multiple steps - figure out what the approach is, and then find the training data, maybe prepare it, massage it a little bit, actually build and train your model. After that, there is this painful step of transferring your model to a production environment, which often historically involved reimplementation of a lot of code, so you can actually take it, deploy it, and scale up.”

According to Dzhulgakov, PyTorch is trying to minimize this gap by encouraging advances and experimentation in the field, so that research is brought into production in a few days instead of months.

Challenges in bringing research to production

According to the PyTorch team, these are the main classes of challenges in bringing research to production:

Hardware efficiency: In a latency-constrained environment, the model has to fit within the performance budget of the available hardware. On the other hand, underused hardware drives up cost.
Scalability: In Facebook’s recent work, Dzhulgakov says, they have trained on billions of public images, showing significant accuracy gains compared with regular datasets like ImageNet. Similarly, at inference time, billions of inferences per second run with multiple diverse models sharing the same hardware.
Cross-platform: Neural networks are mostly not isolated, as they need to be deployed inside their target application. A model has a lot of interdependence with the surrounding code and application, which imposes constraints: for example, it may not be possible to run Python code, or compute capabilities may be very limited when running on a mobile device.
Reliability: A lot of PyTorch jobs run for multiple weeks on hundreds of GPUs, so it is important to design reliable software that can tolerate hardware failures and still deliver results.

How PyTorch is tackling these challenges

To tackle the challenges listed above, Dzhulgakov says Facebook develops systems that can take a training job and apply optimizations to its performance-critical pieces. The system also applies “recipes for reliability” so that the modeling code written by the developer is transformed automatically. This is where the JIT package comes in: it is built to capture the structure of a Python program with minimal changes, and its main goal is to make this process almost seamless.

He asserts that PyTorch has been successful because it feels like regular programming in Python, and most of its users start developing in traditional PyTorch mode (eager mode) simply by writing and prototyping their program. He adds that for the subset of promising models that show good results and need to be brought to production or scaled up, developers can apply the techniques provided by the JIT to their existing model code and annotate it so that it runs in so-called script mode.

Script mode is a subset of Python with a restricted set of semantics, which allows transparent transformations to be applied to code the user wrote in eager mode. The annotations amount to a few lines of Python code added on top of a function, and they can be applied incrementally, function by function or module by module. This hybrid approach ensures that the model keeps working along the way.

Such powerful PyTorch tools permit the user to share the same code base between research and production environments. Next, Dzhulgakov notes that the common factor between research and production is that both teams of developers work on the same code base built on top of PyTorch. They therefore share code among teams that have a common domain, such as text classification, object detection, or reinforcement learning. These developers prototype models, train new algorithms, and address new tasks, so that this functionality can be moved quickly from one environment to the other. Watch the full talk to see Dzhulgakov’s examples of PyTorch bridging the gap between research and production at Facebook.

If you want to become an expert at implementing deep learning applications in PyTorch, check out our latest book Deep Learning with PyTorch 1.x - Second Edition written by Sri. Yogesh K., Laura Mitchell, et al. This book will show you how to apply neural networks to domains such as computer vision and NLP. It will also guide you to build, train, and scale a model with PyTorch and cover complex neural networks such as GANs and autoencoders for producing text and images.

NVIDIA releases Kaolin, a PyTorch library to accelerate research in 3D computer vision and AI
Introducing ESPRESSO, an open-source, PyTorch based, end-to-end neural automatic speech recognition (ASR) toolkit for distributed training across GPUs
Facebook releases PyTorch 1.3 with named tensors, PyTorch Mobile, 8-bit model quantization, and more
Transformers 2.0: NLP library with deep interoperability between TensorFlow 2.0 and PyTorch, and 32+ pretrained models in 100+ languages
PyTorch announces the availability of PyTorch Hub for improving machine learning research reproducibility
Sugandha Lahoti
03 Dec 2019
6 min read

Amazon re:Invent 2019 Day One: AWS launches Braket, its new quantum service and releases SageMaker Operators for Kubernetes

On day one of the ongoing Amazon re:Invent 2019, there was a flurry of announcements for AWS. Most importantly, AWS announced the preview launch of Braket, its own quantum computing service, following the likes of IBM, Microsoft, and Google. Amazon also released Amazon SageMaker Operators for Kubernetes to help data scientists using Kubernetes train, tune, and deploy machine learning models in Amazon SageMaker. re:Invent is Amazon’s flagship conference, hosted by Amazon Web Services for the global cloud computing community. This year re:Invent is taking place in Las Vegas, December 2-6, 2019.

re:Invent 2019 Day One announcements

Braket: AWS’ new quantum service in preview now

Amazon Braket (named after the common notation for quantum states) is a fully managed service that helps you get started with quantum computing. Braket consists of a full development environment that helps data scientists to:
design quantum algorithms from scratch or choose from a set of pre-built algorithms,
test these algorithms on simulated quantum computers (including gate-based and quantum annealing superconductors, and ion trap hardware),
run them on your choice of different quantum hardware technologies (including D-Wave, IonQ, and Rigetti).

Once your tests are complete, you will be automatically notified and your results will be stored in Amazon S3. Amazon Braket publishes event logs and performance metrics, such as completion status and execution time, to Amazon CloudWatch. To make it easier to develop hybrid algorithms that combine classical and quantum tasks, Amazon Braket helps manage classical compute resources and establish low-latency connections to the quantum hardware. At re:Invent 2019, AWS also launched the Amazon Quantum Solutions Lab, a collaborative research program that connects you with quantum computing experts from Amazon and its technology and consulting partners.
They can help you identify potential uses of quantum computing, build internal expertise, and collaborate on programs to design and test quantum algorithms. Braket is available in preview now. Amazon SageMaker Operators for Kubernetes Now developers and data scientists can use Kubernetes to train, tune, and deploy machine learning models in Amazon SageMaker, with the new Amazon SageMaker Operators for Kubernetes. Customers can install these Amazon SageMaker Operators on their Kubernetes cluster to create Amazon SageMaker jobs natively using the Kubernetes API and command-line Kubernetes tools such as ‘kubectl’. Operators can be used to train machine learning models, optimize hyperparameters for a given model, run batch transform jobs over existing models, and set up inference endpoints. With these operators, users can manage their jobs in Amazon SageMaker from their Kubernetes cluster in Amazon Elastic Kubernetes Service EKS. Amazon SageMaker Operators for Kubernetes are available in select AWS regions. AWS DeepComposer, a creative way to learn Machine Learning Amazon has launched AWS DeepComposer, the world’s first machine learning-enabled musical keyboard at re:Invent 2019. AWS DeepComposer is an educational tool to teach people Machine Learning. AWS DeepComposer gives developers of all skill levels a creative way to experience machine learning – music. https://youtu.be/XH2EbK9dQlg You can input a melody by connecting the AWS DeepComposer keyboard to your computer, or play the virtual keyboard in the AWS DeepComposer console. You can generate an original music composition using the pre-trained genre models in the console. You can then publish your tracks to SoundCloud. It is designed specifically to educate developers by means of tutorials, sample code, and training data. These can be used to get started with building generative AI models, all without having to write a single line of code. 
With AWS DeepComposer, you can train and optimize GAN models to create original music. GAN models pit two different neural networks against each other to produce new and original digital works based on sample inputs. AWS DeepComposer is available in preview now. Amazon Transcribe now extended to healthcare patients Amazon’s automatic speech recognition service Amazon Transcribe is now available for medical speech as announced in re:Invent 2019. Amazon Transcribe Medical allows physicians to easily and quickly dictate their clinical notes and see their speech converted to accurate text in real-time, without any human intervention. Clinicians can use natural speech and do not have to explicitly call out punctuation like “comma” or “full stop”. This text can then be automatically fed to downstream applications such as EHR systems, or to AWS language services such as Amazon Comprehend Medical for entity extraction. To make it work, you need to capture audio using your device’s microphone and send PCM (Pulse-code modulation) audio to a streaming API based on the popular Websocket protocol. This API will respond with a series of JSON blobs with the transcribed text, as well as word-level time stamps, punctuation, etc. Optionally, you can save this data to an Amazon Simple Storage Service (S3) bucket. Amazon Transcribe Medical is available in US East (N. Virginia) and US West (Oregon) regions. Updates to Microsoft Windows Server AWS has released a bring-your-own-license (BYOL) experience for customers as an easier way to bring, and manage, their existing licenses for Microsoft Windows Server and SQL Server to AWS. The new BYOL experience enables customers who want to use their existing Windows Server or SQL Server licenses to seamlessly create virtual machines in EC2, while AWS takes care of managing their licenses to help ensure compliance to licensing rules specified by the customer. Amazon is also providing End-of-Support Migration Program (EMP) for Windows Server. 
On January 14, 2020, support for Windows Server 2008 and 2008 R2 will end. Having an application that can run only on an unsupported version of Windows Server is problematic as you will no longer get free security patch updates, leaving you vulnerable to security and compliance risks. This new program combines technology with expert guidance, to migrate your legacy applications running on outdated versions of Windows Server to newer, supported versions on AWS. Other updates announced at Amazon re:Invent 2019 Amazon EventBridge Schema Registry is now in preview.  The schema registry stores the structure (schema) of Amazon EventBridge events and maps them to Java, Python, and Typescript bindings so that you can use the events as typed objects. The existing AWS IoT SiteWise preview adds new features such as creating a virtual representation of your facility, monitor production performance metrics and use AWS IoT SiteWise Monitor to visualize the data in real-time. AWS IoT SiteWise Monitor is a new SaaS application that lets you monitor and interact with the data collected and organized by AWS IoT SiteWise. The upcoming AWS DeepRacer Evo car will include a stereo camera and a Light Detection and Ranging (LIDAR) sensor.  The DeepRacer League in 2020 will have 8 additional races in 5 countries. The preview of EC2 Image Builder, a service that makes it easier and faster to build and maintain secure OS images for Windows Server and Amazon Linux 2, using automated build pipelines. Amazon re:Invent will continue throughout this week (the last day is the 6th of December). You can access the Livestream here. Keep checking this space for news on other updates and launches. Amazon EKS Windows Container Support is now generally available Amazon’s hardware event 2019 highlights: a high-end Echo Studio, the new Echo Show 8, and more 10 key announcements from Microsoft Ignite 2019 you should know about
Sugandha Lahoti
28 Nov 2019
6 min read

TensorFlow.js contributor Kai Sasaki on how TensorFlow.js eases web-based machine learning application development

Running Machine Learning applications in the web browser is one of the hottest trends in software development right now. Many notable machine learning projects are being built with TensorFlow.js, one of the most popular frameworks for building performant machine learning applications that run smoothly in a web browser. Recently, we spoke with Kai Sasaki, who is one of the initial contributors to TensorFlow.js. He talked about current and future versions of TF.js, how it compares to other browser-based ML tools, and his contributions to the community. He also shared his views on why he thinks JavaScript is good for Machine Learning.

If you are a web developer with working knowledge of JavaScript who wants to learn how to integrate machine learning techniques with web-based applications, we recommend the book Hands-on Machine Learning with TensorFlow.js. This hands-on course covers important aspects of machine learning with TensorFlow.js using practical examples. Throughout the course, you'll learn how different algorithms work and follow step-by-step instructions to implement them through various examples.

On how TensorFlow.js has improved web-based machine learning

How do you think Machine Learning for the Web has evolved in the last 2-3 years? What are some current applications of web-based machine learning and TensorFlow.js? What can we expect in future releases?

Machine Learning on the web platform is a field attracting more developers and machine learning practitioners. There are two reasons. First, the web platform is universally available. The web browser mostly provides us a way to access the underlying resource transparently. The second reason is security. Training a model on the client-side means you can keep sensitive data inside the client environment, as the entire training process is completed on the client-side itself. The data is not sent to the cloud, making it more secure and less susceptible to vulnerabilities or hacking. In future releases as well, TensorFlow.js is expected to provide more secure and accessible functionalities. You can find various kinds of TensorFlow.js based applications here.

How does TensorFlow.js compare with other web and browser-based machine learning tools? Does it make web-based machine learning application development easier?

The most significant advantage of TensorFlow.js is its full compatibility with the TensorFlow ecosystem. Not only can a TensorFlow model be seamlessly used in TensorFlow.js, tools for visualization and model deployment in the TensorFlow ecosystem can also be used in TensorFlow.js.

TensorFlow 2 was released in October. What are some new changes made specific to TensorFlow.js as a part of TF 2.0 that machine learning developers will find useful? What are your first impressions of this new release?

Although there is nothing special related to TensorFlow 2.0, full support for new backends, such as WASM and WebGPU, is being actively developed. These hardware acceleration mechanisms provided by the web platform can enhance performance for any TensorFlow.js application. This surely strengthens the potential of TensorFlow.js and broadens its possible use cases.

On Kai’s experience working on his book, Hands-on Machine Learning with TensorFlow.js

Tell us the motivation behind writing your book Hands-on Machine Learning with TensorFlow.js. What are some of your favorite chapters/projects from the book?

TensorFlow.js does not have much history because only three years have passed since its initial publication. Due to the lack of resources for learning TensorFlow.js, I was motivated to write a book illustrating how to use TensorFlow.js practically. I think chapters 4 - 9 of my book Hands-On Machine Learning with TensorFlow.js give readers good material for practicing how to write an ML application with TensorFlow.js.

Why JavaScript for Machine Learning

Why do you think JavaScript is good for Machine Learning? What are some of the good machine learning packages available in JavaScript? How does it compare to other languages like Python, R, Matlab, especially in terms of performance?

JavaScript is a primary programming language of the web platform, so it can work as a bridge between the web and machine learning applications. We have several other libraries working similarly. For example, machinelearn.js is a general machine learning framework running with JavaScript. Although JavaScript is not a highly performant language, its universal availability on the web platform is attractive to developers, as they can build machine learning applications that are “write once, run anywhere”. We can compare the performance in practice by running state-of-the-art machine learning models such as MobileNet or ResNet.

On his contribution towards TF.js

You are a contributor for TensorFlow.js and were awarded by the Google Open Source Peer Bonus Program. What were your main contributions? How was your experience working for TF.js?

One of the significant contributions I have made was the fast Fourier transform operations. I created the initial implementation of fft, ifft, rfft and irfft. I also added stft (short-time Fourier transform). These operators are mainly used for performing signal analysis for audio applications. I have done several bug fixes and test enhancements in TensorFlow.js too.

What are the biggest challenges today in the field of Machine Learning and AI in web development? What do you see as some of the greatest technology disruptors in the next 5 years?

While many developers are writing Python in the machine learning field, not many web developers have familiarity with and knowledge of machine learning, in spite of the substantial advantage of integrating machine learning with the web platform. I believe machine learning technologies will be democratized among web developers so that a vast amount of creativity flourishes in the next five years. By cooperating with these enthusiastic developers in the community, I believe machine learning on the client-side or edge device will be one of the major contributions in the machine learning field.

About the author

Kai Sasaki works as a software engineer at Treasure Data, building large-scale distributed systems. He is one of the initial contributors to TensorFlow.js and contributes to developing operators for newer machine learning models. He also received the Google Open Source Peer Bonus in 2018. You can find him on Twitter, Linkedin, and GitHub.

About the book

Hands-On Machine Learning with TensorFlow.js is a comprehensive guide that will help you easily get started with machine learning algorithms and techniques using TensorFlow.js. Throughout the course, you'll learn how different algorithms work and follow step-by-step instructions to implement them through various examples. By the end of this book, you will be able to create and optimize your own web-based machine learning applications using practical examples.

Baidu adds Paddle Lite 2.0, new development kits, EasyDL Pro, and other upgrades to its PaddlePaddle platform.
Introducing Spleeter, a Tensorflow based python library that extracts voice and sound from any music track
TensorFlow 2.0 released with tighter Keras integration, eager execution enabled by default, and more!
Fatema Patrawala
28 Nov 2019
5 min read

Julia Computing research team runs machine learning model on encrypted data without decrypting it

Last week, the team at Julia Computing published research based on cutting-edge cryptographic techniques that make it practical to perform computation on data without ever decrypting it. For example, the user would send encrypted data (e.g. images) to the cloud API, which would run the machine learning model and then return the encrypted answer. The user data is never decrypted; in particular, the cloud provider neither has access to the original image nor is able to decrypt the prediction it computed. The team demonstrated this by building a machine learning service for handwriting recognition of encrypted images (from the MNIST dataset).

The ability to compute on encrypted data is generally referred to as “secure computation” and is a fairly large area of research, with many different cryptographic approaches and techniques for a plethora of application scenarios. For their research, the Julia team focused on a technique known as “homomorphic encryption”.

What is homomorphic encryption

Homomorphic encryption is a form of encryption that allows computation on ciphertexts, generating an encrypted result which, when decrypted, matches the result of the operations as if they had been performed on the plaintext. This technique can be used for privacy-preserving outsourced storage and computation. It allows data to be encrypted and outsourced to commercial cloud environments for processing, all while encrypted. In highly regulated industries, such as health care, homomorphic encryption can be used to enable new services by removing privacy barriers inhibiting data sharing.
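The property described above can be tried out with a toy scheme. The sketch below uses Paillier encryption, which is not the scheme used in the research described here and supports only addition on ciphertexts. The tiny primes and the helper names keygen, encrypt, decrypt, and add_encrypted are illustrative assumptions, and the code is in no way secure.

```python
import math
import secrets


def keygen(p=293, q=433):
    """Toy Paillier key generation (assumed tiny primes, insecure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    g = n + 1                 # standard simple choice of generator
    mu = pow(phi, -1, n)      # modular inverse, needs Python 3.8+
    return (n, g), (phi, mu, n)


def encrypt(pub_key, m):
    """Encrypt an integer 0 <= m < n with fresh randomness."""
    n, g = pub_key
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(priv_key, c):
    """Recover the plaintext from a ciphertext."""
    phi, mu, n = priv_key
    l = (pow(c, phi, n * n) - 1) // n
    return (l * mu) % n


def add_encrypted(pub_key, c1, c2):
    """Multiplying ciphertexts adds the hidden plaintexts (mod n)."""
    n, _ = pub_key
    return (c1 * c2) % (n * n)


pub, priv = keygen()
c1, c2 = encrypt(pub, 17), encrypt(pub, 25)
c_sum = add_encrypted(pub, c1, c2)   # computed without the private key
print(decrypt(priv, c_sum))          # same result as 17 + 25 on plaintext
```

The party calling add_encrypted only ever sees the public key and ciphertexts, which mirrors the role of the cloud provider in the scenario above.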
In this research, the Julia Computing team used a homomorphic encryption system which involves the following operations:

pub_key, eval_key, priv_key = keygen()
encrypted = encrypt(pub_key, plaintext)
decrypted = decrypt(priv_key, encrypted)
encrypted′ = eval(eval_key, f, encrypted)

The first three are fairly straightforward and will be familiar to anyone who has used asymmetric cryptography before. The last one is the important one: it evaluates some function f on the encrypted value and returns another encrypted value corresponding to the result of evaluating f on the underlying plaintext. It is this property that gives homomorphic computation its name.

The Julia Computing team then discusses CKKS (Cheon-Kim-Kim-Song), a homomorphic encryption scheme that allows homomorphic evaluation of the following primitive operations:

Element-wise addition of length n vectors of complex numbers
Element-wise multiplication of length n complex vectors
Rotation (in the circshift sense) of elements in the vector
Complex conjugation of vector elements

They also mention that computations using CKKS are noisy, and so they tried out these operations in Julia.

Which convolutional neural network did the Julia Computing team use

As a starting point, the Julia Computing team used the convolutional neural network example given in the Flux model zoo. They kept the training loop and data preparation and tweaked the ML model slightly. It is essentially the same model as the one used in the paper “Secure Outsourced Matrix Computation and Application to Neural Networks”, which uses the same (CKKS) cryptographic scheme. That paper also encrypts the model, which the Julia team omitted for simplicity, and the team included bias vectors after every layer (which Flux does by default). This resulted in a higher test set accuracy for the model used by the Julia team (98.6% vs 98.1%). An unusual feature of this model is its x.^2 activation functions.
More common choices here would have been tanh or relu or something more advanced. While those functions (relu in particular) are cheap to evaluate on plaintext values, they would be quite expensive to evaluate on encrypted values. The team would also have ended up evaluating a polynomial approximation had they adopted these common choices. Fortunately, x.^2 worked fine for their purpose.

How was the homomorphic operation carried out

The team performed the homomorphic operations for convolutions and matrix multiply assuming a batch size of 64. They precomputed the extraction of every 7x7 convolution window from the original images, which gave them 64 7x7 matrices per input image. They then collected the same position in each window into one vector, giving a 64-element vector per image for each position (i.e. a total of 49 64x64 matrices for the batch), and encrypted these matrices. In this way the convolution became a scalar multiplication of each whole matrix with the appropriate mask element, and by summing all 49 results afterwards, the team got the result of the convolution.

The team then moved on to matrix multiply, which works by rotating elements in the vector to effect a re-ordering of the multiplication indices. They considered a row-major ordering of matrix elements in the vector. Shifting the vector by a multiple of the row size then has the effect of rotating the columns, which is a sufficient primitive for implementing matrix multiply. The team was able to put everything together, and it worked. You can take a look at the official blog post for the step-by-step implementation with code.

They also implemented the whole encryption process in Julia, as it allows powerful abstractions and let them encapsulate the whole convolution extraction process as a custom array type.
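The convolution restructuring described above can be checked on ordinary plaintext numbers. The following is a minimal sketch in plain Python with no encryption involved; the function names, the synthetic 28x28 image, the kernel values, and the stride of 3 (chosen so that a 7x7 window yields 64 windows) are all assumptions made for illustration.

```python
def direct_conv(image, kernel, stride):
    """Reference: slide the kernel over the image (no kernel flip, as usual in ML)."""
    k = len(kernel)
    size = (len(image) - k) // stride + 1
    out = []
    for r in range(size):
        for c in range(size):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[r * stride + i][c * stride + j]
            out.append(acc)
    return out


def position_vectors(image, k, stride):
    """For each kernel position (i, j), gather that position from every window."""
    size = (len(image) - k) // stride + 1
    return {
        (i, j): [image[r * stride + i][c * stride + j]
                 for r in range(size) for c in range(size)]
        for i in range(k) for j in range(k)
    }


def conv_by_scalar_sums(vectors, kernel):
    """Convolution as a sum of scalar-times-vector terms, one per kernel position."""
    n_windows = len(next(iter(vectors.values())))
    out = [0] * n_windows
    for (i, j), vec in vectors.items():
        weight = kernel[i][j]
        out = [o + weight * v for o, v in zip(out, vec)]
    return out


# Synthetic data standing in for one MNIST-sized image and one 7x7 filter.
image = [[(3 * r + 5 * c) % 11 for c in range(28)] for r in range(28)]
kernel = [[(i - j) % 3 - 1 for j in range(7)] for i in range(7)]

vectors = position_vectors(image, k=7, stride=3)
assert len(vectors) == 49                             # one vector per kernel position
assert all(len(v) == 64 for v in vectors.values())    # one entry per window
assert conv_by_scalar_sums(vectors, kernel) == direct_conv(image, kernel, stride=3)
```

In the encrypted setting each of the 49 vectors would be a ciphertext, so the whole convolution needs only scalar multiplication and element-wise addition, without ever indexing into encrypted data.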
The Julia Computing team states, “Achieving the dream of automatically executing arbitrary computations securely is a tall order for any system, but Julia’s metaprogramming capabilities and friendly syntax make it well suited as a development platform.” Julia co-creator, Jeff Bezanson, on what’s wrong with Julialang and how to tackle issues like modularity and extension Julia v1.3 released with new multithreading features, and much more! The Julia team shares its finalized release process with the community Julia announces the preview of multi-threaded task parallelism in alpha release v1.3.0 How to make machine learning based recommendations using Julia [Tutorial]
Sugandha Lahoti
26 Nov 2019
7 min read

10 key announcements from Microsoft Ignite 2019 you should know about

This year’s Microsoft Ignite was jam-packed with new releases and upgrades in Microsoft’s line of products and services. The company elaborated on its growing focus to address the needs of its customers to help them do their business in smarter, more productive and more efficient ways. Most of the products were AI-based and Microsoft was committed to security and privacy. Microsoft Ignite 2019 took place on November 4-8, 2019 in Orlando, Florida and was attended by 26,000 IT implementers and decision-makers, developers, data professionals and people from various industries. There were a total of 175 separate announcements made! We have tried to cover the top 10 here. Microsoft’s Visual Studio IDE is now available on the web The web-based version of Microsoft’s Visual Studio IDE is now available to all developers. Called the Visual Studio Online, this IDE will allow developers to configure a fully configured development environment for their repositories and use the web-based editor to work on their code. Visual Studio Online is deeply integrated with GitHub (also owned by Microsoft), although developers can also attach their own physical and virtual machines to their Visual Studio-based environments. Visual Studio Online’s cloud-hosted environments, as well as extended support for Visual Studio Code and the web UI, are now available in preview. Support for Visual Studio 2019 is in private preview, which you can also sign up for through the Visual Studio Online web portal. Project Cortex will classify all content in a single network Project Cortex is a new service in Microsoft 365 useful to maintain the everyday flow of work in enterprises. Project Cortex collates enterprises generated documents and data, which is often spread across numerous repositories. It uses AI and machine learning to automatically classify all your content into topics to form a knowledge network. 
Cortex improves individual productivity and organizational intelligence and can be used across Microsoft 365, including the Office apps, Outlook, and Microsoft Teams. Project Cortex is now in private preview and will be generally available in the first half of 2020.

Single-view device management with ‘Microsoft Endpoint Manager’

Microsoft has combined Configuration Manager (ConfigMgr) with Intune, its cloud-based endpoint management system, to form what it calls Endpoint Manager. ConfigMgr allows enterprises to manage the PCs, laptops, phones, and tablets they issue to their employees. Intune is used for cloud-based management of phones. Endpoint Manager will provide co-management options that let organizations provision, deploy, manage, and secure endpoints and applications. Touted by Satya Nadella as the most important release of the event, this solution will give enterprises a single view of their deployments. ConfigMgr users will also get an Intune license so that they can move to cloud-based management.

No-code bot builder ‘Microsoft Power Virtual Agents’ is available in public preview

Built on the Azure Bot Framework, Microsoft Power Virtual Agents is a low-code and no-code bot-building solution now available in public preview. It enables users with little to no developer experience to create and deploy intelligent virtual agents. The solution also includes Azure Machine Learning to help users create and improve conversational agents for personalized customer service. Power Virtual Agents will be generally available on December 1.

Microsoft’s Chromium-based version of Edge is now more privacy-focused

At Ignite, Microsoft announced the release candidate of its Chromium-based Edge browser, with general availability scheduled for January 15.
InPrivate search will be available in Microsoft Edge and Microsoft Bing to keep online searches and identities private, giving users more control over their data. When searching InPrivate, search history and personally identifiable data will be neither saved nor associated with you, so users’ identities and search histories stay private. There will also be a new security baseline for the all-new Microsoft Edge. Security baselines are pre-configured groups of security settings and default values recommended by the relevant security teams. The next version of Microsoft Edge will feature a new icon symbolizing the major changes in the browser, which is now built on the Chromium open source project. The icon will appear in an Easter egg hunt designed to reward the Insider community.

ML.NET 1.4 announces General Availability

ML.NET 1.4, Microsoft’s open-source machine learning framework, is now generally available. The latest release adds image classification training with the ML.NET API, as well as a relational database loader API for reading data used to train models with ML.NET. ML.NET also includes Model Builder (an easy-to-use UI tool in Visual Studio) and a command-line interface, both of which use AutoML to simplify building custom machine learning models. This release also adds a new preview of the Visual Studio Model Builder extension that supports image classification training from a graphical user interface. A preview of Jupyter support for writing C# and F# code for ML.NET scenarios is also available.

Azure Arc extends Azure services across multiple infrastructures

One of the most important announcements of Microsoft Ignite 2019 was Azure Arc. This new service enables Azure services anywhere and extends Azure management to any infrastructure, including those of competitors like AWS and Google Cloud.
With Azure Arc, customers can use Azure’s cloud management experience for their own servers (Linux and Windows Server) and Kubernetes clusters by extending Azure management across environments. Enterprises can also manage and govern resources at scale with powerful scripting, tools, Azure Portal and API, and Azure Lighthouse.

Announcing Azure Synapse Analytics

Azure Synapse Analytics builds upon Microsoft’s previous offering, Azure SQL Data Warehouse. This analytics service combines traditional data warehousing with big data analytics, offering serverless on-demand or provisioned resources at scale. Using Azure Synapse Analytics, customers can ingest, prepare, manage, and serve data for immediate BI and machine learning applications within the same service.

Safely share your big data with Azure Data Share, now generally available

As the name suggests, Azure Data Share allows you to safely share your big data with other organizations. Organizations can share data stored in their data lakes with third-party organizations outside their Azure tenancy. Data providers wanting to share data with their customers or partners can also easily create a new share, populate it with data residing in a variety of stores, and add recipients. It employs Azure security measures such as access controls, authentication, and encryption to protect your data. Azure Data Share supports sharing from SQL Data Warehouse and SQL DB, in addition to Blob and ADLS (for snapshot-based sharing). It also supports in-place sharing for Azure Data Explorer (in preview).

Azure Quantum to be made available in private preview

Microsoft has been working on quantum computing for some time now. At Ignite, Microsoft announced that it will be launching Azure Quantum in private preview in the coming months. Azure Quantum is a full-stack, open cloud ecosystem that will bring quantum computing to developers and organizations.
Azure Quantum will assemble quantum solutions, software, and hardware from across the industry in a single, familiar experience in Azure. Through Azure Quantum, you can learn quantum computing with a series of tools and tutorials, like the quantum katas. Developers can also write programs with Q# and the QDK (Quantum Development Kit).

Microsoft Ignite 2019 organizers have released an 88-page document detailing all 175 announcements, which you can access here. You can also view the conference keynote delivered by Satya Nadella on YouTube, as well as Microsoft Ignite’s official blog.

  • Facebook mandates Visual Studio Code as default development environment and partners with Microsoft for remote development extensions
  • Exploring .Net Core 3.0 components with Mark J. Price, a Microsoft specialist
  • Yubico reveals Biometric YubiKey at Microsoft Ignite
  • Microsoft announces .NET Jupyter Notebooks

Richard Gall
18 Nov 2019
7 min read

Why geospatial analysis and GIS matters more than ever today

Due to the hype around big data and artificial intelligence, it can be easy to miss some of the powerful but specific ways data can be truly impactful. One of the most important areas of modern data analysis that rarely gets its due is geospatial analysis. At a time when both the natural and human worlds are going through a period of seismic change, the ability to throw a spotlight on issues of climate and population change is as transformative as the smartest chatbot (indeed, probably much more transformative).

The foundation of geospatial analysis is GIS. GIS, in case you’re new to the field, is an acronym for Geographic Information System. GIS applications and tools allow you to store, manipulate, analyze, and visualize data that corresponds to different aspects of the existing environment. Central to this is topographical information, but it could also include many other aspects, from contours and slopes to the built environment, land types, and bodies of water. In the context of climate and human geography it’s easy to see how this kind of data can help us see the bigger picture - quite literally - behind what’s happening in our region, across our countries, and indeed, across the whole world.

The history of geospatial analysis is a testament to its power. In 1854 physician John Snow identified the source of a cholera outbreak in London by marking out the homes of victims on a map. The cluster of victims that Snow’s map revealed led him to an infected water supply.

Read next: Neo4j introduces Aura, a new cloud service to supply a flexible, reliable and developer-friendly graph database

How GIS and geospatial analysis is being used today

While this example is, of course, incredibly low-tech, it highlights exactly why geospatial analysis and GIS tools can be so valuable. To bring us up to date, there are many more examples of how geospatial analysis is making a real impact on social and environmental issues.
This article on Forbes, for example, details some of the ways in which GIS projects are helping to uncover information that offers unique insights into the history of racism and its continuing reality today. The list includes a map of historical lynchings that occurred between 1877 and 1950, and a map by the Urban Institute that shows the reality of racial segregation in U.S. schools in the 21st century.

https://twitter.com/urbaninstitute/status/504668921962577921

That’s just a small snapshot - there is a huge range of incredible GIS projects that are having a massive impact not only on how we understand issues but also on policy. That's analytics enacting real, demonstrable change. Here are a few of the different areas in which GIS is being used:

How GIS can be used in agriculture

GIS can be used to tackle crop diseases by identifying issues across a large area of land. It’s possible to gain a deeper insight into what can drive improvements to crop yields by looking at the geographic and environmental factors that influence successful growth.

How GIS can be used in retail

GIS can help provide insight into the relationship between consumer behavior and factors such as weather and congestion. It can also be used to better understand how consumers interact with products in shops. This can influence things like store design and product placement.

How GIS can be used in meteorology and climate science

Without GIS, it would be impossible to properly understand and visualise rainfall around the world. GIS can also be used to make predictions about the weather. For example, identifying anomalies in patterns and trends could indicate extreme weather events.

How GIS can be used in medicine and health

As we saw in the example above, by identifying clusters of disease, it becomes much easier to determine the causes of certain illnesses. GIS can also help us better understand the relationship between illness and environment - like pollution and asthma.
How GIS can be used for humanitarian purposes

Geospatial tools can help humanitarian teams to understand patterns of violence in given areas. This can help them to better manage and distribute resources and support to where they are needed (Map Kibera is a great example of how this can be done). GIS tools are good at helping to bridge the gap between local populations and humanitarian workers in times of crisis. For example, during the Haiti earthquake non-profit tech company Ushahidi’s product helped to collate and coordinate reports from across the island. This made it possible to align what might have otherwise been a mess of data and information.

There are many, many more examples of GIS being used for both commercial and non-profit purposes. If you want an in-depth look at a huge range of examples, it’s well worth checking out this article, which features 1000 GIS projects.

Although geospatial analysis can be used across many different domains, all the examples above have a common thread running through them: they all help us to understand the impact of space and geography. From social mobility and academic opportunity to soil erosion, GIS and other geospatial tools are brilliant because they help us to identify relationships that we might otherwise be unable to see.

GIS and geospatial analysis project ideas

This is an important point if you’re not sure where to begin with a new GIS project. Forget the data (to begin with at least) and just think about what sort of questions you’d like to answer. The list is potentially endless, but here are some questions that I thought of just off the top of my head:

  • Are there certain parts of your region more prone to flooding?
  • Why are certain parts of your town congested and not others?
  • Do economically marginalised people have to travel further to receive healthcare?
  • Does one part of your region receive more rainfall/snowfall than other parts?
  • Are there more new buildings in one area than another?
Getting this right is integral to any good analysis project. Ultimately it’s what makes the whole thing worthwhile.

Read next: PostGIS 3.0.0 releases with raster support as a separate extension

Where to find data for a GIS project

Once you’ve decided on something you want to find out, the next step is to collect your data. This can be tricky, but there is nevertheless a massive range of free data sources you can use for your project. This web page has a comprehensive collection of datasets; while it might not have exactly what you’re looking for, it's a good place to begin if you simply want to try something out.

Conclusion: Geospatial analysis is one of the most exciting and potentially transformative fields in analytics

GIS and geospatial analysis are quite literally rooted in the real world. In the maps and visualizations that we create we’re able to offer unique perspectives on history or provide practical guidance on how we should act and what we need to do. This is significant: all too often technology can feel like it’s divorced from reality, as if it is folded into its own world that has no connection to real people. So, be ambitious, and be bold with your next GIS project: who knows what impact it could have.
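One of the project questions raised earlier, whether economically marginalised people have to travel further to receive healthcare, can be roughed out in a few lines of Python before any real data is collected. The sketch below is only an illustration: the clinic and neighbourhood coordinates, the `low_income` flag, and the function names are all hypothetical, and it measures straight-line (great-circle) distance with the haversine formula rather than actual travel distance.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical data, invented purely for illustration.
clinics = [(51.50, -0.12), (51.55, -0.10)]
neighbourhoods = [
    {"name": "A", "lat": 51.51, "lon": -0.12, "low_income": False},
    {"name": "B", "lat": 51.54, "lon": -0.11, "low_income": False},
    {"name": "C", "lat": 51.45, "lon": -0.02, "low_income": True},
    {"name": "D", "lat": 51.60, "lon": -0.25, "low_income": True},
]


def nearest_clinic_km(place):
    """Straight-line distance from a neighbourhood to its closest clinic."""
    return min(
        haversine_km(place["lat"], place["lon"], lat, lon) for lat, lon in clinics
    )


def mean_distance_km(low_income):
    """Average nearest-clinic distance for one group of neighbourhoods."""
    distances = [
        nearest_clinic_km(p) for p in neighbourhoods if p["low_income"] is low_income
    ]
    return sum(distances) / len(distances)


print("low-income mean km:", round(mean_distance_km(True), 2))
print("other mean km:", round(mean_distance_km(False), 2))
```

In a real project the points would come from an open dataset, and a routing service or road network would give a better measure of travel than straight-line distance.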
Bhagyashree R
11 Oct 2019
5 min read

Facebook releases PyTorch 1.3 with named tensors, PyTorch Mobile, 8-bit model quantization, and more

Yesterday, at the PyTorch Developer Conference, Facebook announced the release of PyTorch 1.3. This release comes with three experimental features: named tensors, 8-bit model quantization, and PyTorch Mobile. Along with these features, Facebook also announced the general availability of Google Cloud TPU support and a newly launched integration with Alibaba Cloud.

Key updates in PyTorch 1.3

Named Tensors for more readable and maintainable code

Though tensors are the building blocks of modern machine learning, researchers have argued that they are “broken.” Tensors have their share of shortcomings: they expose private dimensions, broadcast based on absolute position, and keep the type information in the documentation. PyTorch 1.3 tries to solve this problem by introducing experimental support for named tensors, an idea proposed by Sasha Rush, an Associate Professor at Cornell Tech. He has built a library called NamedTensor, which serves as a “thin-wrapper” on Torch tensor. This update introduces a few changes to the API. Dimension access and reduction now use a ‘dim’ argument instead of an index. Constructing and adding dimensions requires a “name” argument. Functions now broadcast based on set operations, not through heuristic ordering rules.

8-bit model quantization for mobile-optimized AI

Quantization in deep learning is the method of approximating a neural network that uses 32-bit floating-point numbers by a neural network that uses a lower-precision numerical format. It is used to reduce the bandwidth and compute requirements of deep learning models. This is essential for on-device applications, which have limited memory and compute. PyTorch 1.3 brings experimental support for 8-bit model quantization with the eager mode Python API for efficient deployment on servers and edge devices. This feature includes techniques like post-training quantization, dynamic quantization, and quantization-aware training.
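To make the idea of quantization concrete, here is a toy, framework-free sketch of mapping floating-point values onto 8-bit integer codes with a simple min-max affine scheme. It is not PyTorch's quantization API or implementation; the function names and sample weights are hypothetical, and real toolkits handle calibration, per-channel scales, and quantized kernels that this sketch ignores.

```python
def quantize(values, num_bits=8):
    """Map floats onto integer codes in [0, 2**num_bits - 1] (min-max affine scheme)."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    # One integer step covers `scale` units of the original float range.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    codes = [min(qmax, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]


# Hypothetical weights standing in for one layer of a model.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]

codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)

# The round trip loses at most half a quantization step per value.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage cost: 4 bytes per 32-bit float versus 1 byte per 8-bit code.
float32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)

print(codes)
print(max_error, float32_bytes, int8_bytes)
```

The trade-off is visible directly: each value now occupies a quarter of the space, at the cost of a small, bounded rounding error.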
Moving from 32 bits to 8 bits can result in two to four times faster computations with one-quarter the memory usage.

PyTorch Mobile for more efficient on-device machine learning

Running machine learning models directly on edge devices is of great importance as it reduces latency. This is why PyTorch 1.3 introduces PyTorch Mobile, which enables “an end-to-end workflow from Python to deployment on iOS and Android.” The current release is experimental. In future releases, we can expect PyTorch Mobile to come with build-level optimization, selective compilation, support for QNNPACK quantized kernel libraries and ARM CPUs, further performance improvements, and more.

Model interpretability and privacy tools in PyTorch 1.3

Captum and Captum Insights

Captum is an easy-to-use model interpretability library for PyTorch. It is backed by state-of-the-art interpretability algorithms such as Integrated Gradients, DeepLIFT, and Conductance to help developers improve and troubleshoot their models. Developers can identify the features that contribute to a model’s output and improve its design. Facebook has also released an early version of Captum Insights, an interpretability visualization widget built on top of Captum. It works across images, text, and other features to help users understand feature attribution. Check out Facebook’s announcement to know more about Captum.

CrypTen

Machine learning via cloud-based platforms poses various security and privacy challenges. Facebook writes, “In particular, users of these platforms may not want or be able to share unencrypted data, which prevents them from taking full advantage of ML tools.” PyTorch 1.3 comes with CrypTen, a framework for privacy-preserving machine learning. It aims to make secure computing techniques accessible to machine learning practitioners. You can find more about CrypTen on GitHub.

Libraries for multimodal AI systems

Detectron2: It is an object detection library implemented in PyTorch.
It features support for the latest models and tasks and increased flexibility to aid computer vision research. There are also improvements in maintainability and scalability to support production use cases.

Fairseq gets speech extensions: With this release, Fairseq, a framework for sequence-to-sequence applications such as language translation, includes support for end-to-end learning for speech and audio recognition tasks.

The release of PyTorch 1.3 started a discussion on Hacker News and naturally many developers compared it with TensorFlow 2.0. Here’s what a user commented, “This is a common trend for being second in the market when we see Pytorch and TensorFlow 2.0, TF 2.0 was created to compete directly with Pytorch pythonic implementation (Keras based, Eager execution).” They further added, “Facebook at least on PyTorch has been delivering a quality product. Although for us running production pipelines TF is still ahead in many areas (GPU, TPU implementation, TensorRT, TFX and other pipeline tools) I can see Pytorch catching up on the next couple of years which by my prediction many companies will be running serious and advanced workflows and we may be able to see a winner there.”

The named tensors implementation is being well-received by the PyTorch community:

https://twitter.com/leopd/status/1182342855886376965
https://twitter.com/rasbt/status/1182647527906140161

These were some of the updates in PyTorch 1.3. Check out the official announcement by Facebook to know more.

  • PyTorch 1.2 is here with a new TorchScript API, expanded ONNX export, and more
  • PyTorch announces the availability of PyTorch Hub for improving machine learning research reproducibility
  • Sherin Thomas explains how to build a pipeline in PyTorch for deep learning workflows
  • Facebook AI open-sources PyTorch-BigGraph for faster embeddings in large graphs
  • Facebook open-sources PyText, a PyTorch based NLP modeling framework

Sugandha Lahoti
10 Oct 2019
3 min read

Get Ready for Open Data Science Conference 2019 in Europe and California

Get ready to learn and experience the very latest in data science and AI with expert-led trainings, workshops, and talks at ODSC West 2019 in San Francisco and ODSC Europe 2019 in London. ODSC events are built for the community and feature the most comprehensive breadth and depth of training opportunities available in data science, machine learning, and deep learning. They also provide numerous opportunities to connect, network, and exchange ideas with data science peers and experts from across the country and the world.

What to expect at ODSC West 2019

ODSC West 2019 is scheduled to take place in San Francisco, California from Tuesday, Oct 29, 2019, 9:00 AM to Friday, Nov 1, 2019, 6:00 PM PDT. This year, ODSC West will host several networking events, including ODSC Networking Reception, Dinner and Drinks with Data Scientists, Meet the Speakers, Meet the Experts, and Book Signings Hallway Track. Core areas of focus include Open Data Science, Machine Learning & Deep Learning, Research frontiers, Data Science Kick-Start, AI for engineers, Data Visualization, Data Science for Good, and management & DataOps.

Here are just a few of the experts who will be presenting at ODSC:

  • Anna Veronika Dorogush, CatBoost Team Lead, Yandex
  • Sarah Aerni, Ph.D., Director of Data Science and Engineering, Salesforce
  • Brianna Schuyler, Ph.D., Data Scientist, Fenix International
  • Katie Bauer, Senior Data Scientist, Reddit, Inc
  • Jennifer Redmon, Chief Data Evangelist, Cisco Systems, Inc
  • Sanjana Ramprasad, Machine Learning Engineer, Mya Systems
  • Cassie Kozyrkov, Ph.D., Chief Decision Scientist, Google
  • Rachel Thomas, Ph.D., Co-Founder, fast.ai

Check out more of the conference’s industry-leading speakers here. ODSC also conducts the Accelerate AI Business Summit, which brings together leading experts in AI and business to discuss three core topics: AI Innovation, Expertise, and Management.
Don’t miss out on the event

You can also use code ODSC_PACKT right now to exclusively save 30% before Friday on your ticket to ODSC West 2019.

What to expect at ODSC Europe 2019

ODSC Europe 2019 is scheduled to take place in London, UK from Tuesday, Nov 19, 2019 to Friday, Nov 22, 2019. The Europe Talks/Workshops schedule covers Thursday, Nov 21st and Friday, Nov 22nd. It is available to Silver, Gold, Platinum, and Diamond pass holders. The Europe Trainings schedule covers Tuesday, November 19th and Wednesday, November 20th. It is available to Training, Gold (Wed Nov 20th only), Platinum, and Diamond pass holders. Some talks scheduled to take place include ML for Social Good: Success Stories and Challenges, Machine Learning Interpretability Toolkit, Tools for High-Performance Python, The Soul of a New AI, Machine Learning for Continuous Integration, Practical, Rigorous Explainability in AI, and more.

ODSC has released a preliminary schedule with information on attending speakers and their training, workshop, and talk topics. The full schedule is going to be available soon. They’ve also recently added several excellent speakers, including:

  • Manuela Veloso, Ph.D. | Head of AI Research, JP Morgan
  • Dr. Wojciech Samek | Head of Machine Learning, Fraunhofer Heinrich Hertz Institute
  • Samik Chandanara | Head of Analytics and Data Science, JP Morgan
  • Tom Cronin | Head of Data Science & Data Engineering, Lloyds Banking Group
  • Gideon Mann, Ph.D. | Head of Data Science, Bloomberg, LP

There are more chances to learn, connect, and share ideas at this year’s event than ever before. Don’t miss out. Use code ODSC_PACKT right now to save 30% on your ticket to ODSC Europe 2019.

Bhagyashree R
03 Oct 2019
5 min read

TensorFlow 2.0 released with tighter Keras integration, eager execution enabled by default, and more!

After releasing the beta version of TensorFlow 2.0 in June, Google announced its final release on Monday. This release comes with tighter integration with Keras, eager execution enabled by default, up to three times faster training performance, a cleaned-up API, and more.

Key updates in TensorFlow 2.0

Tighter Keras integration for better developer productivity

One of the important updates in TensorFlow 2.0 is its tighter integration with Keras, a popular high-level API used for easy and fast prototyping, building, and training of deep learning models. This will enable developers to easily leverage its various model-building APIs, including Sequential, Functional, and Subclassing. Explaining the motivation behind this change, the TensorFlow team wrote, “By establishing Keras as the high-level API for TensorFlow, we are making it easier for developers new to machine learning to get started with TensorFlow. A single high-level API reduces confusion and enables us to focus on providing advanced capabilities for researchers.”

Eager execution enabled by default

In TensorFlow 1.x, developers were required to define an abstract data structure named Graph, and to run this graph they needed an encapsulation called Session. TensorFlow 2.0 has eager execution enabled by default to “eagerly” run code, similar to normal Python code. Eager execution enables fast iteration and intuitive debugging without building a graph. It also makes creating and experimenting with models using TensorFlow much easier. It can be especially useful when using the tf.keras model subclassing API.

Also Read: Keras 2.3.0, the first release of multi-backend Keras with TensorFlow 2.0 support is now out

Distribution Strategy API

The Distribution Strategy API in TensorFlow 2.0 allows machine learning researchers to distribute training across a wide variety of compute configurations. This will allow them to “attain great out-of-the-box performance” with minimal code changes.
This release also allows distributed training with Keras’ model.fit and custom training loops.

Performance improvements on GPUs

TensorFlow 2.0 includes multi-GPU support and experimental support for multi-worker training and Cloud TPUs. This release also has a number of performance improvements on GPUs. It promises three times faster training performance when using mixed precision on NVIDIA’s Volta and Turing GPUs. It includes tight integration with NVIDIA TensorRT, a platform for high-performance deep learning inference.

The standardized SavedModel file format

The SavedModel API allows you to save your trained ML model into a language-neutral format. With TensorFlow 2.0, all TensorFlow ecosystem projects, including TensorFlow Lite, TensorFlow JS, TensorFlow Serving, and TensorFlow Hub, support SavedModels. Standardizing the SavedModel file format will enable developers to run their models on a variety of runtimes including the cloud, web, browser, Node.js, mobile, and embedded systems. “This allows you to run your models with TensorFlow, deploy them with TensorFlow Serving, use them on mobile and embedded systems with TensorFlow Lite, and train and run in the browser or Node.js with TensorFlow.js,” the team writes.

API simplification

TensorFlow 2.0 includes a number of API updates. Many API symbols are removed or renamed for better consistency and clarity. Also, the tf.app, tf.flags, and tf.logging APIs are removed in favor of abseil-py.

Because of the huge number of API changes, developers in a discussion on Hacker News said that transitioning from TensorFlow 1.x to TensorFlow 2.0 is quite complicated. Some also mentioned switching to PyTorch instead. A user commented, “As someone who uses TensorFlow a lot, I predict an enormous clusterfuck of a transition.
Tensorflow has turned into a multiheaded monster, supporting many things and approaches but none of them very well...In my opinion, there are some architectural problems with TF, which have not been addressed in this update...If you need to transition from TF1 to TF2, consider doing the TF1 to PyTorch transition instead.”

Others were happy with the recommended Keras API and eager execution. “I don't know if I'm the only one, but I actually love the changes they've made since v1. Eager execution and tf.function are fantastic, and the built-in Keras is even better than the standalone version. A big improvement compared to TF from last year,” a user commented on Reddit. Another user added, “The most important change in terms of usability, IMO, is the use of tf.keras as the recommended interface to TensorFlow. There hasn't been a case yet where I've needed to dip outside of Keras into raw TensorFlow, but the option is there and is easy to do. That said, TF 2.0 changes a lot. Many repos might break, so expect to see lots of tensorflow==1.14 in requirement.txt files from now on.”

These were some of the updates in TensorFlow 2.0. Check out the official announcement and release notes to know more in detail.

  • Transformers 2.0: NLP library with deep interoperability between TensorFlow 2.0 and PyTorch, and 32+ pretrained models in 100+ languages
  • TensorFlow 2.0 to be released soon with eager execution, removal of redundant APIs, tf function and more
  • Introducing TensorFlow Graphics packed with TensorBoard 3D, object transformations, and much more
  • Train a convolutional neural network in Keras and improve it with data augmentation [Tutorial]
Fatema Patrawala
30 Sep 2019
3 min read

Transformers 2.0: NLP library with deep interoperability between TensorFlow 2.0 and PyTorch, and 32+ pretrained models in 100+ languages

Last week, Hugging Face, a startup specializing in natural language processing, released a landmark update to their popular Transformers library, offering unprecedented compatibility between two major deep learning frameworks, PyTorch and TensorFlow 2.0. Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural Language Generation (NLG), with 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.

Transformers 2.0 embraces the ‘best of both worlds’, combining PyTorch’s ease of use with TensorFlow’s production-grade ecosystem. The new library makes it easier for scientists and practitioners to select different frameworks for the training, evaluation, and production phases of developing the same language model.

“This is a lot deeper than what people usually think when they talk about compatibility,” said Thomas Wolf, who leads Hugging Face’s data science team. “It’s not only about being able to use the library separately in PyTorch and TensorFlow. We’re talking about being able to seamlessly move from one framework to the other dynamically during the life of the model.”

https://twitter.com/Thom_Wolf/status/1177193003678601216

“It’s the number one feature that companies asked for since the launch of the library last year,” said Clement Delangue, CEO of Hugging Face.
Notable features in Transformers 2.0

  • 8 architectures with over 30 pretrained models, in more than 100 languages
  • Load a model and pre-process a dataset in less than 10 lines of code
  • Train a state-of-the-art language model in a single line with the tf.keras fit function
  • Share pretrained models, reducing compute costs and carbon footprint
  • Deep interoperability between TensorFlow 2.0 and PyTorch models
  • Move a single model between TF2.0/PyTorch frameworks at will
  • Seamlessly pick the right framework for training, evaluation, production
  • As powerful and concise as Keras

About Hugging Face Transformers

With half a million installs since January 2019, Transformers is the most popular open-source NLP library. More than 1,000 companies, including Bing, Apple, and Stitchfix, are using it in production for text classification, question-answering, intent detection, text generation, and conversational applications. Hugging Face, the creators of Transformers, have raised US$5M so far from investors in companies like Betaworks, Salesforce, Amazon and Apple.

On Hacker News, users are appreciating the company and how Transformers has become the most important library in NLP.

Other interesting news in data

  • Baidu open sources ERNIE 2.0, a continual pre-training NLP model that outperforms BERT and XLNet on 16 NLP tasks
  • Dr Joshua Eckroth on performing Sentiment Analysis on social media platforms using CoreNLP
  • Facebook open-sources PyText, a PyTorch based NLP modeling framework

Savia Lobo
24 Sep 2019
5 min read

Can a modified MIT ‘Hippocratic License’ to restrict misuse of open source software prompt a wave of ethical innovation in tech?

Open source licenses allow software to be freely distributed, modified, and used. They also let developers set the rules and conditions under which others may use their software. Recently, software developer and open-source advocate Coraline Ada Ehmke caused a stir in the software engineering community with ‘The Hippocratic License.’

Ehmke is also the original author of the Contributor Covenant, a “code of conduct” for open source projects that encourages participants to use inclusive language and to refrain from personal attacks and harassment. In a tweet about the code of conduct posted in September last year, she said, “40,000 open source projects, including Linux, Rails, Golang, and everything OSS produced by Google, Microsoft, and Apple have adopted my code of conduct.”

The term ‘Hippocratic’ is derived from the Hippocratic Oath, the most widely known of Greek medical texts. Taken literally, the Hippocratic Oath requires a new physician to swear upon a number of healing gods that he will uphold a number of professional ethical standards.

Ehmke explained the license in more detail in a post published on Sunday. In it, she argues that writing software solely with the goals of clarity, conciseness, readability, performance, and elegance is limiting, and potentially dangerous. “All of these technologies are inherently political,” she writes. “There is no neutral political position in technology. You can’t build systems that can be weaponized against marginalized people and take no responsibility for them.”

The concept of the Hippocratic license is relatively simple.
In a tweet, Ehmke said that the license “specifically prohibits the use of open-source software to harm others.”

Open source software and the associated harm

Open source software grants many privileges, such as free redistribution of both the software and its source code. The OSI also specifies that there be no discrimination regarding who uses the software or where it is put to use.

A few days ago, software engineer Seth Vargo pulled his open-source software, Chef-Sugar, offline after finding out that Chef (a popular open source DevOps company using the software) had recently signed a contract selling $95,000 worth of licenses to U.S. Immigration and Customs Enforcement (ICE), which has faced widespread condemnation for separating children from their parents at the U.S. border and for other abuses. As a sign of protest, Vargo took down the Chef Sugar library from both GitHub and RubyGems, the main Ruby package repository.

In May this year, Mijente, an advocacy organization, released documents stating that Palantir was responsible for the 2017 ICE operation that targeted and arrested family members of children crossing the border alone. In May 2018, Amazon employees protested in a letter to Jeff Bezos against the sale of the company’s facial recognition tech to Palantir, stating that they “refuse to contribute to tools that violate human rights” and citing the mistreatment of refugees and immigrants by ICE. In July, WNYC revealed that ICE was using Palantir’s mobile app FALCON to carry out raids on immigrant communities and to enable workplace raids in New York City in 2017.

Founder of OSI responds to Ehmke’s Hippocratic License

Bruce Perens, one of the founders of the Open Source movement in software, responded to Ehmke in a post titled “Sorry, Ms. Ehmke, The “Hippocratic License” Can’t Work”.
“The software may not be used by individuals, corporations, governments, or other groups for systems or activities that actively and knowingly endanger, harm, or otherwise threaten the physical, mental, economic, or general well-being of underprivileged individuals or groups,” he highlights in his post, citing the license text. “The terms are simply far more than could be enforced in a copyright license,” he adds. “Nobody could enforce Ms. Ehmke’s license without harming someone, or at least threatening to do so. And it would be easy to make a case for that person being underprivileged,” he continued. He concluded that, although he disagrees with the terms of Ehmke’s license, he will “happily support Ms. Ehmke in pursuit of legal reforms meant to achieve the protection of underprivileged people.”

Many have welcomed Ehmke’s idea of an open source license with an ethical clause. However, the license is not yet OSI approved, and the chances of approval look slim after Perens’ response. Many users do not agree with the license, and reaching a consensus will be hard.

https://twitter.com/seannalexander/status/1175853429325008896
https://twitter.com/AdamFrisby/status/1175867432411336704
https://twitter.com/rishmishra/status/1175862512509685760

Even when developers host their source code in open source repositories, a license may place a certain level of restriction on who is allowed to use the code. However, as Perens mentions, many of the terms in Ehmke’s license are hard to enforce. Irrespective of the outcome of the license’s approval process, Coraline Ehmke has opened up a wide discussion in the open source community about the need for long-overdue FOSS licensing reforms. It will be interesting to see whether such a license would boost ethical reform by giving developers more authority to embed their values in their work and prevent the misuse of their software.

Read the Hippocratic license to learn more.
Other interesting news in Tech

  • ImageNet Roulette: New viral app trained using ImageNet exposes racial biases in artificial intelligent system
  • Machine learning ethics: what you need to know and what you can do
  • Facebook suspends tens of thousands of apps amid an ongoing investigation into how apps use personal data