Tech News

Sherin Thomas explains how to build a pipeline in PyTorch for deep learning workflows

Packt Editorial Staff
09 May 2019
8 min read
A typical deep learning workflow starts with ideation and research around a problem statement, where the architectural design and model decisions come into play. Following this, the theoretical model is tested using prototypes. This includes trying out different models or techniques, such as skip connections, or making decisions on what not to try out. PyTorch started as a research framework built by a Facebook intern, and it has since grown into a framework used both for research and prototyping and for writing efficient models with serving modules. The PyTorch deep learning workflow is broadly equivalent to the workflow implemented by almost everyone in the industry, even for highly sophisticated implementations, with slight variations.

In this article, we explain the core of ideation and planning, and the design and experimentation, of the PyTorch deep learning workflow. This article is an excerpt from the book PyTorch Deep Learning Hands-On by Sherin Thomas and Sudhanshu Passi. The book provides an entirely practical introduction to PyTorch, with numerous examples and dynamic AI applications demonstrating the simplicity and efficiency of the PyTorch approach to machine intelligence and deep learning.

Ideation and planning

Usually, in an organization, the product team presents a problem statement to the engineering team to find out whether they can solve it. This is the start of the ideation phase. In academia, this could be the phase where candidates have to find a problem for their thesis. In the ideation phase, engineers brainstorm and identify the theoretical implementations that could potentially solve the problem. In addition to converting the problem statement into a theoretical solution, the ideation phase is where we decide what the data types are and what dataset we should use to build the proof of concept (POC) of the minimum viable product (MVP). This is also the stage where the team decides which framework to go with by analyzing the behavior of the problem statement, the available implementations, the available pretrained models, and so on. This stage is very common in the industry, and I have come across numerous examples where a well-planned ideation phase helped the team roll out a reliable product on time, while a poorly planned ideation phase destroyed the whole product creation.

Design and experimentation

The crucial part of design and experimentation lies in the dataset and its preprocessing. For any data science project, the major share of time is spent on data cleaning and preprocessing, and deep learning is no exception. Data preprocessing is one of the vital parts of building a deep learning pipeline. Real-world datasets are usually not cleaned or formatted for a neural network to process; conversion to floats or integers, normalization, and so on are required before further processing. Building a data processing pipeline is also a non-trivial task that consists of writing a lot of boilerplate code. To make this much easier, dataset builders and DataLoader pipeline packages are built into the core of PyTorch.

The dataset and DataLoader classes

Different types of deep learning problems require different types of datasets, and each of them might require different types of preprocessing depending on the neural network architecture we use. This is one of the core problems in deep learning pipeline building.
Although the community has made datasets for different tasks available for free, writing a preprocessing script is almost always painful. PyTorch solves this problem by providing abstract classes for writing custom datasets and data loaders. The example given here is a simple dataset class to load the fizzbuzz dataset, but extending this to handle any type of dataset is fairly straightforward. PyTorch's official documentation uses a similar approach to preprocess an image dataset before passing it to a complex convolutional neural network (CNN) architecture.

A dataset class in PyTorch is a high-level abstraction that handles almost everything required by the data loaders. The custom dataset class defined by the user needs to override the __len__ and __getitem__ functions of the parent class: __len__ is used by the data loaders to determine the length of the dataset, and __getitem__ is used by the data loaders to fetch an item. The __getitem__ function expects the user to pass the index as an argument and returns the item that resides at that index:

from dataclasses import dataclass
from torch.utils.data import Dataset, DataLoader

@dataclass(eq=False)
class FizBuzDataset(Dataset):
    input_size: int
    start: int = 0
    end: int = 1000

    def encoder(self, num):
        # Binary-encode the integer, left-padded with zeros to input_size bits.
        ret = [int(i) for i in '{0:b}'.format(num)]
        return [0] * (self.input_size - len(ret)) + ret

    def __getitem__(self, idx):
        x = self.encoder(idx)
        # One-hot label: fizzbuzz, buzz, fizz, or neither.
        if idx % 15 == 0:
            y = [1, 0, 0, 0]
        elif idx % 5 == 0:
            y = [0, 1, 0, 0]
        elif idx % 3 == 0:
            y = [0, 0, 1, 0]
        else:
            y = [0, 0, 0, 1]
        return x, y

    def __len__(self):
        return self.end - self.start

The implementation of the custom dataset uses the brand new dataclasses from Python 3.7. dataclasses help to eliminate boilerplate code for Python magic functions, such as __init__, using dynamic code generation. This requires the code to be type-hinted, and that's what the first three lines inside the class are for. You can read more about dataclasses in the official Python documentation (https://docs.python.org/3/library/dataclasses.html).

The __len__ function returns the difference between the end and start values passed to the class. In the fizzbuzz dataset, the data is generated by the program. The implementation of data generation is inside the __getitem__ function, where the class instance generates the data based on the index passed by DataLoader. PyTorch made the class abstraction as generic as possible, such that the user can define what the data loader should return for each id. In this particular case, the class instance returns the input and output for each index, where the input, x, is the binary-encoded version of the index itself and the output is the one-hot encoded output with four states. The four states represent whether the number is a multiple of both three and five (fizzbuzz), a multiple of five only (buzz), a multiple of three only (fizz), or not a multiple of either.

Note: For Python newbies, the way the dataset works can be understood by looking first at the loop that iterates over the integers from zero to the length of the dataset (the length is returned by the __len__ function when len(object) is called).
The following snippet shows the simple loop:

dataset = FizBuzDataset(input_size=10)  # input_size is required; 10 bits covers indices up to 1023
for i in range(len(dataset)):
    x, y = dataset[i]

dataloader = DataLoader(dataset, batch_size=10, shuffle=True,
                        num_workers=4)
for batch in dataloader:
    print(batch)

The DataLoader class accepts a dataset object whose class inherits from torch.utils.data.Dataset. DataLoader takes the dataset and performs non-trivial operations such as mini-batching, parallel fetching, shuffling, and so on, to fetch the data from the dataset. It accepts a dataset instance from the user and uses a sampler strategy to sample data as mini-batches. The num_workers argument decides how many parallel worker processes should be operating to fetch the data. This helps to avoid a CPU bottleneck so that data loading can keep up with the GPU's parallel operations. Data loaders also allow users to specify whether to use pinned CUDA memory, which copies the data tensors to CUDA's pinned memory before returning them to the user. Using pinned memory is the key to fast data transfers between devices, since the data is loaded into the pinned memory by the data loader itself, which is done by multiple cores of the CPU anyway.
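Pinned memory and worker counts are both plain DataLoader arguments. As a hedged sketch (ours, not from the book excerpt; pin_memory and num_workers are real torch.utils.data.DataLoader parameters, the batch size here is arbitrary):

# Enabling pinned (page-locked) host memory and parallel workers.
# pin_memory=True makes a later .to('cuda', non_blocking=True) copy
# asynchronous and faster; num_workers=4 spawns four worker processes.
loader = DataLoader(dataset, batch_size=10, shuffle=True,
                    num_workers=4, pin_memory=True)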
Most often, especially while prototyping, custom datasets might not be available to developers, and in such cases they have to rely on existing open datasets. The good thing about working with open datasets is that most of them are free from licensing burdens, and thousands of people have already tried preprocessing them, so the community will help out. PyTorch came up with utility packages for the three major dataset domains (vision, text, and audio), with pretrained models, preprocessed datasets, and utility functions to work with them.

This article is about how to build a basic pipeline for deep learning development. The system we defined here is a very common, general approach that is followed by different sorts of companies, with slight changes. The benefit of starting with a generic workflow like this is that you can build a really complex workflow on top of it as your team or project grows.

Build deep learning workflows and take deep learning models from prototyping to production with PyTorch Deep Learning Hands-On, written by Sherin Thomas and Sudhanshu Passi.

F8 PyTorch announcements: PyTorch 1.1 releases with new AI tools, open sourcing BoTorch and Ax, and more
Facebook AI open-sources PyTorch-BigGraph for faster embeddings in large graphs
Top 10 deep learning frameworks

Microsoft’s .NET Core 2.1 now powers Bing.com

Melisha Dsouza
21 Aug 2018
4 min read
Microsoft is ever striving to make its products run better, and it can now add yet another accomplishment to its list: Bing, Microsoft's search engine, is now running fully on .NET Core 2.1, as announced by the .NET engineering team on their blog yesterday. .NET Core is the slimmed-down, cross-platform version of Microsoft's .NET managed common language runtime. Since Bing runs on thousands of servers spanning many data centers across the globe, .NET Core will serve as the perfect platform for it to function on.

Why did Bing migrate to .NET Core 2.1?

Bing has always run on the .NET Framework, but recent API additions made it possible to move to .NET Core 2.1. Let's take a look at the main reasons for Bing.com's migration.

1. Performance, i.e. serving latency

.NET Core 2.1 has led to an improvement in performance in virtually all areas of the runtime and libraries. Internal server latency over the last few months shows a striking 34% improvement; the original post includes a graph that makes this clear (Source: blog.msdn.microsoft.com). The following changes in .NET Core 2.1 are the reasons why the workload and performance have greatly improved.

#1 Vectorization of string.Equals & string.IndexOf/LastIndexOf

HTML rendering and manipulation are string-heavy workloads. Vectorization of string comparisons and indexing operations (major components of string slicing) is the biggest contributor to the performance improvement. You can find more information on the GitHub page for vectorization of string.Equals and string.IndexOf/LastIndexOf.

#2 Devirtualization support for EqualityComparer<T>.Default

One of .NET Core's major software components is a heavy user of Dictionary<int/long, V>, which indirectly benefits from the intrinsic recognition work that was done in the JIT to make Dictionary<K, V> amenable to that optimization. Head over to the GitHub page for more clarity on why this feature empowers .NET Core 2.1.

#3 Software Write Watch for Concurrent GC

This led to a reduction in CPU usage. The implementation relies on a JIT write barrier, which intrinsically increases the cost of a reference store, but that cost is amortized and not noticed in the workload.

#4 Methods with calli are now inline-able

ldftn + calli are used in lieu of delegates (which incur an object allocation) in performance-critical pieces of code where there is a need to call a managed method indirectly. This change allowed method bodies with a calli instruction to be eligible for inlining. The GitHub page provides more insight on this subject.

#5 Improved performance of string.IndexOfAny for 2- and 3-character searches

A common operation in a front-end stack is searching for ':', '/', '/' in a string to delimit portions of a URL. Check out this special-casing improvement that was beneficial throughout the codebase on the GitHub page.
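As a small, hedged illustration of the kind of call site this optimization speeds up (this snippet is ours, not from the .NET team's post; the URL and delimiter set are hypothetical), delimiting a URL with string.IndexOfAny looks like this:

// Illustrative only: string.IndexOfAny scans for the first occurrence of any
// of the given characters; .NET Core 2.1 special-cases 2- and 3-char arrays.
using System;

class UrlSplitExample
{
    static void Main()
    {
        string url = "example.com/search?q=dotnet";
        // Find the first delimiter that separates the host from the rest.
        int i = url.IndexOfAny(new[] { ':', '/', '?' });
        Console.WriteLine(i < 0 ? url : url.Substring(0, i)); // prints "example.com"
    }
}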
2. Runtime agility

The ability to ship an xcopy version of the runtime inside the application means that the team can adopt newer versions of the runtime at a much faster pace. The continuous integration (CI) pipeline runs against .NET Core's daily CI builds, testing functionality and performance all the way through the release.

3. ReadyToRun images

Managed applications can often have poor startup performance, as methods first have to be JIT-compiled to machine code. .NET Framework has a precompilation technology, NGEN. On .NET Core, the crossgen tool allows the code to be precompiled as a pre-deployment step, such as in the build lab, so the images deployed to production are ready to run. This feature was not supported in the previous .NET implementation.

The .NET Core team is striving to provide Bing.com users with fast results. The latest software and technologies used by its developers will ensure that .NET Core will not fail Bing.com! Read the detailed overview on Microsoft's blog.

Say hello to FASTER: a new key-value store for large state management by Microsoft
Microsoft Azure's new governance DApp: An enterprise blockchain without mining
.NET Core completes move to the new compiler - RyuJIT

Frenemies: Intel and AMD partner on laptop chip to keep Nvidia at bay

Abhishek Jha
09 Nov 2017
3 min read
For decades, Intel and AMD have remained bitter archrivals. Today, they find themselves teaming up to thwart a common enemy: Nvidia. As Intel revealed its partnership with Advanced Micro Devices (AMD) on a next-generation notebook chip, it marked the first time the two chip giants have collaborated since the '80s.

The proposed chip for thin and lightweight laptops combines an Intel processor and an AMD graphics unit for complex video gaming. The new series of processors will be part of Intel's 8th-generation Core H-series mobile chips, expected to hit the market in the first quarter of 2018. What this means is that Intel's high-performance x86 cores will be combined with AMD Radeon graphics in the same processor package using Intel's EMIB multi-die technology. That is not all. Intel is also bundling the design with built-in High Bandwidth Memory (HBM2) RAM.

The new processor, Intel claims, reduces the usual silicon footprint by about 50%. And with a 'semi-custom' graphics processor from AMD, enthusiasts can look forward to discrete-graphics-level performance for playing games, editing photos or videos, and other tasks that can leverage modern GPU technologies.

What does AMD get?

Having struggled to remain profitable in recent times, AMD has been losing share in the discrete notebook GPU market. The deal could bring additional revenue with increased market share. Most importantly, the laptops built with the new processors won't be competing with AMD's Ryzen chips (which are also designed for ultrathin laptops). AMD clarified the difference: while the new Intel chips are designed for serious gamers, Ryzen chips (due out at the end of the year) can run games but are not specifically designed for that purpose.

"Our collaboration with Intel expands the installed base for AMD Radeon GPUs and brings to market a differentiated solution for high-performance graphics," said Scott Herkelman, vice president and general manager of AMD's Radeon Technologies Group. "Together we are offering gamers and content creators the opportunity to have a thinner-and-lighter PC capable of delivering discrete performance-tier graphics experiences in AAA games and content creation applications."

While more information will be available in the future, the first machines with the new technology are expected to be released in the first quarter of 2018. Nvidia's stock fell on the news, while both AMD and Intel saw their shares surge.

A rivalry that began when AMD reverse-engineered the Intel 8080 microchip in 1975 could still be far from over, but in graphics the two have been rather cordial. Despite their long-standing enmity, both picked each other as the lesser evil over Nvidia. This is why the Intel AMD laptop chip partnership has a definite future. Currently centered around laptop solutions, it could even stretch to desktops, who knows!

How to build a chatbot with Microsoft Bot Framework

Kunal Chaudhari
27 Apr 2018
8 min read
The Microsoft Bot Framework is an incredible tool from Microsoft. It makes building chatbots easier and more accessible than ever, which means you can build awesome conversational chatbots for a range of platforms, including Facebook and Slack.

In this tutorial, you'll learn how to build an FAQ chatbot using the Microsoft Bot Framework and ASP.NET Core. This tutorial has been taken from .NET Core 2.0 By Example. Let's get started.

You'll build a chatbot that can respond to simple queries such as:

How are you?
Hello!
Bye!

This should provide a good foundation for you to go further and build more complex chatbots with the Microsoft Bot Framework. The more you train the bot and the more questions you put in its knowledge base, the better it will be.

If you're a UK-based public sector organisation, then ICS AI offer conversational AI solutions built to your needs. Their Microsoft-based infrastructure runs chatbots augmented with AI to better serve general public enquiries.

Build a basic FAQ chatbot with Microsoft Bot Framework

First of all, we need to create a page that can be accessed anonymously, as this is a frequently asked questions (FAQ) page, and hence the user should not be required to be logged in to the system to access it. To do so, let's create a new controller called FaqController in our LetsChat.csproj. It will be a very simple class with just one action called Index, which will display the FAQ page. The code is as follows:

[AllowAnonymous]
public class FaqController : Controller
{
    // GET: Faq
    public ActionResult Index()
    {
        return this.View();
    }
}

Notice that we have used the [AllowAnonymous] attribute, so that this controller can be accessed even if the user is not logged in. The corresponding .cshtml is also very simple. In the solution explorer, right-click on the Views folder under the LetsChat project, create a folder named Faq, and then add an Index.cshtml file in that folder. The markup of Index.cshtml would look like this:

@{
    ViewData["Title"] = "Let's Chat";
    ViewData["UserName"] = "Guest";
    if (User.Identity.IsAuthenticated)
    {
        ViewData["UserName"] = User.Identity.Name;
    }
}
<h1> Hello @ViewData["UserName"]! Welcome to FAQ page of Let's Chat </h1>
<br />

Nothing much here apart from the welcome message. The message displays the username if the user is authenticated; otherwise it displays Guest. Now, we need to integrate the chatbot on this page. To do so, let's browse to http://qnamaker.ai. This is Microsoft's QnA (as in questions and answers) Maker site, a free, easy-to-use, REST API and web-based service that trains artificial intelligence (AI) to respond to user questions in a more natural, conversational way. Compatible across development platforms, hosting services, and channels, QnA Maker is the only question and answer service with a graphical user interface, meaning you don't need to be a developer to train, manage, and use it for a wide range of solutions. And that is what makes it incredibly easy to use.

You will need to log in to this site with your Microsoft account (@microsoft/@live/@outlook). If you don't have one, you should create one and log in. On the very first login, the site will display a dialog seeking permission to access your email address and profile information. Click Yes and grant permission. You will then be presented with the service terms. Accept those as well. Then navigate to the Create New Service tab.
A form will appear as shown here. The form is easy to fill in and provides the option to extract question/answer pairs from a site or from .tsv, .docx, .pdf, and .xlsx files. We don't have questions handy, so we will type them; do not bother about these fields. Just enter the service name and click the Create button. The service should be created successfully and the knowledge base screen should be displayed.

We will enter probable questions and answers in this knowledge base. If the user types a question that resembles a question in the knowledge base, the service will respond with the corresponding answer. Hence, the more questions and answers we type, the better it will perform. So, enter all the questions and answers that you wish to enter, test it in the local chatbot setup, and, once you are happy with it, click on Publish. This publishes the knowledge base and shares the sample URL to make the HTTP request. Note it down in a notepad; it contains the knowledge base identifier GUID, hostname, and subscription key.

With this, our questions and answers are ready and deployed. We need to display a chat interface, pass the user-entered text to this service, and display the response from this service to the user in the chat user interface. To do so, we will make use of the Microsoft Bot Builder SDK for .NET and follow these steps:

1. Download the Bot Application project template from http://aka.ms/bf-bc-vstemplate.
2. Download the Bot Controller item template from http://aka.ms/bf-bc-vscontrollertemplate.
3. Download the Bot Dialog item template from http://aka.ms/bf-bc-vsdialogtemplate.
4. Next, identify the project template and item template directories for Visual Studio 2017. The project template directory is located at %USERPROFILE%\Documents\Visual Studio 2017\Templates\ProjectTemplates\Visual C# and the item template directory is located at %USERPROFILE%\Documents\Visual Studio 2017\Templates\ItemTemplates\Visual C#.
5. Copy the Bot Application project template to the project template directory.
6. Copy the Bot Controller ZIP and Bot Dialog ZIP to the item template directory.

In the solution explorer of the LetsChat project, right-click on the solution and add a new project. Under Visual C#, we should now start seeing a Bot Application template. Name the project FaqBot and click OK. A new project will be created in the solution, which looks similar to the MVC project template. Build the project, so that all the dependencies are resolved and packages are restored. If you run the project, it is already a working bot, which can be tested with the Microsoft Bot Framework emulator. Download the BotFramework-Emulator setup executable from https://github.com/Microsoft/BotFramework-Emulator/releases/.

Let's run the bot project by hitting F5. It will display a page pointing to the default URL of http://localhost:3979. Now, open the Bot Framework emulator, navigate to the preceding URL, and append /api/messages to it; that is, browse to http://localhost:3979/api/messages and click Connect. On successful connection to the bot, a chat-like interface will be displayed in which you can type a message.

We have a working bot in place which just returns the text along with its length. We need to modify this bot to pass the user input to our QnA Maker service and display the response returned from our service. To do so, we will need to check the code of MessagesController in the Controllers folder.
We notice that it has just one method called Post, which checks the activity type, does specific processing for the activity type, creates a response, and returns it. The calculation happens in the Dialogs.RootDialog class, which is where we need to make the modification to wire up our QnA service. The modified code is shown here:

private static string knowledgeBaseId = ConfigurationManager.AppSettings["KnowledgeBaseId"]; //// Knowledge base id of QnA Service.
private static string qnamakerSubscriptionKey = ConfigurationManager.AppSettings["SubscriptionKey"]; //// Subscription key.
private static string hostUrl = ConfigurationManager.AppSettings["HostUrl"];

private async Task MessageReceivedAsync(IDialogContext context, IAwaitable<object> result)
{
    var activity = await result as Activity;

    // Return our reply to the user.
    await context.PostAsync(this.GetAnswerFromService(activity.Text));
    context.Wait(MessageReceivedAsync);
}

private string GetAnswerFromService(string inputText)
{
    //// Build the QnA Service URI.
    Uri qnamakerUriBase = new Uri(hostUrl);
    var builder = new UriBuilder($"{qnamakerUriBase}/knowledgebases/{knowledgeBaseId}/generateAnswer");
    var postBody = $"{{\"question\": \"{inputText}\"}}";

    // Add the subscription key header.
    using (WebClient client = new WebClient())
    {
        client.Headers.Add("Ocp-Apim-Subscription-Key", qnamakerSubscriptionKey);
        client.Headers.Add("Content-Type", "application/json");
        try
        {
            var response = client.UploadString(builder.Uri, postBody);
            var json = JsonConvert.DeserializeObject<QnAResult>(response);
            return json?.answers?.FirstOrDefault()?.answer;
        }
        catch (Exception ex)
        {
            return ex.Message;
        }
    }
}

The code is pretty straightforward. First, we add the QnA Maker service subscription key, host URL, and knowledge base ID in the appSettings section of Web.config. Next, we read these app settings into static variables so that they are always available. Next, we modify the MessageReceivedAsync method of the dialog to pass the user input to the QnA service and return the response of the service back to the user. The QnAResult class can be seen in the source code.
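The QnAResult class is not shown in this excerpt. A minimal sketch of what such a DTO could look like, assuming the service returns a JSON body with an answers array (the field names here are our assumption, chosen to match the deserialization code above):

// Hedged sketch of the DTO the code above deserializes into; the response
// shape is assumed to be {"answers":[{"answer":"...","score":...}]}.
public class QnAResult
{
    public QnAAnswer[] answers { get; set; }
}

public class QnAAnswer
{
    public string answer { get; set; }
    public double score { get; set; }
}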
This can be tested in the emulator by typing in any of the questions that we have stored in our knowledge base, and we will get the appropriate response. Our simple FAQ bot using the Microsoft Bot Framework and ASP.NET Core 2.0 is now ready!

Read more about building chatbots: How to build a basic server side chatbot using Go

Google’s new facial recognition patent uses your social network to identify you!

Melisha Dsouza
10 Aug 2018
3 min read
Google is making its mark in facial recognition technology. After two successful forays into facial identification patents in August 2017 and January 2018, Google is back with another charter. This time it's huge: the patent plans to use machine-learning technology for facial recognition of publicly available personal photos on the internet.

It's no secret that Google can crawl trillions of websites at once. Using this to its advantage, the new patent allows Google to source pictures and identify faces from personal communications, social networks, collaborative apps, blogs, and much more!

Why is facial recognition gaining importance?

The internet is buzzing with people clicking and uploading their images. Whether it be profile pictures or group photographs, images on social networks are all the rage these days. Apart from this, facial recognition also comes in handy while performing secure banking and financial transactions. ATMs and banks use this technology to make sure the user is who he/she claims to be. From criminal tracking to identifying individuals in huge crowds, facial recognition has applications everywhere!

Clearly, Google has been taking full advantage of this tech. First came the "Reverse Image Search" system, which allowed users to upload an image of a public figure to Google and get back a "best guess" about who appears in the photo. And now, with the new patent, users can identify photos of less famous individuals. Imagine uploading a picture of a fifth-grade friend and coming back with his/her email ID or occupation, or, for that matter, where they live!

The workings of the Google Brain

The process is simple and straightforward:
First, the user uploads a photo, screenshot, or scanned image.
The system analyzes the image and comes up with both visually similar images and a potential match using advanced image recognition.
Google finds the best possible match, based partially on the data it pulled from your social accounts and other collaborative apps, plus the aforementioned data sources.
(Image: the process of recognizing an image adopted by Google. Source: CBInsights)

While all of this does sound exciting, there is a dark side left to be explored. Imagine you are out going about your own business. Someone you don't even know happens to take your picture. This could later be used to find out your personal details, like where you live, what you do for a living, and what your email address is. All because everything is available on your social media accounts and on the internet these days! Creepy, much?

This is where basic ethics and privacy concerns come into play. The only solace here is that the patent states that, in certain scenarios, a person would have to opt in to have his/her identity appear in search results. Need to know more? Check out the perspective on thenextweb.com.

Admiring the many faces of Facial Recognition with Deep Learning
Google's second innings in China: Exploring cloud partnerships with Tencent and others
Google's Smart Display: A push towards the new OS, Fuchsia

Git 2.23 released with two new commands ‘git switch’ and ‘git restore’, a new tutorial, and much more!

Amrata Joshi
19 Aug 2019
4 min read
Last week, the team behind Git released Git 2.23, which comes with experimental commands, backward compatibility improvements, and much more. This release received contributions from over 77 contributors, 26 of whom were new.

What's new in Git 2.23?

Experimental commands

This release comes with a new pair of experimental commands, git switch and git restore, which provide a better interface for git checkout. "Two new commands 'git switch' and 'git restore' are introduced to split 'checking out a branch to work on advancing its history' and 'checking out paths out of the index and/or a tree-ish to work on advancing the current history' out of the single 'git checkout' command," the official mail thread reads.

git checkout can be used to change branches with git checkout <branch>. If the user doesn't want to switch branches, git checkout can also be used to change individual files. The new commands aim to separate the responsibilities of git checkout into two narrower categories: operations that change branches and operations that change files. A short sketch of the new commands follows.
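As a hedged sketch (our example, not from the release notes; the branch and file names are hypothetical):

# Switching branches, the job git switch takes over from git checkout:
git switch maint                # switch to an existing branch
git switch -c my-feature        # create and switch, like git checkout -b

# Restoring files, the job git restore takes over:
git restore README.md           # discard working-tree changes, like git checkout -- README.md
git restore --staged README.md  # unstage, like git reset HEAD README.md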
Backward compatibility

The "--base" option of "format-patch" is now compatible with "git patch-id --stable".

git fast-export/import pair

The "git fast-export/import" pair can now handle commits with log messages in an encoding other than UTF-8.

git clone --recurse-submodules

"git clone --recurse-submodules" has learned to set up the submodules to ignore commit object names recorded in the superproject gitlink.

git diff/grep

Patterns used by "git diff" and "git grep" for extracting funcnames and word boundaries from Rust source have been added.

git fetch and git pull

The commands "git fetch" and "git pull" now report when a fetch results in non-fast-forward updates, letting the user notice the unusual situation.

git status

With this release, the extra blank lines in "git status" output have been reduced.

Developer support

This release comes with developer support for emulating unsatisfied prerequisites in tests, ensuring that the remainder of the tests still succeeds when tests with prerequisites are skipped.

A new tutorial for git-core developers

This release comes with a new tutorial targeting aspiring git-core developers. It demonstrates the end-to-end workflow of creating a change to the Git tree, sending it for review, and making changes based on comments.

Bug fixes in Git 2.23

In earlier versions, "git worktree add" used to fail when another worktree connected to the same repository was corrupt; this has been corrected in this release. An issue with file descriptors has been fixed. This release also comes with updated parameter validation, and the code for parsing scaled numbers out of configuration files has been made more robust and easier to follow.

Some users seem happy about the new changes. One user commented on Hacker News, "It's nice to hear that there appears to be progress being made in making git's tooling nicer and more consistent. Git's model itself is pretty simple, but the command line tools for working with it aren't and I feel that this fuels most of the 'Git is hard' complaints." Others are still skeptical about the new commands; another user commented, "On the one hand I'm happy on the new 'switch' and 'restore' commands. On the other hand, I wonder if they truly add any value other than the semantic distinction of functions otherwise present in checkout."

To know more about this news in detail, read the official blog post on GitHub.

GitHub has blocked an Iranian software developer's account
GitHub services experienced a 41-minute disruption yesterday
iPhone can be hacked via a legit-looking malicious lightning USB cable worth $200, DefCon 27 demo shows

Creating an HTML URL from a PowerShell String–#SQLNewBlogger from Blog Posts - SQLServerCentral

Anonymous
30 Dec 2020
2 min read
Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

I wrote about getting a quick archive of SQL Saturday data last week, and while doing that, I had some issues building the HTML needed in PowerShell. I decided to work through this a bit and determine what was wrong. My original code looked like this:

$folder = "E:\Documents\git\SQLSatArchive\SQLSatArchive\SQLSatArchive\ClientApp\public\Assets\PDF"
$code = ""
$list = Get-ChildItem -Path $folder
ForEach ($File in $list) {
    #write-host($File.name)
    $code = $code + "<li><a href=$($File.Name)>$($File.BaseName)</a></li>"
}
write-host($code)

This gave me the code I needed, which I then edited in SSMS to get the proper formatting. However, I knew this needed to work. I had used single quotes and then added in the slashes, but that didn't work. This code:

$folder = "E:\Documents\git\SQLSatArchive\SQLSatArchive\SQLSatArchive\ClientApp\public\Assets\PDF"
$code = ""
$list = Get-ChildItem -Path $folder
ForEach ($File in $list) {
    #write-host($File.name)
    $code = $code + '<li><a href="/Assets/PDF/$($File.Name)" >$($File.BaseName)</a></li>'
}
write-host($code)

produced this type of output:

<li><a href="/Assets/PDF/$($File.Name)" >$($File.BaseName)</a></li>

Not exactly top notch HTML. I decided that I should look around. I found a post on converting some data to HTML, which wasn't what I wanted, but it had a clue in there: the double quotes. I needed to escape quotes here, as I wanted the double quotes around my string. I changed the line building the string to this:

$code = $code + "<li><a href=""/Assets/PDF/$($File.Name)"" >$($File.BaseName)</a></li>"

And I then had what I wanted:

<li><a href="/Assets/PDF/1019.pdf" >1019</a></li>

Strings in PoSh can be funny, so a little attention to escaping things and knowing about variables and double quotes is helpful.
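As a hedged aside (ours, not from the original post), the same line can avoid doubled quotes entirely by using PowerShell's -f format operator with a single-quoted template:

# Alternative sketch: single-quoted format string, so the double quotes need
# no escaping; {0} and {1} are filled in by the -f operator.
$code = $code + ('<li><a href="/Assets/PDF/{0}">{1}</a></li>' -f $File.Name, $File.BaseName)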
SQLNewBlogger

This was about 15 minutes of messing with Google and PoSh to solve, but then only about 10 minutes to write up. A good example that shows some research, initiative, and investigation in addition to solving a problem.

The post Creating an HTML URL from a PowerShell String–#SQLNewBlogger appeared first on SQLServerCentral.

Snapchat source code leaked and posted to GitHub

Richard Gall
09 Aug 2018
2 min read
Source code for what is believed to be a small part of Snapchat's iOS application was posted on GitHub after being leaked back in May. After being notified, Snap Inc., Snapchat's parent company, immediately filed a DMCA request to GitHub to get the code removed. A copy of the request was found by a 'security researcher' tweeting from the handle @x0rz, who shared a link to a copy of the request on GitHub: https://twitter.com/x0rz/status/1026735377955086337

You can read the DMCA request in full here. (Image: part of the Snap Inc. DMCA request to GitHub.)

The initial leak back in May was caused by an update to the Snapchat iOS application. A spokesperson for Snap Inc. explained to CNET: "An iOS update in May exposed a small amount of our source code and we were able to identify the mistake and rectify it immediately... We discovered that some of this code had been posted online and it has been subsequently removed. This did not compromise our application and had no impact on our community."

This code was then published on GitHub by someone using the name Khaled Alshehri, believed to be based in Pakistan. The repository created, called Source-SnapChat, has now been taken down. A number of posts linked to the GitHub account suggest that the leaker had tried to contact Snapchat but had been ignored. "I will post it again until I get a reply," they said. https://twitter.com/i5aaaald/status/1025639490696691712

Leaked Snapchat code is still being traded privately

Although GitHub has taken the repo down, it's not hard to find people claiming they have a copy of the code that they're willing to trade: https://twitter.com/iSn0we/status/1026738393353465858

Now the code is out in the wild, it will take more than a DMCA request to get things under control. Although it would appear the leaked code isn't substantial enough to give much away to potential cybercriminals, it's likely that Snapchat is now working hard to make the changes required to tighten its security.

Read next
Snapchat is losing users, but revenue is up
15 year old uncovers Snapchat's secret visual search function

low.js, a Node.js port for embedded systems

Prasad Ramesh
17 Sep 2018
3 min read
Node.js is a popular backend for web development, widely used despite some of its flaws. For embedded systems, there is now low.js, a Node.js port with far lower system requirements. With low.js you can program JavaScript applications that use the full Node.js API and run them on regular computers as well as on embedded devices based on the $3 ESP32 microcontroller.

The JavaScript V8 engine at the center of Node.js is replaced in low.js with Duktape, an embeddable ECMAScript E5/E5.1 engine with a compact footprint. Some parts of the Node.js system library are rewritten for a more compact footprint and to use more native code. low.js currently uses under 2 MB of disk space, with a minimum requirement of around 1.5 MB of RAM for the ESP32 version.

low.js features

low.js is good for hobbyists and people interested in electronics. It allows using Node.js scripts on smaller devices like routers, which are based on Linux or uClinux, without using many resources. This is great for scripting, especially for devices that communicate over the internet.

The neonious one is a microcontroller board based on low.js for ESP32, which can be programmed in JavaScript ES6 with the Node API. It includes Wifi, Ethernet, additional flash, and an extra I/O controller. The lower system requirements of low.js allow you to run it comfortably on the ESP32-WROVER module. The ESP32-WROVER costs under $3 for large orders and is a very cost-effective solution for IoT devices requiring a microcontroller and Wifi. low.js for ESP32 also adds the benefit of fast software development and maintenance: specialized microcontroller software developers are not needed.

How to install?

The community edition of low.js can be run on POSIX-based systems including Linux, uClinux, and Mac OS X. It is available on GitHub, and currently ./configure is not present, so you might need some programming skills and knowledge to get low.js up and running on your system. The commands are as follows:

git clone https://github.com/neonious/lowjs
cd lowjs
git submodule update --init --recursive
make
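Once built, low.js runs Node-style scripts. As a hedged illustration (our example, not from the low.js docs; it assumes the standard http module is available, per the project's Node API claim):

// Minimal Node-style HTTP server of the kind low.js is meant to run.
const http = require('http');

http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello from low.js!\n');
}).listen(8080); // serve on port 8080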
low.js for ESP32 is the same as the community edition, but adapted for the ESP32 microcontroller. This version is not open source and comes pre-flashed on the neonious one. For more information and documentation, visit the low.js website.

Deno, an attempt to fix Node.js flaws, is rewritten in Rust
Node.js announces security updates for all their active release lines for August 2018
Deploying Node.js apps on Google App Engine is now easy

Introducing krispNet DNN, a deep learning model for real-time noise suppression

Bhagyashree R
15 Nov 2018
3 min read
Last month, 2Hz introduced an app called krisp, which was featured on the Nvidia website. It uses deep learning for noise suppression and is powered by the krispNet Deep Neural Network. krispNet is trained to recognize and reduce background noise from real-time audio and yields clear human speech. 2Hz is a company which builds AI-powered voice processing technologies to improve voice quality in communications.

What are the limitations of current noise suppression methods?

Many edge devices, from phones and laptops to conferencing systems, come with noise suppression technologies. The latest mobile phones come equipped with multiple microphones which help suppress environmental noise when we talk. Generally, the first mic is placed on the front bottom of the phone to directly capture the user's voice. The second mic is placed as far as possible from the first mic. After the surrounding sounds are captured by both mics, the software effectively subtracts them from each other and yields an almost clean voice.

The limitations of the multiple-mic design:
Since the multiple-mic design requires a certain form factor, its application is limited to certain use cases such as phones or headsets with sticky mics.
These designs make the audio path complicated, requiring more hardware and code.
Audio processing can only be done on the edge or device side, so the underlying algorithm cannot be very sophisticated due to the low power and compute budget.

Traditional Digital Signal Processing (DSP) algorithms also work well only in certain use cases. Their main drawback is that they do not scale to the variety and variability of noises that exist in our everyday environment. This is why 2Hz has come up with a deep learning solution that uses a single-microphone design, with all the post-processing handled by software. This allows hardware designs to be simpler and more efficient.

How can deep learning be used in noise suppression?

There are three steps involved in applying deep learning to noise suppression (Source: Nvidia):
Data collection: Build a dataset to train the network by combining distinct noises and clean voices to produce synthetic noisy speech.
Training: Feed the synthetic noisy speech dataset to the DNN on input and the clean speech on the output.
Inference: Produce a mask which will filter out the noise, giving you a clear human voice.
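As a toy illustration of the masking step (ours, not 2Hz's model; the "network" is replaced by a random stand-in), the predicted mask is applied element-wise to the noisy spectrogram:

import numpy as np

# Toy sketch of mask-based suppression: in krispNet the mask comes from the
# DNN; here a random stand-in is used just to show the operation.
noisy_mag = np.abs(np.random.randn(257, 100))  # fake magnitude spectrogram (freq x time)
mask = np.random.rand(257, 100)                # stand-in for DNN output in [0, 1]
clean_estimate = noisy_mag * mask              # element-wise masking per time-frequency bin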
What are the advantages of krispNet DNN?

krispNet is trained with a very large number of distinct background noises and clean human voices. It is able to optimize itself to recognize background noise and separate it from human speech, leaving only the latter. While inferencing, krispNet acts on real-time audio and removes background noise.

krispNet DNN can also perform packet loss concealment for audio, filling in missing voice chunks in voice calls and eliminating "chopping". It can also predict the higher frequencies of a human voice and produce much richer voice audio than the original lower-bitrate audio.

Read more in detail about how deep learning can be used in noise suppression on the Nvidia blog.

Samsung opens its AI based Bixby voice assistant to third-party developers
Voice, natural language, and conversations: Are they the next web UI?
How Deep Neural Networks can improve Speech Recognition and generation

CraftAssist: An open-source framework to enable interactive bots in Minecraft by Facebook researchers

Vincy Davis
19 Jul 2019
5 min read
Two days ago, researchers from Facebook AI Research published a paper titled "CraftAssist: A Framework for Dialogue-enabled Interactive Agents". The authors of this research are Facebook AI research engineers Jonathan Gray and Kavya Srinet, Facebook AI research scientist C. Lawrence Zitnick, and Arthur Szlam, Yacine Jernite, Haonan Yu, Zhuoyuan Chen, Demi Guo, and Siddharth Goyal.

The paper describes the implementation of an assistant bot called CraftAssist, which appears and interacts like another player in the open sandbox game Minecraft. The framework enables players to interact with the bot via in-game chat through various implemented tools and platforms, and to record these interactions. The main aim of the bot is to be a useful and entertaining assistant for the tasks set and evaluated by the human players. (Image source: CraftAssist paper)

To motivate the wider AI research community to use the CraftAssist platform in their own experiments, the Facebook researchers have open-sourced the framework, the baseline assistant, the data, and the models. The released data includes the functions used to build the 2,586 houses in Minecraft, the labeling data for the walls, roofs, etc. of the houses, human rephrasings of fixed commands, and the conversions of natural language commands to bot-interpretable logical forms. The technology that allows the recording of human and bot interaction on a Minecraft server has also been released so that researchers can independently collect data.

Why is the Minecraft protocol used?

Minecraft is a popular multiplayer volumetric pixel (voxel) 3D game based on building and crafting, which allows multiplayer servers and players to collaborate and build, survive, or compete with each other. It operates through a client and server architecture. The CraftAssist bot acts as a client and communicates with the Minecraft server using the Minecraft network protocol.

The Minecraft protocol allows the bot to connect to any Minecraft server without the need for installing server-side mods. This lets the bot easily join a multiplayer server along with human players or other bots. It also lets the bot join an alternative server which implements the server-side component of the Minecraft network protocol. The CraftAssist bot uses a third-party open source Cuberite server, a fast and extensible game server for Minecraft.

Read More: Introducing Minecraft Earth, Minecraft's AR-based game for Android and iOS users

How does CraftAssist function?

The block diagram in the paper demonstrates how the bot handles incoming in-game chat and reaches the desired target. (Image source: CraftAssist paper)

First, the incoming text is transformed into a logical form called the action dictionary. The action dictionary is then translated by a dialogue object which interacts with the memory module of the bot. This produces an action or a chat response to the user. The bot's memory uses a relational database which is structured to recognize the relations between stored items of information. The major advantage of this type of memory is that the semantic parser's output is easy to convert into fully specified tasks. The bot responds to higher-level actions, called Tasks. Tasks are interruptible processes which follow a clear objective of step-by-step actions.
A Task can adjust to long pauses between steps and can also push other Tasks onto a stack, the way functions can call other functions in a standard programming language. Move, Build, and Destroy are a few of the many basic Tasks assigned to the bot. The Dialogue Manager checks for illegal or profane words, then queries the semantic parser. The semantic parser takes the chat as input and produces an action dictionary. The action dictionary indicates that the text is a command given by a human and then specifies the high-level action to be performed by the bot.
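As a hedged illustration of what such an action dictionary might look like (our sketch; the paper's exact schema may differ), a chat command like "build a house at 42 65 198" could parse into a nested dictionary:

# Illustrative only: the key names below are our guesses at the flavor of
# the schema, not the exact format used in the CraftAssist paper.
action_dict = {
    "dialogue_type": "HUMAN_GIVE_COMMAND",
    "action": {
        "action_type": "BUILD",
        "schematic": {"has_name": "house"},
        "location": {"coordinates": (42, 65, 198)},
    },
}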
Once the task is created and pushed onto the Task stack, it is the responsibility of the command task Move to compare the bot's current location to the target location. This makes the bot undertake a sequence of low-level step movements to reach the target.

The core of the bot's understanding of natural language depends on a neural semantic parser called the Text-to-Action-Dictionary (TTAD) model. This model receives the incoming command/chat and classifies it into an action dictionary, which is interpreted by the Dialogue Object.

The CraftAssist framework thus enables bots in Minecraft to interact and play with players by understanding human interactions, using the implemented tools. The researchers hope that since the dataset of CraftAssist is now open-sourced, more developers will be empowered to contribute to this framework by assisting or training the bots, which might lead to the bots learning from human dialogue interactions in the future.

Developers have found the CraftAssist framework interesting. https://twitter.com/zehavoc/status/1151944917859688448 A user on Hacker News comments, "Wow, this is some amazing stuff! Congratulations!"

Check out the paper CraftAssist: A Framework for Dialogue-enabled Interactive Agents for more details.

Epic Games grants Blender $1.2 million in cash to improve the quality of their software development projects
What to expect in Unreal Engine 4.23?
A study confirms that pre-bunk game reduces susceptibility to disinformation and increases resistance to fake news

Bash 5.0 is here with new features and improvements

Natasha Mathur
08 Jan 2019
2 min read
The GNU project made version 5.0 of its popular POSIX shell Bash (Bourne Again Shell) available yesterday. Bash 5.0 introduces new improvements and features such as BASH_ARGV0, EPOCHSECONDS, and EPOCHREALTIME, among others.

Bash was first released in 1989 and was created for the GNU project as a replacement for the Bourne shell. It is capable of performing functions such as interactive command-line editing and job control on architectures that support it, and it is a complete implementation of the IEEE POSIX shell and tools specification.

Key Updates

New features

Bash 5.0 comes with a newly added EPOCHSECONDS variable, which expands to the time in seconds since the Unix epoch. Another newly added variable, EPOCHREALTIME, is similar to EPOCHSECONDS, the only difference being that it is a floating-point value with microsecond granularity. BASH_ARGV0 is also a newly added variable in Bash 5.0 that expands to $0 and sets $0 on assignment. There is a newly defined config-top.h in Bash 5.0, which allows the shell to use a static value for $PATH. Bash 5.0 also has a new shell option that can enable and disable sending history to syslog at runtime.
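A quick sketch of the new variables (our example; the echoed values are illustrative):

# Seconds since the Unix epoch, as an integer and with microseconds:
echo "$EPOCHSECONDS"    # e.g. 1546905600
echo "$EPOCHREALTIME"   # e.g. 1546905600.123456

# BASH_ARGV0 mirrors $0 and renames the shell on assignment:
echo "$BASH_ARGV0"      # prints the same value as $0
BASH_ARGV0="myshell"    # $0 now expands to "myshell"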
Other Changes

The 'globasciiranges' option is now enabled by default in Bash 5.0 and can be set to off by default at configuration time. POSIX mode is now capable of enabling the 'shift_verbose' option. The 'history' builtin in Bash 5.0 can now delete ranges of history entries using '-d start-end'. A change that caused strings containing backslashes to be flagged as glob patterns has been reverted in Bash 5.0.

For complete information on Bash 5.0, check out its official release notes.

GNU ed 1.15 released!
GNU Bison 3.2 got rolled out
GNU Guile 2.9.1 beta released with JIT native code generation to speed up all Guile programs

Big data as a service (BDaaS) solutions: comparing IaaS, PaaS and SaaS

Guest Contributor
28 Aug 2018
8 min read
What is Big Data as a Service (BDaaS)?

Thanks to the increased adoption of cloud infrastructures, processing, storing, and analyzing huge amounts of data has never been easier. The big data revolution may have already happened, but it's Big Data as a Service, or BDaaS, that's making it a reality for many businesses and organizations. Essentially, BDaaS is any service that involves managing or running big data on the cloud.

The advantages of BDaaS

There are many advantages to using a BDaaS solution. It simplifies many of the aspects that make managing a big data infrastructure yourself so difficult. One of the biggest advantages is that it makes managing large quantities of data possible for medium-sized businesses; doing it in-house is not only technically and physically challenging, it can also be expensive. With BDaaS solutions that run in the cloud, companies don't need to stump up cash up front, and operational expenses on hardware can be kept to a minimum. With cloud computing, your infrastructure requirements are fixed at a monthly or annual cost.

However, it's not just about storage and cost. BDaaS solutions sometimes offer in-built solutions for artificial intelligence and analytics, which means you can accomplish some pretty impressive results without having a huge team of data analysts, scientists, and architects around you.

The different models of BDaaS

There are three different BDaaS models. These closely align with the three models of cloud infrastructure: IaaS, PaaS, and SaaS.

Big Data Infrastructure as a Service (IaaS) – Basic data services from a cloud service provider.
Big Data Platform as a Service (PaaS) – Offerings of an all-round Big Data stack like those provided by Amazon S3, EMR, or Redshift. This excludes ETL and BI.
Big Data Software as a Service (SaaS) – A complete Big Data stack within a single tool.

How does the Big Data IaaS model work?

A good example of the IaaS model is Amazon's AWS IaaS architecture, which combines S3 and EC2. Here, S3 acts as a data lake that can store infinite amounts of structured as well as unstructured data, while EC2 acts as a compute layer that can be used to implement a data service of your choice and connects to the S3 data.

For the data layer, you have the option of choosing from among:
Hadoop – The Hadoop ecosystem can be run on an EC2 instance, giving you complete control
NoSQL databases – These include MongoDB or Cassandra
Relational databases – These include PostgreSQL or MySQL

For the compute layer, you can choose from among:
Self-built ETL scripts that run on EC2 instances
Commercial ETL tools that can run on Amazon's infrastructure and use S3
Open source processing tools that run on AWS instances, like Kafka
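As a small, hedged sketch of the S3-as-data-lake idea (ours, not from the original article; the bucket and key names are hypothetical), a node in the compute layer can push raw data into S3 with boto3:

import boto3

# An EC2 compute node writing raw data into the S3 "data lake".
s3 = boto3.client("s3")
s3.upload_file(
    Filename="events.log",            # local file produced by the compute layer
    Bucket="my-data-lake",            # hypothetical bucket acting as the lake
    Key="raw/2018/08/28/events.log",  # date-partitioned key layout
)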
How does the Big Data SaaS model work?

A fully hosted big data stack that includes everything from data storage to data visualization contains the following:

Data layer – Data needs to be pulled into a basic SQL database. An automated data warehouse does this efficiently
Integration layer – Pulls the data from the SQL database into a flexible modeling layer
Processing layer – Prepares the data based on the custom business requirements and logic provided by the user
Analytics and BI layer – Fully featured BI abilities, including visualizations, dashboards, and charts

Azure SQL Data Warehouse and AWS Redshift are popular SaaS options that offer a complete data warehouse solution in the cloud. Their stacks integrate all four layers and are designed to be highly scalable. Google's BigQuery is another contender that's great for generating meaningful insights at a strong price-performance ratio.
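As a rough illustration of the SaaS model, here is a minimal sketch that runs an aggregation against BigQuery through its Node.js client; the project, dataset, and table names are hypothetical placeholders, and credentials are assumed to be set via GOOGLE_APPLICATION_CREDENTIALS. It is a sketch of the usage pattern, not an implementation from the article.

import { BigQuery } from "@google-cloud/bigquery";

// Assumes GOOGLE_APPLICATION_CREDENTIALS is set in the environment.
const bigquery = new BigQuery();

async function main() {
  // Hypothetical table: replace with a real dataset.table in your project.
  const query = `
    SELECT action, COUNT(*) AS total
    FROM \`my_project.analytics.events\`
    GROUP BY action
    ORDER BY total DESC
    LIMIT 10
  `;

  // The warehouse handles storage, processing and scaling behind one API.
  const [rows] = await bigquery.query({ query });
  for (const row of rows) {
    console.log(`${row.action}: ${row.total}`);
  }
}

main().catch(console.error);

Notice that all four layers of the SaaS stack sit behind a single query call; that abstraction is exactly what distinguishes this model from the IaaS and PaaS setups above.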
Choosing the right BDaaS provider

It sounds obvious, but choosing the right BDaaS provider is ultimately all about finding the solution that best suits your needs. There are a number of important factors to consider, such as workload, performance, and budget requirements, each of which will have varying degrees of importance for you. Here are three ways you might approach a BDaaS solution:

Core BDaaS

Core BDaaS uses a minimal platform like Hadoop with YARN and HDFS, along with other services like Hive. This service has gained popularity among companies that use it for irregular workloads or as part of their larger infrastructure. It might not be as performance-intensive as the other two categories. A prime example would be Elastic MapReduce (EMR) provided by AWS. EMR integrates freely with NoSQL stores, S3 storage, DynamoDB and similar services. Given its generic nature, EMR allows a company to combine it with other services, which can result in anything from simple data pipelines to a complete infrastructure.

Performance BDaaS

Performance BDaaS assists businesses that are already employing a cluster-computing framework like Hadoop to further optimize their infrastructure as well as cluster performance. Performance BDaaS is a good fit for companies that are rapidly expanding and do not wish to be burdened by having to build a data architecture and a SaaS layer. The benefit of outsourcing the infrastructure and platform is that companies can focus on processes that add value instead of concentrating on complicated big data infrastructure. For instance, there are many third-party solutions built on top of the Amazon or Azure stack that let you outsource your infrastructure and platform requirements to them.

Feature BDaaS

If your business needs additional features that may not be within the scope of Hadoop, Feature BDaaS may be the way forward. Feature BDaaS focuses on productivity as well as abstraction. It is designed to get users up and running with big data quickly and efficiently. Feature BDaaS combines both PaaS and SaaS layers. This includes web/API interfaces and database adapters that offer a layer of abstraction from the underlying details. Businesses don't have to spend resources and manpower setting up the cloud infrastructure. Instead, they can rely on third-party vendors like Qubole and Altiscale that are designed to get it up and running on AWS, Azure, or the cloud vendor of your choice quickly and efficiently.

Additional tips for choosing a provider

When evaluating a BDaaS provider for your business, cost reduction and scalability are important factors. Here are a few tips that should help you choose the right provider.

Low or zero startup costs – A number of BDaaS providers offer a free trial period, so you can theoretically start seeing results before you commit a dollar.
Scalable – Growth in scale is in the very nature of a big data project. The solution should be easy and affordable to scale, especially in terms of storage and processing resources.
Industry footprint – It is a good idea to choose a BDaaS provider that already has experience in your industry. This is doubly important if you are also using them for consultancy and project planning requirements.
Real-time analysis and feedback – The most successful big data projects today are those that can provide almost immediate analysis and feedback. This helps businesses take remedial action instantly instead of working off historical data.
Managed or self-service – Most BDaaS providers today offer a mix of managed and self-service models based on the company's needs. It is common to find a host of technical staff working in the background to provide the client with services as needed.

Conclusion

The value of big data is not in the data itself, but in the insights that can be drawn after processing it and running it through robust analytics. These insights can help guide and define your decision making for the future. A quick tip with regard to using big data: keep it small at the initial stages. This ensures the data can be checked for accuracy and that the metrics derived from it are right. Once confirmed, you can go ahead with more complex and larger data projects.

Gilad David Maayan is a technology writer who has worked with over 150 technology companies including SAP, Oracle, Zend, CheckPoint and Ixia. Gilad is a 3-time winner of international technical communication awards, including the STC Trans-European Merit Award and the STC Silicon Valley Award of Excellence. Over the past 7 years Gilad has headed Agile SEO, which performs strategic search marketing for leading technology brands. Together with his team, Gilad has done market research, developer relations and content strategy in 39 technology markets, lending him a broad perspective on trends, approaches and ecosystems across the tech industry.

Common big data design patterns
Hortonworks partner with Google Cloud to enhance their Big Data strategy
Top 5 programming languages for crunching Big Data effectively
Getting to know different Big data Characteristics

Nuxt.js 2.0 released with a new scaffolding tool, Webpack 4 upgrade, and more!

Bhagyashree R
24 Sep 2018
3 min read
Last week, the Nuxt.js community announced the release of Nuxt.js 2.0 with major improvements. This release comes with a scaffolding tool, create-nuxt-app, to quickly get you started with Nuxt.js development. To provide faster boot-up and re-compilation, this release is upgraded to Webpack 4 (Legato) and Babel 7.

Nuxt.js is an open source web application framework for creating Vue.js applications. You can choose between universal, statically generated, or single-page applications.

What is new in Nuxt.js 2.0?

Introduced features and upgrades

create-nuxt-app

To get started quickly with Nuxt.js development, you can use the newly introduced create-nuxt-app tool. This tool includes all the Nuxt templates, such as the starter template, express templates, and so on. With create-nuxt-app you can choose an integrated server-side framework and a UI framework, and add the axios module.

Introducing nuxt-start and nuxt-legacy

nuxt-start is introduced to start a Nuxt.js application in production mode. nuxt-legacy is added to support legacy builds of Nuxt.js for Node.js < 8.0.0.

Upgraded to Webpack 4 and Babel 7

To provide faster boot-up time and faster re-compilation, this release uses Webpack 4 (Legato) and Babel 7.

ESM supported everywhere

In this release, ESM is supported everywhere. You can now use export/import syntax in nuxt.config.js, serverMiddleware, and modules (see the configuration sketch at the end of this article).

Replace postcss-next with postcss-preset-env

Due to the deprecation of cssnext, you have to use postcss-preset-env instead of postcss-cssnext.

Use ~assets instead of ~/assets

Due to the css-loader upgrade, use ~assets instead of ~/assets for aliases in the <url> CSS data type, for example, background: url("~assets/banner.svg").

Improvements

The HTML script tag in core/renderer.js is fixed to pass W3C validation.
The background-color property is now replaced with background in loadingIndicator, to allow the use of images and gradients for your background in SPA mode.
Due to server/client artifact isolation, external build.publicPath now needs to upload built content to the .nuxt/dist/client directory instead of .nuxt/dist.
webpackbar and consola provide an improved CLI experience and better CI compatibility.
Template literals in lodash templates are disabled.
Better error handling if the specified plugin isn't found.

Deprecated features

The vendor array isn't supported anymore.
DLL support is removed because it was not stable enough.
AggressiveSplittingPlugin is obsolete; users can use optimization.splitChunks.maxSize instead.
The render.gzip option is deprecated. Users can use render.compressor instead.

To read more about the updates, check out Nuxt's official announcement on Medium and also see the release notes on its GitHub repository.

Next.js 7, a framework for server-rendered React applications, releases with support for React context API and Webassembly
low.js, a Node.js port for embedded systems
Welcome Express Gateway 1.11.0, a microservices API Gateway on Express.js
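As a rough illustration of the new ESM support mentioned above, here is a minimal nuxt.config.js sketch using export syntax; the serverMiddleware path is a hypothetical placeholder, and the exact options shown are illustrative rather than taken from the release notes.

// nuxt.config.js - ESM export/import syntax now works here (Nuxt 2.0)
export default {
  mode: 'universal',            // server-side rendered Vue application
  modules: [
    '@nuxtjs/axios',            // the axios module offered by create-nuxt-app
  ],
  serverMiddleware: [
    '~/api/index.js',           // placeholder middleware file, not real
  ],
}

Previously, configuration files of this kind had to stick to CommonJS module.exports; being able to use the same module syntax across config, middleware, and modules removes one inconsistency from Nuxt projects.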


Tesseract version 4.0 releases with new LSTM based engine, and an updated build system

Natasha Mathur
30 Oct 2018
2 min read
Google released version 4.0 of its OCR engine, Tesseract, yesterday. Tesseract 4.0 comes with a new neural net (LSTM) based OCR engine, an updated build system, other improvements, and bug fixes.

Tesseract is an OCR engine that offers support for Unicode (a specification that supports all character sets) and comes with the ability to recognize more than 100 languages out of the box. It can be trained to recognize other languages and is used for text detection on mobile devices, in videos, and in Gmail image spam detection. Let's have a look at what's new in Tesseract 4.0.

New neural net (LSTM) based OCR engine

The new OCR engine uses a neural network system based on LSTMs, with major accuracy gains. It comes with new training tools for the LSTM OCR engine; you can train a new model from scratch or fine-tune an existing model. Trained data, including LSTM models, has been added for 123 languages. Optional accelerated code paths have been added for the LSTM recognizer. Moreover, a new parameter, lstm_choice_mode, that allows including alternative symbol choices in the hOCR output has been added. (A short sketch of invoking the new engine follows at the end of this piece.)

Updated build system

Tesseract 4.0 uses semantic versioning and requires Leptonica 1.74.0 or higher. If you want to build Tesseract from source code, a compiler with strong C++11 support is necessary. Unit tests have been added to the main repo. Tesseract's source tree has been reorganized in version 4.0. A new option has been added that lets you compile Tesseract without the code of the legacy OCR engine.

Bug fixes

Issues in training data rendering have been fixed.
Damage caused to binary images when processing PDFs has been fixed.
Issues in the OpenCL code have been fixed. OpenCL now works fine for the legacy Tesseract OCR engine, but the performance hasn't improved yet.

Other improvements

Multi-page TIFF handling is improved in Tesseract 4.0.
Improvements have been made to PDF rendering.
Version information and improved help texts have been added to the training tools.
tessedit_pageseg_mode 1 has been removed from the hocr, pdf, and tsv config files. Users now have to explicitly pass --psm 1 if that behavior is desired.

For more information, check out the official release notes.

Tesla v9 to incorporate neural networks for autopilot
Neural Network Intelligence: Microsoft's open source automated machine learning toolkit
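As a rough illustration of using the new engine from code (a sketch, not from the release notes), here is a minimal Node.js example that shells out to the tesseract command-line tool, explicitly selecting the LSTM engine with --oem 1 and the automatic page segmentation mode --psm 1 discussed above; the input image path is a hypothetical placeholder, and tesseract 4.0 is assumed to be installed and on the PATH.

import { execFile } from "child_process";

// "scan.png" is a placeholder input image; "stdout" tells tesseract to
// print the recognized text instead of writing an output file.
execFile(
  "tesseract",
  ["scan.png", "stdout", "--oem", "1", "--psm", "1", "-l", "eng"],
  (err, stdout, stderr) => {
    if (err) {
      console.error("tesseract failed:", stderr);
      return;
    }
    console.log("Recognized text:\n" + stdout);
  }
);

Passing --oem 1 forces the new LSTM recognizer; --oem 0 would select the legacy engine where its code has been compiled in.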