How-To Tutorials - Languages

135 Articles

Why Guido van Rossum quit as the Python chief (BDFL)

Amey Varangaonkar
20 Jul 2018
7 min read

It was the proverbial 'end of an era' for Python, as Guido van Rossum stepped down as the Python chief almost three decades after he created the programming language. It came as a shock to many Python users, and left a few bewildered. Many core developers thought this day might come, but they didn't expect it to come so soon. However, looking at the post that Guido shared with the community, does this decision really come as a surprise? In this article, we dive deep into the possibilities and the circumstances that could have played a major role in van Rossum's resignation.

*Disclaimer: The views presented in this article are based purely on our research. They are not to be considered as inputs directly received from the Python community or Guido van Rossum himself.*

What can we make of Guido's post?

I'm pretty sure you've already read the mailing list post that Guido shared with the community last week. Aptly titled 'Transfer of Power', the mail begins straight away on a negative note:

"Now that PEP 572 is done, I don't ever want to have to fight so hard for a PEP and find that so many people despise my decisions."

Some way to start a mail. The anger, disappointment, and tiredness are quite evident. Guido goes on to state that he will be removing himself from all decision-making processes and will be available only for a while as a core developer and a mentor. From the tone of the mail, the three main reasons for his departure can be figured out quite easily:

- Guido felt there were questions around his decision-making and overall administration capabilities. The backlash over PEP 572 is a testament to this.
- van Rossum is 62 now. Maybe the stress of leading this project for close to 30 years has finally taken a toll on his health, as he wryly talked about the piling medical issues. This is also quite evident from the last sentence of his mail: "I'm tired, and need a very long break."
- Guido thinks this is the right time for the baton to be passed to the other core committers. He leaves everything for the core developers to figure out, from finalizing PEPs (Python Enhancement Proposals) to deciding how new core developers are inducted.

Understanding the backlash behind PEP 572

For a mature language such as Python, you'd think there wouldn't be much left to get excited about. However, a proposal to add a new feature to Python, PEP 572, has caused a furore in the Python community over the last few months.

What PEP 572 is all about

The idea behind PEP 572 is quite simple: to allow assignment to variables within expressions. To make things simpler, consider the following lines of code in Python: a = b is a simple assignment statement, while a == b is a test for equality. PEP 572 brings a brand new operator, :=, which is available in some other programming languages and lets an assignment be used as an expression. So the way you would use this operator would be:

```python
while a := b.read(10):
    print(a)
```

Looks like a simple statement, doesn't it? It keeps reading 10 bytes at a time from b into a and printing them, until read() returns an empty (falsy) value. So what's all the hue and cry about? In principle, the way := is used signifies that the value of an expression is assigned and returned to whatever code is using it, almost as if no assignment ever happened. This can get really tricky when complex expressions are involved. Ideally, an expression assignment is useful when one needs to retain the result of that expression while it is being used for some other purpose. Critics felt that the typical use of := goes against this practice, and this led to many disagreements.

The community response to PEP 572

Many Python users thought PEP 572 was a bad idea due to the reasons mentioned above, and they did not hide their feelings about it either; in fact, some of the comments were quite brutal. Even some of the core developers were unhappy with the proposal, saying it did not fit the fundamental Python best practice of preferring simplicity over complexity. This practice is part of PEP 20, titled 'The Zen of Python'.

As the Python BDFL, van Rossum personally signed off each PEP. This is in stark contrast to how other programming languages such as PHP finalize their proposals, i.e., by voting on them. On the PEP 572 objections, Guido's response befitted that of a BDFL perfectly. Some developers still disagreed with the proposal, believing that it deviated from the standard best practices and rather reflected van Rossum's preferred style of coding, so much so that van Rossum had to ask the committers to give him time to respond to the queries. Eventually PEP 572 was accepted by Guido van Rossum, as he settled the matter with the following note:

"Thank you all. I will accept the PEP as is. I am happy to accept *clarification* updates to the PEP if people care to submit them as PRs to the peps repo, and that could even (to some extent) include summaries of discussion we've had, or outright rejected ideas. But even without any of those I think the PEP is very clear so I will not wait very long (maybe a week)."

Normally, in the case of another language, such an argument could have gone on forever, with both sides reluctant to give in, and the progress of the language would be stuck in limbo as a result of this polarity. With Guido gone now, one cannot help but wonder if this is going to be the case with Python going forward. Would van Rossum have been under less pressure if he had adopted a consensus-based voting system to sign off proposals too? And if that were the case, would the proposal still have gone through against an opposing majority of core developers?

"Tired of the hatred"

It would be wrong to say that the BDFL quit mainly because of how working on PEP 572 left a bitter taste in his mouth. However, it is fair to say that the negativity surrounding PEP 572 must finally have pushed van Rossum over the edge. The fact that he thinks stepping down from his role as Python chief would mean people would no longer 'despise his decisions' must have played a major role in his announcement. Guido's decision to quit was rather an inevitable outcome of a series of bad experiences accrued over the years from backlashes over his decisions on Python's direction.

Leading one of the most successful and long-running open source projects in the world is no joke, and it brings more than its fair share of burden to carry. In many ways, CEOs of big tech companies have it easier. For starters, they have a lot of funding, and they mainly worry about how to make their shareholders happy (make more money). More importantly, they aren't directly exposed to the end users the way open source leaders are, for every decision they make.

What's next for Guido?

Guido van Rossum isn't going away for good. His mail states that he will still be around as a core dev, and as a mentor to other budding developers, for some time. He says he just wants to move away from the leadership role, away from all the responsibilities that once made him the BDFL. His tweet corroborates this:

https://twitter.com/gvanrossum/status/1017546023227424768?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Etweet

Call him a dictator if you will, but his contributions to Python cannot be taken away. From being a beginner's coding language to being used in enterprise applications, Python's rise under van Rossum into one of the most popular and versatile programming languages in the world has been incredible. Perhaps the time was right for the sun to set, and the PEP 572 scenario and the circumstances surrounding it might just have given Guido the platform to ride away into the sunset.

Read more
- Python founder resigns – Guido van Rossum goes 'on a permanent vacation from being BDFL'
- Top 7 Python programming books you need to read
- Python, Tensorflow, Excel and more – Data professionals reveal their top tools


Apollo 11 source code: A small step for a woman, and a huge leap for 'software engineering'

Sugandha Lahoti
19 Jul 2018
5 min read

Yesterday, Reddit saw an explosion of discussion around the original Apollo 11 Guidance Computer (AGC) source code. The code in its entirety was uploaded to GitHub two years ago, thanks to former NASA intern Chris Garry, and judging by the timestamps on the files in the repo, it seems to have undergone significant updates again this week. This is a project that will always hold a special place for software professionals around the world: it is the project that made 'software engineering' a real discipline.

What was the AGC, and why did it matter for Apollo 11?

The AGC was a digital computer produced for the Apollo program, installed on board the Apollo 11 Command Module (CM) and Lunar Module (LM). The AGC code is also referred to as 'COLOSSUS 2A' and was written in AGC assembly language and stored on rope memory. On any given Apollo mission, there were two AGCs, one for the CM and one for the LM. The two AGCs were identical and interchangeable. However, their software differed, as the LM and the CM performed different tasks pertaining to the spacecraft. The CM launched the three astronauts to the moon, and back again. The LM helped in the landing of two of the astronauts on the moon, while the third astronaut remained in the CM, in orbit around the moon.

The woman who coined the term 'software engineering'

The AGC code was brought to life by Margaret Hamilton, director of software engineering for the project. In the male-dominated world of tech and engineering of that time, Margaret was an exception. She led a team credited with developing the software for Apollo and Skylab, keeping her head high even through backlash. "People used to say to me, 'How can you leave your daughter? How can you do this?'" She went on to become the founder and CEO of Hamilton Technologies, Inc. and was awarded the Presidential Medal of Freedom in 2016.

Hamilton is considered one of the pioneers of software engineering, credited with actually coining the term "software engineering". She first started using the term during the early Apollo missions, wanting to give software the same legitimacy as other disciplines. At the time it was not taken seriously, but over the years software engineering has become an IEEE standard.

What can we learn from the AGC code developers?

Understandably, the AGC's specifications and processing power are very modest compared to the technology of today. Some still wittily call it a calculator instead of a computer; others say that the CPU in a microwave oven is probably more powerful than an AGC. In spite of being very basic technology in terms of processing power and speed, the Apollo 11 spacecraft was able to complete the first ever manned mission to the moon and back. This is not just a huge testament to the original programming team's ingenuity and resourcefulness, but also to their grit and meticulousness.

One would think such a bunch produced serious (boring) code with flawless execution. Read between the (code) lines and you see a team that enjoyed every moment of writing the code, with quirky naming conventions and humorous notes inside the comments.

Back to the present

Ever since the code was uploaded to GitHub two years ago, and even now, coders and software programmers all over the world have been dissecting it, particularly interested in the quirky English descriptions in the code's comments. People on Reddit are calling the code files real programming that doesn't rely on APIs to do the heavy lifting.

People are also loving the naming conventions of the source code files and their programs, which were 1960s-inspired light-hearted jokes: for example, BURN_BABY_BURN--MASTER_IGNITION_ROUTINE for the master ignition routine, and PINBALL_GAME_BUTTONS_AND_LIGHTS.agc for the keyboard and display code. Even the programs are quirky: the LUNAR_LANDING_GUIDANCE_EQUATIONS.s file ended up keeping two temporary lines of code as permanent. You can read more such interesting Reddit comments.

However, a point worth noting is that Margaret, and by extension women in tech, are conspicuously missing from this rich discussion. We can start seeing real change only when discussion forums start including the various facets of the tool/tech under discussion. The people behind the tech are an important facet, and more so when they are in the minority.

You can also read The Apollo Guidance Computer: Architecture and Operation for the inside scoop on how the AGC functioned, and on the design decisions and software choices the programmers had to make based on the features and limitations of the AGC, among other insights. The GitHub repo for the original Apollo 11 source code also contains material for further reading.

- Is space the final frontier for AI? NASA to reveal Kepler's latest AI backed discovery
- NASA's Kepler discovers a new exoplanet using Google's Machine Learning
- Meet CIMON, the first AI robot to join the astronauts aboard ISS


Python founder resigns - Guido van Rossum goes ‘on a permanent vacation from being BDFL’

Savia Lobo
13 Jul 2018
5 min read

Python is one of the most popular scripting languages, widely adopted and loved for its simplicity. Since its humble beginnings in the late '80s as an interpreter for a new, simple-to-read scripting language, it has come to dominate much of the tech world. Python has become a vital part of web development stacks alongside languages such as Perl and PHP, has been core to domains like security, and is also used in currently popular technologies such as AI, ML, and DL.

After 28 years of successfully stewarding the Python community since inventing the language back in December 1989, Guido van Rossum has decided to take himself out of the community's decision-making process as its Benevolent Dictator For Life (BDFL). Guido still promises to be a part of the core development group. He also added that he will be available to mentor people, but most of the time the community will have to manage on its own.

Benevolent Dictator For Life (BDFL) is a term that Guido's fellow Python enthusiasts came up with for him, as a joke, when discussing meeting minutes over email regarding leading Python's development and adoption.

Who will look after the Python community now?

Guido van Rossum said, "I am not going to appoint a successor". True to his leadership style, he has thrown his team of core developers into the deep end by asking them to consider what the Python community's new governance model could be. In his memo, he asked, "So what are you all going to do? Create a democracy? Anarchy? A dictatorship? A federation?"

Guido's parting advice to the core dev team

Guido expressed confidence in his team to continue to manage the day-to-day tasks and operations just as they have been doing under his leadership. The two things he wants the core developers and the community to think deeply about are:

- How will PEPs be decided?
- How will new core developers be inducted?

He also emphasized the importance of militantly fostering the right community culture through Python's Community Code of Conduct (CoC). He said, "if you don't like that document your only option might be to leave this group voluntarily. Perhaps there are issues to decide like when should someone be kicked out (this could be banning people from python-dev or python-ideas too since those are also covered by the CoC)."

He assured the team that while he has stepped down as the BDFL and from all decision-making duties, he will continue to be an active member of the community and will now be more available as a mentor to those on the core development team. Guido's decision to quit seems to have stemmed partly from the physical, mental, and emotional toll that the role has taken on him over the years. He concluded his thread on the Transfer of Power by saying, "I'm tired, and need a very long break".

How is the Python community taking this decision?

The development team hopes Guido will make a comeback after his well-deserved break. As BDFL, Guido has provided them with consistency in design and taste. By having Guido as a monitor, the team has had a very consistent view of how the community should behave, and this has been an asset for the whole team. Now they have four options to explore for governing the Python community:

1. Find a new BDFL. This option seems highly unlikely, as Guido's legacy is irreplaceable. Besides, it is practically the least robust option to rely on one person to take all key decisions and to commit their full time to the community. That person also needs to be well respected and accepted as a de facto head.
2. Set up an N-virate leadership team (a group of 3 (triumvirate) or 5 (quintumvirate) experts). With such a model, the responsibilities and load will be equally distributed among the chosen members of the core development team. This appears to be the current favorite on the thread that opened yesterday.
3. Become a democracy. In this model, the community gets to vote on all key decisions. This seems like the short-term fix the team is gravitating towards, at least to decide on the immediate task at hand. But many on the team acknowledge that this is not a permanent answer, as it would pull the language in too many directions and is also time-consuming.
4. Explore the governance models of other open source communities. This option is also being seriously considered in the discussions.

Clearly, the community loves Guido, as is evident from the deluge of well wishes he's receiving from all over the globe. You know you've done your job well when you hear someone say 'You changed my life'. Guido has changed millions of lives for the better.

https://twitter.com/AndrewYNg/status/1017664116482162689
https://twitter.com/anthonypjshaw/status/1017610576640393216
https://twitter.com/generativist/status/1017547690228396032
https://twitter.com/bloodyquantum/status/1017558674024218624

Thank you, Guido, for Python, your heart, and your leadership. We know the community will thrive even in your absence, because you've cultivated an excellent culture and a great set of minds.

- Top 7 Python programming books you need to read
- Python web development: Django vs Flask in 2018
- Python experts talk Python on Twitter: Q&A Recap


Writing test functions in Golang [Tutorial]

Natasha Mathur
10 Jul 2018
9 min read

Go is a modern programming language built for 21st-century application development. Hardware and technology have advanced significantly over the past decade, and most other languages do not take advantage of these technological advancements. Go allows us to build network applications that take advantage of the concurrency and parallelism made available by multicore systems.

Testing is an important part of programming, whether it is in Go or in any other language. Go has a straightforward approach to writing tests, and in this tutorial, we will look at some important tools to help with testing. This tutorial is an excerpt from the book 'Distributed Computing with Go', written by V.N. Nikhil Anurag.

There are certain rules and conventions we need to follow to test our code. They are as follows:

- Source files and associated test files are placed in the same package/folder
- The name of the test file for any given source file is <source-file-name>_test.go
- Test functions need to have the "Test" prefix, and the next character in the function name should be capitalized

In the remainder of this tutorial, we will look at three files and their associated tests:

- variadic.go and variadic_test.go
- addInt.go and addInt_test.go
- nil_test.go (there isn't any source file for these tests)

Along the way, we will introduce any concepts we might use.

variadic.go

In order to understand the first set of tests, we need to understand what a variadic function is and how Go handles it. Let's start with the definition: a variadic function is a function that can accept any number of arguments during a function call. Given that Go is a statically typed language, the only limitation imposed by the type system on a variadic function is that the indefinite number of arguments passed to it should be of the same data type. However, this does not limit us from passing other variable types. The arguments are received by the function as a slice of elements if arguments are passed, and as nil when none are passed. Let's look at the code to get a better idea:

```go
// variadic.go
package main

func simpleVariadicToSlice(numbers ...int) []int {
    return numbers
}

func mixedVariadicToSlice(name string, numbers ...int) (string, []int) {
    return name, numbers
}

// Does not work.
// func badVariadic(name ...string, numbers ...int) {}
```

We use the ... prefix before the data type to define a function as a variadic function. Note that we can have only one variadic parameter per function, and it has to be the last parameter. We can see this error if we uncomment the line for badVariadic and try to test the code.

variadic_test.go

We would like to test the two valid functions, simpleVariadicToSlice and mixedVariadicToSlice, against the rules defined above.
However, for the sake of brevity, we will test these:

- simpleVariadicToSlice: for no arguments, for three arguments, and also to look at how to pass a slice to a variadic function
- mixedVariadicToSlice: to accept a simple argument and a variadic argument

Let's now look at the code to test these two functions:

```go
// variadic_test.go
package main

import "testing"

func TestSimpleVariadicToSlice(t *testing.T) {
    // Test for no arguments
    if val := simpleVariadicToSlice(); val != nil {
        t.Error("value should be nil", nil)
    } else {
        t.Log("simpleVariadicToSlice() -> nil")
    }

    // Test for a random set of values
    vals := simpleVariadicToSlice(1, 2, 3)
    expected := []int{1, 2, 3}
    isErr := false
    for i := 0; i < 3; i++ {
        if vals[i] != expected[i] {
            isErr = true
            break
        }
    }
    if isErr {
        t.Error("value should be []int{1, 2, 3}", vals)
    } else {
        t.Log("simpleVariadicToSlice(1, 2, 3) -> []int{1, 2, 3}")
    }

    // Test for a slice
    vals = simpleVariadicToSlice(expected...)
    isErr = false
    for i := 0; i < 3; i++ {
        if vals[i] != expected[i] {
            isErr = true
            break
        }
    }
    if isErr {
        t.Error("value should be []int{1, 2, 3}", vals)
    } else {
        t.Log("simpleVariadicToSlice([]int{1, 2, 3}...) -> []int{1, 2, 3}")
    }
}

func TestMixedVariadicToSlice(t *testing.T) {
    // Test for a simple argument & no variadic arguments
    name, numbers := mixedVariadicToSlice("Bob")
    if name == "Bob" && numbers == nil {
        t.Log("Received as expected: Bob, <nil slice>")
    } else {
        t.Errorf("Received unexpected values: %s, %v", name, numbers)
    }
}
```

Running tests in variadic_test.go

Let's run these tests and see the output. We'll use the -v flag while running the tests to see the output of each individual test:

```
$ go test -v ./{variadic_test.go,variadic.go}
=== RUN   TestSimpleVariadicToSlice
--- PASS: TestSimpleVariadicToSlice (0.00s)
    variadic_test.go:10: simpleVariadicToSlice() -> nil
    variadic_test.go:26: simpleVariadicToSlice(1, 2, 3) -> []int{1, 2, 3}
    variadic_test.go:41: simpleVariadicToSlice([]int{1, 2, 3}...) -> []int{1, 2, 3}
=== RUN   TestMixedVariadicToSlice
--- PASS: TestMixedVariadicToSlice (0.00s)
    variadic_test.go:49: Received as expected: Bob, <nil slice>
PASS
ok      command-line-arguments  0.001s
```

addInt.go

The tests in variadic_test.go elaborated on the rules for variadic functions. However, you might have noticed that TestSimpleVariadicToSlice ran three tests in its function body, but go test treats it as a single test. Go provides a good way to run multiple tests within a single function, and we shall look at it in addInt_test.go. For this example, we will use a very simple function, as shown in this code:

```go
// addInt.go
package main

func addInt(numbers ...int) int {
    sum := 0
    for _, num := range numbers {
        sum += num
    }
    return sum
}
```

addInt_test.go

You might have also noticed in TestSimpleVariadicToSlice that we duplicated a lot of logic, while the only varying factors were the input and expected values. One style of testing, known as table-driven testing, defines a table of all the data required to run the tests, iterates over the "rows" of the table, and runs tests against them. Let's look at the tests; we will be testing against no arguments and variadic arguments:

```go
// addInt_test.go
package main

import (
    "testing"
)

func TestAddInt(t *testing.T) {
    testCases := []struct {
        Name     string
        Values   []int
        Expected int
    }{
        {"addInt() -> 0", []int{}, 0},
        {"addInt([]int{10, 20, 100}) -> 130", []int{10, 20, 100}, 130},
    }

    for _, tc := range testCases {
        t.Run(tc.Name, func(t *testing.T) {
            sum := addInt(tc.Values...)
            if sum != tc.Expected {
                t.Errorf("%d != %d", sum, tc.Expected)
            } else {
                t.Logf("%d == %d", sum, tc.Expected)
            }
        })
    }
}
```

Running tests in addInt_test.go

Let's now run the tests in this file; we expect each row of the testCases table to be treated as a separate test:

```
$ go test -v ./{addInt.go,addInt_test.go}
=== RUN   TestAddInt
=== RUN   TestAddInt/addInt()_->_0
=== RUN   TestAddInt/addInt([]int{10,_20,_100})_->_130
--- PASS: TestAddInt (0.00s)
    --- PASS: TestAddInt/addInt()_->_0 (0.00s)
        addInt_test.go:23: 0 == 0
    --- PASS: TestAddInt/addInt([]int{10,_20,_100})_->_130 (0.00s)
        addInt_test.go:23: 130 == 130
PASS
ok      command-line-arguments  0.001s
```
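As an aside (this command is an illustration, not from the book): because each table row runs as a named subtest, the -run flag of go test accepts a slash-separated pattern matched first against the test name and then against the subtest name, so a single row can be run on its own with something like:

```
$ go test -v -run 'TestAddInt/130' ./{addInt.go,addInt_test.go}
```

Here, 130 is just a regular expression matched against the sanitized subtest name, so this would select only the addInt([]int{10, 20, 100}) -> 130 row.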
nil_test.go

We can also create tests that are not specific to any particular source file; the only criterion is that the filename needs to have the <text>_test.go form. The tests in nil_test.go illustrate some useful features of the language which the developer might find useful while writing tests. They are as follows:

- httptest.NewServer: Imagine the case where we have to test our code against a server that sends back some data. Starting and coordinating a full-blown server just to access some data is hard; httptest.NewServer solves this issue for us.
- t.Helper: If we use the same logic to pass or fail a lot of testCases, it would make sense to segregate this logic into a separate function. However, this would skew the test run call stack. We can see this by commenting out t.Helper() in the tests and rerunning go test.

We can also format our command-line output to print pretty results. We will show a simple example of adding a tick mark for passed cases and a cross mark for failed cases. In the test, we will run a test server, make GET requests against it, and then test the expected output versus the actual output:

```go
// nil_test.go
package main

import (
    "fmt"
    "io/ioutil"
    "net/http"
    "net/http/httptest"
    "testing"
)

const passMark = "\u2713"
const failMark = "\u2717"

func assertResponseEqual(t *testing.T, expected string, actual string) {
    t.Helper() // comment this line to see tests fail due to 'if expected != actual'
    if expected != actual {
        t.Errorf("%s != %s %s", expected, actual, failMark)
    } else {
        t.Logf("%s == %s %s", expected, actual, passMark)
    }
}

func TestServer(t *testing.T) {
    testServer := httptest.NewServer(
        http.HandlerFunc(
            func(w http.ResponseWriter, r *http.Request) {
                path := r.RequestURI
                if path == "/1" {
                    w.Write([]byte("Got 1."))
                } else {
                    w.Write([]byte("Got None."))
                }
            }))
    defer testServer.Close()

    for _, testCase := range []struct {
        Name     string
        Path     string
        Expected string
    }{
        {"Request correct URL", "/1", "Got 1."},
        {"Request incorrect URL", "/12345", "Got None."},
    } {
        t.Run(testCase.Name, func(t *testing.T) {
            res, err := http.Get(testServer.URL + testCase.Path)
            if err != nil {
                t.Fatal(err)
            }
            actual, err := ioutil.ReadAll(res.Body)
            res.Body.Close()
            if err != nil {
                t.Fatal(err)
            }
            assertResponseEqual(t, testCase.Expected, fmt.Sprintf("%s", actual))
        })
    }

    t.Run("Fail for no reason", func(t *testing.T) {
        assertResponseEqual(t, "+", "-")
    })
}
```

Running tests in nil_test.go

We run three tests, where two test cases will pass and one will fail. This way we can see the tick mark and cross mark in action:

```
$ go test -v ./nil_test.go
=== RUN   TestServer
=== RUN   TestServer/Request_correct_URL
=== RUN   TestServer/Request_incorrect_URL
=== RUN   TestServer/Fail_for_no_reason
--- FAIL: TestServer (0.00s)
    --- PASS: TestServer/Request_correct_URL (0.00s)
        nil_test.go:55: Got 1. == Got 1. ✓
    --- PASS: TestServer/Request_incorrect_URL (0.00s)
        nil_test.go:55: Got None. == Got None. ✓
    --- FAIL: TestServer/Fail_for_no_reason (0.00s)
        nil_test.go:59: + != - ✗
FAIL
exit status 1
FAIL    command-line-arguments  0.003s
```

We looked at how to write test functions in Go, and learned a few interesting concepts for dealing with variadic functions and other useful test features. If you found this post useful, do check out the book 'Distributed Computing with Go' to learn more about testing, Goroutines, RESTful web services, and other concepts in Go.

- Why is Go the go-to language for cloud-native development? – An interview with Mina Andrawos
- Systems programming with Go in UNIX and Linux
- How to build a basic server-side chatbot using Go


Understanding Go Internals: defer, panic() and recover() functions [Tutorial]

Packt Editorial Staff
09 Jul 2018
8 min read

The Go programming language, often referred to as Golang, is making strides with masterclass developments and architecture by some of the greatest programming minds. Go's features are extremely handy, and you can use them all the time. However, there is nothing more rewarding than being able to see and understand what is going on in the background and how Go operates behind the scenes. In this article we will learn to use the defer keyword and the panic() and recover() functions in Go.

This article is extracted from the First Edition of Mastering Go, written by Mihalis Tsoukalos. The concepts discussed in this article (and more) have been updated or improved in the third edition of Mastering Go.

The defer keyword

The defer keyword postpones the execution of a function until the surrounding function returns. It is widely used in file input and output operations, because it saves you from having to remember when to close an opened file: the defer keyword allows you to put the function call that closes an opened file right next to the function call that opened it. You will also see defer in action in the section that talks about the panic() and recover() built-in Go functions.
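As a minimal sketch of that file-handling idiom (this example is not from the book, and the config.txt file name is hypothetical), notice how the Close() call sits right beside the Open() call, yet only runs when the function returns:

```go
package main

import (
    "fmt"
    "os"
)

func readConfig() error {
    // The file name here is hypothetical; any readable file will do.
    f, err := os.Open("config.txt")
    if err != nil {
        return err
    }
    // Deferring Close() right after a successful Open() guarantees
    // that the file is closed when readConfig() returns, on every path.
    defer f.Close()

    // ... work with f here ...
    fmt.Println("file opened successfully")
    return nil
}

func main() {
    if err := readConfig(); err != nil {
        fmt.Println("error:", err)
    }
}
```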
It is very important to remember that deferred functions are executed in Last In First Out (LIFO) order after the return of the surrounding function. Put simply, this means that if you defer function f1() first, function f2() second, and function f3() third in the same surrounding function, then when the surrounding function is about to return, function f3() will be executed first, function f2() will be executed second, and function f1() will be the last one to get executed.

As this definition of defer is a little unclear, I think that you will understand the use of defer a little better by looking at the Go code and the output of the defer.go program, which will be presented in three parts. The first part of the program follows:

```go
package main

import (
    "fmt"
)

func d1() {
    for i := 3; i > 0; i-- {
        defer fmt.Print(i, " ")
    }
}
```

Apart from the import block, the preceding Go code implements a function named d1() with a for loop and a defer statement that will be executed three times. The second part of defer.go contains the following Go code:

```go
func d2() {
    for i := 3; i > 0; i-- {
        defer func() {
            fmt.Print(i, " ")
        }()
    }
    fmt.Println()
}
```

In this part of the code, you can see the implementation of another function, named d2(). The d2() function also contains a for loop and a defer statement that will also be executed three times. However, this time the defer keyword is applied to an anonymous function instead of a single fmt.Print() statement. Additionally, the anonymous function takes no parameters. The last part of the Go code follows:

```go
func d3() {
    for i := 3; i > 0; i-- {
        defer func(n int) {
            fmt.Print(n, " ")
        }(i)
    }
}

func main() {
    d1()
    d2()
    fmt.Println()
    d3()
    fmt.Println()
}
```

Apart from the main() function that calls the d1(), d2(), and d3() functions, you can also see the implementation of the d3() function, which has a for loop that uses the defer keyword on an anonymous function. However, this time the anonymous function requires one integer parameter named n. The Go code tells us that the n parameter takes its value from the i variable used in the for loop. Executing defer.go will create the following output:

```
$ go run defer.go
1 2 3
0 0 0
1 2 3
```

You will most likely find the generated output complicated and challenging to understand. This underscores the fact that the operation and the results of the use of defer can be tricky if your code is not clear and unambiguous. Let's examine the results in order to get a better idea of how tricky defer can be if you do not pay close attention to your code.

We will start with the first line of the output (1 2 3), which is generated by the d1() function. The values of i in d1() are 3, 2, and 1, in that order. The function that is deferred in d1() is the fmt.Print() statement. As a result, when the d1() function is about to return, you get the three values of the i variable of the for loop in reverse order, because deferred functions are executed in LIFO order.

Now, let us inspect the second line of the output, which is produced by the d2() function. It is really strange that we got three zeros instead of 1 2 3 in the output. The reason for this, however, is relatively simple: after the for loop has ended, the value of i is 0, because it is that value of i that made the for loop terminate. The tricky part here is that the deferred anonymous function is evaluated after the for loop ends, because it takes no parameters. This means that it is evaluated three times for an i value of 0, hence the generated output. This kind of confusing code is what might lead to the creation of nasty bugs in your projects, so try to avoid it!

Lastly, we will talk about the third line of the output, which is generated by the d3() function. Due to the parameter of the anonymous function, each time the anonymous function is deferred, it gets and uses the current value of i. As a result, each execution of the anonymous function has a different value to process, hence the generated output.

After this, it should be clear that the best approach to the use of defer is the third one, exhibited in the d3() function, because you intentionally pass the desired variable to the anonymous function in an easy-to-understand way.

Panic and Recover

This technique involves the use of the panic() and recover() functions, and it will be presented in panicRecover.go, which you will review in three parts. Strictly speaking, panic() is a built-in Go function that terminates the current flow of a Go program and starts panicking! On the other hand, the recover() function, which is also a built-in Go function, allows you to regain control of a goroutine that just panicked using panic(). The first part of the program follows:

```go
package main

import (
    "fmt"
)

func a() {
    fmt.Println("Inside a()")
    defer func() {
        if c := recover(); c != nil {
            fmt.Println("Recover inside a()!")
        }
    }()
    fmt.Println("About to call b()")
    b()
    fmt.Println("b() exited!")
    fmt.Println("Exiting a()")
}
```

Apart from the import block, this part includes the implementation of the a() function. The most important part of the a() function is the defer block of code, which implements an anonymous function that will be called when there is a call to panic(). The second code segment of panicRecover.go follows next:

```go
func b() {
    fmt.Println("Inside b()")
    panic("Panic in b()!")
    fmt.Println("Exiting b()")
}
```

The last part of the program, which illustrates the panic() and recover() functions, is as follows:

```go
func main() {
    a()
    fmt.Println("main() ended!")
}
```

Executing panicRecover.go will create the following output:

```
$ go run panicRecover.go
Inside a()
About to call b()
Inside b()
Recover inside a()!
main() ended!
```

What just happened here is really impressive! However, as you can see from the output, the a() function did not end normally, because its last two statements did not get executed:

```go
fmt.Println("b() exited!")
fmt.Println("Exiting a()")
```

Nevertheless, the good thing is that panicRecover.go ended according to our will without panicking, because the anonymous function used in defer took control of the situation! Also note that function b() knows nothing about function a(). However, function a() contains Go code that handles the panic condition of function b()!

Using the panic function on its own

You can also use the panic() function on its own, without any attempt to recover, and this subsection will show you its results using the Go code of justPanic.go, which will be presented in two parts. The first part of justPanic.go follows next:

```go
package main

import (
    "fmt"
    "os"
)
```

As you can see, the use of panic() does not require any extra Go packages. The second part of justPanic.go is shown in the following Go code:

```go
func main() {
    if len(os.Args) == 1 {
        panic("Not enough arguments!")
    }
    fmt.Println("Thanks for the argument(s)!")
}
```

If your Go program does not have at least one command-line argument, it will call the panic() function. The panic() function takes one parameter, which is the error message that you want to print on the screen. Executing justPanic.go on a macOS High Sierra machine will create the following output:

```
$ go run justPanic.go
panic: Not enough arguments!

goroutine 1 [running]:
main.main()
        /Users/mtsouk/ch2/code/justPanic.go:10 +0x9e
exit status 2
```

Thus, using the panic() function on its own will terminate the Go program without giving you the opportunity to recover! Therefore, use of the panic() and recover() pair is much more practical and professional than just using panic() alone.

To summarize, we covered some interesting Go topics: the defer keyword and the panic() and recover() functions. To explore other major features and packages in Go, get our latest edition in Go programming, Mastering Go, written by Mihalis Tsoukalos.

- Implementing memory management with Golang's garbage collector
- Why is Go the go-to language for cloud native development? – An interview with Mina Andrawos
- How to build a basic server side chatbot using Go
- How Concurrency and Parallelism works in Golang [Tutorial]


How Concurrency and Parallelism works in Golang [Tutorial]

Natasha Mathur
06 Jul 2018
11 min read

Computers and software programs are useful because they do a lot of laborious work very fast and can also do multiple things at once. We want our programs to be able to do multiple things simultaneously, and the success of a programming language can depend on how easy it is to write and understand multitasking programs. Concurrency and parallelism are two terms that you are bound to come across often when looking into multitasking, and they are often used interchangeably. However, they mean two distinctly different things. In this article, we will look at how concurrency and parallelism work in Go, using simple examples for better understanding. Let's get started!

This article is an excerpt from the book 'Distributed Computing with Go', written by V.N. Nikhil Anurag.

The standard definitions given on the Go blog are as follows:

- Concurrency: Concurrency is about dealing with lots of things at once. This means that we manage to get multiple things done in a given period of time; however, we will only be doing a single thing at a time. This tends to happen in programs where one task is waiting and the program decides to run another task in the idle time. In the book's accompanying diagram, this is denoted by running the yellow task in the idle periods of the blue task.
- Parallelism: Parallelism is about doing lots of things at once. This means that even if we have two tasks, they are continuously working without any breaks in between. In the diagram, this is shown by the green task running independently, uninfluenced by the red task in any manner.

It is important to understand the difference between these two terms. Let's look at a few concrete examples to further elaborate on the difference between the two.

Concurrency

Let's look at the concept of concurrency using a simple example of a few daily routine tasks and the way we can perform them. Imagine you start your day and need to get six things done:

- Make a hotel reservation
- Book flight tickets
- Order a dress
- Pay credit card bills
- Write an email
- Listen to an audiobook

The order in which they are completed doesn't matter, and for some of the tasks, such as writing an email or listening to an audiobook, you need not complete them in a single sitting. Here is one possible way to complete the tasks:

1. Order a dress.
2. Write one-third of the email.
3. Make the hotel reservation.
4. Listen to 10 minutes of the audiobook.
5. Pay the credit card bills.
6. Write another one-third of the email.
7. Book the flight tickets.
8. Listen to another 20 minutes of the audiobook.
9. Complete writing the email.
10. Continue listening to the audiobook until you fall asleep.

In programming terms, we have executed the above tasks concurrently. We had a complete day and we chose particular tasks from our list of tasks and started to work on them. For certain tasks, we even decided to break them up into pieces and work on the pieces between other tasks.

We will eventually write a program which does all of the preceding steps concurrently, but let's take it one step at a time. Let's start by building a program that executes the tasks sequentially, and then modify it progressively until it is purely concurrent code and uses goroutines. The progression of the program will be in three steps:

1. Serial task execution.
2. Serial task execution with goroutines.
3. Concurrent task execution.

Code overview

The code will consist of a set of functions that print out their assigned tasks as completed. In the cases of writing an email or listening to an audiobook, we further divide the tasks into more functions. This can be seen as follows:

- writeMail, continueWritingMail1, continueWritingMail2
- listenToAudioBook, continueListeningToAudioBook

Serial task execution

Let's first implement a program that will execute all the tasks in a linear manner. Based on the code overview we discussed previously, the following code should be straightforward:

```go
package main

import (
    "fmt"
)

// Simple individual tasks
func makeHotelReservation() {
    fmt.Println("Done making hotel reservation.")
}

func bookFlightTickets() {
    fmt.Println("Done booking flight tickets.")
}

func orderADress() {
    fmt.Println("Done ordering a dress.")
}

func payCreditCardBills() {
    fmt.Println("Done paying Credit Card bills.")
}

// Tasks that will be executed in parts

// Writing Mail
func writeAMail() {
    fmt.Println("Wrote 1/3rd of the mail.")
    continueWritingMail1()
}

func continueWritingMail1() {
    fmt.Println("Wrote 2/3rds of the mail.")
    continueWritingMail2()
}

func continueWritingMail2() {
    fmt.Println("Done writing the mail.")
}

// Listening to Audio Book
func listenToAudioBook() {
    fmt.Println("Listened to 10 minutes of audio book.")
    continueListeningToAudioBook()
}

func continueListeningToAudioBook() {
    fmt.Println("Done listening to audio book.")
}

// All the tasks we want to complete in the day.
// Note that we do not include the sub tasks here.
var listOfTasks = []func(){
    makeHotelReservation, bookFlightTickets, orderADress,
    payCreditCardBills, writeAMail, listenToAudioBook,
}

func main() {
    for _, task := range listOfTasks {
        task()
    }
}
```

We take each of the main tasks and execute them in simple sequential order. Executing the preceding code should produce unsurprising output, as shown here:

```
Done making hotel reservation.
Done booking flight tickets.
Done ordering a dress.
Done paying Credit Card bills.
Wrote 1/3rd of the mail.
Wrote 2/3rds of the mail.
Done writing the mail.
Listened to 10 minutes of audio book.
Done listening to audio book.
```

Serial task execution with goroutines

We took a list of tasks and wrote a program to execute them in a linear and sequential manner. However, we want to execute the tasks concurrently! Let's start by first introducing goroutines for the split tasks and see how it goes. We will only show the code snippet where the code actually changed here:

```go
/********************************************************************
 We start by making Writing Mail & Listening Audio Book concurrent.
*********************************************************************/

// Tasks that will be executed in parts

// Writing Mail
func writeAMail() {
    fmt.Println("Wrote 1/3rd of the mail.")
    go continueWritingMail1() // Notice the addition of the 'go' keyword.
}

func continueWritingMail1() {
    fmt.Println("Wrote 2/3rds of the mail.")
    go continueWritingMail2() // Notice the addition of the 'go' keyword.
}

func continueWritingMail2() {
    fmt.Println("Done writing the mail.")
}

// Listening to Audio Book
func listenToAudioBook() {
    fmt.Println("Listened to 10 minutes of audio book.")
    go continueListeningToAudioBook() // Notice the addition of the 'go' keyword.
}

func continueListeningToAudioBook() {
    fmt.Println("Done listening to audio book.")
}
```

The following is a possible output:

```
Done making hotel reservation.
Done booking flight tickets.
Done ordering a dress.
Done paying Credit Card bills.
Wrote 1/3rd of the mail.
Listened to 10 minutes of audio book.
```

Whoops! That's not what we were expecting. The output from the continueWritingMail1, continueWritingMail2, and continueListeningToAudioBook functions is missing; the reason being that we are using goroutines. Since goroutines are not waited upon, the code in the main function continues executing, and once the control flow reaches the end of the main function, the program ends. What we would really like to do is to wait in the main function until all the goroutines have finished executing. There are two ways we can do this: using channels or using WaitGroup. We'll use WaitGroup now (a channel-based sketch follows right after this example). In order to use WaitGroup, we have to keep the following in mind:

- Use WaitGroup.Add(int) to keep count of how many goroutines we will be running as part of our logic.
- Use WaitGroup.Done() to signal that a goroutine is done with its task.
- Use WaitGroup.Wait() to wait until all goroutines are done.
- Pass the WaitGroup instance to the goroutines so they can call the Done() method.

Based on these points, we should be able to modify the source code to use WaitGroup. The following is the updated code:

```go
package main

import (
    "fmt"
    "sync"
)

// Simple individual tasks
func makeHotelReservation(wg *sync.WaitGroup) {
    fmt.Println("Done making hotel reservation.")
    wg.Done()
}

func bookFlightTickets(wg *sync.WaitGroup) {
    fmt.Println("Done booking flight tickets.")
    wg.Done()
}

func orderADress(wg *sync.WaitGroup) {
    fmt.Println("Done ordering a dress.")
    wg.Done()
}

func payCreditCardBills(wg *sync.WaitGroup) {
    fmt.Println("Done paying Credit Card bills.")
    wg.Done()
}

// Tasks that will be executed in parts

// Writing Mail
func writeAMail(wg *sync.WaitGroup) {
    fmt.Println("Wrote 1/3rd of the mail.")
    go continueWritingMail1(wg)
}

func continueWritingMail1(wg *sync.WaitGroup) {
    fmt.Println("Wrote 2/3rds of the mail.")
    go continueWritingMail2(wg)
}

func continueWritingMail2(wg *sync.WaitGroup) {
    fmt.Println("Done writing the mail.")
    wg.Done()
}

// Listening to Audio Book
func listenToAudioBook(wg *sync.WaitGroup) {
    fmt.Println("Listened to 10 minutes of audio book.")
    go continueListeningToAudioBook(wg)
}

func continueListeningToAudioBook(wg *sync.WaitGroup) {
    fmt.Println("Done listening to audio book.")
    wg.Done()
}

// All the tasks we want to complete in the day.
// Note that we do not include the sub tasks here.
var listOfTasks = []func(*sync.WaitGroup){
    makeHotelReservation, bookFlightTickets, orderADress,
    payCreditCardBills, writeAMail, listenToAudioBook,
}

func main() {
    var waitGroup sync.WaitGroup
    // Set number of effective goroutines we want to wait upon
    waitGroup.Add(len(listOfTasks))

    for _, task := range listOfTasks {
        // Pass reference to WaitGroup instance
        // Each of the tasks should call on WaitGroup.Done()
        task(&waitGroup)
    }

    // Wait until all goroutines have completed execution.
    waitGroup.Wait()
}
```

Here is one possible output order; notice how continueWritingMail1 and continueWritingMail2 were executed at the end, after listenToAudioBook and continueListeningToAudioBook:

```
Done making hotel reservation.
Done booking flight tickets.
Done ordering a dress.
Done paying Credit Card bills.
Wrote 1/3rd of the mail.
Listened to 10 minutes of audio book.
Done listening to audio book.
Wrote 2/3rds of the mail.
Done writing the mail.
```
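As promised, here is a minimal sketch of the channel-based alternative (this sketch is not from the book, and the task list is trimmed to three items for brevity): each goroutine sends a message on a channel when it finishes, and main blocks until it has received one message per task.

```go
package main

import "fmt"

// doTask reports completion by sending the task name on the done channel.
func doTask(name string, done chan<- string) {
    done <- "Done: " + name
}

func main() {
    tasks := []string{"hotel reservation", "flight tickets", "dress order"}
    done := make(chan string)

    for _, name := range tasks {
        go doTask(name, done)
    }

    // Receive exactly one message per goroutine; main blocks here
    // until every task has signalled completion.
    for range tasks {
        fmt.Println(<-done)
    }
}
```

Either approach unblocks main only after every task reports completion; we will stick with the WaitGroup version for the rest of this tutorial.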
Concurrent task execution

In the final output of the previous part, we can see that all the tasks in listOfTasks are being executed in serial order, and the last step for maximum concurrency would be to let the order be determined by the Go runtime instead of the order in listOfTasks. This might sound like a laborious task, but in reality it is quite simple to achieve. All we need to do is add the go keyword in front of task(&waitGroup):

```go
func main() {
    var waitGroup sync.WaitGroup
    // Set number of effective goroutines we want to wait upon
    waitGroup.Add(len(listOfTasks))

    for _, task := range listOfTasks {
        // Pass reference to WaitGroup instance
        // Each of the tasks should call on WaitGroup.Done()
        go task(&waitGroup) // Achieving maximum concurrency
    }

    // Wait until all goroutines have completed execution.
    waitGroup.Wait()
}
```

Following is a possible output:

```
Listened to 10 minutes of audio book.
Done listening to audio book.
Done booking flight tickets.
Done ordering a dress.
Done paying Credit Card bills.
Wrote 1/3rd of the mail.
Wrote 2/3rds of the mail.
Done writing the mail.
Done making hotel reservation.
```

If we look at this possible output, the tasks were executed in the following order:

1. Listen to the audiobook.
2. Book flight tickets.
3. Order a dress.
4. Pay credit card bills.
5. Write an email.
6. Make hotel reservations.

Now that we have a good idea of what concurrency is and how to write concurrent code using goroutines and WaitGroup, let's dive into parallelism.

Parallelism

Imagine that you have to write a few emails. They are going to be long and laborious, and the best way to keep yourself entertained is to listen to music while writing them, that is, listening to music "in parallel" to writing the emails. If we wanted to write a program that simulates this scenario, the following is one possible implementation:

```go
package main

import (
    "fmt"
    "sync"
    "time"
)

func printTime(msg string) {
    fmt.Println(msg, time.Now().Format("15:04:05"))
}

// Tasks that will be done over time
func writeMail1(wg *sync.WaitGroup) {
    printTime("Done writing mail #1.")
    wg.Done()
}

func writeMail2(wg *sync.WaitGroup) {
    printTime("Done writing mail #2.")
    wg.Done()
}

func writeMail3(wg *sync.WaitGroup) {
    printTime("Done writing mail #3.")
    wg.Done()
}

// Task done in parallel
func listenForever() {
    for {
        printTime("Listening...")
    }
}

func main() {
    var waitGroup sync.WaitGroup
    waitGroup.Add(3)

    go listenForever()

    // Give some time for listenForever to start
    time.Sleep(time.Nanosecond * 10)

    // Let's start writing the mails
    go writeMail1(&waitGroup)
    go writeMail2(&waitGroup)
    go writeMail3(&waitGroup)

    waitGroup.Wait()
}
```

The output of the program might be as follows:

```
Done writing mail #3. 19:32:57
Listening... 19:32:57
Listening... 19:32:57
Done writing mail #1. 19:32:57
Listening... 19:32:57
Listening... 19:32:57
Done writing mail #2. 19:32:57
```

The numbers represent the time in Hours:Minutes:Seconds and, as can be seen, the tasks are being executed in parallel. You might have noticed that the code for parallelism looks almost identical to the code for the final concurrency example. However, in the function listenForever, we are printing Listening... in an infinite loop. If the preceding example were written without goroutines, the output would keep printing Listening... and never reach the writeMail function calls.

Goroutines are concurrent and, to an extent, parallel; however, we should think of them as being concurrent. The order of execution of goroutines is not predictable, and we should not rely on them being executed in any particular order. We should also take care to handle errors and panics in our goroutines, because even though they are being executed in parallel, a panic in one goroutine will crash the complete program. Finally, goroutines can block on system calls; however, this will not block the execution of the program nor slow down the performance of the overall program.

We looked at how goroutines can be used to run concurrent programs, and also learned how parallelism works in Go. If you found this post useful, do check out the book 'Distributed Computing with Go' to learn more about Goroutines, channels and messages, and other concepts in Go.

- Golang Decorators: Logging & Time Profiling
- Essential Tools for Go Programming
- Why is Go the go-to language for cloud-native development? – An interview with Mina Andrawos

How to implement immutability functions in Kotlin [Tutorial]

Aaron Lazar
27 Jun 2018
8 min read

Unlike Clojure, Haskell, F#, and the like, Kotlin is not a pure functional programming language where immutability is forced; rather, we may refer to Kotlin as a perfect blend of functional programming and OOP languages. It contains the major benefits of both worlds. So, instead of forcing immutability like pure functional programming languages do, Kotlin encourages immutability, giving it automatic preference wherever possible. In this article, we'll understand the various methods of implementing immutability in Kotlin. This article has been taken from the book Functional Kotlin, by Mario Arias and Rivu Chakraborty.

In other words, Kotlin has immutable variables (val), but no language mechanisms that would guarantee true deep immutability of the state. If a val variable references a mutable object, its contents can still be modified. We will have a more elaborate discussion and a deeper dive on this topic, but first let us have a look at how we can get referential immutability in Kotlin and at the differences between var, val, and const val.

By true deep immutability of the state, we mean that a property will always return the same value whenever it is called and that the property never changes its value; we can easily break this guarantee if we have a val property with a custom getter. You can find more details at the following link: https://artemzin.com/blog/kotlin-val-does-not-mean-immutable-it-just-means-readonly-yeah/

The difference between var and val

In order to encourage immutability, but still let developers have the choice, Kotlin introduced two types of variables. The first one is var, which is just a simple variable, just like in any imperative language. On the other hand, val brings us a bit closer to immutability; again, it doesn't guarantee immutability. So, what exactly does a val variable provide us? It enforces read-only access: you cannot write to a val variable after initialization. So, if you use a val variable without a custom getter, you can achieve referential immutability. Let's have a look; the following program will not compile:

```kotlin
fun main(args: Array<String>) {
    val x: String = "Kotlin"
    x += "Immutable" // (1)
}
```

As I mentioned earlier, the preceding program will not compile; it will give an error on comment (1). As we've declared variable x as val, x will be read-only, and once we initialize x, we cannot modify it afterward. So, now you're probably asking why we cannot guarantee immutability with val? Let's inspect this with the following example:

```kotlin
object MutableVal {
    var count = 0
    val myString: String = "Mutable"
        get() { // (1)
            return "$field ${++count}" // (2)
        }
}

fun main(args: Array<String>) {
    println("Calling 1st time ${MutableVal.myString}")
    println("Calling 2nd time ${MutableVal.myString}")
    println("Calling 3rd time ${MutableVal.myString}") // (3)
}
```

In this program, we declared myString as a val property, but implemented a custom getter, where we tweaked the value of myString before returning it. Have a look at the output first; then we will look further into the program.
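Since the getter pre-increments count on every access, running this program would print:

```
Calling 1st time Mutable 1
Calling 2nd time Mutable 2
Calling 3rd time Mutable 3
```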
Compile time constants

So, how can we overcome this? How can we enforce immutability? The const val properties are here to help us. Just replace val myString with const val myString, and you will find that the custom getter no longer compiles. While val properties are read-only variables, const val properties are compile time constants. You cannot assign the outcome (result) of a function to a const val. Let's discuss some of the differences between val and const val:

  - The val properties are read-only variables, while const val properties are compile time constants
  - The val properties can have custom getters, but const val properties cannot
  - We can have val properties anywhere in our Kotlin code, inside functions, as a class member, anywhere, but a const val has to be declared at the top level of a file or inside an object declaration (including companion objects)
  - You cannot write delegates for const val properties
  - We can have a val property of any type, be it our custom class or any primitive data type, but only primitive data types and String are allowed with a const val property
  - We cannot have nullable data types with const val properties; as a result, we cannot have null values for const val properties either

As a result, const val properties guarantee immutability of value but have less flexibility, and you are bound to use only primitive data types with const val, which cannot always serve our purposes.
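To see these rules in action, here is a minimal sketch of my own (the AppConfig object and its properties are illustrative, not part of the book's demos):

object AppConfig {
    const val GREETING: String = "Hello"  // OK: String compile-time constant
    const val MAX_RETRIES: Int = 3        // OK: primitive compile-time constant

    // const val startedAt = System.currentTimeMillis()  // won't compile: not a constant expression
    // A const val also cannot declare a custom getter, so the myString trick above is impossible here.
}

fun main() {
    // const val usages are inlined by the compiler, so the value can never change at runtime.
    println("${AppConfig.GREETING}, retries = ${AppConfig.MAX_RETRIES}")
}

Because the compiler inlines every usage of a const val, the value is fixed once and for all at compile time, which is exactly the guarantee that a custom getter on a plain val can subvert.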
Now that I've used the term referential immutability quite a few times, let us inspect what it means and how many types of immutability there are.

Types of immutability

There are basically the following two types of immutability:

  - Referential immutability
  - Immutable values

Immutable reference (referential immutability)

Referential immutability enforces that, once a reference is assigned, it can't be assigned to something else. Think of having it as a val property of a custom class, or even a MutableList or MutableMap; after you initialize the property, you cannot point it at another object, though the underlying object itself may change. For example, take the following program:

class MutableObj {
    var value = ""

    override fun toString(): String {
        return "MutableObj(value='$value')"
    }
}

fun main(args: Array<String>) {
    val mutableObj: MutableObj = MutableObj() // (1)
    println("MutableObj $mutableObj")
    mutableObj.value = "Changed" // (2)
    println("MutableObj $mutableObj")

    val list = mutableListOf("a", "b", "c", "d", "e") // (3)
    println(list)
    list.add("f") // (4)
    println(list)
}

Have a look at the output before we proceed with explaining the program:

MutableObj MutableObj(value='')
MutableObj MutableObj(value='Changed')
[a, b, c, d, e]
[a, b, c, d, e, f]

So, in this program we have two val properties—list and mutableObj. We initialized mutableObj with the default constructor of MutableObj; since it's a val property, it'll always refer to that specific object. But, if you look at comment (2), we changed the value property of mutableObj, as the value property of the MutableObj class is mutable (var). It's the same with the list property: we can add items to the list after initialization, changing its underlying value. Both list and mutableObj are perfect examples of immutable references; once initialized, the properties can't be assigned to something else, but their underlying values can be changed (you can refer to the output). The reason behind that is the data type we used for those properties. Both the MutableObj class and the MutableList<String> data structure are mutable themselves, so we cannot restrict value changes for their instances.

Immutable values

Immutable values, on the other hand, enforce no change on values as well; this is really complex to maintain. In Kotlin, const val properties enforce immutability of value, but they lack flexibility (we already discussed them) and you're bound to use only primitive types, which can be troublesome in real-life scenarios.

Immutable collections

Kotlin gives preference to immutability wherever possible, but leaves the choice to the developer whether or when to use it. This power of choice makes the language even more powerful. Unlike most languages, which have either only mutable collections (like Java, C#, and so on) or only immutable collections (like F#, Haskell, Clojure, and so on), Kotlin has both and distinguishes between them, leaving the developer the freedom to choose whether to use an immutable or a mutable one.

Kotlin has two interfaces for collection objects—Collection<out E> and MutableCollection<out E>; all the collection classes (for example, List, Set, or Map) implement either of them. As the names suggest, the two interfaces are designed to serve immutable and mutable collections respectively. Let us have an example:

fun main(args: Array<String>) {
    val immutableList = listOf(1, 2, 3, 4, 5, 6, 7) // (1)
    println("Immutable List $immutableList")

    val mutableList: MutableList<Int> = immutableList.toMutableList() // (2)
    println("Mutable List $mutableList")

    mutableList.add(8) // (3)
    println("Mutable List after add $mutableList")
    println("Immutable List after add $immutableList")
}

The output is as follows:

Immutable List [1, 2, 3, 4, 5, 6, 7]
Mutable List [1, 2, 3, 4, 5, 6, 7]
Mutable List after add [1, 2, 3, 4, 5, 6, 7, 8]
Immutable List after add [1, 2, 3, 4, 5, 6, 7]

So, in this program, we created an immutable list with the help of the listOf method of Kotlin, on comment (1). The listOf method creates an immutable list with the elements (varargs) passed to it. This method also has a generic type parameter, which can be skipped if the elements array is not empty. The listOf method also has a mutable version—mutableListOf()—which is identical except that it returns a MutableList instead. We can convert an immutable list to a mutable one with the help of the toMutableList() extension function; we did so on comment (2), in order to add an element to it on comment (3). However, if you check the output, the original immutable list remains the same without any changes; the item is added to the newly created MutableList instead.

So now you know how to implement immutability in Kotlin. If you found this tutorial helpful, and would like to learn more, head on over to purchase the full book, Functional Kotlin, by Mario Arias and Rivu Chakraborty.

Extension functions in Kotlin: everything you need to know
Building RESTful web services with Kotlin
Building chat application with Kotlin using Node.js, the powerful Server-side JavaScript platform

Delphi: memory management techniques for parallel programming

Pavan Ramchandani
19 Jun 2018
31 min read
Memory management is part of practically every computing system. Multiple programs must coexist inside a limited memory space, and that can only be possible if the operating system takes care of it. When a program needs some memory, for example, to create an object, it can ask the operating system, which will give it a slice of shared memory. When an object is not needed anymore, that memory can be returned to the loving care of the operating system.

In this tutorial, we will touch upon memory management techniques, a prime concern in parallel programming. The article is an excerpt from a book written by Primož Gabrijelčič, titled Delphi High Performance.

Slicing and dicing memory straight from the operating system is a relatively slow operation. In lots of cases, a memory system also doesn't know how to return small chunks of memory. For example, if you call Windows' VirtualAlloc function to get 20 bytes of memory, it will actually reserve 4 KB (or 4,096 bytes) for you. In other words, 4,076 bytes would be wasted.

To fix these and other problems, programming languages typically implement their own internal memory management algorithms. When you request 20 bytes of memory, the request goes to that internal memory manager. It still requests memory from the operating system but then splits it internally into multiple parts. In a hypothetical scenario, the internal memory manager would request 4,096 bytes from the operating system and give 20 bytes of that to the application. The next time the application requests some memory (30 bytes, for example), the internal memory manager would serve it from the same 4,096-byte block.

To move from hypothetical to specific, Delphi also includes such a memory manager. Since Delphi 2006, this memory manager has been called FastMM. It was written as an open source memory manager by Pierre le Riche with help from other Delphi programmers and was later licensed by Borland. FastMM was a great improvement over the previous Delphi memory manager and, although it does not perform perfectly in the parallel programming world, it still functions very well after more than ten years.

Optimizing strings and array allocations

When you create a string, the code allocates memory for its content, copies the content into that memory, and stores the address of this memory in the string variable. If you append a character to this string, it must be stored somewhere in that memory. However, there is no room for the new character: the original memory block was just big enough to store the original content. The code must, therefore, enlarge that memory block, and only then can the appended character be stored in the newly acquired space.

A very similar scenario plays out when you extend a dynamic array. Memory that contains the array data can sometimes be extended in place (without moving), but often this cannot be done.

If you do a lot of appending, these constant reallocations will start to slow down the code. The Reallocation demo shows a few examples of such behavior and possible workarounds. The first example, activated by the Append String button, simply appends the '*' character to a string 10 million times.
The code looks simple, but the s := s + '*' assignment hides a potentially slow string reallocation:

procedure TfrmReallocation.btnAppendStringClick(Sender: TObject);
var
  s: String;
  i: Integer;
begin
  s := '';
  for i := 1 to CNumChars do
    s := s + '*';
end;

By now, you probably know that I don't like to present problems that I don't have solutions for, and this is not an exception. In this case, the solution is called SetLength. This function sets a string to a specified size. You can make it shorter, or you can make it longer. You can even set it to the same length as before. In case you are enlarging the string, you have to keep in mind that SetLength will allocate enough memory to store the new string, but it will not initialize it. In other words, the newly allocated string space will contain random data.

A click on the SetLength String button activates the optimized version of the string appending code. As we know that the resulting string will be CNumChars long, the code can call SetLength(s, CNumChars) to preallocate all the memory in one step. After that, we should not append characters to the string, as that would add new characters at the end of the preallocated string. Rather, we have to store characters directly into the string by writing to s[i]:

procedure TfrmReallocation.btnSetLengthClick(Sender: TObject);
var
  s: String;
  i: Integer;
begin
  SetLength(s, CNumChars);
  for i := 1 to CNumChars do
    s[i] := '*';
end;

Comparing the speed shows that the second approach is significantly faster. It runs in 33 ms instead of the original 142 ms.

A similar situation happens when you are extending a dynamic array. The code triggered by the Append array button shows how an array may be extended by one element at a time in a loop. Admittedly, the code looks very weird, as nobody in their right mind would write a loop like this. In reality, however, similar code would be split into multiple longer functions and may be hard to spot:

procedure TfrmReallocation.btnAppendArrayClick(Sender: TObject);
var
  arr: TArray<char>;
  i: Integer;
begin
  SetLength(arr, 0);
  for i := 1 to CNumChars do begin
    SetLength(arr, Length(arr) + 1);
    arr[High(arr)] := '*';
  end;
end;

The solution is similar to the string case. We can preallocate the whole array by calling the SetLength function and then write the data into the array elements. We just have to keep in mind that the first array element always has index 0:

procedure TfrmReallocation.btnSetLengthArrayClick(Sender: TObject);
var
  arr: TArray<char>;
  i: Integer;
begin
  SetLength(arr, CNumChars);
  for i := 1 to CNumChars do
    arr[i-1] := '*';
end;

Improvements in speed are similar to the string demo. The original code needs 230 ms to append ten million elements, while the improved code executes in 26 ms.

The third case when you may want to preallocate storage space is when you are appending to a list. As an example, I'll look into a TList<T> class. Internally, it stores the data in a TArray<T>, so it again suffers from constant memory reallocation when you are adding data to the list. The short demo code appends 10 million elements to a list.
As opposed to the previous array demo, this is completely normal-looking code, found many times in many applications:

procedure TfrmReallocation.btnAppendTListClick(Sender: TObject);
var
  list: TList<Char>;
  i: Integer;
begin
  list := TList<Char>.Create;
  try
    for i := 1 to CNumChars do
      list.Add('*');
  finally
    FreeAndNil(list);
  end;
end;

To preallocate memory inside a list, you can set the Capacity property to the expected number of elements in the list. This doesn't prevent the list from growing at a later time; it just creates an initial estimate. You can also use Capacity to reduce memory space used for the list after deleting lots of elements from it. The difference between a list and a string or an array is that, after setting Capacity, you still cannot access list[i] elements directly. Firstly you have to Add them, just as if Capacity was not assigned:

procedure TfrmReallocation.btnSetCapacityTListClick(Sender: TObject);
var
  list: TList<Char>;
  i: Integer;
begin
  list := TList<Char>.Create;
  try
    list.Capacity := CNumChars;
    for i := 1 to CNumChars do
      list.Add('*');
  finally
    FreeAndNil(list);
  end;
end;

Comparing the execution speed shows only a small improvement. The original code executed in 167 ms, while the new version needed 145 ms. The reason for that relatively small change is that TList<T> already manages its storage array. When it runs out of space, it will always at least double the previous size. Internal storage therefore grows from 1 to 2, 4, 8, 16, 32, 64, ... elements.

This can, however, waste a lot of memory. In our example, the final size of the internal array is 16,777,216 elements, which is almost 68% more than the ten million elements we actually need. By setting the capacity to the exact required size, we have therefore saved 6,777,216 * SizeOf(Char) bytes, or almost 13 megabytes.

Other data structures also support the Capacity property. We can find it in TList, TObjectList, TInterfaceList, TStrings, TStringList, TDictionary, TObjectDictionary, and others.

Memory management functions

Besides the various internal functions that the Delphi runtime library (RTL) uses to manage strings, arrays, and other built-in data types, the RTL also implements various functions that you can use in your program to allocate and release memory blocks. In the next few paragraphs, I'll tell you a little bit about them.

Memory management functions can be best described if we split them into a few groups, each including functions that were designed to work together.

The first group includes GetMem, AllocMem, ReallocMem, and FreeMem. The procedure GetMem(var P: Pointer; Size: Integer) allocates a memory block of size Size and stores the address of this block in the pointer variable P. This pointer variable is not limited to the Pointer type, but can be of any pointer type (for example, PByte). The new memory block is not initialized and will contain whatever is stored in the memory at that time. Alternatively, you can allocate a memory block with a call to the function AllocMem(Size: Integer): Pointer, which allocates a memory block, fills it with zeroes, and then returns its address.

To change the size of a memory block, call the procedure ReallocMem(var P: Pointer; Size: Integer). Variable P must contain a pointer to a memory block, and Size can be either smaller or larger than the original block size. FastMM will try to resize the block in place. If that fails, it will allocate a new memory block, copy the original data into the new block, and return the address of the new block in P. Just as with GetMem, the newly allocated bytes will not be initialized. To release memory allocated in this way, you should call the FreeMem(var P: Pointer) procedure.
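As a quick illustration of this first group of functions, consider the following minimal sketch (my own, not one of the book's demos):

procedure BufferDemo;
var
  p: PByte;
begin
  GetMem(p, 16);        // 16 bytes, contents undefined
  FillChar(p^, 16, 0);  // initialize them ourselves
  ReallocMem(p, 32);    // grow the block; FastMM may move it in memory
  FreeMem(p);           // hand the block back to the memory manager
end;

AllocMem(16) could replace the GetMem/FillChar pair here, as it returns a block that is already filled with zeroes.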
The second group includes GetMemory, ReallocMemory, and FreeMemory. These three work just the same as the functions from the first group, except that they can be used from C++Builder.

The third group contains just two functions, New and Dispose. These two functions can be used to dynamically create and destroy variables of any type. To allocate such a variable, call New(var X: Pointer), where X is again of any pointer type. The compiler will automatically provide the correct size for the memory block and it will also initialize all managed fields to zero. Unmanaged fields will not be initialized. To release such variables, don't use FreeMem, but Dispose(var X: Pointer). In the next section, I'll give a short example of using New and Dispose to dynamically create and destroy variables of a record type.

You must never use Dispose to release memory allocated with GetMem or AllocMem. You must also never use FreeMem to release memory allocated with New.

The fourth and last group also contains just two functions, Initialize and Finalize. Strictly speaking, they are not memory management functions. If you create a variable containing managed fields (for example, a record) with a function other than New or AllocMem, it will not be correctly initialized. Managed fields will contain random data, and that will completely break the execution of the program. To fix that, you should call Initialize(var V), passing in the variable (and not a pointer to this variable!). An example in the next section will clarify that.

Before you return such a variable to the memory manager, you should clean up all references to managed fields by calling Finalize(var V). It is better to use Dispose, which will do that automatically, but sometimes that is not an option and you have to do it manually. Both functions also exist in a form that accepts a number of variables to initialize. This form can be used to initialize or finalize an array of data:

procedure Initialize(var V; Count: NativeUInt);
procedure Finalize(var V; Count: NativeUInt);

In the next section, I'll dig deeper into the dynamic allocation of record variables. I'll also show how most of the memory allocation functions are used in practice.

Dynamic record allocation

While it is very simple to dynamically create new objects—you just call the Create constructor—dynamic allocation of records and other data types (arrays, strings ...) is a bit more complicated. In the previous section, we saw that the preferred way of allocating such variables is with the New method. The InitializeFinalize demo shows how this is done in practice. The code will dynamically allocate a variable of type TRecord. To do that, we need a pointer variable pointing to TRecord. The cleanest way to do that is to declare a new type PRecord = ^TRecord:

type
  TRecord = record
    s1, s2, s3, s4: string;
  end;
  PRecord = ^TRecord;

Now, we can just declare a variable of type PRecord and call New on that variable. After that, we can use the rec variable as if it was a normal record and not a pointer.
Technically, we would have to always write rec^.s1, rec^.s4 and so on, but the Delphi compiler is friendly enough and allows us to drop the ^ character:

procedure TfrmInitFin.btnNewDispClick(Sender: TObject);
var
  rec: PRecord;
begin
  New(rec);
  try
    rec.s1 := '4';
    rec.s2 := '2';
    rec.s4 := rec.s1 + rec.s2 + rec.s4;
    ListBox1.Items.Add('New: ' + rec.s4);
  finally
    Dispose(rec);
  end;
end;

Technically, you could just use rec: ^TRecord instead of rec: PRecord, but it is customary to use explicitly declared pointer types, such as PRecord.

Another option is to use GetMem instead of New, and FreeMem instead of Dispose. In this case, however, we have to manually prepare the allocated memory for use with a call to Initialize. We must also prepare it to be released with a call to Finalize before we call FreeMem. If we use GetMem for allocation, we must manually provide the correct size of the allocated block. In this case, we can simply use SizeOf(TRecord). We must also be careful with the parameters passed to GetMem and Initialize. You pass a pointer (rec) to GetMem and FreeMem, and the actual record data (rec^) to Initialize and Finalize:

procedure TfrmInitFin.btnInitFinClick(Sender: TObject);
var
  rec: PRecord;
begin
  GetMem(rec, SizeOf(TRecord));
  try
    Initialize(rec^);
    rec.s1 := '4';
    rec.s2 := '2';
    rec.s4 := rec.s1 + rec.s2 + rec.s4;
    ListBox1.Items.Add('GetMem+Initialize: ' + rec.s4);
  finally
    Finalize(rec^);
    FreeMem(rec);
  end;
end;

This demo also shows how the code doesn't work correctly if you allocate a record with GetMem but then don't call Initialize. To test this, click the third button (GetMem). While in actual code the program may sometimes work and sometimes not, I have taken some care so that GetMem will always return a memory block which is not initialized to zero, and the program will certainly fail.

It is certainly possible to create records dynamically and use them instead of classes, but one question still remains—why? Why would we want to use records instead of objects when working with objects is simpler? The answer, in one word, is speed.

The demo program, Allocate, shows the difference in execution speed. A click on the Allocate objects button will create ten million objects of type TNodeObj, which is a typical object that you would find in an implementation of a binary tree. Of course, the code then cleans up after itself by destroying all those objects:

type
  TNodeObj = class
    Left, Right: TNodeObj;
    Data: NativeUInt;
  end;

procedure TfrmAllocate.btnAllocClassClick(Sender: TObject);
var
  i: Integer;
  nodes: TArray<TNodeObj>;
begin
  SetLength(nodes, CNumNodes);
  for i := 0 to CNumNodes-1 do
    nodes[i] := TNodeObj.Create;
  for i := 0 to CNumNodes-1 do
    nodes[i].Free;
end;

Similar code, activated by the Allocate records button, creates ten million records of type TNodeRec, which contains the same fields as TNodeObj:

type
  PNodeRec = ^TNodeRec;
  TNodeRec = record
    Left, Right: PNodeRec;
    Data: NativeUInt;
  end;

procedure TfrmAllocate.btnAllocRecordClick(Sender: TObject);
var
  i: Integer;
  nodes: TArray<PNodeRec>;
begin
  SetLength(nodes, CNumNodes);
  for i := 0 to CNumNodes-1 do
    New(nodes[i]);
  for i := 0 to CNumNodes-1 do
    Dispose(nodes[i]);
end;

Running both methods shows a big difference. While the class-based approach needs 366 ms to initialize objects and 76 ms to free them, the record-based approach needs only 76 ms to initialize records and 56 ms to free them. Where does that big difference come from? When you create an object of a class, lots of things happen.
Firstly, TObject.NewInstance is called to allocate the object. That method calls TObject.InstanceSize to get the size of the object, then GetMem to allocate the memory and, in the end, InitInstance, which fills the allocated memory with zeros. Secondly, a chain of constructors is called. After all that, a chain of AfterConstruction methods is called (if such methods exist). All in all, that is quite a process, and it takes some time.

Much less is going on when you create a record. If it contains only unmanaged fields, as in our example, GetMem is called and that's all. If the record contains managed fields, this GetMem is followed by a call to the _Initialize method in the System unit, which initializes the managed fields.

The problem with records is that we cannot declare generic pointers. When we are building trees, for example, we would like to store some data of type T in each node. The initial attempt at that, however, fails. The following code does not compile with the current Delphi compiler:

type
  PNodeRec<T> = ^TNodeRec<T>;
  TNodeRec<T> = record
    Left, Right: PNodeRec<T>;
    Data: T;
  end;

We can circumvent this by moving the TNodeRec<T> declaration inside the generic class that implements a tree. The following code from the Allocate demo shows how we could declare such internal types as a generic object and as a generic record:

type
  TTree<T> = class
  strict private type
    TNodeObj<T1> = class
      Left, Right: TNodeObj<T1>;
      Data: T1;
    end;

    PNodeRec = ^TNodeRec;
    TNodeRec<T1> = record
      Left, Right: PNodeRec;
      Data: T1;
    end;
    TNodeRec = TNodeRec<T>;
  end;

If you click the Allocate node<string> button, the code will create a TTree<string> object and then create 10 million class-based nodes and the same number of record-based nodes. This time, New must initialize the managed field Data: string, but the difference in speed is still big. The code needs 669 ms to create and destroy class-based nodes and 133 ms to create and destroy record-based nodes.

Another big difference between classes and records is that each object contains two hidden pointer-sized fields. Because of that, each object is 8 bytes larger than you would expect (16 bytes in 64-bit mode). That amounts to 8 * 10,000,000 bytes, or a bit over 76 megabytes. Records are therefore not only faster but also save space!

FastMM internals

To get full speed out of anything, you have to understand how it works, and memory managers are no exception to this rule. To write very fast Delphi applications, you should, therefore, understand how Delphi's default memory manager works. FastMM is not just a memory manager—it is three memory managers in one! It contains three significantly different subsystems—a small block allocator, a medium block allocator, and a large block allocator.

The first one, the allocator for small blocks, handles all memory blocks smaller than 2.5 KB. This boundary was determined by observing existing applications. As it turned out, in most Delphi applications, this covers 99% of all memory allocations. This is not surprising, as in most Delphi applications most memory is allocated when an application creates and destroys objects and works with arrays and strings, and those are rarely larger than a few hundred characters. Next comes the allocator for medium blocks, which are memory blocks with a size between 2.5 KB and 160 KB. The last one, the allocator for large blocks, handles all other requests.

The difference between the allocators lies not just in the size of memory that they serve, but in the strategy they use to manage memory.
The large block allocator implements the simplest strategy. Whenever it needs some memory, it gets it directly from Windows by calling VirtualAlloc. This function allocates memory in 4 KB blocks, so this allocator could waste up to 4,095 bytes per request. As it is used only for blocks larger than 160 KB, this wasted memory doesn't significantly affect the program, though.

The medium block allocator gets its memory from the large block allocator. It then carves this larger block into smaller blocks, as they are requested by the application. It also keeps all unused parts of the memory in a linked list so that it can quickly find a memory block that is still free.

The small block allocator is where the real smarts of FastMM lie. There are actually 56 small memory allocators, each serving only one size of memory block. The first one serves 8-byte blocks, the next one 16-byte blocks, followed by allocators for 24, 32, 40, ... 256, 272, 288, ... 960, 1056, ... 2384, and 2608-byte blocks. They all get memory from the medium block allocator. If you want to see the block sizes for all 56 allocators, open FastMM4.pas and search for SmallBlockTypes.

What that actually means is that each memory allocation request will waste some memory. If you allocate 28 bytes, they'll be allocated from the 32-byte allocator, so 4 bytes will be wasted. If you allocate 250 bytes, they'll come from the 256-byte allocator, and so on. The sizes of the memory allocators were carefully chosen so that the amount of wasted memory is typically below 10%, so this doesn't represent a big problem in most applications.

Each allocator is basically just an array of equally sized elements (memory blocks). When you allocate a small amount of memory, you'll get back one element of an array. All unused elements are connected into a linked list so that the memory manager can quickly find a free element of an array when it needs one. (The original article's diagram shows a very simplified representation of FastMM allocators: only two small block allocators, boxes with thick borders representing allocated memory, boxes with thin borders representing unused memory, and free memory blocks connected into linked lists; block sizes in different allocators are not to scale.)

FastMM implements a neat trick which helps a lot when you resize strings or arrays by a small amount. Well, truth be told, I had to append lots and lots of characters—ten million of them—for this difference to show. If I were appending only a few characters, both versions would run at nearly the same speed. If you can, on the other hand, get your hands on a pre-2006 Delphi and run the demo program there, you'll see that the one-by-one approach runs terribly slowly. The difference in speed will be a few more orders of magnitude larger than in my example.

The trick I'm talking about assumes that if you have resized memory once, you'll probably want to do it again, soon. If you are enlarging the memory, it will limit the smallest size of the new memory block to be at least twice the size of the original block, plus 32 bytes. The next time you want to resize, FastMM will (hopefully) just update the internal information about the allocated memory and return the same block, knowing that there's enough space at the end.

All that trickery is hard to understand without an example, so here's one. Let's say we have a string of 5 characters which neatly fits into a 24-byte block. Sorry, what am I hearing? "What? Why!? 5 Unicode characters need only 10 bytes!"
Oh, yes, strings are more complicated than I told you before. In reality, each Delphi UnicodeString and AnsiString contains some additional data besides the actual characters that make up the string. Parts of the string are also: a 4-byte length of the string, a 4-byte reference count, a 2-byte field storing the size of each string character (either 1 for AnsiString or 2 for UnicodeString), and a 2-byte field storing the character code page. In addition to that, each string includes a terminating Chr(0) character. For a 5-character string, this gives us 4 (length) + 4 (reference count) + 2 (character size) + 2 (code page) + 5 (characters) * 2 (size of a character) + 2 (terminating Chr(0)) = 24 bytes.

When you add one character to this string, the code will ask the memory manager to enlarge a 24-byte block to 26 bytes. Instead of returning a 26-byte block, FastMM will round that up to 2 * 24 + 32 = 80 bytes. Then it will look for an appropriate allocator, find one that serves 80-byte blocks (great, no memory loss!), and return a block from that allocator. It will, of course, also have to copy the data from the original block to the new block. This formula, 2 * size + 32, is used only in small block allocators. A medium block allocator only overallocates by 25%, and a large block allocator doesn't implement this behavior at all.

The next time you add one character to this string, FastMM will just look at the memory block, determine that there's still enough space inside this 80-byte memory block, and return the same memory. This will continue for quite some time, while the block grows to 80 bytes in two-byte increments. After that, the block will be resized to 2 * 80 + 32 = 192 bytes (yes, there is an allocator for this size), the data will be copied, and the game will continue. This behavior indeed wastes some memory but, under most circumstances, significantly boosts the speed of code that was not written with speed in mind.

Memory allocation in a parallel world

We've seen how FastMM boosts reallocation speed. The life of a memory manager is simple when there is only one thread of execution inside a program. When the memory manager is dealing out memory, it can be perfectly safe in the knowledge that nothing can interrupt it in this work. When we deal with parallel processing, however, multiple paths of execution simultaneously execute the same program and work on the same data. Because of that, life from the memory manager's perspective suddenly becomes very dangerous.

For example, let's assume that one thread wants some memory. The memory manager finds a free memory block on a free list and prepares to return it. At that moment, however, another thread also needs some memory from the same allocator. This second execution thread (running in parallel with the first one) would also find a free memory block on the free list. If the first thread hasn't yet updated the free list, that may even be the same memory block! That can only result in one thing—complete confusion and crashing programs.

It is extremely hard to write code that manipulates some data structures (such as a free list) in a manner that functions correctly in a multithreaded world. So hard that FastMM doesn't even try. Instead, it regulates access to each allocator with a lock. Each of the 56 small block allocators gets its own lock, as do the medium and large block allocators. When a program needs some memory from, say, a 16-byte allocator, FastMM will lock this allocator until the memory is returned to the program.
If, during this time, another thread requests memory from the same 16-byte allocator, it will have to wait until the first thread finishes. This indeed fixes all problems but introduces a bottleneck—a part of the code where threads must wait to be processed in a serial fashion. If threads do lots of memory allocation, this serialization will completely negate the speed-up that we expected to get from the parallel approach. Such a memory manager would be useless in a parallel world.

To fix that, FastMM introduces a memory allocation optimization which only affects small blocks. When accessing a small block allocator, FastMM will try to lock it. If that fails, it will not wait for the allocator to become unlocked but will try to lock the allocator for the next block size. If that succeeds, it will return memory from the second allocator. That will indeed waste more memory, but will help with the execution speed. If the second allocator also cannot be locked, FastMM will try to lock the allocator for yet the next block size. If the third allocator can be locked, you'll get back memory from it. Otherwise, FastMM will repeat the process from the beginning. This process can be loosely described with the following pseudo-code:

allocIdx := find best allocator for the memory block
repeat
  if can lock allocIdx then break;
  Inc(allocIdx);
  if can lock allocIdx then break;
  Inc(allocIdx);
  if can lock allocIdx then break;
  Dec(allocIdx, 2)
until false
allocate memory from allocIdx allocator
unlock allocIdx

A careful reader will notice that this code fails when the first line finds the last allocator in the table, or the one before that. Instead of adding some conditional code to work around the problem, FastMM rather repeats the last allocator in the list three times. The table of small allocators actually ends with the following sizes: 1,984; 2,176; 2,384; 2,608; 2,608; 2,608. When requesting a block size above 2,384, the first line in the pseudo-code above will always find the first 2,608 allocator, so there will always be two more after it.

This approach works great when memory is allocated, but hides another problem. And how can I better explain a problem than with a demonstration ...?

An example of this problem can be found in the program, ParallelAllocations. If you run it and click the Run button, the code will compare the serial version of some algorithm with a parallel one. I'm aware that I have not explained parallel programming at all, but the code is so simple that even somebody without any understanding of the topic will guess what it does.

The core of the test runs a loop, calling the Execute method on all objects in a list. If the parallelTest flag is set, the loop is executed in parallel; otherwise, it is executed serially. The only mysterious part in the code, TParallel.For, does exactly what it says—it executes a for loop in parallel:

if parallelTest then
  TParallel.For(0, fList.Count - 1,
    procedure(i: integer)
    begin
      fList[i].Execute;
    end)
else
  for i := 0 to fList.Count - 1 do
    fList[i].Execute;

If you will be running the program, make sure that you execute it without the debugger (Ctrl + Shift + F9 will do that). Running with the debugger slows down parallel execution and can skew the measurements. On my test machine, parallelizing the program made it almost 4 times faster. Great result! Well, no. Not a great result. You see, the machine I was testing on has 12 cores. If all were running in parallel, I would expect an almost 12x speed-up, not a mere 4-times improvement!
If you take a look at the code, you'll see that each Execute allocates a ton of objects. It is obvious that the problem lies in the memory manager. The question remains, though: where exactly does this problem lie, and how can we find it?

I ran into exactly the same problem a few years ago. A highly parallel application which processes gigabytes and gigabytes of data was not running fast enough. There were no obvious problematic points, and I suspected that the culprit was FastMM. I tried swapping the memory manager for a more multithreading-friendly one and, indeed, the problem was somewhat reduced, but I still wanted to know where the original sin lay in my code. I also wanted to continue using FastMM, as it offers great debugging tools.

In the end, I found no other solution than to dig into the FastMM internals, find out how it works, and add some logging there. More specifically, I wanted to know when a thread is waiting for a memory manager to become unlocked. I also wanted to know at which locations in my program this happens the most.

To cut a (very) long story short, I extended FastMM with support for this kind of logging. This extension was later integrated into the main FastMM branch. As these changes are not included in Delphi, you have to take some steps to use this code. Firstly, you have to download FastMM from the official repository at https://github.com/pleriche/FastMM4. Then you have to unpack it somewhere on the disk and add FastMM4 as the first unit in the project file (.dpr). For example, the ParallelAllocation program starts like this:

program ParallelAllocation;

uses
  FastMM4 in 'FastMM\FastMM4.pas',
  Vcl.Forms,
  ParallelAllocationMain in 'ParallelAllocationMain.pas' {frmParallelAllocation};

When you have done that, you should first rebuild your program and test whether everything is still working. (It should, but you never know ...)

To enable the memory manager logging, you have to define the conditional symbol LogLockContention, rebuild (as FastMM4 has to be recompiled) and, of course, run the program without the debugger. If you do that, you'll see that the program runs quite a bit slower than before. On my test machine, the parallel version was only 1.6x faster than the serial one. The logging takes its toll, but that is not important. The important part appears when you close the program. At that point, the logger collects all results and sorts them by frequency. The 10 most frequent sources of locking in the program are saved to a file called <programname>_MemoryManager_EventLog.txt. You will find it in the folder with the <programname>.exe. The three most frequent sources of locking are also displayed on the screen.

In the resulting log for our demo, a few parts stand out. For starters, we can see that at one location the program waited 19,020 times for the memory manager to become unlocked. Next, we can see that the memory function that caused the problem was FreeMem. Furthermore, we can see that somebody tried to delete from a list (InternalDoDelete) and that this deletion was called from TSpeedTest.Execute, line 130. FreeMem was called because the list in question is actually a TObjectList, and deleting elements from the list caused them to be destroyed.

The most important part here is the memory function causing the problem—FreeMem. Of course! Allocations are optimized. If an allocator is locked, the next one will be used, and so on. Releasing memory, however, is not optimized!
When we release a memory block, it must be returned to the same allocator that it came from. If two threads want to release memory to the same allocator at the same time, one will have to wait.

I had an idea on how to improve this situation: adding a small stack (called a release stack) to each allocator. When FreeMem is called and it cannot lock the allocator, the address of the memory block that is to be released is stored on that stack, and FreeMem exits quickly. When a FreeMem call successfully locks an allocator, it first releases its own memory block. Then it checks whether anything is waiting on the release stack and releases those memory blocks too (if there are any).

This change is also included in the main FastMM branch, but it is not activated by default, as it increases the overall memory consumption of the program. However, in some situations it can do miracles, and if you are developing multithreaded programs, you should certainly test it out.

To enable release stacks, open the project settings for the program, remove the conditional define LogLockContention (as that slows the program down), and add the conditional define UseReleaseStack. Rebuild, as FastMM4.pas has to be recompiled.

On my test machine, I got much better results with this option enabled. Instead of a 3.9x speed-up, the parallel version was 6.3x faster than the serial one. The factor is still not close to 12x, as the threads do too much fighting over the memory, but the improvement is significant. That is as far as FastMM will take us. For faster execution, we need a more multithreading-friendly memory manager.

To summarize, this article covered the memory management techniques offered by Delphi. We looked into optimization, allocation, and the internals of storage for efficient parallel programming. If you found this post useful, do check out the book Delphi High Performance to learn more about the intricacies of high-performance programming with Delphi.

Read More:
Exploring the Usages of Delphi
Network programming 101 with GAWK (GNU AWK)
A really basic guide to batch file programming

Regular expressions in AWK programming: What, Why, and How

Pavan Ramchandani
18 May 2018
8 min read
AWK is a pattern-matching language. It searches for a pattern in a file and, upon finding a match, performs the corresponding action on the input line. The pattern could consist of fixed strings or a pattern of text. This variable content or pattern is generally searched for with the help of regular expressions. Hence, regular expressions form an important part of the AWK programming language. Today we will introduce you to regular expressions in AWK programming and will get started with string-matching patterns and basic constructs to use with AWK. This article is an excerpt from a book written by Shiwang Kalkhanda, titled Learning AWK Programming.

What is a regular expression?

A regular expression, or regexpr, is a set of characters used to describe a pattern. A regular expression is generally used to match lines in a file that contain a particular pattern. Many Unix utilities operate on plain text files line by line, such as grep, sed, and awk. Regular expressions search for a pattern on a single line in a file. A regular expression doesn't search for a pattern that begins on one line and ends on another. Other programming languages may support this, notably Perl.

Why use regular expressions?

Generally, all editors have the ability to perform search-and-replace operations. Some editors can only search for patterns, others can also replace them, and others can also print the line containing that pattern. A regular expression goes many steps beyond this simple search, replace, and print functionality, and hence it is more powerful and flexible. We can search for a word of a certain size, such as a word that has four characters or numbers. We can search for a word that ends with a particular character, let's say e. You can search for phone numbers, email IDs, and so on, and can also perform validation using regular expressions. They simplify complex pattern-matching tasks and hence form an important part of AWK programming. Other regular expression variations also exist, notably those for Perl.

Using regular expressions with AWK

There are mainly two types of regular expressions in Linux:

  - Basic regular expressions, which are used by vi, sed, grep, and so on
  - Extended regular expressions, which are used by awk, nawk, gawk, and egrep

Here, we will refer to extended regular expressions as regular expressions in the context of AWK. In AWK, regular expressions are enclosed in forward slashes, '/' (forming the AWK pattern), and match every input record whose text belongs to that set. The simplest regular expression is a string of letters, numbers, or both that matches itself. For example, here we use the ly regular expression string to print all lines that contain the ly pattern in them. We just need to enclose the regular expression in forward slashes in AWK:

$ awk '/ly/' emp.dat

The output on execution of this code is as follows:

Billy Chabra 9911664321 bily@yahoo.com M lgs 1900
Emily Kaur 8826175812 emily@gmail.com F Ops 2100

In this example, the /ly/ pattern matches when the current input line contains the ly sub-string, either as ly itself or as part of a bigger word, such as Billy or Emily, and prints the corresponding line.

Regular expressions as string-matching patterns with AWK

Regular expressions are used as string-matching patterns with AWK in the following three ways. We use the '~' and '!~' match operators to perform regular expression comparisons:
/regexpr/: This matches when the current input line contains a sub-string matched by regexpr. It is the most basic regular expression, which matches itself as a string or sub-string. For example, /mail/ matches only when the current input line contains the mail string as a string, a sub-string, or both. So, we will get lines with Gmail as well as Hotmail in the email ID field of the employee database, as follows:

$ awk '/mail/' emp.dat

The output on execution of this code is as follows:

Jack Singh 9857532312 jack@gmail.com M hr 2000
Jane Kaur 9837432312 jane@gmail.com F hr 1800
Eva Chabra 8827232115 eva@gmail.com F lgs 2100
Ana Khanna 9856422312 anak@hotmail.com F Ops 2700
Victor Sharma 8826567898 vics@hotmail.com M Ops 2500
John Kapur 9911556789 john@gmail.com M hr 2200
Sam khanna 8856345512 sam@hotmail.com F lgs 2300
Emily Kaur 8826175812 emily@gmail.com F Ops 2100
Amy Sharma 9857536898 amys@hotmail.com F Ops 2500

In this example, we did not specify any expression, so the pattern is automatically matched against the whole line. The following command is therefore equivalent and produces exactly the same output:

$ awk '$0 ~ /mail/' emp.dat

expression ~ /regexpr/: This matches if the string value of the expression contains a sub-string matched by regexpr. Generally, the left-hand operand of the matching operator is a field. For example, in the following command, we print all the lines in which the value in the second field contains the /Singh/ string:

$ awk '$2 ~ /Singh/{ print }' emp.dat

We can also use the expression as follows:

$ awk '{ if($2 ~ /Singh/) print}' emp.dat

The output on execution of the preceding code is as follows:

Jack Singh 9857532312 jack@gmail.com M hr 2000
Hari Singh 8827255666 hari@yahoo.com M Ops 2350
Ginny Singh 9857123466 ginny@yahoo.com F hr 2250
Vina Singh 8811776612 vina@yahoo.com F lgs 2300

expression !~ /regexpr/: This matches if the string value of the expression does not contain a sub-string matched by regexpr. Generally, this expression is also a field variable. For example, in the following command, we print all the lines that don't contain the Singh sub-string in the second field:

$ awk '$2 !~ /Singh/{ print }' emp.dat

The output on execution of the preceding code is as follows:

Jane Kaur 9837432312 jane@gmail.com F hr 1800
Eva Chabra 8827232115 eva@gmail.com F lgs 2100
Amit Sharma 9911887766 amit@yahoo.com M lgs 2350
Julie Kapur 8826234556 julie@yahoo.com F Ops 2500
Ana Khanna 9856422312 anak@hotmail.com F Ops 2700
Victor Sharma 8826567898 vics@hotmail.com M Ops 2500
John Kapur 9911556789 john@gmail.com M hr 2200
Billy Chabra 9911664321 bily@yahoo.com M lgs 1900
Sam khanna 8856345512 sam@hotmail.com F lgs 2300
Emily Kaur 8826175812 emily@gmail.com F Ops 2100
Amy Sharma 9857536898 amys@hotmail.com F Ops 2500

Any expression may be used in place of /regexpr/ in the context of ~ and !~. Such matching expressions can also be used inside if, while, for, and do statements.
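Matches can also drive more elaborate conditions. The following one-liner is a sketch of my own (not from the book), using the same emp.dat layout as above, where $4 is the email field and $6 is the department; it should print the name and email of every Ops employee with a Hotmail address:

$ awk '{ if ($4 ~ /hotmail/ && $6 ~ /Ops/) print $1, $4 }' emp.dat
Ana anak@hotmail.com
Victor vics@hotmail.com
Amy amys@hotmail.com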
Basic regular expression constructs

Regular expressions are made up of two types of characters: normal text characters, called literals, and special characters, such as the asterisk (*, +, ?, .), called metacharacters. There are times when you want to match a metacharacter as a literal character. In such cases, we prefix that metacharacter with a backslash (\), which is called an escape sequence.

Here is the list of metacharacters, also known as special characters, that are used in building regular expressions:

^ $ . [ ] | ( ) * + ?

The following list covers the remaining elements that are used in building a basic regular expression, apart from the metacharacters mentioned before:

  - Literal: A literal character (non-metacharacter), such as A, that matches itself.
  - Escape sequence: An escape sequence that matches a special symbol: for example, \t matches a tab.
  - Quoted metacharacter (\): A metacharacter prefixed with a backslash, such as \$, matches that metacharacter literally.
  - Anchor (^): Matches the beginning of a string.
  - Anchor ($): Matches the end of a string.
  - Dot (.): Matches any single character.
  - Character classes ([...]): A character class [ABC] matches any one of the A, B, or C characters. Character classes may include abbreviations, such as [A-Za-z], which matches any single letter.
  - Complemented character classes ([^...]): A complemented character class, such as [^0-9], matches any character except a digit.

These operators combine regular expressions into larger ones:

  - Alternation (|): A|B matches A or B.
  - Concatenation: AB matches A immediately followed by B.
  - Closure (*): A* matches zero or more As.
  - Positive closure (+): A+ matches one or more As.
  - Zero or one (?): A? matches the null string or A.
  - Parentheses (()): Used for grouping regular expressions and back-referencing; a quoted group \(r\) can be accessed later using \n, where n is a digit.
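To tie a few of these constructs together, here is a short sketch of my own (not from the book) that combines an anchor with a character class. Run against the same emp.dat, it should select exactly the employees whose first name begins with J or V (Jack, Jane, Julie, John, Victor, and Vina), printed in file order:

$ awk '$1 ~ /^[JV]/ { print $1, $2 }' emp.dat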
Do check out the book Learning AWK Programming to learn more about the intricacies of the AWK programming language for text processing.

Read More:
What is the difference between functional and object-oriented programming?
What makes a programming language simple or complex?

That '70s language: AWK programming

Pavan Ramchandani
17 May 2018
9 min read
AWK is an interpreted programming language designed for text processing and report generation. It is typically used for data manipulation, such as searching for items within data, performing arithmetic operations, and restructuring raw data for generating reports, in most Unix-like operating systems. Today, we will explore the AWK philosophy and the different types of AWK that exist, starting from its original implementation in 1977 at AT&T's Laboratories, Inc. We will also look at the various implementation areas of AWK in data science today.

Using AWK programs, one can handle repetitive text-editing problems with very simple and short programs. It is a pattern-action language; it searches for patterns in a given input and, when a match is found, it performs the corresponding action. The pattern can be made of strings, regular expressions, comparison operations on numbers, fields, variables, and so on. It reads the input files and splits each input line of the file into fields automatically.

AWK has most of the well-designed features that every programming language should contain. Its syntax particularly resembles that of the C programming language. It is named after its original three authors:

  - Alfred V. Aho
  - Peter J. Weinberger
  - Brian W. Kernighan

AWK is a very powerful, elegant, and simple language that every person dealing with text processing should be familiar with. This article is an excerpt from a book written by Shiwang Kalkhanda, titled Learning AWK Programming. This book will introduce you to the AWK programming language and get you hands-on with practical implementations of AWK.

Types of AWK

The AWK language was originally implemented as an AWK utility on Unix. Today, most Linux distributions provide the GNU implementation of AWK (GAWK), and a symlink for AWK is created from the original GAWK binary. The AWK utility can be categorized into the following three types, depending upon the type of interpreter it uses for executing AWK programs:

  - AWK: This is the original AWK interpreter available from AT&T Laboratories. However, it is not used much nowadays and hence it might not be well-maintained. Its limitation is that it splits a line into a maximum of 99 fields. It was updated and replaced in the mid-1980s with an enhanced version called New AWK (NAWK).
  - NAWK: This is AT&T's latest development on the AWK interpreter. It is well-maintained by one of the original authors of AWK - Dr. Brian W. Kernighan.
  - GAWK: This is the GNU project's implementation of the AWK programming language. All GNU/Linux distributions are shipped with GAWK by default, and hence it is the most popular version of AWK. The GAWK interpreter is fully compatible with AWK and NAWK.

Beyond these, we also have other, less popular, AWK interpreters and translators, mentioned as follows. These variants are useful when you want to translate your AWK program to C, C++, or Perl:

  - MAWK: Michael Brennan's interpreter for AWK.
  - TAWK: Thompson Automation's interpreter/compiler/Microsoft Windows DLL for AWK.
  - MKSAWK: Mortice Kern Systems' interpreter/compiler for AWK.
  - AWKCC: An AWK translator to C (might not be well-maintained).
  - AWKC++: Brian Kernighan's AWK translator to C++ (experimental). It can be downloaded from: https://9p.io/cm/cs/who/bwk/awkc++.ps.
  - AWK2C: An AWK translator to C. It uses GNU AWK libraries extensively.
  - A2P: An AWK translator to Perl. It comes with Perl.
  - AWKA: Yet another AWK translator to C (comes with the library), based on MAWK. It can be downloaded from: http://awka.sourceforge.net/download.html.
When and where to use AWK

AWK is simpler than any other utility for text processing and is available as a default on Unix-like operating systems. Some people might say Perl is a superior choice for text processing, as AWK is functionally a subset of Perl, but the learning curve for Perl is steeper than that of AWK; AWK is simpler than Perl. AWK programs are smaller and hence quicker to execute. Anybody who knows the Linux command line can start writing AWK programs in no time. Here are a few use cases of AWK:

  - Text processing
  - Producing formatted text reports/labels
  - Performing arithmetic operations on fields of a file
  - Performing string operations on different fields of a file

Programs written in AWK are smaller than they would be in other higher-level languages for similar text processing operations. AWK programs are interpreted on a GNU/Linux Terminal and thus avoid the compile-and-debug phases of software development in other languages.

Getting started with installation

This section describes how to set up the AWK environment on your GNU/Linux system, and we'll also discuss the workflow of AWK. Then, we'll look at different methods for executing AWK programs.

Installation on Linux

Generally, AWK is installed by default on most GNU/Linux distributions. Using the which command, you can check whether it is installed on your system or not. In case AWK is not installed on your system, you can do so in one of two ways:

  - Using the package manager of the corresponding GNU/Linux system
  - Compiling from the source code

Let's take a look at each method in detail in the following sections.

Using the package manager

Different flavors of GNU/Linux distributions have different package-management utilities. If you are using a Debian-based GNU/Linux distribution, such as Ubuntu, Mint, or Debian, then you can install AWK using the Advanced Package Tool (APT) package manager, as follows:

[ shiwang@linux ~ ] $ sudo apt-get update -y
[ shiwang@linux ~ ] $ sudo apt-get install gawk -y

Similarly, to install AWK on an RPM-based GNU/Linux distribution, such as Fedora, CentOS, or RHEL, you can use the Yellowdog Updater, Modified (YUM) package manager, as follows:

[ root@linux ~ ] # yum update -y
[ root@linux ~ ] # yum install gawk -y

For installation of AWK on openSUSE, you can use the zypper (zypper command line) package-management utility, as follows:

[ root@linux ~ ] # zypper update -y
[ root@linux ~ ] # zypper install gawk -y

Once the installation is finished, make sure AWK is accessible through the command line. We can check that using the which command, which will return the absolute path of AWK on our system:

[ root@linux ~ ] # which awk
/usr/bin/awk

You can also use awk --version to find the AWK version on your system:

[ root@linux ~ ] # awk --version
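With AWK installed (via the package manager above, or compiled from source as described next), a quick smoke test confirms that everything works end to end. This one-liner is my own illustration with invented data, not an example from the book; it sums the second field of two input lines:

$ printf 'alpha 10\nbeta 20\n' | awk '{ sum += $2 } END { print "total:", sum }'
total: 30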
Here, we will use the wget command-line utility to download it; however, you are free to use any other program you feel comfortable with, such as curl:

[ shiwang@linux ~ ] $ wget http://ftp.gnu.org/gnu/gawk/gawk-4.1.3.tar.xz

Extract the downloaded source code:

[ shiwang@linux ~ ] $ tar xvf gawk-4.1.3.tar.xz

Change your working directory and execute the configure file to configure GAWK as per the working environment of your system:

[ shiwang@linux ~ ] $ cd gawk-4.1.3 && ./configure

Once the configure command completes its execution successfully, it will generate the makefile. Now, compile the source code by executing the make command:

[ shiwang@linux ~ ] $ make

Type make install to install the programs and any data files and documentation. When installing into a prefix owned by root, it is recommended that the package be configured and built as a regular user, and only the make install phase be executed with root privileges:

[ shiwang@linux ~ ] $ sudo make install

Upon successful execution of these five steps, you have compiled and installed AWK on your GNU/Linux distribution. You can verify this by executing the which awk command in the Terminal or awk --version:

[ root@linux ~ ] # which awk
/usr/bin/awk

Now you have a working AWK/GAWK installation and we are ready to begin AWK programming, but before that, the next section describes the workflow of the AWK interpreter. If you are running macOS, AWK (and not GAWK) is installed by default. For GAWK installation on macOS, please refer to MacPorts.

Workflow of AWK

Having a basic knowledge of the AWK interpreter's workflow will help you to better understand AWK and will result in more efficient AWK program development. Hence, before getting your hands dirty with AWK programming, you need to understand its internals. The AWK workflow can be summarized as shown in the following figure. Let's take a look at each operation:

READ OPERATION: AWK reads a line from the input stream (file, pipe, or stdin) and stores it in memory. It works on text input, which can be a file, the standard input stream, or the output of a pipe, and it further splits the input into records and fields:

Records: An AWK record is a single, continuous data input that AWK works on. Records are bounded by a record separator, whose value is stored in the RS variable. The default value of RS is set to a newline character. So, the lines of input are considered records for the AWK interpreter. Records are read continuously until the end of the input is reached. Figure 1.2 shows how input data is broken into records and then goes further into how it is split into fields.

Fields: Each record can further be broken down into individual chunks called fields. Like records, fields are bounded. The default field separator is any amount of whitespace, including tab and space characters. So, by default, lines of input are further broken down into individual words separated by whitespace. You can refer to the fields of a record by a field number, beginning with 1. The last field in each record can be accessed by its number or with the NF special variable, which contains the number of fields in the current record, as shown in Figure 1.3.

EXECUTE OPERATION: All AWK commands are applied sequentially to the input (records and fields). By default, AWK executes commands on each record/line. This behavior of AWK can be restricted by the use of patterns.

REPEAT OPERATION: The process of read and execute is repeated until the end of the file is reached.
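To see records and fields in action, here is a small, hedged example; the file employees.txt and its whitespace-separated column layout are assumed for illustration:

# $1 is the first field, $NF is the last field, and NF holds the
# number of fields in the current record (one record per input line).
awk '{ print "First:", $1, "| Last:", $NF, "| Fields:", NF }' employees.txt

Each input line becomes one record, and the program body runs once per record, with the fields already split for you.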
The following flowchart depicts the workflow:

We introduced you to the AWK programming language and gave you a quick primer to get started with application development. If you found this post useful, do check out the book Learning AWK Programming to learn more about the intricacies of the AWK programming language for text processing.

The oldest programming languages in use today
What is the difference between functional and object oriented programming?
Systems programming with Go in UNIX and Linux

How to work with classes in Typescript

Amey Varangaonkar
15 May 2018
8 min read
If we are developing any application using TypeScript, be it a small-scale or a large-scale application, we will use classes to manage our properties and methods. Prior to ES 2015, JavaScript did not have the concept of classes, and we used functions to create class-like behavior. TypeScript introduced classes as part of its initial release, and now we have classes in ES6 as well. The behavior of classes in TypeScript and JavaScript ES6 closely resembles the behavior of any object-oriented language that you might have worked on, such as C#. This excerpt is taken from the book TypeScript 2.x By Example written by Sachin Ohri.

Object-oriented programming in TypeScript

Object-oriented programming allows us to represent our code in the form of objects, which themselves are instances of classes holding properties and methods. Classes form the container of related properties and their behavior. Modeling our code in the form of classes allows us to achieve various features of object-oriented programming, which helps us write more intuitive, reusable, and robust code. Features such as encapsulation, polymorphism, and inheritance are the result of implementing classes. TypeScript, with its implementation of classes and interfaces, allows us to write code in an object-oriented fashion. This allows developers coming from traditional languages, such as Java and C#, to feel right at home when learning TypeScript.

Understanding classes

Prior to ES 2015, JavaScript developers did not have any concept of classes; the best way they could replicate the behavior of classes was with functions. Functions provide a mechanism to group together related properties and methods. Methods can be added either internally to the function or by using the prototype property. The following is an example of such a function:

function Name (firstName, lastName) { this.firstName = firstName; this.lastName = lastName; this.fullName = function() { return this.firstName + ' ' + this.lastName ; }; }

In the preceding example, we have the fullName method encapsulated inside the Name function. Another way of adding methods to functions is shown in the following code snippet, with the prototype property:

function Name (firstName, lastName) { this.firstName = firstName; this.lastName = lastName; } Name.prototype.fullName = function() { return this.firstName + ' ' + this.lastName ; };

These features of functions did solve most of the issues of not having classes, but most of the dev community has not been comfortable with these approaches. Classes make this process easier. Classes provide an abstraction on top of common behavior, thus making code reusable. The following is the syntax for defining a class in TypeScript:

The syntax of the class should look very similar to readers who come from an object-oriented background. To define a class, we use the class keyword followed by the name of the class. The News class has three member properties and one method. Each member has a type assigned to it and has an access modifier to define its scope. On line 10, we create an object of the class with the new keyword. Classes in TypeScript also have the concept of a constructor, where we can initialize some properties at the time of object creation.

Access modifiers

Once the object is created, we can access the public members of the class with the dot operator. Note that we cannot access the author property with the espn object because this property is defined as private. TypeScript provides three types of access modifiers.
Public

Any property defined with the public keyword will be freely accessible outside the class. As we saw in the previous example, all the variables marked with the public keyword were available outside the class in an object. Note that TypeScript assigns public as the default access modifier if we do not assign any explicitly. This is because the default JavaScript behavior is to have everything public.

Private

When a property is marked as private, it cannot be accessed outside of the class. The scope of a private variable is only inside the class when using TypeScript. In JavaScript, as we do not have access modifiers, private members are treated similarly to public members.

Protected

The protected keyword behaves similarly to private, with the exception that protected variables can be accessed in derived classes. The following is one such example:

class base { protected id: number; } class child extends base { name: string; details(): string { return `${this.name} has id: ${this.id}`; } }

In the preceding code, the child class extends the base class and has access to the id property inside the child class. If we create an object of the child class, we will still not have access to the id property outside.

Readonly

As the name suggests, a property with the readonly access modifier cannot be modified after a value has been assigned to it. A value can be assigned to a readonly property only at the time of declaration or in the constructor. In the preceding code, line 5 gives an error stating that the property name is readonly and cannot be assigned a value.

Transpiled JavaScript from classes

While learning TypeScript, it is important to remember that TypeScript is a superset of JavaScript and not a new language on its own. Browsers can only understand JavaScript, so it is important for us to understand the JavaScript that is transpiled by TypeScript. TypeScript provides an option to generate JavaScript based on the ECMA standards. You can configure TypeScript to transpile into ES5 or ES6 (ES 2015), and even ES3 JavaScript, by using the target flag in the tsconfig.json file. The biggest difference between ES5 and ES6 is with regard to the class, let, and const keywords, which were introduced in ES6. Even though ES6 has been around for more than a year, most browsers still do not have full support for it. So, if you are creating an application that would target older browsers as well, consider having the target as ES5. The JavaScript that's generated will be different based on the target setting. Here, we will take an example of a class in TypeScript and generate JavaScript for both ES5 and ES6. The following is the class definition in TypeScript:

This is the same code that we saw when we introduced classes in the Understanding classes section. Here, we have a class named News that has three members, two of which are public and one private. The News class also has a format method, which returns a string concatenated from the member variables. Then, we create an object of the News class in line 10 and assign values to the public properties. In the last line, we call the format method to print the result. Now let's look at the JavaScript transpiled by the TypeScript compiler for this class.

ES6 JavaScript

ES6, also known as ES 2015, is the latest version of JavaScript, which provides many new features on top of ES5. Classes are one such feature; JavaScript did not have classes prior to ES6.
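To ground the discussion, here is a hedged reconstruction of the News class and the kind of ES6 output the compiler produces for it. The member names other than author, and the values assigned, are assumptions based on the description of three members (two public, one private):

class News {
  public channelNumber: number;     // assumed member name
  public channelName: string;       // assumed member name
  private author: string = 'Packt'; // private member, initialized at declaration

  format(): string {
    return `${this.channelName} (${this.channelNumber}) by ${this.author}`;
  }
}

let espn = new News();
espn.channelNumber = 30;            // assumed value
espn.channelName = 'ESPN';          // assumed value
console.log(espn.format());

With the target set to ES6, the transpiled output for a class of this shape would look something along these lines:

class News {
  constructor() {
    this.author = 'Packt';          // private initialization moves into the constructor
  }
  format() {
    return `${this.channelName} (${this.channelNumber}) by ${this.author}`;
  }
}
let espn = new News();
espn.channelNumber = 30;
espn.channelName = 'ESPN';
console.log(espn.format());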
If you compare the generated ES6 code with the TypeScript code, you will notice minor differences. This is because classes in TypeScript and JavaScript are similar, with types and access modifiers being the additions in TypeScript. In JavaScript, we do not have the concept of declaring public members. The author variable, which was defined as private and was initialized at its declaration, is converted to a constructor initialization in JavaScript. If we had not initialized author, then the produced JavaScript would not have added author to the constructor.

ES5 JavaScript

ES5 is the most popular JavaScript version supported in browsers, and if you are developing an application that has to support the majority of browser versions, then you need to transpile your code to the ES5 version. This version of JavaScript does not have classes, and hence the transpiled code converts classes to functions, and methods inside the classes are converted to prototypically defined methods on the functions. The following is the code transpiled when we have the target set as ES5 in the TypeScript compiler options:

As discussed earlier, the basic difference is that the class is converted to a function. The interesting aspect of this conversion is that the News class is converted to an immediately invoked function expression (IIFE). An IIFE can be identified by the parentheses at the end of the function declaration, as we see in line 9 in the preceding code snippet. IIFEs cause the function to be executed immediately and help to maintain the correct scope of the function, rather than declaring the function in the global scope. Another difference is how the format method is defined in the ES5 JavaScript. The prototype property is used to add the additional behavior to the function, which we see here. A couple of other differences you may have noticed include the change of the let keyword to var, as let is not supported in ES5. All variables in ES5 are defined with the var keyword. Also, the format method now does not use a template string, but standard string concatenation to print the output. TypeScript does a good job of transpiling the code to JavaScript while following recommended practices. This helps in making sure we have robust and reusable code with minimum error cases.

If you found this tutorial useful, make sure you check out the book TypeScript 2.x By Example for more hands-on tutorials on how to effectively leverage the power of TypeScript to develop and deploy state-of-the-art web applications.

How to install and configure TypeScript
Understanding Patterns and Architectures in TypeScript
Writing SOLID JavaScript code with TypeScript

How to install and configure TypeScript

Amey Varangaonkar
11 May 2018
9 min read
In this tutorial, we will look at the installation process of TypeScript and the editor setup for TypeScript development. Microsoft does well in providing easy-to-perform steps to install TypeScript on all platforms, namely Windows, macOS, and Linux.

The following excerpt is taken from the book TypeScript 2.x By Example written by Sachin Ohri. This book presents hands-on examples and projects to learn the fundamental concepts of the popular TypeScript programming language.

Installation of TypeScript

TypeScript's official website is the best source for installing the latest version. On the website, go to the Download section. There, you will find details on how to install TypeScript. Node.js and Visual Studio are the two most common ways to get it. The same page also lists a host of other supported editors and the plugins available for them. We will be installing TypeScript using Node.js and using Visual Studio Code as our primary editor. You can use any editor of your choice and still be able to run the applications seamlessly.

If you use full-blown Visual Studio as your primary development IDE, then you can use either of the links, Visual Studio 2017 or Visual Studio 2013, to download the TypeScript SDK. Visual Studio does come with a TypeScript compiler, but it's better to install it from this link so as to get the latest version.

To install TypeScript using Node.js, we will use npm (the Node package manager), which comes with Node.js. Node.js is a popular JavaScript runtime for building and running server-side JavaScript applications. As TypeScript compiles into JavaScript, Node is an ideal fit for developing server-side applications in the TypeScript language. As mentioned on the website, just running the following command in the Terminal (on macOS) / Command Prompt (on Windows) window will install the latest version:

npm install -g typescript

To install any package with npm, the command starts with npm install; the -g flag indicates that we are installing the package globally. The last parameter is the name of the package that we are installing. Once it is installed, you can check the version of TypeScript by running the following command in the Terminal window:

tsc -v

You can use the following command to get help for all the other options that are available with tsc:

tsc -h

TypeScript editors

One of the outstanding features of TypeScript is its support for editors. All the editors provide support for language services, thereby providing features such as IntelliSense, statement completion, and error highlighting. If you are coming from a .NET background, then Visual Studio 2013/2015/2017 is a good option for you. Visual Studio does not require any configuration and it's easy to start using TypeScript. As we discussed earlier, just install the SDK and you are good to go. If you are from a Java background, TypeScript supports Eclipse as well. It also supports plugins for Sublime, WebStorm, and Atom, and each of these provides a rich set of features.

Visual Studio Code (VS Code) is another good option for an IDE. It's a smaller, lighter version of Visual Studio, primarily used for web application development. VS Code is lightweight and cross-platform, capable of running on Windows, Linux, and macOS. It has an ever-increasing set of plugins to help you write better code, such as TSLint, a static analysis tool that checks TypeScript code for readability, maintainability, and errors.
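If you want to try TSLint from the command line in addition to the VS Code extension, it can be installed globally with npm. This is a hedged sketch; the src folder layout is an assumption:

npm install -g tslint      # install the TSLint CLI globally
tslint --init              # generate a default tslint.json in the current folder
tslint 'src/**/*.ts'       # lint all TypeScript files under src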
VS Code has a compelling case to be the default IDE for all sorts of web application development. In this post, we will briefly look at the Visual Studio and VS Code setup for TypeScript.

Visual Studio

Visual Studio is a full-blown IDE provided by Microsoft for all .NET-based development, but now Visual Studio also has excellent support for TypeScript with built-in project templates. A TypeScript compiler is integrated into Visual Studio to allow automatic transpiling of code to JavaScript. Visual Studio also has the TypeScript language service integrated to provide IntelliSense and design-time error checking, among other things. With Visual Studio, creating a project with a TypeScript file is as simple as adding a new file with a .ts extension. Visual Studio will provide all the features out of the box.

VS Code

VS Code is a lightweight IDE from Microsoft used for web application development. VS Code can be installed on Windows, macOS, and Linux-based systems. VS Code can recognize the different types of code files and comes with a huge set of extensions to help in development. You can install VS Code from https://code.visualstudio.com/download. VS Code comes with an integrated TypeScript compiler, so we can start creating projects directly. The following screenshot shows a TypeScript file opened in VS Code:

To run the project in VS Code, we need a task runner. VS Code supports multiple task runners which can be configured for the project, such as Gulp, Grunt, and TypeScript. We will be using the TypeScript task runner for our build. VS Code has a Command Palette which allows you to access various features, such as Build Task, Themes, Debug options, and so on. To open the Command Palette, use Ctrl + Shift + P on a Windows machine or Cmd + Shift + P on macOS. In the Command Palette, type Build, as shown in the following screenshot, which will show the command to build the project:

When the command is selected, VS Code shows an alert, No build task defined..., as follows:

We select Configure Build Task and, from all the available options as shown in the following screenshot, choose TypeScript build:

This creates a new folder in your project, .vscode, and a new file, tasks.json. This JSON file is used to create the task that will be responsible for compiling TypeScript code in VS Code. TypeScript needs another JSON file (tsconfig.json) to be able to configure compiler options. Every time we run the code, tsc will look for a file with this name and use it to configure itself. TypeScript is extremely flexible in transpiling the code to JavaScript as per developer requirements, and this is achieved by configuring the compiler options of TypeScript.

TypeScript compiler

The TypeScript compiler is called tsc and is responsible for transpiling TypeScript code to JavaScript. The TypeScript compiler is also cross-platform, supported on Windows, macOS, and Linux. To run the TypeScript compiler, there are a couple of options. One is to integrate the compiler in your editor of choice, as explained in the previous section, where we also integrated the TypeScript compiler with VS Code, which allowed us to build our code from the editor itself. All the compiler configurations that we would want to use are added to the tsconfig.json file. Another option is to use tsc directly from the command line / Terminal window. TypeScript's tsc command takes compiler configuration options as parameters and compiles code into JavaScript.
For example, create a simple TypeScript file in Notepad and add the following lines of code to it. To create a file as a TypeScript file, we just need to make sure the file extension is *.ts:

class Editor { constructor(public name: string, public isTypeScriptCompatible: Boolean) {} details() { console.log('Editor: ' + this.name + ', TypeScript installed: ' + this.isTypeScriptCompatible); } } class VisualStudioCode extends Editor { public OSType: string; constructor(name: string, isTypeScriptCompatible: Boolean, OSType: string) { super(name, isTypeScriptCompatible); this.OSType = OSType; } } let VS = new VisualStudioCode('VSCode', true, 'all'); VS.details();

This is the same code example we used in the TypeScript features section of this chapter. Save this file as app.ts (you can give it any name you want, as long as the extension of the file is *.ts). In the command line / Terminal window, navigate to the path where you have saved this file and run the following command:

tsc app.ts

This command will build the code and transpile it into JavaScript. The JavaScript file is saved in the same location as the TypeScript file. If there are any build issues, tsc will show these messages on the command line only. As you can imagine, running the tsc command manually for medium- to large-scale projects is not a productive approach. Hence, we prefer to use an editor that has TypeScript integrated. The following table shows the most commonly used TypeScript compiler configurations. We will be discussing these in detail in upcoming chapters:

allowUnusedLabels (boolean): By default, this flag is false. This option tells the compiler to flag unused labels.
alwaysStrict (boolean): By default, this flag is false. When turned on, this will cause the compiler to compile in strict mode and emit use strict in the source file.
module (string): Specify module code generation: None, CommonJS, AMD, System, UMD, ES6, or ES2015.
moduleResolution (string): Determines how the module is resolved.
noImplicitAny (boolean): This property allows an error to be raised if there is any code which implies the any data type. This flag is recommended to be turned off if you are migrating a JavaScript project to TypeScript in an incremental manner.
noImplicitReturns (boolean): Default value is false; raises an error if not all code paths return a value.
noUnusedLocals (boolean): Reports an error if there are any unused locals in the code.
noUnusedParameters (boolean): Reports an error if there are any unused parameters in the code.
outDir (string): Redirects output structure to the directory.
outFile (string): Concatenates and emits output to a single file. The order of concatenation is determined by the list of files passed to the compiler on the command line along with triple-slash references and imports. See the output file order documentation for more details.
removeComments (boolean): Removes all comments except copyright header comments beginning with /*!.
sourceMap (boolean): Generates the corresponding .map file.
target (string): Specifies the ECMAScript target version: ES3 (default), ES5, ES6/ES2015, ES2016, ES2017, or ESNext.
watch: Runs the compiler in watch mode. Watches input files and triggers recompilation on changes.

We saw it is quite easy to set up and configure TypeScript, and we are now ready to get started with our first application! To learn more about writing and compiling your first TypeScript application, make sure you check out the book TypeScript 2.x By Example.
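As a quick illustration of how a few of these options fit together, a minimal tsconfig.json might look like the following; the values shown are illustrative assumptions, not recommendations from the book:

{
  "compilerOptions": {
    "target": "es5",
    "module": "commonjs",
    "outDir": "dist",
    "sourceMap": true,
    "noImplicitAny": true,
    "noUnusedLocals": true
  },
  "include": ["src/**/*"]
}

With this file in the project root, running tsc with no arguments compiles everything under src into the dist folder.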
Introduction to TypeScript
Introducing Object Oriented Programming with TypeScript
Elm and TypeScript – Static typing on the Frontend

8 recipes to master Promises in ECMAScript 2018

Richa Tripathi
08 May 2018
17 min read
What are Promises in ECMAScript?

In earlier versions of JavaScript, the callback pattern was the most common way to organize asynchronous code. It got the job done, but it didn't scale well. With callbacks, as more asynchronous functions are added, the code becomes more deeply nested, and it becomes more difficult to add to, refactor, and understand the code. This situation is commonly known as callback hell. Promises were introduced to improve on this situation. Promises allow the relationships of asynchronous operations to be rearranged and organized with more freedom and flexibility. In this context, today we will learn about Promises and how to use them to create and organize asynchronous functions. We will also explore how to handle error conditions.

Creating and waiting for Promises

Promises provide a way to compose and combine asynchronous functions in an organized and easier-to-read way. This recipe demonstrates a very basic usage of promises. All the recipes given below assume that you already have a workspace that allows you to create and run ES modules in your browser.

How to do it...

Open your command-line application and navigate to your workspace. Create a new folder named 03-01-creating-and-waiting-for-promises. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file that creates a promise and logs messages before and after the promise is created, as well as while the promise is executing and after it has been resolved:

// main.js export function main () { console.log('Before promise created'); new Promise(function (resolve) { console.log('Executing promise'); resolve(); }).then(function () { console.log('Finished promise'); }); console.log('After promise created'); }

Start your Python web server and open the following link in your browser: http://localhost:8000/. You will see the following output:

How it works...

By looking at the order of the log messages, you can clearly see the order of operations. First, the initial log is executed. Next, the promise is created with an executor method. The executor method takes resolve as an argument. The resolve function fulfills the promise. Promises adhere to an interface named thenable. This means that we can chain then callbacks. The callback we attached with this method is executed after the resolve function is called. This function executes asynchronously (not immediately after the Promise has been resolved). Finally, there is a log after the promise has been created.

The order in which the log messages appear reveals the asynchronous nature of the code. All of the logs are seen in the order they appear in the code, except the Finished promise message. That function is executed asynchronously after the main function has exited!

Resolving Promise results

In the previous recipe, we saw how to use promises to execute asynchronous code. However, this code is pretty basic. It just logs a message and then calls resolve. Often, we want to use asynchronous code to perform some long-running operation, then return that value. This recipe demonstrates how to use resolve in order to return the result of a long-running operation.

How to do it...

Open your command-line application and navigate to your workspace. Create a new folder named 3-02-resolving-promise-results. Copy or create an index.html that loads and runs a main function from main.js.
Create a main.js file that creates a promise and logs messages before and after the promise is created:

// main.js export function main () { console.log('Before promise created'); new Promise(function (resolve) { }); console.log('After promise created'); }

Within the promise, resolve a random number after a 5-second timeout:

new Promise(function (resolve) { setTimeout(function () { resolve(Math.random()); }, 5000); })

Chain a then call off the promise. Pass a function that logs out the value of its only argument:

new Promise(function (resolve) { setTimeout(function () { resolve(Math.random()); }, 5000); }).then(function (result) { console.log('Long running job returned: %s', result); });

Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output:

How it works...

Just as in the previous recipe, the promise was not fulfilled until resolve was executed (this time after 5 seconds). This time, however, when we called resolve, we passed a random number as an argument. When this happens, the argument is provided to the callback for the subsequent then function. We'll see in future recipes how this can be continued to create promise chains.

Rejecting Promise errors

In the previous recipe, we saw how to use resolve to provide a result from a successfully fulfilled promise. Unfortunately, the code doesn't always run as expected. Network connections can be down, data can be corrupted, and uncountable other errors can occur. We need to be able to handle those situations as well. This recipe demonstrates how to use reject when errors arise.

How to do it...

Open your command-line application and navigate to your workspace. Create a new folder named 3-03-rejecting-promise-errors. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file that creates a promise, and logs messages before and after the promise is created and when the promise is fulfilled:

new Promise(function (resolve) { resolve(); }).then(function (result) { console.log('Promise Completed'); });

Add a second argument to the promise callback named reject, and call reject with a new error:

new Promise(function (resolve, reject) { reject(new Error('Something went wrong')); }).then(function (result) { console.log('Promise Completed'); });

Chain a catch call off the promise. Pass a function that logs out its only argument:

new Promise(function (resolve, reject) { reject(new Error('Something went wrong')); }).then(function (result) { console.log('Promise Completed'); }).catch(function (error) { console.error(error); });

Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output:

How it works...

Previously, we saw how to use resolve to return a value in the case of a successful fulfillment of a promise. In this case, we called reject before resolve. This means that the Promise finished with an error before it could resolve. When the Promise completes in an error state, the then callbacks are not executed. Instead, we have to use catch in order to receive the error that the Promise rejects. You'll also notice that the catch callback is only executed after the main function has returned. Like successful fulfillment, listeners to unsuccessful ones execute asynchronously.
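A small side note worth illustrating: a promise ends up rejected either when reject is called or when an exception is thrown inside the executor; both routes trigger the same catch. A quick hedged sketch:

// Route 1: explicit rejection
new Promise(function (resolve, reject) {
  reject(new Error('Rejected explicitly'));
}).catch(function (error) {
  console.error(error.message); // "Rejected explicitly"
});

// Route 2: an exception thrown in the executor also rejects the promise
new Promise(function () {
  throw new Error('Thrown from executor');
}).catch(function (error) {
  console.error(error.message); // "Thrown from executor"
});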
See also

Handling errors with Promise.catch
Simulating finally with Promise.then

Chaining Promises

So far in this article, we've seen how to use promises to run single asynchronous tasks. This is helpful but doesn't provide a significant improvement over the callback pattern. The real advantage that promises offer comes when they are composed. In this recipe, we'll use promises to combine asynchronous functions in series.

How to do it...

Open your command-line application and navigate to your workspace. Create a new folder named 3-04-chaining-promises. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file that creates a promise. Resolve a random number from the promise:

new Promise(function (resolve) { resolve(Math.random()); });

Chain a then call off of the promise. Return true from the callback if the random value is greater than or equal to 0.5:

new Promise(function (resolve, reject) { resolve(Math.random()); }).then(function (value) { return value >= 0.5; });

Chain a final then call after the previous one. Log out a different message if the argument is true or false:

new Promise(function (resolve, reject) { resolve(Math.random()); }).then(function (value) { return value >= 0.5; }).then(function (isReadyForLaunch) { if (isReadyForLaunch) { console.log('Start the countdown! '); } else { console.log('Abort the mission. '); } });

Start your Python web server and open the following link in your browser: http://localhost:8000/. If you are lucky, you'll see the following output:

If you are unlucky, you'll see the following output:

How it works...

We've already seen how to use then to wait for the result of a promise. Here, we are doing the same thing multiple times in a row. This is called a promise chain. After the promise chain is started with the new promise, all of the subsequent links in the promise chain return promises as well. That is, the return value of each then callback is resolved like another promise.

See also

Using Promise.all to resolve multiple Promises
Handling errors with Promise.catch
Simulating finally with a final Promise.then call

Starting a Promise chain with Promise.resolve

In this article's preceding recipes, we've been creating new promise objects with the constructor. This gets the job done, but it creates a problem. The first callback in the promise chain has a different shape than the subsequent callbacks. In the first callback, the arguments are the resolve and reject functions that trigger the subsequent then or catch callbacks. In subsequent callbacks, the returned value is propagated down the chain, and thrown errors are captured by catch callbacks. This difference adds mental overhead. It would be nice to have all of the functions in the chain behave in the same way. In this recipe, we'll see how to use Promise.resolve to start a promise chain.

How to do it...

Open your command-line application and navigate to your workspace. Create a new folder named 3-05-starting-with-resolve. Copy or create an index.html that loads and runs a main function from main.js.
Create a main.js file that calls Promise.resolve with an empty object as the first argument:

export function main () { Promise.resolve({}) }

Chain a then call off of resolve, and attach rocket boosters to the passed object:

export function main () { Promise.resolve({}).then(function (rocket) { console.log('attaching boosters'); rocket.boosters = [{ count: 2, fuelType: 'solid' }, { count: 1, fuelType: 'liquid' }]; return rocket; }) }

Add a final then call to the chain that lets you know when the boosters have been added:

export function main () { Promise.resolve({}) .then(function (rocket) { console.log('attaching boosters'); rocket.boosters = [{ count: 2, fuelType: 'solid' }, { count: 1, fuelType: 'liquid' }]; return rocket; }) .then(function (rocket) { console.log('boosters attached'); console.log(rocket); }) }

Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output:

How it works...

Promise.resolve creates a new promise that resolves the value passed to it. The subsequent then method will receive that resolved value as its argument. This method can seem a little roundabout but can be very helpful for composing asynchronous functions. In effect, the constituents of the promise chain don't need to be aware that they are in the chain (including the first step). This makes transitioning from code that doesn't use promises to code that does much easier.

Using Promise.all to resolve multiple promises

So far, we've seen how to use promises to perform asynchronous operations in sequence. This is useful when the individual steps are long-running operations. However, this might not always be the most efficient configuration. Quite often, we can perform multiple asynchronous operations at the same time. In this recipe, we'll see how to use Promise.all to start multiple asynchronous operations, without waiting for the previous one to complete.

How to do it...

Open your command-line application and navigate to your workspace. Create a new folder named 3-06-using-promise-all. Copy or create an index.html that loads and runs a main function from main.js.
Create a main.js file that creates an object named rocket, and calls Promise.all with an empty array as the first argument:

export function main() { console.log('Before promise created'); const rocket = {}; Promise.all([]) console.log('After promise created'); }

Create a function named addBoosters that adds boosters to an object:

function addBoosters (rocket) { console.log('attaching boosters'); rocket.boosters = [{ count: 2, fuelType: 'solid' }, { count: 1, fuelType: 'liquid' }]; return rocket; }

Create a function named performGuidanceDiagnostic that returns a promise of a successfully completed task:

function performGuidanceDiagnostic (rocket) { console.log('performing guidance diagnostic'); return new Promise(function (resolve) { setTimeout(function () { console.log('guidance diagnostic complete'); rocket.guidanceDiagnostic = 'Completed'; resolve(rocket); }, 2000); }); }

Create a function named loadCargo that adds a payload to the cargoBay:

function loadCargo (rocket) { console.log('loading satellite'); rocket.cargoBay = [{ name: 'Communication Satellite' }] return rocket; }

Use Promise.resolve to pass the rocket object to these functions within Promise.all:

export function main() { console.log('Before promise created'); const rocket = {}; Promise.all([ Promise.resolve(rocket).then(addBoosters), Promise.resolve(rocket).then(performGuidanceDiagnostic), Promise.resolve(rocket).then(loadCargo) ]); console.log('After promise created'); }

Attach a then call to the chain and log that the rocket is ready for launch:

const rocket = {}; Promise.all([ Promise.resolve(rocket).then(addBoosters), Promise.resolve(rocket).then(performGuidanceDiagnostic), Promise.resolve(rocket).then(loadCargo) ]).then(function (results) { console.log('Rocket ready for launch'); console.log(results); });

Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output:

How it works...

Promise.all is similar to Promise.resolve; the arguments are resolved as promises. The difference is that instead of a single result, Promise.all accepts an iterable argument, each member of which is resolved individually. In the preceding example, you can see that each of the promises is initiated immediately. Two of them are able to complete while performGuidanceDiagnostic continues. The promise returned by Promise.all is fulfilled when all the constituent promises have been resolved. The results of the promises are combined into an array and propagated down the chain. You can see that three references to rocket are packed into the results argument. And you can see that the operations of each promise have been performed on the resulting object.

There's more

As you may have guessed, the constituent promises don't have to resolve to the same value. This can be useful, for example, when performing multiple independent network requests. The index of the result for each promise corresponds to the index of the operation within the argument to Promise.all.
In these cases, it can be useful to use array destructuring to name the arguments of the then callback:

Promise.all([ findAstronomers, findAvailableTechnicians, findAvailableEquipment ]).then(function ([astronomers, technicians, equipment]) { // use results for astronomers, technicians, and equipment });

Handling errors with Promise.catch

In a previous recipe, we saw how to fulfill a promise with an error state using reject, and we saw that this triggers the next catch callback in the promise chain. Because promises are relatively easy to compose, we need to be able to handle errors that are reported in different ways. Luckily, promises are able to handle this seamlessly. In this recipe, we'll see how Promise.catch can handle errors that are reported by being thrown or through rejection.

How to do it...

Open your command-line application and navigate to your workspace. Create a new folder named 3-07-handle-errors-promise-catch. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file with a main function that creates an object named rocket:

export function main() { console.log('Before promise created'); const rocket = {}; console.log('After promise created'); }

Create a function addBoosters that throws an error:

function addBoosters (rocket) { throw new Error('Unable to add Boosters'); }

Create a function performGuidanceDiagnostic that returns a promise that rejects an error:

function performGuidanceDiagnostic (rocket) { return new Promise(function (resolve, reject) { reject(new Error('Unable to finish guidance diagnostic')); }); }

Use Promise.resolve to pass the rocket object to these functions, and chain a catch off each of them:

export function main() { console.log('Before promise created'); const rocket = {}; Promise.resolve(rocket).then(addBoosters) .catch(console.error); Promise.resolve(rocket).then(performGuidanceDiagnostic) .catch(console.error); console.log('After promise created'); }

Start your Python web server and open the following link in your browser: http://localhost:8000/. You should see the following output:

How it works...

As we saw before, when a promise is fulfilled in a rejected state, the callback of the catch function is triggered. In the preceding recipe, we see that this can happen when the reject method is called (as with performGuidanceDiagnostic). It also happens when a function in the chain throws an error (as with addBoosters). This has a similar benefit to how Promise.resolve can normalize asynchronous functions. This handling allows asynchronous functions to remain unaware of the promise chain and to announce error states in a way that is familiar to developers who are new to promises. This makes expanding the use of promises much easier.
Simulating finally with the promise API

In a previous recipe, we saw how catch can be used to handle errors, whether a promise has rejected, or a callback has thrown an error. Sometimes, it is desirable to execute code whether or not an error state has been detected. In the context of try/catch blocks, the finally block can be used for this purpose. We have to do a little more work to get the same behavior when working with promises. In this recipe, we'll see how to use a final then call to execute some code in both successful and failing fulfillment states.

How to do it...

Open your command-line application and navigate to your workspace. Create a new folder named 3-08-simulating-finally. Copy or create an index.html that loads and runs a main function from main.js. Create a main.js file with a main function that logs out messages for before and after promise creation:

export function main() { console.log('Before promise created'); console.log('After promise created'); }

Create a function named addBoosters that throws an error if its first parameter is false:

function addBoosters(shouldFail) { if (shouldFail) { throw new Error('Unable to add Boosters'); } return { boosters: [{ count: 2, fuelType: 'solid' }, { count: 1, fuelType: 'liquid' }] }; }

Use Promise.resolve to pass a Boolean value that is true if a random number is greater than 0.5 to addBoosters:

export function main() { console.log('Before promise created'); Promise.resolve(Math.random() > 0.5) .then(addBoosters) console.log('After promise created'); }

Add a then function to the chain that logs a success message:

export function main() { console.log('Before promise created'); Promise.resolve(Math.random() > 0.5) .then(addBoosters) .then(() => console.log('Ready for launch: ')) console.log('After promise created'); }

Add a catch to the chain and log out the error if thrown:

export function main() { console.log('Before promise created'); Promise.resolve(Math.random() > 0.5) .then(addBoosters) .then(() => console.log('Ready for launch: ')) .catch(console.error) console.log('After promise created'); }

Add a then after the catch, and log out that we need to make an announcement:

export function main() { console.log('Before promise created'); Promise.resolve(Math.random() > 0.5) .then(addBoosters) .then(() => console.log('Ready for launch: ')) .catch(console.error) .then(() => console.log('Time to inform the press.')); console.log('After promise created'); }

Start your Python web server and open the following link in your browser: http://localhost:8000/. If you are lucky and the boosters are added successfully, you'll see the following output:

If you are unlucky, you'll see an error message like the following:

How it works...

We can see in the preceding output that whether or not the asynchronous function completes in an error state, the last then callback is executed. This is possible because the catch method doesn't stop the promise chain. It simply catches any error states from the previous links in the chain, and then propagates a new value forward. This catch thus protects the final then from being bypassed by an error state. And so, regardless of the fulfillment state of prior links in the chain, we can be sure that the callback of this final then will be executed.

To summarize, we learned how to use the Promise API to organize asynchronous programs. We also looked at how to propagate results through promise chains and handle errors. You read an excerpt from a book written by Ross Harrison, titled ECMAScript Cookbook. It’s a complete guide on how to become a better web programmer by writing efficient and modular code using ES6 and ES8.

What’s new in ECMAScript 2018 (ES9)?
ECMAScript 7 – What to expect?
Modular Programming in ECMAScript 6

Applying Single Responsibility principle from SOLID in .NET Core

Aaron Lazar
07 May 2018
7 min read
In today's tutorial, we'll learn how to apply the Single Responsibility principle from the SOLID principles to .NET Core applications. This brings us to an interesting concept in OOP called the SOLID design principles. These design principles can be applied to any OOP design and are intended to make software easier to understand, more flexible, and easily maintainable.

This article is an extract from the book C# 7 and .NET Core Blueprints, authored by Dirk Strauss and Jas Rademeyer. The book is a step-by-step guide that will teach you the essential .NET Core and C# concepts with the help of real-world projects.

The term SOLID is a mnemonic for:

Single responsibility principle
Open/closed principle
Liskov substitution principle
Interface segregation principle
Dependency inversion principle

In this article, we will take a look at the first of the principles—the single responsibility principle.

Single responsibility principle

Simply put, a module or class should have the following characteristics only:

It should do one single thing and only have a single reason to change
It should do its one single thing well
The functionality provided needs to be entirely encapsulated by that class or module

What is meant when saying that a module must be responsible for a single thing? The Google definition of a module is: "Each of a set of standardized parts or independent units that can be used to construct a more complex structure, such as an item of furniture or a building."

From this, we can understand that a module is a simple building block. It can be used or reused to create something bigger and more complex when used with other modules. In C#, therefore, the module does closely resemble a class, but I will go so far as to say that a module can also be extended to be a method. The function that the class or module performs can only be one thing. That is to say that it has a narrow responsibility. It is not concerned with anything else other than doing that one thing it was designed to do.

If we had to apply the single responsibility principle to a person, then that person would be only a software developer, for example. But what if a software developer was also a doctor and a mechanic and a school teacher? Would that person be effective in any of those roles? That would contravene the single responsibility principle. The same is true for code.

Having a look at our AllRounder and Batsman classes, you will notice that in AllRounder, we have the following code:

private double CalculateStrikeRate(StrikeRate strikeRateType) { switch (strikeRateType) { case StrikeRate.Bowling: return (BowlerBallsBowled / BowlerWickets); case StrikeRate.Batting: return (BatsmanRuns * 100) / BatsmanBallsFaced; default: throw new Exception("Invalid enum"); } } public override int CalculatePlayerRank() { return 0; }

In Batsman, we have the following code:

public double BatsmanBattingStrikeRate => (BatsmanRuns * 100) / BatsmanBallsFaced; public override int CalculatePlayerRank() { return 0; }

Using what we have learned about the single responsibility principle, we notice that there is an issue here. To illustrate the problem, let's compare the code side by side:

We are essentially repeating code in the Batsman and AllRounder classes. This doesn't really bode well for single responsibility, does it? I mean, the one principle is that a class must only have a single function to perform.
At the moment, both the Batsman and AllRounder classes are taking care of calculating strike rates. They also both take care of calculating the player rank. They even both have exactly the same code for calculating the strike rate of a batsman! The problem comes in when the strike rate calculation changes (not that it easily would, but let's assume it does). We now know that we have to change the calculation in both places. As soon as the developer changes one calculation and not the other, a bug is introduced into our application.

Let's simplify our classes. In the BaseClasses folder, create a new abstract class called Statistics. The code should look as follows:

namespace cricketScoreTrack.BaseClasses { public abstract class Statistics { public abstract double CalculateStrikeRate(Player player); public abstract int CalculatePlayerRank(Player player); } }

In the Classes folder, create a new derived class called PlayerStatistics (that is to say, it inherits from the Statistics abstract class). The code should look as follows:

using cricketScoreTrack.BaseClasses; using System; namespace cricketScoreTrack.Classes { public class PlayerStatistics : Statistics { public override int CalculatePlayerRank(Player player) { return 1; } public override double CalculateStrikeRate(Player player) { switch (player) { case AllRounder allrounder: return (allrounder.BowlerBallsBowled / allrounder.BowlerWickets); case Batsman batsman: return (batsman.BatsmanRuns * 100) / batsman.BatsmanBallsFaced; default: throw new ArgumentException("Incorrect argument supplied"); } } } }

You will see that the PlayerStatistics class is now solely responsible for calculating player statistics for the player's rank and the player's strike rate. I have not included much of an implementation for calculating the player's rank; I have briefly commented the code for this method on GitHub to explain how a player's rank is determined. It is quite a complicated calculation and differs for batsmen and bowlers. I have therefore omitted it for the purposes of this chapter on OOP. Your Solution should now look as follows:

Swing back over to your Player abstract class and remove abstract public int CalculatePlayerRank(); from the class. In the IBowler interface, remove the double BowlerStrikeRate { get; } property. In the IBatter interface, remove the double BatsmanBattingStrikeRate { get; } property. In the Batsman class, remove public double BatsmanBattingStrikeRate and public override int CalculatePlayerRank() from the class. The code in the Batsman class will now look as follows:

using cricketScoreTrack.BaseClasses; using cricketScoreTrack.Interfaces; namespace cricketScoreTrack.Classes { public class Batsman : Player, IBatter { #region Player public override string FirstName { get; set; } public override string LastName { get; set; } public override int Age { get; set; } public override string Bio { get; set; } #endregion #region IBatsman public int BatsmanRuns { get; set; } public int BatsmanBallsFaced { get; set; } public int BatsmanMatch4s { get; set; } public int BatsmanMatch6s { get; set; } #endregion } }

Looking at the AllRounder class, remove the public enum StrikeRate { Bowling = 0, Batting = 1 } enum as well as the public double BatsmanBattingStrikeRate and public double BowlerStrikeRate properties. Lastly, remove the private double CalculateStrikeRate(StrikeRate strikeRateType) and public override int CalculatePlayerRank() methods.
The code for the AllRounder class now looks as follows (note that the economy calculation casts to double to avoid integer division):

using cricketScoreTrack.BaseClasses; using cricketScoreTrack.Interfaces; using System; namespace cricketScoreTrack.Classes { public class AllRounder : Player, IBatter, IBowler { #region Player public override string FirstName { get; set; } public override string LastName { get; set; } public override int Age { get; set; } public override string Bio { get; set; } #endregion #region IBatsman public int BatsmanRuns { get; set; } public int BatsmanBallsFaced { get; set; } public int BatsmanMatch4s { get; set; } public int BatsmanMatch6s { get; set; } #endregion #region IBowler public double BowlerSpeed { get; set; } public string BowlerType { get; set; } public int BowlerBallsBowled { get; set; } public int BowlerMaidens { get; set; } public int BowlerWickets { get; set; } public double BowlerEconomy => (double)BowlerRunsConceded / BowlerOversBowled; public int BowlerRunsConceded { get; set; } public int BowlerOversBowled { get; set; } #endregion } }

Looking back at our AllRounder and Batsman classes, the code is clearly simplified. It is definitely more flexible and is starting to look like a well-constructed set of classes. Give your solution a rebuild and make sure that it is all working.

So now you know how to simplify your .NET Core applications by applying the Single Responsibility principle. If you found this tutorial helpful and you'd like to learn more, go ahead and pick up the book C# 7 and .NET Core Blueprints, authored by Dirk Strauss and Jas Rademeyer.

What is ASP.NET Core?
How to call an Azure function from an ASP.NET Core MVC application
How to dockerize an ASP.NET Core application

How to call an Azure function from an ASP.NET Core MVC application

Aaron Lazar
03 May 2018
10 min read
In this tutorial, we'll learn how to call an Azure Function from an ASP.NET Core MVC application.

This article is an extract from the book C# 7 and .NET Core Blueprints, authored by Dirk Strauss and Jas Rademeyer. This book is a step-by-step guide that will teach you essential .NET Core and C# concepts with the help of real-world projects.

We will get started by creating an ASP.NET Core MVC application that will call our Azure Function to validate an email address entered into a login screen of the application. This application does no authentication at all. All it is doing is validating the email address entered. ASP.NET Core MVC authentication is a totally different topic and not the focus of this post.

In Visual Studio 2017, create a new project and select ASP.NET Core Web Application from the project templates. Click on the OK button to create the project. This is shown in the following screenshot:

On the next screen, ensure that .NET Core and ASP.NET Core 2.0 are selected from the drop-down options on the form. Select Web Application (Model-View-Controller) as the type of application to create. Don't bother with any kind of authentication or enabling Docker support. Just click on the OK button to create your project:

After your project is created, you will see the familiar project structure in the Solution Explorer of Visual Studio:

Creating the login form

For this next part, we can create a plain and simple vanilla login form. For a little bit of fun, let's spice things up a bit. Have a look on the internet for some free login form templates. I decided to use a site called colorlib that provided 50 free HTML5 and CSS3 login forms in one of their recent blog posts. The URL to the article is: https://colorlib.com/wp/html5-and-css3-login-forms/. I decided to use Login Form 1 by Colorlib from their site. Download the template to your computer and extract the ZIP file. Inside the extracted ZIP file, you will see that we have several folders. Copy all the folders in this extracted ZIP file (leave the index.html file, as we will use this in a minute):

Next, go to the solution for your Visual Studio application. In the wwwroot folder, move or delete the contents and paste the folders from the extracted ZIP file into the wwwroot folder of your ASP.NET Core MVC application. Your wwwroot folder should now look as follows:

Back in Visual Studio, you will see the folders when you expand the wwwroot node in the CoreMailValidation project. I also want to focus your attention on the Index.cshtml and _Layout.cshtml files. We will be modifying these files next:

Open the Index.cshtml file and remove all the markup (except the section in the curly brackets) from this file. Paste the HTML markup from the index.html file from the ZIP file we extracted earlier. Do not copy all the markup from the index.html file. Only copy the markup inside the <body></body> tags.
Your Index.cshtml file should now look as follows:

@{
    ViewData["Title"] = "Login Page";
}
<div class="limiter">
    <div class="container-login100">
        <div class="wrap-login100">
            <div class="login100-pic js-tilt" data-tilt>
                <img src="images/img-01.png" alt="IMG">
            </div>
            <form class="login100-form validate-form">
                <span class="login100-form-title">
                    Member Login
                </span>
                <div class="wrap-input100 validate-input" data-validate="Valid email is required: ex@abc.xyz">
                    <input class="input100" type="text" name="email" placeholder="Email">
                    <span class="focus-input100"></span>
                    <span class="symbol-input100">
                        <i class="fa fa-envelope" aria-hidden="true"></i>
                    </span>
                </div>
                <div class="wrap-input100 validate-input" data-validate="Password is required">
                    <input class="input100" type="password" name="pass" placeholder="Password">
                    <span class="focus-input100"></span>
                    <span class="symbol-input100">
                        <i class="fa fa-lock" aria-hidden="true"></i>
                    </span>
                </div>
                <div class="container-login100-form-btn">
                    <button class="login100-form-btn">
                        Login
                    </button>
                </div>
                <div class="text-center p-t-12">
                    <span class="txt1">
                        Forgot
                    </span>
                    <a class="txt2" href="#">
                        Username / Password?
                    </a>
                </div>
                <div class="text-center p-t-136">
                    <a class="txt2" href="#">
                        Create your Account
                        <i class="fa fa-long-arrow-right m-l-5" aria-hidden="true"></i>
                    </a>
                </div>
            </form>
        </div>
    </div>
</div>

The code for this chapter is available on GitHub here:

Next, open the _Layout.cshtml file and add all the links to the folders and files we copied into the wwwroot folder earlier, using the index.html file for reference. You will notice that the _Layout.cshtml file contains the following piece of code: @RenderBody(). This is a placeholder that specifies where the Index.cshtml file content should be injected. If you are coming from ASP.NET Web Forms, think of the _Layout.cshtml page as a master page. Your _Layout.cshtml markup should look as follows:

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>@ViewData["Title"] - CoreMailValidation</title>
    <link rel="icon" type="image/png" href="~/images/icons/favicon.ico" />
    <link rel="stylesheet" type="text/css" href="~/vendor/bootstrap/css/bootstrap.min.css">
    <link rel="stylesheet" type="text/css" href="~/fonts/font-awesome-4.7.0/css/font-awesome.min.css">
    <link rel="stylesheet" type="text/css" href="~/vendor/animate/animate.css">
    <link rel="stylesheet" type="text/css" href="~/vendor/css-hamburgers/hamburgers.min.css">
    <link rel="stylesheet" type="text/css" href="~/vendor/select2/select2.min.css">
    <link rel="stylesheet" type="text/css" href="~/css/util.css">
    <link rel="stylesheet" type="text/css" href="~/css/main.css">
</head>
<body>
    <div class="container body-content">
        @RenderBody()
        <hr />
        <footer>
            <p>© 2018 - CoreMailValidation</p>
        </footer>
    </div>
    <script src="~/vendor/jquery/jquery-3.2.1.min.js"></script>
    <script src="~/vendor/bootstrap/js/popper.js"></script>
    <script src="~/vendor/bootstrap/js/bootstrap.min.js"></script>
    <script src="~/vendor/select2/select2.min.js"></script>
    <script src="~/vendor/tilt/tilt.jquery.min.js"></script>
    <script>
        $('.js-tilt').tilt({
            scale: 1.1
        })
    </script>
    <script src="~/js/main.js"></script>
    @RenderSection("Scripts", required: false)
</body>
</html>

If everything worked out right, you will see the following page when you run your ASP.NET Core MVC application. The login form is obviously totally non-functional at this point.
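It is non-functional partly because the template's <form> tag does not post anywhere yet. When we hook the form up to the controller in the next section, it will need to target the action that performs the validation. Here is a minimal sketch of what that change could look like, assuming tag helpers are enabled and using the HomeController and the ValidateLogin action we create shortly; the asp-for attributes are my addition so that model binding maps the inputs to the LoginModel properties, and are not part of the Colorlib template:

<form class="login100-form validate-form" method="post"
      asp-controller="Home" asp-action="ValidateLogin">
    <!-- ... template markup as above ... -->
    <input class="input100" type="text" asp-for="Email" placeholder="Email">
    <!-- ... -->
    <input class="input100" type="password" asp-for="Password" placeholder="Password">
    <!-- ... -->
</form>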
However, the login form is totally responsive. If you reduce the size of your browser window, you will see the form scale with it, which is exactly what you want. If you want to explore the responsive design offered by Bootstrap, head on over to https://getbootstrap.com/ and go through the examples in the documentation.

The next thing we want to do is hook this login form up to our controller and call the Azure Function we created to validate the email address entered. Let's look at doing that next.

Hooking it all up

To simplify things, we will create a model to pass to our controller:

1. Create a new class in the Models folder of your application called LoginModel and click on the Add button. You will see the model added to the Models folder.
2. Next, add some code to the model to represent the fields on our login form. Add two properties called Email and Password:

   namespace CoreMailValidation.Models
   {
       public class LoginModel
       {
           public string Email { get; set; }
           public string Password { get; set; }
       }
   }

3. Back in the Index.cshtml view, add the model declaration to the top of the page. This makes the model available for use in our view. Take care to specify the correct namespace where the model exists:

   @model CoreMailValidation.Models.LoginModel
   @{
       ViewData["Title"] = "Login Page";
   }

4. The next portion of code needs to be written in the HomeController.cs file. Currently, it should only have an action called Index():

   public IActionResult Index()
   {
       return View();
   }

5. Add a new async function called ValidateEmail that uses the base URL and parameter string of the Azure Function URL we copied earlier and calls it with an HTTP request. I will not go into much detail here, as I believe the code is pretty straightforward. All we are doing is calling the Azure Function using the URL we copied earlier and reading the return data:

   private async Task<string> ValidateEmail(string emailToValidate)
   {
       string azureBaseUrl = "https://core-mail-validation.azurewebsites.net/api/HttpTriggerCSharp1";
       string urlQueryStringParams = $"?code=/IS4OJ3T46quiRzUJTxaGFenTeIVXyyOdtBFGasW9dUZ0snmoQfWoQ==&email={emailToValidate}";

       using (HttpClient client = new HttpClient())
       {
           using (HttpResponseMessage res = await client.GetAsync($"{azureBaseUrl}{urlQueryStringParams}"))
           {
               using (HttpContent content = res.Content)
               {
                   string data = await content.ReadAsStringAsync();
                   if (data != null)
                   {
                       return data;
                   }
                   else
                   {
                       return "";
                   }
               }
           }
       }
   }

6. Create another public async action called ValidateLogin. Inside the action, check that the ModelState is valid before continuing. For a nice explanation of what ModelState is, have a look at the following article: https://www.exceptionnotfound.net/asp-net-mvc-demystified-modelstate/. We then await the ValidateEmail function, and if the return data contains the word false, we know that the email validation failed. A failure message is then passed to the TempData property on the controller. The TempData property is a place to store data until it is read, and it is exposed on the controller by ASP.NET Core MVC. By default in ASP.NET Core 2.0, the TempData property uses a cookie-based provider to store the data. To examine data inside the TempData property without deleting it, you can use the Keep and Peek methods. To read more on TempData, see the Microsoft documentation here: https://docs.microsoft.com/en-us/aspnet/core/fundamentals/app-state?tabs=aspnetcore2x.
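Putting that description together, here is a minimal sketch of what the ValidateLogin action inside HomeController could look like. The exact message strings and the redirect back to Index are illustrative assumptions, not the book's verbatim code:

   public async Task<IActionResult> ValidateLogin(LoginModel model)
   {
       if (ModelState.IsValid)
       {
           // Call the Azure Function via the helper we created above
           string data = await ValidateEmail(model.Email);

           if (data.Contains("false"))
           {
               // The failure message is stored until it is read by the view;
               // TempData.Peek("Message") would inspect it without removing it
               TempData["Message"] = "The email address entered is invalid.";
           }
           else
           {
               TempData["Message"] = "You are logged in!";
           }
       }
       return RedirectToAction("Index");
   }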
If the email validation passed, then we know that the email address is valid and we can do something else. Here, we are simply saying that the user is logged in. In reality, we would perform some sort of authentication here and then route to the correct controller.

So now you know how to call an Azure Function from an ASP.NET Core MVC application. If you found this tutorial helpful and you'd like to learn more, go ahead and pick up the book C# 7 and .NET Core Blueprints.

What is ASP.NET Core?
Why ASP.NET makes building apps for mobile and web easy
How to dockerize an ASP.NET Core application