
How-To Tutorials - Languages


Kotlin Basics

Packt
16 Nov 2016
7 min read
In this article by Stephen Samuel and Stefan Bocutiu, the authors of the book Programming Kotlin, it's time to discover the fundamental building blocks of Kotlin. This article covers the basic constructs of the language, such as defining variables, control flow syntax, type inference, and smart casting, as well as its basic types and their hierarchy. For those coming from a Java background, it also highlights some of the key differences between Kotlin and Java and how Kotlin's language features are able to exist on the JVM. If you are not an existing Java programmer, those differences can be safely skipped.

vals and vars

Kotlin has two keywords for declaring variables: val and var. A var is a mutable variable, one that can be changed to another value by reassigning it. This is equivalent to declaring a non-final variable in Java:

var name = "kotlin"

Alternatively, a var can be declared first and initialized later:

var name: String
name = "kotlin"

Variables defined with var can be reassigned, since they are mutable:

var name = "kotlin"
name = "more kotlin"

The val keyword is used to declare a read-only variable. This is equivalent to declaring a final variable in Java. A val must be initialized when created, since it cannot be changed later:

val name = "kotlin"

A read-only variable does not mean the instance itself is automatically immutable. The instance may still allow its member variables to be changed via functions or properties, but the variable itself cannot be reassigned to another value.

Type inference

Did you notice in the previous section that the type of the variable was not included when it was initialized? This is different from Java, where the type of a variable must always accompany its declaration. Even though Kotlin is a strongly typed language, we don't always need to declare types explicitly: the compiler can attempt to figure out the type of an expression from the information included in the expression. A simple val is an easy case for the compiler, because the type is clear from the right-hand side. This mechanism is called type inference, and it reduces boilerplate while keeping the type safety we expect of a modern language.

Values and variables are not the only places where type inference can be used. It can also be used in closures, where the type of the parameter(s) can be inferred from the function signature, and in single-line functions, where the return type can be inferred from the expression in the function body, as this example demonstrates:

fun plusOne(x: Int) = x + 1

Sometimes, it is helpful to add an explicit type annotation when the type inferred by the compiler is not exactly what you want:

val explicitType: Number = 12.3

Basic types

One of the big differences between Kotlin and Java is that in Kotlin, everything is an object. If you come from a Java background, you will already be aware that Java has special primitive types, which are treated differently from objects: they cannot be used as generic types, do not support method/function calls, and cannot be assigned null. An example is the boolean primitive type. Java introduced wrapper objects as a workaround, in which primitive types are wrapped in objects (so java.lang.Boolean wraps a boolean) in order to smooth over the distinctions. Kotlin removes this necessity entirely from the language by promoting the primitives to full objects.
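To make that last point concrete, here is a brief illustrative sketch (my own, not from the book's text) of an Int being used in ways that Java's primitive int could not be:

// Illustrative sketch: Kotlin numbers behave as full objects.
fun demo() {
    val count: Int = 42
    println(count.toString())                 // a function call directly on a number
    val numbers: List<Int> = listOf(1, 2, 3)  // Int used as a generic type argument
    val missing: Int? = null                  // a nullable number, impossible with Java's int
    println("${numbers.size} $missing")
}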
Whenever possible, the Kotlin compiler will map basic types back to JVM primitives for performance reasons. However, the values must sometimes be boxed, such as when the type is nullable or when it is used in generics. Two different values that are boxed might not use the same instance, so referential equality is not guaranteed on boxed values.

Numbers

The built-in number types are as follows:

Type     Width (bits)
long     64
int      32
short    16
byte     8
double   64
float    32

To create a number literal, use one of the following forms:

val int = 123
val long = 123456L
val double = 12.34
val float = 12.34F
val hexadecimal = 0xAB
val binary = 0b01010101

You will notice that a long value requires the suffix L and a float the suffix F. The double type is used as the default for floating point numbers, and int for integral numbers. The hexadecimal and binary literals use the prefixes 0x and 0b respectively.

Kotlin does not support the automatic widening of numbers, so conversions must be invoked explicitly. Each number type has a function that will convert the value to one of the other number types:

val int = 123
val long = int.toLong()
val float = 12.34F
val double = float.toDouble()

The full set of conversion functions is toByte(), toShort(), toInt(), toLong(), toFloat(), toDouble(), and toChar().

Unlike Java, there are no built-in bitwise operators; Kotlin provides named functions instead. These can be invoked like operators (except inv):

val leftShift = 1 shl 2
val rightShift = 1 shr 2
val unsignedRightShift = 1 ushr 2
val and = 1 and 0x00001111
val or = 1 or 0x00001111
val xor = 1 xor 0x00001111
val inv = 1.inv()

Booleans

Booleans are rather standard and support the usual negation, conjunction, and disjunction operations. Conjunction and disjunction are lazily evaluated: if the left-hand side already determines the result, the right-hand side will not be evaluated:

val x = 1
val y = 2
val z = 2
val isTrue = x < y && x < z
val alsoTrue = x == y || y == z

Chars

Chars represent a single character. Character literals use single quotes, such as 'a' or 'Z'. Chars also support escaping for the following characters: \t, \b, \n, \r, \', \", \\, and \$. All Unicode characters can be represented using the respective Unicode number, like so: '\u1234'. Note that the char type is not treated as a number, unlike in Java.

Strings

Just as in Java, strings are immutable. String literals can be created using double or triple quotes. Double quotes create an escaped string; in an escaped string, special characters such as newline must be escaped:

val string = "string with \n new line"

Triple quotes create a raw string; in a raw string, no escaping is necessary, and all characters can be included:

val rawString = """
raw string is super useful for
strings that span many lines
"""

Strings also provide an iterator function, so they can be used in a for loop.

Arrays

In Kotlin, we can create an array using the arrayOf() library function:

val array = arrayOf(1, 2, 3)

Alternatively, we can create an array from an initial size and a function that is used to generate each element:

val perfectSquares = Array(10, { k -> k * k })

Unlike Java, arrays are not treated specially by the language; they are regular collection classes. Instances of Array provide an iterator function and a size function, as well as get and set functions.
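Before moving on to the bracket syntax, here is a quick illustrative sketch (my own, not from the book's text) of the iterator support that both strings and arrays provide:

// Illustrative sketch: strings and arrays can both drive a for loop.
fun iterate() {
    for (c in "abc") println(c)              // iterates the characters of the string
    val squares = Array(5, { k -> k * k })
    for (square in squares) println(square)  // iterates the array elements
}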
The get and set functions are also available through bracket syntax, as in many C-style languages:

val element1 = array[0]
val element2 = array[1]
array[2] = 5

To avoid boxing types that will ultimately be represented as primitives in the JVM, Kotlin provides alternative array classes that are specialized for each of the primitive types. This allows performance-critical code to use arrays as efficiently as it would in plain Java. The provided classes are ByteArray, CharArray, ShortArray, IntArray, LongArray, BooleanArray, FloatArray, and DoubleArray.

Comments

Comments in Kotlin will come as no surprise to most programmers, as they are the same as in Java, JavaScript, and C, among other languages. Block comments and line comments are both supported:

// line comment

/*
A block comment
can span many lines
*/

Packages

Packages allow us to split code into namespaces. Any file may begin with a package declaration:

package com.packt.myproject

class Foo

fun bar(): String = "bar"

The package name gives us the fully qualified name (FQN) of a class, object, interface, or function. In the previous example, the Foo class has the FQN com.packt.myproject.Foo, and the top-level function bar has the FQN com.packt.myproject.bar.

Summary

In Kotlin, everything is an object in the sense that we can call member functions and properties on any variable. Some types are built in because their implementation is optimized, but to the user they look like ordinary classes. In this article, we described most of these types: numbers, characters, booleans, and arrays.

Getting Started with Sorting Algorithms in Java

Packt
16 Nov 2016
9 min read
In this article by Peter Verhas, the author of the book Java 9 Programming By Example, we will develop a simple sort program. Using this code as an example, we will look at different build tools, which are frequently used for Java projects, and learn the basic features of the Java language.

The problem we will solve

The sorting problem is one of the oldest programming tasks that an engineer solves. We have a set of records, we know that we will want to find a specific one sometime later, and we want to find it fast. To make that possible, we sort the records in a specific order that helps us find the record we want quickly.

As an example, we can have the names of students, with some marks, on cards. When a student comes to the office asking for their result, we could turn the cards over one after the other to find the name of the enquiring student. However, it is better if we sort the cards by the names of the students lexicographically. When a student comes, we can find the mark attached to the name much faster: we look at the middle card; if it shows the name of the student, we are happy to have found the name and the mark. If the card precedes the name of the student lexicographically, we continue searching in the second half; otherwise, we search the first half. Following that approach, we can find the name of the student in no more steps than the number of times the pack of cards can be halved. If we have two cards, it takes at most two steps. If we have four, at most three. If there are eight cards, we may need four steps, but no more. If there are 1,000 cards, we may need at most 11 steps, while the original, unsorted set would need up to 1,000 steps in the worst case. That speeds up the search approximately 100 times, so it is worth sorting the cards, unless the sorting itself takes too much time.

In many cases, it is worth sorting the dataset, and there are many sorting algorithms to do it. There are simpler and more complex algorithms, and, as in many cases, the more complex algorithms are the ones that run faster. As we are focusing on the Java programming part and not on algorithm forging, this article develops Java code that implements a simple and not-that-fast algorithm.

Bubble sort

The algorithm that we will implement in this article is well known as bubble sort. The approach is very simple. Begin at the start of the cards and compare the first and the second card. If the first card is later in lexicographic order than the second one, swap the two cards. Then repeat this for the card that is now in second place, then the third, and so on. There is a card that is lexicographically the latest; say, Wilson. Sometime during this pass we will get to this card, and from then on we will always swap it with the next one; this way, Wilson's card travels to the last place, which is where it has to be after the sort. All we have to do is repeat this travelling from the start, with the occasional swapping of cards, again, but this time only up to the last but one element. This time, the second-latest element will get to its place; say, Wilkinson will end up right before Wilson. If we have n cards and we repeat this n-1 times, all the cards will get to their place.
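As a concrete reference, here is a minimal illustrative sketch of the algorithm just described; the class layout and names are my own, not necessarily those used in the book:

// Illustrative bubble sort sketch for an array of names.
public class BubbleSort {

    // Each outer pass bubbles the lexicographically latest remaining
    // element up to position 'end'.
    public static void sort(String[] names) {
        for (int end = names.length - 1; end > 0; end--) {
            for (int i = 0; i < end; i++) {
                if (names[i].compareTo(names[i + 1]) > 0) {
                    String tmp = names[i];    // swap adjacent cards that are
                    names[i] = names[i + 1];  // out of lexicographic order
                    names[i + 1] = tmp;
                }
            }
        }
    }

    public static void main(String[] args) {
        String[] cards = {"Wilson", "Abraham", "Wilkinson", "Dagobert"};
        sort(cards);
        System.out.println(String.join(", ", cards));
        // prints: Abraham, Dagobert, Wilkinson, Wilson
    }
}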
Project structure and build tools

When a project is more complex than a single class, and it usually is, it is wise to define a project structure. We have to decide where we store the source files, where the resource files (those that contain some resource for the program but are not Java source) are, where the .class files should be written by the compiler, and so on. Generally, the structure is mainly the directory setup, along with the configuration of the tools that perform the build.

The compilation of complex programs cannot feasibly be done by issuing javac commands on the command line. If we have 100 Java source files, the compilation will require that many javac commands to be issued. We could write a simple bash script that does that. First, it would be just 100 lines, each compiling one Java source file to a class file. Then we would realize that it only wastes time, CPU, and power to compile files that have not changed since the last compilation, so we could add some bash programming that checks the timestamps on the source and generated files. Then we would probably realize that... whatever. In the end, we would find we had built a tool that is essentially a build tool. And this has already been done. Instead of creating our own, we will use a build tool that is ready-made; there are several, which can be found at https://en.wikipedia.org/wiki/List_of_build_automation_software

Make

The Make program was originally created in April 1976, so this is not a new tool. It is included in Unix systems, so it is available without any extra installation on Linux, Mac OS X, or any other Unix-based system. Additionally, there are numerous ports of the tool to Windows, and some version is/was included in the Visual C compiler toolset. Make is not tied to Java: it was created when the major programming language was C, but it is not tied to C or any other language. Make is a dependency description language that has a very simple syntax.

Make, just like any other build tool, is controlled by a project description file. In the case of make, this file contains a rule set. The description file is usually named Makefile, but if the name of the description file is different, it can be specified as a command-line option to the make command. Rules in a Makefile follow one another, and a rule consists of one or more lines. The first line starts at the first position (there is no tab or space at the start of the line), and the following lines start with a tab character. Thus, a Makefile may look something like the following:

run : hello.jar
	java -cp hello.jar HelloWorld

hello.jar : HelloWorld.class
	jar -cf hello.jar HelloWorld.class

HelloWorld.class : HelloWorld.java
	javac HelloWorld.java

The file defines three so-called targets: run, hello.jar, and HelloWorld.class. To create HelloWorld.class, type the following line at the command prompt:

make HelloWorld.class

Make will look at the rule and see that it depends on HelloWorld.java. If the HelloWorld.class file does not exist, or HelloWorld.java is newer than the class file, make will execute the command written on the next line, which compiles the Java source file. If the class file was created after the last modification of HelloWorld.java, then make knows that there is no need to run the command.

In the case of creating HelloWorld.class, the make program has an easy task: the source file is already there. If you issue the make hello.jar command, the procedure is more complex. The make command sees that in order to create hello.jar, it needs HelloWorld.class, which is itself a target of another rule. Thus, it may need to be created.
First, make attacks the problem the same way as before. If HelloWorld.class is there and is older than hello.jar, there is nothing to do. If it is not there, or is newer than hello.jar, then the jar -cf hello.jar HelloWorld.class command needs to be executed, but not yet: make remembers that this command has to be executed sometime in the future, when all the commands needed to create HelloWorld.class have already been executed successfully. Thus, it continues to create the class file exactly the same way as described earlier.

In general, a rule has the following format:

target : dependencies
	command

The make command can create any target using the make target command, by first calculating which commands to execute and then executing them one by one. The commands are shell commands executing in a separate process, and this may pose problems under Windows, which may render Makefile files operating-system dependent.

Note that the run target is not an actual file that make creates. A target can be a file name or just a name for the target. In the latter case, make will never consider the target to be up to date.

As we do not use make for Java projects, there is no room to get into more detail here. Additionally, I cheated a bit by making the description of a rule simpler than it really is. The make tool has many powerful features that are out of the scope of this book. There are also several implementations that differ a little from each other. You will most probably meet the one made by the Free Software Foundation: GNU make. And, of course, as with any Unix command-line tool, man is your friend: the man make command will display the documentation of the tool on the screen.

The main points that you should remember about make are as follows:

- It defines the dependencies of the individual artifacts (targets) in a declarative way
- It defines the actions that create the missing artifacts in an imperative way

Summary

In this article, we developed a very basic sort algorithm. It was made purposefully simple so that we could reiterate the basic and most important Java language elements: classes, packages, variables, methods, and so on.

Hosting on Google App Engine

Packt
21 Oct 2016
22 min read
In this article by Mat Ryer, the author of the book Go Programming Blueprints Second Edition, we will see how to create a Google App Engine application and deploy it, and how to use Google's cloud data storage facility for App Engine developers.

Google App Engine gives developers a NoOps (short for No Operations, indicating that developers and engineers have no work to do in order to have their code running and available) way of deploying their applications, and Go has been officially supported as a language option for some years now. Google's architecture runs some of the biggest applications in the world, such as Google Search, Google Maps, and Gmail, so it is a pretty safe bet when it comes to deploying our own code.

Google App Engine allows you to write a Go application, add a few special configuration files, and deploy it to Google's servers, where it will be hosted and made available in a highly available, scalable, and elastic environment. Instances will automatically spin up to meet demand and tear down gracefully when they are no longer needed, with a healthy free quota and preapproved budgets. Along with running application instances, Google App Engine provides a myriad of useful services, such as fast and high-scale data stores, search, memcache, and task queues. Transparent load balancing means you don't need to build and maintain additional software or hardware to ensure that servers don't get overloaded and that requests are fulfilled quickly.

In this article, we will build the API backend for a question and answer service similar to Stack Overflow or Quora and deploy it to Google App Engine. In the process, we'll explore techniques, patterns, and practices that can be applied to all such applications, as well as dive deep into some of the more useful services available to our application. Specifically, you will learn:

- How to use the Google App Engine SDK for Go to build and test applications locally before deploying to the cloud
- How to use app.yaml to configure your application
- How modules in Google App Engine let you independently manage the different components that make up your application
- How the Google Cloud Datastore lets you persist and query data at scale
- A sensible pattern for modeling data and working with keys in Google Cloud Datastore
- How to use the Google App Engine Users API to authenticate people with Google accounts
- A pattern for embedding denormalized data into entities

The Google App Engine SDK for Go

In order to run and deploy Google App Engine applications, we must download and configure the Go SDK. Head over to https://cloud.google.com/appengine/downloads and download the latest Google App Engine SDK for Go for your computer. The ZIP file contains a folder called go_appengine, which you should place in an appropriate folder outside of your GOPATH, for example, in /Users/yourname/work/go_appengine. It is possible that the names of these SDKs will change in the future; if that happens, consult the project home page for notes pointing you in the right direction at https://github.com/matryer/goblueprints.

Next, you will need to add the go_appengine folder to your $PATH environment variable, much like what you did with the go folder when you first configured Go.
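On a Unix-like shell, that might look like the following line in your profile (this assumes the example install location above; adjust the path to wherever you placed the SDK):

export PATH=$PATH:/Users/yourname/work/go_appengine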
To test your installation, open a terminal and type this:

goapp version

You should see something like the following:

go version go1.6.1 (appengine-1.9.37) darwin/amd64

The actual version of Go is likely to differ and is often a few months behind actual Go releases. This is because the Cloud Platform team at Google needs to do work on its end to support new releases of Go. The goapp command is a drop-in replacement for the go command with a few additional subcommands, so you can do things like goapp test and goapp vet, for example.

Creating your application

In order to deploy an application to Google's servers, we must use the Google Cloud Platform Console to set it up. In a browser, go to https://console.cloud.google.com and sign in with your Google account. Look for the Create Project menu item, which often gets moved around as the console changes from time to time. If you already have some projects, click on a project name to open a submenu, and you'll find it in there. If you can't find what you're looking for, just search "Creating App Engine project" and you'll find it.

When the New Project dialog box opens, you will be asked for a name for your application. You are free to call it whatever you like (for example, Answers), but note the Project ID that is generated for you; you will need to refer to this when you configure your app later. You can also click on Edit and specify your own ID, but know that the value must be globally unique, so you'll have to get creative when thinking one up. Here we will use answersapp as the application ID, but you won't be able to use that one, since it has already been taken. You may need to wait a minute or two for your project to get created; there's no need to watch the page, so you can continue and check back later.

App Engine applications are Go packages

Now that the Google App Engine SDK for Go is configured and our application has been created, we can start building it. In Google App Engine, an application is just a normal Go package with an init function that registers handlers via the http.Handle or http.HandleFunc functions. It does not need to be the main package, as with normal tools. Create a new folder (somewhere inside your GOPATH folder) called answersapp/api and add the following main.go file:

package api

import (
	"io"
	"net/http"
)

func init() {
	http.HandleFunc("/", handleHello)
}

func handleHello(w http.ResponseWriter, r *http.Request) {
	io.WriteString(w, "Hello from App Engine")
}

You will be familiar with most of this by now, but note that there is no ListenAndServe call, and the handlers are set inside the init function rather than main. We are going to handle every request with our simple handleHello function, which will just write a welcoming string.

The app.yaml file

In order to turn our simple Go package into a Google App Engine application, we must add a special configuration file called app.yaml. The file goes at the root of the application or module, so create it inside the answersapp/api folder with the following contents:

application: YOUR_APPLICATION_ID_HERE
version: 1
runtime: go
api_version: go1
handlers:
- url: /.*
  script: _go_app

The file is a simple human- (and machine-) readable configuration file in YAML (YAML Ain't Markup Language; refer to yaml.org for more details). The following table describes each property:
application: The application ID (copied and pasted from when you created your project).
version: Your application version number. You can deploy multiple versions and even split traffic between them to test new features, among other things. We'll just stick with version 1 for now.
runtime: The name of the runtime that will execute your application. Since we're building a Go application, we'll use go.
api_version: The go1 api version is the runtime version supported by Google; you can imagine that this could be go2 in the future.
handlers: A selection of configured URL mappings. In our case, everything will be mapped to the special _go_app script, but you can also specify static files and folders here.

Running simple applications locally

Before we deploy our application, it makes sense to test it locally. We can do this using the App Engine SDK we downloaded earlier. Navigate to your answersapp/api folder and run the following command in a terminal:

goapp serve

The output indicates that an API server is running locally on port :56443, an admin server is running on :8000, and our application (the module named default) is now serving at localhost:8080, so let's hit that one in a browser. As you can see by the Hello from App Engine response, our application is running locally. Navigate to the admin server by changing the port from :8080 to :8000. The admin server is a web portal that we can use to interrogate the internals of our application, including viewing running instances, inspecting the data store, managing task queues, and more.

Deploying simple applications to Google App Engine

To truly understand the power of Google App Engine's NoOps promise, we are going to deploy this simple application to the cloud. Back in the terminal, stop the server by hitting Ctrl+C and run the following command:

goapp deploy

Your application will be packaged and uploaded to Google's servers. Once it's finished, you should see something like the following:

Completed update of app: theanswersapp, version: 1

It really is as simple as that. You can prove this by navigating to the endpoint you get for free with every Google App Engine application, remembering to replace the application ID with your own: https://YOUR_APPLICATION_ID_HERE.appspot.com/. You will see the same output as earlier (the font may render differently, since Google's servers will make assumptions about the content type that the local dev server doesn't). The application is being served over HTTP/2 and is already capable of pretty massive scale, and all we did was write a config file and a few lines of code.

Modules in Google App Engine

A module is a Go package that can be versioned, updated, and managed independently. An app might have a single module, or it can be made up of many modules, each distinct but part of the same application, with access to the same data and services. An application must have a default module, even if it doesn't do much. Our application will be made up of the following modules:

- default: the obligatory default module
- api: an API package delivering RESTful JSON
- web: a static website serving HTML, CSS, and JavaScript that makes AJAX calls to the api module

Each module will be a Go package and will therefore live inside its own folder. Let's reorganize our project into modules by creating a new folder alongside the api folder called default. We are not going to make our default module do anything other than use it for configuration, as we want our other modules to do all the meaningful work.
But if we leave this folder empty, the Google App Engine SDK will complain that it has nothing to build. Inside the default folder, add the following placeholder main.go file:

package defaultmodule

func init() {}

This file does nothing except allow our default module to exist. It would have been nice for our package names to match the folders, but default is a reserved keyword in Go, so we have a good reason to break that rule.

The other module in our application will be called web, so create another folder alongside the api and default folders called web. Here, we are only going to build the API for our application, and we will cheat by downloading the web module. Head over to the project home page at https://github.com/matryer/goblueprints, access the content for Second Edition, and look for the download link for the web components for this article in the Downloads section of the README file. The ZIP file contains the source files for the web component, which should be unzipped and placed inside the web folder. Now, our application structure should look like this:

/answersapp/api
/answersapp/default
/answersapp/web

Specifying modules

To specify which module our api package will become, we must add a property to the app.yaml inside our api folder. Update it to include the module property:

application: YOUR_APPLICATION_ID_HERE
version: 1
runtime: go
module: api
api_version: go1
handlers:
- url: /.*
  script: _go_app

Since our default module will need to be deployed as well, we also need to add an app.yaml configuration file to it. Duplicate api/app.yaml inside default/app.yaml, changing the module to default:

application: YOUR_APPLICATION_ID_HERE
version: 1
runtime: go
module: default
api_version: go1
handlers:
- url: /.*
  script: _go_app

Routing to modules with dispatch.yaml

In order to route traffic appropriately to our modules, we will create another configuration file called dispatch.yaml, which lets us map URL patterns to modules. We want all traffic beginning with the /api/ path to be routed to the api module and everything else to the web module. As mentioned earlier, we don't expect our default module to handle any traffic, but it will have more utility later. In the answersapp folder (alongside our module folders, not inside any of them), create a new file called dispatch.yaml with the following contents:

application: YOUR_APPLICATION_ID_HERE
dispatch:
- url: "*/api/*"
  module: api
- url: "*/*"
  module: web

The same application property tells the Google App Engine SDK for Go which application we are referring to, and the dispatch section routes URLs to modules.

Google Cloud Datastore

One of the services available to App Engine developers is Google Cloud Datastore, a NoSQL document database built for automatic scaling and high performance. Its limited feature set guarantees very high scale, but understanding the caveats and best practices is vital to a successful project.

Denormalizing data

Developers with experience of relational databases (RDBMS) will often aim to reduce data redundancy (trying to have each piece of data appear only once in their database) by normalizing data, spreading it across many tables, and adding references (foreign keys) before joining it back via a query to build a complete picture. In schemaless and NoSQL databases, we tend to do the opposite: we denormalize data so that each document contains the complete picture it needs, making read times extremely fast, since a read only needs to go and get a single thing.
For example, consider how we might model tweets in a relational database such as MySQL or Postgres: a tweet itself contains only its unique ID, a foreign key reference to the Users table representing the author of the tweet, and perhaps many URLs that were mentioned in TweetBody. One nice feature of this design is that a user can change their Name or AvatarURL and it will be reflected in all of their tweets, past and future, something you wouldn't get for free in a denormalized world. However, in order to present a tweet to the user, we must load the tweet itself, look up (via a join) the user to get their name and avatar URL, and then load the associated data from the URLs table in order to show a preview of any links. At scale, this becomes difficult, because all three tables of data might well be physically separated from each other, which means lots of things need to happen in order to build up the complete picture.

Consider what a denormalized design would look like instead: we still have the same three buckets of data, except that now our tweet contains everything it needs in order to render to the user without having to look up data from anywhere else. The hardcore relational database designers out there will realize what this means by now, and it is no doubt making them feel uneasy. Following this approach means that:

- Data is repeated: AvatarURL in User is repeated as UserAvatarURL in the tweet (a waste of space, right?)
- If the user changes their AvatarURL, UserAvatarURL in the tweet will be out of date

Database design, at the end of the day, comes down to physics. We are deciding that our tweet is going to be read far more times than it is going to be written, so we'd rather take the pain up front and take a hit in storage. There's nothing wrong with repeated data as long as there is an understanding about which set is the master set and which is duplicated for speed.

Changing data is an interesting topic in itself, but let's think about a few reasons why we might be OK with the trade-offs. Firstly, the speed benefit of reading tweets is probably worth the unexpected behavior of changes to master data not being reflected in historical documents; it would be perfectly acceptable to decide to live with this emergent behavior for that reason. Secondly, we might decide that it makes sense to keep a snapshot of data at a specific moment in time. For example, imagine someone tweets asking whether people like their profile picture. If the picture changed, the context of the tweet would be lost. For a more serious example, consider what might happen if you were pointing to a row in an Addresses table for an order delivery and the address later changed: suddenly, the order might look like it was shipped to a different place. Finally, storage is becoming increasingly cheap, so the need to normalize data to save space is lessened. Twitter even goes as far as copying the entire tweet document for each of your followers. Having 100 followers on Twitter means that your tweet will be copied at least 100 times, maybe more for redundancy. This sounds like madness to relational database enthusiasts, but Twitter is making smart trade-offs based on its user experience; they'll happily spend a lot of time writing a tweet and storing it many times to ensure that when you refresh your feed, you don't have to wait very long to get updates. If you want to get a sense of the scale of this, check out the Twitter API and look at what a tweet document consists of. It's a lot of data.
Then, go and look at how many followers Lady Gaga has. This has become known in some circles as "the Lady Gaga problem" and is addressed by a variety of different technologies and techniques that are out of the scope of this article.

Now that we have an understanding of good NoSQL design practices, let's implement the types, functions, and methods required to drive the data part of our API.

Entities and data access

To persist data in Google Cloud Datastore, we need a struct to represent each entity. These entity structures will be serialized and deserialized when we save and load data through the datastore API. We can add helper methods that perform the interactions with the data store, which is a nice way to keep such functionality physically close to the entities themselves. For example, we will model an answer with a struct called Answer and add a Create method that in turn calls the appropriate function from the datastore package. This prevents us from bloating our HTTP handlers with lots of data access code and allows us to keep them clean and simple instead.

One of the foundation blocks of our application is the concept of a question. A question can be asked by a user and answered by many. It will have a unique ID so that it is addressable (referable in a URL), and we'll store a timestamp of when it was created:

type Question struct {
	Key          *datastore.Key `json:"id" datastore:"-"`
	CTime        time.Time      `json:"created"`
	Question     string         `json:"question"`
	User         UserCard       `json:"user"`
	AnswersCount int            `json:"answers_count"`
}

The UserCard struct represents a denormalized User entity, both of which we'll add later. You can import the datastore package in your Go project using this:

import "google.golang.org/appengine/datastore"

It's worth spending a little time understanding the datastore.Key type.

Keys in Google Cloud Datastore

Every entity in Datastore has a key, which uniquely identifies it. Keys can be made up of either a string or an integer, depending on what makes sense for your case. You are free to decide the keys for yourself or let Datastore automatically assign them for you; again, your use case will usually decide which is the best approach to take.

Keys are created using the datastore.NewKey and datastore.NewIncompleteKey functions and are used to put and get data into and out of Datastore via the datastore.Get and datastore.Put functions. In Datastore, keys and entity bodies are distinct, unlike in MongoDB or SQL technologies, where the key is just another field in the document or record. This is why we are excluding Key from our Question struct with the datastore:"-" field tag. Like the json tags, this indicates that we want Datastore to ignore the Key field altogether when it is getting and putting data.

Keys may optionally have parents, which is a nice way of grouping associated data together, and Datastore makes certain assurances about such groups of entities, which you can read more about in the Google Cloud Datastore documentation online.

Putting data into Google Cloud Datastore

Before we save data into Datastore, we want to ensure that our question is valid. Add the following method underneath the Question struct definition:

func (q Question) OK() error {
	if len(q.Question) < 10 {
		return errors.New("question is too short")
	}
	return nil
}

The OK function returns an error if something is wrong with the question; otherwise, it returns nil. In this case, we just check to make sure the question has at least 10 characters.
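As a quick hedged illustration (my own sketch, not the book's code), a caller might run this validation before touching Datastore at all:

// Illustrative sketch: validate before saving. Assumes w is an
// http.ResponseWriter already in scope inside a handler.
q := Question{Question: "Too short"}
if err := q.OK(); err != nil {
	http.Error(w, err.Error(), http.StatusBadRequest)
	return
}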
To persist this data in the data store, we are going to add a method to the Question struct itself. At the bottom of questions.go, add the following code:

func (q *Question) Create(ctx context.Context) error {
	log.Debugf(ctx, "Saving question: %s", q.Question)
	if q.Key == nil {
		q.Key = datastore.NewIncompleteKey(ctx, "Question", nil)
	}
	user, err := UserFromAEUser(ctx)
	if err != nil {
		return err
	}
	q.User = user.Card()
	q.CTime = time.Now()
	q.Key, err = datastore.Put(ctx, q.Key, q)
	if err != nil {
		return err
	}
	return nil
}

The Create method takes a pointer to Question as the receiver, which is important because we want to make changes to the fields. If the receiver were (q Question), without the *, we would get a copy of the question rather than a pointer to it, and any changes we made would only affect our local copy and not the original Question struct itself.

The first thing we do is use log (from the google.golang.org/appengine/log package) to write a debug statement saying we are saving the question. When you run your code in a development environment, you will see this appear in the terminal; in production, it goes into a dedicated logging service provided by Google Cloud Platform.

If the key is nil (which means this is a new question), we assign an incomplete key to the field, which informs Datastore that we want it to generate a key for us. The three arguments we pass are context.Context (which we must pass to all datastore functions and methods), a string describing the kind of entity, and the parent key, which in our case is nil.

Once we know there is a key in place, we call a method (which we will add later) to get or create a User from an App Engine user, set it on the question, and then set the CTime field (created time) to time.Now, timestamping the point at which the question was asked.

Once our Question is in good shape, we call datastore.Put to actually place it inside the data store. As usual, the first argument is context.Context, followed by the question key and the question entity itself. Since Google Cloud Datastore treats keys as separate and distinct from entities, we have to do a little extra work if we want to keep them together in our own code. The datastore.Put method returns two values: the complete key and an error. The key return value is actually useful because we're sending in an incomplete key and asking the data store to create one for us, which it does during the put operation. If successful, it returns a new datastore.Key object, representing the completed key, which we then store in the Key field of our Question object. If all is well, we return nil.

Add another helper to update an existing question:

func (q *Question) Update(ctx context.Context) error {
	if q.Key == nil {
		q.Key = datastore.NewIncompleteKey(ctx, "Question", nil)
	}
	var err error
	q.Key, err = datastore.Put(ctx, q.Key, q)
	if err != nil {
		return err
	}
	return nil
}

This method is very similar, except that it doesn't set the CTime or User fields, as they will already have been set.
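To see how these helpers keep the HTTP handlers thin, as claimed earlier, here is a hedged sketch of what a create-question handler might look like; the route, decoding details, and names are my own illustration, not the book's code:

// Illustrative sketch: the handler only decodes, validates, and delegates.
// Assumes imports: "encoding/json", "net/http", "google.golang.org/appengine".
func handleCreateQuestion(w http.ResponseWriter, r *http.Request) {
	ctx := appengine.NewContext(r)
	var q Question
	if err := json.NewDecoder(r.Body).Decode(&q); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	if err := q.OK(); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	if err := q.Create(ctx); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	json.NewEncoder(w).Encode(&q) // echo the question back, now with its key
}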
Reading data from Google Cloud Datastore

Reading data is as simple as putting it, using the datastore.Get method; but since we want to maintain keys in our entities (and the datastore methods don't work like that), it's common to add a helper function like the one we are going to add to questions.go:

func GetQuestion(ctx context.Context, key *datastore.Key) (*Question, error) {
	var q Question
	err := datastore.Get(ctx, key, &q)
	if err != nil {
		return nil, err
	}
	q.Key = key
	return &q, nil
}

The GetQuestion function takes context.Context and the datastore.Key of the question to get. It then does the simple task of calling datastore.Get and assigning the key to the entity before returning it. Of course, errors are handled in the usual way. This is a nice pattern to follow so that users of your code know that they never have to interact with datastore.Get and datastore.Put directly, but rather use the helpers, which ensure the entities are properly populated with their keys (along with any other tweaks they might want to make before saving or after loading).

Summary

This article gave us an idea of how Go works on App Engine: how to create a simple application and deploy it to Google App Engine, with a clear understanding of the configuration involved. We also learned about modules in Google App Engine and about Google's cloud data storage facility for App Engine developers.

Deployment and DevOps

Packt
14 Oct 2016
16 min read
In this article by Makoto Hashimoto and Nicolas Modrzyk, the authors of the book Clojure Programming Cookbook, we will cover the recipe Clojure on Amazon Web Services.

Clojure on Amazon Web Services

This recipe is a standalone dish where you can learn how to combine the elegance of Clojure with Amazon Web Services (AWS). AWS launched in 2006 and is used by many businesses as an easy-to-use set of web services. This style of on-demand service is becoming more and more popular: you can use compute resources and software services on demand, without needing to prepare hardware or install software yourself.

You will mostly make use of the amazonica library, a comprehensive Clojure client for the entire Amazon AWS set of APIs. This library wraps the Amazon AWS APIs and supports most AWS services, including EC2, S3, Lambda, Kinesis, Elastic Beanstalk, Elastic MapReduce, and Redshift. This recipe has received a lot of its content and love from Robin Birtle, a leading member of the Clojure community in Japan.

Getting ready

You need an AWS account and credentials to use AWS, so this recipe starts by showing you how to do the setup and acquire the necessary keys to get started.

Signing up on AWS

You need to sign up for AWS if you don't have an account yet. Go to https://aws.amazon.com, click on Sign In to the Console, and follow the instructions for creating your account. To complete the sign-up, enter the number of a valid credit card and a phone number.

Getting the access key and secret access key

To call the API, you need your AWS access key and secret access key. Go to the AWS console, click on your name (located in the top right corner of the screen), and select Security Credential. Select Access Keys (Access Key ID and Secret Access Key), then click on New Access Key. You can now see your access key and secret access key; copy and save these strings for later use.

Setting up dependencies in your project.clj

Let's add the amazonica library to your project.clj and restart your REPL:

:dependencies [[org.clojure/clojure "1.8.0"]
               [amazonica "0.3.67"]]

How to do it…

From here on, we will go through some sample usage of the core Amazon services, accessed with Clojure and the amazonica library. The three main services we will review are as follows:

- EC2, Amazon's Elastic Compute Cloud, which allows you to run virtual machines in Amazon's cloud
- S3, Simple Storage Service, which gives you cloud-based storage
- SQS, Simple Queue Service, which gives you cloud-based data streaming and processing

Let's go through each of these one by one.

Using EC2

Let's assume you have an EC2 micro instance in the Tokyo region. First of all, we will declare the core and ec2 namespaces of amazonica:

(ns aws-examples.ec2-example
  (:require [amazonica.aws.ec2 :as ec2]
            [amazonica.core :as core]))

We will set the access key and secret access key to enable the AWS client API to access AWS. core/defcredential does this as follows:

(core/defcredential "Your Access Key" "Your Secret Access Key" "your region")
;;=> {:access-key "Your Access Key", :secret-key "Your Secret Access Key", :endpoint "your region"}

The region you need to specify is, for example, ap-northeast-1, ap-south-1, or us-west-2.
To get the full list of regions, use ec2/describe-regions:

(ec2/describe-regions)
;;=> {:regions [{:region-name "ap-south-1", :endpoint "ec2.ap-south-1.amazonaws.com"}
;;=>  .....
;;=>  {:region-name "ap-northeast-2", :endpoint "ec2.ap-northeast-2.amazonaws.com"}
;;=>  {:region-name "ap-northeast-1", :endpoint "ec2.ap-northeast-1.amazonaws.com"}
;;=>  .....
;;=>  {:region-name "us-west-2", :endpoint "ec2.us-west-2.amazonaws.com"}]}

ec2/describe-instances returns very long output, as follows:

(ec2/describe-instances)
;;=> {:reservations [{:reservation-id "r-8efe3c2b", :requester-id "226008221399",
;;=>  :owner-id "182672843130", :group-names [], :groups [], ....

To get only the necessary information about instances, we define the following get-instances-info function:

(defn get-instances-info []
  (let [inst (ec2/describe-instances)]
    (->> (mapcat :instances (inst :reservations))
         (map #(vector
                [:node-name (->> (filter (fn [x] (= (:key x) "Name")) (:tags %))
                                 first
                                 :value)]
                [:status (get-in % [:state :name])]
                [:instance-id (:instance-id %)]
                [:private-dns-name (:private-dns-name %)]
                [:global-ip (-> % :network-interfaces first
                                :private-ip-addresses first
                                :association :public-ip)]
                [:private-ip (-> % :network-interfaces first
                                 :private-ip-addresses first
                                 :private-ip-address)]))
         (map #(into {} %))
         (sort-by :node-name))))
;;=> #'aws-examples.ec2-example/get-instances-info

Let's try the function:

(get-instances-info)
;;=> ({:node-name "ECS Instance - amazon-ecs-cli-setup-my-cluster",
;;=>   :status "running",
;;=>   :instance-id "i-a1257a3e",
;;=>   :private-dns-name "ip-10-0-0-212.ap-northeast-1.compute.internal",
;;=>   :global-ip "54.199.234.18",
;;=>   :private-ip "10.0.0.212"}
;;=>  {:node-name "EcsInstanceAsg",
;;=>   :status "terminated",
;;=>   :instance-id "i-c5bbef5a",
;;=>   :private-dns-name "",
;;=>   :global-ip nil,
;;=>   :private-ip nil})

As the preceding example shows, we can obtain the instance-id list, so we can start and stop instances using ec2/start-instances and ec2/stop-instances accordingly:

(ec2/start-instances :instance-ids '("i-c5bbef5a"))
;;=> {:starting-instances
;;=>  [{:previous-state {:code 80, :name "stopped"},
;;=>    :current-state {:code 0, :name "pending"},
;;=>    :instance-id "i-c5bbef5a"}]}

(ec2/stop-instances :instance-ids '("i-c5bbef5a"))
;;=> {:stopping-instances
;;=>  [{:previous-state {:code 16, :name "running"},
;;=>    :current-state {:code 64, :name "stopping"},
;;=>    :instance-id "i-c5bbef5a"}]}

Using S3

Amazon S3 is secure, durable, and scalable storage in the AWS cloud. It is easy for developers and other users to use, and it provides high durability and availability at low cost: the durability is 99.999999999% and the availability is 99.99%.
Let's create S3 buckets named makoto-bucket-1, makoto-bucket-2, and makoto-bucket-3 (these examples assume the S3 namespace is required, for example as [amazonica.aws.s3 :as s3]):

(s3/create-bucket "makoto-bucket-1")
;;=> {:name "makoto-bucket-1"}

(s3/create-bucket "makoto-bucket-2")
;;=> {:name "makoto-bucket-2"}

(s3/create-bucket "makoto-bucket-3")
;;=> {:name "makoto-bucket-3"}

s3/list-buckets returns information about the buckets:

(s3/list-buckets)
;;=> [{:creation-date #object[org.joda.time.DateTime 0x6a09e119 "2016-08-01T07:01:05.000+09:00"],
;;=>   :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27", :display-name "tokoma1"},
;;=>   :name "makoto-bucket-1"}
;;=>  {:creation-date #object[org.joda.time.DateTime 0x7392252c "2016-08-01T17:35:30.000+09:00"],
;;=>   :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27", :display-name "tokoma1"},
;;=>   :name "makoto-bucket-2"}
;;=>  {:creation-date #object[org.joda.time.DateTime 0x4d59b4cb "2016-08-01T17:38:59.000+09:00"],
;;=>   :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27", :display-name "tokoma1"},
;;=>   :name "makoto-bucket-3"}]

There are now three buckets visible in your AWS console. Let's delete two of the three (s3/delete-bucket is amazonica's wrapper for bucket deletion) and list what remains:

(s3/delete-bucket "makoto-bucket-2")
(s3/delete-bucket "makoto-bucket-3")

(s3/list-buckets)
;;=> [{:creation-date #object[org.joda.time.DateTime 0x56387509 "2016-08-01T07:01:05.000+09:00"],
;;=>   :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27", :display-name "tokoma1"},
;;=>   :name "makoto-bucket-1"}]

Only one bucket remains. Now we will demonstrate how to send your local data to S3. s3/put-object uploads the content of a file to the specified bucket and key. The following code uploads /etc/hosts to makoto-bucket-1:

(s3/put-object :bucket-name "makoto-bucket-1"
               :key "test/hosts"
               :file (java.io.File. "/etc/hosts"))
;;=> {:requester-charged? false, :content-md5 "HkBljfktNTl06yScnMRsjA==",
;;=>  :etag "1e40658df92d353974eb249c9cc46c8c", :metadata {:content-disposition nil,
;;=>  :expiration-time-rule-id nil, :user-metadata nil, :instance-length 0, :version-id nil,
;;=>  :server-side-encryption nil, :etag "1e40658df92d353974eb249c9cc46c8c", :last-modified nil,
;;=>  :cache-control nil, :http-expires-date nil, :content-length 0, :content-type nil,
;;=>  :restore-expiration-time nil, :content-encoding nil, :expiration-time nil, :content-md5 nil,
;;=>  :ongoing-restore nil}}

s3/list-objects lists the objects in a bucket:

(s3/list-objects :bucket-name "makoto-bucket-1")
;;=> {:truncated? false, :bucket-name "makoto-bucket-1", :max-keys 1000, :common-prefixes [],
;;=>  :object-summaries [{:storage-class "STANDARD", :bucket-name "makoto-bucket-1",
;;=>  :etag "1e40658df92d353974eb249c9cc46c8c",
;;=>  :last-modified #object[org.joda.time.DateTime 0x1b76029c "2016-08-01T07:01:16.000+09:00"],
;;=>  :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27",
;;=>  :display-name "tokoma1"}, :key "test/hosts", :size 380}]}
To obtain the contents of objects in buckets, use s3/get-object:

(s3/get-object :bucket-name "makoto-bucket-1" :key "test/hosts")
;;=> {:bucket-name "makoto-bucket-1", :key "test/hosts",
;;=>  :input-stream #object[com.amazonaws.services.s3.model.S3ObjectInputStream 0x24f810e9
;;=>  ......
;;=>  :last-modified #object[org.joda.time.DateTime 0x79ad1ca9 "2016-08-01T07:01:16.000+09:00"],
;;=>  :cache-control nil, :http-expires-date nil, :content-length 380, :content-type "application/octet-stream",
;;=>  :restore-expiration-time nil, :content-encoding nil, :expiration-time nil, :content-md5 nil,
;;=>  :ongoing-restore nil}}

The result is a map; the content itself is stream data, the value of :object-content. To get the result as a string, we use slurp:

(slurp (:object-content (s3/get-object :bucket-name "makoto-bucket-1" :key "test/hosts")))
;;=> "127.0.0.1\tlocalhost\n127.0.1.1\tphenix\n\n# The following lines are desirable for IPv6 capable hosts\n::1 ip6-localhost ip6-loopback\nfe00::0 ip6-localnet\nff00::0 ip6-mcastprefix\nff02::1 ip6-allnodes\nff02::2 ip6-allrouters\n\n52.8.30.189 my-cluster01-proxy1 \n52.8.169.10 my-cluster01-master1 \n52.8.198.115 my-cluster01-slave01 \n52.9.12.12 my-cluster01-slave02\n\n52.8.197.100 my-node01\n"

Using Amazon SQS

Amazon SQS is a high-performance, highly available, and scalable queue service. We will demonstrate how easy it is to handle messages on SQS queues using Clojure:

(ns aws-examples.sqs-example
  (:require [amazonica.core :as core]
            [amazonica.aws.sqs :as sqs]))

To create a queue, use sqs/create-queue:

(sqs/create-queue :queue-name "makoto-queue"
                  :attributes {:VisibilityTimeout 3000
                               :MaximumMessageSize 65536
                               :MessageRetentionPeriod 1209600
                               :ReceiveMessageWaitTimeSeconds 15})
;;=> {:queue-url "https://sqs.ap-northeast-1.amazonaws.com/864062283993/makoto-queue"}

To get information about a queue, use sqs/get-queue-attributes:

(sqs/get-queue-attributes "makoto-queue")
;;=> {:QueueArn "arn:aws:sqs:ap-northeast-1:864062283993:makoto-queue", ...

You can configure a dead letter queue using sqs/assign-dead-letter-queue:

(sqs/create-queue "DLQ")
;;=> {:queue-url "https://sqs.ap-northeast-1.amazonaws.com/864062283993/DLQ"}

(sqs/assign-dead-letter-queue (sqs/find-queue "makoto-queue") (sqs/find-queue "DLQ") 10)
;;=> nil

Let's list the queues we have defined:

(sqs/list-queues)
;;=> {:queue-urls ["https://sqs.ap-northeast-1.amazonaws.com/864062283993/DLQ"
;;=>               "https://sqs.ap-northeast-1.amazonaws.com/864062283993/makoto-queue"]}

Let's examine the URLs of the queues:

(sqs/find-queue "makoto-queue")
;;=> "https://sqs.ap-northeast-1.amazonaws.com/864062283993/makoto-queue"

(sqs/find-queue "DLQ")
;;=> "https://sqs.ap-northeast-1.amazonaws.com/864062283993/DLQ"

To send messages, we use sqs/send-message:

(sqs/send-message (sqs/find-queue "makoto-queue") "hello sqs from Clojure")
;;=> {:md5of-message-body "00129c8cc3c7081893765352a2f71f97", :message-id "690ddd68-a2f6-45de-b6f1-164eb3c9370d"}

To receive messages, we use sqs/receive-message:

(sqs/receive-message "makoto-queue")
;;=> {:messages [{:md5of-body "00129c8cc3c7081893765352a2f71f97",
;;=>              :receipt-handle "AQEB.....",
;;=>              :message-id "bd56fea8-4c9f-4946-9521-1d97057f1a06",
;;=>              :body "hello sqs from Clojure"}]}

To remove all messages from a queue, we use sqs/purge-queue:

(sqs/purge-queue :queue-url (sqs/find-queue "makoto-queue"))
;;=> nil

To delete queues, we use sqs/delete-queue:

(sqs/delete-queue "makoto-queue")
;;=> nil

(sqs/delete-queue "DLQ")
;;=> nil
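In a long-running consumer, you would normally delete each message individually after processing it, using its receipt handle, rather than purging the whole queue. Here is a hedged sketch; sqs/delete-message wraps the SDK's deleteMessage, but check the amazonica README for the exact calling convention in your version:

;; Illustrative sketch: receive one message, process it, then delete it
;; by receipt handle so that it is not redelivered.
(let [queue (sqs/find-queue "makoto-queue")
      resp  (sqs/receive-message :queue-url queue :max-number-of-messages 1)]
  (doseq [msg (:messages resp)]
    (println "processing:" (:body msg))
    (sqs/delete-message :queue-url queue :receipt-handle (:receipt-handle msg))))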
Serverless Clojure with AWS Lambda
Lambda is an AWS product that allows you to run Clojure code without the hassle and expense of setting up and maintaining a server environment. Behind the scenes, there are still servers involved, but as far as you are concerned, it is a serverless environment: upload a JAR and you are good to go.

Code running on Lambda is invoked in response to an event, such as a file being uploaded to S3, or according to a specified schedule. In production environments, Lambda is normally used within a wider AWS deployment that includes standard server environments, handling discrete computational tasks, particularly those that benefit from Lambda's horizontal scaling, which just happens, with no configuration required.

For Clojurians working on personal projects, Lambda is a wonderful combination of power and limitation. Just how far can you hack Lambda given the constraints imposed by AWS?

Clojure namespace helloworld
Start off with a clean, empty project generated using lein new. From there, in your IDE of choice, configure a package and a new Clojure source file. In the following example, the package is com.sakkam and the source file uses the Clojure namespace helloworld.

The entry point to your Lambda code is a Clojure function that is exposed as a method of a Java class using Clojure's gen-class. Similar to use and require, the gen-class function can be included in the Clojure ns definition, as follows, or specified separately. You can use any name you want for the handler function, but its prefix must be a hyphen unless an alternate prefix is specified as part of the :methods definition (here, the method handler is declared, so the implementing function must be -handler):

(ns com.sakkam.lambda.helloworld
  (:gen-class
    :methods [^:static [handler [String] String]]))

(defn -handler [s]
  (println (str "Hello," s)))

From the command line, use lein uberjar to create a JAR that can be uploaded to AWS Lambda.

Hello World – the AWS part
Getting your Hello World to work is now a matter of creating a new Lambda within AWS, uploading your JAR, and configuring your handler.

Hello Stream
The handler method we used in our Hello World Lambda function was coded directly and could be extended to accept custom Java classes as part of the method signature. However, for more complex Java integrations, implementing one of AWS's standard interfaces for Lambda is both straightforward and feels more like idiomatic Clojure. The following example replaces our own definition of a handler method with an implementation of a standard interface that is provided as part of the aws-lambda-java-core library.

First of all, add the dependency [com.amazonaws/aws-lambda-java-core "1.0.0"] to your project.clj. While you are modifying your project.clj, also add the dependency [org.clojure/data.json "0.2.6"], since we will be manipulating JSON-formatted objects as part of this exercise. Then, either create a new Clojure namespace or modify your existing one so that it looks like the following (the handler function must be named -handleRequest, since handleRequest is specified as part of the interface):

(ns aws-examples.lambda-example
  (:gen-class
    :implements [com.amazonaws.services.lambda.runtime.RequestStreamHandler])
  (:require [clojure.java.io :as io]
            [clojure.data.json :as json]
            [clojure.string :as str]))

(defn -handleRequest [this is os context]
  (let [w (io/writer os)
        parameters (json/read (io/reader is) :key-fn keyword)]
    (println "Lambda Hello Stream Output ")
    (println "this class: " (class this))
    (println "is class:" (class is))
    (println "os class:" (class os))
    (println "context class:" (class context))
    (println "Parameters are " parameters)
    (.flush w)))

Use lein uberjar again to create a JAR file.
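For reference, a minimal project.clj for this exercise might look like the following sketch. The project name and versions are placeholders; the important detail is :aot :all, which is needed so that the gen-class namespaces are compiled ahead of time into the Java classes that Lambda loads from the uberjar:

(defproject aws-examples "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [com.amazonaws/aws-lambda-java-core "1.0.0"]
                 [org.clojure/data.json "0.2.6"]]
  ;; AOT compilation is required so gen-class emits real Java classes.
  :aot :all)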
Since we have an existing Lambda function in AWS, we can overwrite the JAR used in the Hello World example. Since the handler function name has changed, we must modify our Lambda configuration to match. This time, the default test that provides parameters in JSON format should work as is, and the result will look something like the following:

We can very easily get a more interesting test of Hello Stream by configuring this Lambda to run whenever a file is uploaded to S3. On the Lambda management page, choose the Event Sources tab, click on Add Event, and choose an S3 bucket to which you can easily add a file. Now, upload a file to the specified S3 bucket and then navigate to the logs of the Lambda function. You will find that it has been automatically invoked, and a fairly complicated object representing the uploaded file has been supplied as a parameter to our Lambda function.

Real-world Lambdas
To graduate from a Hello World Lambda to real-world Lambdas, the chances are you are going to need richer integration with other AWS facilities. As a minimum, you will probably want to write a file to an S3 bucket or insert a notification into an SNS queue. Amazon provides an SDK that makes this integration straightforward for developers using standard Java. For Clojurians, using the Amazonica wrapper is a very fast and easy way to achieve the same.

How it works…
Here, we will explain how AWS works.

What is Amazon EC2?
Using EC2, we don't need to buy hardware or install an operating system. Amazon provides various types of instances for customers' use cases, and each instance type has a different combination of CPU, memory, storage, and networking capacity. Some instance types are given in the following table; you can select appropriate instances according to the characteristics of your application.

Instance type   Description
M4              Designed for general-purpose computing. This family provides a balance of CPU, memory, and network bandwidth.
C4              Designed for applications that consume CPU resources, offering the highest CPU performance at the lowest cost.
R3              For memory-intensive applications.
G2              Has NVIDIA GPUs and is used for graphics applications and GPU computing applications such as deep learning.

The following table shows the model variations of the M4 instance type; you can choose the best one among these models.

Model         vCPU   RAM (GiB)   EBS bandwidth (Mbps)
m4.large      2      8           450
m4.xlarge     4      16          750
m4.2xlarge    8      32          1,000
m4.4xlarge    16     64          2,000
m4.10xlarge   40     160         4,000

Amazon S3
Amazon S3 is storage for the cloud. It provides a simple web interface that allows you to store and retrieve data. The S3 API is easy to use while still ensuring security. S3 provides cloud storage services that are scalable, reliable, fast, and inexpensive.

Buckets and keys
Buckets are containers for objects stored in Amazon S3; objects are stored in buckets. A bucket name is unique among all regions of the world, so bucket names are the top-level identities of S3 and the units of charging and access control.

Keys are the unique identifiers for objects within a bucket; every object in a bucket has exactly one key. Keys are the second-level identifiers and must be unique within a bucket. To identify an object, you use the combination of bucket name and key name.

Objects
Objects are accessed by a bucket name and a key. Objects consist of data and metadata. Metadata is a set of name-value pairs that describe the characteristics of the object. Examples of metadata are the date last modified and the content type. Objects can have multiple versions of data.

There's more…
It is clearly impossible to review all the different APIs for all the different services offered via the Amazonica library, but you have probably got a feeling of the tremendous power in your hands right now. (Don't forget to give that credit card back to your boss now…)

Some other examples of Amazon services are as follows:
Amazon IoT: This proposes a way to let connected devices easily and securely interact with cloud applications and other devices.
Amazon Kinesis: This gives you ways of easily loading massive volumes of streaming data into AWS and easily analyzing them through streaming techniques.

Summary
We hope you enjoyed this appetizer to the book Clojure Programming Cookbook, which presents a set of progressive readings to improve your Clojure skills and make Clojure your de facto everyday language for professional and efficient work. The book covers different topics of general programming, always to the point, with some fun, so that each recipe feels less like a classroom and more like a fun read, with challenging exercises left to the reader to gradually build up skills. See you in the book!
Fast Data Manipulation with R

Packt
14 Oct 2016
28 min read
Data analysis is a combination of art and science. The art part consists of data exploration and visualization, which is usually done best with good intuition and an understanding of the data. The science part consists of statistical analysis, which relies on concrete knowledge of statistics and analytic skills. However, both parts of serious research require proper tools and good skills to work with them. R is exactly the proper tool to do data analysis with. In this article by Kun Ren, author of the book Learning R Programming, we will discuss how R and the data.table package make it easy to transform data and, thus, greatly unleash our productivity.

(For more resources related to this topic, see here.)

Loading data as data frames
The most basic data structures in R are atomic vectors, such as numeric, logical, character, and complex vectors, and lists. An atomic vector stores elements of the same type, while a list is allowed to store elements of different types. The most commonly used data structure in R for real-world data is the data frame, which stores data in tabular form. In essence, a data frame is a list of vectors of equal length but possibly different types.

Most of the code in this article is based on a set of fictitious data about some products (you can download the data at https://gist.github.com/renkun-ken/ba2d33f21efded23db66a68240c20c92). We will use the readr package to load the data for better handling of column types. If you don't have this package installed, please run install.packages("readr").

library(readr)
product_info <- read_csv("data/product-info.csv")
product_info
##    id      name  type   class released
## 1 T01    SupCar   toy vehicle      yes
## 2 T02  SupPlane   toy vehicle       no
## 3 M01     JeepX model vehicle      yes
## 4 M02 AircraftX model vehicle      yes
## 5 M03    Runner model  people      yes
## 6 M04    Dancer model  people       no

Once the data is loaded into memory as a data frame, we can take a look at its column types, as follows:

sapply(product_info, class)
##          id        name        type       class    released
## "character" "character" "character" "character" "character"

Using built-in functions to manipulate data frames
Although a data frame is essentially a list of vectors, we can access it like a matrix because all the column vectors have the same length. To select rows that meet certain conditions, we supply a logical vector as the first argument of [] while the second is left empty. For example, we can take out all rows of toy type, as follows:

product_info[product_info$type == "toy", ]
##    id     name type   class released
## 1 T01   SupCar  toy vehicle      yes
## 2 T02 SupPlane  toy vehicle       no

Or, we can take out all rows that are not released:

product_info[product_info$released == "no", ]
##    id     name  type   class released
## 2 T02 SupPlane   toy vehicle       no
## 6 M04   Dancer model  people       no

To filter columns, we can supply a character vector as the second argument while the first is left empty, which is exactly how we subset a matrix:

product_info[1:3, c("id", "name", "type")]
##    id     name  type
## 1 T01   SupCar   toy
## 2 T02 SupPlane   toy
## 3 M01    JeepX model

Alternatively, we can filter the data frame by regarding it as a list, supplying only one character vector of column names in []:

product_info[c("id", "name", "class")]
##    id      name   class
## 1 T01    SupCar vehicle
## 2 T02  SupPlane vehicle
## 3 M01     JeepX vehicle
## 4 M02 AircraftX vehicle
## 5 M03    Runner  people
## 6 M04    Dancer  people

To filter a data frame by both row and column, we can supply a vector as the first argument to select rows and a vector as the second to select columns:

product_info[product_info$type == "toy", c("name", "class", "released")]
##       name   class released
## 1   SupCar vehicle      yes
## 2 SupPlane vehicle       no

If the row filtering condition is based on the values of certain columns, the preceding code can be very redundant, especially when the condition gets more complicated. A built-in function that simplifies the code is subset, as introduced previously:

subset(product_info,
  subset = type == "model" & released == "yes",
  select = name:class)
##        name  type   class
## 3     JeepX model vehicle
## 4 AircraftX model vehicle
## 5    Runner model  people

The subset function uses nonstandard evaluation so that we can directly use the columns of the data frame without typing product_info many times, because the expressions are meant to be evaluated in the context of the data frame. Similarly, we can use with to evaluate an expression in the context of the data frame, that is, the columns of the data frame can be used as symbols in the expression without repeatedly specifying the data frame:

with(product_info, name[released == "no"])
## [1] "SupPlane" "Dancer"

The expression can be more than simple subsetting. We can summarize the data by counting the occurrences of each possible value of a vector. For example, we can create a table of occurrences of the types of released records:

with(product_info, table(type[released == "yes"]))
##
## model   toy
##     3     1

In addition to the table of product information, we also have a table of product statistics that describes some properties of each product:

product_stats <- read_csv("data/product-stats.csv")
product_stats
##    id material size weight
## 1 T01    Metal  120   10.0
## 2 T02    Metal  350   45.0
## 3 M01 Plastics   50     NA
## 4 M02 Plastics   85    3.0
## 5 M03     Wood   15     NA
## 6 M04     Wood   16    0.6

Now, how can we get the names of the products with the top three largest sizes? One way is to sort the records in product_stats by size in descending order, select the id values of the top three records, and use these values to filter the rows of product_info by id:

top_3_id <- product_stats[order(product_stats$size, decreasing = TRUE), "id"][1:3]
product_info[product_info$id %in% top_3_id, ]
##    id      name  type   class released
## 1 T01    SupCar   toy vehicle      yes
## 2 T02  SupPlane   toy vehicle       no
## 4 M02 AircraftX model vehicle      yes

This approach looks quite redundant. Note that product_info and product_stats actually describe the same set of products from different perspectives. The connection between the two tables is the id column: each id is unique and refers to the same product in both tables. To access both sets of information, we can put the two tables together into one data frame. The simplest way to do this is to use merge:

product_table <- merge(product_info, product_stats, by = "id")
product_table
##    id      name  type   class released material size weight
## 1 M01     JeepX model vehicle      yes Plastics   50     NA
## 2 M02 AircraftX model vehicle      yes Plastics   85    3.0
## 3 M03    Runner model  people      yes     Wood   15     NA
## 4 M04    Dancer model  people       no     Wood   16    0.6
## 5 T01    SupCar   toy vehicle      yes    Metal  120   10.0
## 6 T02  SupPlane   toy vehicle       no    Metal  350   45.0

The new data frame product_table combines product_info and product_stats through the shared id column. In fact, even if you reorder the records in the second table, the two tables can still be correctly merged. With the combined version, we can do things more easily. For example, we can sort the data frame by a column that originated in either source table without having to manually work with the other:

product_table[order(product_table$size), ]
##    id      name  type   class released material size weight
## 3 M03    Runner model  people      yes     Wood   15     NA
## 4 M04    Dancer model  people       no     Wood   16    0.6
## 1 M01     JeepX model vehicle      yes Plastics   50     NA
## 2 M02 AircraftX model vehicle      yes Plastics   85    3.0
## 5 T01    SupCar   toy vehicle      yes    Metal  120   10.0
## 6 T02  SupPlane   toy vehicle       no    Metal  350   45.0

To solve the earlier problem, we can directly use the merged table and get the same answer:

product_table[order(product_table$size, decreasing = TRUE), "name"][1:3]
## [1] "SupPlane"  "SupCar"    "AircraftX"

The merged data frame allows us to sort the records by a column in one data frame and filter the records by a column in the other. For example, we can first sort the product records by weight in descending order and select all records of model type:

product_table[order(product_table$weight, decreasing = TRUE), ][
  product_table$type == "model", ]
##    id      name  type   class released material size weight
## 6 T02  SupPlane   toy vehicle       no    Metal  350   45.0
## 5 T01    SupCar   toy vehicle      yes    Metal  120   10.0
## 2 M02 AircraftX model vehicle      yes Plastics   85    3.0
## 4 M04    Dancer model  people       no     Wood   16    0.6

Note, however, that the logical filter product_table$type == "model" refers to the original row order rather than the sorted one, which is why toy records also appear in the output; chained subsetting like this needs care.

Sometimes, the column values are literal but can be converted to standard R data structures to better represent the data. For example, the released column in product_info only takes yes and no, which can be better represented with a logical vector. We could use <- to modify the column values, as we learned previously. However, it is usually better to create a new data frame with the existing columns properly adjusted and new columns added, without polluting the original data. To do this, we can use transform:

transform(product_table,
  released = ifelse(released == "yes", TRUE, FALSE),
  density = weight / size)
##    id      name  type   class released material size weight
## 1 M01     JeepX model vehicle     TRUE Plastics   50     NA
## 2 M02 AircraftX model vehicle     TRUE Plastics   85    3.0
## 3 M03    Runner model  people     TRUE     Wood   15     NA
## 4 M04    Dancer model  people    FALSE     Wood   16    0.6
## 5 T01    SupCar   toy vehicle     TRUE    Metal  120   10.0
## 6 T02  SupPlane   toy vehicle    FALSE    Metal  350   45.0
##      density
## 1         NA
## 2 0.03529412
## 3         NA
## 4 0.03750000
## 5 0.08333333
## 6 0.12857143

The result is a new data frame with released converted to a logical vector and a new density column added. You can easily verify that product_table is not modified at all. Additionally, note that transform is like subset: both functions use nonstandard evaluation to allow direct use of data frame columns as symbols in the arguments, so that we don't have to type product_table$ all the time.

Now, we will load another table into R: the test results for the quality and durability of each product. We store the data in product_tests:

product_tests <- read_csv("data/product-tests.csv")
product_tests
##    id quality durability waterproof
## 1 T01      NA         10         no
## 2 T02      10          9         no
## 3 M01       6          4        yes
## 4 M02       6          5        yes
## 5 M03       5         NA        yes
## 6 M04       6          6        yes

Note that the values in both quality and durability contain missing values (NA). To exclude all rows with missing values, we can use na.omit():

na.omit(product_tests)
##    id quality durability waterproof
## 2 T02      10          9         no
## 3 M01       6          4        yes
## 4 M02       6          5        yes
## 6 M04       6          6        yes

Another way is to use complete.cases() to get a logical vector indicating all complete rows, that is, rows without any missing value:

complete.cases(product_tests)
## [1] FALSE  TRUE  TRUE  TRUE FALSE  TRUE

Then, we can use this logical vector to filter the data frame. For example, we can get the id column of all complete rows as follows:

product_tests[complete.cases(product_tests), "id"]
## [1] "T02" "M01" "M02" "M04"

Or, we can get the id column of all incomplete rows:

product_tests[!complete.cases(product_tests), "id"]
## [1] "T01" "M03"

Note that product_info, product_stats, and product_tests all share an id column, so we can merge them all together. Unfortunately, there's no built-in function to merge an arbitrary number of data frames; we can only merge two existing data frames at a time, or we'll have to merge them recursively.
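As an aside, a recursive merge over any number of tables can be sketched with Reduce, which folds the two-table merge over a list of data frames; this sketch is not part of the original example, and it assumes all the tables share the id column:

# Fold a two-table merge over the list, merging one table at a time.
product_full <- Reduce(function(x, y) merge(x, y, by = "id"),
                       list(product_info, product_stats, product_tests))

Merging two at a time, we can combine the already merged product_table with product_tests as follows: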
merge(product_table, product_tests, by = "id")
##    id      name  type   class released material size weight
## 1 M01     JeepX model vehicle      yes Plastics   50     NA
## 2 M02 AircraftX model vehicle      yes Plastics   85    3.0
## 3 M03    Runner model  people      yes     Wood   15     NA
## 4 M04    Dancer model  people       no     Wood   16    0.6
## 5 T01    SupCar   toy vehicle      yes    Metal  120   10.0
## 6 T02  SupPlane   toy vehicle       no    Metal  350   45.0
##   quality durability waterproof
## 1       6          4        yes
## 2       6          5        yes
## 3       5         NA        yes
## 4       6          6        yes
## 5      NA         10         no
## 6      10          9         no

Data wrangling with data.table
In the previous section, we had an overview of how we can use built-in functions to work with data frames. Built-in functions work, but they are usually verbose. In this section, we will use data.table, an enhanced version of data.frame, and see how it makes data manipulation much easier. Run install.packages("data.table") to install the package. Once the package is ready, we can load it and use fread() to read the data files as data.table objects:

library(data.table)
product_info <- fread("data/product-info.csv")
product_stats <- fread("data/product-stats.csv")
product_tests <- fread("data/product-tests.csv")
toy_tests <- fread("data/product-toy-tests.csv")

It is extremely easy to filter data in data.table. To select the first two rows, just use [1:2]; note that for a data.frame, this would instead select the first two columns:

product_info[1:2]
##     id     name type   class released
## 1: T01   SupCar  toy vehicle      yes
## 2: T02 SupPlane  toy vehicle       no

To filter by logical conditions, just directly type column names as variables without quotation, as the expression is evaluated within the context of product_info:

product_info[type == "model" & class == "people"]
##     id   name  type  class released
## 1: M03 Runner model people      yes
## 2: M04 Dancer model people       no

It is easy to select or transform columns:

product_stats[, .(id, material, density = size / weight)]
##     id material   density
## 1: T01    Metal 12.000000
## 2: T02    Metal  7.777778
## 3: M01 Plastics        NA
## 4: M02 Plastics 28.333333
## 5: M03     Wood        NA
## 6: M04     Wood 26.666667

The data.table object also supports using a key for subsetting, which can be much faster than using ==. We can set a column as the key for each data.table:

setkey(product_info, id)
setkey(product_stats, id)
setkey(product_tests, id)

Then, we can use a value to directly select rows:

product_info["M02"]
##     id      name  type   class released
## 1: M02 AircraftX model vehicle      yes

We can also set multiple columns as the key so as to use multiple values for subsetting:

setkey(toy_tests, id, date)
toy_tests[.("T02", 20160303)]
##     id     date sample quality durability
## 1: T02 20160303     75       8          8

If two data.table objects share the same key, we can join them easily:

product_info[product_tests]
##     id      name  type   class released quality durability
## 1: M01     JeepX model vehicle      yes       6          4
## 2: M02 AircraftX model vehicle      yes       6          5
## 3: M03    Runner model  people      yes       5         NA
## 4: M04    Dancer model  people       no       6          6
## 5: T01    SupCar   toy vehicle      yes      NA         10
## 6: T02  SupPlane   toy vehicle       no      10          9
##    waterproof
## 1:        yes
## 2:        yes
## 3:        yes
## 4:        yes
## 5:         no
## 6:         no

Instead of creating a new data.table, in-place modification is also supported. The := operator sets the values of a column in place without the overhead of making copies and is, thus, much faster than using <-:

product_info[, released := (released == "yes")]
product_info
##     id      name  type   class released
## 1: M01     JeepX model vehicle     TRUE
## 2: M02 AircraftX model vehicle     TRUE
## 3: M03    Runner model  people     TRUE
## 4: M04    Dancer model  people    FALSE
## 5: T01    SupCar   toy vehicle     TRUE
## 6: T02  SupPlane   toy vehicle    FALSE

Another important argument for subsetting a data.table is by, which is used to split the data into multiple parts; for each part, the second argument (j) is evaluated. For example, the simplest usage of by is counting the records in each group. In the following code, we count the number of both released and unreleased products:

product_info[, .N, by = released]
##    released N
## 1:     TRUE 4
## 2:    FALSE 2

The group can be defined by more than one variable. For example, a tuple of type and class can be a group, and for each group, we can count the number of records as follows:

product_info[, .N, by = .(type, class)]
##     type   class N
## 1: model vehicle 2
## 2: model  people 2
## 3:   toy vehicle 2

We can also perform statistical calculations for each group:

product_tests[, .(mean_quality = mean(quality, na.rm = TRUE)),
  by = .(waterproof)]
##    waterproof mean_quality
## 1:        yes         5.75
## 2:         no        10.00

We can chain multiple [] in turn. In the following example, we first join product_info and product_tests by the shared key id and then calculate the mean values of quality and durability for each group of type and class of released products:

product_info[product_tests][released == TRUE,
  .(mean_quality = mean(quality, na.rm = TRUE),
    mean_durability = mean(durability, na.rm = TRUE)),
  by = .(type, class)]
##     type   class mean_quality mean_durability
## 1: model vehicle            6             4.5
## 2: model  people            5             NaN
## 3:   toy vehicle          NaN            10.0

Note that the values of the by columns will be unique in the resulting data.table; we can use keyby instead of by to ensure that they are automatically used as the key of the resulting data.table:

product_info[product_tests][released == TRUE,
  .(mean_quality = mean(quality, na.rm = TRUE),
    mean_durability = mean(durability, na.rm = TRUE)),
  keyby = .(type, class)]
##     type   class mean_quality mean_durability
## 1: model  people            5             NaN
## 2: model vehicle            6             4.5
## 3:   toy vehicle          NaN            10.0

The data.table package also provides functions to perform superfast reshaping of data. For example, we can use dcast() to spread id values along the x-axis as columns and align quality values to all possible date values along the y-axis:

toy_quality <- dcast(toy_tests, date ~ id, value.var = "quality")
toy_quality
##        date T01 T02
## 1: 20160201   9   7
## 2: 20160302  10  NA
## 3: 20160303  NA   8
## 4: 20160403  NA   9
## 5: 20160405   9  NA
## 6: 20160502   9  10

Although a test is conducted for each product every month, the dates may not exactly match each other. This results in missing values if one product has a value on a day but the other has no corresponding value on exactly the same day.

One way to fix this is to use year-month data instead of the exact date. In the following code, we create a new ym column that contains the first six characters of the date column. For example, substr(20160101, 1, 6) results in "201601":

toy_tests[, ym := substr(date, 1, 6)]
toy_tests
##     id     date sample quality durability     ym
## 1: T01 20160201    100       9          9 201602
## 2: T01 20160302    150      10          9 201603
## 3: T01 20160405    180       9         10 201604
## 4: T01 20160502    140       9          9 201605
## 5: T02 20160201     70       7          9 201602
## 6: T02 20160303     75       8          8 201603
## 7: T02 20160403     90       9          8 201604
## 8: T02 20160502     85      10          9 201605

toy_tests$ym
## [1] "201602" "201603" "201604" "201605" "201602" "201603"
## [7] "201604" "201605"

This time, we use ym for alignment instead of date:

toy_quality <- dcast(toy_tests, ym ~ id, value.var = "quality")
toy_quality
##        ym T01 T02
## 1: 201602   9   7
## 2: 201603  10   8
## 3: 201604   9   9
## 4: 201605   9  10

Now that the missing values are gone, the quality scores of both products in each month are presented naturally.

Sometimes, we need to combine a number of columns into one column that indicates the measure and another that stores the measured value. For example, the following code uses melt() to combine the two measures (quality and durability) of the original data into a column named measure and a column of the measured values:

toy_tests2 <- melt(toy_tests, id.vars = c("id", "ym"),
  measure.vars = c("quality", "durability"),
  variable.name = "measure")
toy_tests2
##      id     ym    measure value
##  1: T01 201602    quality     9
##  2: T01 201603    quality    10
##  3: T01 201604    quality     9
##  4: T01 201605    quality     9
##  5: T02 201602    quality     7
##  6: T02 201603    quality     8
##  7: T02 201604    quality     9
##  8: T02 201605    quality    10
##  9: T01 201602 durability     9
## 10: T01 201603 durability     9
## 11: T01 201604 durability    10
## 12: T01 201605 durability     9
## 13: T02 201602 durability     9
## 14: T02 201603 durability     8
## 15: T02 201604 durability     8
## 16: T02 201605 durability     9

The variable names are now contained in the data, which can be directly used by some packages. For example, we can use ggplot2 to plot data in this format. The following code is an example of a scatter plot with a facet grid over the different combinations of factors:

library(ggplot2)
ggplot(toy_tests2, aes(x = ym, y = value)) +
  geom_point() +
  facet_grid(id ~ measure)

The generated graph is shown as follows:

The plot can be easily manipulated because the grouping factor (measure) is contained in the data rather than in the column names, which is easier to work with from the perspective of the ggplot2 package:

ggplot(toy_tests2, aes(x = ym, y = value, color = id)) +
  geom_point() +
  facet_grid(. ~ measure)

The generated graph is shown as follows:

Summary
In this article, we used both built-in functions and the data.table package to perform simple data manipulation tasks. Using built-in functions can be verbose, while using data.table can be much easier and faster. However, the tasks in real-world data analysis can be much more complex than the examples we demonstrated, which also requires better R programming skills. It is helpful to have a good understanding of how nonstandard evaluation makes data.table so easy to work with, how environments work and how scoping rules apply to make your code predictable, and so on. A universal and consistent understanding of how R basically works will give you great confidence to write R code to work with data and enable you to learn packages very quickly.
Loops, Conditions, and Recursion

Packt
14 Oct 2016
14 min read
In this article from Paul Johnson, author of the book Learning Rust, we take a look at loops and conditions, which are a fundamental aspect of operation within any programming language. You may be looping around a list attempting to find when something matches, and when a match occurs, branching out to perform some other task; or, you may just want to check a value to see if it meets a condition. In any case, Rust allows you to do this.

(For more resources related to this topic, see here.)

In this article, we will cover the following topics:
Types of loop available
Different types of branching within loops
Recursive methods
When the semi-colon (;) can be omitted and what it means

Loops
Rust has essentially three types of loop: for, loop, and while.

The for loop
This type of loop is very simple to understand, yet rather powerful in operation. It is simple in that we have a start value, an end condition, and some form of value change; the power comes from those last two points. Let's take a simple example to start with: a loop that goes from 0 to 10 and outputs the value:

for x in 0..10 {
    println!("{},", x);
}

We create a variable x that takes each value in the expression (0..10) and does something with it. In Rust terminology, x is not only a variable but also an iterator, as it gives back a value from a series of elements.

This is obviously a very simple example. We can also count downwards, but the syntax is slightly different. In C, you would expect something akin to for (i = 10; i > 0; --i). This is not available in Rust, at least not in the stable branches. Instead, we use the rev() method, as follows:

for x in (0..10).rev() {
    println!("{},", x);
}

It is worth noting that, as with the C family, the last number is excluded, so the first example outputs the values 0 to 9; the rev() version generates the same values and then outputs them in reverse, from 9 down to 0. Notice also that the range is wrapped in parentheses; this is because rev() is called on the range as a whole.

In C#, this is the equivalent of a foreach. In Rust, it is as follows:

for var in condition {
    // do something
}

The C# equivalent for the preceding code is:

foreach(var t in condition)
    // do something

Using enumerate
A loop condition can also be more complex, using multiple conditions and variables. For example, the for loop can be tracked using enumerate, which keeps track of how many times the loop has executed, as shown here:

for (i, j) in (10..20).enumerate() {
    println!("loop has executed {} times. j = {}", i, j);
}

The following is the output:

The enumeration is given in the first variable, with the condition in the second. This example is not of much use in itself; where enumerate comes into its own is when looping over an iterator. Say we have an array that we need to iterate over to obtain the values. Here, enumerate can be used to obtain the values of the array members. However, the value returned in the condition will be a reference, so code such as that shown in the following example will fail to execute (line is a & reference, whereas an i32 is expected):

fn main() {
    let my_array: [i32; 7] = [1i32, 3, 5, 7, 9, 11, 13];
    let mut value = 0i32;
    for (_, line) in my_array.iter().enumerate() {
        value += line;
    }
    println!("{}", value);
}

This can be fixed simply by dereferencing the value, as follows:

for (_, line) in my_array.iter().enumerate() {
    value += *line;
}

The iter().enumerate() method can equally be used with the Vec type, as shown in the following code:

fn main() {
    let my_array = vec![1i32, 3, 5, 7, 9, 11, 13];
    let mut value = 0i32;
    for (_, line) in my_array.iter().enumerate() {
        value += *line;
    }
    println!("{}", value);
}

In both cases, the value given at the end will be 49, as shown in the following screenshot:

The _ parameter
You may be wondering what the _ parameter is. It's Rust's way of saying that there is a value, but we'll never do anything with it; it is a throw-away that is only there to ensure that the code compiles. The _ parameter cannot be referred to either: we can do something with linenumber in for(linenumber, line), but we can't do anything with _ in for(_, line).

The simple loop
The simplest form of loop is called loop:

loop {
    println!("Hello");
}

The preceding code will output Hello until the application is terminated or the loop reaches a terminating statement.

While…
The while loop is of slightly more use, as you will see in the following code snippet:

while (condition) {
    // do something
}

Let's take a look at the following example:

fn main() {
    let mut done = 0u32;
    while done != 32 {
        println!("done = {}", done);
        done += 1;
    }
}

The preceding code will output done = 0 to done = 31. The loop terminates when done equals 32.

Prematurely terminating a loop
Depending on the size of the data being iterated over within a loop, the loop can be costly in processor time. For example, say a server is receiving data from a data-logging application, such as measured values from a gas chromatograph; over the entire scan, it may record roughly half a million data points with associated time positions.

For our purposes, we want to add up all of the recorded values until a value of over 1.5 is reached, and once it is, we can stop the loop. Sounds easy? One thing has not been mentioned: there is no guarantee that a recorded value will ever be over 1.5, so the loop must stop either when such a value is found or when the data runs out. We can handle this in one of two ways.

The first is to use a while loop and introduce a Boolean to act as the test condition. In the following example, my_array represents a very small subsection of the data sent to the server:

fn main() {
    let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9];
    let mut counter: usize = 0;
    let mut result = 0f32;
    let mut test = false;
    while test != true {
        if my_array[counter] > 1.5 {
            test = true;
        } else {
            result += my_array[counter];
            counter += 1;
        }
    }
    println!("{}", result);
}

The result here is 4.4. This code is perfectly acceptable, if slightly long-winded. Rust also allows the use of the break and continue keywords (if you're familiar with C, they work in the same way). Our code using break will be as follows:

fn main() {
    let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9];
    let mut result = 0f32;
    for (_, value) in my_array.iter().enumerate() {
        if *value > 1.5 {
            break;
        } else {
            result += *value;
        }
    }
    println!("{}", result);
}

Again, this will give an answer of 4.4, showing that the two methods are equivalent. If we replace break with continue in the preceding code example, we will get the same result (4.4). The difference between break and continue is that continue jumps to the next value in the iteration rather than jumping out, so if we had a final value of 1.3 in my_array, the output at the end would be 5.7. When using break and continue, always keep this difference in mind; mistaking one for the other may not crash the code, but it may lead to results that you do not expect or want.

Using loop labels
Rust allows us to label our loops. This can be very useful, for example, with nested loops. These labels act as symbolic names for the loops, and as a loop has a name, we can instruct the application to perform a task on that name. Consider the following simple example:

fn main() {
    'outer_loop: for x in 0..10 {
        'inner_loop: for y in 0..10 {
            if x % 2 == 0 { continue 'outer_loop; }
            if y % 2 == 0 { continue 'inner_loop; }
            println!("x: {}, y: {}", x, y);
        }
    }
}

What will this code do? Here, x % 2 == 0 (or y % 2 == 0) means that if the variable divided by two leaves no remainder, the condition is met and the code in the braces is executed. When x is even, we tell the application to skip to the next iteration of outer_loop; likewise, in the inner loop, when y is even, we skip to the next iteration of inner_loop. As a result, only the combinations where both x and y are odd are printed. In this case, the application will output the following results:

While this example may seem very simple, it does allow for a great deal of speed when checking data. Let's go back to our previous example of data being sent to the web service. Recall that we have two values: the recorded data and an associated value which, for ease, we will call a data point. Each data point is recorded 0.2 seconds apart; therefore, every 5th data point is a whole second. This time, we want all of the values where the data is greater than 1.5, along with the associated time of each data point, but only at times that are dead on a second. As we want the code to be understandable and human-readable, we can use a loop label on each loop. The following code is not quite correct. Can you spot why? The code compiles, as follows:

fn main() {
    let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1,
                        1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7];
    let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0,
                       2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8];
    'time_loop: for (_, time_value) in my_time.iter().enumerate() {
        'data_loop: for (_, value) in my_array.iter().enumerate() {
            if *value < 1.5 {
                continue 'data_loop;
            }
            if *time_value % 5f32 == 0f32 {
                continue 'time_loop;
            }
            println!("Data point = {} at time {}s", *value, *time_value);
        }
    }
}

This example is a very good demonstration of choosing the correct operator. The issue is the if *time_value % 5f32 == 0f32 line: we are taking a float value and using the modulus of another float to see if we end up with 0 as a float.
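To see why this is fragile, consider the classic floating-point comparison below. The sum of 0.1 and 0.2 is not exactly representable in binary, so an equality test that looks correct on paper fails at runtime:

fn main() {
    let x: f64 = 0.1 + 0.2;
    // Prints false: x is 0.30000000000000004, not exactly 0.3.
    println!("{}", x == 0.3);
}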
Comparing any value that is not a string, integer, long, or Boolean to another is never a good plan, especially if the value is returned by some form of calculation. We also cannot simply use continue on the time loop, so how can we solve this problem?

If you recall, we're using _ instead of a named parameter for the enumerator of the loop. This value is always an integer, so if we replace _ with a variable name, we can use % 5 to perform the calculation, and the code becomes:

'time_loop: for (time_enum, time_value) in my_time.iter().enumerate() {
    'data_loop: for (_, value) in my_array.iter().enumerate() {
        if *value < 1.5 {
            continue 'data_loop;
        }
        if time_enum % 5 == 0 {
            continue 'time_loop;
        }
        println!("Data point = {} at time {}s", *value, *time_value);
    }
}

The next problem is that the output isn't correct. The code gives the following:

Data point = 1.7 at time 0.4s
Data point = 1.9 at time 0.4s
Data point = 1.6 at time 0.4s
Data point = 1.5 at time 0.4s
Data point = 1.7 at time 0.6s
Data point = 1.9 at time 0.6s
Data point = 1.6 at time 0.6s
Data point = 1.5 at time 0.6s

The data points are correct, but the time is way out and continually repeats. We still need the continue statement for the data point step, but the time step is incorrect. There are a couple of solutions, but possibly the simplest is to store the data and the time in new vectors and then display the data at the end. The following code gets closer to what is required:

fn main() {
    let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1,
                        1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7];
    let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0,
                       2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8];
    let mut my_new_array = vec![];
    let mut my_new_time = vec![];
    'time_loop: for (t, _) in my_time.iter().enumerate() {
        'data_loop: for (v, value) in my_array.iter().enumerate() {
            if *value < 1.5 {
                continue 'data_loop;
            } else {
                if t % 5 != 0 {
                    my_new_array.push(*value);
                    my_new_time.push(my_time[v]);
                }
            }
            if v == my_array.len() {
                break;
            }
        }
    }
    for (m, my_data) in my_new_array.iter().enumerate() {
        println!("Data = {} at time {}", *my_data, my_new_time[m]);
    }
}

We will now get the following output:

Data = 1.7 at time 1.4
Data = 1.9 at time 1.6
Data = 1.6 at time 2.2
Data = 1.5 at time 3.4
Data = 1.7 at time 1.4

Yes, we now have the correct data, but the time starts again. We're close, but it's not right yet. We aren't continuing the time_loop loop, and we also need to introduce a break statement. To trigger the break, we create a new variable called done. When v, the enumerator for my_array, reaches the last index of the vector (the number of elements minus one), we change done from false to true. This is then tested outside of data_loop: if done is true, we break out of the loop. The final version of the code is as follows:

fn main() {
    let my_array = vec![0.6f32, 0.4, 0.2, 0.8, 1.3, 1.1, 1.7, 1.9, 1.3, 0.1,
                        1.6, 0.6, 0.9, 1.1, 1.31, 1.49, 1.5, 0.7];
    let my_time = vec![0.2f32, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0,
                       2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6];
    let mut my_new_array = vec![];
    let mut my_new_time = vec![];
    let mut done = false;
    'time_loop: for (t, _) in my_time.iter().enumerate() {
        'data_loop: for (v, value) in my_array.iter().enumerate() {
            if v == my_array.len() - 1 {
                done = true;
            }
            if *value < 1.5 {
                continue 'data_loop;
            } else {
                if t % 5 != 0 {
                    my_new_array.push(*value);
                    my_new_time.push(my_time[v]);
                } else {
                    continue 'time_loop;
                }
            }
        }
        if done { break; }
    }
    for (m, my_data) in my_new_array.iter().enumerate() {
        println!("Data = {} at time {}", *my_data, my_new_time[m]);
    }
}

Our final output from the code is this:

Recursive functions
The final form of loop to consider is known as a recursive function. This is a function that calls itself until a condition is met. In pseudocode, the function looks like this:

float my_function(i32:a) {
    // do something with a
    if (a != 32) {
        my_function(a);
    } else {
        return a;
    }
}

An actual implementation of a recursive function would look like this:

fn recurse(n: i32) {
    let v = match n % 2 {
        0 => n / 2,
        _ => 3 * n + 1
    };
    println!("{}", v);
    if v != 1 { recurse(v) }
}

fn main() {
    recurse(25)
}

The idea of a recursive function is very simple, but we need to consider two parts of this code. The first is the let line in the recurse function and what it means:

let v = match n % 2 {
    0 => n / 2,
    _ => 3 * n + 1
};

Another way of writing this is as follows:

let mut v = 0i32;
if n % 2 == 0 {
    v = n / 2;
} else {
    v = 3 * n + 1;
}

In C#, this equates to the following:

var v = n % 2 == 0 ? n / 2 : 3 * n + 1;

The second part is that the semicolon is not used everywhere. Consider the following example:

fn main() {
    recurse(25)
}

What is the difference between having and not having a semicolon? In Rust, a block between braces is itself an expression: the final line without a semicolon becomes the value of the block, while a semicolon closes the statement off and discards its value. Let's see what that means. Consider the following code as an example:

fn main() {
    let x = 5u32;
    let y = {
        let x_squared = x * x;
        let x_cube = x_squared * x;
        x_cube + x_squared + x
    };
    let z = {
        2 * x;
    };
    println!("x is {:?}", x);
    println!("y is {:?}", y);
    println!("z is {:?}", z);
}

We have two different uses of the semicolon. Let's look at the let y line first:

let y = {
    let x_squared = x * x;
    let x_cube = x_squared * x;
    x_cube + x_squared + x // no semicolon
};

This code does the following:
The code within the braces is processed.
The final line, without the semicolon, is assigned to y.
Essentially, this is considered an inline function that returns the value of the line without the semicolon into the variable.

The second line to consider is the one for z:

let z = {
    2 * x;
};

Again, the code within the braces is evaluated. In this case, the line ends with a semicolon, so the result is suppressed and () is assigned to z. When it is executed, we will get the following results:

In the code example, the line within fn main calling recurse gives the same result with or without the semicolon.

Summary
In this article, we've covered the different types of loops that are available within Rust, gained an understanding of when to use a semicolon and what it means to omit it, and considered enumeration and iteration over vectors and arrays and how to handle the data held within them.
Asynchronous Programming in F#

Packt
12 Oct 2016
15 min read
This article by Alfonso Garcia Caro Nunez and Suhaib Fahad, authors of the book Mastering F#, sheds light on how writing applications that are non-blocking or that react to events is becoming increasingly important in the cloud world we live in. A modern application needs to carry out rich user interaction, communicate with web services, react to notifications, and so on; the execution of reactive applications is controlled by events. Asynchronous programming is characterized by many simultaneously pending reactions to internal or external events. These reactions may or may not be processed in parallel.

(For more resources related to this topic, see here.)

In .NET, both C# and F# provide an asynchronous programming experience through keywords and syntax. In this article, we will go through the asynchronous programming model in F#, with a bit of cross-referencing or comparison drawn with the C# world. In this article, you will learn about asynchronous workflows in F#.

Asynchronous workflows in F#
Asynchronous workflows are computation expressions that are set up to run asynchronously. This means that the system runs without blocking the current computation thread when a sleep, I/O, or other asynchronous operation is performed.

You may be wondering why we need asynchronous programming and why we can't just use the threading concepts we have used for so long. The problem with threads is that the operation occupies the thread for the entire time that something happens or a computation is done. On the other hand, asynchronous programming occupies a thread only when one is required; otherwise, it runs as normal code. There is also a lot of marshalling and unmarshalling (junk) code that we have to write to work around the issues we face when dealing directly with threads. Thus, the asynchronous model allows code to execute efficiently, whether we are downloading a page 50 or 100 times using a single thread or performing I/O operations over the network with many incoming requests from the other endpoint.

The Async module in F# exposes a list of functions for creating and using asynchronous workflows. The asynchronous pattern allows writing code that looks like it is written for a single-threaded program, but internally it uses async blocks to execute. Various triggering functions provide a wide variety of ways to run an asynchronous workflow: on a background thread, as a .NET framework task object, or as a computation running in the current thread itself.

In this article, we will use the example of downloading the content of a web page and working with the data:

open System.IO
open System.Net

let downloadPage (url: string) = async {
    let req = HttpWebRequest.Create(url)
    use! resp = req.AsyncGetResponse()
    use respStream = resp.GetResponseStream()
    use sr = new StreamReader(respStream)
    return sr.ReadToEnd()
}

downloadPage("https://www.google.com")
|> Async.RunSynchronously

The preceding function does the following:
The async expression, { … }, generates an object of type Async<string>
These values are not actual results; rather, they are specifications of tasks that need to run and return a string
Async.RunSynchronously takes this object and runs it synchronously

We just wrote a simple function with asynchronous workflows with relative ease, and we can reason about the code, which is much better than using code with Begin/End routines. One of the most important points here is that the code is never blocked during the execution of the asynchronous workflow. This means that we can, in principle, have thousands of outstanding web requests, the limit being the number supported by the machine, not the number of threads that host them.

Using let!
In asynchronous workflows, we use the let! binding to enable execution to continue on other computations or threads while the computation is being performed. After the execution completes, the rest of the asynchronous workflow is executed, thus simulating sequential execution in an asynchronous way. In addition to let!, we can also use use! to perform asynchronous bindings; basically, with use!, the object is disposed of when it loses the current scope. In our previous example, we used use! to get the HttpWebResponse object. We could also do it as follows:

let! resp = req.AsyncGetResponse()
// process response

We use let! to start an operation and bind the result to a value; do! is used when the return of the async expression is a unit:

do! Async.Sleep(1000)

Understanding asynchronous workflows
As explained earlier, asynchronous workflows are nothing but computation expressions with asynchronous patterns, which implement the Bind/Return pattern for their inner workings. This means that the let! expression is translated into a call to async.Bind; the async.Bind and async.Return functions are defined in the Async module of the F# library. This is compiler functionality that translates the let! expression into a computation workflow, and you, as a developer, will never be required to understand it in detail. The purpose of explaining this piece is to understand the internal workings of an asynchronous workflow, which is nothing but a computation expression. The following listing shows the translated version of the downloadPage function we defined earlier:

async.Delay(fun () ->
    let req = HttpWebRequest.Create(url)
    async.Bind(req.AsyncGetResponse(), fun resp ->
        async.Using(resp, fun resp ->
            let respStream = resp.GetResponseStream()
            async.Using(new StreamReader(respStream), fun sr ->
                async.Return(sr.ReadToEnd())))))

The following things happen in the workflow:
The Delay function has a deferred lambda that executes later.
The body of the lambda creates an HttpWebRequest, which is forwarded in the variable req to the next segment of the workflow.
The AsyncGetResponse function is called, and a workflow is generated that knows how to execute the response and be invoked when the operation completes. This happens internally through the BeginGetResponse and EndGetResponse functions already present in the HttpWebRequest class; AsyncGetResponse is just a wrapper extension present in the F# Async module.
The Using function then creates a closure to dispose of the object implementing the IDisposable interface once the workflow completes.

Async module
The Async module has a list of functions that allow writing or consuming asynchronous code. We will go through each function in detail, with an example, to understand it better.

Async.AsBeginEnd
It is very useful to expose F# workflow functionality outside F#, say if we want to consume the APIs from C#. The Async.AsBeginEnd method gives us the possibility of exposing asynchronous workflows as a triple of methods (Begin/End/Cancel) following the .NET Asynchronous Programming Model (APM).
Based on our downloadPage function, we can define the Begin, End, and Cancel functions as follows:

type Downloader() =
    let beginMethod, endMethod, cancelMethod = Async.AsBeginEnd downloadPage
    member this.BeginDownload(url, callback, state : obj) = beginMethod(url, callback, state)
    member this.EndDownload(ar) = endMethod ar
    member this.CancelDownload(ar) = cancelMethod(ar)

Async.AwaitEvent

The Async.AwaitEvent method creates an asynchronous computation that waits for a single invocation of a .NET framework event by adding a handler to the event.

type MyEvent(v : string) =
    inherit EventArgs()
    member this.Value = v

let testAwaitEvent (evt : IEvent<MyEvent>) = async {
    printfn "Before waiting"
    let! r = Async.AwaitEvent evt
    printfn "After waiting: %O" r.Value
    do! Async.Sleep(1000)
    return ()
}

let runAwaitEventTest () =
    let evt = new Event<Handler<MyEvent>, _>()
    Async.Start <| testAwaitEvent evt.Publish
    System.Threading.Thread.Sleep(3000)
    printfn "Before raising"
    evt.Trigger(null, new MyEvent("value"))
    printfn "After raising"

> runAwaitEventTest();;
> Before waiting
> Before raising
> After raising
> After waiting : value

The testAwaitEvent function listens to the event using Async.AwaitEvent and prints the value. As Async.Start will take some time to start up the thread, we simply call Thread.Sleep to wait on the main thread. This is for example purposes only. We can think of scenarios where a button-click event is awaited and used inside an async block.

Async.AwaitIAsyncResult

Creates an asynchronous computation that waits for the given IAsyncResult to complete. IAsyncResult is the asynchronous programming model interface that allows us to write asynchronous programs. The computation returns true if the IAsyncResult issues a signal within the given timeout. The timeout parameter is optional, and its default value is -1 (Timeout.Infinite).

let testAwaitIAsyncResult (url: string) = async {
    let req = HttpWebRequest.Create(url)
    let aResp = req.BeginGetResponse(null, null)
    let! asyncResp = Async.AwaitIAsyncResult(aResp, 1000)
    if asyncResp then
        let resp = req.EndGetResponse(aResp)
        use respStream = resp.GetResponseStream()
        use sr = new StreamReader(respStream)
        return sr.ReadToEnd()
    else
        return ""
}

> Async.RunSynchronously (testAwaitIAsyncResult "https://www.google.com")

We modified the downloadPage example with AwaitIAsyncResult, which allows a bit more flexibility where we want to add timeouts as well. In the preceding example, the AwaitIAsyncResult handle will wait for 1000 milliseconds, and then it will execute the next steps.

Async.AwaitWaitHandle

Creates a computation that waits on a WaitHandle—wait handles are a mechanism to control the execution of threads. The following is an example with ManualResetEvent:

let testAwaitWaitHandle waitHandle = async {
    printfn "Before waiting"
    let! r = Async.AwaitWaitHandle waitHandle
    printfn "After waiting"
}

let runTestAwaitWaitHandle () =
    let event = new System.Threading.ManualResetEvent(false)
    Async.Start <| testAwaitWaitHandle event
    System.Threading.Thread.Sleep(3000)
    printfn "Before raising"
    event.Set() |> ignore
    printfn "After raising"

The preceding example uses ManualResetEvent to show how to use AwaitWaitHandle, which is very similar to the event example that we saw in the previous topic.

Async.AwaitTask

Returns an asynchronous computation that waits for the given task to complete and returns its result. This helps in consuming C# APIs that expose task-based asynchronous operations.

let downloadPageAsTask (url: string) =
    async {
        let req = HttpWebRequest.Create(url)
        use! resp = req.AsyncGetResponse()
        use respStream = resp.GetResponseStream()
        use sr = new StreamReader(respStream)
        return sr.ReadToEnd()
    } |> Async.StartAsTask

let testAwaitTask (t: Task<string>) = async {
    let! r = Async.AwaitTask t
    return r
}

> downloadPageAsTask "https://www.google.com" |> testAwaitTask |> Async.RunSynchronously;;

The preceding function also downloads the web page as HTML content, but it starts the operation as a .NET task object.

Async.FromBeginEnd

The FromBeginEnd method acts as an adapter for the asynchronous workflow interface by wrapping the provided Begin/End methods. Thus, it allows a large number of existing components that support an asynchronous mode of work to be used. The IAsyncResult interface exposes the functions as a Begin/End pattern for asynchronous programming. We will look at the same download page example using FromBeginEnd:

let downloadPageBeginEnd (url: string) = async {
    let req = HttpWebRequest.Create(url)
    use! resp = Async.FromBeginEnd(req.BeginGetResponse, req.EndGetResponse)
    use respStream = resp.GetResponseStream()
    use sr = new StreamReader(respStream)
    return sr.ReadToEnd()
}

The function accepts two parameters and automatically identifies the return type; we use BeginGetResponse and EndGetResponse as our functions to call. Internally, Async.FromBeginEnd delegates the asynchronous operation and gets back the handle once EndGetResponse is called.

Async.FromContinuations

Creates an asynchronous computation that captures the current success, exception, and cancellation continuations. To understand these three operations, let's create a sleep function similar to Async.Sleep using a timer:

let sleep t =
    Async.FromContinuations(fun (cont, erFun, _) ->
        let rec timer = new Timer(TimerCallback(callback))
        and callback state =
            timer.Dispose()
            cont(())
        timer.Change(t, Timeout.Infinite) |> ignore
    )

let testSleep = async {
    printfn "Before"
    do! sleep 5000
    printfn "After 5000 msecs"
}

Async.RunSynchronously testSleep

The sleep function takes an integer and returns a unit; it uses Async.FromContinuations to allow the flow of the program to continue when a timer event is raised. It does so by calling the cont(()) function, which is a continuation that allows the next step in the asynchronous flow to execute. If there is any error, we can call erFun to throw the exception, and it will be handled from the place where we call this function. Using the FromContinuations function helps us wrap and expose functionality as async, which can be used inside asynchronous workflows. It also helps to control the execution of the program, with cancellation or throwing errors, using simple APIs.

Async.Start

Starts the asynchronous computation in the thread pool. It accepts an Async<unit> computation to start. The downloadPage function can be started as follows:

let asyncDownloadPage(url) = async {
    let! result = downloadPage(url)
    printfn "%s" result
}

asyncDownloadPage "http://www.google.com" |> Async.Start

We wrap the function in another async function that returns Async<unit> so that it can be called by Async.Start.

Async.StartChild

Starts a child computation within an asynchronous workflow. This allows multiple asynchronous computations to be executed simultaneously, as follows:

let subTask v = async {
    printfn "Task %d started" v
    Thread.Sleep (v * 1000)
    printfn "Task %d finished" v
    return v
}

let mainTask = async {
    printfn "Main task started"
    let! childTask1 = Async.StartChild (subTask 1)
    let! childTask2 = Async.StartChild (subTask 5)
    printfn "Subtasks started"
    let! child1Result = childTask1
    printfn "Subtask1 result: %d" child1Result
    let! child2Result = childTask2
    printfn "Subtask2 result: %d" child2Result
    printfn "Subtasks completed"
    return ()
}

Async.RunSynchronously mainTask

Async.StartAsTask

Executes a computation in the thread pool and returns a task that will be completed in the corresponding state once the computation terminates. We can use the same example of starting the downloadPage function as a task:

let downloadPageAsTask (url: string) =
    async {
        let req = HttpWebRequest.Create(url)
        use! resp = req.AsyncGetResponse()
        use respStream = resp.GetResponseStream()
        use sr = new StreamReader(respStream)
        return sr.ReadToEnd()
    } |> Async.StartAsTask

let task = downloadPageAsTask("http://www.google.com")
printfn "Do some work"
task.Wait()
printfn "done"

Async.StartChildAsTask

Creates an asynchronous computation from within an asynchronous computation, which starts the given computation as a task:

let testAwaitTask = async {
    printfn "Starting"
    let! child = Async.StartChildAsTask <| async {
        printfn "Child started"
        Thread.Sleep(5000)
        printfn "Child finished"
        return 100
    }
    printfn "Waiting for the child task"
    let! result = Async.AwaitTask child
    printfn "Child result %d" result
}

Async.StartImmediate

Runs an asynchronous computation, starting immediately on the current operating system thread. This is very similar to the Async.Start function we saw earlier:

let asyncDownloadPage(url) = async {
    let! result = downloadPage(url)
    printfn "%s" result
}

asyncDownloadPage "http://www.google.com" |> Async.StartImmediate

Async.SwitchToNewThread

Creates an asynchronous computation that creates a new thread and runs its continuation in it:

let asyncDownloadPage(url) = async {
    do! Async.SwitchToNewThread()
    let! result = downloadPage(url)
    printfn "%s" result
}

asyncDownloadPage "http://www.google.com" |> Async.Start

Async.SwitchToThreadPool

Creates an asynchronous computation that queues a work item that runs its continuation, as follows:

let asyncDownloadPage(url) = async {
    do! Async.SwitchToNewThread()
    let! result = downloadPage(url)
    do! Async.SwitchToThreadPool()
    printfn "%s" result
}

asyncDownloadPage "http://www.google.com" |> Async.Start

Async.SwitchToContext

Creates an asynchronous computation that runs its continuation in the Post method of the synchronization context. Let's assume that we set the text from the downloadPage function on a UI textbox; we would do it as follows:

let syncContext = System.Threading.SynchronizationContext()

let asyncDownloadPage(url) = async {
    do! Async.SwitchToContext(syncContext)
    let! result = downloadPage(url)
    textbox.Text <- result
}

asyncDownloadPage "http://www.google.com" |> Async.Start

Note that in console applications, the context will be null.

Async.Parallel

The Parallel function allows you to execute individual asynchronous computations queued in the thread pool, using the fork/join pattern. We will use the same example of downloading HTML content, this time in a parallel way:

let parallel_download() =
    let sites = ["http://www.bing.com";
                 "http://www.google.com";
                 "http://www.yahoo.com";
                 "http://www.search.com"]
    let htmlOfSites =
        Async.Parallel [for site in sites -> downloadPage site]
        |> Async.RunSynchronously
    printfn "%A" htmlOfSites
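Because Async.Parallel yields a single Async<'T[]> for the whole batch, whatever Async.RunSynchronously returns can be post-processed like an ordinary array. Here is a small sketch of that (assuming the downloadPage function defined earlier is in scope; pageLengths is invented for illustration):

// A minimal sketch, not from the book's listing: fork the downloads,
// join them into one string[], then post-process the results.
let pageLengths sites =
    sites
    |> List.map downloadPage       // one Async<string> per site
    |> Async.Parallel              // fork: combine into a single Async<string[]>
    |> Async.RunSynchronously      // join: block until every download finishes
    |> Array.map String.length     // e.g. compute each page's length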
The parallel_download example shows the essence of parallel I/O computation:

The async function, { … }, in the downloadPage function expresses the asynchronous computation.
These computations are then composed in parallel using the fork/join combinator.
In this sample, the composition waits synchronously for the overall result.

Async.OnCancel

Adds a cancellation handler to an asynchronous computation. It returns an asynchronous computation that yields a disposable object; the handler stays registered until that object is disposed of, which is why it is bound with use! below.

// This is a simulated cancellable computation. It checks the token source
// to see whether the cancel signal was received.
let computation (tokenSource:System.Threading.CancellationTokenSource) =
    async {
        use! cancelHandler = Async.OnCancel(fun () -> printfn "Canceling operation.")
        // Async.Sleep checks for cancellation at the end of the sleep interval,
        // so loop over many short sleep intervals instead of sleeping
        // for a long time.
        while true do
            do! Async.Sleep(100)
    }

let tokenSource1 = new System.Threading.CancellationTokenSource()
let tokenSource2 = new System.Threading.CancellationTokenSource()

Async.Start(computation tokenSource1, tokenSource1.Token)
Async.Start(computation tokenSource2, tokenSource2.Token)
printfn "Started computations."
System.Threading.Thread.Sleep(1000)
printfn "Sending cancellation signal."
tokenSource1.Cancel()
tokenSource2.Cancel()

The preceding example uses the Async.OnCancel method to catch and interrupt the process when the CancellationTokenSource is cancelled.

Summary

In this article, we went through detailed explanations of the different semantics of asynchronous programming with F#, as used in asynchronous workflows. We saw a number of functions from the Async module.

Resources for Article:

Further resources on this subject:

Creating an F# Project [article]
Asynchronous Control Flow Patterns with ES2015 and beyond [article]
Responsive Applications with Asynchronous Programming [article]

Reactive Python - Asynchronous programming to the rescue, Part 2

Xavier Bruhiere
10 Oct 2016
5 min read
This two-part series explores asynchronous programming with Python using Asyncio. In Part 1 of this series, we started by building a project that shows how you can use Reactive Python in asynchronous programming. Let's pick it back up here by exploring peer-to-peer communication, then touching on service discovery, before examining the streaming machine-to-machine concept.

Peer-to-peer communication

So far we've established a websocket connection to process clock events asynchronously. Now that one pin swings between 1s and 0s, let's wire a buzzer and pretend it buzzes on high states (1) and remains silent on low ones (0). We can rephrase that in Python, like so:

# filename: sketches.py

import factory

class Buzzer(factory.FactoryLoop):
    """Buzz on light changes."""

    def setup(self, sound):
        # customize buzz sound
        self.sound = sound

    @factory.reactive
    async def loop(self, channel, signal):
        """Buzzing."""
        behavior = self.sound if signal == '1' else '...'
        self.out('signal {} received -> {}'.format(signal, behavior))
        return behavior

So how do we make them communicate? Since they share a common parent class, we implement a stream method to send arbitrary data and acknowledge reception with arbitrary data as well. To sum up, we want IOPin to use this API:

class IOPin(factory.FactoryLoop):
    # [ ... ]

    @protocol.reactive
    async def loop(self, channel, msg):
        # [ ... ]
        await self.stream('buzzer', bits_stream)
        return 'acknowledged'

Service discovery

The first challenge to solve is service discovery. We need to target specific nodes within a fleet of reactive workers. This topic, however, goes beyond the scope of this post series. The shortcut below will do the job (that is, hardcode the nodes we will start) while keeping us focused on reactive messaging.

# -*- coding: utf-8 -*-
# vim_fenc=utf-8
#
# filename: mesh.py

"""Provide nodes network knowledge."""

import websockets

class Node(object):
    def __init__(self, name, socket, port):
        print('[ mesh ] registering new node: {}'.format(name))
        self.name = name
        self._socket = socket
        self._port = port

    def uri(self, path):
        return 'ws://{socket}:{port}/{path}'.format(socket=self._socket, port=self._port, path=path)

    def connection(self, path=''):
        # instantiate the same connection as the `clock` method
        return websockets.connect(self.uri(path))

# TODO service discovery
def grid():
    """Discover and build nodes network."""
    # of course a proper service discovery should be used here
    # see Consul or ZooKeeper, for example
    # note: clock is not a server so it doesn't need a port
    return [
        Node('clock', 'localhost', None),
        Node('blink', 'localhost', 8765),
        Node('buzzer', 'localhost', 8765 + 1)
    ]

Streaming machine-to-machine chat

Let's provide FactoryLoop with the knowledge of the grid and implement an asynchronous communication channel.

# filename: factory.py (continued)

import mesh

class FactoryLoop(object):

    def __init__(self, *args, **kwargs):
        # now every instance will know about the other ones
        self.grid = mesh.grid()
        # ...
    def node(self, name):
        """Search for the given node in the grid."""
        return next(filter(lambda x: x.name == name, self.grid))

    async def stream(self, target, data, channel):
        self.out('starting to stream message to {}'.format(target))
        # use the node websocket connection defined in mesh.py
        # the method is exactly the same as the clock
        async with self.node(target).connection(channel) as ws:
            for partial in data:
                self.out('> sending payload: {}'.format(partial))
                # websockets requires bytes or strings
                await ws.send(str(partial))
                self.out('< {}'.format(await ws.recv()))

We added a few debugging lines to better understand how the data flows through the network. Every implementation of the FactoryLoop can both react to events and communicate with the other nodes it is aware of.

Wrapping up

Time to update arduino.py so we can run our cluster of three reactive workers:

@click.command()
# [ ... ]
def main(sketch, **flags):
    # [ ... ]
    elif sketch == 'buzzer':
        sketchs.Buzzer(sound='buzz buzz buzz').run(flags['socket'], flags['port'])

Launch three terminals or use a tool such as foreman to spawn multiple processes. Either way, keep in mind that you will need to track the scripts' output.

$ # start IOPin and Buzzer on the same ports we hardcoded in mesh.py
$ ./arduino.py buzzer --port 8766
$ ./arduino.py iopin --port 8765
$ # now that they listen, trigger actions with the clock (targeting IOPin port)
$ ./arduino.py clock --port 8765
[ ... ]
$ # Profit !

We just saw one worker reacting to a clock and another reacting to randomly generated events. The websocket protocol allowed us to exchange streaming data and receive arbitrary responses, unlocking sophisticated fleet orchestration. While we limited this example to two nodes, a powerful service discovery mechanism could bring to life a distributed network of microservices. By completing this post series, you should now have a better understanding of how to use Python with Asyncio for asynchronous programming.

About the author

Xavier Bruhiere is a lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high-intensity sports.

Basics of Classes and Objects

Packt
06 Oct 2016
11 min read
In this article by Steven Lott, the author of the book Modern Python Cookbook, we will see how to use a class to encapsulate data plus processing.

(For more resources related to this topic, see here.)

Introduction

The point of computing is to process data. Even when building something like an interactive game, the game state and the player's actions are the data, and the processing computes the next game state and the display update. Data plus processing is ubiquitous.

Some games can have a relatively complex internal state. When we think of console games with multiple players and complex graphics, there are complex, real-time state changes. On the other hand, when we think of a very simple casino game like Craps, the game state is very simple. There may be no point established, or one of the numbers 4, 5, 6, 8, 9, 10 may be the established point. The transitions are relatively simple, and are often denoted by moving markers and chips around on the casino table. The data includes the current state, player actions, and rolls of the dice. The processing is the rules of the game.

A game like Blackjack has a somewhat more complex internal state change as each card is accepted. In games where the hands can be split, the state of play can become quite complex. The data includes the current game state, the player's commands, and the cards drawn from the deck. Processing is defined by the rules of the game as modified by any house rules.

In the case of Craps, the player may place bets. Interestingly, the player's input has no effect on the game state. The internal state of the game object is determined entirely by the next throw of the dice. This leads to a class design that's relatively easy to visualize.

Using a class to encapsulate data plus processing

The essential idea of computing is to process data. This is exemplified when we write functions that process data. Often, we'd like to have a number of closely related functions that work with a common data structure. This concept is the heart of object-oriented programming. A class definition will contain a number of methods that will control the internal state of an object. The unifying concept behind a class definition is often captured as a summary of the responsibilities allocated to the class. How can we do this effectively? What's a good way to design a class?

Getting Ready

Let's look at a simple, stateful object—a pair of dice. The context for this would be an application which simulates the casino game of Craps. The goal is to use simulation of results to help invent a better playing strategy. This will save us from losing real money while we try to beat the house edge.

There's an important distinction between the class definition and an instance of the class, called an object. We call this idea – as a whole – object-oriented programming. Our focus is on writing class definitions. Our overall application will create instances of the classes. The behavior that emerges from the collaboration of the instances is the overall goal of the design process. Most of the design effort goes into class definitions. Because of this, the name object-oriented programming can be misleading.

The idea of emergent behavior is an essential ingredient in object-oriented programming. We don't specify every behavior of a program. Instead, we decompose the program into objects, and define each object's state and behavior via the object's class. The programming decomposes into class definitions based on their responsibilities and collaborations.
An object should be viewed as a thing—a noun. The behavior of the class should be viewed as verbs. This gives us a hint as to how we can proceed with designing classes that work effectively. Object-oriented design is often easiest to understand when it relates to tangible real-world things. It's often easier to write software to simulate a playing card than to create software that implements an Abstract Data Type (ADT).

For this example, we'll simulate the rolling of dice. For some games – like the casino game of Craps – two dice are used. We'll define a class which models the pair of dice. To be sure that the example is tangible, we'll model the pair of dice in the context of simulating a casino game.

How to do it...

Write down simple sentences that describe what an instance of the class does. We can call these the problem statements. It's essential to focus on short sentences, and emphasize the nouns and verbs.

The game of Craps has two standard dice.
Each die has six faces with point values from 1 to 6.
Dice are rolled by a player.
The total of the dice changes the state of the Craps game. However, those rules are separate from the dice.
If the two dice match, the number was rolled the hard way. If the two dice do not match, the number was rolled the easy way. Some bets depend on this hard vs easy distinction.

Identify all of the nouns in the sentences. Nouns may identify different classes of objects. These are collaborators. Examples include player and game. Nouns may also identify attributes of objects in question. Examples include face and point value.

Identify all the verbs in the sentences. Verbs are generally methods of the class in question. Examples include rolled and match. Sometimes, they are methods of other classes. Examples include change the state, which applies to the Craps game.

Identify any adjectives. Adjectives are words or phrases which clarify a noun. In many cases, some adjectives will clearly be properties of an object. In other cases, the adjectives will describe relationships among objects. In our example, a phrase like the total of the dice is an example of a prepositional phrase taking the role of an adjective. The phrase the total of modifies the noun the dice. The total is a property of the pair of dice.

Start writing the class with the class statement:

class Dice:

Initialize the object's attributes in the __init__ method:

    def __init__(self):
        self.faces = None

We'll model the internal state of the dice with the self.faces attribute. The self variable is required to be sure that we're referencing an attribute of a given instance of a class. The object is identified by the value of the instance variable, self. We could put some other properties here as well. The alternative is to implement the properties as separate methods. These details of the design decision are the subject of the recipe on using properties for lazy attributes.

Define the object's methods based on the various verbs. In our case, we have several methods that must be defined. Here's how we can implement "dice are rolled by a player" (this relies on an import random at the top of the module):

    def roll(self):
        self.faces = (random.randint(1,6), random.randint(1,6))

We've updated the internal state of the dice by setting the self.faces attribute. Again, the self variable is essential for identifying the object to be updated. Note that this method mutates the internal state of the object. We've elected not to return a value. This makes our approach somewhat like the approach of Python's built-in collection classes. Any method which mutates the object does not return a value.
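To see this convention in action before moving on, here's a tiny aside (not part of the recipe itself) showing that Python's built-in mutators behave the same way as our roll() method:

values = []
result = values.append(7)   # append mutates the list in place...
assert result is None       # ...and returns None, just like roll()
assert values == [7]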
The next method helps implement "the total of the dice changes the state of the Craps game". The game is a separate object, but this method provides a total that fits the sentence:

    def total(self):
        return sum(self.faces)

These two methods help answer the hard way and easy way questions:

    def hardway(self):
        return self.faces[0] == self.faces[1]

    def easyway(self):
        return self.faces[0] != self.faces[1]

It's rare in a casino game to have a rule that has a simple logical inverse. It's more common to have a rare third alternative that has a remarkably bad payoff rule. In this case, we could have defined easyway as return not self.hardway().

Here's an example of using the class. First, we'll seed the random number generator with a fixed value, so that we can get a fixed sequence of results. This is a way to create a unit test for this class:

>>> import random
>>> random.seed(1)

We'll create a Dice object, d1. We can then set its state with the roll() method. We'll then look at the total() method to see what was rolled. We'll examine the state by looking at the faces attribute:

>>> from ch06_r01 import Dice
>>> d1 = Dice()
>>> d1.roll()
>>> d1.total()
7
>>> d1.faces
(2, 5)

We'll create a second Dice object, d2. We can then set its state with the roll() method. We'll look at the result of the total() method, as well as the hardway() method. We'll examine the state by looking at the faces attribute:

>>> d2 = Dice()
>>> d2.roll()
>>> d2.total()
4
>>> d2.hardway()
False
>>> d2.faces
(1, 3)

Since the two objects are independent instances of the Dice class, a change to d2 has no effect on d1:

>>> d1.total()
7

How it works...

The core idea here is to use ordinary rules of grammar – nouns, verbs, and adjectives – as a way to identify basic features of a class. Nouns represent things. A good descriptive sentence should focus on tangible, real-world things more than ideas or abstractions. In our example, dice are real things. We try to avoid using abstract terms like randomizers or event generators. It's easier to describe the tangible features of real things, and then locate an abstract implementation that offers some of the tangible features.

The idea of rolling the dice is an example of a physical action that we can model with a method definition. Clearly, this action changes the state of the object. In rare cases – one time in 36 – the next state will happen to match the previous state.

Adjectives often hold the potential for confusion. There are several cases, such as:

Some adjectives like first, last, least, most, next, previous, and so on will have a simple interpretation. These can have a lazy implementation as a method or an eager implementation as an attribute value.
Some adjectives are a more complex phrase like "the total of the dice". This is an adjective phrase built from a noun (total) and a preposition (of). This, too, can be seen as a method or an attribute.
Some adjectives involve nouns that appear elsewhere in our software. We might have had a phrase like "the state of the Craps game", where "state of" modifies another object, the Craps game. This is clearly only tangentially related to the dice themselves. This may reflect a relationship between "dice" and "game". We might add a sentence to the problem statement like "The dice are part of the game". This can help clarify the presence of a relationship between game and dice. Prepositional phrases like "are part of" can always be reversed to create a statement from the other object's point of view—"The game contains dice".
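As an aside, that reversed sentence translates almost directly into code. A minimal sketch (the CrapsGame class here is invented for illustration; it is not part of the recipe):

class CrapsGame:
    """Illustrate "The game contains dice" as composition."""
    def __init__(self):
        self.dice = Dice()   # the game holds a Dice instance

    def next_roll(self):
        self.dice.roll()
        return self.dice.total()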
This can help clarify the relationships among objects. In Python, the attributes of an object are – by default – dynamic. We don't specify a fixed list of attributes. We can initialize some (or all) of the attributes in the __init__() method of a class definition. Since attributes aren't static, we have considerable flexibility in our design.

There's more...

Capturing the essential internal state, and the methods that cause state changes, is the first step in good class design. We can summarize some helpful design principles using the acronym SOLID.

Single Responsibility Principle: A class should have one clearly-defined responsibility.
Open/Closed Principle: A class should be open to extension – generally via inheritance – but closed to modification. We should design our classes so that we don't need to tweak the code to add or change features.
Liskov Substitution Principle: We need to design inheritance so that a subclass can be used in place of the superclass.
Interface Segregation Principle: When writing a problem statement, we want to be sure that collaborating classes have as few dependencies as possible. In many cases, this principle will lead us to decompose large problems into many small class definitions.
Dependency Inversion Principle: It's less than ideal for a class to depend directly on other classes. It's better if a class depends on an abstraction, and a concrete implementation class is substituted for the abstract class.

The goal is to create classes that have the proper behavior and also adhere to the design principles.

Resources for Article:

Further resources on this subject:

Python Data Structures [article]
Web scraping with Python (Part 2) [article]
How is Python code organized [article]

Reactive Python – Asynchronous programming to the rescue, Part 1

Xavier Bruhiere
05 Oct 2016
7 min read
On the Confluent website, you can find this title: Stream data changes everything. From the creators of Kafka, a real-time messaging system, this is not a surprising assertion. Yet, data streaming infrastructures have gained in popularity and many projects require the data to be processed as soon as it shows up. This contributed to the development of famous technologies like Spark Streaming, Apache Storm, and, more broadly, websockets. This last piece of software in particular brought real-time data feeds to web applications, trying to solve low-latency connections. Coupled with the asynchronous Node.js, you can build a powerful event-based reactive system.

But what about Python? Given the popularity of the language in data science, would it be possible to bring the benefits of this kind of data ingestion? As this two-part post series will show, it turns out that modern Python (Python 3.4 or later) supports asynchronous data streaming apps.

Introducing asyncio

Python 3.4 introduced the asyncio module into the standard library to provision the language with:

Asynchronous I/O, event loop, coroutines and tasks

While Python treats functions as first-class objects (meaning you can assign them to variables and pass them as arguments), most developers follow an imperative programming style. It seems on purpose:

It requires super human discipline to write readable code in callbacks and if you don't believe me look at any piece of JavaScript code. - Guido van Rossum

So asyncio is the pythonic answer to asynchronous programming. This paradigm makes a lot of sense for otherwise costly I/O operations or when we need events to trigger code.

Scenario

For fun and profit, let's build such a project. We will simulate a dummy electrical circuit composed of three components:

A clock regularly ticking
A board I/O pin randomly choosing to toggle its binary state on clock events
A buzzer buzzing when the I/O pin flips to one

This sets us up with an interesting machine-to-machine communication problem to solve. Note that the code snippets in this post make use of features like async and await introduced in Python 3.5. While it would be possible to backport to Python 3.4, I highly recommend that you follow along with the same version or newer. Anaconda or Pyenv can ease the installation process if necessary.

$ python --version
Python 3.5.1
$ pip --version
pip 8.1.2

Asynchronous websocket Client/Server

Our first step, the clock, will introduce both asyncio and websocket basics. We need a straightforward method that fires tick signals through a websocket and waits for acknowledgement:

# filename: sketch.py

async def clock(socket, port, tacks=3, delay=1):

The async keyword is syntactic sugar introduced in Python 3.5 to replace the previous @asyncio.coroutine decorator. The official PEP 492 explains it all, but the TL;DR: API quality. To simplify websocket connection plumbing, we can take advantage of the eponymous package: pip install websockets==3.5.1. It hides the protocol's complexity behind an elegant context manager.
# filename: sketch.py (continuing the clock coroutine)

    # the path "datafeed" in this uri will be a parameter available on the
    # other side, but we won't use it for this example
    uri = 'ws://{socket}:{port}/datafeed'.format(socket=socket, port=port)

    # manage the connection asynchronously
    async with websockets.connect(uri) as ws:
        for payload in range(tacks):
            print('[ clock ] > {}'.format(payload))
            # send payload and wait for acknowledgement
            await ws.send(str(payload))
            print('[ clock ] < {}'.format(await ws.recv()))
            time.sleep(delay)

The keyword await was introduced with async and replaces the old yield from to read values from asynchronous functions. Inside the context manager the connection stays open, and we can stream data to the server we contacted.

The server: IOPin

At the core of our application are entities capable of speaking to each other directly. To make things fun, we will expose the same API as Arduino sketches: a setup method that runs once at startup and a loop method called when new data is available.

# -*- coding: utf-8 -*-
# vim_fenc=utf-8
#
# filename: factory.py

import abc
import asyncio
import websockets

class FactoryLoop(object):
    """Glue components to manage the evented-loop model."""

    __metaclass__ = abc.ABCMeta

    def __init__(self, *args, **kwargs):
        # call user-defined initialization
        self.setup(*args, **kwargs)

    def out(self, text):
        print('[ {} ] {}'.format(type(self).__name__, text))

    @abc.abstractmethod
    def setup(self, *args, **kwargs):
        pass

    @abc.abstractmethod
    async def loop(self, channel, data):
        pass

    def run(self, host, port):
        try:
            server = websockets.serve(self.loop, host, port)
            self.out('serving on {}:{}'.format(host, port))
            asyncio.get_event_loop().run_until_complete(server)
            asyncio.get_event_loop().run_forever()
        except OSError:
            self.out('Cannot bind to this port! Is the server already running?')
        except KeyboardInterrupt:
            self.out('Keyboard interruption, aborting.')
            asyncio.get_event_loop().stop()
        finally:
            asyncio.get_event_loop().close()

The child objects will be required to implement setup and loop, while this class will take care of:

Initializing the sketch
Registering a websocket server based on an asynchronous callback (loop)
Telling the event loop to poll for... events

The websockets documentation states that the server callback is expected to have the signature on_connection(websocket, path). This is too low-level for our purpose. Instead, we can write a decorator to manage asyncio details, message passing, and error handling. We will only call self.loop with application-level-relevant information: the actual message and the websocket path.

# filename: factory.py

import functools

import websockets

def reactive(fn):
    @functools.wraps(fn)
    async def on_connection(klass, websocket, path):
        """Dispatch events and wrap execution."""
        klass.out('** new client connected, path={}'.format(path))
        # process messages as long as the connection is opened or
        # an error is raised
        while True:
            try:
                message = await websocket.recv()
                acknowledgement = await fn(klass, path, message)
                await websocket.send(acknowledgement or 'n/a')
            except websockets.exceptions.ConnectionClosed as e:
                klass.out('done processing messages: {}\n'.format(e))
                break
    return on_connection
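To make the decorator's contract concrete, here's a small hedged sketch (the EchoSketch class is invented for illustration; FactoryLoop and reactive are the ones defined above):

# A minimal subclass: `reactive` wraps `loop` so the websockets server can
# drive it, while the class itself only deals with (channel, msg) pairs.
class EchoSketch(factory.FactoryLoop):
    def setup(self):
        pass

    @factory.reactive
    async def loop(self, channel, msg):
        # simply acknowledge with the message we received
        return 'echo: {}'.format(msg)

# EchoSketch().run('localhost', 8765) would then serve it.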
Now we can develop a readable IOPin object:

# filename: sketch.py

import random

import factory

class IOPin(factory.FactoryLoop):
    """Set an IO pin to 0 or 1 randomly."""

    def setup(self, chance=0.5, sequence=3):
        self.chance = chance
        self.sequence = sequence

    def state(self):
        """Toggle state, sometimes."""
        return 0 if random.random() < self.chance else 1

    @factory.reactive
    async def loop(self, channel, msg):
        """Callback on new data."""
        self.out('new tick triggered on {}: {}'.format(channel, msg))
        bits_stream = [self.state() for _ in range(self.sequence)]
        self.out('toggling pin state: {}'.format(bits_stream))
        # ...
        # ... toggle pin state here
        # ...
        return 'acknowledged'

We finally need some glue to run both the clock and IOPin and test whether the latter toggles its state when the former fires new ticks. The following snippet uses a convenient library, click 6.6, to parse command-line arguments:

#! /usr/bin/env python
# -*- coding: utf-8 -*-
# vim_fenc=utf-8
#
# filename: arduino.py

import sys
import asyncio

import click

import sketchs

@click.command()
@click.argument('sketch')
@click.option('-s', '--socket', default='localhost', help='Websocket to bind to')
@click.option('-p', '--port', default=8765, help='Websocket port to bind to')
@click.option('-t', '--tacks', default=5, help='Number of clock ticks')
@click.option('-d', '--delay', default=1, help='Clock intervals')
def main(sketch, **flags):
    if sketch == 'clock':
        # delegate the asynchronous execution to the event loop
        asyncio.get_event_loop().run_until_complete(sketchs.clock(**flags))
    elif sketch == 'iopin':
        # arguments in the constructor go as is to our `setup` method
        sketchs.IOPin(chance=0.6).run(flags['socket'], flags['port'])
    else:
        print('unknown sketch, please choose clock, iopin or buzzer')
        return 1
    return 0

if __name__ == '__main__':
    sys.exit(main())

Don't forget to chmod +x the script. Start the server in a first terminal with ./arduino.py iopin. When it is listening for connections, start the clock with ./arduino.py clock and watch them communicate! Note that we used common default host and port values here so they can find each other.

We have a good start with our app, and now in Part 2 we will further explore peer-to-peer communication, service discovery, and the streaming machine-to-machine concept.

About the author

Xavier Bruhiere is a lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high intensity sports.

Reactive Python - Real-time events processing

Xavier Bruhiere
04 Oct 2016
8 min read
A recent trend in programming literature promotes functional programming as a sensible alternative to object-oriented programs for many use cases. This subject feeds many discussions and highlights how important program design is as our applications become more and more complex. Although there might be some seductive intellectual challenge here (because yeah, we love to juggle with elegant abstractions), there are also real business values:

Building sustainable, maintainable programs
Decoupling architecture components for proper teamwork
Limiting bug exposure
Better product iteration

When developers spot an interesting approach to solve a recurrent issue in our industry, they formalize it as a design pattern. Today, we will discuss a powerful member of this family: the observer pattern. We won't dive into the strict rhetorical details (sorry, not sorry). Instead, we will delve into how reactive programming can level up the quality of our work.

The scene

That was a bold statement; let's illustrate it with a real-world scenario. Say we were tasked to build a monitoring system. We need some way to collect data, analyze it, and take actions when things go unexpected. Anomaly detection is an exciting yet challenging problem. We don't want our data scientists to be bothered by infrastructure failures. And in the same spirit, we need other engineers to focus only on how to react to specific disaster scenarios.

The core of our approach consists of two components—a monitoring module firing and forgetting its discoveries on channels, and another processing brick intercepting those events with an appropriate response. The UNIX philosophy at its best: do one thing and do it well. We split the infrastructure by concerns and the workers by event types. Assuming that our team defines well-documented interfaces, this is a promising design. The rest of the article will discuss the technical implementation, but keep in mind that I/O documentation and proper estimation of processing load are also fundamental.

The strategy

Our local lab is composed of three elements:

The alert module, which we will emulate with a simple cli tool that publishes alert messages.
The actual processing unit, subscribing to events it knows how to react to.
A message broker supporting the Publish / Subscribe (or PUBSUB) pattern.

For this purpose, Redis offers a popular, efficient, and rock-solid solution. It is highly recommended, but the database isn't designed for this use case. NATS, however, presents itself as follows:

NATS acts as a central nervous system for distributed systems such as mobile devices, IoT networks, enterprise microservices and cloud native infrastructure. Unlike traditional enterprise messaging systems, NATS provides an always on 'dial-tone'.

Sounds promising! Client libraries are available for major languages, and Apcera, the company sponsoring the technology, has a solid reputation for building reliable distributed systems. Again, we won't delve into how processing actually happens, only into the orchestration of these three moving parts.

The setup

Since NATS is a message broker, we need to run a server locally (version 0.8.0 as of today). Gnatsd is the official and scalable first choice. It is written in Go, so we get performance and a drop-in binary out of the box. For fans of microservices (as I am), an official Docker image is available for pulling.
Also, for lazy ones (as I am), a demo server is already running at nats://demo.nats.io:4222. Services will use Python 3.5.1, but 2.7.10 should do the job with minimal changes. Our scenario is mostly about data analysis and system administration on the backend, and Python has a wide range of tools for both areas. So let's install the requirements:

$ pip --version
pip 8.1.1
$ pip install -e git+https://github.com/mcuadros/pynats@6851e84eb4b244d22ffae65e9fbf79bd9872a5b3#egg=pynats click==6.6  # for cli integration

That's all. We are now ready to write services.

Publishing events

Let's warm up by sending some alerts to the cloud. First, we need to connect to the NATS server:

# -*- coding: utf-8 -*-
# vim_fenc=utf-8
#
# filename: broker.py

import pynats

def nats_conn(conf):
    """Connect to nats server from environment variables.

    The point is to allow easy switching without having to change the
    code. You can read more on this approach, stolen from 12-factor apps.
    """
    # the default value comes from docker-compose
    # (https://docs.docker.com/compose/) services link behavior
    host = conf.get('__BROKER_HOST__', 'nats')
    port = conf.get('__BROKER_PORT__', 4222)
    opts = {
        'url': conf.get('url', 'nats://{host}:{port}'.format(host=host, port=port)),
        'verbose': conf.get('verbose', False)
    }
    print('connecting to broker ({opts})'.format(opts=opts))
    conn = pynats.Connection(**opts)
    conn.connect()
    return conn

This should be enough to start our client:

# -*- coding: utf-8 -*-
# vim_fenc=utf-8
#
# filename: observer.py

import os

import broker

def send(channel, msg):
    # use environment variables for configuration
    nats = broker.nats_conn(os.environ)
    nats.publish(channel, msg)
    nats.close()

And right after that, a few lines of code to shape a cli tool:

#! /usr/bin/env python
# -*- coding: utf-8 -*-
# vim_fenc=utf-8
#
# filename: __main__.py

import click

import observer

@click.command()
@click.argument('command')
@click.option('--on', default='some_event', help='messages topic name')
def main(command, on):
    if command == 'send':
        click.echo('publishing message')
        observer.send(on, 'Terminator just dropped in our space-time')

if __name__ == '__main__':
    main()

chmod +x ./__main__.py gives it execution permission, so we can test how our first bytes are doing:

$ # `click` package gives us a productive cli interface
$ ./__main__.py --help
Usage: __main__.py [OPTIONS] COMMAND

Options:
  --on TEXT  messages topic name
  --help     Show this message and exit.

$ __BROKER_HOST__="demo.nats.io" ./__main__.py send --on=click
connecting to broker ({'verbose': False, 'url': 'nats://demo.nats.io:4222'})
publishing message
...

This is indeed quite poor in feedback, but no exception means that we did connect to the server and published a message.

Reacting to events

We're done with the heavy lifting! Now that interesting events are flying through the Internet, we can catch them and actually provide business value. Don't forget the point: let the team write reactive programs without worrying about how they will be triggered. I found the following snippet to be a readable syntax for such a goal:

# filename: __main__.py

import observer

@observer.On('terminator_detected')
def alert_sarah_connor(msg):
    print(msg.data)

As the capitalized letter of On suggests, this is a Python class wrapping a NATS connection. It aims to call the decorated function whenever a new message goes through the given channel.
Here is a naive implementation, shamefully ignoring any reasonable error handling and safe connection termination (broker.nats_conn would be much more production-ready as a context manager, but hey, we do things that don't scale, move fast, and break things):

# filename: observer.py

class On(object):

    def __init__(self, event_name, **kwargs):
        self._count = kwargs.pop('count', None)
        self._event = event_name
        self._opts = kwargs or os.environ

    def __call__(self, fn):
        nats = broker.nats_conn(self._opts)
        subscription = nats.subscribe(self._event, fn)

        def inner():
            print('waiting for incoming messages')
            nats.wait(self._count)
            # we are done
            nats.unsubscribe(subscription)
            return nats.close()
        return inner

Instill some life into this file from the __main__.py:

# filename: __main__.py

@click.command()
@click.argument('command')
@click.option('--on', default='some_event', help='messages topic name')
def main(command, on):
    if command == 'send':
        click.echo('publishing message')
        observer.send(on, 'bad robot detected')
    elif command == 'listen':
        try:
            alert_sarah_connor()
        except KeyboardInterrupt:
            click.echo('caught CTRL-C, cleaning after ourselves...')

Your linter might complain about the injection of the msg argument in alert_sarah_connor, but no offense, it should just work (tm):

$ # In a first terminal, listen to messages
$ __BROKER_HOST__="demo.nats.io" ./__main__.py listen
connecting to broker ({'url': 'nats://demo.nats.io:4222', 'verbose': False})
waiting for incoming messages

$ # And fire up alerts in a second terminal
$ __BROKER_HOST__="demo.nats.io" ./__main__.py send --on='terminator_detected'

The data appears in the first terminal. Celebrate!

Conclusion

Reactive programming implemented with the Publish/Subscribe pattern brings a lot of benefits for event-oriented products: modular development, decoupled components, scalable distributed infrastructure, and the single-responsibility principle. One should think about how data flows into the system before diving into the technical details. This kind of approach also gains traction from real-time data processing pipelines (Riemann, Spark, and Kafka). NATS performance, indeed, allows the development of ultra-low-latency architectures without too much of a deployment overhead.

We covered, in a few lines of Python, the basics of a reactive programming design, with a lot of improvement opportunities: event filtering, built-in instrumentation, and infrastructure-wide error tracing. I hope you found in this article the building blocks to develop upon!

About the author

Xavier Bruhiere is the lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high intensity sports.

Simple Slack Websocket Integrations in <10 lines of Python

Bradley Cicenas
09 Sep 2016
3 min read
If you use Slack, you've probably added a handful of integrations for your team from the ever-growing App Directory, and maybe even had an idea for your own Slack app. While the Slack API is featureful and robust, writing your own integration can be exceptionally easy. Through the Slack RTM (Real Time Messaging) API, you can write your own basic integrations in just a few lines of Python using the SlackSocket library.

Structure

Our integration will be structured with the following basic components:

Listener
Integration/bot logic
Response

The listener watches for one or more pre-defined "trigger" words, while the response posts the result of our intended task.

Basic Integration

We'll start by setting up SlackSocket with our API token:

from slacksocket import SlackSocket

slack = SlackSocket('<slack-token>', event_filter=['message'])

By default, SlackSocket will listen for all Slack events. There are a lot of different events sent via RTM, but we're only concerned with 'message' events for our integration, so we've set an event_filter for only this type. Using the SlackSocket events() generator, we'll read each 'message' event that comes in and can act on various conditions:

for e in slack.events():
    if e.event['text'] == '!hello':
        slack.send_msg('it works!', channel_name=e.event['channel'])

If our message text matches the string '!hello', we'll respond to the source channel of the event with a given message ('it works!'). At this point, we've created a complete integration that can connect to Slack as a bot user (or regular user), follow messages, and respond accordingly. Let's build something a bit more useful, like a password generator for throwaway accounts.

Expanding Functionality

For this integration command, we'll write a simple function to generate a random alphanumeric string 15 characters long:

import random
import string

def randomstr():
    chars = string.ascii_letters + string.digits
    return ''.join(random.choice(chars) for _ in range(15))

Now we're ready to provide our random string generator to the rest of the team using the same chat logic as before, responding to the source channel with our generated password:

for e in slack.events():
    if e.event['text'].startswith('!random'):
        slack.send_msg(randomstr(), channel_name=e.event['channel'])

Altogether:

import random
import string

from slacksocket import SlackSocket

slack = SlackSocket('<slack-token>', event_filter=['message'])

def randomstr():
    chars = string.ascii_letters + string.digits
    return ''.join(random.choice(chars) for _ in range(15))

for e in slack.events():
    if e.event['text'].startswith('!random'):
        slack.send_msg(randomstr(), channel_name=e.event['channel'])

And the results: a complete integration in 10 lines of Python. Not bad!

Beyond simplicity, SlackSocket provides a great deal of flexibility for writing apps, bots, or integrations. In the case of massive Slack groups with several thousand users, messages are buffered locally to ensure that none are missed. Dropped websocket connections are automatically re-connected as well, making it an ideal base for a chat client. The code for SlackSocket is available on GitHub, and as always, we welcome any contributions or feature requests!
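As a parting sketch of that flexibility, the same pattern extends to any trigger you like. For instance (the '!roll' trigger and reply text are invented for this example, but every call is one demonstrated above):

import random

from slacksocket import SlackSocket

slack = SlackSocket('<slack-token>', event_filter=['message'])

# Reply to '!roll' with a simulated six-sided die roll.
for e in slack.events():
    if e.event['text'].startswith('!roll'):
        slack.send_msg('You rolled a {}'.format(random.randint(1, 6)),
                       channel_name=e.event['channel'])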
About the author

Bradley Cicenas is a New York City-based infrastructure engineer with an affinity for microservices, systems design, data science, and stoops.

Go Programming Control Flow

Packt
10 Aug 2016
13 min read
In this article by Vladimir Vivien, author of the book Learning Go Programming, we explore some of the basic control flow features of the Go language. Go borrows much of its control flow syntax from its C family of languages. It supports all of the expected control structures, including if-else, switch, for-loop, and even goto. Conspicuously absent, though, are while or do-while statements. The following topics examine Go's control flow elements, some of which you may already be familiar with, and others that bring a set of functionalities not found in other languages:

The if statement
The switch statement
The type switch

(For more resources related to this topic, see here.)

The If Statement

The if statement, in Go, borrows its basic structural form from other C-like languages. The statement conditionally executes a code block when the Boolean expression that follows the if keyword evaluates to true, as illustrated in the following abbreviated program that displays information about world currencies:

import "fmt"

type Currency struct {
    Name    string
    Country string
    Number  int
}

var CAD = Currency{
    Name: "Canadian Dollar", Country: "Canada", Number: 124}

var FJD = Currency{
    Name: "Fiji Dollar", Country: "Fiji", Number: 242}

var JMD = Currency{
    Name: "Jamaican Dollar", Country: "Jamaica", Number: 388}

var USD = Currency{
    Name: "US Dollar", Country: "USA", Number: 840}

func main() {
    num0 := 242
    if num0 > 100 || num0 < 900 {
        fmt.Println("Currency: ", num0)
        printCurr(num0)
    } else {
        fmt.Println("Currency unknown")
    }

    if num1 := 388; num1 > 100 || num1 < 900 {
        fmt.Println("Currency:", num1)
        printCurr(num1)
    }
}

func printCurr(number int) {
    if CAD.Number == number {
        fmt.Printf("Found: %+v\n", CAD)
    } else if FJD.Number == number {
        fmt.Printf("Found: %+v\n", FJD)
    } else if JMD.Number == number {
        fmt.Printf("Found: %+v\n", JMD)
    } else if USD.Number == number {
        fmt.Printf("Found: %+v\n", USD)
    } else {
        fmt.Println("No currency found with number", number)
    }
}

The if statement in Go looks similar to other languages. However, it sheds a few syntactic rules while enforcing new ones. The parentheses around the test expression are not necessary. While the following if statement will compile, it is not idiomatic:

if (num0 > 100 || num0 < 900) {
    fmt.Println("Currency: ", num0)
    printCurr(num0)
}

Use instead:

if num0 > 100 || num0 < 900 {
    fmt.Println("Currency: ", num0)
    printCurr(num0)
}

The curly braces for the code block are always required. The following snippet will not compile:

if num0 > 100 || num0 < 900
    printCurr(num0)

However, this will compile:

if num0 > 100 || num0 < 900 {printCurr(num0)}

It is idiomatic, however, to write the if statement on multiple lines (no matter how simple the statement block may be). This encourages good style and clarity. The following snippet will compile with no issues:

if num0 > 100 || num0 < 900 {printCurr(num0)}

However, the preferred idiomatic layout for the statement is to use multiple lines, as follows:

if num0 > 100 || num0 < 900 {
    printCurr(num0)
}

The if statement may include an optional else block, which is executed when the expression in the if block evaluates to false. The code in the else block must be wrapped in curly braces using multiple lines, as shown in the following:

if num0 > 100 || num0 < 900 {
    fmt.Println("Currency: ", num0)
    printCurr(num0)
} else {
    fmt.Println("Currency unknown")
}

The else keyword may be immediately followed by another if statement, forming an if-else-if chain, as used in function printCurr() from the source code listed earlier.
if CAD.Number == number {
    fmt.Printf("Found: %+v\n", CAD)
} else if FJD.Number == number {
    fmt.Printf("Found: %+v\n", FJD)

The if-else-if statement chain can grow as long as needed and may be terminated by an optional else statement to express all other untested conditions. Again, this is done in the printCurr() function, which tests four conditions using the if-else-if blocks. Lastly, it includes an else statement block to catch any other untested conditions:

func printCurr(number int) {
    if CAD.Number == number {
        fmt.Printf("Found: %+v\n", CAD)
    } else if FJD.Number == number {
        fmt.Printf("Found: %+v\n", FJD)
    } else if JMD.Number == number {
        fmt.Printf("Found: %+v\n", JMD)
    } else if USD.Number == number {
        fmt.Printf("Found: %+v\n", USD)
    } else {
        fmt.Println("No currency found with number", number)
    }
}

In Go, however, the idiomatic and cleaner way to write such a deep if-else-if code block is to use an expressionless switch statement. This is covered later, in the section on switch statements.

If Statement Initialization

The if statement supports a composite syntax where the tested expression is preceded by an initialization statement. At runtime, the initialization is executed before the test expression is evaluated, as illustrated in this code snippet (from the program listed earlier):

if num1 := 388; num1 > 100 || num1 < 900 {
    fmt.Println("Currency:", num1)
    printCurr(num1)
}

The initialization statement follows normal variable declaration and initialization rules. The scope of the initialized variables is bound to the if statement block, beyond which they become unreachable. This is a commonly used idiom in Go and is supported in the other flow control constructs covered in this article.

Switch Statements

Go also supports a switch statement similar to that found in other languages such as C or Java. The switch statement in Go achieves multi-way branching by evaluating values or expressions from case clauses, as shown in the following abbreviated source code:

import "fmt"

type Curr struct {
    Currency string
    Name     string
    Country  string
    Number   int
}

var currencies = []Curr{
    Curr{"DZD", "Algerian Dinar", "Algeria", 12},
    Curr{"AUD", "Australian Dollar", "Australia", 36},
    Curr{"EUR", "Euro", "Belgium", 978},
    Curr{"CLP", "Chilean Peso", "Chile", 152},
    Curr{"EUR", "Euro", "Greece", 978},
    Curr{"HTG", "Gourde", "Haiti", 332},
    ...
}

func isDollar(curr Curr) bool {
    var result bool
    switch curr {
    default:
        result = false
    case Curr{"AUD", "Australian Dollar", "Australia", 36}:
        result = true
    case Curr{"HKD", "Hong Kong Dollar", "Hong Kong", 344}:
        result = true
    case Curr{"USD", "US Dollar", "United States", 840}:
        result = true
    }
    return result
}

func isDollar2(curr Curr) bool {
    dollars := []Curr{currencies[2], currencies[6], currencies[9]}
    switch curr {
    default:
        return false
    case dollars[0]:
        fallthrough
    case dollars[1]:
        fallthrough
    case dollars[2]:
        return true
    }
    return false
}

func isEuro(curr Curr) bool {
    switch curr {
    case currencies[2], currencies[4], currencies[10]:
        return true
    default:
        return false
    }
}

func main() {
    curr := Curr{"EUR", "Euro", "Italy", 978}
    if isDollar(curr) {
        fmt.Printf("%+v is Dollar currency\n", curr)
    } else if isEuro(curr) {
        fmt.Printf("%+v is Euro currency\n", curr)
    } else {
        fmt.Println("Currency is not Dollar or Euro")
    }

    dol := Curr{"HKD", "Hong Kong Dollar", "Hong Kong", 344}
    if isDollar2(dol) {
        fmt.Println("Dollar currency found:", dol)
    }
}

The switch statement in Go has some interesting properties and rules that make it easy to use and reason about.
Semantically, Go's switch statement can be used in two contexts:

An expression-switch statement
A type-switch statement

The break statement can be used to escape out of a switch code block early. The switch statement can include a default case when no other case expressions evaluate to a match. There can only be one default case, and it may be placed anywhere within the switch block.

Using Expression Switches

Expression switches are flexible and can be used in many contexts where the control flow of a program needs to follow multiple paths. An expression switch supports many attributes, as outlined in the following bullets.

Expression switches can test values of any types. For instance, the following code snippet (from the previous program listing) tests values of struct type Curr:

func isDollar(curr Curr) bool {
    var result bool
    switch curr {
    default:
        result = false
    case Curr{"AUD", "Australian Dollar", "Australia", 36}:
        result = true
    case Curr{"HKD", "Hong Kong Dollar", "Hong Kong", 344}:
        result = true
    case Curr{"USD", "US Dollar", "United States", 840}:
        result = true
    }
    return result
}

The expressions in case clauses are evaluated from left to right, top to bottom, until a value (or expression) is found that is equal to that of the switch expression. Upon encountering the first case that matches the switch expression, the program will execute the statements for the case block and then immediately exit the switch block. Unlike other languages, the Go case statement does not need to use a break to avoid falling through to the next case. For instance, calling isDollar(Curr{"HKD", "Hong Kong Dollar", "Hong Kong", 344}) will match the second case statement in the function above. The code will set result to true and exit the switch code block immediately.

Case clauses can have multiple values (or expressions) separated by commas, with a logical OR operator implied between them. For instance, in the following snippet, the switch expression curr is tested against values currencies[2], currencies[4], or currencies[10] using one case clause until a match is found:

func isEuro(curr Curr) bool {
    switch curr {
    case currencies[2], currencies[4], currencies[10]:
        return true
    default:
        return false
    }
}

The switch statement is the cleaner and preferred idiomatic approach to writing complex conditional statements in Go. This is evident when the snippet above is compared to the following, which does the same comparison using if statements:

func isEuro(curr Curr) bool {
    if curr == currencies[2] || curr == currencies[4] || curr == currencies[10] {
        return true
    } else {
        return false
    }
}

Fallthrough Cases

There is no automatic fall-through in Go's case clause, as there is in the C or Java switch statements. Recall that a switch block will exit after executing its first matching case. The code must explicitly place the fallthrough keyword, as the last statement in a case block, to force the execution flow to fall through to the successive case block. The following code snippet shows a switch statement with a fallthrough in each case block:

func isDollar2(curr Curr) bool {
    switch curr {
    case Curr{"AUD", "Australian Dollar", "Australia", 36}:
        fallthrough
    case Curr{"HKD", "Hong Kong Dollar", "Hong Kong", 344}:
        fallthrough
    case Curr{"USD", "US Dollar", "United States", 840}:
        return true
    default:
        return false
    }
}

When a case is matched, the fallthrough statements cascade down to the first statement of the successive case block. So if curr = Curr{"AUD", "Australian Dollar", "Australia", 36}, the first case will be matched.
Fallthrough Cases

There is no automatic fallthrough in Go's case clauses as there is in C or Java switch statements. Recall that a switch block will exit after executing its first matching case. The code must explicitly place the fallthrough keyword, as the last statement in a case block, to force the execution flow to fall through to the successive case block. The following code snippet shows a switch statement with fallthrough statements in its case blocks:

func isDollar2(curr Curr) bool {
    switch curr {
    case Curr{"AUD", "Australian Dollar", "Australia", 36}:
        fallthrough
    case Curr{"HKD", "Hong Kong Dollar", "Hong Kong", 344}:
        fallthrough
    case Curr{"USD", "US Dollar", "United States", 840}:
        return true
    default:
        return false
    }
}

When a case is matched, the fallthrough statements cascade down to the first statement of the successive case block. So if curr = Curr{"AUD", "Australian Dollar", "Australia", 36}, the first case will be matched. Then the flow cascades down to the first statement of the second case block, which is also a fallthrough statement. This causes the first statement, return true, of the third case block to execute. This is functionally equivalent to the following snippet:

switch curr {
case Curr{"AUD", "Australian Dollar", "Australia", 36},
    Curr{"HKD", "Hong Kong Dollar", "Hong Kong", 344},
    Curr{"USD", "US Dollar", "United States", 840}:
    return true
default:
    return false
}

Expressionless Switches

Go supports a form of the switch statement that does not specify an expression. In this format, each case expression must evaluate to the Boolean value true for its block to execute. The following abbreviated source code illustrates the use of an expressionless switch statement, as listed in function find(). The function loops through the slice of Curr values to search for a match based on field values in the struct passed in:

import (
    "fmt"
    "strings"
)

type Curr struct {
    Currency string
    Name     string
    Country  string
    Number   int
}

var currencies = []Curr{
    Curr{"DZD", "Algerian Dinar", "Algeria", 12},
    Curr{"AUD", "Australian Dollar", "Australia", 36},
    Curr{"EUR", "Euro", "Belgium", 978},
    Curr{"CLP", "Chilean Peso", "Chile", 152},
    ...
}

func find(name string) {
    for i := 0; i < 10; i++ {
        c := currencies[i]
        switch {
        case strings.Contains(c.Currency, name),
            strings.Contains(c.Name, name),
            strings.Contains(c.Country, name):
            fmt.Println("Found", c)
        }
    }
}

Notice that in the previous example, the switch statement in function find() does not include an expression. Each case expression is separated by a comma and must evaluate to a Boolean value, with an implied OR operator between each case. The previous switch statement is equivalent to the following use of the if statement to achieve the same logic:

func find(name string) {
    for i := 0; i < 10; i++ {
        c := currencies[i]
        if strings.Contains(c.Currency, name) ||
            strings.Contains(c.Name, name) ||
            strings.Contains(c.Country, name) {
            fmt.Println("Found", c)
        }
    }
}

Switch Initializer

The switch keyword may be immediately followed by a simple initialization statement where variables, local to the switch code block, may be declared and initialized. This convenient syntax uses a semicolon between the initializer statement and the switch expression to declare variables, which may appear anywhere in the switch code block. The following code sample shows how this is done by initializing two variables, name and curr, as part of the switch declaration:

func assertEuro(c Curr) bool {
    switch name, curr := "Euro", "EUR"; {
    case c.Name == name:
        return true
    case c.Currency == curr:
        return true
    }
    return false
}

The previous code snippet uses an expressionless switch statement with an initializer. Notice the trailing semicolon that indicates the separation between the initialization statement and the expression area for the switch. In the example, however, the switch expression is empty.

Type Switches

Given Go's strong type support, it should be of little surprise that the language supports the ability to query type information. The type switch is a statement that uses the Go interface type to compare the underlying type information of values (or expressions). A full discussion of interface types and type assertion is beyond the scope of this section. For now, all you need to know is that Go offers the type interface{}, or empty interface, as a super type that is implemented by all other types in the type system.
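As a quick illustration of that idea, the following minimal sketch (the values are hypothetical) shows values of unrelated types being assigned to a single interface{} variable, which is exactly what makes the type switch in the next snippet possible:

// A minimal sketch (hypothetical values): any value can be assigned
// to a variable of the empty interface type, interface{}.
package main

import "fmt"

func main() {
    var any interface{}
    any = 978               // an int
    fmt.Printf("%T\n", any) // prints: int
    any = "Euro"            // a string
    fmt.Printf("%T\n", any) // prints: string
    any = []int{12, 36}     // even a slice
    fmt.Printf("%T\n", any) // prints: []int
}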
When a value is assigned the type interface{}, it can be queried using the type switch, as shown in function findAny() in the following code snippet, to query information about its underlying type:

func find(name string) {
    for i := 0; i < 10; i++ {
        c := currencies[i]
        switch {
        case strings.Contains(c.Currency, name),
            strings.Contains(c.Name, name),
            strings.Contains(c.Country, name):
            fmt.Println("Found", c)
        }
    }
}

func findNumber(num int) {
    for _, curr := range currencies {
        if curr.Number == num {
            fmt.Println("Found", curr)
        }
    }
}

func findAny(val interface{}) {
    switch i := val.(type) {
    case int:
        findNumber(i)
    case string:
        find(i)
    default:
        fmt.Printf("Unable to search with type %T\n", val)
    }
}

func main() {
    findAny("Peso")
    findAny(404)
    findAny(978)
    findAny(false)
}

The function findAny() takes an interface{} as its parameter. The type switch is used to determine the underlying type and value of the variable val using the type assertion expression:

switch i := val.(type)

Notice the use of the keyword type in the type assertion expression. Each case clause will be tested against the type information queried from val.(type). Variable i will be assigned the actual value of the underlying type, and is used to invoke a function with the respective value. The default block is invoked to guard against any unexpected type assigned to the val parameter. Function findAny may then be invoked with values of diverse types, as shown in the following code snippet:

findAny("Peso")
findAny(404)
findAny(978)
findAny(false)

Summary

This article gave a walkthrough of the mechanisms of control flow in Go, including the if and switch statements. While Go's flow control constructs appear simple and easy to use, they are powerful and implement all the branching primitives expected of a modern language.

Resources for Article:

Further resources on this subject:
- Game Development Using C++ [Article]
- Boost.Asio C++ Network Programming [Article]
- Introducing the Boost C++ Libraries [Article]

Developing Middleware

Packt
08 Aug 2016
16 min read
In this article by Doug Bierer, author of the book PHP 7 Programming Cookbook, we will cover the following topics:

- Authenticating with middleware
- Making inter-framework system calls
- Using middleware to cross languages

(For more resources related to this topic, see here.)

Introduction

As often happens in the IT industry, terms get invented, and then used and abused. The term middleware is no exception. Arguably the first use of the term came out of the Internet Engineering Task Force (IETF) in the year 2000. Originally, the term was applied to any software which operates between the transport (that is, TCP/IP) and the application layer. More recently, especially with the acceptance of PHP Standard Recommendation number 7 (PSR-7), middleware, specifically in the PHP world, has been applied to the web client-server environment.

Authenticating with middleware

One very important usage of middleware is to provide authentication. Most web-based applications need the ability to verify a visitor via username and password. By incorporating PSR-7 standards into an authentication class, you will make it generically useful across the board, so to speak, secure in the knowledge that it can be used in any framework that provides PSR-7-compliant request and response objects.

How to do it…

We begin by defining an Application\Acl\AuthenticateInterface class. We use this interface to support the Adapter software design pattern, making our Authenticate class more generically useful by allowing a variety of adapters, each of which can draw authentication from a different source (for example, from a file, using OAuth2, and so on). Note the use of the PHP 7 ability to define the return value data type:

namespace Application\Acl;

use Psr\Http\Message\{ RequestInterface, ResponseInterface };

interface AuthenticateInterface
{
    public function login(RequestInterface $request) : ResponseInterface;
}

Note that by defining a method that requires a PSR-7-compliant request, and produces a PSR-7-compliant response, we have made this interface universally applicable.

Next, we define the adapter that implements the login() method required by the interface. We make sure to use the appropriate classes, and define fitting constants and properties. The constructor makes use of Application\Database\Connection:

namespace Application\Acl;

use PDO;
use Application\Database\Connection;
use Psr\Http\Message\{ RequestInterface, ResponseInterface };
use Application\MiddleWare\{ Response, TextStream };

class DbTable implements AuthenticateInterface
{
    const ERROR_AUTH = 'ERROR: authentication error';
    protected $conn;
    protected $table;
    public function __construct(Connection $conn, $tableName)
    {
        $this->conn  = $conn;
        $this->table = $tableName;
    }

The core login() method extracts the username and password from the request object. We then do a straightforward database lookup. If there is a match, we store user information in the response body, JSON-encoded:

    public function login(RequestInterface $request) : ResponseInterface
    {
        $code = 401;
        $info = FALSE;
        $body = new TextStream(self::ERROR_AUTH);
        $params = json_decode($request->getBody()->getContents());
        $response = new Response();
        $username = $params->username ?? FALSE;
        if ($username) {
            $sql = 'SELECT * FROM ' . $this->table . ' WHERE email = ?';
            $stmt = $this->conn->pdo->prepare($sql);
            $stmt->execute([$username]);
            $row = $stmt->fetch(PDO::FETCH_ASSOC);
            if ($row) {
                if (password_verify($params->password, $row['password'])) {
                    unset($row['password']);
                    $body = new TextStream(json_encode($row));
                    $response->withBody($body);
                    $code = 202;
                    $info = $row;
                }
            }
        }
        return $response->withBody($body)->withStatus($code);
    }
}

Best practice: never store passwords in clear text. When you need to do a password match, use password_verify(), which negates the need to reproduce the password hash.

The Authenticate class is a wrapper for an adapter class that implements AuthenticateInterface. Accordingly, the constructor takes an adapter class as an argument, as well as a string that serves as the key under which authentication information is stored in $_SESSION:

namespace Application\Acl;

use Application\MiddleWare\{ Response, TextStream };
use Psr\Http\Message\{ RequestInterface, ResponseInterface };

class Authenticate
{
    const ERROR_AUTH = 'ERROR: invalid token';
    const DEFAULT_KEY = 'auth';
    protected $adapter;
    protected $token;
    public function __construct(AuthenticateInterface $adapter, $key)
    {
        $this->key     = $key;
        $this->adapter = $adapter;
    }

In addition, we provide a login form with a security token, which helps prevent Cross-Site Request Forgery (CSRF) attacks:

    public function getToken()
    {
        $this->token = bin2hex(random_bytes(16));
        $_SESSION['token'] = $this->token;
        return $this->token;
    }
    public function matchToken($token)
    {
        $sessToken = $_SESSION['token'] ?? date('Ymd');
        return ($token == $sessToken);
    }
    public function getLoginForm($action = NULL)
    {
        $action  = ($action) ? 'action="' . $action . '" ' : '';
        $output  = '<form method="post" ' . $action . '>';
        $output .= '<table><tr><th>Username</th><td>';
        $output .= '<input type="text" name="username" /></td>';
        $output .= '</tr><tr><th>Password</th><td>';
        $output .= '<input type="password" name="password" />';
        $output .= '</td></tr><tr><th>&nbsp;</th>';
        $output .= '<td><input type="submit" /></td>';
        $output .= '</tr></table>';
        $output .= '<input type="hidden" name="token" value="';
        $output .= $this->getToken() . '" />';
        $output .= '</form>';
        return $output;
    }

Finally, the login() method in this class checks whether the token is valid. If not, a 400 response is returned. Otherwise, the login() method of the adapter is called:

    public function login(RequestInterface $request) : ResponseInterface
    {
        $params = json_decode($request->getBody()->getContents());
        $token  = $params->token ?? FALSE;
        if (!($token && $this->matchToken($token))) {
            $code = 400;
            $body = new TextStream(self::ERROR_AUTH);
            $response = new Response($code, $body);
        } else {
            $response = $this->adapter->login($request);
        }
        if ($response->getStatusCode() >= 200
            && $response->getStatusCode() < 300) {
            $_SESSION[$this->key] =
                json_decode($response->getBody()->getContents());
        } else {
            $_SESSION[$this->key] = NULL;
        }
        return $response;
    }
}

How it works…

Go ahead and define the classes presented in this recipe, summarized in the following table:

Class | Discussed in these steps
Application\Acl\AuthenticateInterface | 1
Application\Acl\DbTable | 2 - 3
Application\Acl\Authenticate | 4 - 6

You can then define a chap_09_middleware_authenticate.php calling program that sets up autoloading and uses the appropriate classes:

<?php
session_start();
define('DB_CONFIG_FILE', __DIR__ . '/../config/db.config.php');
define('DB_TABLE', 'customer_09');
define('SESSION_KEY', 'auth');
require __DIR__ . '/../Application/Autoload/Loader.php';
Application\Autoload\Loader::init(__DIR__ . '/..');

use Application\Database\Connection;
use Application\Acl\{ DbTable, Authenticate };
use Application\MiddleWare\{ ServerRequest, Request, Constants, TextStream };

You are now in a position to set up the authentication adapter and core class:

$conn   = new Connection(include DB_CONFIG_FILE);
$dbAuth = new DbTable($conn, DB_TABLE);
$auth   = new Authenticate($dbAuth, SESSION_KEY);

Be sure to initialize the incoming request, and set up the request to be made to the authentication class:

$incoming = new ServerRequest();
$incoming->initialize();
$outbound = new Request();

Check the incoming class method to see if it is POST. If so, pass a request to the authentication class:

if ($incoming->getMethod() == Constants::METHOD_POST) {
    $body = new TextStream(json_encode($incoming->getParsedBody()));
    $response = $auth->login($outbound->withBody($body));
}
$action = $incoming->getServerParams()['PHP_SELF'];
?>

The display logic looks like this:

<?= $auth->getLoginForm($action) ?>

Here is the output from an invalid authentication attempt. Notice the 401 status code on the right. In this illustration, you could add a var_dump() of the response object. Here is a successful authentication:

Making inter-framework system calls

One of the primary reasons for the development of PSR-7 (and middleware) was a growing need to make calls between frameworks. It is of interest to note that the main documentation for PSR-7 is hosted by the PHP Framework Interop Group (PHP-FIG).

How to do it…

The primary mechanism used in middleware inter-framework calls is to create a driver program that executes framework calls in succession, maintaining a common request and response object. The request and response objects are expected to represent Psr\Http\Message\ServerRequestInterface and Psr\Http\Message\ResponseInterface respectively.

For the purposes of this illustration, we define a middleware session validator. The constants and properties reflect the session thumbprint, which is a term we use to incorporate factors such as the website visitor's IP address, browser, and language settings:

namespace Application\MiddleWare\Session;

use InvalidArgumentException;
use Psr\Http\Message\{ ServerRequestInterface, ResponseInterface };
use Application\MiddleWare\{ Constants, Response, TextStream };

class Validator
{
    const KEY_TEXT = 'text';
    const KEY_SESSION = 'thumbprint';
    const KEY_STATUS_CODE = 'code';
    const KEY_STATUS_REASON = 'reason';
    const KEY_STOP_TIME = 'stop_time';
    const ERROR_TIME = 'ERROR: session has exceeded stop time';
    const ERROR_SESSION = 'ERROR: thumbprint does not match';
    const SUCCESS_SESSION = 'SUCCESS: session validates OK';
    protected $sessionKey;
    protected $currentPrint;
    protected $storedPrint;
    protected $currentTime;
    protected $storedTime;

The constructor takes a ServerRequestInterface instance and the session as arguments. If the session is an array (such as $_SESSION), we wrap it in a class. The reason why we do this is in case we are passed a session object, such as JSession used in Joomla. We then create the thumbprint using the factors previously mentioned. If the stored thumbprint is not available, we assume this is the first time, and store the current print as well as the stop time, if this parameter is set.
We used md5() because it is a fast hash that is not exposed externally, and is therefore adequate for this application:

    public function __construct(ServerRequestInterface $request, $stopTime = NULL)
    {
        $this->currentTime  = time();
        $this->storedTime   = $_SESSION[self::KEY_STOP_TIME] ?? 0;
        $this->currentPrint =
            md5($request->getServerParams()['REMOTE_ADDR']
                . $request->getServerParams()['HTTP_USER_AGENT']
                . $request->getServerParams()['HTTP_ACCEPT_LANGUAGE']);
        $this->storedPrint  = $_SESSION[self::KEY_SESSION] ?? NULL;
        if (empty($this->storedPrint)) {
            $this->storedPrint = $this->currentPrint;
            $_SESSION[self::KEY_SESSION] = $this->storedPrint;
            if ($stopTime) {
                $this->storedTime = $stopTime;
                $_SESSION[self::KEY_STOP_TIME] = $stopTime;
            }
        }
    }

It's not required to define __invoke(), but this magic method is quite convenient for standalone middleware classes. As is the convention, we accept ServerRequestInterface and ResponseInterface instances as arguments. In this method we simply check to see if the current thumbprint matches the one stored. The first time, of course, they will match. But on subsequent requests, the chances are an attacker intent on session hijacking will be caught out. In addition, if the session time exceeds the stop time (if set), likewise, a 401 code will be sent:

    public function __invoke(ServerRequestInterface $request, Response $response)
    {
        $code = 401;  // unauthorized
        if ($this->currentPrint != $this->storedPrint) {
            $text[self::KEY_TEXT] = self::ERROR_SESSION;
            $text[self::KEY_STATUS_REASON] = Constants::STATUS_CODES[401];
        } elseif ($this->storedTime) {
            if ($this->currentTime > $this->storedTime) {
                $text[self::KEY_TEXT] = self::ERROR_TIME;
                $text[self::KEY_STATUS_REASON] = Constants::STATUS_CODES[401];
            } else {
                $code = 200;  // success
            }
        }
        if ($code == 200) {
            $text[self::KEY_TEXT] = self::SUCCESS_SESSION;
            $text[self::KEY_STATUS_REASON] = Constants::STATUS_CODES[200];
        }
        $text[self::KEY_STATUS_CODE] = $code;
        $body = new TextStream(json_encode($text));
        return $response->withStatus($code)->withBody($body);
    }

We can now put our new middleware class to use. The main problems with inter-framework calls, at least at this point, are summarized here; accordingly, how we implement middleware depends heavily on the last point:

- Not all PHP frameworks are PSR-7-compliant
- Existing PSR-7 implementations are not complete
- All frameworks want to be the "boss"

As an example, have a look at the configuration files for Zend Expressive, which is a self-proclaimed PSR-7 Middleware Microframework. Here is the file middleware-pipeline.global.php, which is located in the config/autoload folder in a standard Expressive application. The dependencies key is used to identify middleware wrapper classes that will be activated in the pipeline:

<?php

use Zend\Expressive\Container\ApplicationFactory;
use Zend\Expressive\Helper;

return [
    'dependencies' => [
        'factories' => [
            Helper\ServerUrlMiddleware::class =>
                Helper\ServerUrlMiddlewareFactory::class,
            Helper\UrlHelperMiddleware::class =>
                Helper\UrlHelperMiddlewareFactory::class,
            // insert your own class here
        ],
    ],

Under the middleware_pipeline key, you can identify classes that will be executed before or after the routing process occurs.
Optional parameters include path, error, and priority:

    'middleware_pipeline' => [
        'always' => [
            'middleware' => [
                Helper\ServerUrlMiddleware::class,
            ],
            'priority' => 10000,
        ],
        'routing' => [
            'middleware' => [
                ApplicationFactory::ROUTING_MIDDLEWARE,
                Helper\UrlHelperMiddleware::class,
                // insert reference to middleware here
                ApplicationFactory::DISPATCH_MIDDLEWARE,
            ],
            'priority' => 1,
        ],
        'error' => [
            'middleware' => [
                // Add error middleware here.
            ],
            'error' => true,
            'priority' => -10000,
        ],
    ],
];

Another technique is to modify the source code of an existing framework module, and make a request to a PSR-7-compliant middleware application. Here is an example modifying a Joomla! installation to include a middleware session validator. Add this code at the end of the index.php file in the /path/to/joomla folder. Since Joomla! uses Composer, we can leverage the Composer autoloader:

session_start();    // to support use of $_SESSION
$loader = include __DIR__ . '/libraries/vendor/autoload.php';
$loader->add('Application', __DIR__ . '/libraries/vendor');
$loader->add('Psr', __DIR__ . '/libraries/vendor');

We can then create an instance of our middleware session validator, and make a validation request just before $app = JFactory::getApplication('site');:

$session = JFactory::getSession();
$request = (new Application\MiddleWare\ServerRequest())->initialize();
$response = new Application\MiddleWare\Response();
$validator = new Application\MiddleWare\Session\Validator($request, $session);
$response = $validator($request, $response);
if ($response->getStatusCode() != 200) {
    // take some action
}

How it works…

First, create the Application\MiddleWare\Session\Validator test middleware class described in steps 2 - 5. Then you will need to go to getcomposer.org and follow the directions to obtain Composer. Next, build a basic Zend Expressive application, as shown next. Be sure to select No when prompted for a minimal skeleton:

cd /path/to/source/for/this/chapter
php composer.phar create-project zendframework/zend-expressive-skeleton expressive

This will create a folder /path/to/source/for/this/chapter/expressive. Change to this directory. Modify public/index.php as follows:

<?php
if (php_sapi_name() === 'cli-server'
    && is_file(__DIR__ . parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH))
) {
    return false;
}
chdir(dirname(__DIR__));
session_start();
$_SESSION['time'] = time();
$appDir = realpath(__DIR__ . '/../../..');
$loader = require 'vendor/autoload.php';
$loader->add('Application', $appDir);
$container = require 'config/container.php';
$app = $container->get(Zend\Expressive\Application::class);
$app->run();

You will then need to create a wrapper class that invokes our session validator middleware. Create a SessionValidateAction.php file that needs to go in the /path/to/source/for/this/chapter/expressive/src/App/Action folder. For the purposes of this illustration, set the stop time parameter to a short duration.
In this case, time() + 10 gives you 10 seconds:

namespace App\Action;

use Application\MiddleWare\Session\Validator;
use Zend\Diactoros\{ Request, Response };
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;

class SessionValidateAction
{
    public function __invoke(ServerRequestInterface $request,
        ResponseInterface $response, callable $next = null)
    {
        $inbound   = new Response();
        $validator = new Validator($request, time() + 10);
        $inbound   = $validator($request, $response);
        if ($inbound->getStatusCode() != 200) {
            session_destroy();
            setcookie('PHPSESSID', 0, time() - 300);
            $params = json_decode($inbound->getBody()->getContents(), TRUE);
            echo '<h1>', $params[Validator::KEY_TEXT], '</h1>';
            echo '<pre>', var_dump($inbound), '</pre>';
            exit;
        }
        return $next($request, $response);
    }
}

You will now need to add the new class to the middleware pipeline. Modify config/autoload/middleware-pipeline.global.php as follows; the modifications are the invokables entry and the App\Action\SessionValidateAction::class reference in the routing pipeline:

<?php

use Zend\Expressive\Container\ApplicationFactory;
use Zend\Expressive\Helper;

return [
    'dependencies' => [
        'invokables' => [
            App\Action\SessionValidateAction::class =>
                App\Action\SessionValidateAction::class,
        ],
        'factories' => [
            Helper\ServerUrlMiddleware::class =>
                Helper\ServerUrlMiddlewareFactory::class,
            Helper\UrlHelperMiddleware::class =>
                Helper\UrlHelperMiddlewareFactory::class,
        ],
    ],
    'middleware_pipeline' => [
        'always' => [
            'middleware' => [
                Helper\ServerUrlMiddleware::class,
            ],
            'priority' => 10000,
        ],
        'routing' => [
            'middleware' => [
                ApplicationFactory::ROUTING_MIDDLEWARE,
                Helper\UrlHelperMiddleware::class,
                App\Action\SessionValidateAction::class,
                ApplicationFactory::DISPATCH_MIDDLEWARE,
            ],
            'priority' => 1,
        ],
        'error' => [
            'middleware' => [
                // Add error middleware here.
            ],
            'error' => true,
            'priority' => -10000,
        ],
    ],
];

You might also consider modifying the home page template to show the status of $_SESSION. The file in question is /path/to/source/for/this/chapter/expressive/templates/app/home-page.phtml. Simply adding var_dump($_SESSION) should suffice. Initially, you should see something like this: After 10 seconds, refresh the browser. You should now see this:

Using middleware to cross languages

Except in cases where you are trying to communicate between different versions of PHP, PSR-7 middleware will be of minimal use. Recall what the acronym stands for: PHP Standards Recommendations. Accordingly, if you need to make a request to an application written in another language, treat it as you would any other web service HTTP request.

How to do it…

In the case of PHP 4, you actually have a chance, in that there was limited support for object-oriented programming. There is not enough space to cover all the changes, but we present a potential PHP 4 version of Application\MiddleWare\ServerRequest. The first thing to note is that there are no namespaces! Accordingly, we use a classname with underscores, _, in place of namespace separators:

class Application_MiddleWare_ServerRequest
    extends Application_MiddleWare_Request
    implements Psr_Http_Message_ServerRequestInterface
{

All properties are identified in PHP 4 using the keyword var:

    var $serverParams;
    var $cookies;
    var $queryParams;
    // not all properties are shown

The initialize() method is almost the same, except that syntax such as $this->getServerParams()['REQUEST_URI'] was not allowed in PHP 4.
Accordingly, we need to split this out into a separate variable:

    function initialize()
    {
        $params = $this->getServerParams();
        $this->getCookieParams();
        $this->getQueryParams();
        $this->getUploadedFiles();
        $this->getRequestMethod();
        $this->getContentType();
        $this->getParsedBody();
        return $this->withRequestTarget($params['REQUEST_URI']);
    }

All of the $_XXX super-globals were present in later versions of PHP 4:

    function getServerParams()
    {
        if (!$this->serverParams) {
            $this->serverParams = $_SERVER;
        }
        return $this->serverParams;
    }
    // not all getXXX() methods are shown to conserve space

The null coalesce operator was only introduced in PHP 7. We need to use isset(XXX) ? XXX : ''; instead:

    function getRequestMethod()
    {
        $params = $this->getServerParams();
        $method = isset($params['REQUEST_METHOD'])
            ? $params['REQUEST_METHOD'] : '';
        $this->method = strtolower($method);
        return $this->method;
    }

The JSON extension was not introduced until PHP 5. Accordingly, we need to be satisfied with raw input. We could also possibly use serialize() or unserialize() in place of json_encode() and json_decode():

    function getParsedBody()
    {
        if (!$this->parsedBody) {
            if (($this->getContentType() == Constants::CONTENT_TYPE_FORM_ENCODED
                    || $this->getContentType() == Constants::CONTENT_TYPE_MULTI_FORM)
                && $this->getRequestMethod() == Constants::METHOD_POST
            ) {
                $this->parsedBody = $_POST;
            } elseif ($this->getContentType() == Constants::CONTENT_TYPE_JSON
                || $this->getContentType() == Constants::CONTENT_TYPE_HAL_JSON
            ) {
                ini_set("allow_url_fopen", true);
                $this->parsedBody = file_get_contents('php://stdin');
            } elseif (!empty($_REQUEST)) {
                $this->parsedBody = $_REQUEST;
            } else {
                ini_set("allow_url_fopen", true);
                $this->parsedBody = file_get_contents('php://stdin');
            }
        }
        return $this->parsedBody;
    }

The withXXX() methods work pretty much the same in PHP 4:

    function withParsedBody($data)
    {
        $this->parsedBody = $data;
        return $this;
    }

Likewise, the withoutXXX() methods work the same as well:

    function withoutAttribute($name)
    {
        if (isset($this->attributes[$name])) {
            unset($this->attributes[$name]);
        }
        return $this;
    }
}

For websites using other languages, we could use the PSR-7 classes to formulate requests and responses, but would then need to use an HTTP client to communicate with the other website. Here is an example:

$request = new Request(
    TARGET_WEBSITE_URL,
    Constants::METHOD_POST,
    new TextStream($contents),
    [Constants::HEADER_CONTENT_TYPE => Constants::CONTENT_TYPE_FORM_ENCODED,
     Constants::HEADER_CONTENT_LENGTH => $body->getSize()]
);
$data = http_build_query(['data' => $request->getBody()->getContents()]);

$defaults = array(
    CURLOPT_URL => $request->getUri()->getUriString(),
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $data,
);
$ch = curl_init();
curl_setopt_array($ch, $defaults);
$response = curl_exec($ch);
curl_close($ch);

Summary

In this article, we learned how to provide authentication with middleware, how to make calls between frameworks, and how to make requests to applications written in other languages.

Resources for Article:

Further resources on this subject:
- Middleware [Article]
- Building a Web Application with PHP and MariaDB – Introduction to caching [Article]
- Data Tables and DataTables Plugin in jQuery 1.3 with PHP [Article]


Responsive Applications with Asynchronous Programming

Packt
13 Jul 2016
9 min read
In this article by Dirk Strauss, author of the book C# Programming Cookbook, he sheds some light on how to handle events, exceptions, and tasks in asynchronous programming, making your application responsive.

(For more resources related to this topic, see here.)

Handling tasks in asynchronous programming

Task-Based Asynchronous Pattern (TAP) is now the recommended method to create asynchronous code. It executes asynchronously on a thread from the thread pool and does not execute synchronously on the main thread of your application. It allows us to check the task's state by calling the Status property.

Getting ready

We will create a task to read a very large text file. This will be accomplished using an asynchronous Task.

How to do it…

Create a large text file (we called ours taskFile.txt) and place it in your C:\temp folder.

In the AsyncDemo class, create a method called ReadBigFile() that returns a Task<TResult> type, which will be used to return an integer of bytes read from our big text file:

public Task<int> ReadBigFile()
{
}

Add the following code to open and read the file bytes. You will see that we are using the ReadAsync() method, which asynchronously reads a sequence of bytes from the stream and advances the position in that stream by the number of bytes read. You will also notice that we are using a buffer to read those bytes:

public Task<int> ReadBigFile()
{
    var bigFile = File.OpenRead(@"C:\temp\taskFile.txt");
    var bigFileBuffer = new byte[bigFile.Length];
    var readBytes = bigFile.ReadAsync(bigFileBuffer, 0, (int)bigFile.Length);
    return readBytes;
}

Exceptions you can expect to handle from the ReadAsync() method are ArgumentNullException, ArgumentOutOfRangeException, ArgumentException, NotSupportedException, ObjectDisposedException, and InvalidOperationException.

Finally, add the final section of code just after the var readBytes = bigFile.ReadAsync(bigFileBuffer, 0, (int)bigFile.Length); line that uses a lambda expression to specify the work that the task needs to perform. In this case, it is to read the bytes in the file:

public Task<int> ReadBigFile()
{
    var bigFile = File.OpenRead(@"C:\temp\taskFile.txt");
    var bigFileBuffer = new byte[bigFile.Length];
    var readBytes = bigFile.ReadAsync(bigFileBuffer, 0, (int)bigFile.Length);
    readBytes.ContinueWith(task =>
    {
        if (task.Status == TaskStatus.Running)
            Console.WriteLine("Running");
        else if (task.Status == TaskStatus.RanToCompletion)
            Console.WriteLine("RanToCompletion");
        else if (task.Status == TaskStatus.Faulted)
            Console.WriteLine("Faulted");
        bigFile.Dispose();
    });
    return readBytes;
}

If not done so in the previous section, add a button to your Windows Forms application's Form designer. On the winformAsync form designer, open Toolbox and select the Button control, which is found under the All Windows Forms node. Drag the button control onto the Form1 designer.

With the button control selected, double-click the control to create the click event in the code behind. Visual Studio will insert the event code for you:

namespace winformAsync
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
        }
    }
}

Change the button1_Click event and add the async keyword to the click event. This is an example of a void returning asynchronous method:

private async void button1_Click(object sender, EventArgs e)
{
}

Now, make sure that you add code to call the AsyncDemo class's ReadBigFile() method asynchronously.
Remember to read the result from the method (which is the number of bytes read) into an integer variable:

private async void button1_Click(object sender, EventArgs e)
{
    Console.WriteLine("Start file read");
    Chapter6.AsyncDemo oAsync = new Chapter6.AsyncDemo();
    int readResult = await oAsync.ReadBigFile();
    Console.WriteLine("Bytes read = " + readResult);
}

Running your application will display the Windows Forms application. Before clicking on the button1 button, ensure that the Output window is visible. From the View menu, click on the Output menu item or type Ctrl + Alt + O to display the Output window. This will allow us to see the Console.WriteLine() outputs as we have added them to the code in the Chapter6 class and in the Windows application.

Clicking on the button1 button will display the outputs in our Output window. Throughout this code execution, the form remains responsive. Take note, though, that the information displayed in your Output window will differ from the screenshot. This is because the file you used is different from mine.

How it works…

The task is executed on a separate thread from the thread pool. This allows the application to remain responsive while the large file is being processed. Tasks can be used in multiple ways to improve your code. This recipe is but one example.

Exception handling in asynchronous programming

Exception handling in asynchronous programming has always been a challenge. This was especially true in the catch blocks. As of C# 6, you are now allowed to write asynchronous code inside the catch and finally blocks of your exception handlers.

Getting ready

The application will simulate the action of reading a logfile. Assume that a third-party system always makes a backup of the logfile before processing it in another application. While this processing is happening, the logfile is deleted and recreated. Our application, however, needs to read this logfile on a periodic basis. We, therefore, need to be prepared for the case where the file does not exist in the location we expect it in. Therefore, we will purposely omit the main logfile, so that we can force an error.

How to do it…

Create a text file and two folders to contain the logfiles. We will, however, only create a single logfile in the BackupLog folder.
The MainLog folder will remain empty. In our AsyncDemo class, write a method to read the main logfile in the MainLog folder:

private async Task<int> ReadMainLog()
{
    var bigFile = File.OpenRead(@"C:\temp\Log\MainLog\taskFile.txt");
    var bigFileBuffer = new byte[bigFile.Length];
    var readBytes = bigFile.ReadAsync(bigFileBuffer, 0, (int)bigFile.Length);
    await readBytes.ContinueWith(task =>
    {
        if (task.Status == TaskStatus.RanToCompletion)
            Console.WriteLine("Main Log RanToCompletion");
        else if (task.Status == TaskStatus.Faulted)
            Console.WriteLine("Main Log Faulted");
        bigFile.Dispose();
    });
    return await readBytes;
}

Create a second method to read the backup file in the BackupLog folder:

private async Task<int> ReadBackupLog()
{
    var bigFile = File.OpenRead(@"C:\temp\Log\BackupLog\taskFile.txt");
    var bigFileBuffer = new byte[bigFile.Length];
    var readBytes = bigFile.ReadAsync(bigFileBuffer, 0, (int)bigFile.Length);
    await readBytes.ContinueWith(task =>
    {
        if (task.Status == TaskStatus.RanToCompletion)
            Console.WriteLine("Backup Log RanToCompletion");
        else if (task.Status == TaskStatus.Faulted)
            Console.WriteLine("Backup Log Faulted");
        bigFile.Dispose();
    });
    return await readBytes;
}

In actual fact, we would probably only create a single method to read the logfiles, passing only the path as a parameter. In a production application, creating a class and overriding a method to read the different logfile locations would be a better approach. For the purposes of this recipe, however, we specifically wanted to create two separate methods so that the different calls to the asynchronous methods are clearly visible in the code.

We will then create a main ReadLogFile() method that tries to read the main logfile. As we have not created the logfile in the MainLog folder, the code will throw a FileNotFoundException. It will then run the asynchronous method and await that in the catch block of the ReadLogFile() method (something which was impossible in previous versions of C#), returning the bytes read to the calling code:

public async Task<int> ReadLogFile()
{
    int returnBytes = -1;
    try
    {
        returnBytes = await ReadMainLog();
    }
    catch (Exception ex)
    {
        try
        {
            returnBytes = await ReadBackupLog();
        }
        catch (Exception)
        {
            throw;
        }
    }
    return returnBytes;
}

If not done so in the previous recipe, add a button to your Windows Forms application's Form designer. On the winformAsync form designer, open Toolbox and select the Button control, which is found under the All Windows Forms node. Drag the button control onto the Form1 designer.

With the button control selected, double-click on the control to create the click event in the code behind. Visual Studio will insert the event code for you:

namespace winformAsync
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
        }
    }
}

Change the button1_Click event and add the async keyword to the click event. This is an example of a void returning an asynchronous method:

private async void button1_Click(object sender, EventArgs e)
{
}

Next, we will write the code to create a new instance of the AsyncDemo class and attempt to read the main logfile.
In a real-world example, it is at this point that the code does not know that the main logfile does not exist:

private async void button1_Click(object sender, EventArgs e)
{
    Console.WriteLine("Read backup file");
    Chapter6.AsyncDemo oAsync = new Chapter6.AsyncDemo();
    int readResult = await oAsync.ReadLogFile();
    Console.WriteLine("Bytes read = " + readResult);
}

Running your application will display the Windows Forms application. Before clicking on the button1 button, ensure that the Output window is visible. From the View menu, click on the Output menu item or type Ctrl + Alt + O to display the Output window. This will allow us to see the Console.WriteLine() outputs as we have added them to the code in the Chapter6 class and in the Windows application.

To simulate a file not found exception, we deleted the file from the MainLog folder. You will see that the exception is thrown, and the catch block runs the code to read the backup logfile instead.

How it works…

The fact that we can await in catch and finally blocks allows developers much more flexibility, because asynchronous results can consistently be awaited throughout the application. As you can see from the code we wrote, as soon as the exception was thrown, we asynchronously called the read method for the backup file.

Summary

In this article, we looked at how TAP is now the recommended method to create asynchronous code, and how tasks can be used in multiple ways to improve your code while the application remains responsive during processing of a large file. We also saw how exception handling in asynchronous programming has always been a challenge, and how to use the catch and finally blocks to handle exceptions.

Resources for Article:

Further resources on this subject:
- Functional Programming in C# [article]
- Creating a sample C#.NET application [article]