Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7019 Articles
article-image-exploring-structure-motion-using-opencv
Packt
09 Jan 2017
20 min read
Save for later

Exploring Structure from Motion Using OpenCV

Packt
09 Jan 2017
20 min read
In this article by Roy Shilkrot, coauthor of the book Mastering OpenCV 3, we will discuss the notion of Structure from Motion (SfM), or better put, extracting geometric structures from images taken with a camera under motion, using OpenCV's API to help us. First, let's constrain the otherwise very broad approach to SfM using a single camera, usually called a monocular approach, and a discrete and sparse set of frames rather than a continuous video stream. These two constrains will greatly simplify the system we will sketch out in the coming pages and help us understand the fundamentals of any SfM method. In this article, we will cover the following: Structure from Motion concepts Estimating the camera motion from a pair of images (For more resources related to this topic, see here.) Throughout the article, we assume the use of a calibrated camera—one that was calibrated beforehand. Calibration is a ubiquitous operation in computer vision, fully supported in OpenCV using command-line tools. We, therefore, assume the existence of the camera's intrinsic parameters embodied in the K matrix and the distortion coefficients vector—the outputs from the calibration process. To make things clear in terms of language, from this point on, we will refer to a camera as a single view of the scene rather than to the optics and hardware taking the image. A camera has a position in space and a direction of view. Between two cameras, there is a translation element (movement through space) and a rotation of the direction of view. We will also unify the terms for the point in the scene, world, real, or 3D to be the same thing, a point that exists in our real world. The same goes for points in the image or 2D, which are points in the image coordinates, of some real 3D point that was projected on the camera sensor at that location and time. Structure from Motion concepts The first discrimination we should make is the difference between stereo (or indeed any multiview), 3D reconstruction using calibrated rigs, and SfM. A rig of two or more cameras assumes we already know what the "motion" between the cameras is, while in SfM, we don't know what this motion is and we wish to find it. Calibrated rigs, from a simplistic point of view, allow a much more accurate reconstruction of 3D geometry because there is no error in estimating the distance and rotation between the cameras—it is already known. The first step in implementing an SfM system is finding the motion between the cameras. OpenCV may help us in a number of ways to obtain this motion, specifically using the findFundamentalMat and findEssentialMat functions. Let's think for one moment of the goal behind choosing an SfM algorithm. In most cases, we wish to obtain the geometry of the scene, for example, where objects are in relation to the camera and what their form is. Having found the motion between the cameras picturing the same scene, from a reasonably similar point of view, we would now like to reconstruct the geometry. In computer vision jargon, this is known as triangulation, and there are plenty of ways to go about it. It may be done by way of ray intersection, where we construct two rays: one from each camera's center of projection and a point on each of the image planes. The intersection of these rays in space will, ideally, intersect at one 3D point in the real world that was imaged in each camera, as shown in the following diagram: In reality, ray intersection is highly unreliable. This is because the rays usually do not intersect, making us fall back to using the middle point on the shortest segment connecting the two rays. OpenCV contains a simple API for a more accurate form of triangulation, the triangulatePoints function, so this part we do not need to code on our own. After you have learned how to recover 3D geometry from two views, we will see how you can incorporate more views of the same scene to get an even richer reconstruction. At that point, most SfM methods try to optimize the bundle of estimated positions of our cameras and 3D points by means of Bundle Adjustment. OpenCV contains means for Bundle Adjustment in its new Image Stitching Toolbox. However, the beauty of working with OpenCV and C++ is the abundance of external tools that can be easily integrated into the pipeline. We will, therefore, see how to integrate an external bundle adjuster, the Ceres non-linear optimization package. Now that we have sketched an outline of our approach to SfM using OpenCV, we will see how each element can be implemented. Estimating the camera motion from a pair of images Before we set out to actually find the motion between two cameras, let's examine the inputs and the tools we have at hand to perform this operation. First, we have two images of the same scene from (hopefully not extremely) different positions in space. This is a powerful asset, and we will make sure that we use it. As for tools, we should take a look at mathematical objects that impose constraints over our images, cameras, and the scene. Two very useful mathematical objects are the fundamental matrix (denoted by F) and the essential matrix (denoted by E). They are mostly similar, except that the essential matrix is assuming usage of calibrated cameras; this is the case for us, so we will choose it. OpenCV allows us to find the fundamental matrix via the findFundamentalMat function and the essential matrix via the findEssentialMatrix function. Finding the essential matrix can be done as follows: Mat E = findEssentialMat(leftPoints, rightPoints, focal, pp); This function makes use of matching points in the "left" image, leftPoints, and "right" image, rightPoints, which we will discuss shortly, as well as two additional pieces of information from the camera's calibration: the focal length, focal, and principal point, pp. The essential matrix, E, is a 3 x 3 matrix, which imposes the following constraint on a point in one image and a point in the other image: x'K­TEKx = 0, where x is a point in the first image one, x' is the corresponding point in the second image, and K is the calibration matrix. This is extremely useful, as we are about to see. Another important fact we use is that the essential matrix is all we need in order to recover the two cameras' positions from our images, although only up to an arbitrary unit of scale. So, if we obtain the essential matrix, we know where each camera is positioned in space and where it is looking. We can easily calculate the matrix if we have enough of those constraint equations, simply because each equation can be used to solve for a small part of the matrix. In fact, OpenCV internally calculates it using just five point-pairs, but through the Random Sample Consensus algorithm (RANSAC), many more pairs can be used and make for a more robust solution. Point matching using rich feature descriptors Now we will make use of our constraint equations to calculate the essential matrix. To get our constraints, remember that for each point in image A, we must find a corresponding point in image B. We can achieve such a matching using OpenCV's extensive 2D feature-matching framework, which has greatly matured in the past few years. Feature extraction and descriptor matching is an essential process in computer vision and is used in many methods to perform all sorts of operations, for example, detecting the position and orientation of an object in the image or searching a big database of images for similar images through a given query. In essence, feature extraction means selecting points in the image that would make for good features and computing a descriptor for them. A descriptor is a vector of numbers that describes the surrounding environment around a feature point in an image. Different methods have different lengths and data types for their descriptor vectors. Descriptor Matching is the process of finding a corresponding feature from one set in another using its descriptor. OpenCV provides very easy and powerful methods to support feature extraction and matching. Let's examine a very simple feature extraction and matching scheme: vector<KeyPoint> keypts1, keypts2; Mat desc1, desc2; // detect keypoints and extract ORB descriptors Ptr<Feature2D> orb = ORB::create(2000); orb->detectAndCompute(img1, noArray(), keypts1, desc1); orb->detectAndCompute(img2, noArray(), keypts2, desc2); // matching descriptors Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming"); vector<DMatch> matches; matcher->match(desc1, desc2, matches); You may have already seen similar OpenCV code, but let's review it quickly. Our goal is to obtain three elements: feature points for two images, descriptors for them, and a matching between the two sets of features. OpenCV provides a range of feature detectors, descriptor extractors, and matchers. In this simple example, we use the ORB class to get both the 2D location of Oriented BRIEF (ORB) (where Binary Robust Independent Elementary Features (BRIEF)) feature points and their respective descriptors. We use a brute-force binary matcher to get the matching, which is the most straightforward way to match two feature sets by comparing each feature in the first set to each feature in the second set (hence the phrasing "brute-force"). In the following image, we will see a matching of feature points on two images from the Fountain-P11 sequence found at http://cvlab.epfl.ch/~strecha/multiview/denseMVS.html: Practically, raw matching like we just performed is good only up to a certain level, and many matches are probably erroneous. For that reason, most SfM methods perform some form of filtering on the matches to ensure correctness and reduce errors. One form of filtering, which is built into OpenCV's brute-force matcher, is cross-check filtering. That is, a match is considered true if a feature of the first image matches a feature of the second image, and the reverse check also matches the feature of the second image with the feature of the first image. Another common filtering mechanism, used in the provided code, is to filter based on the fact that the two images are of the same scene and have a certain stereo-view relationship between them. In practice, the filter tries to robustly calculate the fundamental or essential matrix and retain those feature pairs that correspond to this calculation with small errors. An alternative to using rich features, such as ORB, is to use optical flow. The following information box provides a short overview of optical flow. It is possible to use optical flow instead of descriptor matching to find the required point matching between two images, while the rest of the SfM pipeline remains the same. OpenCV recently extended its API for getting the flow field from two images, and now it is faster and more powerful. Optical flow It is the process of matching selected points from one image to another, assuming that both images are part of a sequence and relatively close to one another. Most optical flow methods compare a small region, known as the search window or patch, around each point from image A to the same area in image B. Following a very common rule in computer vision, called the brightness constancy constraint (and other names), the small patches of the image will not change drastically from one image to other, and therefore the magnitude of their subtraction should be close to zero. In addition to matching patches, newer methods of optical flow use a number of additional methods to get better results. One is using image pyramids, which are smaller and smaller resized versions of the image, which allow for working from coarse to-fine—a very well-used trick in computer vision. Another method is to define global constraints on the flow field, assuming that the points close to each other move together in the same direction. Finding camera matrices Now that we have obtained matches between keypoints, we can calculate the essential matrix. However, we must first align our matching points into two arrays, where an index in one array corresponds to the same index in the other. This is required by the findEssentialMat function as we've seen in the Estimating Camera Motion section. We would also need to convert the KeyPoint structure to a Point2f structure. We must pay special attention to the queryIdx and trainIdx member variables of DMatch, the OpenCV struct that holds a match between two keypoints, as they must align with the way we used the DescriptorMatcher::match() function. The following code section shows how to align a matching into two corresponding sets of 2D points, and how these can be used to find the essential matrix: vector<KeyPoint> leftKpts, rightKpts; // ... obtain keypoints using a feature extractor vector<DMatch> matches; // ... obtain matches using a descriptor matcher //align left and right point sets vector<Point2f> leftPts, rightPts; for (size_t i = 0; i < matches.size(); i++) { // queryIdx is the "left" image leftPts.push_back(leftKpts[matches[i].queryIdx].pt); // trainIdx is the "right" image rightPts.push_back(rightKpts[matches[i].trainIdx].pt); } //robustly find the Essential Matrix Mat status; Mat E = findEssentialMat( leftPts, //points from left image rightPts, //points from right image focal, //camera focal length factor pp, //camera principal point cv::RANSAC, //use RANSAC for a robust solution 0.999, //desired solution confidence level 1.0, //point-to-epipolar-line threshold status); //binary vector for inliers We may later use the status binary vector to prune those points that align with the recovered essential matrix. Refer to the following image for an illustration of point matching after pruning. The red arrows mark feature matches that were removed in the process of finding the matrix, and the green arrows are feature matches that were kept: Now we are ready to find the camera matrices; however, the new OpenCV 3 API makes things very easy for us by introducing the recoverPose function. First, we will briefly examine the structure of the camera matrix we will use: This is the model for our camera; it consists of two elements, rotation (denoted as R) and translation (denoted as t). The interesting thing about it is that it holds a very essential equation: x = PX, where x is a 2D point on the image and X is a 3D point in space. There is more to it, but this matrix gives us a very important relationship between the image points and the scene points. So, now that we have a motivation for finding the camera matrices, we will see how it can be done. The following code section shows how to decompose the essential matrix into the rotation and translation elements: Mat E; // ... find the essential matrix Mat R, t; //placeholders for rotation and translation //Find Pright camera matrix from the essential matrix //Cheirality check is performed internally. recoverPose(E, leftPts, rightPts, R, t, focal, pp, mask); Very simple. Without going too deep into mathematical interpretation, this conversion of the essential matrix to rotation and translation is possible because the essential matrix was originally composed by these two elements. Strictly for satisfying our curiosity, we can look at the following equation for the essential matrix, which appears in the literature:. We see that it is composed of (some form of) a translation element, t, and a rotational element, R. Note that a cheirality check is internally performed in the recoverPose function. The cheirality check makes sure that all triangulated 3D points are in front of the reconstructed camera. Camera matrix recovery from the essential matrix has in fact four possible solutions, but the only correct solution is the one that will produce triangulated points in front of the camera, hence the need for a cheirality check. Note that what we just did only gives us one camera matrix, and for triangulation, we require two camera matrices. This operation assumes that one camera matrix is fixed and canonical (no rotation and no translation): The other camera that we recovered from the essential matrix has moved and rotated in relation to the fixed one. This also means that any of the 3D points that we recover from these two camera matrices will have the first camera at the world origin point (0, 0, 0). One more thing we can think of adding to our method is error checking. Many times, the calculation of an essential matrix from point matching is erroneous, and this affects the resulting camera matrices. Continuing to triangulate with faulty camera matrices is pointless. We can install a check to see if the rotation element is a valid rotation matrix. Keeping in mind that rotation matrices must have a determinant of 1 (or -1), we can simply do the following: bool CheckCoherentRotation(const cv::Mat_<double>& R) { if (fabsf(determinant(R)) - 1.0 > 1e-07) { cerr << "rotation matrix is invalid" << endl; return false; } return true; } We can now see how all these elements combine into a function that recovers the P matrices. First, we will introduce some convenience data structures and type short hands: typedef std::vector<cv::KeyPoint> Keypoints; typedef std::vector<cv::Point2f> Points2f; typedef std::vector<cv::Point3f> Points3f; typedef std::vector<cv::DMatch> Matching; struct Features { //2D features Keypoints keyPoints; Points2f points; cv::Mat descriptors; }; struct Intrinsics { //camera intrinsic parameters cv::Mat K; cv::Mat Kinv; cv::Mat distortion; }; Now, we can write the camera matrix finding function: void findCameraMatricesFromMatch( constIntrinsics& intrin, constMatching& matches, constFeatures& featuresLeft, constFeatures& featuresRight, cv::Matx34f& Pleft, cv::Matx34f& Pright) { { //Note: assuming fx = fy const double focal = intrin.K.at<float>(0, 0); const cv::Point2d pp(intrin.K.at<float>(0, 2), intrin.K.at<float>(1, 2)); //align left and right point sets using the matching Features left; Features right; GetAlignedPointsFromMatch( featuresLeft, featuresRight, matches, left, right); //find essential matrix Mat E, mask; E = findEssentialMat( left.points, right.points, focal, pp, RANSAC, 0.999, 1.0, mask); Mat_<double> R, t; //Find Pright camera matrix from the essential matrix recoverPose(E, left.points, right.points, R, t, focal, pp, mask); Pleft = Matx34f::eye(); Pright = Matx34f(R(0,0), R(0,1), R(0,2), t(0), R(1,0), R(1,1), R(1,2), t(1), R(2,0), R(2,1), R(2,2), t(2)); } At this point, we have the two cameras that we need in order to reconstruct the scene. The canonical first camera, in the Pleft variable, and the second camera we calculated, form the essential matrix in the Pright variable. Choosing the image pair to use first Given we have more than just two image views of the scene, we must choose which two views we will start the reconstruction from. In their paper, Snavely et al. suggest that we pick the two views that have the least number of homography inliers. A homography is a relationship between two images or sets of points that lie on a plane; the homography matrix defines the transformation from one plane to another. In case of an image or a set of 2D points, the homography matrix is of size 3 x 3. When Snavely et al. look for the lowest inlier ratio, they essentially suggest to calculate the homography matrix between all pairs of images and pick the pair whose points mostly do not correspond with the homography matrix. This means the geometry of the scene in these two views is not planar or at least not the same plane in both views, which helps when doing 3D reconstruction. For reconstruction, it is best to look at a complex scene with non-planar geometry, with things closer and farther away from the camera. The following code snippet shows how to use OpenCV's findHomography function to count the number of inliers between two views whose features were already extracted and matched: int findHomographyInliers( const Features& left, const Features& right, const Matching& matches) { //Get aligned feature vectors Features alignedLeft; Features alignedRight; GetAlignedPointsFromMatch(left, right, matches, alignedLeft, alignedRight); //Calculate homography with at least 4 points Mat inlierMask; Mat homography; if(matches.size() >= 4) { homography = findHomography(alignedLeft.points, alignedRight.points, cv::RANSAC, RANSAC_THRESHOLD, inlierMask); } if(matches.size() < 4 or homography.empty()) { return 0; } return countNonZero(inlierMask); } The next step is to perform this operation on all pairs of image views in our bundle and sort them based on the ratio of homography inliers to outliers: //sort pairwise matches to find the lowest Homography inliers map<float, ImagePair> pairInliersCt; const size_t numImages = mImages.size(); //scan all possible image pairs (symmetric) for (size_t i = 0; i < numImages - 1; i++) { for (size_t j = i + 1; j < numImages; j++) { if (mFeatureMatchMatrix[i][j].size() < MIN_POINT_CT) { //Not enough points in matching pairInliersCt[1.0] = {i, j}; continue; } //Find number of homography inliers const int numInliers = findHomographyInliers( mImageFeatures[i], mImageFeatures[j], mFeatureMatchMatrix[i][j]); const float inliersRatio = (float)numInliers / (float)(mFeatureMatchMatrix[i][j].size()); pairInliersCt[inliersRatio] = {i, j}; } } Note that the std::map<float, ImagePair> will internally sort the pairs based on the map's key: the inliers ratio. We then simply need to traverse this map from the beginning to find the image pair with least inlier ratio, and if that pair cannot be used, we can easily skip ahead to the next pair. Summary In this article, we saw how OpenCV v3 can help us approach Structure from Motion in a manner that is both simple to code and to understand. OpenCV v3's new API contains a number of useful functions and data structures that make our lives easier and also assist in a cleaner implementation. However, the state-of-the-art SfM methods are far more complex. There are many issues we choose to disregard in favor of simplicity, and plenty more error examinations that are usually in place. Our chosen methods for the different elements of SfM can also be revisited. Some methods even use the N-view triangulation once they understand the relationship between the features in multiple images. If we would like to extend and deepen our familiarity with SfM, we will certainly benefit from looking at other open source SfM libraries. One particularly interesting project is libMV, which implements a vast array of SfM elements that may be interchanged to get the best results. There is a great body of work from University of Washington that provides tools for many flavors of SfM (Bundler and VisualSfM). This work inspired an online product from Microsoft, called PhotoSynth, and 123D Catch from Adobe. There are many more implementations of SfM readily available online, and one must only search to find quite a lot of them. Resources for Article: Further resources on this subject: Basics of Image Histograms in OpenCV [article] OpenCV: Image Processing using Morphological Filters [article] Face Detection and Tracking Using ROS, Open-CV and Dynamixel Servos [article]
Read more
  • 0
  • 1
  • 27562

article-image-how-does-elasticsearch-work-tutorial
Savia Lobo
30 Jul 2018
12 min read
Save for later

How does Elasticsearch work? [Tutorial]

Savia Lobo
30 Jul 2018
12 min read
Elasticsearch is much more than just a search engine; it supports complex aggregations, geo filters, and the list goes on. Best of all, you can run all your queries at a speed you have never seen before.  Elasticsearch, like any other open source technology, is very rapidly evolving, but the core fundamentals that power Elasticsearch don't change. In this article, we will briefly discuss how Elasticsearch works internally and explain the basic query APIs.  All the data in Elasticsearch is internally stored in  Apache Lucene as an inverted index. Although data is stored in Apache Lucene, Elasticsearch is what makes it distributed and provides the easy-to-use APIs. This Elasticsearch tutorial is an excerpt taken from the book,'Learning Elasticsearch' written by Abhishek Andhavarapu. Inverted index in Elasticsearch Inverted index will help you understand the limitations and strengths of Elasticsearch compared with the traditional database systems out there. Inverted index at the core is how Elasticsearch is different from other NoSQL stores, such as MongoDB, Cassandra, and so on. We can compare an inverted index to an old library catalog card system. When you need some information/book in a library, you will use the card catalog, usually at the entrance of the library, to find the book. An inverted index is similar to the card catalog. Imagine that you were to build a system like Google to search for the web pages mentioning your search keywords. We have three web pages with Yoda quotes from Star Wars, and you are searching for all the documents with the word fear. Document1: Fear leads to anger Document2: Anger leads to hate Document3: Hate leads to suffering In a library, without a card catalog to find the book you need, you would have to go to every shelf row by row, look at each book title, and see whether it's the book you need. Computer-based information retrieval systems do the same. Without the inverted index, the application has to go through each web page and check whether the word exists in the web page. An inverted index is similar to the following table. It is like a map with the term as a key and list of the documents the term appears in as value. Term Document Fear 1 Anger 1,2 Hate 2,3 Suffering 3 Leads 1,2,3 Once we construct an index, as shown in this table, to find all the documents with the term fear is now just a lookup. Just like when a library gets a new book, the book is added to the card catalog, we keep building an inverted index as we encounter a new web page. The preceding inverted index takes care of simple use cases, such as searching for the single term. But in reality, we query for much more complicated things, and we don't use the exact words. Now let's say we encountered a document containing the following: Yosemite national park may be closed for the weekend due to forecast of substantial rainfall We want to visit Yosemite National Park, and we are looking for the weather forecast in the park. But when we query for it in the human language, we might query something like weather in yosemite or rain in yosemite. With the current approach, we will not be able to answer this query as there are no common terms between the query and the document, as shown: Document Query rainfall rain To be able to answer queries like this and to improve the search quality, we employ various techniques such as stemming, synonyms discussed in the following sections. Stemming Stemming is the process of reducing a derived word into its root word. For example, rain, raining, rained, rainfall has the common root word "rain". When a document is indexed, the root word is stored in the index instead of the actual word. Without stemming, we end up storing rain, raining, rained in the index, and search relevance would be very low. The query terms also go through the stemming process, and the root words are looked up in the index. Stemming increases the likelihood of the user finding what he is looking for. When we query for rain in yosemite, even though the document originally had rainfall, the inverted index will contain term rain. We can configure stemming in Elasticsearch using Analyzers. Synonyms Similar to rain and raining, weekend and sunday mean the same thing. The document might not contain Sunday, but if the information retrieval system can also search for synonyms, it will significantly improve the search quality. Human language deals with a lot of things, such as tense, gender, numbers. Stemming and synonyms will not only improve the search quality but also reduce the index size by removing the differences between similar words. More examples: Pen, Pen[s] -> Pen Eat, Eating  -> Eat Phrase search As a user, we almost always search for phrases rather than single words. The inverted index in the previous section would work great for individual terms but not for phrases. Continuing the previous example, if we want to query all the documents with a phrase anger leads to in the inverted index, the previous index would not be sufficient. The inverted index for terms anger and leads is shown below: Term Document Anger 1,2 Leads 1,2,3 From the preceding table, the words anger and leads exist both in document1 and document2. To support phrase search along with the document, we also need to record the position of the word in the document. The inverted index with word position is shown here: Term Document Fear 1:1 Anger 1:3, 2:1 Hate 2:3, 3:1 Suffering 3:3 Leads 1:2, 2:2, 3:2 Now, since we have the information regarding the position of the word, we can search if a document has the terms in the same order as the query. Term Document anger 1:3, 2:1 leads 1:2, 2:2 Since document2 has anger as the first word and leads as the second word, the same order as the query, document2 would be a better match than document1. With the inverted index, any query on the documents is just a simple lookup. This is just an introduction to inverted index; in real life, it's much more complicated, but the fundamentals remain the same. When the documents are indexed into Elasticsearch, documents are processed into the inverted index. Scalability and availability in Elasticsearch Let's say you want to index a billion documents; having just a single machine might be very challenging. Partitioning data across multiple machines allows Elasticsearch to scale beyond what a single machine do and support high throughput operations. Your data is split into small parts called shards. When you create an index, you need to tell Elasticsearch the number of shards you want for the index and Elasticsearch handles the rest for you. As you have more data, you can scale horizontally by adding more machines. We will go in to more details in the sections below. There are type of shards in Elasticsearch - primary and replica. The data you index is written to both primary and replica shards. Replica is the exact copy of the primary. In case of the node containing the primary shard goes down, the replica takes over. This process is completely transparent and managed by Elasticsearch. We will discuss this in detail in the Failure Handling section below. Since primary and replicas are the exact copies, a search query can be answered by either the primary or the replica shard. This significantly increases the number of simultaneous requests Elasticsearch can handle at any point in time. As the index is distributed across multiple shards, a query against an index is executed in parallel across all the shards. The results from each shard are then gathered and sent back to the client. Executing the query in parallel greatly improves the search performance. Now, we will discuss the relation between node, index and shard. Relation between node, index, and shard Shard is often the most confusing topic when I talk about Elasticsearch at conferences or to someone who has never worked on Elasticsearch. In this section, I want to focus on the relation between node, index, and shard. We will use a cluster with three nodes and create the same index with multiple shard configuration, and we will talk through the differences. Three shards with zero replicas We will start with an index called esintroduction with three shards and zero replicas. The distribution of the shards in a three node cluster is as follows: In the above screenshot, shards are represented by the green squares. We will talk about replicas towards the end of this discussion. Since we have three nodes(servers) and three shards, the shards are evenly distributed across all three nodes. Each node will contain one shard. As you index your documents into the esintroduction index, data is spread across the three shards. Six shards with zero replicas Now, let's recreate the same esintroduction index with six shards and zero replicas. Since we have three nodes (servers) and six shards, each node will now contain two shards. The esintroduction index is split between six shards across three nodes. The distribution of shards for an index with six shards is as follows: The esintroduction index is spread across three nodes, meaning these three nodes will handle the index/query requests for the index. If these three nodes are not able to keep up with the indexing/search load, we can scale the esintroduction index by adding more nodes. Since the index has six shards, you could add three more nodes, and Elasticsearch automatically rearranges the shards across all six nodes. Now, index/query requests for the esintroduction index will be handled by six nodes instead of three nodes. If this is not clear, do not worry, we will discuss more about this as we progress in the book. Six shards with one replica Let's now recreate the same esintroduction index with six shards and one replica, meaning the index will have 6 primary shards and 6 replica shards, a total of 12 shards. Since we have three nodes (servers) and twelve shards, each node will now contain four shards. The esintroduction index is split between six shards across three nodes. The green squares represent shards in the following figure. The solid border represents primary shards, and replicas are the dotted squares: As we discussed before, the index is distributed into multiple shards across multiple nodes. In a distributed environment, a node/server can go down due to various reasons, such as disk failure, network issue, and so on. To ensure availability, each shard, by default, is replicated to a node other than where the primary shard exists. If the node containing the primary shard goes down, the shard replica is promoted to primary, and the data is not lost, and you can continue to operate on the index. In the preceding figure, the esintroduction index has six shards split across the three nodes. The primary of shard 2 belongs to node elasticsearch 1, and the replica of the shard 2 belongs to node elasticsearch 3. In the case of the elasticsearch 1 node going down, the replica in elasticsearch 3 is promoted to primary. This switch is completely transparent and handled by Elasticsearch. Distributed search One of the reasons queries executed on Elasticsearch are so fast is because they are distributed. Multiple shards act as one index. A search query on an index is executed in parallel across all the shards. Let's take an example: in the following figure, we have a cluster with two nodes: Node1, Node2 and an index named chapter1 with two shards: S0, S1 with one replica: Assuming the chapter1 index has 100 documents, S1 would have 50 documents, and S0 would have 50 documents. And you want to query for all the documents that contain the word Elasticsearch. The query is executed on S0 and S1 in parallel. The results are gathered back from both the shards and sent back to the client. Imagine, you have to query across million of documents, using Elasticsearch the search can be distributed. For the application I'm currently working on, a query on more than 100 million documents comes back within 50 milliseconds; which is simply not possible if the search is not distributed. Failure handling in Elasticsearch Elasticsearch handles failures automatically. This section describes how the failures are handled internally. Let's say we have an index with two shards and one replica. In the following diagram, the shards represented in solid line are primary shards, and the shards in the dotted line are replicas: As shown in preceding diagram, we initially have a cluster with two nodes. Since the index has two shards and one replica, shards are distributed across the two nodes. To ensure availability, primary and replica shards never exist in the same node. If the node containing both primary and replica shards goes down, the data cannot be recovered. In the preceding diagram, you can see that the primary shard S0 belongs to Node 1 and the replica shard S0 to the Node 2. Next, just like we discussed in the Relation between Node, Index and Shard section, we will add two new nodes to the existing cluster, as shown here: The cluster now contains four nodes, and the shards are automatically allocated to the new nodes. Each node in the cluster will now contain either a primary or replica shard. Now, let's say Node2, which contains the primary shard S1, goes down as shown here: Since the node that holds the primary shard went down, the replica of S1, which lives in Node3, is promoted to primary. To ensure the replication factor of 1, a copy of the shard S1 is made on Node1. This process is known as rebalancing of the cluster. Depending on the application, the number of shards can be configured while creating the index. The process of rebalancing the shards to other nodes is entirely transparent to the user and handled automatically by Elasticsearch. We discussed inverted indexes, relation between nodes, index and shard, distributed search and how failures are handled automatically in Elasticsearch. Check out this book, 'Learning Elasticsearch' to know about handling document relationships, working with geospatial data, and much more. How to install Elasticsearch in Ubuntu and Windows Working with Kibana in Elasticsearch 5.x CRUD (Create Read, Update and Delete) Operations with Elasticsearch
Read more
  • 0
  • 2
  • 27426

article-image-getting-started-multiplayer-game-programming
Packt
04 Jun 2015
33 min read
Save for later

Getting Started with Multiplayer Game Programming

Packt
04 Jun 2015
33 min read
In this article by Rodrigo Silveira author of the book Multiplayer gaming with HTML5 game development, if you're reading this, chances are pretty good that you are already a game developer. That being the case, then you already know just how exciting it is to program your own games, either professionally or as a highly gratifying hobby that is very time-consuming. Now you're ready to take your game programming skills to the next level—that is, you're ready to implement multiplayer functionality into your JavaScript-based games. (For more resources related to this topic, see here.) In case you have already set out to create multiplayer games for the Open Web Platform using HTML5 and JavaScript, then you may have already come to realize that a personal desktop computer, laptop, or a mobile device is not particularly the most appropriate device to share with another human player for games in which two or more players share the same game world at the same time. Therefore, what is needed in order to create exciting multiplayer games with JavaScript is some form of networking technology. We will be discussing the following principles and concepts: The basics of networking and network programming paradigms Socket programming with HTML5 Programming a game server and game clients Turn-based multiplayer games Understanding the basics of networking It is said that one cannot program games that make use of networking without first understanding all about the discipline of computer networking and network programming. Although having a deep understanding of any topic can be only beneficial to the person working on that topic, I don't believe that you must know everything there is to know about game networking in order to program some pretty fun and engaging multiplayer games. Saying that is the case is like saying that one needs to be a scholar of the Spanish language in order to cook a simple burrito. Thus, let us take a look at the most basic and fundamental concepts of networking. After you finish reading this article, you will know enough about computer networking to get started, and you will feel comfortable adding multiplayer aspects to your games. One thing to keep in mind is that, even though networked games are not nearly as old as single-player games, computer networking is actually a very old and well-studied subject. Some of the earliest computer network systems date back to the 1950s. Though some of the techniques have improved over the years, the basic idea remains the same: two or more computers are connected together to establish communication between the machines. By communication, I mean data exchange, such as sending messages back and forth between the machines, or one of the machines only sends the data and the other only receives it. With this brief introduction to the concept of networking, you are now grounded in the subject of networking, enough to know what is required to network your games—two or more computers that talk to each other as close to real time as possible. By now, it should be clear how this simple concept makes it possible for us to connect multiple players into the same game world. In essence, we need a way to share the global game data among all the players who are connected to the game session, then continue to update each player about every other player. There are several different techniques that are commonly used to achieve this, but the two most common approaches are peer-to-peer and client-server. Both techniques present different opportunities, including advantages and disadvantages. In general, neither is particularly better than the other, but different situations and use cases may be better suited for one or the other technique. Peer-to-peer networking A simple way to connect players into the same virtual game world is through the peer-to-peer architecture. Although the name might suggest that only two peers ("nodes") are involved, by definition a peer-to-peer network system is one in which two or more nodes are connected directly to each other without a centralized system orchestrating the connection or information exchange. On a typical peer-to-peer setup, each peer serves the same function as every other one—that is, they all consume the same data and share whatever data they produce so that others can stay synchronized. In the case of a peer-to-peer game, we can illustrate this architecture with a simple game of Tic-tac-toe. Once both the players have established a connection between themselves, whoever is starting the game makes a move by marking a cell on the game board. This information is relayed across the wire to the other peer, who is now aware of the decision made by his or her opponent, and can thus update their own game world. Once the second player receives the game's latest state that results from the first player's latest move, the second player is able to make a move of their own by checking some available space on the board. This information is then copied over to the first player who can update their own world and continue the process by making the next desired move. The process goes on until one of the peers disconnects or the game ends as some condition that is based on the game's own business logic is met. In the case of the game of Tic-tac-toe, the game would end once one of the players has marked three spaces on the board forming a straight line or if all nine cells are filled, but neither player managed to connect three cells in a straight path. Some of the benefits of peer-to-peer networked games are as follows: Fast data transmission: Here, the data goes directly to its intended target. In other architectures, the data could go to some centralized node first, then the central node (or the "server") contacts the other peer, sending the necessary updates. Simpler setup: You would only need to think about one instance of your game that, generally speaking, handles its own input, sends its input to other connected peers, and handles their output as input for its own system. This can be especially handy in turn-based games, for example, most board games such as Tic-tac-toe. More reliability: Here one peer that goes offline typically won't affect any of the other peers. However, in the simple case of a two-player game, if one of the players is unable to continue, the game will likely cease to be playable. Imagine, though, that the game in question has dozens or hundreds of connected peers. If a handful of them suddenly lose their Internet connection, the others can continue to play. However, if there is a server that is connecting all the nodes and the server goes down, then none of the other players will know how to talk to each other, and nobody will know what is going on. On the other hand, some of the more obvious drawbacks of peer-to-peer architecture are as follows: Incoming data cannot be trusted: Here, you don't know for sure whether or not the sender modified the data. The data that is input into a game server will also suffer from the same challenge, but once the data is validated and broadcasted to all the other peers, you can be more confident that the data received by each peer from the server will have at least been sanitized and verified, and will be more credible. Fault tolerance can be very low: If enough players share the game world, one or more crashes won't make the game unplayable to the rest of the peers. Now, if we consider the many cases where any of the players that suddenly crash out of the game negatively affect the rest of the players, we can see how a server could easily recover from the crash. Data duplication when broadcasting to other peers: Imagine that your game is a simple 2D side scroller, and many other players are sharing that game world with you. Every time one of the players moves to the right, you receive the new (x, y) coordinates from that player, and you're able to update your own game world. Now, imagine that you move your player to the right by a very few pixels; you would have to send that data out to all of the other nodes in the system. Overall, peer-to-peer is a very powerful networking architecture and is still widely used by many games in the industry. Since current peer-to-peer web technologies are still in their infancy, most JavaScript-powered games today do not make use of peer-to-peer networking. For this and other reasons that should become apparent soon, we will focus almost exclusively on the other popular networking paradigm, namely, the client-server architecture. Client-server networking The idea behind the client-server networking architecture is very simple. If you squint your eyes hard enough, you can almost see a peer-to-peer graph. The most obvious difference between them, is that, instead of every node being an equal peer, one of the nodes is special. That is, instead of every node connecting to every other node, every node (client) connects to a main centralized node called the server. While the concept of a client-server network seems clear enough, perhaps a simple metaphor might make it easier for you to understand the role of each type of node in this network format as well as differentiate it from peer-to-peer . In a peer-to-peer network, you can think of it as a group of friends (peers) having a conversation at a party. They all have access to all the other peers involved in the conversation and can talk to them directly. On the other hand, a client-server network can be viewed as a group of friends having dinner at a restaurant. If a client of the restaurant wishes to order a certain item from the menu, he or she must talk to the waiter, who is the only person in that group of people with access to the desired products and the ability to serve the products to the clients. In short, the server is in charge of providing data and services to one or more clients. In the context of game development, the most common scenario is when two or more clients connect to the same server; the server will keep track of the game as well as the distributed players. Thus, if two players are to exchange information that is only pertinent to the two of them, the communication will go from the first player to and through the server and will end up at the other end with the second player. Following the example of the two players involved in a game of Tic-tac-toe, we can see how similar the flow of events is on a client-server model. Again, the main difference is that players are unaware of each other and only know what the server tells them. While you can very easily mimic a peer-to-peer model by using a server to merely connect the two players, most often the server is used much more actively than that. There are two ways to engage the server in a networked game, namely in an authoritative and a non-authoritative way. That is to say, you can have the enforcement of the game's logic strictly in the server, or you can have the clients handle the game logic, input validation, and so on. Today, most games using the client-server architecture actually use a hybrid of the two (authoritative and non-authoritative servers). For all intents and purposes, however, the server's purpose in life is to receive input from each of the clients and distribute that input throughout the pool of connected clients. Now, regardless of whether you decide to go with an authoritative server instead of a non-authoritative one, you will notice that one of challenges with a client-server game is that you will need to program both ends of the stack. You will have to do this even if your clients do nothing more than take input from the user, forward it to the server, and render whatever data they receive from the server; if your game server does nothing more than forward the input that it receives from each client to every other client, you will still need to write a game client and a game server. We will discuss game clients and servers later. For now, all we really need to know is that these two components are what set this networking model apart from peer-to-peer. Some of the benefits of client-server networked games are as follows: Separation of concerns: If you know anything about software development, you know that this is something you should always aim for. That is, good, maintainable software is written as discrete components where each does one "thing", and it is done well. Writing individual specialized components lets you focus on performing one individual task at a time, making your game easier to design, code, test, reason, and maintain. Centralization: While this can be argued against as well as in favor of, having one central place through which all communication must flow makes it easier to manage such communication, enforce any required rules, control access, and so forth. Less work for the client: Instead of having a client (peer) in charge of taking input from the user as well as other peers, validating all the input, sharing data among other peers, rendering the game, and so on, the client can focus on only doing a few of these things, allowing the server to offload some of this work. This is particularly handy when we talk about mobile gaming, and how much subtle divisions of labor can impact the overall player experience. For example, imagine a game where 10 players are engaged in the same game world. In a peer-to-peer setup, every time one player takes an action, he or she would need to send that action to nine other players (in other words, there would need to be nine network calls, boiling down to more mobile data usage). On the other hand, on a client-server configuration, one player would only need to send his or her action to one of the peers, that is, the server, who would then be responsible for sending that data to the remaining nine players. Common drawbacks of client-server architectures, whether or not the server is authoritative, are as follows: Communication takes longer to propagate: In the very best possible scenario imaginable, every message sent from the first player to the second player would take twice as long to be delivered as compared to a peer-to-peer connection. That is, the message would be first sent from the first player to the server and then from the server to the second player. There are many techniques that are used today to solve the latency problem faced in this scenario, some of which we will discuss in much more depth later. However, the underlying dilemma will always be there. More complexity due to more moving parts: It doesn't really matter how you slice the pizza; the more code you need to write (and trust me, when you build two separate modules for a game, you will write more code), the greater your mental model will have to be. While much of your code can be reused between the client and the server (especially if you use well-established programming techniques, such as object-oriented programming), at the end of the day, you need to manage a greater level of complexity. Single point of failure and network congestion: Up until now, we have mostly discussed the case where only a handful of players participates in the same game. However, the more common case is that a handful of groups of players play different games at the same time. Using the same example of the two-player game of Tic-tac-toe, imagine that there are thousands of players facing each other in single games. In a peer-to-peer setup, once a couple of players have directly paired off, it is as though there are no other players enjoying that game. The only thing to keep these two players from continuing their game is their own connection with each other. On the other hand, if the same thousands of players are connected to each other through a server sitting between the two, then two singled out players might notice severe delays between messages because the server is so busy handling all of the messages from and to all of the other people playing isolated games. Worse yet, these two players now need to worry about maintaining their own connection with each other through the server, but they also hope that the server's connection between them and their opponent will remain active. All in all, many of the challenges involved in client-server networking are well studied and understood, and many of the problems you're likely to face during your multiplayer game development will already have been solved by someone else. Client-server is a very popular and powerful game networking model, and the required technology for it, which is available to us through HTML5 and JavaScript, is well developed and widely supported. Networking protocols – UDP and TCP By discussing some of the ways in which your players can talk to each other across some form of network, we have yet only skimmed over how that communication is actually done. Let us then describe what protocols are and how they apply to networking and, more importantly, multiplayer game development. The word protocol can be defined as a set of conventions or a detailed plan of a procedure [Citation [Def. 3,4]. (n.d.). In Merriam Webster Online, Retrieved February 12, 2015, from http://www.merriam-webster.com/dictionary/protocol]. In computer networking, a protocol describes to the receiver of a message how the data is organized so that it can be decoded. For example, imagine that you have a multiplayer beat 'em up game, and you want to tell the game server that your player just issued a kick command and moved 3 units to the left. What exactly do you send to the server? Do you send a string with a value of "kick", followed by the number 3? Otherwise, do you send the number first, followed by a capitalized letter "K", indicating that the action taken was a kick? The point I'm trying to make is that, without a well-understood and agreed-upon protocol, it is impossible to successfully and predictably communicate with another computer. The two networking protocols that we'll discuss in the section, and that are also the two most widely used protocols in multiplayer networked games, are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). Both protocols provide communication services between clients in a network system. In simple terms, they are protocols that allow us to send and receive packets of data in such a way that the data can be identified and interpreted in a predictable way. When data is sent through TCP, the application running in the source machine first establishes a connection with the destination machine. Once a connection has been established, data is transmitted in packets in such a way that the receiving application can then put the data back together in the appropriate order. TCP also provides built-in error checking mechanisms so that, if a packet is lost, the target application can notify the sender application, and any missing packets are sent again until the entire message is received. In short, TCP is a connection-based protocol that guarantees the delivery of the full data in the correct order. Use cases where this behavior is desirable are all around us. When you download a game from a web server, for example, you want to make sure that the data comes in correctly. You want to be sure that your game assets will be properly and completely downloaded before your users start playing your game. While this guarantee of delivery may sound very reassuring, it can also be thought of as a slow process, which, as we'll see briefly, may sometimes be more important than knowing that the data will arrive in full. In contrast, UDP transmits packets of data (called datagrams) without the use of a pre-established connection. The main goal of the protocol is to be a very fast and frictionless way of sending data towards some target application. In essence, you can think of UDP as the brave employees who dress up as their company's mascot and stand outside their store waving a large banner in the hope that at least some of the people driving by will see them and give them their business. While at first, UDP may seem like a reckless protocol, the use cases that make UDP so desirable and effective includes the many situations when you care more about speed than missing packets a few times, getting duplicate packets, or getting them out of order. You may also want to choose UDP over TCP when you don't care about the reply from the receiver. With TCP, whether or not you need some form of confirmation or reply from the receiver of your message, it will still take the time to reply back to you, at least acknowledging that the message was received. Sometimes, you may not care whether or not the server received the data. A more concrete example of a scenario where UDP is a far better choice over TCP is when you need a heartbeat from the client letting the server know if the player is still there. If you need to let your server know that the session is still active every so often, and you don't care if one of the heartbeats get lost every now and again, then it would be wise to use UDP. In short, for any data that is not mission-critical and you can afford to lose, UDP might be the best option. In closing, keep in mind that, just as peer-to-peer and client-server models can be built side by side, and in the same way your game server can be a hybrid of authoritative and non-authoritative, there is absolutely no reason why your multiplayer games should only use TCP or UDP. Use whichever protocol a particular situation calls for. Network sockets There is one other protocol that we'll cover very briefly, but only so that you can see the need for network sockets in game development. As a JavaScript programmer, you are doubtlessly familiar with Hypertext Transfer Protocol (HTTP). This is the protocol in the application layer that web browsers use to fetch your games from a Web server. While HTTP is a great protocol to reliably retrieve documents from web servers, it was not designed to be used in real-time games; therefore, it is not ideal for this purpose. The way HTTP works is very simple: a client sends a request to a server, which then returns a response back to the client. The response includes a completion status code, indicating to the client that the request is either in process, needs to be forwarded to another address, or is finished successfully or erroneously. There are a handful of things to note about HTTP that will make it clear that a better protocol is needed for real-time communication between the client and server. Firstly, after each response is received by the requester, the connection is closed. Thus, before making each and every request, a new connection must be established with the server. Most of the time, an HTTP request will be sent through TCP, which, as we've seen, can be slow, relatively speaking. Secondly, HTTP is by design a stateless protocol. This means that, every time you request a resource from a server, the server has no idea who you are and what is the context of the request. (It doesn't know whether this is your first request ever or if you're a frequent requester.) A common solution to this problem is to include a unique string with every HTTP request that the server keeps track of, and can thus provide information about each individual client on an ongoing basis. You may recognize this as a standard session. The major downside with this solution, at least with regard to real-time gaming, is that mapping a session cookie to the user's session takes additional time. Finally, the major factor that makes HTTP unsuitable for multiplayer game programming is that the communication is one way—only the client can connect to the server, and the server replies back through the same connection. In other words, the game client can tell the game server that a punch command has been entered by the user, but the game server cannot pass that information along to other clients. Think of it like a vending machine. As a client of the machine, we can request specific items that we wish to buy. We formalize this request by inserting money into the vending machine, and then we press the appropriate button. Under no circumstance will a vending machine issue commands to a person standing nearby. That would be like waiting for a vending machine to dispense food, expecting people to deposit the money inside it afterwards. The answer to this lack of functionality in HTTP is pretty straightforward. A network socket is an endpoint in a connection that allows for two-way communication between the client and the server. Think of it more like a telephone call, rather than a vending machine. During a telephone call, either party can say whatever they want at any given time. Most importantly, the connection between both parties remains open throughout the duration of the conversation, making the communication process highly efficient. WebSocket is a protocol built on top of TCP, allowing web-based applications to have two-way communication with a server. The way a WebSocket is created consists of several steps, including a protocol upgrade from HTTP to WebSocket. Thankfully, all of the heavy lifting is done behind the scenes by the browser and JavaScript. For now, the key takeaway here is that with a TCP socket (yes, there are other types of socket including UDP sockets), we can reliably communicate with a server, and the server can talk back to us as per the need. Socket programming in JavaScript Let's now bring the conversation about network connections, protocols, and sockets to a close by talking about the tools—JavaScript and WebSockets—that bring everything together, allowing us to program awesome multiplayer games in the language of the open Web. The WebSocket protocol Modern browsers and other JavaScript runtime environments have implemented the WebSocket protocol in JavaScript. Don't make the mistake of thinking that just because we can create WebSocket objects in JavaScript, WebSockets are part of JavaScript. The standard that defines the WebSocket protocol is language-agnostic and can be implemented in any programming language. Thus, before you start to deploy your JavaScript games that make use of WebSockets, ensure that the environment that will run your game uses an implementation of the ECMA standard that also implements WebSockets. In other words, not all browsers will know what to do when you ask for a WebSocket connection. For the most part, though, the latest versions, as of this writing, of the most popular browsers today (namely, Google Chrome, Safari, Mozilla Firefox, Opera, and Internet Explorer) implement the current latest revision of RFC 6455. Previous versions of WebSockets (such as protocol version - 76, 7, or 10) are slowly being deprecated and have been removed by some of the previously mentioned browsers. Probably the most confusing thing about the WebSocket protocol is the way each version of the protocol is named. The very first draft (which dates back to 2010), was named draft-hixie-thewebsocketprotocol-75. The next version was named draft-hixie-thewebsocketprotocol-76. Some people refer to these versions as 75 and 76, which can be quite confusing, especially since the fourth version of the protocol is named draft-ietf-hybi-thewebsocketprotocol-07, which is named in the draft as WebSocket Version 7. The current version of the protocol (RFC 6455) is 13. Let us take a quick look at the programming interface (API) that we'll use within our JavaScript code to interact with a WebSocket server. Keep in mind that we'll need to write both the JavaScript clients that use WebSockets to consume data as well as the WebSocket server, which uses WebSockets but plays the role of the server. The difference between the two will become apparent as we go over some examples. Creating a client-side WebSocket The following code snippet creates a new object of type WebSocket that connects the client to some backend server. The constructor takes two parameters; the first is required and represents the URL where the WebSocket server is running and expecting connections. The second URL, is an optional list of sub-protocols that the server may implement. var socket = new WebSocket('ws://www.game-domain.com'); Although this one line of code may seem simple and harmless enough, here are a few things to keep in mind: We are no longer in HTTP territory. The address to your WebSocket server now starts with ws:// instead of http://. Similarly, when we work with secure (encrypted) sockets, we would specify the server's URL as wss://, just like in https://. It may seem obvious to you, but a common pitfall that those getting started with WebSockets fall into is that, before you can establish a connection with the previous code, you need a WebSocket server running at that domain. WebSockets implement the same-origin security model. As you may have already seen with other HTML5 features, the same-origin policy states that you can only access a resource through JavaScript if both the client and the server are in the same domain. For those who are not familiar with the same-domain (also known as the same-origin) policy, the three things that constitute a domain, in this context, are the protocol, host, and port of the resource being accessed. In the previous example, the protocol, host, and port number were, respectively ws (and not wss, http, or ssh), www.game-domain.com (any sub-domain, such as game-domain.com or beta.game-domain.com would violate the same-origin policy), and 80 (by default, WebSocket connects to port 80, and port 443 when it uses wss). Since the server in the previous example binds to port 80, we don't need to explicitly specify the port number. However, had the server been configured to run on a different port, say 2667, then the URL string would need to include a colon followed by the port number that would need to be placed at the end of the host name, such as ws://www.game-domain.com:2667. As with everything else in JavaScript, WebSocket instances attempt to connect to the backend server asynchronously. Thus, you should not attempt to issue commands on your newly created socket until you're sure that the server has connected; otherwise, JavaScript will throw an error that may crash your entire game. This can be done by registering a callback function on the socket's onopen event as follows: var socket = new WebSocket('ws://www.game-domain.com'); socket.onopen = function(event) {    // socket ready to send and receive data }; Once the socket is ready to send and receive data, you can send messages to the server by calling the socket object's send method, which takes a string as the message to be sent. // Assuming a connection was previously established socket.send('Hello, WebSocket world!'); Most often, however, you will want to send more meaningful data to the server, such as objects, arrays, and other data structures that have more meaning on their own. In these cases, we can simply serialize our data as JSON strings. var player = {    nickname: 'Juju',    team: 'Blue' };   socket.send(JSON.stringify(player)); Now, the server can receive that message and work with it as the same object structure that the client sent it, by running it through the parse method of the JSON object. var player = JSON.parse(event.data); player.name === 'Juju'; // true player.team === 'Blue'; // true player.id === undefined; // true If you look at the previous example closely, you will notice that we extract the message that is sent through the socket from the data attribute of some event object. Where did that event object come from, you ask? Good question! The way we receive messages from the socket is the same on both the client and server sides of the socket. We must simply register a callback function on the socket's onmessage event, and the callback will be invoked whenever a new message is received. The argument passed into the callback function will contain an attribute named data, which will contain the raw string object with the message that was sent. socket.onmessage = function(event) {    event instanceof MessageEvent; // true      var msg = JSON.parse(event.data); }; Other events on the socket object on which you can register callbacks include onerror, which is triggered whenever an error related to the socket occurs, and onclose, which is triggered whenever the state of the socket changes to CLOSED; in other words, whenever the server closes the connection with the client for any reason or the connected client closes its connection. As mentioned previously, the socket object will also have a property called readyState, which behaves in a similar manner to the equally-named attribute in AJAX objects (or more appropriately, XMLHttpRequest objects). This attribute represents the current state of the connection and can have one of four values at any point in time. This value is an unsigned integer between 0 and 3, inclusive of both the numbers. For clarity, there are four accompanying constants on the WebSocket class that map to the four numerical values of the instance's readyState attribute. The constants are as follows: WebSocket.CONNECTING: This has a value of 0 and means that the connection between the client and the server has not yet been established. WebSocket.OPEN: This has a value of 1 and means that the connection between the client and the server is open and ready for use. Whenever the object's readyState attribute changes from CONNECTING to OPEN, which will only happen once in the object's life cycle, the onopen callback will be invoked. WebSocket.CLOSING: This has a value of 2 and means that the connection is being closed. WebSocket.CLOSED: This has a value of 3 and means that the connection is now closed (or could not be opened to begin with). Once the readyState has changed to a new value, it will never return to a previous state in the same instance of the socket object. Thus, if a socket object is CLOSING or has already become CLOSED, it will never OPEN again. In this case, you would need a new instance of WebSocket if you would like to continue to communicate with the server. To summarize, let us bring together the simple WebSocket API features that we discussed previously and create a convenient function that simplifies data serialization, error checking, and error handling when communicating with the game server: function sendMsg(socket, data) {    if (socket.readyState === WebSocket.OPEN) {      socket.send(JSON.stringify(data));        return true;    }      return false; }; Game clients Earlier, we talked about the architecture of a multiplayer game that was based on the client-server pattern. Since this is the approach we will take for the games that we'll be developing, let us define some of the main roles that the game client will fulfill. From a higher level, a game client will be the interface between the human player and the rest of the game universe (which includes the game server and other human players who are connected to it). Thus, the game client will be in charge of taking input from the player, communicating this to the server, receive any further instructions and information from the server, and then render the final output to the human player again. Depending on the type of game server used, the client can be more sophisticated than just an input application that renders static data received from the server. For example, the client could very well simulate what the game server will do and present the result of this simulation to the user while the server performs the real calculations and tells the results to the client. The biggest selling point of this technique is that the game would seem a lot more dynamic and real-time to the user since the client responds to input almost instantly. Game servers The game server is primarily responsible for connecting all the players to the same game world and keeping the communication going between them. However as you will soon realize, there may be cases where you will want the server to be more sophisticated than a routing application. For example, just because one of the players is telling the server to inform the other participants that the game is over, and the player sending the message is the winner, we may still want to confirm the information before deciding that the game is in fact over. With this idea in mind, we can label the game server as being of one of the two kinds: authoritative or non-authoritative. In an authoritative game server, the game's logic is actually running in memory (although it normally doesn't render any graphical output like the game clients certainly will) all the time. As each client reports the information to the server by sending messages through its corresponding socket, the server updates the current game state and sends the updates back to all of the players, including the original sender. This way we can be more certain that any data coming from the server has been verified and is accurate. In a non-authoritative server, the clients take on a much more involved part in the game logic enforcement, which gives the client a lot more trust. As suggested previously, what we can do is take the best of both worlds and create a mix of both the techniques. What we will do is, have a strictly authoritative server, but clients that are smart and can do some of the work on their own. Since the server has the ultimate say in the game, however, any messages received by clients from the server are considered as the ultimate truth and supersede any conclusions it came to on its own. Summary Overall, we discussed the basics of networking and network programming paradigms. We saw how WebSockets makes it possible to develop real-time, multiplayer games in HTML5. Finally, we implemented a simple game client and game server using widely supported web technologies and built a fun game of Tic-tac-toe. Resources for Article: Further resources on this subject: HTML5 Game Development – A Ball-shooting Machine with Physics Engine [article] Creating different font files and using web fonts [article] HTML5 Canvas [article]
Read more
  • 0
  • 0
  • 27090

article-image-build-google-cloud-iot-application
Gebin George
27 Jun 2018
19 min read
Save for later

Build an IoT application with Google Cloud [Tutorial]

Gebin George
27 Jun 2018
19 min read
In this tutorial, we will build a sample internet of things application using Google Cloud IoT. We will start off by implementing the end-to-end solution, where we take the data from the DHT11 sensor and post it to the Google IoT Core state topic. This article is an excerpt from the book, Enterprise Internet of Things Handbook, written by Arvind Ravulavaru. End-to-end communication To get started with Google IoT Core, we need to have a Google account. If you do not have a Google account, you can create one by navigating to this URL: https://accounts.google.com/SignUp?hl=en. Once you have created your account, you can login and navigate to Google Cloud Console: https://console.cloud.google.com. Setting up a project The first thing we are going to do is create a project. If you have already worked with Google Cloud Platform and have at least one project, you will be taken to the first project in the list or you will be taken to the Getting started page. As of the time of writing this book, Google Cloud Platform has a free trial for 12 months with $300 if the offer is still available when you are reading this chapter, I would highly recommend signing up: Once you have signed up, let's get started by creating a new project. From the top menu bar, select the Select a Project dropdown and click on the plus icon to create a new project. You can fill in the details as illustrated in the following screenshot: Click on the Create button. Once the project is created, navigate to the Project and you should land on the Home page. Enabling APIs Following are the steps to be followed for enabling APIs: From the menu on the left-hand side, select APIs & Services | Library as shown in the following screenshot: On the following screen, search for pubsub and select the Pub/Sub API from the results and we should land on a page similar to the following: Click on the ENABLE button and we should now be able to use these APIs in our project. Next, we need to enable the real-time API; search for realtime and we should find something similar to the following: Click on the ENABLE & button. Enabling device registry and devices The following steps should be used for enabling device registry and devices: From the left-hand side menu, select IoT Core and we should land on the IoT Core home page: Instead of the previous screen, if you see a screen to enable APIs, please enable the required APIs from here. Click on the & Create device registry button. On the Create device registry screen, fill the details as shown in the following table: Field Value Registry ID Pi3-DHT11-Nodes Cloud region us-central1 Protocol MQTT HTTP Default telemetry topic device-events Default state topic dht11 After completing all the details, our form should look like the following: We will add the required certificates later on. Click on the Create button and a new device registry will be created. From the Pi3-DHT11-Nodes registry page, click on the Add device button and set the Device ID as Pi3-DHT11-Node or any other suitable name. Leave everything as the defaults and make sure the Device communication is set to Allowed and create a new device. On the device page, we should see a warning as highlighted in the following screenshot: Now, we are going to add a new public key. To generate a public/private key pair, we need to have OpenSSL command line available. You can download and set up OpenSSL from here: https://www.openssl.org/source/. Use the following command to generate a certificate pair at the default location on your machine: openssl req -x509 -newkey rsa:2048 -keyout rsa_private.pem -nodes -out rsa_cert.pem -subj "/CN=unused" If everything goes well, you should see an output as shown here: Do not share these certificates anywhere; anyone with these certificates can connect to Google IoT Core as a device and start publishing data. Now, once the certificates are created, we will attach them to the device we have created in IoT Core. Head back to the device page of the Google IoT Core service and under Authentication click on Add public key. On the following screen, fill it in as illustrated: The public key value is the contents of rsa_cert.pem that we generated earlier. Click on the ADD button. Now that the public key has been successfully added, we can connect to the cloud using the private key. Setting up Raspberry Pi 3 with DHT11 node Now that we have our device set up in Google IoT Core, we are going to complete the remaining operation on Raspberry Pi 3 to send data. Pre-requisites The requirements for setting up Raspberry Pi 3 on a DHT11 node are: One Raspberry Pi 3: https://www.amazon.com/Raspberry-Pi-Desktop-Starter-White/dp/B01CI58722 One breadboard: https://www.amazon.com/Solderless-Breadboard-Circuit-Circboard-Prototyping/dp/B01DDI54II/ One DHT11 sensor: https://www.amazon.com/HiLetgo-Temperature-Humidity-Arduino-Raspberry/dp/B01DKC2GQ0 Three male-to-female jumper cables: https://www.amazon.com/RGBZONE-120pcs-Multicolored-Dupont-Breadboard/dp/B01M1IEUAF/ If you are new to the world of Raspberry Pi GPIO's interfacing, take a look at this Raspberry Pi GPIO Tutorial: The Basics Explained on YouTube: https://www.youtube.com/watch?v=6PuK9fh3aL8. The following steps are to be used for the setup process: Connect the DHT11 sensor to Raspberry Pi 3 as shown in the following diagram: Next, power up Raspberry Pi 3 and log in to it. On the desktop, create a new folder named Google-IoT-Device. Open a new Terminal and cd into this folder. Setting up Node.js Refer to the following steps to install Node.js: Open a new Terminal and run the following commands: $ sudo apt update $ sudo apt full-upgrade This will upgrade all the packages that need upgrades. Next, we will install the latest version of Node.js. We will be using the Node 7.x version: $ curl -sL https://deb.nodesource.com/setup_7.x | sudo -E bash - $ sudo apt install nodejs This will take a moment to install, and once your installation is done, you should be able to run the following commands to see the version of Node.js and npm: $ node -v $ npm -v Developing the Node.js device app Now, we will set up the app and write the required code: From the Terminal, once you are inside the Google-IoT-Device folder, run the following command: $ npm init -y Next, we will install jsonwebtoken (https://www.npmjs.com/package/jsonwebtoken) and mqtt (https://www.npmjs.com/package/mqtt) from npm. Execute the following command: $ npm install jsonwebtoken mqtt--save Next, we will install rpi-dht-sensor (https://www.npmjs.com/package/rpi-dht-sensor) from npm. This module will help in reading the DHT11 temperature and humidity values: $ npm install rpi-dht-sensor --save Your final package.json file should look similar to the following code snippet: { "name": "Google-IoT-Device", "version": "1.0.0", "description": "", "main": "index.js", "scripts": { "test": "echo "Error: no test specified" && exit 1" }, "keywords": [], "author": "", "license": "ISC", "dependencies": { "jsonwebtoken": "^8.1.1", "mqtt": "^2.15.3", "rpi-dht-sensor": "^0.1.1" } } Now that we have the required dependencies installed, let's continue. Create a new file named index.js at the root of the Google-IoT-Device folder. Next, create a folder named certs at the root of the Google-IoT-Device folder and move the two certificates we created using OpenSSL there. Your final folder structure should look something like this: Open index.js in any text editor and update it as shown here: var fs = require('fs'); var jwt = require('jsonwebtoken'); var mqtt = require('mqtt'); var rpiDhtSensor = require('rpi-dht-sensor'); var dht = new rpiDhtSensor.DHT11(2); // `2` => GPIO2 var projectId = 'pi-iot-project'; var cloudRegion = 'us-central1'; var registryId = 'Pi3-DHT11-Nodes'; var deviceId = 'Pi3-DHT11-Node'; var mqttHost = 'mqtt.googleapis.com'; var mqttPort = 8883; var privateKeyFile = '../certs/rsa_private.pem'; var algorithm = 'RS256'; var messageType = 'state'; // or event var mqttClientId = 'projects/' + projectId + '/locations/' + cloudRegion + '/registries/' + registryId + '/devices/' + deviceId; var mqttTopic = '/devices/' + deviceId + '/' + messageType; var connectionArgs = { host: mqttHost, port: mqttPort, clientId: mqttClientId, username: 'unused', password: createJwt(projectId, privateKeyFile, algorithm), protocol: 'mqtts', secureProtocol: 'TLSv1_2_method' }; console.log('connecting...'); var client = mqtt.connect(connectionArgs); // Subscribe to the /devices/{device-id}/config topic to receive config updates. client.subscribe('/devices/' + deviceId + '/config'); client.on('connect', function(success) { if (success) { console.log('Client connected...'); sendData(); } else { console.log('Client not connected...'); } }); client.on('close', function() { console.log('close'); }); client.on('error', function(err) { console.log('error', err); }); client.on('message', function(topic, message, packet) { console.log(topic, 'message received: ', Buffer.from(message, 'base64').toString('ascii')); }); function createJwt(projectId, privateKeyFile, algorithm) { var token = { 'iat': parseInt(Date.now() / 1000), 'exp': parseInt(Date.now() / 1000) + 86400 * 60, // 1 day 'aud': projectId }; var privateKey = fs.readFileSync(privateKeyFile); return jwt.sign(token, privateKey, { algorithm: algorithm }); } function fetchData() { var readout = dht.read(); var temp = readout.temperature.toFixed(2); var humd = readout.humidity.toFixed(2); return { 'temp': temp, 'humd': humd, 'time': new Date().toISOString().slice(0, 19).replace('T', ' ') // https://stackoverflow.com/a/11150727/1015046 }; } function sendData() { var payload = fetchData(); payload = JSON.stringify(payload); console.log(mqttTopic, ': Publishing message:', payload); client.publish(mqttTopic, payload, { qos: 1 }); console.log('Transmitting in 30 seconds'); setTimeout(sendData, 30000); } In the previous code, we first define the projectId, cloudRegion, registryId, and deviceId based on what we have created. Next, we build the connectionArgs object, using which we are going to connect to Google IoT Core using MQTT-SN. Do note that the password property is a JSON Web Token (JWT), based on the projectId and privateKeyFile algorithm. The token that is created by this function is valid only for one day. After one day, the cloud will refuse connection to this device if the same token is used. The username value is the Common Name (CN) of the certificate we have created, which is unused. Using mqtt.connect(), we are going to connect to the Google IoT Core. And we are subscribing to the device config topic, which can be used to send device configurations when connected. Once the connection is successful, we callsendData() every 30 seconds to send data to the state topic. Save the previous file and run the following command: $ sudo node index.js And we should see something like this: As you can see from the previous Terminal logs, the device first gets connected then starts transmitting the temperature and humidity along with time. We are sending time as well, so we can save it in the BigQuery table and then build a time series chart quite easily. Now, if we head back to the Device page of Google IoT Core and navigate to the Configuration & state history tab, we should see the data that we are sending to the state topic here: Now that the device is sending data, let's actually read the data from another client. Reading the data from the device For this, you can either use the same Raspberry Pi 3 or another computer. I am going to use MacBook as a client that is interested in the data sent by the Thing. Setting up credentials Before we start reading data from Google IoT Core, we have to set up our computer (for example, MacBook) as a trusted device, so our computer can request data. Let's perform the following steps to set the credentials: To do this, we need to create a new Service account key. From the left-hand-side menu of the Google Cloud Console, select APIs & Services | Credentials. Then click on the Create credentials dropdown and select Service account key as shown in the following screenshot: Now, fill in the details as shown in the following screenshot: We have given access to the entire project for this client and as an Owner. Do not select these settings if this is a production application. Click on Create and you will be asked to download and save the file. Do not share this file; this file is as good as giving someone owner-level permissions to all assets of this project. Once the file is downloaded somewhere safe, create an environment variable with the name GOOGLE_APPLICATION_CREDENTIALS and point it to the path of the downloaded file. You can refer to Getting Started with Authentication at https://cloud.google.com/docs/authentication/getting-started if you are facing any difficulties. Setting up subscriptions The data from the device is being sent to Google IoT Core using the state topic. If you recall, we have named that topic dht11. Now, we are going to create a subscription for this topic: From the menu on the left side, select Pub/Sub | Topics. Now, click on New subscription for the dht11 topic, as shown in the following screenshot: Create a new subscription by setting up the options selected in this screenshot: We are going to use the subscription named dht11-data to get the data from the state topic. Setting up the client Now that we have provided the required credentials as well as subscribed to a Pub/Sub topic, we will set up the Pub/Sub client. Follow these steps: Create a folder named test_client inside the test_client directory. Now, run the following command: $ npm init -y Next, install the @google-cloud/pubsub (https://www.npmjs.com/package/@google-cloud/pubsub) module with the help of the following command: $ npm install @google-cloud/pubsub --save Create a file inside the test_client folder named index.js and update it as shown in this code snippet: var PubSub = require('@google-cloud/pubsub'); var projectId = 'pi-iot-project'; var stateSubscriber = 'dht11-data' // Instantiates a client var pubsub = new PubSub({ projectId: projectId, }); var subscription = pubsub.subscription('projects/' + projectId + '/subscriptions/' + stateSubscriber); var messageHandler = function(message) { console.log('Message Begin >>>>>>>>'); console.log('message.connectionId', message.connectionId); console.log('message.attributes', message.attributes); console.log('message.data', Buffer.from(message.data, 'base64').toString('ascii')); console.log('Message End >>>>>>>>>>'); // "Ack" (acknowledge receipt of) the message message.ack(); }; // Listen for new messages subscription.on('message', messageHandler); Update the projectId and stateSubscriber in the previous code. Now, save the file and run the following command: $ node index.js We should see the following output in the console: This way, any client that is interested in the data of this device can use this approach to get the latest data. With this, we conclude the section on posting data to Google IoT Core and fetching the data. In the next section, we are going to work on building a dashboard. Building a dashboard Now that we have seen how a client can read the data from our device on demand, we will move on to building a dashboard, where we display data in real time. For this, we are going to use Google Cloud Functions, Google BigQuery, and Google Data Studio. Google Cloud Functions Cloud Functions are solution for serverless services. Cloud Functions is a lightweight solution for creating standalone and single-purpose functions that respond to cloud events. You can read more about Google Cloud Functions at https://cloud.google.com/functions/. Google BigQuery Google BigQuery is an enterprise data warehouse that solves this problem by enabling super-fast SQL queries using the processing power of Google's infrastructure. You can read more about Google BigQuery at https://cloud.google.com/bigquery/. Google Data Studio Google Data Studio helps to build dashboards and reports using various data connectors, such as BigQuery or Google Analytics. You can read more about Google Data Studio at https://cloud.google.com/data-studio/. As of April 2018, these three services are still in beta. As we have already seen in the Architecture section, once the data is published on the state topic, we are going to create a cloud function that will get triggered by the data event on the Pub/Sub client. And inside our cloud function, we are going to get a copy of the published data and then insert it into the BigQuery dataset. Once the data is inserted, we are going to use Google Data Studio to create a new report by linking the BigQuery dataset to the input. So, let's get started. Setting up BigQuery The first thing we are going to do is set up BigQuery: From the side menu of the Google Cloud Platform Console, our project page, click on the BigQuery URL and we should be taken to the Google BigQuery home page. Select Create new dataset, as shown in the following screenshot: Create a new dataset with the values illustrated in the following screenshot: Once the dataset is created, click on the plus sign next to the dataset and create an empty table. We are going to name the table dht11_data and we are going have three fields in it, as shown here: Click on the Create Table button to create the table. Now that we have our table ready, we will write a cloud function to insert the incoming data from Pub/Sub into this table. Setting up Google Cloud Function Now, we are going to set up a cloud function that will be triggered by the incoming data: From the Google Cloud Console's left-hand-side menu, select Cloud Functions under Compute. Once you land on the Google Cloud Functions homepage, you will be asked to enable the cloud functions API. Click on Enable API: Once the API is enabled, we will be on the Create function page. Fill in the form as shown here: The Trigger is set to Cloud Pub/Sub topic and we have selected dht11 as the Topic. Under the Source code section; make sure you are in the index.js tab and update it as shown here: var BigQuery = require('@google-cloud/bigquery'); var projectId = 'pi-iot-project'; var bigquery = new BigQuery({ projectId: projectId, }); var datasetName = 'pi3_dht11_dataset'; var tableName = 'dht11_data'; exports.pubsubToBQ = function(event, callback) { var msg = event.data; var data = JSON.parse(Buffer.from(msg.data, 'base64').toString()); // console.log(data); bigquery .dataset(datasetName) .table(tableName) .insert(data) .then(function() { console.log('Inserted rows'); callback(); // task done }) .catch(function(err) { if (err && err.name === 'PartialFailureError') { if (err.errors && err.errors.length > 0) { console.log('Insert errors:'); err.errors.forEach(function(err) { console.error(err); }); } } else { console.error('ERROR:', err); } callback(); // task done }); }; In the previous code, we were using the BigQuery Node.js module to insert data into our BigQuery table. Update projectId, datasetName, and tableName as applicable in the code. Next, click on the package.json tab and update it as shown: { "name": "cloud_function", "version": "0.0.1", "dependencies": { "@google-cloud/bigquery": "^1.0.0" } } Finally, for the Function to execute field, enter pubsubToBQ. pubsubToBQ is the name of the function that has our logic and this function will be called when the data event occurs. Click on the Create button and our function should be deployed in a minute. Running the device Now that the entire setup is done, we will start pumping data into BigQuery: Head back to Raspberry Pi 3 which was sending the DHT11 temperature and humidity data, and run the application. We should see the data being published to the state topic: Now, if we head back to the Cloud Functions page, we should see the requests coming into the cloud function: You can click on VIEW LOGS to view the logs of each function execution: Now, head over to our table in BigQuery and click on the RUN QUERY button; run the query as shown in the following screenshot: Now, all the data that was generated by the DHT11 sensor is timestamped and stored in BigQuery. You can use the Save to Google Sheets button to save this data to Google Sheets and analyze the data there or plot graphs, as shown here: Or we can go one step ahead and use the Google Data Studio to do the same. Google Data Studio reports Now that the data is ready in BigQuery, we are going to set up Google Data Studio and then connect both of them, so we can access the data from BigQuery in Google Data Studio: Navigate to https://datastudio.google.com and log in with your Google account. Once you are on the Home page of Google Data Studio, click on the Blank report template. Make sure you read and agree to the terms and conditions before proceeding. Name the report PI3 DHT11 Sensor Data. Using the Create new data source button, we will create a new data source. Click on Create new data source and we should land on a page where we need to create a new Data Source. From the list of Connectors, select BigQuery; you will be asked to authorize Data Studio to interface with BigQuery, as shown in the following screenshot: Once we authorized, we will be shown our projects and related datasets and tables: Select the dht11_data table and click on Connect. This fetches the metadata of the table as shown here: Set the Aggregation for the temp and humd fields to Max and set the Type for time as Date & Time. Pick Minute (mm) from the sub-list. Click on Add to report and you will be asked to authorize Google Data Studio to read data from the table. Once the data source has been successfully linked, we will create a new time series chart. From the menu, select Insert | Time Series link. Update the data configuration of the chart as shown in the following screenshot: You can play with the styles as per your preference and we should see something similar to the following screenshot: This report can then be shared with any user. With this, we have seen the basic features and implementation process needed to work with Google Cloud IoT Core as well other features of the platform. If you found this post useful, do check out the book,  Enterprise Internet of Things Handbook, to build state of the art IoT applications best-fit for Enterprises. Cognitive IoT: How Artificial Intelligence is remoulding Industrial and Consumer IoT Five Most Surprising Applications of IoT How IoT is going to change tech teams
Read more
  • 0
  • 17
  • 27081

article-image-sizing-configuring-hadoop-cluster
Oli Huggins
16 Feb 2014
10 min read
Save for later

Sizing and Configuring your Hadoop Cluster

Oli Huggins
16 Feb 2014
10 min read
This article, written by Khaled Tannir, the author of Optimizing Hadoop for MapReduce, discusses two of the most important aspects to consider while optimizing Hadoop for MapReduce: sizing and configuring the Hadoop cluster correctly. Sizing your Hadoop cluster Hadoop's performance depends on multiple factors based on well-configured software layers and well-dimensioned hardware resources that utilize its CPU, Memory, hard drive (storage I/O) and network bandwidth efficiently. Planning the Hadoop cluster remains a complex task that requires a minimum knowledge of the Hadoop architecture and may be out the scope of this book. This is what we are trying to make clearer in this section by providing explanations and formulas in order to help you to best estimate your needs. We will introduce a basic guideline that will help you to make your decision while sizing your cluster and answer some How to plan questions about cluster's needs such as the following: How to plan my storage? How to plan my CPU? How to plan my memory? How to plan the network bandwidth? While sizing your Hadoop cluster, you should also consider the data volume that the final users will process on the cluster. The answer to this question will lead you to determine how many machines (nodes) you need in your cluster to process the input data efficiently and determine the disk/memory capacity of each one. Hadoop is a Master/Slave architecture and needs a lot of memory and CPU bound. It has two main components: JobTracker: This is the critical component in this architecture and monitors jobs that are running on the cluster TaskTracker: This runs tasks on each node of the cluster To work efficiently, HDFS must have high throughput hard drives with an underlying filesystem that supports the HDFS read and write pattern (large block). This pattern defines one big read (or write) at a time with a block size of 64 MB, 128 MB, up to 256 MB. Also, the network layer should be fast enough to cope with intermediate data transfer and block. HDFS is itself based on a Master/Slave architecture with two main components: the NameNode / Secondary NameNode and DataNode components. These are critical components and need a lot of memory to store the file's meta information such as attributes and file localization, directory structure, names, and to process data. The NameNode component ensures that data blocks are properly replicated in the cluster. The second component, the DataNode component, manages the state of an HDFS node and interacts with its data blocks. It requires a lot of I/O for processing and data transfer. Typically, the MapReduce layer has two main prerequisites: input datasets must be large enough to fill a data block and split in smaller and independent data chunks (for example, a 10 GB text file can be split into 40,960 blocks of 256 MB each, and each line of text in any data block can be processed independently). The second prerequisite is that it should consider the data locality, which means that the MapReduce code is moved where the data lies, not the opposite (it is more efficient to move a few megabytes of code to be close to the data to be processed, than moving many data blocks over the network or the disk). This involves having a distributed storage system that exposes data locality and allows the execution of code on any storage node. Concerning the network bandwidth, it is used at two instances: during the replication process and following a file write, and during the balancing of the replication factor when a node fails. The most common practice to size a Hadoop cluster is sizing the cluster based on the amount of storage required. The more data into the system, the more will be the machines required. Each time you add a new node to the cluster, you get more computing resources in addition to the new storage capacity. Let's consider an example cluster growth plan based on storage and learn how to determine the storage needed, the amount of memory, and the number of DataNodes in the cluster. Daily data input 100 GB Storage space used by daily data input = daily data input * replication factor = 300 GB HDFS replication factor 3 Monthly growth 5% Monthly volume = (300 * 30) + 5% =  9450 GB After one year = 9450 * (1 + 0.05)^12 = 16971 GB Intermediate MapReduce data 25% Dedicated space = HDD size * (1 - Non HDFS reserved space per disk / 100 + Intermediate MapReduce data / 100) = 4 * (1 - (0.25 + 0.30)) = 1.8 TB (which is the node capacity) Non HDFS reserved space per disk 30% Size of a hard drive disk 4 TB Number of DataNodes needed to process: Whole first month data = 9.450 / 1800 ~= 6 nodes The 12th month data = 16.971/ 1800 ~= 10 nodes Whole year data = 157.938 / 1800 ~= 88 nodes Do not use RAID array disks on a DataNode. HDFS provides its own replication mechanism. It is also important to note that for every disk, 30 percent of its capacity should be reserved to non-HDFS use. It is easy to determine the memory needed for both NameNode and Secondary NameNode. The memory needed by NameNode to manage the HDFS cluster metadata in memory and the memory needed for the OS must be added together. Typically, the memory needed by Secondary NameNode should be identical to NameNode. Then you can apply the following formulas to determine the memory amount: NameNode memory 2 GB - 4 GB Memory amount = HDFS cluster management memory + NameNode memory + OS memory Secondary NameNode memory 2 GB - 4 GB OS memory 4 GB - 8 GB HDFS memory 2 GB - 8 GB At least NameNode (Secondary NameNode) memory = 2 + 2 + 4 = 8 GB It is also easy to determine the DataNode memory amount. But this time, the memory amount depends on the physical CPU's core number installed on each DataNode. DataNode process memory 4 GB - 8 GB Memory amount = Memory per CPU core * number of CPU's core + DataNode process memory + DataNode TaskTracker memory + OS memory DataNode TaskTracker memory 4 GB - 8 GB OS memory 4 GB - 8 GB CPU's core number 4+ Memory per CPU core 4 GB - 8 GB At least DataNode memory = 4*4 + 4 + 4 + 4 = 28 GB Regarding how to determine the CPU and the network bandwidth, we suggest using the now-a-days multicore CPUs with at least four physical cores per CPU. The more physical CPU's cores you have, the more you will be able to enhance your job's performance (according to all rules discussed to avoid underutilization or overutilization). For the network switches, we recommend to use equipment having a high throughput (such as 10 GB) Ethernet intra rack with N x 10 GB Ethernet inter rack. Configuring your cluster correctly To run Hadoop and get a maximum performance, it needs to be configured correctly. But the question is how to do that. Well, based on our experiences, we can say that there is not one single answer to this question. The experiences gave us a clear indication that the Hadoop framework should be adapted for the cluster it is running on and sometimes also to the job. In order to configure your cluster correctly, we recommend running a Hadoop job(s) the first time with its default configuration to get a baseline. Then, you will check the resource's weakness (if it exists) by analyzing the job history logfiles and report the results (measured time it took to run the jobs). After that, iteratively, you will tune your Hadoop configuration and re-run the job until you get the configuration that fits your business needs. The number of mappers and reducer tasks that a job should use is important. Picking the right amount of tasks for a job can have a huge impact on Hadoop's performance. The number of reducer tasks should be less than the number of mapper tasks. Google reports one reducer for 20 mappers; the others give different guidelines. This is because mapper tasks often process a lot of data, and the result of those tasks are passed to the reducer tasks. Often, a reducer task is just an aggregate function that processes a minor portion of the data compared to the mapper tasks. Also, the correct number of reducers must also be considered. The number of mappers and reducers is related to the number of physical cores on the DataNode, which determines the maximum number of jobs that can run in parallel on DataNode. In a Hadoop cluster, master nodes typically consist of machines where one machine is designed as a NameNode, and another as a JobTracker, while all other machines in the cluster are slave nodes that act as DataNodes and TaskTrackers. When starting the cluster, you begin starting the HDFS daemons on the master node and DataNode daemons on all data nodes machines. Then, you start the MapReduce daemons: JobTracker on the master node and the TaskTracker daemons on all slave nodes. The following diagram shows the Hadoop daemon's pseudo formula: When configuring your cluster, you need to consider the CPU cores and memory resources that need to be allocated to these daemons. In a huge data context, it is recommended to reserve 2 CPU cores on each DataNode for the HDFS and MapReduce daemons. While in a small and medium data context, you can reserve only one CPU core on each DataNode. Once you have determined the maximum mapper's slot numbers, you need to determine the reducer's maximum slot numbers. Based on our experience, there is a distribution between the Map and Reduce tasks on DataNodes that give good performance result to define the reducer's slot numbers the same as the mapper's slot numbers or at least equal to two-third mapper slots. Let's learn to correctly configure the number of mappers and reducers and assume the following cluster examples: Cluster machine Nb Medium data size Large data size DataNode CPU cores 8 Reserve 1 CPU core Reserve 2 CPU cores DataNode TaskTracker daemon 1 1 1 DataNode HDFS daemon 1 1 1 Data block size 128 MB 256 MB DataNode CPU % utilization 95% to 120% 95% to 150% Cluster nodes 20 40 Replication factor 2 3 We want to use the CPU resources at least 95 percent, and due to Hyper-Threading, one CPU core might process more than one job at a time, so we can set the Hyper-Threading factor range between 120 percent and 170 percent. Maximum mapper's slot numbers on one node in a large data context = number of physical cores - reserved core * (0.95 -> 1.5) Reserved core = 1 for TaskTracker + 1 for HDFS Let's say the CPU on the node will use up to 120% (with Hyper-Threading) Maximum number of mapper slots = (8 - 2) * 1.2 = 7.2 rounded down to 7 Let's apply the 2/3 mappers/reducers technique: Maximum number of reducers slots = 7 * 2/3 = 5 Let's define the number of slots for the cluster: Mapper's slots: = 7 * 40 = 280 Reducer's slots: = 5 * 40 = 200 The block size is also used to enhance performance. The default Hadoop configuration uses 64 MB blocks, while we suggest using 128 MB in your configuration for a medium data context as well and 256 MB for a very large data context. This means that a mapper task can process one data block (for example, 128 MB) by only opening one block. In the default Hadoop configuration (set to 2 by default), two mapper tasks are needed to process the same amount of data. This may be considered as a drawback because initializing one more mapper task and opening one more file takes more time. Summary In this article, we learned about sizing and configuring the Hadoop cluster for optimizing it for MapReduce. Resources for Article: Further resources on this subject: Hadoop Tech Page Hadoop and HDInsight in a Heartbeat Securing the Hadoop Ecosystem Advanced Hadoop MapReduce Administration
Read more
  • 0
  • 3
  • 26839

article-image-restful-java-web-services-swagger
Fatema Patrawala
22 May 2018
14 min read
Save for later

Documenting RESTful Java Web Services using Swagger

Fatema Patrawala
22 May 2018
14 min read
In the earlier days, many software solution providers did not really pay attention to documenting their RESTful web APIs. However, many API vendors soon realized the need for a good API documentation solution. Today, you will find a variety of approaches to documenting RESTful web APIs. There are some popular solutions available today for describing, producing, consuming, and visualizing RESTful web services. In this tutorial, we will explore Swagger which offers a specification and a complete framework implementation for describing, producing, consuming, and visualizing RESTful web services. The Swagger framework works with many popular programming languages, such as Java, Scala, Clojure, Groovy, JavaScript, and .NET. This tutorial is an extract taken from the book RESTFul Java Web Services - Third Edition, written by Bogunuva Mohanram Balachandar. This book will help you master core REST concepts and create RESTful web services in Java. A glance at the market adoption of Swagger The greatest strength of Swagger is its powerful API platform, which satisfies the client, documentation, and server needs. The Swagger UI framework serves as the documentation and testing utility. Its support for different languages and its matured tooling support have really grabbed the attention of many API vendors, and it seems to be the one with the most traction in the community today. Swagger is built using Scala. This means that when you package your application, you need to have the entire Scala runtime into your build, which may considerably increase the size of your deployable artifact (the EAR or WAR file). That said, Swagger is however improving with each release. For example, the Swagger 2.0 release allows you to use YAML for describing APIs. So, keep a watch on this framework. The Swagger framework has the following three major components: Server: This component hosts the RESTful web API descriptions for the services that the clients want to use Client: This component uses the RESTful web API descriptions from the server to provide an automated interfacing mechanism to invoke the REST APIs User interface: This part of the framework reads a description of the APIs from the server, renders it as a web page, and provides an interactive sandbox to test the APIs A quick overview of the Swagger structure Let's take a quick look at the Swagger file structure before moving further. The Swagger 1.x file contents that describe the RESTful APIs are represented as the JSON objects. With Swagger 2.0 release onwards, you can also use the YAML format to describe the RESTful web APIs. This section discusses the Swagger file contents represented as JSON. The basic constructs that we'll discuss in this section for JSON are also applicable for the YAML representation of APIs, although the syntax differs. When using the JSON structure for describing the REST APIs, the Swagger file uses a Swagger object as the root document object for describing the APIs. Here is a quick overview of the various properties that you will find in a Swagger object: The following properties describe the basic information about the RESTful web application: swagger: This property specifies the Swagger version. info: This property provides metadata about the API. host: This property locates the server where the APIs are hosted. basePath: This property is the base path for the API. It is relative to the value set for the host field. schemes: This property transfers the protocol for a RESTful web API, such as HTTP and HTTPS. The following properties allow you to specify the default values at the application level, which can be optionally overridden for each operation: consumes: This property specifies the default internet media types that the APIs consume. It can be overridden at the API level. produces: This property specifies the default internet media types that the APIs produce. It can be overridden at the API level. securityDefinitions: This property globally defines the security schemes for the document. These definitions can be referred to from the security schemes specified for each API. The following properties describe the operations (REST APIs): paths: This property specifies the path to the API or the resources. The path must be appended to the basePath property in order to get the full URI. definitions: This property specifies the input and output entity types for operations on REST resources (APIs). parameters: This property specifies the parameter details for an operation. responses: This property specifies the response type for an operation. security: This property specifies the security schemes in order to execute this operation. externalDocs: This property links to the external documentation. A complete Swagger 2.0 specification is available at Github. The Swagger specification was donated to the Open API initiative, which aims to standardize the format of the API specification and bring uniformity in usage. Swagger 2.0 was enhanced as part of the Open API initiative, and a new specification OAS 3.0 (Open API specification 3.0) was released in July 2017. The tools are still being worked on to support OAS3.0. I encourage you to explore the Open API Specification 3.0 available at Github. Overview of Swagger APIs The Swagger framework consists of many sub-projects in the Git repository, each built with a specific purpose. Here is a quick summary of the key projects: swagger-spec: This repository contains the Swagger specification and the project in general. swagger-ui: This project is an interactive tool to display and execute the Swagger specification files. It is based on swagger-js (the JavaScript library). swagger-editor: This project allows you to edit the YAML files. It is released as a part of Swagger 2.0. swagger-core: This project provides the scala and java library to generate the Swagger specifications directly from code. It supports JAX-RS, the Servlet APIs, and the Play framework. swagger-codegen: This project provides a tool that can read the Swagger specification files and generate the client and server code that consumes and produces the specifications. In the next section, you will learn how to use the swagger-core project offerings to generate the Swagger file for a JAX-RS application. Generating Swagger from JAX-RS Both the WADL and RAML tools that we discussed in the previous sections use the JAX-RS annotations metadata to generate the documentation for the APIs. The Swagger framework does not fully rely on the JAX-RS annotations but offers a set of proprietary annotations for describing the resources. This helps in the following scenarios: The Swagger core annotations provide more flexibility for generating documentations compliant with the Swagger specifications It allows you to use Swagger for generating documentations for web components that do not use the JAX-RS annotations, such as servlet and the servlet filter The Swagger annotations are designed to work with JAX-RS, improving the quality of the API documentation generated by the framework. Note that the swagger-core project is currently supported on the Jersey and Restlet implementations. If you are considering any other runtime for your JAX-RS application, check the respective product manual and ensure the support before you start using Swagger for describing APIs. Some of the commonly used Swagger annotations are as follows: The @com.wordnik.swagger.annotations.Api annotation marks a class as a Swagger resource. Note that only classes that are annotated with @Api will be considered for generating the documentation. The @Api annotation is used along with class-level JAX-RS annotations such as @Produces and @Path. Annotations that declare an operation are as follows: @com.wordnik.swagger.annotations.ApiOperation: This annotation describes a resource method (operation) that is designated to respond to HTTP action methods, such as GET, PUT, POST, and DELETE. @com.wordnik.swagger.annotations.ApiParam: This annotation is used for describing parameters used in an operation. This is designed for use in conjunction with the JAX-RS parameters, such as @Path, @PathParam, @QueryParam, @HeaderParam, @FormParam, and @BeanParam. @com.wordnik.swagger.annotations.ApiImplicitParam: This annotation allows you to define the operation parameters manually. You can use this to override the @PathParam or @QueryParam values specified on a resource method with custom values. If you have multiple ImplicitParam for an operation, wrap them with @ApiImplicitParams. @com.wordnik.swagger.annotations.ApiResponse: This annotation describes the status codes returned by an operation. If you have multiple responses, wrap them by using @ApiResponses. @com.wordnik.swagger.annotations.ResponseHeader: This annotation describes a header that can be provided as part of the response. @com.wordnik.swagger.annotations.Authorization: This annotation is used within either Api or ApiOperation to describe the authorization scheme used on a resource or an operation. Annotations that declare API models are as follows: @com.wordnik.swagger.annotations.ApiModel: This annotation describes the model objects used in the application. @com.wordnik.swagger.annotations.ApiModelProperty: This annotation describes the properties present in the ApiModel object. A complete list of the Swagger core annotations is available at Github. Having learned the basics of Swagger, it is time for us to move on and build a simple example to get a feel of the real-life use of Swagger in a JAX-RS application. As always, this example uses the Jersey implementation of JAX-RS. Specifying dependency to Swagger To use Swagger in your Jersey 2 application, specify the dependency to swagger-jersey2-jaxrs jar. If you use Maven for building the source, the dependency to the swagger-core library will look as follows: <dependency> <groupId>com.wordnik</groupId> <artifactId>swagger-jersey2-jaxrs</artifactId> <version>1.5.1-M1</version> <!-use the appropriate version here, 1.5.x supports Swagger 2.0 spec --> </dependency> You should be careful while choosing the swagger-core version for your product. Note that swagger-core 1.3 produces the Swagger 1.2 definitions, whereas swagger-core 1.5 produces the Swagger 2.0 definitions. The next step is to hook the Swagger provider components into your Jersey application. This is done by configuring the Jersey servlet (org.glassfish.jersey.servlet.ServletContainer) in web.xml, as shown here: <servlet> <servlet-name>jersey</servlet-name> <servlet-class> org.glassfish.jersey.servlet.ServletContainer </servlet-class> <init-param> <param-name>jersey.config.server.provider.packages </param-name> <param-value> com.wordnik.swagger.jaxrs.json, com.packtpub.rest.ch7.swagger </param-value> </init-param> <load-on-startup>1</load-on-startup> </servlet> <servlet-mapping> <servlet-name>jersey</servlet-name> <url-pattern>/webresources/*</url-pattern> </servlet-mapping> To enable the Swagger documentation features, it is necessary to load the Swagger framework provider classes from the com.wordnik.swagger.jaxrs.listing package. The package names of the JAX-RS resource classes and provider components are configured as the value for the jersey.config.server.provider.packages init parameter. The Jersey framework scans through the configured packages for identifying the resource classes and provider components during the deployment of the application. Map the Jersey servlet to a request URI so that it responds to the REST resource calls that match the URI. If you prefer not to use web.xml, you can also use the custom application subclass for (programmatically) specifying all the configuration entries discussed here. To try this option, refer to Github. Configuring the Swagger definition After specifying the Swagger provider components, the next step is to configure and initialize the Swagger definition. This is done by configuring the com.wordnik.swagger.jersey.config.JerseyJaxrsConfig servlet in web.xml, as follows: <servlet> <servlet-name>Jersey2Config</servlet-name> <servlet-class> com.wordnik.swagger.jersey.config.JerseyJaxrsConfig </servlet-class> <init-param> <param-name>api.version</param-name> <param-value>1.0.0</param-value> </init-param> <init-param> <param-name>swagger.api.basepath</param-name> <param-value> http://localhost:8080/hrapp/webresources </param-value> </init-param> <load-on-startup>1</load-on-startup> </servlet> Here is a brief overview of the initialization parameters used for JerseyJaxrsConfig: api.version: This parameter specifies the API version for your application swagger.api.basepath: This parameter specifies the base path for your application With this step, we have finished all the configuration entries for using Swagger in a JAX-RS (Jersey 2 implementation) application. In the next section, we will see how to use the Swagger metadata annotation on a JAX-RS resource class for describing the resources and operations. Adding a Swagger annotation on a JAX-RS resource class Let's revisit the DepartmentResource class used in the previous sections. In this example, we will enhance the DepartmentResource class by adding the Swagger annotations discussed earlier. We use @Api to mark DepartmentResource as the Swagger resource. The @ApiOperation annotation describes the operation exposed by the DepartmentResource class: import com.wordnik.swagger.annotations.Api; import com.wordnik.swagger.annotations.ApiOperation; import com.wordnik.swagger.annotations.ApiParam; import com.wordnik.swagger.annotations.ApiResponse; import com.wordnik.swagger.annotations.ApiResponses; //Other imports are removed for brevity @Stateless @Path("departments") @Api(value = "/departments", description = "Get departments details") public class DepartmentResource { @ApiOperation(value = "Find department by id", notes = "Specify a valid department id", response = Department.class) @ApiResponses(value = { @ApiResponse(code = 400, message = "Invalid department id supplied"), @ApiResponse(code = 404, message = "Department not found") }) @GET @Path("{id}") @Produces("application/json") public Department findDepartment( @ApiParam(value = "The department id", required = true) @PathParam("id") Integer id){ return findDepartmentEntity(id); } //Rest of the codes are removed for brevity } To view the Swagger documentation, build the source and deploy it to the server. Once the application is deployed, you can navigate to http://<host>:<port>/<application-name>/<application-path>/swagger.json to view the Swagger resource listing in the JSON format. The Swagger URL for this example will look like the following: http://localhost:8080/hrapp/webresource/swagger.json The following sample Swagger representation is for the DepartmentResource class discussed in this section: { "swagger": "2.0", "info": { "version": "1.0.0", "title": "" }, "host": "localhost:8080", "basePath": "/hrapp/webresources", "tags": [ { "name": "user" } ], "schemes": [ "http" ], "paths": { "/departments/{id}": { "get": { "tags": [ "user" ], "summary": "Find department by id", "description": "", "operationId": "loginUser", "produces": [ "application/json" ], "parameters": [ { "name": "id", "in": "path", "description": "The department id", "required": true, "type": "integer", "format": "int32" } ], "responses": { "200": { "description": "successful operation", "schema": { "$ref": "#/definitions/Department" } }, "400": { "description": "Invalid department id supplied" }, "404": { "description": "Department not found" } } } } }, "definitions": { "Department": { "properties": { "departmentId": { "type": "integer", "format": "int32" }, "departmentName": { "type": "string" }, "_persistence_shouldRefreshFetchGroup": { "type": "boolean" } } } } } As mentioned at the beginning of this section, from the Swagger 2.0 release onward it supports the YAML representation of APIs. You can access the YAML representation by navigating to swagger.yaml. For instance, in the preceding example, the following URI gives you the YAML file: http://<host>:<port>/<application-name>/<application-path>/swagger.yaml The complete source code for this example is available at the Packt website. You can download the example from the Packt website link that we mentioned at the beginning of this book, in the Preface section. In the downloaded source code, see the rest-chapter7-service-doctools/rest-chapter7-jaxrs2swagger project. Generating a Java client from Swagger The Swagger framework is packaged with the Swagger code generation tool as well (swagger-codegen-cli), which allows you to generate client libraries by parsing the Swagger documentation file. You can download the swagger-codegen-cli.jar file from the Maven central repository by searching for swagger-codegen-cli in search maven. Alternatively, you can clone the Git repository and build the source locally by executing mvn install. Once you have swagger-codegen-cli.jar locally available, run the following command to generate the Java client for the REST API described in Swagger: java -jar swagger-codegen-cli.jar generate -i <Input-URI-or-File-location-for-swagger.json> -l <client-language-to-generate> -o <output-directory> The following example illustrates the use of this tool: java -jar swagger-codegen-cli-2.1.0-M2.jar generate -i http://localhost:8080/hrapp/webresources/swagger.json -l java -o generated-sources/java When you run this tool, it scans through the RESTful web API description available at http://localhost:8080/hrapp/webresources/swagger.json and generates a Java client source in the generated-sources/java folder. Note that the Swagger code generation process uses the mustache templates for generating the client source. If you are not happy with the generated source, Swagger lets you specify your own mustache template files. Use the -t flag to specify your template folder. To learn more, refer to the README.md file at Github. To learn more grab this latest edition of RESTful Java Web Services to build robust, scalable and secure RESTful web services using Java APIs. Getting started with Django RESTful Web Services How to develop RESTful web services in Spring Testing RESTful Web Services with Postman
Read more
  • 0
  • 0
  • 26559
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
article-image-fingerprint-detection-using-opencv
Packt
07 Oct 2015
11 min read
Save for later

Fingerprint detection using OpenCV 3

Packt
07 Oct 2015
11 min read
In this article by Joseph Howse, Quan Hua, Steven Puttemans, and Utkarsh Sinha, the authors of OpenCV Blueprints, we delve into the aspect of fingerprint detection using OpenCV. (For more resources related to this topic, see here.) Fingerprint identification, how is it done? We have already discussed the use of the first biometric, which is the face of the person trying to login to the system. However since we mentioned that using a single biometric can be quite risky, we suggest adding secondary biometric checks to the system, like the fingerprint of a person. There are many of the shelf fingerprint scanners which are quite cheap and return you the scanned image. However you will still have to write your own registration software for these scanners and which can be done by using OpenCV software. Examples of such fingerprint images can be found below. Examples of single individual thumb fingerprint in different scanning positions This dataset can be downloaded from the FVC2002 competition website released by the University of Bologna. The website (http://bias.csr.unibo.it/fvc2002/databases.asp) contains 4 databases of fingerprints available for public download of the following structure: Four fingerprint capturing devices DB1 - DB4. For each device, the prints of 10 individuals are available. For each person, 8 different positions of prints were recorded. We will use this publicly available dataset to build our system upon. We will focus on the first capturing device, using up to 4 fingerprints of each individual for training the system and making an average descriptor of the fingerprint. Then we will use the other 4 fingerprints to evaluate our system and make sure that the person is still recognized by our system. You could apply exactly the same approach on the data grabbed from the other devices if you would like to investigate the difference between a system that captures almost binary images and one that captures grayscale images. However we will provide techniques for doing the binarization yourself. Implement the approach in OpenCV 3 The complete fingerprint software for processing fingerprints derived from a fingerprint scanner can be found at https://github.com/OpenCVBlueprints/OpenCVBlueprints/tree/master/chapter_6/source_code/fingerprint/fingerprint_process/. In this subsection we will describe how you can implement this approach in the OpenCV interface. We will start by grabbing the image from the fingerprint system and apply binarization. This will enable us to remove any desired noise from the image as well as help us to make the contrast better between the kin and the wrinkled surface of the finger. // Start by reading in an image Mat input = imread("/data/fingerprints/image1.png", CV_LOAD_GRAYSCALE); // Binarize the image, through local thresholding Mat input_binary; threshold(input, input_binary, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU); The Otsu thresholding will automatically choose the best generic threshold for the image to obtain a good contrast between foreground and background information. This is because the image contains a bimodal distribution (which means that we have an image with a 2 peak histogram) of pixel values. For that image, we can approximately take a value in the middle of those peaks as threshold value. (for images which are not bimodal, binarization won't be accurate.) Otsu allows us to avoid using a fixed threshold value, and thus making the system more general for any capturing device. However we do acknowledge that if you have only a single capturing device, then playing around with a fixed threshold value could result in a better image for that specific setup. The result of the thresholding can be seen below. In order to make thinning from the next skeletization step as effective as possible we need the inverse binary image. Comparison between grayscale and binarized fingerprint image Once we have a binary image we are actually already set to go to calculate our feature points and feature point descriptors. However, in order to improve the process a bit more, we suggest to skeletize the image. This will create more unique and stronger interest points. The following piece of code can apply the skeletization on top of the binary image. The skeletization is based on the Zhang-Suen line thinning approach. Special thanks to @bsdNoobz of the OpenCV Q&A forum who supplied this iteration approach. #include <opencv2/imgproc.hpp> #include <opencv2/highgui.hpp> using namespace std; using namespace cv; // Perform a single thinning iteration, which is repeated until the skeletization is finalized void thinningIteration(Mat& im, int iter) { Mat marker = Mat::zeros(im.size(), CV_8UC1); for (int i = 1; i < im.rows-1; i++) { for (int j = 1; j < im.cols-1; j++) { uchar p2 = im.at<uchar>(i-1, j); uchar p3 = im.at<uchar>(i-1, j+1); uchar p4 = im.at<uchar>(i, j+1); uchar p5 = im.at<uchar>(i+1, j+1); uchar p6 = im.at<uchar>(i+1, j); uchar p7 = im.at<uchar>(i+1, j-1); uchar p8 = im.at<uchar>(i, j-1); uchar p9 = im.at<uchar>(i-1, j-1); int A = (p2 == 0 && p3 == 1) + (p3 == 0 && p4 == 1) + (p4 == 0 && p5 == 1) + (p5 == 0 && p6 == 1) + (p6 == 0 && p7 == 1) + (p7 == 0 && p8 == 1) + (p8 == 0 && p9 == 1) + (p9 == 0 && p2 == 1); int B = p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9; int m1 = iter == 0 ? (p2 * p4 * p6) : (p2 * p4 * p8); int m2 = iter == 0 ? (p4 * p6 * p8) : (p2 * p6 * p8); if (A == 1 && (B >= 2 && B <= 6) && m1 == 0 && m2 == 0) marker.at<uchar>(i,j) = 1; } } im &= ~marker; } // Function for thinning any given binary image within the range of 0-255. If not you should first make sure that your image has this range preset and configured! void thinning(Mat& im) { // Enforce the range tob e in between 0 - 255 im /= 255; Mat prev = Mat::zeros(im.size(), CV_8UC1); Mat diff; do { thinningIteration(im, 0); thinningIteration(im, 1); absdiff(im, prev, diff); im.copyTo(prev); } while (countNonZero(diff) > 0); im *= 255; } The code above can then simply be applied to our previous steps by calling the thinning function on top of our previous binary generated image. The code for this is: Apply thinning algorithm Mat input_thinned = input_binary.clone(); thinning(input_thinned); This will result in the following output: Comparison between binarized and thinned fingerprint image using skeletization techniques. When we got this skeleton image, the following step would be to look for crossing points on the ridges of the fingerprint, which are being then called minutiae points. We can do this by a keypoint detector that looks at a large change in local contrast, like the Harris corner detector. Since the Harris corner detector is both able to detect strong corners and edges, this is ideally for the fingerprint problem, where the most important minutiae are short edges and bifurcation, the positions where edges come together. More information about minutae points and Harris Corner detection can be found in the following publications: Ross, Arun A., Jidnya Shah, and Anil K. Jain. "Toward reconstructing fingerprints from minutiae points." Defense and Security. International Society for Optics and Photonics, 2005. Harris, Chris, and Mike Stephens. "A combined corner and edge detector." Alvey vision conference. Vol. 15. 1988. Calling the Harris Corner operation on a skeletonized and binarized image in OpenCV is quite straightforward. The Harris corners are stored as positions corresponding in the image with their cornerness response value. If we want to detect points with a certain cornerness, than we should simply threshold the image. Mat harris_corners, harris_normalised; harris_corners = Mat::zeros(input_thinned.size(), CV_32FC1); cornerHarris(input_thinned, harris_corners, 2, 3, 0.04, BORDER_DEFAULT); normalize(harris_corners, harris_normalised, 0, 255, NORM_MINMAX, CV_32FC1, Mat()); We now have a map with all the available corner responses rescaled to the range of [0 255] and stored as float values. We can now manually define a threshold, that will generate a good amount of keypoints for our application. Playing around with this parameter could improve performance in other cases. This can be done by using the following code snippet: float threshold = 125.0; vector<KeyPoint> keypoints; Mat rescaled; convertScaleAbs(harris_normalised, rescaled); Mat harris_c(rescaled.rows, rescaled.cols, CV_8UC3); Mat in[] = { rescaled, rescaled, rescaled }; int from_to[] = { 0,0, 1,1, 2,2 }; mixChannels( in, 3, &harris_c, 1, from_to, 3 ); for(int x=0; x<harris_normalised.cols; x++){ for(int y=0; y<harris_normalised.rows; y++){ if ( (int)harris_normalised.at<float>(y, x) > threshold ){ // Draw or store the keypoint location here, just like you decide. In our case we will store the location of the keypoint circle(harris_c, Point(x, y), 5, Scalar(0,255,0), 1); circle(harris_c, Point(x, y), 1, Scalar(0,0,255), 1); keypoints.push_back( KeyPoint (x, y, 1) ); } } } Comparison between thinned fingerprint and Harris corner response, as well as the selected Harris corners. Now that we have a list of keypoints we will need to create some of formal descriptor of the local region around that keypoint to be able to uniquely identify it among other keypoints. Chapter 3, Recognizing facial expressions with machine learning, discusses in more detail the wide range of keypoints out there. In this article, we will mainly focus on the process. Feel free to adapt the interface with other keypoint detectors and descriptors out there, for better or for worse performance. Since we have an application where the orientation of the thumb can differ (since it is not a fixed position), we want a keypoint descriptor that is robust at handling these slight differences. One of the mostly used descriptors for that is the SIFT descriptor, which stands for scale invariant feature transform. However SIFT is not under a BSD license and can thus pose problems to use in commercial software. A good alternative in OpenCV is the ORB descriptor. In OpenCV you can implement it in the following way. Ptr<Feature2D> orb_descriptor = ORB::create(); Mat descriptors; orb_descriptor->compute(input_thinned, keypoints, descriptors); This will enable us to calculate only the descriptors using the ORB approach, since we already retrieved the location of the keypoints using the Harris corner approach. At this point we can retrieve a descriptor for each detected keypoint of any given fingerprint. The descriptors matrix will contain a row for each keypoint containing the representation. Let us now start from the case where we have only a single reference image for each fingerprint. In that case we will have a database containing a set of feature descriptors for the training persons in the database. We then have a single new entry, consisting of multiple descriptors for the keypoints found at registration time. We now have to match these descriptors to the descriptors stored in the database, to see which one has the best match. The most simple way is by performing a brute force matching using the hamming distance criteria between descriptors of different keypoints. // Imagine we have a vector of single entry descriptors as a database // We will still need to fill those once we compare everything, by using the code snippets above vector<Mat> database_descriptors; Mat current_descriptors; // Create the matcher interface Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming"); // Now loop over the database and start the matching vector< vector< DMatch > > all_matches; for(int entry=0; i<database_descriptors.size();entry++){ vector< DMatch > matches; matcheràmatch(database_descriptors[entry], current_descriptors, matches); all_matches.push_back(matches); } We now have all the matches stored as DMatch objects. This means that for each matching couple we will have the original keypoint, the matched keypoint and a floating point score between both matches, representing the distance between the matched points. The idea about finding the best match seems pretty straightforward. We take a look at the amount of matches that have been returned by the matching process and weigh them by their Euclidean distance in order to add some certainty. We then look for the matching process that yielded the biggest score. This will be our best match and the match we want to return as the selected one from the database. If you want to avoid an imposter getting assigned to the best matching score, you can again add a manual threshold on top of the scoring to avoid matches that are not good enough, to be ignored. However it is possible, and should be taken into consideration, that if you increase the score to high, that people with a little change will be rejected from the system, like for example in the case where someone cuts his finger and thus changing his pattern to drastically. Fingerprint matching process visualized. To summarize, we saw how to detect fingerprints and implement it using OpenCV 3. Resources for Article: Further resources on this subject: Making subtle color shifts with curves [article] Tracking Objects in Videos [article] Hand Gesture Recognition Using a Kinect Depth Sensor [article]
Read more
  • 0
  • 1
  • 26528

article-image-task-parallel-library-multi-threading-net-core
Aaron Lazar
16 Aug 2018
11 min read
Save for later

Task parallel library for easy multi-threading in .NET Core [Tutorial]

Aaron Lazar
16 Aug 2018
11 min read
Compared to the classic threading model in .NET, Task Parallel Library minimizes the complexity of using threads and provides an abstraction through a set of APIs that help developers focus more on the application program instead of focusing on how the threads will be provisioned. In this article, we'll learn how TPL benefits of using traditional threading techniques for concurrency and high performance. There are several benefits of using TPL over threads: It autoscales the concurrency to a multicore level It autoscales LINQ queries to a multicore level It handles the partitioning of the work and uses ThreadPool where required It is easy to use and reduces the complexity of working with threads directly This tutorial is an extract from the book, C# 7 and .NET Core 2.0 High Performance, authored by Ovais Mehboob Ahmed Khan. Creating a task using TPL TPL APIs are available in the System.Threading and System.Threading.Tasks namespaces. They work around the task, which is a program or a block of code that runs asynchronously. An asynchronous task can be run by calling either the Task.Run or TaskFactory.StartNew methods. When we create a task, we provide a named delegate, anonymous method, or a lambda expression that the task executes. Here is a code snippet that uses a lambda expression to execute the ExecuteLongRunningTasksmethod using Task.Run: class Program { static void Main(string[] args) { Task t = Task.Run(()=>ExecuteLongRunningTask(5000)); t.Wait(); } public static void ExecuteLongRunningTask(int millis) { Thread.Sleep(millis); Console.WriteLine("Hello World"); } } In the preceding code snippet, we have executed the ExecuteLongRunningTask method asynchronously using the Task.Run method. The Task.Run method returns the Task object that can be used to further wait for the asynchronous piece of code to be executed completely before the program ends. To wait for the task, we have used the Wait method. Alternatively, we can also use the Task.Factory.StartNew method, which is more advanced and provides more options. While calling the Task.Factory.StartNew method, we can specify CancellationToken, TaskCreationOptions, and TaskScheduler to set the state, specify other options, and schedule tasks. TPL uses multiple cores of the CPU out of the box. When the task is executed using the TPL API, it automatically splits the task into one or more threads and utilizes multiple processors, if they are available. The decision as to how many threads will be created is calculated at runtime by CLR. Whereas a thread only has an affinity to a single processor, running any task on multiple processors needs a proper manual implementation. Task-based asynchronous pattern (TAP) When developing any software, it is always good to implement the best practices while designing its architecture. The task-based asynchronous pattern is one of the recommended patterns that can be used when working with TPL. There are, however, a few things to bear in mind while implementing TAP. Naming convention The method executing asynchronously should have the naming suffix Async. For example, if the method name starts with ExecuteLongRunningOperation, it should have the suffix Async, with the resulting name of ExecuteLongRunningOperationAsync. Return type The method signature should return either a System.Threading.Tasks.Task or System.Threading.Tasks.Task<TResult>. The task's return type is equivalent to the method that returns void, whereas TResult is the data type. Parameters The out and ref parameters are not allowed as parameters in the method signature. If multiple values need to be returned, tuples or a custom data structure can be used. The method should always return Task or Task<TResult>, as discussed previously. Here are a few signatures for both synchronous and asynchronous methods: Synchronous methodAsynchronous methodVoid Execute();Task ExecuteAsync();List<string> GetCountries();Task<List<string>> GetCountriesAsync();Tuple<int, string> GetState(int stateID);Task<Tuple<int, string>> GetStateAsync(int stateID);Person GetPerson(int personID);Task<Person> GetPersonAsync(int personID); Exceptions The asynchronous method should always throw exceptions that are assigned to the returning task. However, the usage errors, such as passing null parameters to the asynchronous method, should be properly handled. Let's suppose we want to generate several documents dynamically based on a predefined templates list, where each template populates the placeholders with dynamic values and writes it on the filesystem. We assume that this operation will take a sufficient amount of time to generate a document for each template. Here is a code snippet showing how the exceptions can be handled: static void Main(string[] args) { List<Template> templates = GetTemplates(); IEnumerable<Task> asyncDocs = from template in templates select GenerateDocumentAsync(template); try { Task.WaitAll(asyncDocs.ToArray()); }catch(Exception ex) { Console.WriteLine(ex); } Console.Read(); } private static async Task<int> GenerateDocumentAsync(Template template) { //To automate long running operation Thread.Sleep(3000); //Throwing exception intentionally throw new Exception(); } In the preceding code, we have a GenerateDocumentAsync method that performs a long running operation, such as reading the template from the database, populating placeholders, and writing a document to the filesystem. To automate this process, we used Thread.Sleep to sleep the thread for three seconds and then throw an exception that will be propagated to the calling method. The Main method loops the templates list and calls the GenerateDocumentAsync method for each template. Each GenerateDocumentAsync method returns a task. When calling an asynchronous method, the exception is actually hidden until the Wait, WaitAll, WhenAll, and other methods are called. In the preceding example, the exception will be thrown once the Task.WaitAll method is called, and will log the exception on the console. Task status The task object provides a TaskStatus that is used to know whether the task is executing the method running, has completed the method, has encountered a fault, or whether some other occurrence has taken place. The task initialized using Task.Run initially has the status of Created, but when the Start method is called, its status is changed to Running. When applying the TAP pattern, all the methods return the Task object, and whether they are using the Task.Run inside, the method body should be activated. That means that the status should be anything other than Created. The TAP pattern ensures the consumer that the task is activated and the starting task is not required. Task cancellation Cancellation is an optional thing for TAP-based asynchronous methods. If the method accepts the CancellationToken as the parameter, it can be used by the caller party to cancel a task. However, for a TAP, the cancellation should be properly handled. Here is a basic example showing how cancellation can be implemented: static void Main(string[] args) { CancellationTokenSource tokenSource = new CancellationTokenSource(); CancellationToken token = tokenSource.Token; Task.Factory.StartNew(() => SaveFileAsync(path, bytes, token)); } static Task<int> SaveFileAsync(string path, byte[] fileBytes, CancellationToken cancellationToken) { if (cancellationToken.IsCancellationRequested) { Console.WriteLine("Cancellation is requested..."); cancellationToken.ThrowIfCancellationRequested } //Do some file save operation File.WriteAllBytes(path, fileBytes); return Task.FromResult<int>(0); } In the preceding code, we have a SaveFileAsync method that takes the byte array and the CancellationToken as parameters. In the Main method, we initialize the CancellationTokenSource that can be used to cancel the asynchronous operation later in the program. To test the cancellation scenario, we will just call the Cancel method of the tokenSource after the Task.Factory.StartNew method and the operation will be canceled. Moreover, when the task is canceled, its status is set to Cancelled and the IsCompleted property is set to true. Task progress reporting With TPL, we can use the IProgress<T> interface to get real-time progress notifications from the asynchronous operations. This can be used in scenarios where we need to update the user interface or the console app of asynchronous operations. When defining the TAP-based asynchronous methods, defining IProgress<T> in a parameter is optional. We can have overloaded methods that can help consumers to use in the case of specific needs. However, they should only be used if the asynchronous method supports them.  Here is the modified version of SaveFileAsync that updates the user about the real progress: static void Main(string[] args) { var progressHandler = new Progress<string>(value => { Console.WriteLine(value); }); var progress = progressHandler as IProgress<string>; CancellationTokenSource tokenSource = new CancellationTokenSource(); CancellationToken token = tokenSource.Token; Task.Factory.StartNew(() => SaveFileAsync(path, bytes, token, progress)); Console.Read(); } static Task<int> SaveFileAsync(string path, byte[] fileBytes, CancellationToken cancellationToken, IProgress<string> progress) { if (cancellationToken.IsCancellationRequested) { progress.Report("Cancellation is called"); Console.WriteLine("Cancellation is requested..."); } progress.Report("Saving File"); File.WriteAllBytes(path, fileBytes); progress.Report("File Saved"); return Task.FromResult<int>(0); } Implementing TAP using compilers Any method that is attributed with the async keyword (for C#) or Async for (Visual Basic) is called an asynchronous method. The async keyword can be applied to a method, anonymous method, or a Lambda expression, and the language compiler can execute that task asynchronously. Here is a simple implementation of the TAP method using the compiler approach: static void Main(string[] args) { var t = ExecuteLongRunningOperationAsync(100000); Console.WriteLine("Called ExecuteLongRunningOperationAsync method, now waiting for it to complete"); t.Wait(); Console.Read(); } public static async Task<int> ExecuteLongRunningOperationAsync(int millis) { Task t = Task.Factory.StartNew(() => RunLoopAsync(millis)); await t; Console.WriteLine("Executed RunLoopAsync method"); return 0; } public static void RunLoopAsync(int millis) { Console.WriteLine("Inside RunLoopAsync method"); for(int i=0;i< millis; i++) { Debug.WriteLine($"Counter = {i}"); } Console.WriteLine("Exiting RunLoopAsync method"); } In the preceding code, we have the ExecuteLongRunningOperationAsync method, which is implemented as per the compiler approach. It calls the RunLoopAsync that executes a loop for a certain number of milliseconds that is passed in the parameter. The async keyword on the ExecuteLongRunningOperationAsync method actually tells the compiler that this method has to be executed asynchronously, and, once the await statement is reached, the method returns to the Main method that writes the line on a console and waits for the task to be completed. Once the RunLoopAsync is executed, the control comes back to await and starts executing the next statements in the ExecuteLongRunningOperationAsync method. Implementing TAP with greater control over Task As we know, that the TPL is centered on the Task and Task<TResult> objects. We can execute an asynchronous task by calling the Task.Run method and execute a delegate method or a block of code asynchronously and use Wait or other methods on that task. However, this approach is not always adequate, and there are scenarios where we may have different approaches to executing asynchronous operations, and we may use an Event-based Asynchronous Pattern (EAP) or an Asynchronous Programming Model (APM). To implement TAP principles here, and to get the same control over asynchronous operations executing with different models, we can use the TaskCompletionSource<TResult> object. The TaskCompletionSource<TResult> object is used to create a task that executes an asynchronous operation. When the asynchronous operation completes, we can use the TaskCompletionSource<TResult> object to set the result, exception, or state of the task. Here is a basic example that executes the ExecuteTask method that returns Task, where the ExecuteTask method uses the TaskCompletionSource<TResult> object to wrap the response as a Task and executes the ExecuteLongRunningTask through the Task.StartNew method: static void Main(string[] args) { var t = ExecuteTask(); t.Wait(); Console.Read(); } public static Task<int> ExecuteTask() { var tcs = new TaskCompletionSource<int>(); Task<int> t1 = tcs.Task; Task.Factory.StartNew(() => { try { ExecuteLongRunningTask(10000); tcs.SetResult(1); }catch(Exception ex) { tcs.SetException(ex); } }); return tcs.Task; } public static void ExecuteLongRunningTask(int millis) { Thread.Sleep(millis); Console.WriteLine("Executed"); } So now, we've been able to use TPL and TAP over traditional threads, thus improving performance. If you liked this article and would like to learn more such techniques, pick up this book, C# 7 and .NET Core 2.0 High Performance, authored by Ovais Mehboob Ahmed Khan. Get to know ASP.NET Core Web API [Tutorial] .NET Core completes move to the new compiler – RyuJIT Applying Single Responsibility principle from SOLID in .NET Core
Read more
  • 0
  • 0
  • 26469

article-image-role-angularjs
Packt
16 Dec 2014
7 min read
Save for later

Role of AngularJS

Packt
16 Dec 2014
7 min read
This article by Sandeep Kumar Patel, author of Responsive Web Design with AngularJS we will explore the role of AngularJS for responsive web development. Before going into AngularJS, you will learn about responsive "web development in general. Responsive" web development can be performed "in two ways: Using the browser sniffing approach Using the CSS3 media queries approach (For more resources related to this topic, see here.) Using the browser sniffing approach When we view" web pages through our browser, the browser sends a user agent string to the server. This string" provides information like browser and device details. Reading these details, the browser can be redirected to the appropriate view. This method of reading client details is known as browser sniffing. The browser string has a lot of different information about the source from where the request is generated. The following diagram shows the information shared by the user string:   Details of the parameters" present in the user agent string are as follows: Browser name: This" represents the actual name of the browser from where the request has originated, for example, Mozilla or Opera Browser version: This represents" the browser release version from the vendor, for example, Firefox has the latest version 31 Browser platform: This represents" the underlying engine on which the browser is running, for example, Trident or WebKit Device OS: This represents" the operating system running on the device from where the request has originated, for example, Linux or Windows Device processor: This represents" the processor type on which the operating system is running, for example, 32 or 64 bit A different browser string is generated based on the combination of the device and type of browser used while accessing a web page. The following table shows examples of browser "strings: Browser Device User agent string Firefox Windows desktop Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Firefox/31.0 Chrome OS X 10 desktop Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36 Opera Windows desktop Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14 Safari OS X 10 desktop Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.13+ (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2 Internet Explorer Windows desktop Mozilla/5.0 (compatible; MSIE 10.6; Windows NT 6.1; Trident/5.0; InfoPath.2; SLCC1; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727) 3gpp-gba UNTRUSTED/1.0   AngularJS has features like providers or services which can be most useful for this browser user-agent sniffing and a redirection approach. An AngularJS provider can be created that can be used in the configuration in the routing module. This provider can have reusable properties and reusable methods that can be used to identify the device and route the specific request to the appropriate template view. To discover more about user agent strings on various browser and device combinations, visit http://www.useragentstring.com/pages/Browserlist/. CSS3 media queries approach CSS3 brings a "new horizon to web application development. One of the key features" is media queries to develop a responsive web application. Media queries uses media types and features as "deciding parameters to apply the style to the current web page. Media type CSS3 media queries" provide rules for media types to have different styles applied to a web page. In the media queries specification, media types that should be supported by the implemented browser are listed. These media types are as follows: all: This is used" for all media type devices aural: This is "used for speech and sound synthesizers braille: This is used "for braille tactile feedback devices embossed: This" is used for paged braille printers handheld: This is "used for small or handheld devices, for example, mobile print: This" is used for printers, for example, an A4 size paper document projection: This is" used for projection-based devices, such as a projector screen with a slide screen: This is "used for computer screens, for example, desktop and "laptop screens tty: This is" used for media using a fixed-pitch character grid, such as teletypes and terminals tv: This is used "for television-type devices, for example, webOS "or Android-based television A media rule can be declared using the @media keyword with the specific type for the targeted media. The following code shows an example of the media rule usage, where the background body color" is black and text is white for the screen type media, and background body color is white and text is black for the printer media type: @media screen { body {    background:black;    color:white; } }   @media print{ body {    background:white;    color:black; } } An external style "sheet can be downloaded and applied to the current page based on the media type with the HTML link tag. The following code uses the link type in conjunction with media type: <link rel='stylesheet' media='screen' href='<fileName.css>' /> To learn more about" different media types,visit https://developer.mozilla.org/en-US/docs/Web/CSS/@media#Media_types. Media feature Conditional styles can be "applied to a page based on different features of a device. The features that are "supported by CSS3 media queries to apply styles are as follows: color: Based on the" number of bits used for a color component by the device-specific style sheet, this can be applied to a page. color-index: Based "on the color look up, table styles can be applied "to a page. aspect-ratio: Based "on the aspect ratio, display area style sheets can be applied to a page. device-aspect-ratio: Based "on the device aspect ratio, styles can be applied to a page. device-height: Based "on device height, styles can be applied to a page. "This includes the entire screen. device-width: Based "on device width, styles can be applied to a page. "This includes the entire screen. grid: Based "on the device type, bitmap or grid, styles can be applied "to a page. height: Based on" the device rendering area height, styles can be used "to a page. monochrome: Based on" the monochrome type, styles can be applied. "This represents the number of bits used by the device in the grey scale. orientation: Based" on the viewport mode, landscape or portrait, styles can be applied to a page. resolution: Based" on the pixel density, styles can be applied to a page. scan: Based on the "scanning type used by the device for rendering, styles can be applied to a page. width: Based "on the device screen width, specific styles can be applied. The following" code shows some examples" of CSS3 media queries using different device features for conditional styles used: //for screen devices with a minimum aspect ratio 0.5 @media screen and (min-aspect-ratio: 1/2) { img {    height: 70px;    width: 70px; } } //for all device in portrait viewport @media all and (orientation: portrait) { img {    height: 100px;    width: 200px; } } //For printer devices with a minimum resolution of 300dpi pixel density @media print and (min-resolution: 300dpi) { img {    height: 600px;    width: 400px; } } To learn more" about different media features, visit https://developer.mozilla.org/en-US/docs/Web/CSS/@media#Media_features. Summary In this chapter, you learned about responsive design and the SPA architecture. "You now understand the role of the AngularJS library when developing a responsive application. We quickly went through all the important features of AngularJS with the coded syntax. In the next chapter, you will set up your AngularJS application and learn to create dynamic routing-based on the devices. Resources for Article:  Further resources on this subject: Best Practices for Modern Web Applications [article] Important Aspect of AngularJS UI Development [article] A look into responsive design frameworks [article]
Read more
  • 0
  • 23
  • 26408

article-image-step-by-step-guide-to-creating-odoo-addon-modules
Sugandha Lahoti
19 Jun 2018
15 min read
Save for later

A step by step guide to creating Odoo Addon Modules

Sugandha Lahoti
19 Jun 2018
15 min read
Odoo uses a client/server architecture in which clients are web browsers accessing the Odoo server via RPC. Both the server and client extensions are packaged as modules which are optionally loaded in a database. Odoo modules can either add brand new business logic to an Odoo system or alter and extend existing business logic. Everything in Odoo starts and ends with modules. In this article, we will cover the basics of creating Odoo Addon Modules. Recipes that we will cover in this. Creating and installing a new addon module Completing the addon module manifest Organizing the addon module file structure Adding Models Our main goal here is to understand how an addon module is structured and the typical incremental workflow to add components to it. This post is an excerpt from the book Odoo 11 Development Cookbook - Second Edition by Alexandre Fayolle and Holger Brunn. With this book, you can create fast and efficient server-side applications using the latest features of Odoo v11. For this article, you should have Odoo installed. You are also expected to be comfortable in discovering and installing extra addon modules. Creating and installing a new addon module In this recipe, we will create a new module, make it available in our Odoo instance, and install it. Getting ready We will need an Odoo instance ready to use. For explanation purposes, we will assume Odoo is available at ~/odoo-dev/odoo, although any other location of your preference can be used. We will also need a location for our Odoo modules. For the purpose of this recipe, we will use a local-addons directory alongside the odoo directory, at ~/odoo-dev/local-addons. How to do it... The following steps will create and install a new addon module: Change the working directory in which we will work and create the addons directory where our custom module will be placed: $ cd ~/odoo-dev $ mkdir local-addons Choose a technical name for the new module and create a directory with that name for the module. For our example, we will use my_module: $ mkdir local-addons/my_module A module's technical name must be a valid Python identifier; it must begin with a letter, and only contain letters (preferably lowercase), numbers, and underscore characters. Make the Python module importable by adding an __init__.py file: $ touch local-addons/my_module/__init__.py Add a minimal module manifest for Odoo to detect it. Create a __manifest__.py file with this line: {'name': 'My module'} Start your Odoo instance including our module directory in the addons path: $ odoo/odoo-bin --addons-path=odoo/addon/,local-addons/ If the --save option is added to the Odoo command, the addons path will be saved in the configuration file. Next time you start the server, if no addons path option is provided, this will be used. Make the new module available in your Odoo instance; log in to Odoo using admin, enable the Developer Mode in the About box, and in the Apps top menu, select Update Apps List. Now, Odoo should know about our Odoo module. Select the Apps menu at the top and, in the search bar in the top-right, delete the default Apps filter and search for my_module. Click on its Install button, and the installation will be concluded. How it works... An Odoo module is a directory containing code files and other assets. The directory name used is the module's technical name. The name key in the module manifest is its title. The __manifest__.py file is the module manifest. It contains a Python dictionary with information about the module, the modules it depends on, and the data files that it will load. In the example, a minimal manifest file was used, but in real modules, we will want to add a few other important keys. These are discussed in the Completing the module manifest recipe, which we will see next. The module directory must be Python-importable, so it also needs to have an __init__.py file, even if it's empty. To load a module, the Odoo server will import it. This will cause the code in the __init__.py file to be executed, so it works as an entry point to run the module Python code. Due to this, it will usually contain import statements to load the module Python files and submodules. Known modules can be installed directly from the command line using the --init or -i option. This list is initially set when you create a new database, from the modules found on the addons path provided at that time. It can be updated in an existing database with the Update Module List menu. Completing the addon module manifest The manifest is an important piece for Odoo modules. It contains important information about the module and declares the data files that should be loaded. Getting ready We should have a module to work with, already containing a __manifest__.py manifest file. You may want to follow the previous recipe to provide such a module to work with. How to do it... We will add a manifest file and an icon to our addon module: To create a manifest file with the most relevant keys, edit the module __manifest__.py file to look like this: { 'name': "Title", 'summary': "Short subtitle phrase", 'description': """Long description""", 'author': "Your name", 'license': "AGPL-3", 'website': "http://www.example.com", 'category': 'Uncategorized', 'version': '11.0.1.0.0', 'depends': ['base'], 'data': ['views.xml'], 'demo': ['demo.xml'], } To add an icon for the module, choose a PNG image to use and copy it to static/description/icon.png. How it works... The remaining content is a regular Python dictionary, with keys and values. The example manifest we used contains the most relevant keys: name: This is the title for the module. summary: This is the subtitle with a one-line description. description: This is a long description written in plain text or the ReStructuredText (RST) format. It is usually surrounded by triple quotes, and is used in Python to delimit multiline texts. For an RST quickstart reference, visit http://docutils.sourceforge.net/docs/user/rst/quickstart.html. author: This is a string with the name of the authors. When there is more than one, it is common practice to use a comma to separate their names, but note that it still should be a string, not a Python list. license: This is the identifier for the license under which the module is made available. It is limited to a predefined list, and the most frequent option is AGPL-3. Other possibilities include LGPL-3, Other OSI approved license, and Other proprietary. website: This is a URL people should visit to know more about the module or the authors. category: This is used to organize modules in areas of interest. The list of the standard category names available can be seen at https://github.com/odoo/odoo/blob/11.0/odoo/addons/base/module/module_data.xml. However, it's also possible to define other new category names here. version: This is the modules' version numbers. It can be used by the Odoo app store to detect newer versions for installed modules. If the version number does not begin with the Odoo target version (for example, 11.0), it will be automatically added. Nevertheless, it will be more informative if you explicitly state the Odoo target version, for example, using 11.0.1.0.0 or 11.0.1.0 instead of 1.0.0 or 1.0. depends: This is a list with the technical names of the modules it directly depends on. If none, we should at least depend on the base module. Don't forget to include any module defining XML IDs, Views, or Models referenced by this module. That will ensure that they all load in the correct order, avoiding hard-to-debug errors. data: This is a list of relative paths to the data files to load with module installation or upgrade. The paths are relative to the module root directory. Usually, these are XML and CSV files, but it's also possible to have YAML data files. demo: This is the list of relative paths to the files with demonstration data to load. These will only be loaded if the database was created with the Demo Data flag enabled. The image that is used as the module icon is the PNG file at static/description/icon.png. Odoo is expected to have significant changes between major versions, so modules built for one major version are likely to not be compatible with the next version without conversion and migration work. Due to this, it's important to be sure about a module's Odoo target version before installing it. There's more… Instead of having the long description in the module manifest, it's possible to have it in its own file. Since version 8.0, it can be replaced by a README file, with either a .txt, .rst, or an .md (Markdown) extension. Otherwise, include a description/index.html file in the module. This HTML description will override a description defined in the manifest file. There are a few more keys that are frequently used: application: If this is True, the module is listed as an application. Usually, this is used for the central module of a functional area auto_install: If this is True, it indicates that this is a "glue" module, which is automatically installed when all of its dependencies are installed installable: If this is True (the default value), it indicates that the module is available for installation Organizing the addon module file structure An addon module contains code files and other assets such as XML files and images. For most of these files, we are free to choose where to place them inside the module directory. However, Odoo uses some conventions on the module structure, so it is advisable to follow them. Getting ready We are expected to have an addon module directory with only the __init__.py and __manifest__.py files. In this recipe, we suppose this is local-addons/my_module. How to do it... To create the basic skeleton for the addon module, perform the given steps: Create the directories for code files: $ cd local-addons/my_module $ mkdir models $ touch models/__init__.py $ mkdir controllers $ touch controllers/__init__.py $ mkdir views $ mkdir security $ mkdir data $ mkdir demo $ mkdir i18n $ mkdir -p static/description Edit the module's top __init__.py file so that the code in subdirectories is loaded: from . import models from . import controllers This should get us started with a structure containing the most used directories, similar to this one: . ├── __init__.py ├── __manifest__.py │ ├── controllers │ └── __init__.py ├── data ├── demo ├── i18n ├── models │ └── __init__.py ├── security ├── static │ └── description └── views How it works... To provide some context, an Odoo addon module can have three types of files:  The Python code is loaded by the __init__.py files, where the .py files and code subdirectories are imported. Subdirectories containing code Python, in turn, need their own __init__.py. Data files that are to be declared in the data and demo keys of the __manifest__.py module manifest in order to be loaded are usually XML and CSV files for the user interface, fixture data, and demonstration data. There can also be YAML files, which can include some procedural instructions that are run when the module is loaded, for instance, to generate or update records programmatically rather than statically in an XML file. Web assets such as JavaScript code and libraries, CSS, and QWeb/HTML templates also play an important part. There are declared through an XML file extending the master templates to add these assets to the web client or website pages. The addon files are to be organized in these directories: models/ contains the backend code files, creating the Models and their business logic. A file per Model is recommended, with the same name as the model, for example, library_book.py for the library.book model. views/ contains the XML files for the user interface, with the actions, forms, lists, and so on. As with models, it is advised to have one file per model. Filenames for website templates are expected to end with the _template suffix. data/ contains other data files with module initial data. demo/ contains data files with demonstration data, useful for tests, training, or module evaluation. i18n/ is where Odoo will look for the translation .pot and .po files. These files don't need to be mentioned in the manifest file. security/ contains the data files defining access control lists, usually a ir.model.access.csv file, and possibly an XML file to define access Groups and Record Rules for row level security. controllers/ contains the code files for the website controllers, and for modules providing that kind of feature. static/ is where all web assets are expected to be placed. Unlike other directories, this directory name is not just a convention, and only files inside it can be made available for the Odoo web pages. They don't need to be mentioned in the module manifest, but will have to be referred to in the web template. When adding new files to a module, don't forget to declare them either in the __manifest__.py (for data files) or __init__.py (for code files); otherwise, those files will be ignored and won't be loaded. Adding models Models define the data structures to be used by our business applications. This recipe shows how to add a basic model to a module. We will use a simple book library example to explain this; we want a model to represent books. Each book has a name and a list of authors. Getting ready We should have a module to work with. We will use an empty my_module for our explanation. How to do it... To add a new Model, we add a Python file describing it and then upgrade the addon module (or install it, if it was not already done). The paths used are relative to our addon module location (for example, ~/odoo-dev/local-addons/my_module/): Add a Python file to the models/library_book.py module with the following code: from odoo import models, fields class LibraryBook(models.Model): _name = 'library.book' name = fields.Char('Title', required=True) date_release = fields.Date('Release Date') author_ids = fields.Many2many( 'res.partner', string='Authors' ) Add a Python initialization file with code files to be loaded by the models/__init__.py module with the following code: from . import library_book Edit the module Python initialization file to have the models/ directory loaded by the module: from . import models Upgrade the Odoo module, either from the command line or from the apps menu in the user interface. If you look closely at the server log while upgrading the module, you should see this line: odoo.modules.registry: module my_module: creating or updating database table After this, the new library.book model should be available in our Odoo instance. If we have the technical tools activated, we can confirm that by looking it up at Settings | Technical | Database Structure | Models. How it works... Our first step was to create a Python file where our new module was created. Odoo models are objects derived from the Odoo Model Python class. When a new model is defined, it is also added to a central model registry. This makes it easier for other modules to make modifications to it later. Models have a few generic attributes prefixed with an underscore. The most important one is _name, providing a unique internal identifier to be used throughout the Odoo instance. The model fields are defined as class attributes. We began defining the name field of the Char type. It is convenient for models to have this field, because by default, it is used as the record description when referenced from other models. We also used an example of a relational field, author_ids. It defines a many-to-many relation between Library Books and the partners. A book can have many authors and each author can have written many books. Next, we must make our module aware of this new Python file. This is done by the __init__.py files. Since we placed the code inside the models/ subdirectory, we need the previous __init__ file to import that directory, which should, in turn, contain another __init__ file importing each of the code files there (just one, in our case). Changes to Odoo models are activated by upgrading the module. The Odoo server will handle the translation of the model class into database structure changes. Although no example is provided here, business logic can also be added to these Python files, either by adding new methods to the Model's class or by extending the existing methods, such as create() or write(). Thus we learned about the structure of an Odoo addon module and learned, step-by-step, how to create a simple module from scratch. To know more about, how to define access rules for your data;  how to expose your data models to end users on the back end and on the front end; and how to create beautiful PDF versions of your data, read this book Odoo 11 Development Cookbook - Second Edition. ERP tool in focus: Odoo 11 Building Your First Odoo Application How to Scaffold a New module in Odoo 11
Read more
  • 0
  • 0
  • 26388
article-image-step-detector-and-step-counters-sensors
Packt
14 Apr 2016
13 min read
Save for later

Step Detector and Step Counters Sensors

Packt
14 Apr 2016
13 min read
In this article by Varun Nagpal, author of the book, Android Sensor Programming By Example, we will focus on learning about the use of step detector and step counter sensors. These sensors are very similar to each other and are used to count the steps. Both the sensors are based on a common hardware sensor, which internally uses accelerometer, but Android still treats them as logically separate sensors. Both of these sensors are highly battery optimized and consume very low power. Now, lets look at each individual sensor in detail. (For more resources related to this topic, see here.) In this article by Varun Nagpal, author of the book, Android Sensor Programming By Example, we will focus on learning about the use of step detector and step counter sensors. These sensors are very similar to each other and are used to count the steps. Both the sensors are based on a common hardware sensor, which internally uses accelerometer, but Android still treats them as logically separate sensors. Both of these sensors are highly battery optimized and consume very low power. Now, lets look at each individual sensor in detail. The step counter sensor The step counter sensor is used to get the total number of steps taken by the user since the last reboot (power on) of the phone. When the phone is restarted, the value of the step counter sensor is reset to zero. In the onSensorChanged() method, the number of steps is give by event.value[0]; although it's a float value, the fractional part is always zero. The event timestamp represents the time at which the last step was taken. This sensor is especially useful for those applications that don't want to run in the background and maintain the history of steps themselves. This sensor works in batches and in continuous mode. If we specify 0 or no latency in the SensorManager.registerListener() method, then it works in a continuous mode; otherwise, if we specify any latency, then it groups the events in batches and reports them at the specified latency. For prolonged usage of this sensor, it's recommended to use the batch mode, as it saves power. Step counter uses the on-change reporting mode, which means it reports the event as soon as there is change in the value. The step detector sensor The step detector sensor triggers an event each time a step is taken by the user. The value reported in the onSensorChanged() method is always one, the fractional part being always zero, and the event timestamp is the time when the user's foot hit the ground. The step detector sensor has very low latency in reporting the steps, which is generally within 1 to 2 seconds. The Step detector sensor has lower accuracy and produces more false positive, as compared to the step counter sensor. The step counter sensor is more accurate, but has more latency in reporting the steps, as it uses this extra time after each step to remove any false positive values. The step detector sensor is recommended for those applications that want to track the steps in real time and want to maintain their own history of each and every step with their timestamp. Time for action – using the step counter sensor in activity Now, you will learn how to use the step counter sensor with a simple example. The good thing about the step counter is that, unlike other sensors, your app doesn't need to tell the sensor when to start counting the steps and when to stop counting them. It automatically starts counting as soon as the phone is powered on. For using it, we just have to register the listener with the sensor manager and then unregister it after using it. In the following example, we will show the total number of steps taken by the user since the last reboot (power on) of the phone in the Android activity. We created a PedometerActivity and implemented it with the SensorEventListener interface, so that it can receive the sensor events. We initiated the SensorManager and Sensor object of the step counter and also checked the sensor availability in the OnCreate() method of the activity. We registered the listener in the onResume() method and unregistered it in the onPause() method as a standard practice. We used a TextView to display the total number of steps taken and update its latest value in the onSensorChanged() method. public class PedometerActivity extends Activity implements SensorEventListener{ private SensorManager mSensorManager; private Sensor mSensor; private boolean isSensorPresent = false; private TextView mStepsSinceReboot; @Override protected void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.activity_pedometer); mStepsSinceReboot = (TextView)findViewById(R.id.stepssincereboot); mSensorManager = (SensorManager) this.getSystemService(Context.SENSOR_SERVICE); if(mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_COUNTER) != null) { mSensor = mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_COUNTER); isSensorPresent = true; } else { isSensorPresent = false; } } @Override protected void onResume() { super.onResume(); if(isSensorPresent) { mSensorManager.registerListener(this, mSensor, SensorManager.SENSOR_DELAY_NORMAL); } } @Override protected void onPause() { super.onPause(); if(isSensorPresent) { mSensorManager.unregisterListener(this); } } @Override public void onSensorChanged(SensorEvent event) { mStepsSinceReboot.setText(String.valueOf(event.values[0])); } Time for action – maintaining step history with step detector sensor The Step counter sensor works well when we have to deal with the total number of steps taken by the user since the last reboot (power on) of the phone. It doesn't solve the purpose when we have to maintain history of each and every step taken by the user. The Step counter sensor may combine some steps and process them together, and it will only update with an aggregated count instead of reporting individual step detail. For such cases, the step detector sensor is the right choice. In our next example, we will use the step detector sensor to store the details of each step taken by the user, and we will show the total number of steps for each day, since the application was installed. Our next example will consist of three major components of Android, namely service, SQLite database, and activity. Android service will be used to listen to all the individual step details using the step counter sensor when the app is in the background. All the individual step details will be stored in the SQLite database and finally the activity will be used to display the list of total number of steps along with dates. Let's look at the each component in detail. The first component of our example is PedometerListActivity. We created a ListView in the activity to display the step count along with dates. Inside the onCreate() method of PedometerListActivity, we initiated the ListView and ListAdaptor required to populate the list. Another important task that we do in the onCreate() method is starting the service (StepsService.class), which will listen to all the individual steps' events. We also make a call to the getDataForList() method, which is responsible for fetching the data for ListView. public class PedometerListActivity extends Activity{ private ListView mSensorListView; private ListAdapter mListAdapter; @Override protected void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.activity_main); mSensorListView = (ListView)findViewById(R.id.steps_list); getDataForList(); mListAdapter = new ListAdapter(); mSensorListView.setAdapter(mListAdapter); Intent mStepsIntent = new Intent(getApplicationContext(), StepsService.class); startService(mStepsIntent); } In our example, the DateStepsModel class is used as a POJO (Plain Old Java Object) class, which is a handy way of grouping logical data together, to store the total number of steps and date. We also use the StepsDBHelper class to read and write the steps data in the database (discussed further in the next section). Inside the getDataForList() method, we initiated the object of the StepsDBHelper class and call the readStepsEntries() method of the StepsDBHelper class, which returns ArrayList of the DateStepsModel objects containing the total number of steps along with dates after reading from database. The ListAdapter class is used for populating the values for ListView, which internally uses ArrayList of DateStepsModel as the data source. The individual list item is the string, which is the concatenation of date and the total number of steps. class DateStepsModel { public String mDate; public int mStepCount; } private StepsDBHelper mStepsDBHelper; private ArrayList<DateStepsModel> mStepCountList; public void getDataForList() { mStepsDBHelper = new StepsDBHelper(this); mStepCountList = mStepsDBHelper.readStepsEntries(); } private class ListAdapter extends BaseAdapter{ private TextView mDateStepCountText; @Override public int getCount() { return mStepCountList.size(); } @Override public Object getItem(int position) { return mStepCountList.get(position); } @Override public long getItemId(int position) { return position; } @Override public View getView(int position, View convertView, ViewGroup parent) { if(convertView==null){ convertView = getLayoutInflater().inflate(R.layout.list_rows, parent, false); } mDateStepCountText = (TextView)convertView.findViewById(R.id.sensor_name); mDateStepCountText.setText(mStepCountList.get(position).mDate + " - Total Steps: " + String.valueOf(mStepCountList.get(position).mStepCount)); return convertView; } } The second component of our example is StepsService, which runs in the background and listens to the step detector sensor until the app is uninstalled. We implemented this service with the SensorEventListener interface so that it can receive the sensor events. We also initiated theobjects of StepsDBHelper, SensorManager, and the step detector sensor inside the OnCreate() method of the service. We only register the listener when the step detector sensor is available on the device. A point to note here is that we never unregistered the listener because we expect our app to log the step information indefinitely until the app is uninstalled. Both step detector and step counter sensors are very low on battery consumptions and are highly optimized at the hardware level, so if the app really requires, it can use them for longer durations without affecting the battery consumption much. We get a step detector sensor callback in the onSensorChanged() method whenever the operating system detects a step, and from CC: specify, we call the createStepsEntry() method of the StepsDBHelperclass to store the step information in the database. public class StepsService extends Service implements SensorEventListener{ private SensorManager mSensorManager; private Sensor mStepDetectorSensor; private StepsDBHelper mStepsDBHelper; @Override public void onCreate() { super.onCreate(); mSensorManager = (SensorManager) this.getSystemService(Context.SENSOR_SERVICE); if(mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_DETECTOR) != null) { mStepDetectorSensor = mSensorManager.getDefaultSensor(Sensor.TYPE_STEP_DETECTOR); mSensorManager.registerListener(this, mStepDetectorSensor, SensorManager.SENSOR_DELAY_NORMAL); mStepsDBHelper = new StepsDBHelper(this); } } @Override public int onStartCommand(Intent intent, int flags, int startId) { return Service.START_STICKY; } @Override public void onSensorChanged(SensorEvent event) { mStepsDBHelper.createStepsEntry(); } The last component of our example is the SQLite database. We created a StepsDBHelper class and extended it from the SQLiteOpenHelper abstract utility class provided by the Android framework to easily manage database operations. In the class, we created a database called StepsDatabase, which is automatically created on the first object creation of the StepsDBHelper class by the OnCreate() method. This database has one table StepsSummary, which consists of only three columns (id, stepscount, and creationdate). The first column, id, is the unique integer identifier for each row of the table and is incremented automatically on creation of every new row. The second column, stepscount, is used to store the total number of steps taken for each date. The third column, creationdate, is used to store the date in the mm/dd/yyyy string format. Inside the createStepsEntry() method, we first check whether there is an existing step count with the current date, and we if find one, then we read the existing step count of the current date and update the step count by incrementing it by 1. If there is no step count with the current date found, then we assume that it is the first step of the current date and we create a new entry in the table with the current date and step count value as 1. The createStepsEntry() method is called from onSensorChanged() of the StepsService class whenever a new step is detected by the step detector sensor. public class StepsDBHelper extends SQLiteOpenHelper { private static final int DATABASE_VERSION = 1; private static final String DATABASE_NAME = "StepsDatabase"; private static final String TABLE_STEPS_SUMMARY = "StepsSummary"; private static final String ID = "id"; private static final String STEPS_COUNT = "stepscount"; private static final String CREATION_DATE = "creationdate";//Date format is mm/dd/yyyy private static final String CREATE_TABLE_STEPS_SUMMARY = "CREATE TABLE " + TABLE_STEPS_SUMMARY + "(" + ID + " INTEGER PRIMARY KEY AUTOINCREMENT," + CREATION_DATE + " TEXT,"+ STEPS_COUNT + " INTEGER"+")"; StepsDBHelper(Context context) { super(context, DATABASE_NAME, null, DATABASE_VERSION); } @Override public void onCreate(SQLiteDatabase db) { db.execSQL(CREATE_TABLE_STEPS_SUMMARY); } public boolean createStepsEntry() { boolean isDateAlreadyPresent = false; boolean createSuccessful = false; int currentDateStepCounts = 0; Calendar mCalendar = Calendar.getInstance(); String todayDate = String.valueOf(mCalendar.get(Calendar.MONTH))+"/" + String.valueOf(mCalendar.get(Calendar.DAY_OF_MONTH))+"/"+String.valueOf(mCalendar.get(Calendar.YEAR)); String selectQuery = "SELECT " + STEPS_COUNT + " FROM " + TABLE_STEPS_SUMMARY + " WHERE " + CREATION_DATE +" = '"+ todayDate+"'"; try { SQLiteDatabase db = this.getReadableDatabase(); Cursor c = db.rawQuery(selectQuery, null); if (c.moveToFirst()) { do { isDateAlreadyPresent = true; currentDateStepCounts = c.getInt((c.getColumnIndex(STEPS_COUNT))); } while (c.moveToNext()); } db.close(); } catch (Exception e) { e.printStackTrace(); } try { SQLiteDatabase db = this.getWritableDatabase(); ContentValues values = new ContentValues(); values.put(CREATION_DATE, todayDate); if(isDateAlreadyPresent) { values.put(STEPS_COUNT, ++currentDateStepCounts); int row = db.update(TABLE_STEPS_SUMMARY, values, CREATION_DATE +" = '"+ todayDate+"'", null); if(row == 1) { createSuccessful = true; } db.close(); } else { values.put(STEPS_COUNT, 1); long row = db.insert(TABLE_STEPS_SUMMARY, null, values); if(row!=-1) { createSuccessful = true; } db.close(); } } catch (Exception e) { e.printStackTrace(); } return createSuccessful; } The readStepsEntries() method is called from PedometerListActivity to display the total number of steps along with the date in the ListView. The readStepsEntries() method reads all the step counts along with their dates from the table and fills the ArrayList of DateStepsModelwhich is used as a data source for populating the ListView in PedometerListActivity. public ArrayList<DateStepsModel> readStepsEntries() { ArrayList<DateStepsModel> mStepCountList = new ArrayList<DateStepsModel>(); String selectQuery = "SELECT * FROM " + TABLE_STEPS_SUMMARY; try { SQLiteDatabase db = this.getReadableDatabase(); Cursor c = db.rawQuery(selectQuery, null); if (c.moveToFirst()) { do { DateStepsModel mDateStepsModel = new DateStepsModel(); mDateStepsModel.mDate = c.getString((c.getColumnIndex(CREATION_DATE))); mDateStepsModel.mStepCount = c.getInt((c.getColumnIndex(STEPS_COUNT))); mStepCountList.add(mDateStepsModel); } while (c.moveToNext()); } db.close(); } catch (Exception e) { e.printStackTrace(); } return mStepCountList; } What just happened? We created a small pedometer utility app that maintains the step history along with dates using the steps detector sensor. We used PedometerListActivityto display the list of the total number of steps along with their dates. StepsServiceis used to listen to all the steps detected by the step detector sensor in the background. And finally, the StepsDBHelperclass is used to create and update the total step count for each date and to read the total step counts along with dates from the database. Resources for Article: Further resources on this subject: Introducing the Android UI [article] Building your first Android Wear Application [article] Mobile Phone Forensics – A First Step into Android Forensics [article]
Read more
  • 0
  • 3
  • 26371

article-image-everything-you-need-to-know-about-pinecone-a-vector-database
Avinash Navlani
08 Jun 2023
5 min read
Save for later

Everything you need to know about Pinecone – A Vector Database

Avinash Navlani
08 Jun 2023
5 min read
In this 21st century of information, we need efficient reliable storage and faster information retrieval. Relational or older databases are the most crucial databases for any computer application, but they are unable to handle the data in different forms such as documents, key-value pairs, and graphs. Vector database is a novel approach that uses vectorization for efficient search, storage, and data analysis.  Image 1: Traditional Vs Vector Database Pinecone is one such vector database that is widely accepted across the industry for addressing challenges such as complexity and dimensionality. Pinecone is a cloud-native vector database that handles high-dimensional vector data. The core underlying approach for Pinecone is based on the Approximate Nearest Neighbor (ANN) search that efficiently locates faster matches and ranks them within a large dataset. In this tutorial, our focus will be on the pinecone database, its features, challenges, and use cases.  Working Mechanism Traditional databases search for exact query matches while vector databases search for the most similar vector to the input query. It uses ANN (Approximate Nearest Neighbour) search. It provides approximate results at high performance, accuracy, and speed. Let's see the vector database working mechanism.  Image 2: Vector Database Query Mechanism Vector databases first convert data into vectors and create indexing for faster searching. Vector database compares the indexed vector query and indexed vector in the database using the nearest neighbor or similarity matrix and computes the nearest most similar results. Finally, it post-processes the most similar results given by the nearest neighbor. Features Pinecone is a cloud-based vector database that offers various features and benefits to the infrastructure community: Fast and fresh vector search: Pinecone provides ultra-low query latency, even with billions of items. This means that users will always get a great experience, even when searching large datasets. Additionally, Pinecone indexes are updated in real-time, so users always have access to the most up-to-date information.  Filtered vector search: Pinecone allows you to combine vector search with metadata filters to get more relevant and faster results. For example, you could filter by product category, price, or customer rating.  Real-time updates: Pinecone supports real-time data updates, allowing for dynamic changes to the data. This contrasts with standalone vector indexes, which may require a full re-indexing process to incorporate new data. It has reliability, massive scalability, and security capability.  Backups and collections: Pinecone handle the routine operation of backing up all the data stored in the database. You can also selectively choose specific indexes that can be backed up in the form of “collections,” which store the data in that index for later use.  User-friendly API: Pinecone provides a user-friendly API layer that simplifies the development of high-performance vector search applications. This API layer is also language-agnostic, so you can use it with any programming language.  Programming language integration: It supports a wide range of programming languages for integration.  Cost-effectiveness: It is cost-effective because it offers cloud-native architecture. It offers pay-per-use based pricing. Challenges Pinecone vector database offers high-performance data search at a higher scale, but it also faces a few challenges such as:  Application integration with other applications will evolve over a period. Data privacy is the biggest concern for any database. Organizations need to implement proper authentication and authorization mechanisms. Vector-based models don’t explain the model's interpretability. So, it is challenging to interpret the underlying reason behind those relationships. Use cases Pinecone has a variety of real-life industry applications. Let’s discuss a few applications: Audio/Textual Search: Pinecone offers faster, fully deployment-ready search and similarity functionality for high-dimensional text and audio data. Natural language Processing: Pinecone utilizes AutoGPT to create context-aware solutions for document classification, semantic search, text summarization, sentiment analysis, and question-answering systems. Recommendations: Pinecone enables personalized recommendations with efficient similar items recommendations that improve user experience and satisfaction. Image and Video Analysis: Pinecone also has the capability of faster retrieval of image and video content. It is very useful in real-life surveillance and image recognition. Time series similarity search: Pinecone can detect Time-series patterns in historical time-series data using a similarity search service. such core capability is quite helpful for recommendations, clustering, and labeling applications. Summary Pinecone vector database is a vector-based database that offers high-performance search and similarity matching. It can deal with high-dimensional vector data at a higher scale, easy integration, and faster query results. Pinecone provides a reliable, and faster, option for searching at a higher scale. Author BioAvinash Navlani has over 8 years of experience working in data science and AI. Currently, he is working as a senior data scientist, improving products and services for customers by using advanced analytics, deploying big data analytical tools, creating and maintaining models, and onboarding compelling new datasets. Previously, he was a university lecturer, where he trained and educated people in data science subjects such as Python for analytics, data mining, machine learning, database management, and NoSQL. Avinash has been involved in research activities in data science and has been a keynote speaker at many conferences in India.Link - LinkedIn    Python Data Analysis, Third edition   
Read more
  • 0
  • 0
  • 26005

article-image-concurrency-and-parallelism-in-golang-tutorial
Natasha Mathur
06 Jul 2018
11 min read
Save for later

How Concurrency and Parallelism works in Golang [Tutorial]

Natasha Mathur
06 Jul 2018
11 min read
Computer and software programs are useful because they do a lot of laborious work very fast and can also do multiple things at once. We want our programs to be able to do multiple things simultaneously, and the success of a programming language can depend on how easy it is to write and understand multitasking programs. Concurrency and parallelism are two terms that are bound to come across often when looking into multitasking and are often used interchangeably. However, they mean two distinctly different things. In this article, we will look at how concurrency and parallelism work in Go using simple examples for better understanding. Let's get started! This article is an excerpt from a book 'Distributed Computing with Go' written by V.N. Nikhil Anurag. The standard definitions given on the Go blog are as follows: Concurrency: Concurrency is about dealing with lots of things at once. This means that we manage to get multiple things done at once in a given period of time. However, we will only be doing a single thing at a time. This tends to happen in programs where one task is waiting and the program decides to run another task in the idle time. In the following diagram, this is denoted by running the yellow task in idle periods of the blue task. Parallelism: Parallelism is about doing lots of things at once. This means that even if we have two tasks, they are continuously working without any breaks in between them. In the diagram, this is shown by the fact that the green task is running independently and is not influenced by the red task in any manner: It is important to understand the difference between these two terms. Let's look at a few concrete examples to further elaborate upon the difference between the two. Concurrency Let's look at the concept of concurrency using a simple example of a few daily routine tasks and the way we can perform them. Imagine you start your day and need to get six things done: Make hotel reservation Book flight tickets Order a dress Pay credit card bills Write an email Listen to an audiobook The order in which they are completed doesn't matter, and for some of the tasks, such as  writing an email or listening to an audiobook, you need not complete them in a single sitting. Here is one possible way to complete the tasks: Order a dress. Write one-third of the email. Make hotel reservation. Listen to 10 minutes of audiobook. Pay credit card bills. Write another one-third of the email. Book flight tickets. Listen to another 20 minutes of audiobook. Complete writing the email. Continue listening to audiobook until you fall asleep. In programming terms, we have executed the above tasks concurrently. We had a complete day and we chose particular tasks from our list of tasks and started to work on them. For certain tasks, we even decided to break them up into pieces and work on the pieces between other tasks. We will eventually write a program which does all of the preceding steps concurrently, but let's take it one step at a time. Let's start by building a program that executes the tasks sequentially, and then modify it progressively until it is purely concurrent code and uses goroutines. The progression of the program will be in three steps: Serial task execution. Serial task execution with goroutines. Concurrent task execution. Code overview The code will consist of a set of functions that print out their assigned tasks as completed. In the cases of writing an email or listening to an audiobook, we further divide the tasks into more functions. This can be seen as follows: writeMail, continueWritingMail1, continueWritingMail2 listenToAudioBook, continueListeningToAudioBook Serial task execution Let's first implement a program that will execute all the tasks in a linear manner. Based on the code overview we discussed previously, the following code should be straightforward: package main import ( "fmt" ) // Simple individual tasks func makeHotelReservation() { fmt.Println("Done making hotel reservation.") } func bookFlightTickets() { fmt.Println("Done booking flight tickets.") } func orderADress() { fmt.Println("Done ordering a dress.") } func payCreditCardBills() { fmt.Println("Done paying Credit Card bills.") } // Tasks that will be executed in parts // Writing Mail func writeAMail() { fmt.Println("Wrote 1/3rd of the mail.") continueWritingMail1() } func continueWritingMail1() { fmt.Println("Wrote 2/3rds of the mail.") continueWritingMail2() } func continueWritingMail2() { fmt.Println("Done writing the mail.") } // Listening to Audio Book func listenToAudioBook() { fmt.Println("Listened to 10 minutes of audio book.") continueListeningToAudioBook() } func continueListeningToAudioBook() { fmt.Println("Done listening to audio book.") } // All the tasks we want to complete in the day. // Note that we do not include the sub tasks here. var listOfTasks = []func(){ makeHotelReservation, bookFlightTickets, orderADress, payCreditCardBills, writeAMail, listenToAudioBook, } func main() { for _, task := range listOfTasks { task() } } We take each of the main tasks and start executing them in simple sequential order. Executing the preceding code should produce unsurprising output, as shown here: Done making hotel reservation. Done booking flight tickets. Done ordering a dress. Done paying Credit Card bills. Wrote 1/3rd of the mail. Wrote 2/3rds of the mail. Done writing the mail. Listened to 10 minutes of audio book. Done listening to audio book. Serial task execution with goroutines We took a list of tasks and wrote a program to execute them in a linear and sequential manner. However, we want to execute the tasks concurrently! Let's start by first introducing goroutines for the split tasks and see how it goes. We will only show the code snippet where the code actually changed here: /******************************************************************** We start by making Writing Mail & Listening Audio Book concurrent. *********************************************************************/ // Tasks that will be executed in parts // Writing Mail func writeAMail() { fmt.Println("Wrote 1/3rd of the mail.") go continueWritingMail1() // Notice the addition of 'go' keyword. } func continueWritingMail1() { fmt.Println("Wrote 2/3rds of the mail.") go continueWritingMail2() // Notice the addition of 'go' keyword. } func continueWritingMail2() { fmt.Println("Done writing the mail.") } // Listening to Audio Book func listenToAudioBook() { fmt.Println("Listened to 10 minutes of audio book.") go continueListeningToAudioBook() // Notice the addition of 'go' keyword. } func continueListeningToAudioBook() { fmt.Println("Done listening to audio book.") } The following is a possible output: Done making hotel reservation. Done booking flight tickets. Done ordering a dress. Done paying Credit Card bills. Wrote 1/3rd of the mail. Listened to 10 minutes of audio book. Whoops! That's not what we were expecting. The output from the continueWritingMail1, continueWritingMail2, and continueListeningToAudioBook functions is missing; the reason being that we are using goroutines. Since goroutines are not waited upon, the code in the main function continues executing and once the control flow reaches the end of the main function, the program ends. What we would really like to do is to wait in the main function until all the goroutines have finished executing. There are two ways we can do this—using channels or using WaitGroup.  We'll use WaitGroup now. In order to use WaitGroup, we have to keep the following in mind: Use WaitGroup.Add(int) to keep count of how many goroutines we will be running as part of our logic. Use WaitGroup.Done() to signal that a goroutine is done with its task. Use WaitGroup.Wait() to wait until all goroutines are done. Pass WaitGroup instance to the goroutines so they can call the Done() method. Based on these points, we should be able to modify the source code to use WaitGroup. The following is the updated code: package main import ( "fmt" "sync" ) // Simple individual tasks func makeHotelReservation(wg *sync.WaitGroup) { fmt.Println("Done making hotel reservation.") wg.Done() } func bookFlightTickets(wg *sync.WaitGroup) { fmt.Println("Done booking flight tickets.") wg.Done() } func orderADress(wg *sync.WaitGroup) { fmt.Println("Done ordering a dress.") wg.Done() } func payCreditCardBills(wg *sync.WaitGroup) { fmt.Println("Done paying Credit Card bills.") wg.Done() } // Tasks that will be executed in parts // Writing Mail func writeAMail(wg *sync.WaitGroup) { fmt.Println("Wrote 1/3rd of the mail.") go continueWritingMail1(wg) } func continueWritingMail1(wg *sync.WaitGroup) { fmt.Println("Wrote 2/3rds of the mail.") go continueWritingMail2(wg) } func continueWritingMail2(wg *sync.WaitGroup) { fmt.Println("Done writing the mail.") wg.Done() } // Listening to Audio Book func listenToAudioBook(wg *sync.WaitGroup) { fmt.Println("Listened to 10 minutes of audio book.") go continueListeningToAudioBook(wg) } func continueListeningToAudioBook(wg *sync.WaitGroup) { fmt.Println("Done listening to audio book.") wg.Done() } // All the tasks we want to complete in the day. // Note that we do not include the sub tasks here. var listOfTasks = []func(*sync.WaitGroup){ makeHotelReservation, bookFlightTickets, orderADress, payCreditCardBills, writeAMail, listenToAudioBook, } func main() { var waitGroup sync.WaitGroup // Set number of effective goroutines we want to wait upon waitGroup.Add(len(listOfTasks)) for _, task := range listOfTasks{ // Pass reference to WaitGroup instance // Each of the tasks should call on WaitGroup.Done() task(&waitGroup) } // Wait until all goroutines have completed execution. waitGroup.Wait() } Here is one possible output order; notice how continueWritingMail1 and continueWritingMail2 were executed at the end after listenToAudioBook and continueListeningToAudioBook: Done making hotel reservation. Done booking flight tickets. Done ordering a dress. Done paying Credit Card bills. Wrote 1/3rd of the mail. Listened to 10 minutes of audio book. Done listening to audio book. Wrote 2/3rds of the mail. Done writing the mail. Concurrent task execution In the final output of the previous part, we can see that all the tasks in listOfTasks are being executed in serial order, and the last step for maximum concurrency would be to let the order be determined by Go runtime instead of the order in listOfTasks. This might sound like a laborious task, but in reality this is quite simple to achieve. All we need to do is add the go keyword in front of task(&waitGroup): func main() { var waitGroup sync.WaitGroup // Set number of effective goroutines we want to wait upon waitGroup.Add(len(listOfTasks)) for _, task := range listOfTasks { // Pass reference to WaitGroup instance // Each of the tasks should call on WaitGroup.Done() go task(&waitGroup) // Achieving maximum concurrency } // Wait until all goroutines have completed execution. waitGroup.Wait() Following is a possible output: Listened to 10 minutes of audio book. Done listening to audio book. Done booking flight tickets. Done ordering a dress. Done paying Credit Card bills. Wrote 1/3rd of the mail. Wrote 2/3rds of the mail. Done writing the mail. Done making hotel reservation. If we look at this possible output, the tasks were executed in the following order: Listen to audiobook. Book flight tickets. Order a dress. Pay credit card bills. Write an email. Make hotel reservations. Now that we have a good idea on what concurrency is and how to write concurrent code using goroutines and WaitGroup, let's dive into parallelism. Parallelism Imagine that you have to write a few emails. They are going to be long and laborious, and the best way to keep yourself entertained is to listen to music while writing them, that is, listening to music "in parallel" to writing the emails. If we wanted to write a program that simulates this scenario, the following is one possible implementation: package main import ( "fmt" "sync" "time" ) func printTime(msg string) { fmt.Println(msg, time.Now().Format("15:04:05")) } // Task that will be done over time func writeMail1(wg *sync.WaitGroup) { printTime("Done writing mail #1.") wg.Done() } func writeMail2(wg *sync.WaitGroup) { printTime("Done writing mail #2.") wg.Done() } func writeMail3(wg *sync.WaitGroup) { printTime("Done writing mail #3.") wg.Done() } // Task done in parallel func listenForever() { for { printTime("Listening...") } } func main() { var waitGroup sync.WaitGroup waitGroup.Add(3) go listenForever() // Give some time for listenForever to start time.Sleep(time.Nanosecond * 10) // Let's start writing the mails go writeMail1(&waitGroup) go writeMail2(&waitGroup) go writeMail3(&waitGroup) waitGroup.Wait() } The output of the program might be as follows: Done writing mail #3. 19:32:57 Listening... 19:32:57 Listening... 19:32:57 Done writing mail #1. 19:32:57 Listening... 19:32:57 Listening... 19:32:57 Done writing mail #2. 19:32:57 The numbers represent the time in terms of Hour:Minutes:Seconds and, as can be seen, they are being executed in parallel. You might have noticed that the code for parallelism looks almost identical to the code for the final concurrency example. However, in the function listenForever, we are printing Listening... in an infinite loop. If the preceding example was written without goroutines, the output would keep printing Listening... and never reach the writeMail function calls. Goroutines are concurrent and, to an extent, parallel; however, we should think of them as being concurrent. The order of execution of goroutines is not predictable and we should not rely on them to be executed in any particular order. We should also take care to handle errors and panics in our goroutines because even though they are being executed in parallel, a panic in one goroutine will crash the complete program. Finally, goroutines can block on system calls, however, this will not block the execution of the program nor slow down the performance of the overall program. We looked at how goroutine can be used to run concurrent programs and also learned how parallelism works in Go. If you found this post useful, do check out the book 'Distributed Computing with Go' to learn more about Goroutines, channels and messages, and other concepts in Go. Golang Decorators: Logging & Time Profiling Essential Tools for Go Programming Why is Go the go-to language for cloud-native development? – An interview with Mina Andrawos
Read more
  • 0
  • 0
  • 25882
article-image-create-your-first-openai-gym-environment-tutorial
Savia Lobo
19 Sep 2018
7 min read
Save for later

Create your first OpenAI Gym environment [Tutorial]

Savia Lobo
19 Sep 2018
7 min read
OpenAI Gym is an open source toolkit that provides a diverse collection of tasks, called environments, with a common interface for developing and testing your intelligent agent algorithms. The toolkit introduces a standard Application Programming Interface (API) for interfacing with environments designed for reinforcement learning. Each environment has a version attached to it, which ensures meaningful comparisons and reproducible results with the evolving algorithms and the environments themselves. This article is an excerpt taken from the book, Hands-On Intelligent Agents with OpenAI Gym, written by Praveen Palanisamy. In this article, you will get to know what OpenAI Gym is, its features, and later create your own OpenAI Gym environment. The Gym toolkit, through its various environments, provides an episodic setting for reinforcement learning, where an agent's experience is broken down into a series of episodes. In each episode, the initial state of the agent is randomly sampled from a distribution, and the interaction between the agent and the environment proceeds until the environment reaches a terminal state. Do not worry if you are not familiar with reinforcement learning. Some of the basic environments available in the OpenAI Gym library are shown in the following screenshot: Examples of basic environments available in the OpenAI Gym with a short description of the task The OpenAI Gym natively has about 797 environments spread over different categories of tasks. The famous Atari category has the largest share with about 116 (half with screen inputs and half with RAM inputs) environments! The categories of tasks/environments supported by the toolkit are listed here: Algorithmic Atari Board games Box2D Classic control Doom (unofficial) Minecraft (unofficial) MuJoCo Soccer Toy text Robotics (newly added) The various types of environment (or tasks) available under the different categories, along with a brief description of each environment, is given next. Keep in mind that you may need some additional tools and packages installed on your system to run environments in each of these categories. To have a detailed overview of each of these categories, head over to the book. With that, you have a very good overview of all the different categories and types of environment that are available as part of the OpenAI Gym toolkit. It is worth noting that the release of the OpenAI Gym toolkit was accompanied by an OpenAI Gym website (gym.openai.com), which maintained a scoreboard for every algorithm that was submitted for evaluation. It showcased the performance of user-submitted algorithms, and some submissions were also accompanied by detailed explanations and source code. Unfortunately, OpenAI decided to withdraw support for the evaluation website. The service went offline in September 2017. Now you have a good picture of the various categories of environment available in OpenAI Gym and what each category provides you with. Next, we will look at the key features of OpenAI Gym that make it an indispensable component in many of today's advancements in intelligent agent development, especially those that use reinforcement learning or deep reinforcement learning. Understanding the features of OpenAI Gym Here, we will take a look at the key features that have made the OpenAI Gym toolkit very popular in the reinforcement learning community and led to it becoming widely adopted. Simple environment interface OpenAI Gym provides a simple and common Python interface to environments. Specifically, it takes an action as input and provides observation, reward, done and an optional info object, based on the action as the output at each step. If this does not make perfect sense to you yet, do not worry. We will go over the interface again in a more detailed manner to help you understand. This paragraph is just to give you an overview of the interface to make it clear how simple it is. This provides great flexibility for users as they can design and develop their agent algorithms based on any paradigm they like, and not be constrained to use any particular paradigm because of this simple and convenient interface. Comparability and reproducibility We intuitively feel that we should be able to compare the performance of an agent or an algorithm in a particular task to the performance of another agent or algorithm in the same task. For example, if an agent gets a score of 1,000 on average in the Atari game of Space Invaders, we should be able to tell that this agent is performing worse than an agent that scores 5000 on average in the Space Invaders game in the same amount of training time. But what happens if the scoring system for the game is slightly changed? Or if the environment interface was modified to include additional information about the game states that will provide an advantage to the second agent? This would make the score-to-score comparison unfair, right? To handle such changes in the environment, OpenAI Gym uses strict versioning for environments. The toolkit guarantees that if there is any change to an environment, it will be accompanied by a different version number. Therefore, if the original version of the Atari Space Invaders game environment was named SpaceInvaders-v0 and there were some changes made to the environment to provide more information about the game states, then the environment's name would be changed to SpaceInvaders-v1. This simple versioning system makes sure we are always comparing performance measured on the exact same environment setup. This way, the results obtained are comparable and reproducible. Ability to monitor progress All the environments available as part of the Gym toolkit are equipped with a monitor. This monitor logs every time step of the simulation and every reset of the environment. What this means is that the environment automatically keeps track of how our agent is learning and adapting with every step. You can even configure the monitor to automatically record videos of the game while your agent is learning to play. Creating your first OpenAI Gym environment This section provides a quick way to get started with the OpenAI Gym Python API on Linux and macOS using virtualenv so that you can get a sneak peak into the Gym! MacOS and Ubuntu Linux systems come with Python installed by default. You can check which version of Python is installed by running python --version from a terminal window. If this returns python followed by a version number, then you are good to proceed to the next steps! If you get an error saying the Python command was not found, then you have to install Python. Install virtualenv: $pip install virtualenv If pip is not installed on your system, you can install it by typing sudo easy_install pip. Create a virtual environment named openai-gym using the virtualenv tool: $virtualenv openai-gym Activate the openai-gym virtual environment: $source openai-gym/bin/activate Install all the packages for the Gym toolkit from upstream: $pip install -U gym If you get permission denied or failed with error code 1 when you run the pip install command, it is most likely because the permissions on the directory you are trying to install the package to (the openai-gym directory inside virtualenv in this case) needs special/root privileges. You can either run sudo -H pip install -U gym[all] to solve the issue or change permissions on the openai-gym directory by running sudo chmod -R o+rw ~/openai-gym. Test to make sure the installation is successful: $python -c 'import gym; gym.make("CartPole-v0");' Creating and visualizing a new Gym environment In just a minute or two, you have created an instance of an OpenAI Gym environment to get started! Let's open a new Python prompt and import the gym module: >>import gym Once the gym module is imported, we can use the gym.make method to create our new environment like this: >>env = gym.make('CartPole-v0') >>env.reset() env.render() This will bring up a window like this: Hooray! Summary In this post, you learned what OpenAI Gym is, its features, and created your first OpenAI Gym environment. You now have a very good idea about OpenAI Gym. If you've enjoyed this post, head over to the book, Hands-On Intelligent Agents with OpenAI Gym, to know about other latest learning environments and learning algorithms. Extending OpenAI Gym environments with Wrappers and Monitors [Tutorial] How to build a cartpole game using OpenAI Gym Top 5 tools for reinforcement learningTop 5 tools for reinforcement learning
Read more
  • 0
  • 0
  • 25879

article-image-how-get-started-redux-react-native
Emilio Rodriguez
04 Apr 2016
5 min read
Save for later

How To Get Started with Redux in React Native

Emilio Rodriguez
04 Apr 2016
5 min read
In mobile development there is a need for architectural frameworks, but complex frameworks designed to be used in web environments may end up damaging the development process or even the performance of our app. Because of this, some time ago I decided to introduce in all of my React Native projects the leanest framework I ever worked with: Redux. Redux is basically a state container for JavaScript apps. It is 100 percent library-agnostic so you can use it with React, Backbone, or any other view library. Moreover, it is really small and has no dependencies, which makes it an awesome tool for React Native projects. Step 1: Install Redux in your React Native project. Redux can be added as an npm dependency into your project. Just navigate to your project’s main folder and type: npm install --save react-redux By the time this article was written React Native was still depending on React Redux 3.1.0 since versions above depended on React 0.14, which is not 100 percent compatible with React Native. Because of this, you will need to force version 3.1.0 as the one to be dependent on in your project. Step 2: Set up a Redux-friendly folder structure. Of course, setting up the folder structure for your project is totally up to every developer but you need to take into account that you will need to maintain a number of actions, reducers, and components. Besides, it’s also useful to keep a separate folder for your API and utility functions so these won’t be mixing with your app’s core functionality. Having this in mind, this is my preferred folder structure under the src folder in any React Native project: Step 3: Create your first action. In this article we will be implementing a simple login functionality to illustrate how to integrate Redux inside React Native. A good point to start this implementation is the action, a basic function called from the component whenever we want the whole state of the app to be changed (i.e. changing from the logged out state into the logged in state). To keep this example as concise as possible we won’t be doing any API calls to a backend – only the pure Redux integration will be explained. Our action creator is a simple function returning an object (the action itself) with a type attribute expressing what happened with the app. No business logic should be placed here; our action creators should be really plain and descriptive. Step 4: Create your first reducer. Reducers are the ones in charge of updating the state of the app. Unlike in Flux, Redux only has one store for the whole app, but it will be conveniently name-spaced automatically by Redux once the reducers have been applied. In our example, the user reducer needs to be aware of when the user is logged in. Because of that, it needs to import the LOGIN_SUCCESS constant we defined in our actions before and export a default function, which will be called by Redux every time an action occurs in the app. Redux will automatically pass the current state of the app and the action occurred. It’s up to the reducer to realize if it needs to modify the state or not based on the action.type. That’s why almost every time our reducer will be a function containing a switch statement, which modifies and returns the state based on what action occurred. It’s important to state that Redux works with object references to identify when the state is changed. Because of this, the state should be cloned before any modification. It’s also interesting to know that the action passed to the reducers can contain other attributes apart from type. For example, when doing a more complex login, the user first name and last name can be added to the action by the action created and used by the reducer to update the state of the app. Step 5: Create your component. This step is almost pure React Native coding. We need a component to trigger the action and to respond to the change of state in the app. In our case it will be a simple View containing a button that disappears when logged in. This is a normal React Native component except for some pieces of the Redux boilerplate: The three import lines at the top will require everything we need from Redux ‘mapStateToProps’ and ‘mapDispatchToProps’ are two functions bound with ‘connect’ to the component: this makes Redux know that this component needs to be passed a piece of the state (everything under ‘userReducers’) and all the actions available in the app. Just by doing this, we will have access to the login action (as it is used in the onLoginButtonPress) and to the state of the app (as it is used in the !this.props.user.loggedIn statement) Step 6: Glue it all from your index.ios.js. For Redux to apply its magic, some initialization should be done in the main file of your React Native project (index.ios.js). This is pure boilerplate and only done once: Redux needs to inject a store holding the app state into the app. To do so, it requires a ‘Provider’ wrapping the whole app. This store is basically a combination of reducers. For this article we only need one reducer, but a full app will include many others and each of them should be passed into the combineReducers function to be taken into account by Redux whenever an action is triggered. About the Author Emilio Rodriguez started working as a software engineer for Sun Microsystems in 2006. Since then, he has focused his efforts on building a number of mobile apps with React Native while contributing to the React Native project. These contributions helped his understand how deep and powerful this framework is.
Read more
  • 0
  • 0
  • 25712