Design Patterns

12 Jan 2016

11 min read

Façade Pattern – Being Adaptive with Façade

12 Jan 2016

In this article by Chetan Giridhar, author of the book, Learning Python Design Patterns - Second Edition, we will get introduced to the Façade design pattern and how it is used in software application development. We will work with a sample use case and implement it in Python v3.5. In brief, we will cover the following topics in this article: An understanding of the Façade design pattern with a UML diagram A real-world use case with the Python v3.5 code implementation The Façade pattern and principle of least knowledge (For more resources related to this topic, see here.) Understanding the Façade design pattern Façade is generally referred to as the face of the building, especially an attractive one. It can be also referred to as a behavior or appearance that gives a false idea of someone's true feelings or situation. When people walk past a façade, they can appreciate the exterior face but aren't aware of the complexities of the structure within. This is how a façade pattern is used. Façade hides the complexities of the internal system and provides an interface to the client that can access the system in a very simplified way. Consider an example of a storekeeper. Now, when you, as a customer, visit a store to buy certain items, you're not aware of the layout of the store. You typically approach the storekeeper who is well aware of the store system. Based on your requirements, the storekeeper picks up items and hands them over to you. Isn't this easy? The customer need not know how the store looks and s/he gets the stuff done through a simple interface, the storekeeper. The Façade design pattern essentially does the following: It provides a unified interface to a set of interfaces in a subsystem and defines a high-level interface that helps the client use the subsystem in an easy way. Façade discusses representing a complex subsystem with a single interface object. It doesn't encapsulate the subsystem but actually combines the underlying subsystems. It promotes the decoupling of the implementation with multiple clients. A UML class diagram We will now discuss the Façade pattern with the help of the following UML diagram: As we observe the UML diagram, you'll realize that there are three main participants in this pattern: Façade: The main responsibility of a façade is to wrap up a complex group of subsystems so that it can provide a pleasing look to the outside world. System: This represents a set of varied subsystems that make the whole system compound and difficult to view or work with. Client: The client interacts with the Façade so that it can easily communicate with the subsystem and get the work completed. It doesn't have to bother about the complex nature of the system. You will now learn a little more about the three main participants from the data structure's perspective. Façade The following points will give us a better idea of Façade: It is an interface that knows which subsystems are responsible for a request It delegates the client's requests to the appropriate subsystem objects using composition For example, if the client is looking for some work to be accomplished, it need not have to go to individual subsystems but can simply contact the interface (Façade) that gets the work done System In the Façade world, System is an entity that performs the following: It implements subsystem functionality and is represented by a class. Ideally, a System is represented by a group of classes that are responsible for different operations. It handles the work assigned by the Façade object but has no knowledge of the façade and keeps no reference to it. For instance, when the client requests the Façade for a certain service, Façade chooses the right subsystem that delivers the service based on the type of service Client Here's how we can describe the client: The client is a class that instantiates the Façade It makes requests to the Façade to get the work done from the subsystems Implementing the Façade pattern in the real world To demonstrate the applications of the Façade pattern, let's take an example that we'd have experienced in our lifetime. Consider that you have a marriage in your family and you are in charge of all the arrangements. Whoa! That's a tough job on your hands. You have to book a hotel or place for marriage, talk to a caterer for food arrangements, organize a florist for all the decorations, and finally handle the musical arrangements expected for the event. In yesteryears, you'd have done all this by yourself, such as talking to the relevant folks, coordinating with them, negotiating on the pricing, but now life is simpler. You go and talk to an event manager who handles this for you. S/he will make sure that they talk to the individual service providers and get the best deal for you. From the Façade pattern perspective we will have the following three main participants: Client: It's you who need all the marriage preparations to be completed in time before the wedding. They should be top class and guests should love the celebrations. Façade: The event manager who's responsible for talking to all the folks that need to work on specific arrangements such as food, flower decorations, among others Subsystems: They represent the systems that provide services such as catering, hotel management, and flower decorations Let's develop an application in Python v3.5 and implement this use case. We start with the client first. It's you! Remember, you're the one who has been given the responsibility to make sure that the marriage preparations are done and the event goes fine! However, you're being clever here and passing on the responsibility to the event manager, isn't it? Let's now look at the You class. In this example, you create an object of the EventManager class so that the manager can work with the relevant folks on marriage preparations while you relax. class You(object): def __init__(self): print("You:: Whoa! Marriage Arrangements??!!!") def askEventManager(self): print("You:: Let's Contact the Event Manager\n\n") em = EventManager() em.arrange() def __del__(self): print("You:: Thanks to Event Manager, all preparations done! Phew!") Let's now move ahead and talk about the Façade class. As discussed earlier, the Façade class simplifies the interface for the client. In this case, EventManager acts as a façade and simplifies the work for You. Façade talks to the subsystems and does all the booking and preparations for the marriage on your behalf. Here is the Python code for the EventManager class: class EventManager(object): def __init__(self): print("Event Manager:: Let me talk to the folks\n") def arrange(self): self.hotelier = Hotelier() self.hotelier.bookHotel() self.florist = Florist() self.florist.setFlowerRequirements() self.caterer = Caterer() self.caterer.setCuisine() self.musician = Musician() self.musician.setMusicType() Now that we're done with the Façade and client, let's dive into the subsystems. We have developed the following classes for this scenario: Hotelier is for the hotel bookings. It has a method to check whether the hotel is free on that day (__isAvailable) and if it is free for booking the Hotel (bookHotel). The Florist class is responsible for flower decorations. Florist has the setFlowerRequirements() method to be used to set the expectations on the kind of flowers needed for the marriage decoration. The Caterer class is used to deal with the caterer and is responsible for the food arrangements. Caterer exposes the setCuisine() method to accept the type of cuisine to be served at the marriage. The Musician class is designed for musical arrangements at the marriage. It uses the setMusicType() method to understand the music requirements for the event. class Hotelier(object): def __init__(self): print("Arranging the Hotel for Marriage? --") def __isAvailable(self): print("Is the Hotel free for the event on given day?") return True def bookHotel(self): if self.__isAvailable(): print("Registered the Booking\n\n") class Florist(object): def __init__(self): print("Flower Decorations for the Event? --") def setFlowerRequirements(self): print("Carnations, Roses and Lilies would be used for Decorations\n\n") class Caterer(object): def __init__(self): print("Food Arrangements for the Event --") def setCuisine(self): print("Chinese & Continental Cuisine to be served\n\n") class Musician(object): def __init__(self): print("Musical Arrangements for the Marriage --") def setMusicType(self): print("Jazz and Classical will be played\n\n") you = You() you.askEventManager() The output of the preceding code is given here: In the preceding code example: The EventManager class is the Façade that simplifies the interface for You EventManager uses composition to create objects of the subsystems such as Hotelier, Caterer, and others The principle of least knowledge As you have learned in the initial parts of this article, the Façade provides a unified system that makes subsystems easy to use. It also decouples the client from the subsystem of components. The design principle that is employed behind the Façade pattern is the principle of least knowledge. The principle of least knowledge guides us to reduce the interactions between objects to just a few friends that are close enough to you. In real terms, it means the following:: When designing a system, for every object created, one should look at the number of classes that it interacts with and the way in which the interaction happens. Following the principle, make sure that we avoid situations where there are many classes created tightly coupled to each other. If there are a lot of dependencies between classes, the system becomes hard to maintain. Any changes in one part of the system can lead to unintentional changes to other parts of the system, which means that the system is exposed to regressions and this should be avoided. Summary We began the article by first understanding the Façade design pattern and the context in which it's used. We understood the basis of Façade and how it is effectively used in software architecture. We looked at how Façade design patterns create a simplified interface for clients to use. It simplifies the complexity of subsystems so that the client benefits. The Façade doesn't encapsulate the subsystem and the client is free to access the subsystems even without going through the Façade. You also learned the pattern with a UML diagram and sample code implementation in Python v3.5. We understood the principle of least knowledge and how its philosophy governs the Façade design patterns. Further resources on this subject: Asynchronous Programming with Python [article] Optimization in Python [article] The Essentials of Working with Python Collections [article]

0
0
3846

Packt

20 Oct 2015

9 min read

Application Patterns

Packt

20 Oct 2015

9 min read

0
0
1211

Packt

25 Sep 2015

15 min read

Patterns of Traversing

Packt

25 Sep 2015

15 min read

In this article by Ryan Lemmer, author of the book Haskell Design Patterns, we will focus on two fundamental patterns of recursion: fold and map. The more primitive forms of these patterns are to be found in the Prelude, the "old part" of Haskell. With the introduction of Applicative, came more powerful mapping (traversal), which opened the door to type-level folding and mapping in Haskell. First, we will look at how Prelude's list fold is generalized to all Foldable containers. Then, we will follow the generalization of list map to all Traversable containers. Our exploration of fold and map culminates with the Lens library, which raises Foldable and Traversable to an even higher level of abstraction and power. In this article, we will cover the following: Traversable Modernizing Haskell Lenses (For more resources related to this topic, see here.) Traversable As with Prelude.foldM, mapM fails us beyond lists, for example, we cannot mapM over the Tree from earlier: main = mapM doF aTree >>= print -- INVALID The Traversable type-class is to map in the same way as Foldable is to fold: -- required: traverse or sequenceA class (Functor t, Foldable t) => Traversable (t :: * -> *) where -- APPLICATIVE form traverse :: Applicative f => (a -> f b) -> t a -> f (t b) sequenceA :: Applicative f => t (f a) -> f (t a) -- MONADIC form (redundant) mapM :: Monad m => (a -> m b) -> t a -> m (t b) sequence :: Monad m => t (m a) -> m (t a) The traverse fuction generalizes our mapA function, which was written for lists, to all Traversable containers. Similarly, Traversable.mapM is a more general version of Prelude.mapM for lists: mapM :: Monad m => (a -> m b) -> [a] -> m [b] mapM :: Monad m => (a -> m b) -> t a -> m (t b) The Traversable type-class was introduced along with Applicative: "we introduce the type class Traversable, capturing functorial data structures through which we can thread an applicative computation" Applicative Programming with Effects - McBride and Paterson A Traversable Tree Let's make our Traversable Tree. First, we'll do it the hard way: – a Traversable must also be a Functor and Foldable: instance Functor Tree where fmap f (Leaf x) = Leaf (f x) fmap f (Node x lTree rTree) = Node (f x) (fmap f lTree) (fmap f rTree) instance Foldable Tree where foldMap f (Leaf x) = f x foldMap f (Node x lTree rTree) = (foldMap f lTree) `mappend` (f x) `mappend` (foldMap f rTree) --traverse :: Applicative ma => (a -> ma b) -> mt a -> ma (mt b) instance Traversable Tree where traverse g (Leaf x) = Leaf <$> (g x) traverse g (Node x ltree rtree) = Node <$> (g x) <*> (traverse g ltree) <*> (traverse g rtree) data Tree a = Node a (Tree a) (Tree a) | Leaf a deriving (Show) aTree = Node 2 (Leaf 3) (Node 5 (Leaf 7) (Leaf 11)) -- import Data.Traversable main = traverse doF aTree where doF n = do print n; return (n * 2) The easier way to do this is to auto-implement Functor, Foldable, and Traversable: {-# LANGUAGE DeriveFunctor #-} {-# LANGUAGE DeriveFoldable #-} {-# LANGUAGE DeriveTraversable #-} import Data.Traversable data Tree a = Node a (Tree a) (Tree a)| Leaf a deriving (Show, Functor, Foldable, Traversable) aTree = Node 2 (Leaf 3) (Node 5 (Leaf 7) (Leaf 11)) main = traverse doF aTree where doF n = do print n; return (n * 2) Traversal and the Iterator pattern The Gang of Four Iterator pattern is concerned with providing a way "...to access the elements of an aggregate object sequentially without exposing its underlying representation" "Gang of Four" Design Patterns, Gamma et al, 1995 In The Essence of the Iterator Pattern, Jeremy Gibbons shows precisely how the Applicative traversal captures the Iterator pattern. The Traversable.traverse class is the Applicative version of Traversable.mapM, which means it is more general than mapM (because Applicative is more general than Monad). Moreover, because mapM does not rely on the Monadic bind chain to communicate between iteration steps, Monad is a superfluous type for mapping with effects (Applicative is sufficient). In other words, Applicative traverse is superior to Monadic traversal (mapM): "In addition to being parametrically polymorphic in the collection elements, the generic traverse operation is parametrised along two further dimensions: the datatype being tra- versed, and the applicative functor in which the traversal is interpreted" "The improved compositionality of applicative functors over monads provides better glue for fusion of traversals, and hence better support for modular programming of iterations" The Essence of the Iterator Pattern - Jeremy Gibbons Modernizing Haskell 98 The introduction of Applicative, along with Foldable and Traversable, had a big impact on Haskell. Foldable and Traversable lift Prelude fold and map to a much higher level of abstraction. Moreover, Foldable and Traversable also bring a clean separation between processes that preserve or discard the shape of the structure that is being processed. Traversable describes processes that preserve that shape of the data structure being traversed over. Foldable processes, in turn, discard or transform the shape of the structure being folded over. Since Traversable is a specialization of Foldable, we can say that shape preservation is a special case of shape transformation. This line between shape preservation and transformation is clearly visible from the fact that functions that discard their results (for example, mapM_, forM_, sequence_, and so on) are in Foldable, while their shape-preserving counterparts are in Traversable. Due to the relatively late introduction of Applicative, the benefits of Applicative, Foldable, and Traversable have not found their way into the core of the language. This is due to the change with the Foldable Traversable In Prelude proposal (planned for inclusion in the core libraries from GHC 7.10). For more information, visit https://wiki.haskell.org/Foldable_Traversable_In_Prelude. This will involve replacing less generic functions in Prelude, Control.Monad, and Data.List with their more polymorphic counterparts in Foldable and Traversable. There have been objections to the movement to modernize, the main concern being that more generic types are harder to understand, which may compromise Haskell as a learning language. These valid concerns will indeed have to be addressed, but it seems certain that the Haskell community will not resist climbing to new abstract heights. Lenses A Lens is a type that provides access to a particular part of a data structure. Lenses express a high-level pattern for composition. However, Lens is also deeply entwined with Traversable, and so we describe it in this article instead. Lenses relate to the getter and setter functions, which also describe access to parts of data structures. To find our way to the Lens abstraction (as per Edward Kmett's Lens library), we'll start by writing a getter and setter to access the root node of a Tree. Deriving Lens Returning to our Tree from earlier: data Tree a = Node a (Tree a) (Tree a) | Leaf a deriving (Show) intTree = Node 2 (Leaf 3) (Node 5 (Leaf 7) (Leaf 11)) listTree = Node [1,1] (Leaf [2,1]) (Node [3,2] (Leaf [5,2]) (Leaf [7,4])) tupleTree = Node (1,1) (Leaf (2,1)) (Node (3,2) (Leaf (5,2)) (Leaf (7,4))) Let's start by writing generic getter and setter functions: getRoot :: Tree a -> a getRoot (Leaf z) = z getRoot (Node z _ _) = z setRoot :: Tree a -> a -> Tree a setRoot (Leaf z) x = Leaf x setRoot (Node z l r) x = Node x l r main = do print $ getRoot intTree print $ setRoot intTree 11 print $ getRoot (setRoot intTree 11) If we want to pass in a setter function instead of setting a value, we use the following: fmapRoot :: (a -> a) -> Tree a -> Tree a fmapRoot f tree = setRoot tree newRoot where newRoot = f (getRoot tree) We have to do a get, apply the function, and then set the result. This double work is akin to the double traversal we saw when writing traverse in terms of sequenceA. In that case we resolved the issue by defining traverse first (and then sequenceA i.t.o. traverse): We can do the same thing here by writing fmapRoot to work in a single step (and then rewriting setRoot' i.t.o. fmapRoot'): fmapRoot' :: (a -> a) -> Tree a -> Tree a fmapRoot' f (Leaf z) = Leaf (f z) fmapRoot' f (Node z l r) = Node (f z) l r setRoot' :: Tree a -> a -> Tree a setRoot' tree x = fmapRoot' (_ -> x) tree main = do print $ setRoot' intTree 11 print $ fmapRoot' (*2) intTree The fmapRoot' function delivers a function to a particular part of the structure and returns the same structure: fmapRoot' :: (a -> a) -> Tree a -> Tree a To allow for I/O, we need a new function: fmapRootIO :: (a -> IO a) -> Tree a -> IO (Tree a) We can generalize this beyond I/O to all Monads: fmapM :: (a -> m a) -> Tree a -> m (Tree a) It turns out that if we relax the requirement for Monad, and generalize f' to all the Functor container types, then we get a simple van Laarhoven Lens! type Lens' s a = Functor f' => (a -> f' a) -> s -> f' s The remarkable thing about a van Laarhoven Lens is that given the preceding function type, we also gain "get", "set", "fmap", "mapM", and many other functions and operators. The Lens function type signature is all it takes to make something a Lens that can be used with the Lens library. It is unusual to use a type signature as "primary interface" for a library. The immediate benefit is that we can define a lens without referring to the Lens library. We'll explore more benefits and costs to this approach, but first let's write a few lenses for our Tree. The derivation of the Lens abstraction used here has been based on Jakub Arnold's Lens tutorial, which is available at http://blog.jakubarnold.cz/2014/07/14/lens-tutorial-introduction-part-1.html. Writing a Lens A Lens is said to provide focus on an element in a data structure. Our first lens will focus on the root node of a Tree. Using the lens type signature as our guide, we arrive at: lens':: Functor f => (a -> f' a) -> s -> f' s root :: Functor f' => (a -> f' a) -> Tree a -> f' (Tree a) Still, this is not very tangible; fmapRootIO is easier to understand with the Functor f' being IO: fmapRootIO :: (a -> IO a) -> Tree a -> IO (Tree a) fmapRootIO g (Leaf z) = (g z) >>= return . Leaf fmapRootIO g (Node z l r) = (g z) >>= return . (x -> Node x l r) displayM x = print x >> return x main = fmapRootIO displayM intTree If we drop down from Monad into Functor, we have a Lens for the root of a Tree: root :: Functor f' => (a -> f' a) -> Tree a -> f' (Tree a) root g (Node z l r) = fmap (x -> Node x l r) (g z) root g (Leaf z) = fmap Leaf (g z) As Monad is a Functor, this function also works with Monadic functions: main = root displayM intTree As root is a lens, the Lens library gives us the following: -– import Control.Lens main = do -- GET print $ view root listTree print $ view root intTree -- SET print $ set root [42] listTree print $ set root 42 intTree -- FMAP print $ over root (+11) intTree The over is the lens way of fmap'ing a function into a Functor. Composable getters and setters Another Lens on Tree might be to focus on the rightmost leaf: rightMost :: Functor f' => (a -> f' a) -> Tree a -> f' (Tree a) rightMost g (Node z l r) = fmap (r' -> Node z l r') (rightMost g r) rightMost g (Leaf z) = fmap (x -> Leaf x) (g z) The Lens library provides several lenses for Tuple (for example, _1 which brings focus to the first Tuple element). We can compose our rightMost lens with the Tuple lenses: main = do print $ view rightMost tupleTree print $ set rightMost (0,0) tupleTree -- Compose Getters and Setters print $ view (rightMost._1) tupleTree print $ set (rightMost._1) 0 tupleTree print $ over (rightMost._1) (*100) tupleTree A Lens can serve as a getter, setter, or "function setter". We are composing lenses using regular function composition (.)! Note that the order of composition is reversed in (rightMost._1) the rightMost lens is applied before the _1 lens. Lens Traversal A Lens focuses on one part of a data structure, not several, for example, a lens cannot focus on all the leaves of a Tree: set leaves 0 intTree over leaves (+1) intTree To focus on more than one part of a structure, we need a Traversal class, the Lens generalization of Traversable). Whereas Lens relies on Functor, Traversal relies on Applicative. Other than this, the signatures are exactly the same: traversal :: Applicative f' => (a -> f' a) -> Tree a -> f' (Tree a) lens :: Functor f'=> (a -> f' a) -> Tree a -> f' (Tree a) A leaves Traversal delivers the setter function to all the leaves of the Tree: leaves :: Applicative f' => (a -> f' a) -> Tree a -> f' (Tree a) leaves g (Node z l r) = Node z <$> leaves g l <*> leaves g r leaves g (Leaf z) = Leaf <$> (g z) We can use set and over functions with our new Traversal class: set leaves 0 intTree over leaves (+1) intTree The Traversals class compose seamlessly with Lenses: main = do -- Compose Traversal + Lens print $ over (leaves._1) (*100) tupleTree -- Compose Traversal + Traversal print $ over (leaves.both) (*100) tupleTree -- map over each elem in target container (e.g. list) print $ over (leaves.mapped) (*(-1)) listTree -- Traversal with effects mapMOf leaves displayM tupleTree (The both is a Tuple Traversal that focuses on both elements). Lens.Fold The Lens.Traversal lifts Traversable into the realm of lenses: main = do print $ sumOf leaves intTree print $ anyOf leaves (>0) intTree The Lens Library We used only "simple" Lenses so far. A fully parametrized Lens allows for replacing parts of a data structure with different types: type Lens s t a b = Functor f' => (a -> f' b) -> s -> f' t –- vs simple Lens type Lens' s a = Lens s s a a Lens library function names do their best to not clash with existing names, for example, postfixing of idiomatic function names with "Of" (sumOf, mapMOf, and so on), or using different verb forms such as "droppingWhile" instead of "dropWhile". While this creates a burden as i.t.o has to learn new variations, it does have a big plus point—it allows for easy unqualified import of the Lens library. By leaving the Lens function type transparent (and not obfuscating it with a new type), we get Traversals by simply swapping out Functor for Applicative. We also get to define lenses without having to reference the Lens library. On the downside, Lens type signatures can be bewildering at first sight. They form a language of their own that requires effort to get used to, for example: mapMOf :: Profunctor p => Over p (WrappedMonad m) s t a b -> p a (m b) -> s -> m t foldMapOf :: Profunctor p => Accessing p r s a -> p a r -> s -> r On the surface, the Lens library gives us composable getters and setters, but there is much more to Lenses than that. By generalizing Foldable and Traversable into Lens abstractions, the Lens library lifts Getters, Setters, Lenses, and Traversals into a unified framework in which they are all compose together. Edward Kmett's Lens library is a sprawling masterpiece that is sure to leave a lasting impact on idiomatic Haskell. Summary We started with Lists (Haskel 98), then generalizing for all Traversable containers (Introduced in the mid-2000s). Following that, we saw how the Lens library (2012) places traversing in an even broader context. Lenses give us a unified vocabulary to navigate data structures, which explains why it has been described as a "query language for data structures". Resources for Article: Further resources on this subject: Plotting in Haskell[article] The Hunt for Data[article] Getting started with Haskell [article]

0
0
2947

Packt

22 Sep 2015

18 min read

Cassandra Design Patterns

Packt

22 Sep 2015

18 min read

In this article by Rajanarayanan Thottuvaikkatumana, author of the book Cassandra Design Patterns, Second Edition, the author has discussed how Apache Cassandra is one of the most popular NoSQL data stores. He states this based on the research paper Dynamo: Amazon’s Highly Available Key-Value Store and the research paper Bigtable: A Distributed Storage System for Structured Data. Cassandra is implemented with best features from both of these research papers. In general, NoSQL data stores can be classified into the following groups: Key-value data store Column family data store Document data store Graph data store Cassandra belongs to the column family data store group. Cassandra’s peer-to-peer architecture avoids single point failures in the cluster of Cassandra nodes and gives the ability to distribute the nodes across racks or data centres. This makes Cassandra a linearly scalable data store. In other words, the more processing you need, the more Cassandra nodes you can add to your cluster. Cassandra’s multi data centre support makes it a perfect choice to replicate the data stores across data centres for disaster recovery, high availability, separating transaction processing, analytical environments, and for building resiliency into the data store infrastructure. Design patterns in Cassandra The term “design patterns” is a highly misinterpreted term in the software development community. In an extremely general sense, it is a set of solutions for some known problems in quite a specific context. It is used in this book to describe a pattern of using certain features of Cassandra to solve some real-world problems. This book is a collection of such design patterns with real-world examples. Coexistence patterns Cassandra is one of the highly successful NoSQL data stores, which is greatly similar to the traditional RDBMS. Cassandra column families (also known as Cassandra tables), in a logical perspective, have a similarity with RDBMS-based tables in the view of the users, even though the underlying structure of these tables are totally different. Because of this, Cassandra is best fit to be deployed along with the traditional RDBMS to solve some of the problems that RDBMS is not able to handle. The caveat here is that because of the similarity of RDBMS tables and Cassandra column families in the view of the end users, many users and data modelers try to use Cassandra in the exact the same way as the RDBMS schema is being modeled, used, and getting into serious deployment issues. How do you prevent such pitfalls? The key here is to understand the differences in a theoretical perspective as well as in a practical perspective, and follow best practices prescribed by the creators of Cassandra. Where do you start with Cassandra? The best place to look at is the new application development requirements and take it from there. Look at the cases where there is a need to normalize the RDBMS tables and keep all the data items together, which would have got distributed if you were to design the same solution in RDBMS. Instead of thinking from the pure data model perspective, start thinking in terms of the application's perspective. How the data is generated by the application, what are the read requirements, what are the write requirements, what is the response time expected out of some of the use cases, and so on. Depending on these aspects, design the data model. In the big data world, the application becomes the first class citizen and the data model leaves the driving seat in the application design. Design the data model to serve the needs of the applications. In any organization, new reporting requirements come all the time. The major challenge in to generate reports is the underlying data store. In the RDBMS world, reporting is always a challenge. You may have to join multiple tables to generate even simple reports. Even though the RDBMS objects such as views, stored procedures, and indexes maybe used to get the desired data for the reports, when the report is being generated, the query plan is going to be very complex most of the time. The consumption of processing power is another need to consider when generating such reports on the fly. Because of these complexities, many times, for reporting requirements, it is common to keep separate tables containing data exported from the transactional tables. This is a great opportunity to start with NoSQL stores like Cassandra as a reporting data store. Data aggregation and summarization are common requirements in any organization. This helps to control the data growth by storing only the summary statistics and moving the transactional data into archives. Many times, this aggregated and summarized data is used for statistical analysis. Making the summary accurate and easily accessible is a big challenge. Most of the time, data aggregation and reporting goes hand in hand. The aggregated data is heavily used in reports. The aggregation process speeds up the queries to a great extent. This is another place where you can start with NoSQL stores like Cassandra. The coexistence of RDBMS and NoSQL data stores like Cassandra is very much possible, feasible, and sensible; and this is the only way to get started with the NoSQL movement, unless you embark on a totally new product development from scratch. In summary, this section of the book discusses about some design patterns related to de-normalization, reporting, and aggregation of data using Cassandra as the preferred NoSQL data store. RDBMS migration patterns A big bang approach to any kind of technology migration is not advisable. A series of deliberations have to happen before the eventual and complete change over. Migration from RDBMS to Cassandra is not different at all. Any new technology replacing an old one must coexist harmoniously, at least for a short period of time. This gives a lot of confidence on the new technology to the stakeholders. Many technology pundits give various approaches on the RDBMS to NoSQL migration strategies. Many such guidelines are specific to the particular NoSQL data stores giving attention to specific areas, and most of the time, this will end up on the process rather than the technology. The migration from RDBMS to Cassandra is not an easy task. Mainly because the RDBMS-based systems are really time tested and trust worthy in most of the organizations. So, migrating from such a robust RDBMS-based system to Cassandra is not going to be easy for anyone. One of the best approaches to achieve this goal is to exploit some of the new or unique features in Cassandra, which many of the traditional RDBMS don't have. This also prevents the usage of Cassandra just like any other RDBMS. Cassandra is unique. Cassandra is not an RDBMS. The approach of banking on the unique features is not only applicable to the RDBMS to Cassandra migration, but also to any migration from one paradigm to another. Some of the design patterns that are discussed in this section of the book revolve around very simple and important features of Cassandra, but have profound application potential when designing the next generation NoSQL data stores using Cassandra. A wise usage of these unique features in Cassandra will give a head start on the eventual and complete migration from RDBMS. The modeling of collection objects in RDBMS is a real pain, because multiple tables are to be defined and a join is required to access data. Many RDBMS offer this by providing capability to define user-defined data types, but there is absolutely no standardization at all in this space. Collection objects are very commonly seen in the real-world applications. A list of actions, tuple of related values, set of objects, dictionaries, and things like that come quite often in applications. Cassandra has elegant ways to model this because they are data types in column families. Counting is a very commonly required process in many business processes and applications. In RDBMS, this has to be modeled as integers or long numbers, but many times, applications make big mistakes in using them in wrong ways. Cassandra has a counter data type in the column family that alleviates this problem. Getting rid of unwanted records from an RDBMS table is not an automatic process. When some application events occur, they have to be removed by application programs or through some other means. But in many situations, many data items will have a preallocated time to live. They should go away without the intervention of any external events. Cassandra has a way to assign time-to-live (TTL) attribute to data items. By making use of TTL, data items get removed without any other external event's intervention. All the design patterns covered in this section of the book revolve around some of the new features of Cassandra that will make the migration from RDBMS to Cassandra an easy task. Cache migration pattern Database access whether it is from RDBMS or other highly distributed NoSQL data stores is always an input/output (I/O) intensive operation. It makes perfect sense to cache the frequently used, but reasonably static data for fast access for the applications consuming this data. In such situations, the in-memory cache is preferred to the repeated database access for each request. Using cache is not always a pleasant experience. Getting into really weird problems such as data loss, data getting out of sync with its source and other data integrity problems are very common. It is very common to see wrong components coming into the enterprise solution stack all the time for various reasons. Overlooking on some of the features and adopting the technology without much background work is a very common pitfall. Many a times, the use of cache comes into the solution stack to reduce the latency of the responses. Once the initial results are favorable, more and more data will get tossed into the cache. Slowly, this will become a practice to see that more and more data is getting into cache. Now is the time when problems start popping up one by one. Pure in-memory cache solutions are favored by everybody, by the virtue of its ability to serve the data quickly until you start loosing data. This is because of the faults in the system, along with application and node crashes. Cache serves data much faster than being served from other data stores. But if the caching solution in use is giving data integrity problems, it is better to migrate to NoSQL data stores like Cassandra. Is Cassandra faster than the in-memory caching solutions? The obvious answer is no. But it is not as bad as many think. Cassandra can be configured to serve fast reads, and bonus comes in the form of high data integrity with strong replication capabilities. Cache is good as long as it serves its purpose without any data loss or any other data integrity issues. Emphasizing on the use case of the key/value type cache and various methods of cache to NoSQL migration are discussed in this section of the book. Cassandra cannot be used as a replacement for cache in terms of the speed of data access. But when it comes to data integrity, Cassandra shines all the time with its tuneable consistency feature. With a continual tuning and manipulating data with clean and well-written application code, data access can be improved to a great level, and it will be much better than many other data stores. The design pattern covered in this section of the book gives some guidance on migrating from caching solutions to Cassandra, if this is a must. CAP patterns When it comes to large-scale Internet applications or web services, popularly known as the Internet of Things (IoT) applications, the number of components are huge and the way they are distributed is beyond imagination. There will be hundreds of application servers, hundreds of data store nodes, and many other components in the whole ecosystem. In such a scenario, for doing an atomic transaction by getting an agreement from all the components involved is, for all practical purposes, impossible. Consistency, availability, and partition tolerance are three important guarantees, popularly known as CAP guarantees that any distributed computing systems should offer even though all is not possible simultaneously. In the IoT applications, the distribution of the application nodes is unavoidable. This means that the possibility of network partition is pretty much there. So, it is mandatory to give the P guarantee. Now, the question is whether to forfeit the C guarantee or the A guarantee. At this stage, the situation is not as grave as portrayed in the CAP Theorem conjectured by Eric Brewer. For all the use cases in a given IoT application, there is no need of having 100% of C guarantee and 100% of A guarantee. So, depending on the need of the level of A guarantee, the C guarantee can be tuned. In other words, it is called tunable consistency. Depending on the way data is ingested into Cassandra, and the way it is consumed from Cassandra, tuning is possible to give best results for the appropriate read and write requirements of the applications. In some applications, the speed at which the data is written will be very high. In other words, the velocity of the data ingestion into Cassandra is very high. This falls into the write-heavy applications. In some applications, the need to read data quickly will be an important requirement. This is mainly needed in the applications where there is a lot of data processing required. Data analytics applications, batch processing applications, and so on fall under this category. These fall into the read-heavy applications. Now, there is a third category of applications where there is an equal importance for fast writes as well as fast reads. These are the kind of applications where there is a constant inflow of data, and at the same time, there is a need to read the data by clients for various purposes. This falls into the read-write balanced applications. The consistency level requirements for all the previous three types of applications are totally different. There is no one way to tune so that it is optimal for all the three types of applications. All the three applications' consistency levels are to be tuned differently from use case to use case. In this section of the book, various design patterns related to applications with the needs of fast writes, fast reads, and moderate write and read are discussed. All these design patterns revolve around using the tuneable consistency parameters of Cassandra. Whether it is for write or read and if the consistency levels are set high, the availability levels will be low and vice versa. So, by making use of the consistency level knob, the Cassandra data store can be used for various types of writing and reading use cases. Temporal patterns In any applications, the usage of data that varies over the period of time is called as temporal data, which is very important. Temporal data is needed wherever there is a need to maintain chronology. There are so many applications in which there is a huge need for storage, retrieval, and processing of data that is tied to time. The biggest challenge in dealing with temporal data stored in a data store is that they are hugely used for analytical purposes and retrieving the data, based on various sort orders in terms of time. So, the data stores that are used to capture the temporal data should be capable of storing the data strictly adhering to the chronology. There are so many usage patterns that are seen in the real world that fall into showing temporal behavior. For the classification purpose in this book, they are bucketed into three. The first one is the general time series category. The second one is the log category, such as in an audit log, a transaction log, and so on. The third one is the conversation category, such as in the conversation messages of a chat application. There is relevance in this classification, because these are commonly used across in many of the applications. In many of the applications, these are really cross cutting concerns; and designers underestimate this aspect; and finally, many of the applications will have different data stores capturing this temporal data. There is a need to have a common strategy dealing with temporal data that fall in these three commonly seen categories in an enterprise wide solution architecture. In other words, there should be a uniform way of capturing temporal data; there should be a uniform way of processing temporal data; and there should be a commonly used set of tools and libraries to manage the temporal data. Out of the three design patterns that are discussed in this section of the book, the first Time Series pattern is a general design pattern that covers the most general behavior of any kind of temporal data. The next two design patterns namely Log pattern and Conversation pattern are two special cases of the first design pattern. This section of the book covers the general nature of temporal data, some specific instances of such data items in the real-world applications, and why Cassandra is the best fit as a NoSQL data store to persist the temporal data. Temporal data comes quite often in many use cases of lots of applications. Data modeling of temporal data is very important in the Cassandra perspective for optimal storage and quick access of the data. Some common design patterns to model temporal data have been covered in this section of the book. By focusing on some very few aspects, such as the partition key, primary key, clustering column and the number of records that gets stored in a wide row of Cassandra, very effective and high performing temporal data models can be built. Analytical patterns The 3Vs of big data namely Volume, Variety, and Velocity pose another big challenge, which is the analysis of the data stored in NoSQL data stores, such as Cassandra. What are the analytics use cases? How can the distributed data be processed? What are the data transformations that are typically seen in the applications? These are the topics covered in this section of the book. Unlike other sections of this book, the focus is shifted from Cassandra to other technologies like Apache Hadoop, Hadoop MapReduce, and Apache Spark to introduce the big data analytics tool space. The design patterns such as Map/Reduce Pattern and Transformation Pattern are very commonly seen in the data analytics world. Cassandra with Apache Spark has good compatibility, and is a very ideal tool set in the data analysis use cases. This section of the book covers some data analysis aspects and mainly discusses about data processing. Data transformation is one of the major activity in data processing. Out of the many data processing patterns, Map/Reduce Pattern deserves a special mention, because it is being used in so many batch processing and analysis use cases, dealing with big data. Spark has been chosen as the tool of choice to explain the data processing activities. This section explains how a Map/Reduce kind of data processing task can be done using Cassandra. Spark has also been discussed, which is very powerful to perform online data analysis. This section of the book also covers some of the commonly seen data transformations that are used in the data processing applications. Summary Many Cassandra design patterns have been covered in this book. If the design patterns are not being used in any real-world applications, it has only theoretical value. To give a practical approach to the applicability of these design patterns, an end-to-end application is taken as a case point and described as the last chapter of the book, which is used as a vehicle to explain the applicability of the Cassandra design patterns discussed in the earlier sections of the book. Users love Cassandra because of its SQL-like interface CQL. Also, its features are very closely related to the RDBMS even though the paradigm is totally new. Application developers love Cassandra because of the plethora of drivers available in the market so that they can write applications in their preferred programming language. Architects love Cassandra because they can store structured, semi-structured, and unstructured data in it. Database administers love Cassandra because it comes with almost no maintenance overhead. Service managers love Cassandra because of the wonderful monitoring tools available in the market. CIOs love Cassandra because it gives value for their money. And Cassandra works! An application based on Cassandra will be perfect only if its features are used in the right way, and this book is an attempt to guide the Cassandra community in this direction. Resources for Article: Further resources on this subject: Cassandra Architecture [article] Getting Up and Running with Cassandra [article] Getting Started with Apache Cassandra [article]

0
0
3171

Packt

14 Sep 2015

6 min read

Getting Started with Meteor

Packt

14 Sep 2015

6 min read

In this article, based on Marcelo Reyna's book Meteor Design Patterns, we will see that when you want to develop an application of any kind, you want to develop it fast. Why? Because the faster you develop, the better your return on investment will be (your investment is time, and the real cost is the money you could have produced with that time). There are two key ingredients ofrapid web development: compilers and patterns. Compilers will help you so that youdon’t have to type much, while patterns will increase the paceat which you solve common programming issues. Here, we will quick-start compilers and explain how they relate withMeteor, a vast but simple topic. The compiler we will be looking at isCoffeeScript. (For more resources related to this topic, see here.) CoffeeScriptfor Meteor CoffeeScript effectively replaces JavaScript. It is much faster to develop in CoffeeScript, because it simplifies the way you write functions, objects, arrays, logical statements, binding, and much more.All CoffeeScript files are saved with a .coffee extension. We will cover functions, objects, logical statements, and binding, since thisis what we will use the most. Objects and arrays CoffeeScriptgets rid of curly braces ({}), semicolons (;), and commas (,). This alone saves your fingers from repeating unnecessary strokes on the keyboard. CoffeeScript instead emphasizes on the proper use of tabbing. Tabbing will not only make your code more readable (you are probably doing it already), but also be a key factor inmaking it work. Let’s look at some examples: #COFFEESCRIPT toolbox = hammer:true flashlight:false Here, we are creating an object named toolbox that contains two keys: hammer and flashlight. The equivalent in JavaScript would be this: //JAVASCRIPT - OUTPUT var toolbox = { hammer:true, flashlight:false }; Much easier! As you can see, we have to tab to express that both the hammer and the flashlight properties are a part of toolbox. The word var is not allowed in CoffeeScript because CoffeeScript automatically applies it for you. Let’stakea look at how we would createan array: #COFFEESCRIPT drill_bits = [ “1/16 in” “5/64 in” “3/32 in” “7/64 in” ] //JAVASCRIPT – OUTPUT vardrill_bits; drill_bits = [“1/16 in”,”5/64 in”,”3/32 in”,”7/64 in”]; Here, we can see we don’t need any commas, but we do need brackets to determine that this is an array. Logical statements and operators CoffeeScript also removes a lot ofparentheses (()) in logical statements and functions. This makes the logic of the code much easier to understand at the first glance. Let’s look at an example: #COFFEESCRIPT rating = “excellent” if five_star_rating //JAVASCRIPT – OUTPUT var rating; if(five_star_rating){ rating = “excellent”; } In this example, we can clearly see thatCoffeeScript is easier to read and write.Iteffectively replaces all impliedparentheses in any logical statement. Operators such as &&, ||, and !== are replaced by words. Here is a list of the operators that you will be using the most: CoffeeScript JavaScript is === isnt !== not ! and && or || true, yes, on true false, no, off false @, this this Let's look at a slightly more complex logical statement and see how it compiles: #COFFEESCRIPT # Suppose that “this” is an object that represents a person and their physical properties if@eye_coloris “green” retina_scan = “passed” else retina_scan = “failed” //JAVASCRIPT - OUTPUT if(this.eye_color === “green”){ retina_scan = “passed”; } else { retina_scan = “failed”; } When using @eye_color to substitute for this.eye_color, notice that we do not need . Functions JavaScript has a couple of ways of creating functions. They look like this: //JAVASCRIPT //Save an anonymous function onto a variable varhello_world = function(){ console.log(“Hello World!”); } //Declare a function functionhello_world(){ console.log(“Hello World!”); } CoffeeScript uses ->instead of the function()keyword.The following example outputs a hello_world function: #COFFEESCRIPT #Create a function hello_world = -> console.log “Hello World!” //JAVASCRIPT - OUTPUT varhello_world; hello_world = function(){ returnconsole.log(“Hello World!”); } Once again, we use a tab to express the content of the function, so there is no need ofcurly braces ({}). This means that you have to make sure you have all of the logic of the function tabbed under its namespace. But what about our parameters? We can use (p1,p2) -> instead, where p1 and p2 are parameters. Let’s make our hello_world function say our name: #COFFEESCRIPT hello_world = (name) -> console.log “Hello #{name}” //JAVSCRIPT – OUTPUT varhello_world; hello_world = function(name) { returnconsole.log(“Hello “ + name); } In this example, we see how the special word function disappears and string interpolation. CoffeeScript allows the programmer to easily add logic to a string by escaping the string with #{}. Unlike JavaScript, you can also add returns and reshape the way astring looks without breaking the code. Binding In Meteor, we will often find ourselves using the properties of bindingwithin nested functions and callbacks.Function binding is very useful for these types of cases and helps avoid having to save data in additional variables. Let’s look at an example: #COFFEESCRIPT # Let’s make the context of this equal to our toolbox object # this = # hammer:true # flashlight:false # Run a method with a callback Meteor.call “use_hammer”, -> console.log this In this case, the thisobjectwill return a top-level object, such as the browser window. That's not useful at all. Let’s bind it now: #COFFEESCRIPT # Let’s make the context of this equal to our toolbox object # this = # hammer:true # flashlight:false # Run a method with a callback Meteor.call “use_hammer”, => console.log this The key difference is the use of =>instead of the expected ->sign fordefining the function. This will ensure that the callback'sthis object contains the context of the executing function. The resulting compiled script is as follows: //JAVASCRIPT Meteor.call(“use_hammer”, (function(_this) { return function() { returnConsole.log(_this); }; })(this)); CoffeeScript will improve your code and help you write codefaster. Still, itis not flawless. When you start combining functions with nested arrays, things can get complex and difficult to read, especially when functions are constructed with multiple parameters. Let’s look at an ugly query: #COFFEESCRIPT People.update sibling: $in:[“bob”,”bill”] , limit:1 -> console.log “success!” There are a few ways ofexpressing the difference between two different parameters of a function, but by far the easiest to understand. We place a comma one indentation before the next object. Go to coffeescript.org and play around with the language by clicking on the try coffeescript link. Summary We can now program faster because we have tools such as CoffeeScript, Jade, and Stylus to help us. We also seehow to use templates, helpers, and events to make our frontend work with Meteor. Resources for Article: Further resources on this subject: Why Meteor Rocks! [article] Function passing [article] Meteor.js JavaScript Framework: Why Meteor Rocks! [article]

0
0
1315

Packt

31 Mar 2015

16 min read

Dealing with Legacy Code

Packt

31 Mar 2015

16 min read

In this article by Arun Ravindran, author of the book Django Best Practices and Design Patterns, we will discuss the following topics: Reading a Django code base Discovering relevant documentation Incremental changes versus full rewrites Writing tests before changing code Legacy database integration (For more resources related to this topic, see here.) It sounds exciting when you are asked to join a project. Powerful new tools and cutting-edge technologies might await you. However, quite often, you are asked to work with an existing, possibly ancient, codebase. To be fair, Django has not been around for that long. However, projects written for older versions of Django are sufficiently different to cause concern. Sometimes, having the entire source code and documentation might not be enough. If you are asked to recreate the environment, then you might need to fumble with the OS configuration, database settings, and running services locally or on the network. There are so many pieces to this puzzle that you might wonder how and where to start. Understanding the Django version used in the code is a key piece of information. As Django evolved, everything from the default project structure to the recommended best practices have changed. Therefore, identifying which version of Django was used is a vital piece in understanding it. Change of Guards Sitting patiently on the ridiculously short beanbags in the training room, the SuperBook team waited for Hart. He had convened an emergency go-live meeting. Nobody understood the "emergency" part since go live was at least 3 months away. Madam O rushed in holding a large designer coffee mug in one hand and a bunch of printouts of what looked like project timelines in the other. Without looking up she said, "We are late so I will get straight to the point. In the light of last week's attacks, the board has decided to summarily expedite the SuperBook project and has set the deadline to end of next month. Any questions?" "Yeah," said Brad, "Where is Hart?" Madam O hesitated and replied, "Well, he resigned. Being the head of IT security, he took moral responsibility of the perimeter breach." Steve, evidently shocked, was shaking his head. "I am sorry," she continued, "But I have been assigned to head SuperBook and ensure that we have no roadblocks to meet the new deadline." There was a collective groan. Undeterred, Madam O took one of the sheets and began, "It says here that the Remote Archive module is the most high-priority item in the incomplete status. I believe Evan is working on this." "That's correct," said Evan from the far end of the room. "Nearly there," he smiled at others, as they shifted focus to him. Madam O peered above the rim of her glasses and smiled almost too politely. "Considering that we already have an extremely well-tested and working Archiver in our Sentinel code base, I would recommend that you leverage that instead of creating another redundant system." "But," Steve interrupted, "it is hardly redundant. We can improve over a legacy archiver, can't we?" "If it isn't broken, then don't fix it", replied Madam O tersely. He said, "He is working on it," said Brad almost shouting, "What about all that work he has already finished?" "Evan, how much of the work have you completed so far?" asked O, rather impatiently. "About 12 percent," he replied looking defensive. Everyone looked at him incredulously. "What? That was the hardest 12 percent" he added. O continued the rest of the meeting in the same pattern. Everybody's work was reprioritized and shoe-horned to fit the new deadline. As she picked up her papers, readying to leave she paused and removed her glasses. "I know what all of you are thinking... literally. But you need to know that we had no choice about the deadline. All I can tell you now is that the world is counting on you to meet that date, somehow or other." Putting her glasses back on, she left the room. "I am definitely going to bring my tinfoil hat," said Evan loudly to himself. Finding the Django version Ideally, every project will have a requirements.txt or setup.py file at the root directory, and it will have the exact version of Django used for that project. Let's look for a line similar to this: Django==1.5.9 Note that the version number is exactly mentioned (rather than Django>=1.5.9), which is called pinning. Pinning every package is considered a good practice since it reduces surprises and makes your build more deterministic. Unfortunately, there are real-world codebases where the requirements.txt file was not updated or even completely missing. In such cases, you will need to probe for various tell-tale signs to find out the exact version. Activating the virtual environment In most cases, a Django project would be deployed within a virtual environment. Once you locate the virtual environment for the project, you can activate it by jumping to that directory and running the activated script for your OS. For Linux, the command is as follows: $ source venv_path/bin/activate Once the virtual environment is active, start a Python shell and query the Django version as follows: $ python >>> import django >>> print(django.get_version()) 1.5.9 The Django version used in this case is Version 1.5.9. Alternatively, you can run the manage.py script in the project to get a similar output: $ python manage.py --version 1.5.9 However, this option would not be available if the legacy project source snapshot was sent to you in an undeployed form. If the virtual environment (and packages) was also included, then you can easily locate the version number (in the form of a tuple) in the __init__.py file of the Django directory. For example: $ cd envs/foo_env/lib/python2.7/site-packages/django $ cat __init__.py VERSION = (1, 5, 9, 'final', 0) ... If all these methods fail, then you will need to go through the release notes of the past Django versions to determine the identifiable changes (for example, the AUTH_PROFILE_MODULE setting was deprecated since Version 1.5) and match them to your legacy code. Once you pinpoint the correct Django version, then you can move on to analyzing the code. Where are the files? This is not PHP One of the most difficult ideas to get used to, especially if you are from the PHP or ASP.NET world, is that the source files are not located in your web server's document root directory, which is usually named wwwroot or public_html. Additionally, there is no direct relationship between the code's directory structure and the website's URL structure. In fact, you will find that your Django website's source code is stored in an obscure path such as /opt/webapps/my-django-app. Why is this? Among many good reasons, it is often more secure to move your confidential data outside your public webroot. This way, a web crawler would not be able to accidentally stumble into your source code directory. Starting with urls.py Even if you have access to the entire source code of a Django site, figuring out how it works across various apps can be daunting. It is often best to start from the root urls.py URLconf file since it is literally a map that ties every request to the respective views. With normal Python programs, I often start reading from the start of its execution—say, from the top-level main module or wherever the __main__ check idiom starts. In the case of Django applications, I usually start with urls.py since it is easier to follow the flow of execution based on various URL patterns a site has. In Linux, you can use the following find command to locate the settings.py file and the corresponding line specifying the root urls.py: $ find . -iname settings.py -exec grep -H 'ROOT_URLCONF' {} ; ./projectname/settings.py:ROOT_URLCONF = 'projectname.urls' $ ls projectname/urls.py projectname/urls.py Jumping around the code Reading code sometimes feels like browsing the web without the hyperlinks. When you encounter a function or variable defined elsewhere, then you will need to jump to the file that contains that definition. Some IDEs can do this automatically for you as long as you tell it which files to track as part of the project. If you use Emacs or Vim instead, then you can create a TAGS file to quickly navigate between files. Go to the project root and run a tool called Exuberant Ctags as follows: find . -iname "*.py" -print | etags - This creates a file called TAGS that contains the location information, where every syntactic unit such as classes and functions are defined. In Emacs, you can find the definition of the tag, where your cursor (or point as it called in Emacs) is at using the M-. command. While using a tag file is extremely fast for large code bases, it is quite basic and is not aware of a virtual environment (where most definitions might be located). An excellent alternative is to use the elpy package in Emacs. It can be configured to detect a virtual environment. Jumping to a definition of a syntactic element is using the same M-. command. However, the search is not restricted to the tag file. So, you can even jump to a class definition within the Django source code seamlessly. Understanding the code base It is quite rare to find legacy code with good documentation. Even if you do, the documentation might be out of sync with the code in subtle ways that can lead to further issues. Often, the best guide to understand the application's functionality is the executable test cases and the code itself. The official Django documentation has been organized by versions at https://docs.djangoproject.com. On any page, you can quickly switch to the corresponding page in the previous versions of Django with a selector on the bottom right-hand section of the page: In the same way, documentation for any Django package hosted on readthedocs.org can also be traced back to its previous versions. For example, you can select the documentation of django-braces all the way back to v1.0.0 by clicking on the selector on the bottom left-hand section of the page: Creating the big picture Most people find it easier to understand an application if you show them a high-level diagram. While this is ideally created by someone who understands the workings of the application, there are tools that can create very helpful high-level depiction of a Django application. A graphical overview of all models in your apps can be generated by the graph_models management command, which is provided by the django-command-extensions package. As shown in the following diagram, the model classes and their relationships can be understood at a glance: Model classes used in the SuperBook project connected by arrows indicating their relationships This visualization is actually created using PyGraphviz. This can get really large for projects of even medium complexity. Hence, it might be easier if the applications are logically grouped and visualized separately. PyGraphviz Installation and Usage If you find the installation of PyGraphviz challenging, then don't worry, you are not alone. Recently, I faced numerous issues while installing on Ubuntu, starting from Python 3 incompatibility to incomplete documentation. To save your time, I have listed the steps that worked for me to reach a working setup. On Ubuntu, you will need the following packages installed to install PyGraphviz: $ sudo apt-get install python3.4-dev graphviz libgraphviz-dev pkg-config Now activate your virtual environment and run pip to install the development version of PyGraphviz directly from GitHub, which supports Python 3: $ pip install git+http://github.com/pygraphviz/pygraphviz.git#egg=pygraphviz Next, install django-extensions and add it to your INSTALLED_APPS. Now, you are all set. Here is a sample usage to create a GraphViz dot file for just two apps and to convert it to a PNG image for viewing: $ python manage.py graph_models app1 app2 > models.dot $ dot -Tpng models.dot -o models.png Incremental change or a full rewrite? Often, you would be handed over legacy code by the application owners in the earnest hope that most of it can be used right away or after a couple of minor tweaks. However, reading and understanding a huge and often outdated code base is not an easy job. Unsurprisingly, most programmers prefer to work on greenfield development. In the best case, the legacy code ought to be easily testable, well documented, and flexible to work in modern environments so that you can start making incremental changes in no time. In the worst case, you might recommend discarding the existing code and go for a full rewrite. Or, as it is commonly decided, the short-term approach would be to keep making incremental changes, and a parallel long-term effort might be underway for a complete reimplementation. A general rule of thumb to follow while taking such decisions is—if the cost of rewriting the application and maintaining the application is lower than the cost of maintaining the old application over time, then it is recommended to go for a rewrite. Care must be taken to account for all the factors, such as time taken to get new programmers up to speed, the cost of maintaining outdated hardware, and so on. Sometimes, the complexity of the application domain becomes a huge barrier against a rewrite, since a lot of knowledge learnt in the process of building the older code gets lost. Often, this dependency on the legacy code is a sign of poor design in the application like failing to externalize the business rules from the application logic. The worst form of a rewrite you can probably undertake is a conversion, or a mechanical translation from one language to another without taking any advantage of the existing best practices. In other words, you lost the opportunity to modernize the code base by removing years of cruft. Code should be seen as a liability not an asset. As counter-intuitive as it might sound, if you can achieve your business goals with a lesser amount of code, you have dramatically increased your productivity. Having less code to test, debug, and maintain can not only reduce ongoing costs but also make your organization more agile and flexible to change. Code is a liability not an asset. Less code is more maintainable. Irrespective of whether you are adding features or trimming your code, you must not touch your working legacy code without tests in place. Write tests before making any changes In the book Working Effectively with Legacy Code, Michael Feathers defines legacy code as, simply, code without tests. He elaborates that with tests one can easily modify the behavior of the code quickly and verifiably. In the absence of tests, it is impossible to gauge if the change made the code better or worse. Often, we do not know enough about legacy code to confidently write a test. Michael recommends writing tests that preserve and document the existing behavior, which are called characterization tests. Unlike the usual approach of writing tests, while writing a characterization test, you will first write a failing test with a dummy output, say X, because you don't know what to expect. When the test harness fails with an error, such as "Expected output X but got Y", then you will change your test to expect Y. So, now the test will pass, and it becomes a record of the code's existing behavior. Note that we might record buggy behavior as well. After all, this is unfamiliar code. Nevertheless, writing such tests are necessary before we start changing the code. Later, when we know the specifications and code better, we can fix these bugs and update our tests (not necessarily in that order). Step-by-step process to writing tests Writing tests before changing the code is similar to erecting scaffoldings before the restoration of an old building. It provides a structural framework that helps you confidently undertake repairs. You might want to approach this process in a stepwise manner as follows: Identify the area you need to make changes to. Write characterization tests focusing on this area until you have satisfactorily captured its behavior. Look at the changes you need to make and write specific test cases for those. Prefer smaller unit tests to larger and slower integration tests. Introduce incremental changes and test in lockstep. If tests break, then try to analyze whether it was expected. Don't be afraid to break even the characterization tests if that behavior is something that was intended to change. If you have a good set of tests around your code, then you can quickly find the effect of changing your code. On the other hand, if you decide to rewrite by discarding your code but not your data, then Django can help you considerably. Legacy databases There is an entire section on legacy databases in Django documentation and rightly so, as you will run into them many times. Data is more important than code, and databases are the repositories of data in most enterprises. You can modernize a legacy application written in other languages or frameworks by importing their database structure into Django. As an immediate advantage, you can use the Django admin interface to view and change your legacy data. Django makes this easy with the inspectdb management command, which looks as follows: $ python manage.py inspectdb > models.py This command, if run while your settings are configured to use the legacy database, can automatically generate the Python code that would go into your models file. Here are some best practices if you are using this approach to integrate to a legacy database: Know the limitations of Django ORM beforehand. Currently, multicolumn (composite) primary keys and NoSQL databases are not supported. Don't forget to manually clean up the generated models, for example, remove the redundant 'ID' fields since Django creates them automatically. Foreign Key relationships may have to be manually defined. In some databases, the auto-generated models will have them as integer fields (suffixed with _id). Organize your models into separate apps. Later, it will be easier to add the views, forms, and tests in the appropriate folders. Remember that running the migrations will create Django's administrative tables (django_* and auth_*) in the legacy database. In an ideal world, your auto-generated models would immediately start working, but in practice, it takes a lot of trial and error. Sometimes, the data type that Django inferred might not match your expectations. In other cases, you might want to add additional meta information such as unique_together to your model. Eventually, you should be able to see all the data that was locked inside that aging PHP application in your familiar Django admin interface. I am sure this will bring a smile to your face. Summary In this article, we looked at various techniques to understand legacy code. Reading code is often an underrated skill. But rather than reinventing the wheel, we need to judiciously reuse good working code whenever possible. Resources for Article: Further resources on this subject: So, what is Django? [article] Adding a developer with Django forms [article] Introduction to Custom Template Filters and Tags [article]

0
0
2573

article-image-chain-responsibility-pattern

Packt

05 Feb 2015

12 min read

The Chain of Responsibility Pattern

Packt

05 Feb 2015

12 min read

In this article by Sakis Kasampalis, author of the book Mastering Python Design Patterns, we will see a detailed description of the Chain of Responsibility design pattern with the help of a real-life example as well as a software example. Also, its use cases and implementation are discussed. (For more resources related to this topic, see here.) When developing an application, most of the time we know which method should satisfy a particular request in advance. However, this is not always the case. For example, we can think of any broadcast computer network, such as the original Ethernet implementation [j.mp/wikishared]. In broadcast computer networks, all requests are sent to all nodes (broadcast domains are excluded for simplicity), but only the nodes that are interested in a sent request process it. All computers that participate in a broadcast network are connected to each other using a common medium such as the cable that connects the three nodes in the following figure: If a node is not interested or does not know how to handle a request, it can perform the following actions: Ignore the request and do nothing Forward the request to the next node The way in which the node reacts to a request is an implementation detail. However, we can use the analogy of a broadcast computer network to understand what the chain of responsibility pattern is all about. The Chain of Responsibility pattern is used when we want to give a chance to multiple objects to satisfy a single request, or when we don't know which object (from a chain of objects) should process a specific request in advance. The principle is the same as the following: There is a chain (linked list, tree, or any other convenient data structure) of objects. We start by sending a request to the first object in the chain. The object decides whether it should satisfy the request or not. The object forwards the request to the next object. This procedure is repeated until we reach the end of the chain. At the application level, instead of talking about cables and network nodes, we can focus on objects and the flow of a request. The following figure, courtesy of a title="Scala for Machine Learning" www.sourcemaking.com [j.mp/smchain], shows how the client code sends a request to all processing elements (also known as nodes or handlers) of an application: Note that the client code only knows about the first processing element, instead of having references to all of them, and each processing element only knows about its immediate next neighbor (called the successor), not about every other processing element. This is usually a one-way relationship, which in programming terms means a singly linked list in contrast to a doubly linked list; a singly linked list does not allow navigation in both ways, while a doubly linked list allows that. This chain organization is used for a good reason. It achieves decoupling between the sender (client) and the receivers (processing elements) [GOF95, page 254]. A real-life example ATMs and, in general, any kind of machine that accepts/returns banknotes or coins (for example, a snack vending machine) use the chain of responsibility pattern. There is always a single slot for all banknotes, as shown in the following figure, courtesy of www.sourcemaking.com: When a banknote is dropped, it is routed to the appropriate receptacle. When it is returned, it is taken from the appropriate receptacle [j.mp/smchain], [j.mp/c2chain]. We can think of the single slot as the shared communication medium and the different receptacles as the processing elements. The result contains cash from one or more receptacles. For example, in the preceding figure, we see what happens when we request $175 from the ATM. A software example I tried to find some good examples of Python applications that use the Chain of Responsibility pattern but I couldn't, most likely because Python programmers don't use this name. So, my apologies, but I will use other programming languages as a reference. The servlet filters of Java are pieces of code that are executed before an HTTP request arrives at a target. When using servlet filters, there is a chain of filters. Each filter performs a different action (user authentication, logging, data compression, and so forth), and either forwards the request to the next filter until the chain is exhausted, or it breaks the flow if there is an error (for example, the authentication failed three consecutive times) [j.mp/soservl]. Apple's Cocoa and Cocoa Touch frameworks use Chain of Responsibility to handle events. When a view receives an event that it doesn't know how to handle, it forwards the event to its superview. This goes on until a view is capable of handling the event or the chain of views is exhausted [j.mp/chaincocoa]. Use cases By using the Chain of Responsibility pattern, we give a chance to a number of different objects to satisfy a specific request. This is useful when we don't know which object should satisfy a request in advance. An example is a purchase system. In purchase systems, there are many approval authorities. One approval authority might be able to approve orders up to a certain value, let's say $100. If the order is more than $100, the order is sent to the next approval authority in the chain that can approve orders up to $200, and so forth. Another case where Chain of Responsibility is useful is when we know that more than one object might need to process a single request. This is what happens in an event-based programming. A single event such as a left mouse click can be caught by more than one listener. It is important to note that the Chain of Responsibility pattern is not very useful if all the requests can be taken care of by a single processing element, unless we really don't know which element that is. The value of this pattern is the decoupling that it offers. Instead of having a many-to-many relationship between a client and all processing elements (and the same is true regarding the relationship between a processing element and all other processing elements), a client only needs to know how to communicate with the start (head) of the chain. The following figure demonstrates the difference between tight and loose coupling. The idea behind loosely coupled systems is to simplify maintenance and make it easier for us to understand how they function [j.mp/loosecoup]: Implementation There are many ways to implement Chain of Responsibility in Python, but my favorite implementation is the one by Vespe Savikko [j.mp/savviko]. Vespe's implementation uses dynamic dispatching in a Pythonic style to handle requests [j.mp/ddispatch]. Let's implement a simple event-based system using Vespe's implementation as a guide. The following is the UML class diagram of the system: The Event class describes an event. We'll keep it simple, so in our case an event has only name: class Event: def __init__(self, name): self.name = name def __str__(self): return self.name The Widget class is the core class of the application. The parent aggregation shown in the UML diagram indicates that each widget can have a reference to a parent object, which by convention, we assume is a Widget instance. Note, however, that according to the rules of inheritance, an instance of any of the subclasses of Widget (for example, an instance of MsgText) is also an instance of Widget. The default value of parent is None: class Widget: def __init__(self, parent=None): self.parent = parent The handle() method uses dynamic dispatching through hasattr() and getattr() to decide who is the handler of a specific request (event). If the widget that is asked to handle an event does not support it, there are two fallback mechanisms. If the widget has parent, then the handle() method of parent is executed. If the widget has no parent but a handle_default() method, handle_default() is executed: def handle(self, event): handler = 'handle_{}'.format(event) if hasattr(self, handler): method = getattr(self, handler) method(event) elif self.parent: self.parent.handle(event) elif hasattr(self, 'handle_default'): self.handle_default(event) At this point, you might have realized why the Widget and Event classes are only associated (no aggregation or composition relationships) in the UML class diagram. The association is used to show that the Widget class "knows" about the Event class but does not have any strict references to it, since an event needs to be passed only as a parameter to handle(). MainWIndow, MsgText, and SendDialog are all widgets with different behaviors. Not all these three widgets are expected to be able to handle the same events, and even if they can handle the same event, they might behave differently. MainWIndow can handle only the close and default events: class MainWindow(Widget): def handle_close(self, event): print('MainWindow: {}'.format(event)) def handle_default(self, event): print('MainWindow Default: {}'.format(event)) SendDialog can handle only the paint event: class SendDialog(Widget): def handle_paint(self, event): print('SendDialog: {}'.format(event)) Finally, MsgText can handle only the down event: class MsgText(Widget): def handle_down(self, event): print('MsgText: {}'.format(event)) The main() function shows how we can create a few widgets and events, and how the widgets react to those events. All events are sent to all the widgets. Note the parent relationship of each widget. The sd object (an instance of SendDialog) has as its parent the mw object (an instance of MainWindow). However, not all objects need to have a parent that is an instance of MainWindow. For example, the msg object (an instance of MsgText) has the sd object as a parent: def main(): mw = MainWindow() sd = SendDialog(mw) msg = MsgText(sd) for e in ('down', 'paint', 'unhandled', 'close'): evt = Event(e) print('nSending event -{}- to MainWindow'.format(evt)) mw.handle(evt) print('Sending event -{}- to SendDialog'.format(evt)) sd.handle(evt) print('Sending event -{}- to MsgText'.format(evt)) msg.handle(evt) The following is the full code of the example (chain.py): class Event: def __init__(self, name): self.name = name def __str__(self): return self.name class Widget: def __init__(self, parent=None): self.parent = parent def handle(self, event): handler = 'handle_{}'.format(event) if hasattr(self, handler): method = getattr(self, handler) method(event) elif self.parent: self.parent.handle(event) elif hasattr(self, 'handle_default'): self.handle_default(event) class MainWindow(Widget): def handle_close(self, event): print('MainWindow: {}'.format(event)) def handle_default(self, event): print('MainWindow Default: {}'.format(event)) class SendDialog(Widget): def handle_paint(self, event): print('SendDialog: {}'.format(event)) class MsgText(Widget): def handle_down(self, event): print('MsgText: {}'.format(event)) def main(): mw = MainWindow() sd = SendDialog(mw) msg = MsgText(sd) for e in ('down', 'paint', 'unhandled', 'close'): evt = Event(e) print('nSending event -{}- to MainWindow'.format(evt)) mw.handle(evt) print('Sending event -{}- to SendDialog'.format(evt)) sd.handle(evt) print('Sending event -{}- to MsgText'.format(evt)) msg.handle(evt) if __name__ == '__main__': main() Executing chain.py gives us the following results: >>> python3 chain.py Sending event -down- to MainWindow MainWindow Default: down Sending event -down- to SendDialog MainWindow Default: down Sending event -down- to MsgText MsgText: down Sending event -paint- to MainWindow MainWindow Default: paint Sending event -paint- to SendDialog SendDialog: paint Sending event -paint- to MsgText SendDialog: paint Sending event -unhandled- to MainWindow MainWindow Default: unhandled Sending event -unhandled- to SendDialog MainWindow Default: unhandled Sending event -unhandled- to MsgText MainWindow Default: unhandled Sending event -close- to MainWindow MainWindow: close Sending event -close- to SendDialog MainWindow: close Sending event -close- to MsgText MainWindow: close There are some interesting things that we can see in the output. For instance, sending a down event to MainWindow ends up being handled by the default MainWindow handler. Another nice case is that although a close event cannot be handled directly by SendDialog and MsgText, all the close events end up being handled properly by MainWindow. That's the beauty of using the parent relationship as a fallback mechanism. If you want to spend some more creative time on the event example, you can replace the dumb print statements and add some actual behavior to the listed events. Of course, you are not limited to the listed events. Just add your favorite event and make it do something useful! Another exercise is to add a MsgText instance during runtime that has MainWindow as the parent. Is this hard? Do the same for an event (add a new event to an existing widget). Which is harder? Summary In this article, we covered the Chain of Responsibility design pattern. This pattern is useful to model requests / handle events when the number and type of handlers isn't known in advance. Examples of systems that fit well with Chain of Responsibility are event-based systems, purchase systems, and shipping systems. In the Chain Of Responsibility pattern, the sender has direct access to the first node of a chain. If the request cannot be satisfied by the first node, it forwards to the next node. This continues until either the request is satisfied by a node or the whole chain is traversed. This design is used to achieve loose coupling between the sender and the receiver(s). ATMs are an example of Chain Of Responsibility. The single slot that is used for all banknotes can be considered the head of the chain. From here, depending on the transaction, one or more receptacles is used to process the transaction. The receptacles can be considered the processing elements of the chain. Java's servlet filters use the Chain of Responsibility pattern to perform different actions (for example, compression and authentication) on an HTTP request. Apple's Cocoa frameworks use the same pattern to handle events such as button presses and finger gestures. Resources for Article: Further resources on this subject: Exploring Model View Controller [Article] Analyzing a Complex Dataset [Article] Automating Your System Administration and Deployment Tasks Over SSH [Article]

0
0
3561

Packt

30 Dec 2014

13 min read

Middleware

Packt

30 Dec 2014

13 min read

0
0
2273

Packt

19 Dec 2014

30 min read

Performance Optimization

Packt

19 Dec 2014

30 min read

0
0
3674

Packt

19 Nov 2014

6 min read

Function passing

Packt

19 Nov 2014

6 min read

In this article by Simon Timms, the author of the book, Mastering JavaScript Design Patterns, we will cover function passing. In functional programming languages, functions are first-class citizens. Functions can be assigned to variables and passed around just like you would with any other variable. This is not entirely a foreign concept. Even languages such as C had function pointers that could be treated just like other variables. C# has delegates and, in more recent versions, lambdas. The latest release of Java has also added support for lambdas, as they have proven to be so useful. (For more resources related to this topic, see here.) JavaScript allows for functions to be treated as variables and even as objects and strings. In this way, JavaScript is functional in nature. Because of JavaScript's single-threaded nature, callbacks are a common convention and you can find them pretty much everywhere. Consider calling a function at a later date on a web page. This is done by setting a timeout on the window object as follows: setTimeout(function(){alert("Hello from the past")}, 5 * 1000); The arguments for the set timeout function are a function to call and a time to delay in milliseconds. No matter the JavaScript environment in which you're working, it is almost impossible to avoid functions in the shape of callbacks. The asynchronous processing model of Node.js is highly dependent on being able to call a function and pass in something to be completed at a later date. Making calls to external resources in a browser is also dependent on a callback to notify the caller that some asynchronous operation has completed. In basic JavaScript, this looks like the following code: var xmlhttp = new XMLHttpRequest()xmlhttp.onreadystatechange=function()if (xmlhttp.readyState==4 &&xmlhttp.status==200){//process returned data}};xmlhttp.open("GET", http://some.external.resource, true); xmlhttp.send(); You may notice that we assign onreadystatechange before we even send the request. This is because assigning it later may result in a race condition in which the server responds before the function is attached to the ready state change. In this case, we've used an inline function to process the returned data. Because functions are first class citizens, we can change this to look like the following code: var xmlhttp;function requestData(){xmlhttp = new XMLHttpRequest()xmlhttp.onreadystatechange=processData;xmlhttp.open("GET", http://some.external.resource, true); xmlhttp.send();}function processData(){if (xmlhttp.readyState==4 &&xmlhttp.status==200){ //process returned data}} This is typically a cleaner approach and avoids performing complex processing in line with another function. However, you might be more familiar with the jQuery version of this, which looks something like this: $.getJSON('http://some.external.resource', function(json){//process returned data}); In this case, the boiler plate of dealing with ready state changes is handled for you. There is even convenience provided for you should the request for data fail with the following code: $.ajax('http://some.external.resource',{ success: function(json){ //process returned data},error: function(){ //process failure},dataType: "json"}); In this case, we've passed an object into the ajax call, which defines a number of properties. Amongst these properties are function callbacks for success and failure. This method of passing numerous functions into another suggests a great way of providing expansion points for classes. Likely, you've seen this pattern in use before without even realizing it. Passing functions into constructors as part of an options object is a commonly used approach to providing extension hooks in JavaScript libraries. Implementation In Westeros, the tourism industry is almost nonextant. There are great difficulties with bandits killing tourists and tourists becoming entangled in regional conflicts. Nonetheless, some enterprising folks have started to advertise a grand tour of Westeros in which they will take those with the means on a tour of all the major attractions. From King's Landing to Eyrie, to the great mountains of Dorne, the tour will cover it all. In fact, a rather mathematically inclined member of the tourism board has taken to calling it a Hamiltonian tour, as it visits everywhere once. The HamiltonianTour class provides an options object that allows the definition of an options object. This object contains the various places to which a callback can be attached. In our case, the interface for it would look something like the following code: export class HamiltonianTourOptions{onTourStart: Function;onEntryToAttraction: Function;onExitFromAttraction: Function;onTourCompletion: Function;} The full HamiltonianTour class looks like the following code: var HamiltonianTour = (function () {function HamiltonianTour(options) { this.options = options;}HamiltonianTour.prototype.StartTour = function () { if (this.options.onTourStart&&typeof (this.options.onTourStart) === "function") this.options.onTourStart(); this.VisitAttraction("King's Landing"); this.VisitAttraction("Winterfell"); this.VisitAttraction("Mountains of Dorne"); this.VisitAttraction("Eyrie"); if (this.options.onTourCompletion&&typeof (this.options.onTourCompletion) === "function") this.options.onTourCompletion();}; HamiltonianTour.prototype.VisitAttraction = function (AttractionName) { if (this.options.onEntryToAttraction&&typeof (this.options.onEntryToAttraction) === "function") this.options.onEntryToAttraction(AttractionName); //do whatever one does in a Attraction if (this.options.onExitFromAttraction&&typeof (this.options.onExitFromAttraction) === "function") this.options.onExitFromAttraction(AttractionName);};return HamiltonianTour;})(); You can see in the highlighted code how we check the options and then execute the callback as needed. This can be done by simply using the following code: var tour = new HamiltonianTour({onEntryToAttraction: function(cityname){console.log("I'm delighted to be in " + cityname)}});tour.StartTour(); The output of the preceding code will be: I'm delighted to be in King's LandingI'm delighted to be in WinterfellI'm delighted to be in Mountains of DorneI'm delighted to be in Eyrie Summary In this article, we have learned about function passing. Passing functions is a great approach to solving a number of problems in JavaScript and tends to be used extensively by libraries such as jQuery and frameworks such as Express. It is so commonly adopted that using it provides to added barriers no your code's readability. Resources for Article: Further resources on this subject: Creating Java EE Applications [article] Meteor.js JavaScript Framework: Why Meteor Rocks! [article] Dart with JavaScript [article]

0
0
1397

Packt

18 Sep 2014

8 min read

Redis in Autosuggest

Packt

18 Sep 2014

8 min read

0
0
3226

article-image-index-item-sharding-and-projection-dynamodb

Packt

17 Sep 2014

13 min read

Index, Item Sharding, and Projection in DynamoDB

Packt

17 Sep 2014

13 min read

Understanding the secondary index and projections should go hand in hand because of the fact that a secondary index cannot be used efficiently without specifying projection. In this article by Uchit Vyas and Prabhakaran Kuppusamy, authors of DynamoDB Applied Design Patterns, we will take a look at local and global secondary indexes, and projection and its usage with indexes. (For more resources related to this topic, see here.) The use of projection in DynamoDB is pretty much similar to that of traditional databases. However, here are a few things to watch out for: Whenever a DynamoDB table is created, it is mandatory to create a primary key, which can be of a simple type (hash type), or it can be of a complex type (hash and range key). For the specified primary key, an index will be created (we call this index the primary index). Along with this primary key index, the user is allowed to create up to five secondary indexes per table. There are two kinds of secondary index. The first is a local secondary index (in which the hash key of the index must be the same as that of the table) and the second is the global secondary index (in which the hash key can be any field). In both of these secondary index types, the range key can be a field that the user needs to create an index for. Secondary indexes A quick question: while writing a query in any database, keeping the primary key field as part of the query (especially in the where condition) will return results much faster compared to the other way. Why? This is because of the fact that an index will be created automatically in most of the databases for the primary key field. This the case with DynamoDB also. This index is called the primary index of the table. There is no customization possible using the primary index, so the primary index is seldom discussed. In order to make retrieval faster, the frequently-retrieved attributes need to be made as part of the index. However, a DynamoDB table can have only one primary index and the index can have a maximum of two attributes (hash and range key). So for faster retrieval, the user should be given privileges to create user-defined indexes. This index, which is created by the user, is called the secondary index. Similar to the table key schema, the secondary index also has a key schema. Based on the key schema attributes, the secondary index can be either a local or global secondary index. Whenever a secondary index is created, during every item insertion, the items in the index will be rearranged. This rearrangement will happen for each item insertion into the table, provided the item contains both the index's hash and range key attribute. Projection Once we have an understanding of the secondary index, we are all set to learn about projection. While creating the secondary index, it is mandatory to specify the hash and range attributes based on which the index is created. Apart from these two attributes, if the query wants one or more attribute (assuming that none of these attributes are projected into the index), then DynamoDB will scan the entire table. This will consume a lot of throughput capacity and will have comparatively higher latency. The following is the table (with some data) that is used to store book information: Here are few more details about the table: The BookTitle attribute is the hash key of the table and local secondary index The Edition attribute is the range key of the table The PubDate attribute is the range key of the index (let's call this index IDX_PubDate) Local secondary index While creating the secondary index, the hash and range key of the table and index will be inserted into the index; optionally, the user can specify what other attributes need to be added. There are three kinds of projection possible in DynamoDB: KEYS_ONLY: Using this, the index consists of the hash and range key values of the table and index INCLUDE: Using this, the index consists of attributes in KEYS_ONLY plus other non-key attributes that we specify ALL: Using this, the index consists of all of the attributes from the source table The following code shows the creation of a local secondary index named Idx_PubDate with BookTitle as the hash key (which is a must in the case of a local secondary index), PubDate as the range key, and using the KEYS_ONLY projection: private static LocalSecondaryIndex getLocalSecondaryIndex() { ArrayList<KeySchemaElement> indexKeySchema = newArrayList<KeySchemaElement>(); indexKeySchema.add(new KeySchemaElement() .withAttributeName("BookTitle") .withKeyType(KeyType.HASH)); indexKeySchema.add(new KeySchemaElement() .withAttributeName("PubDate") .withKeyType(KeyType.RANGE)); LocalSecondaryIndex lsi = new LocalSecondaryIndex() .withIndexName("Idx_PubDate") .withKeySchema(indexKeySchema) .withProjection(new Projection() .withProjectionType("KEYS_ONLY")); return lsi; } The usage of the KEYS_ONLY index type will create the smallest possible index and the usage of ALL will create the biggest possible index. We will discuss the trade-offs between these index types a little later. Going back to our example, let us assume that we are using the KEYS_ONLY index type, so none of the attributes (other than the previous three key attributes) are projected into the index. So the index will look as follows: You may notice that the row order of the index is almost the same as that of the table order (except the second and third rows). Here, you can observe one point: the table records will be grouped primarily based on the hash key, and then the records that have the same hash key will be ordered based on the range key of the index. In the case of the index, even though the table's range key is part of the index attribute, it will not play any role in the ordering (only the index's hash and range keys will take part in the ordering). There is a negative in this approach. If the user is writing a query using this index to fetch BookTitle and Publisher with PubDate as 28-Dec-2008, then what happens? Will DynamoDB complain that the Publisher attribute is not projected into the index? The answer is no. The reason is that even though Publisher is not projected into the index, we can still retrieve it using the secondary index. However, retrieving a nonprojected attribute will scan the entire table. So if we are sure that certain attributes need to be fetched frequently, then we must project it into the index; otherwise, it will consume a large number of capacity units and retrieval will be much slower as well. One more question: if the user is writing a query using the local secondary index to fetch BookTitle and Publisher with PubDate as 28-Dec-2008, then what happens? Will DynamoDB complain that the PubDate attribute is not part of the primary key and hence queries are not allowed on nonprimary key attributes? The answer is no. It is a rule of thumb that we can write queries on the secondary index attributes. It is possible to include nonprimary key attributes as part of the query, but these attributes must at least be key attributes of the index. The following code shows how to add non-key attributes to the secondary index's projection: private static Projection getProjectionWithNonKeyAttr() { Projection projection = new Projection() .withProjectionType(ProjectionType.INCLUDE); ArrayList<String> nonKeyAttributes = new ArrayList<String>(); nonKeyAttributes.add("Language"); nonKeyAttributes.add("Author2"); projection.setNonKeyAttributes(nonKeyAttributes); return projection; } There is a slight limitation with the local secondary index. If we write a query on a non-key (both table and index) attribute, then internally DynamoDB might need to scan the entire table; this is inefficient. For example, consider a situation in which we need to retrieve the number of editions of the books in each and every language. Since both of the attributes are non-key, even if we create a local secondary index with either of the attributes (Edition and Language), the query will still result in a scan operation on the entire table. Global secondary index A problem arises here: is there any way in which we can create a secondary index using both the index keys that are different from the table's primary keys? The answer is the global secondary index. The following code shows how to create the global secondary index for this scenario: private static GlobalSecondaryIndex getGlobalSecondaryIndex() { GlobalSecondaryIndex gsi = new GlobalSecondaryIndex() .withIndexName("Idx_Pub_Edtn") .withProvisionedThroughput(new ProvisionedThroughput() .withReadCapacityUnits((long) 1) .withWriteCapacityUnits((long) 1)) .withProjection(newProjection().withProjectionType ("KEYS_ONLY")); ArrayList<KeySchemaElement> indexKeySchema1 = newArrayList<KeySchemaElement>(); indexKeySchema1.add(new KeySchemaElement() .withAttributeName("Language") .withKeyType(KeyType.HASH)); indexKeySchema1.add(new KeySchemaElement() .withAttributeName("Edition") .withKeyType(KeyType.RANGE)); gsi.setKeySchema(indexKeySchema1); return gsi; } While deciding the attributes to be projected into a global secondary index, there are trade-offs we must consider between provisioned throughput and storage costs. A few of these are listed as follows: If our application doesn't need to query a table so often and it performs frequent writes or updates against the data in the table, then we must consider projecting the KEYS_ONLY attributes. The global secondary index will be minimum size, but it will still be available when required for the query activity. The smaller the index, the cheaper the cost to store it and our write costs will be cheaper too. If we need to access only those few attributes that have the lowest possible latency, then we must project only those (lesser) attributes into a global secondary index. If we need to access almost all of the non-key attributes of the DynamoDB table on a frequent basis, we can project these attributes (even the entire table) into the global secondary index. This will give us maximum flexibility with the trade-off that our storage cost would increase, or even double if we project the entire table's attributes into the index. The additional storage costs to store the global secondary index might equalize the cost of performing frequent table scans. If our application will frequently retrieve some non-key attributes, we must consider projecting these non-key attributes into the global secondary index. Item sharding Sharding, also called horizontal partitioning, is a technique in which rows are distributed among the database servers to perform queries faster. In the case of sharding, a hash operation will be performed on the table rows (mostly on one of the columns) and, based on the hash operation output, the rows will be grouped and sent to the proper database server. Take a look at the following diagram: As shown in the previous diagram, if all the table data (only four rows and one column are shown for illustration purpose) is stored in a single database server, the read and write operations will become slower and the server that has the frequently accessed table data will work more compared to the server storing the table data that is not accessed frequently. The following diagram shows the advantage of sharding over a multitable, multiserver database environment: In the previous diagram, two tables (Tbl_Places and Tbl_Sports) are shown on the left-hand side with four sample rows of data (Austria.. means only the first column of the first item is illustrated and all other fields are represented by ..).We are going to perform a hash operation on the first column only. In DynamoDB, this hashing will be performed automatically. Once the hashing is done, similar hash rows will be saved automatically in different servers (if necessary) to satisfy the specified provisioned throughput capacity. Have you ever wondered about the importance of the hash type key while creating a table (which is mandatory)? Of course we all know the importance of the range key and what it does. It simply sorts items based on the range key value. So far, we might have been thinking that the range key is more important than the hash key. If you think that way, then you may be correct, provided we neither need our table to be provisioned faster nor do we need to create any partitions for our table. As long as the table data is smaller, the importance of the hash key will be realized only while writing a query operation. However, once the table grows, in order to satisfy the same provision throughput, DynamoDB needs to partition our table data based on this hash key (as shown in the previous diagram). This partitioning of table items based on the hash key attribute is called sharding. It means the partitions are created by splitting items and not attributes. This is the reason why a query that has the hash key (of table and index) retrieves items much faster. Since the number of partitions is managed automatically by DynamoDB, we cannot just hope for things to work fine. We also need to keep certain things in mind, for example, the hash key attribute should have more distinct values. To simplify, it is not advisable to put binary values (such as Yes or No, Present or Past or Future, and so on) into the hash key attributes, thereby restricting the number of partitions. If our hash key attribute has either Yes or No values in all the items, then DynamoDB can create only a maximum of two partitions; therefore, the specified provisioned throughput cannot be achieved. Just consider that we have created a table called Tbl_Sports with a provisioned throughput capacity of 10, and then we put 10 items into the table. Assuming that only a single partition is created, we are able to retrieve 10 items per second. After a point of time, we put 10 more items into the table. DynamoDB will create another partition (by hashing over the hash key), thereby satisfying the provisioned throughput capacity. There is a formula taken from the AWS site: Total provisioned throughput/partitions = throughput per partition OR No. of partitions = Total provisioned throughput/throughput per partition In order to satisfy throughput capacity, the other parameters will be automatically managed by DynamoDB. Summary In this article, we saw what the local and global secondary indexes are. We walked through projection and its usage with indexes. Resources for Article: Further resources on this subject: Comparative Study of NoSQL Products [Article] Ruby with MongoDB for Web Development [Article] Amazon DynamoDB - Modelling relationships, Error handling [Article]

0
0
5406

5 min read

(For more resources related to this topic, see here.) Design patterns are ways to solve a problem and the way to get your intended result in the best possible manner. So, design patterns are not only ways to create a large and robust system, but they also provide great architectures in a friendly manner. In software engineering, a design pattern is a general repeatable and optimized solution to a commonly occurring problem within a given context in software design. It is a description or template for how to solve a problem, and the solution can be used in different instances. The following are some of the benefits of using design patterns: Maintenance Documentation Readability Ease in finding appropriate objects Ease in determining object granularity Ease in specifying object interfaces Ease in implementing even for large software projects Implements the code reusability concept If you are not familiar with design patterns, the best way to begin understanding is observing the solutions we use for commonly occurring, everyday life problems. Let's take a look at the following image: Many different types of power plugs exist in the world. So, we need a solution that is reusable, optimized, and cheaper than buying a new device for different power plug types. In simple words, we need an adapter. Have a look at the following image of an adapter: In this case, an adapter is the best solution that's reusable, optimized, and cheap. But an adapter does not provide us with a solution when our car's wheel blows out. In object-oriented languages, we the programmers use the objects to do whatever we want to have the outcome we desire. Hence, we have many types of objects, situations, and problems. That means we need more than just one approach to solving different kinds of problems. Elements of design patterns The following are the elements of design patterns: Name: This is a handle we can use to describe the problem Problem: This describes when to apply the pattern Solution: This describes the elements, relationships, responsibilities, and collaborations, in a way that we follow to solve a problem Consequences: This details the results and trade-offs of applying the pattern Classification of design patterns Design patterns are generally divided into three fundamental groups: Creational patterns Structural patterns Behavioral patterns Let's examine these in the following subsections. Creational patterns Creational patterns are a subset of design patterns in the field of software development; they serve to create objects. They decouple the design of an object from its representation. Object creation is encapsulated and outsourced (for example, in a factory) to keep the context of object creation independent from concrete implementation. This is in accordance with the rule: "Program on the interface, not the implementation." Some of the features of creational patterns are as follows: Generic instantiation: This allows objects to be created in a system without having to identify a specific class type in code (Abstract Factory and Factory pattern) Simplicity: Some of the patterns make object creation easier, so callers will not have to write large, complex code to instantiate an object (Builder (Manager) and Prototype pattern) Creation constraints: Creational patterns can put bounds on who can create objects, how they are created, and when they are created The following patterns are called creational patterns: The Abstract Factory pattern The Factory pattern The Builder (Manager) pattern The Prototype pattern The Singleton pattern Structural patterns In software engineering, design patterns structure patterns facilitate easy ways for communications between various entities. Some of the examples of structures of the samples are as follows: Composition: This composes objects into a tree structure (whole hierarchies). Composition allows customers to be uniformly treated as individual objects according to their composition. Decorator: This dynamically adds options to an object. A Decorator is a flexible alternative embodiment to extend functionality. Flies: This is a share of small objects (objects without conditions) that prevent overproduction. Adapter: This converts the interface of a class into another interface that the clients expect. Adapter lets those classes work together that would normally not be able to because of the different interfaces. Facade: This provides a unified interface meeting the various interfaces of a subsystem. Facade defines a higher-level interface to the subsystem, which is easier to use. Proxy: This implements the replacement (surrogate) of another object that controls access to the original object. Bridge: This separates an abstraction from its implementation, which can then be independently altered. Behavioral patterns Behavioral patterns are all about a class' objects' communication. Behavioral patterns are those patterns that are most specifically concerned with communication between objects. The following is a list of the behavioral patterns: Chain of Responsibility pattern Command pattern Interpreter pattern Iterator pattern Mediator pattern Memento pattern Observer pattern State pattern Strategy pattern Template pattern Visitor pattern If you want to check out the usage of some patterns in the Laravel core, have a look at the following list: The Builder (Manager) pattern: IlluminateAuthAuthManager and IlluminateSessionSessionManager The Factory pattern: IlluminateDatabaseDatabaseManager and IlluminateValidationFactory The Repository pattern: IlluminateConfigRepository and IlluminateCacheRepository The Strategy pattern: IIlluminateCacheStoreInterface and IlluminateConfigLoaderInterface The Provider pattern: IIlluminateAuthAuthServiceProvider and IlluminateHashHashServiceProvider Summary In this article, we have explained the fundamentals of design patterns. We've also introduced some design patterns that are used in the Laravel Framework. Resources for Article: Further resources on this subject: Laravel 4 - Creating a Simple CRUD Application in Hours [article] Your First Application [article] Creating and Using Composer Packages [article]

0
0
1384

article-image-overview-architecture-and-modeling-cassandra

Packt

21 Jan 2014

5 min read

An overview of architecture and modeling in Cassandra

Packt

21 Jan 2014

5 min read

(For more resources related to this topic, see here.) Cassandra uses a peer-to-peer architecture, unlike a master-slave architecture, which is prone to single point of failure (SPOF) problems. Cassandra is deployed on multiple machines with each machine acting as a node in a cluster. Data is autosharded, that is, automatically distributed across nodes using key-based sharding, which means that the keys are used to distribute the data across the cluster. Each key-value data element in Cassandra is replicated across the cluster on other nodes (the default replication is 3) for high availability and fault tolerance. If a node goes down, the data can be served from another node having a copy of the original data. Sharding is an old concept used for distributing data across different systems. Sharding can be horizontal or vertical. In horizontal sharding, in case of RDBMS, data is distributed on the basis of rows, with some rows residing on a single machine and the other rows residing on other machines. Vertical sharding is similar to columnar storage, where columns can be stored separately in different locations. Hadoop Distributed File Systems (HDFS) use data-volumes-based sharding, where a single big file is sharded and distributed across multiple machines using the block size. So, as an example, if the block size is 64 MB, a 640 MB file will be split into 10 chunks and placed in multiple machines. The same autosharding capability is used when new nodes are added to Cassandra, where the new node becomes responsible for a specific key range of data. The details of what node holds what key ranges is coordinated and shared across the cluster using the gossip protocol. So, whenever a client wants to access a specific key, each node locates the key and its associated data quickly within a few milliseconds. When the client writes data to the cluster, the data will be written to the nodes responsible for that key range. However, if the node responsible for that key range is down or not reachable, Cassandra uses a clever solution called Hinted Handoff that allows the data to be managed by another node in the cluster and to be written back on the responsible node once that node is back in the cluster. The replication of data raises the concern of data inconsistency when the replicas might have different states for the same data. Cassandra uses mechanisms such as anti-entropy and read repair for solving this problem and synchronizing data across the replicas. Anti-entropy is used at the time of compaction, where compaction is a concept borrowed from Google BigTable. Compaction in Cassandra refers to the merging of SSTable and helps in optimizing data storage and increasing read performance by reducing the number of seeks across SSTables. Another problem that compaction solves is handling deletion in Cassandra. Unlike traditional RDBMS, all deletes in Cassandra are soft deletes, which means that the records still exist in the underlying data store but are marked with a special flag so that these deleted records do not appear in query results. The records marked as deleted records are called tombstone records. Major compactions handle these soft deletes or tombstones by removing them from the SSTable in the underlying file stores. Cassandra, like Dynamo, uses a Merkle tree data structure to represent the data state at a column family level in a node. This Merkle tree representation is used during major compactions to find the difference in the data states across nodes and reconciled. The Merkle tree or Hash tree is a data structure in the form of a tree where every non-leaf node is labeled with the hash of children nodes, allowing the efficient and secure verification of the contents of the large data structure. Cassandra, like Dynamo, falls under the AP part of the CAP theorem and offers a tunable consistency level. Cassandra provides multiple consistency levels, as illustrated in the following table: Operation ZERO ANY ONE QUORUM ALL Read Not supported Not supported Reads from one node Read from a majority of nodes with replicas Read from all the nodes with replicas Write Asynchronous write Writes on one node including hints Writes on one node with commit log and Memtable Writes on a majority of nodes with replicas Writes on all the nodes with replicas A summary of the features in Cassandra The following table summarizes the key features of Cassandra with respect to its origins in Google BigTable and Amazon Dynamo: Feature Cassandra implementation Google BigTable Amazon Dynamo Architecture Peer-to-peer architecture, ring-based deployment architecture No Yes Data model Multidimensional map (row,column, timestamp) -> bytes Yes No CAP theorem AP with tunable consistency No Yes Storage architecture SSTable, Memtables Yes No Storage layer Local filesystem storage No No Fast reads and efficient storage Bloom filters, compactions Yes No Programming language Java No Yes Client programming language Multiple languages supported: Java, PHP, Python, REST, C++, .NET, and so on. Not known Not known Scalability model Horizontal scalability; multiple nodes deployment than a single machine deployment Yes Yes Version conflicts Timestamp field (not a vector clock as usually assumed) No No Hard deletes/updates Data is always appended using the timestamp field—deletes/updates are soft appends and are cleaned asynchronously as part of major compactions Yes No Summary Cassandra packs the best features of two technologies proven at scale—Google BigTable and Amazon Dynamo. However, today Cassandra has evolved beyond these origins with new unique and enterprise-ready features such as Cassandra Query Language (CQL), support for collection columns, lightweight transactions, and triggers. Resources for Article: Further resources on this subject: Basic Concepts and Architecture of Cassandra [Article] About Cassandra [Article] Getting Started with Apache Cassandra [Article]

0
0
2365

article-image-organizing-backbone-applications-structure-optimize-and-deploy

Packt

21 Jan 2014

9 min read

Organizing Backbone Applications - Structure, Optimize, and Deploy

Packt

21 Jan 2014

9 min read

(For more resources related to this topic, see here.) Creating application architecture The essential premise at the heart of Backbone has always been to try and discover the minimal set of data-structuring (Models and Collections) and user interface (Views and URLs) primitives that are useful when building web applications with JavaScript. Jeremy Ashkenas, creator of Backbone.js, Underscore.js, and CoffeeScript As Jeremy mentioned, Backbone.js has no intention, at least in the near future, to raise its bar to provide application architecture. Backbone will continue to be a lightweight tool to produce the minimal features required for web development. So, should we blame Backbone.js for not including such functionality even though there is a huge demand for this in the developer community? Certainly not! Backbone.js only yields the set of components that are necessary to create the backbone of an application and gives us complete freedom to build the app architecture in whichever way we want. If working on a significantly large JavaScript application, remember to dedicate sufficient time to planning the underlying architecture that makes the most sense. It's often more complex than you may initially imagine. Addy Osmani, author of Patterns For Large-Scale JavaScript Application Architecture So, as we start digging into more detail on creating an application architecture, we are not going to talk about trivial applications or something similar to a to-do-list app. Rather, we will investigate how to structure a medium- or large-level application. After discussions with a number of developers, we found that the main issue they face here is that there are several methodologies the online blog posts and tutorials offer to structure an application. While most of these tutorials talk about good practices, it becomes difficult to choose exactly one from them. Keeping that in mind, we will explore a number of steps that you should follow to make your app robust and maintainable in the long run. Managing a project directory This is the first step towards creating a solid app architecture. We have already discussed this in detail in the previous sections. If you are comfortable using another directory layout, go ahead with it. The directory structure will not matter much if the rest of your application is organized properly. Organizing code with AMD We will use RequireJS for our project. As discussed earlier, it comes with a bunch of facilities such as the following: Adding a lot of script tags in one HTML file and managing all of the dependencies on your own may work for a medium-level project, but will gradually fail for a large-level project. Such a project may have thousands of lines of code; managing a code base of that size requires small modules to be defined in each individual file. With RequireJS, you do not need to worry about how many files you have—you just know that if the standard is followed properly, it is bound to work. The global namespace is never touched and you can freely give the best names to something that matches with it the most. Debugging the RequireJS modules is a lot easier than other approaches because you know what the dependencies and path to each of them are in every module definition. You can use r.js, an optimization tool for RequireJS that minifies all the JavaScript and CSS files, to create the production-ready build. Setting up an application For a Backbone app, there must be a centralized object that will hold together all the components of the application. In a simple application, most people generally just make the main router work as the central object. But that will surely not work for a large application and you need an Application object that should work as the parent component. This object should have a method (mostly init()) that will work as the entry point to your application and initialize the main router along with the Backbone history. In addition, either your Application class should extend Backbone.Events or it should include a property that points to an instance of the Backbone.Events class. The benefit of doing this is that the app or Backbone.Events instance can act as a central event aggregator, and you can trigger application-level events on it. A very basic Application class will look like the following code snippet: // File: application.js define([ 'underscore', 'backbone', 'router' ], function (_, Backbone, Router) { // the event aggregator var PubSub = _.extend({}, Backbone.Events); var Application = function () { // Do useful stuff here } _.extend(Application.prototype, { pubsub: new PubSub(), init: function () { Backbone.history.start(); } }); return Application; }); Application is a simple class with an init() method and a PubSub instance. The init() method acts as the starting point of the application and PubSub works as the application-level event manager. You can add more functionality to the Application class, such as starting and stopping modules and adding a region manager for view layout management. It is advisable to keep this class as short as you can. Using the module pattern We often see that intermediate-level developers find it a bit confusing to initially use a module-based architecture. It can be a little difficult for them to make the transition from a simple MVC architecture to a modular MVC architecture. While the points we are discussing in this article are valid for both these architectures, we should always prefer to use a modular concept in nontrivial applications for better maintainability and organization. In the directory structure section, we saw how the module consists of a main.js file, its views, models, and collections all together. The main.js file will define the module and have different methods to manage the other components of that module. It works as the starting point of the module. A simple main.js file will look like the following code: // File: main.js define([ 'app/modules/user/views/userlist', 'app/modules/user/views/userdetails' ], function (UserList, UserDetails) { var myVar; return { initialize: function () { this.showUserList(); }, showUsersList: function () { var userList = new UserList(); userList.show(); }, showUserDetails: function (userModel) { var userDetails = new UserDetails({ model: userModel }); userDetails.show(); } }; }); As you can see, the responsibility of this file is to initiate the module and manage the components of that module. We have to make sure that it handles only parent-level tasks; it shouldn't contain a method that one of its views should ideally have. The concept is not very complex, but you need to set it up properly in order to use it for a large application. You can even go for an existing app and module setup and integrate it with your Backbone app. For instance, Marionette provides an application infrastructure for Backbone apps. You can use its inbuilt Application and Module classes to structure your application. It also provides a general-purpose Controller class—something that doesn't come with the Backbone library but can be used as a mediator to provide generic methods and work as a common medium among the modules. You can also use AuraJS (https://github.com/aurajs/aura), a framework-agonistic event-driven architecture developed by Addy Osmani (http://addyosmani.com) and many others; it works quite well with Backbone.js. A thorough discussion on AuraJS is beyond the scope of this book, but you can grab a lot of useful information about it from its documentation and examples (https://github.com/aurajs/todomvc). It is an excellent boilerplate tool that gives your app a kick-start and we highly recommend it, especially if you are not using the Marionette application infrastructure. The following are a few benefits of using AuraJS ; they may help you choose this framework for your application: AuraJS is framework-agnostic. Though it works great with Backbone.js, you can use it for your JavaScript module architecture even if you aren't using Backbone.js. It utilizes the module pattern, application-level and module-level communication using the facade (sandbox) and mediator patterns. It abstracts away the utility libraries that you use (such as templating and DOM manipulation) so you can swap alternatives anytime you want. Managing objects and module communication One of the most important ways to keep the application code maintainable is to reduce the tight coupling between modules and objects. If you are following the module pattern, you should never let one module communicate with another directly. Loose coupling adds a level of restriction in your code, and a change in one module will never enforce a change in the rest of the application. Moreover, it lets you re-use the same modules elsewhere. But how can we communicate if there is no direct relationship? The two important patterns we use in this case are the observer and mediator patterns. Using the observer/PubSub pattern The PubSub pattern is nothing but the event dispatcher. It works as a messaging channel between the object (publisher) that fires the event and another object (subscriber) that receives the notification. We mentioned earlier that we can have an application-level event aggregator as a property of the Application object. This event aggregator can work as the common channel via which the other modules can communicate, and that too without interacting directly. Even at the module-level, you may need a common event dispatcher only for that module; the views, models, and collections of that module can use it to communicate with each other. However, publishing too many events via a dispatcher sometimes makes it difficult to manage them and you must be careful enough to understand which events you should publish via a generic dispatcher and which ones you should fire on a certain component only. Anyhow, this pattern is one of the best tools to design a decoupled system, and you should always have one ready for use in your module-based application. Summary This article dealt with one of the most important topics of Backbone.js-based application development. At the framework level, learning Backbone is quite easy and developers get a complete grasp over it in a very short period of time. Resources for Article: Further resources on this subject: Building an app using Backbone.js [article] Testing Backbone.js Application [article] Understanding Backbone [article]

0
0
2235

How-To Tutorials - Design Patterns

Façade Pattern – Being Adaptive with Façade

Application Patterns

Patterns of Traversing

Cassandra Design Patterns

Getting Started with Meteor

Dealing with Legacy Code

The Chain of Responsibility Pattern

Middleware

Performance Optimization

Function passing

Trending Topics

Redis in Autosuggest

Index, Item Sharding, and Projection in DynamoDB

Design patterns

An overview of architecture and modeling in Cassandra

Organizing Backbone Applications - Structure, Optimize, and Deploy