Packt+ | Advance your knowledge in tech

You're reading from C++ Data Structures and Algorithm Design Principles Leverage the power of modern C++ to build robust and scalable applications

Product type Paperback

Published in Oct 2019

Publisher

ISBN-13 9781838828844

Length 626 pages

Edition 1st Edition

Languages

C++

Concepts

Data Structures and Algorithms

Authors (4):

Anil Achary

John Carey

Payas Rajan

Shreyans Doshi

View More author details

Table of Contents (11) Chapters

About the Book

1. Lists, Stacks, and Queues FREE CHAPTER

2. Trees, Heaps, and Graphs

3. Hash Tables and Bloom Filters

4. Divide and Conquer

5. Greedy Algorithms

6. Graph Algorithms I

7. Graph Algorithms II

8. Dynamic Programming I

9. Dynamic Programming II

1. Appendix

Chapter 4: Divide and Conquer

Activity 8: Vaccinations

In this activity, we will store and lookup the vaccination status of students to determine if they need to be vaccinated. These steps should help you complete the activity:

Begin by including the following headers:
#include <iostream>
#include <vector>
#include <chrono>
#include <random>
#include <algorithm>
#include <numeric>
Define the Student class as follows:
class Student
{
private:
    std::pair<int, int> name;
    bool vaccinated;
public:
    // Constructor
    Student(std::pair<int, int> n, bool v) :
        name(n), vaccinated(v)
    {}
    // Getters
    auto get_name() { return name; }
    auto is_vaccinated() { return vaccinated; }
    // Two people are same if they have the same name
    bool operator ==(const Student& p) const
    {
        return this->name == p.name;
    }
    // The ordering of a set of people is defined by their name
    bool operator< (const Student& p) const
    {
        return this->name < p.name;
    }
    bool operator> (const Student& p) const
    {
        return this->name > p.name;
    }
};
The following function lets us generate a student from random data:
auto generate_random_Student(int max)
{
    std::random_device rd;
    std::mt19937 rand(rd());
    // the IDs of Student should be in range [1, max]
    std::uniform_int_distribution<std::mt19937::result_type> uniform_dist(1, max);
    // Generate random credentials
    auto random_name = std::make_pair(uniform_dist(rand), uniform_dist(rand));
    bool is_vaccinated = uniform_dist(rand) % 2 ? true : false;
    return Student(random_name, is_vaccinated);
}
The following code is used to run and test the output of our implementation:
void search_test(int size, Student p)
{
    std::vector<Student> people;
    // Create a list of random people
    for (auto i = 0; i < size; i++)
        people.push_back(generate_random_Student(size));
    std::sort(people.begin(), people.end());
    // To measure the time taken, start the clock
    std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
    bool search_result = needs_vaccination(p, people);
    // Stop the clock
    std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
    std::cout << "Time taken to search = " <<
        std::chrono::duration_cast<std::chrono::microseconds>
        (end - begin).count() << " microseconds" << std::endl;
    if (search_result)
        std::cout << "Student (" << p.get_name().first
<< " " << p.get_name().second << ") "
            << "needs vaccination." << std::endl;
    else
        std::cout << "Student (" << p.get_name().first
<< " " << p.get_name().second << ") "
            << "does not need vaccination." << std::endl;
}
The following function implements our logic for whether a vaccination is needed:
bool needs_vaccination(Student P, std::vector<Student>& people)
{
    auto first = people.begin();
    auto last = people.end();
    while (true)
    {
        auto range_length = std::distance(first, last);
        auto mid_element_index = std::floor(range_length / 2);
        auto mid_element = *(first + mid_element_index);
        // Return true if the Student is found in the sequence and
// he/she's not vaccinated
        if (mid_element == P && mid_element.is_vaccinated() == false)
            return true;
        else if (mid_element == P && mid_element.is_vaccinated() == true)
            return false;
        else if (mid_element > P)
            std::advance(last, -mid_element_index);
        if (mid_element < P)
            std::advance(first, mid_element_index);
        // Student not found in the sequence and therefore should be vaccinated
        if (range_length == 1)
            return true;
    }
}
Finally, the driver code is implemented as follows:
int main()
{
    // Generate a Student to search
    auto p = generate_random_Student(1000);
    search_test(1000, p);
    search_test(10000, p);
    search_test(100000, p);
    return 0;
}
Note
Since we are randomizing values in step 3, your output may vary from the expected output shown for this activity.

Activity 9: Partial Sorting

The partial quicksort is only a slight modification of the original quicksort algorithm that was demonstrated in Exercise 20, Quicksort. Compared to that exercise, only step 4 is different. The following is a reference implementation:

Add the following header files:
#include <iostream>
#include <vector>
#include <chrono>
#include <random>
#include <algorithm>
Next, we shall implement the partition operation, as follows:
template <typename T>
auto partition(typename std::vector<T>::iterator begin,
    typename std::vector<T>::iterator end)
{
    auto pivot_val = *begin;
    auto left_iter = begin + 1;
    auto right_iter = end;
    while (true)
    {
        // Starting from the first element of vector,
        // find an element that is greater than pivot.
        while (*left_iter <= pivot_val && std::distance(left_iter, right_iter) > 0)
            left_iter++;
        // Starting from the end of vector moving to the beginning,
        // find an element that is lesser than the pivot.
        while (*right_iter > pivot_val && std::distance(left_iter, right_iter) > 0)
            right_iter--;
        // If left and right iterators meet, there are no elements left to swap.
        // Else, swap the elements pointed to by the left and right iterators
        if (left_iter == right_iter)
            break;
        else
            std::iter_swap(left_iter, right_iter);
    }
    if (pivot_val > *right_iter)
        std::iter_swap(begin, right_iter);
    return right_iter;
}
Since the desired output also needs an implementation of the quicksort algorithm, we'll implement one as follows:
template <typename T>
void quick_sort(typename std::vector<T>::iterator begin,
    typename std::vector<T>::iterator last)
{
    // If there are more than 1 elements in the vector
    if (std::distance(begin, last) >= 1)
    {
        // Apply the partition operation
        auto partition_iter = partition<T>(begin, last);
        // Recursively sort the vectors created by the partition operation
        quick_sort<T>(begin, partition_iter-1);
        quick_sort<T>(partition_iter, last);
    }
}
Implement the partial quicksort function as follows:
template <typename T>
void partial_quick_sort(typename std::vector<T>::iterator begin,
    typename std::vector<T>::iterator last,
    size_t k)
{
    // If there are more than 1 elements in the vector
    if (std::distance(begin, last) >= 1)
    {
        // Apply the partition operation
        auto partition_iter = partition<T>(begin, last);
        // Recursively sort the vectors created by the partition operation
        partial_quick_sort<T>(begin, partition_iter-1, k);

        // Sort the right subvector only if the final position of pivot < k
        if(std::distance(begin, partition_iter) < k)
            partial_quick_sort<T>(partition_iter, last, k);
    }
}
The following helper functions can be then used to print the contents of a vector and to generate a random vector:
template <typename T>
void print_vector(std::vector<T> arr)
{
    for (auto i : arr)
        std::cout << i << " ";
    std::cout << std::endl;
}
// Generates random vector of a given size with integers [1, size]
template <typename T>
auto generate_random_vector(T size)
{
    std::vector<T> V;
    V.reserve(size);
    std::random_device rd;
    std::mt19937 rand(rd());
    // the IDs of Student should be in range [1, max]
    std::uniform_int_distribution<std::mt19937::result_type> uniform_dist(1, size);
    for (T i = 0; i < size; i++)
        V.push_back(uniform_dist(rand));
    return std::move(V);
}
The following function implements the testing logic for our sorting functions:
// Sort the first K elements of a random vector of a given 'size'
template <typename T>
void test_partial_quicksort(size_t size, size_t k)
{
        // Create two copies of a random vector to use for the two algorithms
        auto random_vec = generate_random_vector<T>(size);
        auto random_vec_copy(random_vec);
        std::cout << "Original vector: "<<std::endl;
        print_vector<T>(random_vec);

        // Measure the time taken by partial quick sort
        std::chrono::steady_clock::time_point
begin_qsort = std::chrono::steady_clock::now();
        partial_quick_sort<T>(random_vec.begin(), random_vec.end()-1, k);
        std::chrono::steady_clock::time_point
end_qsort = std::chrono::steady_clock::now();

        std::cout << std::endl << "Time taken by partial quick sort = "
            << 'std::chrono::duration_cast<std::chrono::microseconds>
            (end_qsort - begin_qsort).count()
            << " microseconds" << std::endl;

        std::cout << "Partially sorted vector (only first "<< k <<" elements):";
        print_vector<T>(random_vec);
        // Measure the time taken by partial quick sort
        begin_qsort = std::chrono::steady_clock::now();
        quick_sort<T>(random_vec_copy.begin(), random_vec_copy.end()-1);
        end_qsort = std::chrono::steady_clock::now();
        std::cout << std::endl <<"Time taken by full quick sort = "
            << std::chrono::duration_cast<std::chrono::microseconds>
            (end_qsort - begin_qsort).count()
            << " microseconds" << std::endl;

        std::cout << "Fully sorted vector: ";
        print_vector<T>(random_vec_copy);
}
Finally, add the driver code, as follows:
int main()
{
test_partial_quicksort<unsigned>(100, 10);
return 0;
}

Activity 10: Implementing WordCount in MapReduce

In this activity, we will implement the MapReduce model to solve the WordCount problem. The following is the solution to this activity:

Implement the map task as follows:
struct map_task : public mapreduce::map_task<
    std::string,                             // MapKey (filename)
    std::pair<char const*, std::uintmax_t>>  // MapValue (memory mapped file contents)
{
    template<typename Runtime>
    void operator()(Runtime& runtime, key_type const& key, value_type& value) const
    {
        bool in_word = false;
        char const* ptr = value.first;
        char const* end = ptr + value.second;
        char const* word = ptr;
        // Iterate over the contents of the file, extract words and emit a <word,1> pair.
        for (; ptr != end; ++ptr)
        {
            // Convert the character to upper case.
            char const ch = std::toupper(*ptr, std::locale::classic());
            if (in_word)
            {
                if ((ch < 'A' || ch > 'Z') && ch != '\'')
                {
runtime.emit_intermediate(std::pair<char const*,
              std::uintmax_t> (word, ptr - word), 1);
                    in_word = false;
                }
            }
            else if (ch >= 'A' && ch <= 'Z')
            {
                word = ptr;
                in_word = true;
            }
        }
        // Handle the last word.
        if (in_word)
        {
            assert(ptr > word);
            runtime.emit_intermediate(std::pair<char const*,
                          std::uintmax_t>(word, ptr - word), 1);
        }
    }
};
The preceding map function is applied separately to each file in the input directory. The contents of the input file are accepted as the * character in value. The inner loop then iterates over the contents of the file, extracting different words and emitting < key, value > pairs, where key is a word and value is set to 1.
Implement the reduce task as follows:
template<typename KeyType>
struct reduce_task : public mapreduce::reduce_task<KeyType, unsigned>
{
    using typename mapreduce::reduce_task<KeyType, unsigned>::key_type;
    template<typename Runtime, typename It>
    void operator()(Runtime& runtime, key_type const& key, It it, It const ite) const
    {
        runtime.emit(key, std::accumulate(it, ite, 0));
}
};
The reduce operation can then be applied to all < key, value > pairs that are emitted by the map function. Since the value was set to 1 in the previous step, we can now use std::accumulate() to get the total number of times a key appears among the input pairs of the reduce operation.