Lesson 7: The Java Collections Framework and Generics
Activity 27: Read Users from CSV Using Array with Initial Capacity
Solution:
Create a class called UseInitialCapacity with a main() method:
public class UseInitialCapacity {
    public static void main(String[] args) throws Exception {
    }
}
Add a constant field that will be the initial capacity of the array. It will also be used when the array needs to grow:
private static final int INITIAL_CAPACITY = 5;
Add a static method that resizes arrays. It receives two parameters: an array of Users and an int that represents the new size of the array, and it returns a new array of Users. Implement the resize algorithm using System.arraycopy, as you did in the previous exercise. Be mindful that the new size might be smaller than the current size of the passed-in array:
private static User[] resizeArray(User[] users, int newCapacity) {
    User[] newUsers = new User[newCapacity];
    // Copy only as many elements as fit in the smaller of the two arrays
    int lengthToCopy = newCapacity > users.length ? users.length : newCapacity;
    System.arraycopy(users, 0, newUsers, 0, lengthToCopy);
    return newUsers;
}
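The resize step can be tried in isolation. The following is a minimal sketch, assuming a String array in place of the User array so that it compiles on its own; the class and method names are made up for illustration. It also compares the result with the standard Arrays.copyOf, which truncates or pads with nulls in the same way.

```java
import java.util.Arrays;

public class ResizeSketch {
    // Same algorithm as resizeArray, applied to String[] so that the
    // sketch does not depend on the User class.
    static String[] resize(String[] items, int newCapacity) {
        String[] resized = new String[newCapacity];
        int lengthToCopy = Math.min(items.length, newCapacity);
        System.arraycopy(items, 0, resized, 0, lengthToCopy);
        return resized;
    }

    public static void main(String[] args) {
        String[] names = {"a", "b", "c"};
        // Growing pads the new slots with null
        System.out.println(Arrays.toString(resize(names, 5))); // [a, b, c, null, null]
        // Shrinking drops the elements that no longer fit
        System.out.println(Arrays.toString(resize(names, 2))); // [a, b]
        // Arrays.copyOf produces the same array
        System.out.println(Arrays.equals(resize(names, 5), Arrays.copyOf(names, 5))); // true
    }
}
```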
Write another static method that loads the users from a CSV file into an array. It needs to ensure that the array has the capacity to receive the users as they are loaded from the file, so keep a count of the users stored so far and use it as the next free index. You'll also need to ensure that, after the users are loaded, the array does not contain extra slots at the end:
public static User[] loadUsers(String pathToFile) throws Exception {
    User[] users = new User[INITIAL_CAPACITY];
    int count = 0; // number of users stored so far
    BufferedReader lineReader = new BufferedReader(new FileReader(pathToFile));
    try (CSVReader reader = new CSVReader(lineReader)) {
        String[] row = null;
        while ((row = reader.readRow()) != null) {
            // Reached the end of the array
            if (count == users.length) {
                // Increase the array by INITIAL_CAPACITY
                users = resizeArray(users, users.length + INITIAL_CAPACITY);
            }
            // Store the user in the next free slot
            users[count++] = User.fromValues(row);
        } // end of while
        // If fewer rows were read than the array can hold, trim it
        if (count < users.length) {
            users = resizeArray(users, count);
        }
    } // end of try
    return users;
}
In the main method, call the load users method and print the total number of users loaded:
User[] users = loadUsers(args[0]);
System.out.println(users.length);
Add imports:
import java.io.BufferedReader;
import java.io.FileReader;
The output is as follows:
27
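The grow-then-trim pattern from this activity can be tried without the CSVReader and User classes. The following is a minimal sketch, assuming lines of text stand in for users; loadLines and the class name are made up for illustration, and Arrays.copyOf is used in place of the hand-written resize method.

```java
import java.util.Arrays;
import java.util.Scanner;

public class GrowAndTrimSketch {
    private static final int INITIAL_CAPACITY = 5;

    // Reads every line of the text into an array that grows in steps of
    // INITIAL_CAPACITY and is trimmed to the exact size at the end.
    static String[] loadLines(String text) {
        String[] lines = new String[INITIAL_CAPACITY];
        int count = 0; // number of lines stored so far
        Scanner scanner = new Scanner(text);
        while (scanner.hasNextLine()) {
            if (count == lines.length) {
                // The array is full: grow it before storing the next line
                lines = Arrays.copyOf(lines, lines.length + INITIAL_CAPACITY);
            }
            lines[count++] = scanner.nextLine();
        }
        // Drop the unused slots at the end, if there are any
        return count < lines.length ? Arrays.copyOf(lines, count) : lines;
    }

    public static void main(String[] args) {
        String[] lines = loadLines("r1\nr2\nr3\nr4\nr5\nr6\nr7\n");
        System.out.println(lines.length); // 7
    }
}
```

Seven lines force one growth step (from 5 to 10 slots) followed by a trim back to 7.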
Activity 28: Reading a Real Dataset Using Vector
Solution:
Before starting, change your CSVReader to support files without headers. To do that, add a new constructor that receives a boolean indicating whether it should ignore the first line:
public CSVReader(BufferedReader reader, boolean ignoreFirstLine) throws IOException {
    this.reader = reader;
    if (ignoreFirstLine) {
        reader.readLine();
    }
}
Change the old constructor to call the new one, passing true to ignore the first line. This way, you don't have to go back and change any existing code:
public CSVReader(BufferedReader reader) throws IOException {
    this(reader, true);
}
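The effect of chaining the two constructors can be checked with a small stand-in. The following is a minimal sketch, assuming a hypothetical LineSource class that only hands back raw lines; it is not the CSVReader from the lesson, and firstVisibleLine is a helper made up for the demonstration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class HeaderSkipSketch {
    // Hypothetical stand-in for CSVReader: it only returns raw lines.
    static class LineSource {
        private final BufferedReader reader;

        LineSource(BufferedReader reader, boolean ignoreFirstLine) throws IOException {
            this.reader = reader;
            if (ignoreFirstLine) {
                reader.readLine(); // discard the header line
            }
        }

        // The old signature delegates to the new one, so existing callers
        // keep skipping the header without any change.
        LineSource(BufferedReader reader) throws IOException {
            this(reader, true);
        }

        String nextLine() throws IOException {
            return reader.readLine();
        }
    }

    // Returns the first line a caller sees through either constructor.
    static String firstVisibleLine(String text, boolean useOldConstructor) {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        try {
            LineSource source = useOldConstructor
                    ? new LineSource(reader)
                    : new LineSource(reader, false);
            return source.nextLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "id,name\n10,Bill Gates\n";
        System.out.println(firstVisibleLine(text, true));  // 10,Bill Gates
        System.out.println(firstVisibleLine(text, false)); // id,name
    }
}
```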
Create a class called CalculateAverageSalary with a main method:
public class CalculateAverageSalary {
    public static void main(String[] args) throws Exception {
    }
}
Create another method that reads data from the CSV and loads the wages into a Vector. The method should return the Vector at the end:
private static Vector loadWages(String pathToFile) throws Exception {
    Vector result = new Vector();
    FileReader fileReader = new FileReader(pathToFile);
    BufferedReader bufferedReader = new BufferedReader(fileReader);
    try (CSVReader csvReader = new CSVReader(bufferedReader, false)) {
        String[] row = null;
        while ((row = csvReader.readRow()) != null) {
            if (row.length == 15) { // ignores empty lines
                result.add(Integer.parseInt(row[2].trim()));
            }
        }
    }
    return result;
}
In the main method, store the time at which the application started, so that the measured time includes loading. Then call the loadWages method and store the loaded wages in a Vector:
long start = System.currentTimeMillis();
Vector wages = loadWages(args[0]);
Initialize three variables to store the min, max, and sum of all wages. Use a long for the sum, because the total of all wages in this dataset is larger than Integer.MAX_VALUE and would overflow an int:
long totalWage = 0;
int maxWage = 0;
int minWage = Integer.MAX_VALUE;
In a for-each loop, process all wages, tracking the min and max and adding each wage to the sum:
for (Object wageAsObject : wages) {
    int wage = (int) wageAsObject;
    totalWage += wage;
    if (wage > maxWage) {
        maxWage = wage;
    }
    if (wage < minWage) {
        minWage = wage;
    }
}
At the end, print the number of wages loaded and the total time it took to load and process them. Also print the average, min, and max wages:
System.out.printf("Read %d rows in %dms\n", wages.size(), System.currentTimeMillis() - start);
System.out.printf("Average, Min, Max: %d, %d, %d\n", totalWage / wages.size(), minWage, maxWage);
Add imports:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Vector;
The output is as follows (the elapsed time will vary from run to run):
Read 32561 rows in 198ms
Average, Min, Max: 189778, 12285, 1484705
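When many int values are added up, the total can pass Integer.MAX_VALUE and silently wrap around, which distorts any average computed from it. The following is a minimal sketch with two made-up values, chosen only to force the overflow; the class and method names are illustrative.

```java
import java.util.Vector;

public class SumOverflowSketch {
    // Accumulates in an int: the total wraps around once it passes Integer.MAX_VALUE.
    static int sumAsInt(Vector<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    // Accumulates in a long: the same values add up without wrapping.
    static long sumAsLong(Vector<Integer> values) {
        long total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        Vector<Integer> values = new Vector<>();
        values.add(2_000_000_000);
        values.add(2_000_000_000);
        System.out.println(sumAsInt(values));  // -294967296
        System.out.println(sumAsLong(values)); // 4000000000
    }
}
```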
Activity 29: Iterating on Vector of Users
Solution:
Create a new class called IterateOnUsersVector with a main method:
public class IterateOnUsersVector {
    public static void main(String[] args) throws IOException {
    }
}
In the main method, call UsersLoader.loadUsersInVector, passing the first command-line argument as the file to load from, and store the data in a Vector:
Vector users = UsersLoader.loadUsersInVector(args[0]);
Iterate over the users Vector using a for-each loop and print the information about the users to the console:
for (Object userAsObject : users) {
    User user = (User) userAsObject;
    System.out.printf("%s - %s\n", user.name, user.email);
}
Add imports:
import java.io.IOException;
import java.util.Vector;
The output is as follows:
Bill Gates - william.gates@microsoft.com
Jeff Bezos - jeff.bezos@amazon.com
Marc Benioff - marc.benioff@salesforce.com
Bill Gates - william.gates@microsoft.com
Jeff Bezos - jeff.bezos@amazon.com
Sundar Pichai - sundar.pichai@google.com
Jeff Bezos - jeff.bezos@amazon.com
Larry Ellison - lawrence.ellison@oracle.com
Marc Benioff - marc.benioff@salesforce.com
Larry Ellison - lawrence.ellison@oracle.com
Jeff Bezos - jeff.bezos@amazon.com
Bill Gates - william.gates@microsoft.com
Sundar Pichai - sundar.pichai@google.com
Jeff Bezos - jeff.bezos@amazon.com
Sundar Pichai - sundar.pichai@google.com
Marc Benioff - marc.benioff@salesforce.com
Larry Ellison - lawrence.ellison@oracle.com
Marc Benioff - marc.benioff@salesforce.com
Jeff Bezos - jeff.bezos@amazon.com
Marc Benioff - marc.benioff@salesforce.com
Bill Gates - william.gates@microsoft.com
Sundar Pichai - sundar.pichai@google.com
Larry Ellison - lawrence.ellison@oracle.com
Bill Gates - william.gates@microsoft.com
Larry Ellison - lawrence.ellison@oracle.com
Jeff Bezos - jeff.bezos@amazon.com
Sundar Pichai - sundar.pichai@google.com
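The cast from Object in the loop is needed only because the Vector is used as a raw type. The following is a minimal sketch, assuming a Vector parameterized with String rather than the lesson's User class, of the same for-each loop without a cast; the class and method names are made up for illustration.

```java
import java.util.Vector;

public class TypedVectorSketch {
    // With Vector<String>, the loop variable is already a String, so no cast is needed.
    static int totalLength(Vector<String> emails) {
        int total = 0;
        for (String email : emails) {
            total += email.length();
        }
        return total;
    }

    public static void main(String[] args) {
        Vector<String> emails = new Vector<>();
        emails.add("a@b.com");
        emails.add("cd@e.org");
        // emails.add(42); would be rejected by the compiler
        System.out.println(totalLength(emails)); // 15
    }
}
```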
Activity 30: Using a Hashtable to Group Data
Solution:
Create a class called GroupWageByEducation with a main method:
public class GroupWageByEducation {
    public static void main(String[] args) throws Exception {
    }
}
Create a static method that creates and returns a Hashtable with keys of type String and values of type Vector of Integers:
private static Hashtable<String, Vector<Integer>> loadWages(String pathToFile) throws Exception {
    Hashtable<String, Vector<Integer>> result = new Hashtable<>();
    return result;
}
Between creating the Hashtable and returning it, load the rows from the CSV ensuring they have the correct format:
FileReader fileReader = new FileReader(pathToFile);
BufferedReader bufferedReader = new BufferedReader(fileReader);
try (CSVReader csvReader = new CSVReader(bufferedReader, false)) {
    String[] row = null;
    while ((row = csvReader.readRow()) != null) {
        if (row.length == 15) {
        }
    }
}
In the if inside the while loop, get the education level and wage for the record:
String education = row[3].trim();
int wage = Integer.parseInt(row[2].trim());
Find the Vector in the Hashtable that corresponds to the current education level and add the new wage to it:
// Get or create the vector with the wages for the specified education
Vector<Integer> wages = result.getOrDefault(education, new Vector<>());
wages.add(wage);
// Ensure the vector will be in the hashtable next time
result.put(education, wages);
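The get-or-create step can also be written with computeIfAbsent, which Hashtable has provided since Java 8 and which stores the new Vector only when the key is missing. The following is a minimal sketch, assuming made-up rows of the form {education, wage} instead of the CSV file; the class and method names are illustrative.

```java
import java.util.Hashtable;
import java.util.Vector;

public class GroupingSketch {
    // Groups wages by education level. Each row holds {education, wage}.
    static Hashtable<String, Vector<Integer>> group(String[][] rows) {
        Hashtable<String, Vector<Integer>> result = new Hashtable<>();
        for (String[] row : rows) {
            String education = row[0].trim();
            int wage = Integer.parseInt(row[1].trim());
            // Creates and stores the Vector on first use, then adds to it
            result.computeIfAbsent(education, key -> new Vector<>()).add(wage);
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"Masters", "100"},
            {"Bachelors", "80"},
            {"Masters", "120"}
        };
        Hashtable<String, Vector<Integer>> grouped = group(rows);
        System.out.println(grouped.get("Masters"));   // [100, 120]
        System.out.println(grouped.get("Bachelors")); // [80]
    }
}
```

Compared with getOrDefault followed by put, this avoids allocating a throwaway Vector for keys that already exist.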
In the main method, call your loadWages method passing the first argument from the command line as the file to load the data from:
Hashtable<String,Vector<Integer>> wagesByEducation = loadWages(args[0]);
Iterate on the Hashtable entries using a for-each loop and for each entry, get the Vector of the corresponding wages and initialize min, max and sum variables for it:
for (Entry<String, Vector<Integer>> entry : wagesByEducation.entrySet()) {
    Vector<Integer> wages = entry.getValue();
    int totalWage = 0;
    int maxWage = 0;
    int minWage = Integer.MAX_VALUE;
}
After initializing the variables, iterate over all wages and store the min, max and sum values:
for (Integer wage : wages) {
    totalWage += wage;
    if (wage > maxWage) {
        maxWage = wage;
    }
    if (wage < minWage) {
        minWage = wage;
    }
}
Then, print the information found for the specified entry, which represents an education level:
System.out.printf("%d records found for education %s\n", wages.size(), entry.getKey());
System.out.printf("\tAverage, Min, Max: %d, %d, %d\n", totalWage / wages.size(), minWage, maxWage);
Add imports:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Hashtable;
import java.util.Map.Entry;
import java.util.Vector;
The output is as follows:
1067 records found for education Assoc-acdm
    Average, Min, Max: 193424, 19302, 1455435
433 records found for education 12th
    Average, Min, Max: 199097, 23037, 917220
1382 records found for education Assoc-voc
    Average, Min, Max: 181936, 20098, 1366120
5355 records found for education Bachelors
    Average, Min, Max: 188055, 19302, 1226583
51 records found for education Preschool
    Average, Min, Max: 235889, 69911, 572751
10501 records found for education HS-grad
    Average, Min, Max: 189538, 19214, 1268339
168 records found for education 1st-4th
    Average, Min, Max: 239303, 34378, 795830
333 records found for education 5th-6th
    Average, Min, Max: 232448, 32896, 684015
576 records found for education Prof-school
    Average, Min, Max: 185663, 14878, 747719
514 records found for education 9th
    Average, Min, Max: 202485, 22418, 758700
1723 records found for education Masters
    Average, Min, Max: 179852, 20179, 704108
933 records found for education 10th
    Average, Min, Max: 196832, 21698, 766115
413 records found for education Doctorate
    Average, Min, Max: 186698, 19520, 606111
7291 records found for education Some-college
    Average, Min, Max: 188742, 12285, 1484705
646 records found for education 7th-8th
    Average, Min, Max: 188079, 20057, 750972
1175 records found for education 11th
    Average, Min, Max: 194928, 19752, 806316
Activity 31: Sorting Users
Solution:
Write a comparator class to compare Users by ID:
import java.util.Comparator;

public class ByIdComparator implements Comparator<User> {
    public int compare(User first, User second) {
        if (first.id < second.id) {
            return -1;
        }
        if (first.id > second.id) {
            return 1;
        }
        return 0;
    }
}
Write a comparator class to compare Users by email:
import java.util.Comparator;

public class ByEmailComparator implements Comparator<User> {
    public int compare(User first, User second) {
        return first.email.toLowerCase().compareTo(second.email.toLowerCase());
    }
}
Write a comparator class to compare Users by name:
import java.util.Comparator;

public class ByNameComparator implements Comparator<User> {
    public int compare(User first, User second) {
        return first.name.toLowerCase().compareTo(second.name.toLowerCase());
    }
}
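Comparators like these can also be built with the factory methods on Comparator (Java 8 and later) instead of one class per field. The following is a minimal sketch, assuming a hypothetical minimal User class with public id, name, and email fields; for plain ASCII text, String.CASE_INSENSITIVE_ORDER orders names the same way as comparing their lowercase forms.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.Vector;

public class ComparatorFactorySketch {
    // Hypothetical minimal User; the lesson's own class may differ.
    static class User {
        final int id;
        final String name;
        final String email;

        User(int id, String name, String email) {
            this.id = id;
            this.name = name;
            this.email = email;
        }
    }

    static final Comparator<User> BY_ID =
            Comparator.comparingInt((User user) -> user.id);
    static final Comparator<User> BY_NAME =
            Comparator.comparing((User user) -> user.name, String.CASE_INSENSITIVE_ORDER);

    // Sorts three sample users with the given comparator and returns their names in order.
    static Vector<String> sortedNames(Comparator<User> comparator) {
        Vector<User> users = new Vector<>();
        users.add(new User(30, "jeff bezos", "jeff.bezos@amazon.com"));
        users.add(new User(10, "Bill Gates", "william.gates@microsoft.com"));
        users.add(new User(20, "Marc Benioff", "marc.benioff@salesforce.com"));
        Collections.sort(users, comparator);
        Vector<String> names = new Vector<>();
        for (User user : users) {
            names.add(user.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames(BY_ID));   // [Bill Gates, Marc Benioff, jeff bezos]
        System.out.println(sortedNames(BY_NAME)); // [Bill Gates, jeff bezos, Marc Benioff]
    }
}
```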
Create a new class called SortUsers with a main method which loads the unique users keyed by email:
public class SortUsers {
    public static void main(String[] args) throws IOException {
        Hashtable<String, User> uniqueUsers = UsersLoader.loadUsersInHashtableByEmail(args[0]);
    }
}
After loading the users, transfer them into a Vector of Users so that they can be kept in a defined order, since a Hashtable does not preserve order:
Vector<User> users = new Vector<>(uniqueUsers.values());
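The two steps, removing duplicates through a Hashtable key and then copying the values into a Vector for sorting, can be seen end to end on plain strings. The following is a minimal sketch, assuming email strings in place of User objects; the class and method names are made up for illustration.

```java
import java.util.Collections;
import java.util.Hashtable;
import java.util.Vector;

public class UniqueThenSortSketch {
    // Deduplicates through the Hashtable keys, then sorts a Vector copy of the values.
    static Vector<String> uniqueSorted(String[] emails) {
        Hashtable<String, String> unique = new Hashtable<>();
        for (String email : emails) {
            unique.put(email, email); // a repeated key replaces the earlier entry
        }
        // The iteration order of a Hashtable is unspecified, so copy and sort
        Vector<String> result = new Vector<>(unique.values());
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        String[] emails = {"jeff@a.com", "bill@m.com", "jeff@a.com"};
        System.out.println(uniqueSorted(emails)); // [bill@m.com, jeff@a.com]
    }
}
```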
Ask the user which field they want to sort the users by, and read the answer from standard input:
Scanner reader = new Scanner(System.in);
System.out.print("What field you want to sort by: ");
String input = reader.nextLine();
Use the input in a switch statement to pick which comparator to use. If the input is not valid, print a friendly message and exit:
Comparator<User> comparator;
switch (input) {
    case "id":
        comparator = new ByIdComparator();
        break;
    case "name":
        comparator = new ByNameComparator();
        break;
    case "email":
        comparator = new ByEmailComparator();
        break;
    default:
        System.out.printf("Sorry, invalid option: %s\n", input);
        return;
}
Tell the user what field you're going to sort by and sort the Vector of users:
System.out.printf("Sorting by %s\n", input);
Collections.sort(users, comparator);
Print the users using a for-each loop:
for (User user : users) {
    System.out.printf("%d - %s, %s\n", user.id, user.name, user.email);
}
Add imports:
import java.io.IOException;
import java.util.Collections;
import java.util.Comparator;
import java.util.Hashtable;
import java.util.Scanner;
import java.util.Vector;
The output is as follows:
5 unique users found.
What field you want to sort by: email
Sorting by email
30 - Jeff Bezos, jeff.bezos@amazon.com
50 - Larry Ellison, lawrence.ellison@oracle.com
20 - Marc Benioff, marc.benioff@salesforce.com
40 - Sundar Pichai, sundar.pichai@google.com
10 - Bill Gates, william.gates@microsoft.com