6. Arrays
Too much of a good thing can be wonderful.
6.1. Introduction
With one exception, all of the types we’ve talked about in this book
have held a single value. For example, an int
variable can only
contain a single int
value. If you try to put a second int
value
into a variable, it will overwrite the first.
The String
type, of course, is the exception. String
objects can
contain char
sequences of any length from 0 up to a practically
limitless size (the theoretical maximum length of a Java String
is
Integer.MAX_INT
, more than 650 times the length of War and Peace).
As remarkable as String
objects are, this chapter is about a more
general kind of list called an array. We can create arrays to hold a
list of any type of variable in Java.
The ability to work with lists expands the scope of problems that we can solve. Beyond simple lists, we can use the same tools to create tables, grids, and other structures to solve fascinating problems like the one that comes next.
6.2. Problem: Game of Life
Some physicists insist that the rules governing the universe are horribly complicated. Some insist that the fundamental laws are simple and only their overall interaction is complex. With the power to do simulations quickly, computer scientists have shown that some systems can exhibit very complex interactions using simple rules. Perhaps the best known examples of these systems are cellular automata, of which Conway’s Game of Life is the most famous.
The framework for Conway’s Game of Life is an infinite grid made up of squares called cells. Some cells in the grid are black (or alive, to give it a biological flavor), and the rest are white (or dead). Any given cell has 8 neighbors as shown in the figure below.
The pattern of alive and dead cells at any given time on the grid is called a generation. To determine the next generation, we use the following rules.
-
If a living cell has fewer than two living neighbors, it dies from loneliness.
-
If a living cell has more than three neighbors, it dies because of overcrowding.
-
If a living cell has exactly two or three living neighbors, it lives to the next generation.
-
If a dead cell has exactly three live neighbors, it becomes a living cell.
These four simple rules allow for more complex interactions than you might expect. The patterns that emerge from applying these rules to a starting configuration of alive and dead cells strike a balance between complete chaos and rigid order. As the name of the game implies, the similarity to biological patterns of development can be surprising.
Your problem is to create a Life simulator of size n × m, specific values for which will be discussed below. The program should simulate the process at a speed that is engaging to watch, with a new generation every tenth of a second. The program should begin by randomly making 10% of all the cells living and the rest dead.
6.2.1. Terminal limitations
One problem you might be worrying about is how to display the
simulation. In Chapter 15 you’ll learn how to make a graphical user interface (GUI)
that can display a grid of cells in black and white and much more
interesting things as well. For now, the main tool that we can use for
output is still the terminal. The output method we recommend is printing
an asterisk ('*'
) for a living cell and a space (' '
) for a dead
one. In this way you can easily see the patterns form on the screen and
change over time.
The classic terminal isn’t very big. For this reason, we suggest that you set the height of your simulation to 24 rows and the width of your simulation to 80 columns. These dimensions conform to the most ancient terminal sizes. If your terminal screen is much larger, you can change the width and height later to perform a larger simulation. No matter how large the display for the game is, the ideal size of the game. Because our size is so limited, we must deal with the problem of a cell on the boundary. Anything beyond the boundaries should be counted as dead. Thus, a cell right on the edge of the simulation grid can have a maximum of 5 neighbors. A cell in one of the four corners can only have 3 neighbors.
In order to give the appearance of smooth transitions, you need to print out each new generation of cells quickly, and in the same locations as the previous generation. Simply printing one set after another will not achieve this effect unless your terminal screen is exactly 24 rows tall. So, you will need to clear the screen each time. In an effort to be platform independent, Java does not provide an easy way to clear the terminal screen. A quick and simple hack is to simply print out 100 blank lines before printing the next generation. This hack will work as long as your terminal is not significantly more than 100 rows in height. If it is, you’ll need to print a larger number of blank rows.
Finally, you need to wait a short period of time between generations so that the user can see each configuration before it’s cleared away and replaced by the next one. The simplest way to do this is by having your program go to sleep for a short period of time, a tenth of a second as we suggested before. The code to make your program sleep for that amount of time is:
try { Thread.sleep(100); }
catch(InterruptedException e) {}
We’ll explain this code in much greater detail in Chapter 13.
The key item of importance is the number passed into the sleep()
method.
This value is the number of milliseconds you want your program to sleep.
100 milliseconds is, of course, one tenth of a second.
In order to simulate the Game of Life, we need to store information, namely the liveness or deadness of cells, in a grid. First, we need to discuss the simpler task of storing data in a list.
6.3. Concepts: Lists of data
Lists of data are of paramount importance in many different areas of life: shopping lists, packing lists, lists of employees working at a company, address books, top ten lists, and more. Lists are even more important in programming. As you know, one of the great strengths of computers is their speed. If we have a long list of data, we can use that speed to perform operations on all the data quickly. E-mail contact lists, entries in a database, and cells in a spreadsheet are just a few of the most obvious ways that lists come up in computer applications.
Even in Java, there are many different ways to record a list of information, but a list is only one form of data structure. As the name implies, a data structure is a way to organize data, whether in a list, in a tree, in sorted order, in some kind of hierarchy, or in any other way that might be useful. We’ll only talk about the array data structure in this chapter, but other data structures will be discussed in Chapter 18. Below, we give a short explanation of some of the attributes any given data structure might have.
6.3.1. Data structure attribute
- Contents
-
Keeping only a single value in a data structure defeats the purpose of a data structure. But, if we can store more than a single value, must all of those values come from the same type? If a data structure can hold several different types of data, we call it heterogeneous, but if it can only hold one type, we call it homogeneous.
- Size
-
The size of a data structure may be something that’s fixed when it’s created or it could change on the fly. If a data structure’s size or length can change, we call it a dynamic data structure. If the data structure has a size or length that can never change after it’s been created, we call it a static data structure.
- Element Access
-
One of the reasons there are so many different data structures is that different ways of structuring data are better or worse for a given task. For example, if your task is to add a list of numbers, then you’re expecting to access every single element in the list. However, if you’re searching for a word in a dictionary, you don’t want to check every dictionary entry to find it.
Some data structures are optimized so that you can efficiently read, insert, or delete only a single item, often the first (or last) item in the data structure. Some data structures only allow you to move through the structure sequentially, one item after another. Such a data structure has what is called sequential access. Still others allow you to jump randomly to any point you want inside the data structure efficiently. These data structures have what is called random access. Advanced programmers take into account many different factors before deciding which data structure is best suited to their problem.
6.3.2. Characteristics of an array
Now that we’ve defined these attributes, we can say that an array is a
homogeneous, static data structure with random access. An array is
homogeneous because it can only hold a list of a single type of data,
such as int
, double
, or String
. An array is static because it has
a fixed length that is set only when the array is instantiated. An array
also has random access because jumping to any element in the array is
fast and takes about the same amount of time as jumping to any
other.
An array is a list of a specific type of elements that has length n, a length specified when the array is created. Each of the n elements is indexed using a number between 0 and n - 1]. Once again, zero-based counting rears its ugly head. Consider the following list of items: {9, 4, 2, 1, 6, 8, 3}
If this list is stored in an array, the first element, 9, would have index 0, 4 would have index 1, and so on, finishing at 3 with an index of 6, although the total number of items is 7. Not all languages use zero-based counting for array indexes, but many do, including C, C++, and Java. The reason that languages like C originally used zero-based counting for indexes is that the variable corresponding to the array is an address inside the computer’s memory giving the first element in the array. Thus, an index of 0 is 0 times the size of an element added to the starting address, and an index of 5 is 5 times the size of an element added to the starting address. So, zero-based indexes gave a quick way for the program to compute where in memory a given element of an array is.
6.4. Syntax: Arrays in Java
The idea of a list is not mysterious. Numbering each element of the list is natural, even though the numbers start at 0 instead of 1. Nevertheless, arrays are the source of many errors that cause Java programs to crash. Below, we explain the basics of creating arrays, indexing into arrays, and using arrays with loops. Then, there’s an extra subsection explaining how to send data from a file to a program as if the file were being typed in by a user. Using this technique can save a lot of time when you’re experimenting with arrays.
6.4.1. Array declaration and instantiation
To create an array, you usually need to create an array variable first.
Remember that an array is a homogeneous data structure, meaning that it
can only store elements of a single type. When you create an array
variable, you have to specify what that type is. To declare an array
variable, you use the type it’s going to hold, followed by square
brackets ([]
), followed by the name of the variable. For example, if
you want to create an array called numbers
that can hold integers, you
would type the following.
int[] numbers;
If you have some C or C++ programming experience, you might be used to the brackets being on the other side of the variable, like so.
int numbers[];
In Java, both declarations are perfectly legal and equivalent. However,
the first declaration is preferred from a stylistic perspective. It
follows the pattern of using the type (an array of int
values in this
case) followed by the variable name as the syntax for a declaration.
As we said, arrays are also static data structures, meaning that their
length is fixed at the time of their creation. Yet we didn’t specify a
length above. This declaration has not yet created an array, just a
variable that can point at an array. In the second half of this chapter,
we will further discuss this difference between the way an array is
created and the way an int
or any other variable of primitive type is
created. To actually create the array, we need to use another step,
involving the keyword new
. Here’s how we instantiate an array of
int
type with 10 elements.
numbers = new int[10];
We use the keyword new
, followed by the type of element, followed by
the number of elements the array can hold in square brackets. This new
array is stored into numbers
. In other words, the variable numbers
is now a name for the array. Commonly, the two steps of declaring and
instantiating an array will be combined into one line of code.
int[] numbers = new int[10];
It’s always possible to separate the two steps. In some cases, a single variable might be used to point at an array of one particular length, then changed to point at an array of another length, and so on, as below.
int[] numbers;
numbers = new int[10];
numbers = new int[100];
numbers = new int[1000];
Here, the variable numbers
starts off pointing at no array. Next, it’s
made to point at a new array with 10 elements. Then, it’s made to
point at a new array with 100 elements, ignoring the 10 element array.
Finally, it’s made to point at an array with 1,000 elements, ignoring
the 100 element array. Remember, the arrays themselves are static; their
lengths can’t change. The array type variables, however, can point at
different arrays with different lengths, provided that they’re still
the right type (in this case, arrays of int
values).
What values are inside the array when it’s first created? Let’s return
to the case where numbers
points at a new array with 10 elements. Each
of those elements contains the int
value 0
, as shown below.
Whenever an array is instantiated, each of its n elements
is set to some default value. For int
, long
, short
, and byte
this value is 0
. For double
and float
, this value is 0.0
. For
char
, this value is '\0'
, a special unprintable character. For
boolean
, this value is false
. For String
or any other reference
type, this value is null
, a special value that means there’s no
object.
It’s also possible to use a list to initialize an array. For example,
we can create an array of type double
that contains the values 0.5
,
1.0
, 1.5
, 2.0
, and 2.5
using the following code.
double[] increments = {0.5, 1.0, 1.5, 2.0, 2.5};
This line of code is equivalent to using the new
keyword to create a
double
array with 5 elements and then setting each to the values
shown.
6.4.2. Indexing into arrays
To use a value in an array, you must index into the array, using the
square brackets once again. Returning to the example of the int
array
numbers
with length 10, we can read the value at index 4 from the
array and print it out.
System.out.println(numbers[4]);
Until some other value has been stored there, the value of numbers[4]
is 0
,
and so 0
is all that will be printed out.
We can set the value at numbers[4]
to 17
as follows.
numbers[4] = 17;
Then, if we try to print out numbers[4]
, 17
will be printed. The
contents of the numbers
array will look like this.
The key thing to understand about indexing into an array is that it
gives you an element of the specified type. In other words, numbers[4]
is an int
variable in every possible sense. You can read its value.
You can change its value. You can pass it into a method. It can be used
anywhere a normal int
can be used, as in the following example.
int x = numbers[4];
double y = Math.sqrt(numbers[2]) + numbers[4];
numbers[9] = (int)(y*x);
Executing this code will store 17
into x
and 17.0
into y
. Then,
the product of those two, 289
, will be stored into numbers[9]
.
Remember, in Java, the type on the left and the type on the right of the
assignment operator (=
) must match, except in cases of automatic
casting, like storing an int
value into a double
variable. Since
they have the same type, it makes sense to store an element of an int
array like numbers[4]
into an int
variable like x
. However, an
array of int
values can’t be stored into an int
type.
int z = numbers;
This code will cause a compiler error. What would it mean? You can’t put a list of variables into a single variable. And the converse is true as well.
numbers = 31;
This code will also cause a compiler error. A single value can’t be stored into a whole list. You would have to specify an index where it can be stored. Furthermore, you must be careful to specify a legal index. No negative index will ever be legal, and neither will an index greater than or equal to the number of elements in the array.
numbers[10] = 99;
This code will compile correctly. If you remember, we instantiated the
array that numbers
points at to have 10 elements, numbered 0 through
9. Thus, we are trying to store 99
into the element that is one index
after the last legal element. As a result, Java will cause an error
called an ArrayIndexOutOfBoundsException
to happen, which will crash
your program.
6.4.3. Using loops with arrays
One reason to use arrays is to avoid declaring 10 separate variables
just to have 10 int
values to work with. But once you have the array,
you’ll often need an automated way to process it. Any of the three
kinds of loops provides a powerful tool for performing operations on an
array, but the for
loop is an especially good match. Here is an
example of a for
loop that sets the values in an array to their
indexes.
int[] values = new int[100];
for(int i = 0; i < 100; i++)
values[i] = i;
This sample of code shows how easy it is to iterate over every element
in an array with a for
loop, but it has a flaw in its style. Note that
the number 100
is used twice: once in the instantiation of the array
and a second time in the termination condition of the for
loop. This
fragment of code works fine, but if the programmer changes the length of
values
to be 50
or 500
, the bounds of the for
loop will also
need to change. Furthermore, the length of the array might be determined
by user input.
To make the code both more robust and readable, we can use the length
field of the values
array for the bound of the for
loop.
int[] values = new int[100];
for(int i = 0; i < values.length; i++)
values[i] = i;
The length
field gives the length of the array that values
points
to. If the programmer wants to instantiate the array with a different
length, that’s fine. The length
field will always reflect the correct
value. Whenever possible, use the length
field of arrays in your code.
Note that the length
field is read-only. If you try to set
values.length
to a specific value, your code will not compile.
Setting the values in an array is only one possible task you can perform
with a loop. Let’s assume that an array of type double
named data
has been declared, instantiated, and filled with user input. We could
sum all its elements using the following code. A more elegant way to do
the same summation is discussed in Section 6.9.1.
double sum = 0.0;
for(int i = 0; i < data.length; i++)
sum += data[i];
System.out.println("The sum of your data is: " + sum);
So far, we’ve only discussed operations on the values in an array. It
is important to realize that the order of those values can be equally
important. We’re going to create an array of char
type named
letters
, initialized with some values, and then reverse the order of
the array.
char[] letters = {'b', 'y', 'z', 'a', 'n', 't', 'i', 'n', 'e'}; (1)
int start = 0; (2)
int end = letters.length - 1; (3)
char temp;
while(start < end) { (4)
temp = letters[start]; (5)
letters[start] = letters[end];
letters[end] = temp;
start++; (6)
end--;
}
for(int i = 0; i < letters.length; i++) (7)
System.out.print(letters[i]);
1 | After initializing the letters
array, we declare start and end . |
2 | start gets the value 0 , the
first index of letters . |
3 | end gets the value letters.length - 1 , the last valid index
of letters . |
4 | The while loop continues as long as
the start is less than the end . |
5 | The first three lines of each
iteration of the while loop will swap the char at index start with
the char at index end . |
6 | The two lines after that will increment and
decrement start and end , respectively. When the two meet in the
middle, the entire array has been reversed. |
7 | The simple for loop at the
end prints out each char in letters , giving the output enitnazyb . |
Of course, we could have printed out the array elements in reverse order without changing their order, but we wanted to reverse them, perhaps because we will need them reversed in the future.
6.4.4. Redirecting input
With arrays and loops, we can process a lot of data, but testing programs that process a lot of data can be tedious. Instead of typing data into the terminal, we can read data from a file. In Java, file I/O is a messy process that involves several objects and method calls. We’re going to talk about it in depth in Chapter 20, but for now we can use a quick and easy workaround.
If you create a text file using a simple text editor, you can redirect
the file as input to a program. Everything you’ve written in the text
file is treated as if it were being typed into the command line by a
person. To do so, you type the command using java
to run your class
file normally, type the <
sign, and then type the name of the file you
want to use as input. For example, if you have a text file called
numbers.txt
that you want to use as input to a program stored in
Summer.class
, you could do so as follows.
java Summer < numbers.txt
Redirecting input this way is not a part of Java. Instead, it’s a feature of the terminal running under your OS. Not all operating systems support input redirection, but virtually every flavor of Linux and Unix do, as well as the Windows command line and the macOS terminal. We could write the program mentioned above and give it the simple task of summing all the numbers it gets as input.
import java.util.*;
public class Summer {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
System.out.print("How many numbers do you want to add? ");
int n = in.nextInt();
int sum = 0;
for(int i = 0; i < n; i++) {
System.out.print("Enter next number: ");
sum += in.nextInt();
}
System.out.println("The sum of the numbers is " + sum);
}
}
Now, we can type out a file with a list of numbers in it and save it as
numbers.txt
. To conform with the program we wrote, we should also put
the total count of numbers as the first value in the file. You can put
each number on a separate line or just leave a space between each one.
As long as they are separated by white space, the Scanner
object will
take care of the rest. You’ll have to type the numbers into the file
once, but then you can test your program over and over with the same file.
If you do run the program with the file you’ve created, you’ll notice
that the program still prompts you once for the total count of numbers
and then prompts you many times to enter the next number. With
redirected input, all that text runs together in a bizarre way. All the
input is coming from numbers.txt
. If you expect a program to read
strictly from redirected input, you can design your code a little
differently. For one thing, you don’t need to have explicit prompts for
the user. For another, you can use a number of special methods from the
Scanner
class. The Scanner
class has a several methods like
hasNextInt()
and hasNextDouble()
. These methods will examine the
input and see if there is another legal int
or double
and return
true
or false
accordingly. If you expect a file to have only a long
sequence of int
values, you can use hasNextInt()
to determine if
you’ve reached the end of the file or not. Using hasNextInt()
, we can
simplify the program and remove the expectation that the first number
gives the total count of numbers.
import java.util.*;
public class QuietSummer {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
int sum = 0;
while(in.hasNextInt())
sum += in.nextInt();
System.out.println("The sum of the numbers is " + sum);
}
}
On the other hand, you might be interested in the output of a program.
The output could be very long or it might take a lot of time to produce
or you might want to store it permanently. For these situations, it is
possible to redirect output as well. Instead of printing to the
screen, you can send the output to a file of your choosing. The syntax
for this operation is just like the syntax for input redirection except
that you use the >
sign instead of <
. To run QuietSummer
with
input from numbers.txt
and output to sum.txt
, we could do the
following.
java QuietSummer < numbers.txt > sum.txt
You would be free to examine sum.txt
at any time with your text editor
of choice. When using output redirection, it makes more sense to run
QuietSummer
than Summer
. If we had run Summer
, all of that
unnecessary output prompting the user to enter numbers would be saved in
sum.txt
.
6.5. Examples: Array usage
Here are a few examples of practical array usage. We’re going to discuss some techniques useful mostly for searching and sorting. Searching for values in a list seems mundane, but it’s one of the most practical tasks that a computer scientist routinely carries out. By making a computer do the work, it saves human beings countless hours of tedious searching and checking. Another important task is sorting. Sorting a list can make future searches faster and is the simplest way to find the median of a list of values. Sorting is a fundamental part of countless real world problems.
In the examples below, we’ll first discuss finding the largest (or smallest) value in a list, move on to sorting lists, and then talk about a task that searches for words, like a dictionary look up.
Finding the largest value input by a user is not difficult. Applying
that knowledge to an array is pretty straightforward as well. This
simple task is also a building block of the sorting algorithm we’ll
discuss below. The key to finding the largest value in any list is to
keep a temporary variable that records the largest value found so far.
As we go along, we update the variable if we find a larger value. The
only trick is initializing the variable to some appropriate starting
value. We could initialize it to zero, but what if entire list of
numbers is negative? Then, our answer would be larger than any of the
numbers in the list. If our list of numbers is of type int
, we could
initialize our variable to Integer.MIN_VALUE
, the smallest possible
int
. This approach works, but you have to remember the name of the
constant, and it doesn’t improve the readability of the code.
When working with an array, the best way to find the largest value in
the list is by setting your temporary variable to the first element
(index 0
) in the array. Below is a short snippet of code that finds
the largest value in an int
array named values
in exactly this way.
int largest = values[0];
for(int i = 1; i < values.length; i++)
if(values[i] > largest)
largest = values[i];
System.out.println("The largest value is " + largest);
Note that the for
loop starts at 1
not 0
. Because largest
is
initialized to be values[0]
, there’s no reason to check that value a second time.
Doing so would still give the correct answer, but it wastes a tiny
amount of time.
What’s the feature of this code that makes it find the largest value?
The key is the >
operator. With the change of a single character, we
could find the smallest value instead.
int smallest = values[0];
for(int i = 1; i < values.length; i++)
if(values[i] < smallest)
smallest = values[i];
System.out.println("The smallest value is " + smallest);
In addition to the necessary change from >
to <
, we also changed the
output and the name of the variable to avoid confusion. Now, we’ll show
how repeatedly finding the smallest value in an array can be used to
sort it. Alternatively, the largest value could be used equally well.
Sorting is the bread and butter of computer scientists. Much research has been devoted to finding the fastest ways to sort a list of data. The rest of the world assumes that sorting a list of data is trivial because computer scientists have done such a good job solving this problem. The name of the sorting algorithm we are going to describe below is selection sort. It’s not one of the fastest ways to sort data, but it’s simple and easy to understand.
The idea behind selection sort is to find the smallest element in an
array and put it at index 0
of the array. Then, from the remaining
elements, find the smallest element and put it at index 1
of the
array. The process continues, filling the array up from the beginning
with the smallest values until the entire array is sorted. If the length
of the array is n, we’ll need to look for the smallest
element in the array n - 1 times. By putting the code that
searches for the smallest value inside of an outer loop, we can write a
program that does selection sort of int
values input by the user as
follows. This program’s not very long, but there’s a lot going on.
import java.util.*;
public class SelectionSort {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
int n = in.nextInt(); (1)
int[] values = new int[n]; (2)
int smallest;
int temp;
for(int i = 0; i < values.length; i++) (3)
values[i] = in.nextInt();
1 | After
instantiating a Scanner , we read in the total number of values the
list will hold. We cannot rely on the hasNextInt() method to tell us
when to stop reading values. |
2 | We need to know up front how many values we are going to store so that we can instantiate our array with the appropriate length. |
3 | Then, we read each value into the array using the
first for loop. |
for(int i = 0; i < n - 1; i++) { (1)
smallest = i;
for(int j = i + 1; j < n; j++) (2)
if(values[j] < values[smallest])
smallest = j; (3)
temp = values[smallest]; (4)
values[smallest] = values[i];
values[i] = temp;
}
1 | The next for loop is where the actual sort happens. We start at index
0 and then try to find the smallest value to be put in that spot.
Then, we move on to index 1 , and so on, just as we described before.
Note that we only go up to n - 2 . We don’t need to find the value to
put in index n - 1 , because the rest of the list has the n - 1
smallest numbers in it and so the last number must already be the
largest. |
2 | If you look carefully, you’ll notice that the inner for
loop has the same overall shape as the loop used to find the smallest
value in the previous example; however, there is one key difference: |
3 | Instead of storing the value of the smallest number in smallest , we
now store the index of the smallest number. We need to store the index
of the smallest number so that, in the next step, we can swap the
corresponding element with the element at i , the spot in the array
we’re trying to fill. |
4 | The three lines after the inner for loop are a
simple swap to do exactly that. |
System.out.print("The sorted list is: ");
for(int i = 0; i < values.length; i++) (1)
System.out.print(values[i] + " ");
}
}
1 | After all the sorting is done, the final for loop prints out the newly
sorted list. |
This program gives no prompts for user input, so it’s well designed for input redirection. If you’re going to make a file containing numbers you want to sort with this program, make sure that the first number is the total count of numbers in the file.
Again, this program sorts the list in ascending order (from smallest to
largest). If you wanted to sort the list in descending order, you would
only need to change the <
to a >
in the comparison of the inner
for
loop, although other changes are recommended for the sake of
readability.
In this example, we read in a list of words and a long passage of text and keep track of the number of times each word in the list occurs in the passage. This kind of text searching has many applications. Similar ideas are used in a spell checker that needs to look up words in a dictionary. The incredibly valuable find and replace tools in modern word processors use some of the same techniques.
To make this program work, however, we need to read in a (potentially long) list of words and then a lot of text. We are forced to use input redirection (or some other file input) because typing this text in multiple times would be tedious. When we get to Chapter 20, we’ll talk about ways to read from multiple files at the same time. Right now, we can only redirect input from a single file, and so we’re forced to put the list of words at the top of the file, followed by the text we want to search through.
import java.util.*;
public class WordCount {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
int n = in.nextInt(); (1)
String[] words = new String[n]; (2)
int[] counts = new int[n]; (3)
String temp;
for(int i = 0; i < words.length; i++) (4)
words[i] = in.next().toLowerCase();
1 | As in the last example, this program begins by reading in the length of the list of words. |
2 | Then, it instantiates the String array words to
hold these words. |
3 | It also instantiates an array counts of type int
to keep track of the number of times each word is found. By default,
each element in counts is initialized to 0 . |
4 | The first for loop in
the program reads in each word and stores it into the array words . |
while(in.hasNext()) { (1)
temp = in.next().toLowerCase();
for(int i = 0; i < n; i++) (2)
if(temp.equals(words[i])) {
counts[i]++; (3)
break;
}
}
System.out.println("The word counts are: ");
1 | The while loop reads in each word from the text following the list and
stores it in a variable called temp . |
2 | Then, it loops through words
and tests to see if temp matches any of the elements in the list. |
3 | If it does, it increases the value of the element of counts that has the
same index and breaks out of the inner for loop. |
for(int i = 0; i < words.length; i++) (1)
System.out.println(words[i] + " " + counts[i]);
}
}
1 | After all the words in the text have been processed, the final for
loop prints out each word from the list, along with its counts. |
This program uses two different arrays for bookkeeping: words
contains
the words we are searching for and counts
contains the number of times
each word has been found. These two arrays are separate data structures.
The only link between them is the code we wrote to maintain the
correspondence between their elements.
To give a clear picture of how this program should behave, here’s a sample input file with two paragraphs from the beginning of The Count of Monte Cristo by Alexandre Dumas.
7 and at bridge for pilot vessel walnut On the 24th of February, 1815, the look-out at Notre-Dame de la Garde signaled the three-master, the Pharaon from Smyrna, Trieste, and Naples. As usual, a pilot put off immediately, and rounding the Chateau d'If, got on board the vessel between Cape Morgion and Rion island. Immediately, and according to custom, the ramparts of Fort Saint-Jean were covered with spectators; it is always an event at Marseilles for a ship to come into port, especially when this ship, like the Pharaon, has been built, rigged, and laden at the old Phocee docks, and belongs to an owner of the city.
And here’s the output one should get from running WordCount
with
input redirected from the file given above.
The word counts are: and 6 at 3 bridge 0 for 1 pilot 1 vessel 1 walnut 0
For this example, the program works fine. However, our program would
have given incorrect output if ship
, spectators
, or several other
words in the text had been on the word list. You see, the next()
method in the Scanner
class reads in String
values separated by
white space. The word ship
appears twice in the text, but the second
instance is followed by a comma. Since the words are separated by white
space only, the String
"ship,"
does not match the String
"ship"
.
Dealing with punctuation is not difficult, but it would increase the
length of the code, so we leave it as an exercise.
Imagine you’re a teacher who has just given an exam. You want to produce statistics for the class so that the students have some idea how well they have done. You want to write a Java program to help you produce the statistics, to save time now and in the future.
The statistics you want to collect are listed in the following table.
Statistic | Description |
---|---|
Maximum |
Maximum score |
Minimum |
Minimum score |
Mean |
Average of all the scores |
Standard Deviation |
Sample standard deviation of the scores |
Median |
Middle value of the scores when ordered |
Example 6.1 covered how to find the maximum and minimum scores in a list. The mean is simply the sum of all the scores divided by the total number of scores. Standard deviation is a little bit trickier. It gives a measurement of how spread out the data is. Let n be the number of data points, label each data point xi, where 1 ≤ i ≤ n, and let μ be the mean of all the data points. Then, the formula for the sample standard deviation is as follows.
Finally, if you sort a list of numbers in order, the median is the middle value in the list, or the average of the two middle values, if the list has an even length.
These kinds of statistical operations are useful and are packaged into many important business applications such as Microsoft Excel. This version will have a simple interface whose input comes from the command line. First, the total number of scores will be entered. Then, each score should be entered one by one. After all the data has been entered, the program should compute and output the five values.
Below we give the solution to this statistics problem. Several different tasks are combined here, but each of them should be reasonably easy to solve after the previous examples.
import java.util.*;
public class Statistics {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
int n = in.nextInt(); (1)
int[] scores = new int[n]; (2)
for(int i = 0; i < n; i++) (3)
scores[i] = in.nextInt();
1 | In our solution, the main() method begins by reading in the total
number of scores. |
2 | It declares an int array of that length named
scores . |
3 | Then, we read in each of the scores and store them into
scores . |
int max = scores[0]; (1)
int min = scores[0];
int sum = scores[0];
for(int i = 1; i < n; i++) { (2)
if(scores[i] > max)
max = scores[i];
if(scores[i] < min)
min = scores[i];
sum += scores[i];
}
1 | Here we declare variables max , min , and sum to hold, respectively,
the maximum, minimum, and sum of the elements in the array. Then, we set
all three variables to the value of the first element of the array.
These initializations make the following code work. |
2 | In a single for
loop, we find the maximum, minimum, and sum of all the values in the
array. |
We could have done so with three separate loops, but this
approach is more efficient. Setting max
and min
to scores[0]
follows the pattern we’ve used before, but setting sum
to the same
value is also necessary in this case. Because the loop iterates from 1
up to scores.length - 1
, we must include the value at index 0
in
sum
. Alternatively, we could have set sum
to 0
and started the
for
loop at i = 0
.
double mean = ((double)sum)/n;
System.out.println("Maximum:\t\t" + max);
System.out.println("Minimum:\t\t" + min);
System.out.println("Mean:\t\t\t" + mean);
In this short snippet of code, we compute the mean, being careful to
cast it into type double
before the division, and then print out the three
statistics we’ve computed.
double variance = 0;
for(int i = 0; i < n; i++)
variance += (scores[i] - mean)*(scores[i] - mean); (1)
variance /= (n - 1); (2)
System.out.println("Standard Deviation:\t" + Math.sqrt(variance)); (3)
1 | At this point, we use the mean we’ve already computed to find the
sample standard deviation. Following the formula for sample standard
deviation, we subtract the mean from each score, square the result, and
add it to a running total. Although the formula for sample standard
deviation uses the bounds 1 to n, we
translate them to 0 to n - 1 because of zero-based array numbering. |
2 | Dividing the total by n - 1 gives the sample variance. |
3 | Then, the square root of the variance is the standard deviation. |
int temp;
for(int i = 0; i < n - 1; i++) { (1)
min = i;
for(int j = i + 1; j < n; j++)
if(scores[j] < scores[min])
min = j;
temp = scores[min];
scores[min] = scores[i];
scores[i] = temp;
}
1 | To find the median, we use our selection sort code. |
Note that we have
reused the variable min
to hold the smallest value found so far,
instead of declaring a new variable such as smallest
. Some programmers
might object to doing so, since we run the risk of interpreting the
variable as the minimum value in the entire array, as it was before.
Either approach is fine. If you worry about confusing people reading
your code, add a comment.
double median;
if(n % 2 == 0) (1)
median = (scores[n/2] + scores[n/2 + 1])/2.0; (2)
else
median = scores[n/2]; (3)
System.out.println("Median:\t\t\t" + median);
}
}
1 | After the array has been sorted, we need to do a quick check to see if its length is odd or even. |
2 | If its length is even, we need to find the average of the two middle elements. |
3 | If its length is odd, we can report the value of the single middle element. |
Note that some of the statistics we found, such as the maximum, minimum, or mean, could be computed with a single pass over the data without the need for an array for storage. However, the last two tasks need to store all the values to work. The simplest way to find the sample standard deviation of a list of values requires its mean, requiring one pass to find the mean and a second to sum the squares of the difference of the mean and each value. Likewise, it’s impossible to find the median of a list of values without storing the list.
6.6. Concepts: Multidimensional lists
In the previous half of the chapter, we focused on lists of data and how to store them in Java in arrays. The arrays we have discussed already are one-dimensional arrays. That is, each element in the array has a single index that refers to it. Given a specific index, an element will have that index, come before it, or come after it. These kinds of arrays can be used to solve a huge number of problems involving lists or collections of data.
Sometimes, the data needs to be represented with more structure. One way to provide this structure is with a two-dimensional array. You can think of a two-dimensional array as a table of data. Instead of using a single index, a two-dimensional array has two indexes. Usually, we think about these dimensions as rows and columns. Below is a table of information that gives the distances in miles between the five largest cities in the United States.
New York | Los Angeles | Chicago | Houston | Phoenix | |
---|---|---|---|---|---|
New York |
0 |
2,791 |
791 |
1,632 |
2,457 |
Los Angeles |
2,791 |
0 |
2,015 |
1,546 |
373 |
Chicago |
791 |
2,015 |
0 |
1,801 |
1,181 |
Houston |
1,632 |
1,546 |
1,801 |
0 |
1,176 |
Phoenix |
2,457 |
373 |
1,181 |
1,176 |
0 |
The position of each number in the table is a fundamental part of its
usefulness. We know that the distance from Chicago to Houston is 1,801
miles because that number is in the Chicago row and the Houston column.
A two-dimensional array shares almost all of its properties with a
one-dimensional array. It’s still a homogeneous, static data structure
with random access. If the example above were made into a Java array,
the numbers themselves would be the elements of the array. The names of
the cities would need to be stored separately, perhaps in an array of
type String
.
There’s no reason to confine the idea of a two-dimensional list to a
table of values. Many games are played on a two-dimensional grid. One of
the most famous such games is chess. As with so many other things in
computer science, we must come up with an abstraction that mirrors
reality and allows us to store the information inside of a computer. For
chess, we’ll need an 8 × 8 two-dimensional array. We can represent
each piece in the board with a char
, using the encoding given below.
Piece | Encoding |
---|---|
Pawn |
'P' |
Knight |
'N' |
Bishop |
'B' |
Rook |
'R' |
Queen |
'Q' |
King |
'K' |
Using uppercase characters for black pieces and lowercase characters for white pieces, we could represent a game of chess after a classic king’s pawn open by white as shown.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | |
---|---|---|---|---|---|---|---|---|
0 |
'R' |
'N' |
'B' |
'Q' |
'K' |
'B' |
'N' |
'R' |
1 |
'P' |
'P' |
'P' |
'P' |
'P' |
'P' |
'P' |
'P' |
2 |
||||||||
3 |
||||||||
4 |
||||||||
5 |
'p' |
|||||||
6 |
'p' |
'p' |
'p' |
'p' |
'p' |
'p' |
'p' |
|
7 |
'r' |
'n' |
'b' |
'q' |
'k' |
'b' |
'n' |
'r' |
Observe that, just as with one-dimensional arrays, the indexes for rows and columns in two-dimensional arrays also use zero-based counting.
After the step from one-dimensional arrays to two-dimensional arrays, it’s natural to wonder if there can be arrays of even higher dimension. We can visualize a two-dimensional array as a table, but a three-dimensional array is harder to visualize. Nevertheless, there are uses for three-dimensional arrays.
Consider a professor who’s taking a survey of students in her course. She wants to know how many students there are in each of three categories: gender, class level, and major. If she treats each of these as a dimension and assigns an index to each possible value, she could store the results in a three-dimensional array. For gender she could pick male = 0 and female = 1. For class level she could pick freshman = 0, sophomore = 1, junior = 2, senior = 3, and other = 4. Assuming it’s a computer science class, for major she could pick computer science = 0, math = 1, other science = 2, engineering = 3, and humanities = 4. Using this system she could compactly store the number of students in any combination of categories she was interested in. For example, the total number of female sophomore engineering students would be stored in the cell with gender index 1, class level index 1, and major index 3.
Three dimensions is usually the practical limit when programming in Java. If you find an especially good reason to use four or higher dimensions, feel free to do so, but it should happen infrequently. The Java language has no set limit on array dimensions, but most virtual machines have the absurdly high limitation of 255 different dimensions.
6.7. Syntax: Advanced arrays in Java
Now that we’ve discussed the value of storing data in multidimensional lists, we’ll describe the Java language features that allow you to do so. The changes needed to go from one-dimensional arrays to two-dimensional and higher arrays are simple. First, we’ll describe how to declare, instantiate, and index into two-dimensional arrays. Then, we’ll discuss some of the ways in which arrays (both one-dimensional and higher) are different from primitive data types. Next, we’ll explain how it’s possible to make two-dimensional arrays in Java where the rows are not all the same length. Finally, we’ll cover some of the most common mistakes programmers make with arrays.
6.7.1. Multidimensional arrays
When declaring a two-dimensional array, the main difference from a
one-dimensional array is an extra pair of brackets. If we wish to
declare a two-dimensional array of type int
in which we could store
values like the table of distances above, we would do so as follows.
int[][] distances;
As with one-dimensional arrays, it’s legal to put the brackets on the other side of the variable identifier or, even more bizarrely, have a pair on each side.
Once the array is declared, it must still be instantiated using the
new
keyword before it can be used. This time we will use two pairs of
brackets, with the number in the first pair specifying the number of
rows and the number in the second pair specifying the number of columns.
distances = new int[5][5];
After the instantiation, we will have 5 rows and 5 columns, giving a
total of 25 locations where int
values can be stored. Indexing these
locations is done by specifying row and column values in the brackets.
So, to fill up the table with the distances between cities given above
we can use the following tedious code.
// New York
distances[0][1] = 2791;
distances[0][2] = 791;
distances[0][3] = 1632;
distances[0][4] = 2457;
// Los Angeles
distances[1][0] = 2791;
distances[1][2] = 2015;
distances[1][3] = 1546;
distances[1][4] = 373;
// Chicago
distances[2][0] = 791;
distances[2][1] = 2015;
distances[2][3] = 1801;
distances[1][4] = 1181;
// Houston
distances[3][0] = 1632;
distances[3][1] = 1546;
distances[3][2] = 1801;
distances[3][4] = 1176;
// Phoenix
distances[4][0] = 2457;
distances[4][1] = 373;
distances[4][2] = 1181;
distances[4][3] = 1176;
You’ll notice that we did not specify values for distances[0][0]
,
distances[1][1]
, distances[2][2]
, distances[3][3]
, or
distances[4][4]
, since each of these already has the default value of
0
.
Much more often, multidimensional array manipulation will use nested
for
loops. For example, we could create an array with 3 rows and 4
columns, and then assign values to those locations such that they were
numbered increasing across each row.
int[][] values = new int[3][4];
int number = 1;
for(int i = 0; i < values.length; i++)
for(int j = 0; j < values[i].length; j++) {
values[i][j] = number;
number++;
}
This code would result in an array filled up like the following table.
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
The bounds for the outer for
loop in this example uses
values.length
, giving the total number of rows. Then, the inner for
loops uses values[i].length
, which is the length (number of columns)
of the current row. In this case, all the rows of the array have the same
number of columns, but this is not always true, as we’ll discuss
later.
6.7.2. Reference types
All array variables are reference type variables, not simple values
like most of the types we have discussed so far. A reference variable is
a name for an object. You might recall that we described the difference
between reference types and primitive types in
Section 3.2, but the only reference type we’ve
considered in detail is String
.
More than one reference variable can point at the same object. When one
object has more than one name, this is called aliasing. The String
type is immutable, meaning that an object of type String
cannot change
its contents. Arrays, however, are mutable, which means that aliasing
can cause unexpected results. Here is a simple example with
one-dimensional array aliasing.
int[] array1 = new int[10];
for(int i = 0; i < array1.length; i++)
array1[i] = i;
int[] array2 = array1;
array2[3] = 17;
System.out.println(array1[3]);
Surprisingly, the value printed out will be 17
. The variables array1
and array2
are references to the same fundamental array. Unlike
primitive values, the complete contents of array1
are not copied to
array2
. Only one array exists because only one array has been created
by the new
keyword. When index 3 of array2
is updated, index 3
of array1
changes as well, because the two variables are simply two
names for the same array.
Sometimes this reference feature of Java allows us to write code that is confusing or has unexpected consequences. However, the benefit is that we can assign one array to another without incurring the expense of copying the entire array. If you create an array with 1,000,000 elements, copying that array several times could get expensive in terms of program running time.
The best rule of thumb for understanding reference types is that there
is only one actual object for every call to new
. The primary exception
to this rule is that uses of new
can be hidden from the user when
they’re in method calls.
String greeting = new String("Hello");
String pronoun = greeting.substring(0,2);
At the end of this code, the reference pronoun
will point to an object
containing the String "He"
. The substring()
method invokes new
internally, generating a new String
object completely separate from
the String
referenced by greeting
. This code may look unusual
because we’re explicitly using new
to make a String
object
containing "Hello"
. The String
class is different from every other
class because it can be instantiated without using the new
keyword.
The line String greeting = "Hello";
implicitly calls new
to create an object
containing the String "Hello"
and functions nearly the same as the
similar line above.
6.7.3. Ragged arrays
We’re ashamed to say that we’ve lied to you. In Java, there’s no such thing as a true multidimensional array! Instead, the examples of two-dimensional and three-dimensional arrays we’ve given above are actually arrays of arrays (of arrays). Thinking about multidimensional arrays in this way can give the programmer more flexibility.
If we return to the definition of the two-dimensional array with 3 rows and 4 columns, we can instantiate each row separately instead of as a block.
int[][] values = new int[3][];
int number = 1;
for(int i = 0; i < values.length; i++) {
values[i] = new int[4];
for(int j = 0; j < values[i].length; j++) {
values[i][j] = number;
number++;
}
}
This code is functionally equivalent to the earlier code that instantiated all 12 locations at once. The same could be done with a three-dimensional array or higher. We can specify the length of each row independently, and, more bizarrely, we can give each row a different length. A multidimensional array whose rows have different lengths is called a ragged array.
A ragged array is usually unnecessary. The main reason to use a ragged array is to save space, when you have tabular data in which the lengths of each row vary a great deal. If the lengths of the rows vary only a little, it’s probably not worth the extra hassle. However, if some rows have 10 elements and others have 1,000,000, the space saved can be significant.
We can apply the idea of ragged arrays to the table of distances between cities. If you examine this table, you’ll notice that about half the data in it is repeated, because the distance from Chicago to Los Angeles is the same as the distance from Los Angeles to Chicago, and so on. We can store the data in a triangular shape to keep only the unique distance information.
New York | Los Angeles | Chicago | Houston | Phoenix | |
---|---|---|---|---|---|
New York |
0 |
||||
Los Angeles |
2,791 |
0 |
|||
Chicago |
791 |
2,015 |
0 |
||
Houston |
1,632 |
1,546 |
1,801 |
0 |
|
Phoenix |
2,457 |
373 |
1,181 |
1,176 |
0 |
We could create this table in code by doing the following.
distances = new int[5][];
// New York
distances[0] = new int[1];
// Los Angeles
distances[1] = new int[2];
distances[1][0] = 2791;
// Chicago
distances[2] = new int[3];
distances[2][0] = 791;
distances[2][1] = 2015;
// Houston
distances[3] = new int[4];
distances[3][0] = 1632;
distances[3][1] = 1546;
distances[3][2] = 1801;
// Phoenix
distances[4] = new int[5];
distances[4][0] = 2457;
distances[4][1] = 373;
distances[4][2] = 1181;
distances[4][3] = 1176;
With this table a user cannot simply type in distances[0][4]
and hope
to get the distance from New York to Phoenix. Instead, we have to be
careful to make sure that the index of the first city is never larger
than the index of the second city. If we’re reading in the indexes of
the cities from a user, we can write some code to do this check. Let
city1
and city2
, respectively, contain the indexes of the cities the
user wants to use to find the distances between.
if(city1 > city2) {
int temp = city1;
city1 = city2;
city2 = temp;
}
System.out.println("The distance is: " + distances[city1][city2] +
" miles");
If we wanted to be even cleverer, we could eliminate the zero entries from the table, but then the ragged array would have one fewer row than the original two-dimensional array.
6.7.4. Common pitfalls
Even one-dimensional arrays make many new errors possible. Below we list two of the most common mistakes made with both one-dimensional and multidimensional arrays.
Pitfall: Array out of bounds
The length of an array is determined at run time. Sometimes the number is
specified in the source code, but it’s always possible for an array to
be instantiated based on user input. The Java compiler doesn’t do any
checking to see if you’re in the right range. If your program tries to
access an illegal element, it’ll crash with an
Here’s a classic example. By iterating through the loop one too many
times, the program will try to store There are other less common causes for going outside of array bounds. Imagine that you’re scanning through a file that has been redirected to input, keeping a count of the occurrences of each letter of the alphabet in the file.
This segment of code does a decent job of counting the occurrences of
each letter. The |
Pitfall: Uninitialized reference arrays
Another problem only comes up with arrays of reference types. Whenever
the elements of an array are primitive data types, memory for that type
is allocated. Whenever the elements of the array are reference types,
only references to objects, initialized to
The following code, however, will cause a
Arrays of reference types must initialize each element before using it.
The
In this case, there would be no error, although
A similar error can happen with multidimensional arrays.
Because an array is itself a reference type, the |
6.8. Examples: Two-dimensional arrays
Below we give some examples where two-dimensional arrays can be helpful. We start with a simple calendar example, move on to matrix and vector multiplication useful in math, and finish with a game.
We’re going to create a calendar that can be printed to the terminal to show which day of the week each day lands on. Our program will prompt the user for the day of the week the month starts on and for the total number of days in the month. Our program will print out labels for the seven days of the week, followed by numbering starting at the appropriate place, and wrapping such that each numbered day of the month falls under the appropriate day of the week.
import java.util.*;
public class Calendar {
public static void main(String[] args) {
String[][] squares = new String[7][7]; (1)
squares[0][0] = "Sun"; (2)
squares[0][1] = "Mon";
squares[0][2] = "Tue";
squares[0][3] = "Wed";
squares[0][4] = "Thu";
squares[0][5] = "Fri";
squares[0][6] = "Sat";
for(int i = 1; i < squares.length; i++) (3)
for(int j = 0; j < squares[i].length; j++)
squares[i][j] = " ";
Scanner in = new Scanner(System.in);
System.out.print("Which day does your month start on? (0 - 6) ");
int start = in.nextInt(); //read starting day (4)
System.out.print("How many days does your month have? (28 - 31) ");
int days = in.nextInt(); //read days in month (5)
int day = 1;
int row = 1;
int column = start;
while(day <= days) { //fill calendar (6)
squares[row][column] = "" + day;
day++;
column++;
if(column >= squares[row].length) {
column = 0;
row++;
}
}
for(int i = 0; i < squares.length; i++) { (7)
for(int j = 0; j < squares[i].length; j++)
System.out.print(squares[i][j] + "\t");
System.out.println();
}
}
}
1 | First, our code creates a 7 × 7 array of type
String called squares . The array needs 7 rows so that it can start
with a row to label the days and then output up to 6 rows to cover the
weeks. (Months with 31 days span parts of 6 different weeks if they
start on a Friday or a Saturday.) The number of columns corresponds to
the seven days of the week. |
2 | Next, we initialize the first row of the array to abbreviations for each day of the week. |
3 | Then, we initialize the rest of the array to be a single space. |
4 | Our program then reads from the user the day the month starts on. |
5 | The program also reads from the user for the total number of days in the month. |
6 | The main work of the program is done by the while loop, which fills
each square with a steadily increasing day number for each column,
moving on to the next row when a row is filled. |
7 | Finally, the two nested for loops at the end print out the contents of squares , putting a
tab ('\t' ) between each column and starting a new line for each row. |
Arrays give a natural way to represent vectors and matrices. In 3D graphics and video game design, we can represent a point in 3D space as a vector with three elements: x, y, and z. If we want to rotate the three-dimensional point represented by this vector, we can multiply it by a matrix. For example, to rotate a point around the x-axis by θ degrees, we could use the following matrix.
Given an m × n matrix A, let Aij be the element in the ith row, jth column. Given a vector v of length n, let vi be the ith element in the vector. To multiply A by v, we use the following equation to find the ith element of the resulting vector v′.
By transforming this equation to Java code, we can write a program that can read in a three-dimensional point and rotate it around the x-axis by the amount specified by the user.
import java.util.*;
public class MatrixRotate {
public static void main(String[] args) {
double[] point = new double[3]; (1)
System.out.println("What point do you want to rotate?");
Scanner in = new Scanner(System.in);
System.out.print("x: "); (2)
point[0] = in.nextDouble();
System.out.print("y: ");
point[1] = in.nextDouble();
System.out.print("z: ");
point[2] = in.nextDouble();
System.out.print("What angle around the x-axis? ");
double theta = Math.toRadians(in.nextDouble()); (3)
double[][] rotation = new double[3][3];
rotation[0][0] = 1; (4)
rotation[1][1] = Math.cos(theta);
rotation[1][2] = -Math.sin(theta);
rotation[2][1] = Math.sin(theta);
rotation[2][2] = Math.cos(theta);
double[] rotatedPoint = new double[3];
for(int i = 0; i < rotatedPoint.length; i++) (5)
for(int j = 0; j < point.length; j++)
rotatedPoint[i] += rotation[i][j]*point[j];
System.out.println("Rotated point: [" + rotatedPoint[0] + (6)
"," + rotatedPoint[1] + "," + rotatedPoint[2] + "]");
}
}
1 | This program begins by declaring a array of type double to hold the
vector. |
2 | Afterward, it reads three values from the user into it. |
3 | Then, the program reads in the angle of rotation in degrees and converts it to radians. |
4 | Next, we use the Math class to calculate the values in the
rotation matrix. Note that we do not change the values that need to be
zero. |
5 | We use a for loop to perform the matrix-vector
multiplication. Again, the summing done by
our calculations uses the fact that all elements of rotatedPoint are
initialized to 0.0 . |
6 | Finally, we print out the answer. |
Almost every child knows the game of tic-tac-toe, also known as noughts and crosses. Its playing area is a 3 × 3 grid. Players take turns placing Xs and Os, trying to get three in a row. Strategically, it’s not the most interesting game since two players who make no mistakes will always tie. Still, we present a program that allows two human players to play the game because the manipulations of a two-dimensional array in the program are similar to those for more complicated games such as Connect Four, checkers, chess, or Go. Our program will catch any attempt to play on a location that has already been played and will determine the winner, if there is one.
Games often give rise to complex programs, since rules that are intuitively obvious to humans may be difficult to state explicitly in Java. Our program begins by setting up quite a few variables and objects.
import java.util.*;
public class TicTacToe {
public static void main(String[] args) {
Scanner in = new Scanner(System.in); (1)
char[][] board = new char[3][3]; (2)
for(int i = 0; i < board.length; i++) (3)
for(int j = 0; j < board[0].length; j++)
board[i][j] = ' ';
boolean turn = true; (4)
boolean gameOver = false;
int row, column, moves = 0; (5)
char shape;
1 | First, we create a Scanner to read in data. |
2 | Then, we declare
and instantiate our 3 × 3 playing board as a
two-dimensional array of type char . |
3 | We want any unplayed space on the
grid to be the char for a space, so we fill the array with ' ' . |
4 | Next, we declare a boolean value to keep track of whose turn it is and
another to keep track of whether the game is over. |
5 | Finally, we
declare variables to hold the row, the column, the number of moves that
have been made so far and the current shape ('X' or 'O' ). |
The core of the game is a while
loop that runs until gameOver
becomes true
. The first line of the body of this loop is an obscure
Java shortcut often referred to as the ternary operator. This line is
really shorthand for the following.
if(turn)
shape = 'X';
else
shape = '0';
The ternary operator works with a condition followed by a question mark
and then two values separated by a colon. If the condition is true
,
the first value is assigned, otherwise the second value is assigned.
It’s perfect for situations like this where one value is needed when
turn
is true
and another is needed when turn
is false
. The
ternary operator is a useful trick, but it shouldn’t be overused.
while(!gameOver) {
shape = turn ? 'X' : 'O';
System.out.print(shape + "'s turn. Enter row (0-2): "); (1)
row = in.nextInt();
System.out.print("Enter column (0-2): ");
column = in.nextInt();
if(board[row][column] != ' ') (2)
System.out.println("Illegal move");
else {
board[row][column] = shape; (3)
moves++;
turn = !turn;
// Print board (4)
System.out.println(board[0][0] + "|"
+ board[0][1] + "|" + board[0][2]);
System.out.println("-----");
System.out.println(board[1][0] + "|"
+ board[1][1] + "|" + board[1][2]);
System.out.println("-----");
System.out.println(board[2][0] + "|"
+ board[2][1] + "|" + board[2][2] + "\n");
1 | After assigning the appropriate value to shape , our code reads in the
row and column values for the current player’s next move. |
2 | If the row and column selected correspond to a spot that’s already been taken, the program gives an error message. |
3 | Otherwise, the program sets
board[row][column] to the appropriate symbol, increments moves , and
changes the value of turn . |
4 | Then, it prints out the board. |
Our program doesn’t do any bounds checking on row
and column
.
If a user tries to place a move at row 5 column 3, our program will try
to do so and crash. Additional clauses in the if
statement could
be used to add bounds checking.
Perhaps the trickiest part of our tic-tac-toe program is checking for a win.
// Check rows (1)
for(int i = 0; i < board.length; i++)
if(board[i][0] == shape && board[i][1] == shape
&& board[i][2] == shape)
gameOver = true;
// Check column (2)
for(int i = 0; i < board[0].length; i++)
if(board[0][i] == shape && board[1][i] == shape
&& board[2][i] == shape)
gameOver = true;
// Check diagonals (3)
if(board[0][0] == shape && board[1][1] == shape
&& board[2][2] == shape)
gameOver = true;
if(board[0][2] == shape && board[1][1] == shape
&& board[2][0] == shape)
gameOver = true;
if(gameOver) (4)
System.out.println(shape + " wins!");
else if(moves == 9){ (5)
gameOver = true;
System.out.println("Tie game!");
}
}
}
}
}
1 | First we check each row to see if it contains three in a row. |
2 | Then, we check each column. |
3 | Finally, we check the two diagonals. |
4 | If any of those checks ended the game, we announce a winner. |
5 | Otherwise, if the number of moves has reached 9 with no winner, it must be a tie game. |
In a larger game (such as Connect Four), we would want to find better ways to automate checking rows, columns, and diagonals. For example, we wouldn’t want to check the entirety of a larger board each move. Instead, we could focus only on the rows, columns, and diagonals affected by the last move.
6.9. Advanced: Special array tools in Java
Arrays are fundamental data structures in many programming languages.
There are often special syntactical tools or libraries designed to make
them easier to use. In this section, we explore two advanced tools, the
enhanced for
loop and the Arrays
utility class.
6.9.1. Enhanced for
loops
In Chapter 5 we described three loops: while
loops, for
loops, and do
-while
loops. Although these are the only
three loops in Java, there’s a special form of the for
loop designed
for use with arrays (and some other data structures). This construct is
often called the enhanced for
loop.
An enhanced for
loop does not have the three-part header of a regular for
loop. Instead, it’s designed to iterate over the contents of an array or other list.
Inside its parentheses is a declaration of a variable with the same type
as the elements of the array, then a colon (:
), then the name of the
array. Consider the following example of an enhanced for
loop used to sum the
values of an array of int
values called array
. As with all loops in
Java, braces are optional if there’s only one executable statement in
the loop.
int sum = 0;
for(int value : array)
sum += value;
This code functions in exactly the same way as the traditional for
loop we would use to solve the same problem.
int sum = 0;
for(int i = 0; i < array.length; i++)
sum += array[i];
The advantage of the enhanced for
loop is that it’s shorter and clearer.
There’s also no worry about being off by one with your indexes. The
enhanced for
loop iterates over every element in the array, no indexes
needed!
Enhanced for
loops can be nested or used inside of other loops. Consider the
following nested enhanced for
loops that print out all possible chess
pieces, in both black and white colors.
String[] colors = {"Black", "White"};
String[] pieces = {"King", "Queen", "Rook", "Bishop", "Knight", "Pawn"};
for(String color : colors)
for(String piece : pieces)
System.out.println(color + " " + piece);
Enhanced for
loops do have a few drawbacks. For one, they’re designed for iterating
through an entire array. It’s ugly to try to make them stop early, and
it’s impossible to make them go back to previous values. They’re also
only designed for read access, not write access. The variable in the
header of the enhanced for
loop takes on each value in the array in turn,
but assigning values to that variable has no effect on the underlying
array. Consider the following for
loop that assigns 5
to every value
in array
.
for(int i = 0; i < array.length; i++)
array[i] = 5;
This kind of assignment is impossible in an enhanced for
loop. The
“equivalent” enhanced for
loop does nothing. It assigns 5
to the local
variable value
but never changes the contents of array
.
for(int value : array)
value = 5;
While enhanced for
loops are great for arrays, they can also be used for
any data structure that implements the Iterable
interface. We
discuss interfaces in Chapter 10 and dynamic data
structures in Chapter 18 and
Chapter 19.
6.9.2. The Arrays
class
The designers of the Java API knew that arrays were important and added
a special Arrays
class to manipulate them.
This class has a number of static methods that can be used to search for
values in arrays, make copies of arrays, copy selected ranges of arrays,
test arrays for equality, fill arrays with specific values, sort arrays,
convert an entire array into a String
representation, and more. The
signatures of the methods below are given for double
arrays, but most
methods are overloaded to work with all primitive types and reference
types.
Method | Purpose |
---|---|
binarySearch(double[] array, double value) |
Returns index of |
copyOf(double[] array, int length) |
Returns a copy of |
copyOfRange(double[] array, int from, int to) |
Returns a copy of
|
equals(double[] array1, double[] array2) |
Returns |
fill(double[] array, double value) |
Fills |
sort(double[] array) |
Sorts |
toString(double[] a) |
Returns a |
Consult the API for more information. Even though tasks like fill()
are simple, it’s worth using the method from Arrays
instead of
writing your own. The methods in the Java API have often been tuned for
speed and use special instructions that are not accessible to regular Java
programmers.
6.10. Solution: Game of Life
Here we present our solution to the Conway’s Game of Life simulation. Our program is designed to run the simulation with 24 rows and 80 columns, although it would be easy to change those dimensions.
public class Life {
public static void main(String[] args) {
final int ROWS = 24; (1)
final int COLUMNS = 80;
final int GENERATIONS = 500;
boolean[][] board = new boolean[ROWS][COLUMNS]; (2)
boolean[][] temp = new boolean[ROWS][COLUMNS];
boolean[][] swap;
for(int row = 0; row < ROWS; row++) (3)
for(int column = 0; column < COLUMNS; column++)
board[row][column] = (Math.random() < 0.1);
1 | The main() method of our program starts by defining ROWS , COLUMNS ,
and GENERATIONS as named constants using the final keyword. |
2 | Next, we
create two arrays with ROWS rows and COLUMNS columns. The board
array will hold the current generation. The temp array will be used to
fill in the next generation. Then, temp will be copied into board ,
and the process will repeat. The swap variable is just a reference we’ll
use to swap board and temp . |
3 | We randomly fill the board, making 10% of the cells living. Again, you may wish to play with this number to see how the patterns in the simulation are affected. |
for(int generation = 0; generation < GENERATIONS; generation++) { (1)
for(int row = 0; row < ROWS; row++) (2)
for(int column = 0; column < COLUMNS; column++) {
int total = 0;
for(int i = Math.max(row - 1, 0); (3)
i < Math.min(row + 2, ROWS); i++)
for(int j = Math.max(column - 1, 0);
j < Math.min(column + 2, COLUMNS); j++)
if((i != row || j != column) && board[i][j])
total++;
if(board[row][column])
temp[row][column] = (total == 2 || total == 3); (4)
else
temp[row][column] = (total == 3); (5)
}
swap = board; (6)
board = temp;
temp = swap;
1 | The for loop at the beginning of this segment of code runs once for
each generation we simulate. |
2 | The two nested for loops examine each
cell in board . |
3 | The two for loops nested inside of those loops do the
calculations to determine if a cell will be alive or dead in the next
generation. These inner loops start one row before the current row and
finish one row after the current row. They do the same for columns. The
Math.max() and Math.min() methods are used to keep the loops from
going out of bounds of the array. When backing up a row or a column, the
Math.max() methods make sure that we do not generate an index smaller
than 0. When going forward a row or a column, the Math.min() methods
make sure that we do not generate an index greater than ROWS - 1 or
COLUMNS - 1 . |
4 | After these two innermost for loops have counted the total of living
cells around the cell in question, we decide the fate of the cell for
the next generation. If the cell’s alive and has exactly 2 or 3 living
neighbors, it’ll continue to live. |
5 | If a cell’s dead, it’ll come to life only if it has exactly 3 living neighbors. |
6 | After we’ve
stored the state of each cell in the next generation into temp , we
swap board and temp , using the swap variable. We could have thrown
out the old array stored in board instead of swapping it with temp ,
but then we’d have to create a new array for temp each time, which
is less efficient. |
for(int i = 0; i < 100; i++) (1)
System.out.println();
for(int row = 0; row < ROWS; row++) { (2)
for(int column = 0; column < COLUMNS; column++)
if(board[row][column])
System.out.print("*");
else
System.out.print(" ");
System.out.println();
}
try { Thread.sleep(500); } (3)
catch(InterruptedException e) {}
}
}
}
1 | The first for loop in this segment prints 100 blank lines to clear the
screen, as we explained earlier. |
2 | The two nested for loops print out
the state of the current generation, with a * for each living cell and
a blank space for each dead one. |
3 | After the output, the code sleeps for
500 milliseconds to give the effect of an animation. We’ll discuss
exceptions in general in Chapter 12 and give more
information about the Thread.sleep() method in Chapter 13. |
Although 500 milliseconds (half a second) is a long delay for animation, the scrolling effect of the screen makes the pattern of alive and dead cells hard to perceive on a terminal. Some kind of GUI (such as we discuss in Chapter 15) could provide a more pleasing way to visualize the Game of Life, but drawing arbitrary patterns on a GUI presents its own difficulties.
6.11. Concurrency: Arrays
Arrays are critical to concurrent programming in Java. In
Chapter 13, we’ll explain how to
create independent threads of execution, each of which is tied to a
Thread
object. If you have a dual, quad, or higher core computer, you
might want to use two or four threads to solve a problem, but some
programs can use hundreds. How can you keep track of all those Thread
objects? In many cases, you’ll hold references to them in an array.
Arrays can also hold large lists of data. It’s common for threaded programs to share a single array which each thread reads and writes to. In this way, memory costs are kept low because there’s only one copy of all the data. In the simplest case, each thread works on some portion of the array without interacting with the rest. Even then, how do you assign parts of the array to the different threads?
We’ll assume that each element of the array needs to be processed in
some way. For example, we might want to record whether or not each
long
in an array is prime or not. If you have k threads
and an array of length n where n happens to
be a multiple of k, then it’s easy: Each thread gets
exactly n/k items to work on. For example, the first
thread will work on indexes 0 through n/k - 1, the
second thread will work on indexes n/k through
2n/k - 1, and so on, with the last thread working
on indexes (k
- 1)n/k through n - 1. Not every element in the array
will require the same amount of computation, but we often assume that
they do because it can be difficult to guess which elements will take
more time to process.
What if the number of elements in the array is not a multiple of the number of threads? We still want to assign the work the work as fairly as possible. New programmers are sometimes tempted to use the same arithmetic from the case in which the threads evenly divide the length of the array: Each thread gets ⌊n/k⌋ (using integer division) elements, and we stick the last thread with the leftovers. How bad can that be?
This assignment of work can be very poorly balanced. Consider a case with 10 threads and 28 pieces of data. ⌊28/10⌋ = 2, using integer division. Thus, the first nine threads have 2 units of work to do, but the last thread is stuck with 10! Not only is this unfair; it’s inefficient. The person writing the program probably wants to minimize the total amount of time needed to finish the job. In this case, the time from when the first thread starts to when the last thread finishes is called the task’s makespan. With this division of work, the makespan is 10 units of work.
A simple way to fix this problem is to look at the value n mod k, the leftovers when you divide n by k. We want to spread those out over the first few threads. We know that any remainder will be smaller than k. If the index of the thread (starting at 0, of course) is less than the remainder, we add an extra element to its work. In this way, 28 units of work spread over 10 threads will give 3 elements to the first 8 threads and 2 elements to the rest. Using this strategy, the makespan becomes 3 units of work, a huge improvement over 10. Finding a way to spreading work across multiple threads to improve efficiency is a form of load balancing, a broad term for dividing work across computing resources.
The program below reads the length of an array and the number of threads from the user and then prints out the amount of work for each one. You should be able to adapt the ideas in it to your own multi-threaded programs in Chapter 13.
import java.util.*;
public class AssigningWork {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
System.out.print("How long is your array? ");
int n = in.nextInt();
System.out.print("How many threads do you have? ");
int k = in.nextInt();
int quotient = n / k;
int remainder = n % k;
int next = 0;
for(int i = 0; i < k; i++) {
int work = quotient;
if(i < remainder)
work++;
System.out.println("Thread " + i + " does " + work
+ " units of work, starting at index " + next
+ " and ending at index " + (next + work - 1));
next += work;
}
}
}
6.12. Exercises
Conceptual Problems
-
Why can’t an array be used to hold an arbitrarily long list of numbers entered by a user? What are strategies that can be used to overcome this problem?
-
In future chapters, we’ll introduce a data structure called a linked list. A linked list is a homogeneous, dynamic data structure with sequential access (unlike an array, which has random access). You can instantly jump to any place in an array, but you have to step through each element of a linked list to get to the one you want, even if you know its position in the list exactly. On the other hand, inserting values into the beginning of a linked list can be done in one step, while an array would need to be resized and have its contents copied over. List some tasks for which an array would be better than a linked list and vice versa.
-
Given the following code:
double[] array1 = new double[50]; double[] array2 = new double[50]; for(int i = 0; i < array1.length; i++) { array1[i] = i + 1; array2[i] = array2.length - i; } array2 = array1; for(int i = 1; i < array1.length; i++) array1[i] = array1[i - 1] + array1[i];
What is the value in
array2[array2.length - 1]
after this code is executed? -
What error will be caused by the following code, and why?
String[] array = new String[100]; System.out.println(array[99].charAt(0) + " is the first letter of the last String.");
-
An array of length n in Java typically takes n times the number of bytes for each element plus an additional 16 bytes of overhead. Since an
int
uses 4 bytes of storage, an array of 100int
elements would take 416 bytes. Consider the following three-dimensional array declaration and allocation.int[][][] data = new int[10][5][20];
How many bytes are allocated for this array? Remember that the 16 byte overhead will occur repeatedly, since Java creates a three-dimensional array as an array of arrays of arrays.
-
Our original table of city distances allocates 5 · 5 = 25
int
elements to store all the distances between the five cities, including repeats. How manyint
elements are allocated for the triangular, ragged array version of this city distance table? If we used the normal table style, n cities would require n2int
elements. How many elements would the triangular, ragged array version allocate for n cities? -
Consider the naive method of dividing an array of length n among k threads that was discussed in Section 6.11: Each thread gets ⌊n/k⌋ (rounded down because of integer division) elements, and the last thread gets any extras. What mathematical expression describes how many extra elements are allocated to the last thread? Can you come up with an example in which the last element gets all the elements? What should have happened in this case using the other, more fair scheme for assigning the data to threads?
Programming Practice
-
In Example 6.3, our code would not count
ship,
as an occurrence ofship
because of the comma.Rewrite the code from Example 6.3 to remove punctuation from the beginning and end of a word. Use a loop that runs as long as the character at the beginning of a word is not a letter, replacing the word with a substring of itself that does not include the first character. Use a second loop to remove non-letters from the end of a word. Be careful to stop if the length of the
String
becomes 0, as with text that’s entirely composed of non-letters. -
In Example 6.3, we wrote a program that counts the occurrences of each word from a list within a text. If the list of words to search within is long, it can take quite some time to search through the entire list. If the list of words were sorted, we could do a trick that would allow us to search much faster. We could play a “high-low” game, searching through the list by checking the middle word in the array. If that word’s too late in the alphabet, repeat the search on the first half of the list. If it’s too early in the alphabet, repeat the search on the second half of the list. By repeatedly dividing the list in half, until you either find the word you’re looking for or narrow your search down to a single incorrect word, you can search much faster. This kind of searching is called binary search and uses around log n comparisons to find an element in a list of n items. In contrast, looking through the list one element at a time takes about n comparisons.
Rewrite the code from Example 6.3 to use binary search, after applying selection sort from Example 6.2. Although selection sort will take some extra time, you should more than make up the difference with such a fast search. To implement binary search, keep variables for the start, middle, and end of the list. Keep adjusting the three variables until the middle index has the word you are looking for or the start and end variables reach each other. Remember to use the
compareTo()
method from theString
class to compare words. -
In Example 6.4, we gave a program that finds the maximum, minimum, mean, standard deviation, and median of a list of values. Another statistic that is sometimes important is the mode, or most commonly occurring element. For example, in the list {1, 2, 3, 3, 3, 5, 6, 6, 10}, the mode is 3. Write a program that can determine the mode of a given list of
int
values. A list can have multiple modes if more than one element occurs with maximum frequency. For our purposes, we’ll consider any list with multiple modes to have no modes. You may wish to sort the list before starting the process of counting the frequency of each value. -
We used the example of tic-tac-toe in Example 6.7 because a more complex game would have taken too much space to solve. The game of Connect Four (or the Captain’s Mistress, as it was originally called) pits two players against each other on a 6 × 7 vertical board. One player uses red checkers while the other uses black. The two players take turns dropping their checkers into columns of the board in which the checkers will drop to the lowest empty row, due to gravity. The goal of the game is to be the first to make four in a row of your color.
Implement a version of Connect Four for two human players, similar to our version of tic-tac-toe. Many of the ideas are the same, but the details are more complicated. First, a player will only choose a particular column. Your program must then find which row a checker dropped into that column will fall to. Then, the process of counting four in a row is more difficult than the three in a row of tic-tac-toe. You will need more loops to automate the process fully.
-
Once you’ve mastered the material in Chapter 15, adapt the solution to Conway’s Game of Life from Section 6.10 to display on a graphical user interface. You can use a
GridLayout
to arrange a large number ofJLabel
objects in a grid and update their background colors toColor.BLACK
andColor.WHITE
as needed, using thesetBackground()
method. (To make these colors visible, you will also need to call thesetOpaque()
method once on eachJLabel
with an argument oftrue
.) The Game of Life is much more compelling with a real GUI instead of an improvised command line representation.
Experiments
-
Creating arrays with longer and longer lengths requires more processor time, since all of those elements must be initialized to some default value. Using an OS
time
command, determine the amount of time it takes to create anint
array of length 10, 10,000, and 10,000,000. In all likelihood, the amount of time that instantiation of the array takes is a small part of the program, and you should see very little difference in those three times. However, time is not the only important resource. When you run a JVM, it has a default heap size that limits the amount of space you can use to create new objects, including arrays. When you exceed this size, your program will crash with anOutOfMemoryError
. Experiment with different sizes of arrays until you can estimate the size of your heap within 5MB or so.This estimate will be very rough, since the JVM uses other memory in the background. For a more accurate picture, you can use the
Runtime.getRuntime().maxMemory()
method to determine the maximum JVM memory size and theRuntime.getRuntime().totalMemory()
method to determine the total JVM memory being used. -
Run the implementation of the word search program using the binary search improvement from Exercise 6.9. Use the OS
time
command to time the difference between the regular and binary search versions of the program with a long list of words. You may see very little difference on small input, but you can easily find a list of the 1,000 most commonly used words in English on the Internet along with long, copyright free texts from Project Gutenberg. Combining these two into a single input should see a significant increase in speed for the binary search version relative to the regular version. -
Generate input files consisting of 1,000, 10,000, and 100,000 random
int
values. Time our implementation of selection sort from Example 6.2 running on each of these input files and redirecting output to output files. What’s the behavior of the running time as the input length increases by a factor of 10? As a function of n, how many times total does the body of the innerfor
loop run during selection sort? Does this function closely mirror the increase in running time?