CS 241 Lab 5: Matriarchal family relations

A family tree program

Have you ever been to a family reunion and had to figure out how you were related to a distant relative? Kinship terminology (sister, cousin, third cousin, mother-in-law, etc.) is actually used by anthropologists to help identify and distinguish cultures. In this lab we will implement a simple system for building family trees (with women only, just to make things easier) and for determining the relationship, if any, between any two people in the tree.

Here's how the program will work: the user will enter a textual description of a complete family and the program will build the family tree. (Don't worry, I'll be posting some sample family texts.) Then the user should be able to repeatedly enter two names to ask how they are related (for example, Mary and Sue): the program will respond with something like "Mary is the daughter of Sue" or "Mary is the great aunt of Sue" or even "Mary is the second cousin, once removed, of Sue". The program should accept several such queries, making responses as it goes, before a new family tree is entered via the text area and button.

Note: you may choose to implement this program either as a console application or (using tools from CS 141 or other prior experience) as a GUI application. If you use a GUI, be sure to use a text area to implement the main family data input, as you will need multiple lines.

For example, a GUI interface for the program might look like this:

Building the tree

OK, this looks great, but how do we actually write a program that does the job? We start by making a basic person class: each person will have a String name and a Person mom reference. Finally, we will need some sort of "birthing" method.

To start off, we will need to break up the data from the text area into lines: you can do that using the same technique as previous labs: use the getText() method for the text area and then either a Scanner with the nextLine() method or the split(...) method for strings. (For the latter, just use something like split("(\r|\n)+"); this will give you an array of lines.)

If you use a console approach, you have basically the same choices, but the Scanner with nextLine() is probably easier.

What should we do with the input lines? Well, if only one name appears on a line, we should create a new person with that name, and with no mother (there will always be one woman in the tree whose ancestry is unknown), and enter them into a database which will keep track of names and people for us. For the "database" you can use a hash table just as for the tree traversal lab. Recall that hash tables support put and get methods, so that you can store objects and look them up in the table by name.

For example, the following code shows how you could create a hash table and store some people in it:

		import java.util.*;
		
		...
		
		Hashtable family = new Hashtable();
		
		...
		
		family.put("Anna", new Person("Anna"));
		
		...
		
		Person p = (Person) family.get("Anna");
(Note that you must also import a library to get access to the built-in hash tables.)

For what it’s worth, here's a nice tutorial on using Hashtables:

When you see a line from the input with two names on it, you should create a new person with the first name and whose mother is the (existing) person with the second name. How will you get a reference to the mother? Just look her up in the hash table using the get method as above!

Important point: for this particular lab, we will want to move from a woman in the tree up the tree, to her mother, not down to her children: so the "tree" we build will be one with upwards links, one per person, and no downwards ones!

Finding relationships

OK, let's assume you can get a family tree built, somehow, inside a hash table "database": now how do you respond to queries from the user? Get the two strings representing names from the user and look up two Person objects in the database to get nodes in the tree. (See the code display above for how to do this.)

Once you have the two nodes, you can determine the relationship between the two people by finding their nearest common ancestor, and the "distance" (in terms of "mom" links) between each of them and this common ancestor. Let's call the two people A and B, the common ancestor C, and the distances between them k and j:

Note that the distances k and j might be zero: that is, either A or B (or both!) might be equal to C. Also note that k is (implicitly) the smaller of the two: you can use Math.min(k,j) to get the minimum explicitly, and Math.abs(j-k) to get the difference as a positive magnitude.

Once we have the values of k and j, we can use them to determine the relationship as shown in the following chart: you should convince yourself that the chart is valid by looking at some examples. (The parenthesized numbers such as "(k-3)" mean that you should use that number of repetitions of the preceding word, or use an appropriate "number word" such as twice, thrice, etc.)

Finding the common ancestor and the k and j values

Last but not least, you need to know how to find the common ancestor and the related numeric values of k and j. To do this, you could use a doubly-nested loop, where the outer loop keeps track of an upward movement in the tree from person A and the inner loop controls an upward movement from person B. As you move, you look to match the "current nodes" you are pointing to: if they match, you have found the common ancestor. And, as you move, you can keep track of counts (so as to have the k and j values when you finish). To summarize in a picture, we have:

However, it would be more efficient to not waste time in a doubly-nested loop looking at the same nodes over and over: how could you make this process more efficient? (Hint: think about the levels of the nodes in the tree: how could you compute, store and exploit this information?)

Here are some other notes about the (naive) search process: