INTRODUCTION TO COMPUTER SCIENCE
HANDOUT #5. THE TREE DATA MODEL

K5 & K6, Computer Science Department, Văn Lang University
Second semester, Feb 2002
Instructor: Trần Đức Quang

Major themes:
1. Basic Terminology
2. Implementation of Trees
3. Binary Trees and Binary Search Trees

Reading: Sections 5.2, 5.3, 5.6, and 5.7.

5.1 BASIC TERMINOLOGY

A list, discussed in the last handout, is a linear structure, whereas a tree is a non-linear structure representing hierarchical relationships among pieces of information, such as the directories and files stored in a computer.

Formally, we can define a tree as a finite set of nodes and edges such that

1. There is one specially designated node called the root of the tree. The root is generally drawn at the top.
2. Every node c other than the root is connected by an edge to one other node p, called the parent of c; c is also called a child of p. We draw the parent of a node above that node.
3. A tree is connected in the sense that if we start at any node n other than the root, move to the parent of n, to the parent of the parent of n, and so on, we eventually reach the root of the tree.

[Figure: a tree with root r; r has children n1, n2, and n3; n1 has children n4 and n5.]

In the figure, r is the root and has three children: n1, n2, and n3. We can define several important concepts from the figure.

1. The node n1 has two children, n4 and n5, but the nodes n2 and n3 have no children. A node with no children is called a leaf; any other node is an interior node.
2. n4 is a descendant of r and n1; conversely, r and n1 are ancestors of n4.
3. Nodes n1, n2, and n3 are siblings; so are n4 and n5.
4. The height of a node is the length of the longest path from that node down to a leaf, and its depth (or level) is the length of the path from the root to the node. The height of r is 2; this is also the height of the tree. The height of n1 is 1 and that of n4 is 0. The depth of r is 0, of n1 is 1, and of n4 is 2.

5.2 IMPLEMENTATION OF TREES

Many data structures can be used to represent trees. Which one we should use depends on the particular operations we want to perform.
In this very short handout, we use a common representation for a tree called leftmost-child-right-sibling, as suggested in the following figure. In the sketch, the downward arrows are the leftmost-child links; the sideways arrows are the right-sibling links.

[Figure: a tree with root r and nodes n1 to n6, sketched with leftmost-child and right-sibling links.]

We can define a structure for a node as follows:

    typedef struct NODE *pNODE;
    struct NODE {
        int info;                           /* information stored at the node */
        pNODE leftmostChild, rightSibling;  /* NULL when there is no such node */
    };

In this representation, a node has the field info to hold any information associated with the node; the fields leftmostChild and rightSibling are pointers to the leftmost child and the right sibling of the node in question, respectively. Thus a node with a NULL leftmost-child pointer is a leaf; a node with a NULL right-sibling pointer is a rightmost node, that is, it has no sibling to its right.

We can keep track of a tree using a pointer, header, to the root. From this pointer we can traverse the tree in several ways, but traversal is beyond the scope of this course.

5.3 BINARY TREES AND BINARY SEARCH TREES

In a binary tree, a node can have at most two children, and rather than counting children from the left, we call them the left child and the right child.

A similar data structure can be used for a binary tree. In this case we again use two pointers, one to the left child and the other to the right child. Either or both pointers may be NULL. A structure for a node can be declared as follows:

    typedef struct NODE *TREE;
    struct NODE {
        int info;                    /* information stored at the node */
        TREE leftChild, rightChild;  /* either or both may be NULL */
    };

Here we call the type "pointer to node" by the name TREE, since the most common use for this type will be to represent trees and subtrees. We can interpret the leftChild and rightChild fields either as pointers to children or as the left and right subtrees themselves. The other issues for binary trees are the same as those for general trees.
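To make the binary tree declaration concrete, here is a minimal sketch in C that allocates nodes and computes the height described in Section 5.1. The helper names makeNode, isLeaf, and height are hypothetical and do not come from the handout. The convention that an empty tree has height -1 is an assumption, chosen so that a leaf gets height 0.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct NODE *TREE;
struct NODE {
    int info;
    TREE leftChild, rightChild;
};

/* Hypothetical helper: allocate a node with the given info and subtrees. */
TREE makeNode(int info, TREE left, TREE right)
{
    TREE t = malloc(sizeof(struct NODE));
    if (t == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    t->info = info;
    t->leftChild = left;
    t->rightChild = right;
    return t;
}

/* A leaf is a node whose two child pointers are both NULL. */
int isLeaf(TREE t)
{
    return t != NULL && t->leftChild == NULL && t->rightChild == NULL;
}

/* Height of the tree rooted at t.
   Assumption: the empty tree has height -1, so a leaf has height 0. */
int height(TREE t)
{
    int hl, hr;

    if (t == NULL)
        return -1;
    hl = height(t->leftChild);
    hr = height(t->rightChild);
    return 1 + (hl > hr ? hl : hr);
}
```

For instance, if root is built as makeNode(1, makeNode(2, NULL, NULL), makeNode(3, NULL, NULL)), then height(root) is 1 and isLeaf(root->leftChild) is nonzero.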
Binary Search Trees

A binary search tree is a kind of binary tree that is useful for implementing a set of data elements on which we frequently perform a lookup for a specified element. The field used for the lookup is called a search key, or just key.

In a binary search tree, the following property must hold at every node x: all nodes in the left subtree of x have keys less than the key of x, and all nodes in the right subtree have keys greater than the key of x. This property is called the binary search tree property (BST property).

Trees are very important in computer algorithms and are discussed in greater detail in many textbooks. Our textbook is one of the most fundamental.

5.4 GLOSSARY

Tree: Cây.
Binary tree: Cây nhị phân.
Binary search tree: Cây nhị phân tìm kiếm, cây tìm kiếm nhị phân.
Subtree: Cây con.
Node: Nút.
Edge: Cạnh.
Leaf: Nút lá.
Interior node: Nút trong, nút nội.
Hierarchical relationship: Mối liên hệ phân cấp.
Parent: Cha, nút cha.
Child, children: Con, nút con.
Sibling: Nút anh em.
Ancestor: Tổ tiên, nút tổ tiên.
Descendant: Hậu duệ, nút hậu duệ.
Connected: Liên thông.
Leftmost, rightmost: Tận trái, tận phải.
Topmost, bottommost: Trên cùng, dưới cùng.
Lookup, Insert, Delete, Update: Tìm kiếm, Chèn, Xóa, Cập nhật.
Traverse: Duyệt (cây).
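As a supplement to Section 5.3, the Lookup and Insert operations named in the glossary can be sketched in C using the BST property. This is only an illustration under stated assumptions: the function names lookup and insert are hypothetical, the info field is taken to be the search key, and duplicate keys are ignored on insertion.

```c
#include <stdlib.h>

typedef struct NODE *TREE;
struct NODE {
    int info;                    /* assumed here to be the search key */
    TREE leftChild, rightChild;
};

/* Hypothetical helper: return 1 if key x occurs in tree t, 0 otherwise.
   At each node the BST property tells us which subtree may contain x. */
int lookup(int x, TREE t)
{
    while (t != NULL) {
        if (x == t->info)
            return 1;
        t = (x < t->info) ? t->leftChild : t->rightChild;
    }
    return 0;
}

/* Hypothetical helper: insert key x into tree t and return the root of
   the resulting tree. A key already present is left alone (assumption). */
TREE insert(int x, TREE t)
{
    if (t == NULL) {
        t = malloc(sizeof(struct NODE));
        if (t == NULL)
            exit(EXIT_FAILURE);
        t->info = x;
        t->leftChild = NULL;
        t->rightChild = NULL;
    } else if (x < t->info) {
        t->leftChild = insert(x, t->leftChild);
    } else if (x > t->info) {
        t->rightChild = insert(x, t->rightChild);
    }
    return t;
}
```

Inserting 5, 3, and 8 in that order into an empty tree leaves 5 at the root, with 3 as its left child and 8 as its right child, so the BST property holds at every node.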