Information Processing Letters 59 (1996) 289-294

Parallel maximum independent set in convex bipartite graphs

Artur Czumaj a,*, Krzysztof Diks b,1, Teresa M. Przytycka c,2

a Heinz Nixdorf Institute and Department of Mathematics & Computer Science, University of Paderborn, D-33095 Paderborn, Germany
b Instytut Informatyki, Uniwersytet Warszawski, PL-02-097 Warszawa, Poland
c Department of Computer Science, University of Maryland, A.V. Williams Bldg., College Park, MD 20742, USA

Received 20 January 1995; revised 19 August 1996
Communicated by M.J. Atallah

* Corresponding author. Email: artur@uni-paderborn.de. Supported in part by DFG-Graduiertenkolleg "Parallele Rechnernetzwerke in der Produktionstechnik" ME 872/4-.
1 Email: diks@mimuw.edu.pl. Partly supported by EC Cooperative Action K-1000 (project ALTEC: Algorithms for Future Technologies).
2 Email: przytyck@cs.umbs.edu.

0020-0190/96/$12.00 Copyright © 1996. PII S0020-0190(96)00131-7

Abstract

A bipartite graph $G = (V, W, E)$ is called convex if the vertices in $W$ can be ordered in such a way that the elements of $W$ adjacent to any vertex $v \in V$ form an interval (i.e. a sequence of consecutively numbered vertices). Such a graph can be represented in a compact form that requires $O(n)$ space, where $n = \max\{|V|, |W|\}$. Given a convex bipartite graph $G$ in the compact form, Dekel and Sahni designed an $O(\log^2 n)$-time, $n$-processor EREW PRAM algorithm to compute a maximum matching in $G$. We show that the matching produced by their algorithm can be used to construct optimally in parallel a maximum set of independent vertices. Our algorithm runs in $O(\log n)$ time with $n/\log n$ processors on an Arbitrary CRCW PRAM.

Keywords: Bipartite graphs; Convex graphs; Independent set; PRAM algorithms

1. Introduction

An independent set of a graph is a subset of its vertices such that no two vertices in the subset are adjacent. The problem of finding a maximum cardinality independent set (for short, the MIS problem) is one of the most fundamental problems in graph theory. If there are no restrictions on the input graph, the MIS problem is known to be NP-complete. However, in the case of bipartite graphs the MIS problem is closely related to the maximum matching problem and hence it can be solved in polynomial time [6]. A subset $M$ of edges of a graph $G = (V, E)$ is a matching if no two edges in $M$ are incident to the same vertex; $M$ is of maximum cardinality (or simply, a maximum matching) if it contains the maximum number of edges. The problem of finding a maximum cardinality matching is called the maximum matching problem.

In this paper we address the problem of finding in parallel a maximum independent set in a special class of graphs: convex bipartite graphs. Let $G = (V, W, E)$ be an undirected bipartite graph, where $V$, $W$ are sets of vertices and $E$ is a set of edges of the form $(v, w)$, with $v \in V$ and $w \in W$. The graph $G$ is convex if there is an ordering [...] mation to the compact representation of the graph.)
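To make the compact form concrete, the following is a minimal sequential sketch, not the parallel algorithm of Dekel and Sahni [4]. It assumes that each vertex $v \in V$ is stored as a pair $(beg(v), end(v))$ of positions in $W = \{1, \dots, m\}$, and that the "greedy matching" used later in the text follows Glover's rule [7]: scan $W$ in increasing order and match each $w$ to the still unmatched vertex whose interval contains $w$ and has the smallest $end$ value. The function name `greedy_matching`, the 0-based vertex ids, and the tie-breaking by vertex id are assumptions made for this illustration only.

```python
import heapq


def greedy_matching(beg, end, m):
    """Sequential sketch of a Glover-style greedy matching (assumed rule).

    beg[v], end[v]: inclusive interval of W-positions (1..m) adjacent to v.
    Returns a dict mapping matched vertices v of V to their partner w in W.
    """
    order = sorted(range(len(beg)), key=lambda v: beg[v])
    heap = []          # entries (end[v], v) for intervals that have started
    match = {}
    nxt = 0
    for w in range(1, m + 1):
        # Activate every vertex whose interval starts at or before w.
        while nxt < len(order) and beg[order[nxt]] <= w:
            v = order[nxt]
            heapq.heappush(heap, (end[v], v))
            nxt += 1
        # Drop vertices whose interval ended before w; they stay unmatched.
        while heap and heap[0][0] < w:
            heapq.heappop(heap)
        # Match w to the active vertex with the smallest end value.
        if heap:
            _, v = heapq.heappop(heap)
            match[v] = w
    return match
```

For example, with intervals `beg = [1, 1, 2]`, `end = [1, 2, 2]` and `m = 2`, vertices 0 and 1 are matched to positions 1 and 2 and vertex 2 stays unmatched.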
A directed edge going from a vertex $u$ to a vertex $w$ will be denoted by $u \to w$. For a set $X \subseteq V \cup W$ let $N(X)$ denote the set of "outgoing" neighbors of the vertices in $X$, i.e.

$$N(X) = \{w : u \in X \text{ and } u \to w \text{ is a directed edge in } G\}.$$

For every integer $k \ge 1$ let $R^k(X)$ be the set of vertices defined as follows:

$$R^k(X) = \begin{cases} N(X), & k = 1,\\ N(N(R^{k-1}(X))) \cup R^{k-1}(X), & k > 1. \end{cases}$$

Let $R(X)$ denote the set of all vertices reachable from $X$ in an odd number of steps. Observe that if $u$ is reachable from $X$ then it is reachable in at most $n - 1$ steps. Thus

$$R(X) = \bigcup_{k=1}^{\lceil n/2 \rceil} R^k(X).$$

For simplicity we will write $N(i)$, $R^k(i)$, $R(i)$ instead of $N(\{i\})$, $R^k(\{i\})$, $R(\{i\})$, respectively.

Let $V_0$ be the set of all unmatched vertices in $V$. Our goal, as pointed out in the Introduction, is to compute the sets $W_1 = R(V_0)$ and $V_1 = N(W_1)$. It is sufficient to show how to compute $R(V_0)$, the set of all vertices in $W$ reachable from the unmatched vertices in $V$. (Since $V_1 = \{v \in V \mid (v, u) \in M \text{ and } u \in W_1\}$, one can easily compute $V_1$ from $W_1$ and $M$ in constant time with $n$ processors.)

The basic idea of our approach is as follows. First, we show that for each $i \in V_0$, $R(i)$ is an interval whose second endpoint is $end(i)$. The first endpoints for all $i$ can be computed in $O(1)$ time with $n$ processors. Then, given a sequence of intervals sorted with respect to their $end$ values, we show how to compute the representation of the union of these intervals as a union of disjoint intervals. With this representation, we can decide whether $i \in R(V_0)$, for every vertex $i \in W$, in constant time with a linear number of processors.

In order to present our algorithm precisely we need the following lemmas. (Recall that the graph is given in the sorted compact form.)

Lemma 1. For every $i \in V_0$ and every integer $k \ge 1$ the elements of $R^k(i)$ form an interval $[b^k(i), e^k(i)]$ in $W$.

Proof. By induction on $k$. Assume that the lemma holds for $k - 1$, and let $R^{k-1}(i) = [b^{k-1}(i), e^{k-1}(i)]$. For $j \in N(R^{k-1}(i))$ let $m(j)$ be the vertex for which $(j, m(j)) \in M$. Then $N(j) = [beg(j), end(j)] \setminus \{m(j)\}$. Thus by the inductive hypothesis, it follows that $R^k(i) = [b^k(i), e^k(i)]$, where

$$b^k(i) = \min(\{b^{k-1}(i)\} \cup \{beg(j) \mid j \in N(R^{k-1}(i))\})$$

and

$$e^k(i) = \max(\{e^{k-1}(i)\} \cup \{end(j) \mid j \in N(R^{k-1}(i))\}). \qquad \Box$$

Lemma 2. Let $i \in V_0$ and $k \ge 1$. If $j \in N(R^k(i))$ then $j < i$.

Proof. Induction on $k$.

$k = 1$: If $j \in N(R^1(i))$ then there is a unique vertex $w \in R^1(i)$ such that $(j, w) \in M$. Since $i$ is unmatched and $M$ is the greedy matching, $j < i$.

$k > 1$: Assume that the lemma holds for $k - 1$. Consider the set $N(R^k(i) \setminus R^{k-1}(i))$. If this set is empty then the lemma obviously holds for $k$. Suppose that it is non-empty and let $j \in N(R^k(i) \setminus R^{k-1}(i))$. Then there are $p \in R^{k-1}(i)$, $q \in N(R^{k-1}(i))$ and $r \in R^k(i) \setminus R^{k-1}(i)$ such that $(q, p) \in M$, $(q, r) \in E \setminus M$ and $(j, r) \in M$. By the induction hypothesis, $q < i$ and therefore $end(q) \le end(i)$. Since $R^{k-1}(i)$ is an interval in $W$, it follows that $r < p$. This observation and the fact that $M$ is the greedy matching imply $j < q$; otherwise the edge $(q, r)$ would belong to $M$ instead of $(q, p)$ and $(j, r)$. $\Box$

Lemma 3. If $i \in V_0$ then $R(i) = [b(i), e(i)]$, where $b(i) = b^{\lceil n/2 \rceil}(i)$ and $e(i) = end(i)$.

Proof. The lemma follows immediately from Lemmas 1, 2 and the fact that $end(j) \le end(i)$ for every $j < i$. $\Box$

For every $i \in V$, let $q(i)$ be a vertex with the minimum $beg$ value among the vertices of $V$ matched with vertices in $[beg(i), end(i)]$. Define a function $next : V \to V$ as follows:

$$next(i) = \begin{cases} i, & beg(i) \le beg(q(i)),\\ q(i), & \text{otherwise.} \end{cases}$$

Informally, $next(i)$ is the vertex of $V$ with the smallest $beg$ value that can be reached from the vertex $i$ in at most two steps via its matched neighbors in $W$.

Let $next^0(i) = i$, $next^k(i) = next(next^{k-1}(i))$ for every $k \ge 1$, and let $next^*(i) = j$ be such that $next(j) = j$ and $next^k(i) = j$ for some $k$. Observe that if $i \ne next(i)$ then $beg(next(i)) < beg(i)$. Thus the pointers $next$ are the parent pointers of a rooted forest. Furthermore, the function $next$ has the following important property:

Lemma 4. For every $i \in V_0$ and every integer $k \ge 1$, $R^k(i) = [beg(next^{k-1}(i)), end(i)]$.

Proof. The proof is by induction on $k$. Since $next^0(i) = i$ and $R^1(i) = [beg(i), end(i)]$, the lemma holds for $k = 1$. Assume that $k > 1$ and the lemma holds for every positive integer smaller than $k$. By the induction hypothesis, $R^{k-1}(i) = [beg(next^{k-2}(i)), end(i)]$. It follows from the definition of the function $next$ that $R^k(i) \supseteq [beg(next^{k-1}(i)), end(i)]$. Suppose that there is $j \in R^k(i) \setminus [beg(next^{k-1}(i)), end(i)]$. Then there are $p \in R^{k-1}(i)$, $t \in N(R^{k-1}(i))$ such that $(t, p) \in M$ and $(t, j) \in E \setminus M$. Notice that $beg(t) \le j < beg(next^{k-1}(i))$. Let $k_0 \le k - 1$ be the smallest positive integer such that $p \in R^{k_0}(i)$. By the induction hypothesis, $p \in [beg(next^{k_0-1}(i)), end(i)]$ and $p \notin [beg(next^{k_0-2}(i)), end(i)]$. Then $beg(next^{k_0}(i)) \le beg(t) \le j < beg(next^{k-1}(i))$. Since $beg(next^{k+1}(i)) \le beg(next^k(i))$ for every $k \ge 0$, we get a contradiction. $\Box$

Corollary 5. For every $i \in V_0$, $R(i) = [beg(next^*(i)), end(i)]$.

If all $next(j)$ are known then one can compute $next^*(j)$ in $O(\log n)$ time with $n/\log n$ processors using the tree contraction technique. We must be careful here, as tree contraction algorithms are usually presented in the context of a tree where each internal node has an associated list of its children. Unfortunately, in our application this is not the case. For completeness of the presentation we show, in the Appendix, a technique that avoids this restriction.

We conclude that computing the intervals $R(i)$ reduces to computing the function $next$. The function $next$ can be computed in $O(\log n)$ time with $n/\log n$ processors as follows. Let $W' = (w_1, \dots, w_{|M|})$ be the increasing sequence of the vertices of $W$ matched in $M$. $W'$ is easy to compute using the prefix computation. Moreover, using prefix computations one can easily compute two tables $A[1..|W|]$ and $C[1..|W|]$ such that for every $w \in W$, $A[w]$ is the largest index $j$ such that $w_j \le w$ and $C[w]$ is the smallest index $j$ such that $w_j \ge w$. For every $i \in V$ let $a(i)$ and $c(i)$ be the smallest and largest indices such that $w_{a(i)} \ge beg(i)$ and $w_{c(i)} \le end(i)$. The indices $a(i)$ and $c(i)$ can be computed with linear work using the tables $A$ and $C$. Observe now that $q(i)$ is a vertex with the minimum $beg$ value among all vertices in $V$ matched with the vertices $w_{a(i)}, w_{a(i)+1}, \dots, w_{c(i)}$. In order to compute $q(i)$ (and hence $next(i)$) for all $i \in V$, one can apply the algorithm for the range minimum searching problem [2]. It takes $O(\log n)$ time on an $n/\log n$-processor CREW PRAM.

Thus, we know how to compute a representation of $R(V_0)$ as a union of $n$ intervals sorted with respect to the second endpoint. Our final step is to simplify this representation to a union of non-intersecting intervals.

Lemma 6. Let $I_1, \dots, I_p$, where $I_i = [b_i, e_i]$, be a sequence of intervals such that for $i < j$, $e_i \le e_j$. Then the set of intervals $I'_1, \dots, I'_q$ such that $I'_i \cap I'_j = \emptyset$ for any $i \ne j$ and $I'_1 \cup \dots \cup I'_q = I_1 \cup \dots \cup I_p$ can be computed in $O(\log n)$ time with $n/\log n$ processors.

Proof. First, we eliminate every interval $I_j$ such that $I_j$ is contained in an interval $I_{j'}$ with $j' > j$. To find these intervals we consider the sequence of the first endpoints of the intervals $I_1, \dots, I_p$. For each element in this sequence we find the closest dominating (i.e. not larger than the given element) successor. If such a successor exists, then the interval is eliminated. This can be done in $O(\log n)$ time with $n/\log n$ processors [2]. In this way we obtain a sequence of intervals sorted with respect to both endpoints. Now, the intervals $I'_1, \dots, I'_q$ can be computed using the list ranking technique. $\Box$

Thus we can conclude the paper with the following theorem.

Theorem. Given a (sorted, compact representation of a) convex bipartite graph $G = (V, W, E)$ of size $n$ and the greedy matching $M$, one can compute a maximum independent set of vertices in $G$ in $O(\log n)$ time using $n/\log n$ processors of a CRCW PRAM.

Appendix A. Tree contraction in the absence of an Euler tour

In this Appendix, we show how to solve optimally the rooting problem in a forest. We are given a forest $F$ with nodes $\{1, \dots, n\}$, defined by the parent relation (i.e., each node $v$ has a pointer $p(v)$ to its parent). A node $v$ is a root if $p(v) = v$. The rooting problem is to find for each node $v$ the root $r(v)$ of the tree it belongs to. Given an Euler tour of each tree (or given for every node the list of its children), the rooting problem can be solved in $O(\log n)$ time with $n/\log n$ processors using standard tree-contraction algorithms [10,1]. However, this technique cannot deal with unbounded degree trees when no Euler tour is given. We show an approach that circumvents this assumption and design an $O(\log n)$-time algorithm for the rooting problem that employs $O(n)$ operations.

Our algorithm consists of three phases. First, we reduce the problem of size $n$ to a problem of size $n/\log n$. Then we solve the smaller problem using the non-optimal algorithm of Miller and Reif [10], and finally we propagate the information from the smaller problem to all other nodes in the forest.

A.1 The first phase

The first phase reduces the number of vertices in the forest to at most $n/\log n$. Let $P = n/\log n$, let $A$ be the array containing all the nodes of $F$, let $B$ be an empty array of length $n$, and for each node $v$ let $succ(v) = p(v)$. We repeat the following process until the size of $A$ is smaller than $n/\log n$.

Finding all leaves and chains. Split the nodes in $A$ into three groups: (i) the leaves $L$, that is, all nodes $v \in A$ such that for no $u \in A$, $succ(u) = v$; (ii) the nodes on chains $C$, that is, the nodes $v \in A - L$ such that for exactly one $u \in A$, $succ(u) = v$; and (iii) the other nodes. One can easily verify which node belongs to which of these sets in $O(|A|/P)$ time with $P$ processors, and then, using the prefix sums algorithm of Cole and Vishkin [3], rearrange them so that they are stored in consecutive parts of $A$ in time $O(|A|/P + \log|A| / \log\log|A|)$ with $P$ processors.

Removing all leaves. If $v \in L$ then we set the pointer to the node which will find the root of $v$, $PT(v) = succ(v)$, and remove $v$ from the array $A$. All leaves are stored at the first free entries of the array $B$.

Halving the chains. $C$ is a collection of lists. Using the algorithm of Cole and Vishkin [3], find a maximal independent set $MIS$ in $C$ in $O(|C|/P + \log|C| / \log\log|C|)$ time with $P$ processors. Additionally we require that the last element of each list belongs to $MIS$.
• For each node $v \in C - MIS$, if $succ(v) \in MIS$ then $PT(v) = succ(v)$, and otherwise $PT(v) = succ(succ(v))$. Remove $v$ from $A$ and store all nodes from $C - MIS$ at the first free entries of $B$.
• For each node $v$ from $MIS$ that is not the last vertex on a chain, if $succ(succ(v)) \in MIS$ then set $succ(v) = succ(succ(v))$, and otherwise set $succ(v) = succ(succ(succ(v)))$.

Fact. Phase 1 can be performed with $P = n/\log n$ processors in $O(\log n)$ time.

Proof. Let $N_t$ denote the size of $A$ before iteration $t$. Standard arguments (see e.g. [10,1]) can be applied to show that $N_{t+1} \le \frac{1}{2} N_t$. Hence $N_{t+1} \le n \left(\frac{1}{2}\right)^t$ and there are at most $\log\log n$ iterations of the loop. The running time of iteration $t$ is $O(N_t/P + \log N_t / \log\log N_t)$ with $P$ processors. Summing this together we get the running time bounded by

$$\sum_{t=1}^{\log\log n} O\!\left(\frac{N_t}{P} + \frac{\log N_t}{\log\log N_t}\right) \le \frac{1}{P} \sum_{t=1}^{\log\log n} O(N_t) + \log\log n \cdot O\!\left(\frac{\log N_1}{\log\log N_1}\right) = O(\log n). \qquad \Box$$

A.2 The second phase

Now we perform a standard tree-contraction algorithm (e.g. [10]) for the forest defined by the relation $succ$ in the array $A$. It runs in $O(\log N)$ time and uses $N$ processors, where $N$ is the number of vertices in the forest. Since in our case $N = n/\log n$, this yields an $O(\log n)$-time, $n/\log n$-processor algorithm.

A.3 The third phase

In this step we have to combine the information computed for the nodes that were left after Phase 1 to obtain the pointers to the root for all the nodes in $F$. Observe that the nodes stored in $B$ are ordered with respect to the time when they were removed from $A$. This gives us a partition of $B$ into blocks of nodes that were removed at the same iteration. Since there are at most $\log\log n$ blocks, we can process them successively, one by one, in the reverse order of the time when the nodes from a given block were removed. Then, using the information about the root of all the nodes in the already processed part of $B$, we can compute $r(v)$ using the pointer $PT(v)$. If $B_i$ denotes the size of the $i$th block, then each block can be processed in constant time using $B_i$ processors, or in $O(B_i/P)$ time using $P$ processors. Summing this over all blocks we get the $O(\log n)$ running time of the third phase with $P = n/\log n$ processors.

This finally leads to the following theorem.

Theorem. The rooting problem can be solved in $O(\log n)$ time with $n/\log n$ processors on a CRCW PRAM.

References

[1] K. Abrahamson, N. Dadoun, D.G. Kirkpatrick and T. Przytycka, A simple parallel tree contraction algorithm, J. Algorithms 10 (1989) 287-302.
[2] O. Berkman, B. Schieber and U. Vishkin, Optimal doubly logarithmic parallel algorithms based on finding all nearest smaller values, J. Algorithms 14 (1993).
[3] R. Cole and U. Vishkin, Faster optimal parallel prefix sums and list ranking, Inform. and Comput. 81 (3) (1989) 334-352.
[4] E. Dekel and S. Sahni, A parallel matching algorithm for convex bipartite graphs and applications to scheduling, J. Parallel Distributed Comput. (1984) 185-205.
[5] H.N. Gabow and R.E. Tarjan, A linear-time algorithm for a special case of disjoint set union, J. Comput. System Sci. 30 (1985) 209-221.
[6] F. Gavril, Testing for equality between maximum matching and minimum node covering, Inform. Process. Lett. (1977) 199-202.
[7] F. Glover, Maximum matching in a convex bipartite graph, Naval Res. Logist. Quart. 14 (1967) 313-316.
[8] R.M. Karp and V. Ramachandran, A survey of parallel algorithms for shared-memory machines, in: J. van Leeuwen, ed., Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity (Elsevier, Amsterdam, 1990) Chapter 17, pp. 869-941.
[9] W. Lipski and F.P. Preparata, Efficient algorithms for finding maximum matchings in convex bipartite graphs and related problems, Acta Inform. 15 (1981) 329-346.
[10] G.L. Miller and J.H. Reif, Parallel tree contraction, in: Proc. 26th IEEE Symp. on Foundations of Computer Science (1985) 478-489.
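As an illustration of the main construction (the pointers $next$, the intervals $R(i) = [beg(next^*(i)), end(i)]$, and the union of intervals sorted by right endpoint), the following sequential sketch may help. It is not the PRAM algorithm: a linear scan stands in for the range minimum search, a simple loop stands in for the rooting procedure of the Appendix, and a stack stands in for the list-ranking step. The final step assumes the standard König-type construction, in which the maximum independent set is $V_0 \cup V_1 \cup (W \setminus W_1)$. All function names are hypothetical, vertices of $V$ are 0-based and assumed sorted by $end$, positions in $W$ are $1..m$, and `match` is assumed to be the greedy matching given as a dict from $V$ to $W$.

```python
def merge_sorted_by_end(intervals):
    """Disjoint union of intervals given in nondecreasing order of right end."""
    merged = []
    for b, e in intervals:
        # Absorb every earlier interval that overlaps [b, e].
        while merged and merged[-1][1] >= b:
            b = min(b, merged.pop()[0])
        merged.append((b, e))
    return merged


def maximum_independent_set(beg, end, m, match):
    """Return (V-part, W-part) of a maximum independent set (sequential sketch)."""
    n = len(beg)
    owner = {w: v for v, w in match.items()}  # matched w -> its partner in V

    def next_of(i):
        # q(i): min-beg vertex among those matched into [beg(i), end(i)].
        cands = [owner[w] for w in range(beg[i], end[i] + 1) if w in owner]
        if not cands:
            return i
        q = min(cands, key=lambda v: beg[v])
        return i if beg[i] <= beg[q] else q

    nxt = [next_of(i) for i in range(n)]

    def root(i):
        # next*(i): follow parent pointers; beg strictly decreases, so it stops.
        while nxt[i] != i:
            i = nxt[i]
        return i

    v0 = [i for i in range(n) if i not in match]         # unmatched, in end order
    r_intervals = [(beg[root(i)], end[i]) for i in v0]   # R(i) for i in V0
    w1 = set()
    for b, e in merge_sorted_by_end(r_intervals):
        w1.update(range(b, e + 1))
    v1 = {owner[w] for w in w1 if w in owner}            # V1 = N(W1)
    return set(v0) | v1, set(range(1, m + 1)) - w1
```

On the instance `beg = [1, 3, 3]`, `end = [2, 3, 3]`, `m = 3` with `match = {0: 1, 1: 3}`, the only unmatched vertex is 2, $R(2) = [3, 3]$, and the sketch returns the vertices 1 and 2 of $V$ together with the positions 1 and 2 of $W$.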

