Generalized Cores
Abstract
Cores are, besides connectivity components, one among few concepts that provides us with efficient decompositions of large graphs and networks.
In the paper a generalization of the notion of core of a graph based on vertex property function is presented. It is shown that for the local monotone vertex property functions the corresponding cores can be determined in time.
definition[theorem]Definition \newdefremark[theorem]Remark \categoryF.2.2 Analysis of algorithms and problem complexity Nonnumerical Algorithms and Problems [Computations on discrete structures] \termsAlgorithms, Performance, Theory
This work was supported by the Ministry of Education, Science and Sport of Slovenia, Project J18532. It is a detailed version of the part of the talk presented at Recent Trends in Graph Theory, Algebraic Combinatorics, and Graph Algorithms, September 2001, Bled, Slovenia.
1 Cores
The notion of core was introduced by Seidman in 1983 [Seidman (1983), Wasserman and Faust (1994)].
Let be a simple graph. is the set of vertices and is the set of lines (edges or arcs). We will denote and . A subgraph induced by the set is a core or a core of order iff and is a maximum subgraph with this property. The core of maximum order is also called the main core. The core number of vertex is the highest order of a core that contains this vertex. Since the set determines the corresponding core we also often call it a core.
The degree can be the number of neighbors in an undirected graph or indegree, outdegree, indegree outdegree, … determining different types of cores.
The cores have the following important properties:

The cores are nested:

Cores are not necessarily connected subgraphs.
In this paper we present a generalization of the notion of core from degrees to other properties of vertices.
2 cores
Let be a network, where is a graph and is a function assigning values to lines. A vertex property function on , or a function for short, is a function , , with real values.
Examples of vertex property functions: Let denotes the set of neighbors of vertex in graph , and , .





, where

, where

number of cycles of length through vertex
The subgraph induced by the set is a core at level iff


is maximal such set.
The function is monotone iff it has the property
All among functions – are monotone.
For monotone function the core at level can be determined by successively deleting vertices with value of lower than .
;
while do ;
Theorem 2.1
For monotone function the above procedure determines the core at level .
The set returned by the procedure evidently has the first property from the core definition.
Let us also show that for monotone the result of the procedure is independent of the order of deletions.
Suppose the contrary – there are two different cores at level , determined by sets and . The core was produced by deleting the sequence , , , …, ; and by the sequence , , , …, . Assume that . We will show that this leads to contradiction.
Take any . To show that it also can be deleted we first apply the sequence , , , …, to get . Since it appears in the sequence , , , …, . Let and . Then, since , we have, by monotonicity of , also . Therefore also all are deleted – – a contradiction.
Since the result of the procedure is uniquely defined and vertices outside have value lower than , the final set satisfies also the second condition from the definition of core – it is the core at level .
Corollary 2.2
For monotone function the cores are nested
Follows directly from the theorem 1. Since the result is independent of the order of deletions we first determine the . In the following we eventually delete some additional vertices thus producing . Therefore .
Example of nonmonotone function: Consider the following function
where on the network , ,
We get different results depending on whether we first delete the vertex or (or ) – see Figure 2.
The original network is a core at level 2. Applying the algorithm to the network we have three choices for the first vertex to be deleted: , or . Deleting we get, after removing the isolated vertex , the core at level 3. Note that the values of in vertices and increased from 2 to 3.
Deleting (or symmetrically – we analyze only the first case) we get the set at level 2 – the value at increased to 2.5. In the next step we can delete either the vertex , producing the set at level 3, or the vertex , producing the core at level 4.
As we see, the result of the algorithm depends on the order of deletions. The core at level 4 is not contained in the core at level 3.
3 Algorithms
3.1 Algorithm for core at level
The function is local iff
The functions – from examples are local; is not local for .
In the following we shall assume also that for the function there exists a constant such that
For a local function an algorithm for determining core at level exists (assuming that can be computed in ).
INPUT: graph represented by lists of neighbors and  
OUTPUT: , is a core at level  
1.  ;  
2.  for do ;  
3.  ;  
4.  while do begin  
4.1.  ;  
4.2.  for do begin  
4.2.1.  ;  
4.2.2.  ;  
end;  
end; 
The step 4.2.1. can often be speeded up by updating the .
This algorithm is straightforwardly extended to produce the hierarchy of cores. The hierarchy is determined by the corenumber assigned to each vertex – the highest level value of cores that contain the vertex.
3.2 Determining the hierarchy of cores
INPUT: graph represented by lists of neighbors  
OUTPUT: table with core number for each vertex  
1.  ;  
2.  for do ;  
3.  ;  
4.  while do begin  
4.1.  ;  
4.2.  ;  
4.3.  for do begin  
4.3.1.  ;  
4.3.2.  ;  
end;  
end; 
Let us assume that is a maximum time needed for computing the value of , , . Then the complexity of statements 1. – 3. is . Let us now look at the body of the while loop. Since at each repetition of the body the size of the set is decreased by 1 there are at most repetitions. Statements 4.1 and 4.2 can be implemented to run in constant time, thus contributing to the loop. In all combined repetitions of the while and for loops each line is considered at most once. Therefore the body of the for loop (statements 4.3,1 and 4.3.2) is executed at most times – contributing at most to the while loop. Often the value can be updated in constant time – . Summing up all the contributions we get the total time complexity of the algorithm .
For a local function, for which the value of can be computed in), we have .
The described algorithm is partially implemented in program for large networks analysis Pajek (Slovene word for Spider) for Windows (32 bit) [Batagelj and Mrvar (1998)]. It is freely available, for noncommercial use, at its homepage:
http://vlado.fmf.unilj.si/pub/networks/pajek/
A standalone implementation of the algorithm in C is available at
http://www.educa.fmf.unilj.si/datana/pub/networks/cores/
For the property functions – a quicker core determining
algorithm can be developed [Batagelj and
Zaveršnik (2001)].
4 Example – Internet Connections
1  1  7582  12  2048  490 
2  2  9288  13  4096  314 
3  4  12519  14  8192  153 
4  8  33866  15  16384  48 
5  16  33757  16  32768  44 
6  32  11433  17  65536  11 
7  64  6518  18  131072  9 
8  128  3812  19  262144  0 
9  256  2356  20  524288  2 
10  512  1528  21  1048576  3 
11  1024  918 
As an example of application of the proposed algorithm we applied it to the routing data on the Internet network. This network was produced from web scanning data (May 1999) available from
http://www.cs.belllabs.com/who/ches/map/index.html
It can be obtained also as a Pajek’s NET file from
http://vlado.fmf.unilj.si/pub/networks/data/web/web.zip
It has
vertices, arcs (loops were removed),
, and average degree is .
The arcs have as values the number of traceroute paths
which contain the arc.
Using Pajek implementation of the proposed algorithm on 300 MHz PC we obtained in 3 seconds the cores segmentation presented in Table 1 – there are vertices with core number in the interval .
The program also determined the core number for every vertex. Figure 3 shows a core at level 25000 of the Internet network – every vertex inside this core is visited by at least 25000 traceroute paths. In the figure the sizes of circles representing vertices are proportional to (the square roots of) their core numbers. Since the arcs values span from 1 to 626826 they can not be displayed directly. We recoded them according to the thresholds , . These class numbers are represented by the thickness of the arcs.
5 Conclusions
The cores, because they can be efficiently determined, are one among few concepts that provide us with meaningful decompositions of large networks [Garey and Johnson (1979)]. We expect that different approaches to the analysis of large networks can be built on this basis. For example, the sequence of vertices in sequential coloring can be determined by descending order of their core numbers (combined with their degrees). We obtain in this basis the following bound on the chromatic number of a given graph
Cores can also be used to localize the search for interesting subnetworks in large networks [Batagelj et al. (1999), Batagelj and Mrvar (2000)]:

If it exists, a component is contained in a core.

If it exists, a clique is contained in a core. .
References
 Batagelj and Mrvar (1998) Batagelj, V. and Mrvar, A. 1998. Pajek – a program for large network analysis. Connections 21 (2), 47–57.
 Batagelj and Mrvar (2000) Batagelj, V. and Mrvar, A. 2000. Some analyses of Erdős collaboration graph. Social Networks 22, 173–186.
 Batagelj et al. (1999) Batagelj, V., Mrvar, A., and Zaveršnik, M. 1999. Partitioning approach to visualization of large graphs. In Graph drawing, J. Kratochvíl, Ed. Lecture notes in computer science, 1731. Springer, Berlin, 90–97.
 Batagelj and Zaveršnik (2001) Batagelj, V. and Zaveršnik, M. 2001. An algorithm for cores decomposition of networks. Manuscript, Submitted.
 Garey and Johnson (1979) Garey, M. R. and Johnson, D. S. 1979. Computer and intractability. Freeman, San Francisco.
 Seidman (1983) Seidman, S. B. 1983. Network structure and minimum degree. Social Networks 5, 269–287.
 Wasserman and Faust (1994) Wasserman, S. and Faust, K. 1994. Social network analysis: Methods and applications. Cambridge University Press, Cambridge.
eceived Month Year; revised Month Year; accepted Month Year