We study the limits on performance of network coding in a very general setting called multi-source multi-sink network
coding over directed acyclic graphs (DAGs), which allows multiple senders and multiple receivers demanding to receive
packets from arbitrary subsets of senders. The limits on performance of network coding for a given network are best
described using its rate region which is the set containing all rate vectors for which communication is possible using some
network coding scheme. Characterizing these sets using linear/non-linear inequalities is known to be a notoriously difficult
problem in information theory. Yan et. al. [B1] provided the following implicit characterization of rate region
R_{*}:

| (1) |

where, ω is the vector of source information rates, r is the vector of edge capacities, Γ_{n}^{*} is the yet unknown
2^{n} - 1-dimensional set of all entropic vectors, L′,L′′ are sets of constraints on entropy vectors encoding independence of
sources and the fact that outgoing messages at a node are functions of incoming messages, and con(⋅),,proj_{ω,r}(⋅) represent
convex hull,closure and projection onto ω,r coordinates respectively.

Roughly speaking, R_{*} is the projection of a set with exponentially large dimension that is Γ_{n}^{*}. However, characterizing
Γ_{n}^{*} is itself a long-standing fundamental open question in information theory. We use following approaches to handle this
problem:

- Computation: Use computational means to determine rate regions of small networks. This is done by
evaluating the formula in equation 1 with the unknown set Γ
_{n}^{*}replaced by known bounds (strict subsets or supersets of Γ_{n}^{*}) on Γ_{n}^{*}. This is essential for providing intuition about large networks. - Symmetry: Use symmetry of networks to gain insights into structure of rate regions and improving efficiency of algorithms for computing bounds on rate regions.
- Hierarchy: Develop a hierarchy of networks. Based on the lines of graph well quasi ordering, this approach revolves around different notions defining sub-networks (minor relations). We try answering questions like: For which classes of networks and minor relations can we determine rate regions of large networks using rate regions its subnetworks?
- Applications: We apply our computational tools to solve the data distributed storage systems. One of the earlier models on distributed storage systems is Multilevel Diversity Coding Systems (MDCS). We solved thousands of MDCS instances and investigate their sufficiency of simple codes and sufficiency of Shannon outer bound.

Information theorists have traditionally proved rate regions by hand. This involves a converse proof: finding a
collection of conic combinations of inequalities known to hold for the problem that gives the tightest collection of
implied inequalities amongst all such collections, giving an outer bound on the rate region and an achievable
code construction for every point in the outer bound, thus showing it to be the exact rate region. In network
coding, however, this scheme falls apart due to sheer number of inequalities one must consider: e.g. the simplest
polyhedral outer bound on Γ_{n}^{*}, the LP outer bound, has 2^{n-2} + n inequalities[B2]. Imagine the manual labor
required to find that many conic combination coefficients to prove just 1 inequality bounding the rate region!
Similarly, finding an achievability scheme for a given rate point also involves a lot of head-scratching for arbitrary
networks. Proving that a small network coding instance has a certain rate region can be seen as a three step
process:

- Compute an outer bound R
^{out}using outer bound Γ_{n}^{out}⊃Γ_{n}^{*} - Compute an inner bound R
^{in}using an outer bound Γ_{n}^{in}⊂Γ_{n}^{*} - Do they match?
- Yes: Declare that we have found the rate region
- No: Repeat step 1 and 2 with improved bounds

By improving the inner/outer bounds in step 3, we mean smaller outer bound Γ_{n}^{out′} ⊂ Γ_{n}^{out} and/or a bigger inner bound
Γ_{n}^{in′} ⊃ Γ_{n}^{in}. An important restriction we make when considering the inner/outer bounds in step 1) and 2) is to consider
only polyhedral bounds that are in fact polyhedral cones. Steps 1) and 2) can now be stated as an instance of evaluating
following equation,

| (2) |

where L is the intersection of all network coding constraints. This restriction allows us to use a variety of polyhedral
computation algorithms for problems such as polyhedral extreme ray enumeration, redundancy removal, and projection, for
which there are nice implementations out there (see cdd, lrs, chm). While most known polyhedral outer bounds on Γ_{n}^{*} have
tractable inequality representations and humongous (mostly unknown) extreme ray representations, most polyhedral inner
bounds are the opposite, with tractable extreme ray representations and humongous (mostly unknown) inequality
representations.

Outer bound R^{out} on network coding rate region can be obtained using an outer bound on region of entropic vectors Γ_{n}^{out}.
If the outer bound used is Shannon outer bound of Γ_{n}, we get an outer bound R^{LP} on the rate region. Other outer bounds
on region of entropic vectors, especially those tighter than Γ_{n}, can be obtained using non-Shannon outer bounds formed by
adding non-Shannon information inequalities to Γ_{n}. One of the methods of computing non-Shannon information inequalities
is by using Csirmaz’s extended copy lemma [B3]. Both of these problems: computation of of R^{out} and computation of
non-Shannon information inequalities can be solved using our software chm that performs exact polyhedral
projection using rational arithmetic and floating point simplex with basis identification for exact solutions to LPs
(QsOpt_ex).

The most desirable property of an inner bound Γ_{n}^{in} on region of entropic vector is a readily available construction of a
collection random variables for every point in the bound. Two such classes of inner bounds are 1)Γ_{n}^{q} and
2)Γ_{n,k,q}^{⊂space}[A1], which are inner bounds obtained from 1) matroids representable over F_{q} 2) n-subspace
arrangements over F_{q} that can be obtained from size k > n representable matroids. We device algorithms to
compute these classes of inner bounds which are available in extreme ray representations of the associated
polyhedral cones. Of special interest are the orderly generation algorithms [B4] and those based on homomorphism
principle[B5].

Recall that an inequality in the rate region of a network is in terms of rate variables (edge capacities) and source entropies. Sometimes, the inequalities in a rate region are easy to derive. However, for many instances, it is not easy to derive by hand all of the inequalities for the rate region. While we can use computation to find rate regions, when we are done, we would also like to have a more traditional converse proof i.e. we would like to know the conic combination of information inequalities that yielded a given rate region inequality. A technique based on LP duality is discussed in [B6, B7] using which we show how to use a computer to generate traditional converse proofs automatically.

The projection of the Shannon outer bound intersected with the network constraints gives an outer bound on the network
coding rate region, which is in principle equivalent to a converse proof. Suppose a polyhedral cone P is determined by
Ax ≥ 0. For an inequality implied by Ax ≥ 0, b^{T}x ≥ 0, where b^{T} has zeros on eliminated dimensions, then there exists
a vector λ ≥ 0 such that A^{T}λ = b. The vector λ gives coefficients for the weighted sum of inequalities in
P.

Such a coefficient vector λ need not be unique. Hence, among the various possibilities for λ satisfying b = A^{T}λ, we want
to select one that gives the simplest proof, which means it involves the smallest number of inequalities and
constraints. This vector will thus be the sparsest vector λ, having the fewest non-zero entries, i.e. the lowest
l_{0}-norm.

Because the l_{0}-norm is a complicated non-differentiable and non-linear function, this is a difficult optimization problem to
solve directly. Hence we use the typical relaxation that replaces the l_{0}-norm with the l_{1}-norm. Furthermore, since
λ ≥ 0, we do not need the absolute values in the l_{1}-norm, and the problem is approximated by the linear
program

| (3) |

which is equivalent to

| (4) |

An example of the computer aided proof applied to an MDCS instance can be found in Fig. 1.

Graph theoretic symmetry, symmetries of network codes, and symmetries of rate regions are inherently linked to each
other.[A2] Network symmetry group (NSG) is the common thread between these notions of symmetry. It can be computed as
a subgroup of the automorphism group of a directed cyclic graph appropriately constructed from the underlying network’s
directed acyclic graph. Generalizations of Chen and Yeung’s partition symmetrical entropy functions [B8] can be obtained
corresponding to any network symmetry group. Knowledge of the network symmetry group can be utilized to reduce the
complexity of computing the LP outer bounds on network coding capacity as well as the complexity of polyhedral projection
for computing frate regions. Symmetry can also be used for improving the efficiency of computing bounds on network coding
rate region. For example, computing the LP outer bound R^{LP} [B2] on R^{*} is easier when the NSG is known.
The algorithm that can compute R^{LP} when NSG is known is called symCHM [A3]. As the name suggests, it
is a symmetry exploiting variant of CHM. symCHM is a general polyhedral projection algorithm i.e. it is
not limited to computing network coding rate regions. Here is an example comparing symCHM to CHM.

On the lines of the the forbidden minor characterizations for combinatorial structures such as graphs and matroids, we would like to see if sufficiency of linear codes for networks also has forbidden network minor characterization, atleast for restricted classes of networks. We define embedding operations in such a manner that if a class of codes is not optimal for a small network, then all large networks that have this small network embedded in it will have the same property. Furthermore, since we can obtain a database of rate regions of small networks, we would like to extend the solvability to large networks. Thus, we defined some combination operations so that rate region of the large network can be obtained from the rate regions of small networks involved in the combination.

Fig. 3: Network embedding and combination operations to preserve important properties.

A demonstration of usefulness of embedding operations can be found in [A4], where we obtained forbidden minor list for MDCS problems up to 3-level 4-encoders ,with respect to the following properties 1) Sufficiency of binary scalar codes for achieving the entire rate region 2) Sufficiency of superposition coding, as in Fig. 4.

Fig. 4: Forbidden network minors for scalar binary codes (upper) and superposition coding (lower) on MDCS problems.

As a demonstration of usefulness of combination operations is shown in Fig. 5, which shows how to solve a large network by using combination operators on some small networks and their rate regions.

Fig. 5: An example of large network that is a combination of five small networks and it can be solvable by combining rate regions of the five small networks. We have further extended these operators to general hyperedge networks in [A5]. We show that by applying recursively the seemingly straightforward combination and embedding operators described above, one can reason about large networks using small networks that are not directly related to them via one-shot use of combination and/or embedding operators.

Funding for this Topic:

This work supported by the National Science Foundation under CCF-1421828 and CCF-1016588.

[A1] J. Apte, C. Li, and J. Walsh, “Algorithms for computing network coding rate regions via single element extensions of matroids,” in Information Theory (ISIT), 2014 IEEE International Symposium on, June 2014, pp. 2306–2310.

[A2] J. Apte and J. M. Walsh, “Symmetry in Network Coding,” in Int. Symp. Information Theory (ISIT), June 2015, to appear.

[A3] ——, “Exploiting Symmetry in Computing Polyhedral Bounds on Network Coding Rate Regions,” in Int. Symp. Network Coding (NetCod), June 2015, to appear.

[A4] Congduan Li, Steven Weber, and John MacLaren Walsh, “On Multilevel Diversity Coding Systems,” IEEE Trans. Inform. Theory, submitted on July 21, 2014. Reviews received January 6, 2016. [Online]. Available: http://arxiv.org/abs/1407.5659

[A5] ——, “On Multi-source Networks: Enumeration, Rate Region Computation, and Hierarchy,” IEEE Trans. Inform. Theory, submitted on July 21, 2015. [Online]. Available: http://arxiv.org/abs/1507.05728

[A6] C. Li, S. Weber, and J. Walsh, “Network embedding operations preserving the insufficiency of linear network codes,” in 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2014, Oct 2014.

[A7] C. Li, J. Apte, J. M. Walsh, and S. Weber, “A new computational approach for determining rate regions and optimal codes for coded networks,” in IEEE International Symposium on Network Coding (NetCod), Jun 2013, pp. 1 –6.

[A8] C. Li, J. M. Walsh, and S. Weber, “Computational approaches for determining rate regions and codes using entropic vector bounds,” in 50th Annual Allerton Conference on Communication, Control and Computing, Oct 2012, pp. 1 –9.

[A9] S. Weber, C. Li, and J. M. Walsh, “Rate region for a class of delay mitigating codes and p2p networks,” 46th Annual Conference on Information Sciences and Systems, March 2012.

[B1] X. Yan, R. Yeung, and Z. Zhang, “An implicit characterization of the achievable rate region for acyclic multisource multisink network coding,” IEEE Transactions on Information Theory, vol. 58, no. 9, pp. 5625–5639, Sept 2012.

[B2] R. W. Yeung, Information Theory and Network Coding. Springer, 2008.

[B3] L. Csirmaz, “Information inequalities for four variables,” January 2013.

[B4] B. D. McKay, “Isomorph-free exhaustive generation,” J. Algorithms, vol. 26, no. 2, pp. 306–324, Feb. 1998. [Online]. Available: http://dx.doi.org/10.1006/jagm.1997.0898

[B5] Betten, A., Braun, M., Fripertinger, H., Kerber, A., Kohnert, A., and Wassermann, A., Error-Correcting Linear Codes: Classification by Isometry and Applications. Springer-Verlag Berlin Heidelberg, 2006, vol. 18.

[B6] C. Tian, “Rate region of the (4, 3, 3) exact-repair regenerating codes,” in IEEE International Symposium on Information Theory (ISIT), July 2013, pp. 1426–1430.

[B7] ——, “Characterizing the rate region of the (4,3,3) exact-repair regenerating codes,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 5, pp. 967–975, May 2014.

[B8] Q. Chen and R. W. Yeung, “Partition-symmetrical entropy functions,” CoRR, vol. abs/1407.7405, 2014. [Online]. Available: http://arxiv.org/abs/1407.7405