The Shortest Path Problems


Let's define the shortest path problem first:

Definition (Shortest Path Problem)

A weighted digraph $(G, w)$ is a digraph $G = (V, E)$ with a weight assigned to every edge. That is, there's a mapping $w \colon E \to \mathbb{R}$ such that $w(e)$ is the weight of the edge $e$.

(For convenience, we shall let the weights be real numbers. Doing so is sufficient for most cases, and more exotic weight domains can be handled by imitating the arguments here.)

For a walk $W = v_0 e_1 v_1 e_2 \cdots e_k v_k$ in a weighted digraph $(G, w)$, the weight of the walk is defined as $w(W) = \sum_{i=1}^{k} w(e_i)$. And the shortest path problem between $s, t \in V$ (specified vertices) is to find an $st$-walk with minimum weight.

The shortest path problem can also be defined on graphs with weights assigned to edges. However, given that we can equivalently reduce the problem to the digraph version by replacing every undirected edge with two directed edges in opposite directions carrying the same weight, for convenience we shall narrow the discussion down to digraphs.

So a shortest path problem is to find a walk rather than a path, contrary to what the name indicates. It's unfortunate that mathematicians use walk versus path, while computer scientists use path versus simple path, and the term "shortest path problem" became more widespread than "shortest walk problem". Anyway, a simple naming issue won't undermine the well-formedness of the discussion in this text, as long as we've clarified what we are to find.

(For convenience, in this text, the shortest $st$-path (in weight) means the $st$-walk with minimum weight from now on.)

Obviously, finding the shortest $st$-walk in length is equivalent to finding the shortest $st$-path in the weighted digraph with weight $1$ assigned to all edges uniformly. However, one should not feel too surprised when a shortest $st$-walk in length is not the shortest $st$-path:

Example

Consider finding the shortest $st$-path in the following weighted digraph:

The shortest $st$-path is marked blue. Clearly it's not the shortest in length.

It's also good to bear in mind that the weight of an edge can be negative.

So the shortest path problem is a generalization of finding the shortest walk in length in a digraph, and is much more useful, since now more information about the concrete problem is embedded into the weighted digraph, and is solvable by the tools we are going to introduce in this text.

The plural form "problems" in the title is not a careless typo of an extra s-letter, but due to our referring to the two real-world variations of the shortest path problem:

  1. Single-Source Shortest Path (SSSP): With a specified vertex $s$ called the source vertex, find the shortest $sv$-path in weight for every vertex $v \in V$.
  2. All-Pairs Shortest Path (APSP): Find the shortest $uv$-path in weight for every pair of vertices $u, v \in V$.

The SSSP problem is actually not a true variation, since the shortest path problem between specified endpoints $s$ and $t$ can be shown to be equivalent to the SSSP problem with $s$ as the source vertex in general: consider that in the weighted digraph of this problem, every other vertex $v$ may be adjacent to $t$. Given that the value of $w(v, t)$ for such an edge $(v, t)$ is arbitrary, our answer won't be correct without finding the shortest $sv$-path first.

Similarly, there's also the Single-Sink Shortest Path problem, which is to find the shortest $vt$-path in weight for every vertex $v \in V$ and a specified sink vertex $t$. However, it's also as hard as the shortest path problem between specified endpoints in general, given that we may let every other vertex in $V$ be adjacent to $t$.

For convenience, in this text, we may assume there are no multiple edges, and thus $m \le n^2$ where $n = |V|$ and $m = |E|$: when there's more than one edge from $u$ to $v$, it suffices to preprocess and keep only the edge of minimum weight, without affecting the result. In this way, there are no more than $n^2$ edges in $G$, although $m$ might be far less than $n^2$ when the digraph is sparse.

The APSP problem can be naively solved by running an SSSP algorithm with every vertex in $V$ as the source. Let's be dealing with preprocessed dense digraphs, so that $m = \Theta(n^2)$. The Bellman-Ford algorithm, which is an SSSP algorithm for general digraphs, runs in time complexity $O(nm) = O(n^3)$, and thus the APSP algorithm based on it runs in time complexity $O(n^4)$. Dijkstra's algorithm, which is an SSSP algorithm for digraphs of non-negative edge weights, runs in time complexity $O((n + m) \log n) = O(n^2 \log n)$ (using a binary heap), and thus the APSP algorithm based on it runs in time complexity $O(n^3 \log n)$.

However, the naive way mentioned above is inefficient, and we will show that even the APSP problem of a general dense digraph can be solved within time complexity $O(n^3)$, by the Floyd-Warshall algorithm. This is even faster than running Dijkstra's SSSP algorithm from every source vertex in $V$, which is applicable only to digraphs with non-negative edge weights.

Fundamental Theorem of Shortest Path Problem

Before delving into any SSSP or APSP algorithm, it's crucial to characterize the shortest path problem first.

Think of the following example: Tom is an accountant of an international enterprise, which is based in country $A$ and has off-shore business in countries $B$, $C$ and $D$. The main responsibility of Tom is to handle the currency exchange between the four countries.

Consider a digraph $G$ with vertices $\{A, B, C, D\}$, whose underlying graph is the complete graph over them, such that $r(u, v)$ represents the exchange rate from $u$ to $v$ after subtracting the transaction fee, per unit of currency of $u$. So converting from one currency to another and then back is lossy, and Tom needs to find a way of exchange that minimizes the loss.

The behaviour of currency exchange is multiplicative: for a walk $W = v_0 v_1 \cdots v_k$ in digraph $G$, the value of one unit of currency after the exchange is $\prod_{i=1}^{k} r(v_{i-1}, v_i)$. However, we may convert $G$ into a weighted digraph with weights $w(u, v) = -\log r(u, v)$ (mind the minus sign), such that $w(W) = -\log \prod_{i=1}^{k} r(v_{i-1}, v_i)$, and the value of currency after the exchange is $e^{-w(W)}$. In this way, the less the value of $w(W)$, the more value of currency preserved after the exchange. Therefore, in order to fulfill his responsibility, Tom just needs to find a shortest path in $(G, w)$.
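Tom's log trick can be sketched in a few lines of Python; the rates below are made-up numbers for illustration, and the base of the logarithm doesn't matter as long as it's used consistently:

```python
import math

# Hypothetical exchange rates r(u, v) after fees; the numbers are made up.
rate = {
    ("A", "B"): 0.9,
    ("B", "C"): 0.8,
    ("C", "A"): 1.2,
}

# Convert multiplicative rates to additive weights: w(u, v) = -log r(u, v).
weight = {edge: -math.log(r) for edge, r in rate.items()}

# The weight of a walk is the sum of edge weights, and the currency value
# retained along the walk is exp(-total_weight), i.e. the product of rates.
walk = [("A", "B"), ("B", "C"), ("C", "A")]
total_weight = sum(weight[e] for e in walk)
retained = math.exp(-total_weight)

product_of_rates = 1.0
for e in walk:
    product_of_rates *= rate[e]

assert abs(retained - product_of_rates) < 1e-12
```

Minimizing the total weight of a walk is therefore the same as maximizing the product of the exchange rates along it.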

Tom was depressed. One day, when Tom was to do the currency exchange between two of the countries, he had trouble finding the shortest path in $(G, w)$. The crucial elements of the digraph are shown as follows, with other edges hidden:

First, Tom found a temporarily shortest path $P$. But soon he noticed that along it there's a cycle $C$ with $w(C) < 0$.

And by circulating around $C$ each time he passed it, he obtained walks with less and less weight. So he realized such a shortest path would never exist, as he can always concatenate the cycle to any so-called shortest path, to make it even shorter.

Tom shared his discovery with his best friend Jerry. "Stop making that frown face, bro, this is going to make big money," said Jerry. "The cycle you found is a cash printer: every time you circulate around the cycle, you multiply the currency value by a factor of $e^{-w(C)} > 1$, not to mention the growth is exponential in the number of circulations."

While the story of Tom is fiction and too supernatural to happen, it reveals the specialty of the shortest path problem in general: the shortest $st$-path might not exist even if $t$ is reachable from $s$.

In fact, for the existence of the shortest -path, we would like to prove the Fundamental Theorem of Shortest Path Problem:

Theorem (Fundamental Theorem of Shortest Path Problem)
The shortest $st$-path in weight either is attained by an $st$-path, or does not exist.

Proof. Consider there's some vertex $v$ lying on an $st$-walk, such that there's a closed walk $C$ passing $v$ with $w(C) < 0$. Consider an $st$-walk $W$ in the form of $W_1 v W_2$; by circulating around $C$ when passing $v$, we obtain $st$-walks $W_1 C^k W_2$ such that $w(W_1 C^k W_2) = w(W) + k \cdot w(C)$. In this way, we can make the weight of an $st$-walk arbitrarily small, and thus the shortest $st$-path shall never exist.

On the other hand, assume for any vertex $v$ lying on an $st$-walk, every closed walk $C$ passing through $v$ fulfills $w(C) \ge 0$. Consider any $st$-walk $W$ in the form of $W_1 v W_2 v W_3$, repeating some vertex $v$, and write $C = v W_2 v$ for the closed segment:

  1. By presumption, it's impossible for $w(C) < 0$ to be true.
  2. When $w(C) = 0$, we may fix $W$ into $W_1 v W_3$, which is shorter in length by eliminating the segment $C$, without increasing the weight.
  3. When $w(C) > 0$, we may fix $W$ into $W_1 v W_3$, which is shorter in weight due to $w(W_1 v W_3) = w(W) - w(C) < w(W)$.

In this way, for any $st$-walk, its weight is greater than or equal to that of an $st$-path it can be fixed into. Thus, a shortest $st$-path is to be found among the $st$-paths. Since the set of $st$-paths is finite, the shortest $st$-path must exist in this case.

Conventionally, we call a closed walk with negative weight a negative cycle. Thus the Fundamental Theorem of Shortest Path Problem can be reworded as:

Corollary

In the shortest path problem between $s, t \in V$:

  1. If there's a negative cycle between $s$ and $t$, then there's no shortest $st$-path;
  2. If there's no negative cycle between $s$ and $t$, then the shortest $st$-path exists and is attained by an $st$-path.

To say there's a negative cycle between $s$ and $t$, it means there's some vertex $v$ lying on an $st$-walk, such that there's a negative cycle passing through $v$.

The Fundamental Theorem of Shortest Path Problem gives rise to the graph metric of weighted distance:

Definition (Weighted Distance between Vertices)
In the weighted digraph $(G, w)$, the weighted distance $\operatorname{dist}_w(u, v)$ is a graph metric between every pair of vertices, ranged within $\mathbb{R} \cup \{+\infty, -\infty\}$, such that:

  1. $\operatorname{dist}_w(u, v) = +\infty$ if there's no $uv$-walk.
  2. $\operatorname{dist}_w(u, v) = \min \{\, w(P) : P \text{ is a } uv\text{-path} \,\}$, if there's no negative cycle between $u$ and $v$.
  3. $\operatorname{dist}_w(u, v) = -\infty$ if there's a negative cycle between $u$ and $v$.

The weighted distance is almost the weighted version of the distance $\operatorname{dist}(u, v)$ in length, except for its capability of taking $-\infty$ when there's a negative cycle between $u$ and $v$.

The Fundamental Theorem of Shortest Path Problem characterizes how a correct algorithm for solving the shortest path problem should behave. However, it's costly and mostly unneeded to enumerate all negative cycles. So most algorithms will yield a shortest $st$-path $P$ such that $w(P) = \operatorname{dist}_w(s, t)$ if it exists, and otherwise just halt with the signal "negative cycle detected".

The following lemma is not fundamental, but is obvious and quite common in our proofs:

Lemma
For any shortest $st$-path $P$, any $uv$-walk segment of $P$ is a shortest $uv$-path.

Proof. Let $Q$ be the $uv$-walk segment of $P$. If there's a negative cycle between $u$ and $v$, then it's a negative cycle between $s$ and $t$, which is a contradiction to $P$ being a shortest $st$-path. Thus the shortest $uv$-path exists.

And if $w(Q) > \operatorname{dist}_w(u, v)$, then we can replace the walk segment $Q$ in $P$ with the shortest $uv$-path $Q'$, so that we obtain an $st$-walk with weight $w(P) - w(Q) + w(Q') < w(P)$, which is even shorter than $P$ and a contradiction to $P$ being a shortest $st$-path.

In this way, it suffices for an SSSP algorithm accepting $(G, w, s)$ as input to signal "negative cycle detected", or to output a mapping $\operatorname{prev}$ of previous vertices and a weighted distance function $\operatorname{dist}$ from $V$. The mapping fulfills: $\operatorname{prev}(v) = \varnothing$ if $v = s$ or $v$ is unreachable from $s$; or $\operatorname{prev}(v) = u$ if there's a shortest $sv$-path such that the previous vertex of $v$ in it is $u$, since a shortest $sv$-path must contain a shortest $su$-path as its prefix. The weighted distance function fulfills $\operatorname{dist}(v) = \operatorname{dist}_w(s, v)$.

The case of an APSP algorithm is analogous: it suffices for the algorithm to signal "negative cycle detected", or to output a mapping $\operatorname{prev}(u, v)$ of previous vertices and the weighted distances $\operatorname{dist}(u, v)$ such that $\operatorname{dist}(u, v) = \operatorname{dist}_w(u, v)$. The mapping $\operatorname{prev}(u, v)$ is to output the previous vertex of $v$ in a shortest $uv$-path, given that a shortest $uv$-path must contain a shortest path from $u$ to that previous vertex as its prefix.

We will see, in an SSSP algorithm with source vertex $s$, it's reasonable to initialize $\operatorname{dist}(s) = 0$, $\operatorname{dist}(v) = +\infty$ for every $v \ne s$, and $\operatorname{prev}(v) = \varnothing$ for every $v$. In an APSP algorithm, it's reasonable to initialize $\operatorname{dist}(u, u) = 0$, $\operatorname{dist}(u, v) = +\infty$ for every $u \ne v$, and $\operatorname{prev}(u, v) = \varnothing$.

Bellman-Ford SSSP Algorithm

The Bellman-Ford SSSP algorithm might be the simplest SSSP algorithm that can be proved to be correct when applied to general weighted digraphs. We would like to present the Bellman-Ford algorithm first:

Algorithm (Bellman-Ford)
The input is the digraph $G$ with weight $w$ and the source vertex $s$. Either the algorithm halts with "negative cycle detected", or it outputs the classical result of $(\operatorname{prev}, \operatorname{dist})$ in SSSP, as is described here:

  1. Perform classical initialization of $(\operatorname{prev}, \operatorname{dist})$ in SSSP.
  2. Loop for $n - 1$ times:
    1. For every $(u, v) \in E$:
      1. If $\operatorname{dist}(u) + w(u, v) < \operatorname{dist}(v)$, then update
        $\operatorname{dist}(v) \leftarrow \operatorname{dist}(u) + w(u, v)$, $\operatorname{prev}(v) \leftarrow u$.
  3. For every $(u, v) \in E$:
    1. If $\operatorname{dist}(u) + w(u, v) < \operatorname{dist}(v)$, then halt with "negative cycle detected".
  4. Halt with classical result of $(\operatorname{prev}, \operatorname{dist})$ in SSSP.

So the layout of the Bellman-Ford algorithm is simple: initially the shortest $ss$-path is the empty walk (if there's no negative cycle, which will be checked later), then we perform $n - 1$ passes of construction of all shortest $sv$-paths (we will need to prove this), and finally we check whether the $\operatorname{dist}$ stabilizes, in order to confirm whether there's a negative cycle reachable from $s$ (we will also need to prove this).
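The algorithm above can be sketched in Python as follows (a minimal version for vertices numbered $0$ to $n - 1$; the raised error stands in for the "negative cycle detected" signal):

```python
INF = float("inf")

def bellman_ford(n, edges, s):
    """Bellman-Ford SSSP on vertices 0..n-1.

    edges is a list of (u, v, w) triples.  Returns (prev, dist), or raises
    ValueError when a negative cycle reachable from s is detected.
    (A sketch following the text's layout; not tuned for performance.)
    """
    dist = [INF] * n
    prev = [None] * n
    dist[s] = 0
    # Step 2: n - 1 passes of relaxation over all edges.
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
    # Step 3: any further improvement witnesses a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle detected")
    return prev, dist
```

Note that `INF + w` stays `INF` for any finite `w`, so unreachable vertices never trigger an update, matching the classical initialization.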

Again, the order of visiting $E$ depends on the algorithmic representation of $G$. Thus $\operatorname{dist}$ might stabilize after an arbitrary number of passes (let every edge be of weight $1$ in the following example):

For example, the Bellman-Ford stabilizes right after the first pass when visiting $E$ in the order $(v_1, v_2), (v_2, v_3), \ldots, (v_{n-1}, v_n)$, as is shown in the figure on the left above. However, it will take $n - 1$ passes for the Bellman-Ford to stabilize when visiting $E$ in the reversed order, as is shown in the figure on the right above. If the Bellman-Ford algorithm would like to be correct, it must be capable of handling whichever permutation of $E$.

To prove its correctness, we must articulate what each pass of step 2 of the Bellman-Ford algorithm actually finds. Ironically, the would-be difficulty of handling permutations of $E$ sheds light on the problem. Let's do a little observation:

Consider the first pass: no matter how we permute $E$, the temporarily shortest paths of length at most $1$ will always be found. This is due to the fact that the temporarily shortest $ss$-path of length $0$ has been found before the first pass of step 2, and every edge $(s, v)$ will always be tested regardless of the permutation of $E$.

Consider the second pass: again, no matter how we permute $E$, the temporarily shortest paths of length at most $2$ will always be found, due to the temporarily shortest paths of length at most $1$ found after the first pass, and that every edge will be checked regardless of the permutation of $E$.

In this way, it turns out that the temporarily shortest paths found at each pass are what we can rely on: obviously any walk segment of a temporarily shortest path must also be a temporarily shortest path of its own length. Conversely, we concatenate edges to temporarily shortest paths, to construct longer temporarily shortest paths. In fact, by some tweaks and accommodations, eventually we can find the following lemma to be true:

Lemma
The temporarily shortest $sv$-path of length $k$ (that is, the $sv$-walk with minimum weight among all $sv$-walks of length at most $k$) is found not after the $k$-th pass of step 2 in the Bellman-Ford SSSP algorithm with source vertex $s$.

Proof. Clearly it can be proved by mathematical induction. Initially, the shortest $ss$-path of length $0$ is only the empty walk at $s$, found at the $0$-th pass, i.e. right after step 1.

Let's prove the temporarily shortest $sv$-path $P$ of length $k$ is found not after the $k$-th pass. Write $P = P' (u, v)$ where $(u, v)$ is its last edge. Note that $P'$ must be a temporarily shortest $su$-path of length $k - 1$, and by the induction hypothesis, it's found not after the $(k - 1)$-th pass. Let it be found at the $j$-th pass ($j \le k - 1$); at the $(j + 1)$-th pass, the edge $(u, v)$ is inevitably visited, therefore $P$ is found not after the $(j + 1)$-th pass, and $j + 1 \le k$.

This also leads us to the proof of the algorithm:

Theorem
The Bellman-Ford SSSP algorithm yields the correct answer.

Proof. Consider every subproblem of finding the shortest $sv$-path. By the Fundamental Theorem of Shortest Path Problem, if there's no negative cycle between $s$ and $v$, there's a shortest $sv$-path that is an $sv$-path. The length of a path in $G$ is at most $n - 1$, and the shortest $sv$-path is then the temporarily shortest $sv$-path of length $n - 1$, thus found not after the $(n - 1)$-th pass.

Given that all shortest paths must be found not after the $(n - 1)$-th pass of step 2 of the algorithm, the $\operatorname{dist}$ must have stabilized when it's being checked at step 3. If not so, there must be a negative cycle.

The $\operatorname{prev}$ won't contain a cycle if there's no negative cycle. Otherwise there's a vertex $v$ such that a temporarily shortest $sv$-path $P$ is found first, which is extended to a temporarily shortest $su$-path $P Q$ for some $u$, and further extended into a temporarily shortest $sv$-path $P Q R$ passing $v$ again. Note that $w(P Q R) < w(P)$, which indicates the closed walk $Q R$ from $v$ back to $v$ is a negative cycle.

In this way, the Bellman-Ford SSSP algorithm yields the correct answer.

Theorem
The Bellman-Ford SSSP algorithm runs in time complexity $O(nm)$ and space complexity $O(n)$.

Proof. There're $n - 1$ passes of finding temporarily shortest paths and $1$ pass of detecting negative cycles; each pass iterates over the $m$ edges and performs $O(1)$ operations per edge. Thus the algorithm runs in time complexity $O(nm)$.

The algorithm allocates the two mappings of $\operatorname{prev}$ and $\operatorname{dist}$, both taking $O(n)$ space. Thus the eventual space complexity is $O(n)$.

A common optimization of the Bellman-Ford SSSP algorithm is to halt as soon as a pass finds no more temporarily shortest path, i.e. as soon as $\operatorname{dist}$ stabilizes, given that the future passes at step 2 will not change $\operatorname{dist}$ either, and then there's clearly no negative cycle reachable from $s$ in the weighted digraph. However, doing so will not change the (worst case) time complexity.
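The optimization amounts to a small change to the relaxation loop (a sketch; the function name is made up):

```python
INF = float("inf")

def bellman_ford_early(n, edges, s):
    """Bellman-Ford that halts as soon as a pass changes nothing.

    Returns (prev, dist); raises ValueError on a reachable negative cycle.
    A pass with no update means dist has stabilized, so no negative cycle
    is reachable from s and all further passes would be no-ops.
    """
    dist = [INF] * n
    prev = [None] * n
    dist[s] = 0
    for _ in range(n):  # the n-th pass can only fire on a negative cycle
        updated = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
                updated = True
        if not updated:
            return prev, dist  # stabilized early
    raise ValueError("negative cycle detected")
```

The stabilization check absorbs step 3 of the original algorithm: an $n$-th pass that still updates something is exactly a failed stabilization check.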

Dijkstra's SSSP Algorithm

It turns out that finding SSSP can be much simpler when there's no edge with negative weight. For example, it suffices to perform one pass of BFS at the source vertex $s$ to find the shortest paths in length, which is equivalent to finding SSSP in the weighted digraph with weight $1$ assigned to all edges, and runs in time complexity $O(n + m)$. When the weight is not just $1$ but non-negative, the SSSP can also be solved greedily, which is the core idea of Dijkstra's SSSP algorithm.

Again, let's present the Dijkstra's SSSP algorithm first:

Algorithm (Dijkstra)
The input is the digraph $G$ with non-negative weight $w$ and the source vertex $s$; the output is the classical result of $(\operatorname{prev}, \operatorname{dist})$ in SSSP, as is described here:

  1. Perform classical initialization of $(\operatorname{prev}, \operatorname{dist})$ in SSSP.
  2. Initialize an empty priority queue $Q$, which stores elements of key-priority pairs $(v, p)$, and supports dequeuing the element of minimum priority, or modifying the priority of an element by its key $v$.
  3. Push $(s, 0)$ into $Q$.
  4. Loop while $Q$ is not empty:
    1. Dequeue the element $(u, p)$ of minimum priority from $Q$.
    2. For every $(u, v) \in E$:
      1. If $\operatorname{dist}(u) + w(u, v) < \operatorname{dist}(v)$:
        1. Update $\operatorname{dist}(v) \leftarrow \operatorname{dist}(u) + w(u, v)$, $\operatorname{prev}(v) \leftarrow u$,
        2. Enqueue-or-update-priority $(v, \operatorname{dist}(v))$ in $Q$.
  5. Halt with classical result of $(\operatorname{prev}, \operatorname{dist})$ in SSSP.

The idea is simple: prioritize the paths with minimum weight, and extend those paths first.
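The algorithm can be sketched in Python; note that instead of updating priorities in place as in step 4, the sketch below uses the common "lazy deletion" variant built on heapq, which enqueues a fresh element and skips stale ones on dequeue:

```python
import heapq

INF = float("inf")

def dijkstra(adj, s):
    """Dijkstra's SSSP for non-negative weights.

    adj maps each vertex u to a list of (v, w) pairs.  Uses lazy deletion:
    outdated heap entries are skipped on dequeue instead of being updated.
    Returns (prev, dist) as dictionaries.
    """
    dist = {u: INF for u in adj}
    prev = {u: None for u in adj}
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale entry: u was already finalized with a smaller key
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(pq, (dist[v], v))
    return prev, dist
```

The lazy-deletion variant keeps the same asymptotic bounds up to the heap holding $O(m)$ instead of $O(n)$ entries, and it spares us the index-tracking machinery discussed later for the in-place binary heap.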

To prove the correctness of Dijkstra's SSSP algorithm, we need to characterize the behaviour of the priority queue first:

Lemma
If $(u, p_u)$ is dequeued from $Q$ before $(v, p_v)$, then $p_u \le p_v$. (Note that $(u, p_u)$ and $(v, p_v)$ don't have to be in $Q$ at the same time.)
Proof. It suffices to prove the minimum priority of $Q$ is non-decreasing over time. Consider when $(u, p_u)$ is dequeued from $Q$: every element remaining in $Q$ has priority at least $p_u$. Assume some $(v, p_v)$ is enqueued-or-priority-updated afterwards due to an edge $(x, v)$; then we have $p_v = \operatorname{dist}(x) + w(x, v) \ge \operatorname{dist}(x) = p_x \ge p_u$, where $(x, p_x)$ was dequeued no earlier than $(u, p_u)$.

And we can see the non-negative weights of edges in the digraph participate in the monotonicity of priority. Such monotonicity leads to:

Lemma
When $(u, p_u)$ is dequeued from $Q$, no element with key $u$ will be enqueued into $Q$ again, and the values of $\operatorname{dist}(u), \operatorname{prev}(u)$ stabilize.
Proof. Assume the contrary that after $(u, p_u)$ is popped from $Q$, when some $(x, p_x)$ is popped from $Q$, an edge $(x, u)$ is checked, rendering $\operatorname{dist}(x) + w(x, u) < \operatorname{dist}(u)$ and causing an element with key $u$ to be enqueued into $Q$ again. By the monotonicity of the priority queue $Q$, since $(x, p_x)$ is dequeued from $Q$ after $(u, p_u)$, there must be $p_x \ge p_u$, which implies $\operatorname{dist}(x) \ge \operatorname{dist}(u)$. Note that the inequality chain contains $\operatorname{dist}(x) + w(x, u) < \operatorname{dist}(u) \le \operatorname{dist}(x)$ with $w(x, u) \ge 0$, which is a contradiction.

And now we can prove the correctness of the Dijkstra's SSSP algorithm:

Theorem
The Dijkstra's SSSP algorithm yields the correct answer.

Proof. For those $v$ unreachable from $s$, the initial values $\operatorname{dist}(v) = +\infty$, $\operatorname{prev}(v) = \varnothing$ have been correct. When $v$ is reachable from $s$, the algorithm guarantees some values are assigned to $\operatorname{dist}(v), \operatorname{prev}(v)$, implying an $sv$-path, and an element with key $v$ is enqueued into $Q$. We just need to ensure the $sv$-path implied by $\operatorname{prev}$ is correct when $(v, p_v)$ is dequeued from $Q$.

Assume the contrary that the $sv$-path found is incorrect, and the correct shortest $sv$-path is $P$, ending with the edge $(u, v)$, such that $w(P) < \operatorname{dist}(v)$. We may assume the algorithm is able to find $\operatorname{dist}(u) = \operatorname{dist}_w(s, u)$ correctly, otherwise we may recursively reason on $u$ first. Since the $su$-prefix of $P$ is a shortest $su$-path, every recursion inspects a vertex appearing earlier in $P$, thus the recursion can't last forever.

Note that $(u, p_u)$ must be dequeued after $(v, p_v)$, since otherwise when $(u, p_u)$ is popped, the edge $(u, v)$ is checked, and the priority of $v$ will be set to not greater than $\operatorname{dist}(u) + w(u, v) = w(P)$, which is a contradiction. And if $(u, p_u)$ is dequeued after $(v, p_v)$, there must be $p_u \ge p_v$, but we expect $p_u = \operatorname{dist}(u) \le w(P) < \operatorname{dist}(v) = p_v$ instead, which is a contradiction.

The $\operatorname{prev}$ won't contain a cycle: if Dijkstra's SSSP algorithm introduced a cycle through some temporarily shortest $sv$-path passing $v$ twice, the cycle would have non-negative weight, so the walk containing it would be no lighter than the walk without it, while every update requires a strict decrease of weight, which is a contradiction.

In this way, Dijkstra's SSSP algorithm yields the correct answer.

Again, we can see the non-negative weights of edges play a role in ensuring the correct shortest $sv$-path to be found once $(v, p_v)$ is popped, after which $\operatorname{dist}(v)$ stabilizes.

Theorem
Dijkstra's SSSP algorithm runs in time complexity $O((n + m) \log n)$ and space complexity $O(n)$, if a (minimum) binary heap is used to fulfill the purpose of the priority queue.

Proof. To use a binary heap to fulfill the purpose of the priority queue in Dijkstra's SSSP algorithm, that is, to support updating priorities, a common approach is to remove the old element from the binary heap and then add the new element of updated priority back. To do so, we will have to keep track of the vertex in addition to the priority in the heap; and for every vertex, we will have to keep track of the index of the element keyed by that vertex, given that a binary heap is built on an array.

Every time the heap percolates elements, we look up the vertex of the element and adjust the tracked index of the element. To remove the element keyed by a specific vertex, we look up the index of the element associated with this vertex, swap the element with the last element in the heap, shrink the heap size, then adjust the heap beginning at the place of the swapped element.

Every element is added to and removed from the priority queue once. There're at most $n$ elements in the heap, thus the cost of adding and removing elements is $O(n \log n)$. Every time an edge is visited, it may adjust the priority of an element, thus the cost of adjusting priorities is $O(m \log n)$. In this way, the eventual time complexity is $O((n + m) \log n)$.

The heap and the index-tracking map take $O(n)$ space, alongside the mappings of $\operatorname{prev}, \operatorname{dist}$ taking $O(n)$ space. In this way, the eventual space complexity is $O(n)$.

The priority queue can also be implemented by a Fibonacci heap, which supports enqueuing and priority updates in amortized constant time; thus the time complexity of the implementation by Fibonacci heap can reach $O(m + n \log n)$, asymptotically faster than the implementation by binary heap running in $O((n + m) \log n)$. However, the Fibonacci heap is much more complex than the binary heap, and is reported to have a large constant factor and to be generally slower than the binary heap within the reasonable data scale of OI problems. In this way, the time complexity bound by Fibonacci heap remains theoretical, and in practice it suffices to implement Dijkstra's algorithm with a binary heap.

Shortest Path Problem in DAG

Besides eliminating negative cycles by requiring non-negative edge weights, a weighted digraph which is a DAG has no cycle, and thus no negative cycle. Is there possibly any specialty in the shortest path problem in a DAG?

The answer is yes, and it is simple: we just have to exploit a topological sort of the DAG:

Algorithm (The SSSP Algorithm of DAG)
The input is a DAG $G$ with weight $w$ and the source vertex $s$; the output is the classical result of $(\operatorname{prev}, \operatorname{dist})$ in SSSP, as is described here:

  1. Perform classical initialization of $(\operatorname{prev}, \operatorname{dist})$ in SSSP.
  2. Perform a topological sort of the vertices reachable from $s$, for which one pass of DFS at $s$ suffices. The topological sort is stored in the list $L$.
  3. Iterate over every $u \in L$ in topological order:
    1. For every $(u, v) \in E$:
      1. If $\operatorname{dist}(u) + w(u, v) < \operatorname{dist}(v)$:
        1. Update $\operatorname{dist}(v) \leftarrow \operatorname{dist}(u) + w(u, v)$, $\operatorname{prev}(v) \leftarrow u$.
  4. Halt with classical result of $(\operatorname{prev}, \operatorname{dist})$ in SSSP.
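The steps above can be sketched in Python as follows (a recursive DFS stands in for the topological sort of step 2):

```python
INF = float("inf")

def dag_sssp(adj, s):
    """SSSP in a DAG given as adj: u -> list of (v, w) pairs.

    Topologically sorts the vertices reachable from s by one DFS pass,
    then relaxes each vertex's out-edges in topological order.
    Returns (prev, dist) as dictionaries.
    """
    order, visited = [], set()

    def dfs(u):  # an iterative DFS would avoid recursion limits on large inputs
        visited.add(u)
        for v, _ in adj[u]:
            if v not in visited:
                dfs(v)
        order.append(u)  # post-order; reversed, it is a topological sort

    dfs(s)
    order.reverse()

    dist = {u: INF for u in adj}
    prev = {u: None for u in adj}
    dist[s] = 0
    for u in order:
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
    return prev, dist
```

Since every vertex is relaxed exactly once, negative edge weights pose no difficulty here, unlike in Dijkstra's algorithm.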

The proof is quite obvious to some extent:

Theorem
The SSSP algorithm of DAG yields the correct answer.

Proof. Since a DAG is acyclic, it has no negative cycle, and for every $v$ reachable from $s$, the shortest $sv$-path exists. We prove by mathematical induction along $L$ that, having found the shortest $su$-path for every $u$ preceding $v$ in $L$, the shortest $sv$-path must be found by the time $v$ is visited.

Note that the first item of $L$ is $s$, and the shortest $ss$-path has been found to be the empty walk. For $v \ne s$, consider any shortest $sv$-path $P$ (with weight $\operatorname{dist}_w(s, v)$) to be found, ending with the edge $(u, v)$. Obviously $u$ precedes $v$ in $L$, thus the $su$-prefix of $P$ is a shortest $su$-path and has been found prior to visiting $v$, by the induction hypothesis. Then when the algorithm visits $u$, it will iterate over $(u, v) \in E$; the edge $(u, v)$ will be concatenated to the shortest $su$-path, tested among all temporarily shortest $sv$-paths, and used to update $\operatorname{dist}(v), \operatorname{prev}(v)$ if applicable. Thus $\operatorname{dist}(v) = \operatorname{dist}_w(s, v)$, and the shortest $sv$-path must be found.

The $\operatorname{prev}$ won't contain a cycle, since any walk recovered from $\operatorname{prev}$ must be found in the weighted digraph, while a DAG doesn't contain a cycle.

By carrying the mathematical induction along every item of $L$ sequentially, we prove that the SSSP algorithm of DAG yields the correct answer.

Theorem
The SSSP algorithm of DAG runs in time complexity $O(n + m)$ and space complexity $O(n)$.

Proof. The topological sort takes time complexity $O(n + m)$. The list $L$ has at most $n$ vertices, with at most $m$ edges incident to them, thus the iterations of step 3 take time complexity $O(n + m)$. In this way, the eventual time complexity is $O(n + m)$.

The topological sort takes space complexity $O(n)$. The mappings of $\operatorname{prev}, \operatorname{dist}$ take space complexity $O(n)$. Thus the eventual space complexity is $O(n)$.

So the SSSP problem of DAG is of the kind that can be efficiently solved in linear time, proportional to the numbers of vertices and edges.

Interestingly, we might have been using the SSSP algorithm of DAG when solving some problems, implicitly and without noticing:

Example

Starting with $x = 1$, every time we may perform one of the following operations:

  1. Increment $x$ by $1$ ($x \leftarrow x + 1$).
  2. Multiply $x$ by $2$ ($x \leftarrow 2x$).

Find the minimum number of operations required to turn $x$ into $n$.

This is a classical problem solvable by dynamic programming:

  1. Initialize list $f$ of size $n + 1$, filled with $+\infty$ initially.
  2. Set $f(1) = 0$.
  3. Iterate $i$ from $2$ to $n$:
    1. If $f(i - 1) + 1 < f(i)$, update $f(i) \leftarrow f(i - 1) + 1$.
    2. If $2 \mid i$ and $f(i / 2) + 1 < f(i)$, update $f(i) \leftarrow f(i / 2) + 1$.
  4. Halt with $f(n)$.

The content of list $f$ is shown in the figure on the left above. Every value $f(i)$ is obtained by adding $1$ to $f(i - 1)$, or to $f(i / 2)$ if $2 \mid i$, and we just need to find which way requires fewer operations. This can be easily extracted to the digraph with edge weight $1$ on the right. Clearly finding the minimum number of operations to reach $n$ is equivalent to finding the shortest
$1n$-path in the extracted digraph. Since we have no operation to decrease the value of $x$, every edge goes from a smaller number to a larger one, thus the digraph is a DAG, and $1, 2, \ldots, n$ is sufficiently a topological sort of it.
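The dynamic programming above can be sketched as:

```python
def min_operations(n):
    """Minimum number of +1 / x2 operations to turn 1 into n.

    f[i] is the weighted distance from 1 to i in the implicit DAG whose
    edges (i - 1, i) and (i / 2, i) all carry weight 1.
    """
    INF = float("inf")
    f = [INF] * (n + 1)
    f[1] = 0
    for i in range(2, n + 1):
        f[i] = f[i - 1] + 1                   # edge (i - 1, i): increment
        if i % 2 == 0:
            f[i] = min(f[i], f[i // 2] + 1)   # edge (i / 2, i): doubling
    return f[n]
```

Iterating $i$ from $2$ to $n$ is exactly relaxing the out-edges of each vertex in topological order, only phrased backwards: each $f(i)$ pulls from its in-neighbours, which are already final.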

Although it's generally uncommon to think in this way when solving such kind of problems, it helps in understanding the connection between thoughts from different fields, and pulling things together is vital to studying mathematics.

Matrix Exponentiation Based APSP Algorithm

Recall that we've claimed that naively running SSSP from every vertex does not yield the most efficient way of solving the APSP. In fact, even the naive way can be further improved, by exploiting algebraic properties.

To apply the APSP algorithm, we first preprocess the input digraph so that there's no multiple edge, and thus we may let $m \le n^2$. In addition to that, we may preprocess the weighted digraph so that it's loopless for further processing: we scan every loop $(v, v) \in E$; if $w(v, v) \ge 0$, the loop will not appear in any shortest path; if $w(v, v) < 0$, such a loop is a negative cycle, and we may signal the caller with "negative cycle detected" right away.

Number the vertices of digraph $G$ as $v_1, v_2, \ldots, v_n$, and define the weighted adjacency matrix $D$ as an $n \times n$ matrix such that $D_{ii} = 0$, $D_{ij} = w(v_i, v_j)$ if $(v_i, v_j) \in E$, and $D_{ij} = +\infty$ otherwise. Fasten tight, we are about to unleash some algebraic power of it.

For convenience, let's focus only on the weights first. It's easy to verify that the weighted adjacency matrix $D$ stores the weights of the temporarily shortest $v_i v_j$-paths of length at most $1$. If there's another matrix $D^{(k)}$ storing the weights of the temporarily shortest $v_i v_j$-paths of length at most $k$, how can we obtain the weights of the temporarily shortest $v_i v_j$-paths of length at most $k + 1$?

Every temporarily shortest $v_i v_j$-path of length at most $k + 1$ is either a $v_i v_j$-path of length at most $k$, or constructed from a $v_i v_l$-path of length at most $k$ for some $l$ concatenating the edge $(v_l, v_j)$. And our way of defining $D$ turns out to be very clever:

  1. For every temporarily shortest $v_i v_j$-path of length at most $k$, there's $D^{(k)}_{ij} + D_{jj} = D^{(k)}_{ij}$, so the first case is covered by the term $l = j$.
  2. The weight of the walk constructed by concatenating $(v_l, v_j)$ to the temporarily shortest $v_i v_l$-path can be obtained by $D^{(k)}_{il} + D_{lj}$.
  3. If $(v_l, v_j) \notin E$, then given that $D_{lj} = +\infty$, we have $D^{(k)}_{il} + D_{lj} = +\infty$.
  4. If there's no temporarily shortest $v_i v_l$-path, then given that $D^{(k)}_{il} = +\infty$, we have $D^{(k)}_{il} + D_{lj} = +\infty$.

In this way, the weight of the temporarily shortest $v_i v_j$-path of length at most $k + 1$ can always be obtained by $D^{(k+1)}_{ij} = \min_{1 \le l \le n} \left( D^{(k)}_{il} + D_{lj} \right)$. We will show the specialty of this expression.

Recall that a ring is an additive abelian group equipped with a multiplicative semigroup / monoid, such that the multiplication is distributive over the addition, and the additive identity is the multiplicative zero. If we allow the addition to be defined over a commutative semigroup / monoid, in which some elements do not have additive inverses, we obtain a semiring. And we would like to show:

Theorem
The algebraic structure $(\mathbb{R} \cup \{+\infty\}, \min, +)$ is a commutative semiring with zero $+\infty$ and unity $0$.

Proof. We will need to prove that $(\mathbb{R} \cup \{+\infty\}, \min)$ is the additive commutative monoid with identity $+\infty$, that $(\mathbb{R} \cup \{+\infty\}, +)$ is the multiplicative commutative monoid with identity $0$, that $+$ is distributive over $\min$, and that $+\infty$ is the multiplicative zero.

To prove $(\mathbb{R} \cup \{+\infty\}, \min)$ is a commutative monoid:

  1. Obviously it's closed.
  2. $\min(\min(a, b), c) = \min(a, \min(b, c))$, thus it's associative.
  3. $\min(a, b) = \min(b, a)$, thus it's commutative.
  4. $\min(a, +\infty) = a$, thus $+\infty$ is the identity of this monoid.

And for $a \ne +\infty$, there's no element $b$ such that $\min(a, b) = +\infty$, thus the monoid is generally not invertible.

Clearly $(\mathbb{R} \cup \{+\infty\}, +)$ is obtained by adjoining $+\infty$ to $(\mathbb{R}, +)$, while $(\mathbb{R}, +)$ has already been an abelian group; we just need to check how $+\infty$ affects the operations. To prove $(\mathbb{R} \cup \{+\infty\}, +)$ is a commutative monoid:

  1. $a + (+\infty) = (+\infty) + a = +\infty$, so it's closed and commutative, $0$ remains the identity of this monoid, and there's no inverse for $+\infty$.
  2. $(a + b) + c = a + (b + c)$ still holds when some of $a, b, c$ take $+\infty$ (both sides are $+\infty$), thus it's associative.

Since there's $a + \min(b, c) = \min(a + b, a + c)$, the $+$ is distributive over $\min$. And since $a + (+\infty) = +\infty$, $+\infty$ is the multiplicative zero, thus the zero of the semiring.

The semiring $(\mathbb{R} \cup \{+\infty\}, \min, +)$ is called the tropical semiring by mathematicians.

In this way, let $a \oplus b = \min(a, b)$ and $a \otimes b = a + b$ to indicate their roles in the semiring more intuitively; soon we have $D^{(k+1)}_{ij} = \bigoplus_{l=1}^{n} D^{(k)}_{il} \otimes D_{lj}$.

What does it look most similar to? Yes, matrix multiplication. So the matrix $D^k$ stores the weights of the temporarily shortest $v_i v_j$-paths of length at most $k$, when viewing $D$ as a matrix over the tropical semiring. And recall that $D$ itself stores the weights of all temporarily shortest paths of length at most $1$; it won't be hard to see:

Theorem
When viewing $D$ as a matrix over the tropical semiring, the matrix $D^k$ stores the weights of all temporarily shortest paths of length at most $k$.
Proof. By mathematical induction with $D^1 = D$ as the base step, and $D^{k+1} = D^k \otimes D$ as the induction step, we are done with the proof.

Please notice the proof does not require the multiplication of matrices over the tropical semiring to be associative: when there's no additional parenthesis, the natural order of computing $D^{k+1} = D^k D$ is from left to right, which means we first compute $D^k$ and then multiply by $D$, while $D^k$ is exactly what is stored (and computed first) in the induction steps. So there's no algebraic issue here.

In this way, a matrix exponentiation based APSP algorithm might work as follows: first compute $D^{n-1}$, and then check whether $D^n = D^{n-1}$ to see if there's a negative cycle. What about the complexity? For each cell, we have to do $n$ additions and an $n$-wise minimum, and there're $n^2$ cells to go. And we need to do $O(n)$ matrix multiplications, so there're $O(n^4)$ operations to go.
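One multiplication step of the scheme above — computing $D^{(k+1)}$ from $D^{(k)}$ and $D$ — can be sketched as a min-plus product (an unoptimized sketch; $+\infty$ is represented by float infinity):

```python
INF = float("inf")

def min_plus_product(A, B):
    """C[i][j] = min over l of A[i][l] + B[l][j], the product over the
    tropical semiring (min as addition, + as multiplication)."""
    n = len(A)
    C = [[INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for l in range(n):
                if A[i][l] + B[l][j] < C[i][j]:
                    C[i][j] = A[i][l] + B[l][j]
    return C
```

The triple loop is literally ordinary matrix multiplication with `+` replaced by `min` and `*` replaced by `+`, costing $O(n^3)$ per product.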

By now, you must be super furious, since it's nothing better than the naive way of running Bellman-Ford from every vertex, which also runs in $O(n^4)$, but without having to deal with this dizzying and useless algebraic garbage.

But working towards matrix exponentiation is hopeful, since matrix exponentiation (over fields) has a famous optimization technique called fast matrix exponentiation, based on the concept of exponentiation by squaring: if we are to compute $A^k$ for a square matrix $A$, we may first factorize $k$ in binary as $k = \sum_{i} b_i 2^i$ with $b_i \in \{0, 1\}$, and then $A^k = \prod_{i : b_i = 1} A^{2^i}$.

Since $A^{2^0} = A$ and $A^{2^{i+1}} = (A^{2^i})^2$, starting with $R = I$, we loop $i$ from $0$ to $\lfloor \log_2 k \rfloor$, and in each iteration, we check whether $b_i = 1$ to decide whether to multiply $A^{2^i}$ into $R$, and then compute $A^{2^{i+1}}$ from squaring $A^{2^i}$. The result of $A^k$ is stored in $R$ after the loop. Doing it in such a way, we just need to do $O(\log k)$ multiplications into $R$, and $O(\log k)$ squarings (if we cleverly elide the last squaring by checking whether $i = \lfloor \log_2 k \rfloor$). In this way, there're totally $O(n^3 \log k)$ operations to go (without Strassen's algorithm), and since $k \le n$, the eventual time complexity is $O(n^3 \log n)$.

The fast matrix exponentiation requires the associativity of matrix multiplication (over fields): we need to partition the $k$ multiplicands into groups of sizes $2^i$; and when computing $A^{2^{i+1}}$ from squaring $A^{2^i}$, we need to group the first $2^i$ and the last $2^i$ multiplicands.

So if we want to deploy fast matrix exponentiation on matrices over the tropical semiring, their matrix multiplication needs to be associative. Luckily, the multiplication of matrices over any semiring is associative:

Theorem
Let $S$ be a semiring; then the multiplication of matrices over $S$ is associative. That is, for matrices $A, B, C$ of compatible dimensions, there's $(AB)C = A(BC)$.

Proof. First, consider matrices over a field; clearly associativity holds for them, which means $((AB)C)_{ij} = \sum_{l} \sum_{k} A_{ik} B_{kl} C_{lj} = (A(BC))_{ij}$.

The reasoning behind this identity involves only addition and multiplication, in which the distributive law is required for expanding and contracting the products of sums, and the commutative law of addition is required for swapping the summation order.

In this way, the reasoning can even take place in a semiring. Thus the multiplication of matrices over a semiring is also associative.

In this way, the fast matrix exponentiation is applicable to the matrices over the tropical semiring, and the matrix exponentiation part of our APSP algorithm can run in time complexity $O(n^3 \log n)$. Please notice the divide-and-conquer based Strassen's algorithm, which runs in time complexity $O(n^{\log_2 7}) \approx O(n^{2.807})$ per matrix multiplication, is not applicable, due to lacking additive inverses / subtraction for some semiring elements.
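As an illustration (hypothetical Python; the toy matrix `W` and the function name `min_plus_mul` are ours), the tropical matrix multiplication and its associativity can be checked directly:

```python
INF = float('inf')

def min_plus_mul(A, B):
    """Multiply two square matrices over the tropical (min, +) semiring:
    C[i][j] = min over k of A[i][k] + B[k][j]."""
    n = len(A)
    C = [[INF] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if A[i][k] == INF:      # additive identity: contributes nothing
                continue
            for j in range(n):
                if A[i][k] + B[k][j] < C[i][j]:
                    C[i][j] = A[i][k] + B[k][j]
    return C

# Associativity holds over the semiring, so squaring is legal:
W = [[0, 3, INF],
     [INF, 0, 1],
     [2, INF, 0]]   # a toy weight matrix with 0 on the diagonal
AB_C = min_plus_mul(min_plus_mul(W, W), W)
A_BC = min_plus_mul(W, min_plus_mul(W, W))
assert AB_C == A_BC
print(AB_C[0][2])   # shortest 0→2 weight using at most 3 edges: 3 + 1 = 4
```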

Now all that remains to do is to show that we are also able to handle the update of previous vertices efficiently, especially to show it's compatible with the process of squaring the matrices:

Lemma
Let be the temporarily shortest -path of length , then for any , must be the concatenation of a temporarily shortest -path of length and a temporarily shortest -path of length , for some .

Proof. It's clear that must be the temporarily shortest -path of length , since otherwise we may replace it with the shorter one, obtaining a shorter -path of length , which is a contradiction.

When , we just need to argue that : if there were any shorter -path of length , then would be shorter than and of length , which is a contradiction. In fact, must not contain a negative loop if is to hold.

When , we may split into such that . The segment is the temporarily shortest -path of length , otherwise we can replace the segment with the shorter , so that is shorter than , while of length . For the same reason, the segment must also be the temporarily shortest -path of length , otherwise we may replace the segment with the shorter , so that is shorter than , while of length .

Corollary
The temporarily shortest paths of length can be constructed from the temporarily shortest paths of length and the temporarily shortest paths of length .
Proof. Since any temporarily shortest -path of length is the concatenation of a segment of temporarily shortest -path of length and a segment of temporarily shortest -path of length , we will encounter it by doing a concatenating-and-picking-up-minimum loop for all .

Let store the weights of temporarily shortest paths of length , store the previous vertices of each temporarily shortest path of length , and store the actual length of each temporarily shortest path of length . In this way, we may compute from and , by multiplying and concatenating the shortest paths stored in with , synchronizing the lengths in :

Algorithm (Weight-Path-Length Matrix Multiplication)
The input is and , the output is :

  1. Initialize as matrix of all , and as matrix of all .
  2. Loop for from to :
    1. Loop for from to :
      1. Loop for from to :
        1. Compute
        2. If or :
          1. Update .
          2. If ,
            then update ;
            otherwise update .
  3. Halt with .
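A sketch of the algorithm above (hypothetical Python; we name the weight, previous-vertex and length matrices `D`, `P` and `L`, which the text leaves unnamed):

```python
INF = float('inf')

def wpl_mul(D1, P1, L1, D2, P2, L2):
    """Weight-Path-Length matrix multiplication (a sketch).

    Combines temporarily shortest paths: over all midpoints k, concatenate
    an i→k path from the first operand with a k→j path from the second,
    preferring smaller weight, then smaller edge count.
    """
    n = len(D1)
    D = [[INF] * n for _ in range(n)]
    P = [[None] * n for _ in range(n)]
    L = [[INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if D1[i][k] == INF or D2[k][j] == INF:
                    continue
                w = D1[i][k] + D2[k][j]
                l = L1[i][k] + L2[k][j]
                if w < D[i][j] or (w == D[i][j] and l < L[i][j]):
                    D[i][j], L[i][j] = w, l
                    # previous vertex of j: taken from the k→j segment,
                    # unless that segment is empty (k == j), in which case
                    # it comes from the i→k segment
                    P[i][j] = P1[i][k] if k == j else P2[k][j]
    return D, P, L
```

Squaring a base triple (edges as length-1 paths, zero-weight length-0 paths on the diagonal) with `wpl_mul` then yields the temporarily shortest paths of doubled length.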

Here, one might be confused about the meaning of . Although stores the weights of all shortest paths of length , it's not sufficient to use to reconstruct a temporarily shortest path of length . For if stores the previous vertex of in the temporarily shortest -path of length , then stores the previous vertex of in the temporarily shortest -path of length . And if the temporarily shortest -path is reconstructed into a path of length , then the temporarily shortest -path reconstructed is of length .

Indeed, storing the previous vertex of temporarily shortest paths of length does not suffice to reconstruct the underlying path. However, we know when there's no negative cycle, it suffices to find temporarily shortest paths of length , and we just need to ensure we are able to reconstruct the underlying path then. In fact, we have:

Lemma
In weighted digraph with no negative cycle, let the shortest paths be represented by for , if all shortest paths are the shortest in length, then tracking back previous vertices in reconstructs into paths.

Proof. Assume the contrary, if it fails when recovering the shortest -path , circulating between , then there're a shortest -path in the form of and a shortest -path in the form of .

By the Fundamental Theorem of Shortest Path Problem, the shortest paths exist when there's no negative cycle in the digraph, and is also a shortest -walk such that , since is the shortest in length; is also a shortest -path such that , since is the shortest in length. Since have distinct endpoints, , and thus , which is a contradiction.

Theorem
In weighted digraph with no negative cycle, for the input temporarily shortest paths of length and of length , such that all walks are paths and the shortest in length, the Weight-Path-Length Matrix Multiplication combines them into output temporarily shortest paths of length that are paths and the shortest in length.

Proof. Assume the contrary, that the algorithm fails to create paths; then it must combine -path and -path into the temporarily shortest -path as output. Consider the weight of the closed walk : if , then is shorter than it; if , then the digraph contains a negative cycle. So the only remaining case is that .

We argue that for temporarily shortest -path of length as input, and the temporarily shortest -path of length as input, there must be : Assume the contrary that , then given that , must be the temporarily shortest -path found instead of . The analogous argument is applicable to show .

So the only possibility is that , however since the inputs are paths shortest in length, we have and thus there's instead, which is a contradiction. In this way, can't be the output of the algorithm. The concatenation of may also contain a closed walk, however we may reason recursively on this new shortest -path then, and given that , the recursion can't last forever since a descending chain of positive integers is finite.

Now the output of the algorithm must be a path, but we still need to show it's the shortest in length. Let the output temporarily shortest -path of length be . By this lemma, can be split into the shortest -path of length and the shortest -path of length . Let the shortest -path of length specified by input be , and the shortest -path specified by input be , clearly we must have , and the Weight-Path-Length algorithm must scan for creating the shortest -path. So the shortest -path produced by the algorithm must fulfill , that is, is a path and the shortest in length, and we are done with the proof.

One can easily find the length matrix plays an indispensable role in ensuring the path found to be the shortest in length.

In this way, we arrive at the following algorithm in its ultimate form:

Algorithm (Matrix Exponentiation Based APSP Algorithm)
The input is the digraph with weight , either the algorithm halts with "negative cycle detected", or it outputs the classical result of in APSP, as is described here:

  1. Preprocess the digraph to preserve the edges of minimum weights. Then scan every loop, and halt with "negative cycle detected" if the loop has negative weight, otherwise remove the loop.
  2. Initialize matrices:
  3. Initialize counter .
  4. Loop until :
    1. Perform Weight-Path-Length multiplication on and , to produce .
    2. Update .
  5. Test whether , if not so, halt with "negative cycle detected".
  6. Halt with and
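
A weights-only sketch of the squaring loop (hypothetical Python; it omits the previous-vertex and length matrices for brevity, and detects a negative cycle via the stabilization test of step 5):

```python
INF = float('inf')

def min_plus_mul(A, B):
    """(min, +) product of two square matrices."""
    n = len(A)
    C = [[INF] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if A[i][k] == INF:
                continue
            for j in range(n):
                c = A[i][k] + B[k][j]
                if c < C[i][j]:
                    C[i][j] = c
    return C

def apsp_by_squaring(W):
    """Matrix-squaring APSP (weights only): square the weight matrix until
    the covered path length reaches n - 1, then do one extra multiplication;
    any strict improvement at that point witnesses a negative cycle."""
    n = len(W)
    D = [row[:] for row in W]
    for i in range(n):
        D[i][i] = min(D[i][i], 0)   # length-0 walks: staying in place is free
    length = 1
    while length < n - 1:           # D covers paths of length <= `length`
        D = min_plus_mul(D, D)
        length *= 2
    if min_plus_mul(D, D) != D:     # not stabilized => negative cycle
        return None
    return D
```
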
Theorem
The Matrix Exponentiation Based APSP algorithm yields the correct answer.

Proof. It won't be hard to find that in the digraph that is preprocessed and with loops removed, every edge is the shortest -path of length , and no -walk can be shorter. In this way, represents the shortest paths of length such that all walks are paths and the shortest in length.

Starting with , each time, the Weight-Path-Length matrix multiplication combines and , representing temporarily shortest paths of length , into , representing temporarily shortest paths of length , until . In this way, if there's no negative cycle in the digraph, every tuple stores the temporarily shortest paths of length such that all walks are paths and the shortest in length.

By the Fundamental Theorem of Shortest Path Problem, we know if there's no negative cycle, all shortest paths must be paths, and the tuple must stabilize after ; conversely, if there's a negative cycle, the tuple won't stabilize even after , so testing whether for some should be equivalent to testing whether . However, the common fast matrix exponentiation, which tests each bit in the binary decomposition of to evaluate precisely, takes up to multiplications, while evaluating for some takes only multiplications, which is preferred in our case.

The Matrix Exponentiation Based APSP algorithm, which runs in time complexity $O(n^3 \log n)$, is not the fastest algorithm for solving the APSP problem. However, it reveals the suboptimality of naively looming SSSP algorithms over vertices by surpassing it in complexity, and sheds light on discovering faster APSP algorithms. The semiring algebra beneath it is also interesting nevertheless.

Floyd-Warshall APSP Algorithm

How do we ensure that an APSP algorithm (that does not work in the naive way) would work? Recall this lemma: we managed to derive a necessary condition, that a temporarily shortest path of length must be constructed from a temporarily shortest path of length and one of length , and then we designed an algorithm to find it sufficiently.

Interestingly, constructing the temporarily shortest path of length from temporarily shortest paths of length and is not the only way of solving the APSP problem. By changing the way of constructing temporarily shortest paths, we may solve the APSP problem more efficiently.

For convenience, we may assume the digraph has no negative cycle for now, and we will handle the case with negative cycles later.

Given that the vertices and edges are finite in a digraph, we may number the vertices in the digraph as . Let , a temporarily shortest -path with internal vertices from is a temporarily shortest -path such that . Clearly by this definition, a shortest -path is a temporarily shortest -path with internal vertices from , but:

Lemma
In weighted digraph with no negative cycle, for the temporarily shortest -paths with internal vertices from , at least one of them is a path.
Proof. Assume it's not a path; then it contains a closed walk, which has non-negative weight since there's no negative cycle. If the closed walk has zero weight, we may remove it and obtain a path of the same weight; if positive, removing it yields a strictly shorter walk, contradicting that it's the shortest in weight.
Lemma
In weighted digraph with no negative cycle, for a temporarily shortest -path with internal vertices from that is a path, it's:

  1. A temporarily shortest -path with internal vertices from .
  2. Constructed from temporarily shortest -path and shortest -path, both are paths and with internal vertices from .
Proof. For a temporarily shortest -path with internal vertices from , if , then such a path is also a temporarily shortest -path with internal vertices from ; if , then . Since is a path, is not an internal vertex of , thus are paths and temporarily shortest paths with internal vertices from .

Theorem

In weighted digraph with no negative cycle, from the temporarily shortest -path , -path and -path with internal vertices from which are paths, by letting:

Then we obtain a temporarily shortest -path with internal vertices from that is a path.

Proof. We just need to show the concatenation is a -path when . Assume the contrary that is not a path, let and be concatenated. Now consider the closed walk , it must have zero weight, so has the same weight as but . Since is the temporarily shortest path with internal vertices from , , which is a contradiction to .

In this way, we obtain an APSP algorithm that can find shortest paths in a digraph with no negative cycle:

Algorithm (Floyd-Warshall without Negative Cycle Handling)
The input is the digraph such that , with weight and no negative cycle, the output is the classical result of in APSP, as is described here:

  1. Perform classical initialization of in APSP.
  2. Loop for to :
    1. Loop for to :
      1. Update .
  3. Loop for to :
    1. Loop for to :
      1. Loop for to :
        1. If , update .
  4. Halt with classical result of in APSP.
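The algorithm can be sketched as follows (hypothetical Python; edge-list input, with steps 1 and 2 folded into the initialization):

```python
INF = float('inf')

def floyd_warshall(n, edges):
    """Floyd-Warshall without negative cycle handling (a sketch).

    `edges` is a list of (u, v, w) triples with vertices in 0..n-1;
    parallel edges are collapsed to the minimum weight and loops dropped.
    Returns (dist, prev): dist[u][v] is the shortest-path weight and
    prev[u][v] the previous vertex of v on a shortest u→v path.
    """
    dist = [[INF] * n for _ in range(n)]
    prev = [[None] * n for _ in range(n)]
    for u in range(n):
        dist[u][u] = 0
    for u, v, w in edges:
        if u != v and w < dist[u][v]:   # keep the minimum-weight edge only
            dist[u][v] = w
            prev[u][v] = u
    for k in range(n):                  # allow k as an internal vertex
        for i in range(n):
            if dist[i][k] == INF:
                continue
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    prev[i][j] = prev[k][j]  # the path now ends like k→j
    return dist, prev
```

Tracking `prev[u][·]` back from the target to `u` then recovers a shortest path, as the lemmas below justify.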

To show that tracking previous vertices in reconstructs into paths, we would like to show that after every -th iteration at step 3, the recovers into paths. This will require:

Lemma
After the -th iteration at step 3, if the -path is set to the temporarily shortest -path with internal vertices from , then the temporarily shortest -path with internal vertices from must also be updated.

Proof. Let replace the temporarily shortest -path with internal vertices from after the -th iteration; this will require . Assume the contrary that the temporarily shortest -path with internal vertices from is also the temporarily shortest path with internal vertices from . Now we have .

On the other hand, the path does not contain as internal vertex, so its internal vertices are from , and thus . Now we have:

Which is a contradiction. So there must be sufficiently , and the temporarily shortest -path with internal vertices from must be updated after this iteration.

In the lemma above, we conservatively assume that "some path" will be set to the temporarily shortest -path, not necessarily . However, we still have:

Lemma
If before the -th iteration, recovers into paths, that is, the temporarily shortest -path has a temporarily shortest -path segment as prefix, then after the -th iteration, still recovers into paths.

Proof. Let temporarily shortest -path with internal vertices from be found after the -th iteration.

Note that by tracking back , starting with and ending at , we recover the -path segment , by the presumption. And due to the previous lemma, every is updated after the -th iteration, and the only way to update it is by concatenating with the temporarily shortest -path with internal vertices from , which are -path segments / prefixes of , so is set to for .

On the other hand, before the -th iteration, by tracking back , starting with and ending at , we recover the path . We argue that for remains unchanged after the -th iteration. Otherwise such a temporarily shortest -path must be constructed by concatenating -path and some -path , while has already been in , which contradicts the fact we've proved earlier that every temporarily shortest path constructed must be a path.

In this way, if the temporarily shortest -path with internal vertices from is found after the -th iteration, by tracking back , starting with and ending at , we recover into exactly this shortest path.

Obviously after step 2, the initial configuration of stores only paths made up of single edges, and thus can be recovered into paths; during the execution, the algorithm preserves the property per iteration of step 3, thus when the algorithm terminates, the yielded recovers into paths.

The reasoning of the lemma above can also be used to show all temporarily shortest -paths and -paths remain unchanged in the -th iteration of step 3, thus updating and in place is safe.

In this way, the Floyd-Warshall APSP algorithm is correct when there's no negative cycle in the digraph, and it turns out to be super efficient when the digraph is dense:

Theorem
The Floyd-Warshall APSP algorithm without negative cycle handling runs in time complexity $O(n^3)$ and space complexity $O(n^2)$.

Proof. The initialization at step 1 takes time complexity $O(n^2)$, step 2 takes time complexity $O(n^2)$, and step 3 takes time complexity $O(n^3)$. So the eventual time complexity accumulates to $O(n^3)$.

During the execution, only the mappings and are required, and they take $O(n^2)$ space. So the eventual space complexity accumulates to $O(n^2)$.

Now let's talk about handling negative cycles in the Floyd-Warshall APSP algorithm. There're many ways to accomplish this task within , and we would like to introduce one of the simplest ways:

Algorithm (General Negative Cycle Detection in APSP)
The input is the digraph such that , with weight , the output is whether the digraph contains negative cycle:

  1. Preprocess the digraph into such that and , while . That is, create a new pseudo vertex , and edges of zero weight to all other vertices in .
  2. Execute Bellman-Ford SSSP algorithm with digraph and source vertex as input. If the Bellman-Ford halts with "negative cycle detected", then halt with true; otherwise halt with false.
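A sketch of the detection (hypothetical Python; instead of materializing the pseudo vertex, we start every distance at zero, which has the same effect as relaxing its zero-weight edges once):

```python
def has_negative_cycle(n, edges):
    """General negative cycle detection: Bellman-Ford from a pseudo source
    connected to every vertex by a zero-weight edge.

    `edges` is a list of (u, v, w) triples with vertices in 0..n-1.
    """
    dist = [0] * n                      # distances after relaxing the pseudo edges
    for _ in range(n - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:                 # early exit: already stabilized
            break
    # one more pass: any further improvement witnesses a negative cycle
    return any(dist[u] + w < dist[v] for u, v, w in edges)
```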

The correctness of this method is obvious: for every negative cycle in digraph , the walk must be inspected by the Bellman-Ford algorithm; for every negative cycle detected by , is sufficiently a negative cycle to be found in ; adding pseudo vertex will not introduce a negative cycle, since it does not introduce any cycle in the first place.

The time complexity of this method is obviously , and the space complexity is obviously , so integrating such a detection into Floyd-Warshall APSP algorithm will not affect its overall time complexity and space complexity:

Algorithm (Floyd-Warshall)
The input is the digraph such that , with weight , either the algorithm halts with "negative cycle detected", or it outputs the classical result of in APSP, as is described here:

  1. Execute the General Negative Cycle Detection in APSP, if it detects a negative cycle, halt with "negative cycle detected".
  2. Given that the digraph contains no negative cycle now, just executing the Floyd-Warshall without Negative Cycle Handling will yield the correct answer.

We shall conclude the complete implementation of the Floyd-Warshall APSP algorithm in this text for now.

Johnson's APSP Algorithm

The Floyd-Warshall APSP algorithm works well for dense digraphs in time complexity $O(n^3)$; however, when the digraph is sparse with non-negative weights, by looming the Dijkstra's algorithm over every vertex, it runs in time complexity $O(nm \log n)$ with a (minimum) binary heap as the priority queue. When such a digraph is as sparse as $m = O(n)$, the time complexity can reach $O(n^2 \log n)$ and is faster than the Floyd-Warshall APSP algorithm.

So the problem remains: can the APSP problem of a general sparse digraph be solved faster than $O(n^3)$? The Johnson's APSP algorithm finds us a way: we may reweight the digraph so that all edges become non-negatively weighted, but without altering the shortest paths. In the reweighted digraph, the Dijkstra's algorithm is safe to go, and looming the Dijkstra's algorithm over every vertex should yield the APSP within $O(nm \log n)$, as is analyzed above.

First, we must be aware that naively reweighting the digraph, say by adding a uniform constant to every edge to lift the negative weights, will not work, since it will alter the shortest paths: a uniform increment penalizes each walk by its number of edges, so a shortest path with many light edges may lose to a heavier path with fewer edges.

So it's non-trivial to come up with a reweighting process that preserves the shortest paths.

Incredibly, the Johnson's algorithm tweaks the General Negative Cycle Detection in APSP into a reweighting process that may preserve the shortest paths:

Algorithm (Johnson's APSP Reweighting Process)
The input is the digraph such that , with weight , either the algorithm halts with "negative cycle detected", or it outputs the new weight :

  1. Preprocess the digraph into such that and , while . That is, create a new pseudo vertex , and edges of zero weight to all other vertices in .
  2. Execute Bellman-Ford SSSP algorithm with digraph and source vertex as input. If the Bellman-Ford halts with "negative cycle detected", we also halt the algorithm with this signal; otherwise the Bellman-Ford halts with the distances, we will let $h(v)$ be the distance from the pseudo vertex to $v$, and reweight as $w'(u, v) = w(u, v) + h(u) - h(v)$.
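
A sketch of the reweighting process (hypothetical Python; `h` holds the pseudo-source distances, computed as before without materializing the pseudo vertex):

```python
def johnson_reweight(n, edges):
    """Johnson's reweighting sketch: Bellman-Ford from a pseudo source gives
    potentials h, and each edge (u, v, w) is reweighted to w + h[u] - h[v].

    Returns (h, reweighted_edges), or None when a negative cycle is found.
    """
    h = [0] * n                         # pseudo-source distances
    for _ in range(n - 1):
        for u, v, w in edges:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    if any(h[u] + w < h[v] for u, v, w in edges):
        return None                     # negative cycle detected
    return h, [(u, v, w + h[u] - h[v]) for u, v, w in edges]
```

By the triangle inequality of the Bellman-Ford distances, every returned weight is non-negative, which the theorems below establish.
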
Theorem
The Johnson's APSP Reweighting Process preserves the shortest paths, that is, in the digraph with no negative cycle, for every shortest -path found in with weight , it must also be the shortest -path in with weight .

Proof. For every walk $P = v_0 v_1 \cdots v_k$, its weight under the new weight $w'$ is:

$$w'(P) = \sum_{i=1}^{k} \big( w(v_{i-1}, v_i) + h(v_{i-1}) - h(v_i) \big) = w(P) + h(v_0) - h(v_k)$$

The inner terms telescope, so the weight of every walk between the same endpoints shifts by the same constant $h(v_0) - h(v_k)$.

The reweighting process won't introduce a negative cycle: for any closed walk , let it pass any , then we have , so won't be a negative cycle in the digraph reweighted with , unless it has already been a negative cycle in the digraph with weight . In this way, if digraph with weight does not contain a negative cycle, then with weight must also not contain a negative cycle. This will guarantee the existence and well-formedness of the shortest paths.

When there's no negative cycle in the digraph, by the Fundamental Theorem of Shortest Path Problem, for every -path , there must be . This will imply , and thus is also the shortest -path with the minimum weight in the digraph reweighted with .

Finding a reweighting process that preserves the shortest paths is the first step; to apply the Dijkstra's SSSP algorithm safely, all edges must also be of non-negative weights. However, we have:

Theorem
The new weight produced by the Johnson's APSP Reweighting Process is non-negative on every edge, that is, $w'(u, v) = w(u, v) + h(u) - h(v) \ge 0$ for every edge $(u, v)$.

Proof. Let be the weighted distance metric in digraph , and be the weighted distance metric in digraph .

We would like to show when , that is, introducing pseudo vertex won't change the weighted distance between vertices in : If introducing changes the shortest -path into a less weighted one, the new shortest -path must be found in and not in , which means it must pass . However, there's no incoming edge to , which means no -path may pass when .

In the digraph weighted with , there must be , since otherwise the walk by concatenating the shortest -path and the edge is strictly shorter than the shortest -path in weight, which is a contradiction.

In this way, we have ; note that the left hand side is equal to , and thus .

Now all obstacles are clear, and we are ready to present the Johnson's APSP algorithm:

Algorithm (Johnson)
The input is the digraph such that , with weight , either the algorithm halts with "negative cycle detected", or it outputs the classical result of in APSP, as is described here:

  1. Execute the Johnson's APSP Reweighting Process, when the process halts with "negative cycle detected", we also halt the algorithm with this signal; otherwise we fetch the new weight .
  2. Perform classical initialization of in APSP.
  3. Loop for to :
    1. Execute the Dijkstra's SSSP algorithm in with weight and source vertex , fetch the outputs into .
    2. Loop for to :
      1. Assign .
  4. Halt with classical result of in APSP.
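The whole algorithm can be sketched as follows (hypothetical Python; Bellman-Ford potentials, then a binary-heap Dijkstra from every source, with each distance shifted back as in step 3.2):

```python
import heapq

INF = float('inf')

def johnson(n, edges):
    """Johnson's APSP sketch: reweight via Bellman-Ford potentials, run
    Dijkstra from every vertex on the non-negative weights, then shift
    each distance back by h[v] - h[u]. Returns the distance matrix, or
    None when a negative cycle is detected."""
    # Step 1: potentials from a pseudo source (zero-weight edge to all).
    h = [0] * n
    for _ in range(n - 1):
        for u, v, w in edges:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    if any(h[u] + w < h[v] for u, v, w in edges):
        return None
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))   # non-negative reweighting
    # Step 3: Dijkstra from every source on the reweighted digraph.
    dist = [[INF] * n for _ in range(n)]
    for s in range(n):
        d = [INF] * n
        d[s] = 0
        pq = [(0, s)]
        while pq:
            du, u = heapq.heappop(pq)
            if du > d[u]:                      # stale heap entry, skip
                continue
            for v, w in adj[u]:
                if du + w < d[v]:
                    d[v] = du + w
                    heapq.heappush(pq, (d[v], v))
        for v in range(n):
            if d[v] < INF:
                dist[s][v] = d[v] + h[v] - h[s]  # undo the reweighting
    return dist
```

Previous-vertex tracking is omitted here for brevity; attaching it to the Dijkstra relaxations recovers the full classical output, as step 3.2 describes.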

It won't be hard to see the Johnson's APSP algorithm is correct: at step 1, it either halts with "negative cycle detected", or the digraph with the new weight is a weighted digraph with non-negative weights, which is safe for executing Dijkstra's SSSP algorithm at step 3.1. On the other hand, given that the Dijkstra's SSSP algorithm has guaranteed that the s recover into paths, and the algorithm simply assigns to every with source vertex at step 3.2, also recovers into paths.

November 1, 2024