Let's define the
shortest path problem first:
Definition (Shortest Path Problem)
A weighted digraph $(G, w)$
is a digraph $G$ with weights
assigned to every edge. That is,
there's a mapping $w: E(G) \to \mathbb{R}$
such that $w(e)$ is the
weight of the edge $e$.
(For convenience, we shall let the
weights be real numbers. Doing so is
sufficient for most cases, and can
be easily imitated when dealing
with extreme cases.)
For a walk $W = v_0 e_1 v_1 \cdots e_k v_k$ in a weighted
digraph $(G, w)$,
the weight of the walk is
defined as
$w(W) = \sum_{i=1}^{k} w(e_i)$.
And
the shortest path problem between specified vertices $s, t \in V(G)$
is to find an
$s$-$t$ walk with minimum weight.
The shortest path problem can
also be defined in graphs with
weights assigned to edges.
However, given that we can
equivalently reduce the problem
to a digraph version, by
replacing every undirected edge
with two directed edges in
opposite directions and the
same weight, for convenience,
we shall narrow down the
discussion to digraphs.
So the shortest path problem is to
find a walk rather than a path,
despite what its name indicates. It's
unfortunate that mathematicians
use walk versus path, while
computer scientists use path
versus simple path, and the
term "shortest path problem"
becomes more widespread than the
"shortest walk problem". Anyway,
just a simple naming issue won't
undermine the well-formedness of
the discussion in this text, as
long as we've clarified that we
are to find.
(For convenience, in this text,
the shortest $s$-$t$ path
(in weight) means the $s$-$t$ walk
with minimum weight from now on.)
Obviously, finding the shortest
$s$-$t$ walk in length is equivalent
to finding the shortest $s$-$t$ path
in the weighted digraph with weight $1$
assigned to all edges uniformly.
However, one should not feel too
surprised when a shortest
-walk in length is not the
shortest -path:
Example
Consider finding the shortest
-path in the following
weighted digraph:
The shortest -path
is marked blue. Clearly it's
not the shortest in length.
It's also good to bear in mind
that the weight of the edge
can be negative.
So the shortest path problem is
a generalization of finding the
shortest walk in length in a
digraph, and is much more useful
since now more information about
the concrete problem is embedded
into the weighted graph, and
solvable by the tools we are
going to introduce in this text.
The plural form "problems" of
the title is not a careless typo
of an extra s-letter, but due
to our referring to the two real
world variation of the shortest
path problem:
- Single-Source Shortest Path (SSSP):
With a specified vertex
called the source vertex, find
all shortest -path in weight
for every vertex .
- All-Pairs Shortest Path (APSP):
Find all shortest -path in
weight for every pair of vertices
.
The SSSP problem is actually
not a true variation, since
the shortest path problem
between specified endpoints
can be shown to be
equivalent to the SSSP
problem with as the
source endpoint in general:
consider in the weighted
digraph of this problem, there's
.
Given that the value of
for edge is
arbitrary, our answer won't be
correct without finding the
shortest -path first.
Similarly, there's also
Single-Sink Shortest Path
problem, which is to find
shortest -path in
weight for every vertex
and a specified
vertex . However,
it's also as hard as the
shortest path problem
between specified endpoints
in general,
given that we may let all
other vertices in
be adjacent to .
For convenience, in this text,
we may assume there are no multiple edges:
when there's more than one
edge from $u$ to $v$, it suffices
to preprocess and keep only the edge
of minimum weight, without affecting
the result. In this way, there are
no more than $|V(G)|^2$ edges in $G$,
although $|E(G)|$ might be far smaller than
$|V(G)|^2$ when $G$ is sparse.
The APSP problem can be
naively solved by looping an
SSSP algorithm over every
vertex in $V(G)$. Let's be
dealing with
preprocessed
dense digraphs so that
$|E(G)| = \Theta(|V(G)|^2)$. The
Bellman-Ford algorithm,
which is an SSSP algorithm
for general digraphs, runs
in time complexity
$O(|V(G)| \cdot |E(G)|)$, and
thus the APSP algorithm based
on it runs in time complexity
$O(|V(G)|^2 \cdot |E(G)|) = O(|V(G)|^4)$. Dijkstra's
algorithm, which is an SSSP
algorithm for digraphs of
non-negative edge weights,
runs in time complexity
$O((|V(G)| + |E(G)|) \log |V(G)|)$
(using a binary heap), and
thus the APSP algorithm
based on it runs in time
complexity $O(|V(G)| \cdot (|V(G)| + |E(G)|) \log |V(G)|) = O(|V(G)|^3 \log |V(G)|)$.
However, the naive way mentioned
above is inefficient, and we will
show even the APSP problem of
a general dense digraph can be
solved within time complexity
$O(|V(G)|^3)$, by the
Floyd-Warshall algorithm.
This is even faster than
looping Dijkstra's
SSSP algorithm over every
source endpoint in $V(G)$, which
is applicable only to digraphs
with non-negative edge weights.
Fundamental Theorem of Shortest Path Problem
Before our delving into any
SSSP or APSP algorithm, it's
crucial to characterize the
shortest path problem first.
Think of the following example:
Tom is an accountant of an
international enterprise, which
is based in country and
has off-shore business in
countries . The main
responsibility of Tom is to
handle the currency exchange
between the four countries.
Consider a digraph
with vertices
,
whose underlying graph is
a complete graph over ,
such that
represents the exchange
rate from to
subtracting the
transaction fee per unit
of currency in . So
converting from to
and then back to is
lossy, and Tom needs to find
a way of exchange that
minimizes the loss.
The behaviour of currency
exchange is multiplicative:
For a -walk
in digraph , the value
of currency after the
exchange is
.
However, we may convert it
into a digraph with
weights
(mind the minus sign),
such that
.
and the value of currency
after the exchange is
. In this way,
the less the value of ,
the more value of currency is
preserved after the exchange.
Therefore, in order to
fulfill his responsibility,
Tom just needs to find a
shortest -path in .
Tom was depressed. One day,
when Tom was to do the
currency exchange between
country and , he
had trouble finding the
shortest -path in
the . The crucial
elements in digraphs
and are shown as
follows, with other
edges hidden:
First, Tom found a
temporary shortest -path
.
But soon he noticed,
in the cycle
,
there's:
And by circulating
after reaching in
, he obtains a shorter
-path
with less weight
.
So he realized such a
shortest -path
would never exist, as
he can always concatenate
the cycle to any
so-called shortest
-path, to make it
even shorter.
Tom shared his discovery
with his best friend Jerry.
"Stop making that frown
face, bro, this is going
to make big money.",
said Jerry, "The cycle
you found is a cash printer:
Every time you circulate
around the cycle, you boost
the currency value by
,
not to mention the growth
is measured in exponentation,
per circulation."
While the story of Tom is
fictional and too supernatural
to happen, it reveals a
peculiarity of the shortest
path problem in general: the
shortest $s$-$t$ path might
not exist even if an
$s$-$t$ walk exists.
In fact, for the existence
of the shortest -path,
we would like to prove the
Fundamental Theorem of
Shortest Path Problem:
Theorem (Fundamental Theorem of Shortest Path Problem)
The shortest $s$-$t$ path
in weight is either
an $s$-$t$ path, or
does not exist.
Proof. Consider there's
some vertex such
that and
, there's a
closed walk passing
such that . Consider
a -walk in the form of
,
by circulating when
passing , we obtain
a -walk
such that
.
In this way, we can control
the weight of a -walk
to be arbitrarily small, and
thus the shortest -path
shall never exist.
On the other hand, assume
for any such
that and
, every
closed walk passing
through fulfills
. Consider
any -walk in the form of
:
- By presumption, it's
impossible for
to be true.
- When ,
we may fix into
,
which is shorter in
length by eliminating
the segment of .
- When , we
may fix into
,
which is shorter in
weight due to
.
In this way, for any
-walk, its weight is
greater than or equal to a
-path it can be
fixed into. Thus, a shortest
-path is to be
found in .
Since is
a finite set, the shortest
-path must
exist in this case.
Conventionally, we call a
closed walk with
negative weight
a negative cycle. Thus
the Fundamental Theorem of
Shortest Path Problem can
be reworded as:
Corollary
In the shortest path problem
between :
- If there's negative
cycle between and ,
then there's no shortest
-path;
- If there's no negative
cycle between and ,
then the shortest
-path exists and
is a -path.
To say there's a negative cycle
between and , it means
there's some vertex
such that
,
and there's a negative cycle
passing through .
The Fundamental Theorem of
Shortest Path Problem gives
rise to the graph metric of
weighted distance:
Definition (Weighted Distance between Vertices)
In the weighted digraph
with weight
,
the
weighted distance
is a graph metric
between every pair of vertices
and ranged within
such that:
- if .
- ,
if there's no negative cycle between and .
- if there's
negative cycle between and .
The weighted distance
is almost the
weighted version of the
distance , except
for its capability of
taking when
there's negative cycle
between and .
The Fundamental Theorem of
Shortest Path Problem
characterizes how a correct
algorithm for solving the
shortest path problem should
behave. However, it's
costly and mostly unneeded
to enumerate all negative
cycles. So most algorithms
will yield a shortest
-path if it exists,
and otherwise just halt with
the signal
"negative cycle detected".
The following lemma is not
fundamental, but is obvious
and quite common in our proof:
Lemma
For any shortest -path
, any -walk segment
of is a shortest
-path.
Proof. Let
,
if there's negative cycle
between and then
it's a negative cycle
between and , which
contradicts that
is a shortest -path.
Thus the shortest
-path exists.
And if
,
then we can replace the
walk segment of in
with , so that
we obtain a -walk
with weight
,
which is even shorter than
and a contradiction
to being a
shortest -path.
In this way, it suffices
for an SSSP algorithm
accepting as
input to signal with
"negative cycle detected",
or to output a mapping
of previous vertices
and a weighted distance function
from . The mapping
fulfills:
if ; or
if there's a shortest
-path such that the
previous vertex of
is ,
since a shortest
-path must contain
the prefix of a shortest
-path.
The weighted function
fulfills
.
The case of an APSP algorithm
is analogous: it suffices
for the algorithm to signal
with negative cycle
detected, or output a
mapping of previous vertex
and the weighted distance
such that
.
The mapping
is to output the previous
vertex
of a shortest -path,
given that a shortest
-path must contain
a prefix of a shortest
-path.
We will see, in an SSSP
algorithm with source vertex
, it's reasonable
to initialize
,
and
.
In an APSP algorithm, it's
reasonable to initialize
and
.
Bellman-Ford SSSP Algorithm
The Bellman-Ford SSSP algorithm
might be the simplest SSSP
algorithm that can be prove to
be correct when applied to
general weighted digraphs. We
would like to present the
Bellman-Ford algorithm first:
Algorithm (Bellman-Ford)
The input is the digraph
with weight
and the source vertex
,
either the algorithm halts
with "negative cycle detected",
or it outputs the classical
result of
in SSSP, as is described
here:
- Perform classical initialization of
in SSSP.
- Loop for $|V(G)| - 1$ times:
- For every :
- If ,
then update
.
- For every :
- If ,
then halt with "negative cycle detected".
- Halt with classical result
of in SSSP.
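To make the procedure concrete, here is a minimal C++ sketch of the algorithm above. The edge-list representation, the INF sentinel, and the names dis / pre are assumptions of this sketch rather than anything fixed by the text:

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = LLONG_MAX / 4;  // sentinel meaning "no walk found yet"

struct Edge { int u, v; long long w; };

// Returns false if a negative cycle reachable from s is detected; otherwise
// fills dis (weights of the shortest s-v paths) and pre (previous vertices).
bool bellman_ford(int n, const vector<Edge>& edges, int s,
                  vector<long long>& dis, vector<int>& pre) {
    dis.assign(n, INF);
    pre.assign(n, -1);
    dis[s] = 0;
    for (int pass = 0; pass + 1 < n; ++pass)          // |V| - 1 passes of step 2
        for (const Edge& e : edges)
            if (dis[e.u] != INF && dis[e.u] + e.w < dis[e.v]) {
                dis[e.v] = dis[e.u] + e.w;            // relax the edge (u, v)
                pre[e.v] = e.u;
            }
    for (const Edge& e : edges)                       // step 3: stability check
        if (dis[e.u] != INF && dis[e.u] + e.w < dis[e.v])
            return false;                             // "negative cycle detected"
    return true;
}
```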
So the layout of the Bellman-Ford
algorithm is simple: initially
the shortest -path is the
empty walk (if there's no negative
cycle, which will be checked
later), and we perform
$|V(G)| - 1$ passes of construction of all
shortest -paths (as we will
need to prove), and finally we
check whether the
stabilizes, in order to confirm
whether there's a negative cycle
reachable from (we will also
need to prove).
Again, the order of visiting
depends on the algorithmic
representation of . Thus
might stabilize after an arbitrary
number of passes (let every
edge be of weight ):
For example, the Bellman-Ford
stabilizes right after the
first pass when visiting in
the order of
,
as is shown in the figure
on the left above.
However, it will take passes
for the Bellman-Ford to stabilize
when visiting in the order of
,
as is shown in the figure
on the right above. If the
Bellman-Ford algorithm is to
be correct, it must
be capable of handling
whichever permutation of .
To prove its correctness,
we must articulate what
each pass of step 2 of the
Bellman-Ford algorithm
actually finds. Ironically,
the would-be difficulty of
handling permutation of
sheds light on the problem.
Let's do a little observation:
Consider on the first pass,
no matter how we permute
, the
temporarily shortest paths
will always be found. This
is due to the fact that
the temporarily shortest
-path has
been found before the
first pass of step 2, and
will
always be tested regardless
of the permutation of .
Consider on the second pass,
again no matter how we
permute , the temporarily
shortest paths
will always be found, due to
the temporarily shortest paths
found after the first pass, and
that
will be checked regardless
of the permutation of .
In this way, it turns out
that the temporarily shortest
paths found at each pass are
what we can rely on: Since
obviously any walk segment of a
temporarily shortest path must
also be a temporarily shortest
path of its own length.
Conversely, we concatenate edges
to temporarily shortest paths,
to construct longer temporarily
shortest paths. In fact, by
some tweaks and accommodations,
eventually we can find the
following lemma to be true:
Lemma
The temporarily shortest
-path of length
(that is, the -walk
with minimum weight among
all -walks of length
) is found not
after the -th pass of
step 2 in the Bellman-Ford
SSSP algorithm with source
vertex .
Proof. Clearly it
can be proved by mathematical
induction. Initially, the
shortest -path of
length is only
, found at the
-th pass / right after
the step 1.
Let's prove the temporarily
shortest -path
of length is
found not after the
-th pass. Note that
must be a temporarily
shortest -path of
length , and
by induction hypothesis,
it's found not after
the -th pass.
Let it be found at the
-th pass (),
and at the -th
pass, the edge
is inevitably visited,
therefore
is found at the
-th pass, and
.
This also leads us to the
correctness proof of the algorithm:
Theorem
The Bellman-Ford SSSP algorithm
yields the correct answer.
Proof. Consider every
subproblem of finding the
shortest -path. By
the Fundamental Theorem of
Shortest Path Problem, if
there's no negative cycle
between and , there's
a shortest -path that is
a -path. The length of
a path in is at most $|V(G)| - 1$,
and the shortest -path
is the temporarily shortest
-path of length
$|V(G)| - 1$, thus found
not after the $(|V(G)| - 1)$-th pass.
Given that all shortest
paths must be found not after
the $(|V(G)| - 1)$-th pass of step
2 of the algorithm, the
must
stabilize when it's being
checked at step 3. If not so,
there must be a negative cycle.
The won't contain
a cycle if there's no negative
cycle. Otherwise there's a
vertex such that
a temporarily shortest -path
is found first, which is extended to
a temporarily shortest -path
for some , and
further extended into
a temporarily shortest -path
.
Note that
,
which indicates
is a negative cycle.
In this way, the Bellman-Ford
SSSP algorithm yields the
correct answer.
Theorem
The Bellman-Ford SSSP algorithm
runs in time complexity
$O(|V(G)| \cdot |E(G)|)$ and space
complexity $O(|V(G)|)$.
Proof. There are $|V(G)| - 1$
passes of finding temporarily
shortest paths and $1$ pass of
detecting negative cycles, and each
pass iterates over the edges and
performs $O(1)$ operations per edge.
Thus the algorithm runs in
time complexity $O(|V(G)| \cdot |E(G)|)$.
The algorithm allocates two
mappings of and
, both taking $O(|V(G)|)$
space. Thus the eventual
space complexity is $O(|V(G)|)$.
A common optimization of the
Bellman-Ford SSSP algorithm
is to halt when no more
temporarily shortest paths are
found, i.e. when
stabilizes, given that the
future passes at step 2 will
not find new temporarily shortest
paths / change
, and
clearly there's no negative cycle in
the weighted digraph. However,
doing so will not change the
(worst case) time complexity.
Dijkstra's SSSP Algorithm
It turns out that finding
SSSP can be much simpler
when there's no edge with
negative weight. For example,
it suffices to perform one
pass of BFS at source vertex
to find the shortest paths
in length, which is
equivalent to finding SSSP
in the weighted digraph
with weight $1$ assigned
to all edges uniformly, and runs in
time complexity $O(|V(G)| + |E(G)|)$.
When the weight is not just
$1$ but non-negative, the
SSSP can also be solved
greedily, which is the core
idea of Dijkstra's
SSSP algorithm.
Again, let's present the
Dijkstra's SSSP algorithm first:
Algorithm (Dijkstra)
The input is the digraph
with
non-negative weight
and the source vertex
,
the output is the classical
result of
in SSSP, as is described
here:
- Perform classical initialization of
in SSSP.
- Initialize empty priority
queue , which stores
elements of key-priority
pairs , and supports
dequeuing the element of
minimum priority, or
modifying the priority
of an element by its key .
- Push into .
- Loop while is not empty:
- Dequeue element from .
- For every :
- If :
- Update
,
- Enqueue-or-update-priority into .
- Halt with classical result
of in SSSP.
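For reference, a minimal C++ sketch of Dijkstra's algorithm follows. Instead of the enqueue-or-update-priority operation described above, it uses std::priority_queue with lazy deletion (stale entries are skipped when dequeued), which is the usual shortcut in practice; adj, dis and pre are assumed names:

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = LLONG_MAX / 4;

// adj[u] holds pairs (v, w) for every edge (u, v) of non-negative weight w.
void dijkstra(int n, const vector<vector<pair<int, long long>>>& adj, int s,
              vector<long long>& dis, vector<int>& pre) {
    dis.assign(n, INF);
    pre.assign(n, -1);
    dis[s] = 0;
    // min-heap of (priority, vertex) pairs
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>, greater<>> pq;
    pq.push({0, s});
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d != dis[u]) continue;        // stale entry: priority already improved
        for (auto [v, w] : adj[u])
            if (dis[u] + w < dis[v]) {
                dis[v] = dis[u] + w;
                pre[v] = u;
                pq.push({dis[v], v});     // enqueue instead of updating in place
            }
    }
}
```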
The idea is simple, to
prioritize paths with
minimum weight, and extend
these paths first.
To prove the correctness of
the Dijkstra's SSSP algorithm,
we need to characterize the
behaviour of the priority
queue first:
Lemma
If is dequeued
from before ,
then . (Note that
and
don't have to be in at
the same time.)
Proof. It suffices to
prove the minimum priority
of is non-decreasing.
Consider when
is dequeued
from , there's
.
Assume is
enqueued-or-priority-updated
due to , so
we have
,
and thus
.
And we can see the non-negative
weights of edges in the digraph
participate in the monotonicity
of priority. Such monotonicity
leads to:
Lemma
When is dequeued from
, no element with key
will be enqueued into
again, and the values of
stabilize.
Proof. Assume the contrary
that after the is
popped from , when
is popped from , an edge
is checked, rendering
and causing
to be
enqueued into again. By the
monotonicity of the
priority queue , since
is dequeued from
after , there must
be , which implies
.
Note that the inequality
contains , which
is a contradiction.
And now we can prove the correctness
of the Dijkstra's SSSP algorithm:
Theorem
The Dijkstra's SSSP algorithm
yields the correct answer.
Proof. For those
,
the initial values
are already correct. When
, the algorithm
guarantees some values are
assigned to
,
implying a shortest -path,
and element with key is
enqueued into . We just
need to ensure the shortest
-path implied by
are correct when is
dequeued from .
Assume the contrary that the
-path found is
incorrect, and the correct
shortest -path is
such that
.
We may assume the algorithm
is able to find
correctly, otherwise we
may recursively reason
on first. Since
has specified the shortest
-path for every
, every
recursion inspects a vertex
appears earlier in ,
thus the recursion can't
last forever.
Note that must
be dequeued after
, since otherwise
when is popped,
edge is checked, the
priority of will be
set to not greater than
, which is a
contradiction. And if
is dequeued after
, there must be
, but
we expect
instead, which is a contradiction.
The won't contain
a cycle, since if Dijkstra's SSSP
algorithm introduces a cycle by
a temporarily shortest -path
,
the cycle
has non-negative weight, and thus
is longer than , which
is a contradiction.
In this way, the Dijkstra's SSSP
algorithm yields the correct answer.
Again, we can see the non-negative
weights of edges play a role in
ensuring the correct shortest
-path to be found after
is popped and
stabilize.
Theorem
Dijkstra's SSSP algorithm runs in
time complexity $O((|V(G)| + |E(G)|) \log |V(G)|)$
and space complexity $O(|V(G)|)$, if a
(minimum) binary heap is used to fulfill
the purpose of a priority queue.
Proof. To use binary heap to
fulfill the purpose of a priority
queue in Dijkstra's SSSP algorithm,
that is, to support updating
priority, a common approach is to
remove the old element from the
binary heap and then add the new
element of updated priority back.
To do so, we will have to keep
track of the vertex in addition
to the priority in the heap; and
for every vertex, we will have to
keep track of the index of element
with key of the current vertex,
given that a binary heap is
built on an array.
Every time the heap percolates
elements in the heap, we look
up the vertex of the element
and adjust the tracked index
of the element. To remove the
element with key of specific
vertex, we look up the index
of element associated with
this vertex, and swap the
element with the last element
in the heap, modifying the
heap size, then adjust the
heap beginning at the place
of swapped element.
Every element is added to and
removed from the priority
queue once. There are at
most $|V(G)|$ elements in the heap,
thus the cost of adding and
removing elements is
$O(|V(G)| \log |V(G)|)$. Every
time an edge is visited,
it may adjust the priority
of an element, thus the
cost of adjusting priorities
is $O(|E(G)| \log |V(G)|)$. In
this way, the eventual
time complexity is
$O((|V(G)| + |E(G)|) \log |V(G)|)$.
The heap and the index
tracking map take $O(|V(G)|)$
space, alongside the
mappings of
taking $O(|V(G)|)$ space.
In this way, the eventual
space complexity is $O(|V(G)|)$.
The priority queue can also
be implemented by a Fibonacci
heap, which supports
enqueuing and
priority updates in $O(1)$
amortized time, thus the
time complexity of the
implementation by Fibonacci
heap can reach
$O(|E(G)| + |V(G)| \log |V(G)|)$, and
is asymptotically faster than
the implementation by binary
heap running in
$O((|V(G)| + |E(G)|) \log |V(G)|)$.
However, the Fibonacci heap
is much more complex than the
binary heap, and is reported
to have a large constant factor
and to be generally slower than
the binary heap within the
reasonable time scale of OI
problems. In this way, the
time complexity bound by
Fibonacci heap remains
theoretical, and in practice
it suffices to implement
Dijkstra's algorithm
with a binary heap.
Shortest Path Problem in DAG
Besides eliminating negative
cycles by requiring non-negative
edge weights, an weighted digraph
which is a DAG has no cycle, and
thus no negative cycles. Is there
possibly any specialty in
shortest path problem in DAG?
The answer is yes and simple, we
will just have to exploit the
topological sort of the DAG:
Algorithm (The SSSP Algorithm of DAG)
The input is a
DAG
with weight
and the source vertex
,
the output is the classical
result of
in SSSP, as is described
here:
- Perform classical initialization of
in SSSP.
- Perform a topological sort
for vertices reachable from
, in which one pass of
DFS at suffices. The
topological sort is stored
in the list .
- Iterate for every :
- For every :
- If :
- Update
.
- Halt with classical result
of in SSSP.
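A minimal C++ sketch of the DAG SSSP algorithm might look as follows; it assumes an adjacency-list representation and the names adj, dis, pre, and obtains the topological order of the vertices reachable from the source by one DFS pass:

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = LLONG_MAX / 4;

// adj[u] holds pairs (v, w); the digraph is assumed to be a DAG.
void dag_sssp(int n, const vector<vector<pair<int, long long>>>& adj, int s,
              vector<long long>& dis, vector<int>& pre) {
    vector<int> order;
    vector<bool> visited(n, false);
    function<void(int)> dfs = [&](int u) {            // post-order DFS from s
        visited[u] = true;
        for (auto [v, w] : adj[u])
            if (!visited[v]) dfs(v);
        order.push_back(u);
    };
    dfs(s);
    reverse(order.begin(), order.end());              // reverse post-order = topological order

    dis.assign(n, INF);
    pre.assign(n, -1);
    dis[s] = 0;
    for (int u : order)                               // relax edges in topological order
        if (dis[u] != INF)
            for (auto [v, w] : adj[u])
                if (dis[u] + w < dis[v]) {
                    dis[v] = dis[u] + w;
                    pre[v] = u;
                }
}
```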
The proof is quite obvious
to some extent:
Theorem
The SSSP algorithm of DAG yields the
correct answer.
Proof. Since a DAG is acyclic,
it has no negative cycles,
and for every reachable
from , the shortest -path
exists. We prove by mathematical
induction, parameterized by ,
that having found the shortest
-path for
implies the shortest -path
must be found, for .
Note that the first item of
is and the shortest -path
has been found to be . For
the , consider
any shortest -path
(with weight ) to be
found. Obviously we have
,
thus is the shortest
-path and has been
found prior to visiting , by
induction hypothesis. Then when
the algorithm visits , it
will iterate ,
edge will be
concatenated to , tested
among all temporarily shortest
-paths, and used to update
if applicable. Thus
, and
the shortest -path must
be found.
The won't contain
a cycle, since any walk recovered
from must be found
in the weighted digraph, while a
DAG doesn't contain a cycle.
By applying mathematical induction
along every item of
sequentially, we prove that the
SSSP algorithm of DAG yields
the correct answer.
Theorem
The SSSP algorithm of DAG runs in
time complexity $O(|V(G)| + |E(G)|)$
and space complexity $O(|V(G)|)$.
Proof. The topological
sort takes time complexity
$O(|V(G)| + |E(G)|)$. The list has
at most $|V(G)|$ vertices, with
at most $|E(G)|$ edges incident to them,
thus the iterations of step 3
take time complexity
$O(|V(G)| + |E(G)|)$. In this way, the
eventual time complexity is
$O(|V(G)| + |E(G)|)$.
The topological sort takes
space complexity $O(|V(G)|)$.
The mappings of
take
space complexity $O(|V(G)|)$.
Thus the eventual space
complexity is $O(|V(G)|)$.
So the SSSP problem of DAG
is of the kind that can be
efficiently solved in linear
time complexity, proportional
to the magnitude of
vertices and edges.
Interestingly, we might have
been using the SSSP algorithm
of DAG when solving some problems,
implicitly and without noticing:
Example
Starting with , every
time we may perform one of
the following operations:
- Increment by ().
- Multiply by ().
Find the minimum number
of operations required to
fix into .
This is a classical problem
solvable by dynamic programming:
- Initialize list
of size , filled with
initially.
- Set .
- Iterate from to :
- If ,
update .
- If ,
update .
- Halt with .
The content of list
is shown in the figure on the left
above. Every value is
obtained by adding to ,
or by doubling if
, and we just need to
find which way requires less
operations. This can be easily
extracted to the digraph with
edge weight on the right.
Clearly finding the number of
operations to is equivalent
to finding the shortest
-path in the extracted digraph.
Since we have no operation to
decrease the value of , we have
,
thus the digraph is a DAG, and
is sufficiently
a topological sort of it.
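A minimal C++ sketch of this dynamic programming, under the assumption (suggested by the adding and doubling transitions above) that the two operations are "increment by 1" and "multiply by 2" and that we start from 1:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Minimum number of assumed "+1" / "*2" operations to turn 1 into n (n >= 1),
// i.e. the shortest 1-n path in the implicit DAG described above.
int min_operations(int n) {
    vector<int> f(n + 1, INT_MAX);
    f[1] = 0;                                          // the starting value
    for (int i = 2; i <= n; ++i) {
        f[i] = f[i - 1] + 1;                           // reach i by incrementing i - 1
        if (i % 2 == 0)
            f[i] = min(f[i], f[i / 2] + 1);            // or by doubling i / 2
    }
    return f[n];
}
```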
Although it's generally uncommon
to think in this way when solving
such kinds of problems, it
helps in understanding the
connection between thoughts from
different fields, and pulling
things together is vital to
studying mathematics.
Matrix Exponentiation Based APSP Algorithm
Recall that we've claimed that
naively looping SSSP over all
vertices does not yield the
most efficient way of solving
the APSP. In fact, even the
naive way can be further
improved, by exploiting
algebraic properties.
To apply the APSP algorithm,
we first
preprocess
the input digraph so that
there are no multiple edges,
and thus we may let
.
In addition to that, we
may preprocess the weighted
digraph so that it's
loopless for further
processing: we may
scan every loop
, and if
, the loop will
not appear in any shortest
path; if , such
a loop is a negative cycle,
and we may signal the caller
with "negative cycle detected"
right away.
Number the vertices of digraph as
,
and define the
weighted adjacency matrix
as a
-matrix such that
.
Hold on tight; we are about to
unleash some of its algebraic power.
For convenience, let's focus
only on the weights first.
It's easy to verify that the
weighted adjacency matrix
stores the weights
of temporarily shortest
-paths of length
. If there's
another matrix
storing the weights of
temporarily shortest
-paths of length
, how can we obtain
the weights of temporarily
shortest -paths
of length ?
Every temporarily shortest
-path of length
is either the
-path of
length , or constructed
from a -path of
length for some
concatenating the edge
. And our way
of defining turns
out to be very clever:
- For every temporarily shortest
-path, there's
.
- The weight of walk
constructed by concatenating
to the
temporarily shortest
-path can be
obtained by
.
- If ,
then given that
,
we have
.
- If there's no temporarily
shortest -path,
then given that
,
we have
.
In this way, the weight of
the temporarily shortest
-paths of length
can always be
obtained by
.
We will show the specialty
of this expression.
Recall that a ring is an
additive abelian group
equipped with a multiplicative
semigroup / monoid, such
that the multiplication is
distributive over the
addition, and the additive
identity is the multiplicative
zero. If we allow the addition
to be defined over a
commutative semigroup / monoid,
in which some elements do not
have additive inverses, we
obtain a semiring. And
we would like to show:
Theorem
The algebraic structure $(\mathbb{R} \cup \{+\infty\}, \min, +)$
is a commutative semiring
with zero $+\infty$ and unity $0$.
Proof. We will need to
prove $(\mathbb{R} \cup \{+\infty\}, \min)$
is the additive commutative
monoid with identity $+\infty$,
$(\mathbb{R} \cup \{+\infty\}, +)$
is the multiplicative commutative
monoid with identity $0$,
$+$ is distributive over $\min$,
and $+\infty$ is the
multiplicative zero.
To prove
is a commutative monoid:
- Obviously it's closed.
- ,
thus it's associative.
-
thus it's commutative.
- ,
thus is the identity
of this monoid.
And , there's
no element such that
,
thus the monoid is
generally not invertible.
Clearly
is obtained by
adjoining to
, while
has
already been an abelian
group, we just need to
check how
affects the operations.
To prove is a
commutative monoid:
- ,
it's closed and commutative,
remains as the identity
of this monoid, and there's no
inverse for .
- ,
and
,
thus it's associative.
Since there's
,
the is distributive
over . And since
,
is the
multiplicative zero, thus
the zero of the semiring.
The semiring $(\mathbb{R} \cup \{+\infty\}, \min, +)$
is called the
tropical semiring by
mathematicians.
In this way, let
to indicate their role in
the semiring of
more intuitively, soon
we have:
What does this look most similar to?
Yes, matrix multiplication.
So the matrix
stores
the weights of temporarily
shortest -paths of
length , when viewing
as
matrices over tropical semiring.
And recall that
itself stores the weights of
all temporarily shortest path
of length , it won't
be hard to see:
Theorem
When viewing as
matrix over tropical semiring,
the matrix
stores the weights of all
temporarily shortest path of
length .
Proof. By mathematical
induction with
as the base step, and
as the induction step, we
are done with the proof.
Please notice the proof
does not require the
multiplication of matrices
over tropical semiring
to be associative: When
there are no additional
parentheses, the natural
order of computing
is from left to right,
which means we first
compute
and then multiply
by
, while
is
stored in
(and computed first)
in the induction steps.
So there's no algebraic
issue here.
In this way, a matrix
exponentiation based
APSP algorithm might
work in this way:
first compute
,
and then see whether
to check if there's a
negative cycle. What
about the complexity?
For each cell, we have to
do $O(n)$ additions and
pair-wise minimums,
and there are $n^2$
cells to go. And we need
to do $O(n)$
matrix multiplications, so
there are $O(n^4)$
operations to go.
By now, you must be super
furious, since it's nothing
better than the naive way
of looping Bellman-Ford over
every vertex, which also
runs in $O(n^4)$, but
without having to deal with
this dizzying and useless
algebraic garbage.
But working towards matrix
exponentiation is hopeful,
since matrix exponentiation
(over fields) has a
famous optimization
technique called fast matrix
exponentiation, based on
the concept of exponentiation
by squaring: If we are to
compute for a
square matrix, we
may first factorize as
,
and then:
Since
and
,
starting with
,
we loop from to ,
and in each iteration, we
check whether to
decide whether to multiply
into
, and then
compute
from squaring .
The result of
is stored in
after the loop. Doing it
this way, we just need
to do matrix multiplications
of by
, and
matrix squarings of
(if
we cleverly elide the
last squaring by checking
whether ). In this
way, there are in total
operations
to go (without Strassen's
algorithm), and since
, the
eventual time complexity is
$O(n^3 \log n)$.
The fast matrix exponentiation
requires associativity of
matrix multiplication (over fields): we
need to partition
multiplicands into groups
of ;
and when computing
from squaring
,
we need to group the
first and last
multiplicands.
So if we want to deploy
fast matrix exponentiation
to matrices over tropical
semiring, their matrix
multiplication needs to
be associative. Luckily,
the multiplication of
matrices over semirings
is associative:
Theorem
Let be a semiring,
then the multiplication of matrices
over is associative. That is,
for matrices
,
there's
.
Proof. First, consider
,
clearly
holds for them, which means:
The reasoning of
involves only addition and multiplication,
in which the distributive law is required
for expanding
and contracting
,
the commutative law is required for
swapping summation order / pivoting in
.
In this way, the reasoning can
even take place in a semiring. Thus
the multiplication of matrices over
a semiring is also associative.
In this way, the fast
matrix exponentiation is
applicable to the matrices
over tropical semiring, and
the matrix exponentiation
of part of
our APSP algorithm can run
in time complexity
$O(n^3 \log n)$. Please
notice the divide-and-conquer
based Strassen's algorithm,
which runs in time complexity
$O(n^{\log_2 7})$ per
matrix multiplication, is
not applicable, due to the
lack of additive inverses /
subtraction for some
semiring elements.
Now all remains to do is to
show we are also able to
handle the update of
previous vertex efficiently,
especially to show it's
compatible with the process
of squaring the matrices:
Lemma
Let be the temporarily
shortest -path
of length , then for
any ,
must be the concatenation
of a temporarily shortest path
-path of length
and a temporarily
shortest -path
of length , for
some .
Proof. It's clear
that must be the
temporarily shortest
-path of
length ,
otherwise we may replace
it with the shorter one,
which is a shorter
-path of length
, which
is a contradiction.
When ,
we just need to argue
that : Since
if there's any shorter
-path
of length ,
then
is shorter than and
of length
,
which is a contradiction.
In fact, must not
contain a negative loop
if is to
be true.
When ,
we may split into
such that
.
The segment is the
temporarily shortest
-path of
length ,
otherwise we can replace
the segment with the
shorter , so that
is shorter than ,
while of length .
For the same reason,
the segment must
also be the temporarily
shortest -path
of length ,
otherwise we may replace
the segment with the
shorter , so that
is shorter than ,
while of length .
Corollary
The temporarily shortest
paths of length
can be constructed from
the temporarily shortest
paths of length
and the temporarily shortest
paths of length .
Proof. Any
temporarily shortest
-path of length
is a concatenation
of a segment of a temporarily
shortest -path
of length and
a segment of a temporarily
shortest -path
of length , so we
will encounter it by doing a
concatenating-and-picking-up-minimum
loop over all
.
Let store
the weights of temporarily
shortest paths of length ,
store
the previous vertices of
each temporarily shortest path
of length , and
store
the actual length of each
temporarily shortest path of
length . In this way,
we may compute
from
and
,
by multiplying
and concatenating the shortest paths
stored in with
, synchronizing
the lengths in :
Algorithm (Weight-Path-Length Matrix Multiplication)
The input is
and
,
the output is
:
- Initialize
as matrix of all , and
as matrix of
all .
- Loop for from to :
- Loop for from to :
- Loop for from to :
- Compute
- If or
:
- Update .
- If ,
then
update ;
otherwise update
.
- Halt with .
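Below is a hedged C++ sketch of one such combining step. It carries a weight matrix, a previous-vertex matrix and a path-length matrix, and breaks weight ties toward the smaller length; the tie-breaking convention and the rule for picking the previous vertex are one reasonable reading of the step above, not the only possible one:

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = LLONG_MAX / 4;

struct Mats {                         // weights, previous vertices, path lengths
    vector<vector<long long>> W;
    vector<vector<int>> P, L;
};

// Combines temporarily shortest walks of length <= a (in A) and of length <= b
// (in B) into temporarily shortest walks of length <= a + b.
Mats combine(const Mats& A, const Mats& B, int n) {
    Mats C;
    C.W.assign(n, vector<long long>(n, INF));
    C.P.assign(n, vector<int>(n, -1));
    C.L.assign(n, vector<int>(n, INT_MAX));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            for (int k = 0; k < n; ++k) {
                if (A.W[i][k] == INF || B.W[k][j] == INF) continue;
                long long w = A.W[i][k] + B.W[k][j];   // "tropical" product term
                int len = A.L[i][k] + B.L[k][j];
                if (w < C.W[i][j] || (w == C.W[i][j] && len < C.L[i][j])) {
                    C.W[i][j] = w;
                    C.L[i][j] = len;
                    // the previous vertex of j comes from the second factor,
                    // unless the k-j part is the empty walk (k == j)
                    C.P[i][j] = (k == j) ? A.P[i][j] : B.P[k][j];
                }
            }
    return C;
}
```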
Here, one might be confused about
the meaning of .
Although
stores the weights of all
shortest paths of length ,
it's not sufficient to use
to reconstruct
a temporarily shortest path of
length . Since if
stores the previous vertex
of in the temporarily
shortest -path of
length , then
stores the previous vertex
of in the temporarily
shortest -path of
length . And if
the temporarily shortest
-path is
reconstructed into a path
of length , then
the temporarily shortest
-path reconstructed
is of length .
Indeed, storing the previous
vertex of temporarily shortest
paths of length
does not suffice to reconstruct
the underlying path. However,
we know when there's no negative
cycle, it suffices to find
temporarily shortest paths of
length , and we
just need to ensure we are
able to reconstruct the
underlying path then. In fact,
we have:
Lemma
In weighted digraph with
no negative cycle, let
the shortest paths
be represented by
for , if
all shortest paths are
the shortest in length,
then tracking back previous
vertices in
reconstructs into paths.
Proof. Assume the contrary,
if it fails when recovering
the shortest -path
,
circulating between ,
then there are a shortest
-path in the form of
and a shortest -path
in the form of
.
By the Fundamental Theorem
of Shortest Path Problem,
the shortest paths exists
when there's no negative
cycle in the digraph, and
is also a shortest
-walk such that
,
since is the shortest
in length; is also a
shortest -path such that
,
since is the shortest
in length. Since have
distinct endpoints,
, and thus
,
which is a contradiction.
Theorem
In weighted digraph with
no negative cycle, for
the input temporarily shortest
paths of length and
of length , such that
all walks are paths and the
shortest in length, then
the Weight-Path-Length
Matrix Multiplication
combines them into output
temporarily shortest
paths of length
that are paths and the
shortest in length.
Proof. Assume the
contrary, that the algorithm
fails to create paths,
it must combine -path
and -path
into the temporarily
shortest -path
as output. Consider the
weight of closed walk
:
if , then
is shorter than it;
if , then
the digraph contains a
negative cycle. So the
only case is that .
We argue that for
temporarily shortest
-path
of length as
input, and the temporarily
shortest -path
of length
as input, there must be
:
Assume the contrary
that ,
then given that
,
must be the temporarily
shortest -path
found instead of
.
The analogous argument
is applicable to show
.
So the only possibility
is that
,
however since the input
are paths shortest in
length, we have
and thus there's
instead, which is a
contradiction. In
this way,
can't be the output
of the algorithm.
The concatenation of
may also contain a closed walk,
however we may recursively
reason about this new shortest
-path then, and
given that
,
the recursion can't last
forever since a descending
chain of positive integers
to zero is finite.
Now the output of the
algorithm must be a path,
but we still need to show
it's the shortest in length.
Let the output temporarily
shortest -path of
length be .
By this lemma,
can be split into the
shortest -path
of length and
the shortest -path
of length .
Let the shortest
-path of
length specified
by input be , and
the shortest -path
specified by input be
, clearly we must have
,
and the Weight-Path-Length
algorithm must scan
for creating
the shortest -path.
So the shortest -path
produced by the algorithm
must fulfill
,
that is, is a path and
the shortest in length,
and we are done with the proof.
One can easily find the
length matrix plays an
indispensable role in
ensuring the path found is
the shortest in length.
In this way, the following
algorithm is in its ultimate form:
Algorithm (Matrix Exponentiation Based APSP Algorithm)
The input is the digraph
with weight
,
either the algorithm halts
with "negative cycle detected",
or it outputs the classical
result of
in APSP, as is described
here:
- Preprocess the digraph
to preserve the edges of
minimum weights. Then
scan every loops, and
halt with
"negative cycle detected"
if the loop has negative
weight, or remove the loop.
- Initialize matrices:
- Initialize counter .
- Loop until :
- Perform Weight-Path-Length multiplication
on
and ,
to produce
.
- Update .
- Test whether ,
if not so, halt with
"negative cycle detected".
- Halt with
and
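Reusing the Mats / combine sketch from the previous section (again an assumption of this text, not a fixed interface), the repeated-squaring loop and the stability test might be sketched as:

```cpp
// M initially stores the temporarily shortest walks of length <= 1, i.e. the
// preprocessed weighted adjacency matrix with a zero diagonal.
// Returns false when a negative cycle is detected.
bool apsp_by_squaring(int n, Mats& M) {
    long long k = 1;
    while (k < n - 1) {                 // square until the length bound reaches |V| - 1
        M = combine(M, M, n);           // walks of length <= 2k
        k *= 2;
    }
    Mats test = combine(M, M, n);       // one extra squaring as a stability test
    return test.W == M.W;               // still improving means "negative cycle detected"
}
```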
Theorem
The Matrix Exponentiation Based APSP
algorithm yields the correct answer.
Proof. It won't be hard to find that
in the digraph that is preprocessed
and with loops removed, every edge
is the
shortest -path of length
, and no -walk
can be shorter. In this way,
represents the shortest paths of
length such that all
walks are paths and the
shortest in length.
Starting with , each time,
the Weight-Path-Length matrix
multiplication combines
and
,
representing temporarily
shortest paths of length
, into
,
representing temporarily
shortest paths of length
,
until . In
this way, if there's no
negative cycle in the digraph,
every tuple
stores the temporarily
shortest paths of length
such that all
walks are paths and the
shortest in length.
By the Fundamental Theorem
of Shortest Path Problem,
we know if there's no negative
cycle, all shortest paths
must be paths, and the tuple
must stabilize after
; conversely,
if there's negative cycle,
the tuple
won't stabilize even after
, so
testing whether
for some
should be equivalent to testing whether
.
However, the common fast matrix
exponentiation, which tests each
bit in the binary decomposition
of to evaluate
precisely, takes up to
multiplications, while
evaluating
for some
takes only
multiplications, which is
preferred in our case.
The Matrix Exponentiation
Based APSP algorithm, which
runs in time complexity
$O(n^3 \log n)$, is
not the fastest algorithm
for solving the APSP
problem. However, it
reveals the suboptimality
of naively looping SSSP
algorithms over all vertices
by surpassing it in
complexity, and sheds light
on discovering faster
APSP algorithms. The semiring
algebra beneath it is
nevertheless also interesting.
Floyd-Warshall APSP Algorithm
How do we ensure that an
APSP algorithm (that does not
work in the naive way)
would work? Recall that in
this lemma,
we managed to establish a
necessary condition that
a temporarily shortest path
of length must
be constructed from a
temporarily shortest path
of length and one
of length , and then
we designed an algorithm to
find it sufficiently.
Interestingly, constructing
the temporarily shortest
path of length
from temporarily shortest
paths of length
and is not the
only way of solving the
APSP problem. By changing
the way of constructing
temporarily shortest paths,
we may solve the APSP
problem more efficiently.
For convenience, we may
assume the digraph has
no negative cycle for
now, and we will handle
the case with negative
cycles later.
Given that the vertices
and edges are finite in a
digraph, we may number the
vertices in the digraph as
.
Let
,
a temporarily shortest -path
with internal vertices from
is a temporarily shortest
-path such that
.
Clearly by this definition,
a shortest -path is a
temporarily shortest -path
with internal vertices from
, but:
Lemma
In weighted digraph with
no negative cycle, for
the temporarily shortest
-paths with internal
vertices from , at least
one of them is a path.
Proof. Assume it's not
a path, then it contains a
closed walk of non-negative
weight, waiting to be removed
or introducing a contradiction
to its being the shortest in weight.
Lemma
In weighted digraph
with
no negative cycle, for a
temporarily shortest
-path
with internal vertices from
that is a path, it's:
- A temporarily shortest
-path with internal
vertices from .
- Constructed from temporarily
shortest -path and
shortest -path, both
are paths and with internal
vertices from .
Proof. For a temporarily
shortest -path with
internal vertices from , if
,
then such a path is also
a temporarily shortest
-path with internal
vertices from ; if
,
then
.
Since is a path,
is not an internal
vertex of ,
thus are paths
and temporarily shortest
paths with internal
vertices from .
Theorem
In weighted digraph with
no negative cycle, from
the temporarily shortest
-path ,
-path and
-path with
internal vertices from
which are paths,
by letting:
Then we obtain a temporarily
shortest -path
with internal vertices
from that is a path.
Proof. We just need
to show the concatenation
is a
-path when
.
Assume the contrary that
is not
a path, let
and
be concatenated. Now
consider the closed walk
,
it must have zero weight, so
has the same weight as
but
.
Since is the
temporarily shortest
path with internal
vertices from ,
,
which is a contradiction to
.
In this way, we obtain an
APSP algorithm that can find
shortest paths in a digraph
with no negative cycle:
Algorithm (Floyd-Warshall without Negative Cycle Handling)
The input is the digraph
such that
,
with weight
and
no negative cycle,
the output is the classical
result of
in APSP, as is described
here:
- Perform classical initialization of
in APSP.
- Loop for to :
- Loop for to :
- Update .
- Loop for to :
- Loop for to :
- Loop for to :
- If , update
.
- Halt with classical result
of in APSP.
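A minimal C++ sketch of the algorithm above, assuming dis and pre are stored as dense matrices initialized as in steps 1 and 2 (dis(u, u) = 0, dis(u, v) = w(u, v) for edges, infinity otherwise; pre(u, v) = u for edges, undefined otherwise):

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = LLONG_MAX / 4;

// Assumes the digraph contains no negative cycle.
void floyd_warshall(int n, vector<vector<long long>>& dis, vector<vector<int>>& pre) {
    for (int k = 0; k < n; ++k)               // allow vertex k as an internal vertex
        for (int i = 0; i < n; ++i) {
            if (dis[i][k] == INF) continue;
            for (int j = 0; j < n; ++j)
                if (dis[k][j] != INF && dis[i][k] + dis[k][j] < dis[i][j]) {
                    dis[i][j] = dis[i][k] + dis[k][j];
                    pre[i][j] = pre[k][j];    // previous vertex taken from the k-j segment
                }
        }
}
```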
To show that tracking previous
vertices in
reconstructs into paths,
we would like to show that after
every -th iteration at
step 3, the
recovers into paths. This
will require:
Lemma
After the -th iteration
at step 3, if the -path
is set to the temporarily
shortest -path with
internal vertices from ,
then the temporarily
shortest -path with
internal vertices from
must also be updated.
Proof. Let be
replacing the temporarily
shortest -path
with internal vertices
from after the
-th iteration, this will
require .
Assume the contrary that
the temporarily shortest
-path with
internal vertices from
is also the
temporarily shortest path
with internal vertices
from . Now we have
.
On the other hand, the path
does not contain as
internal vertex, so its
internal vertices are from
, and thus
.
Now we have:
Which is a contradiction.
So there must be sufficiently
,
and the temporarily shortest
-path with internal
vertices from must
be updated after this iteration.
In the lemma above, we
conservatively assume that
"some path" will be set to
the temporarily shortest
-path, not necessarily
.
However, we still have:
Lemma
If before the -th iteration,
recovers into
paths, that is, the temporarily
shortest -path has
a temporarily shortest
-path
segment as prefix, then after
the -th iteration,
still recovers into paths.
Proof. Let
temporarily shortest -path
with internal vertices from
be found after the
-th iteration.
Note that by tracking back
, starting
with and ending at
, we
recover the -path
segment , by the presumption.
And due to the previous lemma,
every
are updated after the -th
iteration, and the only way
to update it is by concatenating
with the temporarily shortest
-path with internal
vertices from , which
are -path segments /
prefixes of , so
is set to
for
.
On the other hand, before
the -th iteration, by
tracking back
, starting
with and ending at
, we
recover the path . We
argue that
for
remains
unchanged after the
-th iteration. Otherwise
such a temporarily shortest
-path must be
constructed from concatenating
-path and some
-path , while
has already been in ,
which is a contradiction to
every temporarily shortest
path constructed must be a
path we've
proved
earlier.
In this way, if the
temporarily shortest -path
with internal vertices from
is found after the -th iteration,
by tracking back
, starting
with and ending at
, we
recover into exactly this
shortest path.
Obviously after step 2,
the initial configuration of
stores only paths
made up of single edges, and
thus can be recovered into
paths; during the execution,
the algorithm preserves the
property per iteration of step
3, thus when the algorithm
terminates, the
yielded recovers into paths.
The reasoning of the lemma
above can also be used to
show all temporarily shortest
-paths and -paths
remain unchanged in the
-th iteration of step 3,
thus updating
and in place is safe.
In this way, the Floyd-Warshall
APSP algorithm is correct
when there's no negative
cycle in the digraph, and it
turns out to be super efficient
when the digraph is dense:
Theorem
The Floyd-Warshall APSP algorithm
without negative cycle handling
runs in time complexity
$O(|V(G)|^3)$ and space
complexity $O(|V(G)|^2)$.
Proof. The initialization
at step 1 takes time complexity
$O(|V(G)|^2)$, the step 2 takes
time complexity $O(|V(G)|^2)$, and the
step 3 takes time complexity
$O(|V(G)|^3)$. So the eventual
time complexity accumulates to
$O(|V(G)|^3)$.
During the execution, only
the mappings and
are required, and
they take $O(|V(G)|^2)$ space.
So the eventual space complexity
accumulates to $O(|V(G)|^2)$.
Now let's talk about handling
negative cycles in the
Floyd-Warshall APSP algorithm.
There are many ways to accomplish
this task within $O(|V(G)|^3)$,
and we would like to introduce
one of the simplest ways:
Algorithm (General Negative Cycle Detection in APSP)
The input is the digraph
such that
,
with weight
, the output
is whether the digraph
contains negative cycle:
- Preprocess the digraph into
such that
and ,
while .
That is, create a new pseudo
vertex , and edges
of zero weight to all
other vertices in .
- Execute Bellman-Ford SSSP
algorithm with digraph
and source vertex as
input. If the Bellman-Ford
halts with
"negative cycle detected",
then halt with true;
otherwise halt with false.
The correctness of this
method is obvious: for every
negative cycle
in digraph , the walk
must be inspected by the Bellman-Ford
algorithm; for every negative cycle
detected by
,
is sufficiently a negative
cycle to be found in ; adding
pseudo vertex will not
introduce a negative cycle,
since it does not introduce any
cycle in the first place.
The time complexity of this
method is obviously $O(|V(G)| \cdot |E(G)|)$,
and the space complexity
is obviously $O(|V(G)|)$, so
integrating such a detection
into Floyd-Warshall APSP
algorithm will not affect its
overall time complexity and
space complexity:
Algorithm (Floyd-Warshall)
The input is the digraph
such that
,
with weight
,
either the algorithm halts
with "negative cycle detected",
or it outputs the classical
result of
in APSP, as is described
here:
- Execute the General Negative
Cycle Detection in APSP, if
it detects a negative cycle,
halt with "negative cycle detected".
- Given all negative cycles have
been handled, just execute the
Floyd-Warshall without Negative
Cycle Handling will yield the
correct answer now.
We shall conclude the complete
implementation of the Floyd-Warshall
APSP algorithm in this text for now.
Johnson's APSP Algorithm
The Floyd-Warshall APSP algorithm
works well for dense digraphs in
time complexity $O(|V(G)|^3)$; however,
when the digraph is sparse with
non-negative weights, by
looping Dijkstra's algorithm
over every vertex, it runs in
time complexity $O(|V(G)| \cdot (|V(G)| + |E(G)|) \log |V(G)|)$
with a (minimum) binary heap as
the priority queue. When such a
digraph is as sparse as
$|E(G)| = O(|V(G)|)$,
the time complexity can reach
$O(|V(G)|^2 \log |V(G)|)$ and is faster than the
Floyd-Warshall APSP algorithm.
So the problem remains: Can
the APSP problem of a general
sparse digraph be solved faster
than $O(|V(G)|^3)$? Johnson's
APSP algorithm finds us a way:
we may reweight the
digraph so that all edges
become non-negatively weighted,
but without altering the shortest
paths. In the reweighted digraph,
Dijkstra's algorithm is safe
to go, and looping Dijkstra's
algorithm over every vertex
should yield the APSP within
$O(|V(G)| \cdot (|V(G)| + |E(G)|) \log |V(G)|)$, as is analyzed above.
First, we must be aware that
naively reweighting the digraph
will not work, since it will
alter the shortest paths:
So it's non-trivial to come up
with a reweighting process that
preserves the shortest paths.
Incredibly, the Johnson's
algorithm tweaks the
General Negative Cycle Detection in APSP
into a reweighting process
that may preserve the
shortest paths:
Algorithm (Johnson's APSP Reweighting Process)
The input is the digraph
such that
,
with weight
, either
the algorithm halts with
"negative cycle detected",
or it outputs the new
weight
:
- Preprocess the digraph into
such that
and ,
while .
That is, create a new pseudo
vertex , and edges
of zero weight to all
other vertices in .
- Execute Bellman-Ford SSSP
algorithm with digraph
and source vertex as
input. If the Bellman-Ford
halts with
"negative cycle detected",
we also halt the algorithm
with this signal;
otherwise the Bellman-Ford
halts with its classical output,
and
we will let
$h(v)$ be the weighted distance from the pseudo vertex to $v$,
and reweight as
$w_h(u, v) = w(u, v) + h(u) - h(v)$.
Theorem
The Johnson's APSP Reweighting Process
preserves the shortest paths, that is,
in the digraph with no negative
cycle, for every shortest
-path found in
with weight , it must also be
the shortest -path in
with weight .
Proof. For every walk
$W = v_0 e_1 v_1 \cdots e_k v_k$,
its weight under $w_h$ is
$w_h(W) = \sum_{i=1}^{k} \left( w(e_i) + h(v_{i-1}) - h(v_i) \right) = w(W) + h(v_0) - h(v_k)$,
since the intermediate $h$ terms telescope.
The reweighting process won't
introduce negative cycle: Since
for any closed walk , let it
be passing any ,
then we have
,
so won't be a negative cycle
in the digraph reweighted with
, unless it has already been
a negative cycle in the digraph
with weight . In this way,
if digraph with weight
does not contain negative cycle,
then with weight must
also not contain negative cycle.
This will guarantee the existence
and well-formedness of the
shortest paths.
When there's no negative cycle
in the digraph, by the Fundamental
Theorem of Shortest Path Problem,
for every -path ,
there must be
.
This will imply
,
and thus is also the shortest
-path with the
minimum weight in the digraph
reweighted with .
Finding a reweighting process
that preserves the shortest
paths is the first step, to
apply the Dijkstra's SSSP
algorithm safely, all edges
must also be of non-negative
weights. However, we have:
Theorem
Every edge in the digraph reweighted by
the Johnson's APSP Reweighting Process
has a non-negative weight.
Proof. Let be
the weighted distance metric
in digraph , and
be the weighted distance
metric in digraph .
We would like to show
when , that is,
introducing pseudo vertex
won't change the
weighted distance between
vertices in : If
introducing changes
the shortest -path
into a less weighted one,
the new shortest
-path must be
found in and not in
, which means it must
pass . However, there's
no incoming edge to ,
which means no -path
may pass when .
In the digraph weighted
with , there must be
,
since otherwise the walk
by concatenating the shortest
-path and the edge
is strictly shorter
than the shortest
-path in weight,
which is a contradiction.
In this way, we have
,
noted that the left
hand side is equal to
,
and thus .
Now all obstacles are
clear, and we are ready
to present the
Johnson's APSP algorithm:
Algorithm (Johnson)
The input is the digraph
such that
,
with weight
,
either the algorithm halts
with "negative cycle detected",
or it outputs the classical
result of
in APSP, as is described
here:
- Execute the Johnson's
APSP Reweighting Process,
when the process halts with
"negative cycle detected",
we also halt the algorithm
with this signal; otherwise
we fetch the new weight .
- Perform classical initialization of
in APSP.
- Loop for to :
- Execute the Dijkstra's
SSSP algorithm in
with weight and source
vertex , fetch
the outputs into
.
- Loop for to :
- Assign .
- Halt with classical result
of in APSP.
It won't be hard to see the Johnson's
APSP algorithm is correct: At step
1, it either halts with "negative
cycle detected", or the digraph
with new weight is a
weighted digraph with non-negative
weight, which is safe for executing
Dijkstra's SSSP algorithm at step
3.1. On the other hand, given that
the Dijkstra's SSSP algorithm has
guaranteed that the
s recover into
paths, and the algorithm simply
assigns to every
with source vertex
at step 3.2,
also recovers into paths.
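Putting the pieces together, a hedged C++ sketch of the whole Johnson pipeline is given below. It reuses the bellman_ford and dijkstra sketches from the earlier sections (their signatures, the Edge struct and the INF sentinel are assumptions of this text):

```cpp
// Returns false on "negative cycle detected"; otherwise fills dis and pre,
// where dis[s][v] is the weighted distance and pre[s][v] the previous vertex.
bool johnson(int n, vector<Edge> edges,
             vector<vector<long long>>& dis, vector<vector<int>>& pre) {
    // Step 1: pseudo vertex n with zero-weight edges to every other vertex,
    // then Bellman-Ford from it to obtain the reweighting values h.
    for (int v = 0; v < n; ++v) edges.push_back({n, v, 0});
    vector<long long> h;
    vector<int> hp;
    if (!bellman_ford(n + 1, edges, n, h, hp)) return false;
    edges.resize(edges.size() - n);                     // drop the pseudo edges

    // Step 2: reweight, w_h(u, v) = w(u, v) + h(u) - h(v) >= 0, and build adjacency lists.
    vector<vector<pair<int, long long>>> adj(n);
    for (const Edge& e : edges) adj[e.u].push_back({e.v, e.w + h[e.u] - h[e.v]});

    // Step 3: Dijkstra from every source, then undo the reweighting.
    dis.assign(n, vector<long long>());
    pre.assign(n, vector<int>());
    for (int s = 0; s < n; ++s) {
        vector<long long> d;
        vector<int> p;
        dijkstra(n, adj, s, d, p);
        for (int v = 0; v < n; ++v)
            if (d[v] != INF) d[v] = d[v] - h[s] + h[v]; // dis(s, v) = dis_h(s, v) - h(s) + h(v)
        dis[s] = d;
        pre[s] = p;
    }
    return true;
}
```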