The number theoretic functions equipped with
Dirichlet convolution and prefix sum, form one
of the most important topics in elementary
number theory.
In this text, we are going to study the number
theoretic functions, however this is done from
algebraic perspective by studying the Dirichlet
ring, which they form under addition and
Dirichlet convolution.
Actually, one who gets the chance to study both
elementary number theory and algebra in parallel
will find how convenience and simple it is by
employing some algebra.
Dirichlet ring
In elementary number theory, we are specially
interested in such kind of functions:
Definition (Number Theoretic Function)
Any function in the form of
is called
a number theoretic function or an
arithmetic function.
For convenience, we may denote the set of all
number theoretic functions as .
So a number theoretic function takes in a
positive integer and outputs a complex number.
Obviously, we can combine them using
appropriate field operations of
to obtain another number theoretic function.
And we are specially interested in number
theoretic functions obtained from
the Dirichlet convolution of
, which is defined as:
Definition (Dirichlet Convolution)
Mind the symmetry when they loom over the
divisors of , there's clearly
.
Here are some common notations of number
theoretic functions we will use in this text:
Definition (Constant Function)
For each complex number ,
we define the constant function
as to yield for
every input integer, that is:
Definition (Identity Function)
The identity function
yields the
input integer itself, that is,
Definition (Unit Function)
The unit function
yields for input and for
all others, that is:
Just by substituting in the convolution's
formula, we will see there's
.
Actually we want to claim:
Assertion
The is a commutative
ring with zero and unity
.
And call or simply
the Dirichlet ring.
Proof of ring properties
It's trivial to show and verify
is an abelian group,
with group identity while
,
and Dirichlet convolution is distributive
over addition by
.
However, it's very non-trivial to show the
Dirichlet convolution is associative, and
we are going to see it right now.
Proof. It's very simple to get to a point that:
Where
is the set of indices we are iterating over,
and it's a finite set.
By
we have
and
.
This enables us to consider the map
defined by
,
clearly by
and
it's injective, and thus bijective
since it's on the same finite set.
In this way, is a permutation
on , and thus permutation
on summands of ,
so we have:
By now Dirichlet convolution has
been shown to be associative.
Now we've checked
against all ring axioms, and proved it to
be a ring as is claimed by us.
Furtherly, it's also simple to show that:
Theorem
The Dirichlet ring is an integral domain.
Proof. Assume it's the contrary that
.
Clearly there're such that
and
.
Then we have:
Which is a contradiction to . So
there's no zero divisor in and
is an integral domain.
Multiplicativity
The number theoretic functions that are
multiplicative draw most of our interest.
To begin, define counting function
as to yield the number
of divisors for each input integer, and the
divisor sum function
as to yield the
sum of all divisors of the input integer.
They are also expressible using the
language of Dirichlet convolution, as:
But how can we evaluate them?
Let
be the unique factorization of over the
ring of , and consider:
From the combinatorial point of view, when
we expand it into mononomials, each term of
the expansion corresponds to a divisor of .
And all of the terms has coefficient since
all primes are distinct and is
UFD. So the number of terms are the value of
, and the direct evaluation of the
expression is the value of , which
are clearly evaluated to:
Consider the connection between
and , or the one between
and ,
for any . At the
first glance, there's no special connection
between them, but when , let
,
it turns out there's no common prime in
their factorizations. So we have
and thus:
This leads to characterization of certain
kind of number theoretic functions:
Definition (Multiplicative)
Number theoretic function
is said to be multiplicative, if there's:
It's furtherly said to be completely
multiplicative, if even
is not required in order for
to be true.
For convenience, we may denote the set
of all multiplicative number theoretic
functions as .
Consider the function value of
multiplicative ,
we have:
Proof. Since we have:
And in the field the complex number
, if
,
then at this point is invertible
so we must have .
Otherwise it's the case of
,
including , rendering .
We would like to show that multiplicativity
is closed under Dirichlet convolution:
Theorem (Preserve by Operation)
Proof. First, one must notice
and vice versa, otherwise at this point
there's . Meantime
there's
and vice versa.
In this way, combinatorially all
divisors of is generated by
,
and we may have
,
this will lead us to:
Then, noted that there're
and
all along the way, and
by the multiplicativity
of we have
,
,
thus we have:
So is again multiplicative.
A common misconception is to think of
's being a subring of
, while in fact:
Warning
The multiplicative number theoretic functions do not form a subring of .
Proof.
But is still a
multiplicatively closed set, so
localization
is feasible for .
Dirichlet inverse
Conventionally, the inverse of a number
theoretic functions in is
called the Dirichlet inverse of
that function. We claim that:
Theorem (Existence of Inverse)
Proof. If the Dirichlet inverse
of would
like to exist, have
to be evaluated to for
every input positive integer.
First consider
,
it's impossible for to
exist in if ,
and so does ; otherwise at
this point there must be
.
Then, we prove by mathematical
induction: assume prior to the point
of , the function values
have existed; and at the point of
we have:
In this way, we are guaranteed to
find value of at every
point of
whenever , is
well-defined and thus exists.
Clearly every non-zero multiplicative
number theoretic function has a
non-zero function value at , and is
thus invertible. We would like to show
its inverse is also multiplicative:
Theorem (Preserve by Inversion)
Proof. First, choose any
and any , we
can do the following partition
of summations:
Then, we prove by mathematical
induction: it's obvious that
;
assume prior to the point of
with we already have
;
and at the point of , consider
,
we have:
Thus can be
simplified into
,
which implies
,
and is multiplicative at .
Gathering the fact that
exhibits multiplicativity for every
input positive integer, we've shown
.
We may use the conventional
notation from
ring theory to represent the unit
group of . Clearly
fulfils all
group axioms and forms a subgroup
of now.
One of the most important conclusion
from elementary number theory is the
Möbius Inversion Formula, which states:
Where is the
Möbius function. For every
input positive integer with its
unique factorization
,
Möbius function is defined as:
Definition (Möbius Function)
The formula can also be reinterpreted
using the language of Dirichlet ring that:
Assertion (Möbius Inversion Formula)
Which is much more simpler and conciser
than its elementary form.
Actually it suffices to show is
the Dirichlet inverse of that:
Proof. Instead of verifying
directly, it
would be more interesting to detour and
see whether we can derive Dirichlet
inversion of directly, using the
theorems we've just proved in last section.
We may let the inversion of
be . Since
is multiplicative, must also be
multiplicative, so we just need to check
function value evaluated at each prime
power , which is:
At the point of , we can see it's
; but at the
point of , there's
,
and assume there has already been
prior
to the point of , then at the point
of we have:
So the value of evaluated at
each prime power is:
So clearly there's ,
and thus when we combine
and verify the function value of
at each point of
.
Which is the direct consequence of
multiplicativity of and .
Application on Euler's totient function
We are familiar with the Euler's totient
function ,
which is defined to be
.
And we are here to provide
alternative perspective from number
theoretic functions and Dirichlet ring,
which gives us a closed form of
the function.
First we would like to show the following
theorem, which is due to Gauss:
Proof. Consider the additive group
,
for every element inside we can
evaluate , and it's
. Meantime, there's
,
mapping every
to a point in
.
So we can let there be the set
,
and the map
defined by
.
The map is injective. Otherwise
consider any
such that .
They are mapped to the same point in
,
so there has to be
.
which can be rewritten as
.
By multiplying on both sides we have
,
and thus
,
which is a contradiction to .
The map is surjective, since
for every point ,
we claim there's
.
First, since , there has to be
.
Then pick the common factor
out and we can see
.
Substituting into the definition of
and we will get
.
By now, we've proved to be a bijection
between and
. There're clearly elements
in , while
is the composition of elements in
,
and there're totally
of them. In this way, we have:
Or simply ,
and we are done with the proof.
One can easily see is
multiplicative, and thus
is also
multiplicative. And for each input
positive integer with its
unique factorization
,
we claim there's:
Proof. Although one can derive
by directly arguing there're
elements divisible by in
, it can
also be derived from Möbius inversion
formula that:
Then by the multiplicativity of
we have:
In this way we derive the closed form
of the Euler's totient function.
We can also simply write
where represents the set of
all primes, when the exponent of each
prime in is unrelated.
It won't be hard to derive the probability
of picking up a integer that is coprime to
depends on purely the primes that
is composed of, and unrelated to the
exponent of each prime, when we pick
integer uniformly, and noticing there's
.
This also can be proved by applying
inclusion-exclusion principle, arguing
they must lie outside the integers
divisible by prime factor of , and
we can see their conclusions match.
Dirichlet inverse of Euler's totient function
It won't be hard to see there's
,
but just like its original form, we
we also have a closed form
for .
First, we would like to show there's a
transformation about that:
Proof. By applying the formula of
Dirichlet inverse, at each ,
we have:
And obviously it's the case of
.
Then, we can show there's a general
way to evaluate
when is multiplicative:
Proof. Obviously we have:
So we have ,
and therefore
.
In this way, we will just need to care
about the function value at each
, which is:
And by multiplying the function values
evaluated at each prime factor together,
we are done with the proof.
This method is completely applicable to
:
just by letting ,
and then we can see
as expected.
And the case of
is just as simple as:
In this way, we've derived the
closed form for .
Characterization of completely multiplicative functions
We would like to show the following
fact about completely multiplicative
functions that:
Proof. Noted that
have already been completely
multiplicative, and recall how we
compute , we will
do the analogous thing as:
As you can see, it's completely due
to 's being completely multiplicative
that we can even merge
into ,
then it turns out to be
.
Conversely, when ,
given that is already multiplicative,
in order to show is completely
multiplicative, it suffices to show
, and consider:
It turns out to be
,
which can be recursively reduced
down to ,
thus is completely multiplicative
when .
Besides the identity function
, an
important example of completely
multiplicative function is the
Liouville's function
:
Definition (Liouville's Function)
Since it's completely multiplicative,
we can see ,
and thus .
And by direct computation we have:
And by gathering function values
at all points we have:
So turns out to
be a square indicator for input
positive integer.
The identity function can also be
extend to yield the -th power
of input integer:
Definition (-th Power Function)
Clearly this function is completely
multiplicative, so we have
.
Let ,
which is sometimes called the
Jordan's totient function,
it's just the similar case as
that:
The summation
is multiplicative, and at each prime
factor value we have
.
And its Dirichlet inverse, however, is
,
or explicitly
.
The completely multiplicative functions
also exhibit some sort of endomorphic
behaviour:
Proof. We can simply expand
into:
So we can clearly see it's
.
Actually the convolution need not to be
constrained to the divisor of the integer:
define to be the set of
functions in the form of
such that ,
and we can define the generalized
convolution as:
Definition (Generalized Convolution)
Clearly there's
and it's again. We
are going to show such kind of
convolution is tightly associated with
the Dirichlet convolution.
Associativity of generalized convolution
We are going to show there's some
kind of "actions" of
on by associativity:
Theorem (Associativity of Generalized Convolution)
Proof. It's very simple to get
to a point that:
Where
is the set of indices we are iterating
over, and it's a finite set.
It's good to know that
.
Obviously there's
.
Let's define
,
and then:
And consider the relationship
defined by
.
One can easily verify that
is an equivalence relationship,
which will partition
into a series of disjoint sets
.
The next problem is, what are
the possible in .
Just by setting and we
can easily see
,
so there's
.
Apparently it's impossible for
, and if ,
there's some
such that , but
,
which is a contradiction. In this way,
we can have faith in
.
And is iterable by
:
while intuitively the should
be a subset in the proposed form
above, it suffices to show
.
Obviously there's
. And for
, we have
,
and since ,
it's only possible that
.
Now we are donw with
.
This will enable us to partition and
reorder the summands as:
By now, we've proved
.
When is trimmed to the
unit group ,
then the generalized convolution is
realized as the group action of
on :
In this way, we've established the
connection between Dirichlet convolution
and the generalized convolution.
Convolution with floor function
One of the most commonly used case of
generalized convolution is related to
the floor function
defined by .
Consider the generalized convolution:
First, we would like to show:
Proof. Let's think oppositely:
what's the range of such that
for
some ? It
turns out to be .
When
,
we have
.
Since is a
positive integer, we can make a
narrower range that
.
Let's take
,
then
,
so it's
.
It turns out that we will just need
to care about the function value at
each positive integer in order to study
.
For convenience, we may define the
constant function
for any , such that:
Then, the most prominent use case
of generalized convolution related
to the floor function is rendered
as below:
By the associativity of
generalized convolution, the
theorem suffices to show:
Proof. Expand the expression
from
up to yields:
The left hand sides add up to
obviously, while the right hand sides
are made purely of . Each
must have appeared for at least
once at , but
how many times exactly does
appear among all these identities?
Well, clearly must appear
and only appear at
when , and in the range
of positive integers from up
to , there're exactly
of
them, in the form of
So eventually the summation can
be rewritten as:
By now, we are done with
the proof.
Corollary
An alternative way to prove is to consider:
Computational Dirichlet hyperbola method
This method derives from Dirichlet
hyperbola method of analytic number
theory, and is trimmed to fit concrete
implementation of computation of some
generalized convolution.
Choose any
such that , consider
the following expression:
Where each summand
corresponds to the lattice point
below the hyperbola .
This is how the method got its name.
For convenience, let
,
noted that there's
,
and we would like to claim:
Theorem (Computational Dirichlet Hyperbola Method)
Proof. Let
be the collection of points we are
summing over, they can be partitioned
into the following regions:
With
,
counting multiplicity of elements
to be exactly once, by
inclusion-exclusion principle.
Do summation in every these regions:
And put them back to
,
we have:
Now we are done with the proof.
With the computational Dirichlet
hyperbola method, we can compute
miracly fast,
in a time bound within :
Proof. This is done by simply
letting , noted that
, and
.
Or explicitly written in python:
def hyperbolic_sum(N: int) -> int:
import math
u = math.isqrt(N)
return 2 * sum(N // i for i in range(1, u+1)) - u * u
The stupidest way is to compute
at each point and sum it from up
to , which will take
steps if it takes to
compute each , while there's
;
the less stupid way is to do the summation
of from
up to , which takes steps;
and the wisest way among them is to
apply the computational Dirichlet
hyperbola method, which can be completed
within steps.
Meantime, there's a common technique
known as the "number theoretic block
summation" by Chinese competitive
programmers for solving this problem,
and we can show it's also an algorithm
of .
First, for any , it's
easy to see there're
values of
such that
:
just think of mirroring the hyperbola
around , and
points on the line ,
below now correspond
to each value of such that
.
Then, the algorithm of number theoretic
block summation evaluating
is shown as below:
Example (Number Theoretic Block Summation)
- Initialize .
- If , then output as the
computed result of
and halt.
- Evaluate ,
record the previous value of as
and update ,
then update ,
and go back to step 2.
Or explicitly written in python:
def number_theoretic_block_sum(N: int) -> int:
n, s = 1, 0
while n <= N:
d, np = N // n, n
n = (N // d) + 1
s += d * (n - np)
return s
To show it's an algorithm of ,
it suffices to show:
Lemma
There're at most
distinct
values of
.
Proof. In the range of
,
the values of
drops
drastically. In fact, obviously there's
,
but the difference
implies ,
so it's impossible for the values
of to
coincide in that range. We can
conclude there're exactly
distinct
values of
in the range of
.
In the range of
,
we use the symmetry of
around , noted that stride of
is always
greater than in the range before it's
mirrored, so there must be at least one
value of such that
,
and there're at exactly
distinct values of
in the range of
.
The point
is on when is a
square, in this case there're
distinct
values of .
Otherwise there're
distinct values.
While the computational Dirichlet
hyperbola method has speed
supremacy over the block summation later
comer, I've once come across to adapt
the concept of block summation to
accelerate the hyperbola method more
furtherly. But the analysis above have
just showed such acceleration is
impossible, since both
and
have at least
and
summands to go, thus the time bound
is stiff.
"Lord Du Sieve"
Fancy as it is when computing
, one may find
it very difficult and not intuitive to
apply the same rationale to compute
or
. For such
scenario, there's another easy-to-go
method commonly referred as
"Lord Du Sieve",
by Chinese competitive programmers,
which is also based on some
transformation on the generalized
convolution formula.
Let be a decomposition
of , and denote
,
for convenience. Again, by the property
,
we will find:
In the last section, we've proved
there're at most
distinct values of
.
And if we apply a bit of concept of
block summation, we will have:
Where
are the distinct values of
we
use to gather summands.
Noted that only when
will we have
(recall the descending rate of
in the interval
),
so we can pluck that term off
the summation, and then:
By now, we've reached the
essence of the "Lord Du Sieve":
Corollary ("Lord Du Sieve")
If we are able to compute
and the interval sum
for arbitrary
range fastly (ideally
within ), then we will be
able to compute the value of
fastly based on the previously
evaluated values of
.
We will analyze how fast is it later.
Evaluating
will envolve evaluting some
.
But noted that
,
they must have been evaluated
if we evaluate s by the
ascending order of the elements
in .
An easy template for solving
problem written in python is:
from typing import *
T = TypeVar("T")
def lord_du_sieve(
N: int, F: Callable[T, T],
Sg: Callable[[int, int], T],
divG1: Callable[[T], T],
) -> T:
H: Dict[int, T] = dict()
l = N
while l > 0:
m = N // l
n, s = 2, 0
while n <= m:
d, np = m // n, n
n = (m // d) + 1
s += Sg(np, n-1) * H[d]
H[m] = divG1(F(m) - s)
l = N // (m+1)
return H[N]
The most difficult might be generating
the sequence of elements in
in ascending order, or to climb up the
hyperbola from right
to left. To do so, we set the cursor
to at start, corresponding
to .
To obtain the next value of ,
we try to increment the value of
by
, which may or may not in
. But don't worry,
we know
is the upper bound of such that
.
There must be , otherwise
,
which is a contradiction. Actually,
setting as the upper bound
excludes all such that
,
pointing the upper bound of
such that
.
And if
is in , there must be
.
So is the next value of .
We are familiar with
,
and we can substitute them into
the template written in python:
def lord_du_sieve_mu(N: int) -> int:
H: Dict[int, int] = dict()
l = N
while l > 0:
m = N // l
n, s = 2, 0
while n <= m:
d, np = m // n, n
n = (m // d) + 1
s += (n - np) * H[d]
H[m] = 1 - s
l = N // (m+1)
return H[N]
def lord_du_sieve_phi(N: int) -> int:
H: Dict[int, int] = dict()
l = N
while l > 0:
m = N // l
n, s = 2, 0
while n <= m:
d, np = m // n, n
n = (m // d) + 1
s += (n - np) * H[d]
H[m] = (m * (m+1)) // 2 - s
l = N // (m+1)
return H[N]
For the time complexity, assuming it
takes steps
to sum each , then the total steps
taken to compute is
,
and for each part:
So the "Lord Du Sieve" is a way to
complete the summation within the time
bound of . However,
when combined with the Linear sieve, it
can be furtherly optimized to a time
bound of , which
will be discussed in a follow-up topic.
Conclusion
In this text, we start with the Dirichlet
ring, which is formed by the number
theoretic functions under addition and
Dirichlet convolution, and study the
multiplicativity and invertibility of
the number theoretic functions.
Then, we study the summation fo number
theoretic functions loomed over the
factors of the input positive integer.
Such summation can be viewed as Dirichlet
convolution with constant function ,
and is strongly tied with the Möbius
inversion formula, which suffices to
prove Möbius function to be the
Dirichlet inversion of . We have
also discussed summations associated
with Möbius function, in which the
property of Möbius function's yielding
non-zeroes only on square free has
been exploited.
Finally, we enlarge the Dirichlet
convolution to generalized convolution,
which looms over to the floor of
the input positive number , in which
the -th summands is
, where
is taken from the set of generalized
functions. The generalized convolution
can be proved to be associative with
Dirichlet convolution,and can be viewed
as some kind of "action" of the Dirichlet
ring on the set of generalized functions.
Among the generalized convolutions, the
most special case is the prefix sum,
which can be realized as constant function
convolute with generalized function
, resulting in the
floor function .
Such kind of summation is tied with
the hyperbola: by deliberatedly
partitioning the summands and apply
some inclusion-exclusion principle we
get the computational Dirichlet
hyperbola method; by leveraging the
property of
has at most
distinct values we get the number
theoretic block summation and "Lord
Du Sieve". Both yields us sublinear
ways to compute the prefix sums.
While this text is among the
introductory texts to elementary number
theory, one should find how helpful
it is to adapt method of algebra, in
many different ways.