Number Theoretic Functions and Dirichlet Ring

The number theoretic functions equipped with Dirichlet convolution and prefix sum, form one of the most important topics in elementary number theory.

In this text, we are going to study the number theoretic functions, however this is done from algebraic perspective by studying the Dirichlet ring, which they form under addition and Dirichlet convolution.

Actually, one who gets the chance to study both elementary number theory and algebra in parallel will find how convenience and simple it is by employing some algebra.

Dirichlet ring

In elementary number theory, we are specially interested in such kind of functions:

Definition (Number Theoretic Function)

Any function in the form of is called a number theoretic function or an arithmetic function.

For convenience, we may denote the set of all number theoretic functions as .

So a number theoretic function takes in a positive integer and outputs a complex number. Obviously, we can combine them using appropriate field operations of to obtain another number theoretic function.

And we are specially interested in number theoretic functions obtained from the Dirichlet convolution of , which is defined as:

Definition (Dirichlet Convolution)

Mind the symmetry when they loom over the divisors of , there's clearly .

Here are some common notations of number theoretic functions we will use in this text:

Definition (Constant Function)

For each complex number , we define the constant function as to yield for every input integer, that is:

Definition (Identity Function)

The identity function yields the input integer itself, that is,

Definition (Unit Function)

The unit function yields for input and for all others, that is:

Just by substituting in the convolution's formula, we will see there's . Actually we want to claim:

Assertion

The is a commutative ring with zero and unity .

And call or simply the Dirichlet ring.

Proof of ring properties

It's trivial to show and verify is an abelian group, with group identity while , and Dirichlet convolution is distributive over addition by . However, it's very non-trivial to show the Dirichlet convolution is associative, and we are going to see it right now.

Theorem (Associativity)

Proof. It's very simple to get to a point that:

Where is the set of indices we are iterating over, and it's a finite set.

By we have and . This enables us to consider the map defined by , clearly by and it's injective, and thus bijective since it's on the same finite set.

In this way, is a permutation on , and thus permutation on summands of , so we have:

By now Dirichlet convolution has been shown to be associative.

Now we've checked against all ring axioms, and proved it to be a ring as is claimed by us.

Furtherly, it's also simple to show that:

Theorem

The Dirichlet ring is an integral domain.

Proof. Assume it's the contrary that .

Clearly there're such that and . Then we have:

Which is a contradiction to . So there's no zero divisor in and is an integral domain.

Multiplicativity

The number theoretic functions that are multiplicative draw most of our interest.

To begin, define counting function as to yield the number of divisors for each input integer, and the divisor sum function as to yield the sum of all divisors of the input integer. They are also expressible using the language of Dirichlet convolution, as:

Definition

But how can we evaluate them?

Let be the unique factorization of over the ring of , and consider:

From the combinatorial point of view, when we expand it into mononomials, each term of the expansion corresponds to a divisor of . And all of the terms has coefficient since all primes are distinct and is UFD. So the number of terms are the value of , and the direct evaluation of the expression is the value of , which are clearly evaluated to:

Corollary

Consider the connection between and , or the one between and , for any . At the first glance, there's no special connection between them, but when , let , it turns out there's no common prime in their factorizations. So we have and thus:

This leads to characterization of certain kind of number theoretic functions:

Definition (Multiplicative)

Number theoretic function is said to be multiplicative, if there's:

It's furtherly said to be completely multiplicative, if even is not required in order for to be true.

For convenience, we may denote the set of all multiplicative number theoretic functions as .

Consider the function value of multiplicative , we have:

Lemma

Proof. Since we have:

And in the field the complex number , if , then at this point is invertible so we must have .

Otherwise it's the case of , including , rendering .

We would like to show that multiplicativity is closed under Dirichlet convolution:

Theorem (Preserve by Operation)

Proof. First, one must notice and vice versa, otherwise at this point there's . Meantime there's and vice versa.

In this way, combinatorially all divisors of is generated by , and we may have , this will lead us to:

Then, noted that there're and all along the way, and by the multiplicativity of we have , , thus we have:

So is again multiplicative.

A common misconception is to think of 's being a subring of , while in fact:

Warning

The multiplicative number theoretic functions do not form a subring of .

Proof.

But is still a multiplicatively closed set, so localization is feasible for .

Dirichlet inverse

Conventionally, the inverse of a number theoretic functions in is called the Dirichlet inverse of that function. We claim that:

Theorem (Existence of Inverse)

Proof. If the Dirichlet inverse of would like to exist, have to be evaluated to for every input positive integer.

First consider , it's impossible for to exist in if , and so does ; otherwise at this point there must be .

Then, we prove by mathematical induction: assume prior to the point of , the function values have existed; and at the point of we have:

In this way, we are guaranteed to find value of at every point of whenever , is well-defined and thus exists.

Clearly every non-zero multiplicative number theoretic function has a non-zero function value at , and is thus invertible. We would like to show its inverse is also multiplicative:

Theorem (Preserve by Inversion)

Proof. First, choose any and any , we can do the following partition of summations:

Then, we prove by mathematical induction: it's obvious that ; assume prior to the point of with we already have ; and at the point of , consider , we have:

Thus can be simplified into , which implies , and is multiplicative at .

Gathering the fact that exhibits multiplicativity for every input positive integer, we've shown .

We may use the conventional notation from ring theory to represent the unit group of . Clearly fulfils all group axioms and forms a subgroup of now.

Möbius inversion formula

One of the most important conclusion from elementary number theory is the Möbius Inversion Formula, which states:

Where is the Möbius function. For every input positive integer with its unique factorization , Möbius function is defined as:

Definition (Möbius Function)

The formula can also be reinterpreted using the language of Dirichlet ring that:

Assertion (Möbius Inversion Formula)

Which is much more simpler and conciser than its elementary form.

Proof of the formula

Actually it suffices to show is the Dirichlet inverse of that:

Theorem

Proof. Instead of verifying directly, it would be more interesting to detour and see whether we can derive Dirichlet inversion of directly, using the theorems we've just proved in last section.

We may let the inversion of be . Since is multiplicative, must also be multiplicative, so we just need to check function value evaluated at each prime power , which is:

At the point of , we can see it's ; but at the point of , there's , and assume there has already been prior to the point of , then at the point of we have:

So the value of evaluated at each prime power is:

So clearly there's , and thus when we combine and verify the function value of at each point of .

Corollary

Which is the direct consequence of multiplicativity of and .

Application on Euler's totient function

We are familiar with the Euler's totient function , which is defined to be . And we are here to provide alternative perspective from number theoretic functions and Dirichlet ring, which gives us a closed form of the function.

First we would like to show the following theorem, which is due to Gauss:

Theorem (Gauss)

Proof. Consider the additive group , for every element inside we can evaluate , and it's . Meantime, there's , mapping every to a point in . So we can let there be the set , and the map defined by .

The map is injective. Otherwise consider any such that . They are mapped to the same point in , so there has to be . which can be rewritten as . By multiplying on both sides we have , and thus , which is a contradiction to .

The map is surjective, since for every point , we claim there's . First, since , there has to be . Then pick the common factor out and we can see . Substituting into the definition of and we will get .

By now, we've proved to be a bijection between and . There're clearly elements in , while is the composition of elements in , and there're totally of them. In this way, we have:

Or simply , and we are done with the proof.

One can easily see is multiplicative, and thus is also multiplicative. And for each input positive integer with its unique factorization , we claim there's:

Theorem

Proof. Although one can derive by directly arguing there're elements divisible by in , it can also be derived from Möbius inversion formula that:

Then by the multiplicativity of we have:

In this way we derive the closed form of the Euler's totient function.

We can also simply write where represents the set of all primes, when the exponent of each prime in is unrelated.

Corollary

It won't be hard to derive the probability of picking up a integer that is coprime to depends on purely the primes that is composed of, and unrelated to the exponent of each prime, when we pick integer uniformly, and noticing there's . This also can be proved by applying inclusion-exclusion principle, arguing they must lie outside the integers divisible by prime factor of , and we can see their conclusions match.

Dirichlet inverse of Euler's totient function

It won't be hard to see there's , but just like its original form, we we also have a closed form for .

First, we would like to show there's a transformation about that:

Lemma

Proof. By applying the formula of Dirichlet inverse, at each , we have:

And obviously it's the case of .

Then, we can show there's a general way to evaluate when is multiplicative:

Lemma

Proof. Obviously we have:

So we have , and therefore . In this way, we will just need to care about the function value at each , which is:

And by multiplying the function values evaluated at each prime factor together, we are done with the proof.

This method is completely applicable to : just by letting , and then we can see as expected.

And the case of is just as simple as:

Corollary

In this way, we've derived the closed form for .

Characterization of completely multiplicative functions

We would like to show the following fact about completely multiplicative functions that:

Theorem

Proof. Noted that have already been completely multiplicative, and recall how we compute , we will do the analogous thing as:

As you can see, it's completely due to 's being completely multiplicative that we can even merge into , then it turns out to be .

Conversely, when , given that is already multiplicative, in order to show is completely multiplicative, it suffices to show , and consider:

It turns out to be , which can be recursively reduced down to , thus is completely multiplicative when .

Besides the identity function , an important example of completely multiplicative function is the Liouville's function :

Definition (Liouville's Function)

Since it's completely multiplicative, we can see , and thus .

And by direct computation we have:

And by gathering function values at all points we have:

So turns out to be a square indicator for input positive integer.

The identity function can also be extend to yield the -th power of input integer:

Definition (-th Power Function)

Clearly this function is completely multiplicative, so we have . Let , which is sometimes called the Jordan's totient function, it's just the similar case as that:

The summation is multiplicative, and at each prime factor value we have . And its Dirichlet inverse, however, is , or explicitly .

The completely multiplicative functions also exhibit some sort of endomorphic behaviour:

Theorem

Proof. We can simply expand into:

So we can clearly see it's .

Generalized convolution formula

Actually the convolution need not to be constrained to the divisor of the integer: define to be the set of functions in the form of such that , and we can define the generalized convolution as:

Definition (Generalized Convolution)

Clearly there's and it's again. We are going to show such kind of convolution is tightly associated with the Dirichlet convolution.

Associativity of generalized convolution

We are going to show there's some kind of "actions" of on by associativity:

Theorem (Associativity of Generalized Convolution)

Proof. It's very simple to get to a point that:

Where is the set of indices we are iterating over, and it's a finite set.

It's good to know that . Obviously there's . Let's define , and then:

And consider the relationship defined by . One can easily verify that is an equivalence relationship, which will partition into a series of disjoint sets .

The next problem is, what are the possible in . Just by setting and we can easily see , so there's . Apparently it's impossible for , and if , there's some such that , but , which is a contradiction. In this way, we can have faith in .

And is iterable by : while intuitively the should be a subset in the proposed form above, it suffices to show . Obviously there's . And for , we have , and since , it's only possible that . Now we are donw with .

This will enable us to partition and reorder the summands as:

By now, we've proved .

When is trimmed to the unit group , then the generalized convolution is realized as the group action of on :

Corollary

In this way, we've established the connection between Dirichlet convolution and the generalized convolution.

Convolution with floor function

One of the most commonly used case of generalized convolution is related to the floor function ¹ defined by . Consider the generalized convolution:

First, we would like to show:

Lemma

Proof. Let's think oppositely: what's the range of such that for some ? It turns out to be .

When , we have . Since is a positive integer, we can make a narrower range that . Let's take , then , so it's .

Corollary

It turns out that we will just need to care about the function value at each positive integer in order to study .

For convenience, we may define the constant function for any , such that:

Then, the most prominent use case of generalized convolution related to the floor function is rendered as below:

Theorem

By the associativity of generalized convolution, the theorem suffices to show:

Lemma

Proof. Expand the expression from up to yields:

The left hand sides add up to obviously, while the right hand sides are made purely of . Each must have appeared for at least once at , but how many times exactly does appear among all these identities?

Well, clearly must appear and only appear at when , and in the range of positive integers from up to , there're exactly of them, in the form of

So eventually the summation can be rewritten as:

By now, we are done with the proof.

Corollary

An alternative way to prove is to consider:

Computational Dirichlet hyperbola method

This method derives from Dirichlet hyperbola method of analytic number theory, and is trimmed to fit concrete implementation of computation of some generalized convolution.

Choose any such that , consider the following expression:

Where each summand corresponds to the lattice point below the hyperbola . This is how the method got its name.

For convenience, let , noted that there's , and we would like to claim:

Theorem (Computational Dirichlet Hyperbola Method)

Proof. Let be the collection of points we are summing over, they can be partitioned into the following regions:

With , counting multiplicity of elements to be exactly once, by inclusion-exclusion principle.

Do summation in every these regions:

And put them back to , we have:

Now we are done with the proof.

With the computational Dirichlet hyperbola method, we can compute miracly fast, in a time bound within :

Example

Proof. This is done by simply letting , noted that , and .

Or explicitly written in python:

def hyperbolic_sum(N: int) -> int:
	import math
	u = math.isqrt(N)
	return 2 * sum(N // i for i in range(1, u+1)) - u * u

The stupidest way is to compute at each point and sum it from up to , which will take steps if it takes to compute each , while there's ; the less stupid way is to do the summation of from up to , which takes steps; and the wisest way among them is to apply the computational Dirichlet hyperbola method, which can be completed within steps.

Meantime, there's a common technique known as the "number theoretic block summation" by Chinese competitive programmers for solving this problem, and we can show it's also an algorithm of .

First, for any , it's easy to see there're values of such that : just think of mirroring the hyperbola around , and points on the line , below now correspond to each value of such that .

Then, the algorithm of number theoretic block summation evaluating is shown as below:

Example (Number Theoretic Block Summation)

Initialize .
If , then output as the computed result of and halt.
Evaluate , record the previous value of as and update , then update , and go back to step 2.

Or explicitly written in python:

def number_theoretic_block_sum(N: int) -> int:
	n, s = 1, 0
	while n <= N:
		d, np = N // n, n
		n = (N // d) + 1
		s += d * (n - np)
	return s

To show it's an algorithm of , it suffices to show:

Lemma

There're at most distinct values of .

Proof. In the range of , the values of drops drastically. In fact, obviously there's , but the difference implies , so it's impossible for the values of to coincide in that range. We can conclude there're exactly distinct values of in the range of .

In the range of , we use the symmetry of around , noted that stride of is always greater than in the range before it's mirrored, so there must be at least one value of such that , and there're at exactly distinct values of in the range of .

The point is on when is a square, in this case there're distinct values of . Otherwise there're distinct values.

While the computational Dirichlet hyperbola method has speed supremacy over the block summation later comer, I've once come across to adapt the concept of block summation to accelerate the hyperbola method more furtherly. But the analysis above have just showed such acceleration is impossible, since both and have at least and summands to go, thus the time bound is stiff.

"Lord Du Sieve"

Fancy as it is when computing , one may find it very difficult and not intuitive to apply the same rationale to compute or . For such scenario, there's another easy-to-go method commonly referred as "Lord Du Sieve", by Chinese competitive programmers, which is also based on some transformation on the generalized convolution formula.

Let be a decomposition of , and denote , for convenience. Again, by the property , we will find:

In the last section, we've proved there're at most distinct values of . And if we apply a bit of concept of block summation, we will have:

Where are the distinct values of we use to gather summands.

Noted that only when will we have (recall the descending rate of in the interval ), so we can pluck that term off the summation, and then:

By now, we've reached the essence of the "Lord Du Sieve":

Corollary ("Lord Du Sieve")

If we are able to compute and the interval sum for arbitrary range fastly (ideally within ), then we will be able to compute the value of fastly based on the previously evaluated values of . We will analyze how fast is it later.

Evaluating will envolve evaluting some . But noted that , they must have been evaluated if we evaluate s by the ascending order of the elements in .

An easy template for solving problem written in python is:

from typing import *
T = TypeVar("T")

def lord_du_sieve(
	N: int, F: Callable[T, T],
	Sg: Callable[[int, int], T],
	divG1: Callable[[T], T],
) -> T:
	H: Dict[int, T] = dict()
	l = N
	while l > 0:
		m = N // l
		n, s = 2, 0
		while n <= m:
			d, np = m // n, n
			n = (m // d) + 1
			s += Sg(np, n-1) * H[d]
		H[m] = divG1(F(m) - s)
		l = N // (m+1)
	return H[N]

The most difficult might be generating the sequence of elements in in ascending order, or to climb up the hyperbola from right to left. To do so, we set the cursor to at start, corresponding to . To obtain the next value of , we try to increment the value of by , which may or may not in . But don't worry, we know is the upper bound of such that . There must be , otherwise , which is a contradiction. Actually, setting as the upper bound excludes all such that , pointing the upper bound of such that . And if is in , there must be . So is the next value of .

We are familiar with , and we can substitute them into the template written in python:

def lord_du_sieve_mu(N: int) -> int:
	H: Dict[int, int] = dict()
	l = N
	while l > 0:
		m = N // l
		n, s = 2, 0
		while n <= m:
			d, np = m // n, n
			n = (m // d) + 1
			s += (n - np) * H[d]
		H[m] = 1 - s
		l = N // (m+1)
	return H[N]

def lord_du_sieve_phi(N: int) -> int:
	H: Dict[int, int] = dict()
	l = N
	while l > 0:
		m = N // l
		n, s = 2, 0
		while n <= m:
			d, np = m // n, n
			n = (m // d) + 1
			s += (n - np) * H[d]
		H[m] = (m * (m+1)) // 2 - s
		l = N // (m+1)
	return H[N]

For the time complexity, assuming it takes steps to sum each , then the total steps taken to compute is , and for each part:

So the "Lord Du Sieve" is a way to complete the summation within the time bound of . However, when combined with the Linear sieve, it can be furtherly optimized to a time bound of , which will be discussed in a follow-up topic.

Conclusion

In this text, we start with the Dirichlet ring, which is formed by the number theoretic functions under addition and Dirichlet convolution, and study the multiplicativity and invertibility of the number theoretic functions.

Then, we study the summation fo number theoretic functions loomed over the factors of the input positive integer. Such summation can be viewed as Dirichlet convolution with constant function , and is strongly tied with the Möbius inversion formula, which suffices to prove Möbius function to be the Dirichlet inversion of . We have also discussed summations associated with Möbius function, in which the property of Möbius function's yielding non-zeroes only on square free has been exploited.

Finally, we enlarge the Dirichlet convolution to generalized convolution, which looms over to the floor of the input positive number , in which the -th summands is , where is taken from the set of generalized functions. The generalized convolution can be proved to be associative with Dirichlet convolution,and can be viewed as some kind of "action" of the Dirichlet ring on the set of generalized functions.

Among the generalized convolutions, the most special case is the prefix sum, which can be realized as constant function convolute with generalized function , resulting in the floor function . Such kind of summation is tied with the hyperbola: by deliberatedly partitioning the summands and apply some inclusion-exclusion principle we get the computational Dirichlet hyperbola method; by leveraging the property of has at most distinct values we get the number theoretic block summation and "Lord Du Sieve". Both yields us sublinear ways to compute the prefix sums.

While this text is among the introductory texts to elementary number theory, one should find how helpful it is to adapt method of algebra, in many different ways.

Also known as the "greatest integer function" in some books. ^[return]

January 7, 2024

algebra number-theory ring-theory