Markov birth-and-death dynamics of populations

Spatial birth-and-death processes with a finite number of particles are obtained as unique solutions to certain stochastic equations. Conditions are given for the existence and uniqueness of such solutions, as well as for continuous dependence on the initial conditions. The possibility of an explosion and the connection with the heuristic generator of the process are discussed.


Introduction
This article deals with spatial birth-and-death processes which may describe the stochastic dynamics of a spatial population. Specifically, at each moment of time the population is represented as a collection of motionless points in R^d. We interpret the points as particles, or individuals. Existing particles may die and new particles may appear. Each particle is characterized by its location.
The state space of a spatial birth-and-death Markov process on R^d with a finite number of points is the space of finite configurations over R^d,

Γ_0(R^d) = {η ⊂ R^d : |η| < ∞},

where |η| is the number of points of η.
Denote by B(R^d) the Borel σ-algebra on R^d. The evolution of a spatial birth-and-death process in R^d admits the following description. Two functions characterize the development in time: the birth rate coefficient b : R^d × Γ_0(R^d) → [0; ∞) and the death rate coefficient d : R^d × Γ_0(R^d) → [0; ∞). If the system is in state η ∈ Γ_0(R^d) at time t, then the probability that a new particle appears (a "birth") in a bounded set B ∈ B(R^d) over the time interval [t; t + ∆t] is

∆t ∫_B b(x, η) dx + o(∆t),

the probability that a particle x ∈ η is deleted from the configuration (a "death") over the time interval [t; t + ∆t] is ∆t d(x, η) + o(∆t), and no two events happen simultaneously. By an event we mean a birth or a death. Using a slightly different terminology, we can say that the rate at which a birth occurs in B is ∫_B b(x, η) dx, the rate at which a particle x ∈ η dies is d(x, η), and no two events happen at the same time.
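To make the rates concrete, the following sketch computes the cumulative birth rate ∫_B b(x, η) dx and the cumulative death rate Σ_{x∈η} d(x, η) for an illustrative contact-type birth kernel and unit death rate in dimension d = 1; the kernel, its parameters, and the integration routine are our own choices for illustration, not taken from this paper.

```python
import math

# Illustrative (not the paper's) rate coefficients in dimension d = 1:
# a contact-type birth kernel b(x, eta) = lam * sum_{y in eta} a(x - y),
# with a the standard Gaussian density, and a constant death rate d == 1.
lam = 2.0

def a(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)  # integrates to 1 over R

def b(x, eta):
    return lam * sum(a(x - y) for y in eta)

def d(x, eta):
    return 1.0

def cumulative_birth_rate(eta, lo=-30.0, hi=30.0, n=60000):
    """Midpoint-rule approximation of int_B b(x, eta) dx over B = [lo, hi]."""
    dx = (hi - lo) / n
    return sum(b(lo + (k + 0.5) * dx, eta) for k in range(n)) * dx

def cumulative_death_rate(eta):
    return sum(d(x, eta) for x in eta)

eta = [0.0, 1.5, -0.7]               # a finite configuration in R^1
B_rate = cumulative_birth_rate(eta)  # since a integrates to 1: lam * |eta| = 6
D_rate = cumulative_death_rate(eta)  # |eta| = 3
```

Since the Gaussian kernel integrates to one, the cumulative birth rate over all of R equals λ|η| exactly, which the quadrature recovers up to discretization error.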
Such processes, in which the birth and death rates depend on the spatial structure of the system as opposed to classical Z + -valued birth-and-death processes (see e.g. [KM59], [CG], [Har63, Page 116], [AN72, Page 109], and references therein), were first studied by Preston in [Pre75]. A heuristic description similar to that above appeared already there. Our description resembles the one in [GK06].
The (heuristic) generator of a spatial birth-and-death process should be of the form

(LF)(η) = ∫_{R^d} (F(η ∪ x) − F(η)) b(x, η) dx + Σ_{x∈η} (F(η \ x) − F(η)) d(x, η),   (1)

for F in an appropriate domain, where η ∪ x and η \ x are shorthands for η ∪ {x} and η \ {x}, respectively.
Spatial point processes have been used in statistics for simulation purposes, see e.g. [MS94], [MW04,chapter 11] and references therein. For application of spatial and stochastic models in biology see e.g. [Lev03], [FOK + 14], and references therein.
To construct a spatial birth-and-death process with given birth and death rate coefficients, we consider in Section 2 stochastic equations with Poisson-type noise,

η_t(B) = ∫_{(0;t]×B×[0;∞)} I_{[0; b(x, η_{s−})]}(u) dN_1(x, s, u) − Σ_i ∫_{(0;t]×[0;∞)} I{x_i ∈ η_{r−}} I_B(x_i) I_{[0; d(x_i, η_{r−})]}(v) dN_2(i, r, v) + η_0(B),   (2)

where (η_t)_{t≥0} is a suitable Γ_0(R^d)-valued cadlag stochastic process, the "solution" of the equation, I_A is the indicator function of the set A, B ∈ B(R^d) is a Borel set, N_1 is a Poisson point process on R^d × R_+ × R_+ with intensity dx × ds × du, N_2 is a Poisson point process on Z × R_+ × R_+ with intensity # × dr × dv, # is the counting measure on Z, η_0 is a (random) initial finite configuration, b, d : R^d × Γ_0(R^d) → [0; ∞) are functions measurable with respect to the product σ-algebra B(R^d) × B(Γ_0(R^d)), and {x_i} is some collection of points satisfying η_s ⊂ {x_i} for every moment of time s (the precise definition is given in Section 1.3.1). We require the processes N_1, N_2, η_0 to be independent of each other. Equation (2) is understood in the sense that the equality holds a.s. for all bounded B ∈ B(R^d) and t ≥ 0.
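The role of the noise N_1 in such equations is a thinning: an atom (s, x, u) of N_1 triggers a birth at x at time s precisely when the indicator I_{[0; b(x, η_{s−})]}(u) equals 1. A minimal sketch of this acceptance step, with a made-up rate function and handcrafted atoms (neither taken from the paper):

```python
# One thinning step of the birth mechanism driven by N1: an atom (s, x, u)
# produces a birth at x at time s iff u <= b(x, eta_{s-}).

def b(x, eta):
    # toy birth rate: crowding suppresses births near existing particles
    return max(0.0, 2.0 - sum(1.0 for y in eta if abs(x - y) < 1.0))

def accepted_births(atoms, eta):
    """Return the atoms (s, x, u) accepted by the indicator I_{[0, b(x,eta)]}(u),
    for a frozen configuration eta."""
    return [(s, x, u) for (s, x, u) in atoms if u <= b(x, eta)]

eta = [0.0]
atoms = [(0.1, 0.5, 0.5),   # near 0.0: b = 1.0, u = 0.5 -> accepted
         (0.2, 0.5, 1.5),   # near 0.0: b = 1.0, u = 1.5 -> rejected
         (0.3, 5.0, 1.9)]   # far away: b = 2.0, u = 1.9 -> accepted
acc = accepted_births(atoms, eta)
```

In the actual equation the configuration is updated after every event; the frozen-configuration version above isolates only the acceptance rule.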
Garcia and Kurtz studied in [GK06] equations similar to (2) for infinite systems. In the earlier work [Gar95] of Garcia, another approach was used: birth-and-death processes were obtained as projections of Poisson point processes. A further development of the projection method appears in [GK08]. Fournier and Méléard in [FM04] considered a similar equation for the construction of the Bolker-Pacala-Dieckmann-Law process with finitely many particles.
Holley and Stroock [HS78] constructed a spatial birth-and-death process as a Markov family of unique solutions to the corresponding martingale problem. For the most part, they consider a process contained in a bounded volume, with bounded birth and death rate coefficients. They also proved the corresponding result for the nearest neighbor model in R 1 with an infinite number of particles.
Kondratiev and Skorokhod [KS06] constructed a contact process in the continuum, with an infinite number of particles. The contact process can be described as a spatial birth-and-death process with rates

d(x, η) ≡ 1,   b(x, η) = λ Σ_{y∈η} a(x − y),

where λ > 0 and 0 ≤ a ∈ L^1(R^d). Under some additional assumptions, they showed existence of the process for a broad class of initial conditions. Furthermore, if the value of some energy functional on the initial condition is finite, then it stays finite at any point in time.
In the aforementioned references as well as in the present work the evolution of the system in time via Markov process is described. An alternative approach consists in using the concept of statistical dynamics that substitutes the notion of a Markov stochastic process. This approach is based on considering evolutions of measures and their correlation functions. For details see e.g. [FKK12a], [FKK14], and references therein.
There is an enormous amount of literature concerning interacting particle systems on lattices and related topics (e.g., [Lig85], [Lig04], [KL99], [Ald13], [Fra14], [Spi77], etc.). Penrose in [Pen08] gives a general existence result for interacting particle systems on a lattice with local interactions and bounded jump rates (see also [Lig85, Chapter 9]). The spin space is allowed to be non-compact, which gives the opportunity to incorporate spatial birth-and-death processes in the continuum. Unfortunately, the assumptions become rather restrictive when applied to continuous-space models. More specifically, the birth rate coefficient should be bounded, and for every bounded Borel set B a certain expression should be bounded uniformly in η ∈ Γ(R^d).
Let us briefly describe the contents of the article.
In Section 1 we introduce some general notions, definitions and results related to Markov processes in configuration spaces. We start with configuration spaces, which are the state spaces for birth-and-death processes; then we introduce and discuss their metric and topological structures. We also present some facts and constructions from probability theory, such as integration with respect to a Poisson point process, or a sufficient condition for a functional transformation of a Markov chain to be a Markov chain again.
In the second section we construct a spatial birth-and-death process (η_t)_{t≥0} as a unique solution to equation (2). We prove strong existence and pathwise uniqueness for (2). A key condition is that we require b to grow not faster than linearly, in the sense that

∫_{R^d} b(x, η) dx ≤ c_1|η| + c_2.

The equation is solved pathwise, "from one jump to another". Also, we prove uniqueness in law for equation (2) and the Markov property for the unique solution. Considering (2) with a (non-random) initial condition α ∈ Γ_0(R^d) and denoting the corresponding solution by (η(α, t))_{t≥0}, we see that a unique solution induces a Markov family of probability measures on the Skorokhod space D_{Γ_0(R^d)}[0; ∞) (which can be regarded as the canonical space for a solution of (2)).
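The "from one jump to another" construction can be sketched for a particularly simple choice of rates: constant birth intensity β on a window (so the cumulative birth rate is β times the window length) and unit per-particle death rate. All concrete choices below are illustrative assumptions, not the paper's.

```python
import random

# Jump-by-jump construction of an immigration-death process on a window:
# cumulative birth rate B(eta) = beta * (hi - lo), cumulative death rate
# D(eta) = |eta| (d == 1).  Between jumps the configuration is constant.

def simulate(eta0, beta, T, lo=-1.0, hi=1.0, seed=0):
    rng = random.Random(seed)
    eta = list(eta0)
    t = 0.0
    path = [(t, len(eta))]
    while True:
        B = beta * (hi - lo)         # cumulative birth rate
        D = float(len(eta))          # cumulative death rate
        t += rng.expovariate(B + D)  # waiting time to the next event
        if t > T:
            break
        if rng.random() < B / (B + D):        # a birth occurs ...
            eta.append(rng.uniform(lo, hi))   # ... uniformly on the window
        else:                                 # ... otherwise a death:
            eta.pop(rng.randrange(len(eta)))  # a uniformly chosen particle dies
        path.append((t, len(eta)))
    return eta, path

eta, path = simulate([0.0, 0.5], beta=3.0, T=5.0)
# no two events happen simultaneously: sizes change by exactly one per jump
assert all(abs(path[i + 1][1] - path[i][1]) == 1 for i in range(len(path) - 1))
```

When the death side is empty the birth branch is always taken, so the recursion never attempts to remove a particle from the empty configuration.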
When the birth and death rate coefficients b and d satisfy some continuity assumptions, the solution is expected to have continuous dependence on the initial condition, at least in some proper sense. Realization of this idea and precise formulations are given in Section 2.1. The proof is based on considering a coupling of two birth-and-death processes.
The formal relation between a unique solution to (2) and the operator L in (1) is given via the martingale problem in Section 2.2, and via some kind of pointwise convergence in Section 2.5.
In Section 2.4 we formulate and prove a theorem about coupling of two birth-and-death processes. The idea to compare a spatial birth-and-death process with some "simpler" process goes back to Preston, [Pre75]. In [FM04] this technique was applied to the study of the probability of extinction.

Configuration spaces and Markov processes: miscellaneous
In this section we list some notions and facts we use in this work.

Some notations and conventions
Sometimes we write ∞ and +∞ interchangeably, so that f → ∞ and f → +∞, or a < ∞ and a < +∞, may have the same meaning. However, +∞ is reserved for the real line only, whereas ∞ has a wider range of applications; e.g., for a sequence {x_n}_{n∈N} ⊂ R^d we may write x_n → ∞, n → ∞, which is equivalent to |x_n| → +∞. On the other hand, we do not assign any meaning to x_n → +∞.
In all probabilistic constructions we work on some probability space (Ω, F , P ), sometimes equipped with a filtration of σ-algebras. Elements of Ω are usually denoted as ω.
The set A^c is the complement of the set A ⊂ Ω: A^c = Ω \ A. We write [a; b], [a; b), etc. for the intervals of real numbers; for example, [a; b) = {x ∈ R : a ≤ x < b}. The half-line R_+ includes 0: R_+ = [0; ∞).

Configuration spaces
In this section we introduce notions and facts about spaces of configurations; in particular, topological and metric structures on Γ(R^d), as well as a characterization of compact sets of Γ(R^d). We discuss configurations over Euclidean spaces only.
We recall that |A| denotes the number of elements of A. We also say that Г(Λ) is the space of configurations over Λ. Note that ∅ ∈ Г(Λ).
Let Z_+ be the set {0, 1, 2, ...}. We say that a Radon measure ν on R^d is a counting measure if ν(B) ∈ Z_+ ∪ {∞} for all B ∈ B(R^d). When a counting measure ν satisfies additionally ν({x}) ≤ 1 for all x ∈ R^d, we call it a simple counting measure.
As long as it does not lead to ambiguities, we identify a configuration with a simple counting Radon measure on R^d: as a measure, a configuration γ ∈ Γ(R^d) maps a set B ∈ B(R^d) into |γ ∩ B|.
One equips Γ(R^d) with the vague topology, i.e., the weakest topology such that for all f ∈ C_c(R^d) (the set of continuous functions on R^d with compact support) the map

Γ(R^d) ∋ γ ↦ Σ_{x∈γ} f(x)

is continuous. Equipped with this topology, Γ(R^d) is a Polish space, i.e., there exists a metric on Γ(R^d) compatible with the vague topology with respect to which Γ(R^d) is a complete separable metric space; see, e.g., [KK06] and references therein. We say that a metric is compatible with a given topology if the topology induced by the metric coincides with the given topology.
Let B_r(x) denote the closed ball in R^d of radius r centered at x.
A set is said to be relatively compact if its closure is compact. The following theorem gives a characterization of compact sets in Г(R d ), cf. [KK06], [HS78].
holds for all n ∈ N.
Proof. Assume that (4) is satisfied for some F ⊂ Γ(R^d). In metric spaces compactness is equivalent to sequential compactness; therefore it is sufficient to show that an arbitrary sequence contains a convergent subsequence in Γ(R^d). To this end, consider an arbitrary sequence {γ_n}_{n∈N} ⊂ F. The supremum sup_n γ_n(B_1(0)) is finite; consequently, by the Banach-Alaoglu theorem there exist a measure α_1 ∈ C(B_1(0))^* (here C(B_1(0))^* is the dual space of C(B_1(0))) and a subsequence {γ_n^{(1)}} converging to α_1. Indeed, arguing by contradiction one may get that α_1(A) ∈ Z_+ for all Borel sets A, and Lemma 1.5 below ensures that α_1 is a simple counting measure.
Similarly, from the sequence {γ_n^{(1)}} we may extract a subsequence {γ_n^{(2)}} in such a way that γ_n^{(2)} converges to some α_2 ∈ Γ(B_2(0)). Continuing in the same way, we find a sequence of sequences, and the diagonal sequence converges in Γ(R^d). Conversely, if (4) is not fulfilled for some n_0 ∈ N, then we can construct a sequence {γ_n}_{n∈N} ⊂ F such that either the first summand in (4) tends to infinity, in which case, of course, there is no convergent subsequence, or the second summand in (4) tends to infinity. In the latter case, a subsequence of the sequence {γ_n|_{B_{n_0}(0)}}_{n∈N} may converge to a counting measure (when all γ_n are considered as measures). However, the limit measure cannot be a simple counting measure. Thus, the sequence {γ_n}_{n∈N} ⊂ F does not contain a convergent subsequence in Γ(R^d).
We denote by CS(Г(R d )) the space of all compact subsets of Г(R d ).
Proof. Let {K_m}_{m∈N} be an arbitrary sequence from CS(Γ(R^d)). We will show that ⋃_m K_m ≠ Γ(R^d). To each compact K_m we may assign, using Theorem 1.2, a bound on the number of points that a configuration from K_m may have in each ball B_n(0). There exists a configuration whose intersection with B_n(0) contains more points than the bound assigned to K_n allows, for every n; such a configuration belongs to no K_m.

Remark. Since Γ(R^d) is a separable metrizable space, Proposition 1.3 implies that Γ(R^d) is not locally compact.
For another description of all compact sets in Г(R d ) we will use the set Φ ⊂ C(R d ) of all positive continuous functions φ satisfying the following conditions: Proof. (i) Denote θ n = min For γ ∈ K c we have c Bn(0)×Bn(0) Ψ(x, y)γ(dx)γ(dy) Consequently, It remains to show that K c is closed, in which case Theorem 1.2 will imply compactness of K c .
We endow Г (n) 0 (R d ) with the topology induced by this one-to-one correspondence. Equivalently, a set A ⊂ Г we consider, of course, with the relative, or subspace, topology. As far as Г Having defined topological structures on Г (n) 0 (R d ), n ≥ 0, we endow Г 0 (R d ) with the topology of disjoint union, In this topology, a set We note that in order for K n to be compact, the set sym −1 K n , regarded as a subset of (R d ) n , should not have limit points on the diagonals, i.e. limit points Let us introduce a metric compatible with the described topology on Г 0 (R d ). We set otherwise.
Here d_Eucl(ζ, η) is the metric induced by the Euclidean metric and the map sym, where |x − y| is the Euclidean distance between x and y, and sym^{−1}η = sym^{−1}({η}). In many aspects, this metric resembles the Wasserstein-type distance in [RS99]. The differences are that dist is bounded by 1 and that it is defined on Γ_0(R^d) only.
Note that the metric dist satisfies the stated equalities for x ∈ ζ, η. We note that the space Γ_0(R^d) equipped with this metric is not complete. Nevertheless, Γ_0(R^d) is a Polish space, i.e., Γ_0(R^d) is separable and there exists a metric ρ̃ which induces the same topology as dist does and such that Γ_0(R^d) equipped with ρ̃ is a complete metric space. To prove this, we embed Γ_0^{(n)}(R^d) into the space Γ̃_0^{(n)}(R^d) of n-point multiple configurations, which we define as the space of all counting measures η on R^d with η(R^d) = n. Abusing notation, we may represent each η ∈ Γ̃_0^{(n)}(R^d) as a set {x_1, ..., x_n}, where some points among x_j ∈ R^d may coincide (recall our convention on identifying a configuration with a measure; as a measure, η = Σ_{j=1}^n δ_{x_j}). One should keep in mind that {x_1, ..., x_n} is not really a set here, since it is possible that x_i = x_j for i ≠ j. The representation allows us to extend sym to a map onto Γ̃_0^{(n)}(R^d) and to define a metric d̃ist on Γ̃_0^{(n)}(R^d) induced by the Euclidean metric and the map sym. The metrics dist and d̃ist coincide on Γ_0^{(n)}(R^d); the space Γ̃_0^{(n)}(R^d) is a complete separable metric space, and thus a Polish space. The next lemma describes convergence in Γ̃_0^{(n)}(R^d).

Proof. The inequality d̃ist(η_m, η) < ε implies the existence of a point from η_m in the ball B_ε(x_i) for each i ∈ {1, ..., n}. Furthermore, in the case when x_i is a multiple point, i.e., if x_j = x_i for some j ≠ i, then there are at least as many points from η_m in B_ε(x_i) as the multiplicity of x_i; for small ε we have in the previous sentence "exactly as many" instead of "at least as many", because otherwise there would not be enough points in η_m. The statement of the lemma follows by letting ε → 0.
To see that Γ_0^{(n)}(R^d) is a Polish space, we will show that it is a countable intersection of open sets in the Polish space Γ̃_0^{(n)}(R^d). Then we may apply Alexandrov's theorem: any G_δ subset of a Polish space is a Polish space; see §33, VI in [Kur66].
To do so, denote by B m the closed ball of radius m in R d , with the center at the origin.
This is an immediate consequence of the previous lemma.

Lebesgue-Poisson measures
Here we define the Lebesgue-Poisson measure on Γ_0(R^d) corresponding to a non-atomic Radon measure σ on R^d. Our prime example for σ will be the Lebesgue measure on R^d. For any n ∈ N the product measure σ^{⊗n} can be considered by restriction as a measure on (R^d)^n. The projection of this measure on Γ_0^{(n)}(R^d) via sym we denote by σ^{(n)}, so that the Lebesgue-Poisson measure with intensity measure σ is

λ_σ = Σ_{n=0}^{∞} (1/n!) σ^{(n)},

where σ^{(0)}({∅}) := 1. The measure λ_σ is finite iff σ is finite. We say that σ is the intensity measure of λ_σ.
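For a finite intensity measure σ, the normalized measure e^{−σ(R^d)} λ_σ is the law of a Poisson point process with intensity σ, which suggests the following sampling sketch; the choice σ = Lebesgue measure on an interval [0, L] is ours, for illustration.

```python
import math
import random

# Sampling a finite configuration whose law is e^{-sigma(R^d)} * lambda_sigma:
# draw N ~ Poisson(sigma(R^d)), then N i.i.d. points distributed as
# sigma / sigma(R^d).  Here sigma = Lebesgue measure on [0, L].

def sample_poisson(mean, rng):
    # inversion by sequential search (fine for small means)
    n, p = 0, math.exp(-mean)
    c, u = p, rng.random()
    while u > c:
        n += 1
        p *= mean / n
        c += p
    return n

def sample_configuration(L, rng):
    n = sample_poisson(L, rng)  # sigma([0, L]) = L
    return sorted(rng.uniform(0.0, L) for _ in range(n))

rng = random.Random(1)
confs = [sample_configuration(2.0, rng) for _ in range(20000)]
mean_pts = sum(len(c) for c in confs) / len(confs)  # close to sigma([0,2]) = 2
```

The empirical mean number of points estimates σ([0, L]) = L, as expected for a Poisson point process.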

The Skorokhod space
For a complete separable metric space (E, ρ), the space D_E of all cadlag E-valued functions equipped with the Skorokhod topology is a Polish space; for this statement and related definitions see, e.g., Theorem 5.6, Chapter 3 in [EK86]. Let ρ_D be a metric on D_E compatible with the Skorokhod topology and such that (D_E, ρ_D) is a complete separable metric space, and let ρ_p be the induced metric on the space P(D_E) of probability measures on D_E. Then (P(D_E), ρ_p) is separable and complete; see, e.g., [EK86], Section 1, Chapter 3, and Theorem 1.7, Chapter 3. The Borel σ-algebra B(D_E) coincides with the σ-algebra generated by the coordinate mappings; see Theorem 7.1, Chapter 3 in [EK86]. In this work, we mostly consider D_{Γ_0(R^d)}[0; T] endowed with the Skorokhod topology.

Integration with respect to Poisson point processes
We give a short introduction to the theory of integration with respect to Poisson point processes; we refer the reader to the literature for the theory of integration with respect to an orthogonal martingale measure.
On some filtered probability space (Ω, F, {F_t}_{t≥0}, P), consider a Poisson point process N on R_+ × X × R_+. We require the filtration {F_t}_{t≥0} to be increasing and right-continuous, and we assume that F_0 is complete under P. We interpret the argument from the first space R_+ as time. For X = R^d the intensity measure β will be the Lebesgue measure on R^d. As is the case with configurations, for X = R^d we treat a point process as a random collection of points as well as a random measure.
The process N is said to be compatible with {F_t}_{t≥0} if N is adapted to the filtration and the increments of N after time t are independent of F_t. For suitable integrands f we define

I_t(f) = ∫_{(0;t]×X×R_+} f(s, x, u) dN(s, x, u)

as the Lebesgue-Stieltjes integral with respect to the measure N, i.e., as the sum of the values of f over the atoms of N with time coordinate in (0; t]. This sum is well defined provided f is integrable with respect to the intensity measure of N. We use dN(s, x, u) and N(ds, dx, du) interchangeably when we integrate over all variables. The process I_t(f) is right-continuous as a function of t, and adapted. Moreover,

E ∫_{(0;t]×X×R_+} f(s, x, u) dN(s, x, u) = E ∫_{(0;t]×X×R_+} f(s, x, u) ds β(dx) du.

This equality will be used several times throughout this work.
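Since the integral is, ω-wise, just a sum over the atoms of N with time coordinate up to t, it can be sketched as follows; the atoms below are handcrafted for illustration rather than drawn from a genuine Poisson sample.

```python
# Omega-wise evaluation of I_t(f): a Lebesgue-Stieltjes sum over the atoms
# (s, x, u) of the point process with s <= t.

def I(t, f, atoms):
    """Sum f(s, x, u) over atoms (s, x, u) with time coordinate s <= t."""
    return sum(f(s, x, u) for (s, x, u) in atoms if s <= t)

atoms = [(0.5, 1.0, 0.2), (1.5, -2.0, 0.7), (2.5, 0.5, 0.1)]
f = lambda s, x, u: x      # integrate the spatial coordinate

v1 = I(1.0, f, atoms)      # only the atom at s = 0.5 contributes: 1.0
v2 = I(2.0, f, atoms)      # 1.0 + (-2.0) = -1.0
v3 = I(3.0, f, atoms)      # -1.0 + 0.5 = -0.5
```

The map t ↦ I_t(f) is piecewise constant and right-continuous, jumping exactly at the time coordinates of the atoms.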
1.8 Remark. We can extend the collection of integrands, in particular, we can define However, we do not use such integrands.
The Lebesgue-Stieltjes integral is defined ω-wise and is a function of an integrand and an integrator. As a result, we have the following statement. The sign d= means equality in distribution.
1.9 Statement. Let M_k be Poisson point processes defined on some, possibly different, probability spaces, and let α_k be integrands, k = 1, 2, such that the integrals ∫ α_k dM_k are well defined. If (α_1, M_1) d= (α_2, M_2), then ∫ α_1 dM_1 d= ∫ α_2 dM_2. The proof is straightforward.
The measure #̃ is not σ-finite. For a cadlag Γ_0(R^d)-valued process (η_t)_{t∈[0;∞]}, adapted to {F_t}_{t≥0}, we cannot hope to give a meaningful definition for an integral of the type (16), because of measurability issues. For example, the corresponding map, where u is a uniformly distributed on [0; 1] random variable independent of Ñ_2, does not have to be a random variable. Even if it were a random variable, some undesirable phenomena would appear; see, e.g., [Pod09].
To avoid this difficulty, we employ another construction. A similar approach was used in [FM04]. If we could give meaningful definition to the integrals of the type (16), we would expect to be a martingale (under some conditions on f and B).
Having this in mind, consider a Poisson point process N_2 on Z × R_+ × R_+ with intensity # × dr × dv, defined on (Ω, F, {F_t}_{t≥0}, P) (here # denotes the counting measure on Z; this measure is σ-finite). We require N_2 to be compatible with {F_t}_{t≥0}. Let (η_t)_{t∈[0;∞]} be an adapted cadlag process in Γ_0(R^d) satisfying the following condition: for any T < ∞, (17) holds. The set R_∞ := ⋃_{t∈[0;∞]} η_t is at most countable, provided (17). Let ≺ be the lexicographical order on R^d. We can label the points of η_0 in this order. There exists an a.s. unique representation R_∞ = {x_1, x_2, ...} such that for any n, m ∈ N, n < m, either inf{s : x_n ∈ η_s} < inf{s : x_m ∈ η_s}, or inf{s : x_n ∈ η_s} = inf{s : x_m ∈ η_s} and x_n ≺ x_m. In other words, as time goes on, appearing points are added to {x_1, x_2, ...} in the order in which they appear. If several points appear simultaneously, we add them in the lexicographical order.
For the sake of convenience, we set For a predictable process Assume that R_T is bounded for some T > 0. Then, for a bounded predictable f ∈ L^1(R^d ×

1.10 Proposition. The process Ñ is a Poisson point process with intensity dt × dx, independent of F_τ.
Proof. To prove the proposition, it is enough to show (i) and (ii) below. Indeed, Ñ is determined completely by its values on sets of the type (a; b] × U, with a, b, U as in (i), whose intensity is (b − a)β(U); therefore it must be a Poisson point process independent of F_τ if (i) and (ii) hold.
Let τ_n be the sequence of {F_t}_{t≥0}-stopping times, τ_n = k/2^n on {τ ∈ ((k−1)/2^n; k/2^n]}, k ∈ N. Then τ_n ↓ τ and τ_n − τ ≤ 1/2^n. The stopping times τ_n take only countably many values. The process N satisfies the strong Markov property for τ_n: the processes Ñ_n, obtained by the shift s ↦ s + τ_n, are Poisson point processes, independent of F_{τ_n}. To prove this, take k with P{τ_n = k/2^n} > 0 and note that on {τ_n = k/2^n}, Ñ_n coincides with the Poisson point process Ñ_{k/2^n} given by the shift s ↦ s + k/2^n. Conditionally on {τ_n = k/2^n}, Ñ_{k/2^n} is again a Poisson point process, with the same intensity. Furthermore, conditionally on

s. and all random variables
Analogously, the strong Markov property for a Poisson point process on R + ×N with intensity dt × # may be formulated and proven.
1.11 Remark. We assumed in Proposition 1.10 that the filtration {F t } t≥0 , compatible with N , is right-continuous and complete. To be able to apply Proposition 1.10, we should show that such filtrations exist.
Introduce the natural filtration of N, and let F_t be the completion of F_t^0 under P. Then N is compatible with {F_t}. We claim that {F_t}_{t≥0}, defined in such a way, is right-continuous (this may be regarded as an analog of the Blumenthal 0-1 law). Indeed, as in the proof of Proposition 1.10, one may check that Ñ_a is independent of F_{a+}. Since F_∞ = σ(Ñ_a) ∨ F_a, σ(Ñ_a) and F_a are independent and F_{a+} ⊂ F_∞, one sees that F_{a+} ⊂ F_a. Thus, F_{a+} = F_a.
1.12 Remark. We prefer to work with right-continuous complete filtrations, because we want to ensure that there is no problem with conditional probabilities, and that the hitting times we will consider are stopping times.

Miscellaneous
When we write ξ ∼ Exp(λ), we mean that the random variable ξ is exponentially distributed with parameter λ.
1.13 Lemma. If α and β are exponentially distributed random variables with parameters a and b respectively (notation: α ∼ Exp(a), β ∼ Exp(b)) and they are independent, then

P{α < β} = a/(a + b).

Indeed,

P{α < β} = ∫_0^∞ a e^{−as} P{β > s} ds = ∫_0^∞ a e^{−(a+b)s} ds = a/(a + b).

Here are a few other properties of exponential distributions. If ξ_1, ξ_2, ..., ξ_n are independent exponentially distributed random variables with parameters c_1, ..., c_n respectively, then

min{ξ_1, ..., ξ_n} ∼ Exp(c_1 + ... + c_n).

We will also need a result about the finiteness of the expectation of the Yule process. A Yule process (Z_t)_{t≥0} is a pure birth Markov process in Z_+ with birth rate µn, µ > 0, n ∈ Z_+. That is, if Z_t = n, then a birth occurs at rate µn.
For more details about Yule processes see, e.g., [AN72, Chapter 3], [Har63, Chapter 5], [Arn06] and references therein. Let (Z_t(n))_{t≥0} be a Yule process started at n. The process (Z_t(n))_{t≥0} can be considered as a sum of n independent Yule processes started from 1; see, e.g., [Arn06]. In particular, EZ_t(n) = nEZ_t(1) = ne^{µt} < ∞.
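Both facts, Lemma 1.13 and the Yule mean EZ_t(1) = e^{µt}, are easy to check by simulation; the sketch below (parameters are our own choice) exploits that the holding time of a Yule process in state z is the minimum of z independent Exp(µ) clocks, hence Exp(µz):

```python
import math
import random

rng = random.Random(42)

# Monte Carlo check of P{alpha < beta} = a / (a + b) for independent
# alpha ~ Exp(a), beta ~ Exp(b).
a_, b_ = 2.0, 3.0
n = 200000
hits = sum(rng.expovariate(a_) < rng.expovariate(b_) for _ in range(n))
# hits / n is close to a_/(a_+b_) = 0.4

def yule(mu, T, rng):
    """One Yule trajectory started from 1, returning Z_T."""
    z, t = 1, 0.0
    while True:
        t += rng.expovariate(mu * z)  # min of z Exp(mu) clocks ~ Exp(mu*z)
        if t > T:
            return z
        z += 1

mu, T = 1.0, 1.0
mean_z = sum(yule(mu, T, rng) for _ in range(20000)) / 20000
# mean_z is close to e^{mu*T} = e ~ 2.718
```

Both estimates are random, so the comparisons should be read up to Monte Carlo error; the seeds above merely make the run reproducible.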
Here are some other properties of Poisson point processes which are used throughout the article. If N is a Poisson point process on R_+ × R^d × R_+ with intensity ds × dx × du, then a.s. no two distinct points of N share the same spatial coordinate x. (19)
Put differently, no plane of the form R_+ × {x} × R_+ contains more than one point of N. Using the σ-additivity of the probability measure, one can deduce (19) from its analogue (20) for a bounded region; a direct computation shows that (20) holds.
Let x τ be the unique element of R d defined by Then

Pure jump type Markov processes
In this section we give a very concise treatment of pure jump type Markov processes. Most of the definitions and facts given here can be found in [Kal02,Chapter 12]; see also, e.g., [GS75, Chapter 3, § 1].
We say that a process X = (X_t)_{t≥0} in some measurable space (S, S) is of pure jump type if its paths are a.s. right-continuous and constant apart from isolated jumps. In that case we may denote the jump times of X by τ_1, τ_2, ..., with the understanding that τ_n = ∞ if there are fewer than n jumps. The times τ_n are stopping times with respect to the right-continuous filtration induced by X. For convenience we may choose X to be the identity mapping on the canonical path space. When X is a Markov process, the distribution with initial state x is denoted by P_x, and we note that the mapping x ↦ P_x(A) is measurable in x for every measurable set A.
Theorem 12.14 [Kal02] (strong Markov property, Doob) A pure jump type Markov process satisfies the strong Markov property at every stopping time.
We say that a state x ∈ S is absorbing if P x {X ≡ x} = 1.
Lemma 12.16 [Kal02] If x is non-absorbing, then under P x the time τ 1 until the first jump is exponentially distributed and independent of θ τ 1 X.
Here θ_t is a shift, and θ_{τ_1}X defines a new process, (θ_{τ_1}X)_t = X_{τ_1+t}, t ≥ 0. For a non-absorbing state x, we may define the rate function c(x) = (E_x τ_1)^{−1} and the jump transition kernel µ(x, B) = P_x{X_{τ_1} ∈ B}. In the sequel, c(x) will also be referred to as the jump rate. The kernel cµ is called a rate kernel.
The following theorem gives an explicit representation of the process in terms of a discretetime Markov chain and a sequence of exponentially distributed random variables. This result shows in particular that the distribution P x is uniquely determined by the rate kernel cµ. We assume existence of the required randomization variables (so that the underlying probability space is "rich enough").
Theorem 12.17 [Kal02] (embedded Markov chain) Let X be a pure jump type Markov process with rate kernel cµ. Then there exist a Markov process Y on Z_+ with transition kernel µ and an independent sequence of i.i.d., exponentially distributed random variables γ_1, γ_2, ... with mean 1 such that a.s.

X_t = Y_n for t ∈ [τ_n; τ_{n+1}), n ∈ Z_+,   (23)

where

τ_n = Σ_{k=1}^{n} γ_k / c(Y_{k−1}).   (24)

In particular, given the embedded chain Y, the differences τ_{n+1} − τ_n between the moments of jumps of a pure jump type Markov process are exponentially distributed with parameters c(Y_n). If c(Y_k) = 0 for some (random) k, we set τ_n = ∞ for n ≥ k + 1, while Y_n, n ≥ k + 1, are not defined.
Theorem 12.18 [Kal02] (synthesis) For any rate kernel cµ on S with µ(x, {x}) ≡ 0, consider a Markov chain Y with transition kernel µ and a sequence γ_1, γ_2, ... of independent exponentially distributed random variables with mean 1, independent of Y. Assume that Σ_n γ_n/c(Y_n) = ∞ a.s. for every initial distribution for Y. Then (23) and (24) define a pure jump type Markov process with rate kernel cµ.
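The synthesis recipe can be sketched directly: simulate the embedded chain Y, stretch time by the exponential holding times γ_{n+1}/c(Y_n), and read off the piecewise-constant path via (23) and (24). The finite state space and the rate kernel below are illustrative assumptions of ours.

```python
import random

# Synthesis of a pure jump type process from a rate kernel c*mu on a toy
# state space {0, 1, 2}; mu(x, {x}) == 0 since no support contains x itself.
c = {0: 1.0, 1: 2.0, 2: 0.5}        # jump rates c(x)
mu = {0: [1], 1: [0, 2], 2: [1]}    # supports of mu(x, .), chosen uniformly

def build_path(x0, n_jumps, seed=0):
    rng = random.Random(seed)
    y, taus = [x0], [0.0]            # embedded chain Y and jump times tau_n
    for _ in range(n_jumps):
        # tau_{n+1} = tau_n + gamma_{n+1} / c(Y_n), gamma ~ Exp(1)
        taus.append(taus[-1] + rng.expovariate(1.0) / c[y[-1]])
        y.append(rng.choice(mu[y[-1]]))
    def X(t):
        # piecewise-constant, right-continuous path: X_t = Y_n on [tau_n, tau_{n+1})
        for n in range(n_jumps, -1, -1):
            if t >= taus[n]:
                return y[n]
    return X, y, taus

X, y, taus = build_path(0, n_jumps=10)
assert X(0.0) == 0 and X(taus[1]) == y[1]   # right-continuity at jump times
```

On a finite state space with rates bounded away from 0 and ∞ the non-explosion assumption of the theorem holds automatically.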
The next proposition gives a convenient criterion for non-explosion.
Proposition 12.19 [Kal02] (explosion) For any rate kernel cµ and initial state x, let (Y_n) and (τ_n) be as in Theorem 12.17. Then a.s.

{τ_n → ∞} = {Σ_n c(Y_n)^{−1} = ∞}.
In particular, τ n → ∞ a.s. when x is recurrent for (Y n ).

Markovian functions of a Markov chain
Here A j ∈ B(S), m j ∈ N, k 1 ∈ N, F n = σ{X 1 , ..., X n }. The space S is separable, hence there exists a transition probability kernel Q : S × B(S) → [0; 1] such that Q(s, A) = P s {X 1 ∈ A}, s ∈ S, A ∈ B(S).
Consider a transformation of the chain X, Y_n = f(X_n), where f : S → Z_+ is a Borel-measurable function, with the convention B(Z_+) = 2^{Z_+}. In what follows we will need to know when the process Y = {Y_n}_{n∈Z_+} is a Markov chain. A similar question appeared for the first time in [BR58].
A sufficient condition for Y to be a Markov chain is given in the next lemma.
1.14 Lemma. Assume that for any bounded Borel function h : Z_+ → R and any s, q ∈ S with f(s) = f(q),

E_s h(f(X_1)) = E_q h(f(X_1)).   (26)

Then Y is a Markov chain.

Remark. Condition (26) is the equality of the distributions of f(X_1) under two different measures, P_s and P_q.
Proof. For the natural filtrations of the processes X and Y we have an inclusion, since Y is a function of X. For k ∈ N and bounded Borel functions h_j : Z_+ → R, j = 1, 2, ..., k (any function on Z_+ is a Borel function), we obtain (28). To transform the last integral in (28), we introduce a new kernel: for y ∈ f(S) choose x ∈ S with f(x) = y, and then for B ⊂ Z_+ define Q(y, B) = Q(x, f^{−1}(B)). The expression on the right-hand side does not depend on the choice of x because of (26). To make the kernel Q defined on Z_+ × B(Z_+), we set Q(y, B) = I_{{0∈B}} for y ∉ f(S).
Then from the change of variables formula for the Lebesgue integral it follows that the last integral in (28) admits the required representation. Likewise, we set z_{n−1} = f(x_{n−1}) in the next-to-last integral. Proceeding further in the same manner, we arrive at an equality which, together with (27), implies that Y is a Markov chain.
1.15 Remark. The kernel Q and the chain f (X n ) are related: for all s ∈ S, n, m ∈ N and M ⊂ N, whenever P s {f (X n+1 ) = m} > 0. Informally, one may say that Q is the transition probability kernel for the chain {f (X n )} n∈Z + .
1.16 Remark. Clearly, this result holds for a Markov chain which is not necessarily defined on a canonical state space, because the property of a process to be a Markov chain depends on its distribution only.
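A classical instance of Lemma 1.14 is the reflected simple random walk: for the simple random walk X on Z and f(x) = |x|, the law of f(X_1) under P_s depends on s only through f(s), so Y_n = |X_n| is again a Markov chain. The check of condition (26) below is exact (the example is ours, not the paper's):

```python
from fractions import Fraction

# For the simple random walk on Z with f(x) = |x|, the one-step law of
# f(X_1) under P_s is determined by f(s), so condition (26) holds.

def law_of_f_X1(s):
    """Distribution of |X_1| when X_0 = s, as a dict value -> probability."""
    half = Fraction(1, 2)
    out = {}
    for step in (-1, 1):
        v = abs(s + step)
        out[v] = out.get(v, 0) + half
    return out

# condition (26): states with the same f-value induce the same law of f(X_1)
for s in range(-5, 6):
    assert law_of_f_X1(s) == law_of_f_X1(-s)
```

Exact rational arithmetic makes the equality of distributions a strict identity rather than a numerical comparison.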

A birth-and-death process in the space of finite configurations: construction and basic properties
We would like to construct a Markov process in the space of finite configurations Γ_0(R^d) with a heuristic generator of the form

(LF)(η) = ∫_{R^d} (F(η ∪ x) − F(η)) b(x, η) dx + Σ_{x∈η} (F(η \ x) − F(η)) d(x, η)

for F in an appropriate domain. We call the functions b, d : R^d × Γ_0(R^d) → [0; ∞) the birth rate coefficient and the death rate coefficient, respectively.
Theorem 2.16 summarizes the main results obtained in this section.
To construct a spatial birth-and-death process, we consider the stochastic equation with Poisson-type noise

η_t(B) = ∫_{(0;t]×B×[0;∞)} I_{[0; b(x, η_{s−})]}(u) dN_1(x, s, u) − Σ_i ∫_{(0;t]×[0;∞)} I{x_i ∈ η_{r−}} I_B(x_i) I_{[0; d(x_i, η_{r−})]}(v) dN_2(i, r, v) + η_0(B),   (31)

where (η_t)_{t≥0} is a suitable cadlag Γ_0(R^d)-valued stochastic process, the "solution" of the equation; b and d are measurable with respect to the product σ-algebra B(R^d) × B(Γ_0(R^d)); and the sequence {..., x_{−1}, x_0, x_1, ...} is related to (η_t)_{t∈[0;∞]} as described in Section 1.3.1. We require the processes N_1, N_2, η_0 to be independent of each other. Equation (31) is understood in the sense that the equality holds a.s.
for every bounded B ∈ B(R d ) and t ≥ 0.
As was said in the preliminaries on Page 6, we identify a finite configuration with a finite simple counting measure, so that a configuration γ acts as a measure in the following way: γ(B) = |γ ∩ B|, B ∈ B(R^d). We will treat an element of Γ_0(R^d) both as a set and as a counting measure, as long as this does not lead to ambiguity. The appearance of a new point will be interpreted as a birth, and a disappearance as a death. We will refer to the points of η_t as particles.
These settings are formally equivalent: the relation between d and d̃ is given by the corresponding change of variables. The settings used here appeared in [HS78], [GK06], etc.
We define the cumulative death rate at ζ by Σ_{x∈ζ} d(x, ζ) and the cumulative birth rate by ∫_{R^d} b(x, ζ) dx.

2.1 Definition. A (weak) solution of equation (31) is a triple ((η_t)_{t≥0}, N_1, N_2) on a filtered probability space, where (Ω, F, P) is a probability space and {F_t}_{t≥0} is an increasing, right-continuous and complete filtration of sub-σ-algebras of F, ..., (v) the processes N_1, N_2 and η_0 are independent, the processes N_1 and N_2 are compatible with {F_t}_{t≥0}, {x_i} being the sequence related to (η_t)_{t≥0}.
Note that due to Statement 1.9 item (viii) of this definition is a statement about the joint distribution of (η t ), N 1 , N 2 .
and let C t be the completion of C 0 t under P . Note that {C t } t≥0 is a right-continuous filtration, see Remark 1.11.

Definition. A solution of (31) is called strong if (η_t)_{t≥0} is adapted to (C_t, t ≥ 0).

Remark. In the definition above we considered solutions as processes indexed by t ∈ [0; ∞). The reformulations for the case t ∈ [0; T], 0 < T < ∞, are straightforward. This remark applies to the results below, too.
Sometimes only the solution process (that is, (η t ) t≥0 ) will be referred to as a (strong or weak) solution, when all the other structures are clear from the context.
We will say that existence of a strong solution holds if, on any probability space with given N 1 , N 2 , η 0 satisfying (i)-(v) of Definition 2.1, there exists a strong solution.
We assume that the birth rate b has sublinear growth in the second variable, in the sense that ∫ R d b(x, η)dx ≤ c 1 |η| + c 2 for all η ∈ Г 0 (R d ) and some constants c 1 , c 2 > 0 (condition (35)), and let d satisfy (36). We also assume that E|η 0 | < ∞ (condition (37)). By a non-random initial condition we understand an initial condition whose distribution is concentrated at a single point: P {η 0 = η ′ } = 1 for some η ′ ∈ Г 0 (R d ).
From (19) it follows that the points z n are uniquely determined almost surely on F . Moreover, σ n+1 > σ n a.s., and the σ n are finite a.s. on F . By induction on n it follows that σ n is a stopping time for each n ∈ N, and ζ σn is F σn -measurable on F . By direct substitution we see that (ζ t ) t≥0 is a strong solution of (38), driven by a Poisson point process with the same intensity, compatible with {S t } t≥0 . From now on, and until otherwise specified, we work on the filtered probability space (F, S , {S t } t≥0 , Q). We use the same symbols for random processes and random variables, bearing in mind that we consider their restrictions to F . The process (ζ t ) t∈[0; lim n→∞ σn) has the Markov property, because the process N 1 has the strong Markov property and independent increments. Indeed, conditioning on S σn shows that the chain {ζ σn } n∈Z + is a Markov chain and that, given {ζ σn } n∈Z + , the increments σ n+1 − σ n are distributed exponentially. Therefore, the random variables γ n = (σ n − σ n−1 )(…) are independent. The jump rate of (ζ t ) t∈[0; lim n→∞ σn) is given by c. Condition (35) implies that c(α) ≤ c 1 |α| + c 2 . Consequently, since (ζ t ) is a pure-birth process, c(ζ σn ) ≤ c 1 |ζ σn | + c 2 = c 1 |ζ 0 | + c 1 n + c 2 .
which holds a.s., yields that ζ̃ has a birth at the moment σ 1 , and at the same point of space. Therefore, ζ̃ coincides with ζ up to σ 1 a.s. Similar reasoning shows that the two solutions coincide up to σ n a.s., and, because σ n → ∞ a.s., they coincide everywhere. Thus, pathwise uniqueness holds, and the constructed solution is strong. Now we turn our attention to (39). We can write |ζ t | = |ζ 0 | + Σ n∈N I{σ n ≤ t}. (41) Let (Z t ) be the Yule process (see Page 19) with a suitable linear birth rate and Z 0 = |η 0 |. Then |ζ t | ≤ Z t a.s., and hence E|ζ t | ≤ EZ t < ∞.
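The domination |ζ t | ≤ Z t can be illustrated numerically by a thinning coupling: run the clock of the dominating linear-rate process and accept each of its jumps for the slower process with the appropriate probability. All rates and constants below are hypothetical, chosen only so that the jump rate satisfies c(n) ≤ c 1 n + c 2 .

```python
import random

# Thinning coupling (sketch): a pure-birth process with jump rate
# c(n) <= c1*n + c2 is dominated pathwise by a Yule-type process (Z_t)
# jumping at the linear rate c1*z + c2.  All constants are illustrative.
c1, c2, T = 1.0, 0.5, 1.0

def coupled_sizes(rng):
    c = lambda n: 0.7 * n + 0.2        # hypothetical jump rate, c(n) <= c1*n + c2
    n = z = 1                          # |zeta_0| = Z_0 = 1
    t = 0.0
    while True:
        r = c1 * z + c2                # rate of the dominating process
        t += rng.expovariate(r)        # waiting time for the dominating clock
        if t > T:
            return n, z
        if rng.random() < c(n) / r:    # accept with prob. c(n)/r (c(n) <= r since n <= z)
            n += 1                     # birth for the dominated process
        z += 1                         # the dominating process always jumps

rng = random.Random(0)
samples = [coupled_sizes(rng) for _ in range(1000)]
assert all(n <= z for n, z in samples)   # pathwise domination at time T
```

The acceptance probability c(n)/r makes the accepted jumps occur at instantaneous rate c(n), while n ≤ z is preserved at every step, which is exactly the pathwise bound used above.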
As in the proof of Proposition 2.5, (η t ) is a strong solution of (31), t ∈ [0; lim n θ n ).
Random variables θ n , n ∈ N, are stopping times with respect to the filtration {F t , t ≥ 0}.
Using the strong Markov property of a Poisson point process, we see that, on {θ n < ∞}, the conditional distribution of θ b n+1 given F θn is exp( ∫ R d b(x, η θn )dx ), and the conditional distribution of θ d n+1 given F θn is exp( Σ x∈η θn d(x, η θn ) ). In particular, θ b n , θ d n > 0, n ∈ N, and the process (η t ) is of pure jump type.
Similarly to the proof of Proposition 2.5, one can show by induction on n that equation (31) has a unique solution on [0; θ n ]: any two solutions coincide on [0; θ n ] a.s. Thus, any solution coincides with (η t ) a.s. for all t ∈ [0; θ n ]. Now we show that θ n → ∞ a.s. as n → ∞. Denote by θ ′ k the moment of the k-th birth. It is sufficient to show that θ ′ k → ∞ as k → ∞, because only finitely many deaths may occur between any two births, there being only finitely many particles. By induction on k one may see that the births of (η t ) t≥0 occur at moments σ i of births of the solution (η̄ t ) t≥0 of (38), and that η t ⊂ η̄ t for all t ∈ [0; lim n θ n ). For instance, let us show that (η̄ t ) t≥0 has a birth at θ ′ 1 . Indeed, the equality defining the first birth of (η t ) t≥0 implies that at the time moment θ ′ 1 a birth occurs for the process (η̄ t ) t≥0 at the same point. Hence, η θ ′ 1 ⊂ η̄ θ ′ 1 , and we can continue. Since σ k → ∞ as k → ∞, we also have θ ′ k → ∞, and therefore θ n → ∞ as n → ∞.
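The jump description above (exponential waiting times given F θn , with the next birth location drawn proportionally to b(·, η) and each particle x dying at rate d(x, η)) translates directly into a simulation scheme. The code below is a sketch with hypothetical rates; `b_total` plays the role of the cumulative birth rate ∫ b(x, η)dx, and `sample_birth` draws from the normalized birth density.

```python
import random

def simulate_bd(eta0, d, b_total, sample_birth, T, rng):
    """Gillespie-style simulation of a finite spatial birth-and-death process.

    eta0         : initial configuration, a list of points
    d(x, eta)    : death rate of the particle x in configuration eta
    b_total(eta) : cumulative birth rate B(eta), the integral of b(., eta)
    sample_birth : draws a birth location from the density b(., eta)/B(eta)
    """
    eta, t, path = list(eta0), 0.0, [(0.0, list(eta0))]
    while True:
        B = b_total(eta)
        D = sum(d(x, eta) for x in eta)
        if B + D == 0.0:
            break                           # no further events possible
        t += rng.expovariate(B + D)         # exponential waiting time
        if t > T:
            break
        if rng.random() < B / (B + D):      # next event is a birth
            eta.append(sample_birth(eta, rng))
        else:                               # next event is a death
            u, acc = rng.random() * D, 0.0
            for i, x in enumerate(eta):
                acc += d(x, eta)
                if u <= acc:                # x chosen with prob. d(x, eta)/D
                    del eta[i]
                    break
        path.append((t, list(eta)))
    return path

# Hypothetical rates on [0, 1]: constant immigration, unit per-capita death.
rng = random.Random(1)
path = simulate_bd(
    eta0=[0.5],
    d=lambda x, eta: 1.0,
    b_total=lambda eta: 2.0,
    sample_birth=lambda eta, rng: rng.random(),
    T=5.0, rng=rng,
)
```

Each step uses that the minimum of the birth and death clocks is exponential with the total rate B + D, and that the event is a birth with probability B/(B + D), mirroring the conditional distributions above.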
In particular, for any time t the integral is finite a.s.

2.7 Remark.
Let η 0 be a non-random initial condition, η 0 ≡ α, α ∈ Г 0 (R d ). The solution of (31) with η 0 ≡ α will be denoted by (η(α, t)) t≥0 . Let P α be the push-forward of P under the mapping ω → η(α, ·)(ω). From the proof one may derive that, for fixed ω ∈ Ω, the constructed unique solution is jointly measurable in (t, α). Thus, the family {P α } of probability measures on D Г 0 (R d ) [0; T ] is measurable in α. We will often use formulations related to this probability space; in that case, the coordinate mappings will be denoted by η t . The processes (η t ) t∈[0;T ] and (η(α, t)) t∈[0;T ] have the same law (under P α and P , respectively).
As one would expect, the family of measures {P α , α ∈ Г 0 (R d )} is a Markov process, or a Markov family of probability measures; see Theorem 2.15 below. For a probability measure µ on Г 0 (R d ), we define P µ (·) = ∫ Г 0 (R d ) P α (·)µ(dα). We denote by E µ the expectation under P µ .
2.8 Remark. Let b 1 , d 1 be another pair of birth and death coefficients satisfying all the conditions imposed on b and d. Consider the unique solution (η̃ t ) of (31) with coefficients b 1 , d 1 in place of b, d, but with the same initial condition η 0 and all other underlying structures. If b 1 (·, ζ) ≡ b(·, ζ) and d 1 (·, ζ) ≡ d(·, ζ) for all ζ ∈ D, where D ∈ B(Г 0 (R d )), then η̃ t = η t for all t ≤ inf{s ≥ 0 : η s ∉ D} = inf{s ≥ 0 : η̃ s ∉ D}. This may be proven in the same way as the theorem above.
2.9 Remark. Assume that all the conditions of Theorem 2.6 are fulfilled except Condition (37). Then we can no longer claim that (42) holds; the other conclusions of the theorem, however, remain valid. We are mostly interested in the case of a non-random initial condition, therefore we do not discuss the case when (42) is not satisfied.
2.10 Remark. We solved equation (31) ω-wise. As a consequence, there is a functional dependence of the solution process on the "input": the process (η t ) t≥0 is a function of η 0 , N 1 and N 2 . Note that θ n and z n from the proof of Theorem 2.6 are measurable functions of η 0 , N 1 and N 2 in the sense that, e.g., θ 1 = F 1 (η 0 , N 1 , N 2 ) a.s. for some measurable function F 1 . 2.11 Proposition. If (η t ) t≥0 is a solution to equation (31), then the inequality E|η t | ≤ (E|η 0 | + c 2 t)e c 1 t holds for all t > 0.
Proof. We already know that E|η t | is finite. Since η t satisfies equation (31), we may take B = R d and then take expectations in the resulting inequality. Since η is a solution of (31), for all s ∈ [0; t] we have η s− = η s almost surely; consequently, E|η s− | = E|η s |. Applying this and (35), we see that E|η t | ≤ c 1 ∫ 0 t E|η s |ds + c 2 t + Eη 0 (R d ), so the statement of the proposition follows from (37) and Gronwall's inequality.
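A sketch of the concluding Gronwall step, with c 1 , c 2 the constants of the sublinear-growth condition (35):

```latex
\mathbb{E}|\eta_t| \;\le\; \mathbb{E}|\eta_0| + c_2 t + c_1 \int_0^t \mathbb{E}|\eta_s|\,ds
\quad\Longrightarrow\quad
\mathbb{E}|\eta_t| \;\le\; \bigl(\mathbb{E}|\eta_0| + c_2 t\bigr)\,e^{c_1 t} \;<\; \infty ,
```

where the implication is Gronwall's inequality applied with the nondecreasing inhomogeneity E|η 0 | + c 2 t.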
2.13 Corollary. Joint uniqueness in law holds for equation (31) with initial distribution ν. 2.14 Remark. We note here that altering the order of the initial configuration does not change the law of the solution: we could replace the lexicographical order with any other. To see this, note that if ς is a permutation of Z (that is, ς : Z → Z is a bijection), then the process Ñ 2 obtained from N 2 by relabeling its first coordinate according to ς has the same law as N 2 , and is adapted to {F t } t≥0 , too. Therefore, the solutions of (31) and of (31) with N 2 replaced by Ñ 2 have the same law. Replacing N 2 with Ñ 2 leads to equation (45), where the sequence {x ′ i } is related to the process (ξ s ) s∈[0;t] , ξ s = η s . The unique solution of (45) is (η s ) s∈[t ′ ;t] . As in the proof of Theorem 2.6, we can see that (η s ) s∈[t ′ ;t] is measurable, almost surely, with respect to the filtration generated by the random variables N 1 (B, [s; q], U ) and N 2 (i, [s; q], U ). Furthermore, using arguments similar to those in Remark 2.14, we can conclude. The following theorem sums up the results we have obtained so far. Proof. The statement is a consequence of Theorem 2.6, Remark 2.7 and Theorem 2.15. In particular, the Markov property of {P α , α ∈ Г 0 (R d )} follows from the statement given in the last sentence of the proof of Theorem 2.15.
We call the unique solution of (31) (or, sometimes, the corresponding family of measures on D Г 0 (R d ) [0; ∞)) a (spatial) birth-and-death Markov process.
2.17 Remark. We note that d does not need to be defined on the whole space R d × Γ 0 (R d ): the equation makes sense even if d(x, η) is defined only on {(x, η) | x ∈ η}. Of course, any such function may be extended to a function on R d × Γ 0 (R d ).

Continuous dependence on initial conditions
In order to prove the continuity of the distribution of the solution of (31) with respect to initial conditions, we make the following continuity assumptions on b and d.
2.18 Continuity assumptions. Let b and d be continuous with respect to both arguments; recall that d may be defined only on {(x, η) | x ∈ η}. We require that, whenever η n → η and η n ∋ z n → x ∈ η, we also have d(z n , η n ) → d(x, η). A similar condition appeared in [HS78, Theorem 3.1].

2.19 Theorem. Let the birth and death coefficients b and d satisfy the continuity assumptions 2.18. Then, for every T > 0, the map which assigns to a non-random initial condition η 0 = α the law of the solution of equation (31) stopped at time T is continuous.
Remark. We mean continuity in the space of measures on D Г 0 (R d ) [0; T ]; see Page 13.
Proof. Denote by η(α, · ) the solution of (31) started from α. Let α n → α, α n , α ∈ Г 0 (R d ). With no loss of generality we assume that |α n | = |α|, n ∈ N. By Lemma 1.5 we can label the elements of α n , α n = {x …}, so that corresponding points converge. Taking into account Remark 2.14, we can assume this without loss of generality (in the sense that we do not have to use the lexicographical order; not in the sense that we can make x …). Let {θ i } i∈N be the moments of jumps of the process η(α, · ). Without loss of generality, assume that d(x, α) > 0 for x ∈ α, and ||b(·, α)|| L 1 > 0, L 1 := L 1 (R d ) (if some of these inequalities are not fulfilled, the following reasoning requires only insignificant changes).
Similarly, the probability that the same event as for η(α, ·) occurs at time θ 1 for η(α n , ·) is high. Indeed, assume, for example, that a birth occurs at θ 1 , that is to say that (48) holds.
Once more using Lemma 1.13, we get the corresponding estimate. The case of a death occurring at θ 1 may be analyzed in the same way.
From inequalities (9) and (10) we may deduce a bound on sup t∈(0;θ 1 ] dist(η(α, t), η(α n , t)). Proceeding in the same manner, we may extend this to sup t∈(0;θn] dist(η(α, t), η(α n , t)), in particular owing to the strong Markov property of a Poisson point process. In fact, with high probability the processes η(α n , · ) and η(α, · ) evolve up to time θ n in the same way, in the following sense: births occur at the same places and at the same time moments; deaths occur at the same time moments, and when a point is deleted from η(α, · ), its counterpart is deleted from η(α n , · ).
2.20 Remark. In fact, we have proved an even stronger statement. Namely, take α n → α.

The martingale problem
Now we briefly discuss the martingale problem associated with L defined in (30). Let C b (Г 0 (R d )) be the space of all bounded continuous functions on Г 0 (R d ). We equip C b (Г 0 (R d )) with the supremum norm.
2.21 Definition. A probability measure Q on D Г 0 (R d ) [0; ∞) is said to solve the martingale problem for L if the process M f (t) = f (y(t)) − f (y(0)) − ∫ 0 t Lf (y(s))ds is a local martingale for every f ∈ C b (Г 0 ). Here y is the coordinate mapping, y(t)(ω) = ω(t). Thus, we require M f to be a local martingale under Q with respect to {I t } t≥0 . Note that L can be considered as a bounded operator on C b (Г 0 (R d )).
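Formula (30) is not reproduced in this excerpt; for birth and death rates b and d as in the introduction, the heuristic generator of the process has the standard form (a reconstruction consistent with the rate description, not a quotation of (30)):

```latex
(Lf)(\eta) \;=\; \int_{\mathbb{R}^d} b(x,\eta)\,\bigl[f(\eta\cup\{x\}) - f(\eta)\bigr]\,dx
\;+\; \sum_{x\in\eta} d(x,\eta)\,\bigl[f(\eta\setminus\{x\}) - f(\eta)\bigr].
```

The first term accounts for births at rate b(x, η)dx, the second for the death of each particle x ∈ η at rate d(x, η).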

2.22 Proposition. Let (η(α, t)) t≥0 be a solution to (31). Then, for every f ∈ C b (Г 0 ), the process M f (t) = f (η(α, t)) − f (α) − ∫ 0 t Lf (η(α, s))ds is a local martingale under P with respect to {F t } t≥0 .
Proof. In this proof, ζ t stands for η(α, t). Denote τ n = inf{t ≥ 0 : |ζ t | > n or ζ t ⊄ [−n; n] d }. Clearly, each τ n , n ∈ N, is a stopping time and τ n → ∞ a.s. Let ζ n t = ζ t∧τn . We want to show that (56) holds. The process (ζ t ) t≥0 satisfies (57). In the above equality, as well as in a few other places throughout this proof, we treat elements of Г 0 (R d ) as measures rather than as configurations. Since (ζ t ) is of pure jump type, the sum on the right-hand side of (57) is a.s. finite.
Note that ζ s = ζ s− ∪ {x} a.s. in the first summand on the right-hand side of (58), and the corresponding compensated processes are martingales. Combining these observations, we see that the difference on the right-hand side of (56) is a martingale because of (58) and (59).
2.23 Corollary. The unique solution of (31) induces a solution of the martingale problem 2.21.

Birth rate without sublinear growth condition
In this section we consider equation (31) with a birth rate coefficient that does not satisfy the sublinear growth condition (35).
Instead, we assume only a weaker condition. Under this assumption we cannot guarantee the existence of a solution on the whole half-line [0; ∞), or even on a finite interval [0; T ]: it is possible that infinitely many points appear in finite time.
We would like to show that a unique solution exists up to an explosion time, which may be finite.
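A minimal numerical illustration of explosion (with hypothetical rates, not taken from the text): for a pure-birth process jumping from n to n + 1 at rate n², the waiting times are independent exp(n²) variables, so the total time until infinitely many births have occurred has expectation Σ n≥1 1/n² = π²/6 < ∞.

```python
import math
import random

def near_explosion_time(rate, jumps, rng):
    """Accumulate the first `jumps` exponential waiting times of a pure-birth
    process with jump rate rate(n) from size n; if sum 1/rate(n) converges,
    this approximates the (a.s. finite) explosion time."""
    t, n = 0.0, 1
    for _ in range(jumps):
        t += rng.expovariate(rate(n))  # waiting time ~ exp(rate(n))
        n += 1
    return t

rng = random.Random(42)
times = [near_explosion_time(lambda n: n * n, 1000, rng) for _ in range(500)]
mean = sum(times) / len(times)
# mean explosion time for rate n^2 is sum 1/n^2 = pi^2/6, about 1.645
assert abs(mean - math.pi ** 2 / 6) < 0.25
```

The superlinear rate n² is exactly the kind of growth excluded by (35); under the sublinear bound c 1 n + c 2 , the analogous series Σ 1/(c 1 n + c 2 ) diverges and no explosion occurs.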

Coupling
Here we discuss the coupling of two birth-and-death processes. The theorem we prove here will be used in the sequel. As a matter of fact, we have already used the coupling technique in the proof of Theorem 2.6.
Consider two equations of the form (31), where t ∈ [0; T ], with solutions (η t ) t∈[0;T ] and (ξ t ) t∈[0;T ] ; let {x i } and {x ′ i } be the sequences related to (η t ) t∈[0;T ] and (ξ t ) t∈[0;T ] , respectively (see (64)). We will show by induction that each moment of birth for (η t ) t∈[0;T ] is a moment of birth for (ξ t ) t∈[0;T ] . Here we deal only with the base case; the induction step is done in the same way. We have nothing to show if τ 1 is a moment of a birth of (ξ t ) t∈[0;T ] . The process Ñ 2 is {F t }-adapted. One can see that (η t ) t∈[0;T ] is the unique solution of the corresponding equation.
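The coupling idea can be sketched in code for the (non-spatial) particle counts: drive both chains by a single stream of exponential clocks and uniforms, synchronizing events whenever the two counts are equal. All rates below are hypothetical and chosen so that b ≤ b2 and d ≥ d2, which is what makes the ordering n t ≤ m t survive every jump.

```python
import random

def coupled_counts(T, rng):
    """Monotone coupling (sketch) of two birth-and-death chains on Z_+.
    Hypothetical rates: chain 1 has b(n)=1, d(n)=n; chain 2 has
    b2(m)=2, d2(m)=m/2.  Since b <= b2 and d >= d2, the coupling
    keeps n_t <= m_t pathwise."""
    b,  d  = lambda k: 1.0, lambda k: float(k)
    b2, d2 = lambda k: 2.0, lambda k: 0.5 * k
    n, m, t = 2, 2, 0.0                # same initial number of particles
    while True:
        if n == m:
            # couple at equality so the ordering cannot be broken:
            # joint birth, extra birth of m, joint death, extra death of n
            rates = [b(n), b2(m) - b(n), d2(m), d(n) - d2(n)]
            moves = [(1, 1), (0, 1), (-1, -1), (-1, 0)]
        else:  # n < m: independent moves cannot cross the ordering
            rates = [b(n), b2(m), d(n), d2(m)]
            moves = [(1, 0), (0, 1), (-1, 0), (0, -1)]
        R = sum(rates)
        t += rng.expovariate(R)        # common exponential clock
        if t > T:
            return n, m
        u, acc = rng.random() * R, 0.0
        for r, (dn, dm) in zip(rates, moves):
            acc += r
            if u <= acc:               # event chosen with prob. r/R
                n += dn
                m += dm
                break
        assert 0 <= n <= m             # the ordering is an invariant

rng = random.Random(7)
results = [coupled_counts(3.0, rng) for _ in range(200)]
assert all(n <= m for n, m in results)
```

The only delicate case is n = m: there, joint births and joint deaths are used, and the residual rates b2 − b ≥ 0 and d − d2 ≥ 0 move only the chain that may safely move, mirroring the induction over birth moments in the proof.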

Related semigroup of operators
We now say a few words about the semigroup of operators related to the unique solution of (31).
We write η(α, t) for the unique solution of (31) started from α ∈ Г 0 (R d ). We want to define an operator S t by S t f (α) = Ef (η(α, t)) (= E α f (η(t))) for an appropriate class of functions. Unfortunately, it seems difficult to make S t a C 0 -semigroup on some functional Banach space for general b, d satisfying (35) and (36).
We start with the case when the cumulative birth and death rates are bounded. Let C b = C b (Г 0 (R d )) be the space of all bounded continuous functions on Г 0 (R d ); it becomes a Banach space once it is equipped with the supremum norm. We assume the existence of a constant C > 0 such that B(ζ) + D(ζ) ≤ C for all ζ ∈ Г 0 (R d ), where B and D are defined in (33) and (34). Formula (30) then defines a bounded operator L : C b → C b , and we will show that S t coincides with e tL . For f ∈ C b , the function S t f is bounded and continuous: boundedness is a consequence of the boundedness of f , and continuity of S t f follows from Remark 2.20. Indeed, let α n → α; then there exist processes ξ (n) t equal in distribution to η(α n , t) such that dist(η(α, t), ξ (n) t ) → 0 in probability as n → ∞.
Thus, S t f is continuous (note that the continuity of S t f does not follow from Theorem 2.19 alone, since for a fixed t ∈ [0; T ] the functional D Г 0 (R d ) [0; T ] ∋ x → x(t) ∈ R is not continuous in the Skorokhod topology). Furthermore, since for small t and for all A ∈ B(R d ) we have P {η(α, t) = α ∪ {y} for some y ∈ A} = t ∫ A b(y, α)dy + o(t), and, for x ∈ α, P {η(α, t) = α \ {x}} = td(x, α) + o(t), we may estimate the difference quotient of S t f at t = 0. Therefore, (66) defines a C 0 -semigroup on C b whose generator coincides with L. Thus, S t = e tL , and we have proved the following. 2.1 Proposition. Assume that (67) is fulfilled. Then the family of operators (S t , t ≥ 0) on C b defined in (66) constitutes a C 0 -semigroup. Its generator coincides with L given in (30).
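In the bounded-rate case the identity S t = e tL can be checked in a toy finite-state analogue: a birth-and-death chain on {0, ..., N} with bounded rates, whose generator is a bounded matrix Q, so that e tQ is given by its (truncated) exponential series. The rates below are hypothetical.

```python
import numpy as np

# Toy finite-state analogue (hypothetical rates): a birth-and-death chain
# on {0,...,N} with bounded cumulative rates, so the generator Q is a
# bounded operator and the semigroup is S_t = e^{tQ}.
N = 20
birth = lambda n: 1.0 if n < N else 0.0   # bounded cumulative birth rate
death = lambda n: 0.5 * min(n, 4)         # bounded cumulative death rate

Q = np.zeros((N + 1, N + 1))
for n in range(N + 1):
    if n < N:
        Q[n, n + 1] = birth(n)
    if n > 0:
        Q[n, n - 1] = death(n)
    Q[n, n] = -Q[n].sum()                 # rows sum to zero: conservative

def expm(A, terms=60):
    """Truncated Taylor series for the matrix exponential (fine for bounded A)."""
    out, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out += term
    return out

S = lambda t: expm(t * Q)

# Semigroup property S_{t+s} = S_t S_s and conservativeness S_t 1 = 1.
assert np.allclose(S(0.7), S(0.3) @ S(0.4))
assert np.allclose(S(1.0) @ np.ones(N + 1), np.ones(N + 1))
```

The two assertions mirror the C 0 -semigroup property and the fact that the process neither explodes nor is killed when the cumulative rates are bounded.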
Now we turn our attention to general b, d satisfying (35) and (36) but not necessarily (67).
Formula (73) gives us the formal relation of (η(α, t)) t≥0 to the operator L. Of course, for fixed f the convergence in (73) does not have to be uniform in α.
2.3 Remark. The question about the construction of a semigroup acting on some class of probability measures on Г 0 (R d ) is yet to be studied.