Statistics

Testing Complete Spatial Randomness

2023. 7. 26. · 3 min read

Testing CSR

Monte Carlo Tests

  • Let $T$ be any test statistic for which larger values of $T$ cast doubt on the null hypothesis.

  • Let $t_1$ be the value of $T$ calculated from the dataset.

  • For convenience, assume that the null sampling distribution of $T$ is continuous.

Let $t_2,\ldots,t_s$ be the values of $T$ calculated from $s-1$ independent simulations under $H_0$. Then under $H_0$, the values $t_1,\ldots,t_s$ are exchangeable, i.e.

$$P(t_i=t_{(j)})=\frac{1}{s},\quad j=1,\ldots,s,$$

where $t_{(j)}$ denotes the $j$th ordered value.

Hence, if $R$ denotes the number of $t_i>t_1$, then $P(R\leq r) = \frac{r+1}{s}$ for $r=0,\ldots,s-1$. This means that when $R=r$ is observed, the p-value of the Monte Carlo test is $(r+1)/s$.
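The p-value rule above can be sketched in a few lines of Python. This is only an illustration: the function name `monte_carlo_p_value` and the toy numbers are assumptions made for the example, not part of the original notes.

```python
import numpy as np

def monte_carlo_p_value(t1, simulated):
    """Return (r + 1) / s, where r counts the simulated values exceeding t1."""
    simulated = np.asarray(simulated, dtype=float)
    s = simulated.size + 1            # observed value plus s - 1 simulations
    r = int(np.sum(simulated > t1))   # number of t_i > t_1
    return (r + 1) / s

# Toy example (hypothetical values): t1 = 2.5 against 9 simulated statistics.
p = monte_carlo_p_value(2.5, [0.1, 0.4, 3.0, 1.2, 0.8, 2.0, 1.5, 0.3, 0.9])
print(p)  # one simulated value exceeds 2.5, so p = (1 + 1) / 10 = 0.2
```

Because the smallest attainable p-value is $1/s$, the number of simulations limits how small a significance level the test can resolve.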

Inter-event distance based Test

  • Let $T$ be the distance between two events independently and uniformly distributed in $A$.

  • For a unit square $A$,

    $$\begin{aligned} H(t) &= P(T\leq t) \\ &= \begin{cases} \pi t^2 - \frac{8}{3}t^3+\frac{1}{2}t^4, & 0\leq t\leq 1 \\ \frac{1}{3}-2t^2-\frac{1}{2}t^4+\frac{4}{3}(t^2-1)^{1/2}(2t^2+1)+2t^2\sin^{-1}(2t^{-2}-1), & 1\leq t\leq \sqrt{2} \end{cases} \end{aligned}$$
  • For a circle $A$ of unit radius,

    $$H(t) = 1+\pi^{-1}\left[2(t^2-1)\cos^{-1}\left(\frac{t}{2}\right)-t\left(1+\frac{t^2}{2}\right)\sqrt{1-\frac{t^2}{4}}\right],\quad 0\leq t\leq 2$$
  • Consider the empirical distribution function (EDF) of inter-event distances:

    $$\hat{H}_1(t)=\frac{2}{n(n-1)}\sum_{i<j}I(t_{ij}\leq t)$$

    where $t_{ij}$ are the observed inter-event distances from the data.

  • A Monte Carlo-based approach is used for this test.

  • Generating Monte Carlo samples:

    • Generate $s-1$ independent sets of $n$ events in $A$ under the CSR assumption.

    • Calculate $\hat{H}_i(t),\ i=2,\ldots,s$, from each simulated set.

    • Calculate the simulation envelopes:

      $$\begin{aligned} U(t) &=\max_{2\leq i\leq s}\lbrace \hat{H}_i(t)\rbrace \\ L(t) &=\min_{2\leq i\leq s}\lbrace \hat{H}_i(t)\rbrace \end{aligned}$$
  • Two common MC test approaches

    1. Choose an appropriate $t_0$ and define $u_i=\hat{H}_i(t_0)$. Note that under $H_0$, as in the MC test described earlier,

      $$P(u_1=u_{(j)})=1/s,\quad j=1,\ldots,s$$

      Rejecting CSR when $u_1$ ranks $k$th largest or higher therefore gives an exact one-sided test of size $k/s$.

      Example: with $k=5$ and $s=100$, rejecting when $u_1\geq u_{(5)}$, where $u_{(5)}$ is the 5th largest value, gives a test of size $0.05$.

    2. Define

      $$u_i = \int(\hat{H}_i(t)-H(t))^2\,dt.\tag{*}$$

      Then proceed to a test based on the rank of $u_1$.

  • Note that approach 2 is more objective but is known to have weak power.
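The inter-event distance test on the unit square can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names, the evaluation grid, the Riemann-sum approximation of the integral in (*), and the clustered example pattern are all choices made for this example.

```python
import numpy as np

def H_unit_square(t):
    """Theoretical CDF of the distance between two uniform points in the unit square."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    out = np.empty_like(t)
    lo = t <= 1.0
    tl, th = t[lo], t[~lo]
    out[lo] = np.pi * tl**2 - 8.0 * tl**3 / 3.0 + tl**4 / 2.0
    out[~lo] = (1.0 / 3.0 - 2.0 * th**2 - th**4 / 2.0
                + 4.0 / 3.0 * np.sqrt(th**2 - 1.0) * (2.0 * th**2 + 1.0)
                + 2.0 * th**2 * np.arcsin(2.0 / th**2 - 1.0))
    return out

def edf_interevent(points, grid):
    """EDF of the n(n-1)/2 pairwise distances, evaluated on `grid`."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    d = np.sort(dist[np.triu_indices(len(points), k=1)])  # pairs with i < j
    return np.searchsorted(d, grid, side="right") / d.size

def interevent_mc_test(points, s=100, n_grid=200, seed=0):
    """Monte Carlo test of CSR on the unit square using the statistic (*)."""
    rng = np.random.default_rng(seed)
    n = len(points)
    grid = np.linspace(0.0, np.sqrt(2.0), n_grid)
    H = H_unit_square(grid)
    # Row 0 holds the data (i = 1); rows 1..s-1 hold CSR simulations (i = 2..s).
    edfs = np.empty((s, n_grid))
    edfs[0] = edf_interevent(points, grid)
    for i in range(1, s):
        edfs[i] = edf_interevent(rng.random((n, 2)), grid)
    upper = edfs[1:].max(axis=0)   # envelope U(t)
    lower = edfs[1:].min(axis=0)   # envelope L(t)
    dt = grid[1] - grid[0]
    u = ((edfs - H) ** 2).sum(axis=1) * dt   # Riemann-sum approximation of (*)
    rank = int(np.sum(u > u[0])) + 1         # u_1 is the rank-th largest
    return {"p_value": rank / s, "upper": upper, "lower": lower, "u": u}

# Hypothetical, strongly clustered pattern near the centre of the square.
rng = np.random.default_rng(1)
clustered = 0.5 + 0.02 * rng.standard_normal((50, 2))
result = interevent_mc_test(clustered, s=100)
print(result["p_value"])
```

Plotting $\hat{H}_1(t)$ against $H(t)$ together with the `upper` and `lower` envelopes gives the usual graphical check, while `p_value` corresponds to the rank-based test of approach 2.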

Nearest neighbor distance based Test

  • Let $Y$ be the nearest neighbor distance under CSR when there are $n$ events in a region $A$.

  • The theoretical distribution of $Y$ is difficult to derive, so an approximation is used instead.

  • Note that, ignoring edge effects, the probability that an event lies within distance $y$ of a known (specified) event is

    $$\frac{\pi y^2}{|A|}$$

    Then the CDF can be approximated as follows:

    $$\begin{aligned} G(y)=P(Y\leq y)&\approx 1-(1-\pi y^2|A|^{-1})^{n-1} \\ &\approx 1-\exp(-\lambda\pi y^2),\quad y\geq 0 \end{aligned}$$

    where the second line is a further approximation for large $n$, with $\lambda=n\vert A\vert^{-1}$.
  • The empirical CDF is given as:

    $$\hat{G}_1(y) = \frac{1}{n}\sum_i I(y_i\leq y)$$

    where $y_i$ is the nearest neighbor distance of the $i$th event.

  • Let $\bar{G}_i(y)=\frac{1}{s-1}\sum_{j\neq i}\hat{G}_j(y)$. The MC test then proceeds in the same way as with (*), where

    $$u_i = \int(\hat{G}_i(y)-\bar{G}_i(y))^2\,dy.$$
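The nearest neighbor version of the test can be sketched in the same way. Again this is a minimal sketch under assumptions: the function names, the unit-square region, the upper integration limit `y_max`, the Riemann-sum integral, and the lattice example are illustrative choices, not part of the original notes.

```python
import numpy as np

def nn_distances(points):
    """Distance from each event to its nearest neighbor."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)   # exclude the zero distance to itself
    return dist.min(axis=1)

def edf(values, grid):
    """Empirical CDF of `values`, evaluated on `grid`."""
    v = np.sort(values)
    return np.searchsorted(v, grid, side="right") / v.size

def nn_mc_test(points, s=100, n_grid=400, y_max=0.5, seed=0):
    """Monte Carlo test of CSR on the unit square based on nearest neighbor EDFs."""
    rng = np.random.default_rng(seed)
    n = len(points)
    grid = np.linspace(0.0, y_max, n_grid)   # y_max is an assumed cut-off
    # Row 0 holds the data (i = 1); rows 1..s-1 hold CSR simulations (i = 2..s).
    G_hat = np.empty((s, n_grid))
    G_hat[0] = edf(nn_distances(points), grid)
    for i in range(1, s):
        G_hat[i] = edf(nn_distances(rng.random((n, 2))), grid)
    # G_bar[i] averages the other s - 1 EDFs, leaving out G_hat[i].
    G_bar = (G_hat.sum(axis=0) - G_hat) / (s - 1)
    dy = grid[1] - grid[0]
    u = ((G_hat - G_bar) ** 2).sum(axis=1) * dy   # Riemann-sum approximation of u_i
    rank = int(np.sum(u > u[0])) + 1              # u_1 is the rank-th largest
    return rank / s, u

# Hypothetical regular pattern: a 7 x 7 lattice inside the unit square.
axis = np.linspace(0.05, 0.95, 7)
lattice = np.array([(x, y) for x in axis for y in axis])
p_value, u = nn_mc_test(lattice, s=100)
print(p_value)
```

Using the leave-one-out mean $\bar{G}_i(y)$ in place of a theoretical $G(y)$ avoids relying on the approximation for the CDF, since the reference curve comes from the simulations themselves.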