Metric $1$-median selection: Query complexity vs. approximation ratio

Ching-Lueh Chang
Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan. Email: clchang@saturn.yzu.edu.tw
Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Taoyuan, Taiwan.
Abstract

Consider the problem of finding a point in a metric space $(\{1,2,\ldots,n\},d)$ with the minimum average distance to other points. We show that this problem has no deterministic $o(n^{1+1/(h-1)})$-query $(2h-\Omega(1))$-approximation algorithms for any constant $h\in\mathbb{Z}^{+}\setminus\{1\}$.

1 Introduction

The metric $1$-median problem asks for a point in an $n$-point metric space with the minimum average distance to other points. It has a Monte-Carlo $O(n/\epsilon^{2})$-time $(1+\epsilon)$-approximation algorithm for all $\epsilon>0$ [6, 7]. In $\mathbb{R}^{D}$, Kumar et al. [8] give a Monte-Carlo $O(2^{\text{poly}(1/\epsilon)}D)$-time $(1+\epsilon)$-approximation algorithm for $1$-median selection and another algorithm for $k$-median selection, where $D\geq 1$ and $\epsilon>0$. Guha et al. [5] give streaming approximation algorithms for $k$-median selection in metric spaces.

Chang [3], Wu [11] and Chang [1] show that metric $1$-median has a deterministic nonadaptive $O(n^{1+1/h})$-time $(2h)$-approximation algorithm for all constants $h\in\mathbb{Z}^{+}\setminus\{1\}$. Furthermore, Chang [4] shows the nonexistence of deterministic $o(n^{2})$-time $(4-\Omega(1))$-approximation algorithms for metric $1$-median. This paper generalizes his result to show that metric $1$-median has no deterministic $o(n^{1+1/(h-1)})$-query $(2h-\Omega(1))$-approximation algorithms for any constant $h\in\mathbb{Z}^{+}\setminus\{1\}$. Combining our result with an existing upper bound [11, 1],

$$\begin{aligned}
&\min\left\{c\geq 1\mid\text{metric $1$-median has a deterministic $O(n^{1+\epsilon})$-query $c$-approx.\ alg.}\right\}\\
={}&\min\left\{c\geq 1\mid\text{metric $1$-median has a deterministic $O(n^{1+\epsilon})$-time $c$-approx.\ alg.}\right\}\\
={}&2\left\lceil\frac{1}{\epsilon}\right\rceil
\end{aligned}$$

for all constants $\epsilon\in(0,1)$. That is, we determine the best approximation ratio of deterministic $O(n^{1+\epsilon})$-query (resp., $O(n^{1+\epsilon})$-time) algorithms for all $\epsilon\in(0,1)$.
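To see how the two bounds meet at $2\lceil 1/\epsilon\rceil$, it may help to spell out one substitution. The following is only a sketch: the choice $h=\lceil 1/\epsilon\rceil$ is an assumption made here for illustration, combining the upper bound of [11, 1] with the lower bound announced above.

```latex
% Sketch: why the optimal ratio is 2*ceil(1/eps) for a constant eps in (0,1).
% Assumption (ours, for illustration): h is chosen as ceil(1/eps).
\begin{align*}
h=\left\lceil\frac{1}{\epsilon}\right\rceil
  &\;\Longrightarrow\; h\geq 2\ \text{ and }\ h-1<\frac{1}{\epsilon}\leq h
  \;\Longrightarrow\; \frac{1}{h}\leq\epsilon<\frac{1}{h-1},\\
\text{upper bound:}\quad
  & O\!\left(n^{1+1/h}\right)\subseteq O\!\left(n^{1+\epsilon}\right),
  \ \text{ so a $(2h)$-approximation is achievable},\\
\text{lower bound:}\quad
  & O\!\left(n^{1+\epsilon}\right)\subseteq o\!\left(n^{1+1/(h-1)}\right),
  \ \text{ so no $(2h-\Omega(1))$-approximation exists}.
\end{align*}
```

For instance, $\epsilon=1/2$ gives $h=2$ and ratio $4$, while $\epsilon=0.3$ gives $h=4$ and ratio $8$.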

As in the previous lower bounds for deterministic algorithms [4, 2], we use an adversarial method. Roughly speaking, our proof proceeds as follows:

  (i) Design an adversary Adv for answering the distance queries of any deterministic algorithm $A$ with query complexity $q(n)=o(n^{1+1/(h-1)})$.

  (ii) Show that $A$'s output has a large average distance to other points, according to Adv's answers to $A$.

  (iii) Construct a distance function with respect to which a certain point $\hat{\alpha}$ has a small average distance to other points.

  (iv) Construct the final distance function $d(\cdot,\cdot)$ similar to that in item (iii).

  (v) Show that $d$ is a metric.

  (vi) Show the consistency of $d(\cdot,\cdot)$ with Adv's answers.

  (vii) Compare $\hat{\alpha}$ in item (iii) with $A$'s output to establish our lower bound on $A$'s approximation ratio.

Central to our constructions are two graph sequences, $\{H^{(i)}\}_{i=0}^{q(n)}$ and $\{G^{(i)}\}_{i=0}^{q(n)}$ in Sec. 3, that are unseen in previous lower bounds [9, 2, 4]. As in [4], we need a small set $S$ of points whose distances to other points are answered as large values during $A$'s execution, and yet we assign a small value to the distances from a certain point $\hat{\alpha}\in S$ to many other points in item (iii).

This paper is organized as follows. Sec. 2 introduces the terminology. Sec. 3 proves our main theorem that metric $1$-median has no deterministic $o(n^{1+1/(h-1)})$-query $(2h-\Omega(1))$-approximation algorithms for any constant $h\in\mathbb{Z}^{+}\setminus\{1\}$. In particular, Secs. 3.1, 3.2, 3.3 and 3.4 correspond to items (ii), (iii), (iv)–(vi) and (vii) above, respectively.

2 Definitions

A finite metric space $(M,d)$ is a finite set $M$ endowed with a function $d\colon M^{2}\to[0,\infty)$ such that

  • $d(x,x)=0$,

  • $d(x,y)>0$ if $x\neq y$,

  • $d(x,y)=d(y,x)$, and

  • $d(x,y)+d(y,z)\geq d(x,z)$

for all $x$, $y$, $z\in M$ [10]. For all $c\geq 1$, a point $z\in M$ is said to be a $c$-approximate $1$-median of $(M,d)$ if

$$\sum_{x\in M}\,d\left(z,x\right)\leq c\cdot\sum_{x\in M}\,d\left(y,x\right)$$

for all $y\in M$. For convenience, $[n]\stackrel{\text{def.}}{=}\{1,2,\ldots,n\}$.
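The definition of a $c$-approximate $1$-median can be checked directly on a small instance. The sketch below is illustrative only: the function names and the 0-based distance-matrix representation are assumptions made here, not notation from the paper (which indexes points by $[n]$).

```python
def total_distance(dist, p):
    """Sum of distances from point p to all points (dist[p][p] is 0)."""
    return sum(dist[p])


def is_c_approximate_1_median(dist, z, c):
    """Check the defining inequality against every candidate y.

    dist: symmetric matrix (list of lists) of a finite metric, 0-based indices.
    z: index of the candidate point; c: approximation ratio, c >= 1.
    """
    target = total_distance(dist, z)
    return all(target <= c * total_distance(dist, y) for y in range(len(dist)))


# Three points on a line at positions 0, 1, 2 (hypothetical toy instance).
line = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
```

On this toy instance the middle point is an exact ($c=1$) median, whereas an endpoint is only a $1.5$-approximate one, since its distance sum is $3$ against the optimum $2$.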

For a deterministic algorithm $A$ and an oracle $\mathcal{O}\colon\{1,2,\ldots,n\}^{2}\to\mathbb{R}$, denote by $A^{\mathcal{O}}(1^{n})$ the execution of $A$ with oracle access to $\mathcal{O}$ and with input $1^{n}$, where $n\in\mathbb{N}$. As the input to $A$ will be $1^{n}$ throughout this paper, abbreviate $A^{\mathcal{O}}(1^{n})$ as $A^{\mathcal{O}}$. If $A^{d}$ outputs a $c$-approximate $1$-median of $([n],d)$ for each finite metric space $([n],d)$, then $A$ is said to be $c$-approximate for metric $1$-median, where $c\geq 1$.

Fact 1 ([3, 1, 11]).

For each constant $h\in\mathbb{Z}^{+}\setminus\{1\}$, metric $1$-median has a deterministic nonadaptive $O(n^{1+1/h})$-time $(2h)$-approximation algorithm.

A weighted undirected graph $G=(V,E,w)$ has a finite vertex set $V$, an edge set $E$ and a weight function $w\colon E\to(0,\infty)$, where each edge is an unordered pair of distinct vertices in $V$. If $w\colon Y\to(0,\infty)$ for a superset $Y$ of $E$, interpret $(V,E,w)$ simply as $(V,E,w|_{E})$, where $w|_{E}$ denotes the restriction of $w$ to $E$. For all $v\in V$, let

$$N_{G}(v)\stackrel{\text{def.}}{=}\left\{u\in V\mid(u,v)\in E\right\}$$

and $\text{deg}_{G}(v)\stackrel{\text{def.}}{=}|N_{G}(v)|$. For all $S\subseteq V$, $N_{G}(S)\stackrel{\text{def.}}{=}\bigcup_{v\in S}\,N_{G}(v)$. For all $s$, $t\in V$, an $s$-$t$ path $P$ in $G$ is a sequence $\{v_{i}\in V\}_{i=0}^{k}$ satisfying $k\in\mathbb{N}$, $v_{0}=s$, $v_{k}=t$ and $(v_{i},v_{i+1})\in E$ for all $i\in\{0,1,\ldots,k-1\}$. Its weight (or length) is $w(P)\stackrel{\text{def.}}{=}\sum_{i=0}^{k-1}\,w(v_{i},v_{i+1})$. (Footnote 3: $w(P)$ is a common and convenient abuse of notation.) The shortest $s$-$t$ distance in $G$ is

$$d_{G}(s,t)=\inf\left\{w(P)\mid\text{$P$ is an $s$-$t$ path in $G$}\right\},$$

where $s$, $t\in V$. So $d_{G}(s,t)=\infty$ if $G$ has no $s$-$t$ paths. Note that we allow only positive weights, i.e., $\mathop{\mathrm{Im}}(w)\subseteq(0,\infty)$. So a shortest $s$-$t$ path must be simple, i.e., it does not repeat vertices. If $w\equiv 1$, abbreviate $(V,E,w)$ as $(V,E)$ and call it an unweighted graph.

The following fact is well-known.

Fact 2.

For each undirected graph $G=(V,E)$,

$$\sum_{v\in V}\,\text{deg}_{G}(v)=2\cdot|E|.$$

For a predicate $P$, let $\chi[P]=1$ if $P$ is true and $\chi[P]=0$ otherwise. The following fact about geometric series is not hard to see.

Fact 3.

For all $r\geq 2$ and $m\in\mathbb{N}$,

$$\sum_{k=0}^{m}\,r^{k}\leq 2r^{m}.$$
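Fact 3 is stated without proof. One standard way to verify it, given here as a short worked step rather than as the author's argument, uses the closed form of the geometric sum:

```latex
% Worked check of Fact 3 for r >= 2 and m in N (standard geometric-series identity).
\begin{align*}
\sum_{k=0}^{m} r^{k}
  = \frac{r^{m+1}-1}{r-1}
  \leq \frac{r}{r-1}\cdot r^{m}
  = \left(1+\frac{1}{r-1}\right) r^{m}
  \leq 2r^{m},
\end{align*}
% where the last step uses 1/(r-1) <= 1, which holds because r >= 2.
```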

3 Query complexity vs. approximation ratio

Throughout this section,

  • $n\in\mathbb{Z}^{+}$,

  • $\delta\in(0,1)$ and $h\in\mathbb{Z}^{+}\setminus\{1\}$ are constants (i.e., they are independent of $n$),

  • $A$ is a deterministic $o(n^{1+1/(h-1)})$-query algorithm for metric $1$-median, and

  • $S=[\lfloor\delta n\rfloor]\subseteq[n]$.

All pairs in $[n]^{2}$ are assumed to be unordered in this section. So, e.g., $(1,2)\in\{2\}\times[n]$. By padding at most $n-1$ dummy queries, assume without loss of generality that $A$ will have queried for the distances between its output and all other points when halting. Denote $A$'s query complexity by

$$q(n)=o\left(n^{1+1/(h-1)}\right).$$

Without loss of generality, forbid making the same query twice or querying for the distance from a point to itself, where the queries for $d(x,y)$ and $d(y,x)$ are considered to be the same for $x$, $y\in[n]$. Furthermore, let $n$ be sufficiently large to satisfy

\begin{align}
q(n) &\leq \delta n^{1+1/(h-1)}, \tag{1}\\
\delta n^{1/(h-1)} &> 3, \tag{2}\\
\frac{2q(n)}{|S|-1} &\leq \delta n^{1/(h-1)}. \tag{3}
\end{align}

Define two unweighted undirected graphs $G^{(0)}$ and $H^{(0)}$ by

\begin{align}
E_{G}^{(0)} &\stackrel{\text{def.}}{=} \left\{\left(u,v\right)\mid\left(u,v\in[n]\setminus S\right)\land\left(u\neq v\right)\right\}, \tag{4}\\
G^{(0)} &\stackrel{\text{def.}}{=} \left([n],E_{G}^{(0)}\right), \tag{5}\\
E_{H}^{(0)} &\stackrel{\text{def.}}{=} \emptyset, \tag{6}\\
H^{(0)} &\stackrel{\text{def.}}{=} \left([n],E_{H}^{(0)}\right). \tag{7}
\end{align}

1:  Let $E_{G}^{(0)}$, $G^{(0)}$, $E_{H}^{(0)}$ and $H^{(0)}$ be as in equations (4)–(7);
2:  for $i=1$, $2$, $\ldots$, $q(n)$ do
3:     Receive the $i$th query of $A$, denoted $(a_{i},b_{i})$;
4:     if $d_{G^{(i-1)}}(a_{i},b_{i})\leq h$ then
5:        Find a shortest $a_{i}$-$b_{i}$ path $P_{i}$ in $G^{(i-1)}$;
6:        $E_{H}^{(i)}\leftarrow E_{H}^{(i-1)}\cup\{e\mid\text{$e$ is an edge on $P_{i}$}\}$;
7:        $H^{(i)}\leftarrow([n],E_{H}^{(i)})$;
8:        $E_{G}^{(i)}\leftarrow E_{G}^{(i-1)}\setminus\{(u,v)\in E_{G}^{(i-1)}\setminus E_{H}^{(i)}\mid(\text{deg}_{H^{(i)}}(u)\geq\delta n^{1/(h-1)}-2)\lor(\text{deg}_{H^{(i)}}(v)\geq\delta n^{1/(h-1)}-2)\}$;
9:        $G^{(i)}\leftarrow([n],E_{G}^{(i)})$;
10:     else
11:        $E_{H}^{(i)}\leftarrow E_{H}^{(i-1)}$;
12:        $H^{(i)}\leftarrow([n],E_{H}^{(i)})$;
13:        $E_{G}^{(i)}\leftarrow E_{G}^{(i-1)}$;
14:        $G^{(i)}\leftarrow([n],E_{G}^{(i)})$;
15:     end if
16:     $Q^{(i)}\leftarrow([n],\{(a_{j},b_{j})\mid j\in[i]\})$;
17:     Output $\min\{d_{H^{(i)}}(a_{i},b_{i}),\,h-(1/2)\cdot\chi[\exists v\in\{a_{i},b_{i}\},\,(v\in S)\land(\text{deg}_{Q^{(i)}}(v)\leq\delta n^{1/(h-1)})]\}$ as the answer to the $i$th query of $A$;
18:  end for
Figure 1: Algorithm Adv for answering $A$'s queries

Algorithm Adv in Fig. 1 answers $A$'s queries. In particular, for all $i\in[q(n)]$, the $i$th iteration of the loop of Adv answers the $i$th query of $A$, denoted $(a_{i},b_{i})\in[n]^{2}$. It constructs three unweighted undirected graphs, $G^{(i)}=([n],E_{G}^{(i)})$, $H^{(i)}=([n],E_{H}^{(i)})$ and $Q^{(i)}$. As $G^{(i-1)}$ is unweighted for all $i\in[q(n)]$, $P_{i}$ in line 5 of Adv is an $a_{i}$-$b_{i}$ path in $G^{(i-1)}$ with the minimum number of edges. By line 16 of Adv, the edges of $Q^{(i)}$ are precisely the first $i$ queries of $A$.
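For readers who prefer executable form, the adversary of Fig. 1 can be sketched in Python as below. This is an illustrative re-implementation under assumptions made here, not the author's code: the class and method names (`Adversary`, `answer`, `bfs_path`) are hypothetical, graphs are stored as adjacency sets, and line 8 is applied only at the vertices of $P_{i}$, which has the same effect because only those vertices change their degree in $H^{(i)}$.

```python
from collections import deque
from itertools import combinations
import math


def bfs_path(adj, s, t, limit):
    """Return a shortest s-t path (vertex list) with at most `limit` edges, else None."""
    parent = {s: None}
    depth = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        if depth[u] < limit:
            for w in adj[u]:
                if w not in parent:
                    parent[w] = u
                    depth[w] = depth[u] + 1
                    queue.append(w)
    return None


class Adversary:
    """Illustrative sketch of Adv (Fig. 1); all names here are hypothetical."""

    def __init__(self, n, h, delta):
        self.h = h
        self.threshold = delta * n ** (1.0 / (h - 1))      # delta * n^{1/(h-1)}
        self.S = set(range(1, math.floor(delta * n) + 1))  # S = [floor(delta*n)]
        vertices = range(1, n + 1)
        self.G = {v: set() for v in vertices}   # adjacency sets of G^{(i)}
        self.H = {v: set() for v in vertices}   # adjacency sets of H^{(i)}
        self.qdeg = {v: 0 for v in vertices}    # degrees in the query graph Q^{(i)}
        # Equation (4): G^{(0)} is complete on [n] \ S; points of S are isolated.
        for u, v in combinations([x for x in vertices if x not in self.S], 2):
            self.G[u].add(v)
            self.G[v].add(u)

    def answer(self, a, b):
        """Answer one query (a, b); assumes a != b and no repeated queries."""
        path = bfs_path(self.G, a, b, self.h)               # lines 4-5
        if path is not None:
            for u, v in zip(path, path[1:]):                # lines 6-7
                self.H[u].add(v)
                self.H[v].add(u)
            # Line 8: at each path vertex whose H-degree is at least
            # threshold - 2, drop every incident G-edge that is not in H.
            for u in path:
                if len(self.H[u]) >= self.threshold - 2:
                    for v in self.G[u] - self.H[u]:
                        self.G[v].discard(u)
                    self.G[u] &= self.H[u]
        self.qdeg[a] += 1                                   # line 16
        self.qdeg[b] += 1
        flag = any(v in self.S and self.qdeg[v] <= self.threshold for v in (a, b))
        in_h = bfs_path(self.H, a, b, self.h)               # line 17
        d_h = len(in_h) - 1 if in_h is not None else math.inf
        return min(d_h, self.h - 0.5 * flag)
```

A possible use is `adv = Adversary(40, 2, 0.25)` followed by repeated calls to `adv.answer(a, b)`. Truncating the breadth-first search at depth $h$ is a design choice that suffices here because both line 4 and line 17 only need distances up to $h$.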

Lemma 4.

$$E_{H}^{(0)}\subseteq E_{H}^{(1)}\subseteq\cdots\subseteq E_{H}^{(q(n))}\subseteq E_{G}^{(q(n))}\subseteq E_{G}^{(q(n)-1)}\subseteq\cdots\subseteq E_{G}^{(0)}.$$
Proof.

By lines 6 and 11 of Adv in Fig. 1, $E_{H}^{(i-1)}\subseteq E_{H}^{(i)}$ for all $i\in[q(n)]$. By lines 8 and 13, $E_{G}^{(i)}\subseteq E_{G}^{(i-1)}$ for all $i\in[q(n)]$.

To show that $E_{H}^{(q(n))}\subseteq E_{G}^{(q(n))}$, we shall prove the stronger statement that $E_{H}^{(i)}\subseteq E_{G}^{(i)}$ for all $i\in\{0,1,\ldots,q(n)\}$ by mathematical induction. By equation (6), $E_{H}^{(0)}\subseteq E_{G}^{(0)}$. Assume as the induction hypothesis that $E_{H}^{(i-1)}\subseteq E_{G}^{(i-1)}$. The following shows that $E_{H}^{(i)}\subseteq E_{G}^{(i-1)}$ by examining each $e\in E_{H}^{(i)}$:

  Case 1: $e\in E_{H}^{(i-1)}$. By the induction hypothesis, $e\in E_{G}^{(i-1)}$.

  Case 2: $e\notin E_{H}^{(i-1)}$. As $e\in E_{H}^{(i)}\setminus E_{H}^{(i-1)}$, lines 6 and 11 show that $e$ is on $P_{i}$ (and that the $i$th iteration of the loop of Adv runs line 6 rather than line 11). By line 5, each edge on $P_{i}$ is in $E_{G}^{(i-1)}$. In particular, $e\in E_{G}^{(i-1)}$.

Having shown that $E_{H}^{(i)}\subseteq E_{G}^{(i-1)}$, lines 8 and 13 will both result in $E_{H}^{(i)}\subseteq E_{G}^{(i)}$, completing the induction step. ∎

Lemma 5.

For all $i\in[q(n)]$ with $d_{G^{(i-1)}}(a_{i},b_{i})\leq h$,

$$d_{H^{(i)}}\left(a_{i},b_{i}\right)=d_{H^{(q(n))}}\left(a_{i},b_{i}\right)=d_{G^{(q(n))}}\left(a_{i},b_{i}\right)=d_{G^{(i-1)}}\left(a_{i},b_{i}\right).$$
Proof.

By line 4 of Adv, the $i$th iteration of the loop runs lines 5–9. Lines 5–7 put (the edges of) a shortest $a_{i}$-$b_{i}$ path in $G^{(i-1)}$ into $H^{(i)}$; hence

$$d_{H^{(i)}}\left(a_{i},b_{i}\right)\leq d_{G^{(i-1)}}\left(a_{i},b_{i}\right).$$

This and Lemma 4 complete the proof. ∎
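The final step of this proof can be unpacked as follows. This is a sketch of the omitted chain; it relies only on Lemma 4 and on the observation that enlarging an edge set cannot increase shortest distances.

```latex
% Lemma 4 gives E_H^{(i)} \subseteq E_H^{(q(n))} \subseteq E_G^{(q(n))} \subseteq E_G^{(i-1)}.
% A larger edge set can only shorten distances, so reading the inclusions
% from right to left yields the first three inequalities below; the last
% inequality is the one displayed in the proof.
\begin{align*}
d_{G^{(i-1)}}\left(a_{i},b_{i}\right)
  \leq d_{G^{(q(n))}}\left(a_{i},b_{i}\right)
  \leq d_{H^{(q(n))}}\left(a_{i},b_{i}\right)
  \leq d_{H^{(i)}}\left(a_{i},b_{i}\right)
  \leq d_{G^{(i-1)}}\left(a_{i},b_{i}\right).
\end{align*}
% The chain starts and ends with the same quantity, so all four are equal.
```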

Below is an easy consequence of Lemma 4.

Lemma 6.

For all $i\in[q(n)]$ with $d_{G^{(i-1)}}(a_{i},b_{i})>h$,

$$d_{G^{(q(n))}}(a_{i},b_{i})>h.$$

3.1 The average distance from $A$'s output to other points

This subsection shows that the output of $A^{\mathsf{Adv}}$ has a large average distance to other points, according to the answers of Adv.

Lemma 7.

For all $i\in[q(n)]$ and $v\in[n]$,

$$\text{deg}_{H^{(i)}}(v)\leq\text{deg}_{H^{(i-1)}}(v)+2.$$
Proof.

If the $i$th iteration of the loop of Adv runs lines 11–14 but not 5–9, then $H^{(i)}=H^{(i-1)}$, proving the lemma. So assume otherwise. Being shortest, $P_{i}$ in line 5 does not repeat vertices. Therefore, $v$ is incident to at most two edges on $P_{i}$, which together with lines 6–7 completes the proof. ∎

Lemma 8.

For all $v\in[n]$,

$$\text{deg}_{H^{(q(n))}}(v)<\delta n^{1/(h-1)}.$$
Proof.

Assume

$$\text{deg}_{H^{(q(n))}}(v)\geq\delta n^{1/(h-1)}-2 \tag{8}$$

for, otherwise, there is nothing to prove. Clearly,

$$\text{deg}_{H^{(0)}}(v)\stackrel{\text{(6)–(7)}}{=}0\stackrel{\text{(2)}}{<}\delta n^{1/(h-1)}-2. \tag{9}$$

By inequalities (8)–(9), there exists $i\in[q(n)]$ satisfying

\begin{align}
\text{deg}_{H^{(i-1)}}(v) &< \delta n^{1/(h-1)}-2, \tag{10}\\
\text{deg}_{H^{(i)}}(v) &\geq \delta n^{1/(h-1)}-2. \tag{11}
\end{align}

Clearly,

$$N_{G^{(i)}}(v)=\left\{u\in[n]\mid\left(u,v\right)\in E_{G}^{(i)}\right\}. \tag{12}$$

As $H^{(i-1)}\neq H^{(i)}$ by inequalities (10)–(11), the $i$th iteration of the loop of Adv runs lines 5–9 but not 11–14. By inequality (11) and line 8 of Adv,

$$\left\{u\in[n]\mid\left(u,v\right)\in E_{G}^{(i)}\right\}=\left\{u\in[n]\mid\left(u,v\right)\in E_{G}^{(i-1)}\setminus\left(E_{G}^{(i-1)}\setminus E_{H}^{(i)}\right)\right\}. \tag{13}$$

Equations (12)–(13) and Lemma 4 give

$$N_{G^{(i)}}(v)=\left\{u\in[n]\mid\left(u,v\right)\in E_{H}^{(i)}\right\}. \tag{14}$$

By inequality (10) and Lemma 7,

$$\text{deg}_{H^{(i)}}(v)<\delta n^{1/(h-1)}.$$

This and equation (14) imply $\text{deg}_{G^{(i)}}(v)<\delta n^{1/(h-1)}$, which together with Lemma 4 completes the proof. ∎

Lemma 9.

For all $v\in[n]$,

$$\left|\left\{u\in[n]\mid d_{H^{(q(n))}}\left(v,u\right)<h\right\}\right|\leq 2\delta^{h-1}n.$$
Proof.

By Lemma 8,

$$\left|\left\{u\in[n]\mid\text{$\exists$ $v$-$u$ path in $H^{(q(n))}$ with exactly $k$ edges}\right\}\right|\leq\left(\delta n^{1/(h-1)}\right)^{k}$$

for all $k\in\mathbb{N}$. Consequently,

\begin{align*}
\left|\left\{u\in[n]\mid\text{$\exists$ $v$-$u$ path in $H^{(q(n))}$ with at most $h-1$ edges}\right\}\right|
&\leq \sum_{k=0}^{h-1}\,\left(\delta n^{1/(h-1)}\right)^{k}\\
&\stackrel{\text{(2) and Fact 3}}{\leq} 2\delta^{h-1}n.
\end{align*}

Finally, recall that $H^{(q(n))}$ is unweighted. ∎
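The application of Fact 3 in this proof amounts to the following substitution, written out here as a worked step; the shorthand $r$ is introduced only for this illustration.

```latex
% Apply Fact 3 with r = delta * n^{1/(h-1)} and m = h - 1.
% Inequality (2) gives r > 3, so the hypothesis r >= 2 of Fact 3 holds.
\begin{align*}
\sum_{k=0}^{h-1}\left(\delta n^{1/(h-1)}\right)^{k}
  \leq 2\left(\delta n^{1/(h-1)}\right)^{h-1}
  = 2\,\delta^{h-1}\, n^{(h-1)/(h-1)}
  = 2\delta^{h-1}n.
\end{align*}
```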

Denote the output of $A^{\mathsf{Adv}}$ by $z$. Furthermore, let

$$I\stackrel{\text{def.}}{=}\left\{j\in\left[q(n)\right]\mid z\in\left\{a_{j},b_{j}\right\}\right\}. \tag{15}$$

The following lemma analyzes the sum of the distances, as answered by line 17 of Adv, from $z$ to other points.

Lemma 10.

\begin{align*}
&\sum_{i\in I}\,\min\left\{d_{H^{(i)}}\left(a_{i},b_{i}\right),\,h-\frac{1}{2}\cdot\chi\left[\exists v\in\left\{a_{i},b_{i}\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\leq\delta n^{1/(h-1)}\right)\right]\right\}\\
\geq{}& n\cdot\left(h-2h\delta^{h-1}-o(1)-\delta\right).
\end{align*}
Proof.

By Lemma 4,

\begin{align*}
&\sum_{i\in I}\,\min\left\{d_{H^{(i)}}\left(a_{i},b_{i}\right),\,h-\frac{1}{2}\cdot\chi\left[\exists v\in\left\{a_{i},b_{i}\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\leq\delta n^{1/(h-1)}\right)\right]\right\}\\
\geq{}&\sum_{i\in I}\,\min\left\{d_{H^{(q(n))}}\left(a_{i},b_{i}\right),\,h-\frac{1}{2}\cdot\chi\left[\exists v\in\left\{a_{i},b_{i}\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\leq\delta n^{1/(h-1)}\right)\right]\right\}\\
\geq{}&\sum_{i\in I}\,\min\left\{d_{H^{(q(n))}}\left(a_{i},b_{i}\right),h\right\}
-\sum_{i\in I}\,\frac{1}{2}\cdot\chi\left[\exists v\in\left\{a_{i},b_{i}\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\leq\delta n^{1/(h-1)}\right)\right].
\end{align*}

For all $i\in I$, there exists $c_{i}\in[n]$ with $\{z,c_{i}\}=\{a_{i},b_{i}\}$ by equation (15). Therefore,

$$\sum_{i\in I}\,\min\left\{d_{H^{(q(n))}}\left(a_{i},b_{i}\right),h\right\}=\sum_{i\in I}\,\min\left\{d_{H^{(q(n))}}\left(z,c_{i}\right),h\right\}.$$

As we forbid repeated queries, $\{c_{i}\}_{i\in I}$ is a sequence of distinct points. So by Lemma 9,

$$\sum_{i\in I}\,\min\left\{d_{H^{(q(n))}}\left(z,c_{i}\right),h\right\}\geq h\cdot\left(|I|-2\delta^{h-1}n\right).$$

Recall that $A^{\mathsf{Adv}}$ will have queried for the distances between its output (which is $z$) and all other points when halting. So

$$|I|\geq n-1$$

by equation (15). (Footnote 4: Because we forbid repeated queries and queries for the distance from a point to itself, we also have $|I|\leq n-1$.)

Clearly,

\begin{align*}
&\sum_{i\in I}\,\chi\left[\exists v\in\left\{a_{i},b_{i}\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\leq\delta n^{1/(h-1)}\right)\right]\\
={}&\sum_{i\in I}\,\chi\left[\exists v\in\left\{z,c_{i}\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\leq\delta n^{1/(h-1)}\right)\right]\\
\leq{}&\sum_{i\in I}\,\chi\left[\left(z\in S\right)\land\left(\text{deg}_{Q^{(i)}}\left(z\right)\leq\delta n^{1/(h-1)}\right)\right]
+\sum_{i\in I}\,\chi\left[\left(c_{i}\in S\right)\land\left(\text{deg}_{Q^{(i)}}\left(c_{i}\right)\leq\delta n^{1/(h-1)}\right)\right].
\end{align*}

By line 16 of Adv and equation (15),

$$\text{deg}_{Q^{(i)}}\left(z\right)=\left|\left\{j\in I\mid j\leq i\right\}\right|.$$

Therefore,

\begin{align*}
\sum_{i\in I}\,\chi\left[\left(z\in S\right)\land\left(\text{deg}_{Q^{(i)}}\left(z\right)\leq\delta n^{1/(h-1)}\right)\right]
&\leq \sum_{i\in I}\,\chi\left[\left|\left\{j\in I\mid j\leq i\right\}\right|\leq\delta n^{1/(h-1)}\right]\\
&\leq \delta n^{1/(h-1)},
\end{align*}

where the last inequality follows because $|\{j\in I\mid j\leq i\}|=k$ when $i$ is the $k$th smallest element of $I$, for all $k\in[|I|]$. Recall the distinctness of the points in $\{c_{i}\}_{i\in I}$. Therefore,

$$\sum_{i\in I}\,\chi\left[\left(c_{i}\in S\right)\land\left(\text{deg}_{Q^{(i)}}\left(c_{i}\right)\leq\delta n^{1/(h-1)}\right)\right]\leq\sum_{i\in I}\,\chi\left[c_{i}\in S\right]\leq|S|=\left\lfloor\delta n\right\rfloor. \tag{17}$$

The inequalities above, from the first display of this proof through (17), complete the proof. ∎
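The final assembly of the bound is left implicit. One way to carry it out is sketched below, using only the displayed inequalities of this proof together with $n^{1/(h-1)}\leq n$ (valid because $h\geq 2$); the abbreviation $\Sigma$ for the left-hand side of Lemma 10 is introduced only for this illustration.

```latex
% Sigma denotes the left-hand side of Lemma 10 (shorthand used only here).
\begin{align*}
\Sigma
 &\geq h\cdot\left(|I|-2\delta^{h-1}n\right)
      -\frac{1}{2}\left(\delta n^{1/(h-1)}+\lfloor\delta n\rfloor\right)\\
 &\geq h\,(n-1)-2h\delta^{h-1}n-\frac{1}{2}\left(\delta n+\delta n\right)\\
 &= n\cdot\left(h-2h\delta^{h-1}-\frac{h}{n}-\delta\right),
\end{align*}
% and h/n = o(1) because h is a constant independent of n.
```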

3.2 Planting a point with a small average distance to other points

This subsection constructs a distance function with respect to which a certain point has an average distance of approximately $1/2$ to other points.

Lemma 11.

$$\left|E_{H}^{(q(n))}\right|\leq h\cdot q(n).$$
Proof.

Consider the $i$th iteration of the loop of Adv, where $i\in[q(n)]$.

  • Running lines 4–5 results in $P_{i}$ having at most $h$ edges. Consequently,

    $$\left|E_{H}^{(i)}\right|\leq\left|E_{H}^{(i-1)}\right|+h \tag{18}$$

    by line 6.

  • Running line 11 yields $|E_{H}^{(i)}|=|E_{H}^{(i-1)}|$, implying inequality (18) as well.

Now,

$$\left|E_{H}^{(q(n))}\right|-\left|E_{H}^{(0)}\right|=\sum_{i=1}^{q(n)}\,\left(\left|E_{H}^{(i)}\right|-\left|E_{H}^{(i-1)}\right|\right)\stackrel{\text{(18)}}{\leq}h\cdot q(n).$$

Finally, $|E_{H}^{(0)}|=0$ by equation (6). ∎

Lemma 12.

$$\left|\left\{u\in[n]\mid\text{deg}_{H^{(q(n))}}(u)\geq\delta n^{1/(h-1)}-2\right\}\right|=\frac{h}{\delta}\cdot o(n).$$
(Footnote 5: We explicitly write down the constants $h$ and $\delta$ on the right-hand side for clarity, although they can be absorbed within $o(\cdot)$.)
Proof.

By Fact 2, the average degree in $H^{(q(n))}$ is

$$\frac{1}{n}\cdot 2\cdot\left|E_{H}^{(q(n))}\right|.$$

So by the averaging argument (that any finite nonempty sequence of nonnegative numbers with average $\bar{a}$ has at most an $\bar{a}/t$ fraction of numbers that are greater than or equal to $t>0$),

$$\frac{1}{n}\cdot\left|\left\{u\in[n]\mid\text{deg}_{H^{(q(n))}}(u)\geq\delta n^{1/(h-1)}-2\right\}\right|\leq\frac{1}{n}\cdot 2\cdot\left|E_{H}^{(q(n))}\right|\cdot\frac{1}{\delta n^{1/(h-1)}-2},$$

where the rightmost denominator is positive and is $\Theta(\delta n^{1/(h-1)})$ by inequality (2). This and Lemma 11 complete the proof. ∎
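The concluding step can be written out as follows. This is a sketch; the explicit constant $6$ comes from a rearrangement of inequality (2) chosen here for illustration and does not appear in the paper.

```latex
% Inequality (2) says delta*n^{1/(h-1)} > 3, and x > 3 implies x - 2 > x/3.
\begin{align*}
\left|\left\{u\in[n]\mid\text{deg}_{H^{(q(n))}}(u)\geq\delta n^{1/(h-1)}-2\right\}\right|
 &\leq \frac{2\left|E_{H}^{(q(n))}\right|}{\delta n^{1/(h-1)}-2}
  \leq \frac{2h\cdot q(n)}{\delta n^{1/(h-1)}-2}
  && \text{(Lemma 11)}\\
 &\leq \frac{6h\cdot q(n)}{\delta n^{1/(h-1)}}
  && \text{(inequality (2))}\\
 &= \frac{h}{\delta}\cdot o(n)
  && \text{(as $q(n)=o(n^{1+1/(h-1)})$)}.
\end{align*}
```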

By inequality (2), $S\setminus\{z\}\neq\emptyset$. Let

\[ \hat{\alpha} \stackrel{\text{def.}}{=} \mathop{\mathrm{argmin}}_{\alpha\in S\setminus\{z\}}\, \text{deg}_{Q^{(q(n))}}(\alpha), \tag{19} \]

breaking ties arbitrarily.

Lemma 13.

For all $i\in[q(n)]$,

\[ \text{deg}_{Q^{(i)}}\left(\hat{\alpha}\right) \le \delta n^{1/(h-1)}. \]

Proof.

By line 16 of Adv,

\[ \text{deg}_{Q^{(i)}}\left(\hat{\alpha}\right) \le \text{deg}_{Q^{(q(n))}}\left(\hat{\alpha}\right). \tag{20} \]

By equation (19) and the averaging argument,

\[ \text{deg}_{Q^{(q(n))}}(\hat{\alpha}) \le \frac{1}{|S\setminus\{z\}|}\cdot\sum_{\alpha\in S\setminus\{z\}} \text{deg}_{Q^{(q(n))}}(\alpha). \]

Furthermore,

\[ \sum_{\alpha\in S\setminus\{z\}} \text{deg}_{Q^{(q(n))}}(\alpha) \le \sum_{\alpha\in[n]} \text{deg}_{Q^{(q(n))}}(\alpha) = 2q(n), \tag{21} \]

where the equality follows from Fact 2, line 16 of Adv and the fact that queries are never repeated. Finally,

\[ \text{deg}_{Q^{(i)}}(\hat{\alpha}) \stackrel{\text{(20)–(21)}}{\le} \frac{2q(n)}{|S|-1} \stackrel{\text{(3)}}{\le} \delta n^{1/(h-1)}. \]
∎

Inductively, let

\begin{align*}
V_0 &\stackrel{\text{def.}}{=} \left\{\hat{\alpha}\right\}, \tag{22}\\
V_1 &\stackrel{\text{def.}}{=} N_{Q^{(q(n))}}\left(\hat{\alpha}\right)\setminus V_0, \tag{23}\\
V_{j+1} &\stackrel{\text{def.}}{=} N_{H^{(q(n))}}\left(V_j\right)\setminus\left(\bigcup_{i=0}^{j} V_i\right) \tag{24}
\end{align*}

for all $j\in[h-2]$. Furthermore,

\[ V_h \stackrel{\text{def.}}{=} [n]\setminus\left(\bigcup_{i=0}^{h-1} V_i\right). \tag{25} \]

The following lemma is not hard to see from equations (22)–(25).

Lemma 14.

$(V_0,V_1,\ldots,V_h)$ is a partition of $[n]$, i.e., $\bigcup_{k=0}^{h} V_k=[n]$ and $V_i\cap V_j=\emptyset$ for all distinct $i$, $j\in\{0,1,\ldots,h\}$.
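The layered construction in equations (22)–(25) can be sketched in code. This is only an illustration under stated assumptions: vertices are numbered $0,\ldots,n-1$ rather than $1,\ldots,n$, the final query graph and the final graph $H$ are given as hypothetical adjacency dictionaries `q_adj` and `h_adj`, and $N(V_j)$ is read as the union of the neighbor sets of the vertices in $V_j$.

```python
def build_layers(n, h, alpha_hat, q_adj, h_adj):
    """Return [V_0, ..., V_h] following equations (22)-(25).

    q_adj / h_adj map each vertex in range(n) to its neighbor set in the
    final query graph and in the final graph H, respectively (assumed inputs).
    """
    layers = [{alpha_hat}]                        # V_0, equation (22)
    seen = {alpha_hat}
    layers.append(set(q_adj[alpha_hat]) - seen)   # V_1, equation (23)
    seen |= layers[1]
    for j in range(1, h - 1):                     # j in [h-2], equation (24)
        nxt = set()
        for u in layers[j]:
            nxt |= set(h_adj[u])
        nxt -= seen                               # drop vertices of earlier layers
        layers.append(nxt)
        seen |= nxt
    layers.append(set(range(n)) - seen)           # V_h, equation (25)
    return layers


# Toy instance (made up): alpha_hat = 0 was queried only against vertex 1,
# and H contains the path 1 - 2 - 3.
n, h = 6, 3
q_adj = {v: set() for v in range(n)}
h_adj = {v: set() for v in range(n)}
q_adj[0].add(1)
q_adj[1].add(0)
for u, v in [(1, 2), (2, 3)]:
    h_adj[u].add(v)
    h_adj[v].add(u)
layers = build_layers(n, h, 0, q_adj, h_adj)
```

Because each new layer removes everything already seen and the last layer collects the remainder, the layers are pairwise disjoint and cover all vertices, which is the content of Lemma 14.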

Let

\begin{align*}
B &= \left\{u\in[n] \mid \text{deg}_{H^{(q(n))}}(u) \ge \delta n^{1/(h-1)} - 2\right\}, \tag{26}\\
\mathcal{E} &\stackrel{\text{def.}}{=} \left[E_G^{(q(n))}\setminus\left(\bigcup_{i,j\in\{0,1,\ldots,h\},\,|i-j|\ge 2} V_i\times V_j\right)\right]\cup\left(\left\{\hat{\alpha}\right\}\times\left(V_h\setminus\left(B\cup S\right)\right)\right). \tag{27}
\end{align*}

By equation (19), $\hat{\alpha}\notin V_h\setminus(B\cup S)$, which together with equation (4) and Lemma 4 forbids any edge in $\mathcal{E}$ from being a self-loop. For all distinct $u$, $v\in[n]$,

\[ w\left(u,v\right) \stackrel{\text{def.}}{=} \begin{cases} 1/2, & \text{if one of $u$ and $v$ is $\hat{\alpha}$ and the other is in $V_h\setminus(B\cup S)$,}\\ 1, & \text{otherwise.} \end{cases} \tag{30} \]

Furthermore, let

\[ \mathcal{G} \stackrel{\text{def.}}{=} \left([n],\mathcal{E},w\right) \tag{31} \]

be a weighted undirected graph.

Lemma 15.
\[ \sum_{j=1}^{h-1} \left|V_j\right| \le 2\delta^{h-1}n. \]
Proof.

By Lemma 8 and equation (24),

\[ \left|V_{j+1}\right| \le \left|V_j\right|\cdot\delta n^{1/(h-1)} \]

for all $j\in[h-2]$. Therefore, $\sum_{j=1}^{h-1} |V_j|$ is bounded from above by the $(h-1)$-term geometric series with the common ratio of $\delta n^{1/(h-1)}$ and the initial value of $|V_1|$. Consequently,

\[ \sum_{j=1}^{h-1} \left|V_j\right| \stackrel{\text{(2) and Fact 3}}{\le} 2\cdot\left|V_1\right|\cdot\delta^{h-2}n^{(h-2)/(h-1)}. \tag{32} \]

By Lemma 13, $|N_{Q^{(q(n))}}(\hat{\alpha})|\le\delta n^{1/(h-1)}$. So by equation (23), we have $|V_1|\le\delta n^{1/(h-1)}$, which together with inequality (32) completes the proof. ∎
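For readers who want the geometric-series step spelled out, here is one way it can go. This is a sketch under an assumption: we write $r=\delta n^{1/(h-1)}$ and assume $r\ge 2$, which is our guess at what inequality (2) and Fact 3 supply; their exact statements are not reproduced in this section.

```latex
\begin{align*}
\sum_{j=1}^{h-1} |V_j|
  &\le |V_1| \sum_{k=0}^{h-2} r^{k}
   = |V_1| \cdot \frac{r^{h-1}-1}{r-1} \\
  &\le 2\,|V_1|\, r^{h-2}
   && \text{(since } r^{h-2}(r-2)+1 \ge 0 \text{ when } r \ge 2\text{)} \\
  &\le 2\, r^{h-1}
   = 2\,\delta^{h-1} n
   && \text{(since } |V_1| \le r\text{).}
\end{align*}
```

The middle inequality is equivalent to $r^{h-1}-1\le 2r^{h-1}-2r^{h-2}$, which rearranges to the parenthetical condition.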

Lemma 16.
\[ \left|V_h\setminus\left(B\cup S\right)\right| \ge n\left(1-2\delta^{h-1}-\frac{h}{\delta}\cdot o(1)-\delta\right). \]
Proof.

By Lemma 12 and equation (26), $|B|=(h/\delta)\cdot o(n)$. By construction, $|S|=\lfloor\delta n\rfloor$. Finally,

\[ \left|V_h\right| \stackrel{\text{Lemmas 14–15}}{\ge} n-2\delta^{h-1}n-\left|V_0\right| \stackrel{\text{(22)}}{=} n-2\delta^{h-1}n-1. \]
∎

The following lemma says that $\hat{\alpha}$ has an average distance of approximately $1/2$ to other points w.r.t. the distance function $\min\{d_{\mathcal{G}}(\cdot,\cdot),h\}$.

Lemma 17.
\[ \sum_{v\in[n]} \min\left\{d_{\mathcal{G}}\left(\hat{\alpha},v\right),h\right\} \le n\cdot\left(\frac{1}{2}+2h\delta^{h-1}+\frac{h^2}{\delta}\cdot o(1)+h\delta\right). \]
Proof.

By equations (27)–(31), $d_{\mathcal{G}}(\hat{\alpha},v)\le 1/2$ for all $v\in V_h\setminus(B\cup S)$. This and Lemma 16 complete the proof. ∎
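The arithmetic behind this short proof can be filled in as follows. This is a sketch, not the author's wording. It uses only that each summand is at most $1/2$ on $V_h\setminus(B\cup S)$, that each summand is at most $h$ everywhere, and the lower bound of Lemma 16.

```latex
\begin{align*}
\sum_{v\in[n]} \min\{d_{\mathcal{G}}(\hat{\alpha},v),h\}
  &\le \frac{1}{2}\cdot\left|V_h\setminus(B\cup S)\right|
     + h\cdot\left(n-\left|V_h\setminus(B\cup S)\right|\right) \\
  &\le \frac{n}{2}
     + h\cdot n\left(2\delta^{h-1}+\frac{h}{\delta}\cdot o(1)+\delta\right) \\
  &= n\cdot\left(\frac{1}{2}+2h\delta^{h-1}+\frac{h^{2}}{\delta}\cdot o(1)+h\delta\right).
\end{align*}
```

The second line bounds the first term using $|V_h\setminus(B\cup S)|\le n$ and the second term using Lemma 16.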

3.3 A metric consistent with Adv’s answers

This subsection constructs a metric $d\colon[n]^2\to[0,\infty)$ consistent with Adv’s answers in line 17. So Lemma 10 will require $z$, which is the output of $A^{\mathsf{Adv}}$, to have an average distance (w.r.t. $d$) of at least approximately $h$ to other points. Although $d(\cdot,\cdot)$ will not be exactly $\min\{d_{\mathcal{G}}(\cdot,\cdot),h\}$, Lemma 17 will forbid $\sum_{v\in[n]} d(\hat{\alpha},v)/n$ from exceeding $1/2$ by too much. Details follow.

Recall that $H^{(i)}$ and $G^{(i)}$ are unweighted for all $i\in\{0,1,\ldots,q(n)\}$. They can be treated as having the weight function $w$ while preserving $d_{H^{(i)}}(\cdot,\cdot)$ and $d_{G^{(i)}}(\cdot,\cdot)$, as shown by the lemma below.

Lemma 18.

For all $i\in\{0,1,\ldots,q(n)\}$, each path $P$ in $H^{(i)}$ or $G^{(i)}$ has exactly $w(P)$ edges.

Proof.

As $\hat{\alpha}\in S$ by equation (19), equation (30) implies $w(u,v)=1$ for all distinct $u$, $v\in[n]\setminus S$. This and equation (4) imply that all edges in $E_G^{(0)}$ have weight $1$ w.r.t. $w$. So by Lemma 4, the edges in $E_H^{(i)}\cup E_G^{(i)}$ have weight $1$ w.r.t. $w$. Finally, recall that $H^{(i)}=([n],E_H^{(i)})$ and $G^{(i)}=([n],E_G^{(i)})$. ∎

We now show that $H^{(q(n))}$ has an edge in $V_i\times V_j$ only if $|i-j|\le 1$.

Lemma 19.
\[ E_H^{(q(n))}\cap\left(\bigcup_{i,j\in\{0,1,\ldots,h\},\,|i-j|\ge 2} V_i\times V_j\right)=\emptyset. \]
Proof.

Suppose for contradiction that there exists $e\in E_H^{(q(n))}$ with an endpoint in $V_k$ and the other in $V_\ell$, where $k$, $\ell\in\{0,1,\ldots,h\}$ and $\ell\ge k+2$. Then $N_{H^{(q(n))}}(V_k)\cap V_\ell\neq\emptyset$, which together with Lemma 14 and $\ell\ge k+2$ implies

\[ N_{H^{(q(n))}}\left(V_k\right)\not\subseteq\bigcup_{j=0}^{k+1} V_j. \tag{33} \]

As $\ell\ge k+2$ and $k$, $\ell\in\{0,1,\ldots,h\}$, we have $0\le k\le h-2$.

  1. Case 1:

    $k=0$. By equations (19) and (22), $V_0\subseteq S$. So $N_{G^{(0)}}(V_0)=\emptyset$ by equations (4)–(5). Consequently, $N_{H^{(q(n))}}(V_0)=\emptyset$ by Lemma 4, contradicting relation (33).

  2. Case 2:

    $k\in[h-2]$. Relation (33) contradicts equation (24) (with $j\leftarrow k$).

A contradiction occurs in either case. ∎

Lemma 20.

$E_H^{(q(n))}\subseteq\mathcal{E}$.

Proof.

By Lemma 19 and equation (27), $E_G^{(q(n))}\cap E_H^{(q(n))}\subseteq\mathcal{E}$. This and Lemma 4 complete the proof. ∎

Lemma 21.

Let $P$ be a path in $\mathcal{G}$ that visits no edges in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$. If the first and the last vertices of $P$ are in $V_h$ and $V_1$, respectively, then $w(P)\ge h-1$.

Proof.

By Lemma 14, $\bigcup_{k=0}^{h} V_k=[n]$, $V_{i+1}\cap V_i=\emptyset$ and $(V_{i+1}\times V_i)\cap(V_{j+1}\times V_j)=\emptyset$ for all distinct $i$, $j\in[h-1]$. Because $P$ is a path in $\mathcal{G}$ visiting no edges in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$, no edges on $P$ are in $V_i\times V_j$ for any $i$, $j\in\{0,1,\ldots,h\}$ with $|i-j|\ge 2$ by equations (27) and (31). This forces $P$, which is a $V_h$-$V_1$ path, to visit at least one edge in $V_{i+1}\times V_i$ for each $i\in[h-1]$ (for a total of at least $h-1$ edges). As $\hat{\alpha}\notin\bigcup_{i=1}^{h} V_i$ by equations (22)–(25), equation (30) gives $w(u,v)=1$ for all $(u,v)\in\bigcup_{i=1}^{h-1} V_{i+1}\times V_i$. We have shown that $P$ has at least $h-1$ edges of weight $1$ (w.r.t. $w$). ∎

We proceed to analyze shortest $a_i$-$b_i$ paths in $\mathcal{G}$, where $i\in[q(n)]$. Clearly, such paths must be simple.

Lemma 22.

Let $P$ be a shortest $a_i$-$b_i$ path in $\mathcal{G}$, where $i\in[q(n)]$. If $P$ visits exactly one edge in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$ and $\hat{\alpha}\in\{a_i,b_i\}$, then $w(P)\ge h-1/2$.

Proof.

Being shortest, $P$ must be simple. Assume $\hat{\alpha}=a_i$ for now. Because $P$ is a simple $\hat{\alpha}$-$b_i$ path in $\mathcal{G}$ visiting exactly one edge in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$, it can be decomposed into an edge $(\hat{\alpha},v)$, where $v\in V_h\setminus(B\cup S)$, and a $v$-$b_i$ path $\tilde{P}$ in $\mathcal{G}$ that visits no edges in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$. [Footnote 6: If the first edge on $P$ is not in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$, then $P$’s later visit of an edge in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$ must make $P$ non-simple, a contradiction.] As $\hat{\alpha}=a_i$, we have $b_i\in N_{Q^{(q(n))}}(\hat{\alpha})$ by line 16 of Adv. So by equations (22)–(23), $b_i\in V_1\cup\{\hat{\alpha}\}$, implying $b_i\in V_1$ because querying for the distance from a point to itself is forbidden and $\hat{\alpha}=a_i$. In summary, $\tilde{P}$ is a path in $\mathcal{G}$, from $v\in V_h\setminus(B\cup S)$ to $b_i\in V_1$, that visits no edges in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$. So by Lemma 21 (with $P\leftarrow\tilde{P}$),

\[ w\left(\tilde{P}\right)\ge h-1. \tag{34} \]

As $v\in V_h$, we have $\hat{\alpha}\neq v$ by equations (22) and (25). By the construction of $\tilde{P}$,

\[ w(P)=w\left(\hat{\alpha},v\right)+w\left(\tilde{P}\right)\stackrel{\text{(30)}}{\ge}\frac{1}{2}+w\left(\tilde{P}\right). \tag{35} \]

Inequalities (34)–(35) show that $w(P)\ge h-1/2$. The case of $\hat{\alpha}=b_i$ is symmetric: Reverse $P$ and exchange all the above occurrences of “$a_i$” with “$b_i$.” ∎

Lemma 23.

For all $i\in[q(n)]$ with $\hat{\alpha}\in\{a_i,b_i\}$,

\[ \chi\left[\exists v\in\left\{a_i,b_i\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\le\delta n^{1/(h-1)}\right)\right]=1. \]
Proof.

By equation (19), $\hat{\alpha}\in S$. This and Lemma 13 complete the proof. ∎

Lemma 24.

For all distinct $u$, $v\in[n]\setminus(B\cup S)$, we have $(u,v)\in E_G^{(q(n))}$.

Proof.

As $u$, $v\in[n]\setminus B$, equation (26) implies

\begin{align*}
\text{deg}_{H^{(i)}}(u) &< \delta n^{1/(h-1)}-2, \tag{36}\\
\text{deg}_{H^{(i)}}(v) &< \delta n^{1/(h-1)}-2 \tag{37}
\end{align*}

when $i=q(n)$. So by Lemma 4, inequalities (36)–(37) hold for all $i\in[q(n)]$.

As $u$, $v\in[n]\setminus S$ and $u\neq v$, we have $(u,v)\in E_G^{(0)}$ by equation (4). By lines 8 and 13 of Adv,

\[ E_G^{(i-1)}\setminus\left\{\left(x,y\right)\in[n]^2 \mid \left(\text{deg}_{H^{(i)}}(x)\ge\delta n^{1/(h-1)}-2\right)\lor\left(\text{deg}_{H^{(i)}}(y)\ge\delta n^{1/(h-1)}-2\right)\right\}\subseteq E_G^{(i)} \tag{38} \]

for all $i\in[q(n)]$. By inequalities (36)–(37) and relation (38), $(u,v)\in E_G^{(i)}$ if $(u,v)\in E_G^{(i-1)}$, for all $i\in[q(n)]$. The proof is complete by mathematical induction. ∎

Lemma 25.

Let $P$ be a shortest $a_i$-$b_i$ path in $\mathcal{G}$, where $i\in[q(n)]$. If $P$ visits exactly two edges in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$, then $G^{(q(n))}$ has an $a_i$-$b_i$ path with exactly $w(P)$ edges.

Proof.

Being shortest, $P$ must be simple. Therefore, the two edges of $P$ in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$, denoted $(u,\hat{\alpha})$ and $(\hat{\alpha},v)$, are consecutive on $P$. Clearly, $u\neq v$. Replace the subpath $(u,\hat{\alpha},v)$ of $P$ by the edge $(u,v)$ to yield an $a_i$-$b_i$ path $\tilde{P}$. Except for the two edges of $P$ in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$ (which are $(u,\hat{\alpha})$ and $(\hat{\alpha},v)$), all edges of $P$ are in $E_G^{(q(n))}$ by equation (27) and $P$’s being a path in $\mathcal{G}=([n],\mathcal{E},w)$. As $u$, $v\in V_h\setminus(B\cup S)$ and $u\neq v$, $(u,v)\in E_G^{(q(n))}$ by Lemma 24. In summary, all the edges of $\tilde{P}$ (including $(u,v)$ and the edges of $P$ not in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$) are in $E_G^{(q(n))}$. Consequently, $\tilde{P}$ is an $a_i$-$b_i$ path in $G^{(q(n))}=([n],E_G^{(q(n))})$. So we are left only to prove that $\tilde{P}$ has exactly $w(P)$ edges, which, by Lemma 18 (with $P\leftarrow\tilde{P}$ and $i\leftarrow q(n)$), is equivalent to proving $w(\tilde{P})=w(P)$.

Note that $\hat{\alpha}\notin V_h\setminus(B\cup S)$ by equation (19). By the construction of $\tilde{P}$ and recalling that $u$, $v\in V_h\setminus(B\cup S)$ and $u\neq v$,

\[ w\left(\tilde{P}\right)=w(P)-w\left(u,\hat{\alpha}\right)-w\left(\hat{\alpha},v\right)+w\left(u,v\right)\stackrel{\text{(30)}}{=}w(P)-\frac{1}{2}-\frac{1}{2}+1=w(P). \]
∎

Lemma 26.

Every simple path in $\mathcal{G}$ visiting exactly one edge in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$ either starts or ends at $\hat{\alpha}$.

Proof.

By equation (19), $\hat{\alpha}\in S$. So by equation (4) and Lemma 4, $\hat{\alpha}$ is incident to no edges in $E_G^{(q(n))}$. Consequently, the set of all edges of $\mathcal{G}$ incident to $\hat{\alpha}$ is $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$ by equation (27). The lemma is now easy to see. ∎

Lemma 27.

For all $i\in[q(n)]$,

\begin{align*}
&\min\left\{d_{H^{(i)}}\left(a_i,b_i\right),\,h-\frac{1}{2}\cdot\chi\left[\exists v\in\left\{a_i,b_i\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\le\delta n^{1/(h-1)}\right)\right]\right\} \tag{39}\\
\le{}&\min\left\{d_{\mathcal{G}}\left(a_i,b_i\right),\,h-\frac{1}{2}\cdot\chi\left[\exists v\in\left\{a_i,b_i\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\le\delta n^{1/(h-1)}\right)\right]\right\}.
\end{align*}
Proof.

Assume the existence of an $a_i$-$b_i$ path in $\mathcal{G}$ for, otherwise, $d_{\mathcal{G}}(a_i,b_i)=\infty$ and inequality (39) trivially holds. Pick any shortest $a_i$-$b_i$ path $P$ in $\mathcal{G}=([n],\mathcal{E},w)$. Clearly,

\[ w(P)=d_{\mathcal{G}}\left(a_i,b_i\right). \tag{40} \]

Being shortest, $P$ must be simple.

We establish inequality (39) in the following exhaustive cases:

  1. Case 1:

    $P$ visits no edges in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$. By equation (27), all edges of $P$ are in $E_G^{(q(n))}$, i.e., $P$ is a path in $G^{(q(n))}$. So by Lemma 18 (with $i\leftarrow q(n)$), $w(P)$ equals the length of $P$ in the unweighted graph $G^{(q(n))}$. Therefore,

    \[ d_{G^{(q(n))}}\left(a_i,b_i\right)\le w(P). \tag{41} \]

    If $d_{G^{(i-1)}}(a_i,b_i)\le h$, then

    \[ d_{H^{(i)}}\left(a_i,b_i\right)=d_{G^{(q(n))}}\left(a_i,b_i\right) \]

    by Lemma 5. Otherwise, $d_{G^{(q(n))}}(a_i,b_i)>h$ by Lemma 6. In either case, equations (40)–(41) imply inequality (39).

  2. Case 2:

    $P$ visits exactly one edge in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$ and $\hat{\alpha}\in\{a_i,b_i\}$. By Lemma 22 and equation (40), $d_{\mathcal{G}}(a_i,b_i)\ge h-1/2$. This and Lemma 23 force the right-hand side of inequality (39) to equal $h-1/2$. By Lemma 23, the left-hand side of inequality (39) is less than or equal to $h-1/2$. We have verified inequality (39).

  3. Case 3:

    $P$ visits exactly one edge in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$ and $\hat{\alpha}\notin\{a_i,b_i\}$. A contradiction to Lemma 26 occurs.

  4. Case 4:

    $P$ visits exactly two edges in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$. Lemma 25 and the fact that $G^{(q(n))}$ is unweighted imply inequality (41). Proceeding as in Case 1, equations (40)–(41) and Lemmas 5–6 imply inequality (39) regardless of whether $d_{G^{(i-1)}}(a_i,b_i)\le h$.

  5. Case 5:

    $P$ visits at least three edges in $\{\hat{\alpha}\}\times(V_h\setminus(B\cup S))$. Clearly, $P$ is non-simple, a contradiction.

∎

Define $d\colon[n]^2\to[0,\infty)$ by

\begin{align*}
d\left(a_i,b_i\right)=d\left(b_i,a_i\right) &\stackrel{\text{def.}}{=} \min\left\{d_{\mathcal{G}}\left(a_i,b_i\right),\,h-\frac{1}{2}\cdot\chi\left[\exists v\in\left\{a_i,b_i\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\le\delta n^{1/(h-1)}\right)\right]\right\}, \tag{42}\\
d\left(u,v\right) &\stackrel{\text{def.}}{=} \min\left\{d_{\mathcal{G}}\left(u,v\right),h\right\} \tag{43}
\end{align*}

for all $i\in[q(n)]$ and $(u,v)\in[n]^2\setminus\{(a_j,b_j)\mid j\in[q(n)]\}$. Because all pairs in $[n]^2$ are unordered in this section, $(b_i,a_i)\notin[n]^2\setminus\{(a_j,b_j)\mid j\in[q(n)]\}$ for all $i\in[q(n)]$. Consequently, equation (43) does not redefine $d(b_i,a_i)$. Because $\mathcal{G}$ is undirected, the right-hand side of equation (43) remains intact with $u$ and $v$ interchanged. As $A$ does not repeat queries, equation (42) defines $d(a_i,b_i)$ and $d(b_i,a_i)$ only once for each $i\in[q(n)]$ (note that forbidding repeated queries implies the nonexistence of distinct $i$, $j\in[q(n)]$ satisfying (1) $a_i=a_j$ and $b_i=b_j$ or (2) $a_i=b_j$ and $b_i=a_j$). It is now clear that $d(\cdot,\cdot)$ is a well-defined function on $[n]^2$, a set of unordered pairs. [Footnote 7: Even if we considered each pair in $[n]^2$ to be ordered, our arguments would still have shown that $d(\cdot,\cdot)$ is well-defined and symmetric.] So we have the following lemma.

Lemma 28.

For all $x$, $y\in[n]$, $d(x,y)=d(y,x)$.

Lemma 29.

For all distinct $x$, $y\in[n]$, $d(x,x)=0$ and $d(x,y)\ge 1/2$.

Proof.

Recall that $\mathcal{G}=([n],\mathcal{E},w)$. As $\mathop{\mathrm{Im}}(w)\subseteq[1/2,\infty)$ by equation (30), we have $d_{\mathcal{G}}(x,y)$, $d_{\mathcal{G}}(y,x)\ge 1/2$. So by equations (42)–(43) and $h\in\mathbb{Z}^+\setminus\{1\}$, $d(x,y)\ge 1/2$. Because we forbid queries for the distance from a point to itself, $d(x,x)$ is not defined by equation (42). By equation (43), $d(x,x)=0$. ∎
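The definition of $d$ given before Lemma 28 can be made concrete in code. This is a minimal sketch under stated assumptions: vertices are $0,\ldots,n-1$, the weighted graph is passed as a hypothetical dictionary `weighted_edges`, and `discounted_queries` is a hypothetical set holding the queried pairs whose indicator $\chi[\cdot]$ equals $1$. Queried pairs with indicator $0$ are capped at $h$, exactly like non-queried pairs. Shortest paths are computed with Floyd–Warshall, which is our choice and not something the paper prescribes.

```python
import math


def shortest_paths(n, weighted_edges):
    """All-pairs shortest distances (Floyd-Warshall) in an undirected graph."""
    dist = [[0.0 if u == v else math.inf for v in range(n)] for u in range(n)]
    for (u, v), w in weighted_edges.items():
        dist[u][v] = dist[v][u] = min(dist[u][v], w)
    for k in range(n):
        for u in range(n):
            for v in range(n):
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
    return dist


def build_metric(n, h, weighted_edges, discounted_queries):
    """Cap graph distances at h - 1/2 on discounted queried pairs, else at h."""
    dist = shortest_paths(n, weighted_edges)
    d = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            cap = (h - 0.5) if frozenset((u, v)) in discounted_queries else h
            d[u][v] = min(dist[u][v], cap)
    return d


# Toy instance (made up): one edge of weight 1/2, one of weight 1,
# vertex 3 isolated, and the pair {0, 3} queried with indicator 1.
d = build_metric(4, 2, {(0, 1): 0.5, (1, 2): 1.0}, {frozenset((0, 3))})
```

In this toy instance the isolated vertex is at distance $h-1/2$ from vertex 0 and at distance $h$ from the others, and every pair of distinct vertices is at distance at least $1/2$, in line with Lemma 29.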

Lemma 30.

$([n],d)$ is a metric space.

Proof.

By Lemmas 28–29, we only need to show that

\[ d\left(x,y\right)+d\left(y,z\right)\ge d\left(x,z\right) \tag{44} \]

for all $x$, $y$, $z\in[n]$. It is well-known that a positively-weighted undirected graph induces a distance function obeying the triangle inequality; hence

\[ d_{\mathcal{G}}\left(x,y\right)+d_{\mathcal{G}}\left(y,z\right)\ge d_{\mathcal{G}}\left(x,z\right). \tag{45} \]

Because $\mathcal{G}$ is undirected, $d_{\mathcal{G}}(\cdot,\cdot)$ is symmetric. So by equations (42)–(43),

\[ d\left(x,y\right)\in\left\{\min\left\{d_{\mathcal{G}}\left(x,y\right),h\right\},\,\min\left\{d_{\mathcal{G}}\left(x,y\right),h-\frac{1}{2}\right\}\right\} \tag{46} \]

for all $x$, $y\in[n]$. Now verify inequality (44) in the following exhaustive (but not mutually exclusive) cases:

  1. Case 1:

    $x=y$, $y=z$ or $x=z$. Lemma 29 implies inequality (44).

  2. Case 2:

    $d_{\mathcal{G}}(x,y)\ge h-1/2$ and $y\neq z$. By relation (46), $d(x,y)\ge h-1/2$. As $y\neq z$, $d(y,z)\ge 1/2$ by Lemma 29. By relation (46), $d(x,z)\le h$. Summarizing the above proves inequality (44).

  3. Case 3:

    $d_{\mathcal{G}}(y,z)\ge h-1/2$ and $x\neq y$. Replace “$(x,y)$,” “$(y,z)$” and “$y\neq z$” in the analysis of Case 2 by “$(y,z)$,” “$(x,y)$” and “$x\neq y$,” respectively.

  4. Case 4:

    $d_{\mathcal{G}}(x,y)<h-1/2$ and $d_{\mathcal{G}}(y,z)<h-1/2$. By relation (46), $d(x,y)=d_{\mathcal{G}}(x,y)$ and $d(y,z)=d_{\mathcal{G}}(y,z)$. So inequalities (44)–(45) share a common left-hand side. To deduce inequality (44) from inequality (45), therefore, it suffices to show that $d_{\mathcal{G}}(x,z)\ge d(x,z)$, which follows from relation (46).

∎
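The case analysis above can be sanity-checked by brute force on a small example. The sketch below is only an illustration under stated assumptions: the base metric is a made-up path metric on four points whose smallest nonzero distance is $1/2$, and every possible choice of which pairs receive the cap $h-1/2$ instead of $h$ is tried. The names `capped` and `triangle_violations` are hypothetical.

```python
import itertools


def capped(g, h, discounted):
    """Cap the base metric g at h - 1/2 on pairs in `discounted`, else at h."""
    n = len(g)
    return [[min(g[x][y], (h - 0.5) if frozenset((x, y)) in discounted else h)
             for y in range(n)] for x in range(n)]


def triangle_violations(d):
    """All triples (x, y, z) with d(x, y) + d(y, z) < d(x, z)."""
    n = len(d)
    return [(x, y, z) for x, y, z in itertools.product(range(n), repeat=3)
            if d[x][y] + d[y][z] < d[x][z]]


# Made-up base metric: four points on a line; smallest gap is 1/2.
positions = [0.0, 0.5, 1.5, 2.5]
g = [[abs(p - q) for q in positions] for p in positions]
h = 2
pairs = [frozenset(c) for c in itertools.combinations(range(len(positions)), 2)]

bad = []
for r in range(len(pairs) + 1):
    for chosen in itertools.combinations(pairs, r):
        bad += triangle_violations(capped(g, h, set(chosen)))
```

No violating triple is found for any of the 64 cap assignments, as the lemma predicts for any base metric whose nonzero distances are at least $1/2$.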

Lemma 31.

For all $i\in[q(n)]$,

\[ d_{H^{(i)}}\left(a_i,b_i\right)\ge d_{\mathcal{G}}\left(a_i,b_i\right). \]
Proof.

Assume the existence of an $a_i$-$b_i$ path in $H^{(i)}$ for, otherwise, $d_{H^{(i)}}(a_i,b_i)=\infty$ and there is nothing to prove. Take a shortest $a_i$-$b_i$ path $P$ in the unweighted graph $H^{(i)}=([n],E_H^{(i)})$. So $d_{H^{(i)}}(a_i,b_i)$ is the number of $P$’s edges. By Lemma 18, $P$’s number of edges equals $w(P)$. By Lemma 4, $P$’s edges are in $E_H^{(q(n))}$. So by Lemma 20, $P$ is a path in $\mathcal{G}=([n],\mathcal{E},w)$, implying $d_{\mathcal{G}}(a_i,b_i)\le w(P)$. Summarizing the above proves the lemma. ∎

The following lemma says that line 17 of Adv answers queries consistently with d(,)𝑑d(\cdot,\cdot).

Lemma 32.

For all $i\in[q(n)]$,

\begin{align*}
&\min\left\{d_{H^{(i)}}\left(a_i,b_i\right),\,h-\frac{1}{2}\cdot\chi\left[\exists v\in\left\{a_i,b_i\right\},\,\left(v\in S\right)\land\left(\text{deg}_{Q^{(i)}}(v)\le\delta n^{1/(h-1)}\right)\right]\right\} \tag{47}\\
={}& d\left(a_i,b_i\right).
\end{align*}
Proof.

Lemma 27 and equation (42) prove the “$\le$” part of equation (47). On the other hand, Lemma 31 and equation (42) imply the “$\ge$” part of equation (47). ∎

3.4 Putting things together

We now arrive at our main result.

Theorem 33.

Metric $1$-median has no deterministic $o(n^{1+1/(h-1)})$-query $(2h-\epsilon)$-approximation algorithms for any constants $h\in\mathbb{Z}^+\setminus\{1\}$ and $\epsilon>0$.

Proof.

By Lemma 32 and line 17 of Adv, Adv answers $A$'s queries consistently with $d(\cdot,\cdot)$. This implies that $A^{\sf Adv}$ and $A^d$ have the same output (see, e.g., [2, Lemma 8]). That is, $A^d$ outputs $z$. By Lemma 30, $([n],d)$ is a metric space.

By relation (46), $d(x,y)\leq\min\{d_{\cal G}(x,y),h\}$ for all $x$, $y\in[n]$. Therefore,

$$\sum_{v\in[n]}\,d\left(\hat{\alpha},v\right)\leq n\cdot\left(\frac{1}{2}+2h\delta^{h-1}+\frac{h^2}{\delta}\cdot o(1)+h\delta\right) \tag{48}$$

by Lemma 17.

Recall that $A$ does not repeat queries. So by equation (15) and Lemmas 28–29,

$$\sum_{v\in[n]}\,d\left(z,v\right)\geq\sum_{i\in I}\,d\left(a_i,b_i\right).$$

(In fact, this is an equality because $A^{\sf Adv}$ will have queried for the distances between its output and all other points when halting.)

By Lemmas 10 and 32,

$$\sum_{i\in I}\,d\left(a_i,b_i\right)\geq n\cdot\left(h-2h\delta^{h-1}-o(1)-\delta\right). \tag{49}$$

By inequalities (48)–(49),

$$\frac{\sum_{v\in[n]}\,d\left(z,v\right)}{\sum_{v\in[n]}\,d\left(\hat{\alpha},v\right)}\geq\frac{h-2h\delta^{h-1}-o(1)-\delta}{1/2+2h\delta^{h-1}+(h^2/\delta)\cdot o(1)+h\delta}. \tag{50}$$

Note that all the derivations so far have been valid for all constants $h\in\mathbb{Z}^+\setminus\{1\}$ and $\delta\in(0,1)$. Take $\delta=\delta(h,\epsilon)>0$ to be sufficiently small and $n$ to be sufficiently large so that the right-hand side of inequality (50) is greater than $2h-\epsilon$. (Alternatively, we may take $\delta=\delta(n)=\left(\frac{\max\{q(n),n\}}{n^{1+1/(h-1)}}\right)^{1/3}$ from the beginning of this section. Then, as $q(n)=o(n^{1+1/(h-1)})$, the right-hand side of inequality (50) is $2h-o(1)$, and inequalities (1)–(3) remain true for all sufficiently large $n$.) Then inequality (50) forbids $z$, which is the common output of $A^{\sf Adv}$ and $A^d$, from being a $(2h-\epsilon)$-approximate $1$-median of $([n],d)$. Note that $A$ can be any deterministic $o(n^{1+1/(h-1)})$-query algorithm from the beginning of this section. ∎
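To see numerically why a small $\delta$ suffices in the last step, one can evaluate the right-hand side of inequality (50) with the $o(1)$ terms set to zero, which is what remains in the limit $n\to\infty$ for fixed $h$ and $\delta$. The Python sketch below does only that; the function name `ratio_lower_bound` is hypothetical, and dropping the $o(1)$ terms is a simplifying assumption made for illustration.

```python
def ratio_lower_bound(h, delta):
    """Right-hand side of inequality (50) with every o(1) term replaced by 0.

    h     -- integer constant, h >= 2
    delta -- constant in (0, 1)
    """
    numerator = h - 2 * h * delta ** (h - 1) - delta
    denominator = 0.5 + 2 * h * delta ** (h - 1) + h * delta
    return numerator / denominator


if __name__ == "__main__":
    # The bound approaches 2h from below as delta shrinks.
    for h in (2, 3, 5):
        for delta in (1e-1, 1e-2, 1e-4, 1e-6):
            print(h, delta, ratio_lower_bound(h, delta))
```

Since the numerator stays below $h$ and the denominator stays above $1/2$, the value is always less than $2h$, and it tends to $2h$ as $\delta\to 0$.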

Next, we use Theorem 33 and Fact 1 to determine the minimum value of $c\geq 1$ such that metric $1$-median has a deterministic $O(n^{1+\epsilon})$-query (resp., $O(n^{1+\epsilon})$-time) $c$-approximation algorithm, for each constant $\epsilon\in(0,1)$.

Theorem 34.

For each constant $\epsilon\in(0,1)$,

$$\begin{aligned}
&\min\left\{c\geq 1\mid\text{{\sc metric $1$-median} has a deterministic $O(n^{1+\epsilon})$-query $c$-approx.\ alg.}\right\}\\
={}&\min\left\{c\geq 1\mid\text{{\sc metric $1$-median} has a deterministic $O(n^{1+\epsilon})$-time $c$-approx.\ alg.}\right\}\\
={}&2\left\lceil\frac{1}{\epsilon}\right\rceil.
\end{aligned}$$
Proof.

Take $h=\lceil 1/\epsilon\rceil$; hence $h\in\mathbb{Z}^+\setminus\{1\}$. It is easy to verify that $n^{1+\epsilon}=o(n^{1+1/(h-1)})$. So by Theorem 33, metric $1$-median does not have a deterministic $O(n^{1+\epsilon})$-query $(2\lceil 1/\epsilon\rceil-\epsilon')$-approximation algorithm for any constant $\epsilon'>0$.

Clearly, $n^{1+1/h}=O(n^{1+\epsilon})$. So by Fact 1, metric $1$-median has a deterministic $O(n^{1+\epsilon})$-time $(2\lceil 1/\epsilon\rceil)$-approximation algorithm.

The above analyses remain valid with "query" and "time" exchanged because every $O(n^{1+\epsilon})$-time algorithm makes $O(n^{1+\epsilon})$ queries. Consequently, deterministic $O(n^{1+\epsilon})$-query (resp., $O(n^{1+\epsilon})$-time) algorithms can be $(2\lceil 1/\epsilon\rceil)$-approximate but not $(2\lceil 1/\epsilon\rceil-\epsilon')$-approximate for any constant $\epsilon'>0$. ∎
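As a worked illustration of Theorem 34, the following Python sketch computes $h=\lceil 1/\epsilon\rceil$ and the resulting optimal ratio $2\lceil 1/\epsilon\rceil$ for a rational $\epsilon\in(0,1)$. It also checks the two exponent comparisons used in the proof, namely $1/h\leq\epsilon<1/(h-1)$. The function name `optimal_ratio` is hypothetical, and exact rational arithmetic is used only to avoid floating-point rounding in the ceiling.

```python
from fractions import Fraction
import math


def optimal_ratio(eps):
    """Return (h, 2h) with h = ceil(1/eps), for a rational eps in (0, 1)."""
    eps = Fraction(eps)
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    h = math.ceil(1 / eps)  # h >= 2 because 1/eps > 1
    # Upper bound side (Fact 1): n^(1+1/h) = O(n^(1+eps)) needs 1/h <= eps.
    assert Fraction(1, h) <= eps
    # Lower bound side (Theorem 33): n^(1+eps) = o(n^(1+1/(h-1))) needs eps < 1/(h-1).
    assert eps < Fraction(1, h - 1)
    return h, 2 * h


if __name__ == "__main__":
    for eps in (Fraction(1, 2), Fraction(2, 5), Fraction(1, 3), Fraction(9, 10)):
        print(eps, optimal_ratio(eps))
```

For example, $\epsilon=2/5$ gives $h=3$, so $O(n^{1.4})$ queries allow a $6$-approximation but no $(6-\epsilon')$-approximation.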

The brute-force exact algorithm for metric $1$-median is well-known to run in $O(n^2)$ time. Therefore, there is no need to extend Theorem 34 to the case of $\epsilon\geq 1$. On the other hand, the following corollary deals with the case of $\epsilon=0$.

Corollary 35.

Metric $1$-median does not have a deterministic $O(n^{1+o(1)})$-query (resp., $O(n^{1+o(1)})$-time) $O(1)$-approximation algorithm.

Proof.

Take $h\to\infty$ in Theorem 33. ∎
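For reference, the brute-force exact algorithm mentioned before Corollary 35 can be sketched as follows. This is a minimal Python illustration, not code from the paper: the name `brute_force_1_median` is hypothetical, points are indexed $0,\ldots,n-1$ rather than $1,\ldots,n$, and the metric is assumed to be given as a callable `d(x, y)`.

```python
def brute_force_1_median(n, d):
    """Return (point, total distance) minimizing sum_y d(point, y) over n points.

    Makes n*n distance evaluations, matching the O(n^2) bound.
    """
    best_point, best_sum = None, float("inf")
    for x in range(n):
        total = sum(d(x, y) for y in range(n))
        if total < best_sum:
            best_point, best_sum = x, total
    return best_point, best_sum


if __name__ == "__main__":
    # Five points on a line with d(x, y) = |x - y|: the middle point is the 1-median.
    print(brute_force_1_median(5, lambda x, y: abs(x - y)))
```

Minimizing the total distance is equivalent to minimizing the average distance, since both differ by a factor that does not depend on the chosen point.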

Acknowledgments

The author is supported in part by the Ministry of Science and Technology of Taiwan under grant 103-2221-E-155-026-MY2.

Appendix A Optimizing the hidden factors in Theorem 33

This appendix discusses how the bound of $o(n^{1+1/(h-1)})$ in Theorem 33 hides factors dependent on $h$. For all $i\in[q(n)]$,

$$B_{i-1}\stackrel{\text{def.}}{=}\left\{v\in[n]\mid\text{deg}_{H^{(i-1)}}(v)\geq\delta n^{1/(h-1)}-2\right\}. \tag{51}$$
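Equation (51) simply collects the vertices whose degree in $H^{(i-1)}$ has reached the threshold $\delta n^{1/(h-1)}-2$. A minimal Python sketch, assuming the edge set of $H^{(i-1)}$ is available as a collection of unordered pairs over $\{1,\ldots,n\}$, is given below; the name `high_degree_set` and the parameter `edges_H` are hypothetical.

```python
def high_degree_set(n, edges_H, delta, h):
    """Vertices v in {1, ..., n} with deg_H(v) >= delta * n**(1/(h-1)) - 2.

    edges_H -- iterable of pairs (u, v), each undirected edge listed once
               (assumed representation of the edge set of H^(i-1))
    """
    threshold = delta * n ** (1 / (h - 1)) - 2
    deg = {v: 0 for v in range(1, n + 1)}
    for u, v in edges_H:
        deg[u] += 1
        deg[v] += 1
    return {v for v in deg if deg[v] >= threshold}
```

With $n=64$, $h=3$ and $\delta=0.5$, the threshold is $0.5\cdot 8-2=2$, so only vertices incident to at least two edges of $H^{(i-1)}$ are collected.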
Lemma 36.

For all $i\in[q(n)]$ and distinct $u$, $v\in[n]\setminus(B_{i-1}\cup S)$, we have $(u,v)\in E_G^{(i-1)}$.

Proof.

As $u$, $v\in[n]\setminus B_{i-1}$,

$$\begin{aligned}
\text{deg}_{H^{(j)}}(u) &< \delta n^{1/(h-1)}-2,\\
\text{deg}_{H^{(j)}}(v) &< \delta n^{1/(h-1)}-2
\end{aligned}$$

for all $j\in\{0,1,\ldots,i-1\}$ by equation (51) and Lemma 4. So by lines 8 and 13 of Adv, $(u,v)\in E_G^{(j)}$ if $(u,v)\in E_G^{(j-1)}$, for all $j\in[i-1]$. By equation (4), $(u,v)\in E_G^{(0)}$. The proof is complete by mathematical induction. ∎

Lemma 37.

For each $i\in[q(n)]$ such that the $i$th iteration of the loop of Adv runs lines 5–9, $P_i$ in line 5 does not have two non-consecutive vertices in $[n]\setminus(B_{i-1}\cup S)$.

Proof.

By line 5 of Adv, two non-consecutive vertices on $P_i$ are not connected by an edge in $E_G^{(i-1)}$. This and Lemma 36 complete the proof. ∎

Lemma 38.

For all $i\in[q(n)]$ and $v\in B_{i-1}$,

$$N_{G^{(i-1)}}(v)\subseteq N_{H^{(i-1)}}(v).$$
Proof.

By equation (51),

$$\text{deg}_{H^{(i-1)}}(v)\geq\delta n^{1/(h-1)}-2.$$

Clearly,

$$\text{deg}_{H^{(0)}}(v)\stackrel{(6)}{=}0\stackrel{(2)}{<}\delta n^{1/(h-1)}-2.$$

So there exists $j\in[i-1]$ satisfying

$$\text{deg}_{H^{(j-1)}}(v) < \delta n^{1/(h-1)}-2, \tag{52}$$

$$\text{deg}_{H^{(j)}}(v) \geq \delta n^{1/(h-1)}-2. \tag{53}$$

Clearly,

$$N_{G^{(j)}}(v)=\left\{u\in[n]\mid(u,v)\in E_G^{(j)}\right\}. \tag{54}$$

As $H^{(j-1)}\neq H^{(j)}$ by inequalities (52)–(53), the $j$th iteration of the loop of Adv runs lines 5–9 but not 11–14. By inequality (53) and line 8 of Adv,

$$\left\{u\in[n]\mid(u,v)\in E_G^{(j)}\right\}=\left\{u\in[n]\mid(u,v)\in E_G^{(j-1)}\setminus\left(E_G^{(j-1)}\setminus E_H^{(j)}\right)\right\}. \tag{55}$$

Equations (54)–(55) and Lemma 4 give

$$N_{G^{(j)}}(v)=N_{H^{(j)}}(v).$$

This and Lemma 4 complete the proof. ∎

Lemma 39.

For all $i\in[q(n)]$,

$$\left|E_H^{(i)}\right|\leq\left|E_H^{(i-1)}\right|+1.$$
Proof.

Clearly, we may assume that the $i$th iteration of the loop of Adv runs lines 5–9 but not 11–14. By line 6, we only need to show that

$$\left|\left\{e\mid\left(\text{$e$ is an edge on $P_i$}\right)\land\left(e\notin E_H^{(i-1)}\right)\right\}\right|\leq 1. \tag{56}$$

By Lemma 37, $P_i$ in line 5 has at most one edge in $([n]\setminus(B_{i-1}\cup S))^2$. So, to prove inequality (56), it suffices to show that each edge $(u,v)$ on $P_i$ with $(u,v)\notin([n]\setminus(B_{i-1}\cup S))^2$ satisfies $(u,v)\in E_H^{(i-1)}$, as done below:

  Case 1: $\{u,v\}\cap S\neq\emptyset$. By equation (4) and Lemma 4, $(u,v)\notin E_G^{(i-1)}$. Consequently, $P_i$ has an edge not in $E_G^{(i-1)}$, contradicting line 5.

  Case 2: $\{u,v\}\cap B_{i-1}\neq\emptyset$. By symmetry, assume $v\in B_{i-1}$. So by Lemma 38, $N_{G^{(i-1)}}(v)\subseteq N_{H^{(i-1)}}(v)$. Because $P_i$ is a path in $G^{(i-1)}$ by line 5 and $(u,v)$ is on $P_i$, $u\in N_{G^{(i-1)}}(v)$. In summary, $u\in N_{H^{(i-1)}}(v)$. I.e., $(u,v)\in E_H^{(i-1)}$.

The following improvement over Lemma 11 is immediate from equation (6) and Lemma 39.

Lemma 40.
$$\left|E_H^{(q(n))}\right|\leq q(n).$$

Assuming $100\leq h=o(n^{1/(h-1)})$, the following modifications to this paper show that the bound of $o(n^{1+1/(h-1)})$ in Theorem 33 depends on $h$ as $o(n^{1+1/(h-1)}/h)$:

  (1) Take

    $$\begin{aligned}
    q(n) &= o\left(\frac{n^{1+1/(h-1)}}{h}\right),\\
    \delta &= h\cdot\frac{\max\{q(n),n\}}{n^{1+1/(h-1)}},\\
    \lambda &= \delta^{h/8},\\
    S &= [\lfloor\lambda n\rfloor].
    \end{aligned}$$

  (2) Replace "$\delta$" by "$\sqrt{\delta}$" in inequality (2).

  (3) Replace "$\delta$" by "$1/\delta^{h/4}$" in inequality (3).

  (4) Replace the two occurrences of "$\delta$" by "$\sqrt{\delta}$" in line 8 of Adv.

  (5) Replace "$\delta$" by "$1/\delta^{h/4}$" in line 17 of Adv.

  (6) Replace all occurrences of "$\delta$" by "$\sqrt{\delta}$" in Lemma 8 and its proof.

  (7) Replace all occurrences of "$\delta$" by "$\sqrt{\delta}$" in Lemma 9 and its proof.

  (8) Replace "$\delta n^{1/(h-1)}$" and "$h-2h\delta^{h-1}-o(1)-\delta$" by "$n^{1/(h-1)}/\delta^{h/4}$" and "$h-2h{\sqrt{\delta}}^{h-1}-o(1)-\lambda/2-1/(2\delta^{h/4}n^{1-1/(h-1)})$," respectively, in the statement of Lemma 10.

  (9) Replace all occurrences of "$\delta n^{1/(h-1)}$," "$2\delta^{h-1}n$" and "$\lfloor\delta n\rfloor$" by "$n^{1/(h-1)}/\delta^{h/4}$," "$2{\sqrt{\delta}}^{h-1}n$" and "$\lfloor\lambda n\rfloor$," respectively, in the proof of Lemma 10.

  (10) Replace all occurrences of "$\delta n^{1/(h-1)}$," "$(h/\delta)\cdot o(n)$" and "Lemma 11" by "$\sqrt{\delta}\,n^{1/(h-1)}$," "$(1/\sqrt{\delta})\cdot O(q(n)/n^{1/(h-1)})$" and "Lemma 40," respectively, in Lemma 12 and its proof.

  (11) That $\hat{\alpha}$ is well-defined in equation (19) follows from $|S|\geq 2$, which holds for all sufficiently large $n$ by item (1) and $h\geq 100$.

  (12) Replace all occurrences of "$\delta$" by "$1/\delta^{h/4}$" in Lemma 13 and its proof.

  (13) Replace "$\delta$" by "$\sqrt{\delta}$" in equation (26).

  (14) Replace "$\delta^{h-1}$" by "$\delta^{h/4-1}$" in the statement of Lemma 15.

  (15) Replace all occurrences of "$\delta$" by "$\sqrt{\delta}$" and "$1/\delta^{h/4}$," respectively, in the first and the second paragraphs of the proof of Lemma 15.

  (16) Replace "$1-2\delta^{h-1}-(h/\delta)\cdot o(1)-\delta$" by "$1-2\delta^{h/4-1}-(1/\sqrt{\delta})\cdot O(q(n)/n^{1+1/(h-1)})-\lambda$" in the statement of Lemma 16.

  (17) Replace all occurrences of "$(h/\delta)\cdot o(n)$," "$\lfloor\delta n\rfloor$" and "$\delta^{h-1}$" by "$(1/\sqrt{\delta})\cdot O(q(n)/n^{1/(h-1)})$," "$\lfloor\lambda n\rfloor$" and "$\delta^{h/4-1}$," respectively, in the proof of Lemma 16.

  (18) Replace "$\delta^{h-1}$," "$(h^2/\delta)\cdot o(1)$" and "$h\delta$" by "$\delta^{h/4-1}$," "$(h/\sqrt{\delta})\cdot O(q(n)/n^{1+1/(h-1)})$" and "$h\lambda$," respectively, in the statement of Lemma 17.

  (19) Replace "$\delta$" by "$1/\delta^{h/4}$" in the statement of Lemma 23.

  (20) Replace all occurrences of "$\delta$" by "$\sqrt{\delta}$" in the proof of Lemma 24.

  (21) Replace the two occurrences of "$\delta$" by "$1/\delta^{h/4}$" in the statement of Lemma 27.

  (22) Replace "$\delta$" by "$1/\delta^{h/4}$" in equation (3.3).

  (23) Replace "$\delta$" by "$1/\delta^{h/4}$" in the statement of Lemma 32.

  (24) Replace "$\delta^{h-1}$," "$(h^2/\delta)\cdot o(1)$" and "$h\delta$" by "$\delta^{h/4-1}$," "$(h/\sqrt{\delta})\cdot O(q(n)/n^{1+1/(h-1)})$" and "$h\lambda$," respectively, in inequality (48).

  (25) Replace "$h-2h\delta^{h-1}-o(1)-\delta$" by "$h-2h{\sqrt{\delta}}^{h-1}-o(1)-\lambda/2-1/(2\delta^{h/4}n^{1-1/(h-1)})$" in the right-hand side of inequality (49).

  (26) Replace the numerator and the denominator on the right-hand side of inequality (50) by "$h-2h{\sqrt{\delta}}^{h-1}-o(1)-\lambda/2-1/(2\delta^{h/4}n^{1-1/(h-1)})$" and "$1/2+2h\delta^{h/4-1}+(h/\sqrt{\delta})\cdot O(q(n)/n^{1+1/(h-1)})+h\lambda$," respectively.

  (27) Verify that the right-hand side of inequality (50) is $2h-o(1)$. To see this, use item (1) and $100\leq h=o(n^{1/(h-1)})$ to verify that $\delta=o(1)$, $\max_{x\geq 1}\,x\cdot\delta^{x/8}=O(\delta)=o(1)$ (which requires elementary calculus and reveals that $h\sqrt{\delta}^{h-1}=o(1)$, $h\delta^{h/4-1}=o(1)$ and $h\lambda=h\delta^{h/8}=o(1)$), $\lambda=o(1)$, $\delta^{h/4}\geq 1/n^{h/(4(h-1))}$, $\delta^{h/4}\cdot n^{1-1/(h-1)}=n^{\Omega(1)}$, $\sqrt{\delta}\geq\sqrt{h\cdot q(n)/n^{1+1/(h-1)}}$ and $\sqrt{h\cdot q(n)/n^{1+1/(h-1)}}=o(1)$.

  (28) Replace all occurrences of "$\delta$" by "$\sqrt{\delta}$" in equation (51) as well as in the proofs of Lemmas 36 and 38.

References

  • [1] C.-L. Chang. A deterministic sublinear-time nonadaptive algorithm for metric $1$-median selection. To appear in Theoretical Computer Science.
  • [2] C.-L. Chang. Some results on approximate $1$-median selection in metric spaces. Theoretical Computer Science, 426:1–12, 2012.
  • [3] C.-L. Chang. Deterministic sublinear-time approximations for metric $1$-median selection. Information Processing Letters, 113(8):288–292, 2013.
  • [4] C.-L. Chang. A lower bound for metric $1$-median selection. Technical Report arXiv: 1401.2195, 2014.
  • [5] S. Guha, A. Meyerson, N. Mishra, R. Motwani, and L. O'Callaghan. Clustering data streams: Theory and practice. IEEE Transactions on Knowledge and Data Engineering, 15(3):515–528, 2003.
  • [6] P. Indyk. Sublinear time algorithms for metric space problems. In Proceedings of the 31st Annual ACM Symposium on Theory of Computing, pages 428–434, 1999.
  • [7] P. Indyk. High-Dimensional Computational Geometry. PhD thesis, Stanford University, 2000.
  • [8] A. Kumar, Y. Sabharwal, and S. Sen. Linear-time approximation schemes for clustering problems in any dimensions. Journal of the ACM, 57(2):5, 2010.
  • [9] R. R. Mettu and C. G. Plaxton. Optimal time bounds for approximate clustering. Machine Learning, 56(1–3):35–60, 2004.
  • [10] W. Rudin. Principles of Mathematical Analysis. McGraw-Hill, 3rd edition, 1976.
  • [11] B.-Y. Wu. On approximating metric $1$-median in sublinear time. Information Processing Letters, 114(4):163–166, 2014.