Deterministic metric $1$-median selection with very few queries

(Part of this paper appears in Proceedings of the 27th International Computing and Combinatorics Conference (COCOON 2021).)

Ching-Lueh Chang (Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan. clchang@saturn.yzu.edu.tw)
Abstract

Given an $n$-point metric space $(M,d)$, metric $1$-median asks for a point $p\in M$ minimizing $\sum_{x\in M}\,d(p,x)$. We show that for each computable function $f\colon\mathbb{Z}^{+}\to\mathbb{Z}^{+}$ satisfying $f(n)=\omega(1)$, metric $1$-median has a deterministic, $o(n)$-query, $o(f(n)\cdot\log n)$-approximation and nonadaptive algorithm. Previously, no deterministic $o(n)$-query $o(n)$-approximation algorithms were known for metric $1$-median. On the negative side, we prove that each deterministic $O(n)$-query algorithm for metric $1$-median is not $(\delta\log n)$-approximate for a sufficiently small constant $\delta>0$. We also refute the existence of deterministic $o(n)$-query $O(\log n)$-approximation algorithms.

Keywords: metric space; 1-median; median selection; query complexity; sublinear algorithm; sublinear computation

1 Introduction

An $n$-point metric space $(M,d)$ is a size-$n$ set $M$ endowed with a distance function $d\colon M\times M\to[0,\infty)$ such that

  • $d(x,y)=0$ if and only if $x=y$,

  • $d(x,y)=d(y,x)$, and

  • $d(x,y)+d(y,z)\geq d(x,z)$ (triangle inequality)

for all $x$, $y$, $z\in M$ [16]. Metric $1$-median asks for a point $p\in M$ minimizing $\sum_{x\in M}\,d(p,x)$. Clearly, it has a brute-force $O(n^{2})$-time algorithm. Furthermore, it generalizes the classical median selection [6] and can be generalized further to metric $k$-median clustering. In social network analysis, metric $1$-median asks for an actor with the maximum closeness centrality [17]. For all $\beta\geq 1$, a $\beta$-approximate $1$-median of $(M,d)$ is a point $p\in M$ satisfying $\sum_{y\in M}\,d(p,y)\leq\beta\cdot\min_{q\in M}\sum_{y\in M}\,d(q,y)$. By convention, a $\beta$-approximation algorithm for metric $1$-median must output a $\beta$-approximate $1$-median of $(M,d)$. A query inspects $d(x,y)$ for some $x$, $y\in M$. An algorithm is nonadaptive if its $i$th query $(x_{i},y_{i})\in M^{2}$ is independent of the answers to the first $i-1$ queries, for all $i>1$. Write $d_{G}$ for the distance function induced by an undirected graph $G$.
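The definitions above can be made concrete with a small brute-force sketch. It is only an illustration of the $O(n^{2})$-time baseline and of the $\beta$-approximation condition; the function names and the four-point example on the real line are assumptions introduced here, not part of the paper.

```python
def total_distance(points, d, p):
    """Sum of distances from p to every point of the space."""
    return sum(d(p, x) for x in points)

def one_median(points, d):
    """Brute-force 1-median: inspects all n^2 distances."""
    return min(points, key=lambda p: total_distance(points, d, p))

def is_beta_approximate(points, d, p, beta):
    """Check the beta-approximation condition for the point p."""
    opt = total_distance(points, d, one_median(points, d))
    return total_distance(points, d, p) <= beta * opt

# Hypothetical example: four points on the real line.
points = [0, 1, 2, 10]
d = lambda x, y: abs(x - y)
median = one_median(points, d)
```

In this example the optimal total distance is 11, and every point passes the check with $\beta=n-1=3$, matching the folklore bound that every point is an $(n-1)$-approximate $1$-median.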

Indyk [11, 12] gives a Monte Carlo $O(n/\epsilon^{2})$-time $(1+\epsilon)$-approximation algorithm for metric $1$-median, where $\epsilon>0$. His time complexity is optimal w.r.t. $n$. When restricted to $\mathbb{R}^{D}$, metric $1$-median has a Monte Carlo $O(D\cdot\exp(\text{poly}(1/\epsilon)))$-time $(1+\epsilon)$-approximation algorithm [14]. The more general $k$-median clustering in metric spaces has streaming approximation algorithms [10], requires $\Omega(nk)$ time for $O(1)$-approximations [15] and is inapproximable to within $(1+2/e-\Omega(1))$ unless $\text{NP}\subseteq\text{DTIME}(n^{O(\log\log n)})$ [13]. For $\mathbb{R}^{D}$ and graph metrics, a well-studied problem is to find the average distance from a query point to a finite set of points [1, 8, 9].

Deterministic $\omega(n)$-query computation is almost completely understood for metric $1$-median: For all constants $\epsilon\in(0,1)$, the best approximation ratios achievable by deterministic $o(n^{2})$-query and $O(n^{1+\epsilon})$-query algorithms are $4$ and $2\lceil 1/\epsilon\rceil$, respectively [2, 4, 18]. The same holds with “query” replaced by “time” and regardless of whether the algorithms can be adaptive [2, 4]. In contrast, we study the largely unknown deterministic $O(n)$- or $o(n)$-query computation. An $o(n)$-query algorithm enjoys the strength of ignoring a $1-o(1)$ fraction of points.

It is folklore that every point is an $(n-1)$-approximate $1$-median. Surprisingly, this is the current best upper bound for deterministic $o(n)$-query algorithms. In particular, no deterministic $o(n)$-query $o(n)$-approximation algorithms are known for metric $1$-median. Instead, we give a deterministic, $o(n)$-query, $o(f(n)\cdot\log n)$-approximation and nonadaptive algorithm for each computable function $f\colon\mathbb{Z}^{+}\to\mathbb{Z}^{+}$ satisfying $f(n)=\omega(1)$. So, e.g., metric $1$-median has a deterministic $o(n)$-query $o(\alpha(n)\cdot\log n)$-approximation algorithm for the very slowly growing inverse Ackermann function $\alpha(\cdot)$. Our main technical discovery is that a $\beta$-approximate $1$-median of $(S,d|_{S\times S})$ (where $d|_{S\times S}$ denotes $d$ restricted to $S\times S$) is an $O(\beta n/|S|)$-approximate $1$-median of $(M,d)$, for all $\emptyset\subsetneq S\subseteq M$ and $\beta\geq 1$. When $S\subseteq M$ is a uniformly random set of a sufficiently large size, an approximate solution to metric $k$-median clustering for $(S,d|_{S\times S})$ is a good one for $(M,d)$ with high probability [7]. But our discovery is for any $S$ and is new.

Chang [3] shows that metric $1$-median has a deterministic, $O(\exp(O(1/\epsilon))\cdot n\log n)$-time, $O(\exp(O(1/\epsilon))\cdot n)$-query, $(\epsilon\log n)$-approximation and nonadaptive algorithm, for all $\epsilon>0$. So deterministic $O(n)$-query algorithms can be $(\epsilon\log n)$-approximate for each $\epsilon>0$. Currently, the best lower bound against deterministic $O(n)$-query algorithms is that they cannot be $O(1)$-approximate [4]. So there is a huge gap between Chang’s [3] approximation ratio of $\epsilon\log n$ and the current best lower bound. We close the gap by showing each deterministic $O(n)$-query algorithm for metric $1$-median to be not $(\delta\log n)$-approximate for a sufficiently small constant $\delta>0$ (depending on the algorithm). Our approach, sketched below, adversarially answers the queries of a deterministic $O(n)$-query algorithm Alg:

  (I) Start with the complete graph on $M$.

  (II) Mark all edges in an $O(1)$-regular expander graph as permanent.

  (III) Repeat the following:

    (1) Upon receiving a query $(a,b)\in M^{2}$, find a shortest $a$-$b$ path $P$ and answer by the length of $P$.

    (2) Mark all edges of $P$ as permanent.

    (3) For each vertex $v$ incident to too many permanent edges, remove all non-permanent edges incident to $v$.

Intuitively, item (III3) keeps degrees small, thus forcing the output of Alg to have a large average distance to other points. Because item (III1) answers a query by the length of $P$, items (III2)–(III3) must preserve all edges of $P$ (by marking them as permanent and not removing them) for consistency in answering future queries. Items (I) and (III1)–(III3) follow Chang’s [4] paradigm. To prove a lower bound against Alg, we shall make the output of Alg a lot worse than a $1$-median, presumably by identifying or planting a vertex with a sufficiently small average distance to other points. However, Chang fails in this respect. We overcome his problem by item (II), which allows a vertex to have an $O(1)$ average distance to other vertices.

An extension of our lower bound forbids each deterministic $o(n)$-query algorithm for metric $1$-median to be $o(f(n)\cdot\log n)$-approximate for some computable function $f\colon\mathbb{Z}^{+}\to\mathbb{Z}^{+}$ satisfying $f(n)=\omega(1)$. In particular, deterministic $o(n)$-query $O(\log n)$-approximation algorithms do not exist. Previously, the best lower bound against deterministic $o(n)$-query algorithms $A$ is folklore and forbids $A$ to be $h_{A}(n)$-approximate for some $h_{A}(n)=\omega(1)$. (For a sketch of proof, answer all queries of $A$ by $1$ and put all points not involved in the queries to be extremely close to one another but extremely far away from $A$’s output and from the points involved in the queries.) So previous works do not yet refute the existence of deterministic $o(n)$-query $O(\alpha(n))$-approximation algorithms, where $\alpha(\cdot)$ is the very slowly growing inverse Ackermann function.

Chang’s [5] adversarial method shows that metric $1$-median has no deterministic $O(n)$-query $o(\log n)$-approximation algorithms that involve each point in $O(1)$ queries to $d$. But his adversary is rather naïve and does not seem to yield any unconditional lower bound such as ours.

2 Upper bound

Take an $n$-point metric space $(M,d)$ and $\emptyset\subsetneq S\subseteq M$. Define

\begin{align*}
x^{*} &\equiv \mathop{\mathrm{argmin}}_{x\in M}\,\sum_{y\in M}\,d(x,y),\\
x^{*}_{S} &\equiv \mathop{\mathrm{argmin}}_{x\in S}\,\sum_{y\in S}\,d(x,y)
\end{align*}

to be a $1$-median of $(M,d)$ and $(S,d|_{S\times S})$, respectively, breaking ties arbitrarily. Furthermore, pick $\boldsymbol{u}$ and $\boldsymbol{v}$ independently and uniformly at random from $S$. So

\[
\bar{r}_{S}\equiv\mathop{E}\left[\,d\left(\boldsymbol{u},\boldsymbol{v}\right)\,\right]
\]

is the average distance in $(S,d|_{S\times S})$.

Lemma 1.
\[
\sum_{y\in S}\,d\left(x^{*},y\right)\geq\frac{|S|\,\bar{r}_{S}}{2}.
\]
Proof.

We have

\begin{align*}
\sum_{y\in S}\,d\left(x^{*},y\right) &= |S|\cdot\mathop{E}\left[\,d\left(x^{*},\boldsymbol{u}\right)\,\right]\\
&= \frac{1}{2}\cdot\left(|S|\cdot\mathop{E}\left[\,d\left(x^{*},\boldsymbol{u}\right)\,\right]+|S|\cdot\mathop{E}\left[\,d\left(x^{*},\boldsymbol{v}\right)\,\right]\right)\\
&\geq \frac{1}{2}\cdot|S|\cdot\mathop{E}\left[\,d\left(\boldsymbol{u},\boldsymbol{v}\right)\,\right],
\end{align*}

where the second equality holds because $\boldsymbol{u}$ and $\boldsymbol{v}$ are identically distributed, and the inequality follows from the triangle inequality $d(x^{*},\boldsymbol{u})+d(x^{*},\boldsymbol{v})\geq d(\boldsymbol{u},\boldsymbol{v})$.

∎

Lemma 2.
\[
\sum_{y\in S}\,d\left(x^{*}_{S},y\right)\leq|S|\,\bar{r}_{S}.
\]
Proof.

By the optimality of $x^{*}_{S}$,

\[
\sum_{y\in S}\,d\left(x^{*}_{S},y\right)\leq\mathop{E}\left[\,\sum_{y\in S}\,d\left(\boldsymbol{u},y\right)\,\right].
\]

Clearly,

\[
\mathop{E}\left[\,\sum_{y\in S}\,d\left(\boldsymbol{u},y\right)\,\right]=|S|\cdot\mathop{E}\left[\,d\left(\boldsymbol{u},\boldsymbol{v}\right)\,\right].
\]

∎
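As a sanity check, Lemmas 1 and 2 can be tested numerically. The sketch below assumes a hypothetical instance (60 pseudo-random points in the unit square under the Euclidean metric, with $S$ taken as the first 15 points); none of these choices come from the paper, and a small tolerance absorbs floating-point rounding.

```python
import math
import random

def dist(p, q):
    """Euclidean distance, used here as a stand-in metric d."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def cost(center, pts):
    """Sum of distances from center to all points in pts."""
    return sum(dist(center, y) for y in pts)

random.seed(0)
M = [(random.random(), random.random()) for _ in range(60)]
S = M[:15]                                      # an arbitrary nonempty subset

x_star = min(M, key=lambda x: cost(x, M))       # 1-median of (M, d)
x_star_S = min(S, key=lambda x: cost(x, S))     # 1-median of (S, d restricted to S)
# Average distance over ordered pairs (u, v) drawn independently from S.
r_bar_S = sum(dist(u, v) for u in S for v in S) / len(S) ** 2

lemma1_holds = cost(x_star, S) >= len(S) * r_bar_S / 2 - 1e-12
lemma2_holds = cost(x_star_S, S) <= len(S) * r_bar_S + 1e-12
```

Note that $\bar{r}_{S}$ averages over all ordered pairs, including pairs with $\boldsymbol{u}=\boldsymbol{v}$, because $\boldsymbol{u}$ and $\boldsymbol{v}$ are drawn independently.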

For all $x'_{S}\in S$,

\[
\sum_{y\in M}\,d\left(x'_{S},y\right)\leq\sum_{y\in M}\,\left(d\left(x'_{S},x^{*}\right)+d\left(x^{*},y\right)\right)=n\cdot d\left(x'_{S},x^{*}\right)+\sum_{y\in M}\,d\left(x^{*},y\right). \tag{1}
\]

The next two lemmas constitute our main discovery.

Lemma 3.

For all $x'_{S}\in S$ and $\beta\geq 1$ satisfying $\sum_{y\in S}\,d(x'_{S},y)\leq\beta\cdot\sum_{y\in S}\,d(x^{*}_{S},y)$ and $d(x'_{S},x^{*})\leq 2\beta\bar{r}_{S}$, $x'_{S}$ is an $O(\beta n/|S|)$-approximate $1$-median of $(M,d)$.

Proof.

By Lemma 1,

\[
n\cdot d\left(x'_{S},x^{*}\right)\leq n\cdot d\left(x'_{S},x^{*}\right)\cdot\frac{2}{|S|\,\bar{r}_{S}}\cdot\sum_{y\in S}\,d\left(x^{*},y\right). \tag{2}
\]

As $d(x'_{S},x^{*})\leq 2\beta\bar{r}_{S}$ and $S\subseteq M$,

\[
\sum_{y\in M}\,d\left(x'_{S},y\right)\leq O\left(\frac{\beta n}{|S|}\right)\cdot\sum_{y\in M}\,d\left(x^{*},y\right)
\]

by equations (1)–(2). ∎

Lemma 4.

For all $x'_{S}\in S$ and $\beta\geq 1$ satisfying $\sum_{y\in S}\,d(x'_{S},y)\leq\beta\cdot\sum_{y\in S}\,d(x^{*}_{S},y)$ and $d(x'_{S},x^{*})>2\beta\bar{r}_{S}$, $x'_{S}$ is an $O(n/|S|)$-approximate $1$-median of $(M,d)$.

Proof.

By the triangle inequality,

\[
\sum_{y\in S}\,d\left(x^{*},y\right)\geq\sum_{y\in S}\,\left(d\left(x'_{S},x^{*}\right)-d\left(x'_{S},y\right)\right)=|S|\cdot d\left(x'_{S},x^{*}\right)-\sum_{y\in S}\,d\left(x'_{S},y\right). \tag{3}
\]

Furthermore,

\[
\sum_{y\in S}\,d\left(x'_{S},y\right)\leq\beta\cdot\sum_{y\in S}\,d\left(x^{*}_{S},y\right)\stackrel{\text{Lemma 2}}{\leq}\beta\,|S|\,\bar{r}_{S}. \tag{4}
\]

As $d(x'_{S},x^{*})>2\beta\bar{r}_{S}$,

\[
\sum_{y\in S}\,d\left(x^{*},y\right)\stackrel{\text{(3)–(4)}}{\geq}|S|\cdot d\left(x'_{S},x^{*}\right)-\beta\,|S|\,\bar{r}_{S}>\frac{|S|}{2}\cdot d\left(x'_{S},x^{*}\right).
\]

So

\[
n\cdot d\left(x'_{S},x^{*}\right)=\frac{2n}{|S|}\cdot\frac{|S|}{2}\cdot d\left(x'_{S},x^{*}\right)<\frac{2n}{|S|}\cdot\sum_{y\in S}\,d\left(x^{*},y\right).
\]

This and equation (1) imply

\[
\sum_{y\in M}\,d\left(x'_{S},y\right)\leq O\left(\frac{n}{|S|}\right)\cdot\sum_{y\in M}\,d\left(x^{*},y\right).
\]

∎

Lemmas 3–4 imply the following.

Lemma 5.

For all $\beta\geq 1$, every $\beta$-approximate $1$-median of $(S,d|_{S\times S})$ is an $O(\beta n/|S|)$-approximate $1$-median of $(M,d)$.

The following theorem is due to Chang [3].

Theorem 6 ([3]).

For all constants $\epsilon>0$, metric $1$-median has a deterministic, $O(\exp(O(1/\epsilon))\cdot n\log n)$-time, $(\exp(O(1/\epsilon))\cdot n)$-query, $O(\epsilon\cdot\log n)$-approximation and nonadaptive algorithm.
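Lemma 5 can also be observed empirically. The sketch below is an illustration under stated assumptions: the instance (80 pseudo-random points in the plane, prefixes of the point list as $S$) is hypothetical, and the explicit bound $4\beta n/|S|+1$ is one reading of the constants in the proofs of Lemmas 3–4 rather than a constant stated in the paper. It takes $\beta=1$ by using an exact $1$-median of $S$.

```python
import math
import random

def dist(p, q):
    """Euclidean distance, used here as a stand-in metric d."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def cost(center, pts):
    """Sum of distances from center to all points in pts."""
    return sum(dist(center, y) for y in pts)

def ratio_and_bound(M, S, beta=1.0):
    """Return (observed approximation ratio, 4*beta*n/|S| + 1)."""
    x_star = min(M, key=lambda x: cost(x, M))   # 1-median of the whole space
    x_S = min(S, key=lambda x: cost(x, S))      # exact 1-median of S (beta = 1)
    return cost(x_S, M) / cost(x_star, M), 4 * beta * len(M) / len(S) + 1

random.seed(1)
M = [(random.random(), random.random()) for _ in range(80)]
# Try subsets of several sizes; the last one is S = M.
results = [ratio_and_bound(M, M[:size]) for size in (1, 5, 20, 80)]
```

The observed ratios are typically far below the bound on such benign inputs; the lemma is a worst-case guarantee over all metrics and all choices of $S$.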

Below is our main theorem.

Theorem 7.

For each computable function $f\colon\mathbb{Z}^{+}\to\mathbb{Z}^{+}$ satisfying $f(n)=\omega(1)$, metric $1$-median has a deterministic, $o(n)$-query, $o(f(n)\cdot\log n)$-approximation and nonadaptive algorithm.

Proof.

Take any $S\subseteq M$ of size $\Theta(n/\sqrt{f(n)})$. Applying Theorem 6 to $(S,d|_{S\times S})$, an $O(\log|S|)$-approximate $1$-median $x'_{S}$ of $(S,d|_{S\times S})$ can be found deterministically and nonadaptively with $O(|S|)$ queries. By Lemma 5 (with $\beta=O(\log|S|)$), $x'_{S}$ is an $O((\log|S|)\cdot n/|S|)$-approximate $1$-median of $(M,d)$. As $|S|=\Theta(n/\sqrt{f(n)})$ and $f(n)=\omega(1)$, the approximation ratio is $O(\sqrt{f(n)}\cdot\log n)=o(f(n)\cdot\log n)$ and the number of queries is $O(|S|)=o(n)$. ∎
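The structure of this reduction can be sketched in code. This is a minimal illustration under stated assumptions, not the algorithm of Theorem 6: the solver `brute_force_median` is a stand-in that spends $|S|^{2}$ queries, so the sketch shows only how $S$ is fixed in advance and how the solver is applied to $(S,d|_{S\times S})$, not the $o(n)$ query bound. The function names, the choice $f(n)=\lfloor\log_{2}n\rfloor$ and the 64-point example on a line are all hypothetical.

```python
import math

def subset_size(n, f):
    """|S| = ceil(n / sqrt(f(n))), clamped to the range [1, n]."""
    return max(1, min(n, math.ceil(n / math.sqrt(f(n)))))

def brute_force_median(S, d):
    """Stand-in solver: exact 1-median of S with |S|^2 queries.
    This is NOT the O(|S|)-query algorithm of Theorem 6."""
    return min(S, key=lambda x: sum(d(x, y) for y in S))

def median_via_subset(points, d, f, solver=brute_force_median):
    """Fix S before inspecting any distance, then solve on S only."""
    S = points[:subset_size(len(points), f)]
    return solver(S, d), S

# Hypothetical example: 64 points on a line, f(n) = floor(log2 n).
f = lambda n: max(1, int(math.log2(n)))
points = list(range(64))
d = lambda x, y: abs(x - y)
center, S = median_via_subset(points, d, f)
```

Passing the solver as a parameter keeps the reduction separate from whichever approximation algorithm is plugged in; substituting a solver with the guarantees of Theorem 6 is what yields the bounds of Theorem 7.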

Taking a very slowly growing $f(\cdot)$ (e.g., the iterated logarithm or the inverse Ackermann function), Theorem 7 allows deterministic $o(n)$-query algorithms to be very close to being $O(\log n)$-approximate.

3 Lower bound

Fix any deterministic $q$-query algorithm Alg, where $q=q(n)=O(n)$. Then take a constant $C>2d+4q/n$, where $d=O(1)$ is such that $d$-regular expander graphs exist. By padding, assume the number of Alg’s queries to be exactly $q$. Adversary Adv in Fig. 1 answers the queries of Alg. All graphs are assumed to be undirected.

1:  Let $G^{(0)}$ be the complete graph on $M$;
2:  Pick a $d$-regular expander graph $G^{\text{exp}}$ on $M$, where $d=O(1)$;
3:  Mark all edges of $G^{\text{exp}}$ as permanent;
4:  for $i=1$ up to $q$ do
5:      Receive the $i$th query, denoted by $(a_{i},b_{i})\in M^{2}$;
6:      Pick a shortest $a_{i}$-$b_{i}$ path $P_{i}$ in $G^{(i-1)}$;
7:      Answer the $i$th query by the length of $P_{i}$;
8:      Mark all edges of $P_{i}$ as permanent;
9:      $G^{(i)}\leftarrow G^{(i-1)}$;
10:     for each $v\in M$ do
11:         if $v$ is incident to more than $C$ permanent edges then
12:             Remove from $G^{(i)}$ all non-permanent edges incident to $v$;
13:         end if
14:     end for
15: end for
Figure 1: Adversary Adv for answering the queries of Alg
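The adversary of Fig. 1 can be simulated on small instances. The following is a sketch under stated assumptions rather than a faithful implementation: the union of two random Hamiltonian cycles is a stand-in for the expander $G^{\text{exp}}$ (it is connected and has degree at most 4, but its expansion is not verified here), the queries are pseudo-random pairs instead of those of an actual algorithm Alg, and all names are hypothetical. Comments refer to the line numbers of Fig. 1.

```python
import random
from collections import deque

def bfs_path(adj, a, b):
    """Return a shortest a-b path as a vertex list, or None if disconnected."""
    parent = {a: None}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        if u == b:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return None

def run_adversary(n, queries, C, seed=0):
    rng = random.Random(seed)
    adj = {v: set(range(n)) - {v} for v in range(n)}    # line 1: complete graph
    perm = {v: set() for v in range(n)}                 # permanent neighbors of v

    def mark(u, w):
        perm[u].add(w)
        perm[w].add(u)

    for _ in range(2):                                  # lines 2-3: expander stand-in
        order = list(range(n))
        rng.shuffle(order)
        for i in range(n):
            mark(order[i], order[(i + 1) % n])
    answers = []
    for a, b in queries:                                # lines 4-15
        path = bfs_path(adj, a, b)                      # line 6
        answers.append(len(path) - 1)                   # line 7
        for u, w in zip(path, path[1:]):                # line 8
            mark(u, w)
        for v in range(n):                              # lines 10-14
            if len(perm[v]) > C:
                for w in adj[v] - perm[v]:              # line 12
                    adj[w].discard(v)
                adj[v] = set(perm[v])
    return answers, adj

rng = random.Random(1)
n, q = 30, 60
queries = [(rng.randrange(n), rng.randrange(n)) for _ in range(q)]
C = 2 * 4 + 4 * q // n + 1                              # C > 2d + 4q/n with d = 4
answers, final_graph = run_adversary(n, queries, C)
consistent = all(len(bfs_path(final_graph, a, b)) - 1 == ans
                 for (a, b), ans in zip(queries, answers))
```

The flag `consistent` checks that every answer equals the corresponding distance in the final graph, which is the consistency property the adversary needs so that its answers come from a single metric.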

As a remark, whenever an edge of a graph is marked as permanent, that edge is considered to be permanent in all graphs. For example, an edge of $G^{\text{exp}}$ marked as permanent in line 3 of Adv is considered to be permanent in lines 11–13, even though the latter processes $G^{(i)}$ rather than $G^{\text{exp}}$. Similarly, although an edge marked as permanent by line 8 comes from $G^{(i-1)}$ by line 6, it is considered to be permanent in lines 11–13 as well.

Lemma 8.

For all $0\leq i\leq q$, $G^{\text{exp}}$ is a subgraph of $G^{(i)}$.

Proof.

By line 1, $G^{\text{exp}}$ is a subgraph of $G^{(0)}$. Assume as induction hypothesis that $G^{\text{exp}}$ is a subgraph of $G^{(i-1)}$. By line 3 and the induction hypothesis, all edges of $G^{\text{exp}}$ are permanent edges of $G^{(i-1)}$. By lines 9–14, all permanent edges of $G^{(i-1)}$ are in $G^{(i)}$. ∎

Lemma 9 (Implicit in [4]).

For all $1\leq i\leq q$, Adv’s answer to the $i$th query of Alg equals $d_{G^{(q)}}(a_{i},b_{i})$.

Proof (included for completeness).

Let $\text{ans}_{i}$ be Adv’s answer to the $i$th query. By lines 6–7, $\text{ans}_{i}=d_{G^{(i-1)}}(a_{i},b_{i})$. (As $G^{\text{exp}}$ is an expander, $d_{G^{(i-1)}}(a_{i},b_{i})<\infty$ by Lemma 8.) By lines 9–14, $G^{(q)}$ is a subgraph of $G^{(i-1)}$, implying $d_{G^{(i-1)}}(a_{i},b_{i})\leq d_{G^{(q)}}(a_{i},b_{i})$. In summary, $\text{ans}_{i}\leq d_{G^{(q)}}(a_{i},b_{i})$.

By line 7, $\text{ans}_{i}$ is the length of $P_{i}$. As $P_{i}$ is in $G^{(i-1)}$ by line 6, all edges of $P_{i}$ are permanent edges of $G^{(i)}$ by lines 8–14. So by lines 9–14, $P_{i}$ exists in $G^{(j)}$ for all $j\geq i$. (Note that once an edge is marked as permanent, it cannot be removed by line 12.) Therefore, the length of $P_{i}$ is at least $d_{G^{(q)}}(a_{i},b_{i})$ (in fact, at least $d_{G^{(j)}}(a_{i},b_{i})$ for all $j\geq i$). In summary, $\text{ans}_{i}\geq d_{G^{(q)}}(a_{i},b_{i})$. ∎

Lemma 10 (Implicit in [4]).

For each $v\in M$, each run of line 8 marks as permanent at most two edges incident to $v$.

Proof (included for completeness).

In line 6, $P_{i}$ has at most two edges incident to $v$. ∎

Let $E^{\text{perm}}$ be the set of edges ever marked as permanent, and $G^{\text{perm}}=(M,E^{\text{perm}})$. Denote by $z^{*}\in M$ the output of Alg with all queries answered by Adv. By padding dummy queries, assume without loss of generality that Alg queries for the distance between $z^{*}$ and each point in $M$.

Lemma 11 (Implicit in [4]).
\[
\sum_{x\in M}\,d_{G^{(q)}}(z^{*},x)=\Omega(n\log n).
\]
Proof (included for completeness).

By lines 7–8, Adv answers each query of Alg by the length of a path whose edges are all in $E^{\text{perm}}$. So for all $i\geq 1$, the answer to the $i$th query is at least $d_{G^{\text{perm}}}(a_{i},b_{i})$. Therefore, $d_{G^{(q)}}(a_{i},b_{i})\geq d_{G^{\text{perm}}}(a_{i},b_{i})$ by Lemma 9, where $i\geq 1$. This and the assumption that Alg queries for all distances between $z^{*}$ and the points in $M$ give

\[
\sum_{x\in M}\,d_{G^{(q)}}(z^{*},x)\geq\sum_{x\in M}\,d_{G^{\text{perm}}}(z^{*},x). \tag{5}
\]

Consider the instant $t$ when the number of permanent edges incident to a vertex $v\in M$ exceeds $C$. By Lemma 10, $v$ is incident to at most $C+2$ permanent edges at time $t$. Then lines 9–14 remove from $G^{(i)}$ all non-permanent edges incident to $v$ (and will not put them back to $G^{(j)}$ for any $j>i$). So no more edges incident to $v$ will be marked as permanent after time $t$. In summary, $v$ has degree at most $C+2$ in $G^{\text{perm}}$. In the above argument, $v$ can be any vertex whose number of incident permanent edges ever exceeds $C$. So $G^{\text{perm}}$ has maximum degree at most $C+2$. (Clearly, a vertex whose number of incident permanent edges never exceeds $C$ will have degree $\leq C$ in $G^{\text{perm}}$.) So for all $k\geq 1$, at most $\sum_{h=0}^{k}\,(C+2)^{h}$ vertices in $G^{\text{perm}}$ can be within distance $k$ (inclusive) from $z^{*}$. Taking $k=\epsilon\log n$ for a small constant $\epsilon>0$ depending on $C$, $\sum_{h=0}^{k}\,(C+2)^{h}\leq\sqrt{n}$. I.e., at least $n-\sqrt{n}$ vertices are of distance greater than $\epsilon\log n$ from $z^{*}$ in $G^{\text{perm}}$. So

\[
\sum_{x\in M}\,d_{G^{\text{perm}}}(z^{*},x)\geq\left(n-\sqrt{n}\right)\cdot\epsilon\log n.
\]

This and inequality (5) complete the proof. ∎
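The ball-growth counting in this proof can be worked through numerically. The sketch below is an illustration only: the function names and the sample values ($n=10^{6}$ and maximum degree $10$, i.e., $C+2$ with a hypothetical $C=8$) are assumptions made here. It computes the bound $\sum_{h=0}^{k}\,(C+2)^{h}$ and the largest radius $k$ for which that bound stays at most $\sqrt{n}$.

```python
def ball_bound(max_degree, k):
    """Upper bound sum_{h=0}^{k} max_degree**h on the number of vertices
    within distance k of a fixed vertex in a graph of that maximum degree."""
    return sum(max_degree ** h for h in range(k + 1))

def largest_safe_radius(n, max_degree):
    """Largest k >= 0 with ball_bound(max_degree, k) <= sqrt(n),
    tested in integer arithmetic as ball_bound**2 <= n."""
    k = 0
    while ball_bound(max_degree, k + 1) ** 2 <= n:
        k += 1
    return k

# Hypothetical values: n = 10**6 points, maximum degree C + 2 = 10.
n, max_degree = 10 ** 6, 10
k = largest_safe_radius(n, max_degree)
```

Since the bound grows geometrically in $k$, the safe radius grows logarithmically in $n$, which is the source of the $\epsilon\log n$ term in the lemma.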

Let $\text{Bad}\subseteq M$ be the set of vertices with degrees at least $C$ in $G^{\text{perm}}$.

Lemma 12 (Implicit in [4]).

For all distinct $y$, $z\in M\setminus\text{Bad}$, $d_{G^{(q)}}(y,z)=1$.

Proof (included for completeness).

By line 1, $(y,z)$ is an edge of $G^{(0)}$. As $y$, $z\notin\text{Bad}$, $y$ and $z$ are incident to fewer than $C$ edges ever marked as permanent. So lines 9–14 preserve the edge $(y,z)$ in $G^{(i)}$ for all $i\geq 1$. ∎

By convention, $d(x,S)\equiv\inf_{s\in S}\,d(x,s)$ for all $x\in M$ and $S\subseteq M$.

Corollary 13.

For all $y\in M\setminus\text{Bad}$,

\[
\sum_{x\in M}\,d_{G^{(q)}}(x,y)\leq\sum_{x\in M}\,\left(d_{G^{(q)}}(x,M\setminus\text{Bad})+1\right).
\]
Proof.

Assume $M\setminus\text{Bad}\neq\emptyset$ to avoid vacuous truth. For each $x\in M$, let $z_{x}\in M\setminus\text{Bad}$ satisfy

\[
d_{G^{(q)}}(x,M\setminus\text{Bad})=d_{G^{(q)}}(x,z_{x}).
\]

By Lemma 12, $d_{G^{(q)}}(y,z_{x})\leq 1$ for all $x\in M$. By the triangle inequality,

\[
d_{G^{(q)}}(x,y)\leq d_{G^{(q)}}(x,z_{x})+d_{G^{(q)}}(y,z_{x}),
\]

where $x\in M$. ∎

Lemma 14 (Implicit in [4]).

For all $1\leq i\leq q$ and when line 6 picks $P_{i}$, $P_{i}$ has at most one non-permanent edge.

Proof (included for completeness).

Write $P_{i}=(v_{1},v_{2},\ldots,v_{t})$. Assume for contradiction that $(v_{h},v_{h+1})$ and $(v_{k},v_{k+1})$ are both non-permanent when line 6 picks $P_{i}$ from $G^{(i-1)}$, for some $1\leq h<k<t$. By line 1, $G^{(0)}$ has the edge $(v_{h},v_{k+1})$. But by the optimality of $P_{i}$ in line 6, $G^{(i-1)}$ cannot have the edge $(v_{h},v_{k+1})$. So there exists $1\leq\ell\leq i-1$ such that line 12 runs with $v\in\{v_{h},v_{k+1}\}$ in the $\ell$th iteration of the loop in lines 4–15. (Let $\ell$ be the smallest index such that $G^{(\ell)}$ does not have $(v_{h},v_{k+1})$. Line 9 initializes $G^{(\ell)}$ to be $G^{(\ell-1)}$, which has $(v_{h},v_{k+1})$. So line 12 must remove $(v_{h},v_{k+1})$ from $G^{(\ell)}$. This happens only by running line 12 with $v\in\{v_{h},v_{k+1}\}$.) Being non-permanent when line 6 picks $P_{i}$ from $G^{(i-1)}$, $(v_{h},v_{h+1})$ and $(v_{k},v_{k+1})$ must have remained non-permanent throughout the first $i-1$ iterations (including the $\ell$th iteration) of the loop in lines 4–15 (because of the irreversibility of permanence). Therefore, when line 12 runs with $v\in\{v_{h},v_{k+1}\}$ in the $\ell$th iteration of the loop in lines 4–15, $(v_{h},v_{h+1})$ or $(v_{k},v_{k+1})$ must be removed from $G^{(\ell)}$. By symmetry, assume $G^{(\ell)}$ to not have $(v_{h},v_{h+1})$. By lines 9–14 and as $\ell\leq i-1$, $G^{(i-1)}$ cannot have $(v_{h},v_{h+1})$, either. As $P_{i}$ is picked from $G^{(i-1)}$ by line 6, $G^{(i-1)}$ must have $(v_{h},v_{h+1})$ (which is on $P_{i}$), a contradiction. ∎

Corollary 15 (Implicit in [4]).

Each run of line 8 increases the number of permanent edges by at most one.

Proof (included for completeness).

Immediate from Lemma 14. ∎

Lemma 16.

$|\text{Bad}|\leq n/2$.

Proof.

As $G^{\text{exp}}$ is $d$-regular by line 2, line 3 marks $dn/2$ edges as permanent by the handshaking lemma. By Corollary 15, at most $q$ edges are ever marked as permanent by line 8. To sum up, $G^{\text{perm}}$ has at most $dn/2+q$ edges. So by the handshaking lemma, the average degree in $G^{\text{perm}}$ is at most $d+2q/n$. This and Markov's inequality imply that at most $n/2$ vertices have degrees at least $2d+4q/n$ in $G^{\text{perm}}$. As $C>2d+4q/n$, at most $n/2$ vertices have degrees at least $C$ in $G^{\text{perm}}$. ∎
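The counting step in this proof can be checked numerically. The sketch below is only an illustration: the helper names `high_degree_vertices` and `markov_bound_holds` are hypothetical and do not come from the paper. It computes the average degree via the handshaking lemma and confirms that at most half of the vertices reach twice that average.

```python
from collections import Counter

def high_degree_vertices(n, edges, threshold):
    """Return the vertices of {0, ..., n-1} whose degree is at least `threshold`."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [v for v in range(n) if deg[v] >= threshold]

def markov_bound_holds(n, edges):
    """Check that at most n/2 vertices have degree >= twice the average degree."""
    avg = 2 * len(edges) / n  # handshaking lemma: the degrees sum to 2|E|
    if avg == 0:
        return True  # Markov's inequality needs a positive threshold
    return len(high_degree_vertices(n, edges, 2 * avg)) <= n / 2

# Example: a star on 6 vertices has average degree 10/6, and only the hub
# has degree at least twice that.
star = [(0, i) for i in range(1, 6)]
print(high_degree_vertices(6, star, 2 * 10 / 6), markov_bound_holds(6, star))
```

In the lemma, the threshold $2d+4q/n$ plays the role of twice the average degree bound $d+2q/n$.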

Lemma 17.

For all $y\in M\setminus\text{Bad}$, $\sum_{x\in M}\,d_{G^{(q)}}(x,y)=O(n)$.

Proof.

By Lemmas 16 and 25 (in Appendix A),

$$\sum_{x\in\text{Bad}}\,d_{G^{\text{exp}}}\left(x,M\setminus\text{Bad}\right)=O(n).$$

This and Lemma 8 give

$$\sum_{x\in\text{Bad}}\,d_{G^{(q)}}\left(x,M\setminus\text{Bad}\right)=O(n). \tag{6}$$

Clearly,

$$\sum_{x\in M\setminus\text{Bad}}\,d_{G^{(q)}}\left(x,M\setminus\text{Bad}\right)\leq\sum_{x\in M\setminus\text{Bad}}\,d_{G^{(q)}}\left(x,x\right)=0. \tag{7}$$

Now sum up equations (6)–(7) and invoke Corollary 13. ∎

Theorem 18.

Each deterministic $O(n)$-query algorithm for metric $1$-median is not $(\delta\log n)$-approximate for a sufficiently small constant $\delta>0$.

Proof.

By Lemma 9, Adv answers consistently with $d_{G^{(q)}}(\cdot,\cdot)$. By Lemmas 11 and 16–17, Alg's output, $z^*$, satisfies

$$\sum_{x\in M}\,d_{G^{(q)}}(z^*,x)=\Omega(\log n)\cdot\sum_{x\in M}\,d_{G^{(q)}}(y,x)$$

for some $y\in M$. Finally, recall that Alg is an arbitrary deterministic $O(n)$-query algorithm. ∎

3.1 Even fewer queries

For all $n\in\mathbb{Z}^+$, $[n]\equiv\{1,2,\ldots,n\}$. This subsection assumes $q=o(n)$ and $M=[n]$. An algorithm is said to be tame if its queries are in $[2q+1]\times[2q+1]$ and its output is in $[2q+1]$.

1:  $\text{cnt}\leftarrow 0$;
2:  for $i=1$ up to $q$ do
3:     Receive the $i$th query of Alg, denoted by $(a_i,b_i)\in M^2$;
4:     if $a_i\notin\{a_1,b_1,a_2,b_2,\ldots,a_{i-1},b_{i-1}\}$ then
5:        $\text{cnt}\leftarrow\text{cnt}+1$;
6:        $\pi(a_i)\leftarrow\text{cnt}$;
7:     end if
8:     if $b_i\notin\{a_1,b_1,a_2,b_2,\ldots,a_{i-1},b_{i-1}\}\cup\{a_i\}$ then
9:        $\text{cnt}\leftarrow\text{cnt}+1$;
10:        $\pi(b_i)\leftarrow\text{cnt}$;
11:     end if
12:     Query for the distance between $\pi(a_i)$ and $\pi(b_i)$, and return the answer to Alg;
13:  end for
14:  Receive the output $z^*$ of Alg;
15:  if $z^*\notin\{a_1,b_1,a_2,b_2,\ldots,a_q,b_q\}$ then
16:     $\text{cnt}\leftarrow\text{cnt}+1$;
17:     $\pi(z^*)\leftarrow\text{cnt}$;
18:  end if
19:  return $\pi(z^*)$;
Figure 2: Algorithm Sim for simulating Alg with points renamed
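
The renaming performed by Sim can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's code: `alg` is a hypothetical callable that receives a `query(a, b)` function and returns its output point, and `oracle(i, j)` is a hypothetical distance oracle on the renamed points. The dictionary size plays the role of cnt.

```python
def sim(alg, oracle):
    """Simulate `alg` with points renamed in order of first appearance.

    alg(query): hypothetical deterministic algorithm; it calls query(a, b)
        to learn a distance and finally returns its output point.
    oracle(i, j): hypothetical distance oracle on the renamed points.
    Returns the renamed output together with the renaming map pi.
    """
    pi = {}  # renaming map; len(pi) equals cnt in Fig. 2

    def rename(x):
        # Lines 4-11 and 15-18: a point seen for the first time gets cnt + 1.
        if x not in pi:
            pi[x] = len(pi) + 1
        return pi[x]

    def query(a, b):
        # Rename a before b, then forward the query as in line 12.
        ia, ib = rename(a), rename(b)
        return oracle(ia, ib)

    z_star = alg(query)          # line 14
    return rename(z_star), pi    # lines 15-19
```

Because each of the $q$ queries introduces at most two new points and the output at most one more, the values of `pi` stay within $[2q+1]$, which is what tameness requires.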
Lemma 19.

When Sim (in Fig. 2) terminates, $\pi(\cdot)$ is injective.

Proof.

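Lines 6, 10 and 17 are the only lines that assign values of $\pi$, and cnt increments immediately before each of them. So no two points receive the same value. ∎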

Lemma 20.

When Sim terminates, $\pi(a_i)$, $\pi(b_i)$, $\pi(z^*)\in[2q+1]$ for all $1\leq i\leq q$.

Proof.

Each query increases cnt by at most two in lines 4–11. Lines 15–18 may also increase cnt, by at most one. So cnt never exceeds $2q+1$. Lines 6, 10, and 17 set $\pi(x)$ to be cnt for some $x\in M$. ∎

Lemma 21.

If Alg is $h(n)$-approximate for metric $1$-median, where $h\colon\mathbb{Z}^+\to\mathbb{R}$, then Sim is a tame $q$-query $h(n)$-approximation algorithm for metric $1$-median.

Proof.

By Lemma 19, Sim simulates Alg with an injective renaming of points. So, inheriting from Alg, Sim is $h(n)$-approximate and makes $q$ queries. By Lemma 20 and lines 12 and 19 of Sim, Sim is tame. ∎

The following result complements Theorem 7.

Theorem 22.

Each deterministic $o(n)$-query algorithm for metric $1$-median fails to be $o(f(n)\cdot\log n)$-approximate for some computable function $f\colon\mathbb{Z}^+\to\mathbb{Z}^+$ satisfying $f(n)=\omega(1)$.

Proof.

By Lemma 21, assume Alg to be tame without loss of generality (otherwise, prove the theorem against Sim instead of Alg). Let $z^*$ be Alg's output when the queries are answered by Adv with $M$ (resp., $n$) substituted by $[2q+1]$ (resp., $2q+1$). By Lemma 11 with $M$ (resp., $n$) substituted by $[2q+1]$ (resp., $2q+1$),

$$\sum_{x\in[2q+1]}\,d_{G^{(q)}}(z^*,x)=\Omega\left((2q+1)\log(2q+1)\right), \tag{8}$$

where $G^{(q)}$ is a graph on $[2q+1]$ as in Adv. By Lemmas 16–17 with $M$ (resp., $n$) substituted by $[2q+1]$ (resp., $2q+1$), there exists $y\in[2q+1]$ satisfying

$$\sum_{x\in[2q+1]}\,d_{G^{(q)}}(y,x)=O(q). \tag{9}$$

Equations (8)–(9) and the triangle inequality imply

$$d_{G^{(q)}}(z^*,y)=\Omega(\log q). \tag{10}$$

Recall that $y\in[2q+1]$. Put all points in $[n]\setminus[2q+1]$ extremely close to $y$: Let $d(a,a)\equiv 0$ for all $a\in[n]$ and, for all distinct $a$, $b\in[n]$,

$$d(a,b)\equiv\left\{\begin{array}{ll}1/2^n,&\text{if $a$, $b\in\{y\}\cup([n]\setminus[2q+1])$,}\\ d_{G^{(q)}}(a,y),&\text{if $a\notin\{y\}\cup([n]\setminus[2q+1])$ and $b\in\{y\}\cup([n]\setminus[2q+1])$,}\\ d_{G^{(q)}}(y,b),&\text{if $a\in\{y\}\cup([n]\setminus[2q+1])$ and $b\notin\{y\}\cup([n]\setminus[2q+1])$,}\\ d_{G^{(q)}}(a,b),&\text{otherwise.}\end{array}\right. \tag{15}$$

It is not hard to see that $d$ is induced by the weighted graph obtained in the following way: (1) Add all vertices in $[n]\setminus[2q+1]$ to $G^{(q)}$. (2) Add an edge between each $v\in[n]\setminus[2q+1]$ and each neighbor (in $G^{(q)}$) of $y$. (3) Connect any two vertices in $\{y\}\cup([n]\setminus[2q+1])$ by an edge of weight $1/2^n$, all other edge weights being $1$.
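
The case analysis defining $d$ can be made concrete with a small Python sketch. It is an illustration under assumptions rather than the paper's construction verbatim: the names `bfs_distances`, `extended_metric` and `is_metric` are hypothetical, the core graph is passed as an adjacency dictionary and is assumed connected, and exact rationals are used so that $1/2^n$ is not lost to rounding.

```python
from collections import deque
from fractions import Fraction
from itertools import product

def bfs_distances(adj, src):
    """Unweighted shortest-path distances from src in the graph adj."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def extended_metric(adj, y, n):
    """Distance function on {1, ..., n} following the four cases of equation (15).

    adj: adjacency dict of a connected graph on the core points (assumption).
    y:   the core point next to which all non-core points are placed.
    """
    dg = {u: bfs_distances(adj, u) for u in adj}
    cluster = {y} | (set(range(1, n + 1)) - set(adj))
    eps = Fraction(1, 2 ** n)

    def d(a, b):
        if a == b:
            return Fraction(0)
        if a in cluster and b in cluster:
            return eps
        if b in cluster:
            return Fraction(dg[a][y])
        if a in cluster:
            return Fraction(dg[y][b])
        return Fraction(dg[a][b])

    return d

def is_metric(d, points):
    """Brute-force check of identity, symmetry and the triangle inequality."""
    pts = list(points)
    for a, b in product(pts, repeat=2):
        if (d(a, b) == 0) != (a == b) or d(a, b) != d(b, a):
            return False
    return all(d(a, c) <= d(a, b) + d(b, c) for a, b, c in product(pts, repeat=3))
```

For instance, with the path 1, 2, 3 as the core graph, $y=2$ and $n=5$, points 4 and 5 sit at distance $1/32$ from $y$ and from each other, while distances among core points other than $y$ are unchanged.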

As Alg is tame, $(a_i,b_i)\in[2q+1]\times[2q+1]$ for all $1\leq i\leq q$, implying $d(a_i,b_i)=d_{G^{(q)}}(a_i,b_i)$ by equation (15). So by Lemma 9, Adv answers queries consistently with $d(\cdot,\cdot)$.

We have

$$\begin{aligned}\sum_{x\in[n]\setminus\{y\}}\,d(y,x)&=\sum_{x\in[2q+1]\setminus\{y\}}\,d(y,x)+\sum_{x\in[n]\setminus([2q+1]\cup\{y\})}\,d(y,x)\\ &\stackrel{(15)}{=}\sum_{x\in[2q+1]\setminus\{y\}}\,d(y,x)+\sum_{x\in[n]\setminus([2q+1]\cup\{y\})}\,\frac{1}{2^n}\\ &\stackrel{(15)}{=}\sum_{x\in[2q+1]\setminus\{y\}}\,d_{G^{(q)}}(y,x)+\sum_{x\in[n]\setminus([2q+1]\cup\{y\})}\,\frac{1}{2^n}\\ &\stackrel{(9)}{=}O(q).\end{aligned} \tag{17}$$

As Alg is tame, $z^*\in[2q+1]$. By equation (10), $z^*\neq y$. [Footnote 8: For proving the theorem, we may assume $q>\sqrt{n}$ without loss of generality. So $\Omega(\log q)$ is nonzero.] So $z^*\in[2q+1]\setminus\{y\}$. Now,

$$\sum_{x\in[n]}\,d(z^*,x)\geq\sum_{x\in[n]\setminus[2q+1]}\,d(z^*,x)\stackrel{(15)}{=}\sum_{x\in[n]\setminus[2q+1]}\,d_{G^{(q)}}(z^*,y)\stackrel{(10)}{=}\Omega((n-(2q+1))\log q).$$

This and equations (3.1)–(17) show $z^*$ to be no better than $((\delta n/q)\cdot\log q)$-approximate for some constant $\delta>0$. Clearly, $(\delta n/q)\cdot\log q=\omega(\log n)$. So taking $f(n)=\lfloor(n/q)\cdot(\log q)/(\log n)\rfloor$ completes the proof except that $f(n)$ may be uncomputable. Fortunately, $d$ has codomain $\{1/2^n,0,1,\ldots,n-1\}$ by equation (15). [Footnote 9: Any graph on a subset of $[n]$ induces distances in $\{0,1,\ldots,n-1,\infty\}$. But equations (3.1)–(17) forbid $\infty$ as a distance.] So we may take $q$ to be Alg's worst-case query complexity with respect to metrics with codomain $\{1/2^n,0,1,\ldots,n-1\}$. This makes $q$, and thus $f(n)$, computable. ∎
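
The approximation ratio claimed in the proof can be expanded as follows. This is a sketch of the arithmetic only, using the bound (17) for $y$, the lower bound on $\sum_{x\in[n]}d(z^*,x)$ displayed just before it, $q=o(n)$, and the assumption $q>\sqrt{n}$ from footnote 8:

```latex
\frac{\sum_{x\in[n]} d(z^{*},x)}{\min_{p\in[n]}\sum_{x\in[n]} d(p,x)}
\;\geq\; \frac{\sum_{x\in[n]} d(z^{*},x)}{\sum_{x\in[n]} d(y,x)}
\;=\; \frac{\Omega\bigl((n-(2q+1))\log q\bigr)}{O(q)}
\;=\; \Omega\!\left(\frac{n}{q}\cdot\log q\right),
\qquad
\frac{n}{q}\cdot\log q \;>\; \frac{n}{2q}\cdot\log n \;=\; \omega(\log n).
```

The second-to-last step uses $n-(2q+1)=\Theta(n)$, and the final one uses $\log q>\tfrac{1}{2}\log n$ together with $n/q=\omega(1)$.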

Corollary 23.

Metric $1$-median has no deterministic $o(n)$-query $O(\log n)$-approximation algorithms.

Proof.

Immediate from Theorem 22. ∎

Corollary 24.

Metric $1$-median has no deterministic $o(n)$-query algorithms with an asymptotically best approximation ratio.

Proof.

Take any deterministic $o(n)$-query algorithm $A$. By Theorem 22, there exists a computable $f_A(n)=\omega(1)$ forbidding $A$ to be $o(f_A(n)\cdot\log n)$-approximate. But Theorem 7 asserts the existence of a deterministic $o(n)$-query $o(\sqrt{f_A(n)}\cdot\log n)$-approximation algorithm, whose approximation ratio is asymptotically better than that of $A$. ∎

Appendix A Distances in expanders

It is well-known that an $O(1)$-regular expander graph $G^{\text{exp}}$ on $M$ exists. That is, there exist constants $d\in\mathbb{Z}^+$ and $0<\alpha<1$ such that

  (i) $G^{\text{exp}}$ is $d$-regular, and

  (ii) for each $S\subseteq M$ of size at most $n/2$, at least $\alpha d\,|S|$ edges of $G^{\text{exp}}$ are in $S\times(M\setminus S)$.

Lemma 25.

For each nonempty $U\subseteq M$ of size at most $n/2$,

$$\sum_{x\in U}\,d_{G^{\text{exp}}}\left(x,M\setminus U\right)=O(|U|).$$
Proof.

For each $i\geq 1$, let

$$\begin{aligned}L_0&\equiv M\setminus U,\\ L_i&\equiv\left\{x\in U\mid d_{G^{\text{exp}}}\left(x,M\setminus U\right)=i\right\},\\ S_i&\equiv L_i\cup L_{i+1}\cup\cdots\end{aligned}$$

So $L_i$ is the set of vertices at level $i$ of the BFS tree rooted at $M\setminus U$. [Footnote 10: Generalize BFS in the obvious way to allow the root to be a set of vertices.]
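
Footnote 10's generalization of BFS to a set of roots can be sketched as below. This is an illustrative helper, not part of the paper: the name `bfs_levels_from_set` and the adjacency-dictionary input are assumptions. All roots start at distance 0, so level $i$ of the result corresponds to $L_i$ when `roots` is $M\setminus U$.

```python
from collections import deque

def bfs_levels_from_set(adj, roots):
    """Return {i: set of vertices at distance exactly i from the set `roots`}."""
    dist = {r: 0 for r in roots}   # every root is at level 0
    queue = deque(roots)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:      # first visit gives the shortest distance
                dist[w] = dist[u] + 1
                queue.append(w)
    levels = {}
    for v, k in dist.items():
        levels.setdefault(k, set()).add(v)
    return levels

# Example: a 6-cycle with roots {0, 1}, i.e. U = {2, 3, 4, 5}.
cycle = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}
print(bfs_levels_from_set(cycle, {0, 1}))
```

Summing `i * len(levels[i])` over the positive levels gives the quantity $\sum_{x\in U}d_{G^{\text{exp}}}(x,M\setminus U)$ that the lemma bounds.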

Now fix any $i\geq 1$. Because edges cannot cross non-adjacent levels of a BFS tree, every edge of $G^{\text{exp}}$ in $S_i\times(M\setminus S_i)$ is in $L_i\times L_{i-1}$. By item (ii) (with $S$ replaced by $S_i$ and noting that $S_i\subseteq U$ has size at most $n/2$), at least $\alpha d\,|S_i|$ edges of $G^{\text{exp}}$ are in $S_i\times(M\setminus S_i)$. In summary, at least $\alpha d\,|S_i|$ edges are in $L_i\times L_{i-1}$ (and are thus incident to a vertex in $L_i$). As $G^{\text{exp}}$ is $d$-regular, therefore, $|L_i|\geq\alpha\,|S_i|$. Hence

$$|S_{i+1}|=|S_i\setminus L_i|\leq(1-\alpha)|S_i|. \tag{18}$$

Iterating inequality (18),

$$|S_j|\leq(1-\alpha)^{j-1}|S_1|=(1-\alpha)^{j-1}|U|$$

for all $j\geq 1$. So

$$|L_j|\leq|S_j|\leq(1-\alpha)^{j-1}|U| \tag{19}$$

for all $j\geq 1$. Now,

$$\begin{aligned}\sum_{x\in U}\,d_{G^{\text{exp}}}\left(x,M\setminus U\right)&=\sum_{j=1}^{\infty}\,\sum_{x\in L_j}\,d_{G^{\text{exp}}}\left(x,M\setminus U\right)\\ &=\sum_{j=1}^{\infty}\,\sum_{x\in L_j}\,j\\ &=\sum_{j=1}^{\infty}\,|L_j|\cdot j\\ &\stackrel{(19)}{\leq}\sum_{j=1}^{\infty}\,(1-\alpha)^{j-1}|U|\cdot j\\ &=O(|U|),\end{aligned}$$

where the last equality uses the convergence of $\sum_{j=1}^{\infty}\,(1-\alpha)^{j-1}j$. ∎
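
The constant hidden in the final $O(|U|)$ can be made explicit. The following aside is not part of the original proof; it uses the standard derivative of the geometric series, with $r$ as an auxiliary variable introduced here:

```latex
\sum_{j=1}^{\infty} j\,r^{j-1}
= \frac{d}{dr}\sum_{j=0}^{\infty} r^{j}
= \frac{d}{dr}\,\frac{1}{1-r}
= \frac{1}{(1-r)^{2}} \quad (|r|<1),
\qquad\text{so } r = 1-\alpha \text{ gives }\quad
\sum_{x\in U} d_{G^{\text{exp}}}\left(x, M\setminus U\right) \leq \frac{|U|}{\alpha^{2}}.
```

Since $\alpha$ is a constant depending only on the expander family, $1/\alpha^2$ is a constant as well.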

Appendix B Acknowledgments

The author is supported by the Ministry of Science and Technology of Taiwan under grant 110-2221-E-155-012-.

References

  • [1] P. Bose, A. Maheshwari, and P. Morin. Fast approximations for sums of distances, clustering and the Fermat–Weber problem. Computational Geometry, 24(3):135–146, 2003.
  • [2] C.-L. Chang. A lower bound for metric 1-median selection. Journal of Computer and System Sciences, 84:44–51, 2017.
  • [3] C.-L. Chang. Metric 1-median selection with fewer queries. In Proceedings of the 2017 International Conference on Applied System Innovation, pages 1056–1059, 2017.
  • [4] C.-L. Chang. Metric 1-median selection: Query complexity vs. approximation ratio. ACM Transactions on Computation Theory, 9(4):1–23, 2018. Article 20.
  • [5] C.-L. Chang. A note on metric 1-median selection. In Proceedings of the 23rd International Computer Symposium, pages 457–459, Yunlin, Taiwan, 2018.
  • [6] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 3rd edition, 2001.
  • [7] A. Czumaj and C. Sohler. Sublinear-time approximation algorithms for clustering via random sampling. Random Structures & Algorithms, 30(1–2):226–256, 2007.
  • [8] D. Eppstein and J. Wang. Fast approximation of centrality. Journal of Graph Algorithms and Applications, 8(1):39–45, 2004.
  • [9] O. Goldreich and D. Ron. Approximating average parameters of graphs. Random Structures & Algorithms, 32(4):473–493, 2008.
  • [10] S. Guha, A. Meyerson, N. Mishra, R. Motwani, and L. O'Callaghan. Clustering data streams: Theory and practice. IEEE Transactions on Knowledge and Data Engineering, 15(3):515–528, 2003.
  • [11] P. Indyk. Sublinear time algorithms for metric space problems. In Proceedings of the 31st Annual ACM Symposium on Theory of Computing, pages 428–434, 1999.
  • [12] P. Indyk. High-dimensional computational geometry. PhD thesis, Stanford University, 2000.
  • [13] K. Jain, M. Mahdian, and A. Saberi. A new greedy approach for facility location problems. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, pages 731–740, 2002.
  • [14] A. Kumar, Y. Sabharwal, and S. Sen. Linear-time approximation schemes for clustering problems in any dimensions. Journal of the ACM, 57(2):5, 2010.
  • [15] R. R. Mettu and C. G. Plaxton. Optimal time bounds for approximate clustering. Machine Learning, 56(1–3):35–60, 2004.
  • [16] W. Rudin. Principles of Mathematical Analysis. McGraw-Hill, 3rd edition, 1976.
  • [17] S. Wasserman and K. Faust. Social Network Analysis: Methods and Applications. Cambridge University Press, 1994.
  • [18] B. Y. Wu. On approximating metric 1-median in sublinear time. Information Processing Letters, 114(4):163–166, 2014.