Uppsala University Publications (uu.se)
Improving massive experiments with threshold blocking
Kansas State Univ, Dept Stat, Manhattan, KS 66506 USA.
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Social Sciences, Department of Economics.
Univ Calif Berkeley, Dept Polit Sci, Berkeley, CA 94720 USA;Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA.
2016 (English). In: Proceedings of the National Academy of Sciences of the United States of America, ISSN 0027-8424, E-ISSN 1091-6490, Vol. 113, no 27, 7369-7376 p. Article in journal (Refereed). Published.
Abstract [en]

Inferences from randomized experiments can be improved by blocking: assigning treatment in fixed proportions within groups of similar units. However, the use of the method is limited by the difficulty in deriving these groups. Current blocking methods are restricted to special cases or run in exponential time; are not sensitive to clustering of data points; and are often heuristic, providing an unsatisfactory solution in many common instances. We present an algorithm that implements a widely applicable class of blocking, threshold blocking, that solves these problems. Given a minimum required group size and a distance metric, we study the blocking problem of minimizing the maximum distance between any two units within the same group. We prove that this problem is NP-hard (nondeterministic polynomial-time hard) and derive an approximation algorithm that yields a blocking where the maximum distance is guaranteed to be at most four times the optimal value. This algorithm runs in O(n log n) time with O(n) space complexity. This makes it, to our knowledge, the first blocking method with an ensured level of performance that works in massive experiments. Whereas many commonly used algorithms form pairs of units, our algorithm constructs the groups flexibly for any chosen minimum size. This facilitates complex experiments with several treatment arms and clustered data. A simulation study demonstrates the efficiency and efficacy of the algorithm; tens of millions of units can be blocked using a desktop computer in a few minutes.
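
To make the objective stated in the abstract concrete, the following minimal Python sketch evaluates a given blocking: it checks the minimum-size constraint and computes the maximum distance between any two units in the same group. It is only an illustration of the problem statement, not the authors' approximation algorithm; the function names, the use of Euclidean distance, and the toy data are assumptions made here. The evaluation is a brute-force pairwise check and does not reflect the O(n log n) running time reported for the algorithm.

```python
import math
from itertools import combinations


def max_within_block_distance(points, blocks, dist=math.dist):
    """Return the largest distance between any two units that share a block.

    This is the quantity the blocking problem seeks to minimize.
    `points` is a list of coordinate tuples; `blocks` is a list of lists
    of unit indices. Euclidean distance is an assumption of this sketch.
    """
    worst = 0.0
    for members in blocks:
        for i, j in combinations(members, 2):
            worst = max(worst, dist(points[i], points[j]))
    return worst


def is_threshold_blocking(n_units, blocks, min_size):
    """Check that every unit is in exactly one block and that every block
    contains at least `min_size` units."""
    assigned = sorted(i for members in blocks for i in members)
    covers_all_once = assigned == list(range(n_units))
    return covers_all_once and all(len(m) >= min_size for m in blocks)


# Hypothetical toy data: five units in the plane, blocked into two groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
blocks = [[0, 1, 2], [3, 4]]

print(is_threshold_blocking(len(points), blocks, min_size=2))
print(max_within_block_distance(points, blocks))
```

With a minimum size of 2 this blocking is valid; requiring a minimum size of 3 would rule it out, because the second group has only two units.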

Place, publisher, year, edition, pages
2016. Vol. 113, no 27, 7369-7376 p.
Keyword [en]
experimental design; blocking; big data; causal inference
URN: urn:nbn:se:uu:diva-299820; DOI: 10.1073/pnas.1510504113; ISI: 000379021700041; PubMedID: 27382151; OAI: oai:DiVA.org:uu-299820; DiVA: diva2:950226
Available from: 2016-07-28; Created: 2016-07-28; Last updated: 2016-08-02. Bibliographically approved.
