Testing Properties of Collections of Distributions

by Reut Levi, Dana Ron, and Ronitt Rubinfeld

Theory of Computing, Volume 9(8), pp. 295-347, 2013

Bibliography with links to cited articles

[1]   Michał Adamaszek, Artur Czumaj, and Christian Sohler: Testing monotone continuous distributions on high-dimensional real cubes. In Proc. 21st Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’10), pp. 56–65. ACM Press, 2010. Short version in Property Testing. [ACM:1873601.1873607]

[2]   Noga Alon, Alexandr Andoni, Tali Kaufman, Kevin Matulef, Ronitt Rubinfeld, and Ning Xie: Testing k-wise and almost k-wise independence. In Proc. 39th STOC, pp. 496–505. ACM Press, 2007. [doi:10.1145/1250790.1250863]

[3]   Noga Alon, Seannie Dar, Michal Parnas, and Dana Ron: Testing of clustering. SIAM J. Discrete Math., 16(3):393–417, 2003. Preliminary version in FOCS’00. [doi:10.1137/S0895480102410973]

[4]   Noga Alon, Yossi Matias, and Mario Szegedy: The space complexity of approximating the frequency moments. J. Comput. System Sci., 58(1):137–147, 1999. Preliminary version in STOC’96. [doi:10.1006/jcss.1997.1545]

[5]   Alexandr Andoni, Piotr Indyk, Krzysztof Onak, and Ronitt Rubinfeld: External sampling. In Proc. 36th Internat. Colloq. on Automata, Languages and Programming (ICALP’09), pp. 83–94. Springer, 2009. [doi:10.1007/978-3-642-02927-1_9]

[6]   Khanh Do Ba, Huy L. Nguyen, Huy N. Nguyen, and Ronitt Rubinfeld: Sublinear time algorithms for Earth Mover’s Distance. Theory Comput. Syst., 48(2):428–442, 2011. Preprint available at arXiv. [doi:10.1007/s00224-010-9265-8]

[7]   Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, D. Sivakumar, and Luca Trevisan: Counting distinct elements in a data stream. In Proc. 6th Internat. Workshop on Randomization and Computation (RANDOM’02), pp. 1–10. Springer, 2002. [doi:10.1007/3-540-45726-7_1]

[8]   Tuǧkan Batu: Testing properties of distributions. Ph. D. thesis, Computer Science department, Cornell University, 2001. http://www.maths.lse.ac.uk/Personal/batu/papers/t.ps. [ACM:934334]

[9]   Tuǧkan Batu, Sanjoy Dasgupta, Ravi Kumar, and Ronitt Rubinfeld: The complexity of approximating the entropy. SIAM J. Comput., 35(1):132–150, 2005. Preliminary versions in CCC’02 and STOC’02. [doi:10.1137/S0097539702403645]

[10]   Tuǧkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White: Testing random variables for independence and identity. In Proc. 42nd FOCS, pp. 442–451. IEEE Comp. Soc. Press, 2001. [doi:10.1109/SFCS.2001.959920]

[11]   Tuǧkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White: Testing that distributions are close. In Proc. 41st FOCS, pp. 259–269. IEEE Comp. Soc. Press, 2000. [doi:10.1109/SFCS.2000.892113]

[12]   Tuǧkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White: Testing closeness of discrete distributions. Technical report, 2010. Preliminary version in FOCS’00. Accepted for publication in JACM. [arXiv:1009.5397]

[13]   Tuǧkan Batu, Ravi Kumar, and Ronitt Rubinfeld: Sublinear algorithms for testing monotone and unimodal distributions. In Proc. 36th STOC, pp. 381–390. ACM Press, 2004. [doi:10.1145/1007352.1007414]

[14]   Vladimir Braverman and Rafail Ostrovsky: Measuring k-wise independence of streaming data. Technical report, 2008. [arXiv:0806.4790v1]

[15]   Vladimir Braverman and Rafail Ostrovsky: Measuring independence of datasets. In Proc. 42nd STOC, pp. 271–280. ACM Press, 2010. [doi:10.1145/1806689.1806728]

[16]   Vladimir Braverman and Rafail Ostrovsky: Zero-one frequency laws. In Proc. 42nd STOC, pp. 281–290. ACM Press, 2010. [doi:10.1145/1806689.1806729]

[17]   Don Coppersmith and Ravi Kumar: An improved data stream algorithm for frequency moments. In Proc. 15th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’04), pp. 151–156. ACM Press, 2004. [ACM:982815]

[18]   Graham Cormode, Mayur Datar, Piotr Indyk, and S. Muthukrishnan: Comparing data streams using Hamming norms (how to zero in). IEEE Trans. Knowl. Data Eng., 15(3):529–540, 2003. Preliminary version in VLDB’02. [doi:10.1109/TKDE.2003.1198388]

[19]   Imre Csiszár: Information-type measures of difference of probability distributions and indirect observations. Studia Scientiarum Mathematicarum Hungarica, 2:299–318, 1967.

[20]    Artur Czumaj and Christian Sohler: Testing expansion in bounded-degree graphs. Combinatorics, Probability & Computing, 19(5-6):693–709, 2010. Preliminary version in FOCS’07. [doi:10.1017/S096354831000012X]

[21]   Joan Feigenbaum, Sampath Kannan, Martin J. Strauss, and Mahesh Viswanathan: An approximate L1-difference algorithm for massive data streams. SIAM J. Comput., 32(1):131–151, January 2003. Preliminary verison in FOCS’99. [doi:10.1137/S0097539799361701]

[22]   William Feller: An introduction to probability theory and its applications. Volume 1. Wiley, 3rd edition, 1967.

[23]   Eldar Fischer and Arie Matsliah: Testing graph isomorphism. SIAM J. Comput., 38(1):207–225, 2008. Preliminary version in SODA’06. [doi:10.1137/070680795]

[24]   Jessica H. Fong and Martin Strauss: An approximate Lp difference algorithm for massive data streams. Discrete Mathematics & Theoretical Computer Science, 4(2):301–322, 2001. DMTCS. Preliminary version in STACS’00.

[25]   Oded Goldreich and Dana Ron: On testing expansion in bounded-degree graphs. In Studies in Complexity and Cryptography, pp. 68–75. Springer, 2011. Preliminary version in ECCC. [doi:10.1007/978-3-642-22670-0_9]

[26]   Sudipto Guha, Andrew McGregor, and Suresh Venkatasubramanian: Sublinear estimation of entropy and information distances. ACM Trans. Algorithms, 5(4):35, 2009. Preliminary version in SODA’06. [doi:10.1145/1597036.1597038]

[27]   Bernard Harris: The statistical estimation of entropy in the non-parametric case. Colloquia Mathematica Societatis János Bolyai, 16:323–355, 1975. DTIC Online.

[28]   Piotr Indyk and Andrew McGregor: Declaring independence via the sketching of sketches. In Proc. 19th Ann. ACM-SIAM Symp. on Discrete Algorithms (SODA’08), pp. 737–745. ACM Press, 2008. [ACM:1347163]

[29]   Yuval Ishai, Eyal Kushilevitz, Rafail Ostrovsky, and Amit Sahai: Extracting correlations. In Proc. 50th FOCS, pp. 261–270. IEEE Comp. Soc. Press, 2009. [doi:10.1109/FOCS.2009.56]

[30]   Satyen Kale and Comandur Seshadhri: An expansion tester for bounded degree graphs. SIAM J. Comput., 40(3):709–720, 2011. Preliminary versions in ICALP’08 and ECCC. [doi:10.1137/100802980]

[31]   Donald Knuth: The Art of Computer Programming: Seminumerical Algorithms. Volume 2. Addison Wesley, Phillipines, 1969.

[32]   Reut Levi, Dana Ron, and Ronitt Rubinfeld: Testing properties of collections of distributions. In Proc. 2nd Symp. Innovations in Comput. Sci. (ICS’11), pp. 179–194, 2011. ICS’11.

[33]   Shang-keng Ma: Calculation of entropy from data of motion. J. Statistical Physics, 26(2):221–240, 1981. [doi:10.1007/BF01013169]

[34]   Dragoslav S. Mitrinović and Petar M. Vasić: Analytic Inequalities. Springer, 1970.

[35]   Asaf Nachmias and Asaf Shapira: Testing the expansion of a graph. Inform. and Comput., 208(4):309–314, 2010. Preliminary version in ECCC. [doi:10.1016/j.ic.2009.09.002]

[36]   Ilya Nemenman, William Bialek, and Rob de Ruyter van Steveninck: Entropy and information in neural spike trains: Progress on the sampling problem. Phys. Rev. E, 69(5):056111, 2004. Preprint available at arXiv. [doi:10.1103/PhysRevE.69.056111]

[37]   Liam Paninski: Estimation of entropy and mutual information. Neural Computation, 15(6):1191–1253, 2003. [doi:10.1162/089976603321780272]

[38]   Liam Paninski: Estimating entropy on m bins given fewer than m samples. IEEE Trans. Inform. Theory, 50(9):2200–2203, 2004. [doi:10.1109/TIT.2004.833360]

[39]   Liam Paninski: A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Trans. Inform. Theory, 54(10):4750–4755, 2008. [doi:10.1109/TIT.2008.928987]

[40]   Sofya Raskhodnikova, Dana Ron, Ronitt Rubinfeld, and Adam Smith: Sublinear algorithms for approximating string compressibility. Algorithmica, 65(3):685–709, 2013. Preliminary version in RANDOM’07. [doi:10.1007/s00453-012-9618-6]

[41]   Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith: Strong lower bonds for approximating distributions support size and the distinct elements problem. SIAM J. Comput., 39(3):813–842, 2009. Preliminary version in FOCS’07. [doi:10.1137/070701649]

[42]   Bero Roos: On the rate of multivariate Poisson convergence. J. Multivar. Anal., 69(1):120–134, 1999. [doi:10.1006/jmva.1998.1789]

[43]   Ronitt Rubinfeld and Rocco A. Servedio: Testing monotone high-dimensional distributions. Random Structures & Algorithms, 34(1):24–44, 2009. Preliminary version in STOC’05. [doi:10.1002/rsa.20247]

[44]   Ronitt Rubinfeld and Ning Xie: Robust characterizations of k-wise independence over product spaces and related testing results. Random Structures & Algorithms, 2012 (online). Preliminary version in ICALP 2010. [doi:10.1002/rsa.20423]

[45]   Steve P. Strong, Roland Koberle, Rob R. de Ruyter van Steveninck, and William Bialek: Entropy and information in neural spike trains. Phys. Rev. Lett., 80(1):197–200, 1998. [doi:10.1103/PhysRevLett.80.197]

[46]   Wojciech Szpankowski: Average Case Analysis of Algorithms on Sequences. John Wiley & Sons, New York, 2001. [ACM:517168]

[47]   Paul Valiant: Testing symmetric properties of distributions. Ph. D. thesis, CSAIL, MIT, 2008. DSpace@MIT.

[48]   Paul Valiant: Testing symmetric properties of distributions. SIAM J. Comput., 40(6):1927–1968, 2011. Preliminary version in STOC’08. [doi:10.1137/080734066]

[49]   Patrick White: Testing random variables for independence and identity. Unpublished manuscript, 2002.

[50]   David H. Wolpert and David R. Wolf: Estimating functions of probability distributions from a finite set of samples. Physical Review E, 52(6):6841–6854, 1995. [doi:10.1103/PhysRevE.52.6841]

[51]   Kenji Yamanishi: Probably almost discriminative learning. Machine Learning, 18(1):23–50, 1995. Preliminary version in COLT’92. [doi:10.1007/BF00993820]