The massive volume of data generated in modern applications can overwhelm our ability to conveniently transmit, store, and index it. For many scenarios, building a compact summary of a dataset, one vastly smaller than the data itself, enables flexibility and efficiency in a range of queries over the data, in exchange for some approximation. This comprehensive introduction to data summarization, aimed at practitioners and students, showcases the algorithms, their behavior, and the mathematical underpinnings of their operation. The coverage starts with simple sums and approximate counts, building to more advanced probabilistic structures such as the Bloom filter, distinct value summaries, sketches, and quantile summaries. Summaries are described for specific types of data, such as geometric data, graphs, and vectors and matrices. The authors offer detailed descriptions of, and pseudocode for, key algorithms that have been incorporated in systems from companies such as Google, Apple, Microsoft, Netflix, and Twitter.