Announcement

Course Description

  • This course covers database management system design principles and techniques. Possible topics include internal design of DBMS, indexing, query processing, distributed scientific databases, computational geometry, geographic information systems, and data intensive computing. In the first half of the trimester, we will review fundamental knowledge about DBMS. After that we will select and read a couple of milestone papers in DB history as well as the state-of-the-art DB papers mainly focusing on indexing techniques. As need arises, this course will also cover some basic computational geometry theory.

Lecture Hours

  • Tue, Thu, 10:00 - 12:00

Recommended Textbook (Not Required)

  1. Database System Concepts,
    by Abraham Silberschatz, Henry F. Korth, S. Sudarshan, McGraw-Hill, 6th Edition
  2. Readings in Database Systems,
    by Hellerstein and Stonebraker, MIT Press, 4th Edition
    Redbook Web Site

Reading List

  1. David A. Patterson, Garth Gibson and Randy H. Katz "A case for redundant arrays of inexpensive disks (RAID)", SIGMOD 88
  2. Michael Stonebraker. Operating System Support for Database Management.. Commun. ACM, 24(7), 1981, 412-418.
  3. A. Guttman, R-Trees: A Dynamic Index Structure For Spatial Searching, SIGMOD 84
  4. N. Beckmann, H.P. Kriegel, R. Schneider, B. Seeger, R*-tree: An Efficient and Robust Access Method for Points and Rectangles, SIGMOD 90
  5. S. Berchtold, D. A. Keim, The X-Tree: An Index Structure for High-Dimensional Data, VLDB 96
  6. K. Chakrabarti, S. Mehrotra, The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces, ICDE 99
  7. Indyk, Piotr.; Motwani, Rajeev. , "Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality.". Symposium on Theory of Computing, 1998
  8. N. Roussopoulos, S. Kelley, F. Vincent: Nearest Neighbor Queries. SIGMOD 95
  9. S. Venkataraman, N. Tolia, P. Ranganathan, R. H. Campbell, "Consistent and Durable Data Structures for Non-Volatile Byte-Addressable Memory", USENIX FAST, 2011
  10. J. Dean and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, in OSDI 2004
  11. A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi, D. J. DeWitt, S. Madden, and M. Stonebraker, A Comparison of Approaches to Large Scale Data Analysis, in SIGMOD '09
  12. M. Stonebraker, D. Abadi, D. J. DeWitt, S. Madden, E. Paulson, A. Pavlo, and A. Rasin, MapReduce and parallel DBMSs: friends or foes?, in Communications of the ACM, January 2010
  13. M. Stonebraker, SQL databases v. NoSQL databases, in Communications. ACM 53(4): 10-11 (2010)
  14. F. Chang et al., Bigtable: A distributed storage system for structured data, OSDI 2006
  15. G. DeCandia et al., Dynamo: Amazon's highly available key-value store, SOSP 2007
  16. B. F. Cooper et al., PNUTS: Yahoo!'s hosted data serving platform, VLDB 2008
  17. A.Thusoo, et al., Hive: A Warehousing Solution Over a MapReduce Framework, VLDB 2009
  18. A.Lakshman, P. Malik., Cassandra: A Decentralized Structured Storage System, ACM SIGOPS Operating Systems Review, Volume 44 Issue 2, April 2010
  19. J. Baker, et al., MegaStore: Providing Scalable Highly-Available Storage for Interactive Services, CIDR 2011

Grading

  • Presentations - 10%
  • Projects - 50%
  • Midterm and Final exams - 20% / 20%