General Description
The goal of the class project is to implement a database system application.
The end result should be a functioning application that runs natively and/or on
the web and that uses your database to provide useful functionality. The project
includes the following activities, spread over the entire semester:
- Identify an application area for which database systems may be useful,
- Determine the functionalities of the database application,
- Model the data stored in the database (Identify the entities, roles,
relationships, constraints, etc.),
- Design, normalise, and perfect the relational database schema,
- Write the SQL commands to create the database, find appropriate data, and
populate the database, and
- Finally and most importantly, write the software needed to embed the
database system in the application.
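To give you a feel for the last two activities, here is a minimal sketch in
Python using the standard sqlite3 module. It is only an illustration, not a
required design: the bibliography schema, the table and column names, the
sample rows, and the helper functions are all made up for this example, and
your project may well use a different DBMS and programming language.

```python
import sqlite3

# Hypothetical schema for illustration only; your own design will differ.
SCHEMA = """
CREATE TABLE author (
    author_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE paper (
    paper_id INTEGER PRIMARY KEY,
    title    TEXT NOT NULL,
    year     INTEGER NOT NULL CHECK (year > 0)
);
-- Many-to-many relationship between authors and papers.
CREATE TABLE authorship (
    author_id INTEGER NOT NULL REFERENCES author(author_id),
    paper_id  INTEGER NOT NULL REFERENCES paper(paper_id),
    PRIMARY KEY (author_id, paper_id)
);
"""


def create_database(path=":memory:"):
    """Create the tables and return an open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def populate(conn):
    """Insert a few made-up rows; real projects load real data here."""
    conn.executemany(
        "INSERT INTO author (author_id, name) VALUES (?, ?)",
        [(1, "A. Example"), (2, "B. Sample")],
    )
    conn.executemany(
        "INSERT INTO paper (paper_id, title, year) VALUES (?, ?, ?)",
        [(10, "First Placeholder Paper", 2006),
         (11, "Second Placeholder Paper", 2008)],
    )
    conn.executemany(
        "INSERT INTO authorship (author_id, paper_id) VALUES (?, ?)",
        [(1, 10), (1, 11), (2, 11)],
    )
    conn.commit()


def papers_by_author(conn, name):
    """Application-level function: chronological list of one author's papers."""
    return conn.execute(
        """
        SELECT p.title, p.year
        FROM paper AS p
        JOIN authorship AS ap ON ap.paper_id = p.paper_id
        JOIN author AS a ON a.author_id = ap.author_id
        WHERE a.name = ?
        ORDER BY p.year, p.title
        """,
        (name,),  # parameter binding instead of string concatenation
    ).fetchall()
```

An application would call create_database() and populate() once and then call
functions such as papers_by_author(conn, "A. Example") from its user interface.
Passing the author name as a bound parameter, rather than pasting it into the
SQL string, keeps user input from being interpreted as SQL.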
Two students should do this project together. You are free to choose
your own project members; if you would like the instructor to assign you
to a group, say so in class. Each of the steps above will be a specific
project assignment. You will get detailed instructions with each
assignment. Each group should turn in a single solution to each
assignment. Every member of the group will get the same grade.
Project Ideas (stolen from Virginia Tech DB class)
These ideas are just a sample. You are free to propose your own ideas. Realize
that the ideas below are not complete descriptions. You need to work on them
more and develop your project more concretely and in more detail. Do not get
intimidated by the examples that are linked from this web page. These examples
are meant to give you a feel for the application domain. It is up to you to
narrowly define the scope of the application within the time frame of a semester-long
project. Do not forget that you are supposed to have fun!
- Bibliography Database: Develop a system that will improve a research
group's ability to track its publications and publications of interest to
the group. Track information such as papers, authors, projects, conferences
and journals. Readers should be able to view chronological listings, find
papers by certain authors, group by projects, recover lists of papers based
on keywords, etc. It should be easy for group members to add new papers, both
written by the group and published by others in the literature. Examples of
such systems include Connotea and CiteULike.
- Nobel Awards Database: The goal is to model and populate information
about the awards made in the various fields (Physics, Chemistry, Physiology
or Medicine, Literature, Peace, and the Economic Sciences), the recipients,
their countries, their years of birth, etc. Your system should be able to answer
questions such as "When was the first time an Asian won an award for the economic
sciences?" (the answer to this particular question is 1998). The Nobel Foundation
maintains such an interface.
You could also work on variants of this idea such as the recipients of the
ACM awards (unfortunately, there
is not too much information online about this). Interesting queries then could
be "Name people who have won at least two different awards" (the answer would
include Knuth, Thompson, Ritchie, Engelbart, etc.) or the people "who were
ACM Fellows before becoming Turing Award winners", and so on. Although
this application is nice, I should warn you that the E/R diagram is very simple
and not complex enough for a CS 4604 project. If you can generalise to a number
of different awards in different disciplines (and are sure that you can get
the data for all the awards), this project would be suitable for CS 4604.
- Books Database: This domain is another popular one. Just look at
barnesandnoble.com or amazon.com for excellent examples. You could
model entities such as books, their authors, and topics (which may form a complex
hierarchy). You may also model various attributes of the authors and the institutions
they belong to. You can support a service for buying and selling used books
or books used in specific university courses. Your system can build a personal
profile of people (and the books they like) and your database application
could form the basis for a "recommender system", such as those supported by
the commercial sites. The goal here is to "cluster" similar preferences together
so that the system can make recommendations: "Since you liked Harry Potter
and the Sorcerer's Stone, I recommend that you try Harry Potter and
the Chamber of Secrets".
- Movies Database: There are several excellent movie resources on the
web, such as the hollywood.com movies site or the Internet
Movie Database. You could model entities such as movies, their actors,
directors, genres, playing times, and reviews. There are several sources on
the web from which you could get data to populate such a database. You can
support various queries such as finding specific playing times or finding movies
playing in Blacksburg directed by a given director. You can also support updates
to the reviews section of the database (e.g., viewers giving their own opinions).
Another functionality is to provide personal profiles of people (i.e., the
movies they like) and then try to recommend movies to them based on profiles
of viewers with similar tastes. You could also create a database of Oscar
or Golden Globe nominations and awards and answer queries such as "Find all
the sitcoms that have been nominated three years in a row".
- Personal Photos Database: With the advent of cheap digital
cameras, everybody has piles of digital photos. People need a way to organize,
access, and show off their photos.
- Apartment Homes: Our friendly neighborhood web guide is here. This domain would require modeling
apartments and their attributes, areas of town and their various characteristics
(e.g., BT bus lines, crime rates, distance from various landmarks). You would
provide an interface for offering apartments for rent and for finding apartments
based on various requirements ("gas heating + pets allowed + rent less than
$500 + close to campus + BEV modem facility").
- Research Literature: This domain involves modeling research publications.
You need to identify the title of the publication, the forum it was published
in, the authors, topics, keywords and related subtopic areas. This is a big
business now (under the name of digital libraries). For example, the ACM
digital library provides a beautiful searchable index (and retrievable
repository, but that is beyond our scope) of nearly all of the publications
of ACM. If you use this domain, then there are a lot of available resources
for you to use. The ACM computing classification
system provides a convenient hierarchical meta-index that you can use to
organize your class hierarchy etc. If you are interested in a smaller domain,
then the DBLP Bibliography
Site provides a searchable facility for publications related to the database
and programming communities. At the end of the day, you could identify papers
written by a particular person at a particular place or ones in a narrowly
- Census Database: Can you make a census data dissemination system for
the Census Bureau? A census gathers data
about people, businesses, geographic regions, etc. Different
types of users need to gain different types of answers from the data. Homeowners
want to know statistics about their region, such as crime rates. Business
owners want to find holes in the competition. Government decision makers want
to learn about demographic trends, and where to focus resources.
- Web Sites: How do you think web search engines such as Google model their domain? You could think of
them as a glorified database system where the basic entities modeled are web
sites. You could then model the various properties of a web site: Topic, URL,
domain name, other sites it links to, the background colour, etc. Retrieval
could be for sites that have similar characteristics and properties.
- Others: Of course, there are a whole host of other ideas such as
bank accounts, student records, NBA data, election results, senate demographics,
car rentals, auto insurance, consumer products, courses at Virginia Tech,
Hokie statistics, "match-making services" and so on. Let your imagination
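Several of the ideas above mention queries such as "Name people who have won
at least two different awards". As a rough sketch of how such a question maps
onto SQL, the Python example below uses the standard sqlite3 module with a
single hypothetical table, award_win. The table, its columns, the function
names, and the placeholder rows are assumptions made for illustration; a real
project would have separate tables for people and awards.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory database with placeholder data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE award_win (
            person TEXT    NOT NULL,
            award  TEXT    NOT NULL,
            year   INTEGER NOT NULL,
            PRIMARY KEY (person, award, year)
        )
        """
    )
    conn.executemany(
        "INSERT INTO award_win (person, award, year) VALUES (?, ?, ?)",
        [
            ("Person A", "Award X", 2001),
            ("Person A", "Award Y", 2004),
            ("Person B", "Award X", 2002),
            ("Person B", "Award X", 2005),  # same award twice counts once
            ("Person C", "Award Y", 2003),
        ],
    )
    conn.commit()
    return conn


def people_with_multiple_awards(conn, minimum=2):
    """Return people who have won at least `minimum` different awards."""
    rows = conn.execute(
        """
        SELECT person
        FROM award_win
        GROUP BY person
        HAVING COUNT(DISTINCT award) >= ?
        ORDER BY person
        """,
        (minimum,),
    ).fetchall()
    return [row[0] for row in rows]
```

The key design point is COUNT(DISTINCT award): a plain COUNT(*) would also
report someone who won the same award in two different years, which is not
what the question asks.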
Last modified: Wed Aug 20 16:30:20 EDT 2008