Online Session 5-1

A Gentle Introduction to SQL and SQLite


CS 3140
Database Systems


Where Are We?

This is a week of transition. Last week, we wrapped up our initial study of the Entity Relationship model and the relational data model. Next week, we will turn our attention to the theory of the relational data model, relational algebra, and use this theory to do relational database programming in SQL. You will also take our first exam next week.

While you finish up homework and prepare for the exam this week, we will use our class time to learn some of the basics of SQL and one its implementations, SQLite. Today's activity introduces SQL and SQLite and has you work through two scripted examples [ one | two ] accessing and using SQLite.

Be sure to submit your work at the end of the session! Ask any questions you have, especially about how to access SQLite and the tools we use.

Bonus:    To help you prepare for Exam 1, I also review of Homework 1.



Introduction to SQL and SQLite

After we design a conceptual model and convert it into a logical model, we can collect and represent data. How do put that data into a computer database and use it?

SQL is a language for creating and querying relational databases. In its standard form, it defines a small number of operations. We will familiarize ourselves with a few commands this week and explore the entire language in more detail later. SQL is based on a powerful theory, which we will also explore later.

As you will see, SQL is a very high-level language. In SQL, we say what to do rather than "how to do it". This avoids most of the data manipulation details we need in procedural languages like Python or Java. The database management system figures out the most efficient way to execute a query.

For this session, let's use a subset of SQL to create a simple database and look up facts.

There are several dialects and implementations of SQL, most of which have a common core extended by specific features. We will use the ubiquitous SQLite, which is built into all mobile phones and which comes bundled inside of many common applications. It comes with pre-built binaries for Windows, Linux, and Mac (where it comes built in).

It is also available on student.cs.uni.edu, the department's Linux server. We will use student.cs as a common platform for demos and experiments in class.

For this session, I will assume that you are logged in to student.cs or are otherwise in a shell window where you can start SQLite.

If you don't know how to login to the student server, see these instructions on the course resources page. If you'd like to install SQLite on your own computer, let me know. I'll try to help, and will post pointers to resource page.

Once you are logged in to student.cs.uni.edu, launch SQLite:

    $ sqlite3
    SQLite version 3.19.3 2017-06-27 16:48:08
    Enter ".help" for usage hints.
    Connected to a transient in-memory database.
    Use ".open FILENAME" to reopen on a persistent database.
    sqlite>

You're ready to start!



A Database to Implement

Back in Week 3, we designed a conceptual model of a bank's customer:

customers with multiple addresses and phones

Then, in Week 4, we converted this ER diagram into a set of relational schema using two simple rules and a tweak that eliminated two relations. This resulted in three relations:

    Customer(ssNo, name)
    Phone(areaCode, number, owner)
    Address(street, city, state, resident)

I created instances of each so that we could see the schema alive in data:

    ssNo          name                               CUSTOMER
    -----------   -----     
    123-45-6789   Eugene Wallingford
    321-45-6789   John Smith


    areaCode   number     owner                      PHONE
    --------   --------   -----------
    319        273-5919   123-45-6789
    517        452-9159   321-45-6789
    319        273-2618   123-45-6789


    street     city          state   resident        ADDRESS
    --------   -----------   -----   -----------
    339 WRT    Cedar Falls   IA      321-45-6789
    1 Elm St   Plainfield    IN      321-45-6789
    305 WRT    Cedar Falls   IA      123-45-6789

Let's implement this database in SQLite.



The Scripted Example

First, let's create the CUSTOMER relation. We create a table using the CREATE command:

    CREATE TABLE <name> (
      <list of elements>
      );
In its simplest form, the list of elements is a comma-separated list of attribute names:
    CREATE TABLE Customer (ssNo, name);
We can also declare the relation's key using the PRIMARY KEY constraint:
    CREATE TABLE Customer (
      ssNo, name,
      PRIMARY KEY (ssNo)
      );
The semi-colon terminates the command.

Type or paste that command into your SQLite shell:

    sqlite> CREATE TABLE Customer (
       ...>   ssNo, name,
       ...>   PRIMARY KEY (ssNo)
       ...>   );
    sqlite> 

Now we can insert rows into the table using the INSERT command:

    INSERT INTO <name> VALUES (
      <list of attribute values>
      );

Our Customer relation contains two tuples:

    ssNo          name
    -----------   -----     
    123-45-6789   Eugene Wallingford
    321-45-6789   John Smith

This command adds me to the table:

    sqlite> INSERT INTO Customer VALUES (
       ...>   '123-45-6789', 'Eugene Wallingford' );
    sqlite> 
Type this command into your shell. Then type an INSERT command that enters John Smith into the table, too.

If we've been successful, we should have two tuples in our Customer table. How can see the state of the table? We can use a special SQLite command to dump the entire database as SQL code:

    sqlite> .dump Customer
    PRAGMA foreign_keys=OFF;
    BEGIN TRANSACTION;
    CREATE TABLE Customer (
          ssNo, name,
          PRIMARY KEY (ssNo)
          );
    INSERT INTO Customer VALUES('123-45-6789','Eugene Wallingford');
    INSERT INTO Customer VALUES('321-45-6789','John Smith');
    COMMIT;

But what we really want to do is interact with the database. To do this, we can create a query. Type this query into your shell:

    sqlite> SELECT * FROM Customer;
    123-45-6789|Eugene Wallingford
    321-45-6789|John Smith
The * is a wildcard. The statement says, Select all attributes for every tuple in the Customer table.

The SELECT command is quite powerful. It has this basic shape:

    SELECT desired attributes
    FROM one or more tables
    WHERE this condition applies to the returned tuples

Here is a query that looks for tuples with the name 'John Smith': shell:

    sqlite> SELECT * FROM Customer WHERE name='John Smith';
    321-45-6789|John Smith
This statement statement says, Select all attributes for every tuple in the Customer table where the name attribute is 'John Smith'.

We will learn a lot more about SQL queries and the SELECT command in the coming weeks!

For now, let's save our database so that we can use it later. We can use an SQLite command to save our database to a file and quit:

    sqlite> .save example05-01.db
    sqlite> .quit
    $ 

Check your local directory to verify that SQLite saved our database to the file. On student.cs, I can type:

    $ ls -l example05-01.db
    -rw-r--r--  1 wallingf  staff  12288 Sep 16 12:22 example05-01.db
Note, that's a "dash L" (lowercade Ls look like ones!).

You've made your first database!



The Scripted Example, Part 2

We create databases so that we can interact with them over time. Let's use or example Customer database again.

SQLite provides a .open command that complements .save, as shown in the prolog you see when you launch the shell:

    $ sqlite3
    SQLite version 3.19.3 2017-06-27 16:48:08
    Enter ".help" for usage hints.
    Connected to a transient in-memory database.
    Use ".open FILENAME" to reopen on a persistent database.
    sqlite> .open example05-01.db
    sqlite> 

You can also start SQLite with a database already loaded:

    $ sqlite3 example05-01.db
    SQLite version 3.16.2 2017-01-06 16:32:41
    Enter ".help" for usage hints.
    sqlite> 

Use SQLite's .schema command to see that the Customer table has been loaded and has the schema we expect:

    sqlite> .schema Customer
    CREATE TABLE Customer (
          ssNo, name,
          PRIMARY KEY (ssNo)
          );
We can query the database to see all its tuples, using the SQL command SELECT:
    sqlite> SELECT * FROM Customer;
    123-45-6789|Eugene Wallingford
    321-45-6789|John Smith
Remember: SQL commands are terminated with semi-colons. SQLite's own commands start with a period and terminate at the end of the line.

Let's insert a couple of more tuples into the Customer relation:

    sqlite> INSERT INTO Customer VALUES ('987-65-4321', 'Boris Spassky');
    sqlite> INSERT INTO Customer VALUES ('456-78-9123', 'John Smith');
    sqlite> SELECT * FROM Customer;
    123-45-6789|Eugene Wallingford
    321-45-6789|John Smith
    987-65-4321|Boris Spassky
    456-78-9123|John Smith

Notice that two tuples have the same value for name attribute. That's okay. Name is not the key of the table.

Quick experiment to run: What happens if we insert a tuple with a duplicate key? Try it now and find out!

With a few more tuples and a little variety, we can query the table and get some more interesting results. This query asks for only the Social Security number attribute for the tuples in Customer:

    sqlite> SELECT ssNo FROM Customer;
    123-45-6789
    321-45-6789
    456-78-9123
    987-65-4321
Notice that SQLite returns these in sorted order. The tuples in a relation form a set, but humans often prefer ordered data.

We can ask to see all the tuples for the John Smiths in the database:

    sqlite> SELECT * FROM Customer WHERE name='John Smith';
    321-45-6789|John Smith
    456-78-9123|John Smith
Or just the Social Security numbers for those tuples:
    sqlite> SELECT ssNo FROM Customer WHERE name='John Smith';
    321-45-6789
    456-78-9123

As I said earlier, we will learn a lot more about SELECT statements in the coming weeks. For now, here's one that doesn't ask for exact matches on the name:

    sqlite> SELECT * FROM Customer WHERE name>'David Bronstein';
    123-45-6789|Eugene Wallingford
    321-45-6789|John Smith
    456-78-9123|John Smith

Let's save our database again so that we don't lose our new tuples.

    sqlite> .save example05-01.db
    Error: database is locked
! We don't have to save our database. SQLite commits all of the transactions we run to the file for us. (We could save the database to a new file.)

Check your local directory to verify that SQLite saved our database to the file. On student.cs, I can type:

    sqlite> .quit
    $ ls -1 example05-01.db
    -rw-r--r--  1 wallingf  staff  12288 Sep 16 12:22 example05-01.db
Our file is still there. If we re-start SQLite and ask to see all the tuples in the Customer table, we will see all four, including Boris Spassky and the new John Smith. Try it!



Submission for This Session

When you finish working through this session, submit your example05-01.db file using the electronic submission system, under "online05-01". Please submit by 8:00 AM on Monday, September 21.

If you don't know how to copy your file from a remote server to your local computer, see these instructions on the course resources page.

As always, if you have any questions, please email me. I'd like to help quickly so that you can experience success creating and using your first simple relational database.



Homework 1 Review

I will send you feedback on your Homework 1 submissions soon. In the meantime, let's review some possible answers.

Problems 1 and 3. It's easy enough to find the entities in the problem description: states, districts, legislators, bills, and votes. All are decsribed in terms of their attributes. Drawing the ER diagram creates a few challenges for us... Here is a diagram that resembles many of the ones you all submitted:

an ER diagram for a legislature's voting activity

After Week 4, we might now recognize that District and Vote are weak entities. But we didn't know that when we did this homework, so we needed to model them using basic entities and relationships. I assumed that districts and votes are given unique IDs that distunguish them across states and bills+legislators.

Following the rules listed in the assignment, we can convert this diagram into the following relational schema:

    State(name, region)
    District(number)
    Legislator(name, party, startDate)
    Bill(name, voteDate, result)
    Vote(id, voteCast)

    Contains(stateName, districtNumber)
    ServedBy(districtNumber, legislator)
    SponsoredBy(billName, legislator)
    VotesFor(billName, voteId)
    VotesBy(voteID, legislator)
Notice that we don't list attributes that are recorded in a relationship. In Week 4, we learned that we can eliminate some relationships by adding an attribute to an entity's schema.

Problems 2 and 4. Finding the entities in this problem caused me to back up once. Flight, leg, occurrence, and airport stood out right away. At first, I had airplane as an attribute on occurrence, but then I realized that planes had attributes, too, and moved out into its own entity.

Here is a diagram that resembles what many of you submitted:

an ER diagram for an airline reservation system

Leg and occurrence are both very weak entities. I assumed that we could label each with unique IDs.

Following the rules listed in the assignment, we can convert this diagram into the following relational schema:

    Flight(number)
    Leg(legID, origin, departs, arrives, terminus)
    Occurrence(occurrenceID, date, actualDeparture, actualArrival)
    Airplane(idNumber, type)
    Airport(faaCode)

    ContainsLeg(flightNumber, legID)
    Occurs(legID, occurrenceID)
    PlaneFlies(planeNumber, occurrenceID)
    CanUse(planeNumber, airportCode)
With unique IDs for legs and occurrences, the schema work out nicely.

I will be evaluating your submissions based on the assumptions you stated, the thinking you did, and your use of the tools we had available at the time you did this assignment. If you have any questions now, please ask!



Eugene Wallingford ..... wallingf@cs.uni.edu ..... September 16, 2020