LAB 4

Today’s lab we will focus on how to open csv files to populate tables with the data and write subqueries.

Let’s start by creating a folder on your desktop called subqueries. Inside the folder, we will place the database dbeaverdatabase.dbthat we used in Lab 3, a new python file called lab4.py and the following csv files:
Copy the following code into lab4.py. This program opens the database, creates the tables, and populates them with data from the CSV files.

        import csv
        import sqlite3

        conn = sqlite3.connect("dbeaverdatabase.db")
        cur = conn.cursor()

        cur.executescript("""
        DROP TABLE IF EXISTS Entry;
        DROP TABLE IF EXISTS Tournament;
        DROP TABLE IF EXISTS Member;

        CREATE TABLE Member (
            MemberID INTEGER PRIMARY KEY,
            LastName TEXT,
            FirstName TEXT
        );

        CREATE TABLE Tournament (
            TourID INTEGER PRIMARY KEY,
            TourName TEXT,
            TourType TEXT
        );

        CREATE TABLE Entry (
            MemberID INTEGER,
            TourID INTEGER,
            Year INTEGER
        );
        """)

        # read the csv files
        with open("Member.csv", newline="") as f:
            reader = csv.reader(f)
            next(reader) 
            for row in reader:
                if len(row) != 3:
                    continue
                cur.execute(
                    "INSERT INTO Member VALUES (?, ?, ?)",
                    (int(row[0]), row[1], row[2])
                )

        with open("Tournament.csv", newline="") as f:
            reader = csv.reader(f)
            next(reader)  
            for row in reader:
                if len(row) != 3:
                    continue
                cur.execute(
                    "INSERT INTO Tournament VALUES (?, ?, ?)",
                    (int(row[0]), row[1], row[2])
                )

        with open("Entry.csv", newline="") as f:
            reader = csv.reader(f)
            next(reader) 
            for row in reader:
                if len(row) != 3:
                    continue 
                cur.execute(
                    "INSERT INTO Entry VALUES (?, ?, ?)",
                    (int(row[0]), int(row[1]), int(row[2]))
                )

        conn.commit()
        conn.close()

Run the program with
```
 python lab4.py
```

** double check the tables in DBeaver to make sure the data was imported correctly.

result

IN Keyword

Now let’s write some subqueries using the IN keyword. Before doing so, let’s review how IN works.

Instead of repeating e.TourID three times:

SELECT e.MemberID
FROM Entry e
WHERE e.TourID = 36 OR e.TourID = 38 OR e.TourID = 40;

we can use the IN keyword and list the values inside parentheses:

SELECT e.MemberID
FROM Entry e
WHERE e.TourID IN (36, 38, 40);

the reason that someone may have been interested in tournaments 36, 38, and 40 might have been because they are the current Open tournaments, so rather than list the Open tournaments individually, we can use another SQL query to generate the set of values we require.

        import csv
        import sqlite3

        conn = sqlite3.connect("dbeaverdatabase.db")
        cur = conn.cursor()

        cur.execute("""
        SELECT e.MemberID
        FROM Entry e
        WHERE e.TourID IN (
        SELECT t.TourID
        FROM Tournament t
        WHERE t.TourType = 'Open');
        """)

        rows = cur.fetchall()

        with open("result.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["MemberID"])
            writer.writerows(rows)

        conn.close()

Run the program with

    python lab4.py

Open the result.csv file, you should see the following output:

result

EXISTS Keyword

Thinking about the EXISTS Keyword, we can translate it as

“I’ll write out the names from row m, where m comes from the Member table, if there exists a
row e in the Entry table where m.MemberID = e.MemberID .”

which in SQL will be

        SELECT m.LastName, m.FirstName
        FROM Member m
        WHERE EXISTS
        (SELECT * FROM Entry e WHERE e.MemberID = m.MemberID);

Now it is your turn to write this query in Python and save the output to a CSV file called exists.csv.
The file should look like the following:

exists

Different Types of Subqueries

There are different types of nested queries and alternate ways of expressing them.

A subquery returning a single value

To find the tournaments that member Cooper has entered, this is the same as

SELECT e.TourID, e.Year FROM Entry e WHERE e.MemberID =
(SELECT m.MemberID FROM Member m
WHERE m.LastName = 'Cooper');

this

SELECT e.TourID, e.Year
FROM Entry e INNER JOIN Member m ON e.MemberID = m.MemberID
WHERE m.LastName = 'Cooper';

Both of them will return the same output

output

A subquery returning a set of single values

It is your turn to find all the entries for an Open tournament, your output should be the following

output

A subquery checking for existence

To find the names of members that have entered any tournament, this is the same

SELECT m.LastName, m.FirstName
FROM Member m
WHERE EXISTS
(SELECT * FROM Entry e
WHERE e.MemberID = m.MemberID);

SELECT DISTINCT m.LastName, m.FirstName
FROM Member m INNER JOIN Entry e ON e.MemberID = m.MemberID;

You should see the following output

output