Skip to content

Latest commit

 

History

History
448 lines (379 loc) · 16.2 KB

File metadata and controls

448 lines (379 loc) · 16.2 KB

DQL in Detail

Next, we will use the school course-selection database created earlier to explain query operations in DQL. Querying is very important for both developers and data analysts, because it is related to whether we can get the data we need from a relational database. It is recommended that you run again the DDL from the lesson before last that created the database and tables, and the DML from the previous lesson that inserted data, to make sure there is no problem with the tables and data before running the operations below.

USE school;

-- Query all information for all students
SELECT stu_id,
       stu_name,
       stu_sex,
       stu_birth,
       stu_addr,
       col_id
  FROM tb_student;

-- Query student ID, name, and hometown (projection and alias)
SELECT stu_id AS StudentID,
       stu_name AS Name,
       stu_addr AS Hometown
  FROM tb_student;

-- Query course names and credits (projection and alias)
SELECT cou_name AS CourseName,
       cou_credit AS Credits
  FROM tb_course;

-- Query all female students' names and birth dates (data filtering)
SELECT stu_name,
       stu_birth
  FROM tb_student
 WHERE stu_sex = 0;

-- Query female students from Chengdu, Sichuan (data filtering)
SELECT stu_name,
       stu_birth
  FROM tb_student
 WHERE stu_sex = 0
       AND stu_addr = 'Chengdu, Sichuan';

-- Query students who are female or from Chengdu, Sichuan (data filtering)
SELECT stu_name,
       stu_birth
  FROM tb_student
 WHERE stu_sex = 0
       OR stu_addr = 'Chengdu, Sichuan';

-- Query all students born in the 1980s (data filtering)
SELECT stu_name,
       stu_sex,
       stu_birth
  FROM tb_student
 WHERE '1980-1-1' <= stu_birth
       AND stu_birth <= '1989-12-31';

SELECT stu_name,
       stu_sex,
       stu_birth
  FROM tb_student
 WHERE stu_birth BETWEEN '1980-1-1' AND '1989-12-31';

-- Query courses with more than 2 credits (data filtering)
SELECT cou_name,
       cou_credit
  FROM tb_course
 WHERE cou_credit > 2;

-- Query courses whose credits are odd numbers (data filtering)
SELECT cou_name,
       cou_credit
  FROM tb_course
 WHERE cou_credit MOD 2 <> 0;

-- Query student IDs for students who chose course 1111 and scored above 90 (data filtering)
SELECT stu_id
  FROM tb_record
 WHERE cou_id = 1111
       AND score > 90;

-- Query the name and sex of the student named Yang Guo (data filtering)
SELECT stu_name AS Name,
       CASE stu_sex WHEN 1 THEN 'Male' ELSE 'Female' END AS Gender
  FROM tb_student
 WHERE stu_name = 'Yang Guo';

SELECT stu_name AS Name,
       IF(stu_sex, 'Male', 'Female') AS Gender
  FROM tb_student
 WHERE stu_name = 'Yang Guo';

-- Query students whose family name is Yang (fuzzy matching)
-- Wildcard % matches zero or any number of characters
SELECT stu_name AS Name,
       CASE stu_sex WHEN 1 THEN 'Male' ELSE 'Female' END AS Gender
  FROM tb_student
 WHERE stu_name LIKE 'Yang%';

-- Query students whose family name is Yang and whose names have two Chinese characters (fuzzy matching)
-- Wildcard _ matches one character
SELECT stu_name AS Name,
       CASE stu_sex WHEN 1 THEN 'Male' ELSE 'Female' END AS Gender
  FROM tb_student
 WHERE stu_name LIKE 'Yang_';

-- Query students whose family name is Yang and whose names have three Chinese characters (fuzzy matching)
SELECT stu_name AS Name,
       CASE stu_sex WHEN 1 THEN 'Male' ELSE 'Female' END AS Gender
  FROM tb_student
 WHERE stu_name LIKE 'Yang__';

-- Query student ID and name for students whose student ID ends with 3 (fuzzy matching)
SELECT stu_id,
       stu_name
  FROM tb_student
 WHERE stu_id LIKE '%3';

-- Query student ID and name for students whose names contain Bu or Yan (fuzzy matching and union)
SELECT stu_id,
       stu_name
  FROM tb_student
 WHERE stu_name LIKE '%Bu%'
       OR stu_name LIKE '%Yan%';

SELECT stu_id,
       stu_name
  FROM tb_student
 WHERE stu_name LIKE '%Bu%'
 UNION
SELECT stu_id,
       stu_name
  FROM tb_student
 WHERE stu_name LIKE '%Yan%';

-- Query student ID and name for students whose family name is Yang or Lin and whose names have three Chinese characters in total (regex fuzzy matching)
SELECT stu_id,
       stu_name
  FROM tb_student
 WHERE stu_name REGEXP '[LinYang][\\u4e00-\\u9fa5]{2}';

-- Query names of students whose hometown was not entered (null handling)
SELECT stu_name
  FROM tb_student
 WHERE TRIM(stu_addr) = ''
       OR stu_addr is null;

-- Query names of students whose hometown was entered (null handling)
SELECT stu_name
  FROM tb_student
 WHERE TRIM(stu_addr) <> ''
       AND stu_addr is not null;

-- Query all course-selection dates (remove duplicates)
SELECT DISTINCT sel_date
  FROM tb_record;

-- Query students' hometowns (remove duplicates)
SELECT DISTINCT stu_addr
  FROM tb_student
 WHERE TRIM(stu_addr) <> ''
       AND stu_addr is not null;

-- Query male students' names and birthdays ordered by age from old to young (sorting)
SELECT stu_name,
       stu_birth
  FROM tb_student
 WHERE stu_sex = 1
 ORDER BY stu_birth ASC;

-- Extra: change the birthdays above into ages (date function, numeric function)
SELECT stu_name AS Name,
       FLOOR(DATEDIFF(CURDATE(), stu_birth) / 365) AS Age
  FROM tb_student
 WHERE stu_sex = 1
 ORDER BY Age DESC;

-- Query the birth date of the oldest student (aggregate function)
SELECT MIN(stu_birth)
  FROM tb_student;

-- Query the birth date of the youngest student (aggregate function)
SELECT MAX(stu_birth)
  FROM tb_student;

-- Query the highest score in course 1111 (aggregate function)
SELECT MAX(score)
  FROM tb_record
 WHERE cou_id = 1111;

-- Query minimum score, maximum score, average score, standard deviation, and variance for student 1001 (aggregate function)
SELECT MIN(score) AS MinScore,
       MAX(score) AS MaxScore,
       ROUND(AVG(score), 1) AS AvgScore,
       STDDEV(score) AS StdDev,
       VARIANCE(score) AS Variance
  FROM tb_record
 WHERE stu_id = 1001;

-- Query the average score of student 1001; if there is a null value, count it as 0 (aggregate function)
SELECT ROUND(SUM(score) / COUNT(*), 1) AS AvgScore
  FROM tb_record
 WHERE stu_id = 1001;

-- Query the number of male and female students (grouping and aggregate function)
SELECT CASE stu_sex WHEN 1 THEN 'Male' ELSE 'Female' END AS Gender,
       COUNT(*) AS Total
  FROM tb_student
 GROUP BY stu_sex;

-- Query the number of students in each college (grouping and aggregate function)
SELECT col_id AS CollegeID,
       COUNT(*) AS Total
  FROM tb_student
 GROUP BY col_id
  WITH ROLLUP;

-- Query the number of male and female students in each college (grouping and aggregate function)
SELECT col_id AS CollegeID,
       CASE stu_sex WHEN 1 THEN 'Male' ELSE 'Female' END AS Gender,
       COUNT(*) AS Total
  FROM tb_student
 GROUP BY col_id, stu_sex;

-- Query each student's student ID and average score (grouping and aggregate function)
SELECT stu_id AS StudentID,
       ROUND(AVG(score), 1) AS AvgScore
  FROM tb_record
 GROUP BY stu_id;

-- Query student ID and average score for students whose average score is at least 90 (filtering after grouping)
SELECT stu_id AS StudentID,
       ROUND(AVG(score), 1) AS AvgScore
  FROM tb_record
 GROUP BY stu_id
HAVING AvgScore >= 90;

-- Query student ID and average score for students whose average score in courses 1111, 2222, and 3333 is at least 90 (filtering before and after grouping)
SELECT stu_id AS StudentID,
       ROUND(AVG(score), 1) AS AvgScore
  FROM tb_record
 WHERE cou_id in (1111, 2222, 3333)
 GROUP BY stu_id
HAVING AvgScore >= 90
 ORDER BY AvgScore ASC;

-- Query the name of the oldest student (subquery)
SELECT stu_name
  FROM tb_student
 WHERE stu_birth = (SELECT MIN(stu_birth)
                      FROM tb_student);

-- Query the names of students who selected more than two courses (subquery and set operation)
SELECT stu_name
  FROM tb_student
 WHERE stu_id in (SELECT stu_id
                    FROM tb_record
                   GROUP BY stu_id
                  HAVING COUNT(*) > 2);

-- Query student names, birth dates, and the names of their colleges (table join)
SELECT stu_name,
       stu_birth,
       col_name
  FROM tb_student AS t1, tb_college AS t2
 WHERE t1.col_id = t2.col_id;

SELECT stu_name,
       stu_birth,
       col_name
  FROM tb_student INNER JOIN tb_college
       ON tb_student.col_id = tb_college.col_id;

SELECT stu_name,
       stu_birth,
       col_name
  FROM tb_student NATURAL JOIN tb_college;

SELECT stu_name,
       stu_birth,
       col_name
  FROM tb_student CROSS JOIN tb_college;

-- Query student names, course names, and scores
SELECT stu_name,
       cou_name,
       score
  FROM tb_student, tb_course, tb_record
 WHERE tb_student.stu_id = tb_record.stu_id
       AND tb_course.cou_id = tb_record.cou_id
       AND score is not null;

SELECT stu_name,
       cou_name,
       score
  FROM tb_student
       INNER JOIN tb_record
           ON tb_student.stu_id = tb_record.stu_id
       INNER JOIN tb_course
           ON tb_course.cou_id = tb_record.cou_id
 WHERE score is not null;

SELECT stu_name,
       cou_name,
       score
  FROM tb_student
       NATURAL JOIN tb_record
       NATURAL JOIN tb_course
 WHERE score is not null;

-- Supplement: take the first 5 rows
SELECT stu_name,
       cou_name,
       score
  FROM tb_student
       NATURAL JOIN tb_record
       NATURAL JOIN tb_course
 WHERE score is not null
 ORDER BY cou_id ASC, score DESC
 LIMIT 5;

-- Supplement: take rows 6 through 10
SELECT stu_name,
       cou_name,
       score
  FROM tb_student
       NATURAL JOIN tb_record
       NATURAL JOIN tb_course
 WHERE score is not null
 ORDER BY cou_id ASC, score DESC
 LIMIT 5
OFFSET 5;

-- Supplement: take rows 11 through 15
SELECT stu_name,
       cou_name,
       score
  FROM tb_student
       NATURAL JOIN tb_record
       NATURAL JOIN tb_course
 WHERE score is not null
 ORDER BY cou_id ASC, score DESC
 LIMIT 10, 5;

-- Query student names and average scores
-- Error Code: 1248. Every derived table must have its own alias
SELECT stu_name,
       avg_score
  FROM tb_student
       NATURAL JOIN (SELECT stu_id,
                            ROUND(AVG(score), 1) AS avg_score
                       FROM tb_record
                      GROUP BY stu_id) as tmp;

-- Query student names and the number of selected courses
SELECT stu_name,
       total
  FROM tb_student
       NATURAL JOIN (SELECT stu_id,
                            COUNT(*) AS total
                       FROM tb_record
                      GROUP BY stu_id) as tmp;

-- Query each student's name and course-selection count
SELECT stu_name AS Name,
       COALESCE(total, 0) AS CourseCount
  FROM tb_student AS t1
       LEFT JOIN (SELECT stu_id,
                         COUNT(*) AS total
                    FROM tb_record
                   GROUP BY stu_id) AS t2
              ON t1.stu_id = t2.stu_id;

There are several places that need extra explanation:

  1. The current version of MySQL does not support full outer join. Above, we used the union operation to take the union of the results of left outer join and right outer join, and this gives the effect of full outer join. You can deepen your understanding of join operations through the picture below.

  2. MySQL supports many kinds of operators, including arithmetic operators (+, -, *, /, %), comparison operators (=, <>, <=>, <, <=, >, >=, BETWEEN...AND.., IN, IS NULL, IS NOT NULL, LIKE, RLIKE, REGEXP), logical operators (NOT, AND, OR, XOR), and bitwise operators (&, |, ^, ~, >>, <<). We can use these operators in DML to process data.

  3. When querying data, functions can be used in the SELECT statement and its clauses, such as the WHERE clause, ORDER BY clause, and HAVING clause. These functions include string functions, numeric functions, date and time functions, flow control functions, and so on, as shown in the tables below.

    Common string functions.

    Function Meaning
    CONCAT Join multiple strings into one string
    FORMAT Format a number into a string and specify how many decimal places to keep
    FROM_BASE64 / TO_BASE64 BASE64 decode / encode
    BIN / OCT / HEX Convert a number into a binary / octal / hexadecimal string
    LOCATE Find the position of a substring in a string
    LEFT / RIGHT Return the characters of a specified length on the left / right side of a string
    LENGTH / CHAR_LENGTH Return the length of a string in bytes / characters
    LOWER / UPPER Return the lowercase / uppercase form of a string
    LPAD / RPAD If the string length is not enough, pad specified characters on the left / right side
    LTRIM / RTRIM Remove the spaces at the front / back of a string
    ORD / CHAR Return the code of a character / return the character of a code
    STRCMP Compare strings and return -1, 0, 1 to mean less than, equal to, greater than
    SUBSTRING Return the substring in the specified range of a string

    Common numeric functions.

    Function Meaning
    ABS Return the absolute value of a number
    CEILING / FLOOR Return the result of rounding a number up / down
    CONV Convert a number from one base to another base
    CRC32 Calculate the cyclic redundancy check code
    EXP / LOG / LOG2 / LOG10 Calculate exponent / logarithm
    POW Calculate a power
    RAND Return a random number in the range [0,1)
    ROUND Return the rounded result of a number
    SQRT Return the square root of a number
    TRUNCATE Truncate a number to the specified precision
    SIN / COS / TAN / COT / ASIN / ACOS / ATAN Trigonometric functions

    Common date and time functions.

    Function Meaning
    CURDATE / CURTIME / NOW Get the current date / time / date and time
    ADDDATE / SUBDATE Add / subtract two date expressions and return the result
    DATE / TIME Get the date / time from a string
    YEAR / MONTH / DAY Get year / month / day from a date
    HOUR / MINUTE / SECOND Get hour / minute / second from a time
    DATEDIFF / TIMEDIFF / TIMESTAMPDIFF Return how many days / hours two date-time expressions differ by
    MAKEDATE / MAKETIME Make a date / time

    Common flow control functions.

    Function Meaning
    IF Return different values depending on whether the condition is true
    IFNULL If it is NULL, return the specified value, otherwise return itself
    NULLIF If two expressions are equal, return NULL, otherwise return the value of the first expression

    Other common functions.

    Function Meaning
    MD5 / SHA1 / SHA2 Return the hash digest of a string
    CHARSET / COLLATION Return the character set / collation
    USER / CURRENT_USER Return the current user
    DATABASE Return the current database name
    VERSION Return the current database version
    FOUND_ROWS / ROW_COUNT Return the number of queried rows / affected rows
    LAST_INSERT_ID Return the value of the last auto-increment primary key
    UUID / UUID_SHORT Return a globally unique identifier