After completing this chapter, you will be able to:
Understand what SQL is and why it's important
Create databases and tables using SQL
Use appropriate data types for different kinds of data
Write basic SELECT queries to retrieve data
Filter data using WHERE clauses
Sort and limit query results
Use aggregate functions to summarize data
3.1 Introduction to SQL
What is SQL?
SQL (Structured Query Language) is the standard language for interacting with relational databases. It's used by all major database systems including MySQL, PostgreSQL, Oracle, SQL Server, and SQLite.
Why SQL Matters
SQL is one of the most valuable skills for business professionals because:
Universal: Works across all relational databases with minor variations
Powerful: Can retrieve and manipulate complex data with simple commands
In-demand: One of the top skills employers look for
Accessible: Easier to learn than most programming languages
SQL Categories
SQL commands fall into several categories:
Category
Purpose
Examples
DDL (Data Definition Language)
Define database structure
CREATE, ALTER, DROP
DML (Data Manipulation Language)
Manipulate data
SELECT, INSERT, UPDATE, DELETE
DCL (Data Control Language)
Control access
GRANT, REVOKE
TCL (Transaction Control Language)
Manage transactions
COMMIT, ROLLBACK
In this course, we'll focus primarily on DDL and DML.
Choosing the right data type is crucial for data integrity and performance.
Numeric Data Types
Data Type
Description
Example Use
INT
Whole numbers
Employee ID, Age, Quantity
DECIMAL(p,s)
Fixed-point decimal
Salary (10,2), Price (8,2)
FLOAT / DOUBLE
Floating-point decimal
Scientific calculations
TINYINT
Small integers (0-255)
Age, Rating (1-5)
BIGINT
Large integers
Large transaction IDs
Examples:
1234
employee_idINT-- -2,147,483,648 to 2,147,483,647salaryDECIMAL(10,2)-- Up to 99,999,999.99ageTINYINT-- 0 to 255product_viewsBIGINT-- Very large numbers
String Data Types
Data Type
Description
Example Use
CHAR(n)
Fixed-length string
State codes (CHAR(2))
VARCHAR(n)
Variable-length string
Names, emails, descriptions
TEXT
Large text
Articles, reviews, comments
ENUM
Predefined values
Status ('active', 'inactive')
Examples:
1234
state_codeCHAR(2)-- Always exactly 2 characters: 'IL', 'CA'emailVARCHAR(100)-- Up to 100 charactersdescriptionTEXT-- Large amounts of textstatusENUM('active','inactive','pending')
CHAR vs VARCHAR
Use CHAR when length is always the same (state codes, country codes)
Use VARCHAR when length varies (names, emails, addresses)
VARCHAR saves space but CHAR can be slightly faster
is_activeBOOLEAN-- TRUE or FALSE (stored as 1 or 0)-- Often implemented as TINYINT(1) in MySQL
Choosing Data Types: Examples
1 2 3 4 5 6 7 8 910111213
CREATETABLEcustomer(customer_idINTPRIMARYKEYAUTO_INCREMENT,first_nameVARCHAR(50)NOTNULL,last_nameVARCHAR(50)NOTNULL,emailVARCHAR(100)UNIQUENOTNULL,phoneCHAR(10),-- US phone: 2175551234birth_dateDATE,registration_dateDATETIMEDEFAULTCURRENT_TIMESTAMP,is_activeBOOLEANDEFAULTTRUE,loyalty_pointsINTDEFAULT0,account_balanceDECIMAL(10,2)DEFAULT0.00,membership_levelENUM('bronze','silver','gold','platinum'));
3.4 The SELECT Statement
The SELECT statement is used to retrieve data from databases.
Basic Syntax
12
SELECTcolumn1,column2,...FROMtable_name;
Selecting All Columns
1
SELECT*FROMemployee;
Use SELECT * Carefully
While SELECT * is convenient, in production:
- It's less efficient (retrieves unnecessary columns)
- It's less clear what data you need
- It can break code if table structure changes
-- Employees making more than $60,000SELECTfirst_name,last_name,salaryFROMemployeeWHEREsalary>60000;-- Employees hired after January 1, 2020SELECTfirst_name,last_name,hire_dateFROMemployeeWHEREhire_date>='2020-01-01';
-- Employees in dept 1 or 3, making more than $50kSELECTfirst_name,last_name,salary,department_idFROMemployeeWHERE(department_id=1ORdepartment_id=3)ANDsalary>50000;
-- Customers whose last name starts with 'S'SELECTfirst_name,last_nameFROMcustomerWHERElast_nameLIKE'S%';-- Customers whose email is from gmailSELECTfirst_name,last_name,emailFROMcustomerWHEREemailLIKE'%@gmail.com';-- Customers whose first name is 4 letters starting with 'J'SELECTfirst_name,last_nameFROMcustomerWHEREfirst_nameLIKE'J___';-- J + 3 underscores
Range Conditions with BETWEEN
1 2 3 4 5 6 7 8 910
-- Employees with salary between $40,000 and $60,000SELECTfirst_name,last_name,salaryFROMemployeeWHEREsalaryBETWEEN40000AND60000;-- Equivalent to: WHERE salary >= 40000 AND salary <= 60000-- Orders placed in January 2024SELECTorder_id,order_date,total_amountFROMordersWHEREorder_dateBETWEEN'2024-01-01'AND'2024-01-31';
Membership with IN
1 2 3 4 5 6 7 8 910
-- Employees in departments 1, 3, or 5SELECTfirst_name,last_name,department_idFROMemployeeWHEREdepartment_idIN(1,3,5);-- Equivalent to: WHERE department_id = 1 OR department_id = 3 OR department_id = 5-- Customers from specific citiesSELECTfirst_name,last_name,cityFROMcustomerWHEREcityIN('Chicago','Champaign','Urbana');
NULL Values
123456789
-- Employees with no emailSELECTfirst_name,last_nameFROMemployeeWHEREemailISNULL;-- Employees with an emailSELECTfirst_name,last_name,emailFROMemployeeWHEREemailISNOTNULL;
NULL is Special
You cannot use = NULL or <> NULL. Always use IS NULL or IS NOT NULL.
-- Sort by department, then by salary within departmentSELECTfirst_name,last_name,department_id,salaryFROMemployeeORDERBYdepartment_idASC,salaryDESC;
Sorting by Column Position
1234
-- Sort by the 3rd column (salary), then 1st column (first_name)SELECTfirst_name,last_name,salaryFROMemployeeORDERBY3DESC,1ASC;
Column Position Sorting
While sorting by column position works, it's better to use column names for clarity:
12345
-- Less clear:ORDERBY3DESC,1ASC-- More clear:ORDERBYsalaryDESC,first_nameASC
3.7 Limiting Results
MySQL/PostgreSQL - LIMIT
1 2 3 4 5 6 7 8 91011
-- Get the top 10 highest-paid employeesSELECTfirst_name,last_name,salaryFROMemployeeORDERBYsalaryDESCLIMIT10;-- Get rows 11-20 (pagination)SELECTfirst_name,last_name,salaryFROMemployeeORDERBYsalaryDESCLIMIT10OFFSET10;
SQL Server / MS Access - TOP
1234
-- SQL Server syntax (different from MySQL!)SELECTTOP10first_name,last_name,salaryFROMemployeeORDERBYsalaryDESC;
3.8 Aggregate Functions
Aggregate functions perform calculations on sets of rows.
Common Aggregate Functions
Function
Purpose
Example
COUNT()
Count rows
COUNT(*), COUNT(email)
SUM()
Sum values
SUM(salary)
AVG()
Average value
AVG(salary)
MIN()
Minimum value
MIN(salary)
MAX()
Maximum value
MAX(salary)
COUNT - Counting Rows
1 2 3 4 5 6 7 8 91011
-- Count all employeesSELECTCOUNT(*)ASemployee_countFROMemployee;-- Count employees with email addressesSELECTCOUNT(email)ASemployees_with_emailFROMemployee;-- Count unique departmentsSELECTCOUNT(DISTINCTdepartment_id)ASdepartment_countFROMemployee;
COUNT(*) vs COUNT(column)
COUNT(*) counts all rows
COUNT(column) counts rows where column IS NOT NULL
-- Total salary expenseSELECTSUM(salary)AStotal_salary_expenseFROMemployee;-- Total salesSELECTSUM(amount)AStotal_salesFROMordersWHEREorder_date>='2024-01-01';
AVG - Average
1234567
-- Average salarySELECTAVG(salary)ASaverage_salaryFROMemployee;-- Average order valueSELECTAVG(total_amount)ASaverage_order_valueFROMorders;
MIN and MAX
1 2 3 4 5 6 7 8 9101112
-- Salary rangeSELECTMIN(salary)ASlowest_salary,MAX(salary)AShighest_salary,MAX(salary)-MIN(salary)ASsalary_rangeFROMemployee;-- Earliest and latest hire datesSELECTMIN(hire_date)ASfirst_hire,MAX(hire_date)ASmost_recent_hireFROMemployee;
-- Find all customers who joined in the last yearSELECTcustomer_id,first_name,last_name,registration_dateFROMcustomerWHEREregistration_date>=DATE_SUB(CURRENT_DATE,INTERVAL1YEAR)ORDERBYregistration_dateDESC;-- Find customers from IllinoisSELECTfirst_name,last_name,email,cityFROMcustomerWHEREstate='IL'ORDERBYcity,last_name;
Finance: Transaction Analysis
1 2 3 4 5 6 7 8 9101112131415
-- Find all high-value transactionsSELECTtransaction_id,customer_id,amount,transaction_dateFROMtransactionsWHEREamount>1000ORDERBYamountDESCLIMIT20;-- Summary statisticsSELECTCOUNT(*)AStotal_transactions,SUM(amount)AStotal_revenue,AVG(amount)ASavg_transaction_value,MAX(amount)ASlargest_transactionFROMtransactionsWHEREtransaction_date>='2024-01-01';
HR: Employee Analysis
1 2 3 4 5 6 7 8 9101112131415
-- Find recently hired employeesSELECTfirst_name,last_name,hire_date,department_idFROMemployeeWHEREhire_date>='2024-01-01'ORDERBYhire_dateDESC;-- Salary statistics by departmentSELECTdepartment_id,COUNT(*)ASemployee_count,AVG(salary)ASavg_salary,MIN(salary)ASmin_salary,MAX(salary)ASmax_salaryFROMemployeeGROUPBYdepartment_id;
Key Takeaways
SQL is the universal language for working with relational databases
Choose appropriate data types to ensure data integrity and performance
SELECT retrieves data - be specific about which columns you need
WHERE filters rows - use comparison and logical operators effectively
ORDER BY sorts results - critical for meaningful output
Aggregate functions summarize data - COUNT, SUM, AVG, MIN, MAX
Practice is essential - the more queries you write, the more comfortable you'll become
Review Questions
What's the difference between CHAR and VARCHAR data types?
Write a query to find all products priced between $10 and $50, sorted by price.
Explain the difference between COUNT(*) and COUNT(column_name).
Why should you avoid using SELECT * in production code?
Calculate the total value of inventory (Price × Stock_Qty for all products)
Next Steps
In Chapter 3b, we'll unlock the power tools of SQL — JOINs for combining tables, subqueries for nested logic, CTEs for readable complex queries, and window functions for analytics.
Corresponds to Weeks 1-2 of BADM 554 — SQL Fundamentals