PostgreSQL
Intro
Databases are systems that allow users to store and organize data. They are useful when dealing with massive amounts of data.
PostgreSQL is open source, widely used on internet, and multi platform.
SQL(Structured Query Language) is the programming language used to communicate with the database.
Creating Databases and Tables
Data Types
- Boolean: true, false, NULL
- Character: char, char(n), varchar(n)
- Number: integers, floating-point numbers
- Temporal: data and time related data
- Special types
- Array
Primary Keys & Foreign Keys
A primary key is a column or a group of columns that is used to identify a row uniquely in a table.
A foreign key is a field or group of fields in a table that uniquely identifies a row in another table. In other words, a foreign key is defined in a table that refers to the primary key of the other table.
SQL Fundamentals
SELECT
SELECT: retrieve information from table.
SELECT column_name FROM table_name;
SELECT c1, c3 FROM table_1;
SELECT * FROM table_1;
Use an asterisk(*) in the SELECT will automatically query everything, which increases traffic between the data base server and the application, which can slow down the retrieval of results.
SELECT DISTINCT
Sometimes a table contains a column that has duplicate values, and you may find yourself in a situation where you only want to list the distinct values.
DISTINCT: return only the distinct values in a column.
SELECT DISTINCT column_name FROM table_name;
SELECT DISTINCT(column_name) FROM table_name;
COUNT
The COUNT function returns the number of input rows in the table that match a specific condition of a query.
SELECT COUNT(DISTINCT column_name) FROM table_name;
SELECT WHERE
SELECT column1, column2 FROM table1 WHERE conditions;
The conditions are used to filter the rows returned from the SELECT statement.
SELECT name, choice FROM table
WHERE name='David' AND choice='Red';
ORDER BY
Sometimes PostgreSQL returns the same request query results in a different order.
You can use ORDER BY to sort rows based on a column value, in either ascending or descending order.
SELECT column_1, column_2
FROM table
ORDER BY column_1 ASC/DESC;
If you leave it blank, ORDER BY uses ASC by default.
You can also ORDER BY multiple columns:
SELECT company, name, sales
FROM table
ORDER BY company, sales;
LIMIT
The LIMIT command allows us to limit the number of rows returned for a query to get an idea of the table layout.
LIMIT goes at the very end of a query request and is the last command to be executed.
BETWEEN
The BETWEEN operator is the same as:
value >= low AND value <= high
value BETWEEN low AND high
The NOT BETWEEN operator is the same as:
value < low OR value > high
value NOT BETWEEN low AND high
IN
Create a condition that checks to see if a value is included in a list of multiple options.
SELECT color FROM table
WHERE color IN/NOT IN ('red', 'blue', 'green');
LIKE and ILIKE
The LIKE allows us to perform pattern matching against string data with the use of wildcard characters:
- %: matches any sequence of characters
- _: matches any single character
LIKE is case-sensitive, while ILIKE is not.
GROUP BY
GROUP BY: aggregate data and apply functions to better understand how data is distributed per category.
Aggregation Functions
Take multiple inputs and return a single output.
Aggregate function calls happen only in the SELECT clause or the HAVING clause.
Most common aggregate functions:
- AVG(): returns a floating point value, use ROUND() ro specify precision after the decimal.
- COUNT()
- MAX()
- MIN()
- SUM()
SELECT ROUND(AVG(replacement_cost), 4)
FROM film;
GROUP BY
GROUP BY allows us to aggregate columns per some category.
We need to choose a categorical column to GROUP BY.
The GROUP BY clause must appear right after a FROM or WHERE statement.
SELECT category_col, AGG(data_col)
FROM table1
GROUP BY category-col;
SELECT customer_id, SUM(amount) FROM payment
GROUP BY customer_id
ORDER BY SUM(amount) DESC
SELECT DATE(payment_date), SUM(amount) FROM payment
GROUP BY DATE(payment_date)
ORDER BY SUM(amount) DESC
HAVING
HAVING allows us to filter after an aggregation has already taken place.
SELECT company, SUM(sales)
FROM finance_table
GROUP BY company
HAVING SUM(sales) > 1000;
JOINS
JOINS allows us to combine information from multiple tables.
AS
AS allows us to create an “alias” for a column or result.
SELECT SUM(amount) AS rental_price
FROM payment;
AS operator gets executed at the very end of a query, meaning that we can not use the ALIAS inside a WHERE operator.
INNER JOINS
SELECT * FROM TableA
INNER JOIN TableB
ON TableA.col_match = TableB.col_match;
FULL OUTER JOINS
Just grabs everything.
SELECT * FROM TableA
FULL OUTER JOIN TableB
ON TableA.col_match = TableB.col_match;
LEFT OUTER JOINS
SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.col_match = TableB.col_match;
RIGHT OUTER JOINS
SELECT * FROM TableA
RIGHT OUTER JOIN TableB
ON TableA.col_match = TableB.col_match;
UNIONS
UNION combines result sets of two or more SELECT statements into a single result set.
The UNION operator is often used to combine data from similar tables that are not perfectly normalized. Those tables are often found in the reporting or data warehouse system.
SELECT column_1, column_2
FROM tbl_name_1
UNION
SELECT column_1, column_2
FROM tbl_name_2;
Advanced SQL Commands
- Timestamps and EXTRACT
- Math Functions
- String Functions
- Sub-query
- Self-Join