SQL for Data Analysis (Learn what matters)

SQL Queries
show tables;It shows all the tables available in dataset
desc products;Describe a particular table, here ‘products’ is a table
select * from yearly_sales;‘*’ refers everything, this means showing everything in ‘yearly_sales’ table
select Product, Cost_per_packet from products;This means show ‘Product’ & ‘Cost_per_packet’ column from ‘products’ table
select SaleDate, Amount, packetes, Amount / packetes from yearly_sales;Here we are showing some columns from ‘yearly_sales’ table & an extra column for calculating ‘Amount per packets’.
select SaleDate, Amount, packets, Amount / packetes 'Amount_per_packet' from yearly_sales;or
select SaleDate, Amount, packetes, Amount / packets as 'Amount_per_packet' from yearly_sales;Giving the new column a specific name
select * from yearly_sales where Amount > 20000;writing conditions, show yearly_sales data only where the column Amount is above 20,000. It is like FILTERING in excel
select * from yearly_sales where Amount > 20000 order by Amount;data by order. This will show the data for ‘yearly_sales’ table where the ‘amount’ is grater than 20,000 and then arrange them by the ‘amount’ data in ascending order (by default).
select * from yearly_sales where Amount > 20000 order by Amount desc;in descending order.
select * from yearly_sales where LocID='L4';yearly_sales data where ‘LocID’ is ‘L4’.
select * from yearly_sales where LocID='L4' order by Pincode desc;order by ‘Pincode’ in Descending order
select * from yearly_sales where Amount>10000 and Curr_Date>='2022-01-01';here, we want to see all the data where value>10000 & year is after 2022. (using ‘and’ operation)
select * from yearly_sales where Amount>10000 or year(Curr_Date) = 2022;‘or’ operation
select * from yearly_sales where Amount>10000 and year(Curr_Date) = 2022;here, we want to see all the data where value>10000 & year=2022, (here the date is in this format -
2022-01-05 00:00:00
select * from yearly_sales where packets>0 and packets<=100;or,
(using ‘between’ keyword)
select * from yearly_sales where Boxes between 0 and 100; order by Boxes;Here, we get the data where No. of packets is between 0 to 100.
select SaleDate, weekday(SaleDate) as `week_day` ,Amount, Customers, Boxes from yearly_sales where weekday(SaleDate) = 4;Here, we want to show all the shipments that happend on Thrusday (using ‘weekday’ function).
select * from students where Team in ('Hufflepuff', 'Ravenclaw');alternative of ‘or’ operator - ‘in clause’. Here we can see the data from ‘students’ table where the team is either ‘Hufflepuff’ or ‘Ravenclaw’.
select * from students where Student like 'B%';‘like’ operator. Here we will get every students name that starts with ‘B’.
select * from students where Student like '%B%';here we will get every student data which has ‘B’ anywhere in their name.
select SaleDate, Amount, Customers, Boxes, case when amount < 1000 then 'under 1000' when amount < 5000 then 'under 5000' when amount < 10000 then 'under 10000' else 'more than 10,000' end as 'Amount category' from yearly_sales;or
select *, case when amount < 1000 then 'under 1000' when amount < 5000 then 'under 5000' when amount < 10000 then 'under 10000' else 'more than 10,000' end as 'Amount category' from yearly_sales;Here, we made a ‘amount’ category where every amount upto 1000 having a label of ‘under 1000’,
‘amount’ between 1000 to 5000 having the labelof ‘under 5000’
‘amount’ between 5000 to 10,000 having the label ‘under 10,000’
& anything more than 10,000 having a label of ‘more than 10,000’
Joins (between multiple tables)
suppose there is two tables - ‘students’ and ‘results’. Common column is ‘RollNo’
Joining the both table
select r.total, r.grade, s.full_name
from results as r
join students as s on s.RollNo = s.RollNo;
There is two types of join - Left Join & Right Join
Joining multiple tables:
suppose there is an another table named ‘classrooms’
we want to see student name from ‘students’ table, total marks from ‘results’ table & classroom name from ‘classrooms’ table -
select r.total, r.grade, s.full_name, c.room_name
from results as r
join students as s on s.RollNo = r.RollNo
join classrooms as c on c.RoomNm = r.RoomNm;
with conditions,
select s.full_name, c.room_name, r.total, r.grade
from results r
join students s on s.RollNo = r.RollNo
join classrooms c on c.RoomNm = r.RoomNm
where r.total < 100
and c.RoomNm = "Gryffindor";
Now if you want all the students data who are not included in a specific house/room -
select s.full_name, c.room_name, r.total, r.grade
from results r
join students s on s.RollNo = r.RollNo
join classrooms c on c.RoomNm = r.RoomNm
where c.RoomNm = '';
or,
select s.full_name, c.room_name, r.total, r.grade
from results r
join students s on s.RollNo = r.RollNo
join classrooms c on c.RoomNm = r.RoomNm
where c.RoomNm is null;
Now if you want data except the students who are not specifid in a perticular house/room -
(‘<>’ means ‘not is epual to’) [also you can use the ‘having’ option]
select s.full_name, c.room_name, r.total, r.grade
from results r
join students s on s.RollNo = r.RollNo
join classrooms c on c.RoomNm = r.RoomNm
where c.RoomNm <> '';
Now I want the marks and grade only of those students who are from India or Bangaladesh -
select s.full_name, r.total, r.grade
from results r
join students s on s.RollNo = r.RollNo
join countries co on co.CountryId = r.CountryId
where co.Country in ("India", "Bangaladesh")
order by r.total desc;
Using Group By
Now suppose I want total marks group by Houses/Rooms
select RoomNm, sum(total) as sum
from results
group by RoomNm;
result -
RoomNm | sum |
Gryffindor | 67923 |
Hufflepuff | 54542 |
Ravenclaw | 89565 |
Slytherin | 45675 |
It is like Pivot table in excel.
also,
select RoomNm, sum(total) as sum, avg(total) as Avg
from results
group by RoomNm;
using two ‘group by’
select c.Country, s.RoomNm, sum(total) as sum, avg(total) as Avg
from results as r
join students as s on r.RollNo = s.RollNo
join countries as c on c.CountryId = r.CountryId
group by s.RoomNm, c.Country;
also sorting it -
select c.Country, s.RoomNm, sum(total) as sum, avg(total) as Avg
from results as r
join students as s on r.RollNo = s.RollNo
join countries as c on c.CountryId = r.CountryId
group by s.RoomNm, c.Country
order by c.Country;
Showing the Top 10 Students
select s.Student, r.total
from results r
join students s on s.RollNo = r.RollNo
order by r.total desc
limit 10;



