hello everybody in this lesson we're going to be taking a look at group by and order by in MySQL now when you use the group by clause in my SQL it's going to group together rows that have the same values in the specified column or columns that you're actually grouping on once you group those rows together you can run something called an aggregate function on those rows let's see how this actually works let's go ahead and copy this right here we'll bring that down and let me go back up one let's go ahead and write
gender right here now we want to group on this gender column and we're going to say Group by gender let's go ahead and run this we'll see what we get and so we have male and female now we could get the exact same output by saying select distinct gender from this table what is Group by doing that the gender actually isn't doing well it's actually rolling up all of these values into these rows so later when we run aggregate functions like average min max we'll do it based off of these rows and all those rows
are rolled up into these two rows and we'll see that in a little bit now what if I was to come up here and in this demographics we have a first underscore name what would happen if I'm selecting the first name but I'm grouping by the gender let's go ahead and run this if we come right down here let me pull this up you can see that the select list is not in group by clause and contains non-aggregated columns what this means is that when you are selecting a column if it's not an aggregated column
like say average of something if we're not using the aggregate functions in the select statement it has to be in the group by these have to match so this gender has to match this group by if we're not performing an aggregate function on it let's go ahead and run this and now it works properly now let's go back up let's run this query because I want to select everything again well let's say we wanted to take a look at the average ages or gender so we're going to do is we're selecting gender we're also grouping
by gender but what we're going to do is add a comma and we'll say the average that's AVG that stands for average and then we're going to put in here age so now this right here is an aggregate function this does not need to go in the group by we're just grouping on the gender and then we're performing this aggregate function or kind of a calculation based off of those grouped rows for gender so let's go ahead and run this and take a look at the output so what this is telling me is that for
the males all of the male rows that were grouped the average age is 41 Point let's say three and for female the average age is 38.5 so super quickly you can tell that the average age of females is lower than the average age of males now we'll take a look at aggregate functions more in just a little bit let's actually go to a different table let's come right down here we're going to go to the salary table and just select everything for now let's go ahead and run this now what we're going to actually be
grouping on is this occupation right here now there's a lot of unique values it's not as distinct as the gender which only had two values you'll notice we do have a few that are the same we have ones like office manager so when we come up here we're going to say occupation and of course we need to group by the occupation as well now let's run this you'll notice that office manager only has one row let's say we also want a group on the salary let's say salary now we can group on multiple so we're
going to say salary like this so we're grouping on the occupation as well as the salary now let's run this you'll notice that we have two rows for office manager now this is because this salary and this salary for those two employees are different we have fifty thousand and sixty thousand for this I just wanted to demonstrate that if these had both been 50 000 there would only be one row office manager fifty thousand but because this is a unique value different than fifty thousand they have their own individual rows which we would then perform
our aggregate calculations on let's go and get rid of that because we will not be using that anymore I just wanted to demonstrate it really quickly so before we were looking at gender and average age and we're also grouping on the gender we can perform other aggregate functions as well let's take a look at some of those we could look at the max age as well the max is going to show us the highest value within each of those groupings so we have a male and female the max age for those for the male is
61 and the highest age for the female is 46. we did the exact same thing exactly you could say Min or the exact opposite thing we can say the minimum age so it's going to be the lowest for both the male and the female go and run this now we have female and male and the minimum age is 29 and 34. and there is one last one that I want to show you which is count we're going to do count now count is going to count the actual rows within this age column so if we
run this you'll see that we have four females per count and we have seven males it's just telling us a count of how many values is in this column when we're actually grouping on the gender so that's how we can use Group by to actually roll up and group all of these similar values within a column or columns and perform our aggregate functions on them now let's come down here and what we're going to take a look at is order by so I'm going to say order by now let's actually pull in this demographics table
right here I'm just going to say select everything and run this really quickly after we got a semicolon so order by order by is going to actually sort the result set in either ascending or descending order let's take a look at how this works at the very end we could say order by and we could order by the first underscore name so we're going to take this column we're going to order all of our rows based off of this one column let's go ahead and run this so it's going to do it based off ascending
orange means smallest to largest now this is a text column or a character column so we do it A to Z so Andy and April all the way down to Tom Now by default this is in ASC order ascending order and if we run this it's going to be the exact same output but we can change this to do it the opposite highest to lowest or Z to A by doing descending so now if we run this you'll see that goes Tom all the way down to Andy now let's take a look at ordering on
something like gender and age because we can do both at the same time so let's order by the gender first let's go ahead and run this and you'll see that all the females are grouped together and then all the males are grouped together because that's just the order in which it is but we can do an additional column we could also do it based off of the age let's go ahead and run this so now within the female since that came first in our order by we're ordering by the gender and then we're also ordering
by the age after we've ordered by the gender so now it's 29 all the way up to 46 then 34 for males all the way up to 61. now we can change this just for the age let's say we want to do age descending so gender will stay the same in ascending order and now age will be in descending order let's go ahead and run this now female and male stay the same but now it starts at the highest down to the lowest now this is something that I would absolutely do in real life except
sometimes you can make mistakes and sometimes you do the wrong column first let's do age and then we'll do gender now if we run this the gender is not going to be used at all and this is because there are no unique values that are going to be on the same row so notice all these values are completely unique so the gender never is actually used to order anything on because if there were things like 34 34 34 34 these would be ordered based off of the gender but since there's no unique Fields this is
really pretty useless that's why the order of the order by or the columns that you place in the order by are actually quite important now the last thing that I want to show you and I'll just go back to gender and age is that you don't actually have to use the column names we can use the column positions now I will preface this by saying I don't recommend doing this but I sometimes do it in shorthand for just a quick query if I know the column position and I don't want to write out the whole
name sometimes I do it although it's not best practice but let's take a look at it so gender is the one two three four fifth column I'm going to place this with five and age is the one two three four column so these are the positions of the fields but not the names of them if we run it we're gonna get the exact same output because these represent these columns appropriately but again I just don't recommend it it's kind of a slippery slope that I've fallen down myself uh many times and when you get to
more advanced SQL and you're creating things like stroke procedures and triggers and all these things this can actually cause a lot of issues if you were to add any columns or remove any columns then you'd be ordering by the wrong column because let's say this last name got removed we didn't want it for some reason then the gender is one two three four now we're ordering on the wrong column and that would be a big mistake so just by best practice it is better to do gender um age but I just wanted to show you
that in case you want to be like me and kind of go down the wrong path so that is everything we're going to take a look at with Group by and order by in the next lesson we're going to be taking a look at how loving versus where [Music]