I've been working as a data scientist now for about 3 years, and in this video I want to give you a breakdown of the exact roadmap I would follow if I were learning data science completely from scratch again. Let's get into it.

Data science is all about breaking down problems, doing deep analysis, and building machine learning models, and maths is the bedrock behind all these things. Fortunately, the maths needed for data science is not PhD or Master's level. A lot of the maths is taught in the later years of high school, so it's
pretty accessible to most people. In general, there are three areas of maths you should know: calculus, probability and statistics, and linear algebra. Probability and statistics is probably the most important, and it's the one I use most day-to-day, so this is the one I would really focus your efforts on.

To start, I would recommend learning descriptive statistics: things like mean, mode, standard deviation, median, anything that summarizes data. I would also learn about plots, so how to visualize data using bar charts, graphs, violin plots. Just visualizing and summarizing data is the first step.
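
To make that first step concrete, here is a minimal sketch of summarizing a small dataset using only Python's built-in `statistics` module. The `signups` numbers and the `summarize` helper are made up purely for illustration; in practice you would likely reach for a library like pandas, but the ideas are the same.

```python
import statistics
from collections import Counter

# Hypothetical sample: daily sign-ups over one week (made-up numbers).
signups = [2, 4, 4, 5, 7, 9, 4]


def summarize(values):
    """Return common descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


summary = summarize(signups)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")

# A crude text bar chart showing how often each value occurs.
for value, count in sorted(Counter(signups).items()):
    print(f"{value:>2} | {'#' * count}")
```

The text bar chart is only a stand-in for a real plot; the point is that a handful of summary numbers plus a quick look at the distribution already tells you a lot about the data.
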
Then I would learn common probability distributions like Poisson, gamma, binomial, and normal. After that, it's time to learn some probability theory, so things like the central limit theorem, maximum likelihood estimation, and Bayesian regression. These are used so much in data science, so really make sure you understand them. After that, you should learn hypothesis testing and confidence intervals. A/B tests are pretty much how any company nowadays tests a feature it's going to release, and A/B tests under the hood are just a form of hypothesis testing, so things like the t-test, z-test, and chi-square test are
really things you should learn, because they're used extensively in pretty much every company. Finally, I would learn about modeling and inference, even though this kind of crosses over with machine learning. There are two models, linear regression and logistic regression, which are more statistical models than they are machine learning models, so I'd definitely learn these.

To learn all of this, I recommend the textbook Practical Statistics for Data Scientists. As it says in the title, it's mainly designed for data scientists, and it also has code examples, so you can practice your statistics whilst also brushing
up on your Python skills.

After you've learned stats, I would then move on to learning some calculus. Calculus is like the lifeblood of how machine learning algorithms actually learn: think gradient descent for linear regression or backpropagation for neural networks. The reason these algorithms work is through calculus, so it's a must-know for any data scientist. I would learn things like what a derivative is, the common derivatives of functions like sine, cosine, and the exponential, partial derivatives and multivariable calculus, Jacobian matrices and Hessian matrices, turning points, stationary points, the product rule, the chain rule.
All these things around how you calculate derivatives, and also integration, are really important to know. To learn all this calculus, I recommend the textbook Mathematics for Machine Learning. To be honest, it's probably a bit too advanced, but it will cover all the basics, so just skip to the sections where those things are covered and don't worry about the rest. Of course, you can dive deeper if you're really interested, and that will make your calculus knowledge excellent.

Finally, the last topic is linear algebra, which is all about how linear transformations affect vectors and
matrices. Now, you may think this is not used that much, but it's actually used quite a lot. For example, in TensorFlow, which is one of the best packages out there for neural networks, a tensor is essentially a generalization of a matrix to more dimensions, and the data frames where you store your training, validation, and testing sets are essentially just two-dimensional matrices, so matrices and vectors are literally everywhere in data science. The things you should know are: what vectors are, what the magnitude of a vector is, what its direction is. Then things like matrices: the inverse, transpose, trace, eigenvalues,
eigenvectors. These are things that are so important and crop up all the time. Finally, you should be familiar with systems of linear equations. These are used quite a bit in optimization problems, and naturally, at some point in your data science career you will have an optimization problem to solve, and these come in really handy when that's the case.

To learn all of these things, I recommend the Linear Algebra for Machine Learning course on Coursera. Again, it's excellent, and it's specifically designed for data scientists and machine learning.

After you've locked down the maths, it's now time
to develop your coding skills. I'm sure by now you know the two main languages for data science are Python and SQL, with Python probably being the more important one. For Python, I recommend taking any introductory course. I will link some recommendations on screen here. In reality, it doesn't matter which course you take, because any beginner course will teach you the exact same things. The main thing is just to pick one and get started. On the screen here are the things you should learn, and the main thing I want to drive home is: make sure you
do the exercise problems. You only really learn programming when you have hands-on experience. Watching a video of how to run a function is not the same as actually writing a function, so make sure you do all the exercise problems and get hands-on experience.

After you've developed your Python skills, it's now time to learn some SQL. On screen here are, again, some recommended courses I suggest you take. Again, it doesn't matter which course you take, just as long as you pick one and get started. In terms of what you should learn, on screen here are all
the basic functions. These basic functions are pretty much what I use 95% of the time when I'm doing any form of SQL.

If you have time, I also recommend investing some effort into other tools and technologies that are used frequently by data scientists. These are Git, shell and Bash, and also a package manager like Anaconda. I'll link some really good tutorials to cover all these things on screen here and also in the description below.

Right, at this point we have developed all the fundamental knowledge in programming and maths we need to start getting into the fun bit, which
is actually learning data science and machine learning. To start, I recommend learning the Python-specific libraries for data science and machine learning. The things you should learn are NumPy, which is a scientific computing package; pandas, for data manipulation; scikit-learn, for the baseline machine learning algorithms; and Matplotlib, which is a plotting library for Python.

The next step is to delve into some machine learning theory. By far and away the best machine learning course is Andrew Ng's Machine Learning Specialization. I took this back in 2020 and it is amazing. I mean, back
in 2020 it was written in Octave, but now it's been revamped: it's in Python, and it even has more cutting-edge algorithms in there, like recommendation systems and reinforcement learning, so I really recommend you start with this course. After you've completed that course, I recommend you take the subsequent one, which is the Deep Learning Specialization. This will teach you things like RNNs and CNNs, and will even touch upon large language models, so it will teach you all the cutting-edge aspects of current machine learning.

With all that new knowledge under our belts, it's
now time for the most important step, which is projects. This is how you're going to solidify all your knowledge in Python, SQL, maths, and machine learning, because getting hands-on experience practicing these things is by far and away the best way to learn. Some ideas to try are entering a Kaggle competition, implementing a machine learning algorithm completely from scratch, or any other personal projects that really interest you. The point is just to do a range of projects in different areas, and that's the way you will learn. Try out different algorithms, different business cases, different types of problems, because
the more breadth you get, the better your skills will become.

Before applying to any entry-level roles, I recommend you do some extra work to ensure that your CV and application really stand out. Nowadays loads of people want to be a data scientist and the market is quite tough at the moment, so you have to do that little bit extra to make sure that you stand out from the crowd. Some things I recommend you do are: get a GitHub profile and populate it, implement a research paper from scratch, write a blog post about anything you've
learned, studied, or are learning, and finally create a personal website with all the things you're learning and your progress, just to showcase your interest in the field. I promise that 80% of people probably won't have this, so you'll immediately be in the top 20% of applicants, and not to mention, doing all these things will demonstrate your interest, improve your coding skills, and make you a better, more well-rounded data scientist.

Now we're at the final stage, which is job applications, and to be honest this is probably the hardest, mainly because there's a big element of luck involved
in landing any job. To improve your luck, make sure you do those things that I mentioned to stand out, but ultimately it does come down to a numbers game. I'm not lying when I say that I applied to over 300 jobs when I was trying to land a graduate data scientist role. I mean, not all of them were data science roles, but the majority of them were, so this goes to show you that it is just about volume. Some people may disagree with this approach and say that you should tailor your CV and application to each
job. I mean, this is probably true, but to be honest, sometimes you just don't have the time to do this, and sometimes it's really not worth the effort. I'm not saying just spam-apply for every single thing you see. Make sure you review the company and the job to make sure it's somewhere you'd like to work and the job is something you'd like to do, otherwise you're kind of wasting your time and their time. And don't worry if your first job is not at a FAANG company, a hedge fund, or some really sophisticated, fancy company out there.
The goal of your first role is just to learn as much as possible and become a better data scientist. You can always change careers and maneuver your trajectory later on.

If you want more tips and advice on how to break into data science, then make sure you check out my weekly newsletter, Dishing the Data. I send it every Monday morning, and it's all about my thoughts and experiences as a practicing data scientist. If that sounds interesting, I'll link it in the description below for you to check out. If you enjoyed this video, make
sure you click the like and subscribe buttons, and I'll see you next time.