This is a curated list of resources I used to get started and stay updated on Data Science topics, mainly for R and Python users, with a smattering of basic stats and comp sci. It worked for me and hopefully it works for others, but remember that everyone’s background is different, so what worked for one person may not work for another :)
Data Science Toolkit (Theory, some application)
OK, there’s a lot coming here. Do you need to look at any of these besides maybe the first one? Probably not, but you can keep these links saved in case you want to dive deeper or need a refresher in the future.
The Saints of Data Science resources - Jimmy Oty and Lawrence Juma put up this amazing repo of PDFs of leading books and resources for Data Science and other topics.
ISLR - this is the bible of data science. If you’re going to read one book, read this one. It’s available for free in both R and Python versions. Section 5.1 on cross-validation (CV) is also a good review to refer back to if needed.
CV done Wrong - good, quick read on CV, a good refresher if you’re rusty.
Build a Career in Data Science - I loved this book when I was searching for DS jobs. It also does a great job explaining how the data analyst, data engineer, and data scientist roles sometimes differ and sometimes really don’t. A great resource altogether. Also available for free via Jimmy Oty.
Python Machine Learning - a brick of a book but an absolute gem, one I keep on hand at my desk and probably the one I still reference most often. It’s the easiest one to follow along with on the concepts. Written by Sebastian Raschka.
R Programming for DS - self-explanatory, if you want to dive into R.
Advanced R - note that the intro includes a ‘recommended reading’ list that can really be the starting point to the black hole of references. Black hole may be a bit strong, but a lot of the really good, widely available, free online PDFs reference each other, as you’ll start to notice after a bit in the field.
Lantz ML in R - one of my favorite ML references, with a super good review of the basic concepts and really straightforward code to replicate on your own. The best way to learn is to do!
From R to Python - GitHub page FULL of references for Python newbies, with a focus on those who have some R foundation. You’ll notice A LOT of people who don’t have a BS in Computer Science prefer R over Python. It is often easier to ‘muscle’ through code, there are fewer instances of object-oriented work, and I think it has a gentler learning curve.
VanderPlas’s Python Data Science Handbook - very solid starting point for using Python in DS work, with code repos to follow along with the exercises. The basics (lists, data frames, pandas, NumPy, loops/while/conditionals) are a great starting point.
Ng’s Machine Learning Yearning - Andrew Ng is the ‘father of AI/ML’ out of Stanford. He’ll probably win a Nobel one day.
Internship-specific: aijobs.net, datajobs, outerjoin - remote-focused.
Tech Jobs For Good - awesome site for mission-based tech jobs, ranging from part-time to full-time, exec-level positions.
Ben Green’s Job List - fantastic job lists, so check out his hyperlinks. The focus is on Data Science in social-good sectors, but it’s a good starting point.
Towards Data Science - this is likely my favorite source. They send daily emails on data science topics that are very digestible but also very tailored (VS Code vs Jupyter Lab pros/cons, Random Forest vs XGBoost, etc).
Youyang Gu - his blog did some absolutely amazing COVID modeling during the pandemic, and he’s an excellent writer as well.
Anna-Lena Popkes - fantastic Python walkthroughs of ML concepts, very easy to follow along with.
Alexis Perrier - a bit too academic-y for my tastes, but posts a good amount more than others, so I keep an eye on it.
Marginal Revolution - not data science specific, but a great blog for economics, non-US focused pieces you wouldn’t see in leading news outlets, and a smattering of cultural pieces; Tyler Cowen is as idiosyncratic as they come, and has a fantastic podcast as well. I’m not a Libertarian, but I appreciate TC’s perspective and willingness to engage others across fields and ideologies.