Would you like to dig deep into Deep Learning, but don’t know where to start?
Do you want to work on Artificial Intelligence, but fear you don’t have enough intelligence?
Trying to remember what Stochastic Gradient Descent is, but can’t because you graduated a long time ago?
And finally, can you only find motivation for Learning when you’re being Supervised?
Rejoice! You can get this all and more from a certain service… It’s called Grad School and it’s available for hire.
A year and a half ago, I enrolled in a CS-related graduate program for professionals at Stanford, and have
finally completed it in December. Here’s my new shiny certificate. In total, I took 4 courses, and it took
me 5 quarters to complete (I took a break in the summer.) In this post I’ll describe this program in case
What is SCPD?
SCPD stands for Stanford Center for Professional Development. Here’s their website. They offer
courses in a variety of topics, from Foundations of Computer Science, to Cybersecurity Strategy. I’ve
enrolled in the Artificial Intelligence program that featured courses from the Computer Science
In this and most other SCPD programs, you just take classes with other Stanford students: you listen to the
lectures (online), submit the assignments (written solutions to the math problems or computer code), get
grades, take exams and midterms, and discuss your problems with the Teaching Assistants.
Each course takes a quarter to complete. I completed 4 courses over 5 quarters because I wanted a summer
break. I’m sure there’s a total time limit, but I can’t find any reference to it now.
Usually, each program offers multiple courses, and you need to complete a certain set of courses. As an
example, for the AI certificate you need to complete 4 out of 14 offered through the AI
certificate: one mandatory (CS221 Artificial Intelligence: Principles and
Techniques) and three elective courses. The set of courses offered is updated to keep up with the
You don’t have to complete the certificate program; in fact you can just take the classes you’re interested in
and take it easier with the grade and the assignments. You would still need to go through similar
enrollment steps, but the expectations will probably be lighter.
If you finish the program, you get a piece of paper like the one above. And if you list it on LinkedIn,
you’ll have a Stanford icon next to your profile ;-)
What to expect?
Expect a lot of hard work if you’re actually going for the certificate (if you are just taking some courses,
then pick your own expectations). The requirements listed on the certificate page are strict, so please
- 10 hours of work a week + several weekends spent on the final project, and that’s if you take one course at
a time. You are required to take courses for 4 units (1 unit = 3 hours a week, so 4 units is 12 hours a
week), and ship a course project every time. This is consistent with my experience, and this is a serious
workload. Consider this:
For each course taken, I had to work through 3-4 weekends on the final project. Do you like to spend
an occasional Saturday afternoon chilling at the park? Forget it–unless you’d bring an ML book with you.
I had to put my hobby (bike racing) on hold for the entire 2019.
My performance at work decreased. I had very little creativity left and noticed I often just went
with the flow instead of being the creative troublemaker I normally was. And this creativity is part of
your job description and expectations if you’re at a Senior level or above. I definitely missed out on
some opportunities because I was tired.
- You need to earn at least a B grade each time if you want to earn the Certificate. This is hard: it means you need
to submit most of the assignments on time, and be proficient enough in math to do well on most midterms.
- Expect a regular cadence. Typically, you need to submit an assignment every 1-2 weeks.
- The school doesn’t stop when you go on vacation. Rescheduling because you have a vacation is not
explicitly allowed (although I never tried to negotiate it). I found myself finishing up my final
project on a ski trip… twice.
- If earning the certificate is not your goal, and you’re “just taking classes”, you might think that
earning a grade is not among your goals. However, in most classes it’s pretty hard to not earn at least a
B if you’ve actually studied and understood most of the material. Consider doing the free
alternatives in this case.
- Each course costs $5,000 in the AI program, and you need to take 4 of them. The tuition for other
programs is different: it can be higher for
Entrepreneurship certificates and lower for Foundations of CS or somesuch.
- Pro tip: are you employed by a large company that can cover a part or the entirety of your tuition for
work-related education? Use that option if you can.
- The courses are not easy. Typically, many courses you take in college will be easy and some hard.
But in this certificate, most courses are “hard”.
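To put the numbers from the list above in one place, here is a small back-of-the-envelope calculation. The unit, course-count, and tuition figures come from this post; the 10-week quarter length is my assumption for illustration, so treat the total hours as a rough estimate rather than an official figure.

```python
# Back-of-the-envelope numbers taken from the list above.
HOURS_PER_UNIT = 3          # "1 unit = 3 hours a week"
UNITS_PER_COURSE = 4        # courses must be taken for 4 units
COURSES_REQUIRED = 4        # one mandatory course + three electives
TUITION_PER_COURSE = 5_000  # USD, AI program

# Assumption (not stated in the post): about 10 weeks of coursework per quarter.
WEEKS_PER_QUARTER = 10

weekly_hours = HOURS_PER_UNIT * UNITS_PER_COURSE
hours_per_course = weekly_hours * WEEKS_PER_QUARTER
total_hours = hours_per_course * COURSES_REQUIRED
total_tuition = TUITION_PER_COURSE * COURSES_REQUIRED

print(f"Weekly workload per course: {weekly_hours} hours")
print(f"Estimated hours per course: {hours_per_course}")
print(f"Estimated hours for the certificate: {total_hours}")
print(f"Total tuition: ${total_tuition:,}")
```

The weekend time spent on final projects is not included, so the real total is likely higher.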
That sounds hard. Why not just learn about Machine Learning online?
There are so many guides, there’s Towards Data Science, there are books, and free videos
online… But there are things you can only get through supervised education at a known
- Personalized feedback. Go talk to the TAs (Teaching Assistants, aka семинаристы in Russian) about
your homework and about the projects. Zoom videoconferencing is available to remote students (but
I mostly just drove to Stanford for that). If you are struggling with something but decide to skip
this, you’re not getting the most out of your $5k: go talk to a TA.
- Experienced advisors. For project work, pick the more senior TAs and talk to them. Find excuses to
talk to them (a good excuse can be “my project fits with your research interests”, “one of your papers
seemed relevant to my project so I wanted to discuss it with you”). Your project is not relevant?
Consider doing a project that is–I’m serious here.
- Accountability. If you lack inner drive, enrolling won’t help you start, but for many people,
accountability helps to be consistent. Due to the difficulty of the courses offered, sometimes you will
feel drained and demotivated, and knowing that you’re accountable to your school and to the TAs who helped
you might just give you that last bit of motivation.
- Actual completed projects and project mentorship. As part of most of the classes, you will
have to (a) come up with, (b) iterate on, and (c) evaluate your project, and (d) prove that ML made a
difference compared to simple heuristics. These are all skills that are indispensable in your
career as a data scientist, and top students and professors from Stanford will help you there. It
is the projects that made the actual impact on my career, not just the learnings I got from the studies.
How to enroll?
So, the enrollment is open now for the Spring Quarter. What do you need to do?
First, the enrollment is time-sensitive. The available courses (just like the school itself) are in
high demand. The spots usually sell out in the first days or 1-2 weeks tops after the enrollment opens. You
want to put a reminder on your phone and enroll as soon as it’s available. Find the enrollment dates in
the Academic Calendar for the right year (e.g. here’s the Calendar 2019-20) under something like
“Axess opens for course enrollment”. Then, enroll via the SCPD website, not through Axess!
You need to enroll into the program itself in addition to enrolling into the specific courses (because
the AI certificate is “in high demand”, duh). This was the active link with more information at the time of
writing. Get the required documents ready ahead of time. Find the list through the SCPD website; the
list will include your college transcript in English. I used the
translation from Russian that was prepared for one of the visas.
You need to nominate an “exam monitor.” Usually, this should be your manager or a person who can pretend to
be one. Their role would be to supervise your exams and make sure you don’t cheat and abide by the time
limit. So make sure to pick a busy manager whose calendar is full of meetings and who won’t have the time
to care. ;-) And while I did nominate the exam monitor, I took all tests on campus anyway.
You need to design your program to minimize the completion time. For example, CS224n is offered once
per year, whereas CS231n is offered three times per year. If you’re interested in both, it might
make sense to take CS224n when you have a chance to, and take CS231n some other time, even though CS231n is a
great starter course.
(Actually, in one of the CS221 assignments, you’ll write a constraint solver to design your program
in this way for you automatically :-D).
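To give a feel for what such a solver does, here is a minimal sketch of program planning as a constraint satisfaction problem. This is not the CS221 assignment code: it is plain backtracking with pruning, the function name `plan_program` is made up, and the offering calendar in `OFFERED_IN` is hypothetical data for illustration only. The constraints are that each course is taken in a quarter when it is offered and that you take one course per quarter; the objective is to finish as early as possible.

```python
from typing import Dict, List, Optional

SEASONS = ["autumn", "winter", "spring", "summer"]

# Hypothetical offerings, for illustration only (not the real Stanford schedule).
OFFERED_IN: Dict[str, List[str]] = {
    "CS221": ["autumn", "spring"],
    "CS224n": ["winter"],
    "CS231n": ["autumn", "spring", "summer"],
    "AA228": ["autumn"],
}


def plan_program(
    offered_in: Dict[str, List[str]], horizon: int = 8
) -> Optional[Dict[str, int]]:
    """Assign each course to a distinct quarter index (0 = first autumn),
    minimizing the index of the last quarter used. Returns None if the
    courses cannot be scheduled within `horizon` quarters."""
    courses = list(offered_in)
    best: Optional[Dict[str, int]] = None

    def backtrack(i: int, assignment: Dict[str, int]) -> None:
        nonlocal best
        last = max(assignment.values(), default=-1)
        # Prune: this branch cannot finish earlier than the best plan so far.
        if best is not None and last >= max(best.values()):
            return
        if i == len(courses):
            best = dict(assignment)
            return
        course = courses[i]
        for quarter in range(horizon):
            season = SEASONS[quarter % len(SEASONS)]
            offered = season in offered_in[course]
            free = quarter not in assignment.values()  # one course per quarter
            if offered and free:
                assignment[course] = quarter
                backtrack(i + 1, assignment)
                del assignment[course]

    backtrack(0, {})
    return best


schedule = plan_program(OFFERED_IN)
if schedule is not None:
    for course, quarter in sorted(schedule.items(), key=lambda kv: kv[1]):
        print(f"quarter {quarter} ({SEASONS[quarter % len(SEASONS)]}): {course}")
```

With this made-up calendar, the rarely offered courses end up pinned to their only available slots and the frequently offered ones fill the gaps, which is the same reasoning as in the CS224n vs. CS231n example above.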
See the section below on which courses I took and which I decided not to take and why.
Why do you want it?
What’s the goal you have in mind? Different people enroll for different reasons:
Change the career path.
The credit earned in these courses counts towards a Master’s degree should you choose to pursue one and
should you get accepted. One former colleague of mine did that. Now she works as a Natural Language
Processing ML engineer for Google Ads.
But she is actually an outlier! You might be surprised, though, that hiring managers in Silicon Valley do
not consider such a certificate a big advantage, especially if they’re looking for people with production
experience in machine learning. Think about it: many Stanford CS students take the same courses as the ones
offered in the certificate,
but not all of them would be qualified for a specialized ML role. (This applies to Silicon Valley though;
perhaps other places are not crowded with recent Stanford graduates and this certificate will mean
To this point, two other colleagues with the same ML certificates and Master’s degrees from Stanford are
working on ML infrastructure without any modeling in their duties. While I’m sure it helps them in their
jobs (and managers are quite willing to consider such a certificate a big plus), one doesn’t really need a
degree to do that.
In other words, this degree is not a guarantee, but a step towards a different career. E.g. you could use
what you learned to start an ML-related side project at your current job. See this classic video titled
“You and your Research” but it should’ve been titled “You and your career”.
This is a well-trodden, albeit risky, path. That colleague I mentioned at the beginning had transferred to her
ML modeling role from the ML infrastructure position shortly after getting her stamped degree. Heck, I
transferred from ML infra myself. But it’s not the degree that gets you there. It’s your grit and the
willingness to go the extra mile; education merely gives you the tools.
I just need a change. However hard it was to admit, this was a big part of my motivation. Enroll
first, and then see where this takes me later. Do this at your own risk, and only if you’re a person who lives
by this principle in other aspects of your life as well.
Gain deeper understanding of the topics. Have you been doing “classical ML” and want to learn more
about deep learning? Or say, you’ve recently joined a team that is involved in Natural Language
Processing, but this topic is new to you and you need to bring yourself up to speed. Then these courses
could be a great tool; just keep in mind that some courses do offer deeper understanding but some do not.
In this case, consider also just taking 1-2 classes without completing the Certificate program.
You’ll have to use every free minute to study. Here’s me doing homework on the Google bus.
The Wrong Reasons
However, there are things that these courses won’t help you with.
Learning about the cutting-edge techniques actually employed in the industry. No course geared
towards undergraduate students will tell you that. Many of the TAs don’t know it either. You can get
this knowledge on the job, or learn it from the engineering community. Undergraduate-level courses are
the wrong tool for that.
Getting an equivalent of N years of experience on your resume
No, you won’t be able to pretend that you’re a senior, experienced ML developer after just a total
of 2-3 months of personal projects of undergrad student quality. I tried and it didn’t work. That’s less
than one internship’s worth of workload.
Just to have fun. School is not fun. You’re a professional now. Leave the all-nighter drinking
parties you had in your teens to the confines of your memories. At top schools, “3 units of credit”
actually means “9 hours of coursework a week”. You won’t get your youth back by enrolling into a school.
I need accountability to actually get something done. So… about that. If you don’t have an
intrinsic motivation to get the studies finished, you won’t make it. It’s like thinking you’ll go to the
gym more if you buy the membership: you won’t, and $5,000 is a lot of money.
How to try before you buy?
If paying money for things or spending $5k without prior research is not your thing, you can get a lot of that
education for free or at least try the course before you buy it.
The algorithm is simple:
Find the course you’re interested in in the Course Catalog.
Google “Stanford <course name>”. The first link will be the course website, wide open to the public.
Read the lecture notes. Download and try to solve the assignments.
It’s insane how much of this stuff is available for free. Like here are the full lecture notes for
CS231n. There are lecture notes for other classes too. The lecture notes for
CS229 are a book’s worth of foundational information for “classical ML”.
Besides, a simplified version of CS229 is offered on Coursera. Homework is not required.
And of course, besides Stanford, there are other resources where you can learn about ML. Consider the
Google course. It’s much more practice-oriented, and you’ll avoid doing all the math.
If you take it, and feel like you crave more math and are willing to learn these concepts on a deeper level,
then proceed to some more advanced programs. If not, proceed to hacking on cool ML-y things.
Now, I know, it’s hard to beat a certificate from Stanford. Few schools have the credentials, the reputation,
the resources, and the professors that Stanford offers. If you have never been to the Stanford
Campus and you live close by, you gotta check it out! It even has a Rodin Sculpture Garden.
However, if you live, say, close to another good school that offers a similar program, consider enrolling into
that school instead. There are certain benefits in face-to-face communication with TAs and other remote-ish
Yes, the SCPD policy discourages you from attending the lectures and tests on campus. While you “are not
guaranteed a spot” in the exam auditoriums, this never was a problem for me. I just showed up and always
found a spot. The TAs and professors were always happy to talk in person rather than mess with
Zoom or other remote conferencing. Although, I once drove to the campus only to find that my TA was home and
was doing videoconferencing via Zoom.
Besides, some Poster Sessions where you display and present your project with a poster are essentially giant
career fairs. The CS231n “Convolutional Neural Networks” poster session in particular was one: Waymo, Zoox,
Cruise, and a bunch of other companies deployed their recruiters there.
The same can be said about other industry events too: CVPR has reportedly become not just a career fair but a
full-fledged industrial ML expo in addition to being a kinda scientific conference.
Stuff you get as a Stanford non-student
No, you can’t get a Student ID and so you won’t be eligible for discounts. You will, however, receive access
to the digital library, and you’ll be able to download all the papers needed for your work from ACM, IEEE,
and other websites for free.
And you will get the @stanford.edu email for the duration of your studies. Make use of it while you can.
Just get a GPU. It’ll be easier than messing with remote access to the free cloud compute.
Since Machine Learning requires a lot of GPU time, you’ll get credits on Microsoft Azure or Google Cloud to
run GPU-intensive computations for class assignments and projects. But if you have a gaming computer at home,
you don’t need to do that. Last year, while I did buy a new GPU, I was able to get away with my
old GeForce GTX 1070 for the most part, which you can buy new for $300 (it aged well, as far as machine
learning goes). Remember, you’ve shelled out $20,000 for tuition, so you can definitely spare $300 for the
Tips and tricks
- There are social mixers for local SCPD students. Join them.
- After your enrollment finishes… nobody will tell you what to do next. I had to ask my friends where to
find the schedule. Look into your Stanford email account for instructions, and if that fails, just Google
your course name. The webpage will show the course schedule, and you can watch the lectures remotely via
the Panopto video delivery website.
- If you’re at a big company, or are a member of a local meetup, find friends to do coursework together. I
partnered with two other Googlers for the CS224n course project.
- On the second or third week, start thinking about your final project. It’s not due for another 2-3 months,
but it will take a long time. Discuss your project with TAs; it’s a better use of their and your time than
trying to solve that one homework problem.
Which Courses I Took (and which projects I’ve completed)
Out of the AI certificate courses, here’s what I’ve taken (in the order they might be interesting
to you), with a more in-depth description to follow.
CS231n Convolutional Neural Networks for Visual Recognition
CS231n Convolutional Neural Networks for Visual Recognition is the hands-down best course
on Deep Learning and neural networks I have taken. If you’re interested in practical deep learning, take
this course first. At the time of publishing this blog post, the enrollment for this course is open, so
don’t miss your chance. You’ll not only learn how to work on computer vision, but you’ll learn the
foundations of deep learning, starting from manually deriving and implementing gradients and backpropagation,
culminating in applying tried and true architectures to foundational computer vision tasks (including
“nitty-gritty” details like building activation maps).
While 5 years ago this course would have been state of the art, it seemed a little bit dated to me. I wouldn’t take
it for the cutting-edge updates on the topic, but more for the foundations.
My Project: “A Battery-Powered Friend: Towards Embedded Emotion Recognition” (poster,
report). This is my most successful project here; I extensively described it in my
blog, and I won’t repeat myself, but
I’ve literally built a Raspberry Pi device that ran a CNN on device and “smiled at you” with its green
LEDs if it detected a smile with its camera. The project included hiring MTurk workers for image labeling,
hacking on Raspberry Pi and deep learning of course, which made it a really fun experience.
Note: Fei-Fei Li only gave 2 lectures when I was taking this course; most of the instruction was done by the
head TAs. That is not to say the quality of this course is subpar: the instructors are experienced and
motivated; Justin is a great teacher. I just don’t want you to have some sort of expectation there.
Poster session for CS231n was essentially a career fair this year. Come in person if you can.
CS221 Artificial Intelligence: Principles and Techniques
CS221 Artificial Intelligence: Principles and Techniques is the required course that covers a wide range of AI
topics featuring heuristic graph search algorithms (A-star), Reinforcement Learning (with the fun
Pac-Man challenge I got 4th place in), Markov Decision Processes, Constraint Satisfaction solvers, and
other basic tools of “classical” AI without a lot of ML per se. If you think this course is easy because
it’s an “overview” course, don’t be fooled. You are expected to dig deep and demonstrate understanding of
all of them on the 2 midterms and across 8 assignments.
The course was taught by Percy Liang; he’s a great instructor and researcher. In fact, our project
for the CS224n course (see below) was an extension of his paper. :-)
Algorithm performance evaluation using… ahem, volunteers 😇
My project: “Using Weather Data for Wine Recommendations” (poster,
final paper). It was my first project so don’t mind the poorly formatted report.
As recommended by the TAs, I picked an area I was passionate about, and tried to come up with
something useful that had a dataset I could train on.
Let me tell you a bit more because I did get some interesting results here. The goal of the
project was to build a recommendation system for wine based on wine reviews. The algorithm
would read the professional wine reviews, and then produce recommendations to users based on
similarity. However, I wanted the algorithm to learn similarity on its own.
The added challenge was that we were strongly encouraged to not use neural networks to get practice
with “Classical ML”. I downloaded all wine reviews from one of the major wine magazines I
subscribed to and got to work.
After a bit of research and trial and error, I found an algorithm that works well: Latent Dirichlet
Allocation as a latent feature vector generator. I then used K-means clustering to
evaluate the quality of the latent vectors (I learned much later in my career this is how a lot of
unsupervised learning models are evaluated).
It was a success! Not only did the algorithm learn to separate red wine from white wine (sic!),
it also learned to separate Chardonnay from other white wines, and it produced distinct categories
for Chateauneuf and Pinots, not even being aware that any of that exists! The weather data…
well, shall we say, it did not improve the recommendation, but I was blown away by the
power of unsupervised learning alone.
I compared the recommendations with those made by experts at a reputable wine store, and
a blind tasting with friends revealed my algorithm was indeed better at similarity than humans. ;-)
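The evaluation idea described above (cluster the latent vectors with K-means, then check whether the clusters line up with categories you already know, like red vs. white) can be sketched in a few lines. This is not the project's code: the vectors in `latent_vectors` are hand-made stand-ins for what an LDA model would produce from review text, the labels are invented, and cluster purity is just one reasonable metric I picked for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid_of(points: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(points: Sequence[Vector], k: int, iters: int = 20) -> List[int]:
    """Plain K-means with a deterministic farthest-point initialization.
    Returns the cluster index assigned to each point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    assignment: List[int] = []
    for _ in range(iters):
        assignment = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        new_centroids = []
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            new_centroids.append(centroid_of(members) if members else centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignment


def purity(assignment: Sequence[int], labels: Sequence[str]) -> float:
    """Fraction of points that carry the majority label of their cluster."""
    clusters: Dict[int, Counter] = {}
    for a, label in zip(assignment, labels):
        clusters.setdefault(a, Counter())[label] += 1
    return sum(c.most_common(1)[0][1] for c in clusters.values()) / len(labels)


# Hypothetical topic-proportion vectors (3 topics) and known wine colors.
latent_vectors: List[Vector] = [
    (0.80, 0.10, 0.10), (0.70, 0.20, 0.10), (0.90, 0.05, 0.05),
    (0.10, 0.80, 0.10), (0.20, 0.70, 0.10), (0.10, 0.60, 0.30),
]
known_labels = ["red", "red", "red", "white", "white", "white"]

clusters = kmeans(latent_vectors, k=2)
score = purity(clusters, known_labels)
print(f"cluster assignment: {clusters}")
print(f"purity vs. known labels: {score:.2f}")
```

A purity close to 1 means the unsupervised features already separate the known categories, even though the model never saw those labels; a purity near the share of the largest class means the features carry little of that signal.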
Leaderboard for Pac-Man competition featuring my submission 5 days before the deadline… :-( If you get a
large score, do not reveal it!
- The course features the famous Pac-Man challenge, in which students teach an AI to play
Pac-Man. Pro tip: if you want to win the challenge, do not submit early. Otherwise the other
students will get motivated by your scores and will try harder. :-(
CS224n Natural Language Processing with Deep Learning
CS224n Natural Language Processing with Deep Learning. I’ve learned about Recurrent Neural
Networks, LSTM, word vectors, and neural language models. In two of the assignments, you’ll implement
portions of a Neural Machine Translation system and train an encoder-decoder network to translate English into
Spanish. You’ll do it once with word embeddings, and the other time with CNNs. This is insane!! These
were the world’s most mind-blowing achievements 10 years ago and now you can do it with your own bare
hands (and a GPU or two).
This is a pretty new course, and it did represent the state of the art in NLP when I was taking it a year ago.
If you’re interested in NLP, this course is a must-have. Out of all courses, this was the best taught, most
modern, and most memorable to me.
Our Project: “Digital Mad Men: Training RNN as an Ad Copywriter” (poster,
report). We did a group project with Sri and Lin, two other Googlers who were
participating in the program. We applied some new research from Stanford NLP lab on variational language
models to try and get the neural network to write positive product reviews by inserting product names into the
existing reviews. We couldn’t get the network to insert the keywords, sadly, but we learned a lot in the
Pro tip: find your peers who are doing the same course on internal Slack / Mailing list, and band
with them for the final project. It’s a bit difficult to make people with busy schedules collaborate, but
it’s worth it.
This course was taught by Chris Manning (60%) and Abigail See (40%); both are actively working in
the field and are great instructors. I’m sad Abigail doesn’t seem to be teaching this course this year.
AA228 Decision-Making under Uncertainty
AA228 Decision-Making under Uncertainty. This is a combined CS/Aeronautics course that covers
modeling and working with uncertainties in decision-making and planning. It’s quite a niche course unlike
the ones listed above and almost exclusively focuses on planning, Q-learning / SARSA, Monte-Carlo tree search
methods, and partially observable Markov Decision Processes.
The most useful outcome for me was that I finally understood how to use Bayesian inference in practical
contexts. This course starts with foundations, and the instructor takes the approach of “unless you truly
understand the foundations, we might as well not even talk about POMDPs”. I could finally work with
conditional probabilities, and I am now able to contribute to modeling the mission objectives at work.
My Project: “Self-driving under uncertainty” (report; poster not required). I
applied Monte-Carlo tree search for decision-making given imperfect perception. The project didn’t work
out in a variety of ways: essentially I learned why MCTS and other such methods are not yet widely used in the
industry. The POMCPOW planner I implemented was myopic and required handcrafted features to even drive
forward and not stall in place.
Interestingly, there was a similar project in CS221 we’ve done as part of one of the assignments. The
methods used there worked much better for a similar formulation.
Pro Tip: this course is easier and less intense than the other courses.
Which Courses I Passed On
- CS230 Deep Learning was just… not deep enough, according to my friends who took it. Other
courses will give you deep practical knowledge of some specific area as well as this introduction to Deep
- CS229 Machine Learning was mostly about the “classical” Machine Learning, with a lot of math you’ll
need to solve in the assignments and not a lot of practical knowledge. Instead, I just read the lecture
notes I downloaded from the internets at my own pace in the summer.
My personal Learning Outcomes
So what did I get out of it?
- I can have a conversation about deep learning with other people. Heck, I can make jokes about ML! (Next
milestone: make jokes people laugh at).
- I can read, understand, and pass small amounts of judgement about the relevant ML papers.
- I revived the dormant memories of linear algebra, statistics, and calculus.
- I learned that 80% of Machine Learning is indeed downloading datasets, generating features, writing
evaluation metrics, and trying to find labeled data relevant to your project.
- …and finally, I did achieve my goal of changing the career path. I work on self-driving cars at Zoox now.