This roadmap takes you from zero ML knowledge to job-ready in approximately 6 months of consistent study
Month 1-2: Python, math foundations, and data manipulation with pandas/numpy
Month 3-4: Core ML algorithms — supervised, unsupervised, and model evaluation
Month 5-6: Deep learning, MLOps, portfolio projects, and interview preparation
Performance insight: 2 hours daily for 6 months equals 360 hours — sufficient for junior ML roles
Production insight: hiring managers value deployed projects over certificates — build and ship real models
Plain-English First
Think of this roadmap as a hiking trail with marked waypoints. You start at the trailhead knowing Python basics and follow a clear path through math fundamentals, core algorithms, deep learning, and finally arrive at the destination — job-ready with a portfolio. Each month has specific goals, free resources, and a hands-on project. The trail is designed so you never feel lost — every step builds on the previous one.
Machine learning roles require a specific skill progression that most bootcamps and courses fail to structure correctly. Developers waste months on disconnected tutorials without building deployable skills. This roadmap compresses the learning path into 6 months of focused study at 2 hours per day. Each month has concrete objectives, free resources, and a portfolio project. The sequence is designed so every concept builds on the previous one — no gaps, no dead ends. In 2026, the bar for entry-level ML roles has risen: hiring managers expect candidates to demonstrate working code, deployed models, and at least a surface-level understanding of LLM APIs and responsible AI practices. This roadmap accounts for that shift.
Month 1-2: Python, Math Foundations, and Data Manipulation
Months 1 and 2 build the foundation that every subsequent concept depends on. Python fluency is non-negotiable — you need to write clean functions, work with classes, and manipulate data structures without friction. Math foundations cover linear algebra (vectors, matrices, dot products), calculus (derivatives, gradients, chain rule intuition), and probability (distributions, Bayes theorem, conditional probability). Data manipulation means loading, cleaning, transforming, and visualizing datasets. Skip nothing here — gaps in foundations create cascading confusion later. In 2026, add one additional skill to this phase: learn to read and write basic SQL. The majority of production ML pipelines pull training data from SQL databases, not CSV files.
month1_2_foundation.py (Python)
# TheCodeForge: Month 1-2 Foundation Checklist
# Verify you can do each of these without looking anything up
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


# Python: functions, type hints, comprehensions
def compute_feature_stats(data: pd.DataFrame, columns: list) -> dict:
    """Return mean, sample std, and null percentage for each column."""
    return {
        col: {
            'mean': data[col].mean(),
            'std': data[col].std(),
            'null_pct': data[col].isnull().mean() * 100
        }
        for col in columns
    }


# Math: vector operations in numpy
weights = np.array([0.5, 0.3, 0.2])
features = np.array([1.0, 2.0, 3.0])
prediction = np.dot(weights, features)  # dot product: this is what linear models do
print(f'Prediction: {prediction:.1f}')


# Math: gradient intuition, i.e. what a derivative looks like in code
# The gradient of MSE loss w.r.t. weights drives parameter updates
def mse_gradient(X: np.ndarray, y: np.ndarray, w: np.ndarray) -> np.ndarray:
    residuals = X @ w - y
    return (2 / len(y)) * X.T @ residuals  # derivative of MSE


# Data manipulation: pandas fluency
df = pd.DataFrame({
    'age': [25, 30, 35, None, 45],
    'income': [50000, 60000, None, 80000, 90000],
    'purchased': [0, 1, 0, 1, 1]
})

# Clean, transform, and analyze in one pipeline
result = (
    df
    .fillna(df.median(numeric_only=True))  # impute missing values with column medians
    .assign(age_group=lambda x: pd.cut(x['age'], bins=[20, 30, 40, 50]))
    .groupby('age_group', observed=True)['purchased']
    .mean()
)
print(f'Purchase rate by age group:\n{result}')

# Feature stats across all numeric columns (computed on the raw data, before imputation)
stats = compute_feature_stats(df, ['age', 'income'])
for col, metrics in stats.items():
    print(f'{col}: mean={metrics["mean"]:.1f}, std={metrics["std"]:.1f}, null%={metrics["null_pct"]:.1f}')

# Visualization: basic exploratory plot (rows with missing values are not drawn)
plt.figure(figsize=(8, 4))
plt.scatter(df['age'], df['income'], c=df['purchased'], cmap='coolwarm', s=80)
plt.xlabel('Age')
plt.ylabel('Income')
plt.title('Purchase Behavior by Age and Income')
plt.colorbar(label='Purchased')
plt.tight_layout()
plt.savefig('scatter_plot.png')
print('Plot saved to scatter_plot.png')
Output
Prediction: 1.7
Purchase rate by age group:
age_group
(20, 30]    0.5
(30, 40]    0.5
(40, 50]    1.0
Name: purchased, dtype: float64
age: mean=33.8, std=8.5, null%=20.0
income: mean=70000.0, std=18257.4, null%=20.0
Plot saved to scatter_plot.png
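The checklist defines `mse_gradient` but never calls it. The following is a minimal sketch of how that gradient drives parameter updates, assuming noiseless synthetic data and a hand-picked learning rate. The `fit_linear` helper, the data, and the hyperparameters are illustrative assumptions, not part of the original checklist.

```python
import numpy as np


def mse_gradient(X: np.ndarray, y: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Gradient of mean squared error with respect to the weights."""
    residuals = X @ w - y
    return (2 / len(y)) * X.T @ residuals


def fit_linear(X: np.ndarray, y: np.ndarray, lr: float = 0.1, steps: int = 500) -> np.ndarray:
    """Plain batch gradient descent: repeatedly step against the gradient."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * mse_gradient(X, y, w)
    return w


# Synthetic data (assumption): targets generated from known weights, no noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([0.5, 0.3, 0.2])
y = X @ true_w

learned_w = fit_linear(X, y)
print(np.round(learned_w, 3))
```

Because the targets are an exact linear function of the features, the learned weights should land very close to `true_w`. With real, noisy data they would only approximate it.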
Foundation Learning Strategy
Python fluency means writing code, not reading it — close the tutorial and build something
Math intuition matters more than proofs at this stage — understand what the dot product represents before you memorize the formula
Pandas fluency is the single most important data skill for production ML work
If you cannot clean a messy dataset independently, you cannot build a reliable model
Learn basic SQL in parallel — most real training data lives in Postgres or BigQuery, not CSV files
Production Insight
A widely repeated rule of thumb holds that most production ML time (often quoted as around 80%) goes to data cleaning and pipeline maintenance, not model training.
Pandas fluency directly determines your speed on real projects and during technical interviews.
Skipping foundations to jump to algorithms produces fragile knowledge that collapses under interview pressure.
In 2026, engineers who can move fluidly between Python, SQL, and shell commands are hired faster than those who know only notebooks.
Key Takeaway
Foundations determine your ceiling — do not skip them to chase algorithms.
Pandas fluency is the most important practical skill for day-one ML productivity.
Add SQL to this phase — production data does not come in tidy CSV files.
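To make the SQL point concrete, here is a small sketch of pulling training data from a database into pandas. It uses an in-memory SQLite database as a stand-in for Postgres or BigQuery. The `customers` table and its columns are hypothetical, chosen to mirror the toy DataFrame used earlier.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for a production warehouse (assumption)
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE customers ('
    'id INTEGER PRIMARY KEY, age INTEGER, income REAL, purchased INTEGER)'
)
conn.executemany(
    'INSERT INTO customers (age, income, purchased) VALUES (?, ?, ?)',
    [(25, 50000, 0), (30, 60000, 1), (35, None, 0), (None, 80000, 1), (45, 90000, 1)],
)
conn.commit()

# Filter in SQL so only complete rows reach pandas
query = """
    SELECT age, income, purchased
    FROM customers
    WHERE age IS NOT NULL AND income IS NOT NULL
"""
training_df = pd.read_sql_query(query, conn)
conn.close()

print(training_df)
```

Pushing the filter into the query keeps the transfer small. Whether to drop or impute incomplete rows is a modeling decision, and dropping is used here only to keep the sketch short.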
Foundation Resource Selection
If: Already know Python basics → Use: Skip Python review; focus on numpy, pandas, and SQL immediately
If: No programming background at all → Use: Spend 2 weeks on Python basics before touching any ML concept; CS50P on edX is free and excellent
If: Strong math background (STEM degree) → Use: Skip formal math review; focus on implementing math in numpy to build the code-to-concept connection
If: Weak math background → Use: Watch 3Blue1Brown's linear algebra and calculus series for visual intuition before reading any ML textbook
If: Already know pandas but not SQL → Use: Do the SQLZoo interactive tutorial; 4 to 6 hours covers everything you need for pulling training data
Month 3-4: Core ML Algorithms and Model Evaluation
Months 3 and 4 cover the algorithms that power 80% of production ML systems. Start with linear regression and logistic regression — these teach the fundamental concepts of fitting, prediction, loss optimization, and evaluation. Then move to decision trees, random forests, and gradient boosting — these handle the nonlinear, messy, real-world data that linear models cannot. XGBoost and LightGBM are the specific implementations you will encounter in production and on Kaggle. Model evaluation is as important as model training: learn cross-validation, confusion matrices, precision, recall, F1, and ROC-AUC. A model you cannot evaluate is a model you cannot trust. This phase also introduces scikit-learn Pipelines — the right way to bundle preprocessing and modeling steps so your code is reproducible and deployment-ready from day one.
Accuracy alone is misleading on imbalanced datasets — always report F1 and AUC alongside it
Always use cross-validation with stratification — a single train/test split is unreliable and interviewers will flag it
F1-score balances precision and recall — use it whenever classes are imbalanced
ROC-AUC measures rank ordering quality — critical for any threshold-sensitive business decision
Wrap your preprocessing and model in a sklearn Pipeline — raw feature leakage from fitting a scaler on the full dataset is one of the most common interview gotchas
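The points above can be sketched in a few lines of scikit-learn. This is one possible setup, not a prescribed one. The synthetic dataset, the choice of logistic regression, and the hyperparameters are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced dataset (assumption): roughly 10% positives
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10,
    weights=[0.9, 0.1], random_state=42,
)

# The scaler lives inside the pipeline, so it is refit on each training fold
# and never sees the corresponding validation fold
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression(max_iter=1000, class_weight='balanced')),
])

# Stratification keeps the class ratio similar in every fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_validate(pipeline, X, y, cv=cv, scoring=['accuracy', 'f1', 'roc_auc'])

for metric in ('accuracy', 'f1', 'roc_auc'):
    fold_scores = scores[f'test_{metric}']
    print(f'{metric}: {fold_scores.mean():.3f} +/- {fold_scores.std():.3f}')
```

Reporting the mean and spread across folds, rather than a single split, shows how stable each metric is.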
Production Insight
Gradient boosting wins most tabular data competitions and is the default choice for structured data in production.
Random forests are more forgiving when hyperparameter tuning time is limited — a good default under deadline pressure.
Cross-validation with stratification is mandatory for imbalanced datasets — without it, your fold metrics are statistically unreliable.
Pipelines prevent preprocessing leakage during cross-validation: fitting a scaler on the full dataset outside a pipeline lets validation-fold statistics shape the training data and inflates reported performance.
Key Takeaway
Master 4 algorithms: linear regression, logistic regression, random forest, and gradient boosting (XGBoost or LightGBM in practice).
Model evaluation skills are as important as training skills — interviewers test both.
Use sklearn Pipelines from day one — they are the industry standard and prevent subtle data leakage bugs.
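One evaluation pitfall worth seeing once in code is how accuracy flatters a useless model on imbalanced data. The sketch below uses hypothetical labels with 5% positives and a baseline that always predicts the majority class.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical labels: 950 negatives, 50 positives (5% positive class)
y_true = np.array([0] * 950 + [1] * 50)
X_placeholder = np.zeros((len(y_true), 1))  # features are ignored by the dummy model

# Baseline that always predicts the majority class
baseline = DummyClassifier(strategy='most_frequent').fit(X_placeholder, y_true)
y_pred = baseline.predict(X_placeholder)

accuracy = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, zero_division=0)
print(f'accuracy={accuracy:.2f}, f1={f1:.2f}')  # accuracy=0.95, f1=0.00
```

The baseline never finds a single positive, yet it scores 95% accuracy. Any real model should be compared against a baseline like this before its numbers are trusted.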
Month 5: Deep Learning and Advanced Topics
Month 5 introduces neural networks and deep learning — and critically, the judgment to know when to use them. Start with a simple feedforward network using PyTorch, then move to convolutional neural networks for image data and Transformer-based models for text. In 2026, this month also covers the LLM API layer: calling OpenAI or Anthropic APIs, building basic RAG (Retrieval-Augmented Generation) pipelines with a vector store, and understanding when fine-tuning is warranted versus when prompt engineering is sufficient. Deep learning is not always the answer — for tabular data, gradient boosting still wins. The skill is knowing which tool the problem demands.
month5_deep_learning.py (Python)
# TheCodeForge: Month 5, Deep Learning with PyTorch + LLM API Awareness
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# --- Part 1: Feedforward Neural Network in PyTorch ---
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=15, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training split only, then reuse it on the test split
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

X_train_t = torch.tensor(X_train, dtype=torch.float32)
y_train_t = torch.tensor(y_train, dtype=torch.float32)
X_test_t = torch.tensor(X_test, dtype=torch.float32)
y_test_t = torch.tensor(y_test, dtype=torch.float32)

train_dataset = TensorDataset(X_train_t, y_train_t)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)


class FeedForwardNet(nn.Module):
    def __init__(self, input_size: int):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(input_size, 128),
            nn.BatchNorm1d(128),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(128, 64),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(64, 1),
            nn.Sigmoid()
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # squeeze(-1) drops only the output dimension, so a batch of one stays 1-D
        return self.network(x).squeeze(-1)


model = FeedForwardNet(input_size=20)
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.5)

for epoch in range(100):
    model.train()
    epoch_loss = 0.0
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        output = model(X_batch)
        loss = criterion(output, y_batch)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    scheduler.step()  # halve the learning rate every 30 epochs
    if (epoch + 1) % 25 == 0:
        print(f'Epoch {epoch+1}/100 | Loss: {epoch_loss/len(train_loader):.4f}')

# eval() switches dropout and batch norm to inference behavior
model.eval()
with torch.no_grad():
    predictions = (model(X_test_t) > 0.5).float()
    accuracy = (predictions == y_test_t).float().mean()
print(f'Neural Network Test Accuracy: {accuracy:.2%}')

# --- Part 2: LLM API Pattern (2026 skill) ---
# This shows the pattern; supply your actual API key via an environment variable
# from openai import OpenAI
# import os
#
# client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])
#
# def classify_sentiment(text: str) -> str:
#     response = client.chat.completions.create(
#         model='gpt-4o',
#         messages=[
#             {'role': 'system', 'content': 'Classify sentiment as positive, negative, or neutral.'},
#             {'role': 'user', 'content': text}
#         ],
#         temperature=0  # minimizes sampling randomness for classification tasks
#     )
#     return response.choices[0].message.content
#
# print(classify_sentiment('This product exceeded all my expectations.'))
# Expected output (model-dependent): positive
Output (illustrative; exact values vary between runs because no PyTorch random seed is set)
Epoch 25/100 | Loss: 0.3821
Epoch 50/100 | Loss: 0.2914
Epoch 75/100 | Loss: 0.2453
Epoch 100/100 | Loss: 0.2201
Neural Network Test Accuracy: 92.50%
When Deep Learning Is the Right Choice in 2026
Image data: CNNs and Vision Transformers dominate — start with a pretrained EfficientNet or ViT via torchvision
Text data: Transformers dominate — use sentence-transformers for embeddings, fine-tune BERT for classification
Tabular data: gradient boosting still wins — do not reach for a neural network when XGBoost will do
LLM tasks: go API-first before considering fine-tuning; GPT-4o or Claude with a good prompt often matches or beats a fine-tuned small model on common NLP tasks
Time series: ARIMA and Prophet for simple trends, PatchTST or TimesNet for complex multivariate forecasting
Production Insight
Deep learning is not always the best choice — gradient boosting wins on tabular data and is cheaper to maintain.
Neural networks require more data, more compute, more tuning, and more monitoring.
In 2026, the most practical deep learning skill is knowing how to use a pretrained model — not how to design one from scratch.
LLM API costs are real: model choice, prompt and completion token counts, and caching strategy all affect production budgets. Learn to measure and control them.
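As a rough illustration of the caching point, the sketch below wraps a simulated LLM call in an in-memory cache so repeated prompts do not trigger repeated calls. `call_llm` is a hypothetical stand-in, not a real client; a real deployment would call a provider SDK and would likely use a shared cache rather than process memory.

```python
from functools import lru_cache

api_calls = 0  # counts how often the (simulated) paid API is actually hit


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a paid LLM API call."""
    global api_calls
    api_calls += 1
    return 'positive' if 'exceeded' in prompt else 'neutral'


@lru_cache(maxsize=1024)
def cached_classify(prompt: str) -> str:
    # Identical prompts are served from memory instead of triggering a new call
    return call_llm(prompt)


reviews = [
    'This product exceeded all my expectations.',
    'It arrived on Tuesday.',
    'This product exceeded all my expectations.',  # duplicate input
]
labels = [cached_classify(text) for text in reviews]
print(labels, f'API calls made: {api_calls}')
```

Caching like this only makes sense when the same prompt should always yield the same answer, which is usually the case for classification-style tasks run at low temperature.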
Key Takeaway
Deep learning dominates vision and NLP — not tabular data.
Learn PyTorch first — go deep on one framework before touching another.
In 2026, LLM API fluency is a baseline expectation — add it to this month, not as an afterthought.
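The retrieval half of a RAG pipeline can be understood without any external service. The sketch below ranks documents by cosine similarity and builds a prompt from the best match. The bag-of-words `embed` function is a deliberately crude stand-in for a real embedding model, and the documents and helper names are made up for illustration.

```python
import numpy as np

documents = [
    'gradient boosting works well on tabular data',
    'convolutional networks are used for image classification',
    'transformers dominate modern text processing',
]


def build_vocab(texts: list[str]) -> dict[str, int]:
    words = sorted({word for text in texts for word in text.lower().split()})
    return {word: i for i, word in enumerate(words)}


def embed(text: str, vocab: dict[str, int]) -> np.ndarray:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    vec = np.zeros(len(vocab))
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def retrieve(query: str, texts: list[str], top_k: int = 1) -> list[str]:
    vocab = build_vocab(texts)
    doc_matrix = np.stack([embed(t, vocab) for t in texts])
    scores = doc_matrix @ embed(query, vocab)  # cosine similarity of unit vectors
    best = np.argsort(scores)[::-1][:top_k]
    return [texts[i] for i in best]


question = 'which model for tabular data'
context = retrieve(question, documents)
prompt = f'Answer using this context: {context[0]}\nQuestion: {question}'
print(prompt)
```

In a full pipeline the vectors would come from an embedding model, the document matrix would live in a vector store, and `prompt` would be sent to an LLM API.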
Month 6: MLOps, Portfolio Projects, and Interview Prep
Month 6 converts knowledge into job-readiness. Build 2 to 3 portfolio projects that demonstrate end-to-end ML skills: data collection, preprocessing, model training, evaluation, deployment, and monitoring. Learn the MLOps layer that separates junior candidates from mid-level candidates: experiment tracking with MLflow, containerized deployment with Docker, API serving with FastAPI, and basic CI/CD with GitHub Actions. In 2026, add responsible AI considerations to at least one project — document your bias evaluation, data provenance, and model limitations. Hiring managers at larger companies are increasingly reviewing this as part of technical screening. The portfolio is what gets you the interview. The depth of your understanding is what gets you the offer.
month6_portfolio_project.py (Python)
# TheCodeForge: Month 6, Production-Ready Portfolio Project
# Deploy a model as a REST API with FastAPI, versioning, and health monitoring
# Written against Pydantic v2 (field_validator, ConfigDict)
import logging
import time
from datetime import datetime, timezone

import joblib
import numpy as np
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, ConfigDict, Field, ValidationInfo, field_validator

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = FastAPI(
    title='Churn Prediction API',
    description='Predicts customer churn probability using gradient boosting model',
    version='1.0.0'
)

# Load model and scaler at startup. If artifacts are missing, log the error and
# start in a degraded state: /predict returns 503 and /health reports the problem.
try:
    model = joblib.load('churn_model_v1.pkl')
    scaler = joblib.load('feature_scaler_v1.pkl')
    logger.info('Model and scaler loaded successfully')
except FileNotFoundError as e:
    logger.error(f'Failed to load model artifacts: {e}')
    model = None
    scaler = None


class CustomerFeatures(BaseModel):
    tenure_months: int = Field(..., ge=0, le=120, description='Customer tenure in months')
    monthly_charges: float = Field(..., ge=0, le=500, description='Monthly bill amount in USD')
    total_charges: float = Field(..., ge=0, description='Total charges to date in USD')
    support_tickets: int = Field(..., ge=0, description='Number of support tickets opened')
    contract_type: int = Field(..., ge=0, le=2, description='0=month-to-month, 1=one-year, 2=two-year')

    @field_validator('total_charges')
    @classmethod
    def total_must_exceed_monthly(cls, v: float, info: ValidationInfo) -> float:
        # info.data holds the fields validated so far (those declared above this one)
        monthly = info.data.get('monthly_charges')
        if monthly is not None and v < monthly:
            raise ValueError('total_charges must be >= monthly_charges')
        return v


class PredictionResponse(BaseModel):
    # Allow field names starting with "model_" without a protected-namespace warning
    model_config = ConfigDict(protected_namespaces=())

    churn_probability: float
    churn_prediction: bool
    risk_tier: str
    model_version: str
    prediction_timestamp: str


class HealthResponse(BaseModel):
    model_config = ConfigDict(protected_namespaces=())

    status: str
    model_loaded: bool
    uptime_seconds: float


STARTUP_TIME = time.time()


@app.post('/predict', response_model=PredictionResponse)
def predict(features: CustomerFeatures):
    if model is None or scaler is None:
        raise HTTPException(status_code=503, detail='Model not available; check deployment logs')

    # Column order must match the order used when the scaler and model were fit
    input_array = np.array([[
        features.tenure_months,
        features.monthly_charges,
        features.total_charges,
        features.support_tickets,
        features.contract_type
    ]])
    scaled_input = scaler.transform(input_array)
    probability = float(model.predict_proba(scaled_input)[0][1])

    if probability >= 0.7:
        risk_tier = 'high'
    elif probability >= 0.4:
        risk_tier = 'medium'
    else:
        risk_tier = 'low'

    logger.info(f'Prediction: prob={probability:.4f}, tier={risk_tier}')
    return PredictionResponse(
        churn_probability=round(probability, 4),
        churn_prediction=probability >= 0.5,
        risk_tier=risk_tier,
        model_version='v1.0.0',
        prediction_timestamp=datetime.now(timezone.utc).isoformat()
    )


@app.get('/health', response_model=HealthResponse)
def health():
    return HealthResponse(
        status='healthy' if model is not None else 'degraded',
        model_loaded=model is not None,
        uptime_seconds=round(time.time() - STARTUP_TIME, 2)
    )
Output
# Run with: uvicorn month6_portfolio_project:app --reload --port 8000
# Interactive docs: http://localhost:8000/docs
# POST /predict with JSON body returns churn probability, prediction, and risk tier
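The API loads `churn_model_v1.pkl` and `feature_scaler_v1.pkl`, but the listing does not show where they come from. One way to produce compatible artifacts is sketched below. The synthetic customers and the labelling rule are invented for illustration, and a real project would train on actual data with a proper train/test split and evaluation before saving anything.

```python
import joblib
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic customers (assumption): column order must match the API's input_array
rng = np.random.default_rng(42)
n = 500
tenure_months = rng.integers(0, 121, size=n)
monthly_charges = rng.uniform(20, 120, size=n)
total_charges = monthly_charges * np.maximum(tenure_months, 1)
support_tickets = rng.poisson(2, size=n)
contract_type = rng.integers(0, 3, size=n)

X = np.column_stack([
    tenure_months, monthly_charges, total_charges, support_tickets, contract_type
])
# Made-up labelling rule for illustration: short tenure plus several tickets means churn
y = ((tenure_months < 12) & (support_tickets >= 2)).astype(int)

scaler = StandardScaler().fit(X)
model = GradientBoostingClassifier(random_state=42).fit(scaler.transform(X), y)

# File names match what the API loads at startup
joblib.dump(model, 'churn_model_v1.pkl')
joblib.dump(scaler, 'feature_scaler_v1.pkl')

# Quick smoke test with one customer in the same column order
sample = np.array([[3, 80.0, 240.0, 4, 0]])
sample_prob = float(model.predict_proba(scaler.transform(sample))[0][1])
print(f'Churn probability for sample customer: {sample_prob:.3f}')
```

Saving the scaler alongside the model matters because the API must apply exactly the same transformation at prediction time that was used during training.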
Every project needs a README explaining the problem, approach, evaluation results, and known limitations
Deploy at least one project as a live API — not just a notebook on GitHub
Include model evaluation metrics and explain why you chose the algorithm over alternatives
Version your models, include a requirements.txt, and pin dependency versions for reproducibility
Add input validation to your API — hiring managers review your code and raw prediction endpoints signal naivety
Document responsible AI considerations: what biases might exist, who could be harmed by errors, and what monitoring is in place
Production Insight
Hiring managers often skim a resume in well under a minute, so your project links must be clickable, live, and fast to load.
A deployed API with validation, logging, and a health endpoint demonstrates engineering judgment that notebooks cannot.
README quality signals communication skills — something every engineering team values as much as code quality.
In 2026, a project that uses an LLM API as one component — not the entire project — demonstrates proportionate judgment about when to use which tool.
Key Takeaway
Portfolio projects are your resume — deploy them, document them, version them, and make them load.
One deployed project with proper engineering beats ten completed courses.
MLOps skills (Docker, FastAPI, MLflow) help differentiate junior from mid-level candidates at most companies.
Portfolio Project Selection by Target Role
If: Targeting computer vision roles → Use: Build image classification with a pretrained EfficientNet via transfer learning; deploy as a FastAPI endpoint with image upload support
If: Targeting NLP or LLM-adjacent roles → Use: Build a RAG pipeline over a document corpus using LangChain, a vector store (ChromaDB or Pinecone), and an OpenAI API backend
If: Targeting general ML engineer roles → Use: Build a tabular prediction project with gradient boosting, deploy with FastAPI and Docker, track experiments with MLflow
If: Targeting MLOps or platform roles → Use: Build an end-to-end pipeline with MLflow experiment tracking, Docker containerization, GitHub Actions CI/CD, and a simple data drift monitor using evidently
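For a sense of what a data drift monitor computes, the sketch below hand-rolls a Population Stability Index (PSI) check in numpy. This is not evidently's API, just one common drift statistic written out explicitly. The income figures are synthetic, and the 0.1 and 0.25 cut-offs mentioned afterwards are a commonly quoted rule of thumb rather than a standard.

```python
import numpy as np


def population_stability_index(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """PSI between two samples of one feature; larger values mean more drift."""
    # Interior bin edges come from quantiles of the reference (training-time) data,
    # so values outside the reference range fall into the first or last bin
    interior_edges = np.quantile(reference, np.linspace(0, 1, bins + 1))[1:-1]
    ref_counts = np.bincount(np.searchsorted(interior_edges, reference), minlength=bins)
    cur_counts = np.bincount(np.searchsorted(interior_edges, current), minlength=bins)
    # A small floor avoids log(0) and division by zero for empty bins
    ref_frac = np.clip(ref_counts / len(reference), 1e-6, None)
    cur_frac = np.clip(cur_counts / len(current), 1e-6, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))


# Synthetic feature values (assumption): training data vs. two production samples
rng = np.random.default_rng(0)
train_income = rng.normal(60000, 15000, size=5000)
same_distribution = rng.normal(60000, 15000, size=5000)
shifted_distribution = rng.normal(75000, 15000, size=5000)

psi_stable = population_stability_index(train_income, same_distribution)
psi_shifted = population_stability_index(train_income, shifted_distribution)
print(f'PSI, no drift: {psi_stable:.3f}')
print(f'PSI, shifted:  {psi_shifted:.3f}')
```

Values below roughly 0.1 are usually read as stable and values above roughly 0.25 as significant drift. A portfolio project could run a check like this on a schedule and log the result next to the model version.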
Production incident · Post-mortem · Severity: high
Six Months of Random Tutorials, Zero Job Offers
Symptom
Applied to 47 ML positions. Received zero callbacks. Resume listed 12 course certificates but no projects, no GitHub portfolio, and no deployed models.
Assumption
Completing many courses would demonstrate competence. The developer believed certificates equaled job-readiness.
Root cause
Course-hopping without building projects left the developer with fragmented knowledge and no practical skills. Interviewers asked about bias-variance tradeoff, cross-validation strategy, and production deployment — concepts that require hands-on experience, not video lectures. The learning path lacked structure, projects, and depth. In 2026, interviewers are also asking about prompt engineering, retrieval-augmented generation, and responsible AI — topics that never appear in generic course catalogs.
Fix
1. Followed a structured 6-month roadmap with monthly project milestones
2. Built 6 portfolio projects deployed on GitHub with README documentation
3. Practiced ML system design interviews using real-world scenarios
4. Contributed to one open-source ML library for resume differentiation
5. Added one LLM-integrated project to demonstrate awareness of the current production landscape
Key lesson
Certificates without projects are invisible to hiring managers
A structured roadmap prevents course-hopping and knowledge fragmentation
Deployed projects demonstrate skills that certificates cannot
In 2026, knowing when NOT to use an LLM is as important as knowing how to call one
Production debug guide: symptom-to-action mapping for common learning obstacles (6 entries)
Symptom · 01
Stuck on math concepts and cannot progress
→
Fix
Skip the proof, learn the intuition. Use 3Blue1Brown videos for visual understanding. Return to math rigor after you can apply concepts in code. Most production ML engineers never hand-derive a gradient — they understand what the optimizer is doing, not every step of the calculus.
Symptom · 02
Tutorial hell — can follow along but cannot build independently
→
Fix
Stop watching tutorials. Take the last tutorial project, delete the code, and rebuild it from memory. Then modify it with a new dataset or feature. The rebuild step is where real learning happens — passive consumption builds false confidence.
Symptom · 03
Overwhelmed by the number of ML algorithms to learn
→
Fix
Focus on 4 algorithms first: linear regression, logistic regression, random forest, and gradient boosting. These cover 80% of production ML use cases. Everything else — SVMs, k-nearest neighbors, naive Bayes — is supplementary knowledge you pick up when a specific problem demands it.
Symptom · 04
Cannot stay motivated after month 2
→
Fix
Join a Kaggle competition or find a study group. External accountability and community support sustain motivation better than solo study. Alternatively, pick a dataset tied to a domain you care about — sports, healthcare, finance — and build something personally meaningful.
Symptom · 05
Projects feel too simple to impress employers
→
Fix
Deploy the project with an API, add monitoring, write tests, and document decisions. A simple model with production infrastructure beats a complex model living in a notebook. Add a section to the README explaining what you would do differently with more time — that level of reflection signals engineering maturity.
Symptom · 06
Unsure whether to focus on traditional ML or LLMs in 2026
→
Fix
Learn both layers. Traditional ML is the foundation — gradient boosting still powers fraud detection, pricing models, and recommendation systems at scale. LLMs are the interface layer — most new products are built on top of APIs like OpenAI, Anthropic, or open-weight models like Llama 3. Your competitive advantage is knowing when each is appropriate.
Learning Environment Setup Cheat Sheet: immediate setup commands for a 2026-ready ML development environment
Need to set up a Python ML environment from scratch
Immediate action
Install Python 3.11+, create a virtual environment, and install core libraries
Initialize MLflow tracking in your project directory
Commands
pip install mlflow && mlflow ui
# In your training script: import mlflow; mlflow.autolog()
Fix now
# Open http://localhost:5000 to view experiment runs, metrics, and saved models
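After installing the libraries, a quick check that the active interpreter can actually find them saves debugging time later. The sketch below is one way to do that. The package list is an assumption based on the libraries this roadmap uses, so adjust it to your own stack.

```python
import importlib.util
import sys

# Package list is an assumption based on the libraries this roadmap uses
REQUIRED = ['numpy', 'pandas', 'matplotlib', 'sklearn', 'torch', 'fastapi', 'mlflow']


def check_environment(packages: list[str]) -> dict[str, bool]:
    """Report which packages are importable in the active interpreter."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


print(f'Python {sys.version_info.major}.{sys.version_info.minor}')
for name, installed in check_environment(REQUIRED).items():
    print(f"{name:<12} {'ok' if installed else 'MISSING'}")
```

Run it inside the virtual environment. A package reported as missing there but installed globally usually means the environment was not activated before `pip install`.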
6-Month ML Roadmap Overview
Month | Focus Area | Key Skills | Deliverable | Free Resources
Month 1 | Python and Data | Python, numpy, pandas, matplotlib, SQL basics | Data analysis notebook on a real dataset with EDA and cleaning pipeline | Python.org tutorial, pandas docs, SQLZoo
Month 2 | Math Foundations | Linear algebra, calculus intuition, probability, statistics | Math intuition notes with numpy code examples for each concept | 3Blue1Brown, Khan Academy, StatQuest
Month 3 | Supervised Learning | Linear regression, logistic regression, decision trees, sklearn Pipelines | Classification project with cross-validation, confusion matrix, and F1 evaluation | scikit-learn docs, Andrew Ng ML Specialization (Coursera audit)
Month 4 | Advanced Algorithms | Random forest, gradient boosting, XGBoost, LightGBM, hyperparameter tuning | Kaggle competition submission with documented methodology | Kaggle Learn, fast.ai Practical ML
Month 5 | Deep Learning and LLMs | PyTorch, CNNs, Transformers, LLM API basics, RAG pattern | Image or NLP project with neural network plus one LLM API integration | PyTorch tutorials, fast.ai, OpenAI cookbook
Month 6 | MLOps and Portfolio | FastAPI, Docker, MLflow, GitHub Actions, responsible AI basics | 3 deployed portfolio projects on GitHub with READMEs and live endpoints | MLOps Zoomcamp, Full Stack Deep Learning, evidently docs
Key takeaways
1. Follow a structured 6-month roadmap: course-hopping without projects wastes months and produces fragile knowledge
2. Master 4 core algorithms deeply rather than surveying 20 algorithms superficially
3. Deploy portfolio projects: one deployed API with documentation beats ten completed courses on a resume
4. Model evaluation skills are as important as model training skills: interviewers test both equally
5. In 2026, add LLM API fluency to month 5: the ability to call, prompt, and integrate language models is a baseline expectation at most product companies
6. 2 hours daily for 6 months equals 360 hours: sufficient for junior ML roles if those hours produce shipped projects
Common mistakes to avoid
6 patterns
×
Course-hopping without building projects
Symptom
Completed 12 courses but cannot build a model independently. Knowledge feels broad but shallow. Cannot explain concepts without referencing slide decks. Freezes during take-home assignments.
Fix
Limit yourself to one primary course at a time. After each module, close the tutorial and build something using the concepts on a different dataset. Depth beats breadth at this stage — interviewers want to see what you can do, not what you have watched.
×
Skipping math foundations to jump to algorithms
Symptom
Can call sklearn functions but cannot explain what gradient descent does, why regularization prevents overfitting, or how a loss function is minimized. Struggles to answer 'what is actually happening when you call model.fit()'.
Fix
Spend month 2 on math intuition. You do not need to prove theorems — you need to understand what algorithms are doing under the hood. Engineers who understand the math debug models faster and design better experiments.
×
Building only Jupyter notebooks, never deploying models
Symptom
Portfolio contains 10 notebooks but no deployed APIs, no Docker containers, no production code. Hiring managers cannot assess engineering skills from a static notebook — they need to see you can ship.
Fix
Deploy at least one project as a REST API with FastAPI. Containerize it with Docker. Write a README that explains how to run it. This separates data scientists from ML engineers in the hiring funnel.
×
Learning too many algorithms without mastering any
Symptom
Can list 20 algorithm names but cannot tune hyperparameters for any of them, cannot explain the tradeoffs between them, and cannot defend a model choice to a stakeholder.
Fix
Master 4 algorithms deeply: linear regression, logistic regression, random forest, and gradient boosting. These cover 80% of production use cases. Add XGBoost once you can explain why you would choose it over random forest.
×
Ignoring model evaluation before deployment
Symptom
Model shows 95% accuracy in a notebook but degrades immediately in production. No cross-validation, no confusion matrix analysis, no understanding of class imbalance or data leakage.
Fix
Every model must have cross-validation scores, a confusion matrix, and precision/recall/F1 evaluation before any deployment discussion. If the dataset is imbalanced, accuracy is not a valid primary metric — full stop.
×
Ignoring the LLM layer entirely because it feels separate from 'real ML'
Symptom
Strong traditional ML skills but zero exposure to LLM APIs, embeddings, or RAG patterns. Fails to answer basic questions about generative AI during interviews at companies building AI-powered products — which is most companies in 2026.
Fix
Spend one to two weeks in month 5 calling an LLM API, building a simple embedding-based search, and understanding retrieval-augmented generation. You do not need to train a language model — you need to know how to use one appropriately.
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01 of 05 · SENIOR
Walk me through how you would approach a new ML problem from scratch.
ANSWER
First, understand the business problem and define success metrics — not just accuracy, but business-relevant metrics like cost reduction, false negative rate, or revenue impact. Second, explore and clean the data — check distributions, missing value patterns, class balance, and potential data leakage sources. Third, establish a baseline model — even a simple logistic regression or a rule-based heuristic — to measure meaningful improvement against. Fourth, iterate on feature engineering and model selection using cross-validation for fair comparison. Fifth, evaluate on a held-out test set with the metrics defined in step one. Sixth, deploy with monitoring for data drift and performance degradation. The key insight interviewers want to hear: problem definition and data quality determine success more than algorithm selection. Picking the fanciest model for bad data does not work.
Q02 of 05 · SENIOR
Explain the bias-variance tradeoff with a concrete example.
ANSWER
Bias is error from a model being too simple: it underfits. A linear regression trying to model a clearly curved relationship has high bias because it cannot capture the shape of the data. Variance is error from a model being too sensitive to the particular training sample: it overfits. A deep decision tree with no depth limit memorizes training noise and performs poorly on unseen data. The tradeoff: reducing bias typically increases variance, and vice versa. A decision tree with no depth limit has low bias but high variance. Pruning the tree or using a random forest, which averages many high-variance trees, reduces variance while keeping bias low. The production implication: a high-variance model can look excellent on training data yet generalize poorly, and its predictions degrade unpredictably as the data distribution shifts.
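The decision tree example in this answer can be reproduced in a few lines. The sketch below compares trees of different depth on synthetic data with deliberately noisy labels. The dataset and the chosen depths are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 10% label noise (assumption) so that memorization hurts
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

results = {}
for depth in (1, 4, None):  # None lets the tree grow until every leaf is pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    results[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))
    train_acc, test_acc = results[depth]
    print(f'max_depth={depth}: train={train_acc:.2f}, test={test_acc:.2f}')
```

The depth-1 stump is the high-bias case, with training and test accuracy that are similar but limited. The unlimited tree is the high-variance case: it fits the training set perfectly, noise included, so the gap between training and test accuracy is the number to watch.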
Q03 of 05 · SENIOR
How would you handle a dataset with 95% class imbalance for fraud detection?
ANSWER
First, never use accuracy as the primary metric: with 95% of transactions legitimate, a model that predicts every transaction as non-fraud achieves 95% accuracy and catches zero fraud. Use precision, recall, F1, and AUC-PR as primary metrics. Second, try class weight adjustment before resampling: set class_weight='balanced' in scikit-learn, which reweights the loss function without creating synthetic data. Third, consider SMOTE for oversampling the minority class, but apply it only to the training folds, never to validation or test data, and validate carefully, because SMOTE can generate unrealistic synthetic samples that inflate offline metrics without improving production performance. Fourth, use stratified cross-validation to ensure each fold contains representative fraud cases. Fifth, adjust the classification threshold based on business cost ratios: missing fraud is almost always more expensive than a false alarm, so shifting the threshold below 0.5 typically makes business sense.
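The first and fifth points can be made concrete with a small standard-library sketch. Everything numeric here is an assumption for illustration: the evenly spaced model scores, the 100:1 cost ratio, and the names `summarize`, `total_cost`, `COST_MISSED_FRAUD`, and `COST_FALSE_ALARM` are hypothetical, not taken from any real fraud system.

```python
def summarize(y_true, y_pred):
    """Accuracy, precision, recall, and error counts for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "fp": fp,
        "fn": fn,
    }


# 1,000 transactions, 5% fraud (label 1), matching the 95/5 split above.
y_true = [1] * 50 + [0] * 950

# Hypothetical model scores: fraud tends to score higher, with some overlap.
fraud_scores = [0.2 + 0.7 * i / 49 for i in range(50)]
legit_scores = [0.6 * j / 949 for j in range(950)]
scores = fraud_scores + legit_scores

# The accuracy trap: predicting "legit" for everything.
always_legit = summarize(y_true, [0] * len(y_true))
print(always_legit["accuracy"], always_legit["recall"])  # accuracy 0.95, recall 0.0

# Assumed business costs: a missed fraud costs 100x a false alarm.
COST_MISSED_FRAUD = 100
COST_FALSE_ALARM = 1


def total_cost(threshold):
    """Total business cost of flagging every score at or above `threshold`."""
    stats = summarize(y_true, [1 if s >= threshold else 0 for s in scores])
    return stats["fn"] * COST_MISSED_FRAUD + stats["fp"] * COST_FALSE_ALARM


candidates = [k / 10 for k in range(1, 10)]
best_threshold = min(candidates, key=total_cost)
print("lowest-cost threshold:", best_threshold)
```

With these assumed costs the cheapest threshold sits well below 0.5, because each missed fraud outweighs many false alarms. With real data the threshold should be chosen on validation folds, not on the test set.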
Q04 of 05 · JUNIOR
What is the difference between training, validation, and test sets?
ANSWER
The training set is used to fit model parameters: for example, gradient descent updates weights on this data. The validation set is used during development to compare models and tune hyperparameters; it exposes overfitting to the training set that training metrics alone would hide. The test set is held out completely until the final evaluation. It simulates unseen production data and must never influence any decision during development. The critical rule: using the test set for any model selection decision contaminates it and produces optimistically biased performance estimates, which is a form of data leakage. In practice, split off the test set first, use cross-validation on the remaining data for model selection and hyperparameter tuning, then evaluate exactly once on the test set to report final performance.
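A three-way split is simple to write by hand, which makes the "disjoint and fixed" property easy to check. The sketch below is one possible version under stated assumptions: the 70/15/15 proportions, the seed, and the function name `train_val_test_split` are choices made for this example, not a standard API.

```python
import random


def train_val_test_split(items, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle once with a fixed seed, then cut into three disjoint parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Row indices stand in for real examples.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 70 15 15

# No example may appear in more than one split.
assert not set(train) & set(val)
assert not set(train) & set(test)
assert not set(val) & set(test)
```

Fixing the seed keeps the test set identical across experiments, so results stay comparable and the test rows never drift into training. For time-ordered data a chronological split is usually a better fit than a random shuffle.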
Q05 of 05 · SENIOR
When would you choose a gradient boosting model over a neural network?
ANSWER
For structured tabular data, gradient boosting — XGBoost, LightGBM, CatBoost — is almost always the right default in 2026. It trains faster, requires less data, is more interpretable with SHAP values, handles missing values natively in some implementations, and routinely outperforms neural networks on tabular benchmarks. Neural networks win on unstructured data — images, text, audio — where the hierarchical feature learning of deep architectures captures structure that handcrafted features cannot. The decision rule I apply in practice: start with gradient boosting for any tabular problem; reach for a neural network when the data is unstructured, when you have millions of samples, or when the architecture has a strong inductive bias for the domain — like a CNN for images or a Transformer for sequences.
FAQ · 6 QUESTIONS
Frequently Asked Questions
01
How many hours per day do I need to follow this roadmap?
The roadmap is designed for 2 hours per day, 6 days per week. Over 6 months (about 26 weeks) this totals roughly 310 hours, or about 360 hours if you study every day. If you can dedicate 4 hours per day, you can compress the timeline to 3 months. Consistency matters more than volume: 2 focused hours daily beats 10 distracted hours on weekends. Protect your daily sessions from interruption and treat them as non-negotiable.
02
Do I need a math degree to follow this roadmap?
No. You need high school level math — algebra and basic statistics — and the willingness to build intuition for linear algebra and calculus. Month 2 covers math foundations using visual resources like 3Blue1Brown and StatQuest. You do not need to prove theorems. You need to understand what algorithms are doing under the hood well enough to debug them when they behave unexpectedly and explain them clearly in interviews.
03
Should I learn PyTorch or TensorFlow in 2026?
PyTorch. It is now the dominant framework in both research and production, with the most active community, the best debugging experience, and the most tutorials. TensorFlow still appears in legacy codebases and has strong mobile deployment tooling through TFLite. If you join a team running TensorFlow, you can transfer PyTorch knowledge in a week. The reverse is also true. Pick one, go deep, and do not split your attention.
04
Should I learn traditional ML or focus on LLMs?
Learn both layers — they are not competing. Traditional ML with scikit-learn and gradient boosting is the foundation: it powers fraud detection, pricing, recommendation systems, and every structured data problem at scale. LLMs are the interface and capability layer: they power conversational features, document processing, code generation, and content generation. In 2026, the most hireable candidates understand both. The engineers who only know LLM APIs cannot build the data pipelines behind them. The engineers who only know traditional ML are increasingly asked to integrate LLM components and struggle.
05
What projects should I put in my portfolio?
Build 3 projects with a clear progression. First, a tabular data project — classification or regression with gradient boosting, deployed as a FastAPI endpoint with input validation and a README. Second, a deep learning or NLP project — image classification with a pretrained CNN, or text classification with a Transformer. Third, an MLOps or LLM project — an end-to-end pipeline with experiment tracking and a CI/CD workflow, or a RAG application that retrieves from a document corpus and answers questions. Quality over quantity: one well-documented, deployed, reproducible project beats five abandoned notebooks.
06
How do I stay motivated for 6 months of self-study?
Join a study group or Kaggle community — external accountability is more reliable than internal motivation after month 2. Set weekly milestones and track progress publicly, even just in a simple learning log. Build projects on domains you find personally interesting. Take one full day off per week. Remember that 6 months of consistent, structured study puts you ahead of the majority of people who start learning ML and quit by month 2 when the concepts get harder.