Question 1

What's the difference between AI, machine learning, and deep learning?

Accepted Answer

AI is the broad goal of getting machines to perform tasks that normally require human intelligence; machine learning is the subset of AI that learns from data rather than following hand-written rules; and deep learning is the subset of ML that uses many-layered neural networks. A quick mental model: AI is the field, ML is one approach to it, and deep learning is one powerful family of ML methods.

Question 2

What is overfitting and how do you prevent it?

Accepted Answer

Overfitting is when a model memorizes the training data instead of learning the general pattern; you fight it with more data, regularization (weight decay, dropout), and validation-based early stopping. A simple tell: if training accuracy keeps climbing while validation accuracy stalls or drops, you're overfitting. Cross-validation is your best early-warning system during development, with a held-out test set as the final check.
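
If it helps, here is a minimal sketch of validation-based early stopping in plain Python. The function name `early_stopping_epoch`, the `patience` parameter, and the loss values are all made up for illustration; real frameworks ship their own callbacks for this.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the best epoch seen before stopping.

    Training is considered stopped once `patience` consecutive epochs
    pass without the validation loss improving.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation stopped improving: stop training here
    return best_epoch


# Invented validation losses: they improve, then start to climb.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58, 0.49]
best = early_stopping_epoch(val_losses, patience=2)  # best == 3
```

Note that the scan stops before ever seeing the last value; that is the trade-off `patience` controls.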

Question 3

How do transformers and attention actually work?

Accepted Answer

Attention lets each token weigh how much every other token matters to it, so transformers capture long-range relationships in parallel instead of step by step like RNNs.
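
For the curious, here is a rough sketch of scaled dot-product attention in plain Python. It is a simplification under stated assumptions: real transformers first pass tokens through learned projections to get queries, keys, and values, and run several attention heads at once. None of that is shown here, and the helper names are my own.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # How strongly does this token match every other token?
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
    return outputs


# Three toy "tokens"; self-attention uses them as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(tokens, tokens, tokens)
```

Each query is handled independently of the others, which is why the whole thing can be computed in parallel rather than step by step.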

Question 4

Why does my neural network need so much data?

Accepted Answer

Deep models have millions of parameters, so without enough varied examples they latch onto noise instead of the real signal and fail to generalize.

Question 5

What is gradient descent in simple terms?

Accepted Answer

It's how a model learns: it nudges its parameters in the direction that most reduces error, taking small steps downhill on the loss surface.
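
A tiny sketch of the idea on a one-parameter problem, assuming a toy loss `(x - 3)^2` that I picked because its minimum is obvious. The function name and defaults are illustrative, not from any library.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, i.e. downhill on the loss."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Toy loss: (x - 3)^2, whose gradient is 2 * (x - 3). The minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
```

With these settings `minimum` ends up very close to 3. Try a learning rate above 1.0 on this loss and the steps overshoot and diverge, which is why the step size matters.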

Question 6

What's the difference between supervised and unsupervised learning?

Accepted Answer

Supervised learning trains on labeled examples to predict an answer, while unsupervised learning finds hidden structure like clusters in unlabeled data.

Question 7

Why do large language models hallucinate?

Accepted Answer

They predict plausible next tokens from patterns rather than looking up facts, so when a confident pattern is wrong they produce fluent but false statements. Honestly? Same reason I confidently give directions to places I've never been. 😄

Question 8

What is a loss function and how do I choose one?

Accepted Answer

It scores how wrong a prediction is; use cross-entropy for classification, mean squared error for regression, and task-specific losses when those don't fit.
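
To make the two standard choices concrete, here is a minimal sketch of both in plain Python. The function names and the `eps` clamp are my own; libraries provide batched, numerically hardened versions.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Regression loss: average squared distance from the target."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_entropy(true_class, predicted_probs, eps=1e-12):
    """Classification loss for one example: negative log of the
    probability the model assigned to the correct class."""
    return -math.log(max(predicted_probs[true_class], eps))


mse = mean_squared_error([1.0, 2.0], [1.0, 4.0])   # (0 + 4) / 2 = 2.0
confident_right = cross_entropy(0, [0.9, 0.1])     # small loss
confident_wrong = cross_entropy(0, [0.1, 0.9])     # large loss
```

Cross-entropy punishes confident wrong answers much harder than hesitant ones, which is usually what you want from a classifier.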

Question 9

How is physics-informed machine learning different from normal ML?

Accepted Answer

It adds the governing equations to the training loss as a penalty term, so the model is pushed toward predictions that respect known physical laws instead of fitting data blindly.
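
Here is a rough sketch of what such a loss can look like, assuming a simple decay law dy/dt = -k*y as the "physics". Everything here is illustrative: the function name, the `weight` parameter, and especially the finite-difference derivative, which stands in for the automatic differentiation real physics-informed networks typically use.

```python
import math


def physics_informed_loss(t, y_pred, observed, k=1.0, weight=1.0):
    """Data-fit error plus a penalty for violating dy/dt = -k * y.

    `observed` maps an index into `t` to the measured value at that time.
    """
    # Data term: squared error at the few points we actually measured.
    data_term = sum(
        (y_pred[i] - y_obs) ** 2 for i, y_obs in observed.items()
    ) / len(observed)

    # Physics term: residual of dy/dt + k*y = 0 at interior grid points,
    # with dy/dt approximated by central differences.
    residuals = []
    for i in range(1, len(t) - 1):
        dy_dt = (y_pred[i + 1] - y_pred[i - 1]) / (t[i + 1] - t[i - 1])
        residuals.append(dy_dt + k * y_pred[i])
    physics_term = sum(r ** 2 for r in residuals) / len(residuals)

    return data_term + weight * physics_term


t = [i * 0.1 for i in range(11)]
observed = {0: 1.0, 10: math.exp(-1.0)}  # only the two endpoints are measured

exact = [math.exp(-ti) for ti in t]                               # obeys the law
straight_line = [1.0 + (math.exp(-1.0) - 1.0) * ti for ti in t]   # does not

loss_exact = physics_informed_loss(t, exact, observed)
loss_line = physics_informed_loss(t, straight_line, observed)
```

Both candidate curves pass through the two measured points, so a data-only loss could not tell them apart. The physics term is what makes the straight line score worse.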

Question 10

What's the difference between a CNN and an RNN?

Accepted Answer

CNNs excel at spatial data like images using local filters, while RNNs and their gated variants handle sequences by carrying state across time steps.

Question 11

Why split data into training, validation, and test sets?

Accepted Answer

You learn on the training set, tune your choices on validation, and measure honest performance on the untouched test set so you don't fool yourself.
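
A minimal sketch of a three-way split in plain Python. The function name and the 70/15/15 proportions are my own choices for illustration; libraries offer ready-made splitters with stratification and more.

```python
import random


def train_val_test_split(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then carve off test and validation slices."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(100))
```

The important property is that the three sets never overlap, and that you only look at `test` once, at the very end.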

Question 12

What does 'embedding' mean in machine learning?

Accepted Answer

An embedding maps things like words or images into a vector space where similar items sit close together, letting models reason about meaning numerically.
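
"Close together" is usually measured with cosine similarity. Here is a small sketch; the three-dimensional vectors are hand-made for illustration, whereas real embeddings are learned and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """1.0 means same direction, 0.0 means unrelated (orthogonal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy embeddings, not the output of any real model.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
```

Here `cat_dog` comes out much higher than `cat_car`, which is the numerical version of "cats are more like dogs than like cars".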

Question 13

How do you know if a model is actually good or just lucky?

Accepted Answer

Evaluate on held-out data, use cross-validation, pick the right metric for the task, and always compare against a sensible baseline.
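
Two of those pieces, sketched in plain Python: splitting indices into k folds, and a "always predict the most common label" baseline. The function names are mine, and the fold splitter does not shuffle, so in practice you would shuffle the data first.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) pairs, one per fold."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


def majority_baseline_accuracy(train_labels, val_labels):
    """Accuracy of always guessing the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return sum(1 for y in val_labels if y == most_common) / len(val_labels)


folds = list(k_fold_indices(10, 3))
baseline = majority_baseline_accuracy(["a", "a", "b"], ["a", "b", "a", "a"])  # 0.75
```

If your model can't clearly beat `baseline` across all folds, it probably hasn't learned much, no matter how good one lucky split looks.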

Question 14

What is reinforcement learning used for?

Accepted Answer

It trains an agent to maximize reward through trial and error, useful in game-playing, robotics, and control problems where good labels aren't available.

Question 15

What is a neural network in simple terms?

Accepted Answer

A stack of simple math units loosely inspired by neurons; by adjusting their connection strengths during training, the network learns to map inputs to outputs.
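
Here is about the smallest network I can write down, as a sketch: two inputs, two hidden units, one output. The weights are numbers I made up; training is the process of finding good ones, and it is not shown here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """Each unit takes a weighted sum of the inputs, adds a bias,
    and squashes the result with a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the inputs through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hand-picked weights for illustration only.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer: 2 inputs -> 2 units
    ([[1.0, -1.0]], [0.0]),                    # output layer: 2 units -> 1 output
]
output = forward([1.0, 1.0], layers)
```

The "connection strengths" from the answer are the numbers inside `layers`.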

Question 16

What's the difference between AI and a chatbot like ChatGPT?

Accepted Answer

A chatbot is one application of AI; underneath it's a large language model — a neural network trained to predict text — wrapped in a conversational interface.

Question 17

What is a hyperparameter?

Accepted Answer

A setting you choose before training, such as learning rate or number of layers, as opposed to the weights the model learns on its own.

Question 18

Why do we normalize or scale input data?

Accepted Answer

Putting features on similar scales helps training converge faster and stops large-valued features from dominating the model.
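
One common way to do this is standardization (subtract the mean, divide by the standard deviation). A minimal sketch, with invented income and age columns:

```python
import math


def standardize(column):
    """Rescale a feature to mean 0 and standard deviation 1."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    std = math.sqrt(variance)
    if std == 0:
        return [0.0 for _ in column]  # constant feature: nothing to scale
    return [(x - mean) / std for x in column]


# Made-up features on very different scales.
incomes = [30000.0, 50000.0, 70000.0]
ages = [20.0, 40.0, 60.0]

scaled_incomes = standardize(incomes)
scaled_ages = standardize(ages)
```

After scaling, both columns land on the same range even though the raw incomes were over a thousand times larger than the ages. In a real pipeline, compute the mean and standard deviation on the training set only and reuse them for validation and test data.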

Question 19

What is the bias-variance tradeoff?

Accepted Answer

Too simple a model underfits (high bias); too complex a model overfits noise (high variance); the goal is to balance the two for the best generalization.

Question 20

What is backpropagation?

Accepted Answer

The algorithm that computes how each weight contributed to the error and sends that signal backward through the network so the weights can be adjusted.
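
Under the hood this is the chain rule applied layer by layer. Here is a sketch for a single sigmoid neuron with squared error, which is my own toy setup; real networks repeat the same pattern across many layers, and frameworks automate it.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, target):
    return (sigmoid(w * x + b) - target) ** 2


def gradients(w, b, x, target):
    """Return (dloss/dw, dloss/db) via the chain rule."""
    # Forward pass, keeping the intermediate values.
    z = w * x + b
    a = sigmoid(z)
    # Backward pass: multiply local derivatives from the loss back to the weights.
    dloss_da = 2 * (a - target)   # derivative of squared error
    da_dz = a * (1 - a)           # derivative of sigmoid
    dloss_dz = dloss_da * da_dz
    return dloss_dz * x, dloss_dz  # dz/dw = x, dz/db = 1


w, b, x, target = 0.5, -0.2, 1.5, 1.0
dw, db = gradients(w, b, x, target)

# Sanity check: nudge w slightly and see how the loss actually changes.
eps = 1e-6
numeric_dw = (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)
```

`dw` and `numeric_dw` should agree to several decimal places; comparing the two is a handy way to debug a hand-written backward pass.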

Question 21

What's the difference between classification and regression?

Accepted Answer

Classification predicts a category like spam-or-not, while regression predicts a continuous number like tomorrow's temperature.

Question 22

How much math do I really need for machine learning?

Accepted Answer

Mostly linear algebra, calculus, and probability/statistics — enough to grasp vectors, gradients, and distributions; you can begin applied work with the basics.

Question 23

What is transfer learning?

Accepted Answer

Reusing a model trained on a large dataset as a starting point for a related task, so you need far less data and compute.

Question 24

Why are GPUs used for deep learning?

Accepted Answer

A GPU does thousands of simple calculations in parallel, which matches the massive matrix math neural networks rely on.

Question 25

What does it mean when a model has billions of parameters?

Accepted Answer

Parameters are the adjustable numbers a model learns; more of them can capture more patterns but demand much more data and compute.
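
To get a feel for where the count comes from, here is a small sketch that counts weights and biases in a plain fully connected network. The function name and the layer sizes are my own example; large language models use other layer types, but the bookkeeping idea is the same.

```python
def count_mlp_parameters(layer_sizes):
    """Each layer has (inputs * outputs) weights plus one bias per output."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# A small image classifier: 784 inputs, 128 hidden units, 10 outputs.
small_net = count_mlp_parameters([784, 128, 10])  # 101770
```

Even this toy network has over a hundred thousand parameters, so widening and stacking layers gets you to millions and then billions quickly.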

AI & Machine Learning Q&A