Saral Shiksha Yojna

Computer Vision

CSE471
Prof. Makarand Tapaswi + Prof. Charu Sharma | Spring 2025-26 | 4 credits

MCQs

One correct option. Pick, then check.

Marr defined vision as:

Which is NOT one of Jitendra Malik's Three Rs?

Which kernel is NOT used for smoothing?

Why does JPEG use DCT instead of DFT?

Otsu's threshold maximises:

Inverse warping is preferred over forward warping because:

Why not initialise neural network weights to zero?

Kaiming (He) initialisation uses variance:

Bagging reduces ____; boosting reduces ____.

Precision is preferred over recall when:

ROC vs PR curve — which is preferred for imbalanced data with rare positives?

Conv-layer parameter count is:

Two stacked 3×3 convs at C channels have ____ params and produce a ____ receptive field.
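
To check a pick for the two conv questions above, the arithmetic can be sketched as follows. This is a minimal sketch, assuming stride-1, undilated convolutions with equal input and output channels and no bias; the helper names and the channel count `C = 64` are illustrative, not part of the course material.

```python
def conv_params(k, c_in, c_out, bias=True):
    """Learnable parameters of one k x k convolution layer."""
    weights = k * k * c_in * c_out
    return weights + (c_out if bias else 0)


def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1, undilated convs applied in sequence."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1  # each layer widens the field by (k - 1)
    return rf


C = 64  # hypothetical channel count, chosen only for illustration

two_3x3 = 2 * conv_params(3, C, C, bias=False)  # 2 * 9 * C^2 = 18 * C^2
one_5x5 = conv_params(5, C, C, bias=False)      # 25 * C^2

print(two_3x3, one_5x5)                 # 73728 102400
print(stacked_receptive_field([3, 3]))  # 5
```

Two stacked 3×3 layers cover the same 5×5 window as a single 5×5 layer while using 18C² weights instead of 25C².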

For a ResNet block y = F(x) + x, the residual gradient ∂y/∂x:

Depthwise-separable conv is approximately ____× cheaper than standard conv at K = 3.
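
The cost comparison behind this question can be checked by counting multiply-accumulates (MACs). The sketch below assumes a 3×3 kernel and illustrative layer sizes (256 input and output channels on a 14×14 feature map); those numbers and the function names are assumptions made for the example, not values from the course.

```python
def standard_conv_macs(k, c_in, c_out, h, w):
    """MACs of a standard k x k convolution on an h x w output map."""
    return k * k * c_in * c_out * h * w


def depthwise_separable_macs(k, c_in, c_out, h, w):
    """MACs of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in * h * w  # one k x k filter per input channel
    pointwise = c_in * c_out * h * w  # 1 x 1 conv mixes the channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
k, c_in, c_out, h, w = 3, 256, 256, 14, 14

ratio = standard_conv_macs(k, c_in, c_out, h, w) / depthwise_separable_macs(
    k, c_in, c_out, h, w
)
print(round(ratio, 2))  # 8.69
```

In closed form the ratio is 1 / (1/C_out + 1/K²), so it approaches K² as the number of output channels grows.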

BatchNorm at inference uses:

CNNs are translation ____ but NOT ____ equivariant.

Faster R-CNN replaces Selective Search with which component?

YOLO v1 default output tensor for PASCAL VOC is:
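
The output shape follows from the YOLO v1 layout: an S×S grid, B boxes per cell with 5 numbers each (x, y, w, h, confidence), plus C class scores per cell. A minimal sketch, assuming the commonly cited defaults S = 7, B = 2 and the 20 PASCAL VOC classes; the function name is illustrative.

```python
def yolo_v1_output_shape(S=7, B=2, C=20):
    """Shape of the YOLO v1 prediction tensor: S x S x (5B + C)."""
    per_cell = B * 5 + C  # 5 values per box, plus one score per class
    return (S, S, per_cell)


print(yolo_v1_output_shape())  # (7, 7, 30)
```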

λ_coord and λ_noobj in YOLO loss are typically:

Mask R-CNN's mask head produces:

Dilated/atrous convolutions enlarge the receptive field by:

The number of OpenPose output channels for K keypoints and L limb types is:

SMPL's pose parameter θ has dimensionality:

PointNet achieves permutation invariance via:

Total learnable parameters per 3DGS Gaussian (degree-3 SH):
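
The per-Gaussian count can be tallied term by term. This sketch assumes the usual 3DGS parameterisation: a 3D mean, a 3D scale, a rotation stored as a quaternion, one opacity value, and RGB spherical-harmonic colour coefficients. The function name is illustrative.

```python
def gaussian_param_count(sh_degree=3):
    """Learnable scalars per Gaussian under the parameterisation above."""
    position = 3   # mean (x, y, z)
    scale = 3      # per-axis scale
    rotation = 4   # quaternion
    opacity = 1
    sh_coeffs = (sh_degree + 1) ** 2 * 3  # (d + 1)^2 basis functions x RGB
    return position + scale + rotation + opacity + sh_coeffs


print(gaussian_param_count(3))  # 59
```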

Σ = R·S·Sᵀ·Rᵀ in 3DGS guarantees:

Why is QKᵀ divided by √dₖ?

ViT-B/16 on 224² images yields how many tokens entering the encoder (including [CLS])?
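
The token count is the number of non-overlapping patches plus the [CLS] token. A minimal sketch, assuming square images whose side is divisible by the patch size; the function name is illustrative.

```python
def vit_token_count(image_size=224, patch_size=16, cls_token=True):
    """Number of tokens entering a ViT encoder for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + (1 if cls_token else 0)


print(vit_token_count(224, 16))  # 197
```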

Which SSL method does NOT use negatives?

MAE's mask ratio is:

RoPE's key property is:

In PaliGemma, which component is randomly initialised?

I3D's 'inflation' trick: