Computer Vision
CSE471 • Prof. Makarand Tapaswi + Prof. Charu Sharma • Spring 2025-26 • 4 credits
Mock Paper 5 — Geometry + Color + Hough + RNN/LSTM + Edge Topics
Duration: 180 min • Max marks: 100
Section A — Short Answer (1-2 marks each, 20 marks)
- 1. What is a homography and how many parameters does it have? (1 m)
- 2. How many point correspondences are needed to compute a homography from scratch? (2 m)
- 3. What colour is HSV with H = 0°? (1 m)
- 4. Contrast forward and inverse warping when applying a geometric transformation. (2 m)
- 5. What is the difference between the CIE Lab and RGB colour spaces? Why is Lab preferred for colour quantisation? (2 m)
- 6. What is template matching, and what is Normalised Cross-Correlation (NCC)? (1 m)
- 7. Hough transform for circles: how many dimensions does the accumulator have? (2 m)
- 8. What is the vanishing gradient problem in RNNs? (1 m)
- 9. Name the three gates of an LSTM cell. (2 m)
- 10. Compare LSTM and GRU. (1 m)
- 11. A character-level RNN is trained on Shakespeare. How is new text generated at inference time? (2 m)
- 12. What is teacher forcing in RNN training? (1 m)
- 13. What is an autoencoder and what is its loss function? (2 m)
- 14. What is a VAE and how does it differ from a standard autoencoder? (1 m)
Section B — Conceptual / Explanation (4-6 marks each, 40 marks)
- 1. Describe the full pipeline for stitching panorama images from a handheld camera, from two overlapping images to a single panorama. (5 m)
- 2. Compare k-Means, Gaussian Mixture Models, and DBSCAN for clustering. (5 m)
- 3. Derive how the vanishing-gradient problem arises in deep RNNs trained with BPTT. (4 m)
- 4. Explain how Generative Adversarial Networks work. State the loss and explain the training dynamics. (5 m)
- 5. Diffusion models are a modern class of image generators. Explain (a) the forward diffusion process, (b) the reverse process, (c) why U-Net is the standard architecture. (6 m)
- 6. Compare three LiDAR 3D object detection approaches: VoxelNet, PointNet++, PointPillars. (5 m)
- 7. Knowledge distillation: how does it work, what is the loss, and why is it widely used in deployment? (5 m)
- 8. Compare autoencoders, VAEs, and contrastive learning as approaches to unsupervised representation learning. (5 m)
Section C — Long Form (10 marks each, 40 marks)
- 1. Homography calculation. Given four correspondences (100,100)→(50,120), (400,100)→(380,80), (400,400)→(400,400), (100,400)→(80,380): (a) set up the linear system for H, (b) explain how RANSAC handles outliers, (c) describe the inverse-warping procedure. (10 m)
- 2. Design a real-time surveillance system for 500 cameras in a transit station. Discuss (a) edge vs cloud architecture, (b) detector and tracker model selection, (c) privacy implementation for faces. (10 m)
- 3. Automated photo colourisation of black-and-white historical photos. (a) Compare classical (scribble propagation), GAN-based, and diffusion-based approaches. (b) How would you evaluate quality quantitatively? (c) How would you handle ambiguity where multiple colourisations are plausible? (10 m)
- 4. Design a multi-object tracking system for pedestrians in CCTV footage. Discuss (a) the architecture: detection, association, ID management, (b) how to handle ID switching during occlusions, (c) evaluation metrics. (10 m)
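
For checking an attempt at Section C, Question 1(a), the following is a minimal sketch of one way to build and solve the linear system for H from the four given correspondences. It assumes the common normalisation h33 = 1, which reduces the problem to an 8x8 system; the function names `homography_from_4_points` and `apply_homography` are hypothetical and not part of the paper. An answer that uses the homogeneous 8x9 system solved by SVD is equally valid.

```python
import numpy as np

# Correspondences from Section C, Question 1: (x, y) -> (x', y')
src = np.array([(100, 100), (400, 100), (400, 400), (100, 400)], dtype=float)
dst = np.array([(50, 120), (380, 80), (400, 400), (80, 380)], dtype=float)


def homography_from_4_points(src, dst):
    """Solve A h = b for the 8 unknowns of H, assuming h33 = 1 (sketch assumption)."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # From x' = (h11 x + h12 y + h13) / (h31 x + h32 y + 1)
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        b.append(xp)
        # From y' = (h21 x + h22 y + h23) / (h31 x + h32 y + 1)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.append(yp)
    h = np.linalg.solve(np.array(A), np.array(b))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, pts):
    """Map Nx2 points through H using homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


H = homography_from_4_points(src, dst)
projected = apply_homography(H, src)  # should reproduce dst up to rounding
```

Each correspondence contributes two rows, so four correspondences give exactly the eight equations needed. With more than four (possibly noisy) matches, the same rows would be stacked and solved in a least-squares sense, typically inside a RANSAC loop as asked in part (b).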