Kalman Filters CV PT2

Object Tracking

Dr K Adi Narayana Reddy


TRACKING
• Tracking in computer vision refers to the process of locating a specific object
or multiple objects in a sequence of frames in a video. It involves identifying
and following objects as they move through consecutive frames, maintaining
their identities over time. Tracking is a fundamental task in various
applications, including surveillance, human-computer interaction, augmented
reality, and autonomous vehicles.

• Tracking is the problem of generating an inference about the motion of an object given a sequence of images. Generally, we will have some measurements that appear at each tick of a (notional) clock. These measurements could be the position of some image points, the position and moments of some image regions, or pretty much anything else.
Applications
• Motion Capture: If we can track the 3D configuration of a moving person
accurately, then we can make an accurate record of their motions. Once we have
this record, we can use it to drive a rendering process;
• for example, we might control a cartoon character, thousands of virtual extras in a
crowd scene, or a virtual stunt avatar.
• Recognition from Motion: The motion of objects is quite characteristic. We
might be able to determine the identity of the object from its motion. We should
be able to tell what it’s doing.
• Surveillance: Knowing what the objects are doing can be very useful. For
example, different kinds of trucks should move in different, fixed patterns in
an airport; if they do not, then something is going wrong. Similarly, there are
combinations of places and patterns of motions that should never occur (e.g.,
no truck should ever stop on an active runway). It could be helpful to have a
computer system that can monitor activities and give a warning when it detects
a problem case.
• Targeting: A significant fraction of the tracking literature is oriented toward
(a) deciding what to shoot, and (b) hitting it. Typically, this literature describes
tracking using radar or infrared signals (rather than vision).
Tracking as an abstract inference problem
• "Tracking as an abstract inference problem" refers to a concept within
computer vision and machine learning: framing the tracking of objects or
entities in a scene as a problem of abstract inference.
• In this context, "tracking" typically involves following the movement or
changes of objects over time in a series of images or frames. "Abstract
inference" suggests that this tracking task may involve making sense of
complex or ambiguous data to infer the location, trajectory, or other properties
of the tracked objects.
• This concept may involve advanced algorithms and mathematical models to
analyze patterns, relationships, and uncertainties in the data to accurately track
objects despite challenges such as occlusion, noise, or changes in
appearance.
SIMPLE TRACKING STRATEGIES
• There are two simple ways to track objects.
• In the first, tracking by detection, we have a strong model of the object, strong
enough to identify it in each frame. We find it, link up the instances, and we have a
track.
• In the second, tracking by matching, we have a model of how the object moves.
We have a domain in the nth frame in which the object sits, and then use this
model to search for a domain in the (n+1)th frame that matches it.
Tracking by Detection
• Assume that we will see only one object in each frame of video, that the state we
wish to track is the position in the image, and that we can build a reliable detector
for the object we wish to track.
• In tracking problems, we want to build the space-time paths followed by tokens—which
might be objects, or regions, or interest points, or image windows—in an image
sequence (left).
• There are two important sources of information; carefully used, they can resolve many
tracking problems without further complexity.
• One is the appearance of the token being tracked. If there is only one token in each
frame with a distinctive appearance, then we could detect it in each frame, then link the
detector responses (a).
• Alternatively, if there is more than one instance per frame, a cost function together
with weighted bipartite matching could be enough to build the track (b).
• If some instances drop out, we will need to link detector responses to abstract tracks (c);
in the figure, track 1 has measurements for frames n and n + 2, but does not have a
measurement for frame n + 1.
• Another important source of information is the motion of the token; if we have a
manageable model of the flow, we could search for the flow that generates the best
match in the next frame. We choose that match as the next location of the token, then
iterate this procedure (right).
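The linking step in (b), where detector responses in adjacent frames are joined by minimum-cost bipartite matching, can be sketched in Python. The 2-D point detections, the Euclidean-distance cost, and the brute-force search over permutations (a stand-in for the Hungarian algorithm, workable only for small per-frame counts) are all illustrative assumptions:

```python
# Sketch: linking detections across two frames by minimum-cost bipartite
# matching. Assumption: one 2-D point per detection; cost = Euclidean distance.
from itertools import permutations
import math

def link_detections(prev_pts, curr_pts):
    """Return (i, j) pairs matching frame-n detections to frame-(n+1) detections."""
    n = len(prev_pts)
    def cost(assign):
        # Total distance of the candidate assignment.
        return sum(math.dist(prev_pts[i], curr_pts[j]) for i, j in enumerate(assign))
    # Brute-force minimum-cost assignment (Hungarian algorithm for larger n).
    best = min(permutations(range(n)), key=cost)
    return list(enumerate(best))

prev_pts = [(10.0, 10.0), (50.0, 50.0)]
curr_pts = [(52.0, 49.0), (11.0, 12.0)]
print(link_detections(prev_pts, curr_pts))  # [(0, 1), (1, 0)]
```

Each pair (i, j) says detection i in frame n continues as detection j in frame n+1; chaining these pairs over frames builds the tracks.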
Applications

• Object tracking in computer vision (e.g., pedestrian tracking, self-driving cars)
• Radar tracking (e.g., aircraft surveillance)
• Biological tracking (e.g., animal movement
analysis)
• Finance (e.g., stock price tracking)
• Robotics (e.g., SLAM—Simultaneous Localization
and Mapping)
TRACKING USING MATCHING
• Tracking translations by matching refers to a specific aspect of object tracking in computer
vision, where the goal is to follow the translation (movement) of objects between consecutive
frames by matching corresponding features or descriptors. Here is how this process generally works:
1. Feature Extraction: Features are extracted from the objects of interest in the initial frame. These
features could be corners, edges, or other distinctive points that can be reliably detected and
described.
2. Matching: The extracted features in the initial frame are matched with features in the subsequent
frames. This matching process aims to find corresponding features that represent the same object
but have moved due to translation.
3. Motion Estimation: Based on the matched features, the motion of the object (translation) between
frames can be estimated. This estimation can be simple, such as computing the average
displacement of matched feature points, or more complex, involving techniques like optical flow
estimation.
4. State Update: The estimated motion parameters are used to update the state of the object's position
in the current frame. This updated state is then used as the initial guess for feature extraction and
matching in the next frame.
5. Iterative Refinement: The process is often iterative, with the feature extraction, matching, motion
estimation, and state update steps repeated for each consecutive frame to track the object's
translation over time accurately.
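A minimal sketch of steps 2 and 3 above, assuming a single grayscale template and a brute-force sum-of-squared-differences (SSD) search over the whole frame. This is a deliberately naive stand-in for real feature matching or optical flow, which would restrict the search window:

```python
# Sketch: estimate the translation of a template between two frames by
# exhaustive SSD matching. Toy single-channel images; illustrative only.
import numpy as np

def match_translation(prev_frame, curr_frame, top, left, h, w):
    """Return (dy, dx) minimising SSD between the template and curr_frame."""
    template = prev_frame[top:top + h, left:left + w]
    best, best_ssd = (0, 0), np.inf
    H, W = curr_frame.shape
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            ssd = np.sum((curr_frame[y:y + h, x:x + w] - template) ** 2)
            if ssd < best_ssd:
                best_ssd, best = ssd, (y - top, x - left)
    return best

rng = np.random.default_rng(0)
prev_frame = rng.random((20, 20))
curr_frame = np.roll(prev_frame, shift=(2, 3), axis=(0, 1))  # "object" moved by (2, 3)
print(match_translation(prev_frame, curr_frame, top=5, left=5, h=6, w=6))  # (2, 3)
```

The returned displacement updates the object's state (step 4), and the shifted window becomes the template region for the next frame (step 5).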
Introduction
• The Kalman Filter has inputs and outputs. The inputs are noisy and sometimes inaccurate
measurements.
• The outputs are less noisy and sometimes more accurate estimates.
• The estimates can be system state parameters that were not measured or observed.
• Think of the Kalman Filter as an algorithm that can estimate both observable and unobservable
parameters with great accuracy in real time.
Applications

• Object Tracking – Use the measured position of an object to more accurately estimate the position
and velocity of that object.
• Body Weight Estimate on Digital Scale – Use the measured pressure on a surface to estimate the
weight of the object on that surface.
• Guidance, Navigation, and Control – Use Inertial Measurement Unit (IMU) sensors to estimate an
object's location, velocity, and acceleration, and use those estimates to control the object's next
moves.
Scenarios
• Imagine:
• Viewing a small bird flying through a forest
• Tracking a missile given a blip every few seconds
• Tracking planets, given intermittent observations
• In each case:
• The observations are noisy
• But we can formulate an expectation about the
trajectory
Kalman Filter Algorithm
• The initialization is performed only once, and it
provides two parameters:
• Initial system state x₀,₀
• Initial state variance p₀,₀
• Measurement: The measurement is performed in every
filter cycle, and it provides two parameters:
• Measured system state zₙ
• Measurement variance rₙ
• Filter outputs are:
• System state estimate xₙ,ₙ
• Estimate variance pₙ,ₙ
ESTIMATING THE HEIGHT OF A BUILDING
• Assume that we would like to estimate the height of a
building using an imprecise altimeter.
• The true building height is 50 meters.
• The altimeter measurement error (standard deviation) is 5 meters.
• The ten measurements are: 49.03m, 48.44m, 55.21m, 49.98m, 50.6m,
52.61m, 45.87m, 42.64m, 48.26m, 55.84m.
• The estimated height of the building for initialization
purposes is x₀,₀ = 60 m, with initial variance p₀,₀ = 225 m².
• Now, we shall predict the next state based on the
initialization values.
• Since our system's dynamic model is constant, i.e., the
building doesn't change its height: x₁,₀ = x₀,₀ = 60 m

• The extrapolated estimate variance also doesn't change:
p₁,₀ = p₀,₀ = 225 m²
Iteration 1
• The first measurement is z₁ = 49.03 m
• The measurement variance is r₁ = 25 m² (σ = 5 m)
• Kalman Gain calculation: K₁ = p₁,₀ / (p₁,₀ + r₁) = 225 / (225 + 25) = 0.9
• Estimating the current state:
• x₁,₁ = x₁,₀ + K₁(z₁ − x₁,₀) = 60 + 0.9(49.03 − 60) = 50.13 m
• Update the current estimate variance:
• p₁,₁ = (1 − K₁)p₁,₀ = 22.5 m²
• Since the dynamic model of our system is constant, i.e., the
building doesn't change its height:
• x₂,₁ = x₁,₁ = 50.13 m
• The extrapolated estimate variance also doesn't change:
• p₂,₁ = p₁,₁ = 22.5 m²
Iteration 2
• After a unit time delay, the predicted estimate from the previous
iteration becomes the prior estimate in the current iteration:
• x₂,₁ = 50.13 m
• The extrapolated estimate variance becomes the prior estimate
variance:
• p₂,₁ = 22.5 m²
• The second measurement is z₂ = 48.44 m
• The measurement variance is r₂ = 25 m²
• Kalman Gain calculation: K₂ = p₂,₁ / (p₂,₁ + r₂) = 22.5 / (22.5 + 25) ≈ 0.47
• Estimating the current state:
• x₂,₂ = x₂,₁ + K₂(z₂ − x₂,₁) = 49.33 m
• Update the current estimate variance:
• p₂,₂ = (1 − K₂)p₂,₁ = 11.84 m²
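The two iterations above can be reproduced in a few lines of Python. The function below implements the same gain, state-update, and variance-update equations; because the dynamic model is static (the building's height does not change), the prediction step leaves x and p unchanged:

```python
# Minimal one-dimensional Kalman filter for a constant (static) state,
# following the building-height numbers in the slides.
def kalman_1d(x, p, measurements, r):
    """x, p: initial state estimate and variance; r: measurement variance."""
    for z in measurements:
        K = p / (p + r)         # Kalman gain
        x = x + K * (z - x)     # state update toward the measurement
        p = (1 - K) * p         # variance update (estimate gets more certain)
        # Static dynamic model: predicted x and p for the next cycle are unchanged.
    return x, p

x, p = kalman_1d(x=60.0, p=225.0, measurements=[49.03, 48.44], r=25.0)
print(round(x, 2), round(p, 2))  # 49.33 11.84
```

Feeding in all ten measurements instead of the first two continues the iteration and drives the estimate toward the true 50 m height.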
Next Iterations
TRACKING LINEAR DYNAMICAL MODELS WITH KALMAN FILTERS
Tracking linear dynamical models with Kalman filters is a common technique in signal
processing and control theory. The Kalman filter is an optimal recursive algorithm that estimates
the state of a linear dynamic system from a series of noisy measurements. Here is how it works:
1. State Space Model: The system is represented as a set of linear equations describing its
evolution over time. This includes a state transition equation that predicts the next state based on
the current state and a measurement equation that relates the measurements to the underlying
state.
2. Prediction Step: Based on the current state estimate and the system dynamics, the Kalman filter
predicts the next state of the system.
3. Update Step: The predicted state is then compared to the actual measurement. The Kalman
filter updates the state estimate by combining the prediction with the measurement, taking into
account the uncertainty associated with both.
4. Covariance Update: The covariance matrix is also updated in each step to reflect the
uncertainty of the state estimate.
• By iteratively performing prediction and update steps, the Kalman filter provides an optimal
estimate of the system state, even in the presence of noisy measurements and uncertainty in the
system dynamics.
• This technique is widely used in various applications such as navigation systems, tracking
objects in video sequences, and sensor fusion in robotics. It's particularly effective when the
underlying system can be modeled as a linear dynamical system and the noise in the
measurements and dynamics can be characterized accurately.
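The predict/update cycle above can be sketched for a small linear dynamical model. The one-dimensional constant-velocity system below (state = [position, velocity]) and the F, H, Q, R values are illustrative assumptions, not values from the slides:

```python
# Sketch of the Kalman filter predict/update cycle for a linear model.
import numpy as np

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition (constant velocity)
H = np.array([[1.0, 0.0]])              # measurement matrix: we observe position only
Q = 0.01 * np.eye(2)                    # process noise covariance (assumed)
R = np.array([[4.0]])                   # measurement noise covariance (assumed)

def predict(x, P):
    """Step 2: extrapolate state and covariance with the dynamics."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Steps 3-4: blend the prediction with the measurement."""
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.array([0.0, 0.0]), 100.0 * np.eye(2)   # uncertain initial state
for z in [1.1, 2.0, 2.9, 4.2]:                   # noisy positions of a unit-speed target
    x, P = predict(x, P)
    x, P = update(x, P, np.array([z]))
print(np.round(x, 2))  # final [position, velocity] estimate, close to the true motion
```

Even though only position is measured, the filter infers the velocity (an unobserved state parameter) from the sequence of measurements, which is exactly the point made in the Introduction.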
Several key equations and concepts
EXAMPLE - AIRPLANE WITHOUT CONTROL INPUT
• We define the State Extrapolation Equation for
an airplane, assuming a constant-acceleration
model.
• There is no control input: uₙ = 0
• Consider an airplane moving in three-
dimensional space with constant acceleration.
The state vector x̂ₙ describes the estimated
airplane position, velocity, and acceleration
in a Cartesian coordinate system (x, y, z).
The State Transition Matrix F
State Extrapolation Equation
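The matrix from this slide is not reproduced in the text, but for the constant-acceleration model described above the standard per-axis transition block can be built as follows (dt = 1 s is an assumed sample time):

```python
# State transition matrix for a 3-D constant-acceleration model.
# State per axis: position, velocity, acceleration; 9x9 overall.
import numpy as np

dt = 1.0                                      # assumed sample time (seconds)
F_axis = np.array([[1.0, dt, 0.5 * dt**2],    # position <- position + v*dt + 0.5*a*dt^2
                   [0.0, 1.0, dt],            # velocity <- velocity + a*dt
                   [0.0, 0.0, 1.0]])          # acceleration stays constant
F = np.kron(np.eye(3), F_axis)                # block-diagonal: one 3x3 block per axis

# State extrapolation with no control input: x(n+1, n) = F @ x(n, n)
x = np.array([0.0, 10.0, 2.0,                 # x-axis: position, velocity, acceleration
              0.0,  0.0, 0.0,                 # y-axis
              0.0,  0.0, 0.0])                # z-axis
x_pred = F @ x
print(x_pred[:3])  # x-axis: position 11, velocity 12, acceleration 2
```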
AIRPLANE WITH CONTROL INPUT
• Now we have a sensor connected to the pilot's controls,
so we have additional information about the airplane's
acceleration based on the pilot's commands.
• The state vector x̂ₙ that describes the estimated
airplane position and velocity in a Cartesian
coordinate system (x, y, z) is:
Control vector
• The control vector uₙ that describes the
measured airplane acceleration in a Cartesian
coordinate system (x, y, z) is:
State Extrapolation Equation
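As a sketch, the extrapolation with a control input takes the form xₙ₊₁,ₙ = F xₙ,ₙ + G uₙ, where G maps the measured acceleration into the position/velocity state. The dt value and the example numbers below are illustrative assumptions:

```python
# State extrapolation with a control input for a 3-D position/velocity state.
import numpy as np

dt = 1.0                                  # assumed sample time (seconds)
F_axis = np.array([[1.0, dt],             # position <- position + v*dt
                   [0.0, 1.0]])           # velocity <- velocity
G_axis = np.array([[0.5 * dt**2],         # acceleration's effect on position
                   [dt]])                 # acceleration's effect on velocity
F = np.kron(np.eye(3), F_axis)            # 6x6 transition matrix for (x, y, z)
G = np.kron(np.eye(3), G_axis)            # 6x3 control matrix

x = np.array([0.0, 10.0, 0.0, 0.0, 0.0, 0.0])   # moving along x at 10 m/s
u = np.array([2.0, 0.0, 0.0])                   # measured acceleration (control vector)
x_pred = F @ x + G @ u
print(x_pred[:2])  # x-axis: position 11, velocity 12
```

Compared with the no-control example, the acceleration has moved out of the state vector and into the control vector uₙ, because it is now measured directly from the pilot's commands.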
Data association
• In tracking, data association refers to the process of linking detected objects or
features across multiple frames to maintain their identities over time. It involves
associating objects or features detected in one frame with those detected in
subsequent frames, despite challenges such as occlusions, appearance changes,
and noise.
• Here's how data association works in tracking:

1. Detection or Feature Extraction: The process begins with detecting objects or
extracting features of interest in each frame of a video sequence. This could
involve techniques like object detection, keypoint extraction, or segmentation.
2. Feature Matching: Once objects or features are detected or extracted in
consecutive frames, the next step is to match corresponding features between
frames. This matching process aims to establish associations between features that
represent the same physical object. Various techniques can be used for feature
matching, including nearest neighbor matching, geometric constraints, and
appearance-based matching using descriptors.
3. Data Association Algorithms: Data association algorithms are used to link
detected objects or features across frames based on the feature matches. These
algorithms typically consider factors such as motion dynamics, appearance
consistency, and spatial constraints to determine the most likely
correspondences between objects or features.

4. State Estimation: Once correspondences are established, state estimation
techniques, such as Kalman filters, particle filters, or graph-based
approaches, can be used to estimate the state of tracked objects over time.
These techniques incorporate information from multiple frames and handle
uncertainties to provide robust estimates of object trajectories.

5. Validation and Correction: Data association algorithms often include
mechanisms for validation and correction to handle mismatches, outliers, and
occlusions. This may involve re-detecting objects or re-matching features in
problematic regions and updating the association accordingly.
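The association step (3 above) can be sketched with a greedy nearest-neighbour rule plus a distance gate. The 2-D point tracks, the gate value, and the greedy strategy are all simplifying assumptions; practical systems use global assignment and motion-predicted positions instead:

```python
# Sketch: greedy nearest-neighbour data association with a distance gate.
# Assumption: tracks and detections are 2-D points. Unmatched detections
# would start new tracks; unmatched tracks are carried forward as gaps.
import math

def associate(tracks, detections, gate=5.0):
    """Greedily pair each track with its nearest unused detection."""
    pairs, used = {}, set()
    for t_id, t_pos in tracks.items():
        candidates = [(math.dist(t_pos, d), j) for j, d in enumerate(detections)
                      if j not in used]
        if not candidates:
            continue
        dist, j = min(candidates)
        if dist <= gate:                # gating: reject implausible matches
            pairs[t_id] = j
            used.add(j)
    return pairs

tracks = {1: (10.0, 10.0), 2: (40.0, 40.0)}
detections = [(41.0, 39.0), (11.0, 10.5), (90.0, 90.0)]
print(associate(tracks, detections))  # {1: 1, 2: 0}
```

Detection 2 at (90, 90) falls outside every gate and is left unassociated, so it would spawn a new track; the gated matches feed the state estimation step (4 above).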
