Camera Behaviour
With no input from the mouse, the camera remains at its
current orientation:
The green line denotes the camera's local Y axis, the red line denotes the
camera's local X axis, and the blue line denotes the camera's local
Z axis.
If the user moves the mouse left, the camera will rotate around the
global
Y axis and appear to turn left:
If the user moves the mouse right, the camera will rotate around the
global
Y axis and appear to turn right:
If the user pushes the mouse away,
the camera will rotate around its own local X axis and appear to turn
upwards:
If the user pulls the mouse towards,
the camera will rotate around its own local X axis and appear to turn
downwards:
The choice of whether towards and
away
mean "look down" and "look up",
or "look up" and "look down", respectively, is a matter
of personal taste. Most games and simulations provide an option to
invert the Y axis for mouse control, so that moving the mouse
away
results in the camera turning
downwards, and so on.
With no input from the keyboard, the camera remains at its
current position:
If the user presses whatever key is assigned to
right,
the camera moves towards positive infinity along its own local
X axis at a configurable rate:
If the user presses whatever key is assigned to
left,
the camera moves towards negative infinity along its own local
X axis at a configurable rate:
Note that movement occurs along the
local X
axis; if the camera has been
rotated
around
the global Y axis, then the local X axis has been transformed as a result,
and movement will occur along a different trajectory than in the unrotated
case:
If the user presses whatever key is assigned to
forward,
the camera moves towards negative infinity along its own local
Z axis at a configurable rate:
Whether
forward is considered to be towards
positive
or
negative
infinity on the Z axis is more or less a property of the coordinate system
used by the rendering system. Systems such as
OpenGL
traditionally
use a
right-handed coordinate system, with
forward
pointing towards negative infinity. Systems
such as
Direct3D
traditionally use a
left-handed coordinate
system, with
forward
pointing towards positive infinity. The
com.io7m.jcamera
package assumes a
right-handed
coordinate system.
As with movement on the
local X axis,
forward/backward movement occurs on the camera's local Z axis and is
therefore affected by
rotation around the Y axis.
Finally, if the user presses whatever key is assigned
to up,
the camera moves towards positive infinity along its local Y axis (with
down
moving the camera towards negative infinity, accordingly):
Note that up and
down
movement occurs on the local Y axis and is therefore affected by the
current orientation
of the camera:
All other movement is prohibited. The camera cannot, for example, rotate
around its own local Z axis (the roll rotation,
in aircraft terminology).
The rest of this section attempts to give a mathematical description
of a camera system that implements the above behaviour, and describes
the design and implementation of the camera system derived from the
description as it exists in the
com.io7m.jcamera
package.
Camera Mathematics
An fps-style camera can be represented
as a 3-tuple (p, h, v), where
p
is the position of the camera,
h
is an angle around the
camera's local X axis in radians, and
v
is an angle around the
global Y axis in radians. In order to implement forward/backward and
left/right movement (and to derive a final
view matrix
so that the camera
can be used to produce a viewing transform for 3D graphics), it's
necessary to derive a 3-tuple of orthonormal
direction vectors
(forward, right, up)
from the angles h and
v.
Given the standard trigonometric functions:
It's possible to calculate the three components of the
forward
vector by assigning
pairs of axes to the unit circle and using three equations:
Note that the sign of the right-hand side of the last equation
is inverted in order to take into account the fact that the
viewing direction is towards negative Z.
In most mathematics texts, a positive rotation around an axis
represents a counter-clockwise rotation when viewing the system along
the negative direction of the axis in question. Adhering to this
convention, the equations for calculating the
right
vector are identical
except for the fact that the equations work with a value of
v - (π / 2)
instead of
v
(a clockwise rotation
of 90°).
Finally, calculating the
up
vector is simply a matter of calculating the cross product
cross (right, forward).
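The derivation described above can be sketched in the style of the package's
Haskell listings. This is a reconstruction, not the contents of the actual
listings: the names are illustrative, and the right vector is computed here
with h taken as 0 so that it remains horizontal and the basis stays
orthonormal.

```haskell
-- A reconstruction of the (forward, right, up) derivation.
data V3 = V3 Double Double Double
  deriving Show

-- Forward vector from h (angle around the local X axis) and
-- v (angle around the global Y axis). The sign of the Z component
-- is inverted because the viewing direction is towards negative Z.
forward :: Double -> Double -> V3
forward h v = V3 (cos h * cos v) (sin h) (negate (cos h * sin v))

-- The right vector uses v - (pi / 2), a clockwise rotation of 90°;
-- h is taken as 0 here so that the vector remains horizontal.
right :: Double -> V3
right v = V3 (cos vr) 0 (negate (sin vr))
  where vr = v - (pi / 2)

-- The up vector is the cross product of the right and forward vectors.
up :: V3 -> V3 -> V3
up (V3 rx ry rz) (V3 fx fy fz) =
  V3 (ry * fz - rz * fy) (rz * fx - rx * fz) (rx * fy - ry * fx)
```

With the default orientation h = 0 and v = π / 2 given later in this section,
these equations yield forward = (0, 0, -1), right = (1, 0, 0), and
up = (0, 1, 0) (up to floating point error), matching the conventional
OpenGL axes.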
The com.io7m.jcamera package
assumes that a camera with no rotation or translation applied is
placed at the origin position
p = (0, 0, 0)
with h = 0 and
v = π / 2. The reason for the
value of v is that in most
mathematics texts, an angle of
0
radians is illustrated as pointing to the right:
In a typical OpenGL configuration, the viewer is placed at the
origin looking towards negative infinity on the Z axis, and the X
axis appears to run horizontally, perpendicular to the viewing
direction. Given this convention, it's somewhat intuitive to map
those axes to the unit circle as follows (assuming a second observer
looking down onto the scene towards negative infinity on the Y axis):
Using this convention means that the values derived from the vector
equations above can be used directly to compute a
view matrix
in the coordinate
system conventionally used by OpenGL.
As a concrete example, using the default position and orientation
given above, the resulting vectors are calculated as
[
ExampleDefaultVectors.hs]:
The resulting forward,
right, and
up
vectors are consistent with the
Z,
X,
and Y axes typically used in
OpenGL.
With the
forward and
right
vectors calculated, it is
now trivial to derive forward/backward and left/right movement. Forward
movement by
d units is simply a
positive translation of the camera position
p
along the
forward
vector by
d units
[
Forward.hs]:
A backward movement is simply the same equation with a negative
d
distance:
Moving right is a positive translation of the camera position
p
along the
right
vector by d units:
Moving left is simply the same equation with a negative
d
distance:
Moving up is a positive translation of the camera position
p
along the
up
vector by d units:
Moving down is simply the same equation with a negative
d
distance:
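The six movements described above amount to one translation function applied
with different basis vectors and signs; a sketch, with illustrative names:

```haskell
-- A sketch of the movement equations: each movement translates the
-- position p along one of the basis vectors by d units; the opposite
-- direction is the same equation with a negated d.
data V3 = V3 Double Double Double
  deriving Show

translate :: V3 -> V3 -> Double -> V3
translate (V3 px py pz) (V3 bx by bz) d =
  V3 (px + bx * d) (py + by * d) (pz + bz * d)

-- Forward: translate p forward d;  backward: translate p forward (negate d)
-- Right:   translate p right d;    left:     translate p right (negate d)
-- Up:      translate p up d;       down:     translate p up (negate d)
```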
The
right,
up, and
forward
vectors form an orthonormal
basis for a coordinate system. In practical terms, they provide the
rotational component for a combined rotation and translation that can
be used to transform arbitrary coordinates given in
world space
to
eye space
(also known as
view space). This is what allows the
camera system to actually be used as a camera in 3D simulations. A
matrix that rotates vectors according to the calculated camera vectors
is given by
[
ViewRotation.hs]:
A matrix that translates vectors according to the current camera
position is given by
[
ViewTranslation.hs]:
The matrices are multiplied together, resulting in
[
View.hs]:
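The construction can be sketched as follows. This is a reconstruction in the
conventional gluLookAt-style form, not the package's actual listings; the
names and the row-major layout are illustrative.

```haskell
-- A sketch of the view matrix construction, with 4x4 matrices
-- written row-major as lists of rows. The rotation part has the
-- right, up, and negated forward vectors as rows; the translation
-- part moves the world by the negated camera position.
data V3 = V3 Double Double Double

type M4 = [[Double]]

viewRotation :: V3 -> V3 -> V3 -> M4
viewRotation (V3 rx ry rz) (V3 ux uy uz) (V3 fx fy fz) =
  [ [rx, ry, rz, 0]
  , [ux, uy, uz, 0]
  , [negate fx, negate fy, negate fz, 0]
  , [0, 0, 0, 1] ]

viewTranslation :: V3 -> M4
viewTranslation (V3 px py pz) =
  [ [1, 0, 0, negate px]
  , [0, 1, 0, negate py]
  , [0, 0, 1, negate pz]
  , [0, 0, 0, 1] ]

-- The final view matrix is the product viewRotation * viewTranslation.
```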
Input
In the
com.io7m.jcamera package,
an
input is a simple abstraction intended
to keep
integrators
insulated from the platform-specific details of keyboard and mouse input.
With the
behaviour
described in the first subsection, there are two types of input:
Discrete
input (where the user presses
a key and the input is assumed to be constant until the key is released)
and
continuous input (where the user
moves a mouse and a stream of new mouse position vectors are generated).
Discrete input can be represented by a simple boolean flag, and continuous
input can be represented by summing the received input until an
integrator is ready to receive it.
An
input in the
com.io7m.jcamera
package is
represented by the following data structure
[
Input.hs]:
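The structure can be sketched as follows; the field names here are
illustrative, not those of the actual listing.

```haskell
-- A sketch of the input structure: a boolean flag per discrete
-- movement key, plus the summed rotation coefficients.
data Input = Input
  { forwardPressed  :: Bool    -- the "forward" key is held
  , backwardPressed :: Bool    -- the "backward" key is held
  , leftPressed     :: Bool    -- the "left" key is held
  , rightPressed    :: Bool    -- the "right" key is held
  , upPressed       :: Bool    -- the "up" key is held
  , downPressed     :: Bool    -- the "down" key is held
  , rotationX       :: Double  -- summed rotation coefficient rx
  , rotationY       :: Double  -- summed rotation coefficient ry
  } deriving Show
```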
When the user presses whatever key is assigned to
up, the corresponding boolean field in
the data structure is set to true. When
the user releases the key, the corresponding field is set to
false.
The situation for mouse movement is slightly more complex. Most
OS-specific
windowing systems will provide the user with the current mouse cursor
coordinates
as a pair of integer offsets (in pixels) relative to some origin. Some
systems
have the origin (0, 0) at the
top-left corner of the
screen/window, whilst others have it at the bottom-left corner of the
window.
Additionally, the density of displays is increasing at a steady rate. A
monitor
manufactured five years ago may be 40cm wide and have a resolution that
fits
1440 pixels into that width. A modern display may be the same width but
have
over four times as many pixels in the same space. A camera system that
recklessly consumes coordinates given in pixels is going to behave
differently
on a screen that has a higher density of pixels than it would on an older,
lower
resolution display.
In order for the com.io7m.jcamera package
to remain system-independent, it's necessary to provide a way to map mouse
input
to a simple and consistent set of generic
rotation coefficients
that can be consumed by an
integrator. The rotation coefficients are a pair of values
(rx, ry)
expressing the intention to rotate
the camera, with
rx
affecting rotation around the camera's vertical axis, and
ry
affecting rotation around the camera's horizontal axis. In effect, when
rx == -1.0, the camera should appear to rotate right.
When rx == 1.0,
the camera should appear to rotate left.
When
ry == 1.0, the camera should appear
to rotate
up. When ry ==
-1.0,
the camera should appear to rotate down.
The
coefficients linearly express fractional rotation, so a rotation of
0.5
is exactly half as much rotation as
1.0.
The scheme used to map screen positions to coefficients is as follows:
In order to actually map screen positions to rotation coefficients, it's
necessary
to take into account the windowing-system-specific origin. It's necessary
to define
a function that takes a
mouse region representing
the width and height of the screen with information labelling the origin,
and a pair
of screen/window-space coordinates
(sx,
sy), and
returns a pair of rotation coefficients
[
MouseRegion.hs]:
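One possible form of such a function is sketched below. The Origin type and
the exact signs are assumptions based on the coefficient semantics described
in the text; the actual listing may differ.

```haskell
-- A sketch of a mapping from screen coordinates to rotation
-- coefficients, relative to the center of the mouse region.
data Origin = TopLeft | BottomLeft

coefficients :: Origin -> Double -> Double -> Double -> Double -> (Double, Double)
coefficients origin width height sx sy = (rx, ry)
  where
    centerX = width / 2
    centerY = height / 2
    -- A cursor right of center yields a negative rx (rotate right);
    -- left of center yields a positive rx (rotate left).
    rx = (centerX - sx) / width
    -- The sign of ry depends on the windowing system's origin: with
    -- a top-left origin, a cursor above center (smaller sy) should
    -- produce a positive ry (rotate up).
    ry = case origin of
           TopLeft    -> (centerY - sy) / height
           BottomLeft -> (sy - centerY) / height
```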
The assumption here is that the mouse cursor will be
warped
back to the center of the screen at periodic
intervals. If this did not occur, the mouse cursor would eventually reach
one or
more edges of the screen and would be unable to travel further, halting
any rotation
in those directions.
In
event-based windowing systems, every
time the
user moves the mouse, a
mouse event is
generated
containing the current cursor position. In some systems, the user must
explicitly
ask for the current mouse position when it is needed. In the former case,
new
rotation coefficients will be generated repeatedly. In the latter case,
the
user will typically ask for the current mouse position at the beginning of
rendering the current simulation frame, and therefore will only receive a
single
set of coefficients (effectively representing the furthest distance that
the mouse
travelled during that time period). In the
com.io7m.jcamera
package, an
integrator
will
read (and reset to
(0.0, 0.0))
the current rotation coefficients from an input at a (typically) fixed
rate. The current rotation coefficients stored in an input therefore
represent the sum of mouse movements for a given elapsed time period. To
this
end, the
JCameraFPSStyleInput
type in the
com.io7m.jcamera package
provides
an interface where the user simply submits new rotation coefficients each
time
they are received, and the type keeps a running total of the coefficients.
This
allows the input system to work the same way regardless of whether the
user
has to ask for mouse input, or is receiving it piecemeal via some event
system.
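The running-total behaviour can be sketched as a pair of pure functions. This
is illustrative only; the actual JCameraFPSStyleInput type is a mutable Java
class rather than a pair of functions.

```haskell
-- A sketch of the running total: each submission of new coefficients
-- is added to the stored pair, and an integrator's read returns the
-- total and resets it to (0.0, 0.0).
addRotation :: (Double, Double) -> (Double, Double) -> (Double, Double)
addRotation (rx, ry) (totalX, totalY) = (totalX + rx, totalY + ry)

takeRotation :: (Double, Double) -> ((Double, Double), (Double, Double))
takeRotation total = (total, (0.0, 0.0))
```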
By taking the width and height of the screen in pixels, and dividing as
shown in the above equations, the resulting coefficients are
screen-density independent. In other words,
if the user moves the cursor halfway across the screen on a very high
density display, the resulting coefficients are the same as those
resulting
from a user moving the cursor across the same distance on a much lower
density display, even though the distances expressed in pixels are very
different.
Linear Integrators
A linear integrator updates the position
of a camera over time.
In physics, the first derivative of
position
with respect to
time is
velocity. The second derivative of
position with respect to time is
acceleration.
Newton's second law of motion relates force
f
with mass
m and acceleration
a
[
SecondLaw.hs]:
However, if m is assumed to
be 1,
a = (1 / 1) * f = f. So, rather than
assign mass
to a camera and try to apply forces, it's possible to simply apply
acceleration
as a (configurable) constant term directly. Linear integrators in the
com.io7m.jcamera
package are
represented as 8-tuples
(a, c, d, i, ms, sf, sr, su)
where:
The meaning of units mentioned above is
application-specific. An application might choose to map units to meters,
or miles, or any other arbitrary measure of distance.
As mentioned, an integrator makes changes to the position and orientation
of a camera over a given
delta time period.
In most simulations, the camera will be updated at a fixed rate of
something
approaching
60 times per second. The
delta
time in this case would be given by
delta = 1.0 / 60.0 = 0.0166666....
The
integrator calculates a speed for each of the three
(right, up, forward)
axes in turn based
on the current linear acceleration/deceleration values, and the data from
the associated
input, and
tells the associated camera to move based on the resulting speeds.
For the
forward axis, the integrator
calculates a forward speed
sfr based
on the previous forward speed
sf, the
state of the input
i, the
acceleration
a, and the drag factor
d, and increases the camera position
by
sfr
units along the
forward axis. The
forward speed is clamped to the configurable range
[-ms, ms].
Specifically, the procedure is given by
[
IntegratorForward.hs]:
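One plausible form of the procedure is sketched below. This is a
reconstruction, not the actual listing; in particular, the ordering of the
clamping and drag steps here is an assumption.

```haskell
-- A sketch of forward-axis integration. The new speed sfr is the
-- previous speed sf plus acceleration selected by the input, clamped
-- to [-ms, ms], with drag applied; the camera then moves sfr units
-- along the forward axis.
integrateForward
  :: Double  -- sf: previous forward speed
  -> Bool    -- is the "forward" key pressed?
  -> Bool    -- is the "backward" key pressed?
  -> Double  -- a: acceleration
  -> Double  -- d: drag factor
  -> Double  -- ms: maximum speed
  -> Double  -- delta: elapsed time in seconds
  -> Double  -- sfr: the new forward speed
integrateForward sf fwd bwd a d ms delta = sfr
  where
    accelerated
      | fwd       = sf + (a * delta)
      | bwd       = sf - (a * delta)
      | otherwise = sf
    clamped = max (negate ms) (min ms accelerated)
    sfr     = clamped * (d ** delta)
```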
The drag factor is a configurable value
that specifies how the camera will slow down over time. Ideally, when the
user is not telling the camera to move, the camera is either stationary
or on its way to becoming stationary. A drag factor
d
will result in a speed
s'
by
s' = s * (d ** delta). Intuitively,
the drag factor can be seen as the fraction of the original speed that
will remain after one second of not receiving any acceleration. If
d = 0.0, any object not having
acceleration applied will immediately stop. If
d = 1.0, an object will continue
moving indefinitely. A drag factor of 0.0 will
also imply an overall movement speed penalty due to the way integration is
performed. Usually, a drag factor of
0.0
is a bad idea; values closer to
0.0001
give the same abrupt behaviour but with slightly smoother results and less
of a movement speed penalty.
Angular Integrators
An angular integrator updates the
orientation
of a camera over time.
Integration of orientation occurs in almost exactly the same manner as
integration of
position;
orientation is treated as a pair of scalar rotations around two axes, and
the
rotation values are increased by speed values calculated from acceleration
values for each axis. Integration of rotations around the vertical axis is
given by
[
IntegratorAngularVertical.hs]:
Note that the acceleration around the axis is multiplied by the
rotation
coefficients
taken from the input.
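A sketch of the vertical-axis step, mirroring the linear case; the names and
the ordering of the clamping and drag steps are assumptions, not the actual
listing.

```haskell
-- A sketch of vertical-axis integration: the angular acceleration
-- is scaled by the rx coefficient taken from the input, and the
-- resulting speed is clamped and dragged as in the linear case.
integrateAngularVertical
  :: Double  -- sv: previous rotation speed around the vertical axis
  -> Double  -- rx: rotation coefficient read from the input
  -> Double  -- a: angular acceleration
  -> Double  -- d: drag factor
  -> Double  -- ms: maximum rotation speed
  -> Double  -- delta: elapsed time in seconds
  -> Double  -- new rotation speed; the angle v increases by this
integrateAngularVertical sv rx a d ms delta =
  let clamped = max (negate ms) (min ms (sv + (a * rx * delta)))
  in clamped * (d ** delta)
```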
Rotation around the horizontal axis is identical, except that the actual
camera itself may clamp rotations around
the horizontal axis. The reason for this is simple: If rotations are not
clamped, and the user rotates the camera upwards or downwards, there comes
a point where the camera's rotation value wraps around and the camera
begins
to rotate in the opposite direction, as illustrated:
The practical result of the above wrapping is that the user would, for
example,
be rotating the camera up towards the ceiling, the camera would reach the
limit
of rotation, and suddenly the camera would be facing the opposite
direction
and rotating down towards the floor again. This behaviour would be
irritating,
so cameras may optionally clamp rotations
and are required to indicate when clamping occurs so that the integrator
can
zero the speed of rotation around that axis. The reason for the zeroing of
the rotation speed is that if the speed were not zeroed, and the rotation
around the axis was proceeding at, say,
100
radians per second, the user would have to cause the rotation to decrease
by over 100 radians per second in the
opposite direction in order to get the camera to rotate at all. In effect,
the camera would appear to reach the limit of rotation, stop, and then the
user would have to scrub the mouse repeatedly in the opposite direction
in order to get rotation to begin again in the opposite direction.
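The clamping behaviour can be sketched as follows, with illustrative names;
the actual cameras expose this differently.

```haskell
-- A sketch of optional horizontal-axis clamping: the camera clamps
-- the proposed angle h to a configured range and reports whether
-- clamping occurred, so that the integrator can zero the rotation
-- speed for that axis.
clampHorizontal :: Double -> Double -> Double -> (Double, Bool)
clampHorizontal minimumH maximumH h
  | h < minimumH = (minimumH, True)
  | h > maximumH = (maximumH, True)
  | otherwise    = (h, False)

-- In the integrator, a True result zeroes the stored speed, so that
-- rotation can resume immediately when the user reverses direction:
--   (h', clamped) = clampHorizontal minimumH maximumH (h + shr)
--   sh'           = if clamped then 0.0 else shr
```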