
## iOS Augmented Reality with some trigonometry

### Augmented Reality and Points Of Interest

Augmented Reality (AR) extends the user's experience of the world around them with computer-generated objects. Its goal is to add information and meaning to a real object or place by introducing virtual objects into the camera view.

There are several concepts a developer must grasp when preparing to write an Augmented Reality app: OpenGL, translations, rotations, scaling, etc. To understand basic OpenGL principles, some working knowledge of analytic geometry is needed. In this post I will describe these concepts as they are used in our demo app, which was presented in my first article on AR and iOS.

The goal is simple: we want to display the user's Points Of Interest (POI) when they point the device in any direction. Of course, this must happen in real time, and POIs must be placed intuitively, indicating their real location and other properties. POIs will be displayed as simple colored circles, since we don't want to complicate things with a third-party graphics library - just simple custom controls and some mathematics. I already posted sample code with instructions for creating custom controls with MonoTouch. In that post I showed how to create a custom control with button functionality that has a clear background and a circle in the middle. We want to place this control on the screen over the POI location.

### Basic Trigonometry

To draw a control on the screen, we need to calculate screen coordinates for each POI. To determine which POI should be made visible and displayed on the screen - relative to the direction in which the device is pointed - we will observe the slope (or gradient) of a line.

As Wikipedia says, "slope is normally described by the ratio of the "rise" divided by the "run" between two points on a line." Using the slope of a line defined by two points - the location of our device and the location of each POI - we will calculate the inclination in radians. Here are more details on slope, inclination and other important concepts.
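In symbols, for the device at one point and a POI at another, the slope and the corresponding inclination angle look like this:

```latex
% Slope of the line through the device location (x_1, y_1)
% and the POI location (x_2, y_2):
m = \frac{\text{rise}}{\text{run}} = \frac{y_2 - y_1}{x_2 - x_1}

% Inclination: the angle (in radians) between the line and the x-axis
\theta = \arctan(m), \qquad \theta \in \left(-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right)
```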

### Input Data

We will use latitude, longitude and altitude as our location input data, along with yaw and roll values, to determine the device orientation relative to the ground. Yaw and roll (and also pitch) are Euler angles provided by iOS to represent the device orientation (attitude) around its center. This functionality is part of the CoreMotion framework, which is available in iOS 4.0 and above, and all available iPhone sensors are combined to give the most accurate value. Here you can see how data from the sensors can be collected.
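As a rough sketch of how such attitude data can be collected with CoreMotion in MonoTouch (names like `AttitudeReader` and the 30 Hz update interval are illustrative choices, not taken from the original sample):

```csharp
// Illustrative sketch only: starts CoreMotion attitude updates using the
// corrected-Z-vertical reference frame discussed below. Class and field
// names are placeholders, not from the original sample.
using MonoTouch.CoreMotion;
using MonoTouch.Foundation;

public class AttitudeReader
{
    CMMotionManager motionManager = new CMMotionManager ();
    public double Yaw, Pitch, Roll;

    public void Start ()
    {
        if (!motionManager.DeviceMotionAvailable)
            return;

        motionManager.DeviceMotionUpdateInterval = 1.0 / 30.0; // 30 Hz, an arbitrary choice
        motionManager.StartDeviceMotionUpdates (
            CMAttitudeReferenceFrame.XArbitraryCorrectedZVertical,
            NSOperationQueue.CurrentQueue,
            (motion, error) => {
                if (motion == null)
                    return;
                // Euler angles describing the device attitude
                Yaw = motion.Attitude.Yaw;     // -PI..PI, north at zero
                Pitch = motion.Attitude.Pitch;
                Roll = motion.Attitude.Roll;
            });
    }
}
```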

iOS 5.0 introduces reference frames for attitude samples. We are using yaw data collected with the *CMAttitudeReferenceFrameXArbitraryCorrectedZVertical* frame. This reference frame is rather CPU expensive, but we want the most accurate and most relevant data we can get. It also uses the magnetometer to improve long-term yaw accuracy.

The yaw value ranges from -PI to PI over the full horizontal circle, with north at zero. Be sure to set the heading orientation when the user rotates the device. We want our yaw values to always be positive so we can use them as the inclination in our calculation.

```csharp
// Heading correction
float inclinationX;
if (yaw < 0f) {
    inclinationX = yaw + TwoPI;
} else {
    inclinationX = yaw;
}

// Heading correction
float inclinationY = (float)Math.Abs (roll) - HalfPI;
if (inclinationY <= 0.0)
    inclinationY += TwoPI;
```

MapKit provides *MetersBetweenMapPoints*, which calculates the distance between two *MapPoints*. We can use *MKMapPoint.FromCoordinate* to create our *MapPoints* based on the location data. *MetersBetweenMapPoints* is available in iOS 4.0 and newer versions.

```csharp
var cameraPoint = MKMapPoint.FromCoordinate (new CLLocationCoordinate2D (cameraLatitude, cameraLongitude));
var poiPoint = MKMapPoint.FromCoordinate (new CLLocationCoordinate2D (poiLatitude, poiLongitude));
float distanceAB = (float)MKGeometry.MetersBetweenMapPoints (cameraPoint, poiPoint);
```

### Calculating Coordinates

We are displaying POIs on top of the video preview image, so we need to know which POIs fit into the camera's field of view. I found the exact values for the iPhone camera's field of view here: http://stackoverflow.com/questions/3594199/iphone-4-camera-specifications-field-of-view-vertical-horizontal-angle.

Armed with this knowledge, we can calculate the coefficients for the x and y coordinates. Using the law of cosines, and knowing the inclination of the line for x and y, we can calculate the distance between two points on the unit circle. One point on the unit circle represents the inclination of our POI, while the other represents the inclination of the device relative to the camera heading.
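Concretely, for two points on the unit circle at angles θ₁ and θ₂, the law of cosines with both radii equal to 1 reduces to the chord length that appears in the code below:

```latex
% Law of cosines for the chord between two points on a circle of radius r,
% separated by the angle (\theta_1 - \theta_2):
d = \sqrt{r^2 + r^2 - 2 r^2 \cos(\theta_1 - \theta_2)}
  = \sqrt{2 - 2\cos(\theta_1 - \theta_2)} \qquad (r = 1)
```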

```csharp
#region Coordinate X
// Heading correction
float inclinationX;
if (yaw < 0f) {
    inclinationX = yaw + TwoPI;
} else {
    inclinationX = yaw;
}

// Chord length between the two inclinations on the unit circle
float distanceX = (float)Math.Sqrt (2f - 2f * (float)Math.Cos (inclinationX - inclinationXPOI));

if (inclinationX < inclinationXPOI) {
    distanceX = -distanceX;
}
if (inclinationX <= TwoPI && inclinationX >= (3 * HalfPI) && inclinationXPOI >= 0 && inclinationXPOI < HalfPI) {
    distanceX = -distanceX;
}

screenKoefX = distanceX;
screenX = Convert.ToInt32 ((float)viewWidth * screenKoefX);
#endregion
```

```csharp
#region Coordinate Y
float inclinationYPOI = (float)Math.Atan ((poiAltitude - altitude) / distanceAB);
if (inclinationYPOI <= 0.0)
    inclinationYPOI += TwoPI;

// Heading correction
float inclinationY = (float)Math.Abs (roll) - HalfPI;
if (inclinationY <= 0.0)
    inclinationY += TwoPI;

screenKoefY = (float)Math.Sqrt (2f - 2f * (float)Math.Cos (inclinationYPOI - inclinationY));
if (inclinationYPOI < inclinationY) {
    screenKoefY = -screenKoefY;
}
if (inclinationYPOI <= TwoPI && inclinationYPOI >= (3 * HalfPI) && inclinationY >= 0 && inclinationY <= HalfPI) {
    screenKoefY = -screenKoefY;
}
if (inclinationY <= TwoPI && inclinationY >= (3 * HalfPI) && inclinationYPOI >= 0 && inclinationYPOI <= HalfPI) {
    screenKoefY = -screenKoefY;
}
screenY = Convert.ToInt32 (viewHeight * screenKoefY);
#endregion
```

After this, we just need to multiply the width and height of our view by the coefficients, and we have values expressed in the right units - note that they are positive for POIs in the top half of the view and negative in the lower half. We simply add half of the screen height to the y coordinate and half of the screen width to the x coordinate, and our coordinates are ready to be displayed on the screen. Note that this code snippet is valid only for the landscape device orientation, but it should be fairly simple to adapt it to portrait mode as well. Just be careful about the FOV.

```csharp
return new Point (
    Convert.ToInt32 ((viewWidth / 2.0) + screenX),
    Convert.ToInt32 ((viewHeight / 2.0) - screenY)
);
```

You can find more info about iPhone field of view (FOV) here.

What values are you getting for yaw?

Did you set your heading orientation correctly?

This delegate method is invoked as the device heading changes:

```objc
- (void)locationManager:(CLLocationManager *)manager didUpdateHeading:(CLHeading *)newHeading
{
    CMDeviceMotion *currentDeviceMotion = MotionManager.deviceMotion;
    CMAttitude *currentDeviceAttitude = currentDeviceMotion.attitude;

    double Roll = currentDeviceAttitude.roll;
    double Pitch = currentDeviceAttitude.pitch;
    double Yaw = currentDeviceAttitude.yaw;

    double Magnetic_Heading = newHeading.magneticHeading;
    double True_Heading = newHeading.trueHeading;
}
```

The problem is: how do I calculate the value of inclinationXPOI?

This is my code:

```objc
// Distance between user & POI
CLLocationCoordinate2D Coordinate_2 = CLLocationCoordinate2DMake([ObjDeal.Latitude floatValue], [ObjDeal.Longitude floatValue]);
CLLocation *Location_2 = [[CLLocation alloc] initWithCoordinate:Coordinate_2 altitude:1 horizontalAccuracy:1 verticalAccuracy:-1 timestamp:nil];
distanceAB = [Location_2 distanceFromLocation:Location_1];

// Inclination of X
if (Yaw < 0) {
    InclinationX = Yaw + 2 * PI;
} else {
    InclinationX = Yaw;
}

// Inclination of Y
// Note: (1/2)*PI is integer division in C and evaluates to 0;
// floating-point literals are needed here.
InclinationY = fabs(Roll) - (PI / 2.0);
if (InclinationY <= 0) {
    InclinationY = InclinationY + 2 * PI;
}

// Inclination of POI X
InclinationPOIX = ????????????????????????????????????

// Inclination of POI Y
InclinationPOIY = atan(([ObjDeal.Altitude doubleValue] - Altitude) / distanceAB);
if (InclinationPOIY <= 0) {
    InclinationPOIY = InclinationPOIY + 2 * PI;
}

// Screen coefficient X
ScreenKoefX = sqrt(2.0 - 2.0 * cos(InclinationX - InclinationPOIX));
if (InclinationX < InclinationPOIX) {
    ScreenKoefX = -ScreenKoefX;
}
if (InclinationX <= (2 * PI) &&
    InclinationX >= (3.0 * PI / 2.0) &&
    InclinationPOIX >= 0 &&
    InclinationPOIX < (PI / 2.0)) {
    ScreenKoefX = -ScreenKoefX;
}

// Screen coefficient Y
ScreenKoefY = sqrt(2.0 - 2.0 * cos(InclinationPOIY - InclinationY));
if (InclinationPOIY < InclinationY) {
    ScreenKoefY = -ScreenKoefY;
}
if (InclinationPOIY <= (2 * PI) &&
    InclinationPOIY >= (3.0 * PI / 2.0) &&
    InclinationY >= 0 &&
    InclinationY <= (PI / 2.0)) {
    ScreenKoefY = -ScreenKoefY;
}
if (InclinationY <= (2 * PI) &&
    InclinationY >= (3.0 * PI / 2.0) &&
    InclinationPOIY >= 0 &&
    InclinationPOIY <= (PI / 2.0)) {
    ScreenKoefY = -ScreenKoefY;
}

// Screen X
ScreenX = ViewWidth * ScreenKoefX;

// Screen Y
ScreenY = ViewHeight * ScreenKoefY;

// Exact point
Point = CGPointMake((ViewWidth / 2 + ScreenX), (ViewHeight / 2 + ScreenY));
```

Please help me as soon as you can. Thanks!

```csharp
float inclinationXPOI = (float)Math.Atan (rise / run) - HalfPI;

if (run < 0.0)
    inclinationXPOI += OnePI;
if (inclinationXPOI < 0.0)
    inclinationXPOI += TwoPI;
```

You can find the complete method for calculating screen coordinates in the attached sample, in the Helper.cs class. The method is called GetScreenPoint, and the code you are missing is located in lines 46 to 51 of that class.

If other devices are far away - 500 m or more - objects should be displayed accurately, but at smaller distances there can be errors, since GPS sensors return coordinates with a tolerance of about 2 m and the GPS data is intentionally sent with an error from the server.

This code fetches the most accurate sensor values by using the CMAttitudeReferenceFrameXArbitraryCorrectedZVertical constant.

In the http://www.mono-software.com/blog/post/Mono/187/Building-an-augmented-reality-app-for-iPhone-with-MonoTouch/ post you can find a complete application using this code.

As I move the device towards a particular location, the target objects are displayed, which is good. But the problem is that the same objects are also displayed in the opposite direction, or with a PI/2 phase difference.

What do I have to do? Kindly guide me - I am now using the attitude from CMAttitudeReferenceFrameXArbitraryCorrectedZVertical as you said.