From Random Hacks of Kindness
Near-Real Time UAV Imagery Processing: Technical Notes area
Related Work
Implementation Options
Technical Notes
Notes about assumptions
Please feel free to argue with any of these points: they're mainly opinion and experience.
- Video mosaicing algorithms are non-trivial; the good news is that there are several image processing packages around that could help with this. There are also several communities home-building UAVs and investigating this sort of processing, e.g. DIY Drones and Pict'Earth - Spike should be able to come up with some more names.
- You need to be aware of the nature of your data - in this case we're talking about slow-flying UAVs at low altitudes, so the overlap between successive video frames should be good. Theoretically, if you're using full-frame methods, this means that you don't need to process the entire frame of data each time, which reduces the processing time if you're using FFT-based methods. You will also need to be aware of your camera and camera geometry - most camera lenses have distortions (more on this later).
- Although direct projection isn't sufficient to give mosaic-quality overlaps, it can be combined with the mosaic outputs to produce better position estimates: look up SLAM (simultaneous localization and mapping) algorithms for details of this. This will come in very handy if the UAV is surveying an area using something like a ladder search pattern (this has another name that I can't remember at the moment), where each frame will overlap not only with the frames next to it, but also with frames that were produced quite a while earlier.
- Bundle block adjustment relies on the generation of tie points in the images, i.e. object features that can be matched to each other so that the images can be warped to fit. You can generate your own tie points using a person or a robust image segmentation and matching algorithm - beware the gotchas on each of these (a person gets tired, makes mistakes, isn't exact; an algorithm gets sidetracked by shadows and lighting changes; both can get fooled by images with very little structure in them).
- If you can't use bundle block adjustment, you have two main problems: one is that you'll need something else to match up the frames, and two is that you're going to have to deal with camera lens distortions. Alternatives to bundle block adjustment include FFT-based methods - if you use these you'll need to think about whether you'll take the hit of an extra FFT per frame to handle rotation between frames as well as translation, and you'll either need a beefy processor or some cheats, like only matching ('cross-correlating') part of each image. Camera distortion can be tricky if you're building generic software: you need to decide whether to remove the distortion using test images (e.g. checkerboards), work out what the distortion is on the fly, guess, or just ignore it. Most UAV-sized camera lenses have barrel distortion, which makes the edges of a square imaged with them appear to bend outwards a bit: this can make your generated images look a little strange if you overwrite the resultant image with each frame (because by doing this you're only using the most badly warped part of each image).
- Life is a lot easier if your camera is pointing straight down. If it isn't, and you haven't got tie points, you'll need to work out what the camera footprint on the ground is (hint: usually a trapezoid, and for large slant angles the pixel footprints from nearest to furthest edge can be very different: see under orthorectification). You're also likely to get 3D effects at low altitudes, like the all-sided houses you can see on the corner points in some of the Google etc. imagery. On the plus side, this allows you to do some cool 3D image reconstruction things with the images.
- You'll also have to choose how you merge each frame into the resultant image. Most 'simple' algorithms just write each frame over the top of the produced image - this gives you a set of edges that join up into an image. You can also merge frames by averaging the pixels on each produced image point - this can give a better image but takes much more time. And... if you've found something of particular interest in the image, you can use the overlapping frames (and FFT-based image-matching techniques) to do superresolution, i.e. produce an image at up to roughly 5-7 times the resolution of the individual frames. You'll also have a better chance of removing flaws from the individual images, e.g. raindrops on the edges of the lens can really ruin your day. If you assume that the UAV is basically travelling in a straight line, you might try overlapping the centre strip of each frame, to reduce the distortion problems mentioned above.
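As a rough illustration of the FFT-based matching mentioned in the notes above, here is a minimal sketch of phase correlation that recovers a pure translation between two equally sized greyscale frames. It is only a sketch under stated assumptions: the function names (`fft`, `fft2`, `circular_shift`, `phase_correlate`) are made up for this example, frame sizes are assumed to be powers of two, the shift is treated as circular (real frames don't wrap, so you'd normally window or crop them), and rotation is not handled. The hand-rolled FFT is there only to keep the example free of dependencies; in practice you'd more likely reach for a library routine such as `numpy.fft.fft2`.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT of a sequence whose length is a power of two.
    The inverse transform is left unnormalised (no division by n)."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft2(img, inverse=False):
    """2D FFT: transform every row, then every column."""
    rows = [fft(r, inverse) for r in img]
    cols = [fft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def circular_shift(img, dy, dx):
    """Shift an image by (dy, dx) pixels with wrap-around (test helper)."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]

def phase_correlate(a, b):
    """Estimate (dy, dx) such that b is roughly a shifted by (dy, dx)."""
    fa, fb = fft2(a), fft2(b)
    h, w = len(a), len(a[0])
    cross = []
    for row_a, row_b in zip(fa, fb):
        row = []
        for va, vb in zip(row_a, row_b):
            p = vb * complex(va).conjugate()
            mag = abs(p)
            # Keep only the phase; drop coefficients that are effectively zero.
            row.append(p / mag if mag > 1e-12 else 0j)
        cross.append(row)
    corr = fft2(cross, inverse=True)
    # The correlation peak sits at the shift (modulo the frame size).
    _, dy, dx = max((corr[y][x].real, y, x) for y in range(h) for x in range(w))
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx
```

The 'cheat' of matching only part of each image corresponds to passing cropped sub-windows of the two frames to `phase_correlate` instead of the full frames.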
Notes about data
The only platform people who've responded to the RHOK thread so far are YellowPlane. Yellowplane UAVs: http://www.yellowplane.co.uk/ SARbot = "search and rescue robot". Twinstar on rcgroups: http://www.rcgroups.com/forums/showthread.php?t=995999.
Useful UAV data online:
- None found yet. We are looking specifically for imagery from downward-looking cameras. Please add links here if you have them.
Not-so-useful UAV data online, but some very good examples of the types of artefact that you might have to process:
- Example: Draganflyer X4 plus FLIR Tau LWIR camera...
Camp Roberts notes:
Implementation Proposal
For the Random Hacks of Kindness #1.0 event, a minimal proof-of-concept solution that can be built over the June 4-6 weekend has been proposed. The intent is to validate the concept and let later iterations improve upon it.
Version 1.0, a.k.a. "Blue Sky"
This first version assumes ideal conditions for as many variables as possible. In other words, reality is being asked to step aside and not look in our general direction for a little while.
Input
- sequences of overlapping images obtained by UAV
- geographic references
Output
Assumptions and simplifications
- consistent lighting
- orthographic projection, flat earth, no parallax
- no shadows
- consistent weather
- camera pointing straight down
- level flight
- flight paths aligned with latitude and longitude lines
- all photos are geocoded
- sequences of photos contain overlap suitable for cross-correlation/stitching
- very few changes with respect to reference map
- stable camera w.r.t. UAV
- ladder search pattern
- no changes to scenery between frames
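Under the assumptions listed above (camera pointing straight down, level flight, flat earth, flight paths aligned with latitude and longitude lines, every photo geocoded), placing a frame into the mosaic reduces to a scale and an offset. The sketch below shows one way that might look, merging overlapping frames by averaging. Everything named here is hypothetical: `frame_origin_px`, the `Mosaic` class, and the parameters are invented for illustration, and the frame centre is assumed to have already been converted from latitude/longitude into local east/north metres, with a known ground sample distance (`gsd_m`, metres per pixel).

```python
def frame_origin_px(centre_east_m, centre_north_m, frame_w, frame_h,
                    gsd_m, mosaic_west_m, mosaic_north_m):
    """Top-left (col, row) of a north-up, straight-down frame in the mosaic.
    Rows increase southwards, columns increase eastwards."""
    col = round((centre_east_m - mosaic_west_m) / gsd_m - frame_w / 2)
    row = round((mosaic_north_m - centre_north_m) / gsd_m - frame_h / 2)
    return col, row

class Mosaic:
    """Accumulates frames and averages wherever they overlap."""

    def __init__(self, width, height):
        self.total = [[0.0] * width for _ in range(height)]
        self.count = [[0] * width for _ in range(height)]

    def add(self, frame, col0, row0):
        """Add a greyscale frame (list of rows) with its top-left at (col0, row0)."""
        for r, line in enumerate(frame):
            for c, value in enumerate(line):
                y, x = row0 + r, col0 + c
                # Silently clip anything that falls outside the mosaic.
                if 0 <= y < len(self.total) and 0 <= x < len(self.total[0]):
                    self.total[y][x] += value
                    self.count[y][x] += 1

    def averaged(self):
        """Mean pixel value per cell, or None where no frame has landed."""
        return [[t / n if n else None for t, n in zip(t_row, n_row)]
                for t_row, n_row in zip(self.total, self.count)]
```

Overwriting instead of averaging would simply replace the accumulation with an assignment; averaging costs extra memory for the counts but tends to hide frame edges.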
Future Tasks
There likely are a few tasks to address each assumption or simplification for Version 1.0. Here are some ideas:
- normalize the images to compensate for inconsistent lighting
- support low-altitude flight (i.e. parallax)
- compensate for camera lens distortions
- compensate for changes in sun position
- support changes in weather (partial cloudiness, fog, snow, etc.)
- support imperfect camera placement (i.e. slanted, loose or wobbly assembly)
- support changes in altitude, turns (bank & yaw)
- support arbitrary flight paths, irregular search patterns
- support dead reckoning and/or on-board inertial measurement units
- support gaps in imagery due to malfunction, low framerate/high air speed, obstructions
- support matching to old reference maps (i.e. natural changes such as disasters, aging, seasons; man-made changes such as buildings, paths and roads)
- support imagery collected through video
- support imagery collected by several cameras, either on the same UAV or on separate UAVs
- support moving features in scenery, such as people, animals or vehicles
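For the slanted-camera task in the list above, a first step is usually to work out where the corners of the frame land on the ground. The following is a minimal sketch of that calculation, assuming flat ground, an ideal pinhole camera with no lens distortion, and a camera tilted forward from straight down by a single pitch angle (no roll or yaw). The function name `ground_footprint` and its parameters are invented for this example.

```python
import math

def ground_footprint(altitude_m, tilt_deg, hfov_deg, vfov_deg):
    """Ground positions (right_m, forward_m) of the four frame corners,
    relative to the point directly below the camera.
    Order: near-left, near-right, far-right, far-left."""
    tilt = math.radians(tilt_deg)
    tx = math.tan(math.radians(hfov_deg) / 2)
    ty = math.tan(math.radians(vfov_deg) / 2)
    corners = []
    for x, y in [(-tx, -ty), (tx, -ty), (tx, ty), (-tx, ty)]:
        # Rotate the corner ray (x, y, 1) about the sideways axis by the tilt.
        forward = y * math.cos(tilt) + math.sin(tilt)
        down = -y * math.sin(tilt) + math.cos(tilt)
        if down <= 0:
            raise ValueError("corner ray points at or above the horizon")
        scale = altitude_m / down  # stretch the ray until it reaches the ground
        corners.append((x * scale, forward * scale))
    return corners
```

With zero tilt the footprint is a rectangle; as the tilt grows, the far edge becomes wider than the near edge, giving the trapezoid, and the far pixels cover more ground than the near ones.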