
Audio mapping with JOSM

One way of surveying is to record notes about street names and points of interest on a dictaphone or other voice recorder while using a GPS receiver to accurately note the position. You can, of course, simply play back your recording through the sound recorder and relate it to your GPS track manually in some way. However, if you have a digital voice recorder which can upload audio files to your computer, JOSM provides facilities to automate this process.

There are four automated techniques you can use. Click on the headlines for more detail. See also how to synchronize your sound track with your GPS track. Don't forget to set the option for the method you want to use.

A note about calibration

You should calibrate the voice recorder's clock every so often; that is, enter a reading to compensate for inaccuracy in your voice recorder's clock or audio sampling rate. This is most important with method 2, but it will also make working with the sound track easier with method 1. You can be certain that the clock in your GPS is highly accurate - GPS positioning relies on extremely precise timing. Audio recorders, however, are often not so accurate, so you need to measure how fast or slow your recorder runs and tell JOSM. An error of 5 seconds per hour adds up to 20 seconds after four hours of surveying, which at a cycling speed of about 18 km/h (5 m/s) means you could be 100 m or more out, and further in a car.
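The arithmetic behind the 100 m figure can be sketched as follows. The function names and the speeds used (18 km/h for a bike, 50 km/h for a car) are illustrative assumptions, not values taken from JOSM. The second function expresses a measured drift as a ratio of recorder time to true time; JOSM's calibration preference may expect a different form, so check your version.

```python
def position_error_m(drift_s_per_hour, hours, speed_kmh):
    """Distance error caused by uncorrected recorder clock drift."""
    time_error_s = drift_s_per_hour * hours        # accumulated clock error
    speed_m_per_s = speed_kmh * 1000.0 / 3600.0    # km/h -> m/s
    return time_error_s * speed_m_per_s


def drift_ratio(recorder_elapsed_s, true_elapsed_s):
    """Recorder elapsed time divided by true elapsed time (1.0 = perfect)."""
    return recorder_elapsed_s / true_elapsed_s


print(position_error_m(5, 4, 18))   # bike: 20 s of drift at 5 m/s
print(position_error_m(5, 4, 50))   # car: same drift, higher speed
print(drift_ratio(3605, 3600))      # recorder gains 5 s in a true hour
```

To measure the drift, record a time signal at the start and end of a long recording and compare the elapsed time shown by the recorder with the true elapsed time.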

An alternative is to resynchronize every 30 minutes or so during surveying, but you should at least test your voice recorder for its accuracy to understand what the tolerances are.

1. Continuous audio with GPS waymarks

With this method, you collect explicit waypoints along your track using the buttons on your GPS and at the same time dictate onto a continuous sound recording on your voice recorder what the waypoint represents on the ground - a street name or point of interest. Your GPS notes three key pieces of information about each waypoint - its location, the time it was made, and its name or number. The audio and waypoint data are then synchronized in JOSM so that you can play back each description by clicking on a Marker representing the waypoint.

Each Marker is used to identify the position accurately and the voice recording is only used for annotation. Synchronization merely helps you to conveniently select the right part of the sound track to play for each marker.
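To illustrate what synchronization implies here, the sketch below turns each waypoint's time stamp into an offset in seconds from the start of the recording. It is not JOSM's own code. It assumes a GPX 1.1 file with UTC time stamps in whole seconds, and the `recording_start` value is something you would have to establish yourself, for example from a synchronization cue.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def parse_gpx_time(text):
    """Parse a GPX UTC time stamp such as 2010-07-28T10:00:30Z."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def waypoint_offsets(gpx_text, recording_start):
    """Return (name, seconds into the recording) for each timed waypoint."""
    root = ET.fromstring(gpx_text)
    result = []
    for wpt in root.findall("gpx:wpt", GPX_NS):
        name = wpt.findtext("gpx:name", default="", namespaces=GPX_NS)
        time_text = wpt.findtext("gpx:time", namespaces=GPX_NS)
        if time_text is None:
            continue  # no time stamp, so no offset can be computed
        offset = (parse_gpx_time(time_text) - recording_start).total_seconds()
        result.append((name, offset))
    return result


# Hypothetical sample data: two waypoints made during one recording.
SAMPLE = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="example">
  <wpt lat="52.20" lon="0.12"><time>2010-07-28T10:00:30Z</time><name>001</name></wpt>
  <wpt lat="52.21" lon="0.13"><time>2010-07-28T10:02:00Z</time><name>002</name></wpt>
</gpx>"""

start = parse_gpx_time("2010-07-28T10:00:00Z")
print(waypoint_offsets(SAMPLE, start))
```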

Advantages: it's easy in JOSM to locate the audio description for each point of interest.

Disadvantages: some GPS receivers require quite a lot of fiddly button pressing to make a waymark, which makes it awkward and hazardous while moving.

2. Continuous audio with vocally-identified points of interest

With this method you also make a continuous sound recording, but instead of entering waypoints into your GPS you dictate an audible cue for each point of interest, for example "MARK! River Lane Primary School on left". Though there isn't a precise "location" for street names, it helps to be consistent about recording the name just after you enter a street, so you know where to look for the relevant clip.

Synchronization of the sound track with the GPS data and calibration of the voice recorder's clock are then more critical because the time into the recording is used to accurately measure the equivalent time into the GPS track, and therefore the location.
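The dependence on timing can be made concrete with a small sketch that converts a time into the recording into a position on the track by linear interpolation between track points. This illustrates the principle only and is not JOSM's implementation. The `calibration` parameter is assumed to be the ratio of recorder elapsed time to true elapsed time, and all names are hypothetical.

```python
from bisect import bisect_right


def position_at(track, sync_track_s, sync_audio_s, audio_s, calibration=1.0):
    """Estimate (lat, lon) for a moment audio_s seconds into the recording.

    track        -- list of (seconds, lat, lon) tuples, sorted by time
    sync_track_s -- track time of the synchronization cue
    sync_audio_s -- position of the same cue in the recording
    calibration  -- recorder elapsed time / true elapsed time (assumed form)
    """
    # Convert recorder time since the cue into true time, then into track time.
    t = sync_track_s + (audio_s - sync_audio_s) / calibration
    times = [point[0] for point in track]
    i = bisect_right(times, t)
    if i == 0:
        return track[0][1:]       # before the track starts
    if i == len(track):
        return track[-1][1:]      # at or after the last track point
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    f = (t - t0) / (t1 - t0)      # fraction of the way between the two points
    return (lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0))


# Hypothetical track: one point every 10 seconds.
track = [(0, 52.000, 0.000), (10, 52.001, 0.000), (20, 52.002, 0.002)]
# The cue was spoken 5 s into the recording, at track time 0.
print(position_at(track, 0, 5, 20))
```

Any error in the synchronization point or the calibration shifts `t`, and therefore the estimated location, which is why both matter more here than in method 1.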

Advantages: you don't need to press any buttons while you are moving.

Disadvantages: unless you are very methodical it can be time consuming to find each bit of audio description, or to be sure you played them all; it relies on the accuracy of your voice recorder's clock; you need to be moving when you make your synchronization cue.

3. Audio clips using audio file time stamps

With this method, you record a separate audio file for each location of interest. Then you import all the files onto your GPX track, and JOSM positions an audio marker on the track at the position corresponding to the modification time stamp of each audio file.
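Before importing, it can be worth checking that the time stamps survived the copy from the recorder. The sketch below lists the clips in a directory with their modification times in UTC, oldest first. The function name and the `.wav` default are assumptions. Whether your recorder stamps the start or the end of each clip is also something you would need to check yourself.

```python
import os
from datetime import datetime, timezone


def clip_times(directory, extension=".wav"):
    """Return (file name, UTC modification time) pairs, oldest first."""
    clips = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.lower().endswith(extension):
            mtime = datetime.fromtimestamp(entry.stat().st_mtime, tz=timezone.utc)
            clips.append((entry.name, mtime))
    return sorted(clips, key=lambda clip: clip[1])
```

If every clip shows the time you copied it rather than the time you recorded it, the time stamps were lost in transfer and this method will not place the markers correctly.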

Advantages: doesn't require any waypoints in the GPX; doesn't require large capacity for the audio.

Disadvantages: requires that you can get the audio files onto your computer with their time stamps intact; requires you to turn the audio recorder on and off for each clip.

4. Audio clips with waymarks

With this method, you make waypoints to identify locations of interest. However, you record a separate audio file for each one, and the name of each file is added as a <link> element in the corresponding waypoint in the GPX file before loading it into JOSM. When JOSM then creates the Audio Marker for each waypoint, it knows which audio clip to play when you click on the marker.
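If your GPS does not write the links for you, the linking step could be automated along these lines. This is a minimal sketch assuming a GPX 1.1 file in which a link is written as a <link> element with an href attribute. The `clips_by_name` mapping is hypothetical. The new element is simply appended to the waypoint, so strict schema ordering is not enforced.

```python
import xml.etree.ElementTree as ET

GPX_URI = "http://www.topografix.com/GPX/1/1"
ET.register_namespace("", GPX_URI)  # keep GPX as the default namespace on output


def link_clips(gpx_text, clips_by_name):
    """Add a <link> to every waypoint whose <name> has an audio clip.

    clips_by_name -- dict mapping waypoint name to audio file URL or path
    """
    root = ET.fromstring(gpx_text)
    for wpt in root.findall(f"{{{GPX_URI}}}wpt"):
        name = wpt.findtext(f"{{{GPX_URI}}}name")
        if name in clips_by_name:
            link = ET.SubElement(wpt, f"{{{GPX_URI}}}link")
            link.set("href", clips_by_name[name])
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample: only waypoint 001 has a clip.
SAMPLE = (
    '<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="example">'
    '<wpt lat="52.20" lon="0.12"><name>001</name></wpt>'
    '<wpt lat="52.21" lon="0.13"><name>002</name></wpt>'
    "</gpx>"
)
print(link_clips(SAMPLE, {"001": "clip001.wav"}))
```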

Advantages: by far the simplest method to work with in JOSM; no synchronization or calibration is required; ideal if you have an easy to use GPS that has a voice note function, or if you can automate linking the individual files to waypoint elements in your GPX file.

Disadvantages: unless your GPS automatically takes voice notes at a waypoint the recorder needs to be started and stopped for each waypoint as well as making the waypoint itself; another piece of software is required to join the audio to the GPX file, unless you have a GPS that does this for you.

Bug: if the waypoint contains a <time> element, such as the time of recording, JOSM will show an error message "This is after the end of recording" when clicking the Audio Marker. So remove the <time> element or use <cmt> or <desc> instead.
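As a workaround sketch, the <time> elements can be stripped from the waypoints before loading the file. This again assumes a GPX 1.1 file, and the function name is hypothetical. Track points are left untouched so the track itself keeps its timing.

```python
import xml.etree.ElementTree as ET

GPX_URI = "http://www.topografix.com/GPX/1/1"
ET.register_namespace("", GPX_URI)  # keep GPX as the default namespace on output


def strip_waypoint_times(gpx_text):
    """Remove <time> from every waypoint; track points are left alone."""
    root = ET.fromstring(gpx_text)
    for wpt in root.findall(f"{{{GPX_URI}}}wpt"):
        for time_el in wpt.findall(f"{{{GPX_URI}}}time"):
            wpt.remove(time_el)
    return ET.tostring(root, encoding="unicode")
```

Work on a copy of the GPX file, since the waypoint times cannot be recovered once removed.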

5. Audio clips with sound markers

OK, so this method doesn't exist. But here is what I'd like to achieve if anyone has any sound analysis technology to help:

Record continuous audio. Say a predefined or trained word or phrase, e.g. "MARK NOTE" to start an audio note. Then process the audio track to locate these special phrases and use the length of time into the recording to place the audio marker relative to the GPX track.

This doesn't have to be built into JOSM. WAV files can have labels: very simple structures, tacked on to the end of the file, which list a name against a time offset into the recording. So a preprocessor could pass over the WAV file, build labels from the time offsets it finds, and append them to the file.
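The appending half of such a preprocessor might look like the sketch below. Detecting the spoken phrase is not covered. It assumes the usual RIFF layout of a "cue " chunk plus a LIST/adtl chunk holding "labl" entries, a well-formed input file, and ASCII label text. Nothing here implies that JOSM reads such labels.

```python
import struct


def _chunk(chunk_id, payload):
    """Build one RIFF chunk, padded to an even length."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


def append_labels(wav_bytes, labels, sample_rate):
    """Return WAV data with cue points and labels appended.

    labels      -- list of (offset in seconds, label text) pairs
    sample_rate -- sample frames per second of the recording
    """
    if wav_bytes[:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    cue = struct.pack("<I", len(labels))     # number of cue points
    adtl = b"adtl"                           # list type for associated data
    for cue_id, (seconds, text) in enumerate(labels, start=1):
        sample = int(round(seconds * sample_rate))
        # Fields: id, position, data chunk id, chunk start, block start, sample offset
        cue += struct.pack("<II4sIII", cue_id, sample, b"data", 0, 0, sample)
        label = struct.pack("<I", cue_id) + text.encode("ascii") + b"\x00"
        adtl += _chunk(b"labl", label)
    body = wav_bytes[8:] + _chunk(b"cue ", cue) + _chunk(b"LIST", adtl)
    # Rewrite the RIFF header so its size covers the appended chunks.
    return b"RIFF" + struct.pack("<I", len(body)) + body
```

A detector would only need to produce the list of time offsets; for example, `append_labels(data, [(12.5, "MARK 1")], 8000)` labels the point 12.5 seconds into an 8 kHz recording.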

Advantages: no buttons to press at all while cycling.



Last modified on Jul 28, 2010, 5:49:59 PM