The Witness camera. Part 2

The most peculiar aspect of this design is that the firmware is written in BASIC. I’m a strong advocate of using C for embedded systems, and I use routinely the excellent GCC compiler for AVR development on PC and Linux platforms. But at the time I was waiting for the samples of the camera to arrive, I was asked to select a BASIC compiler and IDE for an educational board my company was developing.

BASCOM-AVR from MCS electronics is a stable, popular product with a rock-solid user base, and a syntax similar to Visual Basic. The IDE installs with lots of examples, and you don’t need anything else to make your applications. Actually, the IDE integrates all sorts of accessories including a simulator, a chip programmer and (nice touch) a serial terminal.

While browsing the online help, my attention was caught by the AVR-DOS library supplied with the package. The help includes a schematic for wiring an SD or MMC card to the SPI, and a 3-line-long code snippet for writing a text file:

open “README.TXT” for output as #1
print #1, “HALLO FILE!”
close #1

I couldn’t resist trying it. I grabbed an SD-card from the nearest photo camera, the Atmel’s STK300 evaluation board and a soldering iron. A few minutes later, Windows Notepad running on a 32-bit, 3 GHz machine was opening a file written by an 8-bit, 8 MHz AVR microcontroller. I was hooked.

BASCOM does a good work at simplifying hardware access. For small applications, it’s a time saver. Do you need a real time clock? Just tell the compiler, and he will do the hard work of making timer interrupt code specific to the AVR flavour you selected. Do you need buffered serial I/O? Just select the buffer size and baud rate. The list of supported features is long - RC5 infrared remote control, ADC readout, timers and PWM setup, to name a few that were useful for the Witnesscam project. But also graphic and text LCD, 1-wire devices, SPI and I2C busses, PS2 mice, AT keyboards, and even a TCP/IP stack.

On the other extreme, as the complexity of your application increases, you will miss C type checking, aggregate types and support for modularity. Also, BASCOM error massages are obscure when not misleading, and lacking of automatic type conversions and true expression handling is an unnecessary pain. My advice is to stay on C if you expect to develop more than 64 kB of code; for smaller designs, like this one, it makes sense to opt for BASIC if you are exploiting most of the built-in features it offers. BASCOM supports structured programming, and with a bit of discipline you can write code as organized as you can do in C. See the sources for my best attempt at doing it!

Filled to capacity
Keeping the disc filled to capacity, but not overfull, was a challenging problem.

In theory you can get disk occupation calling diskFree(), find the oldest file in the disk and delete it. In practice it turns out that both operations are time-consuming, so you must avoid them as much as possible. To complicate matters, file size isn’t constant. It depends on the scene and how it gets compressed by the JPEG algorithm. It depends also on the time of acquisition, because storing a picture file may require creating new directories (hence more disk space) each time the year, month, day, or hour change. And users are free to select a different resolution or recording mode at any time, changing the space required to accommodate new files.
I developed an heuristics based on common-sense rules:

When the disk is only partially full, it is sufficient to check disk occupation only every now and then, and all files can be conserved without problems.
As disk occupation increases, the amount of space left need to be checked more frequently, and it’s wise to start deleting some files. As deleting files burns precious machine cycles, it’s preferable to do it when the camera is idle, i.e. recording isn’t triggered.
Should disk occupation reach its limits, disk occupation checks must be performed frequently. Deletion of oldest files must be performed anyway, even if the processor needs resources for recording, because of the risk of getting out of disk space.

Based on the assumptions above, I wrote an algorithm that computes the value of two variables, deleteOnWrite and deleteWhenIdle, storing the number of files to delete according to the recorder’s state. (see diskCleanup(), file “archive.bas”). The algorithm computes also the interval between checks, cleanupInterval, using the rules outlined above and the following table:

Mega Bytes left	# of files to delete each time a new file is recorded	# of files to delete when idle	time before next disk occupation check
> 60	None	None	1 hour
> 30	None	Up to 1000	½ hour
> 20	1	Up to 1000	½ hour
> 15	2	Up to 1000	¼ hour
> 10	3	Up to 1000	¼ hour
> 5	10	Up to 1000	5 minutes
> 2	30	Up to 1000	5 minutes
Less than 2	100	Up to 1000	1 minute

Careful section of the value for deleteWhenIdle and cleanupInterval improves recorder’s performance in all but continuous recording modes, because all erasing is performed when there is nothing else to do. If you feel 1000 pictures is a large number, consider it represent a negligible percentage of full disk capacity. It takes time to get accustomed to the Gigabyte Age!

The flow chart in Figure should shed some light on how the heuristics works. Experimentation in my house as PIR-triggered recorder resulted in all file erases performed during idle, calling diskFree()at most every 30 minutes.

Speech preparation.
The Witnesscam speech synthesizer plays a set of speech files to build its messages. You can record the files with your own voice if you want, but I was reluctant doing this, so I used computer-generated speech instead.

I would like to tank the people at AT&T Research Laboratories for giving me the permission of using samples from their Natural Voices™ system for this contest. Those unfamiliar with this technology can test their fantastic software online (http://www.research.att.com/~ttsweb/tts/), typing-in any text and listening for a warm, natural voice reading it.

I used the AT&T online demo tool to download a set of .WAV files with the words of interest. I selected a male voice (Mike) in order to reduce the content of high frequencies as possible. The Witnesscam dynamic range is 8 bit, and sample rate is 11,250 Hz, so I used a sound editing program (Cool Edit Pro) for down-sampling. I compressed the voice waveforms using the dynamic compressor tool in order to compensate for the reduced bit resolution (AT&T audio files are 16-bit). Then, I normalized the amplitudes, and saved each speech segment in a separate file as 8-bit “unsigned” samples - that is, raw numbers where silence corresponds to a value of 128.

At run time, the Witnesscam uses the file name to select the files to play. It expects to find them under a common folder named “speech”. Sound playback technique is brutally tricky. The samples are banged straight to the PWM without buffering, the only timing coming from a busy-wait loop. Ironically, the unpredictable delays due to disk access add a pleasant chorus-like effect, making the voice warmer and fuller.

The circuit amplifying the PWM signal is even more brutal, using just one transistor and trusting the speaker elastic properties for filtering higher frequencies and moving the cone back to the positive direction. These shortcuts limit the kind of speaker to use: if it’s too small, the Nyquist products will be audible, if it’s too large, it won’t jump back in time for the next audio peak. I have experimented with various sizes from my junk box, and found that a 75mm/3” speaker works well – surprisingly well. A clear sign that speech is like no other sound, because the brain is capable to reconstruct the voice from very little information.

Source code for ATMega32

Speech files

In a final part we will consider a system design and an order of management of it, we will generalise results at various modes and options.