While there are several applications in the personal informatics domain that track what we eat or how many calories we consume, very little technology outside the laboratory can measure how we eat.
In this project, I use audio-based motion tracking to measure the rate of eating, provide real-time audiovisual feedback based on those measurements, and, built into the same mechanism, offer a convenient way to control media.
The inspiration for this project comes from observing people sitting alone in the school dining area or at coffee shops, trying to multi-task by eating and working at the same time; they become so engrossed in their work that they are not mindful of their eating behavior. This often translates into health-related problems such as obesity and binge eating.
The setup for this project is a simple microphone placed under a ceramic plate, at its rim. Based on trial and error, this proved to be the ideal spot: placing the microphone on the surface of the plate captured too much of the surrounding sound and made it difficult to distinguish between the gestures tracked in this project. The space between the slightly curved rim of the plate and the surface on which it rests also provides natural filtering, a form of acoustic insulation.
This particular plate was selected for the clear transition in its shape from surface to rim, which makes it easier to separate a tap on the rim from the natural sounds of eating. The two main sounds being tracked are a tap on the rim of the plate and a tap or scratch on its surface. When a person taps on the rim, the video player toggles between play and pause; this is the basic mechanism for controlling the player. The rate of eating is tracked through the sound of the spoon or fork against the surface of the plate: the gap between two successive sounds gives the time between two eating events. This data controls the rate at which the video plays. If a person takes more than one spoonful of food within a 15-second interval, the video speeds up by 20% (to 1.2x normal speed), signaling that he or she needs to slow down.
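The feedback rules above can be sketched as a small state machine. This is a minimal illustration, not the project's actual code; the class and method names (`PlaybackController`, `on_rim_tap`, `on_eat`) are my own invention:

```python
class PlaybackController:
    """Sketch of the feedback rules: a rim tap toggles play/pause,
    and eating events closer than 15 s apart speed the video up to
    1.2x normal until the pace slows down again."""

    FAST_EATING_S = 15.0  # threshold between two eating events

    def __init__(self):
        self.playing = False      # play/pause state of the video
        self.rate = 1.0           # playback rate multiplier
        self.last_eat = None      # timestamp (s) of the previous eating event

    def on_rim_tap(self):
        # Explicit control gesture: toggle between play and pause.
        self.playing = not self.playing

    def on_eat(self, t):
        # Natural eating sound detected at time t (seconds).
        if self.last_eat is not None and t - self.last_eat < self.FAST_EATING_S:
            self.rate = 1.2   # visual cue: eating too fast, slow down
        else:
            self.rate = 1.0   # pace is fine, play at normal speed
        self.last_eat = t
```

A quick trace: two bites 5 seconds apart push the rate to 1.2x; a bite 20 seconds later brings it back to 1.0x.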
The key challenge was separating the sounds of explicit control gestures from those of natural eating. To do this, the incoming audio signal is first passed through a high-pass filter with a cut-off frequency of 2000 Hz. A threshold of 0.7 is set on the peak amplitude, and anything above it is interpreted as an intended control gesture; a window of 1 second prevents double triggers. Any sound below the threshold, again debounced with a 1000-millisecond window, restarts a timer that measures the rate of eating. If the interval between eating events falls below 15 seconds, a signal is sent to the video player to increase the playback rate to 1.2x. This provides visual feedback that the person is eating at a faster pace and needs to slow down. Once the pace slows to more than 15 seconds between events, the video resumes playing at the normal rate.
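The filter-threshold-debounce pipeline can be sketched as follows. This is an illustrative reconstruction, not the project's implementation: the one-pole filter stands in for whatever DSP the project actually uses, and the function names and the 44.1 kHz sample rate are my assumptions:

```python
import numpy as np

SAMPLE_RATE = 44100   # assumed audio sampling rate (Hz)
CUTOFF_HZ = 2000.0    # high-pass cut-off from the text
THRESHOLD = 0.7       # peak-amplitude threshold for control gestures
DEBOUNCE_MS = 1000    # 1-second window to suppress double triggers

def highpass(x, fs=SAMPLE_RATE, cutoff=CUTOFF_HZ):
    """Simple one-pole high-pass filter (a stand-in for a proper
    DSP filter) that attenuates content below the cut-off."""
    x = np.asarray(x, dtype=float)
    rc = 1.0 / (2.0 * np.pi * cutoff)
    dt = 1.0 / fs
    alpha = rc / (rc + dt)
    y = np.zeros_like(x)
    for i in range(1, len(x)):
        y[i] = alpha * (y[i - 1] + x[i] - x[i - 1])
    return y

def classify_peaks(peak_times_ms, peak_amps, debounce_ms=DEBOUNCE_MS):
    """Label detected peaks in the filtered signal: above the
    threshold = control tap, below = eating event, each class
    debounced by the 1-second window."""
    events = []
    last = {"tap": -debounce_ms, "eat": -debounce_ms}
    for t, a in zip(peak_times_ms, peak_amps):
        kind = "tap" if a > THRESHOLD else "eat"
        if t - last[kind] >= debounce_ms:  # outside the debounce window
            events.append((t, kind))
            last[kind] = t
    return events
```

Each `"eat"` event would then restart the eating-rate timer, and each `"tap"` would go to the playback controls.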