inputting_20real-time_20audio
Differences
This shows you the differences between two versions of the page.
inputting_20real-time_20audio [2024/04/19 14:58] – draft richardrussell | inputting_20real-time_20audio [2024/04/19 17:43] (current) – richardrussell | ||
---|---|---|---|
Line 45: | Line 45: | ||
You could of course present the choices and accept the selection in different ways, for example a dialogue box containing a list box plus OK and Cancel buttons. | You could of course present the choices and accept the selection in different ways, for example a dialogue box containing a list box plus OK and Cancel buttons. | ||
+ | |||
+ | ==== Choosing the buffer size ==== | ||
+ | |||
+ | The first step is to decide how large the audio buffer should be. To some extent this is an arbitrary decision, but it depends on the acceptable //latency// (the time that elapses between the audio arriving and your program processing it) and on the amount of work needed to process the received audio data.\\ \\ It is vitally important that your program processes the audio data quickly enough, otherwise data will be lost, with undesirable results. If the rate at which you can process the data varies (for example because it depends on disk or network accesses) you may need a larger buffer to 'iron out' the fluctuation. In the example below the length of the buffer is 1024 samples; at 44100 Hz stereo that implies a latency of at least 23 milliseconds (1024 / 44100 seconds). | ||
+ | |||
+ | <code bb4w> | ||
+ | SamplesPerBuffer% = 1024 | ||
+ | </code> | ||
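+ | |||
+ | The latency figure is simply the buffer length divided by the sampling rate. A minimal sketch of that arithmetic follows; **SampleRate%** and **LatencyMs** are names assumed here for illustration, and 44100 Hz is the rate quoted in the text: | ||
+ | |||
+ | <code bb4w> | ||
+ | SamplesPerBuffer% = 1024 | ||
+ | SampleRate% = 44100 : REM Assumed sampling rate in Hz | ||
+ | REM One buffer must fill before it can be processed, so this is a lower bound | ||
+ | LatencyMs = 1000 * SamplesPerBuffer% / SampleRate% | ||
+ | PRINT "Minimum latency (ms): "; LatencyMs | ||
+ | </code> | ||
+ | Halving the buffer length halves this figure, but leaves your program half as long to process each buffer. | ||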
==== Selecting the audio format ==== | ==== Selecting the audio format ==== | ||
- | The first step is to decide the audio //format// you will use: the principal choices being of sampling rate (the main ones being 11025 Hz, 22050 Hz and 44100 Hz) and number of channels (mono, 1 channel, or stereo, 2 channels). The higher the sampling rate the higher the audio frequency that can be received, but the more work your software needs to do. Normally you should choose the lowest sampling rate suitable for your application, | + | The next step is to decide the audio //format// you will use: the principal choices being of sampling rate (the main ones being 11025 Hz, 22050 Hz and 44100 Hz) and number of channels (mono, 1 channel, or stereo, 2 channels). The higher the sampling rate the higher the audio frequency that can be received, but the more work your software needs to do. Normally you should choose the lowest sampling rate suitable for your application, |
You set up the required audio format and open the audio capture device as follows: | You set up the required audio format and open the audio capture device as follows: | ||
Line 59: | Line 67: | ||
want.format.l& | want.format.l& | ||
want.channels& | want.channels& | ||
- | want.samples% = Window% | + | want.samples% = SamplesPerBuffer% |
+ | </code> | ||
+ | ==== Opening the audio device and creating the buffer ==== | ||
+ | |||
+ | Now the audio capture device can be opened and the buffer created: | ||
+ | |||
+ | <code bb4w> | ||
SYS " | SYS " | ||
IF Device% = 0 ERROR 100, " | IF Device% = 0 ERROR 100, " | ||
- | | + | |
+ | WordsPerBuffer% = BytesPerBuffer% DIV 4 | ||
+ | |||
+ | DIM Buffer%(WordsPerBuffer% - 1) | ||
</code> | </code> | ||
This code allows for the possibility that the capture device doesn' | This code allows for the possibility that the capture device doesn' | ||
- | ==== Creating and initialising the buffers | + | ==== Starting audio capture |
- | The next step is to decide how many audio buffers | + | Once you have initialised |
<code bb4w> | <code bb4w> | ||
- | nBuffers% = 3 | + | SYS " |
- | SamplesPerBuffer% = 1024 | + | </code> | ||
- | | + | |
- | DIM Buffers{(nBuffers%-1) a&(SamplesPerBuffer% | + | ==== Inputting in real-time ==== |
+ | |||
+ | Once the above code has been executed you need to process the received audio buffers fast enough to keep up with the incoming data. The following code constantly cycles, filling the buffer and calling the **PROCprocessbuffer** routine: | ||
+ | |||
+ | <code bb4w> | ||
+ | REPEAT | ||
+ | p%% = ^Buffer%(0) | ||
+ | R% = BytesPerBuffer% | ||
+ | REPEAT | ||
+ | SYS " | ||
+ | p%% += I% : R% -= I% | ||
+ | UNTIL R% <= 0 | ||
+ | REM Process only after the buffer has been completely filled | ||
+ | PROCprocessbuffer(^Buffer%(0), SamplesPerBuffer%) | ||
+ | UNTIL FALSE | ||
+ | END | ||
</code> | </code> | ||
+ | In this example the audio processing continues indefinitely, | ||
+ | ==== Processing the audio data ==== | ||
+ | |||
+ | This aspect can only be described in general terms, because the processing required depends on what the program is designed to do. The code below simply calculates the RMS (Root Mean Square) value of the incoming audio: | ||
+ | |||
+ | <code bb4w> | ||
+ | DEF PROCprocessbuffer(B%%, | ||
+ | LOCAL I%, V%, sumsq | ||
+ | FOR I% = 0 TO N%*2-2 STEP 2 | ||
+ | V% = B%%!I% AND &FFFF : IF V% >= &8000 V% -= 65536 | ||
+ | sumsq += V%^2 | ||
+ | NEXT | ||
+ | RMS = SQR(sumsq / N%) | ||
+ | ENDPROC | ||
+ | </code> | ||
+ | |||
+ | This code is appropriate for monaural input (one channel) where each audio sample consists of a signed 16-bit value in the range -32768 to +32767. | ||
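+ | |||
+ | One way to check the sample decoding is to run the same arithmetic over a buffer with known contents. The sketch below is not part of the capture program: **Test%%**, the eight-sample length and the ±1000 square wave are all assumptions chosen for illustration. It stores little-endian signed 16-bit samples a byte at a time, then decodes them in the same way as the routine above, so the result should be an RMS of 1000: | ||
+ | |||
+ | <code bb4w> | ||
+ | N% = 8 : REM Hypothetical number of test samples | ||
+ | DIM Test%% N%*2 + 1 : REM Two spare bytes, because ! reads 32 bits at a time | ||
+ | FOR I% = 0 TO N% - 1 | ||
+ | IF I% AND 1 THEN V% = -1000 ELSE V% = 1000 | ||
+ | Test%%?(I%*2) = V% AND &FF : REM Low byte first (little-endian) | ||
+ | Test%%?(I%*2 + 1) = (V% >> 8) AND &FF : REM Then the high byte | ||
+ | NEXT | ||
+ | sumsq = 0 | ||
+ | FOR I% = 0 TO N%*2 - 2 STEP 2 | ||
+ | REM Mask to 16 bits, then sign-extend values of &8000 and above | ||
+ | V% = Test%%!I% AND &FFFF : IF V% >= &8000 V% -= 65536 | ||
+ | sumsq += V%^2 | ||
+ | NEXT | ||
+ | PRINT "RMS = "; SQR(sumsq / N%) | ||
+ | </code> | ||
+ | If the sign extension were omitted the negative samples would be read as 64536 and the RMS would be far too large, which makes this a convenient sanity check. | ||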
+ | |||
+ | ==== Cleaning up ==== | ||
+ | |||
+ | When you stop the sound capture, or exit the program, you need to shut down the audio input in a controlled fashion: | ||
+ | |||
+ | <code bb4w> | ||
+ | DEF PROCcleanup | ||
+ | Device% += 0 : REM Creates Device% (as zero) if it was never assigned | ||
+ | IF Device% THEN | ||
+ | SYS " | ||
+ | SYS " | ||
+ | Device% = 0 | ||
+ | ENDIF | ||
+ | ENDPROC | ||
+ | </code> | ||
+ | This code might form part of a larger routine, if there are other things that need to be shut down. | ||
inputting_20real-time_20audio.1713538690.txt.gz · Last modified: 2024/04/19 14:58 by richardrussell