ARTICLES

The following topics are covered in the form of informative text with examples:



Using Computers and Understanding Computers

"Don't start telling me how a machine works, tell me rather how I can work with it"
(Keith Emerson)

Much of commercial advertising and popular science reviewing keeps telling us that computers have become so intelligent and efficient that they can do things one could hardly imagine anyone but a human being doing. Then, upon starting to learn to use a computer, one suddenly finds oneself in a field full of holes, imperfections, error messages, crashes and system hang-ups, which frustrate the ambition for creative success. What, then, is the issue?

Both of these are true, and we know it from experience: computers are nowadays more efficient (and cost-efficient) than ever before, and are likely to become even more cost-efficient in the near future. Nevertheless, they still are and always will be stupid boxes (= hardware), which only perform tasks they have been instructed to. This instruction depends on a user's input, but also on the cumulative work of hundreds of other people who have been involved in writing the computing tools, the operating system and the application programs (= software). The user has to understand what (s)he is asking the computer to do, and the authors of the software have to make sure that such a request can be and will be properly interpreted and completed by the machine.

Here we come to the point of realizing that a little bit of knowledge about what we are dealing with can only help, and spare us frustrating misunderstandings. Just as software authors have to understand that there are people who intend to do work with their software, users have to understand that there are people who have written the software for a particular purpose, which may not be exactly the purpose they are using it for. Likewise, the same application program may exist for different operating systems, and not perform equally well on all of them. Furthermore, the same professional program used with different sound cards may also give different sound quality.

Hardware:

All hardware consists of a processor (CPU), working memory (RAM), disk memory (internal hard disk), and peripherals (keyboard, mouse, sound cards, video cards, network cards, monitor, external disks, etc.). From the processor's point of view, there are two main types of hardware:

  1. Intel Pentium*-based hardware, which is widespread, mass-produced, inexpensive, assembled worldwide, not so well controlled for quality and compatibility, but probably the first and most obvious alternative one can encounter. (All PCs work on Intel processors.)
  2. All other hardware, based on various specialized processors, such as the IBM-Motorola PowerPC (Apple Macintosh and some IBM high-end machines), SGI MIPS (all SGI workstations), and Sun Microsystems' SPARC. This hardware is mostly produced and assembled by a single manufacturer (Apple, IBM, SGI or Sun), and is therefore subject to more rigorous quality and compatibility control, but is also costlier.

Operating Systems:

Even though an application program can "talk" directly to the hardware, the engineering habit of writing programs which do so is being gradually abandoned, as computers have become powerful enough to leave headroom for more elegant solutions. The major problem with talking directly to the hardware is that there is no way a program can predict what hardware it will find or what else is happening to the hardware, unless instructed so. Is another program trying to access the same hardware? How should a program react upon finding the hardware busy? How should priorities in accessing it be negotiated? How can its resources be used to the point needed, but not more than that, and be released once the task is completed? These are but a few of the questions that neither the hardware nor application programs can answer.

The answer lies in the Operating System (OS), which is just another piece of software, running the whole time the hardware is on, with just one task: to mediate between hardware and software. Several operating systems have been written for the same or for different hardware. Different operating systems serve different purposes, such as scientific computation, home entertainment, office correspondence, internet servers, media production, etc. Better-selling does not always mean better. An office computer running a corresponding OS will not necessarily give satisfying results in video editing. An internet client will not necessarily also be a good server. Not every operating system on any hardware will be equally suitable for musical purposes; whether it is depends on a number of technical parameters!

For a musician, a few technical parameters of an operating system on a particular hardware are of primary interest, and can often reveal more than tons of advertising material:

(Some time ago these parameters were not part of a system specification, since affordable computers were much too weak for real-time audio, so all music processing was done within dedicated, expensive hardware connected to a computer; the processor only took care of the user interface. Manufacturers would also write special patches for the OSs of that time, which would correct latency and jitter problems and would serve only that particular hardware and its supporting software. In recent years, processors have become powerful to the point of being capable of real-time signal processing, so the situation has changed. Much of the dedicated real-time audio DSP hardware is slowly dying out, because doing the work on the main processor is more affordable. Thus the concern for real-time audio capability migrates to the OS.)

If these parameters do not appear as part of a technical description, it means that the manufacturer doesn't care, and probably doesn't even intend to market the machine (with its OS) for music production.

Nowadays two major types of operating systems are widely in use:

  1. Microsoft Windows© brand systems (Windows XP, Windows 2000, Windows NT, Windows 98), the best-selling brand of operating systems. Almost every business office, law agency, household and internet cafe has a Windows PC. This system, aside from being extremely popular among casual users, has two serious setbacks: (1) it is an office-household-entertainment targeted system, and does not itself include a real-time audio/music production interface framework which is transparent, stable and reliable enough to become an industry standard of professional quality; (2) all information about how the system is built is patented property and a trade secret of a single company, Microsoft, and is therefore available only to employees of the company and to third parties who have signed non-disclosure agreements with Microsoft (and of course paid a substantial amount of money for the agreement, which only big companies can afford). There have been reviews of some major professional programs, such as Digidesign ProTools or Steinberg Nuendo, performing excellently with Windows NT, but that does not seem to be due to Windows itself, rather to the extensions and patches which Digidesign and Steinberg have added to the Windows OS. However, these patches also happen to be a trade secret, and seem to work only with the programs they have been written for. Microsoft Windows works exclusively on Intel-based hardware.
  2. UNIX® based systems (MacOSX, IRIX, Linux, Solaris, UNICOS, AIX, HPUX), which rely on the world's most mature and at the same time most advanced concept in computer system engineering. UNIX emerged from scientific research, rather than from a need for immediate profit. Every UNIX is multi-tasking, multi-user, multi-processor and network-based from the ground up, which means it is stable, reliable, high-performing and not that easy to confuse. Its advantages are the following: (1) much of UNIX software is open-source and platform-independent, which means it is also available in the form in which it was written, the so-called source code; (2) some UNIX systems (such as Linux and Solaris) can work on different hardware platforms; (3) all UNIX-based systems are designed to be co-operative and are well documented, and it is therefore not difficult to write and implement mission-critical programs on UNIX; (4) some UNIX systems, e.g. Linux, are completely free and open source, thus independent of industrial monopolies and therefore highly praised in education. For all the reasons aforementioned, UNIX will be this project's system of choice, and the software we are dealing with is UNIX software. So are our tutorial examples.


Creative use of existing computer programs for making music

Several music programs have been available at no charge. Among them, probably the most brilliant example of a well-written, high-quality, professionally reliable program is Digidesign ProToolsFree, which we shall also use in our example. It is a program which simulates a multitrack tape recorder / MIDI sequencer, an editing desk and a mixer with DSP effects, all in one, requiring no special audio hardware. It is no demo! It has all the features and full functionality of the commercial version of ProTools, with only one limitation: you can record, play, mix and edit up to eight simultaneous audio tracks and 48 simultaneous MIDI tracks. It is probably the best available program for studying musique concrète, because its features go beyond a "classical era electronic studio". And it fits into a modern laptop computer (such as the Macintosh PowerBook G4, on which the author of this article has been extensively working with it).

The program can be downloaded from Digidesign's web site.

The installation process is fairly simple. You also get a PDF manual, which you should read carefully. This article is by no means meant to be a ProTools User's Manual addendum. You may have to "put your hands" on the program before this article makes sense to you. The following few tips should help avoid "naive" user errors which may stand in the way of good creative work:

  1. Choosing bit depth
    When you create a new session in ProToolsFree, the program will ask you to choose between 16-bit (CD quality) and 24-bit (better than CD quality) session bit depth. 24-bit is considered to be better, even though it uses more disk space. Here's why: more bits mean better precision in mixing and DSP. If you process and re-process your tracks, there is less chance of signal degradation due to processing. If your original tracks are 16-bit, you may have to use another program (such as SoundHack) to convert them to 24-bit format.
  2. Recording analog signals
    Take care that your input signal levels are good. Too low levels may result in a noisy recording. Too high levels may produce signal "clipping", an effect far more unpleasant to hear than the distortion of an analog tape recorder. Always monitor the signal levels while recording, and re-listen to the recording immediately, while you still have a chance to correct the signal levels and redo the recording.
  3. Copying, cutting and pasting tracks
    ProToolsFree is an industry standard for non-destructive audio editing. This means that the cuts and pastes one does in the Edit Window are not performed on the recording itself, but rather on the cue points, leaving the original recording untouched. There are two ways of pasting tracks: (1) by joining the end of the first segment with the beginning of the second segment at a zero-crossing point; (2) by cross-fading the end of the first take with the beginning of the second take.

    The following must be observed, or else unwanted effects (clicks, glitches, dirty cuts) may occur:

    (1) The edit point has to be as close as possible to the zero crossing (you may have to drastically zoom into the signal).

    (2) The slope of the left side has to match the slope of the right side near the zero-crossing as much as possible.

    (3) The dynamics of the left side should be compatible with the dynamics of the right side.

    (4) In cross-fade edits there is always an "overlapping region" between the first and second segment, during which the actual cross-fade takes place. This segment has to be aligned so that the signals in the two overlapping tracks are as much in phase with each other as possible before the cross-fade is applied. Otherwise, weird phase-cancellation effects may appear, rendering the crossfade bad or ineffective.
  4. Eliminating clicks and glitches
    During recording, some unwanted clicks or similar effects may occur. There is an easy way to correct them by means of the Pencil tool of the ProToolsFree toolbox. Doing it correctly may require some practice. Note that this operation changes the contents of the original recording. One thing worth taking care of is to zoom into the waveform display precisely enough to make sure that only the clicked part gets redrawn and nothing else.
  5. Using inserts
    The Mix window of ProToolsFree looks much like a simplified mixer. Several DSP effects, such as delays and equalizers, can be plugged in by means of insert switch-buttons. The most comfortable thing about them is that they work in real time and do not alter the original signal. You can do anything from subtle tone coloration to really drastic changes of sound character, almost anything you would otherwise find in a professional recording studio.
  6. AudioSuite DSP
    Some more complicated effects, such as time stretching or pitch shifting, may require such intensive re-computation of a signal that they cannot work in real time. These effects can be found in the AudioSuite menu of the program. They do change the contents of the recording, but ProToolsFree again does it in the most gentle way: it puts the original track off-line and replaces it with the processed version, so you have both. These operations are also undoable. Try to balance carefully between which processing tasks you assign to the inserts and which are done by means of the AudioSuite, in order to attain maximum efficiency and an optimum artistic result.
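
The cut and cross-fade rules above can be sketched in a few lines of C++. The function names and the simple linear fade are my own illustration, not ProTools code:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Find the sample index nearest to `start` where the signal crosses zero.
// Cutting at such a point avoids a step discontinuity (an audible click).
std::size_t nearestZeroCrossing(const std::vector<float>& s, std::size_t start) {
    for (std::size_t off = 0; off < s.size(); ++off) {
        if (start >= off + 1 && s[start - off - 1] * s[start - off] <= 0.0f)
            return start - off;                       // crossing found backwards
        if (start + off > 0 && start + off < s.size() &&
            s[start + off - 1] * s[start + off] <= 0.0f)
            return start + off;                       // crossing found forwards
    }
    return start;                                     // no crossing at all
}

// Join the tail of `a` to the head of `b` with a linear cross-fade over
// `overlap` samples. If the overlapping material is in phase, the fade is
// smooth; if it is out of phase, partial cancellation occurs (rule 4 above).
std::vector<float> crossfadeJoin(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 std::size_t overlap) {
    std::vector<float> out(a.begin(), a.end() - overlap);
    for (std::size_t i = 0; i < overlap; ++i) {
        float g = static_cast<float>(i) / overlap;    // gain ramp 0 -> 1
        out.push_back((1.0f - g) * a[a.size() - overlap + i] + g * b[i]);
    }
    out.insert(out.end(), b.begin() + overlap, b.end());
    return out;
}
```

A real editor would additionally match slopes, dynamics and phase, as the rules above demand; the sketch only shows where the zero-crossing search and the fade itself sit.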




Writing your own computer music programs:

For one reason or another you may need a computer program which is not available or not affordable. You may still write it yourself, if you have a working knowledge of a programming language and a computer with a compiler (most UNIX-based operating systems come with the GCC compiler preinstalled).

For the time being we will omit a general tutorial on how to write a UNIX program in C or C++, because that goes way beyond the scope of this site, and it can be found in many superb tutorial books.

Case Study 1: A real-time interaction program

A step-by-step explanation of how to write a simple real-time, low-latency program for modification of an audio signal on UNIX, in the C++ programming language, will follow. The program which can be downloaded and studied under the name RingMod is a full practical implementation of this case study.

In the audio pages of the SGI web site, the following recommendation by the former head of the audio engineering team, Mr. Doug Scott, could be found as part of an answer to a FAQ:

Have two processes, a UI process, and an audio process. The latter runs at high priority. It spends most of its time blocked so it doesn't eat the CPU. Use sproc() for your second process so the two share memory.

The audio process sits in a loop that looks like [in C++ pseudo-code]:

while (1) {
    block until one of two things happens:
        (a) we get an event to do something from the UI process
        (b) the audio port needs more data

    if (a) {
        get the event
        do what it says
    } else if (b) {
        compute the next chunk of data to be sent out the port
    }
}

The blocking is accomplished using select(), which means you need to get file descriptors associated with actions (a) and (b). For (a) the best way is a pollable semaphore. If you haven't played with these, they're quite cool. See the usnewsema and usnewpollsema man pages. (I'm assuming you know how semaphores work in general. If not, they're pretty easy to learn, and really useful). You get a semaphore to share between processes. A pollable semaphore works like this: you initialize the semaphore to 0. The audio process gets a file descriptor for the semaphore, upon which it can block. If the audio process does a uspsema() and it fails, the audio process can then block on select waiting for the semaphore to become ready. When the UI process does a usvsema(), the audio process will wake up. So you have the UI process do the usvsema when it has something new for the audio process to do. A nice clean way to have one process wake another up.

For (b) you use ALgetfd and ALsetfillpoint.

The key to an *interactive* application is that you want low latency. What you said: you don't want to sit and wait for all the samples to go out the queue. When the user hits a button, he/she wants to hear the sound right away.
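
The IRIX calls in this recipe (sproc(), usnewpollsema(), ALgetfd()) are system-specific, but the heart of the loop — blocking in select() on two descriptors — is plain POSIX. Here is a minimal sketch of that core, with a pipe standing in for the pollable semaphore; the function name and the pipe choice are mine, not RingMod's:

```cpp
#include <cassert>
#include <sys/select.h>
#include <unistd.h>

// One iteration of the audio loop: block in select() until either the UI
// descriptor (uiFd) or the audio descriptor (audioFd) becomes readable.
// Returns 'u' if a UI event arrived, 'a' if the audio port needs data.
char waitForWork(int uiFd, int audioFd) {
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(uiFd, &readable);
    FD_SET(audioFd, &readable);
    int maxFd = (uiFd > audioFd ? uiFd : audioFd) + 1;
    // Blocks until work arrives, so no CPU is burned while idle.
    select(maxFd, &readable, nullptr, nullptr, nullptr);
    if (FD_ISSET(uiFd, &readable)) return 'u';   // get the event, do what it says
    return 'a';                                  // compute the next audio chunk
}
```

In RingMod itself, the audio descriptor comes from the IRIX Audio Library (ALgetfd() with ALsetfillpoint()), and the UI wake-up from usvsema() on a pollable semaphore; here a one-byte pipe write plays the semaphore's role.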

RingMod is a fully functional implementation of the above specification. In practice it looks like this:

(simplified body of the main function in C++)

(simplified body of the audio process initialization in C++)

(user interface callback, which reads value changes of the slider w, updates the carrier frequency and the monitor's frequency display reading)

(actual body of the low-latency event loop, which controls the real-time execution of the program)

Some audio and shared-memory related calls belong to the IRIX Audio Library; some other systems have similar calls. The GUI calls belong to the ICS ViewKit interface, which used to be free for Linux, but unfortunately is no more, since it became a part of the BXPro package. Please feel free to download the RingMod source code and learn from it, use it, improve it.


Case Study 2: An analysis - resynthesis - visualization program

A step-by-step explanation of how to write a simple sound file spectral analysis - resynthesis - visualization program on UNIX, in the ANSI C programming language, will follow:

Again we will skip discussing how to write a C program on UNIX in general, because it is beyond the scope of this article. A program that does not need low-latency real-time interaction is much easier to write. In ANSI C pseudocode it commonly looks like this:


/*************************/
/* C headers and declarations */
/*************************/
main(int argc, char **argv)
{
    read file and analyze it
    initialize graphics and visualization tools
    visualize the results

    while (1) {
        wait for the next event
        dispatch the event
    }
}

/*************************/
/* callbacks, processes, redraws and updates */
/*************************/

In practice we are interested in writing something which is (1) free, (2) working well, (3) looking good, (4) fun to read, (5) extendable, (6) portable, (7) recyclable. The program FHTCore, which we use in this example, tries to follow these guidelines as closely as possible.

  1. For reading files we won't re-invent the wheel. We will use a standard, well-performing, system-implemented audio file library (such as SGI's libaudiofile, or its Linux ports by Richard Kent and Michael Pruett). Behind file analysis there is a non-trivial theory of advanced mathematics. You can read more about it in the appropriate titles listed on the "bibliography" pages of this site. Here we are not interested in doing the math; we rather intend to find a good, free, already published algorithm for spectral analysis, and apply the results of the mathematicians' research to our program. A brilliant Hartley Transform algorithm by Ron Mayer, which meets all our conditions, has been found on a web site of the University of Rio de Janeiro, Brazil.
  2. A number of test documents are included with the aforementioned algorithm. These documents are important because they help in understanding how to implement the algorithm. To graphically display the results of a sound file spectral analysis, the sound file has to be mathematically transformed in a way that we can graphically represent and see. We need data in the form of the Discrete Fourier Transform (DFT). This is exactly what the algorithm does: it computes the Fourier Transform by means of the Hartley Transform.
  3. How a program looks on screen depends on the visualization method we are using. For this purpose we chose the OpenGL graphics library by SGI, because of its superb performance. For user interface components we use the OSF Motif libraries, which are universal on all UNIX systems, good-looking, stable, well documented and easy to implement.
  4. When a program is published in source code, it is important to give the reader a chance to understand how the development of the program took place. Therefore the code has to be simple, transparent and properly commented.
  5. It is also important to understand that a computer program should be written in such a way that you or someone else can continue and extend the work. Once again, good comments prove to be invaluable.
  6. If you write a program on a particular system (e.g. IRIX), it is helpful and unselfish to try to stick to code that is universal for all UNIX systems (e.g. using OpenGL instead of IRIS GL). When you have no better choice than to use some system-specific code, try to isolate it from the universal part of the program, so that someone can replace it with appropriate code for another system when porting the program to another OS (e.g. Linux, MacOSX or Solaris).
  7. Parts of code that perform well in your program may also perform well in another program you or someone else may write in the future. Don't start writing anew from scratch; it takes too much time. Copy and paste from your old programs. Take care to write code in such a way that specific portions can easily be isolated, copied and re-applied elsewhere. IMPORTANT! In case the code from which you are copying is covered by the GNU GPL, LGPL or some other open-source license, you HAVE TO include that license in the preamble of your source code document. Otherwise it would be intellectual theft!

Please feel free to read through the FHTCore source code.

We shall discuss some critical parts of the code:

File reading loop source code

Initialization of graphics and visualization requires some extra care: the tools being used matter, and the order in which the objects are initialized also matters. Since we have found no good documentation on this particular issue which deals with open source and yet is simple enough for someone other than a professional engineer to understand, we shall pay some extra attention to this part of the program:

(Simplified visualization tools' initialization routine)

Audio resynthesis from the sonogram is implemented as a menu-driven callback. Here is the interesting portion of the code (main.C, lines 2343 - 2371):

for (j = 0; j < horiz_num; j++) {

    /* code for time stretching interpolation
       ------
    */

    /* code for copying data from visualization arrays into resynthesis channels */

    unconvertC(&channel, &outSignal, &outZeros, &lastphase, N2, vI, R, &first);
    ifft(N, outSignal, outZeros);
    overlapadd(outSignal, N, window, output, Nw, n);

    /* code for writing output data to a sound file */
}

The function unconvertC( ) undoes what convertPolar( ) does in the analysis loop. It also reconstructs the imaginary data that ifft( ) passes to the Hartley Transform algorithm.
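
The ifft( ) / overlapadd( ) pair in the loop above follows the standard overlap-add pattern: each inverse-transformed frame is windowed and accumulated into the output buffer at hop-size intervals. Here is a generic C++ sketch of that accumulation step — not the program's overlapadd( ) itself, whose call is shown above:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Accumulate one frame into `out` at offset `start`, weighted by `window`.
// With a periodic Hann window and a hop of Nw/2, the overlapping windows
// sum to a constant, so frames of an unmodified analysis reassemble the
// original signal without amplitude ripple.
void overlapAdd(const std::vector<double>& frame,
                const std::vector<double>& window,
                std::vector<double>& out, std::size_t start) {
    for (std::size_t i = 0; i < frame.size(); ++i)
        out[start + i] += frame[i] * window[i];
}

// Periodic Hann window: w[i] = 0.5 - 0.5 * cos(2*pi*i/Nw).
std::vector<double> hann(std::size_t Nw) {
    const double PI = std::acos(-1.0);
    std::vector<double> w(Nw);
    for (std::size_t i = 0; i < Nw; ++i)
        w[i] = 0.5 - 0.5 * std::cos(2.0 * PI * i / Nw);
    return w;
}
```

The constant-overlap property is easy to verify: feeding frames of all ones through the windowed accumulation at 50% hop should leave every interior output sample equal to exactly 1.0.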