MUSC 108. Introduction to Music Technology - Fall 2013

Unit 1 Reading - Lecture 1

[Syllabus]

Six Basic Properties of Sound

  1. Frequency
  2. Amplitude
  3. Timbre
  4. Duration
  5. Envelope
  6. Location

Frequency refers to how high or how low the note sounds (pitch). The term pitch is used to describe frequencies within the range of human hearing.

Amplitude refers to how loud or soft the sound is.

Duration refers to how long a sound lasts.

Timbre ( pronounced TAM-burr) refers to the characteristic sound or tone color of an instrument. A violin has a different timbre than a piano.

Envelope refers to the shape or contour of the sound as it evolves over time. A simple envelope consists of three parts: attack, sustain, and decay. An acoustic guitar has a sharp attack, little sustain and a rapid decay. A piano has a sharp attack, medium sustain, and medium decay. Voice, wind, and string instruments can shape the individual attack, sustain, and decay portions of the sound.

Location describes the sound placement relative to our listening position. Sound is perceived in three dimensional space based on the time difference it reaches our left and right eardrums.

These six properties of sound are studied in the fields of music, physics, acoustics, digital signal processing (DSP), computer science, electrical engineering, psychology, and biology. This course will study these properties from the perspective of music, MIDI, and digital audio.

Terminology

MIDI

MIDI (Musical Instrument Digital Interface) is a hardware and software specification that enables computers and synthesizers to communicate through digital electronics . The first version of the MIDI standard was published in 1983. MIDI itself does not produce any sounds, it simply tells a synthesizer to turn notes on and off. The quality of the sounds you hear are dependent on the sounds built into the synthesizer. Cheap MIDI synthesizers sound like toys. Expensive MIDI synthesizers can realistically reproduce a full orchestra, however they may cost over $10,000. The MIDI Manufacturers Association (www.midi.org) is in charge of all things MIDI.

Digital Audio

Digital audio is a mix of mathematics, computer science, and physics. Sound waves we hear are represented as a stream of numbers. An Analog Digital Converter (ADC) converts an analog signal (e.g., voltage fluctuations from a microphone) into numbers that are sent to a computer. The computer processes the numbers and sends them on to a Digital Analog Converter (DAC). The DAC converts the numbers back into an analog signal that drives a speaker.

DAW signal chain

http://upload.wikimedia.org/wikipedia/commons/8/84/A-D-A_Flow.svg

Analog and Digital Signals

An analog signal is a continuous signal. A digital signal is a discrete signal. Analog signal values are known for all moments in time. Digital signals are only known at certain specified times.

Continuous Analog Signal Discrete Digital Signal
Continuous signal Discrete Signal

These screen shots were captured using the open source, cross platform software Octave. (http://octave.sourceforge.net/)

Prefixes

These prefixes refer to numerical quantities. For example a 1 gigahertz computer's CPU is timed with a clock running in nanoseconds. Or, a slow digital audio recorder can record 23 samples every microsecond. I purchased my first computer hard drive in 1987, it was a 1 megabyte drive and cost $1000. I recently purchased a 3 terabyte hard drive for $200.

Prefix
Value
Abbreviation
Tera 1,000,000,000,000
T
Giga 1,000,000,000
G
Mega 1,000,000
M
Kilo 1,000
K
   
Milli 0.001
m
Micro .000001
μ
Nano .000000001
n
Pico .000000000001
p

The Six Properties of Sound as Described in Music, MIDI, and Digital Audio

Frequency

Frequency is measured in Hertz (Hz). One Hz is one cycle per second. Human hearing lies within the range of 20Hz - 20,000Hz. As we get older the upper range of our hearing diminishes. Human speech generally falls in the range from 85 Hz - 1100 Hz. Two frequencies are an octave apart if one of them is exactly twice the frequency of the other. These frequencies are each one octave higher than the one before: 100Hz, 200Hz, 400Hz, 800Hz, and 1600Hz.

Frequency in Music

Frequency and pitch are often thought to be synonymous. However, there is a subtle distinction between them. A pure sine wave is the only sound that consists of one and only one frequency. Most musical sounds we hear contain a mix of several harmonically related frequencies. Nonetheless, musical frequencies are referred to as pitch and are given a names like "Middle C" or "the A above Middle C".

The frequency spectrum used in music is a discrete system, where only a select number of specific frequencies are used. For example, the piano uses 88 of them from 27.5 Hz to 4186 Hz. The modern system of tuning is called Equal Temperament and divides the octave 12 equal parts. The frequency ratio between any two neighboring notes is Twelfth root of two.

piano keyboard picture

Lowest A

Unable to play MP3

A 440

Unable to play MP3

Highest C

Unable to play MP3

The higher the frequency, the higher the pitch. The lower the frequency, the lower the pitch. High pitches are written higher on the musical staff than low pitches. High notes on the piano are on the right, and low notes are on the left as you're facing the piano.

Low frequency = low pitch

High frequency = high pitch

sine wave low frequency picture
sine wave high frequency picture
Unable to play MP3
Unable to play MP3

Frequency in MIDI

MIDI represents frequency as a MIDI note number corresponding to notes on the piano. The 88 keys on the piano are assigned MIDI note numbers 21-108, with middle C as note number 60. You can move up or down an octave in MIDI by adding or subtracting 12 from the MIDI note number. MIDI is capable of altering the frequency of a note using using the "pitch bend" command. A MIDI Tuning Extension permits microtonal tuning for each of the 128 MIDI note numbers. However, not all synthesizers support micro tuning.

Frequency in Digital Audio

Physics and digital signal processing (DSP) deal with very large frequency ranges. The audio spectrum is a very small part of that range, a relatively narrow band between 20 - 20,000 Hz. The Audio band spans 20000 (20000 Hz) Hz out of a total bandwidth of Ten to the twentieth Hz. Because of the extreme differences between audio waves and gamma waves, a logarithmic scale was used for the graph.

Frequency Spectrum

https://upload.wikimedia.org/wikipedia/commons/7/72/Radio_transmition_diagram_en.png, modified JE

The "purest" pitched tone is a sine wave. The formula for a sampled sine wave is shown below where n is a positive integer sample number, f is the frequency in Hz, SR is the sampling rate in samples per second, and θ is the phase in radians.

Sampled sine wave formula

Many complex tones can be generated by adding, subtracting, and multiplying sine waves. Any frequency is possible and any tuning system is possible. You could divide the octave into N parts with a frequency spacing of Nth root of two between notes instead of the twelve equally spaced half steps used in Equal Temperament.

Amplitude

The amplitude of a sound is a measure of it's power and is measured in decibels. We perceive amplitude as loud and soft. Studies in hearing show that we perceive sounds at very low and very high frequencies as being softer than sounds in the middle frequencies, even though they have the same amplitude.

Amplitude in Music

The musical term for amplitude is dynamics. There are nine common music notation symbols used used to represent dynamics. From extremely loud to silence they are:

Symbol
Name
Performed
dynamics fff
fortississimo
as loud as possible
dynamics ff
fortissimo
very loud
dynamics f
forte
loud
dynamics mf
mezzo forte
medium loud
dynamics mp
mezzo piano
medium soft
dynamics p
piano
soft
dynamics pp
pianissimo
very soft
dynamics ppp
pianississimo

as soft as possible

rests
rest

silence

The pianoforte is the ancestor of the modern piano and was invented in Italy around 1710 by Bartolomeo Cristofori. It was the first keyboard instrument that could play soft (piano) or loud (forte) depending on the force applied to the keys, thus its name. If it had been invented in England we might know it as the softloud.

Amplitude in MIDI

The MIDI term for amplitude is velocity. MIDI velocity numbers range from 0-127. Higher velocities are louder.

Amplitude in Digital Audio

The amplitude of a sound wave determines its relative loudness. Looking at a graph of a sound wave, the amplitude is the height of the wave. These two sound waves have the same frequency but differ in amplitude. The one with the higher amplitude sounds the loudest.

Low amplitude = soft sound
High amplitude = loud sound
sine wave low amplitude picture
sine wave high amplitude picture
Unable to play MP3
Unable to play MP3

Amplitude is measured in decibels. Decibels have no physical units, they are pure numbers that express a ratio of how much louder or softer one sound is to another. Because our ears are so sensitive to a huge range of sound, decibels use a logarithmic rather than a linear scale according to this formula.

decibel reference formula

Positive dB's represent an increase in volume (gain) and negative dB's represent a decrease (attenuation) in volume. Doubling the amplitude results in a 6 dB gain and the sound seems twice as loud. Halving the amplitude results in a 6 dB attenuation, or a -6 dB gain, and the sound seems half as loud. When the amplitude is changed by a factor of ten the decibel change is 20 dB. Every 10 decibel change represents a power of ten increase in sound intensity. For example, the intensity difference between the softest symphonic music (20 dB) and loudest symphonic music (100 db) differs by a factor of 100,000,000 (10 to the 8th power). This table shows some relative decibel levels. Read 10^3 as Ten to the third power.

Decibels

Logarithmic Scale

Magnitude

Linear Scale

Description


160
10^16
 
150
10^15
 
140
10^14

Jet takeoff

130
10^13
 
120
10^12

Threshold of pain
Amplified rock band

110
10^11
 
100
10^10
Loudest symphonic music
90
10^9
 
80
10^8
Vacuum Cleaner
70
10^7
 
60
10^6
Conversation
50
10^5
 
40
10^4
 
30
10^3
 
20
10^2

Whispering
Softest symphonic music

10
10^1
 
1
10^0
Threshold of hearing
0
Silence

Digital audio often reverses the decibel scale making 0 dB the loudest sound that can be accurately produced by the hardware without distortion. Softer sounds are measured as negative decibels below zero. Software decibel scales often use a portion of the 0 dB to 120 dB range and may choose an arbitrary value for the 0 dB point. This dB scale is found in the Logic Pro software. The dB scale on the left goes from 0 dB down to -60 dB. The 0.0 dB setting on the volume fader on the right corresponds to -11 dB on the dB scale.

Logic Pro Fader

When recording digital audio, you never want the sound to stay under well under 0 dB. Notice the flat tops on the signal on the left at the 0 dB mark. That's referred to as digital clipping and it sounds terrible. The screen shots were captured using the open source cross platform software Audacity that you'll be using later in the course. (http://audacity.sourceforge.net/)

Clipped signal
Unable to play MP3
Unable to play MP3

Duration

When we talk about duration we're talking about time. We need to know two events related to the time of a sound, when did it start and how long did it last. In music and digital audio, time usually starts at zero. How time is tracked is usually a variation on chronological time or proportional time. Here are some examples.

Chronological Time or Clock Time

"When I heard the first sound, I looked at my digital watch and it was 4:25:13. When I heard the second sound it was 4:25:17. The first sound had already stopped by then."

We know the starting time of both sounds but we don't know when the first sound stopped.

Difference Time or Elapsed Time

"When I heard the first sound, I started the digital timer on my watch. The sound ended exactly 1.52 seconds later. I heard the second sound a little later but wasn't able to reset the timer."

We know the duration of the first sound but we don't know the exact time either the first or second sound started.

Proportional Time

"I just happened to be taking my pulse when I heard the first sound. I was on pulse count 8. Because I was also looking at my watch, I also noticed the time was 4:25:13. I kept counting my pulse and the sound stopped after four counts. I heard the second sound start four counts later. I had just been running and my pulse rate was a fast 120 beats a minute."

We know the first sound started at time 4:25:13 and lasted for 4 pulses or 2 seconds. We know that the second sound started 2 seconds after that.

Duration in Music

The proportional time example is very similar to the way time is kept in music. A metronome supplies a steady source of beats separated by equal units of time. One musical note value (often a quarter note) is chosen as the beat unit that corresponds to one click of the metronome. All other note values are a proportional to that. Metronome clicks can come at a slow, medium, or fast pace. The speed of the metronome clicks is measured in beats per minute and is called the tempo. Whether the tempo is slow or fast, the rhythmic proportions between the notes remains the same. The actual duration of each note expands or contracts in proportion to the tempo.

Duration in MIDI

Time is not defined in the MIDI standard. The software uses the computer clock to keep track of time, often in milliseconds, sometimes in microseconds. Here's a recipe (pseudo code) to play two quarter notes at a tempo of 60.

  1. Check the song tempo. It's 60 beats per minute so each quarter note lasts 1000 milliseconds.
  2. Set the computer clock to 0.
  3. Send a Note On message (NON) for the first note.
  4. Keep checking the clock and wait until the clock reads 1000 milliseconds.
  5. Send a Note Off message (NOF) to turn off the first note.
  6. Send a NON message to turn on the second note.
  7. Keep checking the clock and wait until the clock reads 2000 milliseconds.
  8. Send a NOF message to turn off the second note.

Duration in Digital Audio

Duration in digital audio is a function of the sampling rate. Audio CD's are sampled at a rate of 44,100 samples per second with a bit depth of 16. Bit depth refers to the range of amplitude values. The largest number that can be expressed in sixteen bits is Two to the 16th or 65,536. To put audio sampling in perspective, let's graph of one second of sound at the CD sampling rate and bit depth. Start with a very large piece of paper and draw tick marks along X axis, placing each tick exactly one millimeter apart. You'll need 44,100 tick marks which will extend about 44.1 meters (145 feet). Next you need place tick marks spaced one millimeter apart on the Y axis to represent the 16 bit amplitude values. Let's put half the ticks above the X axis and half below. You'll need 32.768 meters (about 107 feet) above and below the X axis. Now draw a squiggly wave shape above and below the X axis, from the origin to the end of the 44100th sample while staying within the Y axis boundaries. Next carefully measure the waveform height in millimeters above or below the X axis at every millimeter sample point point along the 145 foot X axis. Write these numbers down in a single file column. You've just sampled one second of a sound wave at the CD audio rate. It probably took longer than one second; definitely not real time sampling.

Digital sampling is done with specialized hardware called an Analog to Digital Converter (ADC). The ADC electronics contain a clock running at 44100 Hz. At every tick of the clock, the ADC reads the value of an electrical voltage at its input which is often a microphone. That voltage is stored as a 16 bit number. Bit depths of 24 are also used in today's audio equipment. A bit depth of 24 can use 16,777,216 different values to represent the amplitude.

The counterpart to the ADC is the DAC (Digital to Analog Converter) that converts the sample numbers back into an analog signal that can be played through a speaker. Most modern computers have consumer quality ADC and DAC converters built in. Professional recording studios use external hardware ADC's and DAC's that often cost thousands of or tens of thousands of dollars.

Timbre

Timbre (pronounced TAM-burr) refers to the tone color of a sound. It's what makes a piano sound different from a flute or violin. The timbre of a musical instrument is determined by its physical construction and shape. Sounds with different timbres have different wave shapes. Here are the waveforms of several different instruments playing the pitch A440.

Piano
piano waveform picture

Unable to play MP3

Violin
violin waveform picture
Unable to play MP3
Flute
flute  waveform picture
Unable to play MP3
Oboe
oboe waveform picture
Unable to play MP3
Trumpet
trumpet waveform picture
Unable to play MP3

Electric Guitar
electric guitar waveform picture
Unable to play MP3
Bell
bell waveform picture
Unable to play MP3
Snare Drum
snare drum waveform picture
Unable to play MP3

Timbre in Music

Timbre in music is specified as text in the score, like Sonata for flute, oboe and piano. Timbre changes may be specified in the score as special effects like growls, harmonics, slaps, scrapes, or by attaching mutes to the instrument.

Timbre in MIDI

Timbre in MIDI is changed by pushing hardware buttons or sending a MIDI message called the Patch change command.

Timbre in Digital Audio

Different timbres have different squiggly waveform shapes. More precisely, timbre is defined by the relative strength of the individual components present in the frequency spectrum of the sound as it evolves over time.The method of obtaining a frequency spectrum is based on the mathematics of the Fourier transform. The Fourier theorem states that any periodic waveform can be reproduced as a sum of a series of sine waves at integer multiples of a fundamental frequency with well chosen amplitudes and phases.

Fourier formula

The pure mathematics sum would have an infinite number of terms. However, in digital audio a reasonable number works quite well.

The Fast Fourier Transform (FFT) is arguably the single most important tool in the field of digital signal processing. It can convert a sampled waveform from the time domain waveform into the frequency domain and back again. By adjusting and manipulating individual components in the frequency domain it is possible to track pitch changes; create brand new sounds; create filters that modify the sound; morph one sound into another; stretch the time without changing the pitch; and change the pitch without affecting the time. Within the last 10 years, desktop computers have become fast enough to process these DSP effects in real time.

Here are three spectrum views of a violin playing the note A440. The standard FFT displays the frequency spectrum. You can see three prominent peaks near 440 Hz, 880 Hz, and 1320 Hz. Those are the first three notes of the harmonic series, f, 2f, and 3f, when f equals 440 Hz. The sonogram displays the same data plotted with time on the X axis and frequency on the Y axis. The spectrogram displays the same data with the axes reversed from the sonogram.

Standard FFT Violin FFT
Sonogram Violin Sonogram
Spectrogram Violin Spectrogram

These screen shots were captured in the open source, cross platform sound editor Snd. (https://ccrma.stanford.edu/software/snd/)

Envelope

The term envelope is used to describe the shape of a sound over the life of the note. Does it start abruptly or gradually? Does it sustain uniformly? Does it die away quickly or slowly?

Envelope in Music

The performers touch or breath can shape the envelope of a musical sound. Another term for this is articulation

Envelope in MIDI

A sound envelope in MIDI may be built in to the sound itself or it may be controlled through an Attack, Decay, Sustain, Release (ADSR) envelope. ADSR envelopes can sometimes be adjusted by tweaking knobs and buttons on the hardware or in the software. An ADSR envelope can be simulated in MIDI with volume or expression control messages.

Envelope in Digital Audio

The envelope of a sound is the outline of the waveform's amplitude changing over time as seen on an oscilloscope or in sound editing software.

Violin envelope

You can mathematically "envelope" a waveform by creating a second wave (amplitude values 0-1) that represents the envelope. The second wave must be the same duration as the first wave. When you multiply the samples of the original wave by the envelope wave, you create a third waveform conformed to the shape of the envelope.

Sound envelope

Many sound editing software programs have functions to create sound envelopes. This example was created in Audacity.

Audacity Envelope Editor
Unable to play MP3

This second example was created in SoundTrack Pro, a part of Logic Studio.

Soundtrack Pro envelope editor
Unable to play MP3

Location

Location refers to the listener's perception of where the sound originated.

Location in Music

The location of the sound is usually not specified in the score but is determined by the standing or seating arrangement of the performers.

Location in MIDI

MIDI restricts the location of sound to the two dimensional left right stereo field. The MIDI Pan control message can be used to position the sound from far left to far right or anywhere in between. If the sound is centered, equal volumes appear in the left and right speaker. If more signal is sent to the left speaker than to the right speaker, the sound will be heard coming from the left.

Location in Digital Audio

Sophisticated mathematical formulas can create a three dimensional sound from two dimensional stereo headphones or speakers. Waveforms can be played with multiple delayed versions of themselves to simulate the reverberation characteristics of an acoustic space.

Did you know?

How long is a sound wave?

You can calculate the distance between crests of a sound wave if you know its frequency. The speed of sound equals 791 mph or 1160.1 feet/second (sea level, 59 degrees F). In one second the note A440 goes through 440 cycles and has traveled 1160 feet. Therefore the distance between crests from one cycle to the next is 2.64 feet (1160.1 ft/sec ÷ 440 cycle/sec = 2.64 feet/cycle). The distance between crests for the lowest note on the piano (27.5 Hz) is 42 feet and 3.3 inches for the highest note (4186 Hz).

Beats

When two notes are slightly out of tune they produce beats. In this example, the first sound is a sine wave at 440 Hz, the second is 441 Hz, and the third is the combination of the two. When combined, their waveforms create audible interference patterns called beats. It's very difficult to hear the difference between the 440 Hz and 441 Hz sounds when played individually, but when they're combined it's easy to hear the beats. The rate of beating is equal to the difference in frequencies. In this example the beating rate is 1 Hz.


Sine wave at 440 Hz
Unable to play MP3
Sine wave at 441 Hz
Unable to play MP3
440 Hz and 441 Hz combined
Unable to play MP3

In Tune

The definition of being "in tune" is the absence of beats. As the two notes get further out of tune, the beats become faster. As the two notes get closer to being in tune, the beats become slower. When there are no perceptible beats, the two notes are in tune.

Out of Tune

The strings of this piano note are slightly out of tune. Listen for beats as the sound decays.

Unable to play MP3

The world's most accurate clock

Posted in Physics, 21st November 2011 13:00 GMT
US researchers recently named the atomic clock at the UK's National Physical Laboratory (NPL) in London the most accurate atomic timepiece on the planet.
Experts from the University of Pennsylvania found that NPL-CSF2 loses just one nanosecond every two months.
http://www.theregister.co.uk/2011/11/21/npl_measuring_the_second/

At that rate the clock will be one second off in just over 166 million years.

[Syllabus]

Revised John Ellinger, January - September 2013