Tuning

Naturally occurring musical sounds often contain harmonics, with lower harmonics stronger than higher ones. When two notes are played, if many of their harmonics coincide then they sound consonant. Otherwise they sound dissonant.

Two sinusoids close in frequency produce beats: we perceive an oscillating change in volume. If their frequencies are further apart, one hears roughness in the sound. Further apart still, the sound becomes smooth. The critical band, denoted by \(f_{c b}\), is where the transition from roughness to smoothness takes place. If the ratio of two given frequencies exceeds \(f_{c b}\) then they are independent and sound smooth; otherwise they interact, causing beats or roughness.

At about 500 Hz and above, \(f_{c b}\) is under a minor third, while at 100 Hz or so it is almost an octave. Hence the same interval can sound smooth at higher frequencies yet sound muddy at lower frequencies.

Consider two tones (whose fundamental frequencies are) an octave apart. The harmonics of the higher tone are precisely the even harmonics of the lower tone, hence tones that are any number of octaves apart will have harmonics that agree. We say they belong to the same pitch class. More generally, a tone whose fundamental frequency is the \(n\)th harmonic of another tone has harmonics that coincide with every \(n\)th harmonic of the lower tone.

Let us now consider the simplest case where not all harmonics coincide. Suppose two tones with fundamental frequencies \(2f, 3f\) are sounded, for some frequency \(f\). Then every second harmonic of the higher tone, namely \(3f, 9f, 15f, \ldots\), is not a harmonic of the lower tone, and at lower frequencies their combined sound is rough. However, their combined sound has frequencies at the harmonics of a tone with fundamental frequency \(f\); it is as if we have sounded a tone with fundamental frequency \(f\) missing the \(n\)th harmonic for each \(n\) that is not a multiple of 2 or 3. Even though the first harmonic \(f\) is missing, as well as many others, we associate the frequency \(f\) with their combined sound due to a psychophysical phenomenon known as residue pitch.
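The bookkeeping in this example can be checked with a few lines of Python. This is only a sketch: harmonics are written as integer multiples of \(f\), and the cutoff `LIMIT` is an arbitrary choice made for illustration.

```python
# Harmonics are represented as integer multiples of the base frequency f.
LIMIT = 24  # assumption: only inspect multiples of f up to 24f

lower = {2 * k for k in range(1, LIMIT // 2 + 1)}   # harmonics of the tone at 2f
higher = {3 * k for k in range(1, LIMIT // 3 + 1)}  # harmonics of the tone at 3f

shared = sorted(lower & higher)        # harmonics the two tones have in common
only_higher = sorted(higher - lower)   # harmonics of 3f that 2f lacks
combined = lower | higher
# Harmonics of a tone at f that neither sounded tone supplies.
missing = [n for n in range(1, LIMIT + 1) if n not in combined]

print(shared)       # [6, 12, 18, 24]
print(only_higher)  # [3, 9, 15, 21]
print(missing)      # [1, 5, 7, 11, 13, 17, 19, 23]
```

The last list begins with 1: the fundamental \(f\) itself is absent from the combined sound, which is the situation the residue pitch phenomenon addresses.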

[I’m guessing our hearing range is partly to blame for larger critical bands at lower frequencies; two low tones with similar frequencies have a residue pitch lying around or beyond the lower extreme of the range of frequencies a human can sense.]

Perfect intervals and Pythagorean tuning

Apart from octaves, no other ratio meshes better than 3:2 so we call this ratio perfect.

Pythagoras constructed a scale based entirely on this perfect interval. Starting from a frequency \(f\), we iteratively multiply by \(3/2\) to obtain other pitches. Then the notes are placed in the same octave by scaling by powers of 2. One is then chosen as the start note, determining the mode of the scale.

For example, starting from an F, multiplying by 3/2 repeatedly gives us C, G, D, A, E then B. Choosing the start to be C gives the Pythagorean diatonic major scale. It is the only scale where the intervals between the first four notes are precisely the intervals between the last four notes: tone, tone, semitone. A block of four notes occurring in this fashion is called a diatonic tetrachord.
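The construction can be sketched with exact rational arithmetic. The helper name `fold_into_octave` and the variable names are illustrative choices, not part of any standard library.

```python
from fractions import Fraction

def fold_into_octave(ratio):
    """Scale a ratio by powers of 2 until it lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

names = ["F", "C", "G", "D", "A", "E", "B"]  # order produced by stacking fifths from F
fifth = Fraction(3, 2)

# Ratio of each note relative to F, before any octave reduction.
from_f = {name: fifth ** i for i, name in enumerate(names)}

# Re-express every note relative to C and fold it into one octave.
scale = {name: fold_into_octave(r / from_f["C"]) for name, r in from_f.items()}
ordered = sorted(scale.items(), key=lambda item: item[1])

for name, ratio in ordered:
    print(name, ratio)

# Steps between consecutive notes, closing the octave at ratio 2.
ratios = [r for _, r in ordered] + [Fraction(2)]
steps = [b / a for a, b in zip(ratios, ratios[1:])]
print(steps)
```

The steps come out as 9/8, 9/8, 256/243, 9/8, 9/8, 9/8, 256/243, so the first three and the last three both read tone, tone, semitone.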

The ratio between the first and fifth notes is 3/2, explaining the term "perfect fifth". In fact, every fifth on this scale is perfect except for that between B and F, which is called a diminished fifth. Similarly, since fourths are inversions of fifths, every fourth is perfect except for the augmented fourth between F and B.

By leaving out the 7th note, the last to be added, one can avoid imperfect intervals. A traditional Scottish tuning system does exactly this. One could also leave out the second-last note to be added, yielding the pentatonic scale, used in ancient Chinese, far eastern, and European folk music such as Celtic music.

On the other hand, going further and adding an eighth note, a fifth above B, gives us a perfect fifth with B as the bottom note; in other words, we have F#, which yields a diatonic scale starting from G. The interval between F# and G is a Pythagorean semitone (i.e. their frequencies have the ratio \(2^8 / 3^5\)). However, between F and F# we have the larger Pythagorean chromatic semitone or Pythagorean apotome, whose ratio is given by \(3^7 / 2^{11}\).

The interval between the two types of semitones is called the Pythagorean comma and takes the value \(3^{12}/2^{19}\). This ratio is also the amount by which 12 perfect fifths exceed 7 octaves.
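These ratios are easy to verify exactly. In the sketch below the variable names are illustrative; note that twelve fifths span roughly seven octaves, so that is the comparison used.

```python
from fractions import Fraction

fifth = Fraction(3, 2)
semitone = Fraction(2**8, 3**5)    # Pythagorean semitone, e.g. F# to G
apotome = Fraction(3**7, 2**11)    # chromatic semitone, e.g. F to F#

comma_from_semitones = apotome / semitone
comma_from_fifths = fifth**12 / 2**7   # twelve fifths against seven octaves

print(comma_from_semitones)                       # 531441/524288
print(comma_from_fifths == comma_from_semitones)  # True
print(semitone * apotome)                         # 9/8, a whole tone
```

The last line shows that the two semitones together make up exactly one Pythagorean whole tone.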

Harmonic thirds

We have seen that two tones in the ratio 3/2 (perfect fifth) are consonant, as are two tones in the ratio 4/3 (perfect fourth). If the ratio is 5/4, then consonance still occurs, though not as many harmonics agree as in the other two cases. This is the just or harmonic major third. One step further, the ratio 6/5 defines the just or harmonic minor third. Ptolemy proposed a tuning system that included this interval, but it was ignored until much later.

Pythagorean tuning has wider major thirds and sixths and narrower minor thirds and sixths, which is exploited in Gothic music. The syntonic comma or comma of Didymus is the ratio between a Pythagorean major third and a harmonic major third, and is also the ratio between a Pythagorean minor third and a harmonic minor third. It is the value \(81/80\).
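As a quick check of the value 81/80, the Pythagorean thirds can be built from stacked fifths and compared with their harmonic counterparts. The variable names are illustrative.

```python
from fractions import Fraction

fifth = Fraction(3, 2)

# Pythagorean major third: four fifths up, then two octaves down.
pyth_major_third = fifth**4 / 4
# Pythagorean minor third: three fifths down, then two octaves up.
pyth_minor_third = (1 / fifth)**3 * 4

just_major_third = Fraction(5, 4)
just_minor_third = Fraction(6, 5)

print(pyth_major_third, pyth_minor_third)   # 81/64 32/27
print(pyth_major_third / just_major_third)  # 81/80
print(just_minor_third / pyth_minor_third)  # 81/80
```

The Pythagorean major third is wider than the harmonic one, and the Pythagorean minor third is narrower, by the same ratio.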

By extending F# by two fifths we obtain C# and G#. Extending downwards two fifths from F gives Bb (most likely the first accidental introduced to organ keyboards around the 10th or 11th century) and Eb, yielding a scale that was common on keyboards around 1300. This gives perfect fifths for every note except for the one between the extremes, i.e. G# and Eb, the "wolf" interval, which is rarely present in Gothic music.
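The size of the wolf interval follows from the same arithmetic. The sketch below assumes the chain of pure fifths Eb, Bb, F, C, G, D, A, E, B, F#, C#, G# described above; the variable names are illustrative.

```python
from fractions import Fraction

fifth = Fraction(3, 2)

# Eleven pure fifths lead from Eb up to G#.
eb_to_gsharp = fifth**11

# The remaining interval, G# up to Eb, must close the circle at seven octaves.
wolf = Fraction(2)**7 / eb_to_gsharp

pythagorean_comma = Fraction(3**12, 2**19)
print(wolf)                               # 262144/177147
print(fifth / wolf == pythagorean_comma)  # True
print(float(wolf), float(fifth))
```

So the wolf is a fifth that falls short of 3/2 by exactly one Pythagorean comma.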

Around 1400, instead of extending upwards to find F#, C#, and G#, musicians would extend downwards from Eb to get Ab, Db and Gb, which altered thirds involving sharps so that they were closer to their harmonic versions. This changed the quality of many pieces of that period.

As we go from Gothic to Renaissance music, thirds and sixths gain importance. As a result, a tuning that smoothed thirds and sixths was sought.

Just tuning

Zarlino adjusted the major third so that it had a 5/4 frequency ratio. For example, in the key of C, we find E and G by using the ratios 5/4 and 3/2. Thus the frequencies of the three notes in this major triad are related by the ratio 4:5:6. The rest of the diatonic scale is constructed by repeating this procedure on the fourth and again on the fifth. In our example, this yields F, A and C, and then G, B and D. The full chromatic scale can be constructed via more major triads.

Thirds and sixths are now more harmonious, but we pay dearly. This tuning has two types of whole steps (major = 9/8, and minor = 10/9), and three types of semitones (diatonic = 16/15, chromatic major = 135/128, chromatic minor = 25/24).
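The triad construction and the resulting step sizes can be reproduced as follows. This is a sketch under the assumption that the three triads are rooted on C, F and G as in the example; the names `fold` and `triads` are illustrative.

```python
from fractions import Fraction

def fold(ratio):
    """Scale a ratio by powers of 2 until it lies in [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

major_triad = [Fraction(1), Fraction(5, 4), Fraction(3, 2)]  # 4:5:6

# Root ratio (relative to C) and note names of each triad.
triads = {
    Fraction(1): ["C", "E", "G"],
    Fraction(4, 3): ["F", "A", "C"],
    Fraction(3, 2): ["G", "B", "D"],
}

scale = {}
for root, names in triads.items():
    for name, interval in zip(names, major_triad):
        scale[name] = fold(root * interval)

ordered = sorted(scale.items(), key=lambda item: item[1])
ratios = [r for _, r in ordered] + [Fraction(2)]
steps = [b / a for a, b in zip(ratios, ratios[1:])]

print(ordered)
print(steps)

# Each whole step minus a diatonic semitone leaves a chromatic semitone.
diatonic = Fraction(16, 15)
print(Fraction(9, 8) / diatonic, Fraction(10, 9) / diatonic)  # 135/128 25/24
```

The steps come out as 9/8, 10/9, 16/15, 9/8, 10/9, 9/8, 16/15, showing both kinds of whole step within a single octave.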

A frequency \(f\) can be viewed as being the \(n\)th harmonic of the fundamental frequency \(f / n\), called the \(n\)th subharmonic. The subharmonic series of \(f\) is \(f, f/2, f/3,…\). We have the minor triad at the fourth, fifth and sixth subharmonics (with harmonics, we get the major triad). Their frequencies lie in the ratio \(1/6 : 1/5 : 1/4\), i.e. \(10 : 12 : 15\). These can be used to construct the just minor scale. (That is, from an arbitrary pitch, construct a minor triad, then use the first note as the last note of another minor triad, then use the last note of the first triad to construct one more minor triad.)
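The conversion from \(1/6 : 1/5 : 1/4\) to \(10 : 12 : 15\) amounts to clearing denominators. A small sketch, with the hypothetical helper `smallest_integer_ratio`:

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def smallest_integer_ratio(fractions):
    """Rescale a list of fractions to the smallest equivalent whole numbers."""
    # Least common multiple of the denominators.
    common = reduce(lambda a, b: a * b // gcd(a, b),
                    (f.denominator for f in fractions))
    whole = [int(f * common) for f in fractions]
    divisor = reduce(gcd, whole)
    return [w // divisor for w in whole]

minor = smallest_integer_ratio([Fraction(1, n) for n in (6, 5, 4)])
major = smallest_integer_ratio([Fraction(n) for n in (4, 5, 6)])

print(minor)  # [10, 12, 15]
print(major)  # [4, 5, 6]

# The minor triad stacks a 6/5 third under a 5/4 third; the major triad reverses them.
print(Fraction(minor[1], minor[0]), Fraction(minor[2], minor[1]))  # 6/5 5/4
```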

Meantone temperament

The most extreme approach is to force all thirds to be harmonic at the expense of the fifths. Pythagorean tuning obtains a major third by ascending 4 perfect fifths. If we reduce the size of each fifth by a quarter of the syntonic comma, then we instead reach the harmonic major third. This modification is called "1/4-comma meantone", and is the most common meantone tuning.
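A numerical sketch of this adjustment follows. It uses floating point because a quarter of a comma is a fourth root, which is irrational; the variable names are illustrative.

```python
import math

syntonic_comma = 81 / 80

# Narrow the pure fifth by a quarter of the syntonic comma.
meantone_fifth = (3 / 2) / syntonic_comma ** 0.25

# Four tempered fifths up, then two octaves down, give the major third.
major_third = meantone_fifth ** 4 / 4

print(meantone_fifth)  # slightly below 1.5
print(major_third)     # 1.25 up to rounding error
print(math.isclose(meantone_fifth, 5 ** 0.25))  # True
```

Because \((3/2)^4 / (81/80) = 5\), the tempered fifth is exactly the fourth root of 5, and four of them land on 5/4 once two octaves are removed.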

Well-Temperament

As we progress from Renaissance to Baroque music, major and minor thirds and sixths along with fourths and fifths are featured in practically every key. With any of the above tunings, they can never all be consonant. What can be done?

One solution is to incorporate more than 12 keys per octave on a keyboard. This was tried, but did not spread.

Werckmeister solved the problem without adding more keys by narrowing some fifths while leaving others intact. The fifths along the diatonic notes (white keys) are adjusted by a 1/4-comma, while the fifths among the accidentals (black keys) are left just. (Werckmeister proposed several tuning systems; this one is known as Werckmeister Temperament III.)

In a well-tempered tuning, all keys are playable, but each has a unique character.

Equal Temperament

By picking a semitone interval to be \(2^{1/12}\) (the equal-tempered semitone), every pitch is evenly spaced, thus all keys sound the same. Each interval slightly disagrees with its harmonic version, but the discrepancy is consistent.

To compare other tuning systems with equal temperament, one can use cents. One cent is defined to be the ratio \(2^{1/1200}\). For example, saying one frequency is 1200 cents higher than another is the same as saying it is an octave higher. An equal-tempered semitone is 100 cents.
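Since one cent is the ratio \(2^{1/1200}\), a ratio \(r\) measures \(1200 \log_2 r\) cents. The sketch below applies this to some intervals mentioned above; the function name `cents` and the selection of intervals are illustrative.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200 * math.log2(ratio)

intervals = {
    "octave": 2,
    "equal-tempered semitone": 2 ** (1 / 12),
    "perfect fifth": 3 / 2,
    "equal-tempered fifth": 2 ** (7 / 12),
    "harmonic major third": 5 / 4,
    "Pythagorean major third": 81 / 64,
    "equal-tempered major third": 2 ** (4 / 12),
    "syntonic comma": 81 / 80,
    "Pythagorean comma": 3 ** 12 / 2 ** 19,
}

for name, ratio in intervals.items():
    print(f"{name}: {cents(ratio):.2f} cents")
```

In these units the perfect fifth is about 702 cents against 700 for the equal-tempered fifth, while the harmonic major third is about 386 cents against 400, which is why equal-tempered thirds deviate more audibly than fifths.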

References

F. Richard Moore, Elements of Computer Music, Appendix C
Margo Schulter, Pythagorean Tuning and Medieval Polyphony
Peter A. Frazer, The Development of Musical Tuning Systems

Ben Lynn blynn@cs.stanford.edu