This page contains a little something I call my High School Senior Project. Enjoy!

Why Twelve Tones?

An examination of the development and continued use of the twelve-tone scale in the context of historical and mathematical factors.

by Robert J Coolman
First, a thank you to Ron Black and Ron Bertucci for all their years of musical expertise.
Thanks to Steve Owen and Robert Hurwitz for enduring painfully long and inquisitive interviews.
And last but not least, a thank you to Igor Gladstone for guiding me through the knowledge it took to construct this project.

Introduction

Music is an art. Not only does it have the ability to convey emotion and provide a place of refuge, but it can express an artist's mind: a world we can neither touch nor see.

Alternatively, music can be seen as nothing more than a sequence of sounds made to sound good. Although this argument has truth --indeed, music is supposed to sound "good"-- it fails to acknowledge that music has many qualities beyond aural gratification.

Yet, what is it that allows music to sound pleasing? Certainly, not all noises sound nice. A pleasing sequence must therefore possess certain qualities, not found in other noises, which allow it to sound good. To ensure the production of music that is reproducible and consistently consonant, some system to identify pleasing sequences of noises must be developed. Today there are a variety of systems in use, but the Western style of music has agreed upon a system that uses twelve tones. Why is this?

Because there is more than one system in use, it would be simply wrong to argue that one is the "best." But still, given the success of this system, there exists the argument that we have arrived at this system for reasons other than arbitrary conventions and continued tradition. Why has the Western style of music agreed upon a system consisting of twelve tones?

Other systems also succeed in sounding good to the ears of other peoples of the world. Therefore, some additional element besides the need to sound good must have been sought after, inevitably bringing musicians to a system of twelve tones. This paper will attempt to explain what that element is.

I will explain the means by which we have arrived at this scale through the mathematical analysis of the scale's development. Further support to my thesis will be provided by mathematical theory.

History of Development

The ancient Greeks recorded the first surviving mathematical investigations in the area surrounding the pitches of sound. In the sixth century BCE, Pythagoras of Samos observed and recorded a connection between mathematics and pleasant musical intervals.(R12)

Pythagoras and the Hammers

The commonly accepted account of how Pythagoras discovered the consonant intervals was given by a Roman named Anicius Manlius Severinus Boethius (480-524 CE) in De Institutione Musica. It should be noted that this story is apocryphal. Nearly a millennium had passed between Pythagoras and Boethius, and the Romans had conquered the Greeks, altering their remaining histories. The legend, as Boethius describes it, is paraphrased as follows:

One day while walking by a smithy, Pythagoras heard a melodic series of tones coming from the blows of four smiths. He entered, and after much consideration, he decided the differences in pitch between the workers’ sounds were due to the force of the blows. To support his conjecture, he ordered the men to exchange hammers. Much to his surprise, he found the properties of these melodious sounds did not depend on the force exerted by the men. Instead, a specific pitch seemed to be inherently constant within each hammer.(R4)

Further investigation led Pythagoras to believe the weights of each hammer determined the tone it would produce. Weighing each hammer, he found the weights of all four hammers formed small integer ratios when compared with the heaviest one: 1/1, 1/2, 1/3, and 1/4.(R2) This knowledge of ratios may have led him to later experiment with string lengths.

Strings

Although length, linear density and tension all have a profound effect on the pitch a string produces(R5), by preparing two identical strings, Pythagoras was able to isolate the single variable of length. Plucking the strings together produced two identical pitches, or unison. By fretting one string at its midpoint, and plucking it at the same time as the other string, he found that it produced the same interval between pitches as the 1/1 and 1/2 hammers, as well as that between 1/2 and 1/4. Aware of the consonance (pleasantness) produced by the 1/3 hammer, he fretted the string at a point one-third of the way from the end, and plucked the shorter section of string. This interval also corresponded to one heard earlier: the ratio between the 1/1 and 1/3 hammers. The longer section produced the interval between the 1/2 and 1/3 hammers.(R12)

By now Pythagoras may have been able to hypothesize that the weights of the hammers could be directly proportional to the lengths of strings, leading him to experiment with several fractional lengths of string played together with the original length. He would have determined that string ratios that can be expressed in the form $n/(n+1)$ (1/2, 2/3, 3/4, 4/5, etc.) sound progressively more dissonant as the value of n is increased; they have more and more inner tension. See Figure 1. Because intervals seemed to become more dissonant as the integers needed to express their ratios increased, this resulted in an almost strict adherence to small integer ratios. It seems unison in combination with the first three ratios of the sequence was enough to use as the basis for the first Western scale.(PC2)

The First Scale

It seems as though the 1/1, 3/4, 2/3 and 1/2 ratios were likable to the point that there was impetus to construct more tones based on them. It is speculated that more tones were produced by evenly spacing new ones between the ratios of 1/1 and 1/2, using 3/4 and 2/3 as landmarks.(PC3) These two landmarks are reasonably close together, so the distance between them provides a suitable increment by which the spaces between tones may be defined. The interval (quotient) between these two pitches is 8/9. Wanting to make a relatively evenly spaced "scale" between 1/1 and 1/2, one could have thought to take 8/9 of the remaining string for each tone. The problem with this was that only five of these increments can fit between 1/1 and 1/2, and a little is left over. This remainder was cut in half (square rooted) and called a "limma" or "half tone," leaving the 8/9 to be a "whole tone."(R10)
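The arithmetic in this paragraph can be checked with exact fractions. The following is only a minimal sketch; the variable names are labels chosen here for illustration, and all values are string-length ratios as in the text.

```python
from fractions import Fraction

# String-length ratios, as in the text (a shorter string gives a higher pitch).
fourth = Fraction(3, 4)
fifth = Fraction(2, 3)
octave = Fraction(1, 2)

# The whole tone is the quotient of the two landmark ratios.
whole_tone = fifth / fourth            # 8/9

# Count how many whole tones fit before the string becomes shorter than 1/2.
count = 0
length = Fraction(1)
while length * whole_tone >= octave:
    length *= whole_tone
    count += 1

# What is left over between the last whole tone and the octave.
remainder = octave / length
limma = Fraction(243, 256)             # candidate "half" of the remainder

print(whole_tone, count, remainder, limma ** 2 == remainder)
```

Five whole tones fit, and the remainder is exactly the square of 243/256, which is why two half tones of that size complete the octave.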

For the sake of explanation and the consistent use of modern notation, I will call "A" the first pitch of the scale, being the total string length and having a ratio of 1/1. 8/9 of that, or a whole tone up, is "B." A half tone up from B is "C." The scale continues with whole ("D"), whole ("E"), half ("F"), whole ("G"), whole (A). See Figure 2.

Notice that the 4th, 5th and 8th tones all correspond with the 3/4, 2/3 and 1/2 strings, respectively. This is where the terms "Perfect 4th" and "Perfect 5th" come from. Naturally, a "Perfect 5th above" is defined as "3/2 of the frequency."(R1) The term "octave" developed because of the eight tones.
When a section of a string is taken, the frequency is multiplied by the reciprocal of the fraction of the string that remains, where frequency is defined as the number of vibrations an object completes within a specific amount of time.
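To make the correspondence concrete, the scale can be built by multiplying the step ratios one after another. This is a sketch under the assumptions of the text (whole tone 8/9 and half tone 243/256 as string-length ratios, starting on "A"); the variable names are illustrative.

```python
from fractions import Fraction

WHOLE = Fraction(8, 9)      # whole tone, as a string-length ratio
HALF = Fraction(243, 256)   # limma (half tone)

names = ["A", "B", "C", "D", "E", "F", "G", "A"]
steps = [WHOLE, HALF, WHOLE, WHOLE, HALF, WHOLE, WHOLE]

# Multiply the step ratios cumulatively, starting from the full string (1/1).
lengths = [Fraction(1)]
for step in steps:
    lengths.append(lengths[-1] * step)

for name, length in zip(names, lengths):
    # The frequency ratio is the reciprocal of the string-length ratio.
    print(f"{name}: string {length}, frequency {1 / length}")
```

The 4th, 5th and 8th entries come out as 3/4, 2/3 and 1/2 of the string, and the 5th has a frequency ratio of 3/2.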

To say the above-described scale was the first is not exactly precise. It is more likely that the Greeks developed this scale as one in a set of seven, all created simultaneously. Within the boundary of a single octave of five whole tones and two half tones, there are 21 possible ways of arranging the half tones (two positions chosen out of seven steps). Seven of these possibilities are unique in that the half tones are separated by as many whole tones as possible.(PC3) Furthermore, these seven scales produce a Perfect 5th between almost all pairs of tones separated by 3 others, with the single exception that one pair forms a Tritone (diminished 5th). These "unique" seven were the first Western scales, and are now known as the seven "church modes."(PC4) (later seen in Figure 3)
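The count of arrangements can be checked by brute force. The sketch below assumes a scale of seven steps (five whole tones and two half tones) and treats the step pattern as a cycle, since the octave repeats; the function and variable names are illustrative.

```python
from itertools import combinations

STEPS = 7  # five whole tones and two half tones per octave

def whole_tones_between(i, j):
    """Whole tones separating half tones at steps i < j, both ways around the octave."""
    return (j - i - 1, STEPS - (j - i) - 1)

# Every way of choosing which two of the seven steps are half tones.
placements = list(combinations(range(STEPS), 2))

# Five whole tones are split between the two gaps, so the most even split is 2 and 3.
well_spaced = [p for p in placements if min(whole_tones_between(*p)) == 2]

print(len(placements), len(well_spaced))
```

Of all the placements, exactly seven keep the half tones as far apart as possible, one for each mode.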

Additionally, in accordance with the Greeks' fascination with the planets, some credit for these seven arrangements can be given to their ideal of the Music of the Spheres. "...The seven musical notes were assigned to the seven heavenly bodies... Kepler... noticed that the ratios between planets' extreme angular velocities were all harmonic intervals."(R9)

Not Seven, but Twelve

These seven "church modes" were the basis for all Western music up through the Middle Ages. They all sound different, physically and emotionally. Each one can be separately formulated by starting and finishing specifically on one of the seven tones.

Figure 3 (PC1)
w = whole tone, h = half tone

Mode Name          Starting Tone   Step Pattern
Aeolian (minor)    A               whwwhww
Locrian            B               hwwhwww
Ionian (Major)     C               wwhwwwh
Dorian             D               whwwwhw
Phrygian           E               hwwwhww
Lydian             F               wwwhwwh
Mixolydian         G               wwhwwhw

However, if one desired to start on a specific tone and play a mode other than the one assigned above, more tones would be needed. For example, in order to start on C and play G's step pattern (Mixolydian) a tone somewhere between A and B needs to be played instead of B. Five of these "accidental" tones are needed to play all seven kinds of modes starting on any tone. This quantity holds even if a scale is started on an accidental. Seven "natural" tones, plus five accidentals makes a total of twelve tones; seven white keys and five black keys. Hence, the first version of the twelve-tone scale.
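The need for accidentals can be illustrated by walking a step pattern from different starting tones. This sketch assumes the modern layout of twelve half-tone positions per octave (the result this section arrives at), numbered from C; the names `scale` and `needed` are illustrative.

```python
# Half-tone positions within one octave, with C as position 0 (an assumed numbering).
NAMES = ["C", "C#/Db", "D", "D#/Eb", "E", "F", "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]
NATURALS = {"C", "D", "E", "F", "G", "A", "B"}
STEP_SIZE = {"w": 2, "h": 1}

def scale(start, pattern):
    """Tone names reached by walking `pattern` upward from `start`."""
    position = NAMES.index(start)
    tones = [start]
    for step in pattern[:-1]:          # the last step only returns to the octave
        position = (position + STEP_SIZE[step]) % 12
        tones.append(NAMES[position])
    return tones

mixolydian = "wwhwwhw"                 # G's step pattern from Figure 3
print(scale("G", mixolydian))          # natural tones only
print(scale("C", mixolydian))          # needs the tone between A and B

# Accidentals needed to play every mode from every natural starting tone.
patterns = ["whwwhww", "hwwhwww", "wwhwwwh", "whwwwhw", "hwwwhww", "wwwhwwh", "wwhwwhw"]
needed = {t for s in NATURALS for p in patterns for t in scale(s, p)} - NATURALS
print(sorted(needed))
```

Starting the Mixolydian pattern on C replaces B with the tone between A and B, and across all modes five distinct accidentals appear.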

The above scale has a temperament called "Pythagorean Intonation," although it is not known whether Pythagoras had any involvement in its development.(R2) This form of temperament soon proved insufficient, reasons for which will be later explained.

Ideal Implementation of Twelve Tones

At this point, all fractional values will represent ratios between frequencies, and not string lengths. Thus, all fractional values will be reciprocated.

The twelve tones together form what is musically described as the "twelve-tone chromatic scale." Other theories explaining the formulation of this twelve-tone value exist; some of which will be discussed later. First, an understanding of different forms of the twelve-tone scale is necessary. They differ in temperament, or how the spacing between each tone is defined. The terms "intonation" and "temperament" are used synonymously.(R11)

Temperament

After the Greeks, several factors led to the need to define the scale differently. Although Pythagorean Intonation preserves the Perfect 4th and Perfect 5th between almost all tones, it's often impractical. There are no 3rds -- Major (5/4) or minor (6/5) -- so Major and minor sentiments cannot be expressed. Harmonies cannot function with more than two voices because the scale is designed to preserve only one consonant ratio at a time. An interval like 256/243 is difficult to tune and goes against the idea of small integer ratios. Most importantly, Pythagorean Intonation rarely tunes the accidentals in a stable fashion.

Accidentals are given one of two names, depending on which tone they replace in the scale. If an accidental is above the replaced tone, it's given the same name, only followed by "sharp" (#); if under, followed by "flat" (b). Going a limma (256/243 as a frequency ratio) up or down from the respective pitch closely approximates the needed tone, but "close" isn't good enough. To the trained ear, the pitch sounds noticeably wrong.

Just Intonation

Musicians have experimented with several systems, but it was found that, within a single key, a temperament based solely on the pure ratios sounded best. See Figure 4. Today we use a version known as "Just Intonation," based on the first five pure ratios (and unison).(R7) Just Intonation is designed to preserve the ratios of the Perfect 4th and Perfect 5th (as in Pythagorean Intonation) in addition to the Major 3rd (5/4) and minor 3rd (6/5). All other intervals of the twelve-tone chromatic scale are derived from dividing the octave and these four ratios by each other. See Figure 5. The value of a half tone is no longer constant, but that is not important to the purpose of the scale; the goal of Just Intonation is not to satisfy equal temperament, but to make the intervals with respect to the first tone sound as pure as possible.(R8) See Appendix A: Personal Research.


Equal Temperament

The Just scale works well, except when a player changes keys. Instruments, if tuned "Justly," can only be tuned to best play in a single key. Using an instrument tuned Justly to the key of C, a melody transposed to F Major sounds strangely dissonant. In the case of transposing to F# Major, the piece sounds awful. To avoid such clashes, songs would sometimes be written in only one key, or would sometimes take advantage of different tunings in different keys to change the feeling of the piece. A multitude of temperaments have been devised, each attempting to satisfy specific ratios in specific keys. Because of their quantity and the vast skill required to harness the highlights of each temperament, the overall system is very complicated. The solution to this problem is generally credited to Johann Sebastian Bach. Bach, a revolutionary composer of the 17th and 18th centuries, saw that any interval based on a rational ratio was going to become tuned differently whenever there was a key change, so he suggested the equal spacing of every tone. (PC3)

Bach most likely did not understand the true workings behind his proposal, but mathematically, the frequency of every tone became the product of 440 Hz and the nth power of the twelfth root of two: $f = 440 \cdot 2^{n/12}$ Hz, where n is the quantity of half tones the tone is above A4.(R3)

The number succeeding the note name denotes the octave. A4 is currently defined at a frequency of 440 Hz.
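As an illustration of the equal-temperament rule described above (440 Hz times the nth power of the twelfth root of two), here is a minimal sketch; the function name is an assumption made for this example.

```python
A4_HZ = 440.0

def equal_tempered_frequency(n):
    """Frequency of the tone n half tones above A4 (negative n means below)."""
    return A4_HZ * 2 ** (n / 12)

# One octave down, A4 itself, one half tone up, a 5th up, one octave up.
for n in (-12, 0, 1, 7, 12):
    print(f"n = {n:3d}: {equal_tempered_frequency(n):.2f} Hz")
```

Twelve half tones in either direction double or halve the frequency, so octaves stay pure while every other interval is built from the same step.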

In the equally-tempered scale, all tones are equally in (or out) of tune. Unfortunately, this means the Perfect 5th can no longer be used. Instead of the ideal $3/2 = 1.5$, the equally-tempered scale defines the 5th as $2^{7/12} \approx 1.4983$.(R3) The overall result still produces somewhat pleasing intervals, but is noticeably imperfect to a trained ear. Later, in the section on continued fractions, we will see how the use of the twelve-tone scale best minimizes this imperfection.
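The size of this imperfection can be computed directly. The sketch below compares the pure 5th with the equally-tempered 5th (seven half tones) and expresses the gap in cents, the logarithmic unit also used in Figure 10; the variable names are illustrative.

```python
import math

just_fifth = 3 / 2
tempered_fifth = 2 ** (7 / 12)   # seven equal half tones

# One cent is 1/1200 of an octave on a logarithmic scale.
difference_cents = 1200 * math.log2(just_fifth / tempered_fifth)

print(f"tempered 5th = {tempered_fifth:.5f}")
print(f"difference   = {difference_cents:.2f} cents")
```

The tempered 5th falls short of 3/2 by roughly two cents, or about a fiftieth of a half tone.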

Inherent Mathematical Pull?

Why are there twelve tones? Why this particular quantity? There is actually a lot of theory, beyond the history of the scale's development, as to why we should arrive at this quantity. The simplest argument, although very weak, is that twelve was alluring because of the rhythmic simplicities that can be applied to playing the scale; twelve notes can be easily divided into measures of three or four beats.(R13) Although true, this argument fails to recognize that wholly chromatic lines are rare in melodic phrases. Even when they occur, it is doubtful the notes would have to be of equal length in order to communicate the feelings the composer is trying to convey.

Instead, physiologists relate the arrival at the twelve-tone system to an innate likability buried in the human condition: we like the Perfect 5th, by default. Music is not practical unless the octave is divided into smaller parts. Coming after the octave, the 3/2 ratio is the simplest consonant interval. (R13) Naturally, we're forced to use it as a basis for our scale. It seems the need to preserve this single tone interval as best as possible drove us to a system of twelve tones.

The Circle of 5ths

The most common suggestion of the link between the 3/2 ratio and the twelve-tone scale is known as the "Circle of 5ths": A Perfect 5th up from C is G. A 5th up from that is D, then A, then E, then B... See Figure 6.

Eventually, after twelve repetitions, the original pitch is obtained. (R13) Most importantly, all twelve tones have been named.

This suggestion of linkage works best when a scale of equal temperament is assumed; otherwise there exists a slight mathematical error. If the process of increasing 5ths is repeated twelve times by using 3/2, the final tone is 1.36% sharper than the original (after coming back down seven octaves). By defining the 5th as $2^{7/12}$, as in the equally-tempered scale, the error disappears.
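The size of this error follows from stacking twelve 5ths and comparing the result with seven octaves, the nearest whole number of octaves. This is a small sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Twelve pure 5ths, brought back down by seven octaves.
error = Fraction(3, 2) ** 12 / Fraction(2) ** 7
print(error, float(error))

# With the equally-tempered 5th, twelve steps span seven octaves
# (up to floating-point rounding).
tempered = (2 ** (7 / 12)) ** 12 / 2 ** 7
print(tempered)
```

The exact ratio is 531441/524288, about 1.0136, which is the 1.36% sharpness mentioned in the text.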

Because of this error, this suggestion does not directly support the link of twelve tones to 3/2. Its best support comes from the fact that the margin of error is small, but to an experienced musician, this margin of error is not negligible. We will later see how the difference between the rational and irrational values of the 5th -- 3/2 and $2^{7/12}$ -- is as small as possible when assuming a scale containing a reasonable quantity of tones.

Ratios & Corresponding Tones

A far superior suggestion of the link between the 3/2 ratio and the twelve-tone scale requires more mathematical analysis. It assumes the use of an equally-tempered scale, but is still based on the likability of the 3/2 ratio.

Assuming any reasonably spaced scale, each time pitch is increased, the change in frequency becomes progressively larger. In the case of an equally-tempered scale, the frequency of each tone is increased by the repeated multiplication of a constant factor ($2^{1/12}$ for twelve tones) in order to form what sounds like equally-spaced increments. Pitch and frequency are therefore logarithmically related.

One note does not a scale make. Other factors must be incorporated or else the scale has no dimensions. The simplest point to which the origin of 1/1 can connect is 2/1. This could also be thought of as the connection between the first two consonant ratios. Although this in itself is technically a scale -- one that is equally tempered -- it is hopelessly impractical because it consists of nothing but octaves; a one tone scale.

To create an equally tempered scale of a greater quantity of tones, we must choose some primary interval p that equals 2/1 (one octave) when multiplied by itself a number of times equal to the quantity of tones in the scale. All tones become powers of p. The values in the below example are completely arbitrary, but illustrate the process of corresponding tones to ratios in an equally tempered scale.

Suppose we were to determine which tone of a 17 tone scale is closest to an interval of 11/6. We start by simply calling the tone closest to 11/6 'x.' A scale with 17 tones per octave means primary interval p satisfies $p^{17} = 2$, or $p = 2^{1/17}$. For 11/6 to be the xth tone of the scale, $p^x = 11/6$. We substitute in the value of p and solve for x: $2^{x/17} = 11/6$, so $x = 17 \log_2(11/6) \approx 14.87$.

Thus, for a 17 tone scale, the ratio of 11/6 is closest to the 15th tone above 1/1.
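The same calculation can be written as a short function. This is a sketch of the method described above, with an illustrative function name; it solves for x with a base-2 logarithm and rounds to the nearest tone.

```python
import math

def closest_tone(ratio, tones_per_octave):
    """Solve p**x = ratio for p = 2**(1/tones_per_octave); return x and the nearest tone."""
    x = tones_per_octave * math.log2(ratio)
    return x, round(x)

x, tone = closest_tone(11 / 6, 17)
print(f"x = {x:.2f}, closest tone = {tone}")
```

Calling `closest_tone(3 / 2, 12)` in the same way gives a value very close to 7, the 5th of the twelve-tone scale.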

What about the quantity of tones that, when equally tempered, would best approximate the 3/2 ratio? Undoubtedly, it would be a scale with a quantity of tones approaching infinity, completely undermining the entire prospect of dividing up the octave.

Conversely, the number of tones does not necessarily quantify how closely approximated the 3/2 ratio is, as demonstrated by the fact that a scale with five tones more closely approximates the ratio of 3/2 than one with six.
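The comparison between five and six tones can be reproduced by measuring, for each scale size, how far the nearest tone lies from 3/2. The sketch below reports that distance in cents; the function name is illustrative.

```python
import math

TARGET = math.log2(3 / 2)   # the 5th as a fraction of an octave

def fifth_error_cents(tones):
    """Distance in cents between 3/2 and the nearest tone of an equal scale."""
    nearest = round(tones * TARGET)
    return abs(nearest / tones - TARGET) * 1200

for tones in range(1, 13):
    print(tones, round(fifth_error_cents(tones), 1))
```

The error does not shrink steadily as tones are added: five tones miss 3/2 by about 18 cents, six tones by about 98 cents, and twelve tones by about 2 cents.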

Identifying how closely a scale approximates 3/2 is not a problem; analyzing all scales with quantities between one and, say, 500 is. Is there a way to identify a set of scales that best approximate 3/2 without individually applying calculations to each one?

Continued Fractions

Yes. First we apply the above method to the best approximating scales of 3/2 by employing this approximation: $2^{x/y} \approx 3/2$, or equivalently $x/y \approx \log_2(3/2)$. Just as before, x represents the tone of the scale closest to the 3/2 ratio, but now y is the quantity of tones in the scale. Ideally, both variables would be small integers, without their quotient deviating too much from the desired value of $\log_2(3/2) \approx 0.58496$. If y becomes too large, then it will become difficult to differentiate consecutive tones.

Best approximating values for x and y are best identified through a process known as continued fractions.(R13) This process can be applied to any real number, but is most commonly used to identify patterns in irrational numbers. It consists of repeatedly alternating between
1) separating (and recording) the digits to the left of a decimal and
2) reciprocating the ones to the right.
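After the process is applied, the number takes this form:

$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}}$$

Due to the cumbersome nature of the expression, it can be more conveniently written as:

$$[a_0; a_1, a_2, a_3, \ldots]$$

Applying the described process to our number, we obtain:

$$\log_2(3/2) = [0; 1, 1, 2, 2, 3, 1, \ldots]$$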

The accuracy of the approximation is controlled by the number of times this process is repeated. Thus, the recorded numbers can be used to derive a sequence of fractions that progressively more closely approximate the applied number. By increasing the accuracy by one term each time, the following sequence of best approximating scales is obtained:

$$\frac{1}{1},\ \frac{1}{2},\ \frac{3}{5},\ \frac{7}{12},\ \frac{24}{41},\ \frac{31}{53},\ \ldots$$

Each term of the sequence more closely approaches the 3/2 ratio at the specified tone, where the numerator is the respective tone of the equally tempered scale and the denominator is the quantity of tones.
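The two-step process described in this section (separate the whole part, reciprocate the rest) and the sequence of fractions derived from it can be sketched as follows. The function names are illustrative, and floating-point arithmetic limits this sketch to the first several terms.

```python
import math
from fractions import Fraction

def continued_fraction(value, terms):
    """Whole parts recorded by the separate-and-reciprocate process."""
    recorded = []
    for _ in range(terms):
        whole = math.floor(value)
        recorded.append(whole)
        remainder = value - whole
        if remainder == 0:
            break
        value = 1 / remainder
    return recorded

def convergents(recorded):
    """Fractions obtained by cutting the expansion off after each recorded term."""
    results = []
    for cut in range(1, len(recorded) + 1):
        fraction = Fraction(recorded[cut - 1])
        # Rebuild the nested fraction from the innermost term outward.
        for term in reversed(recorded[:cut - 1]):
            fraction = term + 1 / fraction
        results.append(fraction)
    return results

terms = continued_fraction(math.log2(3 / 2), 7)
print(terms)
print([str(c) for c in convergents(terms)])
```

In each resulting fraction, the denominator is a candidate quantity of tones and the numerator is the tone of that scale that lands nearest to 3/2.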

The first few elements give us scales with five, twelve and 41 tones between octaves. Notice the large skip between a scale with twelve tones and one with 41. Considering how small the differences between tones of a 41 tone scale are, it is doubtful this scale would come to practical use. Only the experienced and/or gifted could tell consecutive tones apart, as a matter of ear physiology. As such, twelve is the optimal quantity, for it has the closest approximation of the 3/2 ratio within a reasonable quantity of tones.

Interestingly, the element preceding 7/12 matches up with another popular system of tones: the five-tone or "pentatonic" scale. Its fourth* tone also closely approximates the 3/2 ratio. Its success as a commonly used system supports the idea that an innate liking for the 3/2 ratio can drive us to the system of music we use. Although this scale is not usually equally tempered, it still relates here through the five tone equivalent of the Just scale. *Although the fraction reads "3/5," this is really the 4th tone because 0/5 is the first tone.

Conclusion

Several lines of investigation point to the 3/2 ratio as the important factor in the development of the twelve-tone scale. The apocryphal pleasures of Pythagoras suggest historic roots to the importance of 3/2. The development of tempering, Equal and Just, centers around the scale's "fit" to the 3/2 ratio. A mathematical analysis of tempering shows the inherent pull of the 3/2 ratio as it is approximated by $2^{7/12}$. My own experience with the physics of trombone overtones points to the importance of the 3/2 ratio. (Appendix A)

Not only does the scale's development revolve primarily around the integration of this ratio into the construction of all tones, but ideal implementation of twelve tones tends to show that this quantity of tones is optimal for the preservation of 3/2, regardless of differences in temperament. Therefore, for all reasons discussed, the Western style of music has agreed upon a system of twelve tones because of the innate likability of the 3/2 ratio.

Appendices

Appendix A

Personal Research:

Musicians generally tend toward a scale built on the displayed pure ratios (Figure 4) when playing music(PC3), and in the development of the theory of music, such a scale becomes almost impossible to avoid. Some harmonies demand it, but even more demanding are the fundamental tendencies of many musical instruments.

I've created Figure 7 with my trombone and a program called Spectrum Analyzer. The graph is of a single pitch, Bb1, which has a frequency of 58.3 Hz. This tone is the lowest open note one can play on the trombone (also known as the trombone's fundamental).

Also present in this sample are integer multiples of this fundamental frequency. These are called harmonic overtones. Overtones are present in any sound, and are the factors by which the brain is able to differentiate two different types of sounds (also known as "timbre"). Even though the bassoon and the trombone are in the same range, their overtones differ, making them sound different.(R6)

More interestingly, the frequencies of each of these overtones can each be played individually, acting as their own fundamental for their own respective tone. Brass players know them as "open notes". This terminology came about by the use of valved instruments because all the valves remain open during these tones. They are controlled by the firmness of the lip and air speed. Overtones still occur in whole integer ratios of the given tone. Figure 8 and Figure 9 are for Bb2, and F2, which are the 2nd- and 3rd-lowest open notes a trombone player can play on the trombone, respectively.

There is no limit to the quantity of open notes a brass player can theoretically play, though it is rare (not to mention extremely difficult) for a trombonist to exceed the range of the first twelve. The fact that they occur in whole number multiples of the fundamental frequency means they can be related to the pure intervals talked about earlier:

See Figure 10.

Figure 10
Frequency (Hz) = n*55*2^(1/12)
Distance from A4 (cents) = 1200*log(n*2^(1/12)/8)/log(2)

Open note (n)   Note   Frequency (Hz)   Interval          Distance from A4 (cents)
1               Bb1    58.27047         Bb1/Bb1 = 1/1     -3500
2               Bb2    116.54094        Bb2/Bb1 = 2/1     -2300
3               F2     174.81141        F2/Bb2 = 3/2      -1598.04
4               Bb3    233.08188        Bb3/F2 = 4/3      -1100
5               D3     291.35235        D3/Bb3 = 5/4      -713.69
6               F3     349.62282        F3/D3 = 6/5       -398.04
7               Ab3    407.89329                          -131.17
8               Bb4    466.16376        Bb4/Bb3 = 2/1     100
9               C4     524.43423        C4/Bb4 = 9/8      303.91
10              D4     582.7047         D4/Bb4 = 5/4      486.31
11              E4     640.97517                          651.32
12              F4     699.24564        F4/Bb4 = 3/2      801.96
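The frequency and cents columns of Figure 10 follow from the two formulas given with the figure and can be regenerated with a short script. This is a sketch, with illustrative names, that assumes the fundamental is one equally-tempered half tone above 55 Hz and that A4 is 440 Hz.

```python
import math

FUNDAMENTAL_HZ = 55 * 2 ** (1 / 12)    # Bb1, one half tone above the A at 55 Hz
A4_HZ = 440.0

def open_note(n):
    """Frequency of the nth open note and its distance from A4 in cents."""
    frequency = n * FUNDAMENTAL_HZ
    cents = 1200 * math.log2(frequency / A4_HZ)
    return frequency, cents

for n in range(1, 13):
    frequency, cents = open_note(n)
    print(f"{n:2d}  {frequency:9.5f} Hz  {cents:9.2f} cents")
```

Open notes whose cents value is a multiple of 100 coincide with equally-tempered tones; the others (such as the 3rd, 5th and 7th) sit between them.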

It should be noted that these ratios are not at all "programmed" by craftsmen into the instrument, but are rather natural phenomena demanded by the laws of physics.

The use of the trombone, or any brass instrument, would tend to tune the 3rd tone of the Major scale to 5/4. Continued use of these instruments would invariably bring musicians to use some form of temperament that tunes the Major 3rd in this manner. One example is the Just scale.

Appendix B

Further support via the Perfect 4th:

This paper was primarily concerned with the ideal number of tones for a scale regarding the Perfect 5th, or 3/2 ratio. Thus, I did not deem it appropriate to incorporate information and support using other consonant ratios. In this appendix I will illustrate that twelve tones is not unique to the 3/2 ratio.

This appendix employs the process illustrated in the section on continued fractions, only without derivation graphics. That is, by now I assume the reader has a solid grasp of how they work.

Just as this process was applied to 3/2 (the Perfect 5th), I will do the same for 4/3 (the Perfect 4th):

$$\log_2(4/3) = [0; 2, 2, 2, 3, 1, \ldots]$$

As before: the accuracy of the approximation is controlled by the number of times this process is repeated. Thus, the recorded numbers can be used to derive a sequence of fractions that progressively more closely approximate the applied number. By increasing the accuracy by one term each time, the following sequence of best approximating scales is obtained:

$$\frac{1}{2},\ \frac{2}{5},\ \frac{5}{12},\ \frac{17}{41},\ \frac{22}{53},\ \ldots$$

Each term of the sequence more closely approaches the 4/3 ratio at the specified tone, where the numerator is the respective tone of the equally tempered scale and the denominator is the quantity of tones.

Similarly to the Perfect 5th, the first few elements give us scales with five, twelve and 41 tones between octaves. Again, 41 is a rather large number of tones to distinguish for an untrained ear, so twelve is just about right; not too many tones, and it approximates the 3/2 and 4/3 ratios best.

Unfortunately, twelve’s saga ends here at the Perfect 4th. Should we apply the process to the Major 3rd (or 5/4 ratio), we would find that the first few optimal scales contain three, 28 and 59 tones.

I've been asked “Is there a special reason why only the Perfect 4th and Perfect 5th yield twelve?” As far as I can tell, it’s because when using octaves (a ratio of 2/1) the Perfect 5th implies the Perfect 4th: An octave minus a Perfect 5th is a Perfect 4th.

This conjecture is supported by the fact that other 'octave sums' share optimal scales. For example, an octave minus a Major 3rd is a minor 6th. It can be shown that the first few optimal scales for a minor 6th contain three, 28, and 59 tones, just as before with the Major 3rd.
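The scale sizes quoted in this appendix can be reproduced by keeping only the denominators of the continued-fraction approximations. The sketch below uses the standard recurrence for those denominators; the function name is illustrative, and floating-point arithmetic limits it to the first few terms.

```python
import math

def scale_sizes(ratio, terms=5):
    """Denominators of the first continued-fraction approximations of log2(ratio)."""
    value = math.log2(ratio)
    k_prev, k = 1, 0               # seed values of the denominator recurrence
    sizes = []
    for _ in range(terms):
        whole = math.floor(value)
        k_prev, k = k, whole * k + k_prev
        sizes.append(k)
        value = 1 / (value - whole)   # assumes the remainder is never exactly zero
    return sizes

for name, ratio in [("Perfect 4th", 4 / 3), ("Major 3rd", 5 / 4), ("minor 6th", 8 / 5)]:
    print(name, scale_sizes(ratio))
```

The Perfect 4th yields five, twelve and 41 tones, while the Major 3rd and minor 6th both yield three, 28 and 59, in line with the 'octave sum' pairing described above.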

As demonstrated above, certainly not all consonant ratios yield the twelve tones I based my paper around, but I will still argue that twelve is the best because the most consonant (and therefore simplest) ratios tend to cause gravitation towards an equally tempered scale with twelve tones.