Shota
Old Member
Here is an article on the method of pitch determination using a computer:
http://www.cs.hmc.edu/~kperdue/MusicalDSP.html
The most interesting part for me is this paragraph:
Our goal is to find the frequencies of the constituent sinusoids of the musical tone, because this relates to how its pitch is perceived. For the FFT to be useful, we have to have the FFT operate on a long enough time (T) so that it can distinguish between an instrument's lower pitches. The frequency resolution is the inverse of T, due to the fact that in order to resolve a frequency accurately, enough time has to pass to complete one full cycle at that frequency. For example, if we choose a T of 0.03 seconds, we have a frequency resolution of only 1/0.03 or 33.3 Hz. That's about the difference between middle C and the D above it. This means if we were using this data, for that 0.03 seconds we couldn't reliably tell if a C4 or a C#4 or a D4 was being played. Lengthening T to 0.5 seconds gives us a frequency resolution of 1/0.5 or 2 Hz, which is less than 1/8 the distance from middle C to the C# above it. However, we've sacrificed time resolution to attain this accuracy. With this T, we can distinguish only two events per second, which is too slow for our pitch detection, as the shortest notes in jazz are about 0.094 seconds long.
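The arithmetic in the quoted paragraph can be checked with a few lines of Python. This is only a sketch: it assumes equal temperament with A4 = 440 Hz (the article does not state its tuning here), and the function names are mine, not the article's.

```python
A4 = 440.0  # assumed reference tuning; not stated in the quoted paragraph


def note_freq(semitones_from_a4: int) -> float:
    """Equal-tempered frequency a given number of semitones away from A4."""
    return A4 * 2.0 ** (semitones_from_a4 / 12.0)


def freq_resolution(window_seconds: float) -> float:
    """Frequency resolution of an analysis window of length T, taken as 1/T."""
    return 1.0 / window_seconds


c4 = note_freq(-9)        # middle C
c_sharp4 = note_freq(-8)  # C# above middle C
d4 = note_freq(-7)        # D above middle C

for t in (0.03, 0.5):
    print(f"T = {t} s: resolution = {freq_resolution(t):.1f} Hz")

print(f"C4 to D4:  {d4 - c4:.2f} Hz")
print(f"C4 to C#4: {c_sharp4 - c4:.2f} Hz")
```

Under these assumptions the C4 to D4 gap comes out at about 32 Hz, close to the 33.3 Hz resolution of a 0.03 s window, and the C4 to C#4 gap is about 15.6 Hz, so 2 Hz is roughly one eighth of it.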
I think this has to be taken into account when presenting results from Melodos, Audacity or any other software, or at least it deserves a bit more thought. Everything depends on the note length and the absolute pitch, of course, but is a reported difference between, say, 7.5 and 7.75 moria really meaningful?
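To put the question in numbers, here is a rough sketch that converts the article's 1/T frequency step into moria at a given pitch. It assumes an octave of 72 moria (other divisions exist) and applies the 1/T rule of thumb literally. The software mentioned above may estimate pitch more finely than that, so this is a pessimistic estimate under stated assumptions, not a description of how those programs work.

```python
import math

MORIA_PER_OCTAVE = 72  # assumption: 72-moria octave; other divisions exist


def interval_in_moria(f_low: float, f_high: float) -> float:
    """Size of the interval between two frequencies, in moria."""
    return MORIA_PER_OCTAVE * math.log2(f_high / f_low)


def resolution_in_moria(pitch_hz: float, window_seconds: float) -> float:
    """Size, in moria, of one 1/T frequency step above pitch_hz."""
    step_hz = 1.0 / window_seconds
    return interval_in_moria(pitch_hz, pitch_hz + step_hz)


# Hypothetical example values: a pitch near middle C, and the two window
# lengths used in the quoted paragraph.
for t in (0.03, 0.5):
    print(f"T = {t} s at 261.63 Hz: {resolution_in_moria(261.63, t):.2f} moria")
```

With these assumptions, even the 0.5 s window gives a step of roughly 0.8 moria near middle C, which is larger than the 0.25 moria separating 7.5 from 7.75. At higher pitches the same step in Hz is a smaller interval, so the picture improves as the pitch rises.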