Automated Music Transcription System (my Honours project)

Started by bvanoudtshoorn, October 23, 2008, 13:16:40


bvanoudtshoorn

Today I handed in my Honours project (in Computer Science at the University of Western Australia). My topic was "Investigating the feasibility of developing a near real-time system for music transcription on mobile devices". In doing this project, I developed a desktop Java application which transcribes music from an audio signal into XML -- which is now available for free download!

Read more about handing it in on my blog.

Download and find out more about the system at my wobsite.

You can also download and read my thesis, which contains a lot of information about audio signal processing in this context, lots of pretty diagrams (more than are on the website), and lots of other exciting goodies. Well, perhaps not exciting. And perhaps not goodies. But it contains stuff! Interesting stuff! :D

Anyway, if you've got a moment or two to spare, why not go and check it all out? There's a whole stack of information up there now.


And while you're at it, you can take a look at my gallery of programming sketches, as realised in the Processing environment.

Sam_Zen

Very nice project, Barry. I hope you get the honours for it.

I've only skimmed the dissertation PDF so far, but the text is very much to the point and well written.

Quote: "It is interesting to note that there is little literature which concerns itself with any tuning other than Western; indeed, most authors go so far as to unquestioningly accept equal temperament as the only tuning worth considering."
Interesting indeed.
The only variations I know of (disregarding quarter-tones for the moment) are the Indian raga scales,
which differ between the ascending and the descending scale. But I still don't have a clue about how that works in practice.

I do know of another octave division, which I would call 'mathematical' or 'natural'.
I can realize this with my analog synth, which has a 'sync' function that forces osc 2 to 'follow' the frequency of osc 1.
The strength of this force can be regulated with a pot. If the sync force is 100%, then f2 = f1 (still with its own waveform).
If the force is reduced then, at some stage, osc 2 can no longer produce f1 and switches to 1/2*f1,
so an octave lower. (That is if osc 2 < osc 1; otherwise the same process would apply, switching to 2*f1.)
If the force is reduced further, a note is produced halfway between 1/2*f1 and f1,
so (around!) the fifth, or quint; I'm not sure what it should be called.
And so on. Finer tuning of the sync force delivers more frequencies, each halfway between a higher and a lower one.
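
As a rough sketch of the arithmetic described above (not a model of the synth's circuitry), the following Python snippet starts from f1/2 and f1 and repeatedly inserts the frequency halfway between each neighbouring pair. The function name `midpoint_subdivide` and the 440 Hz starting value are assumptions made for illustration only.

```python
def midpoint_subdivide(f1, rounds):
    """Return sorted frequencies from f1/2 up to f1, inserting the
    arithmetic midpoint between every pair of neighbours `rounds` times."""
    freqs = [f1 / 2, f1]
    for _ in range(rounds):
        refined = []
        for low, high in zip(freqs, freqs[1:]):
            refined.append(low)
            refined.append((low + high) / 2)  # halfway in Hz, not in pitch
        refined.append(freqs[-1])
        freqs = refined
    return freqs


# Assumed example: f1 = 440 Hz, two rounds of subdivision.
print(midpoint_subdivide(440.0, 2))
```

After one round the only new note is 3/4 of f1, which sits at a ratio of 3/2 above the lower octave, i.e. a just fifth. Each further round divides the octave into equal steps in Hz rather than equal steps in pitch.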

Quote: "Converting a piece of audio into an abstract form (such as notation) is a multi-stage process."
Choosing a multi-stage process can often be much more efficient than attempting a single-step solution.
This is about Fourier analysis, and I would like to point to one possible step in such a multi-stage process.
The company that made my synth also released an app called 'Frequency to Voltage Converter',
with a fluctuating DC signal as output. This is much easier to handle and digitize than an audio signal.
0.618033988

uncloned

Barry - there is literature on other tunings from other cultures.

http://xenharmonic.wikispaces.com/MicrotonalTheory

is a wiki devoted to the endeavor of alternate tunings.

There are papers regularly being published on the microtonal mail list.
Of course for some of it you need to read Turkish or Persian...


Congrats on completing your paper!!

Now to go read....

bvanoudtshoorn

@Sam_Zen:
Thanks for reading and commenting, S_Z, and thanks for the compliment about my writing style, too. :D Regarding tuning systems, there are quite a few Western twelve-tone tunings around which just aren't really used any more, like Pythagorean tuning and Just Intonation. And of course, as you mention, there are the various other systems scattered across the globe.

Using your synth in that way to achieve an alternate tuning mechanism is very interesting, and kinda cool! It'd be interesting to see how it differs from other tuning schemes.

Yes, the multi-stage aspect does make the system more efficient, I think, and it also lets me swap components in and out. So I can, for example, decide to perform analysis using technique x instead of technique y, or write an outputter that produces, say, colours instead of XML.
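
To illustrate the idea of swappable stages, here is a minimal Python sketch. The actual system is a Java application whose interfaces are not shown in this thread, so every name below (`transcribe`, `dummy_analyser`, `xml_outputter`, `colour_outputter`) is hypothetical, and the analyser is a placeholder rather than real pitch detection.

```python
from typing import Callable, List

# Hypothetical stage types: an analyser turns samples into note names,
# an outputter turns note names into some textual form.
Analyser = Callable[[List[float]], List[str]]
Outputter = Callable[[List[str]], str]


def transcribe(samples: List[float], analyse: Analyser, output: Outputter) -> str:
    """Run the two stages in order; either stage can be swapped out."""
    return output(analyse(samples))


def dummy_analyser(samples: List[float]) -> List[str]:
    # Placeholder, not real analysis: emit one "A4" per positive sample.
    return ["A4" for s in samples if s > 0]


def xml_outputter(notes: List[str]) -> str:
    return "<notes>" + "".join(f"<note>{n}</note>" for n in notes) + "</notes>"


def colour_outputter(notes: List[str]) -> str:
    # Arbitrary illustrative mapping from note name to a colour name.
    palette = {"A4": "red"}
    return ",".join(palette.get(n, "grey") for n in notes)


samples = [0.1, -0.2, 0.3]
print(transcribe(samples, dummy_analyser, xml_outputter))
print(transcribe(samples, dummy_analyser, colour_outputter))
```

Because `transcribe` only depends on the shape of each stage, replacing "technique x" with "technique y" or XML with colours means passing a different function, with no change to the rest of the chain.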


@Clones:
Thanks for the link, clones - very interesting stuff there. (S_Z, they provide an overview of Just Intonation there.) And thanks, clones - I'm glad it's finally over. Almost. :)

uncloned

Barry,

I do know enough to follow your paper but not enough to make any intelligent comment beyond - hey! I'm impressed!

A very nice piece of work and it is mighty good of you to publish it with the detailed explanation on your page.

bvanoudtshoorn

Thanks clones! I'm glad that you enjoyed it. :) I think that putting all the info up on my website has actually been a good way for me to figure out what I'm going to do in the presentation -- I've forced myself to explain the whole thing in short paragraphs.

Sam_Zen

Yep, wiki is a nice overview.
Quote: "It'd be interesting to see how it differs from other tuning schemes."
I thought about that. My guess is that it could lead, e.g., to an octave scale divided into 16 notes,
because every new note produced by fine-tuning is exactly halfway between the nearest higher and lower notes.
I found a chart with the tempered frequencies:



If A at 440 Hz is the base, the synth would play the E at 330 Hz, not 329.628 Hz.
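
The two figures can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper names `equal_tempered` and `cents` are made up for the example, and the synth's E is assumed to be exactly 3/4 of the 440 Hz reference.

```python
import math

A4 = 440.0  # reference pitch in Hz, as in the post


def equal_tempered(semitones_from_a4, reference=A4):
    """Frequency of a note a given number of 12-TET semitones from A4."""
    return reference * 2 ** (semitones_from_a4 / 12)


def cents(f_a, f_b):
    """Interval from f_b up to f_a in cents (100 cents = 1 semitone)."""
    return 1200 * math.log2(f_a / f_b)


e4_tempered = equal_tempered(-5)  # E4 is five semitones below A4
e4_synth = A4 * 3 / 4             # halfway between 220 Hz and 440 Hz

print(round(e4_tempered, 3))                    # tempered E4 in Hz
print(e4_synth)                                 # midpoint E in Hz
print(round(cents(e4_synth, e4_tempered), 2))   # difference in cents
```

The gap comes out at just under 2 cents, which is the usual difference between a just fifth (3/2) and an equal-tempered one.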
To demonstrate the result, I placed a track in the DL section, Introvert 4, where I'm playing with this.
Sometimes I also use what I call 'binary scales', but that's another story.

uncloned

I think the microtonal mailing list people would consider that a sort of "adaptive just intonation"


I'm toying around with the idea of implementing a way to play in a scale derived from the harmonic series, one that would not be tempered into an octave, so there would be no pure doubling at the octave and each register would be different rather than just a duplication.

This is not an original idea, but it is one I want to try to realize without resorting to composing in Csound by coding music in frequencies. But if it takes that, I will....

Sam_Zen

Of course, according to physical laws, not musical theory, an octave is a doubling of the frequency.
But what if this note is supposed to be at the center of another octave around it?

uncloned

The difference comes from our hearing being logarithmic and the harmonic series being linear.
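
A small Python sketch makes the contrast concrete: successive harmonics are equally spaced in Hz, but the intervals between them shrink when measured in cents, the logarithmic unit that matches how pitch is heard. The 110 Hz fundamental is an arbitrary value assumed for the example.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents = one octave)."""
    return 1200 * math.log2(ratio)


fundamental = 110.0  # assumed example value in Hz
harmonics = [fundamental * n for n in range(1, 9)]

# Step between neighbouring harmonics, measured linearly and logarithmically.
hz_steps = [b - a for a, b in zip(harmonics, harmonics[1:])]
cent_steps = [cents(b / a) for a, b in zip(harmonics, harmonics[1:])]

for n, (hz, ct) in enumerate(zip(hz_steps, cent_steps), start=1):
    print(f"harmonic {n} -> {n + 1}: {hz:.0f} Hz, {ct:.1f} cents")
```

The Hz column stays constant while the cents column falls from a full octave (1200 cents) to ever smaller intervals, which is why a scale built directly on the harmonic series does not repeat from one register to the next.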

And the octaves the ear perceives do not need to be a doubling of frequencies under the right conditions.

Check out this site

http://eceserv0.ece.wisc.edu/~sethares/consemi.html

There are also links to a Java application that lets you explore the relationship, if you'd like.

uncloned

Quote from: "Sam_Zen"
Of course, according to physical laws, not by musical theory, an octave is doubling the frequency.
But what, if this note is supposed to be at the center of another octave around it ?

not sure I understand your point.

uncloned

for Barry

lots of Bach in the best guess of his actual tuning (not 12-tet)

http://www.larips.com/

background and audio and video

Sam_Zen

@uncloned:
I mean the calculation of E in C - E - C', or the calculation of C' in E - C' - E'. Just guessing, though.

uncloned

let me try to understand

the tritone, c f# c -> f# is the middle of an octave I think
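
Which note counts as "the middle" depends on whether the midpoint is taken in Hz or in pitch. The Python sketch below compares the two; the 261.626 Hz value for middle C is an approximate equal-tempered figure assumed for the example, and the variable names are illustrative.

```python
import math

C4 = 261.626   # approximate equal-tempered middle C in Hz (assumed reference)
C5 = 2 * C4    # one octave above

arithmetic_mid = (C4 + C5) / 2      # halfway in Hz
geometric_mid = math.sqrt(C4 * C5)  # halfway in pitch (equal ratios either side)


def semitones_above(freq, base=C4):
    """Distance from base up to freq in 12-TET semitones."""
    return 12 * math.log2(freq / base)


print(round(semitones_above(geometric_mid), 3))   # midpoint in pitch
print(round(semitones_above(arithmetic_mid), 3))  # midpoint in Hz
```

The midpoint in pitch lies six semitones above C, which is F#, the tritone. The midpoint in Hz lies at a ratio of 3/2, about seven semitones up, which is the fifth (G) rather than F#.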

Sam_Zen

f# is the middle? I guess you're right, my fault. Even so, nevertheless.