Posted by hordia on 26th August 2007
While working on audio-to-MIDI in NetworkEditor (CLAM), I needed to convert a fundamental frequency value to a MIDI note number.
I found some related source code in the Voice2MIDI app, but it was not explained at all, so, looking for the reasoning behind that formula, I arrived at this:
Knowing about the equal-tempered scale (check this) and the relation between the frequencies of adjacent semitones (a factor of $2^{1/12}$), plus the fact that C4 or “middle C” has a MIDI value of 60, it’s easy to conclude that A4 (whose frequency is 440 Hz, a tuning standard, and which is 9 semitones higher) has a MIDI value of 69.
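Then, starting with:

$$ f = 440 \cdot 2^{(m - 69)/12} $$

where $f$ is the fundamental frequency in Hz and $m$ is the MIDI note number, it’s easy to arrive at this:

$$ m = 69 + 12 \log_2\left(\frac{f}{440}\right) $$

and then, also taking into account this mathematical relation:

$$ \log_2 x = \frac{\ln x}{\ln 2} $$

the final formula looks like:

$$ m = 69 + \frac{12}{\ln 2} \ln\left(\frac{f}{440}\right) \approx 69 + 17.31234 \, \ln\left(\frac{f}{440}\right) $$

and the final C++ code looks like: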
fund_midinote = round( 69. + log(fundfrec/440.)*17.31234 );
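For experimenting outside CLAM, the one-liner can be wrapped in a small standalone function. This is only a sketch: the name frequencyToMidiNote is made up for illustration, and it assumes A4 = 440 Hz = MIDI note 69 in 12-tone equal temperament. It uses std::log2 directly, which is equivalent to multiplying the natural logarithm by 12/ln 2 ≈ 17.31234.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of CLAM): nearest MIDI note for a frequency in Hz.
// Assumes A4 = 440 Hz = MIDI note 69 and 12-tone equal temperament.
int frequencyToMidiNote(double frequencyHz)
{
    assert(frequencyHz > 0.0); // the logarithm is undefined for non-positive input

    // Distance from A4 in semitones; same value as 17.31234 * ln(f / 440).
    const double semitonesFromA4 = 12.0 * std::log2(frequencyHz / 440.0);

    // Round to the nearest note number.
    return static_cast<int>(std::lround(69.0 + semitonesFromA4));
}
```

For example, frequencyToMidiNote(440.0) gives 69 and frequencyToMidiNote(261.63) gives 60 (middle C).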
Related post: nictuku’s inverse formula (i.e. from MIDI to Hz) here “Translanting MIDI Notes to frequencies in the diatonic scale using the central A (440hz) as reference“.
Posted in audio, algorithms, programming, music theory, c++, midi, math, English, CLAM, standards, GSoC2007 | No Comments »
Posted by hordia on 6th August 2007
The morph effect (best known from the image domain) is about hybridizing two sounds so that the resulting one has intermediate characteristics. This implementation is based mainly on interpolation (of the peaks and the residual spectrum) and a balance (depending on the interpolation factor) of the fundamental.
All the code is mainly based on this idea:
, where alpha is the interpolation factor (bounded to 0..1 range).
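One common way to read an interpolation factor is as a linear crossfade between two values. Assuming that is what is meant here, a minimal sketch follows. The function names are made up for illustration, this is not the CLAM morph code, and the direction (alpha = 0 gives the first source, alpha = 1 the second) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not CLAM code: linear interpolation between two values.
// alpha = 0 returns a, alpha = 1 returns b (this direction is an assumption).
double interpolate(double a, double b, double alpha)
{
    alpha = std::max(0.0, std::min(1.0, alpha)); // keep alpha in the 0..1 range
    return (1.0 - alpha) * a + alpha * b;
}

// Applies the same interpolation element by element to two equally sized
// magnitude arrays (for example, matched peaks or residual spectrum bins).
std::vector<double> morphMagnitudes(const std::vector<double>& a,
                                    const std::vector<double>& b,
                                    double alpha)
{
    assert(a.size() == b.size());
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = interpolate(a[i], b[i], alpha);
    return out;
}
```

The same interpolate call could also be used to balance the two fundamentals.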
I still have to tweak it a bit… but anyway, I’ve made some demos of it:
Sources: Piano C5 and Oboe C5.
Demos: Take 1, Take 2
To hear the online/streaming version go here.
Samples were taken from Freepats / Iowa Musical Instruments Samples.
Posted in audio, algorithms, effects, signal processing, programming, English, CLAM, GSoC2007 | No Comments »
Posted by hordia on 31st July 2007
I’ll start by talking a bit about this effect, which is mainly used for vocal harmonizing. Given an input voice (or whatever), as output you obtain as many automatic, harmonically related voices as you want (a minor/major third, a fifth, a sixth or any musical interval you want).
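This implementation is based mainly on several SMS pitch shifts (one for each voice) and a gain control for each one. Pitch controls are expressed in equal-tempered scale semitones, following the relation $r = 2^{n/12}$ for each voice, where $n$ is the voice’s pitch offset in semitones and $r$ is the resulting frequency ratio.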
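As a worked example, assuming the relation in question is the usual equal-tempered one (the same one the code in this post applies with CLAM_pow( 2., amount/12. )), a shift of n semitones corresponds to a frequency factor of 2^{n/12}. The symbols r and n are only illustrative. For a few common intervals:

```latex
\begin{align*}
r(n)  &= 2^{n/12} \\
r(3)  &= 2^{3/12} \approx 1.1892 && \text{(minor third)} \\
r(4)  &= 2^{4/12} \approx 1.2599 && \text{(major third)} \\
r(7)  &= 2^{7/12} \approx 1.4983 && \text{(fifth)} \\
r(9)  &= 2^{9/12} \approx 1.6818 && \text{(major sixth)} \\
r(12) &= 2^{12/12} = 2 && \text{(octave)}
\end{align*}
```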
This was my first version of the network:
Testing it, my voice never sounded so musical, hehehe… but still awful, so, thinking of your ears’ health, the demos use Elvis’s voice instead.
Disclaimer: all audio demos are early testing versions (still with artifacts and clicks that should be removed soon)
Elvis harmonized demo: elvis-harmonized.ogg (to hear the online/streaming version go here)
Prototype:
Configuration:
Note: demos were done without residual processing because adding residual does not improve results much and adds a lot of overhead.
Then, following xamat’s suggestions, I also added a detuning effect (and a delay, but that one isn’t working properly yet).
Elvis harmonized (detuned version) demo: elvis-harmonized-detunned.ogg (to hear the online/streaming version go here)
But wait! A lot of graphics, and this is also a ‘coding’ blog! Here you have some code… and btw, you can see that programming with CLAM can be very easy once you get the basics…
bool SMSHarmonizer::Do( const SpectralPeakArray& inPeaks,
                        const Fundamental& inFund,
                        const Spectrum& inSpectrum,
                        SpectralPeakArray& outPeaks,
                        Fundamental& outFund,
                        Spectrum& outSpectrum
                      )
{
    // Start from the input voice, scaled by its own gain
    outPeaks = inPeaks;
    outFund = inFund;
    outSpectrum = inSpectrum;

    TData gain0 = mInputVoiceGain.GetLastValue();
    mSinusoidalGain.GetInControl("Gain").DoControl(gain0);
    mSinusoidalGain.Do(outPeaks,outPeaks);

    // Temporaries holding each pitch-shifted voice
    SpectralPeakArray mtmpPeaks;
    Fundamental mtmpFund;
    Spectrum mtmpSpectrum;

    for (int i=0; i < mVoicesPitch.Size(); i++)
    {
        TData gain = mVoicesGain[i].GetLastValue();
        if (gain<0.01) //means voice OFF
            continue;

        TData amount = mVoicesPitch[i].GetLastValue() + frand()*mVoicesDetuningAmount[i].GetLastValue(); //detuning
        amount = CLAM_pow( 2., amount/12. ); //adjust to equal-tempered scale semitones

        // Pitch-shift the input by the computed factor
        mPitchShift.GetInControl("PitchSteps").DoControl(amount);
        mPitchShift.Do( inPeaks,
                        inFund,
                        inSpectrum,
                        mtmpPeaks,
                        mtmpFund,
                        mtmpSpectrum);

        // Apply this voice's gain
        mSinusoidalGain.GetInControl("Gain").DoControl(gain);
        mSinusoidalGain.Do(mtmpPeaks,mtmpPeaks);

        // Optional per-voice delay
        TData delay = mVoicesDelay[i].GetLastValue();
        if (delay>0.)
        {
            mPeaksDelay.GetInControl("Delay Control").DoControl(delay);
            mPeaksDelay.Do(mtmpPeaks, mtmpPeaks);
        }

        // Mix the voice into the output
        outPeaks = outPeaks + mtmpPeaks;
        if (!mIgnoreResidual)
            mSpectrumAdder.Do(outSpectrum, mtmpSpectrum, outSpectrum);
    }
    return true;
}
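To play with the per-voice pitch logic without a CLAM build, the gating and detuning steps of the loop can be sketched with plain data. Everything here is an illustrative assumption: VoiceControls and activeVoiceFactors are made-up names, not CLAM types, and the random values are passed in by the caller because the range of frand() is not stated in this post.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical plain-data stand-in for the per-voice controls (not a CLAM type).
struct VoiceControls
{
    double gain;         // linear gain; below 0.01 the voice is treated as off
    double semitones;    // pitch offset in equal-tempered semitones
    double detuneAmount; // scale of the random detuning, in semitones
};

// Returns the pitch-shift factor of every active voice, in order.
// randomValues[i] plays the role of frand() for voice i; its range is an
// assumption left to the caller.
std::vector<double> activeVoiceFactors(const std::vector<VoiceControls>& voices,
                                       const std::vector<double>& randomValues)
{
    assert(voices.size() == randomValues.size());
    std::vector<double> factors;
    for (std::size_t i = 0; i < voices.size(); ++i)
    {
        if (voices[i].gain < 0.01) // voice off, skip it
            continue;
        // Semitone offset plus random detuning, then semitones -> frequency factor
        const double amount = voices[i].semitones
                            + randomValues[i] * voices[i].detuneAmount;
        factors.push_back(std::pow(2.0, amount / 12.0));
    }
    return factors;
}
```

With no detuning, a voice set to 12 semitones yields a factor of 2 (one octave up), and voices with a gain below 0.01 produce no entry at all.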
The plan includes adding MIDI control for each voice’s pitch (it will then be easy to control them, for example, from a keyboard played by the person singing).
Next post: SMSMorph.
Posted in audio, algorithms, effects, signal processing, programming, music theory, c++, midi, math, English, CLAM, GSoC2007, GUI | 1 Comment »
Posted by hordia on 30th July 2007
Starting today, I’ll try to blog a little more about my GSoC progress…
First of all, I added bounded limits to many transformations, a task that taught me a lot about the CLAM infrastructure (good suggestion, pau!). Then I got pitch discretization working in NetworkEditor (a new network) and built a prototype example (which taught me how to make GUI prototypes with QTDesigner). I also worked on some minor bug fixes and new features, such as setting a default value for InControls.

I also wrote a couple of unit tests (something very easy, but totally new to me). I have to say that testfarm and automatic testing are very cool features for this kind of development.
Testfarm looks like this:

I’ve also added a new network for hoarseness, which I think is very useful as a first approach to working with Sinusoidal+Residual models (SMS).
Anyway, most of my work was with SMS Harmonizer, but that is for a forthcoming post.
Posted in audio, effects, programming, noise, English, CLAM, GSoC2007, GUI | No Comments »
Posted by hordia on 28th July 2007
I was testing my new harmonizer network (with my mic open) when a new SMS arrived on my phone…

Funny, isn’t it? A perfect pulse…
btw: could any of you give a complete explanation of this kind of effect? I have also noticed the same thing with some speakers (of course, only by hearing the signal), and I think the wires are a big clue, because with my car speakers the effect only happens when my “cassette-to-mini-plug” adaptor is plugged in…
Update: check “SMS interference mystery solved” post.
Posted in hardware, English, CLAM, GSoC2007 | 1 Comment »
Posted by hordia on 12th July 2007
CLAM is a complete framework for research and development in audio and music (this also includes end-user applications). It offers a conceptual model and tools for the analysis, synthesis and processing of audio signals. It has a very friendly interface, is Free Software, is cross-platform and is written in C++ (many of its applications run in real time).
Although it originated at a university in Barcelona, Spain, Spanish-language documentation on this framework is scarce, so I decided to write a short introduction to the basics, with links (most of them in English, mind you) for anyone who wants to go further. I think it may help more than a few people get started.
Basically there are two profiles: end users (of the applications) and developers who want to write their own programs on top of this framework.
At the moment it consists of 4 main programs:
NetworkEditor:
An application that lets you connect modules into a processing network, in the style of pd (but much friendlier), MaxMSP or Reaktor (or, for those who use matlab, something like simulink). These networks run in real time and can use jack, portaudio, LADSPA or VST as a backend.
One of the most interesting features is that the network can be exported and then run with a graphical interface designed in QTDesigner (both programs export to XML, which is then run by the Prototyper application).
I recommend watching this presentation: “Visual prototyping of audio applications”
In other words, a user who is not a programmer can put together complex plugins or applications without writing a single line of code. It is also very useful for prototyping future applications or developments.
It is currently being integrated with LADSPA and lv2, and there are plans to further strengthen the ability to use external plugins (as just another module) inside NetworkEditor and, vice versa, to use these networks as plugins in other applications.
For anyone who wants to get started, I recommend this:
Annotator:
A program for making transcriptions, in the style of Sonic Visualizer. Very powerful, with features that make it unique.
To learn more:
SMSTools:
An audio signal analyzer in the style of wavesurfer that supports different kinds of visualization, such as spectrograms and everything derived from the Sinusoidal + Residual model, as well as complex transformations based on this model (gender change, pitch-shifting, morph, etc.) and much more (see the tutorial).
Voice2MIDI:
Converts voice to MIDI. It is discussed in this linuxjournal article.
For developers, it serves as an environment for building their own applications easily, or as a tool for prototyping their future implementations.
I recommend:
If you want, you can contribute to the project by sending code patches to the main developers, and even become a ‘developer’ after having sent several of them.
In short, it is a fairly large and ambitious project. There are even thousands of other developments built on it that are not in the ‘main package’, but I thought it would be useful to give a general overview, because it may be so wide-ranging that it is a bit dizzying for someone who is only just hearing about it.
As I said, it is cross-platform and is available for GNU/Linux (with packages for several distributions), Windows and Mac (see more and download).
Other interesting links:
Posted in audio, algorithms, effects, signal processing, music, free software, programming, GNU/Linux, GPL, c++, noise, libraries, midi, publications, projects, Castellano, CLAM, standards, speech, GSoC2007, GUI, library | No Comments »
Posted by hordia on 20th May 2007
Next Monday the Google Summer of Code finally starts; here is my accepted application:
Title: Real-time spectral transformations
Mentor: Pau Arumí Albó
License: GNU General Public License (GPL)
Abstract: Revamp all CLAM SMS transformations. Make real-time all those that still aren’t and get them working in Network Editor, for example: Harmonizer, Morph and Time Stretch. Make nice prototypes to use them with Prototyper, with a special focus on some of them. Also make Voice2Midi real-time, along with any widgets that may be needed.
Some more words about this:
- Port these spectral transformations to real-time and NetworkEditor: harmonizer, morph, time-stretch and pitch-discretization.
- Fix gender change (already real-time, residual improvement)
- Make a harmonizer prototype with sliders to control each “voice” gain and the option to control them by MIDI too.
- Real-time Voice2MIDI. Piano-roll widget for NetworkEditor/Prototyper.
- More general improvements to SMS transformations for issues that may arise during the development of the project.
Over the last few days I have mainly been reading about SMS transformations and their model, and the “Spectral Processing” chapter of the DAFX (Digital Audio Effects) book.
More news and details about this project here soon.
Posted in audio, signal processing, programming, GPL, books, c++, midi, English, CLAM, GSoC2007, GUI | No Comments »