Powerful New Vocal Remover AI - Instructions

[MENTION=39673]Anjok[/MENTION] Will we see the baseline updated to use commands, or a menu option in the GUI, for specific stems like drums, bass, guitars, keys, etc.? Yes, lower-end machines will take longer, but it'll be damn worth it.
 

I don't think this exact method will work for separate stems, as it's only used for instrumentals. In that case, use Demucs - it's by far the best tool to extract bass, drums, synth and vocals.
 

Can you point me there? I currently use RX7. But if it's better than that I'll give it a whirl.
 

I can make a guide on how to install it on Google Colab, since it won't require coding there. Otherwise, this is the GitHub page, with a few examples and comparisons: https://github.com/facebookresearch/demucs

It's my go-to way for DIY stems. The downside is that on some tracks it can ignore background vocals and fold them into the synth track rather than the vocals track, but it varies from song to song.
 
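For anyone who wants to try Demucs before a guide appears, here is a minimal sketch of building its command-line invocation from Python. The `-n` (model name) and `-o` (output directory) flags are taken from the Demucs README; the default model name here is an assumption, so check it against the version you install.

```python
# Sketch: build the argv for a Demucs separation run.
# Flag names follow the Demucs README; verify against your installed version.
import shlex

def demucs_command(track, model="htdemucs", out_dir="separated"):
    """Return the command list to split `track` into drums/bass/other/vocals."""
    return ["python3", "-m", "demucs", "-n", model, "-o", out_dir, track]

print(shlex.join(demucs_command("song.wav")))
```

Running that command drops the four stems into `separated/<model>/<track name>/`.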

I went to the GitHub to check it out and can probably get it going, but if you made a guide it would sure help. djtayz
 

I need the background vocals anyways. LOL.
 
Anjok... is there any way the AI has been used on a speaking part of a song?
Does it struggle with plain talking, as opposed to singing, due to the amount of reverb that some singers use?
 

Just speaking from my own experience here, I have found it handles speaking exceptionally well in general.
I've heard some fantastic results on rap/hip-hop tracks etc. The flip side is that when the instrumentation is bare, every tiny little missed detail stands out. It's easier to scrub in spectral editing since there's not much sound on the spectrum to dig through ... but it all sticks out.

But really, since it's trained on voice, so much depends on how well it recognizes a specific type of voice sound, and how much it mistakes certain instrumentation for voice. That's why having multiple models could potentially prove very useful. Reverb kinda fits into that category as well. This model is trained on different music than the primary AI over in the other thread. The primary one can't handle reverb nearly as well as this one.
 

I agree with this 100%. I just got my new PC parts today so I will be ready to start doing some tests this week! Regarding rkeanes' question, I put a dataset together consisting of trailer music with and without movie dialogue to see how well it learns to separate spoken word. I found a YouTube channel that has official instrumentals for trailers, so I'm hoping it pans out! I will probably include rap in that dataset as well because it isn't quite big enough.

I will be doing A LOT of experiments. This will take some time!
 

Totally looking forward to the results!

I just completed a split conversion you can all check out.
It's for Red Hot Chili Peppers - Higher Ground
The AI in this thread was more effective at converting the 'verses' with the reverb,
and the AI 2.0 from the other thread was more effective at the end part.
Plus some added hours in spectral editing and a couple of other cuts, and here are the results!
I included both conversions and my completed mix.

https://mega.nz/folder/IR8UxArT#b14akmLvfbkptUVrKpQYXw
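The mix above combines one model's verses with the other model's ending. As a rough illustration of that kind of splice done on raw sample arrays, with a short crossfade at the seam so the switch isn't audible, here is a sketch. All names are illustrative; neither tool exposes this function, and both renditions are assumed to be sample-aligned arrays of the same length.

```python
# Sketch: take rendition `a` up to a switch point, then rendition `b`,
# crossfading over `fade` samples so the seam doesn't click.
import numpy as np

def splice(a, b, switch, fade=1024):
    """Return `a[:switch]` followed by `b[switch:]`, crossfaded at the seam."""
    out = np.concatenate([a[:switch], b[switch:]]).astype(float)
    ramp = np.linspace(0.0, 1.0, fade)   # 0 -> 1 over the fade window
    seam = slice(switch - fade, switch)
    out[seam] = a[seam] * ramp[::-1] + b[seam] * ramp  # a fades out, b fades in
    return out
```

In practice you would load both conversions with the same sample rate, pick the switch point by ear, and export the result for spectral cleanup.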
 

The vocal was tied to the acoustic guitar, and it removed so much of the guitar that the result was unusable... not too worried about it, though.
Funny thing is that I had to reformat my laptop and can't get the AI to work again... lol
 
**UPDATE**

I've made a lot of traction on the GUI and should have it released by the end of the first week of June, along with an updated model that will be the best one I've made so far. On par with the one used to create the instrumentals in the other thread!
 

Thanks for all your hard work, Anjok. I'm really looking forward to this.
 
[MENTION=39673]Anjok[/MENTION] would it be possible to train a model to preserve backing vocals within allowed boundaries? I work with music on the karaoke side of things, and I sing and record, so backing vocals would be very helpful. I realize they can't be preserved in all songs due to how the mix is done, but your A.I. has really helped to make instrumentals from older tracks I thought would never be possible. It could essentially just be another command that filters and exports an additional track, but instead of an acapella it's the backing vocals.
 

That's a good question and it's something I would have to experiment with. I think one way I can potentially make this happen is to train the AI on a dataset consisting of only full mixes paired with their official TV track counterparts. To have a model as effective as the ones I've shared, it would have to consist of at least 200 pairs as well. The GUI that's being developed will have a drop-down of models to choose from, so you'll be able to toggle between a karaoke model and a full vocal removal model.
 

That would be absolutely fantastic. Some songs already leave backing vocals in, but again, it depends on the mix. As for the songs with talking parts someone mentioned earlier, I have one such old country classic I'd like to get devocaled: "Phantom 309." I may have to send it to you, since I bought a retail CD so you'd have the full frequency range of the track. It would really mean a lot to get it devocaled properly.
 
In need of some assistance... I had to reformat my laptop, then installed the AI again, but it won't run. Both commands just return to the prompt with no output:
PS C:\Users\rkean\Documents\vocal-removerV2> python inference.py --input bc.wav
PS C:\Users\rkean\Documents\vocal-removerV2> python inference.py --input bc.wav --gpu 0
PS C:\Users\rkean\Documents\vocal-removerV2>
Any thoughts?
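When inference.py returns to the prompt with no output after a reformat, a dependency that fails to import is a common culprit. Here is a small, hypothetical sanity check; the module list is a guess at what vocal-remover needs, so adjust it to match the project's requirements.txt.

```python
# Hypothetical post-reformat sanity check: confirm the packages the
# script imports are actually installed before blaming the script itself.
import importlib

def missing_modules(names=("torch", "librosa", "numpy", "soundfile")):
    """Return the subset of `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

print(missing_modules())  # an empty list means the basics are present
```

If anything shows up in the list, reinstalling with `pip install -r requirements.txt` from the project folder is the usual fix.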