What do you think of the new stem separation in Serato?

Pretty sure Serato’s marketing budget is a lot bigger than Algoriddim’s plus they have a way bigger user base. I’m definitely excited to see some competition pushing the advancement of STEMs though. It’s come a LONG way since Traktor first introduced it back in like 2015 or something.


STEMS are cool for about 10 minutes, but I’m willing to bet that 3 months from now 90% of us will barely be using them. We kinda already know this from how people are actually using STEMS in Traktor, djay Pro and VDJ. Seriously, almost nobody switches DJ platforms for STEMS?!?

BUT … introduce “Industrial Strength Library Management”, and I bet you see people switching DJ platforms like crazy.


Thinking about this some more, this might actually be what Serato is doing. It would also explain why they don’t allow fading stems in and out: that would complicate overlapping frequencies in different stems.

If I’m correct, they simply switch between different versions of pre-processed stems (e.g. drums + bass is in reality one additional optimized audio file), and the app switches to that file when you think it’s mixing two stems.

That could also explain their huge stem files on the hard drive & (beta) memory issues.
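If that theory is right, the bookkeeping is tiny but the file count grows fast, which would fit the huge disk usage. A minimal sketch, assuming four classic stems and one hypothetical pre-rendered file per combination (all names made up for illustration):

```python
from itertools import combinations

# Hypothetical sketch of the theory above: pre-render every non-empty
# combination of stems as its own optimized audio file, then simply
# switch files when the user toggles stems (no real-time separation).
STEMS = ("vocals", "melody", "bass", "drums")

def all_combinations(stems=STEMS):
    """Every non-empty subset of stems gets its own pre-rendered file."""
    combos = []
    for r in range(1, len(stems) + 1):
        combos.extend(combinations(stems, r))
    return combos

def file_for(active):
    """Map the set of active stems to its pre-rendered file name."""
    key = "+".join(s for s in STEMS if s in active)
    return f"track.{key}.pcm"

print(len(all_combinations()))       # 15 pre-rendered variants per track
print(file_for({"drums", "bass"}))   # track.bass+drums.pcm
```

Fifteen extra audio variants per track would go a long way toward explaining the disk and memory footprint mentioned above.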

I’ve been in the 10% for two years now. Depending on the genre you’re mixing, it’s more useful IMO.


Based on this video I’d rank STEMS sound quality:
#1 - Serato
#2 - djay Pro
#3 - VDJ


I’ve also been in the 10% since Algoriddim released this feature about 2 years ago. I use it a lot in combination with EQs in my mixing, to remove unwanted vocals and to layer an acapella over another track. Honestly, I can’t imagine DJing without it anymore.


Friends, I can honestly say that I’m seriously considering leaving DJAY for SRT. The SRT stems are significantly better; I almost can’t stomach the difference.

I love DJAY, but to be able to use stems in the real world we need better quality, and SRT has done it.


Well, from a software developer’s point of view: doing it in real time vs. preprocessed is a totally different thing.

When you are doing it in real time, you need to be sure there is enough processing power left for all the other stuff, i.e. you might cap the separation at, say, 50%.

But when you are preprocessing, you can fill up and use all the remaining processing power, up to 100%, even if you “might freeze” the machine.

So yeah… everything done with preprocessing will sound better, because you can spend several minutes calculating the result, whereas in real time you need the result now.
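The budget difference is easy to put in numbers. A back-of-envelope sketch, assuming a typical 512-sample audio buffer at 44.1 kHz and the 50% cap mentioned above (both values are assumptions, not anything Serato or Algoriddim have published):

```python
SAMPLE_RATE = 44100   # Hz, CD quality (assumption)
BUFFER_SIZE = 512     # samples per audio callback (typical, assumption)

# Real-time: each buffer must be fully separated before the next
# callback fires, so the hard deadline per buffer is its own duration.
deadline_ms = BUFFER_SIZE / SAMPLE_RATE * 1000
print(f"{deadline_ms:.1f} ms per buffer")    # ~11.6 ms

# If the rest of the app (UI, waveforms, FX) needs ~50% of the CPU,
# the separation model effectively gets only half of that window.
budget_ms = deadline_ms * 0.5
print(f"{budget_ms:.1f} ms compute budget")  # ~5.8 ms

# Offline/preprocessed: a whole track can take minutes to separate;
# the only "deadline" is the user's patience.
```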


Not sure who is calculating for several minutes. Is that supposed to explain the difference in sound quality?

The fact is that djay uses Spleeter whereas Serato uses their own stuff, no matter whether you do it in real time or not. All the apps built on Spleeter get more or less the same result, depending on the pre- and post-processing they use in addition.


Well, my point was that if you do the preprocessing like Serato, you can assume the user of the tool is willing to wait X amount of seconds to get the “best result”.

If you do it like Algoriddim, the result has to be ready when we press the button, i.e. in less than 1 s.

And from what I’ve read about AI stuff: training an AI model might take weeks or months to get good results, and even then the model isn’t perfect. Results might not be accurate enough for realtime processing even with an RTX 20xx or RTX 30xx…

I would assume that audio processing is kind of similar to what I’ve seen when reading about related stuff like realtime face detection.

A good example of such research is Performance analysis of real-time face detection system based on stream data mining frameworks - ScienceDirect

They used clustered high end machines to do the processing.

Page 814:
“Apache Storm system showed linear growth of throughput until 64 parallelism value in all tested image sizes. Apache Storm based program for 1920 × 1080 image size achieved throughput of 24 frames per second. This performance enables to process FullHD video streams in real-time.”

This means they had 64 high-end computers calculating in parallel to make realtime detection work!

Computing power has improved since that research, but 64× better performance in the last 5 years? Nope, I haven’t seen that kind of improvement in these AI-based systems. Maybe it exists, but not in a system using a single chip.

Do you have a source?


I suspect they’re using optimized stems / “tracks” for all the different (classic) stem combinations, which are pre-processed. On the other hand, it’s already pretty fast on an M2, so it almost feels like realtime.

Djay does some form of processing too, of course, and they can always expand the concept in addition to improving the AI…

1 Like

Another great example of real-time processing is this…

I.e. one developer wanted to create a program that drives his car in GTA… instead of him pressing W-A-S-D…

So in the video you can see how long it takes to process 1 frame.

As you can see in the video, processing one frame takes between 0.03 and 0.9 s.
If we assume we’d like to see 60 frames per second for a nice gaming experience, a frame changes every ~0.0167 s.
The per-frame processing times in the video are all bigger than that value.
It’s laggy because it can’t really handle it fully in real time.
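The arithmetic above can be checked in a few lines. The measured times here are illustrative values picked from the 0.03–0.9 s range the video shows, not exact readings:

```python
# Frame-time budget check for the GTA example: at 60 fps, each frame
# must be ready within 1/60 s to keep up with real time.
TARGET_FPS = 60
frame_budget = 1 / TARGET_FPS        # ~0.0167 s

measured = [0.03, 0.12, 0.45, 0.9]   # illustrative per-frame times (s)

for t in measured:
    periods = t / frame_budget       # how many frame periods one frame costs
    print(f"{t:.2f} s -> ~{periods:.0f} frame periods, "
          f"real-time: {t <= frame_budget}")
```

Even the best case (0.03 s) costs roughly two frame periods, so the pipeline can never keep up at 60 fps.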

So, from a DJ’s point of view: when the audio tracks are synced, we will notice even a small delay.
To get a good audio result, we need to calculate a good amount of stemmed audio before streaming it out to the speakers…
But what if we turn the knob for one of the stem parts? Then we need to recalculate the whole preprocessed audio buffer before streaming it out to the speakers.

So… M1s and M2s are getting better from a performance point of view. But doing things in parallel while trying to be real-time makes things much harder from a developer’s point of view.

Doing a calculation in one thread and handing it to a second thread is like boxing the data into a cardboard box and mailing it to the receiver. The receiver unboxes the data and puts it out to the speakers.

There will be delays because of the hardware and how the operating system shares data between cores, threads and related hardware.
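The “cardboard box” hand-off can be sketched with a queue between two threads. This is a toy example, not real audio code; it only measures the hop from producer to consumer:

```python
import queue
import threading
import time

# One thread computes a chunk, puts it on a queue ("boxes and mails it"),
# and a second thread takes it off ("unboxes it") for playback.
q = queue.Queue(maxsize=8)

def producer():
    for i in range(4):
        q.put((i, time.perf_counter()))  # pretend this is separated audio
    q.put(None)                          # sentinel: no more chunks

def consumer(latencies):
    while True:
        item = q.get()
        if item is None:
            break
        _, sent = item
        latencies.append(time.perf_counter() - sent)  # hand-off delay

latencies = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(latencies,))
t1.start(); t2.start(); t1.join(); t2.join()

print(f"max hand-off latency: {max(latencies) * 1000:.3f} ms")
```

Even in this trivial case the hop is not free, and under real load (scheduling, cache misses, lock contention) it gets worse.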

It’s on GitHub if you follow my link, but below is a screenshot of what is written.


Every DJ software does preprocessing, at least to get the waveform. I did not notice any delay when I was trying Serato Stems.

So… that’s my point…
There is no delay, because in Serato you preprocess and save the stems to your local drive.
The stems are there like any other mp3: no delay when loading and playing out, no need to do extra processing.

djay Pro needs to do that stuff on the fly, in real time, whenever we “push the buttons”. So the quality won’t be the same, and there will be longer delays for that extra real-time isolation than in Serato.

Does it work with an iPad? …No. …End of discussion.


In Serato you can choose to pre-compute the stems by using the “stems crate”, or just load a song to a deck and have the stems computed on demand while loading. The stems analysis is the same either way, so there is no quality improvement from pre-computing.

The intention of the stems crate is to make track-loading performance acceptable even on older, less capable machines.

The pre-computed stems come with a heavy disk-usage penalty though, whereas the on-demand stems do not. So you probably would not want to pre-compute your entire collection.
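For a feel of that penalty, here is a back-of-envelope estimate. It assumes (hypothetically; the actual on-disk format Serato uses is unknown) that each of four stems is stored as uncompressed 16-bit stereo PCM:

```python
# Rough disk cost of pre-computed stems for one typical track,
# under the uncompressed-PCM assumption stated above.
SAMPLE_RATE = 44100      # Hz
CHANNELS = 2             # stereo
BYTES_PER_SAMPLE = 2     # 16-bit
STEM_COUNT = 4
track_seconds = 4 * 60   # a typical 4-minute track

per_stem = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE * track_seconds
total = per_stem * STEM_COUNT

print(f"per stem:  {per_stem / 2**20:.0f} MiB")   # ~40 MiB
print(f"all stems: {total / 2**20:.0f} MiB")      # ~161 MiB per track
```

At roughly 160 MiB per track under these assumptions, pre-computing a few thousand tracks would eat hundreds of gigabytes, which matches the advice not to pre-compute the whole collection.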


Do you know if we can access the pre-computed stems to use them in a DAW ?


No, I don’t know in which format they store the stems, but knowing Serato it’s sure to be some proprietary format not easily usable in non-Serato software. It is likely, though, that they will integrate it with their own “Studio” software.