Author Topic: Some sort of ID tag standardization or redirects  (Read 6485 times)

Nazosan

  • Newbie
  • *
  • Posts: 14
Some sort of ID tag standardization or redirects
« on: March 02, 2009, 10:14:53 pm »
I think one problem in Audiosurf right now is that not everyone gets their tags from the same source.  For example, I tag all of my music using MusicBrainz, but apparently not everyone does.  I have a lot of stuff tagged in the original language, while a lot of people have transliterations (I guess I'm going to have to go through and manually retag all of these or something, though if I can figure out how to use it, I think Picard has a plugin for this that would at least help.)  It's too bad: Audiosurf now has such great Unicode support, but all too few seem to be taking advantage of it.

And just recently a friend posted a score on one song, but her copy had the artist's name reversed (actually technically correct, but no one uses it that way) from the ordering everyone else was using.  As such, her score isn't even on the global board for that particular song, but is instead completely on its own.  I did a run on that song and was quite frustrated, as I had to restart several times due to some really tight spaces between the grays, and by the time I finished there was no way I was running it again...  (Yeah, we can compare the hard way, but what's the point of the global boards if people have to do this?)  Most importantly, I think everyone playing the same song should be comparing scores, rather than only everyone playing the same song with the same tags...

Now Picard and such aren't perfect by any means.  I've had a perfect copy straight from a CD where Picard thought most of the tracks didn't match the album for some reason (and one or two where it tried to match a track or two to another album or a single -- sometimes not even the right song.)  I will say I've had better results with MusicBrainz's algorithms than any others so far though.  So I don't think matching should be enforced, since there are incorrect results.  I think maybe the game should try to check each song while it's generating the track data and, if it finds a match, ask the user whether the match is correct.  It could save the MusicBrainz album/track IDs in the file's metadata, but not change the actual artist/title/etc. data in case the user has them a certain way on purpose.  I do think that if this were made optional it should be turned on by default, as many people don't really look through the settings much and would just leave such a thing turned off, thus defeating much of the purpose.

Now, this has a couple of obvious problems.  First, MusicBrainz lookups aren't exactly blazing fast.  This should perhaps be optional, and the results need to be saved locally in some manner so that each song only needs to be looked up once.  (The game is already saving data on each song anyway, so this seems to me like it wouldn't be a big deal, as we're likely talking about considerably less than 1KB of plain text.)  Also, MusicBrainz may not be super-happy about the volume of requests to their servers.  To this end, I would propose that Audiosurf host its own proxy server that does some major caching.  All of the popular results would be looked up instantly in the proxy's cache, so the more popular songs would never hit the MusicBrainz servers (and I imagine these would make up the majority of the requests -- especially given that the worst hits would come from the likes of the radio, though I would imagine that the radio could download this information while it's downloading the song anyway.)  I think the actual volume of data would be relatively small -- especially compared to what the radio is using up -- and with basic HTTP compression enabled it should be quite minimal by comparison.  Even the cache probably wouldn't be large, given that we're talking about small amounts of plain text more than anything else (and obviously it doesn't need to bother with loading ALL of the album data.  For example, the Amazon ASIN has no value in Audiosurf and thus should be skipped.)
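
To make the caching idea a bit more concrete, here is a minimal Python sketch of a cache-through lookup. It is only an illustration under assumptions: `upstream` stands in for whatever function would actually query the remote service (no real MusicBrainz API call is shown), and the lookup key, the file name and the `track_id` field are all made up for the example.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Optional

# Fields worth keeping; everything else (e.g. the Amazon ASIN) is dropped.
# These field names are assumptions for the sketch.
KEEP_FIELDS = ("artist", "title", "track_id")


class CachedLookup:
    """Cache-through wrapper: each key reaches the upstream service at most once."""

    def __init__(self, upstream: Callable[[str], Optional[dict]], cache_file: Path) -> None:
        self._upstream = upstream          # hypothetical remote lookup function
        self._cache_file = cache_file      # small plain-text (JSON) cache on disk
        self._cache: Dict[str, Optional[dict]] = {}
        self.upstream_calls = 0
        if cache_file.exists():
            self._cache = json.loads(cache_file.read_text(encoding="utf-8"))

    def lookup(self, key: str) -> Optional[dict]:
        # Serve from the cache whenever possible.
        if key in self._cache:
            return self._cache[key]
        # Otherwise ask the upstream service once and remember the answer.
        self.upstream_calls += 1
        result = self._upstream(key)
        if result is not None:
            result = {k: result[k] for k in KEEP_FIELDS if k in result}
        # Misses (None) are cached too, so they are not looked up again.
        self._cache[key] = result
        self._cache_file.write_text(
            json.dumps(self._cache, ensure_ascii=False), encoding="utf-8"
        )
        return result
```

The same wrapper could sit in the game (one lookup per song, ever) or in a proxy in front of the real service; only where the cache file lives would change.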

I realize that not everything being worked out with MusicBrainz is complete and that, in fact, it's not being used right now, but maybe something like this could be brought up in the discussions with MusicBrainz as well?

Alternately, I'm not quite sure how it could be done, but perhaps some sort of "redirect" system could be implemented.  For example, Last.fm is now starting to use this sort of concept.  Many transliterated tags will redirect automatically to the correct entry.  Artist names that are reversed are often switched back, or sometimes even transliterated back to the original text.  As such, a lot of incorrect tags (or technically correct ones that differ from what the majority use) are now being redirected to the correct artist/song, so the entry works much better.  (This is especially useful for the charts, since if you listen to the same artist and their name is reversed on some of the songs but not all, it throws off the whole thing.)  Such a system pretty much has to rely on user corrections, and this obviously isn't perfect by any means, but it still works a lot better than having nothing.
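
As a rough sketch of what such a redirect table could look like (this is not how Last.fm actually does it, just an illustration in Python with a made-up table), each variant spelling points at a preferred one and the scoreboard uses whatever the chain resolves to:

```python
from typing import Dict


def resolve_artist(name: str, redirects: Dict[str, str], max_hops: int = 10) -> str:
    """Follow redirects from a variant artist name to its canonical form.

    `redirects` is a hypothetical table of user-submitted corrections.
    Chains are followed, and a cycle or an overly long chain simply stops.
    """
    seen = set()
    current = name
    while current in redirects and current not in seen and len(seen) < max_hops:
        seen.add(current)
        current = redirects[current]
    return current


# Example table: the entries are invented for illustration.
REDIRECTS = {
    "Family Given": "Given Family",        # reversed name order
    "Given Family (romaji)": "Family Given",  # transliteration, chained
}

canonical = resolve_artist("Given Family (romaji)", REDIRECTS)
```

The local files and tags stay untouched; only the name used for the scoreboard lookup changes, and removing a table entry undoes the redirect.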

Just a thought anyway.  Whatever the case may be, we certainly need some way of getting differently tagged copies of the same song to show up on the same charts.
« Last Edit: March 04, 2009, 04:51:17 pm by Nazosan »

charlieh

  • Jr. Member
  • **
  • Posts: 59
Re: Some sort of ID tag standardization or redirects
« Reply #1 on: March 03, 2009, 12:07:19 am »
I think Nazosan is dead-on, here. This is quite an annoying problem, and it would be nice to see a solution implemented.

Here is a simple example:
Black Keys at satf.se
The Black Keys at satf.se

As you can see, some people have the leading "The" and some don't. This throws off a lot of scoreboards.

godjonez

  • Newbie
  • *
  • Posts: 22
Re: Some sort of ID tag standardization or redirects
« Reply #2 on: April 24, 2009, 05:43:15 am »
I have also seen this being a problem for quite a long time. My thoughts for solving it were quite different, though. Since Audiosurf already saves the track profile and song length, that data should be used to compare the songs, not the metadata that everyone can edit.

There are currently two major problems:

- different names for the exact same song, causing the score boards to be separated
- different songs having the exact same name, causing the scoreboards of different songs to be merged


So what I would suggest is that Audiosurf look for songs that have the same track profile and a similar length (there might be a few seconds' difference due to silence at the start or end). If the song being played already has metadata (most do!), that can also be used to find the closest match.
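
Here is a small Python sketch of that lookup, just to show the shape of it. Everything in it is an assumption: `profile_key` stands for some opaque ID derived from the track profile, the 3-second tolerance is arbitrary, and the tag comparison uses the standard library's `difflib` as a stand-in for whatever similarity measure would really be used.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    artist: str
    title: str
    length_s: float
    profile_key: str  # hypothetical opaque ID for the track profile


def tag_similarity(a: str, b: str) -> float:
    """Rough 0..1 similarity between two tag strings, ignoring case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def closest_entry(
    artist: str,
    title: str,
    length_s: float,
    profile_key: str,
    known: List[Entry],
    max_length_diff_s: float = 3.0,
) -> Optional[Entry]:
    # Step 1: same track profile and a length within a few seconds.
    candidates = [
        e for e in known
        if e.profile_key == profile_key
        and abs(e.length_s - length_s) <= max_length_diff_s
    ]
    if not candidates:
        return None
    # Step 2: among those, prefer the entry whose tags look most like ours.
    mine = f"{artist} - {title}"
    return max(candidates, key=lambda e: tag_similarity(mine, f"{e.artist} - {e.title}"))
```

For an untagged song, `artist` and `title` can be passed as empty strings; any candidate that survives step 1 is then a suggestion for fixing the tags.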

This could also be helpful when you have untagged songs, with the artist, the title or both unknown. Audiosurf could find someone who has played the exact same song before but with it properly tagged, so you could actually fix your song metadata based on what other players have played.

This is just a suggestion of how it could work. I know it might not be very practical with the database Audiosurf uses for the scoreboards, but that's how I feel it should have been from the beginning.

laofmoonster

  • Newbie
  • *
  • Posts: 15
Re: Some sort of ID tag standardization or redirects
« Reply #3 on: June 30, 2009, 10:26:31 am »
This has bothered me too. Half of my top scores are meaningless because I tagged them wrong.  :'(
I'd think there would be *some* consistent way of correcting tags without combining the high scores of unrelated songs. For example:

3 Doors Down = Three Doors Down != Three Days Grace
Beatles, The = The Beatles = Beatles != Beatless
Dave Matthews Band = Dave Matthew's Band != Dave Matthews
[Band Name] -- [Title name] feat./ft./featuring [guest artist] = [Band Name] feat./ft./featuring [guest artist] -- [Title Name]
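
The rules above can be sketched as a normalization step that runs before the scoreboard lookup. This is only an illustration in Python; the function names are invented, the number handling covers single digits only, and dropping a leading "The" from titles as well as artists is a choice that could merge a few distinct titles.

```python
import re
from typing import Tuple

# Only single digits are spelled out; enough for "3 Doors Down".
_NUMBER_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}

# Trailing "feat. X", "ft. X" or "featuring X", optionally in parentheses.
_FEAT = re.compile(
    r"\s+\(?(?:featuring|feat\.?|ft\.?)\s+(?P<guest>[^)]+)\)?\s*$",
    re.IGNORECASE,
)


def normalize_name(name: str) -> str:
    s = name.casefold().strip()
    s = re.sub(r"^(.*),\s*the$", r"\1", s)        # "beatles, the" -> "beatles"
    s = re.sub(r"^the\s+", "", s)                 # "the beatles"  -> "beatles"
    s = s.replace("'", "").replace("\u2019", "")  # "matthew's"    -> "matthews"
    s = re.sub(r"[^\w\s]", " ", s)                # other punctuation -> space
    return " ".join(_NUMBER_WORDS.get(w, w) for w in s.split())


def scoreboard_key(artist: str, title: str) -> Tuple[str, str, Tuple[str, ...]]:
    """Build a key that ignores where the 'feat.' part was written."""
    guests = []
    fields = []
    for field in (artist, title):
        match = _FEAT.search(field)
        if match:
            guests.append(normalize_name(match.group("guest")))
            field = field[:match.start()]
        fields.append(normalize_name(field))
    return fields[0], fields[1], tuple(sorted(guests))
```

With these rules "Beatles, The", "The Beatles" and "Beatles" share one key while "Beatless" keeps its own, because nothing here does fuzzy matching; only the listed rewrites are applied.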

zig131

  • Newbie
  • *
  • Posts: 2
Re: Some sort of ID tag standardization or redirects
« Reply #4 on: October 02, 2009, 01:43:00 pm »
Even more annoying is when Audiosurf doesn't seem to be able to get the artist info from tagged tracks. That's why I'm the casual champion of five songs, all of them by the rockin' 'Unknown Artist'. The tracks are tagged (Windows, AIMP2, MPC HC and WMP all show the artists); it's just that Audiosurf doesn't seem to read the tags. It doesn't happen with all artists, but it happens with a lot of them. For those interested, I use the Windows Media Audio file format, and I tag manually using Creative MediaSource (initially, when ripping) and the AIMP2 advanced tag editor (for adjustments). If Audiosurf can generate small 'track' files from songs, can't it just compare these against a database or something? Or maybe it could just look at the name of the folder the file is in, or ask the user when tag details are missing.

ncom0pl

  • Newbie
  • *
  • Posts: 1
Re: Some sort of ID tag standardization or redirects
« Reply #5 on: January 01, 2010, 05:02:23 pm »
It's been some time since the last message in this thread, but it doesn't look like it's been solved yet.

I agree that this is very annoying. For example, there is a band called AFI, and they have a song called The New Patron Saints and Angels.

Most people probably have it tagged simply as AFI - The New Patron Saints and Angels.
But I've also seen some other tags, like:
  • AFI - New Patron Saints and Angels
  • AFI - New Patron Saints and Angels, The
  • A.F.I - The New Patron Saints and Angels
  • A.F.I - New Patron Saints and Angels
  • A.F.I - New Patron Saints and Angels, The
  • A.F.I. - The New Patron Saints and Angels
  • A.F.I. - New Patron Saints and Angels
  • A.F.I. - New Patron Saints and Angels, The
  • A Fire Inside - The New Patron Saints and Angels
  • A Fire Inside - New Patron Saints and Angels
  • A Fire Inside - New Patron Saints and Angels, The

So there are at least 12 separate score lists for just one song.

Last.fm, a music community website that provides global listening statistics (and lets users compare their "musical compatibility"), deals with this by correcting the artist/title on your web profile by default, leaving your local files and tags untouched. You can later revert to the original if you want, or disable the feature completely.

Audiosurf needs a thing like that for the global scoreboards to be fair and accurate.

ViRUS

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3156
Re: Some sort of ID tag standardization or redirects
« Reply #6 on: January 01, 2010, 05:15:45 pm »
Quote from: ncom0pl
It's been some time since the last message in this thread, but it doesn't look like it's been solved yet.

Maybe because there's nothing to be solved?


yindesu

  • Newbie
  • *
  • Posts: 10
Re: Some sort of ID tag standardization or redirects
« Reply #7 on: February 06, 2010, 01:44:10 pm »
What if Audiosurf implemented a vote-to-correct system like Last.fm's, where users can vote to merge songs and artists, with moderators/admins reviewing the cases before approving them?  For tracks I can understand this would be overwhelming, but just merging artists would be a grand start.

edit: I see that this was mentioned in the very last paragraph of the opening post, I did read _most_ of it before posting..
« Last Edit: February 06, 2010, 01:46:46 pm by yindesu »

murlough23

  • Sr. Member
  • ****
  • Posts: 403
Re: Some sort of ID tag standardization or redirects
« Reply #8 on: February 18, 2010, 04:42:43 pm »
Quote from: yindesu
What if Audiosurf implemented a vote-to-correct system like Last.fm's, where users can vote to merge songs and artists, with moderators/admins reviewing the cases before approving them?  For tracks I can understand this would be overwhelming, but just merging artists would be a grand start.

I was thinking of something like this as well. Let it be user-driven, but with some moderator control so people can't cheat the system. A good idea would be to store, as a simple text field, the exact spelling, punctuation, etc. of the track as the user had it tagged, just in case there's ever a need to "un-merge" two separate songs that were incorrectly merged (such as distinct recordings of the same song that someone erroneously merged even though they were labelled separately - an album version and a live version or radio edit, etc.)
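
A minimal Python sketch of that idea, assuming a toy in-memory scoreboard (the class and method names are made up): every score keeps the exact tag string it was submitted with, and a merge is just a removable mapping on top.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Score:
    player: str
    points: int
    original_tag: str  # exact "Artist - Title" text as the player had it tagged


class Scoreboards:
    def __init__(self) -> None:
        self._scores: List[Score] = []
        self._merged_into: Dict[str, str] = {}  # original tag -> canonical tag

    def submit(self, player: str, points: int, tag: str) -> None:
        self._scores.append(Score(player, points, tag))

    def merge(self, tag: str, canonical: str) -> None:
        """Record a (moderator-approved) merge; the stored scores are not rewritten."""
        self._merged_into[tag] = canonical

    def unmerge(self, tag: str) -> None:
        """Undo a merge; possible because the original tag was never lost."""
        self._merged_into.pop(tag, None)

    def _canonical(self, tag: str) -> str:
        return self._merged_into.get(tag, tag)

    def board(self, tag: str) -> List[Score]:
        target = self._canonical(tag)
        rows = [s for s in self._scores if self._canonical(s.original_tag) == target]
        return sorted(rows, key=lambda s: s.points, reverse=True)
```

Because merging never touches the stored text, an album version and a live version that were merged by mistake fall back onto separate boards as soon as the mapping is removed.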

Aquinox

  • Newbie
  • *
  • Posts: 30
Re: Some sort of ID tag standardization or redirects
« Reply #9 on: March 02, 2010, 12:15:11 pm »
Also, some people don't put (Original Mix) after the track name, which also causes duplicate tracks.

I have the track Tiėsto - In my Memory (Airwave Instrumental Mix). I played it, and yet it doesn't appear when I search for "tiesto" or "in my memory" (as you'll understand, this is not the only example). It only gets a result if the whole string is typed.

I think the easiest would be for it to compare the waveform with those of other tracks, and if it's not sure, give you suggestions and ask which is the right one.

murlough23

  • Sr. Member
  • ****
  • Posts: 403
Re: Some sort of ID tag standardization or redirects
« Reply #10 on: March 03, 2010, 06:23:50 pm »
Quote from: Aquinox
I think the easiest would be for it to compare the waveform with those of other tracks, and if it's not sure, give you suggestions and ask which is the right one.

That would be very easy for us to use, but very difficult for the developers to implement. Simple is better - matching substrings should be sufficient. Failing that, the ability to search artist AND title as separate search fields would be a good quick fix (so we can find songs with very common titles by artists with more popular songs in the database).
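
For what it's worth, the simple version is only a few lines. Here is a Python sketch of substring search with separate artist and title fields; the accent folding is an extra assumption on my part, added so that a plain "tiesto" can match an accented spelling, and the song list is a stand-in for the real database.

```python
import unicodedata
from typing import Iterable, List, Tuple


def fold(text: str) -> str:
    """Lowercase, strip accents and collapse whitespace for loose matching."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def search(
    songs: Iterable[Tuple[str, str]],
    artist_query: str = "",
    title_query: str = "",
) -> List[Tuple[str, str]]:
    """Return (artist, title) pairs where both queries occur as substrings.

    An empty query matches everything, so either field can be left blank.
    """
    a, t = fold(artist_query), fold(title_query)
    return [
        (artist, title)
        for artist, title in songs
        if a in fold(artist) and t in fold(title)
    ]
```

Searching with `artist_query="tiesto"` or `title_query="in my memory"` would then both find the Airwave Instrumental Mix entry without typing the whole string.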

Aquinox

  • Newbie
  • *
  • Posts: 30
Re: Some sort of ID tag standardization or redirects
« Reply #11 on: March 04, 2010, 10:24:59 am »
Not a deep comparison of the two waveforms, just one with a +/- 5% difference margin. The scoreboard already shows the waveform of a played track, so comparing that shouldn't be hard.
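
Something like this Python sketch is what I have in mind. It is one possible reading of "5% margin", not a tested design: both shapes are resampled to the same number of points and the average difference is compared against 5% of the overall value range. The point lists and the default of 64 samples are assumptions.

```python
from typing import List, Sequence


def resample(points: Sequence[float], n: int) -> List[float]:
    """Linearly interpolate `points` onto n evenly spaced positions."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two points and two samples")
    last = len(points) - 1
    out = []
    for i in range(n):
        pos = i * last / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(points[lo] * (1 - frac) + points[hi] * frac)
    return out


def similar(a: Sequence[float], b: Sequence[float], margin: float = 0.05, n: int = 64) -> bool:
    """True if the mean difference is within `margin` of the combined value range."""
    ra, rb = resample(a, n), resample(b, n)
    span = max(max(ra), max(rb)) - min(min(ra), min(rb))
    if span == 0:
        return True  # both shapes are the same flat line
    mean_diff = sum(abs(x - y) for x, y in zip(ra, rb)) / n
    return mean_diff / span <= margin
```

The margin is a parameter, so it could be loosened if differently encoded copies of the same song turn out to differ by more than 5%.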

blue_h3x

  • Hero Member
  • *****
  • Posts: 4577
Re: Some sort of ID tag standardization or redirects
« Reply #12 on: March 04, 2010, 10:31:24 am »
Quote from: Aquinox
Not a deep comparison of the two waveforms, but just one with a +/- 5% difference margin. It shows the waveform of a played track in the scoreboard already, so comparing that shouldn't be hard.

You might get differences greater than that if you compare a compressed MP3 vs something lossless like FLAC.

Mincus

  • Hero Member
  • *****
  • Posts: 2394
Re: Some sort of ID tag standardization or redirects
« Reply #13 on: March 04, 2010, 10:51:38 am »
Bugger comparing the waveforms, Dylan already has this wonderful rollercoaster visual he can generate. Just need to stick close to that.

Currently the one shown has a limited number of points, but since it's a small version of the track shape itself in-game, adding more points wouldn't be a problem.

I still maintain I'd rather see work on a new engine though. :P
With a properly coded engine Audiosurf could easily run with Premium on graphics cards 5 years old.
What you lose for rapid prototyping...

Aquinox

  • Newbie
  • *
  • Posts: 30
Re: Some sort of ID tag standardization or redirects
« Reply #14 on: March 06, 2010, 01:34:37 pm »
Yeah, it is strange that a $500 video card from a few years ago can't run this on max.

Quote from: Aquinox
Not a deep comparison of the two waveforms, but just one with a +/- 5% difference margin. It shows the waveform of a played track in the scoreboard already, so comparing that shouldn't be hard.

Quote from: blue_h3x
You might get differences greater than that if you compare a compressed MP3 vs something lossless like FLAC.

I don't think so. Maybe 2-3%, but I'm not sure.

Completely unnecessary, unrelated, and mostly untrue.
« Last Edit: March 06, 2010, 08:19:55 pm by Laserrobotics »