The MP3.com 2003 Project

Hello and welcome to this little project.

TLDR: This website contains a static copy of the MP3.com website as it existed over Thanksgiving in November 2003.

Go straight to the artist list page.


The What

MP3.com was the "go to" place for rising independent musicians to share their music online with other musicians and music lovers in the heady days of the internet. Musicians (either as individuals, groups or organizations) would create a profile on the website where they could provide some information about their music project, including elements such as genre, location and band members, while also sharing free downloads of their music. Think an early version of MySpace, or for the more current crowd, a very early Bandcamp.

During its history (detailed more here on Wikipedia) the website changed hands: it was acquired by Vivendi Universal in 2001, who later sold it off to CNET.com (very much around the time this project is capturing). In the years that followed, the website's purpose appeared to change, with a stronger focus on music news instead of being a place where individuals could share their music.

As previously mentioned, many other services rose to popularity in the years that followed and, with more musicians realising how important the internet was as a tool for marketing and distribution, many of these became well-known to the general public.


The Why

A lot of online projects have emerged to try to keep an evidence-driven database of all music information. Although finding information about music released on a major label can be considered rather trivial, information about independent releases is often hard to come by.

One of these online projects is MusicBrainz, an "open music encyclopedia that collects music metadata and makes it available to the public". This project is very close to my heart, and I have spent a great deal of time contributing to it and taking part in its community. However, it is very difficult to add entities to the database without any hard evidence, especially if that information was stored on a website that has since removed any trace of it, and it wasn't captured by early web archives.

So with this dilemma, the only two options available are:

  1. Hope that someone has a good quality backup or "dump" of the website's content
  2. A time machine that can allow you to travel back to the times when these websites existed

Considering that #2 isn't yet possible, we're hoping for #1.


The How

The Internet Archive is widely known to anyone who is interested in "lost" content. As well as primarily serving as an online library of written texts, audio, video and software, it often serves as a place for "dumps" to appear.

The Internet Archive also run one of the most (if not the most) popular web archives in existence, allowing anyone to visit a website from the past in their web browser, entirely for free!

And yes, that archive does cover some of the pages that existed on MP3.com, but not all of them. A considerable number of them were simply never touched (or at least are not easily "find-able").

However, in August 2022, free-range archivist Jason Scott (textfiles) posted a message on Twitter that a dump of the audio content (i.e. the free music downloads) had been contributed to the archive. For those interested in lost media this was surprising: many had considered this content to have been lost a long time ago, as many of the audio downloads were impossible to reach via the web archives that existed.

I was one of the individuals whose interest was sparked into a flurry of ideas and thoughts. A post was made to the MusicBrainz community boards and the interest went further. As we began to download some of these packs of MP3s, we wondered how many of them were already known to a database as large as MusicBrainz.

Luckily for us, MusicBrainz's tagging application, Picard, has an audio fingerprinting function (Chromaprint, provided by AcoustID.org) which, when given any digital audio file, generates a unique "string" for it and then checks that string against existing ones to see whether the content already exists in the database.
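
For the curious, the fingerprinting step can be roughly sketched outside of Picard. This is not Picard's own code, just a minimal Python sketch under a few assumptions: the Chromaprint command-line tool fpcalc is installed separately, you have your own AcoustID client key (the one used below is a placeholder), and the function names are made up for illustration. It turns the JSON printed by "fpcalc -json" into a lookup URL for the AcoustID web service.

```python
import json
import subprocess
from urllib.parse import urlencode

# Lookup endpoint of the AcoustID web service.
ACOUSTID_LOOKUP = "https://api.acoustid.org/v2/lookup"


def parse_fpcalc_json(text):
    """Pull (duration in whole seconds, fingerprint) out of `fpcalc -json` output."""
    data = json.loads(text)
    return int(round(data["duration"])), data["fingerprint"]


def fingerprint_file(path):
    """Run fpcalc on one audio file. Assumes fpcalc is on the PATH."""
    result = subprocess.run(
        ["fpcalc", "-json", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_fpcalc_json(result.stdout)


def build_lookup_url(client_key, duration, fingerprint):
    """Build the URL that asks AcoustID which recordings match a fingerprint."""
    query = urlencode(
        {
            "client": client_key,  # placeholder: use your own application key
            "duration": duration,
            "fingerprint": fingerprint,
            "meta": "recordings",
        }
    )
    return f"{ACOUSTID_LOOKUP}?{query}"
```

Calling build_lookup_url("YOUR-KEY", *fingerprint_file("some-track.mp3")) gives a URL that can be fetched with any HTTP client. A response with no results would correspond to the "no result in Picard" case described here.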

After some testing by myself, MusicBrainz contributors and Jason Scott himself, it looked as though less than 2% of each pack was returning any result in Picard, meaning that the fingerprints for the rest were not linked to any recording entity in the database. These were rare tracks indeed.

The digital audio files presented had some metadata attached: a track title, an artist and a comment linking back to a profile URI that would (at the time) take you to the musician's MP3.com page.
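
As a rough illustration of how such metadata can be read, here is a minimal Python sketch. It assumes the tags are stored in the simple ID3v1 format (the last 128 bytes of the file); I am not claiming that every file in the dump is tagged this way, and files using ID3v2 would need a proper tagging library instead. The function names are hypothetical.

```python
import os


def parse_id3v1(tail):
    """Decode a 128-byte ID3v1 block, or return None if it is not one."""
    if len(tail) != 128 or tail[:3] != b"TAG":
        return None

    def text(start, end):
        # Fields are fixed width and padded with NUL bytes or spaces.
        raw = tail[start:end].split(b"\x00", 1)[0]
        return raw.decode("latin-1").strip()

    return {
        "title": text(3, 33),
        "artist": text(33, 63),
        "album": text(63, 93),
        "year": text(93, 97),
        "comment": text(97, 127),  # where a profile URI could be stored
    }


def read_id3v1(path):
    """Read the ID3v1 block from the end of an MP3 file, if present."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() < 128:
            return None
        handle.seek(-128, os.SEEK_END)
        return parse_id3v1(handle.read(128))
```

Note that an ID3v1 comment holds at most 30 characters, so a long URI would be cut short in this format.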

This meant that to submit the content in the dump to MusicBrainz, one would have to hope that URI had been covered in the web archive, so as to avoid adding thousands of "unknown entities". This is because the artist pages contained extremely important identifying information such as where they were based, the genre of music they catered for, possible physical releases they had, links to old webpages, and possibly even information about band members, real names, forming dates, live performances and photos.

At this point, a lot of that information was sadly absent - meaning that contributors like myself were at the top of a very deep, dark well knowing that we'd have to break out some of our most skillful research-fu to be able to have a hope of adding the music to the database.

However, the following day, Jason Scott (the hero that he is) located another "dump" that was contributed to The Internet Archive in 2012, which had a chance of containing the ever-critical artist pages.

Work began to download the 10GB archive and unpack it. And sure enough there they were, every single artist page that existed over Thanksgiving 2003, along with dumps of artist information pages, song pages (useful for identifying work languages) and some images.

As a lot of this content was considerably "flat", all that would really need to exist for this content to be easily browsable again is a simple web server instance.
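
To give an idea of how little is needed, here is a minimal sketch using only Python's standard library. It is not necessarily how this site is actually hosted; the function name and the directory used in the note below are assumptions for illustration.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory, host="127.0.0.1", port=8000):
    """Create a static file server rooted at the unpacked dump directory."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)
```

Calling make_server("./mp3com-dump").serve_forever() would then make the flat files browsable at http://127.0.0.1:8000/ until the process is stopped. For anything public-facing, a dedicated web server would be the more sensible choice.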

So that is what you're looking at: a web server with the dump data pumped into it. However, there are some restrictions and things to note...


Restrictions/Limitations of this Project

There are some limitations and personal restrictions to this project. Some of these are because the data simply doesn't exist, and some of these are because it doesn't "fit" my current needs.


The Thank-yous

  • The Internet Archive for being awesome, and allowing knowledge to be accessible and free. If you're interested in any of this kind of preservation, I highly suggest you consider donating to their cause.
  • Jason Scott for being an awesome human being, for being patient with some of my questions, for capturing and sharing these dumps. If you would like, you can follow Jason and his escapades on Twitter at @textfiles
  • Brewster Kahle for not only founding the Internet Archive, but also for being involved with the "just in time" metadata archive that makes this project possible
  • John Gilmore for helping found one of the most important organizations on the internet, the Electronic Frontier Foundation (EFF), and for his work with Brewster to capture the data presented in this project.
  • MetaBrainz Foundation for providing the MusicBrainz project, an answer to any music nerd's hopes and dreams. If you've been spending all your time submitting information to closed-source databases (like Discogs), may I kindly suggest you look at donating some of your hard efforts to MusicBrainz, where your data and hard work are always appreciated
  • MP3.com for existing and providing a central place for many of these aspiring musicians to share their content

Browse Pages

With the explanations out of the way, I'm sure you're keen to get stuck in - go right ahead, follow me


Contact

Got something to say? Want to give some feedback? Can you help with missing data? Want your content revoked from this project?

Contact me via E-Mail at mp3-2003@computer-legacy.com


A computer-legacy.com project - click for more old computer business.