Ahh ... now I have a much better idea of your knowledge of DBs and programming. Sorry, but I had to feel you out on this because I regularly get questions from people whose idea of a database is a 50 row Excel spreadsheet, and they have never programmed anything more than, "Hello, World" in their freshman CompSci course, LOL.
WRT your comment,
"...rather than keying in a meaningful description in the image name...", you are absolutely on the correct path. As per common DB practice, I put almost nothing in the image name except a unique identifier. All the good stuff goes in the many (standardized) IPTC fields. If you are not familiar with it, IPTC has a good website, and you definitely should take a stroll through it:
http://www.iptc.org/cms/site/index.html .
Also, for future compatibility, you should get a feeling for which IPTC fields are currently well supported by image-viewing programs and commercial DAM software. I would say that probably fewer than 10% of the available IPTC fields are generally recognized by such software, so to ensure maximum compatibility, you should probably stick to populating only the most commonly used fields.
WRT having > 600k images, I'm impressed! I thought that as an individual photographer, I had a fairly large archive with > 200k images. Like you, I've been shooting since the late 1950s, and have digitized quite a few of my images from those days.
OP:
"...You mentioned a package called 'Photo Mechanic'. I'll have a look at this first...." - Yes, definitely have a look at it, but realize that its forte is speeding up the initial manual entry of keywords and other IPTC metadata into the file itself, not adding to a DB or retrieving images using a DB. You mentioned sports: PM has a neat feature called "code replacement", which is essentially shortcut "aliases" for team members, so if you have thousands of pix of one team or from some game, with different groups of players appearing in each image, you can very quickly enter the full names, spelled correctly every time, for every person in that image. For a more detailed description of this feature, see:
http://www.controlledvocabulary.com/imagedatabases/cv-photo-mechanic-code-replacement.html
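If you're curious how the code-replacement idea works conceptually, it's just token substitution: a short code you type while captioning expands to a full, correctly spelled name. Here's a minimal sketch (the codes and names are made up for illustration; Photo Mechanic uses its own replacement-file format):

```python
# Sketch of the "code replacement" idea: short codes typed during
# captioning expand to full, correctly spelled names.
# These codes/names are hypothetical, not PM's actual file format.

REPLACEMENTS = {
    "jd5": "John Doe (#5)",
    "rs12": "Richard Smith (#12)",
}

def expand_codes(caption, table):
    """Replace whole-word codes in a caption with their full names."""
    return " ".join(table.get(word, word) for word in caption.split())

print(expand_codes("jd5 passes to rs12 in the second half", REPLACEMENTS))
# -> John Doe (#5) passes to Richard Smith (#12) in the second half
```

The win is consistency: every image of a given player gets the identical, correctly spelled name, which matters enormously when you later search the DB on those names.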
Personally, I wouldn't reinvent the wheel by writing software to extract the metadata from your images. Rather, I would strongly suggest you use ExifTool. If you aren't already familiar with it, you owe it to yourself to read the following:
http://www.sno.phy.queensu.ca/~phil/exiftool/
This tool is very stable and well respected. If you can make use of it, then you can concentrate your efforts on the construction of the DB and actually using it.
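To give a flavor of how this could fit together: ExifTool can dump metadata as JSON (e.g., exiftool -json -Keywords -Caption-Abstract on a directory), which is trivial to load into a database. A rough sketch, assuming a simple SQLite catalog; the schema, field choices, and sample record below are mine, just for illustration:

```python
import json
import sqlite3

# Hypothetical output from: exiftool -json -Keywords -Caption-Abstract <dir>
# Normally you'd capture this via subprocess; hard-coded here for illustration.
exif_json = """[
  {"SourceFile": "img_000001.jpg",
   "Keywords": ["baseball", "1962"],
   "Caption-Abstract": "Opening day at the old ballpark"}
]"""

conn = sqlite3.connect(":memory:")  # use a file path for a real catalog
conn.execute("CREATE TABLE images (path TEXT PRIMARY KEY, caption TEXT)")
conn.execute("CREATE TABLE keywords (path TEXT, keyword TEXT)")

for rec in json.loads(exif_json):
    conn.execute("INSERT INTO images VALUES (?, ?)",
                 (rec["SourceFile"], rec.get("Caption-Abstract", "")))
    for kw in rec.get("Keywords", []):
        conn.execute("INSERT INTO keywords VALUES (?, ?)",
                     (rec["SourceFile"], kw))

# Retrieval: every image tagged "baseball"
hits = [row[0] for row in conn.execute(
    "SELECT path FROM keywords WHERE keyword = ?", ("baseball",))]
print(hits)  # -> ['img_000001.jpg']
```

One caveat from real use: ExifTool emits a tag like Keywords as a JSON array only when there are multiple values (a lone keyword comes through as a plain string), so production code would need to normalize that before inserting.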
Finally, with respect to construction of the DB itself, realize that by building one that handles > 600k records, you are essentially constructing an enterprise-sized DAM tool. Such commercial software already exists, so it probably behooves you to survey what's available before you embark on what is going to be a significant effort. It probably boils down to a trade-off between your time (plus the usual development uncertainties) and your money (but no development uncertainty).
With respect to other DAM software, I can tell you this: From what I've heard, as well as from my own personal experience, Lightroom starts getting very slow above 100k images (FYI, this is on a very fast i7, Win8 box with 64 GB of RAM and several SSDs), so I can't recommend it for an archive of your size. I know that Picasa works fine up to about 300k images, the largest archive I've ever tested it with (...after that, I split my DB into two), but its record-retrieval features are primitive compared to what you are probably looking for.
I also use a DAM product called "Extensis Portfolio" (personal edition). I've had a love-hate relationship with it for probably 7 years now. In some ways it's wonderful, but in other ways it's horrible. For example, if the version of it that I own encounters an unknown file type (e.g., a 16-bit-per-channel TIFF), it stops dead in its tracks and waits for user input instead of writing the suspect file to a log and continuing the ingest process. Such behavior may be tolerable if you are ingesting 1000 images, where it might hang a couple of dozen times. It's completely unacceptable behavior if you are attempting to ingest (...or re-ingest for the n-th time) a 200k image archive. It also appears that almost no development has been done on the studio/personal version in many years. Maybe their enterprise-level version has received more attention. Bottom line: You might want to look into their enterprise-level version, but I can't personally recommend it.
HTH,
Tom M