Photoshop Gurus Forum


Metadata


Orc
New Member · Messages: 2 · Likes: 0
I complete 'File Info' for all the photographs I take. I take several hundred images per week.
I would like to take the metadata attached to each image and port it into a database.
Does anyone know of a piece of software that does this? I'm happy to pay for s/w that works for me.

Orc :lol:
 

ALB68
Dear Departed Guru and PSG Staff Member · Messages: 3,020 · Likes: 1,332
Orc said:
I complete 'File Info' for all the photographs I take. I take several hundred images per week.
I would like to take the metadata attached to each image and port it into a database.
Does anyone know of a piece of software that does this? I'm happy to pay for s/w that works for me.

Orc :lol:

Perhaps there is help here https://forums.adobe.com/thread/966789?tstart=0
 

Tom Mann
Guru · Messages: 7,223 · Likes: 4,343
Exactly what do you intend to do with the metadata DB, i.e., in what ways will you likely query it?

If all you want to do is search for specific images, e.g., requests from old wedding clients to order new prints, there are numerous solutions. They all go under the name of "Digital Asset Management" (DAM) systems, and there is even a very good book (by the same name) that describes best practices in this field.

Although it is lacking in some ways (e.g., Boolean searches are not fully implemented) and doesn't fit some photographers' workflows (images must be "imported" before you can use LR's image editing capabilities), Adobe's Lightroom is probably the most modern, best-selling, best-supported general choice at the present time for individual photographers, very small agencies, etc.

In addition to expected search criteria like file name, keywords, image date and time, type and size of image, number of "stars", "color rating", etc., there are also aftermarket plugins (e.g., Jeffrey Friedl's excellent software) that do fairly unique things with the metadata, like plotting graphs of how often you use different focal lengths, camera bodies, lenses, times of day, etc.

In addition to LR, there is everything from costly, enterprise-level (e.g., large stock agency) software packages all the way down to something as simple (and free) as Google's Picasa. For simple keyword / caption / etc. searches of collections as large as mine (over 200k images), it is amazingly good.

Unfortunately, entering metadata into Picasa is abysmally limited, so Picasa is best used in conjunction with pro-level dedicated image ingestion / metadata entry software such as "Photo Mechanic" (which I have used virtually every day for the past 5 or so years, and can highly recommend). PM is the historic choice of photojournalists, sports photographers, etc., who must get fully annotated images to their editors ASAP, but it has no built-in DB (although one has been rumored to be in development for the last couple of years), so it can't be used for searches beyond one directory of images at a time.

OTOH, if by "a database" you really mean something as simple as an Excel or CSV file, there are numerous other choices available, although most of these require a reasonable amount of programming experience, e.g., command-line solutions such as EXIFtool, JSX scripts for Bridge, etc. Just Google {image metadata to CSV}.

HTH,

Tom M
 

Orc
New Member · Messages: 2 · Likes: 0
Thanks, Tom, for replying so promptly.
I have 600,200 images, varying from family photographs to sports pix.
I have photographs going back 40 years. As I use a good-quality Nikon camera, most of my images are already digitised and filed away, and my older photographs (negatives) have also been digitised. Half of my photographs carry only an image identifier and no more information apart from the embedded metadata. So rather than keying a meaningful description into each image name, I thought that extracting the metadata automatically would speed things up.
I was going to attempt to write a routine that scans a folder of images and writes the metadata to a database, then use SQL to interrogate the data.
I'm very often asked for photographs of a particular sportsman dating back years. Sometimes the only information I'm given, apart from the name, is a date or a place. If I could copy the data for every image I have into a database, a fast search engine would give me what I require.
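For example (just a sketch of the kind of query I have in mind; the table and column names are made up, and EXIF dates are stored as 'YYYY:MM:DD HH:MM:SS'):

Code:
       sqlite3 photos.db "SELECT source_file, create_date, caption
                          FROM images
                          WHERE keywords    LIKE '%Smith%'      -- sportsman's name
                            AND create_date LIKE '2005:%'       -- year, if known
                            AND location    LIKE '%Murrayfield%';"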
You mentioned a package called 'Photo Mechanic'. I'll have a look at this first.
I'm quite experienced in building and using databases. I'm currently building a database for a bank in Scotland to contain several billion records. I'm using Cloudera, Terradata, Hadoop and using Hive and SAS to interrogate it.
Thanks again. You have given me a number of avenues to explore.

regards

jan
 

Tom Mann
Guru · Messages: 7,223 · Likes: 4,343
Ahh ... now I have a much better idea of your knowledge of DBs and programming. Sorry, but I had to feel you out on this, because I regularly get questions from people whose idea of a database is a 50-row Excel spreadsheet, and who have never programmed anything more than "Hello, World" in their freshman CompSci course, LOL.

WRT your comment, "...rather than keying in a meaningful description in the image name...", you are absolutely on the correct path. As per common DB practice, I put almost nothing in the image name except a unique identifier. All the good stuff goes in the many (standardized) IPTC fields. If you are not familiar with it, IPTC has a good website, and you definitely should take a stroll through it: http://www.iptc.org/cms/site/index.html .

Also, for future compatibility, you should get a feeling for which IPTC fields are currently commonly supported by image-viewing programs and commercial DAM software. I would say that probably fewer than 10% of the available IPTC fields are generally recognized by such software, so to ensure maximum compatibility, you should probably stick to populating only the most commonly used fields.
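As a concrete illustration, EXIFtool (which I mentioned earlier, and will come back to below) can populate the common IPTC fields straight from the command line. The field values here are obviously just placeholders:

Code:
       exiftool -IPTC:Keywords+="rugby" -IPTC:Keywords+="Murrayfield" \
                -IPTC:Caption-Abstract="Scotland v England, 1998" \
                -IPTC:By-line="J. Smith" IMG_0001.jpg

By default, EXIFtool keeps the untouched original as IMG_0001.jpg_original, so it is cheap to experiment.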


WRT having > 600k images, I'm impressed! I thought that, as an individual photographer, I had a fairly large archive with > 200k images. Like you, I've been shooting since the late 1950s, and have digitized quite a few of my images from those days.


OP: "...You mentioned a package called 'Photo Mechanic'. I'll have a look at this first...." - Yes, definitely have a look at it, but realize that it's forte is speeding up the process of initial manual entry of keywords and other IPTC metadata into the file itself, not adding to a DB or retrieval of images using a DB. With respect to Photo Mechanic, you mentioned sports. PM has a neat feature called "code replacement" which is essentially shortcut "aliases" for team members, so if you have thousands of pix of one team or from some game, but different groups of players appear in each image, you can very quickly enter the full names, spelled correctly each time, for every person in that image. For a more detailed description of this feature, see: http://www.controlledvocabulary.com/imagedatabases/cv-photo-mechanic-code-replacement.html

Personally, I wouldn't reinvent the wheel by writing software to extract the metadata from your images. Rather, I would strongly suggest you use EXIFtool. If you aren't already familiar with it, you owe it to yourself to read the following:
http://www.sno.phy.queensu.ca/~phil/exiftool/

This tool is very stable and well respected. If you can make use of it, then you can concentrate your efforts on the construction of the DB and actually using it.
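Just to show how little glue code is involved (a rough sketch; the file, table, and path names are whatever you choose), you could dump everything to a CSV and pull it straight into SQLite:

Code:
       # extract all metadata, recursively, into one big CSV
       exiftool -csv -r /path/to/archive > metadata.csv

       # import it; the .import command creates the "images" table
       # from the CSV header row if the table doesn't already exist
       printf '.mode csv\n.import metadata.csv images\n' | sqlite3 photos.db

From that point on, it's ordinary SQL.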

Finally, with respect to construction of the DB itself, realize that by constructing one that handles > 600k records, you are essentially building an enterprise-sized DAM tool. Such commercial software already exists, so it probably behooves you to survey what's available before you embark on what is going to be a significant effort. It probably boils down to a trade-off between your time (plus the usual development uncertainties) and your money (with no development uncertainty).

With respect to other DAM software, I can tell you this: from what I've heard, as well as from my own personal experience, Lightroom starts getting very slow above 100k images (FYI, this is on a very fast i7, Win8 box with 64 GB of RAM and several SSDs), so I can't recommend it for an archive of your size. I know that Picasa works fine up to about 300k images, the largest number I've ever tested it with (after that, I split my DB into two), but its record-retrieval features are primitive compared to what you are probably looking for.

I also use a DAM product called "Extensis Portfolio" (personal edition). I've had a love-hate relationship with it for probably 7 years now. In some ways it's wonderful, but in others it's horrible. For example, if the version I own encounters an unknown file type (e.g., a 16-bit-per-channel TIF), it stops dead in its tracks and waits for user input instead of writing the suspect file to a log and continuing the ingest process. Such behavior may be tolerable if you are ingesting 1,000 images, where it might hang a couple of dozen times; it's completely unacceptable if you are attempting to ingest (or re-ingest for the n-th time) a 200k-image archive. It also appears that almost no development has been done on the studio/personal version in many years. Maybe their enterprise-level version has received more attention. Bottom line: you might want to look into the enterprise-level version, but I can't personally recommend it.

HTH,

Tom M
 

Tom Mann
Guru · Messages: 7,223 · Likes: 4,343
To add a bit more to my recommendation to use EXIFtool, here's an excerpt from another thread that illustrates how simple it is to use EXIFtool to build up a CSV file:

(from: http://photo.stackexchange.com/ques...hat-will-export-metadata-for-a-folder-full-of )

Code:
The -csv (comma separated values) option solves this dilemma by pre-extracting information from all input files, then producing a sorted list of available tag names as the first row of the output, and organizing the information into columns for each tag. As well, a first column labelled "SourceFile" is generated. These features make it practical to use the -csv option for extracting all information from multiple images. For example, this command:

       exiftool -csv -r t/images > out.csv

gives an output like this:

       SourceFile,AEBBracketValue,AELock,AFAreaHeight,AFAreaMode,AFAreas,[...]
       t/images/Canon.jpg,0,,151,,,[...]
       t/images/Casio.jpg,,,,,,[...]
       t/images/Nikon.jpg,,,,Single Area,,[...]
       t/images/OlympusE1.jpg,,Off,,,"Center (121,121)-(133,133)",[...]

This will include a very long list of columns, so if you want just a few specific tags, you can limit it:

       exiftool -csv -Model -CreateDate -GPS:all -time:all *.jpg

(in this example, all of the files in the current directory).

The documentation warns that the -csv flag, unlike most exiftool options, builds the entire output in memory, so memory usage can be quite large when run on many files at once. It's probably best to script up something that goes folder-by-folder, which is easily done in even a simple batch language.
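If it helps, the folder-by-folder loop can be as simple as this (a sketch for a Unix-ish shell; the paths are placeholders):

Code:
       mkdir -p csv_out
       # one EXIFtool run per folder keeps the in-memory CSV small
       find /path/to/archive -type d | while read -r d; do
           exiftool -csv "$d" > "csv_out/$(echo "$d" | tr '/' '_').csv"
       done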
 
