I think I already asked this, it's an old problem I keep dragging around:

Say I have a bunch of files stored in a directory tree (they are HDF5, but the format is not the most important thing here). They come from a monitoring system which spits out data + metadata regularly.

I want to be able to search for files based on their metadata. And I kind of hate databases like SQL and such.

I was thinking of using a basic method: get all the metadata from all the files and put it in a single central reference file.
For that I could use PyTables, which would itself produce an HDF5 file containing all the metadata, with the possibility of searching in that file (since PyTables supports something that looks like DB queries).
I'm not really fond of that solution because it relies quite heavily on Python, and Python is not stable IMO, but it seems like a reasonable compromise.
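
Roughly, the central-index idea could look like the sketch below. It is only an illustration under assumptions: the metadata fields `sensor` and `timestamp`, the `.h5` extension, the idea that the metadata lives in the root attributes of each file, and the helper names `build_index` and `search` are all made up here, not taken from the real monitoring system.

```python
import os

import tables


class FileRecord(tables.IsDescription):
    # Hypothetical metadata fields; the real ones depend on the monitoring system.
    path = tables.StringCol(256)      # location of the data file (ASCII assumed)
    sensor = tables.StringCol(32)     # assumed string attribute on the root group
    timestamp = tables.Float64Col()   # assumed numeric attribute on the root group


def build_index(data_root, index_path):
    """Walk data_root and copy each file's root attributes into one table."""
    index_abs = os.path.abspath(index_path)
    with tables.open_file(index_path, mode="w") as index:
        table = index.create_table("/", "files", FileRecord)
        row = table.row
        for dirpath, _dirnames, filenames in os.walk(data_root):
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                # Skip non-HDF5 files and the index itself if it lives in the tree.
                if not name.endswith(".h5") or os.path.abspath(full) == index_abs:
                    continue
                with tables.open_file(full, mode="r") as src:
                    attrs = src.root._v_attrs
                    row["path"] = full
                    row["sensor"] = attrs.sensor
                    row["timestamp"] = attrs.timestamp
                row.append()
        table.flush()


def search(index_path, condition):
    """Return the paths of all indexed files whose metadata match condition."""
    with tables.open_file(index_path, mode="r") as index:
        return [r["path"].decode() for r in index.root.files.where(condition)]
```

A query would then be something like `search("index.h5", '(sensor == b"temp") & (timestamp > 100.0)')`. String columns have to be compared against bytes literals in the condition. The index has to be rebuilt (or appended to) whenever new files arrive, which this sketch does not handle.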

Any thoughts on this? I may be missing something very basic, like "stop hating DBs" for example.
Of course, I would also be interested in any existing piece of software that can manage a bunch of files based on their metadata (I haven't found anything "at my level" yet).

#files #DB #database #question #python #hdf5 #data #computing
