Keywords(Tags)/Description for files based on an ID system like MD5

Have a suggestion for "Everything"? Please post it here.
Herkules97
Posts: 71
Joined: Tue Oct 08, 2019 6:42 am

Post by Herkules97 »

Original title: Keywords(Tags)/Description for files based on an unchanging ID system
I looked for existing topics, but nothing seemed to match (or I don't know what to search for), so here is a new post:

I'm lazy when it comes to sorting. So lazy that I haven't even done it for many sources; for my phone's photos, for example, I have done zero sorting. I don't even extract them from the archives I make on the phone (ZArchiver). I copy the archives and they just sit there, and whenever I want one of the photos I skip the task, because it would take hours if not days to find something specific, and the same hours or days again for any future specific thing.

What if EbV's active monitoring could be combined with text attached to IDs, stored only in the db itself so it doesn't modify the files?
Maybe it could actually be a separate document, similar to search history, run history and such.
Or it could be kept in the db but be exportable AND importable, so that, like search history and run history, you can dump it into a new folder and continue using it with a different db. This is mostly useful when updating to a new db version. But it would also let you read it outside EbV; extra export options for external use only could be a thing.
So there would be two kinds of export: one for importing into a different db and one for external use. The difference could be that the first only has the MD5 and the custom text and is less user-friendly, while the second is similar to Index Journal, where the export has whatever columns are in the window it is exported from.
It would benefit from having its own window, like Index Journal, so that you can have columns specific to exporting this data instead of having to mess with the main window's columns or the Index Journal's. Idk about search history; one solution could be to search in the main window and then copy the search into the custom text window. I do something similar for Index Journal because some of them are so big that they lag hard on every letter typed (it could benefit from not searching immediately like the main window does).
Writing externally and copying the text in is much faster for index journals in my experience.
Maybe it has to be imported into the db, unlike the other two, because using it within the db is faster than as its own document. Or give a choice, I don't know, as long as it can be exported and then used by other dbs.
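
To make the idea concrete, here is a rough Python sketch of what I mean by text attached to IDs that can be exported and imported. This is only an illustration under my own assumptions, not how EbV stores anything: the `TagStore` class, its method names and the JSON file layout are all made up for the example.

```python
import hashlib
import json
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class TagStore:
    """Hypothetical store mapping content ID -> free text. Files are only read."""

    def __init__(self):
        self.notes = {}

    def tag(self, path, text):
        # The note is keyed by content, not by path or file name.
        self.notes[file_md5(path)] = text

    def lookup(self, path):
        # Returns None when no note exists for this content.
        return self.notes.get(file_md5(path))

    def export_json(self, target):
        # Minimal "import into another db" format: only ID and custom text.
        payload = json.dumps(self.notes, indent=2, sort_keys=True)
        Path(target).write_text(payload, encoding="utf-8")

    def import_json(self, source):
        # Imported notes overwrite existing ones with the same ID
        # (an arbitrary choice made for this sketch).
        loaded = json.loads(Path(source).read_text(encoding="utf-8"))
        self.notes.update(loaded)
```

A note added with `tag()` for one file is returned by `lookup()` for any renamed or moved copy, and `export_json()` followed by `import_json()` on a fresh store carries the notes over.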

Here is a post from 2015 where you said you wouldn't make content indexing possible because it would be too big. But too big for whom?
With 1.5, content indexing is possible, so something changed.
viewtopic.php?p=12932#p12932

This isn't just to make sure the text stays attached to the files wherever they are moved. It's also to avoid having to modify the original locations.
This is probably more of a me thing: at some point I keep things as they are and don't touch them. But if they could be indexed externally with a system that works like sorting them into folders and renaming them properly, I'd use that instead. Also, sometimes I want to give something a longer name than some programs may support.
It means that if you have multiple copies across devices, every copy will be searchable because they all use the same ID.
If I want 5 copies of the same folder but have only sorted it on one device, I'd have to copy the folder to the 4 other devices and remove the duplicates that are there. The unsorted versions may be in all sorts of paths that I only sorted out on that one device.
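
As a rough Python illustration of why one note would cover every copy: index a folder tree by content ID, then search the notes and get back every path that holds that content. The function names and the `notes` dictionary (ID -> text) are hypothetical, and the whole thing is a sketch under my own assumptions rather than anything EbV does.

```python
import hashlib
import os
from collections import defaultdict


def index_tree(root):
    """Map content ID (MD5 hex) -> every path under root with that content."""
    by_id = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                # Reads the whole file at once; good enough for a sketch.
                file_id = hashlib.md5(handle.read()).hexdigest()
            by_id[file_id].append(path)
    return by_id


def find_by_note(notes, by_id, word):
    """Return sorted paths whose ID has a note containing word (case-insensitive)."""
    hits = []
    for file_id, text in notes.items():
        if word.lower() in text.lower():
            hits.extend(by_id.get(file_id, []))
    return sorted(hits)
```

The unsorted copy in some odd path on another device shows up in the results without anyone having renamed or moved it.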

I don't know which ID system to use. MD5? How likely is it to have 2 different files with same MD5? Maybe MD5+file size can be a second option. MD5+file name might be useless if you've modified one file's name, and wouldn't it just be a worse version of MD5+size?
Even if you have 3 different files with the same MD5, that would be obvious once you found said 3 different files.
But for more automated purposes, if automation is a thing with EbV, finding 3 different files with the same MD5 would be terrible.
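
For the automated case, one possible safeguard (my assumption, not an existing EbV feature) is to build the ID from MD5+size and, whenever several files share an ID, confirm with a byte-by-byte comparison before treating them as the same file. A Python sketch with made-up function names:

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def md5_plus_size(path):
    """Composite ID: MD5 hex digest plus file size in bytes."""
    with open(path, "rb") as handle:
        digest = hashlib.md5(handle.read()).hexdigest()  # whole file in memory
    return f"{digest}:{os.path.getsize(path)}"


def find_conflicts(paths, id_func=md5_plus_size):
    """Return {id: [paths]} for IDs shared by files whose bytes differ."""
    groups = defaultdict(list)
    for path in paths:
        groups[id_func(path)].append(path)
    conflicts = {}
    for file_id, members in groups.items():
        first = members[0]
        # shallow=False forces a real content comparison instead of
        # trusting size and modification time.
        if any(not filecmp.cmp(first, other, shallow=False) for other in members[1:]):
            conflicts[file_id] = members
    return conflicts
```

True duplicates never show up in the result; only IDs that are shared by files with different contents do, so an automated job could stop and ask instead of attaching the wrong text.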
therube
Posts: 4977
Joined: Thu Sep 03, 2009 6:48 pm

Re: Keywords(Tags)/Description for files based on an ID system like MD5

Post by therube »

"How likely is it to have 2 different files with same MD5"
Possible.
Likely, not very.
Depending on one's needs, MD5 may be more than adequate.
(If you have to "guarantee" against collisions, then MD5 would not be sufficient.)
"maybe MD5+file size can be a second option"
Just as MD5 alone has collisions, two different files can have the same MD5 & the same file size too.
Herkules97
Posts: 71
Joined: Tue Oct 08, 2019 6:42 am

Re: Keywords(Tags)/Description for files based on an ID system like MD5

Post by Herkules97 »

therube wrote: Mon Jun 03, 2024 3:01 pm
"How likely is it to have 2 different files with same MD5"
Possible.
Likely, not very.
Depending on one's needs, MD5 may be more than adequate.
(If you have to "guarantee" against collisions, then MD5 would not be sufficient.)

There is also SHA; maybe that's better for it. But MD5 might be more compatible and require no extra effort to translate for use in other programs. I have no idea; I've never used other programs for tagging based on such ID systems.
EbV is really the only one where I see any point in doing it.
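
For what it's worth, switching between MD5 and SHA is a small change in something like Python's standard `hashlib`. The sketch below uses a made-up `content_id` helper that prefixes the ID with the algorithm name, so an exported list would still say which hash it was built with. The prefix format is purely my assumption.

```python
import hashlib


def content_id(path, algorithm="sha256", chunk_size=1 << 20):
    """Return '<algorithm>:<hex digest>' for a file, read in chunks."""
    digest = hashlib.new(algorithm)  # e.g. "md5", "sha1", "sha256"
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return f"{algorithm}:{digest.hexdigest()}"
```

An MD5 digest is 32 hex characters and a SHA-256 digest is 64, so the IDs get longer, but nothing else about the tagging idea changes.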

I think foobar2000 might have some way to tag without modifying files, but most of the time I listen to music there based on datetimes like first and last played, or, for albums, based on the date field with the track number included.
Plus foobar2000 sucks at making data available outside of itself. Probably because every plugin author has to build support for exporting the data into a user-readable format on top of making the plugin do its primary thing inside foobar2000, and doing both can be extra tedious work.

Unless I just don't know how to read foobar2000's data outside of itself without having to program it myself. I tried Python and I'm clearly not big-brained enough for the programming future.