Small database

yellowyellow
Posts: 4
Joined: Mon Jan 30, 2017 12:31 pm

Small database

Post by yellowyellow »

I have 1 million files.

Everything.db is only 6 MB, which works out to about 6 bytes per file on average. How is this possible? 6 bytes is not even enough to store the filename!

Same question about RAM usage: Everything uses 42 MB of RAM for 1 million files, or about 42 bytes per file on average. How is this possible? That seems very low.

As an algorithm enthusiast, I'm curious about how it works.

Cheers
void
Developer
Posts: 16748
Joined: Fri Oct 16, 2009 11:31 pm

Re: Small database

Post by void »

Everything stores filenames sorted by name, and it only stores the difference between each filename and the previous one.

For example:
foobar.mp3
foobar.txt

the database would store:
foobar.mp3
a code meaning "delete the last 3 characters of the previous filename, then append txt"

These differences are then compressed with bzip2.
Typically you will see about 10:1 compression (depending on how unique your filenames are).
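
The scheme described above is commonly called front coding (or incremental encoding). Here is a minimal Python sketch of the idea, not Everything's actual code: the function names `front_encode` and `front_decode`, the tab-separated text serialization, and the third sample filename are all assumptions made for illustration. It stores the length of the prefix shared with the previous name instead of the number of characters to delete, which carries the same information.

```python
import bz2
import os


def front_encode(sorted_names):
    """Encode each name as (length of prefix shared with previous name, new suffix)."""
    encoded = []
    previous = ""
    for name in sorted_names:
        shared = len(os.path.commonprefix([previous, name]))
        encoded.append((shared, name[shared:]))
        previous = name
    return encoded


def front_decode(encoded):
    """Rebuild the original names from (shared_length, suffix) pairs."""
    names = []
    previous = ""
    for shared, suffix in encoded:
        name = previous[:shared] + suffix
        names.append(name)
        previous = name
    return names


# Hypothetical sample data; the first two names come from the example above.
names = sorted(["foobar.mp3", "foobar.txt", "foobaz.txt"])
encoded = front_encode(names)

# Hypothetical on-disk form: one "shared<TAB>suffix" line per file, then bzip2.
raw = "".join(f"{shared}\t{suffix}\n" for shared, suffix in encoded).encode("utf-8")
compressed = bz2.compress(raw)

# The round trip recovers the full, sorted list of names.
restored = front_decode(encoded)
```

For "foobar.txt" the encoder emits `(7, "txt")`: keep the first 7 characters of "foobar.mp3" (that is, delete its last 3) and append "txt". On a large, sorted list of real filenames the suffixes are short, which is what gives bzip2 a small input to work on.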

I hope to improve Everything's memory usage in future versions by using similar compression while Everything is running.
yellowyellow
Posts: 4
Joined: Mon Jan 30, 2017 12:31 pm

Re: Small database

Post by yellowyellow »

Cool stuff here!

Something I still don't understand: when nothing or * is entered in the search field, the ListView GUI element has 1 million rows, each with a name column (maybe 20 char on average) and a path column (60 char on average), i.e. (60 + 20) × 1 million ≈ 80 MB to store.
But Everything takes 42 MB in RAM :)

This cannot be compressed because it's a rendered ListView GUI element, and I don't think WinAPI has ListView with "compressed in RAM" content.

Looks like magic!
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Small database

Post by NotNull »

yellowyellow wrote: Cool stuff here!

a path column (60 char on average)
You don't have to store the complete path; just the filename (or folder name) with its ID and the ID of its parent folder. That way you can "build" the complete path while keeping the database small.

(I don't know if this is how Everything operates, but it is one way to explain the database size.)
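
To make the parent-ID idea concrete, here is a small Python sketch. It is only an illustration of the approach described above, under the same caveat that Everything may not work this way; the `Entry` type, the `full_path` helper, and the sample folder names are all hypothetical.

```python
from typing import Dict, NamedTuple, Optional


class Entry(NamedTuple):
    name: str                 # file or folder name only, never the full path
    parent_id: Optional[int]  # ID of the containing folder, None for a root


# Hypothetical table: each folder name is stored once, however many files it holds.
entries: Dict[int, Entry] = {
    1: Entry("C:", None),
    2: Entry("Music", 1),
    3: Entry("Albums", 2),
    4: Entry("foobar.mp3", 3),
    5: Entry("foobar.txt", 3),
}


def full_path(entry_id: int, table: Dict[int, Entry]) -> str:
    """Walk the parent links up to the root and join the names on the way back."""
    parts = []
    current: Optional[int] = entry_id
    while current is not None:
        entry = table[current]
        parts.append(entry.name)
        current = entry.parent_id
    return "\\".join(reversed(parts))


path = full_path(4, entries)
```

With this layout the path `C:\Music\Albums` costs three short names plus one parent ID per file, instead of being repeated in full for every file in the folder. The path string for a row only needs to be built when that row is actually displayed.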