Everything ignores files with over 127 UTF-8 characters

shmidman
Posts: 13
Joined: Wed Jun 15, 2011 10:08 pm

Everything ignores files with over 127 UTF-8 characters

Post by shmidman »

I just realized that Everything has been ignoring some of my files. Upon inspection, I found that the problem affects relatively long filenames made up of non-English characters. As we know, NTFS allows filenames of up to 255 characters or so, where each character can be any valid Unicode character. However, I found that Everything stops handling such filenames when they hit 128 characters.
For instance, I created a file whose name consisted of 127 Hebrew characters. This file was listed in Everything. I then renamed the file, adding a single extra Hebrew character to the filename. The file then disappeared from Everything's window, and Everything's total count of files went down by one.
It would seem, perhaps, that Everything has a buffer of 255 bytes rather than 255 characters. Each Hebrew character takes two bytes in UTF-8, so 127 of them need 254 bytes and still fit, while the 128th brings the name to 256 bytes and Everything runs out of room.
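To make the arithmetic concrete, here is a minimal Python sketch of the guess above. It is not Everything's code: `MAX_NAME_BYTES`, `utf8_len`, and `fits` are hypothetical names, and the 255-byte limit is only the assumption being tested.

```python
# Hypothetical limit: the 255-byte buffer guessed at in this post.
MAX_NAME_BYTES = 255


def utf8_len(name: str) -> int:
    """Return the number of bytes the name occupies when encoded as UTF-8."""
    return len(name.encode("utf-8"))


def fits(name: str) -> bool:
    """Return True if the UTF-8 encoded name fits within the assumed limit."""
    return utf8_len(name) <= MAX_NAME_BYTES


# The Hebrew letter alef (U+05D0) takes two bytes in UTF-8.
name_127 = "\u05d0" * 127
name_128 = "\u05d0" * 128

print(utf8_len(name_127), fits(name_127))  # 254 True
print(utf8_len(name_128), fits(name_128))  # 256 False
```

A name of 255 plain ASCII letters would still pass the same check, which matches the observation that only non-English filenames go missing.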
void
Developer
Posts: 16745
Joined: Fri Oct 16, 2009 11:31 pm

Re: Everything ignores files with over 127 UTF-8 characters

Post by void »

This is a known issue.

Filenames in Everything are limited to 255 bytes.
That works out to about 85 - 255 characters, depending on how many bytes each character takes when encoded as UTF-8.

This limitation has been removed for the next release of "Everything".
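The 85 - 255 range follows from dividing the 255-byte limit by the number of bytes per character. A rough Python sketch, assuming every character in the name has the same width (`MAX_NAME_BYTES` and `max_chars` are hypothetical names, not anything from Everything itself):

```python
# Byte limit stated in this post.
MAX_NAME_BYTES = 255


def max_chars(sample_char: str) -> int:
    """Longest name, in characters, built only from characters as wide as sample_char."""
    width = len(sample_char.encode("utf-8"))
    return MAX_NAME_BYTES // width


# One-byte (ASCII), two-byte (Hebrew), and three-byte (Japanese kana) samples.
for label, ch in [("ASCII", "a"), ("Hebrew", "\u05d0"), ("Kana", "\u3042")]:
    print(label, len(ch.encode("utf-8")), max_chars(ch))
# ASCII 1 255
# Hebrew 2 127
# Kana 3 85
```

Names that mix character widths land somewhere between these figures.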
shmidman
Posts: 13
Joined: Wed Jun 15, 2011 10:08 pm

Re: Everything ignores files with over 127 UTF-8 characters

Post by shmidman »

Thanks for the update. Is there a beta available that incorporates the fix?
void
Developer
Posts: 16745
Joined: Fri Oct 16, 2009 11:31 pm

Re: Everything ignores files with over 127 UTF-8 characters

Post by void »

Unfortunately, no. There is no update yet.
shmidman
Posts: 13
Joined: Wed Jun 15, 2011 10:08 pm

Re: Everything ignores files with over 127 UTF-8 characters

Post by shmidman »

Hi void,

It's been almost a year since this bug was last discussed... Would you please consider releasing the Everything code as open source so that we can fix the bug?

For those of us who use non-English characters in filenames, this UTF-8 filename limitation is absolutely debilitating, since it means that we can never really know whether the results are exhaustive or not... I imagine that the fix is not much more than a #define constant somewhere in the code, so if you'd release the source, we could fix that and recompile.
shmidman
Posts: 13
Joined: Wed Jun 15, 2011 10:08 pm

Re: Everything ignores files with over 127 UTF-8 characters

Post by shmidman »

The new version fixes this completely.
THANK YOU SO MUCH!!!!