Duplicate names in a single folder, ignore whitespace

Discussion related to "Everything" 1.5 Alpha.
ChrisGreaves
Posts: 688
Joined: Wed Jan 05, 2022 9:29 pm

Duplicate names in a single folder, ignore whitespace

Post by ChrisGreaves »

A month ago I studied duplicates. Now I am trying to ignore white space.
Dupe_11.png (68.79 KiB) Viewed 1713 times
This is my search screen showing two files that my brain "sees" as duplicates. The size and date match exactly. I suspect that I have issued a File, saveAs to introduce spaces into the name, but have forgotten to delete the squeezed name.
Dupe_12.png (19.08 KiB) Viewed 1713 times
And here I am trying to isolate culprits.
I have "Ignore white space" selected in the Search menu but the Result List is empty.
I have tried selecting the Name column and choosing "Find name Duplicates".
I tried putting "Dupe:" at the start of the search string and then at the end of the search string.

I had great success a month ago with "Dupe:", so I suspect that I am somehow mis-using the "Ignore white space", but that seems too simple to get wrong?

Thanks for any clues, guidance etc.
Chris
NotNull
Posts: 5468
Joined: Wed May 24, 2017 9:22 pm

Re: Duplicate names in a single folder, ignore whitespace

Post by NotNull »

I suspect that this will be a very tough one to accomplish. Will think about it ...
ChrisGreaves
Posts: 688
Joined: Wed Jan 05, 2022 9:29 pm

Re: Duplicate names in a single folder, ignore whitespace

Post by ChrisGreaves »

NotNull wrote: Fri Mar 24, 2023 10:28 pm I suspect that this will be a very tough one to accomplish. Will think about it ...
NotNull I greatly respect your advice and knowledge, but are you pulling my leg here?

I had expected "ignore white space" to be a bit of a no-brainer as far as the executable code went. That's why, in this example more than any other, I thought that I had erred.
I mean, eliminating white space would be about the most common thing to do when comparing string data. No?

(signed) "eagerly awaiting your thoughts" of Bonavista
NotNull
Posts: 5468
Joined: Wed May 24, 2017 9:22 pm

Re: Duplicate names in a single folder, ignore whitespace

Post by NotNull »

ChrisGreaves wrote: Fri Mar 24, 2023 10:34 pm but are you pulling my leg here?
I wouldn't dare! (although .. it is almost April 1st ..)

Seriously: "ignore white space" works on the search itself, so searching for abc will also find "a b c".
If no specific filename(pattern) search text is given, Everything will see "this is a file.txt" and "thisisafile.txt" as different filenames, even when "ignore white space" is enabled.

I guess some ugly regular expression is needed to be able to compare names with and without spaces.
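
The distinction can be sketched in Python. This is only an illustration of the idea, not how Everything is implemented; the helper names `strip_spaces` and `matches_ignoring_whitespace` are made up for the example.

```python
import re

def strip_spaces(text: str) -> str:
    """Remove every whitespace character from text."""
    return re.sub(r"\s+", "", text)

def matches_ignoring_whitespace(search: str, filename: str) -> bool:
    """Hypothetical model of a search that ignores white space:
    the search term is looked for in the filename after spaces
    have been removed from both."""
    return strip_spaces(search).lower() in strip_spaces(filename).lower()

# The search term finds both spellings ...
print(matches_ignoring_whitespace("abc", "a b c"))
print(matches_ignoring_whitespace("abc", "abc"))

# ... but with no search term, the two filenames are simply
# different strings unless something squeezes them first.
name_a = "this is a file.txt"
name_b = "thisisafile.txt"
print(name_a == name_b)
print(strip_spaces(name_a) == strip_spaces(name_b))
```

The last comparison is the step that a duplicate search would need and that the search option alone does not provide.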
ChrisGreaves
Posts: 688
Joined: Wed Jan 05, 2022 9:29 pm

Re: Duplicate names in a single folder, ignore whitespace

Post by ChrisGreaves »

NotNull wrote: Fri Mar 24, 2023 11:42 pm Seriously: "ignore white space" works on the search itself, so searching for abc will also find "a b c". If no specific filename(pattern) search text is given, Everything will see "this is a file.txt" and "thisisafile.txt" as different filenames, even when "ignore white space" is enabled.
Thank you NotNull. This is progress.

It appears to be NOT user error!

I know nothing about the program code of Everything, and have no desire to go there. But from a pure programming perspective there must be a slave/utility procedure somewhere, something like this (in VBA):

Code:

strSqueezeWhiteSpace(strIncoming, strcWhiteSpaceDefinitions) as String
Likewise I anticipate some equivalent of a string array that holds the Names, Paths, Date, Size and so on.

So I imagine that the routine that assembles that array (or those arrays) could, at the time it is loading a name into the array, apply strSqueezeWhiteSpace to the string. (OK, two arrays: one for OriginalNames and a matching array for SqueezedNames.)
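
A minimal Python sketch of that two-array idea, for illustration only. The function name mirrors strSqueezeWhiteSpace; the sample names and the default definition of white space (space and tab) are assumptions, and nothing here reflects Everything's actual internals.

```python
def squeeze_white_space(incoming: str, white_space: str = " \t") -> str:
    """Return incoming with every character listed in white_space removed."""
    return "".join(ch for ch in incoming if ch not in white_space)

# Made-up sample data standing in for the Names array.
original_names = ["My Report.doc", "MyReport.doc", "Notes.txt"]

# The matching array: one squeezed name per original name, same index.
squeezed_names = [squeeze_white_space(name) for name in original_names]

for original, squeezed in zip(original_names, squeezed_names):
    print(f"{original!r} -> {squeezed!r}")
```

Comparisons for duplicates would then run against `squeezed_names`, while the display would still use `original_names`.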

Below: In general when a user looks for duplicates the definition is not in the program, but in the mind of the user. To that end I would suggest that Dupe: allows a variety of processes to stipulate, for many columns, what is, to the user's mind, a duplicate.
Size: equal when rounded to the nearest hundred bytes (=ROUND(LONG, -2), I think)
Date: equal when rounded to the nearest hour
Type: equal if maps to the same list of types (Picture, Audio, Document ...)
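
As a rough Python sketch of what such user-defined rules could look like: each column gets its own "key" function, and two files count as duplicates when their keys match. The function names and the extension-to-type table are hypothetical, chosen only to mirror the three examples above.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from file extension to a broad type.
TYPE_GROUPS = {
    ".jpg": "Picture", ".png": "Picture",
    ".mp3": "Audio", ".wav": "Audio",
    ".doc": "Document", ".txt": "Document",
}

def size_key(size_bytes: int) -> int:
    """Round a non-negative size to the nearest hundred bytes (halves round up)."""
    return ((size_bytes + 50) // 100) * 100

def date_key(modified: datetime) -> datetime:
    """Round a timestamp to the nearest hour (30 minutes rounds up)."""
    shifted = modified + timedelta(minutes=30)
    return shifted.replace(minute=0, second=0, microsecond=0)

def type_key(extension: str) -> str:
    """Map an extension to its broad type, or 'Other' if it is not listed."""
    return TYPE_GROUPS.get(extension.lower(), "Other")

def dupe_key(size_bytes: int, modified: datetime, extension: str) -> tuple:
    """Combine the per-column keys into one comparison key."""
    return (size_key(size_bytes), date_key(modified), type_key(extension))

a = dupe_key(1249, datetime(2023, 3, 25, 10, 29), ".JPG")
b = dupe_key(1201, datetime(2023, 3, 25, 9, 45), ".png")
print(a == b)
```

Both sample files round to 1200 bytes and 10:00 and map to "Picture", so under these rules they compare as duplicates.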

User: No matter how powerful a Dupe function is, it will be of little use to the user if it does not locate duplicates according to the user's mind. That does NOT mean that Everything should accommodate every single user's day-to-day wishes - there must be a cutoff point.

User: When the average user sees "Ignore White Space" as a setting, the user assumes that the program will ignore white space until the setting is turned off. (I think this describes me). There is an understanding that setting ON "Ignore White Space" in the menu was a global edict to Everything to ignore white space in searches, whether the searches be for audio, pictures, documents, duplicates, ...

I see that Mike_PB has something on the go that might be useful
NotNull wrote: I guess some ugly regular expression is needed to be able to compare with/ without spaces.
Could be true. I have dabbled in regex from time to time, but am ill-equipped to fabricate a complex regex.

I can leave my example for the time being and get on with something else. But I'd be interested to learn that "Ignore White Space" could be applied globally once it was set!

Cheers, Chris
Phlashman
Posts: 41
Joined: Sun Sep 11, 2022 4:57 am

Re: Duplicate names in a single folder, ignore whitespace

Post by Phlashman »

You could try adding a column using column functions. Add the following to your search

column1:=substitute($name:," ","") addcolumn:column1

This will add a column of the file name "ignoring" spaces. You could then DUPE for that column?
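
For anyone who wants to see the idea outside Everything, here is a small Python sketch of the same approach: compute a space-free version of each name, then group on it. The paths are made up, and grouping per folder is an assumption taken from the thread title rather than something the formula above does by itself.

```python
from collections import defaultdict
from pathlib import PureWindowsPath

# Made-up sample paths.
paths = [
    r"C:\Docs\My Report.doc",
    r"C:\Docs\MyReport.doc",
    r"C:\Docs\Notes.txt",
    r"C:\Other\MyReport.doc",
]

groups = defaultdict(list)
for raw in paths:
    p = PureWindowsPath(raw)
    column1 = p.name.replace(" ", "")  # the name with spaces removed
    # Group by (folder, squeezed name) so only same-folder files collide.
    groups[(str(p.parent), column1)].append(raw)

# Keep only the groups that contain more than one file.
duplicates = {key: files for key, files in groups.items() if len(files) > 1}

for (folder, squeezed), files in duplicates.items():
    print(folder, squeezed, files)
```

Only the two files in `C:\Docs` are reported; the copy in `C:\Other` sits in a different folder and so forms a group of one.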
ChrisGreaves
Posts: 688
Joined: Wed Jan 05, 2022 9:29 pm

Re: Duplicate names in a single folder, ignore whitespace

Post by ChrisGreaves »

Phlashman wrote: Sat Mar 25, 2023 10:13 pm You could try adding a column using column functions. Add the following to your search; column1:=substitute($name:," ","") addcolumn:column1; This will add a column of the file name "ignoring" spaces. You could then DUPE for that column?
Thank you Phlashman. Brilliant! (mainly because I had not yet put my toe into the water in that pond :D )
ColumnFormulas_01.png (126.45 KiB) Viewed 1559 times
I did try adding "Dupe:" at first at the left-hand end, then at the right-hand end of my search string, and neither worked.
But selecting your "Column 1" and from the menu choosing (right-click) "Find Column 1 Duplicates" gave me exactly what I wanted.

So thanks for the introduction to Column Functions.
My plan now is to pour another coffee, and use my new trick on my entire data partition, then start studying Column functions.

Thanks again, Chris
ChrisGreaves
Posts: 688
Joined: Wed Jan 05, 2022 9:29 pm

Re: Duplicate names in a single folder, ignore whitespace

Post by ChrisGreaves »

ChrisGreaves wrote: Sun Mar 26, 2023 12:03 pm... use my new trick on my entire data partition,
ColumnFormulas_02.png (76.43 KiB) Viewed 1556 times
3,702 objects, which does not mean that I have duplicates of 1,851 documents, since some of the entries are for triplicates and quadruplicates(?) and the like.
That is, once I learned that choosing (right-click) "Find Column 1 Duplicates" was a necessary step!
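
The counting point can be shown with a tiny Python sketch. The names below are invented sample data, not taken from the real result list: the number of listed objects divided by two only equals the number of duplicated documents when every group is an exact pair.

```python
from collections import Counter

# Invented sample: one pair, one triplicate, one quadruplicate.
squeezed_names = [
    "a.doc", "a.doc",
    "b.doc", "b.doc", "b.doc",
    "c.doc", "c.doc", "c.doc", "c.doc",
]

group_sizes = Counter(squeezed_names)

objects_listed = sum(group_sizes.values())  # rows in the duplicate list
distinct_documents = len(group_sizes)       # documents that have duplicates
naive_estimate = objects_listed // 2        # assumes every group is a pair

print(objects_listed, distinct_documents, naive_estimate)
```

Here nine listed objects belong to only three distinct documents, while halving the count would suggest four.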

Many many thanks
Chris