DUPE: - Call for Experimentation

Discussion related to "Everything" 1.5 Alpha.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Post by ChrisGreaves »

Win7 Everything 1.5.0.1333a(x64)

I am learning about the function Dupe:

I have 351,386 objects on my Win7 system, a boot partition C: and a data partition T:.
I use Dupe: and am told that 183,281 objects are collected.
That is, about one half of my HDD objects are duplicate objects!
[Image: Dupe01.png]
Windows is partly to blame.
[Image: Dupe02.png]
Chris Greaves is partly to blame. I am in the habit of splitting multi-track Greatest Hits albums into separate tracks:-

When I remove my data partition and index only the boot partition C:, I find 78,100 duplicate objects, which must be the responsibility of Windows and application programs (and browser files in %appdata%, and …).
My machine is responsible for duplicates amounting to about 22% of all my objects.
When I index only my data partition (T:) I find 73,940 duplicate objects.
I myself am responsible for duplicates amounting to about 21% of all my objects.

I am not surprised that the union of both drives presents more than 22% + 21% duplicates (a name that occurs once on each drive only becomes a duplicate when both drives are indexed together), but I am surprised that 21% and 22% leap to 52% when the union is between system developers and a user.
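The jump can be illustrated with a small sketch. It is only a toy model: the file names below are made up, and the case-insensitive comparison is an assumption about how a name-based duplicate search behaves, not a statement of how "Everything" is implemented. The three real counts from this post are used only for the percentage arithmetic.

```python
from collections import Counter

def count_name_dupes(names):
    """Count entries whose name occurs more than once.

    Assumption: names are compared case-insensitively.
    """
    counts = Counter(n.lower() for n in names)
    return sum(c for c in counts.values() if c > 1)

# Hypothetical toy listings, not real data from either partition.
c_drive = ["readme.txt", "readme.txt", "desktop.ini", "setup.exe"]
t_drive = ["cover.jpg", "cover.jpg", "desktop.ini", "notes.txt"]

dupes_c = count_name_dupes(c_drive)
dupes_t = count_name_dupes(t_drive)
# desktop.ini is unique on each drive, but duplicated once both are combined.
dupes_both = count_name_dupes(c_drive + t_drive)
print(dupes_c, dupes_t, dupes_both)

# Percentages of all 351,386 indexed objects, using the counts quoted above.
total = 351_386
for label, dupes in [("C: only", 78_100), ("T: only", 73_940), ("C: + T:", 183_281)]:
    print(f"{label}: {100 * dupes / total:.1f}%")
```

In the toy listings each drive contributes two duplicates on its own, yet the combined list contains more than four, because names shared across the two drives only show up as duplicates in the union.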

I did not expect that 52% of my objects are duplicates. I expected to see about ten percent, perhaps fifteen percent.
[Image: Dupe03.png]
I’d love to hear back from other users who maintain separate boot and data partitions.
Thanks, Chris
therube
Posts: 4985
Joined: Thu Sep 03, 2009 6:48 pm

Re: DUPE: - Call for Experimentation

Post by therube »

Do note, that unless you say otherwise, you are dup'ing Name.
So only the file name is being taken into consideration.
And in that respect, why would it be odd to have a lot of file name duplications?

My E: drive (which is almost exclusively Windows) has 73,046 dup'd file names (so virtually the same as you came up with).


Note you can turn on color highlighting, so dup'd items are color banded together.


Here I am dup'ing size:
[Image: Everything size dup.png]

Also note that this is not identifying content duplicates (which can also be done).
(Why in the world is MBAM updating rules. I don't even use it & ought to turn it off or uninstall it.)
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: DUPE: - Call for Experimentation

Post by ChrisGreaves »

therube wrote: Fri Feb 03, 2023 5:03 pm Do note, that unless you say otherwise, you are dup'ing Name.
Thanks therube. I got as far as a simple dupe this morning and am this afternoon wading into complexities.
So only the file name is being taken into consideration. And in that respect, why would it be odd to have a lot of file name duplications?
I had not sat down to think about the proportion of Name duplication in Windows (or any other data partition for that matter). I was surprised to find it at roughly a quarter. Almost 25% of the Windows files have a duplicated name. Off the cuff I would have guessed at 10%, maximum, but again, I had not sat down to think about it. Ever.
My E: drive (which is almost exclusively Windows) has 73,046 dup'd file names (so virtually the same as you came up with).
Reassuring!
Here I am dup'ing size: Everything size dup.png
I have been building up my filters one term at a time and inspecting the count of objects in the result list at each stage, thus:

*.mp3  
*.mp3 t:\  
*.mp3 t:\ dupe:  
*.mp3 t:\ dupe: name:  
*.mp3 t:\ dupe: name: size: 
Where successive lines show all mp3, then only on my data partition, duplicates, by name, by size.
The last three lines give identical results (4,743 objects). It is possible that I do not have any same-named .mp3 files that differ in size, hence no change. That said, it seems a remote chance that out of 24,091 mp3 objects on my data partition (*.mp3 t:\ ) NOT ONE has the same name but a different size.
My grounds for this are that I often delete applause from tracks (especially the extended three-minute applause between Acts of Wagnerian proportion), and I find it hard to guarantee that I don't have two identical tracks, downloaded on separate days, where one has been shortened by 3 minutes and the other has not yet attracted my attention.
Also note that this is not identifying content duplicates (which can also be done).
Content Duplicates is a separate issue of great interest to me. A simple example is deletion of applause; a straight binary match shows that the contents are different, but a careful match of Frames (elements of an MP3 file) reveals that they are essentially the same track.
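As a rough sketch of what a "straight binary match" means, the Python below groups files whose bytes are identical by comparing sizes first and SHA-256 digests second. This is an illustration only, not how "Everything" identifies content duplicates; the function names `file_digest` and `find_content_dupes` are hypothetical. A track with the applause trimmed would have different bytes and so would NOT be grouped with its original here; a frame-level MP3 comparison is a separate problem that this sketch does not attempt.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_content_dupes(paths):
    """Group paths whose contents are byte-for-byte identical.

    Files are bucketed by size first, so only files that share a size
    with at least one other file are actually hashed.
    """
    by_size = defaultdict(list)
    for p in paths:
        by_size[Path(p).stat().st_size].append(p)

    by_hash = defaultdict(list)
    for size, group in by_size.items():
        if len(group) < 2:
            continue  # a unique size cannot have a binary duplicate
        for p in group:
            by_hash[(size, file_digest(p))].append(p)

    return [g for g in by_hash.values() if len(g) > 1]
```

Calling `find_content_dupes` on a list of paths returns a list of groups, each group being two or more paths with identical contents.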

In the end, for me, a duplicate track is one that appears far too often in my rotation, such as the four versions of "City of New Orleans":-
[Image: Untitled.png]
Obviously I had a crush on that song just over two years ago (grin!)
(Why in the world is MBAM updating rules. I don't even use it & ought to turn it off or uninstall it.)
MBAM?

Thanks for the information.
Cheers, Chris
therube
Posts: 4985
Joined: Thu Sep 03, 2009 6:48 pm

Re: DUPE: - Call for Experimentation

Post by therube »

Take the spaces out.

dupe:name
dupe:size

dupe:<property-list>


Malwarebytes Anti-Malware.


Note that in 1.5a, you can also right-click a column header & get a dupe of that column (as filtered by the existing search).
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: DUPE: - Call for Experimentation

Post by ChrisGreaves »

therube wrote: Fri Feb 03, 2023 8:16 pm Take the spaces out.
[Image: Dupe08.png]
Thanks therube.
I think this is what I was aiming for.
I have a result list of audio tracks which are suspiciously named AND SIZED alike.

*.mp3 t:\ dupe:name;size
When I add Path to the result list I can start fixing problems, e.g. "What is Doris Day doing in Ringtones?"
[Image: Dupe09.png]
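For anyone wondering why matching on name AND size narrows the list compared with name alone, here is a minimal sketch of grouping by a list of properties. It only models the idea; the records, paths and the `dupes_by` helper are hypothetical and are not taken from "Everything" itself.

```python
from collections import defaultdict

def dupes_by(records, keys):
    """Return groups of records that share the same values for all given keys."""
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[k] for k in keys)].append(r)
    return [g for g in groups.values() if len(g) > 1]

# Hypothetical tracks: three share a name, but only two of those share a size.
tracks = [
    {"name": "track01.mp3", "size": 4_100_000, "path": r"T:\Music\A"},
    {"name": "track01.mp3", "size": 4_100_000, "path": r"T:\Ringtones"},
    {"name": "track01.mp3", "size": 3_900_000, "path": r"T:\Music\B"},
    {"name": "other.mp3", "size": 4_100_000, "path": r"T:\Music\C"},
]

by_name = dupes_by(tracks, ["name"])               # all three track01.mp3 entries
by_name_size = dupes_by(tracks, ["name", "size"])  # only the two same-sized copies

for group in by_name_size:
    for r in group:
        print(r["path"], r["name"], r["size"])
```

The shortened copy (same name, smaller size) drops out of the second result, which is the behaviour I would expect when applause has been trimmed from one copy of a track.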
:thumbsup: Chris
void
Developer
Posts: 16773
Joined: Fri Oct 16, 2009 11:31 pm

Re: DUPE: - Call for Experimentation

Post by void »

Added help for finding duplicates.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: DUPE: - Call for Experimentation

Post by ChrisGreaves »

void wrote: Mon Feb 06, 2023 4:46 am Added help for finding duplicates.
:thumbsup:
Chris