Extremely slow rescan of offsite mapped drives

Discussion related to "Everything" 1.5 Alpha.
heyheyhey
Posts: 4
Joined: Sat Sep 21, 2024 6:47 am

Extremely slow rescan of offsite mapped drives

Post by heyheyhey »

At my workplace, I have installed 1.5a to enable faster searching of our files. I've been testing it on my local machine only, and while it's great, whenever I make a change to the options that would initiate a rescan, it takes multiple hours and is unusable. Could this be due to network bottlenecking? The offsite drives do have noticeable latency (50-150 ms).

I also want to know the reasoning for having the db located in %LOCALAPPDATA% and not with the settings in %APPDATA%. On my personal computer, I have it set to the latter, and there are no issues. Thanks!
void
Developer
Posts: 16745
Joined: Fri Oct 16, 2009 11:31 pm

Re: Extremely slow rescan of offsite mapped drives

Post by void »

whenever I make a change to the options that would initiate a rescan
Any changes to included files/folders/properties/content or excluded files/folders will trigger a rebuild.


it takes multiple hours and is unusable. Could this be due to network bottlenecking?
Very little network bandwidth is required.
Your samba shares are likely rate limited on the server.


I also want to know the reasoning for having the db located in %LOCALAPPDATA% and not with the settings in %APPDATA%. On my personal computer, I have it set to the latter, and there are no issues. Thanks!
%APPDATA% can be stored remotely. This is often the case in domain environments.
%LOCALAPPDATA% is always local.

Ideally, Everything should store the large database locally.
Settings are stored in %APPDATA%\Everything.
heyheyhey
Posts: 4
Joined: Sat Sep 21, 2024 6:47 am

Re: Extremely slow rescan of offsite mapped drives

Post by heyheyhey »

Appreciate the knowledge.
Any changes to included files/folders/properties/content or excluded files/folders will trigger a rebuild.
Rebuild of the database? And the index is located in the database? The drives are using the MAFS filesystem, is this compatible with fast rescan?
Very little network bandwidth is required.
Your samba shares are likely rate limited on the server.
That could be the case. What throughput (in kB/s) would not cause a bottleneck? The db is being built on an NVMe SSD, if that matters. If it's unavoidable, I'll just not touch any settings that cause a rebuild.
%APPDATA% can be stored remotely. This is often the case in domain environments.
%LOCALAPPDATA% is always local.

Ideally, Everything should store the large database locally.
Settings are stored in %APPDATA%\Everything.
Thank you for the Windows knowledge. I'll be moving the db back to the default.




For some advanced settings, is alloc_align=1 good? As I understand, modern processors prefer this.

I've read in another thread that you recommended adding Reparse Tag and Reparse Target to properties to remove them from being added to the database. Does that mean that properties need to be excluded on purpose in the Properties tab? There are many files, but we only really need date accessed/modified and each file's location. It'd be nice to index the file content on the MAFS drives, but that'd be unreasonable given their size.
[Attachment: image.png (18.37 KiB)]
void
Developer
Posts: 16745
Joined: Fri Oct 16, 2009 11:31 pm

Re: Extremely slow rescan of offsite mapped drives

Post by void »

Rebuild of the database?
Yes, a full rebuild is required.
Everything will try to re-use the existing indexes.
However, it's not always possible (e.g., when you include something that was previously excluded).


And the index is located in the database?
The current index is held in a database in memory and is written to disk when you exit Everything.
A reindex will use the current index in memory if possible.


The drives are using the MAFS filesystem, is this compatible with fast rescan?
Fast rescan only applies to rescanning an already indexed folder.
Fast rescan will not improve performance of the initial index.



Fast rescan requires an NTFS-compatible file system.

The best way to test if the file system is compatible is to:

Create a new folder inside another folder.
Check if the date modified timestamp for the parent folder updates to the current time.

To clarify: create folder B inside folder A and check that the date modified timestamp of folder A updates.



If the date modified timestamp updates, then the file system is "Fast rescan" compatible.
Fast rescan will only read folders that have a new date modified timestamp.



Everything cannot reliably test if the folder is "Fast rescan" compatible.
It's up to the user to select this option.
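
The manual test described above (create folder B inside folder A, then check folder A's date modified) could be scripted roughly like this. This is only a minimal sketch, not something Everything itself runs: the function name and the probe folder prefix are made up for illustration, and it assumes you have write access to the folder being tested. On a network share, client-side caching may delay the timestamp you see, so treat a negative result with some caution.

```python
import os
import tempfile
import time


def parent_mtime_updates(parent: str, settle_seconds: float = 1.1) -> bool:
    """Create a temporary subfolder in `parent` and report whether the
    parent's date modified timestamp changed as a result.

    Hypothetical helper for illustration only.
    """
    before = os.stat(parent).st_mtime_ns
    # Wait a little so a coarse timestamp resolution does not hide the change.
    time.sleep(settle_seconds)
    # "Folder B": a uniquely named probe folder inside "folder A".
    child = tempfile.mkdtemp(prefix="fast_rescan_probe_", dir=parent)
    try:
        after = os.stat(parent).st_mtime_ns
    finally:
        # Always clean up the probe folder.
        os.rmdir(child)
    return after != before


if __name__ == "__main__":
    import sys

    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    print("date modified updated:", parent_mtime_updates(folder))
```

If this prints True for a folder on the share, the date modified behaviour matches what Fast rescan relies on. Whether to enable the option is still up to you.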



Re: Network drives 'Fast rescan' explanation about NTFS volumes on Linux share


Very little network bandwidth is required.
Your samba shares are likely rate limited on the server.
That could be the case. What throughput (in kB/s) would not cause a bottleneck? The db is being built on an NVMe SSD, if that matters. If it's unavoidable, I'll just not touch any settings that cause a rebuild.
It will depend on:
  • Your NAS hardware/OS
  • Network speed
  • Disk IO (other writes/reads)


Everything should index your remote network shares at about 3 minutes per 1 million files.

On my old hardware, after a restart of the server, I am seeing less than 3 minutes per 1 million files with constant writes over 10GB ethernet.
Everything network bandwidth is about 4 MB/s
Everything network usage is about 700 MB per 1 million files
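
As a rough sanity check, the two figures above (about 700 MB per 1 million files, at about 4 MB/s) are consistent with the "about 3 minutes per 1 million files" estimate. The sketch below only does that arithmetic; the function name is hypothetical and the defaults are just the numbers quoted in this post, so results on other hardware or over a high-latency link may differ a lot.

```python
def estimate_index_minutes(
    file_count: int,
    mb_per_million_files: float = 700.0,  # figure quoted in the post
    mb_per_second: float = 4.0,           # figure quoted in the post
) -> float:
    """Estimate indexing time in minutes from data volume and throughput."""
    total_mb = file_count / 1_000_000 * mb_per_million_files
    return total_mb / mb_per_second / 60


# 700 MB / 4 MB/s = 175 s, i.e. just under 3 minutes per 1 million files.
print(f"{estimate_index_minutes(1_000_000):.1f} minutes")  # prints "2.9 minutes"
```

If a rebuild takes far longer than this estimate for your file count, that suggests the time is going somewhere other than raw bandwidth.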


Any way to speed up the process?
Check if your NAS is rate limiting SMB.

Please try Everything 1.5
Set up each network folder as multi-threaded (Tools -> Options -> Folders -> For each folder: Select a folder -> Right click -> Advanced -> Threads -> Multiple threads)
-Everything 1.5 will then read all 12 network folders at the same time.

Please try the Everything Server.
Set up the Everything Server as close as possible to your NAS. (ideally on your NAS)

Re: network share indexing speed


For some advanced settings, is alloc_align=1 good? As I understand, modern processors prefer this.
For x86/x64, no.
The extra RAM usage hurts performance more than any speed improvements from the alignment.


I've read in another thread that you recommended adding Reparse Tag and Reparse Target to properties to remove them from being added to the database. Does that mean that properties need to be excluded on purpose in the Properties tab?
Everything doesn't follow reparse points with NTFS volumes.
Everything does follow reparse points with folder indexing (Tools -> Options -> Folders)
Tools -> Options -> Advanced -> follow_reparse_points controls this behavior for folder indexes.

The reparse target property is only the target filename.
Enabling reparse target property indexing will not index files/folders inside reparse points for NTFS volumes.

If you want to index the reparse target property, I recommend only indexing for folders.
While files can have reparse targets, it's common to only want this information for folders (I will make folders only the default in the next update)
heyheyhey
Posts: 4
Joined: Sat Sep 21, 2024 6:47 am

Re: Extremely slow rescan of offsite mapped drives

Post by heyheyhey »

It is fast rescan compatible in that case, as the date modified of the parent folder does change when creating or modifying something within it. Thank you for the tip on multi-threading, and everything else. Reparse will be needed now that I understand it's not extended attributes and that it does help with work efficiency.
Check if your NAS is rate limiting SMB.
That may be a possibility. Once I work out what settings are best for our work, hopefully I can get our IT team to have Everything server set up on the Azure instance directly.

Having metadata caching in Azure, would that change anything? https://techcommunity.microsoft.com/t5/ ... -p/4046390

Opening files on the mapped drive always takes a few seconds, and opening folders can take a few seconds too sometimes.
On my old hardware, after a restart of the server, I am seeing less than 3 minutes per 1 million files with constant writes over 10GB ethernet.
Everything network bandwidth is about 4 MB/s
Everything network usage is about 700 MB per 1 million files
From my local work machine, a speed test site showed about 200 MB/s down, and transferring files between the mapped drive and a local machine runs at no more than 50 MB/s. I don't know how to interpret the difference from your hardware's speed. Can't wait to get back into the office and try out the changes. I am on the latest 1.5 build, for reference.
void
Developer
Posts: 16745
Joined: Fri Oct 16, 2009 11:31 pm

Re: Extremely slow rescan of offsite mapped drives

Post by void »

Having metadata caching in Azure, would that change anything?
Please try metadata caching and please let us know if it helps.

Everything reads network share filenames with the FindFirstFileEx Win32 API (with the large fetch option enabled).
heyheyhey
Posts: 4
Joined: Sat Sep 21, 2024 6:47 am

Re: Extremely slow rescan of offsite mapped drives

Post by heyheyhey »

void wrote: Sun Sep 22, 2024 11:35 pm
Having metadata caching in Azure, would that change anything?
Please try metadata caching and please let us know if it helps.

Everything reads network share filenames with the FindFirstFileEx Win32 API (with the large fetch option enabled).
Metadata caching is not possible; it has not rolled out to our location.

I added the network drives to the Folders section and was able to get it down to 4 hours for a full rebuild this time around. Adding rules that would trigger a rescan did not leave Everything's results empty until it completed. I did have multi-threaded scanning enabled in the advanced options initially, but I also enabled it individually for each of the Folders.

Is it needed to have the network drives in the database? I'm getting duplicate results, such as \\network\share\abc.txt and F:\abc.txt
void
Developer
Posts: 16745
Joined: Fri Oct 16, 2009 11:31 pm

Re: Extremely slow rescan of offsite mapped drives

Post by void »

Is it needed to have the network drives in the database? I'm getting duplicate results, such as \\network\share\abc.txt and F:\abc.txt
It's not needed. Just add either \\network\share or F: (not both).

Remove the other from Tools -> Options -> Folders/network drives.
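
To illustrate why both entries show up: F:\abc.txt and \\network\share\abc.txt are the same file reached through two different roots. A minimal sketch of spotting such pairs in an exported result list is below. The drive-letter mapping is an assumption you would supply yourself (here F: is assumed to map to \\network\share, as in the example above), and the function name is made up for illustration. This is not how Everything handles paths internally.

```python
from pathlib import PureWindowsPath


def to_unc(path: str, drive_map: dict[str, str]) -> str:
    """Rewrite a mapped-drive path to its UNC form using `drive_map`.

    `drive_map` is a user-supplied, hypothetical mapping such as
    {"F:": r"\\network\share"}. Paths on unmapped roots are returned as-is.
    """
    p = PureWindowsPath(path)
    unc_root = drive_map.get(p.drive.upper())
    if unc_root is None:
        return str(p)
    # parts[0] is the drive/root anchor; the rest is the path below it.
    return str(PureWindowsPath(unc_root, *p.parts[1:]))


drive_map = {"F:": r"\\network\share"}  # assumed mapping for illustration
a = to_unc(r"F:\abc.txt", drive_map)
b = to_unc(r"\\network\share\abc.txt", drive_map)
# Windows paths are case-insensitive, so compare in lower case.
print(a.lower() == b.lower())  # prints True
```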