Problem indexing content
Hi, this is the first time I have tried to search indexed file content, and I have a problem with that feature.
I don't understand how Everything decides which files to index for content.
e.g.
Settings:
Include only folder: X:\RFQ;
Include only files: *.txt
But Everything is indexing files from other (all) locations, as you can see in the attachment. Why?
Attachment: indexing.png (19.38 KiB)
Re: Problem indexing content
This tooltip also shows properties being indexed.
Are you indexing any properties under Tools -> Options -> Properties?
Re: Problem indexing content
I have some sort of similar problem using content.
Every time I open SE and type something like content:"text to search", the "Indexing properties" process starts right away, and it takes forever to complete: 2-3 hours or more!
I admit that I have a few GB of data, but I limit my search to a specific path and folder. For instance, I have Tools\Options\Content set up like this:
Under Content:
Index file content. Checked.
Include only folders: D:\MyHardDrive\Folder2Index
Exclude folders: Empty
Include only files: *.doc;*.docx;*.pdf
Exclude files: Many extensions here.
Exclude not content indexed. Unchecked.
Exclude recall on data access: Checked.
Max size: 20 Mb.
I thought the indexed properties would be stored "somewhere" so that SE doesn't re-index every time it starts, but that's not the case. So my computer slows down each time I start SE.
Any solutions?
Re: Problem indexing content
Check Tools -> Debug -> Statistics -> Build -> Last rebuild reason.
Check Tools -> Debug -> Statistics -> Build -> Last build date.
Check Tools -> Debug -> Statistics -> Save -> Last save date.
What is shown for these values?
Re: Problem indexing content
Under Build:
Last build date: 17-Sep-24 08:58
Last rebuild reason: invalid parent folder index 1710478 >= 726678 + 8
Under Save:
Last save date: 18-Sep-24 04:22
I stopped the indexing process yesterday and put my computer into Sleep mode before waking it early this morning, so those values may be wrong. I was wondering about the "Last rebuild reason": what does "invalid parent folder" mean, and what are those numbers?
How can I do a full rebuild and keep that database, so SE doesn't have to redo it each time it starts?
Re: Problem indexing content
Quote: Last rebuild reason: invalid parent folder index 1710478 >= 726678 + 8
This is database corruption.
Could you please send your Help -> Troubleshooting information to support@voidtools.com
Could you please send your Tools -> Debug -> Statistics to support@voidtools.com
I would like to do some tests on my end.
Everything saves your database to disk on exit. (File -> Exit or shutdown Windows)
Everything saves your database to disk daily. (at 4am, or the next time a UI window is closed)
Something is going wrong with updating your database in memory.
Please try exiting Everything (File -> Exit) and restarting Everything.
Does Everything load immediately?
Re: Problem indexing content
My normal daily routine is to use the Windows Sleep function at the end of the day, which means that some programs stay open but asleep. About once a week, I shut Windows down completely to get a clean restart.
As a translator, I usually keep 2 or 3 Word files open, some specialized dictionaries, xChange PDF viewer, Firefox, the Thunderbird email client, and also Search Everything, so that all of them are readily available when I start in the morning. Otherwise, my (rather old) computer takes a very long time to start: 15-20 minutes, including loading the above apps.
My system is Windows 10 (latest updates) with 16 GB RAM and no SSD (unfortunately!)
When Everything is loaded, it usually sits in the taskbar, hidden on the right side. If I close it, I noticed that it still resides in "memory" or as a service, from what I can see in Task Manager. So even though I close Everything, I need to use "End task" in Task Manager to really shut it down, and only then am I able to open it again.
I'm sending you the Troubleshooting and Statistics file separately.
Thank you for your time to look into this. It's appreciated!
Re: Problem indexing content
Thank you for the troubleshooting information and statistics.
I found the issue by running Everything with your config from the troubleshooting information.
There is an issue with saving/loading the include-only-files setting when you have over 128 filters.
The issue will be fixed in the next alpha update.
I will have an update soon.
For now, please reduce the number of Tools -> Options -> Exclude -> include-only-file filters to 127 or less.
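The workaround above (at most 127 include-only-file filters) can be checked mechanically before pasting a long list into the options dialog. The following is a minimal Python sketch, assuming the filter list is a semicolon-delimited string like the *.doc;*.docx;*.pdf examples in this thread. The names count_filters and is_within_workaround_limit are hypothetical helpers for illustration and are not part of Everything.

```python
# Workaround limit taken from the post above ("127 or less").
MAX_SAFE_FILTERS = 127


def count_filters(filter_string: str) -> int:
    """Count the non-empty, semicolon-separated filters in a string
    such as '*.doc;*.docx;*.pdf'. Empty entries (e.g. from a trailing
    semicolon) are ignored."""
    return sum(1 for part in filter_string.split(";") if part.strip())


def is_within_workaround_limit(filter_string: str) -> bool:
    """Return True if the filter list has at most MAX_SAFE_FILTERS entries."""
    return count_filters(filter_string) <= MAX_SAFE_FILTERS
```

A list of about 190 extensions, as mentioned later in this thread, would fail this check, while a three-entry list such as *.doc;*.docx;*.pdf passes.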
Re: Problem indexing content
Thank you!
For now, I removed all 128+ filters and restarted Everything. It has been indexing non-stop since early this morning (more than 5-6 hours), and it's at 51% according to the "Indexing properties" box.
In "Content", I have checked the box "Index file content" because I need to search for text inside many documents. I'm limiting my search to only one very large folder, many GB, in which I only include docx, pdf and eml data.
Is it normal that the indexing takes so long (it's now at 2 GB)?
Or could it be that it's searching in the former database still sitting somewhere?
What happens if I shut off my computer, will the indexing restart from the start, or continue where it left off?
I'm hoping to start from scratch with the new update.
Re: Problem indexing content
Indexing content will take a very long time.
I recommend keeping the index under 1GB.
Please try reducing the number of files content-indexed.
You can exit Everything.
Everything will save the content indexing progress.
Content indexing will resume the next time you start Everything.
Re: Problem indexing content
Quote: Everything will save the content indexing progress. Content indexing will resume the next time you start Everything.
I shut down my computer yesterday while Everything was at about 74% on "Indexing properties". I was hoping, as you said, that it would resume the indexing process this morning. That was not the case. As soon as I opened Everything and searched for some content, it restarted the whole process from scratch, re-indexing the properties. I had to pause it, and it will stay paused until someone finds a solution.
Thanks for still looking into this situation.
Re: Problem indexing content
Please check your last rebuild info, what is shown for these values:
Check Tools -> Debug -> Statistics -> Build -> Last rebuild reason.
Check Tools -> Debug -> Statistics -> Build -> Last build date.
Check Tools -> Debug -> Statistics -> Save -> Last save date.
Re: Problem indexing content
Last rebuild reason: invalid parent folder index 1710478 >= 726678 + 8
Last build date: 23-Sep-24 11:18
Last save date: None.
In case it matters: indexing is turned off in Windows on my computer.
Should it be on to use Everything?
Re: Problem indexing content
Everything doesn't use the Microsoft system index for indexing content.
You can search your system index in Everything with the si: search function:
For example:
si:"My content search"
Are your include only file filters sticking?
-Please check Tools -> Options -> Exclude -> Include only files.
Re: Problem indexing content
I don't quite understand the search function si:"xxx"
I checked Tools -> Options -> Exclude -> Include only files.
I still had the whole bunch of file extensions there (~190), which I thought I had removed!
I'll give it another try without including any extensions, and see how it works.
Re: Problem indexing content
si: will search your Microsoft Windows system index.
No need to index content in Everything if you use windows indexing.
Searching the Windows index
si:
Quote: Tools -> Options -> Exclude -> Include only files. I still had the bunch of file extensions (~190), which I thought I had removed!
After removing the filters, please try restarting Everything to see if the setting is saved:
File -> Exit
Restart Everything
Check your Tools -> Options -> Exclude -> "Include only files" setting.
Re: Problem indexing content
You said to index no more than 1 GB of data. I am indexing the content of a full hard drive (600 GB), but I use the following setup:
Options - Indexes - Exclude - "Include only files" where I have only those extensions: *.doc;*.docx;*.pdf
and I am also using the same rule under Content - Include only files.
That hard drive may have about 1 or 2 Gb of data with those file types.
Would Everything limit the indexing to those file extensions only, or would it still index the whole hard drive?
If so, is there a way to limit the scope of the "Indexing properties"?
Re: Problem indexing content
Please try to keep the indexed size under 1GB (Tools -> Debug -> Statistics -> File data size)
You can index the full drive (600 GB) as long as this isn't 600 GB of plain text files
Quote: Options - Indexes - Exclude - "Include only files" where I have only those extensions: *.doc;*.docx;*.pdf and I am also using the same rule under Content - Include only files.
This is fine, this will help reduce the indexed size.
Everything indexes only plain text. The plain text inside doc, docx, pdf should be rather small compared to the file size as images and formatting are ignored.
Quote: Would Everything limit the indexing to those file extensions only, or would it still index the whole hard drive?
Only the files you specify under Tools -> Options -> Content -> "Include only files" are scanned.
Only indexed files will be scanned (Tools -> Options -> Exclude -> Include only files)
In this case, only doc, docx and pdf files will be scanned.
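To get a rough idea of how many files an "Include only files" filter would select before starting content indexing, the folder can be scanned outside Everything. This is a minimal Python sketch, assuming case-insensitive wildcard matching on file names and a semicolon-delimited filter string. The names matching_files and summarize are hypothetical helpers, not Everything features. The on-disk size it reports is only an upper bound, since the post above says only the plain text inside the files is indexed.

```python
import fnmatch
import os


def matching_files(root, patterns):
    """Yield (path, size_in_bytes) for every file under root whose name
    matches at least one wildcard pattern, ignoring case."""
    lowered = [p.lower() for p in patterns]
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if any(fnmatch.fnmatchcase(name.lower(), p) for p in lowered):
                path = os.path.join(folder, name)
                try:
                    yield path, os.path.getsize(path)
                except OSError:
                    # Skip files that vanish or cannot be read.
                    continue


def summarize(root, filter_string):
    """Return (file_count, total_bytes) for files under root that match
    a semicolon-delimited filter string such as '*.doc;*.docx;*.pdf'."""
    patterns = [p.strip() for p in filter_string.split(";") if p.strip()]
    count = 0
    total = 0
    for _path, size in matching_files(root, patterns):
        count += 1
        total += size
    return count, total
```

For example, summarize(r"D:\MyHardDrive\Folder2Index", "*.doc;*.docx;*.pdf") returns the number of matching files and their combined size in bytes for the folder mentioned earlier in this thread.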
Re: Problem indexing content
I'm still trying to understand "deeper" search in Everything. For instance, I just found out how to do multiple and specific searches.
For those interested, here is what I now do to search for any file with the extension .doc or .docx that contains the word(s) I'm looking for:
*.doc* content:word1 content:word2 content:word3
A similar query finds Word documents that contain word1 together with a specific string of text:
*.doc* content:word1 content:"string of characters"
I'm sure there are plenty of other tricks to use, but I didn't have (or didn't take) the time to find them!
The more I learn how to use Search Everything the more powerful it gets!
I have two questions:
1. Where can I find more info on the types of searches I can do with Everything?
2. You already mentioned that we should limit our search database to <1 Gb. But is there a way to speed up the Indexing process when I have folders containing thousands of Word documents, with a size of more than 50 Gb of data? Reason: I am a writer and translator, and I need all of those for my research. Those files could contain whole books in two languages, research papers, and more.
Re: Problem indexing content
PeterPan8 wrote: ↑Fri Sep 27, 2024 2:08 pm
2. You already mentioned that we should limit our search database to <1 Gb. But is there a way to speed up the Indexing process when I have folders containing thousands of Word documents, with a size of more than 50 Gb of data? Reason: I am a writer and translator, and I need all of those for my research. Those files could contain whole books in two languages, research papers, and more.
This is a job for the Windows indexer.
It's not limited in size and can be searched using Everything's si: search function.
Re: Problem indexing content
Quote: For those interested, that's now what I do to search for any file with extension .doc or .docx, including the word(s) I'm looking for:
*.doc* content:word1 content:word2 content:word3
The following search does the same:
*.doc* content:<word1 word2 word3>
(should be easier to type)
See: Subexpressions
Quote: 1. Where can I find more info on the types of searches I can do with Everything?
Everything -> Help -> Search syntax for the basics.
Search functions for all searches with examples.
Quote: You already mentioned that we should limit our search database to <1 Gb. But is there a way to speed up the Indexing process when I have folders containing thousands of Word documents, with a size of more than 50 Gb of data? Reason: I am a writer and translator, and I need all of those for my research. Those files could contain whole books in two languages, research papers, and more.
Store the files on a good NVMe SSD drive.
No need to use Everything indexing.
Everything will read all the content without indexing in a few seconds.
If you are storing the files on an SSD, please make sure Everything is using multiple threads under:
Tools -> Debug -> Statistics -> NTFS Index (C:) -> Multithreaded
Everything will index files on HDD as fast as possible.
Everything should only index once.