Search Preprocessor Terminology

Discussion related to "Everything" 1.5 Alpha.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Search Preprocessor Terminology

Post by ChrisGreaves »

void wrote: Thu May 27, 2021 2:18 am
The Search Preprocessor rewrites your searches in Everything 1.5.
Searches passed from the command line are automatically expanded.
Searches are expanded when executing an Everything query.
Hello David. I'd like to check my understanding of your terminology here.
"Searches" refers to user-composed search strings such as "T:\Greaves\Training\Everything\Tutorial\Section 02\ ext:doc".
"Command Line" means Searches passed when executing Everything from the DOS Command prompt.
"Searches are expanded" should be read as "Searches are automatically expanded".
"an Everything query" is synonymous with any attempt to ask Everything to return a result List in some form or another.

In summary, every time you ask Everything to find objects, the Search Preprocessor supplies an intermediate and transient/temporary translation of your request. Is that close to the truth?
Thanks, Chris
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Search Preprocessor

Post by NotNull »

"Command Line" means Searches passed when executing Everything from the DOS Command prompt
.. or similar, like in the registry to create context menu items.
In summary, every time you ask Everything to find objects, the Search Preprocessor supplies an intermediate and transient/temporary translation of your request. Is that close to the truth?
The search preprocessor helps in building the actual search query. It gets expanded to an actual (part of the) search.

For example, you know just a filename and want to find files with the exact same size as said filename -- which size is unknown as you only know the name.
The search would be: size:<something>

<something> can be filled in by the preprocessor:
[get-size:filename]
void
Developer
Posts: 16773
Joined: Fri Oct 16, 2009 11:31 pm

Re: Search Preprocessor Terminology

Post by void »

The Search Preprocessor supplies an intermediate and transient/temporary translation of your request. Is that close to the truth?
Basically, yes.
Think of it as like expanding environment variables, eg: %path%
The %path% text is replaced with the path environment variable value.
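
The environment-variable analogy can be sketched in a few lines of Python. This is only an illustration of "find a marker, replace it with a value"; the function name `expand_percent_vars` and the variable table are invented for the example and have nothing to do with Everything's own code.

```python
import re

def expand_percent_vars(text, variables):
    """Replace each %name% in text with variables[name]; leave unknown names untouched."""
    def substitute(match):
        name = match.group(1)
        # Windows treats variable names case-insensitively, so keys are lowercased.
        return variables.get(name.lower(), match.group(0))
    return re.sub(r"%([^%\s]+)%", substitute, text)

# Hypothetical variable table used only for this demonstration.
variables = {"path": r"C:\Windows;C:\Tools"}
expanded = expand_percent_vars("echo %PATH%", variables)
print(expanded)
```

The text that reaches the next stage no longer contains `%PATH%` at all, which is the sense in which the expansion is a transient translation of what was typed.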



To be more technical:

Preprocessor expressions are executed and replaced with the resulting value.

#[preprocessor-function:...#] style expressions are expanded first.

The entire search is then broken into search terms using search operators (space=AND, |=OR, < > = grouping).

Then for each search term, any [preprocessor-function:...] expressions are expanded.

Finally, the fully expanded search terms are compiled into byte code and executed.
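
The order of the steps above can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the functions `upper` and `pair` are invented, the splitter only understands spaces (no | and no < > grouping), and the final byte-code step is not modelled. It is not how Everything is implemented.

```python
import re

# Hypothetical preprocessor functions, invented for this sketch only.
FUNCTIONS = {
    "upper": lambda arg: arg.upper(),
    "pair": lambda arg: f"{arg}1 {arg}2",  # deliberately returns text containing a space
}

EARLY = re.compile(r"#\[([a-z-]+):(.*?)#\]")    # #[function:...#]
LATE = re.compile(r"\[([a-z-]+):([^\[\]]*)\]")  # [function:...]

def _call(match):
    """Run the named toy function on its argument; leave unknown names as typed."""
    name, arg = match.group(1), match.group(2)
    func = FUNCTIONS.get(name)
    return func(arg) if func else match.group(0)

def preprocess(search):
    # Step 1: expand #[...#] expressions across the whole search text.
    search = EARLY.sub(_call, search)
    # Step 2: break the search into terms (spaces only in this toy).
    terms = search.split()
    # Step 3: expand [...] expressions inside each term separately.
    return [LATE.sub(_call, term) for term in terms]

print(preprocess("#[pair:a#] [upper:b]"))
print(preprocess("[pair:a]"))
```

In this toy model, text produced by a #[...#] expression is split into terms afterwards, while text produced by a [...] expression stays inside the term it was written in. Whether Everything behaves the same way in every case is not something this sketch can confirm.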
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Search Preprocessor

Post by ChrisGreaves »

NotNull wrote: Mon May 01, 2023 8:07 pm
"Command Line" means Searches passed when executing Everything from the DOS Command prompt
.. or similar, like in the registry to create context menu items.
Hi NotNull. [grinOn]Please stop throwing new things at me while I am trying to understand old things.[grinOFF] If we stick with the first part of my statement, you think I am basically correct, correct? That is, when I see "Command Line Searches", it's OK for me to think in the context of a search initiated from the DOS CMD prompt, right?
I now think I must embark on a separate area to investigate ~~~controlling search behavior via the Windows Registry ~~~ as well.
For now I should get used to the advantages of Command-line searches.(1)
In summary, every time you ask Everything to find objects, the Search Preprocessor supplies an intermediate and transient/temporary translation of your request. Is that close to the truth?
The search preprocessor helps in building the actual search query. It gets expanded to an actual (part of the) search.
There is something in a comment that Raindrops made a couple of months ago.
I have a background in compiler-writing, so "pre-processor" was familiar to me 50+ years ago. Today I might call it a "syntax-checker". Whereas a compile of a program on the old mainframes took 30+ minutes and produced an object deck, a pre-process was a simple one-pass scan of the source that reported blatant syntax errors and ran in under 60 seconds. So I have no problem with a pre-processing phase of any type.
In Everything's case the pre-process is more like a translation of a search string prior to its execution. Some people might call that "tokenising".

However :drumroll: from the user's point of view we simply type a string into the Search Box and Everything generates a result List, right?
In that sense we users don't care about how many passes Everything makes across the search string, just as long as we can be confident in the results?
There may be twenty-eight processes, run one after the other, that will in the end produce the Result List. Maybe three hundred pre-processes, but we users don't care, do we?

I will think about whether or not I want to introduce the term "pre-processor". It is fascinating to learn about it, but as far as the (novice) end-user goes, do they need to understand the pre-process in order to encode powerful search Strings? (Please answer "No they don't" :D )
For example, you know just a filename and want to find files with the exact same size as said filename -- which size is unknown as you only know the name.
The search would be: size:<something>
<something> can be filled in by the preprocessor:
[get-size:filename]
I think this is a great example! As well I think this is a great example of the user NOT needing to know "pre-processor".
size:1000

and
size:[get-size:MyFile.doc]

where the size of MyFile.doc is 1,000 bytes, produce the same result.
But the second version is more powerful, more versatile.
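
The equivalence of those two searches can be shown with a small Python sketch. The `SIZES` table is a hypothetical stand-in for whatever Everything looks up internally, and `expand_get_size` is a name made up for this example.

```python
import re

# Hypothetical index of file sizes in bytes; Everything would consult its own database.
SIZES = {"MyFile.doc": 1000}

def expand_get_size(search):
    """Replace each [get-size:name] with the size recorded for name, if known."""
    def substitute(match):
        size = SIZES.get(match.group(1))
        return match.group(0) if size is None else str(size)
    return re.sub(r"\[get-size:([^\]]+)\]", substitute, search)

print(expand_get_size("size:[get-size:MyFile.doc]"))
```

If MyFile.doc later grows, the second form of the search still follows it, whereas size:1000 would have to be retyped by hand.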

(1)I have a foolproof method for dealing with items such as "registry". I create a new document "Registry Doc" with a Heading paragraph and link to the (this) post, and then leave that document dormant until it shows up at the top of the list in date-modified or size sequence :D

Cheers, and thanks again.
Chris
Last edited by ChrisGreaves on Fri May 05, 2023 4:18 pm, edited 1 time in total.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Search Preprocessor Terminology

Post by ChrisGreaves »

void wrote: Tue May 02, 2023 3:53 am
The Search Preprocessor supplies an intermediate and transient/temporary translation of your request. Is that close to the truth?
Basically, yes.
Thank you, Void, for the confirmation, which to me is encouraging.
Think of it as like expanding environment variables, eg: %path%
The %path% text is replaced with the path environment variable value.
To be more technical:
Preprocessor expressions are executed and replaced with the resulting value.
#[preprocessor-function:...#] style expressions are expanded first.
The entire search is then broken into search terms using search operators (space=AND, |=OR, < > = grouping).
Then for each search term, any [preprocessor-function:...] expressions are expanded.
Finally, the fully expanded search terms are compiled into byte code and executed.
So, to be even more technical:
The user types a string into the Search Box, and that string may then go through several processes, each one manipulating the original string into different forms until a string emerges that can be fed into an engine that spits out a set of objects into the Result List.

From my posts immediately above this response, I believe that the end-user doesn't really care what the processes are as long as they generate correct results.

Probably the most common example is that of arithmetic operator precedence, where we are told that * and / are effected before + and -, and that grouping by ( and ) is dominant over the first set of rules.
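
The precedence rule is easy to check in Python, which follows the same convention; the variable names are just for the demonstration.

```python
# Multiplication binds tighter than addition...
without_parens = 2 + 3 * 4
# ...unless parentheses force the addition to happen first.
with_parens = (2 + 3) * 4
print(without_parens, with_parens)  # prints: 14 20
```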

So the end-user might want to know the various steps - as you have written them above - to determine how to write a powerful and versatile search string, but that will require the same amount of discipline as building a Regular Expression. How does that sound?

I will track down the differences between #[#] and plain old [] Real Soon Now. :D

Poole and Waite's STAGE2 processor was a viciously nested evaluation of macro expansion. This links to a facsimile of the paper. Basically, STAGE2 relentlessly processed an input line over and over again until it had exhausted all possibilities of transformation, then moved on to the output line.
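
The "process the line over and over until nothing changes" idea can be sketched in Python. This is not STAGE2's actual algorithm, only the rescan-until-stable loop described above; the macro table and the function name `expand_until_stable` are made up for the example.

```python
def expand_until_stable(line, macros, max_passes=100):
    """Apply every macro to the line repeatedly until a pass changes nothing."""
    for _ in range(max_passes):
        new_line = line
        for name, body in macros.items():
            new_line = new_line.replace(name, body)
        if new_line == line:
            return line  # no macro applied in this pass, so the line is finished
        line = new_line
    raise RuntimeError("macros did not settle; possibly a recursive definition")

# NAME is listed before GREET, so the NAME introduced by GREET is only
# picked up on a second pass over the line.
macros = {"NAME": "WORLD", "GREET": "HELLO NAME"}
result = expand_until_stable("GREET!", macros)
print(result)  # prints: HELLO WORLD!
```

The `max_passes` guard is there because a macro that expands to itself would otherwise loop forever.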

I suspect that in the case of Everything, we end-users don't care about how many levels or stages or steps of processing go on when we enter NotNull's example size:[get-size:MyFile.doc], as long as we get the correct result List.

Cheers, Chris
void
Developer
Posts: 16773
Joined: Fri Oct 16, 2009 11:31 pm

Re: Search Preprocessor Terminology

Post by void »

So the end-user might want to know the various steps - as you have written them above - to determine how to write a powerful and versatile search string, but that will require the same amount of discipline as building a Regular Expression. How does that sound?
There's likely to be limited documentation within Everything on the preprocessor.
The preprocessor will be a hidden feature for Everything 1.5 while it continues to improve.



Most users will not use the preprocessor.
I'll keep the online information up-to-date.
ChrisGreaves
Posts: 684
Joined: Wed Jan 05, 2022 9:29 pm

Re: Search Preprocessor Terminology

Post by ChrisGreaves »

void wrote: Fri May 05, 2023 7:36 am
Most users will not use the preprocessor.
I'll keep the online information up-to-date.
I suspect that from the ordinary user's point of view, they can't help using the preprocessor, but they won't need to know that they are using a pre-processor. On the old IBM1401 we had a deck of cards called "An Object Deck". We thought it was our compiled program, but it wasn't. It was a 2-card bootstrap loader, followed by an 8-card bootstrap loader, followed by a symbol table, followed by machine instructions. We didn't find out about the bootstrap loaders until later.

Likewise an ordinary user will post a question and a Guru will respond with "Try this command", which will do the trick, and the user will respond "Works great. Thanks!" without knowing about the pre-processor and its mechanisms.

What we see as special commands, given the name "Preprocessor Commands", will be, to the ordinary user, just another batch of command-like stuff, like SearchFunctions, SearchModifiers, MacroStrings, etc.

I think that's what raindrops was trying to say a few months ago.

My experience with users of MSWord is that 90% of them say "Oh, I never use STYLES", but the truth is you can't use MSWord without using styles.
My experience with users of MSWord is that 90% of them say "Oh, I never use a TEMPLATE", but the truth is you can't use MSWord without using a template, even if it is just NORMAL.DOT.

My bottom line is that in a User Tutorial I probably don't need to stress the "preprocessor" angle as much as the order of precedence that takes place among the batches of Reserved Words.
Cheers, Chris.
NotNull
Posts: 5461
Joined: Wed May 24, 2017 9:22 pm

Re: Search Preprocessor Terminology

Post by NotNull »

Exactly!



Off-topic:
ChrisGreaves wrote: Fri May 05, 2023 4:34 pm
On the old IBM1401 we had a deck of cards called "An Object Deck". We thought it was our compiled program, but it wasn't. It was a 2-card bootstrap loader, followed by an 8-card bootstrap loader, followed by a symbol table, followed by machine instructions. We didn't find out about the bootstrap loaders until later.
*Now* I know why this sounded so familiar (you mentioned IBM1401 before): it is a (neo-classical) music album by Jóhann Jóhannsson (iirc, his father had one of these at home):

Jóhann Jóhannsson - IBM 1401, A User's Manual (2006)
Part 1 - IBM 1401 Processing Unit (8:32)
Part 2 - IBM 1403 Printer (9:33)
Part 3 - IBM 1402 Card Read-Punch (10:23)
Part 4 - IBM 729 II Magnetic Tape Unit (7:15)
Part 5 - the Sun's Gone Dim and the Sky's Turned Black (7:09)