The Search Function Will Not Do Phrases In Quotes [FIXED]

Moderator: scott

rocky
Enthusiast
Posts: 153
Joined: Mon Mar 10, 2008 9:55 pm
Location: Anaheim (Disneyland) California

Post by rocky »

The forum database is getting huge and searching for past subjects is tedious because the search engine will not look for a phrase in quotes.
Instead, a reference to each word is found and it takes hours to check them.

For example, searching for the phrase “climbing back up” finds every occurrence of the word “back”. New topics are created for a subject because a search to see if it has been discussed before is just too time-consuming. Sometimes hundreds of topics are found when I request a search. That is just too many to check manually. Creating new topics on something already discussed wastes memory and makes the database even larger.

Many times I have had a new idea for the wheel and wanted to see if it had been discussed or built by searching for it on the forum, but when hundreds of matches are found I just give up the search. Google, Yahoo and Wiki search engines will all hunt for phrases in quotes. Can the forum software be updated? There are over 900 members here. If we all made a modest PayPal contribution, could the improvement be made?

If the search algorithm would do phrases in quotes it would really improve searching for past data.

-Rocky
jim_mich
Addict
Posts: 7467
Joined: Sun Dec 07, 2003 12:02 am
Location: Michigan

Post by jim_mich »

Yes, the search function of the phpBB software that runs this forum sucks. There is no difference between "Search for any terms or use query as entered" and "Search for all terms". Both always return the same results.

An alternative is to use the Google search at the bottom of each page. But it also has a problem in that it only searches the public forums. Google does not search the Community Buzz forum or any of the private forums.


triplock

re: The Search Function Will Not Do Phrases In Quotes

Post by triplock »

nor any of the private forums.
phew !!!!!!!! ;-)


Chris
scott
Site Admin
Posts: 1409
Joined: Tue Nov 04, 2003 7:05 am
Location: Colorado

Post by scott »

Thanks for your post Rocky. I hadn't realized this was causing so much trouble. phpbb2 does not support searching on phrases since it stores the search words in the database individually, and does not do a full text search. I've heard that phpbb3 also does not support full text search by default, but it may with some tweaking as long as the DB supports it.
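
To illustrate why an index that stores words individually cannot answer a phrase query, here is a minimal Python sketch. It is not phpBB's code; `build_word_index` and the two sample posts are made up for the example.

```python
import re
from collections import defaultdict

def build_word_index(posts):
    """Map each lowercase word to the set of post ids that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(post_id)
    return index

# Hypothetical sample posts.
posts = {
    1: "The weight keeps climbing back to the top",
    2: "Move the weight back before climbing",
}
index = build_word_index(posts)

# The index only knows which posts contain each word, not where the words sit.
# Both posts contain "climbing" and "back", so a word lookup returns both,
# even though only post 1 has the two words side by side.
both = index["climbing"] & index["back"]
```

Word positions are never recorded in this kind of index, so adjacency has to be checked some other way, for example against the post text itself.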

In the meantime I wrote some code that will hopefully help without hurting performance too much. Now if you search on a quoted string, it will perform the search on the individual words as before, but then I loop over the results and filter out any that don't have an exact match on the quoted string.
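
The two-step approach described above (word search first, then an exact-phrase filter over the hits) might look roughly like the following Python sketch. The forum's real code is not shown in this thread, so every name here (`tokens`, `parse_query`, `contains_phrase`, `search`) is hypothetical, and the word search is simplified to "every word must appear".

```python
import re

def tokens(text):
    """Lowercase word tokens; punctuation and quote marks are dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())

def parse_query(query):
    """Return (quoted phrases as token lists, all words in the query)."""
    phrases = [tokens(p) for p in re.findall(r'"([^"]+)"', query)]
    return phrases, tokens(query)

def contains_phrase(text, phrase_tokens):
    """True if the phrase occurs in the text as consecutive whole words."""
    haystack = " " + " ".join(tokens(text)) + " "
    needle = " " + " ".join(phrase_tokens) + " "
    return needle in haystack

def search(posts, query):
    phrases, words = parse_query(query)
    # Step 1: the word search as before (simplified: all words must appear).
    candidates = [pid for pid, text in posts.items()
                  if set(words) <= set(tokens(text))]
    # Step 2: keep only the hits that contain every quoted phrase exactly.
    return [pid for pid in candidates
            if all(contains_phrase(posts[pid], p) for p in phrases)]

# Hypothetical sample posts.
posts = {
    1: "He kept climbing back up the wheel.",
    2: "Back up a step: climbing is the hard part.",
    3: "Nothing relevant here.",
}
unquoted = search(posts, 'climbing back up')
quoted = search(posts, '"climbing back up"')
```

Without quotes both post 1 and post 2 come back. With quotes only post 1 survives the filter. The filter only has to read the posts the word search already found, which is presumably why the performance cost stays small.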

Your example is a little tricky since for some reason "up" is a stop word, meaning it is not indexed. Stop words are usually words like "the" and "and" that are left out to reduce clutter in the results. Not sure why "up" and "down" are in there.
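
A small Python sketch of what "not indexed" means in practice. The stop-word set below is illustrative only (the forum's real list is not shown in this thread), and `indexable_terms` is a hypothetical helper.

```python
# Illustrative stop words only; not the forum's actual list.
STOP_WORDS = {"the", "and", "up", "down"}

def indexable_terms(text, stop_words=STOP_WORDS):
    """Return the words that would make it into the search index."""
    return [w for w in text.lower().split() if w not in stop_words]

terms = indexable_terms("climbing back up")
```

Here `terms` holds only "climbing" and "back", so a search that relies on the index alone cannot tell "climbing back up" apart from "climbing back down".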

In any case the new code seems to help. Without quotes, the search on climbing back up returns 67 matches. With quotes, the search on "climbing back up" returns only 28 matches, and every hit has "climbing back" in it (since "up" was not indexed).

If you do a search on a quoted phrase without any stop words the results are even better.

Thanks again for the feedback and hope the new feature improves your experience here.

Best,
Scott

P.S. Despite what Jim says, the ANY and ALL search option does indeed work. E.g. for the search terms "equivalent effective weight," when I search for ANY terms I get 11505 matches, when I search for ALL terms I get 7 matches and now when I search for the quoted string "equivalent effective weight" I get 2 exact matches (including this post).
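
The gap between the ANY and ALL counts is what you would expect if ANY is a set union and ALL is a set intersection over the per-word hit lists. A minimal Python sketch, using a made-up index (the post ids are invented, and `search_any` / `search_all` are hypothetical names):

```python
def search_any(index, words):
    """Posts containing at least one of the words (set union)."""
    return set().union(*(index.get(w, set()) for w in words))

def search_all(index, words):
    """Posts containing every one of the words (set intersection)."""
    hit_sets = [index.get(w, set()) for w in words]
    return set.intersection(*hit_sets) if hit_sets else set()

# Made-up index: word -> ids of posts containing it.
index = {
    "equivalent": {1, 2, 3},
    "effective": {2, 3, 4},
    "weight": {3, 5},
}
words = ["equivalent", "effective", "weight"]
any_hits = search_any(index, words)
all_hits = search_all(index, words)
```

With common words the union grows quickly while the intersection shrinks, which fits the pattern of thousands of ANY matches against a handful of ALL matches.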
Last edited by scott on Sat Jul 31, 2010 4:32 am, edited 2 times in total.

Post by jim_mich »

I initially used the two words 'pulley question' as a test because it was a recent posting. I got the same results regardless of which two buttons I selected, as I stated in my above post.

I just now did a little more testing and found that the reason that I got the same results was because the word 'question' does not produce any hits, even though it is contained in some titles and some text.

Example of a post with the words 'pulley question' in the title:
http://www.besslerwheel.com/forum/viewtopic.php?p=77553

Example of a post with the word 'question' in both the title and the text:
http://www.besslerwheel.com/forum/viewtopic.php?p=64946

So though I was wrong concerning the ANY and ALL search option, there was a true basis for my statement. Unfortunately I failed to do a second check using different words. I did not realise that the search database didn't contain all of the words used on the forum. Evidently 'question' is considered a stop word. It might be nice to list all the stop words somewhere, or for the search results to tell the user when a search word is a stop word.
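
The suggestion of telling the user which search words were ignored could be sketched like this in Python. It assumes stop words are dropped from the query before the lookup, which this thread does not confirm. `split_terms` and `notice` are hypothetical names, and the stop-word set contains only the one word observed above.

```python
# Assumption: "question" is a stop word, as observed in this thread.
STOP_WORDS = {"question"}

def split_terms(query, stop_words=STOP_WORDS):
    """Split a query into (searched, ignored) word lists."""
    searched, ignored = [], []
    for word in query.lower().split():
        (ignored if word in stop_words else searched).append(word)
    return searched, ignored

def notice(query):
    """Message to show next to the results, or '' if nothing was ignored."""
    _, ignored = split_terms(query)
    if not ignored:
        return ""
    return "Ignored (not indexed): " + ", ".join(ignored)

searched, ignored = split_terms("pulley question")
message = notice("pulley question")
```

Under that assumption only "pulley" is actually looked up, so ANY and ALL reduce to the same single-word search, which would explain the identical results for 'pulley question'.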

Thank you Scott for all that you do.


re: The Search Function Will Not Do Phrases In Quotes

Post by scott »

I agree with you Jim, it doesn't seem like the word "question" should be a stop word. I was also surprised to find "up" and "down" in the list. I tried taking them out though and it broke the whole search function so it's not a trivial matter to edit the list. I will research it further.

That's a great idea to post the stop words. Will do that next.

Best,
Scott

Post by scott »

Thanks again for the feedback. I made some more changes to work around issues in phpbb on searching for two letter words like "up." phpbb2 is designed from the ground up on the decision that 2-letter words don't count as far as the search index goes, so the word "up" has always been collateral damage. However my workaround seems to help as long as the 2-letter word is within a quoted phrase (such as yours Rocky).

In any case, here are the results from my latest testing. Please let me know if you notice any problems.

Code:

search on words: climbing back up

matches:
    ANY:           8736 matches
    ALL:           0 matches (a phpbb bug because of the word "up")
    quoted phrase: 32 good matches

Thanks to your feedback Rocky the search feature works a lot better now than it did before. Thanks!
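
One plausible way an ALL search ends up with 0 matches is sketched below in Python. This is a guess at the mechanism, not something confirmed from phpBB's source: if words under a minimum length never get an index entry but the query still requires them, the intersection is always empty. `MIN_LEN`, `index_posts` and `match_all` are hypothetical.

```python
MIN_LEN = 3  # assumption: shorter words are never written to the index

def index_posts(posts):
    """Build word -> post ids, skipping words shorter than MIN_LEN."""
    index = {}
    for pid, text in posts.items():
        for word in text.lower().split():
            if len(word) >= MIN_LEN:
                index.setdefault(word, set()).add(pid)
    return index

def match_all(index, words):
    """Posts that have an index entry for every word."""
    hit_sets = [index.get(w, set()) for w in words]
    return set.intersection(*hit_sets) if hit_sets else set()

# Hypothetical sample posts.
posts = {1: "climbing back up", 2: "climbing back down"}
index = index_posts(posts)
query = ["climbing", "back", "up"]

# "up" has no index entry, so requiring it empties the result.
strict = match_all(index, query)
# Dropping unindexable words from the query first avoids that.
relaxed = match_all(index, [w for w in query if len(w) >= MIN_LEN])
```

In this sketch `strict` is empty while `relaxed` returns both posts. A phrase filter over the post text would then be needed to separate "back up" from "back down".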

re: The Search Function Will Not Do Phrases In Quotes [FIXED]

Post by rocky »

I am very impressed with your fix for the phrase search Scott.

A big THANK YOU for fixing it.

I have tried several different phrases and all work well.

It is a huge improvement.

-Rocky
Mark
Aficionado
Posts: 548
Joined: Fri Aug 21, 2009 7:18 am
Location: USA - California

re: The Search Function Will Not Do Phrases In Quotes [FIXED]

Post by Mark »

I agree, the search engine tune-up sure has optimized performance. Emissions are much cleaner. :)

Thank you Scott, Rocky and Jim.

Quick comment - One thing that I like about the Google search box is that it also searches the archived Discussion Board.

Post by scott »

Thanks for the feedback guys!