When you search at Google, the results you receive now sometimes include additional questions, often under the label “People Also Ask.” I was curious whether I could find a patent about these questions, and I saw that they were sometimes referred to as “related questions.”
An article at Moz today on the topic was interesting: Infinite ‘People Also Ask’ Boxes: Research and SEO Opportunities. The way these related questions are decided upon seems to have a simpler origin as described in Google’s patent, but it is interesting to compare the ideas from that post with the patent.
I searched through Google patent search for “related questions” and I came up with a patent named, “Generating related questions for search queries”. When I looked at the screenshots that accompanied the patent, they appeared to be very similar to the “People also ask” type questions Google shows us today in search results.
The patent provides some information about how Google gets these related questions.
It appears that when Google receives a query and obtains a number of search results for it, it decides upon one or more topic sets for each of the search result resources. Those topic sets come from “previously submitted search queries that have resulted in users selecting search results identifying the search result resource,” and the next step is “selecting related questions from a question database using the topic sets.” The questions selected using those topic sets may be returned along with the search results for the query.
The question database includes previously submitted search queries that have been determined to be in question form.
Deciding upon topic sets for each of the search result resources involves:
(1) Identifying qualified search queries for the search result resource – previously submitted search queries that resulted in a user selecting a search result that identifies the search result resource;
(2) Ranking the qualified search queries based either on a number of times each query has been submitted or based on a number of times users have selected a search result identifying the search result resource after submitting each query; and
(3) Selecting one or more highest-ranked qualified search queries as the topic sets for the search result resource.
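The three steps above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the patent does not describe a log format, so the click log of (query, clicked resource) pairs, the example URLs, and the function name topic_sets are all hypothetical. The sketch ranks by click count, which is one of the two ranking options the patent mentions.

```python
from collections import Counter

# Hypothetical click log of (query, clicked_resource) pairs.
# The shape of this data is assumed for illustration only.
click_log = [
    ("what is a patent", "example.com/patents"),
    ("patent definition", "example.com/patents"),
    ("what is a patent", "example.com/patents"),
    ("how to file a patent", "example.com/filing"),
    ("patent definition", "example.com/filing"),
]


def topic_sets(log, resource, top_n=2):
    """Return the top_n queries that most often led to a click on resource."""
    # Step 1: qualified queries are those that led to a click on this resource.
    clicks = Counter(query for query, clicked in log if clicked == resource)
    # Steps 2 and 3: rank by click count and keep the highest-ranked queries.
    return [query for query, _ in clicks.most_common(top_n)]


print(topic_sets(click_log, "example.com/patents"))
# ['what is a patent', 'patent definition']
```

Ranking by how often each query was submitted, the other option in the patent, would only change what the Counter counts.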
The questions chosen to be included with a set of search results might be compared with each other, and if they appear to be equivalents of one another, the best version of that question might replace equivalent questions.
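Here is a rough sketch of what collapsing equivalent questions could look like. The summary above does not say how equivalence is tested or what makes one version "best," so both choices here are assumptions: two questions count as equivalent when they match after lowercasing and stripping punctuation, and the best version is the one submitted most often. The counts are made up.

```python
import string

# Hypothetical candidate questions with made-up submission counts.
candidates = {
    "What is a patent?": 120,
    "what is a patent": 300,
    "How long does a patent last?": 80,
}


def normalize(question):
    """Assumed equivalence key: lowercase, no punctuation, collapsed spaces."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(question.lower().translate(table).split())


def dedupe(questions):
    """Keep one question per equivalence key, preferring the most submitted."""
    best = {}
    for question, count in questions.items():
        key = normalize(question)
        if key not in best or count > questions[best[key]]:
            best[key] = question
    return list(best.values())


print(dedupe(candidates))
# ['what is a patent', 'How long does a patent last?']
```

A real system would likely need a looser notion of equivalence (synonyms, word order), which this sketch does not attempt.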
The patent gives us a reason for showing searchers related questions:
Providing related questions to users can help users who are using un-common keywords or terminology in their search query to identify keywords or terms that are more commonly used to describe their intent. The user experience can be improved by submitting the displayed content of a related question as a new search query and receiving a pre-determined, pre-formatted answer to the related question as part of a response from the search engine.
The patent is:
Generating related questions for search queries
Inventors: Yossi Matias, Dvir Keysar, Gal Chechik, Ziv Bar-Yossef, Tomer Shmiel
Publication number: US9213748 B1
Granted date: Dec 15, 2015
Filing date: Mar 14, 2013
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying related questions for a search query is described. One of the methods includes receiving a search query from a user device; obtaining a plurality of search results for the search query provided by a search engine, wherein each of the search results identifies a respective search result resource; determining one or more respective topic sets for each search result resource, wherein the topic sets for the search result resource are selected from previously submitted search queries that have resulted in users selecting search results identifying the search result resource; selecting related questions from a question database using the topic sets; and transmitting data identifying the related questions to the user device as part of a response to the search query.
The Question Database
The question database may be selected from queries that are in question form, and a query may be determined to have been in question form based upon a number of features that may be associated with it. It might have:
Terms from a predetermined set of question terms, which can include one or more interrogative words (e.g., interrogative determiners, interrogative pronouns, and interrogative pro-adverbs) and other function words that are frequently used to ask a question.
It may also have punctuation marks, such as question marks.
It may match certain question query templates, such as, “why is [X] used,” where [X] is a placeholder for one or more query terms.
A search query may have to meet a threshold of having been asked a certain number of times before it is selected as a potential related question.
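To make the question-form checks and the submission threshold concrete, here is a minimal sketch. The word list, the single template, the threshold value, and the sample query counts are all assumed for illustration; the patent describes these kinds of signals but I am not reproducing its actual lists or numbers.

```python
import re

# Assumed, illustrative values only.
QUESTION_TERMS = {"who", "what", "when", "where", "why", "which", "how"}
TEMPLATES = [re.compile(r"^why is .+ used\b")]  # "why is [X] used"
MIN_SUBMISSIONS = 3  # hypothetical threshold


def is_question_form(query):
    """Return True if any of the three signals fires for this query."""
    text = query.lower().strip()
    words = re.findall(r"[a-z']+", text)
    if any(word in QUESTION_TERMS for word in words):
        return True  # contains a question term
    if text.endswith("?"):
        return True  # ends with a question mark
    return any(template.match(text) for template in TEMPLATES)


def build_question_database(counts, min_submissions=MIN_SUBMISSIONS):
    """Keep question-form queries submitted at least min_submissions times."""
    return [
        query
        for query, count in counts.items()
        if count >= min_submissions and is_question_form(query)
    ]


# Hypothetical query log: query text mapped to submission count.
query_counts = {
    "why is copper used in wiring": 12,
    "copper wiring price": 40,
    "is copper magnetic?": 5,
    "how is copper mined": 1,
}

print(build_question_database(query_counts))
# ['why is copper used in wiring', 'is copper magnetic?']
```

Note that "how is copper mined" is in question form but falls below the assumed threshold, so it is left out.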
Questions may also be taken from sources other than queries, and could include questions from question and answer websites.
The patent tells us about the selection of search queries and sources of answers to those queries. When questions are selected, they may be ranked more highly based upon things such as whether they were provided as questions that came with answers. Questions with answers may be promoted in rankings, and questions without answers may be demoted as choices of related questions.
A quality score for an answer may be based upon such things as:
(1) A quality score generated by the search engine for the resource from which the answer is derived (the page that is the source of the answer).
(2) The quality of the answer may be based in part on a ranking of a search result identifying the resource from which the answer is derived in a ranking of search results generated by the search engine in response to the question being submitted as a search query.
(3) The quality of the answer may be based in part on the length of the answer, i.e., the number of tokens, terms, or characters in the answer.
(4) If multiple answers are available for a given question, the quality of each answer can be based in part on the number or proportion of terms in the answer that are repeated in other answers for the question.
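The four signals above could be combined in many ways, and the patent summary here gives no formula. The sketch below simply averages them with equal weights. Everything about the combination is my assumption: the equal weights, the 50-term length cap, the idea that longer answers score higher, and the function names.

```python
def overlap_ratio(answer, others):
    """Proportion of the answer's terms that also appear in other answers."""
    terms = answer.lower().split()
    if not terms:
        return 0.0
    other_terms = {term for other in others for term in other.lower().split()}
    return sum(term in other_terms for term in terms) / len(terms)


def answer_quality(answer, others, resource_quality, result_rank):
    """Average four signals, each scaled to the range 0 to 1.

    resource_quality: assumed to be a 0-1 score for the source page (signal 1).
    result_rank: 1-based position of the source page in results (signal 2).
    """
    rank_signal = 1.0 / result_rank  # rank 1 scores highest
    length_signal = min(len(answer.split()), 50) / 50  # signal 3, assumed cap
    overlap_signal = overlap_ratio(answer, others)  # signal 4
    return (resource_quality + rank_signal + length_signal + overlap_signal) / 4


score = answer_quality(
    "a patent protects an invention",
    ["a patent is a right protecting an invention"],
    resource_quality=0.9,
    result_rank=2,
)
print(round(score, 3))
```

In practice the weights would be tuned, and the length signal might favor answers within some range instead of simply rewarding longer text.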