Relevance search explained

Since version 2024.2, sorting search results by relevance can be enabled or disabled.

 

Wedia uses a relevance search formula to determine which documents best match a search query. Here are the main components of this formula explained simply:

  1. Term Frequency (TF): The more often a term (word) appears in a document, the more relevant that document is considered for that term. For example, if you search for “cat” and the word appears many times in a document, that document receives a high score for “cat.”

  2. Inverse Document Frequency (IDF): The rarer a term is across the whole set of indexed documents, the more weight it carries. For example, if “cat” appears in only a few documents, a match on “cat” counts for more than a match on a word found in almost every document.

  3. Coordination Factor: Wedia takes into account how many of the query terms appear in a document. The more query terms a document contains, the higher its score. For example, if your query is “black cat,” a document containing both “black” and “cat” ranks as more relevant than a document containing only “cat.”

  4. Query Normalization: This step scales the scores in an attempt to make the results of different queries comparable. Because the same factor applies to every document, it does not change the order of results within a single query. Comparing scores across different queries remains unreliable and is generally not recommended.
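
The four components above can be sketched in a few lines of Python. This is a minimal illustration, not Wedia's actual implementation: the function names (`tokenize`, `idf`, `score`), the whitespace tokenizer, the square-root dampening of term frequency, and the logarithmic IDF are all assumptions borrowed from the classic Lucene-style TF-IDF similarity.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (assumption; real analyzers do more).
    return text.lower().split()


def idf(term, docs_tokens):
    # Inverse document frequency: rarer terms get a larger weight.
    df = sum(1 for tokens in docs_tokens if term in tokens)
    return 1.0 + math.log(len(docs_tokens) / (df + 1))


def score(query, doc_tokens, docs_tokens):
    # Unique query terms, in their original order.
    query_terms = list(dict.fromkeys(tokenize(query)))
    if not query_terms:
        return 0.0
    counts = Counter(doc_tokens)
    idfs = {t: idf(t, docs_tokens) for t in query_terms}

    # Query normalization: depends only on the query, so it is the same
    # for every document and never changes the ranking within one query.
    query_norm = 1.0 / math.sqrt(sum(w * w for w in idfs.values()))

    # Coordination factor: share of query terms present in the document.
    matched = [t for t in query_terms if counts[t] > 0]
    coord = len(matched) / len(query_terms)

    # Term frequency (dampened by a square root) times squared IDF,
    # summed over the query terms that the document contains.
    tf_idf = sum(math.sqrt(counts[t]) * idfs[t] ** 2 for t in matched)
    return query_norm * coord * tf_idf


docs = [
    "the black cat sat with another black cat",
    "the cat sleeps",
    "the dog barks",
    "a quiet garden",
]
docs_tokens = [tokenize(d) for d in docs]

# Rank document indexes by descending score for the query "black cat".
ranked = sorted(
    range(len(docs)),
    key=lambda i: score("black cat", docs_tokens[i], docs_tokens),
    reverse=True,
)
for i in ranked:
    print(round(score("black cat", docs_tokens[i], docs_tokens), 3), docs[i])
```

With this toy collection, the document containing both “black” and “cat” ranks first, the document containing only “cat” ranks second, and documents with neither term score zero.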

In summary, Wedia combines the TF-IDF model (which weights each term by its frequency in the document and its rarity across all documents) with the vector space model (which represents the query and each document as vectors of term weights and scores a document by how closely its vector matches the query's). This combination allows Wedia to rank documents by their relevance to a given query.
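
As an illustration of how these components can fit together, here is one possible form of the score of a document $d$ for a query $q$. It is modeled on the classic Lucene-style TF-IDF similarity. The exact formula Wedia applies is not stated here, so the square-root dampening, the logarithmic IDF, and the symbols $N$ (number of documents), $\mathrm{df}(t)$ (number of documents containing term $t$), and $\mathrm{freq}(t, d)$ (occurrences of $t$ in $d$) are assumptions:

$$
\begin{aligned}
\mathrm{score}(q, d) &= \mathrm{queryNorm}(q) \cdot \mathrm{coord}(q, d) \cdot \sum_{t \in q} \mathrm{tf}(t, d) \cdot \mathrm{idf}(t)^2 \\
\mathrm{tf}(t, d) &= \sqrt{\mathrm{freq}(t, d)} \\
\mathrm{idf}(t) &= 1 + \ln\frac{N}{\mathrm{df}(t) + 1} \\
\mathrm{coord}(q, d) &= \frac{\lvert \{\, t \in q : \mathrm{freq}(t, d) > 0 \,\} \rvert}{\lvert q \rvert} \\
\mathrm{queryNorm}(q) &= \frac{1}{\sqrt{\sum_{t \in q} \mathrm{idf}(t)^2}}
\end{aligned}
$$

Under these assumptions, $\mathrm{queryNorm}(q)$ is identical for every document matched by the same query, which is why it affects the absolute score values but not the ranking.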