Configuring Autocomplete

Customizing your autocomplete

There are two aspects to configuring autocomplete functionality in Sajari:

  • Training your autocomplete model.
  • Querying your autocomplete model to get autocomplete suggestions.

Autocomplete training is automatic for website and API collections. The model is trained from metadata in your records and from user queries (see the sections below for details). A query pipeline called autocomplete is created automatically and is used when your application is in autocomplete mode.

Training your autocomplete model

Inputs to train your autocomplete model can come from fields in your records and user queries.

Training autocomplete from fields

Common fields used for autocomplete training include:

  • Crawler (Website) collections: headings, titles, or other meta fields in your webpages.
  • API collections (e.g. e-commerce): brand, product name, category, and other fields in your product data.

Be selective about the fields you use for autocomplete training to generate the best possible suggestions. Short, descriptive fields typically work best.

To train from record fields, add the train-autocomplete-v2 step to your record pipeline:

- id: train-autocomplete-v2
  params:
    fields:
      constant: name:name,brand:brand,categories:categories
    maxWords:
      constant: "5"
    model:
      constant: default

Training autocomplete from user queries

User queries typed into your search box can be used as a training source for your autocomplete model. This allows autocomplete to instantly adapt to trending or popular queries.

You can customize the criteria that determine whether a user query is used for autocomplete training via the train-autocomplete step in the Relevance Adjustments.

The possible criteria include:

  • Does the query return a minimum number of search results? For example, you probably do not want to train queries that lead to no search results.
  • Does the top result for the query meet a minimum score or index score? We recommend setting a minimum score or index score as an indicator of whether the query will yield relevant results.
  • Is the query shorter than a maximum number of words? We recommend setting a maximum word limit; otherwise your autocomplete model may include nonsensical suggestions.

The following example trains on the query q (the search query the user entered) only if it meets all of these criteria:

  • The query led to at least 1 result.
  • The best result had an index score of at least 1.0.
  • The query contains at most 3 words.

- id: train-autocomplete
  params:
    minIndexScore:
      constant: "1.0"
    minResults:
      constant: "1"
    maxWords:
      constant: "3"
    model:
      constant: default
    text:
      bind: q
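
The three criteria in the step above amount to a simple filter on each incoming query. The following is a minimal Python sketch of that logic, not Sajari's implementation: the names TrainCriteria and should_train are hypothetical, and it assumes words are separated by whitespace and that the thresholds are inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainCriteria:
    """Hypothetical container mirroring the step's params (not a Sajari type)."""
    min_index_score: float = 1.0  # corresponds to minIndexScore
    min_results: int = 1          # corresponds to minResults
    max_words: int = 3            # corresponds to maxWords


def should_train(query: str, result_count: int, top_index_score: float,
                 criteria: TrainCriteria = TrainCriteria()) -> bool:
    """Return True only if the query passes all three criteria."""
    words = query.split()  # assumption: whitespace-delimited words
    if not words:
        return False  # nothing to train on
    if result_count < criteria.min_results:
        return False  # e.g. queries that return no results
    if top_index_score < criteria.min_index_score:
        return False  # best result is not relevant enough
    return len(words) <= criteria.max_words
```

For example, should_train("red running shoes", 12, 1.0) passes all three checks, while a five-word query or a query with zero results is rejected.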

Other sources

It is possible to train your autocomplete model from other data sources, such as your query history from another system. Please get in touch for more details.

Querying your autocomplete model

To enable autocomplete when querying, add the autocomplete step to the query pipeline. Using this step, you can configure how training sources are weighted when ordering autocomplete suggestions. For example, you may want to weight phrases from user queries more heavily than a field in your collection.

Normally, you do not want to return autocomplete suggestions and perform a search at the same time, so a pipeline that uses the autocomplete step typically also includes the skip-search step.

The following example performs autocomplete on the input query q, writes the query suggestions to q.suggestions, and skips the search.

description: Autocomplete specific pipeline
preSteps:
- id: autocomplete
  params:
    labelWeights:
      constant: query:1.0,name:0.05,brand:0.05,categories:0.05
    model:
      constant: default
    original:
      bind: q.original
    outText:
      bind: q
    overrideSuggestions:
      bind: q.overrideSuggestions
    suggestions:
      bind: q.suggestions
    text:
      bind: q
- id: skip-search

The order in which query suggestions are returned can be influenced by setting const:labelWeights. The labels are assigned in the train-autocomplete-v2 step. If labelWeights is empty, all labels are weighted equally.
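
To make the effect of label weights concrete, here is a small Python sketch that parses a labelWeights string in the label:weight,label:weight format used in this section and orders some made-up candidates. It is an illustration under stated assumptions, not Sajari's scoring: the functions parse_label_weights and rank_suggestions, the candidate phrases and their base scores are all hypothetical, and multiplying a base score by the label weight is assumed purely for demonstration.

```python
def parse_label_weights(spec: str) -> dict[str, float]:
    """Parse 'query:1.0,brand:0.5' into {'query': 1.0, 'brand': 0.5}."""
    weights: dict[str, float] = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue  # an empty spec yields no weights
        label, sep, value = pair.rpartition(":")
        if not sep or not label:
            raise ValueError(f"expected label:weight, got {pair!r}")
        weights[label.strip()] = float(value)
    return weights


def rank_suggestions(candidates: list[tuple[str, str, float]],
                     weights: dict[str, float]) -> list[str]:
    """Order (phrase, label, base_score) candidates by weighted score.

    Assumption: unlisted labels (and an empty weights dict) count as 1.0,
    so all labels are weighted equally when no weights are given.
    """
    def weighted(candidate: tuple[str, str, float]) -> float:
        _, label, base_score = candidate
        return base_score * weights.get(label, 1.0)

    return [phrase for phrase, _, _ in
            sorted(candidates, key=weighted, reverse=True)]


# Hypothetical candidates: (phrase, training label, base score).
candidates = [
    ("running shoes", "name", 0.9),
    ("running shorts", "query", 0.5),
    ("runners world", "brand", 0.8),
]
weights = parse_label_weights("query:1.0,name:0.05,brand:0.05,categories:0.05")
ranked = rank_suggestions(candidates, weights)
unweighted = rank_suggestions(candidates, {})
```

With these weights the query-trained phrase moves to the top of ranked even though its base score is the lowest, whereas unweighted simply follows the base scores.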

The following example weights live query training (denoted by the label query) at 1.0 and the record fields brand and category at 0.5 each, making query training twice as important.

- id: autocomplete
  params:
    labelWeights:
      constant: query:1.0,name:0.05,brand:0.05,categories:0.05
    model:
      constant: default
    original:
      bind: q.original
    outText:
      bind: q
    overrideSuggestions:
      bind: q.overrideSuggestions
    suggestions:
      bind: q.suggestions
    text:
      bind: q
  consts:
    labelWeights:
      - value: "query:1.0,brand:0.5,category:0.5"