Content search
Summary
Whilst content browsing is still the most common way to discover content, search is a very useful tool when trying to find specific content by one of its features, such as title, actor, or description. A side effect of its functionality is that other content can also be discovered.
The MDS supports search functionality, including a keyword suggest API. Its APIs cover both Live content and VOD, but the scope can be restricted to one or the other.
The fields involved in the search APIs are a reduced subset of those fields available in the browse APIs. This is because during a search the MDS has to gather applicable results rather than find specific ones, which makes the process take longer. Using a reduced set of data reduces the search time dramatically.
The parameters of the search APIs are not always equivalent to the browse APIs as different underlying technology is used.
Data models
Search result
The search result should return just enough information to be able to display results in a list to the user. It is expected that the UI application will make secondary calls to the browse APIs (by content IDs) to enrich the results or display a media card type screen.
Each object type conforms to the same schema, as seen below:
Suggestion result
The suggestion result returns keywords that match the search term entered so far. Each result includes the entity it corresponds to (for example, content or actor) and the suggested keyword. This allows the UI to either amalgamate the results in one list, or split the list per entity type.
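As an illustration of the two display options, the following Python sketch builds one combined list and one list per entity type. The field names "entity" and "keyword" and the sample data are assumptions made for this example; the actual response format may differ.

```python
from collections import defaultdict

# Hypothetical suggestion results: the field names "entity" and "keyword"
# are assumed for illustration and are not taken from the MDS response schema.
suggestions = [
    {"entity": "actor", "keyword": "John Cusack"},
    {"entity": "content", "keyword": "John Carter"},
    {"entity": "actor", "keyword": "John Hurt"},
]


def split_by_entity(results):
    """Group suggested keywords by entity type, preserving the ranking order."""
    grouped = defaultdict(list)
    for result in results:
        grouped[result["entity"]].append(result["keyword"])
    return dict(grouped)


# Option 1: one amalgamated list, in the order returned.
combined = [result["keyword"] for result in suggestions]

# Option 2: one list per entity type.
by_entity = split_by_entity(suggestions)

print(combined)
print(by_entity)
```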
Use cases
Keyword suggestion
Users expect a UI's search functionality to suggest keywords as they start entering text. They can then either select a suggested keyword to search for or search with the text they have entered. This reduces the time taken to enter the keyword, as users only need to enter part of it before the UI suggests the one they want.
The MDS builds an index of keywords based on the content metadata. By default, the fields indexed for the suggest API are actor and content title, and the ranking of each keyword is based on the occurrence of the keywords in the content metadata, and the relative importance of the field it is within. For example, if an actor such as Tom Cruise is in more movies or TV shows than Tom Hanks, when "Tom" is entered, Tom Cruise will be returned with a higher rank. Other fields can be configured to boost the ranking of a keyword if the keyword is found there, but this type of customisation is project specific.
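The ranking idea can be sketched in Python as a weighted occurrence count. This is a simplified illustration only, not the MDS ranking algorithm: the field weights, the function name, and the sample data are all assumptions.

```python
from collections import Counter

# Hypothetical field weights: both default fields are treated as equally
# important here. Real deployments may weight fields differently.
FIELD_WEIGHT = {"actor": 1.0, "title": 1.0}


def rank_keywords(occurrences, prefix):
    """Rank keywords that contain a word starting with `prefix`.

    `occurrences` holds one (field, keyword) pair per appearance of the
    keyword in the content metadata. Each appearance adds the weight of
    its field to the keyword's score.
    """
    prefix = prefix.lower()
    scores = Counter()
    for field, keyword in occurrences:
        if any(word.lower().startswith(prefix) for word in keyword.split()):
            scores[keyword] += FIELD_WEIGHT.get(field, 0.0)
    return [keyword for keyword, _ in scores.most_common()]


# Sample data: Tom Cruise appears in three items, Tom Hanks in two.
occurrences = (
    [("actor", "Tom Cruise")] * 3
    + [("actor", "Tom Hanks")] * 2
    + [("actor", "Johnny Depp")]
)

print(rank_keywords(occurrences, "Tom"))
# ['Tom Cruise', 'Tom Hanks']
```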
The suggest API matches on the beginning of words only. It does not match on the middle or end of words, as this would require a much larger searchable index and would be confusing to use in practice.
Basic suggest query for a single string with no filter
In the example below, you search for any keywords that contain a word beginning with "Joh":
http://server:port/metadata/solr/[provider]/suggest?q=Joh
If the appropriate metadata existed, then we could expect results such as:
- Johnny Depp – Actor
- John Cusack – Actor
- Being John Malkovich – Title
- John Carter – Title
- John Lithgow – Actor
- John Hurt – Actor
- Dear John – Title
- John Travolta – Actor
- John Barrowman – Actor
- John Tucker Must Die – Title
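A client would normally percent-encode the search term before sending it. The following Python sketch shows one way to build the suggest URL using only the standard library. The function name is hypothetical, and the base URL keeps the "server:port" and "[provider]" placeholders used in this document, which must be replaced with real values.

```python
from urllib.parse import quote, urlencode

# Placeholder base URL copied from this document; replace "server:port"
# and "[provider]" with the values for your deployment.
BASE = "http://server:port/metadata/solr/[provider]"


def suggest_url(term):
    """Return a suggest URL with the search term percent-encoded."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{BASE}/suggest?" + urlencode({"q": term}, quote_via=quote)


print(suggest_url("Joh"))
```

Terms that contain spaces or other reserved characters are encoded in the same way, so the helper can be reused for multi-word terms.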
Suggest query for a single string for titles only
In the example below, you restrict the previous query to content titles only by using a filter:
http://server:port/metadata/solr/[provider]/suggest?q=Joh&fq=entity:content
Based on the above example, the results would be:
- Being John Malkovich – Title
- John Carter – Title
- Dear John – Title
- John Tucker Must Die – Title
Suggest query using two strings
The MDS supports using multiple strings in the query, separated by a space. Again, these must be the beginning of words or full words.
Example: retrieving keyword suggestions for Joh C:
http://server:port/metadata/solr/[provider]/suggest?q=Joh C
Given the results of the first example, you would expect only the following two results:
- John Cusack – Actor
- John Carter – Title
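The start-of-word rule can be approximated locally with the Python sketch below, which applies it to the keywords from the first example. It is an illustration of the matching behaviour described here, not the index implementation, and it assumes that every query term must match the start of some word in the keyword.

```python
def matches(keyword, query):
    """Return True if every query term is a prefix of some word in the keyword."""
    words = keyword.lower().split()
    return all(
        any(word.startswith(term) for word in words)
        for term in query.lower().split()
    )


# Keywords taken from the results of the first example.
keywords = [
    "Johnny Depp",
    "John Cusack",
    "Being John Malkovich",
    "John Carter",
    "John Lithgow",
    "John Hurt",
    "Dear John",
    "John Travolta",
    "John Barrowman",
    "John Tucker Must Die",
]

print([k for k in keywords if matches(k, "Joh C")])
# ['John Cusack', 'John Carter']
```

Note that a term such as "ohn" matches nothing in this sketch, because it is not the beginning of any word.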
Filter a suggest query by scope and domain
You can filter a suggest query by scope and domain. Scope means either live TV data (btv) or On Demand data (vod). Domain means the device and region combination you want to use.
Example 1: Retrieving keyword suggestions for Joh C in On Demand for iPads in Sao Paulo:
http://server:port/metadata/solr/[provider]/suggest?q=Joh C&fq=scope:vod&fq=domain:("ipad|Sao Paulo")
Example 2: Retrieving keyword suggestions for Joh C in Live TV for any device in France:
http://server:port/metadata/solr/[provider]/suggest?q=Joh C&fq=scope:btv&fq=domain:("|France")
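Because these queries repeat the "fq" parameter and contain spaces, quotes, and "|" characters, it can help to build them programmatically. The Python sketch below is one possible approach. The function and parameter names are hypothetical, and the base URL keeps this document's placeholders.

```python
from urllib.parse import quote, urlencode

# Placeholder base URL copied from this document.
BASE = "http://server:port/metadata/solr/[provider]"


def filtered_suggest_url(term, scope=None, domain=None):
    """Build a suggest URL with optional scope and domain filters."""
    # A list of pairs lets the "fq" parameter appear more than once.
    params = [("q", term)]
    if scope is not None:
        params.append(("fq", f"scope:{scope}"))
    if domain is not None:
        params.append(("fq", f'domain:("{domain}")'))
    return f"{BASE}/suggest?" + urlencode(params, quote_via=quote)


# Example 1: On Demand, iPads in Sao Paulo.
print(filtered_suggest_url("Joh C", scope="vod", domain="ipad|Sao Paulo"))

# Example 2: Live TV, any device in France.
print(filtered_suggest_url("Joh C", scope="btv", domain="|France"))
```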
Restrict a suggest query to only certain nodes
You might want your UI to restrict the suggest query to a specific on-demand node and its child nodes. This is especially useful in deployments where the on-demand catalogue is split into distinct areas such as Movies, TV show box sets, Catchup TV, and so on. To use this feature you must specify the node that represents the root of the tree that you want to restrict the results to.
For example, if you want to restrict the results to only the contents of the Movies node (and all its sub-nodes), and the Movies node ID is "Movies_12345", you could use the following query:
http://server:port/metadata/solr/[provider]/suggest?q=Joh C&fq=scope:vod&fq=domain:("STB|UK")&fq=descendantsOf:Movies_12345
Search
The search API allows the UI to provide search functionality. It is based upon the Lucene query engine, so all semantics and search rules from Lucene apply here.
A number of entities from the core discovery APIs are ingested into the search index, including:
- VOD Editorials
- BTV Editorials
- BTV Programmes
- BTV Series
Each entity is represented by the same schema within the search API.
Basic search query with no filter
The simplest search is against all categories without any filtering:
http://server:port/metadata/solr/[provider]/search?q=sta wa
This would match content such as:
- Star Wars (due to start of word matching)
Search query on actor field only
To search a single field only (for example, actors) we can specify the field upfront:
http://server:port/metadata/solr/[provider]/search?q=actors:"tom"
This would bring back content such as:
- Top Gun (starring Tom Cruise)
- Forrest Gump (starring Tom Hanks)
- Inception (starring Tom Hardy)
Search query filtered by actor
Alternatively, if you want to actively search for films starring a specific actor, you can filter as follows:
http://server:port/metadata/solr/[provider]/search?q=top&fq=actors:"tom cruise"
This would only return films starring "tom cruise", so:
- Top Gun
This example introduces an "fq" or "filter query" field. The purpose of this field is to act as a concrete filter, rather than a fuzzy search term (as with "q").
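The split between the search term and its filters can be kept explicit in client code. The Python sketch below is a minimal illustration: the function name is hypothetical, and the base URL keeps this document's placeholders.

```python
from urllib.parse import quote, urlencode

# Placeholder base URL copied from this document.
BASE = "http://server:port/metadata/solr/[provider]"


def search_url(term, filters=()):
    """Build a search URL from one search term and any number of filters."""
    # "q" carries the search term; each filter becomes its own "fq" parameter.
    params = [("q", term)] + [("fq", f) for f in filters]
    return f"{BASE}/search?" + urlencode(params, quote_via=quote)


url = search_url("top", ['actors:"tom cruise"'])
print(url)
```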
Search filtered by rating
A filter in SOLR can also be a range of values. The canonical example of a range filter is to only search within certain ratings or age groups.
For example, say you only want films suitable for the age group 15 or under (in this example, the rating code for the "15" age group is represented by the precedence value 15):
http://server:port/metadata/solr/[provider]/search?q=actors:"tom"&fq=rating.precedence:[0 TO 15]
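A range filter string of this form can be generated with a small helper, as in the Python sketch below. The function name is hypothetical. The mapping from age groups to precedence values is deployment specific, so the value 15 is only the one used in the example above.

```python
def rating_filter(max_precedence, min_precedence=0):
    """Return an inclusive range filter on the rating.precedence field."""
    if min_precedence > max_precedence:
        raise ValueError("min_precedence must not exceed max_precedence")
    # Square brackets denote a range that includes both end values.
    return f"rating.precedence:[{min_precedence} TO {max_precedence}]"


print(rating_filter(15))
# rating.precedence:[0 TO 15]
```

The returned string is the value of the "fq" parameter and still needs to be percent-encoded when it is placed in a URL.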
Search filtered by ancestor
There might be cases where you want to restrict your search to be within a certain catalogue branch (for example, to search through the "Action Movie" catalogue tree).
To achieve this you can use the "descendantsOf" field. This enables you to search through records that appear beneath the "Action Movie" catalogue node:
http://server:port/metadata/solr/[provider]/search?q=sta wa&fq=descendantsOf:Action_Movies
Search filtered by vod scope with sort applied
There are two distinct scopes available within the search results:
- VOD
- BTV
It is a common requirement to search only BTV or VOD results, rather than both at the same time.
You might want to sort your search results, for example alphabetically. Both of these scenarios are included in the following example, where you restrict results to vod content only and sort by title ascending:
http://server:port/metadata/solr/[provider]/search?q=sta wa&fq=scope:vod&sort=title asc
Although there are some scenarios where you will want to sort your search results, it is generally not advised. This is because SOLR already ranks results by suitability, with the first record being the closest search match.
If you use sort, you lose the benefit of SOLR returning the matches in suitability order.
Search filtered by domain (deviceType and region)
A domain is a deviceType and a region joined by a "|", for example "ipad|France".
Where all deviceTypes are appropriate, the deviceType can be left blank.
Equally, where all regions are appropriate, the region can be left blank. For example, "ipad|" means ipad in all regions and "|France" means all devices in France.
Here is an example of any content suitable for devices "iPad" or "Android":
http://server:port/metadata/solr/[provider]/search?q=sta wa&fq=domain:("iPad|" OR "Android|")
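The domain rules above can be captured in two small helpers, shown in the Python sketch below. The function names are hypothetical and the output is the unencoded filter value, which still needs to be percent-encoded before it is sent.

```python
def make_domain(device_type="", region=""):
    """Join a deviceType and a region with "|"; a blank part means "all"."""
    return f"{device_type}|{region}"


def domain_filter(*domains):
    """Return a domain filter that matches any of the given domains."""
    quoted = " OR ".join(f'"{d}"' for d in domains)
    return f"domain:({quoted})"


print(make_domain("ipad", "France"))
# ipad|France
print(domain_filter(make_domain("iPad"), make_domain("Android")))
# domain:("iPad|" OR "Android|")
```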
Using "descendantsOf" and "domain" together might bring unexpected results. An asset can be a descendant of "Action_Movies" and also be in the domain "ipad|", even though the version under "Action_Movies" is not suitable for an ipad. In such a case, the ipad version of the asset is stored under a different catalogue node. The asset is still returned in the results because it is available in "Action_Movies" AND it is available as an ipad asset.
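The Python sketch below illustrates this behaviour with a hypothetical index record. The record layout, the node name "Tablet_Movies", and the function names are assumptions made for the example and are not taken from the MDS schema. The point is only that each filter is checked against the asset as a whole, not against a single catalogue placement.

```python
# Hypothetical catalogue placements for one asset: the set-top box version
# sits under Action_Movies and the ipad version under a different node.
placements = [
    {"node": "Action_Movies", "domain": "STB|"},
    {"node": "Tablet_Movies", "domain": "ipad|"},
]

# Assumed flattened search record: the node and domain values of all
# placements are merged into two independent multi-valued fields.
record = {
    "title": "Example Asset",
    "descendantsOf": sorted({p["node"] for p in placements}),
    "domain": sorted({p["domain"] for p in placements}),
}


def passes_filters(rec, node, domain):
    """Check each filter against the whole record, independently."""
    return node in rec["descendantsOf"] and domain in rec["domain"]


def placed_together(items, node, domain):
    """Check whether a single placement satisfies both conditions."""
    return any(p["node"] == node and p["domain"] == domain for p in items)


print(passes_filters(record, "Action_Movies", "ipad|"))
# True: the asset is returned by the combined filters
print(placed_together(placements, "Action_Movies", "ipad|"))
# False: no ipad version exists under Action_Movies
```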
Search filtered by entity
The SOLR schema caters for multiple entity types, for example:
- series
- content
- programme
You can filter by entity type to search through different types of object.
Below is an example of searching for series only:
http://server:port/metadata/solr/[provider]/search?q=fr&fq=entity:series
This would give results such as:
- Friends Season 1
- Friends Season 2
- Friends Season 3