News API v1 - Developer's Guide

Query Parameters

The News API supports the following query parameters:

| Parameter | Usage | Type | Default | feeds | schedules | packages | assets | galleries |
|---|---|---|---|---|---|---|---|---|
| apikey | Specify the APIKEY for the request | string | | Y | Y | Y | Y | Y |
| accept | Specify the preferred response format | enum | json | Y | Y | Y | Y | Y |
| callback | Specify the JSON-P callback function name | string | | Y | Y | Y | Y | Y |
| query | Specify text search criteria for filtering the results | string | | Y | N | N | Y | N |
| date | Specify a date to constrain the results | date | | Y | N | N | Y | N |
| from | Return only results at or after the specified date/time | datetime | | Y | N | N | Y | N |
| to | Return only results prior to the specified date/time | datetime | | Y | N | N | Y | N |
| when | Return only results from within the specified timeframe window | enum | | Y | N | N | Y | N |
| tag | Return only results with the specified tag | string | | Y | Y | N | Y | N |
| limit | Maximum number of results to return | integer | 10 | Y | N | N | Y | N |
| offset | The index of the first result into the total result set | integer | 1 | N | N | N | Y | N |
| sortby | The sort order for the results | enum | modified:desc | Y | N | N | Y | N |
| facets | The facets to include in the response | enum | | Y | N | N | Y | N |
| crop | The media crops to include in the response | enum | | Y | Y | Y | Y | Y |

These parameters may be available in one or more endpoints, as indicated above, and behave contextually. Undocumented parameters may have no effect, or an undesirable effect, on the response. Passing an incorrect value to a parameter may cause a request error.

apikey

The request APIKEY may be passed as a query parameter instead of an HTTP header, and is required for every request.

accept

The accept parameter is used, as an alternative to the Accept HTTP header, in order to change the response format of the API. By default, the API responds in JSON.

If provided, the accept parameter supersedes any value specified by the Accept HTTP header. We have observed that many client libraries and applications, including low-level ones such as curl, send Accept: */* when the developer does not specify their own headers. If the header took precedence, such requests would cause the News API to respond with JSON (its preferred response format) regardless of the value of the accept parameter.
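The precedence described above can be modelled on the client side. The following is a minimal sketch, not the server's implementation: the function name resolve_format is hypothetical, the media-type mapping is taken from the Response formats table in this guide, and only the first media type in the header is inspected (q-values are ignored, which is a simplifying assumption).

```python
from typing import Optional

# Mapping taken from the "Response formats" table in this guide.
ACCEPT_HEADER_FORMATS = {
    "application/json": "json",
    "application/atom+xml": "atom",
    "application/xml": "atom",
    "application/rss+xml": "rss",
}
KNOWN_FORMATS = {"json", "atom", "rss"}


def resolve_format(accept_param: Optional[str], accept_header: Optional[str]) -> str:
    """Predict the response format: accept parameter first, then header, then JSON."""
    if accept_param:
        if accept_param not in KNOWN_FORMATS:
            raise ValueError(f"unknown accept value: {accept_param!r}")
        return accept_param
    if accept_header:
        # Assumption: only the first listed media type is considered.
        media_type = accept_header.split(",")[0].split(";")[0].strip().lower()
        # Anything unrecognised (including */*) falls back to the JSON default.
        return ACCEPT_HEADER_FORMATS.get(media_type, "json")
    return "json"


print(resolve_format("rss", "*/*"))  # the parameter wins over the header
print(resolve_format(None, "*/*"))   # no parameter, wildcard header
```

With this model, a curl-style request that sends Accept: */* but also sets accept=rss is still expected to receive RSS.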

callback

The callback parameter can be used by JSON-P clients to set the name of the callback function. When it is set, the Content-Type of the API response changes to application/javascript.

The callback parameter has no effect if the requested response format is not JSON.

query

The query parameter is used for specifying the search terms to be used when filtering the results. It supports boolean operators, quoted phrases, parentheses, and the negation of terms by preceding them with a minus sign.

The parameter is only effective if search capability is enabled for the APIKEY in use, and if the endpoint called supports it (e.g. assets).

A string query is a plain text search string composed of terms, phrases, and operators that can be easily composed by end users typing into an application search box. For example, 'cat AND dog' is a string query for finding documents that contain both the term 'cat' and the term 'dog'.

The search engine provides a robust ability to generate complex queries. The following are some examples:

(cat OR dog) NEAR vet

at least one of the terms cat or dog within 10 terms of the word vet

dog NEAR/30 vet

the word dog within 30 terms of the word vet

cat -dog

the word cat where there is no word dog.

For more details, see the Query Grammar page.
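Query strings such as the examples above contain spaces and parentheses, so they need URL-encoding before being sent. A minimal sketch using only the Python standard library follows; the host api.example.com and the helper name build_search_url are hypothetical placeholders, not part of the News API.

```python
from urllib.parse import urlencode

# Hypothetical host: substitute the real News API base URL.
BASE_URL = "https://api.example.com/v1/assets"


def build_search_url(query: str, apikey: str) -> str:
    """Build an assets search URL with the query text safely URL-encoded."""
    params = {"apikey": apikey, "query": query}
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("(cat OR dog) NEAR vet", "YOUR_APIKEY"))
print(build_search_url("cat -dog", "YOUR_APIKEY"))
```

urlencode encodes spaces as + and parentheses as %28 and %29, so the operators reach the API intact.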

date

The date filter criterion restricts the results to articles most recently published on the specified calendar day.

Dates should be specified in CCYY-MM-DD format. Optionally, a timezone can be specified e.g. 2014-06-03+01:00 in order to clarify in which timezone the date should be interpreted, i.e. when midnight is considered to be. If no timezone offset is specified, the date is assumed to be Europe/London local time for the specified day - matching the Press Association's news cycle.

from/to

The from and to parameters provide fine-grained control over the date and time range filter for the search results. Both must be specified as ISO 8601 formatted datetimes, in the form CCYY-MM-DDThh:mm:ss.sssZ.

As with date, if no timezone is specified then Europe/London local time is assumed. Where this is invalid (a time is specified but the clocks would have jumped forwards over that time), the filter will be rejected with an error. Where the time is ambiguous (i.e. the clocks go backwards and the same clock time appears twice on the given day) then the earlier time is assumed.
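A client can check for these two cases before sending a request. The sketch below uses Python's zoneinfo module (Python 3.9+, which needs time zone data on the system); the function names are hypothetical, and the logic only mirrors the behaviour described above rather than reproducing the API's own validation.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")


def exists_in_london(naive: datetime) -> bool:
    """Return False if the wall-clock time was skipped when clocks went forwards."""
    aware = naive.replace(tzinfo=LONDON, fold=0)
    # A skipped time does not survive a round trip through UTC unchanged.
    round_trip = aware.astimezone(timezone.utc).astimezone(LONDON)
    return round_trip.replace(tzinfo=None) == naive


def resolve_london_time(naive: datetime) -> datetime:
    """Attach Europe/London, rejecting skipped times and preferring the earlier
    instant when the same clock time occurs twice (fold=0)."""
    if not exists_in_london(naive):
        raise ValueError(f"{naive.isoformat()} does not exist in Europe/London")
    return naive.replace(tzinfo=LONDON, fold=0)


# 01:30 on 26 October 2014 occurs twice; the earlier one is still on BST (+01:00).
print(resolve_london_time(datetime(2014, 10, 26, 1, 30)).isoformat())
# 01:30 on 30 March 2014 was skipped when the clocks went forwards.
print(exists_in_london(datetime(2014, 3, 30, 1, 30)))
```

Sending an explicit offset with every from and to value avoids both cases entirely.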

Note that from is inclusive, but to is exclusive. This enables you to easily search on day boundaries by setting e.g. from=2014-06-03T00:00:00+01:00 and to=2014-06-05T00:00:00+01:00 and receive only items published on 3rd and 4th of June, and none published at or after midnight on the 5th, without having to set to=2014-06-04T23:59:59.999+01:00 and having to worry about millisecond precision for the end criteria.
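The half-open range makes whole-day searches easy to generate. The following sketch, assuming Europe/London day boundaries and a hypothetical helper name whole_days_range, produces the from and to values for an inclusive span of calendar days.

```python
from datetime import date, datetime, timedelta
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")


def whole_days_range(first_day: date, last_day: date) -> dict:
    """Return from/to values covering first_day through last_day inclusive."""
    if last_day < first_day:
        raise ValueError("last_day must not be before first_day")
    start = datetime(first_day.year, first_day.month, first_day.day, tzinfo=LONDON)
    # 'to' is exclusive, so it is midnight at the start of the following day.
    day_after = last_day + timedelta(days=1)
    end = datetime(day_after.year, day_after.month, day_after.day, tzinfo=LONDON)
    return {"from": start.isoformat(), "to": end.isoformat()}


print(whole_days_range(date(2014, 6, 3), date(2014, 6, 4)))
```

When these values are placed in a URL, the + in the offset must be percent-encoded (as %2B); urllib.parse.urlencode does this automatically.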

when

The when parameter is an alternative time-based banding criterion that recognises that searches for news are generally biased towards recency. when offers the following time periods:

| Value | Definition |
|---|---|
| today | Articles that were most recently updated today |
| yesterday | Articles that were most recently updated yesterday |
| 7-days | Articles that were most recently updated in the past 7 days (including today and yesterday) |
| 30-days | Articles that were most recently updated in the past 30 days (including today, yesterday and the past 7 days) |
| older | Articles that were most recently updated more than 30 days ago |

The calendar days used are relative to Europe/London local time, to match the Press Association's news cycle.
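Because the bands overlap and are anchored to London calendar days, a client-side model can help when reasoning about which bands an article falls into. The sketch below is illustrative only: the function name when_bands is hypothetical, and the exact boundaries (7-days meaning today plus the six preceding days, older starting at 30 days) are assumptions that this guide does not spell out.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")


def when_bands(updated: datetime, now: datetime) -> list[str]:
    """Return every 'when' band an article falls into, using London calendar days."""
    age_days = (now.astimezone(LONDON).date() - updated.astimezone(LONDON).date()).days
    if age_days < 0:
        raise ValueError("updated is later than now")
    bands = []
    if age_days == 0:
        bands.append("today")
    if age_days == 1:
        bands.append("yesterday")
    if age_days < 7:   # assumed boundary
        bands.append("7-days")
    if age_days < 30:  # assumed boundary
        bands.append("30-days")
    else:
        bands.append("older")
    return bands


now = datetime(2014, 6, 10, 12, 0, tzinfo=timezone.utc)
# 23:30 UTC on 9 June is already 00:30 on 10 June in London (BST).
print(when_bands(datetime(2014, 6, 9, 23, 30, tzinfo=timezone.utc), now))
```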

The when parameter may be returned as a facet in search-style responses (see below).

tag

The tag constraint is used to filter search results based on various types of classification, including (but not limited to) the traditional PA topic schemes. Multiple tags can be specified by repeating the constraint, in which case only articles that match both tags will be returned (i.e. a logical AND). More complex searching based on tag can be performed by including it as a term in the query parameter (e.g. to search for Politics articles that are not also tagged Environment, you can do query=tag:patopic:POLITICS -tag:patopic:ENVIRONMENT). See the query grammar for more details.

The tag parameter may be returned as a facet in search-style responses (see below).
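The two ways of filtering by tag can be compared side by side. This is a minimal sketch using the Python standard library; the helper names are hypothetical, and the tag values are the ones used in the examples above.

```python
from urllib.parse import urlencode


def tags_all_of(tags: list[str]) -> str:
    """Repeat the tag parameter once per tag (articles must match all of them)."""
    return urlencode([("tag", tag) for tag in tags])


def tag_but_not(include: str, exclude: str) -> str:
    """Express 'tagged include but not exclude' through the query parameter."""
    return urlencode({"query": f"tag:{include} -tag:{exclude}"})


print(tags_all_of(["patopic:POLITICS", "patopic:COMMONS"]))
print(tag_but_not("patopic:POLITICS", "patopic:ENVIRONMENT"))
```

Passing a list of pairs to urlencode, rather than a dict, is what allows the same parameter name to appear more than once.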

limit

The limit and offset parameters are used to scroll/page through the set of available results for a given search.

limit affects the number of results returned in each page. In most cases the default (if no limit is specified) is 10 results; this can be raised up to a maximum of 100. Setting it to a value greater than 100 will result in an error response.

offset

The offset value affects where in the result set the page starts. By default (if no offset is specified) this will be 1, i.e. the first available result. This can be altered up to a maximum of 500. Setting it to a value greater than 500 will result in an error response.
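Since offset is 1-based and capped at 500, page numbers translate to offsets as sketched below. The helper names are hypothetical; the limits are the ones stated in this section.

```python
MAX_LIMIT = 100   # largest permitted limit, per this guide
MAX_OFFSET = 500  # largest permitted offset, per this guide


def page_params(page: int, limit: int = 10) -> dict:
    """Return limit/offset parameters for a 1-based page number."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if page < 1:
        raise ValueError("page numbers start at 1")
    offset = (page - 1) * limit + 1  # offset 1 is the first available result
    if offset > MAX_OFFSET:
        raise ValueError(f"page {page} starts beyond offset {MAX_OFFSET}")
    return {"limit": limit, "offset": offset}


def last_reachable_page(limit: int = 10) -> int:
    """Highest page number whose starting offset does not exceed MAX_OFFSET."""
    return (MAX_OFFSET - 1) // limit + 1


print(page_params(3, limit=25))
print(last_reachable_page(100))
```

To reach results beyond the offset cap, narrow the search (for example with from and to) rather than paging further.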

sortby

By default, search results are returned in newest-first order. This can be changed via the sortby parameter. Permitted values are:

| Value | Effect |
|---|---|
| relevance | Search results are ranked based on terms in the query. The BOOST mechanism can also be used to weight specific search terms (see the query grammar page for more details). |
| modified:desc | Search results are ordered with the most recently updated items first. This is the default. |
| modified:asc | Search results are ordered with the most recently updated items last, i.e. oldest first. |

facets

The facets parameter is used to enable facet information in a search response, such as from an assets list call. By default no facets are calculated, since calculating them has an impact on search time. Facets are calculated for the total set of results matched by the current call, providing values that can be used to drill down further in the next call. Each facet returns the top 10 (or fewer) most frequently occurring values of that facet type. The id of each facet value returned may be used as a query parameter value, to return the set counted by the facet. Possible facet values are:

| Facet | Description |
|---|---|
| tag | The most frequent tag values in the result set |
| when | A set of bucketed facets covering fixed date-time ranges, e.g. today, yesterday |

crop

The crop parameter is used to enable media crop information in any response that exposes image media, such as an asset or package.

The parameter accepts an optional comma-separated list of media crops. This provides details of PA's recommended crop co-ordinates for the specified crop shapes, to be applied to the full size rendition of the image. Please note that the News API does not perform any cropping of the images.

Available shapes are: POI, SQUARE, THREE_TO_TWO, FOUR_TO_THREE, SIXTEEN_TO_NINE, FIVE_TO_FOUR, TWO_TO_ONE, SEVEN_TO_TWO.

POI means point-of-interest and is just a co-ordinate, not a rectangle, and may be used to identify the focal point of the image.
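A small helper can guard against misspelled shape names before a request is sent. This is only a sketch: the function name crop_param is hypothetical, and the shape list is copied from the one above.

```python
# Crop shapes listed in this guide.
CROP_SHAPES = (
    "POI", "SQUARE", "THREE_TO_TWO", "FOUR_TO_THREE",
    "SIXTEEN_TO_NINE", "FIVE_TO_FOUR", "TWO_TO_ONE", "SEVEN_TO_TWO",
)


def crop_param(shapes: list[str]) -> str:
    """Return a comma-separated crop value, rejecting unknown shape names."""
    unknown = [shape for shape in shapes if shape not in CROP_SHAPES]
    if unknown:
        raise ValueError(f"unknown crop shapes: {unknown}")
    return ",".join(shapes)


print(crop_param(["POI", "SIXTEEN_TO_NINE"]))
```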

Response formats

The News API offers the following response formats:

| Format | Accept header | accept parameter |
|---|---|---|
| JSON | application/json | json |
| ATOM | application/atom+xml or application/xml | atom |
| RSS | application/rss+xml | rss |

Regardless of format, there are 3 kinds of response that can be provided based upon the endpoint. Furthermore, the level of detail provided in each response can vary according to certain defaults, or overriding query parameters. Generally a list response will provide a summary of each item, whilst an individual fetch will provide a full item.

Where a list or search of assets is provided, the level of detail for each asset varies across the endpoints in order to cater to the most likely use case for that endpoint and format. For example, the ATOM response for a package provides the full size assets by default, for convenience.

| Endpoint | Response type | JSON item | ATOM entry | RSS item |
|---|---|---|---|---|
| /v1 | list of endpoints | | | |
| /v1/feeds | list of feeds | feed summary | feed summary | feed summary |
| /v1/feeds/{feed} | search of assets | asset summary | asset summary | full asset |
| /v1/assets | search of assets | asset summary | asset summary | full asset |
| /v1/assets/{asset} | asset | full asset | full asset | full asset |
| /v1/assets/{asset}/associatedMedia | list of media | asset summary | asset summary | asset summary |
| /v1/schedules | list of schedules | schedule summary | schedule summary | schedule summary |
| /v1/schedules/{schedule} | list of assets in schedule | asset summary | full asset | full asset |
| /v1/packages | list of packages | package summary | package summary | package summary |
| /v1/packages/{package} | list of assets in package | asset summary | full asset | full asset |
| /v1/galleries | list of galleries | gallery summary | gallery summary | gallery summary |
| /v1/galleries/{gallery} | gallery asset | full gallery | full gallery | full gallery |

list response

In a list response, the items are returned as a JSON array, or as an ATOM or RSS feed. List responses are not paginated and do not include facets.

Lists are for short finite sets of results - be that a list of options, or an editorially curated list of content on a given theme.

search response

Search responses look like list responses at a high level, but contain extra information in order to allow for paging through the results. The search response is used where the potential number of results is not bounded.

Search result response formats include OpenSearch information in order to assist the API client to page through the result set, and a list of facets - a breakdown of subsequent search terms along with counts, in order to provide guided navigation of the result set.

summary asset response

The summary asset response is used in a context where the API client is expected to be searching or browsing for a single item that they will eventually consume. It provides a headline and abstract for each story, along with its classification metadata, and a thumbnail image if available.

full asset response

The full asset response contains all of the same metadata as the summary response, but also includes the full body text of the article, and references all of the attached media items rather than just one.

 

Facets

The News API can provide facets in all search-style responses. The facets provide a breakdown of the results of the search into how many of those results also match the given term, providing the API consumer with guided navigation to help refine the search further to find the content they are looking for. Currently, facets are provided for both the when and tag search constraints.

For when, the News API returns each of the bands that contains at least one search result, with a count of how many results fall into that band.

For tag, it returns a list (maximum of 10), in decreasing order of popularity, of the tags used across the result set.

Facets are available in all three response formats. In JSON, they look like this:

"facets": {
  "tag" : {
    "terms" : [
      { "id":"pacategory:HHH", "name":"General News", "count":2532 }, 
      { "id":"patopic:POLITICS", "name":"Politics", "count":1234 }, 
      { "id":"patopic:COMMONS", "name":"Commons", "count":973 } 
    ]
  }  
}

In the XML formats (ATOM and RSS), they appear like this:

<pa:facets>
   <pa:facet field="tag" term="pacategory:HHH" name="General News" count="2532"/>
   <pa:facet field="tag" term="patopic:POLITICS" name="Politics" count="1234"/>
   <pa:facet field="tag" term="patopic:COMMONS" name="Commons" count="973"/>
</pa:facets>
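Facet ids from the JSON form can be fed straight back in as tag values to drill down. The sketch below assumes that the "facets" member sits at the top level of the JSON response object (the fragment above does not show its parent), and the helper name tag_drill_downs is hypothetical.

```python
import json
from urllib.parse import urlencode

# The JSON fragment from this guide, wrapped in an object (an assumption).
SAMPLE = json.loads("""
{
  "facets": {
    "tag": {
      "terms": [
        { "id": "pacategory:HHH", "name": "General News", "count": 2532 },
        { "id": "patopic:POLITICS", "name": "Politics", "count": 1234 },
        { "id": "patopic:COMMONS", "name": "Commons", "count": 973 }
      ]
    }
  }
}
""")


def tag_drill_downs(response: dict, base_params: list) -> list:
    """Return (name, count, query string) for each tag facet in a response."""
    terms = response.get("facets", {}).get("tag", {}).get("terms", [])
    results = []
    for term in terms:
        # Add the facet id as an extra tag constraint on top of the current search.
        params = list(base_params) + [("tag", term["id"])]
        results.append((term["name"], term["count"], urlencode(params)))
    return results


for name, count, query_string in tag_drill_downs(SAMPLE, [("query", "budget")]):
    print(f"{name} ({count}): ?{query_string}")
```

Each query string repeats the original search and adds one tag constraint, so the follow-up call should return the subset counted by that facet.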
