Campylobacter is a leading cause of acute bacterial gastroenteritis in high-, middle-, and low-income countries. The number of confirmed cases has continued to increase across countries of the European Union (from 214,000 in 2013 to 246,000 in 2016 and 2017) [1], and over 800,000 cases are estimated to occur annually in the United States (data from 2000 to 2008) [2]. In low-income countries, Campylobacter is increasingly implicated in growth faltering among children under 2 years of age [3].
Chicken products have been identified as an important risk factor for human infection by a variety of techniques, including natural experiments, case–control studies, and, increasingly, genotypic methods [4–10]. Other infection sources identified by observational epidemiological studies include cattle, sheep, pigs, wild birds, and the environment [10].
Alongside epidemiological studies, population genetic analyses are increasingly used to attribute human cases to likely sources. In these analyses, the genetic diversity of isolates from humans is compared with that of Campylobacter isolates collected from possible sources of infection, allowing quantitative attribution of cases to these sources.
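To make the idea of quantitative attribution concrete, the following is a minimal sketch of one very simple scheme: each human isolate is shared among candidate sources in proportion to how common its sequence type is in each source's reference collection. This is an illustrative simplification and not necessarily the algorithm used in any of the studies cited here. The function name `attribute_cases`, the source labels, and the sequence type numbers are all hypothetical toy values.

```python
from collections import Counter


def attribute_cases(human_sts, reference_sts):
    """Share each human isolate among sources by relative ST frequency.

    human_sts: list of sequence types (STs) observed in human cases.
    reference_sts: dict mapping source name -> list of STs from that source.
    Returns (proportions, unassigned): the fraction of assignable human
    cases attributed to each source, and the number of human isolates
    whose ST was not seen in any reference collection.
    """
    # Relative frequency of every ST within each source's collection.
    freqs = {}
    for source, sts in reference_sts.items():
        counts = Counter(sts)
        freqs[source] = {st: n / len(sts) for st, n in counts.items()}

    attributed = {source: 0.0 for source in reference_sts}
    unassigned = 0
    for st in human_sts:
        weights = {s: freqs[s].get(st, 0.0) for s in reference_sts}
        total = sum(weights.values())
        if total == 0:
            # ST absent from all reference data: cannot be attributed.
            unassigned += 1
            continue
        for s, w in weights.items():
            attributed[s] += w / total

    n_assigned = len(human_sts) - unassigned
    proportions = {
        s: (v / n_assigned if n_assigned else 0.0)
        for s, v in attributed.items()
    }
    return proportions, unassigned


# Toy data only (hypothetical ST numbers and sources).
reference = {
    "chicken": [21, 21, 45, 257],
    "cattle": [21, 61, 61, 42],
}
human = [21, 45, 61, 999]

proportions, unassigned = attribute_cases(human, reference)
print(proportions, unassigned)
```

A scheme of this kind ignores sampling uncertainty and the genetic relatedness between sequence types, and its output depends heavily on the size and composition of the reference collections.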
Multilocus sequence typing (MLST) data [8] have become the standard input for such population genetic analyses, the results of which are generally consistent with the findings of epidemiological analyses [11,12]. Large collections of isolates from a wide range of sources have been sequenced at the MLST loci. These approaches provide a potential means of monitoring changes in the sources of human infection, for example changes that occur as a consequence of public health and food chain interventions [13]. Insights obtained from seven-gene MLST analyses can also inform analyses using more extensive genomic data, as large, well-sampled datasets of whole-genome sequenced (WGS) isolates accumulate from humans and putative sources. Other techniques, such as multiplex PCR, PFGE, and comparative genomic fingerprinting, have not been taken up widely and are not compatible with whole-genome-based approaches.
Studies analysing MLST data vary in both the analytical algorithm applied and the reference datasets used [13–18] (throughout this paper, ‘reference’ data are data from known reservoirs, such as animal species, that can act as sources of human infection). Our objectives here, concerning the use of MLST analysis to attribute human infections to sources, are to: (i) summarise the findings from these studies to date; (ii) describe the approaches used; and (iii) identify lessons to guide further genetic source attribution work using these data and more extensive genomic data as they become available.
