## Abstract

Detrital zircon U–Pb geochronology is a rapidly expanding and useful technique for addressing the sedimentary rock record. However, because of difficulties in evaluating discordant analyses, common in most datasets, a high proportion of results may be discarded. These analyses, if interpreted correctly, can provide valuable information regarding the deformation, alteration or metamorphism of the zircon population, as these events are all capable of producing U–Pb discordance. A novel modelling procedure permits analysis of probabilistic relationships within U–Pb datasets to deconvolve the age information present within discordant analyses by assessing the relative likelihood of potential discordia lines. Additionally, because we retrieve useful information from discordant data, a stricter filter can be used to assess concordance, increasing confidence in those distributions. The validity of this modelling method is demonstrated using two previously published cases from the Caledonide orogen where clearly discordant analyses exist. In the Southern Uplands of Scotland, these analyses indicate resetting of the U–Pb systematics in metasedimentary rocks in the Grampian orogen prior to Ordovician erosion and redeposition. In the second case, from the Greenland Caledonides, discordant data provide additional constraints on previously proposed *in situ* resetting during Scandian deformation and fluid flow events.

**Supplementary material:** The code used for writing the model, along with an instruction guide, is available at http://doi.org/10.6084/m9.figshare.c.2182197.

The burgeoning field of detrital zircon U–Pb analysis has evolved into an indispensable tool for investigating the sedimentary rock record. Applications of detrital zircon data have been found in provenance and source characteristics studies as well as tectonic reconstructions and terrane analysis (e.g. Fedo *et al.* 2003; Andersen 2005; Cawood *et al.* 2007; Gehrels 2011, 2014). However, despite the clear importance of the data, many disagreements exist regarding the interpretation and visualization of U–Pb ages (e.g. Sircombe 2004; Vermeesch 2012; Gehrels 2014). Some of the most pressing issues confronting those working with detrital zircon datasets relate to biases inherent to the detrital record, and development of statistical techniques designed to help avoid them (e.g. Fedo *et al.* 2003; Nemchin 2005; Hietpas *et al.* 2011*b*; Gehrels 2011; Vermeesch 2012; Malusà *et al.* 2013). Many of the possible sources of bias within detrital zircon datasets are well documented and may be avoided by careful sample selection and processing (e.g. Sircombe & Stern 2002; Sláma & Kosler 2012) and analysis of a statistically representative proportion of grains (Sircombe 2000; Vermeesch 2004; Andersen 2005; Nemchin 2005; Pullen *et al.* 2014). More controversial biasing issues relate to the reduction, filtering and plotting of detrital zircon data after analysis (e.g. Carter & Moss 1999; Hietpas *et al.* 2011*b*; Sláma & Košler 2012; Malusà *et al.* 2013). Some of the many challenges with the interpretation of detrital zircon U–Pb data are due to discordant analyses, which are prevalent in the vast majority of datasets.

Zircon U–Pb geochronology relies on the presence of two independent chronometers, ^{206}Pb/^{238}U and ^{207}Pb/^{235}U (for a detailed review of U–Pb geochronology, see Schoene 2014). When the ages of these two isotopic systems agree the analysis is said to be concordant; the value plots along a curve termed a concordia (Wetherill 1956) in U–Pb isotopic space, and the age is interpreted as reliable. If, however, the two ages are not in agreement the analysis is labelled discordant, and deemed unreliable. Discordance may be created in a variety of ways (for an excellent discussion, see Schoene 2014), the most common being (1) removal of Pb from the crystal structure by fluid alteration or metamorphism (Pb-loss) and (2) mixing of zircon components with different ages during analysis (e.g. cores and rims). Discordant analyses contain valuable, although difficult to interpret age information, as the ages of different events affecting the zircon are recorded in the U–Pb data, but may not be clearly apparent when dealing with detrital zircon populations (Fig. 1). During processing of detrital zircon data, a subjective filter is often placed on the dataset, typically between 5 and 30% discordance; any analysis more discordant than the cut-off is deemed unreliable and excluded from interpretation and further analysis. In datasets containing many discordant zircons, a situation that occurs often, a high proportion of analyses may be removed (>30% is not uncommon). We note that discordance may be removed by techniques such as chemical abrasion pre-analysis (Crowley *et al.* 2014; Von Quadt *et al.* 2014; Solari *et al.* 2015), although, as we show here, valuable information can be extracted from discordant analyses using our technique.

Filtering of discordant data may impose biases onto the final age distribution, as distinct populations of zircons may be selectively removed depending on factors such as age and alpha-dose that contribute to metamictization and therefore to Pb-loss. When analyses are discarded owing to discordance, any sophisticated quantitative methods that are carried out to compare age distributions within and between samples, such as Kolmogorov–Smirnov tests (Press *et al.* 1986), tests for overlap and similarity (Gehrels 2000), likeness metrics (Satkoski *et al.* 2013) or cluster analyses (Ferreira *et al.* 2014), will carry biases imparted by the original discordance filter.

More importantly, although a single discordant zircon analysis is impossible to interpret independently, the analysis itself carries valuable geological information (i.e. crystallization ages and timing of Pb-loss; Fig. 1). Pre-depositional metamorphism of populations of detrital zircons, which potentially creates discordance, is an indicator of multi-stage zircon recycling prior to deposition, with important implications for the tectonic setting and provenance of the studied sediments (e.g. Eriksson *et al.* 2004; Campbell *et al.* 2005; Waldron *et al.* 2008). Also, syn- or post-deposition discordance may result from fluid alteration or metamorphism (e.g. Balan *et al.* 2001; Kramers *et al.* 2009; Morris *et al.* 2015). Therefore, both pre- and post-depositional overprinting events may have important geological implications. However, because most discordant data points are typically removed from detrital zircon datasets, this information is often lost.

Here we present a novel method for analysing the probabilistic relationships within discordant portions of U–Pb datasets and extracting age information. Our procedure provides two major advances in the evaluation of detrital zircon datasets. First, we are able to obtain age information from the discordant data, valuable information that is often discarded, by using a procedure that does not require an arbitrary discordance filter. Potential age information extracted from discordant analyses using our procedure can then be compared both within the dataset, with truly concordant data from the same sample, and between samples, as is currently done with concordant data. The second major advantage of our new technique relates to the trustworthiness of ‘concordant’ data in detrital zircons suites. Because we are now able to obtain age information from all the data, including discordant analyses, a much stricter discordance filter (e.g. 3–5%) may be confidently applied to the dataset, thereby increasing the reliability of age distributions produced with these data. These two benefits represent a significant advance in the visualization and analysis of detrital zircon suites, and allow for a statistical treatment of discordant detrital zircon data without any *a priori* knowledge or assumptions regarding the resetting history of the zircon grains.

Here we apply our modelling procedure to two studies in the Caledonide orogen where attempts have been made to interpret discordant analyses within datasets. In Ordovician turbidites from the Southern Uplands terrane of Scotland, Waldron *et al.* (2008) interpreted discordant data as indicating metamorphic reworking of Archaean zircon within the early Palaeozoic Grampian orogen, prior to their redeposition in the subduction trench that preceded Iapetus closure during the Silurian. In a second study, Morris *et al.* (2015) noted that discordant data from detrital zircons within Silurian turbidites in the Caledonian foreland basin of northern Greenland implied post-depositional partial to complete lead-loss. They used a simple modelling technique to interpret the most likely timing of lead-loss at 380 Ma, during late stages of collision, by comparing the distributions of concordant and discordant data, arbitrarily separated at a 10% level of discordance.

This modelling strategy may also be applied to many other geological problems associated with complex U–Pb datasets including lower crustal zircons, multiply-metamorphosed igneous rocks, and datasets containing other high U, low-common-Pb minerals susceptible to fluid overprinting and discordance (e.g. titanite, xenotime, rutile), although only the implications for detrital zircon datasets are discussed here. This method may also prove powerful when used to model datasets that include multi-mineral U–Pb analyses such as paired zircon–monazite data (e.g. Hietpas *et al.* 2010, 2011*a*). Additionally, although many groups do not currently publish discordant data, our method allows for mining of this portion for useful information, including legacy data.

## Methods

In this section we first provide a broad overview of our strategy, followed by the mathematical theory behind our modelling method, and finally a discussion of the visualization outputs. The model is written in the R coding language (R Core Team 2014).

### Concept

Zircon grains that have been reset typically show isotopic compositions that lie on a discordia line (Wetherill 1956) in U–Pb space, connecting the older crystallization age (the upper intercept) with the younger modification age (the lower intercept). The lower intercept may represent a time of Pb-loss or new zircon growth. Typically, each analysis is represented by an error ellipse determined from the variation in isotopic ratios recorded during analysis, and discordia lines are chosen to intersect these ellipses. Error ellipses, typically drawn at 2σ, are convenient boundaries representing the region of greatest probability density associated with each analysis (Fig. 1).

Our model defines a discrete grid of chords (straight lines in U–Pb space) within a Wetherhill concordia plot with pre-defined upper and lower intercepts (Fig. 2). These chords represent discordia lines that might exist within a detrital zircon suite. We then evaluate the summed probability density imparted by zircon analyses along each of these lines. The summed probability density can be used as an estimate of the likelihood that each line represents a true discordia line.

Chords within our model are created by defining a suite of nodes along concordia, evenly spaced in time. For clarity, the example in Figure 2 has a node spacing of 400 myr whereas a typical analysis has a 5 myr node spacing (selectable by the user). Each node is then connected to every other node, creating a grid of chords. Then, using the equations for U–Pb linear regression outlined by Davis (1982) and summarized below, the summed probability density contributed by U–Pb analyses along each chord is computed. Each chord has a unique upper and lower intercept age pair along with an associated summed probability density, essentially a measure of the likelihood that a particular line contains discordant data.

### Statistical analysis of the discordant population

The equations from Davis (1982) are related to those used for linear regression of bivariate normal datasets (Ludwig 2003). Davis (1982) defined a suite of functions that rotate a line with initial slope and *y*-intercept (*a* and *b* respectively) into the coordinate system normal to an individual U–Pb error ellipse. Each U–Pb data point is defined by a bivariate normal distribution, but rotated owing to the correlation between the ^{207}Pb/^{235}U and ^{206}Pb/^{238}U ratios, expressed as ρ. The rotation of the line, with initial slope (*a*) and *y*-intercept (*b*) results in a new line, oriented normal to a given U–Pb uncertainty ellipse, with slope (*a*_{i}) and *y*-intercept (*b*_{i}), given by
(1)and
(2)where , represents the location of the data point, while σ_{x} and σ_{y} represent the uncertainty associated with *X*_{i} and *Y*_{i}, respectively. It should be noted that Davis (1982) defined slope as *b* and *y*-intercept as *a*, whereas we use the conventional notation here.

Once rotated, the probability density for the data point (*X*_{i}, *Y*_{i}, σ_{x}, σ_{y}, ρ) along a line with given parameters *A*_{i} and *B*_{i} is given by
(3)where *A*_{i} and *B*_{i} are defined as
We have also included a simple weighting metric to devalue concordant analyses in our model. Because we are interested in finding linear relationships within discordant data we need to weight against concordant analyses. If discordant data remain unweighted, the model records artificially high likelihood along lines connecting clusters of concordant analyses that have no geological meaning. The weighting is performed by taking the final result of equation (3) and multiplying by the discordance, defined as
(4)where λ_{238} is the ^{238}U decay constant and *t*_{76} is the ^{207}Pb/^{206}Pb age of the analysis.

Using this approach, a nearly concordant analysis (1% discordant) will contribute much less probability to the final distribution than a discordant analysis (10% discordant). We have found this strategy more effective than applying a discordance filter and using only analyses deemed discordant. The sum of equation (3) weighted using equation (4) for a unique chord (*a, b*) over all data points (defined by *x*_{i}, *y*_{i}, σ_{x}, σ_{y} and ρ) in a given dataset will give the sum probability density, which we have termed likelihood. This metric may then be applied to each chord to determine the relative likelihood for potential discordia lines.

Some additional complication arises when analysing U–Pb data owing in part to the variable analytical uncertainty, or heteroscedasticity, of typical zircon U–Pb data. Although U contents in zircon U–Pb analyses may play a role in the final uncertainty of the analysis, the heteroscedasticity in detrital zircon populations is principally analytical. The age uncertainty of a single analysis broadly scales with age; 2σ uncertainties are typically 1–3% of the measured age for laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) analyses. This heteroscedasticity creates a bias within our model whereby younger analyses, with smaller absolute uncertainties, contribute a disproportionally large probability density to any chord that crosses near the centre of the data point. As we are interested in spatial relationships within the discordant zircon dataset, the heteroscedasticity in laser ablation U–Pb data imposes significant bias onto the resulting probability distributions. To circumvent any bias imparted by over-precise data we normalize the probability density of each analysis relative to the dataset. The computationally simplest method by which to achieve this goal is to calculate the median uncertainty for both ^{207}Pb/^{235}U and ^{206}Pb/^{238}U ratios over the entire dataset and replace all uncertainties for each data point with this median value. This means that 2σ uncertainty ellipses plotted on concordia will have similar dimensions and carry identical probabilistic weights across the dataset, allowing us to evaluate distributions free of biases imparted by heteroscedasticity.

### Calculation of peak centres and shapes

To fully evaluate the positions of any given peak in likelihood, we have created a method for calculating the centre of any peak along with a measure of the peak shape. To do this, we implement a peak finding method within the R code that identifies peaks at a given sensitivity level. The peaks are identified within the 2D likelihood plotted in each output (e.g. Fig. 3a) and identified as local maxima. We must caution at this stage that the sensitivity parameter is a subjective metric that does not necessarily identify peaks with valid age meaning; rather it identifies peaks with likelihoods that are larger than one-third of the maximum likelihood in the dataset. We have allowed the model to identify and analyse many small peaks that probably have no age significance so that the program does not bias the peaks used in an inappropriate manner.

After peaks are identified in both the upper and lower intercept ages, we attempt to fit a Gaussian curve to the data immediately surrounding the peak age. A suitable Gaussian curve is identified by minimizing the sum of squared deviates from the model output data, while also maximizing the number of data points included in the fit calculation. Once a suitable Gaussian curve is fitted to the data, the mean of that fit is reported. We have included another fit parameter to identify and compare peak shapes within each dataset. This parameter is the peak width, and is calculated by taking the full width of the peak at the base of the data used in the Gaussian curve fit. It must be stressed that although the peak width is reported in millions of years, it does not represent a fully propagated age uncertainty and should certainly not be treated as such. This metric is created to compare peak shapes within a particular dataset. All peak parameters are reported in an exported data file.

Our method of calculating peak locations and shapes can lead to somewhat confusing outputs. Because we select local maxima within the likelihood map, lower intercept ages tend to become slightly younger with decreasing upper intercept age. This is due to the geometry of the concordia curve and can be visualized by thinking about one discordant analysis. When connecting a discordant data point to old upper intercept ages, the lower intercept is relatively old. However, as the upper intercept age becomes younger, the lower intercept hinged around the one discordant point must also become younger. Although not as drastic as the example described above, our modelling procedure still documents this slight effect. Therefore, the true lower intercept age probably falls between the upper and lower estimates.

### Visualization

The output of our model is a data file containing the lower and upper intercept ages of each chord as well as a summed probability density, or likelihood, along it. To visualize this likelihood we generate two new data series, one containing a binned lower intercept age and associated likelihood and another with the binned upper intercept ages and associated likelihood. To produce this we first group chords based on lower intercept age such that each lower intercept age has many lines associated with it. The likelihood assigned to that age is taken from the line with the maximum likelihood (the line most likely to represent a true discordia chord at that age). The same is done for upper intercept ages. These two series may then be plotted on the same graph, showing likelihood as a function of age (Fig. 3).

Peaks in these spectra may be interpreted as upper and lower intercept ages, signifying zircon crystallization and resetting events, respectively. It must be pointed out that the absolute value of the likelihood cannot be used for interpretation. Although we account for differences in the number of data points included within different datasets by normalizing the raw summed probability density to the number of analyses included in the modelling, the *y*-axis value (likelihood) remains a relative metric for deciphering peaks and troughs within and between datasets.

The second visualization method used here is somewhat less intuitive. We have produced a plotting procedure that creates a 2D likelihood map where lower intercept age is the *y*-variable, upper intercept age is the *x*-variable, and the normalized probability is plotted as a function of these two variables over the triangular area of *x*–*y* space where the upper intercept is greater than the lower intercept. Normalized probability is represented in this domain by a contoured likelihood map (Fig. 3). This plot will be described in some detail below.

## Example 1: proof of concept using an idealized dataset

To explain the method in more detail, as well as validate the sensitivity of our modelling method to background noise, we ran our model using a synthetic discordant dataset consisting of three clear discordia lines, with data spaced evenly along lines with lower and upper intercepts at 0–3000, 1500–3500 and 1000–2000 Ma (Fig. 3a). The 0–3000 Ma line contains six data points, the 1000–2000 Ma line contains nine and the 1500–3500 Ma line contains 12. Once processed, the model correctly identifies precise major peaks at all the correct upper and lower intercept ages (Fig. 3b). The model also detects the relative number of data points falling on lines, demonstrated by the higher normalized probability associated with lower and upper intercepts of lines that contain more data points (Fig. 3b).

The 2D likelihood map of normalized probability assists in the visualization of the distribution of likelihood within the discordant dataset (Fig. 3c). In this figure lower and upper intercept peak heights may be related to one another in a clear manner. The likelihood for each line is related to the colour shown at that point, where the spectrum moves from white to black (grey to orange to blue in the online colour version) in order of increasing likelihood. On these plots, clusters of high likelihood represent a convergence towards a single discordia line, whereas horizontal bands of high probability represent multiple upper intercept ages with a shared lower intercept age (discussed below; e.g. Fig. 4c).

Additionally, this dataset exemplifies how our discordance weighting measure provides added benefit to the analysis. Despite having a greater number of points (nine) the 1000–2000 Ma line has a lower overall likelihood than the 0–3000 Ma line, which contains only six data points. This may at first seem problematic; however, closely spaced crystallization and resetting events create ‘shallow’ discordia patterns that are cryptic and difficult to detect even in meta-igneous rocks. In detrital zircon datasets where multiphase crystallization events are represented by abundant concordant data with closely spaced crystallization ages, detecting potential shallow discordance is difficult. Thus it is difficult to distinguish between true discordance and multiple crystallization events narrowly spaced in time. We therefore discriminate against data that are close to concordant.

Overall our method of analysing discordant data proves highly sensitive to lines containing many analyses that represent true discordia lines. Small background peaks are created where meaningless lines accrue probability by crossing unrelated data points. However, the program shows good sensitivity by clearly highlighting true discordance relative to the probably meaningless background likelihood.

In the following sections we use real datasets from two distinct settings to test the model, with different types of discordance addressing distinct aspects of geological history.

## Example 2: identifying post-depositional Pb-loss events

In some cases, partial Pb-loss and recrystallization events may affect detrital zircons after deposition, creating discordance arrays that are complex and difficult to interpret. These events may not necessarily recrystallize new zircon material, meaning that truly concordant analyses may not be present. Because of this, post-depositional Pb-loss events can be invisible to conventional analytical techniques where discordant data are automatically rejected using a subjective discordance filter. Morris *et al.* (2015) highlighted such an event in Neoproterozoic to Palaeozoic strata from northern Greenland and we use their dataset to test our model.

Morris *et al.* (2015) used detrital zircon Pb-loss patterns and a new modelling method to suggest that significant post-depositional Pb-loss around 380–390 Ma may have been caused by an extended period of fluid flow associated with the Caledonian orogeny. Their modelling method convincingly identifies lower intercept ages that affected much of the detrital zircon population. Here we model the Sydgletcher Formation sample of Morris *et al.* (2015) to highlight the usefulness of our model. Although our results are similar to those of Morris *et al.* (2015), our modelling procedure differs significantly, and, importantly, is completely automated and incorporates all of the data during implementation.

Figure 4 shows the output from our modelling of the U–Pb data from Sydgletcher detrital zircons (Morris *et al.* 2015). The sample contains a significant proportion of discordant analyses (39 of 63 are >10% discordant; Fig. 4a) that prove difficult to interpret by traditional methods. Morris *et al.* (2015) suggested that a post-deposition Pb-loss event between 380 and 390 Ma caused major Pb-loss to all zircons present within the sample; our model allows a similar interpretation. Figure 4b shows the binned normalized likelihood as a function of upper and lower intercept ages. The lower intercept likelihood spectrum shows a principal peak at 441 Ma with small peaks at slightly younger ages (*c*. 400–380 Ma). Peaks in upper intercept likelihood correspond to ages around 1830 and 2740 Ma, matching well with concordant ages from the same sample (Fig. 4a).

Additionally, a *c*. 2150 Ma peak in upper intercept likelihood is identified in our analysis. This peak does not correspond to any measured concordant data and has multiple interpretations. First, there may be grains in the Sydgletcher dataset that have crystallization ages *c*. 2150 Ma, but all the grains analysed with those crystallization ages experienced significant resetting. In this interpretation, new information is provided with the analysis of the discordant population of analyses. A second explanation is that the *c*. 2150 Ma peak has no age significance, and is instead created by grains that experienced multiple episodes of resetting (i.e. *c*. 2700 Ma crystallization, variable resetting a *c*. 1800 Ma, and further resetting at *c*. 440 Ma) creating complex age relationships. Without concordant analyses at *c*. 2150 Ma it remains difficult to interpret this upper intercept peak, and perhaps more analyses of the Sydgletcher dataset would resolve this issue.

Figure 4c shows the relative likelihood associated with specific lines of unique upper- and lower-intercept age combinations. The broad vertical likelihood band at a lower intercept age *c*. 400 Ma implies that a one major resetting event affected grains with an array of crystallization ages, consistent with the interpretation of Morris *et al.* (2015).

Despite broad similarities, our model identifies a slightly different age of maximum zircon Pb-loss at 441 Ma (Fig. 4), whereas the model of Morris *et al.* (2015) indicated 380–390 Ma as the most probable time of Pb-loss. Although the analysis of Morris *et al.* (2015) is broadly similar to our analysis, the discrepancies warrant further discussion. The difference in age between our analysis and that of Morris *et al.* (2015) is probably due to the subjective discordance filter of 10% placed on the dataset by Morris *et al.* (2015), allowing their discordance model to use only grains with >10% discordance. Our weighting of the probability density using discordance allows us to evaluate possible Pb-loss events throughout the entire dataset, without setting an arbitrary filter. The age we find with maximum likelihood of Pb-loss, 441 Ma, corresponds to the maximum age for onset of Caledonian thrusting in the area (Higgins *et al.* 2004; Gee *et al.* 2008).

Lower intercept ages derived by our model vary from 441 to 361 Ma depending upon the upper intercept age (see our discussion in the Methods section for a description of the cause of this discrepancy). The real age probably falls somewhere in this range, but defining a true age would be unwarranted (see our discussion of model uncertainties and bootstrapping reproducibility in the supplementary material). Our modelling suggests that the age of this event, interpreted as orogenic fluid flow by Morris *et al.* (2015), cannot be precisely defined. Despite our preference for the former model and differences between the peak in our analysis and that of Morris *et al.* (2015), the presence of near-concordant analyses around 400–380 Ma and the broad shoulder in our lower intercept data (Fig. 4b) are compatible with the conclusions of Morris *et al.* (2015) regarding an extended period of orogenic fluid flow in the area.

## Example 3: discordance in Southern Uplands detrital zircons

Example 2 above illustrates the use of discordant zircon data to shed light on discordance that results from post-depositional U–Pb resetting. However, the discordant portions of detrital zircon suites may also shed light on pre-depositional overprinting of detrital zircons, such as large-scale metamorphism or fluid alteration in the source terrane. Metamorphism and crustal reworking of older grains may help to elucidate multi-cycle histories of sedimentary grains. Waldron *et al.* (2008) invoked such a model in analysis of detrital zircon data from the Late Ordovician Glenlee Formation in the Southern Uplands terrane in the Scottish Caledonides. The concordant analyses included large numbers of ages in the Archaean and Mesoproterozoic eons. In addition, 16 discordant grains were analysed that lay close to a discordia line with intercepts at 2723 and 499 Ma. This line was interpreted by Waldron *et al.* (2008) as evidence that Archaean grains were not necessarily derived directly from an Archaean craton exposed at the time of deposition; many such grains may instead have been first incorporated into the Palaeozoic Grampian orogen immediately to the NW of the Southern Uplands, where they either underwent Pb-loss or were rimmed by new metamorphic zircon (Waldron *et al.* 2008).

We have re-evaluted the Glenlee Formation dataset using our model to assess the discordia line identified by Waldron *et al.* (2008). Figure 5 shows the output of our modelling of the Glenlee Formation dataset. The model clearly identifies a peak in likelihood along a discordia line similar to that identified by subjective analysis (Waldron *et al.* 2008). The large lower intercept peak has a maximum of 501 Ma and a corresponding upper intercept peak of *c*. 2697 Ma. Additionally, the model identifies an upper intercept peak at 1896 Ma, which when viewed in the 2D likelihood map can also be related to the *c*. 500 Ma lower intercept age (the horizontal band of likelihood around 500 Ma). This 500 Ma lower intercept age is interpreted to represent the early stages of metamorphism in the Grampian orogen. It is noteworthy that the calculated lower intercept age is about 30 myr older than peak metamorphism in the currently exposed rocks of the Grampian orogen at *c*. 470 Ma, but is close to the timing of initial Grampian collision suggested by Chew *et al.* (2010).

The success of our method in identifying overprinting histories that have previously been identified using subjective criteria highlights the possibility that analyses of discordant data, often excluded from published datasets, may lead to improved understanding of the provenance of zircon elsewhere. Here we confirm, using statistically robust methods, the presence of significant disturbance to the U–Pb systematics of Precambrian zircons that have been influenced by Caledonide events.

## Cautions

Despite our efforts to produce a U–Pb model where bias is minimized, there remains some potential that our model can produce pseudo-discordia lines with no geological meaning. This is especially the case in very small datasets where random sampling may lead to artificial biases. Although the production of dominant peaks by random sampling is unlikely (see above examples), we must caution against over-interpretation of smaller peaks and clusters of likelihood without further geological evidence. Of particular note are the arc-shaped lines of likelihood plotted in the 2D likelihood maps (Fig. 4c). These lines are created by discordia lines that sweep across single points or clusters of points close together, and are meaningless. Only point maxima in 2D likelihood maps and major peaks in the line plots should be considered representative of discordia lines that are worth further investigation.

As with all modelled data, the outputs of this model must be verified with other geological evidence before significant conclusions can be made. Additionally, owing to the angular relationships of discordant U–Pb data chords with the concordia curve, U–Pb resetting events that are closely spaced in age may remain undetectable because of their shallow discordia trajectories. Although we have partly addressed this problem by weighting analyses with discordance, our method works best when there is a large disparity in age between crystallization and overprinting events, of the order of hundreds of millions of years.

Our model is also blind to data quality. Therefore, only data with the highest accuracy should be input. One way in which poor-quality data may generate spurious discordia lines relates to corrections for common-Pb that may be implemented in the processing of detrital zircon data. Cases with low ratios of radiogenic to total lead (Pb*/Pb ratios in most presentations) may lead to artificial displacement of analyses from concordia. Our model will then weight such data in an inappropriate manner. Therefore, strict controls on data quality are recommended prior to implementation of this model.

## Conclusions

We have provided a description and code for a new method with which to analyse discordant data from detrital zircon U–Pb datasets. This analysis has a firm grounding in statistical methods and provides useful information from data that are typically discarded. This method provides two significant advances in the interpretation of detrital zircon datasets: (1) it allows for the extraction of age information from the discordant population within detrital zircon datasets, owing to disturbances to the U–Pb isotope system caused by either Pb-loss and metamorphic overprinting; (2) because useful information is extracted from discordant data, a much stricter and more trustworthy filter may be used to determine concordance, thereby increasing the reliability of the concordant dataset.

This model will work well for datasets containing a significant Precambrian component, where differences in ages of crystallization and Pb-loss are large, as well as those containing a high proportion of discordant data. The technique described here is easily implemented and rapid, and also produces publication ready, fully editable figures. This modelling method has been designed to provide new information from currently unused portions of detrital zircon datasets but may equally be applied to other complex suites of discordant data such in lower crustal rocks, multiply metamorphosed igneous suites, zircons with complex fluid overprinting and multi-mineral U–Pb datasets.

We have demonstrated the effectiveness of this modelling method using two previously published datasets that have discordance imposed during Caledonide events. Previous treatments of these two datasets used somewhat subjective methods to identify the potential ages of zircon resetting events. These events, although corresponding closely to ages associated with the Caledonide orogeny, represent two distinct scenarios where zircon U–Pb systematics experienced resetting. The first, zircon U–Pb data from the Sydgletcher Formation in northern Greenland (Morris *et al.* 2015), probably represent *in situ*, post-depositional Pb-mobilization during major fluid flow. The uncertainty in ages identified by our model differs significantly from the ages inferred from a modelling procedure, using an arbitrary discordance cutoff, applied to these data by Morris *et al.* (2015), although our conclusions do not contradict those of the earlier study. The second dataset modelled here provides significant evidence for resetting of Archaean zircon U–Pb systematics during early stages of the Grampian orogeny prior to sediment deposition. This result negates the need for a direct Archaean source for sediments from the Scottish Southern Uplands (Waldron *et al.* 2008). Together, our semi-quantitative modelling of discordance within these two datasets sheds light on the processes of Caledonide orogenesis, and has potential to similarly illuminate the significance of discordant detrital zircon data in other settings.

## Acknowledgements and Funding

We thank T. Chacko, L. Heaman, M. Rittner, P. Vermeesch, J. Wotzlaw, A. Reyes, D. Petts, D.G. Pearson and P. Waterton for advice regarding the implementation and usefulness of this method. J. Reimer, A. Reyes, M. Speta and C. Chelle-Michou, and the efficient but hard-to-please users of StackOverflow are thanked for assistance in combating our ignorance in coding with R. This paper was substantially improved by thorough reviews from A. Nemchin and F. Corfu as well as Chief Editor Q. Crowley.

- © 2016 The Author(s)