Planet Hunters NGTS: New Planet Candidates from a Citizen Science Search of the Next Generation Transit Survey Public Data

Sean M. O’Brien; Megan E. Schwamb; Samuel Gill; Christopher A. Watson; Matthew R. Burleigh; Alicia Kendall; Sarah L. Casewell; David R. Anderson; José I. Vines; James S. Jenkins; Douglas R. Alves; Laura Trouille; Solène Ulmer-Moll; Edward M. Bryant; Ioannis Apergis; Matthew Battley; Daniel Bayliss; Nora L. Eisner; Edward Gillen; Michael R. Goad; Maximilian N. Günther; Beth A. Henderson; Jeong-Eun Heo; David G. Jackson; Chris Lintott; James McCormac; Maximiliano Moyano; Louise D. Nielsen; Ares Osborn; Suman Saha; Ramotholo R. Sefako; Andrew W. Stephens; Rosanna H. Tilbrook; Stéphane Udry; Richard G. West; Peter J. Wheatley; Tafadzwa Zivave; See Min Lim; Arttu Sainio

doi:10.3847/1538-3881/ad32c8

1. Introduction

Since the detection of the first exoplanet orbiting a Sun-like star (Mayor & Queloz 1995), over 5000 exoplanets have been discovered (Akeson et al. 2013; Christiansen 2022).²² The most prolific discovery method thus far has been the transit method, where one observes a decrease in the brightness of a star as a planet passes in front of its host (Borucki & Summers 1984; Winn 2010, and references therein). This method relies on the detection of multiple, periodic transit signals, from which we can determine the planetary orbital period as well as the orbital inclination and the planetary radius (provided we have an accurate measurement of the stellar radius), among other parameters (e.g., Burke et al. 2014; Mayo et al. 2018; Guerrero et al. 2021). This information can be combined with radial-velocity (RV; Lovis & Fischer 2010, and references therein) observations, which provide a measurement of the planetary mass when combined with an accurate measurement of the stellar mass. Together, these measurements allow the planetary bulk density to be determined, enabling constraints to be placed on the planetary composition (e.g., Seager et al. 2007; Spiegel et al. 2014; Zeng et al. 2016). Precisely characterizing exoplanets in this way provides insights into the formation mechanisms and evolutionary histories of the diverse range of planetary systems that have been discovered (e.g., Kley & Nelson 2012; Zhu & Dong 2021, and references therein).

The vast majority of transit searches utilize algorithms such as the box-fitting least squares (BLS; Kovács et al. 2002) and transiting planet search (TPS; Jenkins 2002; Jenkins et al. 2020) algorithms, which detect repeated decreases in the brightness of stars to identify the strongest periodic signals (e.g., Bakos et al. 2004; Pollacco et al. 2006; Vanderburg et al. 2016). These candidates are visually vetted by small teams of professional astronomers to determine whether the dips in starlight are characteristic of exoplanet transits or are due to false positives such as instrumental systematics or eclipsing binaries (Morton et al. 2016; Collins et al. 2018; Schanche et al. 2019; Guerrero et al. 2021). The most promising planet candidates are selected for follow-up observations such as RV measurements to attempt to validate or confirm the planetary nature of the candidate, provided they are amenable to such observations (e.g., Konacki et al. 2005; Bakos et al. 2007; Esposito et al. 2019).

However, the visual vetting stage is time intensive, and transit surveys are becoming increasingly limited by computing resources and the number of hours humans can contribute to the search, rather than being data limited (Baron 2019; Kohler 2019). The process of visually vetting light curves for exoplanet transit-like features is an exercise in pattern recognition, which the human brain excels at (Dashti et al. 2010). Indeed, with minimal training, nonexpert volunteers can classify these dips, allowing us to discover exoplanets that were missed by traditional vetting techniques (e.g., Fischer et al. 2012; Christiansen et al. 2018; Eisner et al. 2021). Citizen science has been employed in many astronomical (e.g., Lintott et al. 2008; Schwamb et al. 2012; Kuchner et al. 2017; Aye et al. 2019) and nonastronomical fields (e.g., Jones et al. 2018; Blickhan et al. 2019; Semenzin et al. 2020; Spiers et al. 2021) to efficiently search through large data sets and also serendipitously spot peculiar phenomena that would be missed by automated algorithms (Cardamone et al. 2009; Lintott et al. 2009).

The successes of these citizen science projects, in particular the Planet Hunters (Schwamb et al. 2012), Exoplanet Explorers (Christiansen et al. 2018), and Planet Hunters TESS (Eisner et al. 2021) projects, which use data from the space-based Kepler (Borucki et al. 2010), K2 (Howell et al. 2014), and Transiting Exoplanet Survey Satellite (TESS; Ricker et al. 2015) missions, respectively, have led to the launch of the Planet Hunters NGTS (PHNGTS) citizen science project. PHNGTS uses data from the Next Generation Transit Survey (NGTS; Wheatley et al. 2018; see Section 2), which has been surveying large sections of the sky in search of exoplanet transits since 2015. To date, the facility has discovered tens of exoplanets (e.g., Bayliss et al. 2018; Tilbrook et al. 2021; Jackson et al. 2023), primarily through the visual vetting of candidates identified by ORION, a custom BLS algorithm adapted from Collier Cameron et al. (2006). While the primary objective of the Planet Hunters and Planet Hunters TESS projects is to identify transit events that were not detected by the standard detection pipelines, PHNGTS uses phase-folded light curves from the ORION algorithm, with the aim of classifying transit-like events in the NGTS data and identifying planet candidates that were missed in the initial vetting of these data.

In this paper, we outline the PHNGTS project and describe the key results from the analysis of data from the first two NGTS Data Releases (DR1 and DR2), including the discovery of five planet candidates not previously detected in these data. The layout of this paper is as follows. Section 2 describes the photometric data from the NGTS facility that is used in the PHNGTS project. We give an overview of the PHNGTS project in Section 3. We describe the process of how we identify planet candidates in Section 4 and provide details of the most promising planet candidates discovered and describe the follow-up data obtained in Section 5. We evaluate the detection efficiency of the citizen science project in Section 6. Finally, in Section 7, we outline the conclusions.

2. NGTS

The NGTS (Wheatley et al. 2018) is an array of 12 robotically operated telescopes located at the Paranal Observatory in Chile. Each telescope has a 20 cm aperture and a field of view of 8 deg². The primary goal of NGTS is to survey large sections of the sky in search of exoplanet transits (Bayliss et al. 2018). Since 2018, NGTS has also been used extensively for exoplanet follow-up observations, particularly for TESS mission candidates (e.g., Armstrong et al. 2020). NGTS operates using a custom filter with a bandpass from 520 to 890 nm, maximizing sensitivity to late-K- and early-M-dwarf stars. The NGTS facility is designed to be sensitive to transit depths of 0.1% in order to detect Neptune-sized planets around Sun-like stars. The discovery of NGTS-4b with a transit depth of 0.13% represents the shallowest transiting system ever discovered from the ground (West et al. 2019).

The details of the NGTS survey strategy and detection pipeline are given in Wheatley et al. (2018). Here, we provide a brief summary. The 12 NGTS telescopes each survey large (8 deg²) sections of the sky ("fields") for ∼250 nights during an observing season, which results in ∼500 hr of coverage per field. The NGTS telescopes operate with exposure times of 10 s, which limits survey targets to I-band magnitudes brighter than approximately 16 mag. The data for all survey fields observed up until 2018 are available via the ESO Archive Science Portal²³ (NGTS Consortium 2018, 2020), hereafter referred to as the "NGTS Public Data." The results presented in this work are based on the analysis of the NGTS Public Data. The survey data are detrended using the SysRem algorithm (Tamuz et al. 2005; Mazeh et al. 2007) and searched for candidate transiting events using a custom BLS (Kovács et al. 2002) algorithm called ORION. The ORION algorithm computes a periodogram to detect the most significant periodic signals for each star, with up to five candidate periods identified for a given target. Each candidate period has an associated signal detection efficiency (SDE), where a higher SDE value indicates a stronger periodic signal (see Alcock et al. 2000; Kovács et al. 2002).
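
To make the SDE statistic concrete, the following minimal Python sketch computes it from a periodogram using the definition given by Kovács et al. (2002): the height of the highest peak above the mean of the periodogram, in units of its standard deviation. The function name and the toy periodogram are hypothetical, the choice of the population standard deviation is an assumption, and this is not the ORION implementation.

```python
from statistics import mean, pstdev


def signal_detection_efficiency(power):
    """Return the SDE of the highest periodogram peak.

    SDE = (peak - mean) / standard deviation, evaluated over the whole
    periodogram. Using the population standard deviation is an assumption.
    """
    peak = max(power)
    return (peak - mean(power)) / pstdev(power)


# Toy periodogram (hypothetical values): a flat background with one peak.
toy_power = [1.0, 1.0, 1.0, 1.0, 5.0]
toy_sde = signal_detection_efficiency(toy_power)
```

In this picture, a threshold such as SDE > 8 selects peaks that stand well clear of the scatter in the rest of the periodogram.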

For context, we describe the internal vetting process that has been used in the discovery of 23 exoplanets to date by NGTS (Bayliss et al. 2018; Günther et al. 2018; Raynard et al. 2018; Eigmüller et al. 2019; Vines et al. 2019; West et al. 2019; Bryant et al. 2020; Costes et al. 2020; McCormac et al. 2020; Grieves et al. 2021; Smith et al. 2021; Tilbrook et al. 2021; Alves et al. 2022; Jackson et al. 2023; F. Bouchy et al. 2024, in preparation). This process proceeds as follows: for each field, two professional astronomers (vetters) are assigned to independently review ORION candidates by scrolling through a webpage that displays all candidates for the given field. This interface displays for each candidate: a light curve phase folded on the most significant period (Peak 1); a plot of the BLS periodogram; and a thumbnail image of the region around the star to examine whether there are any nearby stars contaminating the photometric measurements. Further information such as the measured orbital period, SDE, and estimated radius of the orbiting body are also given in this interface. In addition, each candidate has its own webpage that can be accessed from the main interface to view the light curves for individual nights and phase-folded data for any other significant periods identified by the algorithm. The candidate webpage can also be used to check whether secondary eclipse events are visible, or if there is a difference between the depths of the odd and even transits. These checks are designed to identify signals due to eclipsing binaries, a common false positive in transit surveys (O'Donovan et al. 2006; Howell et al. 2011; Lillo-Box et al. 2014; Ciardi et al. 2015; Lester et al. 2021). Vetters can designate promising candidates for discussion that are then reviewed by the full vetting team to identify the best candidates for further follow-up observations. 
Typically, vetters are expected to review ∼1000–5000 potential candidates that will contain a large number of false positives, due to the limitations of the search algorithm. The nature of this task leads to fatigue and therefore lower effectiveness in identifying potential planet candidates, as cognitive fatigue is common when undertaking repetitive tasks (Robertson & O'Connell 2010; Langner & Eickhoff 2013).

3. Planet Hunters NGTS

The PHNGTS project is the next iteration of the Planet Hunters citizen science project and is hosted on the Zooniverse platform (Lintott et al. 2008, 2011; Fortson et al. 2012).²⁴ The aim of the project is to identify planet candidates that were missed in the initial vetting of the NGTS data. Volunteers who visit the website are shown phase-folded light curves (known as "subjects") and are asked questions to classify these subjects in a series of workflows. The project is divided into three workflows: the Exoplanet Transit Search (Figure 1: top panel); the Secondary Eclipse Check (Figure 1: bottom-left panel); and the Odd/Even Transit Check (Figure 1: bottom-right panel). Volunteers are free to work on any of these three workflows. The Exoplanet Transit Search is the main classification interface, where the primary transit in the phase-folded light curve of each ORION candidate is reviewed. If a subject is identified as a potential candidate via this workflow (see Section 4), then the candidate is also classified in both the Secondary Eclipse Check and Odd/Even Transit Check.

**Figure 1.** PHNGTS classification interfaces. Top panel: Exoplanet Transit Search interface showing a phase-folded light curve with the response "A V-shaped dip in the middle" selected. Bottom-left panel: Secondary Eclipse Check interface showing the phase-folded light curve centered on phase = 0.5 with the response "A secondary eclipse" selected. Bottom-right panel: second task of the Odd/Even Transit Check workflow asking whether the green points (odd transits) and magenta points (even transits) have similar depths; the selected response is "Yes." Note that the example images for each workflow do not show data for the same star.

First-time visitors to the PHNGTS site and users who are not logged in are prompted to read a short tutorial for each workflow that outlines the aims of the PHNGTS project, explains the transit method and the phenomena each possible response is designed to identify, and shows examples of each of the possible responses for the given workflow. In addition to the tutorial, which can be accessed at any time, there is a help box and a "Field Guide" that show a wide variety of example light curves for each of the responses. After viewing this tutorial, volunteers can begin classifying real data from the NGTS facility. Once a volunteer chooses the response that best describes the features in the subject presented and selects the "Done" button, the classification is stored in the Zooniverse database and cannot be edited. The subject identifier, volunteer's anonymized Internet Protocol (IP) address, Zooniverse username and user ID number (if the user is logged in), time stamp, web browser and operating system information, and user response are recorded. Volunteers can classify either through a registered Zooniverse account or while not logged in. The classification process is identical for both user types, although users who are not logged in are prompted to log in or register for a Zooniverse account. Classifications made by registered users are linked by their Zooniverse username/ID, while classifications made by users who are not logged in can be linked via the masked IP address. We note that non-logged-in classifications from a single IP address may not necessarily equate to a single individual; hence, we refer to unique IP addresses as "non-logged-in sessions" rather than users. Tables 15, 16, and 17 in Appendix B provide examples of the cleaned versions (duplicate classifications and auxiliary information columns removed) of the raw classifications submitted for each workflow. The full tables are available in the online supplementary material.

In the classification interface, each subject is presented without identifying information or stellar properties in order to reduce biases in the classifications (Mayo 1934; Adair 1984; Levitt & List 2009; Schwamb et al. 2012). Once a subject has been classified, users can choose to comment on the light curve, or discuss light curves classified by other volunteers, via the "Talk" discussion forum. Each subject has its own discussion page in the "Talk" forum where users can discuss the light-curve features and see information about the subject including the stellar radius, if available, from the TESS Input Catalog (TIC) v8.2 (Stassun et al. 2019; Paegert et al. 2021). The stellar radius, when combined with the transit depth (which can be estimated from the light curve presented), allows volunteers to estimate the radius of the transiting body and therefore bring promising subjects to the attention of the science team via the "Talk" forum. The additional data for each subject do not include the TIC ID for the star. Each candidate receives a unique subject ID for each workflow but can be linked by the science team through its unique NGTS ID or TIC ID where available.
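
The radius estimate mentioned above follows from the standard first-order relation between the fractional transit depth and the radius ratio, δ ≈ (R_p/R_⋆)². The short Python sketch below illustrates the arithmetic. It assumes a central transit with no limb darkening and no contaminating light; the function name is hypothetical, and the conversion uses the IAU nominal solar and Jovian equatorial radii.

```python
import math

# IAU nominal radii in km (solar radius; Jovian equatorial radius).
R_SUN_KM = 695700.0
R_JUP_KM = 71492.0


def companion_radius_rjup(stellar_radius_rsun, depth):
    """Estimate the radius of the transiting body in Jupiter radii.

    Uses depth ~ (R_p / R_star)**2, which ignores limb darkening,
    grazing geometries, and dilution by nearby stars (assumptions).
    """
    radius_rsun = stellar_radius_rsun * math.sqrt(depth)
    return radius_rsun * R_SUN_KM / R_JUP_KM


# A 1% dip around a Sun-sized star corresponds to a roughly Jupiter-sized body.
example_radius = companion_radius_rjup(1.0, 0.01)
```

An estimate well above about 2 Jupiter radii would suggest a stellar rather than planetary companion, which is one way such a calculation can help flag likely eclipsing binaries.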

The PHNGTS project utilizes phase-folded light curves for the most significant peak in the periodogram for each candidate (Peak 1) where the SDE is greater than 8. We elect to use this threshold for SDE as it has been found through the internal vetting process that below this limit the frequency of false positives due to systematics is excessive. The phase-folded light curves presented are binned to 10 minutes as it was found during testing that the subjects proved more straightforward to classify in this format, without possible transits being obscured. In addition, the light curves are displayed without error bars as in previous citizen science projects we have found that error bars do not influence the classification made by volunteers. The y-axis (normalized flux) spans a range from 1 − 1.5 × δ to 1 + δ, where δ is the fractional transit depth measured by ORION. The x-axis (orbital phase) spans a range from $-\tfrac{3\times {T}_{\mathrm{dur}}}{P}$ to $\tfrac{3\times {T}_{\mathrm{dur}}}{P},$ where T_dur and P are the transit duration and orbital period measured by ORION, respectively. Here, we describe each of the three workflows that constitute the PHNGTS project.
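
The display conventions described above can be sketched in a few lines of Python. The example below phase folds a set of time stamps, averages the folded points in bins equivalent to 10 minutes, and computes the axis limits from the depth, duration, and period. All function names are hypothetical, times and periods are assumed to be in days, and the simple mean binning is an assumption rather than the procedure actually used to prepare the subjects.

```python
import math


def phase_fold(times, period, epoch):
    """Return orbital phases in [-0.5, 0.5) with the transit at phase 0."""
    return [((t - epoch) / period + 0.5) % 1.0 - 0.5 for t in times]


def bin_phase_curve(phases, fluxes, period_days, bin_minutes=10.0):
    """Average fluxes in phase bins whose width corresponds to bin_minutes."""
    width = bin_minutes / (period_days * 24.0 * 60.0)
    bins = {}
    for phase, flux in zip(phases, fluxes):
        bins.setdefault(math.floor(phase / width), []).append(flux)
    # Report each bin by its central phase and mean flux.
    return [((idx + 0.5) * width, sum(vals) / len(vals))
            for idx, vals in sorted(bins.items())]


def plot_limits(depth, duration, period):
    """Axis limits: x spans +/- 3 T_dur / P, y spans 1 - 1.5*depth to 1 + depth."""
    half_width = 3.0 * duration / period
    return (-half_width, half_width), (1.0 - 1.5 * depth, 1.0 + depth)


x_limits, y_limits = plot_limits(depth=0.01, duration=0.1, period=2.0)
```

With these limits the transit occupies the central third of the horizontal axis, and the deepest point measured by ORION sits two-fifths of the way up the vertical axis.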

3.1. Exoplanet Transit Search

The Exoplanet Transit Search (Figure 1) is the main workflow of the PHNGTS project. There is no vetting stage similar to that described in Section 2 that occurs prior to subjects being uploaded to the Exoplanet Transit Search. All ORION candidates with an SDE greater than 8 are classified in this workflow before potentially advancing to the additional workflows. Users are shown a phase-folded light curve centered on the location of the primary transit as identified by ORION and are asked to select any of the following responses that may apply: "A U-shaped or box-shaped dip in the middle"—typically indicative of an exoplanet transit; "A V-shaped dip in the middle"—typically indicative of an eclipsing binary transit; "No significant dip in the middle"—no clear transit visible; "Stellar variability"—identifies light curves that show out-of-transit variation; "A large data gap near the middle"—identifies light curves with large gaps in the data points around the location of the possible transit. Each subject in the Exoplanet Transit Search is seen by 20 volunteers before being retired from the workflow. We chose the number of classifications a subject must receive before retirement to balance the time it takes to fully classify the full sample while receiving a sufficient number of classifications to characterize the features present in a given subject. We elected to obtain 20 classifications per subject, an increase compared to previous citizen science projects such as Planet Hunters (Schwamb et al. 2012) and Planet Hunters TESS (Eisner et al. 2021) that use ∼8–15 classifications per subject. We determined through beta testing of the project with a small number of Zooniverse volunteers that the increased complexity of the PHNGTS tasks required a greater number of classifiers. The Exoplanet Transit Search is designed to identify the most promising exoplanet candidates. 
This is achieved by combining the classifications of multiple users on each subject to select the candidates that are most likely to show a transit-like feature (see Section 4). These candidates are then advanced to both the Secondary Eclipse Check and Odd/Even Transit Check for further vetting. These workflows are described below. We note that due to an error in uploading, 170 candidates were uploaded twice to the Exoplanet Transit Search. We merged the classifications for each of these candidates and removed duplicate classifications by the same users when applying the user weighting scheme (see Section 4).

3.2. Secondary Eclipse Check

The Secondary Eclipse Check (Figure 1) is designed to help check whether the candidate is an eclipsing binary system where we can observe both the primary transit and secondary eclipse. Users are presented with phase-folded light curves centered on orbital phase = 0.5 with the x-axis spanning the same width as the Exoplanet Transit Search. We expect that given the short periods of these candidates the orbits are likely to be near-circularized (Winn & Fabrycky 2015 and references therein) and therefore any secondary eclipses will be at phase = 0.5. Volunteers are asked to choose a single response from: "A secondary eclipse," "No secondary eclipse," and "A large data gap." This allows us to reject candidates where a clear secondary eclipse is observed. The NGTS telescopes are sensitive to transit depths of ∼0.1% (Wheatley et al. 2018; West et al. 2019), which is approximately the maximum depth that has been observed from the ground for optical secondary eclipses of hot Jupiters (Wheatley et al. 2018). Therefore, we do not expect the secondary eclipses of real exoplanets to be visible in the presented light curves. The middle row of Figure 2 shows an example of a candidate (TIC-389932515) that was identified as displaying a U-shaped transit in the Exoplanet Transit Search but the Secondary Eclipse Check reveals that this system is an eclipsing binary with a clear secondary eclipse. Each subject in this workflow is seen by 15 volunteers before being retired. The number of classifications per subject before retirement is designed to balance speed of classifying the sample with obtaining enough classifications for accurate characterization. We require fewer classifications for the Secondary Eclipse Check than the Exoplanet Transit Search as the task presented is more straightforward and, through the selection of the most promising candidates from the Exoplanet Transit Search (see Section 4), the rate of false positives will be lower.

**Figure 2.** Light curves for three different stars as presented in each of the three workflows. Top row: light curves for NGTS-8, which shows a clear U-shaped transit in the Exoplanet Transit Search. No secondary eclipse is visible in the Secondary Eclipse Check, and the depths of the odd and even transits match in the Odd/Even Transit Check. Middle row: light curves for TIC-389932515, which shows a U/V-shaped transit in the Exoplanet Transit Search. The depths of the odd and even transits match in the Odd/Even Transit Check; however, a secondary eclipse is visible in the Secondary Eclipse Check, indicating this is an eclipsing binary. Bottom row: light curves for TIC-441292449, which shows a clear V-shaped transit in the Exoplanet Transit Search. No secondary eclipse is visible in the Secondary Eclipse Check; however, the depths of the odd and even transits do not match in the Odd/Even Transit Check, indicating this is an eclipsing binary.

3.3. Odd/Even Transit Check

The Odd/Even Transit Check (Figure 1) is designed to check whether the depths of the odd and even transits match. We can use this workflow to identify eclipsing binary systems where ORION has misidentified the orbital period as half the true period. The primary and secondary eclipse, which will correspond to the odd and even transits (or vice versa), will have different depths if the stars have differing luminosities. Volunteers are presented with the phase-folded light curve (centered on phase = 0) with the odd-numbered transits (i.e., the first, third, fifth, etc. consecutive transits) shown in green and the even transits shown in magenta. These colors were chosen to make the task accessible to volunteers with different types of color vision deficiency. The first task asks whether both sets of points cover the middle portion of the plot to check that there are no large gaps in the data. If large data gaps exist then it will not be possible to accurately determine whether the odd and even transit depths match. The second task (shown in Figure 1) asks whether the odd and even transits have similar depths. The bottom row of Figure 2 shows an example of a candidate (TIC-441292449) where the Exoplanet Transit Search light curve appears as a regular V-shaped transit, while the Odd/Even Transit Check light curve shows the clear difference in depths. The scenarios checked by both the Secondary Eclipse Check and Odd/Even Transit Check are common false positives in exoplanet transit searches (O'Donovan et al. 2006; Howell et al. 2011; Lillo-Box et al. 2014; Ciardi et al. 2015; Lester et al. 2021). Each subject in this workflow is seen by 15 volunteers before being retired. This threshold for the number of classifications per subject is chosen for the same reasons described above in Section 3.2.
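
As an illustration of the odd/even comparison, the Python sketch below assigns each data point to the nearest transit, splits the points by the parity of that transit, and compares the mean in-transit depths. The function names are hypothetical, the reference epoch is assumed to be the time of the first transit, and the simple mean depth is an assumption; volunteers make this comparison by eye rather than numerically.

```python
def split_odd_even(times, fluxes, period, epoch):
    """Split points into odd and even transits as (phase, flux) pairs.

    The transit at `epoch` is counted as the first (odd) transit.
    """
    odd, even = [], []
    for t, flux in zip(times, fluxes):
        cycles = (t - epoch) / period
        number = round(cycles)          # index of the nearest transit, 0 = first
        phase = cycles - number         # phase relative to that transit
        target = odd if number % 2 == 0 else even
        target.append((phase, flux))
    return odd, even


def mean_depth(points, half_width):
    """Mean depth below unity for points within half_width of mid-transit."""
    in_transit = [flux for phase, flux in points if abs(phase) <= half_width]
    return 1.0 - sum(in_transit) / len(in_transit)


# Toy eclipsing binary folded on half its true period: alternating depths.
toy_times = [0.0, 1.0, 2.0, 3.0]
toy_fluxes = [0.99, 0.97, 0.99, 0.97]
odd_points, even_points = split_odd_even(toy_times, toy_fluxes, period=1.0, epoch=0.0)
odd_depth = mean_depth(odd_points, half_width=0.05)
even_depth = mean_depth(even_points, half_width=0.05)
```

In this toy case the odd and even depths differ by a factor of three, the signature expected when the primary and secondary eclipses of a binary have been folded together.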

3.4. Site Statistics

Since the launch of PHNGTS on 2021 October 18, there have been 2,626,380 individual classifications completed across the three workflows on the NGTS Public Data. Of these, 87.6% were made by 8559 registered volunteers, with the rest made in 3319 non-logged-in sessions (tracked by anonymized IP address). Across the three workflows 138,198 subjects were classified, which comprised 85,000 individual target stars. Figure 3 shows the distribution of the number of classifications made by registered volunteers. This distribution is common in citizen science projects in that we find that many volunteers classify only a few subjects, while a small number of users classify a large number of subjects (Spiers et al. 2019 and references therein). Registered volunteers classified a mean of 268 subjects and a median of 40, while non-logged-in sessions classified a mean of 98 subjects and a median of 31. For comparison, registered Planet Hunters volunteers classified a mean of 68 and median of 5 Kepler Q1 light curves (Schwamb et al. 2012), and registered Planet Hunters TESS volunteers submitted a mean of 647 and median of 33 classifications (Eisner et al. 2021). We assessed the distribution of the number of classifications made by registered users using the Gini coefficient, which ranges from 0 (equal contributions from all users) to 1 (large disparity in the contributions). The Gini coefficient for the PHNGTS project is 0.85. This is similar to the mean Gini coefficient of 0.82 among other astronomy Zooniverse projects (Spiers et al. 2019) and the mean value for individual sectors of 0.87 in the Planet Hunters TESS project (Eisner et al. 2021). To compare the time taken by the PHNGTS project to classify the data with the original NGTS vetting process, we measure the time taken for 99% of subjects in the Exoplanet Transit Search to receive their 20th classification (i.e., the time for 99% of the data set to be retired). 
The Secondary Eclipse Check and Odd/Even Transit Check workflows had data uploaded at irregular intervals, and therefore we do not include measurements of how long these stages took. The Exoplanet Transit Search data set was uploaded and classified in two distinct sets in order to initially prioritize ORION candidates with higher SDE values. The first 85,316 subjects uploaded had SDE ≥ 10, and 99% of subjects were retired by 2021 December 12, which was 56 days after the launch of the project. The second set of 20,218 subjects had 8 < SDE < 10, with the first classification submitted on 2022 April 25, and 99% of subjects were retired by 2022 September 18 (147 days later). The initial data set greatly benefited from media coverage driving engagement in the first weeks of the project, which explains its much higher rate of classification. For comparison, the original NGTS vetting process of the data set took 1.5 yr. We stress, however, that the aim of the PHNGTS project is not to classify the data faster but rather to uncover any candidates that may have been missed in the traditional vetting process.
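
For readers unfamiliar with the Gini coefficient, the Python sketch below computes it from a list of per-user classification counts using the standard rank-based formula for sorted data. The function name is hypothetical, and the text does not state which estimator was used for the value quoted above, so this particular form is an assumption.

```python
def gini(counts):
    """Gini coefficient of non-negative counts via the sorted-rank formula.

    G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n,
    with x sorted in ascending order and ranks starting at 1.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero count")
    ranked_sum = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1.0) / n


equal_share = gini([5, 5, 5, 5])        # every user contributes equally
single_user = gini([0, 0, 0, 10])       # one user contributes everything
```

With this estimator the maximum value for n users is (n − 1)/n, so the coefficient only approaches 1 for a large population in which a very small fraction of users supply nearly all classifications.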

**Figure 3.** Histogram of the number of classifications by registered users, using a bin size of 5. The plotted distribution is truncated at 300 classifications for clarity. A total of 1023 registered users had more than 300 classifications, which is 12.0% of all registered volunteers. The most frequent number of classifications by registered volunteers and non-logged-in sessions is 1.

4. Identifying Candidates

In this section, we describe the user weighting and subject scoring scheme used to combine multiple user classifications and select the best candidates to be advanced from the Exoplanet Transit Search to the additional workflows and then to the visual vetting stage.

4.1. Weighting Scheme

We use a user weighting and subject scoring scheme based on the scheme implemented by Schwamb et al. (2018) for the Planet Four: Terrains citizen science project, which in turn is based on the schemes developed by Lintott et al. (2008, 2011) for the Galaxy Zoo project and Schwamb et al. (2012) for the original Planet Hunters project. This iterative weighting scheme assumes users who agree with the majority vote to be better at the tasks and up-weights them accordingly, while those who disagree are down-weighted. This is not a perfect assumption; however, citizen science in general relies on the assumption that the majority of users are correct. By implementing a user weighting scheme, we can determine which users are better at identifying U-shaped dips, for example, and pay more attention to their responses compared with others when attempting to identify which subjects have these features present. In this scheme, the classifications of a user are linked via their username if they are logged into Zooniverse, or by their anonymized IP address if they are not logged in. If a user excels at spotting U-shaped dips in the Exoplanet Transit Search, it does not necessarily mean that they are adept at spotting other features in this workflow or the additional workflows. Therefore, we assign and assess the following separate user weights. For the Exoplanet Transit Search, user weights are assigned for each of the five possible responses in the workflow. For the Secondary Eclipse Check, a single weight per user is calculated, as users can select only one of the three possible responses in this workflow. For the Odd/Even Transit Check, two user weights are calculated for each user: one weight for the first task, where users check whether there is sufficient data coverage for both the odd and even transits, and one weight for the second task, where users check whether the depths of the odd and even transits match. 
Both these tasks are binary Yes/No questions and therefore only one weight per task is required. These various user weights are applied when combining the multiple volunteer assessments for each subject to calculate subject scores, which can then be used to apply various threshold cuts to select the best candidates for further vetting. By assigning weights independently, we can calculate subject scores for each response on each subject across all three workflows to identify the best-ranked candidates for each of the possible responses.

Each user starts with weights equal to 1, and these weights are adjusted based on how often the user agrees with the majority assessment of all volunteers for the subjects the given user classified. User weights typically lie in a range from 0 to 1.6, with a scaling factor applied such that the mean user weight is 1. Note that user weights cannot be equal to 0. This is not to be confused with the subject scores that can be in the range [0, 1]. Users with only one classification for a given task retain a weight of 1 as we do not have enough information to evaluate the ability of the user to identify features in the light curves. The equations for calculating the subject scores and user weights in each of the workflows differ due to the differences in how many responses can be selected in each workflow and the number of tasks that each workflow consists of. We describe the weighting schemes used in each of the workflows below.

4.1.1. Exoplanet Transit Search

We describe the process for calculating scores and weights for the "U-shaped" response in the Exoplanet Transit Search as an example. We use the "U-shaped" response as the primary criterion for spotting exoplanet-like transit features. The other possible responses in the Exoplanet Transit Search are "V-shaped," "Data gap," "No dip," and "Stellar variability," for which the process of calculating scores and weights is identical to the method described below. For each subject i in the Exoplanet Transit Search, a subject score, s_i(U), is calculated as

$\begin{eqnarray}\begin{array}{rcl}{s}_{i}(U) & = & \displaystyle \frac{1}{{U}_{i}}\displaystyle \sum _{k}{w}_{k}(U)\\ k & = & \mathrm{users}\ \mathrm{who}\ \mathrm{selected}\ {\rm{``}}U-\mathrm{shaped}{\rm{\mbox{''}}}\ \mathrm{for}\ \mathrm{subject}\ i,\end{array}\end{eqnarray} \tag{ 1 }$

where w_k(U) is the weight for user k for the "U-shaped" response and U_i is the sum of the weights of all users who classified subject i:

$\begin{eqnarray}\begin{array}{rcl}{U}_{i} & = & \displaystyle \sum _{k=j}{w}_{k}(U)\\ j & = & \mathrm{all}\ \mathrm{users}\ \mathrm{who}\ \mathrm{classified}\ \mathrm{subject}\ i.\end{array}\end{eqnarray} \tag{ 2 }$

Initially, as all user weights are equal to 1, these subject scores will effectively be a simple counting of votes; however, as user weights change through the iterative scheme, these subject scores will vary from these initial values. The subject scores can be in the range [0, 1], where a score of 1 means all users who classified the subject agreed that the given feature (in this case a U-shaped dip) was present, while a score of 0 indicates that all users agreed that the given feature is not visible in the subject presented. Once the subject scores have been calculated, the next step is to assign new user weights as follows:

$\begin{eqnarray}\begin{array}{rcl}{w}_{j}(U) & = & \left\{\begin{array}{ll}\displaystyle \frac{A}{{N}_{j}}\left(\displaystyle \sum _{i=p}{s}_{i}(U)+\displaystyle \sum _{i=q}[1-{s}_{i}(U)]\right) & \ \mathrm{if}\ {N}_{j}\gt 1\\ 1 & \ \mathrm{if}\ {N}_{j}=1\end{array}\right.\\ p & = & \mathrm{subjects}\ \mathrm{user}\ j\ \mathrm{classified}\ \mathrm{as}\ {\rm{``}}U-\mathrm{shaped}{\rm{\mbox{''}}}\\ q & = & \mathrm{subjects}\ \mathrm{user}\ j\ \mathrm{did}\ \mathrm{not}\ \mathrm{classify}\ \mathrm{as}\ {\rm{``}}U-\mathrm{shaped}{\rm{\mbox{''}}},\end{array}\end{eqnarray} \tag{ 3 }$

where N_j is the number of subjects classified in this workflow by user j. The scaling factor A is chosen such that the mean user weight of users who classified more than one subject is 1. This equation up-weights a user when they agree with the majority of users who classified a given subject and down-weights them when their response differs from the majority assessment of the users who classified the subject. Using the new user weights calculated from Equation (3), the subject scores are recalculated with Equations (1) and (2). The user weights are then recalculated with Equation (3). This process is iterated until convergence is achieved, which we define as when the median absolute difference between the old and new weights is less than or equal to 10⁻⁴ for all of the possible responses in the workflow. Once convergence is achieved and the final user weights have been calculated, these weights are applied to calculate the final subject scores that we use to implement the threshold cuts described in Section 4.2.
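The iterative scheme of Equations (1)–(3) can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual pipeline: the function name `iterate_weights`, the `(user, subject, selected)` input format, and the `max_iter` guard are assumptions made for the example, and convergence is tested for a single response only, whereas the text applies the criterion to every response in the workflow.

```python
from collections import defaultdict
from statistics import mean, median


def iterate_weights(classifications, tol=1e-4, max_iter=1000):
    """Iterate Equations (1)-(3) for a single response (e.g. "U-shaped").

    classifications: iterable of (user, subject, selected) tuples, where
    selected is True if that user chose the response for that subject.
    Assumes each user classifies a given subject at most once.
    """
    by_subject = defaultdict(list)
    by_user = defaultdict(list)
    for user, subject, selected in classifications:
        by_subject[subject].append((user, selected))
        by_user[user].append((subject, selected))

    def subject_scores(weights):
        scores = {}
        for subject, votes in by_subject.items():
            total = sum(weights[u] for u, _ in votes)             # Eq. (2)
            chosen = sum(weights[u] for u, sel in votes if sel)
            scores[subject] = chosen / total                       # Eq. (1)
        return scores

    weights = {user: 1.0 for user in by_user}  # every user starts at 1
    for _ in range(max_iter):
        scores = subject_scores(weights)
        # Unscaled agreement term of Eq. (3), only for users with N_j > 1.
        raw = {
            user: sum(scores[s] if sel else 1.0 - scores[s] for s, sel in votes)
            / len(votes)
            for user, votes in by_user.items()
            if len(votes) > 1
        }
        # Scaling factor A: mean weight of multi-subject users becomes 1.
        scale = 1.0 / mean(raw.values()) if raw else 1.0
        new_weights = {
            user: scale * raw[user] if user in raw else 1.0 for user in by_user
        }
        change = median(abs(new_weights[u] - weights[u]) for u in by_user)
        weights = new_weights
        if change <= tol:
            break
    return weights, subject_scores(weights)
```

In a toy case where two users always agree and a third dissents on one subject, the dissenting user ends up with a weight below 1 and the score of the disputed subject moves toward the majority response.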

4.1.2. Secondary Eclipse Check

The equations for calculating the subject scores and user weights for the Secondary Eclipse Check (Section 3.2) differ from above as it is only possible to select one of the three possible responses ("A secondary eclipse," "No secondary eclipse," and "A large data gap"). For this workflow, each user j is assigned a single weight, w_j(SE), that is applied to the calculation of the subject scores. For each subject i in the Secondary Eclipse Check, a set of three subject scores, ${s}_{i}({{YS}}_{\sec })$ , ${s}_{i}({{NS}}_{\sec })$ , and ${s}_{i}({{DG}}_{\sec })$ , are calculated for the three responses, "A secondary eclipse," "No secondary eclipse," and "A large data gap," respectively. These scores are calculated as

$\begin{eqnarray}\begin{array}{rcl}{s}_{i}({{YS}}_{\sec }) & = & \displaystyle \frac{1}{{{SE}}_{i}}\displaystyle \sum _{k}{w}_{k}({SE})\\ k & = & \mathrm{users}\ \mathrm{who}\ \mathrm{selected}\ {\rm{``}}\mathrm{A\; secondary}\ \mathrm{eclipse}{\rm{\mbox{''}}}\ \mathrm{for}\ \mathrm{subject}\ i,\end{array}\end{eqnarray} \tag{ 4 }$

$\begin{eqnarray}\begin{array}{rcl}{s}_{i}({{NS}}_{\sec }) & = & \displaystyle \frac{1}{{{SE}}_{i}}\displaystyle \sum _{k}{w}_{k}({SE})\\ k & = & \mathrm{users}\ \mathrm{who}\ \mathrm{selected}\ {\rm{``}}\mathrm{No\; secondary}\ \mathrm{eclipse}{\rm{\mbox{''}}}\ \mathrm{for}\ \mathrm{subject}\ i,\end{array}\end{eqnarray} \tag{ 5 }$

$\begin{eqnarray}\begin{array}{rcl}{s}_{i}({{DG}}_{\sec }) & = & \displaystyle \frac{1}{{{SE}}_{i}}\displaystyle \sum _{k}{w}_{k}({SE})\\ k & = & \mathrm{users}\ \mathrm{who}\ \mathrm{selected}\ {\rm{``}}{\rm{A}}\ \mathrm{large}\ \mathrm{data}\ \mathrm{gap}{\rm{\mbox{''}}}\ \mathrm{for}\ \mathrm{subject}\ i,\end{array}\end{eqnarray} \tag{ 6 }$

where w_k(SE) is the weight for user k for this workflow and SE_i is the sum of the weights for all users who classified subject i:

$\begin{eqnarray}\begin{array}{rcl}{{SE}}_{i} & = & \displaystyle \sum _{k=j}{w}_{k}({SE})\\ j & = & \mathrm{all}\ \mathrm{users}\ \mathrm{who}\ \mathrm{classified}\ \mathrm{subject}\ i.\end{array}\end{eqnarray} \tag{ 7 }$

These subject scores can be in the range [0, 1], with a score of 1 indicating unanimous agreement that the given feature is present and a score of 0 meaning that all users agreed that the given feature is not visible in the subject presented. As in the Exoplanet Transit Search, all user weights are initially set to 1 and the weights of users who classified only one subject remain as 1. Once the subject scores have been calculated, we assign new user weights as follows:

$\begin{eqnarray}\begin{array}{rcl}{w}_{j}({SE}) & = & \left\{\begin{array}{ll}\displaystyle \frac{B}{{N}_{j}}\left(\displaystyle \sum _{i=p}{s}_{i}({{YS}}_{\sec })+\displaystyle \sum _{i=q}{s}_{i}({{NS}}_{\sec })+\displaystyle \sum _{i=r}{s}_{i}({{DG}}_{\sec })\right) & \ \mathrm{if}\ {N}_{j}\gt 1\\ 1 & \ \mathrm{if}\ {N}_{j}=1\end{array}\right.\\ p & = & \mathrm{subjects}\ \mathrm{user}\ j\ \mathrm{classified}\ \mathrm{as}\ {\rm{``}}{\rm{A}}\ \mathrm{secondary}\ \mathrm{eclipse}{\rm{\mbox{''}}}\\ q & = & \mathrm{subjects}\ \mathrm{user}\ j\ \mathrm{classified}\ \mathrm{as}\ {\rm{``}}\mathrm{No}\ \mathrm{secondary}\ \mathrm{eclipse}{\rm{\mbox{''}}}\\ r & = & \mathrm{subjects}\ \mathrm{user}\ j\ \mathrm{classified}\ \mathrm{as}\ {\rm{``}}{\rm{A}}\ \mathrm{large}\ \mathrm{data}\ \mathrm{gap}{\rm{\mbox{''}}},\end{array}\end{eqnarray} \tag{ 8 }$

where N_j is the number of subjects classified in this workflow by user j. The scaling factor B is chosen such that the mean user weight of users who classified more than one subject is 1. The principles of this scheme are the same as those used for the Exoplanet Transit Search. Users are up-weighted when they agree with the majority of users who classified a given subject and down-weighted otherwise. A higher subject score for a given response indicates greater consensus among the users who classified the given subject that the given feature is present. The scheme proceeds iteratively, calculating subject scores using Equations (4), (5), (6), and (7) and weights using Equation (8), until convergence is achieved and a final set of subject scores is calculated, as described above in Section 4.1.1.
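A single pass of Equations (4)–(8) might look like the sketch below. Because only one response can be selected per subject, each user carries one weight and the agreement term is simply the score of the response they chose. The function name `single_choice_step`, the short response labels, and the input format are assumptions for illustration; repeating the step until the weights stop changing would reproduce the iteration described in Section 4.1.1.

```python
from collections import defaultdict
from statistics import mean

# Short labels (assumed here) for "A secondary eclipse",
# "No secondary eclipse", and "A large data gap".
RESPONSES = ("YS", "NS", "DG")


def single_choice_step(votes, weights):
    """One pass of Equations (4)-(8).

    votes: iterable of (user, subject, response), response in RESPONSES.
    weights: dict mapping user -> current w_j(SE).
    Returns (scores, new_weights).
    """
    by_subject = defaultdict(list)
    by_user = defaultdict(list)
    for user, subject, response in votes:
        by_subject[subject].append((user, response))
        by_user[user].append((subject, response))

    scores = {}
    for subject, subject_votes in by_subject.items():
        total = sum(weights[u] for u, _ in subject_votes)      # SE_i, Eq. (7)
        scores[subject] = {
            r: sum(weights[u] for u, choice in subject_votes if choice == r) / total
            for r in RESPONSES                                   # Eqs. (4)-(6)
        }

    # Eq. (8) before scaling: mean score of the responses the user picked.
    raw = {
        user: sum(scores[s][choice] for s, choice in user_votes) / len(user_votes)
        for user, user_votes in by_user.items()
        if len(user_votes) > 1
    }
    scale = 1.0 / mean(raw.values()) if raw else 1.0            # factor B
    new_weights = {u: scale * raw[u] if u in raw else 1.0 for u in by_user}
    return scores, new_weights
```

By construction the three scores for a subject sum to 1, since every classification contributes its weight to exactly one response.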

4.1.3. Odd/Even Transit Check

In the Odd/Even Transit Check (Section 3.3), users are presented with a subject and first asked to classify whether the data for both the odd and even transits cover the middle portion of the plot. If the user selects "No" to this first task, then they move on to reviewing the next subject; however, if they select "Yes," the user is then asked whether the depths of the odd and even transits match. The options for responses in both tasks are simply "Yes" or "No." Since both tasks are separate Yes/No questions, each user j who classifies subjects in this workflow is assigned a weight for the first task, w_j(OC), and a weight for the second task, w_j(OD). The subject scores for the first task, s_i(YC_odd) for "Yes, the data for both the odd and even transits cover the middle portion of the plot" and s_i(NC_odd) for "No, the data for the odd and/or even transits do not cover the middle portion of the plot," are calculated as follows:

$\begin{eqnarray}\begin{array}{rcl}{s}_{i}({{YC}}_{\mathrm{odd}}) & = & \displaystyle \frac{1}{{{OC}}_{i}}\displaystyle \sum _{k}{w}_{k}({OC})\\ k & = & \mathrm{users}\ \mathrm{who}\ \mathrm{selected}\ {\rm{``}}\mathrm{Yes},\\ & & \mathrm{the}\ \mathrm{data}\ \mathrm{for}\ \mathrm{both}\ \mathrm{the}\ \mathrm{odd}\ \mathrm{and}\ \mathrm{even}\ \mathrm{transits}\\ & & \mathrm{cover}\ \mathrm{the}\ \mathrm{middle}\ \mathrm{portion}\ \mathrm{of}\ \mathrm{the}\ \mathrm{plot}{\rm{\mbox{''}}}\ \mathrm{for}\ \mathrm{subject}\ i,\end{array}\end{eqnarray} \tag{ 9 }$

$\begin{eqnarray}\begin{array}{rcl}{s}_{i}({{NC}}_{\mathrm{odd}}) & = & \displaystyle \frac{1}{{{OC}}_{i}}\displaystyle \sum _{k}{w}_{k}({OC})\\ k & = & \mathrm{users}\ \mathrm{who}\ \mathrm{selected}\ {\rm{``}}\mathrm{No},\\ & & \mathrm{the}\ \mathrm{data}\ \mathrm{for}\ \mathrm{the}\ \mathrm{odd}\ \mathrm{and}/\mathrm{or}\ \mathrm{even}\ \mathrm{transits}\\ & & \mathrm{do}\ \mathrm{not}\ \mathrm{cover}\ \mathrm{the}\ \mathrm{middle}\ \mathrm{portion}\ \mathrm{of}\ \mathrm{the}\ \mathrm{plot}{\rm{\mbox{''}}}\ \mathrm{for}\ \mathrm{subject}\ i,\end{array}\end{eqnarray} \tag{ 10 }$

where w_k(OC) is the weight for user k and OC_i is the sum of the weights for all users who classified subject i:

$\begin{eqnarray}\begin{array}{rcl}{{OC}}_{i} & = & \displaystyle \sum _{k=j}{w}_{k}({OC})\\ j & = & \mathrm{all}\ \mathrm{users}\ \mathrm{who}\ \mathrm{classified}\ \mathrm{subject}\ i.\end{array}\end{eqnarray} \tag{ 11 }$

These subject scores can be in the range [0, 1], with higher scores indicating greater agreement among the users who classified the subject that the given feature is present. As in the other workflows, all user weights are initially set to 1 and the weights of users who classified only one subject remain as 1. The weights for this task are calculated as follows using the subject scores determined above:

$\begin{eqnarray}\begin{array}{rcl}{w}_{j}({OC}) & = & \left\{\begin{array}{ll}\displaystyle \frac{C}{{N}_{j}}\left(\displaystyle \sum _{i=p}{s}_{i}({{YC}}_{\mathrm{odd}})+\displaystyle \sum _{i=q}{s}_{i}({{NC}}_{\mathrm{odd}})\right) & \ \mathrm{if}\ {N}_{j}\gt 1\\ 1 & \ \mathrm{if}\ {N}_{j}=1\end{array}\right.\\ p & = & \mathrm{subjects}\ \mathrm{user}\ j\ \mathrm{classified}\ \mathrm{as}\ {\rm{``}}\mathrm{Yes},\\ & & \mathrm{the}\ \mathrm{data}\ \mathrm{for}\ \mathrm{both}\ \mathrm{the}\ \mathrm{odd}\ \mathrm{and}\ \mathrm{even}\\ & & \mathrm{transits}\ \mathrm{cover}\ \mathrm{the}\ \mathrm{middle}\ \mathrm{portion}\ \mathrm{of}\ \mathrm{the}\ \mathrm{plot}{\rm{\mbox{''}}}\\ q & = & \mathrm{subjects}\ \mathrm{user}\ j\ \mathrm{classified}\ \mathrm{as}\ {\rm{``}}\mathrm{No},\\ & & \mathrm{the}\ \mathrm{data}\ \mathrm{for}\ \mathrm{the}\ \mathrm{odd}\ \mathrm{and}/\mathrm{or}\ \mathrm{even}\\ & & \mathrm{transits}\ \mathrm{do}\ \mathrm{not}\ \mathrm{cover}\ \mathrm{the}\ \mathrm{middle}\ \mathrm{portion}\ \mathrm{of}\ \mathrm{the}\ \mathrm{plot}{\rm{\mbox{''}}},\end{array}\end{eqnarray} \tag{ 12 }$

where N_j is the number of subjects in this task classified by user j. The scaling factor C is chosen such that the mean user weight of users who classified more than one subject is 1. We follow the iterative scheme of recalculating the subject scores and user weights until convergence is achieved as described above in Section 4.1.1.

Once the subject scores and user weights for the first task of the Odd/Even Transit Check have been finalized, we apply a threshold cut of s_i(YC_odd) ≥ 0.5. That is, the subject scores for the second task, s_i(YD_odd) for "Yes, the odd/even transits have similar depths" and s_i(ND_odd) for "No, the odd/even transits do not have similar depths," are only calculated for subjects that pass the criterion of s_i(YC_odd) ≥ 0.5. This ensures that only subjects that have sufficient data coverage are evaluated for depth differences. This in turn means that users are neither up-weighted nor down-weighted when classifying subjects in the second task where only a few volunteers were able to provide a response. The equations for calculating the subject scores for a given subject i are as follows:

$\begin{eqnarray}\begin{array}{rcl}{s}_{i}({{YD}}_{\mathrm{odd}}) & = & \displaystyle \frac{1}{{{OD}}_{i}}\displaystyle \sum _{k}{w}_{k}({OD})\\ k & = & \mathrm{users}\ \mathrm{who}\ \mathrm{selected}\ {\rm{``}}\mathrm{Yes},\\ & & \mathrm{the}\ \mathrm{odd}/\mathrm{even}\ \mathrm{transits}\ \mathrm{have}\ \mathrm{similar}\ \mathrm{depths}{\rm{\mbox{''}}}\ \mathrm{for}\ \mathrm{subject}\ i,\end{array}\end{eqnarray} \tag{ 13 }$

$\begin{eqnarray}\begin{array}{rcl}{s}_{i}({{ND}}_{\mathrm{odd}}) & = & \displaystyle \frac{1}{{{OD}}_{i}}\displaystyle \sum _{k}{w}_{k}({OD})\\ k & = & \mathrm{users}\ \mathrm{who}\ \mathrm{selected}\ {\rm{``}}\mathrm{No},\\ & & \mathrm{the}\ \mathrm{odd}/\mathrm{even}\ \mathrm{transits}\ \mathrm{do}\ \mathrm{not}\ \mathrm{have}\ \mathrm{similar}\ \mathrm{depths}{\rm{\mbox{''}}}\ \mathrm{for}\ \mathrm{subject}\ i,\end{array}\end{eqnarray} \tag{ 14 }$

where w_k(OD) is the weight for user k and OD_i is the sum of the weights for all users who classified subject i:

$\begin{eqnarray}\begin{array}{rcl}{{OD}}_{i} & = & \displaystyle \sum _{k=j}{w}_{k}({OD})\\ j & = & \mathrm{all}\ \mathrm{users}\ \mathrm{who}\ \mathrm{classified}\ \mathrm{subject}\ i.\end{array}\end{eqnarray} \tag{ 15 }$

These subject scores can be in the range [0, 1], with higher scores indicating greater agreement among the users who classified the subject that the given feature is present. All user weights are initially set to 1 and the weights of users who classified only one subject remain as 1. The weights for this task are calculated as follows using the subject scores determined above:

$\begin{eqnarray}\begin{array}{rcl}{w}_{j}({OD}) & = & \left\{\begin{array}{ll}\displaystyle \frac{D}{{N}_{j}}\left(\displaystyle \sum _{i=p}{s}_{i}({{YD}}_{\mathrm{odd}})+\displaystyle \sum _{i=q}{s}_{i}({{ND}}_{\mathrm{odd}})\right) & \ \mathrm{if}\ {N}_{j}\gt 1\\ 1 & \ \mathrm{if}\ {N}_{j}=1\end{array}\right.\\ p & = & \mathrm{subjects}\ \mathrm{user}\ j\ \mathrm{classified}\ \mathrm{as}\ {\rm{``}}\mathrm{Yes},\\ & & \mathrm{the}\ \mathrm{odd}/\mathrm{even}\ \mathrm{transits}\ \mathrm{have}\ \mathrm{similar}\ \mathrm{depths}{\rm{\mbox{''}}}\\ q & = & \mathrm{subjects}\ \mathrm{user}\ j\ \mathrm{classified}\ \mathrm{as}\ {\rm{``}}\mathrm{No},\\ & & \mathrm{the}\ \mathrm{odd}/\mathrm{even}\ \mathrm{transits}\ \mathrm{do}\ \mathrm{not}\ \mathrm{have}\ \mathrm{similar}\ \mathrm{depths}{\rm{\mbox{''}}},\end{array}\end{eqnarray} \tag{ 16 }$

where N_j is the number of subjects in this task classified by user j. The scaling factor D is chosen such that the mean user weight of users who classified more than one subject is 1. We again follow the iterative scheme of recalculating the subject scores and user weights until convergence is achieved as described above in Section 4.1.1.
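The two-stage structure of the Odd/Even Transit Check, in which depth scores are only computed for subjects that pass the coverage cut, can be illustrated as follows. This sketch takes the user weights as given rather than iterating them, and it assumes that only users who answered the second task contribute to OD_i; the function names and the dictionary layout are hypothetical.

```python
def yes_score(votes, weights):
    """Weighted fraction of "Yes" votes; votes maps user -> bool.

    Users missing from `weights` are given the starting weight of 1.
    """
    total = sum(weights.get(u, 1.0) for u in votes)
    yes = sum(weights.get(u, 1.0) for u, v in votes.items() if v)
    return yes / total


def odd_even_scores(coverage_votes, depth_votes, w_oc, w_od, threshold=0.5):
    """Coverage scores for all subjects; depth scores only above threshold."""
    results = {}
    for subject, cov in coverage_votes.items():
        s_yc = yes_score(cov, w_oc)                       # Eq. (9)
        entry = {"YC": s_yc, "NC": 1.0 - s_yc}            # Eq. (10): binary task
        # Second task is scored only if coverage is sufficient.
        if s_yc >= threshold and depth_votes.get(subject):
            s_yd = yes_score(depth_votes[subject], w_od)  # Eq. (13)
            entry.update({"YD": s_yd, "ND": 1.0 - s_yd})  # Eq. (14)
        results[subject] = entry
    return results
```

Because both tasks are binary and share a denominator, the "No" score is always one minus the "Yes" score, which is why a single weight per task suffices.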

4.1.4. Distribution of Scores and Weights

Table 1 provides the final scores for each candidate, with the scores for subjects in the Secondary Eclipse Check and Odd/Even Transit Check provided where available. Figure 4 shows the cumulative distributions of the subject scores for each response in the Exoplanet Transit Search. V-shaped dips are much more common in the sample than U-shaped dips, reflecting the prevalence of eclipsing binary systems that are typically found by exoplanet transit searches (Brown 2003; Alonso et al. 2004). The histograms of the user weights for each response in the Exoplanet Transit Search are shown in Figure 5. The apparent spike at w_j(SV) = 1 arises because the weights for this response have a larger spread, which makes the fixed weights of users who classified only one subject more obvious than in the distributions of weights for the other four responses. For w_j(U), 96% of users have weights greater than 0.8, and 59% have weights greater than 1. Table 2 shows the percentage of users with weights above 0.8 and 1 for all weights in the PHNGTS project. The cumulative distributions of scores and histograms of user weights for the Secondary Eclipse Check and Odd/Even Transit Check are provided in Appendix A.

**Figure 4.** Cumulative distribution of subject scores greater than the given score on the x-axis. The scores for each response of the Exoplanet Transit Search (s_i(U), s_i(V), s_i(DG), s_i(ND), and s_i(SV)) are shown.

**Figure 5.** Histograms of user weights for each response in the Exoplanet Transit Search.

Table 1. PHNGTS Subject Information and Scores for Each Workflow Where Available

	Exoplanet Transit Search						Secondary Eclipse Check				Odd/Even Transit Check
TIC ID	Subject ID	s_i(U)	s_i(V)	s_i(ND)	s_i(DG)	s_i(SV)	Subject ID	${s}_{i}({{YS}}_{\sec })$	${s}_{i}({{NS}}_{\sec })$	${s}_{i}({{DG}}_{\sec })$	Subject ID	s_i(YC_odd)	s_i(NC_odd)	s_i(YD_odd)	s_i(ND_odd)
57908727	74820792	0.183673	0.130785	0.647565	0.082170	0.058148	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
14092921	69494436	0.141035	0.387247	0.090076	0.680189	0.134000	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
333018311	69674773	0.039031	0.901027	0.000000	0.000000	0.049919	73015380	0.965594	0.034406	0.0	73029773	0.955156	0.044844	0.932479	0.067521
165374842	69660130	0.650532	0.094216	0.000000	0.038786	0.418303	71368514	0.955075	0.044925	0.0	71374760	0.978074	0.021926	1.000000	0.000000
60768993	69541419	0.000000	0.649549	0.000000	0.235270	0.659147	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
31141153	74812193	0.000000	0.637888	0.088087	0.033987	0.445865	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
135310361	69662716	0.228356	0.641675	0.039714	0.000000	0.189841	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
242027017	74817247	0.397940	0.092060	0.000000	0.601868	0.230841	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
441311367	69641162	0.189460	0.192876	0.000000	0.033981	1.000000	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
404020484	69674877	0.000000	0.845810	0.048033	0.263466	0.051691	91086040	1.000000	0.000000	0.0	91086092	0.946373	0.053627	0.930785	0.069215
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮

Note. Prior to the implementation of the user weighting and subject scoring scheme, subjects with ≥50% of votes for "U-shaped" and <60% of votes for "Stellar variability" were pushed to the Secondary Eclipse Check and Odd/Even Transit Check. The subject scores calculated across all workflows for the 745 subjects that met this vote criterion but did not meet the score thresholds are included in the online supplementary material.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Table 2. Percentage of Users with Weights above 0.8 and 1

Workflow	Weight	w_j(X) > 0.8	w_j(X) > 1
Exoplanet Transit Search	U-shaped	96%	59%
	V-shaped	97%	52%
	Data gap	93%	63%
	No dip	95%	55%
	Stellar Variability	88%	56%
Secondary Eclipse Check	Secondary Eclipse	80%	50%
Odd/Even Transit Check	Odd/Even Data Gap	81%	50%
	Odd/Even Depths	90%	46%

4.2. Candidate Selection

We use the subject scores calculated from the weighting scheme to impose a series of threshold cuts in order to select a list of candidates that advance to the additional workflows and then to subsequent visual vetting by the PHNGTS science team. The aim of these cuts is to generate a list of potential exoplanet transit candidates without also including an overwhelming number of false positives. We used the information from visual vetting of a random subset of preliminary candidates from the Exoplanet Transit Search to set these threshold cuts for candidate selection. Figure 6 shows example light curves for 10 subjects classified in the Exoplanet Transit Search, randomly selected from bins of s_i(U) with a bin size of 0.1. We see that subjects with a score closer to 1 tend to show more obvious U-shaped transits, and therefore we select candidates with s_i(U) ≥ 0.5 and s_i(SV) < 0.6 for further vetting in the additional workflows. The imposed limit on stellar variability (s_i(SV) < 0.6) reduces the number of false positives from light curves that show more variability around the primary transit, which is typically indicative of systematics or eclipsing binary systems that exhibit ellipsoidal modulation (Morris 1985; Mazeh 2008).

**Figure 6.** Random sample of Exoplanet Transit Search subjects per U-shaped score s_i(U) binned with bin size of 0.1. The x-axes show Phase and the y-axes show Normalized Flux. Titles display Subject ID. The U-shaped subject score, s_i(U), is shown to the left of each image.

We also search for candidates that may exhibit a grazing or near-grazing exoplanet transit, which typically display a V-shaped transit. We achieve this by using the scores for the V-shaped response (s_i(V)) to select candidates with s_i(V) ≥ 0.8 and s_i(SV) < 0.6 to be advanced to the additional workflows. We do not include subjects that have already been advanced to the additional workflows with the s_i(U) ≥ 0.5 cut again. We impose the same limit as before on stellar variability (s_i(SV) < 0.6) to reduce the number of false positives. These cuts were selected after visual vetting of a preliminary selection of candidates from the Exoplanet Transit Search. Although this cut will also identify eclipsing binaries, the majority of these will be filtered by the Secondary Eclipse Check and Odd/Even Transit Check. In the scenario where only the primary eclipse of an eclipsing binary system is visible, further vetting tests by the PHNGTS science team (such as estimates of the radius of the secondary body) can be used to rule out these false positives.

The candidates selected by the above cuts are classified in the Secondary Eclipse Check and Odd/Even Transit Check workflows that, as described in Sections 3.2 and 3.3, are designed to check for secondary eclipses and differences in the depths of the odd and even transits. Once candidates are retired from these workflows, we apply the user weighting schemes to each of the workflows as described in Sections 4.1.2 and 4.1.3. We apply another set of threshold cuts using the scores calculated for each candidate in these workflows in order to generate a list of potential planet candidates to be reviewed by the PHNGTS science team, while aiming to rule out as many false positives as possible. As described in Section 4.1.3, we only calculate scores for the second task of the Odd/Even Transit Check if a subject has sufficient data coverage, i.e., s_i(YC_odd) ≥ 0.5. In addition to this cut, candidates with both ${s}_{i}({{NS}}_{\sec })\geqslant 0.5$ and s_i(YD_odd) ≥ 0.5 are selected for review. These cuts were chosen following visual vetting of a random subset of the preliminary candidate lists from these workflows. These cuts are designed to select candidates that have been classified as showing no secondary eclipse and appear to have matching depths for the odd and even transits, both of which are designed to rule out obvious eclipsing binary systems.
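The two sets of threshold cuts can be written as a pair of small predicates. The sketch below applies them to three rows copied from the excerpt of Table 1; the function names are hypothetical, and `None` is used here (as an assumption) to stand for a score that was not calculated.

```python
def passes_transit_search(s_u, s_v, s_sv):
    """First-stage cuts on Exoplanet Transit Search scores."""
    if s_sv >= 0.6:          # stellar-variability limit applies to both cuts
        return False
    return s_u >= 0.5 or s_v >= 0.8


def passes_additional_workflows(s_ns_sec, s_yc_odd, s_yd_odd):
    """Second-stage cuts; None marks a score that was not calculated."""
    if s_ns_sec is None or s_yc_odd is None or s_yd_odd is None:
        return False
    return s_yc_odd >= 0.5 and s_ns_sec >= 0.5 and s_yd_odd >= 0.5


# (TIC ID, s_U, s_V, s_SV, s_NS_sec, s_YC_odd, s_YD_odd) from Table 1.
rows = [
    (57908727, 0.183673, 0.130785, 0.058148, None, None, None),
    (165374842, 0.650532, 0.094216, 0.418303, 0.044925, 0.978074, 1.000000),
    (333018311, 0.039031, 0.901027, 0.049919, 0.034406, 0.955156, 0.932479),
]

stage_one = [r[0] for r in rows if passes_transit_search(r[1], r[2], r[3])]
stage_two = [
    r[0]
    for r in rows
    if passes_transit_search(r[1], r[2], r[3])
    and passes_additional_workflows(r[4], r[5], r[6])
]
```

For these rows, TIC 165374842 passes the U-shaped cut and TIC 333018311 passes the V-shaped cut, but both are then rejected because their "No secondary eclipse" scores fall well below 0.5.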

The final stage of the vetting process involves the PHNGTS science team reviewing the remaining potential planet candidates. Prior to this vetting stage, we remove 1144 known NGTS Objects of Interest (NOIs) that constitute known planets and planet candidates, as well as eclipsing binaries and objects that were initially identified as possible candidates but have since been rejected in the original NGTS candidate identification process. We review the remaining 4726 potential planet candidates using the internal NGTS vetting portal (Wheatley et al. 2018). This allows the PHNGTS science team to do the following: review the estimated stellar and planet candidate parameters; review the phase-folded light curves for other significant periods detected by ORION; determine whether the detected period is common within the field (indicative of systematics); identify nearby objects that may be the true source of the observed signal; review the light curves for individual nights of observation; and review the centroid vetting (Günther et al. 2017). A full description of the vetting process is given in Bayliss et al. (2018). Table 3 outlines the number of subjects that remain at each step of this candidate selection process.

Table 3. Number of PHNGTS Subjects that Remain at Each Stage of the Candidate Selection Process

	Number of Subjects Remaining	Percentage Remaining from Previous Step
All	105,534
Exoplanet Transit Search Scores:
s_i(U) ≥ 0.5 and s_i(SV) < 0.6^a	6303	5.97
s_i(V) ≥ 0.8 and s_i(SV) < 0.6	10,029	9.50
Secondary Eclipse and Odd/Even Transit Check Scores:
${s}_{i}({{NS}}_{\sec })\geqslant 0.5$ and s_i(YC_odd) ≥ 0.5 and s_i(YD_odd) ≥ 0.5	5870	35.94
Removing NOIs	4726	80.51
Planet Candidates	5	0.11

Notes. Terms s_i(U), s_i(V), and s_i(SV) are the subject scores for the U-shaped, V-shaped, and Stellar Variability responses in the Exoplanet Transit Search, respectively. Terms ${s}_{i}({{NS}}_{\sec })$ , s_i(YC_odd) and s_i(YD_odd) are the subject scores for "No Secondary Eclipse," "Yes, the data for both the odd and even transits cover the middle portion of the plot," and "Yes, the odd/even transits have similar depths," respectively. The five remaining planet candidates are selected after visual vetting by the PHNGTS science team.

^aIncludes 745 subjects that were moved to the additional workflows using the initial vote counting method.

5. Planet Candidates and Interesting Systems

The PHNGTS science team review stage identified five planet candidates for further analysis from the 4726 subjects that passed all threshold cuts. The 4721 subjects not carried forward consisted primarily of spurious signals due to telescope systematics and eclipsing binary candidates, as well as a number of possible planet candidates that were deemed infeasible for further observations. Those deemed infeasible typically had estimated secondary radii above 2R_J and therefore a higher likelihood of being brown dwarfs or very low-mass stars; thus, follow-up observations would be an inefficient use of limited observing resources. Additionally, some candidates showed only a single convincing transit event, and therefore we could not constrain the period or confidently determine possible planetary parameters. Here, we discuss the details of the most interesting planet candidates that were identified via the science team review. At the time of discovery, none of these candidates were NOIs or TESS Objects of Interest (TOIs; Guerrero et al. 2021).²⁵ We note that, in addition to the planet candidates described here that were discovered through PHNGTS, the full online version of Table 1 includes additional candidate/solved systems that are to be included in forthcoming publications, e.g., TIC-388917846 (M. Battley et al. 2024, in preparation). Table 4 provides the initial transit parameters derived by ORION for each of the five most interesting planet candidates and estimates of their planetary radii, calculated by combining the transit depth and the stellar radius from the TESS Input Catalog (TIC v8.2; Stassun et al. 2019; Paegert et al. 2021).
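As a rough illustration of how a transit depth and a stellar radius combine into a planetary radius, the sketch below uses the standard first-order relation depth ≈ (R_p/R_*)², which neglects limb darkening, blending, and grazing geometries. The text does not state the exact formula used, so this relation is an assumption, as are the function name and the example star.

```python
import math

R_SUN_M = 6.957e8    # nominal solar radius in metres (IAU 2015)
R_JUP_M = 7.1492e7   # nominal equatorial Jovian radius in metres (IAU 2015)


def planet_radius_rj(depth_percent, r_star_rsun):
    """Planet radius in R_J from a transit depth (%) and stellar radius (R_sun).

    Assumes depth = (R_p / R_*)**2 with no limb darkening or dilution.
    """
    depth = depth_percent / 100.0
    return math.sqrt(depth) * r_star_rsun * R_SUN_M / R_JUP_M


# Hypothetical example: a 1% deep transit of a 1 R_sun star.
example_radius = planet_radius_rj(1.0, 1.0)
```

For the hypothetical 1% transit of a Sun-sized star, this gives a radius of about 0.97 R_J, which is the familiar result that a Jupiter-sized planet blocks roughly 1% of the light of a solar-type star.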

Table 4. ORION Parameters for PHNGTS Planet Candidates

TIC ID	Epoch	Period	Transit Depth	SDE	R_p
	(BJD − 2,457,000)	(days)	(%)		(R_J)
165227846	1098.557674	2.096882	5.18	50.66	0.71
135251751	1104.019630	4.048836	0.27	25.91	1.21
125925505	789.818866	1.743484	2.19	27.55	1.11
125759305	795.630509	9.992979	2.35	32.48	1.15
180997904	1100.341400	4.936048	0.96	33.24	1.27

5.1. Additional Follow-up Data

Here, we describe the facilities used to obtain additional follow-up data of our noteworthy planet candidates. Table 5 outlines the follow-up obtained for each planet candidate to date. We use available TESS data and obtain additional ground-based photometric observations using the 1 m telescope at the South African Astronomical Observatory (SAAO). High-resolution speckle imaging was obtained using the Zorro instrument on Gemini South (Scott et al. 2021; Howell & Furlan 2022). Spectroscopic follow-up was carried out using CORALIE (Queloz et al. 2001a) and the Fibre-fed Extended Range Optical Spectrograph (FEROS; Kaufer & Pasquini 1998).

Table 5. An Overview of the Observing Facilities Used for PHNGTS Candidate Follow-up

TIC ID	Photometry	Spectroscopy	Speckle
165227846	TESS (10, 37, 63, 64); SAAO (2022-04-28, $z^{\prime}$; 2023-02-28, $g^{\prime}$; 2023-04-11, $i^{\prime}$; 2023-05-23, V)		Zorro (2022-05-22)
135251751	TESS (10, 37, 63)	CORALIE; FEROS	Zorro (2023-05-27)
125925505	SAAO (2022-04-30, $z^{\prime}$; 2022-07-16, $r^{\prime}$; 2023-07-24, V)	FEROS	Zorro (2022-05-22)
125759305	TESS (11, 38)
180997904	TESS (10, 36, 37, 63); SAAO (2023-04-10, $i^{\prime}$)

Note. More details on the observations can be found in the main text and subsequent tables. Numbers in brackets for each candidate with TESS photometry indicate which TESS sectors this target was observed in.

5.1.1. TESS

TESS (Ricker et al. 2015) is a space-based NASA survey telescope. TESS monitors 96 × 24 deg² sectors of the sky for ∼27.4 days at a time with near-continuous coverage during this observing window. None of our targets were observed with 2 min or 20 s cadence. We utilize the TESS Full Frame Images (FFIs), which can be used to extract light curves for any target in the TESS field of view (see Table 5). We use the light curves generated from the FFIs by the Quick-Look Pipeline (QLP; Huang et al. 2020; Kunimoto et al. 2021, 2022) where available and use the eleanor (Feinstein et al. 2019), TESSCut (Brasseur et al. 2019), and lightkurve (Lightkurve Collaboration et al. 2018) packages to access and analyze the light curves for targets without QLP light curves.

5.1.2. SAAO

We obtained follow-up photometry for a selection of our targets (see Table 5) between 2022 April 28 and 2023 July 24 using the 1 m telescope at the SAAO. The 1 m telescope was equipped with different versions of the Sutherland High-speed Optical Camera (SHOC) at different times. Observations carried out in 2022 were obtained using the "shocnawe" camera, while observations from 2023 were obtained using "shocndisbelief." These instruments are nearly identical with 2.85′ × 2.85′ fields of view and pixel scales of 0.167″ pixel⁻¹ (Coppejans et al. 2013). Observations were carried out using a range of photometric filters (see Table 5) to check for differences in the transit depth across multiple passbands. A difference in the measured transit depth in differing filters would indicate that the system is an eclipsing binary consisting of two stars with differing colors (Drake 2003; Tingley 2004; Parviainen et al. 2019). We were able to simultaneously observe at least two comparison stars of similar brightness for each target. Calibration frames for the data reduction were taken each night at sunset and/or sunrise. Each light curve was bias and flat-field corrected using the local Python-based SAAO SHOC pipeline, which uses IRAF (Tody 1986) photometry tasks (PyRAF Science Software Branch at STScI 2012) and facilitates the extraction of raw and differential light curves. We used the Starlink package AUTOPHOTOM (Currie et al. 2014) to perform aperture photometry on both our target and comparison stars, and chose apertures that gave the maximum signal-to-noise ratio. Background annuli were chosen around the target and comparison stars that avoided any contaminating faint sources in the field of view. Finally, the measured fluxes of the comparison stars were used in order to conduct differential photometry of the target.
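The final differential photometry step can be sketched generically as dividing the target flux by the summed flux of the comparison stars in each frame. This is not the SAAO SHOC pipeline itself: the function name is hypothetical, the choice to sum the comparison stars and to normalize by the median of the whole series are assumptions, and in practice the normalization would usually use out-of-transit points only.

```python
from statistics import median


def differential_light_curve(target_flux, comparison_fluxes):
    """Normalized target flux relative to an ensemble of comparison stars.

    target_flux: list of target fluxes, one per frame.
    comparison_fluxes: list of per-star flux lists, each the same length
    as target_flux.
    """
    # Sum the comparison stars frame by frame to form the reference flux.
    reference = [sum(frame) for frame in zip(*comparison_fluxes)]
    ratio = [t / r for t, r in zip(target_flux, reference)]
    baseline = median(ratio)
    return [x / baseline for x in ratio]


# Toy data: frame-to-frame transparency changes scale every star equally,
# while the target alone dims by 10% in the second frame.
target = [100.0, 72.0, 120.0]
comparisons = [[50.0, 40.0, 60.0], [150.0, 120.0, 180.0]]
light_curve = differential_light_curve(target, comparisons)
```

In the toy data the transparency variations cancel in the ratio, leaving a flat light curve with a single 10% dip.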

5.1.3. Gemini/Zorro

We performed high-resolution speckle imaging of TIC-125925505, TIC-165227846, and TIC-135251751 using the Zorro instrument (Scott et al. 2021; Howell & Furlan 2022) on the 8.1 m Gemini South telescope on Cerro Pachón, Chile. Speckle imaging observations provide extremely high-resolution images by taking multiple sets of 1000 rapid (60 ms) exposures in quick succession. This effectively "freezes out" the atmosphere such that the diffraction limit of the telescope can be reached and we can search for close-in companion stars at up to 8 mag contrast levels. We can quantify the contribution of any nearby stars to the photometric measurements made with NGTS, TESS, and SAAO instruments, allowing for more accurate estimates of the planet candidate radii. Simultaneous observations were obtained with the 562 and 832 nm filters. These data were reduced as described in Howell et al. (2011) to produce contrast curves for both filters.

5.1.4. CORALIE

We monitored TIC-135251751 to obtain radial-velocity measurements using the CORALIE high-resolution (R ∼ 60,000) échelle spectrograph mounted on the Swiss 1.2 m Euler telescope at La Silla Observatory (Queloz et al. 2001a). We obtained eight spectra between 2023 January 3 and 2023 May 15. The spectra were reduced using the standard CORALIE Data Reduction Software (DRS) before being cross-correlated with an A0 stellar mask close to the spectral type of the host star to obtain a cross-correlation function (CCF; e.g., Baranne et al. 1996; Pepe et al. 2002). The pipeline calculates the radial velocity and its associated error as well as parameters such as FWHM, contrast, and bisector inverse slope (BIS), which are derived from the CCF. These additional parameters are used as diagnostics of stellar activity that may be affecting the RV measurements, and have previously been used to detrend the impact of activity on the radial velocities of stars when searching for orbiting planets (e.g., Queloz et al. 2001b; Melo et al. 2007; Díaz et al. 2018). We opt for an A0 template to measure the RVs of an early F-type star as this is the closest match of the templates available (A0, G2, K5, and M2). Early F-types undergo less magnetic braking compared with Solar-type stars. Furthermore, the use of the "incorrect" mask has been shown to have little effect on the overall trends of parameters, with only small changes in the absolute values (e.g., Costes et al. 2021).

5.1.5. FEROS

FEROS (Kaufer & Pasquini 1998) is a high-resolution (R ∼ 48,000) échelle spectrograph mounted on the MPG/ESO 2.2 m telescope at La Silla Observatory. We obtained radial-velocity measurements of TIC-125925505 and TIC-135251751. A total of seven FEROS spectra were collected between 2022 April 10 and 2023 March 31. FEROS data were processed with the CERES pipeline (Brahm et al. 2017), which uses the cross-correlation technique to calculate radial velocities, the associated errors, and the BIS values (sometimes referred to as Bisector Span; however, the methods of calculating BIS for CORALIE and FEROS are identical and as described in Queloz et al. 2001b).

5.2. Modeling

We performed spectral energy distribution (SED) fitting for each candidate to obtain more accurate stellar parameters using the ARIADNE package (Vines & Jenkins 2022). ARIADNE fits broadband photometry measurements from catalogs (see Vines & Jenkins 2022 for photometric bandpasses used) to different stellar atmosphere models such as Phoenix V2 (Husser et al. 2013), BT-Settl, BT-Cond, BT-NextGen (Hauschildt et al. 1999; Allard et al. 2012), Kurucz (1993), and Castelli & Kurucz (2003). The results of these analyses, along with additional parameters from the TIC v8.2 catalog where noted (Stassun et al. 2019; Paegert et al. 2021), are provided in Table 6.

Table 6. Stellar Parameters for PHNGTS Planet Candidates

TIC ID	Component	Stellar Mass	Stellar Radius	Stellar T_eff	Stellar log(g)	Stellar Luminosity	Stellar V_mag ^a	Stellar J_mag (1)
		(M_⊙)	(R_⊙)	(K)		(L_⊙)	(mag)	(mag)
165227846	⋯	${0.3257}_{-0.0134}^{+0.0193}$	${0.3299}_{-0.0174}^{+0.0291}$	${3244.4892}_{-117.2090}^{+80.4376}$	${3.7335}_{-0.1179}^{+0.4008}$	${0.0109}_{-0.0018}^{+0.0022}$	16.365 ± 1.133	11.743 ± 0.021
135251751	(TIC)	1.44 ± 0.23855	2.39031 ± 0.124894	6742 ± 133	3.83952 ± 0.0910193	10.63554 ± 0.8621945	11.125 ± 0.01	10.222 ± 0.022
135251751	A	1.50	1.679	7020	⋯	5.75	⋯	⋯
135251751	B	1.33	1.473	6550	⋯	3.64	⋯	⋯
125925505	⋯	${0.7555}_{-0.0276}^{+0.0375}$	${0.5643}_{-0.0442}^{+0.0576}$	${4741.7769}_{-126.9399}^{+126.9399}$	${4.2991}_{-0.3883}^{+0.4636}$	${0.1441}_{-0.0258}^{+0.0352}$	15.482 ± 0.103	13.443 ± 0.026
125759305	⋯	${0.8250}_{-0.0350}^{+0.0380}$	${0.7832}_{-0.0161}^{+0.0154}$	${4931.5295}_{-62.1902}^{+73.4975}$	${4.2963}_{-0.4133}^{+0.5860}$	${0.3270}_{-0.0206}^{+0.0238}$	14.73 ± 0.206	12.914 ± 0.023
180997904	⋯	${0.9946}_{-0.0376}^{+0.0543}$	${1.3033}_{-0.0307}^{+0.0400}$	${5390.6185}_{-74.4120}^{+91.7171}$	${4.6502}_{-0.6755}^{+0.2154}$	${1.3012}_{-0.0985}^{+0.1204}$	14.376 ± 0.069	12.979 ± 0.023

Note. TIC parameters for TIC-135251751, which assume a single star, are provided for comparison. The derivation of the parameters of the individual components of TIC-135251751 is described in Section 5.3.2.

^aTIC v8.2 (Paegert et al. 2021).

We derive planetary parameters for each system using the allesfitter package (Günther & Daylan 2019, 2021). Allesfitter combines ellc (light-curve and RV models; Maxted 2016), emcee (Foreman-Mackey et al. 2013), dynesty (Nested Sampling (NS); Speagle 2020), and celerite (Foreman-Mackey et al. 2017) to simultaneously fit photometric and spectroscopic data, with models available for a variety of signals including multiple planet systems and stellar variability. Prior to fitting with allesfitter, we use the BRUCE package to compute preliminary planetary and orbital parameters from the NGTS data only.²⁶ BRUCE is an open-source package for the modeling of binary stars and exoplanets, which employs emcee (Markov Chain Monte Carlo (MCMC) sampling; Foreman-Mackey et al. 2013) and celerite (Gaussian Process (GP) models; Foreman-Mackey et al. 2017). We also use the stellar parameters obtained from the SED fits with ARIADNE as priors for the allesfitter modeling. Given the length of time for the initial BRUCE MCMC fits to converge, a nested sampling approach with allesfitter was taken to perform the global analysis. For all candidates, we adopt a quadratic limb-darkening law for the photometric data as parameterized in Kipping (2013). We use the PyLDTk package (Parviainen & Aigrain 2015), which uses Phoenix V2 models (Husser et al. 2013) and transmission curves from the Spanish Virtual Observatory (SVO) Filter Service (Rodrigo et al. 2012; Rodrigo & Solano 2020), to estimate priors for the limb-darkening coefficients. We implement the Matérn 3/2 GP kernel to model any out-of-transit variability, as this kernel is versatile in its ability to model both long- and short-term trends (Günther & Daylan 2021). Due to the large pixel scale of TESS (21″ pixel⁻¹; Section 5.1.1), the TESS data are fit with a dilution factor due to the possibility of contribution from nearby stars.
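For readers unfamiliar with the Kipping (2013) parameterization of the quadratic limb-darkening law, the mapping between the sampled parameters (q₁, q₂) and the physical coefficients (u₁, u₂) can be sketched as follows. This is an illustrative stand-alone snippet, not code taken from allesfitter or PyLDTk, and the function names and example coefficients are hypothetical.

```python
import math


def kipping_to_quadratic(q1, q2):
    """Map Kipping (2013) parameters (q1, q2) to quadratic coefficients (u1, u2)."""
    u1 = 2.0 * math.sqrt(q1) * q2
    u2 = math.sqrt(q1) * (1.0 - 2.0 * q2)
    return u1, u2


def quadratic_to_kipping(u1, u2):
    """Inverse mapping: q1 = (u1 + u2)^2 and q2 = u1 / (2 (u1 + u2))."""
    q1 = (u1 + u2) ** 2
    q2 = u1 / (2.0 * (u1 + u2))
    return q1, q2


# Round trip for an arbitrary, illustrative pair of coefficients.
q1, q2 = quadratic_to_kipping(0.4, 0.3)
u1, u2 = kipping_to_quadratic(q1, q2)
```

Sampling q₁ and q₂ uniformly on [0, 1] covers only physically plausible limb-darkening profiles, which is the usual motivation for this choice.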

We also utilize the TRICERATOPS (Giacalone & Dressing 2020; Giacalone et al. 2021) validation package to calculate false positive probabilities (FPPs) for each of our planet candidates where TESS data are available (Section 5.1.1). TRICERATOPS uses Bayesian analysis to calculate the probabilities that a transiting signal was produced by a transiting planet or a number of false positive scenarios such as nearby or background eclipsing binaries. TRICERATOPS can also include high-resolution speckle imaging data to provide stronger constraints on false positive scenarios such as unresolved companions. Therefore, we include Gemini/Zorro data where available in our calculations (Section 5.1.3). We perform the FPP calculation 1000 times for each planet candidate and compute the mean and standard deviation of the results. The parameters derived using allesfitter and the fractional FPP values computed using TRICERATOPS are shown in Table 7. We note that the FPP values presented are higher than any generally accepted validation threshold (FPP ≲0.01; Montet et al. 2015; Giacalone & Dressing 2020; Giacalone et al. 2021); however, TRICERATOPS is known to penalize giant planets due to the degeneracy between their radii and those of brown dwarfs and very low-mass stars (Bryant et al. 2023). We are presenting these systems as candidates only and plan where possible to gather further, more robust follow-up observations to confirm or rule out these systems as planetary.

Table 7. PHNGTS Planet Candidate Fitting Results

TIC ID	Component	Epoch	Period	R_p/R_*	R_p	`TRICERATOPS` FPP
		(BJD − 2,457,000)	(days)		(R_J)
165227846	⋯	${2094.49836}_{-0.00005}^{+0.00005}$	${2.0966799}_{-1E-7}^{+1E-7}$	${0.50050}_{-0.02385}^{+0.01664}$	${1.61406}_{-0.12850}^{+0.13200}$	${0.7267}_{-0.2733}^{+0.4316}$
135251751	A	${2091.84231}_{-0.00065}^{+0.00066}$	${4.04841}_{-0.000004}^{+0.000004}$	${0.068013}_{-0.00191}^{+0.00221}$	${1.111}_{-0.031}^{+0.036}$	⋯
135251751	B	⋯	⋯	${0.08552}_{-0.00241}^{+0.00278}$	${1.226}_{-0.035}^{+0.040}$	⋯
125925505	⋯	${1970.07766}_{-0.00040}^{+0.00040}$	${1.7433513}_{-8E-7}^{+8E-7}$	${0.15059}_{-0.00383}^{+0.00392}$	${0.83261}_{-0.07627}^{+0.07892}$	⋯
125759305	⋯	${1585.26417}_{-0.00070}^{+0.00071}$	${9.995842}_{-8E-6}^{+5E-6}$	${0.13589}_{-0.00092}^{+0.00122}$	${1.03640}_{-0.02241}^{+0.02250}$	0.1617 ± 0.0298
180997904	⋯	${2072.85973}_{-0.00153}^{+0.00145}$	${4.93668}_{-0.00001}^{+0.00001}$	${0.08773}_{-0.00234}^{+0.00200}$	${1.11353}_{-0.04033}^{+0.03922}$	0.6714 ± 0.0114

Note. We list the possible planetary parameters for the candidate in the TIC-135251751 system, where it may be orbiting either the primary or secondary star of this binary system. The derivation of these parameters is described in Section 5.3.2. We do not include TRICERATOPS FPP values for TIC-135251751 due to the uncertain nature of the system. We do not know which star the transiting candidate orbits, nor do we have secure estimates of the stellar radii of the possible hosts.

5.3. Interesting Systems

The five candidates identified by the PHNGTS project are all consistent with being hot giant planets, orbiting their host stars with periods of less than 10 days and with estimated radii similar to those of Jupiter or Saturn. We report the discovery of three hot giant planet candidates orbiting late-G/early-K-type host stars (TIC-125925505, TIC-125759305, and TIC-180997904), as well as the detection of a transiting companion likely in an S-type orbit around one component of a binary star system (TIC-135251751). In addition, we highlight the discovery of TIC-165227846, a hot Jupiter candidate orbiting an M dwarf that would be the lowest-mass star to host a giant planet if confirmed.

5.3.1. TIC-165227846

ORION detected a transit signal in the NGTS data with a period of 2.097 days, depth of 5.18%, and SDE of 50.66 from a total of 27 full or partial transits. We note that ORION underestimated the depth of the transit. The stellar parameters of TIC-165227846 indicate that it is a mid-M dwarf; therefore, the large transit depth remains consistent with a planetary radius for the transiting companion. Using four TESS sectors and observations of four individual transits in different filters using SHOC at SAAO, we measure a transit depth of 13.1% that, when combined with the stellar radius of 0.32 R_⊙, gives a companion radius of ${1.61}_{-0.21}^{+0.13}$ R_J. The transit depth measured in Sector 10 is shallower compared with the other data sets (∼10%); however, the data have a cadence of 30 min, resulting in poor sampling of the full transit. Figure 7 shows the photometric data obtained for TIC-165227846. Zorro imaging reveals no companions within 1.17″ of the target at the 4–5 mag limit in the 562 nm filter and at the 4–7 mag limit in the 832 nm filter. These data are shown in Figure 8. With a mass of 0.33 M_⊙, TIC-165227846 would be the lowest-mass star to host a close-in giant planet if the companion is confirmed to be planetary (Bryant et al. 2023). We note that this candidate was independently detected in Bryant et al. (2023). We are actively seeking radial-velocity measurements, and the further analysis of this system will be the subject of future work.

**Figure 7.** Discovery and follow-up data obtained for TIC-165227846. The phase-folded light curves from NGTS, SAAO, and each TESS sector are shown with the median best-fit circular model from `allesfitter` in orange.

**Figure 8.** The Zorro contrast curve for TIC-165227846. The blue and red lines show the contrast curves for the 562 nm and 832 nm filters, respectively. The inset shows the reconstructed image of the star from Zorro.

5.3.2. TIC-135251751

ORION detected a transit signal in the NGTS data with a period of 4.049 days, depth of 0.27%, and SDE of 25.91 from a total of 15 full or partial transits. Catalog values for TIC-135251751 quote a stellar radius of 2.39 ± 0.125 R_⊙, T_eff = 6742 K, log (g) = 3.84 ± 0.091, and luminosity of 10.64 ± 0.862 L_⊙, indicating that it is likely a subgiant. Combining this stellar radius and the measured transit depth gives an initial estimate for the radius of the orbiting body to be 1.21 R_J, suggesting the possibility of this candidate being a hot Jupiter orbiting a subgiant. However, high-resolution speckle imaging of TIC-135251751 using Zorro, shown in Figure 9, reveals that this is a binary star system with a projected separation of 0.033″ and flux ratios of 0.662 and 0.603 in the 562 nm and 832 nm filters, respectively. These flux ratios correspond to magnitude differences between the two stars of 0.45 and 0.55 for the 562 nm and 832 nm filters, respectively, indicating that the two stars are likely of similar spectral type.
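The two back-of-the-envelope numbers in this paragraph can be reproduced with a short sketch. It assumes that the transit depth equals (R_p/R_*)² (no limb darkening, no grazing geometry) and adopts nominal values of 695,700 km for the solar radius and 71,492 km for the equatorial radius of Jupiter; the function names are hypothetical.

```python
import math

R_SUN_KM = 695_700.0  # assumed nominal solar radius
R_JUP_KM = 71_492.0   # assumed nominal equatorial radius of Jupiter


def magnitude_difference(flux_ratio):
    """Magnitude difference corresponding to a flux ratio F_B / F_A."""
    return -2.5 * math.log10(flux_ratio)


def radius_from_depth(depth, stellar_radius_rsun):
    """Companion radius in R_J, assuming depth = (R_p / R_*)^2."""
    return math.sqrt(depth) * stellar_radius_rsun * R_SUN_KM / R_JUP_KM


dm_562 = magnitude_difference(0.662)             # 562 nm filter
dm_832 = magnitude_difference(0.603)             # 832 nm filter
r_single_star = radius_from_depth(0.0027, 2.39)  # ORION depth, TIC stellar radius
```

Under these assumptions the sketch gives magnitude differences of about 0.45 and 0.55 and a single-star radius estimate of about 1.21 R_J, in line with the values quoted above.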

**Figure 9.** The Zorro contrast curve for TIC-135251751. The blue and red lines show the contrast curves for the 562 nm and 832 nm filters, respectively. The inset shows the reconstructed image of the system from Zorro. The elongated shape of the reconstructed image is due to this being a binary star system.

In addition to the NGTS data, the target was observed in TESS Sectors 10, 37, and 64. We identify a total of 19 transits across the three TESS sectors. We obtained eight CORALIE spectra (Section 5.1.4) and three FEROS spectra (Section 5.1.5) between 2023 January 4 and 2023 May 16. We excluded the CORALIE spectra obtained on 2023 May 16 from further analysis due to high instrumental drift. Given the binarity of the host system, the radial velocities will not be an accurate measure of the radial velocity induced by the transiting companion; however, we report the values and analysis here to highlight the importance of obtaining high-resolution speckle imaging when validating the planetary nature of candidate systems. The CORALIE RV data are shown in Table 8, and the FEROS data are shown in Table 9. The photometric light curves and radial-velocity data for TIC-135251751 are shown in Figure 10 with models from allesfitter that assume a single host star with catalog values from the TIC (Table 6).

**Figure 10.** Discovery and follow-up data obtained for TIC-135251751. The phase-folded light curves from NGTS and each TESS sector are shown in the top row. The radial-velocity data are shown in time and phase folded in the bottom row. The model shown in orange is the median best-fit circular model from `allesfitter`. The green dashed line shows the RV model for the 99.99994% upper mass limit, corresponding to the 2.74 M_J companion mass limit. The radial-velocity measurements and `allesfitter` models assume a single star with catalog values.

Table 8. CORALIE Spectroscopic Data for TIC-135251751 Assuming a Single Star

Time	RV	RV error	FWHM	BIS	Contrast
(BJD − 2,457,000)	(km s⁻¹)	(km s⁻¹)	(km s⁻¹)	(km s⁻¹)
2948.80439868	−13.097311973	0.110387058	40.547364878	−0.929035261	17.588155
2971.85923573	−12.94049194	0.094379824	41.324681030	−0.535237690	17.359035
3009.79791702	−12.691021747	0.121918536	40.535339566	−0.873255231	18.005368
3046.81076808	−12.79670901	0.11689358	40.668496072	−1.026139583	17.696245
3047.73086323	−12.862174746	0.130432863	40.872850012	−1.305164977	17.729222
3054.76065788	−12.740658082	0.108603896	40.744196360	0.500147341	17.916060
3078.71216262	−12.81379369	0.111130462	40.222439766	−0.225302864	17.983820

Table 9. FEROS Spectroscopic Data for TIC-135251751 Assuming a Single Star

Time	RV	RV error	BIS
(BJD − 2,457,000)	(km s⁻¹)	(km s⁻¹)	(km s⁻¹)
3030.755730267	−12.99280	0.0615	0.55370
3032.812970775	−13.0902	0.0607	⋯
3034.762647563	−13.0018	0.0529	⋯

Note. No measurements of BIS were obtained for two of the observations.

Assuming a single star, the modeling of the photometric and radial-velocity measurements is consistent with a planetary companion with a radius of ${1.32}_{-0.037}^{+0.046}$ R_J and a 99.99994% upper mass limit on the companion body of 2.74 M_J. The spectra show no obvious signs of the system being a double-lined binary, likely due to the similar spectral types of the two binary components resulting in largely similar spectral lines for both that cannot be disentangled easily. In addition, the CCFs are broad with FWHM ≈ 40 km s⁻¹, indicating that the stars are fast rotators, which results in broad, overlapping features that are blended together and further hinder the ability to separate the two component stars. We see correlation between the RVs and CCF contrast, which is indicative that we are not measuring the radial-velocity signal expected of a hot Jupiter around a single star. The high-resolution speckle imaging of this candidate reveals part of the true nature of this system, highlighting the importance of obtaining these data from instruments such as 'Alopeke and Zorro (Scott et al. 2021; Howell & Furlan 2022) when validating planetary candidates.

In order to obtain estimates of the possible parameters of the transiting companion, we first estimate the stellar parameters of the two possible host stars. We use the total luminosity of 9.394021 ± 0.404434 L_⊙ provided by Gaia DR2 (Gaia Collaboration et al. 2016, 2018) and the mean flux ratio $\tfrac{{F}_{B}}{{F}_{A}}=0.6325$ from Zorro to calculate individual luminosities of the two stellar components. We find luminosities of L_A = 5.75 L_⊙ and L_B = 3.64 L_⊙ for the primary and secondary star, respectively. We compare these values to standard values provided by Pecaut et al. (2012) and Pecaut & Mamajek (2013) and estimate the stars to have spectral types of F1V and F5V. We perform basic spectral analysis of the CORALIE data and find the spectra are consistent with these estimated spectral types. We use these spectral types to estimate the stellar parameters that are given in Table 6 for each of the stars. Due to both stars being in the photometric apertures for all instruments, we must account for the dilution of the transit depth due to the nonhost star when calculating the possible parameters of the transiting companion. The diluted transit depth is measured as ${\delta }_{\mathrm{dil}}={0.00283}_{-0.00016}^{+0.00018}$ . The undiluted transit depth is given as ${\delta }_{\mathrm{undil}.}={\delta }_{\mathrm{dil}}\times (1+\tfrac{{F}_{\mathrm{cont}}}{{F}_{\mathrm{host}}})$ , where F_cont is the flux of the contaminant (nonhost) star and F_host is the flux of the host star. We estimate the radius of the transiting companion to be ${1.111}_{-0.031}^{+0.036}$ R_J if it orbits the primary (F1V) star and ${1.226}_{-0.035}^{+0.040}$ R_J if it orbits the secondary (F5V) star. Therefore, the transiting companion in this system is consistent with being planetary in size and may be a hot Jupiter in an S-type orbit around one of the stars of this binary star system. 
The discovery of exoplanets in close binary systems poses interesting questions for planet formation theories (Thebault & Haghighipour 2015). Additional follow-up and analysis of the TIC-135251751 system will be the subject of future work.
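The luminosity split and dilution correction described above can be checked with a minimal sketch. It assumes, as in the text, that the luminosity ratio of the two stars equals the mean Zorro flux ratio, and additionally that the undiluted depth equals (R_p/R_*)²; the solar and Jovian radii are assumed nominal values and the function names are hypothetical.

```python
import math

R_SUN_KM = 695_700.0  # assumed nominal solar radius
R_JUP_KM = 71_492.0   # assumed nominal equatorial radius of Jupiter


def split_luminosity(total, flux_ratio_b_over_a):
    """Split a total luminosity into (L_A, L_B) given F_B / F_A."""
    l_a = total / (1.0 + flux_ratio_b_over_a)
    return l_a, total - l_a


def undiluted_depth(diluted_depth, f_cont_over_f_host):
    """delta_undil = delta_dil * (1 + F_cont / F_host)."""
    return diluted_depth * (1.0 + f_cont_over_f_host)


def companion_radius_rjup(diluted_depth, f_cont_over_f_host, host_radius_rsun):
    """Companion radius in R_J, assuming undiluted depth = (R_p / R_*)^2."""
    depth = undiluted_depth(diluted_depth, f_cont_over_f_host)
    return math.sqrt(depth) * host_radius_rsun * R_SUN_KM / R_JUP_KM


FLUX_RATIO = 0.6325  # mean F_B / F_A from Zorro
DEPTH_DIL = 0.00283  # measured (diluted) transit depth

l_a, l_b = split_luminosity(9.394021, FLUX_RATIO)
# Host = primary (A): the contaminant is B, so F_cont / F_host = F_B / F_A.
r_if_a = companion_radius_rjup(DEPTH_DIL, FLUX_RATIO, 1.679)
# Host = secondary (B): the contaminant is A, so F_cont / F_host = F_A / F_B.
r_if_b = companion_radius_rjup(DEPTH_DIL, 1.0 / FLUX_RATIO, 1.473)
```

With the stellar radii from Table 6, this reproduces L_A ≈ 5.75 L_⊙, L_B ≈ 3.64 L_⊙, and companion radii of roughly 1.11 R_J and 1.23 R_J for the two host scenarios.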

5.3.3. TIC-125925505

ORION detected a transit signal in the NGTS data with a period of 1.743 days, depth of 2.19%, and SDE of 27.55 from a total of 19 full or partial transits. The host star stellar radius is 0.56 R_⊙, with SED fitting indicating that the host is a late-K/early-M-type star. We observed three transits of TIC-125925505 with the 1 m telescope at SAAO in different photometric filters ($z^{\prime}$, $r^{\prime}$, and V). TIC-125925505 falls in the overscan region of the CCD in TESS Sectors 11 and 65, and therefore we have no TESS data for this candidate. The transit depth is consistent across all data sets, ruling out the possibility of this system being an eclipsing binary consisting of two stars with differing colors. Figure 11 shows the Zorro observations for TIC-125925505. We detect no companions within 1.17″ of the target at the 4–5 mag limit in both the 562 and 832 nm filters. We obtain a companion radius of ${0.83}_{-0.076}^{+0.079}$ R_J, indicating that this candidate, if planetary, is a hot giant planet. We obtained four FEROS spectra of TIC-125925505 (see Table 10). We see a maximum radial-velocity variation of 1.6 km s⁻¹ and place a 99.99994% upper mass limit of 9.07 M_J on the companion using the global allesfitter modeling. However, we do not believe the available RV data are sufficient to consider this candidate a validated planet. This candidate requires additional RV monitoring to confirm the companion as planetary. Figure 12 shows the discovery and follow-up data obtained for TIC-125925505 to date.

**Figure 11.** The Zorro contrast curve for TIC-125925505. The blue and red lines show the contrast curves for the 562 nm and 832 nm filters, respectively. The inset shows the reconstructed image of the star from Zorro.

**Figure 12.** Discovery and follow-up data obtained for TIC-125925505. The phase-folded light curve from NGTS is shown in the top row, and each of the SAAO transit observations is shown in the middle row with the median best-fit circular model shown in orange. The radial-velocity data are shown in time and phase folded in the bottom row. The model shown in orange is the median best-fit circular model from `allesfitter`. The green dashed line shows the RV model for the 99.99994% upper mass limit, corresponding to the 9.07 M_J companion mass limit.

Table 10. FEROS Spectroscopic Data for TIC-125925505

Time	RV	RV error	BIS
(BJD − 2,457,000)	(km s⁻¹)	(km s⁻¹)	(km s⁻¹)
2679.853131131	7.63910	0.0544	−0.1776
2685.764694888	8.67995	0.0275	−0.2057
2703.665987695	7.68750	0.0333	−0.3509
2704.73451741	7.073100	0.0352	0.1549

5.3.4. TIC-125759305

ORION detected a transit signal in the NGTS data with a period of 9.993 days, depth of 2.35%, and SDE of 32.48 from a total of 10 full or partial transits. The host star stellar radius is 0.78 R_⊙, with SED fitting indicating that the host is a K dwarf. In addition to the NGTS data, the target was observed in TESS Sectors 11 and 38. We identify a total of five transits across the two TESS sectors with a consistent transit depth across all data sets, ruling out the possibility of this system being an eclipsing binary consisting of two stars with differing colors. Figure 13 shows the NGTS and TESS photometry obtained for TIC-125759305. We obtain a companion radius of ${1.02}_{-0.01}^{+0.02}$ R_J, indicating that this candidate, if planetary, is a hot Jupiter.

**Figure 13.** Discovery and follow-up data obtained for TIC-125759305. The phase-folded light curves from NGTS and each TESS sector are shown with the median best-fit circular model from `allesfitter` in orange.

5.3.5. TIC-180997904

ORION detected a transit signal in the NGTS data with a period of 4.936 days, depth of 0.96%, and SDE of 33.24 from a total of seven full or partial transits. The host star stellar radius is 1.30 R_⊙, with SED fitting indicating that the host is a G-type star. In addition to the NGTS data, the target was observed in TESS Sectors 10, 36, and 37, and we observed a full transit with SHOC in the ${i}^{{\prime} }$ filter. We identify a total of 12 transits across the three TESS sectors with a consistent transit depth across all data sets, ruling out the possibility of this system being an eclipsing binary consisting of two stars with differing colors. Figure 14 shows the NGTS, SAAO, and TESS photometry obtained for TIC-180997904. The cluster of points above the normalized flux level at phase ≈ −0.18 is due to systematics on a single night dominating this region of the light curve. We obtain a companion radius of ${1.07}_{-0.02}^{+0.10}$ R_J, indicating that this candidate, if planetary, is a hot Jupiter.

**Figure 14.** Discovery and follow-up data obtained for TIC-180997904. The phase-folded light curves from NGTS, SAAO, and each TESS sector are shown with the median best-fit circular model from `allesfitter` in orange.

6. Detection Efficiency

We assess the detection efficiency of the PHNGTS project by calculating the number of confirmed planets successfully recovered in our sample. We determine a planet to be successfully recovered if it passes to the additional workflows and satisfies the criteria s_i(NS_sec) ≥ 0.5 and s_i(YD_odd) ≥ 0.5 and s_i(YC_odd) ≥ 0.5; i.e., the planet passes to the final list of candidates that are reviewed by the NGTS science team. In the PHNGTS sample, there exist 42 ORION candidates that correspond to confirmed/validated planets from the NASA Exoplanet Archive (accessed 2023 August 02; Akeson et al. 2013) where ORION identifies the published orbital period to within 5%.²⁷ PHNGTS successfully recovers 31 of these planets. We further assess the detection efficiency by dividing the confirmed planet sample into transit depth and planetary radius bins. We elect to use bin sizes of 0.5% for transit depth and 0.5 R_J for planetary radius in order to preserve a sufficient number of planets in each bin to measure the detection efficiencies. Planetary radii are calculated by combining the transit depth measured by ORION with the stellar radius provided by the TIC. We elect to use the transit depth measured by ORION so that the radius value is consistent with the depth of the transit in the image presented to the volunteers. We note that NGTS-10 b/TIC-37348844 (McCormac et al. 2020) does not have a stellar radius listed in the TIC; therefore, we use the value provided by the NASA Exoplanet Archive. The percentages of planets recovered in each bin are shown in Figure 15 and outlined in Tables 11 and 12. While the recovery rate is apparently higher for shallower transits compared with, for example, the 1.0% < δ ≤ 1.5% bin, we find that the percentage of planets recovered is consistent across all bins within the Poissonian 68% uncertainties. These uncertainties are calculated as described in Kraft et al. (1991). 
Our sample of confirmed planets has orbital periods of between 0.77 and 7.53 days and therefore does not span a sufficient range of parameter space to analyze the recovery fraction as a function of period. We use the TOI catalog (accessed 2023 July 27; Guerrero et al. 2021) to further assess the detection efficiency.²⁸ There are 112 TOIs in our sample that have dispositions of Known Planet (KP), Confirmed Planet (CP), or Planet Candidate (PC) and have the same detected period in NGTS and TESS data to within 5%. We successfully recover 70 of these TOIs. Figure 16 shows the detection efficiency of TOIs as a function of transit depth and secondary radius with the values given in Tables 13 and 14. As in the confirmed planet sample, we find that the recovery rate is apparently lower for the 1.0% < δ ≤ 1.5% bin compared with shallower transits; however, the percentage of TOIs recovered is consistent across all bins within the Poissonian 68% uncertainties. Similarly, the percentage of TOIs recovered with 1.5 < R₂ ≤ 2.0 is considerably lower; however, this measurement is limited by the small number of TOIs in this bin (n_TOI = 7). Our sample of TOIs has orbital periods of between 0.70 and 10.85 days and therefore does not span a sufficient range of parameter space to analyze the recovery fraction as a function of period.
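The recovery bookkeeping can be illustrated with a short sketch using the confirmed-planet counts per transit depth bin from Table 11. It covers only the score threshold part of the recovery criterion and the raw percentages; the Kraft et al. (1991) uncertainties are not reproduced here. The dictionary keys used for the three scores and the function names are hypothetical labels, not the project's data format.

```python
def passes_score_thresholds(scores, threshold=0.5):
    """Check the three score criteria s_i(...) >= 0.5 for one subject.

    `scores` is a hypothetical mapping from score label to value; the
    additional requirement of passing to the later workflows is not
    modeled here.
    """
    return all(scores[key] >= threshold for key in ("NS_sec", "YD_odd", "YC_odd"))


def recovery_percentages(n_total, n_recovered):
    """Percentage of objects recovered in each bin."""
    return [100.0 * rec / tot for tot, rec in zip(n_total, n_recovered)]


# Confirmed/validated planets per 0.5% transit depth bin (Table 11).
n_planets = [5, 15, 8, 4, 4, 6]
n_recovered = [5, 10, 3, 4, 4, 5]

per_bin = recovery_percentages(n_planets, n_recovered)
overall = 100.0 * sum(n_recovered) / sum(n_planets)

example = passes_score_thresholds({"NS_sec": 0.8, "YD_odd": 0.6, "YC_odd": 0.4})
```

The bin totals sum to the 42 confirmed planets and 31 recoveries quoted in the text, giving an overall recovery rate of about 74%. The example subject fails because one of its three scores falls below 0.5.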

**Figure 15.** Left: recovery efficiency as a function of transit depth for the confirmed planets in the PHNGTS sample. Right: recovery efficiency as a function of planetary radius for the confirmed planets in the PHNGTS sample. The number of planets, n_pl, in each bin is noted on the plot. Error bars are the Poissonian 68% uncertainty, described in Kraft et al. (1991).

**Figure 16.** Left: recovery efficiency as a function of transit depth for TOIs in the PHNGTS sample, up to a depth of 3%. Right: recovery efficiency as a function of secondary radius for TOIs in the PHNGTS sample, up to a radius of 2 R_J. The number of TOIs, n_TOI, in each bin is noted on the plot. Error bars are the Poissonian 68% uncertainty, described in Kraft et al. (1991).

Table 11. Number of Confirmed/Validated Planets Successfully Recovered by PHNGTS per Transit Depth Bin

Transit Depth	Number of Planets	Planets Recovered	Percentage of Planets Recovered
(%)
0 ≤ δ ≤ 0.5	5	5	${100}_{-38.67}^{+0}$
0.5 < δ ≤ 1.0	15	10	${66.67}_{-18.97}^{+23.46}$
1.0 < δ ≤ 1.5	8	3	${37.5}_{-18.01}^{+36.68}$
1.5 < δ ≤ 2.0	4	4	${100}_{-42.56}^{+0}$
2.0 < δ ≤ 2.5	4	4	${100}_{-42.56}^{+0}$
2.5 < δ ≤ 3.0	6	5	${83.33}_{-32.23}^{+16.67}$

Note. Error bars are the Poissonian 68% uncertainty, described in Kraft et al. (1991).

Table 12. Number of Confirmed/Validated Planets Successfully Recovered by PHNGTS per Secondary Radius Bin

Secondary Radius	Number of Planets	Planets Recovered	Percentage of Planets Recovered
(R_J)
0 ≤ R₂ ≤ 0.5	3	3	${100}_{-48.02}^{+0}$
0.5 < R₂ ≤ 1.0	6	3	${50}_{-24.01}^{+35.57}$
1.0 < R₂ ≤ 1.5	28	22	${78.57}_{-15.56}^{+17.93}$
1.5 < R₂ ≤ 2.0	5	3	${60}_{-28.81}^{+40}$

Note. Error bars are the Poissonian 68% uncertainty, described in Kraft et al. (1991).

Table 13. Number of TOIs Successfully Recovered by PHNGTS per Depth Bin, up to a Depth of 3%

Transit Depth	Number of TOIs	TOIs Recovered	Percentage of TOIs Recovered
(%)
0 ≤ δ ≤ 0.5	8	6	${75}_{-26.79}^{+25}$
0.5 < δ ≤ 1.0	24	17	${70.83}_{-15.81}^{+18.58}$
1.0 < δ ≤ 1.5	13	6	${46.15}_{-16.49}^{+21.69}$
1.5 < δ ≤ 2.0	12	10	${83.33}_{-23.71}^{+16.67}$
2.0 < δ ≤ 2.5	8	6	${75}_{-26.79}^{+25}$
2.5 < δ ≤ 3.0	8	6	${75}_{-26.79}^{+25}$

Note. Error bars are the Poissonian 68% uncertainty, described in Kraft et al. (1991).

Table 14. Number of TOIs Successfully Recovered by PHNGTS per Secondary Radius Bin, up to a Radius of 2 R_J

Secondary Radius	Number of TOIs	TOIs Recovered	Percentage of TOIs Recovered
(R_J)
0 ≤ R₂ ≤ 0.5	4	2	${50}_{-28.29}^{+46.05}$
0.5 < R₂ ≤ 1.0	20	10	${50}_{-14.23}^{+17.57}$
1.0 < R₂ ≤ 1.5	36	28	${77.78}_{-13.75}^{+15.60}$
1.5 < R₂ ≤ 2.0	7	2	${28.57}_{-16.16}^{+26.31}$
2.0 < R₂ ≤ 2.5	4	4	${100}_{-42.56}^{+0}$

Note. Error bars are the Poissonian 68% uncertainty, described in Kraft et al. (1991).

7. Conclusions

We present the results from the analysis of the NGTS Public Data through the Planet Hunters NGTS citizen science project. The PHNGTS project engaged 8559 registered citizen scientists, with responses provided by an additional 3319 non-logged-in sessions. A total of 2,626,380 individual classifications for 138,198 subjects were submitted to the project. We combined the classifications of multiple volunteers using an iterative weighting scheme to search for new planet candidates in these data. This search has yielded five new planet candidates that are all consistent with being hot giant planets. Each of these candidates is undergoing active follow-up observations in an effort to characterize the systems and confirm whether the transiting companions are planetary. In particular, TIC-165227846 would be the lowest-mass star to host a transiting giant planet if confirmed, while TIC-135251751 may host a giant planet in an S-type orbit around one of the components of a binary star system. We assessed the detection efficiency of our project by determining how many of the confirmed planets and TOIs present in the data set were successfully recovered. We successfully recover 31 out of 42 confirmed planets and 70 out of 112 TOIs in the PHNGTS sample. We provide the scores for each response for each subject classified in the PHNGTS project. These data, available in the online supplementary material, can be harnessed for a wide range of applications, for example eclipsing binary cataloging or the search for additional planet candidates not reported in this work. Overall, we have shown that the citizen science approach can be complementary to the traditional eyeballing process of the NGTS data in finding new planet candidates not previously detected in the NGTS Public Data. Furthermore, the PHNGTS project can classify these large data sets much faster than the traditional eyeballing approach. 
In addition, the fresh perspective of citizen scientists compared with professional astronomers allows the detection of more unusual planet candidates, such as TIC-165227846, that may have been previously overlooked due to its unusually large transit depth (13.1%). The further analysis of some of these planet candidates will be the subject of future work, and the PHNGTS project will continue to search for new planet candidates as NGTS continues to observe the night sky in search of transiting planets.
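As a quick check on the figures quoted above, the recovery fractions and the average number of classifications per subject follow directly from the stated counts. The short sketch below is illustrative arithmetic only: the function and variable names are hypothetical and not part of the PHNGTS analysis, and the mean is a simple ratio that says nothing about how classifications were actually distributed across subjects or workflows.

```python
# Counts quoted in the Conclusions.
recovered_planets, total_planets = 31, 42
recovered_tois, total_tois = 70, 112
classifications, subjects = 2_626_380, 138_198


def fraction(recovered, total):
    """Return the fraction of a sample that was recovered."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recovered / total


planet_fraction = fraction(recovered_planets, total_planets)
toi_fraction = fraction(recovered_tois, total_tois)

# Simple average over all subjects; not a per-workflow statistic.
mean_per_subject = classifications / subjects

print(f"Confirmed planets recovered: {planet_fraction:.1%}")
print(f"TOIs recovered: {toi_fraction:.1%}")
print(f"Mean classifications per subject: {mean_per_subject:.1f}")
```

With the quoted counts, this gives recovery fractions of roughly 73.8% for confirmed planets and 62.5% for TOIs, and an average of about 19 classifications per subject.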

Acknowledgments

The data presented in this paper are the result of the efforts of the PHNGTS volunteers, without whom this work would not have been possible. Their contributions are individually acknowledged at: https://www.zooniverse.org/projects/mschwamb/planet-hunters-ngts/about/results. We also thank our PHNGTS Talk moderators Arttu Sainio and See Min Lim for their time and efforts helping the PHNGTS volunteer community.

Data access: All PHNGTS data supporting this study are provided as supplementary information accompanying this paper. Some of the data presented in this paper were obtained from the Mikulski Archive for Space Telescopes (MAST) at the Space Telescope Science Institute. The specific observations analyzed can be accessed via the following references: QLP: Huang (2020); TIC: STScI (2018). STScI is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS5–26555. Support to MAST for these data is provided by the NASA Office of Space Science via grant NAG5–7584 and by other grants and contracts. Data from the NASA Exoplanet Archive Planetary Systems Table can be accessed via NASA Exoplanet Archive (2020).

S.M.O. is supported by a UK Science and Technology Facilities Council (STFC) Studentship (ST/W507751/1). The contributions at the University of Warwick by S.G., D.B., P.J.W., R.G.W., and D.A. have been supported by STFC through consolidated grants ST/P000495/1, ST/T000406/1, and ST/X001121/1. C.A.W. would like to acknowledge support from STFC (grant number ST/X00094X/1). J.S.J. gratefully acknowledges support by FONDECYT grant 1201371 and from the ANID BASAL project FB210003. This work has been carried out within the framework of the National Centre of Competence in Research PlanetS supported by the Swiss National Science Foundation under grants 51NF40_182901 and 51NF40_205606. M.P.B. acknowledges the financial support of the SNSF. The contributions at the Mullard Space Science Laboratory by E.M.B. have been supported by STFC through the consolidated grant ST/W001136/1. E.G. gratefully acknowledges support from the UK STFC (ST/W001047/1). A.O. is supported by an STFC studentship.

We thank Jean C. Costes for a useful discussion on RV template mask choice.

Based on data collected under the NGTS project at the ESO La Silla Paranal Observatory. The NGTS facility is operated by the consortium institutes with support from the UK STFC under projects ST/M001962/1, ST/S002642/1, and ST/W003163/1.

This publication uses data generated via the Zooniverse.org platform, development of which is funded by generous support, including from the National Science Foundation, NASA, the Institute of Museum and Library Services, UKRI, a Global Impact Award from Google, and the Alfred P. Sloan Foundation. The code base for the Zooniverse Project Builder Platform is available under an open-source license at https://github.com/zooniverse/Panoptes and https://github.com/zooniverse/Panoptes-Front-End.

This paper includes data collected by the TESS mission, which are publicly available from the Mikulski Archive for Space Telescopes (MAST). Funding for the TESS mission is provided by NASA's Science Mission Directorate.

This paper uses observations made at the South African Astronomical Observatory (SAAO).

Some of the observations in the paper made use of the High-Resolution Imaging instrument Zorro. Zorro was funded by the NASA Exoplanet Exploration Program and built at the NASA Ames Research Center by Steve B. Howell, Nic Scott, Elliott P. Horch, and Emmett Quigley. Zorro was mounted on the Gemini South telescope of the international Gemini Observatory, a program of NSF's NOIRLab, which is managed by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation on behalf of the Gemini partnership: the National Science Foundation (United States), National Research Council (Canada), Agencia Nacional de Investigación y Desarrollo (Chile), Ministerio de Ciencia, Tecnología e Innovación (Argentina), Ministério da Ciência, Tecnologia, Inovações e Comunicações (Brazil), and Korea Astronomy and Space Science Institute (Republic of Korea).

This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.

This research made use of Lightkurve, a Python package for Kepler and TESS data analysis (Lightkurve Collaboration et al. 2018).

This research has made use of NASA's Astrophysics Data System.

This research has made use of the Spanish Virtual Observatory (https://svo.cab.inta-csic.es) project funded by MCIN/AEI/10.13039/501100011033/ through grant PID2020-112949GB-I00.

Facilities: NGTS, TESS, SAAO:1.0m (SHOC), Gemini:South (Zorro), Euler1.2m (CORALIE), Max Planck:2.2m (FEROS).

Software: astropy (Astropy Collaboration et al. 2013, 2018, 2022), matplotlib (Hunter 2007), Jupyter Notebook (Kluyver et al. 2016), NumPy (van der Walt et al. 2011; Harris et al. 2020), pandas (pandas development team 2023), triceratops (Giacalone & Dressing 2020; Giacalone et al. 2021), panoptes, panoptes-python-client, lightkurve (Lightkurve Collaboration et al. 2018), eleanor (Feinstein et al. 2019), tess-point (Burke et al. 2020), TESSCut (Brasseur et al. 2019), NASA GSFC-eleanor lite (Powell et al. 2022), allesfitter (Günther & Daylan 2019, 2021), ellc (Maxted 2016), emcee (Foreman-Mackey et al. 2013), dynesty (Speagle 2020), celerite (Foreman-Mackey et al. 2017).

Appendix A: Distribution of Scores and Weights for the Additional Workflows

We present the cumulative distributions of subject scores (as in Figure 4) for the Secondary Eclipse Check (Figure 17) and Odd/Even Transit Check (Figure 18). In addition, we provide the histograms of user weights for the Secondary Eclipse Check (Figure 19) and Odd/Even Transit Check (Figure 20).
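For readers working with the released subject scores, a distribution of the kind described here (the number of subjects with a score greater than a given value) can be computed as in the minimal sketch below. The scores are made-up placeholders rather than PHNGTS data, the function name is hypothetical, and the sketch returns counts; the figures may instead be normalized to fractions.

```python
from bisect import bisect_right


def count_above(scores, thresholds):
    """For each threshold, count the scores strictly greater than it."""
    ordered = sorted(scores)
    n = len(ordered)
    # bisect_right gives the number of scores <= threshold.
    return [n - bisect_right(ordered, t) for t in thresholds]


# Placeholder scores for six hypothetical subjects (not real data).
example_scores = [0.05, 0.10, 0.40, 0.40, 0.75, 0.90]
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]

counts = count_above(example_scores, thresholds)
for threshold, count in zip(thresholds, counts):
    print(f"score > {threshold:.2f}: {count} subjects")
```

Sorting once and bisecting keeps the cost low when many thresholds are evaluated, which matters for a sample of more than 100,000 subjects.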

**Figure 18.** Cumulative distribution of subject scores greater than the given score on the x-axis for the Odd/Even Transit Check. The solid red line shows the distribution for "Yes, the data for both the odd and even transits cover the middle portion of the plot." The solid black line shows the distribution for "No, the data for the odd and/or even transits do not cover the middle portion of the plot." The orange dashed line shows the distribution for "Yes, the odd/even transits have similar depths." The blue dashed line shows the distribution for "No, the odd/even transits do not have similar depths."

**Figure 19.** Histogram of user weights in the Secondary Eclipse Check. The apparent spike at w_j(SE) = 1 is due to the large spread in weights for this response, which makes the fixed weights of users who classified only one subject more conspicuous relative to the full distribution of weights.

**Figure 20.** Histograms of user weights for both the data coverage check and transit depth check in the Odd/Even Transit Check. The apparent spikes at w_j(OC) = 1 and w_j(OD) = 1 are due to the large spread in weights for these responses, which makes the fixed weights of users who classified only one subject more conspicuous relative to the full distribution of weights.

Appendix B: Tables of PHNGTS Classifications

Tables 15, 16, and 17 show the raw PHNGTS classifications submitted for the Exoplanet Transit Search, Secondary Eclipse Check, and Odd/Even Transit Check, respectively.

Table 15. PHNGTS Classifications Submitted for the Exoplanet Transit Search

User Name	Classification Time Stamp	Subject ID	Responses	Classification ID
beyondcommunication	2021-10-16 17:25:37 UTC	69473214	[A U-shaped or box-shaped dip in the middle, A V-shaped dip in the middle, Stellar variability]	1
beyondcommunication	2021-10-16 17:27:32 UTC	69475120	[A U-shaped or box-shaped dip in the middle, Stellar variability]	2
beyondcommunication	2021-10-16 17:28:46 UTC	69475064	[A U-shaped or box-shaped dip in the middle, A V-shaped dip in the middle, Stellar variability]	3
beyondcommunication	2021-10-16 17:29:19 UTC	69474371	[A U-shaped or box-shaped dip in the middle, Stellar variability]	4
beyondcommunication	2021-10-16 17:29:45 UTC	69474972	[A V-shaped dip in the middle]	5
⋮	⋮	⋮	⋮	⋮

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.
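The rows shown in Table 15 can be read into structured records as in the minimal sketch below. It assumes tab-separated columns and a bracketed, comma-separated response list, exactly as in the excerpt; the layout of the machine-readable file may differ, and the function and key names are hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

# Three rows copied from the Table 15 excerpt (tab-separated).
EXCERPT = (
    "beyondcommunication\t2021-10-16 17:25:37 UTC\t69473214\t"
    "[A U-shaped or box-shaped dip in the middle, "
    "A V-shaped dip in the middle, Stellar variability]\t1\n"
    "beyondcommunication\t2021-10-16 17:27:32 UTC\t69475120\t"
    "[A U-shaped or box-shaped dip in the middle, Stellar variability]\t2\n"
    "beyondcommunication\t2021-10-16 17:29:45 UTC\t69474972\t"
    "[A V-shaped dip in the middle]\t5\n"
)


def parse_row(row):
    """Convert one five-column row into a dictionary of typed values."""
    user, stamp, subject_id, responses, classification_id = row
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S UTC")
    # Response labels in the excerpt contain no commas, so a plain split
    # is enough; this is an assumption about the full table.
    labels = [label.strip() for label in responses.strip("[]").split(",")]
    return {
        "user": user,
        "time": when.replace(tzinfo=timezone.utc),
        "subject_id": int(subject_id),
        "responses": labels,
        "classification_id": int(classification_id),
    }


rows = [parse_row(r) for r in csv.reader(io.StringIO(EXCERPT), delimiter="\t")]
for record in rows:
    print(record["subject_id"], record["responses"])
```

Stripping whitespace from each label makes the parser tolerant of inconsistent spacing after the commas in the response list.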

Table 16. PHNGTS Classifications Submitted for the Secondary Eclipse Check

User Name	Classification Time Stamp	Subject ID	Response	Classification ID
astro-sobrien	2021-10-26 15:45:44 UTC	69757462	No secondary eclipse	1
astro-sobrien	2021-10-26 15:48:22 UTC	69757703	A secondary eclipse	2
astro-sobrien	2021-10-26 16:03:36 UTC	69757434	No secondary eclipse	3
Vidar87	2021-10-26 23:15:51 UTC	69757265	A secondary eclipse	4
Vidar87	2021-10-26 23:15:56 UTC	69757551	No secondary eclipse	5
⋮	⋮	⋮	⋮	⋮

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Table 17. PHNGTS Classifications Submitted for the Odd/Even Transit Check

User Name	Classification Time Stamp	Subject ID	Data Coverage Response	Depth Match Response	Classification ID
mschwamb	2021-10-26 15:39:01 UTC	69760298	Yes	Yes	1
not-logged-in-66f08cfbb0dc0936544d	2021-10-26 23:23:53 UTC	69760264	Yes	Yes	2
not-logged-in-66f08cfbb0dc0936544d	2021-10-26 23:23:59 UTC	69759935	Yes	Yes	3
not-logged-in-66f08cfbb0dc0936544d	2021-10-26 23:24:12 UTC	69760084	Yes	No	4
not-logged-in-66f08cfbb0dc0936544d	2021-10-26 23:24:50 UTC	69760288	Yes	Yes	5
⋮	⋮	⋮	⋮	⋮	⋮

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.
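Classifications from registered volunteers and from non-logged-in sessions can be separated using the user name, as in the sketch below. It assumes that every anonymous session carries the "not-logged-in-" prefix seen in the Table 17 excerpt, which may not hold for the full table; the helper name is hypothetical and only three of the six columns are kept.

```python
from collections import Counter

# Assumed marker for anonymous sessions, taken from the excerpt only.
ANON_PREFIX = "not-logged-in-"

# (user name, data coverage response, depth match response) from Table 17.
excerpt = [
    ("mschwamb", "Yes", "Yes"),
    ("not-logged-in-66f08cfbb0dc0936544d", "Yes", "Yes"),
    ("not-logged-in-66f08cfbb0dc0936544d", "Yes", "Yes"),
    ("not-logged-in-66f08cfbb0dc0936544d", "Yes", "No"),
    ("not-logged-in-66f08cfbb0dc0936544d", "Yes", "Yes"),
]


def is_anonymous(user_name):
    """Return True if the user name looks like a non-logged-in session."""
    return user_name.startswith(ANON_PREFIX)


registered_users = {user for user, _, _ in excerpt if not is_anonymous(user)}
anonymous_sessions = {user for user, _, _ in excerpt if is_anonymous(user)}

# Unweighted tally of the depth-match responses in the excerpt.
depth_votes = Counter(depth for _, _, depth in excerpt)

print(len(registered_users), "registered user(s)")
print(len(anonymous_sessions), "anonymous session(s)")
print(dict(depth_votes))
```

The tally here is unweighted; it is not the user-weighted scoring described in Section 4.1.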
