How to use the converter, how to interpret confidence scores and result badges, validation accuracy, and which assemblies are included.
All conversions run entirely in your browser. No data is sent to any server. The converter loads pre-computed PCHIP conversion tables (sampled at 1 kb resolution) on demand and interpolates target coordinates using piecewise linear lookup between adjacent table entries. Anchor files are additionally loaded for dotplot rendering.
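The lookup step described above can be sketched as follows. This is a minimal illustration, not the converter's own code: it assumes a table stored as a plain list in which entry `i` holds the target coordinate for source position `i * 1000`, and the function name `lookup` and the sample values are hypothetical.

```python
STEP = 1000  # 1 kb table resolution, as stated above


def lookup(table, pos, step=STEP):
    """Linearly interpolate a target coordinate from a regularly sampled table.

    Assumption: table[i] is the target coordinate for source position i * step,
    and the table has at least two entries.
    """
    if pos < 0 or pos > (len(table) - 1) * step:
        raise ValueError("position outside the table range")
    i = min(pos // step, len(table) - 2)  # index of the left-hand table entry
    frac = (pos - i * step) / step        # fractional distance towards entry i + 1
    return table[i] + frac * (table[i + 1] - table[i])


# Hypothetical table covering source positions 0, 1000 and 2000.
table = [5000.0, 6010.0, 7030.0]
print(lookup(table, 500))   # halfway between the first two entries: 5505.0
```

Because the table is sampled on a regular grid, the left-hand entry is found by integer division rather than by a search.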
Enter a chromosome, a start position, and an end position (in base pairs). For a single position, enter the same value in both fields. Select one or more target assemblies and click Convert.
Chromosome: Chr2B
Start (bp): 450000000
End (bp): 520000000
Assembly: Lancer, Jagger
Upload a BED file containing multiple regions. The file should be tab-delimited with columns: chrom, start, end, and an optional name column. Chromosome names must use the Chr1A–Chr7D convention.
Chr1A	10000000	15000000	QTL_1
Chr2B	450000000	520000000	resistance_locus
Chr4A	200000000	250000000	yield_QTL
Results are returned as a table and can be downloaded as a BED file for each target assembly.
All coordinates are in base pairs (bp). The converter accepts plain integers; do not include commas or unit suffixes. Chromosome names follow the convention Chr1A, Chr1B, Chr1D through Chr7D (21 chromosomes total).
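To check a BED file against these conventions before uploading it, something like the following sketch may help. The function name `parse_bed_line` and the error messages are illustrative assumptions; the converter's own validation may differ.

```python
import re

CHROM_RE = re.compile(r"Chr[1-7][ABD]")  # Chr1A ... Chr7D
INT_RE = re.compile(r"[0-9]+")           # plain integers: no commas, no unit suffixes


def parse_bed_line(line):
    """Parse one tab-delimited BED line into (chrom, start, end, name).

    The name column is optional; None is returned when it is absent.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("expected at least chrom, start and end columns")
    chrom, start_s, end_s = fields[:3]
    name = fields[3] if len(fields) > 3 else None
    if not CHROM_RE.fullmatch(chrom):
        raise ValueError(f"unexpected chromosome name: {chrom!r}")
    if not (INT_RE.fullmatch(start_s) and INT_RE.fullmatch(end_s)):
        raise ValueError("start and end must be plain integers in bp")
    return chrom, int(start_s), int(end_s), name


print(parse_bed_line("Chr2B\t450000000\t520000000\tresistance_locus"))
```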
For single-position or region queries with a single target assembly selected, a synteny dotplot is shown below the result. The dotplot displays all gene anchors for the queried chromosome as dark blue points, with the query region highlighted as a red box. This lets you visually verify that your region of interest falls within a well-anchored collinear block before relying on the converted coordinates. The dotplot can be downloaded as a PNG using the download button.
The dotplot is not available in batch mode (BED file upload) or when multiple assemblies are selected. For those cases, refer to the per-assembly chromosome dotplots on GitHub.
Every conversion result includes a confidence score: the anchor recovery fraction. This is the proportion of Chinese Spring (CS) RefSeq v2.1 gene models expected to be present in a ±5 Mb window around the query position that were successfully projected onto the target assembly by Liftoff.
Unlike a raw anchor count, the recovery fraction normalises for gene density variation along the chromosome. Pericentromeric regions, which have fewer genes per Mb, are therefore not penalised simply for being gene-poor. What matters is how many of the available genes were recovered.
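As a rough illustration of the calculation, the sketch below computes a recovery fraction from a list of gene positions. The data layout (one `(position, recovered)` pair per CS gene model) and the function name are assumptions made for this example, not the pipeline's actual data structures.

```python
def recovery_fraction(genes, query_pos, half_window=5_000_000):
    """Fraction of CS gene models within +/- half_window of query_pos that were recovered.

    genes: iterable of (cs_position_bp, recovered) pairs, where recovered is True
    if the gene model was projected onto the target assembly (assumed layout).
    Returns None when the window contains no gene models.
    """
    in_window = [rec for pos, rec in genes if abs(pos - query_pos) <= half_window]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)


# Hypothetical gene models: three fall inside the window, two of them recovered.
genes = [(1_000_000, True), (3_000_000, False), (4_000_000, True), (20_000_000, True)]
print(recovery_fraction(genes, 2_000_000))
```

Dividing by the number of genes in the window, rather than by the window length, is what makes the score independent of local gene density.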
| Tier | Threshold | Display | Interpretation |
|---|---|---|---|
| High confidence | ≥ 80% gene recovery | Green tag · full confidence bar | Strong collinearity. Median coordinate error ~26 bp (KASP) / ~107 bp (LOO). Suitable for fine-mapping and marker design. |
| Moderate confidence | 50% to < 80% gene recovery | Amber tag · partial bar | Reduced collinearity: structural variation, introgression, or assembly gap likely in this region. Median error ~311 bp (KASP) / ~1.4 kb (LOO). Use with caution. |
| Low confidence | < 50% gene recovery | Red tag · minimal bar | Very low collinearity. Median error ~4.6 Mb (KASP) / ~3.9 kb (LOO). The returned coordinate may be unreliable. Treat as approximate only. |
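
When post-processing downloaded results, the tiers in the table can be reproduced with a small helper such as the one below. It is a sketch that assumes the recovery fraction is expressed as a value between 0 and 1 and that a value of exactly 0.80 counts as High and exactly 0.50 as Moderate; the function name is hypothetical.

```python
def confidence_tier(fraction):
    """Map an anchor recovery fraction (0.0 to 1.0) to a confidence tier name."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if fraction >= 0.80:
        return "High"
    if fraction >= 0.50:   # assumed: the boundary value 0.50 is Moderate
        return "Moderate"
    return "Low"


for f in (0.92, 0.80, 0.65, 0.31):
    print(f, confidence_tier(f))
```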
The returned start coordinate is larger than the end coordinate. This means the query region maps to the opposite strand in the target assembly; the region is present but in reverse orientation relative to CS. The coordinates are still valid; the start and end simply need to be swapped if your downstream tool requires start < end.
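For downstream tools that require start < end, the swap can be done while keeping a record of the orientation. The helper below is a minimal sketch; the function name and the use of "+" and "-" as orientation labels are conventions chosen for this example.

```python
def normalise_interval(start, end):
    """Return (start, end, orientation) with start <= end.

    Orientation is "-" when the converted region is reversed relative to CS
    (returned start larger than end) and "+" otherwise.
    """
    if start > end:
        return end, start, "-"
    return start, end, "+"


print(normalise_interval(520_000_000, 450_000_000))  # (450000000, 520000000, '-')
```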
The entire chromosome appears to be assembled in the opposite orientation in the target relative to CS. All coordinates on this chromosome will be inverted. This is an assembly orientation choice, not a biological inversion.
A tilde (~) prefix indicates the coordinate was extrapolated beyond the range of available anchors. The first anchor on a chromosome may not start at position 0, and the last anchor may not reach the chromosome end. Extrapolated coordinates are estimated by projecting the boundary slope beyond the anchor range and may be less reliable than interpolated coordinates.
When extrapolation occurs, a warning message appears below the result:
The anchor coverage range shown in the message tells you where the first and last gene anchors are on that chromosome for that assembly. Queries that start before or end after these positions use values from the boundary of the pre-computed PCHIP conversion table, where the spline was extrapolated at pipeline construction time using a robust boundary slope estimated from the nearest anchors. In most cases this extrapolation is accurate, but in pericentromeric regions or near structural rearrangements the extrapolated coordinate may be less reliable than positions within the anchor range.
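The idea of projecting a boundary slope beyond the last anchor can be sketched as follows. The estimator used here (the median of the slopes between consecutive anchors among the last `k`) is an assumption chosen for illustration; the text above does not specify how the pipeline's robust slope is computed, and the function names are hypothetical.

```python
from statistics import median


def boundary_slope(anchors, k=5):
    """Median slope between consecutive anchors among the last k anchors.

    anchors: list of (source_bp, target_bp) pairs sorted by source position,
    containing at least two anchors with distinct source positions.
    """
    tail = anchors[-k:]
    slopes = [
        (t2 - t1) / (s2 - s1)
        for (s1, t1), (s2, t2) in zip(tail, tail[1:])
        if s2 != s1
    ]
    return median(slopes)


def extrapolate_right(anchors, pos, k=5):
    """Project the boundary slope from the last anchor out to pos."""
    last_src, last_tgt = anchors[-1]
    return last_tgt + boundary_slope(anchors, k) * (pos - last_src)


# Hypothetical anchors; the jump between 200 and 300 does not affect the median slope.
anchors = [(0, 0), (100, 100), (200, 200), (300, 900), (400, 1000)]
print(extrapolate_right(anchors, 500))  # 1100.0
```

Using a median rather than the slope of the final anchor pair alone keeps a single displaced anchor near the boundary from dominating the projection.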
The query region overlaps a known inter-chromosomal translocation boundary. The result is split into two segments, one for each side of the breakpoint, mapping to different target chromosomes. This currently applies to ArinaLrFor and SY_Mattis, which carry a Chr5B/Chr7B translocation: the proximal portion of Chr5B maps to Chr5B in the target, while the distal portion maps to Chr7B. A query spanning the breakpoint will return two results accordingly.
The query falls within the breakpoint gap itself, a region with no reliable anchor coverage on either side of the translocation. No coordinate can be returned for this region.
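The two translocation cases above, a query spanning the breakpoint and a query falling inside the gap, can be illustrated with a small splitting routine. The gap coordinates used here are hypothetical placeholders, not the real Chr5B breakpoint positions, and the function name is an assumption.

```python
def split_at_breakpoint(start, end, gap_start, gap_end):
    """Split a query (start <= end) around a breakpoint gap [gap_start, gap_end].

    Returns a list of (side, seg_start, seg_end) tuples. The list is empty when
    the query lies entirely within the gap, in which case no coordinate can be
    returned.
    """
    segments = []
    if start < gap_start:
        # Portion on the proximal side of the gap (stays on the same chromosome).
        segments.append(("proximal", start, min(end, gap_start)))
    if end > gap_end:
        # Portion on the distal side of the gap (maps to the other chromosome).
        segments.append(("distal", max(start, gap_end), end))
    return segments


# Hypothetical gap at 400-500 on an arbitrary coordinate scale.
print(split_at_breakpoint(100, 900, 400, 500))  # two segments
print(split_at_breakpoint(420, 480, 400, 500))  # [] : query inside the gap
```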
WheatCoordDB was validated using two independent approaches: leave-one-out (LOO) cross-validation across all 24 assemblies, and KASP marker validation against 2,547 SNP markers with known positions in 21 assemblies.
Each of the 24 assemblies was held out in turn, and its anchor positions were predicted from the remaining assemblies' splines. Prediction error was calculated as the distance between the predicted and actual anchor midpoint positions.
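The error metric can be illustrated with a short sketch that takes predicted and actual anchor midpoints for one held-out assembly and reports the median absolute distance. The function name and the sample values are hypothetical, and the spline prediction itself is not shown.

```python
from statistics import median


def median_abs_error(predicted, actual):
    """Median absolute distance (bp) between predicted and actual anchor midpoints."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return median(abs(p - a) for p, a in zip(predicted, actual))


# Hypothetical midpoints for three anchors of a held-out assembly.
predicted = [100, 250, 400]
actual = [90, 300, 400]
print(median_abs_error(predicted, actual))  # errors are 10, 50, 0, so the median is 10
```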
KASP marker positions in target assemblies were obtained by BLASTn of flanking sequences (~100 bp). Of 51,331 marker-assembly pairs tested, 93.9% showed correct chromosome assignment. Large errors (>5 Mb) were concentrated in Low-confidence regions (82.6% of large errors had anchor recovery fraction <50%) and on chromosomes with known homeologous sequence similarity (particularly Chr4A and Chr6B), where short BLAST queries can produce ambiguous hits. These BLAST mapping artefacts are distinct from WheatCoordDB interpolation errors.
Detailed plots showing anchor recovery fraction, anchor density, mean anchor gap, and coordinate accuracy along all 21 chromosomes for each of the 24 assemblies are available in the supplementary plots folder on GitHub. These plots allow users to identify chromosomal regions where accuracy may be reduced for a specific assembly before relying on converted coordinates.
All 24 assemblies are chromosome-scale hexaploid wheat (Triticum aestivum) assemblies unless otherwise noted. Gene anchors were projected from IWGSC CS RefSeq v2.1 using Liftoff with alignment thresholds of 90% coverage and 90% identity.
Planned additions include 17 Chinese cultivars from Jiao et al. (2025, Nature) spanning Chinese wheat breeding history, and wild relative assemblies (Ae. tauschii, T. timopheevii, Th. bessarabicum).
Want a different assembly included? Get in touch. We are happy to add any publicly available chromosome-scale wheat assembly. Please provide the accession number or a download link.
WheatCoordDB is developed and maintained by the Grewal lab at the University of Nottingham.
For questions, bug reports, or assembly addition requests, contact surbhi.grewal@nottingham.ac.uk.
To report issues or contribute, visit the GitHub repository.