WheatCoordDB: Guide

How to use the converter

All conversions run entirely in your browser. No data is sent to any server. The converter loads pre-computed PCHIP conversion tables (sampled at 1 kb resolution) on demand and interpolates target coordinates using piecewise linear lookup between adjacent table entries. Anchor files are additionally loaded for dotplot rendering.
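The lookup step described above can be sketched as follows. This is a minimal illustration, not the converter's actual code: the function name `lookup`, the two-list table layout, and the sample values are all assumptions made for the example.

```python
from bisect import bisect_right

def lookup(table_ref, table_tgt, pos):
    """Linearly interpolate a target coordinate between adjacent table entries.

    table_ref: sorted reference positions (e.g. one entry per 1 kb)
    table_tgt: target positions, one per reference entry
    Both lists are hypothetical stand-ins for a pre-computed conversion table.
    """
    if not table_ref[0] <= pos <= table_ref[-1]:
        raise ValueError("position outside table range")
    # Index of the table entry at or immediately left of pos,
    # clamped so that i + 1 is always a valid index.
    i = min(bisect_right(table_ref, pos) - 1, len(table_ref) - 2)
    x0, x1 = table_ref[i], table_ref[i + 1]
    y0, y1 = table_tgt[i], table_tgt[i + 1]
    frac = (pos - x0) / (x1 - x0)
    return round(y0 + frac * (y1 - y0))

# Invented table rows, 1 kb apart on the reference
ref = [1_000_000, 1_001_000, 1_002_000]
tgt = [1_250_000, 1_251_040, 1_252_010]
print(lookup(ref, tgt, 1_000_500))  # → 1250520
```

Because the table is already sampled densely from the spline, a straight line between neighbouring entries is a cheap approximation that avoids evaluating the spline in the browser.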

Single position or region

Enter a chromosome, a start position, and an end position (in base pairs). For a single position, enter the same value in both fields. Select one or more target assemblies and click Convert.

Chromosome:  Chr2B
Start (bp):  450000000
End (bp):    520000000
Assembly:    Lancer, Jagger

Batch BED conversion

Upload a BED file containing multiple regions. The file should be tab-delimited with columns: chrom, start, end, and an optional name column. Chromosome names must use the Chr1A–Chr7D convention.

Chr1A	10000000	15000000	QTL_1
Chr2B	450000000	520000000	resistance_locus
Chr4A	200000000	250000000	yield_QTL

Results are returned as a table and can be downloaded as a BED file for each target assembly.
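If you want to check a BED file before uploading it, the column layout described above can be parsed with a few lines of Python. This is a sketch under assumptions: `BedRegion` and `parse_bed` are names invented here, and skipping blank lines and `#` comment lines is a convenience of the example, not documented converter behaviour.

```python
from typing import NamedTuple, Optional

class BedRegion(NamedTuple):
    chrom: str
    start: int
    end: int
    name: Optional[str]

def parse_bed(text):
    """Parse tab-delimited BED text into a list of BedRegion records."""
    regions = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Assumption: blank lines and '#' comment lines are ignored
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(
                f"line {line_no}: expected at least 3 tab-delimited columns"
            )
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # The fourth column (name) is optional
        name = fields[3] if len(fields) > 3 and fields[3] else None
        regions.append(BedRegion(chrom, start, end, name))
    return regions

example = "Chr1A\t10000000\t15000000\tQTL_1\nChr2B\t450000000\t520000000\n"
for region in parse_bed(example):
    print(region)
```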

Coordinate format

All coordinates are in base pairs (bp). The converter accepts plain integers; do not include commas or unit suffixes. Chromosome names follow the convention Chr1A, Chr1B, Chr1D through Chr7D (21 chromosomes total).
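The naming and number rules above are easy to check programmatically. The sketch below is illustrative only; the helper names `is_valid_chrom` and `parse_coordinate` are hypothetical and not part of WheatCoordDB.

```python
import re

# Chr1A, Chr1B, Chr1D ... Chr7D: seven homoeologous groups x three subgenomes
CHROMOSOMES = [f"Chr{n}{g}" for n in range(1, 8) for g in "ABD"]

CHROM_RE = re.compile(r"Chr[1-7][ABD]")
INT_RE = re.compile(r"[0-9]+")

def is_valid_chrom(name):
    """True if name follows the Chr1A-Chr7D convention."""
    return CHROM_RE.fullmatch(name) is not None

def parse_coordinate(text):
    """Accept plain integers only; reject commas and unit suffixes."""
    text = text.strip()
    if not INT_RE.fullmatch(text):
        raise ValueError(f"not a plain integer coordinate: {text!r}")
    return int(text)

print(len(CHROMOSOMES))            # → 21
print(is_valid_chrom("Chr2B"))     # → True
print(is_valid_chrom("2B"))        # → False
print(parse_coordinate("450000000"))  # → 450000000
```

Inputs such as `450,000,000` or `450Mb` raise a `ValueError` in this sketch, mirroring the requirement to enter plain integers.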

Synteny dotplot

For single-position or region queries with a single target assembly selected, a synteny dotplot is shown below the result. The dotplot displays all gene anchors for the queried chromosome as dark blue points, with the query region highlighted as a red box. This lets you visually verify that your region of interest falls within a well-anchored collinear block before relying on the converted coordinates. The dotplot can be downloaded as a PNG using the download button.

The dotplot is not available in batch mode (BED file upload) or when multiple assemblies are selected. For those cases, refer to the per-assembly chromosome dotplots on GitHub.

Confidence scores

Every conversion result includes a confidence score: the anchor recovery fraction. This is the proportion of reference assembly gene models expected to be present in a ±5 Mb window around the query position that were successfully projected onto the target assembly by Liftoff.

Unlike a raw anchor count, the recovery fraction normalises for variation in gene density along the chromosome. Pericentromeric regions have fewer genes per Mb, but they are not penalised simply for being gene-poor: what matters is how many of the available genes were recovered.
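As a rough illustration of the definition above, the recovery fraction for one query position could be computed like this. The function name, the input layout (one position and one recovered flag per reference gene), and returning `None` for an empty window are assumptions of the sketch, not the pipeline's actual implementation.

```python
WINDOW_BP = 5_000_000  # the ±5 Mb window described above

def recovery_fraction(gene_positions, recovered, query_pos, window=WINDOW_BP):
    """Fraction of reference genes near query_pos that were projected.

    gene_positions: reference gene positions in bp (hypothetical input)
    recovered: parallel booleans, True if the gene was projected to the target
    """
    in_window = [
        ok
        for pos, ok in zip(gene_positions, recovered)
        if abs(pos - query_pos) <= window
    ]
    if not in_window:
        return None  # assumption: no genes expected, so no score
    return sum(in_window) / len(in_window)

# Three genes fall within 5 Mb of the query; two of them were recovered.
genes = [1_000_000, 3_000_000, 9_000_000, 20_000_000]
flags = [True, False, True, True]
print(recovery_fraction(genes, flags, 5_000_000))
```

A gene-rich window with 180 of 200 genes recovered and a gene-poor window with 18 of 20 recovered both score 0.9, which is the normalisation described above.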

Tier	Threshold	Display	Interpretation
High confidence	≥ 80% gene recovery	Green tag · full confidence bar	Strong collinearity — the region is well-anchored in this assembly. Coordinate error is typically within tens of base pairs. Suitable for fine-mapping and marker design.
Moderate confidence	50–80% gene recovery	Amber tag · partial bar	Reduced collinearity — structural variation, an introgression, or an assembly gap is likely present nearby. Coordinate error is typically hundreds of base pairs to a few kilobases. Use with caution and verify against the per-assembly dotplots.
Low confidence	< 50% gene recovery	Red tag · minimal bar	Very low collinearity — few gene anchors were recovered in this region. The returned coordinate may be unreliable and error can reach megabase scale. Treat as approximate only; do not use for marker design without independent verification.
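The thresholds in the table can be expressed as a small function, which may be useful when filtering downloaded results. This is a sketch: the name `confidence_tier` is hypothetical, and treating exactly 80% as High and exactly 50% as Moderate is an interpretation of the table's boundaries.

```python
def confidence_tier(fraction):
    """Map an anchor recovery fraction (0.0-1.0) to a confidence tier."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("recovery fraction must be between 0 and 1")
    if fraction >= 0.80:
        return "High"
    if fraction >= 0.50:
        return "Moderate"
    return "Low"

for value in (0.92, 0.80, 0.65, 0.49):
    print(value, confidence_tier(value))
```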

Regional variation: confidence scores are specific to the query region and assembly. The same chromosome can have High confidence on chromosome arms and Low confidence in pericentromeric regions or known introgression intervals. Check the per-assembly chromosome profiles on GitHub to understand accuracy in your region of interest before relying on a result.

Result badges and symbols

⚠ inverted orientation

The returned start coordinate is larger than the end coordinate. The query region is present in the target assembly but lies in reverse orientation (on the opposite strand) relative to the Chinese Spring (CS) reference. The coordinates are still valid; swap start and end if your downstream tool requires start < end.
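For downstream tools that require start < end, the swap can be done while keeping a record of the orientation. A minimal sketch; the function name and the use of "+" and "-" as orientation labels are choices made for this example.

```python
def normalise_interval(start, end):
    """Return (start, end, orientation) with start <= end.

    orientation is "-" when the converted interval came back inverted
    (start > end) and "+" otherwise.
    """
    inverted = start > end
    return min(start, end), max(start, end), "-" if inverted else "+"

print(normalise_interval(452_310_000, 381_905_000))
# → (381905000, 452310000, '-')
```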

⚠ whole-chromosome inversion

The entire chromosome appears to be assembled in the opposite orientation in the target relative to CS. All coordinates on this chromosome will be inverted. This is an assembly orientation choice, not a biological inversion.

~ symbol before coordinates

A tilde (~) prefix indicates the coordinate was extrapolated beyond the range of available anchors. The first anchor on a chromosome may not start at position 0, and the last anchor may not reach the chromosome end. Extrapolated coordinates are estimated by projecting the boundary slope beyond the anchor range and may be less reliable than interpolated coordinates.

When extrapolation occurs, a warning message appears below the result:

⚠ Extrapolated coordinates — query extends beyond anchor range on Chr1B (anchors cover 1.8–700.4 Mb). Returned coordinates are extrapolated from the nearest anchor boundary and may be less reliable.

The anchor coverage range shown in the message tells you where the first and last gene anchors are on that chromosome for that assembly. Queries that start before or end after these positions use values from the boundary of the pre-computed PCHIP conversion table, where the spline was extrapolated at pipeline construction time using a robust boundary slope estimated from the nearest anchors. In most cases this extrapolation is accurate, but in pericentromeric regions or near structural rearrangements the extrapolated coordinate may be less reliable than positions within the anchor range.
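To make the idea of projecting a boundary slope concrete, here is one possible sketch. It is not the pipeline's method: the source does not say how the robust slope is estimated, so the median of the slopes between the nearest `k` anchors is used here purely as an assumed stand-in, and all names are hypothetical.

```python
from statistics import median

def boundary_slope(ref, tgt, k=5, side="right"):
    """Median slope between consecutive anchors at one end of the range.

    Assumption: 'robust' is modelled as a median over the k nearest anchors.
    """
    pairs = list(zip(ref, tgt))
    edge = pairs[-k:] if side == "right" else pairs[:k]
    slopes = [
        (y1 - y0) / (x1 - x0)
        for (x0, y0), (x1, y1) in zip(edge, edge[1:])
    ]
    return median(slopes)

def extrapolate(ref, tgt, pos, k=5):
    """Project a position lying outside the anchor range onto the target."""
    if pos > ref[-1]:
        slope = boundary_slope(ref, tgt, k, "right")
        return round(tgt[-1] + slope * (pos - ref[-1]))
    if pos < ref[0]:
        slope = boundary_slope(ref, tgt, k, "left")
        return round(tgt[0] + slope * (pos - ref[0]))
    raise ValueError("position is inside the anchor range; interpolate instead")

# Invented anchors with a one-to-one slope and a constant offset of 1000 bp
ref = [100, 200, 300, 400]
tgt = [1100, 1200, 1300, 1400]
print(extrapolate(ref, tgt, 450))  # → 1450
print(extrapolate(ref, tgt, 50))   # → 1050
```

A median is less sensitive to a single misplaced anchor near the boundary than a slope taken from the last two anchors alone, which is why it is used in this illustration.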

⚠ Query spans a translocation breakpoint

The query region overlaps a known inter-chromosomal translocation boundary. The result is split into two segments, one for each side of the breakpoint, mapping to different target chromosomes. This currently applies to ArinaLrFor and SY_Mattis, which carry a Chr5B/Chr7B translocation: the proximal portion of Chr5B maps to Chr5B in the target, while the distal portion maps to Chr7B. A query spanning the breakpoint will return two results accordingly.

Query falls in translocation breakpoint gap

The query falls within the breakpoint gap itself, a region with no reliable anchor coverage on either side of the translocation. No coordinate can be returned for this region.
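The two breakpoint cases above can be illustrated with a small interval-splitting sketch. The gap coordinates below are invented for the example (the real breakpoint positions are not given here), and `split_at_breakpoint` is a hypothetical helper, not the converter's code.

```python
def split_at_breakpoint(start, end, gap_start, gap_end):
    """Split a query around a breakpoint gap.

    Returns the parts of [start, end] lying outside the gap:
    two segments if the query spans the breakpoint, one if it lies
    entirely on one side, and none if it falls inside the gap.
    """
    segments = []
    if start < gap_start:
        segments.append((start, min(end, gap_start)))
    if end > gap_end:
        segments.append((max(start, gap_end), end))
    return segments

# Invented gap at 300-320 (arbitrary units)
print(split_at_breakpoint(100, 500, 300, 320))  # → [(100, 300), (320, 500)]
print(split_at_breakpoint(305, 315, 300, 320))  # → []
```

In the converter, each returned segment would then be converted separately, since the two sides of the breakpoint map to different target chromosomes.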

Technical validation

WheatCoordDB was validated using two independent approaches: leave-one-out (LOO) cross-validation across all 24 assemblies, and KASP marker validation against 2,547 SNP markers with known positions in 21 assemblies.

Leave-one-out (LOO) cross-validation

Each of the 24 assemblies was held out in turn, and its anchor positions were predicted from the splines of the remaining assemblies. Prediction error was calculated as the distance between the predicted and actual anchor midpoint positions.
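The summary statistics reported for this validation (median and 90th percentile of the absolute error) can be computed as sketched below. The positions are invented, the function name is hypothetical, and the percentile convention (inclusive interpolation) is an assumption, since the source does not state which one was used.

```python
from statistics import median, quantiles

def error_summary(predicted, actual):
    """Median and P90 of absolute prediction error, in bp."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    # quantiles(..., n=10) returns the nine deciles; the last one is P90
    p90 = quantiles(errors, n=10, method="inclusive")[-1]
    return median(errors), p90

# Invented anchor midpoints for a held-out assembly
predicted = [100, 205, 290, 400, 1500]
actual = [100, 200, 300, 400, 500]
print(error_summary(predicted, actual))
```

Reporting P90 alongside the median shows how heavy the error tail is: here one poorly predicted anchor barely moves the median but dominates the 90th percentile.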

LOO: by gene recovery tier

High (≥ 80%): 107 bp median · 5.4 kb P90

Moderate (50–80%): 1.4 kb median · 58 kb P90

Low (< 50%): 3.9 kb median · 201 kb P90

KASP SNP markers: by gene recovery tier

High (≥ 80%): 26 bp median · 3.0 kb P90

Moderate (50–80%): 311 bp median · 207 kb P90

Low (< 50%): 4.6 Mb median · 41.9 Mb P90

Important caveat: these headline figures are genome-wide averages. Individual chromosomes and specific regions within an assembly can have substantially higher error rates, particularly in pericentromeric regions, known introgression intervals, and regions of low anchor density. We strongly recommend checking the per-assembly chromosome profile plots on GitHub to assess the likely accuracy for your specific region of interest before using converted coordinates for fine-mapping or marker design.

KASP validation notes

KASP marker positions in target assemblies were obtained by BLASTn of flanking sequences (~100 bp). Of 51,331 marker-assembly pairs tested, 93.9% showed correct chromosome assignment. Large errors (>5 Mb) were concentrated in Low-confidence regions (82.6% of large errors had anchor recovery fraction <50%) and on chromosomes with known homeologous sequence similarity (particularly Chr4A and Chr6B), where short BLAST queries can produce ambiguous hits. These BLAST mapping artefacts are distinct from WheatCoordDB interpolation errors.

Per-assembly accuracy profiles

Detailed plots showing anchor recovery fraction, anchor density, mean anchor gap, and coordinate accuracy along all 21 chromosomes for each of the 24 assemblies are available in the supplementary plots folder on GitHub. These plots allow users to identify chromosomal regions where accuracy may be reduced for a specific assembly before relying on converted coordinates.

Included assemblies

WheatCoordDB provides two parallel coordinate systems. In CS RefSeq v2.1 mode (default), 24 target assemblies are available, with gene anchors projected from the v2.1 annotation. In CS RefSeq v1.0 mode, 23 target assemblies are available (CS v2.1 is included as a target; CS v1.0 itself becomes the reference). All assemblies are chromosome-scale hexaploid wheat (Triticum aestivum). Liftoff alignment thresholds of 90% coverage and 90% identity were applied in both cases.

The table below shows assemblies available in v2.1 mode. In v1.0 mode, CS_v1 is replaced by CS_v2.1 as a target in the CS versions group.

10+ wheat panel

Jagger: ready
Lancer: ready
ArinaLrFor: ready
Stanley: ready
Spelt: ready
Mace: ready
SY_Mattis: ready
Julius: ready
Landmark: ready
Norin61: ready

CS versions

CS_IAAS: T2T
CS_CAU: T2T
CS_v1: RefSeq v1

Additional cultivars

Aikang58: ready
Chuanmai104: ready
Sumai3: ready
JIN50: ready
MOV: ready
Fielder: ready
Kariega: ready
Attraktion: ready
Renan_v2: ready
Paragon_v3: ready
Cadenza_v2: ready

Planned additions include 17 Chinese cultivars from Jiao et al. (2025, Nature) spanning Chinese wheat breeding history, and wild relative assemblies (Ae. tauschii, T. timopheevii, Th. bessarabicum).

Want a different assembly included? Get in touch. We are happy to add any publicly available chromosome-scale wheat assembly. Please provide the accession number or a download link.

Contact

WheatCoordDB is developed and maintained by the Grewal lab at the University of Nottingham.

For questions, bug reports, or assembly addition requests, contact surbhi.grewal@nottingham.ac.uk.

To report issues or contribute, visit the GitHub repository.