Two-Stage Resolution
The resolver converts raw user input into a structured (artist, title) pair before any search provider runs. This dramatically improves hit quality: instead of sending "нон стоп молли" to a BitTorrent tracker's full-text search, we send "Пошлая Молли Нон стоп".
Source: src-tauri/src/resolver/mod.rs, src-tauri/src/resolver/types.rs
Why it exists
File-based search (RuTracker, SoulSeek) works on exact strings inside filenames and torrent names. User queries are messy:
- Lyric snippets: "смотрю в иллюминатор я вижу море огней"
- Word order and casing: "paranoid android radiohead" → actual artist is "Radiohead", title "Paranoid Android"
- Colloquial: "нон стоп молли" → artist "Пошлая Молли", title "Нон стоп"
- Cross-script: "perviy klass sukiny deti" → artist "1.Kla$", title "Сукины дети"
The resolver bridges this gap by asking a web-scale index (Genius via Brave Search) what the canonical metadata is, then feeding the clean "Artist Title" string to the providers.
Output: ResolveResult
pub struct ResolveResult {
pub query: String, // raw user input, echoed back
pub canonical: Option<ArtistTitle>, // best-guess (artist, title); None if the resolver failed
pub candidates: Vec<TrackCandidate>, // up to ~10 alternatives, best-first
pub intent: Intent, // Track / Artist / Album / Lyric / Raw
pub elapsed_ms: u64,
}
pub struct ArtistTitle {
pub artist: String,
pub title: String, // empty for Artist-intent results
}
pub enum Intent {
Track, // one clear song identified
Artist, // query ~= artist name, multiple songs returned
Album, // cluster of tracks from one album
Lyric, // LRCLIB matched + query looks like a lyric snippet
Raw, // nothing useful resolved — providers search the raw string
}
The frontend uses intent to decide which UI chip to show ("Трек", "Исполнитель", etc.) and whether to route to the artist page or the track list.
Architecture
Only one source runs: Brave Search with site:genius.com. iTunes and LRCLIB were both dropped:
- iTunes ranked by global popularity and surfaced wrong tracks when the user asked for a different song by the same artist
- LRCLIB was network-blocked on the primary development machine and always timed out after 3 s, adding 3 s to every query
The Brave path typically resolves in 700–1200 ms on a healthy connection. When Brave returns 429, the resolver solves an Argon2id proof-of-work challenge and retries (see Brave Search & PoW).
resolve_query(raw_query)
↓
sources::brave::lookup(client, query, 10, dlog)
↓ Vec<(artist, title, MatchKind)>
merge_candidates(...)
↓ Vec<TrackCandidate> (deduped by normalized key)
rank by match_score(query, candidate)
↓
pick canonical (score ≥ 2.0 OR lyric/script-mismatch override)
↓
classify_intent(query, candidates, has_canonical)
↓
ResolveResult { canonical, candidates, intent, elapsed_ms }
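The ranking step in the flow above can be sketched as a plain sort. This is a minimal illustration, not the resolver's actual code: it assumes each candidate has already been paired with its `match_score` value, and the `rank` function and the trimmed-down `TrackCandidate` are hypothetical names introduced for the example.

```rust
/// Trimmed-down candidate: only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub struct TrackCandidate {
    pub artist: String,
    pub title: String,
    pub sources: Vec<String>,
}

/// Hypothetical helper: sorts (score, candidate) pairs best-first.
/// Equal scores are broken by the number of sources that reported the
/// candidate; the sort is stable, so full ties keep their input order.
pub fn rank(mut scored: Vec<(f64, TrackCandidate)>) -> Vec<(f64, TrackCandidate)> {
    scored.sort_by(|a, b| {
        b.0.total_cmp(&a.0)
            .then_with(|| b.1.sources.len().cmp(&a.1.sources.len()))
    });
    scored
}
```

With a single active source the tiebreaker rarely matters, but it keeps the ordering deterministic if more sources are added later.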
Candidate merging
merge_candidates deduplicates by (kind, norm(artist), norm(title)). norm() lowercases the string and strips every non-alphanumeric character (see resolver/norm.rs):
pub fn norm(s: &str) -> String {
s.chars()
.filter(|c| c.is_alphanumeric())
.flat_map(|c| c.to_lowercase())
.collect()
}
So "Пошлая Молли" and "пошлая молли" collapse to the same key (пошлаямолли), preventing duplicates when Brave returns the same artist from multiple page titles.
Each candidate tracks which sources reported it:
pub struct TrackCandidate {
pub artist: String,
pub title: String,
pub kind: MatchKind, // Track / Album / Artist
pub sources: Vec<String>, // e.g. ["brave"]
}
A candidate reported by more sources is treated as higher confidence; the source count is used as a tiebreaker in scoring.
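As a rough sketch of the deduplication described in this section, the function below merges raw hits from one source into candidates keyed by (kind, norm(artist), norm(title)). The signature is an assumption made for the example (the real `merge_candidates` arguments are not shown in this page), and first-seen spelling is kept for display.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum MatchKind { Track, Album, Artist }

#[derive(Debug, Clone, PartialEq)]
pub struct TrackCandidate {
    pub artist: String,
    pub title: String,
    pub kind: MatchKind,
    pub sources: Vec<String>,
}

/// Same normalization as resolver/norm.rs: keep alphanumerics, lowercase.
fn norm(s: &str) -> String {
    s.chars()
        .filter(|c| c.is_alphanumeric())
        .flat_map(|c| c.to_lowercase())
        .collect()
}

/// Assumed signature: merges hits from a single named source.
/// Output order is first-seen order of each distinct key.
pub fn merge_candidates(
    hits: &[(String, String, MatchKind)],
    source: &str,
) -> Vec<TrackCandidate> {
    let mut index: HashMap<(MatchKind, String, String), usize> = HashMap::new();
    let mut out: Vec<TrackCandidate> = Vec::new();
    for (artist, title, kind) in hits {
        let key = (*kind, norm(artist), norm(title));
        let existing = index.get(&key).copied();
        match existing {
            // Duplicate key: only record the source if it is new.
            Some(i) => {
                if !out[i].sources.iter().any(|s| s == source) {
                    out[i].sources.push(source.to_string());
                }
            }
            // First time this key is seen: keep the original spelling.
            None => {
                index.insert(key, out.len());
                out.push(TrackCandidate {
                    artist: artist.clone(),
                    title: title.clone(),
                    kind: *kind,
                    sources: vec![source.to_string()],
                });
            }
        }
    }
    out
}
```

Because `kind` is part of the key, an artist hub page and a track page for the same artist stay separate candidates.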
MatchKind
Brave returns pages of three types. The parser in brave.rs classifies each:
| Kind | Example title | Meaning |
|---|---|---|
| Track | "Пошлая Молли - Нон стоп Lyrics \| Genius" | A specific song |
| Album | "Пошлая Молли - Незваный гость Lyrics and Tracklist \| Genius" | An album page |
| Artist | "Пошлая Молли Lyrics, Songs, and Albums \| Genius" | An artist hub page |
The kind carries through to Intent classification and to the UI chip shown in the search hint.
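To make the three page shapes concrete, here is a hedged sketch of a title classifier built only from the example titles in the table. `classify_title` is a hypothetical name, and the actual parser in brave.rs may use different rules; the set of accepted separators is likewise an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MatchKind { Track, Album, Artist }

/// Hypothetical parser: returns (artist, title, kind) for a Genius page
/// title, or None when the title matches none of the three shapes.
pub fn classify_title(page_title: &str) -> Option<(String, String, MatchKind)> {
    // Every example title ends with the site suffix.
    let body = page_title.trim().strip_suffix("| Genius")?.trim_end();

    // Artist hub pages carry no title part at all.
    if let Some(artist) = body.strip_suffix("Lyrics, Songs, and Albums") {
        return Some((artist.trim().to_string(), String::new(), MatchKind::Artist));
    }

    // Check the longer album suffix before the bare "Lyrics" suffix.
    let (rest, kind) = if let Some(r) = body.strip_suffix("Lyrics and Tracklist") {
        (r, MatchKind::Album)
    } else if let Some(r) = body.strip_suffix("Lyrics") {
        (r, MatchKind::Track)
    } else {
        return None;
    };

    // Assumed separators: hyphen, en dash, em dash, each space-padded.
    let rest = rest.trim();
    let (artist, title) = [" - ", " – ", " — "]
        .iter()
        .find_map(|sep| rest.split_once(*sep))?;
    Some((artist.trim().to_string(), title.trim().to_string(), kind))
}
```

Suffix order matters here: testing "Lyrics" first would misclassify nothing in this sketch, but testing the album suffix first keeps the intent of each branch obvious.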
Canonicalization rules
A candidate is accepted as canonical when match_score(query, candidate) ≥ 2.0, OR when one of two overrides applies:
- Lyric override (trust_lyric_source): the query is long (>20 chars), has no "Artist - Title" separator, and at least one candidate came from a lyric source (Brave). Used for lyric-snippet queries, where the query text is deliberately not the track name.
- Script-mismatch override: the query contains Cyrillic characters but the top candidate is Latin-only. match_score can't bridge scripts, so it reports 0 for e.g. "параноид андроид" ↔ "Radiohead - Paranoid Android". The override lets these through anyway.
If no candidate meets the threshold, canonical = None and intent = Raw. The provider then searches the raw string.
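The acceptance rule can be sketched as a single predicate. This is an illustration under stated assumptions: the score is passed in rather than computed, "Cyrillic" is approximated by the U+0400..U+04FF block, "Latin-only" is read as "no alphabetic characters outside ASCII", and `accept_canonical` with its helpers are hypothetical names.

```rust
/// Threshold taken from the "score ≥ 2.0" rule.
const CANONICAL_THRESHOLD: f64 = 2.0;
/// Separators that mark an explicit "Artist - Title" query.
const SEPARATORS: [&str; 3] = [" - ", " – ", " — "];

/// Assumption: the basic Cyrillic block is enough for this check.
fn has_cyrillic(s: &str) -> bool {
    s.chars().any(|c| ('\u{0400}'..='\u{04FF}').contains(&c))
}

/// Assumption: "Latin-only" means every alphabetic char is ASCII.
fn is_latin_only(s: &str) -> bool {
    s.chars()
        .filter(|c| c.is_alphabetic())
        .all(|c| c.is_ascii_alphabetic())
}

/// Lyric override: long query, no separator, a lyric source responded.
fn trust_lyric_source(query: &str, any_from_lyric_source: bool) -> bool {
    query.chars().count() > 20
        && !SEPARATORS.iter().any(|sep| query.contains(*sep))
        && any_from_lyric_source
}

/// Script-mismatch override: Cyrillic query, Latin-only top candidate.
fn script_mismatch(query: &str, artist: &str, title: &str) -> bool {
    has_cyrillic(query) && is_latin_only(artist) && is_latin_only(title)
}

/// Hypothetical predicate combining the threshold and both overrides.
pub fn accept_canonical(
    query: &str,
    top_score: f64,
    top_artist: &str,
    top_title: &str,
    any_from_lyric_source: bool,
) -> bool {
    top_score >= CANONICAL_THRESHOLD
        || trust_lyric_source(query, any_from_lyric_source)
        || script_mismatch(query, top_artist, top_title)
}
```

Note that the length check counts characters, not bytes, which matters for Cyrillic queries where each letter takes two bytes in UTF-8.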
Artist-only query detection
After picking the canonical candidate, the resolver checks whether the raw query is entirely contained within the top candidate's artist name (token-by-token):
let q_tokens_set: HashSet<String> = tokenize(trimmed).into_iter().collect();
let artist_only_query = artist_page_cand.as_ref().is_some_and(|c| {
let stripped = strip_bracket_annotations(&c.artist);
let a_tokens: HashSet<String> = tokenize(&stripped).into_iter().collect();
q_tokens_set.iter().all(|t| a_tokens.contains(t))
});
If true, the canonical title is set to empty and intent is forced to Artist. This prevents "Пошлая Молли" (a pure artist query) from picking the first track Brave happened to list for that artist.
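A self-contained version of the containment check is sketched below. The bodies of `tokenize` and `strip_bracket_annotations` are stand-ins written for this example (the real helpers are not shown on this page), and the empty-query guard is an addition of the sketch: without it, `all` over an empty token set would be vacuously true.

```rust
use std::collections::HashSet;

/// Stand-in tokenizer (assumption): split on non-alphanumerics, lowercase.
fn tokenize(s: &str) -> Vec<String> {
    s.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Stand-in (assumption): drops text inside "(...)" and "[...]".
fn strip_bracket_annotations(s: &str) -> String {
    let mut depth = 0usize;
    let mut out = String::new();
    for c in s.chars() {
        match c {
            '(' | '[' => depth += 1,
            ')' | ']' => depth = depth.saturating_sub(1),
            _ if depth == 0 => out.push(c),
            _ => {}
        }
    }
    out
}

/// True when every query token also appears in the artist name.
pub fn is_artist_only_query(query: &str, artist: &str) -> bool {
    let q: HashSet<String> = tokenize(query.trim()).into_iter().collect();
    if q.is_empty() {
        return false; // guard added in this sketch, see lead-in
    }
    let stripped = strip_bracket_annotations(artist);
    let a: HashSet<String> = tokenize(&stripped).into_iter().collect();
    q.iter().all(|t| a.contains(t))
}
```

The check is a subset test, so a partial artist name such as a single word of a two-word name also counts as an artist-only query.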
Intent classification
After canonicalization:
canonical.kind == Album → Intent::Album
canonical.kind == Artist → Intent::Artist
canonical.kind == Track →
if (query looks like lyric phrase)
AND (query has words not in artist+title)
→ Intent::Lyric
else
→ Intent::Track
canonical == None → Intent::Raw
The "looks like lyric" check: char_count > 20 AND no "Artist - Title" separator (" - ", " – ", " — ").
The "has extra words" check: at least one query token is not in the union of tokenize(artist) ∪ tokenize(title). If every query word is already explained by the metadata, it is a plain track query, not a lyric query.
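The decision tree and its two checks fit in a few lines. The sketch below is an approximation: it takes the canonical candidate directly as an optional (artist, title, kind) tuple, which differs from the `classify_intent(query, candidates, has_canonical)` call shown in the architecture flow, and its `tokenize` is a stand-in.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MatchKind { Track, Album, Artist }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Intent { Track, Artist, Album, Lyric, Raw }

/// Stand-in tokenizer (assumption): split on non-alphanumerics, lowercase.
fn tokenize(s: &str) -> Vec<String> {
    s.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// More than 20 characters and no "Artist - Title" separator.
fn looks_like_lyric(query: &str) -> bool {
    let has_separator = [" - ", " – ", " — "]
        .iter()
        .any(|sep| query.contains(*sep));
    query.chars().count() > 20 && !has_separator
}

/// At least one query token is explained by neither artist nor title.
fn has_extra_words(query: &str, artist: &str, title: &str) -> bool {
    let known: HashSet<String> = tokenize(artist)
        .into_iter()
        .chain(tokenize(title))
        .collect();
    tokenize(query).iter().any(|t| !known.contains(t))
}

/// Simplified signature (assumption): canonical is (artist, title, kind).
pub fn classify_intent(query: &str, canonical: Option<(&str, &str, MatchKind)>) -> Intent {
    match canonical {
        None => Intent::Raw,
        Some((_, _, MatchKind::Album)) => Intent::Album,
        Some((_, _, MatchKind::Artist)) => Intent::Artist,
        Some((artist, title, MatchKind::Track)) => {
            if looks_like_lyric(query) && has_extra_words(query, artist, title) {
                Intent::Lyric
            } else {
                Intent::Track
            }
        }
    }
}
```

The query "paranoid android radiohead" shows why both checks are needed: it is 26 characters long with no separator, so it looks like a lyric, but every token is covered by the metadata and it stays a Track query.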
Tauri command
The resolver is exposed as a single Tauri command:
#[tauri::command]
pub async fn resolve_query(
state: State<'_, ResolverState>,
query: String,
) -> Result<ResolveResult, String>
ResolverState holds a shared reqwest::Client (connection-pooled across invocations) and the Arc<AppDebugLog>.
The frontend calls it through the thin wrapper in src/search/resolver.ts: