Alien - Dictionary
Problem Description
Link : Alien-Dictionary
You are given a list of words in an alien language, and the words are sorted lexicographically according to the rules of that language. The alphabet order in that language is unknown. Determine a valid order of characters that explains the given sorting. If no valid order exists (due to inconsistency or cycles), return an empty string.
Intuition
By comparing each pair of adjacent words, we can derive relative order constraints between letters: the first position where two words differ gives an ordering letter1 → letter2, meaning letter1 comes before letter2.
Collect all unique letters appearing in the words. Build a directed graph over these letters with edges representing constraints. If this graph has a cycle, no valid order exists. Otherwise, any valid topological ordering of the graph yields one possible alphabet order.
Approach: Kahn’s Algorithm (BFS-based Topological Sort)
Steps
-
Gather all unique characters from the words into a set.
-
Initialize an adjacency list (
Map<char, List<char>>) and anindegreemap for characters. -
For each adjacent pair of words:
-
Compare characters until the first mismatch. If found:
-
Add an edge from
char1 → char2in the graph. -
Increment
indegree[char2]. -
Stop comparing further.
-
-
If no mismatch and first word is longer than second (
"abc"before"ab"), ordering is invalid → return"".
-
-
Initialize a queue with all characters having
indegree == 0. -
Perform BFS:
-
Remove node from queue, append to result.
-
Reduce indegree of its neighbors; if any becomes 0, enqueue.
-
-
After BFS:
-
If result includes all unique characters → return string order.
-
Otherwise → cycle detected → return
"".
-
Java Code (BFS / Kahn’s Algorithm)
class Solution {
public String alienOrder(String[] words) {
int n = words.length;
Map<Character, List<Character>> graph = new HashMap<>();
Map<Character, Integer> indegree = new HashMap<>();
// Step 1: Initialize nodes
for (String w : words)
for (char c : w.toCharArray())
indegree.putIfAbsent(c, 0);
// Step 2: Build edges
for (int i = 0; i < n - 1; i++) {
String w1 = words[i], w2 = words[i + 1];
int len = Math.min(w1.length(), w2.length());
int j = 0;
while (j < len && w1.charAt(j) == w2.charAt(j)) j++;
if (j < len) {
char c1 = w1.charAt(j), c2 = w2.charAt(j);
graph.computeIfAbsent(c1, x -> new ArrayList<>());
if (!graph.get(c1).contains(c2)) {
graph.get(c1).add(c2);
indegree.put(c2, indegree.get(c2) + 1);
}
} else if (w1.length() > w2.length()) {
// Invalid prefix case
return "";
}
}
// Step 3: BFS
Queue<Character> queue = new LinkedList<>();
for (char c : indegree.keySet()) {
if (indegree.get(c) == 0) queue.offer(c);
}
StringBuilder result = new StringBuilder();
while (!queue.isEmpty()) {
char u = queue.poll();
result.append(u);
List<Character> neighbors = graph.getOrDefault(u, new ArrayList<>());
for (char v : neighbors) {
indegree.put(v, indegree.get(v) - 1);
if (indegree.get(v) == 0) queue.offer(v);
}
}
// Step 4: Check for cycle
if (result.length() < indegree.size()) return "";
return result.toString();
}
}
Dry Run Example
words = ["wrt","wrf","er","ett","rftt"]
-
Unique letters: w,r,t,f,e
-
Comparing "wrt" vs "wrf": mismatch at index 2 →
t → f -
"wrf" vs "er": mismatch at index 0 →
w → e -
"er" vs "ett": mismatch at index 1 →
r → t -
"ett" vs "rftt": mismatch at index 0 →
e → r
Graph edges:
t → f
w → e
r → t
e → r
In-degree:
w:0, e:1, r:1, t:1, f:1
Queue start: [w]
Order:
w → e → r → t → f
Output: "wertf"
Correctness and Edge Cases
-
If a word is a prefix of the next but appears later (e.g.,
"abc"before"ab"): return empty string. -
Handles duplicate words or identical prefixes.
-
Multiple valid orders possible; BFS returns one valid topological order.
-
Detects cycles if result length < number of unique letters.
Time and Space Complexity
| Metric | Complexity |
|---|---|
| Time | O(N * L + V + E) |
| — Building graph | Compare adjacent words: O(N * L) |
| — BFS | Topological sort: O(V + E) |
| Space | O(V + E) includes graph + indegree |
-
N = number of words, L = average word length
-
V = unique characters, E = number of precedence edges
Optional: DFS-Based Topological Sort Version
You can alternatively build the same graph and apply a DFS with state coloring (0 = unvisited, 1 = visiting, 2 = done/safe) to detect cycles and generate a reverse post-order list as the alphabet order. This method is also widely used, but BFS/Kahn’s is often simpler for clarity and cycle detection.