Anadromes Problem
Introduction
An anadrome is a string that forms another word when read in reverse. For example, “top” and “pot” are anadromes of each other. This algorithm takes a set of strings and returns a set of all the pairs of anadromes in the input set.
Implementation
(** Anadromes Problem *)
module StrSet = Set.Make(String)
let read_line_seq ch =
let rec repeat () =
match input_line ch with
| s -> Seq.Cons (s, repeat)
| exception End_of_file -> Nil
in repeat
let string_rev s =
let last = pred (String.length s) in
String.init (succ last) (fun i -> s.[last - i])
let get_anadromes set =
let aux s =
let r = string_rev s in
if s < r && StrSet.mem r set
then Some (s, r)
else None
in
Seq.filter_map aux (StrSet.to_seq set)
let () = read_line_seq stdin |> Seq.filter (fun s -> String.length s > 6)
|> Seq.map String.lowercase_ascii |> StrSet.of_seq |> get_anadromes
|> Seq.iter (fun (s, r) -> Printf.printf "%9s | %s
" s r)
The algorithm reads lines from standard input until end-of-file is reached. It filters out lines that are too short to have anadromes (less than 7 characters), transforms all the letters to lowercase using String.lowercase_ascii, and stores the resulting strings in a StrSet. Then it calls get_anadromes on the set, which returns a sequence of pairs of anadromes. Finally, it prints each pair as a line with its two elements separated by a vertical bar.
Step-by-step Explanation
- Define a module
StrSetthat provides a set implementation for strings. - Define a function
read_line_seqthat returns a sequence of lines read from a channel until end-of-file is reached. UseSeq.Consto build the sequence recursively, and amatchexpression to handle end-of-file as an exception. - Define a function
string_revthat takes a stringsand returns its reverse. UseString.lengthto find the index of the last character,String.initto initialize a new string with the same length ass, and a lambda function to fill each position with the corresponding character froms. - Define a function
get_anadromesthat takes a setsetof strings and returns a sequence of all the pairs of anadromes inset. UseSeq.filter_mapto apply the auxiliary functionauxto each element ofsetand keep only the ones that returnSome(i.e., anadromes). - Define an auxiliary function
auxthat takes a stringsand checks whether its reverse is inset. If it is, returnSome (s, r)whereris the reverse ofs. If it isn’t, returnNone. - In the main program, read lines from standard input using
read_line_seq, filter out the ones that are too short usingSeq.filter, transform them to lowercase usingSeq.mapandString.lowercase_ascii, build aStrSetwithStrSet.of_seq, callget_anadromeson it, and iterate over the resulting sequence withSeq.iter, printing each pair as a line withPrintf.printf.
Complexity Analysis
Let n be the number of strings in the input set, and m be the maximum length of a string.
The time complexity of String.length is O(1), the time complexity of String.init is O(m), and the time complexity of the lambda function is O(1), so the time complexity of string_rev is O(m).
The time complexity of StrSet.mem is logarithmic in the size of the set, so the time complexity of aux is O(m log n).
The time complexity of Seq.filter_map is proportional to the size of the sequence, which is at most n, and the time complexity of get_anadromes is proportional to the number of anadrome pairs in the set, which is at most n/2, so the time complexity of the main algorithm is O(nm log n).
The space complexity of the algorithm is dominated by the StrSet data structure, which stores up to n strings of up to length m, so the space complexity is O(nm). The other data structures used (sequences, pairs, variables) have constant size.