Prelude.Strings
Additional string functions.
(rev str) is the reverse of str.
Invariant:
rev = (explode >> List.rev >> implode)
(random ?(printable=true) ?charset ?size ()) returns a random string of length (size ()) (default: < 64). Each of the characters in the string is the result of calling (Char.random ?printable ?charset ()).
slice and the infix operators (#.) and (#!) do string indexing and slicing modeled on Python. There is currently no support for "steps".
Negative indexes are as in Python. Indexes "missing" in Python's slice notation are replaced with 0.
Indexes that are out-of-range are treated as the max or min string index as appropriate, so no exceptions are ever raised, e.g.:
(slice "" (0,100)) = ""
Analogues:
"str"[i] = OCaml: "str"#!i
"str"[i:j] = OCaml: "str"#.(i,j)
"str"[:j] = OCaml: "str"#.(0,j)
"str"[i:] = OCaml: "str"#.(i,0)
"str"[:] = OCaml: "str"#.(0,0)
(str #! i) is like (str.[i]) with Python-like support for negative indexes.
(slice str (l,u)) is Python string slicing.
Example: (slice "0123456789" (2,4)) = "23"
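The slicing rules above can be illustrated with a small standalone sketch. It is not the library's implementation: it uses only the standard library and assumes that an upper bound of 0 means "end of string" (as the analogues suggest) and that out-of-range bounds are clamped.

```ocaml
(* Sketch only: Python-like slicing without steps. *)
let slice str (l, u) =
  let n = String.length str in
  (* Clamp an index into the valid range 0..n so nothing is ever raised. *)
  let clamp i = max 0 (min n i) in
  (* Negative indexes count back from the end of the string. *)
  let lo = clamp (if l < 0 then n + l else l) in
  (* Assumption: an upper bound of 0 stands for a missing bound, i.e. the end. *)
  let hi = clamp (if u <= 0 then n + u else u) in
  if hi <= lo then "" else String.sub str lo (hi - lo)
```

Under these assumptions, (slice "0123456789" (-3,0)) is "789" and (slice "0123456789" (0,-1)) is "012345678".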
(any p str) is true iff (p c) for any of the characters c of str.
Invariant: (any p str) = (explode str |> map p |> ored)
(all p str) is true iff (p c) for all of the characters of str.
Invariant: (all p str) = (explode str |> map p |> anded)
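The two invariants can be checked with a standalone sketch. The definitions of explode, ored and anded below are stand-ins written against the standard library, assumed to behave like the functions of the same name referenced above.

```ocaml
(* Stand-ins, assumed to match the functions named in the invariants. *)
let explode s = List.init (String.length s) (String.get s)
let ored = List.fold_left ( || ) false
let anded = List.fold_left ( && ) true

(* Sketches of any and all over the characters of a string. *)
let any p str = List.exists p (explode str)
let all p str = List.for_all p (explode str)
```

Note that, with these definitions, (all p "") is true and (any p "") is false.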
(only ~these str) is true iff str consists of only characters in these; tail recursive.
Partially apply for efficiency.
(anyof ~these str) is true iff str contains any of the characters in these; tail recursive.
Partially apply for efficiency.
(allof ~these str) is true iff every character in these is contained in str; tail recursive.
Partially apply for efficiency.
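One plausible reading of "partially apply for efficiency" is that the work of analysing ~these is done once, when the first argument is supplied, and reused for every string. The sketch below illustrates that idea with a 256-entry lookup table; the helper charset is hypothetical and the library may do this differently.

```ocaml
(* Hypothetical helper: a membership table for the bytes in a string. *)
let charset these =
  let t = Array.make 256 false in
  String.iter (fun c -> t.(Char.code c) <- true) these;
  t

(* The table is built once per partial application (only ~these). *)
let only ~these =
  let t = charset these in
  fun str ->
    let n = String.length str in
    let rec loop i = i >= n || (t.(Char.code str.[i]) && loop (i + 1)) in
    loop 0

let anyof ~these =
  let t = charset these in
  fun str ->
    let n = String.length str in
    let rec loop i = i < n && (t.(Char.code str.[i]) || loop (i + 1)) in
    loop 0

(* Every character of these occurs in str iff these consists only of
   characters drawn from str. *)
let allof ~these str = only ~these:str these
```

Usage: (let digits_only = only ~these:"0123456789" in List.filter digits_only strings) builds the table a single time for the whole list.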
(prefix s str) is true iff s is a prefix of str.
Invariants:
(prefix "" s) = true
(prefix s s) = true
(suffix s str) is true iff s is a suffix of str.
Invariants:
(suffix "" s) = true
(suffix s s) = true
(substr sub str) is (Some i) where i is the index of the first occurrence of sub in str, or None if sub does not occur in str.
Examples:
(substr "x" "") = None
(substr "x" "x") = Some 0
(substr "x" "xs") = Some 0
(substr "x" "zxs") = Some 1
(substr "x" "zs") = None
(take n str) is the prefix of str of length n, or str itself if (n < 0 || n > len str).
(drop n str) is the suffix of str after the first n characters, or "" if (n < 0 || n > len str).
(splitat n str) is (a,b) such that a is the (min n (len str))-byte prefix of str, and b is the remainder of the string.
N.B. if n > (len str) then (len @@ fst @@ splitat n str = len str).
(splitat n str) is equivalent to (take n str, drop n str).
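The relationship between take, drop and splitat can be sketched as follows. This is a standalone approximation, not the library's code; in particular it assumes that drop returns the empty string in the out-of-range cases, which is what the equivalence with splitat implies for n > len str.

```ocaml
let take n str =
  if n < 0 || n > String.length str then str else String.sub str 0 n

(* Assumption: out-of-range n yields the empty string. *)
let drop n str =
  let len = String.length str in
  if n < 0 || n > len then "" else String.sub str n (len - n)

let splitat n str = (take n str, drop n str)
```

For example, (splitat 3 "abcdef") is ("abc", "def") and (splitat 10 "abc") is ("abc", "").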
(takeall n str) returns the sequential substrings of length n in str.
(takeall 3 digits) = ["012"; "345"; "678"; "9"]
Invariant: (takeall n str |> String.concat "") = str
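A minimal sketch of takeall, assuming that digits is "0123456789" (as the example implies) and that a non-positive n is rejected; the library's behaviour for n <= 0 is not documented here.

```ocaml
let takeall n str =
  (* Assumption: n must be positive; the real function may differ. *)
  if n <= 0 then invalid_arg "takeall: n must be positive";
  let len = String.length str in
  let rec loop i acc =
    if i >= len then List.rev acc
    else
      (* The final chunk may be shorter than n. *)
      let k = min n (len - i) in
      loop (i + k) (String.sub str i k :: acc)
  in
  loop 0 []
```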
(takewhile p str) is the leading string of chars of str for which p is true.
(takewhile (contains digits) "7654abc123") = "7654"
(takewhile (contains digits) "x7654abc") = ""
(dropwhile p str) is str trimmed of leading chars for which p is true.
(dropwhile (contains digits) "7654abc") = "abc"
(dropwhile (contains digits) "x7654abc") = "x7654abc"
(splitwhile p str) is (takewhile p str, dropwhile p str).
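These three functions share one scan for the end of the leading run of matching characters. The sketch below makes that explicit; the helper span_end is hypothetical, and the standard library's String.contains stands in for the contains used in the examples.

```ocaml
(* Hypothetical helper: index of the first char for which p is false. *)
let span_end p str =
  let n = String.length str in
  let rec go i = if i < n && p str.[i] then go (i + 1) else i in
  go 0

let takewhile p str = String.sub str 0 (span_end p str)

let dropwhile p str =
  let i = span_end p str in
  String.sub str i (String.length str - i)

let splitwhile p str = (takewhile p str, dropwhile p str)
```

Usage: (splitwhile (String.contains "0123456789") "7654abc123") is ("7654", "abc123").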
(foldl f acc s) is a left-fold over the chars in a string; (f (...(f (f acc s.[0]) s.[1])...) s.[m]) with m = String.length s - 1.
Tail-recursive.
(foldr f acc s) is a right-fold over the chars in a string; (f s.[0] (f s.[1] (...(f s.[m] acc) )...)) with m = String.length s - 1.
Not tail-recursive.
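The difference in evaluation order, and the reason only foldl is tail-recursive, shows up in a direct transcription of the two definitions. This is a sketch against the standard library, not the library's source.

```ocaml
(* Left fold: the accumulator is threaded through a tail call. *)
let foldl f acc s =
  let n = String.length s in
  let rec loop i acc = if i >= n then acc else loop (i + 1) (f acc s.[i]) in
  loop 0 acc

(* Right fold: f is applied after the recursive call returns,
   so the stack grows with the length of the string. *)
let foldr f acc s =
  let n = String.length s in
  let rec loop i = if i >= n then acc else f s.[i] (loop (i + 1)) in
  loop 0
```

Consing each character shows the order: foldl yields the characters reversed, foldr yields them in their original order.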
(foldlines ?start ?sep f init str) folds (f i acc line) over the implied lines in str, with init as the initial value for acc.
Compared to doing (String.split str |> List.foldl), String.foldlines avoids allocations, and is typically 500-600% faster.
f's i parameter is the offset in str of the next line to be processed; the offset of the start of the line is always available as (i - String.len line).
It can be useful to return i if you are exiting the fold early with Exn.return; e.g., foldlines could then be restarted where you left off by passing ~start:i.
?sep is the line-separator (default: '\n'); you can fold over other, non-line, chunks if you like.
Example:
foldlines (fun _i -> (snocwhen (String.len >> gt 200))) [] str
is the list of lines more than 200 bytes long in str.
(maps f str) is like String.map except f returns a string instead of a char.
(filter p str) is all the characters of the string str that satisfy the predicate p.
The order of the characters in the input string is preserved.
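Both maps and filter can be approximated by accumulating into a Buffer. This is a sketch using only the standard library; the library's own implementation may differ.

```ocaml
(* Like String.map, but each character expands to a string. *)
let maps f str =
  let buf = Buffer.create (String.length str) in
  String.iter (fun c -> Buffer.add_string buf (f c)) str;
  Buffer.contents buf

(* Keep, in order, the characters that satisfy p. *)
let filter p str =
  let buf = Buffer.create (String.length str) in
  String.iter (fun c -> if p c then Buffer.add_char buf c) str;
  Buffer.contents buf
```

For example, (maps (String.make 2) "abc") is "aabbcc".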
(iter f str) calls (f c) for each character c in str (for side-effects).
(iteri f str) calls (f i c) for each character c and zero-based index i in str (for side-effects).
Infix version of Chars.upto.
(explode s) is the list of characters comprising the string s.
Example: explode "abc" = ['a'; 'b'; 'c']
to_list is explode.
implode is the inverse of explode.
Example: (explode $ implode) "abc" = "abc"
of_list is implode.
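A standalone sketch of the explode/implode pair, together with the rev invariant stated at the top of this page. These are stand-ins built on the standard library (List.init, String.of_seq), not the library's definitions.

```ocaml
let explode s = List.init (String.length s) (String.get s)
let implode cs = String.of_seq (List.to_seq cs)

(* rev written as explode, then List.rev, then implode. *)
let rev s = implode (List.rev (explode s))
```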
(of_array a) is the string consisting of all the characters of a.
All the characters in these predefined strings occur in lexicographic order.
whitespace is a string consisting of the ASCII whitespace characters:
"\t\n\x0B\x0C\r "
alphabet is a string consisting of the lowercase ASCII alphabetic characters:
"abcdefghijklmnopqrstuvwxyz"
(split ?elide ?complement ?sep str) is the list of (possibly empty) substrings that are delimited by any of the set of chars contained in the string sep (default: whitespace). If complement is true (default: false), the separator characters are any that are not contained in sep. Empty substrings are omitted in the list if elide is true (the default).
The sep string is treated as US-ASCII.
(split "foo \t bar\nbaz") = ["foo"; "bar"; "baz"]
(split ~complement:true ~sep:String.(miniscules ^ majuscules) "foo BAR1765 isn't") = ["foo"; "BAR"; "isn"; "t"]
(split ~elide:false ~sep str) is the parser for lines in simple delimited file formats such as /etc/passwd's :-delimited format, tab-delimited formats, etc.
(split ~elide:false ~sep:":" ":foo:bar::baz:") = [""; "foo"; "bar"; ""; "baz"; ""]
(split ~elide:true ~sep:":" ":foo:bar::baz:") = ["foo"; "bar"; "baz"]
(join ?(elide=false) ?sep list) is String.concat sep; the default sep is " ".
If (elide = true), empty strings in list are ignored (requires an additional pass).
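The behaviour of split and join described above can be reproduced with a short sketch. It is an approximation written against the standard library: sep is treated as a set of bytes, and elision is done as a separate filtering pass, which may not be how the library does it.

```ocaml
let split ?(elide = true) ?(complement = false) ?(sep = "\t\n\x0B\x0C\r ") str =
  (* A char separates fields if it is in sep, or not in sep when complemented. *)
  let is_sep c = String.contains sep c <> complement in
  let n = String.length str in
  let rec loop i start acc =
    if i = n then List.rev (String.sub str start (i - start) :: acc)
    else if is_sep str.[i] then
      loop (i + 1) (i + 1) (String.sub str start (i - start) :: acc)
    else loop (i + 1) start acc
  in
  let fields = loop 0 0 [] in
  if elide then List.filter (fun s -> s <> "") fields else fields

let join ?(elide = false) ?(sep = " ") list =
  (* The extra pass mentioned above: drop empty strings first. *)
  let list = if elide then List.filter (fun s -> s <> "") list else list in
  String.concat sep list
```

Usage: (split ~elide:false ~sep:":" line) keeps empty fields in position, which is what a fixed-column format such as /etc/passwd needs.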
(cut ~sep str) divides str into two parts (left, Some right) which are delimited by the leftmost occurrence of sep. If there is no occurrence of sep in str, then the result is (str, None).
Examples:
(cut ~sep:"--" "") = ("", None)
(cut ~sep:"--" "x") = ("x",None)
(cut ~sep:"--" "x--") = ("x",Some "")
(cut ~sep:"--" "x--y") = ("x",Some "y")
(cuts ~sep str) is the list of substrings of str that are delimited by occurrences of sep.
Note the differences between these:
(cuts ~sep:"--" "1--2---3") = ["1"; "2"; "-3"]
(split ~elide:false ~sep:"--" "1--2---3") = ["1"; ""; "2"; ""; ""; "3"]
(split ~elide:false ~sep:"-" "1--2---3") = ["1"; ""; "2"; ""; ""; "3"]
(split ~elide:true ~sep:"--" "1--2---3") = ["1"; "2"; "3"]
(split ~elide:true ~sep:"-" "1--2---3") = ["1"; "2"; "3"]
(replace subj rep str) replaces all occurrences of subj in str with rep.
(replace subj rep) = (cuts ~sep:subj $ concat rep)
(pad ?(left=false) ?(def=' ') n str) pads str on the right (or on the left, if (left = true)) with enough instances of def such that the length of the result is at least n.
If you need the result to be exactly of length n (never longer), use:
(take n $ pad ~def n)
Invariant: (len (pad ~left ~def n str) = max n (len str))
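The functions cut, cuts, replace and pad build on one another, which a standalone sketch makes visible. The helper find_from is hypothetical, the search is a naive scan, and rejecting an empty separator is an assumption; none of this is taken from the library's source.

```ocaml
(* Hypothetical helper: index of the first occurrence of sub at or after i. *)
let find_from sub str i =
  let m = String.length sub and n = String.length str in
  let rec go i =
    if i + m > n then None
    else if String.sub str i m = sub then Some i
    else go (i + 1)
  in
  go i

let cut ~sep str =
  (* Assumption: an empty separator is rejected. *)
  if sep = "" then invalid_arg "cut: empty separator";
  match find_from sep str 0 with
  | None -> (str, None)
  | Some i ->
    let j = i + String.length sep in
    (String.sub str 0 i, Some (String.sub str j (String.length str - j)))

(* Repeatedly cut at the leftmost separator. *)
let cuts ~sep str =
  let rec go acc s =
    match cut ~sep s with
    | (left, None) -> List.rev (left :: acc)
    | (left, Some right) -> go (left :: acc) right
  in
  go [] str

(* Cut on subj, then glue the pieces back together with rep. *)
let replace subj rep str = String.concat rep (cuts ~sep:subj str)

let pad ?(left = false) ?(def = ' ') n str =
  let k = n - String.length str in
  if k <= 0 then str
  else if left then String.make k def ^ str
  else str ^ String.make k def
```

Unlike split, cuts treats sep as a whole string, so (cuts ~sep:"--" "1--2---3") leaves the odd "-" attached to "3".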
(translate xs ys str) transliterates the characters in str which appear in xs to the corresponding character in ys.
If (len ys < len xs), ys will be padded on the right with ys#!(-1). For example:
(translate majuscules "!" "FooBar" = "!oo!ar")
If (xs = ""), the returned value is str. Raises Failure if (len ys = 0 || len xs < len ys).
Example:
(translate miniscules majuscules "foobar") = "FOOBAR"
(commas ?comma num) formats the (presumed) numeric string num with commas in the conventional manner.
For example, (commas "3628800") = "3,628,800".
All OCaml integer and float literals are accepted, but only decimal integers are commafied; hex, octal, and binary integers, and all floating point numbers are returned unmodified.
Leading zeros, a leading '+'-sign, and all underscore spacers are elided in the commafied number.
Other non-numeric strings will have commas inserted into them, but you shouldn't rely on this behavior due to all the special cases above.
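The rules for decimal integers can be sketched as below. Several details are assumptions made for illustration: ?comma is taken to be a char, a leading '-' is preserved, and any string that is not a plain decimal integer is returned unmodified (whereas the real function may insert commas into other non-numeric strings, as noted above).

```ocaml
(* Hypothetical helper: true iff s is a non-empty run of decimal digits. *)
let all_digits s =
  let n = String.length s in
  let rec go i = i >= n || (s.[i] >= '0' && s.[i] <= '9' && go (i + 1)) in
  n > 0 && go 0

let commas ?(comma = ',') num =
  (* Split off an optional sign; '+' is dropped, '-' is kept (assumption). *)
  let sign, body =
    if num <> "" && (num.[0] = '+' || num.[0] = '-') then
      ((if num.[0] = '-' then "-" else ""),
       String.sub num 1 (String.length num - 1))
    else ("", num)
  in
  (* Remove underscore spacers. *)
  let digits = String.concat "" (String.split_on_char '_' body) in
  if not (all_digits digits) then num
  else begin
    (* Strip leading zeros, keeping at least one digit. *)
    let n = String.length digits in
    let rec first i = if i < n - 1 && digits.[i] = '0' then first (i + 1) else i in
    let k = first 0 in
    let d = String.sub digits k (n - k) in
    let len = String.length d in
    let buf = Buffer.create (len + len / 3) in
    String.iteri
      (fun i c ->
         (* A comma goes before every group of three counted from the right. *)
         if i > 0 && (len - i) mod 3 = 0 then Buffer.add_char buf comma;
         Buffer.add_char buf c)
      d;
    sign ^ Buffer.contents buf
  end
```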
(plural ?reg ?irr n word) returns the possibly plural form of the singular word in the context of the number n.
Use ~irr to provide a fixed irregular plural form. (plural ~irr:x) is equivalent to (plural ~reg:(k x)).
~reg is a function that returns the regular plural of a word; the default is for English and is (postpend "s"); see also es.
Pluralization is done as for English, viz. the singular is used for 1 and all other numbers, including 0 and negative numbers, use the plural form.
Examples:
(List.map (id *** flip plural "dog") [-1;0;1;2]) = [(-1, "dogs"); (0, "dogs"); (1, "dog"); (2, "dogs")]
(List.map (id *** flip (plural ~irr:"oxen") "ox") [-1;0;1;2]) = [(-1, "oxen"); (0, "oxen"); (1, "ox"); (2, "oxen")]
(es w) is (postpend "es"); it is an alternative pluralization function for certain regularly irregular English nouns.
Example:
(List.map (id *** flip (plural ~reg:es) "wrass") [-1;0;1;2]) = [(-1, "wrasses"); (0, "wrasses"); (1, "wrass"); (2, "wrasses")]
(ocaml str) converts its argument to OCaml syntax, wrapping it in double quotes and adding escapes as necessary.
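The pluralization helpers and the ocaml quoting function can be approximated in a few lines. In this sketch, ~irr takes precedence if both ~irr and ~reg are given, which is an assumption, and ocaml is rendered with the standard library's "%S" format, which may differ in detail from the library's escaping.

```ocaml
(* Only n = 1 is singular; 0 and negative numbers take the plural. *)
let plural ?(reg = (fun w -> w ^ "s")) ?irr n word =
  if n = 1 then word
  else match irr with
    | Some p -> p
    | None -> reg word

let es w = w ^ "es"

(* Quote and escape a string as an OCaml string literal. *)
let ocaml str = Printf.sprintf "%S" str
```

Usage: (Printf.sprintf "%d %s" n (plural n "file")) gives "0 files", "1 file", "2 files".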
(parens (l,r) s) parses the string on Stream s into a forest, recognizing l and r as left and right parentheses respectively.
The forest is a list of nodes; each node is either (`S s) where s is a string, or (`P f) where f is the (sub)forest representing a parenthesized expression.
If the parsed string contains an unterminated parenthesized expression, Failure is raised.
Example:
(Stream.of_string "foo" |> parens ('(',')')) = [`S "foo"]
(Stream.of_string "(foo)" |> parens ('(',')')) = [`P [`S "foo"]]
(Stream.of_string "a(foo)z" |> parens ('(',')')) = [`S "a"; `P [`S "foo"]; `S "z"]
(Stream.of_string "a(foo(bar))z" |> parens ('(',')')) = [`S "a"; `P [`S "foo"; `P [`S "bar"]]; `S "z"]
Unmatched right parens are allowed. Example:
(Stream.of_string "1) one" |> parens ('(',')')) = [`S "1) one"]
(string_of_parens (l,r)) converts the parsed forest returned by (parens (l,r) s) into a string.
(string_of_parens (l,r)) is the inverse of (Stream.of_string >> parens (l,r)).
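The parsing rules above can be illustrated with a small recursive-descent sketch. To stay self-contained it reads from a plain string rather than a Stream (an assumption made for portability, since Stream is no longer part of the standard library in OCaml 5); it is not the library's parser.

```ocaml
let parens (l, r) str =
  let n = String.length str in
  let buf = Buffer.create 16 in
  (* Emit any pending literal text as an `S node. *)
  let flush acc =
    if Buffer.length buf = 0 then acc
    else begin
      let s = Buffer.contents buf in
      Buffer.clear buf;
      `S s :: acc
    end
  in
  (* Parse from index i at the given nesting depth;
     returns the forest and the index just past it. *)
  let rec go depth i acc =
    if i >= n then begin
      if depth > 0 then failwith "parens: unterminated parenthesized expression";
      (List.rev (flush acc), i)
    end
    else if str.[i] = l then begin
      let acc = flush acc in
      let sub, j = go (depth + 1) (i + 1) [] in
      go depth j (`P sub :: acc)
    end
    else if str.[i] = r && depth > 0 then (List.rev (flush acc), i + 1)
    else begin
      (* Ordinary text, including an unmatched right paren at depth 0. *)
      Buffer.add_char buf str.[i];
      go depth (i + 1) acc
    end
  in
  fst (go 0 0 [])

(* Inverse direction: print a forest back to a string. *)
let rec string_of_parens (l, r) forest =
  String.concat ""
    (List.map
       (function
         | `S s -> s
         | `P f -> String.make 1 l ^ string_of_parens (l, r) f ^ String.make 1 r)
       forest)
```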
val seq_of_fields : sep:string -> string -> string Seq.t
(seq_of_fields ~sep str) is the sequence of sep-separated fields in str.
sep is as for cut.
val seq_of_lines : string -> string Seq.t
(seq_of_lines str) is the sequence of "\n"-separated lines in str.
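A lazy sequence of fields can be sketched with the standard Seq module, matching the two signatures above. Assumptions in this sketch: an empty separator is rejected, the search is a naive scan, and a trailing separator yields a final empty field; the library may treat these cases differently.

```ocaml
let seq_of_fields ~sep str : string Seq.t =
  if sep = "" then invalid_arg "seq_of_fields: empty separator";
  let m = String.length sep and n = String.length str in
  (* Index of the first occurrence of sep at or after i, if any. *)
  let rec find i =
    if i + m > n then None
    else if String.sub str i m = sep then Some i
    else find (i + 1)
  in
  (* Each field is only extracted when the sequence is forced. *)
  let rec from i () =
    match find i with
    | None -> Seq.Cons (String.sub str i (n - i), Seq.empty)
    | Some j -> Seq.Cons (String.sub str i (j - i), from (j + m))
  in
  from 0

let seq_of_lines str = seq_of_fields ~sep:"\n" str
```

Because the sequence is lazy, a consumer that stops early never scans the rest of the string.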
module Ops : sig ... end
Infix and prefix operators.