Module Utf_offset_conv.Offset_units

type t = {
  encoding : [ `UTF_8 | `UTF_16 | `UTF_32 ];
  units : [ `Bytes | `Uchars | `Code_units ];
}

encoding is the encoding of the text at the time the offset is taken or applied. For an offset you provide, it is the encoding the text had when the offset was taken; for a returned offset, it is the encoding the text must have when the offset is used. It is tracked separately from text_encoding because the text may have been converted to a different encoding since the provided offset was taken, or may yet be converted before the returned offset is used. The endianness of the encoding does not affect how the offset units are interpreted.

units is the unit in which the offset is measured: bytes (`Bytes), code units (`Code_units), or Unicode characters (`Uchars). A code unit is 1 byte in UTF-8, 2 bytes in UTF-16, and 4 bytes in UTF-32. When units is `Uchars, the value of encoding does not matter.
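The relationship between units and encoding can be illustrated with a minimal sketch. The record below is a local copy of the documented type so that the snippet is self-contained; code_unit_size and to_byte_offset are hypothetical helpers written for illustration and are not part of this module's interface.

```ocaml
(* Local stand-in for the documented record type. *)
type t = {
  encoding : [ `UTF_8 | `UTF_16 | `UTF_32 ];
  units : [ `Bytes | `Uchars | `Code_units ];
}

(* Hypothetical helper: size in bytes of one code unit of each encoding. *)
let code_unit_size = function
  | `UTF_8 -> 1
  | `UTF_16 -> 2
  | `UTF_32 -> 4

(* Hypothetical helper: convert an offset to a byte offset when that is
   possible without looking at the text. A `Uchars offset depends on the
   text's contents, so it cannot be converted here and yields None. *)
let to_byte_offset (u : t) (offset : int) : int option =
  match u.units with
  | `Bytes -> Some offset
  | `Code_units -> Some (offset * code_unit_size u.encoding)
  | `Uchars -> None
```

For instance, under these assumptions an offset of 5 code units into UTF-16 text corresponds to byte offset 10, while the same offset expressed in `Bytes is returned unchanged whatever the encoding.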

val sexp_of_t : t -> Sexplib0.Sexp.t