protocol UnicodeCodec
Inheritance | View Protocol Hierarchy |
---|---|
Associated Types | CodeUnit: A type that can hold code unit values for this encoding. |
Import | |
Initializers
Static Methods
Encodes a Unicode scalar as a series of code units by calling the given closure on each code unit.
For example, the musical fermata symbol ("𝄐") is a single Unicode scalar value (\u{1D110}) but requires four code units for its UTF-8 representation. The following code uses the UTF8 codec to encode a fermata in UTF-8:
    var bytes: [UTF8.CodeUnit] = []
    // Append each UTF-8 code unit to the array as the codec emits it.
    UTF8.encode("𝄐", into: { bytes.append($0) })
    print(bytes)
    // Prints "[240, 157, 132, 144]"
Parameters: input: The Unicode scalar value to encode. processCodeUnit: A closure that processes one code unit argument at a time.
Declaration
    static func encode(_ input: UnicodeScalar, into processCodeUnit: (Self.CodeUnit) -> Swift.Void)
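Because encode(_:into:) is a static requirement of the protocol, the same call should work for any conforming codec. The following is a minimal sketch, not part of the standard library: the helper name codeUnits(of:as:) is hypothetical and only wraps the closure-based call so that the code units for one scalar can be compared across UTF8, UTF16, and UTF32.

```swift
// Hypothetical helper (an assumption for illustration, not a standard
// library API): collects the code units a codec emits for one scalar.
func codeUnits<Codec: UnicodeCodec>(
    of scalar: UnicodeScalar,
    as codec: Codec.Type
) -> [Codec.CodeUnit] {
    var units: [Codec.CodeUnit] = []
    // The closure is called once per code unit, in order.
    Codec.encode(scalar, into: { units.append($0) })
    return units
}

let fermata: UnicodeScalar = "\u{1D110}"
let fermataUTF8 = codeUnits(of: fermata, as: UTF8.self)
let fermataUTF16 = codeUnits(of: fermata, as: UTF16.self)
let fermataUTF32 = codeUnits(of: fermata, as: UTF32.self)

print(fermataUTF8)   // Prints "[240, 157, 132, 144]"
print(fermataUTF16)  // Prints "[55348, 56592]" (a surrogate pair)
print(fermataUTF32)  // Prints "[119056]"
```

The fermata lies outside the Basic Multilingual Plane, so UTF-16 needs two code units while UTF-32 needs one.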
Instance Methods
Starts or continues decoding a code unit sequence into Unicode scalar values.
To decode a code unit sequence completely, call this method repeatedly until it returns UnicodeDecodingResult.emptyInput. Checking that the iterator was exhausted is not sufficient, because the decoder can store buffered data from the input iterator.

Because of buffering, it is impossible to find the corresponding position in the iterator for a given returned UnicodeScalar or an error.
The following example decodes the UTF-8 encoded bytes of a string into an array of UnicodeScalar instances:
    let str = "✨Unicode✨"
    print(Array(str.utf8))
    // Prints "[226, 156, 168, 85, 110, 105, 99, 111, 100, 101, 226, 156, 168]"

    var bytesIterator = str.utf8.makeIterator()
    var scalars: [UnicodeScalar] = []
    var utf8Decoder = UTF8()
    // Keep decoding until the decoder itself reports that the input is empty.
    Decode: while true {
        switch utf8Decoder.decode(&bytesIterator) {
        case .scalarValue(let v): scalars.append(v)
        case .emptyInput: break Decode
        case .error:
            print("Decoding error")
            break Decode
        }
    }
    print(scalars)
    // Prints "["\u{2728}", "U", "n", "i", "c", "o", "d", "e", "\u{2728}"]"
Parameters: input: An iterator of code units to be decoded. input must be the same iterator instance in repeated calls to this method. Do not advance the iterator or any copies of the iterator outside this method.
Returns: A UnicodeDecodingResult instance, representing the next Unicode scalar, an indication of an error, or an indication that the UTF sequence has been fully decoded.
Declaration
    mutating func decode<I : IteratorProtocol where I.Element == CodeUnit>(_ input: inout I) -> UnicodeDecodingResult
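The decoding loop described for this method can be written once and reused for any codec. The sketch below is an illustration under stated assumptions: decodeAll(_:as:) is a hypothetical helper, not a standard library function, and returning nil on the first error is only one possible policy. It creates a single decoder and a single iterator, then calls decode until the decoder reports emptyInput.

```swift
// Hypothetical helper (an assumption for illustration): decodes every code
// unit in `codeUnits` with the given codec, or returns nil on the first error.
func decodeAll<Codec: UnicodeCodec, S: Sequence>(
    _ codeUnits: S,
    as codec: Codec.Type
) -> [UnicodeScalar]? where S.Element == Codec.CodeUnit {
    // One iterator and one decoder are reused for every call to decode.
    var iterator = codeUnits.makeIterator()
    var decoder = Codec()
    var decoded: [UnicodeScalar] = []
    while true {
        switch decoder.decode(&iterator) {
        case .scalarValue(let scalar):
            decoded.append(scalar)
        case .emptyInput:
            return decoded   // the decoder, not the iterator, signals the end
        case .error:
            return nil       // policy chosen for this sketch only
        }
    }
}

let decodedUTF16 = decodeAll(Array("A\u{1D110}".utf16), as: UTF16.self)
let decodedInvalid = decodeAll([0xFF] as [UInt8], as: UTF8.self)
```

Here decodedUTF16 should hold two scalars, because the surrogate pair collapses into one scalar, and decodedInvalid should be nil, because 0xFF can never appear in well-formed UTF-8.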
A Unicode encoding form that translates between Unicode scalar values and form-specific code units.
The UnicodeCodec protocol declares methods that decode code unit sequences into Unicode scalar values and encode Unicode scalar values into code unit sequences. The standard library implements codecs for the UTF-8, UTF-16, and UTF-32 encoding schemes as the UTF8, UTF16, and UTF32 types, respectively. Use the UnicodeScalar type to work with decoded Unicode scalar values.
See Also: UTF8, UTF16, UTF32, UnicodeScalar
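Since the protocol covers both directions, decoding with one codec and encoding with another gives a simple way to convert between encodings. The following is a rough sketch rather than a recommended implementation: reencode(_:from:to:) is a hypothetical name introduced here, and it gives up with nil on malformed input.

```swift
// Hypothetical helper (an assumption for illustration): converts code units
// from one encoding to another by passing through Unicode scalar values.
func reencode<Source: UnicodeCodec, Target: UnicodeCodec>(
    _ input: [Source.CodeUnit],
    from source: Source.Type,
    to target: Target.Type
) -> [Target.CodeUnit]? {
    var iterator = input.makeIterator()
    var decoder = Source()
    var output: [Target.CodeUnit] = []
    while true {
        switch decoder.decode(&iterator) {
        case .scalarValue(let scalar):
            // Encode each decoded scalar straight into the target encoding.
            Target.encode(scalar, into: { output.append($0) })
        case .emptyInput:
            return output
        case .error:
            return nil
        }
    }
}

let sparkles = "\u{2728}Unicode\u{2728}"
let sparklesUTF16 = reencode(Array(sparkles.utf8), from: UTF8.self, to: UTF16.self)
```

For well-formed input, the result should match the code units that the string's own utf16 view reports.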