International Components for Unicode


UTF-8


List of Converter Aliases
Internal Converter Name   IBM         IANA
UTF-8                     ibm-1208    UTF-8
                          ibm-1209
                          ibm-5304
                          ibm-5305
                          ibm-13496
                          ibm-13497
                          ibm-17592
                          ibm-17593


Codepage Layout
    00  01  02  03  04  05  06  07  08  09  0A  0B  0C  0D  0E  0F
00  NUL SOH STX ETX EOT ENQ ACK BEL BS  HT  LF  VT  FF  CR  SO  SI
10  DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US
20  SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
30  0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
40  @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
50  P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
60  `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
70  p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL
80
90
A0
B0
C0          C2  C3  C4  C5  C6  C7  C8  C9  CA  CB  CC  CD  CE  CF
D0  D0  D1  D2  D3  D4  D5  D6  D7  D8  D9  DA  DB  DC  DD  DE  DF
E0  E0  E1  E2  E3  E4  E5  E6  E7  E8  E9  EA  EB  EC  ED  EE  EF
F0  F0  F1  F2  F3  F4

Rows 00-70: each byte maps to the Unicode code point with the same value (byte 00 is U+0000, byte 41 is U+0041, byte 7F is U+007F). SP in row 20 is the space character.
Rows 80-B0 and cells C0, C1, and F5-FF have no entries.
Rows C0-F0 list the lead bytes that begin multi-byte sequences.
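
The layout above has entries only for bytes 00-7F, C2-DF, E0-EF, and F0-F4. As a minimal sketch, the Python function below maps a lead byte to the sequence length it announces under the standard UTF-8 definition. The function name `utf8_sequence_length` is hypothetical and is not part of ICU; returning 0 for bytes that cannot start a sequence is a convention chosen for this illustration.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the UTF-8 sequence length announced by a lead byte.

    Returns 0 for bytes that cannot start a sequence: 80-BF
    (continuation bytes), C0-C1, and F5-FF.
    """
    if not 0 <= lead <= 0xFF:
        raise ValueError("lead must be a byte value in 0..255")
    if lead <= 0x7F:
        return 1  # ASCII range, rows 00-70 of the layout
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0


# Bytes that can start a sequence; these are the populated cells of the layout.
valid_leads = [b for b in range(256) if utf8_sequence_length(b) > 0]
```

A lead byte only announces a length. A full decoder must also check the continuation bytes that follow, which this sketch does not do.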

Information About This Converter
Type of converter: UCNV_UTF8
Minimum number of bytes per UChar: 1
Maximum number of bytes per UChar: 3
Substitution character: \xEF\xBF\xBD
Is ASCII [\x20-\x7E] compatible? TRUE
Is ASCII [\u0020-\u007E] ambiguous? FALSE
Contains ambiguous aliases? FALSE
Always generates Unicode NFC? UNKNOWN
Contains BiDi characters? TRUE
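
The byte counts above are per UChar, which in ICU is a 16-bit UTF-16 code unit, not per code point. The sketch below uses Python's built-in codecs rather than ICU, and the helper name `utf8_bytes_per_uchar` is hypothetical. It illustrates why the maximum is 3 even though UTF-8 sequences can be 4 bytes long: a 4-byte sequence encodes a supplementary character, which occupies two UTF-16 code units. It also shows that the substitution bytes EF BF BD are the UTF-8 encoding of U+FFFD.

```python
def utf8_bytes_per_uchar(ch: str) -> float:
    """UTF-8 byte count divided by UTF-16 code unit count for one character."""
    utf8_len = len(ch.encode("utf-8"))
    uchar_count = len(ch.encode("utf-16-le")) // 2  # 2 bytes per code unit
    return utf8_len / uchar_count


# One sample character for each UTF-8 sequence length (1 to 4 bytes).
SAMPLES = {
    "U+0041": "A",
    "U+00E9": "\u00e9",
    "U+20AC": "\u20ac",
    "U+1F600": "\U0001F600",
}
ratios = {label: utf8_bytes_per_uchar(ch) for label, ch in SAMPLES.items()}

# The substitution character listed above is U+FFFD encoded as UTF-8.
SUBSTITUTION = "\ufffd".encode("utf-8")

# Python's own decoder (not ICU) inserts U+FFFD for an invalid byte
# when errors="replace" is requested.
decoded = b"abc\xff".decode("utf-8", errors="replace")
```

With these samples the ratios are 1, 2, 3, and 2 bytes per code unit, which matches the stated minimum of 1 and maximum of 3.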

List of Languages Representable By This Codepage
Locale    Locale Name
af        Afrikaans
agq       Aghem
ak        Akan
am        Amharic
ar        Arabic
as        Assamese
asa       Asu
ast       Asturian
az        Azerbaijani
az_Cyrl   Azerbaijani (Cyrillic)
bas       Basaa
be        Belarusian
bem       Bemba
bez       Bena
bg        Bulgarian
bgc       Haryanvi
bho       Bhojpuri
blo       Anii
bm        Bambara
bn        Bangla
bo        Tibetan
br        Breton
brx       Bodo
bs        Bosnian
bs_Cyrl   Bosnian (Cyrillic)
ca        Catalan
ccp       Chakma
ce        Chechen
ceb       Cebuano
cgg       Chiga
chr       Cherokee
ckb       Central Kurdish
cs        Czech
csw       Swampy Cree
cv        Chuvash
cy        Welsh
da        Danish
dav       Taita
de        German
de_CH     German (Switzerland)
dje       Zarma
doi       Dogri
dsb       Lower Sorbian
dua       Duala
dyo       Jola-Fonyi
dz        Dzongkha
ebu       Embu
ee        Ewe
el        Greek
en        English
eo        Esperanto
es        Spanish
et        Estonian
eu        Basque
ewo       Ewondo
fa        Persian
ff        Fula
ff_Adlm   Fula (Adlam)
fi        Finnish
fil       Filipino
fo        Faroese
fr        French
fur       Friulian
fy        Western Frisian
ga        Irish
gd        Scottish Gaelic
gl        Galician
gsw       Swiss German
gu        Gujarati
guz       Gusii
gv        Manx
ha        Hausa
haw       Hawaiian
he        Hebrew
hi        Hindi
hr        Croatian
hsb       Upper Sorbian
hu        Hungarian
hy        Armenian
ia        Interlingua
id        Indonesian
ie        Interlingue
ig        Igbo
ii        Sichuan Yi
is        Icelandic
it        Italian
ja        Japanese
jgo       Ngomba
jmc       Machame
jv        Javanese
ka        Georgian
kab       Kabyle
kam       Kamba
kde       Makonde
kea       Kabuverdianu
kgp       Kaingang
khq       Koyra Chiini
ki        Kikuyu
kk        Kazakh
kkj       Kako
kl        Kalaallisut
kln       Kalenjin
km        Khmer
kn        Kannada
ko        Korean
kok       Konkani
ks        Kashmiri
ks_Deva   Kashmiri (Devanagari)
ksb       Shambala
ksf       Bafia
ksh       Colognian
ku        Kurdish
kw        Cornish
kxv       Kuvi
kxv_Deva  Kuvi (Devanagari)
kxv_Orya  Kuvi (Odia)
kxv_Telu  Kuvi (Telugu)
ky        Kyrgyz
lag       Langi
lb        Luxembourgish
lg        Ganda
lij       Ligurian
lkt       Lakota
lmo       Lombard
ln        Lingala
lo        Lao
lrc       Northern Luri
lt        Lithuanian
lu        Luba-Katanga
luo       Luo
luy       Luyia
lv        Latvian
mai       Maithili
mas       Masai
mer       Meru
mfe       Morisyen
mg        Malagasy
mgh       Makhuwa-Meetto
mgo       Metaʼ
mi        Māori
mk        Macedonian
ml        Malayalam
mn        Mongolian
mni       Manipuri
mr        Marathi
ms        Malay
mt        Maltese
mua       Mundang
my        Burmese
mzn       Mazanderani
naq       Nama
nd        North Ndebele
nds       Low German
ne        Nepali
nl        Dutch
nmg       Kwasio
nnh       Ngiemboon
no        Norwegian
nqo       N’Ko
nus       Nuer
nyn       Nyankole
oc        Occitan
om        Oromo
or        Odia
os        Ossetic
pa        Punjabi
pa_Arab   Punjabi (Arabic)
pcm       Nigerian Pidgin
pl        Polish
prg       Prussian
ps        Pashto
ps_PK     Pashto (Pakistan)
pt        Portuguese
qu        Quechua
raj       Rajasthani
rm        Romansh
rn        Rundi
ro        Romanian
rof       Rombo
ru        Russian
rw        Kinyarwanda
rwk       Rwa
sa        Sanskrit
sah       Yakut
saq       Samburu
sat       Santali
sbp       Sangu
sc        Sardinian
sd        Sindhi
sd_Deva   Sindhi (Devanagari)
se        Northern Sami
seh       Sena
ses       Koyraboro Senni
sg        Sango
shi       Tachelhit
shi_Latn  Tachelhit (Latin)
si        Sinhala
sk        Slovak
sl        Slovenian
smn       Inari Sami
sn        Shona
so        Somali
sq        Albanian
sr        Serbian
sr_Latn   Serbian (Latin)
su        Sundanese
sv        Swedish
sw        Swahili
sw_CD     Swahili (Congo - Kinshasa)
sw_KE     Swahili (Kenya)
syr       Syriac
szl       Silesian
ta        Tamil
te        Telugu
teo       Teso
tg        Tajik
th        Thai
ti        Tigrinya
tk        Turkmen
to        Tongan
tok       Toki Pona
tr        Turkish
tt        Tatar
twq       Tasawaq
tzm       Central Atlas Tamazight
ug        Uyghur
uk        Ukrainian
ur        Urdu
uz        Uzbek
uz_Arab   Uzbek (Arabic)
uz_Cyrl   Uzbek (Cyrillic)
vai       Vai
vai_Latn  Vai (Latin)
vec       Venetian
vi        Vietnamese
vmw       Makhuwa
vun       Vunjo
wae       Walser
wo        Wolof
xh        Xhosa
xnr       Kangri
xog       Soga
yav       Yangben
yi        Yiddish
yo        Yoruba
yo_BJ     Yoruba (Benin)
yrl       Nheengatu
yue       Cantonese
yue_Hans  Cantonese (Simplified)
za        Zhuang
zgh       Standard Moroccan Tamazight
zh        Chinese
zh_Hant   Chinese (Traditional)
zu        Zulu

Set of Unicode Characters Representable By This Codepage

[^\uD800-\uDFFF]
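
The set above excludes only the surrogate range U+D800-U+DFFF. As a rough illustration, the sketch below restates that rule and compares it with the behaviour of Python's built-in UTF-8 codec, which is not ICU but also refuses lone surrogates by default. The names `is_representable` and `python_can_encode` are hypothetical helpers introduced for this example.

```python
def is_representable(cp: int) -> bool:
    """True if the code point is in the set [^\\uD800-\\uDFFF]."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    return not (0xD800 <= cp <= 0xDFFF)


def python_can_encode(cp: int) -> bool:
    """True if Python's strict UTF-8 codec encodes this code point."""
    try:
        chr(cp).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# Code points on each side of the excluded range, plus both ends of Unicode.
BOUNDARIES = [0x0000, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0x10FFFF]
agreement = all(is_representable(cp) == python_can_encode(cp) for cp in BOUNDARIES)

# Size of the representable set: all code points minus the 2048 surrogates.
representable_count = 0x110000 - (0xDFFF - 0xD800 + 1)
```

The count works out to 1,112,064 code points, the number of Unicode scalar values.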