Problem with charset on SC2 API

Example:
http://eu.battle.net/sc2/fr/profile/3478583/1/A%C3%A9rius/

The name should read Aérius, but it is not displayed correctly, and the API returns a 404 error for this profile.
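For reference, the `%C3%A9` in that URL is the percent-encoded UTF-8 form of "é". The sketch below is only an illustration (the variable names are made up for it); it decodes the same path segment once as UTF-8 and once as ISO-8859-1 to show how the wrong interpretation produces a garbled name.

```python
from urllib.parse import unquote

# Path segment copied from the profile URL above.
ENCODED_NAME = "A%C3%A9rius"

# Reading the escaped bytes as UTF-8 gives the intended name.
as_utf8 = unquote(ENCODED_NAME, encoding="utf-8")

# Reading the same bytes as ISO-8859-1 turns the two-byte "é" into two characters.
as_latin1 = unquote(ENCODED_NAME, encoding="iso-8859-1")

print(as_utf8)
print(as_latin1)
```

If a server decodes the path with the second interpretation, it would look up a name that does not exist, which is one possible explanation for the 404.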
This issue affects the match history in the battle.net site as well. I've had problems due to this for both clan tags and player names within the API.
up
up
@Elrel I noticed the two different dates of your most recent posts.

What do you mean by "up"? Are you just stating that those characters can now be reached? Status updates are great when requested or when the problem is ongoing, but please post with clarity and only when necessary.
@unbound I guess the 'up' is a bump.

@Elrel as for the issue, it could be improper output encoding on the website itself. See the headers returned by the page:

C:\>curl -I http://eu.battle.net/sc2/fr/profile/3478583/1/Aérius/
HTTP/1.1 200 OK
Date: Sun, 07 May 2017 02:45:16 GMT
Server: Apache
X-Frame-Options: SAMEORIGIN
Retry-After: 600
Set-Cookie: login.cookies=1; Domain=battle.net; Path=/
Content-Language: fr-FR
Content-Length: 43361
Content-Type: application/xhtml+xml;charset=UTF-8
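The charset the server declares can also be read programmatically from such a header. This is a minimal sketch using only the Python standard library; `declared_charset` is a hypothetical helper name, and the header value is the one from the curl output above.

```python
from email.message import Message


def declared_charset(content_type: str):
    """Return the charset parameter of a Content-Type value, lowercased, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# Header value taken from the response shown above.
print(declared_charset("application/xhtml+xml;charset=UTF-8"))
```

This only tells you what the server claims. Whether the body bytes actually match that claim has to be checked separately.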


If you inspect the byte sequences of the source text in the HTML body, you can see that what the website shows has been treated as ISO-8859-1 (Western European), even though the website declares all output to be UTF-8. What you are experiencing on the StarCraft II site is called mojibake.
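To make the byte-level picture concrete, here is a small sketch of how this kind of mojibake typically arises. It assumes the common failure mode (UTF-8 bytes misread as ISO-8859-1 and then encoded as UTF-8 again); the thread does not confirm that this is exactly what Blizzard's servers do, and `double_encode` is a made-up name.

```python
def double_encode(text: str) -> bytes:
    """Simulate UTF-8 bytes being misread as ISO-8859-1 and re-encoded as UTF-8."""
    utf8_bytes = text.encode("utf-8")
    misread = utf8_bytes.decode("iso-8859-1")  # every byte becomes one character
    return misread.encode("utf-8")


correct = "é".encode("utf-8")
garbled = double_encode("é")

# Compare the byte sequences of the correct and the garbled form.
print(correct.hex())
print(garbled.hex())
```

The correct form is two bytes long, while the garbled form is four bytes long and renders as two unrelated characters.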

TL;DR: This has to be fixed on Blizzard's side.
Here is another example where it is a major issue, as every single map name is encoded improperly:
https://kr.api.battle.net/sc2/profile/1366428/1/Ryung/matches?locale=en_US&apikey=[APIKEY]

{
"matches" : [ {
"map" : "어비설 리프 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494919760
}, {
"map" : "블러드 보일 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494919410
}, {
"map" : "프록시마 ì •ê±°ìž¥ - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494918888
}, {
"map" : "시퀀스 - LE",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494918038
}, {
"map" : "오딧세이 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494917182
}, {
"map" : "프록시마 ì •ê±°ìž¥ - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494916378
}, {
"map" : "오딧세이 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494915122
}, {
"map" : "어비설 리프 - 래더",
"type" : "SOLO",
"decision" : "LOSS",
"speed" : "FASTER",
"date" : 1494912460
}, {
"map" : "오딧세이 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494911955
}, {
"map" : "오딧세이 - 래더",
"type" : "SOLO",
"decision" : "LOSS",
"speed" : "FASTER",
"date" : 1494911181
}, {
"map" : "프록시마 ì •ê±°ìž¥ - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494909975
}, {
"map" : "블러드 보일 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494908608
}, {
"map" : "오딧세이 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494908124
}, {
"map" : "블러드 보일 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494907002
}, {
"map" : "오딧세이 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494906506
}, {
"map" : "오딧세이 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494905662
}, {
"map" : "어비설 리프 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494905127
}, {
"map" : "어비설 리프 - 래더",
"type" : "SOLO",
"decision" : "LOSS",
"speed" : "FASTER",
"date" : 1494904662
}, {
"map" : "블러드 보일 - 래더",
"type" : "SOLO",
"decision" : "LOSS",
"speed" : "FASTER",
"date" : 1494849519
}, {
"map" : "어비설 리프 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494848419
}, {
"map" : "어비설 리프 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494847649
}, {
"map" : "어비설 리프 - 래더",
"type" : "SOLO",
"decision" : "LOSS",
"speed" : "FASTER",
"date" : 1494846728
}, {
"map" : "블러드 보일 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494846291
}, {
"map" : "오딧세이 - 래더",
"type" : "SOLO",
"decision" : "WIN",
"speed" : "FASTER",
"date" : 1494845262
}, {
"map" : "시퀀스 - LE",
"type" : "SOLO",
"decision" : "LOSS",
"speed" : "FASTER",
"date" : 1494843416
} ]
}


This affects every single result on the Korean API (and likely the China endpoint as well).
It also affects all clan tags/clan names listed through the account endpoints, such as https://kr.api.battle.net/sc2/profile/1366428/1/Ryung/?locale=en_US&apikey=[APIKEY]
{
"id" : 1366428,
"realm" : 1,
"displayName" : "Ryung",
"clanName" : "둥치",
"clanTag" : "둥치",
"profilePath" : "/profile/1366428/1/Ryung/",
"portrait" : {
"x" : -450,
"y" : 0,
"w" : 90,
"h" : 90,
"offset" : 5,
"url" : "http://media.blizzard.com/sc2/portraits/1-90.jpg"
},
etc


I understand that the new data endpoints are in the works, but this encoding problem makes the old endpoints useless. Until the new endpoints are out, there is nothing that can be done with this data.
04/28/2017 01:32 PM, posted by unbound:
@Elrel I noticed the two different dates of your most recent posts.

What do you mean by "up"? Are you just stating that those characters can now be reached? Status updates are great when requested or when the problem is ongoing, but please post with clarity and only when necessary.


Why? Because nothing changes...
If it was a bump, please do not bump blindly. A sentence describing the ongoing problem is better. Even a short note such as "This is an ongoing problem. This is what I discovered" goes a long way toward making the cause of an issue easier to understand and more likely to get noticed. Remember, the goal is to organize the various pieces of information so they can be conveyed efficiently, for Blizzard's benefit and ours.

@Kalle and @Ophidian thanks for the more detailed info. For all the times I have dealt with these issues, I had never heard the term "mojibake" before (where have I been :D).

I will ping relevant sources. Note that I have just as much access as you do when it comes to getting a response on these issues, so a response is not guaranteed. Hopefully this gets on someone's radar for a more in-depth look.
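@Ophidian I could rant about this at length, but for the Korean data: UTF-8 can represent Korean perfectly well; the problem is that the API output appears to have been encoded twice, with the original UTF-8 bytes misread as ISO-8859-1 and then encoded as UTF-8 again. To get usable text, convert the response from UTF-8 to ISO-8859-1, which removes the extra layer and leaves the original UTF-8 bytes. A legacy Korean encoding such as EUC-KR (KS_C_5601) is not what you want here, and do not be fooled by guides suggesting ISO-8859-9 (ISO Latin-5), which is only the Turkish variant of Latin-1.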

This means that in PHP you can use the utf8_decode() function (not recommended!) or the more modern UConverter class, available as of PHP 5.5.0 in the ext/intl extension:

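<?php
/**
 * Fetch a JSON document from the API and parse it.
 *
 * When $decode is true, the raw response is converted from UTF-8 to
 * ISO-8859-1 before parsing, which strips the extra encoding layer.
 */
function api_request($url, $decode = false)
{
    $raw_json = file_get_contents($url);

    // file_get_contents() returns false when the request fails (e.g. on a 404).
    if ($raw_json === false)
    {
        throw new RuntimeException('Request failed: ' . $url);
    }

    if ($decode)
    {
        // Reuse a single converter across calls.
        static $u;

        if (!$u)
        {
            // UConverter takes the destination encoding first, then the source.
            $u = new UConverter('ISO-8859-1', 'UTF-8');
        }

        return json_decode($u->convert($raw_json));
    }

    return json_decode($raw_json);
}

$key = 'API KEY HERE!';
$url = 'https://kr.api.battle.net/sc2/profile/1366428/1/Ryung/matches?locale=ko_KR&apikey=' . $key;

// Print the first map name as delivered, then after the conversion.
echo 'Encoded: ' . api_request($url)->matches[0]->map, PHP_EOL;
echo 'Decoded: ' . api_request($url, true)->matches[0]->map, PHP_EOL;
?>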


Output:

Encoded: 오딧세이 - 래더
Decoded: 오딧세이 - 래더


As you can see, the characters are correctly represented after the decoding. I hope this is somewhat helpful to you :)
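For anyone not using PHP, the same idea can be sketched in Python. This is only an illustration under the assumption that the garbling is exactly "UTF-8 misread as ISO-8859-1"; `repair_text` and `repair_json` are hypothetical helper names, and the payload is built locally rather than fetched from the API.

```python
import json


def repair_text(value: str) -> str:
    """Undo one layer of UTF-8-read-as-ISO-8859-1 mojibake, if present."""
    try:
        return value.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not mojibake of this kind (or already correct): leave it alone.
        return value


def repair_json(node):
    """Apply repair_text to every string value in a parsed JSON structure."""
    if isinstance(node, str):
        return repair_text(node)
    if isinstance(node, list):
        return [repair_json(item) for item in node]
    if isinstance(node, dict):
        return {key: repair_json(val) for key, val in node.items()}
    return node


# Build a garbled sample locally instead of calling the API.
original = "오딧세이 - 래더"
garbled = original.encode("utf-8").decode("iso-8859-1")
payload = json.loads(json.dumps(
    {"matches": [{"map": garbled, "type": "SOLO", "date": 1494919760}]}
))

fixed = repair_json(payload)
print(fixed["matches"][0]["map"])
```

Strings that are already correct pass through unchanged, so applying the repair to a whole response should be harmless. If the garbled text was misread as Windows-1252 rather than ISO-8859-1 (curly quotes and dashes are a hint), this sketch leaves it untouched.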
05/20/2017 05:38 AM, posted by Kalle:
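@Ophidian I could rant about this at length, but for the Korean data: UTF-8 can represent Korean perfectly well; the problem is that the API output appears to have been encoded twice, with the original UTF-8 bytes misread as ISO-8859-1 and then encoded as UTF-8 again. To get usable text, convert the response from UTF-8 to ISO-8859-1, which removes the extra layer and leaves the original UTF-8 bytes. A legacy Korean encoding such as EUC-KR (KS_C_5601) is not what you want here, and do not be fooled by guides suggesting ISO-8859-9 (ISO Latin-5), which is only the Turkish variant of Latin-1.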


Quite right, a friend also pointed out that these can be decoded as you suggest. Sadly there are still cases where this garbled encoding causes problems, for example, these both fail:
https://eu.api.battle.net/sc2/profile/7165753/1/Ä%20ŊǾмÄ%20ŁĭЄ/?locale=en_GB&apikey=[APIKEY]
https://eu.api.battle.net/sc2/profile/7165753/1/ąŊǾмąŁĭЄ/?locale=en_GB&apikey=[APIKEY]

The main profile page can load: http://eu.battle.net/sc2/en/profile/7165753/1/ąŊǾмąŁĭЄ/
but none of the other pages load:
http://eu.battle.net/sc2/en/profile/7165753/1/ąŊǾмąŁĭЄ/matches
http://eu.battle.net/sc2/en/profile/7165753/1/ąŊǾмąŁĭЄ/ladder/
http://eu.battle.net/sc2/en/profile/7165753/1/ąŊǾмąŁĭЄ/achievements/

It seems likely these are all related.
@Ophidian if my memory serves me right, you will automatically get an error if you URL-encode data when sending it to the API (e.g. the two API calls you gave as examples above).

This mojibake you linked, ąŊǾмąŁĭЄ: where does it come from? Is it the website itself that mangles the name when you search for a specific name in Korean, or something else?
05/25/2017 12:40 AM, posted by Kalle:
@Ophidian if my memory serves me right, you will automatically get an error if you URL-encode data when sending it to the API (e.g. the two API calls you gave as examples above).

This mojibake you linked, ąŊǾмąŁĭЄ: where does it come from? Is it the website itself that mangles the name when you search for a specific name in Korean, or something else?

No, that is the player's actual account name.
To follow up here, it seems the previous examples are now working on the website, but they are all still broken on the API. For example, all Korean language account names are returning 404 profile not found (in API only): https://kr.api.battle.net/sc2/profile/1396784/1/전태양/matches?locale=en_US&apikey=[API_KEY]
@Ophidian Are you sure you have the proper encoding for those special characters? I am not sure they will be processed correctly otherwise.
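One way to rule out a client-side encoding mistake is to build the request URL with the name explicitly percent-encoded as UTF-8. The sketch below only constructs the URL and sends nothing. `profile_matches_url` is a hypothetical helper, the Korean name is a sample value, and the thread does not establish whether the API accepts this form (an earlier reply recalls URL-encoded names being rejected).

```python
from urllib.parse import quote

# Base path as used in the API URLs quoted in this thread.
API_ROOT = "https://kr.api.battle.net/sc2/profile"


def profile_matches_url(profile_id: int, realm: int, name: str,
                        api_key: str, locale: str = "en_US") -> str:
    """Build a match-history URL with the name percent-encoded as UTF-8."""
    encoded_name = quote(name, safe="")  # quote() encodes as UTF-8 by default
    return (f"{API_ROOT}/{profile_id}/{realm}/{encoded_name}"
            f"/matches?locale={locale}&apikey={api_key}")


# "[API_KEY]" is a placeholder, exactly as in the URLs above.
url = profile_matches_url(1396784, 1, "전태양", "[API_KEY]")
print(url)
```

Comparing the response for this URL with the response for the raw, unencoded form would show whether the failure depends on how the client encodes the name.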
06/06/2017 11:38 AM, posted by unbound:
@Ophidian Are you sure you have the proper encoding for those special characters? I am not sure they will be processed correctly otherwise.


Yes
This is still a major issue. 100% of requests using Korean language names will fail.
