Samsung’s processor chaos affects the Mobile division’s processor choices
Over the past few months there has been a lot of talk about the system-on-a-chip (SoC) choices that Samsung’s Mobile division has made for its recent and upcoming products. New information now sheds more light on how these choices came to be, the reasoning behind them, and what the repercussions are for both the company and its users.
Back in January, at the time of the official announcement at CES 2013, and even several weeks earlier, when the internet rumour mill and Korean analysts predicted that the Galaxy S4 would come with the company’s own Exynos Octa SoC, most people were not aware of what was going on behind the scenes at Samsung.
Samsung’s System LSI Business (SLSI), a business unit of the company’s Semiconductor division, has had several SoC projects in its pipeline. We know for certain of the 5410 (Octa), as well as the 5450 and 5440, both quad-core A15 chips that seem never to have been released. Another rumoured project was the 5210, supposedly a big.LITTLE chip in a 2 + 2 configuration.
Something happened last winter that panicked the Mobile division into changing SoC provider for many variants of the S4. Mounting evidence of this can be found in the overlapping locale-specific variants in the official source code for both the Qualcomm and Exynos platforms. JF-based variants, which are built on the Snapdragon chips, overlap with JA-based variants, which are built on the Exynos platform. Korean (jf_ktt, jf_lgt, jf_skt <> jaltektt, jaltelgt, jalteskt), European (jf_eur <> jalte), and Japanese (jf_dcm <> jaltedcm) variants were developed for both platforms. The Korean variants ended up being the only ones to actually hit the market with the Exynos platform, other than the global 3G version, for which there is no evident Qualcomm counterpart.
Chaotic development seems to have been the norm for the whole phone. Many parts changed supplier between prototype revisions: the Atmel touchscreen controller gave way to a Synaptics counterpart, a MagnaChip AMOLED controller is missing in action, and a Philips LED controller was shelved for a Texas Instruments IC. All of these look like last-minute changes, for the simple reason that the drivers and firmware for the abandoned parts are still delivered in the shipping product, even though they are not used. A well-planned product is certainly something you would not call the S4.
The commonly assumed reason for the processor change was the unexpected power consumption of the Exynos. While this remains partly true and undoubtedly had an impact, the other reasons are far more sinister.
As a reminder, the Exynos 5410, as a big.LITTLE chip by design, is supposed to support several operating modes, which are for the most part defined and limited only by software:
– Cluster migration: only one of the two quad-core clusters works at any one time.
– Core migration: both clusters can work in tandem, in any mix of A7 and A15 cores, but at most 4 physical CPUs are online at once.
– Heterogeneous multiprocessing: all 8 cores are online at once.
The problem is that achieving either of the latter two operating modes requires a specific piece of hardware that makes those modes efficient and useful: the Cache Coherent Interconnect (CCI). As per ARM’s own claims: “Hardware coherency with CoreLink CCI-400 is a fundamental part of ARM big.LITTLE processing.”
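The three operating modes and their dependence on the CCI can be summarised in a small sketch. This is purely illustrative: the names `Mode`, `needs_cci` and `is_valid` are hypothetical and do not come from any Samsung or ARM code; the rules simply restate the descriptions above.

```python
from enum import Enum

A7_CORES = 4   # size of the LITTLE cluster on the Exynos 5410
A15_CORES = 4  # size of the big cluster on the Exynos 5410


class Mode(Enum):
    CLUSTER_MIGRATION = "cluster"
    CORE_MIGRATION = "core"
    HMP = "hmp"  # heterogeneous multiprocessing


def needs_cci(mode: Mode) -> bool:
    """Per the article, only cluster migration can do without the CCI."""
    return mode is not Mode.CLUSTER_MIGRATION


def is_valid(mode: Mode, a7_online: int, a15_online: int) -> bool:
    """Check whether a set of online cores is allowed in the given mode."""
    if not (0 <= a7_online <= A7_CORES and 0 <= a15_online <= A15_CORES):
        return False
    if a7_online + a15_online == 0:
        return False  # at least one core has to be running
    if mode is Mode.CLUSTER_MIGRATION:
        # Only one of the two clusters may be active at any one time.
        return a7_online == 0 or a15_online == 0
    if mode is Mode.CORE_MIGRATION:
        # Any mix of A7 and A15 cores, but at most 4 CPUs online.
        return a7_online + a15_online <= 4
    # HMP: any combination, up to all 8 cores at once.
    return True
```

For example, two A7 and two A15 cores online together is a valid core-migration state but not a valid cluster-migration state, and only the latter mode can run with the CCI switched off.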
@_martinovich_ Big.LITTLE can be implemented several different ways, and cluster migration does show performance & power-saving advantages
— SamsungExynos (@SamsungExynos) May 8, 2013
While it has been obvious for several months that the person behind the SamsungExynos twitter account is nothing but a clueless PR representative, the above claim is nothing short of a lie.
We have information from several sources that the Exynos’s CCI is inherently crippled in silicon. It is not functional, or even powered on, in the shipping product (i9500). In fact, this was such an issue that the chip was almost cancelled. It was reportedly only salvaged by having it work with the cluster-migration policy and bypass the CCI entirely. This contradicts ARM’s own videos demonstrating the Octa, and calls their validity into question.
Internally at SLSI, as many as three projects were cancelled late last year. We don’t know the exact reasons for their cancellation; however, the issues are said to be related, and unacceptable power consumption also played a big role.
One can argue that ARM’s Cortex A15 is partly to blame here: the architecture is inherently too power-hungry to be implemented in a smartphone. big.LITTLE provides major breathing room, but only in scenarios where continuous load is not an issue. HD gaming is a major Achilles heel where power consumption can run rampant. Nvidia is having it much worse with their Tegra 4: with only a single tablet design win besides their own Shield gaming console, it’s a chip that needs to be, and will be, quickly forgotten.
Plagued by delays, hardware bugs, and high power consumption, the Exynos 5410 could be viewed as nothing short of a failure. In fact, Samsung’s Mobile division was so dismayed at the whole situation that their next major products will completely forgo the company’s own Exynos chips and go straight with Qualcomm’s offerings.
Reports that the Note 3 will come with the S800 match this information, and are most likely correct.
With confirmed designs such as the Galaxy Tab 3 coming with an Intel processor, and the rest of the new Galaxy line-up shipping with various variants of Qualcomm’s Snapdragon S-series, the Mobile division should be lauded for providing the user with the best possible experience, even if that involves skipping the Exynos. They have shown that they have no qualms about basing their products on a wide array of third-party suppliers (ST-Ericsson, Intel, Broadcom, Qualcomm), and this strategy has proven successful.
As for SLSI, things look very bleak for Samsung’s in-house processors. The business is failing to deliver, not only in terms of support, such as providing proper hardware documentation and source code to the public, but also in terms of hardware, where the current line-up is in shambles.
Lackluster graphics performance and outdated GPUs have become something of a habit for the company. This is reportedly due to an unwillingness to spend money on IP licenses from third-party companies; the use of Mali GPUs in their SoCs stems from a free licensing agreement they receive from ARM as a lead partner. The surprise use of the SGX 544MP3 in the Exynos 5410 is due to panic caused by Mali’s own T6xx GPUs: again an issue of excessive power consumption. The first-generation Midgard line-up was quickly scrapped, leaving the Exynos 5250 and its T604 as something of an orphan. Products like the T658 never saw the light of day and are not even mentioned anymore on ARM’s website.
Meanwhile, as their shipping products fail to compete properly, Samsung is spending a lot of money on developing their own GPU IP from scratch. Not much information is available as to when we will see this in actual products, but it will eventually come, if it is not cancelled or delayed due to its unorthodox implementation of an FPGA-like re-programmable design, which might be hit-or-miss. Imagination’s Rogue architecture and years of experience as a leading GPU IP provider will be tough competition.
CPU-wise, things look just as bleak. Qualcomm currently dominates the performance-per-watt scale at the high end with the newest Krait architectures. With no custom design in the works, as done by Apple or Qualcomm, no A57 or A53 as architectural refreshes from ARM, and no new 20nm manufacturing process coming until 2014, the Exynos A15 line-up looks incapable of competing in the near future.
The above was written by a guest writer, a currently respected and knowledgeable developer in the community.
We at SamMobile have asked Samsung Exynos for a comment on this article, and when we receive any reaction from them we will update this post.
Update 1. (Addendum)
SamsungExynos has replied to the above claims, denying them:
@rahulzeven There are not issues with the CCI & this particular implementation of big.LITTLE does show increased performance/efficiency
— SamsungExynos (@SamsungExynos) May 30, 2013
We would like to believe that the CCI is indeed not crippled, and we would gladly be proven wrong by a demonstration. After all, the Exynos 5410 went through 4 identifiable chip revisions before making it into a consumer product, so there is a slight chance that our insider information refers to earlier revisions (which would also explain the very late mass-production schedule). One might consider this possibility.
However, the explanation about power consumption is simply false (technical jargon ahead):
The potential power benefits of cluster migration over a core-migration scheme come from the fact that only one cluster is powered on at any given time. This eliminates the need to keep the other cluster powered, including its L2 cache. It also allows the regulator dedicated to that cluster to be shut off, again gaining some power benefit. And lastly, of course, there is a power benefit from not having the CCI itself enabled; however, that cannot account for any large amount of power.
While all these facts would technically give a cluster migration scheme an advantage, they are not exclusive to it, and are possible with a core migration scheme too. In fact, Samsung’s own (non-working) implementation of the In-Kernel-Switcher in the kernel source-code handles exactly this use-case.
It is possible to mimic cluster migration with the help of the CCI in a core-migration scheme: instead of flushing the L2 caches to main memory and then loading the data back into the other CPU (as is currently done on the Galaxy S4), the switch would use the CCI to transfer the data. The approach used in the current shipping product is not only a power-efficiency problem, but also a user-experience issue, as it may induce micro-lags.
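To make the difference between the two switching paths concrete, here is a rough counting model. It is a deliberate simplification and not a description of Samsung’s implementation: the function `switch_traffic` and its parameters are hypothetical, and it assumes that a dirty line pulled across the CCI is owned by the inbound cluster afterwards and therefore needs no write-back from the outbound one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchTraffic:
    dram_writes: int    # dirty L2 lines written out to main memory
    dram_reads: int     # lines the inbound cluster must refetch from DRAM
    cci_transfers: int  # lines moved cache-to-cache over the CCI


def switch_traffic(dirty_lines: int, reused_lines: int,
                   cci_available: bool) -> SwitchTraffic:
    """Count memory traffic caused by one cluster switch.

    dirty_lines:  dirty L2 lines on the outbound cluster at switch time
    reused_lines: how many of those the inbound cluster touches again
    """
    if not 0 <= reused_lines <= dirty_lines:
        raise ValueError("reused_lines must be between 0 and dirty_lines")
    if cci_available:
        # Reused lines travel over the interconnect; only the rest
        # has to be written back before the old cluster powers down.
        return SwitchTraffic(dram_writes=dirty_lines - reused_lines,
                             dram_reads=0,
                             cci_transfers=reused_lines)
    # Without a working CCI: flush everything, then refetch what is needed.
    return SwitchTraffic(dram_writes=dirty_lines,
                         dram_reads=reused_lines,
                         cci_transfers=0)
```

In this model, a switch with 1000 dirty lines of which 600 are reused costs 1600 DRAM accesses without the CCI, against 400 DRAM accesses and 600 interconnect transfers with it. The figures are made up for illustration; the point is only that the flush-and-refetch path touches main memory twice for data that could have stayed on chip.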
Meanwhile, the power penalty of running a task on an A15 core when it could easily be handled by an A7 core can be massive. Such situations are why Qualcomm goes for asynchronous clock planes for each CPU. ARM’s architecture, however, cannot do this by design: all cores inside a cluster must run at the same clock.
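The effect of a shared clock can be sketched with a toy model. The helper names below (`shared_clock`, `per_core_clock`, `excess_mhz`) are hypothetical, the frequencies are invented for the example, and the model says nothing about voltage or actual power draw; it only shows how much clock speed is handed to cores that did not ask for it.

```python
from typing import List, Sequence


def shared_clock(demands_mhz: Sequence[int]) -> List[int]:
    """One clock for the whole cluster: every core runs at the highest
    frequency that any single core in the cluster demands."""
    top = max(demands_mhz)
    return [top] * len(demands_mhz)


def per_core_clock(demands_mhz: Sequence[int]) -> List[int]:
    """Asynchronous clock planes: each core runs at its own demand."""
    return list(demands_mhz)


def excess_mhz(demands_mhz: Sequence[int], clocks_mhz: Sequence[int]) -> int:
    """Total clock speed supplied beyond what the cores asked for."""
    return sum(c - d for c, d in zip(clocks_mhz, demands_mhz))


# One busy core and three nearly idle ones (made-up numbers).
demands = [1600, 200, 200, 200]
wasted_shared = excess_mhz(demands, shared_clock(demands))
wasted_per_core = excess_mhz(demands, per_core_clock(demands))
```

With these demands the shared-clock cluster runs all four cores at 1600 MHz, so three cores each get 1400 MHz more than they need, while the per-core scheme has no excess at all.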
And to put the final nail in the coffin of such false claims, which cannot objectively hold their weight: we have Samsung’s own leaked internal presentation, in which they admit that cluster migration offers only a limited power advantage and makes only limited use of a big.LITTLE system. The claim goes against ARM’s own statements, and persons involved in the development of software for the system described it in discussions as “PR bullshit”.