Coreteks

The official website for the YouTube channel Coreteks

Beyond Apple M1 – Achieving maximum density and flexibility through 3D stacking

Apple recently unveiled its new M1 SoC, designed specifically to deliver higher performance in its Macs and challenging, for the first time, the hegemony of the x86 duopoly. Built as a monolithic SoC on TSMC’s most recent 5nm node, the M1 raises some questions about how its architecture will evolve. As discussed in previous articles, the days of Moore’s Law are numbered: we can no longer count on each new node to deliver ever larger performance jumps in new SoCs, a limitation that is forcing the industry to search desperately for alternative ways to increase performance.

To face the design challenges caused by the death of Moore’s Law, the industry has begun splitting SoC cores into separate dies, as in the 2.5D “chiplet” integration approach recently used by AMD, so that architectures can keep improving without resorting to huge (and very expensive) dies. However, this approach cannot be applied directly in every situation. Mobile devices impose additional requirements: the package must keep a small form factor, and communication between the dies must offer high bandwidth at the lowest possible energy per bit.
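
To make the bandwidth versus energy-per-bit trade-off concrete, here is a minimal sketch of the power a die-to-die link draws, which is simply bits per second multiplied by joules per bit. The function name and the operating points (500 GB/s, 1.0 and 0.25 pJ/bit) are hypothetical values chosen for illustration; they are not figures from AMD, Apple, or any foundry.

```python
def link_power_watts(bandwidth_gb_per_s: float, energy_pj_per_bit: float) -> float:
    """Power drawn by a die-to-die link: (bits per second) x (joules per bit)."""
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    joules_per_bit = energy_pj_per_bit * 1e-12
    return bits_per_second * joules_per_bit


# Hypothetical operating points, chosen only to show the scaling.
BANDWIDTH_GB_PER_S = 500.0
for label, pj_per_bit in [("higher-energy link", 1.0), ("lower-energy link", 0.25)]:
    watts = link_power_watts(BANDWIDTH_GB_PER_S, pj_per_bit)
    print(f"{label}: {pj_per_bit} pJ/bit at {BANDWIDTH_GB_PER_S:.0f} GB/s -> {watts:.2f} W")
```

With these assumed numbers the link costs 4 W at 1 pJ/bit and 1 W at 0.25 pJ/bit. In a mobile power budget of a few watts, that difference is why energy per bit matters as much as raw bandwidth.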

With this in mind, the foundry industry has spent the past few years researching several 3D stacking alternatives to meet these and other market requirements. For example, in 2019 TSMC presented its 3D-MiM fan-out approach, designed as a low-cost alternative to current 2.5D ICs for 5G/AI-driven HPC and server applications. More recently, at Hot Chips 2020, Samsung introduced its X-Cube chip packaging technology, which uses through-silicon vias (TSVs) instead of wires for the vertical electrical connections and integrates an SRAM die on top of a logic die. However, none of the alternatives presented so far meets two requirements that are essential for a “beyond M1” SoC: high-bandwidth communication between two split logic dies (e.g., CPU and GPU on different dies), and truly low-cost packaging.


Comparison between the maximum memory data bandwidth achieved by FC-PoP and 3D-MiM fan-out. [link]


An illustrative image of the 3D X-Cube technology proposed by Samsung. [link]

And that brings us to the main question: how does Apple intend to evolve its architecture, knowing the ultimate fate of Moore’s Law and that large dies on advanced nodes can make monolithic SoCs too costly to develop?

The trivial answer to that question would be to split SoC development into multiple distinct architectures, each aimed at a different type of package, which would add complexity and increase R&D and production costs. However, the following patent indicates that Apple, unlike the rest of the industry, is not willing to settle for the trivial answer and is moving towards truly low-cost 3D stacked packaging for its future SoCs.

Achieving a high bandwidth die-to-die interconnect through a folded 3D die arrangement


A cross-sectional side view illustration of the folded 3D die arrangement structure proposed by Apple. [link]



An isometric view of a portion of a substrate’s trace routing for multi-layer break-out pads (Intel patent)

In August 2020, a new Apple patent was published that tackles the problem of developing low-cost 3D packaging for its future SoCs. The patent proposes a folded 3D die arrangement that can be used to split SoC cores into separate dies, combining a conventional interposer with a new vertical interposer, which is the key element of the approach. By leveraging both vertical stacking and a local interposer, this folded die package structure can achieve high-bandwidth die-to-die interconnects while significantly reducing footprint compared to conventional fan-out RDL or 2.5D packaging solutions, which makes it superior, in a direct comparison, to the solutions presented so far by TSMC and Samsung. Furthermore, such an arrangement may significantly reduce production costs, especially compared to conventional 3D packaging methods that rely on face-to-face die interconnections using TSVs.

The patent highlights several promising applications that point to possible evolution paths for Apple’s “beyond M1” architectures. The first and most obvious is to split the CPU and GPU into two distinct dies interconnected through this arrangement. Because the two smaller dies can be tested and binned separately, this path would improve effective yield for both classes of device, significantly reducing the manufacturing cost of each and making a larger CPU and GPU practical in future SoCs.
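
The yield argument can be sketched with the classic Poisson die-yield model, Y = exp(-D0 × A), where D0 is the defect density and A the die area. This is a rough illustration, not Apple’s or TSMC’s actual cost model: the defect density and die areas below are hypothetical, and the sketch ignores packaging cost, known-good-die testing, and the extra area needed for die-to-die interfaces.

```python
import math


def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero killer defects under the Poisson model."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-defects_per_cm2 * area_cm2)


def silicon_per_good_die(area_mm2: float, defects_per_cm2: float) -> float:
    """Wafer area (mm^2) consumed on average to obtain one working die."""
    return area_mm2 / poisson_yield(area_mm2, defects_per_cm2)


# Hypothetical inputs, for illustration only.
D0 = 0.1                 # defects per cm^2
MONOLITHIC_MM2 = 400.0   # one large die holding CPU and GPU
CPU_MM2 = 200.0          # the same logic split into two dies
GPU_MM2 = 200.0

mono_cost = silicon_per_good_die(MONOLITHIC_MM2, D0)
split_cost = silicon_per_good_die(CPU_MM2, D0) + silicon_per_good_die(GPU_MM2, D0)

print(f"monolithic yield: {poisson_yield(MONOLITHIC_MM2, D0):.1%}")
print(f"per-die yield after split: {poisson_yield(CPU_MM2, D0):.1%}")
print(f"silicon per good SoC, split vs monolithic: {split_cost / mono_cost:.2f}x")
```

Under this model a defect kills only the smaller die it lands on rather than the whole SoC, so the silicon spent per working SoC drops. The advantage grows with die area and defect density, which is why the split is most attractive for large dies on young nodes.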

Another possibility would be to split the CPU across two distinct dies, significantly increasing the core count, while using a discrete GPU. This would be quite interesting for a high-end Mac workstation that could seriously challenge the x86 hegemony in this class of products.

Finally, the most interesting and least trivial possibility is to build the dies on different process nodes, matching each die to the node that performs best for its specific function and thereby improving the performance of the SoC as a whole.

Intel’s answer

As impressed as the most enthusiastic reader may be by Apple’s development, Intel has developed a direct patent counterpart whose technical features could cause equal or even greater astonishment.

In 2019, an Intel patent was published describing a side-mounted interconnect bridge, which I like to call the vertical EMIB. This vertical EMIB would be an even more complex and higher-performance solution, aimed mainly at further interconnecting Intel’s 3D stacked designs. Although Intel’s approach seems more complete, in the sense of achieving higher interconnect bandwidth at an even better energy per bit, its development complexity and manufacturing cost would be prohibitive outside HPC. Therefore, the proposal patented by Apple, even with several debatable points of its own regarding complexity and manufacturing cost, remains more suitable and more feasible for the mainstream consumer market.

In a future article I will return to this Intel patent and discuss it in more detail.

A final thought

It has been predictable for some time that a mobile company would eventually adopt advanced packaging for 3D stacked designs. In this sense, it is not surprising that Apple, the prominent leader in smartphone development, is the first company to develop a 3D stacked SoC design for its future iPhone line. What is truly surprising is the larger context in which this patent sits, which opens up a multitude of new opportunities and positions Apple as a possible rival to the x86 hegemony in desktop processors.

This folded 3D die arrangement proposed by Apple may eventually lead to the development of more affordable advanced packages produced by foundries, which will reach the mobile industry as a whole, going far beyond the Cupertino stronghold. The new wave of 3D development in the mobile market is on its way.

References and further reading:

  1. https://3dfabric.tsmc.com/english/dedicatedFoundry/technology/InFO.htm
  2. Samsung Announces Availability of its Silicon-Proven 3D IC Technology for High-Performance Applications, Samsung Newsroom, 2020.
  3. US20200273843 – High bandwidth die to die interconnect with package area reduction, Zhong et al., Apple Inc. (2020)
  4. US20190341349 – Side mounted interconnect bridges, Hossain et al., Intel Corporation (2019)

Underfox
