Dark silicon and the end of multicore scaling
Citation: Hadi Esmaeilzadeh, Emily Blem, Renée St. Amant, Karthikeyan Sankaralingam, Doug Burger (2011/07) Dark silicon and the end of multicore scaling. Annual International Symposium on Computer Architecture
DOI (original publisher): 10.1145/2000064.2000108
Semantic Scholar (metadata): 10.1145/2000064.2000108
Sci-Hub (fulltext): 10.1145/2000064.2000108
Internet Archive Scholar (search for fulltext): Dark silicon and the end of multicore scaling
Download: https://dl.acm.org/doi/10.1145/2000064.2000108
Tagged: Computer Science, computer architecture
Summary
For decades, Moore's Law delivered more transistors per chip while Dennard Scaling lowered the power each transistor needed, keeping power density roughly constant; in combination, performance scaled exponentially. However, Dennard Scaling has failed, which drove the industry toward multicore. Multicore is now reaching diminishing returns because of power limitations. This paper uses analytic and empirical models to support that claim.
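The interaction between the two trends can be illustrated with the textbook first-order model of dynamic CMOS power. The symbols (capacitance C, supply voltage V, frequency f, scaling factor S > 1) are the usual textbook ones and are an assumption here, not notation taken from the paper; leakage power is ignored.

```latex
\begin{align*}
P &= C V^{2} f
  && \text{dynamic power of one transistor} \\
P_{\text{Dennard}} &= \frac{C}{S} \cdot \left(\frac{V}{S}\right)^{2} \cdot S f = \frac{P}{S^{2}}
  && \text{ideal scaling: } C \to C/S,\; V \to V/S,\; f \to S f \\
S^{2} \cdot P_{\text{Dennard}} &= P
  && \text{density grows by } S^{2}\text{, so power per area is constant} \\
P_{\text{fixed } V} &= \frac{C}{S} \cdot V^{2} \cdot S f = P
  && \text{if } V \text{ no longer scales} \\
S^{2} \cdot P_{\text{fixed } V} &= S^{2} P
  && \text{power per area now grows by } S^{2} \text{ per generation}
\end{align*}
```

Under these assumptions, once voltage stops scaling a fixed power budget can no longer keep every transistor switching at full frequency, which is the situation the paper analyzes.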
Theoretical and Practical Relevance
This is a hugely influential and controversial paper because it claims that multicore scaling is coming to an end. The conclusion states that either architects must create radically new architectures or Moore's Law will end.
Elsewhere
- PeerLibrary
- Related work: A Perspective on Dark Silicon
Conclusion
- In the past three decades, Moore's Law combined with Dennard Scaling has delivered exponentially increasing performance and decreasing power per transistor.
- When Dennard Scaling began to fail, the industry went with multi-core.
- But power limitations make it hard to increase performance with multicore; returns are diminishing, even for highly parallel codes.
- Dark silicon is the part of a chip that must stay powered off at a given time because of the power budget. This paper claims that the amount of dark silicon will increase due to power constraints.
- Either computer architects will design novel architectures (past the existing Pareto curve) or Moore's Law will end soon due to power limitations.
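The diminishing-returns point can be seen even with plain Amdahl's Law. The sketch below is not the paper's full model (it ignores power, area, and per-core performance); the function name and the parallel fractions tried are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Speedup over one core when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    # Illustrative parallel fractions, not measurements from the paper.
    for f in (0.75, 0.95, 0.99):
        row = ", ".join(
            f"{n} cores: {amdahl_speedup(f, n):.1f}x" for n in (1, 8, 64, 512)
        )
        print(f"parallel fraction {f}: {row}")
```

However many cores are added, the speedup stays below 1 / (1 - parallel_fraction), so each extra core buys less than the one before it.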
Method
- Device Model: area, frequency, and power requirements as a function of technology node (e.g. 5nm).
- Core Model: Pareto-frontier of possible (area, power, performance).
- Normally one would use Pollack's rule (performance grows roughly with the square root of core area), but newer technology has different voltage and frequency, so the paper derives an empirical model instead.
- Use ordinary least squares (OLS) to fit a cubic polynomial to power as a function of performance and a quadratic polynomial to area as a function of performance, using data from existing processors.
- Multicore model: How to use the transistors to create multiple cores
- Topologies modeled: symmetric multicore (traditional), asymmetric multicore (one big core and many little cores, e.g. big.LITTLE), dynamic multicore (asymmetric, but you can turn the big core off), and composed multicore (little cores are reconfigurable into a big core).
- Applications modeled: with Amdahl's Law.
- Measure parallel/serial fractions on PARSEC applications.
- Architectural features modeled: multi-level cache
- Model validation through GPGPU-Sim
- Device Model + Core Model + Multicore Model:
- Use the device model to get area, frequency, and power. Use the core model on area and power to get performance. Exhaustively search through multicore models to find the best candidate.
- This can model CPU- and GPU-like processors.
- Sensitivity analysis validates model.
- Limitations:
- Ignore SMT
- Ignore memory subsystem
- Ignore ARM and Tilera cores, no SPECmark scores.
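The combined search described in the Method list can be sketched roughly as follows, for the symmetric topology only. Everything numeric here is hypothetical: the polynomial coefficients, the budgets, and the candidate performance values were chosen so the example runs and are not the paper's fitted values. The names `core_power`, `core_area`, `Design`, and `best_symmetric_design` are also invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


# HYPOTHETICAL frontier coefficients, not the paper's fitted values.
def core_power(q: float) -> float:
    """Cubic power-vs-performance frontier (watts per core)."""
    return 0.002 * q**3 + 0.01 * q**2 + 0.1 * q + 0.5


def core_area(q: float) -> float:
    """Quadratic area-vs-performance frontier (mm^2 per core)."""
    return 0.05 * q**2 + 0.2 * q + 1.0


@dataclass
class Design:
    core_perf: float      # single-core performance q
    n_cores: int          # cores that fit in both budgets
    speedup: float        # Amdahl speedup relative to a q = 1 core
    dark_fraction: float  # share of the area budget not covered by active cores


def best_symmetric_design(
    area_budget: float,
    power_budget: float,
    parallel_fraction: float,
    candidates: Iterable[float],
) -> Optional[Design]:
    """Exhaustively try each candidate core and keep the fastest chip."""
    best: Optional[Design] = None
    for q in candidates:
        # Core count is capped by whichever budget runs out first.
        n = min(int(area_budget // core_area(q)), int(power_budget // core_power(q)))
        if n < 1:
            continue
        serial_time = (1.0 - parallel_fraction) / q
        parallel_time = parallel_fraction / (n * q)
        speedup = 1.0 / (serial_time + parallel_time)
        dark = 1.0 - n * core_area(q) / area_budget
        if best is None or speedup > best.speedup:
            best = Design(q, n, speedup, dark)
    return best


if __name__ == "__main__":
    # Hypothetical budgets: 100 mm^2, 10 W, 90% parallel code.
    print(best_symmetric_design(100.0, 10.0, 0.9, range(1, 21)))
```

When the power budget is the binding limit, fewer cores can be active than would fit in the area, and the leftover area is the dark silicon. The paper's actual model additionally covers the asymmetric, dynamic, and composed topologies and the device-scaling projections, none of which are reproduced here.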