In the experiments, GSPMD achieved 50 percent to 62 percent compute utilization on 128 to 2048 Cloud TPUv3 cores for models with up to one trillion parameters. The results validate GSPMD as an effective single pr…
Google Presents New Parallelization Paradigm GSPMD for common ML Computation Graphs: Constant…
Synced
Ygor Serpa
Jul 23, 2021
I know that parallelizing over that many compute units is very tricky, but 50 to 62% utilization doesn't seem very good. With those numbers, 38 to 50% of the money you put in is going to waste.
What I don't know, however, is the utilization numbers of prior approaches. Do you (or the paper) have these numbers? It would be a great addition to the article to have some reference numbers from prior work.
Thanks :)
Written by Ygor Serpa
Former game developer turned data scientist after falling in love with AI and all its branches.