Building up the water effect layer by layer.
The 45x speedup from fori_loop to vmap didn’t come from a better algorithm. It was the same algorithm with one additional piece of information: “these Q blocks are independent.” XLA is a JIT compiler: it does dataflow analysis, operator fusion, and memory planning. But it can’t infer independence from a fori_loop with carried state, because each iteration formally depends on the carry produced by the previous one. vmap is semantically “map this function over a batch,” so independence is built into the abstraction.
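To make the contrast concrete, here is a minimal sketch of the two forms, assuming a blocked attention computation where each block of queries attends to the full set of keys and values. The function names (`attend_block`, `attention_fori`, `attention_vmap`) and the shapes are hypothetical and chosen only for illustration; the original code behind the 45x figure is not shown here and may differ.

```python
import jax
import jax.numpy as jnp
from jax import lax


def attend_block(q_blk, k, v):
    # Softmax attention for one block of queries against all keys/values.
    scores = q_blk @ k.T / jnp.sqrt(q_blk.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v


def attention_fori(q_blocks, k, v):
    # Loop form: the output buffer is carried state, so each iteration
    # formally depends on the result of the previous one.
    def body(i, out):
        return out.at[i].set(attend_block(q_blocks[i], k, v))

    init = jnp.zeros_like(q_blocks)
    return lax.fori_loop(0, q_blocks.shape[0], body, init)


def attention_vmap(q_blocks, k, v):
    # Batched form: vmap maps over the leading (block) axis of q_blocks
    # and shares k and v, which states that the blocks do not interact.
    return jax.vmap(attend_block, in_axes=(0, None, None))(q_blocks, k, v)


# Illustrative shapes: (num_blocks, block_size, head_dim) for the queries.
kq, kk, kv = jax.random.split(jax.random.PRNGKey(0), 3)
q_blocks = jax.random.normal(kq, (8, 16, 32))
k = jax.random.normal(kk, (128, 32))
v = jax.random.normal(kv, (128, 32))

out_loop = jax.jit(attention_fori)(q_blocks, k, v)
out_vmap = jax.jit(attention_vmap)(q_blocks, k, v)
```

Both functions compute the same values. The only difference is what the program tells the compiler: the loop threads `out` through every iteration, while the `vmap` version declares up front that each block can be processed on its own. Any speedup you see will depend on your shapes and hardware.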