Native dataflow — Commonly occurring operators in machine-learning frameworks and DSLs can be described in terms
of parallel patterns that capture parallelizable computation on both dense and sparse data collections along with
corresponding memory access patterns. Describing computation this way enables high utilization of the underlying platform while
allowing a diverse set of models to be written easily in any framework of choice.
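A minimal sketch of what such parallel patterns look like, written as plain Python rather than any vendor API (the relu, weights, and inputs names are invented for illustration): the zip-with, map, and reduce stages below expose both the parallelism and the memory access pattern that a dataflow compiler could fuse and pipeline.

```python
from functools import reduce
import operator

# Hypothetical example: a computation expressed as composable parallel patterns.
# A dataflow compiler can analyze such patterns (zip-with, map, reduce) and
# their memory access behavior, then fuse and pipeline them spatially.

def relu(x):
    return x if x > 0.0 else 0.0

weights = [0.5, -1.2, 0.7, 2.0]
inputs  = [1.0,  3.0, 0.5, -2.0]

# "zip-with" pattern: elementwise multiply of two dense collections
products = [w * x for w, x in zip(weights, inputs)]

# "map" pattern: apply an elementwise nonlinearity
activated = [relu(p) for p in products]

# "reduce" pattern: sum-reduction over the collection
total = reduce(operator.add, activated, 0.0)

print(total)
```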
Support for terabyte-sized models — A key trend in deep-learning development is the use of increasingly large models
to gain higher accuracy and deliver more sophisticated functionality. For example, models with billions of learned weights (referred to as parameters) enable more accurate Natural Language Generation. In the life sciences field,
analyzing tissue samples requires processing large, high-resolution images to identify subtle features. Providing
much larger on-chip and off-chip memory stores than are available on core-based architectures will
accelerate deep-learning innovation.
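A rough back-of-the-envelope sketch (not from the whitepaper) of why parameter counts translate into terabyte-scale memory needs; the byte widths and example model sizes are illustrative assumptions, and optimizer state and activations would add substantially more.

```python
# Hypothetical estimate of raw weight storage by parameter count and precision.
# Assumed byte widths: fp32 = 4 bytes, fp16/bf16 = 2 bytes.

def param_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Raw weight storage in GiB, ignoring optimizer state and activations."""
    return num_params * bytes_per_param / 2**30

for name, n in [("1B-parameter model", 1_000_000_000),
                ("100B-parameter model", 100_000_000_000)]:
    print(f"{name}: {param_memory_gib(n, 2):,.1f} GiB in fp16, "
          f"{param_memory_gib(n, 4):,.1f} GiB in fp32")
```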
Efficient processing of sparse data and graph-based networks — Recommender systems, friend-of-friends problems,
knowledge graphs, and several life-science domains involve large sparse data structures that consist of mostly zero
values. Moving and processing large, mostly empty matrices is inefficient and degrades performance. A next-generation architecture must intelligently avoid this unnecessary work.
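As a small illustration of the idea (assuming NumPy and SciPy are available; this is not the vendor's mechanism), storing a mostly-zero matrix in a compressed sparse format means only the nonzero entries are moved and multiplied:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

# A mostly-zero matrix: roughly 99% of entries are exactly zero.
dense = rng.random((1000, 1000))
dense[dense < 0.99] = 0.0

csr = sparse.csr_matrix(dense)   # compressed sparse row storage
x = rng.random(1000)

# Only the stored nonzeros participate in the multiply.
y = csr @ x
print(f"nonzeros: {csr.nnz} of {dense.size} ({csr.nnz / dense.size:.1%}); "
      f"matches dense result: {np.allclose(y, dense @ x)}")
```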
Flexible model mapping — Currently, data-parallel and model-parallel techniques are used to scale workloads across the
infrastructure. However, the programming cost and complexity are often prohibitive for new deep-learning
approaches. A new architecture should enable scaling across infrastructure automatically, without this added
development and orchestration complexity and without requiring model developers to become experts in system
architecture and parallel computing.
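For context on the two techniques named above, here is a hand-written toy sketch (not how any particular framework or accelerator does it): data parallelism gives every worker the whole model and a slice of the batch, while model parallelism gives each worker one piece of the model and streams activations between them.

```python
import numpy as np

rng = np.random.default_rng(1)
batch = rng.random((8, 16))   # 8 samples, 16 features
w1 = rng.random((16, 32))     # layer 1 weights
w2 = rng.random((32, 4))      # layer 2 weights

# Data parallelism: each "worker" holds the full model and a slice of the batch.
shards = np.array_split(batch, 2)               # two workers
data_parallel = np.concatenate([shard @ w1 @ w2 for shard in shards])

# Model parallelism: each "worker" holds one layer; activations flow between them.
hidden = batch @ w1           # worker A computes layer 1
model_parallel = hidden @ w2  # worker B computes layer 2

print(np.allclose(data_parallel, model_parallel))  # same result, different mapping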
Incorporate SQL and other pre-/post-processing of data — As deep-learning models grow and incorporate a wider variety
of data types, the dependency on pre-processing and post-processing of data becomes dominant. Additionally, the
time lag and cost of ETL operations impact real-time system goals. A new architecture should allow the unification of
these processing tasks on a single platform.
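A toy sketch of what unifying these stages on one platform can look like (using Python's built-in sqlite3 for the SQL step; the events table and the linear scoring function are invented for illustration): the SQL feature extraction feeds the model step directly in one process, with no separate ETL job in between.

```python
import sqlite3

# Hypothetical example: SQL pre-processing feeding a model step directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, 10.0), (1, 25.0), (2, 5.0), (2, 7.5), (2, 2.5)])

# Pre-processing expressed as SQL: aggregate raw events into per-user features.
rows = conn.execute(
    "SELECT user_id, COUNT(*), AVG(amount) FROM events GROUP BY user_id"
).fetchall()

# Stand-in "model": a fixed linear scoring function over the SQL-derived features.
def score(count, avg_amount):
    return 0.3 * count + 0.05 * avg_amount

for user_id, count, avg_amount in rows:
    print(user_id, round(score(count, avg_amount), 3))
```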
References:
https://sambanova.ai/hubfs/23945802/SambaNova_Accelerated-Computing-with-a-Reconfigurable-Dataflow-Architecture_Whitepaper_English-1.pdf