Proj CDeepFuzz Paper Reading: TensorFlow: A System for Large-Scale Machine Learning

Published: 2023-09-08 02:11:15 Author: 雪溯

Abstract

This paper: TensorFlow
Github: https://github.com/tensorflow/tensorflow
Task: Describe the TensorFlow dataflow model in detail
Features:

  1. operates at large scale and in heterogeneous environments
  2. supports a variety of applications

Method

  1. uses dataflow graphs to represent computation, shared state, and the operations that mutate that state
  2. maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices
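
The two Method points can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions, not TensorFlow's API: `Variable`, `Node`, `Graph`, and the device strings are hypothetical names chosen for illustration, and the device field is just a label (nothing is actually dispatched to another machine). It shows a graph whose nodes include ordinary computation, a read of shared state, and an operation that mutates that state, plus a grouping of nodes by assigned device.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Variable:
    """Mutable shared state that graph operations can read or update (toy class)."""
    value: float


@dataclass
class Node:
    """One graph vertex: an operation plus the names of the nodes feeding it."""
    name: str
    op: Callable[..., float]
    inputs: List[str] = field(default_factory=list)
    device: str = "/cpu:0"  # placement label only; assumed format, nothing is dispatched


class Graph:
    def __init__(self):
        self.nodes: Dict[str, Node] = {}

    def add(self, name, op, inputs=(), device="/cpu:0"):
        self.nodes[name] = Node(name, op, list(inputs), device)
        return name

    def run(self, fetch, cache=None):
        """Evaluate only the subgraph that `fetch` depends on."""
        cache = {} if cache is None else cache
        if fetch not in cache:
            node = self.nodes[fetch]
            args = [self.run(i, cache) for i in node.inputs]
            cache[fetch] = node.op(*args)
        return cache[fetch]

    def placement(self):
        """Group node names by the device label they were assigned."""
        groups = {}
        for node in self.nodes.values():
            groups.setdefault(node.device, []).append(node.name)
        return groups


w = Variable(1.0)  # shared state, imagined to live on a parameter-server task


def assign_add(y):
    # State-mutating operation: updates w in place and returns the new value.
    w.value += 0.1 * y
    return w.value


g = Graph()
g.add("x", lambda: 3.0, device="/job:worker/task:0/cpu:0")
g.add("read_w", lambda: w.value, device="/job:ps/task:0/cpu:0")
g.add("y", lambda x, wv: x * wv, inputs=["x", "read_w"],
      device="/job:worker/task:0/gpu:0")
g.add("update_w", assign_add, inputs=["y"], device="/job:ps/task:0/cpu:0")

first_y = g.run("y")         # computation only; w is read but not changed
new_w = g.run("update_w")    # runs y, then mutates the shared state
second_y = g.run("y")        # the same subgraph now sees the updated w
by_device = g.placement()    # which nodes were assigned to which device label
```

Fetching `y` twice gives different results because the `update_w` node changed the shared variable in between, which is the sense in which state and state-mutating operations are part of the graph itself rather than external to it.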

1. Intro

2. Background & motivation

2.1 Previous system: DistBelief

2.2 Design principles

3. TensorFlow execution model

3.1 Dataflow graph elements

3.2 Partial and concurrent execution

3.3 Distributed execution

3.4 Dynamic control flow

4. Extensibility case studies

4.1 Differentiation and optimization

4.2 Training very large models

4.3 Fault tolerance

4.4 Synchronous replica coordination

5. Implementation

6. Evaluation

6.1 Single-machine benchmarks

6.2 Synchronous replica microbenchmark

6.3 Image classification

7. Conclusions