How to version-control a data analysis project

How can a data analysis project be versioned effectively?

Which parts need version control, and which do not?

How should the charts generated in the project be managed?


Basically, my plan is to use Jupyter Notebook: put some intermediate results (stored as pickle files) and the functions used by the pipeline in a tool module, indicate the version through the numeric label at the start of each notebook's name, and finally use git for version control. For example:

-- project
  |__ data:
      |__ SQL:SQL
      |__ pickle:
  |__ src:Notebook
  |__ notebooks:
      |__ 0.0 contents and introduction.ipynb:notebook
      |__ 1.0 EDA.ipynb
      |__ 1.1 .ipynb
      |__ 1.2 .ipynb
      |__ 2.0 EDA.ipynb
      |__ ...
      |__ end.0 .ipynb
  |__ temp_module:notebook
  |__ README
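
To make the pickle part of the plan concrete, here is a minimal sketch of the kind of helper I imagine living in the tool module. The function names (`save_intermediate`, `load_intermediate`), the `.pkl` suffix and the default `data/pickle` path are only assumptions for illustration, chosen to mirror the layout above; nothing here is settled.

```python
import pickle
from pathlib import Path

# Hypothetical default location, mirroring the data/pickle folder in the layout.
PICKLE_DIR = Path("data") / "pickle"


def save_intermediate(obj, name, pickle_dir=PICKLE_DIR):
    """Pickle obj to <pickle_dir>/<name>.pkl and return the resulting path."""
    pickle_dir = Path(pickle_dir)
    # Create the folder on first use so a fresh clone works without setup.
    pickle_dir.mkdir(parents=True, exist_ok=True)
    path = pickle_dir / f"{name}.pkl"
    with path.open("wb") as fh:
        pickle.dump(obj, fh)
    return path


def load_intermediate(name, pickle_dir=PICKLE_DIR):
    """Load the object previously stored under <name> by save_intermediate."""
    path = Path(pickle_dir) / f"{name}.pkl"
    with path.open("rb") as fh:
        return pickle.load(fh)
```

A notebook such as 1.0 EDA would then call something like `save_intermediate(summary, "1.0_eda_summary")`, and a later notebook would call `load_intermediate("1.0_eda_summary")`, so the notebook label that produced a result stays visible in the file name. Whether these pickle files themselves belong in git is part of what I am asking.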