If you want to start using Python for data science quickly, install Anaconda and write your code in the Notebook or JupyterLab tools bundled with it. For beginners, installing the official Python interpreter and then adding the required third-party libraries one by one is tedious. On Windows in particular, installation often fails because build tools or DLL files are missing, and beginners usually cannot tell from the error message what to do next, which is frustrating. If a Python interpreter is already installed on the computer, you can instead use pip, Python's package-management tool, to install Jupyter and then add third-party libraries as your work requires. This approach suits users who already have some experience.
Individual users can download the "Individual Edition" installer from the official Anaconda website. Once installation finishes, your computer has not only a Python environment and Spyder, an integrated development tool similar to PyCharm, but also nearly 200 packages for data-science work, including the three great tools of Python data analysis mentioned above. Anaconda also provides a package-management tool named conda, which can manage Python packages and create virtual environments for running Python programs.
As shown in the picture above, the download link on the official Anaconda website lets you choose an installer for your operating system. The graphical installer is recommended. After the download finishes, double-click the installer to start the installation; the default settings are usually fine. When installation is complete, macOS users can find an application named Anaconda-Navigator in Applications or Launchpad. Running it shows an interface like the one below, where we can choose the action we want to perform.
Windows users should follow the prompts and suggested options of the installation wizard; apart from the installation path, there is little to choose. After installation, Anaconda3 appears in the Start Menu.
Tip: You can choose Miniconda as an alternative to Anaconda. Miniconda installs only the Python interpreter environment and a few necessary tools; other third-party libraries are installed by the user when needed. Personally, I am not very fond of Anaconda, because it is aimed at beginners. Once we have a Python environment, we are free to install exactly the third-party libraries we need.
Non-beginner users who want to use conda to manage dependencies or create virtual environments for projects can run conda commands in the terminal or command prompt. Windows users can find Anaconda3 in the Start Menu and then click Anaconda Prompt or Anaconda PowerShell to open a command line that supports conda. Beginners who want to create virtual environments or manage third-party libraries (that is, dependencies) are advised to use Environments in Anaconda-Navigator, which manages virtual environments and dependencies visually.
- Version and help information.
  - Check version: `conda -V` or `conda --version`
  - Get help: `conda -h` or `conda --help`
  - Related information: `conda list`
- Virtual-environment related commands.
  - Show all virtual environments: `conda env list`
  - Create a virtual environment: `conda create --name venv`
  - Create a virtual environment with a specified Python version: `conda create --name venv python=3.7`
  - Create a virtual environment with a specified Python version and install specified dependencies: `conda create --name venv python=3.7 numpy pandas`
  - Create a virtual environment by cloning an existing one: `conda create --name venv2 --clone venv`
  - Export a virtual environment to a file so that it can be shared: `conda env export > environment.yml`
  - Create a virtual environment from a shared environment file: `conda env create -f environment.yml`
  - Activate a virtual environment: `conda activate venv`
  - Leave a virtual environment: `conda deactivate`
  - Delete a virtual environment: `conda remove --name venv --all`

  Note: In the commands above, `venv` and `venv2` are the names of the virtual-environment folders. You can replace them with any name you like, but it is strongly recommended to use English names without special characters.
- Package (third-party library or tool) management.
  - Show installed packages: `conda list`
  - Search for a specified package: `conda search matplotlib`
  - Install a specified package: `conda install matplotlib`
  - Update a specified package: `conda update matplotlib`
  - Remove a specified package: `conda remove matplotlib`

  Note: Searching, installing, and updating packages connect to the official source by default. If that is too slow, you can replace the default source with a domestic mirror; the Tsinghua University open-source mirror is recommended. The commands for switching to the mirror are `conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/` and `conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main`. To switch back to the default source, use `conda config --remove-key channels`.
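After activating an environment with the commands above, it can be useful to confirm from inside Python which interpreter is actually running. The following is a minimal sketch, not part of the original workflow; the function name `describe_environment` is hypothetical, and it relies on conda setting the `CONDA_DEFAULT_ENV` variable when an environment is activated.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the interpreter that is currently running."""
    return {
        # Full path of the Python executable in use.
        "executable": sys.executable,
        # Root folder of the active environment (for conda, the env folder).
        "prefix": sys.prefix,
        # Set by conda on activation; None when no conda env is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Version of the running interpreter, e.g. "3.11.4".
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `conda_env` does not show the name you activated (for example `venv`), the script is probably being run by a different interpreter than the one in that environment.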
If Anaconda is already installed, you can start Notebook or JupyterLab directly from Anaconda-Navigator, as mentioned above. According to the official statement, JupyterLab is the next generation of Notebook and provides a friendlier interface and more powerful features, so we recommend JupyterLab. Windows users can also open Anaconda Prompt or Anaconda PowerShell from the Start Menu. Because the default Anaconda virtual environment is already activated there, entering the command `jupyter lab` is enough to start JupyterLab. On macOS, after Anaconda is installed, the default Anaconda virtual environment is activated automatically every time you open the terminal, so you can likewise start JupyterLab with `jupyter lab`.
Users who have installed a Python environment but not Anaconda can use pip, Python's package-management tool, to install JupyterLab. After installation succeeds, run the command `jupyter lab` in the terminal or command prompt to start JupyterLab, as shown below.
Install JupyterLab: `pip install jupyterlab`
Install the three great tools of Python data analysis: `pip install numpy pandas matplotlib`
Start JupyterLab: `jupyter lab`
JupyterLab is a web-based application for interactive computing. It can be used for code development, document writing, code running, and result presentation. Simply put, you write and run code directly on a web page, and the results appear right below the code cell. If you need to write explanatory documents alongside the code, you can write them on the same page in Markdown and see the rendered result immediately. In addition, the original design goal of Notebook was to provide a working environment that supports many programming languages. At present it supports more than 40 languages, including Python, R, Julia, and Scala.
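Once the pip commands above have been run, one way to check that the packages landed in the current interpreter is to query their installed versions. This is a small sketch under the assumption of Python 3.8 or later (for `importlib.metadata`); the helper name `installed_versions` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed for this interpreter.
            result[name] = None
    return result


if __name__ == "__main__":
    wanted = ["jupyterlab", "numpy", "pandas", "matplotlib"]
    for name, ver in installed_versions(wanted).items():
        print(f"{name}: {ver or 'not installed'}")
```

A package reported as not installed can then be added with `pip install` for that same interpreter.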
First, we can create a Notebook for writing Python code, as shown below.
Next, we can write code, write documents, and run programs, as shown below.
If you use Python for engineering-style project development, PyCharm is definitely the best choice. It provides everything an integrated development environment should have; features such as smart hints, code completion, and automatic error correction are especially pleasant to work with. For data-science work, JupyterLab is not inferior to PyCharm and is even better at presenting data and charts. For this reason, JetBrains developed a new tool, DataSpell, to compete with JupyterLab; interested readers can explore it on their own. Below, we introduce some tips for using JupyterLab that we hope will improve your efficiency.
- Auto-completion. When writing code in JupyterLab, pressing the `Tab` key gives code hints and completion.
- Getting help. If you want to know the related information or usage of an object, such as a variable, class, or function, you can put `?` after the object and run the code. The matching information will be shown below the window to help us understand the object, as shown below.
- Searching names. If you only remember part of the name of a class or function, you can use the wildcard `*` together with `?` to search, as shown below.
- Running commands. In JupyterLab, you can run system commands by putting `!` before the system command.
- Magic commands. There are many interesting and useful magic commands in JupyterLab. For example, you can use `%timeit` to measure the running time of a statement and `%pwd` to see the current working directory. To list all magic commands, use `%lsmagic`. To learn how magic commands are used, use `%magic`, as shown below.

  Common magic commands are shown below.

  | Magic Command | Description |
  | --- | --- |
  | `%pwd` | Show the current working directory |
  | `%ls` | List the contents under the current or specified folder |
  | `%cat` | Show the contents of the specified file |
  | `%hist` | Show input history |
  | `%matplotlib inline` | Set matplotlib output charts to be embedded in the page |
  | `%config InlineBackend.figure_format='svg'` | Set charts to use SVG format |
  | `%run` | Run the specified program |
  | `%load` | Load the specified file into a cell |
  | `%quickref` | Show the quick reference of IPython |
  | `%timeit` | Run code many times and count execution time |
  | `%prun` | Use `cProfile.run` to run code and show profiler output |
  | `%who` / `%whos` | Show the variables in the namespace |
  | `%xdel` | Delete an object and clear all references to it |

- Shortcuts. Many operations in JupyterLab can be done through shortcuts, and using them improves efficiency. JupyterLab shortcuts are divided into command-mode shortcuts and edit-mode shortcuts. Edit mode is the mode in which you are entering code or writing documents. In edit mode, pressing `Esc` returns to command mode. In command mode, pressing `Enter` enters edit mode.

  Shortcuts in command mode:

  | Shortcut | Description |
  | --- | --- |
  | `Alt+Enter` | Run the current cell and insert a new cell below |
  | `Shift+Enter` | Run the current cell and select the cell below |
  | `Ctrl+Enter` | Run the current cell |
  | `j` / `k`, `Shift+j` / `Shift+k` | Select the cell below / above, continuously select cells below / above |
  | `a` / `b` | Insert a new cell above / below |
  | `c` / `x` | Copy a cell / cut a cell |
  | `v` / `Shift+v` | Paste a cell below / above |
  | `dd` / `z` | Delete a cell / restore a deleted cell |
  | `Shift+l` | Show or hide line numbers for the current / all cells |
  | `Space` / `Shift+Space` | Scroll the page down / up |

  Shortcuts in edit mode:

  | Shortcut | Description |
  | --- | --- |
  | `Shift+Tab` | Get hint information |
  | `Ctrl+]` / `Ctrl+[` | Increase / decrease indentation |
  | `Alt+Enter` | Run the current cell and insert a new cell below |
  | `Shift+Enter` | Run the current cell and select the cell below |
  | `Ctrl+Enter` | Run the current cell |
  | `Ctrl+Left` / `Ctrl+Right` | Move the cursor to the start / end of the line |
  | `Ctrl+Up` / `Ctrl+Down` | Move the cursor to the start / end of the code |
  | `Up` / `Down` | Move the cursor up / down one line or move to the previous / next cell |

  Note: On macOS, you can replace the `Alt` key with the `Option` key, and replace the `Ctrl` key with the `Command` key.
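The `?` help, wildcard search, and `%timeit` tips above are notebook conveniences and do not work in a plain `.py` script. As a rough illustration only, the sketch below shows standard-library counterparts: `inspect` for object information, `fnmatch` for wildcard name search, and `timeit` for timing. The helper names (`search_names`, `describe`, `best_time`) are hypothetical, and unlike `%timeit` the timing helper does not choose the number of loops automatically.

```python
import fnmatch
import inspect
import timeit


def search_names(namespace, pattern):
    """Return names in `namespace` matching a shell-style wildcard pattern."""
    return sorted(fnmatch.filter(dir(namespace), pattern))


def describe(obj):
    """Return the signature (when available) and first docstring line of obj."""
    try:
        signature = str(inspect.signature(obj))
    except (TypeError, ValueError):
        signature = ""  # some built-ins expose no signature
    doc = inspect.getdoc(obj) or ""
    first_line = doc.splitlines()[0] if doc else ""
    return signature, first_line


def best_time(statement, repeat=3, number=1000):
    """Best total time in seconds for `number` runs, over `repeat` trials."""
    return min(timeit.repeat(statement, repeat=repeat, number=number))


if __name__ == "__main__":
    print(search_names(str, "*strip"))   # similar in spirit to str.*strip?
    print(describe(search_names))        # similar in spirit to search_names?
    print(best_time("sum(range(100))"))  # similar in spirit to %timeit
```

Inside JupyterLab itself the built-in `?` and `%timeit` remain the more convenient choice; helpers like these are mainly useful when the same checks are needed in a script.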



