Shruti Turner.

Let's Get Virtual

Python, Data Scientist, Machine Learning Engineer, ML Engineering, Virtual Environment

Photo by Zak on Unsplash

The concept of "virtual" has been gaining momentum in recent years, with "virtual reality" becoming more commonplace in gaming and the like. But actually, the term "virtual" has been around in the programming world for a lot longer. Except here, I'm talking about "virtual environments" rather than "virtual reality".

What is a Virtual Environment?

I don't know if you've ever Googled this (maybe you have and that's how you got here?) but across the internet there are lots of "definitions" of virtual environments that are really descriptions of what a virtual environment does or what it's useful for, rather than what it actually is. So here's my attempt at a definition:

A virtual environment is a tool that allows a user to create and control an isolated computing setup for a project, which can be replicated by others.

It turns out, it was really quite difficult for me to come up with that definition without using the word "environment".

Why do you need a Virtual Environment?

The way I think of a virtual environment is more physically: it's like a room where you do the programming for your project and all the resources you need are in that room. You can add whatever libraries you need to the room that your code relies on (i.e. your dependencies can be added to the virtual environment) and you might have the same resources as in the room next to you where you are doing a different project. It might be that the resources are exactly the same or maybe they're different versions? It doesn't matter because you don't share resources across the rooms.

So, to put this back into code terms: you would typically create a new environment for each project, and you might install the same libraries into each one. However, a virtual environment is isolated, so if you've been working with pandas 1.2.4 on one project and then join a team using pandas 2.0.3 on a new project, you can use both on one machine without any conflicts.
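To make the room analogy a bit more concrete, here is a minimal sketch using Python's standard-library venv module. The folder names project_a and project_b are made up for illustration, and pip is left out of the environments (with_pip=False) so that nothing gets downloaded. Each environment ends up with its own site-packages folder, which is where that project's libraries would live.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment and return its site-packages folder."""
    # with_pip=False keeps the sketch quick and offline; a real project
    # would normally want pip inside the environment.
    venv.create(env_dir, with_pip=False)
    # The folder is Lib/site-packages on Windows and
    # lib/pythonX.Y/site-packages elsewhere, so search for it.
    return next(p for p in env_dir.rglob("site-packages") if p.is_dir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        room_a = create_env(Path(tmp) / "project_a")  # hypothetical project
        room_b = create_env(Path(tmp) / "project_b")  # hypothetical project
        print(room_a)
        print(room_b)
        print("Separate rooms:", room_a != room_b)
```

Because the two folders are separate, installing pandas 1.2.4 into one and pandas 2.0.3 into the other would not cause them to clash.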

requirements.txt

You may have joined a project or cloned a GitHub repo and been stumped by the number of libraries that you need. Perhaps you've tried to install each dependency one by one and something still isn't right. Or maybe it's the opposite: you have already installed everything you need, and you need to send those exact dependencies to someone else.

This is where a requirements.txt file comes in. It's just a simple text file with a dependency (i.e. a library name, optionally followed by a version number, such as pandas==2.0.3) on each line. You can write one manually by creating a txt file and writing what you want in it (you don't need to put a version number, but it's helpful to do so to make sure you're always using the same one) or you can get a list of them from the terminal once you've installed them using the following command:

pip freeze

You can then copy the output to a requirements.txt file or you can do this in one step:

pip freeze > requirements.txt
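If you're curious what that output looks like, here is a rough sketch that builds similar name==version lines from inside Python using the standard-library importlib.metadata module. The function name freeze_like is my own, and this is not a replacement for pip freeze, which handles cases this sketch ignores (editable installs, for example).

```python
from importlib.metadata import distributions


def freeze_like() -> list[str]:
    """Return 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip any distribution with missing metadata
            lines.add(f"{name}=={dist.version}")
    # Sort case-insensitively so the list is easy to scan
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(freeze_like()))
```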

If you've already got a requirements.txt file and you're wanting to install all of those requirements, you can use:

pip install -r requirements.txt

NB these commands assume you run them from the folder that contains requirements.txt, which is normally the root of your project.

This is a great way to make sure that everyone has all the required dependencies installed, without missing any, and, as long as the versions are pinned, that they are all the same version.
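As a small illustration of why version numbers matter, the sketch below reads the text of a requirements file and reports which libraries are pinned to an exact version. The file contents and the function name parse_requirements are made up for the example, and it only understands exact pins written with ==. The real format allows more than that, so treat this as a toy rather than a full parser.

```python
from typing import Optional

# Made-up file contents for illustration
EXAMPLE = """\
# comments and blank lines are ignored
pandas==2.0.3
numpy==1.24.4

requests
"""


def parse_requirements(text: str) -> dict[str, Optional[str]]:
    """Map each library name to its pinned version, or None if unpinned."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        pins[name.strip()] = version.strip() if sep else None
    return pins


if __name__ == "__main__":
    for name, version in parse_requirements(EXAMPLE).items():
        print(name, "->", version or "not pinned")
```

Here requests has no version, so two people installing from this file at different times could end up with different versions of it.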

How do I get a Virtual Environment?

Well, you get them by creating them. Working in Python, I find there are two main tools to create your virtual environment with: venv and conda. Both are set up a little differently (check out their documentation to see how), but all in all they very much do the same thing as far as most of us will experience. Some key differences to note though: venv is part of Python's standard library, whereas conda is a separate package and environment manager, installed through a distribution such as Anaconda or Miniconda, that can install Python itself. Conda can be used for other programming languages - not just Python. I have to say, the main thing that catches me out is that the virtual environments are stored in different locations by default: venv creates the environment in whatever folder you point it at (often inside the project), while conda keeps its environments in a central folder of its own. Handy to know when you're trying to find them!
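When you're not sure which environment you're in or where it lives, a quick check from inside Python can help. This is a minimal sketch with a function name of my own choosing. It relies on two things: inside an environment created by venv, sys.prefix differs from sys.base_prefix, and activating a conda environment sets the CONDA_PREFIX environment variable.

```python
import os
import sys


def describe_environment() -> dict[str, object]:
    """Report where the running Python's environment lives."""
    return {
        # Folder of the environment the interpreter is running from
        "prefix": sys.prefix,
        # venv points sys.prefix at the environment and leaves
        # sys.base_prefix pointing at the Python it was created from
        "in_venv": sys.prefix != sys.base_prefix,
        # Set when a conda environment is activated; None otherwise
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```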

Personally, I prefer to use conda as I find it easier to manage - maybe because that's what I got working first and then stuck with it. The commands are slightly different so mixing up which virtual environments you use might be a bit tricky if you're anything like me. There's no right or wrong, try them both and see which you prefer!

As a habit, I would recommend using a virtual environment (venv or conda) for any project you start up. Errors caused by unexpected dependency versions can cost a lot of time because they're tough to track down, especially when you think you have the right ones installed! Virtual environments also help to avoid any conflicts between projects.
