Datascience in Towards Data Science on Medium,

How to Make Your Data Science/ML Engineer Workflow More Effective

9/26/2024 Jesus Santana

Learn how you can use the VS Code interactive window to program more effectively

Anyone who programs needs an effective workflow. Many tasks are time-consuming, and you want to automate as much as possible to reduce manual work. In this article, I discuss how I recently updated my workflow as a data scientist by moving from Jupyter Notebooks to VS Code interactive windows.

This article discusses how you can make your data science / ML engineering workflow more effective with VS Code interactive windows. Image by ChatGPT.

To showcase the new workflow, I will use some simple code that highlights how it lets you work faster. Note, however, that I think the benefits increase as a project becomes more complex. Many problems with Jupyter Notebooks appear as a project grows and it becomes harder to keep an overview of your data, so I expect the workflow in this article to pay off even more in real-world projects. I will use pictures and videos throughout the article to show how you can work with the VS Code interactive window. My inspiration for this article was a YouTube video from Dave Ebbelaar about how he stopped using Jupyter Notebooks.

Table of Contents

· Motivation
· Using VS Code interactive windows
  · Setting up
  · Benefits
· Conclusion

Motivation

My motivation for this article is that, as a data scientist, I am always looking for ways to improve how I work. I think it's critical in my line of work to stay up to date, whether that means keeping up with the latest machine-learning models, using new IDEs like Cursor, or improving my workflow with interactive windows. In this article, I therefore share my recent change in workflow for data science projects, from working in Jupyter Notebooks to using interactive windows in VS Code. After the change, I became significantly more effective at writing code and experimenting, which is critical if you want a data science project to succeed.

Using VS Code interactive windows

Setting up

To start using interactive windows in VS Code, you must first enable them in the settings. Open the VS Code settings, search for "Jupyter interactive window", and enable the checkbox that begins "When pressing shift+enter, send selected code …". You can see an image of the setting below:

Activating the interactive window setting in VS Code. Image by author.

Now you are ready to use the interactive window. Open a Python file, select the code you want to run, and press Shift+Enter. Only the selected code is run, and the variables it defines are kept in memory for later runs. This allows you to work directly from a Python file, with most of the benefits you get from using Jupyter Notebooks.

This image shows my VS Code view after selecting all of the code on the left and pressing Shift+Enter. This opens the interactive window on the right, which runs all the code you selected. You only see the first line, but you can expand the cell, as shown in the next image. Image by the author.
Image showing how you can expand the run cell for clarity. Image by the author.

Now the variables a and b are defined. To understand how the addition function works, for example, I can run the result line by selecting line 5 and pressing Shift+Enter, and then select only the word result and press Shift+Enter again to print its value. In this instance the function is very easy to understand, but as your code becomes more complex, this is very useful for understanding and debugging functions.

My VS Code view after running line 5, defining the variable result, and then marking the variable result, and running it to see the value of the variable. Image by the author.
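
For readers who cannot see the screenshots, here is a minimal sketch of the kind of file used in the walkthrough above. The function name `add` and the values assigned to `a` and `b` are assumptions, since the original code only appears in the images, and the line numbers in this sketch need not match the "line 5" mentioned in the text.

```python
# Hypothetical stand-in for the file shown in the screenshots.

def add(x, y):
    """Return the sum of two numbers."""
    return x + y


# Select these two lines and press Shift+Enter to define the variables.
a = 2
b = 3

# Select this line and press Shift+Enter to run only this statement.
result = add(a, b)

# Select just the word `result` and press Shift+Enter to display its value.
result
```

Because each selection runs in the same session, `a`, `b`, and `result` stay available after they are defined, so you can keep inspecting or reusing them without rerunning the whole file.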

Another useful feature is the bottom panel, where the terminal lives (open it with Ctrl+J on Windows or Cmd+J on Mac). There, you can select the Jupyter tab to see the values of all the variables.

My VS Code view after opening the Jupyter tab in the bottom panel to see the values of all variables. This is useful for quickly checking every variable, which makes the code easier to debug and understand. Image by the author.

The video in the original article demonstrates the steps explained above.

Benefits

There are numerous benefits to using interactive windows. I have highlighted the main advantages in the list below:

  • Faster. I think working with interactive windows is faster than using Jupyter Notebooks. You don’t have to create cells; you can simply select the code and run it. To print a variable, for example, you only need to select it and press Shift+Enter rather than making a new cell, writing the variable name, and running the cell.
  • Cleaner code. I think writing in .py files generally pushes you to write cleaner code, for example, by always writing functions and modularizing the code as much as possible. .py files are also usually easier to read than Jupyter Notebooks.
  • Production ready. Writing in .py files means your code is already in a form that can be pushed to a production environment, which saves time compared with converting a notebook first.
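
The "Cleaner code" and "Production ready" points can be illustrated with a small sketch. The function names and sample data below are hypothetical and do not come from the original project; the idea is only that one .py file can be explored piece by piece in the interactive window and also run as an ordinary script.

```python
"""Hypothetical example: a .py file usable interactively and as a script."""
from statistics import mean


def load_scores():
    """Return sample data; a real project would read a file or database."""
    return [72, 88, 95, 61, 79]


def summarize(scores):
    """Compute simple summary statistics for a list of numbers."""
    return {"count": len(scores), "mean": mean(scores), "max": max(scores)}


def main():
    """Run the full pipeline and print the summary."""
    scores = load_scores()
    summary = summarize(scores)
    print(summary)


# Running the file as a script (python your_file.py) executes main().
# In the interactive window, you can first run the function definitions
# and then call load_scores() or summarize() on their own to inspect
# intermediate values.
if __name__ == "__main__":
    main()
```

Keeping the logic in small functions is a design choice rather than a requirement of the interactive window, but it makes each piece easy to select, run, and check on its own.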

Conclusion

In this article, I have discussed how you can use the VS Code interactive window to make your workflow as a data scientist or ML engineer more effective. I noticed a significant productivity increase after changing from Jupyter Notebooks to interactive windows, so I think this is a change you should try out.





from Datascience in Towards Data Science on Medium https://ift.tt/v7dNFeH