Systemd Cheat Sheet

Systemd is an init system for Linux used for system initialization and service management. It is useful for managing and monitoring services. In this cheat sheet you will find a collection of common commands for the command-line tools systemctl and journalctl.

How to Install Presto or Trino on a Cluster and Query Distributed Data on Apache Hive and HDFS

Presto is an open source distributed query engine built for Big Data, enabling high-performance SQL access to a large variety of data sources including HDFS, PostgreSQL, MySQL, Cassandra, MongoDB, Elasticsearch and Kafka, among others.

Classifying the Iris Data Set with PyTorch

In this short article we will look at how to use PyTorch with the Iris data set. We will create and train a neural network with Linear layers, a Softmax activation function and the Adam optimizer.

Google Analytics Analytics with Python

Google Analytics is a powerful analytics tool found on an astonishing number of websites. In this tutorial, we will look at how to access the Google Analytics API (v4) with Python and Pandas. We will also cover various ways to analyze your tracking data and create custom reports.

How to Manage Apache Airflow with Systemd on Debian or Ubuntu

Apache Airflow is a powerful workflow management system which you can use to automate and manage complex Extract Transform Load (ETL) pipelines. In this tutorial you will see how to integrate Airflow with the systemd system and service manager, which is available on most Linux systems, to help you monitor Airflow and restart it on failure.

How to Create Your Data Science Blog with Pelican and Jupyter Notebooks

Writing articles and tutorials is a great way to learn new things in depth while building a portfolio. In this tutorial, you will find the first steps you need to start your data science blog with Pelican and Jupyter Notebooks.

How to Execute Shell Commands with Python

Python is a wonderful language for scripting and automating workflows, and it comes packed with useful tools out of the box in the Python Standard Library. A common task, especially for a sysadmin, is to execute shell commands. But what would usually end up in a bash or batch file can also be done in Python. You’ll learn here how to do just that with the os and subprocess modules.
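
As a rough illustration of the idea, here is a minimal sketch that runs a command with the subprocess module and captures its output. The helper name run_command is hypothetical and not taken from the article, and the example launches the Python interpreter itself as the child process so that it behaves the same on Linux, macOS and Windows.

```python
import subprocess
import sys


def run_command(args):
    """Run a command without a shell and return (exit code, stdout, stderr).

    run_command is a hypothetical helper used only for this illustration.
    """
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output to str
        check=False,          # do not raise on a non-zero exit code
    )
    return completed.returncode, completed.stdout, completed.stderr


# Use the current interpreter as a portable stand-in for a shell command.
code, out, err = run_command(
    [sys.executable, "-c", "print('hello from a child process')"]
)
print(code, out.strip())
```

Passing the command as a list of arguments, rather than a single string with shell=True, avoids shell quoting problems when arguments come from user input.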

Installing and Running Jupyter on a Server

Jupyter Notebook is a powerful tool, but how can you use it in all its glory on a server? In this tutorial you will see how to set up Jupyter Notebook on a server hosted with Digital Ocean, AWS or most other hosting providers. Additionally, you will see how to use Jupyter notebooks over SSH tunneling or SSL with Let’s Encrypt.

Using Virtual Environments in Jupyter Notebook and Python

Are you working with Jupyter Notebook and Python? Do you also want to benefit from virtual environments? In this tutorial you will see how to do just that with Anaconda or Virtualenv/venv.

Analyzing Your File System and Folder Structures with Python

Say you have an external hard drive with layers upon layers of cryptically named folders and intricate mazes of directories. How can you make sense of this mess? The Python Standard Library offers various tools to deal with your file system, and the folderstats module can help you gain additional insights into it.
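
To give a flavor of what the standard library alone can do here, the following is a small sketch that walks a directory tree with os.walk and sums the size of the files directly inside each folder. The function name folder_sizes is hypothetical, and the sketch deliberately does not use the folderstats module.

```python
import os


def folder_sizes(root):
    """Map each directory under root to the total size in bytes
    of the files located directly inside it (subfolders not included).

    folder_sizes is a hypothetical helper used only for this illustration.
    """
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        total = 0
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so that broken links do not raise errors.
            if not os.path.islink(path):
                total += os.path.getsize(path)
        sizes[dirpath] = total
    return sizes
```

Calling folder_sizes on the mount point of the drive and sorting the result by value is one simple way to spot which folders hold most of the data.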