Sometimes, when we have a bigger project or a set of projects, we want to extract some code in order to make it reusable. When integrating it with a project, we run into a question: how?
Submodules and other Git Features
One of the common approaches is using submodules. They are meant to be used in cases where you mainly consume the content. Git records a submodule as a single entry pinned to a specific commit of the other repository, and that pointer is what gets updated.
Another is subtrees. They are intended for a similar purpose to submodules but are more write-oriented. Git sees the entire subdirectory and knows the history of the project, which makes committing changes easier.
I’m not going to go into depth on these solutions; you can read more about them by following the links listed at the bottom.
Both of them have certain problems:
- Changes in the source code of the library require recompilation of the project. It might be a better idea to have it compiled once. This is especially important when we’re using a scripting language and have some binary code for it. For example, in Python we might want to write some code in C for efficiency (compare with “Writing Python Modules in C”). In this case we would like to pull a precompiled package instead of its source code.
- You have one way of pulling normal dependencies and a separate one for submodules/subtrees.
- Last but not least (this is probably the most important problem), you are tempted to get to know the internals of the project. Since we assume it’s a big dependency with a dedicated team working on it, it would be wise to separate concerns and deliver a solution that requires familiarity only with the public API, not with the way it’s installed, built, compiled and so on. The same applies to the “write” side: bug fixes and new features are meant to be tracked with a dedicated approach, namely versioning. The library should be thought of as a separate project, not a part of something else.
GitLab’s Package Registry
An answer to these issues might be a dedicated package registry. Let’s imagine a library with simple math functions (the entire project is linked below):
    def add(a: int, b: int) -> int:
        return a + b


    def sub(a: int, b: int) -> int:
        return a - b


    def mul(a: int, b: int) -> int:
        return a * b


    def div(a: int, b: int) -> float:
        return a / b
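Before packaging, it can help to pin down what these functions are expected to do. The snippet below is only a sketch: it repeats the four definitions so that it runs on its own, and the `check_my_math` helper is a name made up for this illustration, not part of the example project.

```python
def add(a: int, b: int) -> int:
    return a + b


def sub(a: int, b: int) -> int:
    return a - b


def mul(a: int, b: int) -> int:
    return a * b


def div(a: int, b: int) -> float:
    return a / b


def check_my_math() -> None:
    """Hypothetical sanity check for the public API of the library."""
    assert add(1, 2) == 3
    assert sub(2, 3) == -1
    assert mul(3, 4) == 12
    # "/" is true division in Python 3, so the result is always a float.
    assert isinstance(div(4, 2), float)
    # Division by zero is not handled by the library; the caller sees the error.
    try:
        div(1, 0)
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("div(1, 0) should raise ZeroDivisionError")


if __name__ == "__main__":
    check_my_math()
    print("all checks passed")
```

Checks like these describe the public API only, which is all a consumer of the package should need to know.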
Now, in order to turn this into a functional library, we need a setup.py file:
    import setuptools

    setuptools.setup(
        name="mymath",
        version="0.0.1",
        author="gonczor",
        author_email="",
        description="A small example package",
        packages=setuptools.find_packages(),
        classifiers=[
            "Programming Language :: Python :: 3",
            "License :: OSI Approved :: MIT License",
            "Operating System :: OS Independent",
        ],
        python_requires='>=3.7',
    )
This defines the name of the package, the author, the version and the required Python version (so if a team using our project is stuck on an outdated Python version, we can still prepare a dedicated library just for them).
The project structure is as follows:
    $ tree -I '*pycache*'
    .
    ├── Makefile
    ├── README.md
    ├── my_math
    │   ├── __init__.py
    │   └── my_math.py
    └── setup.py

    2 directories, 5 files
Now we can run the make command, which invokes python3 setup.py sdist bdist_wheel, and see the built files:
    $ ls -l
    total 24
    -rw-r--r--  1 wgonczaronek  staff   42 11 paź 22:02 Makefile
    -rw-r--r--  1 wgonczaronek  staff   11 11 paź 21:41 README.md
    drwxr-xr-x  4 wgonczaronek  staff  128 11 paź 22:11 build
    drwxr-xr-x  4 wgonczaronek  staff  128 11 paź 22:11 dist
    drwxr-xr-x  5 wgonczaronek  staff  160 11 paź 21:41 my_math
    drwxr-xr-x  6 wgonczaronek  staff  192 11 paź 22:11 mymath.egg-info
    -rw-r--r--  1 wgonczaronek  staff  404 11 paź 21:41 setup.py
What we actually need is in the dist directory. It can be uploaded using:
    python3 -m twine upload --repository gitlab dist/*
I’ve also included this command in the Makefile for your convenience.
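The wheel that lands in dist carries useful information in its file name. For this project it is mymath-0.0.1-py3-none-any.whl, and the last three parts say which Python, ABI and platform the file is built for. The sketch below splits such a name into its parts; `WheelName`, `parse_wheel_filename` and `is_pure_python` are names made up for this illustration, and the parsing assumes a well-formed name following the standard wheel naming scheme.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        # name-version-python-abi-platform (no optional build tag)
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        # name-version-build-python-abi-platform
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


def is_pure_python(wheel: WheelName) -> bool:
    """A wheel tagged none/any contains no platform-specific binaries."""
    return wheel.abi_tag == "none" and wheel.platform_tag == "any"


if __name__ == "__main__":
    wheel = parse_wheel_filename("mymath-0.0.1-py3-none-any.whl")
    print(wheel)
    print("pure Python:", is_pure_python(wheel))
```

A library with C extensions would produce wheels with a concrete ABI and platform tag instead, which is exactly the precompiled artifact that submodules and subtrees cannot give us.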
To deploy, you’ll need a .pypirc file with some data about the project and a personal access token (read this to learn how to get it). Mine looks like this:
    [distutils]
    index-servers = gitlab

    [gitlab]
    repository = https://gitlab.com/api/v4/projects/21712491/packages/pypi
    username = __token__
    password = S0M3S3(RE7
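The .pypirc file uses the INI format, so it is easy to check which repository a given name such as gitlab resolves to before uploading anything. Below is a minimal sketch using the standard library’s configparser; the `list_repositories` helper and the placeholder token are assumptions made for this illustration, and twine does its own parsing internally.

```python
import configparser
from typing import Dict

# Sample content mirroring the file above; the token is a placeholder.
SAMPLE_PYPIRC = """\
[distutils]
index-servers = gitlab

[gitlab]
repository = https://gitlab.com/api/v4/projects/21712491/packages/pypi
username = __token__
password = YOUR_PERSONAL_ACCESS_TOKEN
"""


def list_repositories(pypirc_text: str) -> Dict[str, str]:
    """Map each declared index server name to its repository URL."""
    # Interpolation is disabled so that "%" in a token is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(pypirc_text)
    # The value may list several servers separated by whitespace or newlines.
    servers = parser.get("distutils", "index-servers", fallback="").split()
    return {
        name: parser.get(name, "repository", fallback="")
        for name in servers
        if parser.has_section(name)
    }


if __name__ == "__main__":
    for name, url in list_repositories(SAMPLE_PYPIRC).items():
        print(f"{name}: {url}")
```

The name printed here is the one passed to twine with --repository.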
OK, we can try it out.
The library sits there on GitLab ready to be downloaded.
Let’s Test it!
Let’s create an example project and add this library as a dependency:
    $ cat requirements.txt
    -i https://gitlab.com/api/v4/projects/21712491/packages/pypi/simple
    mymath==0.0.1
And the code we’re integrating with is:
    $ cat main.py
    from my_math import *

    print(add(1, 2))
    print(sub(2, 3))
    print(mul(3, 4))
    print(div(4, 5))
Note the underscore in the module name: what we’re importing has a different name from what we’re about to install. OK, let’s get down to business.
    $ pip3 install -r requirements.txt
    Looking in indexes: https://pypi.org/simple, https://gitlab.com/api/v4/projects/21712491/packages/pypi/simple
    Collecting mymath==0.0.1
      Using cached https://gitlab.com/api/v4/projects/21712491/packages/pypi/files/d5a8718624896c8dd65086adbd984f2cded5b38f2e5764b2a8f81196c7b92e9a/mymath-0.0.1-py3-none-any.whl (1.5 kB)
    Installing collected packages: mymath
    Successfully installed mymath-0.0.1
    $ python3 main.py
    3
    -1
    12
    0.8
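Because the installed name (mymath) and the imported name (my_math) differ, it is easy to ask for the wrong one when checking what is installed. The sketch below looks up the installed version by distribution name using importlib.metadata from the standard library. It assumes Python 3.8 or newer, which is slightly stricter than the `>=3.7` declared in setup.py, and `installed_version` is a helper name made up for this illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Look up the distribution name from setup.py, not the import name.
    print("mymath:", installed_version("mymath"))
    # The import name is most likely not a known distribution.
    print("my_math:", installed_version("my_math"))
```

The version reported for mymath should match the one pinned in requirements.txt.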
Today we’ve learnt how to use GitLab as a package registry instead of using Git in a way it’s not really intended for. Managing dependencies should be done with tools for managing dependencies, not version control systems.
We could extend this example with automated CI builds and uploads, but that’s enough material for a separate blog post.
Please also note that GitLab’s docs show how to use these libraries in private projects. I’m not doing that here because I wanted to show a simplified process and also let you download the example, so it needed to be open-sourced.