Jekyll
2023-04-25T20:06:02+00:00
http://justus.science/atom.xml
justus’ homepage - v2.0
Hello, my name is Justus Adam. I am a computer science student and this is my personal website.
How to create downloadable executables for your project with GitHub actions
2020-08-27T00:00:00+00:00
http://justus.science/blog/2020/08/27/creating-executables-with-actions
<p>With the release of <a href="https://docs.github.com/en/actions">GitHub Actions</a> we have
gained an incredibly powerful tool when it comes to testing and publishing
software hosted on GitHub. Actions lets you essentially perform arbitrary
computing tasks whenever certain events in your repository are triggered. While
platforms like Circle-CI or Travis have offered similar capabilities for a
while, Actions makes some of those tasks significantly more convenient due to how
deeply it is integrated with GitHub.</p>
<p>What I want to show you in this post is how you can provide downloadable
compiled binaries and tarballs for users of your projects using just a few lines
of Actions configuration. Having prebuilt binaries like this makes it
significantly easier for new users to check out and start using your project,
especially those less familiar with the language and tooling you use.</p>
<p>In this post I will be describing a project written in
<a href="https://haskell.org">Haskell</a> and built using
<a href="https://docs.haskellstack.org">stack</a>. However, except for the build commands,
none of the configuration is unique to this toolchain and you should be able to
easily adapt it to your own build method.</p>
<p>I’ve created a test repo to tinker with this stuff which you can find
<a href="https://github.com/JustusAdam/create-haskell-binaries-with-actions">here</a>.
There you can find the configuration I describe below, as well as see the
uploaded assets on the releases page.</p>
<p>If you just want to see the configuration I use and figure the rest out for
yourself, take a look at the section <a href="#configuration">Configuration</a>. <a href="#explanation">After
that</a> I’ll explain in more detail the individual steps taken in
the configuration and I’ll close with some <a href="#caveats">Caveats</a> that apply to
this method.</p>
<p>The next section is about my motivation for getting involved with this and
writing this post. If you are only here for the technical stuff you can safely
<a href="#configuration">skip it</a>.</p>
<h2 id="motivation">Motivation</h2>
<p>I recently found myself wanting to check out a project called
<a href="https://github.com/jgm/gitit">gitit</a> which is essentially a small server that
serves a wiki. All the wiki really is is a git repository of files with support
for several formats, such as markdown. The server lets you view, create and edit
the files in the browser and persists your changes by committing. It’s a nice,
simple piece of software, especially because the stored data can easily be
handled outside of the server, making it easy to migrate or interact with from
an editor.</p>
<p>What I found rather annoying is that, given the features, this piece of software
should be a simple, small binary with perhaps a few static assets. And it is.
However, the only method available for acquisition is building from source. Now, I
💚 Haskell, <strong>but</strong> it is notorious for its slow build times. To make matters
worse, <code class="language-plaintext highlighter-rouge">gitit</code>, in order to support multiple file formats, depends on the library
version of a program called <a href="https://pandoc.org"><code class="language-plaintext highlighter-rouge">pandoc</code></a>, which is a large,
multi-format document rendering and conversion tool. This means that to build
<code class="language-plaintext highlighter-rouge">gitit</code> the large <code class="language-plaintext highlighter-rouge">pandoc</code> software has to be built, pulling in and building all
of its many dependencies. If I remember correctly a total of around 120 Haskell
libraries were downloaded and built, which took over 30 minutes on a quad core
machine, just to make this simple, tiny 40 MB binary. Not to mention that someone
who doesn’t have Haskell installed would also have to download the build tools
<code class="language-plaintext highlighter-rouge">ghc</code>, <code class="language-plaintext highlighter-rouge">stack</code>, <code class="language-plaintext highlighter-rouge">cabal</code> etc.</p>
<p>Given how scary and involved this process is and how obscure Haskell as a
language is, I feel that this simple software would be significantly more
accessible to people if it provided prebuilt binaries that one would just have
to download and run. Takes but a few seconds and avoids having to explain the
build process. And since the project is already using GitHub Actions as CI,
adding another config to build assets on release seems like a pretty
straightforward addition.</p>
<p>Back in the day I used to do similar things on Travis, building and uploading
binaries. Back then it required you to create a custom API key for interacting
with the GitHub releases and storing that in Travis as well as properly talking
to the GitHub API to upload stuff. It may be a bit much to ask of authors to set
up Travis, the keys and so on. But since Actions needs none of that I feel
it’s low-effort enough for virtually anyone to just start using it. Hence this
call to … Actions 😜</p>
<p>If you’re interested, my PR with the changes to the gitit workflows can be found
<a href="https://github.com/jgm/gitit/pull/662">here</a>.</p>
<h2 id="configuration">Configuration</h2>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">name</span><span class="pi">:</span> <span class="s">Create Assets</span>
<span class="na">on</span><span class="pi">:</span>
<span class="na">release</span><span class="pi">:</span>
<span class="na">types</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">published</span><span class="pi">]</span>
<span class="na">jobs</span><span class="pi">:</span>
<span class="na">build</span><span class="pi">:</span>
<span class="na">runs-on</span><span class="pi">:</span> <span class="s">${{ matrix.os }}</span>
<span class="na">strategy</span><span class="pi">:</span>
<span class="na">matrix</span><span class="pi">:</span>
<span class="na">os</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">ubuntu-latest</span><span class="pi">,</span> <span class="nv">macos-latest</span><span class="pi">]</span>
<span class="na">steps</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v2</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Cache programs and libraries</span>
<span class="na">uses</span><span class="pi">:</span> <span class="s">actions/cache@v2</span>
<span class="na">env</span><span class="pi">:</span>
<span class="na">cache-name</span><span class="pi">:</span> <span class="s">cache-tools-and-libraries</span>
<span class="na">with</span><span class="pi">:</span>
<span class="na">path</span><span class="pi">:</span> <span class="s">~/.stack</span>
<span class="na">key</span><span class="pi">:</span> <span class="s">${{ runner.os }}-ca-${{ env.cache-name }}-${{ hashFiles('**/stack.yaml.lock') }}</span>
<span class="na">restore-keys</span><span class="pi">:</span> <span class="pi">|</span>
<span class="s">${{ runner.os }}-ca-${{ env.cache-name }}-</span>
<span class="s">${{ runner.os }}-ca-</span>
<span class="s">${{ runner.os }}-</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Build the project</span>
<span class="na">run</span><span class="pi">:</span> <span class="s">stack build</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Tar and strip the binary</span>
<span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
<span class="s">export PROGRAM=chbwa</span>
<span class="s">cp `stack exec -- which $PROGRAM` $PROGRAM</span>
<span class="s">tar -cavf program.tar.gz $PROGRAM</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Upload assets</span>
<span class="na">id</span><span class="pi">:</span> <span class="s">upload-release-asset</span>
<span class="na">uses</span><span class="pi">:</span> <span class="s">actions/upload-release-asset@v1</span>
<span class="na">env</span><span class="pi">:</span>
<span class="na">GITHUB_TOKEN</span><span class="pi">:</span> <span class="s">${{ secrets.GITHUB_TOKEN }}</span>
<span class="na">with</span><span class="pi">:</span>
<span class="na">upload_url</span><span class="pi">:</span> <span class="s">${{ github.event.release.upload_url }}</span>
<span class="na">asset_path</span><span class="pi">:</span> <span class="s">./program.tar.gz</span>
<span class="na">asset_name</span><span class="pi">:</span> <span class="s">program-${{ runner.os }}.tar.gz</span>
<span class="na">asset_content_type</span><span class="pi">:</span> <span class="s">application/tar.gz</span></code></pre></figure>
<h2 id="explanation">Explanation</h2>
<h3 id="header-and-build-config">Header and build config</h3>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">name</span><span class="pi">:</span> <span class="s">Create Assets</span>
<span class="na">on</span><span class="pi">:</span>
<span class="na">release</span><span class="pi">:</span>
<span class="na">types</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">published</span><span class="pi">]</span>
<span class="na">jobs</span><span class="pi">:</span>
<span class="na">build</span><span class="pi">:</span>
<span class="na">runs-on</span><span class="pi">:</span> <span class="s">${{ matrix.os }}</span>
<span class="na">strategy</span><span class="pi">:</span>
<span class="na">matrix</span><span class="pi">:</span>
<span class="na">os</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">ubuntu-latest</span><span class="pi">,</span> <span class="nv">macos-latest</span><span class="pi">]</span></code></pre></figure>
<p>Standard header. We name the workflow and configure the trigger using the <code class="language-plaintext highlighter-rouge">on</code>
clause. While you can choose several triggers, you can only upload assets for
releases. If another event were to trigger the workflow the upload url used
later in the configuration would be missing.</p>
<p>We configure the workflow to run on all platforms available on GitHub<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> using the
<a href="https://docs.github.com/en/actions/configuring-and-managing-workflows/configuring-a-workflow#configuring-a-build-matrix"><code class="language-plaintext highlighter-rouge">matrix</code></a>.
<strong>Important:</strong> GitHub also offers builds on Windows, but I haven’t yet
translated the <code class="language-plaintext highlighter-rouge">steps</code> to the shell used on Windows. I know how to include
platform-dependent steps in the config, but I’m not familiar enough with the
Windows shell to translate the commands. If you know what these commands would
look like on Windows, <a href="mailto:dev@justus.science">let me know</a>.</p>
<h3 id="checkout">Checkout</h3>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"> <span class="na">steps</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v2</span></code></pre></figure>
<p>The steps list the various commands we’d like to run. They can either call on
actions (<code class="language-plaintext highlighter-rouge">uses</code> key) or run shell commands (<code class="language-plaintext highlighter-rouge">run</code> key). This action here <a href="https://docs.github.com/en/actions/configuring-and-managing-workflows/configuring-a-workflow#using-the-checkout-action">checks
out the repo</a>.</p>
<h3 id="caching"><a href="https://docs.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows">Caching</a></h3>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"> <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Cache programs and libraries</span>
<span class="na">uses</span><span class="pi">:</span> <span class="s">actions/cache@v2</span>
<span class="na">env</span><span class="pi">:</span>
<span class="na">cache-name</span><span class="pi">:</span> <span class="s">cache-tools-and-libraries</span>
<span class="na">with</span><span class="pi">:</span>
<span class="na">path</span><span class="pi">:</span> <span class="s">~/.stack</span>
<span class="na">key</span><span class="pi">:</span> <span class="s">${{ runner.os }}-ca-${{ env.cache-name }}-${{ hashFiles('**/stack.yaml.lock') }}</span>
<span class="na">restore-keys</span><span class="pi">:</span> <span class="pi">|</span>
<span class="s">${{ runner.os }}-ca-${{ env.cache-name }}-</span>
<span class="s">${{ runner.os }}-ca-</span>
<span class="s">${{ runner.os }}-</span></code></pre></figure>
<p>I’ve left this in here, but I’m sad to say it doesn’t work for me. For some
reason the key lookup always fails. If anyone has an idea why, <a href="mailto:dev@justus.science">let me
know</a>.</p>
<h3 id="building">Building</h3>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"> <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Build the project</span>
<span class="na">run</span><span class="pi">:</span> <span class="s">stack build</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Tar and strip the binary</span>
<span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
<span class="s">export PROGRAM=chbwa</span>
<span class="s">cp `stack exec -- which $PROGRAM` $PROGRAM</span>
<span class="s">tar -cavf program.tar.gz $PROGRAM</span></code></pre></figure>
<p>Build the project the usual way.</p>
<p>If you have additional assets you need to distribute with your executable, I
recommend packaging them as an archive here. This reduces the download times.</p>
<h3 id="dealing-with-data-files-haskell-specific">Dealing with <code class="language-plaintext highlighter-rouge">data-files</code> (Haskell specific)</h3>
<p>Some Haskell libraries and executables rely on additional <code class="language-plaintext highlighter-rouge">data-files</code> and the
<code class="language-plaintext highlighter-rouge">Paths_package_name</code> module, <a href="https://cabal.readthedocs.io/en/3.4/cabal-package.html#accessing-data-files-from-package-code">managed by cabal</a>. <strong>If you get an
error that a certain file could not be opened when running the binary you
uploaded as an asset, this is likely the reason.</strong> Especially if the path is
something like <code class="language-plaintext highlighter-rouge">/home/runner/.stack/snapshots/&lt;a long hash
value&gt;/package-1.0.5/...</code>.</p>
<p>There are two components to fixing this error.</p>
<ol>
<li>Identify the files to include.</li>
<li>Overwrite the paths cabal has baked into the program.</li>
</ol>
<p>I describe how to do both of those shortly, but you may also like to look at
<a href="https://gist.github.com/JustusAdam/5904249d909e975edb612e5eea581ba1">this gist</a>
which is a Haskell script that does both and copies the files to a directory
(<code class="language-plaintext highlighter-rouge">vendor-data/package-name</code>).</p>
<h4 id="1-finding-assets">1. Finding assets</h4>
<p>If the missing assets are from your project itself you can skip this step and
move on.</p>
<p>When stack or cabal built your project it stored the asset files of
all libraries, and <code class="language-plaintext highlighter-rouge">ghc-pkg</code> knows where. To find the data directory of a library
(say <code class="language-plaintext highlighter-rouge">filestore</code>) use the command <code class="language-plaintext highlighter-rouge">ghc-pkg field filestore data-dir</code>. If you are
using stack you should prefix this command with <code class="language-plaintext highlighter-rouge">stack exec --</code> to make sure
you query the correct package database. This returns a string of the form
<code class="language-plaintext highlighter-rouge">"data-dir: /home/user/...\n"</code>, so you need to strip off the <code class="language-plaintext highlighter-rouge">"data-dir: "</code>
prefix, as well as the trailing <code class="language-plaintext highlighter-rouge">\n</code>.</p>
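<p>As a rough sketch of that last step, stripping the prefix and the trailing
newline could look like this in Haskell. The function name <code class="language-plaintext highlighter-rouge">parseDataDir</code> is my
own invention for this example and not part of any library.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">import Data.Char (isSpace)
import Data.List (dropWhileEnd, stripPrefix)

-- | Extract the directory from the output of "ghc-pkg field PKG data-dir".
-- Hypothetical helper: returns Nothing if the output does not start with
-- the expected "data-dir: " prefix.
parseDataDir :: String -&gt; Maybe FilePath
parseDataDir output =
  -- Drop the prefix first, then any trailing whitespace such as "\n".
  fmap (dropWhileEnd isSpace) (stripPrefix "data-dir: " output)</code></pre></figure>
<p>Returning a <code class="language-plaintext highlighter-rouge">Maybe</code> means unexpected output from <code class="language-plaintext highlighter-rouge">ghc-pkg</code> shows up as
<code class="language-plaintext highlighter-rouge">Nothing</code> rather than as a garbage path.</p>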
<h4 id="1-overwriting-cabal-paths">2. Overwriting cabal paths</h4>
<p>Libraries that use the <code class="language-plaintext highlighter-rouge">data-dir</code> functionality typically interact with it using
the generated <code class="language-plaintext highlighter-rouge">Paths_&lt;package name&gt;</code> module. You can overwrite any paths set in
this module using environment variables. For instance, if I wanted to set the
data directory for the <code class="language-plaintext highlighter-rouge">filestore</code> package to <code class="language-plaintext highlighter-rouge">"foo/bar"</code>, I would have to set the
variable <code class="language-plaintext highlighter-rouge">filestore_datadir=foo/bar</code>. This is documented <a href="https://cabal.readthedocs.io/en/3.4/cabal-package.html#accessing-data-files-from-package-code">here</a> in
the cabal manual.</p>
<p>My solution for doing this automatically is to not call the compiled
binary directly, but instead to provide a shell script that sets these variables before
calling the actual program, forwarding any arguments. You can see an example of
such a script
<a href="https://github.com/JustusAdam/gitit/blob/4e5e522c673411eba4c02ec33dcf5528cdcb4312/unix-proxy.sh">here</a>.</p>
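<p>The linked script is a shell script. Purely as an illustration, the same idea
could also be sketched as a tiny Haskell launcher. The binary name
<code class="language-plaintext highlighter-rouge">./gitit-bin</code> and the directory <code class="language-plaintext highlighter-rouge">vendor-data/filestore</code> are placeholders I made
up for this example, and it assumes the <code class="language-plaintext highlighter-rouge">process</code> package is available.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">import System.Environment (getArgs, setEnv)
import System.Exit (exitWith)
import System.Process (rawSystem)

main :: IO ()
main = do
  -- Assumption: the data files of filestore were copied to this directory.
  setEnv "filestore_datadir" "vendor-data/filestore"
  args &lt;- getArgs
  -- The child process inherits the modified environment. We forward all
  -- command line arguments to the real (hypothetical) binary.
  exitCode &lt;- rawSystem "./gitit-bin" args
  -- Pass the exit code of the real program on to the caller.
  exitWith exitCode</code></pre></figure>
<p>For most projects the shell script is the lighter option, since the launcher
above is itself another binary to build and ship.</p>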
<h3 id="upload">Upload</h3>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"> <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Upload assets</span>
<span class="na">id</span><span class="pi">:</span> <span class="s">upload-release-asset</span>
<span class="na">uses</span><span class="pi">:</span> <span class="s">actions/upload-release-asset@v1</span>
<span class="na">env</span><span class="pi">:</span>
<span class="na">GITHUB_TOKEN</span><span class="pi">:</span> <span class="s">${{ secrets.GITHUB_TOKEN }}</span>
<span class="na">with</span><span class="pi">:</span>
<span class="na">upload_url</span><span class="pi">:</span> <span class="s">${{ github.event.release.upload_url }}</span>
<span class="na">asset_path</span><span class="pi">:</span> <span class="s">./program.tar.gz</span>
<span class="na">asset_name</span><span class="pi">:</span> <span class="s">program-${{ runner.os }}.tar.gz</span>
<span class="na">asset_content_type</span><span class="pi">:</span> <span class="s">application/tar.gz</span></code></pre></figure>
<p>This is where the convenience of Actions really comes in. We can use the
predefined
<a href="https://github.com/actions/upload-release-asset"><code class="language-plaintext highlighter-rouge">upload-release-asset</code></a> action
which will deal with talking to the GitHub API for us. In addition we have
access to the <code class="language-plaintext highlighter-rouge">secrets.GITHUB_TOKEN</code> which is used to authenticate the upload
request. We do not need to request the token, it is already there in every
Actions run.</p>
<p>You can call this action several times in your workflow config if you need to
upload more than one asset per build, for instance if you compile several
binaries with more or fewer features.</p>
<p>Note that you should make sure the asset name is unique to the build
(by including information such as the OS) or it will clash with assets uploaded
from other runs of our matrix.</p>
<h2 id="caveats">Caveats</h2>
<p>By running on the GitHub infrastructure the platforms and OS versions for which
you can build these binaries are limited to whatever GitHub has to offer. However,
since these encompass the most common OSes found out there I think it is worth
the effort, especially for smaller projects that are unable to afford their own
build servers.</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Not quite true, we only run on their latest versions. You can run on more
versions, but I haven’t yet discovered how to get the version identifier during
the build to include in the asset name. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Justus Adam
Haskell on Travis CI
2015-09-04T00:00:00+00:00
http://justus.science/blog/2015/09/04/travis
<p>Building Haskell code on <a href="https://travis-ci.org">travis</a> is a bit complicated. Haskell happens to be one of the less well supported languages on <a href="https://travis-ci.org">travis</a>.</p>
<p><a href="https://travis-ci.org">Travis</a> ships with an older version of ghc (I think it’s 7.8) and cabal. But many projects either rely on newer versions of ghc, such as <a href="https://github.com/JustusAdam/elm-init">elm-init</a>, or want to be compatible with older ghc, such as <a href="https://github.com/JustusAdam/ja-base-extra">ja-base-extra</a>. This requires building with either a different version of ghc or multiple ghc/base library versions.</p>
<p>There’s a wonderful project by a GitHub user named @hvr. The project itself is called <a href="https://github.com/hvr/multi-ghc-travis">multi-ghc-travis</a> and provides an example <a href="https://github.com/hvr/multi-ghc-travis/blob/master/.travis.yml">.travis.yml</a> which configures a matrix of build environments for travis based on as many ghc and cabal versions as you require by manually downloading and installing the necessary ghc PPAs on the build VM.</p>
<p>This is great and all, but it has a downside: it requires root privileges in the VM to add PPAs and install packages, which prohibits use of the new and faster, container based travis architecture.</p>
<p>There has been some nice development in container customization lately and as a result there’s now a container compatible way of customizing your Haskell build environment on travis as this <a href="https://github.com/hvr/multi-ghc-travis#travisyml-for-container-based-infrastructure">section</a> of the <a href="https://github.com/hvr/multi-ghc-travis">README</a> shows.</p>
<p>The even nicer thing is that the repository also provides you with a Haskell script that automatically creates the new-style .travis.yml from your <strong>tested-with</strong> section in your <strong>.cabal</strong> file. Simply provide the script with a <strong>.cabal</strong> file and pipe the output into a file called .travis.yml and you’re pretty much set.</p>
<p>Now I found it rather difficult to find information on how the <strong>tested-with</strong> section in a <strong>.cabal</strong> file should look. The <a href="https://www.haskell.org/cabal/users-guide/developing-packages.html#package-properties">cabal documentation</a> simply states that it contains a <em>compiler list</em>.</p>
<p>Searching further I found that <em>compiler</em> is supposed to be the short name of a compiler, such as <em>ghc</em>, followed by version bounds for that compiler. Those version bounds are very similar to those of dependencies, resulting in a field which looks something like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tested-with:
GHC >= 7.0 && < 7.10,
LHC >= 0.6 && < 0.8
</code></pre></div></div>
<p>That’s all I’ve got for now. If you’ve got something to add, <a href="https://twitter.com/justusadam_">catch me on twitter</a>.</p>
Justus
Basic authenticated http requests and forms in Haskell
2015-07-30T00:00:00+00:00
http://justus.science/blog/2015/07/30/basic-auth-haskell
<h2 id="prelude">Prelude</h2>
<p>I recently needed to make a basic, authenticated HTTP request in Haskell. However, I found it difficult to find examples and documentation on the web, so I thought I’d share my findings with the world in the form of a blog post (and a <a href="https://gist.github.com/JustusAdam/9f1b3da2fadef823ff8b">gist</a>).</p>
<p>First of all, in order to use HTTP in Haskell you’ll want to use the right library. Fortunately for me I already knew a library which is sort of the standard library for Haskell when it comes to HTTP (client side): the aptly named <a href="https://hackage.haskell.org/package/HTTP">HTTP</a> library.</p>
<p>The <a href="https://hackage.haskell.org">Hackage</a> page for the <a href="https://hackage.haskell.org/package/HTTP">HTTP</a> library, while helpful, does not provide many examples of how to use it and, more importantly, does not provide a lot of guidance for beginners when it comes to choosing the right submodule for a particular task.</p>
<p>In the past I’ve mostly dealt with the very basic Network.HTTP module, which is reasonably easy to understand and totally sufficient for simple, unauthenticated <strong>GET</strong> requests. However, if you want to do more complicated things, for example auth or cookies, it is too low level. For those more complicated requests you’ll want to use the Network.Browser module, which seems a bit intimidating at first.</p>
<p>Network.Browser defines the BrowserAction Monad, which is basically IO combined with (Browser)State. All further actions are then defined on this BrowserAction.</p>
<h2 id="browser-basics">Browser basics</h2>
<p>The main action with the BrowserAction Monad, outside of BrowserAction, is the <code class="language-plaintext highlighter-rouge">browse</code> function. This function evaluates the BrowserAction and returns whatever contents it is holding, very much like IO. This is the usual entry and exit point for Browser related computation in the <a href="https://hackage.haskell.org/package/HTTP">HTTP</a> library.</p>
<p>The basic function for performing requests is the <code class="language-plaintext highlighter-rouge">request</code> function, which given a <code class="language-plaintext highlighter-rouge">Request</code> object performs the request, ending again inside a BrowserAction.</p>
<p><code class="language-plaintext highlighter-rouge">Request</code> objects can either be created by hand, with the utility functions from <code class="language-plaintext highlighter-rouge">Network.HTTP</code>, or with the utility functions in <code class="language-plaintext highlighter-rouge">Network.Browser</code> itself.</p>
<p>The quickest one of these to get started with is the <code class="language-plaintext highlighter-rouge">getRequest</code> function from <code class="language-plaintext highlighter-rouge">Network.HTTP</code>. It just takes a <code class="language-plaintext highlighter-rouge">String</code> and returns a <code class="language-plaintext highlighter-rouge">Request</code>.</p>
<p>Which means a basic request starts like this:</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="nn">Network.Browser</span> <span class="p">(</span><span class="n">browse</span><span class="p">,</span> <span class="n">request</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Network.HTTP</span> <span class="p">(</span><span class="n">getRequest</span><span class="p">)</span>

<span class="c1">-- Run a single GET request inside the BrowserAction monad.</span>
<span class="n">main</span> <span class="o">=</span>
  <span class="n">browse</span> <span class="o">$</span>
    <span class="n">request</span> <span class="o">$</span> <span class="n">getRequest</span> <span class="s">"http://github.com"</span></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">How it starts.</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
                This snippet’s source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/haskell/http/Start.hs">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<h2 id="handling-uris">Handling URIs</h2>
<p>If you want to be slightly more fancy and safe with your requests, instead of using the <code class="language-plaintext highlighter-rouge">getRequest</code> function you can first parse your URI using the <code class="language-plaintext highlighter-rouge">Network.URI</code> module from the <a href="https://hackage.haskell.org/package/network-uri">network-uri</a> package (which is what <code class="language-plaintext highlighter-rouge">getRequest</code> does internally). This module provides several ways of parsing URIs that either return Maybes or Eithers or throw exceptions, if you’re okay with throwing exceptions. In the end they all give you <code class="language-plaintext highlighter-rouge">URI</code> type objects.</p>
<p>Getting those URIs into a <code class="language-plaintext highlighter-rouge">Request</code> can be done with, for example, <code class="language-plaintext highlighter-rouge">defaultGETRequest</code>, which takes a <code class="language-plaintext highlighter-rouge">URI</code> and returns a Request that the Browser can carry out.</p>
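<p>Here is a minimal sketch of how parsing and requesting might fit together. It uses <code class="language-plaintext highlighter-rouge">parseURI</code>, which returns a <code class="language-plaintext highlighter-rouge">Maybe</code>, and <code class="language-plaintext highlighter-rouge">defaultGETRequest</code> from <code class="language-plaintext highlighter-rouge">Network.Browser</code>. The URL <code class="language-plaintext highlighter-rouge">http://example.com</code> is only a placeholder.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell">import Network.Browser (browse, defaultGETRequest, request)
import Network.HTTP (rspBody)
import Network.URI (parseURI)

main :: IO ()
main =
  -- parseURI yields Nothing for malformed input instead of throwing.
  case parseURI "http://example.com" of
    Nothing -&gt; putStrLn "Could not parse the URI"
    Just uri -&gt; do
      -- request returns the final URI together with the response.
      (_, response) &lt;- browse $ request $ defaultGETRequest uri
      putStrLn $ rspBody response</code></pre></figure>
<p>Handling the <code class="language-plaintext highlighter-rouge">Nothing</code> case explicitly is what makes this variant a bit safer than passing a raw <code class="language-plaintext highlighter-rouge">String</code> around.</p>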
<h2 id="requests-with-forms">Requests with forms</h2>
<p>Sending requests with an actual (application/x-www-form-urlencoded) payload is, as I discovered with joy, similarly easy with the Browser module. It provides a function called <code class="language-plaintext highlighter-rouge">formToRequest</code>, which takes a <code class="language-plaintext highlighter-rouge">Form</code> and returns a <code class="language-plaintext highlighter-rouge">Request</code>, and a data constructor for <code class="language-plaintext highlighter-rouge">Form</code>, which takes a <code class="language-plaintext highlighter-rouge">RequestMethod</code> (for which the constructors are simply <code class="language-plaintext highlighter-rouge">POST</code>, <code class="language-plaintext highlighter-rouge">GET</code> etc.), a <code class="language-plaintext highlighter-rouge">URI</code> and a list of 2-tuples of Strings for the payload values.</p>
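<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="kr">import</span> <span class="nn">Data.Maybe</span> <span class="p">(</span><span class="n">fromJust</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Network.Browser</span> <span class="p">(</span><span class="kt">Form</span> <span class="p">(</span><span class="o">..</span><span class="p">),</span> <span class="n">browse</span><span class="p">,</span> <span class="n">formToRequest</span><span class="p">,</span> <span class="n">request</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Network.HTTP</span> <span class="p">(</span><span class="kt">RequestMethod</span> <span class="p">(</span><span class="kt">POST</span><span class="p">))</span>
<span class="kr">import</span> <span class="nn">Network.URI</span> <span class="p">(</span><span class="n">parseURI</span><span class="p">)</span>

<span class="c1">-- Build a Request from the Form and let the Browser carry it out.</span>
<span class="n">main</span> <span class="o">=</span>
  <span class="n">browse</span> <span class="o">$</span>
    <span class="n">request</span> <span class="o">$</span>
      <span class="n">formToRequest</span> <span class="o">$</span>
        <span class="kt">Form</span>
          <span class="kt">POST</span>
          <span class="p">(</span><span class="n">fromJust</span> <span class="o">$</span> <span class="n">parseURI</span> <span class="s">"http://github.com/register/new"</span><span class="p">)</span>
          <span class="p">[</span> <span class="p">(</span><span class="s">"name"</span><span class="p">,</span> <span class="s">"Guido"</span><span class="p">)</span>
          <span class="p">,</span> <span class="p">(</span><span class="s">"occupation"</span><span class="p">,</span> <span class="s">"Plumber"</span><span class="p">)</span>
          <span class="p">,</span> <span class="p">(</span><span class="s">"email"</span><span class="p">,</span> <span class="s">"guido@python.org"</span><span class="p">)</span>
          <span class="p">]</span></code></pre></figure>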
<div class="snippet-meta light-font">
<div>
<small class="right">Requests with form data</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippets source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/haskell/http/Form.hs">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<h2 id="requests-with-authentication">Requests with Authentication</h2>
<p>Even though it took me a relatively long time to figure out how to do authentication with this library, it is actually quite simple.
The BrowserAction will handle most of the hard authentication work for you, provided the website you’re visiting communicates in the canonical way, using HTTP status codes.</p>
<p>When performing the POST or GET request with the <code class="language-plaintext highlighter-rouge">request</code> function, the BrowserAction will check the returned status code and act depending on it. The two codes that are of interest are 200 (OK) and 401 (Unauthorized).
In case of 200 the server has delivered the resource you requested and BrowserAction will simply return the response. In case of 401 the server asks you to authenticate.</p>
<p>If the server requests authentication, BrowserAction will attempt to satisfy it by fetching a username and password combination from a generator function and retrying the original request with those credentials.
The type signature for the generator function is <code class="language-plaintext highlighter-rouge">URI -> String -> IO (Maybe (String, String))</code> and by default it is equivalent to <code class="language-plaintext highlighter-rouge">\_ _ -> return Nothing</code>, i.e. there will be no authentication for any URI.
You can, however, set your own generator function with the <code class="language-plaintext highlighter-rouge">setAuthorityGen</code> function in the BrowserAction.</p>
<h4 id="a-few-things-to-note-about-the-generator-function">A few things to note about the generator function</h4>
<ul>
<li>The two arguments provided are the full URI of the requested resource and the so-called realm, which is a message the server sends to unauthorized clients.<br />
Thus your generator function can return different username and password combinations depending on the URI.</li>
<li>The generator function returns a <code class="language-plaintext highlighter-rouge">Maybe</code>. If you don’t recognize the URI it is trying to authenticate for, you can decline to provide credentials by simply returning <code class="language-plaintext highlighter-rouge">Nothing</code>.</li>
<li>The return type of the generator function is an IO computation, which means you may read the credentials from a file or request them from a different server.</li>
</ul>
<h4 id="how-does-this-look-in-practice">How does this look in practice?</h4>
<p>A very simple authenticated request could look something like this.</p>
<script src="https://gist.github.com/9f1b3da2fadef823ff8b.js"> </script>
<p>Another version of the generator function, for multiple URIs, would be to hardcode an association list of them and look the credentials up in that list, like in the example below.</p>
<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><table class="rouge-table"><tbody><tr><td class="code"><pre><span class="kr">import</span> <span class="nn">Control.Arrow</span> <span class="p">(</span><span class="nf">first</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Data.Maybe</span> <span class="p">(</span><span class="nf">fromJust</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Network.Browser</span>
<span class="kr">import</span> <span class="nn">Network.HTTP</span> <span class="p">(</span><span class="nf">getRequest</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Network.URI</span> <span class="p">(</span><span class="nf">parseURI</span><span class="p">)</span>

<span class="c1">-- `first` turns the key of each pair (the string) into a URI</span>
<span class="n">authList</span> <span class="o">=</span> <span class="n">map</span> <span class="p">(</span><span class="n">first</span> <span class="p">(</span><span class="n">fromJust</span> <span class="o">.</span> <span class="n">parseURI</span><span class="p">))</span>
  <span class="p">[</span> <span class="p">(</span><span class="s">"http://google.com"</span>  <span class="p">,</span> <span class="p">(</span><span class="s">"walter"</span><span class="p">,</span> <span class="s">"12.24.1975"</span><span class="p">))</span>
  <span class="p">,</span> <span class="p">(</span><span class="s">"http://facebook.com"</span><span class="p">,</span> <span class="p">(</span><span class="s">"MarkZuckerberg"</span><span class="p">,</span> <span class="s">"IamTheFounder"</span><span class="p">))</span>
  <span class="p">,</span> <span class="p">(</span><span class="s">"http://reddit.com"</span>  <span class="p">,</span> <span class="p">(</span><span class="s">"StephenHawking"</span><span class="p">,</span> <span class="s">"BlackHole"</span><span class="p">))</span>
  <span class="p">]</span>

<span class="n">main</span> <span class="o">=</span>
  <span class="n">browse</span> <span class="o">$</span> <span class="kr">do</span>
    <span class="n">setAuthorityGen</span> <span class="p">(</span><span class="n">const</span> <span class="o">.</span> <span class="n">return</span> <span class="o">.</span> <span class="n">flip</span> <span class="n">lookup</span> <span class="n">authList</span><span class="p">)</span>
    <span class="n">request</span> <span class="o">$</span> <span class="n">getRequest</span> <span class="s">"http://github.com"</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Using an association list for providing Passwords</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippets source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/haskell/http/ListAuth.hs">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<h2 id="sending-literal-json">Sending literal JSON</h2>
<p>I’ve also recently been in a situation where I wanted to create some very simple JSON, for which the proper way (via the <a href="https://hackage.haskell.org/package/aeson">aeson</a> library) would have felt like overkill.</p>
<p>I wanted to create the output with just string concatenation and then send it to the client. Unfortunately the <a href="https://hackage.haskell.org/package/warp">warp</a> library, which is sort of the standard web server library for Haskell, uses ByteStrings for output. Now if you do things the canonical way, the <a href="https://hackage.haskell.org/package/aeson">aeson</a> way, it’ll create a Unicode encoded ByteString for you and there’s nothing to worry about.</p>
<p>The JSON standard requires the text to be Unicode encoded. However, when using string literals and concatenation it becomes quite obvious that ByteString is inherently not meant for Unicode. So in order to get a Unicode encoded string you’ll have to create <code class="language-plaintext highlighter-rouge">Text</code> rather than a <code class="language-plaintext highlighter-rouge">String</code> and then explicitly encode it as a UTF-8 ByteString. You can do this by importing the <code class="language-plaintext highlighter-rouge">Data.Text.Encoding</code> module (or the <code class="language-plaintext highlighter-rouge">Data.Text.Lazy.Encoding</code> module if you, like me, are dealing with <a href="https://hackage.haskell.org/package/warp">warp</a> and need a lazy ByteString for output) and then simply using the <code class="language-plaintext highlighter-rouge">encodeUtf8</code> function on your <code class="language-plaintext highlighter-rouge">Text</code>.</p>
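<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><table class="rouge-table"><tbody><tr><td class="code"><pre><span class="cp">{-# LANGUAGE OverloadedStrings #-}</span>
<span class="kr">import</span> <span class="nn">Data.Text.Lazy.Encoding</span> <span class="p">(</span><span class="nf">encodeUtf8</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Network.HTTP.Types</span> <span class="p">(</span><span class="nf">status200</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Network.Wai</span> <span class="p">(</span><span class="nf">responseLBS</span><span class="p">)</span>
<span class="kr">import</span> <span class="nn">Network.Wai.Handler.Warp</span> <span class="p">(</span><span class="nf">run</span><span class="p">)</span>

<span class="n">warpApplication</span> <span class="n">_request</span> <span class="n">respond</span> <span class="o">=</span>
  <span class="n">respond</span> <span class="o">$</span>
    <span class="n">responseLBS</span> <span class="n">status200</span> <span class="p">[(</span><span class="s">"Content-Type"</span><span class="p">,</span> <span class="s">"application/json"</span><span class="p">)]</span> <span class="o">$</span>
      <span class="n">encodeUtf8</span> <span class="s">"{</span><span class="se">\"</span><span class="s">text</span><span class="se">\"</span><span class="s">: </span><span class="se">\"</span><span class="s">My json response</span><span class="se">\"</span><span class="s">}"</span>

<span class="c1">-- the port number is an arbitrary choice</span>
<span class="n">main</span> <span class="o">=</span> <span class="n">run</span> <span class="mi">8080</span> <span class="n">warpApplication</span>
</pre></td></tr></tbody></table></code></pre></figure>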
<div class="snippet-meta light-font">
<div>
<small class="right">Making utf-8 encoded text</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippets source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/haskell/http/Utf8.hs">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>Open source icons are awesome2015-07-18T00:00:00+00:002015-07-18T00:00:00+00:00http://justus.science/blog/2015/07/18/open-icons<p>My most recent project is a rewrite of the <a href="https://github.com/elm-lang/elm-reactor">elm-reactor</a> in the <a href="http://elm-lang.org">Elm</a> language itself, with a better style (somewhat in the vein of GitHub’s style, <a href="https://github.com/elm-lang/projects#improve-elm-reactor-navigation-page">see the mockup</a>).</p>
<p>To make it pretty I wanted to add icons for files and folders, since navigating files without icons is just boring. I was pessimistic about my chances of finding good ones, because they’d have to have an open license that allows me to use them freely.</p>
<p>In the past I’ve had plenty of situations where I’d have liked to have some pretty icons for a (web) project, though I never thought of just googling for open source icons. This time I did and my first search immediately yielded <a href="https://github.com/iconic/open-iconic">this awesome project</a> which is a collection of very pretty, MIT licensed icons that you can use and alter however you want.</p>
<p>I used to think that only the developer community was creating awesome projects like Linux, Haskell, pandoc, jQuery and such for free and with publicly available sources. Turns out there are artists doing a very similar thing with pictures and icons. I am ashamed I didn’t think it possible.</p>
<p>This is what the project looks like right now:</p>
<p><img src="/images/elm-reactor-index-1280.png" alt="Elm Reactor Screenshot" /></p>
<p>So I’ve found way more and way prettier icons than I’d hoped for, with very little effort. I’d strongly encourage you, if you have the need for some icons or pictures as well, to give a search for “open source icons” a try, you’ll probably find something awesome.</p>
<p>PS: If the icons are open source, <a href="https://git-scm.com/book/en/v2/Git-Tools-Submodules">git submodules</a> are a great way of adding them to your project as a dependency that does not artificially inflate your git repository.</p>
<p>PPS: If you’re looking for some fonts I’d also recommend <a href="https://github.com/FontFaceKit/open-sans">Open Sans</a> as a sans-serif font and <a href="https://github.com/adobe-fonts/source-code-pro">Source Code Pro</a> as a pretty monospace font.</p>JustusMy most recent project is a rewrite of the elm-reactor in the Elm language itself, with a better style (somewhat in the vein of GitHub’s style see the mockup).Trouble with CNAME’s2015-07-12T00:00:00+00:002015-07-12T00:00:00+00:00http://justus.science/blog/2015/07/12/cname-shenanigans<p>For a month or so I’ve now had the ‘justus.science’ domain, on which, amongst other things, this homepage can be found. Around the same time I moved hosts, from <a href="https://uberspace.de">uberspace</a> to <a href="https://pages.github.com">GitHub Pages</a>, and infrastructure, from <a href="https://drupal.org">Drupal</a> to the (much) more lightweight <a href="http://jekyllrb.com">Jekyll</a>.</p>
<p>By default GitHub pages domains look like this: username.github.io. In my case that would be justusadam.github.io, which is neither short nor particularly pretty. You can add custom domains to a GitHub page though, which I intended to make use of.</p>
<p>In order to add a custom domain to a GitHub page you have to configure the appropriate DNS entry with the provider. I, as per usual, read the manual fast and loosely and as a result thought the DNS entry <strong>had</strong> to be a CNAME, a belief which was reinforced by the fact that the file that has to be put into the repository and tells GitHub what the custom domain for the page is called, bears the name ‘CNAME’.</p>
<p>So I registered a CNAME entry with united domains. The domain resolved as expected and the address bar in the browser was showing the right thing (the custom domain), everything seemed right with the world.</p>
<p>Then I tried to set up email with the new domain, as I had done with the other domain before by setting an MX entry pointing to the uberspace mailserver. It didn’t work. I sent a few test mails, none of them arrived. The SMTP server tried three times and eventually gave up. Every mail was getting dropped. I changed the entry and redirected to gmail instead, that seemed to work<sup id="fnref:gmail" role="doc-noteref"><a href="#fn:gmail" class="footnote" rel="footnote">1</a></sup>.</p>
<p>I could not figure out what was causing the issue. I decided it probably was the mailserver that for some reason could not deal with the new domain ending, a claim that, as I realize now, was pretty stupid. I decided to leave it for now, since in my mind the error did not appear to be on my side and I did not depend on the mail addresses.</p>
<p>That could have been the end of it, if not for a couple of days ago, when I decided it was time to pick the issue back up and get it sorted out. I sent an email to the excellent uberspace support and told them the server was not receiving mails to one of my domains and what I thought the reason was. They came back to me with a simple fact that made the problem obvious.</p>
<p>Turns out setting a CNAME entry for a domain pretty much overrides everything. A CNAME declares the name to be an alias for another domain, so <em>all</em> lookups for that (sub)domain are redirected to the target, including the MX lookups used for email. When a CNAME has been set, the DNS server will ignore any other entry for that same name or flat out reject any additional entries for it. Thus by setting one for ‘justus.science’ I inadvertently directed all emails going to <em>any-address@justus.science</em> to GitHub Pages, which does not provide an email service and as a result dropped them.</p>
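<p>To make the effect a bit more tangible, here is a toy model of a lookup in Python. It is only a sketch of the behaviour described above, not how a real DNS server or resolver is implemented, and the mail host and the IP address are made-up placeholders.</p>

```python
# Toy zone data: name -> {record type -> value}.
# A real DNS server would not even accept an MX next to a CNAME for the
# same name; it is kept here only to show that it never gets consulted.
ZONE_WITH_CNAME = {
    "justus.science": {
        "CNAME": "justusadam.github.io",
        "MX": "mail.example.org",  # placeholder mail host
    },
    "justusadam.github.io": {"A": "192.0.2.10"},  # placeholder address
}

# The fixed setup: an A record instead of the CNAME, the MX stays visible.
ZONE_WITH_A = {
    "justus.science": {"A": "192.0.2.10", "MX": "mail.example.org"},
}


def resolve(name, record_type, zone=ZONE_WITH_CNAME, max_hops=8):
    """Follow CNAME aliases first, then look up the requested record type."""
    for _ in range(max_hops):
        records = zone.get(name, {})
        if "CNAME" in records and record_type != "CNAME":
            # The alias wins: continue the lookup at the target name.
            name = records["CNAME"]
            continue
        return records.get(record_type)
    return None


print(resolve("justus.science", "A"))   # 192.0.2.10
print(resolve("justus.science", "MX"))  # None, the alias target has no MX
print(resolve("justus.science", "MX", zone=ZONE_WITH_A))  # mail.example.org
```

<p>The second lookup is the one that bit me: the question “where does mail for this domain go?” is answered by the alias target, which has no MX entry.</p>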
<p>Fixing it was rather simple. A second look at the GitHub Pages help pages revealed that any type of DNS entry could be used for a page (and why shouldn’t it). So I changed the CNAME entry to an A entry<sup id="fnref:aentry" role="doc-noteref"><a href="#fn:aentry" class="footnote" rel="footnote">2</a></sup> pointing to <a href="https://help.github.com/articles/tips-for-configuring-an-a-record-with-your-dns-provider/">GitHub’s IP</a>, waited for the TTL to expire, sent test mails again and suddenly they all arrived as expected, where expected. The MX entry was in effect.</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:gmail" role="doc-endnote">
<p>Knowing what I know now it might be that I am not remembering the timeline correctly and it might be I had the email redirect <strong>before</strong> I set the CNAME entry for the site and changed it to an MX later, that would make more sense at least. <a href="#fnref:gmail" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:aentry" role="doc-endnote">
<p>If you’d like to know more about setting A records with GitHub, <a href="https://help.github.com/articles/tips-for-configuring-an-a-record-with-your-dns-provider/">here</a> is the official help page. <a href="#fnref:aentry" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>JustusFor a month or so I’ve now had the ‘justus.science’ domain, on which, amongst other things, this homepage can be found. Sort of around the same time I moved host, from uberspace to GitHub Pages, and infrastructure, from Drupal to the (much) more lightweight Jekyll.The Schedule Planner web-app is finally live2015-06-11T00:00:00+00:002015-06-11T00:00:00+00:00http://justus.science/blog/2015/06/11/schedule-planner<h1 id="the-news-tldr">The News (TLDR)</h1>
<p>Today I worked out the last (major, known to me) kink of my schedule-planner web-app.</p>
<p>It is <a href="http://justus.science/schedule-planner-web/">live right now</a> (and should work as expected).</p>
<p>Feel free to try it and find bugs. I challenge you :D .</p>
<p>The Website is written in <a href="http://elm-lang.org">Elm</a>, the (Elm) source can be found on <a href="https://github.com/JustusAdam/schedule-planner-web">GitHub</a>. The page itself is deployed on <a href="https://pages.github.com">GitHub Pages</a>.</p>
<p>The actual work is done by the backend server which is pure <a href="https://haskell.org">Haskell</a>, deployed on <a href="http://uberspace.de">Uberspace</a>. And the source is available on <a href="https://github.com/JustusAdam/schedule-planner">GitHub</a> as well.</p>
<h2 id="what-it-does">What it does</h2>
<p>schedule-planner is an application that, given a list of lessons and a list of rules, calculates the perfect layout for those lessons that adheres to the given rules as much as possible.</p>
<p>Currently you can define rules that select either days, timeslots or specific cells (specific timeslot on a specific day) and then assign a weight to it. The higher the weight, the more reluctant the algorithm will be to allocate lessons in that day/slot/cell.</p>
<p>The application is usable from the command line as a simple tool that takes a json file as input and either prints the result schedules to the command line or emits new json containing the resulting schedules.</p>
<p>The website is a convenient way of inputting the lessons and rules. Also it’ll save your input in the browser, so nothing gets lost when you close the browser or reload.</p>
<p>There is a lot I’d like to improve about the interface and more features I’d like to add to the algorithm, but that’ll have to wait.</p>
<h2 id="the-last-kink">The last kink</h2>
<p>… I faced was a thing called CORS, or cross-origin resource sharing, or rather the lack of it by default. During the last year or so I have come in contact with many free services on the internet providing things like <a href="https://c9.io">online IDEs</a>, <a href="http://hastebin.com">code sharing</a>, <a href="https://github.com">source code hosting</a> and <a href="https://pages.github.com">project website hosting</a>, to name a few.
GitHub is a service I am using constantly and recently I also started using GitHub Pages a lot more, so I wanted to take advantage of the easy deployment via git on <a href="https://pages.github.com">GitHub Pages</a> for this project as well. But since GitHub Pages can only host static websites (and markdown), it could not run the Haskell server I am using as backend. So I put the webpage on GitHub Pages and deployed a static binary of the backend server, which does only the calculation, on my VM on <a href="http://uberspace.de">Uberspace</a>. This has the added benefit that GitHub Pages now takes care of serving the HTML and the relatively large JavaScript files, while the backend server only deals with a small amount of JSON for in- and output. This takes some load away from Uberspace.</p>
<p>Because the website lives on GitHub and the calculation server on Uberspace, their domains are different. As a result most modern browsers will reject communication between the two by default, unless the server sends a set of so-called CORS headers, and it took me a while to do that properly.</p>
<p>At first I tried to do it by hand and just have the server emit the necessary headers all the time. That unfortunately did not seem to work, so I tried adding a Haskell library to deal with CORS on the server side, which did not work either. For some reason the server middleware rejected the request outright; I can only assume it decided the requests the browser was sending did not conform to the expected CORS protocol.</p>
<p>Even though adding the library did not fix my problem, it taught me more about which CORS headers exist and how they work. So when finally, in frustration, I turned back to the original idea of static headers, I could figure out which three headers were actually required (one of which I had forgotten before). Adding them to the response seems to do the trick now.</p>
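<p>For illustration, this is roughly what the static header approach looks like, sketched in Python with only the standard library instead of the actual Haskell backend. The three header names are my best guess at a typical minimal set and the handler simply echoes the JSON back, so treat all of it as an assumption rather than the real server code.</p>

```python
from http.server import BaseHTTPRequestHandler

# Origin of the static frontend (the page lives under this domain).
ALLOWED_ORIGIN = "http://justus.science"


def cors_headers(origin=ALLOWED_ORIGIN):
    """Return a fixed set of CORS response headers for one allowed origin."""
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    }


class CalculationHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the calculation backend."""

    def _send_cors(self):
        for name, value in cors_headers().items():
            self.send_header(name, value)

    def do_OPTIONS(self):
        # Preflight request: answer with the headers and an empty body.
        self.send_response(204)
        self._send_cors()
        self.end_headers()

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length)  # JSON sent by the frontend
        self.send_response(200)
        self._send_cors()
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)  # a real backend would send the schedules
```

<p>To try it, pass <code class="language-plaintext highlighter-rouge">CalculationHandler</code> to <code class="language-plaintext highlighter-rouge">http.server.HTTPServer</code> and call <code class="language-plaintext highlighter-rouge">serve_forever()</code>. The important part is that the headers go onto every response, including the one for the preflight <code class="language-plaintext highlighter-rouge">OPTIONS</code> request.</p>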
<p>This does not make the site or your browser vulnerable, as far as I can tell, there’s no reason to worry. Just try it and have some fun.</p>
<h2 id="licensing">Licensing</h2>
<p>Both projects have an open source license (LGPL v3). Feel free to use the code as you’d like, I’d appreciate it if you’d contribute, should you have ideas for improvement/be interested in advancing the project.</p>JustusThe News (TLDR)Next level monkey patching2015-05-01T00:00:00+00:002015-05-01T00:00:00+00:00http://justus.science/blog/2015/05/01/monkey-patching<p><em>Hint: all python examples here run on python 3 and you can try them for yourself and experiment.
The source can be found <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching">here</a></em></p>
<p>As everyone probably knows, most objects in Python are not as static as in many other languages. When you create a class you can specify class attributes in the class body and instance attributes in the <code class="language-plaintext highlighter-rouge">__init__</code> method.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="code"><pre><span class="c1">#! /usr/bin/env python3</span>


<span class="k">class</span> <span class="nc">TestClass</span><span class="p">:</span>
    <span class="c1"># class attributes are declared in the class body</span>
    <span class="c1"># they absolutely must be assigned a value</span>
    <span class="n">class_foo</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="n">class_bar</span> <span class="o">=</span> <span class="nb">object</span>

    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="c1"># instance attributes are assigned to the object in the initializer</span>
        <span class="c1"># these also need to be assigned a value</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">instance_foo</span> <span class="o">=</span> <span class="mi">8</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">instance_bar</span> <span class="o">=</span> <span class="bp">None</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Attribute basics</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippets source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/attributes_basics.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p>However that is by no means final.</p>
<h2 id="patching-objects">Patching objects</h2>
<p>Even though you are encouraged to declare instance attributes in the initializer, you are by no means required to do so. You can declare/assign instance attributes in any method and even outside the class at any point in the program.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="code"><pre><span class="c1">#! /usr/bin/env python3</span>


<span class="k">class</span> <span class="nc">TestClass</span><span class="p">:</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">pass</span>

    <span class="k">def</span> <span class="nf">method_1</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="c1"># defining an instance attribute from inside another method</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">instance_foo</span> <span class="o">=</span> <span class="mi">4</span>


<span class="n">my_instance</span> <span class="o">=</span> <span class="n">TestClass</span><span class="p">()</span>

<span class="k">print</span><span class="p">(</span>
    <span class="nb">hasattr</span><span class="p">(</span><span class="n">my_instance</span><span class="p">,</span> <span class="s">'instance_foo'</span><span class="p">)</span>
<span class="p">)</span>  <span class="c1"># =>> False</span>
<span class="c1"># the instance_foo attribute does not exist yet</span>

<span class="n">my_instance</span><span class="p">.</span><span class="n">method_1</span><span class="p">()</span>

<span class="k">print</span><span class="p">(</span>
    <span class="n">my_instance</span><span class="p">.</span><span class="n">instance_foo</span>
<span class="p">)</span>  <span class="c1"># =>> 4</span>
<span class="c1"># now it does</span>

<span class="k">print</span><span class="p">(</span>
    <span class="nb">hasattr</span><span class="p">(</span><span class="n">my_instance</span><span class="p">,</span> <span class="s">'instance_bar'</span><span class="p">)</span>
<span class="p">)</span>  <span class="c1"># =>> False</span>

<span class="n">my_instance</span><span class="p">.</span><span class="n">instance_bar</span> <span class="o">=</span> <span class="s">'hello'</span>

<span class="k">print</span><span class="p">(</span>
    <span class="n">my_instance</span><span class="p">.</span><span class="n">instance_bar</span>
<span class="p">)</span>  <span class="c1"># =>> hello</span>

<span class="k">del</span> <span class="n">my_instance</span><span class="p">.</span><span class="n">instance_foo</span>

<span class="n">my_instance</span><span class="p">.</span><span class="n">instance_foo</span>
<span class="c1"># =>> AttributeError: 'TestClass' object has no attribute 'instance_foo'</span>
<span class="c1"># trying to access non-existing attributes causes an AttributeError</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Changing attributes from the outside</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippets source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/attributes_from_outside.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p>As you can see we can add attributes from inside another method. You can also remove the attributes from anywhere.</p>
<h3 id="some-things-to-learn-from-this">Some things to learn from this</h3>
<p>Declare your instance attributes in the initializer because you are guaranteed that this method will execute, or you probably will run into unexpected AttributeErrors somewhere down the line. Even initializing them with <code class="language-plaintext highlighter-rouge">None</code> is better than not initializing at all.</p>
<p>Never add instance attributes from the outside. You may accidentally overwrite others, or forget to do it, and that would again cause AttributeErrors.</p>
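<p>A small sketch of why this matters. Both classes are invented for this example; the only difference between them is whether the attribute already exists after <code class="language-plaintext highlighter-rouge">__init__</code> has run.</p>

```python
#! /usr/bin/env python3


class Connection:
    def __init__(self, host):
        # every attribute the methods rely on exists from the start,
        # even if its real value is only known later
        self.host = host
        self.socket = None

    def is_open(self):
        return self.socket is not None


class LazyConnection:
    def __init__(self, host):
        self.host = host
        # self.socket is only created once open() is called

    def open(self):
        self.socket = object()  # stand-in for a real socket

    def is_open(self):
        return self.socket is not None


print(Connection('example.org').is_open())
# =>> False

try:
    LazyConnection('example.org').is_open()
except AttributeError:
    # open() was never called, so the attribute does not exist yet
    print('is_open() failed with an AttributeError')
```

<p>The first class can always answer the question, the second one only works if the caller happens to call the methods in the right order.</p>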
<h3 id="exceptions-from-the-rule">Exceptions from the rule</h3>
<p>There are, however, some exceptions, for instance if you use a decorator to attach meta information to a function or class. It is okay to do so here because, again, you are guaranteed that the decorator is going to execute.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="code"><pre><span class="c1">#! /usr/bin/env python3</span>


<span class="k">def</span> <span class="nf">attach_meta</span><span class="p">(</span><span class="o">**</span><span class="n">arguments</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">_inner</span><span class="p">(</span><span class="n">function_or_class</span><span class="p">):</span>
        <span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">function_or_class</span><span class="p">,</span> <span class="s">'_meta'</span><span class="p">):</span>
            <span class="k">raise</span> <span class="nb">AttributeError</span><span class="p">(</span><span class="s">'Function already has meta information'</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">function_or_class</span><span class="p">.</span><span class="n">_meta</span> <span class="o">=</span> <span class="n">arguments</span>
        <span class="k">return</span> <span class="n">function_or_class</span>
    <span class="k">return</span> <span class="n">_inner</span>


<span class="k">def</span> <span class="nf">print_with</span><span class="p">(</span><span class="n">obj</span><span class="p">):</span>
    <span class="k">if</span> <span class="s">'foo'</span> <span class="ow">in</span> <span class="n">obj</span><span class="p">.</span><span class="n">_meta</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="s">'printed with'</span><span class="p">,</span> <span class="n">obj</span><span class="p">.</span><span class="n">_meta</span><span class="p">[</span><span class="s">'foo'</span><span class="p">])</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span>


<span class="o">@</span><span class="n">attach_meta</span><span class="p">(</span><span class="n">foo</span><span class="o">=</span><span class="s">'blue'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">my_func</span><span class="p">():</span>
    <span class="k">pass</span>


<span class="n">print_with</span><span class="p">(</span><span class="n">my_func</span><span class="p">)</span>
<span class="c1"># =>> <function my_func at ...> printed with blue</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Meta information decorator</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippets source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/meta_information_decorator.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p><em>Note that we’re attaching instance variables to a function. Functions are just objects, like anything else, so we’re allowed to do that</em></p>
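<p>Here is a quick way to convince yourself of that. The function and attribute names are made up for this example, and <code class="language-plaintext highlighter-rouge">vars</code> is the builtin that returns the attribute dictionary of an object.</p>

```python
#! /usr/bin/env python3


def greet():
    return 'hello'


# functions accept new attributes just like instances of our own classes
greet.call_hint = 'say this first'

print(greet.call_hint)
# =>> say this first

print(vars(greet))
# =>> {'call_hint': 'say this first'}

# instances of builtin types like int refuse the same treatment
try:
    (42).call_hint = 'nope'
except AttributeError:
    print('int instances cannot be patched')
```

<p>So not every object can be patched, which is why I said “most objects” in the beginning.</p>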
<h3 id="how-it-works">How it works</h3>
<p>Objects in python are actually quite a simple construct.</p>
<p>For purposes of this article we can just think of objects as a combination of a class, the type of the object, and a so called instance dict.</p>
<p>The class, or type, is what we created when we were using the <code class="language-plaintext highlighter-rouge">class</code> keyword. It contains the methods, the class attributes, references to parent classes and so on.</p>
<p>The instance dict is a simple python dictionary that holds the instance variables.</p>
<p>When a Python object is created by the runtime the instance dict is actually empty<sup id="fnref:instance_dict_init" role="doc-noteref"><a href="#fn:instance_dict_init" class="footnote" rel="footnote">1</a></sup>, no object actually has any instance attributes<sup id="fnref:slots" role="doc-noteref"><a href="#fn:slots" class="footnote" rel="footnote">2</a></sup>, until the <code class="language-plaintext highlighter-rouge">__init__</code> method is called. In the init method we are basically monkey patching in our instance attributes. Each assignment sets the key corresponding to the name of the attribute in the instance dict.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
</pre></td><td class="code"><pre><span class="c1">#! /usr/bin/env python3
</span>
<span class="k">class</span> <span class="nc">TestClass</span><span class="p">:</span>
<span class="n">foo</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="nb">dir</span><span class="p">(</span><span class="bp">self</span><span class="p">))</span>
<span class="c1"># =>> ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__',
</span> <span class="c1"># '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__',
</span> <span class="c1"># '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__',
</span> <span class="c1"># '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
</span> <span class="c1"># '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'foo',
</span> <span class="c1"># 'method']
</span> <span class="c1"># only shows us the names of methods or attributes we defined on the class
</span>
<span class="c1"># we can refer to the instance dict directly using __dict__
</span> <span class="k">print</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">__dict__</span><span class="p">)</span>
<span class="c1"># =>> {}
</span>
<span class="c1"># this is an alternative way to get the instance dict
</span> <span class="k">print</span><span class="p">(</span><span class="nb">vars</span><span class="p">(</span><span class="bp">self</span><span class="p">))</span>
<span class="bp">self</span><span class="p">.</span><span class="n">bar</span> <span class="o">=</span> <span class="s">'hi there'</span>
<span class="k">print</span><span class="p">(</span><span class="nb">dir</span><span class="p">(</span><span class="bp">self</span><span class="p">))</span>
<span class="c1"># =>> ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__',
</span> <span class="c1"># '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__',
</span> <span class="c1"># '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__',
</span> <span class="c1"># '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
</span> <span class="c1"># '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'bar',
</span> <span class="c1"># 'foo', 'method']
</span> <span class="c1"># now bar exists as well
</span>
<span class="k">print</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">__dict__</span><span class="p">)</span>
<span class="c1"># =>> {'bar': 'hi there'}
</span>
<span class="k">print</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">__dict__</span><span class="p">[</span><span class="s">'bar'</span><span class="p">])</span>
<span class="c1"># =>> hi there
</span>
<span class="k">def</span> <span class="nf">method</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">pass</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">TestClass</span><span class="p">()</span>
<span class="k">del</span> <span class="n">a</span><span class="p">.</span><span class="n">__dict__</span><span class="p">[</span><span class="s">'bar'</span><span class="p">]</span>
<span class="c1"># we can delete keys directly in the dict
</span>
<span class="k">print</span><span class="p">(</span>
<span class="nb">hasattr</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="s">'bar'</span><span class="p">)</span>
<span class="p">)</span> <span class="c1"># =>> False
</span>
<span class="k">print</span><span class="p">(</span>
<span class="s">'bar'</span> <span class="ow">in</span> <span class="n">a</span><span class="p">.</span><span class="n">__dict__</span>
<span class="p">)</span> <span class="c1"># =>> False
</span>
<span class="n">a</span><span class="p">.</span><span class="n">bar</span>
<span class="c1"># =>> AttributeError: 'TestClass' object has no attribute 'bar'</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Instance dict basics</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippet’s source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/instance_dict_basic.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<h2 id="patching-classes">Patching classes</h2>
<p>As you may have guessed already, if we’re allowed to attach attributes to a function, we are also allowed to attach attributes to a class.</p>
<p>There are two ways of obtaining the class of an object.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="code"><pre><span class="c1">#! /usr/bin/env python3
</span>
<span class="k">class</span> <span class="nc">TestClass</span><span class="p">:</span>
<span class="k">pass</span>
<span class="n">my_obj</span> <span class="o">=</span> <span class="n">TestClass</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span>
<span class="nb">type</span><span class="p">(</span><span class="n">my_obj</span><span class="p">)</span>
<span class="p">)</span> <span class="c1"># =>> <class '__main__.TestClass'> or <class 'get_class.TestClass'>
</span>
<span class="k">print</span><span class="p">(</span>
<span class="n">my_obj</span><span class="p">.</span><span class="n">__class__</span>
<span class="p">)</span> <span class="c1"># =>> <class '__main__.TestClass'> or <class 'get_class.TestClass'>
</span>
<span class="k">print</span><span class="p">(</span>
<span class="nb">type</span><span class="p">(</span><span class="n">my_obj</span><span class="p">)</span> <span class="ow">is</span> <span class="n">my_obj</span><span class="p">.</span><span class="n">__class__</span>
<span class="p">)</span> <span class="c1"># =>> True</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Obtaining the class</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippet’s source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/get_class.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p>I personally prefer directly referring to <code class="language-plaintext highlighter-rouge">__class__</code> if I’m about to tamper with it, but either one works fine.</p>
<p>Now we can add and remove class attributes.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
</pre></td><td class="code"><pre><span class="c1">#! /usr/bin/env python3
</span>
<span class="k">class</span> <span class="nc">TestClass</span><span class="p">:</span>
<span class="n">greeting</span> <span class="o">=</span> <span class="s">'hello'</span>
<span class="n">instance</span> <span class="o">=</span> <span class="n">TestClass</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span>
<span class="n">TestClass</span><span class="p">.</span><span class="n">greeting</span>
<span class="p">)</span> <span class="c1"># =>> hello
</span>
<span class="k">print</span><span class="p">(</span>
<span class="n">TestClass</span><span class="p">.</span><span class="n">greeting</span> <span class="ow">is</span> <span class="n">instance</span><span class="p">.</span><span class="n">greeting</span>
<span class="p">)</span> <span class="c1"># =>> True
</span>
<span class="c1"># removing class attributes
</span><span class="k">del</span> <span class="n">TestClass</span><span class="p">.</span><span class="n">greeting</span>
<span class="k">print</span><span class="p">(</span>
<span class="nb">hasattr</span><span class="p">(</span><span class="n">instance</span><span class="p">,</span> <span class="s">'greeting'</span><span class="p">)</span>
<span class="p">)</span> <span class="c1"># =>> False
</span>
<span class="c1"># and adding them
</span><span class="n">instance</span><span class="p">.</span><span class="n">__class__</span><span class="p">.</span><span class="n">greeting</span> <span class="o">=</span> <span class="s">'hello again'</span>
<span class="nb">type</span><span class="p">(</span><span class="n">instance</span><span class="p">).</span><span class="n">greeting_2</span> <span class="o">=</span> <span class="s">'dear me'</span>
<span class="k">print</span><span class="p">(</span>
<span class="n">TestClass</span><span class="p">.</span><span class="n">greeting</span>
<span class="p">)</span> <span class="c1"># =>> hello again
</span>
<span class="k">print</span><span class="p">(</span>
<span class="n">instance</span><span class="p">.</span><span class="n">greeting_2</span>
<span class="p">)</span> <span class="c1"># =>> dear me
</span>
<span class="k">print</span><span class="p">(</span>
<span class="s">'greeting_2'</span> <span class="ow">in</span> <span class="n">TestClass</span><span class="p">.</span><span class="n">__dict__</span>
<span class="p">)</span> <span class="c1"># =>> True</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Altering class attributes</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippet’s source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/alter_class_attributes.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<h3 id="how-it-works-1">How it works</h3>
<p>In Python classes are just objects, instances of <code class="language-plaintext highlighter-rouge">type</code>. And like most objects their attributes can be freely modified, removed or added.<sup id="fnref:class_temper_limits" role="doc-noteref"><a href="#fn:class_temper_limits" class="footnote" rel="footnote">3</a></sup> They have an instance dict <code class="language-plaintext highlighter-rouge">__dict__</code> as well, however in this case it is not a vanilla Python <code class="language-plaintext highlighter-rouge">dict</code> but rather a <code class="language-plaintext highlighter-rouge">mappingproxy</code> object. This is the interface the Python interpreter exposes for the instance dicts of classes, including lots of builtin types, and this particular dict-like structure can <strong>not</strong> be modified directly. Attribute assignment and deletion, which go through <code class="language-plaintext highlighter-rouge">__setattr__</code> and <code class="language-plaintext highlighter-rouge">__delattr__</code>, however work on (most) types.</p>
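<p>As a small sketch of that difference: writing into the class dict directly fails, while going through <code class="language-plaintext highlighter-rouge">setattr</code> and <code class="language-plaintext highlighter-rouge">delattr</code> works. The flag <code class="language-plaintext highlighter-rouge">assignment_failed</code> only exists to make the outcome visible.</p>

```python
#! /usr/bin/env python3
import types

class TestClass:
    greeting = 'hello'

# the class dict is exposed through a read-only proxy
print(isinstance(TestClass.__dict__, types.MappingProxyType))
# =>> True

assignment_failed = False
try:
    TestClass.__dict__['greeting'] = 'hi'
except TypeError:
    # mappingproxy objects do not support item assignment
    assignment_failed = True
print(assignment_failed)
# =>> True

# setattr and delattr go through the class itself and do work
setattr(TestClass, 'greeting', 'hi')
print(TestClass.__dict__['greeting'])
# =>> hi

delattr(TestClass, 'greeting')
print('greeting' in TestClass.__dict__)
# =>> False
```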
<h2 id="the-fun-stuff---advanced-class-patching">The fun stuff - advanced class patching</h2>
<p>We’ve just learned that we can patch classes in Python by changing their attributes, which are stored in the class’s instance dict. You may have guessed it already, or you may have seen it: the instance dict of a class contains not only the attributes but also the methods that are defined on the class.</p>
<p>Furthermore if you print one such method the output says the type is <code class="language-plaintext highlighter-rouge">function</code>, not <code class="language-plaintext highlighter-rouge">method</code>.</p>
<p>In fact Python does not have <code class="language-plaintext highlighter-rouge">methods</code> per se. Instead there are functions contained in a class’s instance dict. When you have an instance of the class and print the method accessed through that instance, you’ll notice that the type changes from <code class="language-plaintext highlighter-rouge">function</code> to <code class="language-plaintext highlighter-rouge">bound method</code>.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
</pre></td><td class="code"><pre><span class="c1">#! /usr/bin/env python3
</span>
<span class="k">class</span> <span class="nc">MyClass</span><span class="p">:</span>
<span class="k">def</span> <span class="nf">a_method</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">return</span> <span class="bp">self</span>
<span class="k">print</span><span class="p">(</span>
<span class="s">'a_method'</span> <span class="ow">in</span> <span class="n">MyClass</span><span class="p">.</span><span class="n">__dict__</span>
<span class="p">)</span> <span class="c1"># =>> True
</span>
<span class="k">print</span><span class="p">(</span>
<span class="n">MyClass</span><span class="p">.</span><span class="n">a_method</span>
<span class="p">)</span> <span class="c1"># =>> <function MyClass.a_method at ...>
</span>
<span class="n">obj</span> <span class="o">=</span> <span class="n">MyClass</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span>
<span class="n">obj</span><span class="p">.</span><span class="n">a_method</span>
<span class="p">)</span> <span class="c1"># =>> <bound method MyClass.a_method of <__main__.MyClass object at ...>>
</span>
<span class="k">print</span><span class="p">(</span>
<span class="n">obj</span><span class="p">.</span><span class="n">a_method</span><span class="p">()</span>
<span class="p">)</span> <span class="c1"># =>> <__main__.MyClass object at ...>
</span>
<span class="k">print</span><span class="p">(</span>
<span class="nb">type</span><span class="p">(</span><span class="n">obj</span><span class="p">).</span><span class="n">a_method</span><span class="p">(</span><span class="s">'hello'</span><span class="p">)</span>
<span class="p">)</span> <span class="c1"># =>> hello</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">The difference between functions and methods</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippet’s source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/functions_and_methods.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p>Bound methods are basically partially applied functions, where the first argument is the object the method has been called on.</p>
<p>As we can also see on line 25, a method can be called on the class directly, in which case you have to provide the <code class="language-plaintext highlighter-rouge">self</code> argument yourself. What the type of that <code class="language-plaintext highlighter-rouge">self</code> argument is, is irrelevant and not checked anywhere (by the language).</p>
<p>Since methods are just functions until they are accessed through an instance, we can assume that they are in fact class attributes that happen to be callable. And if you look at Python classes that is exactly the case. Since methods are nothing but class attributes, and class attributes can be modified, we can assume that methods can be modified in just the same way.</p>
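<p>To make the “partially applied function” idea a bit more tangible, here is a rough sketch that rebuilds a bound method by hand with <code class="language-plaintext highlighter-rouge">functools.partial</code>. The names <code class="language-plaintext highlighter-rouge">bound</code> and <code class="language-plaintext highlighter-rouge">by_hand</code> are only for illustration, and a partial is an approximation of a bound method, not the same kind of object.</p>

```python
#! /usr/bin/env python3
import functools

class MyClass:
    def a_method(self):
        return self

obj = MyClass()
bound = obj.a_method

# the bound method remembers the instance and the underlying function
print(bound.__self__ is obj)
# =>> True

print(bound.__func__ is MyClass.a_method)
# =>> True

# roughly the same thing, built by hand
by_hand = functools.partial(MyClass.a_method, obj)
print(by_hand() is bound())
# =>> True
```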
<p>Let’s see how it works:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
</pre></td><td class="code"><pre><span class="c1">#! /usr/bin/env python3
</span>
<span class="k">class</span> <span class="nc">TestClass</span><span class="p">:</span>
<span class="k">def</span> <span class="nf">foo</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="s">'foo is executing'</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">'self is {}'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="bp">self</span><span class="p">))</span>
<span class="k">def</span> <span class="nf">bar</span><span class="p">(</span><span class="n">param1</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="s">'bar is executing'</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="s">'self is {}'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">param1</span><span class="p">))</span>
<span class="n">a</span> <span class="o">=</span> <span class="n">TestClass</span><span class="p">()</span>
<span class="n">TestClass</span><span class="p">.</span><span class="n">foo</span><span class="p">(</span><span class="s">'of wrong type'</span><span class="p">)</span> <span class="c1"># <- notice we have to provide a 'self' parameter
# =>> foo is executing
# =>> self is of wrong type
</span>
<span class="n">bar</span><span class="p">(</span><span class="s">'the first param'</span><span class="p">)</span>
<span class="c1"># =>> bar is executing
# =>> self is the first param
</span>
<span class="n">a</span><span class="p">.</span><span class="n">foo</span><span class="p">()</span> <span class="c1"># <- equivalent to TestClass.foo(a)
# =>> foo is executing
# =>> self is <__main__.TestClass object at ...>
</span>
<span class="c1"># assigning new methods
</span><span class="n">TestClass</span><span class="p">.</span><span class="n">foo_2</span> <span class="o">=</span> <span class="n">bar</span>
<span class="c1"># equivalent to type(a).foo_2 = bar
</span>
<span class="n">a</span><span class="p">.</span><span class="n">foo_2</span><span class="p">()</span>
<span class="c1"># =>> bar is executing
# =>> self is <__main__.TestClass object at ...>
</span>
<span class="c1"># reassigning old ones
</span><span class="n">TestClass</span><span class="p">.</span><span class="n">foo</span> <span class="o">=</span> <span class="n">bar</span> <span class="c1"># <- no errors
</span>
<span class="n">a</span><span class="p">.</span><span class="n">foo</span><span class="p">()</span>
<span class="c1"># =>> bar is executing
# =>> self is <__main__.TestClass object at ...>
</span>
<span class="c1"># deleting methods
</span><span class="k">del</span> <span class="n">TestClass</span><span class="p">.</span><span class="n">foo_2</span>
<span class="n">a</span><span class="p">.</span><span class="n">foo_2</span><span class="p">()</span>
<span class="c1"># =>> AttributeError: 'TestClass' object has no attribute 'foo_2'</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Basic method manipulations</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippet’s source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/method_reassignment.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p><sup id="fnref:new_inst_meth" role="doc-noteref"><a href="#fn:new_inst_meth" class="footnote" rel="footnote">4</a></sup></p>
<p><em>Do remember that you have to add methods <strong>to the class</strong>, not the instance/object.</em></p>
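<p>A short sketch of why that matters: a function stored on the instance stays a plain function and receives no <code class="language-plaintext highlighter-rouge">self</code>, while the same function stored on the class gets bound. If you really want a method on a single object, <code class="language-plaintext highlighter-rouge">types.MethodType</code> can bind it explicitly. The flag <code class="language-plaintext highlighter-rouge">call_failed</code> is only there to record the outcome.</p>

```python
#! /usr/bin/env python3
import types

class TestClass:
    pass

def bar(param1):
    return param1

a = TestClass()

# stored on the instance, bar stays a plain function: no self is passed
a.foo = bar
call_failed = False
try:
    a.foo()
except TypeError:
    # bar() is missing its one required positional argument
    call_failed = True
print(call_failed)
# =>> True

# stored on the class, the same function is bound on attribute access
TestClass.foo_2 = bar
print(a.foo_2() is a)
# =>> True

# binding a function to one single instance explicitly
a.foo = types.MethodType(bar, a)
print(a.foo() is a)
# =>> True
```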
<p>Now please note that this is a highly unsafe practice. Technically you can remove pretty much any method from any object, and it is very hard to find out whether and where that has been done.</p>
<p>There is some fun stuff you can do now, since the methods you declared yourself are not the only thing you can change. This is an example of how you can overwrite <code class="language-plaintext highlighter-rouge">__init__</code> to change the behaviour of a class during object instantiation.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
</pre></td><td class="code"><pre><span class="c1">#! /usr/bin/env python3
</span>
<span class="k">class</span> <span class="nc">BaseClass</span><span class="p">:</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="n">BaseClass</span><span class="p">.</span><span class="n">__init__</span><span class="p">,</span> <span class="s">'executing'</span><span class="p">)</span>
<span class="k">class</span> <span class="nc">SubClass</span><span class="p">(</span><span class="n">BaseClass</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="n">SubClass</span><span class="p">.</span><span class="n">__init__</span><span class="p">,</span> <span class="s">'executing'</span><span class="p">)</span>
<span class="nb">super</span><span class="p">().</span><span class="n">__init__</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="s">'</span><span class="se">\n</span><span class="s">instantiating'</span><span class="p">,</span> <span class="n">SubClass</span><span class="p">)</span>
<span class="n">SubClass</span><span class="p">()</span>
<span class="c1"># =>> <function SubClass.__init__ at ...> executing
# =>> <function BaseClass.__init__ at ...> executing
</span>
<span class="k">print</span><span class="p">()</span>
<span class="c1"># removing it
</span><span class="k">del</span> <span class="n">SubClass</span><span class="p">.</span><span class="n">__init__</span>
<span class="k">print</span><span class="p">(</span><span class="s">'</span><span class="se">\n</span><span class="s">instantiating'</span><span class="p">,</span> <span class="n">SubClass</span><span class="p">,</span> <span class="s">'again'</span><span class="p">)</span>
<span class="n">SubClass</span><span class="p">()</span>
<span class="c1"># =>> <function BaseClass.__init__ at ...> executing
</span>
<span class="k">def</span> <span class="nf">new_init</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="s">'I overwrote __init__'</span><span class="p">)</span>
<span class="nb">super</span><span class="p">(</span><span class="n">SubClass</span><span class="p">,</span> <span class="bp">self</span><span class="p">).</span><span class="n">__init__</span><span class="p">()</span>
<span class="c1"># and adding a new one
</span><span class="n">SubClass</span><span class="p">.</span><span class="n">__init__</span> <span class="o">=</span> <span class="n">new_init</span>
<span class="k">print</span><span class="p">(</span><span class="s">'</span><span class="se">\n</span><span class="s">instantiating'</span><span class="p">,</span> <span class="n">SubClass</span><span class="p">,</span> <span class="s">'one last time'</span><span class="p">)</span>
<span class="n">SubClass</span><span class="p">()</span>
<span class="c1"># =>> I overwrote __init__
# =>> <function BaseClass.__init__ at ...> executing</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Hacking __init__</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippet’s source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/hacking_init.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p>However you do not have to stop there.</p>
<p>Those who know how decorators work will be aware that they are just normal Python functions and can be used as such.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="code"><pre><span class="c1">#! /usr/bin/env python3
</span>
<span class="k">def</span> <span class="nf">my_decorator</span><span class="p">(</span><span class="n">func</span><span class="p">):</span>
<span class="k">return</span> <span class="n">func</span>
<span class="c1"># ergo
</span>
<span class="o">@</span><span class="n">my_decorator</span>
<span class="k">def</span> <span class="nf">function1</span><span class="p">():</span>
<span class="k">pass</span>
<span class="c1"># is equivalent to
</span>
<span class="k">def</span> <span class="nf">function1</span><span class="p">():</span>
<span class="k">pass</span>
<span class="n">function1</span> <span class="o">=</span> <span class="n">my_decorator</span><span class="p">(</span><span class="n">function1</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Quick decorator refresher</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippet’s source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/quick_decorator.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p>We can use that fact to dynamically create classmethods and staticmethods.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
</pre></td><td class="code"><pre><span class="c1">#! /usr/bin/env python3
</span>
<span class="k">class</span> <span class="nc">TestClass</span><span class="p">:</span>
<span class="k">pass</span>
<span class="k">def</span> <span class="nf">foo</span><span class="p">():</span>
<span class="k">print</span><span class="p">(</span><span class="s">'foo is executing'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">bar</span><span class="p">(</span><span class="n">cls</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="s">'bar is executing'</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">cls</span><span class="p">)</span>
<span class="n">TestClass</span><span class="p">.</span><span class="n">static_foo</span> <span class="o">=</span> <span class="nb">staticmethod</span><span class="p">(</span><span class="n">foo</span><span class="p">)</span>
<span class="n">TestClass</span><span class="p">.</span><span class="n">class_bar</span> <span class="o">=</span> <span class="nb">classmethod</span><span class="p">(</span><span class="n">bar</span><span class="p">)</span>
<span class="n">TestClass</span><span class="p">.</span><span class="n">static_foo</span><span class="p">()</span>
<span class="c1"># =>> foo is executing
</span>
<span class="n">TestClass</span><span class="p">.</span><span class="n">class_bar</span><span class="p">()</span>
<span class="c1"># =>> bar is executing
# =>> <class '__main__.TestClass'></span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">Dynamic class- and staticmethods</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippet’s source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/dynamic_static_classmethods.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p>I can actually think of very few ways that this can be useful. You should of course not apply this to live objects, since the consequences are highly opaque.</p>
<p>One way of using this though is to take a bunch of classes and add common or dynamically constructed methods to them.</p>
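<p>In its simplest form that could look something like the sketch below. The classes <code class="language-plaintext highlighter-rouge">Cat</code> and <code class="language-plaintext highlighter-rouge">Dog</code> and the shared <code class="language-plaintext highlighter-rouge">speak</code> function are made up for illustration; the only assumption is that every patched class defines a <code class="language-plaintext highlighter-rouge">sound</code> attribute.</p>

```python
#! /usr/bin/env python3

class Cat:
    sound = 'meow'

class Dog:
    sound = 'woof'

def speak(self):
    # shared method, relies on each class defining `sound`
    return '{} says {}'.format(type(self).__name__, self.sound)

# attach the same function to every class in the bunch
for class_ in (Cat, Dog):
    class_.speak = speak

print(Cat().speak())
# =>> Cat says meow

print(Dog().speak())
# =>> Dog says woof
```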
<p>The following is an example of a decorator that can be applied to a class. If the class has public class variables whose value is a type, the decorator removes them and dynamically constructs an <code class="language-plaintext highlighter-rouge">__init__</code> method for the class which requires the names of those fields as keyword arguments, typechecks them and then adds them to the object.</p>
<p>We can easily construct this and then add this new <code class="language-plaintext highlighter-rouge">__init__</code> method to our class.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
</pre></td><td class="code"><pre><span class="c1">#! /usr/bin/env python3
</span>
<span class="k">def</span> <span class="nf">auto_init</span><span class="p">(</span><span class="n">class_</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">filter_func</span><span class="p">(</span><span class="n">fieldname</span><span class="p">):</span>
<span class="s">"""
Return True if the field is not private and its value is a type
"""</span>
<span class="k">if</span> <span class="n">fieldname</span><span class="p">.</span><span class="n">startswith</span><span class="p">(</span><span class="s">'_'</span><span class="p">):</span>
<span class="k">return</span> <span class="bp">False</span>
<span class="n">value</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">class_</span><span class="p">,</span> <span class="n">fieldname</span><span class="p">)</span>
<span class="k">return</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="nb">type</span><span class="p">)</span>
<span class="n">fields</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="nb">filter</span><span class="p">(</span>
<span class="n">filter_func</span><span class="p">,</span>
<span class="n">class_</span><span class="p">.</span><span class="n">__dict__</span>
<span class="p">))</span>
<span class="n">fields_and_types</span> <span class="o">=</span> <span class="p">[(</span><span class="n">field</span><span class="p">,</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">class_</span><span class="p">,</span> <span class="n">field</span><span class="p">))</span> <span class="k">for</span> <span class="n">field</span> <span class="ow">in</span> <span class="n">fields</span><span class="p">]</span>
<span class="k">for</span> <span class="n">field</span> <span class="ow">in</span> <span class="n">fields</span><span class="p">:</span>
<span class="c1"># delete the class attributed to prevent collision
</span> <span class="nb">delattr</span><span class="p">(</span><span class="n">class_</span><span class="p">,</span> <span class="n">field</span><span class="p">)</span>
<span class="c1"># preserve the old init, this is a safety measure
</span> <span class="n">old_init</span> <span class="o">=</span> <span class="nb">getattr</span><span class="p">(</span><span class="n">class_</span><span class="p">,</span> <span class="s">'__init__'</span><span class="p">)</span>
<span class="c1"># if the class does not define this itself, this will be super.__init__
</span> <span class="c1"># which is convenient
</span>
<span class="k">def</span> <span class="nf">new_init</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
<span class="c1"># only accept kwargs, because otherwise there's no way of
</span> <span class="c1"># matching the fields
</span>
<span class="k">for</span> <span class="n">field</span><span class="p">,</span> <span class="n">type_</span> <span class="ow">in</span> <span class="n">fields_and_types</span><span class="p">:</span>
<span class="c1"># check whether the field is present
</span> <span class="c1"># for simplicity's sake I do not handle extra arguments
</span> <span class="k">if</span> <span class="n">field</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">kwargs</span><span class="p">:</span>
<span class="k">raise</span> <span class="nb">TypeError</span><span class="p">(</span>
<span class="s">'Expected keyword Argument {}'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">field</span><span class="p">)</span>
<span class="p">)</span>
<span class="n">value</span> <span class="o">=</span> <span class="n">kwargs</span><span class="p">[</span><span class="n">field</span><span class="p">]</span>
<span class="c1"># typecheck the field
</span> <span class="k">if</span> <span class="ow">not</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">type_</span><span class="p">):</span>
<span class="k">raise</span> <span class="nb">TypeError</span><span class="p">(</span>
<span class="s">'Expected instance of {} for field {}'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span>
<span class="n">type_</span><span class="p">,</span> <span class="n">field</span>
<span class="p">)</span>
<span class="p">)</span>
<span class="c1"># essentially self.field = value
</span> <span class="nb">setattr</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">field</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span>
<span class="n">old_init</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="c1"># for completeness sake
</span>
<span class="n">class_</span><span class="p">.</span><span class="n">__init__</span> <span class="o">=</span> <span class="n">new_init</span>
<span class="k">return</span> <span class="n">class_</span>
<span class="o">@</span><span class="n">auto_init</span>
<span class="k">class</span> <span class="nc">TestClass</span><span class="p">:</span>
    <span class="n">foo</span> <span class="o">=</span> <span class="nb">int</span>
    <span class="n">bar</span> <span class="o">=</span> <span class="nb">int</span>
    <span class="n">glob</span> <span class="o">=</span> <span class="nb">str</span>

<span class="n">a</span> <span class="o">=</span> <span class="n">TestClass</span><span class="p">(</span><span class="n">foo</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">bar</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span> <span class="n">glob</span><span class="o">=</span><span class="s">"globbi globbi globbi"</span><span class="p">)</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">TestClass</span><span class="p">(</span><span class="n">foo</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span> <span class="n">bar</span><span class="o">=</span><span class="mi">8</span><span class="p">,</span> <span class="n">glob</span><span class="o">=</span><span class="s">""</span><span class="p">)</span>

<span class="k">print</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">foo</span><span class="p">)</span>
<span class="c1"># =>> 0</span>
<span class="k">print</span><span class="p">(</span><span class="n">b</span><span class="p">.</span><span class="n">foo</span><span class="p">)</span>
<span class="c1"># =>> 8</span>

<span class="k">print</span><span class="p">(</span><span class="n">b</span><span class="p">.</span><span class="n">bar</span> <span class="o">==</span> <span class="n">a</span><span class="p">.</span><span class="n">bar</span><span class="p">)</span>
<span class="c1"># =>> True</span>

<span class="k">print</span><span class="p">(</span><span class="n">b</span><span class="p">.</span><span class="n">glob</span> <span class="o">!=</span> <span class="n">a</span><span class="p">.</span><span class="n">glob</span><span class="p">)</span>
<span class="c1"># =>> True</span>

<span class="k">print</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">glob</span><span class="p">)</span>
<span class="c1"># =>> globbi globbi globbi</span>

<span class="k">try</span><span class="p">:</span>
    <span class="n">TestClass</span><span class="p">()</span>
<span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
<span class="c1"># =>> Expected keyword Argument foo</span>

<span class="k">try</span><span class="p">:</span>
    <span class="n">TestClass</span><span class="p">(</span><span class="n">foo</span><span class="o">=</span><span class="nb">object</span><span class="p">,</span> <span class="n">bar</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">glob</span><span class="o">=</span><span class="s">"eirjg"</span><span class="p">)</span>
<span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
<span class="c1"># =>> Expected instance of <class 'int'> for field foo</span>
</pre></td></tr></tbody></table></code></pre></figure>
<div class="snippet-meta light-font">
<div>
<small class="right">A decorator that dynamically creates typechecked init methods</small>
</div>
<div class="clearfix"></div>
<div>
<small class="right">
<em>
This snippets source on <a href="https://github.com/JustusAdam/justusadam.github.io/tree/master/_includes/snippets/python/monkey-patching/dynamic_init.py">GitHub</a>
</em>
</small>
</div>
<div class="clearfix"></div>
</div>
<p>You could just as easily add more to this decorator. The fields it adds could be private by default, and it could also add dynamic accessor methods. Instead of just writing types to the class attributes, you could provide further meta information and construct appropriate accessor methods, or even create no fields at all and instead create field-mimicking accessor methods using <code class="language-plaintext highlighter-rouge">@property</code>. This can be very useful if your object is trying to hide (or simplify) access to a database or external system: it imitates a normal object, but instead of accessing fields it may open a database or network connection or read from a file.</p>
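<p>As a rough illustration of the <code class="language-plaintext highlighter-rouge">@property</code> idea, here is a minimal sketch of a second decorator. The names <code class="language-plaintext highlighter-rouge">typed_properties</code>, <code class="language-plaintext highlighter-rouge">_make_property</code> and <code class="language-plaintext highlighter-rouge">Point</code> are made up for this example, and storing the value in a private attribute is just one possible choice; a real accessor could talk to a database or a file instead.</p>

```python
def _make_property(name, type_):
    # hypothetical helper: the value lives in a private attribute '_<name>'
    private_name = '_' + name

    def getter(self):
        return getattr(self, private_name)

    def setter(self, value):
        # typecheck on every assignment, not only in __init__
        if not isinstance(value, type_):
            raise TypeError(
                'Expected instance of {} for field {}'.format(type_, name)
            )
        setattr(self, private_name, value)

    return property(getter, setter)


def typed_properties(class_):
    # collect the names first, because the class is modified in the loop below
    names = [
        name for name, value in vars(class_).items()
        if not name.startswith('_') and isinstance(value, type)
    ]
    for name in names:
        # replace the type placeholder with a type-checked property
        setattr(class_, name, _make_property(name, getattr(class_, name)))
    return class_


@typed_properties
class Point:
    x = int
    y = int


p = Point()
p.x = 3
print(p.x)
# =>> 3

try:
    p.y = 'not a number'
except TypeError as e:
    print(e)
# =>> Expected instance of <class 'int'> for field y
```

<p>Unlike the <code class="language-plaintext highlighter-rouge">__init__</code> variant, the check here runs on every assignment, which is the part that would later let you swap the private attribute for some external storage.</p>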
<p>All in all, I think it is useful to know that these things are possible, and used correctly they can certainly be a powerful tool. I like the possibilities, but I have never really used them in practice.</p>
<p>Let me know if you’d be interested in seeing more ‘useful’ applications of this concept and I might make another post about it.</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:instance_dict_init" role="doc-endnote">
<p>This is true for any custom object. It can however be changed by using decorators or metaclasses. <a href="#fnref:instance_dict_init" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:slots" role="doc-endnote">
<p>This is true for classes that do not define <code class="language-plaintext highlighter-rouge">__slots__</code>; classes that do define it will in fact allocate named fields. <a href="#fnref:slots" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:class_temper_limits" role="doc-endnote">
<p>This pretty much only applies to classes actually created using <code class="language-plaintext highlighter-rouge">class</code>, not to builtin types such as for example <code class="language-plaintext highlighter-rouge">object</code>, <code class="language-plaintext highlighter-rouge">function</code> and <code class="language-plaintext highlighter-rouge">type</code> itself. <a href="#fnref:class_temper_limits" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:new_inst_meth" role="doc-endnote">
<p>The added/reassigned/removed methods affect both new and old instances of the class (instantly). <a href="#fnref:new_inst_meth" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>Justus(Re)Importing in python - don’t touch sys.modules2015-04-19T00:00:00+00:002015-04-19T00:00:00+00:00http://justus.science/blog/2015/04/19/sys.modules-is-dangerous<p>So I am sitting here watching <a href="//twitter.com/dabeaz">David Beazley</a>’s <a href="https://www.youtube.com/watch?v=0oTh1CXRaQ0">PyCon talk</a> about modules, packages and imports, and he is talking about <code class="language-plaintext highlighter-rouge">sys.modules</code> sort of guarding against multiple imports, which inspired me to fire up the python interpreter myself and start messing about.</p>
<h2 id="reimporting-in-python---basics">Reimporting in Python - Basics</h2>
<p>As David mentions, in python you cannot just <code class="language-plaintext highlighter-rouge">import module</code> again to reload the module. The canonical albeit still bad way is to <code class="language-plaintext highlighter-rouge">import importlib</code> and use <code class="language-plaintext highlighter-rouge">importlib.reload(module)</code>.</p>
<p>I’d like to take this opportunity to add a disclaimer right away and sort of spoil the conclusion: you should absolutely avoid reimporting any modules in python. As you’ll see towards the end, things can get very messy very quickly when you reimport modules. It can cause severe bugs which can be virtually impossible to track down.</p>
<p>If you find yourself playing with the idea of reimporting modules in software that’ll be used in production, consider alternatives such as writing unit tests (if you’re using it to test code while developing), using <code class="language-plaintext highlighter-rouge">multiprocessing</code> or <code class="language-plaintext highlighter-rouge">subprocess</code> to run the code in a separate interpreter, refactoring or simply restarting the interpreter.</p>
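<p>To make the “separate interpreter” alternative a bit more concrete, here is a minimal sketch using <code class="language-plaintext highlighter-rouge">subprocess</code>. The helper name <code class="language-plaintext highlighter-rouge">run_in_fresh_interpreter</code> is made up for this example; every call starts a new Python process, so it always sees the current state of the files on disk and nothing needs to be reloaded.</p>

```python
import subprocess
import sys


def run_in_fresh_interpreter(code):
    """Run `code` in a brand new Python process and return what it printed."""
    result = subprocess.run(
        [sys.executable, '-c', code],  # same Python executable, new process
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the code fails
    )
    return result.stdout


output = run_in_fresh_interpreter("import json; print(json.dumps([1, 2]))")
print(output)
# =>> [1, 2]
```

<p>The price is process startup time and the fact that results have to be passed back as text (or through files, pipes and the like), which is usually still far less trouble than a half-reloaded program.</p>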
<p>Also, in this article I deliberately try to make programs fail and break. This is intended to explore features of the interpreter and standard library and is not meant to be done in production software.</p>
<h2 id="messing-with-sysmodules">Messing with <code class="language-plaintext highlighter-rouge">sys.modules</code></h2>
<p>David Beazley also mentions in his talk that the object actually recording imported modules is <code class="language-plaintext highlighter-rouge">sys.modules</code>, which happens to be a standard python dict.</p>
<p>The interesting thing about that is that <code class="language-plaintext highlighter-rouge">sys.modules</code> is unlike <code class="language-plaintext highlighter-rouge">mappingproxy</code>, the dict-like object/wrapper/imitator that a lot of the builtin data structures (such as the <code class="language-plaintext highlighter-rouge">dict</code> type itself) use to imitate a dict while avoiding modification<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> and infinite recursion<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>. Being a standard dict, <code class="language-plaintext highlighter-rouge">sys.modules</code> supports item assignment (<code class="language-plaintext highlighter-rouge">__setitem__</code>) as well as deletion (<code class="language-plaintext highlighter-rouge">__delitem__</code>).</p>
<p>This got me thinking: “How much does the <code class="language-plaintext highlighter-rouge">sys.modules</code> dict actually influence the import process?” As it turns out, a lot, and it allows you to mess with it.</p>
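<p>The difference is easy to check in the interpreter. The following is a small sketch; the class <code class="language-plaintext highlighter-rouge">Example</code> and the key <code class="language-plaintext highlighter-rouge">'hypothetical_module'</code> are made up for this example.</p>

```python
import sys
import types


class Example:
    attribute = 1


# the __dict__ of a class is a read-only mappingproxy ...
class_dict = Example.__dict__
print(type(class_dict) is types.MappingProxyType)
# =>> True

try:
    class_dict['attribute'] = 2
except TypeError as error:
    # item assignment is refused by the proxy
    print('refused:', type(error).__name__)
# =>> refused: TypeError

# ... whereas sys.modules is a plain dict that accepts assignment and deletion
print(type(sys.modules) is dict)
# =>> True
sys.modules['hypothetical_module'] = types.ModuleType('hypothetical_module')
print('hypothetical_module' in sys.modules)
# =>> True
del sys.modules['hypothetical_module']
print('hypothetical_module' in sys.modules)
# =>> False
```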
<p>If you import a module, let’s call it <code class="language-plaintext highlighter-rouge">test</code>, modify the file and import again (in the same interpreter instance), nothing changes: you’re still running the old code. But what happens if you delete the module from <code class="language-plaintext highlighter-rouge">sys.modules</code> first?</p>
<p>The answer: nothing at first. The code still runs, all functions that were in the module previously are still there, as is the module itself, <strong>but</strong> something odd happens if you execute <code class="language-plaintext highlighter-rouge">import test</code> again: it reimports the module.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="c1"># module test</span>

<span class="k">def</span> <span class="nf">hello</span><span class="p">():</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'hello everyone'</span><span class="p">)</span>

<span class="c1"># in the interpreter</span>
<span class="o">>>></span> <span class="kn">import</span> <span class="nn">test</span>
<span class="o">>>></span> <span class="n">test</span><span class="p">.</span><span class="n">hello</span><span class="p">()</span>
<span class="n">hello</span> <span class="n">everyone</span>

<span class="c1"># change the file</span>
<span class="k">def</span> <span class="nf">hello</span><span class="p">():</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'hello'</span><span class="p">)</span>

<span class="c1"># back in the interpreter</span>
<span class="o">>>></span> <span class="kn">import</span> <span class="nn">sys</span>
<span class="o">>>></span> <span class="k">del</span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span><span class="p">[</span><span class="s">'test'</span><span class="p">]</span> <span class="c1"># delete 'test'</span>
<span class="o">>>></span> <span class="s">'test'</span> <span class="ow">in</span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span>
<span class="bp">False</span>
<span class="o">>>></span> <span class="kn">import</span> <span class="nn">test</span> <span class="c1"># reimport</span>
<span class="o">>>></span> <span class="n">test</span><span class="p">.</span><span class="n">hello</span><span class="p">()</span> <span class="c1"># and voila, new and shiny</span>
<span class="n">hello</span></code></pre></figure>
<p><sup id="fnref:dirty_import" role="doc-noteref"><a href="#fn:dirty_import" class="footnote" rel="footnote">3</a></sup></p>
<h2 id="consequences">Consequences</h2>
<p>This would per se not be all that bad, <strong>however</strong> this hacked reload does not behave the same way as <code class="language-plaintext highlighter-rouge">importlib.reload</code>.</p>
<p>The difference between reloading the module this way, which I do not recommend anyone actually does, and using <code class="language-plaintext highlighter-rouge">importlib.reload</code> is that this particular way of reloading only reloads the module in the current namespace.</p>
<p>Let’s suppose we have two modules <code class="language-plaintext highlighter-rouge">foo.py</code> and <code class="language-plaintext highlighter-rouge">bar.py</code> where <code class="language-plaintext highlighter-rouge">bar</code> imports <code class="language-plaintext highlighter-rouge">foo</code> and uses a function defined therein:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="c1"># foo.py</span>

<span class="k">def</span> <span class="nf">hello</span><span class="p">():</span>
    <span class="k">print</span><span class="p">(</span><span class="s">'hello'</span><span class="p">)</span>

<span class="c1"># bar.py</span>

<span class="kn">import</span> <span class="nn">foo</span>

<span class="k">def</span> <span class="nf">hello_bar</span><span class="p">():</span>
    <span class="n">foo</span><span class="p">.</span><span class="n">hello</span><span class="p">()</span></code></pre></figure>
<p>We can then do the following experiment:</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">>>></span> <span class="kn">import</span> <span class="nn">foo</span><span class="p">,</span> <span class="n">bar</span>
<span class="o">>>></span> <span class="n">bar</span><span class="p">.</span><span class="n">hello_bar</span><span class="p">()</span>
<span class="n">hello</span>
<span class="o">>>></span> <span class="n">foo</span><span class="p">.</span><span class="n">hello</span><span class="p">()</span>
<span class="n">hello</span>
<span class="c1"># now we go into foo.py and change print('hello') to print('hello everyone')
</span><span class="o">>>></span> <span class="kn">from</span> <span class="nn">importlib</span> <span class="kn">import</span> <span class="nb">reload</span>
<span class="o">>>></span> <span class="nb">reload</span><span class="p">(</span><span class="n">foo</span><span class="p">)</span>
<span class="o">>>></span> <span class="n">bar</span><span class="p">.</span><span class="n">hello_bar</span><span class="p">()</span>
<span class="n">hello</span> <span class="n">everyone</span>
<span class="o">>>></span> <span class="n">foo</span><span class="p">.</span><span class="n">hello</span><span class="p">()</span>
<span class="n">hello</span> <span class="n">everyone</span></code></pre></figure>
<p>As you can see using <code class="language-plaintext highlighter-rouge">importlib.reload</code> reloads the module and references to the module are updated as well.<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">4</a></sup> This behavior is different if you reload using our dirty little trick.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">>>></span> <span class="kn">import</span> <span class="nn">foo</span><span class="p">,</span> <span class="n">bar</span>
<span class="o">>>></span> <span class="n">bar</span><span class="p">.</span><span class="n">hello_bar</span><span class="p">()</span>
<span class="n">hello</span>
<span class="o">>>></span> <span class="n">foo</span><span class="p">.</span><span class="n">hello</span><span class="p">()</span>
<span class="n">hello</span>
<span class="c1"># now we go into foo.py and change print('hello') to print('hello everyone')
</span><span class="o">>>></span> <span class="kn">import</span> <span class="nn">sys</span>
<span class="o">>>></span> <span class="k">del</span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span><span class="p">[</span><span class="s">'foo'</span><span class="p">]</span>
<span class="o">>>></span> <span class="kn">import</span> <span class="nn">foo</span>
<span class="o">>>></span> <span class="n">bar</span><span class="p">.</span><span class="n">hello_bar</span><span class="p">()</span>
<span class="n">hello</span>
<span class="o">>>></span> <span class="n">foo</span><span class="p">.</span><span class="n">hello</span><span class="p">()</span>
<span class="n">hello</span> <span class="n">everyone</span></code></pre></figure>
<p>Here the reference to <code class="language-plaintext highlighter-rouge">foo</code> in <code class="language-plaintext highlighter-rouge">bar</code> is not being updated, which seems to indicate that this <code class="language-plaintext highlighter-rouge">import</code> is replacing the definition of the module wherever it is being kept (rather than updating it in place), while the old version of the code remains in the <code class="language-plaintext highlighter-rouge">globals()</code> dicts of the modules using it.</p>
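<p>If you want to check this without typing into the interpreter, the experiment can be scripted. This is only a sketch: the module names <code class="language-plaintext highlighter-rouge">demo_foo</code> and <code class="language-plaintext highlighter-rouge">demo_bar</code> are made up, the files are written to a temporary directory, and bytecode caching is switched off so that the second import is certain to read the changed source.</p>

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale .pyc files in this experiment

with tempfile.TemporaryDirectory() as directory:
    foo_path = os.path.join(directory, 'demo_foo.py')
    bar_path = os.path.join(directory, 'demo_bar.py')
    with open(foo_path, 'w') as f:
        f.write("def hello():\n    return 'hello'\n")
    with open(bar_path, 'w') as f:
        f.write("import demo_foo\n\ndef hello_bar():\n    return demo_foo.hello()\n")

    sys.path.insert(0, directory)
    importlib.invalidate_caches()
    import demo_foo, demo_bar
    old_foo = demo_foo

    # change the file, then use the sys.modules trick to force a fresh import
    with open(foo_path, 'w') as f:
        f.write("def hello():\n    return 'hello everyone'\n")
    del sys.modules['demo_foo']
    importlib.invalidate_caches()
    import demo_foo  # creates a new module object

    new_result = demo_foo.hello()       # 'hello everyone'
    old_result = demo_bar.hello_bar()   # 'hello', demo_bar still uses the old module
    bar_holds_old = demo_bar.demo_foo is old_foo
    registry_holds_new = sys.modules['demo_foo'] is demo_foo

    # clean up so the experiment leaves no traces behind
    sys.path.remove(directory)
    del sys.modules['demo_foo'], sys.modules['demo_bar']

print(new_result, '|', old_result, '|', bar_holds_old, '|', registry_holds_new)
# =>> hello everyone | hello | True | True
```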
<h2 id="what-is-sysmodules">What is <code class="language-plaintext highlighter-rouge">sys.modules</code>?</h2>
<p>As we have seen deleting entries in <code class="language-plaintext highlighter-rouge">sys.modules</code> causes the interpreter to reload modules in <code class="language-plaintext highlighter-rouge">import</code> statements, but why is that and what are the entries in <code class="language-plaintext highlighter-rouge">sys.modules</code>?</p>
<p>Well, <code class="language-plaintext highlighter-rouge">sys.modules</code> contains references to already imported modules. You can query it on the type of the entries and it tells you that the entries are actual modules, the same class/type you’d obtain when querying the module directly.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">>>></span> <span class="kn">import</span> <span class="nn">foo</span>
<span class="o">>>></span> <span class="kn">import</span> <span class="nn">sys</span>
<span class="o">>>></span> <span class="nb">type</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">modules</span><span class="p">[</span><span class="s">'foo'</span><span class="p">])</span>
<span class="o"><</span><span class="k">class</span> <span class="err">'</span><span class="nc">module</span><span class="s">'>
>>> type(foo)
<class '</span><span class="n">module</span><span class="s">'>
>>> type(foo) == type(sys.modules['</span><span class="n">foo</span><span class="s">'])
True</span></code></pre></figure>
<p>In fact the module reference in <code class="language-plaintext highlighter-rouge">sys.modules</code> is the exact same object as your module itself.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">>>></span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span><span class="p">[</span><span class="s">'foo'</span><span class="p">]</span> <span class="o">==</span> <span class="n">foo</span>
<span class="bp">True</span>
<span class="o">>>></span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span><span class="p">[</span><span class="s">'foo'</span><span class="p">]</span> <span class="ow">is</span> <span class="n">foo</span>
<span class="bp">True</span></code></pre></figure>
<p>Knowing all this, here is a very crude sketch of how the <code class="language-plaintext highlighter-rouge">__import__</code> function in python works which is the implementation of the <code class="language-plaintext highlighter-rouge">import</code> statement.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="nn">sys</span>

<span class="k">def</span> <span class="nf">__import__</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="nb">globals</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="nb">locals</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">fromlist</span><span class="o">=</span><span class="p">(),</span> <span class="n">level</span><span class="o">=</span><span class="mi">0</span><span class="p">):</span>
    <span class="c1"># we won't care about how globals, locals, fromlist and level are used
    # it is not important, but if you're interested refer to
    # help(__import__) to get started</span>

    <span class="k">if</span> <span class="n">name</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span><span class="p">:</span>
        <span class="c1"># do_actual_import stands in for finding and executing the file</span>
        <span class="n">sys</span><span class="p">.</span><span class="n">modules</span><span class="p">[</span><span class="n">name</span><span class="p">]</span> <span class="o">=</span> <span class="n">do_actual_import</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="p">...)</span>
    <span class="k">return</span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span><span class="p">[</span><span class="n">name</span><span class="p">]</span></code></pre></figure>
<p><sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">5</a></sup></p>
<p>Now if we were to delete the entry from <code class="language-plaintext highlighter-rouge">sys.modules</code>, <code class="language-plaintext highlighter-rouge">__import__</code> would do the expensive import of the file again, since it cannot find the module in <code class="language-plaintext highlighter-rouge">sys.modules</code>. It then returns the new module and adds the reference to <code class="language-plaintext highlighter-rouge">sys.modules</code>, which then also points to the new module. However, any module that imported <code class="language-plaintext highlighter-rouge">name</code> previously still has a reference to the old module object in its <code class="language-plaintext highlighter-rouge">globals()</code> (or <code class="language-plaintext highlighter-rouge">__dict__</code> if you prefer) dict and as such runs the old code.</p>
<p>As for the behavior of <code class="language-plaintext highlighter-rouge">importlib.reload</code>, it reloads the module back into the original <code class="language-plaintext highlighter-rouge">module</code> object and ‘fixes’ (though ‘changes’ might be the better term to use here) the references in-place.<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">6</a></sup> As a result any module that imported using <code class="language-plaintext highlighter-rouge">import module</code> and then uses <code class="language-plaintext highlighter-rouge">module.attribute</code> or <code class="language-plaintext highlighter-rouge">module.function()</code> instead of reassigning with <code class="language-plaintext highlighter-rouge">from module import attribute</code> or <code class="language-plaintext highlighter-rouge">myattribute = module.attribute</code> will now have the updated, reimported version of the code.<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">7</a></sup></p>
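<p>The distinction between <code class="language-plaintext highlighter-rouge">module.function()</code> and a copied reference can be shown with a small script. Again this is just a sketch: the module name <code class="language-plaintext highlighter-rouge">demo_greeting</code> is made up, it lives in a temporary directory, and bytecode caching is switched off so that the reload reads the changed source.</p>

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale .pyc files in this experiment

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, 'demo_greeting.py')
    with open(path, 'w') as f:
        f.write("def hello():\n    return 'hello'\n")

    sys.path.insert(0, directory)
    importlib.invalidate_caches()
    import demo_greeting
    from demo_greeting import hello  # copies the reference to the function

    # change the file and reload the module in place
    with open(path, 'w') as f:
        f.write("def hello():\n    return 'hello everyone'\n")
    reloaded = importlib.reload(demo_greeting)

    same_object = reloaded is demo_greeting  # reload reuses the module object
    via_module = demo_greeting.hello()       # attribute lookup sees the new code
    via_copy = hello()                       # the copied name still runs the old code

    # clean up so the experiment leaves no traces behind
    sys.path.remove(directory)
    del sys.modules['demo_greeting']

print(same_object, '|', via_module, '|', via_copy)
# =>> True | hello everyone | hello
```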
<p>What it doesn’t do, however, is remove any keys. This means that if you imported a module <code class="language-plaintext highlighter-rouge">bar</code> with a function <code class="language-plaintext highlighter-rouge">hello</code>, then edited the file to remove the function entirely or comment it out, and then reimported the module using <code class="language-plaintext highlighter-rouge">importlib.reload</code>, the reloaded module object <code class="language-plaintext highlighter-rouge">bar</code> still has the <code class="language-plaintext highlighter-rouge">hello</code> attribute with the original function in it.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">>>></span> <span class="kn">import</span> <span class="nn">foo</span><span class="p">,</span> <span class="n">bar</span>
<span class="o">>>></span> <span class="n">bar</span><span class="p">.</span><span class="n">hello</span><span class="p">()</span>
<span class="n">hello</span>
<span class="c1"># at this point I removed 'hello' from bar.py
</span><span class="o">>>></span> <span class="kn">import</span> <span class="nn">importlib</span>
<span class="o">>>></span> <span class="n">importlib</span><span class="p">.</span><span class="nb">reload</span><span class="p">(</span><span class="n">bar</span><span class="p">)</span>
<span class="o"><</span><span class="n">module</span> <span class="s">'bar'</span> <span class="k">from</span> <span class="s">'/Users/justusadam/projects/Python/misc_python/bar.py'</span><span class="o">></span>
<span class="o">>>></span> <span class="n">bar</span><span class="p">.</span><span class="n">hello</span><span class="p">()</span>
<span class="n">hello</span>
<span class="o">>>></span></code></pre></figure>
<h2 id="conclusions">Conclusions</h2>
<p>What should one take away from it? Don’t reimport modules.</p>
<p>It does not matter whether you use <code class="language-plaintext highlighter-rouge">importlib.reload</code> or something worse. Unless you know exactly what you’re doing and act very cautiously, you’re very likely to end up with code in a state where some parts of the program hold older and some parts hold newer references to the code, and there’s no way for you to predict the outcome of a particular computation. Write unit tests instead.</p>
<p>However if you feel pathologically adventurous or absolutely require dynamic reloads, try to only keep references to the top level modules and reload them individually using <code class="language-plaintext highlighter-rouge">importlib</code>.</p>
<p>Good luck, have fun and remember that <code class="language-plaintext highlighter-rouge">collections</code> is worth a look and use <code class="language-plaintext highlighter-rouge">yield</code>, it’s awesome.</p>
<h2 id="fun-facts-and-extras">Fun facts and extras</h2>
<h4 id="what-happens-with-importlibreload-when-you-delete-the-module-from-sysmodules">What happens with <code class="language-plaintext highlighter-rouge">importlib.reload</code> when you delete the module from <code class="language-plaintext highlighter-rouge">sys.modules</code>?</h4>
<p>It fails. In order to reload the module it must be in <code class="language-plaintext highlighter-rouge">sys.modules</code>.</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="o">>>></span> <span class="k">del</span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span><span class="p">[</span><span class="s">'bar'</span><span class="p">]</span>
<span class="o">>>></span> <span class="s">'bar'</span> <span class="ow">in</span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span>
<span class="bp">False</span>
<span class="o">>>></span> <span class="n">importlib</span><span class="p">.</span><span class="nb">reload</span><span class="p">(</span><span class="n">bar</span><span class="p">)</span>
<span class="n">Traceback</span> <span class="p">(</span><span class="n">most</span> <span class="n">recent</span> <span class="n">call</span> <span class="n">last</span><span class="p">):</span>
<span class="n">File</span> <span class="s">"<stdin>"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">1</span><span class="p">,</span> <span class="ow">in</span> <span class="o"><</span><span class="n">module</span><span class="o">></span>
<span class="n">File</span> <span class="s">"/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/importlib/__init__.py"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">130</span><span class="p">,</span> <span class="ow">in</span> <span class="nb">reload</span>
<span class="k">raise</span> <span class="nb">ImportError</span><span class="p">(</span><span class="n">msg</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">name</span><span class="p">),</span> <span class="n">name</span><span class="o">=</span><span class="n">name</span><span class="p">)</span>
<span class="nb">ImportError</span><span class="p">:</span> <span class="n">module</span> <span class="n">bar</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">sys</span><span class="p">.</span><span class="n">modules</span></code></pre></figure>
<p>The same applies if you reassign <code class="language-plaintext highlighter-rouge">sys.modules['bar'] = foo</code>. You’ll get the exact same error.</p>
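<p>Both failure cases can be reproduced in one self-contained script. What follows is a minimal sketch, assuming a throwaway module called <code class="language-plaintext highlighter-rouge">bar_demo</code> written to a temporary directory; the module name and the helper <code class="language-plaintext highlighter-rouge">try_reload</code> are made up for this illustration.</p>

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Write a throwaway module (hypothetical name "bar_demo") and make it importable.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "bar_demo.py").write_text("value = 1\n")
sys.path.insert(0, tmp_dir)

import bar_demo


def try_reload(module):
    """Return the ImportError message if reload fails, otherwise None."""
    try:
        importlib.reload(module)
    except ImportError as error:
        return str(error)
    return None


# Case 1: the module is registered as usual, so reload succeeds.
ok_result = try_reload(bar_demo)

# Case 2: the entry is deleted from sys.modules, so reload fails.
del sys.modules["bar_demo"]
deleted_result = try_reload(bar_demo)

# Case 3: the entry points at a different module object, so reload fails too.
sys.modules["bar_demo"] = types.ModuleType("bar_demo")
reassigned_result = try_reload(bar_demo)

print(ok_result, deleted_result, reassigned_result, sep="\n")
```

<p>The last two results should carry the same message, which suggests that <code class="language-plaintext highlighter-rouge">reload</code> checks that the entry in <code class="language-plaintext highlighter-rouge">sys.modules</code> is the very object you passed in, not merely that the name exists.</p>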
<h4 id="my-crude-implementation-of-importlibreload">My crude implementation of <code class="language-plaintext highlighter-rouge">importlib.reload</code></h4>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">def</span> <span class="nf">reload</span><span class="p">(</span><span class="n">module</span><span class="p">):</span>
    <span class="n">file_name</span> <span class="o">=</span> <span class="n">module</span><span class="p">.</span><span class="n">__file__</span>
    <span class="c1"># open() in text mode already returns a str, so no decode() is needed</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">file_name</span><span class="p">)</span> <span class="k">as</span> <span class="nb">file</span><span class="p">:</span>
        <span class="n">raw</span> <span class="o">=</span> <span class="nb">file</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>
    <span class="n">m_globals</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="c1"># run the source in a fresh namespace (passed positionally)</span>
    <span class="nb">exec</span><span class="p">(</span><span class="n">raw</span><span class="p">,</span> <span class="n">m_globals</span><span class="p">)</span>
    <span class="c1"># copy the results onto the existing module object, in place</span>
    <span class="k">for</span> <span class="n">symbol</span><span class="p">,</span> <span class="n">val</span> <span class="ow">in</span> <span class="n">m_globals</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>
        <span class="n">module</span><span class="p">.</span><span class="n">__dict__</span><span class="p">[</span><span class="n">symbol</span><span class="p">]</span> <span class="o">=</span> <span class="n">val</span>
    <span class="k">return</span> <span class="n">module</span></code></pre></figure>
<p><sup id="fnref:reload_impl" role="doc-noteref"><a href="#fn:reload_impl" class="footnote" rel="footnote">8</a></sup></p>
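<p>To see what such an in-place update means in practice, here is a small sketch. It assumes a throwaway module named <code class="language-plaintext highlighter-rouge">greeter_demo</code> in a temporary directory and uses its own simplified <code class="language-plaintext highlighter-rouge">crude_reload</code>; both names are invented for this example and neither is how <code class="language-plaintext highlighter-rouge">importlib</code> actually does it.</p>

```python
import sys
import tempfile
from pathlib import Path


def crude_reload(module):
    """Re-execute a module's source and copy the results onto the same object."""
    source = Path(module.__file__).read_text()
    m_globals = {"__name__": module.__name__, "__file__": module.__file__}
    exec(compile(source, module.__file__, "exec"), m_globals)
    # Update in place: the module object keeps its identity.
    module.__dict__.update(m_globals)
    return module


# Write the first version of a throwaway module (hypothetical name).
tmp_dir = tempfile.mkdtemp()
module_path = Path(tmp_dir, "greeter_demo.py")
module_path.write_text("def hello():\n    return 'old'\n")
sys.path.insert(0, tmp_dir)

import greeter_demo
from greeter_demo import hello as imported_hello

# Change the source on disk, then reload.
module_path.write_text("def hello():\n    return 'new'\n")
reloaded = crude_reload(greeter_demo)

same_object = reloaded is greeter_demo   # the module object was not replaced
via_module = greeter_demo.hello()        # attribute lookup sees the new code
via_from_import = imported_hello()       # the earlier reference is stale

print(same_object, via_module, via_from_import)
```

<p>The name bound with <code class="language-plaintext highlighter-rouge">from greeter_demo import hello</code> still points at the old function object, while anything that goes through the module attribute picks up the new one.</p>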
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p><code class="language-plaintext highlighter-rouge">dict.__dict__.__getitem__ = 8</code> results in <code class="language-plaintext highlighter-rouge">AttributeError: 'mappingproxy' object attribute '__getitem__' is read-only</code> <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>Otherwise any <code class="language-plaintext highlighter-rouge">dict</code> would have an instance dict <code class="language-plaintext highlighter-rouge">dict.__dict__</code>, which would itself have an instance dict <code class="language-plaintext highlighter-rouge">dict.__dict__.__dict__</code> of type <code class="language-plaintext highlighter-rouge">dict</code>, which would have an instance dict, and so on. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:dirty_import" role="doc-endnote">
<p>The reason why calling <code class="language-plaintext highlighter-rouge">foo.hello</code> refers to the new code instead of the old one is that <code class="language-plaintext highlighter-rouge">import</code> overwrites its value in our current <code class="language-plaintext highlighter-rouge">globals()</code> dict when we use it. As such, it reloads the module for whatever namespace we happen to be in. <a href="#fnref:dirty_import" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>This does not work if you reassign contents of the imported module. If you do something like <code class="language-plaintext highlighter-rouge">var = module.other_var</code>, change the value of <code class="language-plaintext highlighter-rouge">other_var</code> and reload, <code class="language-plaintext highlighter-rouge">var</code> will still have the old value. That applies to functions and variables as well as <code class="language-plaintext highlighter-rouge">from module import symbol</code> imports.
From this I can only assume that <code class="language-plaintext highlighter-rouge">importlib.reload</code> changes the module object in place rather than replacing it. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>This, again, is not the actual implementation of the <code class="language-plaintext highlighter-rouge">__import__</code> function but rather a <strong>very</strong> crude approximation for the purposes of this article. For instance, this function could not deal at all with importing submodules such as <code class="language-plaintext highlighter-rouge">foo.bar</code>. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>Which you can actually do yourself. <code class="language-plaintext highlighter-rouge">module</code> objects are not immutable and you can freely assign, remove or alter any part of them, for example <code class="language-plaintext highlighter-rouge">module.foo = 0</code> or <code class="language-plaintext highlighter-rouge">module.bar = lambda k: print(k)</code>. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:6" role="doc-endnote">
<p>The same rules apply if you’ve altered the module yourself, for example with <code class="language-plaintext highlighter-rouge">module.attribute = "new value"</code> or <code class="language-plaintext highlighter-rouge">module.function = lambda a: print(a, "hello")</code>. Only modules importing the base module with <code class="language-plaintext highlighter-rouge">import module</code> will have updated refs (<code class="language-plaintext highlighter-rouge">module.attribute ==> "new value"</code>), not modules using <code class="language-plaintext highlighter-rouge">from module import attribute</code> or <code class="language-plaintext highlighter-rouge">myattr = module.attribute</code> (<code class="language-plaintext highlighter-rouge">myattr ==> "old value"</code>). <a href="#fnref:6" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:reload_impl" role="doc-endnote">
<p>Again, this is not the official implementation and it is heavily simplified. It also does not interact with <code class="language-plaintext highlighter-rouge">sys.modules</code>, which we know the real function does, and it again only works for top-level modules. It is only here to illustrate how some of the behavior of the function could be implemented in Python, not how it is actually done. <a href="#fnref:reload_impl" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Justus

A sweetass sidebar in Jekyll
2015-04-17T00:00:00+00:00
http://justus.science/blog/2015/04/17/a-sweetass-sidebar
<p>So the <a href="//jekyllrb.com">Jekyll</a> default installation using <code class="language-plaintext highlighter-rouge">jekyll new</code> is, albeit beautiful, a little bit bare bones, which is probably deliberate.</p>
<h2 id="adding-a-sidebar">Adding a sidebar</h2>
<p>Mostly I felt like there was a lot of unused space on the index page, so I started to customize the index.html and added a <code class="language-plaintext highlighter-rouge">sidebar-right.html</code> file to include in the index page (modular is always good).</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><div</span> <span class="na">class=</span><span class="s">"sidebar-right sidebar"</span><span class="nt">></div></span></code></pre></figure>
<p>And I added some content in the form of short FAQ-like messages titled “Did you know …”.</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><div</span> <span class="na">class=</span><span class="s">"sidebar-right sidebar right column-4"</span><span class="nt">></span>
<span class="nt"><p></span>Did you know ...<span class="nt"></p></span>
<span class="nt"><ul</span> <span class="na">class=</span><span class="s">"fact-list smaller"</span><span class="nt">></span>
...
<span class="nt"></ul></span>
<span class="nt"></div></span></code></pre></figure>
<p>But I wasn’t satisfied. There were a lot of big and ugly <code class="language-plaintext highlighter-rouge"><a></code> tags in there and the whole thing felt so … static. Actually writing a <code class="language-plaintext highlighter-rouge"><ul></code> element by hand in HTML felt just … wrong. Fortunately I had just learnt about <a href="https://jekyllrb.com/docs/collections/">Jekyll collections</a> and was using them to create some <a href="/projects/">project pages</a>. So I created a new collection called ‘quick_facts’ and refactored the messages into individual <code class="language-plaintext highlighter-rouge">.md</code> files, each containing the markdown source for one message plus the YAML Front Matter. Then I replaced the giant blobs of text from before with this little loop.</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html"><span class="nt"><ul</span> <span class="na">class=</span><span class="s">"fact-list smaller"</span><span class="nt">></span>
{% for fact in site.quick_facts %}
<span class="nt"><li</span><span class="err">{%</span> <span class="na">if</span> <span class="na">forloop.last</span> <span class="err">%}</span> <span class="na">class=</span><span class="s">"last"</span><span class="err">{%</span> <span class="na">endif</span> <span class="err">%}</span><span class="nt">></span>{{ fact.output }}<span class="nt"></li></span>
{% endfor %}
<span class="nt"></ul></span></code></pre></figure>
<p>The <code class="language-plaintext highlighter-rouge">if forloop.last</code> block adds a ‘last’ class to the last element to allow me to add some pretty separators using <code class="language-plaintext highlighter-rouge">border-bottom</code>.</p>
<p>Now if I want to edit a message, instead of digging around in lines upon lines of raw HTML code I can go straight to the message file containing some nice markdown source.
The best feature in my eyes though is that if I want to add, replace or delete a message, I just have to add, replace or delete <code class="language-plaintext highlighter-rouge">.md</code> files in the <code class="language-plaintext highlighter-rouge">_quick_facts</code> directory and Jekyll will process them automatically.</p>
<h2 id="displaying-excerpts-and-descriptions">Displaying excerpts and descriptions</h2>
<p>On my previous website I was using <a href="//drupal.org">Drupal</a>, which by default displayed a kind of teaser on the overview pages. I liked that, so I replicated it in Jekyll.</p>
<p>But I wanted to do more, or more precisely, I didn’t realize at first that Jekyll offers an <code class="language-plaintext highlighter-rouge">excerpt</code> attribute on the document objects, and so I added something of my own making.</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html">{% if post.description %}
<span class="nt"><p</span> <span class="na">class=</span><span class="s">"small light-font"</span><span class="nt">></span>
{{ post.description | truncate: 100, '...' }}
<span class="nt"></p></span>
{% endif %}</code></pre></figure>
<p>If a document object has a <code class="language-plaintext highlighter-rouge">description</code> attribute, this prints it in a smaller, lighter font, truncated to 100 characters.</p>
<p>However, I then discovered that the document objects actually have an <code class="language-plaintext highlighter-rouge">excerpt</code> attribute which will generate a teaser based on the content itself, so I combined the two. Furthermore, I found out that you can simply add your own custom variables to the <code class="language-plaintext highlighter-rouge">_config.yml</code>, which will then be available via the <code class="language-plaintext highlighter-rouge">site</code> attribute, so I refactored the teaser length such that it is set in the main config as <code class="language-plaintext highlighter-rouge">quick_view_length</code>. And here’s the final result:</p>
<figure class="highlight"><pre><code class="language-html" data-lang="html">{% if post.description %}
{% assign desc = post.description %}
{% else %}
{% assign desc = post.excerpt | remove: '<span class="nt"><p></span>' | remove: '<span class="nt"></p></span>' %}
{% endif %}
<span class="nt"><p</span> <span class="na">class=</span><span class="s">"small light-font"</span><span class="nt">></span>
{{ desc | truncate: site.quick_view_length, '...' }}
<span class="nt"></p></span></code></pre></figure>
<p>Now it will print either a custom description, if you provide one (useful when the first paragraph is not very representative of the rest of the content), or the first paragraph of the page, which is Jekyll’s default <code class="language-plaintext highlighter-rouge">excerpt</code> style.</p>
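<p>For readers more at home in Python than in Liquid, the selection logic of the template above can be sketched roughly as follows. This is only an illustration, not how Jekyll or Liquid implement it: the function <code class="language-plaintext highlighter-rouge">teaser</code>, the dictionary standing in for a post and the helper <code class="language-plaintext highlighter-rouge">truncate</code> are all hypothetical. The helper assumes that the ellipsis counts toward the length limit, as it does in Liquid’s <code class="language-plaintext highlighter-rouge">truncate</code> filter.</p>

```python
def truncate(text, length=50, ellipsis="..."):
    """Shorten text so that the result, ellipsis included, fits into length."""
    if len(text) <= length:
        return text
    keep = max(length - len(ellipsis), 0)
    return text[:keep] + ellipsis


def teaser(post, quick_view_length):
    """Prefer a custom description, else fall back to the excerpt without <p> tags."""
    description = post.get("description")
    if description is not None:
        desc = description
    else:
        desc = post["excerpt"].replace("<p>", "").replace("</p>", "")
    return truncate(desc, quick_view_length)


# A post with a custom description: the excerpt is ignored.
with_description = teaser(
    {"description": "A short summary.", "excerpt": "<p>Ignored.</p>"}, 100
)

# A post without one: the excerpt is stripped of <p> tags and shortened.
from_excerpt = teaser({"excerpt": "<p>" + "x" * 200 + "</p>"}, 100)

print(with_description)
print(len(from_excerpt))
```

<p>The second argument plays the role of <code class="language-plaintext highlighter-rouge">site.quick_view_length</code> from the config.</p>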
<p>You can check out how it looks on the <a href="/">homepage</a>.</p>
Justus

Legal stuff
2015-04-17T00:00:00+00:00
http://justus.science/blog/2015/04/17/legal
<p>So, as far as I can see the site should now be pretty much up to where it used to be. I added an Imprint and a Disclaimer because apparently that is absolutely necessary here in Germany if one does anything vaguely journalistic and publishes content to a potentially wider audience.</p>
<p>I think it is a bit ridiculous that I have to publish my home address on the internet just because I want to run a small blog and website.</p>
<p>But it is there now. I placed it underneath the footer, looking good. It gave me a good opportunity to get the <a href="/legal/license.html">license</a> page off the header.
It was a bit too prominent there, but I still want to keep it around somewhere. I didn’t want to come off as if I were trying to boast about the fact that this content is CC licensed, but I do feel good about it and I want to give people the opportunity to discover it and potentially use the content (as soon as there is any).
After all, the idea is that (soon) I’ll actually start talking about interesting technologies again here, and perhaps someday someone might find something useful on these pages and want to pass it on. I’d like to give those future people the chance to do so.</p>
Justus