Monday, March 29, 2021

Limit CPU/Memory When using Docker

After having trouble with the Singularity Pull command on a computing cluster, I decided it might be easier to download the image via Docker to my home machine and upload it from there. I tried it, but Docker turned out to be such a huge resource hog that it rendered my computer unusable. Okay, the thing to do is to limit the resources Docker is allowed; but a quick search seemed to indicate that, while plenty of people had the problem, the only thing to do was to not use virtual machines. Now that is great advice and I wholeheartedly endorse it; but sometimes it's inevitable. Eventually I got to an obscure SuperUser post that had an answer with zero upvotes that pointed to a blog post that explained what to do:

https://itnext.io/wsl2-tips-limit-cpu-memory-when-using-docker-c022535faf6f

Summary: Limit the CPUs and memory available to WSL2 (Docker's underlying mechanism) via a config file. I'll want to remove those limits fairly soon, since I use WSL heavily, but until I get these few tasks completed, at least this will let me keep using my machine.
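For reference, here is a rough sketch of what that config file can look like, generated from Python so the values are easy to tweak. As far as I know the file is .wslconfig in the Windows user profile directory, with the limits under a [wsl2] section using the memory and processors keys. The 4GB and 2 values are placeholders, not recommendations, and render_wslconfig is just a made-up helper name.

```python
import configparser
import io


def render_wslconfig(memory="4GB", processors=2):
    """Return the text of a .wslconfig file that caps WSL2 resources.

    The default values are illustrative placeholders only.
    """
    config = configparser.ConfigParser()
    config["wsl2"] = {"memory": memory, "processors": str(processors)}
    buffer = io.StringIO()
    # Write key=value pairs without spaces around the "=" sign.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Print the text that would be saved as %UserProfile%\.wslconfig.
print(render_wslconfig())
```

After saving the file, WSL has to be restarted (wsl --shutdown from a Windows prompt) before the new limits apply.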



Thursday, February 25, 2021

Including config.h in every file in Visual Studio

Rather than explicitly including a header file in every source file, I like to use gcc's -include flag to add a config.h file, a file holding global configuration options that every source file may or may not be using. For a long time I didn't think that Visual Studio had a similar option, but I finally dug it out here:

https://docs.microsoft.com/en-us/cpp/build/reference/fi-name-forced-include-file?view=msvc-160

The flag is /FI and you can set it in Visual Studio by right-clicking the project, selecting Properties, then C/C++, Advanced, and setting the "Forced Include File" property. Convenient!


Tuesday, October 20, 2020

Julia on a multi-user system

Had occasion to install Julia on a multi-user system today. I downloaded the tarball to my own directory and ran Make. The instructions say that the install is fully contained in the single directory, so you don't have to worry about files being installed in different locations on the system. Once it finished, I moved the directory to a globally accessible location and tried it out. It mostly worked, but nothing about the package manager would run properly. Eventually I realized that as long as the directory I had originally built in still existed, the package manager worked, but if I deleted it, the package manager stopped working.

I deleted everything, recreated the directory in its final, globally available location, and ran Make again. Success! Apparently something in that compile process is looking to see what directory it is in and going back to it for data. I'd like to know what that is.


Tuesday, October 13, 2020

WSL permissions bits

Conflicts between Linux permissions and Windows permissions are a perennial problem for people who switch back and forth between WSL and Windows. One thing that helps is, when mounting a drive, to provide the -o metadata parameter to make sure that files have both Windows and Linux permission bits:

$ sudo mount -t drvfs g: /mnt/g -o metadata

Here's some good information about how WSL permissions work:

https://devblogs.microsoft.com/commandline/chmod-chown-wsl-improvements/
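
A quick way to see whether the metadata option is doing its job is to chmod a scratch file on the mounted drive and read the mode back. As far as I understand it, without metadata the Linux mode bits on a drvfs mount are mostly synthesized from Windows attributes, so an arbitrary mode won't stick; treat that as an assumption. The helper name chmod_sticks and the 0o640 test mode are made up for this sketch.

```python
import os
import stat
import tempfile


def chmod_sticks(directory, mode=0o640):
    """Create a scratch file in `directory`, chmod it, and report whether
    the filesystem reports back exactly the permission bits requested."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.chmod(path, mode)
        actual = stat.S_IMODE(os.stat(path).st_mode)
        return actual == mode
    finally:
        # Always clean up the scratch file.
        os.remove(path)
```

Calling chmod_sticks("/mnt/g") after the mount command above should return True if the permission bits are being stored.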


Thursday, September 03, 2020

Research Computing and Data Capabilities Model


The capabilities model allows an institution to evaluate how well it supports various data and research requirements.

https://carcc.org/2020/07/09/announcing-the-rcd-cm-2020-community-data-participation-window/

What I found most interesting is what it refers to as the "five facings":

Researcher Facing Roles

Includes research computing and data staffing, outreach, and advanced support, as well as support in the management of the research lifecycle.

Examples: Research IT User Support, Research Facilitators, CI engineers, etc.

Data Facing Roles

Includes data creation; data discovery and collection; data analysis and visualization; research data curation, storage, backup, and transfer; and research data policy compliance.

Examples: Research Data Management specialists, Data Librarians, Data Scientists, etc.

 

Software Facing Roles

Includes software package management, research software development, research software optimization or troubleshooting, workflow engineering, containers and cloud computing, securing access to software, and software associated with physical specimens.

Examples: Research Software Engineers, Research Computing support, etc.

 

Systems Facing Roles

Includes infrastructure systems, systems operations, and systems security and compliance.

Examples: HPC systems engineers, Storage Engineers, Network specialists, etc.

 

Strategy and Policy Facing Roles

Includes institutional alignment, culture for research support, funding, and partnerships and engagement with external communities.

Examples: Research IT leadership

Which, at the risk of sounding like a Myers-Briggs evaluation, seems to sum up nicely the important categories of staff in research computing.


Friday, August 21, 2020

Traversing a graph database with Gremlin

This is an invaluable tutorial on how to use the Gremlin query language to get results from a graph database. For some reason, all the internal links seem to be broken, but I think it's a ten-part series. Lesson six, on projection and selection, is particularly useful.

 https://www.datastax.com/blog/2017/09/gremlin-recipes-1-understanding-gremlin-traversals



Thursday, July 23, 2020

GDB in threaded code

GDB (https://www.gnu.org/software/gdb/) is a handy way to debug command-line applications. But when an application is running many threads, GDB doesn't by default follow a single thread, so as you step through the code it jumps between threads and it's easy to lose track of where you are. The solution is the scheduler-locking setting, which, when turned on, lets only the current thread run while you step.

(gdb) set scheduler-locking on

See here for details: https://sourceware.org/gdb/current/onlinedocs/gdb/All_002dStop-Mode.html

Sunday, July 19, 2020

Creating a Hyperbook in Microsoft Word

When someone creates a document they'll possibly set up a table of contents which conveniently links to the chapter headings they've created. They'll very likely provide hyperlinks to their sources or references so it's easy to go out on the web and find the sources. But it's pretty rare to provide handy links inside the document pointing to other places in the document - a hyperbook. Now, there's no reason not to keep providing outside links as well, but a good hyperbook is like a self-contained Wikipedia - lots of good information and lots of links to related subjects of interest directly on the page.

To insert internal links into a document in Microsoft Word, do the following.  On the Insert tab, there's a Links panel. Click that, then Link, then Insert Link. The dialog that comes up offers a variety of ways to insert internal links. Very nice for creating hyperbooks.

Thursday, July 02, 2020

Force-closing an SSH connection

On occasion when I'm using SSH to connect to a remote server, I run an application that hangs. If the terminal's running inside a GUI, you can always close down the entire terminal and restart it, but there's an easier way: hit Enter, then type "~." (tilde, dot). That force-closes the SSH session, leaving your terminal intact. I learned this from SuperUser:

https://superuser.com/q/467398

Friday, June 05, 2020

Downloading files from Google Cloud

In doing some testing with GATK 4, I found myself needing to download files from Google Cloud. Google Cloud likes to use URLs that start with gs://. For example, the URL for some tumor data is

gs://gatk-best-practices/somatic-b37/HCC1143.bam .

You can't just visit that URL in your browser though; or at least I couldn't. I had to install gsutil as described here: https://cloud.google.com/storage/docs/gsutil_install#linux . This is one of those weird installs where they provide a script online that you can run; a bit dangerous, but at least they don't ask for sudo. It downloads about a gazillion files then asks permission to muck with your settings. I said no, of course, and it gave me a couple of files to source if I wanted. One of them had to do with providing autocomplete, but the other one simply added a directory to the path, so I created an environment module to do that work. Now I can download the files I need:

$ gsutil cp gs://gatk-best-practices/somatic-b37/HCC1143.bam .
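
As an aside, for objects that are publicly readable, a gs:// URL can also be written as an ordinary HTTPS URL on storage.googleapis.com, which a browser or curl can fetch without gsutil. Whether this particular bucket allows anonymous reads is an assumption on my part, so take this as a sketch; gs_to_https is a made-up helper name.

```python
from urllib.parse import quote


def gs_to_https(gs_url):
    """Translate gs://bucket/object into the matching storage.googleapis.com URL."""
    prefix = "gs://"
    if not gs_url.startswith(prefix):
        raise ValueError(f"not a gs:// URL: {gs_url!r}")
    # Everything up to the first "/" is the bucket; the rest is the object name.
    bucket, _, obj = gs_url[len(prefix):].partition("/")
    if not bucket or not obj:
        raise ValueError(f"expected gs://bucket/object, got {gs_url!r}")
    # quote() escapes unusual characters but leaves "/" separators alone.
    return f"https://storage.googleapis.com/{bucket}/{quote(obj)}"


print(gs_to_https("gs://gatk-best-practices/somatic-b37/HCC1143.bam"))
```

This only rewrites the address. Private buckets still need authenticated access, which is what gsutil takes care of.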



Friday, May 15, 2020

GitHub offers successor options

GitHub is putting some thought into what happens to your repositories if you're "unable" to manage them - a kind way of saying if you die. Nothing I have would be of any consequence, but I'm certainly involved with some organizations whose repositories would need taking over. Here's how you name a successor for your repositories:

https://github.blog/changelog/2020-05-11-account-successors/

Thursday, April 30, 2020

Covid-19 infections by county, rate of increase

So this image shows the rate of increase in the number of cases on a daily basis. It's not as smooth as I would like.

Wednesday, April 29, 2020

Generating sequentially increasing values in C++

Say you need to generate a sequence in C++ consisting simply of the first few integers, such as 1, 2, 3, 4, 5.

With a little programming experience, you can come up with a dozen different ways to do this, but here's an obscure one:

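#include <list>
#include <numeric>  // std::iota
std::list<int> l(5);               // five value-initialized elements
std::iota(l.begin(), l.end(), 1);  // fill with 1, 2, 3, 4, 5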


According to CppReference, the function is named after the integer function ⍳ from the programming language APL.

The more you know.




Saturday, April 11, 2020

Command Line tips from CLI Magic

Tips for working with the command line from CLI Magic. I like this one to show the open TCP/UDP sockets, listening ports included, for the current user:

$ lsof -Pan -i tcp -i udp

https://www.patreon.com/posts/climagic-003-5-35703693

Wednesday, April 01, 2020

Covid-19 infections by county, over time

This is a choropleth visualization I put together of Covid-19 infections for each county over time. It's based on the NY Times data set, and I built the images in Python based on a very good, if ancient, tutorial I found at FlowingData.

Edit: Updated through Apr. 13 data, and also made the original larger. Not sure if that matters for this web page or not.

This image is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license. I would appreciate attribution if you care to use it!


Tuesday, March 31, 2020

BeautifulSoup4 in Python

A lot of the code out there that uses BeautifulSoup tells you to import it like so:

from BeautifulSoup import BeautifulSoup

If you've installed BeautifulSoup4, though, this won't work. The main module name has been changed to bs4. So change the code that does that to

from bs4 import BeautifulSoup

Please make a note of it.
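
If there are a lot of old scripts to fix, the change can be automated. Here's a minimal sketch that rewrites just that one import line in a string of source code; update_bs_import is a made-up name, and it deliberately ignores other forms such as a bare "import BeautifulSoup".

```python
import re

# Matches "from BeautifulSoup import" at the start of a line, accepting
# either capitalization of the old module name.
_OLD_IMPORT = re.compile(
    r"^([ \t]*)from[ \t]+[Bb]eautiful[Ss]oup[ \t]+import\b",
    re.MULTILINE,
)


def update_bs_import(source):
    """Point old-style BeautifulSoup imports at the bs4 module."""
    return _OLD_IMPORT.sub(r"\1from bs4 import", source)
```

Read the file's text, pass it through update_bs_import, and write the result back. Lines that already import from bs4 are left alone.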


Thursday, February 27, 2020

Pre-defined compiler macros

If you're working in a multiple-compiler environment, you may wish to get information at compile time about which compiler you're building with. Here's a list of macros defined by many compilers:

https://sourceforge.net/p/predef/wiki/Compilers/



Thursday, February 13, 2020

Take a newick tree and stre-e-etch the leaves to make it ultrametric

I needed to generate some ultrametric Newick trees for some simulations. There's a nice Newick tree generator online at http://trex.uqam.ca/index.php?action=randomtreegenerator&project=trex, but it doesn't have any way to make an ultrametric tree. (Ultrametric means that the lengths from root to tip are all the same.) So I put together a short script using BioPython that lengthens each leaf's branch so that every root-to-tip length matches the longest one.
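
The idea can be sketched without any dependencies. This is not the BioPython script itself, just a stand-in that works on a tiny hand-rolled tree; Node, tip_depths, make_ultrametric and to_newick are all made-up names for the illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One tree node; `length` is the branch leading to it from its parent."""
    name: str = ""
    length: float = 0.0
    children: List["Node"] = field(default_factory=list)


def tip_depths(node, depth=0.0):
    """Yield (leaf, root-to-tip distance) for every leaf under `node`."""
    depth += node.length
    if not node.children:
        yield node, depth
    for child in node.children:
        yield from tip_depths(child, depth)


def make_ultrametric(root):
    """Lengthen leaf branches in place so every tip is as deep as the deepest."""
    depths = list(tip_depths(root))
    deepest = max(d for _, d in depths)
    for leaf, d in depths:
        leaf.length += deepest - d


def _format(node):
    """Format one node (and its subtree) as a Newick fragment."""
    label = f"{node.name}:{node.length:g}"
    if node.children:
        return "(" + ",".join(_format(c) for c in node.children) + ")" + label
    return label


def to_newick(node):
    """Serialize the tree to a Newick string."""
    return _format(node) + ";"


# Example: ((A:1,B:2):1,C:1) has tips at depths 2, 3 and 1.
tree = Node(children=[
    Node(length=1, children=[Node("A", 1), Node("B", 2)]),
    Node("C", 1),
])
make_ultrametric(tree)
print(to_newick(tree))
```

Only the leaf branches are stretched; internal branch lengths are left untouched, so the tree's shape above the tips stays the same.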


How to delete empty rows in an Excel spreadsheet

How to delete empty rows in an Excel spreadsheet:

https://www.itsupportguides.com/knowledge-base/office-2016/excel-2016-how-to-delete-empty-rows/