Source a conda environment on the GenOuest cluster and start Shortstack

Creating the Shortstack script

Running nextflow rnaseq pipeline

Running the Nextflow rnaseq pipeline

Downloading fastq files from NCBI SRA programmatically

Often, one wants to download multiple FASTQ files from the NCBI Sequence Read Archive.

Computing sequence length from a fasta file

Often, one wants to calculate the sequence length from a FASTA file.
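
As a rough illustration of the idea, here is a minimal Python sketch that computes the length of each sequence in a FASTA file. It is not necessarily the approach used in the post itself; the function name `fasta_lengths` and the example records are hypothetical, and it assumes the usual FASTA layout in which each record starts with a `>` header line.

```python
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence identifier: sequence length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The identifier is the first word after ">" (assumption: the rest is a description)
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            # Sequences may be wrapped over several lines, so lengths are accumulated
            lengths[current] += len(line)
    return lengths


# Made-up records, for illustration only
example = [">seq1 some description", "ACGT", "AC", ">seq2", "GGG"]
print(fasta_lengths(example))
```

For a real file, the same function can be called on an open file handle, for example `with open("genome.fasta") as handle: fasta_lengths(handle)`, where `genome.fasta` is a placeholder file name.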

Download the NCBI nt database

The NCBI partially non-redundant reference nucleotide collection (nt)

Get files when in crunchomics

To download files from a URL when working on crunchomics or any other HPC server:

Get file basename and create a new one in bash

Since I often do this operation in a for loop in bash, I thought it would be handy to store it here:

Creating a ggplot2 plot with a fitted nls model

I recently helped a colleague add the curve from an nls model to their ggplot.

Online resources for scientific programming

Because of these corona-days, I have made and sent around a list of useful online resources to learn programming and data analysis using R and Python. I hope it is useful for others as well.

How to set up ssh-keys to connect to a remote machine without password

I have done this many times without ever remembering how to do it properly. So here’s my personal note on how to do it. I took inspiration from the GitHub guidelines.

Build a local Carpentries institutional framework

This is a speed-blog post written after a workshop held at the CarpentryConnect 2019 conference in Manchester.

Extracting one chromosome from a FASTA file with sed

You’ve downloaded your favorite (plant) genome, but you only want chromosome 1, for instance. Let’s suppose you’re working on tomato, so you need to extract everything between SL4.0chr01 and SL4.0chr02.
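
The post itself uses sed; as an alternative illustration of the same idea, here is a minimal Python sketch that keeps every line from the header of the wanted chromosome up to, but not including, the next header. The function name `extract_record` and the short sequences are hypothetical; only the identifiers SL4.0chr01 and SL4.0chr02 come from the text above.

```python
from typing import Iterable, Iterator


def extract_record(lines: Iterable[str], name: str) -> Iterator[str]:
    """Yield the header and sequence lines of the FASTA record called `name`."""
    keep = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            # A new header starts a new record: keep it only if its identifier matches
            header = line[1:].split()
            keep = bool(header) and header[0] == name
        if keep:
            yield line


# Made-up sequences, for illustration only
genome = [">SL4.0chr01", "ACGT", "TTGA", ">SL4.0chr02", "CCCC"]
for line in extract_record(genome, "SL4.0chr01"):
    print(line)
```

Because the function reads line by line, it can also be given an open file handle, so a large genome never has to be loaded into memory at once.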

The Study Group adventure featured in the Amsterdam Science Magazine

Summary

I had the chance of writing a small article for the Amsterdam Science Magazine that “aims to be a platform that displays the enormous creativity, quality, diversity and enthusiasm of the Amsterdam scientific community. It offers early career scientists (MSc students, PhD candidates, postdoctoral fellows), as well as more advanced researchers, the opportunity to communicate their latest and most interesting findings to a broad audience”.
I explained the origins and goals of the Amsterdam Science Park Study Group for a broader audience! I am grateful for this nice opportunity.

Homo biologicus informaticus

Summary

I was invited by the Graduate School Experimental Plant Sciences (EPS) to give a lecture on data management, scientific programming and good practices. While mostly aimed at bioinformatics, it is far from being restricted to genomics. The presentation I put together was intended to give a good overview of what the newest generation of life scientists needs to learn.
You don’t have to be an expert in code to actually benefit from better code and data know-how and good practices!

Amsterdam Data Science

Following the publication of the Study Group paper in PLoS Biology, I was invited to draft a publication in the Amsterdam Science Magazine.

The Study Group stories! The PLoS Biology paper is out!

Summary

After nearly a year of intense brainstorming, our story on building Study Groups is out in the open. Sarah and I act as community leads in Madison, Wisconsin (USA) and Amsterdam (Netherlands) at our respective universities. Our communities of practice in scientific programming are affiliated with the Mozilla Science Lab Study Groups; they foster and support life scientists lost under a data avalanche. Communities of practice in scientific programming help researchers overcome impostor syndrome, network with other researchers engaged in programming and data analysis, break down the organisational silos that keep different fields and expertise separated, etc.

How to organise code-review sessions for researchers

I’d like to organise code-review sessions in the near future with the aim of helping scientists produce better, more efficient code. Some documentation and testing would be nice too!
I’ve compiled a few great resources and tips from the web in this blog post.
Hope it helps you if you’d like to organise similar things.
Cheers!

Building the executive team of the Study Group

Yesterday, I organised a kick-start meeting to create a team of researchers that will lead the Study Group activities for the upcoming year.
With some experience and some further reading (see the other blog posts), I made a small agenda to convey the main information about the Study Group, its missions, values and what it takes to become a member.
I hope it helps others in organizing their own Study Group at their institution. Cheers!

The Study Group Lead Roles

Taken from the Mozilla Science Handbook:

Sociocracy: a process to reach decision making

This post draws on the Wikipedia article on sociocracy to help define a policy on decision making. The question is: what type of decision-making policy do we want for a local community of practice in scientific programming? If we stick to the community values (see related post), then it should be a democratic and respectful decision-making process. For now, there is no explicit structure, which means that we need to implement one.

Building a powerful community - values

I am still reading the great book by Michael Jacoby Brown, “Building Powerful Community Organizations”. This post is on the values that I want to convey through a to-be-founded Data Clinic at the University of Amsterdam.

Building a powerful community - recruitment

I am reading this great book by Michael Jacoby Brown entitled “Building Powerful Community Organizations”. This post is a way for me to shape my ideas around a to-be-founded Data Clinic at the University of Amsterdam.

Starting your own company in the Netherlands

After thinking about it for some time and after some discussion, I decided to start my own data analysis company and named it “biodata services”. The domain name www.biodataservices.nl is registered, but I still need to build the corresponding website and come up with a portfolio of the related activities!

Building your own community of practice

Together with a group of scientists, I have recently written an article entitled “Building a local community of practice in scientific programming for Life Scientists”. This manuscript is currently under review at PLoS Biology and is already available on bioRxiv here.

Personal Website

As I had wanted to set up a personal website for a while, I decided to give it a go, following the nice example of Sarah Stevens and based on the Jekyll Now template.