Source a conda environment on the GenOuest cluster and start Shortstack
Creating the Shortstack script
Running nextflow rnaseq pipeline
Running the Nextflow rnaseq pipeline
Downloading FASTQ files from NCBI SRA programmatically
Often, one wants to download multiple FASTQ files from the NCBI Sequence Read Archive.
Computing sequence length from a FASTA file
Often, one wants to calculate the sequence length from a FASTA file.
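As a rough illustration of the idea, here is a minimal Python sketch (not necessarily the approach used in the post itself). It assumes a plain-text FASTA input where each record starts with a `>` header line; the function name `fasta_lengths` is hypothetical.

```python
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_id: length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Assumption: the identifier is the first word after ">"
            fields = line[1:].split()
            current = fields[0] if fields else ""
            lengths[current] = 0
        elif current is not None:
            # Sequence lines may be wrapped, so lengths are accumulated
            lengths[current] += len(line)
    return lengths


example = [">seq1 demo", "ACGT", "AC", ">seq2", "GGG"]
print(fasta_lengths(example))
```

For a real file, the same function can be called on an open file handle, for example `fasta_lengths(open("genome.fasta"))`, where the file name is a placeholder.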
Download the NCBI nt database
The NCBI partially non-redundant reference nucleotide collection (nt)
Get files when on crunchomics
To download files from a URL when working on crunchomics or any other HPC server:
Get file basename and create a new one in bash
Since I often do this operation in a for loop in bash, I thought it would be handy to store it here:
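The post does this in bash; as a hedged equivalent, the sketch below shows the same operation in Python. The helper name `derived_name` and the example file names are hypothetical, and it assumes the base name is everything before the first dot, which would truncate names that contain dots elsewhere.

```python
from pathlib import Path


def derived_name(path: str, new_suffix: str) -> str:
    """Strip the directory and extension(s), then append a new suffix."""
    name = Path(path).name            # drop the directory part
    stem = name.split(".", 1)[0]      # assumption: base name ends at the first dot
    return stem + new_suffix


for fastq in ["data/sample1.fastq.gz", "data/sample2.fastq.gz"]:
    print(derived_name(fastq, ".trimmed.fastq.gz"))
```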
Creating a ggplot2 plot with a fitted nls model
I recently helped a colleague add the curve from an nls model to their ggplot.
Online resources for scientific programming
During these corona days, I compiled and shared a list of useful online resources for learning programming and data analysis with R and Python. I hope it is useful for others as well.
How to set up ssh-keys to connect to a remote machine without password
I have done this so many times without ever remembering how to do it properly. So here’s my personal note on how to do it. I took inspiration from the GitHub guidelines.
Build a local Carpentries institutional framework
This is a speed-blog post written after a workshop held at the CarpentryConnect 2019 conference in Manchester.
Extracting one chromosome from a FASTA file with sed
You’ve downloaded your favorite (plant) genome, but you only want chromosome 1, for instance. Let’s suppose you’re working on tomato, so you need to extract everything between SL4.0chr01 and SL4.0chr02.
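The post itself uses sed; purely as an illustration of the same idea, here is a Python sketch that keeps every line from one FASTA header up to (but not including) the next one. The function name `extract_between` and the toy sequences are hypothetical.

```python
from typing import Iterable, List


def extract_between(lines: Iterable[str], start_id: str, stop_id: str) -> List[str]:
    """Keep lines from the start_id header up to, not including, the stop_id header."""
    selected: List[str] = []
    keep = False
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            if keep and seq_id == stop_id:
                break  # reached the next chromosome, stop reading
            if seq_id == start_id:
                keep = True
        if keep:
            selected.append(line.rstrip("\n"))
    return selected


# Toy genome with made-up sequences
genome = [">SL4.0chr00", "AAAA", ">SL4.0chr01", "ACGT", "TTGA", ">SL4.0chr02", "CCCC"]
print("\n".join(extract_between(genome, "SL4.0chr01", "SL4.0chr02")))
```

Stopping at the second header avoids reading the rest of a large genome file once the chromosome of interest has been collected.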
The Study Group adventure featured in the Amsterdam Science Magazine
Summary
I had the chance to write a small article for the Amsterdam Science Magazine that “aims to be a platform that displays the enormous creativity, quality, diversity and enthusiasm of the Amsterdam scientific community. It offers early career scientists (MSc students, PhD candidates, postdoctoral fellows), as well as more advanced researchers, the opportunity to communicate their latest and most interesting findings to a broad audience”.
I explained the origins and goals of the Amsterdam Science Park Study Group for a broader audience! I am grateful for this nice opportunity.
Homo biologicus informaticus
Summary
I was invited by the Graduate School Experimental Plant Sciences (EPS) to give a lecture on data management, scientific programming and good practices. While mostly aimed at bioinformatics, it is far from being restricted to genomics. The presentation was intended to give a good overview of what the newest generation of life scientists needs to learn.
You don’t have to be an expert in code to actually benefit from better code and data know-how and good practices!
Amsterdam Data Science
Following the publication of the Study Group paper in PLoS Biology, I was invited to draft a publication in the Amsterdam Science Magazine.
The Study Group stories! The PLoS Biology paper is out!
Summary
After nearly a year of intense brainstorming, our story on building Study Groups is out in the open. Sarah and I act as community leads in Madison, Wisconsin (USA) and Amsterdam (Netherlands) at our respective universities. Our communities of practice in scientific programming are affiliated with Mozilla Science Lab Study Groups, and they foster and support life scientists lost under a data avalanche. Communities of practice in scientific programming help to break the impostor syndrome, network with other researchers engaged in programming and data analysis, break the organisational silos that keep different fields and expertise separated, etc.
How to organise code-review sessions for researchers
I’d like to organise code-review sessions in the near future with the aim of helping scientists produce better, more efficient code. Having some documentation and testing would be nice too!
I’ve compiled a few great resources and tips from the web in this blog post.
Hope it helps you if you’d like to organise similar things.
Cheers!
Building the executive team of the Study Group
Yesterday, I organised a kick-start meeting to create a team of researchers that will lead the Study Group activities for the upcoming year.
With some experience and some further reading (see the other blog posts), I made a small agenda to convey the main information about the Study Group, its missions, values and what it takes to become a member.
I hope it helps others in organizing their own Study Group at their institution.
Cheers!
The Study Group Lead Roles
Taken from the Mozilla Science Handbook:
Sociocracy: a process to reach decision making
These notes are taken from the Wikipedia article on sociocracy, which helps in drafting a policy on decision making. The question is: what type of decision-making policy do we want for a local community of practice in scientific programming? If we stick to the community values (see related post), then it should be a democratic and respectful decision-making process. For now, there is no explicit structure, which means that we need to implement one.
Building a powerful community - values
I am still reading the great book by Michael Jacoby Brown, “Building Powerful Community Organizations”. This post is on the values that I want to convey through a to-be-founded Data Clinic at the University of Amsterdam.
Building a powerful community - recruitment
I am reading this great book by Michael Jacoby Brown entitled “Building Powerful Community Organizations”. This post is a way for me to shape my ideas around a to-be-founded Data Clinic at the University of Amsterdam.
Starting your own company in the Netherlands
After thinking about it for some time and after some discussion, I decided to start my own data analysis company and named it “biodata services”. The domain name www.biodataservices.nl is registered, but I still need to build the corresponding website and come up with a portfolio of the related activities!
Building your own community of practice
Together with a group of scientists, I have recently written an article entitled “Building a local community of practice in scientific programming for Life Scientists”. This manuscript is currently under review at PLoS Biology and is already available on bioRxiv here.
Personal Website
As I had been wanting to set up a personal website for a while, I decided to give it a go, following the nice example of Sarah Stevens and based on the Jekyll Now template.