Learning bioinformatics skills through bio-hackathons

Oct 27th, 2025

science

In my last year as an undergraduate at Fulbright, I hoped for a career in bioinformatics. The Introduction to Bioinformatics course taught me about sequencing technology and how much it contributes to biological research, and it introduced me to a potential career in bioinformatics research. The problem was that, as a university student in Vietnam, where the infrastructure, job opportunities, and career prospects for computational biology are not yet developed, I didn’t know how to build up my skills or prepare myself for the job search after graduation. Luckily, I landed a small job at a bioinformatics company right after uni, but I soon learned that many people didn’t have the same resources or set-up required for a bioinformatics career.

The struggles of Vietnamese students

Natural science, in a sense, is considered a major suitable only for the “gifted”. In my beloved country, job prospects are a priority when choosing your undergraduate major. It doesn’t help that funding is limited, with higher priority given to the private sector.

To work in bioinformatics, you need either (1) a background in biology or (2) good programming skills, or a combination of both. In my case, I majored in Computer Science at Fulbright and was lucky to take courses related to bioinformatics, which developed my interest in pursuing a career in it. But I struggled throughout my 4-year degree, switching back and forth between learning computer science skills and familiarizing myself with bioinformatics practices. My school was very new (it’s only 7 years old as of now), so we didn’t have the computational resources needed to practice our skills, which is a must in this field given how gigantic the data can be. For reference, a simple tutorial I came across from the Sheffield Bioinformatics Core uses downsampled data of what originally could be up to a few GB for the 16 samples. That works for learning the tool, but I, on the other hand, believe that replicating the original result would be a great way for us to (1) check whether there are any mistakes in our workflow and (2) question the validity of the paper we are trying to replicate. Downsampled data serves well as a stepping stone, but in my opinion, truly understanding the authors’ intention might require us to work with the data as a whole (you can’t really see the full picture of a puzzle if you are missing a few pieces!).

Second, I believe bioinformatics is supposed to be learned as a group; let me explain. At my company, we are a group in charge of the downstream analysis of sequencing data. After sequencing, we receive a note about what kind of data it is (RNA? WGBS?), perform the analysis, and return the results to the US team. This is a struggle for me: even though I’m not the one in charge of this task most of the time, I find that it creates a misalignment between expectation and result. While I was doing an R&D project on detecting human papillomavirus 16 (HPV16) among other microbes using public studies, I learned that if you prepare an RNA library using poly-A selection, some information is going to be missing, because many novel viruses and bacteria don’t have poly-A tails. This led me to believe that, as a bioinformatician, you are supposed to work in a team setting, where you discuss and brainstorm with the wet-lab scientists to understand their processes, and you talk with the statistician on your team to choose the proper methods to analyze your data. Your team leader provides guidance and can bring you back down to the ground when things go astray. This is obviously lacking in a school setting, and it can cost students time.

The solution? Join a hackathon!

If there’s something I wish I had known in my university years, it would be the various competitions and hackathons that exist! I remember posting on r/bioinformatics years ago asking about the equivalent of ML/AI hackathons, but people told me that there were none. Then, after engaging with the nf-core community through my current job (plus the constant lurking on LinkedIn), I learned about some competitions out there, and I hope they will be helpful to you. By joining a hackathon, you normally get (1) sufficient computational resources, in the form of a virtual machine, to practice your craft without worrying about an OOM (out-of-memory) error in the middle of your work; and (2) a group leader to guide your project. The fun thing is, you can even extend your contributions after the competition, as people still want their tools maintained for free! (yay?).

So, without further ado, here are some competitions that I joined (or didn’t) that I think would be helpful to you:

1. nf-core Hackathon

nf-core is a big bioinformatics community revolving around building bioinformatics pipelines using Nextflow. They host hackathons multiple times a year so people can contribute to their existing pipelines. Most of them require registration and physical attendance, but I joined the Nextflow hackathon online back in March and contributed to a pipeline from EMBL-EBI’s functional genomics group.

2. St. Jude’s BioHackathon

St. Jude is a major children’s hospital in Memphis. They hosted a hackathon to develop internal tools for St. Jude’s scientists. Unlike the nf-core hackathon, their projects are more diverse and may lean a bit more toward the engineering and ML/AI side.

3. Open Targets Hackathon

I registered for the Open Targets hackathon in October this year, but it was a bit messy because I couldn’t contribute code due to restricted access (this deserves another post). Still, I saw multiple interesting projects related to the Open Targets platform and its data. So if you are interested in drug discovery, the Open Targets hackathon is the go-to, and I’m sure you will learn a lot.

4. Google’s Summer of Code

I believe that this might be the largest annual open-source contribution competition. A wide range of organizations are involved, with clear mentorship, guidelines, an application process, and even compensation. But instead of working for 3 full days on little sleep, GSoC spans the entire summer, so you get months of mentorship and code reviews, and you can even earn yourself a letter of recommendation if you do well! I myself did not get accepted to a GSoC project this year, but you might want to check out the projects from the Wellcome Sanger Tree of Life and prepare for next year’s application.

Final remarks?

As I find / join more hackathons, I’ll try to update this post as regularly as possible! Overall, if you cannot afford to join a research project, you might as well find alternatives to hone your skills.

Yours truly, Khoi.