LLMs as Tools, Not Crutches: How to Learn and Work with Them

    Whether you’re on board with them or not, LLMs are taking over our education system, workplaces, and everyday lives. I consider myself fortunate to have gone through the majority of my education without them. I can only imagine how many corners I might’ve cut during my intro CS and software engineering classes back in undergrad if I could’ve just asked Claude to fix my code. But today’s students face that constant neon glow in the back of their minds urging them to take the easy way out, and I suspect the same dynamic is creeping into the workplace as well. That’s exactly why it feels worth talking about how bioinformaticians can use LLMs not to sidestep the learning process, but to actually deepen our understanding of concepts and accelerate our workflows.

    In my experience, learning with LLMs is actually much easier than working with them. Being able to quickly get specific and generally accurate information while studying something new is incredibly valuable. This is especially useful when you have a main source of material in front of you, or when you just want to refresh yourself on something you learned years ago but haven’t had to recall since. I appreciate that they tend to break down concepts into bullet points, lists, and tables, which makes me happy as a believer in Gestalt psychology. With that in mind, I’ll share some tips I’ve gathered for learning with LLMs in a series of points at the end of this post.


    Learning with LLMs is one thing, but on the workflow side of bioinformatics they seem to be more of a curse than a blessing. Sure, they can write you a process in Nextflow or generate your figures in R, but if bioinformatics were that straightforward, then all you’d have to know is basic coding and command line syntax. I’ve found that for any sort of situational question, such as when to use a certain reference GTF file or what database to pull data from, LLMs are vastly inadequate. This leads me to wonder whether LLMs are useful at all when working with bioinformatics pipelines and omics data. The only situation where I’ve found success using LLMs in my bioinformatics work so far is in modifying plots, where it can sometimes be tedious to change color schemes or reshape labels and figure legends. That said, I am still new to the field, so if anyone has found success reliably using LLMs to speed up their pipelines, data analysis, or even decision making, I would love to hear about it.


    But I think at the end of the day, this issue with using LLMs to improve your workflow comes down to the fact that bioinformatics is not about the data, the coding, or even the tools you use. It comes down to the thousands of tiny decisions that shape your results. Choosing whether to include decoys in our reference genome, or how to interpret background noise in our ChIP-seq data, are not questions with one correct answer. They are situational decisions based on a cascade of other factors, too many to even get into. An article I read on the train one morning a couple of months ago has stuck with me. It said that a programmer should use LLMs the way a surgeon uses their tools. The scalpel can only do its job well in the hands of a good surgeon: they guide it, make the judgment calls, and take responsibility for every cut that they make. In bioinformatics, LLMs can suggest a certain method, flag a potential issue, or automate repetitive tasks, but the decision of what to trust, what to double-check, and what to discard still rests entirely with you. The models can amplify your efficiency, but they can’t replace the decision-making and intuition required to turn raw data into meaningful insights.



Tips for learning with LLMs:


  1. Always always always ask for sources: Treat the model as a guide, not the absolute truth. In this age of misinformation, it’s more important than ever to know where your information is coming from. Even better if you can pair the LLM response with your textbook or notes. 

  2. Ask for analogies and examples for concepts you struggle with: LLMs are surprisingly good at coming up with analogies for abstract concepts. Just yesterday I used Perplexity to help me understand eigenvectors. If you’re like me and don’t learn efficiently by reading papers and textbooks, then analogies and examples help a lot by giving you an idea of how the concept is used in a practical sense.

  3. Try teaching back: I’ve had success using this one to prepare for research presentations where I know the audience has the chance to grill me. Make sure to tell the model that you are going to explain a concept, and then ask it to point out gaps in your knowledge and suggest improvements.

  4. Know the right tool for the job: This can depend on personal preferences, but I’ve found that certain tools do some things better than others. ChatGPT tends to be the fastest and gives you the most free usage, so it’s my go-to for grammar checking these posts and little things like remembering R package commands. Claude is hands down the best for debugging code, and I use Perplexity for gathering in-depth information from academic journals.
