The problem with the cloud

latch.ai
2 min read · Mar 13, 2021

Widespread access to cloud computing has definitely changed the world. From the comfort of an office chair, software engineers can now write ML training loops, fire up AWS instances, and train their models across multiple GPUs. Services like AWS and GCP have further simplified this transition.

Despite these advancements, it is still unreasonably difficult for someone without software engineering skills to interact with the cloud. At Berkeley, we’ve seen students struggle every semester through needlessly complex configurations to train their first machine learning model on the cloud.

We’ve also noticed this problem in biotechnology, in both research and industry. Many cutting-edge methods in academia need powerful computers to run. Researchers are forced to learn cumbersome command-line tools (e.g., ssh, rsync, and scp) and Linux administration skills just to connect to clusters. Even seasoned software engineers in industry have described interacting with the cloud ecosystem as “too complex”.

Why do developers or researchers have to understand security groups, subnets, or virtual private clouds to spin up an AWS cluster? We don’t think they should. We want cloud computing to be as simple as running a job natively on your laptop. That is why we created ligand.

ligand is super straightforward:

  • pip install ligand
  • add a single line to the top of your script — ligand.init()
  • run your script as you would normally (you might be prompted to provide cloud API keys)

That’s it. After this, ligand will automatically containerize your code with the dependencies from your Python runtime, spin up an instance, rsync your files, run your code in the cloud, and send the output back to your terminal. It’ll also kill the cluster once the job is done to save you money.
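To make that sequence concrete, here is a minimal sketch of the order of operations in Python. It is not ligand's actual implementation: `FakeCloud`, `run_remotely`, and every method name are hypothetical stand-ins that only record which step ran. The point it illustrates is the teardown guarantee, with the instance terminated in a `finally` block so it is shut down even when the job fails.

```python
from typing import Callable, List


class FakeCloud:
    """Hypothetical stand-in backend: records steps instead of calling a cloud API."""

    def __init__(self) -> None:
        self.steps: List[str] = []

    def build_container(self) -> None:
        self.steps.append("containerize")

    def start_instance(self) -> None:
        self.steps.append("spin up instance")

    def sync_files(self) -> None:
        self.steps.append("sync files")

    def run(self, job: Callable[[], str]) -> str:
        self.steps.append("run job")
        return job()

    def terminate(self) -> None:
        self.steps.append("tear down")


def run_remotely(cloud: FakeCloud, job: Callable[[], str]) -> str:
    """Run `job` through the steps described above, always tearing down at the end."""
    cloud.build_container()
    cloud.start_instance()
    try:
        cloud.sync_files()
        # The job's output is returned to the caller, mirroring
        # "send the output back to your terminal".
        return cloud.run(job)
    finally:
        # Runs whether the job succeeded or raised, so no instance is left billing.
        cloud.terminate()


demo_cloud = FakeCloud()
demo_output = run_remotely(demo_cloud, lambda: "training finished")
print(demo_output)
print(demo_cloud.steps)
```

Swapping `FakeCloud` for a class that talks to a real provider would keep the same control flow; only the method bodies would change.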

You can easily change any setting you want, but you don’t have to configure anything. We automatically preset the settings to industry best practices, inferred from your code.

To recap, if your code runs well locally, that same code should run on the cloud with no extra effort. With ligand, spinning up an instance, running a computationally heavy job, and tearing the instance down (to save on compute costs) now takes a single added line of code.

Future plans for ligand include adding support for GCP and Azure, and then for arbitrary clusters.

It’s completely open-source. We want to build a community around powerful tools in bioinformatics, and ligand is only the first of many tools to come. We are open to suggestions, feature requests, and code contributions from across the world. Let’s build the DevOps tools that will allow bioinformaticians to focus on bioinformatics.
