ClawInstitute AI Science (Marinka Zitnik, Ada Fang): a public exchange for AI scientists and agent swarms (e.g. for protein engineering and scale-dependent context).

(It is one route that makes Cambridge, MA exciting for frontier science/AI again.) Many (e.g. https://x.com/bhalligan/status/2008989938935853300) have raised concerns about Boston losing its frontier talent.

https://x.com/AdaFang_/status/2033920328154681700

Some important figures, e.g. [Amber Liu, the founder of Orchestra Research, who partnered with Harvard-based Zechen Zhang](https://x.com/JIACHENLIU8/status/2034398199541317814), have expressed reservations about autoresearch and agent science; cf. "I Built an Auto Research Claw Too. I'm Begging You Not to Trust It." This is a LEGITIMATE worry (especially as the internet may soon contain more agent writing than human writing).

Compared with beach.science or even scienceclaw x infinite (https://x.com/ProfBuehlerMIT/status/2033832967542342021), the quality control (and level of detail) of ClawInstitute stands out, whereas beach.science and scienceclaw x infinite may have gotten too quickly impressed with some of their early examples.

But this lab is very different because Marinka Zitnik is rigorous in a way many are not (her lab has had really smart generalist/GNN systems biologists like Michelle Li and Ayush Noori), and the upside risk of this platform is way higher than any other effort I've seen.

[A lot of their historical research has been done on GNN representations of biological networks, which helps a lot with context and with applying special Michael Bronstein-style operators to the logic in GNNs, e.g. with PINNACLE. Agent swarms are an improvement to context/nuance, as are scale-sensitive GNNs. It's where scales interact (e.g. protein to molecule, or cell/tissue to protein) that translatable results get lost, or when TYPE-CHECKING what's hypothesized/simulated against proper biological measurements/readouts (this is what MBJ keeps trying to point out). ClawInstitute goes much further than any past effort.]

(Though with GNN representations, you can't guarantee consistent typing of interactions with "messy biology".)

This is not biology per se, but could still be interoperable (as agents are often interoperable).
https://x.com/pliang279/status/2034410589682839831

[From the Paul Liang lab, also in Cambridge.]

Also in Cambridge, Maria Gorskikh ("Building the Internet of Agents") is helping build join39.org and has many interesting thoughts (and is great at winning hackathons!). Again, not a biology background, but she (along with projectnanda) helps with the agent-ecosystem infrastructure (INCLUDING SECURITY), which may help make ClawInstitute even more credible.

All this agent-swarm research in Cambridge makes MIT/Harvard more exciting again (they were both late to the transformers revolution, and don't get SF's level of investment). But it seems they've really done well with agents now.

About GNNs (which the Zitnik lab is expert in, and where ClawInstitute, with the right infrastructure, could really shine):

A GNN over a curated graph works best when:

  • the node and edge types are meaningful,
  • uncertainty is represented rather than collapsed,
  • context dependence is not erased,
  • the ontology is flexible enough to handle borderline or mixed biological types.

That problem is especially acute in systems biology because “type” is often conditional, fuzzy, state-dependent, or scale-dependent. A cell state can be halfway between canonical categories. A protein’s role depends on tissue, binding partner, timing, perturbation, and assay regime. If the graph hardens these into neat bins, the agent gets a very elegant wrong answer, which is humanity’s favorite genre of mistake. The Zitnik lab’s recent work and news blurbs also point toward multimodal, contextual, and single-cell/spatial modeling, which suggests they are not treating biology as a static clean ontology problem.

My best synthesis is:

Why this could work unusually well

  • GNNs and knowledge graphs give agents a structured action space.
  • Biomedical tasks reward explicit tool use and retrieval.
  • Multi-agent review loops are a better fit for science than single-shot generation.
  • Zitnik’s group has a track record in representation learning for biology, not just generic LLM enthusiasm.
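The review-loop point can be sketched in a few lines. This toy (every critic and hypothesis field is my own hypothetical construction) shows the structural difference from single-shot generation: a claim only survives if independent critics, each checking a different failure mode, all pass it.

```python
# Toy sketch (all names hypothetical) of a multi-agent review loop:
# a hypothesis survives only if every independent critic passes it.

def critic_has_readout(hyp: dict) -> bool:
    # Does the claim name an experimental measurement that could falsify it?
    return hyp.get("readout") is not None

def critic_cites_mechanism(hyp: dict) -> bool:
    # Does the claim come with a mechanistic story, not just a correlation?
    return bool(hyp.get("mechanism"))

CRITICS = [critic_has_readout, critic_cites_mechanism]

def review(hyp: dict) -> tuple[bool, list[str]]:
    """Run every critic; return pass/fail plus the names of failing checks."""
    failures = [c.__name__ for c in CRITICS if not c(hyp)]
    return (not failures, failures)

single_shot = {"claim": "gene A drives phenotype B"}  # polished, unchecked
reviewed = {"claim": "gene A drives phenotype B",
            "mechanism": "A binds promoter of B",
            "readout": "CRISPRi knockdown + RNA-seq"}

print(review(single_shot))  # (False, ['critic_has_readout', 'critic_cites_mechanism'])
print(review(reviewed))     # (True, [])
```

The failure mode in the next list follows directly: if the critics are shallow (e.g. they only check that fields exist, as here), the loop converges on polished mediocrity rather than rigor.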

Where it could still fail

  • Ontologies may discretize away biologically important ambiguity.
  • Tool outputs can create false confidence if not tied to experimental design.
  • Agent societies can converge on polished mediocrity if review loops are shallow.
  • “Autoformalization” can be most seductive exactly where biology is least formalizable.

So I’d phrase it like this:

The promise is not that agents magically solve biology. It’s that in domains where there already exists a rich ecosystem of graphs, ontologies, databases, assay outputs, and mechanistic priors, agents can become unusually effective navigators and hypothesis-combiners. The key bottleneck shifts from “can the model reason at all?” to “does the representation preserve the weirdness of the biology instead of laundering it into tidy graph objects?”

Agents become more useful when they can reason over partially structured biological worlds, but those same structures can silently erase the cross-scale ambiguity that matters most for translation.