https://www.youtube.com/watch?v=1UUYjd2rjsE

💡

A wise man said: “Write smaller posts, but more often”. And I said “no.”; so, here’s a multi-link behemoth (it’ll get better, I swear!).

Bugs Bunny No Meme Origins And History

Literally me

One thing - I’ll add own thoughts on where each area is moving (science/ventures) so we’ll be able to point fingers and laugh later!

Deepmind’s co-scientist

AICoScientist-1-Components

So, after Alphafold (protein folding prediction, one of the ‘unsolvable scientific mysteries’) and GNoME (discovering 2M+ materials, 300K+ of which are indeed worthy of future research), Deepmind has released an agentic system for scientific discovery. Contrary to the multitude of LLM-generated bullcrap that had predictably infested publishing houses, and even Stanford’s STORM (source code) that generates hypotheses and debates them.

Any results?

AICoScientist-10-RediscoveryTimeline

One of the fascinating ~~cherry-picked~~ examples is packing ~10 years of antimicrobial resistance research into 2 days of agent inference. The solutions were comparable or better than current SotA research on antimicrobial resistance.

Co-scientist is able to use Internet, but currently seems unable to automatically simulate/check the hypotheses. Still, this proves as an incredibly powerful tool - Deepmind’s at it again ❣️

Further developments (🧂)

~~Meaning “take this with a grain of salt”~~

Loosely based on this comment of mine:

Heuristic/metric automated ideation and development
Checking the hypotheses where possible, using the required tools (e.g. an exposed API to an affinity checker, etc.)
Generating the critic-generator-judge (later meta-judge, e.g. meta-rewarding LLMs) sub-agent structure
~~No, please no~~ Auto-publishing after human verification
~~No, no, no, no~~ Auto-reviewing other papers, e.g. LLMs-as-scientific-reviewers

And products on each and every stage.

Anyway, given the recent rise of LLMs-as-judges and other “let’s take this 0.5-12B param and finetune so it…” cases, we could IMO expect further productization of different LLM-based specialization areas. The old “do one thing and do it really well” adage applies here without any friction.

Graphs for scientific discovery

Launching in 3…2…1…Pew-pew-pew!

Recently I’ve stumbled upon awesome work by Prof. Marcus Buehler from MIT. Starting with fine-tuned models for materials science to more complex graph-based reasoning first. Being interested in graphs and neurosymbolic approaches (covered before), I’ve started diving ~~like a ravenous whale.~~

No alt text provided for this image

LLM approach was checked out first…

So:

a graph was created from 1000+ papers
calculating node metrics
devising optimal graph reasoning algorithms and techniques
a lot of weird similarities were found between materials’ properties graphs and, e.g., Beethoven’s 9th Symphony
using graph reasoning and principles from Kandinsky’s ‘Composition VII’ painting (that was weird, okay) to generate materials, incl. using DALL-E

Graph attention next!

Next frontier I loved was replacing the usual Transformer attention module with Graph Isomorphic Networks (paper, source code)

TLDR (explanation here, too)

the usual attention is Q, K, V - Query, Key, Value; nicely explained by e.g. Ebrahim Pichka, or an awesome (!) visualization by Brendan Bycroft, and another one. Multi-head latent attention as the ‘trendiest’ below:
it’s replaced by an graph isomorphism (to tell the truth, it’s not the only lab that pursues GTs) network-like structure (i.e. are the graphs of the same shape, and what is the difference)

Rethinking Attention with Performers

- each token is viewed as a node, attention score is edge weight
  - adjacency matrix (graph representation) is computed, and aggregated to calculate the output
  - for multi-head attention, each head computes an individual GIN adjacency matrix which are then concatenated
  - GIN introduces a sharpening parameter α (⬆️ α leads to sharper attention, while ⬇️ α produces a smoother distribution) to dynamically shift ‘attention focus’
  - sparse graphs are used for greater expressivity and lower computational overhead ~~(e.g. why put up a 0.000(…) weight why you can optimize)~~

💡

Attention mechanisms can be seen as dynamically evolving adjacency matrices, where learned relationships between input elements correspond to edge weights in a graph

Later on, it still seems like a multi-agent scientific reasoner (SciAgents), but using graphs as a world model:

We trained a graph-native reasoning AI, then let it think for days & discovered that it formed a dynamic relational world model on its own, no pre-programming. Emergent hubs, small-world properties, modularity, & scale-free structures arose naturally. The model then exploited compositional reasoning & uncovered uncoded properties from deep synthesis: Materials with memory, microbial repair, self-evolving systems.

Other things that caught my attention

Tokenized architecture - IMO any kind of parametric encoding/hot-swapping is the step in the right direction
Google Titans as a new memory paradigm (surprise-based, usage-retaining, dynamically reorganizing → all the features of real memory)
A generalized world model graph, or maybe specialized ones? WikiData seems OK for the start. Or Google’s.

Cortical Labs

Definitely some bio-inspired computing, or bio-computing altogether. Cortical Labs, I’m looking at you! How long until this computer learns to play DoTa…
Other emerging computing paradigms that may (!) lift off are:
- reversible computing (i.e. internally recycling the heat/energy generated by calculations), spearheaded by Vaire. A pretty awesome write-up here
- photonic chips and Lightmatter + recent MIT advances (“was able to complete the key computations for a machine-learning classification task in less than half a nanosecond while achieving more than 92 percent accuracy*”*). It’s a chip that uses photons instead of electrons for several improvements:
- the long-hyped quantum computing (explanation here → uses quantum phenomena to process information faster, e.g. quantum entanglement to calculate several states at once), where lots are battling - starting with MS with their recent Majorana release, to e.g. Rigetti with a 9-qubit (at the moment) chip, D-Wave and Quantum Computing

The landscape is pretty, pretty dense even with a 2-yr old chart:

The different types of quantum computers being developed : r/coolguides

Future directions (🧂)

Productized → on-premises → commodified “specific scientific discovery AIs” for one’s own purposes (i.e. one hooks it up to own simulation/heuristic systems to check hypotheses, working in collaboration with scientists, or at some point alone).
- Market: CAGR at a whopping 53.7%
- Possible strategy: backing companies leveraging the advances / backing the ‘boring’ enablers (APIs/products/security, etc.)
Goal/architecture-encoded “scientific AI” akin to TART, so it’s more generalized and adaptable
Fast-faster chips (one of the paradigms will blow up for sure), and IMO more specialized chips (e.g. programmable-encodable FPGAs for the particular task at hand, it seems) akin to Lisp machines of the past.
- The market is so big it’s indecent to mention - at least 1T by 2030 (will actually be bigger if we consider the emerging chip paradigm race)
- Possible strategy - backing several early-to-mid stage paradigms
Infrastructure/enablers, i.e. storage, data transfer, orchestration, energy storage and production, automation… You name it.

Welcome to Teleogenic❣️

Other places I cross-post (not always) to:

[AI in] Scientific discovery

Deepmind’s co-scientist

Any results?

Further developments (🧂)

Graphs for scientific discovery

LLM approach was checked out first…

Graph attention next!

TLDR (explanation here, too)

Recent paper + breakdown

Other approaches, e.g. JEPA

Other things that caught my attention

Future directions (🧂)

Comments

More from this blog

Brandolinization and incentives , ep. 2

Cockaigne and the missing utopia, ep. 1

Christs, Dinosaurs, and Xanax

SHIft happens!

Mucho grapho, mucho Cayley

Command Palette

Deepmind’s co-scientist

Any results?

Further developments (🧂)

Graphs for scientific discovery

LLM approach was checked out first…

Graph attention next!

TLDR (explanation here, too)

Recent paper + breakdown

Other approaches, e.g. JEPA

Other things that caught my attention

Future directions (🧂)

Comments

More from this blog