I have some complaints about langgraph, and I should write them down to make sure they are reasonable rather than an emotional rant.
- frequent renaming, immature product, confusing docs:
    - LangGraph Platform was renamed to LangSmith Platform, and there is also "LangSmith Platform with deployment". I think it's confusing to name a product variant using "with".
    - `langchain` v0 became `langchain-classic`
    - `langgraph-prebuilt` became `langchain-agents`
    - I'm actually not sure whether the v1 docs are that much better, or whether I just suffered with the pre-v1 docs for too long.
- `config` vs `runtime`
    - it's really hard to figure out what the langgraph runtime is doing, especially when you print `config` during graph execution: it carries a bunch of runtime-related metadata
    - when defining nodes, should you use `config` or `runtime` as the second argument? If you use `runtime`, how do you pass `thread_id` when invoking the graph?
    - I have figured it out (only by reading langgraph's source code, which is not a pleasant experience); a sketch of the pattern is below, after this list
        - answer: you need to pass `context`
    - default config, graph-bound config, user-provided config: which one actually takes effect?
    - to get a "typed" config schema, I think `runtime` is the way to go
    - langgraph's documentation mixes `config` and `runtime` usage
- human-in-the-loop
    - not ergonomic, with a lot of bugs in early versions
    - the `interrupt()` API improves things a lot, but it still feels clunky (a minimal sketch of the flow is below, after this list)
    - actually, I'm not sure anyone else provides a better solution
- type hints in node definitions can break graph behavior in unexpected ways
    - there was a bug where, if you use `pydantic.BaseModel` as the graph state, values were returned without being fully validated / coerced. This is one of the most frustrating bugs and deserves more elaboration (the affected pattern is sketched below, after this list).
    - `RemoteGraph` doesn't work when there are non-primitive types in the graph state, because it cannot serialize them correctly using `orjson`.
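Here is how I ended up wiring the `runtime` / `thread_id` question above. This is a minimal sketch, assuming the langgraph >= 0.6 `Runtime` / `context_schema` API; the `Context` and `State` fields, the `greet` node, and the values are made up for illustration. The point is that the typed context goes in via the `context` keyword, while `thread_id` still travels in `config["configurable"]`.

```python
from dataclasses import dataclass
from typing import TypedDict

from langgraph.checkpoint.memory import InMemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.runtime import Runtime


@dataclass
class Context:
    # the "typed config schema": user-provided, fixed for the whole run
    user_name: str


class State(TypedDict):
    answer: str


def greet(state: State, runtime: Runtime[Context]) -> dict:
    # runtime.context is the typed context passed at invocation time
    return {"answer": f"hello {runtime.context.user_name}"}


builder = StateGraph(State, context_schema=Context)
builder.add_node("greet", greet)
builder.add_edge(START, "greet")
builder.add_edge("greet", END)
graph = builder.compile(checkpointer=InMemorySaver())

result = graph.invoke(
    {"answer": ""},
    # thread_id still goes through config["configurable"] ...
    config={"configurable": {"thread_id": "thread-1"}},
    # ... while the typed context is passed separately via `context`
    context={"user_name": "ada"},
)
```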
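And this is the `interrupt()` flow as I understand it, as a minimal sketch. It assumes a reasonably recent langgraph (the `interrupt` / `Command` API); the `ask_human` node, state fields, and thread id are illustrative, and exact return shapes differ between versions.

```python
from typing import TypedDict

from langgraph.checkpoint.memory import InMemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.types import Command, interrupt


class State(TypedDict):
    approved: bool


def ask_human(state: State) -> dict:
    # interrupt() pauses the run and surfaces this payload to the caller;
    # it returns whatever value the caller resumes with
    decision = interrupt({"question": "Approve this plan?"})
    return {"approved": decision}


builder = StateGraph(State)
builder.add_node("ask_human", ask_human)
builder.add_edge(START, "ask_human")
builder.add_edge("ask_human", END)
graph = builder.compile(checkpointer=InMemorySaver())  # a checkpointer is required

config = {"configurable": {"thread_id": "t-1"}}
graph.invoke({"approved": False}, config)   # pauses at interrupt()
graph.invoke(Command(resume=True), config)  # resumes; interrupt() returns True
```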
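For context, this is roughly the pattern the pydantic-state bug bit me on. It's a sketch of the setup, not a reproduction of the bug; whether the coercion actually happens depends on the langgraph version, and the `State` model and `bump` node are made up.

```python
from pydantic import BaseModel

from langgraph.graph import StateGraph, START, END


class State(BaseModel):
    count: int = 0


def bump(state: State) -> dict:
    # the node assumes state arrived as a validated State instance
    return {"count": state.count + 1}


builder = StateGraph(State)
builder.add_node("bump", bump)
builder.add_edge(START, "bump")
builder.add_edge("bump", END)
graph = builder.compile()

# "1" should be coerced to int by the pydantic schema before the node runs;
# the bug I hit was exactly this validation/coercion not happening reliably.
print(graph.invoke({"count": "1"}))  # expected {'count': 2} when coercion works
```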
I like pydantic-ai much more:
- the type of their `Agent` is generic in the dependency type `DepsT` and the output type `OutputT` (see the sketch after this list)
- the codebase is pleasant to read, and hacking together a customized solution is very doable
- `logfire` is nice as well
- their trace-based eval makes sense to me. I think that's the correct thing to do: extract features from your agent's OTel trace and go back to the traditional data science pipeline for agent evaluation (a toy sketch of that idea is also after this list).
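Here is what the typed `Agent` looks like in practice, as a small sketch. It assumes a recent pydantic-ai (the `deps_type` / `output_type` / `result.output` API); the `Deps` and `Answer` types, the tool, and the model name are all illustrative.

```python
from dataclasses import dataclass

from pydantic import BaseModel
from pydantic_ai import Agent, RunContext


@dataclass
class Deps:                 # plays the role of DepsT
    user_name: str


class Answer(BaseModel):    # plays the role of OutputT
    greeting: str


# The agent carries Agent[Deps, Answer] in its type, so tools and results are typed.
agent = Agent(
    "openai:gpt-4o",
    deps_type=Deps,
    output_type=Answer,
    instructions="Greet the user by name.",
)


@agent.tool
def get_user_name(ctx: RunContext[Deps]) -> str:
    """Look up the user's name from the typed dependencies."""
    return ctx.deps.user_name


result = agent.run_sync("Say hello", deps=Deps(user_name="ada"))
print(result.output.greeting)  # result.output is an Answer instance
```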
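And a toy sketch of what I mean by trace-based eval: flatten the agent's spans (however you export them from OTel / logfire) into a feature table, then use ordinary data-science tooling on it. The span fields here are entirely made up for illustration.

```python
import pandas as pd

# imagine these came from an OTel exporter or a logfire query
spans = [
    {"trace_id": "a", "duration_ms": 1820, "attributes": {"tool_calls": 3, "output_tokens": 412}},
    {"trace_id": "b", "duration_ms": 640, "attributes": {"tool_calls": 1, "output_tokens": 95}},
]

# one row per trace, one column per extracted feature
features = pd.DataFrame(
    {
        "trace_id": s["trace_id"],
        "duration_ms": s["duration_ms"],
        "tool_calls": s["attributes"]["tool_calls"],
        "output_tokens": s["attributes"]["output_tokens"],
    }
    for s in spans
)
print(features.describe())
```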