Why I'm Building a Shipping Earnings Analysis Tool

Earnings calls contain valuable signals about where the market is heading. The problem is that extracting them takes forever. So I'm building a tool to do it.

The Problem

Every quarter, shipping companies publish earnings transcripts. CEOs and CFOs get on calls with analysts, discuss fleet utilization, charter coverage, rate expectations, and strategic direction.

These transcripts contain real signal. When Euronav's CEO says they're "cautiously optimistic" about VLCC rates, that's information. When Star Bulk's management deflects a question about drydocking schedules, that's also information.

The problem? Extracting these signals is mind-numbing work.

I've spent countless evenings reading through 30-page transcripts, Ctrl+F-ing for "rates", "outlook", "guidance", trying to piece together what management actually said versus what they implied. It's tedious. And by the time you've compared what three different tanker CEOs are saying, you've lost the evening.

What I Want

I want to ask simple questions and get sourced answers:

  • "What did Euronav say about VLCC rates in Q3 2024?"
  • "How does Star Bulk's fleet utilization compare to Golden Ocean's?"
  • "Is this CEO actually answering the analyst's question, or deflecting?"

That last one is especially interesting. Management teams have different communication styles. Some give direct answers. Others pivot to talking points. I want to quantify this—a "dodge score" that measures how often executives actually address the question asked.
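As a rough illustration of what a dodge score could look like: the sketch below labels each Q&A exchange and computes the fraction that weren't answered directly. The keyword cues and the term-overlap threshold are placeholder assumptions; a real version would use an LLM or a trained classifier, not this heuristic.

```python
# Toy dodge-score sketch. Labels each Q&A exchange as direct, indirect,
# or deflection. The cue list and 0.5 overlap threshold are illustrative
# assumptions, standing in for a real classifier.

DEFLECTION_CUES = ("as we've said before", "let me step back",
                   "what i can tell you", "big picture")

def classify_exchange(question: str, answer: str) -> str:
    a = answer.lower()
    if any(cue in a for cue in DEFLECTION_CUES):
        return "deflection"
    # Crude relevance check: does the answer reuse the question's key terms?
    q_terms = {w for w in question.lower().split() if len(w) > 4}
    overlap = sum(1 for w in q_terms if w in a)
    return "direct" if q_terms and overlap / len(q_terms) >= 0.5 else "indirect"

def dodge_score(exchanges: list[tuple[str, str]]) -> float:
    """Fraction of (question, answer) exchanges not answered directly."""
    labels = [classify_exchange(q, a) for q, a in exchanges]
    return sum(1 for lab in labels if lab != "direct") / len(labels)
```

Even this crude version makes the output concrete: a single number per call, comparable across management teams and quarters.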

I also want to track credibility over time. If a CFO says rates will strengthen next quarter and they don't, that should affect how much weight I put on their future guidance. A management scorecard, essentially.
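A minimal sketch of that scorecard: score each executive by how often their stated expectations came true, weighting recent calls more heavily. The decay factor, the neutral prior, and the boolean "call came true" record format are all assumptions for illustration.

```python
# Toy management-credibility scorecard. `calls` is an oldest-first list
# of booleans: did the stated expectation come true? The 0.8 decay and
# the 0.5 neutral prior are illustrative assumptions.

def credibility(calls: list[bool], decay: float = 0.8) -> float:
    """Return a 0..1 score with exponentially decaying weights,
    so the most recent call counts the most."""
    if not calls:
        return 0.5  # no track record yet: neutral prior
    weights = [decay ** (len(calls) - 1 - i) for i in range(len(calls))]
    return sum(w for w, ok in zip(weights, calls) if ok) / sum(weights)
```

The decay matters: a CFO who missed two years ago but has been right since should score better than one whose misses are recent.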

Why RAG?

The approach I'm using is Retrieval-Augmented Generation (RAG). In plain terms:

  1. Parse earnings transcripts into chunks
  2. Store them in a vector database with semantic embeddings
  3. When I ask a question, find the most relevant chunks
  4. Feed those to an LLM to synthesize an answer with citations
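The four steps can be sketched end to end. Everything here is a stand-in: the fixed word-count chunking, the bag-of-words "embedding" (a placeholder for a real embedding model), and the stubbed-out synthesis step are assumptions, not the actual implementation.

```python
import math
import re
from collections import Counter

# Step 1: parse a transcript into chunks. Fixed word-count windows here;
# a real system might chunk by speaker turn or paragraph instead.
def chunk(text: str, size: int = 50) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# Step 2: "embed" a chunk. A bag-of-words Counter is only a placeholder
# for a real semantic embedding model.
def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Step 3: retrieve the chunks most relevant to a question.
def retrieve(question: str, chunks: list[str], k: int = 3) -> list[tuple[str, float]]:
    q = embed(question)
    scored = sorted(((c, cosine(q, embed(c))) for c in chunks),
                    key=lambda cs: cs[1], reverse=True)
    return scored[:k]

# Step 4 (stub): the top-k chunks plus the question would be sent to an
# LLM, with each chunk's source attached so the answer carries citations.
```

With real embeddings, step 3 is what makes this more than keyword search: a question about "Capesize rates" can match a passage about "dry bulk spot market strength" even with no shared words.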

This isn't keyword search. If I ask "What's the outlook for Capesize rates?", the system should find relevant passages even if they don't contain the exact phrase. It understands context.

RAG also mitigates the "LLM makes stuff up" problem. The model answers only from what's actually in the transcripts, and it cites sources so I can verify.

Current Status

Right now I have a working prototype running locally: a handful of transcripts loaded, basic Q&A working. The answers are good—better than I expected, actually.

Next steps:

  1. Expand coverage: Load transcripts from all major public shipping companies
  2. Build a web interface: So I don't have to run queries from a terminal
  3. Implement the dodge score: Classify Q&A exchanges as direct/indirect/deflection
  4. Cross-company comparison: Surface contradictions automatically

This is my main side project right now. I'll post updates in the Writing section as I make progress.

Why Build in Public?

Two reasons.

First, accountability. It's easy to let side projects die quietly. If I'm writing about progress (or lack thereof), there's pressure to actually make progress.

Second, maybe someone finds this useful. Either the tool itself, or the approach, or just the idea that you can build useful things with publicly available data and off-the-shelf AI.

The shipping industry isn't known for tech innovation. Most analysis still happens in spreadsheets and email chains. I think there's room for better tools, and I'd rather build them myself than wait for someone else to.

Let's see where this goes.