I go through my lists and see a tweet from LangChain (a tool that we use) that talks about a particular innovation in how they perform evaluations of models. I send it to one of our data scientists who is working on a similar problem.
2pm. We’re Australian made, but we also have engineers spread across the globe, so we often have engineering stand-ups in the early afternoon. Today I join one of the teams I’m collaborating with and directly contributing to, and we move through problems, blockers and questions.
4pm. Most of my afternoons are blocked off for any core problems I need to work through. Today, I’m looking at the evaluation pipeline we’re building out for Hero AI. We want to have a deeper understanding of how our generative AI handles questions and answers at scale and how that changes when we make any improvements to the system.
When building out generative AI infrastructure, it’s difficult to gauge how any change impacts the overall system, so we like to be super data-driven in understanding all our changes.
5pm. I have a newborn, so I have a hard stop at 5pm, and I’ll usually then go and play with him. Tonight, I’ll log back on later to have everything ready for a meeting tomorrow.
Previously I would have just continued working a bit later to finish the work, as I don’t like stopping halfway through if I’m on a roll. I can’t do that any more. I’m a super new dad, so I’m still trying to find the right balance here.