Welcome, everybody, to the Thoughtworks Technology Podcast. My name is Scott Shaw, I'm one of the hosts of the podcast, and today we're going to talk to some friends of mine here in Melbourne, where we all live. The topic is going to be their new book, which is called Effective Machine Learning Teams, and I think that's particularly relevant in today's technology landscape. I'd like to have each of the participants introduce themselves and tell us a little bit about themselves and the perspective they bring to the book, but first of all I want to introduce my co-host, Ken Mugrage. Ken, do you want to introduce yourself and tell us a little bit about yourself? Sure. Hi, everybody. I'm one of your other regular hosts, and the only person on this team who's not only not in Melbourne but not even on the same continent, actually not even the same day, so we're greeting you from the future. Okay, so if you all would start: David Tan, why don't you go first? Cool. Thanks for having us, Scott and Ken. So, I'm
David Tan. For deduplication purposes between David Colls and David Tan, I'm David, and Dave has generously become Dave for our collective, to reduce the confusion. I'm a lead machine learning engineer with Thoughtworks, and I've been with Thoughtworks for seven years. Through this time I've been fortunate to work on data and AI systems, helping clients build various machine learning models, at one time building an ML platform, and I'm really happy to be here to chat about Effective Machine Learning Teams. And what unique perspective do you think you bring to the book amongst the three authors, David? Yeah, I love engineering; it's something I can touch and feel and deploy and write and commit, so in this book I bring a lot of that ML engineering perspective, in the intersection between software engineering and ML systems. Great, thanks. And Ada, why don't you introduce yourself next? Hello, my name is Ada, and no, I'm not named after Ada Lovelace; that's just a random fact about myself. I've been
at Thoughtworks just over six years. My role is senior business analyst, or product owner, or scrum master; those are the types of roles where we go and work with clients, where we bring people together. I think that's where my specialty is: bringing people together, building that shared language, but also delivering value and showing that we can deliver quickly, so de-risking delivery. Thanks. And Dave Colls, why don't you go now? Thanks, Scott. Yes, nearly a palindromic triad for the three authors, but I'm going by Dave for disambiguation. I've been at Thoughtworks about 13 years, and in that time I've worked in almost every aspect of the business and every service we offer to clients, from project management to business analysis to infrastructure to data science. In the last seven years I've been building the data practice in Australia, and in that regard my focus has been not necessarily on the detail of technical practices or data, but on creating environments for teams to be successful, and I think that's the perspective I bring to this book: what can we do in the wider organization to create a successful environment for ML teams? This is the first time I've heard the phrase palindromic triad used on one of the technology podcasts, so thank you for that; I'll be looking it up later. So, let's get started. How did you come to write this book? It's a big endeavor; writing a book is a big job, one that I would be daunted to undertake, and I wonder, how did you come to
write this, and why a book? Yeah, I think it was a series of events. It started with a blog post in 2019, just as I was finishing off a project; it was a blog post I wrote about refactoring Jupyter notebooks on that project, and it kind of blew up on Twitter. Someone in our community shared it and there were, I don't know, 800 hearts, which to me was a lot, the most I ever got on Twitter or X. Then it became a repo that had six or seven hundred stars, and once or twice we ran it as a workshop at Thoughtworks Singapore with AI Singapore. The feedback we got from folks who attended was: this intersection between ML and software engineering isn't talked about enough; how do you test models, how do you refactor notebooks? So we got those kinds of feedback points and said, okay, we're on to something here. Then we eventually put together a book proposal, O'Reilly accepted it, and it's been a journey of
fleshing out those points: how do you build reliable ML systems, how do you get feedback, how do you apply product thinking, how do you test your ideas with users. That's how it came to be a book. And Ada and Dave, how did you get involved? David reached out to me and said, hey, would you be interested? And I said, yeah, why not; I think that's always my default if I haven't done something before. The previous writing experience I'd had was publishing a Thoughtworks blog on slicing data stories, so it was a big jump, but at the same time it's been such a hell of a ride, distilling my thoughts and moving from implicit knowledge to something that is shareable, so the journey has been quite rewarding as well. Yeah, there's been so much written about machine learning technology, and in my experience things don't actually get done unless you have an effective team putting it together, so I'm really happy to see that perspective represented here. Yeah, and I think that was exactly how I got involved in the book. Having had the good fortune to work with David and Ada in the past, and seeing the reception that David's Coding Habits for Data Scientists and other resources had generated, I really wanted to highlight that it wasn't just about an individual contributor perspective: the practices and tools he was advocating were really there to enable whole teams to deliver effectively. So I'm curious: you each had a lot of experience
in your own particular roles. How different was what you thought you knew versus what you learned during the writing process? Sometimes you think you've got a good handle on things, then you start researching and, oh my gosh. Was that a big leap, or was it incremental? Did it just kind of bounce off each other? What was that like? Yeah, I think that's a great point, and the story I've been telling people is that I started writing this book with Ada and Dave thinking, wow, I know so much about machine learning from past projects, I want to tell the world about it, and through writing I realized, oh, there's so much I don't know. Through that literature review, through testing ideas, whether on a project or with our technical reviewers (we had a panel of reviewers with experience in other domains and ML technologies), it's been a lot of growth, a lot of learning. Even in the literature review, one of the books I read was Atlas of AI by Kate Crawford, which was a whirlwind tour of responsible AI in application; there's just so much in any of those subtopics or branches in the domain of ML. So yeah, writing definitely helped make, at least personally, my knowledge more robust. I was thinking about how we've collectively been on different client engagements, and we've been on ones together. Was it a big leap? It felt like it was, in the sense that the book was one engagement, and do we have
the experience from end to end? But equally it wasn't, because we've collectively been on different types of client engagements: fast experimentation with ML, productionizing an ML PoC that was just thrown into production. So in some ways it was just collecting and rehashing our memories, going down memory lane together. And there's an aspect of it that is testing and extending what we know. We had an idea for a chapter about automated testing for ML systems, and through writing it we found: okay, yes, you can write automated tests for an ML model using a global metrics test, but then we discovered the hidden stratification problem, where maybe the model is 70% accurate overall, but for a particular subclass it's totally rubbish, maybe 40% accurate. So through testing, and through checkpointing our knowledge against the working example in the book, a code repository, we refined how you fully test ML systems and monitor them. I would recommend blogging or checkpointing; it's been a really helpful exercise for consolidating that knowledge. So can we assume that the fictional scenarios depicted in the book are things that really happened, that you've observed on real projects? Often they're amalgams of different things as well, so yeah, they're based on a true story, as it might be said. Yeah, and in our book we follow the arc of this fictional character called Dana,
who's a data scientist and ML engineer. Somewhere in the book we say the character is fictional but the pain is real. We follow her, and throughout it's basically versions of a bad day for myself, in various scenarios. There's a story in there about deployment day, release day: we have a lunch party organized, we're going to celebrate the release of this new ML model, and as she's there with the infra engineer trying to deploy this thing, the deployment fails, and before you know it, it's all troubleshooting, it's 7:00 p.m., and they have to go to the lunch party feeling kind of guilty, because actually the ML model's deployment failed. So we tried to bring our experience to life in this book through these kinds of story snippets that hopefully people can relate to. Is there an overall message? Are there a few key takeaways people are going to get from the book? I think, for me, it's about
great ML products requiring effective teams to build them. When you look at team effectiveness, and the way we've broken down the book: in broad terms, we've broken it into sections that focus on building the right thing, so understanding the right business problems to solve, where you have access to the required data and where an appropriate ML paradigm exists for a successful product to be built. Then we focused on building the thing right, so the practices that will enable you to manage complexity and maintain responsiveness in delivery. And then the third element has been building it in a way that's right for people. As we've described, a lot of those stories indicate the pressure and difficulty teams can have working in highly complex environments with many different demands, so we're looking at what we can do to relieve the cognitive load on teams, to create safety nets for delivery, and also to align their work with value streams within the organization, so that they're not blocked on getting value to customers. I know I've seen, especially in the early days when organizations first started embracing machine learning, that if you were talking about a machine learning team, you were talking about a team of data scientists, and they always kind of sat separately from the main engineering department and the product people. I'm assuming you're talking about cross-functional teams here, teams where data science is embedded into a larger product stream. Do you see that changing in the industry? Yes, absolutely.
I think data scientists are by nature very curious creatures, and people like to understand the bigger picture of why, not just how, a model is put together. The answers to why usually sit with product owners or business analysts or business-shaped folks with that context, who are close to the end consumer or customer, and that can be an internal or an external user. I think that contextualizes it and makes it human, in terms of how usable, and therefore how valuable, what we're producing is. We do see the added, increasing pressure of return on investment; there's a lot of investment going into these spaces in organizations, and being able to demonstrate it in a showcase where you can see the impact, not just showing a feedback loop and the data being collected and how it's being retrieved, I think that's what gets everyone else in the organization interested and on board with ML-driven products. So, a question: 2023 was the year of the hype around generative AI
and so forth, and that's not going away anytime soon, but setting that aside, what percentage of projects would you say include some sort of machine learning? My impression is it's very, very high, but I'm curious what your real-world experience is. Yeah, without putting an exact figure on it, very, very high from our experience as well, and in different aspects, in that ML might be incorporated purely as a service into a digital product, from a third party or from another part of the organization, as a starting point. So it's still useful for teams to understand what goes into building ML services, how they might fail, and how to properly test products that incorporate ML services, right through to in-house-built ML products, where proprietary data, training processes, and deployment of the models are all taken care of in house. So yeah, we're seeing it in a very high proportion of products, especially as organizations move to provide more convenience features in digital products; a lot of those convenience features are driven by ML. And again, insights: if we look at digital marketplaces, often we're looking to provide insights to both sides of the marketplace, and a lot of those are driven by machine learning as well. So we're seeing the application spreading, and then we've also got the generative AI wave now, in those cases mainly integrated in that first mode, as a service, but as the tooling improves and we're able to build smaller, more dedicated models, we'll probably see more of that coming in house as well. Yeah, my read of the ML landscape today, post the 2023 hype or boom of generative AI, as you mentioned, Ken, is that I see a long tail. If you plot on a y-axis the number of ML practitioners, like engineers or data scientists, if you do a count of
that, I think it's a long tail of orgs with perhaps one or two, maybe three, data scientists or ML engineers. Then there's the chunky middle, where perhaps you've evolved and become more mature, and you have a full data science team servicing multiple varied needs; as the others mentioned, maybe it's insights, maybe it's language models, maybe it's personalization. And then there's the third part, the very few but deep organizations with multiple data science teams; I've heard numbers of 100-plus data scientists, or ranging from 20 to 100. So all the more, it makes it important, in order to reduce the frustration of practitioners who have hopes of these ML products going to market, that these product and engineering and team topology practices help these teams move towards a goal, whether that's building a recommender system or building an insights report using advanced analytics. It's very common to see a team cycling through PoC after PoC that just isn't good enough or doesn't have a market. PoCs are good, a great way to test ideas, but over time they make practitioners quite frustrated when you don't see things going out into the market. So we talk a lot about practices to test these ideas early, so that when you start to really build and productionize something, it's something that's fit for market. Does it have product-market fit? Does it have problem-solution
fit? And how do you build that rapidly using engineering best practices? You mentioned something there, David, that I wanted to circle back on, and that is the shared goal. Again, you asked how much gen AI we're seeing out there: we see a lot, but the risk is also high in terms of whether we're really getting a return on investment, because you have a lot of the tools there, but, what's that saying, if you have a hammer, everything looks like a nail. If we're not clear on that shared goal, and we're not clear on what customer problem or value we're trying to create, or capture because of market competition, then we see what David mentioned: lots of PoCs just cycling through and forever being discoveries. That's not always a bad thing; I think the statistic from big tech is that something like 95% of PoCs fail, and it is important to keep going, but it's equally important to have a shared goal and vision, and a clear idea of what kind of problem or value we're trying to solve. If I may, I'll come back to the question on team shapes as well: is it a multidisciplinary team, or is it a group of data scientists? I think in the early days of data science and machine learning exploration in organizations, they saw data science as the key problem to solve, and it certainly is one of the key problems in many ML teams, but while it's
necessary, it's not sufficient. We also need a business problem, a product lens, that's another element, and we need to be able to connect insight or intelligence to action in organizations as well. So we might be working with teams of data scientists who could build great predictive models, but it wasn't possible for those models to actually influence a customer experience in deployment in the organization: there wasn't an opportunity to deploy into production, or production systems were unreliable to the extent that, say, offers weren't being fulfilled, and so the value of those better-targeted offers wasn't being realized. So the question comes back to: how do we solve this with a team construct? That's where Thoughtworks' heritage has been, around delivering value end to end in short cycles, and there's established literature around the innovation potential of multidisciplinary teams as well. If you're looking to do something new, that's where a multidisciplinary team really shines, and that's kind of been our default unit. But one of the great things that came out of the book was some time and space to explore the various team shapes for machine learning and what role they might play in a larger enterprise. We've still been able to articulate what a team shape that's mainly composed of data scientists looks like, where it might play a role in a larger enterprise, and how you make that team effective. You
mentioned product thinking and building products and the importance of understanding the customer. What else does it take to make a great machine learning product? Yeah, what does it take to make a great product? Product development is hard. I think it was Henrik Kniberg from Spotify who said product development is hard and most products fail, and that's the reality of it. In ML, I suppose, we're reminding people that shiny hammers looking for problems don't always work; most of the time, if it's not solving a real problem, then the return rate of your users is not going to be great. You may get a high hit rate at the start, but it's not a solid or defensible product. One aspect we talk about is that lean discovery approach: focus on what the problem is, and it may turn out that it's not ML, or it may turn out that it is. Regardless, it starts with a clear understanding of who our customers are and what their pain points are. We talk about some techniques like user journey mapping and customer research, maybe the plain, simple act of talking to our users, which organizational bureaucracy sometimes just stands in the way of. But every time in my career we've had the chance to actually go talk to users, or have one of our teammates go understand the users' needs, it has been very rewarding in terms of learning what their pains are and how our efforts align to that.
So, to the question of what it takes to build a great product: the short answer is that it's a hard problem, and we've just got to trust the process, go talk to customers and users and build for that, rather than find ways to use our shiny hammer of AI. When I think about what makes a great product, or ML-driven product: one thing is that the team, and hopefully it's a cross-functional team, sees itself as a customer as well, and I think that really builds the mindset of the customer. And if the product is being used, early indicators could be that we're discovering new things about our customer that we didn't know before, because we didn't anticipate them. I think those would be really cool indicators of what makes a great product, because the ML-driven aspect is automating something really quickly, or learning something really quickly, and then being able to suggest another decision or assist the end user
with something that they may not have been able to do before. I think that's where great product innovation comes through: we're discovering something new and unexpected, and that is a great success indicator. It seems like experimentation is a bigger deal with these kinds of data-driven machine learning projects. I know experiment tracking, and being able to experiment in a methodical way, is part of it. Do you cover that in the book? Yeah, absolutely. In the context of continuous delivery and the path to production, experiment tracking is absolutely a critical part. There was one project we were on where, when the builds were green, we had high test coverage and model quality tests, and then we had an experiment tracking dashboard. Imagine a data scientist has a pull request with multiple commits and can just go to the experiment tracking dashboard and say, yeah, commit 698 was better than the one in production. That gave us the confidence to deploy to production, knowing it was a better-quality model than before.
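The commit-by-commit comparison described here can be sketched in a few lines of plain Python. This is an illustrative toy, not the book's code; the names (`ExperimentTracker`, `log_run`, `should_deploy`) and the metric values are invented for the example:

```python
# Minimal, illustrative experiment tracker: one record per pipeline run,
# keyed by commit hash, so a reviewer can compare a PR's commits against
# the model currently in production. (Hypothetical sketch, not the book's code.)

class ExperimentTracker:
    def __init__(self):
        self.runs = []  # each run: {"commit": str, "metrics": dict}

    def log_run(self, commit, metrics):
        self.runs.append({"commit": commit, "metrics": metrics})

    def best_run(self, metric):
        """Return the logged run with the highest value for the given metric."""
        return max(self.runs, key=lambda run: run["metrics"][metric])


def should_deploy(candidate_metrics, production_metrics, metric="accuracy"):
    """Promote only if the candidate beats what is already in production."""
    return candidate_metrics[metric] > production_metrics[metric]


tracker = ExperimentTracker()
tracker.log_run("commit-697", {"accuracy": 0.78})
tracker.log_run("commit-698", {"accuracy": 0.83})

production = {"accuracy": 0.80}
best = tracker.best_run("accuracy")
print(best["commit"], should_deploy(best["metrics"], production))  # commit-698 True
```

In practice a hosted tracking tool would fill this role, but the deployment gate is the same idea: the dashboard answers "is this commit's model better than production's?" before anything ships.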
So yeah, experiment tracking is absolutely part of the path to production for an ML model. To be honest, I think ordinary, non-machine-learning software engineering teams ought to do more experimentation. I've seen examples where that led to really effective outcomes, and maybe we could take some of these concepts and apply them back to software delivery in general. Yeah, and we have borrowed a term we call dual-track delivery from user experience: simultaneous user experience research and product development. The data science research looks similar, from a delivery perspective, to that user experience research: you start with hypotheses and then you go and test them, and you're either successful or you're not, but in the process you need to learn something, keep track of that learning, and exploit the upside of that learning in the delivery track, where you're producing stuff rather than knowledge. So I think it's a really effective model anywhere you're dealing with unknowns or ambiguity or uncertainty in delivery. I couldn't agree more. I'm doing dual track at the moment, and it's kind of nice as a team that you're not just doing your BAU or a long-term strategic feature; you've got this other track of test and learn, and by the time you get near the end, you know enough and have enough confidence that the following investment to do the implementation is worthwhile. Obviously there's only so much context switching one likes to do in a day, but I think it gives a breath of fresh air in terms of the portfolio of work a team can do. Yeah, and on top of that, on the topic of experimentation, I think it's also a function of the level of trust and safety within the team. Sometimes it's awkward or hard to say an experiment didn't pan out, but in a high-trust, high-safety team we can kill ideas very quickly: we tried this, it's not giving us the results
we want, we can bin it, and that's okay; we're all still learning rapidly. It's about how fast we can kill bad ideas, how fast we can learn from them. I remember moving to my first apartment years ago, before I had a kid; it was a new place and we didn't have a rubbish bin, and two weeks in it was like, oh man, there's so much trash everywhere. Not to say ideas are trash, or products are trash; I'm saying, in the analogy of this apartment, the inability to chuck things away led to clutter in my home, things lying around. So I think experimentation, as you mentioned, Scott, is spot on: how quickly we can experiment, how safely we can say this PR didn't pan out, let's just shelve it, and that's okay. David, it's almost like a discipline as well. We said experimentation can be fun, but it can actually be quite a hard discipline for some people, because we can go down to technical components, you know, a Lambda versus AKS versus Databricks and what we want to use, but the hypothesis is the overarching thing: what's our first end-to-end, what's the thin slice we're trying to achieve? That conversation in the team, prioritizing which hypothesis test or technical spike we want to play first, really does matter, because we're trying to be smart with our time as well. Yeah, this is exactly what David said about trusting
the process: when the outcome's uncertain, you need to look at leading measures, such as do we have a good set of experiments, are we able to conduct them quickly, and can we exploit the upside quickly as well. So then, what I'm sure is a very easy question: how do you build a great ML team? Because that's harder to experiment with, right? People don't have feature flags, and it takes a little longer to roll on and roll off and that sort of thing. So how do you build the team that's going to build the product? Yeah, we look at building blocks of teams as well as products, and this is the third section of the book, focused on building things in a way that's right for people. We ask what's important for teams in general to be effective, but then we also add the lens that ML is technically complex and requires a lot of different disciplines in a modern product development context. So we look at elements like trust as foundational, and in this section we refer to a lot of existing literature. That was perhaps one of the challenges of writing the book: where to leave some of our insights at a paragraph and point readers to further information, rather than have chapters proliferate endlessly. In chapter 10 we look at a range of things: trust, communication, shared purpose and progress, as we've discussed previously, diversity, all the things that make teams effective in complex, ambiguous environments. Yeah, and I think there's an engineering aspect to your question as well, Ken. How do we as a team experiment quickly? It's quite common in the ML engineering or data science world to be caught up in the toil of, say, manual testing of model outputs. I think there was a piece of Tecton research that said something like
30% of a data scientist's time is spent on deployment rather than experimentation. So what we ended up doing on one of the projects we talk about in the book was this: when a data scientist has an idea, say, if I add this feature it might improve the model quality by x percent, they can cut a branch and, in their path to production, execute the idea, see the results on the experiment tracking dashboard, and not worry about breaking stuff, because the tests will tell them. The fast feedback afforded by these engineering practices helps them experiment more quickly, and if it doesn't pan out, if adding this new feature actually makes things worse, fine, just throw that away and try the next thing. Test-driven development is just as applicable to machine learning projects, I assume, as it is to everything else. What other engineering practices, and David, I know that's one of your areas, what other practices do you think are important in this context? Yeah, so in the book we talk a lot about testing; there are two
chapters on it, and we take a nuanced view of TDD itself, which I can come back to later. We flesh out a typology of tests for an ML system, starting with what the subjects under test are. In any ML system there are pure functions, right? Data transformations, feature engineering: those lend themselves to unit testing very nicely. But then the next part is the model itself, which is hard to unit test, because even training that artifact could take 20 minutes, could take hours.
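As a concrete illustration of that first layer, here is a hedged sketch (the function and values are invented, not taken from the book's repository): a feature-engineering transformation that is a pure function, and therefore trivially unit-testable without touching the model.

```python
# A pure feature-engineering function: same input, same output, no I/O,
# which makes it cheap to unit test. (Illustrative example only.)

def debt_to_income_ratio(monthly_debt, monthly_income):
    """Engineer a ratio feature; guard against divide-by-zero."""
    if monthly_income <= 0:
        return None  # leave imputation of missing values to a later pipeline step
    return monthly_debt / monthly_income


# These assertions run in milliseconds, unlike training the model artifact.
assert debt_to_income_ratio(500, 2000) == 0.25
assert debt_to_income_ratio(500, 0) is None
```

This is the part of an ML system where ordinary software testing practice applies unchanged; the model itself needs the different treatment described next.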
So then we bring in the lens of model quality tests. Most, and I would think all, data scientists are very familiar with model evaluation metrics like precision and recall, but we take it a step further. There was an interesting exercise in the book where the model was about 80% accurate, and then we explored the hidden stratification problem: what if for
some subclasses the model actually wasn't 80% accurate? So we wrote a test and ran it, on a loan default prediction example, and it turned out that for some occupations the model was above the 80% threshold, but for labourers it was about 50% accurate. We discovered this type of error happening to a sub-segment of the data, and that's what the model quality tests and stratified metrics tests brought in. In an ML system there's only so much you can test.
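A stratified metrics test of the kind described might look like the following (the data, group names, and threshold are made up for illustration): instead of asserting only on global accuracy, assert on accuracy per subgroup, so a model that looks fine overall but performs poorly for one occupation fails the build.

```python
# Stratified metrics test sketch: check accuracy per subgroup, not just overall.
# Data and threshold are invented for illustration.
from collections import defaultdict

def accuracy_by_group(records):
    """records: list of (group, predicted, actual) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        hits[group] += int(predicted == actual)
    return {g: hits[g] / totals[g] for g in totals}


predictions = [
    ("engineer", 1, 1), ("engineer", 0, 0), ("engineer", 1, 1), ("engineer", 0, 0),
    ("labourer", 1, 0), ("labourer", 0, 1), ("labourer", 1, 1), ("labourer", 0, 0),
]

per_group = accuracy_by_group(predictions)
# A global metric would report 75% accuracy here and pass a naive check;
# the per-group view exposes the hidden stratification.
failing = {g: acc for g, acc in per_group.items() if acc < 0.8}
assert per_group["engineer"] == 1.0
assert failing == {"labourer": 0.5}
```

Wired into CI, the last assertion would be inverted (fail the build when `failing` is non-empty), turning the stratification check into an automated quality gate.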
There's also a part that's just production monitoring: as the model operates in production, or as a shadow model in production, you collect real-world data. Say the model said yes, this loan application is likely to default; then you collect data on whether it did default, or what the risk profile was over time, and monitor that in production. It's not monitoring as we usually know it, like real-time monitoring; sometimes it's batch, and the latency depends on how quickly you can label the ground truth, but it still gives you feedback about the quality of the model.
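The batch-style monitoring being described, where ground truth arrives late, can be sketched as a periodic join of logged predictions with whatever labels exist so far (the field names and values here are hypothetical, not from the book):

```python
# Sketch of batch model monitoring with late-arriving ground truth:
# join logged predictions with labels once they exist, then compute accuracy
# over only the labelled subset. (Illustrative, not the book's implementation.)

def monitored_accuracy(predictions, ground_truth):
    """predictions: {loan_id: predicted_default}; ground_truth: {loan_id: actual_default}.
    Only loans whose outcome is already known contribute to the metric."""
    labelled = [(pid, pred) for pid, pred in predictions.items() if pid in ground_truth]
    if not labelled:
        return None  # nothing labelled yet in this batch
    correct = sum(int(pred == ground_truth[pid]) for pid, pred in labelled)
    return correct / len(labelled)


predictions = {"loan-1": True, "loan-2": False, "loan-3": True}
ground_truth = {"loan-1": True, "loan-2": True}  # loan-3's outcome is not known yet

print(monitored_accuracy(predictions, ground_truth))  # 0.5
```

Run on a schedule as labels trickle in, a metric like this can trigger an alert or a retraining job when production quality drifts below the level the offline tests certified.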
still it's going to give you that feedback about um the quality of that model yeah so a lot of engineering fun in ml systems and it was just quite striking to us how I think we have the fortune of sitting in the intersection of software engineering and machine learning so we could kind of take practices from both worlds actually there were some parts where we found that some engineering practices didn't always pan out then maybe this is where I'll Circle back to tdd so there were some cases where tdd wasn't going to give us or
test driven development doesn't give us that fast feedback so then at that point it's like it's fine if it's not giving us the feedback it's we don't have to be dogmatic about um um practicing it so actually it's Dogma in both ways it's like either dogmatic in that yes I have to tdd or like the other Dogma is no it's machine learning you can't tdd and the TR truth is more nuanced so it's like being able to know when you can uh cannot get that fast feedback so yeah it's good interesting to really stretch the
limits of these software engineering practices, see where they break down, and add on ML-specific engineering practices. Perhaps more than any other area, I see practices that are common but slightly renamed: DevOps is now MLOps, and continuous delivery is continuous delivery for ML, as it's practised inside Thoughtworks. Is that positioning, frankly? Is that to say to data scientists, hey, don't worry, we're thinking about you? Or are they radically different engineering practices? Is there crossover? How much do people need
to know? If I'm an infrastructure engineer and I'm now going to be on an ML team, do I need to go study MLOps, or am I going to apply most of my DevOps? That's a great question, Ken. I think there are ML-specific extensions in the MLOps space, so it builds on top of DevOps: things like infrastructure as code, deployment automation, scaling infrastructure for training or for inference. But MLOps then brings in ML-specific components, like having a metadata or model registry for every deployable, because it's no longer just code, it's sometimes also this really gigantic model. Then also things like a feature store, so it's not just compute or logic at runtime, but also which features need to be available at low latency in production. So I think being a DevOps engineer gives us a really good background and starting point for understanding a lot of this, and it needs to be augmented with ML-specific, or MLOps-specific, practices.
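To make the registry idea concrete, here is a minimal, hand-rolled sketch of the metadata an ML deployable needs beyond its code. The model name, URIs, and dataset references are all invented for illustration; real teams would typically reach for a tool such as MLflow's model registry rather than rolling their own.

```python
from dataclasses import dataclass

@dataclass
class ModelVersion:
    name: str
    version: int
    artifact_uri: str       # where the (possibly gigantic) model artifact lives
    training_data_ref: str  # which dataset snapshot produced it
    metrics: dict           # evaluation results recorded at training time
    stage: str              # "staging" | "production" | "archived"

# A toy registry with two versions of a hypothetical loan-default model.
REGISTRY = [
    ModelVersion("loan-default", 1, "s3://models/loan-default/1",
                 "snapshot-2023-10", {"recall": 0.78}, "archived"),
    ModelVersion("loan-default", 2, "s3://models/loan-default/2",
                 "snapshot-2024-01", {"recall": 0.81}, "production"),
]

def production_model(name: str) -> ModelVersion:
    """Resolve the current production version of a named model."""
    candidates = [m for m in REGISTRY
                  if m.name == name and m.stage == "production"]
    return max(candidates, key=lambda m: m.version)

current = production_model("loan-default")
print(current.artifact_uri)  # deployment pulls this artifact, not just code
```

The design point is that every deployable carries its lineage (data snapshot, metrics, stage) so a rollback or audit doesn't depend on tribal knowledge.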
I also see those namings as inclusive and inviting. It's a really exciting space, and lots of people may be coming in from a career change, straight into this space, not from traditional software delivery. Extending the name with MLOps, for example, is to say that there's a whole body of knowledge that is tried and tested, with lots of experienced practitioners and points of view, and you don't need to go through the one-track pathway of having a PhD or going through tertiary studies to be an ML engineer. There are building blocks on the software delivery landscape that can lead you into this space and let you be part of this community. I think we do need to keep growing this community with different perspectives and differently shaped people, because that's when our models are going to be richer and less biased as well. You know, I know you started this book back in the dim past, before GenAI became the central feature of all our lives, but I wonder, and I know it's applicable, and
we're still doing a lot of that more traditional, if you can call it that, machine learning: do these techniques and this approach also apply when we're talking about generative AI? Yes, they do. As I said earlier, even if you're consuming a GenAI service, such as an embedding service from OpenAI, it really helps the team to understand what's gone into training that service, how it actually performs, and what its failure modes are, because any ML service will have some degree of unpredictable behaviour or less-than-perfect performance. So being able to design products with those failure modes in mind is very important. But these services also offer a lot of advantages for solving otherwise difficult problems with complex data, so it's not to say that the failure modes should drive your thinking. Another aspect, and I think we touched on it a little bit, about successful products is that in a lot of conversations we're seeing GenAI is a great trigger. The right intuition is there, that this is a problem that can be solved with AI, but GenAI is not necessarily the most effective, the most efficient, or the most architecturally ideal technique for a particular class of problem. So there's this product development approach of testing and learning: teams can potentially prototype with GenAI first, but then, once it's identified as a good fit for a digital product and it goes into production, out of the realm where you can handhold the solution, teams need to be able to identify the most appropriate traditional ML or AI technique, as we might describe it, to support that feature in a production environment. It's very important for teams to be able to make that pivot as well. And on the question of whether any of these apply to GenAI, in addition to what Dave has shared, I think there's a whole swath of things to
say about product testing and all of that, but to zoom into engineering for a bit: we were lucky in the timing of one of the chapters, on automated testing, in that we were on a project to develop an LLM prototype, and we were actually able to apply test-driven development in the prompt engineering of one of the LLMs. It was really fun and low cognitive load to be able to say, these are the scenarios that my LLM, or the LLM plus my prompt design, needs to handle, and then iteratively refine our prompt design to pass one scenario at a time. That fast feedback was really good. And sometimes stakeholders will say, oh, I need to add this new thing, or can you make this chatbot more polite, and the ability to change that prompt, run a test suite of, you know, 10 to 20 scenarios, and say, oh, everything is still green, really helped us move quickly.
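A scenario suite like the one described might look roughly like this. Everything here is hypothetical: `call_llm` is a stand-in for whatever client the team uses, and the prompt and checks are invented examples; the shape of the harness (scenario plus predicate, run the lot, report green or red) is the point.

```python
# Hypothetical prompt under test; in the real project this is what gets
# iteratively refined to pass one scenario at a time.
PROMPT_TEMPLATE = (
    "You are a polite loan-assistant chatbot. "
    "Answer the customer's question briefly.\n\nQuestion: {question}"
)

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM call (e.g. an OpenAI client), stubbed here
    # so the harness itself is runnable.
    return "Thanks for asking! Please check your loan dashboard."

SCENARIOS = [
    # (question, predicate the response must satisfy)
    ("What is my balance?", lambda r: "thanks" in r.lower()),      # politeness
    ("When is my repayment due?", lambda r: len(r.split()) < 50),  # brevity
]

def run_suite():
    """Run every scenario against the current prompt and collect pass/fail."""
    results = []
    for question, check in SCENARIOS:
        response = call_llm(PROMPT_TEMPLATE.format(question=question))
        results.append((question, check(response)))
    return results

for question, passed in run_suite():
    print(("PASS" if passed else "FAIL"), question)
```

Because each requirement ("more polite", "handles this new question") becomes one more scenario in the list, rerunning the suite after a prompt or model change gives exactly the fast, all-green feedback described above.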
And I can imagine that if somebody one day needs to deprecate a pinned model version like GPT-4 0613 in favour of a newer one, the ability to just run the whole suite of tests and say, yeah, it's still as good as before, or it starts to fail in these scenarios, will be valuable. I think this engineering practice will definitely help in those LLM application areas. You know, I have so many more questions I would like to ask, but we're going to run out of time, and in wrapping up I'd like to give each of you an opportunity to tell the audience what is the
most important takeaway from your perspective. Dave Colls, let's start with you. I would say that when it comes to building ML products, there's a role for everybody in a modern digital organisation, and this book tries to articulate how each person, regardless of their background and experience, can play a role in delivering great ML products. David Tan, how about you? I would say my key takeaway, in building effective machine learning teams, is just to reflect on your past week: what has worked for you, what hasn't worked, and to be sensitive to those smells and signs that there may be a problem underneath. For example, if I spend three days manually testing something before a release and I don't like that feeling, it was tedious, then that's a signal that something could be improved. So just look back at your week as an ML engineer or a data scientist: what frustrates you, whether in the product space, user space, engineering space, or data science space? What's annoying you? Then seek out other, better ways of handling those scenarios. Ada? I'm thinking of a succinct way to answer this, and I'll try my best. It's all about the bigger picture, the problem we're trying to solve and the value we're trying to deliver, and our principles and practices in software delivery and engineering are all applicable to this. That is what will help us make ML products successful. Thank you, I think that was a really good answer. Where
can we find the book? Yep, we'll put it in the show notes, and you can go to your favourite search engine and type in Effective Machine Learning Teams. That will bring you to O'Reilly, or you can order it on Amazon or Booktopia. We've heard from the publisher that in North America it's also in Target and Barnes & Noble, which to me is very cool. But yeah, you can find it via your nearest search engine. And there's a hard copy floating around the office, I noticed, which was very exciting, I'm sure, to have it in your hands. Yes, that was courtesy of Dave, for the Melbourne office. Well, for what it's worth, I can tell you that the technology section in the Barnes & Noble stores here in North America is very small, so that is high flattery for you as authors. Woohoo! You need to do a tour, obviously; I'll be out for it. Okay, thanks everybody, this has been a great conversation, and I hope everyone will go and read the book and consider applying some of these techniques in their own work. Yep, thank you so much, Scott and Ken. Thanks. Thanks, Scott and Ken.