So if you've been following along with this video series, maybe you've noticed something here. Cloud Firestore consists of a bunch of documents, which are basically key-value pairs, as well as collections, which are collections of documents. But then within these documents, we let you store maps, the JSON-y looking things that also kind of work as key-value stores, you know, like a document. Not only that, but you can also store arrays, which look suspiciously like collections. So at some point you might start asking yourself, well, why do we even have separate documents? We could
just say, put all of our data into these maps. Why create a subcollection when I could just create an array of maps, or even a map of maps? When is it appropriate to use one versus the other, and what would work best in my app? Good questions. Let's see if we can find some answers on this episode of Get to Know Cloud Firestore.

[Music]

So when it comes to determining the best way to structure your data in Cloud Firestore, it's helpful to remember some rules about how Firestore works. Let's go over some of
these now.

All right, rule number one: documents have limits. So there are some limits when it comes to how much data you can put into a document. For starters, you're limited to one meg total of data in a single document. And I know that in our world of, like, high-res 16-megapixel photos, one meg seems tiny, right? Geez. Huh? Oh, you've got to turn the flash off. But when we're talking about text and numbers here, and we generally are, that's still a lot of data you can fit into a single document. For example, the
complete text of A Tale of Two Cities, a 300-page book from an author who was paid by the word, is only about 770K. You could fit all of that into a single document and still have room for a good Sherlock Holmes mystery. Also, another rule around documents, and this will come into play later, is that they can't have more than roughly 20,000 fields. And by the way, this also includes fields within the maps that you have in your documents. So this is six fields, and this counts as seven: one for the overall address field as
well as one for each individual element in that address map. And this would count as nine. And one big reason for this limit is that Firestore is creating indexes for any field you have here in your document, even, yes, inside a map. Basically, from Firestore's point of view, my document here has a field called address.city, and it's going to index it. And even if you don't hit this 20,000-field limit, having documents with a ton of fields in them can impact your write performance, because Cloud Firestore needs to redo every single
one of these indexes when you create another document, make a change, and so on. One other rule around documents is that you're generally limited to one write per second on the same document. So if you've got a situation where, like, a lot of clients are all trying to write to the same document all at once, you might start to see some of these writes fail. Now, your client libraries are generally smart enough to retry these writes with some incremental backoff if that starts happening, but it's still a situation you want to avoid. On the other
hand, writing to lots of different documents in the same collection, that's generally fine. Go crazy.

Okay, let's move on to rule number two: you can't retrieve a partial document, at least not using the client SDKs. So just because you can put A Tale of Two Cities into a document doesn't mean you should. See, when Cloud Firestore retrieves a document, you're retrieving all the data in that document. You can't retrieve partial data from a document using the client SDKs. So imagine I have my collection of Charles Dickens books, and my documents have a title field and
an "entire contents of the book" field. Then I decide, hey, you know, maybe I just want my app to show a list of what Charles Dickens books are in my collection. So I query this collection. Well, hey, guess what? In addition to grabbing the titles, I'm also grabbing the entire content of all the stories, which means I'm now downloading approximately 38,000 times more data than I actually need. And that's bad, right? This is something your user is going to notice. This is gonna affect your app's performance, its data usage, and its
battery consumption. And this tends to put your app in a state I like to call "getting uninstalled." Sorry, buddy, I can't keep it around. It's killing my data plan. This also has implications when it comes to security rules. You can't create a security rule that gives you access to some parts of a document and not others. If you want to keep part of a document secret, you'll need to place those secret parts into a separate document, and I'll talk more about that in our video on security rules.

Okay, so now you're thinking, well, gosh, clearly
the answer is to break my data up into as many tiny documents and as many tiny subcollections as possible. Well, hang on there, because we have rule number three: queries are shallow. So you probably already know this, but when you grab a document, you don't grab any of the data in any of these subcollections. And this is generally a good thing, right? It means you can store your data hierarchically in a way that feels logical, without having to worry about grabbing more data than you intended. So if I took that content for my
"entire text of the book" field and moved it down into a chapter subcollection, well, now I can grab my list of Dickens book titles without having to grab all the text associated with them. However, there's a flip side to this, which is, if you think you're almost always going to be showing data together, it might not make sense to put it into a subcollection. For example, maybe I decide to put my list of main characters into a character subcollection. In many applications, this might make sense. But if in my app it turns
out that whenever I'm showing the title of a book, I'm also going to be showing a list of all the characters, well, now I can't grab all the data I need in a single query. I now have to get the book and then get a list of all the characters that are in that book by querying each book's subcollection. And this is going to make populating a list like this a lot more complicated. So if this ends up being the look and feel of my app, well, maybe I don't want to put this data into
a subcollection. I can keep it as part of my main document.

There's also another reason this setup might be bad, which is rule number four: you're billed by the number of reads and writes you perform. The other disadvantage of putting these characters in a subcollection like this is that, and again, this is only assuming I want to grab the documents in the subcollection every single time I grab this main document, I've increased my reads here from one read to several, depending on how many characters I want to read. Now, how much
this matters is probably a factor of how often you'll be making queries like this. I mean, if the structure means your app is gonna be making two million extra requests per day, well, you know, that's only roughly 12 cents a day. On the other hand, if your app is making 2 billion extra requests because of this structure, well, then that's 120 bucks a day, and probably something worth optimizing. Like I said in the last video, I'm not always a fan of premature optimization for billing purposes, particularly if it comes at the expense of more complicated
code, but it's definitely something to be aware of.

Okay, rule number five: queries can only be used to search for specific documents within one collection, and they do so by looking for specific fields that match certain criteria. Hang on, Todd from 2018. Todd from 2019 is going to cover this rule. Let me take it from here. So on this side, I have my Charles Dickens characters stored as a subcollection for each book. I've included their name and an occupation as two different fields. On the other side, I have the same list of characters
stored as a map field directly inside the book document, with maybe the character name as the key and their occupation as the value. Now, these two structures let me run very different types of queries. With my characters in a subcollection, I could say, hey, show me every character in Great Expectations with a name that starts with P, or every character in Oliver Twist who's a doctor. And if I wanted to say, hey, show me all characters in all books named Oliver, I can now do that with a collection group query. Now, this would
require that I set up a collection group index for the name field in the Firebase console, and do keep in mind you're limited to about 200 of these. On the other hand, what if I put characters in a map like this? Well, I could kind of say, hey, show me every book that has a character whose name is Oliver, and I would do that by looking for a field called characters.Oliver with a value that's, like, greater than an empty string. But I'll be honest, this is kind of weird, in that I'm relying on there being
a field in a map with a key named in a particular way, and that seems all sorts of error-prone to me. In addition, I wouldn't just be returning character data here. I'd be returning information about the entire book that contains this character, which may or may not be what I want. And I wouldn't be able to do queries like, show me all characters whose occupation is socialite, or show me all characters sorted by name. So my search options are a lot more limited here. Yeah, I'm pretty sure socialite's an occupation. It's just that in
2019, we like to call them social media influencers. Now, a third option is to put your characters in a top-level collection. This was a nice workaround before collection group queries were supported, because it let me search for, say, a character by name or all characters in a particular book, and do that search across all characters in all books. And if you recall from our earlier video, queries in Cloud Firestore are very fast, so this "show all characters in Oliver Twist" query in the top-level collection really isn't any slower than grabbing all the documents
from a subcollection. I think the biggest drawback here is that if I wanted to, say, search for all characters in Oliver Twist sorted by name, well, now that requires a composite index, because it is now a multi-field search, and you'll recall I'm limited to 200 of those. So if you think you're going to be searching for individual records from a group of data, you put them into a collection, and you can use either a subcollection like this or a completely separate top-level collection. As a general guideline, I'd probably say store data hierarchically
if you think you're mostly going to be searching for items per subcollection and only occasionally want to do a collection group query, and put them into a top-level collection if you think you're mostly going to be searching across all documents and only occasionally want to do, say, a per-book query. And putting your data into a map like this, it's not really useful from a querying standpoint. This would more likely be useful if you're just storing data you know you're always going to want to retrieve with a parent document. All right, back to you,
Todd from 2018.

Well, this might make more sense when we get into some more examples in the next video, so stick with me if you're confused. We have one more rule to get through, and that's rule number six: arrays are kind of weird. So there's a really good blog post out there on why arrays are evil, but basically, a lot of traditional array operations don't work very well in a service like Cloud Firestore, because when you've got data that could be altered by multiple clients, it's very easy to get confused as to what edits
are happening to what field. Like, at least in a map, if I have several clients all trying to edit different fields, or even the exact same field, I generally know what's going to happen, right? I'll end up with a map that has fields that are equal to that of the most recent client update. But in an array, if one client is trying to edit a value at, like, index two, and another client tries to delete the value at index two, and another client tries to insert a value somewhere at index zero, I'm going to have
very different results, and possibly out-of-bounds errors, depending on what order these instructions are received in. So for that reason, Firestore's actions with arrays are a bit different than what you might expect. For instance, you can't perform actions like insert, or modify, or delete an element of an array at a specific position, and you can't run a query like "the element contained at position two in the array." So are they useless? Well, no, not at all. By the time you see this video, the team will have added some operations that do make
working with arrays a lot more useful. See, it turns out a lot of developers don't really care about the exact order that they are storing things in an array. They just want to use arrays as, like, a simple way to contain a bunch of flags. For instance, maybe I have a bunch of keywords that I'm storing for each of my Charles Dickens books. So what do I do if I want to find all of my dramatic books, or books that involve war or a crime? Well, we now have an array-contains querying feature that will
search for documents in a collection where an array contains a certain value. So I can keep my keywords in an array like this and say, hey, let's find all dramatic books by looking for books where keywords array-contains drama. So if you're wondering how this works, you can kind of think of it like, behind the scenes, Cloud Firestore converts this array into a map where all the values of the array are fields in this map, and like a typical map, every one of these fields gets placed in an index. So asking for books
where keywords array-contains drama is kind of like asking for books where our imaginary keywords.drama field is equal to true. Along those lines, Cloud Firestore also added some new features to add or remove specific values from arrays, as long as you don't really care about the exact position of them. So editing this array of keywords is a lot easier now and doesn't run into some of those problems I talked about earlier. So go check out the documentation if you want some more info about that.

So that's a lot of rules. Can we turn these
rules and restrictions into guidelines? Well, let's see. For starters, put data in the same document if you're always going to display it together. You generally want a happy medium in the size of your documents. Don't make them so big that you're downloading more data than you need, but don't make them so small that you're downloading, like, 30 documents from two different levels in the database just to fill out one screen in your app. Put data in collections when you're going to want to search for individual pieces of that data, or if you want your data
to have room to grow. On the other hand, leave them as a map field if you're going to want to search your parent object based on that data. Another good time to put data into a map field is just if you want to put related data closer together for organizational purposes. For example, take a typical address. Sure, I could make each of these fields top-level values in my document, but I think it looks a lot nicer if they're put into a map like this. I could still query them just fine and actually reduce the
risk of sort of keyword naming conflicts later on in my document. Vectors in games might also be another good candidate for maps, something like this. And if you've got items that you would generally use as flags, go ahead and stick them in arrays.

So that's an awful lot of rules and guidelines, and maybe talking about these in the abstract is enough for you to get started. But personally, I'm always happier if I can talk about these with some examples that mirror more real-world situations. And it turns out, minor shock, I know, Charles Dickens apps
aren't exactly a hot category in the App Store right now. Maybe next year. So follow me along to the next video, where we'll look at how to put these guidelines into practice. Come on, it'll be fun.

I'm sorry, you're asking for a ten-million-dollar valuation, and the Charles Dickens app market is already too crowded. I just don't see the numbers, and for that reason, I'm out.

[Applause] [Music]
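
Two of the rules above are easier to see with a little code, so here is a rough sketch in plain Python. It does not use the Firestore SDK at all. The function names (count_fields, array_as_flag_map, array_contains) and the sample documents are made up for illustration. The first function counts fields the way rule number one describes, where a map counts once for itself plus once for each element inside it. The other two act out the "imaginary map of flags" way of thinking about array-contains from rule number six. It only mirrors the mental model from the video, not how Firestore actually builds its indexes.

```python
from typing import Any, Dict, List


def count_fields(doc: Dict[str, Any]) -> int:
    """Count fields as the video describes: a map counts once for
    itself, plus once for every field nested inside it."""
    total = 0
    for value in doc.values():
        total += 1  # the field itself
        if isinstance(value, dict):
            total += count_fields(value)  # each element inside the map
    return total


def array_as_flag_map(values: List[str]) -> Dict[str, bool]:
    """The video's mental model: each array value becomes a field set to True."""
    return {value: True for value in values}


def array_contains(
    docs: List[Dict[str, Any]], field: str, wanted: str
) -> List[Dict[str, Any]]:
    """Toy in-memory stand-in for an array-contains query (not the Firestore API)."""
    return [
        doc
        for doc in docs
        if array_as_flag_map(doc.get(field, [])).get(wanted, False)
    ]


# Hypothetical documents, invented for this sketch.
person = {
    "name": "Sydney Carton",
    "occupation": "lawyer",
    "address": {"street": "1 Temple Bar", "city": "London", "country": "England"},
}

books = [
    {"title": "A Tale of Two Cities", "keywords": ["drama", "war"]},
    {"title": "Oliver Twist", "keywords": ["crime", "drama"]},
    {"title": "The Pickwick Papers", "keywords": ["comedy"]},
]

# 6: two plain fields, the address map itself, and its three elements.
print(count_fields(person))

# Titles of the books whose keywords include "drama".
print([book["title"] for book in array_contains(books, "keywords", "drama")])
```

Note that the flag map throws away the order of the array, which is the point: if all you need is "does this book have this keyword," position never mattered in the first place.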