Welcome to chapter 2, lesson one. In this lesson we will talk about TON's memory layout and TVM operation. Let's start with a unique feature of TON called the cell. The interesting thing about TON is that all the data structures in the entire blockchain, both within your own smart contracts and in the standard data structures of the consensus protocol, are built on top of cells. So what is a cell? A cell is the small building block of every data structure in the TON blockchain. Each cell has up to 1023 bits of data
and up to four references to other cells, and this allows you to use cells to build arbitrarily complex and nested data structures. This is really the only data storage that you have when you build your own contracts. It means that you cannot simply allocate an arbitrary array of bytes like you could, for instance, in Ethereum or on your regular computer. You have to lay your data out in a tree of cells. This data is then transmitted and stored in a format called bag of cells, and this
is something that will be familiar to users of Git. In Git, every object has its own unique identifier, and there are deduplication mechanisms that let you effectively compress all the individual objects in one repository. This is pretty much how cells work in TON: every time you transmit a tree of cells, or a collection of various cells with all of their nested cells, they are represented in this bag of cells, with all the necessary deduplication if needed.

Now, this storage model obviously poses some challenges
to developers, because now you have to split your data into chunks of at most 1023 bits, which is slightly less than 128 bytes, and you have to think about how flat or how deep you want to build the tree of your data. But the cool result of this design decision is that the entire state of the blockchain can be effectively merkleized, which means you can create a Merkle proof, a cryptographic proof, of any portion of the data in the blockchain at any state of it. This is crucial when you scale out to a large
system with multiple shards and separate validator groups, where you need to verify that some groups behaved correctly and didn't break the rules of the system. This is where compact Merkle proofs are necessary to prove any misbehavior by any participant in the system, and building the entire TON blockchain out of cells really helps with that.

Now let's take a look at how this comes together during the execution of a smart contract. The storage of a smart contract is represented by a cell, and
if you need to store more than 1023 bits of data, you can use additional cells connected to it via one of the four references, and so on. This way you can build an arbitrarily large contract state. When the contract receives an incoming message, the validating node instantiates TVM, a special-purpose stack-based virtual machine designed to execute TON bytecode. This virtual machine is loaded with the current state of the contract and its current code, and both the state and the code are stored in
cells. All of this data is loaded into TVM, and the job of TVM is really just to go through the code, execute it, and verify all the network rules regarding gas costs and the correctness of all the operations. In the end it returns either an error or a new state of the contract that will be stored in place of the old one. There is also another type of data emitted as a result of TVM's work: the list of outgoing actions, which are typically the outgoing messages that the contract wants
to emit and deliver to other contracts. So TVM is instantiated for each message, and it is a pretty lightweight virtual machine.

In the blockchain as a whole, the only data type that you have is a cell. In TVM you have a few more types of data to work with. Most importantly, you can work with cells and with integers: another core type of TVM is the 257-bit integer, which lets you represent a wide range of integers suitable for cryptographic work and for financial operations. All the data types that
TVM works with, integers, tuples, and cells, can be read by the contract's code out of its own storage and manipulated on the stack, and then the new storage can be created along with the output actions. If the execution was successful, TVM is unloaded from memory and the new state of the contract is stored in place.

Now let's talk about the memory layout available to the contract. As you noticed, the only available option for your memory layout in a contract is a tree of cells. So how do you store things like lists
or dictionaries or sets in this system? It turns out this is not trivial, because cells are very compact and they form a tree. To help developers with this, TON comes with tools, both at the level of TVM and at the level of FunC, the higher-level language, that help you work with hashmaps more effectively, and they use cells as the underlying implementation. The hashmap is effectively a prefix tree that stores your values, where the path through the tree forms the key to
it. This gives you a logarithmic cost of accessing your data, and the built-in tools in TVM give you a usable interface to iterate over all of this data or to do insertions and deletions.

However, you should note that the key to scalability in TON is to limit the amount of work done in any single place, because TON is scalable across contracts, but each individual contract is not scalable out of the box by itself. So if you design a system where you have a large hashmap,
say you have millions of users and you put millions of records in your list, then that would be a bottleneck for your application, because all of these transactions would have to go through this one contract. It would also be quite expensive in terms of gas fees, because by design in TON, every time you have to jump from one cell to another by reference (this is called unpacking), it is a quite expensive operation. There is a good reason for that: if you have a long chain of references to unpack, then it
means that any time you want to prove anything about the state in the network, your Merkle proof would have to be equally large. To make sure that those proofs stay reasonably compact, there is a high price for creating those nested cells.

So what do you do? The first answer is that it is totally acceptable to have nested cells for a small amount of data. For instance, if you make a system with very few participants, let's say a multi-signature contract, then you can store this data in a hashmap right in the contract
and it will be reasonably small. If, however, you are building a system with millions of users, then you should think about using tokens to represent users' participation and avoid storing lists of these users in your contract altogether.

Finally, cells have a built-in deterministic hashing scheme that allows you to uniquely identify any cell in any part of the tree, and this is used both for data deduplication and compression. For instance, when a contract runs out of money to pay for rent, its current state is offloaded from the blockchain. It's completely
forgotten, but the network still stores the hash of the storage cell. This means that the user could come later, provide the original tree of cells that matches this hash, and reinstantiate the contract. This way the state of the contract is fully preserved by the hash of its entire storage.

So, in conclusion: at the base of the blockchain everything is built out of cells. These are compact data structures that form a tree and make it very efficient to offload the data, deduplicate parts of the state of the system, and, most
importantly, create effective cryptographic Merkle proofs about any state in the system. On top of the cells you have the types provided by TVM. TVM is instantiated for each message processed by a contract, and it provides you with additional types that you can work with during this execution phase. First of all, those are integer types that make it friendly to work with cryptography and financial operations, but there are also operations on hashmaps that let you use lists and dictionaries based on cells inside your contracts.
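To make the hashmap idea from this lesson a bit more concrete, here is a minimal Python sketch of a prefix tree in which the bits of the key form the path to the value. It is only an illustration under stated assumptions: `Node`, `KEY_BITS`, `insert`, and `lookup` are hypothetical names, each `Node` merely stands in for a cell with two references, and the real TVM dictionaries are more elaborate than this (for example, this sketch does not compress shared key prefixes).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

KEY_BITS = 8  # hypothetical fixed key width, chosen only for this sketch


@dataclass
class Node:
    """Stands in for one cell: two child references and an optional value."""
    left: Optional[Node] = None    # followed when the next key bit is 0
    right: Optional[Node] = None   # followed when the next key bit is 1
    value: Optional[bytes] = None  # only set on the node that ends a key path


def key_bits(key: int) -> list[int]:
    """Return the key as a path of bits, most significant bit first."""
    return [(key >> i) & 1 for i in range(KEY_BITS - 1, -1, -1)]


def insert(root: Node, key: int, value: bytes) -> None:
    """Walk the key's bit path, creating missing nodes, and store the value."""
    node = root
    for bit in key_bits(key):
        if bit == 0:
            if node.left is None:
                node.left = Node()
            node = node.left
        else:
            if node.right is None:
                node.right = Node()
            node = node.right
    node.value = value


def lookup(root: Node, key: int) -> tuple[Optional[bytes], int]:
    """Return (value, hops), where hops counts the references followed."""
    node, hops = root, 0
    for bit in key_bits(key):
        nxt = node.left if bit == 0 else node.right
        if nxt is None:
            return None, hops  # the path for this key does not exist
        node, hops = nxt, hops + 1
    return node.value, hops


if __name__ == "__main__":
    root = Node()
    insert(root, 5, b"alice")
    insert(root, 200, b"bob")
    print(lookup(root, 5))
    print(lookup(root, 6))
```

In this sketch the number of references followed per lookup is bounded by the key width rather than by the number of records, which is the property that keeps access to any single entry cheap even when the dictionary holds many entries.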
However, you should be aware of the costs associated with large data structures and deep chains of cells, which may incur higher than usual costs on your contract's execution.
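The cell model and the cost of deep chains can be sketched roughly as follows. This is a toy Python model, not TON's actual serialization: `Cell`, `pack_chain`, `pack_tree`, and `unique_cells` are hypothetical names, the hash is a simplified SHA-256 over a cell's data and its children's hashes rather than the real cell hashing scheme, and gas is not modeled at all. Depth is used here only as a rough proxy for how many references must be followed to reach the deepest data.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass

MAX_BITS = 1023              # data capacity of one cell, per the lesson
MAX_REFS = 4                 # references to other cells, per the lesson
CHUNK_BYTES = MAX_BITS // 8  # 127 whole bytes fit into one cell


@dataclass(frozen=True)
class Cell:
    data: bytes = b""
    refs: tuple[Cell, ...] = ()

    def __post_init__(self) -> None:
        if len(self.data) * 8 > MAX_BITS:
            raise ValueError("cell data exceeds 1023 bits")
        if len(self.refs) > MAX_REFS:
            raise ValueError("cell has more than 4 references")

    def cell_hash(self) -> bytes:
        # Simplified assumption: hash own data plus the hashes of the children.
        h = hashlib.sha256(bytes([len(self.refs)]) + self.data)
        for ref in self.refs:
            h.update(ref.cell_hash())
        return h.digest()

    def depth(self) -> int:
        return 1 + max((r.depth() for r in self.refs), default=0)


def split(payload: bytes) -> list[bytes]:
    """Cut the payload into chunks that each fit into a single cell."""
    return [payload[i:i + CHUNK_BYTES]
            for i in range(0, len(payload), CHUNK_BYTES)] or [b""]


def pack_chain(payload: bytes) -> Cell:
    """Deep layout: every cell holds a chunk and refers to the next cell."""
    chunks = split(payload)
    cell = Cell(chunks[-1])
    for chunk in reversed(chunks[:-1]):
        cell = Cell(chunk, (cell,))
    return cell


def pack_tree(payload: bytes) -> Cell:
    """Flat layout: chunks sit in leaves, grouped four at a time under parents."""
    level = [Cell(chunk) for chunk in split(payload)]
    while len(level) > 1:
        level = [Cell(b"", tuple(level[i:i + MAX_REFS]))
                 for i in range(0, len(level), MAX_REFS)]
    return level[0]


def unique_cells(root: Cell) -> dict[bytes, Cell]:
    """Collect cells keyed by hash, so identical subtrees are kept only once."""
    seen: dict[bytes, Cell] = {}
    stack = [root]
    while stack:
        cell = stack.pop()
        digest = cell.cell_hash()
        if digest not in seen:
            seen[digest] = cell
            stack.extend(cell.refs)
    return seen


if __name__ == "__main__":
    payload = bytes(range(256)) * 8  # 2048 bytes of sample data
    print("chain depth:", pack_chain(payload).depth())
    print("tree depth:", pack_tree(payload).depth())
```

The same 2048 bytes end up much deeper in the chained layout than in the four-way tree, which is the kind of trade-off between flat and deep layouts that the lesson asks you to think about. `unique_cells` illustrates the deduplication idea: two cells with identical content and children share one hash and are stored once.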