[Tutorial] Player specific storage and Databases

Lordphenex • 1/18/23 8:23 am

Hi! Hope you're doing well!
In this thread, I will explain different ways to store data about players.

Context

There are various reasons why you would need to store data about a player or, more generally, about an entity. You may want to keep track of the item the player was holding or of what happened to them (deaths, kills, chests found, basically anything), or keep track of where all the shops are located in your world. I myself use player-specific storage in my Potion Bags Datapack and in my Graves Datapack. I will use them as examples in the rest of this thread.

I will present 2 methods for different use cases. Both methods use some sort of identifier. This identifier will either be a built-in one (the UUID) or a created one.

Short time storage

Sometimes, you need to store data only for a couple of ticks. If this doesn't happen too often and only for a limited number of players at a time, the easiest way is to summon a marker entity (either a marker or an armor_stand holding an item you will use as storage) and use it to store the needed data. Doing this, you will need to link this marker entity to the player.
That's where the identifier comes in. For ease, we will use a scoreboard. The idea is to give the same score to the marker and to the player. In order to have a unique identifier for each pair, we will have a global score that counts up each time we add a pair, and we copy this score to both elements of the pair.
The code would look like this:

When needing to store data
  add 1 to globalScore
  summon marker entity
    set stored data in marker entity
    copy globalScore to marker entity
  copy globalScore to player

When needing to get data
  get player score
  search for marker entity with same score
    copy needed data from entity to a temp storage
    kill said entity
  do what you want with the data
  reset player score
  clear temp storage
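
As a rough illustration, the two sequences above could be written as the functions below. This is only a sketch under assumptions: the objective name pairId, the fake players .global and .search, the tags ns.store and ns.new, the storage namespace:temp and the function names are all made up for the example, and the stored data here is simply the item in the player's main hand.

```mcfunction
# One-time setup (for example in a load function)
scoreboard objectives add pairId dummy

# --- namespace:store (hypothetical), run as and at the player ---
scoreboard players add .global pairId 1
summon minecraft:marker ~ ~ ~ {Tags:["ns.store","ns.new"]}
# markers have a free-form "data" compound meant for custom data
data modify entity @e[type=minecraft:marker,tag=ns.new,limit=1] data.held set from entity @s SelectedItem
scoreboard players operation @e[type=minecraft:marker,tag=ns.new,limit=1] pairId = .global pairId
scoreboard players operation @s pairId = .global pairId
tag @e[type=minecraft:marker,tag=ns.new] remove ns.new

# --- namespace:retrieve (hypothetical), run as the player ---
scoreboard players operation .search pairId = @s pairId
execute as @e[type=minecraft:marker,tag=ns.store] if score @s pairId = .search pairId run function namespace:retrieve_from_marker
# ... use storage namespace:temp held here ...
scoreboard players reset @s pairId
data remove storage namespace:temp held

# --- namespace:retrieve_from_marker (hypothetical), run as the marker ---
data modify storage namespace:temp held set from entity @s data.held
kill @s
```

The marker is summoned at the player's position, so it only stays reachable while that chunk is loaded, which is fine for a few ticks.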

Since we only need the storage for a couple of ticks and we are not doing a lot of storing-retrieving sequences (less than 1 per 5 seconds, for example), we don't need an entity permanently linked to the player, so we can get rid of it when it is not needed. We let globalScore keep counting up because data is stored for several ticks, and resetting it to 0 after each retrieval could cause ID conflicts with pairs that are still waiting.
This solution is easy to use and works well as long as not much data is stored at once and the number of storing-retrieving cycles stays low. If that is not the case, you will need to use the second method presented below.

Long time storage

When you need to store information for an undefined amount of time and/or for a lot of players at once, using entities can quickly lead to lag, and you probably want to avoid that. Plus, depending on how you summon the entities, in the long run they may be unloaded when you need them, causing your system to break. You may say: "let's put them all in the same chunk and forceload it". This still causes lag and can be really unstable, as your chunk may be unloaded by an operator error or by another datapack.
The best tool we have to store data is, well, data storage. Storages can contain a lot of information without slowing your game down, and they keep it indefinitely. If you are not familiar with data storage, you can find useful information on the /data Wiki page.
Basically, what we will do is store each player's data as an element of an array.
Let's say we want to track all the locations where playerA died. The data corresponding to playerA would be the identifier of playerA and an array of locations. It will then have this structure:

playerA:{ID:IDnumberA,deathPositions:[]}

We can then create a database for all players by putting the data of each individual into an array:

allPlayerDeaths:[{ID:IDnumberA,deathPositions:[]},{ID:IDnumberB,deathPositions:[]}, ...]
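
For illustration only, here is how such an array could be created and filled with commands. The storage name namespace:storage and the objective playerId (holding each player's ID number) are assumptions for this sketch, and the player's current position stands in for a death position.

```mcfunction
# create the empty database once, without wiping existing data
execute unless data storage namespace:storage allPlayerDeaths run data modify storage namespace:storage allPlayerDeaths set value []

# add a new entry for the executing player (run as the player)
data modify storage namespace:storage allPlayerDeaths append value {ID:0,deathPositions:[]}
execute store result storage namespace:storage allPlayerDeaths[-1].ID int 1 run scoreboard players get @s playerId
# store the player's position (a list of 3 doubles) as a first location
data modify storage namespace:storage allPlayerDeaths[-1].deathPositions append from entity @s Pos
```

The index [-1] targets the last element of the array, which is the entry that was just appended.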

Then, if we want to access a specific player's data, we search the array for the player's ID. We need to cycle through the array, the corresponding code being:

get arrayLength
set counter to 0
while counter<arrayLength
    if ID of array first element matches reference ID
        modify data of array first element
        *set counter to arrayLength
    append first element to array and remove first element (move first element to the end)
    add 1 to counter

*You can either use this line to stop the loop here or let the loop finish (it depends on whether you need your array to keep its order).
You can find the corresponding Minecraft code on the Parsing Array page from the Andromeda Team.
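
A possible translation of this loop into functions is sketched below. It is not the Andromeda Team's code, just a minimal version under assumptions: the array to search has been copied to storage namespace:temp array, the wanted ID is in score .ID Score, and the function names and the action in on_match are placeholders. This version always finishes the loop, so the array ends up in its original order.

```mcfunction
# --- namespace:init_cycle ---
# "data get" on a list returns its length
execute store result score .length Score run data get storage namespace:temp array
scoreboard players set .counter Score 0
execute if score .counter Score < .length Score run function namespace:cycle

# --- namespace:cycle ---
# read the ID of the first element and compare it with the wanted ID
execute store result score .current Score run data get storage namespace:temp array[0].ID
execute if score .current Score = .ID Score run function namespace:on_match
# move the first element to the end of the array
data modify storage namespace:temp array append from storage namespace:temp array[0]
data remove storage namespace:temp array[0]
scoreboard players add .counter Score 1
execute if score .counter Score < .length Score run function namespace:cycle

# --- namespace:on_match ---
# example action: attach a value to the matching element
data modify storage namespace:temp array[0].data set value 2023
```

The loop is written as a function that calls itself, since functions have no while statement.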
This is just going through an array and modifying/getting needed data. Obviously, when there are a lot of players, the array can get pretty big and going through it can take a lot of time.
That's why the next part exists.

Breaking big arrays in pieces

In fact, you don't really need a single array containing all the players at once. So instead of a flat array, we will create sub-arrays, each containing a part of the data the big array would contain. To keep the storage logical and to be able to find a piece of data after creating it, we will use the ID of each player. The game already gives us such an ID: the UUID. So we will cut the main array into sub-arrays, each one corresponding to a certain condition on the UUID. For example, in my Graves Datapack, I separated the sub-arrays depending on the last digit of the first element of the grave UUID. To be clear, if I have a grave with UUID [123,456,789,1011], it goes in sub-array 3.
We can use whatever condition we want on the ID as long as the conditions for different sub-arrays don't overlap and together cover every possible ID. I could have one array for last digit<5 and one for last digit>=5. Here I will use an aggregate (an NBT compound) to store the sub-arrays. You could also use an array, but I find it easier to see what's going on with aggregates. The structure of the storage looks like this:

"allData":{
   "subArray0_4":[],
   "subArray5_9":[],
}

Each sub-array could also be separated into sub-sub-arrays, and so on.
The only rule is to use the exact same structure and the exact same names for elements that have the same depth, as this will be useful when using commands. This means that if subArray0_4 contains the following data, subArray5_9 should have the same structure with the same names. You will understand why later on.

"subArray0_4":{
   "subSubArray0":[],
   "subSubArray1_9":[]
},
"subArray5_9":{
   "subSubArray0":[],
   "subSubArray1_9":[]
}

I even recommend giving every depth the same structure, as in the following, so that we need fewer functions to access and store data:

"allData":{
   "subArray0_4":{
      "subArray0_4":[],
      "subArray5_9":[],
   },
   "subArray5_9":{
      "subArray0_4":[],
      "subArray5_9":[],
   }
}

You can have as many depths as you want, each depth further reducing the size of the array we will finally search. You can also have as many sub-arrays as you want at a given depth.
Each depth should correspond to a different kind of condition: for example, the first one on the last digit of the first UUID element, and the second one on the last digit of the second UUID element.
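
To make this concrete, here is a sketch that creates the two-level structure above and computes one condition per depth from a UUID. The storage name namespace:storage, the objective Score (assumed to exist already) and the fake players .condition1, .condition2 and .10 are assumptions for the example.

```mcfunction
# create the empty structure once
execute unless data storage namespace:storage allData run data modify storage namespace:storage allData set value {subArray0_4:{subArray0_4:[],subArray5_9:[]},subArray5_9:{subArray0_4:[],subArray5_9:[]}}

scoreboard players set .10 Score 10

# depth 1: last digit of the first UUID element (run as the entity)
execute store result score .condition1 Score run data get entity @s UUID[0]
scoreboard players operation .condition1 Score %= .10 Score

# depth 2: last digit of the second UUID element
execute store result score .condition2 Score run data get entity @s UUID[1]
scoreboard players operation .condition2 Score %= .10 Score
```

UUID elements are signed integers. As far as I know, %= gives a non-negative remainder for a positive divisor, so both conditions should fall in 0..9, but this is worth checking in-game before relying on it.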

Now that we have created our storage, it is time to use it to manage data.

Using sub-arrays

The way we will get, add, remove and modify data is by first diving through the depths to reach the right array, modifying it, and then going back up step by step.
What I mean is that we first find which sub-array of depth 1 holds the ID we're looking for, then do the same for depth 2, and so on. When we get to the right sub-...-sub-array, we use the array cycling method presented above to get to the exact piece of data. You may think of it like zooming into the data.
Let's say our data collection is :

"allData":{
   "subArray0_4":{
      "subArray0_4":[{ID:512},{ID:934}],
      "subArray5_9":[{ID:1482},{ID:5093}],
   },
   "subArray5_9":{
      "subArray0_4":[{ID:28},{ID:235},{ID:726}],
      "subArray5_9":[{ID:57},{ID:476}],
   }
}

Here the first depth tests the last digit of the ID and the second depth tests the digit before last.
We will search for the data with ID=726.
Our code will look like this:

get condition for first depth (here, cond=6)
subArray = call function getDepth1(cond,allData)
get condition for second depth (cond=2)
subsubArray = call function getDepth2(cond,subArray)
find ID in subsubArray

The functions getDepth1 and getDepth2 are identical (because each depth of the storage shares the exact same structure):

parameters : cond, current subArray
if cond<=4 then return subArray0_4
if cond>=5 then return subArray5_9

In Minecraft commands, that would look like this:

#root function for array searching
#array search function called by another function in which we get the wanted ID in score .ID Score
#we assume .10 is a constant set to 10
scoreboard players operation .condition1 Score = .ID Score
scoreboard players operation .condition1 Score %= .10 Score
execute if score .condition1 Score matches 0..4 run data modify storage namespace:temp depth1Array set from storage namespace:storage allData.subArray0_4
execute if score .condition1 Score matches 5..9 run data modify storage namespace:temp depth1Array set from storage namespace:storage allData.subArray5_9

scoreboard players operation .condition2 Score = .ID Score
scoreboard players operation .condition2 Score /= .10 Score
scoreboard players operation .condition2 Score %= .10 Score
execute if score .condition2 Score matches 0..4 run data modify storage namespace:temp depth2Array set from storage namespace:temp depth1Array.subArray0_4
execute if score .condition2 Score matches 5..9 run data modify storage namespace:temp depth2Array set from storage namespace:temp depth1Array.subArray5_9

#now the array in depth2Array should contain the data corresponding to the .ID
#here, we go through the array contained in depth2Array
function namespace:init_cycle
#This function will either modify or read depth2Array data

#If depth2Array was modified, we need to bring back the data to the original storage
#So the following should be put in a function called if modification happened
execute if score .condition2 Score matches 0..4 run data modify storage namespace:temp depth1Array.subArray0_4 set from storage namespace:temp depth2Array
execute if score .condition2 Score matches 5..9 run data modify storage namespace:temp depth1Array.subArray5_9 set from storage namespace:temp depth2Array

execute if score .condition1 Score matches 0..4 run data modify storage namespace:storage allData.subArray0_4 set from storage namespace:temp depth1Array
execute if score .condition1 Score matches 5..9 run data modify storage namespace:storage allData.subArray5_9 set from storage namespace:temp depth1Array

Here the first part of the function brings the right array to us, allowing us to search it.
The second part is needed only if modifications occurred and is used to bring the modified array back to the main storage.
Here you can see why elements that have the same depth should share the exact same structure. If they didn't, we would need different lines of code for each case, adding more and more functions.
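
For completeness, a caller could prepare the inputs like this before running the function above. The objective playerId and the function name namespace:search are assumptions for this sketch; the original only states that .ID and .10 are set beforehand.

```mcfunction
# one-time setup
scoreboard objectives add Score dummy
scoreboard players set .10 Score 10

# run as the player whose data is wanted
scoreboard players operation .ID Score = @s playerId
function namespace:search
```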

So now we have seen how you can create and manage a database pretty efficiently, without many commands.
To make sure the above function is clear, I'll go through it step by step with my example.

#our input ID is 726
condition1 = 726
condition1 %10 => 6
condition1 >= 5 so we get storage :
"depth1Array":{
    "subArray0_4":[{ID:28},{ID:235},{ID:726}],
    "subArray5_9":[{ID:57},{ID:476}],
}

condition2 = 726
condition2 /10 => 72
condition2 %10 => 2
condition2 <= 4 so we get storage :
"depth2Array":[{ID:28},{ID:235},{ID:726}]

call function init_cycle to modify the storage as we want 
#for example, here I'm adding a data set to the player so its data is now : {ID:726,data:2023}
We now have this storage : 
"depth2Array":[{ID:28},{ID:235},{ID:726,data:2023}]

condition2 <= 4 so we get storage
"depth1Array":{
    "subArray0_4":[{ID:28},{ID:235},{ID:726,data:2023}],
    "subArray5_9":[{ID:57},{ID:476}],
}

condition1 >= 5 so we finally get storage
"allData":{
   "subArray0_4":{
      "subArray0_4":[{ID:512},{ID:934}],
      "subArray5_9":[{ID:1482},{ID:5093}],
   },
   "subArray5_9":{
      "subArray0_4":[{ID:28},{ID:235},{ID:726,data:2023}],
      "subArray5_9":[{ID:57},{ID:476}],
   }
}

I hope this makes sense to you.
You can find an example datapack here : [POC] Database explanation tutorial.

So now you have a pretty efficient database, easily accessible without many commands.
You can track anything with this as long as you give each piece of data some sort of an ID.
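
If your data has no natural ID, one way to create one is a counter score, in the same spirit as the globalScore of the short time storage. A minimal sketch, assuming an objective playerId, a fake player .next and a function name that are all invented for this example:

```mcfunction
# one-time setup
scoreboard objectives add playerId dummy

# run every tick: players without an ID get one
execute as @a unless score @s playerId matches 1.. run function namespace:assign_id

# --- namespace:assign_id, run as the player ---
scoreboard players add .next playerId 1
scoreboard players operation @s playerId = .next playerId
```

A player with no score at all also passes the "unless score" test, so new players are picked up automatically.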

Things that work but are unstable

The easiest way to store data about players would of course be to modify their data directly. Direct modification of player data is unfortunately impossible for now. However, there are workarounds. I would clearly not recommend them, as they are potentially unstable and subject to data loss.
One could use an item in the player's inventory to store some data (through item data tags, as they are never removed by the game). The easiest solution would be to put a block on the player's head and store the wanted data in it. This breaks completely if the player dies or manages to remove the item from their inventory. This method is, in my opinion, too fragile for long-term data storage.

I hope this was helpful to you!
If you have any questions, feel free to ask them in the comments; it will probably benefit everybody.
Bye!

Lordphenex

sigstop • 01/21/2023 8:10 am

Increasing the speed of data access through a simple hash table is quite good, but I can't really think of any real scenario where it would benefit me more than a simple lookup through the entire database of stored entries. Your method would only be useful if I had a huge number of players on my server, and I know for sure I don't. Even when it comes to datapacks, they are usually not deployed for more than 10 players at any given moment. If, on the other hand, I had to track some special locations or items, I think I would just generate some access functions that minimize the copy operations done on the entire data set, since having ~1000 entries in such a nested structure could lead to copying at least ~100 of them twice just to access one record.

Anyway, such optimizations should never be done without proper prior benchmarks, to see whether we gain anything for the effort we put in.

Lordphenex • 01/22/2023 6:53 am

Yes, if you don't have many players on your server, this is pretty useless. In fact, I used this method to track all the graves from my Graves datapack in order to be able to remove them when uninstalling the datapack. If this is used on a small server that can have around 100 players, with potentially 2 or 3 unopened graves per player, the data set can get a bit big.

I admit I did not conduct any test to see if it is really more efficient, but I assumed that dividing the data set into smaller pieces would result in fewer copy operations (1 or 2 to get to the right array, and up to a quarter of the original data set length for cycling through the array). I don't know how much less efficient it is to copy a big set of data than a single element from an array.

The nesting I proposed can be done differently, for example by dividing the main array into 10 pieces instead of 2.
