posted 03-24-20 12:17 PM CT (US)   

RMS Tutorials: Parser Pitfalls

By: Zetnus

************************************************************
Introduction

This tutorial covers several occasions where the random map script parser works in unexpected ways. Most of this is interesting, but not particularly useful or important to know about. However, when it comes to dead logical branches and comments (and especially comments in dead branches), there are some takeaways I will provide on how to avoid issues. There is a summary at the end.

This article assumes you are familiar with random map scripting; if that is not the case, you should read my Updated New RMS Guide or otherwise familiarize yourself with the basics of RM scripting.

This article covers the quirks discussed on the SiegeEngineers GitHub. If you want, you can go read that, although it is not necessary because I will go over everything here.

************************************************************
Part 1: Comment Basics

New RM scripters often mess up their comments, because coming from other programming languages they assume comments work the same way.

Examples:


/* this is a single-line comment */
/* this
is a multi-line comment
*/
/*this is NOT a comment*/
/*** this is NOT a comment ***/
/* this comment NEVER ENDS
/* this comment NEVER ENDS*/
/* this comment NEVER ENDS */*

Basically, /* and */ must be separated from anything else by some sort of separator, such as a space, tab, or newline. Every comment must be both opened and closed; there is no way to comment out a line by just typing a character at the beginning of the line.
That is simple enough once you know about it. Also note that RMS syntax highlighters may highlight text as a comment even when the game will not treat it as one, due to the limitations of syntax highlighting in the relevant text editors, so do not rely on the highlighting alone.
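
As a minimal sketch of the spacing rule, the broken examples above could be repaired as follows. The comment text is only illustrative; the one thing that matters is the whitespace around /* and */.

```
/* this is a comment */
/* *** this is a comment *** */
/* this comment ends properly */
```

In each line, both the opening /* and the closing */ stand alone, separated from the neighbouring text by a space.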

************************************************************
Part 2: Comments in Dead Branches

What is a branch?
A branch is a piece of your script that may or may not be entered. It is either gated by an if, elseif, else or by a percent_chance. When the game does not pick a particular case, then the code in that case is a "dead" branch.

************************************************************
Section 2A: if-Branches

Example:


<LAND_GENERATION>
if SOMETHING /* when SOMETHING is true, pick snow, else choose the default (grass) */
base_terrain SNOW
endif

The expected behavior if you look at this code would be that it should set the base terrain to snow if SOMETHING is defined, and if SOMETHING is not defined the base terrain should be the default.
However, if you test this in the HD Edition or in (pre-UserPatch 1.5) AoC, the base terrain will ALWAYS be snow.

Why?
The "else" in the comment is being interpreted as an else and is therefore ending the if-branch and opening an alternative case. On a technical level, this is because the parser does not recognize /* in the dead branch. It is looking for an elseif, else, or endif and ignores everything else it finds. So it ignores "/* when SOMETHING is true, pick snow," then finds an else and ends the dead branch. It then ignores "choose the default (grass) */" because it cannot do anything useful with any of those words. Finally, it finds base_terrain SNOW and applies that.

So this is what is actually happening:


<LAND_GENERATION>
if SOMETHING else
base_terrain SNOW
endif

This bug is fixed in UserPatch 1.5 and in the Definitive Edition!

If you are writing a script that should function properly in the HD Edition or in pre-UP1.5 versions of the CD AoC, then you should never use elseif, else, or endif in comments within if-branches.
Be careful when creating a map-pack consisting of a bunch of scripts, because this effectively puts entire scripts into branches!
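
One possible way to write the snow example so that it avoids the problem is sketched below. The comment wording is only an illustration; the point is that it contains none of the words if, elseif, else, or endif, and no constants.

```
<LAND_GENERATION>
/* SOMETHING defined: snow terrain. SOMETHING not defined: default terrain (grass). */
if SOMETHING
base_terrain SNOW
endif
```

Because the comment sits outside the if-branch and avoids the branch keywords, it should also be harmless when the whole script ends up inside a dead branch, as happens in a map-pack.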

There is more though:


<LAND_GENERATION> /* case 1 */
if NOT_TRUE
if NOT_TRUE
else base_terrain SNOW
endif
endif

<LAND_GENERATION> /* case 2 */
if NOT_TRUE
/* if NOT_TRUE */
else base_terrain SNOW
endif
endif

<LAND_GENERATION> /* case 3 */
if NOT_TRUE
else base_terrain SNOW
endif
endif

In the HD Edition, in cases 1 and 2 the base terrain will be the default (grass). In case 3, the base terrain will be SNOW. That all seems correct ... or does it?

Well, not exactly. Cases 2 and 3 are identical except for the comment, yet they produce different results. That tells us that the parser is making use of the second "if" and realizing that it is dealing with nested if-branches. Long story short, don't use "if" in comments in branches either.

Luckily, the Definitive Edition and UP 1.5 give the correct behavior (cases 2 and 3 having the same result).

************************************************************
Section 2B: random-Branches

if-branches aren't the only kind of branch though.

Example:


start_random
percent_chance 0
base_terrain SNOW
/*
percent_chance 100
base_terrain DIRT
*/
percent_chance 100
base_terrain WATER
end_random

What base terrain will this produce?
In everything other than the Definitive Edition, the base terrain will always be DIRT.

The parser looks at the first branch, but doesn't pick it because its chance is 0. It then ignores everything in this dead branch, including the /* until it runs into either a percent_chance or an end_random. So it sees percent_chance 100 base_terrain DIRT and does that. At that point, we already have 100%, so WATER cannot be picked anymore.

What if we are clever and put our comment first?


start_random
/*
percent_chance 100
base_terrain DIRT
*/
percent_chance 0
base_terrain SNOW
percent_chance 100
base_terrain WATER
end_random

Nope, we still get DIRT.

So, don't put percent_chance or end_random in a comment within any random code.
This bug is fixed in the Definitive Edition only, and the above examples will correctly generate water.
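
A sketch of a safer alternative, assuming you simply want to disable the DIRT option: delete the disabled lines instead of commenting them out, and keep any note you leave behind free of percent_chance and end_random.

```
start_random
percent_chance 0
base_terrain SNOW
/* second option removed for now */
percent_chance 100
base_terrain WATER
end_random
```

Going by the behavior described above, this should produce WATER on every version, because the comment no longer contains anything the parser is scanning for.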

************************************************************
Part 3: Constants in Dead Branches

You thought we were done. Far from it.
I was recently working on a massive map randomizer, and suddenly my map started producing cliffs, when it was not supposed to be doing so.
So I whittled away several thousand lines of code and was left with this:


<CONNECTION_GENERATION>
create_connect_all_players_land
{
if DE_AVAILABLE
replace_terrain DLC_FORESTAUTUMN LEAVES
replace_terrain DLC_FORESTSNOWAUTUMN GRASS_SNOW
endif
}

And that was enough to generate cliffs whenever DE_AVAILABLE was not defined.
I was baffled. So I simplified it even more:


<CONNECTION_GENERATION>
create_connect_all_players_land
{
if NOT_TRUE
replace_terrain TESTING12345 LEAVES
replace_terrain 12345TESTING GRASS_SNOW
endif
}

So what is happening?
Let me show you:


<CONNECTION_GENERATION>
create_connect_all_players_land
{
if NOT_TRUE
replace_terrain TESTING12345 else
replace_terrain 12345TESTING <CLIFF_GENERATION>
endif
}

Why?
It's a bit complicated, so sit tight.
Every rms command/attribute/statement (let us call them all "tokens") has a numerical ID.
Every terrain (and also every object) also has a numerical ID.
When the parser encounters a constant (like LEAVES), it replaces it with the ID (in this case 5) and interprets it in context. When such a constant follows base_terrain, it knows it is dealing with a terrain (terrain 5 is leaves). If it follows create_object, it knows it is dealing with an object (object 5 is a hand cannoneer). If it follows something unknown or undefined, then it interprets it as an RMS token (token 5 is else).

DLC_FORESTAUTUMN is a new terrain in the Definitive Edition, so it is not defined in the HD Edition, and TESTING12345 is never defined anywhere. So LEAVES is being interpreted in the context of something undefined, rather than in the context of replace_terrain.
GRASS_SNOW shares an ID with <CLIFF_GENERATION>, so that is where the cliffs came from.

When I realized this, I was aghast.
Let us look at some fun examples to puzzle your way through:


start_random
percent_chance 0 GRASS3
if NOT_TRUE LEAVES
GRASS_SNOW

start_random
percent_chance 0 /* GRASS3 */
if NOT_TRUE /* HAND_CANNONEER */
GRASS_SNOW

Both of these examples combine everything we have learned so far and ultimately are a roundabout way of writing <CLIFF_GENERATION> in the HD Edition.

What is the fix though?
This was my first failed attempt:


#const C5 5 /* else' */
#const NOT_C5 999 /* probably something harmless */

if NOT_TRUE C5
GRASS_SNOW
endif
/* generates cliffs */

if NOT_TRUE NOT_C5
GRASS_SNOW
endif
/* doesn't generate cliffs */

Simply defining a new constant so that you don't have to use the word LEAVES does NOT do the trick. C5 is still being interpreted as else.

The answer is this:


#const DLC_FORESTAUTUMN 104
#const DLC_FORESTSNOWAUTUMN 105
<CONNECTION_GENERATION>
create_connect_all_players_land
{
if DE_AVAILABLE
replace_terrain DLC_FORESTAUTUMN LEAVES
replace_terrain DLC_FORESTSNOWAUTUMN GRASS_SNOW
endif
}

By defining the constants (even if you are in the HD Edition, where the associated terrains don't exist), the parser is no longer encountering something undefined, and interprets it as an argument for replace_terrain.
Just make sure that you never actually use those terrains in the HD Edition, or the game will crash.

The bottom line:
Define in your script any terrain constants that are missing from at least one of the game versions you want your script to work in.

All of this prompted me to figure out the IDs for all RMS tokens, and to determine what other nasty surprises might be in store for us. So without further ado, here are my findings: spreadsheet

Color Coding:
WHITE - For RMS tokens that REQUIRE arguments, the parser will not interpret the equivalent terrain or object const as that token, so you can safely use these terrains and objects anywhere.
BEIGE - RMS tokens that can work without arguments. You can freely substitute the equivalent terrains or objects in place of the RMS token and it will function as that token without arguments. This is a fun quirk, but not something to worry about since it will never be problematic in a proper random map script.
RED - These are the ones to watch out for, because they can prematurely terminate a dead branch when you might not expect them to.

The most important ones are these:


So, in addition to not using if, elseif, else, endif, percent_chance, or end_random in comments, you should also avoid all of their equivalents in comments, and particularly in comments inside dead branches, especially if your map should run on versions of the game prior to DE.

Yes, ridiculous as it sounds, you can end a comment with HOUSE:


/* this is a properly-ending comment HOUSE
SHORE_FISH this is a comment too HOUSE

************************************************************
Part 4: UP-Constants

I didn't mention this before, but you may have noticed a bunch of UserPatch constants in the table above.
Let us take a simplified example of some custom regicide code:


#const AMOUNT_GOLD 3
<PLAYER_SETUP>
if UP_EXTENSION
guard_state KING AMOUNT_GOLD 0 1
endif

When generated in UP, this makes losing your king trigger defeat even when not playing regicide. When not in UP, however, it makes your entire map blank.
Why?
You may have guessed it: AMOUNT_GOLD shares an ID with if. So even though you think you are gating your UP code behind an if UP_EXTENSION, that dead branch is still scanned for further instances of if. And since guard_state isn't defined in HD, AMOUNT_GOLD is not interpreted as an argument for it.
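
Following the same substitution as in Part 3, this is roughly what the parser sees when not in UP. The sketch only shows the one substitution described above; how the remaining words on that line are treated is not covered here.

```
<PLAYER_SETUP>
if UP_EXTENSION
guard_state KING if 0 1
endif
```

The single endif now closes the inner if, so the outer dead branch is presumably never closed and swallows the rest of the script, which would explain the blank map.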

What is the solution in this case?
Shield the definition of UP constants, so they are not defined when not needed.


if UP_EXTENSION
#const AMOUNT_GOLD 3
endif
<PLAYER_SETUP>
if UP_EXTENSION
guard_state KING AMOUNT_GOLD 0 1
endif

The bottom line:
Shield UP constant definitions so the UP constants are only defined on UP. Also note that for WololoKingdoms most UP constants are defined in the gamedata, so you only need to define them manually for basegame UP 1.5.

************************************************************
Part 5: Definitive Edition

The Definitive Edition adds new RMS tokens. Most of these are not problematic when it comes to backwards compatibility, because the parser just gets rid of them when it does not recognize them.
However, there are a few problem cases: specifically, the tokens that accept objects or terrains as arguments. Those are base_layer, layer_to_place_on, and second_object.


<TERRAIN_GENERATION>
create_terrain DESERT
{
base_layer DIRT3
}

What is wrong here?
When not on the Definitive Edition, base_layer does not exist, so DIRT3 is interpreted as if.
I tried to shield the base_layer code, but no such luck:


<TERRAIN_GENERATION>
if DE_AVAILABLE else #define NOT_DE endif
create_terrain DESERT
{
if NOT_DE base_terrain DIRT3
else base_layer DIRT3
endif
}

There is a way to make this code run properly on both DE and not DE, but the solution is not pretty:


<TERRAIN_GENERATION>
if DE_AVAILABLE else #define NOT_DE endif
create_terrain DESERT
{
if NOT_DE base_terrain DIRT3
else base_layer DIRT3 elseif NOT_DE endif
endif
}

When NOT_DE is active, there is an extra endif to deal with the if caused by DIRT3. This is a crude fix though and will likely break code folding in your text editor, because on the face of it, it is not syntactically correct.

Now let us look at a second_object example:


create_object FISHING_SHIP
{
set_place_for_every_player
second_object SHORE_FISH
}

What will happen here?
When not in DE, second_object does not exist, so SHORE_FISH is interpreted as /* and the rest of your script is commented out.
The fix is relatively easy for the HD Edition, because comments in dead branches are ignored:

create_object FISHING_SHIP
{
set_place_for_every_player
if DE_AVAILABLE
second_object SHORE_FISH
endif
}

However, this fix does not work in UP, because comments in dead branches are actually seen as proper comments, so we have to put in a bit more work to make it fully compatible across all versions:


create_object FISHING_SHIP
{
set_place_for_every_player
if DE_AVAILABLE
second_object SHORE_FISH
if UP_EXTENSION
*/
endif
endif
}

It looks silly, but it works. It provides the */ only when needed to cancel out the comment opened by SHORE_FISH.

The bottom line:
If your map only needs to work on DE, ignore all this. For backwards compatibility, you will need to check each use of base_layer, layer_to_place_on, and second_object case by case, to see which arguments they take and whether those arguments are harmless or need special treatment.

************************************************************
Part 6: Summary (TLDR)

In case that was too long or too complicated, here are the main takeaways:

-Always make sure your script is syntactically correct
-Never allow HOUSE to be in commented code

If your map is only for the Definitive Edition
-You can freely use comments in logical branches

If your map is only for UP 1.5
-You can use comments in if-branches, but not in random-branches (or if you do, at least make sure they do not contain any constants or tokens)
-You can freely define and use UP constants

If your map is only for HD or non-UP AoC
-Do not use comments in logical branches; or if you do, at least make sure they do not contain any constants or tokens (watch specifically for if or else since it is easy to accidentally include those as part of a sentence)

If your map is for UP 1.5 and other versions
-Do not use comments in logical branches; or if you do, at least make sure they do not contain any constants or tokens (watch specifically for if or else since it is easy to accidentally include those as part of a sentence)
-Shield UP constant definitions with if UP_EXTENSION

If your map is for DE and other versions
-Do not use comments in logical branches; or if you do, at least make sure they do not contain any constants or tokens (watch specifically for if or else since it is easy to accidentally include those as part of a sentence)
-Check occurrences of base_layer, layer_to_place_on, and second_object on a case-by-case basis, to make sure they will not cause issues when not on DE.

I hope that covers everything.
Good luck and feel free to ask any questions or point out mistakes I made in this guide!

To leave you, here is a fully functional forest nothing map for the HD Edition.
Don't believe me? Try it out!

SHORE_FISH forest nothing for HD, in less than 50 lines HOUSE
#const OAK_FOREST 20
MANGUDAI
GRASS2
JUNGLE
OAK_FOREST
DLC_DIRT4
create_terrain TRADE_COG
{
number_of_clumps 99999
land_percent 100
LONG_SWORDSMAN
MILL
DLC_BLACK
create_object TOWN_CENTER
{
DLC_WATER5
max_distance_to_players 0
MILL
create_object VILLAGER
{
DLC_WATER5
min_distance_to_players 6
max_distance_to_players 6
MILL

[This message has been edited by Zetnus (edited 05-28-2020 @ 09:24 PM).]